Sample records for reliability study comparing

  1. Test-Retest Reliability of Standard and Emotional Stroop Tasks: An Investigation of Color-Word and Picture-Word Versions

    ERIC Educational Resources Information Center

    Strauss, Gregory P.; Allen, Daniel N.; Jorgensen, Melinda L.; Cramer, Stacey L.

    2005-01-01

    Previous studies have examined the reliability of scores derived from various Stroop tasks. However, few studies have compared reliability of more recently developed Stroop variants such as emotional Stroop tasks to standard versions of the Stroop. The current study developed four different single-stimulus Stroop tasks and compared test-retest…

  2. The Americleft Speech Project: A Training and Reliability Study.

    PubMed

    Chapman, Kathy L; Baylis, Adriane; Trost-Cardamone, Judith; Cordero, Kelly Nett; Dixon, Angela; Dobbelsteyn, Cindy; Thurmes, Anna; Wilson, Kristina; Harding-Bell, Anne; Sweeney, Triona; Stoddard, Gregory; Sell, Debbie

    2016-01-01

    To describe the results of two reliability studies and to assess the effect of training on interrater reliability scores. The first study (1) examined interrater and intrarater reliability scores (weighted and unweighted kappas) and (2) compared interrater reliability scores before and after training on the use of the Cleft Audit Protocol for Speech-Augmented (CAPS-A) with British English-speaking children. The second study examined interrater and intrarater reliability on a modified version of the CAPS-A (CAPS-A Americleft Modification) with American and Canadian English-speaking children. Finally, comparisons were made between the interrater and intrarater reliability scores obtained for Study 1 and Study 2. The participants were speech-language pathologists from the Americleft Speech Project. In Study 1, interrater reliability scores improved for 6 of the 13 parameters following training on the CAPS-A protocol. Comparison of the reliability results for the two studies indicated lower scores for Study 2 compared with Study 1. However, this appeared to be an artifact of the kappa statistic that occurred due to insufficient variability in the reliability samples for Study 2. When percent agreement scores were also calculated, the ratings appeared similar across Study 1 and Study 2. The findings of this study suggested that improvements in interrater reliability could be obtained following a program of systematic training. However, improvements were not uniform across all parameters. Acceptable levels of reliability were achieved for those parameters most important for evaluation of velopharyngeal function.
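
    The gap between kappa and percent agreement described above can be illustrated with a minimal Python sketch. The ratings are hypothetical, not data from the Americleft study, and the function names are chosen for illustration only. Both rating pairs agree on 18 of 20 items, but in the second pair nearly all items fall in one category.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters give the same rating."""
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters and nominal categories."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    p_chance = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical ratings (1 = feature present, 0 = feature absent).
balanced_a = [1] * 10 + [0] * 10
balanced_b = [1] * 9 + [0] + [1] + [0] * 9
skewed_a = [1] * 2 + [0] * 18
skewed_b = [1, 0, 1] + [0] * 17

for label, a, b in [("balanced", balanced_a, balanced_b),
                    ("skewed", skewed_a, skewed_b)]:
    print(label, round(percent_agreement(a, b), 2), round(cohen_kappa(a, b), 2))
```

    In this sketch both samples show 90% agreement, while kappa is 0.80 for the balanced sample and about 0.44 for the skewed one, because chance agreement is already high when one category dominates.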

  3. The Americleft Speech Project: A Training and Reliability Study

    PubMed Central

    Chapman, Kathy L.; Baylis, Adriane; Trost-Cardamone, Judith; Cordero, Kelly Nett; Dixon, Angela; Dobbelsteyn, Cindy; Thurmes, Anna; Wilson, Kristina; Harding-Bell, Anne; Sweeney, Triona; Stoddard, Gregory; Sell, Debbie

    2017-01-01

    Objective: To describe the results of two reliability studies and to assess the effect of training on interrater reliability scores. Design: The first study (1) examined interrater and intrarater reliability scores (weighted and unweighted kappas) and (2) compared interrater reliability scores before and after training on the use of the Cleft Audit Protocol for Speech–Augmented (CAPS-A) with British English-speaking children. The second study examined interrater and intrarater reliability on a modified version of the CAPS-A (CAPS-A Americleft Modification) with American and Canadian English-speaking children. Finally, comparisons were made between the interrater and intrarater reliability scores obtained for Study 1 and Study 2. Participants: The participants were speech-language pathologists from the Americleft Speech Project. Results: In Study 1, interrater reliability scores improved for 6 of the 13 parameters following training on the CAPS-A protocol. Comparison of the reliability results for the two studies indicated lower scores for Study 2 compared with Study 1. However, this appeared to be an artifact of the kappa statistic that occurred due to insufficient variability in the reliability samples for Study 2. When percent agreement scores were also calculated, the ratings appeared similar across Study 1 and Study 2. Conclusion: The findings of this study suggested that improvements in interrater reliability could be obtained following a program of systematic training. However, improvements were not uniform across all parameters. Acceptable levels of reliability were achieved for those parameters most important for evaluation of velopharyngeal function. PMID:25531738

  4. Test-retest reliability of schizoaffective disorder compared with schizophrenia, bipolar disorder, and unipolar depression--a systematic review and meta-analysis.

    PubMed

    Santelmann, Hanno; Franklin, Jeremy; Bußhoff, Jana; Baethge, Christopher

    2015-11-01

    Schizoaffective disorder is a frequent diagnosis, and its reliability is subject to ongoing discussion. We compared the diagnostic reliability of schizoaffective disorder with its main differential diagnoses. We systematically searched Medline, Embase, and PsycInfo for all studies on the test-retest reliability of the diagnosis of schizoaffective disorder as compared with schizophrenia, bipolar disorder, and unipolar depression. We used meta-analytic methods to describe and compare Cohen's kappa as well as positive and negative agreement. In addition, multiple pre-specified and post hoc subgroup and sensitivity analyses were carried out. Out of 4,415 studies screened, 49 studies were included. Test-retest reliability of schizoaffective disorder was consistently lower than that of schizophrenia (in 39 out of 42 studies), bipolar disorder (27/33), and unipolar depression (29/35). The mean difference in kappa between schizoaffective disorder and the other diagnoses was approximately 0.2, and mean Cohen's kappa for schizoaffective disorder was 0.50 (95% confidence interval: 0.40-0.59). While findings were unequivocal and homogeneous for schizoaffective disorder's diagnostic reliability relative to its three main differential diagnoses (dichotomous: smaller versus larger), heterogeneity was substantial for continuous measures, even after subgroup and sensitivity analyses. In clinical practice and research, schizoaffective disorder's comparatively low diagnostic reliability should lead to increased efforts to correctly diagnose the disorder. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  5. Binge Eating Disorder: Reliability and Validity of a New Diagnostic Category.

    ERIC Educational Resources Information Center

    Brody, Michelle L.; And Others

    1994-01-01

    Examined reliability and validity of binge eating disorder (BED), proposed for inclusion in Diagnostic and Statistical Manual of Mental Disorders (DSM), fourth edition. Interrater reliability of BED diagnosis compared favorably with that of most diagnoses in DSM revised third edition. Study comparing obese individuals with and without BED and…

  6. Reliability-based structural optimization: A proposed analytical-experimental study

    NASA Technical Reports Server (NTRS)

    Stroud, W. Jefferson; Nikolaidis, Efstratios

    1993-01-01

    An analytical and experimental study for assessing the potential of reliability-based structural optimization is proposed and described. In the study, competing designs obtained by deterministic and reliability-based optimization are compared. The experimental portion of the study is practical because the structure selected is a modular, actively and passively controlled truss that consists of many identical members, and because the competing designs are compared in terms of their dynamic performance and are not destroyed if failure occurs. The analytical portion of this study is illustrated on a 10-bar truss example. In the illustrative example, it is shown that reliability-based optimization can yield a design that is superior to an alternative design obtained by deterministic optimization. These analytical results provide motivation for the proposed study, which is underway.

  7. Adaptation of the ToxRTool to Assess the Reliability of Toxicology Studies Conducted with Genetically Modified Crops and Implications for Future Safety Testing.

    PubMed

    Koch, Michael S; DeSesso, John M; Williams, Amy Lavin; Michalek, Suzanne; Hammond, Bruce

    2016-01-01

    To determine the reliability of food safety studies carried out in rodents with genetically modified (GM) crops, a Food Safety Study Reliability Tool (FSSRTool) was adapted from the European Centre for the Validation of Alternative Methods' (ECVAM) ToxRTool. Reliability was defined as the inherent quality of the study with regard to use of standardized testing methodology, full documentation of experimental procedures and results, and the plausibility of the findings. Codex guidelines for GM crop safety evaluations indicate toxicology studies are not needed when comparability of the GM crop to its conventional counterpart has been demonstrated. This guidance notwithstanding, animal feeding studies have routinely been conducted with GM crops, but their conclusions on safety are not always consistent. To accurately evaluate potential risks from GM crops, risk assessors need clearly interpretable results from reliable studies. The development of the FSSRTool, which provides the user with a means of assessing the reliability of a toxicology study to inform risk assessment, is discussed. Its application to the body of literature on GM crop food safety studies demonstrates that reliable studies report no toxicologically relevant differences between rodents fed GM crops or their non-GM comparators.

  8. Reliability of 3D laser-based anthropometry and comparison with classical anthropometry.

    PubMed

    Kuehnapfel, Andreas; Ahnert, Peter; Loeffler, Markus; Broda, Anja; Scholz, Markus

    2016-05-26

    Anthropometric quantities are widely used in epidemiologic research as possible confounders, risk factors, or outcomes. 3D laser-based body scans (BS) allow evaluation of dozens of quantities in short time with minimal physical contact between observers and probands. The aim of this study was to compare BS with classical manual anthropometric (CA) assessments with respect to feasibility, reliability, and validity. We performed a study on 108 individuals with multiple measurements of BS and CA to estimate intra- and inter-rater reliabilities for both. We suggested BS equivalents of CA measurements and determined validity of BS considering CA the gold standard. Throughout the study, the overall concordance correlation coefficient (OCCC) was chosen as indicator of agreement. BS was slightly more time consuming but better accepted than CA. For CA, OCCCs for intra- and inter-rater reliability were greater than 0.8 for all nine quantities studied. For BS, 9 of 154 quantities showed reliabilities below 0.7. BS proxies for CA measurements showed good agreement (minimum OCCC > 0.77) after offset correction. Thigh length showed higher reliability in BS while upper arm length showed higher reliability in CA. Except for these issues, reliabilities of CA measurements and their BS equivalents were comparable.
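
    The role of offset correction in an agreement index can be sketched in Python. The function below is the two-rater concordance correlation coefficient in Lin's form, used here as a simpler stand-in for the overall coefficient (OCCC) named in the abstract. The measurement values are hypothetical and the variable names are illustrative.

```python
def concordance_correlation(x, y):
    """Two-rater concordance correlation coefficient (population moments)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # A systematic shift between the raters enlarges the denominator.
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical paired measurements: the scan reads a constant 1.0 higher.
manual = [1.0, 2.0, 3.0, 4.0, 5.0]
scan = [v + 1.0 for v in manual]

# Offset correction: subtract the mean difference from the scan values.
offset = sum(scan) / len(scan) - sum(manual) / len(manual)
scan_corrected = [v - offset for v in scan]

print(concordance_correlation(manual, scan))
print(concordance_correlation(manual, scan_corrected))
```

    In this constructed case the two series are perfectly correlated, yet the coefficient is 0.8 before correction and 1.0 after it, which shows why a constant offset has to be removed before the index reflects the underlying agreement.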

  9. A Study of Reliability of Marking and Absolute Grading in Secondary Schools

    ERIC Educational Resources Information Center

    Abdul Gafoor, K.; Jisha, P.

    2014-01-01

    Using a non-experimental comparative group design in a sample consisting of 100 English teachers randomly selected from 30 secondary schools of a district of Kerala and assigning fifty teachers to groups for marking and grading, this study compares inter and intra-individual reliability in marking and absolute grading. Studying (1) the in marking…

  10. Development and inter-rater reliability of a standardized verbal instruction manual for the Chinese Geriatric Depression Scale-short form.

    PubMed

    Wong, M T P; Ho, T P; Ho, M Y; Yu, C S; Wong, Y H; Lee, S Y

    2002-05-01

    The Geriatric Depression Scale (GDS) is a common screening tool for elderly depression in Hong Kong. This study aimed at (1) developing a standardized manual for the verbal administration and scoring of the GDS-SF, and (2) comparing the inter-rater reliability between the standardized and non-standardized verbal administration of GDS-SF. Two studies were reported. In Study 1, the process of developing the manual was described. In Study 2, we compared the inter-rater reliabilities of GDS-SF scores using the standardized verbal instructions and the traditional non-standardized administration. Results of Study 2 indicated that the standardized procedure in verbal administration and scoring improved the inter-rater reliabilities of GDS-SF. Copyright 2002 John Wiley & Sons, Ltd.

  11. VFS interjudge reliability using a free and directed search.

    PubMed

    Bryant, Karen N; Finnegan, Eileen; Berbaum, Kevin

    2012-03-01

    Reports in the literature suggest that clinicians demonstrate poor reliability in rating videofluoroscopic swallow (VFS) variables. Contemporary perception theories suggest that the methods used in VFS reliability studies constrain subjects to make judgments in an abnormal way. The purpose of this study was to determine whether a directed search or a free search approach to rating swallow studies results in better interjudge reliability. Ten speech pathologists served as judges. Five clinical judges were assigned to the directed search group (use checklist) and five to the free search group (unguided observations). Clinical judges interpreted 20 VFS examinations of swallowing. Interjudge reliability of ratings of dysphagia severity, affected stage of swallow, dysphagia symptoms, and attributes identified by clinical judges using a directed search was compared with that using a free search approach. Interjudge reliability for rating the presence of aspiration and penetration was significantly better using a free search ("substantial" to "almost perfect" agreement) compared to a directed search ("moderate" agreement). Reliability of dysphagia severity ratings ranged from "moderate" to "almost perfect" agreement for both methods of search. Reliability for reporting all other symptoms and attributes of dysphagia was variable and was not significantly different between the groups.

  12. Comparing Interrater reliability between eye examination and eye self-examination

    PubMed Central

    de Lima, Maria Alzete; Pagliuca, Lorita Marlena Freitag; do Nascimento, Jennara Cândido; Caetano, Joselany Áfio

    2017-01-01

    Objective: to compare interrater reliability concerning two eye assessment methods. Method: quasi-experimental study conducted with 324 college students including eye self-examination and eye assessment performed by the researchers in a public university. Kappa coefficient was used to verify agreement. Results: reliability coefficients between raters ranged from 0.85 to 0.95, with statistical significance at 0.05. The exams to check for near acuity and peripheral vision presented a reasonable kappa >0.2. The remaining coefficients were higher, ranging from very to totally reliable. Conclusion: comparatively, the results of both methods were similar. The virtual manual on eye self-examination can be used to screen for eye conditions. PMID:29069269

  13. Reliability Stress-Strength Models for Dependent Observations with Applications in Clinical Trials

    NASA Technical Reports Server (NTRS)

    Kushary, Debashis; Kulkarni, Pandurang M.

    1995-01-01

    We consider the applications of stress-strength models in studies involving clinical trials. When studying the effects and side effects of certain procedures (treatments), it is often the case that observations are correlated due to subject effect, repeated measurements and observing many characteristics simultaneously. We develop maximum likelihood estimator (MLE) and uniform minimum variance unbiased estimator (UMVUE) of the reliability which in clinical trial studies could be considered as the chances of increased side effects due to a particular procedure compared to another. The results developed apply to both univariate and multivariate situations. Also, for the univariate situations we develop simple to use lower confidence bounds for the reliability. Further, we consider the cases when both stress and strength constitute time dependent processes. We define the future reliability and obtain methods of constructing lower confidence bounds for this reliability. Finally, we conduct simulation studies to evaluate all the procedures developed and also to compare the MLE and the UMVUE.

  14. Inter- and Intrarater Reliability Using Different Software Versions of E4D Compare in Dental Education.

    PubMed

    Callan, Richard S; Cooper, Jeril R; Young, Nancy B; Mollica, Anthony G; Furness, Alan R; Looney, Stephen W

    2015-06-01

    The problems associated with intra- and interexaminer reliability when assessing preclinical performance continue to hinder dental educators' ability to provide accurate and meaningful feedback to students. Many studies have been conducted to evaluate the validity of utilizing various technologies to assist educators in achieving that goal. The purpose of this study was to compare two different versions of E4D Compare software to determine if either could be expected to deliver consistent and reliable comparative results, independent of the individual utilizing the technology. Five faculty members obtained E4D digital images of students' attempts (sample model) at ideal gold crown preparations for tooth #30 performed on typodont teeth. These images were compared to an ideal (master model) preparation utilizing two versions of E4D Compare software. The percent correlations between and within these faculty members were recorded and averaged. The intraclass correlation coefficient was used to measure both inter- and intrarater agreement among the examiners. The study found that using the older version of E4D Compare did not result in acceptable intra- or interrater agreement among the examiners. However, the newer version of E4D Compare, when combined with the Nevo scanner, resulted in a remarkable degree of agreement both between and within the examiners. These results suggest that consistent and reliable results can be expected when utilizing this technology under the protocol described in this study.

  15. Test Assembly Implications for Providing Reliable and Valid Subscores

    ERIC Educational Resources Information Center

    Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J.

    2017-01-01

    This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…

  16. Comparative Reliability Studies and Analysis of Au, Pd-Coated Cu and Pd-Doped Cu Wire in Microelectronics Packaging

    PubMed Central

    Chong Leong, Gan; Uda, Hashim

    2013-01-01

    This paper compares and discusses the wearout reliability and analysis of Gold (Au), Palladium (Pd) coated Cu and Pd-doped Cu wires used in fineline Ball Grid Array (BGA) packages. Intermetallic compound (IMC) thickness measurement has been carried out to estimate the coefficient of diffusion (Do) under various aging conditions of different bonding wires. Wire pull and ball bond shear strengths have been analyzed, and we found smaller variation in Pd-doped Cu wire compared to Au and Pd-coated Cu wire. Au bonds were identified to have faster IMC formation, compared to the slower IMC growth of Cu. The Weibull slopes (β) obtained for the three bonding wires are greater than 1.0, which places the data in the wearout regime. Pd-doped Cu wire exhibits larger time-to-failure and cycles-to-failure in both wearout reliability tests, Highly Accelerated Temperature and Humidity (HAST) and Temperature Cycling (TC). This proves Pd-doped Cu wire has a greater potential and higher reliability margin compared to Au and Pd-coated Cu wires. PMID:24244344
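
    The link between a Weibull slope above 1.0 and wearout can be sketched with the two-parameter Weibull hazard function, h(t) = (β/η)(t/η)^(β-1), which rises over time only when β > 1. The Python sketch below uses hypothetical parameter values; the scale parameter η and the function name are assumptions for illustration, not values from the paper.

```python
def weibull_hazard(t, beta, eta):
    """Instantaneous failure rate of a two-parameter Weibull distribution."""
    return (beta / eta) * (t / eta) ** (beta - 1)


# Hypothetical scale parameter (same time unit as t).
eta = 1000.0

for beta in (0.5, 1.0, 2.0):
    early = weibull_hazard(100.0, beta, eta)
    late = weibull_hazard(900.0, beta, eta)
    # beta < 1: falling hazard (early failures); beta = 1: constant hazard;
    # beta > 1: rising hazard (wearout).
    trend = "rising" if late > early else "falling" if late < early else "constant"
    print(beta, trend)
```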

  17. A study on reliability of power customer in distribution network

    NASA Astrophysics Data System (ADS)

    Liu, Liyuan; Ouyang, Sen; Chen, Danling; Ma, Shaohua; Wang, Xin

    2017-05-01

    The existing power supply reliability index system is oriented to the power system and does not consider actual electricity availability on the customer side. In addition, it cannot reflect outages or customer equipment shutdowns caused by instantaneous interruptions and power quality problems. This paper therefore makes a systematic study of the reliability of power customers. By comparison with power supply reliability, the reliability of power customers is defined and its evaluation requirements are extracted. An index system consisting of seven customer indexes and two contrast indexes is designed to describe the reliability of power customers in terms of continuity and availability. To evaluate the reliability of power customers in distribution networks comprehensively and quantitatively, a reliability evaluation method is proposed based on an improved entropy method and the punishment weighting principle. Practical application has shown that the reliability index system and evaluation method for power customers are reasonable and effective.

  18. Reliability and validity of goniometric iPhone applications for the assessment of active shoulder external rotation.

    PubMed

    Mitchell, Katy; Gutierrez, Simran Bakshi; Sutton, Stacy; Morton, Stephanie; Morgenthaler, Andrea

    2014-10-01

    The purpose of this study was to determine the reliability and validity of two smartphone applications: (1) GetMyROM - inclinometry-based and (2) DrGoniometry - photo-based in the measurement of active shoulder external rotation (ER) as compared to standard goniometry (SG). Ninety-four Texas Woman's University Doctor of Physical Therapy students from the School of Physical Therapy - Houston campus, were recruited to participate in this study. Two iPhone applications were compared to SG using both novice and experienced raters. Active shoulder ER range of motion was measured over two time periods in random order by blinded novice and experienced raters. Intra-rater reliability using novice raters for the two applications ranged from an intraclass correlation coefficient (ICC) of 0.79 to 0.81 with SG at 0.82. Inter-rater reliability (novice/expert) for the two applications ranged from an ICC of 0.92 to 0.94 with SG at 0.91. Concurrent validity (when compared to SG) ranged from 0.93 to 0.94. There were no significant differences between the novice and experienced raters. Both applications were found to be reliable and comparable to SG. A photo-based application potentially offers a superior method of measurement as visualizing the landmarks may be simplified in this format and it provides a record of measurement. Further study using patient populations may find the two studied applications are useful as an adjunct for clinical practice.

  19. A Comparison of Two Methods of Determining Interrater Reliability

    ERIC Educational Resources Information Center

    Fleming, Judith A.; Taylor, Janeen McCracken; Carran, Deborah

    2004-01-01

    This article offers an alternative methodology for practitioners and researchers to use in establishing interrater reliability for testing purposes. The majority of studies on interrater reliability use a traditional methodology whereby two raters are compared using a Pearson product-moment correlation. This traditional method of estimating…

  20. Reliability of Test Scores in Nonparametric Item Response Theory.

    ERIC Educational Resources Information Center

    Sijtsma, Klaas; Molenaar, Ivo W.

    1987-01-01

    Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four "classical" lower bounds to reliability. (Author/JAZ)

  1. Validation of a novel smartphone accelerometer-based knee goniometer.

    PubMed

    Ockendon, Matthew; Gilbert, Robin E

    2012-09-01

    Loss of full knee extension following anterior cruciate ligament surgery has been shown to impair knee function. However, there can be significant difficulties in accurately and reproducibly measuring a fixed flexion of the knee. We studied the interobserver and the intraobserver reliabilities of a novel, smartphone accelerometer-based, knee goniometer and compared it with a long-armed conventional goniometer for the assessment of fixed flexion knee deformity. Five healthy male volunteers (age range 30 to 40 years) were studied. Measurements of knee flexion angle were made with a telescopic-armed goniometer (Lafayette Instrument, Lafayette, IN) and compared with measurements using the smartphone (iPhone 3GS, Apple Inc., Cupertino, CA) knee goniometer using a novel trigonometric technique based on tibial inclination. Bland-Altman analysis of validity and reliability including statistical analysis of correlation by Pearson's method was undertaken. The iPhone goniometer had an interobserver correlation (r) of 0.994 compared with 0.952 for the Lafayette. The intraobserver correlation was r = 0.982 for the iPhone (compared with 0.927). The datasets from the two instruments correlate closely (r = 0.947), are proportional, and have a mean difference of only -0.4 degrees (SD 3.86 degrees). The Lafayette goniometer had an intraobserver reliability +/- 9.6 degrees. The interobserver reliability was +/- 8.4 degrees. By comparison the iPhone had an interobserver reliability +/- 2.7 degrees and an intraobserver reliability +/- 4.6 degrees. We found the iPhone goniometer to be a reliable tool for the measurement of subtle knee flexion in the clinic setting.
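
    A Bland-Altman analysis summarizes paired measurements by the mean difference (bias) and limits of agreement. The Python sketch below assumes the conventional 95% limits, bias ± 1.96 SD, which the abstract does not state explicitly. The paired angles are hypothetical and the function names are illustrative.

```python
import statistics


def limits_of_agreement(bias, sd):
    """Conventional 95% limits of agreement from a mean difference and its SD."""
    return bias - 1.96 * sd, bias + 1.96 * sd


def bland_altman(method_a, method_b):
    """Bias, SD of differences, and 95% limits for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    return bias, sd, limits_of_agreement(bias, sd)


# Hypothetical paired knee angles in degrees (not data from the study).
phone = [10.0, 12.0, 14.0]
goniometer = [9.0, 12.0, 16.0]
print(bland_altman(phone, goniometer))

# Limits implied by the summary values quoted in the abstract
# (mean difference -0.4 degrees, SD 3.86 degrees), under the 1.96 SD convention.
print(limits_of_agreement(-0.4, 3.86))
```

    Under that convention, the reported mean difference and SD correspond to limits of roughly -8.0 to 7.2 degrees between the two instruments.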

  2. A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

    ERIC Educational Resources Information Center

    Lee, Guemin; Park, In-Yong

    2012-01-01

    Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…

  3. Comparability and Reliability Considerations of Adequate Yearly Progress

    ERIC Educational Resources Information Center

    Maier, Kimberly S.; Maiti, Tapabrata; Dass, Sarat C.; Lim, Chae Young

    2012-01-01

    The purpose of this study is to develop an estimate of Adequate Yearly Progress (AYP) that will allow for reliable and valid comparisons among student subgroups, schools, and districts. A shrinkage-type estimator of AYP using the Bayesian framework is described. Using simulated data, the performance of the Bayes estimator will be compared to…

  4. A Meta-Analysis of the Reliability of Free and For-Pay Big Five Scales.

    PubMed

    Hamby, Tyler; Taylor, Wyn; Snowden, Audrey K; Peterson, Robert A

    2016-01-01

    The present study meta-analytically compared coefficient alpha reliabilities reported for free and for-pay Big Five scales. We collected 288 studies from five previous meta-analyses of Big Five traits and harvested 1,317 alphas from these studies. We found that free and for-pay scales measuring Big Five traits possessed comparable reliabilities. However, after we controlled for the numbers of items in the scales with the Spearman-Brown formula, we found that free scales possessed significantly higher alpha coefficients than for-pay scales for each of the Big Five traits. Thus, the study offers initial evidence that Big Five scales that are free more efficiently measure these traits for research purposes than do for-pay scales.
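
    The Spearman-Brown formula predicts the reliability of a scale whose length is changed by a factor k: r_k = k·r / (1 + (k - 1)·r). The Python sketch below shows one way to put alphas from scales of different lengths on a common footing. The abstract does not give the authors' exact procedure, so the 10-item target length, the example alphas, and the function names are assumptions for illustration.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def alpha_at_length(alpha, n_items, target_items):
    """Project an observed alpha to a scale of target_items items."""
    return spearman_brown(alpha, target_items / n_items)


# Hypothetical scales with the same observed alpha but different lengths.
short_scale = alpha_at_length(0.80, n_items=12, target_items=10)
long_scale = alpha_at_length(0.80, n_items=24, target_items=10)
print(round(short_scale, 3), round(long_scale, 3))
```

    In this sketch the 12-item scale projects to about 0.77 and the 24-item scale to about 0.63 at a common length of 10 items, so equal raw alphas can hide a difference in reliability per item.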

  5. Reliability of joint count assessment in rheumatoid arthritis: a systematic literature review.

    PubMed

    Cheung, Peter P; Gossec, Laure; Mak, Anselm; March, Lyn

    2014-06-01

    Joint counts are central to the assessment of rheumatoid arthritis (RA) but reliability is an issue. To evaluate the reliability and agreement of joint counts (intra-observer and inter-observer) by health care professionals (physicians, nurses, and metrologists) and patients in RA, and the impact of training and standardization on joint count reliability through a systematic literature review. Articles reporting joint count reliability or agreement in RA in PubMed, EMBase, and the Cochrane library between 1960 and 2012 were selected. Data were extracted regarding tender joint counts (TJCs) and swollen joint counts (SJCs) derived by physicians, metrologists, or patients for intra-observer and inter-observer reliability. In addition, methods and effects of training or standardization were extracted. Statistics expressing reliability such as intraclass correlation coefficients (ICCs) were extracted. Data analysis was primarily descriptive due to high heterogeneity. Twenty-eight studies on health care professionals (HCP) and 20 studies on patients were included. Intra-observer reliability for TJCs and SJCs was good for HCPs and patients (range of ICC: 0.49-0.98). Inter-observer reliability between HCPs for TJCs was higher than for SJCs (range of ICC: 0.64-0.88 vs. 0.29-0.98). Patient inter-observer reliability with HCPs as comparators was better for TJCs (range of ICC: 0.31-0.91) compared to SJCs (0.16-0.64). Nine studies (7 with HCPs and 2 with patients) evaluated consensus or training, with improvement in reliability of TJCs but conflicting evidence for SJCs. Intra- and inter-observer reliability was high for TJCs for HCPs and patients: among all groups, reliability was better for TJCs than SJCs. Inter-observer reliability of SJCs was poorer for patients than HCPs. Data were inconclusive regarding the potential for training to improve SJC reliability. Overall, the results support further evaluation for patient-reported joint counts as an outcome measure. © 2013 Published by Elsevier Inc.

  6. Circumferential finger measurements utilizing a torque meter to increase reliability.

    PubMed

    King, T I

    1993-01-01

    The purpose of this study was to compare the reliabilities of two methods of measuring finger circumference. Traditionally, finger circumference is determined clinically by the use of a tape measure. In this study, a tape-measure device for recording finger circumference utilizing a torque meter was compared with the traditional method to determine reliability differences. Ninety-two occupational therapists and occupational therapy students obtained circumferential measurements of the author's left index finger at the middle of the proximal phalanx utilizing the two methods. The readings obtained for each method were analyzed to determine the coefficient of variation and to compare their variances. The coefficient of variation for the traditional method was 2.92 and for the device utilizing the torque meter was 0.75. The F ratio was 15.63, which is significant at the 0.01 level. The results of this study indicate greater interrater reliability using a device that can accurately measure torque and allow the therapist to control the amount of tension applied when obtaining circumferential measurements using a tape measure.
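
    The two statistics used above, the coefficient of variation and the F ratio of variances, can be sketched in Python with the standard library. The readings are hypothetical, not the study's data, and the function names are illustrative. The last lines check a relationship that holds only if the two methods have equal means: the F ratio then equals the squared ratio of the coefficients of variation.

```python
import statistics


def coefficient_of_variation(readings):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(readings) / statistics.mean(readings)


def variance_ratio(readings_a, readings_b):
    """F ratio: sample variance of method A over sample variance of method B."""
    return statistics.variance(readings_a) / statistics.variance(readings_b)


# Hypothetical circumference readings in cm from several raters.
tape_only = [6.6, 6.8, 7.0, 7.2, 7.4]
with_torque_meter = [6.9, 7.0, 7.0, 7.0, 7.1]

print(coefficient_of_variation(tape_only))
print(coefficient_of_variation(with_torque_meter))
print(variance_ratio(tape_only, with_torque_meter))

# With equal means, F = (CV_a / CV_b) ** 2. Using the abstract's reported
# coefficients of variation (2.92 and 0.75):
implied_f = (2.92 / 0.75) ** 2
print(round(implied_f, 2))
```

    The implied value of about 15.2 is close to the reported F ratio of 15.63, which would be consistent with the two methods having produced similar, though not identical, mean readings.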

  7. Critically re-evaluating a common technique: Accuracy, reliability, and confirmation bias of EMG.

    PubMed

    Narayanaswami, Pushpa; Geisbush, Thomas; Jones, Lyell; Weiss, Michael; Mozaffar, Tahseen; Gronseth, Gary; Rutkove, Seward B

    2016-01-19

    (1) To assess the diagnostic accuracy of EMG in radiculopathy. (2) To evaluate the intrarater reliability and interrater reliability of EMG in radiculopathy. (3) To assess the presence of confirmation bias in EMG. Three experienced academic electromyographers interpreted 3 compact discs with 20 EMG videos (10 normal, 10 radiculopathy) in a blinded, standardized fashion without information regarding the nature of the study. The EMGs were interpreted 3 times (discs A, B, C) 1 month apart. Clinical information was provided only with disc C. Intrarater reliability was calculated by comparing interpretations in discs A and B, interrater reliability by comparing interpretation between reviewers. Confirmation bias was estimated by the difference in correct interpretations when clinical information was provided. Sensitivity was similar to previous reports (77%, confidence interval [CI] 63%-90%); specificity was 71%, CI 56%-85%. Intrarater reliability was good (κ 0.61, 95% CI 0.41-0.81); interrater reliability was lower (κ 0.53, CI 0.35-0.71). There was no substantial confirmation bias when clinical information was provided (absolute difference in correct responses 2.2%, CI -13.3% to 17.7%); the study lacked precision to exclude moderate confirmation bias. This study supports that (1) serial EMG studies should be performed by the same electromyographer since intrarater reliability is better than interrater reliability; (2) knowledge of clinical information does not bias EMG interpretation substantially; (3) EMG has moderate diagnostic accuracy for radiculopathy with modest specificity and electromyographers should exercise caution interpreting mild abnormalities. This study provides Class III evidence that EMG has moderate diagnostic accuracy and specificity for radiculopathy. © 2015 American Academy of Neurology.
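
    The kappa values reported above correct raw agreement for the agreement expected by chance. A minimal sketch of unweighted Cohen's kappa for two sets of categorical readings follows; the function name and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two readings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reading's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical readings of 20 recordings: "R" = radiculopathy, "N" = normal.
reading_1 = ["R"] * 10 + ["N"] * 10
reading_2 = ["R"] * 8 + ["N"] * 2 + ["N"] * 8 + ["R"] * 2

kappa = cohens_kappa(reading_1, reading_2)
print(f"kappa = {kappa:.2f}")
```

    The same function serves for intrarater reliability (one reader, two occasions) and interrater reliability (two readers, one occasion); only the source of the two lists changes.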

  8. INFLUENCES OF RESPONSE RATE AND DISTRIBUTION ON THE CALCULATION OF INTEROBSERVER RELIABILITY SCORES

    PubMed Central

    Rolider, Natalie U.; Iwata, Brian A.; Bullock, Christopher E.

    2012-01-01

    We examined the effects of several variations in response rate on the calculation of total, interval, exact-agreement, and proportional reliability indices. Trained observers recorded computer-generated data that appeared on a computer screen. In Study 1, target responses occurred at low, moderate, and high rates during separate sessions so that reliability results based on the four calculations could be compared across a range of values. Total reliability was uniformly high, interval reliability was spuriously high for high-rate responding, proportional reliability was somewhat lower for high-rate responding, and exact-agreement reliability was the lowest of the measures, especially for high-rate responding. In Study 2, we examined the separate effects of response rate per se, bursting, and end-of-interval responding. Response rate and bursting had little effect on reliability scores; however, the distribution of some responses at the end of intervals decreased interval reliability somewhat, proportional reliability noticeably, and exact-agreement reliability markedly. PMID:23322930
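
    The four indices compared above can be sketched from two observers' per-interval response counts. The definitions below are the ones commonly used for interobserver agreement and are an assumption here, since the abstract does not spell them out; the counts are hypothetical.

```python
def total_agreement(obs1, obs2):
    """Smaller session total divided by the larger session total."""
    t1, t2 = sum(obs1), sum(obs2)
    if t1 == t2:
        return 1.0  # also covers the case where both totals are zero
    return min(t1, t2) / max(t1, t2)


def interval_agreement(obs1, obs2):
    """Share of intervals where both observers agree on occurrence vs. nonoccurrence."""
    agree = sum((a > 0) == (b > 0) for a, b in zip(obs1, obs2))
    return agree / len(obs1)


def exact_agreement(obs1, obs2):
    """Share of intervals where both observers recorded the same count."""
    agree = sum(a == b for a, b in zip(obs1, obs2))
    return agree / len(obs1)


def proportional_agreement(obs1, obs2):
    """Mean over intervals of the smaller count divided by the larger count."""
    ratios = [1.0 if a == b else min(a, b) / max(a, b) for a, b in zip(obs1, obs2)]
    return sum(ratios) / len(ratios)


# Hypothetical response counts in five observation intervals.
observer_1 = [2, 0, 3, 1, 0]
observer_2 = [1, 0, 3, 2, 1]

for index in (total_agreement, interval_agreement, exact_agreement, proportional_agreement):
    print(index.__name__, round(index(observer_1, observer_2), 3))
```

    With these counts the ordering mirrors the abstract's pattern: the total index is the most lenient because over- and under-counts in different intervals cancel, and the exact-agreement index is the strictest.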

  9. Reliabilities of Intraindividual Variability Indicators with Autocorrelated Longitudinal Data: Implications for Longitudinal Study Designs.

    PubMed

    Du, Han; Wang, Lijuan

    2018-04-23

    Intraindividual variability can be measured by the intraindividual standard deviation ([Formula: see text]), intraindividual variance ([Formula: see text]), estimated hth-order autocorrelation coefficient ([Formula: see text]), and mean square successive difference ([Formula: see text]). Unresolved issues exist in the research on reliabilities of intraindividual variability indicators: (1) previous research only studied conditions with 0 autocorrelations in the longitudinal responses; (2) the reliabilities of [Formula: see text] and [Formula: see text] have not been studied. The current study investigates reliabilities of [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and the intraindividual mean, with autocorrelated longitudinal data. Reliability estimates of the indicators were obtained through Monte Carlo simulations. The impact of influential factors on reliabilities of the intraindividual variability indicators is summarized, and the reliabilities are compared across the indicators. Generally, all the studied indicators of intraindividual variability were more reliable with a more reliable measurement scale and more assessments. The reliabilities of [Formula: see text] were generally lower than those of [Formula: see text] and [Formula: see text], the reliabilities of [Formula: see text] were usually between those of [Formula: see text] and [Formula: see text] unless the scale reliability was large and/or the interindividual standard deviation in autocorrelation coefficients was large, and the reliabilities of the intraindividual mean were generally the highest. An R function is provided for planning longitudinal studies to ensure sufficient reliabilities of the intraindividual indicators are achieved.
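
    The four indicators named above can be computed for one person's series of repeated measurements as sketched below. The estimators are common textbook forms and are an assumption, because the formulas in this record were lost in extraction; this is also not the R function the authors provide. The series is hypothetical.

```python
from statistics import mean, stdev, variance


def autocorrelation(series, lag=1):
    """Estimated lag-h autocorrelation, using the overall mean of the series."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t + lag] - m) for t in range(len(series) - lag))
    return num / denom


def mssd(series):
    """Mean square successive difference over the n - 1 adjacent pairs."""
    squared_diffs = [(b - a) ** 2 for a, b in zip(series, series[1:])]
    return sum(squared_diffs) / len(squared_diffs)


# Hypothetical scores for one person across six assessments.
scores = [3, 5, 4, 6, 5, 7]

print("intraindividual SD:", round(stdev(scores), 3))
print("intraindividual variance:", variance(scores))
print("lag-1 autocorrelation:", round(autocorrelation(scores, lag=1), 3))
print("MSSD:", mssd(scores))
```

    The standard deviation and variance ignore the order of the assessments, whereas the autocorrelation and MSSD depend on it, which is why autocorrelated data can affect the indicators differently.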

  10. A Comparative Study of the Reliability and Validity of the "Degrees of Reading Power" and the "Iowa Tests of Basic Skills."

    ERIC Educational Resources Information Center

    Hildebrand, Myrene; Hoover, H. D.

    This study compared the reliability and validity of two different measures of reading ability, the Degrees of Reading Power (DRP) and the Iowa Tests of Basic Skills (ITBS) Reading test and the ITBS Vocabulary test. The data consisted of scores of 377 grade 5 and grade 6 students on these tests, along with their assigned reading levels in the…

  11. Reliability and validity of the Safe Routes to school parent and student surveys

    PubMed Central

    2011-01-01

    Background: The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Methods: Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous day's travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. Results: A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n = 112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62 - 0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31 - 0.76). Conclusions: The in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school.
Parent survey design should be improved so that responses clearly indicate issues that influence parental decision making with regard to their children's mode of travel to school. PMID:21651794

  12. Reliability and Validity of Prototype Diagnosis for Adolescent Psychopathology.

    PubMed

    Haggerty, Greg; Zodan, Jennifer; Mehra, Ashwin; Zubair, Ayyan; Ghosh, Krishnendu; Siefert, Caleb J; Sinclair, Samuel J; DeFife, Jared

    2016-04-01

    The current study investigated the interrater reliability and validity of prototype ratings of 5 common adolescent psychiatric disorders: attention-deficit/hyperactivity disorder, conduct disorder, major depressive disorder, generalized anxiety disorder, and posttraumatic stress disorder. One hundred fifty-seven adolescent inpatient participants consented to participate in this study. We compared ratings from 2 inpatient clinicians, blinded to each other's ratings and patient measures, after their separate initial diagnostic interview to assess interrater reliability. Prototype ratings completed by clinicians after their initial diagnostic interview with adolescent inpatients and outpatients were compared with patient-reported behavior problems and parents' report of their child's behavioral problems. Prototype ratings demonstrated good interrater reliability. Clinicians' prototype ratings showed predicted relationships with patient-reported behavior problems and parent-reported behavior problems. Prototype matching seems to be a possible alternative for psychiatric diagnosis. Prototype ratings showed good interrater reliability based on clinicians' unique experiences with the patient (as opposed to video-/audio-recorded material) with no training.

  13. Training and quality assurance with the Structured Clinical Interview for DSM-IV (SCID-I/P).

    PubMed

    Ventura, J; Liberman, R P; Green, M F; Shaner, A; Mintz, J

    1998-06-15

    Accuracy in psychiatric diagnosis is critical for evaluating the suitability of the subjects for entry into research protocols and for establishing comparability of findings across study sites. However, training programs in the use of diagnostic instruments for research projects are not well systematized. Furthermore, little information has been published on the maintenance of interrater reliability of diagnostic assessments. At the UCLA Research Center for Major Mental Illnesses, a Training and Quality Assurance Program for SCID interviewers was used to evaluate interrater reliability and diagnostic accuracy. Although clinically experienced interviewers achieved better interrater reliability and overall diagnostic accuracy than neophyte interviewers, both groups were able to achieve and maintain high levels of interrater reliability, diagnostic accuracy, and interviewer skill. At the first quality assurance check after training, there were no significant differences between experienced and neophyte interviewers in interrater reliability or diagnostic accuracy. Standardization of training and quality assurance procedures within and across research projects may make research findings from study sites more comparable.

  14. The test-retest reliability and minimal detectable change of spatial and temporal gait variability during usual over-ground walking for younger and older adults.

    PubMed

    Almarwani, Maha; Perera, Subashan; VanSwearingen, Jessie M; Sparto, Patrick J; Brach, Jennifer S

    2016-02-01

    Gait variability is a marker of gait performance and future mobility status in older adults. Reliability of gait variability has been examined mainly in community dwelling older adults who are likely to fluctuate over time. The purpose of this study was to compare test-retest reliability and determine minimal detectable change (MDC) of spatial and temporal gait variability in younger and older adults. Forty younger (mean age=26.6 ± 6.0 years) and 46 older adults (mean age=78.1 ± 6.2 years) were included in the study. Gait characteristics were measured twice, approximately 1 week apart, using a computerized walkway (GaitMat II). Participants completed 4 passes on the GaitMat II at their self-selected walking speed. Test-retest reliability was calculated using Intra-class correlation coefficients (ICCs(2,1)), 95% limits of agreement (95% LoA) in conjunction with Bland-Altman plots, relative limits of agreement (LoA%) and standard error of measurement (SEM). The MDC at 90% and 95% level were also calculated. ICCs of gait variability ranged 0.26-0.65 in younger and 0.28-0.74 in older adults. The LoA% and SEM were consistently higher (i.e. less reliable) for all gait variables in older compared to younger adults except SEM for step width. The MDC was consistently larger for all gait variables in older compared to younger adults except step width. ICCs were of limited utility due to restricted ranges in younger adults. Based on absolute reliability measures and MDC, younger adults had greater test-retest reliability and smaller MDC of spatial and temporal gait variability compared to older adults. Copyright © 2015 Elsevier B.V. All rights reserved.
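
    The absolute reliability measures listed above are often derived as sketched below. The formulas (SEM = SD × sqrt(1 − ICC), MDC = z × sqrt(2) × SEM, and Bland-Altman limits as mean difference ± 1.96 × SD of the differences) are commonly used definitions and are an assumption here, since the abstract does not state which forms the authors applied. All numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    return sd * sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%, 1.645 for 90%)."""
    return z * sqrt(2.0) * sem


def limits_of_agreement(test, retest):
    """Bland-Altman 95% limits of agreement for paired test and retest scores."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias + spread


# Hypothetical values: between-subject SD of 2.0 and an ICC of 0.75.
sem = standard_error_of_measurement(sd=2.0, icc=0.75)
mdc95 = minimal_detectable_change(sem)
mdc90 = minimal_detectable_change(sem, z=1.645)
lower, upper = limits_of_agreement([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 4.0, 4.0])

print(f"SEM = {sem:.2f}, MDC95 = {mdc95:.2f}, MDC90 = {mdc90:.2f}")
print(f"95% LoA = ({lower:.2f}, {upper:.2f})")
```

    Because the SEM and MDC are expressed in the units of the measurement, they remain interpretable when a restricted range of scores depresses the ICC, which is the limitation the abstract notes for younger adults.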

  15. The validity and reliability of a dynamic neuromuscular stabilization-heel sliding test for core stability.

    PubMed

    Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H

    2017-10-23

    Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was (ICC2,3) = 0.700 (p< 0.05), suggesting a good relationship between the two core stability measures. Test-retest reliability was (ICC3,3) = 0.953 (p< 0.05), indicating excellent consistency between the repeated DNS-HS measurements. Criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. Test-retest reliability data suggests that DNS-HS core stability was a reliable test for core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.

  16. Towards an Operational Definition of Clinical Competency in Pharmacy

    PubMed Central

    2015-01-01

    Objective. To estimate the inter-rater reliability and accuracy of ratings of competence in student pharmacist/patient clinical interactions as depicted in videotaped simulations and to compare expert panelist and typical preceptor ratings of those interactions. Methods. This study used a multifactorial experimental design to estimate inter-rater reliability and accuracy of preceptors’ assessment of student performance in clinical simulations. The study protocol used nine 5-10 minute video vignettes portraying different levels of competency in student performance in simulated clinical interactions. Intra-Class Correlation (ICC) was used to calculate inter-rater reliability and Fisher exact test was used to compare differences in distribution of scores between expert and nonexpert assessments. Results. Preceptors (n=42) across 5 states assessed the simulated performances. Intra-Class Correlation estimates were higher for 3 nonrandomized video simulations compared to the 6 randomized simulations. Preceptors more readily identified high and low student performances compared to satisfactory performances. In nearly two-thirds of the rating opportunities, a higher proportion of expert panelists than preceptors rated the student performance correctly (18 of 27 scenarios). Conclusion. Valid and reliable assessments are critically important because they affect student grades and formative student feedback. Study results indicate the need for pharmacy preceptor training in performance assessment. The process demonstrated in this study can be used to establish minimum preceptor benchmarks for future national training programs. PMID:26089563

  17. Critically re-evaluating a common technique

    PubMed Central

    Geisbush, Thomas; Jones, Lyell; Weiss, Michael; Mozaffar, Tahseen; Gronseth, Gary; Rutkove, Seward B.

    2016-01-01

    Objectives: (1) To assess the diagnostic accuracy of EMG in radiculopathy. (2) To evaluate the intrarater reliability and interrater reliability of EMG in radiculopathy. (3) To assess the presence of confirmation bias in EMG. Methods: Three experienced academic electromyographers interpreted 3 compact discs with 20 EMG videos (10 normal, 10 radiculopathy) in a blinded, standardized fashion without information regarding the nature of the study. The EMGs were interpreted 3 times (discs A, B, C) 1 month apart. Clinical information was provided only with disc C. Intrarater reliability was calculated by comparing interpretations in discs A and B, interrater reliability by comparing interpretation between reviewers. Confirmation bias was estimated by the difference in correct interpretations when clinical information was provided. Results: Sensitivity was similar to previous reports (77%, confidence interval [CI] 63%–90%); specificity was 71%, CI 56%–85%. Intrarater reliability was good (κ 0.61, 95% CI 0.41–0.81); interrater reliability was lower (κ 0.53, CI 0.35–0.71). There was no substantial confirmation bias when clinical information was provided (absolute difference in correct responses 2.2%, CI −13.3% to 17.7%); the study lacked precision to exclude moderate confirmation bias. Conclusions: This study supports that (1) serial EMG studies should be performed by the same electromyographer since intrarater reliability is better than interrater reliability; (2) knowledge of clinical information does not bias EMG interpretation substantially; (3) EMG has moderate diagnostic accuracy for radiculopathy with modest specificity and electromyographers should exercise caution interpreting mild abnormalities. Classification of evidence: This study provides Class III evidence that EMG has moderate diagnostic accuracy and specificity for radiculopathy. PMID:26701380

  18. Wise Crowd Content Assessment and Educational Rubrics

    ERIC Educational Resources Information Center

    Passonneau, Rebecca J.; Poddar, Ananya; Gite, Gaurav; Krivokapic, Alisa; Yang, Qian; Perin, Dolores

    2018-01-01

    Development of reliable rubrics for educational intervention studies that address reading and writing skills is labor-intensive, and could benefit from an automated approach. We compare a main ideas rubric used in a successful writing intervention study to a highly reliable wise-crowd content assessment method developed to evaluate…

  19. Intra- and inter-tester reliability and validity of normal finger size measurement using the Japanese ring gauge system.

    PubMed

    Suzuki, T; Sato, Y; Sotome, S; Arai, H; Arai, A; Yoshida, H

    2017-06-01

    This study was designed to investigate the reliability and validity of measurements of finger diameters with a ring gauge. A reliability study enrolled two independent samples (50 participants and seven examiners in Study I; 26 participants and 26 examiners in Study II). The sizes of each participant's little fingers were measured twice with a ring gauge by each examiner. To investigate the validity of the measurements, five hand therapists compared the finger size and hand volume of 30 participants with the ring gauge and with a figure-of-eight technique (Study III). The intra-class correlation coefficient for intra-observer reliability ranged from 0.97 to 0.99 in Study I, and 0.90 to 0.97 in Study II. The intra-class correlation coefficient for inter-observer reliability was 0.95 in Study I and 0.94 in Study II. The validity study showed a Pearson product moment correlation coefficient of 0.75. The ring gauge showed high reliability and validity for measurement of finger size. III, diagnostic.

  20. A study of the longevity and operational reliability of Goddard Spacecraft, 1960-1980

    NASA Technical Reports Server (NTRS)

    Shockey, E. F.

    1981-01-01

    Compiled data regarding the design lives and lifetimes actually achieved by 104 orbiting satellites launched by the Goddard Spaceflight Center between the years 1960 and 1980 is analyzed. Historical trends over the entire 21 year period are reviewed, and the more recent data is subjected to an examination of several key parameters. An empirical reliability function is derived, and compared with various mathematical models. Data from related studies is also discussed. The results provide insight into the reliability history of Goddard spacecraft and guidance for estimating the reliability of future programs.
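
    An empirical reliability function of the kind mentioned above can be sketched as the fraction of units still operating after a given time, and compared with a simple model. The sketch assumes every lifetime is fully observed (no satellites still operating at the end of the study) and uses an exponential model as one example; the report does not say which models it considered, and the lifetimes below are hypothetical.

```python
from math import exp
from statistics import mean


def empirical_reliability(lifetimes, t):
    """Fraction of units whose observed lifetime exceeds t."""
    return sum(life > t for life in lifetimes) / len(lifetimes)


def exponential_reliability(mean_life, t):
    """Exponential model R(t) = exp(-t / mean life), i.e. a constant failure rate."""
    return exp(-t / mean_life)


# Hypothetical achieved lifetimes in years (NOT the report's data).
lifetimes = [0.5, 1.0, 2.0, 3.0, 5.0, 8.0]
mean_life = mean(lifetimes)

for years in (1.0, 2.0, 4.0):
    observed = empirical_reliability(lifetimes, years)
    modeled = exponential_reliability(mean_life, years)
    print(f"t = {years}: empirical {observed:.2f}, exponential {modeled:.2f}")
```

    If some units were still operating when the data were compiled, a censoring-aware estimator would be needed instead of this simple fraction.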

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roach, Mack; Ceron Lizarraga, Tania L.; Lazar, Ann A.

    Purpose: The optimal treatment of clinically localized prostate cancer is controversial. Most studies focus on biochemical (PSA) failure when comparing radical prostatectomy (RP) with radiation therapy (RT), but this endpoint has not been validated as predictive of overall survival (OS) or cause-specific survival (CSS). We analyzed the available literature to determine whether reliable conclusions could be made concerning the effectiveness of RP compared with RT with or without androgen deprivation therapy (ADT), assuming current treatment standards. Methods: Articles published between February 29, 2004, and March 1, 2015, that compared OS and CSS after RP or RT with or without ADT were included. Because the GRADE (Grading of Recommendations, Assessment, Development, and Evaluation) system emphasis is on randomized controlled clinical trials, a reliability score (RS) was explored to further understand the issues associated with the study quality of observational studies, including appropriateness of treatment, source of data, clinical characteristics, and comorbidity. Lower RS values indicated lower reliability. Results: Fourteen studies were identified, and 13 were completely evaluable. Thirteen of the 14 studies (93%) were observational studies with low-quality evidence. The median RS was 12 (range, 5-18); the median difference in 10-year OS and CSS favored RP over RT: 10% and 4%, respectively. In studies with a RS ≤12 (average RS 9) the 10-year OS and CSS median differences were 17% and 6%, respectively. For studies with a RS >12 (average RS 15.5), the 10-year OS and CSS median differences were 5.5% and 1%, respectively. Thus, we observed an association between low RS and a higher percentage difference in OS and CSS. Conclusions: Reliable evidence that RP provides a superior CSS to RT with ADT is lacking. The most reliable studies suggest that the differences in 10-year CSS between RP and RT are small, possibly <1%.

  2. Reliability of EEG Interactions Differs between Measures and Is Specific for Neurological Diseases

    PubMed Central

    Höller, Yvonne; Butz, Kevin; Thomschewski, Aljoscha; Schmid, Elisabeth; Uhl, Andreas; Bathke, Arne C.; Zimmermann, Georg; Tomasi, Santino O.; Nardone, Raffaele; Staffen, Wolfgang; Höller, Peter; Leitinger, Markus; Höfler, Julia; Kalss, Gudrun; Taylor, Alexandra C.; Kuchukhidze, Giorgi; Trinka, Eugen

    2017-01-01

    Alterations of interaction (connectivity) of the EEG reflect pathological processes in patients with neurologic disorders. Nevertheless, it is questionable whether these patterns are reliable over time in different measures of interaction and whether this reliability of the measures is the same across different patient populations. In order to address this topic we examined 22 patients with mild cognitive impairment, five patients with subjective cognitive complaints, six patients with right-lateralized temporal lobe epilepsy, seven patients with left lateralized temporal lobe epilepsy, and 20 healthy controls. We calculated 14 measures of interaction from two EEG-recordings separated by 2 weeks. In order to characterize test-retest reliability, we correlated these measures for each group and compared the correlations between measures and between groups. We found that both measures of interaction as well as groups differed from each other in terms of reliability. The strongest correlation coefficients were found for spectrum, coherence, and full frequency directed transfer function (average rho > 0.9). In the delta (2–4 Hz) range, reliability was lower for mild cognitive impairment compared to healthy controls and left lateralized temporal lobe epilepsy. In the beta (13–30 Hz), gamma (31–80 Hz), and high gamma (81–125 Hz) frequency ranges we found decreased reliability in subjective cognitive complaints compared to mild cognitive impairment. In the gamma and high gamma range we found increased reliability in left lateralized temporal lobe epilepsy patients compared to healthy controls. Our results emphasize the importance of documenting reliability of measures of interaction, which may vary considerably between measures, but also between patient populations. We suggest that studies claiming clinical usefulness of measures of interaction should provide information on the reliability of the results. 
In addition, differences between patient groups in reliability of interactions in the EEG indicate the potential of reliability to serve as a new biomarker for pathological memory decline as well as for epilepsy. While the brain concert of information flow is generally variable, high reliability, and thus, low variability may reflect abnormal firing patterns. PMID:28725190

  3. A Monte Carlo Simulation Study of the Reliability of Intraindividual Variability

    PubMed Central

    Estabrook, Ryne; Grimm, Kevin J.; Bowles, Ryan P.

    2012-01-01

    Recent research has seen intraindividual variability (IIV) become a useful technique to incorporate trial-to-trial variability into many types of psychological studies. IIV as measured by individual standard deviations (ISDs) has shown unique prediction to several types of positive and negative outcomes (Ram, Rabbit, Stollery, & Nesselroade, 2005). One unanswered question regarding measuring intraindividual variability is its reliability and the conditions under which optimal reliability is achieved. Monte Carlo simulation studies were conducted to determine the reliability of the ISD compared to the intraindividual mean. The results indicate that ISDs generally have poor reliability and are sensitive to insufficient measurement occasions, poor test reliability, and unfavorable amounts and distributions of variability in the population. Secondary analysis of psychological data shows that use of individual standard deviations in unfavorable conditions leads to a marked reduction in statistical power, although careful adherence to underlying statistical assumptions allows their use as a basic research tool. PMID:22268793
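
    A Monte Carlo check of this kind can be sketched by simulating two parallel sets of measurement occasions per person and correlating the resulting ISDs across people. This is a simplified illustration and not the authors' simulation design: the uniform distribution of true variabilities, the normal scores, the sample sizes, and the parallel-forms correlation as the reliability estimate are all assumptions.

```python
import random
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def simulated_isd_reliability(n_people, n_occasions, seed=0):
    """Correlate ISDs from two independent sets of occasions for the same people."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_people):
        # Assumed population: each person's true SD is drawn from U(0.5, 1.5).
        true_sd = rng.uniform(0.5, 1.5)
        first.append(stdev([rng.gauss(0.0, true_sd) for _ in range(n_occasions)]))
        second.append(stdev([rng.gauss(0.0, true_sd) for _ in range(n_occasions)]))
    return pearson(first, second)


reliability_few = simulated_isd_reliability(n_people=500, n_occasions=5)
reliability_many = simulated_isd_reliability(n_people=500, n_occasions=50)

print(f"5 occasions: {reliability_few:.2f}")
print(f"50 occasions: {reliability_many:.2f}")
```

    Under these assumptions the estimate rises with the number of occasions, in line with the abstract's point that ISDs are sensitive to insufficient measurement occasions.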

  4. Assessing disease severity: accuracy and reliability of rater estimates in relation to number of diagrams in a standard area diagram set

    USDA-ARS's Scientific Manuscript database

    Errors in rater estimates of plant disease severity occur, and standard area diagrams (SADs) help improve accuracy and reliability. The effect of the number of diagrams in a SAD set on accuracy and reliability is unknown. The objective of this study was to compare estimates of pecan scab severity made witho...

  5. Reliability of the Kinetic Measures under Different Heel Conditions during Normal Walking

    ERIC Educational Resources Information Center

    Liu, Yuanlong; Wang, Yong Tai

    2004-01-01

    The purpose of this study was to determine and compare the reliability of 3 dimension reaction forces and impulses in walking with 3 different heel shoe conditions. These results suggest that changing the height of the heels affects mainly the reliability of the ground reaction force and impulse measures on the medial and lateral dimension and not…

  6. Validity and Reliability of Visual Analog Scaling for Assessment of Hypernasality and Audible Nasal Emission in Children With Repaired Cleft Palate.

    PubMed

    Baylis, Adriane; Chapman, Kathy; Whitehill, Tara L; Group, The Americleft Speech

    2015-11-01

    To investigate the validity and reliability of multiple listener judgments of hypernasality and audible nasal emission, in children with repaired cleft palate, using visual analog scaling (VAS) and equal-appearing interval (EAI) scaling. Prospective comparative study of multiple listener ratings of hypernasality and audible nasal emission. Multisite institutional. Five trained and experienced speech-language pathologist listeners from the Americleft Speech Project. Average VAS and EAI ratings of hypernasality and audible nasal emission/turbulence for 12 video-recorded speech samples from the Americleft Speech Project. Intrarater and interrater reliability was computed, as well as linear and polynomial models of best fit. Intrarater and interrater reliability was acceptable for both rating methods; however, reliability was higher for VAS as compared to EAI ratings. When VAS ratings were plotted against EAI ratings, results revealed a stronger curvilinear relationship. The results of this study provide additional evidence that alternate rating methods such as VAS may offer improved validity and reliability over EAI ratings of speech. VAS should be considered a viable method for rating hypernasality and nasal emission in speech in children with repaired cleft palate.

  7. The Reliability of Evidence Contained in the National Qualifications Framework Impact Study: A Critical Reflection--Research Article

    ERIC Educational Resources Information Center

    Higgs, Philip; Keevy, James

    2007-01-01

    This article reflects on the reliability of the evidence contained in the National Qualifications Framework Impact Study, a longitudinal comparative study conducted by the South African Qualifications Authority since 2002. In so doing, the veracity of evidence-based research in determining the impact of the South African Qualifications Framework…

  8. Interformat reliability of digital psychiatric self-report questionnaires: a systematic review.

    PubMed

    Alfonsson, Sven; Maathz, Pernilla; Hursti, Timo

    2014-12-03

    Research on Internet-based interventions typically uses digital versions of pen and paper self-report symptom scales. However, adaptation into the digital format could affect the psychometric properties of established self-report scales. Several studies have investigated differences between digital and pen and paper versions of instruments, but no systematic review of the results has yet been done. This review aims to assess the interformat reliability of self-report symptom scales used in digital or online psychotherapy research. Three databases (MEDLINE, Embase, and PsycINFO) were systematically reviewed for studies investigating the reliability between digital and pen and paper versions of psychiatric symptom scales. From a total of 1504 publications, 33 were included in the review, and interformat reliability of 40 different symptom scales was assessed. Significant differences in mean total scores between formats were found in 10 of 62 analyses. These differences were found in just a few studies, which indicates that the results were due to study effects and sample effects rather than unreliable instruments. The interformat reliability ranged from r=.35 to r=.99; however, the majority of instruments showed a strong correlation between format scores. The quality of the included studies varied, and several studies had insufficient power to detect small differences between formats. When digital versions of self-report symptom scales are compared to pen and paper versions, most scales show high interformat reliability. This supports the reliability of results obtained in psychotherapy research on the Internet and the comparability of the results to traditional psychotherapy research. There are, however, some instruments that consistently show low interformat reliability, suggesting that these conclusions cannot be generalized to all questionnaires.
Most studies had at least some methodological issues with insufficient statistical power being the most common issue. Future studies should preferably provide information about the transformation of the instrument into digital format and the procedure for data collection in more detail.

  9. Strengthening the reliability and credibility of observational epidemiology studies by creating an Observational Studies Register.

    PubMed

    Swaen, Gerard M H; Carmichael, Neil; Doe, John

    2011-05-01

    To evaluate the need for the creation of a system in which observational epidemiology studies are registered; an Observational Studies Register (OSR). The current scientific process for observational epidemiology studies is described. Next, a parallel is made with the clinical trials area, where the creation of clinical trial registers has greatly restored and improved their credibility and reliability. Next, the advantages and disadvantages of an OSR are compared. The advantages of an OSR outweigh its disadvantages. The creation of an OSR, similar to the existing Clinical Trials Registers, will improve the assessment of publication bias and will provide an opportunity to compare the original study protocol with the results reported in the publication. Reliability, credibility, and transparency of observational epidemiology studies are strengthened by the creation of an OSR. We propose a structured, collaborative, and coordinated approach for observational epidemiology studies that can provide solutions for existing weaknesses and will strengthen credibility and reliability, similar to the approach currently used in clinical trials, where Clinical Trials Registers have played a key role in strengthening their scientific value. Copyright © 2011 Elsevier Inc. All rights reserved.

  10. Clinical methods to quantify trunk mobility in an elite male surfing population.

    PubMed

    Furness, James; Climstein, Mike; Sheppard, Jeremy M; Abbott, Allan; Hing, Wayne

    2016-05-01

    Thoracic mobility in the sagittal and horizontal planes is a key requirement in the sport of surfing; however, to date the normal values of these movements have not been quantified in a surfing population. The aims were to develop a reliable method to quantify thoracic mobility in the sagittal plane, to assess the reliability of an existing thoracic rotation method, and to quantify thoracic mobility in an elite male surfing population. Clinical measurement, reliability and comparative study. A total of 30 subjects were used to determine the reliability component. Fifteen elite surfers were used as part of a comparative analysis with age and gender matched controls. Intraclass correlation coefficient values ranged between 0.95-0.99 (95% CI; 0.89-0.99) for both thoracic methods. The elite surfing group had significantly (p ≤ 0.05) greater rotation than the comparative group (mean rotation 63.57° versus 40.80°, respectively). This study has illustrated reliable methods to assess the thoracic spine in the sagittal plane and thoracic rotation. It has also quantified ROM in a surfing cohort, identifying thoracic rotation as a key movement. This information may provide clinicians, coaches and athletic trainers with imperative information regarding the importance of maintaining adequate thoracic rotation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Validity and Reliability of Spine Rasterstereography in Patients With Adolescent Idiopathic Scoliosis.

    PubMed

    Tabard-Fougère, Anne; Bonnefoy-Mazure, Alice; Hanquinet, Sylviane; Lascombes, Pierre; Armand, Stéphane; Dayer, Romain

    2017-01-15

    Test-retest study. This study aimed to evaluate the validity and reliability of rasterstereography in patients with adolescent idiopathic scoliosis (AIS) with a major curve Cobb angle (CA) between 10° and 40° for frontal, sagittal, and transverse parameters. Previous studies evaluating the validity and reliability of rasterstereography concluded that this technique had good accuracy compared with radiographs and a high intra- and interday reliability in healthy volunteers. To the best of our knowledge, the validity and reliability have not been assessed in AIS patients. Thirty-five adolescents with AIS (male = 13) aged 13.1 ± 2.0 years were included. To evaluate the validity of the scoliosis angle (SA) provided by rasterstereography, a comparison (t test, Pearson correlation) was performed with the CA obtained using 2D EOS® radiography (XR). Three rasterstereographic repeated measurements were independently performed by two operators on the same day (interrater reliability) and again by the first operator 1 week later (intrarater reliability). The variables of interest were the SA, lumbar lordosis, and thoracic kyphosis angle, trunk length, pelvic obliquity, and maximum, root mean square and amplitude of vertebral rotations. The data analyses used intraclass correlation coefficients (ICCs). The CA and SA were strongly correlated (R = 0.70) and were nonsignificantly different (P = 0.60). The intrarater reliability (same day: ICC [1, 1], n = 35; 1 week later: ICC [1, 3], n = 28) and interrater reliability (ICC [3, 3], n = 16) were globally excellent (ICC > 0.75) except for the assessment of pelvic obliquity. This study showed that the rasterstereographic system allows for the evaluation of AIS patients with a good validity compared with XR with an overall excellent intra- and interrater reliability. Based on these results, this automatic, fast, and noninvasive system can be used for monitoring the evolution of AIS in growing patients instead of repetitive radiographs, thereby reducing radiation exposure and decreasing costs. Level of Evidence: 4.
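
    As a rough illustration of the single-measurement, one-way random-effects coefficient written ICC [1, 1] in the abstract above, the sketch below computes it from a subjects-by-measurements table using the usual one-way ANOVA mean squares. The function name and the sample angles are hypothetical and are not taken from the study; the averaged and two-way forms (ICC [1, 3], ICC [3, 3]) use different formulas and are not covered here.

```python
from statistics import mean


def icc_1_1(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of rows, one row per subject, each row holding the
    k repeated measurements taken on that subject.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # measurements per subject
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]

    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical repeated angle measurements (degrees) on four subjects.
angles = [[10, 12], [20, 19], [30, 31], [25, 24]]
print(round(icc_1_1(angles), 3))
```

    With measurement error that is small relative to the spread between subjects, as in this made-up table, the coefficient lands well above the 0.75 cut-off the abstract calls excellent.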

  12. Evaluating Written Patient Information for Eczema in German: Comparing the Reliability of Two Instruments, DISCERN and EQIP

    PubMed Central

    McCool, Megan E.; Wahl, Josepha; Schlecht, Inga; Apfelbacher, Christian

    2015-01-01

    Patients actively seek information about how to cope with their health problems, but the quality of the information available varies. A number of instruments have been developed to assess the quality of patient information, primarily though in English. Little is known about the reliability of these instruments when applied to patient information in German. The objective of our study was to investigate and compare the reliability of two validated instruments, DISCERN and EQIP, in order to determine which of these instruments is better suited for a further study pertaining to the quality of information available to German patients with eczema. Two independent raters evaluated a random sample of 20 informational brochures in German. All the brochures addressed eczema as a disorder and/or therapy options and care. Intra-rater and inter-rater reliability were assessed by calculating intra-class correlation coefficients, agreement was tested with weighted kappas, and the correlation of the raters’ scores for each instrument was measured with Pearson’s correlation coefficient. DISCERN demonstrated substantial intra- and inter-rater reliability. It also showed slightly better agreement than EQIP. There was a strong correlation of the raters’ scores for both instruments. The findings of this study support the reliability of both DISCERN and EQIP. However, based on the results of the inter-rater reliability, agreement and correlation analyses, we consider DISCERN to be the more precise tool for our project on patient information concerning the treatment and care of eczema. PMID:26440612

  13. Evaluating Written Patient Information for Eczema in German: Comparing the Reliability of Two Instruments, DISCERN and EQIP.

    PubMed

    McCool, Megan E; Wahl, Josepha; Schlecht, Inga; Apfelbacher, Christian

    2015-01-01

    Patients actively seek information about how to cope with their health problems, but the quality of the information available varies. A number of instruments have been developed to assess the quality of patient information, primarily though in English. Little is known about the reliability of these instruments when applied to patient information in German. The objective of our study was to investigate and compare the reliability of two validated instruments, DISCERN and EQIP, in order to determine which of these instruments is better suited for a further study pertaining to the quality of information available to German patients with eczema. Two independent raters evaluated a random sample of 20 informational brochures in German. All the brochures addressed eczema as a disorder and/or therapy options and care. Intra-rater and inter-rater reliability were assessed by calculating intra-class correlation coefficients, agreement was tested with weighted kappas, and the correlation of the raters' scores for each instrument was measured with Pearson's correlation coefficient. DISCERN demonstrated substantial intra- and inter-rater reliability. It also showed slightly better agreement than EQIP. There was a strong correlation of the raters' scores for both instruments. The findings of this study support the reliability of both DISCERN and EQIP. However, based on the results of the inter-rater reliability, agreement and correlation analyses, we consider DISCERN to be the more precise tool for our project on patient information concerning the treatment and care of eczema.

  14. Reliability and validity of the Safe Routes to school parent and student surveys.

    PubMed

    McDonald, Noreen C; Dwelley, Amanda E; Combs, Tabitha S; Evenson, Kelly R; Winters, Richard H

    2011-06-08

    The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous day's travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n=112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62-0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31-0.76). The in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school. Parent survey design should be improved so that responses clearly indicate issues that influence parental decision making with regard to their children's mode of travel to school. © 2011 McDonald et al; licensee BioMed Central Ltd.
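
    The kappa statistics mentioned above correct raw agreement for the agreement expected by chance. A minimal sketch of unweighted Cohen's kappa for two sets of categorical responses follows; the function name and the travel-mode labels are invented for illustration and do not reproduce the study's data or software.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    n = len(first)
    # Proportion of cases where the two reports match exactly.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal category frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[cat] * c2[cat] for cat in c1.keys() | c2.keys()) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical day-one reports and day-two recalls of travel mode.
day_one = ["walk", "walk", "bus", "bus", "car", "car", "car", "car"]
day_two = ["walk", "walk", "bus", "car", "car", "car", "car", "car"]
print(round(cohens_kappa(day_one, day_two), 3))
```

    In this made-up example seven of eight responses match (87.5% raw agreement), yet kappa is lower because some of that agreement would be expected by chance alone.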

  15. The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew

    2013-01-01

    A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

  16. Examining the Reliability and Validity of the "Supports Intensity Scale-Children's Version" in Children with Autism and Intellectual Disability

    ERIC Educational Resources Information Center

    Shogren, Karrie A.; Wehmeyer, Michael L.; Seo, Hyojeong; Thompson, James R.; Schalock, Robert L.; Hughes, Carolyn; Little, Todd D.; Palmer, Susan B.

    2017-01-01

    This study compared the reliability, validity, and measurement properties of the "Supports Intensity Scale-Children's Version" (SIS-C) in children with autism and intellectual disability (n = 2,124) and children with intellectual disability only (n = 1,861). The results suggest that SIS-C is a valid and reliable tool in both populations.…

  17. Independent predictors of reliability between full time employee-dependent acquisition of functional outcomes compared to non-full time employee-dependent methodologies: a prospective single institutional study.

    PubMed

    Adogwa, Owoicho; Elsamadicy, Aladine A; Cheng, Joseph; Bagley, Carlos

    2016-03-01

    The prospective acquisition of reliable patient-reported outcomes (PROs) measures demonstrating the effectiveness of spine surgery, or lack thereof, remains a challenge. The aims of this study are to compare the reliability of functional outcomes metrics obtained using full time employee (FTE) vs. non-FTE-dependent methodologies and to determine the independent predictors of response reliability using non-FTE-dependent methodologies. One hundred and nineteen adult patients (male: 65, female: 54) undergoing one- and two-level lumbar fusions at Duke University Medical Center were enrolled in this prospective study. Enrollment criteria included available demographic, clinical and baseline functional outcomes data. All patients were administered two similar sets of baseline questionnaires: (I) phone interviews (FTE-dependent) and (II) hardcopy in clinic (patient self-survey, non-FTE-dependent). All patients had at least a two-week washout period between phone interviews and in-clinic self-surveys to minimize the effect of recall. Questionnaires included Oswestry disability index (ODI) and Visual Analog Back and Leg Pain Scale (VAS-BP/LP). Reliability was assessed by the degree to which patient responses to baseline questionnaires differed between both time points. About 26.89% had a history of an anxiety disorder and 28.57% reported a history of depression. At least 97.47% of patients had a High School Diploma or GED, with 49.57% attaining a 4-year college degree or post-graduate degree. 29.94% reported full-time employment and 14.28% were on disability. There was a very high correlation between baseline PRO data captured using FTE-dependent and non-FTE-dependent methodologies (r=0.89). In a multivariate logistic regression model, the absence of anxiety and depression, higher levels of education (college or greater) and full-time employment were independently associated with high response reliability using non-FTE-dependent methodologies. Our study suggests that capturing health-related quality of life data using non-FTE-dependent methodologies is highly reliable and may be a more cost-effective alternative. Well-educated patients who are employed full-time appear to be the most reliable.

  18. How to assess and compare inter-rater reliability, agreement and correlation of ratings: an exemplary analysis of mother-father and parent-teacher expressive vocabulary rating pairs

    PubMed Central

    Stolarova, Margarita; Wolf, Corinna; Rinker, Tanja; Brielmann, Aenne

    2014-01-01

    This report has two main purposes. First, we combine well-known analytical approaches to conduct a comprehensive assessment of agreement and correlation of rating-pairs and to disentangle these often confused concepts, providing a best-practice example on concrete data and a tutorial for future reference. Second, we explore whether a screening questionnaire developed for use with parents can be reliably employed with daycare teachers when assessing early expressive vocabulary. A total of 53 vocabulary rating pairs (34 parent–teacher and 19 mother–father pairs) collected for two-year-old children (12 bilingual) are evaluated. First, inter-rater reliability both within and across subgroups is assessed using the intra-class correlation coefficient (ICC). Next, based on this analysis of reliability and on the test-retest reliability of the employed tool, inter-rater agreement is analyzed, and the magnitude and direction of rating differences are considered. Finally, Pearson correlation coefficients of standardized vocabulary scores are calculated and compared across subgroups. The results underline the necessity to distinguish between reliability measures, agreement and correlation. They also demonstrate the impact of the employed reliability on agreement evaluations. This study provides evidence that parent–teacher ratings of children's early vocabulary can achieve agreement and correlation comparable to those of mother–father ratings on the assessed vocabulary scale. Bilingualism of the evaluated child decreased the likelihood of raters' agreement. We conclude that future reports of agreement, correlation and reliability of ratings will benefit from better definition of terms and stricter methodological approaches. The methodological tutorial provided here holds the potential to increase comparability across empirical reports and can help improve research practices and knowledge transfer to educational and therapeutic settings. PMID:24994985
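
    The distinction drawn above between correlation and agreement can be made concrete with a small sketch: two raters whose scores differ by a constant offset correlate perfectly yet never agree. The scores, the function names, and the fixed tolerance below are all hypothetical; the report itself derives its agreement criterion from the tool's reliability rather than from an arbitrary tolerance.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def proportion_agreeing(x, y, tolerance):
    """Share of rating pairs whose absolute difference is within tolerance."""
    return sum(abs(a - b) <= tolerance for a, b in zip(x, y)) / len(x)


# Hypothetical standardized vocabulary scores; the second rater scores
# every child exactly 10 points higher than the first.
parent = [40, 45, 50, 55, 60]
teacher = [score + 10 for score in parent]

print(pearson_r(parent, teacher))               # perfect linear association
print(proportion_agreeing(parent, teacher, 5))  # no pair within 5 points
```

    A systematic offset of this kind leaves the correlation untouched, which is why a high Pearson coefficient alone does not show that two raters can be used interchangeably.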

  19. Accuracy and Reliability of Marker-Based Approaches to Scale the Pelvis, Thigh, and Shank Segments in Musculoskeletal Models.

    PubMed

    Kainz, Hans; Hoang, Hoa X; Stockton, Chris; Boyd, Roslyn R; Lloyd, David G; Carty, Christopher P

    2017-10-01

    Gait analysis together with musculoskeletal modeling is widely used for research. In the absence of medical images, surface marker locations are used to scale a generic model to the individual's anthropometry. Studies evaluating the accuracy and reliability of different scaling approaches in a pediatric and/or clinical population have not yet been conducted and, therefore, formed the aim of this study. Magnetic resonance images (MRI) and motion capture data were collected from 12 participants with cerebral palsy and 6 typically developed participants. Accuracy was assessed by comparing the scaled model's segment measures to the corresponding MRI measures, whereas reliability was assessed by comparing the model's segments scaled with the experimental marker locations from the first and second motion capture session. The inclusion of joint centers into the scaling process significantly increased the accuracy of thigh and shank segment length estimates compared to scaling with markers alone. Pelvis scaling approaches which included the pelvis depth measure led to the highest errors compared to the MRI measures. Reliability was similar between scaling approaches with mean ICC of 0.97. The pelvis should be scaled using pelvic width and height and the thigh and shank segment should be scaled using the proximal and distal joint centers.

  20. The Americleft Project: A Modification of Asher-McDade Method for Rating Nasolabial Esthetics in Patients With Unilateral Cleft Lip and Palate Using Q-sort.

    PubMed

    Stoutland, Alicia; Long, Ross E; Mercado, Ana; Daskalogiannakis, John; Hathaway, Ronald R; Russell, Kathleen A; Singer, Emily; Semb, Gunvor; Shaw, William C

    2017-11-01

    The purpose of this study was to investigate ways to improve rater reliability and satisfaction in nasolabial esthetic evaluations of patients with complete unilateral cleft lip and palate (UCLP), by modifying the Asher-McDade method with use of Q-sort methodology. Blinded ratings of cropped photographs of one hundred forty-nine 5- to 7-year-old consecutively treated patients with complete UCLP from 4 different centers were used in a rating of frontal and profile nasolabial esthetic outcomes by 6 judges involved in the Americleft Project's intercenter outcome comparisons. Four judges rated in previous studies using the original Asher-McDade approach. For the Q-sort modification, rather than projection of images, each judge had cards with frontal and profile photographs of each patient and rated them on a scale of 1 to 5 for vermillion border, nasolabial frontal, and profile, using the Q-sort method with placement of cards into categories 1 to 5. Inter- and intrarater reliabilities were calculated using the Weighted Kappa (95% confidence interval). For 4 raters, the reliabilities were compared with those in previous studies. There was no significant improvement in inter-rater reliabilities using the new method. Intrarater reliability consistently improved. All raters preferred the Q-sort method with rating cards rather than a PowerPoint of photos, which improved internal consistency in rating compared to previous studies using the original Asher-McDade method. All raters preferred this method because of the ability to continuously compare photos and adjust relative ratings between patients.

  1. Does Changing Examiner Stations During UK Postgraduate Surgery Objective Structured Clinical Examinations Influence Examination Reliability and Candidates' Scores?

    PubMed

    Brennan, Peter A; Croke, David T; Reed, Malcolm; Smith, Lee; Munro, Euan; Foulkes, John; Arnett, Richard

    2016-01-01

    Objective structured clinical examinations (OSCE) are widely used for summative assessment in surgery. Despite standardizing these as much as possible, variation, including in examiner scoring, can occur, which may affect reliability. In a study of a high-stakes UK postgraduate surgical OSCE, we investigated whether examiners changing stations once during a long examining day affected marking, reliability, and overall candidates' scores compared with examiners who examined the same scenario all day. An observational study of 18,262 examiner-candidate interactions from the UK Membership of the Royal College of Surgeons examination was carried out at 3 Surgical Colleges across the United Kingdom. Scores between examiners were compared using analysis of variance. Examination reliability was assessed with Cronbach's alpha, and the comparative distribution of total candidates' scores for each day was evaluated using t-tests of unit-weighted z scores. A significant difference was found in absolute score differences awarded in the morning and afternoon sessions between examiners who changed stations at lunchtime and those who did not (p < 0.001). No significant differences were found for the main effects of either broad content area (p = 0.290) or station content area (p = 0.450). The reliability of each day was not affected by examiner switching (p = 0.280). Overall, no difference was found in z-score distribution of total candidate scores and categories of examiner switching. This large study has found that although the range of marks awarded varied when examiners changed OSCE stations, examination reliability and the likely candidate outcome were not affected. These results may have implications for examination design and examiner experience in surgical OSCEs and beyond. Copyright © 2016 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
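
    Cronbach's alpha, the reliability index named above, compares the sum of the per-station score variances with the variance of candidates' total scores. The sketch below shows that calculation on an invented candidates-by-stations table; the function name and the marks are hypothetical and unrelated to the examination data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per candidate
    and one column per station (item)."""
    k = len(scores[0])
    # Sample variance of each station's marks across candidates.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each candidate's total mark.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical marks for four candidates at three stations.
marks = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 4, 4]]
print(round(cronbach_alpha(marks), 3))
```

    The function divides by the variance of total scores, so it assumes candidates do not all obtain the same total.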

  2. Psychometric Properties of Performance-based Measurements of Functional Capacity: Test-Retest Reliability, Practice Effects, and Potential Sensitivity to Change

    PubMed Central

    Leifker, Feea R.; Patterson, Thomas L.; Bowie, Christopher R.; Mausbach, Brent T.; Harvey, Philip D.

    2010-01-01

    Performance-based measures of the ability to perform social and everyday living skills are being more widely used to assess functional capacity in people with serious mental illnesses such as schizophrenia and bipolar disorder. Since they are also being used as outcome measures in pharmacological and cognitive remediation studies aimed at cognitive impairments in schizophrenia, understanding their measurement properties and potential sensitivity to change is important. In this study, the test-retest reliability, practice effects, and reliable change indices of two different performance-based functional capacity measures, the UCSD Performance-based skills assessment (UPSA) and Social skills performance assessment (SSPA) were examined over several different retest intervals in two different samples of people with schizophrenia (n’s=238 and 116) and a healthy comparison sample (n=109). These psychometric properties were compared to those of a neuropsychological assessment battery. Test-retest reliabilities of the long form of the UPSA ranged from r=.63 to r=.80 over follow-up periods up to 36 months in people with schizophrenia, while brief UPSA reliabilities ranged from r=.66 to r=.81. Test-retest reliability of the NP performance scores ranged from r=.77 to r=.79. Test-retest reliabilities of the UPSA were lower in healthy controls, while NP performance was slightly more reliable. SSPA test-retest reliability was lower. Practice effect sizes ranged from .05 to .16 for the UPSA and .07 to .19 for the NP assessment in patients, with HC having more practice effects. Reliable change intervals were consistent across NP and both FC measures, indicating equal potential for detection of change. These performance-based measures of functional capacity appear to have similar potential to be sensitive to change compared to NP performance in people with schizophrenia. PMID:20399613
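
    To show how test-retest reliability feeds into a reliable change interval, the sketch below uses the common Jacobson-Truax formulation, in which the standard error of measurement is SD·sqrt(1 − r) and the 95% threshold is 1.96·sqrt(2) times that value. This formulation is an assumption; the abstract does not state which variant the authors used (some variants also adjust for practice effects). The baseline standard deviation of 10 is hypothetical, while the two reliabilities are the ends of the range reported for the long-form UPSA.

```python
import math


def reliable_change_threshold(sd_baseline, retest_r, z=1.96):
    """Smallest score change exceeding measurement error at the given z level,
    using the Jacobson-Truax standard error of the difference."""
    sem = sd_baseline * math.sqrt(1 - retest_r)  # standard error of measurement
    return z * math.sqrt(2) * sem                # z times SE of the difference


# Hypothetical baseline SD of 10 points; r values from the reported range.
for r in (0.63, 0.80):
    print(r, round(reliable_change_threshold(10, r), 2))
```

    Under these assumptions the lower reliability widens the interval, so a larger observed change is needed before it can be distinguished from measurement error.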

  3. Superior Temporal Activation as a Function of Linguistic Knowledge: Insights from Deaf Native Signers Who Speechread

    ERIC Educational Resources Information Center

    Capek, Cheryl M.; Woll, Bencie; MacSweeney, Mairead; Waters, Dafydd; McGuire, Philip K.; David, Anthony S.; Brammer, Michael J.; Campbell, Ruth

    2010-01-01

    Studies of spoken and signed language processing reliably show involvement of the posterior superior temporal cortex. This region is also reliably activated by observation of meaningless oral and manual actions. In this study we directly compared the extent to which activation in posterior superior temporal cortex is modulated by linguistic…

  4. Exploring Equivalent Forms Reliability Using a Key Stage 2 Reading Test

    ERIC Educational Resources Information Center

    Benton, Tom

    2013-01-01

    This article outlines an empirical investigation into equivalent forms reliability using a case study of a national curriculum reading test. Within the situation being studied, there has been a genuine attempt to create several equivalent forms and so it is of interest to compare the actual behaviour of the relationship between these forms to the…

  5. Assessing Reliability of Student Ratings of Advisor: A Comparison of Univariate and Multivariate Generalizability Approaches.

    ERIC Educational Resources Information Center

    Sun, Anji; Valiga, Michael J.

    In this study, the reliability of the American College Testing (ACT) Program's "Survey of Academic Advising" (SAA) was examined using both univariate and multivariate generalizability theory approaches. The primary purpose of the study was to compare the results of three generalizability theory models (a random univariate model, a mixed…

  6. Validity and reliability of smartphone magnetometer-based goniometer evaluation of shoulder abduction--A pilot study.

    PubMed

    Johnson, Linda B; Sumner, Sean; Duong, Tina; Yan, Posu; Bajcsy, Ruzena; Abresch, R Ted; de Bie, Evan; Han, Jay J

    2015-12-01

    Goniometers are commonly used by physical therapists to measure range-of-motion (ROM) in the musculoskeletal system. These measurements are used to assist in diagnosis and to help monitor treatment efficacy. With newly emerging technologies, smartphone-based applications are being explored for measuring joint angles and movement. This pilot study investigates the intra- and inter-rater reliability as well as concurrent validity of a newly-developed smartphone magnetometer-based goniometer (MG) application for measuring passive shoulder abduction in both sitting and supine positions, and compares it against the traditional universal goniometer (UG). This is a comparative study with repeated measurement design. Three physical therapists utilized both the smartphone MG and a traditional UG to measure various angles of passive shoulder abduction in a healthy subject, whose shoulder was positioned in eight different positions with pre-determined degree of abduction while seated or supine. Each therapist was blinded to the measured angles. Concordance correlation coefficients (CCCs), Bland-Altman plotting methods, and Analysis of Variance (ANOVA) were used for statistical analyses. Both traditional UG and smartphone MG were reliable in repeated measures of standardized joint angle positions (average CCC > 0.997) with similar variability in both measurement tools (standard deviation (SD) ± 4°). Agreement between the UG and MG measurements was greater than 0.99 in all positions. Our results show that the smartphone MG has equivalent reliability compared to the traditional UG when measuring passive shoulder abduction ROM. With concordant measures and comparable reliability to the UG, the newly developed MG application shows potential as a useful tool to assess joint angles. Published by Elsevier Ltd.
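
    The two agreement tools named above can be sketched briefly: Lin's concordance correlation coefficient penalizes both scatter and systematic offset between two instruments, and the Bland-Altman method summarizes the differences as a bias with 95% limits of agreement. The function names and the angle readings are hypothetical and do not come from the pilot study.

```python
from statistics import mean, stdev


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient (1/n variance estimates)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def bland_altman_limits(x, y):
    """Return (bias, lower limit, upper limit) for the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical abduction angles (degrees) from two instruments.
universal = [30, 60, 90, 120]
smartphone = [31, 58, 91, 121]
print(round(concordance_cc(universal, smartphone), 4))
print(bland_altman_limits(universal, smartphone))
```

    Unlike the Pearson coefficient, the concordance coefficient drops below 1 when one instrument reads consistently higher than the other, even if the two track each other perfectly.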

  7. The Experiences in Close Relationship Scale (ECR)-short form: reliability, validity, and factor structure.

    PubMed

    Wei, Meifen; Russell, Daniel W; Mallinckrodt, Brent; Vogel, David L

    2007-04-01

    We developed a 12-item, short form of the Experiences in Close Relationship Scale (ECR; Brennan, Clark, & Shaver, 1998) across 6 studies. In Study 1, we examined the reliability and factor structure of the measure. In Studies 2 and 3, we cross-validated the reliability, factor structure, and validity of the short form measure; whereas in Study 4, we examined test-retest reliability over a 1-month period. In Studies 5 and 6, we further assessed the reliability, factor structure, and validity of the short version of the ECR when administered as a stand-alone instrument. Confirmatory factor analyses indicated that 2 factors, labeled Anxiety and Avoidance, provided a good fit to the data after removing the influence of response sets. We found validity to be equivalent for the short and the original versions of the ECR across studies. Finally, the results were comparable when we embedded the short form within the original version of the ECR and when we administered it as a stand-alone measure.

  8. Notions of reliability: considering the importance of difference in guiding patients to health care Web sites.

    PubMed

    Adams, S A; De Bont, A A

    2003-01-01

    This article analyzes the efforts of three organizations to provide a standard that guides Internet users to reliable health care sites. Comparison of health Internet sites, interviews and document studies. In comparing these approaches, three different constructions of reliability are identified. The resulting possibilities and restrictions of these constructions for users that are searching for health information on the Internet are revealed.

  9. Impact of Rating Scale Categories on Reliability and Fit Statistics of the Malay Spiritual Well-Being Scale using Rasch Analysis.

    PubMed

    Daher, Aqil Mohammad; Ahmad, Syed Hassan; Winn, Than; Selamat, Mohd Ikhsan

    2015-01-01

    Few studies have employed the item response theory in examining reliability. We conducted this study to examine the effect of Rating Scale Categories (RSCs) on the reliability and fit statistics of the Malay Spiritual Well-Being Scale, employing the Rasch model. The Malay Spiritual Well-Being Scale (SWBS), with the original six RSCs and with newly structured three and four RSCs, was distributed randomly among three different samples of 50 participants each. The mean age of respondents in the three samples ranged between 36 and 39 years old. The majority were female in all samples, and Islam was the most prevalent religion among the respondents. The predominating race was Malay, followed by Chinese and Indian. The original six RSCs indicated better targeting of 0.99 and smallest model error of 0.24. The Infit Mnsq (mean square) and Zstd (Z standard) of the six RSCs were "1.1" and "-0.1", respectively. The six RSCs achieved the highest person and item reliabilities of 0.86 and 0.85, respectively. These reliabilities yielded the highest person (2.46) and item (2.38) separation indices compared to the other RSCs. The person and item reliability and, to a lesser extent, the fit statistics were better with the six RSCs compared to the four and three RSCs.
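
    In Rasch analysis the separation index G is conventionally related to reliability R by G = sqrt(R / (1 − R)). The short sketch below applies that relation to the rounded reliabilities quoted above; the function name is hypothetical. The item value reproduces the reported 2.38, while the person value comes out slightly above the reported 2.46, presumably because the published reliability of 0.86 is itself rounded.

```python
import math


def separation_index(reliability):
    """Rasch separation index implied by a reliability coefficient."""
    return math.sqrt(reliability / (1 - reliability))


print(round(separation_index(0.86), 2))  # person reliability from the abstract
print(round(separation_index(0.85), 2))  # item reliability from the abstract
```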

  10. Value of travel-time reliability, part II: a study of tradeoffs between travel reliability, congestion-mitigation strategies and emissions.

    DOT National Transportation Integrated Search

    2012-09-01

    Capacity, demand, and vehicle based emissions reduction strategies are compared for several pollutants employing aggregate US congestion and vehicle fleet condition data. We find that congestion mitigation does not inevitably lead to reduced emissi...

  11. System Architectural Considerations on Reliable Guidance, Navigation, and Control (GN and C) for Constellation Program (CxP) Spacecraft

    NASA Technical Reports Server (NTRS)

    Dennehy, Cornelius J.

    2010-01-01

    This final report summarizes the results of a comparative assessment of the fault tolerance and reliability of different Guidance, Navigation and Control (GN&C) architectural approaches. This study was proactively performed by a combined Massachusetts Institute of Technology (MIT) and Draper Laboratory team as a GN&C "Discipline-Advancing" activity sponsored by the NASA Engineering and Safety Center (NESC). This systematic comparative assessment of GN&C system architectural approaches was undertaken as a fundamental step towards understanding the opportunities for, and limitations of, architecting highly reliable and fault tolerant GN&C systems composed of common avionic components. The primary goal of this study was to obtain architectural 'rules of thumb' that could positively influence future designs in the direction of an optimized (i.e., most reliable and cost-efficient) GN&C system. A secondary goal was to demonstrate the application and the utility of a systematic modeling approach that maps the entire possible architecture solution space.

  12. A systematic review of statistical methods used to test for reliability of medical instruments measuring continuous variables.

    PubMed

    Zaki, Rafdzah; Bulgiba, Awang; Nordin, Noorhaire; Azina Ismail, Noor

    2013-06-01

    Reliability measures precision or the extent to which test results can be replicated. This is the first systematic review to identify statistical methods used to measure the reliability of equipment measuring continuous variables. This study also aims to highlight inappropriate statistical methods used in reliability analysis and their implications for medical practice. In 2010, five electronic databases were searched for reliability studies published between 2007 and 2009. A total of 5,795 titles were initially identified. Only 282 titles were potentially related, and finally 42 fitted the inclusion criteria. The Intra-class Correlation Coefficient (ICC) was the most popular method, used in 25 (60%) studies, followed by comparison of means (8 or 19%). Of the 25 studies using the ICC, only 7 (28%) reported the confidence intervals and types of ICC used. Most studies (71%) also tested the agreement of instruments. This study finds that the Intra-class Correlation Coefficient is the most popular method used to assess the reliability of medical instruments measuring continuous outcomes. There are also inappropriate applications and interpretations of statistical methods in some studies. It is important for medical researchers to be aware of this issue, and be able to correctly perform analysis in reliability studies.

  13. Online Studies on Variation in Orthopedic Surgery: Computed Tomography in MPEG4 Versus DICOM Format.

    PubMed

    Mellema, Jos J; Mallee, Wouter H; Guitton, Thierry G; van Dijk, C Niek; Ring, David; Doornberg, Job N

    2017-10-01

    The purpose of this study was to compare the observer participation and satisfaction as well as interobserver reliability between two online platforms, Science of Variation Group (SOVG) and Traumaplatform Study Collaborative, for the evaluation of complex tibial plateau fractures using computed tomography in MPEG4 and DICOM format. A total of 143 observers started with the online evaluation of 15 complex tibial plateau fractures via either the SOVG or Traumaplatform Study Collaborative websites using MPEG4 videos or a DICOM viewer, respectively. Observers were asked to indicate the absence or presence of four tibial plateau fracture characteristics and to rate their satisfaction with the evaluation as provided by the respective online platforms. The observer participation rate was significantly higher in the SOVG (MPEG4 video) group compared to that in the Traumaplatform Study Collaborative (DICOM viewer) group (75 and 43%, respectively; P < 0.001). The median observer satisfaction with the online evaluation was seven (range, 0-10) using MPEG4 video compared to six (range, 1-9) using DICOM viewer (P = 0.11). The interobserver reliability for recognition of fracture characteristics in complex tibial plateau fractures was higher for the evaluation using MPEG4 video. In conclusion, observer participation and interobserver reliability for the characterization of tibial plateau fractures was greater with MPEG4 videos than with a standard DICOM viewer, while there was no difference in observer satisfaction. Future reliability studies should account for the method of delivering images.

  14. Evaluation of General Classes of Reliability Estimators Often Used in Statistical Analyses of Quasi-Experimental Designs

    NASA Astrophysics Data System (ADS)

    Saini, K. K.; Sehgal, R. K.; Sethi, B. L.

    2008-10-01

    In this paper major reliability estimators are analyzed and their comparative results are discussed. Their strengths and weaknesses are evaluated in this case study. Each of the reliability estimators has certain advantages and disadvantages. Inter-rater reliability is one of the best ways to estimate reliability when your measure is an observation. However, it requires multiple raters or observers. As an alternative, you could look at the correlation of ratings of the same single observer repeated on two different occasions. Each of the reliability estimators will give a different value for reliability. In general, the test-retest and inter-rater reliability estimates will be lower in value than the parallel forms and internal consistency ones because they involve measuring at different times or with different raters. Since reliability estimates are often used in statistical analyses of quasi-experimental designs.

  15. Reliability of the Serbian version of the International Physical Activity Questionnaire for older adults.

    PubMed

    Milanović, Zoran; Pantelić, Saša; Trajković, Nebojša; Jorgić, Bojan; Sporiš, Goran; Bratić, Milovan

    2014-01-01

    The purpose of this study was to determine the test-retest reliability of the International Physical Activity Questionnaire (IPAQ) for older adults in Serbia. Six hundred and sixty older adults (352 men, 53%; 308 women, 47%; mean age 67.65±5.76 years) participated in the study. To examine test-retest reliability, the participants were asked to complete the IPAQ on two occasions 2 weeks apart. Moderate reliability was observed between the repeated IPAQ, with intraclass correlation coefficients ranging from 0.53 to 0.91. The least reliability was established in leisure time activity (0.53) and the most reliability in the transport domain (0.91). Men and women had similar intraclass correlation coefficients for total physical activity (0.71 versus 0.74, respectively), while the biggest difference was obtained for housework in men (0.68) and in women (0.90). Our study shows that the long version of the IPAQ is a reliable instrument for assessing physical activity levels in older adults and that it may be useful for generating internationally comparable data.

  16. Interrater and intrarater reliability of FDI criteria applied to photographs of posterior tooth-colored restorations.

    PubMed

    Kim, Dohyun; Ahn, So-Yeon; Kim, Junyoung; Park, Sung-Ho

    2017-07-01

    Since 2007, the FDI World Dental Federation (FDI) criteria have been used for the clinical evaluation of dental restorations. However, the reliability of the FDI criteria has not been sufficiently addressed. The purpose of this study was to assess and compare the interrater and intrarater reliability of the FDI criteria by evaluating posterior tooth-colored restorations photographically. A total of 160 clinical photographs of posterior tooth-colored restorations were evaluated independently by 5 raters with 9 of the FDI criteria suitable for photographic evaluation. The raters recorded the score of each restoration by using 5 grades, and the score was dichotomized into the clinical evaluation scores. After 1 month, 2 of the raters reevaluated the same set of 160 photographs in random order. To estimate the interrater reliability among the 5 raters, the proportion of agreement was calculated, and the Fleiss multirater kappa statistic was used. For the intrarater reliability, the proportion of agreement was calculated, and the Cohen standard kappa statistic was used for each of the 2 raters. The interrater proportion of agreement was 0.41 to 0.57, and the kappa value was 0.09 to 0.39. Overall, the intrarater reliability was higher than the interrater reliability, and rater 1 demonstrated higher intrarater reliability than rater 2. The proportion of agreement and kappa values increased when the 5 scores were dichotomized. The reliability was relatively lower for the esthetic properties compared with the functional or biological properties. Within the limitations of this study, the FDI criteria presented slight to fair interrater reliability and fair to excellent intrarater reliability in the photographic evaluation of posterior tooth-colored restorations. The reliability was improved by simplifying the evaluation scores. Copyright © 2016 Editorial Council for the Journal of Prosthetic Dentistry. Published by Elsevier Inc. All rights reserved.
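
    The two agreement statistics named in this abstract for the intrarater comparison, the proportion of agreement and Cohen's kappa, can be sketched in a few lines of Python. This is a minimal illustration of the textbook definitions, not the authors' analysis code; the function names and the ratings below are hypothetical and do not come from the study.

```python
from collections import Counter


def proportion_agreement(first, second):
    """Fraction of items given the same score on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohen_kappa(first, second):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two sets of ratings."""
    n = len(first)
    p_o = proportion_agreement(first, second)
    counts_first, counts_second = Counter(first), Counter(second)
    # Chance agreement from the marginal score frequencies.
    p_e = sum(counts_first[k] * counts_second[k] for k in counts_first) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical scores from one rater on two occasions (not study data).
session_1 = [1, 1, 2, 2, 3, 3, 1, 2]
session_2 = [1, 1, 2, 3, 3, 3, 2, 2]
agreement = proportion_agreement(session_1, session_2)
kappa = cohen_kappa(session_1, session_2)
print(agreement, round(kappa, 3))
```

    Because kappa discounts agreement expected by chance, it is lower than the raw proportion of agreement for the same ratings, which matches the pattern of values reported above. The interrater comparison among five raters in the study used the Fleiss multirater kappa, which this sketch does not cover.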

  17. Analysis of the reliability and reproducibility of goniometry compared to hand photogrammetry

    PubMed Central

    de Carvalho, Rosana Martins Ferreira; Mazzer, Nilton; Barbieri, Claudio Henrique

    2012-01-01

    Objective: To evaluate the intra- and inter-examiner reliability and reproducibility of goniometry in relation to photogrammetry of the hand, comparing the angles of thumb abduction, PIP joint flexion of the II finger and MCP joint flexion of the V finger. Methods: The study included 30 volunteers, who were divided into three groups: one group of 10 physiotherapy students, one group of 10 physiotherapists, and a third group of 10 hand therapists. Each examiner performed the measurements on the same hand mold, using the goniometer followed by two photogrammetry software programs: CorelDraw® and ALCimagem®. Results: The results revealed that the groups and the methods proposed presented inter-examiner reliability generally rated as excellent (ICC 0.998, 95% CI 0.995 - 0.999). In the intra-examiner evaluation, an excellent level of reliability was found between the three groups. In the comparison between groups for each angle and each method, no significant differences were found between the groups for most of the measurements. Conclusion: Goniometry and photogrammetry are reliable and reproducible methods for evaluating measurements of the hand. However, due to the lack of similar references, detailed studies are needed to define the normal parameters between the methods in the joints of the hand. Level of Evidence II, Diagnostic Study. PMID:24453594

  18. Validity and reliability of the Diagnostic Adaptive Behaviour Scale.

    PubMed

    Tassé, M J; Schalock, R L; Balboni, G; Spreat, S; Navas, P

    2016-01-01

    The Diagnostic Adaptive Behaviour Scale (DABS) is a new standardised adaptive behaviour measure that provides information for evaluating limitations in adaptive behaviour for the purpose of determining a diagnosis of intellectual disability. This article presents validity evidence and reliability data for the DABS. Validity evidence was based on comparing DABS scores with scores obtained on the Vineland Adaptive Behaviour Scale, second edition. The stability of the test scores was measured using a test and retest, and inter-rater reliability was assessed by computing the inter-respondent concordance. The DABS convergent validity coefficients ranged from 0.70 to 0.84, while the test-retest reliability coefficients ranged from 0.78 to 0.95, and the inter-rater concordance as measured by intraclass correlation coefficients ranged from 0.61 to 0.87. All obtained validity and reliability indicators were strong and comparable with the validity and reliability coefficients of the most commonly used adaptive behaviour instruments. These results and the advantages of the DABS for clinician and researcher use are discussed. © 2015 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.

  19. Absolute and relative reliability of acute effects of aerobic exercise on executive function in seniors.

    PubMed

    Donath, Lars; Ludyga, Sebastian; Hammes, Daniel; Rossmeissl, Anja; Andergassen, Nadin; Zahner, Lukas; Faude, Oliver

    2017-10-25

    Aging is accompanied by a decline of executive function. Aerobic exercise training induces moderate improvements of cognitive domains (i.e., attention, processing, executive function, memory) in seniors. Most conclusive data are obtained from studies with dementia or cognitive impairment. Confident detection of exercise training effects requires adequate between-day reliability and low day-to-day variability obtained from acute studies, respectively. These absolute and relative reliability measures have not yet been examined for a single aerobic training session in seniors. Twenty-two healthy and physically active seniors (age: 69 ± 3 y, BMI: 24.8 ± 2.2, VO2peak: 32 ± 6 mL/kg/bodyweight) were enrolled in this randomized controlled cross-over study. A repeated between-day comparison [i.e., day 1 (habituation) vs. day 2 & day 2 vs. day 3] of executive function testing (Eriksen-Flanker-Test, Stroop-Color-Test, Digit-Span, Five-Point-Test) before and after aerobic cycling exercise at 70% of the heart rate reserve [0.7 × (HRmax - HRrest)] was conducted. Reliability measures were calculated for pre, post and change scores. Large between-day differences between day 1 and 2 were found for reaction times (Flanker- and Stroop Color testing) and completed figures (Five-Point test) at pre and post testing (0.002 < p < 0.05, 0.16 < ηp² < 0.38). These differences notably declined when comparing day 2 and 3. Absolute between-day variability (CoV) dropped from 10 to 5% when comparing day 2 vs. day 3 instead of day 1 vs. day 2. Also ICC ranges increased from day 1 vs. day 2 (0.65 < ICC < 0.87) to day 2 vs. day 3 (0.40 < ICC < 0.93). Interestingly, reliability measures for pre-post change scores were low (0.02 < ICC < 0.71). These data did not improve when comparing day 2 with day 3. During inhibition tests, reaction times showed excellent reliability values compared to the poor to fair reliability of accuracy. Notable habituation to the whole testing procedure should be considered as it increased the reliability of different executive function tests. Change scores of executive function after acute aerobic exercise cannot be detected reliably. Large intra- and inter-individual variability of responses to acute aerobic exercise in seniors can be presumed.

  20. Comparing reliabilities of strip and conventional patch testing.

    PubMed

    Dickel, Heinrich; Geier, Johannes; Kreft, Burkhard; Pfützner, Wolfgang; Kuss, Oliver

    2017-06-01

    The standardized protocol for performing the strip patch test has proven to be valid, but evidence on its reliability is still missing. To estimate the parallel-test reliability of the strip patch test as compared with the conventional patch test. In this multicentre, prospective, randomized, investigator-blinded reliability study, 132 subjects were enrolled. Simultaneous duplicate strip and conventional patch tests were performed with the Finn Chambers® on Scanpor® tape test system and the patch test preparations nickel sulfate 5% pet., potassium dichromate 0.5% pet., and lanolin alcohol 30% pet. Reliability was estimated by the use of Cohen's kappa coefficient. Parallel-test reliability values of the three standard patch test preparations turned out to be acceptable, with slight advantages for the strip patch test. The differences in reliability were 9% (95%CI: -8% to 26%) for nickel sulfate and 23% (95%CI: -16% to 63%) for potassium dichromate, both favouring the strip patch test. The standardized strip patch test method for the detection of allergic contact sensitization in patients with suspected allergic contact dermatitis is reliable. Its application in routine clinical practice can be recommended, especially if the conventional patch test result is presumably false negative. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  1. A simulation model for risk assessment of turbine wheels

    NASA Technical Reports Server (NTRS)

    Safie, Fayssal M.; Hage, Richard T.

    1991-01-01

    A simulation model has been successfully developed to evaluate the risk of the Space Shuttle auxiliary power unit (APU) turbine wheels for a specific inspection policy. Besides being an effective tool for risk/reliability evaluation, the simulation model also allows the analyst to study the trade-offs between wheel reliability, wheel life, inspection interval, and rejection crack size. For example, in the APU application, sensitivity analysis results showed that the wheel life limit has the least effect on wheel reliability when compared to the effect of the inspection interval and the rejection crack size. In summary, the simulation model developed represents a flexible tool to predict turbine wheel reliability and study the risk under different inspection policies.

  2. A simulation model for risk assessment of turbine wheels

    NASA Astrophysics Data System (ADS)

    Safie, Fayssal M.; Hage, Richard T.

    A simulation model has been successfully developed to evaluate the risk of the Space Shuttle auxiliary power unit (APU) turbine wheels for a specific inspection policy. Besides being an effective tool for risk/reliability evaluation, the simulation model also allows the analyst to study the trade-offs between wheel reliability, wheel life, inspection interval, and rejection crack size. For example, in the APU application, sensitivity analysis results showed that the wheel life limit has the least effect on wheel reliability when compared to the effect of the inspection interval and the rejection crack size. In summary, the simulation model developed represents a flexible tool to predict turbine wheel reliability and study the risk under different inspection policies.

  3. Reliability of the "Ten Test" for assessment of discriminative sensation in hand trauma.

    PubMed

    Berger, Michael J; Regan, William R; Seal, Alex; Bristol, Sean G

    2016-10-01

    "Ten Test" (TT) is a bedside measure of discriminative sensation, whereby the magnitude of abnormal sensation to moving light touch is normalized to an area of normal sensation on an 11-point Likert scale (0-10). The purposes of this study were to determine reliability parameters of the TT in a cohort of patients presenting to a hand trauma clinic with subjectively altered sensation post-injury and to compare the reliability of TT to that of the Weinstein Enhanced Sensory Test (WEST). Study participants (n = 29, mean age = 37 ± 12) comprised patients presenting to an outpatient hand trauma clinic with recent hand trauma and self-reported abnormal sensation. Participants underwent TT and WEST by two separate raters on the same day. Interrater reliability, response stability and responsiveness of each test were determined by the intraclass correlation coefficient (ICC[2,1]), standard error of measurement (SEM) with 95% confidence intervals (CI) and minimal detectable difference score with 95% CI (MDD95), respectively. The TT displayed excellent interrater reliability (ICC = 0.95, 95% CI 0.89-0.97) compared to good reliability for WEST (ICC = 0.78, 95% CI 0.58-0.89). The range of true scores expected with 95% confidence based on the SEM (i.e. response stability) was ±1.1 for TT and ±1.1 for WEST. MDD95 scores reflecting test responsiveness were 1.5 and 1.6 for TT and WEST, respectively. The TT displayed excellent reliability parameters in this patient population. Reliability parameters were stronger for TT compared to WEST. These results provide support for the use of TT as a component of the sensory exam in hand trauma. Copyright © 2016 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.
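
    The abstract reports a 95% band around the SEM and an MDD95 without giving formulas. A minimal Python sketch follows, assuming the conventional definitions SEM = SD × sqrt(1 - ICC), 95% band = 1.96 × SEM and MDD95 = 1.96 × sqrt(2) × SEM; these definitions, the function names and the constant Z95 are assumptions for illustration, not details taken from the study.

```python
import math

Z95 = 1.96  # two-sided 95% normal quantile (assumed convention)


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and a reliability ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdd95_from_sem(sem: float) -> float:
    """Minimal detectable difference at 95% confidence."""
    return Z95 * math.sqrt(2.0) * sem


def mdd95_from_band(half_width: float) -> float:
    """MDD95 when only the 95% band (Z95 * SEM) is known."""
    return math.sqrt(2.0) * half_width


# The abstract gives a 95% band of +/-1.1 for both tests.
mdd = mdd95_from_band(1.1)
print(round(mdd, 2))
```

    Under these assumptions a ±1.1 band corresponds to an MDD95 of about 1.56, which is in line with the reported 1.5 (TT) and 1.6 (WEST) once rounding of the band is allowed for.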

  4. Japanese Adaptation of the Stroke and Aphasia Quality of Life Scale-39 (SAQOL-39): Comparative Study among Different Types of Aphasia.

    PubMed

    Kamiya, Akane; Kamiya, Kentaro; Tatsumi, Hiroshi; Suzuki, Makihiko; Horiguchi, Satoshi

    2015-11-01

    We have developed a Japanese version of the Stroke and Aphasia Quality of Life Scale-39 (SAQOL-39), designated as SAQOL-39-J, and used psychometric methods to examine its acceptability and reliability. The acceptability and reliability of SAQOL-39-J, which was developed from the English version using a standard translation and back-translation method, were examined in 54 aphasia patients using standard psychometric methods. The acceptability and reliability of SAQOL-39-J were then compared among patients with different types of aphasia. SAQOL-39-J showed good acceptability, internal consistency (Cronbach's α score = .90), and test-retest reliability (intraclass correlation coefficient = .97). Broca's aphasia patients showed the lowest total scores and communication scores on SAQOL-39-J. The Japanese version of SAQOL-39, SAQOL-39-J, provides acceptable and reliable data in Japanese stroke patients with aphasia. Among different types of aphasia, Broca's aphasia patients had the lowest total and communication SAQOL-39-J scores. Further studies are needed to assess the effectiveness of health care interventions on health-related quality of life in this population. Copyright © 2015 National Stroke Association. Published by Elsevier Inc. All rights reserved.

  5. Effects of test method and participant musical training on preference ratings of stimuli with different reverberation times.

    PubMed

    Lawless, Martin S; Vigeant, Michelle C

    2017-10-01

    Selecting an appropriate listening test design for concert hall research depends on several factors, including listening test method and participant critical-listening experience. Although expert listeners afford more reliable data, their perceptions may not be broadly representative. The present paper contains two studies that examined the validity and reliability of the data obtained from two listening test methods, a successive and a comparative method, and two types of participants, musicians and non-musicians. Participants rated their overall preference of auralizations generated from eight concert hall conditions with a range of reverberation times (0.0-7.2 s). Study 1, with 34 participants, assessed the two methods. The comparative method yielded similar results and reliability as the successive method. Additionally, the comparative method was rated as less difficult and more preferable. For study 2, an additional 37 participants rated the stimuli using the comparative method only. An analysis of variance of the responses from both studies revealed that musicians are better than non-musicians at discerning their preferences across stimuli. This result was confirmed with a k-means clustering analysis on the entire dataset that revealed five preference groups. Four groups exhibited clear preferences to the stimuli, while the fifth group, predominantly comprising non-musicians, demonstrated no clear preference.

  6. Inter-Observer Reliability of DSM-5 Substance Use Disorders

    PubMed Central

    Denis, Cécile M.; Gelernter, Joel; Hart, Amy B.; Kranzler, Henry R.

    2015-01-01

    Aims: Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence of the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Methods: Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Results: Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. Conclusions: For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. PMID:26048641

  7. Test-Retest Reliability of the Salutogenic Wellness Promotion Scale (SWPS)

    ERIC Educational Resources Information Center

    Anderson, L. M.; Moore, J. B.; Hayden, B. M.; Becker, C. M.

    2014-01-01

    Objective: This study examined the temporal stability (i.e. test-retest reliability) of the Salutogenic Wellness Promotion Scale (SWPS) using intraclass correlation coefficients (ICC). Current intraclass results were also compared to previously published interclass correlations to support the use of the intraclass method for test-retest…

  8. A tonic heat test stimulus yields a larger and more reliable conditioned pain modulation effect compared to a phasic heat test stimulus

    PubMed Central

    Lie, Marie Udnesseter; Matre, Dagfinn; Hansson, Per; Stubhaug, Audun; Zwart, John-Anker; Nilsen, Kristian Bernhard

    2017-01-01

    Introduction: The interest in conditioned pain modulation (CPM) as a clinical tool for measuring endogenously induced analgesia is increasing. There is, however, large variation in the CPM methodology, hindering comparison of results across studies. Research comparing different CPM protocols is needed in order to obtain a standardized test paradigm. Objectives: The aim of the study was to assess whether a protocol with phasic heat stimuli as test-stimulus is preferable to a protocol with tonic heat stimulus as test-stimulus. Methods: In this experimental crossover study, we compared 2 CPM protocols with different test-stimuli: one with a tonic test-stimulus (constant heat stimulus of 120-second duration) and one with phasic test-stimuli (3 heat stimulations of 5 seconds duration separated by 10 seconds). The conditioning stimulus was a 7°C water bath in parallel with the test-stimulus. Twenty-four healthy volunteers were assessed on 2 occasions at least 1 week apart. Differences in the magnitude and test-retest reliability of the CPM effect in the 2 protocols were investigated with repeated-measures analysis of variance and by relative and absolute reliability indices. Results: The protocol with tonic test-stimulus induced a significantly larger CPM effect compared to the protocol with phasic test-stimuli (P < 0.001). Fair and good relative reliability was found with the phasic and tonic test-stimuli, respectively. Absolute reliability indices showed large intraindividual variability from session to session in both protocols. Conclusion: The present study shows that a CPM protocol with a tonic test-stimulus is preferable to a protocol with phasic test-stimuli. However, we emphasize that one should be cautious about using the CPM effect as a biomarker or in clinical decision making on an individual level due to large intraindividual variability. PMID:29392240

  9. A Comparison of Reliability and Construct Validity between the Original and Revised Versions of the Rosenberg Self-Esteem Scale

    PubMed Central

    Nahathai, Wongpakaran

    2012-01-01

    Objective: The Rosenberg Self-Esteem Scale (RSES) is a widely used instrument that has been tested for reliability and validity in many settings; however, some negative-worded items appear to have caused it to reveal low reliability in a number of studies. In this study, we revised one negative item that had previously (from the previous studies) produced the worst outcome in terms of the structure of the scale, then re-analyzed the new version for its reliability and construct validity, comparing it to the original version with respect to fit indices. Methods: In total, 851 students from Chiang Mai University (mean age: 19.51±1.7, 57% of whom were female), participated in this study. Of these, 664 students completed the Thai version of the original RSES - containing five positively worded and five negatively worded items, while 187 students used the revised version containing six positively worded and four negatively worded items. Confirmatory factor analysis was applied, using a uni-dimensional model with method effects and a correlated uniqueness approach. Results: The revised version showed the same level of reliability (good) as the original, but yielded a better model fit. The revised RSES demonstrated excellent fit statistics, with χ²=29.19 (df=19, n=187, p=0.063), GFI=0.970, TFI=0.969, NFI=0.964, CFI=0.987, SRMR=0.040 and RMSEA=0.054. Conclusion: The revised version of the Thai RSES demonstrated an equivalent level of reliability but a better construct validity when compared to the original. PMID:22396685

  10. A comparison of reliability and construct validity between the original and revised versions of the Rosenberg Self-Esteem Scale.

    PubMed

    Wongpakaran, Tinakon; Wongpakaran, Nahathai

    2012-03-01

    The Rosenberg Self-Esteem Scale (RSES) is a widely used instrument that has been tested for reliability and validity in many settings; however, some negative-worded items appear to have caused it to reveal low reliability in a number of studies. In this study, we revised one negative item that had previously (from the previous studies) produced the worst outcome in terms of the structure of the scale, then re-analyzed the new version for its reliability and construct validity, comparing it to the original version with respect to fit indices. In total, 851 students from Chiang Mai University (mean age: 19.51±1.7, 57% of whom were female), participated in this study. Of these, 664 students completed the Thai version of the original RSES - containing five positively worded and five negatively worded items, while 187 students used the revised version containing six positively worded and four negatively worded items. Confirmatory factor analysis was applied, using a uni-dimensional model with method effects and a correlated uniqueness approach. The revised version showed the same level of reliability (good) as the original, but yielded a better model fit. The revised RSES demonstrated excellent fit statistics, with χ²=29.19 (df=19, n=187, p=0.063), GFI=0.970, TFI=0.969, NFI=0.964, CFI=0.987, SRMR=0.040 and RMSEA=0.054. The revised version of the Thai RSES demonstrated an equivalent level of reliability but a better construct validity when compared to the original.

  11. The Children's Play Therapy Instrument (CPTI). Description, development, and reliability studies.

    PubMed

    Kernberg, P F; Chazan, S E; Normandin, L

    1998-01-01

    The Children's Play Therapy Instrument (CPTI), its development, and reliability studies are described. The CPTI is a new instrument to examine a child's play activity in individual psychotherapy. Three independent raters used the CPTI to rate eight videotaped play therapy vignettes. Results were compared with the authors' consensual scores from a preliminary study. Generally good to excellent levels of interrater reliability were obtained for the independent raters on intraclass correlation coefficients for ordinal categories of the CPTI. Likewise, kappa levels were acceptable to excellent for nominal categories of the scale. The CPTI holds promise to become a reliable measure of play activity in child psychotherapy. Further research is needed to assess discriminant validity of the CPTI for use as a diagnostic tool and as a measure of process and outcome.

  12. Reliability based design optimization: Formulations and methodologies

    NASA Astrophysics Data System (ADS)

    Agarwal, Harish

    Modern products ranging from simple components to complex systems should be designed to be optimal and reliable. The challenge of modern engineering is to ensure that manufacturing costs are reduced and design cycle times are minimized while achieving requirements for performance and reliability. If the market for the product is competitive, improved quality and reliability can generate very strong competitive advantages. Simulation based design plays an important role in designing almost any kind of automotive, aerospace, and consumer products under these competitive conditions. Single discipline simulations used for analysis are being coupled together to create complex coupled simulation tools. This investigation focuses on the development of efficient and robust methodologies for reliability based design optimization in a simulation based design environment. Original contributions of this research are the development of a novel efficient and robust unilevel methodology for reliability based design optimization, the development of an innovative decoupled reliability based design optimization methodology, the application of homotopy techniques in unilevel reliability based design optimization methodology, and the development of a new framework for reliability based design optimization under epistemic uncertainty. The unilevel methodology for reliability based design optimization is shown to be mathematically equivalent to the traditional nested formulation. Numerical test problems show that the unilevel methodology can reduce computational cost by at least 50% as compared to the nested approach. The decoupled reliability based design optimization methodology is an approximate technique to obtain consistent reliable designs at lesser computational expense. Test problems show that the methodology is computationally efficient compared to the nested approach. A framework for performing reliability based design optimization under epistemic uncertainty is also developed. A trust region managed sequential approximate optimization methodology is employed for this purpose. Results from numerical test studies indicate that the methodology can be used for performing design optimization under severe uncertainty.

  13. [Training of interviewers in the utilization of standardized questionnaires in psychiatry: studies realized with the Present State Examination (PSE)].

    PubMed

    Lesage, A D; Cyr, M; Toupin, J; Cormier, H; Valiquette, C

    1991-01-01

    Interview questionnaires offer more validity than self-administered formats in exploring psychopathological or psychosocial phenomena of interest in psychiatric research. When they are used, special care must be given to training interviewers and to ensuring that they maintain their reliability. No widespread training standards exist, and each schedule may carry its own procedure. Our aims are to indicate how we trained interviewers with the French version of the Present State Examination (Wing, Cooper and Sartorius, 1974) and how we checked and maintained acceptable interrater reliability during one study. We will provide data on interrater reliability during the training and the study, as well as on test-retest reliability. These results will be used to support some guidelines for using this sort of psychiatric research questionnaire, in order to ensure comparability both within the study and between studies.

  14. Intersession reliability of fMRI activation for heat pain and motor tasks

    PubMed Central

    Quiton, Raimi L.; Keaser, Michael L.; Zhuo, Jiachen; Gullapalli, Rao P.; Greenspan, Joel D.

    2014-01-01

    As the practice of conducting longitudinal fMRI studies to assess mechanisms of pain-reducing interventions becomes more common, there is a great need to assess the test–retest reliability of the pain-related BOLD fMRI signal across repeated sessions. This study quantitatively evaluated the reliability of heat pain-related BOLD fMRI brain responses in healthy volunteers across 3 sessions conducted on separate days using two measures: (1) intraclass correlation coefficients (ICC) calculated based on signal amplitude and (2) spatial overlap. The ICC analysis of pain-related BOLD fMRI responses showed fair-to-moderate intersession reliability in brain areas regarded as part of the cortical pain network. Areas with the highest intersession reliability based on the ICC analysis included the anterior midcingulate cortex, anterior insula, and second somatosensory cortex. Areas with the lowest intersession reliability based on the ICC analysis also showed low spatial reliability; these regions included pregenual anterior cingulate cortex, primary somatosensory cortex, and posterior insula. Thus, this study found regional differences in pain-related BOLD fMRI response reliability, which may provide useful information to guide longitudinal pain studies. A simple motor task (finger-thumb opposition) was performed by the same subjects in the same sessions as the painful heat stimuli were delivered. Intersession reliability of fMRI activation in cortical motor areas was comparable to previously published findings for both spatial overlap and ICC measures, providing support for the validity of the analytical approach used to assess intersession reliability of pain-related fMRI activation. A secondary finding of this study is that the use of standard ICC alone as a measure of reliability may not be sufficient, as the underlying variance structure of an fMRI dataset can result in inappropriately high ICC values; a method to eliminate these false positive results was used in this study and is recommended for future studies of test–retest reliability. PMID:25161897

  15. Reliability in the DSM-III field trials: interview v case summary.

    PubMed

    Hyler, S E; Williams, J B; Spitzer, R L

    1982-11-01

    A study compared the reliability of psychiatric diagnoses obtained from the live interviews and from case summaries, on the same patients, by the same clinicians, using the same DSM-III diagnostic criteria. The results showed that the reliability of the major diagnostic classes of DSM-III was higher when diagnoses were made from live interviews than when they were made from case summaries. We conclude that diagnoses based on information contained in traditionally prepared case summaries may lead to an underestimation of the reliability of diagnoses made based on information collected during a "live" interview.

  16. Time to competency, reliability of flexible transnasal laryngoscopy by training level: a pilot study.

    PubMed

    Brook, Christopher D; Platt, Michael P; Russell, Kimberly; Grillone, Gregory A; Aliphas, Avner; Noordzij, J Pieter

    2015-05-01

    To determine the progression of flexible transnasal laryngoscopy reliability and competency in otolaryngology residency training. Prospective case control study. Academic otolaryngology department. Medical students, otolaryngology residents, and otolaryngology attending physicians. Fourteen otolaryngology residents from PGY-1 to PGY-5 and 3 attending otolaryngologists viewed 25 selected and digitally recorded flexible transnasal laryngoscopies. The evaluators were asked to rate 13 items relating to abnormalities in the oropharynx, hypopharynx, larynx, and subglottis. The level of concern and level of comfort with the diagnosis were assessed. Intraclass correlations were calculated for each topic and by level of training to determine reliability within each class and compare competency versus attending interpretations. Intraclass correlation of residents compared to attending physicians demonstrated significant improvements by year for left and right vocal fold immobility, subglottic stenosis, laryngeal mass, left and right vocal cord abnormalities, and level of concern. Additionally, pooled vocal cord mobility and pooled results in categories with good attending reliability demonstrated stepwise improvement as well. For these categories, resident reliability was found to be statistically similar to attending physicians in all categories by PGY-3. There were no trends for base of tongue abnormalities, pharyngeal abnormalities, and pharyngeal and hypopharyngeal masses. Resident competency for flexible transnasal laryngoscopy progresses during residency to reliability with attending otolaryngologists by the PGY-3 year over key facets of the examination. © American Academy of Otolaryngology-Head and Neck Surgery Foundation 2015.

  17. Reliability and validity of an audio signal modified shuttle walk test.

    PubMed

    Singla, Rupak; Rai, Richa; Faye, Abhishek Anil; Jain, Anil Kumar; Chowdhury, Ranadip; Bandyopadhyay, Debdutta

    2017-01-01

    The audio signal in the conventionally accepted protocol of the shuttle walk test (SWT) is not well understood by patients, and modification of the audio signal may improve the performance of the test. The aim of this study is to assess the validity and reliability of an audio signal modified SWT, called the Singla-Richa modified SWT (SWTSR), in healthy normal adults. In the SWTSR, the audio signal was modified by adding reverse counting to it. A total of 54 healthy normal adults underwent the conventional SWT (CSWT) once and the SWTSR twice on the same day. Validity was assessed by comparing outcomes of the SWTSR to outcomes of the CSWT using the Pearson correlation coefficient and a Bland-Altman plot. Test-retest reliability of the SWTSR was assessed using the intraclass correlation coefficient (ICC). The acceptability of the modified test in comparison to the conventional test was assessed using a Likert scale. The distance walked (mean ± standard deviation) in the CSWT and SWTSR was 853.33 ± 217.33 m and 857.22 ± 219.56 m, respectively (Pearson correlation coefficient = 0.98; P < 0.001), indicating the SWTSR to be a valid test. The SWTSR was found to be a reliable test with an ICC of 0.98 (95% confidence interval: 0.97-0.99). The acceptability of the SWTSR was significantly higher than that of the CSWT. The SWTSR, with its audio signal modified by reverse counting, is a reliable as well as a valid test when compared with the CSWT in healthy normal adults. It is better understood by subjects than the CSWT.
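
    The Bland-Altman comparison mentioned in this abstract can be sketched as follows. This is a minimal illustration of the usual definition (mean difference plus or minus 1.96 sample standard deviations of the paired differences), not the authors' analysis; the function name and the distances are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    diffs = [b - a for a, b in zip(method_a, method_b)]
    bias = mean(diffs)               # mean difference between the two methods
    spread = 1.96 * stdev(diffs)     # 1.96 x sample SD of the differences
    return bias, bias - spread, bias + spread


# Hypothetical walking distances in metres (NOT data from the study).
cswt = [850.0, 900.0, 780.0, 1020.0]
swtsr = [860.0, 890.0, 790.0, 1010.0]

bias, lower, upper = bland_altman(cswt, swtsr)
print(f"bias = {bias:.2f} m, limits of agreement = ({lower:.2f}, {upper:.2f}) m")
```

    A bias near zero with narrow limits of agreement would support treating the two protocols as interchangeable; a high Pearson correlation alone does not show this, which is why the two are typically reported together.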

  18. Arm cranking versus wheelchair propulsion for testing aerobic fitness in children with spina bifida who are wheelchair dependent.

    PubMed

    Bloemen, Manon A T; de Groot, Janke F; Backx, Frank J G; Westerveld, Rosalyne A; Takken, Tim

    2015-05-01

    To determine the best test performance and feasibility using a Graded Arm Cranking Test vs a Graded Wheelchair Propulsion Test in young people with spina bifida who use a wheelchair, and to determine the reliability of the best test. Validity and reliability study. Young people with spina bifida who use a wheelchair. Physiological responses were measured during a Graded Arm Cranking Test and a Graded Wheelchair Propulsion Test using a heart rate monitor and calibrated mobile gas analysis system (Cortex Metamax). For validity, peak oxygen uptake (VO2peak) and peak heart rate (HRpeak) were compared using paired t-tests. For reliability, the intra-class correlation coefficients, standard error of measurement, and standard detectable change were calculated. VO2peak and HRpeak were higher during wheelchair propulsion compared with arm cranking (23.1 vs 19.5 ml/kg/min, p = 0.11; 165 vs 150 beats/min, p < 0.05). Reliability of wheelchair propulsion showed high intra-class correlation coefficients (ICCs) for both VO2peak (ICC = 0.93) and HRpeak (ICC = 0.90). This pilot study shows higher HRpeak and a tendency to higher VO2peak in young people with spina bifida who are using a wheelchair when tested during wheelchair propulsion compared with arm cranking. Wheelchair propulsion showed good reliability. We recommend performing a wheelchair propulsion test for aerobic fitness testing in this population.
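
    The abstract reports ICCs together with a standard error of measurement and a detectable change but does not state the formulas used. A minimal sketch, assuming the commonly used definitions SEM = SD x sqrt(1 - ICC) and detectable change = 1.96 x sqrt(2) x SEM, is shown below. The between-subject SD is a hypothetical value; only the ICC of 0.93 comes from the abstract.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), a commonly used definition (assumed here)."""
    return sd * math.sqrt(1.0 - icc)


def detectable_change(sem, z=1.96):
    """Detectable change at roughly 95% confidence: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem


# ICC taken from the abstract (VO2peak during wheelchair propulsion);
# the SD of 5.0 ml/kg/min is a hypothetical value for illustration only.
sem_vo2 = standard_error_of_measurement(sd=5.0, icc=0.93)
sdc_vo2 = detectable_change(sem_vo2)
print(f"SEM = {sem_vo2:.2f} ml/kg/min, detectable change = {sdc_vo2:.2f} ml/kg/min")
```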

  19. Towards early software reliability prediction for computer forensic tools (case study).

    PubMed

    Abu Talib, Manar

    2016-01-01

    Versatility, flexibility and robustness are essential requirements for software forensic tools. Researchers and practitioners need to put more effort into assessing this type of tool. A Markov model is a robust means for analyzing and anticipating the functioning of an advanced component based system. It is used, for instance, to analyze the reliability of the state machines of real time reactive systems. This research extends the architecture-based software reliability prediction model for computer forensic tools, which is based on Markov chains and COSMIC-FFP. Basically, every part of the computer forensic tool is linked to a discrete time Markov chain. If this can be done, then a probabilistic analysis by Markov chains can be performed to analyze the reliability of the components and of the whole tool. The purposes of the proposed reliability assessment method are to evaluate the tool's reliability in the early phases of its development, to improve the reliability assessment process for large computer forensic tools over time, and to compare alternative tool designs. The reliability analysis can assist designers in choosing the most reliable topology for the components, which can maximize the reliability of the tool and meet the expected reliability level specified by the end-user. The approach of assessing component-based tool reliability in the COSMIC-FFP context is illustrated with the Forensic Toolkit Imager case study.
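
    The idea of linking components to a discrete time Markov chain can be illustrated with a generic architecture-based calculation. The sketch below is not the model from this study and does not involve COSMIC-FFP: it assumes each component has a reliability and a set of transition probabilities to the next component, and it computes the probability of reaching and completing the final component without failure. All names and numbers are hypothetical.

```python
def system_reliability(transitions, reliabilities, final, start=0,
                       tol=1e-12, max_iter=10_000):
    """Probability that execution starting at `start` finishes `final` without failure.

    transitions[i] maps component i to {next_component: probability}.
    reliabilities[i] is the probability that component i executes correctly.
    Solved by fixed-point iteration on
        x_i = R_i * sum_j P_ij * x_j   (x_final = R_final).
    """
    n = len(reliabilities)
    x = [0.0] * n
    for _ in range(max_iter):
        new_x = []
        for i in range(n):
            if i == final:
                new_x.append(reliabilities[i])
            else:
                reach = sum(p * x[j] for j, p in transitions.get(i, {}).items())
                new_x.append(reliabilities[i] * reach)
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return x[start]


# Hypothetical three-component tool: component 0 hands control to
# component 1 (60%) or directly to component 2 (40%); component 2 is final.
transitions = {0: {1: 0.6, 2: 0.4}, 1: {2: 1.0}}
reliabilities = [0.99, 0.98, 0.97]

tool_reliability = system_reliability(transitions, reliabilities, final=2)
print(f"estimated tool reliability: {tool_reliability:.7f}")
```

    Changing the transition probabilities or swapping in a more reliable component and re-running the calculation is one simple way to compare alternative designs, which is the kind of use the abstract describes.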

  20. Comparative Reliability of Structured Versus Unstructured Interviews in the Admission Process of a Residency Program

    PubMed Central

    Blouin, Danielle; Day, Andrew G.; Pavlov, Andrey

    2011-01-01

    Background: Although never directly compared, structured interviews are reported as being more reliable than unstructured interviews. This study compared the reliability of both types of interview when applied to a common pool of applicants for positions in an emergency medicine residency program. Methods: In 2008, one structured interview was added to the two unstructured interviews traditionally used in our resident selection process. A formal job analysis using the critical incident technique guided the development of the structured interview tool. This tool consisted of 7 scenarios assessing 4 of the domains deemed essential for success as a resident in this program. The traditional interview tool assessed 5 general criteria. In addition to these criteria, the unstructured panel members were asked to rate each candidate on the same 4 essential domains rated by the structured panel members. All 3 panels interviewed all candidates. Main outcomes were the overall, interitem, and interrater reliabilities, the correlations between interview panels, and the dimensionality of each interview tool. Results: Thirty candidates were interviewed. The overall reliability reached 0.43 for the structured interview, and 0.81 and 0.71 for the unstructured interviews. Analyses of the variance components showed a high interrater, low interitem reliability for the structured interview, and a high interrater, high interitem reliability for the unstructured interviews. The summary measures from the 2 unstructured interviews were significantly correlated, but neither was correlated with the structured interview. Only the structured interview was multidimensional. Conclusions: A structured interview did not yield a higher overall reliability than both unstructured interviews. The lower reliability is explained by a lower interitem reliability, which in turn is due to the multidimensionality of the interview tool. Both unstructured panels consistently rated a single dimension, even when prompted to assess the 4 specific domains established as essential to succeed in this residency program. PMID:23205201

  1. Comparative reliability of structured versus unstructured interviews in the admission process of a residency program.

    PubMed

    Blouin, Danielle; Day, Andrew G; Pavlov, Andrey

    2011-12-01

    Although never directly compared, structured interviews are reported as being more reliable than unstructured interviews. This study compared the reliability of both types of interview when applied to a common pool of applicants for positions in an emergency medicine residency program. In 2008, one structured interview was added to the two unstructured interviews traditionally used in our resident selection process. A formal job analysis using the critical incident technique guided the development of the structured interview tool. This tool consisted of 7 scenarios assessing 4 of the domains deemed essential for success as a resident in this program. The traditional interview tool assessed 5 general criteria. In addition to these criteria, the unstructured panel members were asked to rate each candidate on the same 4 essential domains rated by the structured panel members. All 3 panels interviewed all candidates. Main outcomes were the overall, interitem, and interrater reliabilities, the correlations between interview panels, and the dimensionality of each interview tool. Thirty candidates were interviewed. The overall reliability reached 0.43 for the structured interview, and 0.81 and 0.71 for the unstructured interviews. Analyses of the variance components showed a high interrater, low interitem reliability for the structured interview, and a high interrater, high interitem reliability for the unstructured interviews. The summary measures from the 2 unstructured interviews were significantly correlated, but neither was correlated with the structured interview. Only the structured interview was multidimensional. A structured interview did not yield a higher overall reliability than both unstructured interviews. The lower reliability is explained by a lower interitem reliability, which in turn is due to the multidimensionality of the interview tool. Both unstructured panels consistently rated a single dimension, even when prompted to assess the 4 specific domains established as essential to succeed in this residency program.

  2. Assessing the environmental characteristics of cycling routes to school: a study on the reliability and validity of a Google Street View-based audit.

    PubMed

    Vanwolleghem, Griet; Van Dyck, Delfien; Ducheyne, Fabian; De Bourdeaudhuij, Ilse; Cardon, Greet

    2014-06-10

    Google Street View provides a valuable and efficient alternative to observe the physical environment compared to on-site fieldwork. However, studies on the use, reliability and validity of Google Street View in a cycling-to-school context are lacking. We aimed to study the intra- and inter-rater reliability and criterion validity of EGA-Cycling (Environmental Google Street View Based Audit - Cycling to school), a newly developed audit using Google Street View to assess the physical environment along cycling routes to school. Parents (n = 52) of 11-to-12-year-old Flemish children, who mostly cycled to school, completed a questionnaire and identified their child's cycling route to school on a street map. Fifty cycling routes of 11-to-12-year-olds were identified and physical environmental characteristics along the identified routes were rated with EGA-Cycling (5 subscales; 37 items), based on Google Street View. To assess reliability, two researchers performed the audit. Criterion validity of the audit was examined by comparing the ratings based on Google Street View with ratings through on-site assessments. Intra-rater reliability was high (kappa range 0.47-1.00). Large variations in the inter-rater reliability (kappa range -0.03-1.00) and criterion validity scores (kappa range -0.06-1.00) were reported, with acceptable inter-rater reliability values for 43% of all items and acceptable criterion validity for 54% of all items. EGA-Cycling can be used to assess physical environmental characteristics along cycling routes to school. However, to assess the micro-environment specifically related to cycling, on-site assessments have to be added.
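
    The kappa values reported here correct raw agreement for the agreement expected by chance, which is how a value can fall below zero. A minimal sketch of unweighted Cohen's kappa for two raters follows. The abstract does not say whether weighted or unweighted kappa was used, and the ratings below are hypothetical, not audit data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters rating the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when both raters use a single category
    return (observed - expected) / (1.0 - expected)


# Hypothetical presence/absence ratings of one audit item on ten routes.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

    In this example the raters agree on 8 of 10 routes, but half of that agreement is expected by chance, so kappa is lower than the raw agreement.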

  3. Reliability and Validity of the Greek Migraine Disability Assessment (MIDAS) Questionnaire.

    PubMed

    Oikonomidi, Theodora; Vikelis, Michail; Artemiadis, Artemios; Chrousos, George P; Darviri, Christina

    2018-03-01

    The Migraine Disability Assessment (MIDAS) Questionnaire is a reliable and valid instrument for migraine-related disability. Such a tool is needed to quantify migraine-related disability in the Greek population. This validation study aims to assess the test-retest reliability, internal consistency, item discriminant and convergent validity of the Greek translation of the MIDAS. Adults diagnosed with migraine completed the MIDAS Questionnaire on two occasions 3 weeks apart to assess reliability, and completed the RAND-36 to assess validity. Participants (n = 152) had a median MIDAS score of 24 and mostly severe disability (58% were grade IV). The test-retest reliability analysis (N = 59) revealed excellent reliability for the total score. Internal consistency was α = 0.71 for initial and α = 0.82 for retest completion. For item discriminant validity, the correlations between each question and the total score were significant, with high correlations for questions 2-5 (range 0.67 ≤ r ≤ 0.79; p < 0.01). For convergent validity, there was significant negative correlation between the total score and all RAND-36 subscales except for 'emotional wellbeing'. The negative correlation indicates that patients with a lower degree of disability according to their MIDAS score tended to have better wellbeing. Psychometric properties are comparable with those of other published validation studies of the MIDAS and the original. Findings on question 1 show that missing work/school days may be closely related with increased affect issues. The Greek version of the MIDAS Questionnaire has good reliability and validity. This study allowed for cross-cultural comparability of research findings.

  4. The reliability of WorkWell Systems Functional Capacity Evaluation: a systematic review

    PubMed Central

    2014-01-01

    Background: Functional capacity evaluation (FCE) determines a person’s ability to perform work-related tasks and is a major component of the rehabilitation process. The WorkWell Systems (WWS) FCE (formerly known as Isernhagen Work Systems FCE) is currently the most commonly used FCE tool in German rehabilitation centres. Our systematic review investigated the inter-rater, intra-rater and test-retest reliability of the WWS FCE. Methods: We performed a systematic literature search of studies on the reliability of the WWS FCE and extracted item-specific measures of inter-rater, intra-rater and test-retest reliability from the identified studies. Intraclass correlation coefficients ≥ 0.75, percentages of agreement ≥ 80%, and kappa coefficients ≥ 0.60 were categorised as acceptable, otherwise they were considered non-acceptable. The extracted values were summarised for the five performance categories of the WWS FCE, and the results were classified as either consistent or inconsistent. Results: From 11 identified studies, 150 item-specific reliability measures were extracted. 89% of the extracted inter-rater reliability measures, all of the intra-rater reliability measures and 96% of the test-retest reliability measures of the weight handling and strength tests had an acceptable level of reliability, compared to only 67% of the test-retest reliability measures of the posture/mobility tests and 56% of the test-retest reliability measures of the locomotion tests. Both of the extracted test-retest reliability measures of the balance test were acceptable. Conclusions: Weight handling and strength tests were found to have consistently acceptable reliability. Further research is needed to explore the reliability of the other tests as inconsistent findings or a lack of data prevented definitive conclusions. PMID:24674029
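
    The acceptability rule stated in the Methods can be written down directly. The sketch below applies the three cut-offs from the abstract (ICC ≥ 0.75, agreement ≥ 80%, kappa ≥ 0.60) to a list of extracted measures and reports the share that is acceptable. The dictionary keys, function names and example values are hypothetical; how the review classified a category as consistent or inconsistent is not specified in the abstract and is not modelled here.

```python
# Cut-offs taken from the abstract; the key names are illustrative.
THRESHOLDS = {"icc": 0.75, "percent_agreement": 80.0, "kappa": 0.60}


def is_acceptable(measure, value):
    """True if a reliability value meets the cut-off for its measure type."""
    return value >= THRESHOLDS[measure]


def share_acceptable(measures):
    """Fraction of (measure, value) pairs that meet their cut-off."""
    flags = [is_acceptable(measure, value) for measure, value in measures]
    return sum(flags) / len(flags)


# Hypothetical extracted values for one performance category (not review data).
extracted = [("icc", 0.81), ("icc", 0.70), ("kappa", 0.60), ("percent_agreement", 78.0)]

share = share_acceptable(extracted)
print(f"{share:.0%} of the extracted measures are acceptable")
```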

  5. Study of thermal management for space platform applications

    NASA Technical Reports Server (NTRS)

    Oren, J. A.

    1980-01-01

    Techniques for the management of the thermal energy of large space platforms using many hundreds of kilowatts over a 10 year life span were evaluated. Concepts for heat rejection, heat transport within the vehicle, and interfacing were analyzed and compared. The heat rejection systems were parametrically weight optimized over conditions for heat pipe and pumped fluid approaches. Two approaches to achieve reliability were compared for: performance, weight, volume, projected area, reliability, cost, and operational characteristics. Technology needs are assessed and technology advancement recommendations are made.

  6. Optimized Biasing of Pump Laser Diodes in a Highly Reliable Metrology Source for Long-Duration Space Missions

    NASA Technical Reports Server (NTRS)

    Poberezhskiy, Ilya; Chang, Daniel; Erlig, Hernan

    2011-01-01

    Non-Planar Ring Oscillator (NPRO) lasers are highly attractive for metrology applications. NPRO reliability for prolonged space missions is limited by the reliability of the 808 nm pump diodes. A combined laser farm aging parameter allows different bias approaches to be compared. Monte Carlo software was developed to calculate the reliability of the laser pump architecture and to perform parameter sensitivity studies. To meet stringent Space Interferometry Mission (SIM) Lite lifetime reliability and output power requirements, we developed a single-mode Laser Pump Module architecture that (1) provides 2 W of power at 808 nm with >99.7% reliability for 5.5 years and (2) consists of 37 de-rated diode lasers operating at -5 °C, with outputs combined in a very low loss 37x1 all-fiber coupler.

  7. Validation of a telephone screening tool for spasmodic dysphonia and vocal fold tremor.

    PubMed

    Johnson, David M; Hapner, Edie R; Klein, Adam M; Pethan, Madeleine; Johns, Michael M

    2014-11-01

    The objective of this study was to ascertain whether clinicians can reliably distinguish between spasmodic dysphonia (SD)/vocal tremor and other voice disorders by telephone, despite this modality's limited frequency response. Randomized, single-blinded, and prospective study. Voice-disordered patients with (n = 22) and without (n = 17) SD and/or vocal tremor recorded standardized utterances via landline telephone. A laryngologist and two speech-language pathologists blinded to the diagnoses rated each recording as "yes" or "no" to "SD or tremor present?," and if "yes" categorized into adductor, abductor, tremor only, or adductor with tremor subtypes. Twenty-one recordings were presented twice at random so intrarater reliability could be assessed. All ratings were compared with gold standard diagnosis by a second laryngologist who performed a full examination, including videostroboscopy, on each patient. For the comparison "SD or tremor" yes versus no, sensitivity, specificity, positive predictive value, and negative predictive value are 90%, 95%, 96%, and 89%, respectively. Interrater reliability (Cohen kappa) compared with the gold standard ranged from 0.70 to 0.93 (substantial to almost perfect agreement). Cronbach alpha among three raters was 0.90 for this comparison. Intrarater reliability (number matched/number inspected) was very high, ranging from 0.97 to 1.0. Comparing gold standard and telephone rating of SD/tremor subtypes, kappa ranged from 0.48 to 0.60 (moderate agreement). Cronbach alpha among three raters was 0.88 for this comparison. Intrarater reliability ranged from 0.84 to 0.97. SD and tremor can be reliably distinguished from other voice disorders over the telephone. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
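
    Sensitivity, specificity and the two predictive values reported above are all ratios taken from a 2x2 table of telephone rating against gold standard diagnosis. A minimal sketch follows; the counts are hypothetical and were not taken from the study, whose underlying table is not given in the abstract.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # disorder present and rated "yes"
        "specificity": tn / (tn + fp),  # disorder absent and rated "no"
        "ppv": tp / (tp + fp),          # rated "yes" and disorder present
        "npv": tn / (tn + fn),          # rated "no" and disorder absent
    }


# Hypothetical counts for illustration only (NOT the study's data).
accuracy = diagnostic_accuracy(tp=18, fp=1, fn=2, tn=19)
for name, value in accuracy.items():
    print(f"{name}: {value:.1%}")
```

    Sensitivity and specificity describe the rating method itself, whereas the predictive values also depend on how common the disorder is in the sample being screened.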

  8. Measurement of fatigue: Comparison of the reliability and validity of single-item and short measures to a comprehensive measure.

    PubMed

    Kim, Hee-Ju; Abraham, Ivo

    2017-01-01

    Evidence is needed on the clinicometric properties of single-item or short measures as alternatives to comprehensive measures. We examined whether two single-item fatigue measures (i.e., Likert scale, numeric rating scale) or a short fatigue measure were comparable to a comprehensive measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults. For this quantitative study, we selected the Functional Assessment of Chronic Illness Therapy-Fatigue for the comprehensive measure and the Profile of Mood States-Brief, Fatigue subscale for the short measure; and constructed two single-item measures. A total of 368 students from four nursing colleges in South Korea participated. We used Cronbach's alpha and item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability. We assessed Pearson's correlation with a comprehensive measure for convergent validity, with perceived stress level and sleep quality for concurrent validity and the receiver operating characteristic curve for predictive validity. The short measure was comparable to the comprehensive measure in internal consistency reliability (Cronbach's alpha=0.81 vs. 0.88); test-retest reliability (intraclass correlation coefficient=0.66 vs. 0.61); convergent validity (r with comprehensive measure=0.79); concurrent validity (r with perceived stress=0.55, r with sleep quality=0.39) and predictive validity (area under curve=0.88). Single-item measures were not comparable to the comprehensive measure. A short fatigue measure exhibited similar levels of reliability and validity to the comprehensive measure in Korean young adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
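
    Cronbach's alpha, used here for internal consistency, compares the sum of the item variances with the variance of the total score. A minimal sketch using the standard formula is given below. The responses are hypothetical and are not data from the study; note that alpha needs at least two items, so it cannot be computed for a single-item measure.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per respondent."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    item_variances = [variance(item) for item in zip(*responses)]  # per item
    total_variance = variance([sum(row) for row in responses])     # total score
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: four respondents, three items (NOT study data).
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [3, 3, 4],
    [5, 5, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```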

  9. An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

    PubMed

    Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

    2014-05-01

    Recent guidelines recommend that sports medicine professionals use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores was determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.

  10. Tackling reliability and construct validity: the systematic development of a qualitative protocol for skill and incident analysis.

    PubMed

    Savage, Trevor Nicholas; McIntosh, Andrew Stuart

    2017-03-01

    It is important to understand factors contributing to and directly causing sports injuries to improve the effectiveness and safety of sports skills. The characteristics of injury events must be evaluated and described meaningfully and reliably. However, many complex skills cannot be effectively investigated quantitatively because of ethical, technological and validity considerations. Increasingly, qualitative methods are being used to investigate human movement for research purposes, but there are concerns about reliability and measurement bias of such methods. Using the tackle in Rugby union as an example, we outline a systematic approach for developing a skill analysis protocol with a focus on improving objectivity, validity and reliability. Characteristics for analysis were selected using qualitative analysis and biomechanical theoretical models and epidemiological and coaching literature. An expert panel comprising subject matter experts provided feedback and the inter-rater reliability of the protocol was assessed using ten trained raters. The inter-rater reliability results were reviewed by the expert panel and the protocol was revised and assessed in a second inter-rater reliability study. Mean agreement in the second study improved and was comparable (52-90% agreement and ICC between 0.6 and 0.9) with other studies that have reported inter-rater reliability of qualitative analysis of human movement.

  11. Measuring acuity of the approximate number system reliably and validly: the evaluation of an adaptive test procedure

    PubMed Central

    Lindskog, Marcus; Winman, Anders; Juslin, Peter; Poom, Leo

    2013-01-01

    Two studies investigated the reliability and predictive validity of commonly used measures and models of Approximate Number System acuity (ANS). Study 1 investigated reliability by both an empirical approach and a simulation of maximum obtainable reliability under ideal conditions. Results showed that common measures of the Weber fraction (w) are reliable only when using a substantial number of trials, even under ideal conditions. Study 2 compared different purported measures of ANS acuity as for convergent and predictive validity in a within-subjects design and evaluated an adaptive test using the ZEST algorithm. Results showed that the adaptive measure can reduce the number of trials needed to reach acceptable reliability. Only direct tests with non-symbolic numerosity discriminations of stimuli presented simultaneously were related to arithmetic fluency. This correlation remained when controlling for general cognitive ability and perceptual speed. Further, the purported indirect measure of ANS acuity in terms of the Numeric Distance Effect (NDE) was not reliable and showed no sign of predictive validity. The non-symbolic NDE for reaction time was significantly related to direct w estimates in a direction contrary to the expected. Easier stimuli were found to be more reliable, but only harder (7:8 ratio) stimuli contributed to predictive validity. PMID:23964256

  12. The risk of bias in systematic reviews tool showed fair reliability and good construct validity.

    PubMed

    Bühn, Stefanie; Mathes, Tim; Prengel, Peggy; Wegewitz, Uta; Ostermann, Thomas; Robens, Sibylle; Pieper, Dawid

    2017-11-01

    There is a movement from generic quality checklists toward a more domain-based approach in critical appraisal tools. This study aimed to report on a first experience with the newly developed risk of bias in systematic reviews (ROBIS) tool and compare it with A Measurement Tool to Assess Systematic Reviews (AMSTAR), that is, the most commonly used tool to assess the methodological quality of systematic reviews, while assessing validity, reliability, and applicability. This was a validation study with four reviewers based on 16 systematic reviews in the field of occupational health. Interrater reliability (IRR) of all four raters was highest for domain 2 (Fleiss' kappa κ = 0.56) and lowest for domain 4 (κ = 0.04). For ROBIS, median IRR was κ = 0.52 (range 0.13-0.88) for the experienced pair of raters compared to κ = 0.32 (range 0.12-0.76) for the less experienced pair of raters. The percentage of "yes" scores of each review of ROBIS ratings was strongly correlated with the AMSTAR ratings (r_s = 0.76; P = 0.01). ROBIS has fair reliability and good construct validity to assess the risk of bias in systematic reviews. More validation studies are needed to investigate reliability and applicability, in particular. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Observation and Classification of Prehension in Preschool Children: A Reliability Study.

    ERIC Educational Resources Information Center

    Moss, S. C.; Hogg, J.

    1981-01-01

    The variety of hand grips of 12 children, most of whom were moderately or severely retarded, were classified in order to begin an analysis of hand function. Test reliability was not as great when items were presented to the children as compared to when children were observed or rated by videotape. (FG)

  14. Attenuation of the Squared Canonical Correlation Coefficient under Varying Estimates of Score Reliability

    ERIC Educational Resources Information Center

    Wilson, Celia M.

    2010-01-01

    Research pertaining to the distortion of the squared canonical correlation coefficient has traditionally been limited to the effects of sampling error and associated correction formulas. The purpose of this study was to compare the degree of attenuation of the squared canonical correlation coefficient under varying conditions of score reliability.…

  15. Comparison of Difficulties and Reliabilities of Math-Completion and Multiple-Choice Item Formats.

    ERIC Educational Resources Information Center

    Oosterhof, Albert C.; Coats, Pamela K.

    Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…

  16. Use of volunteer student abstractors for a retrospective cohort analysis: a study of inter-rater reliability.

    PubMed

    Gritsiouk, Yaroslav; Hegsted, Damian; Gardiner, Stuart; Merriman, Lisa; Gubler, Kelly Dean

    2013-05-01

    Little is known about the reliability of data collected by abstractors without professional medical training. This investigation sought to determine the level of agreement among untrained volunteer abstractors as part of a study to evaluate the risk assessment of venous thromboembolism in patients who have undergone trauma. Forty-nine paper charts were chosen randomly from a volunteer-reviewed cohort of 2,339 and were compared with those of a single experienced abstractor. Inter-rater agreement was assessed using percent agreement, Cohen's kappa, and prevalence-adjusted bias-adjusted kappa (PABAK). Of the 71 data points, 28 had perfect agreement. The average agreement across all charts was 97%. Data with imperfect agreement had kappa values between .27 and .96 (mean, .75), with one additional value at zero even though it was associated with an agreement of 94%. PABAK values ranged from .67 to .98 (mean, .91), an average increase of .17 compared with kappa values. The performance of volunteers showed outstanding inter-rater reliability; however, limitations of interpretation can influence reliability. Copyright © 2013 Elsevier Inc. All rights reserved.
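
    The abstract above notes a kappa of zero alongside 94% agreement. A minimal Python sketch, using hypothetical binary ratings rather than the study's data, shows how this can arise when one category is very rare and how the prevalence-adjusted bias-adjusted kappa (PABAK) behaves on the same ratings. The function name `agreement_stats` and the 50-chart example are assumptions made for illustration.

```python
def agreement_stats(rater_a, rater_b):
    """Return (percent agreement, Cohen's kappa, PABAK) for binary 0/1 codes."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Observed proportion of charts on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal proportion of 1s.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    p_chance = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_chance == 1:
        kappa = float("nan")  # undefined when both raters use a single category
    else:
        kappa = (p_observed - p_chance) / (1 - p_chance)
    # For two categories, PABAK depends only on observed agreement.
    pabak = 2 * p_observed - 1
    return p_observed, kappa, pabak


# Hypothetical example: 50 charts, a rare finding.
# Rater A never records the finding; rater B records it on 3 charts.
rater_a = [0] * 50
rater_b = [1] * 3 + [0] * 47
p_observed, kappa, pabak = agreement_stats(rater_a, rater_b)
```

    In this example the raters agree on 47 of 50 charts, yet chance agreement is equally high, so kappa collapses to about zero while PABAK stays at 0.88.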

  17. Timed activity performance in persons with upper limb amputation: A preliminary study.

    PubMed

    Resnik, Linda; Borgia, Mathew; Acluche, Frantzy

    55 subjects with upper limb amputation were administered the T-MAP twice within one week. To develop a timed measure of activity performance for persons with upper limb amputation (T-MAP); examine the measure's internal consistency, test-retest reliability and validity; and compare scores by prosthesis use. Measures of activity performance for persons with upper limb amputation are needed. The time required to perform daily activities is a meaningful metric that has implications for participation in life roles. Internal consistency and test-retest reliability were evaluated. Construct validity was examined by comparing scores by amputation level. Exploratory analyses compared sub-group scores, and examined correlations with other measures. Scale alpha was 0.77, ICC was 0.93. Timed scores differed by amputation level. Subjects using a prosthesis took longer to perform all tasks. T-MAP was not correlated with other measures of dexterity or activity, but was correlated with pain for non-prosthesis users. The timed scale had adequate internal consistency and excellent test-retest reliability. Analyses support reliability and construct validity of the T-MAP. 2c "outcomes" research. Published by Elsevier Inc.

  18. Inter-method reliability of paper surveys and computer assisted telephone interviews in a randomized controlled trial of yoga for low back pain

    PubMed Central

    2014-01-01

    Background Little is known about the reliability of different methods of survey administration in low back pain trials. This analysis was designed to determine the reliability of responses to self-administered paper surveys compared to computer assisted telephone interviews (CATI) for the primary outcomes of pain intensity and back-related function, and secondary outcomes of patient satisfaction, SF-36, and global improvement among participants enrolled in a study of yoga for chronic low back pain. Results Pain intensity, back-related function, and both physical and mental health components of the SF-36 showed excellent reliability at all three time points; ICC scores ranged from 0.82 to 0.98. Pain medication use showed good reliability; kappa statistics ranged from 0.68 to 0.78. Patient satisfaction had moderate to excellent reliability; ICC scores ranged from 0.40 to 0.86. Global improvement showed poor reliability at 6 weeks (ICC = 0.24) and 12 weeks (ICC = 0.10). Conclusion CATI shows excellent reliability for primary outcomes and at least some secondary outcomes when compared to self-administered paper surveys in a low back pain yoga trial. Having two reliable options for data collection may be helpful to increase response rates for core outcomes in back pain trials. Trial registration ClinicalTrials.gov: NCT01761617. Date of trial registration: December 4, 2012. PMID:24716775

  19. Reliability of Various Measurement Stations for Determining Plantar Fascia Thickness and Echogenicity.

    PubMed

    Bisi-Balogun, Adebisi; Cassel, Michael; Mayer, Frank

    2016-04-13

    This study aimed to determine the relative and absolute reliability of ultrasound (US) measurements of the thickness and echogenicity of the plantar fascia (PF) at different measurement stations along its length using a standardized protocol. Twelve healthy subjects (24 feet) were enrolled. The PF was imaged in the longitudinal plane. Subjects were assessed twice to evaluate the intra-rater reliability. A quantitative evaluation of the thickness and echogenicity of the plantar fascia was performed using Image J, a digital image analysis and viewer software. A sonography evaluation of the thickness and echogenicity of the PF showed a high relative reliability with an intraclass correlation coefficient of ≥0.88 at all measurement stations. However, the measurement stations for both the PF thickness and echogenicity which showed the highest intraclass correlation coefficients (ICCs) did not have the highest absolute reliability. Compared to other measurement stations, measuring the PF thickness at 3 cm distal and the echogenicity at a region of interest 1 cm to 2 cm distal from its insertion at the medial calcaneal tubercle showed the highest absolute reliability with the least systematic bias and random error. Also, the reliability was higher using a mean of three measurements compared to one measurement. To reduce discrepancies in the interpretation of the thickness and echogenicity measurements of the PF, the absolute reliability of the different measurement stations should be considered in clinical practice and research rather than the relative reliability with the ICC.

  20. Reliability of Various Measurement Stations for Determining Plantar Fascia Thickness and Echogenicity

    PubMed Central

    Bisi-Balogun, Adebisi; Cassel, Michael; Mayer, Frank

    2016-01-01

    This study aimed to determine the relative and absolute reliability of ultrasound (US) measurements of the thickness and echogenicity of the plantar fascia (PF) at different measurement stations along its length using a standardized protocol. Twelve healthy subjects (24 feet) were enrolled. The PF was imaged in the longitudinal plane. Subjects were assessed twice to evaluate the intra-rater reliability. A quantitative evaluation of the thickness and echogenicity of the plantar fascia was performed using Image J, a digital image analysis and viewer software. A sonography evaluation of the thickness and echogenicity of the PF showed a high relative reliability with an intraclass correlation coefficient of ≥0.88 at all measurement stations. However, the measurement stations for both the PF thickness and echogenicity which showed the highest intraclass correlation coefficients (ICCs) did not have the highest absolute reliability. Compared to other measurement stations, measuring the PF thickness at 3 cm distal and the echogenicity at a region of interest 1 cm to 2 cm distal from its insertion at the medial calcaneal tubercle showed the highest absolute reliability with the least systematic bias and random error. Also, the reliability was higher using a mean of three measurements compared to one measurement. To reduce discrepancies in the interpretation of the thickness and echogenicity measurements of the PF, the absolute reliability of the different measurement stations should be considered in clinical practice and research rather than the relative reliability with the ICC. PMID:27089369

  1. Effect of knee angle on neuromuscular assessment of plantar flexor muscles: A reliability study

    PubMed Central

    Cornu, Christophe; Jubeau, Marc

    2018-01-01

    Introduction This study aimed to determine the intra- and inter-session reliability of neuromuscular assessment of plantar flexor (PF) muscles at three knee angles. Methods Twelve young adults were tested for three knee angles (90°, 30° and 0°) and at three time points separated by 1 hour (intra-session) and 7 days (inter-session). Electrical (H reflex, M wave) and mechanical (evoked and maximal voluntary torque, activation level) parameters were measured on the PF muscles. Intraclass correlation coefficients (ICC) and coefficients of variation were calculated to determine intra- and inter-session reliability. Results The mechanical measurements presented excellent (ICC>0.75) intra- and inter-session reliabilities regardless of the knee angle considered. The reliability of electrical measurements was better for the 90° knee angle compared to the 0° and 30° angles. Conclusions Changes in the knee angle may influence the reliability of neuromuscular assessments, which indicates the importance of considering the knee angle to collect consistent outcomes on the PF muscles. PMID:29596480

  2. Reliability and Validity of the Dyadic Observed Communication Scale (DOCS).

    PubMed

    Hadley, Wendy; Stewart, Angela; Hunter, Heather L; Affleck, Katelyn; Donenberg, Geri; Diclemente, Ralph; Brown, Larry K

    2013-02-01

    We evaluated the reliability and validity of the Dyadic Observed Communication Scale (DOCS) coding scheme, which was developed to capture a range of communication components between parents and adolescents. Adolescents and their caregivers were recruited from mental health facilities for participation in a large, multi-site family-based HIV prevention intervention study. Seventy-one dyads were randomly selected from the larger study sample and coded using the DOCS at baseline. Preliminary validity and reliability of the DOCS was examined using various methods, such as comparing results to self-report measures and examining interrater reliability. Results suggest that the DOCS is a reliable and valid measure of observed communication among parent-adolescent dyads that captures both verbal and nonverbal communication behaviors that are typical intervention targets. The DOCS is a viable coding scheme for use by researchers and clinicians examining parent-adolescent communication. Coders can be trained to reliably capture individual and dyadic components of communication for parents and adolescents and this complex information can be obtained relatively quickly.

  3. Study on Distribution Reliability with Parallel and On-site Distributed Generation Considering Protection Miscoordination and Tie Line

    NASA Astrophysics Data System (ADS)

    Chaitusaney, Surachai; Yokoyama, Akihiko

    In a distribution system, distributed generation (DG) is expected to improve system reliability by acting as backup generation. However, the DG contribution to fault current may cause the loss of existing protection coordination, e.g., recloser-fuse coordination and breaker-breaker coordination. This problem can drastically degrade system reliability, and it is more serious and complicated when there are several DG sources in the system. Hence, this conflict in the reliability aspect needs detailed investigation before DG is installed or expanded. A model of composite DG fault current is proposed to find the threshold beyond which existing protection coordination is lost. Cases of protection miscoordination are described, together with their consequences. Since a distribution system may be tied with another system, the issues of tie line and on-site DG are integrated into this study. Reliability indices are evaluated and compared in the distribution reliability test system RBTS Bus 2.

  4. Validity and reliability of a low-cost digital dynamometer for measuring isometric strength of lower limb.

    PubMed

    Romero-Franco, Natalia; Jiménez-Reyes, Pedro; Montaño-Munuera, Juan A

    2017-11-01

    Lower limb isometric strength is a key parameter to monitor the training process or recognise muscle weakness and injury risk. However, valid and reliable methods to evaluate it often require high-cost tools. The aim of this study was to analyse the concurrent validity and reliability of a low-cost digital dynamometer for measuring isometric strength in lower limb. Eleven physically active and healthy participants performed maximal isometric strength for: flexion and extension of ankle, flexion and extension of knee, flexion, extension, adduction, abduction, internal and external rotation of hip. Data obtained by the digital dynamometer were compared with the isokinetic dynamometer to examine its concurrent validity. Data obtained by the digital dynamometer from 2 different evaluators and 2 different sessions were compared to examine its inter-rater and intra-rater reliability. Intra-class correlation (ICC) for validity was excellent in every movement (ICC > 0.9). Intra and inter-tester reliability was excellent for all the movements assessed (ICC > 0.75). The low-cost digital dynamometer demonstrated strong concurrent validity and excellent intra and inter-tester reliability for assessing isometric strength in the main lower limb movements.

  5. Test-retest and between-site reliability in a multicenter fMRI study.

    PubMed

    Friedman, Lee; Stern, Hal; Brown, Gregory G; Mathalon, Daniel H; Turner, Jessica; Glover, Gary H; Gollub, Randy L; Lauriello, John; Lim, Kelvin O; Cannon, Tyrone; Greve, Douglas N; Bockholt, Henry Jeremy; Belger, Aysenil; Mueller, Bryon; Doty, Michael J; He, Jianchun; Wells, William; Smyth, Padhraic; Pieper, Steve; Kim, Seyoung; Kubicki, Marek; Vangel, Mark; Potkin, Steven G

    2008-08-01

    In the present report, estimates of test-retest and between-site reliability of fMRI assessments were produced in the context of a multicenter fMRI reliability study (FBIRN Phase 1, www.nbirn.net). Five subjects were scanned on 10 MRI scanners on two occasions. The fMRI task was a simple block design sensorimotor task. The impulse response functions to the stimulation block were derived using an FIR-deconvolution analysis with FMRISTAT. Six functionally-derived ROIs covering the visual, auditory and motor cortices, created from a prior analysis, were used. Two dependent variables were compared: percent signal change and contrast-to-noise-ratio. Reliability was assessed with intraclass correlation coefficients derived from a variance components analysis. Test-retest reliability was high, but initially, between-site reliability was low, indicating a strong contribution from site and site-by-subject variance. However, a number of factors that can markedly improve between-site reliability were uncovered, including increasing the size of the ROIs, adjusting for smoothness differences, and inclusion of additional runs. By employing multiple steps, between-site reliability for 3T scanners was increased by 123%. Dropping one site at a time and assessing reliability can be a useful method of assessing the sensitivity of the results to particular sites. These findings should provide guidance to others on the best practices for future multicenter studies.

  6. Comparison of fMRI paradigms assessing visuospatial processing: Robustness and reproducibility

    PubMed Central

    Herholz, Peer; Zimmermann, Kristin M.; Westermann, Stefan; Frässle, Stefan; Jansen, Andreas

    2017-01-01

    The development of brain imaging techniques, in particular functional magnetic resonance imaging (fMRI), made it possible to non-invasively study the hemispheric lateralization of cognitive brain functions in large cohorts. Comprehensive models of hemispheric lateralization are, however, still missing and should not only account for the hemispheric specialization of individual brain functions, but also for the interactions among different lateralized cognitive processes (e.g., language and visuospatial processing). This calls for robust and reliable paradigms to study hemispheric lateralization for various cognitive functions. While numerous reliable imaging paradigms have been developed for language, which represents the most prominent left-lateralized brain function, the reliability of imaging paradigms investigating typically right-lateralized brain functions, such as visuospatial processing, has received comparatively less attention. In the present study, we aimed to establish an fMRI paradigm that robustly and reliably identifies right-hemispheric activation evoked by visuospatial processing in individual subjects. In a first study, we therefore compared three frequently used paradigms for assessing visuospatial processing and evaluated their utility to robustly detect right-lateralized brain activity on a single-subject level. In a second study, we then assessed the test-retest reliability of the so-called Landmark task–the paradigm that yielded the most robust results in study 1. At the single-voxel level, we found poor reliability of the brain activation underlying visuospatial attention. This suggests that poor signal-to-noise ratios can become a limiting factor for test-retest reliability. 
This represents a common detriment of fMRI paradigms investigating visuospatial attention in general and therefore highlights the need for careful considerations of both the possibilities and limitations of the respective fMRI paradigm–in particular, when being interested in effects at the single-voxel level. Notably, however, when focusing on the reliability of measures of hemispheric lateralization (which was the main goal of study 2), we show that hemispheric dominance (quantified by the lateralization index, LI, with |LI| >0.4) of the evoked activation could be robustly determined in more than 62% and, if considering only two categories (i.e., left, right), in more than 93% of our subjects. Furthermore, the reliability of the lateralization strength (LI) was “fair” to “good”. In conclusion, our results suggest that the degree of right-hemispheric dominance during visuospatial processing can be reliably determined using the Landmark task, both at the group and single-subject level, while at the same time stressing the need for future refinements of experimental paradigms and more sophisticated fMRI data acquisition techniques. PMID:29059201
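
    The categorization by lateralization index described above can be sketched in a few lines of Python. The abstract gives only the |LI| > 0.4 threshold; the formula LI = (L - R) / (L + R), the sign convention (positive means left-dominant), and the function names below are assumptions based on common usage, not details confirmed by the study.

```python
def lateralization_index(left_activation, right_activation):
    """Commonly used LI definition (an assumption here): (L - R) / (L + R)."""
    total = left_activation + right_activation
    if total == 0:
        raise ValueError("no activation in either hemisphere")
    return (left_activation - right_activation) / total


def classify_dominance(li, threshold=0.4):
    """Three-category scheme using the |LI| > 0.4 cut-off quoted in the abstract."""
    if li > threshold:
        return "left"
    if li < -threshold:
        return "right"
    return "bilateral"


# Hypothetical activation measures (e.g., suprathreshold voxel counts).
li_right = lateralization_index(20, 80)
li_weak = lateralization_index(55, 45)
label_right = classify_dominance(li_right)
label_weak = classify_dominance(li_weak)
```

    Collapsing to two categories, as the abstract also reports, would amount to classifying by the sign of LI alone.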

  7. Reliability of Physical Activity Measures During Free-Living Activities in People After Total Knee Arthroplasty.

    PubMed

    Almeida, Gustavo J; Irrgang, James J; Fitzgerald, G Kelley; Jakicic, John M; Piva, Sara R

    2016-06-01

    Few instruments that measure physical activity (PA) can accurately quantify PA performed at light and moderate intensities, which is particularly relevant in older adults. The evidence of their reliability in free-living conditions is limited. The study objectives were: (1) to determine the test-retest reliability of the Actigraph (ACT), SenseWear Armband (SWA), and Community Healthy Activities Model Program for Seniors (CHAMPS) questionnaire in assessing free-living PA at light and moderate intensities in people after total knee arthroplasty; (2) to compare the reliability of the 3 instruments relative to each other; and (3) to determine the reliability of commonly used monitoring time frames (24 hours, waking hours, and 10 hours from awakening). A one-group, repeated-measures design was used. Participants wore the activity monitors for 2 weeks, and the CHAMPS questionnaire was completed at the end of each week. Test-retest reliability was determined by using the intraclass correlation coefficient (ICC [2,k]) to compare PA measures from one week with those from the other week. Data from 28 participants who reported similar PA during the 2 weeks were included in the analysis. The mean age of these participants was 69 years (SD=8), and 75% of them were women. Reliability ranged from moderate to excellent for the ACT (ICC=.75-.86) and was excellent for the SWA (ICC=.93-.95) and the CHAMPS questionnaire (ICC=.86-.92). The 95% confidence intervals (95% CI) of the ICCs from the SWA were the only ones within the excellent reliability range (.85-.98). The CHAMPS questionnaire showed systematic bias, with less PA being reported in week 2. The reliability of PA measures in the waking-hour time frame was comparable to that in the 24-hour time frame and reflected most PA performed during this period. Reliability may be lower for time intervals longer than 1 week. All PA measures showed good reliability. 
The reliability of the ACT was lower than those of the SWA and the CHAMPS questionnaire. The SWA provided more precise reliability estimates. Wearing PA monitors during waking hours provided sufficiently reliable measures and can reduce the burden on people wearing them. © 2016 American Physical Therapy Association.
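
    The ICC(2,k) statistic named above can be illustrated with a minimal Python sketch of the two-way random-effects, absolute-agreement ICC in the Shrout and Fleiss formulation. The function name `icc_two_way_random` and the toy weekly scores are hypothetical and are not taken from the study.

```python
def icc_two_way_random(data):
    """Return (ICC(2,1), ICC(2,k)) for rows = subjects, columns = occasions."""
    n = len(data)        # number of subjects
    k = len(data[0])     # number of repeated measurements per subject
    grand_mean = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    icc_single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    icc_average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return icc_single, icc_average


# Toy data (hypothetical): four participants, week 1 vs. week 2.
identical_weeks = [[1, 1], [2, 2], [3, 3], [4, 4]]
shifted_weeks = [[1, 2], [2, 3], [3, 4], [4, 5]]  # constant shift in week 2
icc_identical = icc_two_way_random(identical_weeks)
icc_shifted = icc_two_way_random(shifted_weeks)
```

    Because this form of the ICC measures absolute agreement, a constant shift between weeks (a systematic bias of the kind the abstract reports for the CHAMPS questionnaire) lowers the coefficient even though the ranking of participants is unchanged.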

  8. Post-traumatic subtalar osteoarthritis: which grading system should we use?

    PubMed

    de Muinck Keizer, Robert-Jan O; Backes, Manouk; Dingemans, Siem A; Goslings, J Carel; Schepers, Tim

    2016-09-01

    To assess and compare post-traumatic osteoarthritis following intra-articular calcaneal fractures, one must have a reliable grading system that consistently grades the post-traumatic changes of the joint. A reliable grading system aids in the communication between treating physicians and improves the interpretation of research. To date, there is no consensus on what grading system to use in the evaluation of post-traumatic subtalar osteoarthritis. The objective of this study was to determine and compare the inter- and intra-rater reliability of two grading systems for post-traumatic subtalar osteoarthritis. Four observers evaluated 50 calcaneal fractures at least one year after trauma on conventional oblique lateral, internally and externally rotated views, and graded post-traumatic subtalar osteoarthritis using the Kellgren and Lawrence Grading Scale (KLGS) and the Paley Grading System (PGS). Inter- and intra-rater reliability were calculated and compared. The inter-rater reliability showed an intra-class correlation (ICC) of 0.54 (95 % CI 0.40-0.67) for the KLGS and an ICC of 0.41 (95 % CI 0.26 - 0.57) for the PGS. This difference was not statistically significant. The intra-rater reliability showed a mean weighted kappa of 0.62 for both the KLGS and the PGS. There is no statistically significant difference in reliability between the Kellgren and Lawrence Grading System (KLGS) and the Paley Grading System (PGS). The PGS allows for an easy two-step approach making it easy for everyday clinical purposes. For research purposes however, the more detailed and widely used KLGS seems preferable.

  9. True communication skills assessment in interdepartmental OSCE stations: Standard setting using the MAAS-Global and EduG.

    PubMed

    Setyonugroho, Winny; Kropmans, Thomas; Murphy, Ruth; Hayes, Peter; van Dalen, Jan; Kennedy, Kieran M

    2018-01-01

    Comparing outcome of clinical skills assessment is challenging. This study proposes reliable and valid comparison of communication skills (CS) assessment as practiced in Objective Structured Clinical Examinations (OSCE). The aim of the present study is to compare CS assessment, as standardized according to the MAAS Global, between stations in a single undergraduate medical year. An OSCE delivered in an Irish undergraduate curriculum was studied. We chose the MAAS-Global as an internationally recognized and validated instrument to calibrate the OSCE station items. The MAAS-Global proportion is the percentage of station checklist items that can be considered as 'true' CS. The reliability of the OSCE was calculated with G-Theory analysis and nested ANOVA was used to compare mean scores of all years. MAAS-Global scores in psychiatry stations were significantly higher than those in other disciplines (p<0.03) and above the initial pass mark of 50%. The higher students' scores in psychiatry stations were related to higher MAAS-Global proportions when compared to the general practice stations. Comparison of outcome measurements, using the MAAS Global as a standardization instrument, between interdisciplinary station checklists was valid and reliable. The MAAS-Global was used as a single validated instrument and is suggested as gold standard. Copyright © 2017. Published by Elsevier B.V.

  10. Comparative ex vivo evaluation of two electronic percussive testing devices measuring the stability of dental implants.

    PubMed

    Geckili, Onur; Bilhan, Hakan; Cilingir, Altug; Bilmenoglu, Caglar; Ates, Gokcen; Urgun, Aliye Ceren; Bural, Canan

    2014-12-01

    A comparative ex vivo study was performed to determine electronic percussive test values (PTVs) measured by cabled and wireless electronic percussive testing (EPT) devices and to evaluate the intra- and interobserver reliability of the wireless EPT device. Forty implants were inserted into the vertebrae and forty into the pelvis of a steer, a safe distance apart. The implants were all 4.3 mm wide and 13 mm long, from the same manufacturer. PTV of each implant was measured by four different examiners, using both EPT devices, and compared. Additionally, the intra- and interobserver reliability of the wireless EPT device was evaluated. Statistically significant differences (P <0.05) were observed between PTVs made by the two EPT devices. PTVs measured by the wireless EPT device were significantly higher than the cabled EPT device (P <0.05), indicating lower implant stability. The intraobserver reliability of the wireless EPT device was evaluated as excellent for the measurements in type II bone and good-to-excellent in type IV bone; interobserver reliability was evaluated as fair-to-good in both bone types. The wireless EPT device gives PTVs higher than the cabled EPT device, indicating lower implant stability, and its inter- and intraobserver reliability is good and acceptable.

  11. Inter-observer reliability of DSM-5 substance use disorders.

    PubMed

    Denis, Cécile M; Gelernter, Joel; Hart, Amy B; Kranzler, Henry R

    2015-08-01

    Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence concerning the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
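    The percent agreement and kappa (κ) statistics named in this abstract can be illustrated with a minimal sketch. The function names and the ratings below are hypothetical and are not taken from the study; the sketch assumes two raters, unweighted Cohen's kappa, and one categorical rating per subject.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Fraction of subjects on which the two raters give the same rating."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(rater1)
    p_observed = percent_agreement(rater1, rater2)
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the raters' marginal proportions per category.
    p_expected = sum(
        counts1[c] * counts2[c] for c in set(counts1) | set(counts2)
    ) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # undefined when both raters use a single category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Made-up example: 1 = diagnosis present, 0 = absent, for ten subjects.
interviewer_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
interviewer_b = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
print(percent_agreement(interviewer_a, interviewer_b))
print(cohen_kappa(interviewer_a, interviewer_b))
```

    Reporting both statistics is useful because kappa depends on how evenly the categories are used, whereas percent agreement does not.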

  12. Test-retest reliability and comparability of paper and computer questionnaires for the Finnish version of the Tampa Scale of Kinesiophobia.

    PubMed

    Koho, P; Aho, S; Kautiainen, H; Pohjolainen, T; Hurri, H

    2014-12-01

    To estimate the internal consistency, test-retest reliability and comparability of paper and computer versions of the Finnish version of the Tampa Scale of Kinesiophobia (TSK-FIN) among patients with chronic pain. In addition, patients' personal experiences of completing both versions of the TSK-FIN and preferences between these two methods of data collection were studied. Test-retest reliability study. Paper and computer versions of the TSK-FIN were completed twice on two consecutive days. The sample comprised 94 consecutive patients with chronic musculoskeletal pain participating in a pain management or individual rehabilitation programme. The group rehabilitation design consisted of physical and functional exercises, evaluation of the social situation, psychological assessment of pain-related stress factors, and personal pain management training in order to regain overall function and mitigate the inconvenience of pain and fear-avoidance behaviour. The mean TSK-FIN score was 37.1 [standard deviation (SD) 8.1] for the computer version and 35.3 (SD 7.9) for the paper version. The mean difference between the two versions was 1.9 (95% confidence interval 0.8 to 2.9). Test-retest reliability was 0.89 for the paper version and 0.88 for the computer version. Internal consistency was considered to be good for both versions. The intraclass correlation coefficient for comparability was 0.77 (95% confidence interval 0.66 to 0.85), indicating substantial reliability between the two methods. Both versions of the TSK-FIN demonstrated substantial intertest reliability, good test-retest reliability, good internal consistency and acceptable limits of agreement, suggesting their suitability for clinical use. However, subjects tended to score higher when using the computer version. As such, in an ideal situation, data should be collected in a similar manner throughout the course of rehabilitation or clinical research. Copyright © 2014 Chartered Society of Physiotherapy. 
    Published by Elsevier Ltd. All rights reserved.
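    The mean difference and limits of agreement mentioned in this abstract are commonly computed with the Bland-Altman approach. The following is a minimal sketch under that assumption; the function name and the scores are hypothetical, and the abstract does not state exactly how its limits were derived.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the 95% limits of
    agreement are bias +/- 1.96 * SD of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up questionnaire scores for four patients (computer vs. paper).
computer = [10, 12, 14, 16]
paper = [9, 12, 12, 15]
bias, lower, upper = bland_altman(computer, paper)
print(bias, lower, upper)
```

    A positive bias here would mean the first method tends to score higher, which is the kind of systematic difference the abstract reports for the computer version.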

  13. The development and reliability of a simple field based screening tool to assess core stability in athletes.

    PubMed

    O'Connor, S; McCaffrey, N; Whyte, E; Moran, K

    2016-07-01

    To adapt the trunk stability test to facilitate further sub-classification of higher levels of core stability in athletes for use as a screening tool. To establish the inter-tester and intra-tester reliability of this adapted core stability test. Reliability study. Collegiate athletic therapy facilities. Fifteen physically active male subjects (19.46 ± 0.63) free from any orthopaedic or neurological disorders were recruited from a convenience sample of collegiate students. The intraclass correlation coefficients (ICC) and 95% Confidence Intervals (CI) were computed to establish inter-tester and intra-tester reliability. Excellent ICC values were observed in the adapted core stability test for inter-tester reliability (0.97) and good to excellent intra-tester reliability (0.73-0.90). While the 95% CI were narrow for inter-tester reliability, Tester A and C 95% CI's were widely distributed compared to Tester B. The adapted core stability test developed in this study is a quick and simple field based test to administer that can further subdivide athletes with high levels of core stability. The test demonstrated high inter-tester and intra-tester reliability. Copyright © 2015 Elsevier Ltd. All rights reserved.
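    For readers unfamiliar with the intraclass correlation coefficient (ICC), the sketch below computes one common form, the two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1). Which ICC form this study used is not stated in the abstract, so the choice is an assumption, and the function name and data are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up scores: five athletes rated by three testers.
scores = [
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 5],
    [3, 3, 4],
    [1, 2, 1],
]
print(icc_2_1(scores))
```

    Because this form penalizes systematic differences between raters, it is typically lower than a consistency-type ICC computed on the same data.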

  14. Measurement Properties of the NIH-Minimal Dataset Dutch Language Version in Patients With Chronic Low Back Pain.

    PubMed

    Boer, Annemarie; Dutmer, Alisa L; Schiphorst Preuper, Henrica R; van der Woude, Lucas H V; Stewart, Roy E; Deyo, Richard A; Reneman, Michiel F; Soer, Remko

    2017-10-01

    Validation study with cross-sectional and longitudinal measurements. To translate the US National Institutes of Health (NIH)-minimal dataset for clinical research on chronic low back pain into the Dutch language and to test its validity and reliability among people with chronic low back pain. The NIH developed a minimal dataset to encourage more complete and consistent reporting of clinical research and to be able to compare studies across countries in patients with low back pain. In the Netherlands, the NIH-minimal dataset has not been translated before and measurement properties are unknown. Cross-cultural validity was tested by a formal forward-backward translation. Structural validity was tested with exploratory factor analyses (comparative fit index, Tucker-Lewis index, and root mean square error of approximation). Hypothesis testing was performed to compare subscales of the NIH dataset with the Pain Disability Index and the EuroQol-5D (Pearson correlation coefficients). Internal consistency was tested with Cronbach α and test-retest reliability at 2 weeks was calculated in a subsample of patients with Intraclass Correlation Coefficients and weighted Kappa (κω). In total, 452 patients were included, of whom 52 were included in the test-retest study. Factor analysis for structural validity pointed in the direction of a seven-factor model (Cronbach α = 0.78). Factors and total score of the NIH-minimal dataset showed fair to good correlations with Pain Disability Index (r = 0.43-0.70) and EuroQol-5D (r = -0.41 to -0.64). Reliability: test-retest reliability per item showed substantial agreement (κω=0.65). Test-retest reliability per factor was moderate to good (Intraclass Correlation Coefficient = 0.71). The Dutch language version measurement properties of the NIH-minimal were satisfactory.
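    Cronbach α, used above as the internal-consistency measure, can be sketched as follows. The function name and the item scores are hypothetical; the sketch uses the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores) and is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha from a table with one row per respondent, one column per item."""
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    # Raises ZeroDivisionError if every respondent has the same total score.
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up responses: four respondents, three items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
print(cronbach_alpha(responses))
```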

  15. Greater understanding of normal hip physical function may guide clinicians in providing targeted rehabilitation programmes.

    PubMed

    Kemp, Joanne L; Schache, Anthony G; Makdissi, Michael; Sims, Kevin J; Crossley, Kay M

    2013-07-01

    This study investigated tests of hip muscle strength and functional performance. The specific objectives were to: (i) establish intra- and inter-rater reliability; (ii) compare differences between dominant and non-dominant limbs; (iii) compare agonist and antagonist muscle strength ratios; (iv) compare differences between genders; and (v) examine relationships between hip muscle strength, baseline measures and functional performance. Reliability study and cross-sectional analysis of hip strength and functional performance. In healthy adults aged 18-50 years, normalised hip muscle peak torque and functional performance were evaluated to: (i) establish intra-rater and inter-rater reliability; (ii) analyse differences between limbs, between antagonistic muscle groups and genders; and (iii) examine associations between strength and functional performance. Excellent reliability (intra-rater ICC=0.77-0.96; inter-rater ICC=0.82-0.95) was observed. No difference existed between dominant and non-dominant limbs. Differences in strength existed between antagonistic pairs of muscles: hip abduction was greater than adduction (p<0.001) and hip ER was greater than IR (p<0.001). Men had greater ER strength (p=0.006) and hop for distance (p<0.001) than women. Strong associations were observed between measures of hip muscle strength (except hip flexion) and age, height, and functional performance. Deficits in hip muscle strength or functional performance may influence hip pain. In order to provide targeted rehabilitation programmes to address patient-specific impairments, and determine when individuals are ready to return to physical activity, clinicians are increasingly utilising tests of hip strength and functional performance. This study provides a battery of reliable, clinically applicable tests which can be used for these purposes. Copyright © 2012 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  16. Evaluation of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ): a cross-sectional study.

    PubMed

    Sullivan, Ruth; Kinra, Sanjay; Ekelund, Ulf; Bharathi, A V; Vaz, Mario; Kurpad, Anura; Collier, Tim; Reddy, K Srinath; Prabhakaran, Dorairaj; Ebrahim, Shah; Kuper, Hannah

    2012-02-09

    Socio-cultural differences for country-specific activities are rarely addressed in physical activity questionnaires. We examined the reliability and validity of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ) in urban and rural groups in India. A sub-sample of IMS participants (n = 479) was used to examine short term (≤ 1 month [n = 158]) and long term (> 1 month [n = 321]) IMS-PAQ reliability for levels of total, sedentary, light and moderate/vigorous activity (MVPA) intensity using intraclass correlation (ICC) and kappa coefficients (k). Criterion validity (n = 157) was examined by comparing the IMS-PAQ to a uniaxial accelerometer (ACC) worn ≥ 4 days, via Spearman's rank correlations (ρ) and k, using Bland-Altman plots to check for systematic bias. Construct validity (n = 7,000) was established using linear regression, comparing IMS-PAQ against theoretical constructs associated with physical activity (PA): BMI [kg/m2], percent body fat and pulse rate. IMS-PAQ reliability ranged from ICC 0.42-0.88 and k = 0.37-0.61 (≤ 1 month) and ICC 0.26 to 0.62; kappa 0.17 to 0.45 (> 1 month). Criterion validity was ρ = 0.18-0.48; k = 0.08-0.34. Light activity was underestimated and MVPA consistently and substantially overestimated for the IMS-PAQ vs. the accelerometer. Criterion validity was moderate for total activity and MVPA. Reliability and validity were comparable for urban and rural participants but lower in women than men. Increasing time spent in total activity or MVPA, and decreasing time in sedentary activity were associated with decreasing BMI, percent body fat and pulse rate, thereby demonstrating construct validity. IMS-PAQ reliability and validity is similar to comparable self-reported instruments. It is an appropriate tool for ranking PA of individuals in India. Some refinements may be required for sedentary populations and women in India.

  17. Evaluation of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ): a cross-sectional study

    PubMed Central

    2012-01-01

    Background: Socio-cultural differences for country-specific activities are rarely addressed in physical activity questionnaires. We examined the reliability and validity of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ) in urban and rural groups in India. Methods: A sub-sample of IMS participants (n = 479) was used to examine short term (≤1 month [n = 158]) and long term (> 1 month [n = 321]) IMS-PAQ reliability for levels of total, sedentary, light and moderate/vigorous activity (MVPA) intensity using intraclass correlation (ICC) and kappa coefficients (k). Criterion validity (n = 157) was examined by comparing the IMS-PAQ to a uniaxial accelerometer (ACC) worn ≥4 days, via Spearman's rank correlations (ρ) and k, using Bland-Altman plots to check for systematic bias. Construct validity (n = 7,000) was established using linear regression, comparing IMS-PAQ against theoretical constructs associated with physical activity (PA): BMI [kg/m2], percent body fat and pulse rate. Results: IMS-PAQ reliability ranged from ICC 0.42-0.88 and k = 0.37-0.61 (≤1 month) and ICC 0.26 to 0.62; kappa 0.17 to 0.45 (> 1 month). Criterion validity was ρ = 0.18-0.48; k = 0.08-0.34. Light activity was underestimated and MVPA consistently and substantially overestimated for the IMS-PAQ vs. the accelerometer. Criterion validity was moderate for total activity and MVPA. Reliability and validity were comparable for urban and rural participants but lower in women than men. Increasing time spent in total activity or MVPA, and decreasing time in sedentary activity were associated with decreasing BMI, percent body fat and pulse rate, thereby demonstrating construct validity. Conclusion: IMS-PAQ reliability and validity is similar to comparable self-reported instruments. It is an appropriate tool for ranking PA of individuals in India. Some refinements may be required for sedentary populations and women in India. PMID:22321669

  18. Assessing the reliability of ecotoxicological studies: An overview of current needs and approaches.

    PubMed

    Moermond, Caroline; Beasley, Amy; Breton, Roger; Junghans, Marion; Laskowski, Ryszard; Solomon, Keith; Zahner, Holly

    2017-07-01

    In general, reliable studies are well designed and well performed, and enough details on study design and performance are reported to assess the study. For hazard and risk assessment in various legal frameworks, many different types of ecotoxicity studies need to be evaluated for reliability. These studies vary in study design, methodology, quality, and level of detail reported (e.g., reviews, peer-reviewed research papers, or industry-sponsored studies documented under Good Laboratory Practice [GLP] guidelines). Regulators have the responsibility to make sound and verifiable decisions and should evaluate each study for reliability in accordance with scientific principles regardless of whether they were conducted in accordance with GLP and/or standardized methods. Thus, a systematic and transparent approach is needed to evaluate studies for reliability. In this paper, 8 different methods for reliability assessment were compared using a number of attributes: categorical versus numerical scoring methods, use of exclusion and critical criteria, weighting of criteria, whether methods are tested with case studies, domain of applicability, bias toward GLP studies, incorporation of standard guidelines in the evaluation method, number of criteria used, type of criteria considered, and availability of guidance material. Finally, some considerations are given on how to choose a suitable method for assessing reliability of ecotoxicity studies. Integr Environ Assess Manag 2017;13:640-651. © 2016 The Authors. Integrated Environmental Assessment and Management published by Wiley Periodicals, Inc. on behalf of Society of Environmental Toxicology & Chemistry (SETAC).

  19. Reliability and validity of the Assessment of Daily Activity Performance (ADAP) in community-dwelling older women.

    PubMed

    de Vreede, Paul L; Samson, Monique M; van Meeteren, Nico L; Duursma, Sijmen A; Verhaar, Harald J

    2006-08-01

    The Assessment of Daily Activity Performance (ADAP) test was developed, and modeled after the Continuous-scale Physical Functional Performance (CS-PFP) test, to provide a quantitative assessment of older adults' physical functional performance. The aim of this study was to determine the intra-examiner reliability and construct validity of the ADAP in a community-living older population, and to identify the importance of tester experience. Forty-three community-dwelling, older women (mean age 75 yr ± 4.3) were randomized to the test-retest reliability study (n=19) or validation study (n=24). The intra-examiner reliability of an experienced (tester 1) and an inexperienced tester (tester 2) was assessed by comparing test and retest scores of 19 participants. Construct validity was assessed by comparing the ADAP scores of 24 participants with self-perceived function by the SF-36 Health Survey, muscle function tests, and the Timed Up and Go test (TUG). Tester 1 had good consistency and reliability scores (mean difference between test and retest scores (DIF), -1.05 ± 1.99; 95% confidence interval (CI), -2.58 to 0.48; Cronbach's alpha (alpha) range, 0.83 to 0.98; intraclass correlation (ICC) range, 0.75 to 0.96; Limits of Agreement (LoA), -2.58 to 4.95). Tester 2 had lower reliability scores (DIF, -2.45 ± 4.36; 95% CI, -5.56 to 0.67; alpha range, 0.53 to 0.94; ICC range, 0.36 to 0.90; LoA, -6.09 to 10.99), with a systematic difference between test and retest scores for the ADAP domain lower-body strength (-3.81; 95% CI, -6.09 to -1.54). ADAP correlated with SF-36 Physical Functioning scale (r=0.67), TUG test (r=-0.91) and with isometric knee extensor strength (r=0.80). The ADAP test is a reliable and valid instrument. Our results suggest that testers should practise using the test, to improve reliability, before applying it to clinical settings.

  20. Validity and reliability of GPS and LPS for measuring distances covered and sprint mechanical properties in team sports.

    PubMed

    Hoppe, Matthias W; Baumgart, Christian; Polglaze, Ted; Freiwald, Jürgen

    2018-01-01

    This study aimed to investigate the validity and reliability of global (GPS) and local (LPS) positioning systems for measuring distances covered and sprint mechanical properties in team sports. Here, we evaluated two recently released 18 Hz GPS and 20 Hz LPS technologies together with one established 10 Hz GPS technology. Six male athletes (age: 27±2 years; VO2max: 48.8±4.7 ml/min/kg) performed outdoors on 10 trials of a team sport-specific circuit that was equipped with double-light timing gates. The circuit included various walking, jogging, and sprinting sections that were performed either in straight-lines or with changes of direction. During the circuit, athletes wore two devices of each positioning system. From the reported and filtered velocity data, the distances covered and sprint mechanical properties (i.e., the theoretical maximal horizontal velocity, force, and power output) were computed. The sprint mechanical properties were modeled via an inverse dynamic approach applied to the center of mass. The validity was determined by comparing the measured and criterion data via the typical error of estimate (TEE), whereas the reliability was examined by comparing the two devices of each technology (i.e., the between-device reliability) via the coefficient of variation (CV). Outliers due to measurement errors were statistically identified and excluded from validity and reliability analyses. The 18 Hz GPS showed better validity and reliability for determining the distances covered (TEE: 1.6-8.0%; CV: 1.1-5.1%) and sprint mechanical properties (TEE: 4.5-14.3%; CV: 3.1-7.5%) than the 10 Hz GPS (TEE: 3.0-12.9%; CV: 2.5-13.0% and TEE: 4.1-23.1%; CV: 3.3-20.0%). However, the 20 Hz LPS demonstrated superior validity and reliability overall (TEE: 1.0-6.0%; CV: 0.7-5.0% and TEE: 2.1-9.2%; CV: 1.6-7.3%). For the 10 Hz GPS, 18 Hz GPS, and 20 Hz LPS, the relative loss of data sets due to measurement errors was 10.0%, 20.0%, and 15.8%, respectively. 
    This study shows that 18 Hz GPS has enhanced validity and reliability for determining movement patterns in team sports compared to 10 Hz GPS, whereas 20 Hz LPS had superior validity and reliability overall. However, compared to 10 Hz GPS, 18 Hz GPS and 20 Hz LPS technologies had more outliers due to measurement errors, which limits their practical applications at this time.
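    The between-device coefficient of variation (CV) reported above can be approximated in several ways, and the abstract does not say which one was used. The sketch below assumes one common definition: the typical error (SD of the paired differences divided by √2) expressed as a percentage of the grand mean. The function name and distances are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def between_device_cv(device_a, device_b):
    """Typical error between two devices as a percentage of the grand mean."""
    if len(device_a) != len(device_b) or len(device_a) < 2:
        raise ValueError("need at least two paired trials")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    typical_error = stdev(diffs) / sqrt(2)
    grand_mean = mean(list(device_a) + list(device_b))
    return 100.0 * typical_error / grand_mean


# Made-up distances (m) recorded by two units worn on the same three trials.
unit_1 = [100.0, 200.0, 300.0]
unit_2 = [102.0, 198.0, 303.0]
print(between_device_cv(unit_1, unit_2))
```

    A lower value indicates that two units of the same technology agree more closely when worn at the same time.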

  1. Precision of lumbar intervertebral measurements: does a computer-assisted technique improve reliability?

    PubMed

    Pearson, Adam M; Spratt, Kevin F; Genuario, James; McGough, William; Kosman, Katherine; Lurie, Jon; Sengupta, Dilip K

    2011-04-01

    Comparison of intra- and interobserver reliability of digitized manual and computer-assisted intervertebral motion measurements and classification of "instability." To determine if computer-assisted measurement of lumbar intervertebral motion on flexion-extension radiographs improves reliability compared with digitized manual measurements. Many studies have questioned the reliability of manual intervertebral measurements, although few have compared the reliability of computer-assisted and manual measurements on lumbar flexion-extension radiographs. Intervertebral rotation, anterior-posterior (AP) translation, and change in anterior and posterior disc height were measured with a digitized manual technique by three physicians and by three other observers using computer-assisted quantitative motion analysis (QMA) software. Each observer measured 30 sets of digital flexion-extension radiographs (L1-S1) twice. Shrout-Fleiss intraclass correlation coefficients for intra- and interobserver reliabilities were computed. The stability of each level was also classified (instability defined as >4 mm AP translation or 10° rotation), and the intra- and interobserver reliabilities of the two methods were compared using adjusted percent agreement (APA). Intraobserver reliability intraclass correlation coefficients were substantially higher for the QMA technique than the digitized manual technique across all measurements: rotation 0.997 versus 0.870, AP translation 0.959 versus 0.557, change in anterior disc height 0.962 versus 0.770, and change in posterior disc height 0.951 versus 0.283. The same pattern was observed for interobserver reliability (rotation 0.962 vs. 0.693, AP translation 0.862 vs. 0.151, change in anterior disc height 0.862 vs. 0.373, and change in posterior disc height 0.730 vs. 0.300). The QMA technique was also more reliable for the classification of "instability." Intraobserver APAs ranged from 87% to 97% for QMA versus 60% to 73% for digitized manual measurements, while interobserver APAs ranged from 91% to 96% for QMA versus 57% to 63% for digitized manual measurements. The use of QMA software substantially improved the reliability of lumbar intervertebral measurements and the classification of instability based on flexion-extension radiographs.

  2. Comparison of serum, EDTA plasma and P100 plasma for luminex-based biomarker multiplex assays in patients with chronic obstructive pulmonary disease in the SPIROMICS study.

    PubMed

    O'Neal, Wanda K; Anderson, Wayne; Basta, Patricia V; Carretta, Elizabeth E; Doerschuk, Claire M; Barr, R Graham; Bleecker, Eugene R; Christenson, Stephanie A; Curtis, Jeffrey L; Han, Meilan K; Hansel, Nadia N; Kanner, Richard E; Kleerup, Eric C; Martinez, Fernando J; Miller, Bruce E; Peters, Stephen P; Rennard, Stephen I; Scholand, Mary Beth; Tal-Singer, Ruth; Woodruff, Prescott G; Couper, David J; Davis, Sonia M

    2014-01-08

    As a part of the longitudinal Chronic Obstructive Pulmonary Disease (COPD) study, Subpopulations and Intermediate Outcome Measures in COPD study (SPIROMICS), blood samples are being collected from 3200 subjects with the goal of identifying blood biomarkers for sub-phenotyping patients and predicting disease progression. To determine the most reliable sample type for measuring specific blood analytes in the cohort, a pilot study was performed from a subset of 24 subjects comparing serum, Ethylenediaminetetraacetic acid (EDTA) plasma, and EDTA plasma with proteinase inhibitors (P100). 105 analytes, chosen for potential relevance to COPD, arranged in 12 multiplex and one simplex platform (Myriad-RBM) were evaluated in duplicate from the three sample types from 24 subjects. The reliability coefficient and the coefficient of variation (CV) were calculated. The performance of each analyte and mean analyte levels were evaluated across sample types. 20% of analytes were not consistently detectable in any sample type. Higher reliability and/or smaller CV were determined for 12 analytes in EDTA plasma compared to serum, and for 11 analytes in serum compared to EDTA plasma. While reliability measures were similar for EDTA plasma and P100 plasma for a majority of analytes, CV was modestly increased in P100 plasma for eight analytes. Each analyte within a multiplex produced independent measurement characteristics, complicating selection of sample type for individual multiplexes. There were notable detectability and measurability differences between serum and plasma. Multiplexing may not be ideal if large reliability differences exist across analytes measured within the multiplex, especially if values differ based on sample type. For some analytes, the large CV should be considered during experimental design, and the use of duplicate and/or triplicate samples may be necessary. 
    These results should prove useful for studies evaluating selection of samples for evaluation of potential blood biomarkers.

  3. Comparison of serum, EDTA plasma and P100 plasma for luminex-based biomarker multiplex assays in patients with chronic obstructive pulmonary disease in the SPIROMICS study

    PubMed Central

    2014-01-01

    Background: As a part of the longitudinal Chronic Obstructive Pulmonary Disease (COPD) study, Subpopulations and Intermediate Outcome Measures in COPD study (SPIROMICS), blood samples are being collected from 3200 subjects with the goal of identifying blood biomarkers for sub-phenotyping patients and predicting disease progression. To determine the most reliable sample type for measuring specific blood analytes in the cohort, a pilot study was performed from a subset of 24 subjects comparing serum, Ethylenediaminetetraacetic acid (EDTA) plasma, and EDTA plasma with proteinase inhibitors (P100™). Methods: 105 analytes, chosen for potential relevance to COPD, arranged in 12 multiplex and one simplex platform (Myriad-RBM) were evaluated in duplicate from the three sample types from 24 subjects. The reliability coefficient and the coefficient of variation (CV) were calculated. The performance of each analyte and mean analyte levels were evaluated across sample types. Results: 20% of analytes were not consistently detectable in any sample type. Higher reliability and/or smaller CV were determined for 12 analytes in EDTA plasma compared to serum, and for 11 analytes in serum compared to EDTA plasma. While reliability measures were similar for EDTA plasma and P100 plasma for a majority of analytes, CV was modestly increased in P100 plasma for eight analytes. Each analyte within a multiplex produced independent measurement characteristics, complicating selection of sample type for individual multiplexes. Conclusions: There were notable detectability and measurability differences between serum and plasma. Multiplexing may not be ideal if large reliability differences exist across analytes measured within the multiplex, especially if values differ based on sample type. For some analytes, the large CV should be considered during experimental design, and the use of duplicate and/or triplicate samples may be necessary. These results should prove useful for studies evaluating selection of samples for evaluation of potential blood biomarkers. PMID:24397870

  4. Comparison of validity and reliability of the Migraine disability assessment (MIDAS) versus headache impact test (HIT) in an Iranian population.

    PubMed

    Ghorbani, Abbas; Chitsaz, Ahmad

    2011-01-01

    Migraine is one of the most common headache disorders, affecting 11% or more of the adult population. Recently, researchers have designed two questionnaires, namely the Headache Impact Test (HIT) and the Migraine Disability Assessment (MIDAS), with the aim of improving migraine care. These two tests provide a standard measurement of migraine's effects on people's lifestyle and divide patients into 4 groups (grades) based on headache intensity. The aim of this study was to compare the validity and reliability of these two tests. This study was designed as a multicenter, descriptive study to compare the validity and reliability of the Persian versions of the MIDAS and HIT questionnaires in 240 males and females with a migraine diagnosis according to the criteria for headache and facial pain of the International Headache Society (IHS). The patients were enrolled in the study from 3 neurology clinics in Isfahan, Iran, between July 2004 and January 2005 and were evaluated at baseline (visit 1) and 4 weeks later (visit 2). According to our study, there was a high correlation between the two tests (r = 0.94). This decreased their MIDAS grade in comparison to their grade on the HIT questionnaire. These findings demonstrated that the Persian version of HIT has the same validity and reliability as MIDAS. Replying to the HIT questionnaire was easier than replying to MIDAS for Iranian patients. Physicians can reliably use the Persian translation of both the MIDAS and HIT questionnaires to define the severity of illness and its treatment strategy as a self-administered report by migraine patients. However, we recommend HIT for its simplicity in headache clinics.

  5. Intra- and inter-observer reliability of ten major histological scoring systems used for the evaluation of in vivo cartilage repair.

    PubMed

    Bonasia, Davide Edoardo; Marmotti, Antongiulio; Massa, Alessandro Domenico Felice; Ferro, Andrea; Blonna, Davide; Castoldi, Filippo; Rossi, Roberto

    2015-09-01

    In the last two decades, many surgical techniques have been described for articular cartilage repair. Reliable histological scoring systems are fundamental tools to evaluate new procedures. Several histological scoring systems have been described, and these can be divided into elementary and comprehensive scores, according to the number of sub-items. The aim of this study was to test the inter- and intra-observer reliability of ten main scores used for the histological evaluation of in vivo cartilage repair. The authors tested the starting hypothesis that elementary scores would show superior intra- and inter-observer reliability compared with comprehensive scores. Fifty histological sections obtained from the trochlea of New Zealand rabbits and stained with Safranin-O fast green were used. The histological sections were analysed by 4 observers: 2 experienced in cartilage histology and 2 inexperienced. Histological evaluations were performed at time 1 and time 2, separated by a 30-day interval. The following scores were used: Mankin, O'Driscoll, Pineda, Wakitani, Fortier, Sellers, ICRS, ICRSII, Oswestry (OsScore) and modified O'Driscoll. Intra- and inter-observer reliability were evaluated for each score. In addition, the floor-ceiling effect and the Bland-Altman Coefficient of Repeatability were evaluated for each sub-item of every score. Intra-observer reliability was high for all observers in every score, even though the reliability was significantly lower for non-expert observers compared with expert counterparts. In terms of Coefficient of Repeatability, some scores performed better (O'Driscoll, Modified O'Driscoll and ICRSII) than others (Fortier, Sellers). Inter-observer reliability was high for all observers in every score, but significantly lower for non-expert compared with expert observers. In expert hands, all the scores showed high intra- and inter-observer reliability, independently of the complexity. Although every score has advantages and disadvantages, ICRSII, O'Driscoll and Modified O'Driscoll scores should be preferred for the evaluation of in vivo cartilage repair in animal models.

  6. Validity and Interrater Reliability of the Visual Quarter-Waste Method for Assessing Food Waste in Middle School and High School Cafeteria Settings.

    PubMed

    Getts, Katherine M; Quinn, Emilee L; Johnson, Donna B; Otten, Jennifer J

    2017-11-01

    Measuring food waste (ie, plate waste) in school cafeterias is an important tool to evaluate the effectiveness of school nutrition policies and interventions aimed at increasing consumption of healthier meals. Visual assessment methods are frequently applied in plate waste studies because they are more convenient than weighing. The visual quarter-waste method has become a common tool in studies of school meal waste and consumption, but previous studies of its validity and reliability have used correlation coefficients, which measure association but not necessarily agreement. The aims of this study were to determine, using a statistic measuring interrater agreement, whether the visual quarter-waste method is valid and reliable for assessing food waste in a school cafeteria setting when compared with the gold standard of weighed plate waste. To evaluate validity, researchers used the visual quarter-waste method and weighed food waste from 748 trays at four middle schools and five high schools in one school district in Washington State during May 2014. To assess interrater reliability, researcher pairs independently assessed 59 of the same trays using the visual quarter-waste method. Both validity and reliability were assessed using a weighted κ coefficient. For validity, as compared with the measured weight, 45% of foods assessed using the visual quarter-waste method were in almost perfect agreement, 42% of foods were in substantial agreement, 10% were in moderate agreement, and 3% were in slight agreement. For interrater reliability between pairs of visual assessors, 46% of foods were in perfect agreement, 31% were in almost perfect agreement, 15% were in substantial agreement, and 8% were in moderate agreement. These results suggest that the visual quarter-waste method is a valid and reliable tool for measuring plate waste in school cafeteria settings. Copyright © 2017 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.

  7. The reliability of a simplified water displacement instrument: a method for measuring arm volume.

    PubMed

    Sagen, Ase; Kåresen, Rolf; Risberg, May Arna

    2005-01-01

    To present a new water displacement measurement, the Simplified Water Displacement Instrument (SWDI), and to evaluate its intra- and intertester reliability. Reliability design. Hospital setting. Fifty-six healthy people were studied. Intratester reliability was evaluated once a week for 4 weeks in 20 women and 10 men. Intertester reliability was assessed by 2 physical therapists in 26 people. Not applicable. Coefficients of variation (CVs) and intraclass correlation coefficients (ICCs). The intratester reliability showed a CV range of 2.2% to 2.6% and an ICC range of .98 to .99. The intertester reliability showed a CV of 1.3% and an ICC of .99. There was a significant increase in arm volume in men compared with women. There were no significant differences in changes in volume over the 4 weeks. There was a significantly greater right arm volume (3.3%) among the right-handed subjects (P<.001). Both intra- and intertester reliability were satisfactory for the SWDI.

  8. Reliability and Validity of the Spanish Adaptation of EOSS, Comparing Normal and Clinical Samples

    ERIC Educational Resources Information Center

    Valero-Aguayo, Luis; Ferro-Garcia, Rafael; Lopez-Bermudez, Miguel Angel; de Huralde, Ma. Angeles Selva-Lopez

    2012-01-01

    The Experiencing of Self Scale (EOSS) was created for the evaluation of Functional Analytic Psychotherapy (Kohlenberg & Tsai, 1991, 2001, 2008) in relation to the concept of the experience of personal self as socially and verbally constructed. This paper presents a reliability and validity study of the EOSS with a Spanish sample (582…

  9. The Reliability and Validity of the Thin Slice Technique: Observational Research on Video Recorded Medical Interactions

    ERIC Educational Resources Information Center

    Foster, Tanina S.

    2014-01-01

    Introduction: Observational research using the thin slice technique has been routinely incorporated in observational research methods; however, there is limited evidence supporting use of this technique compared to full interaction coding. The purpose of this study was to determine if this technique could be reliably coded, if ratings are…

  10. Students' and Teacher's Experiences of the Validity and Reliability of Assessment in a Bioscience Course

    ERIC Educational Resources Information Center

    Räisänen, Milla; Tuononen, Tarja; Postareff, Liisa; Hailikari, Telle; Virtanen, Viivi

    2016-01-01

    This case study explores the assessment of students' learning outcomes in a second-year lecture course in biosciences. The aim is to deeply explore the teacher's and the students' experiences of the validity and reliability of assessment and to compare those perspectives. The data were collected through stimulated recall interviews. The results…

  11. Ankle Accelerometry for Assessing Physical Activity among Adolescent Girls: Threshold Determination, Validity, Reliability, and Feasibility

    ERIC Educational Resources Information Center

    Hager, Erin R.; Treuth, Margarita S.; Gormely, Candice; Epps, LaShawna; Snitker, Soren; Black, Maureen M.

    2015-01-01

    Purpose: Ankle accelerometry allows for 24-hr data collection and improves data volume/integrity versus hip accelerometry. Using Actical ankle accelerometry, the purpose of this study was to (a) develop sensitive/specific thresholds, (b) examine validity/reliability, (c) compare new thresholds with those of the manufacturer, and (d) examine…

  12. Is intra-bladder pressure measurement a reliable indicator for raised intra-abdominal pressure? A prospective comparative study.

    PubMed

    Al-Abassi, Abdulla Ahmed; Al Saadi, Azan Saleh; Ahmed, Faisal

    2018-06-19

    Intra-abdominal pressure (IAP) can be measured by several indirect methods; however, the urinary bladder is largely preferred. The aim of this study was to compare intra-bladder pressure (IBP) at different levels of IAPs and assess its reliability as an indirect method for IAP measurement. We compared IBP with IAP in twenty-one patients undergoing laparoscopic cholecystectomy under general anesthesia. Measurements were recorded at increasing levels of insufflation pressures to approximately 22 mmHg. Pearson's correlation coefficient was calculated to establish the relationship between the two pressure measurements and Bland-Altman analysis was used to assess the limits of agreement between the two methods of measurements. The urinary bladder pressures reflected well the pressures in the abdominal cavity. Pearson correlation coefficient showed a good correlation between the two measurement techniques (r = 0.966, p < 0.0001) and Bland-Altman analysis indicated that the 95% limits of agreement between the two methods ranged from -2.83 to 2.64. This range is accepted both clinically and according to the recommendations of the World Society of Abdominal Compartment Syndrome (WSACS). Our study showed that IBP measurement is a simple, minimally invasive method that may reliably estimate IAP in patients placed in supine position. Measurements for pressures higher than 12 mmHg may be less reliable. When applied clinically, this should alert the clinician to take safety measures to avoid abdominal compartment syndrome (ACS).
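
    As an illustration of the Bland-Altman analysis named in this record, the following is a minimal sketch of how 95% limits of agreement are commonly computed: the mean of the paired differences plus or minus 1.96 times their sample standard deviation. The function name and the paired readings are hypothetical and are not taken from the study; the study's own computation may differ in detail.

```python
import statistics


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower, upper) limits of agreement for paired readings.

    bias is the mean difference (method_a - method_b); the limits are
    bias -/+ z times the sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    if len(method_a) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical paired pressures in mmHg (NOT data from the study).
ibp = [5.0, 8.5, 11.0, 14.5, 18.0, 21.0]
iap = [6.0, 8.0, 12.0, 14.0, 19.0, 22.0]

bias, lower, upper = bland_altman_limits(ibp, iap)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

    Unlike a correlation coefficient, these limits are expressed in the measurement units, so they can be compared directly with a clinically acceptable difference.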

  13. Reliability on intra-laboratory and inter-laboratory data of hair mineral analysis comparing with blood analysis.

    PubMed

    Namkoong, Sun; Hong, Seung Phil; Kim, Myung Hwa; Park, Byung Cheol

    2013-02-01

    Nowadays, although its clinical value remains controversial, institutions utilize hair mineral analysis. Arguments about the reliability of hair mineral analysis persist, and there have been evaluations of commercial laboratories performing hair mineral analysis. The objective of this study was to assess the reliability of intra-laboratory and inter-laboratory data at three commercial laboratories conducting hair mineral analysis, compared to serum mineral analysis. Two divided hair samples taken from near the scalp of one healthy volunteer were submitted for analysis at the same time to all laboratories. Each laboratory sent a report consisting of quantitative results and their interpretation of health implications. Differences among intra-laboratory and inter-laboratory data were analyzed using SPSS version 12.0 (SPSS Inc., USA). All the laboratories used identical methods for quantitative analysis, and they generated consistent numerical results according to Friedman analysis of variance. However, the normal reference ranges of each laboratory varied. As such, each laboratory interpreted the patient's health differently. On intra-laboratory data, Wilcoxon analysis suggested that the laboratories generated relatively coherent data, except that laboratory B did not for one element, so its reliability was doubtful. In comparison with the blood test, laboratory C generated identical results, but laboratories A and B did not. Hair mineral analysis has its limitations, considering the reliability of inter- and intra-laboratory analysis compared with blood analysis. As such, clinicians should be cautious when applying hair mineral analysis as an ancillary tool. Each laboratory included in this study requires continuous refinement to establish standardized normal reference levels.

  14. Study samples are too small to produce sufficiently precise reliability coefficients.

    PubMed

    Charter, Richard A

    2003-04-01

    In a survey of journal articles, test manuals, and test critique books, the author found that a mean sample size (N) of 260 participants had been used for reliability studies on 742 tests. The distribution was skewed because the median sample size for the total sample was only 90. The median sample sizes for the internal consistency, retest, and interjudge reliabilities were 182, 64, and 36, respectively. The author presented sample size statistics for the various internal consistency methods and types of tests. In general, the author found that the sample sizes that were used in the internal consistency studies were too small to produce sufficiently precise reliability coefficients, which in turn could cause imprecise estimates of examinee true-score confidence intervals. The results also suggest that larger sample sizes have been used in the last decade compared with those that were used in earlier decades.

  15. The Children's Play Therapy Instrument (CPTI): Description, Development, and Reliability Studies

    PubMed Central

    Kernberg, Paulina F.; Chazan, Saralea E.; Normandin, Lina

    1998-01-01

    The Children's Play Therapy Instrument (CPTI), its development, and reliability studies are described. The CPTI is a new instrument to examine a child's play activity in individual psychotherapy. Three independent raters used the CPTI to rate eight videotaped play therapy vignettes. Results were compared with the authors' consensual scores from a preliminary study. Generally good to excellent levels of interrater reliability were obtained for the independent raters on intraclass correlation coefficients for ordinal categories of the CPTI. Likewise, kappa levels were acceptable to excellent for nominal categories of the scale. The CPTI holds promise to become a reliable measure of play activity in child psychotherapy. Further research is needed to assess discriminant validity of the CPTI for use as a diagnostic tool and as a measure of process and outcome. (The Journal of Psychotherapy Practice and Research 1998; 7:196–207) PMID:9631341

  16. Identifying dyslexia in adults: an iterative method using the predictive value of item scores and self-report questions.

    PubMed

    Tamboer, Peter; Vorst, Harrie C M; Oort, Frans J

    2014-04-01

    Methods for identifying dyslexia in adults vary widely between studies. Researchers have to decide how many tests to use, which tests are considered to be the most reliable, and how to determine cut-off scores. The aim of this study was to develop an objective and powerful method for diagnosing dyslexia. We took various methodological measures, most of which are new compared to previous methods. We used a large sample of Dutch first-year psychology students, we considered several options for exclusion and inclusion criteria, we collected as many cognitive tests as possible, we used six independent sources of biographical information for a criterion of dyslexia, we compared the predictive power of discriminant analyses and logistic regression analyses, we used both sum scores and item scores as predictor variables, we used self-report questions as predictor variables, and we retested the reliability of predictions with repeated prediction analyses using an adjusted criterion. We were able to identify 74 dyslexic and 369 non-dyslexic students. For 37 students, various predictions were too inconsistent for a final classification. The most reliable predictions were acquired with item scores and self-report questions. The main conclusion is that it is possible to identify dyslexia with a high reliability, although the exact nature of dyslexia is still unknown. We therefore believe that this study yielded valuable information for future methods of identifying dyslexia in Dutch as well as in other languages, and that this would be beneficial for comparing studies across countries.

  17. Effects of resting state condition on reliability, trait specificity, and network connectivity of brain function measured with arterial spin labeled perfusion MRI.

    PubMed

    Li, Zhengjun; Vidorreta, Marta; Katchmar, Natalie; Alsop, David C; Wolf, Daniel H; Detre, John A

    2018-06-01

    Resting state fMRI (rs-fMRI) provides imaging biomarkers of task-independent brain function that can be associated with clinical variables or modulated by interventions such as behavioral training or pharmacological manipulations. These biomarkers include time-averaged regional brain function as manifested by regional cerebral blood flow (CBF) measured using arterial spin labeled (ASL) perfusion MRI and correlated temporal fluctuations of function across brain networks with either ASL or blood oxygenation level dependent (BOLD) fMRI. Resting-state studies are typically carried out using just one of several prescribed state conditions such as eyes closed (EC), eyes open (EO), or visual fixation on a cross-hair (FIX), which may affect the reliability and specificity of rs-fMRI. In this study, we collected test-retest ASL MRI data during 4 resting-state task conditions: EC, EO, FIX and PVT (low-frequency psychomotor vigilance task), and examined the effects of these task conditions on reliability and reproducibility as well as trait specificity of regional brain function. We also acquired resting-state BOLD fMRI under FIX and compared the network connectivity reliabilities between the four ASL conditions and the BOLD FIX condition. For resting-state ASL data, EC provided the highest CBF reliability, reproducibility, trait specificity, and network connectivity reliability, followed by EO, while FIX was lowest on all of these measures. PVT demonstrated lower CBF reliability, reproducibility and trait specificity than EO and EC. Overall network connectivity reliability was comparable between ASL and BOLD. Our findings confirm ASL CBF as a reliable, stable, and consistent measure of resting-state regional brain function and support the use of EC or EO over FIX and PVT as the resting-state condition. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  18. Factor validity and reliability of the aberrant behavior checklist-community (ABC-C) in an Indian population with intellectual disability.

    PubMed

    Lehotkay, R; Saraswathi Devi, T; Raju, M V R; Bada, P K; Nuti, S; Kempf, N; Carminati, G Galli

    2015-03-01

    In this study, conducted in collaboration with the department of psychology and parapsychology of Andhra University, the Aberrant Behavior Checklist-Community (ABC-C) was validated in Telugu, the official language of Andhra Pradesh, one of India's 28 states. To assess the factor validity and reliability of this Telugu version, 120 participants with moderate to profound intellectual disability (94 men and 26 women, mean age 25.2, SD 7.1) were rated by the staff of the Lebenshilfe Institution for Mentally Handicapped in Visakhapatnam, Andhra Pradesh, India. Rating data were analysed with a confirmatory factor analysis. The internal consistency was estimated by Cronbach's alpha. To confirm the test-retest reliability, 50 participants were rated twice with an interval of 4 weeks, and 50 were rated by pairs of raters to assess inter-rater reliability. Confirmatory factor analysis revealed that the root mean square error of approximation (RMSEA) was equal to 0.06, the comparative fit index (CFI) was equal to 0.77, and the Tucker Lewis index (TLI) was equal to 0.77, which indicated that the model with five correlated factors had a good fit. Coefficient alpha ranged from 0.85 to 0.92 across the five subscales. Spearman's rank correlation coefficients for inter-rater reliability tests ranged from 0.65 to 0.75, and the correlations for test-retest reliability ranged from 0.58 to 0.76. All reliability coefficients were statistically significant (P < 0.01). The Telugu version of the ABC-C evidenced factor validity and reliability comparable to the original English version and appears to be useful for assessing behaviour disorders in Indian people with intellectual disabilities. © 2014 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
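
    For readers unfamiliar with the internal-consistency statistic used in this record, the sketch below computes Cronbach's alpha from its textbook definition, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the small rating matrix are hypothetical illustrations, not ABC-C data, and the study's software may have handled missing values or variance estimation differently.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-participant item-score lists.

    item_scores[p][i] is the score of participant p on item i.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every participant needs a score for every item")
    # Variance of each item across participants (sample variance).
    item_vars = [statistics.variance(col) for col in zip(*item_scores)]
    # Variance of each participant's total score.
    total_var = statistics.variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 participants x 3 items on a 0-3 scale.
ratings = [
    [0, 1, 1],
    [1, 2, 2],
    [2, 2, 3],
    [3, 3, 3],
    [1, 1, 2],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.3f}")
```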

  19. Inter-rater reliability for movement pattern analysis (MPA): measuring patterning of behaviors versus discrete behavior counts as indicators of decision-making style

    PubMed Central

    Connors, Brenda L.; Rende, Richard; Colton, Timothy J.

    2014-01-01

    The unique yield of collecting observational data on human movement has received increasing attention in a number of domains, including the study of decision-making style. As such, interest has grown in the nuances of core methodological issues, including the best ways of assessing inter-rater reliability. In this paper we focus on one key topic – the distinction between establishing reliability for the patterning of behaviors as opposed to the computation of raw counts – and suggest that reliability for each be compared empirically rather than determined a priori. We illustrate by assessing inter-rater reliability for key outcome measures derived from movement pattern analysis (MPA), an observational methodology that records body movements as indicators of decision-making style with demonstrated predictive validity. While reliability ranged from moderate to good for raw counts of behaviors reflecting each of two Overall Factors generated within MPA (Assertion and Perspective), inter-rater reliability for patterning (proportional indicators of each factor) was significantly higher and excellent (ICC = 0.89). Furthermore, patterning, as compared to raw counts, provided better prediction of observable decision-making process assessed in the laboratory. These analyses support the utility of using an empirical approach to inform the consideration of measuring patterning versus discrete behavioral counts of behaviors when determining inter-rater reliability of observable behavior. They also speak to the substantial reliability that may be achieved via application of theoretically grounded observational systems such as MPA that reveal thinking and action motivations via visible movement patterns. PMID:24999336

  20. Inter-rater reliability for movement pattern analysis (MPA): measuring patterning of behaviors versus discrete behavior counts as indicators of decision-making style.

    PubMed

    Connors, Brenda L; Rende, Richard; Colton, Timothy J

    2014-01-01

    The unique yield of collecting observational data on human movement has received increasing attention in a number of domains, including the study of decision-making style. As such, interest has grown in the nuances of core methodological issues, including the best ways of assessing inter-rater reliability. In this paper we focus on one key topic - the distinction between establishing reliability for the patterning of behaviors as opposed to the computation of raw counts - and suggest that reliability for each be compared empirically rather than determined a priori. We illustrate by assessing inter-rater reliability for key outcome measures derived from movement pattern analysis (MPA), an observational methodology that records body movements as indicators of decision-making style with demonstrated predictive validity. While reliability ranged from moderate to good for raw counts of behaviors reflecting each of two Overall Factors generated within MPA (Assertion and Perspective), inter-rater reliability for patterning (proportional indicators of each factor) was significantly higher and excellent (ICC = 0.89). Furthermore, patterning, as compared to raw counts, provided better prediction of observable decision-making process assessed in the laboratory. These analyses support the utility of using an empirical approach to inform the consideration of measuring patterning versus discrete behavioral counts of behaviors when determining inter-rater reliability of observable behavior. They also speak to the substantial reliability that may be achieved via application of theoretically grounded observational systems such as MPA that reveal thinking and action motivations via visible movement patterns.

  1. Reliability and One-Year Stability of the PIN3 Neighborhood Environmental Audit in Urban and Rural Neighborhoods.

    PubMed

    Porter, Anna K; Wen, Fang; Herring, Amy H; Rodríguez, Daniel A; Messer, Lynne C; Laraia, Barbara A; Evenson, Kelly R

    2018-06-01

    Reliable and stable environmental audit instruments are needed to successfully identify the physical and social attributes that may influence physical activity. This study described the reliability and stability of the PIN3 environmental audit instrument in both urban and rural neighborhoods. Four randomly sampled road segments in and around a one-quarter mile buffer of participants' residences from the Pregnancy, Infection, and Nutrition (PIN3) study were rated twice, approximately 2 weeks apart. One year later, 253 of the year 1 sampled roads were re-audited. The instrument included 43 measures that resulted in 73 item scores for calculation of percent overall agreement, kappa statistics, and log-linear models. For same-day reliability, 81% of items had moderate to outstanding kappa statistics (kappas ≥ 0.4). Two-week reliability was slightly lower, with 77% of items having moderate to outstanding agreement using kappa statistics. One-year stability had 68% of items showing moderate to outstanding agreement using kappa statistics. The reliability of the audit measures was largely consistent when comparing urban to rural locations, with only 8% of items exhibiting significant differences (α < 0.05) by urbanicity. The PIN3 instrument is a reliable and stable audit tool for studies assessing neighborhood attributes in urban and rural environments.

  2. Reliability and minimal detectable difference in multisegment foot kinematics during shod walking and running.

    PubMed

    Milner, Clare E; Brindle, Richard A

    2016-01-01

    There has been increased interest recently in measuring kinematics within the foot during gait. While several multisegment foot models have appeared in the literature, the Oxford foot model has been used frequently for both walking and running. Several studies have reported the reliability for the Oxford foot model, but most studies to date have reported reliability for barefoot walking. The purpose of this study was to determine between-day (intra-rater) and within-session (inter-trial) reliability of the modified Oxford foot model during shod walking and running and calculate minimum detectable difference for common variables of interest. Healthy adult male runners participated. Participants ran and walked in the gait laboratory for five trials of each. Three-dimensional gait analysis was conducted and foot and ankle joint angle time series data were calculated. Participants returned for a second gait analysis at least 5 days later. Intraclass correlation coefficients and minimum detectable difference were determined for walking and for running, to indicate both within-session and between-day reliability. Overall, relative variables were more reliable than absolute variables, and within-session reliability was greater than between-day reliability. Between-day intraclass correlation coefficients were comparable to those reported previously for adults walking barefoot. It is an extension in the use of the Oxford foot model to incorporate wearing a shoe while maintaining marker placement directly on the skin for each segment. These reliability data for walking and running will aid in the determination of meaningful differences in studies which use this model during shod gait. Copyright © 2015 Elsevier B.V. All rights reserved.
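
    The abstract does not state how minimum detectable difference was derived from the intraclass correlation coefficients. One commonly used approach, shown here purely as an assumption, takes the standard error of measurement as SEM = SD * sqrt(1 - ICC) and the 95% minimum detectable difference as 1.96 * sqrt(2) * SEM. The function names and input values below are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), for a between-subject SD and an ICC <= 1."""
    if icc > 1:
        raise ValueError("ICC cannot exceed 1")
    if sd < 0:
        raise ValueError("SD cannot be negative")
    return sd * math.sqrt(1 - icc)


def minimum_detectable_difference(sd, icc, z=1.96):
    """MDD at the confidence level implied by z (1.96 for 95%).

    The sqrt(2) factor reflects that a difference involves two
    measurements, each carrying its own measurement error.
    """
    return z * math.sqrt(2) * standard_error_of_measurement(sd, icc)


# Hypothetical joint-angle variable: SD of 4.0 degrees, ICC of 0.84.
sem = standard_error_of_measurement(4.0, 0.84)
mdd = minimum_detectable_difference(4.0, 0.84)
print(f"SEM = {sem:.2f} deg, MDD95 = {mdd:.2f} deg")
```

    Under these assumed inputs, a between-day change smaller than the printed MDD95 could not be distinguished from measurement error.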

  3. A systematic review of publications assessing reliability and validity of the Behavioral Risk Factor Surveillance System (BRFSS), 2004–2011

    PubMed Central

    2013-01-01

    Background: In recent years response rates on telephone surveys have been declining. Rates for the behavioral risk factor surveillance system (BRFSS) have also declined, prompting the use of new methods of weighting and the inclusion of cell phone sampling frames. A number of scholars and researchers have conducted studies of the reliability and validity of the BRFSS estimates in the context of these changes. As the BRFSS makes changes in its methods of sampling and weighting, a review of reliability and validity studies of the BRFSS is needed. Methods: In order to assess the reliability and validity of prevalence estimates taken from the BRFSS, scholarship published from 2004–2011 dealing with tests of reliability and validity of BRFSS measures was compiled and presented by topics of health risk behavior. Assessments of the quality of each publication were undertaken using a categorical rubric. Higher rankings were achieved by authors who conducted reliability tests using repeated test/retest measures, or who conducted tests using multiple samples. A similar rubric was used to rank validity assessments. Validity tests which compared the BRFSS to physical measures were ranked higher than those comparing the BRFSS to other self-reported data. Literature which undertook more sophisticated statistical comparisons was also ranked higher. Results: Overall findings indicated that BRFSS prevalence rates were comparable to other national surveys which rely on self-reports, although specific differences are noted for some categories of response. BRFSS prevalence rates were less similar to surveys which utilize physical measures in addition to self-reported data. There is very little research on reliability and validity for some health topics, but a great deal of information supporting the validity of the BRFSS data for others. Conclusions: Limitations of the examination of the BRFSS were due to question differences among surveys used as comparisons, as well as mode of data collection differences. As the BRFSS moves to incorporating cell phone data and changing weighting methods, a review of reliability and validity research indicated that past BRFSS landline only data were reliable and valid as measured against other surveys. New analyses and comparisons of BRFSS data which include the new methodologies and cell phone data will be needed to ascertain the impact of these changes on estimates in the future. PMID:23522349

  4. Interventions to assist health consumers to find reliable online health information: a comprehensive review.

    PubMed

    Lee, Kenneth; Hoti, Kreshnik; Hughes, Jeffery D; Emmerton, Lynne M

    2014-01-01

    Health information on the Internet is ubiquitous, and its use by health consumers prevalent. Finding and understanding relevant online health information, and determining content reliability, pose real challenges for many health consumers. To identify the types of interventions that have been implemented to assist health consumers to find reliable online health information, and where possible, describe and compare the types of outcomes studied. PubMed, PsycINFO, CINAHL Plus and Cochrane Library databases; WorldCat and Scirus 'gray literature' search engines; and manual review of reference lists of selected publications. Publications were selected by firstly screening title, abstract, and then full text. Seven publications met the inclusion criteria, and were summarized in a data extraction form. The form incorporated the PICOS (Population Intervention Comparators Outcomes and Study Design) Model. Two eligible gray literature papers were also reported. Relevant data from included studies were tabulated to enable descriptive comparison. A brief critique of each study was included in the tables. This review was unable to follow systematic review methods due to the paucity of research and humanistic interventions reported. While extensive, the gray literature search may have had limited reach in some countries. The paucity of research on this topic limits conclusions that may be drawn. The few eligible studies predominantly adopted a didactic approach to assisting health consumers, whereby consumers were either taught how to find credible websites, or how to use the Internet. Common types of outcomes studied include knowledge and skills pertaining to Internet use and searching for reliable health information. These outcomes were predominantly self-assessed by participants. There is potential for further research to explore other avenues for assisting health consumers to find reliable online health information, and to assess outcomes via objective measures.

  5. Adaptation, reliability and validity testing of a Persian version of the Health Assessment Questionnaire-Disability Index in Iranian patients with rheumatoid arthritis.

    PubMed

    Nazary-Moghadam, Salman; Zeinalzadeh, Afsaneh; Salavati, Mahyar; Almasi, Simin; Negahban, Hossein

    2017-01-01

    The aim of the present study was to culturally adapt and evaluate reliability and validity of Health Assessment Questionnaire-Disability Index (HAQ-DI) in Iranian patients with rheumatoid arthritis (RA). The validation study included 234 patients with RA, and the reliability study included 86 participants. Test-retest relative reliability and internal consistency of Persian version of HAQ-DI were examined by intraclass correlation coefficient (ICC) and Cronbach's alpha, respectively. Additionally, HAQ-DI construct validity (Spearman's correlation) was examined using Persian version of Short-Form 36 Health survey (SF-36), activity and severity parameters. Persian version of HAQ-DI total score showed excellent test-retest reliability (ICC = 0.98) and internal consistency (Cronbach's alpha = 0.95). Spearman's correlations between the total PHAQ-DI score and activity and severity parameters were above 0.55. Correlations between the PHAQ-DI and SF-36 Physical Health were higher than those with SF-36 Mental Health. Persian version of HAQ-DI is a reliable and valid culturally-adapted instrument for measuring functional limitations in Iranian people with RA. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. What makes an accurate and reliable subject-specific finite element model? A case study of an elephant femur

    PubMed Central

    Panagiotopoulou, O.; Wilshin, S. D.; Rayfield, E. J.; Shefelbine, S. J.; Hutchinson, J. R.

    2012-01-01

    Finite element modelling is well entrenched in comparative vertebrate biomechanics as a tool to assess the mechanical design of skeletal structures and to better comprehend the complex interaction of their form–function relationships. But what makes a reliable subject-specific finite element model? To approach this question, we here present a set of convergence and sensitivity analyses and a validation study as an example, for finite element analysis (FEA) in general, of ways to ensure a reliable model. We detail how choices of element size, type and material properties in FEA influence the results of simulations. We also present an empirical model for estimating heterogeneous material properties throughout an elephant femur (but of broad applicability to FEA). We then use an ex vivo experimental validation test of a cadaveric femur to check our FEA results and find that the heterogeneous model matches the experimental results extremely well, and far better than the homogeneous model. We emphasize how considering heterogeneous material properties in FEA may be critical, so this should become standard practice in comparative FEA studies along with convergence analyses, consideration of element size, type and experimental validation. These steps may be required to obtain accurate models and derive reliable conclusions from them. PMID:21752810

  7. The active movement scale: an evaluative tool for infants with obstetrical brachial plexus palsy.

    PubMed

    Curtis, Christine; Stephens, Derek; Clarke, Howard M; Andrews, David

    2002-05-01

    Newborns with peripheral nerve lesions involving the upper extremity are difficult to evaluate. The reliability of the Active Movement Scale (AMS), a tool for assessing motor function in infants with obstetrical brachial plexus palsy (OBPP), was examined in 2 complementary studies. Part A was an interrater reliability study in which 63 infants younger than 1 year with OBPP were independently evaluated by 2 physical therapists using the AMS. The scores were compared for reliability and controlled for chance agreement by using kappa statistics. Overall kappa analysis of the 15 tested movements showed a moderate strength of score agreement (kappa = 0.51). Quadratic-weighted kappa (kappa(quad)) statistics showed that 8 of the 15 movements tested were in the highest strength of agreement category (kappa(quad) = 0.81-1.00). Five movements showed substantial agreement (kappa(quad) = 0.61-0.80), and 2 movements had moderate agreement (kappa(quad) = 0.41-0.60). The overall kappa(quad) was 0.89. Part B was a variability study designed to examine the dispersion of scores when infants with OBPP were evaluated with the AMS by multiple raters. Ten pediatric physical therapists with varying degrees of experience using the scale attended a 1.5-hour instructional workshop on administration of the tool for infants with OBPP. A chain-block study design was used to obtain 30 assessments of 10 infants by 10 raters. A 2-way analysis of variance indicated that the variability of scores due to rater factors was low compared with the variability due to patient factors and that variation in scores due to rater experience was minimal. The results of part A indicate that the AMS is a reliable tool for the assessment of infants with OBPP when raters familiar with the scale are compared. The results of part B suggest that, with minimal training, raters with a range of experience using the AMS are able to reliably evaluate infants with upper-extremity paralysis.
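
    The gap between the unweighted kappa (0.51) and the quadratic-weighted kappa (0.89) reported above arises because the weighted form penalizes near-misses on an ordinal scale less than distant disagreements. The following is a minimal sketch of both statistics for two raters; the function name, the 0-based category coding and the toy ratings are assumptions made for illustration and are not the study's data or software:

```python
def weighted_kappa(rater_a, rater_b, n_categories, weights="quadratic"):
    """Cohen's kappa for two raters on an ordinal scale coded 0..n_categories-1.

    weights: "unweighted", "linear" or "quadratic" disagreement weights.
    """
    if weights not in ("unweighted", "linear", "quadratic"):
        raise ValueError("unknown weighting scheme")
    if len(rater_a) != len(rater_b) or not rater_a or n_categories < 2:
        raise ValueError("need paired ratings and at least two categories")
    k, n = n_categories, len(rater_a)

    # Cross-tabulate the paired ratings.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights == "unweighted":
            return 0.0 if i == j else 1.0
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    observed_dis = 0.0
    expected_dis = 0.0
    for i in range(k):
        for j in range(k):
            w = disagreement(i, j)
            observed_dis += w * observed[i][j]
            # Count expected under independence of the two raters' marginals.
            expected_dis += w * row_totals[i] * col_totals[j] / n
    if expected_dis == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed_dis / expected_dis


# Hypothetical ratings of four movements on a 3-point scale.
a = [0, 1, 2, 2]
b = [0, 1, 2, 1]
print(weighted_kappa(a, b, 3, "unweighted"), weighted_kappa(a, b, 3, "quadratic"))
```

    In the toy data the single disagreement is only one category apart, so the quadratic-weighted value comes out higher than the unweighted one, mirroring the direction of the difference in the abstract.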

  8. Reliabilities of mental rotation tasks: limits to the assessment of individual differences.

    PubMed

    Hirschfeld, Gerrit; Thielsch, Meinald T; Zernikow, Boris

    2013-01-01

    Mental rotation tasks with objects and body parts as targets are widely used in cognitive neuropsychology. Even though these tasks are well established to study between-groups differences, the reliability on an individual level is largely unknown. We present a systematic study on the internal consistency and test-retest reliability of individual differences in mental rotation tasks comparing different target types and orders of presentations. In total n = 99 participants (n = 63 for the retest) completed the mental rotation tasks with hands, feet, faces, and cars as targets. Different target types were presented in either randomly mixed blocks or blocks of homogeneous targets. Across all target types, the consistency (split-half reliability) and stability (test-retest reliabilities) were good or acceptable both for intercepts and slopes. At the level of individual targets, only intercepts showed acceptable reliabilities. Blocked presentations resulted in significantly faster and numerically more consistent and stable responses. Mental rotation tasks-especially in blocked variants-can be used to reliably assess individual differences in global processing speed. However, the assessment of the theoretically important slope parameter for individual targets requires further adaptations to mental rotation tests.

  9. Reliability of surface electromyography activity of gluteal and hamstring muscles during sub-maximal and maximal voluntary isometric contractions.

    PubMed

    Bussey, Melanie D; Aldabe, Daniela; Adhia, Divya; Mani, Ramakrishnan

    2018-04-01

    Normalizing to a reference signal is essential when analysing and comparing electromyography signals across or within individuals. However, studies have shown that MVC testing may not be as reliable in persons with acute and chronic pain. The purpose of this study was to compare the test-retest reliability of the muscle activity in the biceps femoris and gluteus maximus between a novel sub-MVC and standard MVC protocols. This study utilized a single individual repeated measures design with 12 participants performing multiple trials of both the sub-MVC and MVC tasks on two separate days. The participant position in the prone leg raise task was standardised with an ultrasonic sensor to improve task precision between trials/days. Day-to-day and trial-to-trial reliability of the maximal muscle activity was examined using ICC and SEM. Day-to-day and trial-to-trial reliability of the EMG activity in the BF and GM were high (0.70-0.89) to very high (≥0.90) for both test procedures. %SEM was <5-10% for both tests on a given day but higher in the day-to-day comparisons. The lower amplitude of the sub-MVC is a likely contributor to increased %SEM (8-13%) in the day-to-day comparison. The findings show that the sub-MVC modified prone double leg raise results in GM and BF EMG measures similar in reliability and precision to the standard MVC tasks. Therefore, the modified prone double leg raise may be a useful substitute for traditional MVC testing for normalizing EMG signals of the BF and GM. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Reliability of the German version of the Children's Assessment of Participation and Enjoyment (CAPE) and Preferences for Activities of Children (PAC).

    PubMed

    Fink, A; Gebhard, B; Erdwiens, S; Haddenhorst, L; Nowak, S

    2016-09-01

    The introduction of the International Classification of Functioning, Disabilities and Health of the World Health Organization in 2001 made social participation a major rehabilitation outcome and the ultimate goal of rehabilitation services. There is no available instrument to measure the youth participation in leisure activities apart from asking the youth themselves. The goal of this study was to present a German version of the Children's Assessment of Participation and Enjoyment and Preferences for Activities of Children (CAPE/PAC). The CAPE/PAC questionnaire was translated into German, a cultural adaptation process was designed and a reliability study was conducted. One hundred and fifty-two youths with and without disabilities, with a mean age of 15.2 years (standard deviation 1.7), participated in the study. The participants completed CAPE and PAC twice within 4 weeks. Reliability was examined by intraclass correlation coefficients, standard error of measurement, smallest detectable change and Cronbach's alpha. The absolute values of participation differ between the typically developed youth group and those with impairments; the reliability of the CAPE/PAC is comparable in both groups. Intraclass correlation coefficients ranged from 0.43 to 0.74 for the CAPE and from 0.71 to 0.83 for the PAC in all participants. The alpha values for internal consistency ranged from 0.42 to 0.82 for the CAPE and from 0.65 to 0.92 for the PAC. The German version of the PAC showed satisfactory reliability; however, reliability was not satisfactory for all scores of the CAPE, but comparable with versions in other languages. The need for newly developed participation measurements requires further discussion. © 2016 John Wiley & Sons Ltd.
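
    The abstract above names the standard error of measurement (SEM) and the smallest detectable change (SDC) without giving formulas. One common formulation, assumed here rather than taken from the study, derives both from the between-subject standard deviation and the ICC. The function names and input values below are hypothetical:

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), a widely used formulation (assumed, not from the study)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this formulation expects an ICC between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """SDC at the 95% level for a difference between two measurements."""
    # sqrt(2) accounts for measurement error in both the test and the retest.
    return z * math.sqrt(2.0) * sem


# Hypothetical values: score SD of 10 points and a test-retest ICC of 0.75.
sem = standard_error_of_measurement(10.0, 0.75)
print(sem, smallest_detectable_change(sem))
```

    Under this formulation a lower ICC inflates the SEM, so a scale with ICCs near the bottom of the reported range needs a larger individual change before that change can be distinguished from measurement noise.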

  11. Validity and reliability of Optojump photoelectric cells for estimating vertical jump height.

    PubMed

    Glatthorn, Julia F; Gouge, Sylvain; Nussbaumer, Silvio; Stauffacher, Simone; Impellizzeri, Franco M; Maffiuletti, Nicola A

    2011-02-01

    Vertical jump is one of the most prevalent acts performed in several sport activities. It is therefore important to ensure that the measurements of vertical jump height made as a part of research or athlete support work have adequate validity and reliability. The aim of this study was to evaluate concurrent validity and reliability of the Optojump photocell system (Microgate, Bolzano, Italy) with force plate measurements for estimating vertical jump height. Twenty subjects were asked to perform maximal squat jumps and countermovement jumps, and flight time-derived jump heights obtained by the force plate were compared with those provided by Optojump, to examine its concurrent (criterion-related) validity (study 1). Twenty other subjects completed the same jump series on 2 different occasions (separated by 1 week), and jump heights of session 1 were compared with session 2, to investigate test-retest reliability of the Optojump system (study 2). Intraclass correlation coefficients (ICCs) for validity were very high (0.997-0.998), even if a systematic difference was consistently observed between force plate and Optojump (-1.06 cm; p < 0.001). Test-retest reliability of the Optojump system was excellent, with ICCs ranging from 0.982 to 0.989, low coefficients of variation (2.7%), and low random errors (±2.81 cm). The Optojump photocell system demonstrated strong concurrent validity and excellent test-retest reliability for the estimation of vertical jump height. We propose the following equation that allows force plate and Optojump results to be used interchangeably: force plate jump height (cm) = 1.02 × Optojump jump height + 0.29. In conclusion, the use of Optojump photoelectric cells is legitimate for field-based assessments of vertical jump height.
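
    The conversion equation proposed in the abstract above can be applied directly. The sketch below wraps it in a function and adds the standard projectile relation for deriving jump height from flight time (h = g·t²/8); that relation is textbook physics and is not quoted from the abstract. The function names and example inputs are hypothetical:

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed standard value)


def jump_height_from_flight_time_cm(flight_time_s):
    """Jump height in cm from flight time, assuming equal take-off and landing posture."""
    return 100.0 * G * flight_time_s ** 2 / 8.0


def force_plate_equivalent_cm(optojump_height_cm):
    """Equation reported in the abstract: force plate height = 1.02 * Optojump height + 0.29."""
    return 1.02 * optojump_height_cm + 0.29


# Hypothetical Optojump reading of 30.0 cm.
print(round(force_plate_equivalent_cm(30.0), 2))
```

    The slope and intercept are specific to the devices and sample in that study, so applying them to other equipment would be an extrapolation.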

  12. [Santa Claus is perceived as reliable and friendly: results of the Danish Christmas 2013 survey].

    PubMed

    Amin, Faisal Mohammad; West, Anders Sode; Jørgensen, Carina Sleiborg; Simonsen, Sofie Amalie; Lindberg, Ulrich; Tranum-Jensen, Jørgen; Hougaard, Anders

    2013-12-02

    Several studies have indicated that the population in general perceives doctors as reliable. In the present study, perceptions of reliability and kindness attributed to another socially significant archetype, Santa Claus, were examined in comparison with the doctor. In all, 52 randomly chosen participants were shown a film in which a narrator dressed either as Santa Claus or as a doctor tells an identical story. Structured interviews were then used to assess the subjects' perceptions of reliability and kindness in relation to the narrator's appearance. We found a strong tendency for Santa Claus to be perceived as friendlier than the doctor (p = 0.053). However, there was no significant difference in the perception of reliability between Santa Claus and the doctor (p = 0.524). The positive associations attributed to Santa Claus are probably why he is perceived as friendlier than the doctor, who may be associated with more serious and unpleasant memories of illness and suffering. Surprisingly, and despite him being an imaginary person, Santa Claus was assessed as being as reliable as the doctor.

  13. Reliability of lower limb alignment measures using an established landmark-based method with a customized computer software program

    PubMed Central

    Sled, Elizabeth A.; Sheehy, Lisa M.; Felson, David T.; Costigan, Patrick A.; Lam, Miu; Cooke, T. Derek V.

    2010-01-01

    The objective of the study was to evaluate the reliability of frontal plane lower limb alignment measures using a landmark-based method by (1) comparing inter- and intra-reader reliability between measurements of alignment obtained manually with those using a computer program, and (2) determining inter- and intra-reader reliability of computer-assisted alignment measures from full-limb radiographs. An established method for measuring alignment was used, involving selection of 10 femoral and tibial bone landmarks. 1) To compare manual and computer methods, we used digital images and matching paper copies of five alignment patterns simulating healthy and malaligned limbs drawn using AutoCAD. Seven readers were trained in each system. Paper copies were measured manually and repeat measurements were performed daily for 3 days, followed by a similar routine with the digital images using the computer. 2) To examine the reliability of computer-assisted measures from full-limb radiographs, 100 images (200 limbs) were selected as a random sample from 1,500 full-limb digital radiographs which were part of the Multicenter Osteoarthritis (MOST) Study. Three trained readers used the software program to measure alignment twice from the batch of 100 images, with two or more weeks between batch handling. Manual and computer measures of alignment showed excellent agreement (intraclass correlations [ICCs] 0.977 – 0.999 for computer analysis; 0.820 – 0.995 for manual measures). The computer program applied to full-limb radiographs produced alignment measurements with high inter- and intra-reader reliability (ICCs 0.839 – 0.998). In conclusion, alignment measures using a bone landmark-based approach and a computer program were highly reliable between multiple readers. PMID:19882339

  14. The TiltMeter app is a novel and accurate measurement tool for the weight bearing lunge test.

    PubMed

    Williams, Cylie M; Caserta, Antoni J; Haines, Terry P

    2013-09-01

    The weight bearing lunge test is increasingly being used by health care clinicians who treat lower limb and foot pathology. This measure is commonly established accurately and reliably with the use of expensive equipment. This study aims to compare the digital inclinometer with a free app, TiltMeter, on an Apple iPhone. This was an intra-rater and inter-rater reliability study. Two raters (novice and experienced) conducted the measurements in both a bent knee and straight leg position to determine the intra-rater and inter-rater reliability. Concurrent validity was also established. Allied health practitioners were recruited as participants from the workplace. A preconditioning stretch was conducted and the ankle range of motion was established with the weight bearing lunge test position, firstly with the leg straight and secondly with the knee bent. The measurement device and each participant were randomised during measurement. The intra-rater reliability and inter-rater reliability for the devices and in both positions were all over ICC 0.8 except for one intra-rater measure (digital inclinometer, novice, ICC 0.65). The inter-rater reliability between the digital inclinometer and the TiltMeter app was near perfect, ICC 0.96 (CI: 0.898-0.983); concurrent validity ICC between the two devices was 0.83 (CI: -0.740 to 0.445). The use of the TiltMeter app on the iPhone is a reliable and inexpensive tool to measure the available ankle range of motion. Health practitioners should use caution in applying these findings to other smart phone equipment if surface areas are not comparable. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.

  15. The development and validation of a custom built device for assessing frontal knee joint laxity.

    PubMed

    Ismail, Shiek Abdullah; Simic, Milena; Clarke, Jillian L; Lopes, Thiago Jambo Alves; Pappas, Evangelos

    2017-12-01

    This study reports the development and validation of a quantitative technique of assessing frontal knee joint laxity through a custom built device named KLICP. The objectives of this study were to determine: (i) the intra- and inter-rater reliability and (ii) the validity of the device when compared to real time ultrasound. Twenty-five participants had their frontal knee joint laxity assessed by the KLICP, by manual varus/valgus tests and by ultrasound. Two raters independently assessed laxity manually by three repeated measurements, repeated at least 48 h later. Results were validated by comparing them to the medial and lateral joint space opening measured by the ultrasound. Intraclass correlation coefficients and standard error of measurement reliability were calculated. Pearson's correlation coefficients were calculated to determine the correlation between the KLICP and the joint space. Intra-rater reliability (intra-session) for each rater was good on both sessions (0.91-0.98), intra-rater reliability (inter-sessions) was moderate to good (0.62-0.87), and inter-rater reliability (intra-session) was good (0.75-0.80). There is low agreement for intra-rater (inter-session) and for inter-rater (intra-session) reliability. The KLICP measurement has a significant positive fair to moderate correlation to the ultrasound measurement at the left (r: 0.61, p: 0.01) and right (r: 0.48, p: 0.02) knee in the valgus direction and at the left (r: 0.51, p: 0.01) and right (r: 0.39, p: 0.05) knee in the varus direction. There is low agreement between the KLICP and the RTU. Reliability and agreement were good only when measured for intra-rater, within session. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Validation of the translation of an instrument to measure reliability of written information on treatment choices: a study on attention deficit/hyperactivity disorder (ADHD).

    PubMed

    Montoya, A; Llopis, N; Gilaberte, I

    2011-12-01

    DISCERN is an instrument designed to help patients assess the reliability of written information on treatment choices. Originally created in English, there is no validated Spanish version of this instrument. This study seeks to validate the Spanish translation of the DISCERN instrument used as a primary measure on a multicenter study aimed to assess the reliability of web-based information on treatment choices for attention deficit/hyperactivity disorder (ADHD). We used a modified version of a method for validating translated instruments in which the original source-language version is formally compared with the back-translated source-language version. Each item was ranked in terms of comparability of language, similarity of interpretability, and degree of understandability. Responses used Likert scales ranging from 1 to 7, where 1 indicates the best interpretability, language and understandability, and 7 indicates the worst. Assessments were performed by 20 raters fluent in the source language. The Spanish translation of DISCERN, based on ratings of comparability, interpretability and degree of understandability (mean score (SD): 1.8 (1.1), 1.4 (0.9) and 1.6 (1.1), respectively), was considered extremely comparable. All items received a score of less than three, therefore no further revision of the translation was needed. The validation process showed that the quality of DISCERN translation was high, validating the comparable language of the tool translated on assessing written information on treatment choices for ADHD.

  17. Validation of the Practice Environment Scale to the Brazilian culture.

    PubMed

    Gasparino, Renata C; Guirardello, Edinêis de B

    2017-07-01

    To validate the Brazilian version of the Practice Environment Scale. The Practice Environment Scale is a tool that evaluates the presence of characteristics that are favourable for professional nursing practice because a better work environment contributes to positive results for patients, professionals and institutions. Methodological study including 209 nurses. Validity was assessed via a confirmatory factor analysis using structural equation modelling, in which the correlations between the instrument and the following variables were tested: burnout, job satisfaction, safety climate, perception of quality of care and intention to leave the job. Subgroups were compared and the reliability was assessed using Cronbach's alpha and the composite reliability. Factor analysis resulted in exclusion of seven items. Significant correlations were obtained between the subscales and all variables in the study. The reliability was considered acceptable. The Brazilian version of the Practice Environment Scale is a valid and reliable tool used to assess the characteristics that promote professional nursing practice. Use of this tool in Brazilian culture should allow managers to implement changes that contribute to the achievement of better results, in addition to identifying and comparing the environments of health institutions. © 2017 John Wiley & Sons Ltd.

  18. Reliability of videotaped observational gait analysis in patients with orthopedic impairments

    PubMed Central

    Brunnekreef, Jaap J; van Uden, Caro JT; van Moorsel, Steven; Kooloos, Jan GM

    2005-01-01

    Background: In clinical practice, visual gait observation is often used to determine gait disorders and to evaluate treatment. Several reliability studies on observational gait analysis have been described in the literature and generally showed moderate reliability. However, patients with orthopedic disorders have received little attention. The objective of this study is to determine the reliability levels of visual observation of gait in patients with orthopedic disorders. Methods: The gait of thirty patients referred to a physical therapist for gait treatment was videotaped. Ten raters, 4 experienced, 4 inexperienced and 2 experts, individually evaluated these videotaped gait patterns of the patients twice, by using a structured gait analysis form. Reliability levels were established by calculating the Intraclass Correlation Coefficient (ICC), using a two-way random design and based on absolute agreement. Results: The inter-rater reliability among experienced raters (ICC = 0.42; 95%CI: 0.38–0.46) was comparable to that of the inexperienced raters (ICC = 0.40; 95%CI: 0.36–0.44). The expert raters reached a higher inter-rater reliability level (ICC = 0.54; 95%CI: 0.48–0.60). The average intra-rater reliability of the experienced raters was 0.63 (ICCs ranging from 0.57 to 0.70). The inexperienced raters reached an average intra-rater reliability of 0.57 (ICCs ranging from 0.52 to 0.62). The two expert raters attained ICC values of 0.70 and 0.74 respectively. Conclusion: Structured visual gait observation by use of a gait analysis form as described in this study was found to be moderately reliable. Clinical experience appears to increase the reliability of visual gait analysis. PMID:15774012
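
    The abstract above specifies an ICC from a two-way random design based on absolute agreement. A minimal sketch of the single-measure form of that coefficient, built from the two-way ANOVA mean squares, is given below. Whether the study used single or average measures is not stated, so the single-measure choice, the function name and the toy ratings are all assumptions:

```python
def icc_two_way_random_absolute(ratings):
    """Single-measure ICC, two-way random effects, absolute agreement.

    ratings: list of rows, one per subject; each row holds one score per rater.
    """
    n = len(ratings)      # subjects
    k = len(ratings[0])   # raters
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Systematic rater differences (ms_cols) lower the coefficient,
    # which is what "absolute agreement" means here.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: rater 2 is always one point above rater 1.
print(round(icc_two_way_random_absolute([[1, 2], [2, 3], [3, 4]]), 3))
```

    In the toy data the two raters rank the subjects identically, yet the constant one-point offset keeps the coefficient below 1; a consistency-type ICC would ignore that offset.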

  19. Reliability and validity of abbreviated surveys derived from the National Eye Institute Visual Function Questionnaire: The Study of Osteoporotic Fractures

    PubMed Central

    Gergana, Kodjebacheva; Coleman, Anne L.; Ensrud, Kristine E.; Cauley, Jane A.; Yu, Fei; Stone, Katie L.; Pedula, Kathryn L.; Hochberg, Marc C.; Mangione, Carol M.

    2010-01-01

    Purpose: To test the reliability and validity of questionnaires shortened from the National Eye Institute 25-item Visual Function Questionnaire (NEI VFQ-9 and NEI VFQ-8). Design: A cross-sectional multi-center cohort study. Methods: Reliability was assessed by Cronbach alpha coefficients. Validity was evaluated by studying the association of vision-targeted quality-of-life composite scores with objective visual function measurements. Study population: A total of 5,482 women between the ages of 65 and 100 years participated in the Year-10 clinic visit in the Study of Osteoporotic Fractures (SOF). A total of 3,631 women with complete data were included in the visual acuity (VA) and visual field (VF) analysis of the NEI VFQ-9, which is defined for those who care to drive, and 5,311 in the analysis of the NEI VFQ-8. To assess differences in prevalent eye diseases, which were ascertained for a random sample of SOF participants, 853 and 1,237 women were included in the NEI VFQ-9 and the NEI VFQ-8 analyses, respectively. Results: Cronbach alpha coefficient for the NEI VFQ-9 scale was 0.83 and that of the NEI VFQ-8 was 0.84. Using both questionnaires, women with VA worse than 20/40 had lower composite scores compared to those with VA 20/40 or better (p<0.001). Participants with mild, moderate, and severe binocular VF loss had lower composite scores compared to those with no binocular VF loss (p<0.001). Compared to women without chronic eye diseases in both eyes, women with at least one chronic eye disease in at least one eye had lower composite scores. Conclusions: Both questionnaires showed high reliability across items and validity with respect to clinical markers of eye disease. Future research should compare the properties of these shortened surveys to those of the NEI VFQ-25. PMID:20103058

  20. A newly developed tool for classifying study designs in systematic reviews of interventions and exposures showed substantial reliability and validity.

    PubMed

    Seo, Hyun-Ju; Kim, Soo Young; Lee, Yoon Jae; Jang, Bo-Hyoung; Park, Ji-Eun; Sheen, Seung-Soo; Hahn, Seo Kyung

    2016-02-01

    To develop a study Design Algorithm for Medical Literature on Intervention (DAMI) and test its interrater reliability, construct validity, and ease of use. We developed and then revised the DAMI to include detailed instructions. To test the DAMI's reliability, we used a purposive sample of 134 primary, mainly nonrandomized studies. We then compared the study designs as classified by the original authors and through the DAMI. Unweighted kappa statistics were computed to test interrater reliability and construct validity based on the level of agreement between the original and DAMI classifications. Assessment time was also recorded to evaluate ease of use. The DAMI includes 13 study designs, including experimental and observational studies of interventions and exposure. Both the interrater reliability (unweighted kappa = 0.67; 95% CI [0.64-0.75]) and construct validity (unweighted kappa = 0.63, 95% CI [0.52-0.67]) were substantial. Mean classification time using the DAMI was 4.08 ± 2.44 minutes (range, 0.51-10.92). The DAMI showed substantial interrater reliability and construct validity. Furthermore, given its ease of use, it could be used to accurately classify medical literature for systematic reviews of interventions while minimizing disagreement between authors of such reviews. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Validity and reliability of self-reported diabetes in the Atherosclerosis Risk in Communities Study.

    PubMed

    Schneider, Andrea L C; Pankow, James S; Heiss, Gerardo; Selvin, Elizabeth

    2012-10-15

    The objective of this study was to assess the validity of prevalent and incident self-reported diabetes compared with multiple reference definitions and to assess the reliability (repeatability) of a self-reported diagnosis of diabetes. Data from 10,321 participants in the Atherosclerosis Risk in Communities (ARIC) Study who attended visit 4 (1996-1998) were analyzed. Prevalent self-reported diabetes was compared with reference definitions defined by fasting glucose and medication use obtained at visit 4. Incident self-reported diabetes was assessed during annual follow-up telephone calls and was compared with reference definitions defined by fasting glucose, hemoglobin A1c, and medication use obtained during an in-person visit attended by a subsample of participants (n = 1,738) in 2004-2005. The sensitivity of prevalent self-reported diabetes ranged from 58.5% to 70.8%, and specificity ranged from 95.6% to 96.8%, depending on the reference definition. Similarly, the sensitivity of incident self-reported diabetes ranged from 55.9% to 80.4%, and specificity ranged from 84.5% to 90.6%. Percent positive agreement of self-reported diabetes during 9 years of repeat assessments ranged from 92.7% to 95.4%. Both prevalent self-reported diabetes and incident self-reported diabetes were 84%-97% specific and 55%-80% sensitive as compared with reference definitions using glucose and medication criteria. Self-reported diabetes was >92% reliable over time.
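
    Sensitivity and specificity, the validity measures reported above, both come from a 2×2 table of self-report against the reference definition. The following is a minimal sketch with hypothetical counts; the function name and numbers are illustrative and are not ARIC data:

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from 2x2 counts against a reference definition."""
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    # Sensitivity: share of reference-positive people who self-report the condition.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of reference-negative people who do not self-report it.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts: 100 people with and 100 without diabetes by the reference definition.
sens, spec = sensitivity_specificity(true_pos=70, false_neg=30, true_neg=95, false_pos=5)
print(sens, spec)
```

    Because the denominators are the reference-positive and reference-negative groups, changing the reference definition changes both values, which is consistent with the ranges reported in the abstract.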

  2. The reliability and validity of ultrasound to quantify muscles in older adults: a systematic review

    PubMed Central

    Scafoglieri, Aldo; Jager‐Wittenaar, Harriët; Hobbelen, Johannes S.M.; van der Schans, Cees P.

    2017-01-01

    This review evaluates the reliability and validity of ultrasound to quantify muscles in older adults. The databases PubMed, Cochrane, and Cumulative Index to Nursing and Allied Health Literature were systematically searched for studies. In 17 studies, the reliability (n = 13) and validity (n = 8) of ultrasound to quantify muscles in community‐dwelling older adults (≥60 years) or a clinical population were evaluated. Four out of 13 reliability studies investigated both intra‐rater and inter‐rater reliability. Intraclass correlation coefficient (ICC) scores for reliability ranged from −0.26 to 1.00. The highest ICC scores were found for the vastus lateralis, rectus femoris, upper arm anterior, and the trunk (ICC = 0.72 to 1.000). All included validity studies found ICC scores ranging from 0.92 to 0.999. Two studies describing the validity of ultrasound to predict lean body mass showed good validity as compared with dual‐energy X‐ray absorptiometry (r² = 0.92 to 0.96). This systematic review shows that ultrasound is a reliable and valid tool for the assessment of muscle size in older adults. More high‐quality research is required to confirm these findings in both clinical and healthy populations. Furthermore, ultrasound assessment of small muscles needs further evaluation. Ultrasound to predict lean body mass is feasible; however, future research is required to validate prediction equations in older adults with varying function and health. PMID:28703496

  3. Test-retest reliability and validity of a web-based food-frequency questionnaire for adolescents aged 13-14 to be used in the Norwegian Mother and Child Cohort Study (MoBa).

    PubMed

    Overby, Nina Cecilie; Johannesen, Elisabeth; Jensen, Grete; Skjaevesland, Anne-Kirsti; Haugen, Margaretha

    2014-01-01

    The assessment of food intake is challenging and prone to errors; it is therefore important to consider the reliability and validity of the assessment methods. The aim of this study was to analyze the reproducibility and validity of a developed food-frequency questionnaire (FFQ) for use among adolescents. In total, 58 students (aged 13-14) from four different schools in the southern part of Norway participated in the reproducibility study by filling out the FFQ twice, 4 weeks apart. In addition, 93 students participated in the relative validity study where the FFQ was compared to 2×24-hour dietary recalls, while 92 students participated in the absolute validity study where the intakes of fatty acids and vitamin D from the FFQ were compared to fatty acids and 25-hydroxy-vitamin D3 in whole blood. The median Spearman correlation coefficient for all nutrients in the test-retest reliability study was 0.57. The median Spearman correlation for all nutrients in the relative validity study was 0.26, while the correlation coefficients were low in the absolute validity study, with n-3 fatty acid coefficients ranging from 0.05 to 0.25, and absent for vitamin D (r=0.000). The test-retest reproducibility was considered good, the relative validity was considered poor to good, and the absolute validity was considered poor. However, the results are comparable to other studies among adolescents.

  4. A validation study of the Keyboard Personal Computer Style instrument (K-PeCS) for use with children.

    PubMed

    Green, Dido; Meroz, Anat; Margalit, Adi Edit; Ratzon, Navah Z

    2012-11-01

    This study examines a potential instrument for measurement of typing postures of children. This paper describes inter-rater, test-retest reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS), an observational measurement of postures and movements during keyboarding, for use with children. Two trained raters independently rated videos of 24 children (aged 7-10 years). Six children returned one week later for identifying test-retest reliability. Concurrent validity was assessed by comparing ratings obtained using the K-PeCS to scores from a 3D motion analysis system. Inter-rater reliability was moderate to high for 12 out of 16 items (Kappa: 0.46 to 1.00; correlation coefficients: 0.77-0.95) and test-retest reliability varied across items (Kappa: 0.25 to 0.67; correlation coefficients: r = 0.20 to r = 0.95). Concurrent validity compared favourably across arm pathlength, wrist extension and ulnar deviation. In light of the limitations of other tools, the K-PeCS offers a fairly affordable, reliable and valid instrument to address the gap for measurement of typing styles of children, despite the shortcomings of some items. However, further research is required to refine the instrument for use in evaluating typing among children. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  5. Dental age estimation: Comparison of reliability between Malay formula of Demirjian method and Malay formula of Cameriere method

    NASA Astrophysics Data System (ADS)

    Alghali, R.; Kamaruddin, A. F.; Mokhtar, N.

    2016-12-01

    Introduction: Forensic odontology using teeth and bones has become the most commonly used method to determine the age of unknown individuals. Objective: The aim of this study was to determine the reliability of the Malay formula of the Demirjian method and the Malay formula of the Cameriere method in determining a dental age that closely matches the chronological age of Malay children in the Kepala Batas region. Methodology: This is a retrospective cross-sectional study. 126 good quality dental panoramic radiographs (DPT) of healthy Malay children aged 8-16 years (49 boys and 77 girls) were selected and measured. All radiographs were taken at the Dental Specialist Clinic, Advanced Medical and Dental Institute, Universiti Sains Malaysia. The measurements were carried out using the new Malay formula of both the Demirjian and Cameriere methods by a calibrated examiner. Results: The intraclass correlation coefficient (ICC) between the chronological age and the Demirjian and Cameriere estimates was calculated. The Demirjian method showed a better percentage (91.4%) of ICC compared to Cameriere (89.2%), which also indicates a high association with good reliability. However, comparing Demirjian and Cameriere, it can be concluded that Demirjian has better reliability. Conclusion: Thus, the results suggest that the modified Demirjian method is more reliable than the modified Cameriere method among the population in the Kepala Batas region.

  6. Educational testing validity and reliability in pharmacy and medical education literature.

    PubMed

    Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J

    2013-12-16

    To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals to medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education compared to medical education, validity and reliability reporting was limited in the pharmacy education literature.

  7. Development of Internet-Based Tasks for the Executive Function Performance Test.

    PubMed

    Rand, Debbie; Lee Ben-Haim, Keren; Malka, Rachel; Portnoy, Sigal

    The Executive Function Performance Test (EFPT) is a reliable and valid performance-based tool to assess executive functions (EFs). This study's objective was to develop and verify two Internet-based tasks for the EFPT. A cross-sectional study assessed the alternate-form reliability of the Internet-based bill-paying and telephone-use tasks in healthy adults and people with subacute stroke (Study 1). It also sought to establish the tasks' criterion reliability for assessing EF deficits by correlating performance with that on the Trail Making Test in five groups: healthy young adults, healthy older adults, people with subacute stroke, people with chronic stroke, and young adults with attention deficit hyperactivity disorder (Study 2). The alternative-form reliability and initial construct validity for the Internet-based bill-paying task were verified. Criterion validity was established for both tasks. The Internet-based tasks are comparable to the original EFPT tasks and can be used for assessment of EF deficits. Copyright © 2018 by the American Occupational Therapy Association, Inc.

  8. Scaling Impacts in Life Support Architecture and Technology Selection

    NASA Technical Reports Server (NTRS)

    Lange, Kevin

    2016-01-01

    For long-duration space missions outside of Earth orbit, reliability considerations will drive higher levels of redundancy and/or on-board spares for life support equipment. Component scaling will be a critical element in minimizing overall launch mass while maintaining an acceptable level of system reliability. Building on an earlier reliability study (AIAA 2012-3491), this paper considers the impact of alternative scaling approaches, including the design of technology assemblies and their individual components to maximum, nominal, survival, or other fractional requirements. The optimal level of life support system closure is evaluated for deep-space missions of varying duration using equivalent system mass (ESM) as the comparative basis. Reliability impacts are included in ESM by estimating the number of component spares required to meet a target system reliability. Common cause failures are included in the analysis. ISS and ISS-derived life support technologies are considered along with selected alternatives. This study focusses on minimizing launch mass, which may be enabling for deep-space missions.

  9. Electrical impedance myography in facioscapulohumeral muscular dystrophy.

    PubMed

    Statland, Jeffrey M; Heatwole, Chad; Eichinger, Katy; Dilek, Nuran; Martens, William B; Tawil, Rabi

    2016-10-01

    In this study we determined the reliability and validity of electrical impedance myography (EIM) in facioscapulohumeral muscular dystrophy (FSHD). We performed a prospective study of EIM on 16 bilateral limb and trunk muscles in 35 genetically defined and clinically affected FSHD patients (reliability testing on 18 patients). Summary scores based on body region were derived. Reactance and phase (50 and 100 kHz) were compared with measures of strength, FSHD disease severity, and functional outcomes. Participants were mostly men, mean age 53.0 years, and included a full range of severity. Limb and trunk muscles showed good to excellent reliability [intraclass correlation coefficients (ICC) 0.72-0.99]. Summary scores for the arm, leg, and trunk showed excellent reliability (ICC 0.89-0.98). Reactance was the most sensitive EIM parameter to a broad range of FSHD disease metrics. EIM is a reliable measure of muscle composition in FSHD that offers the possibility to serially evaluate affected muscles. Muscle Nerve 54: 696-701, 2016. © 2016 Wiley Periodicals, Inc.

  10. A comparison of the reliability of make versus break testing in measuring palmar abduction strength of the thumb.

    PubMed

    Lim, J X; Toh, R X; Chook, S K H; Sebastin, S J; Karjalainen, T

    2014-06-01

    Previous studies have established the role of quantitative measurements of palmar abduction strength of the thumb (PAST). This study compares the reliability of the 'make' versus the 'break' test in measuring PAST in healthy volunteers. In a 'make' test, the body part being tested is positioned at the start of its range of motion and the participant is asked to exert his/her maximal force. In a 'break' test, increasing force is applied to a body part after it has completed its range of motion, until the joint being tested gives way. PAST was measured in both hands in 100 healthy volunteers using a handheld device. Two examiners measured PAST using both the 'make' and 'break' test to determine inter-rater reliability. The tests were repeated in 30 volunteers 6 weeks after the initial testing to determine intra-rater reliability. Our results showed that the 'make' test has better inter and intra-rater reliability.

  11. Validity and reliability of the Myotest accelerometric system for the assessment of vertical jump height.

    PubMed

    Casartelli, Nicola; Müller, Roland; Maffiuletti, Nicola A

    2010-11-01

    The aim of the present study was to verify the validity and reliability of the Myotest accelerometric system (Myotest SA, Sion, Switzerland) for the assessment of vertical jump height. Forty-four male basketball players (age range: 9-25 years) performed series of squat, countermovement and repeated jumps during 2 identical test sessions separated by 2-15 days. Flight height was simultaneously quantified with the Myotest system and validated photoelectric cells (Optojump). Two calculation methods were used to estimate the jump height from Myotest recordings: flight time (Myotest-T) and vertical takeoff velocity (Myotest-V). Concurrent validity was investigated comparing Myotest-T and Myotest-V to the criterion method (Optojump), and test-retest reliability was also examined. As regards validity, Myotest-T overestimated jumping height compared to Optojump (p < 0.001) with a systematic bias of approximately 7 cm, even though random errors were low (2.7 cm) and intraclass correlation coefficients (ICCs) were high (>0.98), that is, excellent validity. Myotest-V overestimated jumping height compared to Optojump (p < 0.001), with high random errors (>12 cm), high limits of agreement ratios (>36%), and low ICCs (<0.75), that is, poor validity. As regards reliability, Myotest-T showed high ICCs (range: 0.92-0.96), whereas Myotest-V showed low ICCs (range: 0.56-0.89), and high random errors (>9 cm). In conclusion, Myotest-T is a valid and reliable method for the assessment of vertical jump height, and its use is legitimate for field-based evaluations, whereas Myotest-V is neither valid nor reliable.
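
    The record above names two ways of estimating jump height: from flight time and from vertical takeoff velocity. The abstract does not give the formulas the Myotest device applies, so the sketch below only shows the textbook projectile relations (h = g·t²/8 and h = v²/(2g)) that such estimates are usually based on. The function names and the value of g are assumptions made for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed standard value)


def height_from_flight_time(flight_time_s: float) -> float:
    """Jump height in metres from total flight time.

    Assumes the rise and fall phases last equally long, so the
    centre of mass rises for flight_time_s / 2 seconds.
    """
    return G * flight_time_s ** 2 / 8.0


def height_from_takeoff_velocity(velocity_m_s: float) -> float:
    """Jump height in metres from vertical takeoff velocity."""
    return velocity_m_s ** 2 / (2.0 * G)


if __name__ == "__main__":
    t = 0.5              # hypothetical flight time in seconds
    v = G * t / 2.0      # takeoff velocity consistent with that flight time
    print(f"height from flight time:      {height_from_flight_time(t):.4f} m")
    print(f"height from takeoff velocity: {height_from_takeoff_velocity(v):.4f} m")
```

    For an ideal jump the two estimates coincide. In practice they rely on different measured quantities, which is one plausible reason the two methods can differ in validity and reliability.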

  12. Empirical Recommendations for Improving the Stability of the Dot-Probe Task in Clinical Research

    PubMed Central

    Price, Rebecca B.; Kuckertz, Jennie M.; Siegle, Greg J.; Ladouceur, Cecile D.; Silk, Jennifer S.; Ryan, Neal D.; Dahl, Ronald E.; Amir, Nader

    2014-01-01

    The dot-probe task has been widely used in research to produce an index of biased attention based on reaction times (RTs). Despite its popularity, very few published studies have examined psychometric properties of the task, including test-retest reliability, and no previous study has examined reliability in clinically anxious samples or systematically explored the effects of task design and analysis decisions on reliability. In the current analysis, we utilized dot-probe data from three studies where attention bias towards threat-related faces was assessed at multiple (≥5) timepoints. Two of the studies were similar (adults with Social Anxiety Disorder, similar design features) while one was much more disparate (pediatric healthy volunteers, distinct task design). We explored the effects of analysis choices (e.g., bias score calculation formula, methods for outlier handling) on reliability and searched for convergence of findings across the three studies. We found that, when considering the three studies concurrently, the most reliable RT bias index utilized data from dot-bottom trials, comparing congruent to incongruent trials, with rescaled outliers, particularly after averaging across more than one assessment point. Although reliability of RT bias indices was moderate to low under most circumstances, within-session variability in bias (attention bias variability; ABV), a recently proposed RT index, was more reliable across sessions. Several eyetracking-based indices of attention bias (available in the pediatric healthy sample only) showed reliability that matched the optimal RT index (ABV). On the basis of these findings, we make specific recommendations to researchers using the dot probe, particularly those wishing to investigate individual differences and/or single-patient applications. PMID:25419646

  13. Dutch translation and cross-cultural validation of the Adult Social Care Outcomes Toolkit (ASCOT).

    PubMed

    van Leeuwen, Karen M; Bosmans, Judith E; Jansen, Aaltje Pd; Rand, Stacey E; Towers, Ann-Marie; Smith, Nick; Razik, Kamilla; Trukeschitz, Birgit; van Tulder, Maurits W; van der Horst, Henriette E; Ostelo, Raymond W

    2015-05-13

    The Adult Social Care Outcomes Toolkit was developed to measure outcomes of social care in England. In this study, we translated the four level self-completion version (SCT-4) of the ASCOT for use in the Netherlands and performed a cross-cultural validation. The ASCOT SCT-4 was translated into Dutch following international guidelines, including two forward and back translations. The resulting version was pilot tested among frail older adults using think-aloud interviews. Furthermore, using a subsample of the Dutch ACT-study, we investigated test-retest reliability and construct validity and compared response distributions with data from a comparable English study. The pilot tests showed that translated items were in general understood as intended, that most items were reliable, and that the response distributions of the Dutch translation and associations with other measures were comparable to the original English version. Based on the results of the pilot tests, some small modifications and a revision of the Dignity items were proposed for the final translation, which were approved by the ASCOT development team. The complete original English version and the final Dutch translation can be obtained after registration on the ASCOT website ( http://www.pssru.ac.uk/ascot ). This study provides preliminary evidence that the Dutch translation of the ASCOT is valid, reliable and comparable to the original English version. We recommend further research to confirm the validity of the modified Dutch ASCOT translation.

  14. A new scale for the assessment of performance and capacity of hand function in children with hemiplegic cerebral palsy: reliability and validity studies.

    PubMed

    Rosa-Rizzotto, M; Visonà Dalla Pozza, L; Corlatti, A; Luparia, A; Marchi, A; Molteni, F; Facchin, P; Pagliano, E; Fedrizzi, E

    2014-10-01

    In hemiplegic children, the recognition of the activity limitation pattern and the possibility of grading its severity are relevant for clinicians while planning interventions, monitoring results, and predicting outcomes. The aim of the study is to examine the reliability and validity of the Besta scale, an instrument used to measure, in hemiplegic children from 18 months to 12 years of age, both grasp on request (capacity) and spontaneous use of the upper limb (performance) in bimanual play activities and in ADL. Psychometric analysis of the reliability and validity of the Besta scale was performed. Outpatient study sample. Reliability study: A sample of 39 patients was enrolled. The administration of the Besta scale was video-recorded in a standardized manner. All videos were scored by 20 independent raters on subsequent viewing. Three raters randomly selected from the 20-rater group rescored the same video two years later for intra-rater reliability. Intra- and inter-rater reliability were calculated using the Intraclass Correlation Coefficient (ICC) and Kendall's coefficient (K), respectively. Internal consistency reliability was assessed using Cronbach's alpha coefficient. Validity study: A sample of 105 children was assessed 5 times (at t0 and 2, 3, 6 and 12 months later) by 20 independent raters. Each patient underwent QUEST and Besta scale administration and assessment at the same time. Criterion validity was calculated using the Pearson correlation coefficient. Reliability study: The inter-rater reliability calculated with Kendall's coefficient was moderate (K=0.47). The intra-rater (or test-retest) reliability for 3 raters was excellent (ICC=0.927). The Cronbach's alpha for internal consistency was 0.972. Validity study: The Besta scale showed good criterion validity compared to QUEST, increasing with age and severity of impairment. Pearson's correlation coefficient r was 0.81 (P<0.0001). Limitations: In infants, the Besta scale finds it hard to distinguish between mildly and moderately impaired hand function. The Besta scale scoring system is a valid and reliable tool, usable in a clinical setting to monitor the evolution of unimanual and bimanual manipulation and to distinguish the hand's capacity from performance.

  15. Concurrent validity and reliability of the Simple Goniometer iPhone app compared with the Universal Goniometer.

    PubMed

    Jones, Anne; Sealey, Rebecca; Crowe, Michael; Gordon, Susan

    2014-10-01

    The aim of this study was to assess the concurrent validity and reliability of the Simple Goniometer (SG) iPhone® app compared to the Universal Goniometer (UG). Within subject comparison design comparing the UG with the SG app. James Cook University, Townsville, Queensland, Australia. Thirty-six volunteer participants, with a mean age of 60.6 years (SD 6.2). Not applicable. Thirty-six participants performed three standing lunges during which the knee joint angle was measured with the SG app and the UG. There were no significant differences in the measures of individual knee joint angles between the UG and the SG app. Pearson correlations of 0.96-0.98 and intraclass correlation coefficients of 0.97-0.99 (95% confidence interval: 0.95-1.00) were recorded for all measures. Using the Bland-Altman method, the standard error of the mean of the differences and the standard deviation of the mean of the differences were low. The measurements from the SG iPhone® app were reliable and possessed concurrent validity for this sample and protocol when compared to the UG.
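
    The record above reports agreement between two goniometers using the Bland-Altman method. As a rough illustration of what that method computes, the sketch below derives the mean difference (bias) and the 95% limits of agreement from paired readings. The function name and the sample angles are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the pairwise differences (a - b); the limits of
    agreement are bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods need the same number of measurements")
    if len(method_a) < 2:
        raise ValueError("at least two paired measurements are required")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical knee angles in degrees from two devices.
    app = [88.0, 92.5, 101.0, 95.0]
    universal = [87.0, 93.0, 100.0, 96.0]
    bias, lower, upper = bland_altman(app, universal)
    print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

    A bias near zero with narrow limits suggests the two instruments can be used interchangeably for that measurement. How narrow is narrow enough is a clinical judgment rather than a statistical one.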

  16. The Advantages of Normalizing Electromyography to Ballistic Rather than Isometric or Isokinetic Tasks.

    PubMed

    Suydam, Stephen M; Manal, Kurt; Buchanan, Thomas S

    2017-07-01

    Isometric tasks have been a standard for electromyography (EMG) normalization stemming from anatomic and physiologic stability observed during contraction. Ballistic dynamic tasks have the benefit of eliciting maximum EMG signals for normalization, despite having the potential for greater signal variability. It is the purpose of this study to compare maximum voluntary isometric contraction (MVIC) to nonisometric tasks with increasing degrees of extrinsic variability, ie, joint range of motion, velocity, rate of contraction, etc., to determine if the ballistic tasks, which elicit larger peak EMG signals, are more reliable than the constrained MVIC. Fifteen subjects performed MVIC, isokinetic, maximum countermovement jump, and sprint tasks while EMG was collected from 9 muscles in the quadriceps, hamstrings, and lower leg. The results revealed the unconstrained ballistic tasks were more reliable compared to the constrained MVIC and isokinetic tasks for all triceps surae muscles. The EMG from sprinting was more reliable than the constrained cases for both the hamstrings and vasti. The most reliable EMG signals occurred when the body was permitted its natural, unconstrained motion. These results suggest that EMG is best normalized using ballistic tasks to provide the greatest within-subject reliability, which beneficially yield maximum EMG values.

  17. Children's Depression Inventory (CDI) and the Children's Depression Rating Scale-Revised (CDRS-R): reliability of the Hebrew version.

    PubMed

    Zalsman, Gil; Misgav, Sagit; Sommerfeld, Eliane; Kohn, Yoav; Brunstein-Klomek, Anat; Diller, Robyne; Sher, Leo; Schwartz, Joseph; Shoval, Gal; Ben-Dor, David H; Wolovik, Luisa; Oquendo, Maria A

    2005-01-01

    The Children's Depression Inventory (CDI) and Children's Depression Rating Scale-Revised (CDRS-R) are two widely used instruments, which measure depression in children and adolescents. This pilot study assessed the reliability of the Hebrew versions of these two instruments. Both CDRS-R and CDI were translated from English into Hebrew and then back translated. Seventeen healthy Israeli bilingual children volunteers were interviewed with both scales with a one day intermission between the interviews. Non-parametric correlations were used to compare scores in the two versions for each item. Results showed high agreement between the two versions for almost all items of the CDI and moderate to high for the CDRS-R. When CDRS-R summary scores for each item were compared, the agreement was high for this instrument as well. It is concluded that both CDI and CDRS-R Hebrew versions are reliable and can be used for studies of depression in the Israeli pediatric population.

  18. Urdu translation of the Hamilton Rating Scale for Depression: Results of a validation study

    PubMed Central

    Hashmi, Ali M.; Naz, Shahana; Asif, Aftab; Khawaja, Imran S.

    2016-01-01

    Objective: To develop a standardized validated version of the Hamilton Rating Scale for Depression (HAM-D) in Urdu. Methods: After translation of the HAM-D into the Urdu language following standard guidelines, the final Urdu version (HAM-D-U) was administered to 160 depressed outpatients. Inter-item correlation was assessed by calculating Cronbach alpha. Correlation between HAM-D-U scores at baseline and after a 2-week interval was evaluated for test-retest reliability. Moreover, scores of two clinicians on HAM-D-U were compared for inter-rater reliability. For establishing concurrent validity, scores of HAM-D-U and BDI-U were compared by using Spearman correlation coefficient. The study was conducted at Mayo Hospital, Lahore, from May to December 2014. Results: The Cronbach alpha for HAM-D-U was 0.71. Composite scores for HAM-D-U at baseline and after a 2-week interval were also highly correlated with each other (Spearman correlation coefficient 0.83, p-value < 0.01), indicating good test-retest reliability. Composite scores for HAM-D-U and BDI-U were positively correlated with each other (Spearman correlation coefficient 0.85, p < 0.01), indicating good concurrent validity. Scores of two clinicians for HAM-D-U were also positively correlated (Spearman correlation coefficient 0.82, p-value < 0.01), indicating good inter-rater reliability. Conclusion: The HAM-D-U is a valid and reliable instrument for the assessment of depression. It shows good inter-rater and test-retest reliability. The HAM-D-U can be a tool either for clinical management or research. PMID:28083049
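
    The record above summarises internal consistency with Cronbach's alpha. The sketch below shows the standard formula, alpha = k/(k-1) · (1 − Σ item variances / variance of total scores), applied to a small respondents-by-items table. The function name and the example scores are hypothetical, and the study's own software and data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for both the items and the
    per-respondent total scores.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1.0 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical ratings: three respondents, two items.
    ratings = [[1, 2], [2, 1], [3, 3]]
    print(f"alpha = {cronbach_alpha(ratings):.3f}")
```

    Note that the function divides by the variance of the total scores, so it fails if every respondent has the same total.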

  19. Urdu translation of the Hamilton Rating Scale for Depression: Results of a validation study.

    PubMed

    Hashmi, Ali M; Naz, Shahana; Asif, Aftab; Khawaja, Imran S

    2016-01-01

    To develop a standardized validated version of the Hamilton Rating Scale for Depression (HAM-D) in Urdu. After translation of the HAM-D into the Urdu language following standard guidelines, the final Urdu version (HAM-D-U) was administered to 160 depressed outpatients. Inter-item correlation was assessed by calculating Cronbach alpha. Correlation between HAM-D-U scores at baseline and after a 2-week interval was evaluated for test-retest reliability. Moreover, scores of two clinicians on HAM-D-U were compared for inter-rater reliability. For establishing concurrent validity, scores of HAM-D-U and BDI-U were compared by using Spearman correlation coefficient. The study was conducted at Mayo Hospital, Lahore, from May to December 2014. The Cronbach alpha for HAM-D-U was 0.71. Composite scores for HAM-D-U at baseline and after a 2-week interval were also highly correlated with each other (Spearman correlation coefficient 0.83, p-value < 0.01), indicating good test-retest reliability. Composite scores for HAM-D-U and BDI-U were positively correlated with each other (Spearman correlation coefficient 0.85, p < 0.01), indicating good concurrent validity. Scores of two clinicians for HAM-D-U were also positively correlated (Spearman correlation coefficient 0.82, p-value < 0.01), indicating good inter-rater reliability. The HAM-D-U is a valid and reliable instrument for the assessment of depression. It shows good inter-rater and test-retest reliability. The HAM-D-U can be a tool either for clinical management or research.

  20. Validity and reliability of GPS and LPS for measuring distances covered and sprint mechanical properties in team sports

    PubMed Central

    Baumgart, Christian; Polglaze, Ted; Freiwald, Jürgen

    2018-01-01

    This study aimed to investigate the validity and reliability of global (GPS) and local (LPS) positioning systems for measuring distances covered and sprint mechanical properties in team sports. Here, we evaluated two recently released 18 Hz GPS and 20 Hz LPS technologies together with one established 10 Hz GPS technology. Six male athletes (age: 27±2 years; VO2max: 48.8±4.7 ml/min/kg) performed outdoors on 10 trials of a team sport-specific circuit that was equipped with double-light timing gates. The circuit included various walking, jogging, and sprinting sections that were performed either in straight-lines or with changes of direction. During the circuit, athletes wore two devices of each positioning system. From the reported and filtered velocity data, the distances covered and sprint mechanical properties (i.e., the theoretical maximal horizontal velocity, force, and power output) were computed. The sprint mechanical properties were modeled via an inverse dynamic approach applied to the center of mass. The validity was determined by comparing the measured and criterion data via the typical error of estimate (TEE), whereas the reliability was examined by comparing the two devices of each technology (i.e., the between-device reliability) via the coefficient of variation (CV). Outliers due to measurement errors were statistically identified and excluded from validity and reliability analyses. The 18 Hz GPS showed better validity and reliability for determining the distances covered (TEE: 1.6–8.0%; CV: 1.1–5.1%) and sprint mechanical properties (TEE: 4.5–14.3%; CV: 3.1–7.5%) than the 10 Hz GPS (TEE: 3.0–12.9%; CV: 2.5–13.0% and TEE: 4.1–23.1%; CV: 3.3–20.0%). However, the 20 Hz LPS demonstrated superior validity and reliability overall (TEE: 1.0–6.0%; CV: 0.7–5.0% and TEE: 2.1–9.2%; CV: 1.6–7.3%). For the 10 Hz GPS, 18 Hz GPS, and 20 Hz LPS, the relative loss of data sets due to measurement errors was 10.0%, 20.0%, and 15.8%, respectively. This study shows that 18 Hz GPS has enhanced validity and reliability for determining movement patterns in team sports compared to 10 Hz GPS, whereas 20 Hz LPS had superior validity and reliability overall. However, compared to 10 Hz GPS, 18 Hz GPS and 20 Hz LPS technologies had more outliers due to measurement errors, which limits their practical applications at this time. PMID:29420620

  1. Issues in benchmarking human reliability analysis methods : a literature review.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lois, Erasmia; Forester, John Alan; Tran, Tuan Q.

    There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessment (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study is currently underway that compares HRA methods with each other and against operator performance in simulator studies. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.

  2. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model.

    PubMed

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT) obtained image over plaster model for the assessment of mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data, and P < 0.05 was considered statistically significant. Statistically significant results were obtained on data comparison between CBCT-derived images and plaster model; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. CBCT-derived images were less reliable as compared to data obtained directly from plaster model for mixed dentition analysis.

  3. Patient-specific 3D models created by 3D imaging system or bi-planar imaging coupled with Moiré-Fringe projections: a comparative study of accuracy and reliability on spinal curvatures and vertebral rotation data.

    PubMed

    Hocquelet, Arnaud; Cornelis, François; Jirot, Anna; Castaings, Laurent; de Sèze, Mathieu; Hauger, Olivier

    2016-10-01

    The aim of this study is to compare the accuracy and reliability of spinal curvatures and vertebral rotation data based on patient-specific 3D models created by 3D imaging system or by bi-planar imaging coupled with Moiré-Fringe projections. Sixty-two consecutive patients from a single institution were prospectively included. For each patient, frontal and sagittal calibrated low-dose bi-planar X-rays were performed and coupled simultaneously with an optical Moiré back surface-based technology. The 3D reconstructions of spine and pelvis were performed independently by one radiologist and one technician in radiology using two different semi-automatic methods using 3D radio-imaging system (method 1) or bi-planar imaging coupled with Moiré projections (method 2). Both methods were compared using Bland-Altman analysis, and reliability was assessed using the intraclass correlation coefficient (ICC). ICC showed good to very good agreement. Between the two techniques, the maximum 95% prediction limit was -4.9° for the measurements of spinal coronal curves and less than 5° for other parameters. Inter-rater reliability was excellent for all parameters across both methods, except for axial rotation with method 2, for which ICC was fair. Method 1 was faster for reconstruction time than method 2 for both readers (13.4 vs. 20.7 min and 10.6 vs. 13.9 min; p = 0.0001). While a lower accuracy was observed for the evaluation of the axial rotation, bi-planar imaging coupled with Moiré-Fringe projections may be an accurate and reliable tool to perform 3D reconstructions of the spine and pelvis.

  4. Reliability of quadriceps surface electromyography measurements is improved by two vs. single site recordings.

    PubMed

    Balshaw, T G; Fry, A; Maden-Wilkinson, T M; Kong, P W; Folland, J P

    2017-06-01

    The reliability of surface electromyography (sEMG) is typically modest even with rigorous methods, and therefore further improvements in sEMG reliability are desirable. This study compared the between-session reliability (both within-participant absolute reliability and between-participant relative reliability) of sEMG amplitude from single vs. average of two distinct recording sites, for individual muscle (IM) and whole quadriceps (WQ) measures during voluntary and evoked contractions. Healthy males (n = 20) performed unilateral isometric knee extension contractions: voluntary maximum and submaximum (60%), as well as evoked twitch contractions on two separate days. sEMG was recorded from two distinct sites on each superficial quadriceps muscle. Averaging two recording sites vs. using single site measures improved reliability for IM and WQ measurements during voluntary (16-26% reduction in within-participant coefficient of variation, CV_W) and evoked contractions (40-56% reduction in CV_W). For sEMG measurements from large muscles, averaging the recording of two distinct sites is recommended as it improves within-participant reliability. This improved sensitivity has application to clinical and research measurement of sEMG amplitude.

  5. Reliability and concurrent validity of a Smartphone, bubble inclinometer and motion analysis system for measurement of hip joint range of motion.

    PubMed

    Charlton, Paula C; Mentiplay, Benjamin F; Pua, Yong-Hao; Clark, Ross A

    2015-05-01

    Traditional methods of assessing joint range of motion (ROM) involve specialized tools that may not be widely available to clinicians. This study assesses the reliability and validity of a custom Smartphone application for assessing hip joint range of motion. Intra-tester reliability with concurrent validity. Passive hip joint range of motion was recorded for seven different movements in 20 males on two separate occasions. Data from a Smartphone, bubble inclinometer and a three dimensional motion analysis (3DMA) system were collected simultaneously. Intraclass correlation coefficients (ICCs), coefficients of variation (CV) and standard error of measurement (SEM) were used to assess reliability. To assess validity of the Smartphone application and the bubble inclinometer against the three dimensional motion analysis system, intraclass correlation coefficients and fixed and proportional biases were used. The Smartphone demonstrated good to excellent reliability (ICCs>0.75) for four out of the seven movements, and moderate to good reliability for the remaining three movements (ICC=0.63-0.68). Additionally, the Smartphone application displayed comparable reliability to the bubble inclinometer. The Smartphone application displayed excellent validity when compared to the three dimensional motion analysis system for all movements (ICCs>0.88) except one, which displayed moderate to good validity (ICC=0.71). Smartphones are portable and widely available tools that are mostly reliable and valid for assessing passive hip range of motion, with potential for large-scale use when a bubble inclinometer is not available. However, caution must be taken in its implementation as some movement axes demonstrated only moderate reliability. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  6. Test-retest reliability of the assessment of postural stability in typically developing children and in hearing impaired children.

    PubMed

    De Kegel, A; Dhooge, I; Cambier, D; Baetens, T; Palmans, T; Van Waelvelde, H

    2011-04-01

    The purpose of this study was to establish test-retest reliability of centre of pressure (COP) measurements obtained by an AccuGait portable forceplate (ACG), mean COG sway velocity measured by a Basic Balance Master (BBM) and clinical balance tests in children with and without balance difficulties. 49 typically developing children and 23 hearing impaired children, with a higher risk for stability problems, between 6 and 12 years of age participated. Each child performed the modified Clinical Test of Sensory Interaction on Balance (mCTSIB), Unilateral Stance (US) and Tandem Stance on ACG, mCTSIB and US on BBM and clinical balance tests: one-leg standing, balance beam walking and one-leg hopping. All subjects completed 2 test sessions on 2 different days in the same week assessed by the same examiner. Among COP measurements obtained by the ACG, mean sway velocity was the most reliable parameter with all ICCs higher than 0.72. The standard deviation (SD) of sway velocity, sway area, SD of anterior-posterior and SD of medio-lateral COP data showed moderate to excellent reliability with ICCs between 0.55 and 0.96 but some caution must be taken into account in some conditions. BBM is less reliable but clinical balance tests are as reliable as ACG. Hearing impaired children exhibited better relative reliability (ICC) and comparable absolute reliability (SEM) for most balance parameters compared to typically developing children. Reliable information regarding postural stability of typically developing children and hearing impaired children may be obtained utilizing COP measurements generated by an AccuGait system and clinical balance tests. Copyright © 2011 Elsevier B.V. All rights reserved.

  7. Reliability of sonographic assessment of tendinopathy in tennis elbow.

    PubMed

    Poltawski, Leon; Ali, Syed; Jayaram, Vijay; Watson, Tim

    2012-01-01

    To assess the reliability and compute the minimum detectable change using sonographic scales to quantify the extent of pathology and hyperaemia in the common extensor tendon in people with tennis elbow. The lateral elbows of 19 people with tennis elbow were assessed sonographically twice, 1-2 weeks apart. Greyscale and power Doppler images were recorded for subsequent rating of abnormalities. Tendon thickening, hypoechogenicity, fibrillar disruption and calcification were each rated on four-point scales, and scores were summed to provide an overall rating of structural abnormality; hyperaemia was scored on a five point scale. Inter-rater reliability was established using the intraclass correlation coefficient (ICC) to compare scores assigned independently to the same set of images by a radiologist and a physiotherapist with training in musculoskeletal imaging. Test-retest reliability was assessed by comparing scores assigned by the physiotherapist to images recorded at the two sessions. The minimum detectable change (MDC) was calculated from the test-retest reliability data. ICC values for inter-rater reliability ranged from 0.35 (95% CI: 0.05, 0.60) for fibrillar disruption to 0.77 (0.55, 0.88) for overall greyscale score, and 0.89 (0.79, 0.95) for hyperaemia. Test-retest reliability ranged from 0.70 (0.48, 0.84) for tendon thickening to 0.82 (0.66, 0.90) for overall greyscale score and 0.86 (0.73, 0.93) for calcification. The MDC for the greyscale total score was 2.0/12 and for the hyperaemia score was 1.1/5. The sonographic scoring system used in this study may be used reliably to quantify tendon abnormalities and change over time. A relatively inexperienced imager can conduct the assessment and use the rating scales reliably.
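    The abstract reports a minimum detectable change (MDC) derived from test-retest data but does not give the formula. A minimal sketch, assuming the commonly used convention SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM, is shown below. The function names and the input values are hypothetical and are not the study's data.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a test-retest ICC (assumed convention)."""
    return sd * math.sqrt(1.0 - icc)


def minimum_detectable_change(sd, icc, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95 %).

    The sqrt(2) factor reflects that a change score combines the
    measurement error of two sessions.
    """
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs: score SD of 2.0 points and a test-retest ICC of 0.82.
mdc = minimum_detectable_change(sd=2.0, icc=0.82)
print(f"MDC95 = {mdc:.2f} points")
```

    A change smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.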

  8. Reliability of bounce drop jump parameters within elite male rugby players.

    PubMed

    Costley, Lisa; Wallace, Eric; Johnston, Michael; Kennedy, Rodney

    2017-07-25

    The aims of the study were to investigate the number of familiarisation sessions required to establish reliability of the bounce drop jump (BDJ) and subsequent reliability once familiarisation is achieved. Seventeen trained male athletes completed 4 BDJs in 4 separate testing sessions. Force-time data from a 20 cm BDJ was obtained using two force plates (ensuring ground contact < 250 ms). Subjects were instructed to 'jump for maximal height and minimal contact time' while the best and average of four jumps were compared. A series of performance variables were assessed in both eccentric and concentric phases including jump height, contact time, flight time, reactive strength index (RSI), peak power, rate of force development (RFD) and actual dropping height (ADH). Reliability was assessed using the intraclass correlation coefficient (ICC) and coefficient of variation (CV) while familiarisation was assessed using a repeated measures analysis of variance (ANOVA). The majority of DJ parameters exhibited excellent reliability with no systematic bias evident, while the average of 4 trials provided greater reliability. With the exception of vertical stiffness (CV: 12.0 %) and RFD (CV: 16.2 %) all variables demonstrated low within subject variation (CV range: 3.1 - 8.9 %). Relative reliability was very poor for ADH, with heights ranging from 14.87 - 29.85 cm. High levels of reliability can be obtained from the BDJ with the exception of vertical stiffness and RFD, however, extreme caution must be taken when comparing DJ results between individuals and squads due to large discrepancies between actual drop height and platform height.

  9. Reliability of Computerized Neurocognitive Tests for Concussion Assessment: A Meta-Analysis.

    PubMed

    Farnsworth, James L; Dargo, Lucas; Ragan, Brian G; Kang, Minsoo

    2017-09-01

    Although widely used, computerized neurocognitive tests (CNTs) have been criticized because of low reliability and poor sensitivity. A systematic review was published summarizing the reliability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores; however, this was limited to a single CNT. Expansion of the previous review to include additional CNTs and a meta-analysis is needed. Therefore, our purpose was to analyze reliability data for CNTs using meta-analysis and examine moderating factors that may influence reliability. A systematic literature search (key terms: reliability, computerized neurocognitive test, concussion) of electronic databases (MEDLINE, PubMed, Google Scholar, and SPORTDiscus) was conducted to identify relevant studies. Studies were included if they met all of the following criteria: used a test-retest design, involved at least 1 CNT, provided sufficient statistical data to allow for effect-size calculation, and were published in English. Two independent reviewers investigated each article to assess inclusion criteria. Eighteen studies involving 2674 participants were retained. Intraclass correlation coefficients were extracted to calculate effect sizes and determine overall reliability. The Fisher Z transformation adjusted for sampling error associated with averaging correlations. Moderator analyses were conducted to evaluate the effects of the length of the test-retest interval, intraclass correlation coefficient model selection, participant demographics, and study design on reliability. Heterogeneity was evaluated using the Cochran Q statistic. The proportion of acceptable outcomes was greatest for the Axon Sports CogState Test (75%) and lowest for the ImPACT (25%). Moderator analyses indicated that the type of intraclass correlation coefficient model used significantly influenced effect-size estimates, accounting for 17% of the variation in reliability. The Axon Sports CogState Test, which has a higher proportion of acceptable outcomes and shorter test duration relative to other CNTs, may be a reliable option; however, future studies are needed to compare the diagnostic accuracy of these instruments.
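    The Fisher Z step mentioned in this abstract can be sketched as follows: each coefficient is transformed with the inverse hyperbolic tangent, the transformed values are averaged, and the mean is transformed back. The function name and the n − 3 weights are assumptions for illustration (that weighting is the usual choice for Pearson correlations); the abstract does not say which weights the authors applied to their intraclass correlation coefficients.

```python
import math


def pool_correlations(coefficients, sample_sizes):
    """Weighted mean of correlation-type coefficients via Fisher's Z.

    Weights of n - 3 are an assumed convention, not taken from the study.
    """
    if not coefficients or len(coefficients) != len(sample_sizes):
        raise ValueError("need one sample size per coefficient")
    weights = [n - 3 for n in sample_sizes]
    if any(w <= 0 for w in weights):
        raise ValueError("each study needs more than 3 participants")
    z_values = [math.atanh(r) for r in coefficients]  # Fisher Z transform
    mean_z = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    return math.tanh(mean_z)  # back-transform to the correlation scale


# Hypothetical test-retest coefficients from three studies.
pooled = pool_correlations([0.62, 0.71, 0.80], [40, 85, 120])
print(f"pooled reliability = {pooled:.3f}")
```

    Averaging on the Z scale avoids the downward bias that arises from averaging raw correlations, whose sampling distribution is skewed.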

  10. Teletoxicology: Patient Assessment Using Wearable Audiovisual Streaming Technology.

    PubMed

    Skolnik, Aaron B; Chai, Peter R; Dameff, Christian; Gerkin, Richard; Monas, Jessica; Padilla-Jones, Angela; Curry, Steven

    2016-12-01

    Audiovisual streaming technologies allow detailed remote patient assessment and have been suggested to change management and enhance triage. The advent of wearable, head-mounted devices (HMDs) permits advanced teletoxicology at a relatively low cost. A previously published pilot study supports the feasibility of using the HMD Google Glass® (Google Inc.; Mountain View, CA) for teletoxicology consultation. This study examines the reliability, accuracy, and precision of the poisoned patient assessment when performed remotely via Google Glass®. A prospective observational cohort study was performed on 50 patients admitted to a tertiary care center inpatient toxicology service. Toxicology fellows wore Google Glass® and transmitted secure, real-time video and audio of the initial physical examination to a remote investigator not involved in the subject's care. High-resolution still photos of electrocardiograms (ECGs) were transmitted to the remote investigator. On-site and remote investigators recorded physical examination findings and ECG interpretation. Both investigators completed a brief survey about the acceptability and reliability of the streaming technology for each encounter. Kappa scores and simple agreement were calculated for each examination finding and electrocardiogram parameter. Reliability scores and reliability difference were calculated and compared for each encounter. Data were available for analysis of 17 categories of examination and ECG findings. Simple agreement between on-site and remote investigators ranged from 68 to 100 % (median = 94 %, IQR = 10.5). Kappa scores could be calculated for 11/17 parameters and demonstrated slight to fair agreement for two parameters and moderate to almost perfect agreement for nine parameters (median = 0.653; substantial agreement). The lowest Kappa scores were for pupil size and response to light. On a 100-mm visual analog scale (VAS), mean comfort level was 93 and mean reliability rating was 89 for on-site investigators. For remote users, the mean comfort and reliability ratings were 99 and 86, respectively. The average difference in reliability scores between on-site and remote investigators was 2.6, with the difference increasing as reliability scores decreased. Remote evaluation of poisoned patients via Google Glass® is possible with a high degree of agreement on examination findings and ECG interpretation. Evaluation of pupil size and response to light is limited, likely by the quality of streaming video. Users of Google Glass® for teletoxicology reported high levels of comfort with the technology and found it reliable, though as reported reliability decreased, remote users were most affected. Further study should compare patient-centered outcomes when using HMDs for consultation to those resulting from telephone consultation.
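    The two agreement statistics used in this abstract, simple (percent) agreement and Cohen's kappa, can be sketched as below. The function names and the ratings are hypothetical and are not the study's data. The sketch returns NaN when chance agreement equals 1, which is one common reason kappa cannot be computed for a finding that both raters always score the same way; whether that explains the 6 parameters without kappa in this study is not stated.

```python
from collections import Counter


def simple_agreement(rater_1, rater_2):
    """Proportion of cases on which the two raters gave the same rating."""
    if not rater_1 or len(rater_1) != len(rater_2):
        raise ValueError("both raters must rate the same non-empty set of cases")
    return sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)


def cohen_kappa(rater_1, rater_2):
    """Unweighted Cohen's kappa for two raters and nominal categories."""
    observed = simple_agreement(rater_1, rater_2)
    n = len(rater_1)
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if expected == 1.0:
        return float("nan")  # no variability: kappa is undefined
    return (observed - expected) / (1.0 - expected)


# Hypothetical on-site vs. remote ratings of one examination finding.
on_site = ["present", "present", "absent", "absent"]
remote = ["present", "absent", "absent", "absent"]
print(simple_agreement(on_site, remote), cohen_kappa(on_site, remote))
```

    Reporting both values is useful because kappa can be low, or undefined, even when raw agreement is high, if nearly all cases fall in one category.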

  11. Interrater reliability of schizoaffective disorder compared with schizophrenia, bipolar disorder, and unipolar depression - A systematic review and meta-analysis.

    PubMed

    Santelmann, Hanno; Franklin, Jeremy; Bußhoff, Jana; Baethge, Christopher

    2016-10-01

    Schizoaffective disorder is a common diagnosis in clinical practice but its nosological status has been subject to debate ever since it was conceptualized. Although it is key that diagnostic reliability is sufficient, schizoaffective disorder has been reported to have low interrater reliability. Evidence based on systematic review and meta-analysis methods, however, is lacking. Using a highly sensitive literature search in Medline, Embase, and PsycInfo we identified studies measuring the interrater reliability of schizoaffective disorder in comparison to schizophrenia, bipolar disorder, and unipolar disorder. Out of 4126 records screened we included 25 studies reporting on 7912 patients diagnosed by different raters. The interrater reliability of schizoaffective disorder was moderate (meta-analytic estimate of Cohen's kappa 0.57 [95% CI: 0.41-0.73]), and substantially lower than that of its main differential diagnoses (difference in kappa between 0.22 and 0.19). Although there was considerable heterogeneity, analyses revealed that the interrater reliability of schizoaffective disorder was consistently lower in the overwhelming majority of studies. The results remained robust in subgroup and sensitivity analyses (e.g., diagnostic manual used) as well as in meta-regressions (e.g., publication year) and analyses of publication bias. Clinically, the results highlight the particular importance of diagnostic re-evaluation in patients diagnosed with schizoaffective disorder. They also quantify a widely held clinical impression of lower interrater reliability and agree with earlier meta-analysis reporting low test-retest reliability. Copyright © 2016. Published by Elsevier B.V.

  12. The Development, Validation, and Reliability of SAM: A Tool for Measurement of Moderate to Vigorous Physical Activity in School Physical Education

    ERIC Educational Resources Information Center

    Surapiboonchai, Kampol

    2010-01-01

    There is a lack of valid and reliable low cost observational instruments to measure moderate to vigorous physical activity (MVPA) in school physical education (PE). The participants in this study were third to tenth grade boys and girls from a south Texas school district. The SAM (Simple Activity Measurement) activity levels were compared with…

  13. Are Bibliographic Management Software Search Interfaces Reliable?: A Comparison between Search Results Obtained Using Database Interfaces and the EndNote Online Search Function

    ERIC Educational Resources Information Center

    Fitzgibbons, Megan; Meert, Deborah

    2010-01-01

    The use of bibliographic management software and its internal search interfaces is now pervasive among researchers. This study compares the results between searches conducted in academic databases' search interfaces versus the EndNote search interface. The results show mixed search reliability, depending on the database and type of search…

  14. The Validity and Reliability of the Gymaware Linear Position Transducer for Measuring Counter-Movement Jump Performance in Female Athletes

    ERIC Educational Resources Information Center

    O'Donnell, Shannon; Tavares, Francisco; McMaster, Daniel; Chambers, Samuel; Driller, Matthew

    2018-01-01

    The current study aimed to assess the validity and test-retest reliability of a linear position transducer when compared to a force plate through a counter-movement jump in female participants. Twenty-seven female recreational athletes (19 ± 2 years) performed three counter-movement jumps simultaneously using the linear position transducer and…

  15. Validity and reliability of intraoral scanners compared to conventional gypsum models measurements: a systematic review.

    PubMed

    Aragón, Mônica L C; Pontes, Luana F; Bichara, Lívia M; Flores-Mir, Carlos; Normando, David

    2016-08-01

    The development of 3D technology and the trend of increasing the use of intraoral scanners in dental office routine lead to the need for comparisons with conventional techniques. To determine if intra- and inter-arch measurements from digital dental models acquired by an intraoral scanner are as reliable and valid as the similar measurements achieved from dental models obtained through conventional intraoral impressions. An unrestricted electronic search of seven databases until February 2015. Studies that focused on the accuracy and reliability of images obtained from intraoral scanners compared to images obtained from conventional impressions. After study selection the QUADAS risk of bias assessment tool for diagnostic studies was used to assess the risk of bias (RoB) among the included studies. Four articles were included in the qualitative synthesis. The scanners evaluated were OrthoProof, Lava, iOC intraoral, Lava COS, iTero and D250. These studies evaluated the reliability of tooth widths, Bolton ratio measurements, and image superimposition. Two studies were classified as having low RoB; one had moderate RoB and the remaining one had high RoB. Only one study evaluated the time required to complete clinical procedures and patient's opinion about the procedure. Patients reported feeling more comfortable with the conventional dental impression method. Associated costs were not considered in any of the included studies. Inter- and intra-arch measurements from digital models produced from intraoral scans appeared to be reliable and accurate in comparison to those from conventional impressions. This assessment only applies to the intraoral scanner models considered in the finally included studies. Digital models produced by intraoral scan eliminate the need for impression materials; however, currently, longer time is needed to take the digital images. PROSPERO (CRD42014009702). None. © The Author 2016. Published by Oxford University Press on behalf of the European Orthodontic Society. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  16. Validity and Reliability of Accelerometers in Patients With COPD: A SYSTEMATIC REVIEW.

    PubMed

    Gore, Shweta; Blackwood, Jennifer; Guyette, Mary; Alsalaheen, Bara

    2018-05-01

    Reduced physical activity is associated with poor prognosis in chronic obstructive pulmonary disease (COPD). Accelerometers have greatly improved quantification of physical activity by providing information on step counts, body positions, energy expenditure, and magnitude of force. The purpose of this systematic review was to compare the validity and reliability of accelerometers used in patients with COPD. An electronic database search of MEDLINE and CINAHL was performed. Study quality was assessed with the Strengthening the Reporting of Observational Studies in Epidemiology checklist while methodological quality was assessed using the modified Quality Appraisal Tool for Reliability Studies. The search yielded 5392 studies; 25 met inclusion criteria. The SenseWear Pro armband reported high criterion validity under controlled conditions (r = 0.75-0.93) and high reliability (ICC = 0.84-0.86) for step counts. The DynaPort MiniMod demonstrated highest concurrent validity for step count using both video and manual methods. Validity of the SenseWear Pro armband varied between studies especially in free-living conditions, slower walking speeds, and with addition of weights during gait. A high degree of variability was found in the outcomes used and statistical analyses performed between studies, indicating a need for further studies to measure reliability and validity of accelerometers in COPD. The SenseWear Pro armband is the most commonly used accelerometer in COPD, but measurement properties are limited by gait speed variability and assistive device use. DynaPort MiniMod and Stepwatch accelerometers demonstrated high validity in patients with COPD but lack reliability data.

  17. The reliability and validity of a sexual functioning questionnaire.

    PubMed

    Corty, E W; Althof, S E; Kurit, D M

    1996-01-01

    The present study assessed the reliability and validity of a measure of sexual functioning, the CMSH-SFQ, for male patients and their partners. The CMSH-SFQ measures erectile and orgasmic functioning, sexual drive, frequency of sexual behavior, and sexual satisfaction. Test-retest reliability was assessed with 19 males and 19 females for the baseline CMSH-SFQ. Criterion validity was measured by comparing the answers of 25 male patients to those of their partners at baseline and follow-up. The majority of items had acceptable levels of reliability and validity. The CMSH-SFQ provides a reliable and valid device that can be used to measure global sexual functioning in men and their partners and may be used to evaluate the efficacy of treatments for sexual dysfunctions. Limitations and suggestions for use of the CMSH-SFQ are addressed.

  18. Calculus detection calibration among dental hygiene faculty members utilizing dental endoscopy: a pilot study.

    PubMed

    Partido, Brian B; Jones, Archie A; English, Dana L; Nguyen, Carol A; Jacks, Mary E

    2015-02-01

    Dental and dental hygiene faculty members often do not provide consistent instruction in the clinical environment, especially in tasks requiring clinical judgment. From previous efforts to calibrate faculty members in calculus detection using typodonts, researchers have suggested using human subjects and emerging technology to improve consistency in clinical instruction. The purpose of this pilot study was to determine if a dental endoscopy-assisted training program would improve intra- and interrater reliability of dental hygiene faculty members in calculus detection. Training included an ODU 11/12 explorer, typodonts, and dental endoscopy. A convenience sample of six participants was recruited from the dental hygiene faculty at a California community college, and a two-group randomized experimental design was utilized. Intra- and interrater reliability was measured before and after calibration training. Pretest and posttest Kappa averages of all participants were compared using repeated measures (split-plot) ANOVA to determine the effectiveness of the calibration training on intra- and interrater reliability. The results showed that both kinds of reliability significantly improved for all participants and the training group improved significantly in interrater reliability from pretest to posttest. Calibration training was beneficial to these dental hygiene faculty members, especially those beginning with less than full agreement. This study suggests that calculus detection calibration training utilizing dental endoscopy can effectively improve interrater reliability of dental and dental hygiene clinical educators. Future studies should include human subjects, involve more participants at multiple locations, and determine whether improved rater reliability can be sustained over time.

  19. Advancing methods for reliably assessing motivational interviewing fidelity using the Motivational Interviewing Skills Code

    PubMed Central

    Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W.; Imel, Zac E.; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C.

    2014-01-01

    The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. PMID:25242192

  20. Advancing methods for reliably assessing motivational interviewing fidelity using the motivational interviewing skills code.

    PubMed

    Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W; Imel, Zac E; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C

    2015-02-01

    The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. Copyright © 2015 Elsevier Inc. All rights reserved.

  1. Deliberate Self-Harm within an International Community Sample of Young People: Comparative Findings from the Child & Adolescent Self-Harm in Europe (CASE) Study

    ERIC Educational Resources Information Center

    Madge, Nicola; Hewitt, Anthea; Hawton, Keith; de Wilde, Erik Jan; Corcoran, Paul; Fekete, Sandor; van Heeringen, Kees; De Leo, Diego; Ystgaard, Mette

    2008-01-01

    Background: Deliberate self-harm among young people is an important focus of policy and practice internationally. Nonetheless, there is little reliable comparative international information on its extent or characteristics. We have conducted a seven-country comparative community study of deliberate self-harm among young people. Method: Over 30,000…

  2. Reliability and validity of a brief method to assess nociceptive flexion reflex (NFR) threshold.

    PubMed

    Rhudy, Jamie L; France, Christopher R

    2011-07-01

    The nociceptive flexion reflex (NFR) is a physiological tool to study spinal nociception. However, NFR assessment can take several minutes and expose participants to repeated suprathreshold stimulations. The 4 studies reported here assessed the reliability and validity of a brief method to assess NFR threshold that uses a single ascending series of stimulations (Peak 1 NFR), by comparing it to a well-validated method that uses 3 ascending/descending staircases of stimulations (Staircase NFR). Correlations between the NFR definitions were high, were on par with test-retest correlations of Staircase NFR, and were not affected by participant sex or chronic pain status. Results also indicated the test-retest reliabilities for the 2 definitions were similar. Using larger stimulus increments (4 mAs) to assess Peak 1 NFR tended to result in higher NFR threshold estimates than using the Staircase NFR definition, whereas smaller stimulus increments (2 mAs) tended to result in lower NFR threshold estimates than the Staircase NFR definition. Neither NFR definition was correlated with anxiety, pain catastrophizing, or anxiety sensitivity. In sum, a single ascending series of electrical stimulations results in a reliable and valid estimate of NFR threshold. However, caution may be warranted when comparing NFR thresholds across studies that differ in the ascending stimulus increments. This brief method to assess NFR threshold is reliable and valid; therefore, it should be useful to clinical pain researchers interested in quickly assessing inter- and intra-individual differences in spinal nociceptive processes. Copyright © 2011 American Pain Society. Published by Elsevier Inc. All rights reserved.

  3. A simple method of measuring tibial tubercle to trochlear groove distance on MRI: description of a novel and reliable technique.

    PubMed

    Camp, Christopher L; Heidenreich, Mark J; Dahm, Diane L; Bond, Jeffrey R; Collins, Mark S; Krych, Aaron J

    2016-03-01

    Tibial tubercle-trochlear groove (TT-TG) distance is a variable that helps guide surgical decision-making in patients with patellar instability. The purpose of this study was to compare the accuracy and reliability of an MRI TT-TG measuring technique using a simple external alignment method to a previously validated gold standard technique that requires advanced software read by radiologists. TT-TG was calculated by MRI on 59 knees with a clinical diagnosis of patellar instability in a blinded and randomized fashion by two musculoskeletal radiologists using advanced software and by two orthopaedists using the study technique which utilizes measurements taken on a simple electronic imaging platform. Interrater reliability between the two radiologists and the two orthopaedists and intermethods reliability between the two techniques were calculated using interclass correlation coefficients (ICC) and concordance correlation coefficients (CCC). ICC and CCC values greater than 0.75 were considered to represent excellent agreement. The mean TT-TG distance was 14.7 mm (Standard Deviation (SD) 4.87 mm) and 15.4 mm (SD 5.41) as measured by the radiologists and orthopaedists, respectively. Excellent interobserver agreement was noted between the radiologists (ICC 0.941; CCC 0.941), the orthopaedists (ICC 0.978; CCC 0.976), and the two techniques (ICC 0.941; CCC 0.933). The simple TT-TG distance measurement technique analysed in this study resulted in excellent agreement and reliability as compared to the gold standard technique. This method can predictably be performed by orthopaedic surgeons without advanced radiologic software. II.

  4. Three-dimensional facial anthropometry of unilateral cleft lip infants with a structured light scanning system.

    PubMed

    Li, Guanghui; Wei, Jianhua; Wang, Xi; Wu, Guofeng; Ma, Dandan; Wang, Bo; Liu, Yanpu; Feng, Xinghua

    2013-08-01

    Cleft lip in the presence or absence of a cleft palate is a major public health problem. However, few studies have been published concerning the soft-tissue morphology of cleft lip infants. Currently, obtaining reliable three-dimensional (3D) surface models of infants remains a challenge. The aim of this study was to investigate a new way of capturing 3D images of cleft lip infants using a structured light scanning system. In addition, the accuracy and precision of the acquired facial 3D data were validated and compared with direct measurements. Ten unilateral cleft lip patients were enrolled in the study. Briefly, 3D facial images of the patients were acquired using a 3D scanner device before and after the surgery. Fourteen items were measured by direct anthropometry and 3D image software. The accuracy and precision of the 3D system were assessed by comparative analysis. The anthropometric data obtained using the 3D method were in agreement with the direct anthropometry measurements. All data calculated by the software were 'highly reliable' or 'reliable', as defined in the literature. The localisation of four landmarks was not consistent in repeated experiments of inter-observer reliability in preoperative images (P<0.05), while the intra-observer reliability in both pre- and postoperative images was good (P>0.05). The structured light scanning system is proven to be a non-invasive, accurate and precise method in cleft lip anthropometry. Copyright © 2013 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.

  5. [Ultrasonic scissors. New vs resterilized instruments].

    PubMed

    Gärtner, D; Münz, K; Hückelheim, E; Hesse, U

    2008-02-01

    The aim of this study was to compare reliability in handling and function of resterilized and single-use disposable ultrasonic scissors. In a prospective randomized study, the surgeon blindly tested new and resterilized ultrasonic scissors. The parameters were force of activation, cutting effect, coagulation effect, error messages, and disturbing generator noise. Fifty-one new and 49 resterilized instruments in 94 operations were evaluated. The differences in force of activation, cutting effect, and coagulation were not significant. Error messages and disturbing noises were rare in both groups. Six new instruments and two resterilized instruments had to be exchanged because of problems during surgery. This study demonstrates comparable reliability in function and handling of resterilized and new ultrasonic scissors. The use of resterilized instruments leads to distinctly reduced costs and could contribute to efficiency in laparoscopic surgery.

  6. Comparison of Two Validated Voiding Questionnaires and Clinical Impression in Children With Lower Urinary Tract Symptoms: ICIQ-CLUTS Versus Akbal Survey.

    PubMed

    Goknar, Nilüfer; Oktem, Faruk; Demir, Aysegul D; Vehapoglu, Aysel; Silay, Mesrur S

    2016-08-01

    To compare how well 2 commonly used and validated voiding questionnaires (ICIQ-CLUTS and Akbal's) correlate with the physician's clinical impression, and to investigate the reliability of these instruments in children with lower urinary tract symptoms (LUTS). Akbal's questionnaires and ICIQ-CLUTS forms were completed by children between 5 and 18 years old with and without LUTS and by their parents. The data were classified into 3 age groups (5-9, 10-13, 14-18). The reliability of Akbal and ICIQ-CLUTS was investigated by using Cronbach's α (≥0.7 indicating acceptability). The total scores of the tools were compared with the physician's clinical impression (Kendall's tau-b test). A total of 154 children (LUTS: n = 88, controls: n = 66) were prospectively enrolled into the study. The reliability of both instruments was excellent (Cronbach's α: Akbal, 0.811; ICIQ-CLUTS children's version, 0.728; ICIQ-CLUTS parental version, 0.746). When compared by Kendall's tau, Akbal was better correlated with the physician's clinical impression. In addition, the children's version of ICIQ-CLUTS was better correlated than the parental version. The results of our study indicate that both tools are reliable and objective for grading LUTS in the pediatric population. Although both surveys were significantly correlated with clinical impression, the consistency of Akbal's questionnaire was found to be superior to that of ICIQ-CLUTS. Copyright © 2016 Elsevier Inc. All rights reserved.
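
    The internal-consistency check described in this record can be illustrated with a minimal sketch of Cronbach's α, assuming the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name `cronbach_alpha`, the helper `is_acceptable`, and the response data are hypothetical and are not taken from the study; only the 0.7 acceptability threshold comes from the abstract.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of respondents and columns of items.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least two items and equal-length rows")
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def is_acceptable(alpha, threshold=0.7):
    """Apply the >= 0.7 acceptability rule quoted in the abstract."""
    return alpha >= threshold


# Hypothetical responses: 5 respondents answering 3 questionnaire items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}, acceptable = {is_acceptable(alpha)}")
```

    Items that move together across respondents push α toward 1, while unrelated items pull it toward 0, which is why the statistic is read as a measure of internal consistency rather than of agreement with an external standard.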

  7. Bootstrap study of genome-enabled prediction reliabilities using haplotype blocks across Nordic Red cattle breeds.

    PubMed

    Cuyabano, B C D; Su, G; Rosa, G J M; Lund, M S; Gianola, D

    2015-10-01

    This study compared the accuracy of genome-enabled prediction models using individual single nucleotide polymorphisms (SNP) or haplotype blocks as covariates when using either a single breed or a combined population of Nordic Red cattle. The main objective was to compare predictions of breeding values of complex traits using a combined training population with haplotype blocks, with predictions using a single breed as training population and individual SNP as predictors. To compare the prediction reliabilities, bootstrap samples were taken from the test data set. With the bootstrapped samples of prediction reliabilities, we built and graphed confidence ellipses to allow comparisons. Finally, measures of statistical distances were used to calculate the gain in predictive ability. Our analyses are innovative in the context of assessment of predictive models, allowing a better understanding of prediction reliabilities and providing a statistical basis to effectively calibrate whether one prediction scenario is indeed more accurate than another. An ANOVA indicated that use of haplotype blocks produced significant gains mainly when Bayesian mixture models were used but not when Bayesian BLUP was fitted to the data. Furthermore, when haplotype blocks were used to train prediction models in a combined Nordic Red cattle population, we obtained up to a statistically significant 5.5% average gain in prediction accuracy, over predictions using individual SNP and training the model with a single breed. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
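
    The bootstrap step described in this record can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: reliability is taken here to be the squared Pearson correlation between predicted and observed values, the interval is a simple percentile interval rather than the confidence ellipses mentioned in the abstract, and all names and numbers (`bootstrap_reliability`, `percentile_interval`, the data) are hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; raises ValueError if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance")
    return sxy / sqrt(sxx * syy)


def bootstrap_reliability(predicted, observed, n_boot=1000, seed=0):
    """Resample test-set pairs with replacement; return squared correlations."""
    pearson_r(predicted, observed)  # fail early if the full data are degenerate
    rng = random.Random(seed)
    n = len(predicted)
    samples = []
    while len(samples) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            r = pearson_r([predicted[i] for i in idx], [observed[i] for i in idx])
        except ValueError:
            continue  # skip the rare resample with no variation
        samples.append(r * r)
    return samples


def percentile_interval(samples, lower=0.025, upper=0.975):
    """Approximate percentile interval from sorted bootstrap samples."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


# Hypothetical predicted and observed values for eight test animals.
predicted = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1]
observed = [1.2, 1.9, 3.4, 3.8, 5.5, 5.9, 6.8, 8.4]
samples = bootstrap_reliability(predicted, observed, n_boot=200, seed=1)
print(percentile_interval(samples))
```

    Running the same resampling for two prediction scenarios and comparing the resulting distributions is one way to judge whether an apparent gain exceeds sampling noise.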

  8. Persian version of frontal assessment battery: Correlations with formal measures of executive functioning and providing normative data for Persian population.

    PubMed

    Asaadi, Sina; Ashrafi, Farzad; Omidbeigi, Mahmoud; Nasiri, Zahra; Pakdaman, Hossein; Amini-Harandi, Ali

    2016-01-05

    Cognitive impairment in patients with Parkinson's disease (PD) mainly involves executive function (EF). The frontal assessment battery (FAB) is an efficient tool for the assessment of EFs. The aims of this study were to determine the validity and reliability of the Persian version of the FAB, to assess its correlation with formal measures of EFs, and to provide normative data for the Persian version of the FAB in patients with PD. The study recruited 149 healthy participants and 49 patients with idiopathic PD. In PD patients, FAB results were compared to their performance on EF tests. Reliability analysis involved test-retest reliability and internal consistency, whereas validity analysis involved a convergent validity approach. FAB scores were compared between normal controls and PD patients matched for age, education, and Mini-Mental State Examination (MMSE) score. In PD patients, FAB scores were significantly decreased compared to normal controls, and correlated with the Stroop test and the Wisconsin Card Sorting Test (WCST). In healthy subjects, FAB scores varied according to age, education, and MMSE. In the FAB subtest analysis, PD patients performed worse than the healthy participants on similarities, fluency tasks, and Luria's motor series. The Persian version of the FAB could be used as a reliable scale for the assessment of frontal lobe functions in Iranian patients with PD. Furthermore, the normative data provided for the Persian version of this test improve the accuracy and confidence in the clinical application of the FAB.

  9. The reliability and validity of a soccer-specific nonmotorised treadmill simulation (intermittent soccer performance test).

    PubMed

    Aldous, Jeffrey W F; Akubat, Ibrahim; Chrismas, Bryna C R; Watkins, Samuel L; Mauger, Alexis R; Midgley, Adrian W; Abt, Grant; Taylor, Lee

    2014-07-01

    This study investigated the reliability and validity of a novel nonmotorised treadmill (NMT)-based soccer simulation using a novel activity category called a "variable run" to quantify fatigue during high-speed running. Twelve male University soccer players completed 3 familiarization sessions and 1 peak speed assessment before completing the intermittent soccer performance test (iSPT) twice. The 2 iSPTs were separated by 6-10 days. The total distance, sprint distance, and high-speed running distance (HSD) were 8,968 ± 430 m, 980 ± 75 m and 2,122 ± 140 m, respectively. No significant difference (p > 0.05) was found between repeated trials of the iSPT for all physiological and performance variables. Reliability measures between iSPT1 and iSPT2 showed good agreement (coefficient of variation: <4.6%; intraclass correlation coefficient: >0.80). Furthermore, the variable run phase showed HSD significantly decreased (p ≤ 0.05) in the last 15 minutes (89 ± 6 m) compared with the first 15 minutes (85 ± 7 m), quantifying decrements in high-speed exercise compared with the previous literature. This study validates the iSPT as a NMT-based soccer simulation compared with the previous match-play data and is a reliable tool for assessing and monitoring physiological and performance variables in soccer players. The iSPT could be used in a number of ways including player rehabilitation, understanding the efficacy of nutritional interventions, and also the quantification of environmentally mediated decrements on soccer-specific performance.

  10. Reliability and relative validity of three physical activity questionnaires in Taizhou population of China: the Taizhou Longitudinal Study.

    PubMed

    Hu, B; Lin, L F; Zhuang, M Q; Yuan, Z Y; Li, S Y; Yang, Y J; Lu, M; Yu, S Z; Jin, L; Ye, W M; Wang, X F

    2015-09-01

    To examine the test-retest reliabilities and relative validities of the Chinese version of the short International Physical Activity Questionnaire (IPAQ-S-C), the Global Physical Activity Questionnaire (GPAQ-C), and the Total Energy Expenditure Questionnaire (TEEQ-C) in a population-based prospective study, the Taizhou Longitudinal Study (TZLS). A longitudinal comparative study. A total of 205 participants (male: 38.54%) aged 30-70 years completed the three questionnaires twice (day one and day nine) and a physical activity log (PA-log) over seven consecutive days. The test-retest reliabilities were evaluated using intra-class correlation coefficients (ICCs) and the relative validities were estimated by comparing the data from the physical activity questionnaires (PAQs) and the PA-log. Good reliabilities were observed between the repeated PAQs. The ICCs ranged from 0.51 to 0.80 for IPAQ-S-C, 0.67 to 0.85 for GPAQ-C, and 0.74 to 0.94 for TEEQ-C, respectively. Energy expenditure of most PA domains estimated by the three PAQs correlated moderately with the results recorded by the PA-log, except the walking domain of IPAQ-S-C. The partial correlation coefficients between the PAQs and the PA-log ranged from 0.44 to 0.58 for IPAQ-S-C, 0.26 to 0.52 for GPAQ-C, and 0.41 to 0.72 for TEEQ-C, respectively. Bland-Altman plots showed acceptable agreement between the three PAQs and the PA-log. The three PAQs, especially TEEQ-C, were relatively reliable and valid for assessment of physical activity and could be used in TZLS. Copyright © 2015 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
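
    A test-retest ICC of the kind reported in this record can be sketched as below. The abstract does not say which ICC form was used, so this example assumes the one-way random-effects, single-measurement form, ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) · MSW), computed from between-subject and within-subject mean squares. The function name `icc_oneway` and the day-one/day-nine values are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    `ratings` holds one row per subject and one column per repeated
    measurement (for example, day one and day nine).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least two measurements per subject")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("no variation in the data")
    return (ms_between - ms_within) / denominator


# Hypothetical energy-expenditure scores for 5 participants on two occasions.
test_retest = [
    [30.0, 32.0],
    [45.0, 44.0],
    [28.0, 30.0],
    [50.0, 47.0],
    [38.0, 39.0],
]
print(f"ICC(1,1) = {icc_oneway(test_retest):.3f}")
```

    Other ICC forms (two-way models, averaged measurements) can give different values on the same data, so a reported ICC is only comparable across studies when the form is stated.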

  11. Reliability analysis of visual ranking of coronary artery calcification on low-dose CT of the thorax for lung cancer screening: comparison with ECG-gated calcium scoring CT.

    PubMed

    Kim, Yoon Kyung; Sung, Yon Mi; Cho, So Hyun; Park, Young Nam; Choi, Hye-Young

    2014-12-01

    Coronary artery calcification (CAC) is frequently detected on low-dose CT (LDCT) of the thorax. Concurrent assessment of CAC and lung cancer screening using LDCT is beneficial in terms of cost and radiation dose reduction. The aim of our study was to evaluate the reliability of visual ranking of positive CAC on LDCT compared to Agatston score (AS) on electrocardiogram (ECG)-gated calcium scoring CT. We studied 576 patients who were consecutively registered for health screening and undergoing both LDCT and ECG-gated calcium scoring CT. We excluded subjects with an AS of zero. The final study cohort included 117 patients with CAC (97 men; mean age, 53.4 ± 8.5). AS was used as the gold standard (mean score 166.0; range 0.4-3,719.3). Two board-certified radiologists and two radiology residents participated in an observer performance study. Visual ranking of CAC was performed according to four categories (1-10, 11-100, 101-400, and 401 or higher) for coronary artery disease risk stratification. Weighted kappa statistics were used to measure the degree of reliability on visual ranking of CAC on LDCT. The degree of reliability on visual ranking of CAC on LDCT compared to ECG-gated calcium scoring CT was excellent for board-certified radiologists and good for radiology residents. A high degree of association was observed with 71.6% of visual rankings in the same category as the Agatston category and 98.9% varying by no more than one category. Visual ranking of positive CAC on LDCT is reliable for predicting AS rank categorization.

  12. Dental hygiene faculty calibration in the evaluation of calculus detection.

    PubMed

    Garland, Kandis V; Newell, Kathleen J

    2009-03-01

    The purpose of this pilot study was to explore the impact of faculty calibration training on intra- and interrater reliability regarding calculus detection. After IRB approval, twelve dental hygiene faculty members were recruited from a pool of twenty-two for voluntary participation and randomized into two groups. All subjects provided two pre- and two posttest scorings of calculus deposits on each of three typodonts by recording yes or no indicating if they detected calculus. Accuracy and consistency of calculus detection were evaluated using an answer key. The experimental group received three two-hour training sessions to practice a prescribed exploring sequence and technique for calculus detection. Participants immediately corrected their answers, received feedback from the trainer, and reconciled missed areas. Intra- and interrater reliability (pre- and posttest) was determined using Cohen's Kappa and compared between groups using repeated measures (split-plot) ANOVA. The groups did not differ from pre- to posttraining (intrarater reliability p=0.64; interrater reliability p=0.20). Training had no effect on reliability levels for simulated calculus detection in this study. Recommendations for future studies of faculty calibration when evaluating students include using patients for assessing rater reliability, employing larger samples at multiple sites, and assessing the impact on students' attitudes and learning outcomes.

  13. A proposed method to investigate reliability throughout a questionnaire.

    PubMed

    Wentzel-Larsen, Tore; Norekvål, Tone M; Ulvik, Bjørg; Nygård, Ottar; Pripp, Are H

    2011-10-05

    Questionnaires are used extensively in medical and health care research and depend on validity and reliability. However, participants may differ in interest and awareness throughout long questionnaires, which can affect the reliability of their answers. A method is proposed for "screening" of systematic change in random error, which could assess changed reliability of answers. A simulation study was conducted to explore whether systematic change in reliability, expressed as changed random error, could be assessed using unsupervised classification of subjects by cluster analysis (CA) and estimation of the intraclass correlation coefficient (ICC). The method was also applied to a clinical dataset from 753 cardiac patients using the Jalowiec Coping Scale. The simulation study showed a relationship between the systematic change in random error throughout a questionnaire and the slope between the estimated ICC for subjects classified by CA and successive items in a questionnaire. This slope was proposed as an awareness measure, to assess whether respondents provide only a random answer or one based on a substantial cognitive effort. Scales from different factor structures of the Jalowiec Coping Scale had different effects on this awareness measure. Even though assumptions in the simulation study might be limited compared to real datasets, the approach is promising for assessing systematic change in reliability throughout long questionnaires. Results from a clinical dataset indicated that the awareness measure differed between scales.

  14. Interrater reliability: the kappa statistic.

    PubMed

    McHugh, Mary L

    2012-01-01

    The kappa statistic is frequently used to test interrater reliability. The importance of rater reliability lies in the fact that it represents the extent to which the data collected in the study are correct representations of the variables measured. Measurement of the extent to which data collectors (raters) assign the same score to the same variable is called interrater reliability. While there have been a variety of methods to measure interrater reliability, traditionally it was measured as percent agreement, calculated as the number of agreement scores divided by the total number of scores. In 1960, Jacob Cohen critiqued the use of percent agreement due to its inability to account for chance agreement. He introduced Cohen's kappa, developed to account for the possibility that raters actually guess on at least some variables due to uncertainty. Like most correlation statistics, kappa can range from -1 to +1. While kappa is one of the most commonly used statistics to test interrater reliability, it has limitations. Judgments about what level of kappa should be acceptable for health research are questioned. Cohen's suggested interpretation may be too lenient for health-related studies because it implies that a score as low as 0.41 might be acceptable. Kappa and percent agreement are compared, and levels for both kappa and percent agreement that should be demanded in healthcare studies are suggested.
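
    The two statistics contrasted in this record can be computed side by side. The sketch below assumes two raters scoring the same items with categorical labels and uses the usual definition κ = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each rater's label frequencies. The function names and the ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement from each rater's marginal label frequencies.
    p_expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical yes/no ratings of 10 items by two raters.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"]
print(f"percent agreement = {percent_agreement(rater_a, rater_b):.2f}")
print(f"Cohen's kappa     = {cohens_kappa(rater_a, rater_b):.2f}")
```

    In this example the raters agree on 8 of 10 items, but half of that agreement is expected by chance, so kappa comes out lower than percent agreement. When nearly all items fall into one category, chance agreement approaches 1 and kappa can drop sharply even though percent agreement stays high.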

  15. Issues in Benchmarking Human Reliability Analysis Methods: A Literature Review

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ronald L. Boring; Stacey M. L. Hendrickson; John A. Forester

    There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessments (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study comparing and evaluating HRA methods in assessing operator performance in simulator experiments is currently underway. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.

  16. Reliability and Validity of the Italian Version of the Protocol of Orofacial Myofunctional Evaluation with Scores (I-OMES).

    PubMed

    Scarponi, Letizia; de Felicio, Claudia Maria; Sforza, Chiarella; Pimenta Ferreira, Claudia Lucia; Ginocchio, Daniela; Pizzorni, Nicole; Barozzi, Stefania; Mozzanica, Francesco; Schindler, Antonio

    2018-05-30

    To evaluate the reliability, validity, and responsiveness of the Italian OMES (I-OMES). The study consisted of 3 phases: (1) internal consistency and reliability, (2) validity, and (3) responsiveness analysis. The recruited population included 27 patients with orofacial myofunctional disorders (OMD) and 174 healthy volunteers. Forty-seven subjects, 18 healthy and all recruited patients with OMD, were assessed for inter-rater and test-retest reliability analysis. I-OMES and Nordic Orofacial Test - Screening (NOT-S) scores of the patients were correlated for concurrent validity analysis. I-OMES scores from 27 patients with OMD and 27 age- and gender-matched healthy subjects were compared to investigate construct validity. I-OMES scores before and after successful swallowing rehabilitation in patients were compared for responsiveness analysis. Adequate internal consistency (Cronbach α = 0.71) and strong inter-rater and test-retest reliability (intraclass correlation coefficient = 0.97 and 0.98, respectively) were found. I-OMES and NOT-S scores significantly and inversely correlated (r = -0.38). A statistically significant difference (p < 0.001) was found between the pathological group and the control group for the total I-OMES score. The mean I-OMES score improved from 90 (78-102) to 99 (89-103) after myofunctional rehabilitation (p < 0.001). The I-OMES is a reliable and valid tool to evaluate OMD. © 2018 S. Karger AG, Basel.

  17. Reliability of plain radiographic parameters for developmental dysplasia of the hip in children.

    PubMed

    Upasani, Vidyadhar V; Bomar, James D; Parikh, Gaurav; Hosalkar, Harish

    2012-07-01

    Few studies have evaluated the reliability and reproducibility of the femoral neck-shaft angle (NSA), center-edge angle (CEA), and acetabular index (AI) in young children with developmental dysplasia of the hip (DDH). We wanted to determine whether these parameters could be used reliably by practitioners. Fifty radiographs from 21 children with DDH were reviewed. Analysis was performed by three observers, at two time periods. The intra- and inter-observer reliability for each measure was assessed. At time period one, we noted a "high" level of agreement between observers when measuring the NSA, a "low" level when measuring the CEA, and a "moderate" level when measuring the AI. At time period two, we noted a "very high" level of agreement between observers when measuring the NSA and a "high" level when measuring the CEA and AI. When comparing the measurements of observer 1 at the two different time periods, we noted nearly "very high" agreement when measuring the NSA, a "moderate" agreement when measuring the CEA, and a "high" agreement for the AI. In comparing the measurements of observer 2, we noted "very high" agreement for the NSA and "high" agreement for the CEA and AI. In comparing the measurements for observer 3, we noted nearly "very high" agreement for the NSA, nearly "high" agreement for the CEA, and "high" agreement for the AI. It is difficult to reliably measure three-dimensional pelvic morphology on a frontal plane radiograph, especially when important pelvic landmarks have yet to ossify.

  18. Pulse oximeter sensor application during neonatal resuscitation: a randomized controlled trial.

    PubMed

    Louis, Deepak; Sundaram, Venkataseshan; Kumar, Praveen

    2014-03-01

    This study was done to compare 2 techniques of pulse oximeter sensor application during neonatal resuscitation for faster signal detection. Sensor to infant first (STIF) and then to oximeter was compared with sensor to oximeter first (STOF) and then to infant in ≥28 weeks gestations. The primary outcome was time from completion of sensor application to reliable signal, defined as stable display of heart rate and saturation. Time from birth to sensor application, time taken for sensor application, time from birth to reliable signal, and need to reapply sensor were secondary outcomes. An intention-to-treat analysis was done, and subgroup analysis was done for gestation and need for resuscitation. One hundred fifty neonates were randomized with 75 to each technique. The median (IQR) time from sensor application to detection of reliable signal was longer in STIF group compared with STOF group (16 [15-17] vs. 10 [6-18] seconds; P <0.001). Time taken for application of sensor was longer with STIF technique than with STOF technique (12 [10-16] vs. 11 [9-15] seconds; P = 0.04). Time from birth to reliable signal did not differ between the 2 methods (STIF: 61 [52-76] seconds; STOF: 58 [47-73] seconds [P = .09]). Time taken for signal acquisition was longer with STIF than with STOF in both subgroups. In the delivery room setting, the STOF method recognized saturation and heart rate faster than the STIF method. The time from birth to reliable signal was similar with the 2 methods.

  19. Rater methodology for stroboscopy: a systematic review.

    PubMed

    Bonilha, Heather Shaw; Focht, Kendrea L; Martin-Harris, Bonnie

    2015-01-01

    Laryngeal endoscopy with stroboscopy (LES) remains the clinical gold standard for assessing vocal fold function. LES is used to evaluate the efficacy of voice treatments in research studies and clinical practice. LES as a voice treatment outcome tool is only as good as the clinician interpreting the recordings. Research using LES as a treatment outcome measure should be evaluated based on rater methodology and reliability. The purpose of this literature review was to evaluate the rater-related methodology from studies that use stroboscopic findings as voice treatment outcome measures. Systematic literature review. Computerized journal databases were searched for relevant articles using terms: stroboscopy and treatment. Eligible articles were categorized and evaluated for the use of rater-related methodology, reporting of number of raters, types of raters, blinding, and rater reliability. Of the 738 articles reviewed, 80 articles met inclusion criteria. More than one-third of the studies included in the review did not report the number of raters who participated in the study. Eleven studies reported results of rater reliability analysis with only two studies reporting good inter- and intrarater reliability. The comparability and use of results from treatment studies that use LES are limited by a lack of rigor in rater methodology and variable, mostly poor, inter- and intrarater reliability. To improve our ability to evaluate and use the findings from voice treatment studies that use LES features as outcome measures, greater consistency of reporting rater methodology characteristics across studies and improved rater reliability is needed. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  20. Robotic-Assisted Knee Arthroplasty: An Overview.

    PubMed

    van der List, Jelle P; Chawla, Harshvardhan; Pearle, Andrew D

    2016-01-01

    Unicompartmental knee arthroplasty and total knee arthroplasty are reliable treatment options for osteoarthritis. In order to improve survivorship rates, variables that are intraoperatively controlled by the orthopedic surgeon are being evaluated. These variables include lower leg alignment, soft tissue balance, joint line maintenance, and tibial and femoral component alignment, size, and fixation methods. Since tighter control of these factors is associated with improved outcomes of knee arthroplasty, several computer-assisted surgery systems have been developed. These systems differ in the number and type of variables they control. Robotic-assisted systems control these aforementioned variables and, in addition, aim to improve the surgical precision of the procedure. Robotic-assisted systems are active, semi-active, or passive, depending on how independently the systems perform maneuvers. Reviewing the robotic-assisted knee arthroplasty systems, it becomes clear that these systems can accurately and reliably control the aforementioned variables. Moreover, these systems are more accurate and reliable in controlling these variables when compared to the current gold standard of conventional manual surgery. At present, few studies have assessed the survivorship and functional outcomes of robotic-assisted surgery, and no sufficiently powered studies were identified that compared survivorship or functional outcomes between robotic-assisted and conventional knee arthroplasty. Although preliminary outcomes of robotic-assisted surgery look promising, more studies are necessary to assess if the increased accuracy and reliability in controlling the surgical variables leads to better outcomes of robotic-assisted knee arthroplasty.

  1. Concordance and Reliability of Photogrammetric Protocols for Measuring the Cervical Lordosis Angle: A Systematic Review of the Literature.

    PubMed

    de Albuquerque, Priscila Maria Nascimento Martins; de Alencar, Geisa Guimarães; de Oliveira, Daniela Araújo; de Siqueira, Gisela Rocha

    2018-01-01

    The aim of this study was to examine and interpret the concordance, accuracy, and reliability of photogrammetric protocols available in the literature for evaluating cervical lordosis in an adult population aged 18 to 59 years. A systematic search of 6 electronic databases (MEDLINE via PubMed, LILACS, CINAHL, Scopus, ScienceDirect, and Web of Science) located studies that assessed the reliability and/or concordance and/or accuracy of photogrammetric protocols for evaluating cervical lordosis, compared with radiography. Articles published through April 2016 were selected. Two independent reviewers used critical appraisal tools (QUADAS and QAREL) to assess the quality of the selected studies. Two studies were included in the review and had high levels of reliability (intraclass correlation coefficient: 0.974-0.98). Only 1 study assessed the concordance between the methods, which was calculated using Pearson's correlation coefficient. To date, the accuracy of photogrammetry has not been investigated thoroughly. We encountered no study in the literature that investigated the accuracy of photogrammetry in diagnosing hyperlordosis of the cervical spine. However, both current studies report high levels of intra- and interrater reliability. To increase the level of evidence of photogrammetry in the evaluation of cervical lordosis, it is necessary to conduct further studies using a larger sample to increase the external validity of the findings. Copyright © 2018. Published by Elsevier Inc.

  2. Test–retest reliability and validity of a web-based food-frequency questionnaire for adolescents aged 13–14 to be used in the Norwegian Mother and Child Cohort Study (MoBa)

    PubMed Central

    Øverby, Nina Cecilie; Johannesen, Elisabeth; Jensen, Grete; Skjaevesland, Anne-Kirsti; Haugen, Margaretha

    2014-01-01

    Background: The assessment of food intake is challenging and prone to errors; it is therefore important to consider the reliability and validity of the assessment methods. Objective: The aim of this study was to analyze the reproducibility and validity of a newly developed food-frequency questionnaire (FFQ) for use among adolescents. Design: In total, 58 students (aged 13–14) from four different schools in the southern part of Norway participated in the reproducibility study by filling out the FFQ twice, 4 weeks apart. In addition, 93 students participated in the relative validity study, where the FFQ was compared to 2×24-hour dietary recalls, while 92 students participated in the absolute validity study, where the intakes of fatty acids and vitamin D from the FFQ were compared to fatty acids and 25-hydroxy-vitamin D3 in whole blood. Results: The median Spearman correlation coefficient for all nutrients in the test–retest reliability study was 0.57. The median Spearman correlation for all nutrients in the relative validity study was 0.26, while the correlation coefficients were low in the absolute validity study, with n-3 fatty acid coefficients ranging from 0.05 to 0.25, and absent for vitamin D (r=0.000). Conclusion: The test–retest reproducibility was considered good, the relative validity was considered poor to good, and the absolute validity was considered poor. However, the results are comparable to other studies among adolescents. PMID:25371661

  3. A comparison of Google Glass and traditional video vantage points for bedside procedural skill assessment.

    PubMed

    Evans, Heather L; O'Shea, Dylan J; Morris, Amy E; Keys, Kari A; Wright, Andrew S; Schaad, Douglas C; Ilgen, Jonathan S

    2016-02-01

    This pilot study assessed the feasibility of using first person (1P) video recording with Google Glass (GG) to assess procedural skills, as compared with traditional third person (3P) video. We hypothesized that raters reviewing 1P videos would visualize more procedural steps with greater inter-rater reliability than 3P rating vantages. Seven subjects performed simulated internal jugular catheter insertions. Procedures were recorded by both Google Glass and an observer's head-mounted camera. Videos were assessed by 3 expert raters using a task-specific checklist (CL) and both an additive- and summative-global rating scale (GRS). Mean scores were compared by t-tests. Inter-rater reliabilities were calculated using intraclass correlation coefficients. The 1P vantage was associated with a significantly higher mean CL score than the 3P vantage (7.9 vs 6.9, P = .02). Mean GRS scores were not significantly different. Mean inter-rater reliabilities for the CL, additive-GRS, and summative-GRS were similar between vantages. 1P vantage recordings may improve visualization of tasks for behaviorally anchored instruments (eg, CLs), while maintaining similar global ratings and inter-rater reliability when compared with conventional 3P vantage recordings. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Reliability of Two Smartphone Applications for Radiographic Measurements of Hallux Valgus Angles.

    PubMed

    Mattos E Dinato, Mauro Cesar; Freitas, Marcio de Faria; Milano, Cristiano; Valloto, Elcio; Ninomiya, André Felipe; Pagnano, Rodrigo Gonçalves

    The objective of the present study was to assess the reliability of 2 smartphone applications compared with the traditional goniometer technique for measurement of radiographic angles in hallux valgus and the time required for analysis with the different methods. The radiographs of 31 patients (52 feet) with a diagnosis of hallux valgus were analyzed. Four observers, 2 with >10 years' experience in foot and ankle surgery and 2 in-training surgeons, measured the hallux valgus angle and intermetatarsal angle using a manual goniometer technique and 2 smartphone applications (Hallux Angles and iPinPoint). The interobserver and intermethod reliability were estimated using intraclass correlation coefficients (ICCs), and the time required for measurement of the angles among the 3 methods was compared using the Friedman test. A very good or good interobserver reliability was found among the 4 observers measuring the hallux valgus angle and intermetatarsal angle using the goniometer (ICC 0.913 and 0.821, respectively) and iPinPoint (ICC 0.866 and 0.638, respectively). Using the Hallux Angles application, a very good interobserver reliability was found for measurements of the hallux valgus angle (ICC 0.962) and intermetatarsal angle (ICC 0.935) only among the more experienced observers. The time required for the measurements was significantly shorter for the measurements using both smartphone applications compared with the goniometer method. One smartphone application (iPinPoint) was reliable for measurements of the hallux valgus angles by either experienced or nonexperienced observers. The use of these tools might save time in the evaluation of radiographic angles in the hallux valgus. Copyright © 2016 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  5. Reliability of Central Adiposity Assessments Using B-Mode Ultrasound: A Comparison of Linear and Curved Array Transducers.

    PubMed

    Stoner, Lee; Geoffron, Morgane; Cornwall, Jon; Chinn, Victoria; Gram, Martin; Credeur, Daniel; Fryer, Simon

    2016-12-01

    Recently, it was reported that intra-abdominal thickness (IAT) assessments using ultrasound are most reliable if measured from the linea alba to the anterior vertebral column. These 2 anatomical sites can be simultaneously visualized using a linear array transducer. Linear array transducers have different operational characteristics when compared with conventional curved array transducers and are more reliable for some ultrasound-derived measures such as abdominal subcutaneous fat thickness. However, it is unknown whether linear array transducers facilitate more reliable IAT measurements than curved array transducers. The purpose of the current study was to (1) compare the reliability of linear and curved array transducer assessments of IAT and maximal abdominal ratio (MAR) and (2) use the findings to update central adiposity measurement guidelines. Fifteen healthy adults (mean [SD], 27 [10] years; 60% female) with a range of somatotypes (body mass index: mean [SD], 24 [4]; range, 19-33 kg/m²; waist circumference: mean [SD], 75 [11]; range, 61-96 cm) were tested on 3 mornings under standardized conditions. Intra-abdominal thickness was assessed 2 cm above the umbilicus (transverse plane), measuring from linea alba to the anterior vertebral column. Maximal abdominal ratio was defined as the ratio of IAT to abdominal subcutaneous fat thickness. The IAT range was 25 to 87 mm, and the MAR range was 0.15 to 0.77. Between-day intraclass correlation coefficient values for IAT measurements made were comparable (0.96-0.97) for both transducers, as were MAR values (0.95). In conclusion, while both transducers provided equally reliable measurement of IAT, the use of a single linear array transducer simplifies the assessment of central adiposity.

  6. Point-Connecting Measurements of the Hallux Valgus Deformity: A New Measurement and Its Clinical Application

    PubMed Central

    Seo, Jeong-Ho; Boedijono, Dimas

    2016-01-01

    Purpose: The aim of this study was to investigate new point-connecting measurements for the hallux valgus angle (HVA) and the first intermetatarsal angle (IMA), which can reflect the degree of subluxation of the first metatarsophalangeal joint (MTPJ). Also, this study attempted to compare the validity of midline measurements and the new point-connecting measurements for the determination of HVA and IMA values. Materials and Methods: Sixty feet of hallux valgus patients who underwent surgery between 2007 and 2011 were classified in terms of the severity of HVA, congruency of the first MTPJ, and type of chevron metatarsal osteotomy. On weight-bearing dorsal-plantar radiographs, HVA and IMA values were measured and compared preoperatively and postoperatively using both the conventional and new methods. Results: Compared with midline measurements, point-connecting measurements showed higher inter- and intra-observer reliability for preoperative HVA/IMA and similar or higher inter- and intra-observer reliability for postoperative HVA/IMA. Patients who underwent distal chevron metatarsal osteotomy (DCMO) had higher intraclass correlation coefficient for inter- and intra-observer reliability for pre- and post-operative HVA and IMA measured by the point-connecting method compared with the midline method. All differences in the preoperative HVAs and IMAs determined by both the midline method and point-connecting methods were significant between the deviated group and subluxated groups (p=0.001). Conclusion: The point-connecting method for measuring HVA and IMA in the subluxated first MTPJ may better reflect the severity of a HV deformity with higher reliability than the midline method, and is more useful in patients with DCMO than in patients with proximal chevron metatarsal osteotomy. PMID:26996576

  7. Reliability and criterion validity of measurements using a smart phone-based measurement tool for the transverse rotation angle of the pelvis during single-leg lifting.

    PubMed

    Jung, Sung-Hoon; Kwon, Oh-Yun; Jeon, In-Cheol; Hwang, Ui-Jae; Weon, Jong-Hyuck

    2018-01-01

    The purposes of this study were to determine the intra-rater test-retest reliability of a smart phone-based measurement tool (SBMT) and a three-dimensional (3D) motion analysis system for measuring the transverse rotation angle of the pelvis during single-leg lifting (SLL) and the criterion validity of the transverse rotation angle of the pelvis measurement using SBMT compared with a 3D motion analysis system (3DMAS). Seventeen healthy volunteers performed SLL with their dominant leg without bending the knee until they reached a target placed 20 cm above the table. This study used a 3DMAS, considered the gold standard, to measure the transverse rotation angle of the pelvis to assess the criterion validity of the SBMT measurement. Intra-rater test-retest reliability was determined using the SBMT and 3DMAS using intra-class correlation coefficient (ICC) [3,1] values. The criterion validity of the SBMT was assessed with ICC [3,1] values. Both the 3DMAS (ICC = 0.77) and SBMT (ICC = 0.83) showed excellent intra-rater test-retest reliability in the measurement of the transverse rotation angle of the pelvis during SLL in a supine position. Moreover, the SBMT showed an excellent correlation with the 3DMAS (ICC = 0.99). Measurement of the transverse rotation angle of the pelvis using the SBMT showed excellent reliability and criterion validity compared with the 3DMAS.
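
    The ICC [3,1] statistic named in the abstract above (two-way mixed model, consistency, single measurement) can be computed from the mean squares of a subjects-by-sessions table. The following is a minimal sketch, assuming complete data with no missing cells; the function name icc_3_1 and the example ratings are illustrative and are not taken from the study.

```python
def icc_3_1(ratings):
    """ICC(3,1): two-way mixed model, consistency, single measurement.

    `ratings` is a list of rows: one row per subject, one column per
    session (or rater). Assumes a complete table with no missing cells.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of sessions / raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into subjects, sessions, error.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Illustrative ratings only (6 subjects x 4 sessions); not study data.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_3_1(example), 3))
```

    Because the between-session mean square is left out of the denominator, ICC [3,1] is insensitive to a constant offset between sessions, which is one reason it is commonly chosen for test-retest designs with a fixed rater or device.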

  8. Readability and Test-Retest Reliability of a Psychometric Instrument Designed to Assess HIV/AIDS Attitudes, Beliefs, Behaviours and Sources of HIV Prevention Information of Young Adults

    ERIC Educational Resources Information Center

    Balogun, Joseph; Abiona, Titilayo; Lukobo-Durrell, Mainza; Adefuye, Adedeji; Amosun, Seyi; Frantz, Jose; Yakut, Yavuz

    2011-01-01

    Objective: This comparative study evaluated the readability and test-retest reliability of a questionnaire designed to assess the attitudes, beliefs, behaviours and sources of information about HIV/AIDS among young adults recruited from universities in the United States of America (USA), Turkey and South Africa. Design/Setting: The instrument was…

  9. Cast Coil Transformer Fire Susceptibility and Reliability Study

    DTIC Science & Technology

    1991-04-01

    transformers reduce risk to the user compared to liquid-filled units, eliminate environmental impacts, are more efficient than most transformer designs, and add minimal risk to the facility in a fire situation. Cast coil transformers have a long record of operation and have proven to be reliable and

  10. Unique reliability characteristics of fully depleted silicon-on-insulator tunneling FET

    NASA Astrophysics Data System (ADS)

    Kang, Soo Cheol; Lim, Donghwan; Lim, Sung Kwan; Noh, Jinwoo; Kim, Seung-Mo; Lee, Sang Kyung; Choi, Changhwan; Lee, Byoung Hun

    2018-04-01

    This study investigated the unique reliability characteristics of tunneling field effect transistors (TFETs) by comparing the effects of positive bias temperature instability (PBTI) and hot carrier injection (HCI) stresses. In the case of HCI stress, interface trap generation near a p/n+ region was the primary degradation mechanism. However, strong recovery after a high-pressure hydrogen annealing and weak degradation at low temperature indicate that the degradation mechanism of the TFET under HCI stress differs from the permanent defect generation induced by high-energy carrier stress that is observed in MOSFETs. Further study is necessary to identify the exact location and defect species causing TFET degradation; however, a significant difference is evident between the dominant reliability mechanisms of the TFET and the MOSFET.

  11. The Comparative Reliability and Feasibility of the Past-Year Canadian Diet History Questionnaire II: Comparison of the Paper and Web Versions.

    PubMed

    Lo Siou, Geraldine; Csizmadi, Ilona; Boucher, Beatrice A; Akawung, Alianu K; Whelan, Heather K; Sharma, Michelle; Al Rajabi, Ala; Vena, Jennifer E; Kirkpatrick, Sharon I; Koushik, Anita; Massarelli, Isabelle; Rondeau, Isabelle; Robson, Paula J

    2017-02-13

    Advances in technology-enabled dietary assessment include the advent of web-based food frequency questionnaires, which may reduce costs and researcher burden but may introduce new challenges related to internet connectivity and computer literacy. The purpose of this study was to evaluate the intra- and inter-version reliability, feasibility and acceptability of the paper and web Canadian Diet History Questionnaire II (CDHQ-II) in a sub-sample of 648 adults (aged 39-81 years) recruited from Alberta's Tomorrow Project. Participants were randomly assigned to one of two groups: (1) paper, web, paper; or (2) web, paper, web over a six-week period. With few exceptions, no statistically significant differences in mean nutrient intake were found in the intra- and inter-version reliability analyses. The majority of participants indicated future willingness to complete the CDHQ-II online, and 59% indicated a preference for the web over the paper version. Findings indicate that, in this population of adults drawn from an existing cohort, the CDHQ-II may be administered in paper or web modalities (increasing flexibility for questionnaire delivery), and the nutrient estimates obtained with either version are comparable. We recommend that other studies explore the feasibility and reliability of different modes of administration of dietary assessment instruments prior to widespread implementation.

  12. Laser System Reliability

    DTIC Science & Technology

    1977-03-01

    system acquisition cycle since they provide necessary inputs to comparative analyses, cost/benefit trade-offs, and system simulations. In addition, the...Management Program from above performs the function of analyzing the system trade-offs with respect to reliability to determine a reliability goal...one encounters the problem of comparing present dollars with future dollars. In this analysis, we are trading off costs expended initially (or at

  13. Use of Jebsen Taylor Hand Function Test in evaluating the hand dexterity in people with Parkinson's disease.

    PubMed

    Mak, M K Y; Lau, E T L; Tam, V W K; Woo, C W Y; Yuen, S K Y

    2015-01-01

    To investigate the test-retest reliability of the Jebsen Taylor Hand Function Test (JTT) in older patients with Parkinson's disease (PD), and to compare JTT scores between PD and healthy subjects. Cross-sectional comparative study. Fifteen PD and fifteen healthy subjects performed the JTT, and the time taken to complete it was recorded. Test-retest reliabilities of the JTT subtests and total score for both the dominant and non-dominant hand were good to excellent (ICCs = 0.77-0.97), except for the J5 checkers subtest, which had moderate reliability. PD subjects required significantly longer time to finish the subtests and the whole JTT (p < 0.05), except for the J1 writing subtest with the dominant hand, which showed marginal significance (p = 0.059). The JTT is a reliable and easily available tool for assessing the hand function of PD subjects. PD subjects took a longer time to complete the JTT, suggesting that they have deficits in gross and fine functional dexterity. Copyright © 2015 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.

  14. Timeline historical review of income and financial transactions: a reliable assessment of personal finances.

    PubMed

    Black, Anne C; Serowik, Kristin L; Ablondi, Karen M; Rosen, Marc I

    2013-01-01

    The need for accurate and reliable information about income and resources available to individuals with psychiatric disabilities is critical for the assessment of need and evaluation of programs designed to alleviate financial hardship or affect finance allocation. Measurement of finances is ubiquitous in studies of economics, poverty, and social services. However, evidence has demonstrated that these measures often contain error. We compare the 1-week test-retest reliability of income and finance data from 24 adult psychiatric outpatients using assessment-as-usual (AAU) and a new instrument, the Timeline Historical Review of Income and Financial Transactions (THRIFT). Reliability estimates obtained with the THRIFT for Income (0.77), Expenses (0.91), and Debt (0.99) domains were significantly better than those obtained with AAU. Reliability estimates for Balance did not differ. THRIFT reduced measurement error and provided more reliable information than AAU for assessment of personal finances in psychiatric patients receiving Social Security benefits. The instrument also may be useful with other low-income groups.

  15. An improved spanning tree approach for the reliability analysis of supply chain collaborative network

    NASA Astrophysics Data System (ADS)

    Lam, C. Y.; Ip, W. H.

    2012-11-01

    A higher degree of reliability in the collaborative network can increase the competitiveness and performance of an entire supply chain. As supply chain networks grow more complex, the consequences of unreliable behaviour become increasingly severe in terms of cost, effort and time. Moreover, it is computationally difficult to calculate the network reliability of a Non-deterministic Polynomial-time hard (NP-hard) all-terminal network using state enumeration, as this may require a huge number of iterations for topology optimisation. Therefore, this paper proposes an alternative approach of an improved spanning tree for reliability analysis to help effectively evaluate and analyse the reliability of collaborative networks in supply chains and reduce the comparative computational complexity of algorithms. Set theory is employed to evaluate and model the all-terminal reliability of the improved spanning tree algorithm and present a case study of a supply chain used in lamp production to illustrate the application of the proposed approach.
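
    The state-enumeration baseline mentioned in the abstract above can be sketched in a few lines, which also shows why it scales poorly: every one of the 2^m up/down combinations of the m links has to be checked for connectivity. This is only an illustration of the brute-force approach, not the paper's improved spanning tree algorithm; the function names and the assumption that all links fail independently with the same probability are illustrative.

```python
from itertools import product


def is_connected(nodes, edges):
    """Depth-first search from the first node over undirected edges."""
    adjacency = {v: set() for v in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = set(), [nodes[0]]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adjacency[v] - seen)
    return len(seen) == len(nodes)


def all_terminal_reliability(nodes, edges, p):
    """Probability that all nodes remain mutually connected.

    Enumerates all 2**len(edges) link states; each link is assumed to
    work independently with probability p (an illustrative assumption).
    """
    total = 0.0
    for state in product([True, False], repeat=len(edges)):
        working = [e for e, up in zip(edges, state) if up]
        if is_connected(nodes, working):
            k = len(working)
            total += p ** k * (1 - p) ** (len(edges) - k)
    return total


# Hypothetical three-partner network in which every pair is linked.
triangle_nodes = ["supplier", "manufacturer", "distributor"]
triangle_edges = [
    ("supplier", "manufacturer"),
    ("manufacturer", "distributor"),
    ("supplier", "distributor"),
]
print(all_terminal_reliability(triangle_nodes, triangle_edges, 0.9))
```

    For the triangle, the network stays connected whenever at least two of the three links work, so the result equals p^3 + 3p^2(1 - p). Adding one link doubles the number of states to check, which is the computational burden the abstract refers to.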

  16. Achieving Reliable Communication in Dynamic Emergency Responses

    PubMed Central

    Chipara, Octav; Plymoth, Anders N.; Liu, Fang; Huang, Ricky; Evans, Brian; Johansson, Per; Rao, Ramesh; Griswold, William G.

    2011-01-01

    Emergency responses require the coordination of first responders to assess the condition of victims, stabilize their condition, and transport them to hospitals based on the severity of their injuries. WIISARD is a system designed to facilitate the collection of medical information and its reliable dissemination during emergency responses. A key challenge in WIISARD is to deliver data with high reliability as first responders move and operate in a dynamic radio environment fraught with frequent network disconnections. The initial WIISARD system employed a client-server architecture and an ad-hoc routing protocol was used to exchange data. The system had low reliability when deployed during emergency drills. In this paper, we identify the underlying causes of unreliability and propose a novel peer-to-peer architecture that in combination with a gossip-based communication protocol achieves high reliability. Empirical studies show that compared to the initial WIISARD system, the redesigned system improves reliability by as much as 37% while reducing the number of transmitted packets by 23%. PMID:22195075

  17. Reliability Analysis of a Glacier Lake Warning System Using a Bayesian Net

    NASA Astrophysics Data System (ADS)

    Sturny, Rouven A.; Bründl, Michael

    2013-04-01

    Besides structural mitigation measures like avalanche defense structures, dams and galleries, warning and alarm systems have become important measures for dealing with Alpine natural hazards. Integrating them into risk mitigation strategies and comparing their effectiveness with structural measures requires quantification of the reliability of these systems. However, little is known about how the reliability of warning systems can be quantified and which methods are suitable for comparing their contribution to risk reduction with that of structural mitigation measures. We present a reliability analysis of a warning system located in Grindelwald, Switzerland. The warning system was built to warn and protect residents and tourists from glacier outburst floods resulting from rapid drainage of the glacier lake. We have set up a Bayesian Net (BN, BPN) that allowed for a qualitative and quantitative reliability analysis. The Conditional Probability Tables (CPT) of the BN were determined according to the manufacturer's reliability data for each component of the system as well as by assigning weights for specific BN nodes accounting for information flows and decision-making processes of the local safety service. The presented results focus on the two alerting units 'visual acoustic signal' (VAS) and 'alerting of the intervention entities' (AIE). For the summer of 2009, the reliability was determined to be 94 % for the VAS and 83 % for the AIE. The probability of occurrence of a major event was calculated as 0.55 % per day, resulting in an overall reliability of 99.967 % for the VAS and 99.906 % for the AIE. We concluded that a failure of the VAS alerting unit would be the consequence of a simultaneous failure of the four probes located in the lake and the gorge. Similarly, we deduced that the AIE would fail if there were a simultaneous connectivity loss of the mobile and fixed network in Grindelwald, an Internet access loss or a failure of the regional operations centre. However, the probability of a common failure of these components was assumed to be low. Overall it can be stated that, due to numerous redundancies, the investigated warning system is highly reliable and its influence on risk reduction is very high. Comparable studies are needed in the future to classify these results and to gain more experience of how the reliability of warning systems could be determined in practice.
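
    The overall reliability figures in the abstract above are consistent with a simple reading in which an alerting unit's failure only matters on a day when a major event actually occurs. The sketch below works through that arithmetic; the formula is an assumption that happens to reproduce the reported numbers, not necessarily the calculation performed with the authors' Bayesian net, and the function name is illustrative.

```python
def overall_reliability(unit_reliability, daily_event_probability):
    """Probability that, on a given day, no major event goes un-alerted.

    Assumed reading (not confirmed by the abstract): an un-alerted event
    requires both an event and a unit failure, so
    P(un-alerted event) = P(event) * (1 - unit reliability).
    """
    return 1.0 - daily_event_probability * (1.0 - unit_reliability)


P_EVENT = 0.0055  # 0.55 % per day, as reported in the abstract

vas = overall_reliability(0.94, P_EVENT)  # visual acoustic signal
aie = overall_reliability(0.83, P_EVENT)  # alerting of the intervention entities

print(f"VAS overall reliability: {vas * 100:.4f} %")
print(f"Intervention-entity alerting overall reliability: {aie * 100:.4f} %")
```

    Under this reading the visual acoustic signal comes out at 99.967 % and the alerting of the intervention entities at about 99.9065 %, in line with the reported 99.967 % and 99.906 %.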

  18. Vestibular Assessments in Children With Global Developmental Delay: An Exploratory Study.

    PubMed

    Dannenbaum, Elizabeth; Horne, Victoria; Malik, Farwa; Villeneuve, Myriam; Salvo, Lora; Chilingaryan, Gevorg; Lamontagne, Anouk

    2016-01-01

    To compare results of 3 clinical vestibular tests between children with global developmental delay (GDD) and children with typical development (TD) and investigate the test-retest reliability. Twenty children with GDD (aged 4.1-12.1 years) and 11 age-matched controls with TD participated. Participants with GDD underwent 2 sessions of testing. Each session consisted of the Clinical Test of Sensory Interaction and Balance (CTSIB), Dynamic Visual Acuity (DVA) test, and the modified Emory Clinical Vestibular Chair Test (m-ECVCT). Up to 33% of the children with GDD had abnormal DVA scores. m-ECVCT results of children with GDD demonstrated larger variance than those of children with TD. The CTSIB score was significantly reduced in the group with GDD. The test-retest reliability varied, with good reliability for the m-ECVCT and CTSIB, and fair reliability for the DVA. Findings suggest vestibular involvement in children with GDD. The clinical tests demonstrated moderate test-retest reliability.

  19. Reliability Analysis of Sealing Structure of Electromechanical System Based on Kriging Model

    NASA Astrophysics Data System (ADS)

    Zhang, F.; Wang, Y. M.; Chen, R. W.; Deng, W. W.; Gao, Y.

    2018-05-01

    The sealing performance of an aircraft electromechanical system has a great influence on flight safety, so the reliability of its typical seal structure is analyzed. In this paper, we take a reciprocating seal structure as the research object to study structural reliability. Using the finite element numerical simulation method, the contact stress between the rubber sealing ring and the cylinder wall is calculated, the relationship between the contact stress and the pressure of the hydraulic medium is established, and the friction forces under different working conditions are compared. Through co-simulation, the adaptive Kriging model obtained by the EFF learning mechanism is used to describe the failure probability of the seal ring, so as to evaluate the reliability of the sealing structure. This article proposes a new idea of numerical evaluation for the reliability analysis of sealing structures, and also provides a theoretical basis for their optimal design.

  20. Operation Reliability Assessment for Cutting Tools by Applying a Proportional Covariate Model to Condition Monitoring Information

    PubMed Central

    Cai, Gaigai; Chen, Xuefeng; Li, Bing; Chen, Baojia; He, Zhengjia

    2012-01-01

    The reliability of cutting tools is critical to machining precision and production efficiency. The conventional statistic-based reliability assessment method aims at providing a general and overall estimation of reliability for a large population of identical units under given and fixed conditions. However, it has limited effectiveness in depicting the operational characteristics of a cutting tool. To overcome this limitation, this paper proposes an approach to assess the operation reliability of cutting tools. A proportional covariate model is introduced to construct the relationship between operation reliability and condition monitoring information. The wavelet packet transform and an improved distance evaluation technique are used to extract sensitive features from vibration signals, and a covariate function is constructed based on the proportional covariate model. Ultimately, the failure rate function of the cutting tool being assessed is calculated using the baseline covariate function obtained from a small sample of historical data. Experimental results and a comparative study show that the proposed method is effective for assessing the operation reliability of cutting tools. PMID:23201980

  1. [Reliability and validity of the Braden Scale for predicting pressure sore risk].

    PubMed

    Boes, C

    2000-12-01

    For more accurate and objective pressure sore risk assessment, various risk assessment tools were developed, mainly in the USA and Great Britain. The Braden Scale for Predicting Pressure Sore Risk is one such example. By means of a literature analysis of German and English texts referring to the Braden Scale, the scientific criteria of reliability and validity are examined and consequences for application of the scale in Germany are shown. Analysis of 4 reliability studies shows an exclusive focus on interrater reliability. Further, even though the 19 validity studies were conducted in many different settings, they are limited to the criteria of sensitivity and specificity (accuracy). Sensitivity and specificity range from 35% to 100%. The recommended cut-off points range from 10 to 19 points. The studies prove not to be comparable with each other. Furthermore, distortions that affect the accuracy of the scale can be found in these studies. The results of the analysis presented here show insufficient proof of reliability and validity in the American studies. In Germany, the Braden Scale has not yet been tested under scientific criteria. Such testing is needed before using the scale in different German settings. In the course of such testing, the design and study procedures of the American studies can serve as a basis, as can the problems identified in this analysis.
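
    The dependence of sensitivity and specificity on the chosen cut-off point, which the abstract above reports as varying widely between studies, can be illustrated with a short calculation. The sketch below assumes the usual reading of the Braden Scale, in which a lower total score means higher risk, so a patient is flagged as at risk when the score is at or below the cut-off; the function name and the ten patient records are hypothetical and are not drawn from any of the reviewed studies.

```python
def sensitivity_specificity(scores, developed_sore, cutoff):
    """Return (sensitivity, specificity) for a 'score <= cutoff' rule.

    Lower scores are treated as higher risk. `developed_sore` holds the
    observed outcome (True if a pressure sore developed) per patient.
    """
    tp = fn = tn = fp = 0
    for score, sore in zip(scores, developed_sore):
        flagged = score <= cutoff
        if sore and flagged:
            tp += 1      # at-risk patient correctly flagged
        elif sore:
            fn += 1      # at-risk patient missed
        elif flagged:
            fp += 1      # patient flagged without developing a sore
        else:
            tn += 1      # patient correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patients: total scale score and observed outcome.
scores = [9, 12, 14, 16, 17, 18, 19, 20, 21, 22]
developed_sore = [True, True, True, False, True,
                  False, False, False, False, False]

for cutoff in (16, 18):
    sens, spec = sensitivity_specificity(scores, developed_sore, cutoff)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

    Raising the cut-off flags more patients, which increases sensitivity at the expense of specificity; this trade-off is one reason results obtained with different cut-off points are hard to compare directly.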

  2. Research on Novel Algorithms for Smart Grid Reliability Assessment and Economic Dispatch

    NASA Astrophysics Data System (ADS)

    Luo, Wenjin

    In this dissertation, several studies of electric power system reliability and economy assessment methods are presented. To be more precise, several algorithms for evaluating power system reliability and economy are studied. Furthermore, two novel algorithms are applied to this field and their simulation results are compared with conventional results. As the electrical power system develops towards extra-high voltage, long-distance transmission, large capacity and regional networking, a number of new technologies and equipment are being applied and the electricity market system has gradually been established, and the consequences of power cuts have become more and more serious. The electrical power system needs the highest possible reliability because of its complexity and security requirements. In this dissertation the Boolean logic Driven Markov Process (BDMP) method is studied and applied to evaluate power system reliability. This approach has several benefits. It allows complex dynamic models to be defined while remaining as easy to read as conventional methods. This method has been applied to evaluate the IEEE reliability test system. The simulation results obtained are close to the IEEE experimental data, which means that the method could be used for future study of system reliability. Besides being reliable, a modern power system is expected to be more economical. This dissertation presents a novel evolutionary algorithm named the quantum evolutionary membrane algorithm (QEPS), which combines the concept and theory of the quantum-inspired evolutionary algorithm and membrane computation, to solve the economic dispatch problem in a renewable power system with onshore and offshore wind farms. A case derived from real data is used for simulation tests. Another, conventional evolutionary algorithm is also used to solve the same problem for comparison. The experimental results show that the proposed method obtains the optimal solution, the minimum cost of electricity supplied by the wind farm system, quickly and accurately.

  3. DIRECT operational field test evaluation natural use study. Part 4, Recommendations for expanded deployment

    DOT National Transportation Integrated Search

    1998-08-01

    The DIRECT project compared four low-cost driver information systems. Of the four that were compared, the RDS approach proved superior to the others in toggling reliability and voice quality. The DIRECT project planned to expand the implementation ...

  4. Radiologic analysis of hindfoot alignment: Comparison of Méary, long axial, and hindfoot alignment views.

    PubMed

    Neri, T; Barthelemy, R; Tourné, Y

    2017-12-01

    Among radiographic views available for assessing hindfoot alignment, the antero-posterior weight-bearing view with metal cerclage of the hindfoot (Méary view) is the most widely used in France. Internationally, the long axial view (LAV) and hindfoot alignment view (HAV) are also used. The objective of this study was to compare the reliability of these three views. Hypothesis: The Méary view with cerclage of the hindfoot is as reliable as the LAV and HAV for assessing hindfoot alignment. All three views were obtained in each of 22 prospectively included patients. Intra-observer and inter-observer reliabilities were assessed by having two observers collect the radiographic measurements and then computing the intra-class correlation coefficients (ICCs). The intra-observer and inter-observer ICCs were 0.956 and 0.988 with the Méary view, 0.990 and 0.765 with the HAV, and 0.997 and 0.991 with the LAV, respectively. Correlations were far stronger between the LAV and HAV than between each of these and the Méary view. Compared to the LAV and HAV, the Méary view indicated a greater degree of hindfoot valgus. Intra-observer reliability was excellent with both the LAV and HAV, whereas inter-observer reliability was better with the LAV. Excellent reliability was also obtained with the Méary view. Combining the Méary view, to obtain a radiographic image of the clinical deformity, with the LAV, to measure the angular deviation of the hindfoot axis, may be useful when assessing hindfoot malalignment. A comparison of the three views in a larger population is needed before clinical recommendations can be made. Level of evidence: II, prospective study. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  5. Wearable Lactate Threshold Predicting Device is Valid and Reliable in Runners.

    PubMed

    Borges, Nattai R; Driller, Matthew W

    2016-08-01

    Borges, NR and Driller, MW. Wearable lactate threshold predicting device is valid and reliable in runners. J Strength Cond Res 30(8): 2212-2218, 2016. A commercially available device claiming to be the world's first wearable lactate threshold predicting device (WLT), using near-infrared LED technology, has entered the market. The aim of this study was to determine the levels of agreement between the WLT-derived lactate threshold workload and traditional methods of lactate threshold (LT) calculation and the interdevice and intradevice reliability of the WLT. Fourteen (7 male, 7 female; mean ± SD; age: 18-45 years, height: 169 ± 9 cm, mass: 67 ± 13 kg, V̇O2max: 53 ± 9 ml·kg⁻¹·min⁻¹) subjects ranging from recreationally active to highly trained athletes completed an incremental exercise test to exhaustion on a treadmill. Blood lactate samples were taken at the end of each 3-minute stage during the test to determine lactate threshold using 5 traditional methods from blood lactate analysis, which were then compared against the WLT predicted value. In a subset of the population (n = 12), repeat trials were performed to determine both interdevice and intradevice reliability of the WLT device. Intraclass correlation coefficient (ICC) found high to very high agreement between the WLT and traditional methods (ICC > 0.80), with TEMs and mean differences ranging between 3.9-10.2% and 1.3-9.4%. Both interdevice and intradevice reliability resulted in highly reproducible and comparable results (CV < 1.2%, TEM < 0.2 km·h⁻¹, ICC > 0.97). This study suggests that the WLT is a practical, reliable, and noninvasive tool for use in predicting LT in runners.

  6. The prone bridge test: Performance, validity, and reliability among older and younger adults.

    PubMed

    Bohannon, Richard W; Steffl, Michal; Glenney, Susan S; Green, Michelle; Cashwell, Leah; Prajerova, Kveta; Bunn, Jennifer

    2018-04-01

    The prone bridge maneuver, or plank, has been viewed as a potential alternative to curl-ups for assessing trunk muscle performance. The purpose of this study was to assess prone bridge test performance, validity, and reliability among younger and older adults. Sixty younger (20-35 years old) and 60 older (60-79 years old) participants completed this study. Groups were evenly divided by sex. Participants completed surveys regarding physical activity and abdominal exercise participation. Height, weight, body mass index (BMI), and waist circumference were measured. On two occasions, 5-9 days apart, participants held a prone bridge until volitional exhaustion or until repeated technique failure. Validity was examined using data from the first session: convergent validity by calculating correlations between survey responses, anthropometrics, and prone bridge time; known-groups validity by using an ANOVA comparing bridge times of younger and older adults and of men and women. Test-retest reliability was examined by using a paired t-test to compare prone bridge times for Session 1 and Session 2. Furthermore, an intraclass correlation coefficient (ICC) was used to characterize relative reliability and minimal detectable change (MDC95%) was used to describe absolute reliability. The mean prone bridge time was 145.3 ± 71.5 s, and was positively correlated with physical activity participation (p ≤ 0.001) and negatively correlated with BMI and waist circumference (p ≤ 0.003). Younger participants had significantly longer plank times than older participants (p = 0.003). The ICC between testing sessions was 0.915. The prone bridge test is a valid and reliable measure for evaluating abdominal performance in both younger and older adults. Copyright © 2017 Elsevier Ltd. All rights reserved.
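
    The abstract above reports an ICC and mentions MDC95% without giving its value. As a minimal sketch of how the two usually relate, the snippet below uses the common textbook formulas SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM. The function names are hypothetical, and feeding in the reported SD of 71.5 s is an assumption made only for illustration; the authors may have computed SEM from a different SD, so the result should not be read as the study's MDC.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a between-subject SD and an ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd: float, icc: float) -> float:
    """Minimal detectable change at the 95% level (same units as sd)."""
    # sqrt(2) accounts for measurement error in both test and retest.
    return 1.96 * math.sqrt(2.0) * sem_from_icc(sd, icc)


if __name__ == "__main__":
    # Illustrative inputs only (assumed): SD and ICC taken from the abstract.
    print(f"SEM   = {sem_from_icc(71.5, 0.915):.1f} s")
    print(f"MDC95 = {mdc95(71.5, 0.915):.1f} s")
```

    A change smaller than the MDC95 cannot be distinguished from measurement error with 95% confidence, which is why it is described as a measure of absolute reliability.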

  7. Validity and inter-observer reliability of subjective hand-arm vibration assessments.

    PubMed

    Coenen, Pieter; Formanoy, Margriet; Douwes, Marjolein; Bosch, Tim; de Kraker, Heleen

    2014-07-01

    Exposure to mechanical vibrations at work (e.g., due to handling powered tools) is a potential occupational risk as it may cause upper extremity complaints. However, reliable and valid assessment methods for vibration exposure at work are lacking. Measuring hand-arm vibration objectively is often difficult and expensive, while the often-used information provided by manufacturers lacks detail. Therefore, a subjective hand-arm vibration assessment method was tested on validity and inter-observer reliability. In an experimental protocol, sixteen tasks handling powered tools were executed by two workers. Hand-arm vibration was assessed subjectively by 16 observers according to the proposed subjective assessment method. As a gold standard reference, hand-arm vibration was measured objectively using a vibration measurement device. Weighted κ's were calculated to assess validity, and intra-class correlation coefficients (ICCs) were calculated to assess inter-observer reliability. Inter-observer reliability of the subjective assessments depicting the agreement among observers can be expressed by an ICC of 0.708 (0.511-0.873). The validity of the subjective assessments as compared to the gold-standard reference can be expressed by a weighted κ of 0.535 (0.285-0.785). In addition, the percentage of exact agreement of the subjective assessment compared to the objective measurement was relatively low (i.e., 52% of all tasks). This study shows that subjectively assessed hand-arm vibrations are fairly reliable among observers and moderately valid. This assessment method is a first attempt to use subjective risk assessments of hand-arm vibration. Although this assessment method can benefit from some future improvement, it can be of use in future studies and in field-based ergonomic assessments. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  8. Reliability of the modified Tufts Lumbar Degenerative Disc Classification between neurosurgeons and neuroradiologists.

    PubMed

    Burke, Shane M; Hwang, Steven W; Mehan, William A; Bedi, Harprit S; Ogbuji, Richard; Riesenburger, Ron I

    2016-07-01

    Cross-specialty inter-rater reliability has not been explicitly reported for imaging characteristics that are thought to be important in lumbar intervertebral disc degeneration. Sufficient cross-specialty reliability is an essential consideration if radiographic stratification of symptomatic patients to specific treatment modalities is to ever be realized. Therefore, the purpose of this study was to directly compare the assessment of such characteristics between neurosurgeons and neuroradiologists. Sixty consecutive patients with a diagnosis of lumbago and appropriate imaging were selected for inclusion. Lumbar MRI scans were evaluated using the Tufts Degenerative Disc Classification by two neurosurgeons and two neuroradiologists. Inter-rater reliability was assessed using Cohen's κ values both within and between specialties. A sensitivity analysis was performed for a modified grading system, which excluded high intensity zones (HIZ) because of their poor cross-specialty inter-rater reliability. The reliability of HIZ between neurosurgeons and neuroradiologists was fair in two of the four cross-specialty comparisons in this study (neurosurgeon 1 versus both radiologists κ=0.364 and κ=0.290). Removing HIZ from the classification improved inter-rater reliability for all comparisons within and between specialties (0.465⩽κ⩽0.576). In addition, intra-rater reliability remained in the moderate to substantial range (0.523⩽κ⩽0.649). Given our findings and corroboration with previous studies, identification of HIZ seems to have a markedly variable reliability. Thus we recommend modification of the original Tufts Degenerative Disc Classification by removing HIZ in order to make the overall grade provided by this classification more reproducible when scored by practitioners of different training backgrounds. Copyright © 2015 Elsevier Ltd. All rights reserved.
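
    Cohen's κ, the statistic used in the record above, corrects the observed agreement between two raters for the agreement expected by chance. The following is a minimal, unweighted sketch for illustration; the function name and the toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Toy example (assumed data): 1 = feature present, 0 = feature absent.
    surgeon = [1, 1, 0, 0]
    radiologist = [1, 0, 0, 0]
    print(cohen_kappa(surgeon, radiologist))
```

    Because chance agreement depends on how the ratings are distributed across categories, κ can be low even when raw percent agreement is high, so the two figures are often reported together.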

  9. Reliability of Measurement of Glenohumeral Internal Rotation, External Rotation, and Total Arc of Motion in 3 Test Positions

    PubMed Central

    Kevern, Mark A.; Beecher, Michael; Rao, Smita

    2014-01-01

    Context: Athletes who participate in throwing and racket sports consistently demonstrate adaptive changes in glenohumeral-joint internal and external rotation in the dominant arm. Measurements of these motions have demonstrated excellent intrarater and poor interrater reliability. Objective: To determine intrarater reliability, interrater reliability, and standard error of measurement for shoulder internal rotation, external rotation, and total arc of motion using an inclinometer in 3 testing procedures in National Collegiate Athletic Association Division I baseball and softball athletes. Design: Cross-sectional study. Setting: Athletic department. Patients or Other Participants: Thirty-eight players participated in the study. Shoulder internal rotation, external rotation, and total arc of motion were measured by 2 investigators in 3 test positions. The standard supine position was compared with a side-lying test position, as well as a supine test position without examiner overpressure. Results: Excellent intrarater reliability was noted for all 3 test positions and ranges of motion, with intraclass correlation coefficient values ranging from 0.93 to 0.99. Results for interrater reliability were less favorable. Reliability for internal rotation was highest in the side-lying position (0.68) and reliability for external rotation and total arc was highest in the supine-without-overpressure position (0.774 and 0.713, respectively). The supine-with-overpressure position yielded the lowest interrater reliability results in all positions. The side-lying position had the most consistent results, with very little variation among intraclass correlation coefficient values for the various test positions. Conclusions: The results of our study clearly indicate that the side-lying test procedure is of equal or greater value than the traditional supine-with-overpressure method. PMID:25188316

  10. Test-retest reliability of sensor-based sit-to-stand measures in young and older adults.

    PubMed

    Regterschot, G Ruben H; Zhang, Wei; Baldus, Heribert; Stevens, Martin; Zijlstra, Wiebren

    2014-01-01

    This study investigated test-retest reliability of sensor-based sit-to-stand (STS) peak power and other STS measures in young and older adults. In addition, test-retest reliability of the sensor method was compared to test-retest reliability of the Timed Up and Go Test (TUGT) and Five-Times-Sit-to-Stand Test (FTSST) in older adults. Ten healthy young female adults (20-23 years) and 31 older adults (21 females; 73-94 years) participated in two assessment sessions separated by 3-8 days. Vertical peak power was assessed during three (young adults) and five (older adults) normal and fast STS trials with a hybrid motion sensor worn on the hip. Older adults also performed the FTSST and TUGT. The average sensor-based STS peak power of the normal STS trials and the average sensor-based STS peak power of the fast STS trials showed excellent test-retest reliability in young adults (intra-class correlation (ICC)≥0.90; zero in 95% confidence interval of mean difference between test and retest (95%CI of D); standard error of measurement (SEM)≤6.7% of mean peak power) and older adults (ICC≥0.91; zero in 95%CI of D; SEM≤9.9%). Test-retest reliability of sensor-based STS peak power and TUGT (ICC=0.98; zero in 95%CI of D; SEM=8.5%) was comparable in older adults, whereas test-retest reliability of the FTSST was lower (ICC=0.73; zero outside 95%CI of D; SEM=14.4%). Sensor-based STS peak power demonstrated excellent test-retest reliability and may therefore be useful for clinical assessment of functional status and fall risk. Copyright © 2014 Elsevier B.V. All rights reserved.
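
    Several records in this listing report intra-class correlation coefficients without stating which ICC form was used. As a hedged illustration only, the sketch below computes one common variant, the two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)), from a subjects-by-sessions table. The function name and the toy data are hypothetical; the studies above may have used a different ICC model.

```python
from typing import Sequence


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): rows are subjects, columns are sessions or raters."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 rows and >= 2 columns")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, sessions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Toy test-retest data (assumed): every subject scores 1 unit higher at retest.
    print(icc_2_1([[1, 2], [2, 3], [3, 4]]))
```

    In the toy data the subjects keep their rank order perfectly, yet the coefficient falls below 1 because an absolute-agreement ICC penalizes the systematic shift between sessions.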

  11. The Research Diagnostic Criteria for Temporomandibular Disorders. I: overview and methodology for assessment of validity.

    PubMed

    Schiffman, Eric L; Truelove, Edmond L; Ohrbach, Richard; Anderson, Gary C; John, Mike T; List, Thomas; Look, John O

    2010-01-01

    The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. The aim of this article is to provide an overview of the project's methodology, descriptive statistics, and data for the study participant sample. This article also details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. The Axis I reference standards were based on the consensus of two criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion examination reliability was also assessed within study sites. Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, κ = 0.53). Intrasite criterion examiner agreement with reference standards was excellent (κ ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (κ = 0.71 and 0.84, respectively). The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods.

  12. Testing two methods to create comparable scale scores between the Job Content Questionnaire (JCQ) and JCQ-like questionnaires in the European JACE Study.

    PubMed

    Karasek, Robert; Choi, BongKyoo; Ostergren, Per-Olof; Ferrario, Marco; De Smet, Patrick

    2007-01-01

    The comparative scale properties of "JCQ-like" questionnaires with respect to the JCQ are little known. This study assessed the validity and reliability of two methods for generating comparable scale scores between the Job Content Questionnaire (JCQ) and JCQ-like questionnaires in sub-populations of the large Job Stress, Absenteeism and Coronary Heart Disease European Cooperative (JACE) study: the Swedish version of Demand-Control Questionnaire (DCQ) and a transformed Multinational Monitoring of Trends and Determinants in Cardiovascular Disease Project (MONICA) questionnaire. A random population sample of all Malmo males and females aged 52-58 years (n = 682) was given a new test questionnaire with both instruments (the JCQ and the DCQ). Comparability-facilitating algorithms were created (Method I). For the transformed Milan MONICA questionnaire, a simple weighting system was used (Method II). The converted scale scores from the JCQ-like questionnaires were found to be reliable and highly correlated to those of the original JCQ. However, agreements for the high job strain group between the JCQ and the DCQ, and between the JCQ and the DCQ (Method I applied) were only moderate (Kappa). Use of a multiple level job strain scale generated higher levels of job strain agreement, as did a new job strain definition that excludes the intermediate levels of the job strain distribution. The two methods were valid and generally reliable.

  13. Can the Fatigue Severity Scale 7-item version be used across different patient populations as a generic fatigue measure - a comparative study using a Rasch model approach

    PubMed Central

    2014-01-01

    Background: Fatigue is a disabling symptom associated with reduced quality of life in various populations living with chronic illnesses. The transfer of knowledge about fatigue from one group to another is crucial in both research and healthcare. Outcomes should be validly and reliably comparable between groups and should not be unduly influenced by diagnostic variations. The present study evaluates whether the Fatigue Severity Scale 7-item version (FSS-7) demonstrates similar item hierarchy across people with multiple sclerosis, stroke or HIV/AIDS to ensure valid comparisons between groups, and provide further evidence of internal scale validity. Methods: A secondary comparative analysis was performed using data from three different studies of three different chronic illnesses: multiple sclerosis, stroke and HIV/AIDS. Each of these studies had previously concluded that the FSS-7 has better psychometric properties than the original FSS for measuring fatigue interference. Data from 224 people with multiple sclerosis, 104 people with stroke and 316 people with HIV/AIDS were examined. Item response theory and a Rasch model were chosen to analyze the similarity of the FSS-7 item hierarchy across the three diagnostic groups. Results: Cross-sample differences were found for items #3, #5, #6 and #9 for two of the three samples, which raise questions about item validity across groups. However, disease-specific and disease-generic Rasch measures were similar across samples, indicating that individual fatigue interference measures in these three chronic illnesses might still be reliably comparable using the FSS-7. Conclusions: Some items performed differently between the three samples but did not bias person measures, thereby indicating that fatigue interference in these illnesses might still be reliably compared using FSS-7 scores. However, caution is warranted when comparing fatigue raw sum scores directly across diagnostic groups using the FSS-7. Further studies of the scale are needed in other types of chronic illnesses. PMID:24559076

  14. Predictors of validity and reliability of a physical activity record in adolescents

    PubMed Central

    2013-01-01

    Background: Poor to moderate validity of self-reported physical activity instruments is commonly observed in young people in low- and middle-income countries. However, the reasons for such low validity have not been examined in detail. We tested the validity of a self-administered daily physical activity record in adolescents and assessed if personal characteristics or the convenience level of reporting physical activity modified the validity estimates. Methods: The study comprised a total of 302 adolescents from an urban and rural area in Ecuador. Validity was evaluated by comparing the record with accelerometer recordings for seven consecutive days. Test-retest reliability was examined by comparing registrations from two records administered three weeks apart. Time spent on sedentary (SED), low (LPA), moderate (MPA) and vigorous (VPA) intensity physical activity was estimated. Bland-Altman plots were used to evaluate measurement agreement. We assessed if age, sex, urban or rural setting, anthropometry and convenience of completing the record explained differences in validity estimates using a linear mixed model. Results: Although the record provided higher estimates for SED and VPA and lower estimates for LPA and MPA compared to the accelerometer, it showed an overall fair measurement agreement for validity. There was modest reliability for assessing physical activity in each intensity level. Validity was associated with adolescents’ personal characteristics: sex (SED: P = 0.007; LPA: P = 0.001; VPA: P = 0.009) and setting (LPA: P = 0.000; MPA: P = 0.047). Reliability was associated with the convenience of completing the physical activity record for LPA (low convenience: P = 0.014; high convenience: P = 0.045). Conclusions: The physical activity record provided acceptable estimates for reliability and validity on a group level. Sex and setting were associated with validity estimates, whereas convenience to fill out the record was associated with better reliability estimates for LPA. This tendency of improved reliability estimates for adolescents reporting higher convenience merits further consideration. PMID:24289296
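
    The record above evaluates measurement agreement with Bland-Altman plots. The quantities such a plot displays, the mean difference (bias) and the 95% limits of agreement, can be sketched as follows. The function name and the toy minutes-per-day values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(method_a: Sequence[float], method_b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = stdev(diffs)  # sample SD of the paired differences
    # 95% limits of agreement, assuming roughly normal differences.
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


if __name__ == "__main__":
    # Toy example (assumed data): self-reported vs accelerometer minutes per day.
    record = [10, 12, 14]
    accelerometer = [9, 12, 15]
    bias, lower, upper = bland_altman(record, accelerometer)
    print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

    A positive bias means the first method reads higher on average; the width of the limits shows how far an individual pair of measurements may plausibly disagree.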

  15. Validation of Clinical Observations of Mastication in Persons with ALS.

    PubMed

    Simione, Meg; Wilson, Erin M; Yunusova, Yana; Green, Jordan R

    2016-06-01

    Amyotrophic lateral sclerosis (ALS) is a progressive neurological disease that can result in difficulties with mastication leading to malnutrition, choking or aspiration, and reduced quality of life. When evaluating mastication, clinicians primarily observe spatial and temporal aspects of jaw motion. The reliability and validity of clinical observations for detecting jaw movement abnormalities are unknown. The purpose of this study is to determine the reliability and validity of clinician-based ratings of chewing performance in neuro-typical controls and persons with varying degrees of chewing impairments due to ALS. Adults chewed a solid food consistency while full-face video was recorded along with jaw kinematic data using a 3D optical motion capture system. Five experienced speech-language pathologists watched the videos and rated the spatial and temporal aspects of chewing performance. The jaw kinematic data served as the gold standard for validating the clinicians' ratings. Results showed that the clinician-based rating of temporal aspects of chewing performance had strong inter-rater reliability and correlated well with comparable kinematic measures. In contrast, the reliability of rating the spatial and spatiotemporal aspects of chewing (i.e., range of motion of the jaw, consistency of the chewing pattern) was mixed. Specifically, ratings of range of motion were at best only moderately reliable. Ratings of chewing movement consistency were reliable but only weakly correlated with comparable measures of jaw kinematics. These findings suggest that clinician ratings of temporal aspects of chewing are appropriate for clinical use, whereas ratings of the spatial and spatiotemporal aspects of chewing may not be reliable or valid.

  16. The reliability of humerothoracic angles during arm elevation depends on the representation of rotations.

    PubMed

    López-Pascual, Juan; Cáceres, Magda Liliana; De Rosario, Helios; Page, Álvaro

    2016-02-08

    The reliability of joint rotation measurements is an issue of major interest, especially in clinical applications. The effect of instrumental errors and soft tissue artifacts on the variability of human motion measures is well known, but the influence of the representation of joint motion has not yet been studied. The aim of the study was to compare the within-subject reliability of three rotation formalisms for the calculation of the shoulder elevation joint angles. Five repetitions of humeral elevation in the scapular plane of 27 healthy subjects were recorded using a stereophotogrammetry system. The humerothoracic joint angles were calculated using the YX'Y" and XZ'Y" Euler angle sequences and the attitude vector. A within-subject repeatability study was performed for the three representations. ICC, SEM and CV were the indices used to estimate the error in the calculation of the angle amplitudes and the angular waveforms with each method. Excellent results were obtained in all representations for the main angle (elevation), but there were remarkable differences for axial rotation and plane of elevation. The YX'Y" sequence generally had the poorest reliability in the secondary angles. The XZ'Y" sequence proved to be the most reliable representation of axial rotation, whereas the attitude vector had the highest reliability in the plane of elevation. These results highlight the importance of selecting the method used to describe the joint motion when within-subject reliability is an important issue of the experiment. This may be of particular importance when the secondary angles of motions are being studied. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. A systematic review of reliability and objective criterion-related validity of physical activity questionnaires.

    PubMed

    Helmerhorst, Hendrik J F; Brage, Søren; Warren, Janet; Besson, Herve; Ekelund, Ulf

    2012-08-31

    Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62-0.71 for existing, and 0.74-0.76 for new PAQs. Median validity coefficients ranged from 0.30-0.39 for existing, and from 0.25-0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument.

  18. Braden scale (ALB) for assessing pressure ulcer risk in hospital patients: A validity and reliability study.

    PubMed

    Chen, Hong-Lin; Cao, Ying-Juan; Zhang, Wei; Wang, Jing; Huai, Bao-Sha

    2017-02-01

    The inter-rater reliability of the Braden Scale is limited. We modified the Braden scale into the Braden(ALB) scale by defining the nutrition subscale based on serum albumin, then assessed its validity and reliability in hospital patients. We designed a retrospective study for validity analysis, and a prospective study for reliability analysis. The receiver operating characteristic (ROC) curve and area under the curve (AUC) were used to evaluate the predictive validity. Intra-class correlation coefficient (ICC) was used to investigate the inter-rater reliability. Two thousand five hundred twenty-five patients were included for validity analysis, of whom 76 (3.0%) developed pressure ulcers. Positive correlation was found between serum albumin and nutrition score in Braden scale (Spearman's coefficient 0.2203, P<0.0001). The AUCs for Braden scale and Braden(ALB) scale predicting pressure ulcer risk were 0.813 (95% CI 0.797-0.828; P<0.0001), and 0.859 (95% CI 0.845-0.872; P<0.0001), respectively. The Braden(ALB) scale appeared more valid than the Braden scale, but the difference did not reach statistical significance (z=1.860, P=0.0628). In different age subgroups, the Braden(ALB) scale also seemed more valid than the original Braden scale, but no statistically significant differences were found (P>0.05). The inter-rater reliability study showed that the ICC value increased by 45.9% for the nutrition subscale and by 4.3% for the total score. The Braden(ALB) scale has similar validity compared with the original Braden scale for hospital patients. However, the inter-rater reliability was significantly increased. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. A systematic review of reliability and objective criterion-related validity of physical activity questionnaires

    PubMed Central

    2012-01-01

    Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62–0.71 for existing, and 0.74–0.76 for new PAQs. Median validity coefficients ranged from 0.30–0.39 for existing, and from 0.25–0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument. PMID:22938557

  20. Web-Based Assessment of Mental Well-Being in Early Adolescence: A Reliability Study.

    PubMed

    Hamann, Christoph; Schultze-Lutter, Frauke; Tarokh, Leila

    2016-06-15

    The ever-increasing use of the Internet among adolescents represents an emerging opportunity for researchers to gain access to larger samples, which can be queried over several years longitudinally. Among adolescents, young adolescents (ages 11 to 13 years) are of particular interest to clinicians as this is a transitional stage, during which depressive and anxiety symptoms often emerge. However, it remains unclear whether these youngest adolescents can accurately answer questions about their mental well-being using a Web-based platform. The aim of the study was to examine the accuracy of responses obtained from Web-based questionnaires by comparing Web-based with paper-and-pencil versions of depression and anxiety questionnaires. The primary outcome was the score on the depression and anxiety questionnaires under two conditions: (1) paper-and-pencil and (2) Web-based versions. Twenty-eight adolescents (aged 11-13 years, mean age 12.78 years and SD 0.78; 18 females, 64%) were randomly assigned to complete either the paper-and-pencil or the Web-based questionnaire first. Intraclass correlation coefficients (ICCs) were calculated to measure intrarater reliability. Intraclass correlation coefficients were calculated separately for depression (Children's Depression Inventory, CDI) and anxiety (Spence Children's Anxiety Scale, SCAS) questionnaires. On average, it took participants 17 minutes (SD 6) to answer 116 questions online. Intraclass correlation coefficient analysis revealed high intrarater reliability when comparing Web-based with paper-and-pencil responses for both CDI (ICC=.88; P<.001) and the SCAS (ICC=.95; P<.001). According to published criteria, both of these values are in the "almost perfect" category indicating the highest degree of reliability. The results of the study show an excellent reliability of Web-based assessment in 11- to 13-year-old children as compared with the standard paper-pencil assessment. Furthermore, we found that Web-based assessments with young adolescents are highly feasible, with all enrolled participants completing the Web-based form. As early adolescence is a time of remarkable social and behavioral changes, these findings open up new avenues for researchers from diverse fields who are interested in studying large samples of young adolescents over time.

  1. A stochastic simulation method for the assessment of resistive random access memory retention reliability

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berco, Dan; Tseng, Tseung-Yuen

    This study presents an evaluation method for resistive random access memory retention reliability based on the Metropolis Monte Carlo algorithm and Gibbs free energy. The method, which does not rely on a time evolution, provides an extremely efficient way to compare the relative retention properties of metal-insulator-metal structures. It requires a small number of iterations and may be used for statistical analysis. The presented approach is used to compare the relative robustness of a single layer ZrO₂ device with a double layer ZnO/ZrO₂ one, and obtain results which are in good agreement with experimental data.
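
    The abstract gives no implementation details, so the following is only a generic illustration of the Metropolis acceptance rule on a toy two-state system, not the authors' device model. The names `acceptance_probability` and `upper_state_fraction`, and the use of a single free-energy gap, are assumptions made for the sketch.

```python
import math
import random


def acceptance_probability(delta_g: float, kt: float) -> float:
    """Metropolis rule: accept downhill moves always, uphill moves with Boltzmann weight."""
    if delta_g <= 0.0:
        return 1.0
    return math.exp(-delta_g / kt)


def upper_state_fraction(gap: float, kt: float, steps: int, seed: int = 0) -> float:
    """Fraction of steps a two-state system spends in its higher-energy state."""
    rng = random.Random(seed)
    state = 0  # 0 = low-energy state, 1 = high-energy state
    visits_upper = 0
    for _ in range(steps):
        # Free-energy change of the proposed move (flip to the other state)
        delta_g = gap if state == 0 else -gap
        if rng.random() < acceptance_probability(delta_g, kt):
            state = 1 - state
        visits_upper += state
    return visits_upper / steps
```

    With enough steps the sampled fraction should approach the Boltzmann value 1 / (1 + exp(gap / kt)); no physical time step appears anywhere in the loop, which is the sense in which such sampling does not rely on a time evolution.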

  2. [Development of a Japanese version of a short form of the Profile of Emotional Competence].

    PubMed

    Nozaki, Yuki; Koyasu, Masuo

    2015-06-01

    Emotional competence refers to individual differences in the ability to appropriately identify, understand, express, regulate, and utilize one's own emotions and those of others. This study developed a Japanese version of a short form of the Profile of Emotional Competence, a measure that allows the comprehensive assessment of intra- and interpersonal emotional competence with shorter items, and investigated its reliability and validity. In Study 1, we selected items for a short version and compared it with the full scale in terms of scores, internal consistency, and validity. In Study 2, we examined the short form's test-retest reliability. Results supported the original two-factor model and the measure had adequate reliability and validity. We discuss the construct validity and practical applicability of the short form of the Profile of Emotional Competence.

  3. Reliable Alignment in Total Knee Arthroplasty by the Use of an iPod-Based Navigation System

    PubMed Central

    Koenen, Paola; Schneider, Marco M.; Fröhlich, Matthias; Driessen, Arne; Bouillon, Bertil; Bäthis, Holger

    2016-01-01

    Axial alignment is one of the main objectives in total knee arthroplasty (TKA). Computer-assisted surgery (CAS) is more accurate regarding limb alignment reconstruction compared to the conventional technique. The aim of this study was to analyse the precision of the innovative navigation system DASH® by Brainlab and to evaluate the reliability of intraoperatively acquired data. A retrospective analysis of 40 patients was performed, who underwent CAS TKA using the iPod-based navigation system DASH. Pre- and postoperative axial alignment were measured on standardized radiographs by two independent observers. These data were compared with the navigation data. Furthermore, interobserver reliability was measured. The duration of surgery was monitored. The mean difference between the preoperative mechanical axis by X-ray and the first intraoperatively measured limb axis by the navigation system was 2.4°. The postoperative X-rays showed a mean difference of 1.3° compared to the final navigation measurement. According to radiographic measurements, 88% of arthroplasties had a postoperative limb axis within ±3°. The mean additional time needed for navigation was 5 minutes. We could prove very good precision for the DASH system, which is comparable to established navigation devices with only negligible expenditure of time compared to conventional TKA. PMID:27313898

  4. Intra-instrument reliability of 4 goniometers.

    PubMed

    Pringle, R Kevin

    2003-01-01

    Cervical spine ROM movements taken accurately with reliable measuring devices are important in outcome measures as well as in measuring disability. To compare the active cervical spine ROM in a healthy young adult population using 4 different goniometers. Subjects were tested during active cervical spine ROM. The devices were a single hinge inclinometer, a single bubble carpenter's inclinometer, dual bubble goniometers, and a Cybex EDI 320 electrical inclinometer. All subjects were tested for rotational limits along each of the orthogonal axes of movement. There were 3 trials for each movement direction, except that rotation was not measured with the Cybex, as per manual suggestions. The subjects were randomly assigned to the sequence of devices. Twenty-seven student volunteers (19 men and 8 women) were tested. Ages ranged from 21 to 41 years, with a mean age of 27.6 years. The active cervical spine ROM trials for each measurement were used to calculate the mean and standard deviation. An overall analysis of variance (ANOVA) and Bonferroni-adjusted t-tests were used to assess reliability and significance. The cost of the instruments was not used in determining reliability or significance. The single hinge inclinometer was found to be a reliable measure but not likely valid. The Cybex EDI 320 was found to be the best measuring device; however, the 2 instruments whose costs fell between those of the single hinge inclinometer and the electrical goniometer were just as reliable as the more expensive device. The AMA Guides of Impairment were used as the normative data to compare these devices. Since the devices could measure reliably, whether expensive or more cost-effective, they would likely make adequate devices for training students on the methods for measuring ROM. There are previous data to suggest that older populations have gender and age differences in ROM. This study could not measure those differences, which would make a useful follow-up study.

  5. Validity and Reliability of a Digital Inclinometer to Assess Knee Joint Position Sense in an Open Kinetic Chain.

    PubMed

    Romero-Franco, Natalia; Montaño-Munuera, Juan Antonio; Fernández-Domínguez, Juan Carlos; Jiménez-Reyes, Pedro

    2017-12-18

    New methods are being validated to easily evaluate the knee joint position sense (JPS) due to its role in sports movement and the risk of injury. However, no studies to date have considered the open kinetic chain (OKC) technique, despite the biomechanical differences compared to closed kinetic chain movements. To analyze the validity and reliability of a digital inclinometer to measure the knee JPS in the OKC movement. The validity, inter-tester and intra-tester reliability of a digital inclinometer for measuring knee JPS were evaluated. Sports research laboratory. Eighteen athletes (11 males and 7 females; 28.4 ± 6.6 years; 71.9 ± 14.0 kg; 1.77 ± 0.09 m; 22.8 ± 3.2 kg/m²) voluntarily participated in this study. Absolute angular error (AAE), relative angular error (RAE) and variable angular error (VAE) of knee JPS in an OKC. Intraclass correlation coefficient (ICC) and standard error of the mean (SEM) were calculated to determine the validity and reliability of the inclinometer. Data showed excellent validity of the inclinometer to obtain proprioceptive errors compared to the video analysis in JPS tasks (AAE: ICC = 0.981, SEM = 0.08; RAE: ICC = 0.974, SEM = 0.12; VAE: ICC = 0.973, SEM = 0.07). Inter-tester reliability was also excellent for all the proprioceptive errors (AAE: ICC = 0.967, SEM = 0.04; RAE: ICC = 0.974, SEM = 0.03; VAE: ICC = 0.939, SEM = 0.08). Similar results were obtained for intra-tester reliability (AAE: ICC = 0.861, SEM = 0.1; RAE: ICC = 0.894, SEM = 0.1; VAE: ICC = 0.700, SEM = 0.2). The digital inclinometer is a valid and reliable method to assess the knee JPS in OKC. Sport professionals may evaluate the knee JPS to monitor its deterioration during training or improvements throughout the rehabilitation process.

  6. Detection of myocardial ischemia by automated, motion-corrected, color-encoded perfusion maps compared with visual analysis of adenosine stress cardiovascular magnetic resonance imaging at 3 T: a pilot study.

    PubMed

    Doesch, Christina; Papavassiliu, Theano; Michaely, Henrik J; Attenberger, Ulrike I; Glielmi, Christopher; Süselbeck, Tim; Fink, Christian; Borggrefe, Martin; Schoenberg, Stefan O

    2013-09-01

    The purpose of this study was to compare automated, motion-corrected, color-encoded (AMC) perfusion maps with qualitative visual analysis of adenosine stress cardiovascular magnetic resonance imaging for detection of flow-limiting stenoses. Myocardial perfusion measurements applying the standard adenosine stress imaging protocol and a saturation-recovery temporal generalized autocalibrating partially parallel acquisition (t-GRAPPA) turbo fast low angle shot (Turbo FLASH) magnetic resonance imaging sequence were performed in 25 patients using a 3.0-T MAGNETOM Skyra (Siemens Healthcare Sector, Erlangen, Germany). Perfusion studies were analyzed using AMC perfusion maps and qualitative visual analysis. Angiographically detected coronary artery (CA) stenoses greater than 75% or 50% or more with a myocardial perfusion reserve index less than 1.5 were considered as hemodynamically relevant. Diagnostic performance and time requirement for both methods were compared. Interobserver and intraobserver reliability were also assessed. A total of 29 CA stenoses were included in the analysis. Sensitivity, specificity, positive predictive value, negative predictive value, and accuracy for detection of ischemia on a per-patient basis were comparable using the AMC perfusion maps compared to visual analysis. On a per-CA territory basis, the attribution of an ischemia to the respective vessel was facilitated using the AMC perfusion maps. Interobserver and intraobserver reliability were better for the AMC perfusion maps (concordance correlation coefficient, 0.94 and 0.93, respectively) compared to visual analysis (concordance correlation coefficient, 0.73 and 0.79, respectively). In addition, in comparison to visual analysis, the AMC perfusion maps were able to significantly reduce analysis time from 7.7 (3.1) to 3.2 (1.9) minutes (P < 0.0001). 
    The AMC perfusion maps yielded a diagnostic performance on a per-patient and on a per-CA territory basis comparable with the visual analysis. Furthermore, this approach demonstrated higher interobserver and intraobserver reliability as well as a better time efficiency when compared to visual analysis.

  7. Validation of different pediatric triage systems in the emergency department

    PubMed Central

    Aeimchanbanjong, Kanokwan; Pandee, Uthen

    2017-01-01

    BACKGROUND: Triage system in children seems to be more challenging compared to adults because of their different response to physiological and psychosocial stressors. This study aimed to determine the best triage system in the pediatric emergency department. METHODS: This was a prospective observational study. This study was divided into two phases. The first phase determined the inter-rater reliability of five triage systems: Manchester Triage System (MTS), Emergency Severity Index (ESI) version 4, Pediatric Canadian Triage and Acuity Scale (CTAS), Australasian Triage Scale (ATS), and Ramathibodi Triage System (RTS) by triage nurses and pediatric residents. In the second phase, to analyze the validity of each triage system, patients were categorized as two groups, i.e., high acuity patients (triage level 1, 2) and low acuity patients (triage level 3, 4, and 5). Then we compared the triage acuity with actual admission. RESULTS: In phase I, RTS illustrated almost perfect inter-rater reliability with kappa of 1.0 (P<0.01). ESI and CTAS illustrated good inter-rater reliability with kappa of 0.8–0.9 (P<0.01). Meanwhile, ATS and MTS illustrated moderate to good inter-rater reliability with kappa of 0.5–0.7 (P<0.01). In phase II, we included 1 041 participants with average age of 4.7±4.2 years, of which 55% were male and 45% were female. In addition 32% of the participants had underlying diseases, and 123 (11.8%) patients were admitted. We found that ESI illustrated the most appropriate predicting ability for admission with sensitivity of 52%, specificity of 81%, and AUC 0.78 (95%CI 0.74–0.81). CONCLUSION: RTS illustrated almost perfect inter-rater reliability. Meanwhile, ESI and CTAS illustrated good inter-rater reliability. Finally, ESI illustrated the appropriate validity for triage system. PMID:28680520
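
    Inter-rater agreement of the kind reported above is commonly summarized with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below shows the unweighted two-rater form; whether the study used weighted or unweighted kappa is not stated, and the function name `cohen_kappa` is hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters assigning one category per case."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)

    # Observed proportion of cases on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal category frequencies
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

    Because the denominator shrinks as chance agreement rises, kappa can be low even when raw percent agreement is high if nearly all cases fall into one category.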

  8. Validity and reliability of the iPhone to measure rib hump in scoliosis.

    PubMed

    Balg, Frederic; Juteau, Mathieu; Theoret, Chantal; Svotelis, Amy; Grenier, Guillaume

    2014-12-01

    This was a prospective blinded validity and reliability analysis. The aim of this study was validation and reliability evaluation of the Scoligauge iPhone app. The scoliometer is used to clinically measure the rib hump in scoliosis as a means to evaluate the axial trunk rotation. The increasing availability of smartphones with built-in accelerometers led to the development of a vast number of applications to measure angles. Of these, the Scoligauge mimics a scoliometer. The aim of this study was to compare the validity of the Scoligauge iPhone application without an associated adapter with the traditional scoliometer and to test the reliability of the application in a clinical setting. Two observers measured the rib hump deformity on 34 consecutive patients with idiopathic scoliosis with an average Cobb angle of 24.2 ± 13.5 degrees (range, 4 to 65 degrees). Measurements were made with an iPhone without the adapter and with a scoliometer. The validity as well as the interobserver and intraobserver reliability were calculated using the intraclass correlation coefficient (ICC) and the Bland-Altman test. The mean difference between the scoliometer and the Scoligauge application was 0.4 degrees [95% confidence interval (CI) of ± 3.1 degrees] with an ICC of 0.947 (P < 0.001). The intraobserver and interobserver ICC were 0.961 (P < 0.001) and 0.901 (P < 0.001), respectively. The mean intraobserver difference was 0.0 degrees (95% CI of ± 2.7 degrees) and the mean interobserver difference was 0.1 degrees (95% CI of ± 4.4 degrees). The intraobserver and interobserver reliability of the Scoligauge iPhone app, as well as its validity compared with the scoliometer, are excellent. The mean differences between measurements are small and clinically not significant. Thus, the Scoligauge application is valid for clinical evaluation even without a special adapter. Level I (Diagnostic Study).
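
    A Bland-Altman analysis summarizes paired measurements from two methods by the mean difference (bias) and an interval of roughly 1.96 standard deviations around it. The sketch below computes those quantities; treating the reported "± degrees" figures as such limits of agreement is an assumption, and `bland_altman` is a hypothetical name.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    spread = statistics.stdev(diffs)  # sample standard deviation of the differences
    half_width = 1.96 * spread        # approximate 95% limits under normality
    return bias, bias - half_width, bias + half_width
```

    For example, `bland_altman(app_angles, scoliometer_angles)` would take two equal-length lists of angles in degrees measured on the same patients; both list names are placeholders.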

  9. Reliability and Validity of Wisconsin Upper Respiratory Symptom Survey, Korean Version

    PubMed Central

    Yang, Su-Young; Kang, Weechang; Yeo, Yoon; Park, Yang-Chun

    2011-01-01

    Background: The Wisconsin Upper Respiratory Symptom Survey (WURSS) is a self-administered questionnaire developed in the United States to evaluate the severity of the common cold and its reliability has been validated. We developed a Korean language version of this questionnaire by using a sequential forward and backward translation approach. The purpose of this study was to validate the Korean version of the Wisconsin Upper Respiratory Symptom Survey (WURSS-K) in Korean patients with common cold. Methods: This multicenter prospective study enrolled 107 participants who were diagnosed with common cold and consented to participate in the study. The WURSS-K includes 1 global illness severity item, 32 symptom-based items, 10 functional quality-of-life (QOL) items, and 1 item assessing global change. The SF-8 was used as an external comparator. Results: The participants were 54 women and 53 men aged 18 to 42 years. The WURSS-K showed good reliability in 10 domains, with Cronbach’s alphas ranging from 0.67 to 0.96 (mean: 0.84). Comparison of the reliability coefficients of the WURSS-K and WURSS yielded a Pearson correlation coefficient of 0.71 (P = 0.02). Validity of the WURSS-K was evaluated by comparing it with the SF-8, which yielded a Pearson correlation coefficient of −0.267 (P < 0.001). The Guyatt’s responsiveness index of the WURSS-K ranged from 0.13 to 0.46, and the correlation coefficient with the WURSS was 0.534 (P < 0.001), indicating that there was close correlation between the WURSS-K and WURSS. Conclusions: The WURSS-K is a reliable, valid, and responsive disease-specific questionnaire for assessing symptoms and QOL in Korean patients with common cold. PMID:21691034
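
    Cronbach's alpha, the internal-consistency statistic quoted above, compares the sum of the item variances with the variance of the total score. A minimal sketch, assuming complete responses with one row per respondent and one column per item (the name `cronbach_alpha` is hypothetical):

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents x items table of numeric scores."""
    k = len(responses[0])  # number of items in the domain
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents
    item_variances = [
        statistics.variance([row[j] for row in responses]) for j in range(k)
    ]
    # Sample variance of each respondent's total score
    total_variance = statistics.variance([sum(row) for row in responses])

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

    Alpha approaches 1 when items rise and fall together across respondents, and it tends to increase with the number of items, so domains of different lengths are not directly comparable on alpha alone.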

  10. The Dartmouth COOP Charts: a simple, reliable, valid and responsive quality of life tool for chronic obstructive pulmonary disease.

    PubMed

    Eaton, T; Young, P; Fergusson, W; Garrett, J E; Kolbe, J

    2005-04-01

    The negative impact of chronic obstructive pulmonary disease (COPD) on health-related quality of life (HRQL) is substantial. Measurement of HRQL is increasingly advocated in clinical practice; traditional outcome measures such as lung function are poorly responsive. However, many HRQL tools are not user-friendly in the clinic setting. Hence HRQL is often neglected. The Dartmouth Cooperative Functional Assessment Charts (COOP) have the requisite attributes of a tool suitable for routine clinical practice: they are simple, reliable, quick and easy to perform and score, and well accepted. We aimed to determine the reliability, validity and responsiveness of the COOP in patients with significant COPD. HRQL was assessed during a prospective, randomised, placebo-controlled, double-blind, 12 week cross-over interventional study of ambulatory oxygen in patients (n = 50) with COPD. Test-retest reliability of the COOP domains was only modest; however, it was measured over a 2 month period. Significant correlations ranging between 0.4 and 0.8 were observed between all comparable domains of the COOP and the Medical Outcomes Study 36-item Short-form Health Survey, Chronic Respiratory Questionnaire (CRQ) and Hospital Anxiety and Depression (HAD) scale. Following ambulatory oxygen, significant improvements were noted in all CRQ and HAD domains. Several domains of the generic SF-36 (role emotional, social functioning, role-physical) showed significant improvements. Comparable domains of the COOP (social activities, feelings) also showed significant improvements. The COOP change in health domain improved very significantly. The COOP is a simple, reliable HRQL tool which proved valid and responsive in our study population of COPD patients and may have a valuable role in routine clinical practice.

  11. Validity and intra-rater reliability of MyJump app on iPhone 6s in jump performance.

    PubMed

    Stanton, Robert; Wintour, Sally-Anne; Kean, Crystal O

    2017-05-01

    Smartphone applications are increasingly used by researchers, coaches, athletes and clinicians. The aim of this study was to examine the concurrent validity and intra-rater reliability of the smartphone-based application, MyJump, against laboratory-based force plate measurements. Cross sectional study. Participants completed counter-movement jumps (CMJ) (n=29) and 30cm drop jumps (DJ) (n=27) on a force plate which were simultaneously recorded using MyJump. To assess concurrent validity, jump height, derived from flight time acquired from each device, was compared for each jump type. Intra-rater reliability was determined by replicating data analysis of MyJump recordings on two occasions separated by seven days. CMJ and DJ heights derived from MyJump showed excellent agreement with the force plate (ICC values range from 0.991 for CMJ to 0.993 for DJ). However, mean DJ height from the force plate was significantly higher than MyJump (mean difference: 0.87cm, 95% CI: 0.69-1.04cm). Intra-rater reliability of MyJump for both CMJ and DJ was almost perfect (ICC values range from 0.997 for CMJ to 0.998 for DJ); however, mean CMJ and DJ jump height for Day 1 was significantly higher than Day 2 (CMJ: 0.43cm, 95% CI: 0.23-0.62cm); (DJ: 0.38cm, 95% CI: 0.23-0.53cm). The present study finds MyJump to be a valid and highly reliable tool for researchers, coaches, athletes and clinicians; however, systematic bias should be considered when comparing MyJump outputs to other testing devices.

  12. Validity and reliability of an occupational exposure questionnaire for parkinsonism in welders.

    PubMed

    Hobson, Angela J; Sterling, David A; Emo, Brett; Evanoff, Bradley A; Sterling, Callen S; Good, Laura; Seixas, Noah; Checkoway, Harvey; Racette, Brad A

    2009-06-01

    This study assessed the validity and test-retest reliability of a medical and occupational history questionnaire for workers performing welding in the shipyard industry. This self-report questionnaire was developed for an epidemiologic study of the risk of parkinsonism in welders. Validity participants recruited from three similar shipyards were asked to give consent for access to personnel files and complete the questionnaire. Responses on the questionnaire were compared with information extracted from personnel records. Reliability participants were recruited from the same shipyards and were asked to complete the questionnaire at two different times approximately 4 weeks apart. Percent agreement, kappa, intraclass correlation coefficient (ICC), and sensitivity and specificity were used as measures of validity and/or reliability. Personnel files were obtained for 101 of 143 participants (70%) in the validity study, and 56 of the 95 (58.9%) participants in the reliability study completed the retest of the questionnaire. Validity scores for items extracted from personnel files were high. Percent agreement for employment dates and job titles ranged from 83-100%, while ICC for start and stop dates ranged from 0.93-0.99. Sensitivity and specificity for current job title ranged from 0.5-1.0. Reliability scores for demographic, medical and health behavior items were mainly moderate or high, but ranged from 0.19 to 1.0. Most recent job/title items such as title, types of welding performed, and material used showed substantial to perfect agreement. Certain determinants of exposure such as days and hours per week exposed to welding fumes demonstrated mainly moderate agreement (kappa= 0.42-0.47, percent agreement 63-77%); however, mean days and hours reported did not differ between test and retest. The results of this study suggest that participants' self-report for job title and dates employed are valid compared with employer records. 
    While kappa scores were low for some medical conditions and for caffeine consumption, high kappa scores for job title, dates worked, types of welding, and materials welded suggest participants generated reproducible answers important for occupational exposure assessment.
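
    Sensitivity and specificity, used above to compare self-reported items against personnel records, can be computed from paired yes/no values once one source is treated as the reference. In this sketch the reference is assumed to be the employer record, and the function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(reported, reference):
    """Sensitivity and specificity of boolean reports against a boolean reference."""
    if len(reported) != len(reference):
        raise ValueError("inputs must be the same length")
    true_pos = sum(r and ref for r, ref in zip(reported, reference))
    false_neg = sum((not r) and ref for r, ref in zip(reported, reference))
    true_neg = sum((not r) and (not ref) for r, ref in zip(reported, reference))
    false_pos = sum(r and (not ref) for r, ref in zip(reported, reference))

    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("reference must contain both positive and negative cases")

    sensitivity = true_pos / (true_pos + false_neg)   # share of true cases reported
    specificity = true_neg / (true_neg + false_pos)   # share of non-cases not reported
    return sensitivity, specificity
```

    Unlike kappa, these two proportions depend on which source is designated the reference, so swapping the arguments generally changes both values.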

  13. Concurrent validity of different functional and neuroproteomic pain assessment methods in the rat osteoarthritis monosodium iodoacetate (MIA) model.

    PubMed

    Otis, Colombe; Gervais, Julie; Guillot, Martin; Gervais, Julie-Anne; Gauvin, Dominique; Péthel, Catherine; Authier, Simon; Dansereau, Marc-André; Sarret, Philippe; Martel-Pelletier, Johanne; Pelletier, Jean-Pierre; Beaudry, Francis; Troncy, Eric

    2016-06-23

    Lack of validity in osteoarthritis pain models and assessment methods is suspected. Our goal was to 1) assess the repeatability and reproducibility of measurement and the influence of environment, and acclimatization, to different pain assessment outcomes in normal rats, and 2) test the concurrent validity of the most reliable methods in relation to the expression of different spinal neuropeptides in a chemical model of osteoarthritic pain. Repeatability and inter-rater reliability of reflexive nociceptive mechanical thresholds, spontaneous static weight-bearing, treadmill, rotarod, and operant place escape/avoidance paradigm (PEAP) were assessed by the intraclass correlation coefficient (ICC). The most reliable acclimatization protocol was determined by comparing coefficients of variation. In a pilot comparative study, the sensitivity and responsiveness to treatment of the most reliable methods were tested in the monosodium iodoacetate (MIA) model over 21 days. Two MIA (2 mg) groups (including one lidocaine treatment group) and one sham group (0.9 % saline) received an intra-articular (50 μL) injection. No effect of environment (observer, inverted circadian cycle, or exercise) was observed; all tested methods except mechanical sensitivity (ICC <0.3), offered good repeatability (ICC ≥0.7). The most reliable acclimatization protocol included five assessments over two weeks. MIA-related osteoarthritic change in pain was demonstrated with static weight-bearing, punctate tactile allodynia evaluation, treadmill exercise and operant PEAP, the latter being the most responsive to analgesic intra-articular lidocaine. Substance P and calcitonin gene-related peptide were higher in MIA groups compared to naive (adjusted P (adj-P) = 0.016) or sham-treated (adj-P = 0.029) rats. 
    Repeated post-MIA lidocaine injection resulted in 34 times lower downregulation for spinal substance P compared to MIA alone (adj-P = 0.029), with a concomitant increase of 17% in time spent on the PEAP dark side (indicative of increased comfort). This study of normal rats and rats with pain established the most reliable and sensitive pain assessment methods and an optimized acclimatization protocol. Operant PEAP testing was more responsive to lidocaine analgesia than other tests used, while neuropeptide spinal concentration is an objective quantification method attractive to support and validate different centralized pain functional assessment methods.

  14. Validity and Reliability of Three Self-Report Instruments for Assessing Attainment of Physical Activity Guidelines in University Students

    ERIC Educational Resources Information Center

    Murphy, Joseph J.; Murphy, Marie H.; MacDonncha, Ciaran; Murphy, Niamh; Nevill, Alan M.; Woods, Catherine B.

    2017-01-01

    The purpose of this study was to compare the validity and reliability of three short physical activity self-report instruments to determine their potential for use with university student populations. The participants (N = 155; 44.5% male; 22.9 ± 5.13 years) wore an accelerometer for 9 consecutive days and completed a single-item measure, the a…

  15. The Reliability of Galaxy Classifications by Citizen Scientists

    NASA Astrophysics Data System (ADS)

    Francis, Lennox; Kautsch, Stefan J.; Bizyaev, Dmitry

    2017-01-01

    Citizen scientists are becoming more and more important in helping professionals working through big data. An example in astronomy is crowdsourced galaxy classification. But how reliable are these classifications for studies of galaxy evolution? We present a tool in order to investigate those morphological classifications and test it on a diverse population on our campus. We observe a slight offset towards earlier Hubble types in the crowdsourced morphologies, when compared to professional classifications.

  16. Estimation of Reliability Coefficients Using the Test Information Function and Its Modifications.

    ERIC Educational Resources Information Center

    Samejima, Fumiko

    1994-01-01

    The reliability coefficient is predicted from the test information function (TIF) or two modified TIF formulas and a specific trait distribution. Examples illustrate the variability of the reliability coefficient across different trait distributions, and results are compared with empirical reliability coefficients. (SLD)

  17. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model

    PubMed Central

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    Aims and Objective: The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT) obtained image over plaster model for the assessment of mixed dentition analysis. Materials and Methods: Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data and P < 0.05 was considered statistically significant. Results: Statistically significant results were obtained on data comparison between CBCT-derived images and plaster model; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. Conclusion: CBCT-derived images were less reliable as compared to data obtained directly from plaster model for mixed dentition analysis. PMID:28852639

  18. A proposed method to investigate reliability throughout a questionnaire

    PubMed Central

    2011-01-01

    Background: Questionnaires are used extensively in medical and health care research and depend on validity and reliability. However, participants may differ in interest and awareness throughout long questionnaires, which can affect reliability of their answers. A method is proposed for "screening" of systematic change in random error, which could assess changed reliability of answers. Methods: A simulation study was conducted to explore whether systematic change in reliability, expressed as changed random error, could be assessed using unsupervised classification of subjects by cluster analysis (CA) and estimation of intraclass correlation coefficient (ICC). The method was also applied on a clinical dataset from 753 cardiac patients using the Jalowiec Coping Scale. Results: The simulation study showed a relationship between the systematic change in random error throughout a questionnaire and the slope between the estimated ICC for subjects classified by CA and successive items in a questionnaire. This slope was proposed as an awareness measure to assess whether respondents provide only a random answer or one based on a substantial cognitive effort. Scales from different factor structures of Jalowiec Coping Scale had different effect on this awareness measure. Conclusions: Even though assumptions in the simulation study might be limited compared to real datasets, the approach is promising for assessing systematic change in reliability throughout long questionnaires. Results from a clinical dataset indicated that the awareness measure differed between scales. PMID:21974842

  19. Reliability measures of functional magnetic resonance imaging in a longitudinal evaluation of mild cognitive impairment.

    PubMed

    Zanto, Theodore P; Pa, Judy; Gazzaley, Adam

    2014-01-01

    As the aging population grows, it has become increasingly important to carefully characterize amnestic mild cognitive impairment (aMCI), a preclinical stage of Alzheimer's disease (AD). Functional magnetic resonance imaging (fMRI) is a valuable tool for monitoring disease progression in selectively vulnerable brain regions associated with AD neuropathology. However, the reliability of fMRI data in longitudinal studies of older adults with aMCI is largely unexplored. To address this, aMCI participants completed two visual working tasks, a Delayed-Recognition task and a One-Back task, on three separate scanning sessions over a three-month period. Test-retest reliability of the fMRI blood oxygen level dependent (BOLD) activity was assessed using an intraclass correlation (ICC) analysis approach. Results indicated that brain regions engaged during the task displayed greater reliability across sessions compared to regions that were not utilized by the task. During task-engagement, differential reliability scores were observed across the brain such that the frontal lobe, medial temporal lobe, and subcortical structures exhibited fair to moderate reliability (ICC=0.3-0.6), while temporal, parietal, and occipital regions exhibited moderate to good reliability (ICC=0.4-0.7). Additionally, reliability across brain regions was more stable when three fMRI sessions were used in the ICC calculation relative to two fMRI sessions. In conclusion, the fMRI BOLD signal is reliable across scanning sessions in this population and thus a useful tool for tracking longitudinal change in observational and interventional studies in aMCI.

  20. Validity and Reliability of Baseline Testing in a Standardized Environment.

    PubMed

    Higgins, Kathryn L; Caze, Todd; Maerlender, Arthur

    2017-08-11

    The Immediate Postconcussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery commonly used to determine cognitive recovery from concussion based on comparing post-injury scores to baseline scores. This model is based on the premise that ImPACT baseline test scores are a valid and reliable measure of optimal cognitive function at baseline. Growing evidence suggests that this premise may not be accurate, and a large contributor to invalid and unreliable baseline test scores may be the protocol and environment in which baseline tests are administered. This study examined the effects of a standardized environment and administration protocol on the reliability and performance validity of athletes' baseline test scores on ImPACT by comparing scores obtained in two different group-testing settings. Three hundred sixty-one Division 1 cohort-matched collegiate athletes' baseline data were assessed using a variety of indicators of potential performance invalidity; internal reliability was also examined. Thirty-one to thirty-nine percent of the baseline cases had at least one indicator of low performance validity, but there were no significant differences in validity indicators based on environment in which the testing was conducted. Internal consistency reliability scores were in the acceptable to good range, with no significant differences between administration conditions. These results suggest that athletes may be reliably performing at levels lower than their best effort would produce. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  1. Bridge reliability assessment based on the PDF of long-term monitored extreme strains

    NASA Astrophysics Data System (ADS)

    Jiao, Meiju; Sun, Limin

    2011-04-01

    Structural health monitoring (SHM) systems can provide valuable information for the evaluation of bridge performance. With the development and implementation of SHM technology in recent years, data mining and use have received increasing attention and interest in civil engineering. Based on the principles of probability and statistics, a reliability approach provides a rational basis for analysis of the randomness in loads and their effects on structures. This paper presents a novel approach that combines SHM systems with a reliability method to evaluate the reliability of a cable-stayed bridge instrumented with SHM systems. In this study, the reliability of the steel girder of the cable-stayed bridge was expressed directly as a failure probability instead of the commonly used reliability index. Under the assumption that the probability distribution of the resistance is independent of the structural responses, a formulation of the failure probability was derived. Then, as a main factor in the formulation, the probability density function (PDF) of the strain at sensor locations was evaluated and verified based on the monitoring data. Donghai Bridge was then taken as an example application of the proposed approach. In the case study, 4 years of monitoring data collected since the SHM systems began operating were processed, and the reliability assessment results were discussed. Finally, the sensitivity and accuracy of the novel approach compared with FORM were discussed.

  2. Reliability and agreement in the use of four- and six-point ordinal scales for the assessment of erythema in digital images of canine skin.

    PubMed

    Hill, Peter B

    2015-06-01

    Grading of erythema in clinical practice is a subjective assessment that cannot be confirmed using a definitive test; nevertheless, erythema scores are typically measured in clinical trials assessing the response to treatment interventions. Most commonly, ordinal scales are used for this purpose, but the optimal number of categories in such scales has not been determined. This study aimed to compare the reliability and agreement of a four-point and a six-point ordinal scale for the assessment of erythema in digital images of canine skin. Fifteen digital images showing varying degrees of erythema were assessed by specialist dermatologists and laypeople, using either the four-point or the six-point scale. Reliability between the raters was assessed using intraclass correlation coefficients and Cronbach's α. Agreement was assessed using the variation ratio (the percentage of respondents who chose the mode, the most common answer). Intraobserver variability was assessed by comparing the results of two grading sessions, at least 6 weeks apart. Both scales demonstrated high reliability, with intraclass correlation coefficient values and Cronbach's α above 0.99. However, the four-point scale demonstrated significantly superior agreement, with variation ratios for the four-point scale averaging 74.8%, compared with 56.2% for the six-point scale. Intraobserver consistency for the four-point scale was very high. Although both scales demonstrated high reliability, the four-point scale was superior in terms of agreement. For the assessment of erythema in clinical trials, a four-point ordinal scale is recommended. © 2014 ESVD and ACVD.
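
    The agreement statistic described above can be sketched in a few lines. This is a minimal illustration that assumes the abstract's definition (the percentage of raters who chose the modal grade); the function name `modal_agreement` and the example grades are hypothetical and are not taken from the study.

```python
from collections import Counter


def modal_agreement(ratings):
    """Return the percentage of raters who chose the most common grade."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    modal_count = max(counts.values())  # size of the largest group of raters
    return 100.0 * modal_count / len(ratings)


# Hypothetical grades given by eight raters to one image on a four-point scale.
example_grades = [2, 2, 2, 3, 2, 1, 2, 2]
print(modal_agreement(example_grades))  # 6 of 8 raters chose grade 2
```

    Some statistics texts define the variation ratio as one minus this proportion; the sketch follows the abstract's wording instead.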

  3. A Multi-Stage Longitudinal Comparative Design Stage II Evaluation of the Changing Lives Program: The Life Course Interview (RDA-LCI)

    ERIC Educational Resources Information Center

    Arango, Lisa Lewis; Kurtines, William M.; Montgomery, Marilyn J.; Ritchie, Rachel

    2008-01-01

    The study reported in this article, a Multi-Stage Longitudinal Comparative Design Stage II evaluation conducted as a planned preliminary efficacy evaluation (psychometric evaluation of measures, short-term controlled outcome studies, etc.) of the Changing Lives Program (CLP), provided evidence for the reliability and validity of the qualitative…

  4. Comparative Study of Eating-Related Attitudes and Psychological Traits between Israeli-Arab and -Jewish Schoolgirls

    ERIC Educational Resources Information Center

    Latzer, Yael; Tzischinsky, Orna; Geraisy, Nabil

    2007-01-01

    Objective: The aims of the study were to examine weight concerns, dieting and eating behaviours in a group of Israeli-Arab schoolgirls as compared with Israeli-Jewish schoolgirls, as well as to investigate the reliability of the Arabic (Palestinian) version of the eating disorder inventory-2 (EDI-2). Method: The sample consisted of 2548 Israeli…

  5. Using Digital and Paper Diaries for Learning and Assessment Purposes in Higher Education: A Comparative Study of Feasibility and Reliability

    ERIC Educational Resources Information Center

    Gleaves, Alan; Walker, Caroline; Grey, John

    2007-01-01

    The incorporation of diaries and journals as learning and assessment vehicles into programmes of study within higher education has enabled the further growth of reflection, creative writing, critical thinking and meta-cognitive processes of students' learning. However, there is currently little research that aims to compare how different types of…

  6. Preliminary design and analysis of an advanced rotorcraft transmission

    NASA Technical Reports Server (NTRS)

    Henry, Z. S.

    1990-01-01

    Future rotorcraft transmissions of the 1990s and beyond the year 2000 require the incorporation of key emerging material and component technologies using advanced and innovative design practices in order to meet the requirements for a reduced weight-to-power ratio, a decreased noise level, and a substantially increased reliability. The specific goals for future rotorcraft transmissions when compared with current state-of-the-art transmissions are a 25 percent weight reduction, a 10-dB reduction in the transmitted noise level, and a system reliability of 5000 hours mean-time-between-removal for the transmission. This paper presents the results of the design studies conducted to meet the stated goals for an advanced rotorcraft transmission. These design studies include system configuration, planetary gear train selection, and reliability prediction methods.

  7. Reliability of shoulder internal rotation passive range of motion measurements in the supine versus sidelying position.

    PubMed

    Lunden, Jason B; Muffenbier, Mike; Giveans, M Russell; Cieminski, Cort J

    2010-09-01

    Clinical measurement, reliability. To compare intrarater and interrater reliability of shoulder internal rotation (IR) passive range of motion measurements utilizing a standard supine position and a sidelying position. Glenohumeral IR range of motion deficits are often noted in patients with shoulder pathology. Excellent intrarater reliability has been found when measuring this motion. However, interrater reliability has been reported as poor to fair. Some clinicians currently use a sidelying position for IR stretching with patients who have shoulder pathology. However, no objective data exist for IR passive range of motion measured in this sidelying position, either in terms of reliability or normative values. Seventy subjects (mean age, 36.8 years), with (n = 19) and without (n = 51) shoulder pathology, were included in this study. Shoulder IR passive range of motion of the dominant shoulder or involved shoulder was measured by 2 investigators in 2 positions: (1) a standard supine position, with the shoulder at 90 degrees of abduction, and (2) in sidelying on the tested side, with the shoulder flexed to 90 degrees. Intrarater reliability for supine measurements was good to excellent (ICC3,1 = 0.70-0.93) and for sidelying measurements was excellent (ICC3,1 = 0.94-0.98). Interrater reliability was fair to good for the supine measurement (ICC2,2 = 0.74-0.81) and good to excellent for the sidelying measurement (ICC2,2 = 0.88-0.96). The mean (range) value of the dominant shoulder sidelying IR passive range of motion was 40 degrees (11 degrees to 69 degrees) for healthy subjects and 25 degrees (-16 degrees to 49 degrees) for subjects with shoulder pathology. For subjects with shoulder pathology, measurements of shoulder IR made in the sidelying position had superior intrarater and interrater reliability compared to those in the standard supine position.

  8. Quantitative comparison of in situ soil CO2 flux measurement methods

    Treesearch

    Jennifer D. Knoepp; James M. Vose

    2002-01-01

    Development of reliable regional or global carbon budgets requires accurate measurement of soil CO2 flux. We conducted laboratory and field studies to determine the accuracy and comparability of methods commonly used to measure in situ soil CO2 fluxes. Methods compared included CO2...

  9. The FLIR ONE thermal imager for the assessment of burn wounds: Reliability and validity study.

    PubMed

    Jaspers, M E H; Carrière, M E; Meij-de Vries, A; Klaessens, J H G M; van Zuijlen, P P M

    2017-11-01

    Objective measurement tools may be of great value to provide early and reliable burn wound assessment. Thermal imaging is an easy, accessible and objective technique, which measures skin temperature as an indicator of tissue perfusion. These thermal images might be helpful in the assessment of burn wounds. However, before implementation of a novel measurement tool into clinical practice is considered, it is appropriate to test its clinimetric properties (i.e. reliability and validity). The objective of this study was to assess the reliability and validity of the recently introduced FLIR ONE thermal imager. Two observers obtained thermal images of burn wounds in adult patients at day 1-3, 4-7 and 8-10 after burn. Subsequently, temperature differences between the burn wound and healthy skin (ΔT) were calculated on an iPad mini containing the FLIR Tools app. To assess reliability, ΔT values of both observers were compared by calculating the intraclass correlation coefficient (ICC) and measurement error parameters. To assess validity, the ΔT values of the first observer were compared to the registered healing time of the burn wounds, which was specified into three categories: (I) ≤14 days, (II) 15-21 days and (III) >21 days. The ability of the FLIR ONE to discriminate between healing ≤21 days and >21 days was evaluated by means of a receiver operating characteristic curve and an optimal ΔT cut-off value. Reliability: ICCs were 0.99 for each time point, indicating excellent reliability up to 10 days after burn. The standard error of measurement varied between 0.17-0.22°C. Validity: the area under the curve was calculated at 0.69 (95% CI 0.54-0.84). A cut-off value of -1.15°C shows a moderate discrimination between burn wound healing ≤21 days and >21 days (46% sensitivity; 82% specificity). Our results show that the FLIR ONE thermal imager is highly reliable, but the moderate validity calls for additional research.
    However, the FLIR ONE is pre-eminently feasible, allowing easy and fast measurements in clinical burn practice. Copyright © 2017 Elsevier Ltd and ISBI. All rights reserved.

  10. Reliability and accuracy analysis of a new semiautomatic radiographic measurement software in adult scoliosis.

    PubMed

    Aubin, Carl-Eric; Bellefleur, Christian; Joncas, Julie; de Lanauze, Dominic; Kadoury, Samuel; Blanke, Kathy; Parent, Stefan; Labelle, Hubert

    2011-05-20

    Radiographic software measurement analysis in adult scoliosis. To assess the accuracy as well as the intra- and interobserver reliability of measuring different indices on preoperative adult scoliosis radiographs using a novel measurement software that includes a calibration procedure and semiautomatic features to facilitate the measurement process. Scoliosis requires a careful radiographic evaluation to assess the deformity. Manual and computer radiographic process measures have been studied extensively to determine the reliability and reproducibility in adolescent idiopathic scoliosis. Most studies rely on comparing given measurements, which are repeated by the same user or by an expert user. A given measure with a small intra- or interobserver error might be deemed as good repeatability, but all measurements might not be truly accurate because the ground-truth value is often unknown. Thorough accuracy assessment of radiographic measures is necessary to assess scoliotic deformities, compare these measures at different stages or to permit valid multicenter studies. Thirty-four sets of adult scoliosis digital radiographs were measured two times by three independent observers using a novel radiographic measurement software that includes semiautomatic features to facilitate the measurement process. Twenty different measures taken from the Spinal Deformity Study Group radiographic measurement manual were performed on the coronal and sagittal images. Intra- and intermeasurer reliability for each measure was assessed. The accuracy of the measurement software was also assessed using a physical spine model in six different scoliotic configurations as a true reference. The majority of the measures demonstrated good to excellent intra- and intermeasurer reliability, except for sacral obliquity. 
    The standard variation of all the measures was very small: ≤ 4.2° for Cobb angles, ≤ 4.2° for the kyphosis, ≤ 5.7° for the lordosis, ≤ 3.9° for the pelvic angles, and ≤ 5.3° for the sacral angles. The variability in the linear measurements (distances) was < 4 mm. The variance of the measures was 1.7 and 2.6 times greater, respectively, for the angular and linear measures between the inter- and intrameasurer reliability. The image quality positively influenced the intermeasurer reliability, especially for the proximal thoracic Cobb angle, T10-L2 lordosis, sacral slope and L5 seating. The accuracy study revealed that on average the difference in the angular measures was < 2° for the Cobb angles, and < 4° for the other angles, except T2-T12 kyphosis (5.3°). The linear measures were all < 3.5 mm difference on average. The majority of the measures analyzed in this study demonstrated good to excellent reliability and accuracy. The novel semiautomatic measurement software can be recommended for use for clinical, research or multicenter study purposes.

  11. Validity and reliability of balance assessment software using the Nintendo Wii balance board: usability and validation

    PubMed Central

    2014-01-01

    Background A balance test provides important information such as the standard to judge an individual’s functional recovery or make the prediction of falls. The development of a tool for a balance test that is inexpensive and widely available is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but there is little software used in balance tests, and there are few studies on reliability and validity. Thus, we developed a balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Methods Twenty healthy adults participated in our study. The participants participated in the test for inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with balance assessment software using the Nintendo Wii balance board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed by an intraclass correlation coefficient (ICC) value and a standard error of measurement (SEM). Results The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. Conclusion The balance assessment software incorporating the Nintendo Wii balance board was used in our study and was found to be a reliable assessment device. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for the balance assessment. PMID:24912769

  12. Validity and reliability of balance assessment software using the Nintendo Wii balance board: usability and validation.

    PubMed

    Park, Dae-Sung; Lee, GyuChang

    2014-06-10

    A balance test provides important information such as the standard to judge an individual's functional recovery or make the prediction of falls. The development of a tool for a balance test that is inexpensive and widely available is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but there is little software used in balance tests, and there are few studies on reliability and validity. Thus, we developed a balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Twenty healthy adults participated in our study. The participants participated in the test for inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with balance assessment software using the Nintendo Wii balance board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed by an intraclass correlation coefficient (ICC) value and a standard error of measurement (SEM). The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. The balance assessment software incorporating the Nintendo Wii balance board was used in our study and was found to be a reliable assessment device. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for the balance assessment.

  13. Three-dimensional assessment of the asymptomatic and post-stroke shoulder: intra-rater test-retest reliability and within-subject repeatability of the palpation and digitization approach.

    PubMed

    Pain, Liza A M; Baker, Ross; Sohail, Qazi Zain; Richardson, Denyse; Zabjek, Karl; Mogk, Jeremy P M; Agur, Anne M R

    2018-03-23

    Altered three-dimensional (3D) joint kinematics can contribute to shoulder pathology, including post-stroke shoulder pain. Reliable assessment methods enable comparative studies between asymptomatic shoulders of healthy subjects and painful shoulders of post-stroke subjects, and could inform treatment planning for post-stroke shoulder pain. The study purpose was to establish intra-rater test-retest reliability and within-subject repeatability of a palpation/digitization protocol, which assesses 3D clavicular/scapular/humeral rotations, in asymptomatic and painful post-stroke shoulders. Repeated measurements of 3D clavicular/scapular/humeral joint/segment rotations were obtained using palpation/digitization in 32 asymptomatic and six painful post-stroke shoulders during four reaching postures (rest/flexion/abduction/external rotation). Intra-class correlation coefficients (ICCs), standard error of the measurement and 95% confidence intervals were calculated. All ICC values indicated high to very high test-retest reliability (≥0.70), with lower reliability for scapular anterior/posterior tilt during external rotation in asymptomatic subjects, and scapular medial/lateral rotation, humeral horizontal abduction/adduction and axial rotation during abduction in post-stroke subjects. All standard error of measurement values demonstrated within-subject repeatability error ≤5° for all clavicular/scapular/humeral joint/segment rotations (asymptomatic ≤3.75°; post-stroke ≤5.0°), except for humeral axial rotation (asymptomatic ≤5°; post-stroke ≤15°). This noninvasive, clinically feasible palpation/digitization protocol was reliable and repeatable in asymptomatic shoulders, and in a smaller sample of painful post-stroke shoulders. Implications for Rehabilitation In the clinical setting, a reliable and repeatable noninvasive method for assessment of three-dimensional (3D) clavicular/scapular/humeral joint orientation and range of motion (ROM) is currently required. 
    The established reliability and repeatability of this proposed palpation/digitization protocol will enable comparative 3D ROM studies between asymptomatic and post-stroke shoulders, which will further inform treatment planning. Intra-rater test-retest repeatability, which is measured by the standard error of the measure, indicates the range of error associated with a single test measure. Therefore, clinicians can use the standard error of the measure to determine the "true" differences between pre-treatment and post-treatment test scores.
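
    The use of the standard error of measurement (SEM) to judge "true" change can be illustrated with a short sketch. The abstract does not state which formulas were used, so the conventions below are assumptions: SEM = SD × sqrt(1 − ICC) and a 95% minimal detectable change of 1.96 × sqrt(2) × SEM, both common in the reliability literature. The function names and numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from a between-subject SD and an ICC (assumed convention)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """Smallest difference between two scores unlikely to be pure error."""
    # sqrt(2) accounts for measurement error in both test and retest scores.
    return z * math.sqrt(2.0) * sem


# Hypothetical values: SD of 10 degrees across subjects and an ICC of 0.91.
sem = standard_error_of_measurement(sd=10.0, icc=0.91)
mdc = minimal_detectable_change(sem)
print(round(sem, 2), round(mdc, 2))
```

    Under these assumptions, a pre-treatment to post-treatment difference smaller than the minimal detectable change could plausibly be measurement error alone.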

  14. Quantitative estimation of the high-intensity zone in the lumbar spine: comparison between the symptomatic and asymptomatic population.

    PubMed

    Liu, Chao; Cai, Hong-Xin; Zhang, Jian-Feng; Ma, Jian-Jun; Lu, Yin-Jiang; Fan, Shun-Wu

    2014-03-01

    The high-intensity zone (HIZ) on magnetic resonance imaging (MRI) has been studied for more than 20 years, but its diagnostic value in low back pain (LBP) is limited by the high incidence in asymptomatic subjects. Little effort has been made to improve the objective assessment of HIZ. To develop quantitative measurements for HIZ and estimate intra- and interobserver reliability and to clarify different signal intensity of HIZ in patients with or without LBP. A measurement reliability and prospective comparative study. A consecutive series of patients with LBP between June 2010 and May 2011 (group A) and a successive series of asymptomatic controls during the same period (group B). Incidence of HIZ; quantitative measures, including area of disc, area and signal intensity of HIZ, and magnetic resonance imaging index; and intraclass correlation coefficients (ICCs) for intra- and interobserver reliability. On the basis of HIZ criteria, a series of quantitative dimension and signal intensity measures was developed for assessing HIZ. Two experienced spine surgeons traced the region of interest twice within 4 weeks for assessment of the intra- and interobserver reliability. The quantitative variables were compared between groups A and B. There were 72 patients with LBP and 79 asymptomatic controls enrolling in this study. The prevalence of HIZ in group A and group B was 45.8% and 20.2%, respectively. The intraobserver agreement was excellent for the quantitative measures (ICC=0.838-0.977) as well as interobserver reliability (ICC=0.809-0.935). The mean signal of HIZ in group A was significantly brighter than in group B (57.55±14.04% vs. 45.61±7.22%, p=.000). There was no statistical difference of area of disc and HIZ between the two groups. The magnetic resonance imaging index was found to be higher in group A when compared with group B (3.94±1.71 vs. 3.06±1.50), but with a p value of .050. 
    A series of quantitative measurements for HIZ was established and demonstrated excellent intra- and interobserver reliability. The signal intensity of HIZ was different in patients with or without LBP, and a significantly brighter signal was observed in symptomatic subjects. Copyright © 2014 Elsevier Inc. All rights reserved.

  15. Interrater reliability and accuracy of clinicians and trained research assistants performing prospective data collection in emergency department patients with potential acute coronary syndrome.

    PubMed

    Cruz, Carlos O; Meshberg, Emily B; Shofer, Frances S; McCusker, Christine M; Chang, Anna Marie; Hollander, Judd E

    2009-07-01

    Clinical research requires high-quality data collection. Data collected at the emergency department evaluation is generally considered more precise than data collected through chart abstraction but is cumbersome and time consuming. We test whether trained research assistants without a medical background can obtain clinical research data as accurately as physicians. We hypothesize that they would be at least as accurate because they would not be distracted by clinical requirements. We conducted a prospective comparative study of 33 trained research assistants and 39 physicians (35 residents) to assess interrater reliability with respect to guideline-recommended clinical research data. Immediately after the research assistant and clinician evaluation, the data were compared by a tiebreaker third person who forced the patient to choose one of the 2 answers as the correct one when responses were discordant. Crude percentage agreement and interrater reliability were assessed (kappa statistic). One hundred forty-three patients were recruited (mean age 50.7 years; 47% female patients). Overall, the median agreement was 81% (interquartile range [IQR] 73% to 92%) and interrater reliability was fair (kappa value 0.36 [IQR 0.26 to 0.52]) but varied across categories of data: cardiac risk factors (median 86% [IQR 81% to 93%]; median 0.69 [IQR 0.62 to 0.83]), other cardiac history (median 93% [IQR 79% to 95%]; median 0.56 [IQR 0.29 to 0.77]), pain location (median 92% [IQR 86% to 94%]; median 0.37 [IQR 0.25 to 0.29]), radiation (median 86% [IQR 85% to 87%]; median 0.37 [IQR 0.26 to 0.42]), quality (median 85% [IQR 75% to 94%]; median 0.29 [IQR 0.23 to 0.40]), and associated symptoms (median 74% [IQR 65% to 78%]; median 0.28 [IQR 0.20 to 0.40]). When discordant information was obtained, the research assistant was more often correct (median 64% [IQR 53% to 72%]).
    The relatively fair interrater reliability observed in our study is consistent with previous studies evaluating interrater reliability for cardiovascular disease in the inpatient setting. With respect to research data, we found that prospective ascertainment of clinical data is more often correct when done by research assistants compared with clinicians simultaneously evaluating patients.
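
    The gap between high crude agreement and only fair kappa reported above is easier to see with a small worked example. The sketch below computes percentage agreement and Cohen's kappa for two raters from first principles; the function names and the ratings are hypothetical and are not data from the study, and the study's exact kappa variant is not specified in the abstract.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same answer."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement expected from each rater's marginal frequencies.
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical answers for 10 patients; "yes" is rare for both raters.
assistant = ["no"] * 8 + ["yes", "no"]
clinician = ["no"] * 8 + ["no", "yes"]
print(percent_agreement(assistant, clinician))
print(cohens_kappa(assistant, clinician))
```

    In this made-up case the raters agree on 8 of 10 patients, yet kappa is slightly negative because almost all of that agreement is expected by chance when one answer dominates.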

  16. Reliability and known-group validity of the Arabic version of the 8-item Morisky Medication Adherence Scale among type 2 diabetes mellitus patients.

    PubMed

    Ashur, S T; Shamsuddin, K; Shah, S A; Bosseri, S; Morisky, D E

    2015-12-13

    No validation study has previously been made for the Arabic version of the 8-item Morisky Medication Adherence Scale (MMAS-8(©)) as a measure for medication adherence in diabetes. This study in 2013 tested the reliability and validity of the Arabic MMAS-8 for type 2 diabetes mellitus patients attending a referral centre in Tripoli, Libya. A convenience sample of 103 patients self-completed the questionnaire. Reliability was tested using Cronbach alpha, average inter-item correlation and Spearman-Brown coefficient. Known-group validity was tested by comparing MMAS-8 scores of patients grouped by glycaemic control. The Arabic version showed adequate internal consistency (α = 0.70) and moderate split-half reliability (r = 0.65). Known-group validity was supported as a significant association was found between medication adherence and glycaemic control, with a moderate effect size (ϕc = 0.34). The Arabic version displayed good psychometric properties and could support diabetes research and practice in Arab countries.
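
    The two reliability coefficients named above can be sketched as follows. This is an illustration of the standard textbook formulas only: Cronbach's alpha from item and total-score variances, and the Spearman-Brown correction applied to a correlation between two half-tests. The function names and scores are hypothetical, and the abstract does not say how the halves were formed.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score per respondent, summing across items.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1.0 - sum_item_variances / variance(totals))


def spearman_brown(r_half):
    """Full-length reliability estimated from a half-test correlation."""
    return 2.0 * r_half / (1.0 + r_half)


# Hypothetical scores: three items answered by four respondents.
scores = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
print(round(cronbach_alpha(scores), 3))
print(round(spearman_brown(0.5), 3))
```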

  17. The Reliability and Validity of Measures of Gait Variability in Community-Dwelling Older Adults

    PubMed Central

    Brach, Jennifer S.; Perera, Subashan; Studenski, Stephanie; Newman, Anne B.

    2009-01-01

    Objective To examine the test-retest reliability and concurrent validity of variability of gait characteristics. Design Cross-sectional study. Setting Research laboratory. Participants Older adults (N=558) from the Cardiovascular Health Study. Interventions Not applicable. Main Outcome Measures Gait characteristics were measured using a 4-m computerized walkway. SD determined from the steps recorded were used as the measures of variability. Intraclass correlation coefficients (ICC) were calculated to examine test-retest reliability of a 4-m walk and two 4-m walks. To establish concurrent validity, the measures of gait variability were compared across levels of health, functional status, and physical activity using independent t tests and analysis of variances. Results Gait variability measures from the two 4-m walks demonstrated greater test-retest reliability than those from the single 4-m walk (ICC=.22–.48 and ICC=.40–.63, respectively). Greater step length and stance time variability were associated with poorer health, functional status and physical activity (P<.05). Conclusions Gait variability calculated from a limited number of steps has fair to good test-retest reliability and concurrent validity. Reliability of gait variability calculated from a greater number of steps should be assessed to determine if the consistency can be improved. PMID:19061741

  18. Indirect Measurement of Sexual Orientation: Comparison of the Implicit Relational Assessment Procedure, Viewing Time, and Choice Reaction Time Tasks.

    PubMed

    Rönspies, Jelena; Schmidt, Alexander F; Melnikova, Anna; Krumova, Rosina; Zolfagari, Asadeh; Banse, Rainer

    2015-07-01

    The present study was conducted to validate an adaptation of the Implicit Relational Assessment Procedure (IRAP) as an indirect latency-based measure of sexual orientation. Furthermore, reliability and criterion validity of the IRAP were compared to two established indirect measures of sexual orientation: a Choice Reaction Time task (CRT) and a Viewing Time (VT) task. A sample of 87 heterosexual and 35 gay men completed all three indirect measures in an online study. The IRAP and the VT predicted sexual orientation nearly perfectly. Both measures also showed a considerable amount of convergent validity. Reliabilities (internal consistencies) reached satisfactory levels. In contrast, the CRT did not tap into sexual orientation in the present study. In sum, the VT measure performed best, with the IRAP showing only slightly lower reliability and criterion validity, whereas the CRT did not yield any evidence of reliability or criterion validity in the present research. The results were discussed in the light of specific task properties of the indirect latency-based measures (task-relevance vs. task-irrelevance).

  19. Digital transillumination in caries detection versus radiographic and clinical methods: an in-vivo study

    PubMed Central

    Lara-Capi, Cynthia; Lingström, Peter; Lai, Gianfranco; Cocco, Fabio; Simark-Mattsson, Charlotte; Campus, Guglielmo

    2017-01-01

    Objectives: This article aimed to evaluate: (a) the agreement between a near-infrared light transillumination device and clinical and radiographic examinations in caries lesion detection and (b) the reliability of images captured by the transillumination device. Methods: Two calibrated examiners evaluated the caries status in premolars and molars on 52 randomly selected subjects by comparing the transillumination device with a clinical examination for the occlusal surfaces and by comparing the transillumination device with a radiographic examination (bitewing radiographs) for the approximal surfaces. Forty-eight trained dental hygienists evaluated 30 randomly selected images and reevaluated them 1 month later. Results: A high concordance between the transillumination method and clinical examination (kappa = 0.99) was detected for occlusal caries lesions, while for approximal surfaces, the transillumination device identified more lesions than bitewing radiographs (kappa = 0.91). At the dentinal level, the two methods identified the same number of caries lesions (kappa = 1), whereas more approximal lesions were recorded using the transillumination device in the enamel (kappa = 0.24). The intraexaminer reliability was substantial/almost perfect in 59.4% of the participants. Conclusions: The transillumination method showed a high concordance compared with traditional methods (clinical examination and bitewing radiographs). Caries detection reliability using the transillumination device images showed a high intraexaminer agreement. Transillumination proved to be a reliable method and as effective as traditional methods in caries detection. PMID:28191797

  20. Exercise-Induced Hypoalgesia After Isometric Wall Squat Exercise: A Test-Retest Reliability Study.

    PubMed

    Vaegter, Henrik Bjarke; Lyng, Kristian Damgaard; Yttereng, Fredrik Wannebo; Christensen, Mads Holst; Sørensen, Mathias Brandhøj; Graven-Nielsen, Thomas

    2018-05-19

    Isometric exercises decrease pressure pain sensitivity in exercising and nonexercising muscles, a phenomenon known as exercise-induced hypoalgesia (EIH). No studies have assessed the test-retest reliability of EIH after isometric exercise. This study investigated the EIH on pressure pain thresholds (PPTs) after an isometric wall squat exercise. The relative and absolute test-retest reliability of the PPT as a test stimulus and the EIH response in exercising and nonexercising muscles were calculated. In two identical sessions, PPTs of the thigh and shoulder were assessed before and after three minutes of quiet rest and three minutes of wall squat exercise, respectively, in 35 healthy subjects. The relative test-retest reliability of PPT and EIH was determined using analysis of variance models, Pearson's r, and intraclass correlations (ICCs). The absolute test-retest reliability of EIH was determined based on PPT standard errors of measurement and Cohen's kappa for agreement between sessions. Squat increased PPTs of exercising and nonexercising muscles by 16.8% ± 16.9% and 6.7% ± 12.9%, respectively (P < 0.001), with no significant differences between sessions. PPTs within and between sessions showed moderately strong correlations (r ≥ 0.74) and excellent (ICC ≥ 0.84) within-session (rest) and between-session test-retest reliability. EIH responses of exercising and nonexercising muscles showed no systematic errors between sessions; however, the relative test-retest reliability was low (ICCs = 0.03-0.43), and agreement in EIH responders and nonresponders between sessions was not significant (κ < 0.13, P > 0.43). A wall squat exercise increased PPTs compared with quiet rest; however, the relative and absolute reliability of the EIH response was poor. Future research is warranted to investigate the reliability of EIH in clinical pain populations.
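
    The between-session agreement statistic named in this abstract, Cohen's kappa, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the responder/nonresponder labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: share of subjects given the same label both times.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal label frequencies of each session.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical responder (1) / nonresponder (0) labels for 10 subjects.
session_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
session_2 = [1, 0, 1, 0, 1, 1, 0, 1, 0, 0]
kappa = cohens_kappa(session_1, session_2)
print(f"kappa = {kappa:.2f}")
```

    Because kappa corrects the raw agreement for the agreement expected by chance, it can be low even when many subjects receive the same label in both sessions.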

  1. Development of an Integrated Agricultural Planning Model Considering Climate Change

    NASA Astrophysics Data System (ADS)

    Santikayasa, I. P.

    2016-01-01

    The goal of this study is to develop an agricultural planning model to sustain future water use based on estimates of crop water requirement, water availability and future climate projections. For this purpose, the Citarum river basin, located in West Java, Indonesia, was selected as the study area. Two emission scenarios, A2 and B2, were selected. For the crop water requirement estimation, the output of the HadCM3 AOGCM is statistically downscaled using SDSM and used as the input for the WEAP model developed by SEI (Stockholm Environment Institute). The reliability of water use is assessed by comparing the irrigation water demand with the water allocation for the irrigation area. The water supply resources are assessed using the water planning tool. This study shows that temperature and precipitation over the study area are projected to increase in the future. Water availability is projected to increase under both the A2 and B2 emission scenarios. The irrigation water requirement is expected to decrease under both scenarios. Comparing the irrigation water demand with the water allocation for irrigation, the reliability of agricultural water use is expected to change in the 2050s and 2080s but not in the 2020s. The reliability under the A2 scenario is expected to be higher than under the B2 scenario. The combination of WEAP and SDSM is useful for assessing and allocating the water resources in the region.
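
    The abstract does not define its reliability measure. One common time-based definition is the fraction of time steps in which the allocated water fully covers the demand, and the sketch below assumes that definition. The function name and the monthly values are hypothetical.

```python
def supply_reliability(demand, allocation, tolerance=1e-9):
    """Fraction of time steps in which the allocation covers the demand."""
    if len(demand) != len(allocation) or not demand:
        raise ValueError("need two non-empty series of equal length")
    # A time step counts as satisfied when allocation >= demand
    # (with a small tolerance for floating-point rounding).
    met = sum(1 for d, a in zip(demand, allocation) if a + tolerance >= d)
    return met / len(demand)


# Hypothetical monthly irrigation demand and allocation (same volume unit).
monthly_demand = [12.0, 15.0, 18.0, 20.0, 16.0, 10.0]
monthly_allocation = [12.0, 15.0, 14.0, 17.0, 16.0, 10.0]
reliability = supply_reliability(monthly_demand, monthly_allocation)
print(f"reliability = {reliability:.0%}")
```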

  2. Measuring decisional certainty among women seeking abortion.

    PubMed

    Ralph, Lauren J; Foster, Diana Greene; Kimport, Katrina; Turok, David; Roberts, Sarah C M

    2017-03-01

    Evaluating decisional certainty is an important component of medical care, including preabortion care. However, minimal research has examined how to measure certainty with reliability and validity among women seeking abortion. We examine whether the Decisional Conflict Scale (DCS), a measure widely used in other health specialties and considered the gold standard for measuring this construct, and the Taft-Baker Scale (TBS), a measure developed by abortion counselors, are valid and reliable for use with women seeking abortion and predict the decision to continue the pregnancy. Eligible women at four family planning facilities in Utah completed baseline demographic surveys and scales before their abortion information visit and follow-up interviews 3 weeks later. For each scale, we calculated mean scores and explored factors associated with high uncertainty. We evaluated internal reliability using Cronbach's alpha and assessed predictive validity by examining whether higher scale scores, indicative of decisional uncertainty or conflict, were associated with still being pregnant at follow-up. Five hundred women completed baseline surveys; two-thirds (63%) completed follow-up, at which time 11% were still pregnant. Mean scores on the DCS (15.5/100) and TBS (12.4/100) indicated low uncertainty, with acceptable reliability (α=.93 and .72, respectively). Higher scores on each scale were significantly and positively associated with still being pregnant at follow-up in both unadjusted and adjusted analyses. The DCS and TBS demonstrate acceptable reliability and validity among women seeking abortion care. Comparing scores on the DCS in this population to other studies of decision making suggests that the level of uncertainty in abortion decision making is comparable to or lower than other health decisions. 
    The high levels of decisional certainty found in this study challenge the narrative that abortion decision making is exceptional compared to other healthcare decisions and requires additional protection such as laws mandating waiting periods, counseling and ultrasound viewing. Copyright © 2016. Published by Elsevier Inc.
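
    Cronbach's alpha, the internal-reliability statistic used in this study, follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch is given below; the response matrix is hypothetical and unrelated to the DCS or TBS data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores is a list of respondents,
    each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*item_scores))
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 5 respondents x 3 items.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```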

  3. Assessment of the Validity of the Research Diagnostic Criteria for Temporomandibular Disorders: Overview and Methodology

    PubMed Central

    Schiffman, Eric L.; Truelove, Edmond L.; Ohrbach, Richard; Anderson, Gary C.; John, Mike T.; List, Thomas; Look, John O.

    2011-01-01

    AIMS The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. An overview is presented, including Axis I and II methodology and descriptive statistics for the study participant sample. This paper details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. Validity testing for the Axis II biobehavioral instruments was based on previously validated reference standards. METHODS The Axis I reference standards were based on the consensus of 2 criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion exam reliability was also assessed within study sites. RESULTS Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion exam agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). CONCLUSION The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods. PMID:20213028

  4. Reliability of Task-Based fMRI for Preoperative Planning: A Test-Retest Study in Brain Tumor Patients and Healthy Controls

    PubMed Central

    Morrison, Melanie A.; Churchill, Nathan W.; Cusimano, Michael D.; Schweizer, Tom A.; Das, Sunit; Graham, Simon J.

    2016-01-01

    Background Functional magnetic resonance imaging (fMRI) continues to develop as a clinical tool for patients with brain cancer, offering data that may directly influence surgical decisions. Unfortunately, routine integration of preoperative fMRI has been limited by concerns about reliability. Many pertinent studies have been undertaken involving healthy controls, but work involving brain tumor patients has been limited. To develop fMRI fully as a clinical tool, it will be critical to examine these reliability issues among patients with brain tumors. The present work is the first to extensively characterize differences in activation map quality between brain tumor patients and healthy controls, including the effects of tumor grade and the chosen behavioral testing paradigm on reliability outcomes. Method Test-retest data were collected for a group of low-grade (n = 6) and high-grade glioma (n = 6) patients, and for matched healthy controls (n = 12), who performed motor and language tasks during a single fMRI session. Reliability was characterized by the spatial overlap and displacement of brain activity clusters, BOLD signal stability, and the laterality index. Significance testing was performed to assess differences in reliability between the patients and controls, and low-grade and high-grade patients; as well as between different fMRI testing paradigms. Results There were few significant differences in fMRI reliability measures between patients and controls. Reliability was significantly lower when comparing high-grade tumor patients to controls, or to low-grade tumor patients. The motor task produced more reliable activation patterns than the language tasks, as did the rhyming task in comparison to the phonemic fluency task. Conclusion In low-grade glioma patients, fMRI data are as reliable as healthy control subjects. For high-grade glioma patients, further investigation is required to determine the underlying causes of reduced reliability. 
    To maximize reliability outcomes, testing paradigms should be carefully selected to generate robust activation patterns. PMID:26894279
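
    The abstract lists the laterality index among its reliability measures without giving a formula. A commonly used definition is LI = (L - R) / (L + R), where L and R summarize activation in the left and right hemispheres. The sketch below assumes that definition and uses hypothetical suprathreshold voxel counts.

```python
def laterality_index(left, right):
    """LI = (L - R) / (L + R); +1 is fully left-lateralized, -1 fully right."""
    total = left + right
    if total <= 0:
        raise ValueError("no activation in either hemisphere")
    return (left - right) / total


# Hypothetical suprathreshold voxel counts from a test and a retest run.
li_test = laterality_index(left=300, right=100)
li_retest = laterality_index(left=280, right=120)
# A small absolute change between runs suggests a stable laterality estimate.
li_change = abs(li_test - li_retest)
print(li_test, li_retest, round(li_change, 3))
```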

  5. Reliability and validity of the Japanese Migraine Disability Assessment (MIDAS) Questionnaire.

    PubMed

    Iigaya, Miho; Sakai, Fumihiko; Kolodner, Kenneth B; Lipton, Richard B; Stewart, Walter F

    2003-04-01

    This study was designed to assess the test-retest reliability, internal consistency, and validity of a Japanese translation of the Migraine Disability Assessment (MIDAS) Questionnaire in a sample of Japanese patients with headache. Previous studies have demonstrated that the English-language version of the MIDAS Questionnaire is a reliable and valid instrument for the assessment of migraine-related disability. Any translations of the MIDAS Questionnaire must also be assessed for reliability and validity. Study participants were recruited from the patient population attending either the Neurology Department of Kitasato University or an affiliated clinic. Participants were eligible for study entry if they had 6 or more primary headaches per year. For reliability testing, participants completed the MIDAS Questionnaire on 2 occasions, exactly 2 weeks apart. To assess validity, patients were also invited to participate in a 90-day daily diary study. Composite measures from the 90-day diaries were compared to equivalent MIDAS measures (ie, 5 questions on headache-related disability and 1 question each on average pain intensity and headache frequency in the last 3 months) and to the total MIDAS score obtained from a third MIDAS Questionnaire completed at the end of this 90-day period. One hundred one patients between the ages of 21 and 77 years were recruited (81 women and 20 men). Ninety-nine patients (80 women and 19 men) participated in the diary study. At baseline, 46.5% of patients were MIDAS grade I or II (minimal, mild, or infrequent disability), 22.2% were MIDAS grade III (moderate disability), and 31.3% were MIDAS grade IV (severe disability). Test-retest Spearman correlations for the 5 disability questions and the questions on average pain intensity and headache frequency ranged from 0.59 to 0.80 (P<.0001). The test-retest Spearman correlation coefficient for the total MIDAS score was 0.83 (P<.0001). 
    The degree to which individual MIDAS questions correlated with the diary-based measures ranged from 0.36 to 0.88. The correlation between the total MIDAS score and the equivalent diary-based measure was 0.66. In general, the mean and median values for the MIDAS items and total MIDAS score were similar to the means and medians for the diary-based measures. However, the mean MIDAS scores for the number of days on which headache was experienced and the number of missed workdays were significantly different compared to the diary-based estimates for these items (P<.05). In addition, the mean MIDAS score for the number of days of missed housework was significantly higher than the corresponding diary-based estimate (P<.01). The results from this study show that the Japanese translation of the MIDAS Questionnaire is comparable with the English-language version in terms of reliability and validity.

  6. [Autism Spectrum Disorder in DSM-5 - concept, validity, and reliability, impact on clinical care and future research].

    PubMed

    Freitag, Christine M

    2014-05-01

    Autism Spectrum Disorder (ASD) in DSM-5 comprises the former DSM-IV-TR diagnoses of Autistic Disorder, Asperger's Disorder and PDD-nos. The criteria for ASD in DSM-5 were considerably revised from those of ICD-10 and DSM-IV-TR. The present article compares the diagnostic criteria, presents studies on the validity and reliability of ASD, and discusses open questions. It ends with a clinical and research perspective.

  7. Reliability of an fMRI Paradigm for Emotional Processing in a Multisite Longitudinal Study

    PubMed Central

    Gee, Dylan G.; McEwen, Sarah C.; Forsyth, Jennifer K.; Haut, Kristen M.; Bearden, Carrie E.; Addington, Jean; Goodyear, Bradley; Cadenhead, Kristin S.; Mirzakhanian, Heline; Cornblatt, Barbara A.; Olvet, Doreen; Mathalon, Daniel H.; McGlashan, Thomas H.; Perkins, Diana O.; Belger, Aysenil; Seidman, Larry J.; Thermenos, Heidi; Tsuang, Ming T.; van Erp, Theo G.M.; Walker, Elaine F.; Hamann, Stephan; Woods, Scott W.; Constable, Todd; Cannon, Tyrone D.

    2015-01-01

    Multisite neuroimaging studies can facilitate the investigation of brain-related changes in many contexts, including patient groups that are relatively rare in the general population. Though multisite studies have characterized the reliability of brain activation during working memory and motor functional magnetic resonance imaging tasks, emotion processing tasks, pertinent to many clinical populations, remain less explored. A traveling participants study was conducted with eight healthy volunteers scanned twice on consecutive days at each of the eight North American Longitudinal Prodrome Study sites. Tests derived from generalizability theory showed excellent reliability in the amygdala (Eρ²=0.82), inferior frontal gyrus (IFG; Eρ²=0.83), anterior cingulate cortex (ACC; Eρ²=0.76), insula (Eρ²=0.85), and fusiform gyrus (Eρ²=0.91) for maximum activation and fair to excellent reliability in the amygdala (Eρ²=0.44), IFG (Eρ²=0.48), ACC (Eρ²=0.55), insula (Eρ²=0.42), and fusiform gyrus (Eρ²=0.83) for mean activation across sites and test days. For the amygdala, habituation (Eρ²=0.71) was more stable than mean activation. In a second investigation, data from 111 healthy individuals across sites were aggregated in a voxelwise, quantitative meta-analysis. When compared with a mixed effects model controlling for site, both approaches identified robust activation in regions consistent with expected results based on prior single-site research. Overall, regions central to emotion processing showed strong reliability in the traveling participants study and robust activation in the aggregation study. These results support the reliability of blood oxygen level-dependent signal in emotion processing areas across different sites and scanners and may inform future efforts to increase efficiency and enhance knowledge of rare conditions in the population through multisite neuroimaging paradigms. PMID:25821147

  8. Reproducibility assessment of brain responses to visual food stimuli in adults with overweight and obesity.

    PubMed

    Drew Sayer, R; Tamer, Gregory G; Chen, Ningning; Tregellas, Jason R; Cornier, Marc-Andre; Kareken, David A; Talavage, Thomas M; McCrory, Megan A; Campbell, Wayne W

    2016-10-01

    The brain's reward system influences ingestive behavior and subsequently obesity risk. Functional magnetic resonance imaging (fMRI) is a common method for investigating brain reward function. This study sought to assess the reproducibility of fasting-state brain responses to visual food stimuli using BOLD fMRI. A priori brain regions of interest included bilateral insula, amygdala, orbitofrontal cortex, caudate, and putamen. Fasting-state fMRI and appetite assessments were completed by 28 women (n = 16) and men (n = 12) with overweight or obesity on 2 days. Reproducibility was assessed by comparing mean fasting-state brain responses and measuring test-retest reliability of these responses on the two testing days. Mean fasting-state brain responses on day 2 were reduced compared with day 1 in the left insula and right amygdala, but mean day 1 and day 2 responses were not different in the other regions of interest. With the exception of the left orbitofrontal cortex response (fair reliability), test-retest reliabilities of brain responses were poor or unreliable. fMRI-measured responses to visual food cues in adults with overweight or obesity show relatively good mean-level reproducibility but considerable within-subject variability. Poor test-retest reliability reduces the likelihood of observing true correlations and increases the necessary sample sizes for studies. © 2016 The Obesity Society.

  9. The Comparative Reliability and Feasibility of the Past-Year Canadian Diet History Questionnaire II: Comparison of the Paper and Web Versions

    PubMed Central

    Lo Siou, Geraldine; Csizmadi, Ilona; Boucher, Beatrice A.; Akawung, Alianu K.; Whelan, Heather K.; Sharma, Michelle; Al Rajabi, Ala; Vena, Jennifer E.; Kirkpatrick, Sharon I.; Koushik, Anita; Massarelli, Isabelle; Rondeau, Isabelle; Robson, Paula J.

    2017-01-01

    Advances in technology-enabled dietary assessment include the advent of web-based food frequency questionnaires, which may reduce costs and researcher burden but may introduce new challenges related to internet connectivity and computer literacy. The purpose of this study was to evaluate the intra- and inter-version reliability, feasibility and acceptability of the paper and web Canadian Diet History Questionnaire II (CDHQ-II) in a sub-sample of 648 adults (aged 39–81 years) recruited from Alberta’s Tomorrow Project. Participants were randomly assigned to one of two groups: (1) paper, web, paper; or (2) web, paper, web over a six-week period. With few exceptions, no statistically significant differences in mean nutrient intake were found in the intra- and inter-version reliability analyses. The majority of participants indicated future willingness to complete the CDHQ-II online, and 59% indicated a preference for the web over the paper version. Findings indicate that, in this population of adults drawn from an existing cohort, the CDHQ-II may be administered in paper or web modalities (increasing flexibility for questionnaire delivery), and the nutrient estimates obtained with either version are comparable. We recommend that other studies explore the feasibility and reliability of different modes of administration of dietary assessment instruments prior to widespread implementation. PMID:28208819

  10. Optimal sample sizes for the design of reliability studies: power consideration.

    PubMed

    Shieh, Gwowen

    2014-09-01

    Intraclass correlation coefficients are used extensively to measure the reliability or degree of resemblance among group members in multilevel research. This study concerns the problem of the necessary sample size to ensure adequate statistical power for hypothesis tests concerning the intraclass correlation coefficient in the one-way random-effects model. In view of the incomplete and problematic numerical results in the literature, the approximate sample size formula constructed from Fisher's transformation is reevaluated and compared with an exact approach across a wide range of model configurations. These comprehensive examinations showed that the Fisher transformation method is appropriate only under limited circumstances, and therefore it is not recommended as a general method in practice. For advance design planning of reliability studies, the exact sample size procedures are fully described and illustrated for various allocation and cost schemes. Corresponding computer programs are also developed to implement the suggested algorithms.
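
    For orientation, the intraclass correlation coefficient in a balanced one-way random-effects model can be estimated from the ANOVA mean squares as (MSB - MSW) / (MSB + (k - 1) MSW). The sketch below shows only this point estimate, not the exact sample size procedures or the programs described in the abstract; the function name and scores are hypothetical.

```python
def icc_oneway(groups):
    """ICC estimate for a balanced one-way random-effects design.

    groups: list of n groups (e.g. subjects), each with k measurements.
    """
    n, k = len(groups), len(groups[0])
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 groups with the same number (>= 2) of values")
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    group_means = [sum(g) / k for g in groups]
    # Between-group and within-group mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("all measurements are identical; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical test-retest scores: 4 subjects x 2 sessions.
scores = [[10, 11], [14, 15], [20, 19], [25, 27]]
icc = icc_oneway(scores)
print(f"ICC = {icc:.3f}")
```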

  11. Temporal similarity perfusion mapping: A standardized and model-free method for detecting perfusion deficits in stroke

    PubMed Central

    Song, Sunbin; Luby, Marie; Edwardson, Matthew A.; Brown, Tyler; Shah, Shreyansh; Cox, Robert W.; Saad, Ziad S.; Reynolds, Richard C.; Glen, Daniel R.; Cohen, Leonardo G.; Latour, Lawrence L.

    2017-01-01

    Introduction Interpretation of the extent of perfusion deficits in stroke MRI is highly dependent on the method used for analyzing the perfusion-weighted signal intensity time-series after gadolinium injection. In this study, we introduce a new model-free standardized method of temporal similarity perfusion (TSP) mapping for perfusion deficit detection and test its ability and reliability in acute ischemia. Materials and methods Forty patients with an ischemic stroke or transient ischemic attack were included. Two blinded readers compared real-time generated interactive maps and automatically generated TSP maps to traditional TTP/MTT maps for presence of perfusion deficits. Lesion volumes were compared for volumetric inter-rater reliability, spatial concordance between perfusion deficits and healthy tissue and contrast-to-noise ratio (CNR). Results Perfusion deficits were correctly detected in all patients with acute ischemia. Inter-rater reliability was higher for TSP when compared to TTP/MTT maps and there was a high similarity between the lesion volumes depicted on TSP and TTP/MTT (r(18) = 0.73). The Pearson's correlation between lesions calculated on TSP and traditional maps was high (r(18) = 0.73, p<0.0003); however, the effective CNR was greater for TSP compared to TTP (352.3 vs 283.5, t(19) = 2.6, p<0.03) and MTT (228.3, t(19) = 2.8, p<0.03). Discussion TSP maps provide a reliable and robust model-free method for accurate perfusion deficit detection and improve lesion delineation compared to traditional methods. This simple method is also computationally faster and more easily automated than model-based methods. This method can potentially improve the speed and accuracy in perfusion deficit detection for acute stroke treatment and clinical trial inclusion decision-making. PMID:28973000

  12. Assessment of the Maximal Split-Half Coefficient to Estimate Reliability

    ERIC Educational Resources Information Center

    Thompson, Barry L.; Green, Samuel B.; Yang, Yanyun

    2010-01-01

    The maximal split-half coefficient is computed by calculating all possible split-half reliability estimates for a scale and then choosing the maximal value as the reliability estimate. Osburn compared the maximal split-half coefficient with 10 other internal consistency estimates of reliability and concluded that it yielded the most consistently…
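
    The procedure described in this abstract (enumerate every split of the items into two halves, estimate reliability for each, keep the largest value) can be sketched as follows. The abstract does not say which split-half estimate is used; this sketch assumes the Spearman-Brown corrected correlation between half scores, an even number of items, and hypothetical data.

```python
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a half score has zero variance")
    return sxy / (sxx * syy) ** 0.5


def maximal_split_half(item_scores):
    """Largest Spearman-Brown corrected split-half estimate over all
    equal-size splits; item_scores is respondents x items."""
    k = len(item_scores[0])
    if k < 2 or k % 2:
        raise ValueError("this sketch assumes an even number of items")
    best = None
    # Keep item 0 in the first half so each split is counted only once.
    for rest in combinations(range(1, k), k // 2 - 1):
        first = (0,) + rest
        second = [j for j in range(k) if j not in first]
        half_a = [sum(row[j] for j in first) for row in item_scores]
        half_b = [sum(row[j] for j in second) for row in item_scores]
        r = pearson_r(half_a, half_b)
        if r <= -1.0:
            continue  # Spearman-Brown correction is undefined at r = -1
        corrected = 2 * r / (1 + r)
        best = corrected if best is None else max(best, corrected)
    return best


# Hypothetical scores: 6 respondents x 4 items (3 distinct splits).
data = [
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [1, 2, 1, 2],
    [4, 5, 4, 4],
    [2, 3, 2, 3],
]
max_split_half = maximal_split_half(data)
print(f"maximal split-half = {max_split_half:.3f}")
```

    The number of splits grows combinatorially with the number of items, so exhaustive enumeration like this is practical only for short scales.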

  13. Using archetypes to create user panels for usability studies: Streamlining focus groups and user studies.

    PubMed

    Stavrakos, S-K; Ahmed-Kristensen, S; Goldman, T

    2016-09-01

    Designers at the conceptual phase of products such as headphones stress the importance of comfort (e.g. by executing comfort studies) and the need for a reliable user panel. This paper proposes a methodology to establish a reliable user panel that represents large populations and validates the proposed framework for predicting comfort factors, such as physical fit. Data from 200 heads were analyzed by forming clusters, and 9 archetypal people were identified from a database of 200 people's ears. The archetypes were validated by comparing the archetypes' responses on physical fit against those of 20 participants interacting with 6 headsets. This paper suggests a new method of selecting representative user samples for prototype testing, in contrast to costly and time-consuming methods that rely on the analysis of human geometry of large populations. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Region of Interest Correction Factors Improve Reliability of Diffusion Imaging Measures Within and Across Scanners and Field Strengths

    PubMed Central

    Venkatraman, Vijay K; Gonzalez, Christopher E.; Landman, Bennett; Goh, Joshua; Reiter, David A.; An, Yang; Resnick, Susan M.

    2017-01-01

    Diffusion tensor imaging (DTI) measures are commonly used as imaging markers to investigate individual differences in relation to behavioral and health-related characteristics. However, the ability to detect reliable associations in cross-sectional or longitudinal studies is limited by the reliability of the diffusion measures. Several studies have examined reliability of diffusion measures within (i.e. intra-site) and across (i.e. inter-site) scanners with mixed results. Our study compares the test-retest reliability of diffusion measures within and across scanners and field strengths in cognitively normal older adults with a follow-up interval less than 2.25 years. Intra-class correlation (ICC) and coefficient of variation (CoV) of fractional anisotropy (FA) and mean diffusivity (MD) were evaluated in sixteen white matter and twenty-six gray matter bilateral regions. The ICC for intra-site reliability (0.32 to 0.96 for FA and 0.18 to 0.95 for MD in white matter regions; 0.27 to 0.89 for MD and 0.03 to 0.79 for FA in gray matter regions) and inter-site reliability (0.28 to 0.95 for FA in white matter regions, 0.02 to 0.86 for MD in gray matter regions) with longer follow-up intervals were similar to earlier studies using shorter follow-up intervals. The reliability of comparisons across field strengths was lower than intra- and inter-site reliability. Within and across scanner comparisons showed that diffusion measures were more stable in larger white matter regions (> 1500 mm³). For gray matter regions, the MD measure showed stability in specific regions and was not dependent on region size. Linear correction factors estimated from cross-sectional or longitudinal data improved the reliability across field strengths. 
    Our findings indicate that investigations relating diffusion measures to external variables must consider variable reliability across the distinct regions of interest and that correction factors can be used to improve consistency of measurement across field strengths. An important result of this work is that inter-scanner and field strength effects can be partially mitigated with linear correction factors specific to regions of interest. These data-driven linear correction techniques can be applied in cross-sectional or longitudinal studies. PMID:26146196

  15. Test-retest reliability and construct validity of the ENERGY-parent questionnaire on parenting practices, energy balance-related behaviours and their potential behavioural determinants: the ENERGY-project.

    PubMed

    Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes

    2012-08-13

    Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.

  16. Screening of Cognitive Impairment in Schizophrenia: Reliability, Sensitivity, and Specificity of the Repeatable Battery for the Assessment of Neuropsychological Status in a Spanish Sample.

    PubMed

    De la Torre, Gabriel G; Perez, Maria J; Ramallo, Miguel A; Randolph, Christopher; González-Villegas, Macarena Bernal

    2016-04-01

    In recent years, a number of studies focusing on the evaluation of neuropsychological deficits in individuals with schizophrenia have shown deficits that include several cognitive functions. Attention deficits as well as memory or executive function deficits are common in this kind of disorder together with sustained attention problems, working memory deficiencies, and problem-solving difficulties, among many others. Currently, the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) is gaining special importance in the evaluation of the cognitive deficits associated with schizophrenia. In this article, we describe an RBANS screening in a sample of 88 Spanish patients diagnosed with schizophrenia. We also aimed to check the battery's reliability, sensitivity, and specificity in the studied sample. We performed a comparative study with 88 healthy participants. The results showed a reliability index value of α = .795 and an item value of α = .762. For total test reliability, we obtained an index value of α = .761 and an item value of α = .762. Sensitivity score was 87.5% and specificity 86.4%. RBANS obtained good reliability, sensitivity, and specificity scores and represents a good screening tool in detecting cognitive deficits associated with schizophrenia. © The Author(s) 2015.
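
    Sensitivity and specificity of a screening tool follow directly from a two-by-two classification table. The sketch below uses hypothetical counts: the abstract reports only the group sizes (88 patients, 88 healthy participants) and the two percentages, so the individual cell counts are an assumption chosen to be consistent with those figures, not data from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)   # proportion of true cases flagged by the test
    specificity = tn / (tn + fp)   # proportion of non-cases correctly cleared
    return sensitivity, specificity


# Hypothetical counts: 88 patients (77 flagged) and 88 controls (76 cleared).
sens, spec = sensitivity_specificity(tp=77, fn=11, tn=76, fp=12)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```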

  17. Comprehension of Written Grammar Test: Reliability and Known-Groups Validity Study With Hearing and Deaf and Hard-of-Hearing Students.

    PubMed

    Cannon, Joanna E; Hubley, Anita M; Millhoff, Courtney; Mazlouman, Shahla

    2016-01-01

    The aim of the current study was to gather validation evidence for the Comprehension of Written Grammar (CWG; Easterbrooks, 2010) receptive test of 26 grammatical structures of English print for use with children who are deaf and hard of hearing (DHH). Reliability and validity data were collected for 98 participants (49 DHH and 49 hearing) in Grades 2-6. The objectives were to: (a) examine 4-week test-retest reliability data; and (b) provide evidence of known-groups validity by examining expected differences between the groups on the CWG vocabulary pretest and main test, as well as selected structures. Results indicated excellent test-retest reliability estimates for CWG test scores. DHH participants performed statistically significantly lower on the CWG vocabulary pretest and main test than the hearing participants. Significantly lower performance by DHH participants on most expected grammatical structures (e.g., basic sentence patterns, auxiliary "be" singular/plural forms, tense, comparatives, and complementation) also provided known groups evidence. Overall, the findings of this study showed strong evidence of the reliability of scores and known group-based validity of inferences made from the CWG. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  18. Reliability of capturing foot parameters using digital scanning and the neutral suspension casting technique

    PubMed Central

    2011-01-01

    Background: A clinical study was conducted to determine the intra and inter-rater reliability of digital scanning and the neutral suspension casting technique to measure six foot parameters. The neutral suspension casting technique is a commonly utilised method for obtaining a negative impression of the foot prior to orthotic fabrication. Digital scanning offers an alternative to the traditional plaster of Paris techniques. Methods: Twenty-one healthy participants volunteered to take part in the study. Six casts and six digital scans were obtained from each participant by two raters of differing clinical experience. The foot parameters chosen for investigation were cast length (mm), forefoot width (mm), rearfoot width (mm), medial arch height (mm), lateral arch height (mm) and forefoot to rearfoot alignment (degrees). Intraclass correlation coefficients (ICC) with 95% confidence intervals (CI) were calculated to determine the intra and inter-rater reliability. Measurement error was assessed through the calculation of the standard error of the measurement (SEM) and smallest real difference (SRD). Results: ICC values for all foot parameters using digital scanning ranged between 0.81-0.99 for both intra and inter-rater reliability. For the neutral suspension casting technique, inter-rater reliability values ranged from 0.57-0.99 and intra-rater reliability values ranged from 0.36-0.99 for rater 1 and 0.49-0.99 for rater 2. Conclusions: The findings of this study indicate that digital scanning is a reliable technique, irrespective of clinical experience, with reduced measurement variability in all foot parameters investigated when compared to neutral suspension casting. PMID:21375757
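
    The standard error of the measurement and the smallest real difference are often derived from the ICC and the between-subject standard deviation. The sketch below uses the widely cited formulations SEM = SD × √(1 − ICC) and SRD = 1.96 × √2 × SEM. The abstract does not state which formulas the authors applied, and the input values are hypothetical.

```python
import math


def sem_and_srd(sd, icc):
    """Return (SEM, SRD) from a between-subject SD and a reliability coefficient."""
    sem = sd * math.sqrt(1.0 - icc)        # standard error of the measurement
    srd = 1.96 * math.sqrt(2.0) * sem      # smallest real difference at the 95% level
    return sem, srd


# Hypothetical inputs: SD of 4.0 mm across participants, ICC of 0.91.
sem, srd = sem_and_srd(sd=4.0, icc=0.91)
print(f"SEM = {sem:.2f} mm, SRD = {srd:.2f} mm")
```

    A change between two measurements smaller than the SRD cannot be distinguished from measurement error under these assumptions.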

  19. Time-Variant Reliability Analysis for Rubber O-Ring Seal Considering Both Material Degradation and Random Load

    PubMed Central

    Liao, Baopeng; Yan, Meichen; Zhang, Weifang; Zhou, Kun

    2017-01-01

    Due to the increase in working hours, the reliability of rubber O-ring seals used in hydraulic systems of transfer machines will change. While traditional methods can only analyze one of the material properties or seal properties, the failure of the O-ring is caused by these two factors together. In this paper, two factors are mainly analyzed: the degradation of material properties and load randomization by processing technology. Firstly, the two factors are defined in terms of material failure and seal failure, before the experimental methods of rubber materials are studied. Following this, the time-variant material properties through experiments and load distribution by monitoring the processing can be obtained. Thirdly, compressive stress and contact stress have been calculated, which was combined with the reliability model to acquire the time-variant reliability for the O-ring. Finally, the life prediction and effect of oil pressure were discussed, then compared with the actual situation. The results show a lifetime of 12 months for the O-ring calculated in this paper, and compared with the replacement records from the maintenance workshop, the result is credible. PMID:29053597

  20. An experimental evaluation of software redundancy as a strategy for improving reliability

    NASA Technical Reports Server (NTRS)

    Eckhardt, Dave E., Jr.; Caglayan, Alper K.; Knight, John C.; Lee, Larry D.; Mcallister, David F.; Vouk, Mladen A.; Kelly, John P. J.

    1990-01-01

    The strategy of using multiple versions of independently developed software as a means to tolerate residual software design faults is suggested by the success of hardware redundancy for tolerating hardware failures. Although, as generally accepted, the independence of hardware failures resulting from physical wearout can lead to substantial increases in reliability for redundant hardware structures, a similar conclusion is not immediate for software. The degree to which design faults are manifested as independent failures determines the effectiveness of redundancy as a method for improving software reliability. Interest in multi-version software centers on whether it provides an adequate measure of increased reliability to warrant its use in critical applications. The effectiveness of multi-version software is studied by comparing estimates of the failure probabilities of these systems with the failure probabilities of single versions. The estimates are obtained under a model of dependent failures and compared with estimates obtained when failures are assumed to be independent. The experimental results are based on twenty versions of an aerospace application developed and certified by sixty programmers from four universities. Descriptions of the application, development and certification processes, and operational evaluation are given together with an analysis of the twenty versions.

  1. Anthropometric Measurement Standardization in the US-Affiliated Pacific: Report from the Children’s Healthy Living Program

    PubMed Central

    LI, FENFANG; WILKENS, LYNNE R.; NOVOTNY, RACHEL; FIALKOWSKI, MARIE K.; PAULINO, YVETTE C.; NELSON, RANDALL; BERSAMIN, ANDREA; MARTIN, URSULA; DEENIK, JONATHAN; BOUSHEY, CAROL J.

    2016-01-01

    Objectives: Anthropometric standardization is essential to obtain reliable and comparable data from different geographical regions. The purpose of this study is to describe anthropometric standardization procedures and findings from the Children’s Healthy Living (CHL) Program, a study on childhood obesity in 11 jurisdictions in the US-Affiliated Pacific Region, including Alaska and Hawai‘i. Methods: Zerfas criteria were used to compare the measurement components (height, waist, and weight) between each trainee and a single expert anthropometrist. In addition, intra- and inter-rater technical error of measurement (TEM), coefficient of reliability, and average bias relative to the expert were computed. Results: From September 2012 to December 2014, 79 trainees participated in at least 1 of 29 standardization sessions. A total of 49 trainees passed either standard or alternate Zerfas criteria and were qualified to assess all three measurements in the field. Standard Zerfas criteria were difficult to achieve: only 2 of 79 trainees passed at their first training session. Intra-rater TEM estimates for the 49 trainees compared well with the expert anthropometrist. Average biases were within acceptable limits of deviation from the expert. Coefficient of reliability was above 99% for all three anthropometric components. Conclusions: Standardization based on comparison with a single expert ensured the comparability of measurements from the 49 trainees who passed the criteria. The anthropometric standardization process and protocols followed by CHL resulted in 49 standardized field anthropometrists and have helped build capacity in the health workforce in the Pacific Region. PMID:26457888
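
    The technical error of measurement and the coefficient of reliability mentioned above can be sketched for the simple case of two repeated measurements per child. The formulas used here, TEM = √(Σd² / 2N) and R = 1 − TEM² / SD², are the conventional anthropometric definitions. The abstract does not give the exact computation used in the CHL Program, and the heights below are invented for illustration.

```python
import math
from statistics import variance


def technical_error_of_measurement(first, second):
    """TEM for paired repeat measurements of the same subjects."""
    n = len(first)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * n))


def coefficient_of_reliability(first, second):
    """Share of total variance that is not attributable to measurement error."""
    tem = technical_error_of_measurement(first, second)
    total_variance = variance(first + second)   # pooled sample variance
    return 1.0 - tem ** 2 / total_variance


# Hypothetical heights (cm) of four children, each measured twice.
first_pass = [100.0, 110.0, 120.0, 130.0]
second_pass = [100.2, 109.8, 120.2, 129.8]

tem = technical_error_of_measurement(first_pass, second_pass)
reliability = coefficient_of_reliability(first_pass, second_pass)
print(f"TEM = {tem:.3f} cm, R = {reliability:.4f}")
```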

  2. Assessing Pupils' Skills in Experimentation

    ERIC Educational Resources Information Center

    Hammann, Marcus; Phan, Thi Thanh Hoi; Ehmer, Maike; Grimm, Tobias

    2008-01-01

    This study is concerned with different forms of assessment of pupils' skills in experimentation. The findings of three studies are reported. Study 1 investigates whether it is possible to develop reliable multiple-choice tests for the skills of forming hypotheses, designing experiments and analysing experimental data. Study 2 compares scores from…

  3. Validation of heart rate extraction through an iPhone accelerometer.

    PubMed

    Kwon, Sungjun; Lee, Jeongsu; Chung, Gih Sung; Park, Kwang Suk

    2011-01-01

    Ubiquitous medical technology may provide advanced utility for evaluating the status of the patient beyond the clinical environment. The iPhone provides the capacity to measure the heart rate, as it contains a 3-axis accelerometer that is sufficiently sensitive to perceive tiny body movements caused by heart pumping. In this preliminary study, an iPhone was tested and evaluated as a reliable heart rate extractor for medical purposes by comparison with a reference electrocardiogram (ECG). When the heart rate extracted from the acquired acceleration data was compared with the heart rate extracted from the ECG reference signal, the iPhone demonstrated sufficient accuracy and consistency as a heart rate extractor.

  4. The Movement Imagery Questionnaire-Revised, Second Edition (MIQ-RS) Is a Reliable and Valid Tool for Evaluating Motor Imagery in Stroke Populations

    PubMed Central

    Butler, Andrew J.; Cazeaux, Jennifer; Fidler, Anna; Jansen, Jessica; Lefkove, Nehama; Gregg, Melanie; Hall, Craig; Easley, Kirk A.; Shenvi, Neeta; Wolf, Steven L.

    2012-01-01

    Mental imagery can improve motor performance in stroke populations when combined with physical therapy. Valid and reliable instruments to evaluate the imagery ability of stroke survivors are needed to maximize the benefits of mental imagery therapy. The purposes of this study were to: examine and compare the test-retest intra-rater reliability of the Movement Imagery Questionnaire-Revised, Second Edition (MIQ-RS) in stroke survivors and able-bodied controls, examine internal consistency of the visual and kinesthetic items of the MIQ-RS, determine if the MIQ-RS includes both the visual and kinesthetic dimensions of mental imagery, correlate impairment and motor imagery scores, and investigate the criterion validity of the MIQ-RS in stroke survivors by comparing the results to the KVIQ-10. Test-retest analysis indicated good levels of reliability (ICC range: .83–.99) and internal consistency (Cronbach α: .95–.98) of the visual and kinesthetic subscales in both groups. The two-factor structure of the MIQ-RS was supported by factor analysis, with the visual and kinesthetic components accounting for 88.6% and 83.4% of the total variance in the able-bodied and stroke groups, respectively. The MIQ-RS is a valid and reliable instrument in the stroke population examined and able-bodied populations and therefore useful as an outcome measure for motor imagery ability. PMID:22474504

  5. Exploring the validity and reliability of a questionnaire for evaluating veterinary clinical teachers' supervisory skills during clinical rotations.

    PubMed

    Boerboom, T B B; Dolmans, D H J M; Jaarsma, A D C; Muijtjens, A M M; Van Beukelen, P; Scherpbier, A J J A

    2011-01-01

    Feedback to aid teachers in improving their teaching requires validated evaluation instruments. When implementing an evaluation instrument in a different context, it is important to collect validity evidence from multiple sources. We examined the validity and reliability of the Maastricht Clinical Teaching Questionnaire (MCTQ) as an instrument to evaluate individual clinical teachers during short clinical rotations in veterinary education. We examined four sources of validity evidence: (1) Content was examined based on theory of effective learning. (2) Response process was explored in a pilot study. (3) Internal structure was assessed by confirmatory factor analysis using 1086 student evaluations and reliability was examined utilizing generalizability analysis. (4) Relations with other relevant variables were examined by comparing factor scores with other outcomes. Content validity was supported by theory underlying the cognitive apprenticeship model on which the instrument is based. The pilot study resulted in an additional question about supervision time. A five-factor model showed a good fit with the data. Acceptable reliability was achievable with 10-12 questionnaires per teacher. Correlations between the factors and overall teacher judgement were strong. The MCTQ appears to be a valid and reliable instrument to evaluate clinical teachers' performance during short rotations.

  6. Psychometric properties of the Satisfaction with Life Scale (SWLS): secondary analysis of the Mexican Health and Aging Study.

    PubMed

    López-Ortega, Mariana; Torres-Castro, Sara; Rosas-Carrasco, Oscar

    2016-12-09

    The Satisfaction with Life Scale (SWLS) has been widely used and has proven to be a valid and reliable instrument for assessing satisfaction with life in diverse population groups; however, research on satisfaction with life and validation of different measuring instruments in Mexican adults is still lacking. The objective was to evaluate the psychometric properties of the Satisfaction with Life Scale (SWLS) in a representative sample of Mexican adults. This is a methodological study to evaluate a satisfaction with life scale in a sample of 13,220 Mexican adults 50 years of age or older from the 2012 Mexican Health and Aging Study. The scale's reliability (internal consistency) was analysed using Cronbach's alpha and inter-item correlations. An exploratory factor analysis was also performed. Known-groups validity was evaluated comparing good-health and bad-health participants. Comorbidity, perceived financial situation, self-reported general health, depression symptoms, and social support were included to evaluate the validity between these measures and the total score of the scale using Spearman's correlations. The analysis of the scale's reliability showed good internal consistency (α = 0.74). The exploratory factor analysis confirmed the existence of a unique factor structure that explained 54% of the variance. SWLS was related to depression, perceived health, financial situation, and social support, and these relations were all statistically significant (P < .01). There was a significant difference in life satisfaction between the good- and bad-health groups. Results show good internal consistency and construct validity of the SWLS. These results are comparable with results from previous studies. Meeting the study's objective to validate the scale, the results show that the Spanish version of the SWLS is a reliable and valid measure of satisfaction with life in the Mexican context.
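
    Internal consistency of a multi-item scale such as the one above is commonly summarised with Cronbach's alpha, α = k / (k − 1) × (1 − Σ item variances / variance of total scores). The sketch below is a minimal implementation of that textbook formula. The function name and the three-item response data are hypothetical and unrelated to the Mexican Health and Aging Study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    item_variance_sum = sum(variance(item) for item in items)
    # Total score per respondent: sum across items at the same position.
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))


# Hypothetical responses of five people to three items.
responses = [
    [3, 4, 5, 2, 4],
    [2, 4, 5, 1, 3],
    [3, 5, 4, 2, 4],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```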

  7. The English and Chinese versions of the five-level EuroQoL Group's five-dimension questionnaire (EQ-5D) were valid and reliable and provided comparable scores in Asian breast cancer patients.

    PubMed

    Lee, Chun Fan; Ng, Raymond; Luo, Nan; Wong, Nan Soon; Yap, Yoon Sim; Lo, Soo Kien; Chia, Whay Kuang; Yee, Alethea; Krishna, Lalit; Wong, Celest; Goh, Cynthia; Cheung, Yin Bun

    2013-01-01

    To examine the measurement properties of and comparability between the English and Chinese versions of the five-level EuroQoL Group's five-dimension questionnaire (EQ-5D) in breast cancer patients in Singapore. This is an observational study of 269 patients. Known-group validity and responsiveness of the EQ-5D utility index and visual analog scale (VAS) were assessed in relation to various clinical characteristics and longitudinal change in performance status, respectively. Convergent and divergent validity was examined by correlation coefficients between the EQ-5D and a breast cancer-specific instrument. Test-retest reliability was evaluated. The two language versions were compared by multiple regression analyses. For both English and Chinese versions, the EQ-5D utility index and VAS demonstrated known-group validity and convergent and divergent validity, and presented sufficient test-retest reliability (intraclass correlation = 0.72 to 0.83). The English version was responsive to changes in performance status. The Chinese version was responsive to decline in performance status, but there was no conclusive evidence about its responsiveness to improvement in performance status. In the comparison analyses of the utility index and VAS between the two language versions, borderline results were obtained, and equivalence cannot be definitely confirmed. The five-level EQ-5D is valid, responsive, and reliable in assessing health outcome of breast cancer patients. The English and Chinese versions provide comparable measurement results.

  8. Correlation and Reliability of Cervical Sagittal Alignment Parameters between Lateral Cervical Radiograph and Lateral Whole-Body EOS Stereoradiograph.

    PubMed

    Singhatanadgige, Weerasak; Kang, Daniel G; Luksanapruksa, Panya; Peters, Colleen; Riew, K Daniel

    2016-09-01

    Retrospective analysis. To evaluate the correlation and reliability of cervical sagittal alignment parameters obtained from lateral cervical radiographs (XRs) compared with lateral whole-body stereoradiographs (SRs). We evaluated adults with cervical deformity using both lateral XRs and lateral SRs obtained within 1 week of each other between 2010 and 2014. XR and SR images were measured by two independent spine surgeons using the following sagittal alignment parameters: C2-C7 sagittal Cobb angle (SCA), C2-C7 sagittal vertical axis (SVA), C1-C7 translational distance (C1-7), T1 slope (T1-S), neck tilt (NT), and thoracic inlet angle (TIA). Pearson correlation and paired t test were used for statistical analysis, with intra- and interrater reliability analyzed using intraclass correlation coefficient (ICC). A total of 35 patients were included in the study. We found excellent intrarater reliability for all sagittal alignment parameters in both the XR and SR groups with ICC ranging from 0.799 to 0.994 for XR and 0.791 to 0.995 for SR. Interrater reliability was also excellent for all parameters except NT and TIA, which had fair reliability. We also found excellent correlations between XR and SR measurements for most sagittal alignment parameters; SCA, SVA, and C1-C7 had r > 0.90, and only NT had r < 0.70. There was a significant difference between groups, with SR having lower measurements compared with XR for both SVA (0.68 cm lower, p < 0.001) and C1-C7 (1.02 cm lower, p < 0.001). There were no differences between groups for SCA, T1-S, NT, and TIA. Whole-body stereoradiography appears to be a viable alternative for measuring cervical sagittal alignment parameters compared with standard radiography. XR and SR demonstrated excellent correlation for most sagittal alignment parameters except NT. However, SR had significantly lower average SVA and C1-C7 measurements than XR. The lower radiation exposure using single SR has to be weighed against its higher cost compared with XR.

  9. A GA based penalty function technique for solving constrained redundancy allocation problem of series system with interval valued reliability of components

    NASA Astrophysics Data System (ADS)

    Gupta, R. K.; Bhunia, A. K.; Roy, D.

    2009-10-01

    In this paper, we have considered the problem of constrained redundancy allocation of series system with interval valued reliability of components. For maximizing the overall system reliability under limited resource constraints, the problem is formulated as an unconstrained integer programming problem with interval coefficients by penalty function technique and solved by an advanced GA for integer variables with interval fitness function, tournament selection, uniform crossover, uniform mutation and elitism. As a special case, considering the lower and upper bounds of the interval valued reliabilities of the components to be the same, the corresponding problem has been solved. The model has been illustrated with some numerical examples and the results of the series redundancy allocation problem with fixed value of reliability of the components have been compared with the existing results available in the literature. Finally, sensitivity analyses have been shown graphically to study the stability of our developed GA with respect to the different GA parameters.

  10. Investigating the Intersession Reliability of Dynamic Brain-State Properties.

    PubMed

    Smith, Derek M; Zhao, Yrian; Keilholz, Shella D; Schumacher, Eric H

    2018-06-01

    Dynamic functional connectivity metrics have much to offer to the neuroscience of individual differences of cognition. Yet, despite the recent expansion in dynamic connectivity research, limited resources have been devoted to the study of the reliability of these connectivity measures. To address this, resting-state functional magnetic resonance imaging data from 100 Human Connectome Project subjects were compared across 2 scan days. Brain states (i.e., patterns of coactivity across regions) were identified by classifying each time frame using k means clustering. This was done with and without global signal regression (GSR). Multiple gauges of reliability indicated consistency in the brain-state properties across days and GSR attenuated the reliability of the brain states. Changes in the brain-state properties across the course of the scan were investigated as well. The results demonstrate that summary metrics describing the clustering of individual time frames have adequate test/retest reliability, and thus, these patterns of brain activation may hold promise for individual-difference research.

  11. Comparison of two methods of measuring physical activity in South African older adults.

    PubMed

    Kolbe-Alexander, Tracy L; Lambert, Estelle V; Harkins, Judith Biletnikoff; Ekelund, Ulf

    2006-01-01

    The aim of this study was to assess the validity and reliability of the Yale Physical Activity Survey (YPAS) and the short version of the International Physical Activity Questionnaire (IPAQ) in older South African adults. The YPAS includes measures of weekly energy expenditure (EE) for housework, yard work, caregiving, exercise, and recreation. The IPAQ measures total time and EE during vigorous and moderate activity, walking, and sitting. The instruments were administered twice for test-retest reliability (men, n = 52, 68 +/- 5.4 years, and women, n = 70, 66 +/- 5.8 years). Data for criterion validity were obtained from accelerometers. YPAS reliability ranged from r = .44 to .80 for men and r = .59 to .99 for women (p < .0001). IPAQ reliability was lower for men (r = .29 to .76) than for women (r = .46 to .77). Criterion validity of the YPAS was .31 to .54 for men and .26 to .29 for women. The YPAS and short IPAQ had comparable results for reliability and criterion validity.

  12. Periorbital Biometric Measurements using ImageJ Software: Standardisation of Technique and Assessment Of Intra- and Interobserver Variability

    PubMed Central

    Rajyalakshmi, R.; Prakash, Winston D.; Ali, Mohammad Javed; Naik, Milind N.

    2017-01-01

    Purpose: To assess the reliability and repeatability of periorbital biometric measurements using ImageJ software and to assess if the horizontal visible iris diameter (HVID) serves as a reliable scale for facial measurements. Methods: This study was a prospective, single-blind, comparative study. Two clinicians performed 12 periorbital measurements on 100 standardised face photographs. Each individual’s HVID was determined by Orbscan IIz and used as a scale for measurements using ImageJ software. All measurements were repeated using the ‘average’ HVID of the study population as a measurement scale. Intraclass correlation coefficient (ICC) and Pearson product-moment coefficient were used as statistical tests to analyse the data. Results: The range of ICC for intra- and interobserver variability was 0.79–0.99 and 0.86–0.99, respectively. Test-retest reliability ranged from 0.66–1.0 to 0.77–0.98, respectively. When average HVID of the study population was used as scale, ICC ranged from 0.83 to 0.99, and the test-retest reliability ranged from 0.83 to 0.96 and the measurements correlated well with recordings done with individual Orbscan HVID measurements. Conclusion: Periorbital biometric measurements using ImageJ software are reproducible and repeatable. Average HVID of the population as measured by Orbscan is a reliable scale for facial measurements. PMID:29403183

  13. Reliability of the European Society of Human Reproduction and Embryology/European Society for Gynaecological Endoscopy and American Society for Reproductive Medicine classification systems for congenital uterine anomalies detected using three-dimensional ultrasonography.

    PubMed

    Ludwin, Artur; Ludwin, Inga; Kudla, Marek; Kottner, Jan

    2015-09-01

    To estimate the inter-rater/intrarater reliability of the European Society of Human Reproduction and Embryology/European Society for Gynaecological Endoscopy (ESHRE-ESGE) classification of congenital uterine malformations and to compare the results obtained with the reliability of the American Society for Reproductive Medicine (ASRM) classification supplemented with additional morphometric criteria. Reliability/agreement study. Private clinic. Uterine malformations (n = 50 patients, consecutively included) and normal uterus (n = 62 women, randomly selected) constituted the study. These were classified based on real-time three-dimensional ultrasound single volume transvaginal (or transrectal in the case of virgins, 4 cases) ultrasonography findings, which were assessed by an expert rater based on the ESHRE-ESGE criteria. The samples were obtained from women of reproductive age. Unprocessed three-dimensional datasets were independently evaluated offline by two experienced, blinded raters using both classification systems. The κ-values and proportions of agreement. Standardized interpretation indicated that the ESHRE-ESGE system has substantial/good or almost perfect/very good reliability (κ >0.60 and >0.80), but the interpretation of the clinically relevant cutoffs of κ-values showed insufficient reliability for clinical use (κ < 0.90), especially in the diagnosis of septate uterus. The ASRM system had sufficient reliability (κ > 0.95). The low reliability of the ESHRE-ESGE system may lead to a lack of consensus about the management of common uterine malformations and biased research interpretations. The use of the ASRM classification, supplemented with simple morphometric criteria, may be preferred if their sufficient reliability can be confirmed real-time in a large sample size. Copyright © 2015 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
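
    The κ-values reported above correct raw agreement for the agreement expected by chance. The sketch below implements unweighted Cohen's kappa for two raters, κ = (p_o − p_e) / (1 − p_e). The category labels and ratings are hypothetical and are not taken from the study; the function also assumes that chance agreement is below 1, since the statistic is undefined otherwise.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical classifications of six uteri by two raters.
rater_1 = ["septate", "normal", "normal", "arcuate", "septate", "normal"]
rater_2 = ["septate", "normal", "arcuate", "arcuate", "normal", "normal"]

kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}")
```

    When nearly all cases fall into one category, chance agreement is high and kappa can be low despite high percent agreement, which is why several records in this listing report both statistics.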

  14. Comparison of the haematoxylin basic fuchsin picric acid method and the fluorescence of haematoxylin and eosin stained sections for the identification of early myocardial infarction.

    PubMed Central

    Al-Rufaie, H K; Florio, R A; Olsen, E G

    1983-01-01

    A retrospective study has been carried out on the necropsy material from 30 patients who have died after a clinically diagnosed myocardial infarction. This study has been undertaken to compare the reliability of the fluorescence of infarcted myocardium when stained by haematoxylin and eosin and an adjacent section stained by the haematoxylin basic fuchsin picric acid (HBFP) method to detect early ischaemia. The results showed that the fluorescence technique is reliable, reproducible and coincides with the findings obtained by HBFP stain. PMID:6189866

  15. Differential Weighting for Subcomponent Measures of Integrated Clinical Encounter Scores Based on the USMLE Step 2 CS Examination: Effects on Composite Score Reliability and Pass-Fail Decisions.

    PubMed

    Park, Yoon Soo; Lineberry, Matthew; Hyderi, Abbas; Bordage, Georges; Xing, Kuan; Yudkowsky, Rachel

    2016-11-01

    Medical schools administer locally developed graduation competency examinations (GCEs) following the structure of the United States Medical Licensing Examination Step 2 Clinical Skills that combine standardized patient (SP)-based physical examination and the patient note (PN) to create integrated clinical encounter (ICE) scores. This study examines how different subcomponent scoring weights in a locally developed GCE affect composite score reliability and pass-fail decisions for ICE scores, contributing to internal structure and consequential validity evidence. Data from two M4 cohorts (2014: n = 177; 2015: n = 182) were used. The reliability of SP encounter (history taking and physical examination), PN, and communication and interpersonal skills scores were estimated with generalizability studies. Composite score reliability was estimated for varying weight combinations. Faculty were surveyed for preferred weights on the SP encounter and PN scores. Composite scores based on Kane's method were compared with weighted mean scores. Faculty suggested weighting PNs higher (60%-70%) than the SP encounter scores (30%-40%). Statistically, composite score reliability was maximized when PN scores were weighted at 40% to 50%. Composite score reliability of ICE scores increased by up to 0.20 points when SP-history taking (SP-Hx) scores were included; excluding SP-Hx only increased composite score reliability by 0.09 points. Classification accuracy for pass-fail decisions between composite and weighted mean scores was 0.77; misclassification was < 5%. Medical schools and certification agencies should consider implications of assigning weights with respect to composite score reliability and consequences on pass-fail decisions.

  16. Reliability and criterion validity of two applications of the iPhone™ to measure cervical range of motion in healthy participants

    PubMed Central

    2013-01-01

    Summary of background data: Recent smartphones, such as the iPhone, are often equipped with an accelerometer and magnetometer, which, through software applications, can perform various inclinometric functions. Although these applications are intended for recreational use, they have the potential to measure and quantify range of motion. The purpose of this study was to estimate the intra- and inter-rater reliability as well as the criterion validity of the clinometer and compass applications of the iPhone in the assessment of cervical range of motion in healthy participants. Methods: The sample consisted of 28 healthy participants. Two examiners measured cervical range of motion of each participant twice using the iPhone (for the estimation of intra- and inter-rater reliability) and once with the CROM (for the estimation of criterion validity). Estimates of reliability and validity were then established using the intraclass correlation coefficient (ICC). Results: We observed a moderate intra-rater reliability for each movement (ICC = 0.65-0.85) but a poor inter-rater reliability (ICC < 0.60). For the criterion validity, the ICCs are moderate (>0.50) to good (>0.65) for movements of flexion, extension, lateral flexions and right rotation, but poor (<0.50) for the movement of left rotation. Conclusion: We found good intra-rater reliability and lower inter-rater reliability. When compared to the gold standard, these applications showed moderate to good validity. However, before using the iPhone as an outcome measure in clinical settings, studies should be done on patients presenting with cervical problems. PMID:23829201
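
The abstract above reports ICCs without stating which ICC form was used. As a rough illustration only, the sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), and computes it from a subjects-by-raters table. The function name `icc_2_1` and the ratings are illustrative and are not data from the study.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows: one row per subject, one column per rater.
    Assumes a complete table (every rater scored every subject).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = mean([x for row in ratings for x in row])
    row_means = [mean(row) for row in ratings]
    col_means = [mean([row[j] for row in ratings]) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings: 6 subjects scored by 4 raters (not study data).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(ratings), 2))  # 0.29
```

In this table the raters differ systematically in level, which lowers an absolute-agreement ICC even though they rank the subjects similarly. A consistency-type ICC would be higher on the same data, so the choice of form matters when comparing published values.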

  17. 21 CFR 201.57 - Specific requirements on content and format of labeling for human prescription drug and...

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... comparative rates of occurrence cannot be reliably determined (e.g., adverse reactions were observed only in... in vivo study designs or results (e.g., drug interaction studies), may be included in this section if...

  18. 21 CFR 201.57 - Specific requirements on content and format of labeling for human prescription drug and...

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... comparative rates of occurrence cannot be reliably determined (e.g., adverse reactions were observed only in... in vivo study designs or results (e.g., drug interaction studies), may be included in this section if...

  19. 21 CFR 201.57 - Specific requirements on content and format of labeling for human prescription drug and...

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... comparative rates of occurrence cannot be reliably determined (e.g., adverse reactions were observed only in... in vivo study designs or results (e.g., drug interaction studies), may be included in this section if...

  20. Dimensional indicators of generalized anxiety disorder severity for DSM-V.

    PubMed

    Niles, Andrea N; Lebeau, Richard T; Liao, Betty; Glenn, Daniel E; Craske, Michelle G

    2012-03-01

    For DSM-V, simple dimensional measures of disorder severity will accompany diagnostic criteria. The current studies examine convergent validity and test-retest reliability of two potential dimensional indicators of worry severity for generalized anxiety disorder (GAD): percent of the day worried and number of worry domains. In study 1, archival data from diagnostic interviews from a community sample of individuals diagnosed with one or more anxiety disorders (n = 233) were used to assess correlations between percent of the day worried and number of worry domains with other measures of worry severity (clinical severity rating (CSR), age of onset, number of comorbid disorders, Penn State Worry Questionnaire (PSWQ)) and DSM-IV criteria (excessiveness, uncontrollability and number of physical symptoms). Both measures were significantly correlated with CSR and number of comorbid disorders, and with all three DSM-IV criteria. In study 2, test-retest reliability of percent of the day worried and number of worry domains was compared to test-retest reliability of DSM-IV diagnostic criteria in a non-clinical sample of undergraduate students (n = 97) at a large west coast university. All measures had low test-retest reliability except percent of the day worried, which had moderate test-retest reliability. Findings suggest that these two indicators capture worry severity, and percent of the day worried may be the most reliable existing indicator. These measures may be useful as dimensional measures for DSM-V. Copyright © 2012 Elsevier Ltd. All rights reserved.

  1. Age-Related Differences in Test-Retest Reliability in Resting-State Brain Functional Connectivity

    PubMed Central

    Song, Jie; Desphande, Alok S.; Meier, Timothy B.; Tudorascu, Dana L.; Vergun, Svyatoslav; Nair, Veena A.; Biswal, Bharat B.; Meyerand, Mary E.; Birn, Rasmus M.; Bellec, Pierre; Prabhakaran, Vivek

    2012-01-01

    Resting-state functional MRI (rs-fMRI) has emerged as a powerful tool for investigating brain functional connectivity (FC). Research in recent years has focused on assessing the reliability of FC across younger subjects within and between scan-sessions. Test-retest reliability in resting-state functional connectivity (RSFC) has not yet been examined in older adults. In this study, we investigated age-related differences in reliability and stability of RSFC across scans. In addition, we examined how global signal regression (GSR) affects RSFC reliability and stability. Three separate resting-state scans from 29 younger adults (18–35 yrs) and 26 older adults (55–85 yrs) were obtained from the International Consortium for Brain Mapping (ICBM) dataset made publicly available as part of the 1000 Functional Connectomes project www.nitrc.org/projects/fcon_1000. 92 regions of interest (ROIs) with 5 cubic mm radius, derived from the default, cingulo-opercular, fronto-parietal and sensorimotor networks, were previously defined based on a recent study. Mean time series were extracted from each of the 92 ROIs from each scan and three matrices of z-transformed correlation coefficients were created for each subject, which were then used for evaluation of multi-scan reliability and stability. The young group showed higher reliability of RSFC than the old group with GSR (p-value = 0.028) and without GSR (p-value <0.001). Both groups showed a high degree of multi-scan stability of RSFC and no significant differences were found between groups. By comparing the test-retest reliability of RSFC with and without GSR across scans, we found a significantly higher proportion of reliable connections in both groups without GSR, but decreased stability. Our results suggest that aging is associated with reduced reliability of RSFC which itself is highly stable within-subject across scans for both groups, and that GSR reduces the overall reliability but increases the stability in both age groups and could potentially alter group differences of RSFC. PMID:23227153

  2. Repeatability and reliability of muscle relaxation properties induced by motor cortical stimulation.

    PubMed

    Molenaar, Joery P; Voermans, Nicol C; de Jong, Lysanne A; Stegeman, Dick F; Doorduin, Jonne; van Engelen, Baziel G

    2018-03-15

    Impaired muscle relaxation is a feature of many neuromuscular disorders. However, there are few tests available to quantify muscle relaxation. Transcranial magnetic stimulation (TMS) of the motor cortex can induce muscle relaxation by abruptly inhibiting corticospinal drive. The aim of our study is to investigate whether repeatability and reliability of TMS-induced relaxation are greater than those of voluntary relaxation. Furthermore, effects of sex, cooling and fatigue on muscle relaxation properties were studied. Muscle relaxation of deep finger flexors was assessed in twenty-five healthy subjects (14 M and 11 F, aged 39.1 ± 12.7 and 45.3 ± 8.7 years old, respectively) using handgrip dynamometry. All outcome measures showed greater repeatability and reliability in TMS-induced relaxation compared to voluntary relaxation. The within-subject coefficient of variability of normalized peak relaxation rate was lower in TMS-induced relaxation than in voluntary relaxation (3.0 vs 19.7% in men, and 6.1 vs 14.3% in women). The repeatability coefficient was lower (1.3 vs 6.1 s⁻¹ in men and 2.3 vs 3.1 s⁻¹ in women), and the intraclass correlation coefficient was higher (0.95 vs 0.53 in men and 0.78 vs 0.69 in women), for TMS-induced relaxation compared to voluntary relaxation. TMS made it possible to demonstrate slowing effects of sex, muscle cooling, and muscle fatigue on relaxation properties that voluntary relaxation could not. In conclusion, repeatability and reliability of TMS-induced muscle relaxation were greater compared to voluntary muscle relaxation. TMS-induced muscle relaxation has the potential to be used in clinical practice for diagnostic purposes and therapy effect monitoring in patients with impaired muscle relaxation.

  3. 3D photography is as accurate as digital planimetry tracing in determining burn wound area.

    PubMed

    Stockton, K A; McMillan, C M; Storey, K J; David, M C; Kimble, R M

    2015-02-01

    In the paediatric population careful attention needs to be paid to the techniques utilised for wound assessment to minimise discomfort and stress to the child. To investigate whether 3D photography is a valid measure of burn wound area in children compared to the current clinical gold standard method of digital planimetry using Visitrak™. Twenty-five children presenting to the Stuart Pegg Paediatric Burn Centre for burn dressing change following acute burn injury were included in the study. Burn wound area measurement was undertaken using both digital planimetry (Visitrak™ system) and 3D camera analysis. Inter-rater reliability of the 3D camera software was determined by three investigators independently assessing the burn wound area. A comparison of wound area was assessed using intraclass correlation coefficients (ICC) which demonstrated excellent agreement 0.994 (CI 0.986, 0.997). Inter-rater reliability measured using ICC 0.989 (95% CI 0.979, 0.995) demonstrated excellent inter-rater reliability. Time taken to map the wound was significantly quicker using the camera at bedside compared to Visitrak™ 14.68 (7.00)s versus 36.84 (23.51)s (p<0.001). In contrast, analysing wound area was significantly quicker using the Visitrak™ tablet compared to Dermapix® software for the 3D Images 31.36 (19.67)s versus 179.48 (56.86)s (p<0.001). This study demonstrates that images taken with the 3D LifeViz™ camera and assessed with Dermapix® software are a reliable method for wound area assessment in the acute paediatric burn setting. Copyright © 2014 Elsevier Ltd and ISBI. All rights reserved.

  4. 3D photography is a reliable burn wound area assessment tool compared to digital planimetry in very young children.

    PubMed

    Gee Kee, E L; Kimble, R M; Stockton, K A

    2015-09-01

    Reliability and validity of 3D photography (3D LifeViz™ System) compared to digital planimetry (Visitrak™) has been established in a compliant cohort of children with acute burns. Further research is required to investigate these assessment tools in children representative of the general pediatric burns population, specifically children under the age of three years. To determine if 3D photography is a reliable wound assessment tool compared to Visitrak™ in children of all ages with acute burns ≤10% TBSA. Ninety-six children (median age 1 year 9 months) who presented to the Royal Children's Hospital Brisbane with an acute burn ≤10% TBSA were recruited into the study. Wounds were measured at the first dressing change using the Visitrak™ system and 3D photography. All measurements were completed by one investigator and level of agreement between wound surface area measurements was calculated. Wound surface area measurements were complete (i.e. participants had measurements from both techniques) for 75 participants. Level of agreement between wound surface area measurements calculated using an intra-class correlation coefficient (ICC) was excellent (ICC 0.96, 95% CI 0.93, 0.97). Visitrak™ tracings could not be completed in 19 participants with 16 aged less than two years. 3D photography could not be completed for one participant. Barriers to completing tracings were: excessive movement, pain, young age or wound location (e.g. face or perineum). This study has confirmed 3D photography as a reliable alternative to digital planimetry in children of all ages with acute burns ≤10% TBSA. In addition, 3D photography is more suitable for very young children given its non-invasive nature. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.

  5. SLOWLY REPEATED EVOKED PAIN (SREP) AS A MARKER OF CENTRAL SENSITIZATION IN FIBROMYALGIA: DIAGNOSTIC ACCURACY AND RELIABILITY IN COMPARISON WITH TEMPORAL SUMMATION OF PAIN.

    PubMed

    de la Coba, Pablo; Bruehl, Stephen; Gálvez-Sánchez, Carmen María; Reyes Del Paso, Gustavo A

    2018-05-01

    This study examined the diagnostic accuracy and test-retest reliability of a novel dynamic evoked pain protocol (slowly repeated evoked pain; SREP) compared to temporal summation of pain (TSP), a standard index of central sensitization. Thirty-five fibromyalgia (FM) and 30 rheumatoid arthritis (RA) patients completed, in pseudorandomized order, a standard mechanical TSP protocol (10 stimuli of 1s duration at the thenar eminence using a 300g monofilament with 1s interstimulus interval) and the SREP protocol (9 suprathreshold pressure stimuli of 5s duration applied to the fingernail with a 30s interstimulus interval). In order to evaluate reliability for both protocols, they were repeated in a second session 4-7 days later. Evidence for significant pain sensitization over trials (increasing pain intensity ratings) was observed for SREP in FM (p<.001) but not in RA (p=.35), whereas significant sensitization was observed in both diagnostic groups for the TSP protocol (p's<.008). Compared to TSP, SREP demonstrated higher overall diagnostic accuracy (87.7% vs. 64.6%), greater sensitivity (0.89 vs. 0.57), and greater specificity (0.87 vs. 0.73) in discriminating between FM and RA patients. Test-retest reliability of SREP sensitization was good in FM (ICCs: 0.80), and moderate in RA (ICC: 0.68). SREP seems to be a dynamic evoked pain index tapping into pain sensitization that allows for greater diagnostic accuracy in identifying FM patients compared to a standard TSP protocol. Further research is needed to study mechanisms underlying SREP and the potential utility of adding SREP to standard pain evaluation protocols.
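
The accuracy, sensitivity, and specificity figures above all derive from a single 2x2 classification table. The sketch below shows that arithmetic. The cell counts are not reported in the abstract; they are back-calculated here as one set of whole-number counts consistent with the rounded published values and the group sizes (35 FM, 30 RA). Treating FM as the "positive" class is also an assumption.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Return (sensitivity, specificity, overall accuracy) from 2x2 counts."""
    sensitivity = tp / (tp + fn)          # positives correctly identified
    specificity = tn / (tn + fp)          # negatives correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts inferred from the rounded values (35 FM, 30 RA).
srep = diagnostic_summary(tp=31, fn=4, tn=26, fp=4)
tsp = diagnostic_summary(tp=20, fn=15, tn=22, fp=8)

for name, (sens, spec, acc) in (("SREP", srep), ("TSP", tsp)):
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} "
          f"accuracy={acc * 100:.1f}%")
# SREP: sensitivity=0.89 specificity=0.87 accuracy=87.7%
# TSP: sensitivity=0.57 specificity=0.73 accuracy=64.6%
```

Because overall accuracy weights sensitivity and specificity by group size, it would shift if the proportion of FM to RA patients in the sample were different.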

  6. Standard setting: comparison of two methods.

    PubMed

    George, Sanju; Haque, M Sayeed; Oyebode, Femi

    2006-09-14

    The outcome of assessments is determined by the standard-setting method used. There is a wide range of standard-setting methods and the two used most extensively in undergraduate medical education in the UK are the norm-reference and the criterion-reference methods. The aims of the study were to compare these two standard-setting methods for a multiple-choice question examination and to estimate the test-retest and inter-rater reliability of the modified Angoff method. The norm-reference method of standard-setting (mean minus 1 SD) was applied to the 'raw' scores of 78 4th-year medical students on a multiple-choice examination (MCQ). Two panels of raters also set the standard using the modified Angoff method for the same multiple-choice question paper on two occasions (6 months apart). We compared the pass/fail rates derived from the norm reference and the Angoff methods and also assessed the test-retest and inter-rater reliability of the modified Angoff method. The pass rate with the norm-reference method was 85% (66/78) and that by the Angoff method was 100% (78 out of 78). The percentage agreement between Angoff method and norm-reference was 78% (95% CI 69% - 87%). The modified Angoff method had an inter-rater reliability of 0.81-0.82 and a test-retest reliability of 0.59-0.74. There were significant differences in the outcomes of these two standard-setting methods, as shown by the difference in the proportion of candidates that passed and failed the assessment. The modified Angoff method was found to have good inter-rater reliability and moderate test-retest reliability.
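
The norm-reference rule described above (mean minus 1 SD) can be sketched in a few lines. The abstract does not say whether a sample or population standard deviation was used, so the sample SD here is an assumption, and the scores are made up for illustration.

```python
from statistics import mean, stdev


def norm_reference_cut(scores, k=1.0):
    """Cut score = mean - k * SD (sample SD is an assumption)."""
    return mean(scores) - k * stdev(scores)


def pass_rate(scores, cut):
    """Fraction of candidates scoring at or above the cut score."""
    return sum(s >= cut for s in scores) / len(scores)


# Made-up raw scores for ten candidates (not study data).
scores = [48, 55, 61, 62, 66, 70, 71, 75, 80, 92]
cut = norm_reference_cut(scores)
print(round(cut, 2), pass_rate(scores, cut))  # 55.35 0.8
```

A norm-referenced cut score moves with the cohort, so some candidates fail by construction unless the distribution is very skewed. A criterion-referenced method such as Angoff fixes the standard in advance, which is how all 78 students could pass under it.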

  7. Superior model for fault tolerance computation in designing nano-sized circuit systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singh, N. S. S., E-mail: narinderjit@petronas.com.my; Muthuvalu, M. S., E-mail: msmuthuvalu@gmail.com; Asirvadam, V. S., E-mail: vijanth-sagayan@petronas.com.my

    2014-10-24

    As CMOS technology scales nano-metrically, reliability turns out to be a decisive subject in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate reliability of desired nano-electronic circuits. The process of computing reliability becomes very troublesome and time consuming as the computational complexity builds up with the desired circuit size. Therefore, being able to measure reliability quickly and accurately is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper firstly looks into the development of an automated reliability evaluation tool based on the generalization of Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Secondly, by using the developed automated tool, the paper explores a comparative study involving reliability computation and evaluation by PGM and BDEC models for different implementations of same functionality circuits. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lower reliability measure by BDEC is well explained in this paper using the distribution of different signal input patterns over time for same functionality circuits. Simulation results conclude that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero, and the probability of error on signal lines.

  8. A reliability study on brain activation during active and passive arm movements supported by an MRI-compatible robot.

    PubMed

    Estévez, Natalia; Yu, Ningbo; Brügger, Mike; Villiger, Michael; Hepp-Reymond, Marie-Claude; Riener, Robert; Kollias, Spyros

    2014-11-01

    In neurorehabilitation, longitudinal assessment of arm movement related brain function in patients with motor disability is challenging due to variability in task performance. MRI-compatible robots monitor and control task performance, yielding more reliable evaluation of brain function over time. The main goals of the present study were first to define the brain network activated while performing active and passive elbow movements with an MRI-compatible arm robot (MaRIA) in healthy subjects, and second to test the reproducibility of this activation over time. For the fMRI analysis two models were compared. In model 1 movement onset and duration were included, whereas in model 2 force and range of motion were added to the analysis. Reliability of brain activation was tested with several statistical approaches applied on individual and group activation maps and on summary statistics. The activated network included mainly the primary motor cortex, primary and secondary somatosensory cortex, superior and inferior parietal cortex, medial and lateral premotor regions, and subcortical structures. Reliability analyses revealed robust activation for active movements with both fMRI models and all the statistical methods used. Imposed passive movements also elicited mainly robust brain activation for individual and group activation maps, and reliability was improved by including additional force and range of motion using model 2. These findings demonstrate that the use of robotic devices, such as MaRIA, can be useful to reliably assess arm movement related brain activation in longitudinal studies and may contribute in studies evaluating therapies and brain plasticity following injury in the nervous system.

  9. Characterizing wind power resource reliability in southern Africa

    DOE PAGES

    Fant, Charles; Gunturu, Bhaskar; Schlosser, Adam

    2015-08-29

    Producing electricity from wind is attractive because it provides a clean, low-maintenance power supply. However, wind resource is intermittent on various timescales, thus occasionally introducing large and sudden changes in power supply. A better understanding of this variability can greatly benefit power grid planning. In the following study, wind resource is characterized using metrics that highlight these intermittency issues; therefore identifying areas of high and low wind power reliability in southern Africa and Kenya at different time-scales. After developing a wind speed profile, these metrics are applied at various heights in order to assess the added benefit of raising the wind turbine hub. Furthermore, since the interconnection of wind farms can aid in reducing the overall intermittency, the value of interconnecting near-by sites is mapped using two distinct methods. Of the countries in this region, the Republic of South Africa has shown the most interest in wind power investment. For this reason, we focus parts of the study on wind reliability in the country. The study finds that, although mean Wind Power Density is high in South Africa compared to its neighboring countries, wind power resource tends to be less reliable than in other parts of southern Africa—namely central Tanzania. We also find that South Africa’s potential varies over different timescales, with higher reliability in the summer than winter, and higher reliability during the day than at night. This study is concluded by introducing two methods and measures to characterize the value of interconnection, including the use of principal component analysis to identify areas with a common signal.

  10. Characterizing wind power resource reliability in southern Africa

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fant, Charles; Gunturu, Bhaskar; Schlosser, Adam

    Producing electricity from wind is attractive because it provides a clean, low-maintenance power supply. However, wind resource is intermittent on various timescales, thus occasionally introducing large and sudden changes in power supply. A better understanding of this variability can greatly benefit power grid planning. In the following study, wind resource is characterized using metrics that highlight these intermittency issues; therefore identifying areas of high and low wind power reliability in southern Africa and Kenya at different time-scales. After developing a wind speed profile, these metrics are applied at various heights in order to assess the added benefit of raising the wind turbine hub. Furthermore, since the interconnection of wind farms can aid in reducing the overall intermittency, the value of interconnecting near-by sites is mapped using two distinct methods. Of the countries in this region, the Republic of South Africa has shown the most interest in wind power investment. For this reason, we focus parts of the study on wind reliability in the country. The study finds that, although mean Wind Power Density is high in South Africa compared to its neighboring countries, wind power resource tends to be less reliable than in other parts of southern Africa—namely central Tanzania. We also find that South Africa’s potential varies over different timescales, with higher reliability in the summer than winter, and higher reliability during the day than at night. This study is concluded by introducing two methods and measures to characterize the value of interconnection, including the use of principal component analysis to identify areas with a common signal.

  11. Spanish translation, cross-cultural adaptation, and validation of the Questionnaire for Diabetes-Related Foot Disease (Q-DFD)

    PubMed Central

    Castillo-Tandazo, Wilson; Flores-Fortty, Adolfo; Feraud, Lourdes; Tettamanti, Daniel

    2013-01-01

    Purpose: To translate, cross-culturally adapt, and validate the Questionnaire for Diabetes-Related Foot Disease (Q-DFD), originally created and validated in Australia, for its use in Spanish-speaking patients with diabetes mellitus. Patients and methods: The translation and cross-cultural adaptation were based on international guidelines. The Spanish version of the survey was applied to a community-based (sample A) and a hospital clinic-based sample (samples B and C). Samples A and B were used to determine criterion and construct validity comparing the survey findings with clinical evaluation and medical records, respectively; while sample C was used to determine intra- and inter-rater reliability. Results: After completing the rigorous translation process, only four items were considered problematic and required a new translation. In total, 127 patients were included in the validation study: 76 to determine criterion and construct validity and 41 to establish intra- and inter-rater reliability. For an overall diagnosis of diabetes-related foot disease, a substantial level of agreement was obtained when we compared the Q-DFD with the clinical assessment (kappa 0.77, sensitivity 80.4%, specificity 91.5%, positive likelihood ratio [LR+] 9.46, negative likelihood ratio [LR−] 0.21); while an almost perfect level of agreement was obtained when it was compared with medical records (kappa 0.88, sensitivity 87%, specificity 97%, LR+ 29.0, LR− 0.13). Survey reliability showed substantial levels of agreement, with kappa scores of 0.63 and 0.73 for intra- and inter-rater reliability, respectively. Conclusion: The translated and cross-culturally adapted Q-DFD showed good psychometric properties (validity, reproducibility, and reliability) that allow its use in Spanish-speaking diabetic populations. PMID:24039434
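
The likelihood ratios quoted above follow directly from sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The short sketch below, with an illustrative function name, reproduces the reported values from the reported percentages.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test; inputs are proportions in (0, 1)."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Q-DFD vs clinical assessment: sensitivity 80.4%, specificity 91.5%
lr_pos, lr_neg = likelihood_ratios(0.804, 0.915)
print(round(lr_pos, 2), round(lr_neg, 2))  # 9.46 0.21

# Q-DFD vs medical records: sensitivity 87%, specificity 97%
lr_pos, lr_neg = likelihood_ratios(0.87, 0.97)
print(round(lr_pos, 1), round(lr_neg, 2))  # 29.0 0.13
```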

  12. Reliability, Validity, and Classification Accuracy of the DSM-5 Diagnostic Criteria for Gambling Disorder and Comparison to DSM-IV.

    PubMed

    Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C

    2016-09-01

    The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was twofold: first, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD; second, to compare the DSM-5 and DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.

  13. Spacecraft Conceptual Design Compared to the Apollo Lunar Lander

    NASA Technical Reports Server (NTRS)

    Young, C.; Bowie, J.; Rust, R.; Lenius, J.; Anderson, M.; Connolly, J.

    2011-01-01

    Future human exploration of the Moon will require an optimized spacecraft design with each sub-system achieving the required minimum capability and maintaining high reliability. The objective of this study was to trade capability with reliability and minimize mass for the lunar lander spacecraft. The NASA parametric concept for a 3-person vehicle to the lunar surface with a 30% mass margin totaled was considerably heavier than the Apollo 15 Lunar Module "as flown" mass of 16.4 metric tons. The additional mass was attributed to mission requirements and system design choices that were made to meet the realities of modern spaceflight. The parametric tool used to size the current concept, Envision, accounts for primary and secondary mass requirements. For example, adding an astronaut increases the mass requirements for suits, water, food, and oxygen, as well as the required volume. The environmental control sub-system becomes heavier with the increased requirements and more structure is needed to support the additional mass. There is also an increase in propellant usage. For comparison, an "Apollo-like" vehicle was created by removing these additional requirements. Utilizing the Envision parametric mass calculation tool and a quantitative reliability estimation tool designed by Valador Inc., it was determined that with today's technology a Lunar Module (LM) with Apollo capability could be built with less mass and similar reliability. The reliability of this new lander was compared to the Apollo Lunar Module utilizing the same methodology, adjusting for mission timeline changes as well as component differences. Interestingly, the parametric concept's overall estimated risk for loss of mission (LOM) and loss of crew (LOC) did not significantly improve when compared to Apollo.

  14. An Investigation of the Generalizability of Medical School Grades.

    PubMed

    Kreiter, Clarence D; Ferguson, Kristi J

    2016-01-01

    Construct/Background: Medical school grades are currently unstandardized, and their level of reliability is unknown. This means their usefulness for reporting on student achievement is also not well documented. This study investigates grade reliability within 1 medical school. Generalizability analyses are conducted on grades awarded. Grades from didactic and clerkship-based courses were treated as 2 levels of a fixed facet within a univariate mixed model. Grades from within the 2 levels (didactic and clerkship) were also entered in a multivariate generalizability study. Grades from didactic courses were shown to produce a highly reliable mean score (G = .79) when averaged over as few as 5 courses. Although the universe score correlation between didactic and clerkship courses was high (r = .80), the clerkship courses required almost twice as many grades to reach a comparable level of reliability. When grades were converted to a Pass/Fail metric, almost all information contained in the grades was lost. Although it has been suggested that the imprecision of medical school grades precludes their use as a reliable indicator of student achievement, these results suggest otherwise. While it is true that a Pass/Fail system of grading provides very little information about a student's level of performance, a multi-tiered grading system was shown to be a highly reliable indicator of student achievement within the medical school. Although grades awarded during the first 2 didactic years appear to be more reliable than clerkship grades, both yield useful information about student performance within the medical college.
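
The statement that a mean over 5 courses reaches G = .79 can be related to the reliability of a single course grade. The sketch below assumes a simple one-facet design in which the generalizability coefficient of an n-course mean follows the Spearman-Brown form; the study's actual mixed and multivariate models are more elaborate, so the numbers are only indicative. Function names are illustrative.

```python
def reliability_of_mean(single, n):
    """Reliability of a mean over n parallel measurements (Spearman-Brown)."""
    return n * single / (1 + (n - 1) * single)


def single_from_mean(rel_mean, n):
    """Invert Spearman-Brown: single-measurement reliability from an n-mean."""
    return rel_mean / (n - (n - 1) * rel_mean)


# Implied reliability of one didactic course grade, given G = .79 over 5 courses
g1 = single_from_mean(0.79, 5)
print(round(g1, 2))                          # 0.43
print(round(reliability_of_mean(g1, 5), 2))  # 0.79 (round trip)
```

Under this simplified model a single course grade has modest reliability on its own (about .43), which is why averaging across several courses is what produces the dependable composite the abstract describes.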

  15. Reliability and validity of a Chinese version of the Diagnostic Interview for Borderlines-Revised.

    PubMed

    Wang, Lanlan; Yuan, Chenmei; Qiu, Jianying; Gunderson, John; Zhang, Min; Jiang, Kaida; Leung, Freedom; Zhong, Jie; Xiao, Zeping

    2014-09-01

    Borderline personality disorder (BPD) is the most studied of the axis II disorders. One of the most widely used diagnostic instruments is the Diagnostic Interview for Borderline Patients-Revised (DIB-R). The aim of this study was to test the reliability and validity of DIB-R for use in the Chinese culture. The reliability and validity of the DIB-R Chinese version were assessed in a sample of 236 outpatients with a probable BPD diagnosis. The Structured Clinical Interview for DSM-IV Personality Disorders (SCID-II) was used as a standard. Test-retest reliability was tested six months later with 20 patients, and inter-rater reliability was tested on 32 patients. The Chinese version of the DIB-R showed good internal global consistency (Cronbach's α of 0.916), good test-retest reliability (Pearson correlation of 0.704), good inter-rater reliability (intra-class correlation coefficient of 0.892 and kappa of 0.861). When compared with the DSM-IV diagnosis as measured by the SCID-II, the DIB-R showed relatively good sensitivity (0.768) and specificity (0.891) at the cutoff of 7, moderate diagnostic convergence (kappa of 0.631), as well as good discriminating validity. The Chinese version of the DIB-R has good psychometric properties, which renders it a valuable method for examining the presence, the severity, and component phenotypes of BPD in Chinese samples. © 2013 Wiley Publishing Asia Pty Ltd.
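
Sensitivity, specificity, and Cohen's kappa of the kind reported above can all be derived from a 2x2 table that cross-classifies the instrument's result against a reference standard. The sketch below shows the arithmetic; the function name and the counts are hypothetical and are not the study's data.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summarise agreement between a test and a reference standard.

    tp/fn: reference-positive cases the test called positive/negative.
    tn/fp: reference-negative cases the test called negative/positive.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the two classifications.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
summary = diagnostic_summary(tp=40, fp=10, fn=10, tn=40)
print(summary)
```

Because kappa corrects for chance agreement, it is typically lower than raw percent agreement, which is why an instrument can show good sensitivity and specificity alongside only moderate diagnostic convergence.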

  16. Test-retest reliability of quantitative sensory testing for mechanical somatosensory and pain modulation assessment of masticatory structures.

    PubMed

    Costa, Y M; Morita-Neto, O; de Araújo-Júnior, E N S; Sampaio, F A; Conti, P C R; Bonjardim, L R

    2017-03-01

    Assessing the reliability of medical measurements is a crucial step towards the elaboration of an applicable clinical instrument. There are few studies that evaluate the reliability of somatosensory assessment and pain modulation of masticatory structures. This study estimated the test-retest reliability (that is, stability over time) of the mechanical somatosensory assessment of anterior temporalis, masseter and temporomandibular joint (TMJ) and the conditioned pain modulation (CPM) using the anterior temporalis as the test site. Twenty healthy women were evaluated in two sessions (1 week apart) by the same examiner. Mechanical detection threshold (MDT), mechanical pain threshold (MPT), wind-up ratio (WUR) and pressure pain threshold (PPT) were assessed on the skin overlying the anterior temporalis, masseter and TMJ of the dominant side. CPM was tested by comparing PPT before and during hand immersion in a hot water bath. ANOVA and intra-class correlation coefficients (ICCs) were applied to the data (α = 5%). The overall ICCs showed acceptable values for the test-retest reliability of mechanical somatosensory assessment of masticatory structures. The ICC values of 75% of all quantitative sensory measurements were considered fair to excellent (fair = 8·4%, good = 33·3% and excellent = 33·3%). However, the CPM paradigm presented poor reliability (ICC = 0·25). The mechanical somatosensory assessment of the masticatory structures, but not the proposed CPM protocol, can be considered sufficiently reliable over time to evaluate the trigeminal sensory function. © 2016 John Wiley & Sons Ltd.

  17. Comparing the Psychometric Properties of Two Physical Activity Self-Efficacy Instruments in Urban, Adolescent Girls: Validity, Measurement Invariance, and Reliability

    PubMed Central

    Voskuil, Vicki R.; Pierce, Steven J.; Robbins, Lorraine B.

    2017-01-01

    Aims: This study compared the psychometric properties of two self-efficacy instruments related to physical activity. Factorial validity, cross-group and longitudinal invariance, and composite reliability were examined. Methods: Secondary analysis was conducted on data from a group randomized controlled trial investigating the effect of a 17-week intervention on increasing moderate to vigorous physical activity among 5th–8th grade girls (N = 1,012). Participants completed a 6-item Physical Activity Self-Efficacy Scale (PASE) and a 7-item Self-Efficacy for Exercise Behaviors Scale (SEEB) at baseline and post-intervention. Confirmatory factor analyses for intervention and control groups were conducted with Mplus Version 7.4 using robust weighted least squares estimation. Model fit was evaluated with the chi-square index, comparative fit index, and root mean square error of approximation. Composite reliability for latent factors with ordinal indicators was computed from Mplus output using SAS 9.3. Results: Mean age of the girls was 12.2 years (SD = 0.96). One-third of the girls were obese. Girls represented a diverse sample with over 50% indicating black race and an additional 19% identifying as mixed or other race. Both instruments demonstrated configural invariance for simultaneous analysis of cross-group and longitudinal invariance based on alternative fit indices. However, simultaneous metric invariance was not met for the PASE or the SEEB instruments. Partial metric invariance for the simultaneous analysis was achieved for the PASE with one factor loading identified as non-invariant. Partial metric invariance was not met for the SEEB. Longitudinal scalar invariance was achieved for both instruments in the control group but not the intervention group. Composite reliability for the PASE ranged from 0.772 to 0.842. Reliability for the SEEB ranged from 0.719 to 0.800 indicating higher reliability for the PASE. 
Reliability was more stable over time in the control group for both instruments. Conclusions: Results suggest that the intervention influenced how girls responded to indicator items. Neither of the instruments achieved simultaneous metric invariance making it difficult to assess mean differences in PA self-efficacy between groups. PMID:28824487

  18. The reliability of three psoriasis assessment tools: Psoriasis area and severity index, body surface area and physician global assessment.

    PubMed

    Bożek, Agnieszka; Reich, Adam

    2017-08-01

    A wide variety of psoriasis assessment tools have been proposed to evaluate the severity of psoriasis in clinical trials and daily practice. The most frequently used clinical instrument is the psoriasis area and severity index (PASI); however, none of the currently published severity scores used for psoriasis meets all the validation criteria required for an ideal score. The aim of this study was to compare and assess the reliability of 3 commonly used assessment instruments for psoriasis severity: the psoriasis area and severity index (PASI), body surface area (BSA) and physician global assessment (PGA). On the scoring day, 10 trained dermatologists evaluated 9 adult patients with plaque-type psoriasis using the PASI, BSA and PGA. All the subjects were assessed twice by each physician. Correlations between the assessments were analyzed using the Pearson correlation coefficient. Intra-class correlation coefficient (ICC) was calculated to analyze intra-rater reliability, and the coefficient of variation (CV) was used to assess inter-rater variability. Significant correlations were observed among the 3 scales in both assessments. In all 3 scales the ICCs were > 0.75, indicating high intra-rater reliability. The highest ICC was for the BSA (0.96) and the lowest one for the PGA (0.87). The CV for the PGA and PASI were 29.3 and 36.9, respectively, indicating moderate inter-rater variability. The CV for the BSA was 57.1, indicating high inter-rater variability. Comparing the PASI, PGA and BSA, it was shown that the PGA had the highest inter-rater reliability, whereas the BSA had the highest intra-rater reliability. The PASI showed intermediate values in terms of inter- and intra-rater reliability. None of the 3 assessment instruments showed a significant advantage over the others. A reliable assessment of psoriasis severity requires the use of several independent evaluations simultaneously.
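
The coefficient of variation used above as a measure of inter-rater variability is the standard deviation of the raters' scores divided by their mean, expressed as a percentage. The sketch below assumes the CV is computed per patient across raters and then averaged over patients; the abstract does not state how the values were aggregated, and the function names and scores are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """CV (%) of one patient's scores across raters: sample SD / mean * 100."""
    m = mean(scores)
    if m == 0:
        raise ValueError("CV is undefined when the mean score is zero")
    return stdev(scores) / m * 100.0


def mean_cv(scores_by_patient):
    """Average the per-patient CVs (an assumed aggregation rule)."""
    return mean(coefficient_of_variation(s) for s in scores_by_patient)


# Hypothetical scores given to one patient by five raters.
print(round(coefficient_of_variation([10, 12, 8, 14, 6]), 1))
```

A lower CV means the raters' scores cluster more tightly around their mean, which is the sense in which the abstract ranks the three instruments.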

  19. Reconciling Streamflow Uncertainty Estimation and River Bed Morphology Dynamics. Insights from a Probabilistic Assessment of Streamflow Uncertainties Using a Reliability Diagram

    NASA Astrophysics Data System (ADS)

    Morlot, T.; Mathevet, T.; Perret, C.; Favre Pugin, A. C.

    2014-12-01

    Streamflow uncertainty estimation has recently received considerable attention in the literature. A dynamic rating curve assessment method has been introduced (Morlot et al., 2014). This dynamic method makes it possible to compute a rating curve for each gauging and a continuous streamflow time series, while calculating streamflow uncertainties. Streamflow uncertainty takes into account many sources of uncertainty (water level, rating curve interpolation and extrapolation, gauging aging, etc.) and produces an estimated distribution of streamflow for each day. In order to characterise streamflow uncertainty, a probabilistic framework has been applied to a large sample of hydrometric stations of the Division Technique Générale (DTG) of Électricité de France (EDF) hydrometric network (>250 stations) in France. A reliability diagram (Wilks, 1995) has been constructed for some stations, based on the streamflow distribution estimated for a given day and compared to a real streamflow observation estimated via a gauging. To build a reliability diagram, we computed the probability of an observed streamflow (gauging), given the streamflow distribution. The reliability diagram then makes it possible to check that the distribution of probabilities of non-exceedance of the gaugings follows a uniform law (i.e., quantiles should be equiprobable). Given the shape of the reliability diagram, the probabilistic calibration is characterised (underdispersion, overdispersion, bias) (Thyer et al., 2009). In this paper, we present case studies where reliability diagrams have different statistical properties for different periods. By comparing these with our knowledge of the river bed morphology dynamics of these hydrometric stations, we show how the reliability diagram gives us invaluable information on river bed movements, such as continuous digging or backfilling of the hydraulic control due to erosion or sedimentation processes. Hence, the careful analysis of reliability diagrams makes it possible to reconcile statistics and long-term river bed morphology processes. This knowledge improves our real-time management of hydrometric stations, given a better characterisation of erosion/sedimentation processes and the stability of hydrometric station hydraulic control.
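
The construction of a reliability diagram described above can be sketched in a few lines. For each gauging, the non-exceedance probability of the observed streamflow is read from that day's estimated distribution; the sorted probabilities are then compared with the quantiles of a uniform law. The sketch assumes each daily distribution is available as a list of sampled streamflow values, and the function names and plotting positions are illustrative choices rather than the authors' implementation.

```python
def non_exceedance(samples, observed):
    """Fraction of the sampled streamflow values that do not exceed the gauging."""
    return sum(1 for s in samples if s <= observed) / len(samples)


def reliability_curve(probabilities):
    """Pair each sorted non-exceedance probability with its uniform quantile.

    A well-calibrated set of distributions gives points close to the diagonal.
    """
    ordered = sorted(probabilities)
    n = len(ordered)
    return [((i + 1) / (n + 1), p) for i, p in enumerate(ordered)]


def max_departure(curve):
    """Largest vertical distance between the curve and the diagonal."""
    return max(abs(theoretical - empirical) for theoretical, empirical in curve)


# Hypothetical example: probabilities piled up near 0 and 1 mean that the
# gaugings fall in the tails of the estimated distributions (underdispersion).
tails = [0.01, 0.02, 0.98, 0.99]
print(max_departure(reliability_curve(tails)))
```

Tracking how such a curve drifts from one period to the next is one simple way to look for the changes in calibration that the abstract associates with river bed movements.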

  20. Quantitative evaluation of the viscoelastic properties of the ankle joint complex in patients suffering from ankle sprain by the anterior drawer test.

    PubMed

    Lin, Che-Yu; Shau, Yio-Wha; Wang, Chung-Li; Chai, Huei-Ming; Kang, Jiunn-Horng

    2013-06-01

    Biological tissues such as ligaments exhibit viscoelastic behaviours. Injury to the ligament may induce changes in these viscoelastic properties, and these changes could serve as biomarkers to detect the injury. In the present study, a novel instrument was developed to non-invasively quantify the viscoelastic properties of the ankle in vivo using the anterior drawer test. The purpose of the study was to investigate the reliability of the instrument and to compare the viscoelastic properties of the ankle between patients suffering from ankle sprain and controls. Eight patients and eight controls participated in the present study. The reliability test was performed on three randomly chosen subjects. In the patient and control tests, both ankles of each subject were tested to evaluate the viscoelastic properties of the ankle. The viscosity index was defined for quantitatively evaluating the viscosity of the ankle. A greater viscosity index was associated with lower viscosity. Injured and uninjured ankles of patients and both ankles of controls were compared. The instrument exhibited excellent test-retest reliability (r > 0.9). Injured ankles exhibited significantly less viscosity than uninjured ankles, since injured ankles of patients had a significantly higher viscosity index (8,148 ± 5,266) compared with uninjured ankles of patients (948 ± 617; p = 0.008) and controls (1,326 ± 613; p < 0.001). The study revealed that the viscoelastic properties of the ankle can serve as sensitive and useful clinical biomarkers to differentiate between injured and uninjured ankles. The method may provide a clinical examination for objectively evaluating lateral ankle ligament injuries.

  1. Comparative Validity and Reproducibility Study of Various Landmark-Oriented Reference Planes in 3-Dimensional Computed Tomographic Analysis for Patients Receiving Orthognathic Surgery

    PubMed Central

    Lin, Hsiu-Hsia; Chuang, Ya-Fang; Weng, Jing-Ling; Lo, Lun-Jou

    2015-01-01

    Background: Three-dimensional computed tomographic imaging has become popular in clinical evaluation, treatment planning, surgical simulation, and outcome assessment for maxillofacial intervention. The purposes of this study were to investigate whether there is any correlation among landmark-based horizontal reference planes and to validate the reproducibility and reliability of landmark identification. Materials and Methods: Preoperative and postoperative cone-beam computed tomographic images of patients who had undergone orthognathic surgery were collected. Landmark-oriented reference planes including the Frankfort horizontal plane (FHP) and the lateral semicircular canal plane (LSP) were established. Four FHPs were defined by selecting 3 points from the orbitale, porion, or midpoint of paired points. The LSP passed through both the lateral semicircular canal points and nasion. The distances between the maxillary or mandibular teeth and the reference planes were measured, and the differences between the 2 sides were calculated and compared. The precision in locating the landmarks was evaluated by performing repeated tests, and the intraobserver reproducibility and interobserver reliability were assessed. Results: A total of 30 patients with facial deformity and malocclusion (10 patients with facial symmetry, 10 patients with facial asymmetry, and 10 patients with cleft lip and palate) were recruited. Comparing the differences among the 5 reference planes showed no statistically significant difference among all patient groups. Regarding intraobserver reproducibility, the mean differences in the 3 coordinates varied from 0 to 0.35 mm, with correlation coefficients between 0.96 and 1.0, showing high correlation between repeated tests. Regarding interobserver reliability, the mean differences among the 3 coordinates varied from 0 to 0.47 mm, with correlation coefficients between 0.88 and 1.0, exhibiting high correlation between the different examiners. Conclusions: The 5 horizontal reference planes were reliable and comparable for 3D craniomaxillofacial analysis. These reference planes were useful in standardizing the orientation of 3D skull models. PMID:25668209

  2. Comparative validity and reproducibility study of various landmark-oriented reference planes in 3-dimensional computed tomographic analysis for patients receiving orthognathic surgery.

    PubMed

    Lin, Hsiu-Hsia; Chuang, Ya-Fang; Weng, Jing-Ling; Lo, Lun-Jou

    2015-01-01

    Three-dimensional computed tomographic imaging has become popular in clinical evaluation, treatment planning, surgical simulation, and outcome assessment for maxillofacial intervention. The purposes of this study were to investigate whether there is any correlation among landmark-based horizontal reference planes and to validate the reproducibility and reliability of landmark identification. Preoperative and postoperative cone-beam computed tomographic images of patients who had undergone orthognathic surgery were collected. Landmark-oriented reference planes including the Frankfort horizontal plane (FHP) and the lateral semicircular canal plane (LSP) were established. Four FHPs were defined by selecting 3 points from the orbitale, porion, or midpoint of paired points. The LSP passed through both the lateral semicircular canal points and nasion. The distances between the maxillary or mandibular teeth and the reference planes were measured, and the differences between the 2 sides were calculated and compared. The precision in locating the landmarks was evaluated by performing repeated tests, and the intraobserver reproducibility and interobserver reliability were assessed. A total of 30 patients with facial deformity and malocclusion--10 patients with facial symmetry, 10 patients with facial asymmetry, and 10 patients with cleft lip and palate--were recruited. Comparing the differences among the 5 reference planes showed no statistically significant difference among all patient groups. Regarding intraobserver reproducibility, the mean differences in the 3 coordinates varied from 0 to 0.35 mm, with correlation coefficients between 0.96 and 1.0, showing high correlation between repeated tests. Regarding interobserver reliability, the mean differences among the 3 coordinates varied from 0 to 0.47 mm, with correlation coefficients between 0.88 and 1.0, exhibiting high correlation between the different examiners. 
The 5 horizontal reference planes were reliable and comparable for 3D craniomaxillofacial analysis. These reference planes were useful in standardizing the orientation of 3D skull models.

  3. Performance of intraclass correlation coefficient (ICC) as a reliability index under various distributions in scale reliability studies.

    PubMed

    Mehta, Shraddha; Bastero-Caballero, Rowena F; Sun, Yijun; Zhu, Ray; Murphy, Diane K; Hardas, Bhushan; Koch, Gary

    2018-04-29

    Many published scale validation studies determine inter-rater reliability using the intra-class correlation coefficient (ICC). However, the use of this statistic must consider its advantages, limitations, and applicability. This paper evaluates how interaction of subject distribution, sample size, and levels of rater disagreement affects ICC and provides an approach for obtaining relevant ICC estimates under suboptimal conditions. Simulation results suggest that for a fixed number of subjects, ICC from the convex distribution is smaller than ICC for the uniform distribution, which in turn is smaller than ICC for the concave distribution. The variance component estimates also show that the dissimilarity of ICC among distributions is attributed to the study design (ie, distribution of subjects) component of subject variability and not the scale quality component of rater error variability. The dependency of ICC on the distribution of subjects makes it difficult to compare results across reliability studies. Hence, it is proposed that reliability studies should be designed using a uniform distribution of subjects because of the standardization it provides for representing objective disagreement. In the absence of uniform distribution, a sampling method is proposed to reduce the non-uniformity. In addition, as expected, high levels of disagreement result in low ICC, and when the type of distribution is fixed, any increase in the number of subjects beyond a moderately large specification such as n = 80 does not have a major impact on ICC. Copyright © 2018 John Wiley & Sons, Ltd.
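
The dependence of the ICC on the spread of subjects can be illustrated with a direct calculation. The sketch below computes the two-way random-effects, absolute-agreement, single-rater coefficient, usually written ICC(2,1), from the mean squares of a subjects-by-raters table. The abstract does not say which ICC form the authors used, so this choice, the function name, and the example ratings are all assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a complete table: one row per subject, one column per rater."""
    n = len(ratings)      # subjects
    k = len(ratings[0])   # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Same rater disagreement (+/- 0.5), but subjects spread narrowly vs widely.
narrow = [[1.5, 0.5], [1.5, 2.5], [3.5, 2.5], [3.5, 4.5]]
wide = [[1.5, 0.5], [3.5, 4.5], [7.5, 6.5], [9.5, 10.5]]
print(round(icc_2_1(narrow), 3), round(icc_2_1(wide), 3))
```

The two hypothetical tables share the same rater error, yet the table with the wider subject spread yields the larger coefficient, which is the comparability problem across studies that the abstract describes.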

  4. Improving fMRI reliability in presurgical mapping for brain tumours.

    PubMed

    Stevens, M Tynan R; Clarke, David B; Stroink, Gerhard; Beyea, Steven D; D'Arcy, Ryan Cn

    2016-03-01

    Functional MRI (fMRI) is becoming increasingly integrated into clinical practice for presurgical mapping. Current efforts are focused on validating data quality, with reliability being a major factor. In this paper, we demonstrate the utility of a recently developed approach that uses receiver operating characteristic-reliability (ROC-r) to: (1) identify reliable versus unreliable data sets; (2) automatically select processing options to enhance data quality; and (3) automatically select individualised thresholds for activation maps. Presurgical fMRI was conducted in 16 patients undergoing surgical treatment for brain tumours. Within-session test-retest fMRI was conducted, and ROC-reliability of the patient group was compared to a previous healthy control cohort. Individually optimised preprocessing pipelines were determined to improve reliability. Spatial correspondence was assessed by comparing the fMRI results to intraoperative cortical stimulation mapping, in terms of the distance to the nearest active fMRI voxel. The average ROC-r reliability for the patients was 0.58±0.03, as compared to 0.72±0.02 in healthy controls. For the patient group, this increased significantly to 0.65±0.02 by adopting optimised preprocessing pipelines. Co-localisation of the fMRI maps with cortical stimulation was significantly better for more reliable versus less reliable data sets (8.3±0.9 vs 29±3 mm, respectively). We demonstrated ROC-r analysis for identifying reliable fMRI data sets, choosing optimal postprocessing pipelines, and selecting patient-specific thresholds. Data sets with higher reliability also showed closer spatial correspondence to cortical stimulation. ROC-r can thus identify poor fMRI data at time of scanning, allowing for repeat scans when necessary. ROC-r analysis provides optimised and automated fMRI processing for improved presurgical mapping. Published by the BMJ Publishing Group Limited. 

  5. Mission Reliability Estimation for Repairable Robot Teams

    NASA Technical Reports Server (NTRS)

    Trebi-Ollennu, Ashitey; Dolan, John; Stancliff, Stephen

    2010-01-01

    A mission reliability estimation method has been designed to translate mission requirements into choices of robot modules in order to configure a multi-robot team to have high reliability at minimal cost. In order to build cost-effective robot teams for long-term missions, one must be able to compare alternative design paradigms in a principled way by comparing the reliability of different robot models and robot team configurations. Core modules have been created including: a probabilistic module with reliability-cost characteristics, a method for combining the characteristics of multiple modules to determine an overall reliability-cost characteristic, and a method for the generation of legitimate module combinations based on mission specifications and the selection of the best of the resulting combinations from a cost-reliability standpoint. The developed methodology can be used to predict the probability of a mission being completed, given information about the components used to build the robots, as well as information about the mission tasks. In the research for this innovation, sample robot missions were examined and compared to the performance of robot teams with different numbers of robots and different numbers of spare components. Data that a mission designer would need was factored in, such as whether it would be better to have a spare robot versus an equivalent number of spare parts, or if mission cost can be reduced while maintaining reliability using spares. This analytical model was applied to an example robot mission, examining the cost-reliability tradeoffs among different team configurations. Particularly scrutinized were teams using either redundancy (spare robots) or repairability (spare components). Using conservative estimates of the cost-reliability relationship, results show that it is possible to significantly reduce the cost of a robotic mission by using cheaper, lower-reliability components and providing spares. 
This suggests that the current design paradigm of building a minimal number of highly robust robots may not be the best way to design robots for extended missions.
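
The trade-off between a few highly reliable modules and cheaper modules with spares can be illustrated with a k-out-of-n calculation. This is a deliberately simplified sketch, not the method described in the abstract: it assumes identical, independent modules, treats a spare as instantly usable, and ignores cost and task structure. The function name and the reliability values are hypothetical.

```python
from math import comb


def mission_success_probability(module_reliability, required, spares):
    """Probability that at least `required` of `required + spares`
    independent, identical modules survive the mission."""
    n = required + spares
    r = module_reliability
    return sum(
        comb(n, i) * r**i * (1 - r) ** (n - i)
        for i in range(required, n + 1)
    )


# Two robust modules with no spares versus four cheaper modules (two spares).
robust = mission_success_probability(0.95, required=2, spares=0)
cheap_with_spares = mission_success_probability(0.85, required=2, spares=2)
print(robust, cheap_with_spares)
```

In this toy comparison the lower-reliability modules with spares give the higher success probability, which is the qualitative effect the abstract reports; whether it also lowers cost depends on prices that the sketch does not model.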

  6. Comprehensive proficiency-based inanimate training for robotic surgery: reliability, feasibility, and educational benefit.

    PubMed

    Arain, Nabeel A; Dulan, Genevieve; Hogg, Deborah C; Rege, Robert V; Powers, Cathryn E; Tesfay, Seifu T; Hynan, Linda S; Scott, Daniel J

    2012-10-01

    We previously developed a comprehensive proficiency-based robotic training curriculum demonstrating construct, content, and face validity. This study aimed to assess reliability, feasibility, and educational benefit associated with curricular implementation. Over an 11-month period, 55 residents, fellows, and faculty (robotic novices) from general surgery, urology, and gynecology were enrolled in a 2-month curriculum: online didactics, half-day hands-on tutorial, and self-practice using nine inanimate exercises. Each trainee completed a questionnaire and performed a single proctored repetition of each task before (pretest) and after (post-test) training. Tasks were scored for time and errors using modified FLS metrics. For inter-rater reliability (IRR), three trainees were scored by two raters and analyzed using intraclass correlation coefficients (ICC). Data from eight experts were analyzed using ICC and Cronbach's α to determine test-retest reliability and internal consistency, respectively. Educational benefit was assessed by comparing baseline (pretest) and final (post-test) trainee performance; comparisons used Wilcoxon signed-rank test. Of the 55 trainees that pretested, 53 (96 %) completed all curricular components in 9-17 h and reached proficiency after completing an average of 72 ± 28 repetitions over 5 ± 1 h. Trainees indicated minimal prior robotic experience and "poor comfort" with robotic skills at baseline (1.8 ± 0.9) compared to final testing (3.1 ± 0.8, p < 0.001). IRR data for the composite score revealed an ICC of 0.96 (p < 0.001). Test-retest reliability was 0.91 (p < 0.001) and internal consistency was 0.81. Performance improved significantly after training for all nine tasks and according to composite scores (548 ± 176 vs. 914 ± 81, p < 0.001), demonstrating educational benefit. This curriculum is associated with high reliability measures, demonstrated feasibility for a large cohort of trainees, and yielded significant educational benefit. 
Further studies and adoption of this curriculum are encouraged.
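
Internal consistency of the kind reported above is commonly summarised with Cronbach's α, which compares the sum of the individual task variances with the variance of the total score. The sketch below shows the standard formula on a small table with one row per participant and one column per task; the function name and the scores are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per participant
    and one column per item (or task)."""
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores of four participants on two tasks.
example = [[2, 3], [4, 3], [3, 5], [5, 5]]
print(round(cronbach_alpha(example), 3))
```

Alpha approaches 1 when the items rise and fall together across participants, and falls towards 0 when they vary independently.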

  7. Kinematics of fast cervical rotations in persons with chronic neck pain: a cross-sectional and reliability study.

    PubMed

    Röijezon, Ulrik; Djupsjöbacka, Mats; Björklund, Martin; Häger-Ross, Charlotte; Grip, Helena; Liebermann, Dario G

    2010-09-27

    Assessment of sensorimotor function is useful for classification and treatment evaluation of neck pain disorders. Several studies have investigated various aspects of cervical motor functions. Most of these have involved slow or self-paced movements, while few have investigated fast cervical movements. Moreover, the reliability of assessment of fast cervical axial rotation has, to our knowledge, not been evaluated before. Cervical kinematics was assessed during fast axial head rotations in 118 women with chronic nonspecific neck pain (NS) and compared to 49 healthy controls (CON). The relationship between cervical kinematics and symptoms, self-rated functioning and fear of movement was evaluated in the NS group. A sub-sample of 16 NS and 16 CON was re-tested after one week to assess the reliability of kinematic variables. Six cervical kinematic variables were calculated: peak speed, range of movement, conjunct movements and three variables related to the shape of the speed profile. Together, peak speed and conjunct movements had a sensitivity of 76% and a specificity of 78% in discriminating between NS and CON, of which the major part could be attributed to peak speed (NS: 226 ± 88°/s and CON: 348 ± 92°/s, p < 0.01). Peak speed was slower in NS compared to healthy controls and even slower in NS with comorbidity of low-back pain. Associations were found between reduced peak speed and self-rated difficulties with running, performing head movements, car driving, sleeping and pain. Peak speed showed reasonably high reliability, while the reliability for conjunct movements was poor. Peak speed of fast cervical axial rotations is reduced in people with chronic neck pain, and even further reduced in subjects with concomitant low back pain. 
Fast cervical rotation test seems to be a reliable and valid tool for assessment of neck pain disorders on group level, while a rather large between subject variation and overlap between groups calls for caution in the interpretation of individual assessments.

  8. Cross-cultural adaptation and validation of the Italian version of the Kerlan-Jobe Orthopaedic Clinic Shoulder and Elbow score.

    PubMed

    Merolla, Giovanni; Corona, Katia; Zanoli, Gustavo; Cerciello, Simone; Giannotti, Stefano; Porcellini, Giuseppe

    2017-12-01

    The Kerlan-Jobe Orthopaedic Clinic (KJOC) Shoulder and Elbow score is a reliable and sensitive tool to measure the performance of overhead athletes. The purpose of this study was to carry out a cross-cultural adaptation and validation of the KJOC questionnaire in Italian and to assess its reliability, validity, and responsiveness. Ninety professional athletes with a painful shoulder were included in this study and were assigned to the "injury group" (n = 32) or the "overuse group" (n = 58); 65 were managed conservatively and 25 were treated by arthroscopic surgery. To assess the reliability of the KJOC score, patients were asked to fill in the questionnaire at baseline and after 2 weeks. To test the construct validity, KJOC scores were compared to those obtained with the Italian version of the Disabilities of the Arm, Shoulder, and Hand (DASH) scale, and with the DASH sports/performing arts module. To test KJOC score responsiveness, the follow-up KJOC scores of the participants treated conservatively were compared to those of the patients treated by arthroscopic surgery. Statistical analysis demonstrated that the KJOC questionnaire is reliable in terms of the single items and the overall score (ICC 0.95-0.99); that it has high construct validity (r_s = -0.697; p < 0.01); and that it is responsive to clinical differences in shoulder function (p < 0.0001). The Italian version of the KJOC Shoulder and Elbow score performed in a similar way to the English version and demonstrated good validity, reliability, and responsiveness after conservative and surgical treatment. Level of evidence: II.

  9. The development and validation of a test of science critical thinking for fifth graders.

    PubMed

    Mapeala, Ruslan; Siew, Nyet Moi

    2015-01-01

    The paper described the development and validation of the Test of Science Critical Thinking (TSCT) to measure three critical thinking skill constructs: comparing and contrasting, sequencing, and identifying cause and effect. The initial TSCT consisted of 55 multiple-choice test items, each of which required participants to select a correct response and a correct choice of the critical thinking skill used for their response. Data were obtained from a purposive sample of 30 fifth graders in a pilot study carried out in a primary school in Sabah, Malaysia. Students underwent teaching and learning sessions for 9 weeks using the Thinking Maps-aided Problem-Based Learning Module before they answered the TSCT. Analyses were conducted to check the difficulty index (p), the discrimination index (d), internal consistency reliability, content validity, and face validity. Analysis of the test-retest reliability data was conducted separately for a group of fifth graders with similar ability. Findings of the pilot study showed that, of the 55 items initially administered, only 30 items were selected, those with a relatively good difficulty index (p) ranging from 0.40 to 0.60 and a good discrimination index (d) ranging from 0.20 to 1.00. The Kuder-Richardson reliability value was found to be appropriate and relatively high, at 0.70, 0.73 and 0.92 for identifying cause and effect, sequencing, and comparing and contrasting, respectively. The content validity index obtained from three expert judgments equalled or exceeded 0.95. In addition, test-retest reliability showed good, statistically significant correlations ([Formula: see text]). From the above results, the selected 30-item TSCT was found to have sufficient reliability and validity and would therefore represent a useful tool for measuring critical thinking ability among fifth graders in primary science.
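
The item screening described above can be sketched as a classical item analysis: the difficulty index p is the proportion of students answering an item correctly, and the discrimination index d is the difference in that proportion between the highest-scoring and lowest-scoring groups. The selection ranges (0.40 to 0.60 for p, 0.20 to 1.00 for d) come from the abstract; the 27% group size, the function names, and the response data are assumptions made for illustration.

```python
def item_analysis(responses, fraction=0.27):
    """Return (difficulty p, discrimination d) for each item.

    responses: one list of 0/1 item scores per student.
    fraction: share of students in the upper and lower groups
              (0.27 is a common convention, assumed here).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    group = max(1, round(fraction * n_students))
    ranked = sorted(responses, key=sum, reverse=True)  # best total scores first
    upper, lower = ranked[:group], ranked[-group:]
    results = []
    for j in range(n_items):
        p = sum(r[j] for r in responses) / n_students
        d = (sum(r[j] for r in upper) - sum(r[j] for r in lower)) / group
        results.append((p, d))
    return results


def select_items(results, p_range=(0.40, 0.60), d_range=(0.20, 1.00)):
    """Indices of items whose p and d both fall inside the stated ranges."""
    return [
        j
        for j, (p, d) in enumerate(results)
        if p_range[0] <= p <= p_range[1] and d_range[0] <= d <= d_range[1]
    ]


# Hypothetical responses of six students to three items.
answers = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(select_items(item_analysis(answers)))
```

In this toy data all three items have the same difficulty, but the third does not separate strong from weak students and is therefore dropped.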

  10. An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs.

    PubMed

    Harvey, Naomi D; Craigon, Peter J; Blythe, Simon A; England, Gary C W; Asher, Lucy

    2017-01-01

    Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales except Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs.
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.

  11. Validity and reliability of the abdominal test and evaluation systems tool (ABTEST) to accurately measure abdominal force.

    PubMed

    Glenn, Jordan M; Galey, Madeline; Edwards, Abigail; Rickert, Bradley; Washington, Tyrone A

    2015-07-01

    Ability to generate force from the core musculature is a critical factor for sports and general activities with insufficiencies predisposing individuals to injury. This study evaluated isometric force production as a valid and reliable method of assessing abdominal force using the abdominal test and evaluation systems tool (ABTEST). Secondary analysis estimated 1-repetition maximum on commercially available abdominal machine compared to maximum force and average power on ABTEST system. This study utilized test-retest reliability and comparative analysis for validity. Reliability was measured using test-retest design on ABTEST. Validity was measured via comparison to estimated 1-repetition maximum on a commercially available abdominal device. Participants applied isometric, abdominal force against a transducer and muscular activation was evaluated measuring normalized electromyographic activity at the rectus-abdominus, rectus-femoris, and erector-spinae. Test, re-test force production on ABTEST was significantly correlated (r=0.84; p<0.001). Mean electromyographic activity for the rectus-abdominus (72.93% and 75.66%), rectus-femoris (6.59% and 6.51%), and erector-spinae (6.82% and 5.48%) were observed for trial-1 and trial-2, respectively. Significant correlations for the estimated 1-repetition maximum were found for average power (r=0.70, p=0.002) and maximum force (r=0.72, p<0.001). Data indicate the ABTEST can accurately measure rectus-abdominus force isolated from hip-flexor involvement. Negligible activation of erector-spinae substantiates little subjective effort among participants in the lower back. Results suggest ABTEST is a valid and reliable method of evaluating abdominal force. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  12. Cross-Cultural Adaptation and Validation of the Italian Version of SWAL-QOL.

    PubMed

    Ginocchio, Daniela; Alfonsi, Enrico; Mozzanica, Francesco; Accornero, Anna Rosa; Bergonzoni, Antonella; Chiarello, Giulia; De Luca, Nicoletta; Farneti, Daniele; Marilia, Simonelli; Calcagno, Paola; Turroni, Valentina; Schindler, Antonio

    2016-10-01

    The aim of the study was to evaluate the reliability and validity of the Italian SWAL-QOL (I-SWAL-QOL). The study consisted of five phases: item generation, reliability analysis, normative data generation, validity analysis, and responsiveness analysis. The item generation phase followed the five-step, cross-cultural, adaptation process of translation and back-translation. A group of 92 dysphagic patients was enrolled for the internal consistency analysis. Seventy-eight patients completed the I-SWAL-QOL twice, 2 weeks apart, for test-retest reliability analysis. A group of 200 asymptomatic subjects completed the I-SWAL-QOL for normative data generation. I-SWAL-QOL scores obtained by both the group of dysphagic subjects and asymptomatic ones were compared for validity analysis. I-SWAL-QOL scores were correlated with SF-36 scores in 67 patients with dysphagia for concurrent validity analysis. Finally, I-SWAL-QOL scores obtained in a group of 30 dysphagic patients before and after successful rehabilitation treatment were compared for responsiveness analysis. All the enrolled patients managed to complete the I-SWAL-QOL without needing any assistance, within 20 min. Internal consistency was acceptable for all I-SWAL-QOL subscales (α > 0.70). Test-retest reliability was also satisfactory for all subscales (ICC > 0.7). A significant difference between the dysphagic group and the control group was found in all I-SWAL-QOL subscales (p < 0.05). Mild to moderate correlations between I-SWAL-QOL and SF-36 subscales were observed. I-SWAL-QOL scores obtained in the pre-treatment condition were significantly lower than those obtained after swallowing rehabilitation. I-SWAL-QOL is reliable, valid, responsive to changes in QOL, and recommended for clinical practice and outcome research.

  13. Reliability and validity of cervical position measurements in individuals with and without chronic neck pain.

    PubMed

    Dunleavy, Kim; Neil, Joseph; Tallon, Allison; Adamo, Diane E

    2015-09-01

    The cervical range of motion device (CROM) has been shown to provide reliable forward head position (FHP) measurement when the upper cervical angle (UCA) is controlled. However, measurement without UCA standardization is reflective of habitual patterns. Criterion validity has not been reported. The purposes of this study were to establish: (1) criterion validity of CROM FHP and UCA compared to Optotrak data, (2) relative reliability and minimal detectable change (MDC95) in patients with and without cervical pain, and (3) to compare UCA and FHP in patients with and without pain in habitual postures. (1) Within-subjects single session concurrent criterion validity design. Simultaneous CROM and OP measurement was conducted in habitual sitting posture in 16 healthy young adults. (2) Reliability and MDC95 of UCA and FHP were calculated from three trials. (3) Values for adults over 35 years with cervical pain and age-matched healthy controls were compared. (1) Forward head position distances were moderately correlated and UCA angles were highly correlated. The mean (standard deviation) differences can be expected to vary between 1·48 cm (1·74) for FHP and -1·7 (2·46)° for UCA. (2) Reliability for CROM FHP measurements were good to excellent (no pain) and moderate (pain). Cervical range of motion FHP MDC95 was moderately low (no pain), and moderate (pain). Reliability for CROM UCA measurements was excellent and MDC95 low for both groups. There was no difference in FHP distances between the pain and no pain groups, UCA was significantly more extended in the pain group (P<0·05). Cervical range of motion FHP measurements were only moderately correlated with Optotrak data, and limits of agreement (LOA) and MDC95 were relatively large. There was also no difference in CROM FHP distance between older symptomatic and asymptomatic individuals. Cervical range of motion FHP measurement is therefore not recommended as a clinical outcome measure. 
Cervical range of motion UCA measurements showed good criterion validity, excellent test-retest reliability, and achievable MDC95 in asymptomatic and symptomatic participants. Differences of more than 6° are required to exceed error. Cervical range of motion UCA shows promise as a useful reliable and valid measurement, particularly as patients with cervical pain exhibited significantly more extended angles.
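
    The abstract reports MDC95 values without stating how they were derived. As a rough sketch, one widely used convention computes the standard error of measurement as SEM = SD * sqrt(1 - ICC) and then MDC95 = 1.96 * sqrt(2) * SEM. The function names and the input values below are hypothetical and are not taken from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM under the common convention SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence: 1.96 * sqrt(2) * SEM.
    The sqrt(2) accounts for measurement error in both of the two measurements."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Illustrative inputs only: a between-subject SD of 2.0 degrees and an ICC of 0.75.
print(mdc95(sd=2.0, icc=0.75))
```

    A change smaller than the resulting value would be indistinguishable from measurement error under this convention, which is the sense in which the abstract says differences must "exceed error".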

  14. Reliability and validity of cervical position measurements in individuals with and without chronic neck pain

    PubMed Central

    Neil, Joseph; Tallon, Allison; Adamo, Diane E.

    2015-01-01

    Objectives The cervical range of motion device (CROM) has been shown to provide reliable forward head position (FHP) measurement when the upper cervical angle (UCA) is controlled. However, measurement without UCA standardization is reflective of habitual patterns. Criterion validity has not been reported. The purposes of this study were to establish: (1) criterion validity of CROM FHP and UCA compared to Optotrak data, (2) relative reliability and minimal detectable change (MDC95) in patients with and without cervical pain, and (3) to compare UCA and FHP in patients with and without pain in habitual postures. Methods (1) Within-subjects single session concurrent criterion validity design. Simultaneous CROM and OP measurement was conducted in habitual sitting posture in 16 healthy young adults. (2) Reliability and MDC95 of UCA and FHP were calculated from three trials. (3) Values for adults over 35 years with cervical pain and age-matched healthy controls were compared. Results (1) Forward head position distances were moderately correlated and UCA angles were highly correlated. The mean (standard deviation) differences can be expected to vary between 1·48 cm (1·74) for FHP and −1·7 (2·46)° for UCA. (2) Reliability for CROM FHP measurements were good to excellent (no pain) and moderate (pain). Cervical range of motion FHP MDC95 was moderately low (no pain), and moderate (pain). Reliability for CROM UCA measurements was excellent and MDC95 low for both groups. There was no difference in FHP distances between the pain and no pain groups, UCA was significantly more extended in the pain group (P<0·05). Discussion Cervical range of motion FHP measurements were only moderately correlated with Optotrak data, and limits of agreement (LOA) and MDC95 were relatively large. There was also no difference in CROM FHP distance between older symptomatic and asymptomatic individuals. 
Cervical range of motion FHP measurement is therefore not recommended as a clinical outcome measure. Cervical range of motion UCA measurements showed good criterion validity, excellent test–retest reliability, and achievable MDC95 in asymptomatic and symptomatic participants. Differences of more than 6° are required to exceed error. Cervical range of motion UCA shows promise as a useful reliable and valid measurement, particularly as patients with cervical pain exhibited significantly more extended angles. PMID:26917936

  15. Modeling heterogeneous (co)variances from adjacent-SNP groups improves genomic prediction for milk protein composition traits.

    PubMed

    Gebreyesus, Grum; Lund, Mogens S; Buitenhuis, Bart; Bovenhuis, Henk; Poulsen, Nina A; Janss, Luc G

    2017-12-05

    Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-overlapping genome segments. A segment was defined as one SNP, or a group of 50, 100, or 200 adjacent SNPs, or one chromosome, or the whole genome. Traditional univariate and bivariate genomic best linear unbiased prediction (GBLUP) models were also run for comparison. Reliabilities were calculated through a resampling strategy and using deterministic formula. BayesAS models improved prediction reliability for most of the traits compared to GBLUP models and this gain depended on segment size and genetic architecture of the traits. The gain in prediction reliability was especially marked for the protein composition traits β-CN, κ-CN and β-LG, for which prediction reliabilities were improved by 49 percentage points on average using the MT-BayesAS model with a 100-SNP segment size compared to the bivariate GBLUP. Prediction reliabilities were highest with the BayesAS model that uses a 100-SNP segment size. 
The bivariate versions of our BayesAS models resulted in extra gains of up to 6% in prediction reliability compared to the univariate versions. Substantial improvement in prediction reliability was possible for most of the traits related to milk protein composition using our novel BayesAS models. Grouping adjacent SNPs into segments provided enhanced information to estimate parameters and allowing the segments to have different (co)variances helped disentangle heterogeneous (co)variances across the genome.

  16. Repeatability of self-report measures of physical activity, sedentary and travel behaviour in Hong Kong adolescents for the iHealt(H) and IPEN - Adolescent studies.

    PubMed

    Cerin, Ester; Sit, Cindy H P; Huang, Ya-Jun; Barnett, Anthony; Macfarlane, Duncan J; Wong, Stephen S H

    2014-06-06

    Physical activity and sedentary behaviour are important contributors to adolescents' health. These behaviours may be affected by the school and neighbourhood built environments. However, current evidence on such effects is mainly limited to Western countries. The International Physical Activity and the Environment Network (IPEN)-Adolescent study aims to examine associations of the built environment with adolescent physical activity and sedentary behaviour across five continents. We report on the repeatability of measures of in-school and out-of-school physical activity, plus measures of out-of-school sedentary and travel behaviours adopted by the IPEN - Adolescent study and adapted for Chinese-speaking Hong Kong adolescents participating in the international Healthy environments and active living in teenagers-(Hong Kong) [iHealt(H)] study, which is part of IPEN-Adolescent. Items gauging in-school physical activity and out-of-school physical activity, and out-of-school sedentary and travel behaviours developed for the IPEN - Adolescent study were translated from English into Chinese, adapted, and pilot tested. Sixty-eight Chinese-speaking 12-17 year old secondary school students (36 boys; 32 girls) residing in areas of Hong Kong differing in transport-related walkability were recruited. They self-completed the survey items twice, 8-16 days apart. Test-retest reliability was assessed for the whole sample and by gender using one-way random effects intra-class correlation coefficients (ICC). Test-retest reliability of items with restricted variability was assessed using percentage agreement. Overall test-retest reliability of items and scales was moderate to excellent (ICC = 0.47-0.92). Items with restricted variability in responses had a high percentage agreement (92%-100%). 
Test-retest reliability was similar in girls and boys, with the exception of daily hours of homework (reliability higher in girls) and number of school-based sports teams or after-school physical activity classes (reliability higher in boys). The translated and adapted self-report measures of physical activity, sedentary and travel behaviours used in the iHealt(H) study are sufficiently reliable. Levels of reliability are comparable or slightly higher than those observed for the original measures.
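
    The one-way random effects ICC mentioned in this abstract can be sketched from its ANOVA definition, ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the between-subject and within-subject mean squares and k is the number of administrations per subject. This is a minimal sketch of the single-measurement form; the abstract does not say whether the authors used the single or average form, and the data and function name are hypothetical.

```python
def icc_oneway(scores):
    """One-way random effects ICC(1,1).
    `scores` is a list of rows, one per subject, each holding k repeated measurements."""
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-subject mean square
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square
    msw = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical test and retest answers for three respondents (not study data).
test_retest = [[1, 2], [2, 3], [3, 4]]
print(icc_oneway(test_retest))
```

    Because the one-way model treats any systematic shift between administrations as error, a uniform increase at retest lowers the coefficient even when the rank order of respondents is unchanged, as in the toy data above.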

  17. Dynamic one-dimensional modeling of secondary settling tanks and system robustness evaluation.

    PubMed

    Li, Ben; Stenstrom, M K

    2014-01-01

    One-dimensional secondary settling tank models are widely used in current engineering practice for design and optimization, and usually can be expressed as a nonlinear hyperbolic or nonlinear strongly degenerate parabolic partial differential equation (PDE). Reliable numerical methods are needed to produce approximate solutions that converge to the exact analytical solutions. In this study, we introduced a reliable numerical technique, the Yee-Roe-Davis (YRD) method as the governing PDE solver, and compared its reliability with the prevalent Stenstrom-Vitasovic-Takács (SVT) method by assessing their simulation results at various operating conditions. The YRD method also produced a similar solution to the previously developed Method G and Enquist-Osher method. The YRD and SVT methods were also used for a time-to-failure evaluation, and the results show that the choice of numerical method can greatly impact the solution. Reliable numerical methods, such as the YRD method, are strongly recommended.

  18. Reliability of self-reported childhood physical abuse by adults and factors predictive of inconsistent reporting.

    PubMed

    McKinney, Christy M; Harris, T Robert; Caetano, Raul

    2009-01-01

    Little is known about the reliability of self-reported child physical abuse (CPA) or CPA reporting practices. We estimated reliability and prevalence of self-reported CPA and identified factors predictive of inconsistent CPA reporting among 2,256 participants using surveys administered in 1995 and 2000. Reliability of CPA was fair to moderate (kappa = 0.41). Using a positive report from either survey, the prevalence of moderate (61.8%) and severe (12.0%) CPA was higher than at either survey alone. Compared to consistent reporters of having experienced CPA, inconsistent reporters were less likely to be ≥ 30 years old (vs. 18-29) or Black (vs. White) and more likely to have < 12 years of education (vs. 12), have no alcohol-related problems (vs. having problems), or report one type (vs. ≥ 2) of CPA. These findings may assist researchers conducting and interpreting studies of CPA.
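
    The kappa statistic and the "positive report from either survey" prevalence described above can be sketched as follows. The code assumes unweighted Cohen's kappa on paired categorical reports, which the abstract does not confirm, and the two response vectors and function names are invented for illustration.

```python
def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two paired lists of categorical reports.
    Undefined (division by zero) when chance agreement equals 1."""
    n = len(first)
    categories = set(first) | set(second)
    observed = sum(a == b for a, b in zip(first, second)) / n
    expected = sum(
        (first.count(c) / n) * (second.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


def prevalence_either(first, second):
    """Share of respondents with a positive (1) report on at least one survey."""
    return sum(1 for a, b in zip(first, second) if a or b) / len(first)


# Hypothetical reports from the same ten respondents at two survey waves.
wave_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
wave_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
print(cohen_kappa(wave_1, wave_2))
print(prevalence_either(wave_1, wave_2))
```

    In the toy data each wave alone gives a prevalence of 0.4, while counting a positive report from either wave gives 0.5, which mirrors why the combined prevalence in the abstract exceeds that of either survey.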

  19. Modelling utility-scale wind power plants. Part 2: Capacity credit

    NASA Astrophysics Data System (ADS)

    Milligan, Michael R.

    2000-10-01

    As the worldwide use of wind turbine generators in utility-scale applications continues to increase, it will become increasingly important to assess the economic and reliability impact of these intermittent resources. Although the utility industry appears to be moving towards a restructured environment, basic economic and reliability issues will continue to be relevant to companies involved with electricity generation. This article is the second in a two-part series that addresses modelling approaches and results that were obtained in several case studies and research projects at the National Renewable Energy Laboratory (NREL). This second article focuses on wind plant capacity credit as measured with power system reliability indices. Reliability-based methods of measuring capacity credit are compared with wind plant capacity factor. The relationship between capacity credit and accurate wind forecasting is also explored. Published in 2000 by John Wiley & Sons, Ltd.

  20. Bem Sex Role Inventory Validation in the International Mobility in Aging Study.

    PubMed

    Ahmed, Tamer; Vafaei, Afshin; Belanger, Emmanuelle; Phillips, Susan P; Zunzunegui, Maria-Victoria

    2016-09-01

    This study investigated the measurement structure of the Bem Sex Role Inventory (BSRI) with different factor analysis methods. Most previous studies on validity applied exploratory factor analysis (EFA) to examine the BSRI. We aimed to assess the psychometric properties and construct validity of the 12-item short-form BSRI in a sample administered to 1,995 older adults from wave 1 of the International Mobility in Aging Study (IMIAS). We used Cronbach's alpha to assess internal consistency reliability and confirmatory factor analysis (CFA) to assess psychometric properties. EFA revealed a three-factor model, further confirmed by CFA and compared with the original two-factor structure model. Results revealed that a two-factor solution (instrumentality-expressiveness) has satisfactory construct validity and superior fit to data compared to the three-factor solution. The two-factor solution confirms expected gender differences in older adults. The 12-item BSRI provides a brief, psychometrically sound, and reliable instrument in international samples of older adults.

  1. Puzzling with online games (BAM-COG): reliability, validity, and feasibility of an online self-monitor for cognitive performance in aging adults.

    PubMed

    Aalbers, Teun; Baars, Maria A E; Olde Rikkert, Marcel G M; Kessels, Roy P C

    2013-12-03

    Online interventions are aiming increasingly at cognitive outcome measures but so far no easy and fast self-monitors for cognition have been validated or proven reliable and feasible. This study examines a new instrument called the Brain Aging Monitor-Cognitive Assessment Battery (BAM-COG) for its alternate forms reliability, face and content validity, and convergent and divergent validity. Also, reference values are provided. The BAM-COG consists of four easily accessible, short, yet challenging puzzle games that have been developed to measure working memory ("Conveyer Belt"), visuospatial short-term memory ("Sunshine"), episodic recognition memory ("Viewpoint"), and planning ("Papyrinth"). A total of 641 participants were recruited for this study. Of these, 397 adults, 40 years and older (mean 54.9, SD 9.6), were eligible for analysis. Study participants played all games three times with 14 days in between sets. Face and content validity were based on expert opinion. Alternate forms reliability (AFR) was measured by comparing scores on different versions of the BAM-COG and expressed with an intraclass correlation (ICC: two-way mixed; consistency at 95%). Convergent validity (CV) was provided by comparing BAM-COG scores to gold-standard paper-and-pencil and computer-assisted cognitive assessment. Divergent validity (DV) was measured by comparing BAM-COG scores to the National Adult Reading Test IQ (NART-IQ) estimate. Both CV and DV are expressed as Spearman rho correlation coefficients. Three out of four games showed adequate results on AFR, CV, and DV measures. The games Conveyer Belt, Sunshine, and Papyrinth have AFR ICCs of .420, .426, and .645 respectively. Also, these games had good to very good CV correlations: rho=.577 (P=.001), rho=.669 (P<.001), and rho=.400 (P=.04), respectively. Last, as expected, DV correlations were low: rho=-.029 (P=.44), rho=-.029 (P=.45), and rho=-.134 (P=.28) respectively. 
The game Viewpoint provided less desirable results with an AFR ICC of .167, CV rho=.202 (P=.15), and DV rho=-.162 (P=.21). This study provides evidence for the use of the BAM-COG test battery as a feasible, reliable, and valid tool to monitor cognitive performance in healthy adults in an online setting. Three out of four games have good psychometric characteristics to measure working memory, visuospatial short-term memory, and planning capacity.

  2. Assessing Assessment: Evaluating Outcomes and Reliabilities of Grammar, Math, and Writing Skill Measures in an Introductory Journalism Course

    ERIC Educational Resources Information Center

    Farwell, Tricia M.; Alligood, Leon; Fitzgerald, Sharon; Blake, Ken

    2016-01-01

    This article introduces an objective grammar and math assessment and evaluates the assessment's outcome and reliability when fielded among eighty-one students in media writing courses. In addition, the article proposes a rubric for grading straight news leads and compares the rubric's reliability with the reliability of rating straight news leads…

  3. Assessment of the technological reliability of a hybrid constructed wetland for wastewater treatment in a mountain eco-tourist farm in Poland.

    PubMed

    Jucherski, Andrzej; Nastawny, Maria; Walczowski, Andrzej; Jóźwiakowski, Krzysztof; Gajewska, Magdalena

    2017-06-01

    The aim of the present study was to assess the technological reliability of a domestic hybrid wastewater treatment installation consisting of a classic three-chambered (volume 6 m³) septic tank, a vertical flow trickling bed filled with granules of a calcinated clay material (KERAMZYT), a special wetland bed constructed on a slope, and a permeable pond used as a receiver. The test treatment plant was located at a mountain eco-tourist farm on the periphery of the spa municipality of Krynica-Zdrój, Poland. The plant's operational reliability in reducing the concentration of organic matter, measured as biochemical oxygen demand (BOD5) and chemical oxygen demand (COD), was 100% when modelled by both the Weibull and the lognormal distributions. The respective reliability values for total nitrogen removal were 76.8% and 77.0%, total suspended solids - 99.5% and 92.6%, and PO4-P - 98.2% and 95.2%, with the differences being negligible. The installation was characterized by a very high level of technological reliability when compared with other solutions of this type. The Weibull method employed for statistical evaluation of technological reliability can also be used for comparison purposes. From the ecological perspective, the facility presented in the study has proven to be an effective tool for protecting local aquifer areas.

  4. Reliability of IGBT in a STATCOM for Harmonic Compensation and Power Factor Correction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gopi Reddy, Lakshmi Reddy; Tolbert, Leon M; Ozpineci, Burak

    With smart grid integration, there is a need to characterize reliability of a power system by including reliability of power semiconductors in grid related applications. In this paper, the reliability of IGBTs in a STATCOM application is presented for two different applications, power factor correction and harmonic elimination. The STATCOM model is developed in EMTP, and analytical equations for average conduction losses in an IGBT and a diode are derived and compared with experimental data. A commonly used reliability model is used to predict reliability of IGBT.

  5. Systematic review of patient-specific instrumentation in total knee arthroplasty: new but not improved.

    PubMed

    Sassoon, Adam; Nam, Denis; Nunley, Ryan; Barrack, Robert

    2015-01-01

    Patient-specific cutting blocks have been touted as a more efficient and reliable means of achieving neutral mechanical alignment during TKA with the proposed downstream effect of improved clinical outcomes. However, it is not clear to what degree published studies support these assumptions. We asked: (1) Do patient-specific cutting blocks achieve neutral mechanical alignment more reliably during TKA when compared with conventional methods? (2) Does patient-specific instrumentation (PSI) provide financial benefit through improved surgical efficiency? (3) Does the use of patient-specific cutting blocks translate to improved clinical results after TKA when compared with conventional instrumentation? We performed a systematic review in accordance with Cochrane guidelines of controlled studies (prospective and retrospective) in MEDLINE® and EMBASE® with respect to patient-specific cutting blocks and their effect on alignment, cost, operative time, clinical outcome scores, complications, and survivorship. Sixteen studies (Level I-III on the levels of evidence rubric) were identified and used in addressing the first question, 13 (Level I-III) for the second question, and two (Level III) for the third question. Qualitative assessment of the selected Level I studies was performed using the modified Jadad score; Level II and III studies were rated based on the Newcastle-Ottawa scoring system. The majority of studies did not show an improvement in overall limb alignment when PSI was compared with standard instrumentation. Mixed results were seen across studies with regard to the prevalence of alignment outliers when PSI was compared with conventional cutting blocks with some studies demonstrating no difference, some showing an improvement with PSI, and a single study showing worse results with PSI. The studies demonstrated mixed results regarding the influence of PSI on operative times. 
Decreased operative times were not uniformly observed, and when noted, they were found to be of minimal clinical or financial significance. PSI did reliably reduce the number of instrument trays required for processing perioperatively. The accuracy of the preoperative plan, generated by the PSI manufacturers, was found lacking, often leading to multiple intraoperative changes, thereby disrupting the flow of the operation and negatively impacting efficiency. Limited data exist with regard to the effect of PSI on postoperative function, improvement in pain, and patient satisfaction. Neither of the two studies we identified provided strong evidence to support an advantage favoring the use of PSI. No identified studies addressed survivorship of components placed with PSI compared with those placed with standard instrumentation. PSI for TKA has not reliably demonstrated improvement of postoperative limb or component alignment when compared with standard instrumentation. Although decisive evidence exists to support that PSI requires fewer surgical trays, PSI has not clearly been shown to improve overall surgical efficiency or the cost-effectiveness of TKA. Mid- and long-term data regarding PSI's effect on functional outcomes and component survivorship do not exist and short-term data are scarce. Limited available literature does not clearly support any improvement of postoperative pain, activity, function, or ROM when PSI is compared with traditional instrumentation.

  6. Rater reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS).

    PubMed

    Baker, Nancy A; Cook, James R; Redfern, Mark S

    2009-01-01

    This paper describes the inter-rater and intra-rater reliability, and the concurrent validity of an observational instrument, the Keyboard Personal Computer Style instrument (K-PeCS), which assesses stereotypical postures and movements associated with computer keyboard use. Three trained raters independently rated the video clips of 45 computer keyboard users to ascertain inter-rater reliability, and then re-rated a sub-sample of 15 video clips to ascertain intra-rater reliability. Concurrent validity was assessed by comparing the ratings obtained using the K-PeCS to scores developed from a 3D motion analysis system. The overall K-PeCS had excellent reliability [inter-rater: intra-class correlation coefficients (ICC)=.90; intra-rater: ICC=.92]. Most individual items on the K-PeCS had from good to excellent reliability, although six items fell below ICC=.75. Those K-PeCS items that were assessed for concurrent validity compared favorably to the motion analysis data for all but two items. These results suggest that most items on the K-PeCS can be used to reliably document computer keyboarding style.

  7. A critical analysis of test-retest reliability in instrument validation studies of cancer patients under palliative care: a systematic review

    PubMed Central

    2014-01-01

    Background Patient-reported outcome validation needs to achieve validity and reliability standards. Among reliability analysis parameters, test-retest reliability is an important psychometric property. Retested patients must be in a clinically stable condition. This is particularly problematic in palliative care (PC) settings because advanced cancer patients are prone to a faster rate of clinical deterioration. The aim of this study was to evaluate the methods by which multi-symptom and health-related quality of life (HRQoL) instruments based on patient-reported outcomes (PROs) have been validated in oncological PC settings with regard to test-retest reliability. Methods A systematic search of PubMed (1966 to June 2013), EMBASE (1980 to June 2013), PsychInfo (1806 to June 2013), CINAHL (1980 to June 2013), and SCIELO (1998 to June 2013), and specific PRO databases was performed. Studies were included if they described a set of validation studies for an instrument developed to measure multi-symptom or multidimensional HRQoL in advanced cancer patients under PC. The COSMIN checklist was used to rate the methodological quality of the study designs. Results We identified 89 validation studies from 746 potentially relevant articles. From those 89 articles, 31 measured test-retest reliability and were included in this review. Upon critical analysis of the overall quality of the criteria used to determine the test-retest reliability, 6 (19.4%), 17 (54.8%), and 8 (25.8%) of these articles were rated as good, fair, or poor, respectively, and no article was classified as excellent. Multi-symptom instruments were retested over a shortened interval when compared to the HRQoL instruments (median values 24 hours and 168 hours, respectively; p = 0.001). 
Validation studies that included objective confirmation of clinical stability in their design yielded better results for the test-retest analysis with regard to both pain and global HRQoL scores (p < 0.05). The quality of the statistical analysis and its description were of great concern. Conclusion Test-retest reliability has been infrequently and poorly evaluated. The confirmation of clinical stability was an important factor in our analysis, and we suggest that special attention be focused on clinical stability when designing a PRO validation study that includes advanced cancer patients under PC. PMID:24447633

  8. A simple video-based timing system for on-ice team testing in ice hockey: a technical report.

    PubMed

    Larson, David P; Noonan, Benjamin C

    2014-09-01

    The purpose of this study was to describe and evaluate a newly developed on-ice timing system for team evaluation in the sport of ice hockey. We hypothesized that this new, simple, inexpensive, timing system would prove to be highly accurate and reliable. Six adult subjects (age 30.4 ± 6.2 years) performed on ice tests of acceleration and conditioning. The performance times of the subjects were recorded using a handheld stopwatch, photocell, and high-speed (240 frames per second) video. These results were then compared to allow for accuracy calculations of the stopwatch and video as compared with filtered photocell timing that was used as the "gold standard." Accuracy was evaluated using maximal differences, typical error/coefficient of variation (CV), and intraclass correlation coefficients (ICCs) between the timing methods. The reliability of the video method was evaluated using the same variables in a test-retest analysis both within and between evaluators. The video timing method proved to be both highly accurate (ICC: 0.96-0.99 and CV: 0.1-0.6% as compared with the photocell method) and reliable (ICC and CV within and between evaluators: 0.99 and 0.08%, respectively). This video-based timing method provides a very rapid means of collecting a high volume of very accurate and reliable on-ice measures of skating speed and conditioning, and can easily be adapted to other testing surfaces and parameters.

  9. Comparing the ability of OPTION(12) and OPTION(5) to assess shared decision-making in genetic counselling.

    PubMed

    Vortel, Martina A; Adam, Shelin; Port-Thompson, Ashley V; Friedman, Jan M; Grande, Stuart W; Birch, Patricia H

    2016-10-01

    OPTION(12) is the most widely used tool to measure shared decision-making (SDM) in health care. A newer scale, OPTION(5), has been proposed as a more parsimonious measure that better addresses core concepts of SDM. This study compares OPTION(5) to OPTION(12) in prenatal genetic counselling. Two raters independently used OPTION(12) and OPTION(5) to score 27 clinical encounters between genetic counsellors (GC) and women with pregnancies at increased risk for genetic conditions. Global and item scores on the two instruments were compared to test concurrent validity and to identify usability in this context. Inter-rater reliability was also assessed for both instruments. Mean scores for OPTION(12) were 43.8 (SD=9.7), and for OPTION(5) were 60.6 (SD=12.5). The correlation between OPTION(12) and OPTION(5) scores was r=0.70. Inter-rater reliability was 0.70 and 0.85 for OPTION(12) and OPTION(5), respectively; however, mean inter-rater reliability for individual items was 0.31 and 0.63 for OPTION(12) and OPTION(5), respectively. GCs exhibit SDM as measured by both OPTION instruments. OPTION(5) exhibits improved psychometric performance relative to OPTION(12), and more specifically targets the core constructs of SDM. However, refinement of OPTION instruments or manuals is needed to improve reliability and validity in GC assessment. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  10. Reliability of the modified Gross Motor Function Measure-88 (GMFM-88) for children with both Spastic Cerebral Palsy and Cerebral Visual Impairment: A preliminary study.

    PubMed

    Salavati, M; Krijnen, W P; Rameckers, E A A; Looijestijn, P L; Maathuis, C G B; van der Schans, C P; Steenbergen, B

    2015-01-01

    The aims of this study were to adapt the Gross Motor Function Measure-88 (GMFM-88) for children with Cerebral Palsy (CP) and Cerebral Visual Impairment (CVI) and to determine the test-retest and interobserver reliability of the adapted version. Sixteen paediatric physical therapists familiar with CVI participated in the adaptation process. The Delphi method was used to gain consensus among a panel of experts. Seventy-seven children with CP and CVI (44 boys and 33 girls, aged between 50 and 144 months) participated in this study. To assess test-retest and interobserver reliability, the GMFM-88 was administered twice within three weeks (Mean=9 days, SD=6 days) by trained paediatric physical therapists, one of whom was familiar with the child and one who wasn't. Percentages of identical scores, Cronbach's alphas and intraclass correlation coefficients (ICC) were computed for each dimension level. All experts agreed on the proposed adaptations of the GMFM-88 for children with CP and CVI. Test-retest reliability ICCs for dimension scores were between 0.94 and 1.00, mean percentages of identical scores between 29 and 71, and interobserver reliability ICCs of the adapted GMFM-88 were 0.99-1.00 for dimension scores. Mean percentages of identical scores varied between 53 and 91. Test-retest and interobserver reliability of the GMFM-88-CVI for children with CP and CVI was excellent. Internal consistency of dimension scores lay between 0.97 and 1.00. The psychometric properties of the adapted GMFM-88 for children with CP and CVI are reliable and comparable to the original GMFM-88. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Reliability and validity of general health questionnaire (GHQ-12) for male tannery workers: a study carried out in Kanpur, India.

    PubMed

    Kashyap, Gyan Chandra; Singh, Shri Kant

    2017-03-21

    The purpose of this study was to test the reliability, validity and factor structure of the GHQ-12 questionnaire on male tannery workers of India. We tested three different factor models of the GHQ-12. This paper used primary data obtained from a cross-sectional household study of tannery workers from the Jajmau area of the city of Kanpur in northern India, which was conducted during January-June, 2015, as part of a doctoral program. The study covered 286 tannery workers from the study area. An interview schedule containing the GHQ-12 was used for tannery workers who had completed at least 1 year at their present occupation preceding the survey. To test reliability, Cronbach's alpha test was used. The convergent test was used for validity. Confirmatory factor analysis was used to compare three factor structures for the GHQ-12. A total of 286 samples were analyzed in this study. The mean age of the tannery workers in this study was 38 years (SD = 1.42). We found the alpha coefficient to be 0.93 for the complete sample. This alpha value represents acceptable internal consistency for all groups. Each item of the scale showed almost the same internal consistency of 0.93 for the male tannery workers. The correlation between factor 1 (Anxiety and Depression) and factor 2 (Social Dysfunction) was 0.92. The correlation between factor 1 (Anxiety and Depression) and factor 3 (Loss of confidence) was the highest, at 0.98. The comparative fit index (CFI) was best for model-III, with a value of 0.97. The SRMR indicator was also lowest for model-III, at 0.031. The findings suggest that the Hindi version of the GHQ-12 is a reliable and valid tool for measuring psychological distress in male tannery workers of Kanpur city, India. The study found that the model proposed by Graetz was the best-fitting model for the data.
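
The internal-consistency figure quoted above is Cronbach's alpha. The abstract does not describe the software or data layout used, so the following is only a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the respondents-by-items list layout are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items table of scores.

    item_scores: list of respondents, each a list of k item scores
    (hypothetical layout chosen for this sketch).
    """
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*item_scores)]
    # Sample variance of each respondent's total score.
    # Note: raises ZeroDivisionError if all total scores are identical.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

For a 12-item scale such as the GHQ-12, each inner list would hold 12 scores; the sketch makes no attempt to handle missing responses.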

  12. Health Auctions: a Valuation Experiment (HAVE) study protocol.

    PubMed

    Kularatna, Sanjeewa; Petrie, Dennis; Scuffham, Paul A; Byrnes, Joshua

    2016-04-07

    Quality-adjusted life years are derived using health state utility weights which adjust for the relative value of living in each health state compared with living in perfect health. Various techniques are used to estimate health state utility weights including time-trade-off and standard gamble. These methods have exhibited limitations in terms of complexity, validity and reliability. A new composite approach using experimental auctions to value health states is introduced in this protocol. A pilot study will test the feasibility and validity of using experimental auctions to value health states in monetary terms. A convenience sample (n=150) from a population of university staff and students will be invited to participate in 30 auction sets with a group of 5 people in each set. The 9 health states auctioned in each auction set will come from the commonly used EQ-5D-3L instrument. Participants purchase at most 2 health states, and the participant who acquires the 2 'best' health states on average will keep the amount of money they do not spend in acquiring those health states. The value (highest bid and average bid) of each of the 24 health states will be compared across auctions to test for reliability across auction groups and across auctioneers. A test-retest will be conducted for 10% of the sample to assess reliability of responses for health state auctions. Feasibility of conducting experimental auctions to value health states will also be examined. The validity of estimated health state values will be compared with published utility estimates from other methods. This pilot study will explore the feasibility, reliability and validity of using experimental auctions for valuing health states. Ethical clearance was obtained from Griffith University ethics committee. The results will be disseminated in peer-reviewed journals and major international conferences. Published by the BMJ Publishing Group Limited. 

  13. Reliability, Convergent Validity and Time Invariance of Default Mode Network Deviations in Early Adult Major Depressive Disorder.

    PubMed

    Bessette, Katie L; Jenkins, Lisanne M; Skerrett, Kristy A; Gowins, Jennifer R; DelDonno, Sophie R; Zubieta, Jon-Kar; McInnis, Melvin G; Jacobs, Rachel H; Ajilore, Olusola; Langenecker, Scott A

    2018-01-01

    There is substantial variability across studies of default mode network (DMN) connectivity in major depressive disorder, and reliability and time-invariance are not reported. This study evaluates whether DMN dysconnectivity in remitted depression (rMDD) is reliable over time and symptom-independent, and explores convergent relationships with cognitive features of depression. A longitudinal study was conducted with 82 young adults free of psychotropic medications (47 rMDD, 35 healthy controls) who completed clinical structured interviews, neuropsychological assessments, and 2 resting-state fMRI scans across 2 study sites. Functional connectivity analyses from bilateral posterior cingulate and anterior hippocampal formation seeds in DMN were conducted at both time points within a repeated-measures analysis of variance to compare groups and evaluate reliability of group-level connectivity findings. Eleven hyper- (from posterior cingulate) and 6 hypo- (from hippocampal formation) connectivity clusters in rMDD were obtained with moderate to adequate reliability in all but one cluster (ICC range = 0.50 to 0.76 for 16 of 17). The significant clusters were reduced with a principal component analysis (5 components obtained) to explore these connectivity components, and were then correlated with cognitive features (rumination, cognitive control, learning and memory, and explicit emotion identification). At the exploratory level, for convergent validity, components consisting of posterior cingulate with cognitive control network hyperconnectivity in rMDD were related to cognitive control (inverse) and rumination (positive). Components consisting of anterior hippocampal formation with social emotional network and DMN hypoconnectivity were related to memory (inverse) and happy emotion identification (positive). Thus, time-invariant DMN connectivity differences exist early in the lifespan course of depression and are reliable. 
The nuanced results suggest a ventral within-network hypoconnectivity associated with poor memory and a dorsal cross-network hyperconnectivity linked to poorer cognitive control and elevated rumination. Study of early course remitted depression with attention to reliability and symptom independence could lead to more readily translatable clinical assessment tools for biomarkers.

  14. International physical activity questionnaire: reliability and validity of the Turkish version.

    PubMed

    Saglam, Melda; Arikan, Hulya; Savci, Sema; Inal-Ince, Deniz; Bosnak-Guclu, Meral; Karabulut, Erdem; Tokgozoglu, Lale

    2010-08-01

    Physical inactivity is a global problem which is related to many chronic health disorders. Physical activity scales which allow cross-cultural comparisons have been developed. The goal was to assess the reliability and validity of a Turkish version of the International Physical Activity Questionnaire (IPAQ). 1,097 university students (721 women, 376 men; ages 18-32) volunteered. Short and long forms of the IPAQ gave good agreement and comparable 1-wk. test-retest reliabilities. Caltrac accelerometer data were compared with IPAQ scores in 80 participants with good agreement for short and long forms. Turkish versions of the IPAQ short and long forms are reliable and valid in assessment of physical activity.

  15. Live versus Video Observations: Comparing the Reliability and Validity of Two Methods of Assessing Classroom Quality

    ERIC Educational Resources Information Center

    Curby, Timothy W.; Johnson, Price; Mashburn, Andrew J.; Carlis, Lydia

    2016-01-01

    When conducting classroom observations, researchers are often confronted with the decision of whether to conduct observations live or by using pre-recorded video. The present study focuses on comparing and contrasting observations of live and video administrations of the Classroom Assessment Scoring System-PreK (CLASS-PreK). Associations between…

  16. Multidimensional CAT Item Selection Methods for Domain Scores and Composite Scores: Theory and Applications

    ERIC Educational Resources Information Center

    Yao, Lihua

    2012-01-01

    Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…

  17. Cross-National Prevalence of Traditional Bullying, Traditional Victimization, Cyberbullying and Cyber-Victimization: Comparing Single-Item and Multiple-Item Approaches of Measurement

    ERIC Educational Resources Information Center

    Yanagida, Takuya; Gradinger, Petra; Strohmeier, Dagmar; Solomontos-Kountouri, Olga; Trip, Simona; Bora, Carmen

    2016-01-01

    Many large-scale cross-national studies rely on a single-item measurement when comparing prevalence rates of traditional bullying, traditional victimization, cyberbullying, and cyber-victimization between countries. However, the reliability and validity of single-item measurement approaches are highly problematic and might be biased. Data from…

  18. Reliability of information on people with disabilities gathered by community health workers in highly consanguineous communities of Northeastern Brazil.

    PubMed

    Lopes, Fernando Rocha Lucena; Monteiro, Karolinne Souza; Figueiredo, Thalita; Wanderley, Thyago da Costa; Pequeno, Thiago de Almeida; Lima, Shirley; Santos, Silvana

    2017-05-02

    In Brazil, community health workers have gathered monthly information on people with disabilities to maintain the Primary Care Information System since 1998; however, few studies have used this database for scientific or public health policy purposes. This study aimed to evaluate the reliability of information on people with disabilities gathered by community health workers in primary care services. This was a cross-sectional population-based study conducted in two highly consanguineous communities, involving a population of 18,458 inhabitants in Northeastern Brazil. To study the prevalence of people with disabilities, estimates made by health workers were compared with those obtained by researchers who interviewed 15.6% of the total population. To study the agreement of the information, data on 106 people with disabilities completed independently by researchers and health workers were compared to evaluate the degree of agreement for 28 variables analysed. Kappa statistics (κ) were used to calculate the inter-rater agreement. The prevalence of disability estimated by community health workers was 3.01 and 2.00% for city A and B, respectively, while the percentages obtained by researchers were 6.72 and 5.65%, respectively, showing that community health workers underestimated prevalence. The Kappa index value obtained for all data analysed (2,589 items excluding losses) was 0.808 (p < 0.01), indicating an almost perfect consistency between the information collected by health workers and that collected by researchers. Community health workers collected information with a high degree of reliability, although the identification of the prevalence of disabled individuals was potentially impaired due to the work process.
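
The kappa statistic used above corrects raw percent agreement for the agreement two raters would reach by chance. The abstract does not say which kappa variant or software was used, so the following is only a minimal sketch of unweighted Cohen's kappa for two raters; the function name and input format are assumptions for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical labels.

    rater_a, rater_b: equal-length sequences of labels, one per rated item
    (hypothetical input format for this sketch).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty item set")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one and the same label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)
```

Because the chance term grows as ratings concentrate in one category, kappa can be low even when percent agreement is high, which is why some studies report both.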

  19. A study of automotive workers anthropometric physical characteristics from Mexico Northwest.

    PubMed

    Lucero-Duarte, Karla; de la Vega-Bustillos, Enrique; López-Millán, Francisco

    2012-01-01

    Due to the lack of anthropometric information in northwest Mexico, we conducted an anthropometric study that represents the physical characteristics of the population and is reliable for the design or redesign of workstations. The study was divided into two phases. The first was an anthropometric study of 2900 automotive industry workers in northwest Mexico. It includes 40 body dimensions of 2345 males and 555 females, personalized for use in future research. The second phase compared the anthropometric characteristics of the populations reported in four Mexican studies and a Colombian study against those of the current study. The benefits of this project are a reliable database of anthropometric characteristics of the automotive industry population for designing or redesigning workstations that match their users, increased product quality, and reduced economic, medical and union complaints.

  20. Systematic review of methods for quantifying teamwork in the operating theatre

    PubMed Central

    Marshall, D.; Sykes, M.; McCulloch, P.; Shalhoub, J.; Maruthappu, M.

    2018-01-01

    Background Teamwork in the operating theatre is becoming increasingly recognized as a major factor in clinical outcomes. Many tools have been developed to measure teamwork. Most fall into two categories: self‐assessment by theatre staff and assessment by observers. A critical and comparative analysis of the validity and reliability of these tools is lacking. Methods MEDLINE and Embase databases were searched following PRISMA guidelines. Content validity was assessed using measurements of inter‐rater agreement, predictive validity and multisite reliability, and interobserver reliability using statistical measures of inter‐rater agreement and reliability. Quantitative meta‐analysis was deemed unsuitable. Results Forty‐eight articles were selected for final inclusion; self‐assessment tools were used in 18 and observational tools in 28, and there were two qualitative studies. Self‐assessment of teamwork by profession varied with the profession of the assessor. The most robust self‐assessment tool was the Safety Attitudes Questionnaire (SAQ), although this failed to demonstrate multisite reliability. The most robust observational tool was the Non‐Technical Skills (NOTECHS) system, which demonstrated both test–retest reliability (P > 0·09) and interobserver reliability (Rwg = 0·96). Conclusion Self‐assessment of teamwork by the theatre team was influenced by professional differences. Observational tools, when used by trained observers, circumvented this.

  1. Comparative Analysis of the Reliability of Steel Structure with Pinned and Rigid Nodes Subjected to Fire

    NASA Astrophysics Data System (ADS)

    Kubicka, Katarzyna; Radoń, Urszula; Szaniec, Waldemar; Pawlak, Urszula

    2017-10-01

    The paper concerns the reliability analysis of steel structures subjected to high temperatures of fire gases. Two types of spatial structures were analysed, namely with pinned and rigid nodes. The fire analysis was carried out according to the prescriptions of the Eurocode. The static-strength analysis was conducted using the finite element method (FEM). The MES3D program, developed by Szaniec (Kielce University of Technology, Poland), was used for this purpose. The results obtained from MES3D made it possible to carry out the reliability analysis using the Numpress Explore program that was developed at the Institute of Fundamental Technological Research of the Polish Academy of Sciences [9]. The measure of structural reliability is the Hasofer-Lind reliability index (β). The reliability analysis was carried out according to approximation (FORM, SORM) and simulation (Importance Sampling, Monte Carlo) methods. As the fire progresses, the value of the reliability index decreases. The analysis conducted for the study made it possible to evaluate the impact of node types on those changes. In real structures, it is often difficult to correctly define the types of nodes, so some simplifications are made. The presented analysis contributes to the recognition of the consequences of such assumptions for the safety of structures subjected to fire.

  2. 34 CFR 668.144 - Application for test approval.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... the comparability of scores on the current test to scores on the previous test, and data from validity... explanation of the methodology and procedures for measuring the reliability of the test; (ii) Evidence that different forms of the test, including, if applicable, short forms, are comparable in reliability; (iii...

  3. Intra- and interobserver reliability of quantitative ultrasound measurement of the plantar fascia.

    PubMed

    Rathleff, Michael Skovdal; Moelgaard, Carsten; Lykkegaard Olesen, Jens

    2011-01-01

    To determine intra- and interobserver reliability and measurement precision of sonographic assessment of plantar fascia thickness when using one, the mean of two, or the mean of three measurements. Two experienced observers scanned 20 healthy subjects twice with 60 minutes between test and retest. A GE LOGIQe ultrasound scanner was used in the study. The built-in software in the scanner was used to measure the thickness of the plantar fascia (PF). Reliability was calculated using intraclass correlation coefficient (ICC) and limits of agreement (LOA). Intraobserver reliability (ICC) using one measurement was 0.50 for one observer and 0.52 for the other, and using the mean of three measurements intraobserver reliability increased up to 0.77 and 0.67, respectively. Interobserver reliability (ICC) when using one measurement was 0.62 and increased to 0.82 when using the average of three measurements. LOA showed that when using the average of three measurements, LOA decreased to 0.6 mm, corresponding to 17.5% of the mean thickness of the PF. The results showed that reliability increases when using the mean of three measurements compared with one. Limits of agreement based on intratester reliability show that changes in thickness that are larger than 0.6 mm can be considered actual changes in thickness and not a result of measurement error. Copyright © 2011 Wiley Periodicals, Inc.
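
The gain in reliability from averaging repeated measurements can be anticipated with the Spearman-Brown prophecy formula. The abstract does not say that the authors used it; it is applied here only as an illustrative assumption, and the function name is hypothetical. The formula assumes the repeated measurements are parallel, meaning equally reliable with independent errors.

```python
def spearman_brown(single_reliability, k):
    """Predicted reliability of the mean of k parallel measurements.

    single_reliability: reliability (e.g. ICC) of one measurement.
    k: number of measurements averaged.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * single_reliability / (1 + (k - 1) * single_reliability)


# Single-measurement ICCs quoted in the abstract, averaged over k = 3.
predicted_intra = spearman_brown(0.50, 3)  # first observer
predicted_inter = spearman_brown(0.62, 3)  # between observers
```

Under these assumptions the predictions are 0.75 and about 0.83, close to the reported 0.77 and 0.82. The second observer's reported 0.67 falls below what the formula gives for 0.52, a reminder that the parallel-measurement assumption need not hold in real data.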

  4. Patient safety in surgical environments: cross-countries comparison of psychometric properties and results of the Norwegian version of the Hospital Survey on Patient Safety.

    PubMed

    Haugen, Arvid S; Søfteland, Eirik; Eide, Geir E; Nortvedt, Monica W; Aase, Karina; Harthug, Stig

    2010-09-22

    How hospital health care personnel perceive safety climate has been assessed in several countries by using the Hospital Survey on Patient Safety (HSOPS). Few studies have examined safety climate factors in surgical departments per se. This study examined the psychometric properties of a Norwegian translation of the HSOPS and also compared safety climate factors from a surgical setting to hospitals in the United States, the Netherlands and Norway. This survey included 575 surgical personnel in Haukeland University Hospital in Bergen, an 1100-bed tertiary hospital in western Norway: surgeons, operating theatre nurses, anaesthesiologists, nurse anaesthetists and ancillary personnel. Of these, 358 returned the HSOPS, resulting in a 62% response rate. We used factor analysis to examine the applicability of the HSOPS factor structure in operating theatre settings. We also performed psychometric analysis for internal consistency and construct validity. In addition, we compared the average percentage of positive responses for the patient safety climate factors with results of the US HSOPS 2010 comparative database report. The professions differed in their perception of patient safety climate, with anaesthesia personnel having the highest mean scores. Factor analysis using the original 12-factor model of the HSOPS resulted in low reliability scores (r = 0.6) for two factors: "adequate staffing" and "organizational learning and continuous improvement". For the remaining factors, reliability was ≥ 0.7. Reliability scores improved to r = 0.8 by combining the factors "organizational learning and continuous improvement" and "feedback and communication about error" into one six-item factor, supporting an 11-factor model. The inter-item correlations were found satisfactory. The psychometric properties of the questionnaire need further investigation before it can be regarded as reliable in surgical environments. 
The operating theatre personnel perceived their hospital's patient safety climate far more negatively than the health care personnel in hospitals in the United States and with perceptions more comparable to those of health care personnel in hospitals in the Netherlands. In fact, the surgical personnel in our hospital may perceive that patient safety climate is less focused in our hospital, at least compared with the results from hospitals in the United States.

  5. Inter-rater reliability of select physical examination procedures in patients with neck pain.

    PubMed

    Hanney, William J; George, Steven Z; Kolber, Morey J; Young, Ian; Salamh, Paul A; Cleland, Joshua A

    2014-07-01

    This study evaluated the inter-rater reliability of select examination procedures in patients with neck pain (NP) conducted over a 24- to 48-h period. Twenty-two patients with mechanical NP participated in a standardized examination. One examiner performed standardized examination procedures and a second blinded examiner repeated the procedures 24-48 h later with no treatment administered between examinations. Inter-rater reliability was calculated with the Cohen Kappa and weighted Kappa for ordinal data while continuous level data were calculated using an intraclass correlation coefficient model 2,1 (ICC2,1). Coefficients for categorical variables ranged from poor to moderate agreement (-0.22 to 0.70 Kappa) and coefficients for continuous data ranged from slight to moderate (ICC2,1 0.28-0.74). The standard error of measurement for cervical range of motion ranged from 5.3° to 9.9° while the minimal detectable change ranged from 12.5° to 23.1°. This study is the first to report inter-rater reliability values for select components of the cervical examination in those patients with NP performed 24-48 h after the initial examination. There was considerably less reliability when compared to previous studies, thus clinicians should consider how the passage of time may influence variability in examination findings over a 24- to 48-h period.
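
The standard error of measurement (SEM) and minimal detectable change (MDC) reported above are conventionally derived from the sample standard deviation and a reliability coefficient. The abstract does not state the formulas or confidence level used, so this is only a sketch of the common definitions, SEM = SD * sqrt(1 - ICC) and MDC = z * sqrt(2) * SEM. The function names and the default z value are assumptions.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM from the between-subject SD and a reliability coefficient (e.g. ICC)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%, 1.645 for 90%).

    The sqrt(2) factor accounts for measurement error in both test and retest.
    """
    return z * math.sqrt(2.0) * sem
```

With z = 1.645, an SEM of 5.3° gives an MDC of roughly 12.3°, near the reported 12.5°. That is consistent with a 90% confidence level, although the abstract does not confirm which level was chosen.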

  6. Comparison of reliability and responsiveness of patient-reported clinical outcome measures in knee osteoarthritis rehabilitation.

    PubMed

    Williams, Valerie J; Piva, Sara R; Irrgang, James J; Crossley, Chad; Fitzgerald, G Kelley

    2012-08-01

    Secondary analysis, pretreatment-posttreatment observational study. To compare the reliability and responsiveness of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), the Knee Outcome Survey activities of daily living subscale (KOS-ADL), and the Lower Extremity Functional Scale (LEFS) in individuals with knee osteoarthritis (OA). The WOMAC is the current standard in patient-reported measures of function in patients with knee OA. The KOS-ADL and LEFS were designed for potential use in patients with knee OA. If the KOS-ADL and LEFS are to be considered viable alternatives to the WOMAC for measuring patient-reported function in individuals with knee OA, they should have measurement properties comparable to the WOMAC. It would also be important to determine whether either of these instruments may be superior to the WOMAC in terms of reliability or responsiveness in this population. Data from 168 subjects with knee OA, who participated in a rehabilitation program, were used in the analyses. Reliability and responsiveness of each outcome measure were estimated at follow-ups of 2, 6, and 12 months. Reliability was estimated by calculating the intraclass correlation coefficient (ICC2,1) for subjects who were unchanged in status from baseline at each follow-up time, based on a global rating of change score. To examine responsiveness, the standard error of the measurement, minimal detectable change, minimal clinically important difference, and the Guyatt responsiveness index were calculated for each outcome measure at each follow-up time. All 3 outcome measures demonstrated reasonable reliability and responsiveness to change. Reliability and responsiveness tended to decrease somewhat with increasing follow-up time. There were no substantial differences between outcome measures for reliability or any of the 3 measures of responsiveness at any follow-up time. 
    The results do not indicate that one outcome measure is more reliable or responsive than another when applied to subjects with knee OA. We believe that all 3 instruments are appropriate outcome measures to examine change in functional status of patients with knee OA.
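    The abstract above mentions the standard error of the measurement (SEM) and the minimal detectable change (MDC) but does not state the formulas used. A minimal sketch, assuming the commonly used definitions SEM = SD * sqrt(1 - ICC) and MDC95 = 1.96 * sqrt(2) * SEM, is shown below. The function names and the input values are hypothetical and are not taken from the study.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of the measurement from a score SD and an ICC.

    Assumes the common definition SEM = SD * sqrt(1 - ICC).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """Minimal detectable change at the 95% confidence level.

    Assumes MDC95 = 1.96 * sqrt(2) * SEM; the sqrt(2) accounts for
    measurement error in both the baseline and the follow-up score.
    """
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical inputs: SD of 10 points and ICC of 0.84.
    sem = sem_from_icc(sd=10.0, icc=0.84)
    print(f"SEM = {sem:.2f}, MDC95 = {mdc95(sem):.2f}")
```

    Under these assumptions, a lower ICC at a longer follow-up interval gives a larger SEM and therefore a larger change score needed to exceed measurement error.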

  7. A particle swarm model for estimating reliability and scheduling system maintenance

    NASA Astrophysics Data System (ADS)

    Puzis, Rami; Shirtz, Dov; Elovici, Yuval

    2016-05-01

    Modifying data and information system components may introduce new errors and deteriorate the reliability of the system. Reliability can be efficiently regained with reliability-centred maintenance, which requires reliability estimation for maintenance scheduling. A variant of the particle swarm model is used to estimate reliability of systems implemented according to the model-view-controller paradigm. Simulations based on data collected from an online system of a large financial institute are used to compare three component-level maintenance policies. Results show that appropriately scheduled component-level maintenance greatly reduces the cost of upholding an acceptable level of reliability by reducing the need for system-wide maintenance.

  8. Reliability and concurrent validity of the computer workstation checklist.

    PubMed

    Baker, Nancy A; Livengood, Heather; Jacobs, Karen

    2013-01-01

    Self-report checklists are used to assess computer workstation set up, typically by workers not trained in ergonomic assessment or checklist interpretation. Though many checklists exist, few have been evaluated for reliability and validity. This study examined reliability and validity of the Computer Workstation Checklist (CWC) to identify mismatches between workers' self-reported workstation problems. The CWC was completed at baseline and at 1 month to establish reliability. Validity was determined with CWC baseline data compared to an onsite workstation evaluation conducted by an expert in computer workstation assessment. Reliability ranged from fair to near perfect (prevalence-adjusted bias-adjusted kappa, 0.38-0.93); items with the strongest agreement were related to the input device, monitor, computer table, and document holder. The CWC had greater specificity (11 of 16 items) than sensitivity (3 of 16 items). The positive predictive value was greater than the negative predictive value for all questions. The CWC has strong reliability. Sensitivity and specificity suggested workers often indicated no problems with workstation setup when problems existed. The evidence suggests that while the CWC may not be valid when used alone, it may be a suitable adjunct to an ergonomic assessment completed by professionals.
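    The statistics named in the abstract above can be illustrated with a short sketch. It assumes the usual two-category definition of the prevalence-adjusted bias-adjusted kappa, PABAK = 2 * Po - 1 (Po being the observed proportion of agreement), and the standard 2x2 definitions of sensitivity and specificity. The function names and all counts are hypothetical and do not come from the CWC study.

```python
def pabak(both_yes: int, a_only: int, b_only: int, both_no: int) -> float:
    """PABAK for two binary ratings, assuming PABAK = 2 * Po - 1."""
    total = both_yes + a_only + b_only + both_no
    if total == 0:
        raise ValueError("at least one paired rating is required")
    observed_agreement = (both_yes + both_no) / total
    return 2.0 * observed_agreement - 1.0


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of expert-identified problems that the checklist also flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of expert-confirmed non-problems that the checklist also cleared."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical test-retest counts for one checklist item.
    print(f"PABAK = {pabak(40, 5, 5, 50):.2f}")
    # Hypothetical validity counts against an expert evaluation.
    print(f"sensitivity = {sensitivity(3, 7):.2f}")
    print(f"specificity = {specificity(18, 2):.2f}")
```

    A pattern of low sensitivity with high specificity, as in the hypothetical counts, corresponds to respondents reporting no problem where the expert found one.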

  9. Reliability of Three Benton Judgment of Line Orientation Short Forms in Idiopathic Parkinson’s Disease

    PubMed Central

    Gullett, Joseph M.; Price, Catherine C.; Nguyen, Peter; Okun, Michael S.; Bauer, Russell M.; Bowers, Dawn

    2013-01-01

    Individuals with Parkinson’s disease (PD) often exhibit deficits in visuospatial functioning throughout the course of their disease. These deficits should be carefully assessed as they may have implications for patient safety and disease severity. One of the most commonly administered tests of visuospatial ability, the Benton Judgment of Line Orientation (JLO), consists of 30 pairs of lines requiring the patient to match the orientation of two lines to an array of 11 lines on a separate page. Reliable short forms have been constructed out of the full JLO form, but the reliability of these forms in PD has yet to be examined. Recent functional MRI studies examining the JLO demonstrate right parietal and occipital activation, as well as bilateral frontal activation, and PD is known to adversely affect these pathways. We compared the reliability of the original full form to three unique short forms in a sample of 141 non-demented, idiopathic PD patients and 56 age- and education-matched controls. Results indicated that a two-thirds length short form can be used with high reliability and classification accuracy in patients with idiopathic PD. The other short forms performed in a similar, though slightly less reliable, manner. PMID:23957375

  10. EVA Human Health and Performance Benchmarking Study Overview and Development of a Microgravity Protocol

    NASA Technical Reports Server (NTRS)

    Norcross, Jason; Jarvis, Sarah; Bekdash, Omar; Cupples, Scott; Abercromby, Andrew

    2017-01-01

    The primary objective of this study is to develop a protocol to reliably characterize human health and performance metrics for individuals working inside various EVA suits under realistic spaceflight conditions. Expected results and methodologies developed during this study will provide the baseline benchmarking data and protocols with which future EVA suits and suit configurations (e.g., varied pressure, mass, center of gravity [CG]) and different test subject populations (e.g., deconditioned crewmembers) may be reliably assessed and compared. Results may also be used, in conjunction with subsequent testing, to inform fitness-for-duty standards, as well as design requirements and operations concepts for future EVA suits and other exploration systems.

  11. Reliability and determinants of anogenital distance and penis dimensions in male newborns from Chiapas, México

    PubMed Central

    Romano-Riquer, S. Patricia; Hernández-Ávila, Mauricio; Gladen, Beth C.; Cupul-Uicab, Lea A.; Longnecker, Matthew P.

    2013-01-01

    Summary: Development of the perineum as well as the external genitalia is determined by dihydrotestosterone, resulting in a greater anogenital distance (AGD) in males than females. In animal experiments with hormonally active agents, anogenital distance is used as a bioassay of fetal androgen action. Use of anogenital distance in human studies has been rare. Because anogenital distance has been an easy-to-measure, sensitive outcome in animal studies, we developed an anthropometric protocol for measurement of anogenital distance in human males. In this paper we describe the method for measurement of three anogenital distances, their reliability, and an assessment of predictors for each in the context of an epidemiological study. We compare the reliabilities and predictors to those for stretched penis length and penis width. A cross-sectional study of 781 newly-delivered male infants was conducted in 2002–2003 in Chiapas, México. Replicate measures were obtained on nearly all subjects. The reliability of the measures of anogenital distance (0.82–0.91) was higher than for stretched penis length (0.78) and width (0.75). Birthweight and gestational length were more strongly related to anogenital distance than to penis length. Anogenital distance was not related to penis length (r = 0.03). Our large study clearly shows that AGD can be measured well in newborn males, and that the measurements were more reliable than those of penis length. Whether AGD measures in humans relate to clinically important outcomes, however, remains to be determined, as does its utility as a measure of androgen action in epidemiological studies. PMID:17439530

  12. Systematic review found AMSTAR, but not R(evised)-AMSTAR, to have good measurement properties.

    PubMed

    Pieper, Dawid; Buechter, Roland Brian; Li, Lun; Prediger, Barbara; Eikermann, Michaela

    2015-05-01

    To summarize all available evidence on measurement properties in terms of reliability, validity, and feasibility of the Assessment of Multiple Systematic Reviews (AMSTAR) tool, including R(evised)-AMSTAR. MEDLINE, EMBASE, Psycinfo, and CINAHL were searched for studies containing information on measurement properties of the tools in October 2013. We extracted data on study characteristics and measurement properties. These data were analyzed following measurement criteria. We included 13 studies, four of them were labeled as validation studies. Nine articles dealt with AMSTAR, two articles dealt with R-AMSTAR, and one article dealt with both instruments. In terms of interrater reliability, most items showed a substantial agreement (>0.6). The median intraclass correlation coefficient (ICC) for the overall score of AMSTAR was 0.83 (range 0.60-0.98), indicating a high agreement. In terms of validity, ICCs were very high with all but one ICC lower than 0.8 when the AMSTAR score was compared with scores from other tools. Scoring AMSTAR takes between 10 and 20 minutes. AMSTAR seems to be reliable and valid. Further investigations for systematic reviews of other study designs than randomized controlled trials are needed. R-AMSTAR should be further investigated as evidence for its use is limited and its measurement properties have not been studied sufficiently. In general, test-retest reliability should be investigated in future studies. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. CT-guided sternoclavicular joint injections: description of the procedure, reliability of imaging diagnosis, and short-term patient responses.

    PubMed

    Peterson, Cynthia K; Saupe, Nadja; Buck, Florian; Pfirrmann, Christian W A; Zanetti, Marco; Hodler, Juerg

    2010-12-01

    The purpose of this study was to evaluate pain relief 20 to 30 minutes after diagnostic or therapeutic injections into the sternoclavicular joint and to compare patient outcomes based on the CT diagnosis. Informed consent was obtained from each patient. Ethics approval was not required. Fifty patients who had CT-guided injections of corticosteroid and local anesthetic into their sternoclavicular joints were included in the study. Preinjection and 20- to 30-minute postinjection visual analog scale data were recorded and compared with the imaging findings agreed by consensus. Kappa statistics were calculated for the reliability of imaging diagnosis. The percentage of patients improving after joint injection was calculated, and the risk ratio comparing the response of patients with osteoarthritis to those without osteoarthritis was completed. The correlation between the severity of each patient's osteoarthritis and the pain response was calculated using Spearman's correlation coefficient. Sixty-six percent of the patients reported clinically significant pain reduction at between 20 and 30 minutes after injection. The proportion of patients with osteoarthritis who had a clinically significant response was 67% compared with 64% for patients who did not have osteoarthritis. This difference was not statistically or clinically significant. There was no correlation between the severity of osteoarthritis and the amount of pain reduction (r = 0.03). The reliability of imaging diagnosis was substantial. Two thirds of patients having sternoclavicular joint injections of corticosteroids and local anesthetics report clinically significant improvement regardless of the abnormalities detected on their CT images.

  14. Anthropometric measurement standardization in the US-affiliated pacific: Report from the Children's Healthy Living Program.

    PubMed

    Li, Fenfang; Wilkens, Lynne R; Novotny, Rachel; Fialkowski, Marie K; Paulino, Yvette C; Nelson, Randall; Bersamin, Andrea; Martin, Ursula; Deenik, Jonathan; Boushey, Carol J

    2016-05-01

    Anthropometric standardization is essential to obtain reliable and comparable data from different geographical regions. The purpose of this study is to describe anthropometric standardization procedures and findings from the Children's Healthy Living (CHL) Program, a study on childhood obesity in 11 jurisdictions in the US-Affiliated Pacific Region, including Alaska and Hawai'i. Zerfas criteria were used to compare the measurement components (height, waist, and weight) between each trainee and a single expert anthropometrist. In addition, intra- and inter-rater technical error of measurement (TEM), coefficient of reliability, and average bias relative to the expert were computed. From September 2012 to December 2014, 79 trainees participated in at least 1 of 29 standardization sessions. A total of 49 trainees passed either standard or alternate Zerfas criteria and were qualified to assess all three measurements in the field. Standard Zerfas criteria were difficult to achieve: only 2 of 79 trainees passed at their first training session. Intra-rater TEM estimates for the 49 trainees compared well with the expert anthropometrist. Average biases were within acceptable limits of deviation from the expert. Coefficient of reliability was above 99% for all three anthropometric components. Standardization based on comparison with a single expert ensured the comparability of measurements from the 49 trainees who passed the criteria. The anthropometric standardization process and protocols followed by CHL resulted in 49 standardized field anthropometrists and have helped build capacity in the health workforce in the Pacific Region. Am. J. Hum. Biol. 28:364-371, 2016. © 2015 Wiley Periodicals, Inc.
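    The abstract above reports technical error of measurement (TEM), coefficient of reliability, and average bias without giving formulas. A minimal sketch follows, assuming the common definitions TEM = sqrt(sum(d^2) / (2n)) for n paired measurements and R = 1 - TEM^2 / SD^2. The function names and the measurement values are hypothetical and are not CHL data.

```python
import math


def technical_error_of_measurement(first, second):
    """TEM for paired measurements, assuming TEM = sqrt(sum(d^2) / (2n))."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))


def coefficient_of_reliability(tem, sd):
    """Share of variance not due to measurement error: 1 - TEM^2 / SD^2."""
    return 1.0 - (tem ** 2) / (sd ** 2)


def average_bias(trainee, expert):
    """Mean signed difference between trainee and expert measurements."""
    return sum(t - e for t, e in zip(trainee, expert)) / len(trainee)


if __name__ == "__main__":
    # Hypothetical heights (cm) for the same four children.
    trainee = [100.0, 102.0, 98.0, 101.0]
    expert = [101.0, 102.0, 97.0, 103.0]
    tem = technical_error_of_measurement(trainee, expert)
    print(f"inter-rater TEM = {tem:.3f} cm")
    print(f"average bias = {average_bias(trainee, expert):.2f} cm")
    # Hypothetical between-subject SD of 10 cm.
    print(f"R = {coefficient_of_reliability(tem, 10.0):.4f}")
```

    With these definitions, a coefficient of reliability above 99% means the TEM is less than one tenth of the between-subject standard deviation.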

  15. On the Effectiveness of Nature-Inspired Metaheuristic Algorithms for Performing Phase Equilibrium Thermodynamic Calculations

    PubMed Central

    Fateen, Seif-Eddeen K.; Bonilla-Petriciolet, Adrian

    2014-01-01

    The search for reliable and efficient global optimization algorithms for solving phase stability and phase equilibrium problems in applied thermodynamics is an ongoing area of research. In this study, we evaluated and compared the reliability and efficiency of eight selected nature-inspired metaheuristic algorithms for solving difficult phase stability and phase equilibrium problems. These algorithms are the cuckoo search (CS), intelligent firefly (IFA), bat (BA), artificial bee colony (ABC), MAKHA (a hybrid between monkey algorithm and krill herd algorithm), covariance matrix adaptation evolution strategy (CMAES), magnetic charged system search (MCSS), and bare bones particle swarm optimization (BBPSO). The results clearly showed that CS is the most reliable of all methods as it successfully solved all thermodynamic problems tested in this study. CS proved to be a promising nature-inspired optimization method to perform applied thermodynamic calculations for process design. PMID:24967430

  16. On the effectiveness of nature-inspired metaheuristic algorithms for performing phase equilibrium thermodynamic calculations.

    PubMed

    Fateen, Seif-Eddeen K; Bonilla-Petriciolet, Adrian

    2014-01-01

    The search for reliable and efficient global optimization algorithms for solving phase stability and phase equilibrium problems in applied thermodynamics is an ongoing area of research. In this study, we evaluated and compared the reliability and efficiency of eight selected nature-inspired metaheuristic algorithms for solving difficult phase stability and phase equilibrium problems. These algorithms are the cuckoo search (CS), intelligent firefly (IFA), bat (BA), artificial bee colony (ABC), MAKHA (a hybrid between monkey algorithm and krill herd algorithm), covariance matrix adaptation evolution strategy (CMAES), magnetic charged system search (MCSS), and bare bones particle swarm optimization (BBPSO). The results clearly showed that CS is the most reliable of all methods as it successfully solved all thermodynamic problems tested in this study. CS proved to be a promising nature-inspired optimization method to perform applied thermodynamic calculations for process design.

  17. Strength Analysis and Reliability Evaluation for Speed Reducers

    NASA Astrophysics Data System (ADS)

    Tsai, Yuo-Tern; Hsu, Yung-Yuan

    2017-09-01

    This paper studies the structural stresses of the differential drive (DD) and the harmonic drive (HD) for design improvement of reducers. The design principles of the two reducers are reported for function comparison. The critical components of the reducers are constructed for performing motion simulation and stress analysis. The DD design is based on differential displacement of the decelerated gear ring, whereas the HD design is based on a flexible spline. The finite element method (FEM) is used to analyze the structural stresses, including the dynamic properties of the reducers. The stresses and kinematic properties of the two reducers are compared to observe the properties of the designs. The analysis results are applied to identify the allowable loads of the reducers in use. The reliabilities of the reducers under different loads are further calculated according to the variation of stress. The results are useful for engineering analysis and reliability evaluation when designing a speed reducer with high ratios.

  18. Development of a reliable method to assess footwear comfort during running.

    PubMed

    Mündermann, Anne; Nigg, Benno M; Stefanyshyn, Darren J; Humble, R Neil

    2002-08-01

    The purposes of this study were: (a) to determine whether subjects are able to distinguish between differences in footwear with respect to footwear comfort; and (b) to determine how reliably footwear comfort can be assessed using a visual analogue scale (VAS) and a protocol including a control condition during running. Intraclass correlation coefficients (ICCs) between comfort ratings for repeated conditions were high (ICC = 0.799). Differences in comfort ratings between the insert conditions were significant. A paired t-test revealed a significant difference in overall comfort ratings for the control insert when tested after the soft insert compared to when tested after the hard insert (P = 0.008). The results of this study showed that VASs provide a reliable measure to assess footwear comfort during running under the conditions that: (a) a control condition is included; and (b) the average comfort rating of sessions 4-6 is used. Copyright 2002 Elsevier Science B.V.

  19. Psychometric testing of the modified Care Dependency Scale among hospitalized school-aged children in Germany.

    PubMed

    Tork, Hanan; Lohrmann, Christa; Dassen, Theo

    2008-03-01

    The objectives of this study were to examine the psychometric properties of the modified Care Dependency Scale in a pediatric setting and to explore the extent of dependency of school-aged children regarding their self-care. The data were collected from 130 hospitalized children, aged 6-12 years. The reliability was determined by Cronbach's alpha, which showed a high level of consistency. The subsequent inter-rater reliability revealed moderate-to-substantial agreement. The criterion-related validity was tested by comparing the sum scores of the Care Dependency Scale for Paediatrics and the Visual Analog Scale. Factor analysis was used to investigate the construct validity and resulted in a one-factor solution. In conclusion, this study provides evidence that the Care Dependency Scale for Paediatrics is a valid and reliable measure that offers a comprehensive assessment from a nursing perspective and enables nurses to help children acquire independence.

  20. Comparison of subjective olfaction ratings in patients with and without olfactory disorders.

    PubMed

    Haxel, B R; Bertz-Duffy, S; Fruth, K; Letzel, S; Mann, W J; Muttray, A

    2012-07-01

    Olfactory dysfunction is common. The reliability of self-assessment tools for smell testing is still controversial. This study aimed to provide new data about the accuracy of olfactory self-assessment compared with a standardised smell test. Prospective, controlled, cohort study of patients with olfactory disorders and healthy controls. Ninety-six patients with a smell deficit and 71 controls were asked to rate their sense of smell on a visual analogue scale. Their olfactory abilities were also evaluated with the Sniffin' Sticks tests. The whole cohort showed a significant correlation between visual analogue scale smell scores and Sniffin' Sticks total scores. This correlation was also significant in the patient group, but not in the control group. These results were independent of olfactory deficit aetiology and subject age. Self-assessment of olfaction is only a reliable indicator in smell-impaired patients, not in healthy controls. For an accurate assessment of olfaction, reliable, standardised tests are needed.

  1. Assessment of Safety Standards for Automotive Electronic Control Systems

    DOT National Transportation Integrated Search

    2016-06-01

    This report summarizes the results of a study that assessed and compared six industry and government safety standards relevant to the safety and reliability of automotive electronic control systems. These standards include ISO 26262 (Road Vehicles - ...

  2. QuickView video preview software of colon capsule endoscopy: reliability in presenting colorectal polyps as compared to normal mode reading.

    PubMed

    Farnbacher, Michael J; Krause, Horst H; Hagel, Alexander F; Raithel, Martin; Neurath, Markus F; Schneider, Thomas

    2014-03-01

    OBJECTIVE. Colon capsule endoscopy (CCE) proved to be highly sensitive in detection of colorectal polyps (CP). A major limitation is the time-consuming video reading. The aim of this prospective, double-center study was to assess the theoretical time-saving potential and its possible impact on the reliability of "QuickView" (QV), in the presentation of CP as compared to normal mode (NM). METHODS. During NM reading of 65 CCE videos (mean patient age 56 years), all frames showing CPs were collected and compared to the number of frames presented by QV at increasing QV settings (10, 20, ... 80%). Reliability of QV in presenting polyps <6 mm and ≥6 mm (significant polyp), and identifying patients for subsequent therapeutic colonoscopy, capsule egestion rate, cleansing level, and estimated time-saving potential were assessed. RESULTS. At a 30% QV setting, the QV video presented 89% of the significant polyps and 86% of any polyps with ≥1 frame (per-polyp analysis) identified in NM before. At a 10% QV setting, 98% of the 52 patients with significant polyps could be identified (per-patient analysis) by QV video analysis. Capsule excretion rate was 74% and colon cleanliness was adequate in 85%. QV's presentation rate correlates with the QV setting, the polyp size, and the number of frames per finding. CONCLUSIONS. Depending on its setting, the reliability of QV in presenting CP as compared to NM reading is notable. However, if no significant polyp is presented by QV, NM reading must be performed afterwards. The reduction of frames to be analyzed in QV might speed up identification of candidates for therapeutic colonoscopy.

  3. A Brazilian-Portuguese version of the Kinesthetic and Visual Motor Imagery Questionnaire.

    PubMed

    Demanboro, Alan; Sterr, Annette; Anjos, Sarah Monteiro Dos; Conforto, Adriana Bastos

    2018-01-01

    Motor imagery has emerged as a potential rehabilitation tool in stroke. The goals of this study were: 1) to develop a translated and culturally-adapted Brazilian-Portuguese version of the Kinesthetic and Visual Motor Imagery Questionnaire (KVIQ20-P); 2) to evaluate the psychometric characteristics of the scale in a group of patients with stroke and in an age-matched control group; 3) to compare the KVIQ20 performance between the two groups. Test-retest, inter-rater reliabilities, and internal consistencies were evaluated in 40 patients with stroke and 31 healthy participants. In the stroke group, ICC confidence intervals showed excellent test-retest and inter-rater reliabilities. Cronbach's alpha also indicated excellent internal consistency. Results for controls were comparable to those obtained in persons with stroke. The excellent psychometric properties of the KVIQ20-P should be considered during the design of studies of motor imagery interventions for stroke rehabilitation.

  4. Initial assessment of hearing loss using a mobile application for audiological evaluation.

    PubMed

    Derin, S; Cam, O H; Beydilli, H; Acar, E; Elicora, S S; Sahan, M

    2016-03-01

    This study aimed to compare an Apple iOS mobile operating system application for audiological evaluation with conventional audiometry, and to determine its accuracy and reliability in the initial evaluation of hearing loss. The study comprised 32 patients (16 females) diagnosed with hearing loss. The patients were first evaluated with conventional audiometry and the degree of hearing loss was recorded. Then they underwent a smartphone-based hearing test and the data were compared using Cohen's kappa analysis. Patients' mean age was 53.59 ± 18.01 years (range, 19-85 years). The mobile phone audiometry results for 39 of the 64 ears were fully compatible with the conventional audiometry results. There was a statistically significant concordant relationship between the two sets of audiometry results (p < 0.05). Ear Trumpet version 1.0.2 is a compact and simple mobile application on the Apple iPhone 5 that can measure hearing loss with reliable results.
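    The abstract above compares categorical hearing-loss grades from two methods with Cohen's kappa. A minimal sketch of the unweighted statistic, kappa = (Po - Pe) / (1 - Pe), is given below. The function name and the example grades are hypothetical and are not data from the study, and the abstract does not say whether a weighted or unweighted kappa was used.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: share of cases given the same category by both methods.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both methods used a single identical category; kappa is undefined,
        # so this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical hearing-loss grades for six ears.
    conventional = ["mild", "mild", "moderate", "severe", "moderate", "mild"]
    smartphone = ["mild", "moderate", "moderate", "severe", "moderate", "mild"]
    print(f"kappa = {cohens_kappa(conventional, smartphone):.3f}")
```

    Unlike raw percent agreement, kappa discounts the agreement expected by chance from how often each method assigns each grade.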

  5. Smile line assessment comparing quantitative measurement and visual estimation.

    PubMed

    Van der Geld, Pieter; Oosterveld, Paul; Schols, Jan; Kuijpers-Jagtman, Anne Marie

    2011-02-01

    Esthetic analysis of dynamic functions such as spontaneous smiling is feasible by using digital videography and computer measurement for lip line height and tooth display. Because quantitative measurements are time-consuming, digital videography and semiquantitative (visual) estimation according to a standard categorization are more practical for regular diagnostics. Our objective in this study was to compare 2 semiquantitative methods with quantitative measurements for reliability and agreement. The faces of 122 male participants were individually registered by using digital videography. Spontaneous and posed smiles were captured. On the records, maxillary lip line heights and tooth display were digitally measured on each tooth and also visually estimated according to 3-grade and 4-grade scales. Two raters were involved. An error analysis was performed. Reliability was established with kappa statistics. Interexaminer and intraexaminer reliability values were high, with median kappa values from 0.79 to 0.88. Agreement of the 3-grade scale estimation with quantitative measurement showed higher median kappa values (0.76) than the 4-grade scale estimation (0.66). Differentiating high and gummy smile lines (4-grade scale) resulted in greater inaccuracies. The estimation of a high, average, or low smile line for each tooth showed high reliability close to quantitative measurements. Smile line analysis can be performed reliably with a 3-grade scale (visual) semiquantitative estimation. For a more comprehensive diagnosis, additional measuring is proposed, especially in patients with disproportional gingival display. Copyright © 2011 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.

  6. Does a web-based feedback training program result in improved reliability in clinicians' ratings of the Global Assessment of Functioning (GAF) Scale?

    PubMed

    Støre-Valen, Jakob; Ryum, Truls; Pedersen, Geir A F; Pripp, Are H; Jose, Paul E; Karterud, Sigmund

    2015-09-01

    The Global Assessment of Functioning (GAF) Scale is used in routine clinical practice and research to estimate symptom and functional severity and longitudinal change. Concerns about poor interrater reliability have been raised, and the present study evaluated the effect of a Web-based GAF training program designed to improve interrater reliability in routine clinical practice. Clinicians rated up to 20 vignettes online, and received deviation scores as immediate feedback (i.e., own scores compared with expert raters) after each rating. Growth curves of absolute SD scores across the vignettes were modeled. A linear mixed effects model, using the clinician's deviation scores from expert raters as the dependent variable, indicated an improvement in reliability during training. Moderation by content of scale (symptoms; functioning), scale range (average; extreme), previous experience with GAF rating, profession, and postgraduate training were assessed. Training reduced deviation scores for inexperienced GAF raters, for individuals in clinical professions other than nursing and medicine, and for individuals with no postgraduate specialization. In addition, training was most beneficial for cases with average severity of symptoms compared with cases with extreme severity. The results support the use of Web-based training with feedback routines as a means to improve the reliability of GAF ratings performed by clinicians in mental health practice. These results especially pertain to clinicians in mental health practice who do not have a masters or doctoral degree. (c) 2015 APA, all rights reserved.

  7. Validity and Reliability of the Italian Version of the Functioning Assessment Short Test (FAST) in Bipolar Disorder

    PubMed Central

    Moro, Maria Francesca; Colom, Francesc; Floris, Francesca; Pintus, Elisa; Pintus, Mirra; Contini, Francesca; Carta, Mauro Giovanni

    2012-01-01

    Background: The Functioning Assessment Short Test (FAST) is a brief instrument designed to assess the main functioning problems experienced by psychiatric patients, specifically bipolar patients. It includes 24 items assessing impairment or disability in six domains of functioning: autonomy, occupational functioning, cognitive functioning, financial issues, interpersonal relationships and leisure time. The aim of this study is to measure the validity and reliability of the Italian version of this instrument. Methods: Twenty-four patients with DSM-IV TR bipolar disorder and 20 healthy controls were recruited and evaluated in three private clinics in Cagliari (Sardinia, Italy). The psychometric properties of FAST (feasibility, internal consistency, concurrent validity, discriminant validity [patients vs controls and euthymic patients vs manic and depressed patients], and test-retest reliability) were analyzed. Results: The internal consistency obtained was very high, with a Cronbach's alpha of 0.955. A highly significant negative correlation with GAF was obtained (r = -0.9; p < 0.001), pointing to a reasonable degree of concurrent validity. FAST showed good test-retest reliability between two independent evaluations one week apart (mean K = 0.73). The total FAST scores were lower in controls than in bipolar patients, and lower in euthymic patients than in depressed or manic patients. Conclusion: The Italian version of the FAST showed psychometric properties similar to those of the original version with regard to internal consistency and discriminant validity, and showed good test-retest reliability as measured by K statistics. PMID:22905035
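    Internal consistency in the abstract above is summarized with Cronbach's alpha. A minimal sketch of the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), follows. The function name and the item scores are hypothetical and are not FAST data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances for both the items and the total scores.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical responses: four respondents, three items scored 0-3.
    responses = [
        [0, 1, 0],
        [1, 1, 2],
        [2, 3, 2],
        [3, 3, 3],
    ]
    print(f"alpha = {cronbach_alpha(responses):.3f}")
```

    Alpha approaches 1 when items rise and fall together across respondents, which is the sense in which a value such as 0.955 is read as very high internal consistency.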

  8. Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project.

    PubMed

    Singh, Amika S; Vik, Froydis N; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Verloigne, Maïté; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; Martens, Marloes; Brug, Johannes

    2011-12-09

    Insight into children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items.

  9. Functional gait assessment and balance evaluation system test: reliability, validity, sensitivity, and specificity for identifying individuals with Parkinson disease who fall.

    PubMed

    Leddy, Abigail L; Crowner, Beth E; Earhart, Gammon M

    2011-01-01

    Gait impairments, balance impairments, and falls are prevalent in individuals with Parkinson disease (PD). Although the Berg Balance Scale (BBS) can be considered the reference standard for the determination of fall risk, it has a noted ceiling effect. Development of ceiling-free measures that can assess balance and are good at discriminating "fallers" from "nonfallers" is needed. The purpose of this study was to compare the Functional Gait Assessment (FGA) and the Balance Evaluation Systems Test (BESTest) with the BBS among individuals with PD and evaluate the tests' reliability, validity, and discriminatory sensitivity and specificity for fallers versus nonfallers. This was an observational study of community-dwelling individuals with idiopathic PD. The BBS, FGA, and BESTest were administered to 80 individuals with PD. Interrater reliability (n=15) was assessed by 3 raters. Test-retest reliability was based on 2 tests of participants (n=24), 2 weeks apart. Intraclass correlation coefficients (2,1) were used to calculate reliability, and Spearman correlation coefficients were used to assess validity. Cutoff points, sensitivity, and specificity were based on receiver operating characteristic plots. Test-retest reliability was .80 for the BBS, .91 for the FGA, and .88 for the BESTest. Interrater reliability was greater than .93 for all 3 tests. The FGA and BESTest were correlated with the BBS (r=.78 and r=.87, respectively). Cutoff scores to identify fallers were 47/56 for the BBS, 15/30 for the FGA, and 69% for the BESTest. The overall accuracy (area under the curve) for the BBS, FGA, and BESTest was .79, .80, and .85, respectively. Fall reports were retrospective. Both the FGA and the BESTest have reliability and validity for assessing balance in individuals with PD. The BESTest is most sensitive for identifying fallers.
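    The "intraclass correlation coefficients (2,1)" mentioned in the record above refer to the two-way random-effects, absolute-agreement, single-measurement form. A minimal sketch of that calculation follows, assuming a complete subjects-by-raters table with no missing scores; the function name `icc_2_1` and the example data are illustrative and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` is a list of rows, one per subject, each holding one score
    per rater (or per test occasion). Assumes no missing values.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters / occasions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical data: the second rater scores every subject one point higher.
scores = [[1, 2], [2, 3], [3, 4]]
print(icc_2_1(scores))
```

    Because this form measures absolute agreement, a constant offset between raters lowers the coefficient even when the rank order of subjects is identical.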

  10. Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project

    PubMed Central

    2011-01-01

    Background Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. Objective To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. Methods We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Results Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Conclusions Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items. PMID:22152048

  11. Space Shuttle Program Primary Avionics Software System (PASS) Success Legacy - Quality and Reliability Date

    NASA Technical Reports Server (NTRS)

    Orr, James K.; Peltier, Daryl

    2010-01-01

    This slide presentation reviews the avionics software system on board the space shuttle, with particular emphasis on quality and reliability. The Primary Avionics Software System (PASS), which executes in redundant computers, provides automatic and fly-by-wire control of critical shuttle systems. Charts show the number of space shuttle flights vs. time, PASS's development history, and other data that point to the reliability of the system's development. The reliability of the system is also compared to predicted reliability.

  12. Reliability and validity of soft copy images based on flat-panel detector in pneumoconiosis classification: comparison with the analog radiographs.

    PubMed

    Lee, Won-Jeong; Choi, Byung-Soon

    2013-06-01

    The aim of this study was to evaluate the reliability and validity of soft copy images based on flat-panel detector of digital radiography (DR-FPD soft copy images) compared to analog radiographs (ARs) in pneumoconiosis classification and diagnosis. DR-FPD soft copy images and ARs from 349 subjects were independently read by four experienced readers according to the International Labor Organization 2000 guidelines. DR-FPD soft copy images were used to obtain consensus reading (CR) by all readers as the gold standard. Reliability and validity were evaluated by κ and receiver operating characteristic analysis, respectively. In small opacity, overall interreader agreement of DR-FPD soft copy images was significantly higher than that of ARs, but it was not significantly different in large opacity and costophrenic angle obliteration. In small opacity, agreement of DR-FPD soft copy images with CR was significantly higher than that of ARs with CR. It was also higher than that of ARs with CR in pleural plaque and thickening. Receiver operating characteristic areas were not significantly different between DR-FPD soft copy images and ARs. DR-FPD soft copy images showed accurate and reliable results in pneumoconiosis classification and diagnosis compared to ARs. Copyright © 2013 AUR. Published by Elsevier Inc. All rights reserved.

  13. Content validity and test-retest reliability of a low back pain questionnaire in Zimbabwean adolescents.

    PubMed

    Chiwaridzo, Matthew; Chikasha, Tafadzwa Nicole; Naidoo, Nirmala; Dambi, Jermaine Matewu; Tadyanemhandu, Cathrine; Munambah, Nyaradzai; Chizanga, Precious Trish

    2017-01-01

    In Zimbabwe, a recent increase in the volume of research on recurrent non-specific low back pain (NSLBP) has revealed that adolescents are commonly affected. This is alarming to health professionals and parents and calls for serious primary preventative strategies to be developed and implemented forthwith. Early identification initiatives should be prioritised in order to curtail the condition and its progression. In an attempt to be proactive in minimising the prevalence of recurrent NSLBP, this study was conducted to evaluate the content validity and test-retest reliability of a survey questionnaire with the aim of proffering a valid and reliable questionnaire which can be used in non-clinical settings to identify adolescents with recurrent NSLBP in Harare, Zimbabwe and determine the possible factors associated with the condition. The study was conducted in two parts. The first part assessed content validity of the questionnaire using four experts derived from academia and clinical practice. The second part evaluated the reliability of the questionnaire among 125 high school-children aged between 13 and 19 years in a test-retest study. Twenty-six (26) out of thirty questions in the questionnaire had an Item Content Validity index of 1.00, demonstrating complete agreement among content experts. Overall, the Scale Content Validity Index for the questionnaire was 0.97. Item completion for the reliability study was satisfactory. The questionnaire items had kappa values ranging from 0.17 (slight agreement) to 1 (perfect agreement). High levels of reliability were found for the questions on school bag use (k = 0.94), sports participation (k = 0.97), and lifetime prevalence (k = 0.89). Excellent content validity and slight to perfect test-retest reliability was found for the Low Back Pain (LBP) questionnaire. These results are comparable to findings of other studies evaluating the psychometric properties of LBP questionnaires. Cognisant of the limitations of the study, the results of this study suggest that the LBP questionnaire could be used in local studies investigating LBP among adolescents although questions enquiring on functional limitations and sciatica may need further consideration.
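    The kappa values reported in the record above express agreement between test and retest answers after correcting for agreement expected by chance. A minimal sketch of unweighted Cohen's kappa for one categorical questionnaire item follows; the function name `cohen_kappa` and the answers are hypothetical, and the study's own software and settings are not stated in the abstract.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of category labels.

    Undefined (division by zero) when chance agreement is exactly 1,
    i.e. when both administrations use a single identical category.
    """
    n = len(first)
    # Observed proportion of identical answers
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal category frequencies
    counts_1, counts_2 = Counter(first), Counter(second)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical test and retest answers to one yes/no item
test_answers = ["yes", "yes", "no", "no"]
retest_answers = ["yes", "no", "no", "no"]
print(cohen_kappa(test_answers, retest_answers))  # 0.5
```

    Here three of four answers agree (75%), but half of that agreement is expected by chance, which is why kappa is lower than the raw percentage.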

  14. Sexual behaviors among club drug users: prevalence and reliability

    PubMed Central

    Shacham, Enbal; Cottler, Linda B.

    2013-01-01

    HIV prevention efforts require a focus on reducing high risk sexual behavior. Because these are self-reported, assessments that reduce memory bias and improve elicitation of data are needed. As part of a multi-site psychometric study of club drug use, abuse, and dependence, data were collected with a test-retest design that measured the reliability of the Washington University Risk Behavior Assessment for Club Drugs (WU-RBA-CD). Reliability was assessed separately by sex via kappa coefficients and intraclass correlation coefficients (ICC); z tests compared coefficients by sex. A total of 603 participants were interviewed by independent assessors with 5 days in between interviews. Reliability for all 51 items of the sexual activity section of the WU-RBA-CD ranged from .23 to 1.00; 71% (n = 36) of items resulted in moderate to high reliability (.55–1.00). Number of lifetime sex partners was consistently reported for same-sex partners for both men and women and opposite-sex partners. Items with high reliability included reporting ever being under the influence of ecstasy (.87) or GHB (.87) while having sex. Items with lower reliability included those that queried the determinants of condom use (.45–.82) and about behaviors and attitudes experienced while using drugs (.23–.87). Very few sex differences were revealed in the reliability of reported sexual activities. Overall, the WU-RBA-CD performed with fairly high reliability rates. Assessing situations of when, how, and why individuals use condoms may offer the clearest evaluation of determinants of sexual behaviors, yet those items are not as reliable. PMID:19757011

  15. Standardization of Test for Assessment and Comparing of Students' Measurement

    ERIC Educational Resources Information Center

    Osadebe, Patrick U.

    2014-01-01

    The study standardized an Economics Achievement Test for senior secondary school students in Nigeria. Three research questions guided the study. The standardized test in Economics was first constructed by an expert as a valid and reliable instrument. The test was then used for standardization in this study. That is, ensuring that the Economics…

  16. A study of low-cost reliable actuators for light aircraft. Part A: Chapters 1-8

    NASA Technical Reports Server (NTRS)

    Eijsink, H.; Rice, M.

    1978-01-01

    An analysis involving electro-mechanical, electro-pneumatic, and electro-hydraulic actuators was performed to study which are compatible for use in the primary and secondary flight controls of a single engine light aircraft. Actuator characteristics under investigation include cost, reliability, weight, force, volumetric requirements, power requirements, response characteristics and heat accumulation characteristics. The basic types of actuators were compared for performance characteristics in positioning a control surface model and then were mathematically evaluated in an aircraft to get the closed loop dynamic response characteristics. Conclusions were made as to the suitability of each actuator type for use in an aircraft.

  17. Validity and reliability of the Paprosky acetabular defect classification.

    PubMed

    Yu, Raymond; Hofstaetter, Jochen G; Sullivan, Thomas; Costi, Kerry; Howie, Donald W; Solomon, Lucian B

    2013-07-01

    The Paprosky acetabular defect classification is widely used but has not been appropriately validated. Reliability of the Paprosky system has not been evaluated in combination with standardized techniques of measurement and scoring. This study evaluated the reliability, teachability, and validity of the Paprosky acetabular defect classification. Preoperative radiographs from a random sample of 83 patients undergoing 85 acetabular revisions were classified by four observers, and their classifications were compared with quantitative intraoperative measurements. Teachability of the classification scheme was tested by dividing the four observers into two groups. The observers in Group 1 underwent three teaching sessions; those in Group 2 underwent one session and the influence of teaching on the accuracy of their classifications was ascertained. Radiographic evaluation showed statistically significant relationships with intraoperative measurements of anterior, medial, and superior acetabular defect sizes. Interobserver reliability improved substantially after teaching and did not improve without it. The weighted kappa coefficient went from 0.56 at Occasion 1 to 0.79 after three teaching sessions in Group 1 observers, and from 0.49 to 0.65 after one teaching session in Group 2 observers. The Paprosky system is valid and shows good reliability when combined with standardized definitions of radiographic landmarks and a structured analysis. Level II, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.

  18. Grant Peer Review: Improving Inter-Rater Reliability with Training

    DOE PAGES

    Sattler, David N.; McKnight, Patrick E.; Naney, Linda; ...

    2015-06-15

    In this study, we developed and evaluated a brief training program for grant reviewers that aimed to increase inter-rater reliability, rating scale knowledge, and effort to read the grant review criteria. Enhancing reviewer training may improve the reliability and accuracy of research grant proposal scoring and funding recommendations. Seventy-five Public Health professors from U.S. research universities watched the training video we produced and assigned scores to the National Institutes of Health scoring criteria proposal summary descriptions. For both novice and experienced reviewers, the training video increased scoring accuracy (the percentage of scores that reflect the true rating scale values), inter-rater reliability, and the amount of time reading the review criteria compared to the no video condition. The increase in reliability for experienced reviewers is notable because it is commonly assumed that reviewers, especially those with experience, have good understanding of the grant review rating scale. Our findings suggest that both experienced and novice reviewers who had not received the type of training developed in this study may not have appropriate understanding of the definitions and meaning for each value of the rating scale and that experienced reviewers may overestimate their knowledge of the rating scale. Lastly, the results underscore the benefits of and need for specialized peer reviewer training.

  19. Grant Peer Review: Improving Inter-Rater Reliability with Training.

    PubMed

    Sattler, David N; McKnight, Patrick E; Naney, Linda; Mathis, Randy

    2015-01-01

    This study developed and evaluated a brief training program for grant reviewers that aimed to increase inter-rater reliability, rating scale knowledge, and effort to read the grant review criteria. Enhancing reviewer training may improve the reliability and accuracy of research grant proposal scoring and funding recommendations. Seventy-five Public Health professors from U.S. research universities watched the training video we produced and assigned scores to the National Institutes of Health scoring criteria proposal summary descriptions. For both novice and experienced reviewers, the training video increased scoring accuracy (the percentage of scores that reflect the true rating scale values), inter-rater reliability, and the amount of time reading the review criteria compared to the no video condition. The increase in reliability for experienced reviewers is notable because it is commonly assumed that reviewers--especially those with experience--have good understanding of the grant review rating scale. The findings suggest that both experienced and novice reviewers who had not received the type of training developed in this study may not have appropriate understanding of the definitions and meaning for each value of the rating scale and that experienced reviewers may overestimate their knowledge of the rating scale. The results underscore the benefits of and need for specialized peer reviewer training.

  20. Reliability and validity of a Turkish version of the Global Pelvic Floor Bother Questionnaire.

    PubMed

    Doğan, Hanife; Özengin, Nuriye; Bakar, Yeşim; Duran, Bülent

    2016-10-01

    The aim of this study was to translate the Global Pelvic Floor Bother Questionnaire (GPFBQ) into Turkish and to assess its validity and reliability. The Turkish adaptation of the GPFBQ was created by following the stages of the intercultural adaptation process. A test-retest interval of 1 week was used to assess the reliability, which was examined by the intraclass correlation coefficient. The validity of the GPFBQ was assessed and compared with the Pelvic Floor Distress Inventory-20 (PFDI-20) and the Pelvic Floor Impact Questionnaire-7 (PFIQ-7) using Spearman's rank correlation coefficients. For construct validity, confirmatory factor analysis was performed. A total of 131 women, whose mean age was 46.83 years, were included in the study. The test-retest reliability of the GPFBQ was excellent (0.998, p < 0.0001). The GPFBQ correlated significantly with the PFDI-20 (r = 0.860, p = 0.00) and PFIQ-7 (r = 0.802, p = 0.00). Confirmatory factor analysis was performed to determine construct validity, and it was found that it had four dimensions. The Turkish version of the GPFBQ is a valid and reliable tool for assessing the symptoms of bother and severity in Turkish-speaking women with pelvic floor dysfunction.

  1. Validation in the cross-cultural adaptation of the Korean version of the Oswestry Disability Index.

    PubMed

    Jeon, Chang-Hoon; Kim, Dong-Jae; Kim, Se-Kang; Kim, Dong-Jun; Lee, Hwan-Mo; Park, Heui-Jeon

    2006-12-01

    Disability questionnaires are used for clinical assessment, outcome measurement, and research methodology. Any disability measurement must be adapted culturally for comparability of data, when the patients, who are measured, use different languages. This study aimed to conduct cross-cultural adaptation in translating the original (English) version of the Oswestry Disability Index (ODI) into Korean, and then to assess the reliability of the Korean versions of the Oswestry Disability Index (KODI). We used methodology to obtain semantic, idiomatic, experimental, and conceptual equivalences for the process of cross-cultural adaptation. The KODI were tested in 116 patients with chronic low back pain. The internal consistency and reliability for the KODI reached 0.9168 (Cronbach's alpha). The test-retest reliability was assessed with 32 patients (who were not included in the assessment of Cronbach's alpha) over a time interval of 4 days. Test-retest correlation reliability was 0.9332. The entire process and the results of this study were reported to the developer (Dr. Fairbank JC), who appraised the KODI. There is little evidence of differential item functioning in KODI. The results suggest that the KODI is internally consistent and reliable. Therefore, the KODI can be recommended as a low back pain assessment tool in Korea.
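    The internal consistency figure in the record above is a Cronbach's alpha. As a rough illustration of how that statistic is formed from item-level responses, the sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix are hypothetical and unrelated to the KODI data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances; assumes every respondent answered every item
    and that total scores are not all identical.
    """
    k = len(responses[0])  # number of items
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of four respondents to a three-item scale
answers = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
]
print(cronbach_alpha(answers))
```

    Alpha approaches 1 when items rise and fall together across respondents, so the sum of item variances is small relative to the variance of the total score.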

  2. Validation in the Cross-Cultural Adaptation of the Korean Version of the Oswestry Disability Index

    PubMed Central

    Kim, Dong-Jae; Kim, Se-Kang; Kim, Dong-Jun; Lee, Hwan-Mo; Park, Heui-Jeon

    2006-01-01

    Disability questionnaires are used for clinical assessment, outcome measurement, and research methodology. Any disability measurement must be adapted culturally for comparability of data, when the patients, who are measured, use different languages. This study aimed to conduct cross-cultural adaptation in translating the original (English) version of the Oswestry Disability Index (ODI) into Korean, and then to assess the reliability of the Korean versions of the Oswestry Disability Index (KODI). We used methodology to obtain semantic, idiomatic, experimental, and conceptual equivalences for the process of cross-cultural adaptation. The KODI were tested in 116 patients with chronic low back pain. The internal consistency and reliability for the KODI reached 0.9168 (Cronbach's alpha). The test-retest reliability was assessed with 32 patients (who were not included in the assessment of Cronbach's alpha) over a time interval of 4 days. Test-retest correlation reliability was 0.9332. The entire process and the results of this study were reported to the developer (Dr. Fairbank JC), who appraised the KODI. There is little evidence of differential item functioning in KODI. The results suggest that the KODI is internally consistent and reliable. Therefore, the KODI can be recommended as a low back pain assessment tool in Korea. PMID:17179693

  3. Test-retest reliability of computer-based video analysis of general movements in healthy term-born infants.

    PubMed

    Valle, Susanne Collier; Støen, Ragnhild; Sæther, Rannei; Jensenius, Alexander Refsum; Adde, Lars

    2015-10-01

    A computer-based video analysis has recently been presented for quantitative assessment of general movements (GMs). This method's test-retest reliability, however, has not yet been evaluated. The aim of the current study was to evaluate the test-retest reliability of computer-based video analysis of GMs, and to explore the association between computer-based video analysis and the temporal organization of fidgety movements (FMs). Test-retest reliability study. 75 healthy, term-born infants were recorded twice the same day during the FMs period using a standardized video set-up. The computer-based movement variables "quantity of motion mean" (Qmean), "quantity of motion standard deviation" (QSD) and "centroid of motion standard deviation" (CSD) were analyzed, reflecting the amount of motion and the variability of the spatial center of motion of the infant, respectively. In addition, the association between the variable CSD and the temporal organization of FMs was explored. Intraclass correlation coefficients (ICC 1.1 and ICC 3.1) were calculated to assess test-retest reliability. The ICC values for the variables CSD, Qmean and QSD were 0.80, 0.80 and 0.86 for ICC (1.1), respectively; and 0.80, 0.86 and 0.90 for ICC (3.1), respectively. There were significantly lower CSD values in the recordings with continual FMs compared to the recordings with intermittent FMs (p<0.05). This study showed high test-retest reliability of computer-based video analysis of GMs, and a significant association between our computer-based video analysis and the temporal organization of FMs. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  4. Reliability of a retrospective decade-based life-course alcohol consumption questionnaire administered in later life.

    PubMed

    Bell, Steven; Britton, Annie

    2015-10-01

    Retrospective measures of alcohol intake are becoming increasingly popular; however, the reliability of such measures remains uncertain. This study assessed the reliability of a retrospective decade-based life-course alcohol consumption questionnaire, based on the standardized Alcohol Use Disorder Identification Test-Consumption (AUDIT-C) administered in older age in a well-characterized cohort study. A retrospective alcohol life-grid was administered to 5980 participants (72% male, mean age 70 years) in the Whitehall II study covering frequency of drinking, number of drinks in a typical drinking day and frequency of consuming six or more drinks in a single drinking occasion in the teens (16-19 years) through to the 80s. A subsample of 385 individuals completed a repeat survey to determine test-retest reliability. Retrospective measures were also compared with prospectively ascertained information and used to predict objectively measured systolic blood pressure to test their predictive validity. Across all decades of life, test-retest reliability was generally good (κ range = 0.62-0.78 for frequency, 0.55-0.62 for usual number of drinks and 0.57-0.65 for frequency of consuming six or more drinks in a single occasion). The concordance between prospective and retrospective measures was consistently moderate to high. The life-grid method performed better than a single question in identifying life-time abstainers. Retrospective measures were also related to systolic blood pressure in the manner anticipated. A retrospective decade-based AUDIT-C grid administered in older age provides a relatively reliable measure of alcohol consumption across the life-course. © 2015 The Authors. Addiction published by John Wiley & Sons Ltd on behalf of Society for the Study of Addiction.

  5. Validity and reliability of the session-RPE method for quantifying training in Australian football: a comparison of the CR10 and CR100 scales.

    PubMed

    Scott, Tannath J; Black, Cameron R; Quinn, John; Coutts, Aaron J

    2013-01-01

    The purpose of this study was to examine and compare the criterion validity and test-retest reliability of the CR10 and CR100 rating of perceived exertion (RPE) scales for team sport athletes that undertake high-intensity, intermittent exercise. Twenty-one male Australian football (AF) players (age: 19.0 ± 1.8 years, body mass: 83.92 ± 7.88 kg) participated in the first part (part A) of this study, which examined the construct validity of the session-RPE (sRPE) method for quantifying training load in AF. Ten male athletes (age: 16.1 ± 0.5 years) participated in the second part of the study (part B), which compared the test-retest reliability of the CR10 and CR100 RPE scales. In part A, the validity of the sRPE method was assessed by examining the relationships between sRPE, and objective measures of internal (i.e., heart rate) and external training load (i.e., distance traveled), collected from AF training sessions. Part B of the study assessed the reliability of sRPE through examining the test-retest reliability of sRPE during 3 different intensities of controlled intermittent running (10, 11.5, and 13 km·h⁻¹). Results from part A demonstrated strong correlations for CR10- and CR100-derived sRPE with measures of internal training load (Banister's TRIMP and Edwards' TRIMP) (CR10: r = 0.83 and 0.83, and CR100: r = 0.80 and 0.81, p < 0.05). Correlations between sRPE and external training load (distance, higher speed running and player load) for both the CR10 (r = 0.81, 0.71, and 0.83) and CR100 (r = 0.78, 0.69, and 0.80) were significant (p < 0.05). Results from part B demonstrated poor reliability for both the CR10 (31.9% CV) and CR100 (38.6% CV) RPE scales after short bouts of intermittent running. Collectively, these results suggest both CR10- and CR100-derived sRPE methods have good construct validity for assessing training load in AF. The poor levels of reliability revealed under field testing indicate that the sRPE method may not be sensitive enough to detect small changes in exercise intensity during brief intermittent running bouts. Despite this limitation, the sRPE remains a valid method to quantify training loads in high-intensity, intermittent team sport.

  6. Reliability of Hypernasality Rating: Comparison of 3 Different Methods for Perceptual Assessment.

    PubMed

    Yamashita, Renata Paciello; Borg, Elisabet; Granqvist, Svante; Lohmander, Anette

    2018-01-01

    To compare reliability in auditory-perceptual assessment of hypernasality for 3 different methods and to explore the influence of language background. Comparative methodological study. Participants and Materials: Audio recordings of 5-year-old Swedish-speaking children with repaired cleft lip and palate consisting of 73 stimuli of 9 nonnasal single-word strings in 3 different randomized orders. Four experienced speech-language pathologists (2 native speakers of Brazilian-Portuguese and 2 native speakers of Swedish) participated as listeners. After individual training, each listener performed the hypernasality rating task. Each order of stimuli was analyzed individually using the 2-step, VISOR and Borg centiMax scale methods. Comparison of intra- and inter-rater reliability, and consistency for each method within language of the listener and between listener languages (Swedish and Brazilian-Portuguese). Good to excellent intra-rater reliability was found within each listener for all methods, 2-step: κ = 0.59-0.93; VISOR: intraclass correlation coefficient (ICC) = 0.80-0.99; Borg centiMax (cM) scale: ICC = 0.80-1.00. The highest inter-rater reliability was demonstrated for VISOR (ICC = 0.60-0.90) and Borg cM-scale (ICC = 0.40-0.80). High consistency within each method was found with the highest for the Borg cM scale (ICC = 0.89-0.91). There was a significant difference in the ratings between the Swedish and the Brazilian listeners for all methods. The category-ratio scale Borg cM was considered most reliable in the assessment of hypernasality. Language background of Brazilian-Portuguese listeners influenced the perceptual ratings of hypernasality in Swedish speech samples, despite their experience in perceptual assessment of cleft palate speech disorders.

  7. The reliability and validity of the Saliba Postural Classification System

    PubMed Central

    Collins, Cristiana Kahl; Johnson, Vicky Saliba; Godwin, Ellen M.; Pappas, Evangelos

    2016-01-01

    Objectives To determine the reliability and validity of the Saliba Postural Classification System (SPCS). Methods Two physical therapists classified pictures of 100 volunteer participants standing in their habitual posture for inter and intra-tester reliability. For validity, 54 participants stood on a force plate in a habitual and a corrected posture, while a vertical force was applied through the shoulders until the clinician felt a postural give. Data were extracted at the time the give was felt and at a time in the corrected posture that matched the peak vertical ground reaction force (VGRF) in the habitual posture. Results Inter-tester reliability demonstrated 75% agreement with a Kappa = 0.64 (95% CI = 0.524–0.756, SE = 0.059). Intra-tester reliability demonstrated 87% agreement with a Kappa = 0.8, (95% CI = 0.702–0.898, SE = 0.05) and 80% agreement with a Kappa = 0.706, (95% CI = 0.594–0.818, SE = 0.057). The examiner applied a significantly higher (p < 0.001) peak vertical force in the corrected posture prior to a postural give when compared to the habitual posture. Within the corrected posture, the %VGRF was higher when the test was ongoing vs. when a postural give was felt (p < 0.001). The %VGRF was not different between the two postures when comparing the peaks (p = 0.214). Discussion The SPCS has substantial agreement for inter- and intra-tester reliability and is largely a valid postural classification system as determined by the larger vertical forces in the corrected postures. Further studies on the correlation between the SPCS and diagnostic classifications are indicated. PMID:27559288
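    The kappa values, standard errors, and 95% confidence intervals in the record above are consistent with the usual normal-approximation interval, kappa ± 1.96 × SE. That the authors computed them this way is an assumption, since the abstract does not say. The short sketch below, with a hypothetical helper `kappa_ci`, shows the arithmetic.

```python
def kappa_ci(kappa, se, z=1.96):
    """Normal-approximation confidence interval for a kappa estimate.

    Returns (lower, upper) rounded to three decimals. The default z of 1.96
    corresponds to a 95% interval.
    """
    half_width = z * se
    return (round(kappa - half_width, 3), round(kappa + half_width, 3))


# Kappa and SE pairs as reported in the abstract
print(kappa_ci(0.64, 0.059))   # (0.524, 0.756)
print(kappa_ci(0.8, 0.05))     # (0.702, 0.898)
print(kappa_ci(0.706, 0.057))  # (0.594, 0.818)
```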

  8. The reliability and validity of the Saliba Postural Classification System.

    PubMed

    Collins, Cristiana Kahl; Johnson, Vicky Saliba; Godwin, Ellen M; Pappas, Evangelos

    2016-07-01

    To determine the reliability and validity of the Saliba Postural Classification System (SPCS). Two physical therapists classified pictures of 100 volunteer participants standing in their habitual posture for inter and intra-tester reliability. For validity, 54 participants stood on a force plate in a habitual and a corrected posture, while a vertical force was applied through the shoulders until the clinician felt a postural give. Data were extracted at the time the give was felt and at a time in the corrected posture that matched the peak vertical ground reaction force (VGRF) in the habitual posture. Inter-tester reliability demonstrated 75% agreement with a Kappa = 0.64 (95% CI = 0.524-0.756, SE = 0.059). Intra-tester reliability demonstrated 87% agreement with a Kappa = 0.8 (95% CI = 0.702-0.898, SE = 0.05) and 80% agreement with a Kappa = 0.706 (95% CI = 0.594-0.818, SE = 0.057). The examiner applied a significantly higher (p < 0.001) peak vertical force in the corrected posture prior to a postural give when compared to the habitual posture. Within the corrected posture, the %VGRF was higher when the test was ongoing vs. when a postural give was felt (p < 0.001). The %VGRF was not different between the two postures when comparing the peaks (p = 0.214). The SPCS has substantial agreement for inter- and intra-tester reliability and is largely a valid postural classification system as determined by the larger vertical forces in the corrected postures. Further studies on the correlation between the SPCS and diagnostic classifications are indicated.

  9. Effects of momentary self-monitoring on empowerment in a randomized controlled trial in patients with depression.

    PubMed

    Simons, C J P; Hartmann, J A; Kramer, I; Menne-Lothmann, C; Höhn, P; van Bemmel, A L; Myin-Germeys, I; Delespaul, P; van Os, J; Wichers, M

    2015-11-01

    Interventions based on the experience sampling method (ESM) are ideally suited to provide insight into personal, contextualized affective patterns in the flow of daily life. Recently, we showed that an ESM-intervention focusing on positive affect was associated with a decrease in symptoms in patients with depression. The aim of the present study was to examine whether ESM-intervention increased patient empowerment. Participants were depressed out-patients (n=102) receiving psychopharmacological treatment who had participated in a randomized controlled trial with three arms: (i) an experimental group receiving six weeks of ESM self-monitoring combined with weekly feedback sessions, (ii) a pseudo-experimental group participating in six weeks of ESM self-monitoring without feedback, and (iii) a control group (treatment as usual only). Patients were recruited in the Netherlands between January 2010 and February 2012. Self-report empowerment scores were obtained pre- and post-intervention. There was an effect of group×assessment period, indicating that the experimental (B=7.26, P=0.061, d=0.44, statistically imprecise) and pseudo-experimental group (B=11.19, P=0.003, d=0.76) increased more in reported empowerment compared to the control group. In the pseudo-experimental group, 29% of the participants showed a statistically reliable increase in empowerment score and 0% reliable decrease compared to 17% reliable increase and 21% reliable decrease in the control group. The experimental group showed 19% reliable increase and 4% reliable decrease. These findings tentatively suggest that self-monitoring to complement standard antidepressant treatment may increase patients' feelings of empowerment. Further research is necessary to investigate long-term empowering effects of self-monitoring in combination with person-tailored feedback. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  10. Development and validation of a quantitative snack and beverage food frequency questionnaire for adolescents.

    PubMed

    De Cock, N; Van Camp, J; Kolsteren, P; Lachat, C; Huybregts, L; Maes, L; Deforche, B; Verstraeten, R; Vangeel, J; Beullens, K; Eggermont, S; Van Lippevelde, W

    2017-04-01

    A short, reliable and valid tool to measure snack and beverage consumption in adolescents, taking into account the correct definitions, would benefit both epidemiological and intervention research. The present study aimed to develop a short quantitative beverage and snack food frequency questionnaire (FFQ) and to assess the reliability and validity of this FFQ against three 24-h recalls. Reliability was assessed by comparing estimates of the FFQ administered 14 days apart (FFQ1 and FFQ2) in a convenience sample of 179 adolescents [60.3% male; mean (SD) 14.7 (0.9) years]. Validity was assessed by comparing FFQ1 with three telephone-administered 24-h recalls in a convenience sample of 99 adolescents [52.5% male, mean (SD) 14.8 (0.9) years]. Reliability and validity were assessed using Bland-Altman plots, classification agreements and correlation coefficients for the amount and frequency of consumption of unhealthy snacks, healthy snacks, unhealthy beverages, healthy beverages, and for the healthy snack and beverage ratios. Small mean differences (FFQ1 versus FFQ2) were observed for reliability, ranking ability ranged from fair to substantial, and Spearman coefficients fell within normal ranges. For the validity, mean differences (FFQ1 versus recalls) were small for beverage intake but large for snack intake, except for the healthy snack ratio. Ranking ability ranged from slight to moderate, and Spearman coefficients fell within normal ranges. Reliability and validity of the FFQ for all outcomes were found to be acceptable at a group level for epidemiological purposes, whereas for intervention purposes only the healthy snack and beverage ratios were found to be acceptable at a group level. © 2016 The British Dietetic Association Ltd.
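
    The mean differences mentioned in this record come from a Bland-Altman analysis, which summarises paired measurements by their mean difference (bias) and 95% limits of agreement (bias ± 1.96 × SD of the differences). A minimal Python sketch of that calculation follows. The function name and the intake values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread

# Hypothetical intakes from two administrations of a questionnaire
ffq1 = [10, 12, 14]
ffq2 = [9, 12, 15]
bias, lower, upper = bland_altman(ffq1, ffq2)
print(bias, lower, upper)
```

    A bias near zero with narrow limits indicates good agreement at group level; wide limits indicate poor agreement for individuals even when the bias is small.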

  11. Reliability and Validity Assessment of a Linear Position Transducer

    PubMed Central

    Garnacho-Castaño, Manuel V.; López-Lastra, Silvia; Maté-Muñoz, José L.

    2015-01-01

    The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and the T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training. Key points: This study determined the validity and reliability of peak velocity, average velocity, peak power and average power measurements made using a linear position transducer. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and power. PMID:25729300

  12. Demonstrating the Safety and Reliability of a New System or Spacecraft: Incorporating Analyses and Reviews of the Design and Processing in Determining the Number of Tests to be Conducted

    NASA Technical Reports Server (NTRS)

    Vesely, William E.; Colon, Alfredo E.

    2010-01-01

    Design Safety/Reliability is associated with the probability of no failure-causing faults existing in a design. Confidence in the non-existence of failure-causing faults is increased by performing tests with no failure. Reliability-Growth testing requirements are based on initial assurance and fault detection probability. Using binomial tables generally gives too many required tests compared to reliability-growth requirements. Reliability-Growth testing requirements are based on reliability principles and factors and should be used.
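
    For context, the binomial test count that this record contrasts with reliability-growth requirements can be sketched with the standard zero-failure formula: the smallest n such that R^n ≤ 1 − C, i.e. n = ln(1 − C) / ln(R) rounded up. The Python sketch below illustrates only that binomial baseline, not the authors' reliability-growth method, and the function name is hypothetical.

```python
import math

def zero_failure_tests(reliability, confidence):
    """Smallest number of consecutive failure-free tests needed to
    demonstrate `reliability` at the given `confidence` (binomial model)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

print(zero_failure_tests(0.99, 0.90))  # demonstrate R = 0.99 at 90% confidence
print(zero_failure_tests(0.95, 0.95))  # demonstrate R = 0.95 at 95% confidence
```

    The count grows quickly as the reliability target approaches 1, which illustrates why the binomial approach can demand large numbers of tests.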

  13. Evaluating North American Electric Grid Reliability Using the Barabasi-Albert Network Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chassin, David P.; Posse, Christian

    2005-09-15

    The reliability of electric transmission systems is examined using a scale-free model of network topology and failure propagation. The topologies of the North American eastern and western electric grids are analyzed to estimate their reliability based on the Barabási-Albert network model. A commonly used power system reliability index is computed using a simple failure propagation model. The results are compared to the values of power system reliability indices previously obtained using other methods and they suggest that scale-free network models are usable to estimate aggregate electric grid reliability.

  14. Evaluating North American Electric Grid Reliability Using the Barabasi-Albert Network Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chassin, David P.; Posse, Christian

    2005-09-15

    The reliability of electric transmission systems is examined using a scale-free model of network topology and failure propagation. The topologies of the North American eastern and western electric grids are analyzed to estimate their reliability based on the Barabasi-Albert network model. A commonly used power system reliability index is computed using a simple failure propagation model. The results are compared to the values of power system reliability indices previously obtained using standard power engineering methods, and they suggest that scale-free network models are usable to estimate aggregate electric grid reliability.

  15. Choosing a reliability inspection plan for interval censored data

    DOE PAGES

    Lu, Lu; Anderson-Cook, Christine Michaela

    2017-04-19

    Reliability test plans are important for producing precise and accurate assessment of reliability characteristics. This paper explores different strategies for choosing between possible inspection plans for interval censored data given a fixed testing timeframe and budget. A new general cost structure is proposed for guiding precise quantification of total cost in inspection test plan. Multiple summaries of reliability are considered and compared as the criteria for choosing the best plans using an easily adapted method. Different cost structures and representative true underlying reliability curves demonstrate how to assess different strategies given the logistical constraints and nature of the problem. Results show several general patterns exist across a wide variety of scenarios. Given the fixed total cost, plans that inspect more units with less frequency based on equally spaced time points are favored due to the ease of implementation and consistent good performance across a large number of case study scenarios. Plans with inspection times chosen based on equally spaced probabilities offer improved reliability estimates for the shape of the distribution, mean lifetime, and failure time for a small fraction of population only for applications with high infant mortality rates. The paper uses a Monte Carlo simulation based approach in addition to the common evaluation based on the asymptotic variance and offers comparison and recommendation for different applications with different objectives. Additionally, the paper outlines a variety of different reliability metrics to use as criteria for optimization, presents a general method for evaluating different alternatives, as well as provides case study results for different common scenarios.

  16. Choosing a reliability inspection plan for interval censored data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Lu; Anderson-Cook, Christine Michaela

    Reliability test plans are important for producing precise and accurate assessment of reliability characteristics. This paper explores different strategies for choosing between possible inspection plans for interval censored data given a fixed testing timeframe and budget. A new general cost structure is proposed for guiding precise quantification of total cost in inspection test plan. Multiple summaries of reliability are considered and compared as the criteria for choosing the best plans using an easily adapted method. Different cost structures and representative true underlying reliability curves demonstrate how to assess different strategies given the logistical constraints and nature of the problem. Results show several general patterns exist across a wide variety of scenarios. Given the fixed total cost, plans that inspect more units with less frequency based on equally spaced time points are favored due to the ease of implementation and consistent good performance across a large number of case study scenarios. Plans with inspection times chosen based on equally spaced probabilities offer improved reliability estimates for the shape of the distribution, mean lifetime, and failure time for a small fraction of population only for applications with high infant mortality rates. The paper uses a Monte Carlo simulation based approach in addition to the common evaluation based on the asymptotic variance and offers comparison and recommendation for different applications with different objectives. Additionally, the paper outlines a variety of different reliability metrics to use as criteria for optimization, presents a general method for evaluating different alternatives, as well as provides case study results for different common scenarios.

  17. Reliability of measuring sciatic and tibial nerve movement with diagnostic ultrasound during a neural mobilisation technique.

    PubMed

    Ellis, Richard; Hing, Wayne; Dilley, Andrew; McNair, Peter

    2008-08-01

    Diagnostic ultrasound provides a technique whereby real-time, in vivo analysis of peripheral nerve movement is possible. This study measured sciatic nerve movement during a "slider" neural mobilisation technique (ankle dorsiflexion/plantar flexion and cervical extension/flexion). Transverse and longitudinal movement was assessed from still ultrasound images and video sequences by using frame-by-frame cross-correlation software. Sciatic nerve movement was recorded in the transverse and longitudinal planes. For transverse movement, at the posterior midthigh (PMT) the mean value of lateral sciatic nerve movement was 3.54 mm (standard error of measurement [SEM] +/- 1.18 mm) compared with anterior-posterior/vertical (AP) movement of 1.61 mm (SEM +/- 0.78 mm). At the popliteal crease (PC) scanning location, lateral movement was 6.62 mm (SEM +/- 1.10 mm) compared with AP movement of 3.26 mm (SEM +/- 0.99 mm). Mean longitudinal sciatic nerve movement at the PMT was 3.47 mm (SEM +/- 0.79 mm; n = 27) compared with the PC of 5.22 mm (SEM +/- 0.05 mm; n = 3). The reliability of ultrasound measurement of transverse sciatic nerve movement was fair to excellent (Intraclass correlation coefficient [ICC] = 0.39-0.76) compared with excellent (ICC = 0.75) for analysis of longitudinal movement. Diagnostic ultrasound presents a reliable, noninvasive, real-time, in vivo method for analysis of sciatic nerve movement.

  18. BurnCase 3D software validation study: Burn size measurement accuracy and inter-rater reliability.

    PubMed

    Parvizi, Daryousch; Giretzlehner, Michael; Wurzer, Paul; Klein, Limor Dinur; Shoham, Yaron; Bohanon, Fredrick J; Haller, Herbert L; Tuca, Alexandru; Branski, Ludwik K; Lumenta, David B; Herndon, David N; Kamolz, Lars-P

    2016-03-01

    The aim of this study was to compare the accuracy of burn size estimation using the computer-assisted software BurnCase 3D (RISC Software GmbH, Hagenberg, Austria) with that using a 2D scan, considered to be the actual burn size. Thirty artificial burn areas were preplanned and prepared on three mannequins (one child, one female, and one male). Five trained physicians (raters) were asked to assess the size of all wound areas using BurnCase 3D software. The results were then compared with the real wound areas, as determined by 2D planimetry imaging. To examine inter-rater reliability, we performed an intraclass correlation analysis with a 95% confidence interval. The mean wound area estimations of the five raters using BurnCase 3D were in total 20.7±0.9% for the child, 27.2±1.5% for the female and 16.5±0.1% for the male mannequin. Our analysis showed relative overestimations of 0.4%, 2.8% and 1.5% for the child, female and male mannequins respectively, compared to the 2D scan. The intraclass correlation between the single raters for mean percentage of the artificial burn areas was 98.6%. A high intraclass correlation was also observed between the single raters and the 2D scan. BurnCase 3D is a valid and reliable tool for the determination of total body surface area burned in standard models. Further clinical studies including different pediatric and overweight adult mannequins are warranted. Copyright © 2016 Elsevier Ltd and ISBI. All rights reserved.

  19. A systematic review on the quality of measurement techniques for the assessment of burn wound depth or healing potential.

    PubMed

    Jaspers, Mariëlle E H; van Haasterecht, Ludo; van Zuijlen, Paul P M; Mokkink, Lidwine B

    2018-06-22

    Reliable and valid assessment of burn wound depth or healing potential is essential to treatment decision-making, to provide a prognosis, and to compare studies evaluating different treatment modalities. The aim of this review was to critically appraise, compare and summarize the quality of relevant measurement properties of techniques that aim to assess burn wound depth or healing potential. A systematic literature search was performed using PubMed, EMBASE and Cochrane Library. Two reviewers independently evaluated the methodological quality of included articles using an adapted version of the Consensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. A synthesis of evidence was performed to rate the measurement properties for each technique and to draw an overall conclusion on quality of the techniques. Thirty-six articles were included, evaluating various techniques, classified as (1) laser Doppler techniques; (2) thermography or thermal imaging; (3) other measurement techniques. Strong evidence was found for adequate construct validity of laser Doppler imaging (LDI). Moderate evidence was found for adequate construct validity of thermography, videomicroscopy, and spatial frequency domain imaging (SFDI). Only two studies reported on the measurement property reliability. Furthermore, considerable variation was observed among comparator instruments. Considering the evidence available, it appears that LDI is currently the most favorable technique; thereby assessing burn wound healing potential. Additional research is needed into thermography, videomicroscopy, and SFDI to evaluate their full potential. Future studies should focus on reliability and measurement error, and provide a precise description of which construct is aimed to measure. Copyright © 2018 Elsevier Ltd and ISBI. All rights reserved.

  20. Reliable protocol for shear wave elastography of lower limb muscles at rest and during passive stretching.

    PubMed

    Dubois, Guillaume; Kheireddine, Walid; Vergari, Claudio; Bonneau, Dominique; Thoreux, Patricia; Rouch, Philippe; Tanter, Mickael; Gennisson, Jean-Luc; Skalli, Wafa

    2015-09-01

    Development of shear wave elastography gave access to non-invasive muscle stiffness assessment in vivo. The aim of the present study was to define a measurement protocol to be used in clinical routine for quantifying the shear modulus of lower limb muscles. Four positions were defined to evaluate shear modulus in 10 healthy subjects: parallel to the fibers, in the anterior and posterior aspects of the lower limb, at rest and during passive stretching. Reliability was first evaluated on two muscles by three operators; these measurements were repeated six times. Then, measurement reliability was compared in 11 muscles by two operators; these measurements were repeated three times. Reproducibility of shear modulus was 0.48 kPa and repeatability was 0.41 kPa, with all muscles pooled. Position did not significantly influence reliability. Shear wave elastography appeared to be an appropriate and reliable tool to evaluate the shear modulus of lower limb muscles with the proposed protocol. Copyright © 2015 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.

  1. Inhibition in task switching: The reliability of the n - 2 repetition cost.

    PubMed

    Kowalczyk, Agnieszka W; Grange, James A

    2017-12-01

    The n - 2 repetition cost seen in task switching is the effect of slower response times performing a recently completed task (e.g. an ABA sequence) compared to performing a task that was not recently completed (e.g. a CBA sequence). This cost is thought to reflect cognitive inhibition of task representations and as such, the n - 2 repetition cost has begun to be used as an assessment of individual differences in inhibitory control; however, the reliability of this measure has not been investigated in a systematic manner. The current study addressed this important issue. Seventy-two participants performed three task switching paradigms; participants were also assessed on rumination traits and processing speed, measures of individual differences potentially modulating the n - 2 repetition cost. We found significant n - 2 repetition costs for each paradigm. However, split-half reliability tests revealed that this cost was not reliable at the individual-difference level. Neither rumination tendencies nor processing speed predicted this cost. We conclude that the n - 2 repetition cost is not reliable as a measure of individual differences in inhibitory control.
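
    Split-half reliability, as used in this record, correlates each participant's score on one half of the trials with their score on the other half and then applies the Spearman-Brown correction, 2r / (1 + r). The Python sketch below assumes an odd/even split of trial-level scores; the abstract does not say how the halves were formed, and the function names and toy data are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def split_half_reliability(trials_by_participant):
    """Odd/even split-half reliability with Spearman-Brown correction.

    trials_by_participant: one list of trial-level scores per participant.
    """
    first_half = [mean(t[0::2]) for t in trials_by_participant]
    second_half = [mean(t[1::2]) for t in trials_by_participant]
    r = pearson(first_half, second_half)
    return 2 * r / (1 + r)

# Toy data: three participants, two trials each (hypothetical values)
print(split_half_reliability([[1, 1], [2, 3], [3, 2]]))
```

    A group-level effect can be statistically significant while this coefficient stays low, which is the pattern the abstract reports.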

  2. Revisiting Individual Creativity Assessment: Triangulation in Subjective and Objective Assessment Methods

    ERIC Educational Resources Information Center

    Park, Namgyoo K.; Chun, Monica Youngshin; Lee, Jinju

    2016-01-01

    Compared to the significant development of creativity studies, individual creativity research has not reached a meaningful consensus regarding the most valid and reliable method for assessing individual creativity. This study revisited 2 of the most popular methods for assessing individual creativity: subjective and objective methods. This study…

  3. Physical Activity Measurement Methods for Young Children: A Comparative Study

    ERIC Educational Resources Information Center

    Hands, Beth; Parker, Helen; Larkin, Dawne

    2006-01-01

    Many behavior patterns that impact on physical activity experiences are established in early childhood, therefore it is important that valid, reliable, and feasible measures are constructed to identify children who are not developing appropriate and healthy activity habits. In this study, measures of physical activity derived by accelerometry and…

  4. Efficiency tests of samplers for microbiological aerosols, a review

    NASA Technical Reports Server (NTRS)

    Henningson, E.; Faengmark, I.

    1984-01-01

    To obtain comparable results from studies using a variety of samplers of microbiological aerosols with different collection performances for various particle sizes, methods reported in the literature were surveyed, evaluated, and tabulated for testing the efficiency of the samplers. It is concluded that these samplers were not thoroughly tested, using reliable methods. Tests were conducted in static air chambers and in various outdoor and work environments. Results are not reliable as it is difficult to achieve stable and reproducible conditions in these test systems. Testing in a wind tunnel is recommended.

  5. Strength and reliability analysis of metal-composite overwrapped pressure vessel

    NASA Astrophysics Data System (ADS)

    Burov, A. E.; Lepikhin, A. M.; Moskvichev, V. V.

    2017-12-01

    Metal-composite overwrapped pressure vessels (MCOPV) have found a wide application in aerospace and aeronautical industries. Such vessels should combine impermeability and high weight efficiency with enhanced long-term safety and durability. To meet these requirements, theoretical and experimental studies on the mechanics of deformation and failure of MCOPV are required. In the paper, the analysis on strength, lifetime and reliability of MCOPV is presented. A high performance of the MCOPV is justified by comparing the calculation results with experiment data obtained on full-scale samples.

  6. The quadrant method measuring four points is as a reliable and accurate as the quadrant method in the evaluation after anatomical double-bundle ACL reconstruction.

    PubMed

    Mochizuki, Yuta; Kaneko, Takao; Kawahara, Keisuke; Toyoda, Shinya; Kono, Norihiko; Hada, Masaru; Ikegami, Hiroyasu; Musha, Yoshiro

    2017-11-20

    The quadrant method was described by Bernard et al. and it has been widely used for postoperative evaluation of anterior cruciate ligament (ACL) reconstruction. The purpose of this research is to further develop the quadrant method measuring four points, which we named four-point quadrant method, and to compare with the quadrant method. Three-dimensional computed tomography (3D-CT) analyses were performed in 25 patients who underwent double-bundle ACL reconstruction using the outside-in technique. The four points in this study's quadrant method were defined as point1-highest, point2-deepest, point3-lowest, and point4-shallowest, in femoral tunnel position. Value of depth and height in each point was measured. Antero-medial (AM) tunnel is (depth1, height2) and postero-lateral (PL) tunnel is (depth3, height4) in this four-point quadrant method. The 3D-CT images were evaluated independently by 2 orthopaedic surgeons. A second measurement was performed by both observers after a 4-week interval. Intra- and inter-observer reliability was calculated by means of intra-class correlation coefficient (ICC). Also, the accuracy of the method was evaluated against the quadrant method. Intra-observer reliability was almost perfect for both AM and PL tunnel (ICC > 0.81). Inter-observer reliability of AM tunnel was substantial (ICC > 0.61) and that of PL tunnel was almost perfect (ICC > 0.81). The AM tunnel position was 0.13% deep, 0.58% high and PL tunnel position was 0.01% shallow, 0.13% low compared to quadrant method. The four-point quadrant method was found to have high intra- and inter-observer reliability and accuracy. This method can evaluate the tunnel position regardless of the shape and morphology of the bone tunnel aperture for use of comparison and can provide measurement that can be compared with various reconstruction methods. 
    The four-point quadrant method of this study is considered to have clinical relevance in that it is a detailed and accurate tool for evaluating femoral tunnel position after ACL reconstruction. Case series, Level IV.

  7. ESTIMATING IMPERVIOUS COVER FROM REGIONALLY AVAILABLE DATA

    EPA Science Inventory

    The objective of this study is to compare and evaluate the reliability of different approaches for estimating impervious cover including three empirical formulations for estimating impervious cover from population density data, estimation from categorized land cover data, and to ...

  8. Assessment of Validity and Reliability of IMNCI Algorithm in Comparison to Provisional Diagnosis of Senior Pediatricians in a Tertiary Hospital of Kolkata.

    PubMed

    Bhattacharyya, Agnihotri; Mukherjee, Shuvankar; Chatterjee, Chitra; Dasgupta, Samir

    2013-04-01

    Integrated management of childhood illness (IMNCI) is already operational in many states of India, but there are only limited studies in Indian scenario comparing its validity and reliability with the decisions of pediatricians. Aims and. To assess the validity and reliability of the IMNCI algorithm with provisional diagnosis of senior pediatricians for each IMNCI classifications. The present study is done with all the young infants between 0-2 months presented during the study period with a fresh episode of illness to test the validity and reliability of the algorithm in comparison to provisional diagnoses of senior pediatricians. The study was done in a tertiary care hospital. Validity characteristics such as sensitivity, specificity, positive predictive value, negative predictive value, and reliability characteristics such as percent agreement and Kappa were assessed for individual IMNCI classifications. The sensitivity of possible serious bacterial infection, local bacterial infection, jaundice, no dehydration and possible serious bacterial infection, not able to feed were 88.89, 14.29, 66.67, 25 and 44.44% respectively. The specificities for the same conditions were 71.72, 99.09, 99.07, 94.50 and 86.87%. Percent agreements for similar conditions were 74, 94, 97, 90 and 80% respectively and the Kappa ratios were 0.38, 0.20, 0.73, 0.19 and 0.29 respectively. It could be concluded that IMNCI is quite a sensitive strategy and could identify severe illnesses of young infants requiring referral to higher facility. Further studies, particularly in primary health care setting, are required.
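
    The validity and reliability measures named in this record (sensitivity, specificity, percent agreement and Cohen's kappa) can all be derived from one 2×2 table comparing the algorithm with the reference diagnosis. The Python sketch below uses hypothetical counts, not data from the study, and the function name is illustrative.

```python
def agreement_stats(tp, fp, fn, tn):
    """Validity and agreement measures from a 2x2 table.

    tp/fp/fn/tn: algorithm result versus the reference diagnosis.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # percent agreement as a proportion
    # Agreement expected by chance from the row and column totals
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }

# Hypothetical counts for one classification
print(agreement_stats(tp=40, fp=5, fn=10, tn=45))
```

    Because kappa discounts chance agreement, a rare classification can show high percent agreement alongside a low kappa.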

  9. Neural Bases Of Food Perception: Coordinate-Based Meta-Analyses Of Neuroimaging Studies In Multiple Modalities

    PubMed Central

    Huerta, Claudia I; Sarkar, Pooja R; Duong, Timothy Q.; Laird, Angela R; Fox, Peter T

    2013-01-01

    Objective: The purpose of this study was to compare the results of the three food-cue paradigms most commonly used for functional neuroimaging studies to determine: i) commonalities and differences in the neural response patterns by paradigm; and, ii) the relative robustness and reliability of responses to each paradigm. Design and Methods: Functional magnetic resonance imaging (fMRI) studies using standardized stereotactic coordinates to report brain responses to food cues were identified using on-line databases. Studies were grouped by food-cue modality as: i) tastes (8 studies); ii) odors (8 studies); and, iii) images (11 studies). Activation likelihood estimation (ALE) was used to identify statistically reliable regional responses within each stimulation paradigm. Results: Brain response distributions were distinctly different for the three stimulation modalities, corresponding to known differences in location of the respective primary and associative cortices. Visual stimulation induced the most robust and extensive responses. The left anterior insula was the only brain region reliably responding to all three stimulus categories. Conclusions: These findings suggest the visual food-cue paradigm as a promising candidate for imaging studies addressing the neural substrate of therapeutic interventions. PMID:24174404

  10. Mother-reported sleep, accelerometer-estimated sleep and weight status in Mexican American children: sleep duration is associated with increased adiposity and risk for overweight/obese status

    USDA-ARS's Scientific Manuscript database

    We know of no studies comparing parent-reported sleep with accelerometer-estimated sleep in their relation to paediatric adiposity. We examined: (i) the reliability of mother-reported sleep compared with accelerometer-estimated sleep; and (ii) the relationship between both sleep measures and child a...

  11. Reliability of the standard goniometry and diagrammatic recording of finger joint angles: a comparative study with healthy subjects and non-professional raters.

    PubMed

    Macionis, Valdas

    2013-01-09

    Diagrammatic recording of finger joint angles by using two criss-crossed paper strips can be a quick substitute to the standard goniometry. As a preliminary step toward clinical validation of the diagrammatic technique, the current study employed healthy subjects and non-professional raters to explore whether reliability estimates of the diagrammatic goniometry are comparable with those of the standard procedure. The study included two procedurally different parts, which were replicated by assigning 24 medical students to act interchangeably as 12 subjects and 12 raters. A larger component of the study was designed to compare goniometers side-by-side in measurement of finger joint angles varying from subject to subject. In the rest of the study, the instruments were compared by parallel evaluations of joint angles similar for all subjects in a situation of simulated change of joint range of motion over time. The subjects used special guides to position the joints of their left ring finger at varying angles of flexion and extension. The obtained diagrams of joint angles were converted to numerical values by computerized measurements. The statistical approaches included calculation of appropriate intraclass correlation coefficients, standard errors of measurements, proportions of measurement differences of 5 or less degrees, and significant differences between paired observations. Reliability estimates were similar for both goniometers. Intra-rater and inter-rater intraclass correlation coefficients ranged from 0.69 to 0.93. The corresponding standard errors of measurements ranged from 2.4 to 4.9 degrees. Repeated measurements of a considerable number of raters fell within clinically non-meaningful 5 degrees of each other in proportions comparable with a criterion value of 0.95. Data collected with both instruments could be similarly interpreted in a simulated situation of change of joint range of motion over time. 
    The paper goniometer and the standard goniometer can be used interchangeably by non-professional raters for evaluation of normal finger joints. The obtained results warrant further research to assess clinical performance of the paper strip technique.
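
    Two of the statistics listed in this abstract are simple to sketch. The abstract does not say how the standard error of measurement was obtained, so the common formula SEM = SD * sqrt(1 - ICC) is an assumption here; the function names and the angle readings are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM from the between-subject SD and a reliability coefficient (assumed formula)."""
    return sd * math.sqrt(1.0 - icc)


def proportion_within(first, second, tolerance=5.0):
    """Share of paired readings that differ by no more than `tolerance` degrees."""
    diffs = [abs(a - b) for a, b in zip(first, second)]
    return sum(d <= tolerance for d in diffs) / len(diffs)


# Invented flexion angles (degrees) from two measurement sessions.
session_1 = [30.0, 45.0, 60.0, 90.0]
session_2 = [32.0, 44.0, 67.0, 88.0]

if __name__ == "__main__":
    print(standard_error_of_measurement(sd=10.0, icc=0.75))
    print(proportion_within(session_1, session_2))
```

    The second function gives the quantity the abstract compares against its criterion value of 0.95: the proportion of repeated measurements that fall within 5 degrees of each other.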

  12. Reliability of the standard goniometry and diagrammatic recording of finger joint angles: a comparative study with healthy subjects and non-professional raters

    PubMed Central

    2013-01-01

    Background Diagrammatic recording of finger joint angles by using two criss-crossed paper strips can be a quick substitute to the standard goniometry. As a preliminary step toward clinical validation of the diagrammatic technique, the current study employed healthy subjects and non-professional raters to explore whether reliability estimates of the diagrammatic goniometry are comparable with those of the standard procedure. Methods The study included two procedurally different parts, which were replicated by assigning 24 medical students to act interchangeably as 12 subjects and 12 raters. A larger component of the study was designed to compare goniometers side-by-side in measurement of finger joint angles varying from subject to subject. In the rest of the study, the instruments were compared by parallel evaluations of joint angles similar for all subjects in a situation of simulated change of joint range of motion over time. The subjects used special guides to position the joints of their left ring finger at varying angles of flexion and extension. The obtained diagrams of joint angles were converted to numerical values by computerized measurements. The statistical approaches included calculation of appropriate intraclass correlation coefficients, standard errors of measurements, proportions of measurement differences of 5 or less degrees, and significant differences between paired observations. Results Reliability estimates were similar for both goniometers. Intra-rater and inter-rater intraclass correlation coefficients ranged from 0.69 to 0.93. The corresponding standard errors of measurements ranged from 2.4 to 4.9 degrees. Repeated measurements of a considerable number of raters fell within clinically non-meaningful 5 degrees of each other in proportions comparable with a criterion value of 0.95. Data collected with both instruments could be similarly interpreted in a simulated situation of change of joint range of motion over time. 
    Conclusions The paper goniometer and the standard goniometer can be used interchangeably by non-professional raters for evaluation of normal finger joints. The obtained results warrant further research to assess clinical performance of the paper strip technique. PMID:23302419

  13. Probability techniques for reliability analysis of composite materials

    NASA Technical Reports Server (NTRS)

    Wetherhold, Robert C.; Ucci, Anthony M.

    1994-01-01

    Traditional design approaches for composite materials have employed deterministic criteria for failure analysis. New approaches are required to predict the reliability of composite structures since strengths and stresses may be random variables. This report will examine and compare methods used to evaluate the reliability of composite laminae. The two types of methods that will be evaluated are fast probability integration (FPI) methods and Monte Carlo methods. In these methods, reliability is formulated as the probability that an explicit function of random variables is less than a given constant. Using failure criteria developed for composite materials, a function of design variables can be generated which defines a 'failure surface' in probability space. A number of methods are available to evaluate the integration over the probability space bounded by this surface; this integration delivers the required reliability. The methods which will be evaluated are: the first-order, second-moment FPI methods; second-order, second-moment FPI methods; the simple Monte Carlo; and an advanced Monte Carlo technique which utilizes importance sampling. The methods are compared for accuracy, efficiency, and the conservatism of the reliability estimate. The methodology involved in determining the sensitivity of the reliability estimate to the design variables (strength distributions) and importance factors is also presented.
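
    The three families of methods compared in this report can be illustrated on a toy problem. The sketch below is not the report's composite failure criterion: it assumes a linear limit state g = R - S with independent normal strength R and stress S, for which the first-order, second-moment result is exact, and all names and parameter values are hypothetical.

```python
import math
import random


def normal_cdf(x: float) -> float:
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def fosm_failure_probability(mu_r, sd_r, mu_s, sd_s):
    """First-order, second-moment estimate: P(g < 0) = Phi(-beta)."""
    beta = (mu_r - mu_s) / math.hypot(sd_r, sd_s)
    return normal_cdf(-beta)


def simple_monte_carlo(mu_r, sd_r, mu_s, sd_s, n, rng):
    """Fraction of sampled (strength, stress) pairs with strength below stress."""
    failures = sum(rng.gauss(mu_r, sd_r) < rng.gauss(mu_s, sd_s) for _ in range(n))
    return failures / n


def importance_sampling(mu_r, sd_r, mu_s, sd_s, n, rng):
    """Monte Carlo with the sampling density recentred on the design point."""
    norm = math.hypot(sd_r, sd_s)
    beta = (mu_r - mu_s) / norm
    # Design point in standard normal space: closest point on g = 0 to the origin.
    star_r, star_s = -beta * sd_r / norm, beta * sd_s / norm
    shift_sq = star_r ** 2 + star_s ** 2
    total = 0.0
    for _ in range(n):
        u_r = rng.gauss(star_r, 1.0)
        u_s = rng.gauss(star_s, 1.0)
        if mu_r + sd_r * u_r < mu_s + sd_s * u_s:
            # Likelihood ratio of the original to the shifted standard normal density.
            total += math.exp(-(u_r * star_r + u_s * star_s) + 0.5 * shift_sq)
    return total / n


# Hypothetical strength and stress parameters (arbitrary units).
PARAMS = dict(mu_r=100.0, sd_r=10.0, mu_s=60.0, sd_s=10.0)

if __name__ == "__main__":
    rng = random.Random(0)
    print("FOSM:", fosm_failure_probability(**PARAMS))
    print("simple MC:", simple_monte_carlo(**PARAMS, n=20000, rng=rng))
    print("importance sampling:", importance_sampling(**PARAMS, n=20000, rng=rng))
```

    With a failure probability this small, the simple Monte Carlo estimate rests on only a few dozen failing samples, whereas roughly half of the importance-sampling draws land in the failure region, which is the efficiency argument for the advanced technique.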

  14. Reliability and Validity of the Turkish Version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V).

    PubMed

    Özcebe, Esra; Aydinli, Fatma Esen; Tiğrak, Tuğçe Karahan; İncebay, Önal; Yilmaz, Taner

    2018-01-11

    The main purpose of this study was to culturally adapt the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) to Turkish and to evaluate its internal consistency, validity, and reliability. The Turkish version of CAPE-V was developed, and with the use of a prospective case-control design, the voice recordings of 130 participants were collected according to CAPE-V protocol. Auditory-perceptual evaluation was conducted according to CAPE-V and Grade, Roughness, Breathiness, Asthenia, and Strain (GRBAS) scale by two ear, nose, and throat specialists and two speech and language therapists. The different types of voice disorders, classified as organic and functional disorders, were compared in terms of their CAPE-V scores. The overall severity parameter had the highest intrarater and interrater reliability values for all the participants. For all four raters, the differences in the six CAPE-V parameters between the study and the control groups were found to be statistically significant. Among the correlations for the comparable parameters of the CAPE-V and the GRBAS scales, the highest correlation was found between the overall severity-grade parameters. There was no difference found between the organic and functional voice disorders in terms of the CAPE-V scores. The Turkish version of CAPE-V has been proven to be a reliable and valid instrument to use in the auditory-perceptual evaluation of voice. For the future application of this study, it would be important to investigate whether cepstral measures correlate with the auditory-perceptual judgments of dysphonia severity collected by a Turkish version of the CAPE-V. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  15. A comparison of the shuttle and 6 minute walking tests with measured peak oxygen consumption in patients with heart failure.

    PubMed

    Green, D J; Watts, K; Rankin, S; Wong, P; O'Driscoll, J G

    2001-09-01

    This study investigated the use of an incremental, externally paced 10 m shuttle walk test (SWT) as an objective, reliable and predictive test of functional capacity in patients with heart failure (CHF). The SWT was compared to a 6 minute walk test (6WT) and a maximal symptom-limited treadmill peak oxygen consumption (VO2peak) test. Experiment 1 examined the reproducibility of the SWT. Two SWT trials were performed and distance ambulated (DA), heart rate (HR) and rate of perceived exertion (RPE) results compared. In experiment 2, SWT, 6WT, and VO2peak tests were performed and HR, RPE and ambulatory VO2 compared. The SWT demonstrated strong test/retest reliability for DA (r = 0.98), HR (r = 0.96) and RPE (r = 0.89). Treadmill VO2peak was significantly correlated with DA during the SWT (r = 0.83, P < 0.05), but not the 6WT. SWT peak VO2 (18.5 +/- 1.8 ml·kg⁻¹·min⁻¹) and treadmill VO2peak (18.3 +/- 2.0 ml·kg⁻¹·min⁻¹) were also highly correlated (r = 0.78, P < 0.05). Conversely, 6WT peak VO2 and treadmill VO2peak were not significantly correlated. This study suggests the SWT is a reliable, objective test, highly predictive of VO2peak, which may be a better field exercise test than the self-paced 6WT.

  16. A new principle for the standardization of long paragraphs for reading speed analysis.

    PubMed

    Radner, Wolfgang; Radner, Stephan; Diendorfer, Gabriela

    2016-01-01

    To investigate the reliability, validity, and statistical comparability of long paragraphs that were developed to be equivalent in construction and difficulty. Seven long paragraphs were developed that were equal in syntax, morphology, and number and position of words (111), with the same number of syllables (179) and number of characters (660). For validity analyses, the paragraphs were compared with the mean reading speed of a set of seven sentence optotypes of the RADNER Reading Charts (mean of 7 × 14 = 98 words read). Reliability analyses were performed by calculating the Cronbach's alpha value and the corrected total item correlation. Sixty participants (aged 20-77 years) read the paragraphs and the sentences (distance 40 cm; font: Times New Roman 12 pt). Test items were presented randomly; reading length was measured with a stopwatch. Reliability analysis yielded a Cronbach's alpha value of 0.988. When the long paragraphs were compared in pairwise fashion, significant differences were found in 13 of the 21 pairs (p < 0.05). In two sequences of three paragraphs each and in eight pairs of paragraphs, the paragraphs did not differ significantly, and these paragraph combinations are therefore suitable for comparative research studies. The mean reading speed was 173.34 ± 24.01 words per minute (wpm) for the long paragraphs and 198.26 ± 28.60 wpm for the sentence optotypes. The maximum difference in reading speed was 5.55 % for the long paragraphs and 2.95 % for the short sentence optotypes. The correlation between long paragraphs and sentence optotypes was high (r = 0.9243). Despite good reliability and equivalence in construction and degree of difficulty, a statistically significant difference in reading speed can occur between long paragraphs. 
    Since statistical significance should be dependent only on the persons tested, either standardizing long paragraphs for statistical equality of reading speed measurements or increasing the number of presented paragraphs is recommended for comparative investigations.
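
    The two reliability statistics named in this abstract can be sketched as follows, treating each paragraph as an item and each participant as a row. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical and the reading speeds are invented.

```python
import math
import statistics


def pearson(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cross / (ssx * ssy) ** 0.5


def cronbach_alpha(scores):
    """scores: one row per participant, one column per item (paragraph)."""
    k = len(scores[0])
    item_variances = sum(statistics.variance(col) for col in zip(*scores))
    total_variance = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - item_variances / total_variance)


def corrected_item_total(scores):
    """Correlation of each item with the sum of the remaining items."""
    k = len(scores[0])
    return [
        pearson([row[i] for row in scores], [sum(row) - row[i] for row in scores])
        for i in range(k)
    ]


# Invented reading speeds (wpm): four readers, three paragraphs.
speeds = [
    [150.0, 155.0, 148.0],
    [170.0, 176.0, 169.0],
    [190.0, 188.0, 195.0],
    [210.0, 214.0, 207.0],
]

if __name__ == "__main__":
    print("alpha:", round(cronbach_alpha(speeds), 3))
    print("item-total:", [round(r, 3) for r in corrected_item_total(speeds)])
    print("pairs among 7 paragraphs:", math.comb(7, 2))
```

    The last line reproduces the 21 pairwise comparisons mentioned in the abstract, since seven paragraphs form 7 * 6 / 2 = 21 pairs. A very high alpha only shows that readers are ranked consistently across paragraphs, which is why mean speeds can still differ significantly between paragraphs.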

  17. Overcoming the Challenges of Unstructured Data in Multisite, Electronic Medical Record-based Abstraction.

    PubMed

    Polnaszek, Brock; Gilmore-Bykovskyi, Andrea; Hovanes, Melissa; Roiland, Rachel; Ferguson, Patrick; Brown, Roger; Kind, Amy J H

    2016-10-01

    Unstructured data encountered during retrospective electronic medical record (EMR) abstraction has routinely been identified as challenging to reliably abstract, as these data are often recorded as free text, without limitations to format or structure. There is increased interest in reliably abstracting this type of data given its prominent role in care coordination and communication, yet limited methodological guidance exists. As standard abstraction approaches resulted in substandard data reliability for unstructured data elements collected as part of a multisite, retrospective EMR study of hospital discharge communication quality, our goal was to develop, apply and examine the utility of a phase-based approach to reliably abstract unstructured data. This approach is examined using the specific example of discharge communication for warfarin management. We adopted a "fit-for-use" framework to guide the development and evaluation of abstraction methods using a 4-step, phase-based approach including (1) team building; (2) identification of challenges; (3) adaptation of abstraction methods; and (4) systematic data quality monitoring. Unstructured data elements were the focus of this study, including elements communicating steps in warfarin management (eg, warfarin initiation) and medical follow-up (eg, timeframe for follow-up). After implementation of the phase-based approach, interrater reliability for all unstructured data elements demonstrated κ values of ≥0.89, an average increase of 0.25 for each unstructured data element. As compared with standard abstraction methodologies, this phase-based approach was more time intensive, but did markedly increase abstraction reliability for unstructured data elements within multisite EMR documentation.

  18. Reproducibility of objectively measured physical activity and sedentary time over two seasons in children; Comparing a day-by-day and a week-by-week approach

    PubMed Central

    Andersen, Lars Bo; Skrede, Turid; Ekelund, Ulf; Anderssen, Sigmund Alfred; Resaland, Geir Kåre

    2017-01-01

    Introduction Knowledge of the reproducibility of accelerometer-determined physical activity (PA) and sedentary time (SED) estimates is a prerequisite for conducting high-quality epidemiological studies. Yet, estimates of reproducibility might differ depending on the approach used to analyze the data. The aim of the present study was to determine the reproducibility of objectively measured PA and SED in children by directly comparing a day-by-day and a week-by-week approach to data collected over two weeks during two different seasons 3–4 months apart. Methods 676 11-year-old children from the Active Smarter Kids study conducted in Sogn og Fjordane county, Norway, performed 7 days of accelerometer monitoring (ActiGraph GT3X+) during January-February and April-May 2015. Reproducibility was calculated using a day-by-day and a week-by-week approach applying mixed effect modelling and the Spearman Brown prophecy formula, and reported using intra-class correlation (ICC), Bland Altman plots and 95% limits of agreement (LoA). Results Applying a week-by-week approach, no variables provided ICC estimates ≥ 0.70 for one week of measurement in any model (ICC = 0.29–0.66 not controlling for season; ICC = 0.49–0.67 when controlling for season). LoA for these models approximated a factor of 1.3–1.7 of the sample PA level standard deviations. Compared to the week-by-week approach, the day-by-day approach resulted in overly optimistic reliability estimates (ICC = 0.62–0.77 not controlling for season; ICC = 0.64–0.77 when controlling for season). Conclusions Reliability is lower when analyzed over different seasons and when using a week-by-week approach, than when applying a day-by-day approach and the Spearman Brown prophecy formula to estimate reliability over a short monitoring period. We suggest that a day-by-day approach and the Spearman Brown prophecy formula be used with caution to determine reliability. Trial Registration The study is registered in Clinicaltrials.gov 7th April 2014 with identification number NCT02132494. PMID:29216318
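
    The Spearman Brown prophecy formula and the 95% limits of agreement mentioned in this abstract are both short calculations. The sketch below uses the standard textbook forms; it is not the study's mixed-model code, and the function names and the activity counts are hypothetical.

```python
import statistics


def spearman_brown(single_unit_icc: float, k: float) -> float:
    """Predicted reliability of the mean of k units (days or weeks) of monitoring."""
    return k * single_unit_icc / (1.0 + (k - 1.0) * single_unit_icc)


def units_needed(single_unit_icc: float, target: float) -> float:
    """Number of monitoring units needed to reach a target reliability."""
    return target * (1.0 - single_unit_icc) / (single_unit_icc * (1.0 - target))


def limits_of_agreement(first, second):
    """Bland Altman 95% limits: mean difference +/- 1.96 SD of the differences."""
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


# Invented weekly activity levels (counts per minute) for five children in two seasons.
winter = [520.0, 610.0, 480.0, 700.0, 560.0]
spring = [600.0, 590.0, 530.0, 760.0, 640.0]

if __name__ == "__main__":
    print("7-day reliability from a single-day ICC of 0.30:", round(spearman_brown(0.30, 7), 3))
    print("weeks needed for 0.80 from a weekly ICC of 0.50:", units_needed(0.50, 0.80))
    print("limits of agreement:", limits_of_agreement(winter, spring))
```

    The formula assumes the monitored units are interchangeable, so extrapolating from days within one week says nothing about variation between seasons, which is consistent with the caution the authors recommend.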

  19. Nifurpirinol: A more potent and reliable substrate compared to metronidazole for nitroreductase-mediated cell ablations.

    PubMed

    Bergemann, David; Massoz, Laura; Bourdouxhe, Jordane; Carril Pardo, Claudio A; Voz, Marianne L; Peers, Bernard; Manfroid, Isabelle

    2018-04-17

    The zebrafish is a popular animal model with well-known regenerative capabilities. To study regeneration in this fish, the nitroreductase/metronidazole-mediated system is widely used for targeted ablation of various cell types. Nevertheless, we highlight here some variability in ablation efficiencies with the metronidazole prodrug that led us to search for a more efficient and reliable compound. Herein, we present nifurpirinol, another nitroaromatic antibiotic, as a more potent prodrug compared to metronidazole to trigger cell-ablation in nitroreductase expressing transgenic models. We show that nifurpirinol induces robust and reliable ablations at concentrations 2,000 fold lower than metronidazole and three times below its own toxic concentration. We confirmed the efficiency of nifurpirinol in triggering massive ablation of three different cell types: the pancreatic beta cells, osteoblasts, and dopaminergic neurons. Our results identify nifurpirinol as a very potent prodrug for the nitroreductase-mediated ablation system and suggest that its use could be extended to many other cell types, especially if difficult to ablate, or when combined pharmacological treatments are desired. © 2018 by the Wound Healing Society.

  20. The Development of a Motor-Free Short-Form of the Wechsler Intelligence Scale for Children-Fifth Edition.

    PubMed

    Piovesana, Adina M; Harrison, Jessica L; Ducat, Jacob J

    2017-12-01

    This study aimed to develop a motor-free short-form of the Wechsler Intelligence Scale for Children-Fifth Edition (WISC-V) that allows clinicians to estimate the Full Scale Intelligence Quotients of youths with motor impairments. Using the reliabilities and intercorrelations of six WISC-V motor-free subtests, psychometric methodologies were applied to develop look-up tables for four Motor-free Short-form indices: Verbal Comprehension Short-form, Perceptual Reasoning Short-form, Working Memory Short-form, and a Motor-free Intelligence Quotient. Index-level discrepancy tables were developed using the same methods to allow clinicians to statistically compare visual, verbal, and working memory abilities. The short-form indices had excellent reliabilities (r = .92-.97) comparable to the original WISC-V. This motor-free short-form of the WISC-V is a reliable alternative for the assessment of intellectual functioning in youths with motor impairments. Clinicians are provided with user-friendly look-up tables, index-level discrepancy tables, and base rates, displayed similarly to those in the WISC-V manuals to enable interpretation of assessment results.
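
    The abstract does not name the psychometric formulas used. One common way to obtain the reliability of an equally weighted composite from subtest reliabilities and intercorrelations is sketched below; treating it as the authors' method would be an assumption, and the function name and all numeric values are invented, not WISC-V data.

```python
def composite_reliability(reliabilities, correlations):
    """Reliability of an unweighted sum of standardized subtests.

    reliabilities: reliability coefficient of each subtest.
    correlations:  symmetric matrix of subtest intercorrelations.
    """
    k = len(reliabilities)
    # Sum each distinct pair of subtests once (upper triangle of the matrix).
    pair_sum = sum(correlations[i][j] for i in range(k) for j in range(i + 1, k))
    return (sum(reliabilities) + 2.0 * pair_sum) / (k + 2.0 * pair_sum)


# Invented values for a hypothetical three-subtest index.
example_reliabilities = [0.85, 0.88, 0.90]
example_correlations = [
    [1.00, 0.60, 0.55],
    [0.60, 1.00, 0.65],
    [0.55, 0.65, 1.00],
]

if __name__ == "__main__":
    print(round(composite_reliability(example_reliabilities, example_correlations), 3))
```

    Because positively correlated subtests share true-score variance, the composite is typically more reliable than any single subtest, which fits short-form indices reaching reliabilities in the .90s.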
