interobserver reliability ofone: Topics by Science.gov

Sample records for interobserver reliability ofone

Surgeon Reliability for the Assessment of Lumbar Spinal Stenosis on MRI: The Impact of Surgeon Experience.

PubMed

Marawar, Satyajit V; Madom, Ian A; Palumbo, Mark; Tallarico, Richard A; Ordway, Nathaniel R; Metkar, Umesh; Wang, Dongliang; Green, Adam; Lavelle, William F

2017-01-01

The treating surgeon's visual assessment of axial MRI images to ascertain the degree of stenosis has a critical impact on surgical decision-making. The purpose of this study was to prospectively analyze the impact of surgeon experience on inter-observer and intra-observer reliability of assessing severity of spinal stenosis on MRIs by spine surgeons directly involved in surgical decision-making. Seven fellowship trained spine surgeons reviewed MRI studies of 30 symptomatic patients with lumbar stenosis and graded the stenosis in the central canal, the lateral recess and the foramen at T12-L1 to L5-S1 as none, mild, moderate or severe. No specific instructions were provided as to what constituted mild, moderate, or severe stenosis. Two surgeons were "senior" (>fifteen years of practice experience); two were "intermediate" (>four years of practice experience), and three "junior" (<one year of practice experience). The concordance correlation coefficient (CCC) was calculated to assess inter-observer reliability. Seven MRI studies were duplicated and randomly re-read to evaluate intra-observer reliability. Surgeon experience was found to be a strong predictor of inter-observer reliability. Senior inter-observer reliability was significantly higher than that of the "junior" group when assessing central (p<0.001), foraminal (p=0.005), and lateral (p=0.001) stenosis. The senior group also showed significantly higher inter-observer reliability than the intermediate group when assessing foraminal stenosis (p=0.036). For intra-observer reliability, the results were contrary to those found for inter-observer reliability. Inter-observer reliability of assessing stenosis on MRIs increases with surgeon experience. Lower intra-observer reliability values among the senior group, although not clearly explained, may be due to the small number of MRIs evaluated and quality of MRI images. Level of evidence: Level 3.
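The abstract above names the concordance correlation coefficient (CCC) but does not say how the seven raters were combined. As a minimal sketch only, the following computes Lin's CCC for one pair of raters, assuming the grades none/mild/moderate/severe are coded 0 to 3. The coding, the function name `concordance_ccc`, and the sample grades are illustrative assumptions and are not taken from the study.

```python
from statistics import fmean

def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for two raters' paired scores.

    Hypothetical helper for illustration; not the study's own procedure.
    Undefined (division by zero) if both raters give one identical constant score.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long score lists with at least 2 items")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    # Population (1/n) variance and covariance, as in Lin's definition.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference penalizes a systematic shift between raters.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

# Assumed coding: 0 = none, 1 = mild, 2 = moderate, 3 = severe (made-up grades).
rater_a = [0, 1, 2, 3]
rater_b = [1, 2, 3, 4]  # perfectly correlated with rater_a, but shifted by one grade
ccc_shifted = concordance_ccc(rater_a, rater_b)
ccc_identical = concordance_ccc(rater_a, rater_a)
print(ccc_shifted, ccc_identical)
```

Unlike a plain correlation coefficient, the CCC falls below 1 when one rater consistently grades higher than the other, which is why it is suited to agreement rather than mere association.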
Reliability testing of two classification systems for osteoarthritis and post-traumatic arthritis of the elbow.

PubMed

Amini, Michael H; Sykes, Joshua B; Olson, Stephen T; Smith, Richard A; Mauck, Benjamin M; Azar, Frederick M; Throckmorton, Thomas W

2015-03-01

The severity of elbow arthritis is one of many factors that surgeons must evaluate when considering treatment options for a given patient. Elbow surgeons have historically used the Broberg and Morrey (BM) and Hastings and Rettig (HR) classification systems to radiographically stage the severity of post-traumatic arthritis (PTA) and primary osteoarthritis (OA). We proposed to compare the intraobserver and interobserver reliability between systems for patients with either PTA or OA. The radiographs of 45 patients were evaluated at least 2 weeks apart by 6 evaluators of different levels of training. Intraobserver and interobserver reliability were calculated by Spearman correlation coefficients with 95% confidence intervals. Agreement was considered almost perfect for coefficients >0.80 and substantial for coefficients of 0.61 to 0.80. In patients with both PTA and OA, intraobserver reliability and interobserver reliability were substantial, with no difference between classification systems. There were no significant differences in intraobserver or interobserver reliability between attending physicians and trainees for either classification system (all P > .10). The presence of fracture implants did not affect reliability in the BM system but did substantially worsen reliability in the HR system (intraobserver P = .04 and interobserver P = .001). The BM and HR classifications both showed substantial intraobserver and interobserver reliability for PTA and OA. Training level differences did not affect reliability for either system. Both trainees and fellowship-trained surgeons may easily and reliably apply each classification system to the evaluation of primary elbow OA and PTA, although the HR system was less reliable in the presence of fracture implants. Copyright © 2015 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Reliability of joint count assessment in rheumatoid arthritis: a systematic literature review.

PubMed

Cheung, Peter P; Gossec, Laure; Mak, Anselm; March, Lyn

2014-06-01

Joint counts are central to the assessment of rheumatoid arthritis (RA) but reliability is an issue. To evaluate the reliability and agreement of joint counts (intra-observer and inter-observer) by health care professionals (physicians, nurses, and metrologists) and patients in RA, and the impact of training and standardization on joint count reliability through a systematic literature review. Articles reporting joint count reliability or agreement in RA in PubMed, EMBase, and the Cochrane library between 1960 and 2012 were selected. Data were extracted regarding tender joint counts (TJCs) and swollen joint counts (SJCs) derived by physicians, metrologists, or patients for intra-observer and inter-observer reliability. In addition, methods and effects of training or standardization were extracted. Statistics expressing reliability such as intraclass correlation coefficients (ICCs) were extracted. Data analysis was primarily descriptive due to high heterogeneity. Twenty-eight studies on health care professionals (HCP) and 20 studies on patients were included. Intra-observer reliability for TJCs and SJCs was good for HCPs and patients (range of ICC: 0.49-0.98). Inter-observer reliability between HCPs for TJCs was higher than for SJCs (range of ICC: 0.64-0.88 vs. 0.29-0.98). Patient inter-observer reliability with HCPs as comparators was better for TJCs (range of ICC: 0.31-0.91) compared to SJCs (0.16-0.64). Nine studies (7 with HCPs and 2 with patients) evaluated consensus or training, with improvement in reliability of TJCs but conflicting evidence for SJCs. Intra- and inter-observer reliability was high for TJCs for HCPs and patients: among all groups, reliability was better for TJCs than SJCs. Inter-observer reliability of SJCs was poorer for patients than HCPs. Data were inconclusive regarding the potential for training to improve SJC reliability. Overall, the results support further evaluation for patient-reported joint counts as an outcome measure. 
© 2013 Published by Elsevier Inc.
Training improves interobserver reliability for the diagnosis of scaphoid fracture displacement.

PubMed

Buijze, Geert A; Guitton, Thierry G; van Dijk, C Niek; Ring, David

2012-07-01

The diagnosis of displacement in scaphoid fractures is notorious for poor interobserver reliability. We tested whether training can improve interobserver reliability and sensitivity, specificity, and accuracy for the diagnosis of scaphoid fracture displacement on radiographs and CT scans. Sixty-four orthopaedic surgeons rated a set of radiographs and CT scans of 10 displaced and 10 nondisplaced scaphoid fractures for the presence of displacement, using a web-based rating application. Before rating, observers were randomized to a training group (34 observers) and a nontraining group (30 observers). The training group received an online training module before the rating session, and the nontraining group did not. Interobserver reliability for training and nontraining was assessed by Siegel's multirater kappa and the Z-test was used to test for significance. There was a small, but significant difference in the interobserver reliability for displacement ratings in favor of the training group compared with the nontraining group. Ratings of radiographs and CT scans combined resulted in moderate agreement for both groups. The average sensitivity, specificity, and accuracy of diagnosing displacement of scaphoid fractures were, respectively, 83%, 85%, and 84% for the nontraining group and 87%, 86%, and 87% for the training group. Assuming a 5% prevalence of fracture displacement, the positive predictive value was 0.23 in the nontraining group and 0.25 in the training group. The negative predictive value was 0.99 in both groups. Our results suggest training can improve interobserver reliability and sensitivity, specificity and accuracy for the diagnosis of scaphoid fracture displacement, but the improvements are slight. These findings are encouraging for future research regarding interobserver variation and how to reduce it further.
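The predictive values quoted in the abstract above follow from Bayes' rule applied to the reported average sensitivity, specificity, and the assumed 5% prevalence. The sketch below reproduces that arithmetic; the function name `predictive_values` is an illustrative choice, and the inputs are the rounded percentages from the abstract, so the results match only to two decimals.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Values reported in the abstract: nontraining 83%/85%, training 87%/86%, prevalence 5%.
ppv_nontraining, npv_nontraining = predictive_values(0.83, 0.85, 0.05)
ppv_training, npv_training = predictive_values(0.87, 0.86, 0.05)

print(round(ppv_nontraining, 2), round(npv_nontraining, 2))  # 0.23 0.99
print(round(ppv_training, 2), round(npv_training, 2))        # 0.25 0.99
```

The low positive predictive value despite sensitivity and specificity above 80% is a consequence of the low assumed prevalence: most positive ratings come from the much larger nondisplaced group.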
Does the Modified Gartland Classification Clarify Decision Making?

PubMed

Leung, Sophia; Paryavi, Ebrahim; Herman, Martin J; Sponseller, Paul D; Abzug, Joshua M

2018-01-01

The modified Gartland classification system for pediatric supracondylar fractures is often utilized as a communication tool to aid in determining whether or not a fracture warrants operative intervention. This study sought to determine the interobserver and intraobserver reliability of the Gartland classification system, as well as to determine whether there was agreement that a fracture warranted operative intervention regardless of the classification system. A total of 200 anteroposterior and lateral radiographs of pediatric supracondylar humerus fractures were retrospectively reviewed by 3 fellowship-trained pediatric orthopaedic surgeons and 2 orthopaedic residents and then classified as type I, IIa, IIb, or III. The surgeons then recorded whether they would treat the fracture nonoperatively or operatively. The κ coefficients were calculated to determine interobserver and intraobserver reliability. Overall, the Wilkins-modified Gartland classification has low-moderate interobserver reliability (κ=0.475) and high intraobserver reliability (κ=0.777). A low interobserver reliability was found when differentiating between type IIa and IIb (κ=0.240) among attendings. There was moderate-high interobserver reliability for the decision to operate (κ=0.691) and high intraobserver reliability (κ=0.760). Decreased interobserver reliability was present for decision to operate among residents. For fractures classified as type I, the decision to operate was made 3% of the time and 27% for type IIa. The decision was made to operate 99% of the time for type IIb and 100% for type III. There is almost full agreement for the nonoperative treatment of Type I fractures and operative treatment for type III fractures. There is agreement that type IIb fractures should be treated operatively and that the majority of type IIa fractures should be treated nonoperatively. However, the interobserver reliability for differentiating between type IIa and IIb fractures is low. 
Our results validate the Gartland classification system as a method to help direct treatment of pediatric supracondylar humerus fractures, although the modification of the system, IIa versus IIb, seems to have limited reliability and utility. Terminology based on decision to treat may lead to a more clinically useful classification system in the evaluation and treatment of pediatric supracondylar humerus fractures. Level III-diagnostic studies.
Reliability analysis for digital adolescent idiopathic scoliosis measurements.

PubMed

Kuklo, Timothy R; Potter, Benjamin K; O'Brien, Michael F; Schroeder, Teresa M; Lenke, Lawrence G; Polly, David W

2005-04-01

Analysis of adolescent idiopathic scoliosis (AIS) requires a thorough clinical and radiographic evaluation to completely assess the three-dimensional deformity. Recently, these radiographic parameters have been analyzed for reliability and reproducibility following manual measurements; however, most of these parameters have not been analyzed with regard to digital measurements. The purpose of this study is to determine the intra- and interobserver reliability of common scoliosis radiographic parameters using a digital software measurement program. Thirty sets of preoperative (posteroanterior [PA], lateral, and side-bending [SB]) and postoperative (PA and lateral) radiographs were analyzed by three independent observers on two separate occasions using a software measurement program (PhDx, Albuquerque, NM). Coronal measures included main thoracic (MT) and thoracolumbar-lumbar (TL/L) Cobb, SB MT Cobb, MT and TL/L apical vertical translation (AVT), C7 to center sacral vertical line (CSVL), T1 tilt, LIV tilt, disk below lowest instrumented vertebra (LIV), coronal balance, and Risser, whereas sagittal measures included T2-T5, T5-T12, T2-T12, T10-L2, T12-S1, and sagittal balance. Analysis of variance for repeated measures or Cohen three-way kappa correlation coefficient analysis was performed as appropriate to calculate the intra- and interobserver reliability for each parameter. The majority of the radiographic parameters assessed demonstrated good or excellent intra- and interobserver reliability. 
The relationship of the LIV to the CSVL (intraobserver kappa = 0.48-0.78, fair to excellent; interobserver kappa = 0.34-0.41, fair to poor), interobserver measurement of AVT (rho = 0.49-0.73, low to good), Risser grade (intraobserver rho = 0.41-0.97, low to excellent; interobserver rho = 0.60-0.70, fair to good), intraobserver measurement of the angulation of the disk inferior to the LIV (rho = 0.53-0.88, fair to good), apical Nash-Moe vertebral rotation (intraobserver rho = 0.50-0.85, fair to good; interobserver rho = 0.53-0.59, fair), and especially regional thoracic kyphosis from T2 to T5 (intraobserver rho = 0.22-0.65, poor to fair; interobserver rho = 0.33-0.47, low) demonstrated lesser reliability. In general, preoperative measures demonstrated greater reliability than postoperative measures, and coronal angular measures were more reliable than sagittal measures. Most common radiographic parameters for AIS assessment demonstrated good or excellent reliability for digital measurement and can be recommended for routine clinical and academic use. Preoperative assessments and coronal measures may be more reliable than postoperative and sagittal measurements. The reliability of digital measurements will be increasingly important as digital radiographic viewing becomes commonplace.
Proximal humeral fracture classification systems revisited.

PubMed

Majed, Addie; Macleod, Iain; Bull, Anthony M J; Zyto, Karol; Resch, Herbert; Hertel, Ralph; Reilly, Peter; Emery, Roger J H

2011-10-01

This study evaluated several classification systems and expert surgeons' anatomic understanding of these complex injuries based on a consecutive series of patients. We hypothesized that current proximal humeral fracture classification systems, regardless of imaging methods, are not sufficiently reliable to aid clinical management of these injuries. Complex fractures in 96 consecutive patients were investigated by generation of rapid sequence prototyping models from computed tomography Digital Imaging and Communications in Medicine (DICOM) imaging data. Four independent senior observers were asked to classify each model using 4 classification systems: Neer, AO, Codman-Hertel, and a prototype classification system by Resch. Interobserver and intraobserver κ coefficient values were calculated for the overall classification system and for selected classification items. The κ coefficient values for the interobserver reliability were 0.33 for Neer, 0.11 for AO, 0.44 for Codman-Hertel, and 0.15 for Resch. Interobserver reliability κ coefficient values were 0.32 for the number of fragments and 0.30 for the anatomic segment involved using the Neer system, 0.30 for the AO type (A, B, C), and 0.53, 0.48, and 0.08 for the Resch impaction/distraction, varus/valgus and flexion/extension subgroups, respectively. Three-part fractures showed low reliability for the Neer and AO systems. Currently available evidence suggests that the fracture classifications in use have poor intra- and inter-observer reliability regardless of the imaging modality used, which makes treating these injuries difficult and also affects scientific research. This study was undertaken to evaluate the reliability of several systems using rapid sequence prototype models. Overall interobserver κ values represented slight to moderate agreement. The most reliable interobserver scores were found with the Codman-Hertel classification, followed by elements of Resch's trial system. The AO system had the lowest values. 
The higher interobserver reliability values for the Codman-Hertel system showed that it is the only comprehensive fracture description studied, whereas the novel classification by Resch showed clear definition in respect to varus/valgus and impaction/distraction angulation. Copyright © 2011 Journal of Shoulder and Elbow Surgery Board of Trustees. All rights reserved.
Development and validation of an objective instrument to measure surgical performance at tonsillectomy.

PubMed

Roberson, David W; Kentala, Erna; Forbes, Peter

2005-12-01

The goals of this project were 1) to develop and validate an objective instrument to measure surgical performance at tonsillectomy, 2) to assess its interobserver and interobservation reliability and construct validity, and 3) to select those items with best reliability and most independent information to design a simplified form suitable for routine use in otolaryngology surgical evaluation. Prospective, observational data collection for an educational quality improvement project. The evaluation instrument was based on previous instruments developed in general surgery with input from attending otolaryngologic surgeons and experts in medical education. It was pilot tested and subjected to iterative improvements. After the instrument was finalized, a total of 55 tonsillectomies were observed and scored during academic year 2002 to 2003: 45 cases by residents at different points during their rotation, 5 by fellows, and 5 by faculty. Results were assessed for interobserver reliability, interobservation reliability, and construct validity. Factor analysis was used to identify items with independent information. Interobserver and interobservation reliability was high. On technical items, faculty substantially outperformed fellows, who in turn outperformed residents (P < .0001 for both comparisons). On the "global" scale (overall assessment), residents improved an average of 1 full point (on a 5 point scale) during a 3 month rotation (P = .01). In the subscale of "patient care," results were less clear cut: fellows outperformed residents, who in turn outperformed faculty, but only the fellows to faculty comparison was statistically significant (P = .04), and residents did not clearly improve over time (P = .36). Factor analysis demonstrated that technical items and patient care items factor separately and thus represent separate skill domains in surgery. 
It is possible to objectively measure surgical skill at tonsillectomy with high reliability and good construct validity. Factor analysis demonstrated that patient care is a distinct domain in surgical skill. Although the interobserver reliability for some patient care items reached statistical significance, it was not high enough for "high stakes testing" purposes. Using reliability and factor analysis results, we propose a simplified instrument for use in evaluating trainees in otolaryngologic surgery.
Evaluating Random Error in Clinician-Administered Surveys: Theoretical Considerations and Clinical Applications of Interobserver Reliability and Agreement.

PubMed

Bennett, Rebecca J; Taljaard, Dunay S; Olaithe, Michelle; Brennan-Jones, Chris; Eikelboom, Robert H

2017-09-18

The purpose of this study is to raise awareness of interobserver concordance and the differences between interobserver reliability and agreement when evaluating the responsiveness of a clinician-administered survey and, specifically, to demonstrate the clinical implications of data types (nominal/categorical, ordinal, interval, or ratio) and statistical index selection (for example, Cohen's kappa, Krippendorff's alpha, or intraclass correlation). In this prospective cohort study, 3 clinical audiologists, who were masked to each other's scores, administered the Practical Hearing Aid Skills Test-Revised to 18 adult owners of hearing aids. Interobserver concordance was examined using a range of reliability and agreement statistical indices. The importance of selecting statistical measures of concordance was demonstrated with a worked example, wherein the level of interobserver concordance achieved varied from "no agreement" to "almost perfect agreement" depending on data types and statistical index selected. This study demonstrates that the methodology used to evaluate survey score concordance can influence the statistical results obtained and thus affect clinical interpretations.
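To illustrate how the choice of index can change the apparent level of concordance, the sketch below compares raw percent agreement with Cohen's kappa for two raters on nominal data. The pass/fail ratings are invented for illustration and are not the study's data; the function names are likewise hypothetical.

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of items on which the two raters give the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa for two raters on nominal categories.

    Undefined (division by zero) if chance agreement equals 1, i.e. both
    raters use a single identical category for every item.
    """
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Invented ratings for 18 items; most items are "pass" for both raters.
rater_a = ["pass"] * 16 + ["fail"] * 2
rater_b = ["pass"] * 15 + ["fail", "pass", "fail"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
print(round(agreement, 3), round(kappa, 4))  # 0.889 0.4375
```

With these made-up ratings the raters agree on 16 of 18 items, yet kappa is only moderate because most of that agreement is expected by chance when one category dominates.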
Inter-Observer, Intra-Observer and Intra-Individual Reliability of Uroflowmetry Tests in Aged Men: A Generalizability Theory Approach.

PubMed

Liu, Ying-Buh; Yang, Stephen S; Hsieh, Cheng-Hsing; Lin, Chia-Da; Chang, Shang-Jen

2014-05-01

To evaluate the inter-observer, intra-observer and intra-individual reliability of uroflowmetry and post-void residual urine (PVR) tests in adult men. Healthy volunteers aged over 40 years were enrolled. Every participant underwent two sets of uroflowmetry and PVR tests with a 2-week interval between the tests. The uroflowmetry tests were interpreted by four urologists independently. Uroflowmetry curves were classified as bell-shaped, bell-shaped with tail, obstructive, restrictive, staccato, interrupted and tower-shaped and scored from 1 (highly abnormal) to 5 (absolutely normal). The agreements between the observers, interpretations and tests within individuals were analyzed using kappa statistics and intraclass correlation coefficients. Generalizability theory with decision analysis was used to determine how many observers, tests, and interpretations were needed to obtain an acceptable reliability (> 0.80). Of 108 volunteers, we randomly selected the uroflowmetry results from 25 participants for the evaluation of reliability. The mean age of the studied adults was 55.3 years. The intra-individual and intra-observer reliability on uroflowmetry tests ranged from good to very good. However, the inter-observer reliability on normalcy and specific type of flow pattern were relatively lower. In generalizability theory, three observers were needed to obtain an acceptable reliability on normalcy of uroflow pattern if the patient underwent uroflowmetry tests twice with one observation. The intra-individual and intra-observer reliability on uroflowmetry tests were good while the inter-observer reliability was relatively lower. To improve inter-observer reliability, the definition of uroflowmetry should be clarified by the International Continence Society. © 2013 Wiley Publishing Asia Pty Ltd.
Measurement of the center edge angle and determination of the Severin classification using digital radiography, computer-assisted measurement tools, and a Severin algorithm: intraobserver and interobserver reliability revisited.

PubMed

Carroll, Kristen L; Murray, Kathleen A; MacLeod, Lynne M; Hennessey, Theresa A; Woiczik, Marcella R; Roach, James W

2011-06-01

Numerous studies underscore the poor intraobserver and interobserver reliability of both the center edge angle (CEA) and the Severin classification using plain film measurements. In this study, experienced observers applied a computer-assisted measurement program to determine the CEA in digital pelvic radiographs of adults who had been previously treated for dysplasia of the hip (DDH). Using a teaching aid/algorithm of the Severin classification, the observers then assigned a Severin rating to these hips. Intraobserver and interobserver errors were then calculated on both the CEA measurements and the Severin classifications. Four pediatric orthopaedic surgeons and 1 pediatric radiologist calculated the CEAs using the OrthoView™ planning system and then determined the Severin classification on 41 blinded digital pelvic radiographs. The radiographs were evaluated by each examiner twice, with evaluations separated by 2 months. All examiners reviewed a Severin classification algorithm before making their Severin assignments. The intraobserver and interobserver reliability for both the CEA and the Severin classification were calculated using the intraclass correlation coefficients and Cohen and Fleiss κ scores, respectively. The intraobserver and interobserver reliability for CEA measurement was moderate to almost perfect. When we separated the Severin classification into 3 clinically relevant groups of good (Severin I and II), dysplastic (Severin III), and poor (Severin IV and above), our interobserver reliability neared almost perfect. The Severin classification is an extremely useful and oft-used radiographic measure for the success of DDH treatment. Our research found digital radiography, computer-aided measurement tools, the use of a Severin algorithm, and separating the Severin classification into 3 clinically relevant groups significantly increased the intraobserver and interobserver reliability of both the CEA and Severin classification. 
This finding will assist future studies using the CEA and Severin classification in the radiographic assessment of DDH treatment outcomes.
Reliability and criterion validity of an observation protocol for working technique assessments in cash register work.

PubMed

Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina

2016-06-01

We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observations by one observer, such as often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend the protocol to be used for educational purposes only.
Cardiac valve calcifications on low-dose unenhanced ungated chest computed tomography: inter-observer and inter-examination reliability, agreement and variability.

PubMed

van Hamersvelt, Robbert W; Willemink, Martin J; Takx, Richard A P; Eikendal, Anouk L M; Budde, Ricardo P J; Leiner, Tim; Mol, Christian P; Isgum, Ivana; de Jong, Pim A

2014-07-01

To determine inter-observer and inter-examination variability for aortic valve calcification (AVC) and mitral valve and annulus calcification (MC) in low-dose unenhanced ungated lung cancer screening chest computed tomography (CT). We included 578 lung cancer screening trial participants who were examined by CT twice within 3 months to follow indeterminate pulmonary nodules. On these CTs, AVC and MC were measured in cubic millimetres. One hundred CTs were examined by five observers to determine the inter-observer variability. Reliability was assessed by kappa statistics (κ) and intra-class correlation coefficients (ICCs). Variability was expressed as the mean difference ± standard deviation (SD). Inter-examination reliability was excellent for AVC (κ = 0.94, ICC = 0.96) and MC (κ = 0.95, ICC = 0.90). Inter-examination variability was 12.7 ± 118.2 mm³ for AVC and 31.5 ± 219.2 mm³ for MC. Inter-observer reliability ranged from κ = 0.68 to κ = 0.92 for AVC and from κ = 0.20 to κ = 0.66 for MC. Inter-observer ICC was 0.94 for AVC and ranged from 0.56 to 0.97 for MC. Inter-observer variability ranged from -30.5 ± 252.0 mm³ to 84.0 ± 240.5 mm³ for AVC and from -95.2 ± 210.0 mm³ to 303.7 ± 501.6 mm³ for MC. AVC can be quantified with excellent reliability on ungated unenhanced low-dose chest CT, but manual detection of MC can be subject to substantial inter-observer variability. Lung cancer screening CT may be used for detection and quantification of cardiac valve calcifications. • Low-dose unenhanced ungated chest computed tomography can detect cardiac valve calcifications. • However, calcified cardiac valves are not reported by most radiologists. • Inter-observer and inter-examination variability of aortic valve calcifications is sufficient for longitudinal studies. • Volumetric measurement variability of mitral valve and annulus calcifications is substantial.
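The abstract above expresses variability as the mean difference ± SD between paired measurements. A minimal sketch of that summary follows, assuming the difference is taken as second measurement minus first and that the SD is the sample (n - 1) standard deviation. Neither convention is stated in the abstract, and the volumes are invented.

```python
from statistics import fmean, stdev

def mean_difference_sd(first, second):
    """Mean and sample SD of paired differences (second minus first)."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    return fmean(diffs), stdev(diffs)

# Invented calcification volumes in mm³ from two examinations of the same participants.
exam_1 = [0.0, 10.0, 20.0]
exam_2 = [2.0, 10.0, 24.0]

mean_diff, sd_diff = mean_difference_sd(exam_1, exam_2)
print(f"{mean_diff:.1f} ± {sd_diff:.1f} mm³")
```

A mean difference near zero with a large SD, as reported in the abstract, indicates little systematic bias but wide scatter between individual paired measurements.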
The reliability of four widely used patellar height ratios.

PubMed

van Duijvenbode, Dennis; Stavenuiter, Michel; Burger, Bart; van Dijke, Cees; Spermon, Jacco; Hoozemans, Marco

2016-03-01

The objective of this study was to evaluate the inter-observer reliability and the intra-observer reliability of four patellar height ratios: Insall-Salvati (IS), modified Insall-Salvati (MIS), Blackburne-Peel (BP) and Caton-Deschamps (CD). The patellar height ratios were assessed by four independent examiners using weight-bearing lateral knee radiographs in 30° flexion. Intra-class correlation coefficients and Fleiss' kappas were determined. The inter-observer reliability was excellent for the IS and moderate for the other ratios. When the ratio values were categorized, the inter-observer reliability was strong for the IS, moderate for the MIS and BP, and poor for the CD. The intra-observer reliability was excellent for the IS, MIS and CD, and strong for the BP. When the ratio values were categorized, the intra-observer reliability was strong for the IS and MIS, and moderate for the other ratios. Although the IS showed the best reliability, we advise using the MIS, as it showed the second-best reliability but is, according to the literature, associated with better validity.
The sizing of hamstring grafts for anterior cruciate reconstruction: intra- and inter-observer reliability.

PubMed

Dwyer, Tim; Whelan, Daniel B; Khoshbin, Amir; Wasserstein, David; Dold, Andrew; Chahal, Jaskarndip; Nauth, Aaron; Murnaghan, M Lucas; Ogilvie-Harris, Darrell J; Theodoropoulos, John S

2015-04-01

The objective of this study was to establish the intra- and inter-observer reliability of hamstring graft measurement using cylindrical sizing tubes. Hamstring tendons (gracilis and semitendinosus) were harvested from ten cadavers by a single surgeon and whip stitched together to create ten 4-strand hamstring grafts. Ten sports medicine surgeons and fellows sized each graft independently using either hollow cylindrical sizers or block sizers in 0.5-mm increments; the sizing technique used was applied consistently to each graft. Surgeons moved sequentially from graft to graft and measured each hamstring graft twice. Surgeons were asked to state the measured proximal (femoral) and distal (tibial) diameter of each graft, as well as the diameter of the tibial and femoral tunnels that they would drill if performing an anterior cruciate ligament (ACL) reconstruction using that graft. Reliability was established using intra-class correlation coefficients. Overall, both the inter-observer and intra-observer agreement were >0.9, demonstrating excellent reliability. The inter-observer reliability for drill sizes was also excellent (>0.9). Excellent correlation was seen between cylindrical sizing and drill sizes (>0.9). Sizing of hamstring grafts by multiple surgeons demonstrated excellent intra-observer and inter-observer reliability, potentially validating clinical studies exploring ACL reconstruction outcomes by hamstring graft diameter when standard techniques are used. III.
Assessment of the intraobserver and interobserver reliability of a communicating vessels volumeter to measure wrist-hand volume.

PubMed

de Carvalho, Rogério Mendonca; Perez, Maria Del Carmen Janerio; Miranda, Fausto

2012-10-01

Traditional volumetry based on Archimedes' principle is the gold standard for the measurement of limb volume, but the routine use of this technique is discouraged because of several disadvantages. The purpose of this study was to evaluate intraobserver and interobserver reliability of direct measurements of wrist-hand volume using a new communicating vessels volumeter based on Pascal's law. A reliability study was conducted. To evaluate the reliability of the communicating vessels volumeter in generating measurements, 30 hands of 15 participants (9 women, 6 men) were measured 3 times each by 3 observers, totaling 270 volumetric results. Measurement time was short (mean = 3 minutes 42 seconds). The intraclass correlation coefficient (ICC) was .9977 for observer 1 and .9976 for observers 2 and 3. The interobserver ICC was .9998. The standard error of measurement was about 3 mL for all observers; the interobserver result was 1 mL. The interrater coefficient of variation (CV) was 1.15% for the series of 9 measurements collected for each segment; the intrarater CV was 1.20%. Limitations: No swollen hands were measured, and measurements were not compared with the gold standard technique. Thus, accuracy of the new volumeter was not determined in this study. A new device has been developed for plethysmography of the extremities, and the results of its use to measure the volume of the wrist-hand segment were reliable in both intraobserver and interobserver analyses.
Intra- and inter-observer reliability of ten major histological scoring systems used for the evaluation of in vivo cartilage repair.

PubMed

Bonasia, Davide Edoardo; Marmotti, Antongiulio; Massa, Alessandro Domenico Felice; Ferro, Andrea; Blonna, Davide; Castoldi, Filippo; Rossi, Roberto

2015-09-01

In the last two decades, many surgical techniques have been described for articular cartilage repair. Reliable histological scoring systems are fundamental tools to evaluate new procedures. Several histological scoring systems have been described, and these can be divided into elementary and comprehensive scores, according to the number of sub-items. The aim of this study was to test the inter- and intra-observer reliability of ten main scores used for the histological evaluation of in vivo cartilage repair. The authors tested the starting hypothesis that elementary scores would show superior intra- and inter-observer reliability compared with comprehensive scores. Fifty histological sections obtained from the trochlea of New Zealand rabbits and stained with Safranin-O fast green were used. The histological sections were analysed by 4 observers: 2 experienced in cartilage histology and 2 inexperienced. Histological evaluations were performed at time 1 and time 2, separated by a 30-day interval. The following scores were used: Mankin, O'Driscoll, Pineda, Wakitani, Fortier, Sellers, ICRS, ICRSII, Oswestry (OsScore) and modified O'Driscoll. Intra- and inter-observer reliability were evaluated for each score. In addition, the floor-ceiling effect and the Bland-Altman Coefficient of Repeatability were evaluated for each sub-item of every score. Intra-observer reliability was high for all observers in every score, even though the reliability was significantly lower for non-expert observers compared with expert counterparts. In terms of Coefficient of Repeatability, some scores performed better (O'Driscoll, Modified O'Driscoll and ICRSII) than others (Fortier, Sellers). Inter-observer reliability was high for all observers in every score, but significantly lower for non-expert compared with expert observers. In expert hands, all the scores showed high intra- and inter-observer reliability, independently of the complexity. Although every score has advantages and disadvantages, ICRSII, O'Driscoll and Modified O'Driscoll scores should be preferred for the evaluation of in vivo cartilage repair in animal models.
Inter- and intraobserver reliability assessment of the axial trunk rotation: manual versus smartphone-aided measurement tools.

PubMed

Qiao, Jun; Xu, Leilei; Zhu, Zezhang; Zhu, Feng; Liu, Zhen; Qian, Bangping; Qiu, Yong

2014-10-11

Scoliogauge has been developed for the measurement of axial trunk rotation (ATR) on iPhone smartphones. This study aimed to evaluate the reliability of the smartphone-aided ATR measurement method and to compare it with that of the manual method. Sixty-four AIS patients with a single thoracic or lumbar curve participated in this study. Of these patients, thirty-two had main thoracic scoliosis and the other thirty-two had main thoracolumbar/lumbar scoliosis. Two spine surgeons performed the measurements with Scoliometer and Scoliogauge. The Scoliogauge measurements were conducted on an iPhone 4 smartphone. The intraclass correlation coefficient (ICC), 2-way mixed model on absolute agreement, was used to analyze the reliability, categorized according to region (thoracic or lumbar) and Cobb angle (<20 degrees and >40 degrees). An ICC < 0.40 was considered poor, 0.40-0.59 fair, 0.60-0.74 good, and 0.75-1.00 excellent. The overall intraobserver ICC was 0.954 and the overall interobserver ICC was 0.943 for the scoliometer set, whereas the intraobserver ICC was 0.965 and the interobserver ICC was 0.964 for the scoliogauge set. Both the intraobserver and interobserver ICCs reached the excellent range in the 2 sets for both observers. The mean Cobb angle of thoracic curves in patients with main thoracic scoliosis was similar to that of lumbar curves in those with main thoracolumbar/lumbar scoliosis (35.7 degrees vs. 36.1 degrees). The intraobserver and interobserver reliability was similar between the two groups (thoracic vs. lumbar) in the 2 sets. Twenty-one patients had Cobb angles <20 degrees, and 20 patients had Cobb angles >40 degrees. The intraobserver and interobserver reliability was better in the severe curve (>40 degrees) group. Smartphone-aided measurement of ATR showed excellent reliability, and the reliability of measurement with either scoliometer or scoliogauge could be influenced by Cobb angle, with better reliability for curves with larger Cobb angles.
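The ICC bands quoted in this abstract (poor, fair, good, excellent) can be written as a small lookup. The following is a minimal sketch, not the authors' code; the function name `classify_icc` is hypothetical, and treating negative estimates as "poor" is an assumption.

```python
def classify_icc(icc: float) -> str:
    """Map an ICC estimate to the qualitative bands quoted in the abstract.

    Bands: < 0.40 poor, 0.40-0.59 fair, 0.60-0.74 good, 0.75-1.00 excellent.
    Negative estimates are treated as "poor" (an assumption of this sketch).
    """
    if icc > 1.0:
        raise ValueError("an ICC cannot exceed 1.0")
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# The four overall ICCs reported in the abstract.
reported = {
    "scoliometer intraobserver": 0.954,
    "scoliometer interobserver": 0.943,
    "scoliogauge intraobserver": 0.965,
    "scoliogauge interobserver": 0.964,
}
labels = {name: classify_icc(value) for name, value in reported.items()}
```

All four reported values fall in the top band, which matches the abstract's statement that both sets reached the excellent range.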
Expert Reliability for the World Health Organization Standardized Ultrasound Classification of Cystic Echinococcosis

PubMed Central

Solomon, Nadia; Fields, Paul J.; Tamarozzi, Francesca; Brunetti, Enrico; Macpherson, Calum N. L.

2017-01-01

Cystic echinococcosis (CE), a parasitic zoonosis, results in cyst formation in the viscera. Cyst morphology depends on developmental stage. In 2003, the World Health Organization (WHO) published a standardized ultrasound (US) classification for CE, for use among experts as a standard of comparison. This study examined the reliability of this classification. Eleven international CE and US experts completed an assessment of eight WHO classification images and 88 test images representing cyst stages. Inter- and intraobserver reliability and observer performance were assessed using Fleiss' and Cohen's kappa. Interobserver reliability was moderate for WHO images (κ = 0.600, P < 0.0001) and substantial for test images (κ = 0.644, P < 0.0001), with substantial to almost perfect interobserver reliability for stages with pathognomonic signs (CE1, CE2, and CE3) for WHO (0.618 < κ < 0.904) and test images (0.642 < κ < 0.768). Comparisons of expert performances against the majority classification for each image were significant for WHO (0.413 < κ < 1.000, P < 0.005) and test images (0.718 < κ < 0.905, P < 0.0001); and intraobserver reliability was significant for WHO (0.520 < κ < 1.000, P < 0.005) and test images (0.690 < κ < 0.896, P < 0.0001). Findings demonstrate moderate to substantial interobserver and substantial to almost perfect intraobserver reliability for the WHO classification, with substantial to almost perfect interobserver reliability for pathognomonic stages. This confirms experts' abilities to reliably identify WHO-defined pathognomonic signs of CE, demonstrating that the WHO classification provides a reproducible way of staging CE. PMID:28070008
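For readers unfamiliar with the statistic, Fleiss' kappa for several raters can be computed from a table of category counts per image. This is a minimal pure-Python sketch of the standard formula, not the study's analysis; the function name and the toy count tables are illustrative assumptions.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j. Every subject must be rated by the
    same number of raters."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    # Observed agreement: mean over subjects of the pairwise agreement P_i.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects

    # Chance agreement from the overall proportion of ratings per category.
    n_categories = len(counts[0])
    p_e = sum(
        (sum(row[j] for row in counts) / (n_subjects * n_raters)) ** 2
        for j in range(n_categories)
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data (hypothetical): 3 raters, 2 categories, 2 images.
perfect = fleiss_kappa([[3, 0], [0, 3]])
split = fleiss_kappa([[2, 1], [1, 2]])
```

With the toy tables, complete agreement gives a kappa of 1, while the split ratings give a value below zero, meaning agreement worse than chance.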
Precision of lumbar intervertebral measurements: does a computer-assisted technique improve reliability?

PubMed

Pearson, Adam M; Spratt, Kevin F; Genuario, James; McGough, William; Kosman, Katherine; Lurie, Jon; Sengupta, Dilip K

2011-04-01

Comparison of intra- and interobserver reliability of digitized manual and computer-assisted intervertebral motion measurements and classification of "instability." To determine if computer-assisted measurement of lumbar intervertebral motion on flexion-extension radiographs improves reliability compared with digitized manual measurements. Many studies have questioned the reliability of manual intervertebral measurements, although few have compared the reliability of computer-assisted and manual measurements on lumbar flexion-extension radiographs. Intervertebral rotation, anterior-posterior (AP) translation, and change in anterior and posterior disc height were measured with a digitized manual technique by three physicians and by three other observers using computer-assisted quantitative motion analysis (QMA) software. Each observer measured 30 sets of digital flexion-extension radiographs (L1-S1) twice. Shrout-Fleiss intraclass correlation coefficients for intra- and interobserver reliabilities were computed. The stability of each level was also classified (instability defined as >4 mm AP translation or 10° rotation), and the intra- and interobserver reliabilities of the two methods were compared using adjusted percent agreement (APA). Intraobserver reliability intraclass correlation coefficients were substantially higher for the QMA technique than the digitized manual technique across all measurements: rotation 0.997 versus 0.870, AP translation 0.959 versus 0.557, change in anterior disc height 0.962 versus 0.770, and change in posterior disc height 0.951 versus 0.283. The same pattern was observed for interobserver reliability (rotation 0.962 vs. 0.693, AP translation 0.862 vs. 0.151, change in anterior disc height 0.862 vs. 0.373, and change in posterior disc height 0.730 vs. 0.300). The QMA technique was also more reliable for the classification of "instability." Intraobserver APAs ranged from 87% to 97% for QMA versus 60% to 73% for digitized manual measurements, while interobserver APAs ranged from 91% to 96% for QMA versus 57% to 63% for digitized manual measurements. The use of QMA software substantially improved the reliability of lumbar intervertebral measurements and the classification of instability based on flexion-extension radiographs.
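The instability rule stated in this abstract, and a simple agreement rate between two observers, can be sketched as follows. This is an illustration only: it assumes the rotation threshold is also a strict "greater than", it computes raw percent agreement rather than the adjusted percent agreement (APA) used in the study, and all measurements shown are made up.

```python
def is_unstable(ap_translation_mm: float, rotation_deg: float) -> bool:
    """Instability as quoted in the abstract: >4 mm AP translation or
    (assumed here) >10 degrees of rotation."""
    return ap_translation_mm > 4.0 or rotation_deg > 10.0


def percent_agreement(labels_a, labels_b) -> float:
    """Raw percent agreement between two equally long label sequences.
    This is not the adjusted percent agreement (APA) of the study."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical (AP translation in mm, rotation in degrees) per spinal level.
observer_a = [(4.5, 6.0), (2.0, 11.0), (1.0, 3.0), (3.9, 9.5)]
observer_b = [(4.2, 5.0), (2.5, 9.0), (0.8, 2.0), (3.0, 8.0)]

calls_a = [is_unstable(t, r) for t, r in observer_a]
calls_b = [is_unstable(t, r) for t, r in observer_b]
agreement = percent_agreement(calls_a, calls_b)
```

In the made-up data the observers disagree only on the second level, where one rotation reading crosses the threshold and the other does not, which shows how measurement noise near a cut-off lowers classification agreement.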

Intra- and interobserver reliability of the Eaton classification for trapeziometacarpal arthritis: a systematic review.

PubMed

Berger, Aaron J; Momeni, Arash; Ladd, Amy L

2014-04-01

Trapeziometacarpal, or thumb carpometacarpal (CMC), arthritis is a common problem with a variety of treatment options. Although widely used, the Eaton radiographic staging system for CMC arthritis is of questionable clinical utility, as disease severity does not predictably correlate with symptoms or treatment recommendations. A possible reason for this is that the classification itself may not be reliable, but the literature on this has not, to our knowledge, been systematically reviewed. We therefore performed a systematic review to determine the intra- and interobserver reliability of the Eaton staging system. We systematically reviewed English-language studies published between 1973 and 2013 to assess the degree of intra- and interobserver reliability of the Eaton classification for determining the stage of trapeziometacarpal joint arthritis and pantrapezial arthritis based on plain radiographic imaging. Search engines included: PubMed, Scopus(®), and CINAHL. Four studies, which included a total of 163 patients, met our inclusion criteria and were evaluated. The level of evidence of the studies included in this analysis was determined using the Oxford Centre for Evidence Based Medicine Levels of Evidence Classification by two independent observers. A limited number of studies have been performed to assess intra- and interobserver reliability of the Eaton classification system. The four studies included were determined to be Level 3b. These studies collectively indicate that the Eaton classification demonstrates poor to fair interobserver reliability (kappa values: 0.11-0.56) and fair to moderate intraobserver reliability (kappa values: 0.54-0.657). Review of the literature demonstrates that radiographs assist in the assessment of CMC joint disease, but there is not a reliable system for classification of disease severity. 
Currently, diagnosis and treatment of thumb CMC arthritis are based on the surgeon's qualitative assessment combining history, physical examination, and radiographic evaluation. Inconsistent agreement using the current common radiographic classification system suggests a need for better radiographic tools to quantify disease severity.
The reliability of Cavalier's principle of stereological method in determining volumes of enchondromas using the computerized tomography tools.

PubMed

Acar, Nihat; Karakasli, Ahmet; Karaarslan, Ahmet; Mas, Nermin Ng; Hapa, Onur

2017-01-01

Volumetric measurements of benign tumors enable surgeons to trace volume changes during follow-up periods. For a volumetric measurement technique to be applicable, it should be easy, rapid, and inexpensive and should carry a high interobserver reliability. We aimed to assess the interobserver reliability of a volumetric measurement technique using the Cavalier's principle of stereological methods. The computerized tomography (CT) of 15 patients with a histopathologically confirmed diagnosis of enchondroma with varying tumor sizes and localizations was retrospectively reviewed for interobserver reliability evaluation of the volumetric stereological measurement with the Cavalier's principle, V = t × [(SU × d) / SL]² × ΣP. The volumes of the 15 tumors collected by the observers are demonstrated in Table 1. There was no statistical significance between the first and second observers (p = 0.000 and intraclass correlation coefficient = 0.970) and between the first and third observers (p = 0.000 and intraclass correlation coefficient = 0.981). No statistical significance was detected between the second and third observers (p = 0.000 and intraclass correlation coefficient = 0.976). The Cavalier's principle with the stereological technique using the CT scans is an easy, rapid, and inexpensive technique in volumetric evaluation of enchondromas with a trustable interobserver reliability.
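The point-counting formula in this abstract is a direct product of its terms. The sketch below assumes the usual stereological reading of the symbols, which the abstract itself does not define: t is the section thickness, SU the scale unit of the image, d the distance between grid points, SL the measured length of the scale, and ΣP the total number of grid points falling on the tumor across all sections. The input values are invented for illustration.

```python
def cavalieri_volume(t: float, su: float, d: float, sl: float,
                     total_points: int) -> float:
    """Volume estimate V = t * ((SU * d) / SL)**2 * sum(P).

    Symbol meanings are assumptions of this sketch:
      t            section (slice) thickness
      su           scale unit printed on the image
      d            distance between neighbouring grid points
      sl           measured length of the scale on the image
      total_points grid points hitting the tumor, summed over all sections
    """
    if sl <= 0:
        raise ValueError("scale length must be positive")
    area_per_point = ((su * d) / sl) ** 2  # area represented by one grid point
    return t * area_per_point * total_points


# Illustrative values only: 0.5 cm slices, 0.5 cm grid spacing, 1:1 scale,
# 40 points counted in total.
volume_cm3 = cavalieri_volume(t=0.5, su=1.0, d=0.5, sl=1.0, total_points=40)
```

With these inputs each grid point represents 0.25 cm², so 40 points over 0.5 cm slices give an estimate of 5.0 cm³.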
Scoring ultrasound synovitis in rheumatoid arthritis: a EULAR-OMERACT ultrasound taskforce-Part 2: reliability and application to multiple joints of a standardised consensus-based scoring system

PubMed Central

Terslev, Lene; Naredo, Esperanza; Aegerter, Philippe; Wakefield, Richard J; Backhaus, Marina; Balint, Peter; Bruyn, George A W; Iagnocco, Annamaria; Jousse-Joulin, Sandrine; Schmidt, Wolfgang A; Szkudlarek, Marcin; Conaghan, Philip G; Filippucci, Emilio

2017-01-01

Objectives To test the reliability of new ultrasound (US) definitions and quantification of synovial hypertrophy (SH) and power Doppler (PD) signal, separately and in combination, in a range of joints in patients with rheumatoid arthritis (RA) using the European League Against Rheumatism–Outcome Measures in Rheumatology (EULAR-OMERACT) combined score for PD and SH. Methods A stepwise approach was used: (1) scoring static images of metacarpophalangeal (MCP) joints in a web-based exercise and subsequently when scanning patients; (2) scoring static images of wrist, proximal interphalangeal joints, knee and metatarsophalangeal joints in a web-based exercise and subsequently when scanning patients using different acquisitions (standardised vs usual practice). For reliability, kappa coefficients (κ) were used. Results Scoring MCP joints in static images showed substantial intraobserver variability but good to excellent interobserver reliability. In patients, intraobserver reliability was the same for the two acquisition methods. Interobserver reliability for SH (κ=0.87) and PD (κ=0.79) and the EULAR-OMERACT combined score (κ=0.86) were better when using a ‘standardised’ scan. For the other joints, the intraobserver reliability was excellent in static images for all scores (κ=0.8–0.97) and the interobserver reliability marginally lower. When using standardised scanning in patients, the intraobserver reliability was good (κ=0.64 for SH and the EULAR-OMERACT combined score, 0.66 for PD) and the interobserver reliability was also good, especially for PD (κ range=0.41–0.92). Conclusion The EULAR-OMERACT score demonstrated moderate-good reliability in MCP joints using a standardised scan and is equally applicable in non-MCP joints. This scoring system should underpin improved reliability and consequently the responsiveness of US in RA clinical trials. PMID:28948984
Perme Intensive Care Unit Mobility Score and ICU Mobility Scale: translation into Portuguese and cross-cultural adaptation for use in Brazil

PubMed Central

Kawaguchi, Yurika Maria Fogaça; Nawa, Ricardo Kenji; Figueiredo, Thais Borgheti; Martins, Lourdes; Pires-Neto, Ruy Camargo

2016-01-01

ABSTRACT Objective: To translate the Perme Intensive Care Unit Mobility Score and the ICU Mobility Scale (IMS) into Portuguese, creating versions that are cross-culturally adapted for use in Brazil, and to determine the interobserver agreement and reliability for both versions. Methods: The processes of translation and cross-cultural validation consisted of the following: preparation, translation, reconciliation, synthesis, back-translation, review, approval, and pre-test. The Portuguese-language versions of both instruments were then used by two researchers to evaluate critically ill ICU patients. Weighted kappa statistics and Bland-Altman plots were used in order to verify interobserver agreement for the two instruments. In each of the domains of the instruments, interobserver reliability was evaluated with Cronbach's alpha coefficient. The correlation between the instruments was assessed by Spearman's correlation test. Results: The study sample comprised 103 patients, 56 (54%) of whom were male, with a mean age of 52 ± 18 years. The main reason for ICU admission (in 44%) was respiratory failure. Both instruments showed excellent interobserver agreement (κ > 0.90) and reliability (α > 0.90) in all domains. Interobserver bias was low for the IMS and the Perme Score (−0.048 ± 0.350 and −0.06 ± 0.73, respectively). The 95% CIs for the same instruments ranged from −0.73 to 0.64 and −1.50 to 1.36, respectively. There was also a strong positive correlation between the two instruments (r = 0.941; p < 0.001). Conclusions: In their versions adapted for use in Brazil, both instruments showed high interobserver agreement and reliability. PMID:28117473
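The intervals reported in this abstract appear to correspond to Bland-Altman 95% limits of agreement, that is, the mean interobserver difference plus or minus 1.96 standard deviations. The sketch below is a check of that arithmetic under this assumption and is not the authors' analysis; the function name is hypothetical.

```python
def limits_of_agreement(bias: float, sd: float, z: float = 1.96):
    """Bland-Altman limits of agreement: bias - z*sd and bias + z*sd.

    `bias` is the mean difference between observers and `sd` the standard
    deviation of those differences; z = 1.96 gives the usual 95% limits.
    """
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    return bias - z * sd, bias + z * sd


# Bias and SD as reported in the abstract.
ims_low, ims_high = limits_of_agreement(-0.048, 0.350)
perme_low, perme_high = limits_of_agreement(-0.06, 0.73)
```

For the IMS this reproduces the reported range of −0.73 to 0.64 after rounding. For the Perme Score it gives about −1.49 to 1.37, within 0.01 of the reported −1.50 to 1.36, a gap consistent with the bias and SD having been rounded before publication.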
Reliability testing of the Larsen and Sharp classifications for rheumatoid arthritis of the elbow.

PubMed

Jew, Nicholas B; Hollins, Anthony M; Mauck, Benjamin M; Smith, Richard A; Azar, Frederick M; Miller, Robert H; Throckmorton, Thomas W

2017-01-01

Two popular systems for classifying rheumatoid arthritis affecting the elbow are the Larsen and Sharp schemes. To our knowledge, no study has investigated the reliability of these 2 systems. We compared the intraobserver and interobserver agreement of the 2 systems to determine whether one is more reliable than the other. The radiographs of 45 patients diagnosed with rheumatoid arthritis affecting the elbow were evaluated. Anteroposterior and lateral radiographs were deidentified and distributed to 6 evaluators (4 fellowship-trained upper extremity surgeons and 2 orthopedic trainees). Each evaluator graded all 45 radiographs according to the Larsen and Sharp scoring methods on 2 occasions, at least 2 weeks apart. Overall intraobserver reliability was 0.93 (95% confidence interval [CI], 0.90-0.95) for the Larsen system and 0.92 (95% CI, 0.86-0.96) for the Sharp classification, both indicating substantial agreement. Overall interobserver reliability was 0.70 (95% CI, 0.60-0.80) for the Larsen classification and 0.68 (95% CI, 0.54-0.81) for the Sharp system, both indicating good agreement. There were no significant differences in the intraobserver or interobserver reliability of the systems overall and no significant differences in reliability between attending surgeons and trainees for either classification system. The Larsen and Sharp systems both show substantial intraobserver reliability and good interobserver agreement for the radiographic classification of rheumatoid arthritis affecting the elbow. Differences in training level did not result in substantial variances in reliability for either system. We conclude that both systems can be reliably used to evaluate rheumatoid arthritis of the elbow by observers of varying training levels. Copyright © 2017 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
The postoperative COFAS end-stage ankle arthritis classification system: interobserver and intraobserver reliability.

PubMed

Krause, Fabian G; Di Silvestro, Matthew; Penner, Murray J; Wing, Kevin J; Glazebrook, Mark A; Daniels, Timothy R; Lau, Johnny T C; Younger, Alastair S E

2012-02-01

End-stage ankle arthritis is operatively treated with numerous designs of total ankle replacement and different techniques for ankle fusion. For superior comparison of these procedures, outcome research requires a classification system to stratify patients appropriately. A postoperative 4-type classification system was designed by 6 fellowship-trained foot and ankle surgeons. Four surgeons reviewed blinded patient profiles and radiographs on 2 occasions to determine the interobserver and intraobserver reliability of the classification. Excellent interobserver reliability (κ = .89) and intraobserver reproducibility (κ = .87) were demonstrated for the postoperative classification system. In conclusion, the postoperative Canadian Orthopaedic Foot and Ankle Society (COFAS) end-stage ankle arthritis classification system appears to be a valid tool to evaluate the outcome of patients operated for end-stage ankle arthritis.
Reliability of pelvic floor measurements on three- and four-dimensional ultrasound during and after first pregnancy: implications for training.

PubMed

van Veelen, G A; Schweitzer, K J; van der Vaart, C H

2013-11-01

To evaluate the reliability of measurements of the levator hiatus and levator-urethra gap (LUG) using three/four-dimensional (3D/4D) transperineal ultrasound in women during their first pregnancy and 6 months postpartum, and to assess the learning process for these measurements. An inexperienced observer was taught to perform measurements of the levator hiatus and LUG by an experienced observer. After training, 3D/4D ultrasound volume datasets of 40 women in the first trimester were analyzed by these two observers. Another training session then took place and both observers repeated the analyses of the same volume datasets. Finally, analyses of 40 volume datasets of the women 6 months postpartum were performed by both observers. Intra- and interobserver reliability were determined by intraclass correlation coefficients (ICC) with 95% CIs. For levator hiatal measurements, in the women during their first pregnancy the interobserver reliability was substantial to almost perfect after both the first and second training session (ICC, 0.62-0.83 and 0.71-0.89, respectively, for anteroposterior diameter, transverse diameter and area at rest, on contraction and on Valsalva) and the intraobserver reliability was substantial to almost perfect for both observers. For these measurements performed once the women had delivered, interobserver reliability was moderate to almost perfect. For LUG measurements performed during pregnancy, interobserver reliability was slight to moderate after the first training session (ICC, 0.14-0.54), but improved after the second training session (ICC, 0.38-0.71), and intraobserver reliability was moderate to substantial for the experienced observer and slight to moderate for the inexperienced observer. For these measurements performed when the women had delivered, interobserver reliability was fair to moderate. The levator hiatus and LUG can be measured reliably using 3D/4D ultrasound in primigravid and primiparous women. 
The technique to measure dimensions of the levator hiatus requires limited teaching, but LUG measurements are more difficult and require more extensive training. Copyright © 2013 ISUOG. Published by John Wiley & Sons Ltd.
Reliability of cervical vertebral maturation staging.

PubMed

Rainey, Billie-Jean; Burnside, Girvan; Harrison, Jayne E

2016-07-01

Growth and its prediction are important for the success of many orthodontic treatments. The aim of this study was to determine the reliability of the cervical vertebral maturation (CVM) method for the assessment of mandibular growth. A group of 20 orthodontic clinicians, inexperienced in CVM staging, was trained to use the improved version of the CVM method for the assessment of mandibular growth with a teaching program. They independently assessed 72 consecutive lateral cephalograms, taken at Liverpool University Dental Hospital, on 2 occasions. The cephalograms were presented in 2 different random orders and interspersed with 11 additional images for standardization. The intraobserver and interobserver agreement values were evaluated using the weighted kappa statistic. The intraobserver and interobserver agreement values were substantial (weighted kappa, 0.6-0.8). The overall intraobserver agreement was 0.70 (SE, 0.01), with average agreement of 89%. The interobserver agreement values were 0.68 (SE, 0.03) for phase 1 and 0.66 (SE, 0.03) for phase 2, with average interobserver agreement of 88%. The intraobserver and interobserver agreement values of classifying the vertebral stages with the CVM method were substantial. These findings demonstrate that this method of CVM classification is reproducible and reliable. Copyright © 2016 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
Validation of a novel smartphone accelerometer-based knee goniometer.

PubMed

Ockendon, Matthew; Gilbert, Robin E

2012-09-01

Loss of full knee extension following anterior cruciate ligament surgery has been shown to impair knee function. However, there can be significant difficulties in accurately and reproducibly measuring a fixed flexion of the knee. We studied the interobserver and the intraobserver reliabilities of a novel, smartphone accelerometer-based, knee goniometer and compared it with a long-armed conventional goniometer for the assessment of fixed flexion knee deformity. Five healthy male volunteers (age range 30 to 40 years) were studied. Measurements of knee flexion angle were made with a telescopic-armed goniometer (Lafayette Instrument, Lafayette, IN) and compared with measurements using the smartphone (iPhone 3GS, Apple Inc., Cupertino, CA) knee goniometer using a novel trigonometric technique based on tibial inclination. Bland-Altman analysis of validity and reliability including statistical analysis of correlation by Pearson's method was undertaken. The iPhone goniometer had an interobserver correlation (r) of 0.994 compared with 0.952 for the Lafayette. The intraobserver correlation was r = 0.982 for the iPhone (compared with 0.927). The datasets from the two instruments correlate closely (r = 0.947), are proportional, and have a mean difference of only -0.4 degrees (SD 3.86 degrees). The Lafayette goniometer had an intraobserver reliability of ± 9.6 degrees and an interobserver reliability of ± 8.4 degrees. By comparison, the iPhone had an interobserver reliability of ± 2.7 degrees and an intraobserver reliability of ± 4.6 degrees. We found the iPhone goniometer to be a reliable tool for the measurement of subtle knee flexion in the clinic setting.
[Reliability and reproducibility of the Fitzpatrick phototype scale for skin sensitivity to ultraviolet light].

PubMed

Sánchez, Guillermo; Nova, John; Arias, Nilsa; Peña, Bibiana

2008-12-01

The Fitzpatrick phototype scale has been used to determine skin sensitivity to ultraviolet light. The reliability of this scale in estimating sensitivity permits risk evaluation of skin cancer based on phototype. Reliability and changes in intra- and inter-observer concordance were determined for the Fitzpatrick phototype scale after the assessment methods for establishing the phototype were standardized. An analytical study of intra- and inter-observer concordance was performed. The Fitzpatrick phototype scale was standardized using focus group methodology. To determine intra- and inter-observer agreement, the weighted kappa statistical method was applied. The standardization effect was measured using the equal kappa contrast hypothesis and Wald test for dependent measurements. The phototype scale was applied to 155 patients over 15 years of age who were assessed four times by two independent observers. The sample was drawn from patients of the Centro Dermatológico Federico Lleras Acosta. During the pre-standardization phase, the baseline and six-week inter-observer weighted kappa values were 0.31 and 0.40, respectively. The intra-observer kappa values for observers A and B were 0.47 and 0.51, respectively. After the standardization process, the baseline and six-week inter-observer weighted kappa values were 0.77 and 0.82, respectively. Intra-observer kappa coefficients for observers A and B were 0.78 and 0.82. Statistically significant differences were found between coefficients before and after standardization (p<0.001) in all comparisons. Following a standardization exercise, the Fitzpatrick phototype scale yielded reliable, reproducible and consistent results.
Reliability of a four-column classification for tibial plateau fractures.

PubMed

Martínez-Rondanelli, Alfredo; Escobar-González, Sara Sofía; Henao-Alzate, Alejandro; Martínez-Cano, Juan Pablo

2017-09-01

A four-column classification system offers a different way of evaluating tibial plateau fractures. The aim of this study is to compare the intra-observer and inter-observer reliability between four-column and classic classifications. This is a reliability study, which included patients presenting with tibial plateau fractures between January 2013 and September 2015 in a level-1 trauma centre. Four orthopaedic surgeons blindly classified each fracture according to four different classifications: AO, Schatzker, Duparc and four-column. Kappa, intra-observer and inter-observer concordance were calculated for the reliability analysis. Forty-nine patients were included. The mean age was 39 ± 14.2 years, with no gender predominance (men: 51%; women: 49%), and 67% of the fractures included at least one of the posterior columns. The intra-observer and inter-observer concordance were calculated for each classification: four-column (84%/79%), Schatzker (60%/71%), AO (50%/59%) and Duparc (48%/58%), with a statistically significant difference among them (p = 0.001/p = 0.003). Kappa coefficients for intra-observer and inter-observer evaluations were: Schatzker 0.48/0.39, four-column 0.61/0.34, Duparc 0.37/0.23, and AO 0.34/0.11. The proposed four-column classification showed the highest intra- and inter-observer agreement. When taking into account the agreement that occurs by chance, the Schatzker classification showed the highest inter-observer kappa, but again the four-column had the highest intra-observer kappa value. The proposed classification is a more inclusive classification for the posteromedial and posterolateral fractures. We suggest, therefore, that it be used in addition to one of the classic classifications in order to better understand the fracture pattern, as it allows more attention to be paid to the posterior columns, improves the surgical planning and allows the surgical approach to be chosen more accurately.
Clinically orientated classification incorporating shoulder balance for the surgical treatment of adolescent idiopathic scoliosis.

PubMed

Elsebaie, H B; Dannawi, Z; Altaf, F; Zaidan, A; Al Mukhtar, M; Shaw, M J; Gibson, A; Noordeen, H

2016-02-01

The achievement of shoulder balance is an important measure of successful scoliosis surgery. No previously described classification system has taken shoulder balance into account. We propose a simple classification system for AIS based on two components which include the curve type and shoulder level. Altogether, three curve types have been defined according to the size and location of the curves, each curve pattern is subdivided into type A or B depending on the shoulder level. This classification was tested for interobserver reproducibility and intraobserver reliability. A retrospective analysis of the radiographs of 232 consecutive cases of AIS patients treated surgically between 2005 and 2009 was also performed. Three major types and six subtypes were identified. Type I accounted for 30 %, type II 28 % and type III 42 %. The retrospective analysis showed three patients developed a decompensation that required extension of the fusion. One case developed worsening of shoulder balance requiring further surgery. This classification was tested for interobserver and intraobserver reliability. The mean kappa coefficients for interobserver reproducibility ranged from 0.89 to 0.952, while the mean kappa value for intraobserver reliability was 0.964 indicating a good-to-excellent reliability. The treatment algorithm guides the spinal surgeon to achieve optimal curve correction and postoperative shoulder balance whilst fusing the smallest number of spinal segments. The high interobserver reproducibility and intraobserver reliability makes it an invaluable tool to describe scoliosis curves in everyday clinical practice.
Reliability and concurrent validity of the Infant Motor Profile.

PubMed

Heineman, Kirsten R; Middelburg, Karin J; Bos, Arend F; Eidhof, Lieke; La Bastide-Van Gemert, Sacha; Van Den Heuvel, Edwin R; Hadders-Algra, Mijna

2013-06-01

The Infant Motor Profile (IMP) is a qualitative assessment of motor behaviour in infancy. It consists of five domains: movement variation, variability, fluency, symmetry, and performance. The aim of this study was to assess interobserver reliability and concurrent validity of the IMP with the Alberta Infant Motor Scale (AIMS) and an age-specific neurological examination. Fifty-nine preterm infants (25 females, 34 males; median gestational age 29.7wks, median birthweight 1285g) and 146 term infants (74 females, 72 males; median gestational age 40.1wks, birthweight 3500g) were included. Assessments were performed at corrected ages of 4, 6, 10, 12, and 18 months and consisted of the IMP, AIMS, and an age-specific neurological examination. Interobserver reliability was investigated on a sample of 25 video recordings. Non-parametric statistics were used to analyse the data. Interobserver reliability was high (intraclass correlation coefficient 0.95). At all ages, AIMS scores correlated weakly to fairly with total IMP scores (Spearman's ρ 0.36-0.55), but moderately to strongly with scores on the performance domain of the IMP (Spearman's ρ 0.47-0.84). A clear relation was found between total IMP score and outcome of the neurological examination (Kruskal-Wallis p<0.001 at all ages). Interobserver reliability of the IMP is good. Concurrent validity with the AIMS is best for the IMP performance domain. Concurrent validity with age-specific neurological examination is very good. © The Authors. Developmental Medicine & Child Neurology © 2013 Mac Keith Press.
Inter- and intraobserver reliability of the Rockwood classification in acute acromioclavicular joint dislocations.

PubMed

Schneider, M M; Balke, M; Koenen, P; Fröhlich, M; Wafaisade, A; Bouillon, B; Banerjee, M

2016-07-01

The reliability of the Rockwood classification, the gold standard for acute acromioclavicular (AC) joint separations, has not yet been tested. The purpose of this study was to investigate the reliability of visual and measured AC joint lesion grades according to the Rockwood classification. Four investigators (two shoulder specialists and two second-year residents) examined radiographs (bilateral panoramic stress and axial views) in 58 patients and graded the injury according to the Rockwood classification using the following sequence: (1) visual classification of the AC joint lesion, (2) digital measurement of the coracoclavicular distance (CCD) and the horizontal dislocation (HD) with Osirix Dicom Viewer (Pixmeo, Switzerland), (3) classification of the AC joint lesion according to the measurements and (4) repetition of (1) and (2) after repeated anonymization by an independent physician. Visual and measured Rockwood grades as well as the CCD and HD of every patient were documented, and a CC index was calculated (CCD injured/CCD healthy). All records were then used to evaluate intra- and interobserver reliability. The disagreement between visual and measured diagnosis ranged from 6.9 to 27.6 %. Interobserver reliability for visual diagnosis was good (0.72-0.74) and excellent (0.85-0.93) for measured Rockwood grades. Intraobserver reliability was good to excellent (0.67-0.93) for visual diagnosis and excellent for measured diagnosis (0.90-0.97). The correlations between measurements of the axial view varied from 0.68 to 0.98 (good to excellent) for interobserver reliability and from 0.90 to 0.97 (excellent) for intraobserver reliability. Bilateral panoramic stress and axial radiographs are reliable examinations for grading AC joint injuries according to Rockwood's classification. Clinicians of all experience levels can precisely classify AC joint lesions according to the Rockwood classification. 
We recommend grading acute AC joint lesions by digital measurement rather than by visual diagnosis alone because of the higher intra- and interobserver reliability. Case series, Level IV.
Reliability of the modified Gross Motor Function Measure-88 (GMFM-88) for children with both Spastic Cerebral Palsy and Cerebral Visual Impairment: A preliminary study.

PubMed

Salavati, M; Krijnen, W P; Rameckers, E A A; Looijestijn, P L; Maathuis, C G B; van der Schans, C P; Steenbergen, B

2015-01-01

The aims of this study were to adapt the Gross Motor Function Measure-88 (GMFM-88) for children with Cerebral Palsy (CP) and Cerebral Visual Impairment (CVI) and to determine the test-retest and interobserver reliability of the adapted version. Sixteen paediatric physical therapists familiar with CVI participated in the adaptation process. The Delphi method was used to gain consensus among a panel of experts. Seventy-seven children with CP and CVI (44 boys and 33 girls, aged between 50 and 144 months) participated in this study. To assess test-retest and interobserver reliability, the GMFM-88 was administered twice within three weeks (Mean=9 days, SD=6 days) by trained paediatric physical therapists, one of whom was familiar with the child and one who wasn't. Percentages of identical scores, Cronbach's alphas and intraclass correlation coefficients (ICC) were computed for each dimension level. All experts agreed on the proposed adaptations of the GMFM-88 for children with CP and CVI. Test-retest reliability ICCs for dimension scores were between 0.94 and 1.00, mean percentages of identical scores between 29 and 71, and interobserver reliability ICCs of the adapted GMFM-88 were 0.99-1.00 for dimension scores. Mean percentages of identical scores varied between 53 and 91. Test-retest and interobserver reliability of the GMFM-88-CVI for children with CP and CVI was excellent. Internal consistency of dimension scores lay between 0.97 and 1.00. The psychometric properties of the adapted GMFM-88 for children with CP and CVI are reliable and comparable to the original GMFM-88. Copyright © 2015 Elsevier Ltd. All rights reserved.
Reliability of mercury-in-silastic strain gauge plethysmography curve reading: influence of clinical clues and observer variation.

PubMed

Høyer, Christian; Pavar, Susanne; Pedersen, Begitte H; Biurrun Manresa, José A; Petersen, Lars J

2013-08-01

Mercury-in-silastic strain gauge plethysmography (SGP) is a well-established technique for blood flow and blood pressure measurements. The aim of this study was to examine (i) the possible influence of clinical clues, e.g. the presence of wounds and color changes during blood pressure measurements, and (ii) intra- and inter-observer variation of curve interpretation for segmental blood pressure measurements. A total of 204 patients with known or suspected peripheral arterial disease (PAD) were included in a diagnostic accuracy trial. Toe and ankle pressures were measured in both limbs, and primary observers analyzed a total of 804 pressure curve sets. The SGP curves were later reanalyzed separately by two observers blinded to clinical clues. Intra- and inter-observer agreement was quantified using Cohen's kappa and reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. There was an overall agreement regarding patient diagnostic classification (PAD/not PAD) in 202/204 (99.0%) for intra-observer (κ = 0.969, p < 0.001), and 201/204 (98.5%) for inter-observer readings (κ = 0.953, p < 0.001). Reliability analysis showed excellent correlation between blinded versus non-blinded and inter-observer readings for determination of absolute segmental pressures (all intraclass correlation coefficients ≥ 0.984). The coefficient of variance for determination of absolute segmental blood pressure ranged from 2.9-3.4% for blinded/non-blinded data and from 3.8-5.0% for inter-observer data. This study shows a low inter-observer variation among experienced laboratory technicians for reading strain gauge curves. The low variation between blinded/non-blinded readings indicates that SGP measurements are minimally biased by clinical clues.
Comparison of 3D computer-aided with manual cerebral aneurysm measurements in different imaging modalities.

PubMed

Groth, M; Forkert, N D; Buhk, J H; Schoenfeld, M; Goebell, E; Fiehler, J

2013-02-01

To compare intra- and inter-observer reliability of aneurysm measurements obtained by a 3D computer-aided technique with standard manual aneurysm measurements in different imaging modalities. A total of 21 patients with 29 cerebral aneurysms were studied. All patients underwent digital subtraction angiography (DSA), contrast-enhanced (CE-MRA) and time-of-flight magnetic resonance angiography (TOF-MRA). Aneurysm neck and depth diameters were manually measured by two observers in each modality. Additionally, semi-automatic computer-aided diameter measurements were performed using 3D vessel surface models derived from CE- (CE-com) and TOF-MRA (TOF-com) datasets. Bland-Altman analysis (BA) and intra-class correlation coefficient (ICC) were used to evaluate intra- and inter-observer agreement. BA revealed the narrowest relative limits of intra- and inter-observer agreement for aneurysm neck and depth diameters obtained by TOF-com (ranging between ±5.3 % and ±28.3 %) and CE-com (ranging between ±23.3 % and ±38.1 %). Direct measurements in DSA, TOF-MRA and CE-MRA showed considerably wider limits of agreement. The highest ICCs were observed for TOF-com and CE-com (ICC values, 0.92 or higher for intra- as well as inter-observer reliability). Computer-aided aneurysm measurement in 3D offers improved intra- and inter-observer reliability and a reproducible parameter extraction, which may be used in clinical routine and as objective surrogate end-points in clinical trials.
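Bland-Altman limits of agreement, as used in the abstract above, are conventionally the mean difference between paired measurements plus or minus 1.96 standard deviations of those differences. The sketch below computes absolute limits. The function name and the diameters are hypothetical, and the relative (percentage) limits quoted in the abstract would need an extra normalisation step that the abstract does not describe.

```python
from statistics import mean, stdev

def bland_altman_limits(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)   # systematic difference between the two observers
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical neck diameters in mm from two observers (not study data).
observer_1 = [4.0, 5.0, 6.0, 7.0]
observer_2 = [3.0, 5.0, 7.0, 7.0]
bias, lower, upper = bland_altman_limits(observer_1, observer_2)
print(f"bias = {bias:.2f} mm, limits = [{lower:.2f}, {upper:.2f}] mm")
```

Narrower limits mean that two readings of the same aneurysm are expected to lie closer together, which is the sense in which the computer-aided measurements are described as more reproducible.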
Intra- and interobserver reliability of quantitative ultrasound measurement of the plantar fascia.

PubMed

Rathleff, Michael Skovdal; Moelgaard, Carsten; Lykkegaard Olesen, Jens

2011-01-01

To determine intra- and interobserver reliability and measurement precision of sonographic assessment of plantar fascia thickness when using one, the mean of two, or the mean of three measurements. Two experienced observers scanned 20 healthy subjects twice with 60 minutes between test and retest. A GE LOGIQe ultrasound scanner was used in the study. The built-in software in the scanner was used to measure the thickness of the plantar fascia (PF). Reliability was calculated using intraclass correlation coefficient (ICC) and limits of agreement (LOA). Intraobserver reliability (ICC) using one measurement was 0.50 for one observer and 0.52 for the other, and using the mean of three measurements intraobserver reliability increased up to 0.77 and 0.67, respectively. Interobserver reliability (ICC) when using one measurement was 0.62 and increased to 0.82 when using the average of three measurements. LOA showed that when using the average of three measurements, LOA decreased to 0.6 mm, corresponding to 17.5% of the mean thickness of the PF. The results showed that reliability increases when using the mean of three measurements compared with one. Limits of agreement based on intratester reliability shows that changes in thickness that are larger than 0.6 mm can be considered actual changes in thickness and not a result of measurement error. Copyright © 2011 Wiley Periodicals, Inc.
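One standard way to reason about the gain from averaging repeated measurements is the Spearman-Brown formula, which predicts the reliability of the mean of k measurements from the single-measurement reliability r as k·r / (1 + (k − 1)·r). The abstract does not say that the authors used this formula, so the sketch below is only an illustration and the function name is hypothetical.

```python
def spearman_brown(single_reliability, k):
    """Predicted reliability of the mean of k parallel measurements."""
    if k < 1:
        raise ValueError("k must be at least 1")
    r = single_reliability
    return k * r / (1 + (k - 1) * r)

# Single-measurement interobserver ICC reported in the abstract above.
predicted = spearman_brown(0.62, 3)
print(f"predicted ICC for the mean of three measurements: {predicted:.2f}")
```

With the reported single-measurement interobserver ICC of 0.62, the formula gives about 0.83 for the mean of three measurements, which is close to the 0.82 stated in the abstract.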
Validity and inter-observer reliability of subjective hand-arm vibration assessments.

PubMed

Coenen, Pieter; Formanoy, Margriet; Douwes, Marjolein; Bosch, Tim; de Kraker, Heleen

2014-07-01

Exposure to mechanical vibrations at work (e.g., due to handling powered tools) is a potential occupational risk as it may cause upper extremity complaints. However, reliable and valid assessment methods for vibration exposure at work are lacking. Measuring hand-arm vibration objectively is often difficult and expensive, while often used information provided by manufacturers lacks detail. Therefore, a subjective hand-arm vibration assessment method was tested on validity and inter-observer reliability. In an experimental protocol, sixteen tasks handling powered tools were executed by two workers. Hand-arm vibration was assessed subjectively by 16 observers according to the proposed subjective assessment method. As a gold standard reference, hand-arm vibration was measured objectively using a vibration measurement device. Weighted κ's were calculated to assess validity, intra-class-correlation coefficients (ICCs) were calculated to assess inter-observer reliability. Inter-observer reliability of the subjective assessments depicting the agreement among observers can be expressed by an ICC of 0.708 (0.511-0.873). The validity of the subjective assessments as compared to the gold-standard reference can be expressed by a weighted κ of 0.535 (0.285-0.785). Besides, the percentage of exact agreement of the subjective assessment compared to the objective measurement was relatively low (i.e., 52% of all tasks). This study shows that subjectively assessed hand-arm vibrations are fairly reliable among observers and moderately valid. This assessment method is a first attempt to use subjective risk assessments of hand-arm vibration. Although, this assessment method can benefit from some future improvement, it can be of use in future studies and in field-based ergonomic assessments. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Inter-Rater Reliability and Downstream Financial Implications of Electrocardiography Screening in Young Athletes.

PubMed

Dhutia, Harshil; Malhotra, Aneil; Yeo, Tee Joo; Ster, Irina Chis; Gabus, Vincent; Steriotis, Alexandros; Dores, Helder; Mellor, Greg; García-Corrales, Carmen; Ensam, Bode; Jayalapan, Viknesh; Ezzat, Vivienne Anne; Finocchiaro, Gherardo; Gati, Sabiha; Papadakis, Michael; Tome-Esteban, Maria; Sharma, Sanjay

2017-08-01

Preparticipation screening for cardiovascular disease in young athletes with electrocardiography is endorsed by the European Society of Cardiology and several major sporting organizations. One of the concerns of the ECG as a screening test in young athletes relates to the potential for variation in interpretation. We investigated the degree of variation in ECG interpretation in athletes and its financial impact among cardiologists of differing experience. Eight cardiologists (4 with experience in screening athletes) each reported 400 ECGs of consecutively screened young athletes according to the 2010 European Society of Cardiology recommendations, Seattle criteria, and refined criteria. Cohen κ coefficient was used to calculate interobserver reliability. Cardiologists proposed secondary investigations after ECG interpretation, the costs of which were based on the UK National Health Service tariffs. Inexperienced cardiologists were more likely to classify an ECG as abnormal compared with experienced cardiologists (odds ratio, 1.44; 95% confidence interval, 1.03-2.02). Modification of ECG interpretation criteria improved interobserver reliability for categorizing an ECG as abnormal from poor (2010 European Society of Cardiology recommendations; κ=0.15) to moderate (refined criteria; κ=0.41) among inexperienced cardiologists; however, interobserver reliability was moderate for all 3 criteria among experienced cardiologists (κ=0.40-0.53). Inexperienced cardiologists were more likely to refer athletes for further evaluation compared with experienced cardiologists (odds ratio, 4.74; 95% confidence interval, 3.50-6.43) with poorer interobserver reliability (κ=0.22 versus κ=0.47). Interobserver reliability for secondary investigations after ECG interpretation ranged from poor to fair among inexperienced cardiologists (κ=0.15-0.30) and fair to moderate among experienced cardiologists (κ=0.21-0.46). 
The cost of cardiovascular evaluation per athlete was $175 (95% confidence interval, $142-$228) and $101 (95% confidence interval, $83-$131) for inexperienced and experienced cardiologists, respectively. Interpretation of the ECG in athletes and the resultant cascade of investigations are highly physician dependent even in experienced hands with important downstream financial implications, emphasizing the need for formal training and standardized diagnostic pathways. © 2017 American Heart Association, Inc.

Are distal radius fracture classifications reproducible? Intra and interobserver agreement.

PubMed

Belloti, João Carlos; Tamaoki, Marcel Jun Sugawara; Franciozi, Carlos Eduardo da Silveira; Santos, João Baptista Gomes dos; Balbachevsky, Daniel; Chap Chap, Eduardo; Albertoni, Walter Manna; Faloppa, Flávio

2008-05-01

Various classification systems have been proposed for fractures of the distal radius, but the reliability of these classifications is seldom addressed. For a fracture classification to be useful, it must provide prognostic significance, interobserver reliability and intraobserver reproducibility. The aim here was to evaluate the intraobserver and interobserver agreement of distal radius fracture classifications. This was a validation study on interobserver and intraobserver reliability. It was developed in the Department of Orthopedics and Traumatology, Universidade Federal de São Paulo - Escola Paulista de Medicina. X-rays from 98 cases of displaced distal radius fracture were evaluated by five observers: one third-year orthopedic resident (R3), one sixth-year undergraduate medical student (UG6), one radiologist physician (XRP), one orthopedic trauma specialist (OT) and one orthopedic hand surgery specialist (OHS). The radiographs were classified on three different occasions (times T1, T2 and T3) using the Universal (Cooney), Arbeitsgemeinschaft für Osteosynthesefragen/Association for the Study of Internal Fixation (AO/ASIF), Frykman and Fernández classifications. The kappa coefficient (κ) was applied to assess the degree of agreement. Among the three occasions, the highest mean intraobserver κ was observed in the Universal classification (0.61), followed by Fernández (0.59), Frykman (0.55) and AO/ASIF (0.49). The interobserver agreement was unsatisfactory in all classifications. The Fernández classification showed the best agreement (0.44) and the worst was the Frykman classification (0.26). The low agreement levels observed in this study suggest that there is still no classification method with high reproducibility.
The use and reliability of SymNose for quantitative measurement of the nose and lip in unilateral cleft lip and palate patients.

PubMed

Mosmuller, David; Tan, Robin; Mulder, Frans; Bachour, Yara; de Vet, Henrica; Don Griot, Peter

2016-10-01

It is essential to have a reliable assessment method in order to compare the results of cleft lip and palate surgery. In this study the computer-based program SymNose, a method for quantitative assessment of the nose and lip, will be assessed on usability and reliability. The symmetry of the nose and lip was measured twice in 50 six-year-old complete and incomplete unilateral cleft lip and palate patients by four observers. For the frontal view the asymmetry level of the nose and upper lip were evaluated and for the basal view the asymmetry level of the nose and nostrils were evaluated. A mean inter-observer reliability when tracing each image once or twice was 0.70 and 0.75, respectively. Tracing the photographs with 2 observers and 4 observers gave a mean inter-observer score of 0.86 and 0.92, respectively. The mean intra-observer reliability varied between 0.80 and 0.84. SymNose is a practical and reliable tool for the retrospective assessment of large caseloads of 2D photographs of cleft patients for research purposes. Moderate to high single inter-observer reliability was found. For future research with SymNose reliable outcomes can be achieved by using the average outcomes of single tracings of two observers. Copyright © 2016 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.
Comparative ex vivo evaluation of two electronic percussive testing devices measuring the stability of dental implants.

PubMed

Geckili, Onur; Bilhan, Hakan; Cilingir, Altug; Bilmenoglu, Caglar; Ates, Gokcen; Urgun, Aliye Ceren; Bural, Canan

2014-12-01

A comparative ex vivo study was performed to determine electronic percussive test values (PTVs) measured by cabled and wireless electronic percussive testing (EPT) devices and to evaluate the intra- and interobserver reliability of the wireless EPT device. Forty implants were inserted into the vertebrae and forty into the pelvis of a steer, a safe distance apart. The implants were all 4.3 mm wide and 13 mm long, from the same manufacturer. PTV of each implant was measured by four different examiners, using both EPT devices, and compared. Additionally, the intra- and interobserver reliability of the wireless EPT device was evaluated. Statistically significant differences (P <0.05) were observed between PTVs made by the two EPT devices. PTVs measured by the wireless EPT device were significantly higher than the cabled EPT device (P <0.05), indicating lower implant stability. The intraobserver reliability of the wireless EPT device was evaluated as excellent for the measurements in type II bone and good-to-excellent in type IV bone; interobserver reliability was evaluated as fair-to-good in both bone types. The wireless EPT device gives PTVs higher than the cabled EPT device, indicating lower implant stability, and its inter- and intraobserver reliability is good and acceptable.
Evaluating the intra- and interobserver reliability of three-dimensional ultrasound and power Doppler angiography (3D-PDA) for assessment of placental volume and vascularity in the second trimester of pregnancy.

PubMed

Jones, Nia W; Raine-Fenning, Nick J; Mousa, Hatem A; Bradley, Eileen; Bugg, George J

2011-03-01

Three-dimensional (3-D) power Doppler angiography (3-D-PDA) allows visualisation of Doppler signals within the placenta and their quantification is possible by the generation of vascular indices by the 4-D View software programme. This study aimed to investigate intra- and interobserver reproducibility of 3-D-PDA analysis of stored datasets at varying gestations with the ultimate goal being to develop a tool for predicting placental dysfunction. Women with an uncomplicated, viable singleton pregnancy were scanned at 12, 16 or 20 weeks gestational age groups. 3-D-PDA datasets acquired of the whole placenta were analysed using the VOCAL software processing tool. Each volume was analysed by three observers twice in the A plane. Intra- and interobserver reliability was assessed by intraclass correlation coefficients (ICCs) and Bland Altman plots. At each gestational age group, 20 low risk women were scanned resulting in 60 datasets in total. The ICC demonstrated a high level of measurement reliability at each gestation with intraobserver values >0.90 and interobserver values of >0.6 for the vascular indices. Bland Altman plots also showed high levels of agreement. Systematic bias was seen at 20 weeks in the vascular indices obtained by different observers. This study demonstrates that 3-D-PDA data can be measured reliably by different observers from stored datasets up to 18 weeks gestation. Measurements become less reliable as gestation advances with bias between observers evident at 20 weeks. Copyright © 2011 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
PubMed Central

Labrecque, M; Dostaler, L P; Dumont, H; Huard, G; Laflamme, L

1993-01-01

OBJECTIVE: To determine the interobserver reliability of tympanograms obtained with the MicroTymp, a portable tympanometer. SETTING: Family medicine teaching unit in a tertiary care hospital. PATIENTS: Thirty-three patients who presented to the ear, nose and throat clinic in August 1990 for an ear problem. INTERVENTION: Three residents in family medicine independently attempted to record with the MicroTymp one tympanogram for the 66 ears. We excluded the results for seven ears for which tympanograms could not be obtained. MAIN OUTCOME MEASURE: Using objective criteria, two family physicians and two residents in family medicine independently classified the 177 tympanograms into five categories (normal, possible effusion, possible perforation, possible tympano-ossicular dysfunction and unclassifiable). Reliability was estimated by means of the kappa (κ) coefficient on 161 tympanograms from 59 ears for which the interpretation of the three tympanograms agreed. MAIN RESULTS: The interpretation of the three tympanograms agreed for 34 of the 59 ears (0.58) (κ = 0.52, 95% confidence limits 0.45 and 0.59). There was no significant difference in interobserver reliability between pairs of observers or between symptomatic and asymptomatic ears. CONCLUSIONS: The interobserver reliability of the MicroTymp is moderate. The tympanograms obtained with the instrument should be interpreted in the context of the clinical findings. PMID:8431817
Reliability of laser Doppler flowmetry curve reading for measurement of toe and ankle pressures: intra- and inter-observer variation.

PubMed

Høyer, C; Paludan, J P D; Pavar, S; Biurrun Manresa, J A; Petersen, L J

2014-03-01

To assess the intra- and inter-observer variation in laser Doppler flowmetry curve reading for measurement of toe and ankle pressures. A prospective single blinded diagnostic accuracy study was conducted on 200 patients with known or suspected peripheral arterial disease (PAD), with a total of 760 curve sets produced. The first curve reading for this study was performed by laboratory technologists blinded to clinical clues and previous readings at least 3 months after the primary data sampling. The pressure curves were later reassessed following another period of at least 3 months. Observer agreement in diagnostic classification according to TASC-II criteria was quantified using Cohen's kappa. Reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. The overall agreement in diagnostic classification (PAD/not PAD) was 173/200 (87%) for intra-observer (κ = .858) and 175/200 (88%) for inter-observer data (κ = .787). Reliability analysis confirmed excellent correlation for both intra- and inter-observer data (ICC all ≥.931). The coefficients of variance ranged from 2.27% to 6.44% for intra-observer and 2.39% to 8.42% for inter-observer data. Subgroup analysis showed lower observer-variation for reading of toe pressures in patients with diabetes and/or chronic kidney disease than patients not diagnosed with these conditions. Bland-Altman plots showed higher variation in toe pressure readings than ankle pressure readings. This study shows substantial intra- and inter-observer agreement in diagnostic classification and reading of absolute pressures when using laboratory technologists as observers. The study emphasises that observer variation for curve reading is an important factor concerning the overall reproducibility of the method. Our data suggest diabetes and chronic kidney disease have an influence on toe pressure reproducibility. Copyright © 2013 European Society for Vascular Surgery. 
Published by Elsevier Ltd. All rights reserved.
Reliability and Validity of the Arthroscopic International Cartilage Repair Society Classification System: Correlation With Histological Assessment of Depth.

PubMed

Dwyer, Tim; Martin, C Ryan; Kendra, Rita; Sermer, Corey; Chahal, Jaskarndip; Ogilvie-Harris, Darrell; Whelan, Daniel; Murnaghan, Lucas; Nauth, Aaron; Theodoropoulos, John

2017-06-01

To determine the interobserver reliability of the International Cartilage Repair Society (ICRS) grading system of chondral lesions in cadavers, to determine the intraobserver reliability of the ICRS grading system comparing arthroscopy and video assessment, and to compare the arthroscopic ICRS grading system with histological grading of lesion depth. Eighteen lesions in 5 cadaveric knee specimens were arthroscopically graded by 7 fellowship-trained arthroscopic surgeons using the ICRS classification system. The arthroscopic video of each lesion was sent to the surgeons 6 weeks later for repeat grading and determination of intraobserver reliability. Lesions were biopsied, and the depth of the cartilage lesion was assessed. Reliability was calculated using intraclass correlations. The interobserver reliability was 0.67 (95% confidence interval, 0.5-0.89) for the arthroscopic grading, and the intraobserver reliability with the video grading was 0.8 (95% confidence interval, 0.67-0.9). A high correlation was seen between the arthroscopic grading of depth and the histological grading of depth (0.91); on average, surgeons graded lesions using arthroscopy a mean of 0.37 (range, 0-0.86) deeper than the histological grade. The arthroscopic ICRS classification system has good interobserver and intraobserver reliability. A high correlation with histological assessment of depth provides evidence of validity for this classification system. As cartilage lesions are treated on the basis of the arthroscopic ICRS classification, it is important to ascertain the reliability and validity of this method. Copyright © 2016 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
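The abstract above reports intraclass correlations without naming the ICC form. As an illustration only, the sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), and computes it from the mean squares of a subjects-by-raters table. The function name and the small rating matrix are illustrative and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a complete table: one row per subject, one column per rater."""
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Illustrative grades: six lesions (rows) scored by four raters (columns).
grades = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(grades)
print(f"ICC(2,1) = {icc:.2f}")
```

Because this form penalises systematic offsets between raters, a rater who consistently grades higher than the others lowers the coefficient even when the ranking of lesions is the same.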
Five times sit-to-stand test in subjects with total knee replacement: Reliability and relationship with functional mobility tests.

PubMed

Medina-Mirapeix, Francesc; Vivo-Fernández, Iván; López-Cañizares, Juan; García-Vidal, José A; Benítez-Martínez, Josep Carles; Del Baño-Aledo, María Elena

2018-01-01

The objective was to determine the inter-observer and test-retest reliability of the "Five-repetition sit-to-stand" (5STS) test in patients with total knee replacement (TKR), and to explore the correlation between 5STS and two mobility tests. A reliability study was conducted among 24 outpatients with TKR (mean age 72.13, S.D. 10.67; 50% were women). They were recruited from a traumatology unit of a public hospital via convenience sampling. A physiotherapist and a trauma physician assessed each patient at the same time. The same physiotherapist performed a second 5STS measurement 45-60 min after the first one. Reliability was assessed with intraclass correlation coefficients (ICCs) and Bland-Altman plots. The Pearson coefficient was calculated to assess the correlation between 5STS, the timed up and go test (TUG) and four-meter gait speed (4MGS). ICCs for inter-observer and test-retest reliability of the 5STS were 0.998 (95% confidence interval [CI], 0.995-0.999) and 0.982 (95% CI, 0.959-0.992), respectively. The Bland-Altman plot for inter-observer reliability showed limits between -0.82 and 1.06 with a mean of 0.11 and no heteroscedasticity within the data. The Bland-Altman plot for test-retest showed limits between 1.76 and 4.16, a mean of 1.20 and heteroscedasticity within the data. The Pearson correlation coefficient revealed significant correlation between 5STS and TUG (r=0.7, p<0.001) and 4MGS (r=-0.583, p=0.003). This study demonstrates excellent inter-observer and test-retest reliability of the 5STS when it is used in people with TKR, and also significant correlation with other functional mobility tests. These findings support the use of 5STS as an outcome measure in the TKR population. Copyright © 2017 Elsevier B.V. All rights reserved.
Validity and reliability of the iPhone to measure rib hump in scoliosis.

PubMed

Balg, Frederic; Juteau, Mathieu; Theoret, Chantal; Svotelis, Amy; Grenier, Guillaume

2014-12-01

This was a prospective blinded validity and reliability analysis. The aim of this study was validation and reliability evaluation of the Scoligauge iPhone app. The scoliometer is used to clinically measure the rib hump in scoliosis as a means to evaluate the axial trunk rotation. The increasing availability of smartphones with built-in accelerometers has led to the development of a vast number of applications to measure angles. Of these, the Scoligauge mimics a scoliometer. The aim of this study was to compare the validity of the Scoligauge iPhone application without an associated adapter with the traditional scoliometer and to test the reliability of the application in a clinical setting. Two observers measured the rib hump deformity on 34 consecutive patients with idiopathic scoliosis with an average Cobb angle of 24.2 ± 13.5 degrees (range, 4 to 65 degrees). Measurements were made with an iPhone without the adapter and with a scoliometer. The validity as well as the interobserver and intraobserver reliability were calculated using the intraclass correlation coefficient (ICC) and the Bland-Altman test. The mean difference between the scoliometer and the Scoligauge application was 0.4 degrees [95% confidence interval (CI) of ± 3.1 degrees] with an ICC of 0.947 (P < 0.001). The intraobserver and interobserver ICC were 0.961 (P < 0.001) and 0.901 (P < 0.001), respectively. The mean intraobserver difference was 0.0 degrees (95% CI of ± 2.7 degrees) and the mean interobserver difference was 0.1 degrees (95% CI of ± 4.4 degrees). The intraobserver and interobserver reliability of the Scoligauge iPhone app, as well as its validity compared with the scoliometer, are excellent. The mean differences between measurements are small and clinically not significant. Thus, the Scoligauge application is valid for clinical evaluation even without a special adapter. Level I (Diagnostic Study).
The intra- and inter-observer reliability of the physical examination methods used to assess patients with patellofemoral joint instability.

PubMed

Smith, Toby O; Clark, Allan; Neda, Sophia; Arendt, Elizabeth A; Post, William R; Grelsamer, Ronald P; Dejour, David; Almqvist, Karl Fredrik; Donell, Simon T

2012-08-01

An accurate physical examination of patients with patellar instability is an important aspect of the diagnosis and treatment. While previous studies have assessed the diagnostic accuracy of such physical examination tests, little has been undertaken to assess the inter- and intra-tester reliability of such techniques. The purpose of this study was to determine the inter- and intra-tester reliability of the physical examination tests used for patients with patellar instability. Five patients (10 knees) with bilateral recurrent patellar instability were assessed by five members of the International Patellofemoral Study Group. Each surgeon assessed each patient twice using 18 reported physical examination tests. The inter- and intra-observer reliability was assessed using weighted Kappa statistics with 95% confidence intervals. The findings of the study suggested that there was very poor inter-observer reliability for the majority of the physical tests, with only the assessments of patellofemoral crepitus, foot arch position and the J-sign presenting with fair to moderate agreement. The intra-observer reliability indicated largely moderate to substantial agreement between the first and second tests performed by each assessor, with the greatest agreement seen for the assessment of tibial torsion, popliteal angle and the Bassett's sign. For the common physical examination tests used in the management of patients with patellar instability, inter-observer reliability is poor, while intra-observer reliability is moderate. Standardization of physical exam assessments and further study of these results among different clinicians and more divergent patient groups are indicated. Copyright © 2011 Elsevier B.V. All rights reserved.
Quantitative estimation of the high-intensity zone in the lumbar spine: comparison between the symptomatic and asymptomatic population.

PubMed

Liu, Chao; Cai, Hong-Xin; Zhang, Jian-Feng; Ma, Jian-Jun; Lu, Yin-Jiang; Fan, Shun-Wu

2014-03-01

The high-intensity zone (HIZ) on magnetic resonance imaging (MRI) has been studied for more than 20 years, but its diagnostic value in low back pain (LBP) is limited by the high incidence in asymptomatic subjects. Little effort has been made to improve the objective assessment of HIZ. To develop quantitative measurements for HIZ and estimate intra- and interobserver reliability and to clarify different signal intensity of HIZ in patients with or without LBP. A measurement reliability and prospective comparative study. A consecutive series of patients with LBP between June 2010 and May 2011 (group A) and a successive series of asymptomatic controls during the same period (group B). Incidence of HIZ; quantitative measures, including area of disc, area and signal intensity of HIZ, and magnetic resonance imaging index; and intraclass correlation coefficients (ICCs) for intra- and interobserver reliability. On the basis of HIZ criteria, a series of quantitative dimension and signal intensity measures was developed for assessing HIZ. Two experienced spine surgeons traced the region of interest twice within 4 weeks for assessment of the intra- and interobserver reliability. The quantitative variables were compared between groups A and B. There were 72 patients with LBP and 79 asymptomatic controls enrolled in this study. The prevalence of HIZ in group A and group B was 45.8% and 20.2%, respectively. The intraobserver agreement was excellent for the quantitative measures (ICC=0.838-0.977), as was the interobserver reliability (ICC=0.809-0.935). The mean signal of HIZ in group A was significantly brighter than in group B (57.55±14.04% vs. 45.61±7.22%, p=.000). There was no statistical difference in the area of disc and HIZ between the two groups. The magnetic resonance imaging index was found to be higher in group A when compared with group B (3.94±1.71 vs. 3.06±1.50), but with a p value of .050.
A series of quantitative measurements for HIZ was established and demonstrated excellent intra- and interobserver reliability. The signal intensity of HIZ was different in patients with or without LBP, and a significantly brighter signal was observed in symptomatic subjects. Copyright © 2014 Elsevier Inc. All rights reserved.
Development and Testing of the Observational System for Recording Physical Activity in Children: Elementary School

PubMed Central

McIver, Kerry L.; Brown, William H.; Pfeiffer, Karin A.; Dowda, Marsha; Pate, Russell R.

2016-01-01

Purpose: This study describes the development and pilot testing of the Observational System for Recording Physical Activity-Elementary School (OSRAC-E) version. Methods: This system was developed to observe and document the levels and types of physical activity and physical and social contexts of physical activity in elementary school students during the school day. Inter-observer agreement scores and summary data were calculated. Results: All categories had Kappa statistics above 0.80, with the exception of the activity initiator category. Inter-observer agreement scores were 96% or greater. The OSRAC-E was shown to be a reliable observation system that allows researchers to assess physical activity behaviors, the contexts of those behaviors, and the effectiveness of physical activity interventions in the school environment. Conclusion: The OSRAC-E can yield data with high interobserver reliability and provide relatively extensive contextual information about physical activity of students in elementary schools. PMID:26889587
The Development of the Cleft Aesthetic Rating Scale: A New Rating Scale for the Assessment of Nasolabial Appearance in Complete Unilateral Cleft Lip and Palate Patients.

PubMed

Mosmuller, David G M; Mennes, Lisette M; Prahl, Charlotte; Kramer, Gem J C; Disse, Melissa A; van Couwelaar, Gijs M; Niessen, Frank B; Griot, J P W Don

2017-09-01

The development of the Cleft Aesthetic Rating Scale, a simple and reliable photographic reference scale for the assessment of nasolabial appearance in complete unilateral cleft lip and palate patients. A blind retrospective analysis of photographs of cleft lip and palate patients was performed with this new rating scale. VU Medical Center Amsterdam and the Academic Center for Dentistry of Amsterdam. Complete unilateral cleft lip and palate patients at the age of 6 years. Photographs that showed the highest interobserver agreement in earlier assessments were selected for the photographic reference scale. Rules were attached to the rating scale to provide a guideline for the assessment and improve interobserver reliability. Cropped photographs revealing only the nasolabial area were assessed by six observers using this new Cleft Aesthetic Rating Scale in two different sessions. Photographs of 62 children (6 years of age, 44 boys and 18 girls) were assessed. The interobserver reliability for the nose and lip together was 0.62, obtained with the intraclass correlation coefficient. To measure the internal consistency, a Cronbach alpha of .91 was calculated. The estimated reliability for three observers was .84, obtained with the Spearman Brown formula. A new, easy to use, and reliable scoring system with a photographic reference scale is presented in this study.
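The Spearman-Brown step in this abstract can be sketched as a short calculation. The sketch assumes that the reported ICC of 0.62 is the reliability of a single observer; the function name is hypothetical. With the rounded input of 0.62, three observers give about 0.83, close to the reported .84 (the small gap would be consistent with rounding of the ICC), and six observers give about 0.91, which coincides with the reported Cronbach alpha.

```python
def spearman_brown(single_rater_reliability, n_raters):
    """Predicted reliability of the mean rating of n_raters observers."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)

icc_single = 0.62  # reported ICC, assumed here to be a single-observer value

for n in (1, 3, 6):
    print(f"{n} observer(s): predicted reliability = {spearman_brown(icc_single, n):.2f}")
```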
Interobserver Reliability of the Berlin ARDS Definition and Strategies to Improve the Reliability of ARDS Diagnosis.

PubMed

Sjoding, Michael W; Hofer, Timothy P; Co, Ivan; Courey, Anthony; Cooke, Colin R; Iwashyna, Theodore J

2018-02-01

Failure to reliably diagnose ARDS may be a major driver of negative clinical trials and underrecognition and treatment in clinical practice. We sought to examine the interobserver reliability of the Berlin ARDS definition and examine strategies for improving the reliability of ARDS diagnosis. Two hundred five patients with hypoxic respiratory failure from four ICUs were reviewed independently by three clinicians, who evaluated whether patients had ARDS, the diagnostic confidence of the reviewers, whether patients met individual ARDS criteria, and the time when criteria were met. Interobserver reliability of an ARDS diagnosis was "moderate" (kappa = 0.50; 95% CI, 0.40-0.59). Sixty-seven percent of diagnostic disagreements between clinicians reviewing the same patient was explained by differences in how chest imaging studies were interpreted, with other ARDS criteria contributing less (identification of ARDS risk factor, 15%; cardiac edema/volume overload exclusion, 7%). Combining the independent reviews of three clinicians can increase reliability to "substantial" (kappa = 0.75; 95% CI, 0.68-0.80). When a clinician diagnosed ARDS with "high confidence," all other clinicians agreed with the diagnosis in 72% of reviews. There was close agreement between clinicians about the time when a patient met all ARDS criteria if ARDS developed within the first 48 hours of hospitalization (median difference, 5 hours). The reliability of the Berlin ARDS definition is moderate, driven primarily by differences in chest imaging interpretation. Combining independent reviews by multiple clinicians or improving methods to identify bilateral infiltrates on chest imaging are important strategies for improving the reliability of ARDS diagnosis. Copyright © 2017 American College of Chest Physicians. All rights reserved.
COMFORT scale: a reliable and valid method to measure the amount of stress of ventilated preterm infants.

PubMed

Wielenga, J M; De Vos, R; de Leeuw, R; De Haan, R J

2004-01-01

Assessment of clinimetric properties and diagnostic quality of a stress measurement scale (COMFORT scale). Sample of an open population. Neonatology department (Neonatal Intensive Care Unit), Academic Medical Centre/Emma Children's Hospital, Amsterdam, The Netherlands. One clinical expert and 9 observers observed ventilated prematurely born babies simultaneously. Criterion validity was assessed by correlating the COMFORT scale with the clinical judgment regarding the amount of stress. Interobserver reliability was assessed on the clinical judgment as well as on the COMFORT scale. Diagnostic qualities were evaluated with a ROC curve. On 19 ventilated prematurely born babies (mean gestational age 30 weeks, mean birth weight 1385 g), one clinical expert and 9 observers made 30 paired observations. The criterion validity of the COMFORT scale was good (Pearson's r of 0.84). The interobserver reliability of the clinical judgment was very good (weighted Kappa 0.84). The interobserver reliability of each item varied from good to almost perfect (weighted Kappa of 0.64 for muscle tone to 1.00 for heart rate). The reliability of the total COMFORT scale score was satisfactory (intra-class correlation coefficient of 0.94). The diagnostic quality of the COMFORT scale was excellent: at a cut-off point of 20, the sensitivity was 100 percent, the specificity was 77 percent, and the area under the curve (AUC) was 0.95. In this first evaluation, the COMFORT scale appears to be a valid and reliable measurement tool to assess the stress of ventilated prematurely born babies.
Ultrasound definition of tendon damage in patients with rheumatoid arthritis. Results of a OMERACT consensus-based ultrasound score focussing on the diagnostic reliability.

PubMed

Bruyn, George A W; Hanova, Petra; Iagnocco, Annamaria; d'Agostino, Maria-Antonietta; Möller, Ingrid; Terslev, Lene; Backhaus, Marina; Balint, Peter V; Filippucci, Emilio; Baudoin, Paul; van Vugt, Richard; Pineda, Carlos; Wakefield, Richard; Garrido, Jesus; Pecha, Ondrej; Naredo, Esperanza

2014-11-01

To develop the first ultrasound scoring system of tendon damage in rheumatoid arthritis (RA) and assess its intraobserver and interobserver reliability. We conducted a Delphi study on ultrasound-defined tendon damage and ultrasound scoring system of tendon damage in RA among 35 international rheumatologists with experience in musculoskeletal ultrasound. Twelve patients with RA were included and assessed twice by 12 rheumatologists-sonographers. Ultrasound examination for tendon damage in B mode of five wrist extensor compartments (extensor carpi radialis brevis and longus; extensor pollicis longus; extensor digitorum communis; extensor digiti minimi; extensor carpi ulnaris) and one ankle tendon (tibialis posterior) was performed blindly, independently and bilaterally in each patient. Intraobserver and interobserver reliability were calculated by κ coefficients. A three-grade semiquantitative scoring system was agreed for scoring tendon damage in B mode. The mean intraobserver reliability for tendon damage scoring was excellent (κ value 0.91). The mean interobserver reliability assessment showed good κ values (κ value 0.75). The most reliable were the extensor digiti minimi, the extensor carpi ulnaris, and the tibialis posterior tendons. An ultrasound reference image atlas of tenosynovitis and tendon damage was also developed. Ultrasound is a reproducible tool for evaluating tendon damage in RA. This study strongly supports a new reliable ultrasound scoring system for tendon damage. Published by the BMJ Publishing Group Limited.
Ultrasonographic assessment of tendon thickness, Doppler activity and bony spurs of the elbow in patients with lateral epicondylitis and healthy subjects: a reliability and agreement study.

PubMed

Krogh, T P; Fredberg, U; Christensen, R; Stengaard-Pedersen, K; Ellingsen, T

2013-10-01

Tennis elbow, also known as lateral epicondylitis (LE), is a common disorder often assessed by ultrasound. The aim of this study was to evaluate the ultrasonographic outcomes and methods used in LE research and clinical practice. This study was designed as an intra- and interobserver reliability and agreement study. Ultrasonographic examination of the common extensor tendon of the elbow was performed. The intraobserver study examined tendon thickness twice in 20 right elbows from 20 healthy individuals at an interval of 7 to 12 days. The interobserver study examined tendon thickness, color Doppler activity, and bony spurs in 18 right elbows in 9 healthy individuals and 9 patients with LE. Two trained rheumatologists performed the interobserver examinations with the same scanner on the same day. The main outcomes were the intraclass correlation coefficient (ICC) and agreement. In the intraobserver study, the ICC with regard to tendon thickness ranged from 0.76 to 0.81, depending on the measurement techniques used. The agreement ranged from 0.06 to 0.13 mm. In the interobserver study, the tendon thickness ICC ranged from 0.45 to 0.65 and the agreement ranged from -0.17 to 0.13 mm. The ICC for color Doppler activity was 0.93, with agreement in 14/18 (78%) of the cases. Perfect reliability was demonstrated for bony spurs, with an ICC of 1 and exact agreement in 18/18 (100%) of the cases. Good to excellent reliability was obtained for all measurements. The ultrasonographic techniques evaluated in this trial can be recommended for use in both research and clinical practice. © Georg Thieme Verlag KG Stuttgart · New York.
Reliability of classification for post-traumatic ankle osteoarthritis.

PubMed

Claessen, Femke M A P; Meijer, Diederik T; van den Bekerom, Michel P J; Gevers Deynoot, Barend D J; Mallee, Wouter H; Doornberg, Job N; van Dijk, C Niek

2016-04-01

The purpose of this study was to identify the most reliable classification system for clinical outcome studies to categorize post-traumatic osteoarthritis after fracture. A total of 118 orthopaedic surgeons and residents, gathered in the Ankle Platform Study Collaborative Science of Variation Group, evaluated 128 anteroposterior and lateral radiographs of patients after a bi- or trimalleolar ankle fracture on a Web-based platform in order to rate post-traumatic osteoarthritis according to the classification systems coined by (1) van Dijk, (2) Kellgren, and (3) Takakura. Reliability was evaluated with the use of Siegel and Castellan's multirater kappa measure. Differences between classification systems were compared using the two-sample Z-test. Interobserver agreement of surgeons who participated in the survey was fair for the van Dijk osteoarthritis scale (k = 0.24), and poor for the Takakura (k = 0.19) and the Kellgren systems (k = 0.18) according to the categorical rating of Landis and Koch. This difference of one categorical rating was found to be significant (p < 0.001, CI 0.046-0.053) with the high numbers of observers and cases available. This study documents fair interobserver agreement for the van Dijk osteoarthritis scale, and poor interobserver agreement for the Takakura and Kellgren osteoarthritis classification systems. Because of the low interobserver agreement for the van Dijk, Kellgren, and Takakura classification systems, those systems cannot be used for clinical decision-making. Development of diagnostic criteria on the basis of consecutive patients, Level II.
Measuring the Cobb angle with the iPhone in kyphoses: a reliability study.

PubMed

Jacquot, Frederic; Charpentier, Axelle; Khelifi, Sofiane; Gastambide, Daniel; Rigal, Regis; Sautet, Alain

2012-08-01

Smartphones have gained widespread use in the healthcare field to fulfill a variety of tasks. We developed a small iPhone application to take advantage of the built-in position sensor to measure angles in a variety of spinal deformities. We present a reliability study of this tool in measuring kyphotic angles. Radiographs taken from 20 different patients' charts were presented to a panel of six operators at two different times. Radiographs were measured with the protractor and the iPhone application, and statistical analysis was applied to measure intraclass correlation coefficients between both measurement methods, and to measure intra- and interobserver reliability. The intraclass correlation coefficient calculated between methods (i.e. CobbMeter application on the iPhone versus standard method with the protractor) was 0.963 for all measures, indicating excellent correlation between the CobbMeter application and the standard method. The interobserver correlation coefficient was 0.965. The intraobserver ICC was 0.977, indicating excellent reproducibility of measurements at different times for all operators. The interobserver ICC between fellowship trained senior surgeons and general orthopaedic residents was 0.989. Consistently, the ICC for intraobserver and interobserver correlations was higher with the CobbMeter application than with the regular protractor method. This difference was not statistically significant. Measuring kyphotic angles with the iPhone application appears to be a valid procedure and is in no way inferior to the standard way of measuring the Cobb angle in kyphotic deformities.
Inter- and intraobserver reliability of the vertebral, local and segmental kyphosis in 120 traumatic lumbar and thoracic burst fractures: evaluation in lateral X-rays and sagittal computed tomographies

PubMed Central

Brunner, Alexander; Gühring, Markus; Schmälzle, Traude; Weise, Kuno; Badke, Andreas

2009-01-01

Evaluation of the kyphosis angle in thoracic and lumbar burst fractures is often used to indicate surgical procedures. The kyphosis angle can be measured as vertebral, segmental and local kyphosis according to the method of Cobb. The vertebral, segmental and local kyphosis according to the method of Cobb were measured on 120 lateral X-rays and sagittal computed tomographies of 60 thoracic and 60 lumbar burst fractures by 3 independent observers on 2 separate occasions. Osteoporotic fractures were excluded. The intra- and interobserver reliability of these angles in X-ray and computed tomography were evaluated using the intraclass correlation coefficient (ICC). The segmental kyphosis showed the highest reproducibility, followed by the vertebral kyphosis. For thoracic fractures, segmental kyphosis showed “excellent” inter- and intraobserver reliabilities in X-ray (ICC 0.826, 0.802), and for lumbar fractures “good” to “excellent” inter- and intraobserver reliabilities (ICC = 0.790, 0.803). In computed tomography, the segmental kyphosis showed “excellent” inter- and intraobserver reliabilities (ICC = 0.824, 0.801) for thoracic and “excellent” inter- and intraobserver reliabilities (ICC = 0.874, 0.835) for the lumbar fractures. Comparing the two diagnostic work-ups (X-ray and computed tomography), significant differences were found in interobserver reliabilities for vertebral kyphosis measured in lumbar fracture X-rays (p = 0.035) and in interobserver reliabilities for local kyphosis measured in thoracic fracture X-rays (p = 0.010). Comparing the two fracture localizations (thoracic and lumbar fractures), significant differences were found only in interobserver reliabilities for the local kyphosis measured in computed tomographies (p = 0.045) and in intraobserver reliabilities for the vertebral kyphosis measured in X-rays (p = 0.024).
“Good” to “excellent” inter- and intraobserver reliabilities for vertebral, segmental and local kyphosis in X-ray make these angles a helpful tool for indicating surgical procedures. For practical use in lateral X-ray, we recommend determining the segmental kyphosis, because this angle has the highest reproducibility. “Good” to “excellent” inter- and intraobserver reliabilities for these three angles were also found in computed tomographies. Therefore, the use of these three angles also seems to be generally possible in computed tomography. Further studies are needed for a direct correlation of the results in lateral X-ray and in computed tomography. PMID:19953277

Learning process for performing and analyzing 3D/4D transperineal ultrasound imaging and interobserver reliability study.

PubMed

Siafarikas, F; Staer-Jensen, J; Braekken, I H; Bø, K; Engh, M Ellström

2013-03-01

To evaluate the learning process for acquiring three- and four-dimensional (3D/4D) transperineal ultrasound volumes of the levator hiatus (LH) dimensions at rest, during pelvic floor muscle (PFM) contraction and on Valsalva maneuver, and for analyzing the ultrasound volumes, as well as to perform an interobserver reliability study between two independent ultrasound examiners. This was a prospective study including 22 women. We monitored the learning process of an inexperienced examiner (IE) performing 3D/4D transperineal ultrasonography and analyzing the volumes. The examination included acquiring volumes during three PFM contractions and three Valsalva maneuvers. LH dimensions were determined in the axial plane. The learning process was documented by estimating agreement between the IE and an experienced examiner (E) using the intraclass correlation coefficient. Agreement was calculated in blocks of 10 ultrasound examinations and analyzed volumes. After the learning process was complete the interobserver reliability for the technique was calculated between these two independent examiners. For offline analysis of the first 10 ultrasound volumes obtained by E, good to very good agreement between E and IE was achieved for all LH measurements except for the left and right levator-urethra gap and pubic arc. For the next 10 analyzed volumes, agreement improved for all LH measurements. Volumes that had been obtained by IE and E were then re-evaluated by IE, and good to very good agreement was found for all LH measurements indicating consistency in volume acquisition. The interobserver reliability study showed excellent ICC values (ICC, 0.81-0.97) for all LH measurements except the pubic arc (ICC = 0.67). 3D/4D transperineal ultrasound is a reliable technique that can be learned in a short period of time. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.
Interobserver reliability of the Young-Burgess and Tile classification systems for fractures of the pelvic ring.

PubMed

Koo, Henry; Leveridge, Mike; Thompson, Charles; Zdero, Rad; Bhandari, Mohit; Kreder, Hans J; Stephen, David; McKee, Michael D; Schemitsch, Emil H

2008-07-01

The purpose of this study was to measure interobserver reliability of 2 classification systems of pelvic ring fractures and to determine whether computed tomography (CT) improves reliability. The reliability of several radiographic findings was also tested. Thirty patients taken from a database at a Level I trauma facility were reviewed. For each patient, 3 radiographs (AP pelvis, inlet, and outlet) and CT scans were available. Six different reviewers (pelvic and acetabular specialist, orthopaedic traumatologist, or orthopaedic trainee) classified the injury according to Young-Burgess and Tile classification systems after reviewing plain radiographs and then after CT scans. The Kappa coefficient was used to determine interobserver reliability of these classification systems before and after CT scan. For plain radiographs, overall Kappa values for the Young-Burgess and Tile classification systems were 0.72 and 0.30, respectively. For CT scan and plain radiographs, the overall Kappa values for the Young-Burgess and Tile classification systems were 0.63 and 0.33, respectively. The pelvis/acetabular surgeons demonstrated the highest level of agreement using both classification systems. For individual questions, the addition of CT did significantly improve reviewer interpretation of fracture stability. The pre-CT and post-CT Kappa values for fracture stability were 0.59 and 0.93, respectively. The CT scan can improve the reliability of assessment of pelvic stability because of its ability to identify anatomical features of injury. The Young-Burgess system may be optimal for the learning surgeon. The Tile classification system is more beneficial for specialists in pelvic and acetabular surgery.
Increasing Reliability of Direct Observation Measurement Approaches in Emotional and/or Behavioral Disorders Research Using Generalizability Theory

ERIC Educational Resources Information Center

Gage, Nicholas A.; Prykanowski, Debra; Hirn, Regina

2014-01-01

Reliability of direct observation outcomes ensures the results are consistent, dependable, and trustworthy. Typically, reliability of direct observation measurement approaches is assessed using interobserver agreement (IOA) and the calculation of observer agreement (e.g., percentage of agreement). However, IOA does not address intraobserver…
Collaborative Policy Making: Vertical Integration in The Homeland Security Enterprise

DTIC Science & Technology

2011-12-01

NEMA ), • International Association Emergency Managers (IAEM), • National Association of Chiefs of Police, International Association of...on application of normative principles to the facts and evidence accumulated by decision makers—and will show why other alternative courses of
Test-retest and interobserver reliability of quantitative sensory testing according to the protocol of the German Research Network on Neuropathic Pain (DFNS): a multi-centre study.

PubMed

Geber, Christian; Klein, Thomas; Azad, Shahnaz; Birklein, Frank; Gierthmühlen, Janne; Huge, Volker; Lauchart, Meike; Nitzsche, Dorothee; Stengel, Maike; Valet, Michael; Baron, Ralf; Maier, Christoph; Tölle, Thomas; Treede, Rolf-Detlef

2011-03-01

Quantitative sensory testing (QST) is an instrument to assess positive and negative sensory signs, helping to identify mechanisms underlying pathologic pain conditions. In this study, we evaluated the test-retest reliability (TR-R) and the interobserver reliability (IO-R) of QST in patients with sensory disturbances of different etiologies. In 4 centres, 60 patients (37 male and 23 female, 56.4±1.9 years) with lesions or diseases of the somatosensory system were included. QST comprised 13 parameters including detection and pain thresholds for thermal and mechanical stimuli. QST was performed in the clinically most affected test area and a less or unaffected control area in a morning and an afternoon session on 2 consecutive days by examiner pairs (4 QSTs/patient). For both TR-R and IO-R, there were high correlations (r=0.80-0.93) at the affected test area, except for wind-up ratio (TR-R: r=0.67; IO-R: r=0.56) and paradoxical heat sensations (TR-R: r=0.35; IO-R: r=0.44). Mean IO-R (r=0.83, 31% unexplained variance) was slightly lower than TR-R (r=0.86, 26% unexplained variance, P<.05); the difference in variance amounted to 5%. There were no differences between study centres. In a subgroup with an unaffected control area (n=43), reliabilities were significantly better in the test area (TR-R: r=0.86; IO-R: r=0.83) than in the control area (TR-R: r=0.79; IO-R: r=0.71, each P<.01), suggesting that disease-related systematic variance enhances reliability of QST. We conclude that standardized QST performed by trained examiners is a valuable diagnostic instrument with good test-retest and interobserver reliability within 2 days. With standardized training, observer bias is much lower than random variance.
Quantitative sensory testing performed by trained examiners is a valuable diagnostic instrument with good interobserver and test-retest reliability for use in patients with sensory disturbances of different etiologies to help identify mechanisms of neuropathic and non-neuropathic pain. Copyright © 2010 International Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
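The "unexplained variance" figures quoted in this abstract follow from the correlation coefficients: the proportion of variance shared by two measurements is r squared, so the unexplained share is 1 minus r squared. A minimal sketch of that arithmetic, using the r values reported above (the function name is hypothetical):

```python
def unexplained_variance_pct(r):
    """Percentage of variance not shared between two measurements with correlation r."""
    return (1 - r ** 2) * 100

for label, r in [("IO-R", 0.83), ("TR-R", 0.86)]:
    pct = unexplained_variance_pct(r)
    print(f"{label}: r={r:.2f}, unexplained variance = {pct:.0f}%")
```

This reproduces the reported 31% and 26%, and hence the 5% difference in variance.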
From a formal training program in musculoskeletal ultrasound (MSUS) to a high reproducibility for Doppler ultrasound in rheumatoid arthritis.

PubMed

Villota, Orlando; Diaz, Mario; Ceron, Carmen; Moller, Ingrid; Naredo, Esperanza; Saaibi, Diego Luis

2017-07-28

To assess the intra- and inter-observer reliability of ultrasound (US) in scoring B-mode, Doppler synovitis and combined B-mode and Doppler synovitis scores in different peripheral joints of rheumatoid arthritis (RA) patients. Four rheumatologists with formal training in musculoskeletal US (MSKUS), particularly focused on definitions and scoring of synovitis on B-mode and Doppler mode, participated in a patient-based reliability exercise on 16 active RA patients. The four rheumatologists independently and consecutively performed a B-mode and power Doppler (PD) US assessment of 7 joints of each patient in two rounds in a blinded fashion. Each joint was semiquantitatively scored from 0 to 3 for B-mode synovitis (BS), Doppler synovitis (DS), and combined B-mode/Doppler synovitis (CS). Intraobserver reliability was assessed by Cohen's κ. Interobserver reliability was assessed by unweighted Light's κ. The mean prevalence of synovitis on B-mode was 83% of joints, with scores ranging from grade 1 in 18% of joints to grade 3 in 33%. In 55% of joints synovial PD signal was detected, and the distribution of scores ranged from 14% of joints for grade 3 to 26% for grade 2. After a total of 448 joints scanned with 896 acquired images, our intraobserver and interobserver reliability was good to excellent for most of the joints. Formal, structured and continuous training in musculoskeletal ultrasound would bring a good to excellent reproducibility in rheumatological hands with a high reliability in real time acquisition BS, DS and CS modalities for scoring synovitis in patients with active rheumatoid arthritis. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
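For readers unfamiliar with the two statistics named in this record, the following is a minimal sketch of unweighted Cohen's κ for two raters and of Light's κ, taken here as the mean of Cohen's κ over all rater pairs. The ratings are hypothetical 0 to 3 grades invented for illustration; they are not data from the study, and the function names are our own.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of each rater's marginal proportions.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Undefined (division by zero) if both raters use a single identical category.
    return (observed - expected) / (1 - expected)


def light_kappa(ratings):
    """Light's kappa: mean of Cohen's kappa over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical grades (0-3) from four raters on the same eight joints.
raters = [
    [0, 1, 2, 3, 3, 2, 1, 0],
    [0, 1, 2, 3, 3, 2, 1, 1],
    [0, 1, 2, 3, 2, 2, 1, 0],
    [0, 2, 2, 3, 3, 2, 1, 0],
]
print(f"Light's kappa across four raters: {light_kappa(raters):.2f}")
```

Averaging pairwise κ is one common way to extend a two-rater statistic to a panel; it treats every rater pair equally and does not weight by the size of a disagreement.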
Radiologic analysis of hindfoot alignment: Comparison of Méary, long axial, and hindfoot alignment views.

PubMed

Neri, T; Barthelemy, R; Tourné, Y

2017-12-01

Among radiographic views available for assessing hindfoot alignment, the antero-posterior weight-bearing view with metal cerclage of the hindfoot (Méary view) is the most widely used in France. Internationally, the long axial view (LAV) and hindfoot alignment view (HAV) are also used. The objective of this study was to compare the reliability of these three views. The Méary view with cerclage of the hindfoot is as reliable as the LAV and HAV for assessing hindfoot alignment. All three views were obtained in each of 22 prospectively included patients. Intra-observer and inter-observer reliabilities were assessed by having two observers collect the radiographic measurements then computing the intra-class correlation coefficients (ICCs). The intra-observer and inter-observer ICCs were 0.956 and 0.988 with the Méary view, 0.990 and 0.765 with the HAV, and 0.997 and 0.991 with the LAV, respectively. Correlations were far stronger between the LAV and HAV than between each of these and the Méary view. Compared to the LAV and HAV, the Méary view indicated a greater degree of hindfoot valgus. Intra-observer reliability was excellent with both the LAV and HAV, whereas inter-observer reliability was better with the LAV. Excellent reliability was also obtained with the Méary view. Combining the Méary view to obtain a radiographic image of the clinical deformity with the LAV to measure the angular deviation of the hindfoot axis may be useful when assessing hindfoot malalignment. A comparison of the three views in a larger population is needed before clinical recommendations can be made. II, prospective study. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Accuracy and reliability testing of two methods to measure internal rotation of the glenohumeral joint.

PubMed

Hall, Justin M; Azar, Frederick M; Miller, Robert H; Smith, Richard; Throckmorton, Thomas W

2014-09-01

We compared accuracy and reliability of a traditional method of measurement (most cephalad vertebral spinous process that can be reached by a patient with the extended thumb) to estimates made with the shoulder in abduction to determine if there were differences between the two methods. Six physicians with fellowship training in sports medicine or shoulder surgery estimated measurements in 48 healthy volunteers. Three were randomly chosen to make estimates of both internal rotation measurements for each volunteer. An independent observer made objective measurements on lateral scoliosis films (spinous process method) or with a goniometer (abduction method). Examiners were blinded to objective measurements as well as to previous estimates. Intraclass coefficients for interobserver reliability for the traditional method averaged 0.75, indicating good agreement among observers. The difference in vertebral level estimated by the examiner and the actual radiographic level averaged 1.8 levels. The intraclass coefficient for interobserver reliability for the abduction method averaged 0.81 for all examiners, indicating near-perfect agreement. Confidence intervals indicated that estimates were an average of 8° different from the objective goniometer measurements. Pearson correlation coefficients of intraobserver reliability for the abduction method averaged 0.94, indicating near-perfect agreement within observers. Confidence intervals demonstrated repeated estimates between 5° and 10° of the original. Internal rotation estimates made with the shoulder abducted demonstrated interobserver reliability superior to that of spinous process estimates, and reproducibility was high. On the basis of this finding, we now take glenohumeral internal rotation measurements with the shoulder in abduction and use a goniometer to maximize accuracy and objectivity. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
[LiLa classification for paediatric long bone fractures. Intraobserver and interobserver reliability].

PubMed

Kamphaus, A; Rapp, M; Wessel, L M; Buchholz, M; Massalme, E; Schneidmüller, D; Roeder, C; Kaiser, M M

2015-04-01

There are two child-specific fracture classification systems for long bone fractures: the AO classification of pediatric long-bone fractures (PCCF) and the LiLa classification of pediatric fractures of long bones (LiLa classification). Both are still not widely established in comparison to the adult AO classification for long bone fractures. During a period of 12 months all long bone fractures in children were documented and classified according to the LiLa classification by experts and non-experts. Intraobserver and interobserver reliability were calculated according to Cohen (kappa). A total of 408 fractures were classified. The intraobserver reliability for location in the skeletal and bone segment showed an almost perfect agreement (K = 0.91-0.95) and also the morphology (joint/shaft fracture) (K = 0.87-0.93). Due to different judgment of the fracture displacement in the second classification round, the intraobserver reliability of the whole classification revealed moderate agreement (K = 0.53-0.58). Interobserver reliability showed moderate agreement (K = 0.55) often due to the low quality of the X-rays. Further differences occurred due to difficulties in assigning the precise transition from metaphysis to diaphysis. The LiLa classification is suitable and in most cases user-friendly for classifying long bone fractures in children. Reliability is higher than in established fracture specific classifications and comparable to the AO classification of pediatric long bone fractures. Some mistakes were due to a low quality of the X-rays and some due to difficulties to classify the fractures themselves. Improvements include a more precise definition of the metaphysis and the kind of displacement. Overall the LiLa classification should still be considered as an alternative for classifying pediatric long bone fractures.
Reliability of Two Smartphone Applications for Radiographic Measurements of Hallux Valgus Angles.

PubMed

Mattos E Dinato, Mauro Cesar; Freitas, Marcio de Faria; Milano, Cristiano; Valloto, Elcio; Ninomiya, André Felipe; Pagnano, Rodrigo Gonçalves

The objective of the present study was to assess the reliability of 2 smartphone applications compared with the traditional goniometer technique for measurement of radiographic angles in hallux valgus and the time required for analysis with the different methods. The radiographs of 31 patients (52 feet) with a diagnosis of hallux valgus were analyzed. Four observers, 2 with >10 years' experience in foot and ankle surgery and 2 in-training surgeons, measured the hallux valgus angle and intermetatarsal angle using a manual goniometer technique and 2 smartphone applications (Hallux Angles and iPinPoint). The interobserver and intermethod reliability were estimated using intraclass correlation coefficients (ICCs), and the time required for measurement of the angles among the 3 methods was compared using the Friedman test. A very good or good interobserver reliability was found among the 4 observers measuring the hallux valgus angle and intermetatarsal angle using the goniometer (ICC 0.913 and 0.821, respectively) and iPinPoint (ICC 0.866 and 0.638, respectively). Using the Hallux Angles application, a very good interobserver reliability was found for measurements of the hallux valgus angle (ICC 0.962) and intermetatarsal angle (ICC 0.935) only among the more experienced observers. The time required for the measurements was significantly shorter for the measurements using both smartphone applications compared with the goniometer method. One smartphone application (iPinPoint) was reliable for measurements of the hallux valgus angles by either experienced or nonexperienced observers. The use of these tools might save time in the evaluation of radiographic angles in the hallux valgus. Copyright © 2016 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Evaluation of Fracture and Osteotomy Union in the Setting of Osteogenesis Imperfecta: Reliability of the Modified Radiographic Union Score for Tibial Fractures (RUST).

PubMed

Franzone, Jeanne M; Finkelstein, Mark S; Rogers, Kenneth J; Kruse, Richard W

2017-09-08

Evaluation of the union of osteotomies and fractures in patients with osteogenesis imperfecta (OI) is a critical component of patient care. Studies of the OI patient population have so far used varied criteria to evaluate bony union. The radiographic union score for tibial fractures (RUST), which was subsequently revised to the modified RUST, is an objective standardized method of evaluating fracture healing. We sought to evaluate the reliability of the modified RUST in the setting of the tibias of patients with OI. Tibial radiographs of 30 patients with OI who had fractures or osteotomies were scored by 3 observers on 2 separate occasions. Each of the 4 cortices was given a score (1=no callus, 2=callus present, 3=bridging callus, and 4=remodeled, fracture not visible) and the modified RUST is the sum of these scores (range, 4 to 16). The interobserver and intraobserver reliabilities were evaluated using intraclass coefficients (ICC) with 95% confidence intervals. The ICC representing the interobserver reliability for the first iteration of scores was 0.926 (0.864 to 0.962) and for the second series was 0.915 (0.845 to 0.957). The ICCs representing the intraobserver reliability for each of the 3 reviewers for the measurements in series 1 and 2 were 0.860 (0.707 to 0.934), 0.994 (0.986 to 0.997), and 0.974 (0.946 to 0.988). The modified RUST has excellent interobserver and intraobserver reliability in the setting of OI despite challenges related to the poor quality of the bone and its dysplastic nature. The application and routine use of the modified RUST in the OI population will help standardize our evaluation of osteotomy and fracture healing. Level III-retrospective study of nonconsecutive patients.
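The scoring rule described in this record (four cortices, each scored 1 to 4, summed to a total between 4 and 16) can be written down directly. The sketch below encodes only what the abstract states; the function and constant names are hypothetical, and nothing here reflects how the authors recorded their scores.

```python
# Per-cortex scores as listed in the abstract.
CORTEX_SCORES = {
    1: "no callus",
    2: "callus present",
    3: "bridging callus",
    4: "remodeled, fracture not visible",
}


def modified_rust(cortex_scores):
    """Sum the four per-cortex scores into a modified RUST total (4 to 16)."""
    if len(cortex_scores) != 4:
        raise ValueError("exactly four cortices must be scored")
    for score in cortex_scores:
        if score not in CORTEX_SCORES:
            raise ValueError(f"invalid cortex score: {score!r}")
    return sum(cortex_scores)


# Hypothetical radiograph: two cortices bridged, one with callus, one remodeled.
print(modified_rust([3, 3, 2, 4]))
```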
Prospective Analysis of Surgical Bone Margins After Partial Foot Amputation in Diabetic Patients Admitted With Moderate to Severe Foot Infections.

PubMed

Schmidt, Brian M; McHugh, Jonathan B; Patel, Rajiv M; Wrobel, James S

2018-04-01

Osteomyelitis is common in diabetic foot infections and medical management can lead to poor outcomes. Surgical management involves sending histopathologic and microbiologic specimens, which guide future intervention. We examined the effect of obtainment of surgical margins in patients undergoing forefoot amputations to identify patient characteristics associated with outcomes. Secondary aims included evaluating interobserver reliability of histopathologic data at both the distal-to and proximal-to surgical bone margin. Data were prospectively collected on 72 individuals and were pooled for analysis. A standardized method to retrieve intraoperative bone margins was established. A univariate analysis was performed. Negative outcomes, including major lower extremity amputation, wound dehiscence, reulceration, reamputation, or death, were recorded. Viable proximal margins were obtained in 63 out of 72 cases (87.5%). Strong interobserver reliability of histopathology was recorded. Univariate analysis demonstrated that preoperative platelets, albumin, probe-to-bone testing, absolute toe pressures, and smaller wound surface area were associated with obtaining viable margins. Residual osteomyelitis resulted in readmission 2.6 times more often and more postoperative complications. Certain patients were significantly different in the viable margin group versus dirty margin group. High interobserver reliability was demonstrated. Obtainment of viable margins resulted in reduced rates of readmission and negative outcomes. Prognostic, Level I: Prospective.
Assessment of four midcarpal radiologic determinations.

PubMed

Cho, Mickey S; Battista, Vincent; Dubin, Norman H; Pirela-Cruz, Miguel

2006-03-01

Several radiologic measurement methods have been described for determining static carpal alignment of the wrist. These include the scapholunate, radiolunate, and capitolunate angles. The triangulation method is an alternative radiologic measurement which we believe is easier to use and more reproducible and reliable than the above mentioned methods. The purpose of this study is to assess the intraobserver reproducibility and interobserver reliability of the triangulation method, scapholunate, radiolunate, and capitolunate angles. Twenty orthopaedic residents and staff at varying levels of training made four radiologic measurements including the scapholunate, radiolunate and capitolunate angles as well as the triangulation method on five different lateral, digitized radiographs of the wrist and forearm in neutral radioulnar deviation. Thirty days after the initial measurements, the participants repeated the four radiologic measurements using the same radiographs. The triangulation method had the best intra-and-interobserver agreement of the four methods tested. This agreement was significantly better than the capitolunate and radiolunate angles. The scapholunate angle had the next best intraobserver reproducibility and interobserver reliability. The triangulation method has the best overall observer agreement when compared to the scapholunate, radiolunate, and capitolunate angles in determining static midcarpal alignment. No comment can be made on the validity of the measurements since there is no radiographic gold standard in determining static carpal alignment.
AO Distal Radius Fracture Classification: Global Perspective on Observer Agreement.

PubMed

Jayakumar, Prakash; Teunis, Teun; Giménez, Beatriz Bravo; Verstreken, Frederik; Di Mascio, Livio; Jupiter, Jesse B

2017-02-01

Background The primary objective of this study was to test interobserver reliability when classifying fractures by consensus by AO types and groups among a large international group of surgeons. Secondarily, we assessed the difference in inter- and intraobserver agreement of the AO classification in relation to geographical location, level of training, and subspecialty. Methods A randomized set of radiographic and computed tomographic images from a consecutive series of 96 distal radius fractures (DRFs), treated between October 2010 and April 2013, was classified using an electronic web-based portal by an invited group of participants on two occasions. Results Interobserver reliability was substantial when classifying AO type A fractures but fair and moderate for type B and C fractures, respectively. No difference was observed by location, except for an apparent difference between participants from India and Australia classifying type B fractures. No statistically significant associations were observed comparing interobserver agreement by level of training and no differences were shown comparing subspecialties. Intra-rater reproducibility was "substantial" for fracture types and "fair" for fracture groups with no difference accounting for location, training level, or specialty. Conclusion Improved definition of reliability and reproducibility of this classification may be achieved using large international groups of raters, empowering decision making on which system to utilize. Level of Evidence Level III.
AO Distal Radius Fracture Classification: Global Perspective on Observer Agreement

PubMed Central

Jayakumar, Prakash; Teunis, Teun; Giménez, Beatriz Bravo; Verstreken, Frederik; Di Mascio, Livio; Jupiter, Jesse B.

2016-01-01

Background The primary objective of this study was to test interobserver reliability when classifying fractures by consensus by AO types and groups among a large international group of surgeons. Secondarily, we assessed the difference in inter- and intraobserver agreement of the AO classification in relation to geographical location, level of training, and subspecialty. Methods A randomized set of radiographic and computed tomographic images from a consecutive series of 96 distal radius fractures (DRFs), treated between October 2010 and April 2013, was classified using an electronic web-based portal by an invited group of participants on two occasions. Results Interobserver reliability was substantial when classifying AO type A fractures but fair and moderate for type B and C fractures, respectively. No difference was observed by location, except for an apparent difference between participants from India and Australia classifying type B fractures. No statistically significant associations were observed comparing interobserver agreement by level of training and no differences were shown comparing subspecialties. Intra-rater reproducibility was “substantial” for fracture types and “fair” for fracture groups with no difference accounting for location, training level, or specialty. Conclusion Improved definition of reliability and reproducibility of this classification may be achieved using large international groups of raters, empowering decision making on which system to utilize. Level of Evidence Level III PMID:28119795
The Music Attentiveness Screening Assessment, Revised (MASA-R): A Study of Technical Adequacy.

PubMed

Waldon, Eric G; Lesser, Alexander; Weeden, Lydia; Messick, Emily

2016-01-01

Evidence suggests that attention is an important consideration when designing procedural support interventions for children undergoing distressing medical procedures. As such, the extent to which children can attend to musical stimuli used during music-based procedural support interventions would seem important. The Music Attentiveness Screening Assessment (MASA) was designed to assess a child's ability to attend to musical stimuli, but further revisions were deemed necessary to improve administration, test-retest reliability, and interobserver agreement for the measure's items. This study investigated the technical adequacy of the Music Attentiveness Screening Assessment, Revised (MASA-R), with a non-clinical sample of children aged 4 to 9 years by examining (a) Construct validity using comparator instruments measuring auditory attention; (b) Test-retest reliability following a two-week delay; and (c) Interobserver agreement when administered by two independent examiners. This non-clinical sample included 69 children who were administered both items from MASA-R and two comparator instruments: the Auditory Attention subtest from the NEPSY-II (NII-AA) for children aged 5 to 9 years (n = 47); and the Auditory Attention subtest from the Woodcock-Johnson Tests of Cognitive Abilities, 3rd ed. (WJIII-AA), for children aged 4 years (n = 22). A significant proportion of score variance was shared by both MASA-R items and the comparator measures: R² = .16, F(2, 66) = 6.30, p = .003. MASA-R score estimates with regard to test-retest reliability (Item I, intra-class correlation [ICC] = .88; Item II, ICC = .91) and interobserver agreement (Item I, ICC = .99; Item II, ICC = .98) also fell into acceptable ranges. Estimates of MASA-R score construct validity, test-retest reliability, and interobserver agreement appear improved over its predecessor, MASA.
While findings are promising, additional investigation of its use with a clinical sample is needed before it can be confidently used in pediatrics. © the American Music Therapy Association 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Interobserver and intraobserver variability in the identification of the Lenke classification lumbar modifier in adolescent idiopathic scoliosis.

PubMed

Duong, Luc; Cheriet, Farida; Labelle, Hubert; Cheung, Kenneth M C; Abel, Mark F; Newton, Peter O; McCall, Richard E; Lenke, Lawrence G; Stokes, Ian A F

2009-08-01

Interobserver and intraobserver reliability study for the identification of the Lenke classification lumbar modifier by a panel of experts compared with a computer algorithm. To measure the variability of the Lenke classification lumbar modifier and determine if computer assistance using 3-dimensional spine models can improve the reliability of classification. The lumbar modifier has been proposed to subclassify Lenke scoliotic curve types into A, B, and C on the basis of the relationship between the central sacral vertical line (CSVL) and the apical lumbar vertebra. Landmarks for identification of the CSVL have not been clearly defined, and the reliability of the actual CSVL position and lumbar modifier selection have never been tested independently. Therefore, the value of the lumbar modifier for curve classification remains unknown. The preoperative radiographs of 68 patients with adolescent idiopathic scoliosis presenting a Lenke type 1 curve were measured manually twice by 6 members of the Scoliosis Research Society 3-dimensional classification committee at 6 months interval. Intraobserver and interobserver reliability was quantified using the percentage of agreement and kappa statistics. In addition, the lumbar curve of all subjects was reconstructed in 3-dimension using a stereoradiographic technique and was submitted to a computer algorithm to infer the lumbar modifier according to measurements from the pedicles. Interobserver rates for the first trial showed a mean kappa value of 0.56. Second trial rates were higher with a mean kappa value of 0.64. Intraobserver rates were evaluated at a mean kappa value of 0.69. The computer algorithm was successful in identifying the lumbar curve type and was in agreement with the observers by a proportion up to 93%. Agreement between and within observers for the Lenke lumbar modifier is only moderate to substantial with manual methods. 
Computer assistance with 3-dimensional models of the spine has the potential to decrease this variability.
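Descriptors such as "moderate" and "substantial" in this and neighbouring records match the widely used Landis and Koch benchmarks for κ. The abstract does not name the scale it used, so the mapping below is an assumption offered for orientation only; the function name is hypothetical.

```python
def landis_koch(kappa: float) -> str:
    """Verbal label for a kappa value, assuming the Landis and Koch benchmarks."""
    if kappa < 0:
        return "poor"
    for upper, label in (
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ):
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Mean kappa values reported in the abstract.
for name, k in (("interobserver, trial 1", 0.56),
                ("interobserver, trial 2", 0.64),
                ("intraobserver", 0.69)):
    print(f"{name}: kappa = {k:.2f} ({landis_koch(k)})")
```

On this scale the reported values span the "moderate" and "substantial" bands, which agrees with the wording of the abstract's conclusion.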
Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.

PubMed

Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A

2016-03-01

Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
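The record distinguishes "absolute" from "consistency" coefficients under a two-way random model. As a rough illustration of the difference, the sketch below computes single-measure two-way ICCs from the ANOVA mean squares, following the standard McGraw and Wong formulas. Whether the study reported single-measure or average-measure values is not stated, and the scores used here are hypothetical.

```python
def icc_two_way(scores):
    """Single-measure two-way ICCs for a subjects-by-raters table.

    scores: list of rows (one per subject), each holding one value per rater.
    Returns (absolute_agreement, consistency).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    absolute = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    return absolute, consistency


# Hypothetical scores: rater 2 is always exactly one point above rater 1.
absolute, consistency = icc_two_way([[1, 2], [2, 3], [3, 4]])
print(f"absolute agreement = {absolute:.3f}, consistency = {consistency:.3f}")
```

A constant offset between raters leaves the consistency form untouched but lowers the absolute-agreement form, which is why the two coefficients in a study can differ slightly, as they do in this record (0.990 versus 0.991).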
Diagnosing Femoroacetabular Impingement From Plain Radiographs

PubMed Central

Ayeni, Olufemi R.; Chan, Kevin; Whelan, Daniel B.; Gandhi, Rajiv; Williams, Dale; Harish, Srinivasan; Choudur, Hema; Chiavaras, Mary M.; Karlsson, Jon; Bhandari, Mohit

2014-01-01

Background: A diagnosis of femoroacetabular impingement (FAI) requires careful history and physical examination, as well as an accurate and reliable radiologic evaluation using plain radiographs as a screening modality. Radiographic markers in the diagnosis of FAI are numerous and not fully validated. In particular, reliability in their assessment across health care providers is unclear. Purpose: To determine inter- and intraobserver reliability between orthopaedic surgeons and musculoskeletal radiologists. Study Design: Cohort study (diagnosis); Level of evidence, 3. Methods: Six physicians (3 orthopaedic surgeons, 3 musculoskeletal radiologists) independently evaluated a broad spectrum of FAI pathologies across 51 hip radiographs on 2 occasions separated by at least 4 weeks. Reviewers used 8 common criteria to diagnose FAI, including (1) pistol-grip deformity, (2) size of alpha angle, (3) femoral head-neck offset, (4) posterior wall sign abnormality, (5) ischial spine sign abnormality, (6) coxa profunda abnormality, (7) crossover sign abnormality, and (8) acetabular protrusion. Agreement was calculated using the intraclass correlation coefficient (ICC). Results: When establishing an FAI diagnosis, there was poor interobserver reliability between the surgeons and radiologists (ICC batch 1 = 0.33; ICC batch 2 = 0.15). In contrast, there was higher interobserver reliability within each specialty, ranging from fair to good (surgeons: ICC batch 1 = 0.72; ICC batch 2 = 0.70 vs radiologists: ICC batch 1 = 0.59; ICC batch 2 = 0.74). Orthopaedic surgeons had the highest interobserver reliability when identifying pistol-grip deformities (ICC = 0.81) or abnormal alpha angles (ICC = 0.81). Similarly, radiologists had the highest agreement for detecting pistol-grip deformities (ICC = 0.75). 
Conclusion: These results suggest that surgeons and radiologists agree among themselves, but there is a need to improve the reliability of radiographic interpretations for FAI between the 2 specialties. The observed degree of low reliability may ultimately lead to missed, delayed, or inappropriate treatments for patients with symptomatic FAI. PMID:26535344
Online Studies on Variation in Orthopedic Surgery: Computed Tomography in MPEG4 Versus DICOM Format.

PubMed

Mellema, Jos J; Mallee, Wouter H; Guitton, Thierry G; van Dijk, C Niek; Ring, David; Doornberg, Job N

2017-10-01

The purpose of this study was to compare the observer participation and satisfaction as well as interobserver reliability between two online platforms, Science of Variation Group (SOVG) and Traumaplatform Study Collaborative, for the evaluation of complex tibial plateau fractures using computed tomography in MPEG4 and DICOM format. A total of 143 observers started with the online evaluation of 15 complex tibial plateau fractures via either the SOVG or Traumaplatform Study Collaborative websites using MPEG4 videos or a DICOM viewer, respectively. Observers were asked to indicate the absence or presence of four tibial plateau fracture characteristics and to rate their satisfaction with the evaluation as provided by the respective online platforms. The observer participation rate was significantly higher in the SOVG (MPEG4 video) group compared to that in the Traumaplatform Study Collaborative (DICOM viewer) group (75 and 43%, respectively; P < 0.001). The median observer satisfaction with the online evaluation was seven (range, 0-10) using MPEG4 video compared to six (range, 1-9) using DICOM viewer (P = 0.11). The interobserver reliability for recognition of fracture characteristics in complex tibial plateau fractures was higher for the evaluation using MPEG4 video. In conclusion, observer participation and interobserver reliability for the characterization of tibial plateau fractures was greater with MPEG4 videos than with a standard DICOM viewer, while there was no difference in observer satisfaction. Future reliability studies should account for the method of delivering images.

Inter-observer reliability of radiographic classifications and measurements in the assessment of Perthes' disease.

PubMed

Wiig, Ola; Terjesen, Terje; Svenningsen, Svein

2002-10-01

We evaluated the inter-observer agreement of radiographic methods when evaluating patients with Perthes' disease. The radiographs were assessed at the time of diagnosis and at the 1-year follow-up by local orthopaedic surgeons (O) and 2 experienced pediatric orthopedic surgeons (TT and SS). The Catterall, Salter-Thompson, and Herring lateral pillar classifications were compared, and the femoral head coverage (FHC), center-edge angle (CE-angle), and articulo-trochanteric distance (ATD) were measured in the affected and normal hips. On the primary evaluation, the lateral pillar and Salter-Thompson classifications had a higher level of agreement among the observers than the Catterall classification, but none of the classifications showed good agreement (weighted kappa values between O and SS 0.56, 0.54, 0.49, respectively). Combining Catterall groups 1 and 2 into one group, and groups 3 and 4 into another resulted in better agreement (kappa 0.55) than with the original 4-group system. The agreement was also better (kappa 0.62-0.70) between experienced than between less experienced examiners for all classifications. The femoral head coverage was a more reliable and accurate measure than the CE-angle for quantifying the acetabular covering of the femoral head, as indicated by higher intraclass correlation coefficients (ICC) and smaller inter-observer differences. The ATD showed good agreement in all comparisons and had low interobserver differences. We conclude that all classifications of femoral head involvement are adequate in clinical work if the radiographic assessment is done by experienced examiners. For less experienced examiners, a 2-group classification or the lateral pillar classification is more reliable. For evaluation of containment of the femoral head, FHC is more appropriate than the CE-angle.
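Weighted κ, used in this record for the ordered classifications, gives partial credit when two observers choose adjacent grades. The sketch below uses linear disagreement weights; the abstract does not say whether linear or quadratic weights were applied, so that choice is an assumption, and the example gradings are hypothetical.

```python
def weighted_kappa(a, b, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordered scale.

    categories: the ordered list of possible grades (at least two).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions of (rater A grade, rater B grade).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[index[x]][index[y]] += 1 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # linear weight: 0 on the diagonal, 1 at the extremes
            observed_disagreement += w * obs[i][j]
            expected_disagreement += w * row[i] * col[j]
    return 1 - observed_disagreement / expected_disagreement


# Hypothetical gradings of ten hips on a four-group scale by two observers.
obs_1 = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1]
obs_2 = [1, 2, 3, 3, 2, 3, 4, 3, 2, 2]
print(f"weighted kappa = {weighted_kappa(obs_1, obs_2, [1, 2, 3, 4]):.2f}")
```

With only two categories the weights reduce to 0 and 1, so the statistic coincides with unweighted κ, which is relevant to the 2-group version of the Catterall classification mentioned in the abstract.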
Three-column classification and Schatzker classification: a three- and two-dimensional computed tomography characterisation and analysis of tibial plateau fractures.

PubMed

Patange Subba Rao, Sheethal Prasad; Lewis, James; Haddad, Ziad; Paringe, Vishal; Mohanty, Khitish

2014-10-01

The aim of the study was to evaluate inter-observer reliability and intra-observer reproducibility between the three-column classification and Schatzker classification systems using 2D and 3D CT models. Fifty-two consecutive patients with tibial plateau fractures were evaluated by five orthopaedic surgeons. All patients were classified into Schatzker and three-column classification systems using x-rays and 2D and 3D CT images. The inter-observer reliability was evaluated in the first round and the intra-observer reliability was determined during the second round 2 weeks later. The average intra-observer reproducibility for the three-column classification was from substantial to excellent in all sub classifications, as compared with Schatzker classification. The inter-observer kappa values increased from substantial to excellent in three-column classification and to moderate in Schatzker classification. The average values for three-column classification for all the categories are as follows: (I-III) k2D = 0.718, 95% CI 0.554-0.864, p < 0.0001 and average k3D = 0.874, 95% CI 0.754-0.890, p < 0.0001. For Schatzker classification system, the average values for all six categories are as follows: (I-VI) k2D = 0.536, 95% CI 0.365-0.685, p < 0.0001 and average k3D = 0.552, 95% CI 0.405-0.700, p < 0.0001. The values are statistically significant. Statistically significant inter-observer values in both rounds were noted with the three-column classification, making it statistically an excellent agreement. The intra-observer reproducibility for the three-column classification improved as compared with the Schatzker classification. The three-column classification seems to be an effective way to characterise and classify fractures of tibial plateau.
Development and validation of a paediatric long-bone fracture classification. A prospective multicentre study in 13 European paediatric trauma centres

PubMed Central

2011-01-01

Background The aim of this study was to develop a child-specific classification system for long bone fractures and to examine its reliability and validity on the basis of a prospective multicentre study. Methods Using the sequentially developed classification system, three samples of between 30 and 185 paediatric limb fractures from a pool of 2308 fractures documented in two multicenter studies were analysed in a blinded fashion by eight orthopaedic surgeons, on a total of 5 occasions. Intra- and interobserver reliability and accuracy were calculated. Results The reliability improved with successive simplification of the classification. The final version resulted in an overall interobserver agreement of κ = 0.71 with no significant difference between experienced and less experienced raters. Conclusions In conclusion, the evaluation of the newly proposed classification system resulted in a reliable and routinely applicable system, for which training in its proper use may further improve the reliability. It can be recommended as a useful tool for clinical practice and offers the option for developing treatment recommendations and outcome predictions in the future. PMID:21548939
Intra- and interobserver agreement for fetal cerebral measurements in 3D-ultrasonography.

PubMed

Albers, Maria E W A; Buisman, Erato T I A; Kahn, René S; Franx, Arie; Onland-Moret, N Charlotte; de Heus, Roel

2018-04-10

The aim of this study is to evaluate intra- and interobserver agreement for measurement of intracranial, cerebellar, and thalamic volume with the Virtual Organ Computer-aided AnaLysis (VOCAL) technique in three-dimensional ultrasound images, in comparison to two-dimensional measurements of these brain structures. Three-dimensional ultrasound images of the brains of 80 fetuses at 20-24 weeks' gestational age were obtained from YOUth, a Dutch prospective cohort study. Two observers performed offline measurement of the occipitofrontal diameter, intracranial volume, transcerebellar diameter, cerebellar volume, and thalamic width, area, and volume, independently. VOCAL was used for calculation of the volumes. The two-way random, single measures intraclass correlation coefficient (ICC) was used for analysis of agreement and Bland-Altman plots were configured. Intra- and interobserver agreement was almost perfect for occipitofrontal diameter (intra ICC 0.88, 95% CI 0.82-0.92; inter ICC 0.91, 95% CI 0.85-0.94), intracranial volume (intra ICC 0.96, 95% CI 0.91-0.98; inter ICC 0.97, 95% CI 0.96-0.98) and transcerebellar diameter (intra ICC 0.91, 95% CI 0.86-0.94; inter ICC 0.86, 95% CI 0.78-0.91). For cerebellar volume, the intraobserver agreement was almost perfect (0.85, 95% CI 0.76-0.90), whereas the interobserver agreement was substantial (0.75, 95% CI 0.44-0.88). Agreement was only moderate for thalamic measurements. Bland-Altman plots for the volume measurements are normally distributed with acceptable mean differences and 95% limits of agreement. The intra- and interobserver agreement of the measurement of intracranial and cerebellar volume with VOCAL was almost perfect. These measurements are therefore reliable, and can be used to investigate fetal brain development. Thalamic measurements are not reliable enough. © 2018 Wiley Periodicals, Inc.
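The two-way random, single measures ICC named in this abstract can be illustrated with a minimal sketch. It assumes the absolute-agreement form usually written ICC(2,1) in the Shrout and Fleiss convention; the abstract does not state which software or exact variant the authors used, and the function name `icc_2_1` and the input layout (one row per subject, one column per observer) are illustrative choices, not taken from the study.

```python
def icc_2_1(ratings):
    """Two-way random, single-measures ICC (absolute agreement).

    ratings: list of rows, one row per subject, one column per observer.
    Assumes a complete table (every observer measured every subject).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subject mean square
    ms_cols = ss_cols / (k - 1)                 # between-observer mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because this form penalises systematic offsets between observers, two observers whose values differ by a constant can correlate perfectly and still yield an ICC below 1.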
Intra- and interobserver reliability estimates for identification and grading of upper respiratory tract abnormalities recorded in horses at rest and during overground endoscopy.

PubMed

McGivney, C L; Sweeney, J; David, F; O'Leary, J M; Hill, E W; Katz, L M

2017-07-01

Previous studies support good intra- and interobserver agreements for endoscopic evaluation of various upper respiratory tract (URT) diseases in horses. However, these studies mainly assessed resting endoscopic examination videos and/or focussed on a single URT abnormality. To estimate intra- and interobserver agreement for identification and grading of all URT abnormalities from resting and overground endoscopy (OGE) videos of Thoroughbreds. Blinded, fully crossed design. Resting and OGE URT videos for n = 43 Thoroughbreds were retrospectively chosen based on identification of common URT disorders. The videos were randomly evaluated in duplicate by 4 raters blinded to all information including prior URT disorder(s) diagnosis. Abnormalities were graded using well-described ordinal scales. Intra- and interobserver agreements were estimated using Cohen's weighted κ and Krippendorff's α, respectively. Intraobserver agreement was perfect/nearly perfect for arytenoid symmetry at exercise, epiglottic entrapment and epiglottic retroversion, substantial for arytenoid asymmetry at rest, palatal dysfunction (PD), medial deviation of the aryepiglottic folds (MDAF), pharyngeal mucus and epiglottic grade at exercise and moderate for vocal fold collapse (VFC), ventromedial luxation of the apex of the corniculate process of the arytenoid (VLAC), nasopharyngeal collapse (NPC) and epiglottic grade at rest. Interobserver agreement was substantial for arytenoid symmetry at exercise and PD and moderate for arytenoid asymmetry at rest, MDAF, VLAC and epiglottic entrapment. It was only fair for VFC, epiglottic grade at exercise, epiglottic retroversion, pharyngeal mucus and NPC and poor for epiglottic grade at rest. Sample size was insufficient to allow assessment of the effect of one abnormality on the grading of another abnormality. Observers were consistent in grading URT disorders. 
However, significant disparity in grading existed between observers for some conditions affecting reliability. © 2016 EVJ Ltd.
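For ordinal grades such as those used here, Cohen's weighted κ gives partial credit when two readings differ by only one grade. The following is a minimal sketch under stated assumptions: linear disagreement weights are used (the abstract does not say whether linear or quadratic weights were applied), grades are coded as integers from 0 to `n_categories - 1`, and the function name `weighted_kappa` is hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's weighted kappa with linear disagreement weights.

    rater_a, rater_b: equal-length lists of integer grades in
    0 .. n_categories - 1 (for intraobserver use, the two lists are
    the first and second reading by the same rater).
    Undefined (division by zero) if both lists use a single grade only.
    """
    n = len(rater_a)
    k = n_categories

    # Observed joint proportions of (grade by A, grade by B)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n

    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)   # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * marg_a[i] * marg_b[j]

    return 1.0 - observed_disagreement / expected_disagreement
```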
Intra- and Inter-Observer Reliability of the Trunk Impairment Scale for Children with Cerebral Palsy

ERIC Educational Resources Information Center

Saether, Rannei; Jorgensen, Lone

2011-01-01

Standardized scales to evaluate qualities of trunk movements in children with dysfunction are sparse. An examination of the reliability of scales that may be useful in the clinic is important. The aim of this study was to examine the reliability of the Trunk Impairment Scale (TIS) for children with cerebral palsy (CP). Standardized scales are…
Standardization for Ki-67 Assessment in Moderately Differentiated Breast Cancer. A Retrospective Analysis of the SAKK 28/12 Study

PubMed Central

Varga, Zsuzsanna; Cassoly, Estelle; Li, Qiyu; Oehlschlegel, Christian; Tapia, Coya; Lehr, Hans Anton; Klingbiel, Dirk; Thürlimann, Beat; Ruhstaller, Thomas

2015-01-01

Background Proliferative activity (Ki-67 Labelling Index) in breast cancer increasingly serves as an additional tool in the decision for or against adjuvant chemotherapy in midrange hormone receptor positive breast cancer. Ki-67 Index has been previously shown to suffer from high inter-observer variability especially in midrange (G2) breast carcinomas. In this study we conducted a systematic approach using different Ki-67 assessments on large tissue sections in order to identify the method with the highest reliability and the lowest variability. Materials and Methods Five breast pathologists retrospectively analyzed proliferative activity of 50 G2 invasive breast carcinomas using large tissue sections by assessing Ki-67 immunohistochemistry. Ki-67-assessments were done on light microscopy and on digital images following these methods: 1) assessing five regions, 2) assessing only darkly stained nuclei and 3) considering only condensed proliferative areas (‘hotspots’). An individual review (the first described assessment from 2008) was also performed. The assessments on light microscopy were done by estimating. All measurements were performed three times. Inter-observer and intra-observer reliabilities were calculated using the approach proposed by Eliasziw et al. Clinical cutoffs (14% and 20%) were tested using Fleiss’ Kappa. Results There was a good intra-observer reliability in 5 of 7 methods (ICC: 0.76–0.89). The two highest inter-observer reliability was fair to moderate (ICC: 0.71 and 0.74) in 2 methods (region-analysis and individual-review) on light microscopy. Fleiss’-kappa-values (14% cut-off) were the highest (moderate) using the original recommendation on light-microscope (Kappa 0.58). Fleiss’ kappa values (20% cut-off) were the highest (Kappa 0.48 each) in analyzing hotspots on light-microscopy and digital-analysis. No methodologies using digital-analysis were superior to the methods on light microscope. 
Conclusion Our results show that all methods on light-microscopy for Ki-67 assessment in large tissue sections resulted in a good intra-observer reliability. Region analysis and individual review (the original recommendation) on light-microscopy yielded the highest inter-observer reliability. These results show slight improvement to previously published data on poor-reproducibility and thus might be a practical-pragmatic way for routine assessment of Ki-67 Index in G2 breast carcinomas. PMID:25885288
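Fleiss' kappa, used above to test agreement at the clinical cutoffs, extends chance-corrected agreement to more than two raters. The sketch below is illustrative only: the helper `counts_at_cutoff`, the convention that a score at or above the cutoff counts as "high", and the input layout are assumptions and are not taken from the study.

```python
def counts_at_cutoff(scores_per_case, cutoff):
    """Turn per-case lists of Ki-67 percentages (one value per rater)
    into [n_below, n_at_or_above] counts for a given cutoff."""
    table = []
    for scores in scores_per_case:
        high = sum(1 for s in scores if s >= cutoff)
        table.append([len(scores) - high, high])
    return table


def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts: one row per case, one column per category; each entry is the
    number of raters who chose that category. Every row must sum to the
    same number of raters.
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])

    # Mean observed pairwise agreement across cases
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_cases

    # Agreement expected by chance from the overall category proportions
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_cats)
    ]
    p_exp = sum(p * p for p in p_cat)

    return (p_bar - p_exp) / (1.0 - p_exp)
```

A call such as `fleiss_kappa(counts_at_cutoff(scores, 14))` would then give the agreement at the 14% cutoff for a hypothetical `scores` list holding five readings per carcinoma.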
Measurement of fetal head descent using the 'angle of progression' on transperineal ultrasound imaging is reliable regardless of fetal head station or ultrasound expertise.

PubMed

Dückelmann, A M; Bamberg, C; Michaelis, S A M; Lange, J; Nonnenmacher, A; Dudenhausen, J W; Kalache, K D

2010-02-01

To assess whether ultrasound experience or fetal head station affects the reliability of measurement of fetal head descent using the angle of progression on intrapartum ultrasound images obtained by a single experienced operator, and to determine reliability of measurements when images were acquired by different operators with variable ultrasound experience. One experienced obstetrician performed 44 transperineal ultrasound examinations of women at term and in prolonged second stage of labor with the fetus in the occipitoanterior position. Three midwives without ultrasound experience, three obstetricians with < 5 years' experience and three obstetricians with > 10 years' experience measured fetal head descent based on the angle of progression in the images obtained. The angle of progression was measured by two obstetricians in independent ultrasound examinations of 24 laboring women at term with the fetus in the cephalic position to allow assessment of the reliability of image acquisition. Intraclass correlation coefficients (ICCs) with 95% confidence interval (CI) were used to evaluate interobserver reliability and Bland-Altman analysis was used to assess interobserver agreement. In total, 444 measurements were performed and compared. Interobserver reliability with respect to offline image analysis was substantial (overall ICC, 0.72; 95% CI, 0.63-0.81). ICCs were 0.82 (95% CI, 0.70-0.89), 0.81 (95% CI, 0.71-0.88) and 0.61 (95% CI, 0.43-0.74) for observers with > 10 years', < 5 years' and no ultrasound experience, respectively. There were no significant differences between ICCs among observer groups according to ultrasound experience. Fetal head station did not affect reliability. Bland-Altman analysis indicated reasonable agreement between measurements obtained by two different operators with > 10 years' and < 5 years' ultrasound experience (bias, -1.09 degrees; 95% limits of agreement, -8.76 to 6.58). 
The reliability of measurement of the angle of progression following separate image acquisition by two experienced operators was similar to the reliability of offline image analysis (ICC, 0.86; 95% CI, 0.70-0.93). Measurement of the angle of progression on transperineal ultrasound imaging is reliable regardless of fetal head station or the clinician's level of ultrasound experience.
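The bias and 95% limits of agreement reported from the Bland-Altman analysis can be sketched as follows. This assumes the common form, bias ± 1.96 × SD of the paired differences; the reported limits (-8.76 to 6.58) are symmetric about the reported bias (-1.09), which is consistent with that form, but the abstract does not spell out the calculation. The function name `bland_altman` is illustrative.

```python
import statistics


def bland_altman(measure_a, measure_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the differences A - B; the limits of agreement
    are bias +/- 1.96 times the sample standard deviation of those
    differences.
    """
    diffs = [a - b for a, b in zip(measure_a, measure_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Unlike the ICC, these values are expressed in the measurement's own units (degrees here), so they can be judged directly against a clinically acceptable difference.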
Intra- and Interobserver Reliability of Three Classification Systems for Hallux Rigidus.

PubMed

Dillard, Sarita; Schilero, Christina; Chiang, Sharon; Pham, Peter

2018-04-18

There are over ten classification systems currently used in the staging of hallux rigidus. This results in confusion and inconsistency with radiographic interpretation and treatment. The reliability of hallux rigidus classification systems has not yet been tested. The purpose of this study was to evaluate intra- and interobserver reliability using three commonly used classifications for hallux rigidus. Twenty-one plain radiograph sets were presented to ten ACFAS board-certified foot and ankle surgeons. Each physician classified each radiograph based on clinical experience and knowledge according to the Regnauld, Roukis, and Hattrup and Johnson classification systems. The two-way mixed single-measure consistency intraclass correlation was used to calculate intra- and interrater reliability. The intrarater reliability of individual sets for the Roukis and Hattrup and Johnson classification systems was "fair to good" (Roukis, 0.62±0.19; Hattrup and Johnson, 0.62±0.28), whereas the intrarater reliability of individual sets for the Regnauld system bordered between "fair to good" and "poor" (0.43±0.24). The interrater reliability of the mean classification was "excellent" for all three classification systems. Conclusions Reliable and reproducible classification systems are essential for treatment and prognostic implications in hallux rigidus. In our study, Roukis classification system had the best intrarater reliability. Although there are various classification systems for hallux rigidus, our results indicate that all three of these classification systems show reliability and reproducibility.
Interobserver Reliability of Peripheral Muscle Strength Tests and Short Physical Performance Battery in Patients With Chronic Obstructive Pulmonary Disease: A Prospective Observational Study.

PubMed

Medina-Mirapeix, Francesc; Bernabeu-Mora, Roberto; Llamazares-Herrán, Eduardo; Sánchez-Martínez, Ma Piedad; García-Vidal, José Antonio; Escolar-Reina, Pilar

2016-11-01

To evaluate the interobserver reliability of the Short Physical Performance Battery (SPPB) and hand dynamometry when measuring isometric muscle strength in people with chronic obstructive pulmonary disease (COPD). Reliability study. Each patient was assessed by a pulmonology physician and a physical therapist in 2 separate sessions 7 to 14 days apart (mean, 9.8±0.8d). Each rater was blinded to the other's results. Pneumology unit of a public hospital. Random sample of outpatients with stable COPD (N=30). Not applicable. SPPB and muscle strength (kg) using electronic handgrip and handheld dynamometers. Reliability was assessed with intraclass correlation coefficients (ICCs), standard error of measurement values, and Bland-Altman plots. ICCs were calculated for the SPPB summary score and for its 3 subscales. The ICCs for the overall reliability of the SPPB summary score and for grip and quadriceps strength were .82 (95% confidence interval [CI], .62-.91), .97 (95% CI, .93-.98), and .76 (95% CI, .49-.88), respectively. The standard error of measurement values were .55 points, 1.30kg, and 1.22kg, respectively. The mean differences between the rater's scores were near zero for grip strength and SPPB summary score measures. The ICCs for the SPPB subscales were .84 (95% CI, .66-.92) for the chair subscale, .75 (95% CI, .48-.88) for gait, and .33 (95% CI, -.42 to .68) for balance. Interobserver reliability was good for quadriceps and handgrip dynamometry and for the SPPB summary score and its chair stand and gait speed subscales. Both pulmonary physicians and physical therapists can obtain and exchange the scores. Because the reliability of the balance subscale was questionable, it is better to use the SPPB summary score. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Reliability and validity of a tool to measure the severity of tongue thrust in children: the Tongue Thrust Rating Scale.

PubMed

Serel Arslan, S; Demir, N; Karaduman, A A

2017-02-01

This study aimed to develop a scale called Tongue Thrust Rating Scale (TTRS), which categorised tongue thrust in children in terms of its severity during swallowing, and to investigate its validity and reliability. The study describes the developmental phase of the TTRS and presents its content and criterion-based validity and interobserver and intra-observer reliability. For content validation, seven experts assessed the steps in the scale over two Delphi rounds. Two physical therapists evaluated videos of 50 children with cerebral palsy (mean age, 57·9 ± 16·8 months), using the TTRS to test criterion-based validity, interobserver and intra-observer reliability. The Karaduman Chewing Performance Scale (KCPS) and Drooling Severity and Frequency Scale (DSFS) were used for criterion-based validity. All the TTRS steps were deemed necessary. The content validity index was 0·857. A very strong positive correlation was found between two examinations by one physical therapist, which indicated intra-observer reliability (r = 0·938, P < 0·001). A very strong positive correlation was also found between the TTRS scores of two physical therapists, indicating interobserver reliability (r = 0·892, P < 0·001). There was also a strong positive correlation between the TTRS and KCPS (r = 0·724, P < 0·001) and a very strong positive correlation between the TTRS scores and DSFS (r = 0·822 and r = 0·755; P < 0·001). These results demonstrated the criterion-based validity of the TTRS. The TTRS is a valid, reliable and clinically easy-to-use functional instrument to document the severity of tongue thrust in children. © 2016 John Wiley & Sons Ltd.
Mitotic rate in primary melanoma: interobserver and intraobserver reliability, analyzed using H&E sections and immunohistochemistry.

PubMed

Garbe, Claus; Eigentler, Thomas K; Bauer, Jürgen; Blödorn-Schlicht, Norbert; Cerroni, Lorenzo; Fend, Falko; Hantschke, Markus; Kurschat, Peter; Kutzner, Heinz; Metze, Dieter; Mielke, Volker; Preßler, Harald; Reusch, Michael; Reusch, Ursula; Stadler, Rudolf; Tronnier, Michael; Yazdi, Amir; Metzler, Gisela

2016-09-01

In 2009, the AJCC issued a revised melanoma staging system. In addition to tumor thickness and ulceration, the mitotic rate was introduced as the third major prognostic parameter for the classification of primary cutaneous melanoma. Given that, according to the 2009 AJCC classification, the detection of one or more dermal tumor mitoses leads to an upstaging - from stage Ia to Ib - of melanomas with a tumor thickness of ≤ 1.0 mm, we set out to investigate the reproducibility of this new parameter. In order to assess interobserver reliability, 17 dermatopathologists and pathologists - all well versed in the diagnosis of cutaneous melanoma - analyzed the mitotic rate in 15 thin primary cutaneous melanomas (mean tumor thickness 0.91 mm) using identical slides. Mitotic rates were determined on H&E and phosphohistone H3 (Ser10)-stained samples. Without knowledge of their previous assessment, five of the aforementioned examiners reevaluated the samples after more than one year in order to ascertain intraobserver reliability. Interobserver reliability of the mitotic rate in thin primary melanomas is disappointing and independent of whether H&E or immunohistochemically stained samples are used (kappa value: 0.088 [H&E], 0.154 [IH], respectively). Kappa values improved to 0.345 (H&E) and 0.403 (IH) when using a cutoff of 0/1 vs. 2+ mitoses. Similarly unsatisfactory, kappa values for intraobserver reliability ranged from 0.18 to 0.348, depending on the individual examiner. Given the unsatisfactory reproducibility and large variations in assessing the mitotic rate, it remains a matter of debate whether this diagnostic parameter should play a role in therapeutic decisions. © 2016 Deutsche Dermatologische Gesellschaft (DDG). Published by John Wiley & Sons Ltd.
Lumbar lordosis and sacral slope in lumbar spinal stenosis: standard values and measurement accuracy.

PubMed

Bredow, J; Oppermann, J; Scheyerer, M J; Gundlfinger, K; Neiss, W F; Budde, S; Floerkemeier, T; Eysel, P; Beyer, F

2015-05-01

Radiological study. To assess standard values, intra- and interobserver reliability and reproducibility of sacral slope (SS) and lumbar lordosis (LL) and the correlation of these parameters in patients with lumbar spinal stenosis (LSS). Anteroposterior and lateral X-rays of the lumbar spine of 102 patients with LSS were included in this retrospective, radiologic study. Measurements of SS and LL were carried out by five examiners. Intraobserver correlation and correlation between LL and SS were calculated with Pearson's r linear correlation coefficient and intraclass correlation coefficients (ICC) were calculated for inter- and intraobserver reliability. In addition, patients were examined in subgroups with respect to previous surgery and the current therapy. Lumbar lordosis averaged 45.6° (range 2.5°-74.9°; SD 14.2°), intraobserver correlation was between Pearson r = 0.93 and 0.98. The measurement of SS averaged 35.3° (range 13.8°-66.9°; SD 9.6°), intraobserver correlation was between Pearson r = 0.89 and 0.96. Intraobserver reliability ranged from 0.966 to 0.992 ICC in LL measurements and 0.944-0.983 ICC in SS measurements. There was an interobserver reliability ICC of 0.944 in LL and 0.990 in SS. Correlation between LL and SS averaged r = 0.79. No statistically significant differences were observed between the analyzed subgroups. Manual measurement of LL and SS in patients with LSS on lateral radiographs is easily performed with excellent intra- and interobserver reliability. Correlation between LL and SS is very high. Differences between patients with and without previous decompression were not statistically significant.
Reliability and validity of Edinburgh visual gait score as an evaluation tool for children with cerebral palsy.

PubMed

Del Pilar Duque Orozco, Maria; Abousamra, Oussama; Church, Chris; Lennon, Nancy; Henley, John; Rogers, Kenneth J; Sees, Julieanne P; Connor, Justin; Miller, Freeman

2016-09-01

Assessment of gait abnormalities in cerebral palsy (CP) is challenging, and access to instrumented gait analysis is not always feasible. Therefore, many observational gait analysis scales have been devised. This study aimed to evaluate the interobserver reliability, intraobserver reliability, and validity of Edinburgh visual gait score (EVGS). Video of 30 children with spastic CP were reviewed by 7 raters (10 children each in GMFCS levels I, II, and III, age 6-12 years). Three observers had high level of experience in gait analysis (10+ years), two had medium level (2-5 years) and two had no previous experience (orthopedic fellows). Interobserver reliability was evaluated using percentage of complete agreement and kappa values. Criterion validity was evaluated by comparing EVGS scores with 3DGA data taken from the same video visit. Interobserver agreement was 60-90% and Kappa values were 0.18-0.85 for the 17 items in EVGS. Reliability was higher for distal segments (foot/ankle/knee 63-90%; trunk/pelvis/hip 60-76%), with greater experience (high 66-91%, medium 62-90%, no-experience 41-87%), with more EVGS practice (1st 10 videos 52-88%, last 10 videos 64-97%) and when used with higher functioning children (GMFCS I 65-96%, II 58-90%, III 35-65%). Intraobserver agreement was 64-92%. Agreement between EVGS and 3DGA was 52-73%. We believe that having EVGS as part of the standardized gait evaluation is helpful in optimizing the visual scoring. EVGS can be a supportive tool that adds quantitative data instead of only qualitative assessment to a video only gait evaluation. Copyright © 2016 Elsevier B.V. All rights reserved.
The influence of critical shoulder angle on secondary rotator cuff insufficiency following shoulder arthroplasty.

PubMed

Cerciello, Simone; Monk, Andrew Paul; Visonà, Enrico; Carbone, Stefano; Edwards, Thomas Bradley; Maffulli, Nicola; Walch, Gilles

2017-07-01

Secondary cuff failure after shoulder replacement is disabling and often requires additional surgery. Increased critical shoulder angle (CSA) has been found in patients with cuff tear compared to normal subjects. The interobserver reliability of the CSA and the relationship between CSA and symptomatic secondary cuff failure after shoulder replacement were investigated. Nineteen patients with symptomatic cuff failure after anatomic shoulder replacement (mean FU 45 months) were compared to a control group of 29 patients showing no signs of symptomatic cuff failure (mean FU 105.7 months). The CSA was measured by two blinded surgeons at a mean follow-up of 45 and 105.7 months, respectively. Inter-observer reliability was calculated. The mean CSA in the study group in neutral, internal and external rotations were 33°, 34° and 34°, respectively. Corresponding values in the control group were 32°, 32° and 32°. The interclass correlation coefficient for the whole population between the two examiners were 0.956 (P < 0.01), 0.964 (P < 0.01) and 0.955 (P < 0.01), respectively. There were no significant differences of CSA values between patients who had undergone shoulder replacement and experienced late cuff failure and those in whom the same procedure had been successful. A good inter-observer reliability was found for the CSA method.
Systematic review of methods for quantifying teamwork in the operating theatre

PubMed Central

Marshall, D.; Sykes, M.; McCulloch, P.; Shalhoub, J.; Maruthappu, M.

2018-01-01

Background Teamwork in the operating theatre is becoming increasingly recognized as a major factor in clinical outcomes. Many tools have been developed to measure teamwork. Most fall into two categories: self‐assessment by theatre staff and assessment by observers. A critical and comparative analysis of the validity and reliability of these tools is lacking. Methods MEDLINE and Embase databases were searched following PRISMA guidelines. Content validity was assessed using measurements of inter‐rater agreement, predictive validity and multisite reliability, and interobserver reliability using statistical measures of inter‐rater agreement and reliability. Quantitative meta‐analysis was deemed unsuitable. Results Forty‐eight articles were selected for final inclusion; self‐assessment tools were used in 18 and observational tools in 28, and there were two qualitative studies. Self‐assessment of teamwork by profession varied with the profession of the assessor. The most robust self‐assessment tool was the Safety Attitudes Questionnaire (SAQ), although this failed to demonstrate multisite reliability. The most robust observational tool was the Non‐Technical Skills (NOTECHS) system, which demonstrated both test–retest reliability (P > 0·09) and interobserver reliability (Rwg = 0·96). Conclusion Self‐assessment of teamwork by the theatre team was influenced by professional differences. Observational tools, when used by trained observers, circumvented this.
[Validating the Spanish version of the Nursing Activities Score].

PubMed

Sánchez-Sánchez, M M; Arias-Rivera, S; Fraile-Gamo, M P; Thuissard-Vasallo, I J; Frutos-Vivar, F

2015-01-01

Validating workload scores ensures that they are appropriate for the purpose for which they were developed. To validate the Nursing Activities Score (NAS) Spanish version. Observational and prospective study. 1,045 patients who were admitted to a medical-surgical unit and a serious burns unit in 2006 were included. The nurse in charge assessed patient workloads by Nine Equivalent of Nursing Manpower use Score and NAS. To assess the internal consistency of the measurements of NAS, item-test correlations, Cronbach's α and Cronbach's α corrected by omitting each of the items were calculated. The intraobserver and interobserver reliability were assessed with the intraclass correlation coefficient by viewing recordings and Kappa (interobserver reliability) was estimated. For the analysis of internal validity, a factorial principal components analysis was performed. Convergent validity was assessed using the Spearman correlation coefficient values obtained from the Nine Equivalent of Nursing Manpower use Score and Spanish-NAS scales. For internal consistency, 164 questionnaires were analysed and a Cronbach's α of 0.373 was calculated. The intraclass correlation coefficient for intraobserver reliability estimate was 0.837 (95% CI: 0.466-0.950) and 0.662 (95% CI: 0.033-0.882) for interobserver reliability. The estimated kappa was 0.371. For internal validity, exploratory factor analysis showed that the first item explained 58.9% of the variance of the questionnaire. For convergent validity 1006 questionnaires were included and a Spearman correlation coefficient of 0.746 was observed. The psychometric properties of Spanish-NAS are acceptable. Copyright © 2014 Elsevier España, S.L.U. y SEEIUC. All rights reserved.
Reliability of a rapid hematology stain for sputum cytology

PubMed Central

Gonçalves, Jéssica; Pizzichini, Emilio; Pizzichini, Marcia Margaret Menezes; Steidle, Leila John Marques; Rocha, Cristiane Cinara; Ferreira, Samira Cardoso; Zimmermann, Célia Tânia

2014-01-01

Objective: To determine the reliability of a rapid hematology stain for the cytological analysis of induced sputum samples. Methods: This was a cross-sectional study comparing the standard technique (May-Grünwald-Giemsa stain) with a rapid hematology stain (Diff-Quik). Of the 50 subjects included in the study, 21 had asthma, 19 had COPD, and 10 were healthy (controls). From the induced sputum samples collected, we prepared four slides: two were stained with May-Grünwald-Giemsa, and two were stained with Diff-Quik. The slides were read independently by two trained researchers blinded to the identification of the slides. The reliability for cell counting using the two techniques was evaluated by determining the intraclass correlation coefficients (ICCs) for intraobserver and interobserver agreement. Agreement in the identification of neutrophilic and eosinophilic sputum between the observers and between the stains was evaluated with kappa statistics. Results: In our comparison of the two staining techniques, the ICCs indicated almost perfect interobserver agreement for neutrophil, eosinophil, and macrophage counts (ICC: 0.98-1.00), as well as substantial agreement for lymphocyte counts (ICC: 0.76-0.83). Intraobserver agreement was almost perfect for neutrophil, eosinophil, and macrophage counts (ICC: 0.96-0.99), whereas it was moderate to substantial for lymphocyte counts (ICC = 0.65 and 0.75 for the two observers, respectively). Interobserver agreement for the identification of eosinophilic and neutrophilic sputum using the two techniques ranged from substantial to almost perfect (kappa range: 0.91-1.00). Conclusions: The use of Diff-Quik can be considered a reliable alternative for the processing of sputum samples. PMID:25029648
Construct validity and reliability of the Music Attentiveness Screening Assessment (MASA).

PubMed

Waldon, Eric G; Broadhurst, Emily

2014-01-01

Music as alternate engagement (MAE) can be used effectively to distract children during painful or anxiety-provoking medical procedures. For such interventions to be successful, it would seem important to assess the degree to which a child can attend to musical stimuli. The purposes of this study were as follows: (a) To establish construct validity by determining the extent to which the Music Attentiveness Screening Assessment (MASA) measures auditory attention; and (b) to gather evidence regarding MASA test-retest and inter-observer reliability. The Auditory Attention (AA) subtest from the NEPSY-II (NEPSY, Second Edition) and the two items from MASA were administered to a nonclinical sample of children (N = 50) aged 5 to 9 years. There was a statistically significant proportion of AA score variance shared with MASA (both items), R² = .21, F(2, 47) = 6.34, p = .004. Test-retest reliability on the first MASA item was moderately high (Pearson r = .84) while on the second item it was lower (r = .63). Similarly, interobserver agreement was high for Item I (intraclass correlation coefficient [ICC] = .95) and lower for Item II (ICC = .71). Evidence suggests that MASA measures, at least in part, auditory attention. Despite this finding, a large proportion of unexplained variance remains. Furthermore, reliability estimates (test-retest and interobserver agreement) differ between both items. These findings are discussed with particular attention paid to the ways in which MASA should be revised and further study conducted. © the American Music Therapy Association 2014. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Is computed tomography an accurate and reliable method for measuring total knee arthroplasty component rotation?

PubMed

Figueroa, José; Guarachi, Juan Pablo; Matas, José; Arnander, Magnus; Orrego, Mario

2016-04-01

Computed tomography (CT) is widely used to assess component rotation in patients with poor results after total knee arthroplasty (TKA). The purpose of this study was to simultaneously determine the accuracy and reliability of CT in measuring TKA component rotation. TKA components were implanted in dry-bone models and assigned to two groups. The first group (n = 7) had variable femoral component rotations, and the second group (n = 6) had variable tibial tray rotations. CT images were then used to assess component rotation. Accuracy of CT rotational assessment was determined by mean difference, in degrees, between implanted component rotation and CT-measured rotation. Intraclass correlation coefficient (ICC) was applied to determine intra-observer and inter-observer reliability. Femoral component accuracy showed a mean difference of 2.5° and the tibial tray a mean difference of 3.2°. There was good intra- and inter-observer reliability for both components, with a femoral ICC of 0.8 and 0.76, and tibial ICC of 0.68 and 0.65, respectively. CT rotational assessment accuracy can differ from true component rotation by approximately 3° for each component. It does, however, have good inter- and intra-observer reliability.
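
As an illustration of the intraclass correlation coefficient used in the record above, the following is a minimal Python sketch. It assumes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1); the abstract does not say which ICC form the authors applied. The function name `icc_2_1` and the rotation values are hypothetical and do not come from the study.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject and one column per observer."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), observers (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical component rotations in degrees, measured by two observers.
rotations = [[2.0, 2.5], [5.0, 4.5], [-3.0, -2.0], [8.0, 7.0], [0.5, 1.0]]
icc = icc_2_1(rotations)
print(f"ICC(2,1) = {icc:.2f}")
```

Values near 1 mean that most of the variance comes from real differences between subjects rather than from disagreement between observers.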

[Design and validation of a scale to measure caregiving dedication in caregivers of dependent older people].

PubMed

Serrano-Ortega, Natalia; Frías-Osuna, Antonio; Recio-Gómez, Juan M; Del-Pino-Casado, Rafael

2015-11-01

To develop and validate a scale to measure caregiving dedication regarding activities of daily living in caregivers of dependent older people. Cross-sectional study. Primary Health Care (Andalusia, Spain). A probabilistic sample of 200 caregivers of older relatives from Córdoba, Spain. Content validation by experts, construct validity (by exploratory factor analysis), divergent validity and reliability (internal consistency, test-retest reliability and inter-observer reliability). Cronbach's alpha was 0.86. The intraclass correlation coefficient was 0.96 for test-retest reliability and 0.88 for inter-observer reliability. When the sample was divided into two groups according to perceived burden level (presence and absence), the perceived burden was significantly different in each group (P=.001). The factor analysis revealed only one factor, which explained 64% of the variance. The scale provides a suitable measure of caregiving dedication regarding activities of daily living in caregivers of older people, because it allows quick, easy administration, is well accepted by caregivers, has acceptable psychometric results and includes the frequency of caregiving, the kind of need attended and the dependence level in each need. Copyright © 2014 Elsevier España, S.L.U. All rights reserved.
Validation and cross cultural adaptation of the Italian version of the Harris Hip Score.

PubMed

Dettoni, Federico; Pellegrino, Pietro; La Russa, Massimo R; Bonasia, Davide E; Blonna, Davide; Bruzzone, Matteo; Castoldi, Filippo; Rossi, Roberto

2015-01-01

The Harris Hip Score (HHS) is one of the most widely used health related quality of life (HRQOL) measures for the assessment of hip pathology; in spite of this, neither a validation study nor an official Italian version has been provided yet. The aim of this study was to create a valid and reliable Italian version of the HHS. The score was translated and modified in Italian; then 103 patients with different hip pathologies were evaluated using this HHS version and also with the WOMAC and the SF-12 questionnaires. Content, construct and criterion validities were tested, as were interobserver reliability, test-retest reliability and internal consistency. Cross-cultural adaptation was easy, and only minor adaptation was required in the translation process. Construct and criterion validity of the HHS Italian Version were confirmed by satisfactory values of Spearman's Rho for correlation between specific domains of HHS and WOMAC and SF-12 scores. Interobserver and test-retest reliabilities obtained values of 0.996 and 0.975 respectively; Cronbach's alpha for internal consistency was 0.816. Statistical and clinical analysis showed that HHS is highly valid and reliable in this new Italian version.
Diagnosing paratonia in the demented elderly: reliability and validity of the Paratonia Assessment Instrument (PAI).

PubMed

Hobbelen, Johannes S M; Koopmans, Raymond T C M; Verhey, Frans R J; Habraken, Kitty M; de Bie, Rob A

2008-08-01

Paratonia is one of the associated movement disorders characteristic of dementia. The aim of this study was to develop an assessment tool (the Paratonia Assessment Instrument, PAI), based on the new consensus definition of paratonia. An additional aim was to investigate the reliability and validity of the PAI. A three-phase cross-sectional survey was conducted. In the first two phases, the PAI was developed and validated. In the third phase, the inter-observer reliability and feasibility of the instrument were tested. The original PAI consisted of five criteria that all needed to be met in order to make the diagnosis. On the basis of a qualitative analysis, one criterion was reformulated and another was removed. Following this, inter-observer reliability between the two assessors improved, with Cohen's kappa rising from 0.532 in the initial phase to 0.677 in the second phase. This improvement was substantiated in the third phase by two independent assessors with Cohen's kappa ranging from 0.625 to 1. The PAI is a reliable and valid assessment tool for diagnosing paratonia in elderly people with dementia that can be applied easily in daily practice.
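
To make the Cohen's kappa values in the record above easier to interpret, here is a minimal Python sketch of the statistic for two assessors rating the same patients. The function name `cohens_kappa` and the ratings are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_chance == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical diagnoses: 1 = paratonia present, 0 = absent.
assessor_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
assessor_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
kappa = cohens_kappa(assessor_1, assessor_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why it can be much lower than the share of matching ratings.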
Reliability and reproducibility of several methods of arthroscopic assessment of femoral tunnel position during anterior cruciate ligament reconstruction.

PubMed

Ilahi, Omer A; Mansfield, David J; Urrea, Luis H; Qadeer, Ali A

2014-10-01

To assess interobserver and intraobserver agreement of estimating anterior cruciate ligament (ACL) femoral tunnel positioning arthroscopically using circular and linear (noncircular) estimation methods and to determine whether overlay template visual aids improve agreement. Standardized intraoperative pictures of femoral tunnel pilot holes (taken with a 30° arthroscope through an anterolateral portal at 90° of knee flexion with horizontal being parallel to the tibial surface) in 27 patients undergoing single-bundle ACL reconstruction were presented to 3 fellowship-trained arthroscopists on 2 separate occasions. On both viewings, each surgeon estimated the femoral tunnel pilot hole location to the nearest half-hour mark using a whole clock face and half clock face, to the nearest 15° using a whole compass and half compass, in the top or bottom half of a linear quadrant, and in the top or bottom half of a linear trisector. Evaluations were performed first without and then with an overlay template of each estimation method. The average difference among reviewers was quite similar for all 4 circular methods with the use of visual aids. Without overlay template visual aids, pair-wise κ statistic values for interobserver agreement ranged from -0.14 to 0.56 for the whole clock face and from 0.16 to 0.42 for the half clock face. With overlay visual guides, interobserver agreement ranged from 0.29 to 0.63 for the whole clock face and from 0.17 to 0.66 for the half clock face. The quadrant method's interobserver agreement ranged from 0.22 to 0.60, and that of the trisection method ranged from 0.17 to 0.57. Neither linear estimation method's reliability uniformly improved with the use of overlay templates. Intraobserver agreement without overlay templates ranged from 0.17 to 0.49 for the whole clock face, 0.11 to 0.47 for the half clock face, 0.01 to 0.66 for the quadrant method, and 0.20 to 0.57 for the trisection method. 
Use of overlay templates did not uniformly improve intraobserver agreement for any estimation method. There does not appear to be any advantage of using a half clock face or compass for estimating femoral tunnel position compared with a whole clock-face analogy. Visual reference aids appear to improve interobserver agreement (reliability) of circular analogies. The linear quadrant appears to be the most reliable method (fair to moderate agreement) for estimating femoral tunnel position without a visual aid for reference, but even better reliability, ranging from fair to good agreement, may be obtained by using the whole clock-face analogy with a visual aid. Increasing femoral tunnel position reliability may improve outcomes of ACL reconstruction surgery. Copyright © 2014 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
The validity and reliability of a simple semantic classification of foot posture.

PubMed

Cross, Hugh A; Lehman, Linda

2008-12-01

The Simple Semantic Classification (SSC) is described as a pragmatic method to assist in the assessment of the weight bearing foot. It was designed for application by therapists and technicians working in underdeveloped situations, after they have had basic orientation in foot function. The aim was to present evidence of the validity and inter-observer reliability of the SSC. 13 physiotherapists from LEPRA India projects and 12 physical therapists functioning within the National Programme for the Elimination of Hansen's Disease (PNEH), Brazil, participated in an inter-observer exercise. Inter-observer agreement was gauged using the Kappa statistic. The results of the inter-observer exercise were dependent on observations of foot posture made from photographs. This was necessary to ensure that the procedure was standardised for participants in different countries. The method had limitations which were partly reflected in the results. The level of agreement between the principal investigator and Indian physiotherapists was Kappa = 0.58. The level of agreement between Brazilian physical therapists and the principal investigator was Kappa = 0.70. The authors opine that the results were sufficiently compelling to suggest that the Simple Semantic Classification can be used as a field method to identify people at increased risk of foot pathologies.
Reliability of internal oblique elbow radiographs for measuring displacement of medial epicondyle humerus fractures: a cadaveric study.

PubMed

Gottschalk, Hilton P; Bastrom, Tracey P; Edmonds, Eric W

2013-01-01

Standard elbow radiographs (AP and lateral views) are not accurate enough to measure true displacement of medial epicondyle fractures of the humerus. The amount of perceived displacement has been used to determine treatment options. This study assesses the utility of internal oblique radiographs for measurement of true displacement in these fractures. A medial epicondyle fracture was created in a cadaveric specimen. Displacement of the fragment (mm) was set at 5, 10, and 15 in line with the vector of the flexor pronator mass. The fragment was sutured temporarily in place. Radiographs were obtained at 0 (AP), 15, 30, 45, 60, 75, and 90 degrees (lateral) of internal rotation, with the elbow in set positions of flexion. This was done with and without radio-opaque markers placed on the fragment and fracture bed. The 45 and 60 degrees internal oblique radiographs were then presented to 5 separate reviewers (of different levels of training) to evaluate intraobserver and interobserver agreement. Change in elbow position did not affect the perceived displacement (P=0.82) with excellent intraobserver reliability (intraclass correlation coefficient range, 0.979 to 0.988) and interobserver agreement of 0.953. The intraclass correlation coefficient for intraobserver reliability on 45 degrees internal oblique films for all groups ranged from 0.985 to 0.998, with interobserver agreement of 0.953. For predicting displacement, the observers were 60% accurate in predicting the true displacement on the 45 degrees internal oblique films and only 35% accurate using the 60 degrees internal oblique view. Standardizing to a 45 degrees internal oblique radiograph of the elbow (regardless of elbow flexion) can augment the treating surgeon's ability to determine true displacement. At this degree of rotation, the measured number can be multiplied by 1.4 to better estimate displacement. 
The addition of a 45 degrees internal oblique radiograph in medial humeral epicondyle fractures has good intraobserver and interobserver reliability to more accurately estimate the true displacement of these fractures. Diagnostic study, Level II (Development of diagnostic study with universally applied reference "gold" standard).
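
The correction described in the record above can be written as a one-line calculation. The sketch below is illustrative only: the function name `estimate_true_displacement` is hypothetical, and the geometric reading of the 1.4 factor (1/cos 45°, about 1.414) is an assumption that the abstract does not state.

```python
import math


def estimate_true_displacement(measured_mm, factor=1.4):
    """Scale a displacement measured on a 45 degree internal oblique film."""
    return measured_mm * factor


# Assumption: the factor approximates the foreshortening of a vector
# viewed 45 degrees off axis, i.e. 1 / cos(45 degrees).
geometric_factor = 1 / math.cos(math.radians(45))

measured = 10.0  # hypothetical measurement in mm
print(f"estimated true displacement: {estimate_true_displacement(measured):.1f} mm")
print(f"1 / cos(45 deg) = {geometric_factor:.3f}")
```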
Plateau-patella angle in evaluation of patellar height after total knee arthroplasty.

PubMed

Robin, Brett N; Ellington, Matthew D; Jupiter, Daniel C; Allen, Bryce C

2014-07-01

The plateau-patella angle (PPA) has been proposed as a new and simpler method to describe patellar height. This method has not been used or validated in knees following total knee arthroplasty (TKA). A modified PPA (mPPA) was developed for use in this population. The method was validated by determining the interobserver and intraobserver reliability of the technique in 50 consecutive patients compared to three well-described methods of describing patellar height after TKA. Three observers then evaluated the mPPA of 297 post-operative radiographs to describe a normal range after TKA for a given technique and implant. The interobserver reliability was the highest for the mPPA compared to the other methods. The mean mPPA for the entire cohort was 21.06, 20.49, and 19.94 for the three observers. The modified plateau-patella angle is a reliable way to evaluate patellar height in patients who have undergone total knee arthroplasty. Copyright © 2014 Elsevier Inc. All rights reserved.
The development of a reliable amateur boxing performance analysis template.

PubMed

Thomson, Edward; Lamb, Kevin; Nicholas, Ceri

2013-01-01

The aim of this study was to devise a valid performance analysis system for the assessment of the movement characteristics associated with competitive amateur boxing and assess its reliability using analysts of varying experience of the sport and performance analysis. Key performance indicators to characterise the demands of an amateur contest (offensive, defensive and feinting) were developed and notated using a computerised notational analysis system. Data were subjected to intra- and inter-observer reliability assessment using median sign tests and calculating the proportion of agreement within predetermined limits of error. For all performance indicators, intra-observer reliability revealed non-significant differences between observations (P > 0.05) and high agreement was established (80-100%) regardless of whether exact or the reference value of ±1 was applied. Inter-observer reliability was less impressive for both analysts (amateur boxer and experienced analyst), with the proportion of agreement ranging from 33-100%. Nonetheless, there was no systematic bias between observations for any indicator (P > 0.05), and the proportion of agreement within the reference range (±1) was 100%. A reliable performance analysis template has been developed for the assessment of amateur boxing performance and is available for use by researchers, coaches and athletes to classify and quantify the movement characteristics of amateur boxing.
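
The "proportion of agreement within predetermined limits of error" in the record above can be sketched as follows. This is a minimal Python illustration: the function name `proportion_of_agreement` and the action counts are hypothetical, and the study's own notational software is not reproduced here.

```python
def proportion_of_agreement(counts_a, counts_b, tolerance=0):
    """Share of paired counts that differ by no more than `tolerance`."""
    if len(counts_a) != len(counts_b) or not counts_a:
        raise ValueError("counts must be non-empty and of equal length")
    within = sum(abs(a - b) <= tolerance for a, b in zip(counts_a, counts_b))
    return within / len(counts_a)


# Hypothetical frequencies of six performance indicators coded by two analysts.
boxer = [12, 7, 3, 9, 5, 4]
analyst = [12, 8, 3, 10, 4, 4]

exact = proportion_of_agreement(boxer, analyst)
within_one = proportion_of_agreement(boxer, analyst, tolerance=1)
print(f"exact agreement: {exact:.0%}, agreement within ±1: {within_one:.0%}")
```

Exact agreement can be modest while agreement within the ±1 reference range is complete, which matches the pattern the abstract reports for inter-observer reliability.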
Inter-Observer Reliability of DSM-5 Substance Use Disorders*

PubMed Central

Denis, Cécile M.; Gelernter, Joel; Hart, Amy B.; Kranzler, Henry R.

2015-01-01

Aims: Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence of the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Methods: Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Results: Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. Conclusions: For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. PMID:26048641
Validity of a smartphone protractor to measure sagittal parameters in adult spinal deformity.

PubMed

Kunkle, William Aaron; Madden, Michael; Potts, Shannon; Fogelson, Jeremy; Hershman, Stuart

2017-10-01

Smartphones have become an integral tool in the daily life of health-care professionals (Franko 2011). Their ease of use and wide availability often make smartphones the first tool surgeons use to perform measurements. This technique has been validated for certain orthopedic pathologies (Shaw 2012; Quek 2014; Milanese 2014; Milani 2014), but never to assess sagittal parameters in adult spinal deformity (ASD). This study was designed to assess the validity, reproducibility, precision, and efficiency of using a smartphone protractor application to measure sagittal parameters commonly measured in ASD assessment and surgical planning. This study aimed to (1) determine the validity of smartphone protractor applications, (2) determine the intra- and interobserver reliability of smartphone protractor applications when used to measure sagittal parameters in ASD, (3) determine the efficiency of using a smartphone protractor application to measure sagittal parameters, and (4) elucidate whether a physician's level of experience impacts the reliability or validity of using a smartphone protractor application to measure sagittal parameters in ASD. An experimental validation study was carried out. Thirty standard 36″ standing lateral radiographs were examined. Three separate measurements were performed using a marker and protractor; then at a separate time point, three separate measurements were performed using a smartphone protractor application for all 30 radiographs. The first 10 radiographs were then re-measured two more times, for a total of three measurements from both the smartphone protractor and marker and protractor. The parameters included lumbar lordosis, pelvic incidence, and pelvic tilt. Three raters performed all measurements: a junior-level orthopedic resident, a senior-level orthopedic resident, and a fellowship-trained spinal deformity surgeon. 
All data, including the time to perform the measurements, were recorded, and statistical analysis was performed to determine intra- and interobserver reliability, as well as accuracy, efficiency, and precision. Intra- and interclass correlation coefficients were calculated using R (version 3.3.2, 2016) to determine the degree of intra- and interobserver reliability. High rates of intra- and interobserver reliability were observed between the junior resident, senior resident, and attending surgeon when using the smartphone protractor application as demonstrated by high inter- and intra-class correlation coefficients greater than 0.909 and 0.874 respectively. High rates of inter- and intraobserver reliability were also seen between the junior resident, senior resident, and attending surgeon when a marker and protractor were used as demonstrated by high inter- and intra-class correlation coefficients greater than 0.909 and 0.807 respectively. The lumbar lordosis, pelvic incidence, and pelvic tilt values were accurately measured by all three raters, with excellent inter- and intra-class correlation coefficient values. When the first 10 radiographs were re-measured at different time points, a high degree of precision was noted. Measurements performed using the smartphone application were consistently faster than using a marker and protractor; this difference reached statistical significance (p<.05). Adult spinal deformity radiographic parameters can be measured accurately, precisely, reliably, and more efficiently using a smartphone protractor application than with a standard protractor and wax pencil. A high degree of intra- and interobserver reliability was seen between the residents and attending surgeon, indicating measurements made with a smartphone protractor are unaffected by an observer's level of experience. As a result, smartphone protractors may be used when planning ASD surgery. Copyright © 2017 Elsevier Inc. All rights reserved.
Influences of Response Rate and Distribution on the Calculation of Interobserver Reliability Scores

ERIC Educational Resources Information Center

Rolider, Natalie U.; Iwata, Brian A.; Bullock, Christopher E.

2012-01-01

We examined the effects of several variations in response rate on the calculation of total, interval, exact-agreement, and proportional reliability indices. Trained observers recorded computer-generated data that appeared on a computer screen. In Study 1, target responses occurred at low, moderate, and high rates during separate sessions so that…
Lenke and King classification systems for adolescent idiopathic scoliosis: interobserver agreement and postoperative results

PubMed Central

Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali

2011-01-01

Purpose: The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. Methods: The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. Results: A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Conclusion: Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification’s priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method. PMID:22267934
Lenke and King classification systems for adolescent idiopathic scoliosis: interobserver agreement and postoperative results.

PubMed

Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali

2011-01-01

The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification's priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method.
Major influence of interobserver reliability on polytrauma identification with the Injury Severity Score (ISS): Time for a centralised coding in trauma registries?

PubMed

Maduz, Roman; Kugelmeier, Patrick; Meili, Severin; Döring, Robert; Meier, Christoph; Wahl, Peter

2017-04-01

The Abbreviated Injury Scale (AIS) and the Injury Severity Score (ISS) find increasingly widespread use to assess trauma burden and to perform interhospital benchmarking through trauma registries. Since 2015, public resource allocation in Switzerland shall even be derived from such data. As every trauma centre is responsible for its own coding and data input, this study aims at evaluating interobserver reliability of AIS and ISS coding. Interobserver reliability of the AIS and ISS is analysed from a cohort of 50 consecutive severely injured patients treated in 2012 at our institution, coded retrospectively by 3 independent and specifically trained observers. Considering a cutoff ISS≥16, only 38/50 patients (76%) were uniformly identified as polytraumatised or not. With the cutoff raised to ≥20, this increased to 41/50 patients (82%). A difference in the AIS of ≥1 was present in 261 (16%) of possible codes. Excluding the vast majority of uninjured body regions, uniformly identical AIS severity values were attributed in 67/193 (35%) body regions, or 318/579 (55%) possible observer pairings. Injury severity all too often is neither identified correctly nor consistently when using the AIS. This leads to wrong identification of severely injured patients using the ISS. Improving consistency of coding through centralisation is recommended before scores based on the AIS are to be used for interhospital benchmarking and resource allocation in the treatment of severely injured patients. Copyright © 2017. Published by Elsevier Ltd.
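
The agreement figures in the record above follow from simple counting, which the Python sketch below reproduces. The helper names (`uniformly_classified`, `observer_pairings`, `percent`) and the example ISS triples are hypothetical; only the counts 38/50, 41/50, 67/193 and 318/579 come from the abstract.

```python
def uniformly_classified(iss_by_observer, cutoff=16):
    """True if all observers place the patient on the same side of the cutoff."""
    flags = {score >= cutoff for score in iss_by_observer}
    return len(flags) == 1


def observer_pairings(observers):
    """Number of distinct observer pairs, n * (n - 1) / 2."""
    return observers * (observers - 1) // 2


def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# Hypothetical ISS values assigned to three patients by three coders.
patients = [(17, 22, 18), (14, 17, 16), (9, 10, 13)]
uniform = [uniformly_classified(p) for p in patients]

# Counts reported in the abstract: 193 injured body regions, 3 observers.
total_pairings = 193 * observer_pairings(3)
print(uniform, total_pairings)
print(percent(38, 50), percent(41, 50), percent(67, 193), percent(318, 579))
```

With three observers there are three pairings per body region, which is how 193 regions give 579 possible pairings.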
Inter-Observer Agreement of Whole-Body Computed Tomography in Staging and Response Assessment in Lymphoma: The Lugano Classification.

PubMed

Razek, Ahmed Abdel Khalek Abdel; Shamaa, Sameh; Lattif, Mahmoud Abdel; Yousef, Hanan Hamid

2017-01-01

To assess inter-observer agreement of whole-body computed tomography (WBCT) in staging and response assessment in lymphoma according to the Lugano classification. Retrospective analysis was conducted of 115 consecutive patients with lymphomas (45 females, 70 males; mean age of 46 years). Patients underwent WBCT with a 64 multi-detector CT device for staging and response assessment after a complete course of chemotherapy. Image analysis was performed by 2 reviewers according to the Lugano classification for staging and response assessment. The overall inter-observer agreement of WBCT in staging of lymphoma was excellent (κ = 0.90, percent agreement = 94.9%). There was an excellent inter-observer agreement for stage I (κ = 0.93, percent agreement = 96.4%), stage II (κ = 0.90, percent agreement = 94.8%), stage III (κ = 0.89, percent agreement = 94.6%) and stage IV (κ = 0.88, percent agreement = 94%). The overall inter-observer agreement in response assessment after a complete course of treatment was excellent (κ = 0.91, percent agreement = 95.8%). There was an excellent inter-observer agreement in progressive disease (κ = 0.94, percent agreement = 97.1%), stable disease (κ = 0.90, percent agreement = 95%), partial response (κ = 0.96, percent agreement = 98.1%) and complete response (κ = 0.87, percent agreement = 93.3%). We concluded that WBCT is a reliable and reproducible imaging modality for staging and treatment assessment in lymphoma according to the Lugano classification.
Power Doppler ultrasound of rheumatoid synovitis: quantification of vascular signal and analysis of interobserver variability.

PubMed

Kamishima, Tamotsu; Tanimura, Kazuhide; Henmi, Mihoko; Narita, Akihiro; Sakamoto, Fumihiko; Terae, Satoshi; Shirato, Hiroki

2009-05-01

The objective of this study was to assess interobserver uncertainties in power Doppler (PD) examination of the fingers of patients with rheumatoid arthritis (RA), by separating the source of the discrepancy into (1) acquisition of the images and (2) criteria for assessment of the images. Twenty patients who had been diagnosed with RA were enrolled in this study. Ultrasound examinations were performed by one inexperienced and two experienced sonographers. Interobserver variation was measured using a conventional semiquantitative image grading scale. Interobserver variation of the quantitative PD (QPD) index (the summation of the colored pixels in a region of interest) was also assessed. The agreement was higher between the two experienced sonographers (kappa value of 0.8) than between experienced and inexperienced sonographers (kappa value, 0.6-0.7) in the semiquantitative image grading scale. Results suggest that the difference in the assessment on the image grading scale was due more to the difference in the acquisition of the images than to variations in the grading criteria between sonographers. An excellent relationship was noted between the image grading scale and the QPD index for Doppler signal with a Spearman's coefficient of rank correlation of 0.83 (P < 0.0001). Interobserver discrepancies in the image grading and QPD index methods were due more to the difference in the acquisition of the image than to the grading criteria used. The QPD index seems to be as reliable as the image grading scale with reasonable interobserver agreement between experienced sonographers.
Orofacial Pain during Mastication in People with Dementia: Reliability Testing of the Orofacial Pain Scale for Non-Verbal Individuals.

PubMed

de Vries, Merlijn W; Visscher, Corine; Delwel, Suzanne; van der Steen, Jenny T; Pieper, Marjoleine J C; Scherder, Erik J A; Achterberg, Wilco P; Lobbezoo, Frank

2016-01-01

Objectives. The aim of this study was to establish the reliability of the "chewing" subscale of the OPS-NVI, a novel tool designed to estimate presence and severity of orofacial pain in nonverbal patients. Methods. The OPS-NVI consists of 16 items for observed behavior, classified into four categories and a subjective estimate of pain. Two observers used the OPS-NVI for 237 video clips of people with dementia in Dutch nursing homes during their meal to observe their behavior and to estimate the intensity of orofacial pain. Six weeks later, the same observers rated the video clips a second time. Results. Bottom and ceiling effects for some items were found. This resulted in exclusion of these items from the statistical analyses. The categories which included the remaining items (n = 6) showed reliability varying between fair-to-good and excellent (interobserver reliability, ICC: 0.40-0.47; intraobserver reliability, ICC: 0.40-0.92). Conclusions. The "chewing" subscale of the OPS-NVI showed a fair-to-good to excellent interobserver and intraobserver reliability in this dementia population. This study contributes to the validation process of the OPS-NVI as a whole and stresses the need for further assessment of the reliability of the OPS-NVI with subjects that might already show signs of orofacial pain.
Reliability of anthropometric measurements in European preschool children: the ToyBox-study.

PubMed

De Miguel-Etayo, P; Mesana, M I; Cardon, G; De Bourdeaudhuij, I; Góźdź, M; Socha, P; Lateva, M; Iotova, V; Koletzko, B V; Duvinage, K; Androutsos, O; Manios, Y; Moreno, L A

2014-08-01

The ToyBox-study aims to develop and test an innovative and evidence-based obesity prevention programme for preschoolers in six European countries: Belgium, Bulgaria, Germany, Greece, Poland and Spain. In multicentre studies, anthropometric measurements using standardized procedures that minimize errors in the data collection are essential to maximize reliability of measurements. The aim of this paper is to describe the standardization process and reliability (intra- and inter-observer) of height, weight and waist circumference (WC) measurements in preschoolers. All technical procedures and devices were standardized and centralized training was given to the fieldworkers. At least seven children per country participated in the intra- and inter-observer reliability testing. Intra-observer technical error ranged from 0.00 to 0.03 kg for weight and from 0.07 to 0.20 cm for height, with the overall reliability being above 99%. A second training was organized for WC due to low reliability observed in the first training. Intra-observer technical error for WC ranged from 0.12 to 0.71 cm during the first training and from 0.05 to 1.11 cm during the second training, and reliability above 92% was achieved. Epidemiological surveys need standardized procedures and training of researchers to reduce measurement error. In the ToyBox-study, very good intra- and inter-observer agreement was achieved for all anthropometric measurements performed. © 2014 World Obesity.
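The technical error and reliability percentages quoted above can be illustrated with a short calculation. This is a minimal sketch that assumes the commonly used definitions for duplicate measurements, TEM = sqrt(sum(d^2) / 2n) and R = 1 - TEM^2 / SD^2; the abstract does not state which formulas the ToyBox-study applied, and the function names and waist circumference values below are hypothetical.

```python
from math import sqrt
from statistics import stdev


def technical_error(first, second):
    """Technical error of measurement (TEM) for paired repeat measurements."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return sqrt(squared_diffs / (2 * len(first)))


def reliability_percent(tem, all_values):
    """Share of total variance not attributable to measurement error, in %."""
    sd = stdev(all_values)  # sample SD of every measurement taken
    return 100.0 * (1.0 - tem ** 2 / sd ** 2)


# Hypothetical duplicate measurements (cm) by one observer on four children.
first_pass = [100.0, 105.0, 98.0, 110.0]
second_pass = [100.2, 104.8, 98.2, 110.0]

tem = technical_error(first_pass, second_pass)
rel = reliability_percent(tem, first_pass + second_pass)
print(f"TEM = {tem:.3f} cm, reliability = {rel:.2f}%")
```

With widely spread subjects and small repeat differences, the reliability coefficient approaches 100%, which is why a small sample of similar children can depress it even when the absolute error is small.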
Reliability analysis of the AOSpine thoracolumbar spine injury classification system by a worldwide group of naïve spinal surgeons.

PubMed

Kepler, Christopher K; Vaccaro, Alexander R; Koerner, John D; Dvorak, Marcel F; Kandziora, Frank; Rajasekaran, Shanmuganathan; Aarabi, Bizhan; Vialle, Luiz R; Fehlings, Michael G; Schroeder, Gregory D; Reinhold, Maximilian; Schnake, Klaus John; Bellabarba, Carlo; Cumhur Öner, F

2016-04-01

The aims of this study were (1) to demonstrate that the AOSpine thoracolumbar spine injury classification system can be reliably applied by an international group of surgeons and (2) to delineate those injury types which are difficult for spine surgeons to classify reliably. A previously described classification system of thoracolumbar injuries which consists of a morphologic classification of the fracture, a grading system for the neurologic status and relevant patient-specific modifiers was applied to 25 cases by 100 spinal surgeons from across the world twice independently, in grading sessions 1 month apart. The results were analyzed for classification reliability using the Kappa coefficient (κ). The overall Kappa coefficient for all cases was 0.56, which represents moderate reliability. Kappa values describing interobserver agreement were 0.80 for type A injuries, 0.68 for type B injuries and 0.72 for type C injuries, all representing substantial reliability. The lowest level of agreement for specific subtypes was for fracture subtype A4 (Kappa = 0.19). Intraobserver analysis demonstrated overall average Kappa statistic for subtype grading of 0.68 also representing substantial reproducibility. In a worldwide sample of spinal surgeons without previous exposure to the recently described AOSpine Thoracolumbar Spine Injury Classification System, we demonstrated moderate interobserver and substantial intraobserver reliability. These results suggest that most spine surgeons can apply this system to spine trauma patients as reliably as, or more reliably than, previously described systems.
Can we improve accuracy and reliability of MRI interpretation in children with optic pathway glioma? Proposal for a reproducible imaging classification.

PubMed

Lambron, Julien; Rakotonjanahary, Josué; Loisel, Didier; Frampas, Eric; De Carli, Emilie; Delion, Matthieu; Rialland, Xavier; Toulgoat, Frédérique

2016-02-01

Magnetic resonance (MR) images from children with optic pathway glioma (OPG) are complex. We initiated this study to evaluate the accuracy of MR imaging (MRI) interpretation and to propose a simple and reproducible imaging classification for MRI. We randomly selected 140 MRIs from among 510 MRIs performed on 104 children diagnosed with OPG in France from 1990 to 2004. These images were reviewed independently by three radiologists (F.T., 15 years of experience in neuroradiology; D.L., 25 years of experience in pediatric radiology; and J.L., 3 years of experience in radiology) using a classification derived from the Dodge and modified Dodge classifications. Intra- and interobserver reliabilities were assessed using the Bland-Altman method and the kappa coefficient. These reviews allowed the definition of reliable criteria for MRI interpretation. The reviews showed intraobserver variability and large discrepancies among the three radiologists (kappa coefficient varying from 0.11 to 1). These variabilities were too large for the interpretation to be considered reproducible over time or among observers. A consensual analysis, taking into account all observed variabilities, allowed the development of a definitive interpretation protocol. Using this revised protocol, we observed consistent intra- and interobserver results (kappa coefficient varying from 0.56 to 1). The mean interobserver difference for the solid portion of the tumor with contrast enhancement was 0.8 cm(3) (limits of agreement = -16 to 17). We propose simple and precise rules for improving the accuracy and reliability of MRI interpretation for children with OPG. Further studies will be necessary to investigate the possible prognostic value of this approach.
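The Bland-Altman figures quoted above (a mean interobserver difference with limits of agreement) can be illustrated with a short calculation. This is a minimal sketch assuming the usual definition of the limits as the mean difference plus or minus 1.96 sample standard deviations of the paired differences; the function name and the volumes are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(reader_a, reader_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(reader_a) != len(reader_b) or len(reader_a) < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)                # mean interobserver difference
    spread = 1.96 * stdev(diffs)      # half-width of the 95% limits of agreement
    return bias, bias - spread, bias + spread


# Hypothetical tumor volumes (cm^3) measured by two readers on four scans.
reader_a = [10.0, 12.0, 9.0, 15.0]
reader_b = [9.0, 13.0, 8.0, 13.0]

bias, lower, upper = bland_altman(reader_a, reader_b)
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

A small bias with wide limits, as in the abstract, means the readers agree on average but can differ substantially on an individual scan.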

Intra-observer reproducibility and interobserver reliability of the radiographic parameters in the Spinal Deformity Study Group's AIS Radiographic Measurement Manual.

PubMed

Dang, Natasha Radhika; Moreau, Marc J; Hill, Douglas L; Mahood, James K; Raso, James

2005-05-01

Retrospective cross-sectional assessment of the reproducibility and reliability of radiographic parameters. To measure the intra-examiner and interexaminer reproducibility and reliability of salient radiographic features. The management and treatment of adolescent idiopathic scoliosis (AIS) depends on accurate and reproducible radiographic measurements of the deformity. Ten sets of radiographs were randomly selected from a sample of patients with AIS, with initial curves between 20 degrees and 45 degrees. Fourteen measures of the deformity were measured from posteroanterior and lateral radiographs by 2 examiners, and were repeated 5 times at intervals of 3-5 days. Intra-examiner and interexaminer differences were examined. The parameters include measures of curve size, spinal imbalance, sagittal kyphosis and alignment, maximum apical vertebral rotation, T1 tilt, spondylolysis/spondylolisthesis, and skeletal age. Intra-examiner reproducibility was generally excellent for parameters measured from the posteroanterior radiographs but only fair to good for parameters from the lateral radiographs, in which some landmarks were not clearly visible. Of the 13 parameters observed, 7 had excellent interobserver reliability. The measurements from the lateral radiograph were less reproducible and reliable and, thus, may not add value to the assessment of AIS. Taking additional measures encourages a systematic and comprehensive assessment of spinal radiographs.
External Validation and Evaluation of Reliability and Validity of the Modified Seoul National University Renal Stone Complexity Scoring System to Predict Stone-Free Status After Retrograde Intrarenal Surgery.

PubMed

Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong

2015-08-01

The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
Intraoperative Physical Examination for Diagnosis of Interosseous Ligament Rupture-Cadaveric Study.

PubMed

Kachooei, Amir Reza; Rivlin, Michael; Wu, Fei; Faghfouri, Aram; Eberlin, Kyle R; Ring, David

2015-09-01

To study the intraobserver and interobserver reliability of the diagnosis of interosseous ligament (IOL) rupture in a cadaver model. On 12 fresh frozen cadavers, radial heads were cut using an identical incision and osteotomy. After randomization, the soft tissues of the limbs were divided into 4 groups: both IOL and triangular fibrocartilage (TFCC) intact; IOL disruption but TFCC intact; both IOL and TFCC divided; and IOL intact but TFCC divided. All incisions had identical suturing. After standard instruction and demonstration of radius pull-push and radius lateral pull tests, 10 physician evaluators with different levels of experience examined the cadaver limbs in a standardized way (elbow at 90° with the forearm held in both supination and pronation) and were asked to classify them into one of the 4 groups. Next, the same examiners were asked to re-examine the limbs after randomly changing the order of examination. The interobserver reliability of agreement for the diagnosis of IOL injury (groups 2 and 3) was fair in both rounds of examination and the intraobserver reliability was moderate. The intra- and interobserver reliabilities of agreement for the 4 groups of injuries among the examiners were fair in both rounds of examination. The sensitivity, specificity, accuracy, positive, and negative predictive values were all around 70%. The likelihood of a positive test corresponding with the presence of IOL rupture (positive likelihood ratio) was 2.2. The likelihood of a negative test correctly diagnosing an intact IOL was 0.40. In cadavers, intraoperative tests had fair reliability and 70% accuracy for the diagnosis of IOL rupture using the push-pull and lateral pull maneuvers. The level of experience did not have any effect on the correct diagnosis of intact versus disrupted IOL. Although not common, some failure of surgeries for traumatic elbow fracture-dislocations is because of failure in timely diagnosis of IOL disruption. 
Copyright © 2015 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
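The two likelihood ratios reported in the preceding abstract follow from sensitivity and specificity. The sketch below is a minimal illustration, not the authors' analysis, using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The function name and the rounded inputs of 0.70 are assumptions; because the abstract gives both values only as "around 70%", the output differs slightly from the reported 2.2 and 0.40.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive LR, negative LR) for a binary diagnostic test."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Illustrative inputs only: both values rounded to 70%.
lr_pos, lr_neg = likelihood_ratios(0.70, 0.70)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Ratios this close to 1 shift the pre-test probability only modestly, which is consistent with the abstract's description of the tests as having fair reliability.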
Evaluation of a modified Karnofsky score to assess physical and psychological wellbeing of cats in a hospital setting.

PubMed

Taffin, Elien Rl; Paepe, Dominique; Campos, Miguel; Duchateau, Luc; Goris, Nesya; De Roover, Katrien; Daminet, Sylvie

2016-11-01

Objectives: The Karnofsky score (KS) modified for cats, a scoring system to rate health and quality of life (QOL) in cats, is used in clinical trials, but its reliability and validity are yet to be determined. The present study aims to evaluate the scientific robustness of the KS when adapted for use in a hospital setting. Methods: A list of variables to consider during the physical examination, which informs the clinician's score (CS) part of the KS, was added and clinicians were allowed to choose a score anywhere between 0 and 50. The Karnofsky QOL questionnaire was adapted for use in a hospital setting. F-tests with Bonferroni correction and Spearman rank correlation coefficients were used to evaluate reliability and validity of the KS to assess the health and wellbeing of cats in a hospital setting. The records of 54 feline immunodeficiency virus-positive cats, which were recruited for a clinical trial and hospitalised for 6 weeks, were reviewed. Four veterinarians scored the CS, and one veterinarian and a veterinary nurse assessed the QOL score. Results: Mean absolute difference between observers was significantly larger for the CS than for the QOL score (P < 0.001) and two veterinarians scored significantly higher than the remaining two veterinarians (P < 0.001). Inter-observer correlation ranged from 0.45 to 0.75 for the CS. For the QOL score, the absolute difference between observers was small, no significant difference was found between observers and a high degree of inter-observer correlation was noted (r = 0.91). Conclusions and relevance: The results indicate low inter-observer reliability for the CS, requiring additional modifications to this part of the KS. The QOL score seems more reliable, and the questionnaire may serve as a reliable tool in the assessment of QOL in cats in a hospital setting. Consequently, further adaptation of the KS is mandatory when simultaneous assessment of both the cat's clinical health and perceived wellbeing is required.
Inter-observer reliability of DSM-5 substance use disorders.

PubMed

Denis, Cécile M; Gelernter, Joel; Hart, Amy B; Kranzler, Henry R

2015-08-01

Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence concerning the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
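Percent agreement and the kappa coefficient, the two statistics named in the preceding abstract, can be computed from paired diagnoses as sketched below. This is a minimal illustration of unweighted Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e); the function names and the ten paired diagnoses are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of subjects given the same label by both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    observed = percent_agreement(rater_a, rater_b)
    n = len(rater_a)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = count_a.keys() | count_b.keys()
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)


# Hypothetical lifetime diagnoses (1 = disorder present) from two interviewers.
interviewer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
interviewer_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

print(f"agreement = {percent_agreement(interviewer_1, interviewer_2):.2f}")
print(f"kappa     = {cohen_kappa(interviewer_1, interviewer_2):.2f}")
```

In this example the raters agree on 80% of subjects, yet kappa is noticeably lower because roughly half of that agreement would be expected by chance given the label frequencies.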
Diagnosing Nodular Regenerative Hyperplasia of the Liver Is Thwarted by Low Interobserver Agreement.

PubMed

Jharap, Bindia; van Asseldonk, Dirk P; de Boer, Nanne K H; Bedossa, Pierre; Diebold, Joachim; Jonker, A Mieke; Leteurtre, Emmanuelle; Verheij, Joanne; Wendum, Dominique; Wrba, Fritz; Zondervan, Pieter E; Colombel, Jean-Frédéric; Reinisch, Walter; Mulder, Chris J J; Bloemena, Elisabeth; van Bodegraven, Adriaan A

2015-01-01

Nodular regenerative hyperplasia (NRH) of the liver is associated with several diseases and drugs. Clinical symptoms of NRH may vary from absence of symptoms to full-blown (non-cirrhotic) portal hypertension. However, diagnosing NRH is challenging. The objective of this study was to determine inter- and intraobserver agreement on the histopathologic diagnosis of NRH. Liver specimens (n=48) previously diagnosed as NRH were reviewed for the presence of NRH by seven pathologists without prior knowledge of the original diagnosis or clinical background. The majority of the liver specimens were from thiopurine-using inflammatory bowel disease patients. Histopathologic features contributing to NRH were also assessed. Criteria for NRH were modified by consensus and subsequently validated. Interobserver agreement was evaluated by using the standard kappa index. After review, definite NRH, inconclusive NRH and no NRH were found in 35% (23-40%), 21% (13-27%) and 44% (38-56%), respectively (median, IQR). The median interobserver agreement for NRH was poor (κ = 0.20, IQR 0.14-0.28). The intraobserver variability on NRH ranged between 14% and 71%. After modification of the criteria and exclusion of biopsies with technical shortcomings, the interobserver agreement on the diagnosis of NRH was fair (κ = 0.45). The interobserver agreement on the histopathologic diagnosis of NRH was poor, even when assessed by well-experienced liver pathologists. Modification of the criteria of NRH based on consensus effort and exclusion of biopsies of poor quality led to a fairly increased interobserver agreement. The main conclusion of this study is that NRH is a clinicopathologic diagnosis that cannot reliably be based on histopathology alone.
Interobserver error involved in independent attempts to measure cusp base areas of Pan M1s

PubMed Central

Bailey, Shara E; Pilbrow, Varsha C; Wood, Bernard A

2004-01-01

Cusp base areas measured from digitized images increase the amount of detailed quantitative information one can collect from post-canine crown morphology. Although this method is gaining wide usage for taxonomic analyses of extant and extinct hominoids, the techniques for digitizing images and taking measurements differ between researchers. The aim of this study was to investigate interobserver error in order to help assess the reliability of cusp base area measurement within extant and extinct hominoid taxa. Two of the authors measured individual cusp base areas and total cusp base area of 23 maxillary first molars (M1) of Pan. From these, relative cusp base areas were calculated. No statistically significant interobserver differences were found for either absolute or relative cusp base areas. On average the hypocone and paracone showed the least interobserver error (< 1%) whereas the protocone and metacone showed the most (2.6–4.5%). We suggest that the larger measurement error in the metacone/protocone is due primarily to either weakly defined fissure patterns and/or the presence of accessory occlusal features. Overall, levels of interobserver error are similar to those found for intraobserver error. The results of our study suggest that if certain prescribed standards are employed then cusp and crown base areas measured by different individuals can be pooled into a single database. PMID:15447691
Optimizing study design for interobserver reliability: IUGA-ICS classification of complications of prostheses and graft insertion.

PubMed

Haylen, Bernard T; Lee, Joseph; Maher, Chris; Deprest, Jan; Freeman, Robert

2014-06-01

Results of interobserver reliability studies for the International Urogynecological Association-International Continence Society (IUGA-ICS) Complication Classification coding can be greatly influenced by study design factors such as participant instruction, motivation, and test-question clarity. We attempted to optimize these factors. After a 15-min instructional lecture with eight clinical case examples (including images) and with classification/coding charts available, those clinicians attending an IUGA Surgical Complications workshop were presented with eight similar-style test cases over 10 min and asked to code them using the Category, Time and Site classification. Answers were compared to predetermined correct codes obtained by five instigators of the IUGA-ICS prostheses and grafts complications classification. Prelecture and postquiz participant confidence levels using a five-step Likert scale were assessed. Complete sets of answers to the questions (24 codings) were provided by 34 respondents, only three of whom reported prior use of the charts. Average score [n (%)] out of eight, as well as median score (range) for each coding category were: (i) Category: 7.3 (91 %); 7 (4-8); (ii) Time: 7.8 (98 %); 7 (6-8); (iii) Site: 7.2 (90 %); 7 (5-8). Overall, the equivalent calculations (out of 24) were 22.3 (93 %) and 22 (18-24). Mean prelecture confidence was 1.37 (out of 5), rising to 3.85 postquiz. Urogynecologists had the highest correlation with correct coding, followed closely by fellows and general gynecologists. Optimizing training and study design can lead to excellent results for interobserver reliability of the IUGA-ICS Complication Classification coding, with increased participant confidence in complication-coding ability.
Ultrasonographic Evaluation of Diaphragm Thickness During Mechanical Ventilation in Intensive Care Patients.

PubMed

Francis, Colin Anthony; Hoffer, Joaquín Andrés; Reynolds, Steven

2016-01-01

Mechanical ventilation is associated with atrophy and weakness of the diaphragm. Ultrasound is an easy noninvasive way to track changes in thickness of the diaphragm. To validate ultrasound as a means of tracking thickness of the diaphragm in patients undergoing mechanical ventilation by evaluating interobserver and interoperator reliability and to collect initial data on the relationship of mode of ventilation to changes in the diaphragm. Daily ultrasound images of the quadriceps and the right side of the diaphragm were acquired in 8 critically ill patients receiving various modes of mechanical ventilation. Thickness of the diaphragm and the quadriceps was measured, and changes with time were noted. Interoperator and interobserver reliability were measured. Intraclass correlation coefficients between operators and between observers for thickness of the diaphragm and quadriceps were greater than 0.95, indicating excellent interoperator and interobserver reliability. Patients receiving assist-control ventilation (n = 4) showed a mean decline in diaphragm thickness of 4.7% per day. Patients receiving pressure support ventilation (n = 8) showed a mean increase in diaphragm thickness of 1.5% per day. Quadriceps thickness declined in all participants (n = 8) at a mean rate of 2.0% per day. Use of ultrasound to measure thickness of the diaphragm in 8 intensive care patients undergoing various modes of mechanical ventilation was feasible and yielded reproducible results. Ultrasound tracking of changes in thickness of the diaphragm in this small sample indicated that the thickness decreased during assist-control mode and increased during pressure support mode. ©2016 American Association of Critical-Care Nurses.
Automating Security Protocol Analysis

DTIC Science & Technology

2004-03-01

language that allows easy representation of pattern interaction. Using CSP, Lowe tests whether a protocol achieves authentication. In the case of...only to correctly code whatever protocol they intend to evaluate. The tool, OCaml 3.04 [1], translates the protocol into Horn clauses and then...model protocol transactions. One example of automated modeling software is Maude [19]. Maude was the intended language for this research, but Java
Defining Educational Research: A Perspective of/on Presidential Addresses and the Australian Association for Research in Education

ERIC Educational Resources Information Center

Lingard, Bob; Gale, Trevor

2010-01-01

This paper is concerned with the definition of the field of educational research and the changing and developing role of the Australian Association for Research in Education (AARE) in representing and constituting this field. The evidence for the argument is derived from AARE Presidential Addresses across its 40-year history. The paper documents…
A proposed simple method for measurement in the anterior chamber angle: biometric gonioscopy.

PubMed

Congdon, N G; Spaeth, G L; Augsburger, J; Klancnik, J; Patel, K; Hunter, D G

1999-11-01

To design a system of gonioscopy that will allow greater interobserver reliability and more clearly defined screening cutoffs for angle closure than current systems while being simple to teach and technologically appropriate for use in rural Asia, where the prevalence of angle-closure glaucoma is highest. Clinic-based validation and interobserver reliability trial. Study 1: 21 patients 18 years of age and older recruited from a university-based specialty glaucoma clinic; study 2: 32 patients 18 years of age and older recruited from the same clinic. In study 1, all participants underwent conventional gonioscopy by an experienced observer (GLS) using the Spaeth system and in the same eye also underwent Scheimpflug photography, ultrasonographic measurement of anterior chamber depth and axial length, automatic refraction, and biometric gonioscopy with measurement of the distance from iris insertion to Schwalbe's line using a reticule based in the slit-lamp ocular. In study 2, all participants underwent both conventional gonioscopy and biometric gonioscopy by an experienced gonioscopist (NGC) and a medical student with no previous training in gonioscopy (JK). Study 1: The association between biometric gonioscopy and conventional gonioscopy, Scheimpflug photography, and other factors known to correlate with the configuration of the angle. Study 2: Interobserver agreement using biometric gonioscopy compared to that obtained with conventional gonioscopy. In study 1, there was an independent, monotonic, statistically significant relationship between biometric gonioscopy and both Spaeth angle (P = 0.001, t test) and Spaeth insertion (P = 0.008, t test) grades. Biometric gonioscopy correctly identified six of six patients with occludable angles according to Spaeth criteria. Biometric gonioscopic grade was also significantly associated with the anterior chamber angle as measured by Scheimpflug photography (P = 0.005, t test). In study 2, the intraclass correlation coefficient between graders for biometric gonioscopy (0.97) was higher than for Spaeth angle grade (0.72) or Spaeth insertion grade (0.84). Biometric gonioscopy correlates well with other measures of the anterior chamber angle, shows a higher degree of interobserver reliability than conventional gonioscopy, and can readily be learned by an inexperienced observer.
Periorbital Biometric Measurements using ImageJ Software: Standardisation of Technique and Assessment Of Intra- and Interobserver Variability

PubMed Central

Rajyalakshmi, R.; Prakash, Winston D.; Ali, Mohammad Javed; Naik, Milind N.

2017-01-01

Purpose: To assess the reliability and repeatability of periorbital biometric measurements using ImageJ software and to assess if the horizontal visible iris diameter (HVID) serves as a reliable scale for facial measurements. Methods: This study was a prospective, single-blind, comparative study. Two clinicians performed 12 periorbital measurements on 100 standardised face photographs. Each individual’s HVID was determined by Orbscan IIz and used as a scale for measurements using ImageJ software. All measurements were repeated using the ‘average’ HVID of the study population as a measurement scale. Intraclass correlation coefficient (ICC) and Pearson product-moment coefficient were used as statistical tests to analyse the data. Results: The range of ICC for intra- and interobserver variability was 0.79–0.99 and 0.86–0.99, respectively. Test-retest reliability ranged from 0.66–1.0 to 0.77–0.98, respectively. When average HVID of the study population was used as scale, ICC ranged from 0.83 to 0.99, and the test-retest reliability ranged from 0.83 to 0.96 and the measurements correlated well with recordings done with individual Orbscan HVID measurements. Conclusion: Periorbital biometric measurements using ImageJ software are reproducible and repeatable. Average HVID of the population as measured by Orbscan is a reliable scale for facial measurements. PMID:29403183
Diagnosing Nodular Regenerative Hyperplasia of the Liver Is Thwarted by Low Interobserver Agreement

PubMed Central

Jharap, Bindia; van Asseldonk, Dirk P.; de Boer, Nanne K. H.; Bedossa, Pierre; Diebold, Joachim; Jonker, A. Mieke; Leteurtre, Emmanuelle; Verheij, Joanne; Wendum, Dominique; Wrba, Fritz; Zondervan, Pieter E.; Colombel, Jean-Frédéric; Reinisch, Walter; Mulder, Chris J. J.; Bloemena, Elisabeth; van Bodegraven, Adriaan A.

2015-01-01

Background and Aims: Nodular regenerative hyperplasia (NRH) of the liver is associated with several diseases and drugs. Clinical symptoms of NRH may vary from absence of symptoms to full-blown (non-cirrhotic) portal hypertension. However, diagnosing NRH is challenging. The objective of this study was to determine inter- and intraobserver agreement on the histopathologic diagnosis of NRH. Methods: Liver specimens (n=48) previously diagnosed as NRH were reviewed for the presence of NRH by seven pathologists without prior knowledge of the original diagnosis or clinical background. The majority of the liver specimens were from thiopurine-using inflammatory bowel disease patients. Histopathologic features contributing to NRH were also assessed. Criteria for NRH were modified by consensus and subsequently validated. Interobserver agreement was evaluated by using the standard kappa index. Results: After review, definite NRH, inconclusive NRH and no NRH were found in 35% (23-40%), 21% (13-27%) and 44% (38-56%), respectively (median, IQR). The median interobserver agreement for NRH was poor (κ = 0.20, IQR 0.14-0.28). The intraobserver variability on NRH ranged between 14% and 71%. After modification of the criteria and exclusion of biopsies with technical shortcomings, the interobserver agreement on the diagnosis of NRH was fair (κ = 0.45). Conclusions: The interobserver agreement on the histopathologic diagnosis of NRH was poor, even when assessed by well-experienced liver pathologists. Modification of the criteria of NRH based on consensus effort and exclusion of biopsies of poor quality led to a fairly increased interobserver agreement. The main conclusion of this study is that NRH is a clinicopathologic diagnosis that cannot reliably be based on histopathology alone. PMID:26054009
Radiographic classifications in Perthes disease

PubMed Central

Huhnstock, Stefan; Svenningsen, Svein; Merckoll, Else; Catterall, Anthony; Terjesen, Terje; Wiig, Ola

2017-01-01

Background and purpose: Different radiographic classifications have been proposed for prediction of outcome in Perthes disease. We assessed whether the modified lateral pillar classification would provide more reliable interobserver agreement and prognostic value compared with the original lateral pillar classification and the Catterall classification. Patients and methods: 42 patients (38 boys) with Perthes disease were included in the interobserver study. Their mean age at diagnosis was 6.5 (3–11) years. 5 observers classified the radiographs in 2 separate sessions according to the Catterall classification, the original and the modified lateral pillar classifications. Interobserver agreement was analysed using weighted kappa statistics. We assessed the associations between the classifications and femoral head sphericity at 5-year follow-up in 37 non-operatively treated patients in a crosstable analysis (Gamma statistics for ordinal variables, γ). Results: The original lateral pillar and Catterall classifications showed moderate interobserver agreement (kappa 0.49 and 0.43, respectively) while the modified lateral pillar classification had fair agreement (kappa 0.40). The original lateral pillar classification was strongly associated with the 5-year radiographic outcome, with a mean γ correlation coefficient of 0.75 (95% CI: 0.61–0.95) among the 5 observers. The modified lateral pillar and Catterall classifications showed moderate associations (mean γ correlation coefficient 0.55 [95% CI: 0.38–0.66] and 0.64 [95% CI: 0.57–0.72], respectively). Interpretation: The Catterall classification and the original lateral pillar classification had sufficient interobserver agreement and association to late radiographic outcome to be suitable for clinical use. Adding the borderline B/C group did not increase the interobserver agreement or prognostic value of the original lateral pillar classification. PMID:28613966
Intra- and inter-observer agreement on diagnosis of Dupuytren disease, measurements of severity of contracture, and disease extent.

PubMed

Broekstra, Dieuwke C; Lanting, Rosanne; Werker, Paul M N; van den Heuvel, Edwin R

2015-08-01

Dupuytren disease (DD) is a fibrosing disease affecting the palmar aponeurosis, and is mostly treated by surgery based on measurement of severity of flexion contracture of the fingers. Literature concerning the measurement reliability is scarce. This study aimed to determine the intra- and inter-observer agreement of four variables for diagnosing DD, determining severity of contracture, and disease extent. One of them is a new measurement on the area of nodules and cords for measuring the disease extent in early disease stages. An agreement study (n = 54) was performed by two trained investigators. Agreement was calculated per finger, based on an intraclass correlation coefficient (ICC) using a latent variable model on subjects for diagnosis and Tubiana stage. For total passive extension deficit (TPED) and the area of nodules and cords, agreement was calculated with an ICC using a one-way random effects model with subject as random effect. Inter-observer agreement was very good for diagnosing DD (ICC: 95.5%-99.9%) and good to very good for classifying Tubiana stage (ICC: 73.5%-94.9%). Agreements for area and TPED were moderate (middle finger) to very good (ICC: 48.4%-98.6% and 45.0%-99.5%, respectively). Intra-observer agreement was slightly higher on average than inter-observer agreement. Overall, the intra- and inter-observer agreement in diagnosing DD, and determining the severity of flexion contracture is high. Also, the newly introduced variable area of nodules and cords has high intra- and inter-observer agreement, indicating that it is suitable to measure disease extent. Copyright © 2015 Elsevier Ltd. All rights reserved.
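As a rough illustration of the one-way random effects ICC mentioned in this abstract, the sketch below computes the single-measurement form, often written ICC(1,1), from between-subject and within-subject mean squares. The function name `icc_oneway` and the example measurements are hypothetical; the latent variable model used for diagnosis and Tubiana stage is not covered.

```python
def icc_oneway(ratings):
    """ICC(1,1): one-way random effects model, single measurement.

    `ratings` is a list of subjects; each subject is a list of k measurements.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(ratings[0])
    if k < 2 or any(len(r) != k for r in ratings):
        raise ValueError("every subject needs the same number (>= 2) of measurements")

    grand_mean = sum(sum(r) for r in ratings) / (n * k)
    subject_means = [sum(r) / k for r in ratings]

    # Between-subject and within-subject mean squares from one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for r, m in zip(ratings, subject_means)
                    for x in r) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical extension-deficit readings (degrees) by two observers; not study data.
readings = [[10, 12], [35, 33], [60, 64], [20, 20], [45, 48]]
print(f"ICC = {icc_oneway(readings):.1%}")
```

The result is printed as a percentage only to mirror how the abstract reports its ICC values.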
HARBO, a simple computer-aided observation method for recording work postures.

PubMed

Wiktorin, C; Mortimer, M; Ekenvall, L; Kilbom, A; Hjelm, E W

1995-12-01

The aim of the study was to present an observation method focusing on the positions of the hands relative to the body and to evaluate whether this simple observation technique gives a reliable estimate of the total time spent in each of five work postures during one workday. In the first part of the study the interobserver reliability of the observation method was tested with eight blue-collar workers. In the second part the observed time spent with work above the shoulder level was tested in relation to an upper-arm position analyzer, and observed time spent in work below knuckle level was tested in relation to a trunk flexion analyzer, both with 72 blue-collar workers. The interobserver reliability for full-day registrations was high. The intraclass correlation coefficients ranged from 0.99 to 1.00. The observed duration of work with hands above shoulder level correlated well with the measured duration of pronounced arm elevation (> 75 degrees). The product moment correlation coefficient was 0.97. The observed duration of work with hands below knuckle level correlated well with the measured duration of pronounced trunk flexion angles (> 40 degrees). The product moment correlation coefficient was 0.98. The present observation method, designed to make postural observations continuously for several hours, is easy to learn and seems reliable.
A prospective study evaluating cochlear implant management skills: development and validation of the Cochlear Implant Management Skills survey.

PubMed

Bennett, R J; Jayakody, D M P; Eikelboom, R H; Taljaard, D S; Atlas, M D

2016-02-01

To investigate the ability of cochlear implant (CI) recipients to physically handle and care for their hearing implant device(s) and to identify factors that may influence skills. To assess device management skills, a clinical survey was developed and validated on a clinical cohort of CI recipients. Survey development and validation. A prospective convenience cohort design study. Specialist hearing implant clinic. Forty-nine post-lingually deafened, adult CI recipients, at least 12 months postoperative. Survey test-retest reliability, interobserver reliability and responsiveness. Correlations between management skills and participant demographic, audiometric, clinical outcomes and device factors. The Cochlear Implant Management Skills survey was developed, demonstrating high test-retest reliability (0.878), interobserver reliability (0.972) and responsiveness to intervention (skills training) [t(20) = -3.913, P = 0.001]. Cochlear Implant Management Skills survey scores range from 54.69% to 100% (mean: 83.45%, sd: 12.47). No associations were found between handling skills and participant factors. This is the first study to demonstrate a range in cochlear implant device handling skills in CI recipients and offers clinicians and researchers a tool to systematically and objectively identify shortcomings in CI recipients' device handling skills. © 2015 John Wiley & Sons Ltd.
Plain film measurement error in acute displaced midshaft clavicle fractures

PubMed Central

Archer, Lori Anne; Hunt, Stephen; Squire, Daniel; Moores, Carl; Stone, Craig; O’Dea, Frank; Furey, Andrew

2016-01-01

Background Clavicle fractures are common and optimal treatment remains controversial. Recent literature suggests operative fixation of acute displaced mid-shaft clavicle fractures (DMCFs) shortened more than 2 cm improves outcomes. We aimed to identify correlation between plain film and computed tomography (CT) measurement of displacement and the inter- and intraobserver reliability of repeated radiographic measurements. Methods We obtained radiographs and CT scans of patients with acute DMCFs. Three orthopedic staff and 3 residents measured radiographic displacement at time zero and 2 weeks later. The CT measurements identified absolute shortening in 3 dimensions (by subtracting the length of the fractured from the intact clavicle). We then compared shortening measured on radiographs and shortening measured in 3 dimensions on CT. Interobserver and intraobserver reliability were calculated. Results We reviewed the fractures of 22 patients. Bland–Altman repeatability coefficient calculations indicated that radiograph and CT measurements of shortening could not be correlated owing to an unacceptable amount of measurement error (6 cm). Interobserver reliability for plain radiograph measurements was excellent (Cronbach α = 0.90). Likewise, intraobserver reliabilities for plain radiograph measurements as calculated with paired t tests indicated excellent correlation (p > 0.05 in all but 1 observer [p = 0.04]). Conclusion To establish shortening as an indication for DMCF fixation, reliable measurement tools are required. The low correlation between plain film and CT measurements we observed suggests further research is necessary to establish what imaging modality reliably predicts shortening. Our results indicate weak correlation between radiograph and CT measurement of acute DMCF shortening. PMID:27438054
Interobserver Variability of Radiographic Assessment Using a Mobile Messaging Application as a Teleconsultation Tool

PubMed Central

Özkan, Sezai; Mellema, Jos J.; Ring, David; Chen, Neal C.

2017-01-01

Background: To examine whether interobserver reliability, decision-making, and confidence in decision-making in the treatment of distal radius fractures changes if radiographs are viewed on a messenger application on a mobile phone compared to a standard DICOM viewer. Methods: Radiographs of distal radius fractures were presented to surgeons on either a smart phone using a mobile messenger application or a laptop using a DICOM viewer application. Twenty observers participated: 10 (50%) were randomly assigned to the DICOM viewer group and 10 (50%) to the mobile messenger group. Each observer was asked to evaluate the cases and (1) classify the fracture type according to the AO classification, (2) recommend operative or conservative treatment and (3) rate their confidence about this decision. Results: There was no significant difference in interobserver reliability for AO classification and recommendation for surgery for distal radius fractures in both groups. The percentage of recommendation for surgery was significantly higher in the messenger application group compared to the DICOM viewer group (89% versus 78%, P=0.019) and the confidence for treatment decision was significantly higher in the mobile messenger group compared to the DICOM viewer group (8.9 versus 7.9, P=0.026). Conclusion: Messenger applications on mobile phones could facilitate remote decision-making for patients with distal radius fractures, but should be used with caution. PMID:29226202

Reliability of landmark identification in cephalometric radiography acquired by a storage phosphor imaging system.

PubMed

Chen, Y-J; Chen, S-K; Huang, H-W; Yao, C-C; Chang, H-F

2004-09-01

To compare the cephalometric landmark identification on softcopy and hardcopy of direct digital cephalography acquired by a storage-phosphor (SP) imaging system. Ten digital cephalograms and their conventional counterpart, hardcopy on a transparent blue film, were obtained by a SP imaging system and a dye sublimation printer. Twelve orthodontic residents identified 19 cephalometric landmarks on monitor-displayed SP digital images with computer-aided method and on their hardcopies with conventional method. The x- and y-coordinates for each landmark, indicating the horizontal and vertical positions, were analysed to assess the reliability of landmark identification and evaluate the concordance of the landmark locations in softcopy and hardcopy of SP digital cephalometric radiography. For each of the 19 landmarks, the location differences as well as the horizontal and vertical components were statistically significant between SP digital cephalometric radiography and its hardcopy. Smaller interobserver errors on SP digital images than those on their hardcopies were noted for all the landmarks, except point Go in vertical direction. The scatter-plots demonstrate the characteristic distribution of the interobserver error in both horizontal and vertical directions. Generally, the dispersion of interobserver error on SP digital cephalometric radiography is less than that on its hardcopy with conventional method. The SP digital cephalometric radiography could yield better or comparable level of performance in landmark identification as its hardcopy, except point Go in vertical direction.
The Impact of Nanotechnology Energetics on the Department of Defense by 2035

DTIC Science & Technology

2010-02-17

Kaili Zhang, Daniel Esteve, Pierre Alphonse, Philippe Tailhades and Constantin Vahlas. “Nano-Energetic Materials for MEMS: A Review.” Journal of...on impact and the energetic compounds react. 16 Rossi, Carole, Kaili Zhang, Daniel Esteve, Pierre Alphonse, Philippe Tailhades and Constantin Vahlas...Rossi, Carole, Kaili Zhang, Daniel Esteve, Pierre Alphonse, Philippe Tailhades and Constantin Vahlas. “Nano-Energetic Materials for
Impact of image quality on reliability of the measurements of left ventricular systolic function and global longitudinal strain in 2D echocardiography

PubMed Central

Nagata, Yasufumi; Kado, Yuichiro; Onoue, Takeshi; Otani, Kyoko; Nakazono, Akemi; Otsuji, Yutaka; Takeuchi, Masaaki

2018-01-01

Background Left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS) play important roles in diagnosis and management of cardiac diseases. However, the issue of the accuracy and reliability of LVEF and GLS remains to be solved. Image quality is one of the most important factors affecting measurement variability. The aim of this study was to investigate whether improved image quality could reduce observer variability. Methods Two sets of three apical images were acquired using relatively old- and new-generation ultrasound imaging systems (Vivid 7 and Vivid E95) in 308 subjects. Image quality was assessed by endocardial border delineation index (EBDI) using a 3-point scoring system. Three observers measured the LVEF and GLS, and these values and inter-observer variability were investigated. Results Image quality was significantly better with Vivid E95 (EBDI: 26.8 ± 5.9) than that with Vivid 7 (22.8 ± 6.3, P < 0.0001). Regarding the inter-observer variability of LVEF, the r-value, bias, 95% limit of agreement and intra-class correlation coefficient for Vivid 7 were comparable to those for Vivid E95. The % variabilities were significantly lower for Vivid E95 (5.3–6.5%) than those for Vivid 7 (6.5–7.5%). Regarding GLS, all observer variability parameters were better for Vivid E95 than for Vivid 7. Improvements in image quality yielded benefits to both LVEF and GLS measurement reliability. Multivariate analysis showed that image quality was indeed an important factor of observer variability in the measurement of LVEF and GLS. Conclusions The new-generation ultrasound imaging system offers improved image quality and reduces inter-observer variability in the measurement of LVEF and GLS. PMID:29432198
Interobserver and intraobserver reliability of the modified Waldenström classification system for staging of Legg-Calvé-Perthes disease.

PubMed

Hyman, Joshua E; Trupia, Evan P; Wright, Margaret L; Matsumoto, Hiroko; Jo, Chan-Hee; Mulpuri, Kishore; Joseph, Benjamin; Kim, Harry K W

2015-04-15

The absence of a reliable classification system for Legg-Calvé-Perthes disease has contributed to difficulty in establishing consistent management strategies and in interpreting outcome studies. The purpose of this study was to assess interobserver and intraobserver reliability of the modified Waldenström classification system among a large and diverse group of pediatric orthopaedic surgeons. Twenty surgeons independently completed the first two rounds of staging: two assessments of forty deidentified radiographs of patients with Legg-Calvé-Perthes disease in various stages. Ten of the twenty surgeons completed another two rounds of staging after the addition of a second pair of radiographs in sequence. Kappa values were calculated within and between each of the rounds. Interobserver kappa values for the classification for surveys 1, 2, 3, and 4 were 0.81, 0.82, 0.76, and 0.80, respectively (with 0.61 to 0.80 considered substantial agreement and 0.81 to 1.0, nearly perfect agreement). Intraobserver agreement for the classification was an average of 0.88 (range, 0.77 to 0.96) between surveys 1 and 2 and an average of 0.87 (range, 0.81 to 0.94) between surveys 3 and 4. The modified Waldenström classification system for staging of Legg-Calvé-Perthes disease demonstrated substantial to almost perfect agreement between and within observers across multiple rounds of study. In doing so, the results of this study provide a foundation for future validation studies, in which the classification stage will be associated with clinical outcomes. Copyright © 2015 by The Journal of Bone and Joint Surgery, Incorporated.
Identification of Nasal Bone Fractures on Conventional Radiography and Facial CT: Comparison of the Diagnostic Accuracy in Different Imaging Modalities and Analysis of Interobserver Reliability.

PubMed

Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin

2013-09-01

There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee.
Comparison of translabial three-dimensional ultrasound with magnetic resonance imaging for measurement of levator hiatal biometry at rest.

PubMed

Vergeldt, T F M; Notten, K J B; Stoker, J; Fütterer, J J; Beets-Tan, R G; Vliegen, R F A; Schweitzer, K J; Mulder, F E M; van Kuijk, S M J; Roovers, J P W R; Kluivers, K B; Weemhoff, M

2016-05-01

To compare translabial three-dimensional (3D) ultrasound with magnetic resonance imaging (MRI) for the measurement of levator hiatal biometry at rest in women with pelvic organ prolapse, and to determine the interobserver reliability between two independent observers for ultrasound and MRI measurements. Data were derived from a multicenter prospective cohort study in which women scheduled for conventional anterior colporrhaphy underwent translabial 3D ultrasound and MRI prior to surgery. Intraclass correlation coefficients (ICCs) were calculated to estimate interobserver reliability between two independent observers and determine the agreement between ultrasound and MRI measurements. Bland-Altman plots were created to assess the agreement between ultrasound and MRI measurements. Data from 139 women from nine hospitals were included in the study. The interobserver reliability of ultrasound assessment at rest, during Valsalva maneuver and during contraction and of MRI assessment at rest were moderate or good. The agreement between ultrasound and MRI for the measurement of levator hiatal biometry at rest was moderate, with ICCs of 0.52 (95%CI, 0.32-0.66) for levator hiatal area, 0.44 (95%CI, 0.21-0.60) for anteroposterior diameter and 0.44 (95%CI, 0.22-0.60) for transverse diameter. Levator hiatal biometry measurements were statistically significantly larger on MRI than on translabial 3D ultrasound. The agreement between translabial 3D ultrasound and MRI for measurement of the levator hiatus at rest in women with pelvic organ prolapse was only moderate. The results of translabial 3D ultrasound and MRI should therefore not be used interchangeably in daily practice or in clinical research. Copyright © 2015 ISUOG. Published by John Wiley & Sons Ltd.
Impact of image quality on reliability of the measurements of left ventricular systolic function and global longitudinal strain in 2D echocardiography.

PubMed

Nagata, Yasufumi; Kado, Yuichiro; Onoue, Takeshi; Otani, Kyoko; Nakazono, Akemi; Otsuji, Yutaka; Takeuchi, Masaaki

2018-03-01

Left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS) play important roles in diagnosis and management of cardiac diseases. However, the issue of the accuracy and reliability of LVEF and GLS remains to be solved. Image quality is one of the most important factors affecting measurement variability. The aim of this study was to investigate whether improved image quality could reduce observer variability. Two sets of three apical images were acquired using relatively old- and new-generation ultrasound imaging systems (Vivid 7 and Vivid E95) in 308 subjects. Image quality was assessed by endocardial border delineation index (EBDI) using a 3-point scoring system. Three observers measured the LVEF and GLS, and these values and inter-observer variability were investigated. Image quality was significantly better with Vivid E95 (EBDI: 26.8 ± 5.9) than that with Vivid 7 (22.8 ± 6.3, P < 0.0001). Regarding the inter-observer variability of LVEF, the r -value, bias, 95% limit of agreement and intra-class correlation coefficient for Vivid 7 were comparable to those for Vivid E95. The % variabilities were significantly lower for Vivid E95 (5.3-6.5%) than those for Vivid 7 (6.5-7.5%). Regarding GLS, all observer variability parameters were better for Vivid E95 than for Vivid 7. Improvements in image quality yielded benefits to both LVEF and GLS measurement reliability. Multivariate analysis showed that image quality was indeed an important factor of observer variability in the measurement of LVEF and GLS. The new-generation ultrasound imaging system offers improved image quality and reduces inter-observer variability in the measurement of LVEF and GLS. © 2018 The authors.
Interobserver reliability of echocardiography for prognostication of normotensive patients with pulmonary embolism

PubMed Central

2014-01-01

Objectives To evaluate the interobserver reliability of echocardiographic findings of right ventricle (RV) dysfunction for prognosticating normotensive patients with pulmonary embolism (PE). Methods A central panel of cardiologists evaluated echocardiographic studies of 75 patients included in the PROTECT study for the following signs: RV diameter, RV/left ventricular (LV) diameter ratio, hypokinesis of the RV free wall, and tricuspid annular plane systolic excursion (TAPSE). Investigators used intraclass correlation to assess agreement between the measurements of the central panel and each of the local cardiologists. Investigators used the single weighted kappa statistic to test for agreement between readers of interpretation of RV enlargement and RV hypokinesis. Results The two observers had fair agreement (κ = 0.45) for RV enlargement assessed by the RV diameter, and good agreement (κ = 0.65) for RV enlargement assessed by the RV/LV diameter ratio. The interobserver reliability of the assessment whether hypokinesis of the RV free wall is present was good (κ = 0.70), and whether RV dysfunction (assessed by TAPSE measurement) is present was very good (κ = 0.86). The intraclass correlation for the RV/LV diameter ratio was fair (0.55; 95% confidence interval [CI], 0.37-0.69), for the RV diameter was good (0.70; 95% CI, 0.56-0.80), and for the TAPSE measurement was very good (0.85; 95% CI, 0.77-0.90). On Bland-Altman analysis, the mean differences for RV diameter, RV/LV diameter ratio and TAPSE measurement were 2.33 (±5.38), 0.06 (±0.23) and 0.08 (±2.20), respectively. Conclusion TAPSE measurement is the least user dependent and most reproducible echocardiographic finding of RV dysfunction in normotensive patients with PE. PMID:25092465
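The Bland-Altman analysis reported in this abstract rests on the paired differences between two sets of measurements. The sketch below shows the usual calculation of the mean difference (bias) and the 95% limits of agreement, assuming roughly normally distributed differences. The function name `bland_altman` and the example values are hypothetical, and the abstract does not say what its ± figures represent.

```python
from statistics import mean, stdev


def bland_altman(reader_a, reader_b):
    """Return (bias, sd_of_differences, (lower_limit, upper_limit))."""
    if len(reader_a) != len(reader_b) or len(reader_a) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    # Paired differences, reader A minus reader B
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    # 95% limits of agreement: bias +/- 1.96 SD
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical paired measurements in mm from two readers; not study data.
central = [18.0, 21.5, 16.0, 19.0, 23.5]
local = [17.0, 22.0, 15.5, 20.5, 22.0]
bias, sd, (lower, upper) = bland_altman(central, local)
print(f"bias {bias:.2f} mm, limits of agreement {lower:.2f} to {upper:.2f} mm")
```

A bias near zero with narrow limits suggests the two readers can be used interchangeably; wide limits indicate reader-dependent measurements even when the average difference is small.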
Interobserver variability and feasibility of polymerase chain reaction-based assay in distinguishing ischemic colitis from Clostridium difficile colitis in endoscopic mucosal biopsies.

PubMed

Wiland, Homer O; Procop, Gary W; Goldblum, John R; Tuohy, Marion; Rybicki, Lisa; Patil, Deepa T

2013-06-01

Polymerase chain reaction (PCR)-based assays using stool samples are currently the most effective method of detecting Clostridium difficile. This study examines the feasibility of this assay using mucosal biopsy samples and evaluates the interobserver reproducibility in diagnosing and distinguishing ischemic colitis from C difficile colitis. Thirty-eight biopsy specimens were reviewed and classified by 3 observers into C difficile and ischemic colitis. The findings were correlated with clinical data. PCR was performed on 34 cases using BD GeneOhm C difficile assay. The histologic interobserver agreement was excellent (κ= 0.86) and the agreement between histologic and clinical diagnosis was good (κ = 0.84). All 19 ischemic colitis cases tested negative (100% specificity) and 3 of 15 cases of C difficile colitis tested positive (20% sensitivity). C difficile colitis can be reliably distinguished from ischemic colitis using histologic criteria. The C difficile PCR test on endoscopic biopsy specimens has excellent specificity but limited sensitivity.
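The sensitivity and specificity quoted in this abstract follow directly from the reported counts, taking the histologic classification as the reference: 3 of 15 C difficile colitis cases were PCR positive and all 19 ischemic colitis cases were PCR negative. A short worked check (the helper names are illustrative only):

```python
def sensitivity(true_pos, false_neg):
    """Share of reference-positive cases that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of reference-negative cases that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the abstract: 15 C difficile cases (3 PCR positive),
# 19 ischemic colitis cases (all PCR negative).
print(f"sensitivity {sensitivity(3, 15 - 3):.0%}")   # sensitivity 20%
print(f"specificity {specificity(19, 0):.0%}")       # specificity 100%
```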
The quadrant method measuring four points is as a reliable and accurate as the quadrant method in the evaluation after anatomical double-bundle ACL reconstruction.

PubMed

Mochizuki, Yuta; Kaneko, Takao; Kawahara, Keisuke; Toyoda, Shinya; Kono, Norihiko; Hada, Masaru; Ikegami, Hiroyasu; Musha, Yoshiro

2017-11-20

The quadrant method was described by Bernard et al. and has been widely used for postoperative evaluation of anterior cruciate ligament (ACL) reconstruction. The purpose of this research was to further develop the quadrant method by measuring four points, which we named the four-point quadrant method, and to compare it with the quadrant method. Three-dimensional computed tomography (3D-CT) analyses were performed in 25 patients who underwent double-bundle ACL reconstruction using the outside-in technique. The four points in this study's quadrant method were defined as point1-highest, point2-deepest, point3-lowest, and point4-shallowest in the femoral tunnel position. The depth and height at each point were measured. The antero-medial (AM) tunnel is (depth1, height2) and the postero-lateral (PL) tunnel is (depth3, height4) in this four-point quadrant method. The 3D-CT images were evaluated independently by 2 orthopaedic surgeons. A second measurement was performed by both observers after a 4-week interval. Intra- and inter-observer reliability was calculated by means of the intra-class correlation coefficient (ICC). Also, the accuracy of the method was evaluated against the quadrant method. Intra-observer reliability was almost perfect for both the AM and PL tunnels (ICC > 0.81). Inter-observer reliability of the AM tunnel was substantial (ICC > 0.61) and that of the PL tunnel was almost perfect (ICC > 0.81). The AM tunnel position was 0.13% deeper and 0.58% higher, and the PL tunnel position was 0.01% shallower and 0.13% lower, compared with the quadrant method. The four-point quadrant method was found to have high intra- and inter-observer reliability and accuracy. This method can evaluate the tunnel position regardless of the shape and morphology of the bone tunnel aperture and can provide measurements that can be compared across various reconstruction methods. The four-point quadrant method of this study is considered to have clinical relevance in that it is a detailed and accurate tool for evaluating femoral tunnel position after ACL reconstruction. Case series, Level IV.
Construct Validity and Reliability of the SARA Gait and Posture Sub-scale in Early Onset Ataxia

PubMed Central

Lawerman, Tjitske F.; Brandsma, Rick; Verbeek, Renate J.; van der Hoeven, Johannes H.; Lunsing, Roelineke J.; Kremer, Hubertus P. H.; Sival, Deborah A.

2017-01-01

Aim: In children, gait and posture assessment provides a crucial marker for the early characterization, surveillance and treatment evaluation of early onset ataxia (EOA). For reliable data entry of studies targeting at gait and posture improvement, uniform quantitative biomarkers are necessary. Until now, the pediatric test construct of gait and posture scores of the Scale for Assessment and Rating of Ataxia sub-scale (SARA) is still unclear. In the present study, we aimed to validate the construct validity and reliability of the pediatric (SARAGAIT/POSTURE) sub-scale. Methods: We included 28 EOA patients [15.5 (6–34) years; median (range)]. For inter-observer reliability, we determined the ICC on EOA SARAGAIT/POSTURE sub-scores by three independent pediatric neurologists. For convergent validity, we associated SARAGAIT/POSTURE sub-scores with: (1) Ataxic gait Severity Measurement by Klockgether (ASMK; dynamic balance), (2) Pediatric Balance Scale (PBS; static balance), (3) Gross Motor Function Classification Scale -extended and revised version (GMFCS-E&R), (4) SARA-kinetic scores (SARAKINETIC; kinetic function of the upper and lower limbs), (5) Archimedes Spiral (AS; kinetic function of the upper limbs), and (6) total SARA scores (SARATOTAL; i.e., summed SARAGAIT/POSTURE, SARAKINETIC, and SARASPEECH sub-scores). For discriminant validity, we investigated whether EOA co-morbidity factors (myopathy and myoclonus) could influence SARAGAIT/POSTURE sub-scores. Results: The inter-observer agreement (ICC) on EOA SARAGAIT/POSTURE sub-scores was high (0.97). SARAGAIT/POSTURE was strongly correlated with the other ataxia and functional scales [ASMK (rs = -0.819; p < 0.001); PBS (rs = -0.943; p < 0.001); GMFCS-E&R (rs = -0.862; p < 0.001); SARAKINETIC (rs = 0.726; p < 0.001); AS (rs = 0.609; p = 0.002); and SARATOTAL (rs = 0.935; p < 0.001)]. 
Comorbid myopathy influenced SARAGAIT/POSTURE scores by concurrent muscle weakness, whereas comorbid myoclonus predominantly influenced SARAKINETIC scores. Conclusion: In young EOA patients, separate SARAGAIT/POSTURE parameters reveal a good inter-observer agreement and convergent validity, implicating the reliability of the scale. In perspective of incomplete discriminant validity, it is advisable to interpret SARAGAIT/POSTURE scores for comorbid muscle weakness. PMID:29326569
Validity and reliability of a method for assessment of cervical vertebral maturation.

PubMed

Zhao, Xiao-Guang; Lin, Jiuxiang; Jiang, Jiu-Hui; Wang, Qingzhu; Ng, Sut Hong

2012-03-01

To evaluate the validity and reliability of the cervical vertebral maturation (CVM) method with a longitudinal sample. Eighty-six cephalograms from 18 subjects (5 males and 13 females) were selected from the longitudinal database. Total mandibular length was measured on each film; an increased rate served as the gold standard in examination of the validity of the CVM method. Eleven orthodontists, after receiving intensive training in the CVM method, evaluated all films twice. Kendall's W and the weighted kappa statistic were employed. Kendall's W values were higher than 0.8 at both times, indicating strong interobserver reproducibility, but interobserver agreement was documented twice at less than 50%. A wide range of intraobserver agreement was noted (40.7%-79.1%), and substantial intraobserver reproducibility was proved by kappa values (0.53-0.86). With regard to validity, moderate agreement was reported between the gold standard and observer staging at the initial time (kappa values 0.44-0.61). However, agreement seemed to be unacceptable for clinical use, especially in cervical stage 3 (26.8%). Even though the validity and reliability of the CVM method proved statistically acceptable, we suggest that many other growth indicators should be taken into consideration in evaluating adolescent skeletal maturation.
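Kendall's W, used in this abstract as a measure of interobserver reproducibility, can be sketched as follows for the simple case where each observer assigns a complete ranking without ties. Ordinal staging data such as cervical stages would normally contain ties and require a tie correction, which this sketch omits; the function name `kendalls_w` and the example rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance, without tie correction.

    `rankings` holds one list per observer; each list gives the ranks 1..n
    that the observer assigned to the same n items.
    """
    m = len(rankings)
    if m < 2:
        raise ValueError("need at least two observers")
    n = len(rankings[0])
    if n < 2 or any(len(r) != n for r in rankings):
        raise ValueError("every observer must rank the same n >= 2 items")

    # Sum of ranks each item received across observers
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2

    # Squared deviations of the rank sums from their mean
    s = sum((t - mean_rank_sum) ** 2 for t in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of four films by three observers; not study data.
print(kendalls_w([[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]]))
```

W ranges from 0 (no concordance) to 1 (identical rankings). A high W alongside low exact agreement, as the abstract reports, is possible because W rewards consistent ordering even when observers disagree on the exact stage.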
Interobserver Variability and Accuracy of High-Definition Endoscopic Diagnosis for Gastric Intestinal Metaplasia among Experienced and Inexperienced Endoscopists

PubMed Central

Hyun, Yil Sik; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo

2013-01-01

Accurate diagnosis of gastric intestinal metaplasia (IM) is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing it. The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Fifty selected cases, imaged with HD endoscopy, were sent to five experienced and five inexperienced endoscopists for diagnosis of gastric IM by visual inspection. The interobserver agreement between endoscopists was evaluated to verify the diagnostic reliability of HD endoscopy in diagnosing IM, and the diagnostic accuracy, sensitivity, and specificity were evaluated to assess the validity of HD endoscopy in diagnosing IM. Interobserver agreement among the experienced endoscopists was "poor" (κ = 0.38) and it was also "poor" (κ = 0.33) among the inexperienced endoscopists. The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Since visual inspection is unreliable for diagnosing IM, biopsy should be considered for all areas suspicious for gastric IM. Furthermore, endoscopic experience and education are needed to raise the accuracy of diagnosing gastric IM. PMID:23678267
Interobserver variability and accuracy of high-definition endoscopic diagnosis for gastric intestinal metaplasia among experienced and inexperienced endoscopists.

PubMed

Hyun, Yil Sik; Han, Dong Soo; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo

2013-05-01

Accurate diagnosis of gastric intestinal metaplasia is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing gastric intestinal metaplasia (IM). The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Selected 50 cases, taken with HD endoscopy, were sent for a diagnostic inquiry of gastric IM through visual inspection to five experienced and five inexperienced endoscopists. The interobserver agreement between endoscopists was evaluated to verify the diagnostic reliability of HD endoscopy in diagnosing IM, and the diagnostic accuracy, sensitivity, and specificity were evaluated for validity of HD endoscopy in diagnosing IM. Interobserver agreement among the experienced endoscopists was "poor" (κ = 0.38) and it was also "poor" (κ = 0.33) among the inexperienced endoscopists. The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Since diagnosis through visual inspection is unreliable in the diagnosis of IM, all suspicious areas for gastric IM should be considered to be biopsied. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy of gastric IM.
Clinical assessment of effusion in knee osteoarthritis—A systematic review

PubMed Central

Maricar, Nasimah; Callaghan, Michael J.; Parkes, Matthew J.; Felson, David T.; O'Neill, Terence W.

2016-01-01

Objective: The aim of this systematic review was to determine the validity and inter- and intra-observer reliability of the assessment of knee joint effusion in osteoarthritis (OA) of the knee. Methods: MEDLINE, Web of Knowledge, CINAHL, EMBASE, and AMED were searched from their inception to February 2015. Articles were included according to a priori defined criteria: samples containing participants with knee OA; prospective evaluation of clinical tests and assessments of knee effusion that included reliability, sensitivity, and specificity of these tests. Results: A total of 10 publications were reviewed. Eight of these considered reliability and four considered the validity of clinical assessments against ultrasound effusion. It was not possible to undertake a meta-analysis of reliability or validity because of differences in study designs and the clinical tests. Intra-observer kappa agreement for visible swelling ranged from 0.37 (suprapatellar) to 1.0 (prepatellar); for bulge sign 0.47 and balloon sign 0.37. Inter-observer kappa agreement for visible swelling ranged from −0.02 (prepatellar) to 0.65 (infrapatellar), the balloon sign −0.11 to 0.82, patellar tap −0.02 to 0.75 and bulge sign kappa −0.04 to 0.14 or reliability coefficient 0.97. Reliability and diagnostic accuracy tended to be better in experienced observers. Very few data looked at performance of individual clinical tests with sensitivity ranging 18.2–85.7% and specificity 35.3–93.3%, both higher with larger effusions. Conclusion: The majority of unstandardized clinical tests to assess joint effusion in knee OA had relatively low intra- and inter-observer reliability. There is some evidence that experience improved reliability and diagnostic accuracy of tests. Currently there is insufficient evidence to recommend any particular test in clinical practice. PMID:26581486
Clinical assessment of effusion in knee osteoarthritis-A systematic review.

PubMed

Maricar, Nasimah; Callaghan, Michael J; Parkes, Matthew J; Felson, David T; O'Neill, Terence W

2016-04-01

The aim of this systematic review was to determine the validity and inter- and intra-observer reliability of the assessment of knee joint effusion in osteoarthritis (OA) of the knee. MEDLINE, Web of Knowledge, CINAHL, EMBASE, and AMED were searched from their inception to February 2015. Articles were included according to a priori defined criteria: samples containing participants with knee OA; prospective evaluation of clinical tests and assessments of knee effusion that included reliability, sensitivity, and specificity of these tests. A total of 10 publications were reviewed. Eight of these considered reliability and four on validity of clinical assessments against ultrasound effusion. It was not possible to undertake a meta-analysis of reliability or validity because of differences in study designs and the clinical tests. Intra-observer kappa agreement for visible swelling ranged from 0.37 (suprapatellar) to 1.0 (prepatellar); for bulge sign 0.47 and balloon sign 0.37. Inter-observer kappa agreement for visible swelling ranged from -0.02 (prepatellar) to 0.65 (infrapatellar), the balloon sign -0.11 to 0.82, patellar tap -0.02 to 0.75 and bulge sign kappa -0.04 to 0.14 or reliability coefficient 0.97. Reliability and diagnostic accuracy tended to be better in experienced observers. Very few data looked at performance of individual clinical tests with sensitivity ranging 18.2-85.7% and specificity 35.3-93.3%, both higher with larger effusions. The majority of unstandardized clinical tests to assess joint effusion in knee OA had relatively low intra- and inter-observer reliability. There is some evidence experience improved reliability and diagnostic accuracy of tests. Currently there is insufficient evidence to recommend any particular test in clinical practice. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
China Report, Red Flag No. 16 August 1982

DTIC Science & Technology

1982-10-12

illegal activities. Such phenomena are clear illustrations of the egoism of the capitalist classes which has developed in the hearts of some...and educate the people on the ideas and ethics of communism. Article 20 in the "General Principles" stipulates in principle the development of...one important aspect of socialist spiritual civilization—political consciousness and ideas and ethics. This article embraces three contents: One is
AFRRI (Armed Forces Radiobiology Research Institute) Annual Research Report, 1 October 1984 through 30 September 1985.

DTIC Science & Technology

1985-09-30

locomotor performance. To evaluate the effects of radiation on social behaviors. To determine how ionizing radiation alters strength and duration of...on social behaviors and the behavioral pharmacology of social behaviors. Study involvement of CNS autostimulation of the immune system of irradiated...marrow cultured in medium not supplemented with the extract. In addition, marrow cultured in media supplemented with various collagen fractions did
New Results on a Stochastic Duel Game with Each Force Consisting of Heterogeneous Units

DTIC Science & Technology

2013-02-01

NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA NEW RESULTS ON A STOCHASTIC DUEL GAME WITH EACH FORCE CONSISTING OF...on a Stochastic Duel Game With Each Force Consisting of Heterogeneous Units 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER...distribution is unlimited 13. SUPPLEMENTARY NOTES 14. ABSTRACT Two forces engage in a duel, with each force initially consisting of several
Polar Cap and Polar Cap Boundary Phenomena

DTIC Science & Technology

2009-06-25

of the high-latitude ionospheric plasma. Incoherent scatter radar and radio tomography measurements were used to directly observe the remnants of...On the relationship between thin Birkeland current arcs and reversed flow channels in the winter cusp/cleft ionosphere Moen J., Y. Rinne, H...current arcs in the winter cusp ionosphere above Svalbard. An RFE is a longitudinally elongated, 100–200 km wide channel, in which the flow direction is

Agrarian Transformations in North Korea

DTIC Science & Technology

1960-12-22

meat — to 400,000 tons. The culmination of the cooperative movement in agriculture and the mighty growth of agricultural production in the KNDrl...stand for socialist development. The land reform and the cooperative movement are a serious school for the political and economic re-education of...on the one hand, and small-peasant agricultural economy on the other. The success of the cooperative movement affirms the correctness of the
The Cleft Aesthetic Rating Scale for 18-Year-Old Unilateral Cleft Lip and Palate Patients: A Tool for Nasolabial Aesthetics Assessment.

PubMed

Mulder, F J; Mosmuller, D G M; de Vet, H C W; Mouës, C M; Breugem, C C; van der Molen, A B Mink; Don Griot, J P W

2018-01-01

Objective: To develop a reliable and easy-to-use method to assess the nasolabial appearance of 18-year-old patients with unilateral cleft lip and palate (CLP). Design: Retrospective analysis of nasolabial aesthetics using a 5-point ordinal scale and newly developed photographic reference scale: the Cleft Aesthetic Rating Scale (CARS). Three cleft surgeons and 20 medical students scored the nasolabial appearance on standardized frontal photographs. Setting: VU University Medical Center, Amsterdam. Patients: Inclusion criteria: 18-year-old patients, unilateral cleft lip and palate, available photograph of the frontal view. Exclusion criteria: history of facial trauma, congenital syndromes affecting facial appearance. Eighty photographs were available for scoring. Main Outcome Measures: The interobserver and intraobserver reliability of the CARS for 18-year-old patients when used by cleft surgeons and medical students. Results: The interobserver reliability for the nose and lip together was 0.64 for the cleft surgeons and 0.61 for the medical students. There was an intraobserver reliability of 0.75 and 0.78 from the surgeons and students, respectively, on the nose and lip together. No significant difference was found between the cleft surgeons and medical students in the way they scored the nose (P = 0.22) and lip (P = 0.72). Conclusions: The Cleft Aesthetic Rating Scale for 18-year-old patients has a substantial overall estimated reliability when the average score is taken from three or more cleft surgeons or medical students assessing the nasolabial aesthetics of CLP patients.
Concurrent validity and reliability of the Alberta Infant Motor Scale in premature infants.

PubMed

Almeida, Kênnea Martins; Dutra, Maria Virginia Peixoto; Mello, Rosane Reis de; Reis, Ana Beatriz Rodrigues; Martins, Priscila Silveira

2008-01-01

To verify the concurrent validity and interobserver reliability of the Alberta Infant Motor Scale (AIMS) in premature infants followed-up at the outpatient clinic of Instituto Fernandes Figueira, Fundação Oswaldo Cruz (IFF/Fiocruz), in Rio de Janeiro, Brazil. A total of 88 premature infants were enrolled at the follow-up clinic at IFF/Fiocruz, between February and December of 2006. For the concurrent validity study, 46 infants were assessed at either 6 (n = 26) or 12 (n = 20) months' corrected age using the AIMS and the second edition of the Bayley Scales of Infant Development, by two different observers, and applying Pearson's correlation coefficient to analyze the results. For the reliability study, 42 infants between 0 and 18 months were assessed using the Alberta Infant Motor Scale, by two different observers and the results analyzed using the intraclass correlation coefficient. The concurrent validity study found a high level of correlation between the two scales (r = 0.95) and one that was statistically significant (p < 0.01) for the entire population of infants, with higher values at 12 months (r = 0.89) than at 6 months (r = 0.74). The interobserver reliability study found satisfactory intraclass correlation coefficients at all ages tested, varying from 0.76 to 0.99. The AIMS is a valid and reliable instrument for the evaluation of motor development in high-risk infants within the Brazilian public health system.
Training induces scapular dyskinesis in pain-free competitive swimmers: a reliability and observational study.

PubMed

Madsen, Pernille H; Bak, Klaus; Jensen, Susanne; Welter, Ulrik

2011-03-01

Scapular dyskinesis is a major etiological factor in overhead athletes' shoulder problems. Our hypotheses were to evaluate if (1) visual observation of scapular dyskinesis during scaption has substantial interobserver reliability, and (2) scapular dyskinesis may be induced by swim training in pain-free swimmers. A reliability and observational study. Bachelor project at a college institution and at a private sports orthopedic hospital. Seventy-eight competitive swimmers with no history of shoulder pain were included in the study. Fourteen swimmers were evaluated regarding reliability. Inclusion criteria were competitive swimmers with high training volume who previously had no shoulder pain. Observations of scapular dyskinesis (yes/no) during simple scaption. The interobserver reliability of scaption and wall push-up was evaluated in 14 swimmers using kappa analysis. Prevalence of scapular dyskinesis at 4 time intervals during a swim training session. The scaption test resulted in a weighted kappa value of 0.75. Scapular dyskinesis was seen in 29 shoulders (37%) after the first time interval, in another 24 (cumulated prevalence 68%) after one-half of the training session, and in an additional 4 swimmers (cumulated prevalence 73%) after three-quarters of the training session. During the last quarter of the training session, another 7 swimmers had dyskinesis, resulting in a cumulated prevalence of 82%. The prevalence of abnormal scapular kinesis during a normal training session is high in previously pain-free swimmers. The prevalence increases with more training and occurs early during the training session.
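The cumulated prevalences quoted in this abstract can be checked with simple arithmetic. The following is a minimal sketch, assuming that each reported count is a newly affected swimmer and that the denominator is all 78 swimmers; the function name is illustrative and not taken from the study.

```python
# Newly observed cases of scapular dyskinesis per time interval, as quoted in the abstract.
new_cases = [29, 24, 4, 7]
n_swimmers = 78  # assumption: every percentage uses all 78 swimmers as denominator


def cumulative_prevalence(cases, n):
    """Return the cumulative prevalence (in percent) after each interval."""
    total = 0
    result = []
    for count in cases:
        total += count
        result.append(100.0 * total / n)
    return result


prevalence = cumulative_prevalence(new_cases, n_swimmers)
print([round(p) for p in prevalence])  # [37, 68, 73, 82]
```

Under these assumptions the rounded values match the 37%, 68%, 73% and 82% reported above.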
Scapula fractures: interobserver reliability of classification and treatment.

PubMed

Neuhaus, Valentin; Bot, Arjan G J; Guitton, Thierry G; Ring, David C; Abdel-Ghany, Mahmoud I; Abrams, Jeffrey; Abzug, Joshua M; Adolfsson, Lars E; Balfour, George W; Bamberger, H Brent; Barquet, Antonio; Baskies, Michael; Batson, W Arnold; Baxamusa, Taizoon; Bayne, Grant J; Begue, Thierry; Behrman, Michael; Beingessner, Daphne; Biert, Jan; Bishop, Julius; Alves, Mateus Borges Oliveira; Boyer, Martin; Brilej, Drago; Brink, Peter R G; Brunton, Lance M; Buckley, Richard; Cagnone, Juan Carlos; Calfee, Ryan P; Campinhos, Luiz Augusto B; Cassidy, Charles; Catalano, Louis; Chivers, Karel; Choudhari, Pradeep; Cimerman, Matej; Conflitti, Joseph M; Costanzo, Ralph M; Crist, Brett D; Cross, Brian J; Dantuluri, Phani; Darowish, Michael; de Bedout, Ramon; DeCoster, Thomas; Dennison, David G; DeNoble, Peter H; DeSilva, Gregory; Dienstknecht, Thomas; Duncan, Scott F; Duralde, Xavier A; Durchholz, Holger; Egol, Kenneth; Ekholm, Carl; Elias, Nelson; Erickson, John M; Esparza, J Daniel Espinosa; Fernandes, C H; Fischer, Thomas J; Fischmeister, Martin; Forigua Jaime, E; Getz, Charles L; Gilbert, Richard S; Giordano, Vincenzo; Glaser, David L; Gosens, Taco; Grafe, Michael W; Filho, Jose Eduardo Grandi Ribeiro; Gray, Robert R L; Gulotta, Lawrence V; Gummerson, Nigel William; Hammerberg, Eric Mark; Harvey, Edward; Haverlag, R; Henry, Patrick D G; Hobby, Jonathan L; Hofmeister, Eric P; Hughes, Thomas; Itamura, John; Jebson, Peter; Jenkinson, Richard; Jeray, Kyle; Jones, Christopher M; Jones, Jedediah; Jubel, Axel; Kaar, Scott G; Kabir, K; Kaplan, F Thomas D; Kennedy, Stephen A; Kessler, Michael W; Kimball, Hervey L; Kloen, Peter; Klostermann, Cyrus; Kohut, Georges; Kraan, G A; Kristan, Anze; Loebenberg, Mark I; Malone, Kevin J; Marsh, L; Martineau, Paul A; McAuliffe, John; McGraw, Iain; Mehta, Samir; Merchant, Milind; Metzger, Charles; Meylaerts, S A; Miller, Anna N; Wolf, Jennifer Moriatis; Murachovsky, Joel; Murthi, Anand; Nancollas, Michael; Nolan, Betsy M; Omara, Timothy; Omid, Reza; 
Ortiz, Jose A; Overbeck, Joachim P; Castillo, Alberto Pérez; Pesantez, Rodrigo; Polatsch, Daniel; Porcellini, G; Prayson, Michael; Quell, M; Ragsdell, Matthew M; Reid, James G; Reuver, J M; Richard, Marc J; Richardson, Martin; Rizzo, Marco; Rowinski, Sergio; Rubio, Jorge; Guerrero, Carlos G Sánchez; Satora, Wojciech; Schandelmaier, Peter; Scheer, Johan H; Schmidt, Andrew; Schubkegel, Todd A; Schulte, Leah M; Schumer, Evan D; Sears, Benjamin W; Shafritz, Adam B; Shortt, Nicholas L; Siff, Todd; Silva, Dario Mejia; Smith, Raymond Malcolm; Spruijt, Sander; Stein, Jason A; Pemovska, Emilija Stojkovska; Streubel, Philipp N; Swigart, Carrie; Swiontkowski, Marc; Thomas, George; Tolo, Eric T; Turina, Matthias; Tyllianakis, Minos; van den Bekerom, Michel P J; van der Heide, Huub; van de Sande, M A J; van Eerten, P V; Verbeek, Diederik O F; Hoffmann, David Victoria; Vochteloo, A J H; Wagenmakers, Robert; Wall, Christopher J; Wallensten, Richard; Wascher, Daniel C; Weiss, Lawrence; Wiater, J Michael; Wills, Brian P D; Wint, Jeffrey; Wright, Thomas; Young, Jason P; Zalavras, Charalampos; Zura, Robert D; Zyto, Karol

2014-03-01

There is substantial variation in the classification and management of scapula fractures. The first purpose of this study was to analyze the interobserver reliability of the OTA/AO classification and the New International Classification for Scapula Fractures. The second purpose was to assess the proportion of agreement among orthopaedic surgeons on operative or nonoperative treatment. Web-based reliability study. Independent orthopaedic surgeons from several countries were invited to classify scapular fractures in an online survey. One hundred three orthopaedic surgeons evaluated 35 movies of three-dimensional computerized tomography reconstruction of selected scapular fractures, representing a full spectrum of fracture patterns. Fleiss kappa (κ) was used to assess the reliability of agreement between the surgeons. The overall agreement on the OTA/AO classification was moderate for the types (A, B, and C, κ = 0.54) with a 71% proportion of rater agreement (PA) and for the 9 groups (A1 to C3, κ = 0.47) with a 57% PA. For the New International Classification, the agreement about the intraarticular extension of the fracture (Fossa (F), κ = 0.79) was substantial and the agreement about a fractured body (Body (B), κ = 0.57) or process was moderate (Process (P), κ = 0.53); however, PAs were more than 81%. The agreement on the treatment recommendation was moderate (κ = 0.57) with a 73% PA. The New International Classification was more reliable. Body and process fractures generated more disagreement than intraarticular fractures and need further clear definitions.
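For readers unfamiliar with the statistic named in this abstract, the following is a minimal sketch of the standard Fleiss kappa computation. The rating table is hypothetical toy data (4 cases, 3 raters, 3 categories) and is not taken from the study, which involved 103 surgeons and 35 cases.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned case i to category j. Every row must sum to the same
    number of raters."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Overall proportion of all assignments that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs for each case.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: rows are cases, columns are fracture types A, B, C.
toy_table = [
    [3, 0, 0],
    [2, 1, 0],
    [0, 3, 0],
    [0, 1, 2],
]
print(round(fleiss_kappa(toy_table), 2))
```

The proportion of rater agreement (PA) reported alongside kappa in the abstract is a separate, simpler quantity and is not computed here.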
Use of the smartphone for end vertebra selection in scoliosis.

PubMed

Pepe, Murad; Kocadal, Onur; Iyigun, Abdullah; Gunes, Zafer; Aksahin, Ertugrul; Aktekin, Cem Nuri

2017-03-01

The aim of our study was to develop a smartphone-aided end vertebra selection method and to investigate its effectiveness in Cobb angle measurement. Twenty-nine adolescent idiopathic scoliosis patients' pre-operative posteroanterior scoliosis radiographs were used for end vertebra selection and Cobb angle measurement by standard method and smartphone-aided method. Measurements were performed by 7 examiners. The intraclass correlation coefficient was used to analyze selection and measurement reliability. Summary statistics of variance calculations were used to provide 95% prediction limits for the error in Cobb angle measurements. A paired 2-tailed t test was used to analyze end vertebra selection differences. Mean absolute Cobb angle difference was 3.6° for the manual method and 1.9° for the smartphone-aided method. Both intraobserver and interobserver reliability were found to be excellent in the manual and smartphone sets for Cobb angle measurement. Both intraobserver and interobserver reliability were also found to be excellent in the manual and smartphone sets for end vertebra selection. However, reliability values for the manual set were lower than for the smartphone set. Two observers selected significantly different end vertebrae in their repeated selections with the manual method. The smartphone-aided method for end vertebra selection and Cobb angle measurement showed excellent reliability. We can expect a reduction in measurement error rates with the widespread use of this method in clinical practice. Level III, Diagnostic study. Copyright © 2016 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
Development and validation of a tool to evaluate the quality of medical education websites in pathology.

PubMed

Alyusuf, Raja H; Prasad, Kameshwar; Abdel Satir, Ali M; Abalkhail, Ali A; Arora, Roopa K

2013-01-01

The exponential growth in the use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. The aim of this study is to develop and to validate a tool which evaluates the quality of undergraduate medical educational websites, and to apply it to the field of pathology. A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validating the tool. Tool validation included measurement of inter-observer reliability and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ = 0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ = 0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites.
Multicenter accuracy and interobserver agreement of spot sign identification in acute intracerebral hemorrhage.

PubMed

Huynh, Thien J; Flaherty, Matthew L; Gladstone, David J; Broderick, Joseph P; Demchuk, Andrew M; Dowlatshahi, Dar; Meretoja, Atte; Davis, Stephen M; Mitchell, Peter J; Tomlinson, George A; Chenkin, Jordan; Chia, Tze L; Symons, Sean P; Aviv, Richard I

2014-01-01

Rapid, accurate, and reliable identification of the computed tomography angiography spot sign is required to identify patients with intracerebral hemorrhage for trials of acute hemostatic therapy. We sought to assess the accuracy and interobserver agreement for spot sign identification. A total of 131 neurology, emergency medicine, and neuroradiology staff and fellows underwent imaging certification for spot sign identification before enrolling patients in 3 trials targeting spot-positive intracerebral hemorrhage for hemostatic intervention (STOP-IT, SPOTLIGHT, STOP-AUST). Ten intracerebral hemorrhage cases (spot-positive/negative ratio, 1:1) were presented for evaluation of spot sign presence, number, and mimics. True spot positivity was determined by consensus of 2 experienced neuroradiologists. Diagnostic performance, agreement, and differences by training level were analyzed. Mean accuracy, sensitivity, and specificity for spot sign identification were 87%, 78%, and 96%, respectively. Overall sensitivity was lower than specificity (P<0.001) because of true spot signs incorrectly perceived as spot mimics. Interobserver agreement for spot sign presence was moderate (k=0.60). When true spots were correctly identified, 81% correctly identified the presence of single or multiple spots. Median time needed to evaluate the presence of a spot sign was 1.9 minutes (interquartile range, 1.2-3.1 minutes). Diagnostic performance, interobserver agreement, and time needed for spot sign evaluation were similar among staff physicians and fellows. Accuracy for spot identification is high with opportunity for improvement in spot interpretation sensitivity and interobserver agreement particularly through greater reliance on computed tomography angiography source data and awareness of limitations of multiplanar images. Further prospective study is needed.
Error in geometric morphometric data collection: Combining data from multiple sources.

PubMed

Robinson, Chris; Terhune, Claire E

2017-09-01

This study compares two- and three-dimensional morphometric data to determine the extent to which intra- and interobserver and intermethod error influence the outcomes of statistical analyses. Data were collected five times for each method and observer on 14 anthropoid crania using calipers, a MicroScribe, and 3D models created from NextEngine and microCT scans. ANOVA models were used to examine variance in the linear data at the level of genus, species, specimen, observer, method, and trial. Three-dimensional data were analyzed using geometric morphometric methods; principal components analysis was employed to examine how trials of all specimens were distributed in morphospace and Procrustes distances among trials were calculated and used to generate UPGMA trees to explore whether all trials of the same individual grouped together regardless of observer or method. Most variance in the linear data was at the genus level, with greater variance at the observer than method levels. In the 3D data, interobserver and intermethod error were similar to intraspecific distances among Callicebus cupreus individuals, with interobserver error being higher than intermethod error. Generally, taxa separate well in morphospace, with different trials of the same specimen typically grouping together. However, trials of individuals in the same species overlapped substantially with one another. Researchers should be cautious when compiling data from multiple methods and/or observers, especially if analyses are focused on intraspecific variation or closely related species, as in these cases, patterns among individuals may be obscured by interobserver and intermethod error. Conducting interobserver and intermethod reliability assessments prior to the collection of data is recommended. © 2017 Wiley Periodicals, Inc.
Reliability of the MDi Psoriasis® Application to Aid Therapeutic Decision-Making in Psoriasis.

PubMed

Moreno-Ramírez, D; Herrerías-Esteban, J M; Ojeda-Vila, T; Carrascosa, J M; Carretero, G; de la Cueva, P; Ferrándiz, C; Galán, M; Rivera, R; Rodríguez-Fernández, L; Ruiz-Villaverde, R; Ferrándiz, L

2017-09-01

Therapeutic decisions in psoriasis are influenced by disease factors (e.g., severity or location), comorbidity, and demographic and clinical features. We aimed to assess the reliability of a mobile telephone application (MDi-Psoriasis) designed to help the dermatologist make decisions on how to treat patients with moderate to severe psoriasis. We analyzed interobserver agreement between the advice given by an expert panel and the recommendations of the MDi-Psoriasis application in 10 complex cases of moderate to severe psoriasis. The experts were asked their opinion on which treatments were most appropriate, possible, or inappropriate. Data from the same 10 cases were entered into the MDi-Psoriasis application. Agreement was analyzed in 3 ways: paired interobserver concordance (Cohen's κ), multiple interobserver concordance (Fleiss's κ), and percent agreement between recommendations. The mean percent agreement between the total of 1210 observations was 51.3% (95% CI, 48.5-54.1%). Cohen's κ statistic was 0.29 and Fleiss's κ was 0.28. Mean agreement between pairs of human observers only, excluding the MDi-Psoriasis recommendations, was 50.5% (95% CI, 47.6-53.5%). Paired agreement between the recommendations of the MDi-Psoriasis tool and the majority opinion of the expert panel (Cohen's κ) was 0.44 (68.2% agreement). The MDi-Psoriasis tool can generate recommendations that are comparable to those of experts in psoriasis. Copyright © 2017 AEDV. Publicado por Elsevier España, S.L.U. All rights reserved.
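Two of the three agreement measures named in this abstract, percent agreement and Cohen's kappa, can be illustrated for a single pair of observers. This is a minimal sketch on hypothetical ratings; the labels and data are invented for illustration and do not reproduce the study's 1210 observations.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Percentage of items on which two observers gave the same rating."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the two observers' marginal rating frequencies.
    categories = set(ratings_a) | set(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of six treatment options by an expert and by a tool.
expert = ["appropriate", "possible", "inappropriate",
          "appropriate", "possible", "inappropriate"]
tool = ["appropriate", "appropriate", "inappropriate",
        "appropriate", "possible", "possible"]

print(round(percent_agreement(expert, tool), 1))
print(round(cohen_kappa(expert, tool), 2))
```

Because kappa discounts the agreement expected by chance, it is typically lower than raw percent agreement, which is consistent with the pattern of figures in the abstract (for example, 68.2% agreement alongside κ = 0.44).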
77 FR 54917 - Findings of Research Misconduct

Federal Register 2010, 2011, 2012, 2013, 2014

2012-09-06

... values for inter-observer reliabilities when coding was done by only one observer, in both cases leading... Research Integrity (ORI) has taken final action in the following case: Marc Hauser, Ph.D., Harvard... collaborators that he miscoded some of the trials and that the study failed to provide support for the initial...
An Evaluation of Videoconferencing as a Supportive Technology for Practicum Supervision

ERIC Educational Resources Information Center

Dymond, Stacy K.; Renzaglia, Adelle; Halle, James W.; Chadsey, Janis; Bentz, Johnell L.

2008-01-01

In this study, the authors determine the efficacy of videoconferencing to supervise pre-service special education teachers. Efficacy is determined by (a) assessing interobserver reliability between on-site and off-site observers and (b) evaluating the feasibility and practicality of the videoconferencing technology. Data are collected in two…
Inter-observer and intra-observer reliability in the radiographic diagnosis of avascular necrosis of the femoral head following reconstructive hip surgery in children with cerebral palsy.

PubMed

Hesketh, Kim; Sankar, Wudbhav; Joseph, Benjamin; Narayanan, Unni; Mulpuri, Kishore

2016-04-01

The incidence of avascular necrosis (AVN) following reconstructive hip surgery in cerebral palsy (CP) ranges from 0 to 69% in the current literature. The purpose of this study was to determine the inter- and intra-observer reliability of radiographically diagnosing AVN in children with CP after hip surgery. A retrospective review of 65 children with CP who had reconstructive hip surgery between 2009 and 2012 at BC Children's Hospital was completed. Anterior-posterior and lateral radiographs were presented to four pediatric orthopaedic surgeons over two rounds. Surgeons were asked to review the set of unidentified radiographs and comment 'yes' or 'no' for the presence of AVN. Two weeks later the same set of radiographs was sent in a different order and the surgeons were again asked to comment on AVN. Inter- and intra-observer reliability was determined using kappa statistics. The intra-observer reliability ranged from 0.65 to 0.88 with an average score of 0.76. Inter-observer reliability showed greater variability, ranging from 0.41 to 0.77 with an average score of 0.56 across all surgeons. Although the intra-rater reliability produced a strength of "good" and the inter-rater reliability a strength of "moderate" agreement, the variability within these scores is clinically important as it demonstrates the difficulty in identifying AVN. This may explain the variability in AVN that is reported in the literature. Further education and research in the diagnosis of AVN in children with CP who have undergone reconstructive hip surgery are needed.
Pre-operative Duplex Ultrasonography in Arteriovenous Fistula Creation: Intra- and Inter-observer Agreement.

PubMed

Zonnebeld, Niek; Maas, Tommy M G; Huberts, Wouter; van Loon, Magda M; Delhaas, Tammo; Tordoir, Jan H M

2017-11-01

Although clinical guidelines on arteriovenous fistula (AVF) creation advocate minimum luminal arterial and venous diameters, assessed by duplex ultrasonography (DUS), the clinical value of routine DUS examination is under debate. DUS might be an insufficiently repeatable and/or reproducible imaging modality because of its operator dependency. The present study aimed to assess intra- and inter-observer agreement of DUS examination in support of AVF surgery planning. Ten end stage renal disease patients were included, to assess intra- and inter-observer agreement of pre-operative DUS measurements. All measurements were performed by two trained and experienced vascular technicians, blinded to measurement readings. From the routine DUS protocol, representative measurements (venous diameters, and arterial diameters and volume flow in the upper arm and forearm) were selected. For intra-observer agreement the measurements were performed in triplicate, with the probe released from the skin between each. Intraclass correlation coefficients were calculated for intra- and inter-observer agreement, and Bland-Altman plots used to graphically display mean measurement differences and limits of agreement. Ten patients (6 male, 59.4±19.7 years) consented to participate, and all predefined measurements were obtained. Intraclass correlation coefficients for intra-observer agreement of diameter measurements were at least 0.90 (95% CI 0.74-0.97; radial artery). Inter-observer agreement was at least 0.83 (0.46-0.96; lateral diameter upper arm cephalic vein). The Bland-Altman plots showed acceptable mean measurement differences and limits of agreement. In experienced hands, excellent intra- and inter-observer agreement can be reached for the discrete pre-operative DUS measurements advocated in clinical guidelines. DUS is therefore a reliable imaging modality to support AVF surgery planning. The content of DUS protocols, however, needs further standardisation. 
Copyright © 2017 European Society for Vascular Surgery. Published by Elsevier Ltd. All rights reserved.
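The Bland-Altman summary mentioned in the abstract above (mean measurement difference plus limits of agreement) can be sketched as follows. This is a minimal illustration only: the diameter values are hypothetical, not data from the study, the function name `bland_altman` is introduced here for convenience, and the conventional mean ± 1.96 SD limits are an assumption, since the abstract does not state which multiplier was used.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement
    are bias +/- 1.96 * sample SD of the differences (assumed convention).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical vessel diameters in mm from two observers (illustrative only).
observer_1 = [2.1, 2.4, 3.0, 2.8, 3.5]
observer_2 = [2.0, 2.5, 2.9, 3.0, 3.4]

bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias={bias:.3f} mm, limits of agreement=({lower:.3f}, {upper:.3f}) mm")
```

A bias near zero with narrow limits suggests the two observers can be used interchangeably; what counts as "narrow" is a clinical judgment rather than a statistical one.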
Drop Tower Characterization of Army Research Lab (ARL)-Fabricated Thin-Film Lead Zirconate Titanate (PZT) Transducers

DTIC Science & Technology

2012-04-01

TRADE NAME, TRADEMARK, MANUFACTURER, OR OTHERWISE, DOES NOT CONSTITUTE OR IMPLY ITS ENDORSEMENT, RECOMMENDATION, OR FAVORING BY THE U.S. GOVERNMENT...TRADE NAMES USE OF TRADE NAMES OR MANUFACTURERS IN THIS REPORT DOES NOT CONSTITUTE AN OFFICIAL ENDORSEMENT OR APPROVAL OF THE USE OF...on Gel-Pak Ready for Noodle Wire Attachment ..................... 10 13. Waveshaper (on Right) Beside a 0.5-Inch Steel Ball
The Budget and Economic Outlook: Fiscal Years 2007 to 2016

DTIC Science & Technology

2006-01-01

2001 and 2003. d. The estimated trend in the ratio of output to hours worked in the nonfarm business sector. Total, Total, 1950- 1974- 1982- 1991...Potential Labor Productivity in the Nonfarm Business Sectord Overall Economy Nonfarm Business Sector TFP adjustments Contributions to the Growth of...on CBO’s Baseline Budget Projections 123D-1. Relationship of the Budget to the Federal Sector of the National Income and Product Accounts 128D-2
Source Camera Identification and Blind Tamper Detections for Images

DTIC Science & Technology

2007-04-24

measures and image quality measures in camera identification problem was studied using conjunction with a KNN classifier to identify the feature sets...shots varying from nature scenes to close-ups of people. We experimented with the KNN classifier (K=5) as well SVM algorithm of...on Acoustic, Speech and Signal Processing (ICASSP), France, May 2006, vol. 5, pp. 401-404. [9] H. Farid and S. Lyu, "Higher-order wavelet statistics
Capillary refill time: a study of interobserver reliability among nurses and nurse assistants.

PubMed

Brabrand, Mikkel; Hosbond, Susanne; Folkestad, Lars

2011-02-01

The interobserver variability of capillary refill time (CRT) has been questioned. Earlier studies of interobserver variability of CRT have included a large number of patients but few observers. The objective of our study was to investigate how a large group of nurses and nurse assistants would grade CRT. We recorded videos of the index finger of six medical patients and showed them to nurses and nurse assistants. They were asked to record the CRT and whether they found this value to be normal. The data were analyzed using the Fleiss kappa coefficient and graded according to the Landis and Koch criteria. Agreement between the exact values was evaluated using the intraclass correlation. Nine nurse assistants and 37 nurses participated. The patients were aged between 44 and 87 years. All but one patient had a systolic blood pressure reading above 130 mmHg. All had arterial blood oxygen saturation above 92% and all but one had normal body temperature. The κ value for normality was 0.56. The intraclass correlation for measurement of CRT was 0.62. This is the largest interobserver study of CRT in terms of the number of observers. We found only moderate agreement both for the exact value of CRT and for normality. We believe that CRT should be used with caution in clinical practice.
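The Fleiss kappa statistic named in the abstract above measures agreement among more than two raters on a categorical judgment (here, normal versus abnormal CRT). A minimal sketch, assuming every subject is rated by the same number of raters; the rating counts are hypothetical and are not the study's data, and the function name `fleiss_kappa` is introduced here for illustration.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j; every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Overall proportion of assignments falling in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject (pairs of raters that agree).
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: 6 videos, 5 raters, columns = [normal, abnormal].
ratings = [[5, 0], [4, 1], [1, 4], [0, 5], [3, 2], [5, 0]]
kappa = fleiss_kappa(ratings)
print(f"Fleiss kappa = {kappa:.2f}")
```

On the Landis and Koch scale, values from 0.41 to 0.60 are conventionally labelled moderate and values from 0.61 to 0.80 substantial.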
The clinimetric qualities of patient-assessed instruments for measuring chronic ankle instability: a systematic review.

PubMed

Eechaute, Christophe; Vaes, Peter; Van Aerschot, Lieve; Asman, Sara; Duquet, William

2007-01-18

The assessment of outcomes from the patient's perspective is becoming more recognized in health care. In patients with chronic ankle instability, too, the degree of present impairments, disabilities and participation problems should be documented from the perspective of the patient. The decision about which patient-assessed instrument is most appropriate for clinical practice should be based upon systematic reviews. Only rating scales constructed for patients with acute ligament injuries were systematically reviewed in the past. The aim of this study was to review systematically the clinimetric qualities of patient-assessed instruments designed for patients with chronic ankle instability. A computerized literature search of Medline, Embase, Cinahl, Web of Science, Sport Discus and the Cochrane Controlled Trial Register was performed to identify eligible instruments. Two reviewers independently evaluated the clinimetric qualities of the selected instruments using a criteria list. The inter-observer reliability of both the selection procedure and the clinimetric evaluation was calculated using modified kappa coefficients. The inter-observer reliability of the selection procedure was excellent (k = .86). Four instruments met the eligibility criteria: the Ankle Joint Functional Assessment Tool (AJFAT), the Functional Ankle Outcome Score (FAOS), the Foot and Ankle Disability Index (FADI) and the Functional Ankle Ability Measure (FAAM). The inter-observer reliability of the quality assessment was substantial to excellent (k between .64 and .88). Test-retest reliability was demonstrated for the FAOS, the FADI and the FAAM but not for the AJFAT. The FAOS and the FAAM met the criteria for content validity and construct validity. Internal consistency was not sufficiently demonstrated for any of the studied instruments. The presence of floor and ceiling effects was assessed for the FAOS but ceiling effects were present for all subscales. Responsiveness was demonstrated for the AJFAT, FADI and the FAAM. A minimal clinically important difference (MCID) was presented only for the FAAM. The FADI and the FAAM can be considered the most appropriate patient-assessed tools to quantify functional disabilities in patients with chronic ankle instability. The clinimetric qualities of the FAAM need to be further demonstrated in a specific population of patients with chronic ankle instability.
Evidence-Based School Behavior Assessment of Externalizing Behavior in Young Children

ERIC Educational Resources Information Center

Bagner, Daniel M.; Boggs, Stephen R.; Eyberg, Sheila M.

2010-01-01

This study examined the psychometric properties of the Revised Edition of the School Observation Coding System (REDSOCS). Participants were 68 children ages 3 to 6 who completed parent-child interaction therapy for Oppositional Defiant Disorder as part of a larger efficacy trial. Interobserver reliability on REDSOCS categories was moderate to…

CLINICAL AUDIT OF IMAGE QUALITY IN RADIOLOGY USING VISUAL GRADING CHARACTERISTICS ANALYSIS.

PubMed

Tesselaar, Erik; Dahlström, Nils; Sandborg, Michael

2016-06-01

The aim of this work was to assess whether an audit of clinical image quality could be efficiently implemented within a limited time frame using visual grading characteristics (VGC) analysis. Lumbar spine radiography, bedside chest radiography and abdominal CT were selected. For each examination, images were acquired or reconstructed in two ways. Twenty images per examination were assessed by 40 radiology residents using visual grading of image criteria. The results were analysed using VGC. Inter-observer reliability was assessed. The results of the visual grading analysis were consistent with expected outcomes. The inter-observer reliability was moderate to good and correlated with perceived image quality (r² = 0.47). The median observation time per image or image series was within 2 min. These results suggest that the use of visual grading of image criteria to assess the quality of radiographs provides a rapid method for performing an image quality audit in a clinical environment. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Assessment of Interobserver Reliability in Nutrition Studies that Use Direct Observation of School Meals

PubMed Central

BAGLIO, MICHELLE L.; BAXTER, SUZANNE DOMEL; GUINN, CAROLINE H.; THOMPSON, WILLIAM O.; SHAFFER, NICOLE M.; FRYE, FRANCESCA H. A.

2005-01-01

This article (a) provides a general review of interobserver reliability (IOR) and (b) describes our method for assessing IOR for items and amounts consumed during school meals for a series of studies regarding the accuracy of fourth-grade children's dietary recalls validated with direct observation of school meals. A widely used validation method for dietary assessment is direct observation of meals. Although many studies utilize several people to conduct direct observations, few published studies indicate whether IOR was assessed. Assessment of IOR is necessary to determine that the information collected does not depend on who conducted the observation. Two strengths of our method for assessing IOR are that IOR was assessed regularly throughout the data collection period and that IOR was assessed for foods at the item and amount level instead of at the nutrient level. Adequate agreement among observers is essential to the reasoning behind using observation as a validation tool. Readers are encouraged to question the results of studies that fail to mention and/or to include the results for assessment of IOR when multiple people have conducted observations. PMID:15354155
Smartphone Photography as a Tool to Measure Knee Range of Motion.

PubMed

Mica, Megan Conti; Wagner, Eric R; Shin, Alexander Y

2018-01-01

The objective of this study was to validate measuring knee range of motion (ROM) from smartphone photography. Thirty-two participants (64 knees) obtained smartphone photographs of knee flexion and extension. Surgeons obtained the same photographs and goniometric measurement of ROM. ROM was measured using Adobe Photoshop. Goniometer versus digital measurements, participant versus surgeon photographs, and interobserver measurements were analyzed. The average difference in goniometer and digital photograph measurements was 5°. The interclass correlation was .642 (L) and .656 (R). The Bland-Altman plots demonstrated that 29/32 digital measurements were within the 95% confidence interval (CI). Participants' versus researchers' photographs averaged a 2° difference. The interclass correlation was .924 (L) and .91 (R). Bland-Altman plots demonstrated that 31/32 measurements were within the 95% CI. Interobserver reliability averaged a ROM difference of 5°. The concordance coefficients were .647 (L) and .723 (R). Bland-Altman plots demonstrated that 30 of 32 digital measurements were within the 95% CI. Measuring knee ROM using smartphone digital photography is valid and reliable. (Journal of Surgical Orthopaedic Advances 27(1):52-57, 2018).
Development and validation of a tool to evaluate the quality of medical education websites in pathology

PubMed Central

Alyusuf, Raja H.; Prasad, Kameshwar; Abdel Satir, Ali M.; Abalkhail, Ali A.; Arora, Roopa K.

2013-01-01

Background: The exponential use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. Aim: The aim of this study is to develop and to validate a tool, which evaluates the quality of undergraduate medical educational websites; and apply it to the field of pathology. Methods: A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validating the tool. Tool validation included measurement of inter-observer reliability; and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Results and Discussion: Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ = 0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ = 0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. Conclusion: A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites. PMID:24392243
RELIABILITY AND VALIDITY OF A BIOMECHANICALLY BASED ANALYSIS METHOD FOR THE TENNIS SERVE

PubMed Central

Kibler, W. Ben; Lamborn, Leah; Smith, Belinda J.; English, Tony; Jacobs, Cale; Uhl, Tim L.

2017-01-01

Background An observational tennis serve analysis (OTSA) tool was developed using previously established body positions from three-dimensional kinematic motion analysis studies. These positions, defined as nodes, have been associated with efficient force production and minimal joint loading. However, the tool has yet to be examined scientifically. Purpose The primary purpose of this investigation was to determine the inter-observer reliability for each node between two health care professionals (HCPs) that developed the OTSA, and secondarily to investigate the validity of the OTSA. Methods Two separate studies were performed to meet these objectives. An inter-observer reliability study preceded the validity study by examining 28 videos of players serving. Two HCPs graded each video and scored the presence or absence of obtaining each node. Discriminant validity was determined in 33 tennis players using videotaped records of three first serves. Serve mechanics were graded using the OTSA, and players were categorized into those with good ( ≥ 5) and poor ( ≤ 4) mechanics. Participants performed a series of field tests to evaluate trunk flexibility, lower extremity and trunk power, and dynamic balance. Results The group with good mechanics demonstrated greater backward trunk flexibility (p=0.02), greater rotational power (p=0.02), and higher single leg countermovement jump (p=0.05). Reliability of the OTSA ranged from K = 0.36-1.0, with the majority of all the nodes displaying substantial reliability (K>0.61). Conclusion This study provides HCPs with a valid and reliable field tool used to assess serve mechanics. Physical characteristics of trunk mobility and power appear to discriminate serve mechanics between players. Future intervention studies are needed to determine if improvement in physical function contributes to improved serve mechanics. Level of Evidence 3 PMID:28593098
Reliability of a new method for measuring coronal trunk imbalance, the axis-line-angle technique.

PubMed

Zhang, Rui-Fang; Liu, Kun; Wang, Xue; Liu, Qian; He, Jia-Wei; Wang, Xiang-Yang; Yan, Zhi-Han

2015-12-01

Accurate determination of the extent of trunk imbalance in the coronal plane plays a key role in an evaluation of patients with trunk imbalance, such as patients with adolescent idiopathic scoliosis. An established, widely used practice in evaluating trunk imbalance is to drop a plumb line from the C7 vertebra to a key reference axis, the central sacral vertical line (CSVL) in full-spine standing anterioposterior radiographs, and measuring the distance between them, the C7-CSVL. However, measuring the CSVL is subject to intraobserver differences, is error-prone, and is of poor reliability. Therefore, the development of a different way to measure trunk imbalance is needed. This study aimed to describe a new method to measure coronal trunk imbalance, the axis-line-angle technique (ALAT), which measures the angle at the intersection between the C7 plumb line and an axis line drawn from the vertebral centroid of the C7 to the middle of the superior border of the symphysis pubis, and to compare the reliability of the ALAT with that of the C7-CSVL. A prospective study at a university hospital was used. The patient sample consisted of sixty-nine consecutively enrolled men and women patients, aged 10-18 years, who had trunk imbalance defined as C7-CSVL longer than 20 mm on computed full-spine standing anterioposterior radiographs. Data were analyzed to determine the correlation between C7-CSVL and ALAT measurements and to determine intraobserver and interobserver reliabilities. Using a picture archiving and communication system, three radiologists independently evaluated trunk imbalance on the 69 computed radiographs by measuring the C7-CSVL and by measuring the angle determined by the ALAT. Data were analyzed to determine the correlations between the two measures of trunk imbalance, and to determine intraobserver and interobserver reliabilities of each of them. Overall results from the measurements by the C7-CSVL and the ALAT were significantly moderately correlated. 
Intraobserver assessments by measuring the C7-CSVL and by doing the ALAT failed to find any significant differences between the findings from the first and second assessments by the same radiologist. Interobserver assessments significantly differed between radiologists 1 and 2 for the first assessment measuring the C7-CSVL, and between radiologists 2 and 3 for the second assessment measuring the C7-CSVL. Interobserver assessments by doing the ALAT failed to find any significant differences among the three radiologists for either of the two assessments. Our results indicated that using the ALAT, which is simple and convenient, is of great value in measuring trunk imbalance. For measuring trunk imbalance, the ALAT has essential advantages compared with measuring the C7-CSVL. We encourage spine surgeons to consider using the ALAT in evaluating trunk imbalance. Copyright © 2015 Elsevier Inc. All rights reserved.
Intraoperative assessment of the stability of the distal tibiofibular joint in supination-external rotation injuries of the ankle: sensitivity, specificity, and reliability of two clinical tests.

PubMed

Pakarinen, Harri; Flinkkilä, Tapio; Ohtonen, Pasi; Hyvönen, Pekka; Lakovaara, Martti; Leppilahti, Juhana; Ristiniemi, Jukka

2011-11-16

This study was designed to assess the sensitivity, specificity, and interobserver reliability of the hook test and the stress test for the intraoperative diagnosis of instability of the distal tibiofibular joint following fixation of ankle fractures resulting from supination-external rotation forces. We conducted a prospective study of 140 patients with an unstable unilateral ankle fracture resulting from a supination-external rotation mechanism (Lauge-Hansen SE). After internal fixation of the malleolar fracture, a hook test and an external rotation stress test under fluoroscopy were performed independently by the lead surgeon and assisting surgeon, followed by a standardized 7.5-Nm external rotation stress test of each ankle under fluoroscopy. A positive stress test result was defined as a side-to-side difference of >2 mm in the tibiotalar or the tibiofibular clear space on mortise radiographs. The sensitivity and specificity of each test were calculated with use of the standardized 7.5-Nm external rotation stress test as a reference. Twenty-four (17%) of the 140 patients had a positive standardized 7.5-Nm external rotation stress test after internal fixation of the malleolar fracture. The hook test had a sensitivity of 0.25 (95% confidence interval, 0.12 to 0.45) and a specificity of 0.98 (95% confidence interval, 0.94 to 1.0) for the detection of the same instabilities. The external rotation stress test had a sensitivity of 0.58 (95% confidence interval, 0.39 to 0.76) and a specificity of 0.96 (95% confidence interval, 0.90 to 0.98). Both tests had excellent interobserver reliability, with 99% agreement for the hook test and 98% for the stress test. Interobserver agreement for the hook test and the clinical stress test was excellent, but the sensitivity of these tests was insufficient to adequately detect instability of the syndesmosis intraoperatively.
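Sensitivity, specificity, and raw percentage agreement, as reported in the abstract above, all derive from simple 2x2 counts. The sketch below shows the arithmetic under stated assumptions: the counts are hypothetical and chosen only for illustration (they are not the study's data), and the function names are introduced here rather than taken from the paper.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) of an index test against a reference.

    tp/fn: reference-positive cases the index test called positive/negative.
    fp/tn: reference-negative cases the index test called positive/negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def percent_agreement(ratings_a, ratings_b):
    """Share of cases on which two observers gave the same result."""
    same = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return same / len(ratings_a)


# Hypothetical counts: 40 reference-positive and 60 reference-negative cases.
sens, spec = sensitivity_specificity(tp=30, fn=10, fp=5, tn=55)

# Hypothetical paired results from two observers (True = test positive).
lead = [True, False, False, True, False, False, True, False]
assistant = [True, False, False, True, False, True, True, False]
agreement = percent_agreement(lead, assistant)

print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, agreement={agreement:.0%}")
```

Raw percentage agreement does not correct for agreement expected by chance, which is why many of the other records in this list report a kappa statistic instead.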
Interobserver reliability of the 'Welfare Quality® Animal Welfare Assessment Protocol for Growing Pigs'.

PubMed

Czycholl, I; Kniese, C; Büttner, K; Beilage, E Grosse; Schrader, L; Krieter, J

2016-01-01

The present paper focuses on evaluating the interobserver reliability of the 'Welfare Quality® Animal Welfare Assessment Protocol for Growing Pigs'. The protocol for growing pigs mainly consists of a Qualitative Behaviour Assessment (QBA), direct behaviour observations (BO) carried out by instantaneous scan sampling and checks for different individual parameters (IP), e.g. presence of tail biting, wounds and bursitis. Three trained observers collected the data by performing 29 combined assessments, which were done at the same time and on the same animals but carried out completely independently of each other. The findings were compared by the calculation of Spearman Rank Correlation Coefficients (RS), Intraclass Correlation Coefficients (ICC), Smallest Detectable Changes (SDC) and Limits of Agreements (LoA). No agreement was found concerning the adjectives belonging to the QBA (e.g. active: RS: 0.50, ICC: 0.30, SDC: 0.38, LoA: -0.05 to 0.45; fearful: RS: 0.06, ICC: 0.0, SDC: 0.26, LoA: -0.20 to 0.30). In contrast, the BO showed good agreement (e.g. social behaviour: RS: 0.45, ICC: 0.50, SDC: 0.09, LoA: -0.09 to 0.03; use of enrichment material: RS: 0.75, ICC: 0.68, SDC: 0.06, LoA: -0.03 to 0.03). Overall, observers agreed well in the IP, e.g. tail biting (RS: 0.52, ICC: 0.88, SDC: 0.05, LoA: -0.01 to 0.02) and wounds (RS: 0.43, ICC: 0.59, SDC: 0.10, LoA: -0.09 to 0.10). The parameter bursitis showed great differences (RS: 0.10, ICC: 0.0, SDC: 0.35, LoA: -0.37 to 0.40), which can be explained by difficulties in the assessment when the animals moved around quickly or their legs were soiled. In conclusion, the interobserver reliability was good in the BO and most IP, but not for the parameter bursitis and the QBA.
Identification of Nasal Bone Fractures on Conventional Radiography and Facial CT: Comparison of the Diagnostic Accuracy in Different Imaging Modalities and Analysis of Interobserver Reliability

PubMed Central

Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin

2013-01-01

Background There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. Objectives To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. Patients and Methods A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Results Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. Conclusion For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee. PMID:24348599
Physicians’ accuracy and interrator reliability for the diagnosis of unstable meniscal tears in patients having osteoarthritis of the knee

PubMed Central

Dervin, Geoffrey F.; Stiell, Ian G.; Wells, George A.; Rody, Kelly; Grabowski, Jenny

2001-01-01

Objective To determine clinicians’ accuracy and reliability for the clinical diagnosis of unstable meniscus tears in patients with symptomatic osteoarthritis of the knee. Design A prospective cohort study. Setting A single tertiary care centre. Patients One hundred and fifty-two patients with symptomatic osteoarthritis of the knee refractory to conservative medical treatment were selected for prospective evaluation of arthroscopic débridement. Intervention Arthroscopic débridement of the knee, including meniscal tear and chondral flap resection, without abrasion arthroplasty. Outcome measures A standardized assessment protocol was administered to each patient by 2 independent observers. Arthroscopic determination of unstable meniscal tears was recorded by 1 observer who reviewed a video recording and was blinded to preoperative data. Those variables that had the highest interobserver agreement and the strongest association with meniscal tear by univariate methods were entered into logistic regression to model the best prediction of resectable tears. Results There were 92 meniscal tears (77 medial, 15 lateral). Interobserver agreement between clinical fellows and treating surgeons was poor to fair (κ < 0.4) for all clinical variables except radiographic measures, which were good. Fellows and surgeons predicted unstable meniscal tear preoperatively with equivalent accuracy of 60%. Logistic regression modelling revealed that a history of swelling and a ballottable effusion were negative predictors. A positive McMurray test was the only positive predictor of unstable meniscal tear. “Mechanical” symptoms were not reliable predictors in this prospective study. The model was 69% accurate for all patients and 76% for those with advanced medial compartment osteoarthritis defined by a joint space height of 2 mm or less. 
Conclusions This study underscored the difficulty in using clinical variables to predict unstable medial meniscal tears in patients with pre-existing osteoarthritis of the knee. The lack of interobserver agreement must be overcome to ensure that the findings can be generalized to other physician observers. PMID:11504260
Detection of myocardial ischemia by automated, motion-corrected, color-encoded perfusion maps compared with visual analysis of adenosine stress cardiovascular magnetic resonance imaging at 3 T: a pilot study.

PubMed

Doesch, Christina; Papavassiliu, Theano; Michaely, Henrik J; Attenberger, Ulrike I; Glielmi, Christopher; Süselbeck, Tim; Fink, Christian; Borggrefe, Martin; Schoenberg, Stefan O

2013-09-01

The purpose of this study was to compare automated, motion-corrected, color-encoded (AMC) perfusion maps with qualitative visual analysis of adenosine stress cardiovascular magnetic resonance imaging for detection of flow-limiting stenoses. Myocardial perfusion measurements applying the standard adenosine stress imaging protocol and a saturation-recovery temporal generalized autocalibrating partially parallel acquisition (t-GRAPPA) turbo fast low angle shot (Turbo FLASH) magnetic resonance imaging sequence were performed in 25 patients using a 3.0-T MAGNETOM Skyra (Siemens Healthcare Sector, Erlangen, Germany). Perfusion studies were analyzed using AMC perfusion maps and qualitative visual analysis. Angiographically detected coronary artery (CA) stenoses greater than 75% or 50% or more with a myocardial perfusion reserve index less than 1.5 were considered as hemodynamically relevant. Diagnostic performance and time requirement for both methods were compared. Interobserver and intraobserver reliability were also assessed. A total of 29 CA stenoses were included in the analysis. Sensitivity, specificity, positive predictive value, negative predictive value, and accuracy for detection of ischemia on a per-patient basis were comparable using the AMC perfusion maps compared to visual analysis. On a per-CA territory basis, the attribution of an ischemia to the respective vessel was facilitated using the AMC perfusion maps. Interobserver and intraobserver reliability were better for the AMC perfusion maps (concordance correlation coefficient, 0.94 and 0.93, respectively) compared to visual analysis (concordance correlation coefficient, 0.73 and 0.79, respectively). In addition, in comparison to visual analysis, the AMC perfusion maps were able to significantly reduce analysis time from 7.7 (3.1) to 3.2 (1.9) minutes (P < 0.0001). The AMC perfusion maps yielded a diagnostic performance on a per-patient and on a per-CA territory basis comparable with the visual analysis. 
Furthermore, this approach demonstrated higher interobserver and intraobserver reliability as well as a better time efficiency when compared to visual analysis.
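The concordance correlation coefficient (CCC) reported in the abstract above penalizes both poor correlation and systematic offset between two sets of readings. A minimal sketch of Lin's formulation follows; the use of population (1/n) variances is the usual convention and is assumed here, the readings are hypothetical, and the function name `concordance_ccc` is introduced only for illustration.

```python
from statistics import mean


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired readings.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2),
    with covariance and variances computed using a 1/n denominator.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)


# Hypothetical readings: reader_b is reader_a shifted by a constant offset.
reader_a = [1.0, 2.0, 3.0, 4.0]
reader_b = [2.0, 3.0, 4.0, 5.0]

ccc_shifted = concordance_ccc(reader_a, reader_b)
ccc_identical = concordance_ccc(reader_a, reader_a)
print(f"CCC with offset = {ccc_shifted:.3f}, CCC identical = {ccc_identical:.3f}")
```

The two readers here are perfectly correlated in the Pearson sense, yet the constant offset pulls the CCC below 1, which is the property that makes it suitable for agreement studies.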
Have levels of evidence improved the quality of orthopaedic research?

PubMed

Cunningham, Brian P; Harmsen, Samuel; Kweon, Chris; Patterson, Jason; Waldrop, Robert; McLaren, Alex; McLemore, Ryan

2013-11-01

Since 2003 many orthopaedic journals have adopted grading systems for levels of evidence (LOE). It is unclear if the quality of orthopaedic literature has changed since LOE was introduced. We asked three questions: (1) Have the overall number and proportion of Level I and II studies increased in the orthopaedic literature since the introduction of LOE? (2) Is a similar pattern seen in individual orthopaedic subspecialty journals? (3) What is the interobserver reliability of grading LOE? We assigned LOE to therapeutic studies published in 2000, 2005, and 2010 in eight major orthopaedic subspecialty journals. Number and proportion of Level I and II publications were determined. Data were evaluated using log-linear models. Twenty-six reviewers (13 residents and 13 attendings) graded LOE of 20 blinded therapeutic articles from the Journal of Bone and Joint Surgery for 2009. Interobserver agreement relative to the Journal of Bone and Joint Surgery was assessed using a weighted kappa. The total number of Level I and II publications in subspecialty journals increased from 150 in 2000 to 239 in 2010. The proportion of high-quality publications increased with time (p < 0.001). All subspecialty journals other than the Journal of Pediatric Orthopaedics and the Journal of Orthopaedic Trauma showed a similar behavior. Average weighted kappa was 0.791 for residents and 0.842 for faculty (p = 0.209). The number and proportion of Level I and II publications have increased. LOE can be graded reliably with high interobserver agreement. The number and proportion of high-level studies should continue to increase.
Creation and validation of a visual macroscopic hematuria scale for optimal communication and an objective hematuria index.

PubMed

Wong, Lih-Ming; Chum, Jia-Min; Maddy, Peter; Chan, Steven T F; Travis, Douglas; Lawrentschuk, Nathan

2010-07-01

Macroscopic hematuria is a common symptom and sign that is challenging to quantify and describe. The degree of hematuria communicated is variable due to health worker experience combined with lack of a reliable grading tool. We produced a reliable, standardized visual scale to describe hematuria severity. Our secondary aim was to validate a new laboratory test to quantify hemoglobin in hematuria specimens. Nurses were surveyed to ascertain current hematuria descriptions. Blood and urine were titrated at varying concentrations and digitally photographed in catheter bag tubing. Photos were processed and printed on transparency paper to create a prototype swatch or card showing light, medium, heavy and old hematuria. Using the swatch 60 samples were rated by nurses and laymen. Interobserver variability was reported using the generalized kappa coefficient of agreement. Specimens were analyzed for hemolysis by measuring optical density at oxyhemoglobin absorption peaks. Interobserver agreement between nurses and laymen was good (kappa = 0.51, p <0.001). Subgroup analysis showed substantial agreement for light hematuria (kappa = 0.71). Overall agreement improved when the moderate (kappa = 0.28) and heavy (kappa = 0.53) hematuria categories were combined (kappa = 0.70). Compared to known blood concentrations the assay of optical density at oxyhemoglobin absorption peaks showed a linear trend. A simple visual scale to grade and communicate hematuria with adequate interobserver agreement is feasible. The test for optical density at oxyhemoglobin absorption peaks is a new method, validated in our study, to quantify hemoglobin in a hematuria specimen. Copyright (c) 2010 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Biomechanical Analysis of Military Boots. Phase 1. Materials Testing of Military and Commercial Footwear

DTIC Science & Technology

1992-10-01

N=8) and Results of 44 Statistical Analyses for Impact Test Performed on Forefoot of Unworn Footwear A-2. Summary Statistics (N=8) and Results of...on Forefoot of Worn Footwear Vlll Tables (continued) Table Page B-2. Summary Statistics (N=4) and Results of 76 Statistical Analyses for Impact...used tests to assess heel and forefoot shock absorption, upper and sole durability, and flexibility (Cavanagh, 1978). Later, the number of tests was
Unresolved Ethnic Conflict and Religious Revival in Russia: The Chechen Element

DTIC Science & Technology

2007-12-01

heritage and blood and soil . As one looks at the unsettled state of the world today, one might ask: Is the future of humanity damned because of...one can see that even in the midst of acknowledging “multiple identities”, the permeability of identity barriers is a tool to be utilized in the...creation of peaceful heterogeneity. Thus, the “ permeability ” is a result of acknowledging the multiple identities and needs of the various ethnic
Epigenetic Machinery Regulates Alternative Splicing of Androgen Receptor (AR) Gene in Castration Resistant Prostate Cancer

DTIC Science & Technology

2017-09-01

AWARD NUMBER: W81XWH-16-1-0531 TITLE: Epigenetic machinery regulates alternative splicing of androgen receptor ( AR ) gene in castration...DISTRIBUTION STATEMENT: Approved for Public Release Distribution Unlimited The views, opinions and/or findings contained in this report are those of...One of the reasons for the resistance to ADT and newer anti-androgen drugs is the emergence of constitutively active AR variants ( AR -Vs) such as AR
Climate Change: U.S.-China Partnership for Global Security

DTIC Science & Technology

2010-03-01

our best traditions – find common ground to move the country forward, keep our country safe and strong, and lay the groundwork for decades of...only can this be dangerous to the region in view of Indian on-going conflict with Pakistan . India is but one example of the widespread impact that...sustain the lives of nearly 3 billion people in this Southwest Asian region—in People’s Republic of China (PRC), India, and Pakistan . Changes within
The Impact of Emotion on Negotiation Behaviour during a Realistic Training Scenario

DTIC Science & Technology

2007-11-01

stratégique. Néanmoins, l’émotion du sergent semblait effectivement avoir une incidence négative sur les comportements des stagiaires. Comparativement à...in particular, a human rights violation scenario, is designed to test trainees’ abilities to negotiate in extreme conditions without the use of...One officer is designated as the Sgt and one as the constable. The scenario begins when a female victim runs out in full view of the trainees
The Results of an Experimental Restructuring of EO Staffing Patterns in a Infantry Division

DTIC Science & Technology

1979-10-01

Research, Inc. ARI FIELD UNIT AT PRESIDIO OF MONTEREY, CALIFORNIA JI NV 8 1984 -U 1 S: U. S. Army Research Institute for the Behavioral and Social ...12. REPORT DATE Army Research Institute for the Behavioral and Social October 1979 Sciences, 5001 Eisenhower Ave., Alexandria, VA 22333 I. NUMBER OF...one of thiree prepared under this contract. The other reports are: Unit Equal Opportunity Training Diagnosis and Assessment System, and Development and
Sub-Saharan Africa Report.

DTIC Science & Technology

1987-01-16

Botha (DIE AFRIKANER, 8 Oct 86) 55 s 57 (DIE AFRIKANER, 15 Oct 86) 59 BP Oil Company Integration Plans Viewed (DIE BURGER, 17 Nov 86...among things, ban the transporta- tion of crude oil to South Africa and Namibia on Norwegian ships, ban any form of investment and trans- fer of...on November 10 on an em- bargo against the sale and transport of oil to South Africa, and calls for the setting up of a United Nations "mechanism

Calcaneotalar ratio: a new concept in the estimation of the length of the calcaneus.

PubMed

David, Vikram; Stephens, Terry J; Kindl, Radek; Ang, Andy; Tay, Wei-Han; Asaid, Rafik; McCullough, Keith

2015-01-01

Maintaining the calcaneal length after calcaneal fractures is vital to restoring the normal biomechanics of the foot, because it acts as an important lever arm to the plantarflexors of the foot. However, estimation of the length of the calcaneus to be reconstructed in comminuted calcaneal fractures can be difficult. We propose a new method to reliably estimate the calcaneal length radiographically by defining the calcaneotalar length ratio. A total of 100 ankle radiographs with no fracture in the calcaneus or talus taken in skeletally mature patients were reviewed by 6 observers. The anteroposterior lengths of the calcaneus and talus were measured, and the calcaneotalar length ratio was determined. The ratio was then used to estimate the length of the calcaneus. Interobserver reliability was determined using Cronbach's α coefficient and Pearson's correlation coefficient. The mean length of the calcaneus was 75 ± 0.6 mm, and the mean length of the talus was 59 ± 0.5 mm. The calcaneotalar ratio was 1.3. Using this ratio and multiplying it by the talar length, the mean average estimated length of the calcaneus was within 0.7 mm of the known calcaneal length. Cronbach's α coefficient and Pearson's correlation coefficient showed excellent interobserver reliability. The proposed calcaneotalar ratio is a new and reliable method to radiographically estimate the normal length of the calcaneus when reconstructing the calcaneus. Copyright © 2015 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
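The estimation step described above is a single multiplication, sketched here for illustration. The function name is hypothetical, and the ratio is the rounded value of 1.3 given in the abstract; because of that rounding, results from this sketch will not necessarily reproduce the 0.7 mm accuracy the authors report.

```python
# Rounded calcaneotalar length ratio as reported in the abstract.
CALCANEOTALAR_RATIO = 1.3

def estimate_calcaneal_length(talar_length_mm, ratio=CALCANEOTALAR_RATIO):
    """Estimate calcaneal length (mm) from the measured talar length (mm)."""
    if talar_length_mm <= 0:
        raise ValueError("talar length must be positive")
    return ratio * talar_length_mm

# Using the reported mean talar length of 59 mm as input.
estimate_mm = estimate_calcaneal_length(59.0)
```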
Inter-observer reliability of animal-based welfare indicators included in the Animal Welfare Indicators welfare assessment protocol for dairy goats.

PubMed

Vieira, A; Battini, M; Can, E; Mattiello, S; Stilwell, G

2018-01-08

This study was conducted within the context of the Animal Welfare Indicators (AWIN) project and the underlying scientific motivation for the development of the study was the scarcity of data regarding inter-observer reliability (IOR) of welfare indicators, particularly given the importance of reliability as a further step for developing on-farm welfare assessment protocols. The objective of this study is therefore to evaluate IOR of animal-based indicators (at group and individual-level) of the AWIN welfare assessment protocol (prototype) for dairy goats. In the design of the study, two pairs of observers, one in Portugal and another in Italy, visited 10 farms each and applied the AWIN prototype protocol. Farms in both countries were visited between January and March 2014, and all the observers received the same training before the farm visits were initiated. Data collected during farm visits, and analysed in this study, include group-level and individual-level observations. The results of our study allow us to conclude that most of the group-level indicators presented the highest IOR level ('substantial', 0.85 to 0.99) in both field studies, pointing to a usable set of animal-based welfare indicators that were therefore included in the first level of the final AWIN welfare assessment protocol for dairy goats. Inter-observer reliability of individual-level indicators was lower, but the majority of them still reached 'fair to good' (0.41 to 0.75) and 'excellent' (0.76 to 1) levels. In the paper we explore reasons for the differences found in IOR between the group and individual-level indicators, including how the number of individual-level indicators to be assessed on each animal and the restraining method may have affected the results. Furthermore, we discuss the differences found in the IOR of individual-level indicators in both countries: the Portuguese pair of observers reached a higher level of IOR, when compared with the Italian observers. 
We argue how the reasons behind these differences may stem from the restraining method applied, or the different background and experience of the observers. Finally, the discussion of the results emphasizes the importance of considering that reliability is not an absolute attribute of an indicator, but derives from an interaction between the indicators, the observers and the situation in which the assessment is taking place. This highlights the importance of further considering the indicators' reliability while developing welfare assessment protocols.
Technical considerations in the evaluation of pediatric motor scales.

PubMed

Berk, R A; DeGangi, G A

1979-04-01

Guidelines are suggested for evaluating the validity and reliability of fine and gross motor scales. In the process of examining three types of validity (domain, construct, discriminant) and two types of reliability (interobserver, decision-making), it was found that there were marked deficiencies in most of the instruments currently available, particularly in the areas of discriminant validity and decision-making reliability. Each psychometric property of the scales was addressed from both scale-developer and user perspectives. An evaluative checklist is generated to assist occupational therapists who need to decide on the quality and appropriateness of a motor behavior scale for specific decision applications.
Perme Intensive Care Unit Mobility Score and ICU Mobility Scale: translation into Portuguese and cross-cultural adaptation for use in Brazil.

PubMed

Kawaguchi, Yurika Maria Fogaça; Nawa, Ricardo Kenji; Figueiredo, Thais Borgheti; Martins, Lourdes; Pires-Neto, Ruy Camargo

2016-01-01

To translate the Perme Intensive Care Unit Mobility Score and the ICU Mobility Scale (IMS) into Portuguese, creating versions that are cross-culturally adapted for use in Brazil, and to determine the interobserver agreement and reliability for both versions. The processes of translation and cross-cultural validation consisted of the following: preparation, translation, reconciliation, synthesis, back-translation, review, approval, and pre-test. The Portuguese-language versions of both instruments were then used by two researchers to evaluate critically ill ICU patients. Weighted kappa statistics and Bland-Altman plots were used in order to verify interobserver agreement for the two instruments. In each of the domains of the instruments, interobserver reliability was evaluated with Cronbach's alpha coefficient. The correlation between the instruments was assessed by Spearman's correlation test. The study sample comprised 103 patients, 56 (54%) of whom were male, with a mean age of 52 ± 18 years. The main reason for ICU admission (in 44%) was respiratory failure. Both instruments showed excellent interobserver agreement (> 0.90) and reliability (> 0.90) in all domains. Interobserver bias was low for the IMS and the Perme Score (-0.048 ± 0.350 and -0.06 ± 0.73, respectively). The 95% CIs for the same instruments ranged from -0.73 to 0.64 and -1.50 to 1.36, respectively. There was also a strong positive correlation between the two instruments (r = 0.941; p < 0.001). In their versions adapted for use in Brazil, both instruments showed high interobserver agreement and reliability.
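The intervals quoted for the two instruments are consistent with Bland-Altman 95% limits of agreement, that is, the mean interobserver difference plus or minus 1.96 standard deviations of the differences. The sketch below assumes that reading; the function name is hypothetical and only the bias and SD values come from the abstract.

```python
def limits_of_agreement(bias, sd, z=1.96):
    """Bland-Altman limits of agreement: bias - z*sd and bias + z*sd."""
    half_width = z * sd
    return bias - half_width, bias + half_width

# Bias and SD of the interobserver differences as reported for the ICU Mobility Scale.
ims_low, ims_high = limits_of_agreement(-0.048, 0.350)

# Bias and SD as reported for the Perme Score.
perme_low, perme_high = limits_of_agreement(-0.06, 0.73)
```

For the ICU Mobility Scale this gives about -0.73 to 0.64, matching the reported range. For the Perme Score it gives about -1.49 to 1.37, close to the reported -1.50 to 1.36; the small gap is plausibly due to rounding of the published bias and SD.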
Developing a General Outcome Measure of Growth in Movement for Infants and Toddlers.

ERIC Educational Resources Information Center

Greenwood, Charles R.; Luze, Gayle J.; Cline, Gabriel; Kuntz, Susan; Leitschuh, Carol

2002-01-01

The development of an experimental measure for assessing growth in movement in children (ages birth-3) is described. Results from the use of the Movement General Outcome Measurement with 29 infants and toddlers demonstrated the feasibility of the measure. The 6-minute assessment was found reliable in terms of inter-observer agreement. (Contains…
Quantitative Assessment of Motor and Sensory/Motor Acquisition in Handicapped and Nonhandicapped Infants and Young Children. Volume III: Replication of the Procedures.

ERIC Educational Resources Information Center

Guess, Doug; And Others

Ten replication studies based on quantitative procedures developed to measure motor and sensory/motor skill acquisition among handicapped and nonhandicapped infants and children are presented. Each study follows the original assessment procedures, and emphasizes the stability of interobserver reliability across time, consistency in the response…
The feasibility of measuring renal blood flow using transesophageal echocardiography in patients undergoing cardiac surgery.

PubMed

Yang, Ping-Liang; Wong, David T; Dai, Shuang-Bo; Song, Hai-Bo; Ye, Ling; Liu, Jin; Liu, Bin

2009-05-01

There is no reliable method to monitor renal blood flow intraoperatively. In this study, we evaluated the feasibility and reproducibility of left renal blood flow measurements using transesophageal echocardiography during cardiac surgery. In this prospective noninterventional study, left renal blood flow was measured with transesophageal echocardiography at three time points (pre-, intra-, and postcardiopulmonary bypass) in 60 patients undergoing cardiac surgery. Sonograms from 6 subjects were interpreted by 2 blinded independent assessors at the time of acquisition and 6 months later. Interobserver and intraobserver reproducibility were quantified by calculating variability and intraclass correlation coefficients. Patients with Doppler angles of >30 degrees (20 of 60 subjects) were eliminated from renal blood flow measurements. Left renal blood flow was successfully measured and analyzed in 36 of 60 (60%) subjects. Both interobserver and intraobserver variability were <10%. Interobserver and intraobserver reproducibility in left renal blood flow measurements were good to excellent (intraclass correlation coefficients 0.604-0.999). Left renal arterial luminal diameter for the pre-, intra-, and postcardiopulmonary bypass phases ranged from 3.8 to 4.1 mm, renal arterial velocity from 25 to 35 cm/s, and left renal blood flow from 192 to 299 mL/min. In patients undergoing cardiac surgery, it was feasible in 60% of the subjects to measure left renal blood flow using intraoperative transesophageal echocardiography. The interobserver and intraobserver reproducibility of renal blood flow measurements was good to excellent.
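The abstract does not state how flow was derived from the Doppler measurements. A common approach, assumed here purely for illustration, multiplies the vessel's cross-sectional area (from the luminal diameter) by the mean blood velocity. The function name and the example inputs are hypothetical, although the inputs fall within the ranges reported above.

```python
import math

def renal_blood_flow_ml_min(diameter_mm, velocity_cm_s):
    """Volumetric flow (mL/min) = cross-sectional area (cm^2) * velocity (cm/s) * 60."""
    radius_cm = diameter_mm / 10.0 / 2.0   # mm -> cm, diameter -> radius
    area_cm2 = math.pi * radius_cm ** 2    # assumes a circular lumen
    return area_cm2 * velocity_cm_s * 60.0 # cm^3/s -> mL/min

# Example: 4.0 mm lumen with a mean velocity of 30 cm/s.
flow = renal_blood_flow_ml_min(4.0, 30.0)
```

Because flow scales with the square of the diameter, a small disagreement between observers on a vessel about 4 mm wide translates into a proportionally larger disagreement in the computed flow.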
Variability in Cobb angle measurements using reformatted computerized tomography scans.

PubMed

Adam, Clayton J; Izatt, Maree T; Harvey, Jason R; Askin, Geoffrey N

2005-07-15

Survey of intraobserver and interobserver measurement variability. To assess the use of reformatted computerized tomography (CT) images for manual measurement of coronal Cobb angles in idiopathic scoliosis. Cobb angle measurements in idiopathic scoliosis are traditionally made from standing radiographs, whereas CT is often used for assessment of vertebral rotation. Correlating Cobb angles from standing radiographs with vertebral rotations from supine CT is problematic because the geometry of the spine changes significantly from standing to supine positions, and 2 different imaging methods are involved. We assessed the use of reformatted thoracolumbar CT images for Cobb angle measurement. Preoperative CT of 12 patients with idiopathic scoliosis were used to generate reformatted coronal images. Five observers measured coronal Cobb angles on 3 occasions from each of the images. Intraobserver and interobserver variability associated with Cobb measurement from reformatted CT scans was assessed and compared with previous studies of measurement variability using plain radiographs. For major curves, 95% confidence intervals for intraobserver and interobserver variability were +/-6.6 degrees and +/-7.7 degrees, respectively. For minor curves, the intervals were +/-7.5 degrees and +/-8.2 degrees, respectively. Intraobserver and interobserver technical error of measurement was 2.4 degrees and 2.7 degrees, with reliability coefficients of 88% and 84%, respectively. There was no correlation between measurement variability and curve severity. Reformatted CT images may be used for manual measurement of coronal Cobb angles in idiopathic scoliosis with similar variability to manual measurement of plain radiographs.
INFLUENCES OF RESPONSE RATE AND DISTRIBUTION ON THE CALCULATION OF INTEROBSERVER RELIABILITY SCORES

PubMed Central

Rolider, Natalie U.; Iwata, Brian A.; Bullock, Christopher E.

2012-01-01

We examined the effects of several variations in response rate on the calculation of total, interval, exact-agreement, and proportional reliability indices. Trained observers recorded computer-generated data that appeared on a computer screen. In Study 1, target responses occurred at low, moderate, and high rates during separate sessions so that reliability results based on the four calculations could be compared across a range of values. Total reliability was uniformly high, interval reliability was spuriously high for high-rate responding, proportional reliability was somewhat lower for high-rate responding, and exact-agreement reliability was the lowest of the measures, especially for high-rate responding. In Study 2, we examined the separate effects of response rate per se, bursting, and end-of-interval responding. Response rate and bursting had little effect on reliability scores; however, the distribution of some responses at the end of intervals decreased interval reliability somewhat, proportional reliability noticeably, and exact-agreement reliability markedly. PMID:23322930
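The abstract names four reliability indices without defining them. The sketch below uses definitions that are common in the behavior-analytic literature, which is an assumption about what the authors computed; the function names and the sample counts are hypothetical. Each observer's record is a list of response counts per observation interval.

```python
def total_reliability(a, b):
    """Smaller session total divided by larger session total, as a percentage."""
    total_a, total_b = sum(a), sum(b)
    if total_a == total_b:
        return 100.0
    return 100.0 * min(total_a, total_b) / max(total_a, total_b)

def interval_reliability(a, b):
    """Percentage of intervals where both observers agree on occurrence vs. nonoccurrence."""
    agreements = sum((x > 0) == (y > 0) for x, y in zip(a, b))
    return 100.0 * agreements / len(a)

def exact_agreement_reliability(a, b):
    """Percentage of intervals where both observers recorded the same count."""
    agreements = sum(x == y for x, y in zip(a, b))
    return 100.0 * agreements / len(a)

def proportional_reliability(a, b):
    """Mean of per-interval smaller/larger count ratios, as a percentage."""
    ratios = [1.0 if x == y else min(x, y) / max(x, y) for x, y in zip(a, b)]
    return 100.0 * sum(ratios) / len(a)

# Hypothetical counts from two observers over four intervals.
observer_1 = [2, 0, 3, 1]
observer_2 = [1, 0, 3, 2]

scores = {
    "total": total_reliability(observer_1, observer_2),
    "interval": interval_reliability(observer_1, observer_2),
    "exact": exact_agreement_reliability(observer_1, observer_2),
    "proportional": proportional_reliability(observer_1, observer_2),
}
```

With these toy counts the total and interval indices are both 100%, the proportional index is 75%, and the exact-agreement index is 50%, the same ordering of leniency that the abstract describes.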
First Cases of Spotted Fever Group Rickettsiosis in Thailand

DTIC Science & Technology

1994-01-01

Rocky Mountain spotted fever ). R. cono- test.6 and an enzyme-linked immunosorbent as- rii, (boutonneuse fever). R. sibirica (North Asian say tELISA...vascular infection of the brain in 28% of Rocky Mountain spotted fever patients.- All I p,,.s:.,, three Thai tick typhus patients responded to P.tCn1 Da.e...rickettsiae of the spotted fever group by mi- has previously shown a sensitivity ofonly 47% comuoioecne ,,tw 2:16I in diagnosing Rocky Mountain spotted
Writing Skills Course for Newly Commissioned Marine Corps Officers

DTIC Science & Technology

1993-10-01

on the parked government vehicle were the 0 main causes of the accident. (8 and 9) 4. That LCpl Frank Johnson’s injuries were incurred in the line of...on the parked government vehicle were the main causes of the accident. (Findings of Fact14 8 and 9) 4. That LCpI Frank Johnson’s injuries were...sports, such as soccer, touch football, baseball, and karate . 3. Use a comma after an introductory word, Phrase. or adverb clause. Adverb clauses are
DoD Depot-Level Reparable Supply Chain Management: Process Effectiveness and Opportunities for Improvement

DTIC Science & Technology

2014-01-01

Memorandum QBO quantity by owner RAPS Rotables Allocation and Planning System RBOM repair bill of materials RC Recoverability Code RI Rock Island RMC...Service-owned inventory on hand in DLA distribution centers was determined using the DLA Quantity by Owner ( QBO ) file, which records the amount of...on analysis of DLA QBO file data). 4 DoD Depot-Level Reparable Supply Chain Management Budget (OMB) guidance is also very low4 and some argue
The Information Age: An Anthology on Its Impact and Consequences

DTIC Science & Technology

1997-01-01

the United States, but it soon assumed global proportions as information and its collection, management , and distribution became the hallmarks of...on election day, soon forgotten in the enjoyment of power, is over," he argues. There is a simple reason for this, Huber maintains. Just as you can...than the value of all U.S. exports. Thus a lot of commerce that looks domestic to an economist—such as the Stouffer’s frozen dinner you bought last
Values of a Patient and Observer Scar Assessment Scale to Evaluate the Facial Skin Graft Scar.

PubMed

Chae, Jin Kyung; Kim, Jeong Hee; Kim, Eun Jung; Park, Kun

2016-10-01

The patient and observer scar assessment scale (POSAS) recently emerged as a promising method, reflecting both the observer's and the patient's opinions in evaluating scars. This tool was shown to be consistent and reliable in burn scar assessment, but it has not been tested in the setting of skin graft scars in skin cancer patients. To evaluate facial skin graft scars using the POSAS and to compare it with objective scar assessment tools. Twenty-three patients who were diagnosed with facial cutaneous malignancy and received skin grafts after Mohs micrographic surgery were recruited. Observer assessment was performed by three independent raters using the observer component of the POSAS and the Vancouver scar scale (VSS). Patient self-assessment was performed using the patient component of the POSAS. To quantify scar color and scar thickness more objectively, a spectrophotometer and ultrasonography were applied. Inter-observer reliability was substantial with both the VSS and the observer component of the POSAS (average-measure intraclass correlation coefficient, 0.76 and 0.80, respectively). The observer component consistently showed significant correlations with patients' ratings for the parameters of the POSAS (all p-values < 0.05). The correlation between subjective assessment using the POSAS and objective assessment using the spectrophotometer and ultrasonography was low. In facial skin graft scar assessment in skin cancer patients, the POSAS showed acceptable inter-observer reliability. This tool was more comprehensive and had higher correlation with the patient's opinion.
Estimation of the refractive index of rigid contact lenses on the basis of back vertex power measurements.

PubMed

Pearson, Richard

2011-03-01

To assess the possibility of estimating the refractive index of rigid contact lenses on the basis of measurements of their back vertex power (BVP) in air and when immersed in liquid. First, a spreadsheet model was used to quantify the magnitude of errors arising from simulated inaccuracies in the variables required to calculate refractive index. Then, refractive index was calculated from in-air and in-liquid measurements of BVP of 21 lenses that had been made in three negative BVPs from materials with seven different nominal refractive index values. The power measurements were made by two operators on two occasions. Intraobserver reliability showed a mean difference of 0.0033±0.0061 (t = 0.544, P = 0.59), interobserver reliability showed a mean difference of 0.0043±0.0061 (t = 0.707, P = 0.48), and the mean difference between the nominal and calculated refractive index values was -0.0010±0.0111 (t = -0.093, P = 0.93). The spreadsheet prediction that low-powered lenses might be subject to greater errors in the calculated values of refractive index was substantiated by the experimental results. This method shows good intra and interobserver reliabilities and can be used easily in a clinical setting to provide an estimate of the refractive index of rigid contact lenses having a BVP of 3 D or more.
Protoporphyrin-IX fluorescence guided surgical resection in high-grade gliomas: The potential impact of human colour perception.

PubMed

Petterssen, Max; Eljamel, Sarah; Eljamel, Sam

2014-09-01

Protoporphyrin-IX (Pp-IX) fluorescence had been used frequently in recent years to guide microsurgical resection of high-grade gliomas (HGG), particularly following the publication of a randomized controlled trial demonstrating its advantages. However, Pp-IX fluorescence is dependent upon the surgeons' eyes' perception of red fluorescent colour. This study was designed to evaluate human eye fluorescence perception and establish a fluorescence scale. 20 of 108 pre-recorded images from intraoperative fluorescence of HGG were used to construct an 8-panel visual analogue fluorescence scale. The scale was validated by testing 56 participants with normal colour vision and three red-green colour-blind participants. For intra-rater agreement ten participants were tested twice and for inter-observer reliability the whole cohort were tested. The intra- and inter-observer reliability of the scale in normal colour vision participants was excellent. The scale was less reliable in the violet-blue panels of the scale. Colour-blind participants were not able to distinguish between red fluorescence and blue-violet colours. The 8-panel fluorescence scale is valid in differentiating red, pink and blue colours in a fluorescence surgical field among participants with normal colour perception and potentially useful to standardize fluorescence-guided surgery. However, colourblind surgeons should not use fluorescence-guided surgery. Copyright © 2014 Elsevier B.V. All rights reserved.
Osteochondritis dissecans of the humeral capitellum: reliability of four classification systems using radiographs and computed tomography.

PubMed

Claessen, Femke M A P; van den Ende, Kimberly I M; Doornberg, Job N; Guitton, Thierry G; Eygendaal, Denise; van den Bekerom, Michel P J

2015-10-01

The radiographic appearance of osteochondritis dissecans (OCD) of the humeral capitellum varies according to the stage of the lesion. It is important to evaluate the stage of OCD lesion carefully to guide treatment. We compared the interobserver reliability of currently used classification systems for OCD of the humeral capitellum to identify the most reliable classification system. Thirty-two musculoskeletal radiologists and orthopaedic surgeons specialized in elbow surgery from several countries evaluated anteroposterior and lateral radiographs and corresponding computed tomography (CT) scans of 22 patients to classify the stage of OCD of the humeral capitellum according to the classification systems developed by (1) Minami, (2) Berndt and Harty, (3) Ferkel and Sgaglione, and (4) Anderson on a Web-based study platform including a Digital Imaging and Communications in Medicine viewer. Magnetic resonance imaging was not evaluated as part of this study. We measured agreement among observers using the Siegel and Castellan multirater κ. All OCD classification systems, except for Berndt and Harty, which had poor agreement among observers (κ = 0.20), had fair interobserver agreement: κ was 0.27 for the Minami, 0.23 for Anderson, and 0.22 for Ferkel and Sgaglione classifications. The Minami Classification was significantly more reliable than the other classifications (P < .001). The Minami Classification was the most reliable for classifying different stages of OCD of the humeral capitellum. However, it is unclear whether radiographic evidence of OCD of the humeral capitellum, as categorized by the Minami Classification, guides treatment in clinical practice as a result of this fair agreement. Copyright © 2015 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Agreement and reliability of pelvic floor measurements during contraction using three-dimensional pelvic floor ultrasound and virtual reality.

PubMed

Speksnijder, L; Rousian, M; Steegers, E A P; Van Der Spek, P J; Koning, A H J; Steensma, A B

2012-07-01

Virtual reality is a novel method of visualizing ultrasound data with the perception of depth and offers possibilities for measuring non-planar structures. The levator ani hiatus has both convex and concave aspects. The aim of this study was to compare levator ani hiatus volume measurements obtained with conventional three-dimensional (3D) ultrasound and with a virtual reality measurement technique and to establish their reliability and agreement. One hundred symptomatic patients visiting a tertiary pelvic floor clinic with a normal intact levator ani muscle diagnosed on translabial ultrasound were selected. Datasets were analyzed using a rendered volume with a slice thickness of 1.5 cm at the level of minimal hiatal dimensions during contraction. The levator area (in cm²) was measured and multiplied by 1.5 to get the levator ani hiatus volume in conventional 3D ultrasound (in cm³). Levator ani hiatus volume measurements were then measured semi-automatically in virtual reality (in cm³) using a segmentation algorithm. An intra- and interobserver analysis of reliability and agreement was performed in 20 randomly chosen patients. The mean difference between levator ani hiatus volume measurements performed using conventional 3D ultrasound and virtual reality was 0.10 (95% CI, -0.15 to 0.35) cm³. The intraclass correlation coefficient (ICC) comparing conventional 3D ultrasound with virtual reality measurements was > 0.96. Intra- and interobserver ICCs for conventional 3D ultrasound measurements were > 0.94 and for virtual reality measurements were > 0.97, indicating good reliability for both. Levator ani hiatus volume measurements performed using virtual reality were reliable and the results were similar to those obtained with conventional 3D ultrasonography. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.
Intra-and inter-observer reliability of nailfold videocapillaroscopy - A possible outcome measure for systemic sclerosis-related microangiopathy.

PubMed

Dinsdale, Graham; Moore, Tonia; O'Leary, Neil; Tresadern, Philip; Berks, Michael; Roberts, Christopher; Manning, Joanne; Allen, John; Anderson, Marina; Cutolo, Maurizio; Hesselstrand, Roger; Howell, Kevin; Pizzorni, Carmen; Smith, Vanessa; Sulli, Alberto; Wildt, Marie; Taylor, Christopher; Murray, Andrea; Herrick, Ariane L

2017-07-01

Our aim was to assess the reliability of nailfold capillary assessment in terms of image evaluability, image severity grade ('normal', 'early', 'active', 'late'), capillary density, capillary (apex) width, and presence of giant capillaries, and also to gain further insight into differences in these parameters between patients with systemic sclerosis (SSc), patients with primary Raynaud's phenomenon (PRP) and healthy control subjects. Videocapillaroscopy images (magnification 300×) were acquired from all 10 digits from 173 participants: 101 patients with SSc, 22 with PRP and 50 healthy controls. Ten capillaroscopy experts from 7 European centres evaluated the images. Custom image mark-up software allowed extraction of the following outcome measures: overall grade ('normal', 'early', 'active', 'late', 'non-specific', or 'ungradeable'), capillary density (vessels/mm), mean vessel apical width, and presence of giant capillaries. Observers analysed a median of 129 images each. Evaluability (i.e. the availability of measures) varied across outcome measures (e.g. 73.0% for density and 46.2% for overall grade in patients with SSc). Intra-observer reliability for evaluability was consistently higher than inter- (e.g. for density, intra-class correlation coefficient [ICC] was 0.71 within and 0.14 between observers). Conditional on evaluability, both intra- and inter-observer reliability were high for grade (ICC 0.93 and 0.78 respectively), density (0.91 and 0.64) and width (0.91 and 0.85). Evaluability is one of the major challenges in assessing nailfold capillaries. However, when images are evaluable, the high intra- and inter-reliabilities suggest that overall image grade, capillary density and apex width have potential as outcome measures in longitudinal studies. Copyright © 2017 Elsevier Inc. All rights reserved.
Measuring the food and built environments in urban centres: reliability and validity of the EURO-PREVOB Community Questionnaire.

PubMed

Pomerleau, J; Knai, C; Foster, C; Rutter, H; Darmon, N; Derflerova Brazdova, Z; Hadziomeragic, A F; Pekcan, G; Pudule, I; Robertson, A; Brunner, E; Suhrcke, M; Gabrijelcic Blenkus, M; Lhotska, L; Maiani, G; Mistura, L; Lobstein, T; Martin, B W; Elinder, L S; Logstrup, S; Racioppi, F; McKee, M

2013-03-01

The authors designed an instrument to measure objectively aspects of the built and food environments in urban areas, the EURO-PREVOB Community Questionnaire, within the EU-funded project 'Tackling the social and economic determinants of nutrition and physical activity for the prevention of obesity across Europe' (EURO-PREVOB). This paper describes its development, reliability, validity, feasibility and relevance to public health and obesity research. The Community Questionnaire is designed to measure key aspects of the food and built environments in urban areas of varying levels of affluence or deprivation, within different countries. The questionnaire assesses (1) the food environment and (2) the built environment. Pilot tests of the EURO-PREVOB Community Questionnaire were conducted in five to 10 purposively sampled urban areas of different socio-economic status in each of Ankara, Brno, Marseille, Riga, and Sarajevo. Inter-rater reliability was compared between two pairs of fieldworkers in each city centre using three methods: inter-observer agreement (IOA), kappa statistics, and intraclass correlation coefficients (ICCs). Data were collected successfully in all five cities. Overall reliability of the EURO-PREVOB Community Questionnaire was excellent (inter-observer agreement (IOA) > 0.87; intraclass correlation coefficients (ICCs) > 0.91; and kappa statistics > 0.7). However, assessment of certain aspects of the quality of the built environment yielded slightly lower IOA coefficients than the quantitative aspects. The EURO-PREVOB Community Questionnaire was found to be a reliable and practical observational tool for measuring differences in community-level data on environmental factors that can impact on dietary intake and physical activity. The next step is to evaluate its predictive power by collecting behavioural and anthropometric data relevant to obesity and its determinants. Copyright © 2013 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.

Reliability of widefield nailfold capillaroscopy and video capillaroscopy in the assessment of patients with Raynaud’s phenomenon.

PubMed

Sekiyama, Juliana Y; Camargo, Cintia Z; Eduardo, Luís; Andrade, C; Kayser, Cristiane

2013-11-01

To compare the diagnostic performance and reliability of different parameters evaluated by widefield nailfold capillaroscopy (NFC) with those obtained by video capillaroscopy in patients with Raynaud’s phenomenon (RP). Two hundred fifty-two individuals were assessed, including 101 systemic sclerosis (SSc; scleroderma) patients, 61 patients with undifferentiated connective tissue disease, 37 patients with primary RP, and 53 controls. Widefield NFC was performed using a stereomicroscope under 10–25 x magnification and direct measurement of all parameters. Video capillaroscopy was performed under 200 x magnification, with the acquisition of 32 images per individual (4 fields per finger in 8 fingers). The following parameters were analyzed in 8 fingers of the hands (excluding thumbs) by both methods: number of capillaries/mm, number of enlarged and giant capillaries, microhemorrhages, and avascular score. Intra- and interobserver reliability was evaluated by performing both examinations in 20 individuals on 2 different days and by 2 long-term experienced observers. There was a significant correlation (P < 0.000) between widefield NFC and video capillaroscopy in the comparison of all parameters. Kappa values and intraclass correlation coefficient analysis showed excellent intra- and interobserver reproducibility for all parameters evaluated by widefield NFC and video capillaroscopy. Bland-Altman analysis showed high agreement of all parameters evaluated in both methods. According to receiver operating characteristic curve analysis, both methods showed a similar performance in discriminating SSc patients from controls. Widefield NFC and video capillaroscopy are reliable and accurate methods and can be used equally for assessing peripheral microangiopathy in RP and SSc patients. Nonetheless, the high reliability obtained may not be similar for less experienced examiners.
Hand-held dynamometry in patients with haematological malignancies: Measurement error in the clinical assessment of knee extension strength

PubMed Central

Knols, Ruud H; Aufdemkampe, Geert; de Bruin, Eling D; Uebelhart, Daniel; Aaronson, Neil K

2009-01-01

Background: Hand-held dynamometry is a portable and inexpensive method to quantify muscle strength. To determine if muscle strength has changed, an examiner must know what part of the difference between a patient's pre-treatment and post-treatment measurements is attributable to real change, and what part is due to measurement error. This study aimed to determine the relative and absolute reliability of intra- and inter-observer strength measurements with a hand-held dynamometer (HHD). Methods: Two observers performed maximum voluntary peak torque measurements (MVPT) for isometric knee extension in 24 patients with haematological malignancies. For each patient, the measurements were carried out on the same day. The main outcome measures were the intraclass correlation coefficient (ICC ± 95%CI), the standard error of measurement (SEM), the smallest detectable difference (SDD), the relative values as % of the grand mean of the SEM and SDD, and the limits of agreement for the intra- and inter-observer '3 repetition average' and the 'highest value of 3 MVPT' knee extension strength measures. Results: The intra-observer ICCs were 0.94 for the average of 3 MVPT (95%CI: 0.86–0.97) and 0.86 for the highest value of 3 MVPT (95%CI: 0.71–0.94). The ICCs for the inter-observer measurements were 0.89 for the average of 3 MVPT (95%CI: 0.75–0.95) and 0.77 for the highest value of 3 MVPT (95%CI: 0.54–0.90). The SEMs for the intra-observer measurements were 6.22 Nm (3.98% of the grand mean (GM)) and 9.83 Nm (5.88% of GM). For the inter-observer measurements, the SEMs were 9.65 Nm (6.65% of GM) and 11.41 Nm (6.73% of GM). The SDDs for the generated parameters varied from 17.23 Nm (11.04% of GM) to 27.26 Nm (17.09% of GM) for intra-observer measurements, and 26.76 Nm (16.77% of GM) to 31.62 Nm (18.66% of GM) for inter-observer measurements, with similar results for the limits of agreement. Conclusion: The results indicate that there is acceptable relative reliability for evaluating knee strength with a HHD, while the measurement error observed was modest. The HHD may be useful in detecting changes in knee extension strength at the individual patient level. PMID:19272149
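The relation between the SEM and SDD values reported above can be checked with a short sketch. The abstract does not state which formula was used; the code assumes the commonly used definition SDD = 1.96 × √2 × SEM, and the function name is hypothetical. Under that assumption, the reported SEM/SDD pairs agree to within about 0.02 Nm, which is consistent with rounding.

```python
import math


def smallest_detectable_difference(sem: float, z: float = 1.96) -> float:
    """Assumed definition (not stated in the abstract): SDD = z * sqrt(2) * SEM.

    sqrt(2) accounts for the error of two measurements (before and after);
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(2.0) * sem


# (SEM, SDD) pairs in Nm as reported in the abstract.
reported = {
    "intra-observer, lower pair": (6.22, 17.23),
    "intra-observer, upper pair": (9.83, 27.26),
    "inter-observer, lower pair": (9.65, 26.76),
    "inter-observer, upper pair": (11.41, 31.62),
}

for label, (sem, sdd_reported) in reported.items():
    sdd = smallest_detectable_difference(sem)
    print(f"{label}: computed {sdd:.2f} Nm, reported {sdd_reported:.2f} Nm")
```

A change in an individual patient smaller than the SDD cannot be distinguished from measurement error at the chosen confidence level.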
Intra- and inter-observer reliability of quantitative analysis of the infra-patellar fat pad and comparison between fat- and non-fat-suppressed imaging--Data from the osteoarthritis initiative.

PubMed

Steidle-Kloc, E; Wirth, W; Ruhdorfer, A; Dannhauer, T; Eckstein, F

2016-03-01

The infra-patellar fat pad (IPFP), as intra-articular adipose tissue, represents a potential source of pro-inflammatory cytokines and its size has been suggested to be associated with osteoarthritis (OA) of the knee. This study examines inter- and intra-observer reliability of fat-suppressed (fs) and non-fat-suppressed (nfs) MR imaging for determination of IPFP morphological measurements as novel biomarkers. The IPFP of nine right knees of healthy Osteoarthritis Initiative participants was segmented by five readers, using fs and nfs baseline sagittal MRIs. The intra-observer reliability was determined from baseline and 1-year follow-up images. All segmentations were quality controlled (QC) by an expert reader. Reliability was expressed as root mean square coefficient of variation (RMS CV%). After QC, the inter-observer reliability for fs (nfs) imaging was 2.0% (1.1%) for IPFP volume, 2.1%/2.5% (1.6%/1.8%) for anterior/posterior surface areas, 1.8% (1.8%) for depth, and 2.1% (2.4%) for maximum sagittal area. The intra-observer reliability was 3.1% (5.0%) for volume, 2.3%/2.8% (2.5%/2.9%) for anterior/posterior surfaces, 1.9% (3.5%) for depth, and 3.3% (4.5%) for maximum sagittal area. IPFP volume from nfs images was systematically greater (+7.3%) than from fs images, but highly correlated (r=0.98). The results suggest that quantitative measurements of IPFP morphology can be performed with satisfactory reliability when expert QC is implemented. The IPFP is more clearly depicted in nfs images, and there is a small systematic off-set versus analysis from fs images. However, the high linear relationship between fs and nfs imaging suggests that fs images can be used to analyze IPFP morphology, when nfs images are not available. Copyright © 2015 Elsevier GmbH. All rights reserved.
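The root mean square coefficient of variation used above can be sketched as follows. This is a minimal illustration under stated assumptions: each knee's CV is taken as the sample standard deviation across readers divided by the mean, the function name is hypothetical, and the numbers are invented for the example rather than taken from the study.

```python
import math
from statistics import mean, stdev


def rms_cv_percent(measurements):
    """RMS CV% over subjects.

    measurements: one inner list per subject (e.g. per knee), holding the
    repeated values for that subject (e.g. one value per reader).
    """
    squared_cvs = []
    for values in measurements:
        cv_percent = stdev(values) / mean(values) * 100.0  # per-subject CV in %
        squared_cvs.append(cv_percent ** 2)
    # Root of the mean squared CV, so large per-subject CVs weigh more heavily.
    return math.sqrt(mean(squared_cvs))


# Hypothetical volumes (arbitrary units) for two knees, two readers each.
example = [[10.0, 10.0], [9.0, 11.0]]
print(f"RMS CV% = {rms_cv_percent(example):.1f}")
```

Because the CVs are squared before averaging, a single poorly reproduced subject raises the RMS CV% more than it would raise a plain mean CV.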
Development and assessment of a digital X-ray software tool to determine vertebral rotation in adolescent idiopathic scoliosis.

PubMed

Eijgenraam, Susanne M; Boselie, Toon F M; Sieben, Judith M; Bastiaenen, Caroline H G; Willems, Paul C; Arts, Jacobus J; Lataster, Arno

2017-02-01

The amount of vertebral rotation in the axial plane is of key importance in the prognosis and treatment of adolescent idiopathic scoliosis (AIS). Current methods to determine vertebral rotation are either designed for use in analogue plain radiographs and not useful in digital images, or lack measurement precision and are therefore less suitable for the follow-up of rotation in AIS patients. This study aimed to develop a digital X-ray software tool with high measurement precision to determine vertebral rotation in AIS, and to assess its (concurrent) validity and reliability. In this study a combination of basic science and reliability methodology applied in both laboratory and clinical settings was used. Software was developed using the algorithm of the Perdriolle torsion meter for analogue AP plain radiographs of the spine. Software was then assessed for (1) concurrent validity and (2) intra- and interobserver reliability. Plain radiographs of both human cadaver vertebrae and outpatient AIS patients were used. Concurrent validity was measured by two independent observers, both experienced in the assessment of plain radiographs. Reliability measurements were performed by three independent spine surgeons. Pearson correlation of the software compared with the analogue Perdriolle torsion meter for mid-thoracic vertebrae was 0.98, for low-thoracic vertebrae 0.97 and for lumbar vertebrae 0.97. Measurement exactness of the software was within 5° in 62% of cases and within 10° in 97% of cases. Intraclass correlation coefficient (ICC) for inter-observer reliability was 0.92 (0.91-0.95); ICC for intra-observer reliability was 0.96 (0.94-0.97). We developed a digital X-ray software tool to determine vertebral rotation in AIS with a substantial concurrent validity and reliability, which may be useful for the follow-up of vertebral rotation in AIS patients. Copyright © 2015 Elsevier Inc. All rights reserved.
Visual judgements of steadiness in one-legged stance: reliability and validity.

PubMed

Haupstein, T; Goldie, P

2000-01-01

There is a paucity of information about the validity and reliability of clinicians' visual judgements of steadiness in one-legged stance. Such judgements are used frequently in clinical practice to support decisions about treatment in the fields of neurology, sports medicine, paediatrics and orthopaedics. The aim of the present study was to address the validity and reliability of visual judgements of steadiness in one-legged stance in a group of physiotherapists. A videotape of 20 five-second performances was shown to 14 physiotherapists with median clinical experience of 6.75 years. Validity of visual judgement was established by correlating scores obtained from an 11-point rating scale with criterion scores obtained from a force platform. In addition, partial correlations were used to control for the potential influence of body weight on the relationship between the visual judgements and criterion scores. Inter-observer reliability was quantified between the physiotherapists; intra-observer reliability was quantified between two tests four weeks apart. Mean criterion-related validity was high, regardless of whether body weight was controlled for statistically (Pearson's r = 0.84, 0.83, respectively). The standard error of estimating the criterion score was 3.3 newtons. Inter-observer reliability was high (ICC (2,1) = 0.81 at Test 1 and 0.82 at Test 2). Intra-observer reliability was high (on average ICC (2,1) = 0.88; Pearson's r = 0.90). The standard error of measurement for the 11-point scale was one unit. The finding of higher accuracy of making visual judgements than previously reported may be due to several aspects of design: use of a criterion score derived from the variability of the force signal which is more discriminating than variability of centre of pressure; use of a discriminating visual rating scale; specificity and clear definition of the phenomenon to be rated.
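For readers unfamiliar with the ICC (2,1) notation above, the following sketch computes the two-way random-effects, absolute-agreement, single-rater coefficient from a subjects-by-raters matrix using the usual ANOVA mean squares. The function name is hypothetical and the ratings are illustrative numbers, not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per rated subject, one column per rater.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings: 6 subjects (rows) scored by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(f"ICC(2,1) = {icc_2_1(example):.2f}")
```

In this example the raters rank the subjects similarly but differ systematically in level, which lowers an absolute-agreement coefficient such as ICC (2,1).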
Intra- and inter-observer reliability of quantitative analysis of the infra-patellar fat pad and comparison between fat- and non-fat-suppressed imaging—Data from the osteoarthritis initiative

PubMed Central

Steidle-Kloc, E.; Wirth, W.; Ruhdorfer, A.; Dannhauer, T.; Eckstein, F.

2015-01-01

The infra-patellar fat pad (IPFP), as intra-articular adipose tissue, represents a potential source of pro-inflammatory cytokines and its size has been suggested to be associated with osteoarthritis (OA) of the knee. This study examines inter- and intra-observer reliability of fat-suppressed (fs) and non-fat-suppressed (nfs) MR imaging for determination of IPFP morphological measurements as novel biomarkers. The IPFP of nine right knees of healthy Osteoarthritis Initiative participants was segmented by five readers, using fs and nfs baseline sagittal MRIs. The intra-observer reliability was determined from baseline and 1-year follow-up images. All segmentations were quality controlled (QC) by an expert reader. Reliability was expressed as root mean square coefficient of variation (RMS CV%). After QC, the inter-observer reliability for fs (nfs) imaging was 2.0% (1.1%) for IPFP volume, 2.1%/2.5% (1.6%/1.8%) for anterior/posterior surface areas, 1.8% (1.8%) for depth, and 2.1% (2.4%) for maximum sagittal area. The intra-observer reliability was 3.1% (5.0%) for volume, 2.3%/2.8% (2.5%/2.9%) for anterior/posterior surfaces, 1.9% (3.5%) for depth, and 3.3% (4.5%) for maximum sagittal area. IPFP volume from nfs images was systematically greater (+7.3%) than from fs images, but highly correlated (r = 0.98). The results suggest that quantitative measurements of IPFP morphology can be performed with satisfactory reliability when expert QC is implemented. The IPFP is more clearly depicted in nfs images, and there is a small systematic off-set versus analysis from fs images. However, the high linear relationship between fs and nfs imaging suggests that fs images can be used to analyze IPFP morphology, when nfs images are not available. PMID:26569532
3-dimensional computed tomographic analysis of the pharynx in adult patients with unrepaired isolated cleft palate.

PubMed

Xu, Yi; Zhao, Shufan; Shi, Jiayu; Wang, Yan; Shi, Bing; Zheng, Qian; Lo, Lun-Jou

2013-08-01

This study investigated 3D differences of the pharynx in adult patients with unrepaired isolated cleft palate (ICP) versus normal adults using cone-beam computed tomography (CBCT). CBCT data of 32 adult patients with nonsyndromic unrepaired ICP and 30 normal controls were acquired. Image processing and analyses were performed using Mimics (Materialise NV, Leuven, Belgium). Linear, planar, and volumetric measurements and comparisons were performed between patients with ICP and controls. Interobserver and intraobserver reliabilities of 3D pharyngeal analysis were determined by the Pearson correlation coefficient. Statistical analyses comparing patients with ICP to normal adults were performed using independent-samples t test, with the significance threshold set at P = .05. Interobserver and intraobserver reliabilities were high. Pearson correlation coefficients ranged from 0.992 to 0.999 for interobserver measurements and from 0.994 to 0.999 for intraobserver measurements. Anterior height (P = .000), total depth (P = .003), and floor length (P = .034) of the bony nasopharynx; posteroanterior diameter of the pharyngeal airway at the palatal plane (P = .000); cross-sectional area of the pharyngeal airway at the palatal plane (P = .000); total volume (P = .031); volume above the palatal plane (P = .024); and the volume between the palatal plane and the plane of the most anterior point on the inferior margin of the outline of the body of the second cervical vertebra (P = .022) were larger in patients with ICP. This imaging study showed an enlarged nasopharynx in the sagittal plane and increased nasopharyngeal airway volume at the palatal plane in patients with ICP. Crown Copyright © 2013. Published by Elsevier Inc. All rights reserved.
Consistency of corneal sublayer thickness measurements using Fourier-domain optical coherence tomography after phacoemulsification.

PubMed

López-Miguel, Alberto; Calabuig-Goena, María; Marqués-Fernández, Victoria; Fernández, Itziar; Alió, Jorge L; Maldonado, Miguel J

2016-11-04

To assess the reliability of corneal epithelial thickness (CET), nonepithelial central corneal thickness (NECCT), and central corneal thickness (CCT) measurements using Cirrus high-definition optical coherence tomography (HD-OCT) in patients who did and did not undergo cataract surgery. Forty patients who underwent uneventful phacoemulsification and 40 healthy participants were recruited to evaluate the intraobserver repeatability and interobserver reproducibility of CET, NECCT, and CCT measurements using Cirrus HD-OCT. To analyze repeatability, one examiner obtained 5 consecutive scans in each participant; for interobserver reproducibility, another examiner randomly obtained another scan. Within-subject standard deviation, coefficient of variation (CV), limits of agreement, and intraclass correlation coefficient (ICC) data were obtained. For intraobserver repeatability, the intrasession CV (CVw) and ICC values of the CET in the operated and nonoperated groups were 3.7% and 0.80 and 3.8% and 0.73, respectively; for NECCT, 0.7% and 0.98 and 0.8% and 0.97; and for CCT, 0.6% and 0.99 and 0.7% and 0.98. For interobserver reproducibility, the CVw and ICC values for the CET in the operated and nonoperated groups were 2.6% and 0.82 and 2.3% and 0.62, respectively; for NECCT, 0.7% and 0.98 and 0.5% and 0.98; and for CCT, 0.5% and 0.99 and 0.4% and 0.99. The corneal sublayer thickness can be measured reliably using Cirrus HD-OCT in patients who underwent cataract surgery and elderly participants; however, the CET consistency is poorer than the NECCT. Corneal epithelial thickness modifications exceeding 4% reflect true thickness changes instead of random error variations using HD-OCT.
Reliability of Pentacam HR Thickness Maps of the Entire Cornea in Normal, Post-Laser In Situ Keratomileusis, and Keratoconus Eyes.

PubMed

Xu, Zhe; Peng, Mei; Jiang, Jun; Yang, Chun; Zhu, Weigen; Lu, Fan; Shen, Meixiao

2016-02-01

To measure the repeatability and reproducibility of Pentacam HR system thickness maps for the entire cornea in normal, post-laser in situ keratomileusis (post-LASIK), and keratoconus (KC) eyes. Reliability study. Sixty normal subjects (60 eyes), 30 post-LASIK subjects (60 eyes), and 14 KC patients (27 eyes) were imaged with the Pentacam HR system by 2 well-trained operators. For pachymetry the cornea was divided into 4 zones: a central zone (2-mm diameter) and concentric pericentral zone (2-5 mm), transitional zone (5-7 mm), and peripheral zone (7-10 mm). The 3 concentric zones were subdivided into 8 sectors. Intraobserver repeatability and interobserver reproducibility of entire corneal thickness maps were tested by the repeatability and reproducibility coefficients, intraclass correlation coefficients, coefficient of variation, and 95% limits of agreement. From central to peripheral zones, the precision of corneal thickness measurements became gradually smaller. Central zone repeatability and reproducibility were the best in the normal, post-LASIK, and KC groups. The peripheral superior sectors showed poorer repeatability and reproducibility for all subjects. The intraobserver repeatability and interobserver reproducibility for all zones were ≤19.3 μm, ≤22.1 μm, and ≤20.7 μm, in the normal, post-LASIK, and KC groups, respectively. The intraobserver and interobserver coefficients of variation for all zones were ≤1.3%, ≤1.6%, and ≤1.6% for all 3 groups. Pentacam HR system pachymetry of the entire cornea provided good precision in normal, post-LASIK, and KC corneas. Thickness measurements in the peripheral cornea should be interpreted with caution in abnormal corneas after surgery or with diseases. Copyright © 2016 Elsevier Inc. All rights reserved.
Can emergency physicians accurately and reliably assess acute vertigo in the emergency department?

PubMed

Vanni, Simone; Nazerian, Peiman; Casati, Carlotta; Moroni, Federico; Risso, Michele; Ottaviani, Maddalena; Pecci, Rudi; Pepe, Giuseppe; Vannucchi, Paolo; Grifoni, Stefano

2015-04-01

To validate a clinical diagnostic tool, used by emergency physicians (EPs), to diagnose the central cause of patients presenting with vertigo, and to determine interrater reliability of this tool. A convenience sample of adult patients presenting to a single academic ED with isolated vertigo (i.e. vertigo without other neurological deficits) was prospectively evaluated with STANDING (SponTAneous Nystagmus, Direction, head Impulse test, standiNG) by five trained EPs. The first step focused on the presence of spontaneous nystagmus, the second on the direction of nystagmus, the third on head impulse test and the fourth on gait. The local standard practice, senior audiologist evaluation corroborated by neuroimaging when deemed appropriate, was considered the reference standard. Sensitivity and specificity of STANDING were calculated. On the first 30 patients, inter-observer agreement among EPs was also assessed. Five EPs with limited experience in nystagmus assessment volunteered to participate in the present study enrolling 98 patients. Their average evaluation time was 9.9 ± 2.8 min (range 6-17). Central acute vertigo was suspected in 16 (16.3%) patients. There were 13 true positives, three false positives, 81 true negatives and one false negative, with a high sensitivity (92.9%, 95% CI 70-100%) and specificity (96.4%, 95% CI 93-38%) for central acute vertigo according to senior audiologist evaluation. The Cohen's kappas of the first, second, third and fourth steps of the STANDING were 0.86, 0.93, 0.73 and 0.78, respectively. The whole test showed a good inter-observer agreement (k = 0.76, 95% CI 0.45-1). In the hands of EPs, STANDING showed a good inter-observer agreement and accuracy validated against the local standard of care. © 2015 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.
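The sensitivity and specificity point estimates quoted above follow directly from the reported 2×2 counts. A minimal sketch (function names are hypothetical; confidence intervals are not computed here):

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of reference-positive cases that the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of reference-negative cases that the test flags as negative."""
    return tn / (tn + fp)


# Counts reported in the abstract (98 patients in total).
tp, fp, tn, fn = 13, 3, 81, 1

print(f"sensitivity = {100 * sensitivity(tp, fn):.1f}%")    # 13 / 14
print(f"specificity = {100 * specificity(tn, fp):.1f}%")    # 81 / 84
print(f"test-positive share = {100 * (tp + fp) / (tp + fp + tn + fn):.1f}%")  # 16 / 98
```

The 16 test-positive patients (13 true plus 3 false positives) match the 16.3% in whom central acute vertigo was suspected.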
Test Characteristics of Acridine Orange, Gram, and May-Grünwald-Giemsa Stains for Enumeration of Intracellular Organisms in Bronchoalveolar Lavage Fluid

PubMed Central

De Brauwer, Els; Jacobs, Jan; Nieman, Fred; Bruggeman, Cathrien; Drent, Marjolein

1999-01-01

For enumeration of intracellular organisms (ICO) in bronchoalveolar lavage fluid samples, the May-Grünwald-Giemsa (MGG) stain displayed higher interobserver agreement than the acridine orange and Gram stains. The MGG stain offered a reliable enumeration of ICO when 200 cells were counted by one observer. PMID:9889233
Reproducibility of abdominal fat assessment by ultrasound and computed tomography

PubMed Central

Mauad, Fernando Marum; Chagas-Neto, Francisco Abaeté; Benedeti, Augusto César Garcia Saab; Nogueira-Barbosa, Marcello Henrique; Muglia, Valdair Francisco; Carneiro, Antonio Adilton Oliveira; Muller, Enrico Mattana; Elias Junior, Jorge

2017-01-01

Objective: To test the accuracy and reproducibility of ultrasound and computed tomography (CT) for the quantification of abdominal fat in correlation with the anthropometric, clinical, and biochemical assessments. Materials and Methods: Using ultrasound and CT, we determined the thickness of subcutaneous and intra-abdominal fat in 101 subjects, of whom 39 (38.6%) were men and 62 (61.4%) were women, with a mean age of 66.3 years (60-80 years). The ultrasound data were correlated with the anthropometric, clinical, and biochemical parameters, as well as with the areas measured by abdominal CT. Results: Intra-abdominal thickness was the variable for which the correlation with the areas of abdominal fat was strongest (i.e., the correlation coefficient was highest). We also tested the reproducibility of ultrasound and CT for the assessment of abdominal fat and found that CT measurements of abdominal fat showed greater reproducibility, having higher intraobserver and interobserver reliability than had the ultrasound measurements. There was a significant correlation between ultrasound and CT, with a correlation coefficient of 0.71. Conclusion: In the assessment of abdominal fat, the intraobserver and interobserver reliability were greater for CT than for ultrasound, although both methods showed high accuracy and good reproducibility. PMID:28670024
Reproducibility of abdominal fat assessment by ultrasound and computed tomography.

PubMed

Mauad, Fernando Marum; Chagas-Neto, Francisco Abaeté; Benedeti, Augusto César Garcia Saab; Nogueira-Barbosa, Marcello Henrique; Muglia, Valdair Francisco; Carneiro, Antonio Adilton Oliveira; Muller, Enrico Mattana; Elias Junior, Jorge

2017-01-01

To test the accuracy and reproducibility of ultrasound and computed tomography (CT) for the quantification of abdominal fat in correlation with the anthropometric, clinical, and biochemical assessments. Using ultrasound and CT, we determined the thickness of subcutaneous and intra-abdominal fat in 101 subjects, of whom 39 (38.6%) were men and 62 (61.4%) were women, with a mean age of 66.3 years (60-80 years). The ultrasound data were correlated with the anthropometric, clinical, and biochemical parameters, as well as with the areas measured by abdominal CT. Intra-abdominal thickness was the variable for which the correlation with the areas of abdominal fat was strongest (i.e., the correlation coefficient was highest). We also tested the reproducibility of ultrasound and CT for the assessment of abdominal fat and found that CT measurements of abdominal fat showed greater reproducibility, having higher intraobserver and interobserver reliability than had the ultrasound measurements. There was a significant correlation between ultrasound and CT, with a correlation coefficient of 0.71. In the assessment of abdominal fat, the intraobserver and interobserver reliability were greater for CT than for ultrasound, although both methods showed high accuracy and good reproducibility.
Noninvasive measurement of burn wound depth applying infrared thermal imaging (Conference Presentation)

NASA Astrophysics Data System (ADS)

Jaspers, Mariëlle E.; Maltha, Ilse M.; Klaessens, John H.; Vet, Henrica C.; Verdaasdonk, Rudolf M.; Zuijlen, Paul P.

2016-02-01

In burn wounds, early discrimination between the different depths plays an important role in the treatment strategy. The remaining vasculature in the wound determines its healing potential. Non-invasive measurement tools that can identify the vascularization are therefore considered to be of high diagnostic importance. Thermography is a non-invasive technique that can accurately measure the temperature distribution over a large skin or tissue area; the temperature is a measure of the perfusion of that area. The aim of this study was to investigate the clinimetric properties (i.e. reliability and validity) of thermography for measuring burn wound depth. In a cross-sectional study with 50 burn wounds of 35 patients, the inter-observer reliability and the validity between thermography and Laser Doppler Imaging were studied. With ROC curve analyses, the ΔT cut-off points for different burn wound depths were determined. The inter-observer reliability, expressed by an intra-class correlation coefficient of 0.99, was found to be excellent. In terms of validity, a ΔT cut-off point of 0.96°C (sensitivity 71%; specificity 79%) differentiates between a superficial partial-thickness and deep partial-thickness burn. A ΔT cut-off point of -0.80°C (sensitivity 70%; specificity 74%) could differentiate between a deep partial-thickness and a full-thickness burn wound. This study demonstrates that thermography is a reliable method in the assessment of burn wound depths. In addition, thermography was reasonably able to discriminate among different burn wound depths, indicating its potential use as a diagnostic tool in clinical burn practice.
Evaluation of a clinical dehydration scale in children requiring intravenous rehydration.

PubMed

Kinlin, Laura M; Freedman, Stephen B

2012-05-01

To evaluate the reliability and validity of a previously derived clinical dehydration scale (CDS) in a cohort of children with gastroenteritis and evidence of dehydration. Participants were 226 children older than 3 months who presented to a tertiary care emergency department and required intravenous rehydration. Reliability was assessed at treatment initiation, by comparing the scores assigned independently by a trained research nurse and a physician. Validity was assessed by using parameters reflective of disease severity: weight gain, baseline laboratory results, willingness of the physician to discharge the patient, hospitalization, and length of stay. Interobserver reliability was moderate, with a weighted κ of 0.52 (95% confidence interval [CI] 0.41, 0.63). There was no correlation between CDS score and percent weight gain, a proxy measure of fluid deficit (Spearman correlation coefficient = -0.03; 95% CI -0.18, 0.12). There were, however, modest and statistically significant correlations between CDS score and several other parameters, including serum bicarbonate (Pearson correlation coefficient = -0.35; 95% CI -0.46, -0.22) and length of stay (Pearson correlation coefficient = 0.24; 95% CI 0.11, 0.36). The scale's discriminative ability was assessed for the outcome of hospitalization, yielding an area under the receiver operating characteristic curve of 0.65 (95% CI 0.57, 0.73). In children administered intravenous rehydration, the CDS was characterized by moderate interobserver reliability and weak associations with objective measures of disease severity. These data do not support its use as a tool to dictate the need for intravenous rehydration or to predict clinical course.
Three-dimensional Magnetic Resonance Imaging of the Anterolateral Ligament of the Knee: An Evaluation of Intact and Anterior Cruciate Ligament-Deficient Knees From the Scientific Anterior Cruciate Ligament Network International (SANTI) Study Group.

PubMed

Muramatsu, Koichi; Saithna, Adnan; Watanabe, Hiroki; Sasaki, Kana; Yokosawa, Kenta; Hachiya, Yudo; Banno, Tatsuo; Helito, Camilo Partezani; Sonnery-Cottet, Bertrand

2018-05-02

To determine the visualization rate of the anterolateral ligament (ALL) in uninjured and anterior cruciate ligament (ACL)-deficient knees using 3-dimensional (3D) magnetic resonance imaging (MRI) and to characterize the spectrum of ALL injury observed in ACL-deficient knees, as well as determine the interobserver and intraobserver reliability of a 3D MRI classification of ALL injury. A total of 100 knees (60 ACL deficient and 40 uninjured) underwent 3D MRI. The ALL was evaluated by 2 blinded orthopaedic surgeons. The ALL was classified as follows: type A, continuous, clearly defined low-signal band; type B, warping, thinning, or iso-signal changes; and type C, without clear continuity. The comparison between imaging performed early after ACL injury (<1 month) and delayed imaging (>1 month) was evaluated, as was intraobserver and interobserver reliability. Complete visualization of the ALL was achieved in all uninjured knees. In the ACL-deficient group, 24 knees underwent early imaging, with 87.5% showing evidence of ALL injury (3 normal, or type A, knees [12.5%], 18 type B [75.0%], and 3 type C [12.5%]). The remaining 36 knees underwent delayed imaging, with 55.6% showing evidence of injury (16 type A [44.4%], 18 type B [50.0%], and 2 type C [5.6%]). The difference in the rate of injury between the 2 groups was significant (P = .03). Multivariate analysis showed that the delay from ACL injury to MRI was the only factor (negatively) associated with the rate of injury to the ALL. Interobserver reliability and intraobserver reliability of the classification of ALL type were good (κ = 0.86 and κ = 0.93, respectively). Three-dimensional MRI allows full visualization of the ALL in all normal knees. The rate of injury to the ALL in acutely ACL-injured knees identified on 3D MRI is higher than previous reports using standard MRI techniques. This rate is significantly higher than the rate of injury to the ALL identified on delayed imaging of ACL-injured knees. Level IV, diagnostic, case-control study. Copyright © 2018 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
Computed Tomography Assessment of Hepatic Metastases of Breast Cancer with Revised Response Evaluation Criteria in Solid Tumors (RECIST) Criteria (Version 1.1): Inter-Observer Agreement.

PubMed

Ghobrial, Fady Emil Ibrahim; Eldin, Manal Salah; Razek, Ahmed Abdel Khalek Abdel; Atwan, Nadia Ibrahim; Shamaa, Sameh Sayed Ahmed

2017-01-01

To assess inter-observer agreement of revised RECIST criteria (version 1.1) for computed tomography assessment of hepatic metastases of breast cancer. A prospective study was conducted in 28 female patients with breast cancer and with at least one measurable metastatic lesion in the liver that was treated with 3 cycles of anthracycline-based chemotherapy. All patients underwent computed tomography of the abdomen with 64-row multi-detector CT at baseline and after 3 cycles of chemotherapy for response assessment. Image analysis was performed by 2 observers, based on the RECIST criteria (version 1.1). Computed tomography revealed partial response of hepatic metastases in 7 patients (25%) by one observer and in 10 patients (35.7%) by the other observer, with good inter-observer agreement (k=0.75, percent agreement of 89.29%). Stable disease was detected in 19 patients (67.8%) by one observer and in 16 patients (57.1%) by the other observer, with good agreement (k=0.774, percent agreement of 89.29%). Progressive disease was detected in 2 patients (7.2%) by both observers, with perfect agreement (k=1, percent agreement of 100%). The overall inter-observer agreement in the CT-based response assessment of hepatic metastasis between the two observers was good (k=0.793, percent agreement of 89.29%). We concluded that computed tomography is a reliable and reproducible imaging modality for response assessment of hepatic metastases of breast cancer according to the RECIST criteria (version 1.1).
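
The kappa values quoted in this abstract can be checked with a short calculation. The sketch below is illustrative only: the 3x3 cross-tabulation is not given in the abstract and is inferred here from the reported marginal counts (7/19/2 versus 10/16/2) and the 89.29% agreement (25 of 28 patients), on the assumption that all three disagreements were stable disease for one observer and partial response for the other. The function names `cohen_kappa` and `collapse` are hypothetical helpers, not part of the study.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square cross-tabulation.

    Rows are observer 1's categories, columns are observer 2's.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the two observers' marginal proportions.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def collapse(table, index):
    """Collapse a k x k table to 2x2: category `index` versus all others."""
    k = len(table)
    a = table[index][index]
    b = sum(table[index]) - a
    c = sum(table[i][index] for i in range(k)) - a
    d = sum(sum(row) for row in table) - a - b - c
    return [[a, b], [c, d]]


# ASSUMED cross-tabulation, inferred from the abstract's marginals.
# Category order: partial response, stable disease, progressive disease.
recist = [
    [7, 0, 0],   # observer 1: partial response
    [3, 16, 0],  # observer 1: stable disease
    [0, 0, 2],   # observer 1: progressive disease
]

overall_kappa = cohen_kappa(recist)
percent_agreement = 100 * (7 + 16 + 2) / 28
per_category = [cohen_kappa(collapse(recist, i)) for i in range(3)]

print(round(overall_kappa, 3), round(percent_agreement, 2))
print([round(value, 3) for value in per_category])
```

Under that assumed table, the overall and per-category values agree with the figures reported in the abstract, which suggests the reported statistics are internally consistent.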
Prospective Study Validating Inter- and Intraobserver Variability of Tissue Compliance Meter in Breast Tissue of Healthy Volunteers: Potential Implications for Patients With Radiation-Induced Fibrosis of the Breast

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wernicke, A. Gabriella, E-mail: gaw9008@med.cornell.ed; Parashar, Bhupesh; Kulidzhanov, Fridon

2011-05-01

Purpose: Accurate detection of radiation-induced fibrosis (RIF) is crucial in management of breast cancer survivors. Tissue compliance meter (TCM) has been validated in musculature. We validate TCM in healthy breast tissue with respect to interobserver and intraobserver variability before applying it in RIF. Methods and Materials: Three medical professionals obtained three consecutive TCM measurements in each of the four quadrants of the right and left breasts of 40 women with no breast disease or surgical intervention. The intraclass correlation coefficient (ICC) assessed interobserver variability. The paired t test and Pearson correlation coefficient (r) were used to assess intraobserver variability within each rater. Results: The median age was 45 years (range, 24-68 years). The median bra size was 35C (range, 32A-40DD). Of the participants, 27 were white (67%), 4 black (10%), 5 Asian (13%), and 4 Hispanic (10%). ICCs indicated excellent interrater reliability (low interobserver variability) among the three raters, by breast and quadrant (all ICC ≥0.99). The paired t test and Pearson correlation coefficient both indicated low intraobserver variability within each rater (right vs. left breast), stratified by quadrant (all r ≥ 0.94, p < 0.0001). Conclusions: The interobserver and intraobserver variability is small using TCM in healthy mammary tissue. We are now embarking on a prospective study using TCM in women with breast cancer at risk of developing RIF that may guide early detection, timely therapeutic intervention, and assessment of success of therapy for RIF.
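
For readers unfamiliar with the intraclass correlation coefficient, the following is a minimal sketch of one common form, the two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1). The abstract does not state which ICC form was used, so this choice is an assumption, and `icc_2_1` is a hypothetical helper name. The sample ratings are a six-subject, four-rater data set widely used in textbook ICC examples, not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a subjects-by-raters table (list of equal-length rows)."""
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subject mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Textbook-style example data (NOT from the study): 6 subjects, 4 raters.
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_2_1(example_ratings)
print(round(example_icc, 2))
```

In this example the raters differ systematically in level, so absolute agreement is low even though they rank subjects similarly; values of 0.99 or above, as reported in the abstract, would require raters to produce nearly identical numbers.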
Validation of the Italian version of the Coma Recovery Scale-Revised (CRS-R).

PubMed

Sacco, Simona; Altobelli, Emma; Pistarini, Caterina; Cerone, Davide; Cazzulani, Benedetta; Carolei, Antonio

2011-01-01

To validate the Italian version of the Coma Recovery Scale-Revised (CRS-R). Two observers applied the Italian version of the CRS-R to selected patients. On day 1, observers A and B independently scored each patient; the comparison of their observations was used to evaluate inter-observer agreement. On day 2, observer A completed a second evaluation, and the comparison of this observation with that obtained on day 1 by the same observer was used to evaluate test-re-test agreement. For each evaluation, a diagnostic impression (vegetative state/minimally conscious state) was also reported. Thirty-eight patients were evaluated (mean age ± SD, 58.9 ± 13.8 years). Inter-observer (ρ = 0.81; p < 0.001) as well as test-re-test agreement (ρ = 0.97; p < 0.001) for the total score was high. Inter-observer agreement was excellent for the communication sub-scale, good for the auditory, visual and motor sub-scales and moderate for the oromotor/verbal and arousal sub-scales. Test-re-test agreement was excellent for the visual, motor, oromotor/verbal and communication sub-scales, good for the auditory sub-scale and moderate for the arousal sub-scale. When considering the diagnostic impression, inter-observer agreement was good (κ = 0.75; p < 0.001) and test-re-test agreement was excellent (κ = 0.92; p < 0.001). The Italian version of the CRS-R can be administered reliably and can also be used to discriminate between patients in a vegetative state and those in a minimally conscious state.
Reliability and Usefulness of Intraoperative 3-Dimensional Imaging by Mobile C-Arm With Flat-Panel Detector.

PubMed

Fujimori, Takahito; Iwasaki, Motoki; Nagamoto, Yukitaka; Kashii, Masafumi; Takao, Masaki; Sugiura, Tsuyoshi; Yoshikawa, Hideki

2017-02-01

Reliability and agreement study. To assess the reliability of intraoperative 3-dimensional imaging with a mobile C-arm (3D C-arm) equipped with a flat-panel detector. Pedicle screws are widely used in spinal surgery. Postoperative computed tomography (CT) is the most reliable method to detect screw misplacement. Recent advances in imaging devices have enabled surgeons to acquire 3D images of the spine during surgery. However, the reliability of these imaging devices is not known. A total of 203 screws were used in 22 consecutive patients who underwent surgery for scoliosis. Screw position was read twice with a 3D C-arm and twice with CT in a blinded manner by 2 independent observers. Screw positions were classified into 4 categories at every 2 mm and then into 2 simpler categories of acceptable or unacceptable. The degree of agreement with respect to screw positions between the double readings was evaluated by κ value. With unanimous agreement between 2 observers regarding postoperative CT readings considered the gold standard, the sensitivity of the 3D C-arm for determining screw misplacement was calculated. A total of 804 readings were performed. For the 4-category classification, the mean κ value for the 2 interobserver readings was 0.52 for the 3D C-arm and 0.46 for CT. For the 2-category classification, the mean κ value for the 2 interobserver readings was 0.80 for the 3D C-arm and 0.66 for CT. The sensitivity, specificity, positive predictive value, and negative predictive value of intraoperative imaging with the 3D C-arm were 70%, 95%, 44%, and 98%, respectively. With respect to screws with perforation ≥4 mm, the sensitivity was 83%. No revision surgery was performed. Intraoperative imaging with a 3D C-arm was reliable for detecting screw misplacement and helpful in decreasing the rate of revision surgery for screw misplacement.
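
The four accuracy measures quoted above all derive from a single 2x2 table of index-test results against the reference standard. The sketch below shows the standard definitions. The abstract does not report the underlying counts, so the numbers used here are purely hypothetical, chosen only because they round to percentages similar to those reported; `diagnostic_metrics` is an illustrative helper name.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table against a reference test.

    tp: misplaced on both index test and reference
    fp: misplaced on index test only
    fn: misplaced on reference only
    tn: acceptable on both
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true misplacements detected
        "specificity": tn / (tn + fp),  # share of acceptable screws cleared
        "ppv": tp / (tp + fp),          # chance a flagged screw is misplaced
        "npv": tn / (tn + fn),          # chance a cleared screw is acceptable
    }


# HYPOTHETICAL counts for illustration; not taken from the study.
metrics = diagnostic_metrics(tp=7, fp=9, fn=3, tn=171)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

The example also illustrates why a low positive predictive value can coexist with high specificity: when misplacement is rare, even a small false-positive rate produces more false alarms than true detections.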

Values of a Patient and Observer Scar Assessment Scale to Evaluate the Facial Skin Graft Scar

PubMed Central

Chae, Jin Kyung; Kim, Eun Jung; Park, Kun

2016-01-01

Background The patient and observer scar assessment scale (POSAS) recently emerged as a promising method, reflecting both the observer's and the patient's opinions in evaluating scars. This tool was shown to be consistent and reliable in burn scar assessment, but it has not been tested in the setting of skin graft scars in skin cancer patients. Objective To evaluate facial skin graft scars using the POSAS and to compare it with objective scar assessment tools. Methods Twenty-three patients who had been diagnosed with facial cutaneous malignancy and had received skin grafts after Mohs micrographic surgery were recruited. Observer assessment was performed by three independent raters using the observer component of the POSAS and the Vancouver scar scale (VSS). Patient self-assessment was performed using the patient component of the POSAS. To quantify scar color and scar thickness more objectively, spectrophotometry and ultrasonography were applied. Results Inter-observer reliability was substantial with both the VSS and the observer component of the POSAS (average measure intraclass correlation coefficient, 0.76 and 0.80, respectively). The observer component consistently showed significant correlations with patients' ratings for the parameters of the POSAS (all p-values<0.05). The correlation between subjective assessment using the POSAS and objective assessment using spectrophotometry and ultrasonography was low. Conclusion In facial skin graft scar assessment in skin cancer patients, the POSAS showed acceptable inter-observer reliability. This tool was more comprehensive and had higher correlation with the patient's opinion. PMID:27746642
Targeting Homology-Directed Recombinational Repair (HDR) of Chromosomal Breaks to Sensitize Prostate Cancer Cells to Poly (ADP-Ribose) Polymerase (PARP) Inhibition

DTIC Science & Technology

2013-08-01

The views, opinions and/or findings contained in this report are those of the author(s) and should not be construed as an official Department of...on p53. To assess whether BRCA1 nuclear export following IR in prostate cancer cells is also p53 dependent, we next performed the above experiments...Task 1B. Previous reports suggest that IR-induced BRCA1 export is also dependent on CRM1. To test this hypothesis, we proposed that the CRM1
Terrain Analysis and Settlement Pattern Survey: Upper Bayou Zourie, Fort Polk, Louisiana.

DTIC Science & Technology

1981-10-01

Louisiana, Vernon Parish. As part of a cultural resources survey of...on the hilltops and along the tops of ridges between the incised drainages. The bulk of the soil is a colluvial sand and clay. To the north is a...general exposure of red clayey sands which were probably a product [Table 10: 16VN441 Site DFI Summary Table]
Test-retest and inter- and intrareliability of the quality of the upper-extremity skills test in preschool-age children with cerebral palsy.

PubMed

Haga, Nienke; van der Heijden-Maessen, Hélène C; van Hoorn, Jessika F; Boonstra, Anne M; Hadders-Algra, Mijna

2007-12-01

To investigate the test-retest, inter-, and intraobserver reliability of the Quality of Upper Extremity Skills Test (QUEST) in young children with cerebral palsy (CP). For test-retest reliability, a test-retest design was used; for the intra- and interobserver reliability, the videotaped test was scored on 2 occasions by 1 observer and by various observers. Groups of preschool-age children in 2 general rehabilitation centers. Twenty-one children with CP (12 boys, 9 girls) aged 2 to 4.5 years (mean, 39 mo). Not applicable. Spearman correlation coefficient. The data indicated that test-retest reliability was strong (rho range, .85-.94). Intraobserver agreement (rho range, .63-.95) and agreement between various observers (rho range, .72-.90) were moderate to strong. Test-retest and inter- and intraobserver reliability of the QUEST in preschool-age children with CP is good.
Definition and Reliability Assessment of Elementary Ultrasonographic Findings in Calcium Pyrophosphate Deposition Disease: A Study by the OMERACT Calcium Pyrophosphate Deposition Disease Ultrasound Subtask Force.

PubMed

Filippou, Georgios; Scirè, Carlo A; Damjanov, Nemanja; Adinolfi, Antonella; Carrara, Greta; Picerno, Valentina; Toscano, Carmela; Bruyn, George A; D'Agostino, Maria Antonietta; Delle Sedie, Andrea; Filippucci, Emilio; Gutierrez, Marwin; Micu, Mihaela; Möller, Ingrid; Naredo, Esperanza; Pineda, Carlos; Porta, Francesco; Schmidt, Wolfgang A; Terslev, Lene; Vlad, Violeta; Zufferey, Pascal; Iagnocco, Annamaria

2017-11-01

To define the ultrasonographic characteristics of calcium pyrophosphate crystal (CPP) deposits in joints and periarticular tissues and to evaluate the intra- and interobserver reliability of expert ultrasonographers in the assessment of CPP deposition disease (CPPD) according to the new definitions. After a systematic literature review, a Delphi survey was circulated among a group of expert ultrasonographers, who were members of the CPPD Ultrasound (US) Outcome Measures in Rheumatology (OMERACT) subtask force, to obtain definitions of the US characteristics of CPPD at the level of fibrocartilage (FC), hyaline cartilage (HC), tendon, and synovial fluid (SF). Subsequently, the reliability of US in assessing CPPD at knee and wrist levels according to the agreed definitions was tested in static images and in patients with CPPD. Cohen's κ was used for statistical analysis. HC and FC of the knee yielded the highest interobserver κ values among all the structures examined, in both the Web-based (0.73 for HC and 0.58 for FC) and patient-based exercises (0.55 for the HC and 0.64 for the FC). Kappa values for the other structures were lower, ranging from 0.28 in tendons to 0.50 in SF in the static exercise and from 0.09 (proximal patellar tendon) to 0.27 (triangular FC of the wrist) in the patient-based exercise. The new OMERACT definitions for the US identification of CPPD proved to be reliable at the level of the HC and FC of the knee. Further studies are needed to better define the US characteristics of CPPD and optimize the scanning technique in other anatomical sites.
Classification of instability after reverse shoulder arthroplasty guides surgical management and outcomes.

PubMed

Abdelfattah, Adham; Otto, Randall J; Simon, Peter; Christmas, Kaitlyn N; Tanner, Gregory; LaMartina, Joey; Levy, Jonathan C; Cuff, Derek J; Mighell, Mark A; Frankle, Mark A

2018-04-01

Revision of unstable reverse shoulder arthroplasty (RSA) remains a significant challenge. The purpose of this study was to determine the reliability of a new treatment-guiding classification for instability after RSA, to describe the clinical outcomes of patients stabilized operatively, and to identify those with higher risk of recurrence. All patients undergoing revision for instability after RSA were identified at our institution. Demographic, clinical, radiographic, and intraoperative data were collected. A classification was developed using all identified causes of instability after RSA and allocating them to 1 of 3 defined treatment-guiding categories. Eight surgeons reviewed all data and applied the classification scheme to each case. Interobserver and intraobserver reliability were used to evaluate the classification scheme. Preoperative clinical outcomes were compared with final follow-up in stabilized shoulders. Forty-three revision cases in 34 patients met the inclusion criteria for the study. Five patients remained unstable after revision. Persistent instability most commonly occurred in persistent deltoid dysfunction and postoperative acromial fractures but also in 1 case of soft tissue impingement. Twenty-one patients remained stable at minimum 2 years of follow-up and had significant improvement of clinical outcome scores and range of motion. The classification scheme showed substantial interobserver and almost perfect intraobserver agreement among all the participants (κ = 0.699 and κ = 0.851, respectively). Instability after RSA can be successfully treated with revision surgery using the reliable treatment-guiding classification scheme presented herein. However, more understanding is needed for patients with greater risk of recurrent instability after revision surgery. Copyright © 2017 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Dental measurements and Bolton index reliability and accuracy obtained from 2D digital, 3D segmented CBCT, and 3d intraoral laser scanner

PubMed Central

San José, Verónica; Bellot-Arcís, Carlos; Tarazona, Beatriz; Zamora, Natalia; O Lagravère, Manuel

2017-01-01

Background To compare the reliability and accuracy of direct and indirect dental measurements derived from two types of 3D virtual models: generated by intraoral laser scanning (ILS) and segmented cone beam computed tomography (CBCT), comparing these with a 2D digital model. Material and Methods One hundred patients were selected. All patients’ records included initial plaster models, an intraoral scan and a CBCT. Patients' dental arches were scanned with the iTero® intraoral scanner while the CBCTs were segmented to create three-dimensional models. To obtain 2D digital models, plaster models were scanned using a conventional 2D scanner. When digital models had been obtained using these three methods, direct dental measurements were measured and indirect measurements were calculated. Differences between methods were assessed by means of paired t-tests and regression models. Intra- and inter-observer error were analyzed using Dahlberg's d and coefficients of variation. Results Intraobserver and interobserver error for the ILS model was less than 0.44 mm while for segmented CBCT models, the error was less than 0.97 mm. ILS models provided statistically and clinically acceptable accuracy for all dental measurements, while CBCT models showed a tendency to underestimate measurements in the lower arch, although within the limits of clinical acceptability. Conclusions ILS and CBCT segmented models are both reliable and accurate for dental measurements. Integration of ILS with CBCT scans would provide dental and skeletal information together. Key words: CBCT, intraoral laser scanner, 2D digital models, 3D models, dental measurements, reliability. PMID:29410764
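
Dahlberg's d and the coefficient of variation, the two error measures named in this abstract, are simple to compute from paired repeat measurements. The sketch below uses the usual textbook definitions, d = sqrt(sum of squared differences / 2n) and CV = 100 x SD / mean. The measurement values are hypothetical tooth widths in millimetres, invented for illustration, and the function names are assumptions rather than anything from the study.

```python
import math
import statistics


def dahlberg_error(first, second):
    """Dahlberg's d for paired repeat measurements of the same items."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


# HYPOTHETICAL repeat measurements (mm); not data from the study.
session_1 = [8.0, 7.0, 10.0, 9.0]
session_2 = [8.2, 6.9, 10.1, 8.8]

d = dahlberg_error(session_1, session_2)
cv = coefficient_of_variation([9.0, 10.0, 11.0])
print(f"Dahlberg d = {d:.3f} mm, CV = {cv:.1f}%")
```

Because d is expressed in the units of the measurement, it can be compared directly with thresholds such as the 0.44 mm and 0.97 mm figures quoted in the abstract.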
Reliability of corneal dynamic scheimpflug analyser measurements in virgin and post-PRK eyes.

PubMed

Chen, Xiangjun; Stojanovic, Aleksandar; Hua, Yanjun; Eidet, Jon Roger; Hu, Di; Wang, Jingting; Utheim, Tor Paaske

2014-01-01

To determine the measurement reliability of CorVis ST, a dynamic Scheimpflug analyser, in virgin and post-photorefractive keratectomy (PRK) eyes and compare the results between these two groups. Forty virgin eyes and 42 post-PRK eyes underwent CorVis ST measurements performed by two technicians. Repeatability was evaluated by comparing three consecutive measurements by technician A. Reproducibility was determined by comparing the first measurement by technician A with one performed by technician B. Intraobserver and interobserver intraclass correlation coefficients (ICCs) were calculated. Univariate analysis of covariance (ANCOVA) was used to compare measured parameters between virgin and post-PRK eyes. The intraocular pressure (IOP), central corneal thickness (CCT) and 1st applanation time demonstrated good intraobserver repeatability and interobserver reproducibility (ICC ≥ 0.90) in virgin and post-PRK eyes. The deformation amplitude showed a good or close to good repeatability and reproducibility in both groups (ICC ≥ 0.88). The CCT correlated positively with 1st applanation time (r = 0.437 and 0.483, respectively, p<0.05) and negatively with deformation amplitude (r = -0.384 and -0.375, respectively, p<0.05) in both groups. Compared to post-PRK eyes, virgin eyes showed longer 1st applanation time (7.29 ± 0.21 vs. 6.96 ± 0.17 ms, p<0.05) and lower deformation amplitude (1.06 ± 0.07 vs. 1.17 ± 0.08 mm, p < 0.05). CorVis ST demonstrated reliable measurements for CCT, IOP, and 1st applanation time, as well as relatively reliable measurement for deformation amplitude in both virgin and post-PRK eyes. There were differences in 1st applanation time and deformation amplitude between virgin and post-PRK eyes, which may reflect corneal biomechanical changes occurring after the surgery in the latter.
Inter- and intraobserver reliability of the clock face representation as used to describe the femoral intercondylar notch.

PubMed

Azzam, Michael G; Lenarz, Christopher J; Farrow, Lutul D; Israel, Heidi A; Kieffer, David A; Kaar, Scott G

2011-08-01

To validate the use of the clock face reference as a reliable means of communicating femoral intercondylar notch position. A single red mark was made on ten identical left Sawbones femurs in the intercondylar notch at variable locations. Ten surgeons, who routinely perform ACL reconstructions, were presented the femurs in random order and asked to state the position of the mark to the nearest 30-min interval. Responses were recorded and then repeated 3 weeks later. The same 10 surgeons were presented with 30 actual arthroscopic photographs of the intercondylar notch, performed at 90° of knee flexion, with a probe pointing at various locations (10 knees; 3 photographs/knee) along the lateral aspect of the notch. The results were then analyzed with an ICC, Cronbach's alpha test, and descriptive statistics. For the Sawbones, the ICC was 0.996 while individual physicians' Cronbach's alpha ranged from 0.954 to 0.999, indicating a very high interobserver and intraobserver reliability. The mean range of responses among the 10 surgeons was 1.6 h, SD 0.6. For the photographs, the ICC was also high at 0.997. There was a mean range of 1.1 h, SD 0.4, among surgeons. The clock face method is commonly utilized both for placement of the femoral tunnel during ACL reconstruction and for describing the location of the ACL femoral tunnel between communicating surgeons. Despite a high statistical interobserver correlation, there is a significant range among different surgeons' responses. The present study questions the reliability of the clock face method for use between surgeons as a stand-alone tool. Other methods also utilizing anatomic landmarks may be more accurate for describing intercondylar notch anatomy. Level of evidence: III.
Electronic synoptic operative reporting: assessing the reliability and completeness of synoptic reports for pancreatic resection.

PubMed

Park, Jason; Pillarisetty, Venu G; Brennan, Murray F; Jarnagin, William R; D'Angelica, Michael I; Dematteo, Ronald P; G Coit, Daniel; Janakos, Maria; Allen, Peter J

2010-09-01

Electronic synoptic operative reports (E-SORs) have replaced dictated reports at many institutions, but whether E-SORs adequately document the components and findings of an operation has received limited study. This study assessed the reliability and completeness of E-SORs for pancreatic surgery developed at our institution. An attending surgeon and surgical fellow prospectively and independently completed an E-SOR after each of 112 major pancreatic resections (78 proximal, 29 distal, and 5 central) over a 10-month period (September 2008 to June 2009). Reliability was assessed by calculating the interobserver agreement between attending physician and fellow reports. Completeness was assessed by comparing E-SORs to a case-matched (surgeon and procedure) historical control of dictated reports, using a 39-item checklist developed through an internal and external query of 13 high-volume pancreatic surgeons. Interobserver agreement between attending and fellow was moderate to very good for individual categorical E-SOR items (kappa = 0.65 to 1.00, p < 0.001 for all items). Compared with dictated reports, E-SORs had significantly higher completeness checklist scores (mean 88.8 +/- 5.4 vs 59.6 +/- 9.2 [maximum possible score, 100], p < 0.01) and were available in patients' electronic records in a significantly shorter interval of time (median 0.5 vs 5.8 days from case end, p < 0.01). The mean time taken to complete E-SORs was 4.0 +/- 1.6 minutes per case. E-SORs for pancreatic surgery are reliable, complete in data collected, and rapidly available, all of which support their clinical implementation. The inherent strengths of E-SORs offer real promise of a new standard for operative reporting and health communication. Copyright 2010 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
Preliminary validation of the Knee Inflammation MRI Scoring System (KIMRISS) for grading bone marrow lesions in osteoarthritis of the knee: data from the Osteoarthritis Initiative

PubMed Central

Jeffery, Dean; Buller, M; Wichuk, Stephanie; McDougall, Dave; Lambert, Robert GW; Maksymowych, Walter P

2017-01-01

Objective Bone marrow lesions (BML) are an MRI feature of osteoarthritis (OA) offering a potential target for therapy. We developed the Knee Inflammation MRI Scoring System (KIMRISS) to semiquantitatively score BML with high sensitivity to small changes, and compared feasibility, reliability and responsiveness versus the established MRI Osteoarthritis Knee Score (MOAKS). Methods KIMRISS incorporates a web-based graphic overlay to facilitate detailed regional BML scoring. Observers scored BML by MOAKS and KIMRISS on sagittal fluid-sensitive sequences. Exercise 1 focused on interobserver reliability in Osteoarthritis Initiative observational data, with 4 readers (two experienced/two new to KIMRISS) scoring BML in 80 patients (baseline/1 year). Exercise 2 focused on responsiveness in an open-label trial of adalimumab, with 2 experienced readers scoring BML in 16 patients (baseline/12 weeks). Results Scoring time was similar for KIMRISS and MOAKS. Interobserver reliability of KIMRISS was equivalent to MOAKS for BML status (ICC=0.84 vs 0.79), but consistently better than MOAKS for change in BML: Exercise 1 (ICC 0.82 vs 0.53), Exercise 2 (ICC 0.90 vs 0.32), and in new readers (0.87–0.92 vs 0.32–0.51). KIMRISS BML was more responsive than MOAKS BML: post-treatment BML improvement in Exercise 2 reached statistical significance for KIMRISS (SRM −0.69, p=0.015), but not MOAKS (SRM −0.12, p=0.625). KIMRISS BML also more strongly correlated to WOMAC scores than MOAKS BML (r=0.80 vs 0.58, p<0.05). Conclusions KIMRISS BML scoring was highly feasible, and was more reliable for assessment of change and more responsive to change than MOAKS BML for expert and new readers. PMID:28123780
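
The responsiveness figures in this abstract are standardized response means (SRM), conventionally defined as the mean change score divided by the standard deviation of the change scores. A minimal sketch of that calculation follows. The baseline and follow-up scores are hypothetical, and `standardized_response_mean` is an illustrative name; the abstract does not describe how the authors computed the SRM beyond reporting the values.

```python
import statistics


def standardized_response_mean(baseline, follow_up):
    """Mean of paired change scores divided by their sample SD."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)


# HYPOTHETICAL lesion scores for five patients; not data from the study.
baseline_scores = [10, 12, 8, 15, 11]
follow_up_scores = [8, 9, 8, 11, 10]

srm = standardized_response_mean(baseline_scores, follow_up_scores)
print(round(srm, 2))
```

A negative SRM here means scores fell after treatment. A larger absolute value indicates that the instrument detects change more consistently relative to its noise, which is the sense in which the abstract calls one scoring system more responsive than the other.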
Development of a new instrument for determining the level of chewing function in children.

PubMed

Serel Arslan, S; Demir, N; Barak Dolgun, A; Karaduman, A A

2016-07-01

This study aimed to develop a chewing performance scale that classifies chewing from normal to severely impaired and to investigate its validity and reliability. The study included the developmental phase and reported the content, structural, criterion validity, interobserver and intra-observer reliability of the chewing performance scale, which was called the Karaduman Chewing Performance Scale (KCPS). A dysphagia literature review, other questionnaires and clinical experiences were used in the developmental phase. Seven experts assessed the steps for content validity over two Delphi rounds. To test structural, criterion validity, interobserver and intra-observer reliability, two swallowing therapists evaluated chewing videos of 144 children (Group I: 61 healthy children without chewing disorders, mean age of 42·38 ± 9·36 months; Group II: 83 children with cerebral palsy who have chewing disorders, mean age of 39·09 ± 22·95 months) using KCPS. The Behavioral Pediatrics Feeding Assessment Scale (BPFAS) was used for criterion validity. The KCPS steps arranged between 0-4 were found to be necessary. The content validity index was 0·885. The KCPS levels were found to be different between groups I and II (χ² = 123·286, P < 0·001). A moderately strong positive correlation was found between the KCPS and the subscales of the BPFAS (r = 0·444-0·773, P < 0·001). An excellent positive correlation was detected between two swallowing therapists and between two examinations of one swallowing therapist (r = 0·962, P < 0·001; r = 0·990, P < 0·001, respectively). The KCPS is a valid, reliable, quick and clinically easy-to-use functional instrument for determining the level of chewing function in children. © 2016 John Wiley & Sons Ltd.
Prioritisation of patients on waiting lists for hip and knee arthroplasties and cataract surgery: Instruments validation

PubMed Central

Allepuz, Alejandro; Espallargues, Mireia; Moharra, Montse; Comas, Mercè; Pons, Joan MV

2008-01-01

Background Prioritisation instruments were developed for patients on waiting lists for hip and knee arthroplasties (AI) and cataract surgery (CI). The aim of the study was to assess their convergent and discriminant validity and inter-observer reliability. Methods Multicentre validation study which included orthopaedic surgeons and ophthalmologists from 10 hospitals. Participating doctors were asked to include all eligible patients placed on the waiting list for the procedures under study during the medical visit. Doctors assessed patients' priority through a visual analogue scale (VAS) and administered the prioritisation instrument. Information on socio-demographic data and health-related quality of life (HRQOL) (HUI3, EQ-5D, WOMAC and VF-14) was obtained through a telephone interview with patients. The correlation coefficients between the prioritisation instrument score and VAS and HRQOL were calculated. For the reliability study a self-administered questionnaire, which included hypothetical patient scenarios, was sent via postal mail to the doctors. The priority of these scenarios was assessed through the prioritisation instrument. The intraclass correlation coefficient (ICC) between doctors was calculated. Results Correlations with VAS were strong for the AI (0.64, CI95%: 0.59–0.68) and for the CI (0.65, CI95%: 0.62–0.69), and moderate between the WOMAC and the AI (0.39, CI95%: 0.33–0.45) and the VF-14 and the CI (0.38, CI95%: 0.33–0.43). The results of the discriminant analysis were in general as expected. Inter-observer reliability was 0.79 (CI95%: 0.64–0.94) for the AI, and 0.79 (CI95%: 0.63–0.95) for the CI. Conclusion The results show acceptable validity and reliability of the prioritisation instruments in establishing priority for surgery. PMID:18397519
Inter- and intra-observer reliability of clinical movement-control tests for marines

PubMed Central

2012-01-01

Background Musculoskeletal disorders particularly in the back and lower extremities are common among marines. Here, movement-control tests are considered clinically useful for screening and follow-up evaluation. However, few studies have addressed the reliability of clinical tests, and no such published data exists for marines. The present aim was therefore to determine the inter- and intra-observer reliability of clinically convenient tests emphasizing movement control of the back and hip among marines. A secondary aim was to investigate the sensitivity and specificity of these clinical tests for discriminating musculoskeletal pain disorders in this group of military personnel. Methods This inter- and intra-observer reliability study used a test-retest approach with six standardized clinical tests focusing on movement control for back and hip. Thirty-three marines (age 28.7 yrs, SD 5.9) on active duty volunteered and were recruited. They followed an in-vivo observation test procedure that covered both low- and high-load (threshold) tasks relevant for marines on operational duty. Two independent observers simultaneously rated performance as “correct” or “incorrect” following a standardized assessment protocol. Re-testing followed 7–10 days thereafter. Reliability was analysed using kappa (κ) coefficients, while discriminative power of the best-fitting tests for back- and lower-extremity pain was assessed using a multiple-variable regression model. Results Inter-observer reliability for the six tests was moderate to almost perfect with κ-coefficients ranging between 0.56-0.95. Three tests reached almost perfect inter-observer reliability with mean κ-coefficients > 0.81. However, intra-observer reliability was fair-to-moderate with mean κ-coefficients between 0.22-0.58. Three tests achieved moderate intra-observer reliability with κ-coefficients > 0.41. 
Combinations of one low- and one high-threshold test best discriminated prior back pain, but results were inconsistent for lower-extremity pain. Conclusions Our results suggest that clinical tests of movement control of back and hip are reliable for use in screening protocols using several observers with marines. However, test-retest reproducibility was less accurate, which should be considered in follow-up evaluations. The results also indicate that combinations of low- and high-threshold tests have discriminative validity for prior back pain, but were inconclusive for lower-extremity pain. PMID:23273285
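For readers unfamiliar with the kappa statistic reported in this record, the following is a minimal Python sketch of Cohen's κ for two observers who each rate a performance as "correct" or "incorrect". The function name and the example ratings are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined here; returning 1.0 is a convention assumed
        # for this sketch (both raters used one identical category).
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 50 performances by two observers.
ratings_a = ["correct"] * 25 + ["incorrect"] * 25
ratings_b = (["correct"] * 20 + ["incorrect"] * 5
             + ["correct"] * 10 + ["incorrect"] * 15)
kappa = cohen_kappa(ratings_a, ratings_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the observers agree on 70% of items, but half of that agreement would be expected by chance, which is why κ is considerably lower than the raw agreement.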
Three-dimensional image technology in forensic anthropology: Assessing the validity of biological profiles derived from CT-3D images of the skeleton

NASA Astrophysics Data System (ADS)

Garcia de Leon Valenzuela, Maria Julia

This project explores the reliability of building a biological profile for an unknown individual based on three-dimensional (3D) images of the individual's skeleton. 3D imaging technology has been widely researched for medical and engineering applications, and it is increasingly being used as a tool for anthropological inquiry. While the question of whether a biological profile can be derived from 3D images of a skeleton with the same accuracy as achieved when using dry bones has been explored, bigger sample sizes, a standardized scanning protocol and more interobserver error data are needed before 3D methods can become widely and confidently used in forensic anthropology. 3D images of Computed Tomography (CT) scans were obtained from 130 innominate bones from Boston University's skeletal collection (School of Medicine). For each bone, both 3D images and original bones were assessed using the Phenice and Suchey-Brooks methods. Statistical analysis was used to determine the agreement between 3D image assessment versus traditional assessment. A pool of six individuals with varying experience in the field of forensic anthropology scored a subsample (n = 20) to explore interobserver error. While a high agreement was found for age and sex estimation for specimens scored by the author, the interobserver study shows that observers found it difficult to apply standard methods to 3D images. Higher levels of experience did not result in higher agreement between observers, as would be expected. Thus, a need for training in 3D visualization before applying anthropological methods to 3D bones is suggested. Future research should explore interobserver error using a larger sample size in order to test the hypothesis that training in 3D visualization will result in a higher agreement between scores. The need for the development of a standard scanning protocol focusing on the optimization of 3D image resolution is highlighted. 
Applications for this research include the possibility of digitizing skeletal collections in order to expand their use and for deriving skeletal collections from living populations and creating population-specific standards. Further research for the development of a standard scanning and processing protocol is needed before 3D methods in forensic anthropology are considered as reliable tools for generating biological profiles.
An International Ki67 Reproducibility Study in Adrenal Cortical Carcinoma.

PubMed

Papathomas, Thomas G; Pucci, Eugenio; Giordano, Thomas J; Lu, Hao; Duregon, Eleonora; Volante, Marco; Papotti, Mauro; Lloyd, Ricardo V; Tischler, Arthur S; van Nederveen, Francien H; Nose, Vania; Erickson, Lori; Mete, Ozgur; Asa, Sylvia L; Turchini, John; Gill, Anthony J; Matias-Guiu, Xavier; Skordilis, Kassiani; Stephenson, Timothy J; Tissier, Frédérique; Feelders, Richard A; Smid, Marcel; Nigg, Alex; Korpershoek, Esther; van der Spek, Peter J; Dinjens, Winand N M; Stubbs, Andrew P; de Krijger, Ronald R

2016-04-01

Despite the established role of Ki67 labeling index in prognostic stratification of adrenocortical carcinomas and its recent integration into treatment flow charts, the reproducibility of the assessment method has not been determined. The aim of this study was to investigate interobserver variability among endocrine pathologists using a web-based virtual microscopy approach. Ki67-stained slides of 76 adrenocortical carcinomas were analyzed independently by 14 observers, each according to their method of preference including eyeballing, formal manual counting, and digital image analysis. The interobserver variation was statistically significant (P<0.001) in the absence of any correlation between the various methods. Subsequently, 61 static images were distributed among 15 observers who were instructed to follow a category-based scoring approach. Low levels of interobserver (F=6.99; Fcrit=1.70; P<0.001) as well as intraobserver concordance (n=11; Cohen κ ranging from -0.057 to 0.361) were detected. To improve harmonization of Ki67 analysis, we tested the utility of an open-source Galaxy virtual machine application, namely Automated Selection of Hotspots, in 61 virtual slides. The software-provided Ki67 values were validated by digital image analysis in identical images, displaying a strong correlation of 0.96 (P<0.0001) and dividing the cases into 3 classes (cutoffs of 0%-15%-30% and/or 0%-10%-20%) with significantly different overall survivals (P<0.05). We conclude that current practices in Ki67 scoring assessment vary greatly, and interobserver variation sets particular limitations to its clinical utility, especially around clinically relevant cutoff values. Novel digital microscopy-enabled methods could provide critical aid in reducing variation, increasing reproducibility, and improving reliability in the clinical setting.
Concordance between local, institutional, and central pathology review in glioblastoma: implications for research and practice: a pilot study.

PubMed

Gupta, Tejpal; Nair, Vimoj; Epari, Sridhar; Pietsch, Torsten; Jalali, Rakesh

2012-01-01

There is significant inter-observer variation amongst the neuro-pathologists in the typing, subtyping, and grading of glial neoplasms for diagnosis. Centralized pathology review has been proposed to minimize this inter-observer variation and is now almost mandatory for accrual into multicentric trials. We sought to assess the concordance between neuro-pathologists on histopathological diagnosis of glioblastoma. Comparison of local, institutional, and central neuro-oncopathology reporting in a cohort of 34 patients with newly diagnosed supratentorial glioblastoma accrued consecutively at a tertiary-care institution on a prospective trial testing the addition of a new agent to standard chemo-radiation regimen. Concordance was sub-optimal between local histological diagnosis and central review, fair between local diagnosis and institutional review, and good between institutional and central review, with respect to histological typing/subtyping. Twelve (39%) of 31 patients with local histological diagnosis had identical tumor type, subtype and grade on central review. Overall agreement was modestly better (52%) between local diagnosis and institutional review. In contrast, 28 (83%) of 34 patients had completely concordant histopathologic diagnosis between institutional and central review. The inter-observer reliability test showed poor agreement between local and central review (kappa statistic=0.12, 95% confidence interval (CI): -0.03-0.32, P=0.043), but moderate agreement between institutional and central review (kappa statistic=0.51, 95%CI: 0.17-0.84, P=0.00003). Agreement between local diagnosis and institutional review was fair. There exists significant inter-observer variation regarding histopathological diagnosis of glioblastoma with significant implications for clinical research and practice. There is a need for more objective, quantitative, robust, and reproducible criteria for better subtyping for accurate diagnosis.
Radiographic measurement reliability of lumbar lordosis in ankylosing spondylitis.

PubMed

Lee, Jung Sub; Goh, Tae Sik; Park, Shi Hwan; Lee, Hong Seok; Suh, Kuen Tak

2013-04-01

Intraobserver and interobserver reliabilities of the several different methods to measure lumbar lordosis have been reported. However, they have not been studied so far in patients with ankylosing spondylitis (AS). We evaluated the inter- and intraobserver reliabilities of six specific measures of global lumbar lordosis in patients with AS. Ninety-one consecutive patients with AS who met the most recently modified New York criteria were enrolled and underwent anteroposterior and lateral radiographs of whole spine. The radiographs were divided into non-ankylosis (no bony bridge in the lumbar spine), incomplete ankylosis (lumbar spines were partially connected by bony bridge) and complete ankylosis groups to evaluate the reliability of the Cobb L1-S1, Cobb L1-L5, centroid, posterior tangent L1-S1, posterior tangent L1-L5, and TRALL methods. The radiographs comprised 39 non-ankylosis, 27 incomplete ankylosis and 25 complete ankylosis cases. Intra- and inter-class correlation coefficients (ICCs) of all six methods were generally high. The ICCs were all ≥0.77 (excellent) for the six radiographic methods in the combined group. However, a comparison of the ICCs, 95% confidence intervals and mean absolute difference (MAD) between groups with varying degrees of ankylosis showed that the reliability of the lordosis measurements decreased in proportion to the severity of ankylosis. The Cobb L1-S1, Cobb L1-L5 and posterior tangent L1-S1 methods demonstrated higher ICCs for both inter- and intraobserver comparisons, and the other methods showed lower ICCs in all groups. The intraobserver MAD was similar in the Cobb L1-S1 and Cobb L1-L5 (2.7°-4.3°), but the other methods showed higher intraobserver MAD. Only the Cobb L1-L5 method showed a low interobserver MAD in all groups. These results are the first to provide a reliability analysis of different global lumbar lordosis measurement methods in AS. 
The findings in this study demonstrated that the Cobb L1-L5 method is reliable for measuring the global lumbar lordosis in AS.
A Student Assessment Tool for Standardized Patient Simulations (SAT-SPS): Psychometric analysis.

PubMed

Castro-Yuste, Cristina; García-Cabanillas, María José; Rodríguez-Cornejo, María Jesús; Carnicer-Fuentes, Concepción; Paloma-Castro, Olga; Moreno-Corral, Luis Javier

2018-05-01

The evaluation of the level of clinical competence acquired by the student is a complex process that must meet various requirements to ensure its quality. The psychometric analysis of the data collected by the assessment tools used is a fundamental aspect to guarantee the student's competence level. To conduct a psychometric analysis of an instrument which assesses clinical competence in nursing students at simulation stations with standardized patients in OSCE-format tests. The construct of clinical competence was operationalized as a set of observable and measurable behaviors, measured by the newly-created Student Assessment Tool for Standardized Patient Simulations (SAT-SPS), which was comprised of 27 items. The categories assigned to the items were 'incorrect or not performed' (0), 'acceptable' (1), and 'correct' (2). 499 nursing students. Data were collected by two independent observers during the assessment of the students' performance at a four-station OSCE with standardized patients. Descriptive statistics were used to summarize the variables. The difficulty levels and floor and ceiling effects were determined for each item. Reliability was analyzed using internal consistency and inter-observer reliability. The validity analysis was performed considering face validity, content and construct validity (through exploratory factor analysis), and criterion validity. Internal reliability and inter-observer reliability were higher than 0.80. The construct validity analysis suggested a three-factor model accounting for 37.1% of the variance. These three factors were named 'Nursing process', 'Communication skills', and 'Safe practice'. A significant correlation was found between the scores obtained and the students' grades in general, as well as with the grades obtained in subjects with clinical content. The assessment tool has proven to be sufficiently reliable and valid for the assessment of the clinical competence of nursing students using standardized patients. 
This tool has three main components: the nursing process, communication skills, and safety management. Copyright © 2018 Elsevier Ltd. All rights reserved.
Pitfalls and important issues in testing reliability using intraclass correlation coefficients in orthopaedic research.

PubMed

Lee, Kyoung Min; Lee, Jaebong; Chung, Chin Youb; Ahn, Soyeon; Sung, Ki Hyuk; Kim, Tae Won; Lee, Hui Jong; Park, Moon Seok

2012-06-01

Intra-class correlation coefficients (ICCs) provide a statistical means of testing the reliability. However, their interpretation is not well documented in the orthopedic field. The purpose of this study was to investigate the use of ICCs in the orthopedic literature and to demonstrate pitfalls regarding their use. First, orthopedic articles that used ICCs were retrieved from the Pubmed database, and journal demography, ICC models and concurrent statistics used were evaluated. Second, reliability test was performed on three common physical examinations in cerebral palsy, namely, the Thomas test, the Staheli test, and popliteal angle measurement. Thirty patients were assessed by three orthopedic surgeons to explore the statistical methods testing reliability. Third, the factors affecting the ICC values were examined by simulating the data sets based on the physical examination data where the ranges, slopes, and interobserver variability were modified. Of the 92 orthopedic articles identified, 58 articles (63%) did not clarify the ICC model used, and only 5 articles (5%) described all models, types, and measures. In reliability testing, although the popliteal angle showed a larger mean absolute difference than the Thomas test and the Staheli test, the ICC of popliteal angle was higher, which was believed to be contrary to the context of measurement. In addition, the ICC values were affected by the model, type, and measures used. In simulated data sets, the ICC showed higher values when the range of data sets were larger, the slopes of the data sets were parallel, and the interobserver variability was smaller. Care should be taken when interpreting the absolute ICC values, i.e., a higher ICC does not necessarily mean less variability because the ICC values can also be affected by various factors. The authors recommend that researchers clarify ICC models used and ICC values are interpreted in the context of measurement.
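The dependence of the ICC on the range of the data can be illustrated with a small simulation. The sketch below computes the two-way random-effects, absolute-agreement, single-measure coefficient, ICC(2,1) in Shrout and Fleiss notation, from ANOVA mean squares. The simulation settings (30 subjects, 3 raters, uniform true values, Gaussian rater noise) are assumptions chosen for illustration and are not the data sets used by the authors.

```python
import random

def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject, one column per rater."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def simulate(true_spread, noise_sd, n=30, k=3, seed=0):
    """Hypothetical data: true values uniform on [0, true_spread] plus rater noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        true_value = rng.uniform(0, true_spread)
        data.append([true_value + rng.gauss(0, noise_sd) for _ in range(k)])
    return data

# Same rater noise in both cases; only the range of true values differs.
narrow = icc_2_1(simulate(true_spread=10, noise_sd=3))
wide = icc_2_1(simulate(true_spread=60, noise_sd=3))
print(f"narrow range: ICC = {narrow:.2f}; wide range: ICC = {wide:.2f}")
```

Because the rater noise is identical in both runs, any difference between the two coefficients comes from the spread of the subjects alone, which is the pitfall the abstract warns about when absolute ICC values are compared across measurements.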

Comparison of Two Methods for Estimating Adjustable One-Point Cane Length in Community-Dwelling Older Adults.

PubMed

Camara, Camila Thais Pinto; de Freitas, Sandra Maria Sbeghen Ferreira; de Lima, Waléria Paixão; Lima, Camila Astolphi; Amorim, César Ferreira; Perracini, Monica Rodrigues

2017-01-01

Our aim is to estimate inter-observer reliability, test-retest reliability, anthropometric and biomechanical adequacy and minimal detectable change when measuring the length of single-point adjustable canes in community-dwelling older adults. There are 112 participants in the study. They are men and women, aged 60 years and over, who were attending an outpatient community health centre. An exploratory study design was used. Participants underwent two assessments within the same day by two independent observers and by the same observer at an interval of 15-45 days. Two measures were used to establish the length of a single-point adjustable cane: the distance from the distal wrist crease to the floor (WF) and the distance from the top of the greater trochanter of the femur to the floor (TF). Each individual was fitted according to these two measures, and elbow flexion angle was measured. Inter-observer reliability and the test-retest reliability were high in both TF (ICC 3.1 = 0.918 and ICC 2.1 = 0.935) and WF measures (ICC 3.1 = 0.967 and ICC 2.1 = 0.960). Only 1% of the individuals kept an elbow flexion angle within the standard recommendation of 30° ± 10° when the cane length was determined by the TF measure, and 30% of the participants when the cane was determined by the WF measure. The minimal detectable cane length change was 2.2 cm. Our results suggest that, even though both measures are reliable, cane length determined by WF distance is more appropriate to keep the elbow flexion angle within the standard recommendation. The minimal detectable change corresponds to approximately a hole in the cane adjustment. Copyright © 2015 John Wiley & Sons, Ltd.
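The abstract does not state how the minimal detectable change was derived. A commonly used approach, sketched below in Python, takes the standard error of measurement as SEM = SD × √(1 − ICC) and the 95% minimal detectable change as MDC95 = 1.96 × √2 × SEM. The standard deviation and ICC in the example are hypothetical values chosen for illustration, not figures from the study.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem

# Hypothetical inputs: SD of 5.0 cm between participants, ICC of 0.96.
example_sem = sem_from_icc(sd=5.0, icc=0.96)
example_mdc = mdc95(example_sem)
print(f"SEM = {example_sem:.2f} cm, MDC95 = {example_mdc:.2f} cm")
```

The √2 factor reflects that a change score is the difference of two measurements, each carrying its own measurement error.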
Inter-observer agreement, diagnostic sensitivity and specificity of animal-based indicators of young lamb welfare.

PubMed

Phythian, C J; Toft, N; Cripps, P J; Michalopoulou, E; Winter, A C; Jones, P H; Grove-White, D; Duncan, J S

2013-07-01

A scientific literature review and consensus of expert opinion used the welfare definitions provided by the Farm Animal Welfare Council (FAWC) Five Freedoms as the framework for selecting a set of animal-based indicators that were sensitive to the current on-farm welfare issues of young lambs (aged ≤ 6 weeks). Ten animal-based indicators assessed by observation - demeanour, response to stimulation, shivering, standing ability, posture, abdominal fill, body condition, lameness, eye condition and salivation - were tested as part of the objective of developing valid, reliable and feasible animal-based measures of lamb welfare. The indicators were independently tested on 966 young lambs from 17 sheep flocks across Northwest England and Wales during December 2008 to April 2009 by four trained observers. Inter-observer reliability was assessed using Fleiss's kappa (κ), and the pair-wise agreement with an experienced observer designated as the 'test standard observer' (TSO) was examined using Cohen's κ. Latent class analysis (LCA) estimated the sensitivity (Se) and specificity (Sp) of each observer without assuming a gold standard and predicted the Se and Sp of randomly selected observers who may apply the indicators in the future. Overall, good levels of inter-observer reliability, and high levels of Sp were identified for demeanour (κ = 0.54, Se ≥ 0.70, Sp ≥ 0.98), stimulation (κ = 0.57, Se = 0.30 to 0.77, Sp ≥ 0.98), shivering (κ = 0.55, Se = 0.37 to 0.85, Sp ≥ 0.99), standing ability (κ = 0.54, Se ≥ 0.80, Sp ≥ 0.99), posture (κ = 0.45, Se ≥ 0.56, Sp = 0.99), abdominal fill (κ = 0.44, Se = 0.39 to 0.98, Sp = 0.99), body condition (κ = 0.72, Se = 0.38 to 0.90, Sp = 0.99), lameness (κ = 0.68, Se > 0.73, Sp = 1.00), and eye condition (κ = 0.72, Se ≥ 0.86, Sp = 0.99). LCA predicted that randomly selected observers had Se > 0.77 (acceptable), and Sp ≥ 0.98 (high) for assessments of demeanour, lameness, abdominal fill, posture, body condition and eye condition. 
The diagnostic performance of some indicators was influenced by the composition of the study population, and it would be useful to test the indicators on lambs with a greater level of outcomes associated with poor welfare. The findings presented in this paper could be applied in the selection of valid, reliable and feasible indicators used for the purposes of on-farm assessments of lamb welfare.
Reliability of a semi-automated 3D-CT measuring method for tunnel diameters after anterior cruciate ligament reconstruction: A comparison between soft-tissue single-bundle allograft vs. autograft.

PubMed

Robbrecht, Cedric; Claes, Steven; Cromheecke, Michiel; Mahieu, Peter; Kakavelakis, Kyriakos; Victor, Jan; Bellemans, Johan; Verdonk, Peter

2014-10-01

Post-operative widening of tibial and/or femoral bone tunnels is a common observation after ACL reconstruction, especially with soft-tissue grafts. There are no studies comparing tunnel widening in hamstring autografts versus tibialis anterior allografts. The goal of this study was to observe the difference in tunnel widening after the use of allograft vs. autograft for ACL reconstruction, by measuring it with a novel 3-D computed tomography based method. Thirty-five ACL-deficient subjects were included, underwent anatomic single-bundle ACL reconstruction and were evaluated at one year after surgery with the use of 3-D CT imaging. Three independent observers semi-automatically delineated femoral and tibial tunnel outlines, after which a best-fit cylinder was derived and the tunnel diameter was determined. Finally, intra- and inter-observer reliability of this novel measurement protocol was defined. In femoral tunnels, the intra-observer ICC was 0.973 (95% CI: 0.922-0.991) and the inter-observer ICC was 0.992 (95% CI: 0.982-0.996). In tibial tunnels, the intra-observer ICC was 0.955 (95% CI: 0.875-0.985). The combined inter-observer ICC was 0.970 (95% CI: 0.917-0.987). Tunnel widening was significantly higher in allografts compared to autografts, in the tibial tunnels (p=0.013) as well as in the femoral tunnels (p=0.007). To our knowledge, this novel, semi-automated 3D-computed tomography image processing method has been shown to yield highly reproducible results for the measurement of bone tunnel diameter and area. This series showed a significantly higher amount of tunnel widening observed in the allograft group at one-year follow-up. Level II, Prospective comparative study. Copyright © 2014 Elsevier B.V. All rights reserved.
Novel pathomorphologic classification of capsulo-articular lesions of the pubic symphysis in athletes to predict treatment and outcome.

PubMed

Hopp, Sascha; Ojodu, Ishaq; Jain, Atul; Fritz, Tobias; Pohlemann, Tim; Kelm, Jens

2018-05-01

Radiographic abnormalities of the symphysis as well as the formation of accessory clefts, indicating injury at the rectus-adductor aponeurosis, reportedly relate to longstanding groin pain in athletes. However, no systematic classification for clinical and scientific purposes yet exists. We aimed to (1) create a radiographic classification based on symphysography; (2) test intra- and interobserver reliability; (3) characterise clinical significance of the morphologic patterns by evaluating success of injection therapy. We retrospectively reviewed symphysography, AP radiographs, and MRI of the pelvis from 70 consecutive competitive athletes with chronic groin pain. Symphysographs were evaluated for intra- and interobserver variance using Cohen's kappa statistics. Morphologic studies of the different contrast distribution patterns and their clinical and radiological correlation with symptom relief were investigated. All patients were followed up to evaluate immediate and long-term response to the initial therapeutic injection with steroid. Four reproducible symphysographic patterns were identified: type 0, no changes; type 1, symphyseal disk degeneration; type 2, with unilateral clefts (2a), bilateral clefts (2b), or suprapubic clefts (2c); and type 3, with expanded or multidirectional clefts. Analysis revealed excellent intra- (0.94) and interobserver (0.90) reliability. Our findings showed that 78.6% of our patients had significant short-term improvement enabling early resumption of physiotherapy, only in types 1 and 2 (p = 0.001), while type 0 and 3 did not respond. At follow-up, only 21.8% had permanent pain relief. Regarding the detection of pathologic clefts with symphysography, sensitivity (88%) and specificity (77%) were superior to that of MRI. A reproducible symphysography-based classification of distinct morphologic patterns is proposed. It serves as a predictive tool for response to injection therapy in a select group of pathologic lesions. 
Complete recovery after injection can only be expected in a lesser percentage, as this might indicate surgical treatment for long-term non-responders.
Interobserver variability when employing the IUGA/ICS classification system for complications related to prostheses and grafts in female pelvic floor surgery.

PubMed

Gowda, Meghana; Kit, Laura Chang; Stuart Reynolds, W; Wang, Li; Dmochowski, Roger R; Kaufman, Melissa R

2013-10-01

To unify and organize reporting, an International Urogynecological Association (IUGA)/International Continence Society (ICS) expert consortium published terminology guidelines with a classification system for complications related to implants used in female pelvic surgery. We hypothesize that the complexity of the codification system may be a hindrance to precision, especially with decreasing levels of postgraduate expertise. Residents, fellows, and attending physicians were asked to code seven test cases taken from published literature. Category, timing, and site components of the classification system were assessed independently and according to the level of training. Interobserver reliability was calculated as percent agreement and Fleiss' kappa statistic. A total of 24 participants (6 attending physicians, 3 fellows, and 15 residents) were tested. The percent agreement showed significant variation when classified by level of training. In all categories, attending physicians had the greatest percentage agreement and largest kappa. The most agreement was seen when attending physicians classified mesh complications by time, 71% agreement with kappa 0.73 [95% confidence interval (CI) 0.58-0.88]. For the same task, the percentage agreement for fellows was 57%, kappa 0.55 (95% CI 0.23-0.87) and with residents 57%, kappa 0.71 (95% CI 0.64-0.78). Interestingly, the site component of the classification system had the least overall agreement and lowest kappa [0%, kappa 0.29 (95% CI 0.26-0.32)] followed by the category component [14%, kappa 0.48 (95% CI 0.46-0.5)]. The IUGA/ICS mesh complication classification system has poor interobserver reliability. This trended downward with decreasing postgraduate level; however, we did not have sufficient statistical power to show an association when stratifying by all training levels. 
This highlights the complex nature of the classification system in its current form and its limitation for widespread clinical and research application.
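The two agreement measures named in this record can be sketched in a few lines of Python. The code below computes Fleiss' kappa from a table of category counts per case, and a simple percent-agreement figure that is assumed here to mean the share of cases on which all raters chose the same code (the abstract does not define it). The function names and the example table are hypothetical and do not reproduce the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters assigning case i to category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of raters")
    n_categories = len(counts[0])
    total = n_cases * n_raters
    # Overall proportion of assignments falling into each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-case agreement: share of rater pairs that agree on that case.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_case) / n_cases
    p_exp = sum(p * p for p in p_cat)
    # Undefined (division by zero) if every rating falls in a single category.
    return (p_bar - p_exp) / (1 - p_exp)

def complete_agreement(counts):
    """Share of cases on which every rater chose the same category (assumed definition)."""
    n_raters = sum(counts[0])
    return sum(max(row) == n_raters for row in counts) / len(counts)

# Hypothetical table: 7 cases, 6 raters, 3 possible codes.
example = [
    [6, 0, 0], [5, 1, 0], [0, 6, 0], [0, 4, 2],
    [1, 0, 5], [0, 0, 6], [3, 3, 0],
]
print(f"complete agreement = {complete_agreement(example):.0%}")
print(f"Fleiss' kappa = {fleiss_kappa(example):.2f}")
```

Complete agreement is a strict criterion: a single dissenting rater removes a case from the count, so it can be low or even zero while kappa, which credits partial agreement among rater pairs, remains well above zero.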
Shear wave elastography for breast masses is highly reproducible.

PubMed

Cosgrove, David O; Berg, Wendie A; Doré, Caroline J; Skyba, Danny M; Henry, Jean-Pierre; Gay, Joel; Cohen-Bacrie, Claude

2012-05-01

To evaluate intra- and interobserver reproducibility of shear wave elastography (SWE) for breast masses. For intraobserver reproducibility, each observer obtained three consecutive SWE images of 758 masses that were visible on ultrasound. 144 (19%) were malignant. Weighted kappa was used to assess the agreement of qualitative elastographic features; the reliability of quantitative measurements was assessed by intraclass correlation coefficients (ICC). For the interobserver reproducibility, a blinded observer reviewed images and agreement on features was determined. Mean age was 50 years; mean mass size was 13 mm. Qualitatively, SWE images were at least reasonably similar for 666/758 (87.9%). Intraclass correlation for SWE diameter, area and perimeter was almost perfect (ICC ≥ 0.94). Intraobserver reliability for maximum and mean elasticity was almost perfect (ICC = 0.84 and 0.87) and was substantial for the ratio of mass-to-fat elasticity (ICC = 0.77). Interobserver agreement was moderate for SWE homogeneity (κ = 0.57), substantial for qualitative colour assessment of maximum elasticity (κ = 0.66), fair for SWE shape (κ = 0.40), fair for B-mode mass margins (κ = 0.38), and moderate for B-mode mass shape (κ = 0.58), orientation (κ = 0.53) and BI-RADS assessment (κ = 0.59). SWE is highly reproducible for assessing elastographic features of breast masses within and across observers. SWE interpretation is at least as consistent as that of BI-RADS ultrasound B-mode features. • Shear wave ultrasound elastography can measure the stiffness of breast tissue • It provides a qualitatively and quantitatively interpretable colour-coded map of tissue stiffness • Intraobserver reproducibility of SWE is almost perfect while interobserver reproducibility of SWE proved to be moderate to substantial • The most reproducible SWE features between observers were SWE image homogeneity and maximum elasticity.
Concordance between (99m)Tc-ECD SPECT and 18F-FDG PET interpretations in patients with cognitive disorders diagnosed according to NIA-AA criteria.

PubMed

Ito, Kimiteru; Shimano, Yasumasa; Imabayashi, Etsuko; Nakata, Yasuhiro; Omachi, Yoshie; Sato, Noriko; Arima, Kunimasa; Matsuda, Hiroshi

2014-10-01

The purpose of this study was to clarify the concordance of diagnostic abilities and interobserver agreement between 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) and brain perfusion single photon-emission computed tomography (SPECT) in patients with Alzheimer's disease (AD) who were diagnosed according to the research criteria of the National Institute of Aging-Alzheimer's Association Workshop. Fifty-five patients with "AD and mild cognitive impairment (MCI)" (n = 40) and "non-AD" (n = 15) were evaluated with 18F-FDG PET and (99m)Tc-ethyl cysteinate dimer (ECD) SPECT during an 8-week period. Three radiologists independently graded the regional uptake in the frontal, temporal, parietal, and occipital lobes as well as the precuneus/posterior cingulate cortex in both images. Kappa values were used to determine the interobserver reliability regarding regional uptake. The regions with better interobserver reliability between 18F-FDG PET and (99m)Tc-ECD SPECT were the frontal, parietal, and temporal lobes. The (99m)Tc-ECD SPECT agreement in the occipital lobes was not significant. The frontal, temporal, and parietal lobes showed good correlations between 18F-FDG PET and (99m)Tc-ECD SPECT in the degree of uptake, but the occipital lobe and precuneus/posterior cingulate cortex did not show good correlations. The diagnostic accuracy rates of "AD and MCI" ranged from 60% to 70% in both of the techniques. The degree of uptake on 18F-FDG PET and (99m)Tc-ECD SPECT showed significant correlations in the frontal, temporal, and parietal lobes. The diagnostic abilities of 18F-FDG PET and (99m)Tc-ECD SPECT for "AD and MCI," when diagnosed according to the National Institute of Aging-Alzheimer's Association Workshop criteria, were nearly identical. Copyright © 2014 John Wiley & Sons, Ltd.
Ultrasound detection of cartilage calcification at knee level in calcium pyrophosphate deposition disease.

PubMed

Gutierrez, Marwin; Di Geso, Luca; Salaffi, Fausto; Carotti, Marina; Girolimetti, Rita; De Angelis, Rossella; Filippucci, Emilio; Grassi, Walter

2014-01-01

To determine the sensitivity, specificity, and accuracy of ultrasound (US) in the detection of cartilage calcification at knee level in patients with calcium pyrophosphate deposition disease (CPDD) and to assess the interobserver reliability. Seventy-four CPDD patients and 83 controls with other chronic arthritis were included. All patients underwent a clinical examination, synovial fluid analysis, and radiographic assessment of the knee. US examinations were performed in order to detect hyperechoic spots within the hyaline cartilage layer and hyperechoic areas within the meniscal fibrocartilage. Twenty patients were assessed by 2 operators in order to calculate the interobserver reliability. A total of 314 knees in 157 patients (74 with CPDD, 19 with rheumatoid arthritis, 17 with spondyloarthritis, 32 with osteoarthritis, and 15 with gout) were assessed. In the 74 patients with CPDD, hyaline cartilage spots were detected by US in at least 1 knee in 44 patients (59.5%), whereas radiography detected hyaline cartilage spots in 34 patients (45.9%) (P < 0.001). Meniscal fibrocartilage calcifications were detected by US in 67 of the 74 CPDD patients (90.5%), whereas conventional radiography detected calcifications in 62 patients (83.7%) (P = 0.011). The criterion validity expressed as percentage of sensitivity, specificity, and accuracy of US in the detection of articular cartilage calcification was high. Both kappa values and overall agreement percentages showed moderate to excellent agreement. US is an accurate and reliable imaging technique in the detection of articular cartilage calcification at knee level in patients with CPDD. Copyright © 2014 by the American College of Rheumatology.
Checklist and Scoring System for the Assessment of Soft Tissue Preservation in CT Examinations of Human Mummies.

PubMed

Panzer, Stephanie; Mc Coy, Mark R; Hitzl, Wolfgang; Piombino-Mascali, Dario; Jankauskas, Rimantas; Zink, Albert R; Augat, Peter

2015-01-01

The purpose of this study was to develop a checklist for standardized assessment of soft tissue preservation in human mummies based on whole-body computed tomography examinations, and to add a scoring system to facilitate quantitative comparison of mummies. Computed tomography examinations of 23 mummies from the Capuchin Catacombs of Palermo, Sicily (17 adults, 6 children; 17 anthropogenically and 6 naturally mummified) and 7 mummies from the crypt of the Dominican Church of the Holy Spirit of Vilnius, Lithuania (5 adults, 2 children; all naturally mummified) were used to develop the checklist following previously published guidelines. The scoring system was developed by assigning equal scores for checkpoints with equivalent quality. The checklist was evaluated by intra- and inter-observer reliability. The finalized checklist was applied to compare the groups of anthropogenically and naturally mummified bodies. The finalized checklist contains 97 checkpoints and was divided into two main categories, "A. Soft Tissues of Head and Musculoskeletal System" and "B. Organs and Organ Systems", each including various subcategories. The complete checklist had an intra-observer reliability of 98% and an inter-observer reliability of 93%. Statistical comparison revealed significantly higher values in anthropogenically compared to naturally mummified bodies for the total score and for three subcategories. In conclusion, the developed checklist allows for a standardized assessment and documentation of soft tissue preservation in whole-body computed tomography examinations of human mummies. The scoring system facilitates a quantitative comparison of the soft tissue preservation status between single mummies or mummy collections.
Ultrasound as an Outcome Measure in Gout. A Validation Process by the OMERACT Ultrasound Working Group.

PubMed

Terslev, Lene; Gutierrez, Marwin; Schmidt, Wolfgang A; Keen, Helen I; Filippucci, Emilio; Kane, David; Thiele, Ralf; Kaeley, Gurjit; Balint, Peter; Mandl, Peter; Delle Sedie, Andrea; Hammer, Hilde Berner; Christensen, Robin; Möller, Ingrid; Pineda, Carlos; Kissin, Eugene; Bruyn, George A; Iagnocco, Annamaria; Naredo, Esperanza; D'Agostino, Maria Antonietta

2015-11-01

To summarize the work performed by the Outcome Measures in Rheumatology (OMERACT) Ultrasound (US) Working Group on the validation of US as a potential outcome measure in gout. Based on the lack of definitions, highlighted in a recent literature review on US as an outcome tool in gout, a series of iterative exercises were carried out to obtain consensus-based definitions on US elementary components in gout using a Delphi exercise and subsequently testing these definitions in static images and in patients with proven gout. Cohen's κ was used to test agreement, and values of 0-0.20 were considered poor, 0.20-0.40 fair, 0.40-0.60 moderate, 0.60-0.80 good, and 0.80-1 excellent. With an agreement of > 80%, consensus-based definitions were obtained for the 4 elementary lesions highlighted in the literature review: tophi, aggregates, erosions, and double contour (DC). In static images interobserver reliability ranged from moderate to almost perfect, and similar results were found for the intrareader reliability. In patients the intraobserver agreement was good for all lesions except DC (moderate). The interobserver agreement was poor for aggregates and DC but moderate for the other components. These first steps in evaluating the validity of US as an outcome measure for gout show that the reliability of the definitions ranged from moderate to excellent in static images and somewhat lower in patients, indicating that a standardized scanning technique may be needed, before testing the responsiveness of those definitions in a composite US score.
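The agreement bands quoted in this abstract can be turned into a small calculation. The following is a minimal sketch of Cohen's κ for two raters, with a helper that maps a value to the abstract's labels. The function names, the example ratings, the choice to assign a boundary value (such as exactly 0.20) to the lower band, and the labelling of negative values as "poor" are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items (unweighted)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which both raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category only")
    return (observed - expected) / (1 - expected)


def kappa_label(kappa):
    """Map kappa to the bands quoted in the abstract (boundary handling assumed)."""
    if kappa <= 0.20:
        return "poor"  # assumption: negative values are also labelled poor
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Hypothetical readings of four static images by two observers.
    reader_1 = ["tophus", "tophus", "none", "none"]
    reader_2 = ["tophus", "none", "none", "none"]
    k = cohen_kappa(reader_1, reader_2)
    print(k, kappa_label(k))
```

With more than two readers, as in the working group's exercises, a multi-rater statistic or averaged pairwise κ would be needed; this sketch covers the two-rater case only.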
A Comparison of Reliability Measures for Continuous and Discontinuous Recording Methods: Inflated Agreement Scores with Partial Interval Recording and Momentary Time Sampling for Duration Events

ERIC Educational Resources Information Center

Rapp, John T.; Carroll, Regina A.; Stangeland, Lindsay; Swanson, Greg; Higgins, William J.

2011-01-01

The authors evaluated the extent to which interobserver agreement (IOA) scores, using the block-by-block method for events scored with continuous duration recording (CDR), were higher when the data from the same sessions were converted to discontinuous methods. Sessions with IOA scores of 89% or less with CDR were rescored using 10-s partial…
Publishing nutrition research: validity, reliability, and diagnostic test assessment in nutrition-related research.

PubMed

Gleason, Philip M; Harris, Jeffrey; Sheean, Patricia M; Boushey, Carol J; Bruemmer, Barbara

2010-03-01

This is the sixth in a series of monographs on research design and analysis. The purpose of this article is to describe and discuss several concepts related to the measurement of nutrition-related characteristics and outcomes, including validity, reliability, and diagnostic tests. The article reviews the methodologic issues related to capturing the various aspects of a given nutrition measure's reliability, including test-retest, inter-item, and interobserver or inter-rater reliability. Similarly, it covers content validity, indicators of absolute vs relative validity, and internal vs external validity. With respect to diagnostic assessment, the article summarizes the concepts of sensitivity and specificity. The hope is that dietetics practitioners will be able to both use high-quality measures of nutrition concepts in their research and recognize these measures in research completed by others. Copyright 2010 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Smartphone photography utilized to measure wrist range of motion.

PubMed

Wagner, Eric R; Conti Mica, Megan; Shin, Alexander Y

2018-02-01

The purpose was to determine if smartphone photography is a reliable tool in measuring wrist movement. Smartphones were used to take digital photos of both wrists in 32 normal participants (64 wrists) at extremes of wrist motion. The smartphone measurements were compared with clinical goniometry measurements. There was a very high correlation between the clinical goniometry and smartphone measurements, as the concordance coefficients were high for radial deviation, ulnar deviation, wrist extension and wrist flexion. The Pearson coefficients also demonstrated the high precision of the smartphone measurements. The Bland-Altman plots demonstrated 29-31 of 32 smartphone measurements were within the 95% confidence interval of the clinical measurements for all positions of the wrists. There was high reliability between the photography taken by the volunteer and researcher, as well as high inter-observer reliability. Smartphone digital photography is a reliable and accurate tool for measuring wrist range of motion. Level of evidence: II.
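A Bland-Altman comparison of two measurement methods, as used in this abstract, can be sketched as follows. The sketch assumes that the "95% confidence interval" mentioned above corresponds to the conventional 95% limits of agreement (mean difference ± 1.96 standard deviations of the differences); the abstract does not state this. The function names and the angle values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits are the conventional 95% limits of agreement:
    mean difference +/- 1.96 * sample SD of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


def count_within_limits(method_a, method_b):
    """Count paired differences that fall inside the limits of agreement."""
    _, lower, upper = bland_altman(method_a, method_b)
    return sum(lower <= a - b <= upper for a, b in zip(method_a, method_b))


if __name__ == "__main__":
    # Hypothetical wrist flexion angles in degrees (not data from the study).
    smartphone = [61, 60, 59, 66]
    goniometer = [60, 62, 58, 65]
    print(bland_altman(smartphone, goniometer))
    print(count_within_limits(smartphone, goniometer))
```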
The Influence of the Manner of Performing the Thyroid Ultrasound Examination on the Reliability of the Assessment of the Thyroid Size in School-Aged Children.

PubMed

Zygmunt, Arkadiusz; Adamczewski, Zbigniew; Zygmunt, Agnieszka; Karbownik-Lewinska, Malgorzata; Lewinski, Andrzej

2017-01-01

Goitre incidence in school-aged children evaluated using ultrasonography is one of the essential indicators of iodine intake in a given area. The aim of the study was to examine what the difference is between the volume of the thyroid gland measured in the supine and sitting position and to determine the intra-observer, inter-observer, and inter-position variations. The survey was conducted among 87 children (56 girls and 31 boys aged 7-13 years, mean age 10.44 ± 1.72 years). The thyroid volume measured in a sitting position was significantly lower than that measured in the supine position. The intra-observer variations for the total thyroid volume equalled 9.56-9.65%. The inter-observer variations were significantly higher and amounted to 34.5-35.7%. The way in which ultrasound evaluation is performed is important for the analysis of the results. It is crucial to aim for the smallest inter-observer variation, which can be achieved by strictly defining the methods of the thyroid measurement and comparing one's measuring techniques with the reference method. The use of standards in ultrasound evaluation performed in the supine position, as well as the use of standards without a strict determination of the study method, can lead to erroneous conclusions. © 2017 S. Karger AG, Basel.
Fatty degeneration of the rotator cuff muscles on pre- and postoperative CT arthrography (CTA): is the Goutallier grading system reliable?

PubMed

Lee, Eugene; Choi, Jung-Ah; Oh, Joo Han; Ahn, Soyeon; Hong, Sung Hwan; Chai, Jee Won; Kang, Heung Sik

2013-09-01

To retrospectively evaluate fatty degeneration (FD) of rotator cuff muscles on CTA using Goutallier's grading system and quantitative measurements with comparison between pre- and postoperative states. IRB approval was obtained for this study. Two radiologists independently reviewed pre- and postoperative CTAs of 43 patients (24 males and 19 females, mean age, 58.1 years) with 46 shoulders confirmed as full-thickness tears with random distribution. FD of supraspinatus, infraspinatus/teres minor, and subscapularis was assessed using Goutallier's system and by quantitative measurements of Hounsfield units (HUs) on sagittal images. Changes in FD grades and HUs were compared between pre- and postoperative CTAs and analyzed with respect to preoperative tear size and postoperative cuff integrity. The correlations between qualitative grades and quantitative measurements and their inter-observer reliabilities were also assessed. There was statistically significant correlation between FD grades and HU measurements of all muscles on pre- and postoperative CTA (p < 0.05). Inter-observer reliability of fatty degeneration grades was excellent to substantial on both pre- and postoperative CTA in supraspinatus (0.8685 and 0.8535) and subscapularis muscles (0.7777 and 0.7972), but fair in infraspinatus/teres minor muscles (0.5791 and 0.5740); however, quantitative Hounsfield unit measurements showed excellent reliability for all muscles (ICC: 0.7950 and 0.9346 for SST, 0.7922 and 0.8492 for SSC, and 0.9254 and 0.9052 for IST/TM). No muscle showed improvement of fatty degeneration after surgical repair on qualitative and quantitative assessments; there was no difference in changes of fatty degeneration after surgical repair according to preoperative tear size and postoperative cuff integrity (p > 0.05). The average dose-length product (DLP) was 365.2 mGy · cm (range, 323.8-417.2 mGy · cm) and estimated average effective dose was 5.1 mSv.
Goutallier grades correlated well with HUs of rotator cuff muscles. Reliability was excellent for both systems, except for FD grade of IST/TM muscles, which may be more reliably assessed using quantitative measurements.
Reliability and accuracy of three imaging software packages used for 3D analysis of the upper airway on cone beam computed tomography images.

PubMed

Chen, Hui; van Eijnatten, Maureen; Wolff, Jan; de Lange, Jan; van der Stelt, Paul F; Lobbezoo, Frank; Aarab, Ghizlane

2017-08-01

The aim of this study was to assess the reliability and accuracy of three different imaging software packages for three-dimensional analysis of the upper airway using CBCT images. To assess the reliability of the software packages, 15 NewTom 5G® (QR Systems, Verona, Italy) CBCT data sets were randomly and retrospectively selected. Two observers measured the volume, minimum cross-sectional area and the length of the upper airway using Amira® (Visage Imaging Inc., Carlsbad, CA), 3Diagnosys® (3diemme, Cantu, Italy) and OnDemand3D® (CyberMed, Seoul, Republic of Korea) software packages. The intra- and inter-observer reliability of the upper airway measurements were determined using intraclass correlation coefficients and Bland & Altman agreement tests. To assess the accuracy of the software packages, one NewTom 5G® CBCT data set was used to print a three-dimensional anthropomorphic phantom with known dimensions to be used as the "gold standard". This phantom was subsequently scanned using a NewTom 5G® scanner. Based on the CBCT data set of the phantom, one observer measured the volume, minimum cross-sectional area, and length of the upper airway using Amira®, 3Diagnosys®, and OnDemand3D®, and compared these measurements with the gold standard. The intra- and inter-observer reliability of the measurements of the upper airway using the different software packages were excellent (intraclass correlation coefficient ≥0.75). There was excellent agreement between all three software packages in volume, minimum cross-sectional area and length measurements. All software packages underestimated the upper airway volume by -8.8% to -12.3%, the minimum cross-sectional area by -6.2% to -14.6%, and the length by -1.6% to -2.9%. All three software packages offered reliable volume, minimum cross-sectional area and length measurements of the upper airway. The length measurements of the upper airway were the most accurate results in all software packages.
All software packages underestimated the upper airway dimensions of the anthropomorphic phantom.
A novel standardized algorithm using SPECT/CT evaluating unhappy patients after unicondylar knee arthroplasty--a combined analysis of tracer uptake distribution and component position.

PubMed

Suter, Basil; Testa, Enrique; Stämpfli, Patrick; Konala, Praveen; Rasch, Helmut; Friederich, Niklaus F; Hirschmann, Michael T

2015-03-20

The introduction of a standardized SPECT/CT algorithm including a localization scheme, which allows accurate identification of specific patterns and thresholds of SPECT/CT tracer uptake, could lead to a better understanding of the bone remodeling and specific failure modes of unicondylar knee arthroplasty (UKA). The purpose of the present study was to introduce a novel standardized SPECT/CT algorithm for patients after UKA and evaluate its clinical applicability, usefulness and inter- and intra-observer reliability. Tc-HDP-SPECT/CT images of consecutive patients (median age 65, range 48-84 years) with 21 knees after UKA were prospectively evaluated. The tracer activity on SPECT/CT was localized using a specific standardized UKA localization scheme. For tracer uptake analysis (intensity and anatomical distribution pattern) a 3D volumetric quantification method was used. The maximum intensity values were recorded for each anatomical area. In addition, ratios between the respective value in the measured area and the background tracer activity were calculated. The femoral and tibial component position (varus-valgus, flexion-extension, internal and external rotation) was determined in 3D-CT. The inter- and intraobserver reliability of the localization scheme, grading of the tracer activity and component measurements were determined by calculating the intraclass correlation coefficients (ICC). The localization scheme, grading of the tracer activity and component measurements showed high inter- and intra-observer reliabilities for all regions (tibia, femur and patella). For measurement of component position there was strong agreement between the readings of the two observers; the ICC for the orientation of the femoral component was 0.73-1.00 (intra-observer reliability) and 0.91-1.00 (inter-observer reliability). The ICC for the orientation of the tibial component was 0.75-1.00 (intra-observer reliability) and 0.77-1.00 (inter-observer reliability). 
The SPECT/CT algorithm presented combining the mechanical information on UKA component position, alignment and metabolic data is highly reliable and proved to be a valuable, consistent and useful tool for analysing postoperative knees after UKA. Using this standardized approach in clinical studies might be helpful in establishing the diagnosis in patients with pain after UKA.
Scaling digital radiographs for templating in total hip arthroplasty using conventional acetate templates independent of calibration markers.

PubMed

Brew, Christopher J; Simpson, Philip M; Whitehouse, Sarah L; Donnelly, William; Crawford, Ross W; Hubble, Matthew J W

2012-04-01

We describe a scaling method for templating digital radiographs using conventional acetate templates independent of template magnification without the need for a calibration marker. The mean magnification factor for the radiology department was determined (119.8%; range, 117%-123.4%). This fixed magnification factor was used to scale the radiographs by the method described. Thirty-two femoral heads on postoperative total hip arthroplasty radiographs were then measured and compared with the actual size. The mean absolute accuracy was within 0.5% of actual head size (range, 0%-3%) with a mean absolute difference of 0.16 mm (range, 0-1 mm; SD, 0.26 mm). Intraclass correlation coefficient showed excellent reliability for both interobserver and intraobserver measurements with intraclass correlation coefficient scores of 0.993 (95% CI, 0.988-0.996) for interobserver measurements and intraobserver measurements ranging between 0.990 and 0.993 (95% CI, 0.980-0.997). Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.
Assessment of Myometrial Invasion in Premenopausal Grade 1 Endometrial Carcinoma: Is Magnetic Resonance Imaging a Reliable Tool in Selecting Patients for Fertility-Preserving Therapy?

PubMed

Sakane, Makoto; Hori, Masatoshi; Onishi, Hiromitsu; Tsuboyama, Takahiro; Ota, Takashi; Tatsumi, Mitsuaki; Ueda, Yutaka; Kimura, Toshihiro; Kimura, Tadashi; Tomiyama, Noriyuki

The aim of this study was to evaluate the diagnostic ability of magnetic resonance imaging (MRI) in premenopausal women with G1 endometrial carcinoma. Twenty-six patients underwent T2W, diffusion-weighted, and dynamic contrast-enhanced 3-T MRI. The degree of myometrial invasion was pathologically classified as no invasion, shallow invasion (3 mm or less), or deeper invasion. Two radiologists assessed myometrial invasion on MRI. Diagnostic accuracy, sensitivity, specificity, positive and negative predictive values, AUC, and interobserver agreement were analyzed. For assessing myometrial invasion, mean accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC, respectively, were as follows: 63%, 42%, 85%, 79%, 47%, and 0.75. Mean interobserver agreement was fair (κ = 0.36). Shallow invasions were underestimated as no invasion on MRI in all 6 cases. Magnetic resonance imaging produced false-negative results in half of the patients. The misjudgments tended to happen in patients with shallow invasion.
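The diagnostic measures listed in this abstract all derive from one 2×2 table of imaging result against pathology. The following is a minimal sketch of those definitions. The function name and the counts in the example are hypothetical; they are not the study's data, which are reported only as means across two readers.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from a 2x2 table.

    tp: test positive, disease present   fp: test positive, disease absent
    fn: test negative, disease present   tn: test negative, disease absent
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("every row and column of the table needs at least one case")
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases the test detects
        "specificity": tn / (tn + fp),  # non-diseased cases the test clears
        "ppv": tp / (tp + fp),          # positive results that are correct
        "npv": tn / (tn + fn),          # negative results that are correct
        "accuracy": (tp + tn) / total,  # all results that are correct
    }


if __name__ == "__main__":
    # Hypothetical counts for one reader and 26 patients.
    for name, value in diagnostic_metrics(tp=5, fp=1, fn=7, tn=13).items():
        print(f"{name}: {value:.2f}")
```

A low sensitivity combined with a high specificity, as reported above, corresponds to a table in which false negatives outnumber false positives.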
Caregiver person-centeredness and behavioral symptoms during mealtime interactions: development and feasibility of a coding scheme.

PubMed

Gilmore-Bykovskyi, Andrea L

2015-01-01

Mealtime behavioral symptoms are distressing and frequently interrupt eating for the individual experiencing them and others in the environment. A computer-assisted coding scheme was developed to measure caregiver person-centeredness and behavioral symptoms for nursing home residents with dementia during mealtime interactions. The purpose of this pilot study was to determine the feasibility, ease of use, and inter-observer reliability of the coding scheme, and to explore the clinical utility of the coding scheme. Trained observers coded 22 observations. Data collection procedures were acceptable to participants. Overall, the coding scheme proved to be feasible, easy to execute and yielded good to very good inter-observer agreement following observer re-training. The coding scheme captured clinically relevant, modifiable antecedents to mealtime behavioral symptoms, but would be enhanced by the inclusion of measures for resident engagement and consolidation of items for measuring caregiver person-centeredness that co-occurred and were difficult for observers to distinguish. Published by Elsevier Inc.

The Infant Motor Profile: a standardized and qualitative method to assess motor behaviour in infancy.

PubMed

Heineman, Kirsten R; Bos, Arend F; Hadders-Algra, Mijna

2008-04-01

A reliable and valid instrument to assess neuromotor condition in infancy is a prerequisite for early detection of developmental motor disorders. We developed a video-based assessment of motor behaviour, the Infant Motor Profile (IMP), to evaluate motor abilities, movement variability, ability to select motor strategies, movement symmetry, and fluency. The IMP consists of 80 items and is applicable in children from 3 to 18 months. The present study aimed to test intra- and interobserver reliability and concurrent validity of the IMP with the Alberta Infant Motor Scale (AIMS) and Touwen neurological examination. The study group consisted of 40 low-risk term (median gestational age [GA] 40 wks, range 38-42 wks) and 40 high-risk preterm infants (median GA 29.6 wks, range 26-33 wks) with corrected ages 4 to 18 months (31 females, 49 males). Intra- and interobserver agreement of the IMP were satisfactory (Spearman's rho=0.9). Concurrent validity of IMP and AIMS was good (Spearman's rho=0.8, p<0.005). The IMP was able to differentiate between infants with normal neurological condition, simple minor neurological dysfunction (MND), complex MND, and abnormal neurological condition (p<0.005). This means that the IMP may be a promising tool to evaluate neurological integrity during infancy, a suggestion that needs confirmation by means of assessment of larger groups of infants with heterogeneous neurological conditions.
Interview protocols and ergonomics checklist for analysing overexertion back accidents among nursing personnel.

PubMed

Engkvist, I L; Hagberg, M; Wigaeus-Hjelm, E; Menckel, E; Ekenvall, L

1995-06-01

No documented strategy, including preventive strategies, for systematic investigation of overexertion back accidents among nursing personnel has yet been published. One aim of the present study was to develop standardized instruments for the systematic investigation of back accidents among nursing personnel in order to develop preventive strategies. Another aim was to produce a screening tool that could easily be used for identifying potential overexertion back accident hazards. Two structured interview protocols were developed, one for the injured person and one for the supervisor. An ergonomics checklist was designed for the most important spaces according to accident statistics: patient's room, corridor, toilet, and also one for 'other space', e.g. X-ray and treatment rooms. The instruments were developed by frequent discussions and adjustments in a task force of researchers and occupational health personnel. The protocols were tested in two steps before a final version was established. The construct validity and interobserver reliability of the checklist were tested by ten ergonomists, who checked a patient's room, a toilet and a corridor with some known hazards. The construct validity agreement was 90% in 19 of 26 items in the checklist. The interobserver reliability had the same figures as the validity for all items in the checklist. The interview protocols and checklist appear to be suitable for systematic investigation of overexertion back accidents.
Influence of Light Conditions and Light Sources on Clinical Measurement of Natural Teeth Color using VITA Easyshade Advance 4,0® Spectrophotometer. Pilot Study.

PubMed

Posavec, Ivona; Prpić, Vladimir; Zlatarić, Dubravka Knezović

2016-12-01

The purpose of this study was to evaluate and compare lightness (L), chroma (C) and hue (h), green-red (a) and blue-yellow (b) character of the color of maxillary right central incisors in different light conditions and light sources. Two examiners who were well trained in digital color evaluation participated in the research. Intraclass correlation coefficients (ICCs) were used to analyze intra- and interobserver reliability. The LCh and L*a*b* values were determined at 08.15 and at 10.00 in the morning under three different light conditions. Tooth color was assessed in 10 subjects using the intraoral spectrophotometer VITA Easyshade Advance 4.0® set at the central region of the vestibular surface of the measured tooth. Intra- and interobserver ICC values were high for both examiners and ranged from 0.57 to 0.99. Statistically significant differences in LCh and L*a*b* values measured at different times of the day under a given light condition were not found (p>0.05). Statistically significant differences in LCh and L*a*b* values measured under the three different light conditions were also not found (p>0.05). VITA Easyshade Advance 4.0® is reliable enough for daily clinical work in order to assess tooth color during the fabrication of esthetic appliances because it is not dependent on light conditions and light sources.
Intra- and inter-tester reliability and validity of normal finger size measurement using the Japanese ring gauge system.

PubMed

Suzuki, T; Sato, Y; Sotome, S; Arai, H; Arai, A; Yoshida, H

2017-06-01

This study was designed to investigate the reliability and validity of measurements of finger diameters with a ring gauge. A reliability study enrolled two independent samples (50 participants and seven examiners in Study I; 26 participants and 26 examiners in Study II). The sizes of each participant's little fingers were measured twice with a ring gauge by each examiner. To investigate the validity of the measurements, five hand therapists compared the finger size and hand volume of 30 participants with the ring gauge and with a figure-of-eight technique (Study III). The intra-class correlation coefficient for intra-observer reliability ranged from 0.97 to 0.99 in Study I, and 0.90 to 0.97 in Study II. The intra-class correlation coefficient for inter-observer reliability was 0.95 in Study I and 0.94 in Study II. The validity study showed a Pearson product moment correlation coefficient of 0.75. The ring gauge showed high reliability and validity for measurement of finger size. Level of evidence: III, diagnostic.
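Several records in this listing report an intraclass correlation coefficient without naming its form. As an illustration only, the sketch below computes one common variant, the two-way random-effects, absolute-agreement, single-measurement coefficient (often written ICC(2,1)), from a subjects-by-examiners table. The choice of this variant, the function name, and the ring sizes in the example are assumptions; the abstract does not say which ICC model was used.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater.

    Two-way random effects, absolute agreement, single measurement,
    computed from the mean squares of a two-way ANOVA without replication.
    """
    n = len(ratings)                       # number of subjects
    k = len(ratings[0]) if n else 0        # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a full table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_err = ss_total - ss_rows - ss_cols                     # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical ring sizes for five participants measured by two examiners.
    sizes = [[9, 9], [11, 12], [13, 13], [10, 10], [15, 14]]
    print(round(icc_2_1(sizes), 3))
```

Because this variant penalizes systematic offsets between examiners, it is generally no higher than a consistency-type ICC computed from the same table.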
A new SPECT/CT reconstruction algorithm: reliability and accuracy in clinical routine for non-oncologic bone diseases.

PubMed

Delcroix, Olivier; Robin, Philippe; Gouillou, Maelenn; Le Duc-Pennec, Alexandra; Alavi, Zarrin; Le Roux, Pierre-Yves; Abgral, Ronan; Salaun, Pierre-Yves; Bourhis, David; Querellou, Solène

2018-02-12

xSPECT Bone® (xB) is a new reconstruction algorithm developed by Siemens® in bone hybrid imaging (SPECT/CT). A CT-based tissue segmentation is incorporated into SPECT reconstruction to provide SPECT images with bone anatomy appearance. The objectives of this study were to assess xB/CT reconstruction diagnostic reliability and accuracy in comparison with Flash 3D® (F3D)/CT in clinical routine. Two hundred thirteen consecutive patients referred to the Brest Nuclear Medicine Department for non-oncological bone diseases were evaluated retrospectively. Two hundred seven SPECT/CT were included. All SPECT/CT were independently interpreted by two nuclear medicine physicians (a junior and a senior expert) with xB/CT then with F3D/CT three months later. Inter-observer agreement (IOA) and diagnostic confidence were determined using the McNemar test and the unweighted kappa coefficient. The study objectives were then re-assessed for validation through > 18 months of clinical and paraclinical follow-up. No statistically significant differences between IOA xB and IOA F3D were found (p = 0.532). Agreement for xB after categorical classification of the diagnoses was high (κ xB = 0.89 [95% CI 0.84-0.93]) but without a statistically significant difference from F3D (κ F3D = 0.90 [95% CI 0.86-0.94]). Thirty-one (14.9%) inter-reconstruction diagnostic discrepancies were observed, of which 21 (10.1%) were classified as major. The follow-up confirmed the diagnosis of F3D in 10 cases, xB in 6 cases and was non-contributory in 5 cases. The xB reconstruction algorithm was found reliable, providing high interobserver agreement and similar diagnostic confidence to F3D reconstruction in clinical routine.
Validation of the Spanish Acne Severity Scale (Escala de Gravedad del Acné Española--EGAE).

PubMed

Puig, Lluis; Guerra-Tapia, Aurora; Conejo-Mir, Julián; Toribio, Jaime; Berasategui, Carmen; Zsolt, Ilonka

2013-04-01

Several acne grading systems have been described, but consensus is lacking on which shows superiority. A standardized system would facilitate therapeutic decisions and the analysis of clinical trial data. To assess the feasibility, reliability, validity and sensitivity to change of the Spanish Acne Severity Scale (EGAE). A Spanish, multicentre, prospective, observational study was performed in patients with facial, back or chest acne assessed using EGAE, Leeds Revised Acne Grading system (LRAG) and lesion count. Clinicians answered 4 questions regarding EGAE use and time employed. Patients were evaluated at baseline and after 5±1 weeks. Four additional blinded observers, all dermatologists, evaluated patients' pictures using EGAE and LRAG. In total, 349 acne locations were assessed in 328 patients. Of the dermatologists, 95.6% (CI: 92.9-97.5%) reported that EGAE was easy to use, and 75% used it in <3 minutes. Interobserver reliability of the EGAE scale was shown by a Kendall's W of 0.773 (p<0.001). EGAE and LRAG scales showed a high correlation (Spearman's correlation>0.85; p<0.001). EGAE mean score in treatment-compliant patients was significantly lower at follow-up than at baseline (2.14 vs. 1.57, p<0.001, Cohen's d=0.35). The pre-post-treatment difference in EGAE mean score in non-compliant patients was not significant (1.44 vs. 1.32, p<0.102) and Cohen's d was lower (0.19) than in compliant patients. The use of EGAE to evaluate acne grade in daily clinical dermatological practice in Spanish centres has shown feasibility, high interobserver reliability, concurrent validity and sensitivity to detect treatment effects.
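Kendall's W, the concordance statistic reported in this abstract, summarizes how similarly several observers rank the same set of cases. The following is a minimal sketch under stated assumptions: tied scores receive average ranks, and no tie correction is applied to the denominator, so the result is conservative when an ordinal scale produces many ties. The function names and the scores are hypothetical and are not taken from the study.

```python
def average_ranks(scores):
    """Rank scores from 1..n, giving tied scores the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal scores starting at i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for a table with one row per observer, one column per case.

    Uncorrected for ties: W = 12 * S / (m^2 * (n^3 - n)).
    """
    m = len(ratings)                      # observers
    n = len(ratings[0]) if m else 0       # cases
    if m < 2 or n < 2 or any(len(row) != n for row in ratings):
        raise ValueError("need a full table with >= 2 observers and >= 2 cases")
    rank_sums = [0.0] * n
    for row in ratings:
        for case, rank in enumerate(average_ranks(row)):
            rank_sums[case] += rank
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Hypothetical severity grades given by four observers to five photographs.
    grades = [
        [1, 2, 3, 4, 2],
        [1, 2, 4, 4, 3],
        [2, 2, 3, 4, 1],
        [1, 3, 3, 4, 2],
    ]
    print(round(kendalls_w(grades), 3))
```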
Measurement of clavicular length and shortening after a midshaft clavicular fracture: Spatial digitization versus planar roentgen photogrammetry.

PubMed

Stegeman, Sylvia A; de Witte, Pieter Bas; Boonstra, Sjoerd; de Groot, Jurriaan H; Nagels, Jochem; Krijnen, Pieta; Schipper, Inger B

2016-08-01

Clavicular shortening after fracture is deemed prognostic for clinical outcome and is therefore generally assessed on radiographs. It is used for clinical decision making regarding operative or non-operative treatment in the first 2 weeks after trauma, although the reliability and accuracy of the measurements are unclear. This study aimed to assess the reliability of roentgen photogrammetry (2D) of clavicular length and shortening, and to compare these with 3D-spatial digitization measurements, obtained with an electromagnetic recording system (Flock of Birds). Thirty-two participants with a consolidated non-operatively treated two or multi-fragmented dislocated midshaft clavicular fracture were analysed. Two observers measured clavicular lengths and absolute and proportional clavicular shortening on radiographs taken before and after fracture consolidation. The clavicular lengths were also measured with spatial digitization. Inter-observer agreement on the radiographic measurements was assessed using the Intraclass Correlation Coefficient (ICC). Agreement between the radiographic and spatial digitization measurements was assessed using a Bland-Altman plot. The inter-observer agreement on clavicular length, and absolute and proportional shortening on trauma radiographs was almost perfect (ICC>0.90), but moderate for absolute shortening after consolidation (ICC=0.45). The Bland-Altman plot compared measurements of length on AP panorama radiographs with spatial digitization and showed that planar roentgen photogrammetry resulted in up to 37 mm longer and 34 mm shorter measurements than spatial digitization. Measurements of clavicular length on radiographs are highly reliable between observers, but may not reflect the actual length and shortening of the clavicle when compared to length measurements with spatial digitization. We recommend using proportional shortening when measuring clavicular length or shortening on radiographs for clinical decision making.
Copyright © 2015 Elsevier Ltd. All rights reserved.
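The Bland-Altman comparison named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `bland_altman` and the sample values are hypothetical, and the 1.96 multiplier assumes the paired differences are roughly normally distributed.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- 1.96 * sample SD of those differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical clavicular lengths in mm (not data from the study).
radiograph = [150.0, 162.0, 148.0, 171.0, 155.0]
digitized = [146.0, 158.0, 151.0, 160.0, 157.0]
bias, lower, upper = bland_altman(radiograph, digitized)
print(f"bias={bias:.1f} mm, limits of agreement=({lower:.1f}, {upper:.1f}) mm")
```

Wide limits of agreement alongside a high inter-observer ICC would match the pattern the abstract reports: observers agree with each other, yet the radiographic method can still differ substantially from the reference method.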
Calcification at orifices of aortic arch branches is a reliable and significant marker of stenosis at carotid bifurcation and intracranial arteries.

PubMed

Yamada, Shigeki; Hashimoto, Kenji; Ogata, Hideki; Watanabe, Yoshihiko; Oshima, Marie; Miyake, Hidenori

2014-02-01

A simple rating scale for calcification in the cervical arteries and the aortic arch on multi-detector computed tomography angiography (MDCTA) was evaluated for reliability and validity. Additionally, we investigated which location is most representative for evaluating the calcification risk of carotid bifurcation stenosis and atherosclerotic infarction in the overall cervical arteries, from the aortic arch to the carotid bifurcation. The extent of calcification in the aortic arch and cervical arteries of 518 patients (292 men, 226 women) was evaluated using a 4-point grading scale for MDCTA. Reliability, validity and the concomitant risk with vascular stenosis and atherosclerotic infarction were assessed. Calcification was most frequently observed in the aortic arch itself, the orifices from the aortic arch, and the carotid bifurcation. Compared with the bilateral carotid bifurcations, the aortic arch itself had a stronger inter-observer agreement for the calcification score (Fleiss' kappa coefficient, 0.77), but weaker associations with stenosis and atherosclerotic infarction. Calcification at the orifices of the aortic arch branches had a stronger inter-observer agreement (0.74) and sufficient associations with carotid bifurcation stenosis and intracranial stenosis. In addition, extensive calcification at the orifices from the aortic arch was significantly associated with atherosclerotic infarction, similar to calcification at the bilateral carotid bifurcations. The orifices of the aortic arch branches were a novel representative location of the aortic arch and overall cervical arteries for evaluating the calcification extent. Thus, calcification at the aortic arch should be evaluated with a focus on the orifices of the 3 main branches. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
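For readers unfamiliar with the Fleiss' kappa statistic reported above, the following is a minimal sketch of the standard formula for several raters assigning subjects to categories. The function name `fleiss_kappa`, the input layout, and the example grades are assumptions made for illustration; the abstract does not describe how the authors computed the coefficient.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned subject i to category j.

    Every subject must be rated by the same number of raters (>= 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per subject, then averaged over subjects.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: 4 patients, 3 raters, 4-point calcification grade.
grades = [
    [3, 0, 0, 0],  # all three raters chose grade 0
    [0, 2, 1, 0],
    [0, 0, 3, 0],
    [0, 0, 1, 2],
]
print(round(fleiss_kappa(grades), 3))
```

Note that this treats the grades as unordered categories, so a disagreement between grades 0 and 3 counts the same as one between grades 0 and 1.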
Elliptical broken line method for calculating capillary density in nailfold capillaroscopy: Proposal and evaluation.

PubMed

Karbalaie, Abdolamir; Abtahi, Farhad; Fatemi, Alimohammad; Etehadtavakol, Mahnaz; Emrani, Zahra; Erlandsson, Björn-Erik

2017-09-01

Nailfold capillaroscopy is a practical method for identifying and obtaining morphological changes in capillaries which might reveal relevant information about diseases and health. Capillaroscopy is harmless, and seems simple and repeatable. However, there is a lack of established guidelines and instructions for acquisition as well as for the interpretation of the obtained images, which might lead to various ambiguities. In addition, assessment and interpretation of the acquired images are very subjective. In an attempt to overcome some of these problems, a new modified technique for assessment of nailfold capillary density is introduced in this study. The new method is named elliptic broken line (EBL); it extends the two previously known methods by defining clear criteria for finding the apex of capillaries in different scenarios using a fitted ellipse. A graphical user interface (GUI) was developed for pre-processing, manual assessment of capillary apexes and automatic correction of selected apexes based on the 90° rule. Intra- and inter-observer reliability of EBL and corrected EBL is evaluated in this study. Four independent observers familiar with capillaroscopy performed the assessment for 200 nailfold videocapillaroscopy images, from healthy subjects and systemic lupus erythematosus patients, in two different sessions. The results show elevation from moderate (ICC=0.691) and good (ICC=0.753) agreements to good (ICC=0.750) and good (ICC=0.801) for intra- and inter-observer reliability after automatic correction of EBL. This clearly shows the potential of this method to improve the reliability and repeatability of assessment, which motivates further development of an automatic tool for the EBL method. Copyright © 2017 Elsevier Inc. All rights reserved.
Is it better to include necrosis in apparent diffusion coefficient (ADC) measurements? The necrosis/wall ADC ratio to differentiate malignant and benign necrotic lung lesions: Preliminary results.

PubMed

Karaman, Adem; Durur-Subasi, Irmak; Alper, Fatih; Durur-Karakaya, Afak; Subasi, Mahmut; Akgun, Metin

2017-10-01

To determine whether the use of necrosis/wall apparent diffusion coefficient (ADC) ratios in the differentiation of necrotic lung lesions is more reliable than measuring the wall alone. In this retrospective study, a total of 76 patients (54 males and 22 females, 71% vs. 29%, with a mean age of 53 ± 18 years, range, 18-84) were enrolled, 33 of whom had lung carcinoma and 43 had a benign necrotic lung lesion. A 3T scanner was used. The calculation of the necrosis/wall ADC ratio was based on ADC values measured from necrosis and the wall of the lesions by diffusion-weighted imaging (DWI). Statistical analyses were performed with the independent samples t-test and receiver operating characteristic analysis. Intraobserver and interobserver reliability were calculated for ADC values of wall and necrosis. The mean necrosis/wall ADC ratio was 1.67 ± 0.23 for malignant lesions and 0.75 ± 0.19 for benign lung lesions (P < 0.001). To estimate malignancy the area under the curve (AUC) values for necrosis ADC, wall ADC, and the necrosis/wall ADC ratio were 0.720, 0.073, and 0.997, respectively. A wall/necrosis ADC ratio cutoff value of 1.12 demonstrated a 100% sensitivity and 98% specificity in the estimation of malignancy. Positive predictive value was 100%, negative predictive value 98%, and diagnostic accuracy 99%. There was good intraobserver and interobserver reliability for wall and necrosis. The necrosis/wall ADC ratio appears to be a more reliable and promising tool for discriminating lung carcinoma from benign necrotic lung lesions than measuring the wall alone. Level of Evidence: 4 Technical Efficacy: Stage 2 J. Magn. Reson. Imaging 2017;46:1001-1006. © 2017 International Society for Magnetic Resonance in Medicine.
Readability of arthroscopy-related patient education materials from the American Academy of Orthopaedic Surgeons and Arthroscopy Association of North America Web sites.

PubMed

Yi, Paul H; Ganta, Abhishek; Hussein, Khalil I; Frank, Rachel M; Jawa, Andrew

2013-06-01

We sought to assess the readability levels of arthroscopy-related patient education materials available on the Web sites of the American Academy of Orthopaedic Surgeons (AAOS) and the Arthroscopy Association of North America (AANA). We identified all articles related to arthroscopy available in 2012 from the online patient education libraries of AAOS and AANA. After performing follow-up editing, we assessed each article with the Flesch-Kincaid (FK) readability test. Mean readability levels of the articles from the AAOS Web site and the AANA Web site were compared. We also determined the number of articles with readability levels at or below the eighth-grade level (the average reading ability of the US adult population) and sixth-grade level (the widely recommended level for patient education materials). Intraobserver reliability and interobserver reliability of FK grade assessment were evaluated. A total of 62 articles were reviewed (43 from AAOS and 19 from AANA). The mean overall FK grade level was 10.2 (range, 5.2 to 12). The AAOS articles had a mean FK grade level of 9.6 (range, 5.2 to 12), whereas the AANA articles had a mean FK grade level of 11.4 (range, 8.7 to 12); the difference was significant (P < .0001). Only 3 articles had a readability level at or below the eighth-grade level and only 1 was at or below the sixth-grade level; all were from AAOS. Intraobserver reliability and interobserver reliability were excellent (intraclass correlation coefficient of 1 for both). Online patient education materials related to arthroscopy from AAOS and AANA may be written at a level too difficult for a large portion of the patient population to comprehend. Copyright © 2013 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
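The Flesch-Kincaid grade level used in the abstract above follows a published formula based on average sentence length and average syllables per word. The sketch below applies that formula to pre-counted totals; the function names and the example counts and grades are hypothetical, and the study's own tooling (including how it counted syllables) is not described in the abstract.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level from raw text counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


def count_at_or_below(grades, threshold):
    """Number of articles whose grade level is at or below the threshold."""
    return sum(1 for grade in grades if grade <= threshold)


# Hypothetical article: 100 words, 5 sentences, 150 syllables.
print(round(flesch_kincaid_grade(100, 5, 150), 2))

# Hypothetical grade levels for a handful of articles.
grades = [5.2, 7.9, 9.6, 11.4, 12.0]
print(count_at_or_below(grades, 8), count_at_or_below(grades, 6))
```

Because the formula depends only on counts, the reliability of a manual assessment hinges on consistent text preparation before counting, which may be why the authors report follow-up editing and reliability checks.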
Reliability of Corneal Dynamic Scheimpflug Analyser Measurements in Virgin and Post-PRK Eyes

PubMed Central

Chen, Xiangjun; Stojanovic, Aleksandar; Hua, Yanjun; Eidet, Jon Roger; Hu, Di; Wang, Jingting; Utheim, Tor Paaske

2014-01-01

Purpose: To determine the measurement reliability of CorVis ST, a dynamic Scheimpflug analyser, in virgin and post-photorefractive keratectomy (PRK) eyes and compare the results between these two groups. Methods: Forty virgin eyes and 42 post-PRK eyes underwent CorVis ST measurements performed by two technicians. Repeatability was evaluated by comparing three consecutive measurements by technician A. Reproducibility was determined by comparing the first measurement by technician A with one performed by technician B. Intraobserver and interobserver intraclass correlation coefficients (ICCs) were calculated. Univariate analysis of covariance (ANCOVA) was used to compare measured parameters between virgin and post-PRK eyes. Results: The intraocular pressure (IOP), central corneal thickness (CCT) and 1st applanation time demonstrated good intraobserver repeatability and interobserver reproducibility (ICC ≥ 0.90) in virgin and post-PRK eyes. The deformation amplitude showed a good or close to good repeatability and reproducibility in both groups (ICC ≥ 0.88). The CCT correlated positively with 1st applanation time (r = 0.437 and 0.483, respectively, p<0.05) and negatively with deformation amplitude (r = −0.384 and −0.375, respectively, p<0.05) in both groups. Compared to post-PRK eyes, virgin eyes showed longer 1st applanation time (7.29±0.21 vs. 6.96±0.17 ms, p<0.05) and lower deformation amplitude (1.06±0.07 vs. 1.17±0.08 mm, p<0.05). Conclusions: CorVis ST demonstrated reliable measurements for CCT, IOP, and 1st applanation time, as well as relatively reliable measurement for deformation amplitude in both virgin and post-PRK eyes. There were differences in 1st applanation time and deformation amplitude between virgin and post-PRK eyes, which may reflect corneal biomechanical changes occurring after the surgery in the latter. PMID:25302580
Comparative Validity and Reproducibility Study of Various Landmark-Oriented Reference Planes in 3-Dimensional Computed Tomographic Analysis for Patients Receiving Orthognathic Surgery

PubMed Central

Lin, Hsiu-Hsia; Chuang, Ya-Fang; Weng, Jing-Ling; Lo, Lun-Jou

2015-01-01

Background: Three-dimensional computed tomographic imaging has become popular in clinical evaluation, treatment planning, surgical simulation, and outcome assessment for maxillofacial intervention. The purposes of this study were to investigate whether there is any correlation among landmark-based horizontal reference planes and to validate the reproducibility and reliability of landmark identification. Materials and Methods: Preoperative and postoperative cone-beam computed tomographic images of patients who had undergone orthognathic surgery were collected. Landmark-oriented reference planes including the Frankfort horizontal plane (FHP) and the lateral semicircular canal plane (LSP) were established. Four FHPs were defined by selecting 3 points from the orbitale, porion, or midpoint of paired points. The LSP passed through both the lateral semicircular canal points and nasion. The distances between the maxillary or mandibular teeth and the reference planes were measured, and the differences between the 2 sides were calculated and compared. The precision in locating the landmarks was evaluated by performing repeated tests, and the intraobserver reproducibility and interobserver reliability were assessed. Results: A total of 30 patients with facial deformity and malocclusion (10 patients with facial symmetry, 10 patients with facial asymmetry, and 10 patients with cleft lip and palate) were recruited. Comparing the differences among the 5 reference planes showed no statistically significant difference among all patient groups. Regarding intraobserver reproducibility, the mean differences in the 3 coordinates varied from 0 to 0.35 mm, with correlation coefficients between 0.96 and 1.0, showing high correlation between repeated tests. Regarding interobserver reliability, the mean differences among the 3 coordinates varied from 0 to 0.47 mm, with correlation coefficients between 0.88 and 1.0, exhibiting high correlation between the different examiners.
Conclusions: The 5 horizontal reference planes were reliable and comparable for 3D craniomaxillofacial analysis. These reference planes were useful in standardizing the orientation of 3D skull models. PMID:25668209
Reliability of length measurements collected by community nurses and health volunteers in rural growth monitoring and promotion services.

PubMed

Laar, Matilda E; Marquis, Grace S; Lartey, Anna; Gray-Donald, Katherine

2018-02-17

Length measurements are important in growth monitoring and promotion (GMP) for the surveillance of a child's weight-for-length and length-for-age. These two indices provide an indication of a child's risk of becoming wasted or stunted, and are more informative about a child's growth than the widely used weight-for-age index (underweight). Although the introduction of length measurements in GMP is recommended by the World Health Organization, concerns about the reliability of length measurements collected in rural outreach settings have been expressed by stakeholders. Our aim was to describe the reliability and challenges associated with community health personnel measuring length for rural outreach GMP activities. Two reliability studies (A and B), each using 10 children aged less than 24 months, were conducted in the GMP services of a rural district in Ghana. Fifteen nurses and 15 health volunteers (HV) with no prior experience in length measurements were trained. Intra- and inter-observer technical error of measurement (TEM), average bias from an expert anthropometrist, and coefficient of reliability (R) of length measurements were assessed and compared across sessions. Observations and interviews were used to understand the ability and experiences of health personnel with measuring length at outreach GMP. Inter-observer TEM was larger than intra-observer TEM for both nurses and HV at both sessions and was unacceptably high (compared to error standards) in both groups at both time points. Average biases from the expert's measurements were within acceptable limits; however, both groups tended to underestimate length measurements. The R for lengths collected by nurses (92.3%) was higher at session B compared to that of HV (87.5%). Length measurements taken by nurses and HV, and those taken by an experienced anthropometrist at GMP sessions were of moderate agreement (kappa = 0.53, p < 0.0001).
The reliability of length measurements improved after two refresher trainings for nurses but not for HV. In addition, length measurements taken during GMP sessions may be susceptible to errors due to overburdened health personnel and crowded GMP clinics. There is a need for both pre- and in-service training of nurses and HV on length measurements and procedures to improve reliability of length measurements.
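The technical error of measurement (TEM) and coefficient of reliability (R) mentioned above are commonly computed as shown in this sketch. It covers only the simplest case of two measurements per child, using the widely cited formulas TEM = sqrt(Σd² / 2N) and R = 1 − TEM² / SD²; the function names and sample lengths are hypothetical, and the abstract does not state which variant the authors used for more than two observers.

```python
from math import sqrt
from statistics import variance


def tem_two_measurements(first, second):
    """TEM for paired measurements: sqrt(sum(d^2) / (2 * N))."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty series")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return sqrt(squared_diffs / (2 * len(first)))


def coefficient_of_reliability(tem, measurements):
    """R = 1 - TEM^2 / SD^2, where SD^2 is the sample variance of all
    measurements across subjects."""
    return 1 - tem ** 2 / variance(measurements)


# Hypothetical lengths in cm for three children, measured twice.
first_pass = [70.0, 65.0, 80.0]
second_pass = [70.2, 64.8, 80.0]
tem = tem_two_measurements(first_pass, second_pass)
r = coefficient_of_reliability(tem, first_pass + second_pass)
print(f"TEM={tem:.3f} cm, R={r:.4f}")
```

R is the share of between-subject variance that is not attributable to measurement error, so values such as the 92.3% and 87.5% reported above can be read on a 0 to 1 scale as 0.923 and 0.875.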
The Integration of Research in Judgment and Decision Theory

DTIC Science & Technology

1980-07-01

off at any one of a series of choice points in a basically linear, unidimensional, all-or-none series of relays is at least in part the result of the...Subjective and objective referents. An objective referent requires a series of observations in which inter-observer reliabilities approximate unity; as... series of studies by Brehmer (1980). More generally, research as far back as that of Krechevsky’s in the 1930s was conducted precisely to show that
[Automated procedure for volumetric measurement of metastases: estimation of tumor burden].

PubMed

Fabel, M; Bolte, H

2008-09-01

Cancer is a common and increasing disease worldwide. Therapy monitoring in oncologic patient care requires accurate and reliable measurement methods for evaluation of the tumor burden. RECIST (response evaluation criteria in solid tumors) and WHO criteria are still the current standards for therapy response evaluation, with inherent disadvantages due to considerable interobserver variation of the manual diameter estimations. Volumetric analysis of, e.g., lung, liver and lymph node metastases promises to be a more accurate, precise and objective method for tumor burden estimation.
[Interobserver reliability of the Glasgow coma scale in critically ill patients with neurological and/or neurosurgical disease].

PubMed

Sánchez-Sánchez, M M; Sánchez-Izquierdo, R; Sánchez-Muñoz, E I; Martínez-Yegles, I; Fraile-Gamo, M P; Arias-Rivera, S

2014-01-01

The Glasgow coma scale (GCS) is a common tool used for neurological assessment of critically ill patients. Despite its widespread use, the GCS has some limitations, as different observers may sometimes rate the same response differently. To evaluate the interobserver agreement, among intensive care nurses with a minimum of 3 years' experience, both in the overall estimate of the GCS and for each of its components. Prospective observational study including 110 neurological and/or neurosurgical patients conducted in an 18-bed critical care unit, from October 2010 until December 2012. Registered variables: demographic characteristics, reason for admission, overall GCS and its components. The neurological evaluation was conducted by a minimum of 3 nurses. One of them applied an algorithm and consensual assessment technique, and all of them independently rated the response to stimuli. Interobserver agreement was measured using the intraclass correlation coefficient (ICC) with a 95% confidence interval (CI). The study was approved by the Ethics Committee for Clinical Trials. The intraclass correlation coefficients (confidence intervals) for the scale were: overall GCS: 0.989 (0.985-0.992); ocular response: 0.981 (0.974-0.986); verbal response: 0.971 (0.960-0.979); motor response: 0.987 (0.982-0.991). In our cohort of patients we observed a high level of consistency in the application of both the overall GCS and each of its components. Copyright © 2013 Elsevier España, S.L. y SEEIUC. All rights reserved.
Double sac sign and intradecidual sign in early pregnancy: interobserver reliability and frequency of occurrence.

PubMed

Doubilet, Peter M; Benson, Carol B

2013-07-01

To assess the interobserver agreement, frequency of occurrence, and prognostic importance of the double sac sign (DSS), intradecidual sign (IDS), and other sonographic findings in early intrauterine pregnancies. We retrospectively identified all sonograms obtained between January 1, 2006, and December 31, 2011, in which: (1) the scan demonstrated an intrauterine fluid collection without a yolk sac or embryo; (2) a follow-up scan confirmed an intrauterine pregnancy; and (3) the first-trimester outcome was known. Each coinvestigator characterized the 199 study sonograms as demonstrating or not demonstrating a DSS or an IDS, based on judgment about whether the scan met published criteria defining these signs. Interobserver agreement was poor for the DSS (κ = 0.24) and IDS (κ = 0.23). Scans frequently demonstrated neither sign: 150 cases (75.4%) if we considered a sign to be present when both investigators graded it as present and 69 cases (34.7%) using the looser criterion that either graded it as present. The presence of a DSS or an IDS was unrelated to the β-human chorionic gonadotropin (β-hCG) value (P > .05, t test, all comparisons). An inner echogenic ring was present in 158 cases (79.4%), and the decidua was brighter peripherally than centrally in 102 (51.3%). The first-trimester outcome was unrelated to the presence of a DSS or an IDS, presence of an inner echogenic ring, or decidual appearance (P > .05, χ² test, all comparisons). The sonographic appearance of early gestational sacs, before visualization of a yolk sac or embryo, is highly variable. The DSS and IDS are often absent; there is poor interobserver agreement regarding these signs; and the prognosis is unrelated to their presence or absence. A round or oval intrauterine fluid collection in a woman with positive β-hCG should be treated as a gestational sac until proven otherwise, regardless of whether it demonstrates a DSS or an IDS.
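The κ values above summarize agreement between two readers on a present/absent call. A minimal sketch of Cohen's kappa for two raters is given below; the function name `cohen_kappa` and the example ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one label")
    return (observed - expected) / (1 - expected)


# Hypothetical sign-present (1) / sign-absent (0) calls on 8 scans.
reader_1 = [1, 1, 0, 0, 1, 0, 1, 0]
reader_2 = [1, 0, 0, 1, 1, 0, 0, 0]
print(round(cohen_kappa(reader_1, reader_2), 2))
```

Kappa corrects raw percentage agreement for chance, which is why two readers can agree on many scans and still produce a low κ when one label dominates.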
Intra- and interobserver agreement in the classification and treatment of distal third clavicle fractures.

PubMed

Bishop, Julie Y; Jones, Grant L; Lewis, Brian; Pedroza, Angela

2015-04-01

In treatment of distal third clavicle fractures, the Neer classification system, based on the location of the fracture in relation to the coracoclavicular ligaments, has traditionally been used to determine fracture pattern stability. To determine the intra- and interobserver reliability in the classification of distal third clavicle fractures via standard plain radiographs and the intra- and interobserver agreement in the preferred treatment of these fractures. Cohort study (Diagnosis); Level of evidence, 3. Thirty radiographs of distal clavicle fractures were randomly selected from patients treated for distal clavicle fractures between 2006 and 2011. The radiographs were distributed to 22 shoulder/sports medicine fellowship-trained orthopaedic surgeons. Fourteen surgeons responded and took part in the study. The evaluators were asked to measure the size of the distal fragment, classify the fracture pattern as stable or unstable, assign the Neer classification, and recommend operative versus nonoperative treatment. The radiographs were reordered and redistributed 3 months later. Inter- and intrarater agreement was determined for the distal fragment size, stability of the fracture, Neer classification, and decision to operate. Single variable logistic regression was performed to determine what factors could most accurately predict the decision for surgery. Interrater agreement was fair for distal fragment size, moderate for stability, fair for Neer classification, slight for type IIB and III fractures, and moderate for treatment approach. Intrarater agreement was moderate for distal fragment size categories (κ = 0.50, P < .001) and Neer classification (κ = 0.42, P < .001) and substantial for stable fracture (κ = 0.65, P < .001) and decision to operate (κ = 0.65, P < .001). Fracture stability was the best predictor of treatment, with 89% accuracy (P < .001). Fracture stability determination and the decision to operate had the highest interobserver agreement. 
Fracture stability was the key determinant of treatment, rather than the Neer classification system or the size of the distal fragment. © 2015 The Author(s).

The Power of Flash Mob Research: Conducting a Nationwide Observational Clinical Study on Capillary Refill Time in a Single Day.

PubMed

Alsma, Jelmer; van Saase, Jan L C M; Nanayakkara, Prabath W B; Schouten, W E M Ineke; Baten, Anique; Bauer, Martijn P; Holleman, Frits; Ligtenberg, Jack J M; Stassen, Patricia M; Kaasjager, Karin H A H; Haak, Harm R; Bosch, Frank H; Schuit, Stephanie C E

2017-05-01

Capillary refill time (CRT) is a clinical test used to evaluate the circulatory status of patients; various methods are available to assess CRT. Conventional clinical research often demands large numbers of patients, making it costly, labor-intensive, and time-consuming. We studied the interobserver agreement on CRT in a nationwide study by using a novel method of research called flash mob research (FMR). Physicians in the Netherlands were recruited by using word-of-mouth referrals, conventional media, and social media to participate in a nationwide, single-day, "nine-to-five," multicenter, cross-sectional, observational study to evaluate CRT. Patients aged ≥ 18 years presenting to the ED or who were hospitalized were eligible for inclusion. CRT was measured independently (by two investigators) at the patient's sternum and distal phalanx after application of pressure for 5 s (5s) and 15 s (15s). On October 29, 2014, a total of 458 investigators in 38 Dutch hospitals enrolled 1,734 patients. The mean CRT measured at the distal phalanx was 2.3 s (5s, SD 1.1) and 2.4 s (15s, SD 1.3). The mean CRT measured at the sternum was 2.6 s (5s, SD 1.1) and 2.7 s (15s, SD 1.1). Interobserver agreement was higher for the distal phalanx (κ value, 0.40) than for the sternum (κ value, 0.30). Interobserver agreement on CRT is, at best, moderate. CRT measured at the distal phalanx yielded higher interobserver agreement compared with sternal CRT measurements. FMR proved a valuable instrument to investigate a relatively simple clinical question in an inexpensive, quick, and reliable manner. Copyright © 2016 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.
Describing Peripancreatic Collections According to the Revised Atlanta Classification of Acute Pancreatitis: An International Interobserver Agreement Study.

PubMed

Bouwense, Stefan A; van Brunschot, Sandra; van Santvoort, Hjalmar C; Besselink, Marc G; Bollen, Thomas L; Bakker, Olaf J; Banks, Peter A; Boermeester, Marja A; Cappendijk, Vincent C; Carter, Ross; Charnley, Richard; van Eijck, Casper H; Freeny, Patrick C; Hermans, John J; Hough, David M; Johnson, Colin D; Laméris, Johan S; Lerch, Markus M; Mayerle, Julia; Mortele, Koenraad J; Sarr, Michael G; Stedman, Brian; Vege, Santhi Swaroop; Werner, Jens; Dijkgraaf, Marcel G; Gooszen, Hein G; Horvath, Karen D

2017-08-01

Severe acute pancreatitis is associated with peripancreatic morphologic changes as seen on imaging. Uniform communication regarding these morphologic findings is crucial for accurate diagnosis and treatment. For the original 1992 Atlanta classification, interobserver agreement is poor. We hypothesized that for the revised Atlanta classification, interobserver agreement will be better. An international, interobserver agreement study was performed among expert and nonexpert radiologists (n = 14), surgeons (n = 15), and gastroenterologists (n = 8). Representative computed tomographies of all stages of acute pancreatitis were selected from 55 patients and were assessed according to the revised Atlanta classification. The interobserver agreement was calculated among all reviewers and subgroups, that is, expert and nonexpert reviewers; interobserver agreement was defined as poor (≤0.20), fair (0.21-0.40), moderate (0.41-0.60), good (0.61-0.80), or very good (0.81-1.00). Interobserver agreement among all reviewers was good (0.75 [standard deviation, 0.21]) for describing the type of acute pancreatitis and good (0.62 [standard deviation, 0.19]) for the type of peripancreatic collection. Expert radiologists showed the best and nonexpert clinicians the lowest interobserver agreement. Interobserver agreement was good for the revised Atlanta classification, supporting the importance of widespread adoption of this revised classification for clinical and research communications.
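The agreement bands defined in the abstract above translate directly into a small lookup. The sketch below is illustrative only: the function name `agreement_label` is hypothetical, and treating each band's upper bound as inclusive (so a value such as 0.205 falls into "fair") is an assumption, since the abstract lists the bands to two decimals.

```python
def agreement_label(kappa):
    """Map an agreement coefficient to the bands quoted in the abstract:
    poor (<=0.20), fair (0.21-0.40), moderate (0.41-0.60),
    good (0.61-0.80), very good (0.81-1.00)."""
    if kappa > 1:
        raise ValueError("agreement coefficients cannot exceed 1")
    # (inclusive upper bound, label) pairs, checked in ascending order.
    bands = [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good")]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    return "very good"


# The two overall values reported in the abstract both fall in "good".
for value in (0.75, 0.62):
    print(value, agreement_label(value))
```

Different studies in this listing use different band names and cut-offs, so a label from one abstract should not be compared directly with a label from another.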
Reliability of gadolinium-enhanced magnetic resonance imaging findings and their correlation with clinical outcome in patients with sciatica.

PubMed

el Barzouhi, Abdelilah; Vleggeert-Lankamp, Carmen L A M; Lycklama à Nijeholt, Geert J; Van der Kallen, Bas F; van den Hout, Wilbert B; Koes, Bart W; Peul, Wilco C

2014-11-01

Gadolinium-enhanced magnetic resonance imaging (Gd-MRI) is often performed in the evaluation of patients with persistent sciatica after lumbar disc surgery. However, correlation between enhancement and clinical findings is debated, and limited data are available regarding the reliability of enhancement findings. To evaluate the reliability of Gd-MRI findings and their correlation with clinical findings in patients with sciatica. Prospective observational evaluation of patients who were enrolled in a randomized trial with 1-year follow-up. Patients with 6- to 12-week sciatica, who participated in a multicentre randomized clinical trial comparing an early surgery strategy with prolonged conservative care with surgery if needed. In total, 204 patients underwent Gd-MRI at baseline and after 1 year. Patients were assessed by means of the Roland Disability Questionnaire (RDQ) for sciatica, visual analog scale (VAS) for leg pain, and patient-reported perceived recovery at 1 year. Kappa coefficients were used to assess interobserver reliability. Magnetic resonance imaging findings were correlated to the outcome measures using the Mann-Whitney U test for continuous data and Fisher exact tests for categorical data. Poor-to-moderate agreement was observed regarding Gd enhancement of the herniated disc and compressed nerve root (kappa<0.41), which was in contrast with excellent interobserver agreement of the disc level of the herniated disc and compressed nerve root (kappa>0.95). Of the 59 patients with an enhancing herniated disc at 1 year, 86% reported recovery compared with 100% of the 12 patients with nonenhancing herniated discs (p=.34). Of the 12 patients with enhancement of the most affected nerve root at 1 year, 83% reported recovery compared with 85% of the 192 patients with no enhancement (p=.69).
Patients with and without enhancing herniated discs or nerve roots at 1 year reported comparable outcomes on RDQ and VAS-leg pain. Reliability of Gd-MRI findings was poor-to-moderate and no correlation was observed between enhancement and clinical findings at 1-year follow-up. Copyright © 2014 Elsevier Inc. All rights reserved.
New definitions of 6 clinical signs of perceptual disorder in children with cerebral palsy: an observational study through reliability measures.

PubMed

Ferrari, A; Sghedoni, A; Alboresi, S; Pedroni, E; Lombardi, F

2014-12-01

Recently, authors have begun to emphasize the non-motor aspects of cerebral palsy and their influence on motor control and recovery prognosis. Much has been written about single clinical signs (e.g., startle reaction), but so far no definitions of the six perceptual signs presented in this study have appeared in the literature. This study defines 6 signs (startle reaction, upper limbs in startle position, frequent eye blinking, posture freezing, averted eye gaze, grimacing) suggestive of perceptual disorders in children with cerebral palsy and measures agreement on sign recognition among independent observers and consistency of opinions over time. Observational study with both cross-sectional and prospective components. Fifty-six videos were presented to observers in random order. Videos were taken from 19 children with a bilateral form of cerebral palsy referred to the Children Rehabilitation Unit in Reggio Emilia. Thirty-five rehabilitation professionals from all over Italy: 9 doctors and 26 physiotherapists. Measure of agreement among 35 independent observers was compiled from a sample of 56 videos. Interobserver reliability was determined using the kappa (K) index of Fleiss, and intraobserver reliability was calculated by the Spearman rank correlation index (rho, ρ). Percentage of agreement between observers and the gold standard was used as criterion validity. Interobserver reliability was moderate for startle reaction, upper limbs in startle position, averted eye gaze and eye blinking, and fair for posture freezing and grimacing. Intraobserver reliability remained consistent over time. Criterion validity revealed very high agreement between independent observer evaluation and the gold standard. Semiotics of perceptual disorders can be used as a specific and sensitive instrument in order to identify a new class of patients within existing heterogeneous clinical types of bilateral cerebral palsy forms and could help clinicians in identifying functional prognosis. 
To provide clinicians with a definition of 6 clinical signs found in children with cerebral palsy in routine rehabilitation settings. Future research should explore the link between these signs and motor prognosis (e.g., time to independent walking).
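The abstract above reports agreement among many observers using Fleiss' kappa. As a minimal sketch of how that statistic is computed, assuming each video is rated by the same number of observers and that ratings are tallied per category, the function below works on a small count table. The function name `fleiss_kappa` and the toy data are hypothetical and are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j (same number of raters per subject)."""
    if not counts:
        raise ValueError("counts must contain at least one subject")
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total_ratings = n_subjects * n_raters
    # Share of all ratings that fell into each category.
    p_category = [sum(row[j] for row in counts) / total_ratings
                  for j in range(n_categories)]
    # Per-subject agreement: fraction of rater pairs that agree on that subject.
    p_subject = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
                 for row in counts]
    p_observed = sum(p_subject) / n_subjects
    p_chance = sum(p * p for p in p_category)
    if p_chance == 1.0:
        return float("nan")  # undefined when every rating is in one category
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: 4 videos, 5 observers, sign rated "present" or "absent".
toy_counts = [[5, 0], [4, 1], [1, 4], [0, 5]]
print(round(fleiss_kappa(toy_counts), 3))
```

The input is a count table, not raw per-observer labels, because Fleiss' kappa does not require the same individuals to rate every subject, only the same number of ratings per subject.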
Interobserver Variation in Response Evaluation Criteria in Solid Tumors 1.1.

PubMed

Karmakar, Arunabha; Kumtakar, Apeksha; Sehgal, Himanshu; Kumar, Savith; Kalyanpur, Arjun

2018-06-19

Response Evaluation Criteria in Solid Tumors (RECIST 1.1) is the gold standard for imaging response evaluation in cancer trials. We sought to evaluate consistency of applying RECIST 1.1 between 2 conventionally trained radiologists, designated as A and B; identify reasons for variation; and reconcile these differences for future studies. The study was approved as an institutional quality check exercise. Since no identifiable patient data was collected or used, a waiver of informed consent was granted. Imaging case report forms of a concluded multicentric breast cancer trial were retrospectively reviewed. Cohen's kappa was used to rate interobserver agreement in Response Evaluation Data (target response, nontarget response, new lesions, overall response). Significant variations were reassessed by a senior radiologist to extrapolate reasons for disagreement. Methods to improve agreement were similarly ascertained. Sixty-one cases with a total of 82 data-pairs were evaluated (35 data-pairs in visit 5, 47 in visit 9). Both radiologists showed moderate agreement in target response (n = 82; κ = 0.477; 95% confidence interval [CI]: 0.314-0.640), nontarget response (n = 82; κ = 0.578; 95% CI: 0.213-0.944) and overall response evaluation in both visits (n = 82; κ = 0.510; 95% CI: 0.344-0.676). Further assessment demonstrated a "prevalence effect" of kappa in some cases, which led to underestimation of agreement. Percent agreement of overall response was 74.39% while percent variation was 25.6%. Differences in interpreting RECIST 1.1 and in radiological image interpretation were the primary sources of variation. The commonest overall response was "Partial Response" (Rad A: 45/82; Rad B: 63/82). In spite of moderate interobserver agreement, qualitative interpretation differences in some cases increased interobserver variability. 
Protocols such as adjudication, which reduce easily avoidable inconsistencies, are or should be a part of the standard operating procedure in imaging institutions. Based on our findings, a standard checklist has been developed to help reduce the interpretation error-margin for future studies. Such checklists may improve interobserver agreement in the preadjudication phase, thereby improving quality of results and reducing the adjudication per case ratio. Improving data reliability when using RECIST 1.1 will reflect in better cancer clinical trial outcomes. A checklist can be of use to imaging centers to assess and improve their own processes. Copyright © 2018. Published by Elsevier Inc.
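The abstract above reports both percent agreement and Cohen's kappa, and mentions a "prevalence effect" that lowers kappa. The sketch below illustrates how two readers can have the same percent agreement and very different kappa values once one category dominates. The function name `cohen_kappa` and both data sets are hypothetical illustrations, not the trial data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Return (percent_agreement, kappa) for two raters' paired categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of cases where both raters chose the same category.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if p_chance == 1.0:
        return p_observed, float("nan")  # kappa is undefined when chance agreement is 1
    return p_observed, (p_observed - p_chance) / (1 - p_chance)


# Hypothetical balanced data: both categories equally common, 80 of 100 cases agree.
balanced_a = ["PR"] * 50 + ["SD"] * 50
balanced_b = ["PR"] * 40 + ["SD"] * 10 + ["PR"] * 10 + ["SD"] * 40

# Hypothetical skewed data: "PR" dominates, and again 80 of 100 cases agree.
skewed_a = ["PR"] * 85 + ["SD"] * 15
skewed_b = ["PR"] * 75 + ["SD"] * 10 + ["PR"] * 10 + ["SD"] * 5

print(cohen_kappa(balanced_a, balanced_b))
print(cohen_kappa(skewed_a, skewed_b))
```

In the skewed set the agreement expected by chance is much higher, so the same observed agreement yields a lower kappa. This is one reason to report percent agreement alongside kappa, as the abstract does.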
RELIABILITY AND VALIDITY OF SUBJECTIVE ASSESSMENT OF LUMBAR LORDOSIS IN CONVENTIONAL RADIOGRAPHY.

PubMed

Ruhinda, E; Byanyima, R K; Mugerwa, H

2014-10-01

Reliability and validity studies of different lumbar curvature analysis and measurement techniques have been documented; however, there is limited literature on the reliability and validity of subjective visual analysis. Radiological assessment of the lumbar lordotic curve aids in early diagnosis of conditions even before neurologic changes set in. To ascertain the level of reliability and validity of subjective assessment of lumbar lordosis in conventional radiography. A blinded, repeated-measures diagnostic test was carried out on lumbar spine x-ray radiographs. Radiology Department at Joint Clinical Research Centre (JCRC), Mengo-Kampala-Uganda. Seventy (70) lateral lumbar x-ray films were used for this study and were obtained from the archive of JCRC radiology department at Butikiro house, Mengo-Kampala. Poor observer agreement, both inter- and intra-observer, was found, with kappa values of 0.16. Inter-observer agreement was poorer than intra-observer agreement. Kappa values significantly rose when the lumbar lordosis was clustered into four categories without grading each abnormality. The results confirm that subjective assessment of lumbar lordosis has low reliability and validity. Film quality has limited influence on the observer reliability. This study further shows that fewer scale categories of lordosis abnormalities produce better observer reliability.
Reliability of the Robinson classification for displaced comminuted midshaft clavicular fractures.

PubMed

Stegeman, Sylvia A; Fernandes, Nicole C; Krijnen, Pieta; Schipper, Inger B

2015-01-01

This study aimed to assess the reliability of the Robinson classification for displaced comminuted midshaft fractures. A total of 102 surgeons and 52 radiologists classified 15 displaced comminuted midshaft clavicular fractures on anteroposterior (AP) and 30-degree caudocephalad radiographs twice. For both surgeons and radiologists, inter-observer and intra-observer agreement significantly improved after showing the 30-degree caudocephalad view in addition to the AP view. Radiologists had significantly higher inter- and intra-observer agreement than surgeons after judging both radiographs (multirater κ of 0.81 vs. 0.56; intra-observer κ of 0.73 vs. 0.44). We advise using two-plane radiography and routinely incorporating the Robinson classification in radiology reports. Copyright © 2015 Elsevier Inc. All rights reserved.
Are photographic records reliable for orthodontic screening?

PubMed

Mandall, N A

2002-06-01

The aim of the study was to evaluate the reliability of a panel of orthodontists for accepting new patient referrals based on clinical photographs. Eight orthodontists from Greater Manchester, Lancashire, Chester, and Derbyshire observed clinical photographs of 40 consecutive new patients attending the orthodontic department, Hope Hospital, Salford. They recorded whether or not they would accept the patient, as a new patient referral, in their department. Each consultant was asked to take into account factors, such as oral hygiene, dental development, and severity of the malocclusion. Kappa statistic for multiple-rater agreement and kappa statistic for intra-observer reliability were calculated. Inter-observer panel agreement for accepting new patient referrals based on photographic information was low (multiple rater kappa score 0.37). Intra-examiner agreement was better (kappa range 0.34-0.90). Clinician agreement for screening and accepting orthodontic referrals based on clinical photographs is comparable to that previously reported for other clinical decision making.
Measuring the suffering of end-stage dementia: reliability and validity of the Mini-Suffering State Examination.

PubMed

Aminoff, Bechor Z; Purits, Elena; Noy, Shlomo; Adunsky, Abraham

2004-01-01

Assessment of suffering is extremely important in dying end-stage dementia patients (ESDP). We have developed and examined the reliability and validity of the Mini-Suffering State Examination (MSSE), in 103 consecutive bedridden ESDP. Main outcome measures included inter-observer reliability and concurrent validity. Reliability of the MSSE questionnaire was satisfactory, with Cronbach alpha values of 0.735 and 0.718 for the two physicians (Ph-1, Ph-2), respectively. The kappa agreement coefficient was 0.791. There was a high agreement for seven items (kappa 0.882-0.972) and a substantial agreement for the other three items (kappa 0.621-0.682) of the MSSE. MSSE was validated versus the comfort assessment in dying with dementia (CAD-EOLD) scale and resulted in a significant Pearson correlation (r=-0.796, P<0.001). We conclude that the MSSE scale is a reliable and valid clinical tool, recommended for evaluating the severity of the patient's condition and the level of suffering of ESDP. Use of MSSE may improve medical management and facilitate communication between patients and caregivers.
Stress examination of flexor tendon pulley rupture in the crimp grip position: a 1.5-Tesla MRI cadaver study.

PubMed

Bayer, Thomas; Fries, Simon; Schweizer, Andreas; Schöffl, Isabelle; Janka, Rolf; Bongartz, Georg

2015-01-01

The objectives of this study were the evaluation of flexor tendon pulley rupture of the fingers in the crimp grip position using magnetic resonance imaging (MRI) and the comparison of the results with MRI in the neutral position in a cadaver study. MRI in the crimp grip position and in the neutral position was performed in 21 cadaver fingers with artificially created flexor tendon pulley tears (combined pulley rupture, n = 14; single pulley rupture, n = 7). Measurement of the distance between the tendon and bone was performed. Images were evaluated by two readers, first independently and in cases of discrepancy in consensus. Sensitivity and specificity for detecting combined pulley ruptures were calculated. Tendon bone distances were significantly higher in the crimp grip position than in the neutral position. Sensitivity and specificity for detecting combined pulley rupture were 92.86 % and 100 % respectively in the crimp grip position and 78.57 % and 85.71 % respectively in the neutral position. Kappa values for interobserver reliability were 0.87 in the crimp grip position and 0.59 in the neutral position. MRI examination in the crimp grip position results in higher tendon bone distances by subjecting the pulleys to a higher strain, which facilitates image evaluation with higher interobserver reliability, higher sensitivity, and higher specificity for combined pulley rupture compared with examination in the neutral position.
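The sensitivity and specificity figures in the abstract above can be reproduced from simple counts. The sketch below assumes that the 14 combined ruptures are the positive cases and the 7 single ruptures the negative cases. The counts passed in (13 of 14 and 7 of 7 in the crimp grip position, 11 of 14 and 6 of 7 in the neutral position) are back-calculated from the reported percentages and are not stated in the abstract. The function name is hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as fractions between 0 and 1."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Counts inferred from the reported percentages (an assumption, not study data).
crimp = sensitivity_specificity(true_pos=13, false_neg=1, true_neg=7, false_pos=0)
neutral = sensitivity_specificity(true_pos=11, false_neg=3, true_neg=6, false_pos=1)

print([round(100 * value, 2) for value in crimp])
print([round(100 * value, 2) for value in neutral])
```

With only 14 positive and 7 negative specimens, one additional missed or misclassified finger shifts these percentages by about 7 and 14 points respectively, which is worth bearing in mind when comparing the two positions.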
Evaluation of Renal Oxygenation Level Changes after Water Loading Using Susceptibility-Weighted Imaging and T2* Mapping.

PubMed

Ding, Jiule; Xing, Wei; Wu, Dongmei; Chen, Jie; Pan, Liang; Sun, Jun; Xing, Shijun; Dai, Yongming

2015-01-01

To assess the feasibility of susceptibility-weighted imaging (SWI) while monitoring changes in renal oxygenation level after water loading. Thirty-two volunteers (age, 28.0 ± 2.2 years) were enrolled in this study. SWI and multi-echo gradient echo sequence-based T2(*) mapping were used to cover the kidney before and after water loading. Cortical and medullary parameters were measured using small regions of interest, and their relative changes due to water loading were calculated based on baseline and post-water loading data. An intraclass correlation coefficient analysis was used to assess inter-observer reliability of each parameter. A receiver operating characteristic curve analysis was conducted to compare the performance of the two methods for detecting renal oxygenation changes due to water loading. Both medullary phase and medullary T2(*) values increased after water loading (p < 0.001), although poor correlations were found between the phase changes and the T2(*) changes (p > 0.05). Interobserver reliability was excellent for the T2(*) values, good for SWI cortical phase values, and moderate for the SWI medullary phase values. The area under receiver operating characteristic curve of the SWI medullary phase values was 0.85 and was not different from the medullary T2(*) value (0.84). Susceptibility-weighted imaging enabled monitoring changes in the oxygenation level in the medulla after water loading, and may allow comparable feasibility to detect renal oxygenation level changes due to water loading compared with that of T2(*) mapping.
Reliability of macroscopic grading of intervertebral disk degeneration in dogs by use of the Thompson system and comparison with low-field magnetic resonance imaging findings.

PubMed

Bergknut, Niklas; Grinwis, Guy; Pickee, Emile; Auriemma, Edoardo; Lagerstedt, Anne-Sofie; Hagman, Ragnvi; Hazewinkel, Herman A W; Meij, Björn P

2011-07-01

To evaluate the reliability of the Thompson system for use in grading the gross pathological changes of intervertebral disk (IVD) degeneration in dogs and to investigate the agreement between gross pathological findings and low-field (0.2-T) magnetic resonance imaging (MRI) findings. Vertebral columns from cadavers of 19 dogs of various ages, breeds, and origins. 182 intervertebral segments were collected from 19 canine cadavers. Sagittal T2-weighted MRI of the T11 through S1 portion of the vertebral column was performed within 24 hours after the dogs were euthanized. The vertebral columns were subsequently divided in the midsagittal plane, and high-resolution photographs were obtained of each intervertebral segment (end plate-disk-end plate). The MRI images and photographs were graded separately in a blinded manner by 4 observers who used both Pfirrmann and Thompson grading criteria. The interobserver agreement for Thompson scores ranged from 0.76 to 0.88, and the intraobserver agreement ranged from 0.88 to 0.94 (Cohen weighted κ analysis). Agreement between scores for the Pfirrmann and Thompson grading criteria was κ = 0.70. Grading of IVD degeneration in dogs by use of the Thompson system resulted in high interobserver and intraobserver agreement, and scores for the Thompson system had substantial agreement with low-field MRI findings graded by use of the Pfirrmann system. This suggested that low-field MRI can be used to diagnose IVD degeneration in dogs.
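For ordinal grades such as those in the abstract above, a weighted kappa gives partial credit when two observers differ by only one grade. The abstract does not state which weighting scheme was used, so the sketch below assumes linear weights. The function name, the coding of grades as integers starting at 0, and the example data are all hypothetical.

```python
def linear_weighted_kappa(grades_a, grades_b, n_categories):
    """Cohen's weighted kappa with linear disagreement weights |i - j|.
    Grades are integers from 0 to n_categories - 1."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("grades must be non-empty and of equal length")
    n = len(grades_a)
    # Cross-tabulate the two observers' grades.
    observed = [[0] * n_categories for _ in range(n_categories)]
    for a, b in zip(grades_a, grades_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_categories))
                  for j in range(n_categories)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j)  # larger grade gaps count as larger disagreements
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_totals[i] * col_totals[j] / n
    if expected_disagreement == 0:
        return float("nan")  # undefined when all grades fall in a single category
    return 1 - observed_disagreement / expected_disagreement


# Hypothetical grades for 6 disk segments on a five-grade scale coded 0 to 4.
observer_1 = [0, 1, 2, 3, 4, 2]
observer_2 = [0, 1, 3, 3, 4, 2]
print(round(linear_weighted_kappa(observer_1, observer_2, 5), 3))
```

Quadratic weights, `(i - j) ** 2`, are a common alternative and penalize large grade gaps more heavily.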
A validation of the new definition of drug-resistant epilepsy by the International League Against Epilepsy.

PubMed

Téllez-Zenteno, Jose F; Hernández-Ronquillo, Lizbeth; Buckley, Samantha; Zahagun, Ricardo; Rizvi, Syed

2014-06-01

To establish applicability, the recently proposed International League Against Epilepsy (ILAE) consensus on drug-resistant epilepsy (DRE) requires testing in clinical and research settings. This study evaluates the reliability and validity of these criteria in a clinical population. In phase I, two independent evaluators reviewed 97 randomly selected medical records of patients with epilepsy at two separate intervals. Both ILAE consensus and standard diagnostic criteria were employed. Kappa, weighted kappa, and intraclass correlation coefficient (ICC) were used to determine interobserver and intraobserver variability. In phase II, ILAE consensus criteria were applied to 250 patients with epilepsy to determine risk factors associated with development of DRE and to calculate point prevalence. The interobserver agreement of the four definitions was as follows: Berg (0.56), Kwan and Brodie (0.58), Camfield and Camfield (0.69), and ILAE (0.77). The intraobserver agreement of the four definitions was as follows: Berg (0.81), Kwan and Brodie (0.82), Camfield and Camfield (0.72), and ILAE (0.82). The prevalence of DRE was as follows: Berg, 28.4%; Kwan and Brodie, 34%; Camfield and Camfield, 37%; and ILAE, 33%. This is the first study to establish reliability and validity of ILAE criteria for the diagnosis of DRE. This new definition compares favorably with previously established constructs, which continue to retain clinical significance. Wiley Periodicals, Inc. © 2014 International League Against Epilepsy.
Influence of Light Conditions and Light Sources on Clinical Measurement of Natural Teeth Color using VITA Easyshade Advance 4.0® Spectrophotometer. Pilot Study.

PubMed Central

Posavec, Ivona; Prpić, Vladimir

2016-01-01

Objectives: The purpose of this study was to evaluate and compare lightness (L), chroma (C) and hue (h), green-red (a) and blue-yellow (b) character of the color of maxillary right central incisors in different light conditions and light sources. Materials and methods: Two examiners who were well trained in digital color evaluation participated in the research. Intraclass correlation coefficients (ICCs) were used to analyze intra- and interobserver reliability. The LCh and L*a*b* values were determined at 08.15 and at 10.00 in the morning under three different light conditions. Tooth color was assessed in 10 subjects using the intraoral spectrophotometer VITA Easyshade Advance 4.0® set at the central region of the vestibular surface of the measured tooth. Results: Intra- and interobserver ICC values were high for both examiners and ranged from 0.57 to 0.99. Statistically significant differences in LCh and L*a*b* values measured at different times of the day and under a given light condition were not found (p>0.05). Statistically significant differences in LCh and L*a*b* values measured under three different light conditions were not found either (p>0.05). Conclusions: VITA Easyshade Advance 4.0® is reliable enough for daily clinical work in order to assess tooth color during the fabrication of esthetic appliances because it is not dependent on light conditions and light sources. PMID:28275281
Interobserver reliability of video recording in the diagnosis of nocturnal frontal lobe seizures.

PubMed

Vignatelli, Luca; Bisulli, Francesca; Provini, Federica; Naldi, Ilaria; Pittau, Francesca; Zaniboni, Anna; Montagna, Pasquale; Tinuper, Paolo

2007-08-01

Nocturnal frontal lobe seizures (NFLS) show one or all of the following semeiological patterns: (1) paroxysmal arousals (PA: brief and sudden recurrent motor paroxysmal behavior); (2) hyperkinetic seizures (HS: motor attacks with complex dyskinetic features); (3) asymmetric bilateral tonic seizures (ATS: motor attacks with dystonic features); (4) epileptic nocturnal wanderings (ENW: stereotyped, prolonged ambulatory behavior). To estimate the interobserver reliability (IR) of video-recording diagnosis in patients with suspected NFLS among sleep medicine experts, epileptologists, and trainees in sleep medicine. Sixty-six patients with suspected NFLS were included. All underwent nocturnal video-polysomnographic recording. Six doctors (three experts and three trainees) independently classified each case as "NFLS ascertained" (according to the above specified subtypes: PA, HS, ATS, ENW) or "NFLS excluded". IR was calculated by means of Kappa statistics, and interpreted according to the standard classification (0.0-0.20 = slight agreement; 0.21-0.40 = fair; 0.41-0.60 = moderate; 0.61-0.80 = substantial; 0.81-1.00 = almost perfect). The observed raw agreement ranged from 63% to 79% between each pair of raters; the IR ranged from "moderate" (kappa = 0.50) to "substantial" (kappa = 0.72). A major source of variance was the disagreement in distinguishing between PA and nonepileptic arousals, without differences in the level of agreement between experts and trainees. Among sleep experts and trainees, IR of diagnosis of NFLS, based on videotaped observation of sleep phenomena, is not satisfactory. Explicit video-polysomnographic criteria for the classification of paroxysmal sleep motor phenomena are needed.
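The abstract above interprets kappa with a fixed set of bands (0.0-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect). A minimal sketch of that lookup follows. The function name is hypothetical, the handling of negative values is an assumption because the quoted scale starts at 0, and values that fall between two printed bounds (for example 0.205) are assigned to the lower band here.

```python
# Upper bound of each band, as quoted in the abstract.
KAPPA_BANDS = [
    (0.20, "slight"),
    (0.40, "fair"),
    (0.60, "moderate"),
    (0.80, "substantial"),
    (1.00, "almost perfect"),
]


def interpret_kappa(kappa):
    """Map a kappa value to the agreement label used in the abstract."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        # Assumption: the quoted scale does not cover negative values.
        return "below chance (not covered by the quoted scale)"
    for upper_bound, label in KAPPA_BANDS:
        if kappa <= upper_bound:
            return label
    return "almost perfect"  # unreachable for valid input, kept for clarity


# The lowest and highest pairwise values reported in the abstract.
print(interpret_kappa(0.50))
print(interpret_kappa(0.72))
```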
Diagnosis of long head of biceps tendinopathy in rotator cuff tear patients: correlation of imaging and arthroscopy data.

PubMed

Rol, Morgane; Favard, Luc; Berhouet, Julien

2018-06-01

The goal of this prospective study was to assess the reliability of pre-operative cross-sectional imaging for the diagnosis of long head of biceps (LHB) tendinopathy in patients with a rotator cuff tear. Cross-sectional imaging with MRI or CT arthrography data from 25 patients operated upon because of a rotator cuff tear between 1 October 2015 and 1 April 2016 was analysed by one experienced orthopaedic surgeon, one experienced radiologist and one orthopaedic resident. The analysis consisted of determining whether the LHB was present, the extrinsic tendon abnormalities (dislocation, tendon coverage) and intrinsic abnormalities (fraying, inflammation, degeneration). These findings were then compared to intra-operative arthroscopy findings, which were used as the benchmark. The interobserver correlation between the three different examiners for the cross-sectional imaging analysis as well as the correlation between the imaging and arthroscopy data were determined. The correlation between the imaging and arthroscopy data was the highest (80%) for the determination of LHB dislocation from the bicipital groove. The other diagnostic elements (subluxation, coverage and tendon degeneration) were difficult to discern with preoperative imaging, and correlated poorly with the arthroscopy findings (45% to 65%). The interobserver correlation was moderate to strong for the diagnosis of extrinsic tendon abnormalities. It was low to moderate for intrinsic abnormalities. Except for LHB dislocation, pre-operative imaging is not sufficient to make a reliable diagnosis of LHB tendinopathy. Arthroscopy remains the gold standard for the management of LHB tendinopathy, as diagnosed intra-operatively.
How do emergency physicians interpret prescription narcotic history when assessing patients presenting to the emergency department with pain?

PubMed

Grover, Casey A; Garmel, Gus M

2012-01-01

Narcotics are frequently prescribed in the Emergency Department (ED) and are increasingly abused. Prescription monitoring programs affect prescribing by Emergency Physicians (EPs), yet little is known about how EPs interpret prescription records. To assess how EPs interpret prescription narcotic history for patients in the ED with painful conditions. Design/Main Outcome Measures: We created an anonymous survey of EPs consisting of fictitious cases of patients presenting to the ED with back pain. For each case, we provided a prescription history that varied in the number of narcotic prescriptions, prescribing physicians, and narcotic potency. Respondents rated how likely they thought each patient was drug seeking, and how likely they thought that the prescription history would change their prescribing behavior. We calculated κ values to evaluate interobserver reliability of physician assessment of drug-seeking behavior. We collected 59 responses (response rate = 70%). Respondents most suspected drug seeking in patients with greater than 6 prescriptions per month or greater than 6 prescribing physicians in 2 months. Medication potency did not affect physician interpretation of drug seeking. Respondents reported that access to a prescription history would change their prescribing practice in all cases. κ values for assessment of drug seeking demonstrated moderate agreement. A greater number of prescriptions and a greater number of prescribing physicians in the prescription record increased suspicion for drug seeking. EPs believed that access to prescription history would change their prescribing behavior, yet interobserver reliability in the assessment of drug seeking was moderate.
Quantitative facial asymmetry: using three-dimensional photogrammetry to measure baseline facial surface symmetry.

PubMed

Taylor, Helena O; Morrison, Clinton S; Linden, Olivia; Phillips, Benjamin; Chang, Johnny; Byrne, Margaret E; Sullivan, Stephen R; Forrest, Christopher R

2014-01-01

Although symmetry is hailed as a fundamental goal of aesthetic and reconstructive surgery, our tools for measuring this outcome have been limited and subjective. With the advent of three-dimensional photogrammetry, surface geometry can be captured, manipulated, and measured quantitatively. Until now, few normative data existed with regard to facial surface symmetry. Here, we present a method for reproducibly calculating overall facial symmetry and present normative data on 100 subjects. We enrolled 100 volunteers who underwent three-dimensional photogrammetry of their faces in repose. We collected demographic data on age, sex, and race and subjectively scored facial symmetry. We calculated the root mean square deviation (RMSD) between the native and reflected faces, reflecting about a plane of maximum symmetry. We analyzed the interobserver reliability of the subjective assessment of facial asymmetry and the quantitative measurements and compared the subjective and objective values. We also classified areas of greatest asymmetry as localized to the upper, middle, or lower facial thirds. This cluster of normative data was compared with a group of patients with subtle but increasing amounts of facial asymmetry. We imaged 100 subjects by three-dimensional photogrammetry. There was a poor interobserver correlation between subjective assessments of asymmetry (r = 0.56). There was a high interobserver reliability for quantitative measurements of facial symmetry RMSD calculations (r = 0.91-0.95). The mean RMSD for this normative population was found to be 0.80 ± 0.24 mm. Areas of greatest asymmetry were distributed as follows: 10% upper facial third, 49% central facial third, and 41% lower facial third. Precise measurement permitted discrimination of subtle facial asymmetry within this normative group and distinguished norms from patients with subtle facial asymmetry, with placement of RMSDs along an asymmetry ruler. 
Facial surface symmetry, which is poorly assessed subjectively, can be easily and reproducibly measured using three-dimensional photogrammetry. The RMSD for facial asymmetry of healthy volunteers clusters at approximately 0.80 ± 0.24 mm. Patients with facial asymmetry due to a pathologic process can be differentiated from normative facial asymmetry based on their RMSDs.
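The abstract above measures asymmetry as the root mean square deviation (RMSD) between a face and its mirror image. The sketch below shows the idea on a small 3D point set. It makes several simplifying assumptions that the abstract does not confirm: the symmetry plane is fixed at x = 0 and not fitted as a plane of maximum symmetry, each mirrored point is paired with its nearest original point by brute force, and no surface registration is performed. The function name and sample points are hypothetical.

```python
import math


def asymmetry_rmsd(points):
    """RMSD between a 3D point set and its mirror image across the plane x = 0.
    Each mirrored point is paired with its nearest original point (brute force)."""
    if not points:
        raise ValueError("points must be non-empty")
    squared_distances = []
    for x, y, z in points:
        mx, my, mz = -x, y, z  # mirror across the assumed symmetry plane x = 0
        nearest = min((mx - px) ** 2 + (my - py) ** 2 + (mz - pz) ** 2
                      for px, py, pz in points)
        squared_distances.append(nearest)
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Hypothetical landmarks in millimetres: the second set has one side shifted by 0.5 mm.
symmetric = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(1.0, 0.0, 0.0), (-1.0, 0.5, 0.0)]
print(asymmetry_rmsd(symmetric))
print(asymmetry_rmsd(shifted))
```

The brute-force nearest-point search is quadratic in the number of points, so a dense surface scan would need a spatial index such as a k-d tree.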
The minimally invasive spinal deformity surgery algorithm: a reproducible rational framework for decision making in minimally invasive spinal deformity surgery.

PubMed

Mummaneni, Praveen V; Shaffrey, Christopher I; Lenke, Lawrence G; Park, Paul; Wang, Michael Y; La Marca, Frank; Smith, Justin S; Mundis, Gregory M; Okonkwo, David O; Moal, Bertrand; Fessler, Richard G; Anand, Neel; Uribe, Juan S; Kanter, Adam S; Akbarnia, Behrooz; Fu, Kai-Ming G

2014-05-01

Minimally invasive surgery (MIS) is an alternative to open deformity surgery for the treatment of patients with adult spinal deformity. However, at this time MIS techniques are not as versatile as open deformity techniques, and MIS techniques have been reported to result in suboptimal sagittal plane correction or pseudarthrosis when used for severe deformities. The minimally invasive spinal deformity surgery (MISDEF) algorithm was created to provide a framework for rational decision making for surgeons who are considering MIS versus open spine surgery. A team of experienced spinal deformity surgeons developed the MISDEF algorithm that incorporates a patient's preoperative radiographic parameters and leads to one of 3 general plans ranging from MIS direct or indirect decompression to open deformity surgery with osteotomies. The authors surveyed fellowship-trained spine surgeons experienced with spinal deformity surgery to validate the algorithm using a set of 20 cases to establish interobserver reliability. They then resurveyed the same surgeons 2 months later with the same cases presented in a different sequence to establish intraobserver reliability. Responses were collected and tabulated. Fleiss' analysis was performed using MATLAB software. Over a 3-month period, 11 surgeons completed the surveys. Responses for MISDEF algorithm case review demonstrated an interobserver kappa of 0.58 for the first round of surveys and an interobserver kappa of 0.69 for the second round of surveys, consistent with substantial agreement. In at least 10 cases there was perfect agreement between the reviewing surgeons. The mean intraobserver kappa for the 2 surveys was 0.86 ± 0.15 (± SD) and ranged from 0.62 to 1. The use of the MISDEF algorithm provides consistent and straightforward guidance for surgeons who are considering either an MIS or an open approach for the treatment of patients with adult spinal deformity. 
The MISDEF algorithm was found to have substantial inter- and intraobserver agreement. Although further studies are needed, the application of this algorithm could provide a platform for surgeons to achieve the desired goals of surgery.
Error quantification of osteometric data in forensic anthropology.

PubMed

Langley, Natalie R; Meadows Jantz, Lee; McNulty, Shauna; Maijanen, Heli; Ousley, Stephen D; Jantz, Richard L

2018-06-01

This study evaluates the reliability of osteometric data commonly used in forensic case analyses, with specific reference to the measurements in Data Collection Procedures 2.0 (DCP 2.0). Four observers took a set of 99 measurements four times on a sample of 50 skeletons (each measurement was taken 200 times by each observer). Two-way mixed ANOVAs and repeated measures ANOVAs with pairwise comparisons were used to examine interobserver (between-subjects) and intraobserver (within-subjects) variability. Relative technical error of measurement (TEM) was calculated for measurements with significant ANOVA results to examine the error among a single observer repeating a measurement multiple times (e.g. repeatability or intraobserver error), as well as the variability between multiple observers (interobserver error). Two general trends emerged from these analyses: (1) maximum lengths and breadths have the lowest error across the board (TEM<0.5), and (2) maximum and minimum diameters at midshaft are more reliable than their positionally-dependent counterparts (i.e. sagittal, vertical, transverse, dorso-volar). Therefore, maxima and minima are specified for all midshaft measurements in DCP 2.0. Twenty-two measurements were flagged for excessive variability (either interobserver, intraobserver, or both); 15 of these measurements were part of the standard set of measurements in Data Collection Procedures for Forensic Skeletal Material, 3rd edition. Each measurement was examined carefully to determine the likely source of the error (e.g. data input, instrumentation, observer's method, or measurement definition). For several measurements (e.g. anterior sacral breadth, distal epiphyseal breadth of the tibia) only one observer differed significantly from the remaining observers, indicating a likely problem with the measurement definition as interpreted by that observer; these definitions were clarified in DCP 2.0 to eliminate this confusion. 
Other measurements were taken from landmarks that are difficult to locate consistently (e.g. pubis length, ischium length); these measurements were omitted from DCP 2.0. This manual is available for free download online (https://fac.utk.edu/wp-content/uploads/2016/03/DCP20_webversion.pdf), along with an accompanying instructional video (https://www.youtube.com/watch?v=BtkLFl3vim4). Copyright © 2018 Elsevier B.V. All rights reserved.
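The abstract above reports relative TEM without giving a formula. The following is a minimal sketch, assuming the multi-trial form commonly used in anthropometry (within-subject sum of squares pooled over subjects); the function name and the toy data are hypothetical and are not taken from the study.

```python
from math import sqrt

def relative_tem(measurements):
    """Return (TEM, relative TEM in %) for repeated measurements.

    measurements: one list per subject, holding that subject's K repeated
    values (repeats by one observer, or one value per observer).
    Assumed formula: TEM = sqrt(sum_i(sum_k x^2 - (sum_k x)^2 / K) / (N * (K - 1))).
    """
    n = len(measurements)
    k = len(measurements[0])
    ss_within = 0.0
    for row in measurements:
        ss_within += sum(x * x for x in row) - sum(row) ** 2 / k
    tem = sqrt(ss_within / (n * (k - 1)))
    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    return tem, 100.0 * tem / grand_mean

# Hypothetical data: two subjects, each measured twice (mm).
tem, rel = relative_tem([[10.0, 12.0], [20.0, 20.0]])
print(tem, rel)
```

With two repeats per subject this reduces to the familiar sqrt(sum of squared differences / 2N).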

Agreement and accuracy using the FIGO, ACOG and NICE cardiotocography interpretation guidelines.

PubMed

Santo, Susana; Ayres-de-Campos, Diogo; Costa-Santos, Cristina; Schnettler, William; Ugwumadu, Austin; Da Graça, Luís M

2017-02-01

One of the limitations reported with cardiotocography is the modest interobserver agreement observed in tracing interpretation. This study compared agreement, reliability and accuracy of cardiotocography interpretation using the International Federation of Gynecology and Obstetrics, American College of Obstetrics and Gynecology and National Institute for Health and Care Excellence guidelines. A total of 151 tracings were evaluated by 27 clinicians from three centers where International Federation of Gynecology and Obstetrics, American College of Obstetrics and Gynecology and National Institute for Health and Care Excellence guidelines were routinely used. Interobserver agreement was evaluated using the proportions of agreement and reliability with the κ statistic. The accuracy of tracings classified as "pathological/category III" was assessed for prediction of newborn acidemia. For all measures, 95% confidence intervals were calculated. Cardiotocography classifications were more distributed with International Federation of Gynecology and Obstetrics (9, 52, 39%) and National Institute for Health and Care Excellence (30, 33, 37%) than with American College of Obstetrics and Gynecology (13, 81, 6%). The category with the highest agreement was American College of Obstetrics and Gynecology category II (proportions of agreement = 0.73, 95% confidence interval 0.70-0.76), and the ones with the lowest agreement were American College of Obstetrics and Gynecology categories I and III. Reliability was significantly higher with International Federation of Gynecology and Obstetrics (κ = 0.37, 95% confidence interval 0.31-0.43), and National Institute for Health and Care Excellence (κ = 0.33, 95% confidence interval 0.28-0.39) than with American College of Obstetrics and Gynecology (κ = 0.15, 95% confidence interval 0.10-0.21); however, all represent only slight/fair reliability.
International Federation of Gynecology and Obstetrics and National Institute for Health and Care Excellence showed a trend towards higher sensitivities in prediction of newborn acidemia (89 and 97%, respectively) than American College of Obstetrics and Gynecology (32%), but the latter achieved a significantly higher specificity (95%). With American College of Obstetrics and Gynecology guidelines there is high agreement in category II, low reliability, low sensitivity and high specificity in prediction of acidemia. With International Federation of Gynecology and Obstetrics and National Institute for Health and Care Excellence guidelines there is higher reliability, a trend towards higher sensitivity, and lower specificity in prediction of acidemia. © 2016 Nordic Federation of Societies of Obstetrics and Gynecology.
Reliability of cervical lordosis measurement techniques on long-cassette radiographs.

PubMed

Janusz, Piotr; Tyrakowski, Marcin; Yu, Hailong; Siemionow, Kris

2016-11-01

Lateral radiographs are commonly used to assess cervical sagittal alignment. Three assessment methods have been described and are commonly utilized in clinical practice. These methods are described for perfect lateral cervical radiographs; however, in everyday practice radiograph quality varies. The aim of this study was to compare the reliability and reproducibility of 3 cervical lordosis (CL) measurement methods. Forty-four standing lateral radiographs were randomly chosen from a lateral long-cassette radiograph database. Measurements of CL were performed with: Cobb method C2-C7 (CM), C2-C7 posterior tangent method (PTM), sum of posterior tangent method for each segment (SPTM). Three independent orthopaedic surgeons measured CL using the three methods on 44 lateral radiographs. One researcher used the three methods to measure CL three times at 4-week time intervals. Agreement between the methods as well as their intra- and interobserver reliability were tested and quantified by intraclass correlation coefficient (ICC) and median error for a single measurement (SEM). ICC of 0.75 or more reflected an excellent agreement/reliability. The results were compared with repeated ANOVA test, with p < 0.05 considered as significant. All methods revealed excellent intra- and interobserver reliability. Agreement (ICC, SEM) between three methods was (0.89, 3.44°), between CM and SPTM was (0.82, 4.42°), between CM and PTM was (0.80, 4.80°) and between PTM and SPTM was (0.99, 1.10°). Mean values of CL for CM, PTM and SPTM were 10.5° ± 13.9°, 17.5° ± 15.6° and 17.7° ± 15.9° (p < 0.0001), respectively. The significant difference was between CM vs PTM (p < 0.0001) and CM vs SPTM (p < 0.0001), but not between PTM vs SPTM (p > 0.05). All three methods appeared to be highly reliable. 
Although high agreement between all measurement methods was shown, we do not recommend using the Cobb measurement method interchangeably with PTM or SPTM within a single study, as this could lead to error, whereas such a comparison between tangent methods can be considered.
Critical discussion of evaluation parameters for inter-observer variability in target definition for radiation therapy.

PubMed

Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D

2012-02-01

Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, overlap measure, and statistical measure of agreement or reliability analysis is required to fully report the interrater variability in delineation.
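The overlap measures named in this abstract can be illustrated with a small sketch. It assumes each contour is represented as a set of voxel indices, and it assumes the generalized conformity index is the sum of pairwise intersections divided by the sum of pairwise unions over all observer pairs; the abstract itself gives no formula, and the function names and toy contours are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two voxel sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def generalized_conformity_index(contours):
    """Assumed definition: sum of pairwise intersection volumes divided by
    the sum of pairwise union volumes, over all observer pairs."""
    sets = [set(c) for c in contours]
    inter = sum(len(a & b) for a, b in combinations(sets, 2))
    union = sum(len(a | b) for a, b in combinations(sets, 2))
    return inter / union if union else 1.0

# Hypothetical contours from three observers, as voxel indices.
obs = [{1, 2, 3, 4}, {3, 4, 5, 6}, {1, 2, 3, 4}]
print(jaccard(obs[0], obs[1]), generalized_conformity_index(obs))
```

For two observers the generalized index reduces to the Jaccard coefficient, which is one reason a single overlap measure may be enough in a minimal reporting set.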
Magnetic Resonance Imaging in Patients With Mechanical Low Back Pain Using a Novel Rapid-Acquisition Three-Dimensional SPACE Sequence at 1.5-T: A Pilot Study Comparing Lumbar Stenosis Assessment With Routine Two-Dimensional Magnetic Resonance Sequences.

PubMed

Swami, Vimarsha G; Katlariwala, Mihir; Dhillon, Sukhvinder; Jibri, Zaid; Jaremko, Jacob L

2016-11-01

To minimize the burden of overutilisation of lumbar spine magnetic resonance imaging (MRI) on a resource-constrained public healthcare system, it may be helpful to image some patients with mechanical low-back pain (LBP) using a simplified rapid MRI screening protocol at 1.5-T. A rapid-acquisition 3-dimensional (3D) SPACE (Sampling Perfection with Application-optimized Contrasts using different flip angle Evolution) sequence can demonstrate common etiologies of LBP. We compared lumbar spinal canal stenosis (LSCS) and neural foraminal stenosis (LNFS) assessment on 3D SPACE against conventional 2-dimensional (2D) MRI. We prospectively performed 3D SPACE and 2D spin-echo MRI sequences (axial or sagittal T1-weighted or T2-weighted) at 1.5-T in 20 patients. Two blinded readers assessed levels L3-4, L4-5 and L5-S1 using: 1) morphologic grading systems, 2) global impression on the presence or absence of clinically significant stenosis (n = 60 disc levels for LSCS, n = 120 foramina for LNFS). Reliability statistics were calculated. Acquisition time was ∼5 minutes for SPACE and ∼20 minutes for 2D MRI sequences. Interobserver agreement of LSCS was substantial to near perfect on both sequences (morphologic grading: kappa [k] = 0.71 SPACE, k = 0.69 T2-weighted; global impression: k = 0.85 SPACE, k = 0.78 T2-weighted). LNFS assessment had higher interobserver reliability with SPACE than with T1-weighted sequences (k = 0.54 vs 0.37). Intersequence agreement of findings between SPACE and 2D MRI was substantial to near perfect by global impression (LSCS: k = 0.78 Reader 1, k = 0.85 Reader 2; LNFS: k = 0.63 Reader 1, k = 0.66 Reader 2). 3D SPACE was acquired in one-quarter of the time needed for the conventional 2D MRI protocol, had excellent agreement with 2D MRI for stenosis assessment, and had interobserver reliability superior to 2D MRI. These results justify future work to explore the role of 3D SPACE in a rapid MRI screening protocol at 1.5-T for mechanical LBP.
Copyright © 2016 Canadian Association of Radiologists. Published by Elsevier Inc. All rights reserved.
Accuracy and reliability of observational gait analysis data: judgments of push-off in gait after stroke.

PubMed

McGinley, Jennifer L; Goldie, Patricia A; Greenwood, Kenneth M; Olney, Sandra J

2003-02-01

Physical therapists routinely observe gait in clinical practice. The purpose of this study was to determine the accuracy and reliability of observational assessments of push-off in gait after stroke. Eighteen physical therapists and 11 subjects with hemiplegia following a stroke participated in the study. Measurements of ankle power generation were obtained from subjects following stroke using a gait analysis system. Concurrent videotaped gait performances were observed by the physical therapists on 2 occasions. Ankle power generation at push-off was scored as either normal or abnormal using two 11-point rating scales. These observational ratings were correlated with the measurements of peak ankle power generation. A high correlation was obtained between the observational ratings and the measurements of ankle power generation (mean Pearson r=.84). Interobserver reliability was moderately high (mean intraclass correlation coefficient [ICC (2,1)]=.76). Intraobserver reliability also was high, with a mean ICC (2,1) of .89 obtained. Physical therapists were able to make accurate and reliable judgments of push-off in videotaped gait of subjects following stroke using observational assessment. Further research is indicated to explore the accuracy and reliability of data obtained with observational gait analysis as it occurs in clinical practice.
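ICC (2,1), as named in this abstract, is the two-way random-effects, single-rater, absolute-agreement coefficient of Shrout and Fleiss. The sketch below computes it from the two-way ANOVA mean squares; the function name is hypothetical and the ratings are illustrative, not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for an n-subjects x k-raters table of scores.

    ICC(2,1) = (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n),
    where MSR, MSC and MSE are the subject, rater and residual mean squares.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Illustrative table: 6 subjects rated by 4 raters.
table = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
         [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
print(round(icc_2_1(table), 2))
```

Because rater means enter the denominator, systematic offsets between raters lower ICC (2,1) even when raters rank subjects identically.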
Ultrathin disposable gastroscope for screening and surveillance of gastroesophageal varices in patients with liver cirrhosis: a prospective comparative study.

PubMed

Huynh, Dep K; Toscano, Leanne; Phan, Vinh-An; Ow, Tsai-Wing; Schoeman, Mark; Nguyen, Nam Q

2017-06-01

This study aims to evaluate the role of unsedated, ultrathin disposable gastroscopy (TDG) against conventional gastroscopy (CG) in the screening and surveillance of gastroesophageal varices (GEVs) in patients with liver cirrhosis. Forty-eight patients (56.4 ± 1.3 years; 38 male, 10 female) with liver cirrhosis referred for screening (n = 12) or surveillance (n = 36) of GEVs were prospectively enrolled. Unsedated gastroscopy was initially performed with TDG, followed by CG with conscious sedation. The 2 gastroscopies were performed by different endoscopists blinded to the results of the previous examination. Video recordings of both gastroscopies were validated by an independent investigator in a random, blinded fashion. Endpoints were accuracy and interobserver agreement of detecting GEVs, safety, and potential cost saving. CG identified GEVs in 26 (54%) patients, 10 of whom (21%) had high-risk esophageal varices (HREV). Compared with CG, TDG had an accuracy of 92% for the detection of all GEVs, which increased to 100% for high-risk GEVs. The interobserver agreement for detecting all GEVs on TDG was 88% (κ = 0.74). This increased to 94% (κ = 0.82) for high-risk GEVs. There were no serious adverse events. Unsedated TDG is safe and has high diagnostic accuracy and interobserver reliability for the detection of GEVs. The use of clinic-based TDG would allow immediate determination of a follow-up plan, making it attractive for variceal screening and surveillance programs. (Clinical trial (ANZCTR) registration number: ACTRN12616001103459.). Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
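The abstract pairs each raw agreement percentage with a κ value. As a minimal sketch, assuming the reported κ is Cohen's kappa for two raters, the relationship between observed agreement and chance-corrected agreement looks like this; the function name and the ratings are hypothetical and do not reproduce the study data.

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Return (observed agreement, Cohen's kappa) for two raters."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement from each rater's marginal category frequencies.
    p_e = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / n ** 2
    return p_o, (p_o - p_e) / (1 - p_e)

# Hypothetical varices calls ("y" = present, "n" = absent) on 50 recordings.
r1 = ["y"] * 20 + ["y"] * 5 + ["n"] * 10 + ["n"] * 15
r2 = ["y"] * 20 + ["n"] * 5 + ["y"] * 10 + ["n"] * 15
print(cohens_kappa(r1, r2))
```

The same raw agreement can give different κ values depending on prevalence, which is why the two figures are usually reported together.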
Multidetector computed tomography sizing of aortic annulus prior to transcatheter aortic valve replacement (TAVR): Variability and impact of observer experience.

PubMed

Le Couteulx, S; Caudron, J; Dubourg, B; Cauchois, G; Dupré, M; Michelin, P; Durand, E; Eltchaninoff, H; Dacher, J-N

2018-05-01

To evaluate intra- and inter-observer variability of multidetector computed tomography (MDCT) sizing of the aortic annulus before transcatheter aortic valve replacement (TAVR) and the effect of observer experience, aortic valve calcification and image quality. MDCT examinations of 52 consecutive patients with tricuspid aortic valve (30 women, 22 men) with a mean age of 83±7 (SD) years (range: 64-93 years) were evaluated retrospectively. The maximum and minimum diameters, area and circumference of the aortic annulus were measured twice at diastole and systole with a standardized approach by three independent observers with different levels of experience (expert [observer 1]; resident with intensive 6 months practice [observer 2]; trained resident with starting experience [observer 3]). Observers were requested to recommend the valve prosthesis size. Calcification volume of the aortic valve and signal to noise ratio were evaluated. Intra- and inter-observer reproducibility was excellent for all aortic annulus dimensions, with an intraclass correlation coefficient ranging respectively from 0.84 to 0.98 and from 0.82 to 0.97. Agreement for selection of prosthesis size was almost perfect between the two most experienced observers (k=0.82) and substantial with the inexperienced observer (k=0.67). Aortic valve calcification did not influence intra-observer reproducibility. Image quality influenced reproducibility of the inexperienced observer. Intra- and inter-observer variability of aortic annulus sizing by MDCT is low. Nevertheless, the less experienced observer showed lower reliability suggesting a learning curve. Copyright © 2017. Published by Elsevier Masson SAS.
Cross-cultural Adaptation of the Self-care of Hypertension Inventory Into Brazilian Portuguese.

PubMed

Silveira, Luana Claudia Jacoby; Rabelo-Silva, Eneida Rejane; Ávila, Christiane Whast; Beltrami Moreira, Leila; Dickson, Victoria Vaughan; Riegel, Barbara

Lifestyle changes and treatment adherence still constitute a challenge to healthcare providers involved in the care of persons with hypertension. The lack of validated instruments measuring the ability of hypertensive patients to manage their disease has slowed research progress in this area. The Self-care of Hypertension Inventory, originally developed in the United States, consists of 23 items divided across 3 scales: Self-care Maintenance, Self-care Management, and Self-care Confidence. These scales measure how well patients with hypertension adhere to treatment and manage elevated blood pressure, as well as their confidence in their ability to perform self-care. A rigorous cross-cultural adaptation and validation process is required before this instrument can be used in other countries. The aims of this study were to translate the Self-care of Hypertension Inventory into Brazilian Portuguese with cross-cultural adaptation and to evaluate interobserver reliability and temporal stability. This methodological study involved forward translation, synthesis of forward translations, back-translation, synthesis of back-translations, expert committee review, and pretesting. Interobserver agreement and the temporal stability of the scales were assessed. The expert committee proposed semantic and cultural modifications to some items and the addition of guidance statements to facilitate administration of the scale. Interobserver analysis demonstrated substantial agreement. Analysis of temporal stability showed near-perfect agreement. Cross-cultural adaptation of the Self-care of Hypertension Inventory successfully produced a Portuguese-language version of the instrument for further evaluation of psychometric properties. Once that step is completed, the scale can be used in Brazil.
The reticulin algorithm for adrenocortical tumor diagnosis: a multicentric validation study on 245 unpublished cases.

PubMed

Duregon, Eleonora; Fassina, Ambrogio; Volante, Marco; Nesi, Gabriella; Santi, Raffaella; Gatti, Gaia; Cappellesso, Rocco; Dalino Ciaramella, Paolo; Ventura, Laura; Gambacorta, Marcello; Dei Tos, Angelo Paolo; Loli, Paola; Mannelli, Massimo; Mantero, Franco; Berruti, Alfredo; Terzolo, Massimo; Papotti, Mauro

2013-09-01

The pathologic diagnosis of adrenocortical carcinoma (ACC) still needs to be improved, because the renowned Weiss Score (WS) system has a poor reproducibility of some parameters and is difficult to apply in borderline cases and in ACC variants. The "reticulin algorithm" (RA) defines malignancy through an altered reticulin framework associated with 1 of the 3 following parameters: necrosis, high mitotic rate, and vascular invasion. This study aimed at validating the interobserver reproducibility of reticulin stain evaluation in an unpublished series of 245 adrenocortical tumors (61 adenomas and 184 carcinomas) from 5 Italian centers, classified according to the WS. Eight pathologists reviewed all reticulin-stained slides. After training, a second round of evaluation on discordant cases was performed 10 weeks later. The RA reclassified 67 cases (27%) as adenomas, including 44 with no reticulin alterations and 23 with an altered reticulin framework but lacking the subsequent parameters of the triad. The other 178 cases (73%) were carcinomas according to the above-mentioned criteria. A complete (8/8 pathologists) interobserver agreement was reached in 75% of cases (κ=0.702), irrespective of case derivation, pathologists' experience, and histologic variants, and was further improved when only those cases with high WS and clinically malignant behavior were considered. After the training, the overall agreement increased to 86%. We conclude that reticulin staining is a reliable technique and an easy-to-interpret system in adrenocortical tumors; moreover, it has a high interobserver reproducibility, which supports the notion of using such a method in the proposed 2-step RA approach for ACC diagnosis.
A demonstration of lack of variability among six tuberculin skin test readers.

PubMed Central

Perez-Stable, E J; Slutkin, G

1985-01-01

The variability of tuberculin skin test readings among six trained and experienced readers was evaluated using a modified sliding caliper method. Each of 537 tests was read independently by two readers. There were 23 disagreements between paired readers, resulting in an overall interobserver reliability of 95.7 per cent. In 82 per cent of the paired readings the results were different by 2 mm or less. The observer lack of variability was likely due to the training and experience of the readers. PMID:4051078
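The 95.7 per cent figure follows directly from the counts given in the abstract, taking reliability as simple percent agreement between paired readers. A short check (the function name is hypothetical):

```python
def percent_agreement(n_pairs, n_disagreements):
    """Share of paired readings on which both readers agreed, in per cent."""
    return 100.0 * (n_pairs - n_disagreements) / n_pairs

# Counts from the abstract: 537 paired readings, 23 disagreements.
print(round(percent_agreement(537, 23), 1))
```

Percent agreement is not corrected for chance, so it is not directly comparable with κ or ICC values reported elsewhere in this listing.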
Cervical vertebrae maturation method morphologic criteria: poor reproducibility.

PubMed

Nestman, Trenton S; Marshall, Steven D; Qian, Fang; Holton, Nathan; Franciscus, Robert G; Southard, Thomas E

2011-08-01

The cervical vertebrae maturation (CVM) method has been advocated as a predictor of peak mandibular growth. A careful review of the literature showed potential methodologic errors that might influence the high reported reproducibility of the CVM method, and we recently established that the reproducibility of the CVM method was poor when these potential errors were eliminated. The purpose of this study was to further investigate the reproducibility of the individual vertebral patterns. In other words, the purpose was to determine which of the individual CVM vertebral patterns could be classified reliably and which could not. Ten practicing orthodontists, trained in the CVM method, evaluated the morphology of cervical vertebrae C2 through C4 from 30 cephalometric radiographs using questions based on the CVM method. The Fleiss kappa statistic was used to assess interobserver agreement when evaluating each cervical vertebrae morphology question for each subject. The Kendall coefficient of concordance was used to assess the level of interobserver agreement when determining a "derived CVM stage" for each subject. Interobserver agreement was high for assessment of the lower borders of C2, C3, and C4 that were either flat or curved in the CVM method, but interobserver agreement was low for assessment of the vertebral bodies of C3 and C4 when they were either trapezoidal, rectangular horizontal, square, or rectangular vertical; this led to the overall poor reproducibility of the CVM method. These findings were reflected in the Fleiss kappa statistic. Furthermore, nearly 30% of the time, individual morphologic criteria could not be combined to generate a final CVM stage because of incompatible responses to the 5 questions. Intraobserver agreement in this study was only 62%, on average, when the inconclusive stagings were excluded as disagreements. Intraobserver agreement was worse (44%) when the inconclusive stagings were included as disagreements. 
For the group of subjects that could be assigned a CVM stage, the level of interobserver agreement as measured by the Kendall coefficient of concordance was only 0.45, indicating moderate agreement. The weakness of the CVM method results, in part, from difficulty in classifying the vertebral bodies of C3 and C4 as trapezoidal, rectangular horizontal, square, or rectangular vertical. This led to the overall poor reproducibility of the CVM method and our inability to support its use as a strict clinical guideline for the timing of orthodontic treatment. Copyright © 2011 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
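Kendall's coefficient of concordance, used above for the derived CVM stage, can be sketched as follows. This minimal version assumes each rater supplies a complete ranking of the subjects with no ties; ordinal stages like those in the study would produce ties and need a tie-corrected form, which is omitted here. The function name and the rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W for m raters each ranking the same n subjects (no ties).

    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared deviations
    of the per-subject rank totals from their mean m * (n + 1) / 2.
    """
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical rankings of four subjects by three raters.
print(kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]))  # full concordance
print(kendalls_w([[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]))
```

W ranges from 0 (no concordance) to 1 (identical rankings).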
External validation of Global Evaluative Assessment of Robotic Skills (GEARS).

PubMed

Aghazadeh, Monty A; Jayaratna, Isuru S; Hung, Andrew J; Pan, Michael M; Desai, Mihir M; Gill, Inderbir S; Goh, Alvin C

2015-11-01

We demonstrate the construct validity, reliability, and utility of Global Evaluative Assessment of Robotic Skills (GEARS), a clinical assessment tool designed to measure robotic technical skills, in an independent cohort using an in vivo animal training model. Using a cross-sectional observational study design, 47 voluntary participants were categorized as experts (>30 robotic cases completed as primary surgeon) or trainees. The trainee group was further divided into intermediates (≥5 but ≤30 cases) or novices (<5 cases). All participants completed a standardized in vivo robotic task in a porcine model. Task performance was evaluated by two expert robotic surgeons and self-assessed by the participants using the GEARS assessment tool. Kruskal-Wallis test was used to compare the GEARS performance scores to determine construct validity; Spearman's rank correlation measured interobserver reliability; and Cronbach's alpha was used to assess internal consistency. Performance evaluations were completed on nine experts and 38 trainees (14 intermediate, 24 novice). Experts demonstrated superior performance compared to intermediates and novices overall and in all individual domains (p < 0.0001). In comparing intermediates and novices, the overall performance difference trended toward significance (p = 0.0505), while the individual domains of efficiency and autonomy were significantly different between groups (p = 0.0280 and 0.0425, respectively). Interobserver reliability between expert ratings was confirmed with a strong correlation observed (r = 0.857, 95 % CI [0.691, 0.941]). Experts and participant scoring showed less agreement (r = 0.435, 95 % CI [0.121, 0.689] and r = 0.422, 95 % CI [0.081, 0.672]). Internal consistency was excellent for experts and participants (α = 0.96, 0.98, 0.93). In an independent cohort, GEARS was able to differentiate between different robotic skill levels, demonstrating excellent construct validity.
As a standardized assessment tool, GEARS maintained consistency and reliability for an in vivo robotic surgical task and may be applied for skills evaluation in a broad range of robotic procedures.
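Cronbach's alpha, used in this abstract for internal consistency, compares the sum of the per-domain score variances with the variance of the total score. A minimal sketch with hypothetical scores follows; the function name and the data are illustrative only and do not come from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a subjects x items table of scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)),
    using sample variances throughout.
    """
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: three participants scored on two domains.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

Alpha approaches 1 when the domains rise and fall together across participants.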
Spine Instability Neoplastic Score: agreement across different medical and surgical specialties.

PubMed

Arana, Estanislao; Kovacs, Francisco M; Royuela, Ana; Asenjo, Beatriz; Pérez-Ramírez, Úrsula; Zamora, Javier

2016-05-01

Spinal instability is an acknowledged complication of spinal metastases; in spite of recent suggested criteria, it is not clearly defined in the literature. This study aimed to assess intra and interobserver agreement when using the Spine Instability Neoplastic Score (SINS) by all physicians involved in its management. Independent multicenter reliability study for the recently created SINS, undertaken with a panel of medical oncologists, neurosurgeons, radiologists, orthopedic surgeons, and radiation oncologists, was carried out. Ninety patients with biopsy-proven spinal metastases and magnetic resonance imaging, reviewed at the multidisciplinary tumor board of our institution, were included. Intraclass correlation coefficient (ICC) was used for SINS score agreement. Fleiss kappa statistic was used to assess agreement on the location of the most affected vertebral level; agreement on the SINS category ("stable," "potentially stable," or "unstable"); and overall agreement with the classification established by tumor board. Clinical data and imaging were provided to 83 specialists in 44 hospitals across 14 Spanish regions. No assessment criteria were pre-established. Each clinician assessed the SINS score twice, with a minimum 6-week interval. Clinicians were blinded to assessments made by other specialists and to their own previous assessment. Subgroup analyses were performed by clinicians' specialty, experience (≤7, 8-13, ≥14 years), and hospital category (four levels according to size and complexity). This study was supported by Kovacs Foundation. Intra and interobserver agreement on the location of the most affected levels was "almost perfect" (κ>0.94). Intra-observer agreement on the SINS score was "excellent" (ICC=0.77), whereas interobserver agreement was "moderate" (ICC=0.55). Intra-observer agreement in SINS category was "substantial" (k=0.61), whereas interobserver agreement was "moderate" (k=0.42). 
Overall agreement with the tumor board classification was "substantial" (κ=0.61). Results were similar across specialties, years of experience, and hospital category. Agreement on the assessment of metastatic spine instability is moderate. The SINS can help improve communication among clinicians in oncology care. Copyright © 2015 Elsevier Inc. All rights reserved.
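Fleiss' kappa, used above for agreement on SINS category across many clinicians, can be computed from a table of category counts per case. The sketch below assumes every case is rated by the same number of raters; the function name and the counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters placing case i in category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    total = n_cases * n_raters
    # Overall proportion of assignments falling in each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_cats)]
    # Per-case agreement: proportion of rater pairs that agree.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_cases
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical counts: 4 cases, 5 raters, categories
# ("stable", "potentially stable", "unstable").
print(fleiss_kappa([[5, 0, 0], [1, 4, 0], [0, 2, 3], [0, 0, 5]]))
```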
Inter-study reproducibility of cardiovascular magnetic resonance tagging

PubMed Central

2013-01-01

Background The aim of this study is to determine the test-retest reliability of the measurement of regional myocardial function by cardiovascular magnetic resonance (CMR) tagging using spatial modulation of magnetization. Methods Twenty-five participants underwent CMR tagging twice over 12 ± 7 days. To assess the role of slice orientation on strain measurement, two healthy volunteers had a first exam, followed by image acquisition repeated with slices rotated ±15 degrees out of true short axis, followed by a second exam in the true short axis plane. To assess the role of slice location, two healthy volunteers had whole heart tagging. The harmonic phase (HARP) method was used to analyze the tagged images. Peak midwall circumferential strain (Ecc), radial strain (Err), Lambda 1, Lambda 2, and Angle α were determined in basal, mid and apical slices. LV torsion, systolic and early diastolic circumferential strain and torsion rates were also determined. Results LV Ecc and torsion had excellent intra-, interobserver, and inter-study intra-class correlation coefficients (ICC range, 0.7 to 0.9). Err, Lambda 1, Lambda 2 and angle had higher intra- and interobserver ICC than inter-study ICC. Angle had the least inter-study reproducibility. Torsion rates had superior intra-, interobserver, and inter-study reproducibility to strain rates. The measurements of LV Ecc were comparable in all three slices with different short axis orientations (standard deviation of mean Ecc was 0.09, 0.18 and 0.16 at basal, mid and apical slices, respectively). The mean difference in LV Ecc between slices was more pronounced in most of the basal slices compared to the rest of the heart. Conclusions Intraobserver and interobserver reproducibility of all strain and torsion parameters was excellent. Inter-study reproducibility of CMR tagging by SPAMM varied between different parameters as described in the results above and was superior for Ecc and LV torsion.
The variation in LV Ecc measurement due to altered slice orientation is negligible compared to the variation due to slice location. Trial registration This trial is registered as NCT00005487 at National Heart, Lung and Blood institute. PMID:23663535
Diagnostic validity of alternative manual stress radiographic technique detecting subtalar instability with concomitant ankle instability.

PubMed

Lee, Byung Hoon; Choi, Kyung-Hwa; Seo, Dong Yeon; Choi, Sang Min; Kim, Gab Lae

2016-04-01

To incorporate a diagnostic technique for measuring subtalar motion, namely "talar rotation", into the manual supination-anterior drawer stress radiographs for evaluation of the severity of rotational instability, and to determine its clinical relevance. Sixty-six patients with combined injuries of the anterior talofibular (ATFL) and calcaneofibular ligament (CFL) underwent three bilateral manual stress radiographs, and mean increments of anterior talar translation (mm), talar tilt (°), and talar rotation (%) in the injured ankle compared to the normal opposite side were measured with the technique. Intraobserver and interobserver reliability of each measure was assessed, and the difference in the degree of increments was compared according to the presence of additional cervical ligament insufficiency. Ankle stress radiographic intraobserver and interobserver agreement was ICC = 0.91 and 0.82 for talar rotation (%), ICC = 0.64 and 0.51 for anterior talar translation, and ICC = 0.78 and 0.71 for talar tilt angle, respectively. In group 2, which included patients with combined injuries of the ATFL and CFL along with additional cervical ligament insufficiency, a significantly higher increment of talar rotation, mean 6.4% (SD 3.4%), was observed than in the other group (group 1) with an intact cervical ligament, mean 4.1% (SD 2.7%) (p < 0.001). The new comprehensive stress radiographic technique for diagnosis of chronic lateral ankle instability presented in this study might be a reliable and representative measurement tool to assess additional injury or instability of the subtalar joint. Prospective cohort study, Level II.
Readability Trends of Online Information by the American Academy of Otolaryngology-Head and Neck Surgery Foundation.

PubMed

Wong, Kevin; Levi, Jessica R

2017-01-01

Objective Previous studies have shown that patient education materials published by the American Academy of Otolaryngology-Head and Neck Surgery Foundation may be too difficult for the average reader to understand. The purpose of this study was to determine if current educational materials show improvements in readability. Study Design Cross-sectional analysis. Setting The Patient Health Information section of the American Academy of Otolaryngology-Head and Neck Surgery Foundation website. Subjects and Methods All patient education articles were extracted in plain text. Webpage navigation, references, author information, appointment information, acknowledgments, and disclaimers were removed. Follow-up editing was also performed to remove paragraph breaks, colons, semicolons, numbers, percentages, and bullets. Readability grade was calculated with the Flesch-Kincaid Grade Level, Flesch Reading Ease, Gunning-Fog Index, Coleman-Liau Index, Automated Readability Index, and Simple Measure of Gobbledygook. Intra- and interobserver reliability were assessed. Results A total of 126 articles from 7 topics were analyzed. Readability levels across all 6 tools showed that the difficulty of patient education materials exceeded the abilities of an average American. As compared with previous studies, current educational materials by the American Academy of Otolaryngology-Head and Neck Surgery Foundation have shown a decrease in difficulty. Intra- and interobserver reliability were both excellent, with intraclass coefficients of 0.99 and 0.96, respectively. Conclusion Improvement in readability is an encouraging finding and one that is consistent with recent trends toward improved health literacy. Nevertheless, online patient educational material is still too difficult for the average reader. Revisions may be necessary for current materials to benefit a larger readership.
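Two of the readability tools listed above, the Flesch-Kincaid Grade Level and Flesch Reading Ease, are simple functions of word, sentence, and syllable counts. The sketch below uses the standard published coefficients and takes the counts as inputs, since syllable counting depends on the tool used; the function names and the example counts are hypothetical.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from raw text counts."""
    return 0.39 * words / sentences + 11.8 * syllables / words - 15.59

def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease from raw text counts (higher means easier)."""
    return 206.835 - 1.015 * words / sentences - 84.6 * syllables / words

# Hypothetical article: 100 words, 5 sentences, 150 syllables.
print(flesch_kincaid_grade(100, 5, 150), flesch_reading_ease(100, 5, 150))
```

Both scores depend only on average sentence length and average syllables per word, which is consistent with the text clean-up (removal of bullets, numbers, and similar elements) described in the methods.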
Patient-oriented cancer information on the internet: a comparison of wikipedia and a professionally maintained database.

PubMed

Rajagopalan, Malolan S; Khanna, Vineet K; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N; Dicker, Adam P; Lawrence, Yaacov R

2011-09-01

A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention.
Transcultural adaptation and initial validation of Brazilian-Portuguese version of the Basel assessment of adherence to immunosuppressive medications scale (BAASIS) in kidney transplants

PubMed Central

2013-01-01

Background Transplant recipients are expected to adhere to a lifelong immunosuppressant therapeutic regimen. However, nonadherence to treatment is an underestimated problem for which no properly validated measurement tool is available for Portuguese-speaking patients. We aimed to initially validate the Basel Assessment of Adherence to Immunosuppressive Medications Scale (BAASIS®) to accurately estimate immunosuppressant nonadherence in Brazilian transplant patients. Methods The BAASIS® (English version) was transculturally adapted and its psychometric properties were assessed. The transcultural adaptation was performed using the Guillemin protocol. Psychometric testing included reliability (intraobserver and interobserver reproducibility, agreement, Kappa coefficient, and the Cronbach’s alpha) and validity (content, criterion, and construct validities). Results The final version of the transculturally adapted BAASIS® was pretested, and no difficulties in understanding its content were found. The intraobserver and interobserver reproducibility variances (0.007 and 0.003, respectively), the Cronbach’s alpha (0.7), Kappa coefficient (0.88) and the agreement (95.2%) suggest accuracy, preciseness and reliability. For construct validity, exploratory factorial analysis demonstrated unidimensionality of the first three questions (r = 0.76, r = 0.80, and r = 0.68). For criterion validity, the adapted BAASIS® was correlated with another self-report instrument, the Measure of Adherence to Treatment, and showed good congruence (r = 0.65). Conclusions The BAASIS® has adequate psychometric properties and may be employed in advance to measure adherence to posttransplant immunosuppressant treatments. This instrument will be the first one validated to use in this specific transplant population and in the Portuguese language. PMID:23692889
New endoscopic indicator of esophageal achalasia: "pinstripe pattern".

PubMed

Minami, Hitomi; Isomoto, Hajime; Miuma, Satoshi; Kobayashi, Yasutoshi; Yamaguchi, Naoyuki; Urabe, Shigetoshi; Matsushima, Kayoko; Akazawa, Yuko; Ohnita, Ken; Takeshima, Fuminao; Inoue, Haruhiro; Nakao, Kazuhiko

2015-01-01

Endoscopic diagnosis of esophageal achalasia lacking typical endoscopic features can be extremely difficult. The aim of this study was to identify a simple and reliable early indicator of esophageal achalasia. This single-center retrospective study included 56 cases of esophageal achalasia without previous treatment. As a control, 60 non-achalasia subjects including reflux esophagitis and superficial esophageal cancer were also included in this study. Endoscopic findings were evaluated according to Descriptive Rules for Achalasia of the Esophagus as follows: (1) esophageal dilatation, (2) abnormal retention of liquid and/or food, (3) whitish change of the mucosal surface, (4) functional stenosis of the esophago-gastric junction, and (5) abnormal contraction. Additionally, the presence of the longitudinal superficial wrinkles of esophageal mucosa, "pinstripe pattern (PSP)" was evaluated endoscopically. Then, inter-observer diagnostic agreement was assessed for each finding. The prevalence rates of the above-mentioned findings (1-5) were 41.1%, 41.1%, 16.1%, 94.6%, and 43.9%, respectively. PSP was observed in 60.7% of achalasia cases, while none of the controls showed positivity for PSP. PSP was observed in 26 (62.5%) of 35 cases with a shorter history (< 10 years), which usually lack typical findings such as severe esophageal dilation and tortuosity. Inter-observer agreement level was substantial for food/liquid remnant (k = 0.6861) and PSP (k = 0.6098), and was fair for abnormal contraction and white change. The accuracy, sensitivity, and specificity for achalasia were 83.8%, 64.7%, and 100%, respectively. "Pinstripe pattern" could be a reliable indicator for early discrimination of primary esophageal achalasia.
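The inter-observer agreement values quoted in this abstract are kappa statistics. As a minimal sketch, assuming two raters and nominal categories (the abstract does not state how many endoscopists rated each finding), Cohen's kappa compares observed agreement with the agreement expected by chance. The function name and the ratings below are hypothetical illustration data, not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who label the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: 1 = finding present, 0 = finding absent.
first = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
second = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]
print(round(cohens_kappa(first, second), 2))  # 0.6
```

In this example the raters agree on 8 of 10 cases (0.8) while chance agreement is 0.5, giving kappa = (0.8 - 0.5) / (1 - 0.5) = 0.6.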
Cross-cultural adaptation to Portuguese of tools for assessing the nutritional status of patients on dialysis.

PubMed

Fetter, Renata Lemos; Bigogno, Fernanda Guedes; de Oliveira, Fernanda Galvão Pasculli; Avesani, Carla Maria

2014-01-01

The 7 point subjective global assessment (7p-SGA) and the malnutrition inflammation score (MIS) are tools commonly applied for the assessment of nutritional status in dialyzed patients. Both were developed in English and require translation to Portuguese to be applied in Brazil. The cross-cultural equivalence process ensures semantic and measurement equivalence of a translated tool. To perform the cross-cultural adaptation to Portuguese of the 7p-SGA and MIS. Semantic equivalence was performed by the back-translation method and by assessing the degree of similarity between the original instrument and that back-translated from Portuguese to English (Back-translation). The assessment of measurement equivalence was made by evaluating the internal reliability (Cronbach's α) and interobserver reliability (two observers). One hundred and one elderly patients on hemodialysis (HD) were included. Both instruments showed a high degree of semantic similarity with results close to the maximum value (7p-SGA 96.8 ± 7.8 and MIS 99.6 ± 1.4). The internal consistency showed a Cronbach's α value for 7p-SGA of 0.72 and of 0.53 for MIS. The interobserver reproducibility of 7p-SGA was moderate (intraclass coefficient [ICC] = 0.74 [95% CI: 0.58; 0.84]), while that of MIS was strong (ICC = 0.88 [95% CI: 0.81; 0.93]). The 7p-SGA and MIS translated into Portuguese can be applied for assessing the nutritional status of elderly patients on HD. Studies testing the applicability of these instruments in adult patients on HD and in peritoneal dialysis should yet be performed.
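Internal consistency in this abstract is summarized with Cronbach's α. The sketch below shows the usual formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), using sample variances; the function name and the small score matrix are hypothetical and unrelated to the 7p-SGA or MIS data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    n_items = len(scores[0])
    if n_items < 2 or any(len(row) != n_items for row in scores):
        raise ValueError("every respondent needs the same two or more items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 4 respondents answering 2 items.
answers = [[2, 3], [4, 3], [3, 5], [5, 5]]
print(round(cronbach_alpha(answers), 2))  # 0.62
```

Values near 1 mean the items vary together; a value such as the 0.53 reported for MIS indicates that the items share less common variance.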

Accuracy of both virtual and printed 3-dimensional models for volumetric measurement of alveolar clefts before grafting with alveolar bone compared with a validated algorithm: a preliminary investigation.

PubMed

Kasaven, C P; McIntyre, G T; Mossey, P A

2017-01-01

Our objective was to assess the accuracy of virtual and printed 3-dimensional models derived from cone-beam computed tomographic (CT) scans to measure the volume of alveolar clefts before bone grafting. Fifteen subjects with unilateral cleft lip and palate had i-CAT cone-beam CT scans recorded at 0.2mm voxel and sectioned transversely into slices 0.2mm thick using i-CAT Vision. Volumes of alveolar clefts were calculated using first a validated algorithm; secondly, commercially-available virtual 3-dimensional model software; and finally 3-dimensional printed models, which were scanned with microCT and analysed using 3-dimensional software. For inter-observer reliability, a two-way mixed model intraclass correlation coefficient (ICC) was used to evaluate the reproducibility of identification of the cranial and caudal limits of the clefts among three observers. We used a Friedman test to assess the significance of differences among the methods, and probabilities of less than 0.05 were accepted as significant. Inter-observer reliability was almost perfect (ICC=0.987). There were no significant differences among the three methods. Virtual and printed 3-dimensional models were as precise as the validated computer algorithm in the calculation of volumes of the alveolar cleft before bone grafting, but virtual 3-dimensional models were the most accurate with the smallest 95% CI and, subject to further investigation, could be a useful adjunct in clinical practice. Copyright © 2016 The British Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
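The abstract reports a two-way mixed model intraclass correlation coefficient but does not say whether it was the consistency or absolute-agreement form, nor whether it refers to single or averaged measures. As a sketch under the assumption of the single-measure consistency form, often written ICC(3,1), the coefficient can be derived from a two-way ANOVA decomposition. The function name and ratings are hypothetical.

```python
def icc_consistency_single(ratings):
    """ICC(3,1): two-way mixed, consistency, single measures.

    `ratings` holds one row per subject and one column per observer.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two subjects are required")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every subject needs the same two or more observers")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares: total, between subjects, between observers, residual.
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("ICC is undefined when there is no variance")
    return (ms_rows - ms_error) / denominator


# Hypothetical volumes: three observers who differ only by a constant offset.
volumes = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(round(icc_consistency_single(volumes), 3))  # 1.0
```

The consistency form ignores systematic offsets between observers, so the example scores 1.0 even though the observers never give identical values; an absolute-agreement ICC would be lower for the same data.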
High-resolution dental magnetic resonance imaging for planning palatal graft surgery-a clinical pilot study.

PubMed

Hilgenfeld, Tim; Kästel, Thorsten; Heil, Alexander; Rammelsberg, Peter; Heiland, Sabine; Bendszus, Martin; Schwindling, Franz Sebastian

2018-04-01

To evaluate whether high-resolution, non-contrast-enhanced dental magnetic resonance imaging (MRI) can be used for accurate determination of palatal masticatory mucosa thickness (PMMT) and to locate the greater palatal artery (GPA). In five volunteers (four males, one female; mean age 30.2 ± 0.4 years), two independent raters measured PMMT by use of dental MRI in 180 positions. For comparison, clinical bone sounding was performed. The GPA was identified in time-of-flight (TOF) angiography and MSVAT-SPACE-prototype sequence. Intra- and inter-observer agreement for MRI measurements, agreement between MRI and bone sounding were analysed by intra-class correlation coefficient (ICC) and Cohen's kappa (κ). Reliability of dental MRI measurements was high (intra-observer-ICC 0.962; inter-observer ICC 0.959). Agreement of MRI measurements with bone sounding was moderate (ICC 0.744), and the GPA could be identified in 60% of measurement points using the TOF-angiography alone and in 85% with additional information of the MSVAT-SPACE. Good intra-observer agreement was observed for GPA identification (κ: 0.778). Palatal masticatory mucosa thickness measured by high-resolution, non-contrast enhanced dental MRI is comparable with that obtained by bone sounding. Dental MRI enables reliable, non-invasive and radiation-free planning of palatal tissue harvesting and can also be used for location of the GPA at 85% of measurement points, which might help reduce complications during surgery. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Leg lengthening and femoral-offset reduction after total hip arthroplasty: where is the problem - stem or cup positioning?

PubMed

Al-Amiry, Bariq; Mahmood, Sarwar; Krupic, Ferid; Sayed-Noor, Arkan

2017-09-01

Background Restoration of femoral offset (FO) and leg length is an important goal in total hip arthroplasty (THA) as it improves functional outcome. Purpose To analyze whether the problem of postoperative leg lengthening and FO reduction is related to the femoral stem or acetabular cup positioning or both. Material and Methods Between September 2010 and April 2013, 172 patients with unilateral primary osteoarthritis treated with THA were included. Postoperative leg-length discrepancy (LLD) and global FO (summation of cup and FO) were measured by two observers using a standardized protocol for evaluation of antero-posterior plain hip radiographs. Patients with postoperative leg lengthening ≥10 mm (n = 41) or with reduced global FO >5 mm (n = 58) were further studied by comparing the stem and cup length of the operated side with the contralateral side in the lengthening group, and by comparing the stem and cup offset of the operated side with the contralateral side in the FO reduction group. We evaluated also the inter-observer and intra-observer reliability of the radiological measurements. Results Both observers found that leg lengthening was related to the stem positioning while FO reduction was related to the positioning of both the femoral stem and acetabular cup. Both inter-observer reliability and intra-observer reproducibility were moderate to excellent (intra-class correlation co-efficient, ICC ≥0.69). Conclusion Post THA leg lengthening was mainly caused by improper femoral stem positioning while global FO reduction resulted from improper positioning of both the femoral stem and the acetabular cup.
Development and initial validation of the Classification of Early-Onset Scoliosis (C-EOS).

PubMed

Williams, Brendan A; Matsumoto, Hiroko; McCalla, Daren J; Akbarnia, Behrooz A; Blakemore, Laurel C; Betz, Randal R; Flynn, John M; Johnston, Charles E; McCarthy, Richard E; Roye, David P; Skaggs, David L; Smith, John T; Snyder, Brian D; Sponseller, Paul D; Sturm, Peter F; Thompson, George H; Yazici, Muharrem; Vitale, Michael G

2014-08-20

Early-onset scoliosis is a heterogeneous condition, with highly variable manifestations and natural history. No standardized classification system exists to describe and group patients, to guide optimal care, or to prognosticate outcomes within this population. A classification system for early-onset scoliosis is thus a necessary prerequisite to the timely evolution of care of these patients. Fifteen experienced surgeons participated in a nominal group technique designed to achieve a consensus-based classification system for early-onset scoliosis. A comprehensive list of factors important in managing early-onset scoliosis was generated using a standardized literature review, semi-structured interviews, and open forum discussion. Three group meetings and two rounds of surveying guided the selection of classification components, subgroupings, and cut-points. Initial validation of the system was conducted using an interobserver reliability assessment based on the classification of a series of thirty cases. Nominal group technique was used to identify three core variables (major curve angle, etiology, and kyphosis) with high group content validity scores. Age and curve progression ranked slightly lower. Participants evaluated the cases of thirty patients with early-onset scoliosis for reliability testing. The mean kappa value for etiology (0.64) was substantial, while the mean kappa values for major curve angle (0.95) and kyphosis (0.93) indicated almost perfect agreement. The final classification consisted of a continuous age prefix, etiology (congenital or structural, neuromuscular, syndromic, and idiopathic), major curve angle (1, 2, 3, or 4), and kyphosis (-, N, or +) variables, and an optional progression modifier (P0, P1, or P2). 
Utilizing formal consensus-building methods in a large group of surgeons experienced in treating early-onset scoliosis, a novel classification system for early-onset scoliosis was developed with all core components demonstrating substantial to excellent interobserver reliability. This classification system will serve as a foundation to guide ongoing research efforts and standardize communication in the clinical setting. Copyright © 2014 by The Journal of Bone and Joint Surgery, Incorporated.
Smartphone versus knee ligament arthrometer when size does not matter.

PubMed

Ferretti, Andrea; Valeo, Luigi; Mazza, Daniele; Muliere, Luca; Iorio, Paolo; Giovannetti, Giovanni; Conteduca, Fabio; Iorio, Raffaele

2014-10-01

The use of available mechanical methods to measure anterior tibial translation (ATT) in anterior cruciate ligament (ACL)-deficient knees is limited by size and costs. This study evaluated the performance of a portable device based on a downloadable electronic smartphone application to measure ATT in ACL-deficient knees. A specific smartphone application (SmartJoint) was developed for this purpose. Two independent observers nonsequentially measured the amount of ATT during execution of a maximum manual Lachman test in 35 patients with an ACL-deficient knee using KT 1000 and SmartJoint on both involved and uninvolved knees. As each examiner performed the test three times on each knee, a total of 840 measurements were collected. Statistical analysis compared intertest, interobserver and intra-observer reliability using the intraclass correlation coefficient (ICC). An ICC > 0.75 indicates excellent reproducibility among measurements. Mean amount of ATT on uninvolved knees was 6.1 mm [standard deviation (SD) = 2] with the KT 1000 and 6.4 mm (SD = 2) with SmartJoint. Mean side-to-side difference was 8.1 mm (SD = 4) with KT 1000 and 8.3 mm (SD = 3) with SmartJoint. Intertest reliability between the two methods yielded an ICC of 0.797 [95 % confidence interval (CI) 0.717-0.857] for the uninvolved knee and of 0.987 (CI 0.981-0.991) for the involved knee. Interobserver ICC for SmartJoint was 0.957 (CI 0.927-0.976) for the uninvolved knee and 0.992 (CI 0.986-0.996) for the involved knee; for KT 1000 it was 0.973 (CI 0.954-0.985) for the uninvolved knee and 0.989 (CI 0.981-0.994) for the involved knee. The performance of SmartJoint is comparable and highly correlated with measurements obtained from KT 1000. SmartJoint may provide a truly portable, noninvasive, accurate, reliable, inexpensive and widely accessible method to characterize ATT in ACL-deficient knee.
Assessing the Validity and Reliability of the Peristomal Skin Lesion Assessment Instrument Adapted for Use in Turkey.

PubMed

Ay, Ali; Bulut, Hulya

2015-08-01

Many ostomy patients experience peristomal skin lesions. A descriptive study was conducted to assess the validity, usability, and reliability of the Peristomal Skin Lesions Assessment instrument (SACS instrument) adapted to Turkish from English. The SACS Instrument consists of 2 main assessments: lesion type (utilizing definitions and photographs) and lesion area by location around the ostomy. The study was performed in 2 stages: 1) the SACS language was changed and its content validity established; and 2) the instrument's content validity and inter-observer agreement (consistency) were determined among pairs of nurses who used the tool to assess peristomal skin lesions. Patients (included if they were >18 years old and receiving treatment/observation at 1 of the 4 participating stomatherapy units) and 8 stomatherapy nurses also completed appropriate sociodemographic questionnaires. Of the 393 patients screened during the 7-month study, 100 (average age 56.74 ± 14.03 years, 55 men) participated; most (79) had a planned operation. A little more than half (59) of the patients had colorectal cancer and 28 had their stoma site marked preoperatively by a stomatherapy nurse. The most common peristomal skin lesion risk factors were having an ileostomy and unplanned surgery. The content validity index of the entire Turkish SACS instrument was 1, and the inter-observer agreement Kappa statistic was very good (K = 0.90, 95% CI 0.80–0.99). Individual SACS item K values ranged from K = 0.84 (95% CI 0.63–1) to K = 1 (95% CI 1). Most (62.5%) nurses found the terms and pictures used in the SACS classification adequate and suitable, and 50% believed the Turkish version of the SACS instrument was a valid and suitable assessment tool for use by Turkish stomatherapy nurses. Validity and reliability studies involving larger and more diverse patient and nurse samples are warranted.
Selecting process quality indicators for the integrated care of vulnerable older adults affected by cognitive impairment or dementia.

PubMed

Kröger, Edeltraut; Tourigny, André; Morin, Diane; Côté, Lise; Kergoat, Marie-Jeanne; Lebel, Paule; Robichaud, Line; Imbeault, Shirley; Proulx, Solange; Benounissa, Zohra

2007-11-29

This study aimed at evaluating face and content validity, feasibility and reliability of process quality indicators developed previously in the United States or other countries. The indicators can be used to evaluate care and services for vulnerable older adults affected by cognitive impairment or dementia within an integrated service system in Quebec, Canada. A total of 33 clinical experts from three major urban centres in Quebec formed a panel representing two medical specialties (family medicine, geriatrics) and seven health or social services specialties (nursing, occupational therapy, psychology, neuropsychology, pharmacy, nutrition, social work), from primary or secondary levels of care, including long-term care. A modified version of the RAND®/University of California at Los Angeles (UCLA) appropriateness method, a two-round Delphi panel, was used to assess face and content validity of process quality indicators. The appropriateness of indicators was evaluated according to a) agreement of the panel with three criteria, defined as a median rating of 7–9 on a nine-point rating scale, and b) agreement among panellists, judged by the statistical measure of the interpercentile range adjusted for symmetry. Feasibility of quality assessment and reliability of appropriate indicators were then evaluated within a pilot study on 29 patients affected by cognitive impairment or dementia. For measurable indicators the inter-observer reliability was calculated with the Kappa statistic. Initially, 82 indicators for care of vulnerable older adults with cognitive impairment or dementia were submitted to the panellists. Of those, 72 (88%) were accepted after two rounds. Among 29 patients for whom medical files of the preceding two years were evaluated, 63 (88%) of these indicators were considered applicable at least once, for at least one patient. Only 22 indicators were considered applicable at least once for ten or more out of 29 patients.
Four indicators could be measured with the help of a validated questionnaire on patient satisfaction. Inter-observer reliability was moderate (Kappa = 0.57). A multidisciplinary panel of experts judged a large majority of the initial indicators valid for use in integrated care systems for vulnerable older adults in Quebec, Canada. Most of these indicators can be measured using patient files or patient or caregiver interviews and reliability of assessment from patient-files is moderate.
Selecting process quality indicators for the integrated care of vulnerable older adults affected by cognitive impairment or dementia

PubMed Central

Kröger, Edeltraut; Tourigny, André; Morin, Diane; Côté, Lise; Kergoat, Marie-Jeanne; Lebel, Paule; Robichaud, Line; Imbeault, Shirley; Proulx, Solange; Benounissa, Zohra

2007-01-01

Background This study aimed at evaluating face and content validity, feasibility and reliability of process quality indicators developed previously in the United States or other countries. The indicators can be used to evaluate care and services for vulnerable older adults affected by cognitive impairment or dementia within an integrated service system in Quebec, Canada. Methods A total of 33 clinical experts from three major urban centres in Quebec formed a panel representing two medical specialties (family medicine, geriatrics) and seven health or social services specialties (nursing, occupational therapy, psychology, neuropsychology, pharmacy, nutrition, social work), from primary or secondary levels of care, including long-term care. A modified version of the RAND®/University of California at Los Angeles (UCLA) appropriateness method, a two-round Delphi panel, was used to assess face and content validity of process quality indicators. The appropriateness of indicators was evaluated according to a) agreement of the panel with three criteria, defined as a median rating of 7–9 on a nine-point rating scale, and b) agreement among panellists, judged by the statistical measure of the interpercentile range adjusted for symmetry. Feasibility of quality assessment and reliability of appropriate indicators were then evaluated within a pilot study on 29 patients affected by cognitive impairment or dementia. For measurable indicators the inter-observer reliability was calculated with the Kappa statistic. Results Initially, 82 indicators for care of vulnerable older adults with cognitive impairment or dementia were submitted to the panellists. Of those, 72 (88%) were accepted after two rounds. Among 29 patients for whom medical files of the preceding two years were evaluated, 63 (88%) of these indicators were considered applicable at least once, for at least one patient. Only 22 indicators were considered applicable at least once for ten or more out of 29 patients. 
Four indicators could be measured with the help of a validated questionnaire on patient satisfaction. Inter-observer reliability was moderate (Kappa = 0.57). Conclusion A multidisciplinary panel of experts judged a large majority of the initial indicators valid for use in integrated care systems for vulnerable older adults in Quebec, Canada. Most of these indicators can be measured using patient files or patient or caregiver interviews and reliability of assessment from patient-files is moderate. PMID:18047668
Accuracy and Reliability of the Klales et al. (2012) Morphoscopic Pelvic Sexing Method.

PubMed

Lesciotto, Kate M; Doershuk, Lily J

2018-01-01

Klales et al. (2012) devised an ordinal scoring system for the morphoscopic pelvic traits described by Phenice (1969) and used for sex estimation of skeletal remains. The aim of this study was to test the accuracy and reliability of the Klales method using a large sample from the Hamann-Todd collection (n = 279). Two observers were blinded to sex, ancestry, and age and used the Klales et al. method to estimate the sex of each individual. Sex was correctly estimated for females with over 95% accuracy; however, the male allocation accuracy was approximately 50%. Weighted Cohen's kappa and intraclass correlation coefficient analysis for evaluating intra- and interobserver error showed moderate to substantial agreement for all traits. Although each trait can be reliably scored using the Klales method, low accuracy rates and high sex bias indicate better trait descriptions and visual guides are necessary to more accurately reflect the range of morphological variation. © 2017 American Academy of Forensic Sciences.
A Tool for Measuring Active Learning in the Classroom

PubMed Central

Devlin, John W.; Kirwin, Jennifer L.; Qualters, Donna M.

2007-01-01

Objectives To develop a valid and reliable active-learning inventory tool for use in large classrooms and compare faculty perceptions of active-learning using the Active-Learning Inventory Tool. Methods The Active-Learning Inventory Tool was developed using published literature and validated by national experts in educational research. Reliability was established by trained faculty members who used the Active-Learning Inventory Tool to observe 9 pharmacy lectures. Instructors were then interviewed to elicit perceptions regarding active learning and asked to share their perceptions. Results Per lecture, 13 (range: 4-34) episodes of active learning encompassing 3 (range: 2-5) different types of active learning occurred over 2.2 minutes (0.6-16) per episode. Both interobserver (≥87%) and observer-instructor agreement (≥68%) were high for these outcomes. Conclusions The Active-Learning Inventory Tool is a valid and reliable tool to measure active learning in the classroom. Future studies are needed to determine the impact of the Active-Learning Inventory Tool on teaching and its usefulness in other disciplines. PMID:17998982
Reliability and accuracy analysis of a new semiautomatic radiographic measurement software in adult scoliosis.

PubMed

Aubin, Carl-Eric; Bellefleur, Christian; Joncas, Julie; de Lanauze, Dominic; Kadoury, Samuel; Blanke, Kathy; Parent, Stefan; Labelle, Hubert

2011-05-20

Radiographic software measurement analysis in adult scoliosis. To assess the accuracy as well as the intra- and interobserver reliability of measuring different indices on preoperative adult scoliosis radiographs using a novel measurement software that includes a calibration procedure and semiautomatic features to facilitate the measurement process. Scoliosis requires a careful radiographic evaluation to assess the deformity. Manual and computer radiographic process measures have been studied extensively to determine the reliability and reproducibility in adolescent idiopathic scoliosis. Most studies rely on comparing given measurements, which are repeated by the same user or by an expert user. A given measure with a small intra- or interobserver error might be deemed as good repeatability, but all measurements might not be truly accurate because the ground-truth value is often unknown. Thorough accuracy assessment of radiographic measures is necessary to assess scoliotic deformities, compare these measures at different stages or to permit valid multicenter studies. Thirty-four sets of adult scoliosis digital radiographs were measured two times by three independent observers using a novel radiographic measurement software that includes semiautomatic features to facilitate the measurement process. Twenty different measures taken from the Spinal Deformity Study Group radiographic measurement manual were performed on the coronal and sagittal images. Intra- and intermeasurer reliability for each measure was assessed. The accuracy of the measurement software was also assessed using a physical spine model in six different scoliotic configurations as a true reference. The majority of the measures demonstrated good to excellent intra- and intermeasurer reliability, except for sacral obliquity. 
The standard variation of all the measures was very small: ≤ 4.2° for Cobb angles, ≤ 4.2° for the kyphosis, ≤ 5.7° for the lordosis, ≤ 3.9° for the pelvic angles, and ≤5.3° for the sacral angles. The variability in the linear measurements (distances) was <4 mm. The variance of the measures was 1.7 and 2.6 times greater, respectively, for the angular and linear measures between the inter- and intrameasurer reliability. The image quality positively influenced the intermeasurer reliability especially for the proximal thoracic Cobb angle, T10-L2 lordosis, sacral slope and L5 seating. The accuracy study revealed that on average the difference in the angular measures was < 2° for the Cobb angles, and < 4° for the other angles, except T2-T12 kyphosis (5.3°). The linear measures were all <3.5 mm difference on average. The majority of the measures, which were analyzed in this study demonstrated good to excellent reliability and accuracy. The novel semiautomatic measurement software can be recommended for use for clinical, research or multicenter study purposes.
Inter- and intra- observer reliability of risk assessment of repetitive work without an explicit method.

PubMed

Eliasson, Kristina; Palm, Peter; Nyman, Teresia; Forsman, Mikael

2017-07-01

A common way to conduct practical risk assessments is to observe a job and report the observed long term risks for musculoskeletal disorders. The aim of this study was to evaluate the inter- and intra-observer reliability of ergonomists' risk assessments without the support of an explicit risk assessment method. Twenty-one experienced ergonomists assessed the risk level (low, moderate, high risk) of eight upper body regions, as well as the global risk of 10 video recorded work tasks. Intra-observer reliability was assessed by having nine of the ergonomists repeat the procedure at least three weeks after the first assessment. Each ergonomist made the risk assessments based on his/her own experience and knowledge. The statistical parameters of reliability included agreement in %, kappa, linearly weighted kappa, intraclass correlation and Kendall's coefficient of concordance. The average inter-observer agreement of the global risk was 53% and the corresponding weighted kappa (Kw) was 0.32, indicating fair reliability. The intra-observer agreement was 61% and the corresponding Kw was 0.41. This study indicates that risk assessments of the upper body, without the use of an explicit observational method, have non-acceptable reliability. It is therefore recommended to use systematic risk assessment methods to a higher degree. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
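Because the risk levels in this study are ordered (low, moderate, high), the abstract reports a linearly weighted kappa, which penalizes a low-versus-high disagreement more than a low-versus-moderate one. The sketch below assumes two raters and linear disagreement weights |i − j| / (k − 1); the function name and the ratings are hypothetical and are not the study's data.

```python
def linear_weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted kappa for two raters on ordered categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two ordered categories are required")
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)
    # Cross-tabulate the two raters' scores.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        table[index[a]][index[b]] += 1
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed += weight * table[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - observed / expected


levels = ["low", "moderate", "high"]
first = ["low", "low", "moderate", "moderate", "high", "high"]
second = ["low", "moderate", "moderate", "high", "high", "low"]
print(round(linear_weighted_kappa(first, second, levels), 2))  # 0.25
```

Here the raters agree exactly on only 3 of 6 tasks, and one of the three disagreements spans the full scale, so the weighted kappa stays low at 0.25.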
Fatigue in children: reliability and validity of the Dutch PedsQL™ Multidimensional Fatigue Scale.

PubMed

Gordijn, M Suzanne; Cremers, Eline M P; Kaspers, Gertjan J L; Gemke, Reinoud J B J

2011-09-01

The aim of the study is to report on the feasibility, reliability, validity, and the norm-references of the Dutch version of the PedsQL™ Multidimensional Fatigue Scale. The study participants are four hundred and ninety-seven parents of children aged 2-18 years and 366 children aged 5-18 years from various day care facilities, elementary schools, and a high school who completed the Dutch version of the PedsQL™ Multidimensional Fatigue Scale. The number of missing items was minimal. All scales showed satisfactory internal consistency reliability, with Cronbach's coefficient alpha exceeding 0.70. Test-retest reliability was good to excellent (ICCs 0.68-0.84) and inter-observer reliability varied from moderate to excellent (ICCs 0.56-0.93) for total scores. Parent/child concordance for total scores was poor to good (ICCs 0.25-0.68). The PedsQL™ Multidimensional Fatigue Scale was able to distinguish between healthy children and children with an impaired health condition. The Dutch version of the PedsQL™ Multidimensional Fatigue Scale demonstrates an adequate feasibility, reliability, and validity in another sociocultural context. With the obtained norm-references, it can be utilized as a tool in the evaluation of fatigue in healthy and chronically ill children aged 2-18 years.
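For reference, Cronbach's coefficient alpha, the internal consistency statistic cited in this abstract, can be sketched as below. The function name and the toy score matrix are assumptions made for illustration; they are not data from the PedsQL study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    `scores` has one row per respondent and one column per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of four respondents to three items (not study data).
toy = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5]]
print(round(cronbach_alpha(toy), 3))
```

A value above 0.70, the threshold mentioned in the abstract, is commonly read as satisfactory internal consistency.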
The Neurologic Assessment in Neuro-Oncology (NANO) scale: a tool to assess neurologic function for integration into the Response Assessment in Neuro-Oncology (RANO) criteria

PubMed Central

DeAngelis, Lisa M.; Brandes, Alba A.; Peereboom, David M.; Galanis, Evanthia; Lin, Nancy U.; Soffietti, Riccardo; Macdonald, David R.; Chamberlain, Marc; Perry, James; Jaeckle, Kurt; Mehta, Minesh; Stupp, Roger; Muzikansky, Alona; Pentsova, Elena; Cloughesy, Timothy; Iwamoto, Fabio M.; Tonn, Joerg-Christian; Vogelbaum, Michael A.; Wen, Patrick Y.; van den Bent, Martin J.; Reardon, David A.

2017-01-01

Abstract Background. The Macdonald criteria and the Response Assessment in Neuro-Oncology (RANO) criteria define radiologic parameters to classify therapeutic outcome among patients with malignant glioma and specify that clinical status must be incorporated and prioritized for overall assessment. But neither provides specific parameters to do so. We hypothesized that a standardized metric to measure neurologic function will permit more effective overall response assessment in neuro-oncology. Methods. An international group of physicians including neurologists, medical oncologists, radiation oncologists, and neurosurgeons with expertise in neuro-oncology drafted the Neurologic Assessment in Neuro-Oncology (NANO) scale as an objective and quantifiable metric of neurologic function evaluable during a routine office examination. The scale was subsequently tested in a multicenter study to determine its overall reliability, inter-observer variability, and feasibility. Results. The NANO scale is a quantifiable evaluation of 9 relevant neurologic domains based on direct observation and testing conducted during routine office visits. The score defines overall response criteria. A prospective, multinational study noted a >90% inter-observer agreement rate with kappa statistic ranging from 0.35 to 0.83 (fair to almost perfect agreement), and a median assessment time of 4 minutes (interquartile range, 3–5). Conclusion. The NANO scale provides an objective clinician-reported outcome of neurologic function with high inter-observer agreement. It is designed to combine with radiographic assessment to provide an overall assessment of outcome for neuro-oncology patients in clinical trials and in daily practice. Furthermore, it complements existing patient-reported outcomes and cognition testing to combine for a global clinical outcome assessment of well-being among brain tumor patients. PMID:28453751
Inter-observer variance with the diagnosis of myelodysplastic syndromes (MDS) following the 2008 WHO classification.

PubMed

Font, P; Loscertales, J; Benavente, C; Bermejo, A; Callejas, M; Garcia-Alonso, L; Garcia-Marcilla, A; Gil, S; Lopez-Rubio, M; Martin, E; Muñoz, C; Ricard, P; Soto, C; Balsalobre, P; Villegas, A

2013-01-01

Morphology is the basis of the diagnosis of myelodysplastic syndromes (MDS). The WHO classification offers prognostic information and helps with the treatment decisions. However, morphological changes are subject to potential inter-observer variance. The aim of our study was to explore the reliability of the 2008 WHO classification of MDS, reviewing 100 samples previously diagnosed with MDS using the 2001 WHO criteria. Specimens were collected from 10 hospitals and were evaluated by 10 morphologists, working in five pairs. Each observer evaluated 20 samples, and each sample was analyzed independently by two morphologists. The second observer was blinded to the clinical and laboratory data, except for the peripheral blood (PB) counts. Nineteen cases were considered as unclassified MDS (MDS-U) by the 2001 WHO classification, but only three remained as MDS-U by the 2008 WHO proposal. Discordance was observed in 26 of the 95 samples considered suitable (27 %). Although there were a high number of observers taking part, the rate of discordance was quite similar among the five pairs. The inter-observer concordance was very good regarding refractory anemia with excess blasts type 1 (RAEB-1) (10 of 12 cases, 84 %), RAEB-2 (nine of 10 cases, 90 %), and also good regarding refractory cytopenia with multilineage dysplasia (37 of 50 cases, 74 %). However, the categories with unilineage dysplasia were not reproducible in most of the cases. The rate of concordance with refractory cytopenia with unilineage dysplasia was 40 % (two of five cases) and 25 % with RA with ring sideroblasts (two of eight). Our results show that the 2008 WHO classification gives a more accurate stratification of MDS but also illustrates the difficulty in diagnosing MDS with unilineage dysplasia.
Three-Dimensional Eyeball and Orbit Volume Modification After LeFort III Midface Distraction.

PubMed

Smektala, Tomasz; Nysjö, Johan; Thor, Andreas; Homik, Aleksandra; Sporniak-Tutak, Katarzyna; Safranow, Krzysztof; Dowgierd, Krzysztof; Olszewski, Raphael

2015-07-01

The aim of our study was to evaluate orbital volume modification with LeFort III midface distraction in patients with craniosynostosis and its influence on eyeball volume and axial diameter modification. Orbital volume was assessed by the semiautomatic segmentation method based on deformable surface models and on 3-dimensional (3D) interaction with haptics. The eyeball volumes and diameters were automatically calculated after manual segmentation of computed tomographic scans with 3D slicer software. The mean, minimal, and maximal differences as well as the standard deviation and intraclass correlation coefficient (ICC) for intraobserver and interobserver measurements reliability were calculated. The Wilcoxon signed rank test was used to compare measured values before and after surgery. P < 0.05 was considered statistically significant. Intraobserver and interobserver ICC for haptic-aided semiautomatic orbital volume measurements were 0.98 and 0.99, respectively. The intraobserver and interobserver ICC values for manual segmentation of the eyeball volume were 0.87 and 0.86, respectively. The orbital volume increased significantly after surgery: 30.32% (mean, 5.96 mL) for the left orbit and 31.04% (mean, 6.31 mL) for the right orbit. The mean increase in eyeball volume was 12.3%. The mean increases in the eyeball axial dimensions were 7.3%, 9.3%, and 4.4% for the X-, Y-, and Z-axes, respectively. The Wilcoxon signed rank test showed that preoperative and postoperative eyeball volumes, as well as the diameters along the X- and Y-axes, were statistically significant. Midface distraction in patients with syndromic craniostenosis results in a significant increase (P < 0.05) in the orbit and eyeball volumes. The 2 methods (haptic-aided semiautomatic segmentation and manual 3D slicer segmentation) are reproducible techniques for orbit and eyeball volume measurements.
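The abstract reports intraclass correlation coefficients without naming the ICC form. As a hedged sketch, the code below computes one common variant, the two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1). The choice of this variant, the function name, and the example volumes are all assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` has one row per subject and one column per observer.
    """
    n = len(ratings)        # subjects
    k = len(ratings[0])     # observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical orbital volumes in mL measured by two observers (not study data).
volumes = [[19.8, 20.1], [22.4, 22.0], [25.1, 25.6], [18.3, 18.5]]
print(round(icc_2_1(volumes), 3))
```

Other ICC forms (one-way, or consistency rather than absolute agreement) use the same mean squares in a different ratio, so the variant should always be stated alongside the value.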
The definition of radiological signs in gastric ulcer and assessment of their validity by inter-observer variation study.

PubMed

Schulman, A; Simpkins, K C

1975-07-01

The initial aim was to program a computer with information on the frequency of radiological signs in benign and malignant gastric ulcers in order to obtain a percentage probability of benignancy or malignancy in succeeding ulcers in clinical practice. However, only four of the many signs described in gastric ulcer were confirmed to be of validity (i.e. reliable existence) by an inter-observer variation study using two observers and the films from 69 barium meal examinations. These were projection or non-projection of the in-profile ulcer, presence or absence of adjacent mucosal folds, good or poor definition of the in-face ulcer's edge, and extension of radiating folds to the in-face ulcer's edge. A few more remained unassessed due to insufficient numbers of relevant cases. It is concluded that, as defined in the literature, the majority of radiological signs in this field are of uncertain existence, and that the four found to be valid do not fully describe the important appearances that may be seen in benign and malignant ulcers and would be inadequate to differentiate them to a sufficiently high degree of probability.
Evaluation of Cohen's cross-section trichometer for measuring hair quantity.

PubMed

Hendriks, Maria A E; Geerts, Paulus A F; Dercksen, Marcus W; van den Hurk, Corina J G; Breed, Wim P M

2012-04-01

Until now, there has been no reliable, simple method available for measuring hair quantity that is suitable in clinical practice. Recently, the cross-section trichometer by Cohen has been introduced. This study was designed to test its clinical utility. The hair mass index (HMI) is the ratio of the cross-sectional area of an isolated bundle of hair and the premeasured area of skin from which it was taken using the trichometer device. The intra- and interobserver reproducibility of measurements at the same location and after relocation were evaluated. For intraobserver reproducibility, the HMI ranged from 3 to 120 (mean difference 0.2, 95% confidence interval [CI] = -4.7 to 5.1, correlation coefficient [r] = 0.99). For interobserver reproducibility, the HMI ranged from 18 to 119 (mean difference -0.4, 95% CI = -8.0 to 7.2, r = 0.98). With relocation, the HMI ranged from 2 to 113 (mean difference -1.0, 95% CI = -10.1 to 8.1, r = 0.97). Measurements took 5-10 minutes per area. Measurements were simple to perform, and the data showed high reproducibility. The trichometer is a promising technology for hair quantity measurements and has multiple clinical and research applications. © 2012 by the American Society for Dermatologic Surgery, Inc. Published by Wiley Periodicals, Inc.
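The abstract summarizes reproducibility as a mean difference with an interval around it but does not state how that interval was derived. One common approach, shown here purely as an assumption, is the mean of the paired differences plus or minus 1.96 standard deviations (the Bland-Altman limits of agreement). The function name and the HMI readings are hypothetical.

```python
from statistics import mean, stdev


def difference_summary(first, second):
    """Mean paired difference and the mean +/- 1.96 SD interval around it."""
    diffs = [a - b for a, b in zip(first, second)]
    d = mean(diffs)
    s = stdev(diffs)
    return d, d - 1.96 * s, d + 1.96 * s


# Hypothetical repeated HMI readings at the same site (not study data).
reading_1 = [42.0, 77.0, 15.0, 98.0, 60.0]
reading_2 = [40.0, 80.0, 16.0, 95.0, 61.0]
print(difference_summary(reading_1, reading_2))
```

A mean difference near zero with a narrow interval indicates that repeated measurements neither drift systematically nor scatter widely.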
Caregiver Person-Centeredness and Behavioral Symptoms during Mealtime Interactions: Development and Feasibility of a Coding Scheme

PubMed Central

Gilmore-Bykovskyi, Andrea L.

2015-01-01

Mealtime behavioral symptoms are distressing and frequently interrupt eating for the individual experiencing them and others in the environment. In order to enable identification of potential antecedents to mealtime behavioral symptoms, a computer-assisted coding scheme was developed to measure caregiver person-centeredness and behavioral symptoms for nursing home residents with dementia during mealtime interactions. The purpose of this pilot study was to determine the acceptability and feasibility of procedures for video-capturing naturally-occurring mealtime interactions between caregivers and residents with dementia, to assess the feasibility, ease of use, and inter-observer reliability of the coding scheme, and to explore the clinical utility of the coding scheme. Trained observers coded 22 observations. Data collection procedures were feasible and acceptable to caregivers, residents and their legally authorized representatives. Overall, the coding scheme proved to be feasible, easy to execute and yielded good to very good inter-observer agreement following observer re-training. The coding scheme captured clinically relevant, modifiable antecedents to mealtime behavioral symptoms, but would be enhanced by the inclusion of measures for resident engagement and consolidation of items for measuring caregiver person-centeredness that co-occurred and were difficult for observers to distinguish. PMID:25784080
Reliability in endoscopic diagnosis of portal hypertensive gastropathy

PubMed Central

de Macedo, George Fred Soares; Ferreira, Fabio Gonçalves; Ribeiro, Maurício Alves; Szutan, Luiz Arnaldo; Assef, Mauricio Saab; Rossini, Lucio Giovanni Battista

2013-01-01

AIM: To analyze reliability among endoscopists in diagnosing portal hypertensive gastropathy (PHG) and to determine which criteria from the most utilized classifications are the most suitable. METHODS: From January to July 2009, in an academic quaternary referral center at Santa Casa of São Paulo Endoscopy Service, Brazil, we performed this single-center prospective study. In this period, we included 100 patients, including 50 sequential patients who had portal hypertension of various etiologies; who were previously diagnosed based on clinical, laboratory and imaging exams; and who presented with esophageal varices. In addition, our study included 50 sequential patients who had dyspeptic symptoms and were referred for upper digestive endoscopy without portal hypertension. All subjects underwent upper digestive endoscopy, and the images of the exam were digitally recorded. Five endoscopists with more than 15 years of experience answered an electronic questionnaire, which included endoscopic criteria from the 3 most commonly used Portal Hypertensive Gastropathy classifications (McCormack, NIEC and Baveno) and the presence of elevated or flat antral erosive gastritis. All five endoscopists were blinded to the patients’ clinical information, and all images of varices were deliberately excluded for the analysis. RESULTS: The three most common etiologies of portal hypertension were schistosomiasis (36%), alcoholic cirrhosis (20%) and viral cirrhosis (14%). Of the 50 patients with portal hypertension, 84% were Child A, 12% were Child B, 4% were Child C, 64% exhibited previous variceal bleeding and 66% were previously endoscopic treated. The endoscopic parameters, presence or absence of mosaic-like pattern, red point lesions and cherry-red spots were associated with high inter-observer reliability and high specificity for diagnosing Portal Hypertensive Gastropathy. 
Sensitivity, specificity and reliability for the diagnosis of PHG (%) were as follows: mosaic-like pattern (100; 92.21; High); fine pink speckling (56; 76.62; Unsatisfactory); superficial reddening (69.57; 66.23; Unsatisfactory); red-point lesions (47.83; 90.91; High); cherry-red spots (39.13; 96.10; High); isolated red marks (43.48; 88.31; High); and confluent red marks (21.74; 100; Unsatisfactory). Antral elevated erosive gastritis exhibited high reliability and high specificity with respect to the presence of portal hypertension (92%) and the diagnosis of portal hypertensive gastropathy (88.31%). CONCLUSION: The most suitable endoscopic criteria for the diagnosis of PHG were mosaic-like pattern, red-point lesions and cherry-red spots with no subdivisions, which were associated with a high rate of inter-observer reliability. PMID:23858376
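The sensitivity and specificity percentages listed above follow from a two-by-two table of endoscopic finding against diagnosis. The sketch below shows the arithmetic. The counts are hypothetical, chosen only so that the result matches the percentages reported for the mosaic-like pattern; the abstract does not give the underlying counts.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity %, specificity %) from two-by-two table counts.

    tp/fn: diseased cases with/without the sign.
    tn/fp: non-diseased cases without/with the sign.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts (assumption): 23 cases with PHG, 77 without.
sens, spec = sensitivity_specificity(tp=23, fn=0, tn=71, fp=6)
print(round(sens, 2), round(spec, 2))
```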

Implementation of a Posted Schedule to Increase Class-Wide Interobserver Agreement Assessment

ERIC Educational Resources Information Center

Doucette, Stefanie; DiGennaro Reed, Florence D.; Reed, Derek D.; Maguire, Helena; Marquardt, Heidi

2012-01-01

The present study investigated the impact of an antecedent intervention in the form of a daily posted schedule on the interobserver agreement (IOA) assessment of educational goals implemented within a classroom at a private school serving individuals with disabilities. During baseline, the percentage of academic goals with interobserver agreement…
Prospective Validation of Intra- and Interobserver Reproducibility of a New Point Shear Wave Elastographic Technique for Assessing Liver Stiffness in Patients with Chronic Liver Disease.

PubMed

Ahn, Su Joa; Lee, Jeong Min; Chang, Won; Lee, Sang Min; Kang, Hyo-Jin; Yang, Hyunkyung; Yoon, Jeong Hee; Park, Sae Jin; Han, Joon Koo

2017-01-01

To assess intra- and inter-observer reproducibility of a new point shear wave elastography technique (pSWE, S-Shearwave, Samsung Medison) and compare its accuracy in assessing liver stiffness (LS) with an established pSWE technique (Virtual Touch Quantification, VTQ). Thirty-three patients were enrolled in this Institutional Review Board-approved prospective study. LS values were measured by VTQ on an Acuson S2000 system (Siemens Healthineer) and S-Shearwave on an RS-80A (Samsung Medison) in the same session, followed by two further S-Shearwave sessions for inter- and intra-observer variation at 8-hour intervals. The technical success rate (SR) and reliability of the measurements of both pSWE techniques were compared. The intra- and inter-observer reproducibility of S-Shearwave was determined by intraclass correlation coefficients (ICCs). LS values were measured by both methods of pSWE. The diagnostic performance in severe fibrosis (F ≥ 3) and cirrhosis (F = 4) was evaluated using the receiver operating characteristics curve analysis and the Obuchowski measure with the LS values of transient elastography as the referenced standard. The VTQ (100%, 33/33) and S-Shearwave (96.9%, 32/33) techniques did not display a significant difference in technical SR ( p = 0.63) or reliability of LS measurements (96.9%, 32/33; 93.9%, 30/32, respectively, p = 0.61). The inter- and intra-observer agreement for LS measurements using the S-Shearwave technique was excellent (ICC = 0.98 and 0.99, respectively). The mean LS values of both pSWE techniques were not significantly different and exhibited a good correlation (r = 0.78). 
To detect F ≥ 3 and F = 4, VTQ and S-Shearwave showed comparable diagnostic accuracy as indicated by the following outcomes: areas under receiver operating characteristics curve (AUROC) = 0.87 (95% confidence intervals [CI] 0.70-0.96), 0.89 for VTQ (95% CI 0.74-0.97), respectively; and AUROC = 0.84 (95% CI 0.67-0.94), 0.94 (95% CI 0.80-0.99) for S-Shearwave (p > 0.48), respectively. The Obuchowski measures were similarly high for S-Shearwave and VTQ (0.94 vs. 0.95). S-Shearwave shows excellent inter- and intra-observer agreement and diagnostic effectiveness comparable to VTQ in detecting LS.
Interobserver and intermodality agreement of standardized algorithms for non-invasive diagnosis of hepatocellular carcinoma in high-risk patients: CEUS-LI-RADS versus MRI-LI-RADS.

PubMed

Schellhaas, Barbara; Hammon, Matthias; Strobel, Deike; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger S; Cavallaro, Alexander; Janka, Rolf; Neurath, Markus F; Uder, Michael; Seuss, Hannes

2018-04-19

We compared the interobserver agreement for the recently introduced contrast-enhanced ultrasound (CEUS)-based algorithm CEUS-LI-RADS (Liver Imaging Reporting and Data System) versus the well-established magnetic resonance imaging (MRI)-LI-RADS for non-invasive diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 50 high-risk patients (mean age 66.2 ± 11.8 years; 39 male) were assessed retrospectively with CEUS and MRI. Two independent observers reviewed CEUS and MRI examinations, separately, classifying observations according to CEUS-LI-RADSv.2016 and MRI-LI-RADSv.2014. Interobserver agreement was assessed with Cohen's kappa. Forty-three lesions were HCCs; two were intrahepatic cholangiocarcinomas; five were benign lesions. Arterial phase hyperenhancement was perceived less frequently with CEUS than with MRI (37/50 / 38/50 lesions = 74%/78% [CEUS; observer 1/observer 2] versus 46/50 / 44/50 lesions = 92%/88% [MRI; observer 1/observer 2]). Washout appearance was observed in 34/50 / 20/50 lesions = 68%/40% with CEUS and 31/50 / 31/50 lesions = 62%/62% with MRI. Interobserver agreement was moderate for arterial hyperenhancement (κ = 0.511/0.565 [CEUS/MRI]) and "washout" (κ = 0.490/0.582 [CEUS/MRI]), fair for CEUS-LI-RADS category (κ = 0.309) and substantial for MRI-LI-RADS category (κ = 0.609). Intermodality agreement was fair for arterial hyperenhancement (κ = 0.329), slight to fair for "washout" (κ = 0.202) and LI-RADS category (κ = 0.218). CONCLUSION: Interobserver agreement is substantial for MRI-LI-RADS and only fair for CEUS-LI-RADS. This is mostly because interobserver agreement in the perception of washout appearance is better in MRI than in CEUS. Further refinement of the LI-RADS algorithms and increasing education and practice may be necessary to improve the concordance between CEUS and MRI for the final LI-RADS categorization. 
• CEUS-LI-RADS and MRI-LIRADS enable standardized non-invasive diagnosis of HCC in high-risk patients. • With CEUS, interobserver agreement is better for arterial hyperenhancement than for "washout". • Interobserver agreement for major features is moderate for both CEUS and MRI. • Interobserver agreement for LI-RADS category is substantial for MRI, and fair for CEUS. • Interobserver-agreement for CEUS-LI-RADS will presumably improve with ongoing use of the algorithm.
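The verbal labels attached to the kappa values above (fair, moderate, substantial) match the bands commonly attributed to Landis and Koch. The abstract does not say which convention was used, so the mapping below is an assumption, and the function name is illustrative.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Kappa values quoted in the abstract above.
for value in (0.309, 0.511, 0.609):
    print(value, landis_koch_label(value))
```

These cut-offs are a reporting convention rather than a statistical test, so values close to a boundary are best reported with the numeric kappa alongside the label.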
Acoustic radiation force impulse tissue characterization of the anterior talofibular ligament: A promising non-invasive approach in ankle imaging.

PubMed

Hotfiel, Thilo; Heiss, Rafael; Janka, Rolf; Forst, Raimund; Raithel, Martine; Lutter, Christoph; Gelse, Kolja; Pachowsky, Milena; Golditz, Tobias

2018-06-09

The anterior talofibular ligament (ATFL) is the most frequently injured ligament during inversion strains of the ankle. The purpose of this study was to evaluate the feasibility of acoustic radiation force impulse (ARFI) elastography and to determine the in vivo mechanical properties of the ATFL in healthy athletes. Fifty-one healthy athletes (32 female, 28 male; 29 ±2 years) were recruited from the medical and sports faculty. ARFI values, represented as shear wave velocities (SWV), as well as conventional ultrasound were obtained for the ATFL in neutral ankle position. A clinical assessment was performed in which the American Orthopaedic Foot & Ankle Society (AOFAS) Ankle-Hindfoot Score and the functional ankle ability measure (FAAM) were collected. Interobserver and intraobserver reliability (repeated sessions and repeated days) were assessed using an intraclass correlation coefficient (ICC) and typical error (TE) calculation in absolute units (TE) and in relative units as coefficient of variation (CV). SWV values of the ATFL had an average velocity of 1.79±0.34 m/s for all participants, with an average of 1.72±0.36 m/s for females and 1.85±0.31 m/s for males. The interobserver and intraobserver reliability revealed an ICC of 0.902 and 0.933 (TE of 0.67 (CV: 5.2 %) and 0.51 m/s (CV: 3.83 %)), respectively. FAAM and AOFAS revealed the best possible scores. ARFI seems to be a valuable diagnostic modality and represents a promising imaging marker for the assessment and monitoring of ankle ligaments in the context of acute and chronic ankle instabilities; ARFI could also be used to investigate loading or sport dependent adaptions.
SU-E-J-252: Reproducibility of Radiogenomic Image Features: Comparison of Two Semi-Automated Segmentation Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, M; Woo, B; Kim, J

Purpose: Objective and reliable quantification of imaging phenotype is an essential part of radiogenomic studies. We compared the reproducibility of two semi-automatic segmentation methods for quantitative image phenotyping in magnetic resonance imaging (MRI) of glioblastoma multiforme (GBM). Methods: MRI examinations with T1 post-gadolinium and FLAIR sequences of 10 GBM patients were downloaded from the Cancer Image Archive site. Two semi-automatic segmentation tools with different algorithms (deformable model and grow cut method) were used to segment contrast enhancement, necrosis and edema regions by two independent observers. A total of 21 imaging features consisting of area and edge groups were extracted automatically from the segmented tumor. The inter-observer variability and coefficient of variation (COV) were calculated to evaluate the reproducibility. Results: Inter-observer correlations and coefficient of variation of imaging features with the deformable model ranged from 0.953 to 0.999 and 2.1% to 9.2%, respectively, and the grow cut method ranged from 0.799 to 0.976 and 3.5% to 26.6%, respectively. Coefficient of variation for especially important features which were previously reported as predictive of patient survival were: 3.4% with deformable model and 7.4% with grow cut method for the proportion of contrast enhanced tumor region; 5.5% with deformable model and 25.7% with grow cut method for the proportion of necrosis; and 2.1% with deformable model and 4.4% with grow cut method for edge sharpness of tumor on CE-T1W1. Conclusion: Comparison of two semi-automated tumor segmentation techniques shows reliable image feature extraction for radiogenomic analysis of GBM patients with multiparametric Brain MRI.
Reproducibility and reliability of the ankle-brachial index as assessed by vascular experts, family physicians and nurses.

PubMed

Holland-Letz, Tim; Endres, Heinz G; Biedermann, Stefanie; Mahn, Matthias; Kunert, Joachim; Groh, Sabine; Pittrow, David; von Bilderling, Peter; Sternitzky, Reinhardt; Diehm, Curt

2007-05-01

The reliability of ankle-brachial index (ABI) measurements performed by different observer groups in primary care has not yet been determined. The aims of the study were to provide precise estimates for all effects influencing the variability of the ABI (patients' individual variability, intra- and inter-observer variability), with particular focus on the performance of different observer groups. Using a partially balanced incomplete block design, 144 unselected individuals aged ≥ 65 years underwent double ABI measurements by one vascular surgeon or vascular physician, one family physician and one nurse with training in Doppler sonography. Three groups comprising a total of 108 individuals were analyzed (only two with ABI < 0.90). Errors for two repeated measurements for all three observer groups did not differ (experts 8.5%, family physicians 7.7%, and nurses 7.5%, p = 0.39). There was no relevant bias among observer groups. Intra-observer variability expressed as standard deviation divided by the mean was 8%, and inter-observer variability was 9%. In conclusion, reproducibility of the ABI measurement was good in this cohort of elderly patients who almost all had values in the normal range. The mean error of 8-9% within or between observers is smaller than with established screening measures. Since there were no differences among observers with different training backgrounds, our study confirms the appropriateness of ABI assessment for screening peripheral arterial disease (PAD) and generalized atherosclerosis in the primary care setting. Given the importance of the early detection and management of PAD, this diagnostic tool should be used routinely as a standard for PAD screening. Additional studies will be required to confirm our observations in patients with PAD of various severities.
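The abstract expresses observer variability as the standard deviation divided by the mean, that is, a coefficient of variation. A minimal sketch of that calculation follows; the function name and the repeated ABI readings are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical repeated ABI readings on one individual (not study data).
abi_readings = [1.05, 1.12, 0.98, 1.10]
print(round(coefficient_of_variation(abi_readings), 1))
```

Because the result is scaled by the mean, it allows variability to be compared across measures that have different units or magnitudes.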
Evaluation of a modified knee rotation angle in MRI scans with and without trochlear dysplasia: a parameter independent of knee size and trochlear morphology.

PubMed

Dornacher, Daniel; Trubrich, Angela; Guelke, Joachim; Reichel, Heiko; Kappe, Thomas

2017-08-01

Regarding TT-TG in knee realignment surgery, two aspects have to be considered: first, there might be flaws in using absolute values for TT-TG, ignoring the knee size of the individual. Second, in high-grade trochlear dysplasia with a dome-shaped trochlea, measurement of TT-TG has proven to lack precision and reliability. The purpose of this examination was to establish a knee rotation angle, independent of the size of the individual knee and unaffected by a dysplastic trochlea. A total of 114 consecutive MRI scans of knee joints were analysed by two observers, retrospectively. Of these, 59 were obtained from patients with trochlear dysplasia, and another 55 were obtained from patients presenting with a different pathology of the knee joint. Trochlear dysplasia was classified into low grade and high grade. TT-TG was measured according to the method described by Schoettle et al. In addition, a modified knee rotation angle was assessed. Interobserver reliability of the knee rotation angle and its correlation with TT-TG was calculated. The knee rotation angle showed good correlation with TT-TG in the readings of observer 1 and observer 2. Interobserver correlation of the parameter showed excellent values for the scans with normal trochlea, low-grade and high-grade trochlear dysplasia, respectively. All calculations were statistically significant (p < 0.05). The knee rotation angle might meet the requirements for precise diagnostics in knee realignment surgery. Unlike TT-TG, this parameter seems not to be affected by a dysplastic trochlea. In addition, the dimensionless parameter is independent of the knee size of the individual. II.
Quantification of hardness, elasticity and viscosity of the skin of patients with systemic sclerosis using a novel sensing device (Vesmeter): a proposal for a new outcome measurement procedure.

PubMed

Kuwahara, Y; Shima, Y; Shirayama, D; Kawai, M; Hagihara, K; Hirano, T; Arimitsu, J; Ogata, A; Tanaka, T; Kawase, I

2008-07-01

No objective method to measure skin involvement in SSc has been established. We developed a novel method using a computer-linked device to simultaneously quantify physical properties of the skin such as hardness, elasticity and viscosity. Skin hardness was calculated by measuring the depth of an indenter pressed onto the skin. The Voigt model was used to calculate skin elasticity, viscosity, visco-elastic ratio and relaxation time by analysing the waveform of skin surface behaviour. The results were compared with the modified Rodnan skin score (mRSS) obtained at 17 sites on the bodies of 20 SSc patients and 20 healthy controls. A functional assessment questionnaire was administered to determine how skin hardness represents a patient's disability. We also examined intra- and inter-observer variability to determine the reliability of this method. The crude hardness obtained with this device correlated well with the standard hardness specified by the American Society for Testing and Materials (ASTM, r = 0.957). A close relationship between hardness and total mRSS was also observed (r = 0.832). Skin elasticity correlated positively, and relaxation time negatively with mRSS. Functional disability correlated more closely with skin hardness (r = 0.643) than with mRSS (r = 0.517). Intra- and inter-observer variabilities were 7.63 and 19.76%, respectively, which were lower than those reported for mRSS. Increases in hardness and elasticity as well as shortening of relaxation time constitute objective characteristics of skin involvement in SSc. The system devised by us proved to be able to assess skin abnormalities of SSc with high reliability.
A comparison of film and computer workstation measurements of degenerative spondylolisthesis: intraobserver and interobserver reliability.

PubMed

Bolesta, Michael J; Winslow, Lauren; Gill, Kevin

2010-06-01

A comparison of measurements of degenerative spondylolisthesis made on film and on computer workstations. To determine whether the 2 methodologies are comparable in some of the parameters used to assess lumbar degenerative spondylolisthesis. Digital radiology has been replacing analog radiographs. In scoliosis, several studies have shown that measurements made on digital and analog films are similar and that they are also similar to those made on computer workstations. Such work has not been done in spondylolisthesis. Twenty-four cases of lumbar degenerative spondylolisthesis were identified from our clinic practice. Three observers measured anterior displacement, sagittal rotation, and lumbar lordosis on digital films using the same protractor and pencil. The same parameters were measured on the same studies at clinical workstations. All measurements were repeated 2 weeks later. A statistician determined the intra- and interobserver reliability of the 2 measurement methods and the degree of agreement between the 2 methods. The differences between the first and second readings did reach statistical significance in some cases, but none of them were large enough to be clinically meaningful. The interclass correlation coefficients (ICCs) were ≥ 0.80 except for one (0.67). The difference among the 3 observers was similarly statistically significant in a few instances but not enough to influence clinical decisions and with good ICCs (0.67 and better). Similarly, the differences in the 2 methods were small, and ICCs ranged from 0.69 to 0.98. This study supports the use of computer workstation measurements in lumbar degenerative spondylolisthesis. The parameters used in this study were comparable, whether measured on film or at clinical workstations.
[Utility and validity of indicators from the Nursing Outcomes Classification as a support tool for diagnosing Ineffective Self Health Management in patients with chronic conditions in primary health care].

PubMed

Morilla-Herrera, J C; Morales-Asencio, J M; Fernández-Gallego, M C; Cobos, E Berrobianco; Romero, A Delgado

2011-01-01

Self-care and management of therapeutic regime (drug adherence, preventive behaviours and development of healthy life-styles) are key components for managing chronic diseases. Nursing has standardized languages which describe many of these situations, such as the diagnosis "Ineffective Self Health Management" (ISHM) or many of the Nursing Outcomes Classification (NOC) indicators. The aims of this study were to determine the interobserver reliability of a NOC-based instrument for assessment and aid in diagnosis of the ISHM in patients with chronic conditions in Primary Health Care, to determine its diagnostic validity and to describe the prevalence of patients with this problem. Cross-sectional validation study developed in the provinces of Málaga, Cádiz and Almería from 2006 to 2009. Each patient was assessed by 3 independent observers: the first two observers evaluated scoring of the NOC indicators and the third one acted as the "gold-standard". Two hundred and twenty-eight patients were included, 37.7% of them with more than one chronic condition. NOC indicators showed high interobserver reliability (ICC > 0.70) and internal consistency (Cronbach's alpha: 0.81). With a cut-point of 10.5, sensitivity was 61% and specificity 85%, and the area under the curve was 0.81 (95% CI: 0.77 to 0.85). The prevalence of patients with ISHM was 36% (95% CI: 34 to 40). The use of NOC indicators allows evaluation of management of the therapeutic regime in people with chronic conditions with a satisfactory validity and it provides new approaches for dealing with this problem.
New Endoscopic Indicator of Esophageal Achalasia: “Pinstripe Pattern”

PubMed Central

Minami, Hitomi; Isomoto, Hajime; Miuma, Satoshi; Kobayashi, Yasutoshi; Yamaguchi, Naoyuki; Urabe, Shigetoshi; Matsushima, Kayoko; Akazawa, Yuko; Ohnita, Ken; Takeshima, Fuminao; Inoue, Haruhiro; Nakao, Kazuhiko

2015-01-01

Background and Study Aims: Endoscopic diagnosis of esophageal achalasia lacking typical endoscopic features can be extremely difficult. The aim of this study was to identify a simple and reliable early indicator of esophageal achalasia. Patients and Methods: This single-center retrospective study included 56 cases of esophageal achalasia without previous treatment. As a control, 60 non-achalasia subjects including reflux esophagitis and superficial esophageal cancer were also included in this study. Endoscopic findings were evaluated according to Descriptive Rules for Achalasia of the Esophagus as follows: (1) esophageal dilatation, (2) abnormal retention of liquid and/or food, (3) whitish change of the mucosal surface, (4) functional stenosis of the esophago-gastric junction, and (5) abnormal contraction. Additionally, the presence of the longitudinal superficial wrinkles of esophageal mucosa, “pinstripe pattern (PSP)”, was evaluated endoscopically. Then, inter-observer diagnostic agreement was assessed for each finding. Results: The prevalence rates of the above-mentioned findings (1–5) were 41.1%, 41.1%, 16.1%, 94.6%, and 43.9%, respectively. PSP was observed in 60.7% of achalasia cases, while none of the controls showed positivity for PSP. PSP was observed in 26 (62.5%) of 35 cases with shorter history < 10 years, which usually lacks typical findings such as severe esophageal dilation and tortuosity. Inter-observer agreement level was substantial for food/liquid remnant (k = 0.6861) and PSP (k = 0.6098), and was fair for abnormal contraction and white change. The accuracy, sensitivity, and specificity for achalasia were 83.8%, 64.7%, and 100%, respectively. Conclusion: “Pinstripe pattern” could be a reliable indicator for early discrimination of primary esophageal achalasia. PMID:25664812
Translation, cultural adaptation and validation into portuguese (Brazil) in Systemic Sclerosis Questionnaire (SySQ).

PubMed

Machado, Roberta Ismael Lacerda; Souto, Lais Medeiros; Freire, Eutilia Andrade Medeiros

2014-01-01

Systemic sclerosis (SSc) is a multisystem autoimmune disorder characterized by fibroblastic dysfunction, with significant impact on quality of life (QoL), measured by instruments or questionnaires that usually were formulated in other languages and in different cultural contexts. To translate the Systemic Sclerosis Questionnaire (SySQ) into Brazilian Portuguese, adapt it cross-culturally, and assess its reliability and validity. Translation and adaptation: translation into Portuguese and cross-cultural adaptation were performed in accordance with studies on questionnaire translation methodology into other languages. Reliability: it was analyzed using three interviews with different interviewers, two on the same day (interobserver) and the third within 14 days of the first assessment (intraobserver). Validity was assessed by correlating clinical and quality of life parameters with the domain scores of the SySQ. A descriptive analysis of the study sample was performed. Reproducibility was assessed using an intraclass correlation coefficient (ICC). Internal consistency was assessed using Cronbach's alpha coefficient. To assess validity we used Spearman correlation coefficient. Five percent was the level of significance adopted for all statistical tests. In the evaluation of the questionnaires, the results were similar to the original questionnaire, the internal consistency ranging between 0.73 and 0.93 for each item. The interobserver reproducibility was very good for all domains (α = 0.786 to 0.983) and intraobserver agreement was considered very good for general symptoms domain (ICC = 0.916), good for musculoskeletal symptoms domain (ICC = 0.897) and cardiopulmonary domain (ICC = 0.842) and reasonable for gastrointestinal symptoms domain (ICC = 0.686). The Brazilian Portuguese version of SySQ proved to be reproducible and valid for our population, using a recognized methodology for translation and cultural adaptation of questionnaires, as well as to assess the reproducibility and validity.
Exploring cartilage damage in gout using 3-T MRI: distribution and associations with joint inflammation and tophus deposition.

PubMed

Popovich, I; Dalbeth, N; Doyle, A; Reeves, Q; McQueen, F M

2014-07-01

Few imaging studies have investigated cartilage in gout. Magnetic resonance imaging (MRI) can image cartilage damage and also reveals other features of gouty arthropathy. The objective was to develop and validate a system for quantifying cartilage damage in gout. 3-T MRI scans of the wrist were obtained in 40 gout patients. MRI cartilage damage was quantified using an adaptation of the radiographic Sharp van der Heijde score. Two readers scored cartilage loss at 7 wrist joints: 0 (normal), 1 (partial narrowing), 2 (complete narrowing) and concomitant osteoarthritis was recorded. Bone erosion, bone oedema and synovitis were scored (RAMRIS) and tophi were assessed. Correlations between radiographic and MRI cartilage scores were investigated, as was the reliability of the MRI cartilage score and its associations. The GOut MRI Cartilage Score (GOMRICS) was highly correlated with the total Sharp van der Heijde (SvdH) score and the joint space narrowing component (R = 0.8 and 0.71 respectively, p < 0.001). Reliability was high (intraobserver, interobserver ICCs = 0.87 [0.57-0.97], 0.64 [0.41-0.79] respectively), and improved on unenhanced scans; interobserver ICC = 0.82 [0.49-0.95]. Cartilage damage was predominantly focal (82% of lesions) and identified in 40 out of 280 (14%) of joints. Cartilage scores correlated with bone erosion (R = 0.57), tophus size (R = 0.52), and synovitis (R = 0.55), but not bone oedema scores. Magnetic resonance imaging can be used to investigate cartilage in gout. Cartilage damage was relatively uncommon, focal, and associated with bone erosions, tophi and synovitis, but not bone oedema. This emphasises the unique pathophysiology of gout.
Simplified Radiographic Damage Index for Affected Joints in Chronic Gouty Arthritis

PubMed Central

2016-01-01

The aim of this study was to develop and validate a new radiographic damage scoring method (DAmagE index of GoUt; DAEGU) in chronic gout using plain radiography. Two independent observers scored foot x-rays from 15 patients with chronic gout according to the DAEGU method and the modified Sharp/van der Heijde (SvdH) method. The 10 metatarsophalangeal (MTP) and 2 interphalangeal (IP) joints of the first toes of both feet were scored to assess the degrees of erosion and joint space narrowing (JSN). The intraobserver and interobserver reliabilities were analyzed by calculating the intraclass correlation coefficient (ICC) and minimal detectable change (MDC). The correlation between the DAEGU and SvdH methods was analyzed by calculating the Spearman's rho correlation coefficients and Kappa coefficients. The DAEGU method was found to be highly reproducible (0.945–0.987 for the intraobserver and 0.993–0.996 for the interobserver ICC values). The erosion, JSN, and total scores exhibited strong positive correlations between the DAEGU and SvdH methods and also within each method (r = 0.860–0.969, P < 0.001 for all parameters). The DAEGU and SvdH methods were in very good agreement as determined by Kappa coefficient analysis [0.732 (0.387–1.000) for erosion and 1.000 (1.000–1.000) for JSN]. In conclusion, this study revealed that the DAEGU method was a reliable and feasible tool in the assessment of radiographic damage in chronic gout. The DAEGU method may provide an easier assessment of structural damage in chronic gout in real clinical practice. PMID:26955246
Patient-Oriented Cancer Information on the Internet: A Comparison of Wikipedia and a Professionally Maintained Database

PubMed Central

Rajagopalan, Malolan S.; Khanna, Vineet K.; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N.; Dicker, Adam P.; Lawrence, Yaacov R.

2011-01-01

Purpose: A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. Methods: For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Results: Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Conclusion: Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention. PMID:22211130
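The abstract above reports Flesch-Kincaid grade levels (9.6 versus 14.1) without giving the formula. A minimal sketch of the standard grade-level formula follows; it needs a syllable count in addition to word and sentence counts. The function name and the counts in the example are hypothetical and are not taken from the study.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw text counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(f"Flesch-Kincaid grade level: {grade:.2f}")
```

Longer sentences and more syllables per word both raise the grade level, which is consistent with the less readable resource receiving the higher score.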
Advanced Rotator Cuff Tear Score (ARoCuS): a multi-scaled tool for the classification and description of rotator cuff tears.

PubMed

Walter, S G; Stadler, T; Thomas, T S; Thomas, W

2018-03-02

To introduce a (semi-)quantitative surgical score for the classification of rotator cuff tears. A total of 146 consecutive patients underwent rotator cuff repair and were assessed using the previously defined Advanced Rotator Cuff Tear Score (ARoCuS) criteria: muscle tendon, size, tissue quality, pattern as well as mobilization of the tear. The data set was split into a training (125 patients) and a testing set (21 patients). The training data set fitted a nonlinear predictive model of the tear score based on the ARoCuS criteria, while the testing data served as control. Based on the scoring results, rotator cuff tears were assigned to one of four categories (ΔV I-IV) and received a stage-adapted treatment. For statistical analysis, mean values ± standard deviation, interclass correlation coefficients (ICC) and kappa values were calculated. Overall, 32 patients were classified as ΔV I, 68 as ΔV II and 37 as ΔV III. Nine patients showed ΔV IV tears. Patients of all ΔV groups significantly improved their Constant scores (p < 0.001) and profited from significant pain reduction after surgery (p < 0.001). To date, ten patients have undergone revision surgery with five of them primarily classified as ΔV IV. Kappa values for the interobserver reliability ranged between 0.69 and 0.95. ICC scores for the ΔV category were 0.95 for interobserver reliability. The ARoCuS facilitates intra-operative decision-making and enables surgeons and researchers to document rotator cuff tears in a standardized and reproducible manner.
Quantitative analysis of tympanic membrane perforation: a simple and reliable method.

PubMed

Ibekwe, T S; Adeosun, A A; Nwaorgu, O G

2009-01-01

Accurate assessment of the features of tympanic membrane perforation, especially size, site, duration and aetiology, is important, as it enables optimum management. To describe a simple, cheap and effective method of quantitatively analysing tympanic membrane perforations. The system described comprises a video-otoscope (capable of generating still and video images of the tympanic membrane), adapted via a universal serial bus box to a computer screen, with images analysed using the Image J geometrical analysis software package. The reproducibility of results and their correlation with conventional otoscopic methods of estimation were tested statistically with the paired t-test and correlational tests, using the Statistical Package for the Social Sciences version 11 software. The following equation was generated: P/T × 100 per cent = percentage perforation, where P is the area (in pixels²) of the tympanic membrane perforation and T is the total area (in pixels²) for the entire tympanic membrane (including the perforation). Illustrations are shown. Comparison of blinded data on tympanic membrane perforation area obtained independently from assessments by two trained otologists, of comparative years of experience, using the video-otoscopy system described, showed similar findings, with strong correlations devoid of inter-observer error (p = 0.000, r = 1). Comparison with conventional otoscopic assessment also indicated significant correlation, comparing results for two trained otologists, but some inter-observer variation was present (p = 0.000, r = 0.896). Correlation between the two methods for each of the otologists was also highly significant (p = 0.000). A computer-adapted video-otoscope, with images analysed by Image J software, represents a cheap, reliable, technology-driven, clinical method of quantitative analysis of tympanic membrane perforations and injuries.
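The equation in the abstract above (perforation area P divided by total membrane area T, times 100) can be sketched in a few lines. The function name and the pixel areas below are hypothetical; in the study the areas came from Image J measurements.

```python
def percentage_perforation(perforation_area_px: float, membrane_area_px: float) -> float:
    """Return P / T x 100, with both areas given in square pixels.

    T is the area of the whole tympanic membrane, including the perforation.
    """
    if membrane_area_px <= 0:
        raise ValueError("total membrane area must be positive")
    if not 0 <= perforation_area_px <= membrane_area_px:
        raise ValueError("perforation area must lie between 0 and the total area")
    return perforation_area_px / membrane_area_px * 100.0


# Hypothetical pixel areas, not taken from the study.
result = percentage_perforation(12_500, 50_000)
print(f"{result:.1f}% of the tympanic membrane is perforated")
```

Because the result is a ratio of two areas from the same image, it does not depend on the image scale, provided both regions are traced on the same frame.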
Investigation of the reproducibility and reliability of sagittal vertebral inclination measurements from MR images of the spine.

PubMed

Vrtovec, Tomaž; Pernuš, Franjo; Likar, Boštjan

2014-10-01

In this study, sagittal vertebral inclination (SVI) was systematically evaluated for 28 vertebrae (segments between T4 and L5) in magnetic resonance (MR) images of one normal and one scoliotic subject to compare the performance of manual and computerized measurements, and identify the most reproducible and reliable measurements. Manual measurements were performed by three observers, who identified on two occasions the distinctive anatomical landmarks required to evaluate SVI by six measurement methods, i.e. the superior tangents, inferior tangents, anterior tangents, posterior tangents, mid-endplate lines and mid-wall lines. Computerized measurements were performed by automatically evaluating SVI from the symmetry of vertebral anatomical structures in two-dimensional (2D) sagittal cross-sections and in three-dimensional (3D) volumetric images. The mid-wall lines and posterior tangents proved to be the manual measurements with the lowest intra-observer (standard deviation, SD, of 1.4° and 1.7°, respectively) and inter-observer variability (SD of 1.9° and 2.4°, respectively). The strongest inter-method agreement was found between the mid-wall lines and posterior tangents (SD of 2.0°). Computerized measurements in 2D and in 3D resulted in intra-observer (SD of 2.8° and 3.1°, respectively) and inter-observer variability (SD of 3.8° and 5.2°, respectively) that were comparable to those of the superior tangents (SD of 2.6° and 3.7°) and inferior tangents (SD of 3.2° and 4.5°), which represent standard Cobb angle measurements. It can be concluded that computerized measurements of SVI should be based on the inclination of vertebral body walls. Copyright © 2014 Elsevier Ltd. All rights reserved.
Interobserver reliability of a "Standardized Psychiatric Examination" (SPE) for case ascertainment (DSM-III).

PubMed

Romanoski, A J; Nestadt, G; Chahal, R; Merchant, A; Folstein, M F; Gruenberg, E M; McHugh, P R

1988-02-01

The authors describe the Standardized Psychiatric Examination (SPE), a new method for conducting psychiatric examinations in both clinical and research settings that preserves the clinical method. The SPE provides a consistent replicable format for eliciting and recording psychiatric history, signs, and symptoms without perturbing the patient-clinician interaction. By means of the SPE, the clinician can formulate diagnoses using DSM-III or ICD-9 criteria and yet generate CATEGO profiles derived from the Present State Examination, 9th edition. Psychiatrists using the SPE demonstrated high interrater reliability in ascertaining individual psychopathological symptoms (Kappa range, 0.55 to 1.0) and in making DSM-III diagnoses (Kappa range, 0.79 to 1.0) among a sample of study subjects (N = 43) drawn from both a psychiatric inpatient population and a large community sample of nonpatients from the Epidemiological Catchment Area (ECA) study. The implications of the SPE for clinical practice and for research are discussed.
Maternal sensitivity and attachment security in Thailand: cross-cultural validation of Western measures.

PubMed

Chaimongkol, Nujjaree N; Flick, Louise H

2006-01-01

The purpose of this study was to examine the psychometric properties of Thai versions of the Maternal Behavior Q-Sort (MBQS), Caldwell's HOME, and the Attachment Q-set (AQS). A sample of 110 Thai mother-infant dyads were studied. The Content Validity Index (CVIs) of the Thai MBQS, HOME and AQS were between 91% and 99%. Internal consistency of the HOME was .71. Interobserver reliability of the MBQS, HOME, and AQS were .95, .87, and .87, respectively. Convergent validity was supported by finding a positive correlation between the MBQS and the HOME (r = .29, p < .001). A positive correlation of .45 (p < .001) between the scores of the MBQS and the AQS indicated concurrent validity of these scales. Study findings indicate the Thai MBQS, HOME, and AQS are reliable and valid in this Thai sample and suggest that the Thai versions reflect concepts similar to those in the original English versions.

Interobserver concordance of assessments of dysplasia and blast counts for the diagnosis of patients with cytopenia: From the Japanese central review study.

PubMed

Matsuda, Akira; Kawabata, Hiroshi; Tohyama, Kaoru; Maeda, Tomoya; Araseki, Kayano; Hata, Tomoko; Suzuki, Takahiro; Kayano, Hidekazu; Shimbo, Kei; Usuki, Kensuke; Chiba, Shigeru; Ishikawa, Takayuki; Arima, Nobuyoshi; Nohgawa, Masaharu; Ohta, Akiko; Miyazaki, Yasushi; Nakao, Sinnji; Ozawa, Keiya; Arai, Shunya; Kurokawa, Mineo; Mitani, Kinuko; Takaori-Kondo, Akifumi

2018-06-07

The diagnosis of myelodysplastic syndromes (MDS) is based on morphology and cytogenetics. However, limited information is currently available on the interobserver concordance of the assessment of dysplastic lineages (<10% or ≥10% in bone marrow (BM)). The revised International Prognostic Scoring System (IPSS-R) described a new threshold (2%) for BM blasts. However, the interobserver concordance of the categories (0-≤2% and >2-<5%) has limited data. The purpose of the present study was to investigate the assessment of dysplastic lineages and IPSS-R reproducibility. Our study was divided into two Steps. In each Step, the microscopic examinations were performed separately by two morphologists. Regarding the category of BM blasts ≤2% and >2-<5%, interobserver agreement was more than 'moderate' in all pairs (kappa test: 0.43-0.90). Regarding dysgranulopoiesis (dysG) and dyserythropoiesis (dysE) in BM, interobserver agreement was more than 'moderate' in all pairs (kappa test, dysG: 0.45-0.96, dysE: 0.45-0.81). Regarding the category of dysmegakaryopoiesis (dysMgk) in BM, interobserver agreement was more than moderate in 4 out of 5 pairs (kappa test: 0.58-1.00), and was fair for one pair (kappa test: 0.37). We consider that high interobserver concordance may be possible for the BM blast cell count (≤2% or >2-<5%) and dysplasia (<10% or ≥10%) of each lineage. Copyright © 2018 Elsevier Ltd. All rights reserved.
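The agreement labels in the abstract above ('fair', 'moderate') match the widely used Landis and Koch bands for kappa, although the abstract does not name the scale. Assuming that scale, the sketch below computes an unweighted Cohen's kappa for two observers and maps it to a label. The function names and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in set(rater_a) | set(rater_b)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa: float) -> str:
    """Map kappa to the Landis and Koch descriptive bands (assumed scale)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical calls by two morphologists: 1 = dysplasia >= 10%, 0 = < 10%.
morphologist_1 = [1, 1, 1, 1, 0, 0, 0, 0]
morphologist_2 = [1, 1, 1, 0, 0, 0, 0, 0]
k = cohen_kappa(morphologist_1, morphologist_2)
print(f"kappa = {k:.2f} ({landis_koch_label(k)})")
```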
What is the optimal cutoff value of the axis-line-angle technique for evaluating trunk imbalance in coronal plane?

PubMed

Zhang, Rui-Fang; Fu, Yu-Chuan; Lu, Yi; Zhang, Xiao-Xia; Hu, Yu-Min; Zhou, Yong-Jin; Tian, Nai-Feng; He, Jia-Wei; Yan, Zhi-Han

2017-02-01

Accurately evaluating the extent of trunk imbalance in the coronal plane is significant for patients before and after treatment. We preliminarily practiced a new method, axis-line-angle technique (ALAT), for evaluating coronal trunk imbalance with excellent intra-observer and interobserver reliability. Radiologists and surgeons were encouraged to use this method in clinical practice. However, the optimal cutoff value of the ALAT for determination of the extent of coronal trunk imbalance has not been calculated up to now. The purpose of this study was to identify the cutoff value of the ALAT that best predicts a positive measurement point to assess coronal balance or imbalance. A retrospective study at a university affiliated hospital was carried out. A total of 130 patients with C7-central sacral vertical line (CSVL) >0 mm and aged 10-18 years were recruited in this study from September 2013 to December 2014. Data were analyzed to determine the optimal cutoff value of the ALAT measurement. The C7-CSVL and ALAT measurements were conducted respectively twice on plain film within a 2-week interval by two radiologists. The optimal cutoff value of the ALAT was analyzed via receiver operating characteristic (ROC) curve. Comparison variables were performed with chi-square test between the C7-CSVL and ALAT measurements for evaluating trunk imbalance. Kappa agreement coefficient method was used to test the intra-observer and interobserver agreement of C7-CSVL and ALAT. The ROC curve area for the ALAT was 0.82 (95% confidence interval: 0.753-0.894, p<.001). The maximum Youden index was 0.51, and the corresponding cutoff point was 2.59°. No statistical difference was found between the C7-CSVL and ALAT measurements for evaluating trunk imbalance (p>.05). 
Intra-observer agreement values for the C7-CSVL measurements by observers 1 and 2 were 0.79 and 0.91 (p<.001), respectively, whereas intra-observer agreement values for the ALAT measurements were both 0.89 by observers 1 and 2 (p<.001). The interobserver agreement values for the first and second measurements with the C7-CSVL were 0.78 and 0.85 (p<.001), respectively, whereas the interobserver agreement values for the first and second measurements with the ALAT were 0.91 and 0.88 (p<.001), respectively. The newly developed ALAT provided an acceptable optimal cutoff value for evaluating trunk imbalance in the coronal plane with a high level of intra-observer and interobserver agreement, which suggests that the ALAT is suitable for clinical use. Copyright © 2016 Elsevier Inc. All rights reserved.
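The abstract above selects the ALAT cutoff at the maximum Youden index of the ROC curve. A minimal sketch of that selection follows, where the Youden index is sensitivity + specificity - 1 at each candidate threshold. The function name, the rule that a value at or above the cutoff counts as positive, and the example angles are assumptions for illustration, not the study's data.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximising the Youden index J = sens + spec - 1.

    A case is called positive when its value is >= cutoff.
    `labels` holds 1 for reference-positive cases and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        true_neg = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical angles in degrees with a hypothetical reference classification.
angles = [0.5, 1.0, 1.5, 3.2, 2.8, 3.0, 3.5, 4.0]
imbalanced = [0, 0, 0, 0, 1, 1, 1, 1]
cutoff, j = youden_cutoff(angles, imbalanced)
print(f"optimal cutoff = {cutoff} degrees, Youden index = {j:.2f}")
```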
Validity and reliability of the Paprosky acetabular defect classification.

PubMed

Yu, Raymond; Hofstaetter, Jochen G; Sullivan, Thomas; Costi, Kerry; Howie, Donald W; Solomon, Lucian B

2013-07-01

The Paprosky acetabular defect classification is widely used but has not been appropriately validated. Reliability of the Paprosky system has not been evaluated in combination with standardized techniques of measurement and scoring. This study evaluated the reliability, teachability, and validity of the Paprosky acetabular defect classification. Preoperative radiographs from a random sample of 83 patients undergoing 85 acetabular revisions were classified by four observers, and their classifications were compared with quantitative intraoperative measurements. Teachability of the classification scheme was tested by dividing the four observers into two groups. The observers in Group 1 underwent three teaching sessions; those in Group 2 underwent one session and the influence of teaching on the accuracy of their classifications was ascertained. Radiographic evaluation showed statistically significant relationships with intraoperative measurements of anterior, medial, and superior acetabular defect sizes. Interobserver reliability improved substantially after teaching and did not improve without it. The weighted kappa coefficient went from 0.56 at Occasion 1 to 0.79 after three teaching sessions in Group 1 observers, and from 0.49 to 0.65 after one teaching session in Group 2 observers. The Paprosky system is valid and shows good reliability when combined with standardized definitions of radiographic landmarks and a structured analysis. Level II, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.
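The abstract above reports weighted kappa coefficients for an ordered classification but does not say which weighting scheme was used. The sketch below assumes linear disagreement weights, so a two-grade disagreement counts twice as much as a one-grade disagreement. The function name and the example grades are hypothetical.

```python
def linear_weighted_kappa(rater_a, rater_b, n_categories):
    """Weighted kappa with linear weights for ordinal grades 0..n_categories-1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    k, n = n_categories, len(rater_a)

    # Observed joint proportions and each rater's marginal proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at maximum distance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1 - weighted_observed / weighted_expected


# Hypothetical grades (0, 1, 2) assigned by two observers to six radiographs.
observer_1 = [0, 1, 2, 0, 1, 2]
observer_2 = [0, 1, 2, 1, 2, 2]
print(f"weighted kappa = {linear_weighted_kappa(observer_1, observer_2, 3):.3f}")
```

Quadratic weights, `(i - j) ** 2 / (k - 1) ** 2`, are the other common choice and penalise large disagreements more heavily.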
A Study on the Reliability of Sasang Constitutional Body Trunk Measurement

PubMed Central

Jang, Eunsu; Kim, Jong Yeol; Lee, Haejung; Kim, Honggie; Baek, Younghwa; Lee, Siwoo

2012-01-01

Objective. Body trunk measurement plays an important diagnostic role not only in conventional medicine but also in Sasang constitutional medicine (SCM). The Sasang constitutional body trunk measurement (SCBTM) consists of the 5 widths and the 8 circumferences, which are the standard locations currently employed in the SCM society. This study examines to what extent comprehensive training can improve the reliability of the SCBTM. Methods. We recruited 10 male subjects and 5 male observers with no experience of anthropometric measurement. We conducted measurements twice before and after comprehensive training. The relative technical error of measurement (%TEM) was calculated to assess intra- and inter-observer reliability. Results. Post-training intra-observer %TEMs of the SCBTM ranged from 0.27% to 1.85%, reduced from 0.27% to 6.26% before training. Post-training inter-observer %TEMs ranged from 0.56% to 1.66%, reduced from 1.00% to 9.60% before training. Post-training total %TEMs, which represent overall reliability, ranged from 0.68% to 2.18%, reduced from a maximum value of 10.18%. Conclusion. Comprehensive training makes the SCBTM more reliable, yielding a sufficiently dependable diagnostic tool. Comprehensive training is strongly recommended before taking the SCBTM. PMID:21822442
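The abstract above reports relative technical error of measurement (%TEM) without the formula. A minimal sketch of the usual two-measurement form follows: TEM is the square root of the summed squared differences divided by twice the number of subjects, and %TEM expresses it as a percentage of the mean. The study had five observers, which needs the multi-observer form, so this two-series version and the example circumferences are assumptions for illustration.

```python
import math


def relative_tem(first, second):
    """Relative technical error of measurement (%TEM) for two measurement series."""
    if len(first) != len(second) or not first:
        raise ValueError("both series must cover the same non-empty set of subjects")
    n = len(first)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    tem = math.sqrt(squared_diffs / (2 * n))
    grand_mean = (sum(first) + sum(second)) / (2 * n)
    return tem / grand_mean * 100


# Hypothetical chest circumferences in cm, measured twice on four subjects.
session_1 = [80.0, 90.0, 100.0, 110.0]
session_2 = [82.0, 88.0, 102.0, 108.0]
print(f"%TEM = {relative_tem(session_1, session_2):.2f}%")
```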
Reliability of tristimulus colourimetry in the assessment of cutaneous bruise colour.

PubMed

Scafide, Katherine N; Sheridan, Daniel J; Taylor, Laura A; Hayat, Matthew J

2016-06-01

Bruising is one of the most common types of injury clinicians observe among victims of violence and other trauma patients. However, research has shown commonly used qualitative description of cutaneous bruise colour via the naked eye is subjective and unreliable. No published work has formally evaluated the reliability of tristimulus colourimetry as an alternative for assessing bruise colour, despite its clinical and research applications in accurately assessing skin colour. The purpose of this study was to systematically evaluate the test-retest and inter-observer reliability of tristimulus colourimetry in the assessment of cutaneous bruise colour. Two researchers obtained repeated tristimulus colourimetry measures of cutaneous bruises with participants of diverse skin colour. Measures were obtained using the Minolta CR-400 Chromameter. Commission Internationale d'Eclairage (CIE) L*a*b* colour space was used. Data was analysed using intraclass correlation coefficients (ICC), Cronbach's alpha, and minimal detectable change (MDC) on all three L*a*b* values. The colorimeter demonstrated excellent test-retest or intra-rater reliability (L* ICC=0.999; a* ICC=0.973; b* ICC=0.892) and inter-rater reliability (L* ICC=0.997; a* ICC=0.976; b* ICC=0.982). With consistent placement, the tristimulus colourimetry is reliable for the objective assessment and documentation of cutaneous bruise colour for purposes of clinical practice and research. Recommendations for use in practice/research are provided. Copyright © 2016 Elsevier Ltd. All rights reserved.
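The abstract above lists minimal detectable change (MDC) among its analyses without giving values or the formula. A commonly used form derives it from the ICC: the standard error of measurement is SD × sqrt(1 - ICC), and MDC at the 95% level is 1.96 × sqrt(2) × SEM. The sketch below assumes that form; the function name, the standard deviation, and the ICC in the example are hypothetical.

```python
import math


def minimal_detectable_change(sd: float, icc: float, z: float = 1.96) -> float:
    """MDC from the between-subject SD and a reliability coefficient (ICC)."""
    if sd < 0 or not 0 <= icc <= 1:
        raise ValueError("sd must be non-negative and icc must lie in [0, 1]")
    sem = sd * math.sqrt(1 - icc)    # standard error of measurement
    return z * math.sqrt(2) * sem    # sqrt(2) accounts for two measurements


# Hypothetical values: SD of 2.0 colour units and an ICC of 0.96.
print(f"MDC95 = {minimal_detectable_change(2.0, 0.96):.2f} units")
```

A change between two readings smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.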
Reliability and agreement on embryo assessment: 5 years of an external quality control programme.

PubMed

Martínez-Granados, Luis; Serrano, María; González-Utor, Antonio; Ortiz, Nereyda; Badajoz, Vicente; López-Regalado, María Luisa; Boada, Montserrat; Castilla, Jose A

2018-03-01

An external quality-control programme for morphology-based embryo quality assessment, incorporating a standardized embryo grading scheme, was evaluated over a period of 5 years to determine levels of inter-observer reliability and agreement between practising clinical embryologists at IVF centres and the opinions of a panel of experts. Following Guidelines for Reporting Reliability and Agreement Studies, the Gwet index and proportion of positive (Ppos) and negative agreement were calculated. For embryo morphology assessment, a substantial degree of reliability was measured between the centres and the panel of experts (Gwet index: 0.76; 95% CI 0.70 to 0.84). The agreement was higher for good- versus poor-quality embryos. When multinucleation or vacuoles were observed, low levels of reliability were obtained (Ppos: 0.56 and 0.43, respectively). In blastocysts, the characteristic that presented the largest discrepancy was that related to the inner cell mass. In decisions about the final disposition of the embryo, reliability between centre and the panel of experts was moderate (Gwet index: 0.51; 95% CI 0.41 to 0.60). In conclusion, the ability of clinical embryologists to evaluate the presence of multinucleation and vacuoles in the early cleavage embryo, and to determine the category of the inner cell mass in blastocysts, needs to be improved. Copyright © 2017 Reproductive Healthcare Ltd. All rights reserved.
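The abstract above reports proportions of positive agreement (Ppos) for features such as multinucleation and vacuoles. For a 2×2 table of two assessors, the usual definitions are Ppos = 2a / (2a + b + c) and Pneg = 2d / (2d + b + c), where a and d are the concordant positive and negative counts and b and c are the discordant counts. The sketch below assumes those definitions; the function name and the counts are hypothetical.

```python
def specific_agreement(both_pos: int, only_first: int, only_second: int, both_neg: int):
    """Return (Ppos, Pneg) for a 2x2 agreement table of two assessors."""
    discordant = only_first + only_second
    pos_denominator = 2 * both_pos + discordant
    neg_denominator = 2 * both_neg + discordant
    if pos_denominator == 0 or neg_denominator == 0:
        raise ValueError("agreement is undefined when a category is never used")
    return 2 * both_pos / pos_denominator, 2 * both_neg / neg_denominator


# Hypothetical counts: centre vs expert panel on whether vacuoles are present.
p_pos, p_neg = specific_agreement(both_pos=20, only_first=5, only_second=5, both_neg=70)
print(f"Ppos = {p_pos:.2f}, Pneg = {p_neg:.2f}")
```

Reporting the two proportions separately shows when agreement on a rare finding is poor even though overall agreement looks high, which is the pattern described for multinucleation and vacuoles.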
Identifying and classifying hyperostosis frontalis interna via computerized tomography.

PubMed

May, Hila; Peled, Nathan; Dar, Gali; Hay, Ori; Abbas, Janan; Masharawi, Youssef; Hershkovitz, Israel

2010-12-01

The aim of this study was to recognize the radiological characteristics of hyperostosis frontalis interna (HFI) and to establish a valid and reliable method for its identification and classification. A reliability test was carried out on 27 individuals who had undergone a head computerized tomography (CT) scan. Intra-observer reliability was obtained by examining the images three times, by the same researcher, with a 2-week interval between each sample ranking. The inter-observer test was performed by three independent researchers. A validity test was carried out using two methods for identifying and classifying HFI: 46 cadaver skullcaps were ranked twice via computerized tomography scans and then by direct observation. Reliability and validity were calculated using Kappa test (SPSS 15.0). Reliability tests of ranking HFI via CT scans demonstrated good results (K > 0.7). As for validity, a very good consensus was obtained between the CT and direct observation, when moderate and advanced types of HFI were present (K = 0.82). The suggested classification method for HFI, using CT, demonstrated a sensitivity of 84%, specificity of 90.5%, and positive predictive value of 91.3%. In conclusion, volume rendering is a reliable and valid tool for identifying HFI. The suggested three-scale classification is most suitable for radiological diagnosis of the phenomenon. Considering the increasing awareness of HFI as an early indicator of a developing malady, this study may assist radiologists in identifying and classifying the phenomenon.
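The abstract above reports sensitivity, specificity and positive predictive value but not the underlying 2×2 table. The sketch below shows how the three figures follow from such a table. The function name is hypothetical, and the counts are hypothetical values chosen only because they sum to 46 and give percentages close to those reported; they are not the study's data.

```python
def diagnostic_metrics(true_pos: int, false_neg: int, true_neg: int, false_pos: int):
    """Return (sensitivity, specificity, positive predictive value)."""
    if min(true_pos + false_neg, true_neg + false_pos, true_pos + false_pos) == 0:
        raise ValueError("each denominator must be non-zero")
    sensitivity = true_pos / (true_pos + false_neg)   # detected among true cases
    specificity = true_neg / (true_neg + false_pos)   # cleared among non-cases
    ppv = true_pos / (true_pos + false_pos)           # true cases among positive calls
    return sensitivity, specificity, ppv


# Hypothetical counts: CT classification against direct observation.
sens, spec, ppv = diagnostic_metrics(true_pos=21, false_neg=4, true_neg=19, false_pos=2)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}, PPV = {ppv:.1%}")
```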
A comparative study of software programmes for cross-sectional skeletal muscle and adipose tissue measurements on abdominal computed tomography scans of rectal cancer patients.

PubMed

van Vugt, Jeroen L A; Levolger, Stef; Gharbharan, Arvind; Koek, Marcel; Niessen, Wiro J; Burger, Jacobus W A; Willemsen, Sten P; de Bruin, Ron W F; IJzermans, Jan N M

2017-04-01

The association between body composition (e.g. sarcopenia or visceral obesity) and treatment outcomes, such as survival, using single-slice computed tomography (CT)-based measurements has recently been studied in various patient groups. These studies have been conducted with different software programmes, each with their specific characteristics, of which the inter-observer, intra-observer, and inter-software correlation are unknown. Therefore, a comparative study was performed. Fifty abdominal CT scans were randomly selected from 50 different patients and independently assessed by two observers. Cross-sectional muscle area (CSMA, i.e. rectus abdominis, oblique and transverse abdominal muscles, paraspinal muscles, and the psoas muscle), visceral adipose tissue area (VAT), and subcutaneous adipose tissue area (SAT) were segmented by using standard Hounsfield unit ranges and computed for regions of interest. The inter-software, intra-observer, and inter-observer agreement for CSMA, VAT, and SAT measurements using FatSeg, OsiriX, ImageJ, and sliceOmatic were calculated using intra-class correlation coefficients (ICCs) and Bland-Altman analyses. Cohen's κ was calculated for the agreement of sarcopenia and visceral obesity assessment. The Jaccard similarity coefficient was used to compare the similarity and diversity of measurements. Bland-Altman analyses and ICC indicated that the CSMA, VAT, and SAT measurements between the different software programmes were highly comparable (ICC 0.979-1.000, P < 0.001). All programmes adequately distinguished between the presence or absence of sarcopenia (κ = 0.88-0.96 for one observer and all κ = 1.00 for all comparisons of the other observer) and visceral obesity (all κ = 1.00). Furthermore, excellent intra-observer (ICC 0.999-1.000, P < 0.001) and inter-observer (ICC 0.998-0.999, P < 0.001) agreement for all software programmes were found. 
Accordingly, excellent Jaccard similarity coefficients were found for all comparisons (mean ≥ 0.964). FatSeg, OsiriX, ImageJ, and sliceOmatic showed an excellent agreement for CSMA, VAT, and SAT measurements on abdominal CT scans. Furthermore, excellent inter-observer and intra-observer agreement were achieved. Therefore, results of studies using these different software programmes can reliably be compared. © 2016 The Authors. Journal of Cachexia, Sarcopenia and Muscle published by John Wiley & Sons Ltd on behalf of the Society on Sarcopenia, Cachexia and Wasting Disorders.
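The Jaccard similarity coefficient mentioned above compares two segmentations as the size of their overlap divided by the size of their union. A minimal sketch, assuming each segmentation is available as a flat binary mask over the same pixels (the masks below are made up):

```python
def jaccard(mask_a, mask_b):
    """Jaccard coefficient of two equally sized binary masks (flat sequences)."""
    a = {i for i, v in enumerate(mask_a) if v}
    b = {i for i, v in enumerate(mask_b) if v}
    union = a | b
    if not union:
        return 1.0  # two empty segmentations are treated as identical
    return len(a & b) / len(union)


# Hypothetical 5-pixel masks from two software programmes
print(jaccard([1, 1, 1, 0, 0], [0, 1, 1, 1, 0]))  # 2 shared pixels / 4 in the union
```

A value of 1 means the two programmes marked exactly the same pixels, so it complements the ICC, which only compares the resulting areas.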
Diffusion-weighted magnetic resonance imaging for assessment of lung lesions: repeatability of the apparent diffusion coefficient measurement.

PubMed

Bernardin, L; Douglas, N H M; Collins, D J; Giles, S L; O'Flynn, E A M; Orton, M; deSouza, N M

2014-02-01

To establish repeatability of apparent diffusion coefficients (ADCs) acquired from free-breathing diffusion-weighted magnetic resonance imaging (DW-MRI) in malignant lung lesions and investigate effects of lesion size, location and respiratory motion. Thirty-six malignant lung lesions (eight patients) were examined twice (1- to 5-h interval) using T1-weighted, T2-weighted and axial single-shot echo-planar DW-MRI (b = 100, 500, 800 s/mm²) during free-breathing. Regions of interest around target lesions on computed b = 800 s/mm² images by two independent observers yielded ADC values from maps (pixel-by-pixel fitting using all b values and a mono-exponential decay model). Intra- and inter-observer repeatability was assessed per lesion, per patient and by lesion size (> or <2 cm) or location. ADCs were similar between observers (mean ± SD, 1.15 ± 0.28 × 10⁻³ mm²/s, observer 1; 1.15 ± 0.29 × 10⁻³ mm²/s, observer 2). Intra-observer coefficients of variation of the mean [median] ADC per lesion and per patient were 11% [11.4%], 5.7% [5.7%] for observer 1 and 9.2% [9.5%], 3.9% [4.7%] for observer 2 respectively; inter-observer values were 8.9% [9.3%] (per lesion) and 3.0% [3.7%] (per patient). Inter-observer coefficient of variation (CoV) was greater for lesions <2 cm (n = 20) compared with >2 cm (n = 16) (10.8% vs 6.5% ADCmean, 11.3% vs 6.7% ADCmedian) and for mid (n = 14) vs apical (n = 9) or lower zone (n = 13) lesions (13.9%, 2.7%, 3.8% respectively ADCmean; 14.2%, 2.8%, 4.7% respectively ADCmedian). Free-breathing DW-MRI of whole lung achieves good intra- and inter-observer repeatability of ADC measurements in malignant lung tumours.
• Diffusion-weighted MRI of the lung can be satisfactorily acquired during free-breathing
• DW-MRI demonstrates high contrast between primary and metastatic lesions and normal lung
• Apparent diffusion coefficient (ADC) measurements in lung tumours are repeatable and reliable
• ADC offers potential in assessing response in lung metastases in clinical trials.
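The mono-exponential decay model referred to in this abstract is S(b) = S0 · exp(−b · ADC). The sketch below fits it for a single pixel by ordinary least squares on the log signal. The fitting method is an assumption (the abstract does not say how the maps were fitted), and the signal values are synthetic.

```python
import math


def fit_adc(b_values, signals):
    """Log-linear least-squares fit of S(b) = S0 * exp(-b * ADC).

    Returns (adc, s0); adc is in mm^2/s when b is in s/mm^2.
    """
    ys = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(ys) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, ys))
    slope = sxy / sxx                       # slope of ln S against b is -ADC
    s0 = math.exp(mean_y - slope * mean_b)  # intercept gives ln S0
    return -slope, s0


# Synthetic noise-free pixel generated with ADC = 1.15e-3 mm^2/s
b_values = [100, 500, 800]
signals = [1000.0 * math.exp(-b * 1.15e-3) for b in b_values]
adc, s0 = fit_adc(b_values, signals)
print(f"ADC = {adc:.2e} mm^2/s, S0 = {s0:.0f}")
```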
Measuring physical activity in preschoolers: Reliability and validity of The System for Observing Fitness Instruction Time for Preschoolers (SOFIT-P)

PubMed Central

Sharma, Shreela; Chuang, Ru-Jye; Skala, Katherine; Atteberry, Heather

2012-01-01

The purpose of this study is to describe the initial feasibility, reliability, and validity of an instrument to measure physical activity in preschoolers using direct observation. The System for Observing Fitness Instruction Time for Preschoolers was developed and tested among 3- to 6-year-old children over fall 2008 for feasibility and reliability (Phase I, n=67) and in fall 2009 for concurrent validity (Phase II, n=27). Phase I showed that preschoolers spent >75% of their active time at preschool in light physical activity. The mean inter-observer agreement scores were ≥.75 for physical activity level and type. Correlation coefficients, measuring construct validity between the lesson context and physical activity types with the activity levels, were moderately strong. Phase II showed moderately strong correlations ranging from .50 to .54 between the System for Observing Fitness Instruction Time for Preschoolers and Actigraph accelerometers for physical activity levels. The System for Observing Fitness Instruction Time for Preschoolers shows promising initial results as a new method for measuring physical activity among preschoolers. PMID:22485071
The "Good, Bad and Ugly" pin site grading system: A reliable and memorable method for documenting and monitoring ring fixator pin sites.

PubMed

Clint, S A; Eastwood, D M; Chasseaud, M; Calder, P R; Marsh, D R

2010-02-01

Although there is much in the literature regarding pin site infections, there is no accepted, validated method for documenting their state. We present a system for reliably labelling pin sites on any ring fixator construct and an easy-to-remember grading system to document the state of each pin site. Each site is graded in terms of erythema, pain and discharge to give a 3-point scale, named "Good", "Bad" and "Ugly" for ease of recall. This system was tested for intra- and inter-observer reproducibility. 15 patients undergoing elective limb reconstruction were recruited. A total of 218 pin sites were independently scored by 2 examiners. 82 were then re-examined later by the same examiners. 514 pin sites were felt to be "Good", 80 "Bad" and 6 "Ugly". The reproducibility of the system was found to be excellent. We feel our system gives a quick, reliable and reproducible method to monitor individual pin sites and their response to treatment. Crown Copyright 2009. Published by Elsevier Ltd. All rights reserved.
Polyp morphology: an interobserver evaluation for the Paris classification among international experts.

PubMed

van Doorn, Sascha C; Hazewinkel, Y; East, James E; van Leerdam, Monique E; Rastogi, Amit; Pellisé, Maria; Sanduleanu-Dascalescu, Silvia; Bastiaansen, Barbara A J; Fockens, Paul; Dekker, Evelien

2015-01-01

The Paris classification is an international classification system for describing polyp morphology. Thus far, the validity and reproducibility of this classification have not been assessed. We aimed to determine the interobserver agreement for the Paris classification among seven Western expert endoscopists. A total of 85 short endoscopic video clips depicting polyps were created and assessed by seven expert endoscopists according to the Paris classification. After a digital training module, the same 85 polyps were assessed again. We calculated the interobserver agreement with a Fleiss kappa and as the proportion of pairwise agreement. The interobserver agreement of the Paris classification among seven experts was moderate with a Fleiss kappa of 0.42 and a mean pairwise agreement of 67%. The proportion of lesions assessed as "flat" by the experts ranged between 13 and 40% (P<0.001). After the digital training, the interobserver agreement did not change (kappa 0.38, pairwise agreement 60%). Our study is the first to validate the Paris classification for polyp morphology. We demonstrated only a moderate interobserver agreement among international Western experts for this classification system. Our data suggest that, in its current version, the use of this classification system in daily practice is questionable and it is unsuitable for comparative endoscopic research. We therefore suggest introduction of a simplification of the classification system.
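The two agreement measures used in this abstract, Fleiss' kappa and the proportion of pairwise agreement, can be sketched as below. The ratings are hypothetical (three raters, four polyps, made-up category labels), not data from the study.

```python
from collections import Counter
from itertools import combinations


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings holds one list of labels per item, same raters per item."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item})
    # Per-item agreement: share of rater pairs that chose the same category
    p_items = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall category proportions
    shares = [sum(c[k] for c in counts) / (n_items * n_raters) for k in categories]
    p_exp = sum(s * s for s in shares)
    return (p_bar - p_exp) / (1 - p_exp)


def mean_pairwise_agreement(ratings):
    """Fraction of all rater pairs, over all items, that gave the same label."""
    pairs = [x == y for item in ratings for x, y in combinations(item, 2)]
    return sum(pairs) / len(pairs)


# Hypothetical ratings, for illustration only
ratings = [
    ["sessile", "sessile", "sessile"],
    ["sessile", "flat", "flat"],
    ["flat", "flat", "flat"],
    ["pedunculated", "sessile", "flat"],
]
print(round(fleiss_kappa(ratings), 3), round(mean_pairwise_agreement(ratings), 3))
```

Kappa is lower than the raw pairwise agreement because it discounts the agreement expected by chance, which is why the abstract can report 67% pairwise agreement alongside a kappa of only 0.42.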
MRI of the wrist in juvenile idiopathic arthritis: proposal of a paediatric synovitis score by a consensus of an international working group. Results of a multicentre reliability study.

PubMed

Damasio, Maria Beatrice; Malattia, Clara; Tanturri de Horatio, Laura; Mattiuz, Chiara; Pistorio, Angela; Bracaglia, Claudia; Barbuti, Domenico; Boavida, Peter; Juhan, Karen Lambot; Ording, Lil Sophie Mueller; Rosendahl, Karen; Martini, Alberto; Magnano, GianMichele; Tomà, Paolo

2012-09-01

MRI is a sensitive tool for the evaluation of synovitis in juvenile idiopathic arthritis (JIA). The purpose of this study was to introduce a novel MRI-based score for synovitis in children and to examine its inter- and intraobserver variability in a multi-centre study. Wrist MRI was performed in 76 children with JIA. On postcontrast 3-D spoiled gradient-echo and fat-suppressed T2-weighted spin-echo images, joint recesses were scored for the degree of synovial enhancement, effusion and overall inflammation independently by two paediatric radiologists. Total-enhancement and inflammation-synovitis scores were calculated. Interobserver agreement was poor to moderate for enhancement and inflammation in all recesses, except in the radioulnar and radiocarpal joints. Intraobserver agreement was good to excellent. For enhancement and inflammation scores, mean differences (95 % CI) between observers were -1.18 (-4.79 to 2.42) and -2.11 (-6.06 to 1.83). Intraobserver variability (reader 1) was 0 (-1.65 to 1.65) and 0.02 (-1.39 to 1.44). Intraobserver agreement was good. Except for the radioulnar and radiocarpal joints, interobserver agreement was not acceptable. Therefore, the proposed scoring system requires further refinement.
Magnetic resonance direct thrombus imaging differentiates acute recurrent ipsilateral deep vein thrombosis from residual thrombosis.

PubMed

Tan, Melanie; Mol, Gerben C; van Rooden, Cornelis J; Klok, Frederikus A; Westerbeek, Robin E; Iglesias Del Sol, Antonio; van de Ree, Marcel A; de Roos, Albert; Huisman, Menno V

2014-07-24

Accurate diagnostic assessment of suspected ipsilateral recurrent deep vein thrombosis (DVT) is a major clinical challenge because differentiating between acute recurrent thrombosis and residual thrombosis is difficult with compression ultrasonography (CUS). We evaluated noninvasive magnetic resonance direct thrombus imaging (MRDTI) in a prospective study of 39 patients with symptomatic recurrent ipsilateral DVT (incompressibility of a different proximal venous segment than at the prior DVT) and 42 asymptomatic patients with at least 6-month-old chronic residual thrombi and normal D-dimer levels. All patients were subjected to MRDTI. MRDTI images were judged by 2 independent radiologists blinded for the presence of acute DVT and a third in case of disagreement. The sensitivity, specificity, and interobserver reliability of MRDTI were determined. MRDTI demonstrated acute recurrent ipsilateral DVT in 37 of 39 patients and was normal in all 42 patients without symptomatic recurrent disease for a sensitivity of 95% (95% CI, 83% to 99%) and a specificity of 100% (95% CI, 92% to 100%). Interobserver agreement was excellent (κ = 0.98). MRDTI images were adequate for interpretation in 95% of the cases. MRDTI is a sensitive and reproducible method for distinguishing acute ipsilateral recurrent DVT from 6-month-old chronic residual thrombi in the leg veins. © 2014 by The American Society of Hematology.
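The confidence intervals quoted for sensitivity (37 of 39) and specificity (42 of 42) are consistent with an exact binomial interval. The sketch below computes a Clopper-Pearson interval by bisection; that the authors used this particular method is an assumption, since the abstract does not name it.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes out of n."""

    def bisect(too_low):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= k) rises to alpha / 2
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper limit: the p at which P(X <= k) falls to alpha / 2
    upper = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


for label, k, n in [("sensitivity", 37, 39), ("specificity", 42, 42)]:
    low, high = clopper_pearson(k, n)
    print(f"{label}: {k / n:.0%} (95% CI, {low:.0%} to {high:.0%})")
```

With these inputs the limits round to 83% to 99% and 92% to 100%, matching the values in the abstract.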
Three-Dimensional Photography for Quantitative Assessment of Penile Volume-Loss Deformities in Peyronie's Disease.

PubMed

Margolin, Ezra J; Mlynarczyk, Carrie M; Mulhall, John P; Stember, Doron S; Stahl, Peter J

2017-06-01

Non-curvature penile deformities are prevalent and bothersome manifestations of Peyronie's disease (PD), but the quantitative metrics that are currently used to describe these deformities are inadequate and non-standardized, presenting a barrier to clinical research and patient care. To introduce erect penile volume (EPV) and percentage of erect penile volume loss (percent EPVL) as novel metrics that provide detailed quantitative information about non-curvature penile deformities and to study the feasibility and reliability of three-dimensional (3D) photography for measurement of quantitative penile parameters. We constructed seven penis models simulating deformities found in PD. The 3D photographs of each model were captured in triplicate by four observers using a 3D camera. Computer software was used to generate automated measurements of EPV, percent EPVL, penile length, minimum circumference, maximum circumference, and angle of curvature. The automated measurements were statistically compared with measurements obtained using water-displacement experiments, a tape measure, and a goniometer. Accuracy of 3D photography for average measurements of all parameters compared with manual measurements; inter-test, intra-observer, and inter-observer reliabilities of EPV and percent EPVL measurements as assessed by the intraclass correlation coefficient. The 3D images were captured in a median of 52 seconds (interquartile range = 45-61). On average, 3D photography was accurate to within 0.3% for measurement of penile length. It overestimated maximum and minimum circumferences by averages of 4.2% and 1.6%, respectively; overestimated EPV by an average of 7.1%; and underestimated percent EPVL by an average of 1.9%. All inter-test, inter-observer, and intra-observer intraclass correlation coefficients for EPV and percent EPVL measurements were greater than 0.75, reflective of excellent methodologic reliability. 
By providing highly descriptive and reliable measurements of penile parameters, 3D photography can empower researchers to better study volume-loss deformities in PD and enable clinicians to offer improved clinical assessment, communication, and documentation. This is the first study to apply 3D photography to the assessment of PD and to accurately measure the novel parameters of EPV and percent EPVL. This proof-of-concept study is limited by the lack of data in human subjects, which could present additional challenges in obtaining reliable measurements. EPV and percent EPVL are novel metrics that can be quickly, accurately, and reliably measured using computational analysis of 3D photographs and can be useful in describing non-curvature volume-loss deformities resulting from PD. Margolin EJ, Mlynarczyk CM, Mulhall JP, et al. Three-Dimensional Photography for Quantitative Assessment of Penile Volume-Loss Deformities in Peyronie's Disease. J Sex Med 2017;14:829-833. Copyright © 2017 International Society for Sexual Medicine. Published by Elsevier Inc. All rights reserved.
Lower urinary tract symptoms that predict microscopic pyuria.

PubMed

Khasriya, Rajvinder; Barcella, William; De Iorio, Maria; Swamy, Sheela; Gill, Kiren; Kupelian, Anthony; Malone-Lee, James

2017-10-02

Urinary dipsticks and culture analyses of a mid-stream urine specimen (MSU) at 10⁵ cfu ml⁻¹ of a known urinary pathogen are considered the gold standard investigations for diagnosing urinary tract infection (UTI). However, the reliability of these tests has been much criticised and they may mislead. It is now widely accepted that pyuria (≥1 WBC μl⁻¹) detected by microscopy of a fresh unspun, unstained specimen of urine is the best biological indicator of UTI available. We aimed to scrutinise the greater potential of symptoms analysis in detecting pyuria and UTI. Lower urinary tract symptom (LUTS) descriptions were collected from patients with chronic lower urinary tract symptoms referred to a tertiary referral unit. The symptoms informed a 39-question inventory, grouped into storage, voiding, stress incontinence and pain symptoms. All questions sought a binary yes or no response. A bespoke software package was developed to collect the data. The study was powered to a sample of at least 1,990 patients, with sufficient power to analyse 39 symptoms in a linear model with an effect size of Cohen's f² = 0.02, type 1 error probability = 0.05, and power (1-β) = 95%, where β is the probability of type 2 error. The inventory was administered to 2,050 female patients between August 2004 and November 2011. The data were collated and the following properties assessed: internal consistency, test-retest reliability, inter-observer reliability, internal responsiveness, external responsiveness, construct validity analysis and a comparison with the International Consultation on Incontinence Modular Questionnaire for female lower urinary tract symptoms (ICIQ-FLUTS). The dependent variable used as a surrogate marker of UTI was microscopic pyuria. An MSU sample was sent for routine culture. The symptoms proved reliable predictors of microscopic pyuria. In particular, voiding symptoms correlated well with microscopic pyuria (χ² = 88, df = 1, p < 0.001).
The symptom inventory has significant psychometric characteristics as below: test-retest reliability: Cronbach's alpha was 0.981; inter-observer reliability, Cronbach's alpha was 0.995, internal responsiveness F = 221, p < 0.001, external responsiveness F = 359, df = 5, p < 0.001. The correlation coefficients for the domains of the ICIQ-FLUTS were around R = 0.5, p < 0.001. This symptoms score performed well on the standard, psychometric validation. The score changed in response to treatment and in a direction appropriate to the changes in microscopic pyuria. It correlated with measures of quality of life. It would seem to make a good candidate for monitoring treatment progress in ordinary clinical practice.
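Cronbach's alpha, the statistic quoted for test-retest and inter-observer reliability above, can be computed as sketched below. Treating the repeated administrations (or observers) as the columns of the score table is an assumption about how the authors applied it, and the scores are invented.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; one row per subject, one column per item, occasion or rater."""
    k = len(scores[0])
    column_vars = [variance(col) for col in zip(*scores)]  # sample variance per column
    total_var = variance([sum(row) for row in scores])     # variance of row totals
    return k / (k - 1) * (1 - sum(column_vars) / total_var)


# Hypothetical symptom counts for five patients on two occasions
scores = [[12, 13], [7, 7], [20, 19], [3, 4], [15, 15]]
print(round(cronbach_alpha(scores), 3))
```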
Inter-Observer and Intra-Observer Reliability of Clinical Assessments in Knee Osteoarthritis

PubMed Central

Maricar, Nasimah; Callaghan, Michael J; Parkes, Matthew J; Felson, David T; O’Neill, Terence W

2016-01-01

Background: Clinical examination of the knee is subject to measurement error. The aim of this analysis was to determine inter- and intra-observer reliability of commonly used clinical tests in patients with knee osteoarthritis (OA). Methods: We studied subjects with symptomatic knee OA who were participants in an open-label clinical trial of intra-articular steroid therapy. Following standardisation of the clinical test procedures, two clinicians assessed 25 subjects independently at the same visit, and the same clinician assessed 88 subjects over an interval period of 2–10 weeks; in both cases prior to the steroid intervention. Clinical examination included assessment of bony enlargement, crepitus, quadriceps wasting, knee effusion, joint-line and anserine tenderness and knee range of movement (ROM). Intra-class correlation coefficients (ICC), estimated kappa (κ), weighted kappa (κω) and Bland and Altman plots were used to determine inter- and intra-observer levels of agreement. Results: Using Landis and Koch criteria, inter-observer kappa scores were moderate for patellofemoral joint (κ=0.53) and anserine tenderness (κ=0.48); good for bony enlargement (κ=0.66), quadriceps wasting (κ=0.78), crepitus (κ=0.78), medial tibiofemoral joint tenderness (κ=0.76), and effusion assessed by ballottement (κ=0.73) and bulge sign (κω=0.78); and excellent for lateral tibiofemoral joint tenderness (κ=1.00), flexion (ICC=0.97) and extension (ICC=0.87) ROM. Intra-observer kappa scores were moderate for lateral tibiofemoral joint tenderness (κ=0.60), good for crepitus (κ=0.78), effusion assessed by ballottement test (κ=0.77), patellofemoral joint (κ=0.66), medial tibiofemoral joint (κ=0.64) and anserine (κ=0.73) tenderness and excellent for effusion assessed by bulge sign (κω=0.83), bony enlargement (κ=0.98), quadriceps wasting (κ=0.83), flexion (ICC=0.99) and extension (ICC=0.96) ROM.
Conclusion: Among individuals with symptomatic knee OA, the reliability of clinical examination of the knee was at least good for the majority of clinical signs of knee OA. PMID:27909143
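Kappa (κ) and weighted kappa (κω) for two observers can be sketched as below. Linear weights are an assumption (the abstract does not state the weighting scheme), the grades are hypothetical, and the labelling function uses the original Landis and Koch band names, whose top two bands this abstract calls "good" and "excellent".

```python
def kappa(rater1, rater2, categories, weights="unweighted"):
    """Cohen's kappa for two raters; weights="linear" gives linearly weighted kappa."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)
    table = [[0] * k for _ in range(k)]
    for x, y in zip(rater1, rater2):
        table[index[x]][index[y]] += 1
    rows = [sum(table[i]) for i in range(k)]
    cols = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        # Disagreement weight: 0 on the diagonal, larger for more distant grades
        if weights == "linear":
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(penalty(i, j) * table[i][j] for i, j in cells) / n
    expected = sum(penalty(i, j) * rows[i] * cols[j] for i, j in cells) / n**2
    return 1 - observed / expected


def landis_koch(value):
    """Landis and Koch (1977) descriptive bands for kappa."""
    if value < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if value <= upper:
            return label
    return "almost perfect"


# Hypothetical ordinal grades (0-2) given by two clinicians to ten knees
r1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 2]
r2 = [0, 1, 1, 1, 2, 1, 0, 1, 2, 2]
for scheme in ("unweighted", "linear"):
    value = kappa(r1, r2, [0, 1, 2], scheme)
    print(scheme, round(value, 3), landis_koch(value))
```

Weighted kappa suits graded signs such as the bulge sign because a one-grade disagreement is penalised less than a two-grade one.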
Megavoltage computed tomography image guidance with helical tomotherapy in patients with vertebral tumors: analysis of factors influencing interobserver variability.

PubMed

Levegrün, Sabine; Pöttgen, Christoph; Jawad, Jehad Abu; Berkovic, Katharina; Hepp, Rodrigo; Stuschke, Martin

2013-02-01

To evaluate megavoltage computed tomography (MVCT)-based image guidance with helical tomotherapy in patients with vertebral tumors by analyzing factors influencing interobserver variability, considered as quality criterion of image guidance. Five radiation oncologists retrospectively registered 103 MVCTs in 10 patients to planning kilovoltage CTs by rigid transformations in 4 df. Interobserver variabilities were quantified using the standard deviations (SDs) of the distributions of the correction vector components about the observers' fraction mean. To assess intraobserver variabilities, registrations were repeated after ≥4 weeks. Residual deviations after setup correction due to uncorrectable rotational errors and elastic deformations were determined at 3 craniocaudal target positions. To differentiate observer-related variations in minimizing these residual deviations across the 3-dimensional MVCT from image resolution effects, 2-dimensional registrations were performed in 30 single transverse and sagittal MVCT slices. Axial and longitudinal MVCT image resolutions were quantified. For comparison, image resolution of kilovoltage cone-beam CTs (CBCTs) and interobserver variability in registrations of 43 CBCTs were determined. Axial MVCT image resolution is 3.9 lp/cm. Longitudinal MVCT resolution amounts to 6.3 mm, assessed as full-width at half-maximum of thin objects in MVCTs with finest pitch. Longitudinal CBCT resolution is better (full-width at half-maximum, 2.5 mm for CBCTs with 1-mm slices). In MVCT registrations, interobserver variability in the craniocaudal direction (SD 1.23 mm) is significantly larger than in the lateral and ventrodorsal directions (SD 0.84 and 0.91 mm, respectively) and significantly larger compared with CBCT alignments (SD 1.04 mm). Intraobserver variabilities are significantly smaller than corresponding interobserver variabilities (variance ratio [VR] 1.8-3.1). 
Compared with 3-dimensional registrations, 2-dimensional registrations have significantly smaller interobserver variability in the lateral and ventrodorsal directions (VR 3.8 and 2.8, respectively) but not in the craniocaudal direction (VR 0.75). Tomotherapy image guidance precision is affected by image resolution and residual deviations after setup correction. Eliminating the effect of residual deviations yields small interobserver variabilities with submillimeter precision in the axial plane. In contrast, interobserver variability in the craniocaudal direction is dominated by the poorer longitudinal MVCT image resolution. Residual deviations after image guidance exist and need to be considered when dose gradients ultimately achievable with image guided radiation therapy techniques are analyzed. Copyright © 2013 Elsevier Inc. All rights reserved.
Gastritis staging: interobserver agreement by applying OLGA and OLGIM systems.

PubMed

Isajevs, Sergejs; Liepniece-Karele, Inta; Janciauskas, Dainius; Moisejevs, Georgijs; Putnins, Viesturs; Funka, Konrads; Kikuste, Ilze; Vanags, Aigars; Tolmanis, Ivars; Leja, Marcis

2014-04-01

Atrophic gastritis remains a difficult histopathological diagnosis with low interobserver agreement. The aim of our study was to compare gastritis staging and interobserver agreement between general and expert gastrointestinal (GI) pathologists using Operative Link for Gastritis Assessment (OLGA) and Operative Link on Gastric Intestinal Metaplasia (OLGIM). We enrolled 835 patients undergoing upper endoscopy in the study. Two general and two expert gastrointestinal pathologists graded biopsy specimens according to the Sydney classification, and the stage of gastritis was assessed by the OLGA and OLGIM systems. Using OLGA, 280 (33.4%) patients had gastritis (stage I-IV), whereas with OLGIM this was 167 (19.9%). OLGA stage III-IV gastritis was observed in 25 patients, whereas by OLGIM stage III-IV was found in 23 patients. Interobserver agreement between expert GI pathologists for atrophy in the antrum, incisura angularis, and corpus was moderate (kappa = 0.53, 0.57 and 0.41, respectively, p < 0.0001), but almost perfect for intestinal metaplasia (kappa = 0.82, 0.80 and 0.81, respectively, p < 0.0001). However, interobserver agreement between general pathologists was poor for atrophy, but moderate for intestinal metaplasia. OLGIM staging provided the highest interobserver agreement, but a substantial proportion of potentially high-risk individuals would be missed if only OLGIM staging is applied. Therefore, we recommend using a combination of OLGA and OLGIM for staging of chronic gastritis.
Interobserver agreement in analysis of cardiotocograms recorded during trial of labor after cesarean.

PubMed

Caning, M M; Thisted, D L A; Amer-Wählin, I; Laier, G H; Krebs, L

2018-05-17

To examine interobserver agreement in intrapartum cardiotocography (CTG) classification in women undergoing trial of labor after a cesarean section (TOLAC) at term with or without complete uterine rupture. Nineteen blinded and independent Danish obstetricians assessed CTG tracings from 47 women (174 individual pages) with a complete uterine rupture during TOLAC and 37 women (133 individual pages) with no uterine rupture during TOLAC. Individual pages with CTG tracings lasting at least 20 min were evaluated by three different assessors and counted as an individual case. The tracings were analyzed according to the modified version of the Federation of Gynaecology and Obstetrics (FIGO) guidelines elaborated for the use of STAN (ST-analysis). Occurrence of defined abnormalities was recorded and the tracings were classified as normal, suspicious, pathological, or preterminal. The interobserver agreement was evaluated using Fleiss' kappa. Agreement on classification of a preterminal CTG was almost perfect. The interobserver agreement on normal, suspicious or pathological CTG was moderate to substantial. Regarding the presence of severe variable decelerations, the agreement was moderate. No statistical difference was found in the interobserver agreement between classification of tracings from women undergoing TOLAC with and without complete uterine rupture. The interobserver agreement on classification of CTG tracings from high-risk deliveries during TOLAC is best for assessment of a preterminal CTG and the poorest for the identification of severe variable decelerations.

The interrater and intrarater reliability of the Philpott-Javer staging system based on level of training.

PubMed

Parhar, Harman S; Thamboo, Andrew; Habib, Al-Rahim; Chang, Brent; Gan, Eng Cern; Javer, Amin R

2014-04-01

The Philpott-Javer postoperative endoscopic mucosal staging system for allergic fungal rhinosinusitis has previously demonstrated acceptable interrater reliability among rhinologists. There are, however, numerous learners involved in patient care at tertiary centers. This study aims to analyze the interrater and intrarater reliability of this system among learners in otolaryngology at different stages in training. A prospective analysis of retrospectively collected endoscopic photographs. A tertiary care teaching hospital (January 2013). Fifty patients undergoing routine follow-up. Three photographs from each of 50 patients undergoing routine postsurgical nasoendoscopy were reviewed. Images were played twice, 1 week apart, in 2 differently randomized cycles and scored according to Philpott-Javer criteria by a rhinologist, a rhinology fellow, a senior otolaryngology resident, a junior otolaryngology resident, and a medical student. Interobserver reliability was assessed using the intraclass correlation coefficient, while intrarater reliability was assessed by Shrout-Fleiss κ values. Agreement between each learner and the rhinologist was also assessed using κ values. The intraclass correlation among the 5 raters was 0.7600 (95% confidence interval, 0.6917-0.8161) for the Philpott-Javer scoring system, suggesting substantial reliability. Intrarater data showed substantial to almost-perfect reliability (κ values between 0.668 and 0.815) among all raters using this system. There was also moderate to substantial agreement between the learners and the rhinologist (κ values between 0.534 and 0.710). Results suggest that the Philpott-Javer staging system has acceptable intrarater and interrater reliability among learners of differing levels of clinical experience and is suitable for evaluating progress following surgery.
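An intraclass correlation coefficient for several raters scoring the same subjects can be sketched from a two-way analysis of variance. The form shown is ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater); which form this study used is not stated, so that choice is an assumption, and the scores are invented.

```python
def icc_2_1(scores):
    """ICC(2,1); scores has one row per subject and one column per rater."""
    n = len(scores)       # subjects
    k = len(scores[0])    # raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)                # between-subject mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical stage scores: the second rater is always one point higher
print(round(icc_2_1([[1, 2], [2, 3], [3, 4]]), 3))
```

Because this form measures absolute agreement, a rater who is consistently one point higher lowers the coefficient even though the ranking of subjects is identical.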
Validation of Morphometric Analyses of Small-Intestinal Biopsy Readouts in Celiac Disease

PubMed Central

Taavela, Juha; Koskinen, Outi; Huhtala, Heini; Lähdeaho, Marja-Leena; Popp, Alina; Laurila, Kaija; Collin, Pekka; Kaukinen, Katri; Kurppa, Kalle; Mäki, Markku

2013-01-01

Background Assessment of the gluten-induced small-intestinal mucosal injury remains the cornerstone of celiac disease diagnosis. Usually the injury is evaluated using grouped classifications (e.g. Marsh groups), but this is often too imprecise and ignores minor but significant changes in the mucosa. Consequently, there is a need for validated continuous variables in everyday practice and in academic and pharmacological research. Methods We studied the performance of our standard operating procedure (SOP) on 93 selected biopsy specimens from adult celiac disease patients and non-celiac disease controls. The specimens, which comprised different grades of gluten-induced mucosal injury, were evaluated by morphometric measurements. Specimens with tangential cutting resulting from poorly oriented biopsies were included. Two accredited evaluators performed the measurements in blinded fashion. The intraobserver and interobserver variations for villus height and crypt depth ratio (VH:CrD) and densities of intraepithelial lymphocytes (IELs) were analyzed by the Bland-Altman method and intraclass correlation. Results Unevaluable biopsies according to our SOP were correctly identified. The intraobserver analysis of VH:CrD showed a mean difference of 0.087 with limits of agreement from −0.398 to 0.224; the standard deviation (SD) was 0.159. The mean difference in interobserver analysis was 0.070, limits of agreement −0.516 to 0.375, and SD 0.227. The intraclass correlation coefficient in intraobserver variation was 0.983 and that in interobserver variation 0.978. CD3+ IEL density countings in the paraffin-embedded and frozen biopsies showed SDs of 17.1% and 16.5%; the intraclass correlation coefficients were 0.961 and 0.956, respectively. Conclusions Using our SOP, quantitative, reliable and reproducible morphometric results can be obtained on duodenal biopsy specimens with different grades of gluten-induced injury. 
Clinically significant changes were defined according to the error margins (2SD) of the analyses in VH:CrD as 0.4 and in CD3+-stained IELs as 30%. PMID:24146832
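The Bland-Altman analysis named in the abstract above summarises paired readings by their mean difference and limits of agreement. What follows is a minimal sketch, assuming the conventional limits of mean difference ± 1.96 SD (the abstract does not state the multiplier it used). The helper name `bland_altman` and the VH:CrD readings are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, sd, lower_limit, upper_limit) for paired readings."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    # Difference between the two readings of each specimen
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)   # mean difference between readings
    sd = stdev(diffs)    # sample standard deviation of the differences
    # Assumed convention: limits of agreement at bias +/- 1.96 SD
    return bias, sd, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical VH:CrD ratios from two readings of the same six specimens
reading_1 = [2.8, 0.4, 1.6, 3.1, 0.9, 2.2]
reading_2 = [2.7, 0.5, 1.4, 3.0, 1.0, 2.0]

bias, sd, lower, upper = bland_altman(reading_1, reading_2)
print(f"bias={bias:.3f} sd={sd:.3f} limits=({lower:.3f}, {upper:.3f})")
```

A threshold for clinically significant change of about 2 SD, as the abstract describes, would correspond to `2 * sd` in this sketch.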
Computed tomography arthrography using a radial plane view for the detection of triangular fibrocartilage complex foveal tears.

PubMed

Moritomo, Hisao; Arimitsu, Sayuri; Kubo, Nobuyuki; Masatomi, Takashi; Yukioka, Masao

2015-02-01

To classify triangular fibrocartilage complex (TFCC) foveal lesions on the basis of computed tomography (CT) arthrography using a radial plane view and to correlate the CT arthrography results with surgical findings. We also tested the interobserver and intra-observer reliability of the radial plane view. A total of 33 patients with a suspected TFCC foveal tear who had undergone wrist CT arthrography and subsequent surgical exploration were enrolled. We classified the configurations of TFCC foveal lesions into 5 types on the basis of CT arthrography with the radial plane view in which the image slices rotate clockwise centered on the ulnar styloid process. Sensitivity, specificity, and positive predictive values were calculated for each type of foveal lesion in CT arthrography to detect foveal tears. We determined interobserver and intra-observer agreements using kappa statistics. We also compared accuracies with the radial plane views with those with the coronal plane views. Among the tear types on CT arthrography, type 3, a roundish defect at the fovea, and type 4, a large defect at the overall ulnar insertion, had high specificity and positive predictive value for the detection of foveal tears. Specificity and positive predictive values were 90% and 89% for type 3 and 100% and 100% for type 4, respectively, whereas sensitivity was 35% for type 3 and 22% for type 4. Interobserver and intra-observer agreement was substantial and almost perfect, respectively. The radial plane view identified foveal lesion of each palmar and dorsal radioulnar ligament separately, but accuracy results with the radial plane views were not statistically different from those with the coronal plane views. Computed tomography arthrography with a radial plane view exhibited enhanced specificity and positive predictive value when a type 3 or 4 lesion was identified in the detection of a TFCC foveal tear compared with historical controls. Diagnostic II. 
Copyright © 2015 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
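Sensitivity, specificity, and positive predictive value, as reported in the abstract above, all follow from a 2x2 table of imaging result against surgical finding. A minimal sketch of that arithmetic is given here; the function name `diagnostic_metrics` and the counts are hypothetical illustrations, not data from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, ppv) from 2x2 table counts.

    tp: lesion type seen on imaging, tear confirmed at surgery
    fp: lesion type seen on imaging, no tear at surgery
    fn: lesion type not seen, tear confirmed at surgery
    tn: lesion type not seen, no tear at surgery
    """
    sensitivity = tp / (tp + fn)   # share of true tears that were flagged
    specificity = tn / (tn + fp)   # share of intact cases correctly cleared
    ppv = tp / (tp + fp)           # share of flagged cases that were true tears
    return sensitivity, specificity, ppv


# Hypothetical counts for one lesion type (not from the study)
sens, spec, ppv = diagnostic_metrics(tp=7, fp=1, fn=13, tn=9)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} ppv={ppv:.3f}")
```

The example shows how a sign can combine high specificity and predictive value with low sensitivity: it is rarely present without a tear, but many tears do not show it.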
Diagnostic value of Doppler assessment of the hepatic and portal vessels and ultrasound of the spleen in liver disease.

PubMed

O'Donohue, John; Ng, Chaan; Catnach, Susan; Farrant, Patricia; Williams, Roger

2004-02-01

To investigate the clinical utility and the intra-observer and inter-observer variability of Doppler ultrasound assessment of the hepatic and portal vessels along with measurement of spleen size in the diagnosis of chronic liver disease and cirrhosis. Ultrasound measurements of portal vein diameter (PVD), portal vein velocity (PVV), hepatic arterial resistance index (HARI), hepatic vein profile (HVP), and spleen size were obtained in 49 controls and 45 patients with liver disease (23 with primary biliary cirrhosis, 22 with hepatitis C) by two experienced observers, who each performed three blinded measurements of each variable. Control values were derived from normal hospital workers. Percutaneous liver biopsies in 41 of the patients showed cirrhosis (14 patients), moderate/severe fibrosis (13 patients), and early disease (14 patients). Seventy-one percent of cirrhotic patients had splenomegaly (> 13.6 cm). The spleen size was significantly larger in cirrhotics (16.0 cm) than in non-cirrhotics (13.0 cm, P < 0.009) and healthy controls (10.7 cm, P < 0.00005), and was the only independent predictor of cirrhosis, with a threshold of 15 cm predicting cirrhosis with a specificity of 98%, positive predictive value of 93%, sensitivity of 57% and negative predictive value of 80%. HVP was abnormal in 76.9% of cirrhotics, 57.7% of non-cirrhotics and 2.1% of controls (P < 0.04). However, the mean PVV, PVD and HARI were no different between controls and patients or between cirrhotic and non-cirrhotic liver disease. There was significant inter-observer variability for PVV, but intra-observer and inter-observer variability was acceptable for the other measurements. Splenomegaly size and abnormal HVP are useful predictors of chronic liver disease and cirrhosis, and both can be measured reliably and reproducibly. However, Doppler measurements of PVV, PVD and HARI are not useful in distinguishing patients with chronic liver disease from normal controls.
Radiographic Blind Test of Curvature of the Posterior Border of the Mandibular Ramus as a Morphological Indicator of Gender.

PubMed

Peregrina, Alejandro; Azer, Shereen S; Tao, Erin E; Johnston, William M

2016-12-01

Curvature of the posterior border of the mandibular ramus at the occlusal plane has been described as a morphological trait for males. Controversy over the accuracy of this method remains among researchers; studies employing similar methods report accuracy rates for successful gender identification ranging from 59% to 99%. This blind study assessed evaluators' ability to determine gender based on the presence or absence of curvature of the posterior margin of the mandibular ramus through panoramic radiographs. Randomly selected panoramic radiographs were obtained from The Ohio State University College of Dentistry for 413 adult male (M) and female (F) subjects. Two evaluators separately assigned ratings to each subject on the right and left sides of mandibular rami, using a method similar to the Loth and Henneberg methodology. The ratings were based upon three criteria: (1) presence of curvature at the occlusal plane (M), (2) presence of curvature but not at the occlusal plane (F), and (3) lack of curvature (F). Pearson exact chi-squared test was used to evaluate the statistical strength of the ratings. The evaluators were only in agreement for both the right and left rami in roughly two-thirds (66.8%) of cases when there was no excessive tooth loss (ETL); however, the inter-observer agreement improved to 82.1% for those rami associated with ETL. Inter-observer agreement occurred in 72.9% of female rami and in only 64.4% of male rami. The results of this study indicated that assessment of posterior border curvature of mandibular rami through panoramic radiographs was not a reliable indicator of gender and was further plagued by unacceptably high levels of inter-observer disagreement. © 2016 by the American College of Prosthodontists.
Investigating Various Thresholds as Immunohistochemistry Cutoffs for Observer Agreement.

PubMed

Ali, Asif; Bell, Sarah; Bilsland, Alan; Slavin, Jill; Lynch, Victoria; Elgoweini, Maha; Derakhshan, Mohammad H; Jamieson, Nigel B; Chang, David; Brown, Victoria; Denley, Simon; Orange, Clare; McKay, Colin; Carter, Ross; Oien, Karin A; Duthie, Fraser R

2017-10-01

Clinical translation of immunohistochemistry (IHC) biomarkers requires reliable and reproducible cutoffs or thresholds for interpretation of immunostaining. Most IHC biomarker research focuses on the clinical relevance (diagnostic, prognostic, or predictive utility) of cutoffs, with less emphasis on observer agreement using these cutoffs. From the literature, we identified 3 commonly used cutoffs of 10% positive epithelial cells, 20% positive epithelial cells, and moderate to strong staining intensity (+2/+3 hereafter) to use for investigating observer agreement. A series of 36 images of microarray cores stained for 4 different IHC biomarkers, with variable staining intensity and percentage of positive cells, was used for investigating interobserver and intraobserver agreement. Seven pathologists scored the immunostaining in each image using the 3 cutoffs for positive and negative staining. Kappa (κ) statistic was used to assess the strength of agreement for each cutoff. The interobserver agreement between all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.64, 0.59, and 0.62, respectively, for 10%, 20%, and +2/+3 cutoffs. A good agreement was observed for experienced pathologists using the 10% cutoff, and their agreement was statistically higher than for junior pathologists (P=0.02). In addition, the mean intraobserver agreement for all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.71, 0.60, and 0.73, respectively, for 10%, 20%, and +2/+3 cutoffs. For all 3 cutoffs, a positive correlation was observed with perceived ease of interpretation (P<0.003). Finally, cytoplasmic-only staining achieved higher agreement using all 3 cutoffs than mixed staining patterns. All 3 cutoffs investigated achieve reasonable strength of agreement, modestly decreasing interobserver and intraobserver variability in IHC interpretation. 
These cutoffs have previously been used in cancer pathology, and this study provides evidence that these cutoffs can be reproducible between practicing pathologists.
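The kappa statistic used in the abstract above corrects raw agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa for a single pair of raters follows. The abstract does not say how the pairwise or multi-rater values were combined into the reported means, so this shows only the two-rater case. The function name `cohens_kappa` and the scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equally long rating lists")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical positive/negative calls on ten images at one cutoff
pathologist_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
pathologist_2 = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg", "pos", "neg"]

kappa = cohens_kappa(pathologist_1, pathologist_2)
print(f"kappa={kappa:.2f}")
```

Here the raters agree on 8 of 10 images while chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.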
Inter-observer agreement of standard joint count examination and disease global assessment in a cohort of Egyptian Rheumatoid Arthritis patients.

PubMed

El-Hadidi, Khaled; Gamal, Sherif M; Saad, Sahar

2017-12-21

To assess the inter-observer agreement of standard joint count between an experienced Rheumatology professor (Prof) and a young Rheumatology fellow (candidate), and to compare disease global assessment between professor, young candidate and patients. This study included one hundred rheumatoid arthritis patients. For all patients independent clinical evaluation was done by two rheumatologists (professor and candidate) for detection of tenderness in 28 joints and swelling in 26 joints. The study also involved global assessment of disease activity by the provider (Prof and candidate) (EGA) as well as by the patient (PGA). The EGA was determined without previous knowledge of the patient's laboratory test results. A highly significant accordance (correlation) between professor and candidate was found in both the number of tender joints (p<0.001) (r=0.946), and the number of swollen joints (p<0.001) (r=0.797). Regarding swollen joints, the highest agreement was in right knee (0.929), while poor agreement was found in the right 5th MCP (0.049). Regarding tender joints, the highest analogy was in the right elbow (0.899), in contrast to the left 3rd PIP (0.462) which showed the least congruence. Agreement study using kappa measurement for disease global assessment showed moderate agreement between professor and candidate (0.405), fair agreement between professor and patient (0.213), and fair agreement between candidate and patient (0.367). Inter-observer reliability was better for TJCs than SJCs. Regarding SJCs agreement was better in large joints such as the knees compared to the small joints such as the MCPs. Disease global assessment may show discrepancy between patients and physicians. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Reumatología y Colegio Mexicano de Reumatología. All rights reserved.
Does experience in hysteroscopy improve accuracy and inter-observer agreement in the management of abnormal uterine bleeding?

PubMed

Bourdel, Nicolas; Modaffari, Paola; Tognazza, Enrica; Pertile, Riccardo; Chauvet, Pauline; Botchorishivili, Revaz; Savary, Dennis; Pouly, Jean Luc; Rabischong, Benoit; Canis, Michel

2016-12-01

Hysteroscopic reliability may be influenced by the experience of the operator and by a lack of morphological diagnostic criteria for endometrial malignant pathologies. The aim of this study was to evaluate the diagnostic accuracy and the inter-observer agreement (IOA) in the management of abnormal uterine bleeding (AUB) among gynecologists with different levels of experience. Each gynecologist, without any other clinical information, was asked to evaluate the anonymous video recordings of 51 consecutive patients who underwent hysteroscopy and endometrial resection for AUB. Experts (>500 hysteroscopies), seniors (20-499 procedures) and junior (≤19 procedures) gynecologists were asked to judge endometrial macroscopic appearance (benign, suspicious or frankly malignant). They also had to propose the histological diagnosis (atrophic or proliferative endometrium; simple, glandulocystic or atypical endometrial hyperplasia and endometrial carcinoma). Observers were free to indicate whether the quality of recordings was not good enough for adequate assessment. IOA (k coefficient), sensitivity, specificity, predictive value and the likelihood ratio were calculated. Five expert, five senior and six junior gynecologists were involved in the study. Considering endometrial cancer and endometrial atypical hyperplasia, sensitivity and specificity were respectively 55.5 % and 84.5 % for juniors, 66.6 % and 81.2 % for seniors and 86.6 % and 87.3 % for experts. Concerning endometrial macroscopic appearance, IOA was poor for juniors (k = 0.10) and fair for seniors and experts (k = 0.23 and 0.22, respectively). IOA was poor for juniors and experts (k = 0.18 and 0.20, respectively) and fair for seniors (k = 0.30) in predicting the histological diagnosis. Sensitivity improves with the observer's experience, but inter-observer agreement and reproducibility of hysteroscopy for endometrial malignancies are not satisfying no matter the level of expertise. 
Therefore, an accurate and complete endometrial sampling is still needed.
Evaluation of the influence exerted by different dental specialty backgrounds and measuring instrument reproducibility on esthetic aspects of maxillary implant-supported single crown.

PubMed

Vaidya, Samriddhi; Ho, Yu Lau Elaine; Hao, Jie; Lang, Niklaus P; Mattheos, Nikos

2015-03-01

To evaluate the influence exerted by different dental specialty backgrounds as well as the validity and reproducibility of the Pink Esthetic Score/White Esthetic Score (PES/WES) and the modified Implant Crown Aesthetic Index (mod-ICAI) on the assessment of esthetic aspects of maxillary implant-supported single-tooth prostheses. A total of fourteen examiners (two orthodontists, two prosthodontists, two oral surgeons, two periodontists, two dental technicians, two dental assistants, and two postgraduate students in Implant Dentistry) evaluated 20 photographs of single-implant-supported crowns and five photographs of unrestored teeth of the esthetic zone in a two-part study. The examiners assessed the photographs with each index (Pink Esthetic Score/White Esthetic Score and modified Implant Crown Aesthetic Index), twice with a week's interval. Orders of photographs were rearranged in the second assessment. Kruskal-Wallis test results showed significant differences among all the six specialties (P ≤ 0.001). DAs and periodontists had significantly better ratings than other specialties with both indices. Prosthodontists had the lowest mean rank scores regardless of the index. Interobserver agreement was also lowest between the two prosthodontists (4-28%); the rest of the groups had low-to-moderate agreement (20-80%) when limited allowance was accepted. With mod-ICAI, more interobserver agreement was noted within the specialty group than with PES/WES. The PES/WES and the modified ICAI can be reliable estimates of esthetic outcomes. The assessor's degree of specialization affected the esthetic evaluation with both the PES/WES and the modified ICAI. DAs and periodontists were identified to provide more favorable ratings than other specialties while prosthodontists were most critical in this study. With modified ICAI, more interobserver agreement within specialty resulted. The interexaminer agreement may be increased if more tolerance of 1-2 points is considered. 
© 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Assessing distractors and teamwork during surgery: developing an event-based method for direct observation.

PubMed

Seelandt, Julia C; Tschan, Franziska; Keller, Sandra; Beldi, Guido; Jenni, Nadja; Kurmann, Anita; Candinas, Daniel; Semmer, Norbert K

2014-11-01

To develop a behavioural observation method to simultaneously assess distractors and communication/teamwork during surgical procedures through direct, on-site observations; to establish the reliability of the method for long (>3 h) procedures. Observational categories for an event-based coding system were developed based on expert interviews, observations and a literature review. Using Cohen's κ and the intraclass correlation coefficient, interobserver agreement was assessed for 29 procedures. Agreement was calculated for the entire surgery, and for the 1st hour. In addition, interobserver agreement was assessed between two tired observers and between a tired and a non-tired observer after 3 h of surgery. The observational system has five codes for distractors (door openings, noise distractors, technical distractors, side conversations and interruptions), eight codes for communication/teamwork (case-relevant communication, teaching, leadership, problem solving, case-irrelevant communication, laughter, tension and communication with external visitors) and five contextual codes (incision, last stitch, personnel changes in the sterile team, location changes around the table and incidents). Based on 5-min intervals, Cohen's κ was good to excellent for distractors (0.74-0.98) and for communication/teamwork (0.70-1). Based on frequency counts, intraclass correlation coefficient was excellent for distractors (0.86-0.99) and good to excellent for communication/teamwork (0.45-0.99). After 3 h of surgery, Cohen's κ was 0.78-0.93 for distractors, and 0.79-1 for communication/teamwork. The observational method developed allows a single observer to simultaneously assess distractors and communication/teamwork. Even for long procedures, high interobserver agreement can be achieved. Data collected with this method allow for investigating separate or combined effects of distractions and communication/teamwork on surgical performance and patient outcomes. 
Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
The Neurologic Assessment in Neuro-Oncology (NANO) scale: a tool to assess neurologic function for integration into the Response Assessment in Neuro-Oncology (RANO) criteria.

PubMed

Nayak, Lakshmi; DeAngelis, Lisa M; Brandes, Alba A; Peereboom, David M; Galanis, Evanthia; Lin, Nancy U; Soffietti, Riccardo; Macdonald, David R; Chamberlain, Marc; Perry, James; Jaeckle, Kurt; Mehta, Minesh; Stupp, Roger; Muzikansky, Alona; Pentsova, Elena; Cloughesy, Timothy; Iwamoto, Fabio M; Tonn, Joerg-Christian; Vogelbaum, Michael A; Wen, Patrick Y; van den Bent, Martin J; Reardon, David A

2017-05-01

The Macdonald criteria and the Response Assessment in Neuro-Oncology (RANO) criteria define radiologic parameters to classify therapeutic outcome among patients with malignant glioma and specify that clinical status must be incorporated and prioritized for overall assessment. But neither provides specific parameters to do so. We hypothesized that a standardized metric to measure neurologic function will permit more effective overall response assessment in neuro-oncology. An international group of physicians including neurologists, medical oncologists, radiation oncologists, and neurosurgeons with expertise in neuro-oncology drafted the Neurologic Assessment in Neuro-Oncology (NANO) scale as an objective and quantifiable metric of neurologic function evaluable during a routine office examination. The scale was subsequently tested in a multicenter study to determine its overall reliability, inter-observer variability, and feasibility. The NANO scale is a quantifiable evaluation of 9 relevant neurologic domains based on direct observation and testing conducted during routine office visits. The score defines overall response criteria. A prospective, multinational study noted a >90% inter-observer agreement rate with kappa statistic ranging from 0.35 to 0.83 (fair to almost perfect agreement), and a median assessment time of 4 minutes (interquartile range, 3-5). The NANO scale provides an objective clinician-reported outcome of neurologic function with high inter-observer agreement. It is designed to combine with radiographic assessment to provide an overall assessment of outcome for neuro-oncology patients in clinical trials and in daily practice. Furthermore, it complements existing patient-reported outcomes and cognition testing to combine for a global clinical outcome assessment of well-being among brain tumor patients. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved. 
For permissions, please e-mail: journals.permissions@oup.com
Challenges in Coding Adverse Events in Clinical Trials: A Systematic Review

PubMed Central

Schroll, Jeppe Bennekou; Maund, Emma; Gøtzsche, Peter C.

2012-01-01

Background Misclassification of adverse events in clinical trials can sometimes have serious consequences. Therefore, each of the many steps involved, from a patient's adverse experience to presentation in tables in publications, should be as standardised as possible, minimising the scope for interpretation. Adverse events are categorised by a predefined dictionary, e.g. MedDRA, which is updated biannually with many new categories. The objective of this paper is to study interobserver variation and other challenges of coding. Methods Systematic review using PRISMA. We searched PubMed, EMBASE and The Cochrane Library. All studies were screened for eligibility by two authors. Results Our search returned 520 unique studies of which 12 were included. Only one study investigated interobserver variation. It reported that 12% of the codes were evaluated differently by two coders. Independent physicians found that 8% of all the codes deviated from the original description. Other studies found that product summaries could be greatly affected by the choice of dictionary. With the introduction of MedDRA, it seems to have become harder to identify adverse events statistically because each code is divided in subgroups. To account for this, lumping techniques have been developed but are rarely used, and guidance on when to use them is vague. An additional challenge is that adverse events are censored if they already occurred in the run-in period of a trial. As there are more than 26 ways of determining whether an event has already occurred, this can lead to bias, particularly because data analysis is rarely performed blindly. Conclusion There is a lack of evidence that coding of adverse events is a reliable, unbiased and reproducible process. The increase in categories has made detecting adverse events harder, potentially compromising safety. It is crucial that readers of medical publications are aware of these challenges. Comprehensive interobserver studies are needed. 
PMID:22911755
Analysis of the psychometric properties of the American Orthopaedic Foot and Ankle Society Score (AOFAS) in rheumatoid arthritis patients: application of the Rasch model.

PubMed

Conceição, Cristiano Sena da; Neto, Mansueto Gomes; Neto, Anolino Costa; Mendes, Selena M D; Baptista, Abrahão Fontes; Sá, Kátia Nunes

2016-01-01

To test the reliability and validity of the AOFAS in a sample of rheumatoid arthritis patients. The scale was applied to rheumatoid arthritis patients, twice by interviewer 1 and once by interviewer 2. The AOFAS was subjected to test-retest reliability analysis (with 20 rheumatoid arthritis subjects). The psychometric properties were investigated using Rasch analysis on 33 rheumatoid arthritis patients. Intra-Class Correlation Coefficient (ICC) were (0.90
Observer reliability of the Gross Motor Performance Measure and the Quality of Upper Extremity Skills Test, based on video recordings.

PubMed

Sorsdahl, Anne Brit; Moe-Nilssen, Rolf; Strand, Liv Inger

2008-02-01

The aim of this study was to examine observer reliability of the Gross Motor Performance Measure (GMPM) and the Quality of Upper Extremity Skills Test (QUEST) based on video clips. The tests were administered to 26 children with cerebral palsy (CP; 14 males, 12 females; range 2-13y, mean 7y 6mo), 24 with spastic CP, and two with dyskinesia. Respectively, five, six, five, four, and six children were classified in Gross Motor Function Classification System Levels I to V; and four, nine, five, five, and three children were classified in Manual Ability Classification System levels I to V. The children's performances were recorded and edited. Two experienced paediatric physical therapists assessed the children from watching the video clips. Intraobserver and interobserver reliability values of the total scores were mostly high, intraclass correlation coefficient (ICC)(1,1) varying from 0.69 to 0.97 with only one coefficient below 0.89. The ICCs of subscores varied from 0.36 to 0.95, finding 'Alignment' and 'Weight shift' in GMPM and 'Protective extension' in QUEST highly reliable. The subscores 'Dissociated movements' in GMPM and QUEST, and 'Grasp' in QUEST were the least reliable, and recommendations are made to increase reliability of these subscores. Video scoring was time consuming, but was found to offer many advantages; the possibility to review performance, to use special trained observers for scoring and less demanding assessment for the children.
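The ICC(1,1) cited in the abstract above is the one-way random-effects, single-measurement form of the intraclass correlation: (MSB - MSW) / (MSB + (k - 1) MSW), where MSB and MSW are the between-subject and within-subject mean squares and k is the number of ratings per subject. A minimal sketch under that standard definition follows; the function name `icc_1_1` and the scores are hypothetical and do not come from the study.

```python
def icc_1_1(ratings):
    """ICC(1,1) for a list of subjects, each a list of k ratings."""
    n = len(ratings)
    k = len(ratings[0]) if ratings else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 ratings")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-subject mean square (n - 1 degrees of freedom)
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (n * (k - 1) degrees of freedom)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    # Note: raises ZeroDivisionError if every rating is identical
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical total scores: five children, each scored by two observers
scores = [[9, 10], [6, 7], [8, 8], [4, 5], [7, 6]]
icc = icc_1_1(scores)
print(f"ICC(1,1)={icc:.3f}")
```

For these scores MSB is 7.0 and MSW is 0.4, giving 6.6 / 7.4, or about 0.892.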
Identifying homologous anatomical landmarks on reconstructed magnetic resonance images of the human cerebral cortical surface

PubMed Central

MAUDGIL, D. D.; FREE, S. L.; SISODIYA, S. M.; LEMIEUX, L.; WOERMANN, F. G.; FISH, D. R.; SHORVON, S. D.

1998-01-01

Guided by a review of the anatomical literature, 36 sulci on the human cerebral cortical surface were designated as homologous. These sulci were assessed for visibility on 3-dimensional images reconstructed from magnetic resonance imaging scans of the brains of 20 normal volunteers by 2 independent observers. Those sulci that were found to be reproducibly identifiable were used to define 24 landmarks around the cortical surface. The interobserver and intraobserver variabilities of measurement of the 24 landmarks were calculated. These reliably reproducible landmarks can be used for detailed morphometric analysis, and may prove helpful in the analysis of suspected cerebral cortical structured abnormalities in patients with such conditions as epilepsy. PMID:10029189
The use of ultrasound for postoperative monitoring of cerebral bypass grafts: A technical report.

PubMed

Morton, Ryan P; Abecassis, Isaac Joshua; Moore, Anne E; Kelly, Cory M; Levitt, Michael R; Kim, Louis J; Sekhar, Laligam N

2017-06-01

Duplex ultrasound and transcranial Doppler are valuable tools for post-operative monitoring of extracranial-intracranial cerebral bypass grafts. Here we describe our technique for the evaluation of both high-flow and low-flow cerebral bypass grafts over a nine year period. 186 bypass grafts were studied daily during the inpatient period between Jan 2005 and Dec 2014 after surgery for various cerebrovascular pathologies. There was a technical success rate of 97%. Duplex ultrasonographic flow measurements had excellent interobserver reliability with an intraclass correlation coefficient (ICC) of 0.89 (p=0.009). Technical nuances are highlighted and a brief discussion of pathology is undertaken. Copyright © 2017 Elsevier Ltd. All rights reserved.
Interobserver agreement between primary graders and an expert grader in the Bristol and Weston diabetic retinopathy screening programme: a quality assurance audit.

PubMed

Patra, S; Gomm, E M W; Macipe, M; Bailey, C

2009-08-01

To assess the quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and to set standards for future interobserver agreement reports. A prospective audit of 213 image sets from six fully trained primary graders in the Bristol and Weston diabetic retinopathy screening programme was carried out over a 4-week period. All the images graded by the primary graders were regraded by an expert grader blinded to the primary grading results and the identity of the primary grader. The interobserver agreement between primary graders and the blinded expert grader and the corresponding Kappa coefficient was determined for overall grading, referable, non-referable and ungradable disease. The audit standard was set at 80% for interobserver agreement with a Kappa coefficient of 0.7. The interobserver agreement bettered the audit standard of 80% in all the categories. The Kappa coefficient was substantial (0.7) for the overall grading results and ranged from moderate to substantial (0.59-0.65) for referable, non-referable and ungradable disease categories. The main recommendation of the audit was to provide refresher training for the primary graders with focus on ungradable disease. The audit demonstrated an acceptable level of quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and provided a standard against which future interobserver agreement can be measured for quality assurance within a screening programme. Diabet. Med. 26, 820-823 (2009).
Segmentation precision of abdominal anatomy for MRI-based radiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noel, Camille E.; Zhu, Fan; Lee, Andrew Y.

2014-10-01

The limited soft tissue visualization provided by computed tomography, the standard imaging modality for radiotherapy treatment planning and daily localization, has motivated studies on the use of magnetic resonance imaging (MRI) for better characterization of treatment sites, such as the prostate and head and neck. However, no studies have been conducted on MRI-based segmentation for the abdomen, a site that could greatly benefit from enhanced soft tissue targeting. We investigated the interobserver and intraobserver precision in segmentation of abdominal organs on MR images for treatment planning and localization. Manual segmentation of 8 abdominal organs was performed by 3 independent observers on MR images acquired from 14 healthy subjects. Observers repeated segmentation 4 separate times for each image set. Interobserver and intraobserver contouring precision was assessed by computing 3-dimensional overlap (Dice coefficient [DC]) and distance to agreement (Hausdorff distance [HD]) of segmented organs. The mean and standard deviation of intraobserver and interobserver DC and HD values were DC_intraobserver = 0.89 ± 0.12, HD_intraobserver = 3.6 mm ± 1.5, DC_interobserver = 0.89 ± 0.15, and HD_interobserver = 3.2 mm ± 1.4. Overall, metrics indicated good interobserver/intraobserver precision (mean DC > 0.7, mean HD < 4 mm). Results suggest that MRI offers good segmentation precision for abdominal sites. These findings support the utility of MRI for abdominal planning and localization, as emerging MRI technologies, techniques, and onboard imaging devices are beginning to enable MRI-based radiotherapy.
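The two precision metrics named in the abstract above can be illustrated on small point sets. This is a minimal sketch, assuming each segmentation is represented as a set of voxel coordinates and that the Hausdorff distance is the symmetric maximum over all points. The abstract does not describe its exact implementation (for example surface extraction or conversion to millimetres), so the function names and voxel sets here are hypothetical.

```python
from math import dist


def dice_coefficient(seg_a, seg_b):
    """Dice overlap 2|A n B| / (|A| + |B|) of two voxel sets."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(seg_a, seg_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    if not seg_a or not seg_b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst
        return max(min(dist(p, q) for q in dst) for p in src)

    return max(directed(seg_a, seg_b), directed(seg_b, seg_a))


# Hypothetical contours from two observers, in voxel index units
observer_1 = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}
observer_2 = {(1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)}

dc = dice_coefficient(observer_1, observer_2)
hd = hausdorff_distance(observer_1, observer_2)
print(f"Dice={dc:.2f} Hausdorff={hd:.1f} voxels")
```

The brute-force nearest-point search is quadratic in the number of points, which is adequate for illustration but would be slow on full organ volumes.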
Inter-observer and intra-observer agreement on interpretation of uroflowmetry curves of kindergarten children.

PubMed

Chang, Shang-Jen; Yang, Stephen S D

2008-12-01

To evaluate the inter-observer and intra-observer agreement on the interpretation of uroflowmetry curves of children. Healthy kindergarten children were enrolled for evaluation of uroflowmetry. Uroflowmetry curves were classified as bell-shaped, tower, plateau, staccato and interrupted. Only the bell-shaped curves were regarded as normal. Two urodynamists evaluated the curves independently after reviewing the definitions of the different types of uroflowmetry curve. The senior urodynamist evaluated the curves twice 3 months apart. The final conclusion was made when consensus was reached. Agreement among observers was analyzed using kappa statistics. Of 190 uroflowmetry curves eligible for analysis, the intra-observer agreement in interpreting each type of curve and interpreting normalcy vs abnormality was good (kappa=0.71 and 0.68, respectively). Very good inter-observer agreement (kappa=0.81) on normalcy and good inter-observer agreement (kappa=0.73) on types of uroflowmetry were observed. Poor inter-observer agreement existed on the classification of specific types of abnormal uroflowmetry curves (kappa=0.07). Uroflowmetry is a good screening tool for normalcy of kindergarten children, while not a good tool to define the specific types of abnormal uroflowmetry.
JOURNAL CLUB: Assessment of Interobserver Variability in the Peer Review Process: Should We Agree to Disagree?

PubMed

Verma, Nupur; Hippe, Daniel S; Robinson, Jeffrey D

2016-12-01

Peer review is an important and necessary part of radiology. There are several options to perform the peer review process. This study examines the reproducibility of peer review by comparing two scoring systems. American Board of Radiology-certified radiologists from various practice environments and subspecialties were recruited to score deidentified examinations on a web-based PACS with two scoring systems, RADPEER and Cleareview. Quantitative analysis of the scores was performed for interrater agreement. Interobserver variability was high for both the RADPEER and Cleareview scoring systems. The interobserver correlations (kappa values) were 0.17-0.23 for RADPEER and 0.10-0.16 for Cleareview. Interrater correlation was not statistically significantly different when comparing the RADPEER and Cleareview systems (p = 0.07-0.27). The kappa values were low for the Cleareview subscores when we evaluated for missed findings (0.26), satisfaction of search (0.17), and inadequate interpretation of findings (0.12). Our study confirms the previous report of low interobserver correlation when using the peer review process. There was low interobserver agreement seen when using both the RADPEER and the Cleareview scoring systems.

Understanding and Visualizing Multitasking and Task Switching Activities: A Time Motion Study to Capture Nursing Workflow

PubMed Central

Yen, Po-Yin; Kelley, Marjorie; Lopetegui, Marcelo; Rosado, Amber L.; Migliore, Elaina M.; Chipps, Esther M.; Buck, Jacalyn

2016-01-01

A fundamental understanding of multitasking within nursing workflow is important in today’s dynamic and complex healthcare environment. We conducted a time motion study to understand nursing workflow, specifically multitasking and task switching activities. We used TimeCaT, a comprehensive electronic time capture tool, to capture observational data. We established inter-observer reliability prior to data collection. We completed 56 hours of observation of 10 registered nurses. We found, on average, nurses had 124 communications and 208 hands-on tasks per 4-hour block of time. They multitasked (having communication and hands-on tasks simultaneously) 131 times, representing 39.48% of all times; the total multitasking duration ranges from 14.6 minutes to 109 minutes, 44.98 minutes (18.63%) on average. We also reviewed workflow visualization to uncover the multitasking events. Our study design and methods provide a practical and reliable approach to conducting and analyzing time motion studies from both quantitative and qualitative perspectives. PMID:28269924
Transcultural adaptation into Spanish of the Induction Compliance Checklist for assessing children's behaviour during induction of anaesthesia.

PubMed

Jerez-Molina, Carmen; Lázaro-Alcay, Juan J; Ullán-de la Fuente, Ana M

2017-10-17

Cross-cultural adaptation into Spanish of the Induction Compliance Checklist (ICC) for assessing children's behaviour during induction of anaesthesia. A descriptive cross-sectional observational study was conducted on a sample of 81 children aged 2 to 12 years operated in an ambulatory surgery unit of a paediatric hospital in Barcelona. Adaptation by translation-back translation of the tool and analysis of the scale's validity and reliability. Face validity of the tool was guaranteed through a discussion group and inter-observer reliability was evaluated, obtaining an intraclass correlation index of r = 0.956. The ICC scale validated for the Spanish population can be an effective tool for the presurgical evaluation of activities carried out to minimise children's anxiety. The ICC is an easy-to-use scale completed by operating room staff in one minute and would provide important information about children's behaviour, specifically during induction. Copyright © 2017 Elsevier España, S.L.U. All rights reserved.
Understanding and Visualizing Multitasking and Task Switching Activities: A Time Motion Study to Capture Nursing Workflow.

PubMed

Yen, Po-Yin; Kelley, Marjorie; Lopetegui, Marcelo; Rosado, Amber L; Migliore, Elaina M; Chipps, Esther M; Buck, Jacalyn

2016-01-01

A fundamental understanding of multitasking within nursing workflow is important in today's dynamic and complex healthcare environment. We conducted a time motion study to understand nursing workflow, specifically multitasking and task switching activities. We used TimeCaT, a comprehensive electronic time capture tool, to capture observational data. We established inter-observer reliability prior to data collection. We completed 56 hours of observation of 10 registered nurses. We found, on average, nurses had 124 communications and 208 hands-on tasks per 4-hour block of time. They multitasked (having communication and hands-on tasks simultaneously) 131 times, representing 39.48% of all times; the total multitasking duration ranges from 14.6 minutes to 109 minutes, 44.98 minutes (18.63%) on average. We also reviewed workflow visualization to uncover the multitasking events. Our study design and methods provide a practical and reliable approach to conducting and analyzing time motion studies from both quantitative and qualitative perspectives.
Ensuring relational competency in critical care: Importance of nursing students' communication skills.

PubMed

Sánchez Expósito, Judit; Leal Costa, César; Díaz Agea, José Luis; Carrillo Izquierdo, María Dolores; Jiménez Rodríguez, Diana

2018-02-01

The aim of this study was to analyse the communication skills of students in interactions with simulated critically-ill patients using a new assessment tool to study the relationships between communication skills, teamwork and clinical skills and to analyse the psychometric properties of the tool. A cross-sectional study was conducted to assess the communication skills of 52 students with critically-ill patients through the use of a new measurement tool to score video recordings of simulated clinical scenarios. The 52 students obtained low scores on their skills in communicating with patients. The reliability of the measuring instrument showed good inter-observer agreement (ICC between 0.71 and 0.90) and the validity yielded a positive correlation (p<0.01). The results provide evidence that nursing students lack skills when communicating with critically ill patients in simulated scenarios. The measuring instrument used is therefore deemed valid and reliable for assessing nursing students through a clinical simulation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Assessing physical activity during youth sport: the Observational System for Recording Activity in Children: Youth Sports.

PubMed

Cohen, Alysia; McDonald, Samantha; McIver, Kerry; Pate, Russell; Trost, Stewart

2014-05-01

The purpose of this study was to evaluate the validity and interrater reliability of the Observational System for Recording Activity in Children: Youth Sports (OSRAC:YS). Children (N = 29) participating in a parks and recreation soccer program were observed during regularly scheduled practices. Physical activity (PA) intensity and contextual factors were recorded by momentary time-sampling procedures (10-second observe, 20-second record). Two observers simultaneously observed and recorded children's PA intensity, practice context, social context, coach behavior, and coach proximity. Interrater reliability was based on agreement (Kappa) between the observer's coding for each category, and the Intraclass Correlation Coefficient (ICC) for percent of time spent in MVPA. Validity was assessed by calculating the correlation between OSRAC:YS estimated and objectively measured MVPA. Kappa statistics for each category demonstrated substantial to almost perfect interobserver agreement (Kappa = 0.67-0.93). The ICC for percent time in MVPA was 0.76 (95% C.I. = 0.49-0.90). A significant correlation (r = .73) was observed for MVPA recorded by observation and MVPA measured via accelerometry. The results indicate the OSRAC:YS is a reliable and valid tool for measuring children's PA and contextual factors during a youth soccer practice.
The reliability and predictive value of an amniotic fluid scoring system in severe second-trimester oligohydramnios.

PubMed

Moore, T R; Longo, J; Leopold, G R; Casola, G; Gosink, B B

1989-05-01

Sixty-two cases of oligohydramnios diagnosed by ultrasound between 13-28 weeks' gestation were reviewed. Three experienced ultrasonographers used a subjective scale to rate the oligohydramnios as mild, moderate, severe, or anhydramniotic. Interobserver reliability was excellent (intraclass correlation coefficient 0.81). The overall perinatal mortality rate was 43%, and the incidence of pulmonary hypoplasia was 33%. One-third had lethal congenital anomalies. The frequency of adverse outcome correlated strongly with the most severe degrees of oligohydramnios; 88% of the fetuses with severe oligohydramnios or anhydramnios had lethal outcomes, compared with 11% in the mild/moderate group. The presence of an anuric urinary tract anomaly was associated with the most severe grades of oligohydramnios and was uniformly fatal. Pulmonary hypoplasia was diagnosed in 60% of the severe group versus 6% in the moderate group. We conclude that subjective grading of oligohydramnios by experienced observers is both reliable and predictive of outcome. The finding of severe oligohydramnios in the second trimester is highly predictive of poor fetal outcome and should stimulate a thorough search for etiology and consideration of intervention. Moderate grades of reduced amniotic fluid may be managed with relative optimism.
Evidence-based dentistry: analysis of dental anxiety scales for children.

PubMed

Al-Namankany, A; de Souza, M; Ashley, P

2012-03-09

To review paediatric dental anxiety measures (DAMs) and assess the statistical methods used for validation and their clinical implications. A search of four computerised databases between 1960 and January 2011 associated with DAMs, using pre-specified search terms, to assess the method of validation including the reliability as intra-observer agreement 'repeatability or stability' and inter-observer agreement 'reproducibility' and all types of validity. Fourteen paediatric DAMs were predominantly validated in schools and not in the clinical setting while five of the DAMs were not validated at all. The DAMs that were validated were done so against other paediatric DAMs which may not have been validated previously. Reliability was not assessed in four of the DAMs. However, all of the validated studies assessed reliability which was usually 'good' or 'acceptable'. None of the current DAMs used a formal sample size technique. Diversity was seen between the studies ranging from a few simple pictograms to lists of questions reported by either the individual or an observer. To date there is no scale that can be considered as a gold standard, and there is a need to further develop an anxiety scale with a cognitive component for children and adolescents.
How Reliable is the Acetabular Cup Position Assessment from Routine Radiographs?

PubMed Central

Carvajal Alba, Jaime A.; Vincent, Heather K.; Sodhi, Jagdeep S.; Latta, Loren L.; Parvataneni, Hari K.

2017-01-01

Abstract Background: Cup position is crucial for optimal outcomes in total hip arthroplasty. Radiographic assessment of component position is routinely performed in the early postoperative period. Aims: The aims of this study were to determine in a controlled environment if routine radiographic methods accurately and reliably assess the acetabular cup position and to assess if there is a statistical difference related to the rater’s level of training. Methods: A pelvic model was mounted in a spatial frame. An acetabular cup was fixed in different degrees of version and inclination. Standardized radiographs were obtained. Ten observers including five fellowship-trained orthopaedic surgeons and five orthopaedic residents performed a blind assessment of cup position. Inclination was assessed from anteroposterior radiographs of the pelvis and version from cross-table lateral radiographs of the hip. Results: The radiographic methods used proved imprecise, especially when the cup was positioned at the extremes of version and inclination. Excellent inter-observer reliability (intra-class coefficient > 0.9) was found. There were no differences related to the level of training of the raters. Conclusions: These widely used radiographic methods should be interpreted cautiously and computed tomography should be utilized in cases when further intervention is contemplated. PMID:28852355
Coronary artery disease reporting and data system (CAD-RADS™): Inter-observer agreement for assessment categories and modifiers.

PubMed

Maroules, Christopher D; Hamilton-Craig, Christian; Branch, Kelley; Lee, James; Cury, Roberto C; Maurovich-Horvat, Pál; Rubinshtein, Ronen; Thomas, Dustin; Williams, Michelle; Guo, Yanshu; Cury, Ricardo C

The Coronary Artery Disease Reporting and Data System (CAD-RADS) provides a lexicon and standardized reporting system for coronary CT angiography. To evaluate inter-observer agreement of the CAD-RADS among a panel of early career and expert readers. Four early career and four expert cardiac imaging readers prospectively and independently evaluated 50 coronary CT angiography cases using the CAD-RADS lexicon. All readers assessed image quality using a five-point Likert scale, with mean Likert score ≥4 designating high image quality, and <4 designating moderate/low image quality. All readers were blinded to medical history and invasive coronary angiography findings. Inter-observer agreement for CAD-RADS assessment categories and modifiers was assessed using intra-class correlation (ICC) and Fleiss' Kappa (κ). The impact of reader experience and image quality on inter-observer agreement was also examined. Inter-observer agreement for CAD-RADS assessment categories was excellent (ICC 0.958, 95% CI 0.938-0.974, p < 0.0001). Agreement among expert readers (ICC 0.925, 95% CI 0.884-0.954) was marginally stronger than for early career readers (ICC 0.904, 95% CI 0.852-0.941), both p < 0.0001. High image quality was associated with stronger agreement than moderate image quality (ICC 0.944, 95% CI 0.886-0.974 vs. ICC 0.887, 95% CI 0.775-0.95, both p < 0.0001). While excellent inter-observer agreement was observed for modifiers S (stent) and G (bypass graft) (both κ = 1.0), only fair agreement (κ = 0.40) was observed for modifier V (high risk plaque). Inter-observer reproducibility of CAD-RADS assessment categories and modifiers is excellent, except for high-risk plaque (modifier V) which demonstrates fair agreement. These results suggest CAD-RADS is feasible for clinical implementation. Copyright © 2017. Published by Elsevier Inc.
Validation of a standardized mapping system of the hip joint for radial MRA sequencing.

PubMed

Klenke, Frank M; Hoffmann, Daniel B; Cross, Brian J; Siebenrock, Klaus A

2015-03-01

Intraarticular gadolinium-enhanced magnetic resonance arthrography (MRA) is commonly applied to characterize morphological disorders of the hip. However, the reproducibility of retrieving anatomic landmarks on MRA scans and their correlation with intraarticular pathologies is unknown. A precise mapping system for the exact localization of hip pathomorphologies with radial MRA sequences is lacking. Therefore, the purpose of the study was the establishment and validation of a reproducible mapping system for radial sequences of hip MRA. Sixty-nine consecutive intraarticular gadolinium-enhanced hip MRAs were evaluated. Radial sequencing consisted of 14 cuts orientated along the axis of the femoral neck. Three orthopedic surgeons read the radial sequences independently. Each MRI was read twice with a minimum interval of 7 days from the first reading. The intra- and inter-observer reliability of the mapping procedure was determined. A clockwise system for hip MRA was established. The teardrop figure served to determine the 6 o'clock position of the acetabulum; the center of the greater trochanter served to determine the 12 o'clock position of the femoral head-neck junction. The intra- and inter-observer ICCs to retrieve the correct 6/12 o'clock positions were 0.906-0.996 and 0.978-0.988, respectively. The established mapping system for radial sequences of hip joint MRA is reproducible and easy to perform.
Automated 3D ultrasound measurement of the angle of progression in labor.

PubMed

Montaguti, Elisa; Rizzo, Nicola; Pilu, Gianluigi; Youssef, Aly

2018-01-01

To assess the feasibility and reliability of an automated technique for the assessment of the angle of progression (AoP) in labor by using three-dimensional (3D) ultrasound. AoP was assessed by using 3D transperineal ultrasound by two operators in 52 women in active labor to evaluate intra- and interobserver reproducibility. Furthermore, intermethod agreement between automated and manual techniques on 3D images, and between automated technique on 3D vs 2D images were evaluated. Automated measurements were feasible in all cases. Automated measurements were considered acceptable in 141 (90.4%) out of the 156 on the first assessments and in all 156 after repeating measurements for unacceptable evaluations. The automated technique on 3D images demonstrated good intra- and interobserver reproducibility. The 3D-automated technique showed a very good agreement with the 3D manual technique. Notably, AoP calculated with the 3D automated technique were significantly wider in comparison with those measured manually on 3D images (133 ± 17° vs 118 ± 21°, p = 0.013). The assessment of the angle of progression through 3D ultrasound is highly reproducible. However, automated software leads to a systematic overestimation of AoP in comparison with the standard manual technique thus hindering its use in clinical practice in its present form.
Ultrasound functional evaluation of fetuses with myelomeningocele: study of the interpretation of results.

PubMed

Maroto, A; Illescas, T; Meléndez, M; Arévalo, S; Rodó, C; Peiró, J L; Belfort, M; Cuxart, A; Carreras, E

2017-10-01

To assess the reliability of the interpretation of a new technique for the ultrasound evaluation of the level of neurological lesion in fetuses with myelomeningocele. Observational study including myelomeningocele fetuses, referred to our center for the sonographic assessment of the fetal lower-limb movements, made and recorded by an expert in Maternal-fetal medicine and a specialist in Rehabilitation. Two observers, with different levels of expertise and blinded to each other's results, interpreted each recorded scan two different times. The agreement for the segmental levels assigned between the observers and the gold standard, the inter-observer and intra-observer reproducibility were tested using the weighted kappa (wκ) index. Twenty-eight scans were recorded and evaluated. The agreement between the observers and the gold standard remained constant for the expert observer (wκ = 0.82) and increased (from wκ = 0.66 to wκ = 0.72) for the other one. The inter-observer and the intra-observer variability for the expert observer were wκ = 0.72 and wκ = 0.94, respectively. The agreement for the prenatal evaluation of the segmental neurological level was excellent, after a short training period, for observers with different degrees of expertise. The interpretation of this technique is reproducible enough and this supports its value for the prediction of postnatal motor function in myelomeningocele fetuses.
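For ordinal ratings such as segmental levels, a weighted kappa penalises near-misses less than distant disagreements. The sketch below is one possible formulation, assuming linear disagreement weights |i - j| / (k - 1); the abstract does not state which weighting scheme was used, and the ratings, level coding (0 to k - 1), and function name are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_levels):
    """Linearly weighted kappa for ordinal ratings coded 0 .. n_levels - 1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Joint proportion table of the two raters' scores.
    table = [[0.0] * n_levels for _ in range(n_levels)]
    for i, j in zip(rater_a, rater_b):
        table[i][j] += 1 / n
    row = [sum(table[i]) for i in range(n_levels)]
    col = [sum(table[i][j] for i in range(n_levels)) for j in range(n_levels)]
    observed = expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = abs(i - j) / (n_levels - 1)  # assumed linear weights
            observed += weight * table[i][j]
            expected += weight * row[i] * col[j]
    if expected == 0.0:
        return 1.0  # every rating falls in one category
    return 1 - observed / expected


# Hypothetical ratings of six scans on a four-level ordinal scale.
observer_x = [0, 1, 2, 3, 1, 2]
observer_y = [0, 1, 2, 2, 1, 3]

kw = weighted_kappa(observer_x, observer_y, n_levels=4)
print(f"weighted kappa = {kw:.3f}")
```

Quadratic weights, ((i - j) / (k - 1)) squared, are a common alternative and would give a different value on the same ratings.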
[Identification of adverse events in hospitalised influenza patients].

PubMed

Aranaz-Andrés, J M; Gea-Velázquez de Castro, M T; Jiménez-Pericás, F; Balbuena-Segura, A I; Meyer-García, M C; López-Fresneña, N; Miralles-Bueno, J J; Obón-Azuara, B; Moliner-Lahoz, J; Aibar-Remón, C

2015-01-01

To test the inter-observer agreement in identifying adverse events (AE) in patients hospitalized with flu and undergoing precautionary isolation measures. Historical cohort study, 50 patients undergoing isolation measures due to flu, and 50 patients without any isolation measures. The AE incidence ranges from 10 to 26% depending on the observer (26% [95%CI: 17.4%-34.60%], 10% [95%CI: 4.12%-15.88%], and 23% [95%CI: 14.75%-31.25%]). It was always lower in the cohort undergoing the isolation measures. This difference is statistically significant when the accurate definition of a case is applied. The agreement as regards the screening was good (higher than 76%; Kappa index between 0.29 and 0.81). The agreement as regards the accurate identification of AE related to care was lower (from 50 to 93.3%, Kappa index from 0.20 to 0.70). Before performing an epidemiological study on AE, interobserver concordance must be analyzed to improve the accuracy of the results and the validity of the study. Studies have different levels of reliability. Kappa index shows high levels for the screening guide, but not for the identification of AE. Without a good methodology the results achieved, and thus the decisions made from them, cannot be guaranteed. Researchers have to be sure of the method used, which should be as close as possible to the optimal achievable. Copyright © 2014 SECA. Published by Elsevier Espana. All rights reserved.
Interobserver agreement on Poser's and the new McDonald's diagnostic criteria for multiple sclerosis.

PubMed

Zipoli, V; Portaccio, E; Siracusa, G; Pracucci, G; Sorbi, S; Amato, M P

2003-10-01

We assessed the interobserver agreement on the diagnosis of multiple sclerosis (MS) in a study sample consisting of 41 MS (15 relapsing remitting, two secondary progressive, five primary progressive and 19 presenting their first clinical attack) and three non-MS cases. Clinical and paraclinical information was recorded in standardized forms. Four neurologists were asked to make a diagnosis using Poser's and McDonald's criteria and to assess MRI scans according to the McDonald's guidelines. In terms of the kappa statistic (kappa), we found a moderate agreement on the overall diagnosis using both Poser's and McDonald's criteria (kappa, respectively 0.57 and 0.52). As for distinct diagnostic categories, we observed a moderate to substantial agreement for the three McDonald categories (range of kappa values 0.49-0.64) and a fair to substantial agreement for the nine Poser categories (range of kappa values 0.37-0.67). Taking into account clinical information, the agreement on dissemination over time was substantially higher (kappa = 0.69) than that found on dissemination over space (kappa = 0.46). In contrast, for MRI assessment, the agreement for spatial dissemination was substantial (kappa = 0.74) compared with the fair agreement (kappa = 0.25) yielded by dissemination over time. The new McDonald's criteria yield a good overall diagnostic reliability, and compare favourably with Poser's classification in terms of agreement on distinct diagnostic categories.
Detection of vascularity in wrist tenosynovitis: power doppler ultrasound compared with contrast-enhanced grey-scale ultrasound.

PubMed

Klauser, Andrea S; Franz, Magdalena; Arora, Rohit; Feuchtner, Gudrun M; Gruber, Johann; Schirmer, Michael; Jaschke, Werner R; Gabl, Markus F

2010-01-01

We sought to assess vascularity in wrist tenosynovitis by using power Doppler ultrasound (PDUS) and to compare detection of intra- and peritendinous vascularity with that of contrast-enhanced grey-scale ultrasound (CEUS). Twenty-six tendons of 24 patients (nine men, 15 women; mean age ± SD, 54.4 ± 11.8 years) with a clinical diagnosis of tenosynovitis were examined with B-mode ultrasonography, PDUS, and CEUS by using a second-generation contrast agent, SonoVue (Bracco Diagnostics, Milan, Italy) and a low-mechanical-index ultrasound technique. Thickness of synovitis, extent of vascularized pannus, intensity of peritendinous vascularisation, and detection of intratendinous vessels was incorporated in a 3-score grading system (grade 0 to 2). Interobserver variability was calculated. With CEUS, a significantly greater extent of vascularity could be detected than by using PDUS (P < 0.001). In terms of peri- and intratendinous vessels, CEUS was significantly more sensitive in the detection of vascularization compared with PDUS (P < 0.001). No significant correlation between synovial thickening and extent of vascularity could be found (P = 0.089 to 0.097). Interobserver reliability was calculated to be excellent when evaluating the grading score (κ = 0.811 to 1.00). CEUS is a promising tool to detect tendon vascularity with higher sensitivity than PDUS by improved detection of intra- and peritendinous vascularity.
Fully automatic measurements of axial vertebral rotation for assessment of spinal deformity in idiopathic scoliosis

NASA Astrophysics Data System (ADS)

Forsberg, Daniel; Lundström, Claes; Andersson, Mats; Vavruch, Ludvig; Tropp, Hans; Knutsson, Hans

2013-03-01

Reliable measurements of spinal deformities in idiopathic scoliosis are vital, since they are used for assessing the degree of scoliosis, deciding upon treatment and monitoring the progression of the disease. However, commonly used two dimensional methods (e.g. the Cobb angle) do not fully capture the three dimensional deformity at hand in scoliosis, of which axial vertebral rotation (AVR) is considered to be of great importance. There are manual methods for measuring the AVR, but they are often time-consuming and associated with high intra- and inter-observer variability. In this paper, we present a fully automatic method for estimating the AVR in images from computed tomography. The proposed method is evaluated on four scoliotic patients with 17 vertebrae each and compared with manual measurements performed by three observers using the standard method by Aaro-Dahlborn. The comparison shows that the difference in measured AVR between automatic and manual measurements is on the same level as the inter-observer difference. This is further supported by a high intraclass correlation coefficient (0.971-0.979), obtained when comparing the automatic measurements with the manual measurements of each observer. Hence, the provided results and the computational performance, only requiring approximately 10 to 15 s for processing an entire volume, demonstrate the potential clinical value of the proposed method.
Quantitative evaluation of fatty degeneration of the supraspinatus and infraspinatus muscles using T2 mapping.

PubMed

Matsuki, Keisuke; Watanabe, Atsuya; Ochiai, Shunsuke; Kenmoku, Tomonori; Ochiai, Nobuyasu; Obata, Takayuki; Toyone, Tomoaki; Wada, Yuichi; Okubo, Toshiyuki

2014-05-01

Although fatty degeneration of the rotator cuff muscles has been reported to affect the outcomes of rotator cuff repairs, only a few studies have attempted to quantitatively evaluate this degeneration. T2 mapping is a quantitative magnetic resonance imaging technique that potentially evaluates the concentration of fat in muscles. The purpose of this study was to investigate fatty degeneration of the rotator cuff muscles by using T2 mapping, as well as to evaluate the reliability of T2 measurement. We obtained magnetic resonance images including T2 mapping from 184 shoulders (180 patients; 110 male patients [112 shoulders] and 70 female patients [72 shoulders]; mean age, 62 years [range, 16-84 years]). Eighty-three shoulders had no rotator cuff tear (group A), whereas 101 shoulders had tears, of which 62 were incomplete to medium (group B) and 39 were large to massive (group C). T2 values of the supraspinatus and infraspinatus muscles were measured and compared among groups. Intraobserver and interobserver variabilities also were examined. The mean T2 values of the supraspinatus in groups A, B, and C were 36.3 ± 4.7 milliseconds, 44.2 ± 11.3 milliseconds, and 57.0 ± 18.8 milliseconds, respectively. The mean T2 values of the infraspinatus in groups A, B, and C were 36.1 ± 5.1 milliseconds, 40.0 ± 11.1 milliseconds, and 51.9 ± 18.2 milliseconds, respectively. The T2 value significantly increased with the extent of the tear in both muscles. Both intraobserver and interobserver variabilities were more than 0.99. T2 mapping can be a reliable tool to quantify fatty degeneration of the rotator cuff muscles. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
High-resolution 3 T MRI of traumatic and degenerative triangular fibrocartilage complex (TFCC) abnormalities using Palmer and Outerbridge classifications.

PubMed

Nozaki, T; Rafijah, G; Yang, L; Ueno, T; Horiuchi, S; Hitt, D; Yoshioka, H

2017-10-01

To investigate the usefulness of high-resolution 3 T magnetic resonance imaging (MRI) for the evaluation of traumatic and degenerative triangular fibrocartilage complex (TFCC) abnormalities among three groups: patients presenting with wrist pain who were (a) younger than age 50 years or (b) age 50 or older (PT<50 and PT≥50, respectively), and (c) asymptomatic controls who were younger than age 50 years (AC). High-resolution 3 T MRI was evaluated retrospectively in 96 patients, including 47 PT<50, 38 PT≥50, and 11 AC. Two board-certified radiologists reviewed the MRI images independently. MRI features of TFCC injury were analysed according to the Palmer classification, and cartilage degeneration around the TFCC was evaluated using the Outerbridge classification. Differences in MRI findings among these groups were detected using chi-square test. Cohen's kappa was calculated to assess interobserver and intra-observer reliability. The incidence of Palmer class 1A, 1C and 1D traumatic TFCC injury was significantly (p<0.05) higher in PT≥50 than in PT<50 (class 1A: 47.4% versus 27.7%, class 1C: 31.6% versus 12.8%, and class 1D: 21.1% versus 2.1%). Likewise, MRI findings of TFCC degeneration were observed more frequently in PT≥50 than in PT<50 (p<0.01). Outerbridge grade 2 or higher cartilage degeneration was significantly (p<0.01) more frequently seen in PT≥50 than in PT<50 (55.3% versus 17% in the lunate, 28.9% versus 4.3% in the triquetrum, 73.7% versus 12.8% in the ulna). High-resolution wrist MRI at 3 T enables detailed evaluation of TFCC traumatic injury and degenerative changes using the Palmer and Outerbridge classifications, with good or excellent interobserver and intra-observer reliability. Copyright © 2017 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
Feasibility of four-dimensional preoperative simulation for elbow debridement arthroplasty.

PubMed

Yamamoto, Michiro; Murakami, Yukimi; Iwatsuki, Katsuyuki; Kurimoto, Shigeru; Hirata, Hitoshi

2016-04-02

Recent advances in imaging modalities have enabled three-dimensional preoperative simulation. A four-dimensional preoperative simulation system would be useful for debridement arthroplasty of primary degenerative elbow osteoarthritis because it would be able to detect the impingement lesions. We developed a four-dimensional simulation system by adding the anatomical axis to the three-dimensional computed tomography scan data of the affected arm in one position. Eleven patients with primary degenerative elbow osteoarthritis were included. A "two rings" method was used to calculate the flexion-extension axis of the elbow by converting the surface of the trochlea and capitellum into two rings. A four-dimensional simulation movie was created and showed the optimal range of motion and the impingement area requiring excision. To evaluate the reliability of the flexion-extension axis, interobserver and intraobserver reliabilities regarding the assessment of bony overlap volumes were calculated twice for each patient by two authors. Patients were treated by open or arthroscopic debridement arthroplasties. Pre- and postoperative examinations included elbow range of motion measurement, and completion of the patient-rated questionnaire Hand20, Japanese Orthopaedic Association-Japan Elbow Society Elbow Function Score, and the Mayo Elbow Performance Score. Measurement of the bony overlap volume showed an intraobserver intraclass correlation coefficient of 0.93 and 0.90, and an interobserver intraclass correlation coefficient of 0.94. The mean elbow flexion-extension arc significantly improved from 101° to 125°. The mean Hand20 score significantly improved from 52 to 22. The mean Japanese Orthopaedic Association-Japan Elbow Society Elbow Function Score significantly improved from 67 to 88. The mean Mayo Elbow Performance Score significantly improved from 71 to 91 at the final follow-up evaluation. 
We showed that four-dimensional, preoperative simulation can be generated by adding the rotation axis to the one-position, three-dimensional computed tomography image of the affected arm. This method is feasible for elbow debridement arthroplasty.
Pilot study on objective measurement of abdominal wall strength in patients with ventral incisional hernia.

PubMed

Parker, Michael; Goldberg, Ross F; Dinkins, Maryane M; Asbun, Horacio J; Daniel Smith, C; Preissler, Susanne; Bowers, Steven P

2011-11-01

Outcomes after ventral incisional hernia (VIH) repair are measured by recurrence rate and subjective measures. No objective metrics evaluate functional outcomes after abdominal wall reconstruction. This study aimed to develop testing of abdominal wall strength (AWS) that could be validated as a useful metric. Data were prospectively collected during 9 months from 35 patients. A total of 10 patients were evaluated before and after VIH repair, for a total of 45 encounters. The patients were tested simultaneously or in succession by two of three examiners. Data were collected for three tests: double leg lowering (DLL), trunk raising (TR), and supine reaching (SR). Raw data were compared and tested for validity, and continuous data were transformed to categorical data. Agreement was measured using the intraclass correlation coefficient (ICC) for DLL and using kappa for the ordinal measures. Simultaneous testing yielded the following interobserver reliability: DLL (0.96 and 0.87), TR (1.00 and 0.95), and SR (0.76). Reproducibility was assessed by consecutive tests, with correlation as follows: DLL (0.81), TR (0.81), and RCH (0.21). Due to poor interobserver reliability for the SR test compared with the DLL and TR tests, the SR test was excluded from calculation of an overall score. Based on raw data distribution from the DLL and TR tests, the DLL data were categorized into 10° increments, allowing construction of a 10-point score. The median AWS score was 5 (interquartile range [IQR], 4-7), and there was agreement within 1 point for 42 of the 45 encounters (93%). The findings from this study demonstrate that the 10-point AWS score may measure AWS in an accurate and reproducible fashion, with potential for objective description of abdominal wall function of VIH patients. This score may help to identify patients suited for abdominal wall reconstruction while measuring progress after VIH repair. Further longitudinal outcomes studies are needed.

Readability of Spine-Related Patient Education Materials From Leading Orthopedic Academic Centers.

PubMed

Ryu, Justine H; Yi, Paul H

2016-05-01

Cross-sectional analysis of online spine-related patient education materials from leading academic centers. To assess the readability levels of spine surgery-related patient education materials available on the websites of academic orthopedic surgery departments. The Internet is becoming an increasingly popular resource for patient education. Yet many previous studies have found that Internet-based orthopedic-related patient education materials from subspecialty societies are written at a level too difficult for the average American; however, no prior study has assessed the readability of spine surgery-related patient educational materials from leading academic centers. All spine surgery-related articles from the online patient education libraries of the top five US News & World Report-ranked orthopedic institutions were assessed for readability using the Flesch-Kincaid (FK) readability test. Mean readability levels of articles amongst the five academic institutions and articles were compared. We also determined the number of articles with readability levels at or below the recommended sixth- or eighth-grade levels. Intraobserver and interobserver reliability of readability assessment were assessed. A total of 122 articles were reviewed. The mean overall FK grade level was 11.4; the difference in mean FK grade level between each department varied significantly (range, 9.3-13.4; P < 0.0001). Twenty-three articles (18.9%) had a readability level at or below the eighth grade level, and only one (0.8%) was at or below the sixth grade level. Intraobserver and interobserver reliability were both excellent (intraclass correlation coefficient of 1 for both). Online patient education materials related to spine from academic orthopedic centers are written at a level too high for the average patient, consistent with spine surgery-related patient education materials provided by the American Academy of Orthopaedic Surgeons and spine subspecialty societies. 
This study highlights the potential difficulties patients might have in reading and comprehending the information in publicly available education materials related to spine. N/A.
Evaluation of the iPhone with an acrylic sleeve versus the Scoliometer for rib hump measurement in scoliosis.

PubMed

Izatt, Maree T; Bateman, Gary R; Adam, Clayton J

2012-07-30

Vertebral rotation found in structural scoliosis contributes to truncal asymmetry, which is commonly measured with a simple Scoliometer device on a patient's thorax in the forward flexed position. The new generation of mobile 'smartphones' has an integrated accelerometer, making accurate angle measurement possible, which provides a potentially useful clinical tool for assessing rib hump deformity. This study aimed to compare rib hump angle measurements performed using a Smartphone and traditional Scoliometer on a set of plaster torsos representing the range of torsional deformities seen in clinical practice. Nine observers measured the rib hump found on eight plaster torsos moulded from scoliosis patients with both a Scoliometer and an Apple iPhone on separate occasions. Each observer repeated the measurements at least a week after the original measurements, and was blinded to previous results. Intra-observer reliability and inter-observer reliability were analysed using the method of Bland and Altman and 95% confidence intervals were calculated. The Intra-Class Correlation Coefficients (ICC) were calculated for repeated measurements of each of the eight plaster torso moulds by the nine observers. Mean absolute difference between pairs of iPhone/Scoliometer measurements was 2.1 degrees, with a small (1 degree) bias toward higher rib hump angles with the iPhone. 95% confidence intervals for intra-observer variability were +/- 1.8 degrees (Scoliometer) and +/- 3.2 degrees (iPhone). 95% confidence intervals for inter-observer variability were +/- 4.9 degrees (iPhone) and +/- 3.8 degrees (Scoliometer). The measurement errors and confidence intervals found were similar to or better than the range of previously published thoracic rib hump measurement studies. The iPhone is a clinically equivalent rib hump measurement tool to the Scoliometer in spinal deformity patients. 
The novel use of plaster torsos as rib hump models avoids the variables of patient fatigue and discomfort, inconsistent positioning and deformity progression using human subjects in a single or multiple measurement sessions.
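The Bland and Altman analysis named in this abstract can be sketched as follows. This is a minimal illustration only: the function name `bland_altman` and the angle values are hypothetical and are not data from the study, and the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev

def bland_altman(a, b):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)   # systematic offset between the two devices
    sd = stdev(diffs)    # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical rib hump angles in degrees (illustrative only, not study data).
iphone = [6.0, 9.0, 12.0, 15.0, 8.0, 11.0, 14.0, 7.0]
scoliometer = [5.0, 8.0, 12.0, 13.0, 8.0, 10.0, 12.0, 6.0]

bias, lower, upper = bland_altman(iphone, scoliometer)
print(f"bias = {bias:.2f} deg, limits of agreement = ({lower:.2f}, {upper:.2f}) deg")
```

A positive bias here would correspond to the first device reading higher on average, which is the kind of offset the abstract reports for the iPhone.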
Evaluation of the iPhone with an acrylic sleeve versus the Scoliometer for rib hump measurement in scoliosis

PubMed Central

2012-01-01

Background: Vertebral rotation found in structural scoliosis contributes to truncal asymmetry, which is commonly measured with a simple Scoliometer device on a patient's thorax in the forward flexed position. The new generation of mobile 'smartphones' has an integrated accelerometer, making accurate angle measurement possible, which provides a potentially useful clinical tool for assessing rib hump deformity. This study aimed to compare rib hump angle measurements performed using a Smartphone and traditional Scoliometer on a set of plaster torsos representing the range of torsional deformities seen in clinical practice. Methods: Nine observers measured the rib hump found on eight plaster torsos moulded from scoliosis patients with both a Scoliometer and an Apple iPhone on separate occasions. Each observer repeated the measurements at least a week after the original measurements, and was blinded to previous results. Intra-observer reliability and inter-observer reliability were analysed using the method of Bland and Altman and 95% confidence intervals were calculated. The Intra-Class Correlation Coefficients (ICC) were calculated for repeated measurements of each of the eight plaster torso moulds by the nine observers. Results: Mean absolute difference between pairs of iPhone/Scoliometer measurements was 2.1 degrees, with a small (1 degree) bias toward higher rib hump angles with the iPhone. 95% confidence intervals for intra-observer variability were +/- 1.8 degrees (Scoliometer) and +/- 3.2 degrees (iPhone). 95% confidence intervals for inter-observer variability were +/- 4.9 degrees (iPhone) and +/- 3.8 degrees (Scoliometer). The measurement errors and confidence intervals found were similar to or better than the range of previously published thoracic rib hump measurement studies. Conclusions: The iPhone is a clinically equivalent rib hump measurement tool to the Scoliometer in spinal deformity patients. 
The novel use of plaster torsos as rib hump models avoids the variables of patient fatigue and discomfort, inconsistent positioning and deformity progression using human subjects in a single or multiple measurement sessions. PMID:22846346
Monitoring acute equine visceral pain with the Equine Utrecht University Scale for Composite Pain Assessment (EQUUS-COMPASS) and the Equine Utrecht University Scale for Facial Assessment of Pain (EQUUS-FAP): A scale-construction study.

PubMed

van Loon, Johannes P A M; Van Dierendonck, Machteld C

2015-12-01

Although recognition of equine pain has been studied extensively over the past decades there is still need for improvement in objective identification of pain in horses with acute colic. This study describes scale construction and clinical applicability of the Equine Utrecht University Scale for Composite Pain Assessment (EQUUS-COMPASS) and the Equine Utrecht University Scale for Facial Assessment of Pain (EQUUS-FAP) in horses with acute colic. A cohort follow-up study was performed using 50 adult horses (n = 25 with acute colic, n = 25 controls). Composite pain scores were assessed by direct observations, Visual Analog Scale (VAS) scores were assessed from video clips. Colic patients were assessed at arrival, and on the first and second mornings after arrival. Both the EQUUS-COMPASS and EQUUS-FAP scores showed high inter-observer reliability (ICC = 0.98 for EQUUS-COMPASS, ICC = 0.93 for EQUUS-FAP, P <0.001), while a moderate inter-observer reliability for the VAS scores was found (ICC = 0.63, P <0.001). The cut-off value for differentiation between healthy and colic horses for the EQUUS-COMPASS was 5, and for differentiation between conservatively treated and surgically treated or euthanased patients it was 11. For the EQUUS-FAP, cut-off values were 4 and 6, respectively. Internal sensitivity and specificity were good for both EQUUS-COMPASS (sensitivity 95.8%, specificity 84.0%) and EQUUS-FAP (sensitivity 87.5%, specificity 88.0%). The use of the EQUUS-COMPASS and EQUUS-FAP enabled repeated and objective scoring of pain in horses with acute colic. A follow-up study with new patients and control animals will be performed to further validate the constructed scales that are described in this study. Copyright © 2015 Elsevier Ltd. All rights reserved.
In vitro comparison of water displacement method and 3 Tesla MRI for MR-volumetry of the olfactory bulb: which sequence is appropriate?

PubMed

Burmeister, Hartmut Peter; Möslein, Constanze; Bitter, Thomas; Fröber, Rosemarie; Herrmann, Karl-Heinz; Baltzer, Pascal Andreas Thomas; Gudziol, Hilmar; Dietzel, Matthias; Guntinas-Lichius, Orlando; Kaiser, Werner Alois

2011-10-01

Magnetic resonance imaging olfactory bulb (OB) volumetry (OBV) is already used as a complementary prognostic tool to assess olfactory disorders. However, a reference standard in imaging for OBV has not been established. The aim of this in vitro study was to compare volumetric results of different magnetic resonance sequences for OBV at 3 T to genuine OB volumes measured by water displacement. The volumes of 15 human cadaveric OBs were measured using the water displacement method in this institutional review board-approved prospective study. The magnetic resonance imaging protocol at 3 T included constructive interference in steady state (CISS), T2-weighted (T2w) three-dimensional (3D) sampling perfection with application-optimized contrasts using different flip-angle evolutions (SPACE), T2w two-dimensional (2D) turbo spin-echo (TSE), and T1-weighted (T1w) 3D fast low-angle shot (FLASH) sequences. Two blinded observers independently performed two OB volumetric assessments per bulbus and sequence. Intraobserver and interobserver reliabilities were assessed by intraclass correlation coefficients. Bland-Altman plots were analyzed to evaluate systematic biases and concordance correlation coefficients to assess reproducibility. For both observers, intraclass correlation coefficient analysis yielded almost perfect results for intraobserver reliability (CISS, 0.94-0.98; T2w 3D SPACE, 0.93-0.98; T2w 2D TSE, 0.98-0.98; T1w 3D FLASH, 0.95-0.99). Interobserver reliability showed almost perfect agreement for all sequences (CISS, 0.98; T2w 3D SPACE, 0.89; T2w 2D TSE, 0.93; T1w 3D FLASH, 0.97). The CISS sequence yielded the highest mean concordance correlation coefficient (0.95) and the highest combination of precision (0.97) and accuracy (0.98) values. In comparison with the water displacement method, Bland-Altman analyses revealed the lowest systematic bias (-0.5%) for the CISS sequence, followed by T1w 3D FLASH (-1.3%), T2w 3D SPACE (-7.5%), and T2w 2D TSE (-10.9%) sequences. 
Compared to the water displacement method, the CISS sequence is suited best to validly and reliably measure OB volumes because of its highest values for accuracy and precision and lowest systematic bias. Copyright © 2011 AUR. Published by Elsevier Inc. All rights reserved.
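The concordance correlation coefficient reported here factors into a precision term (Pearson's r) and an accuracy term (the bias correction factor C_b), which is how a CCC near 0.95 can sit alongside precision 0.97 and accuracy 0.98 (0.97 × 0.98 ≈ 0.95). A minimal sketch of Lin's formulation, assuming population (1/n) moments; the function name `lin_ccc` and the volumes are hypothetical, not study data:

```python
from statistics import fmean

def lin_ccc(x, y):
    """Return (ccc, precision, accuracy) for paired measurements.

    precision is Pearson's r; accuracy is the bias correction factor C_b;
    ccc = precision * accuracy.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n   # population variance of x
    syy = sum((b - my) ** 2 for b in y) / n   # population variance of y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    if sxx == 0 or syy == 0:
        raise ValueError("CCC is undefined when one series is constant")
    precision = sxy / (sxx * syy) ** 0.5
    accuracy = 2 * (sxx * syy) ** 0.5 / (sxx + syy + (mx - my) ** 2)
    return precision * accuracy, precision, accuracy

# Hypothetical volumes in microlitres (illustrative only, not study data).
water_displacement = [40.0, 55.0, 62.0, 48.0, 70.0]
mri_volumetry = [39.0, 56.0, 60.0, 47.0, 71.0]
ccc, precision, accuracy = lin_ccc(water_displacement, mri_volumetry)
print(f"CCC = {ccc:.3f} (precision {precision:.3f}, accuracy {accuracy:.3f})")
```

Unlike Pearson's r alone, the accuracy term penalises a constant offset or scale difference between the two methods, so perfectly correlated but shifted readings score below 1.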
Evaluation of bleach-sedimentation for sterilising and concentrating Mycobacterium tuberculosis in sputum specimens

PubMed Central

2011-01-01

Background: Bleach-sedimentation may improve microscopy for diagnosing tuberculosis by sterilising sputum and concentrating Mycobacterium tuberculosis. We studied gravity bleach-sedimentation effects on safety, sensitivity, speed and reliability of smear-microscopy. Methods: This blinded, controlled study used sputum specimens (n = 72) from tuberculosis patients. Bleach concentrations and exposure times required to sterilise sputum (n = 31) were determined. In the light of these results, the performance of 5 gravity bleach-sedimentation techniques that sterilise sputum specimens (n = 16) were compared. The best-performing of these bleach-sedimentation techniques involved adding 1 volume of 5% bleach to 1 volume of sputum, shaking for 10 minutes, diluting in 8 volumes distilled water and sedimenting overnight before microscopy. This technique was further evaluated by comparing numbers of visible acid-fast bacilli, slide-reading speed and reliability for triplicate smears before versus after bleach-sedimentation of sputum specimens (n = 25). Triplicate smears were made to increase precision and were stained using the Ziehl-Neelsen method. Results: M. tuberculosis in sputum was successfully sterilised by adding equal volumes of 15% bleach for 1 minute, 6% for 5 minutes or 3% for 20 minutes. Bleach-sedimentation significantly decreased the number of acid-fast bacilli visualised compared with conventional smears (geometric mean of acid-fast bacilli per 100 microscopy fields 166, 95%CI 68-406, versus 346, 95%CI 139-862, respectively; p = 0.02). Bleach-sedimentation diluted paucibacillary specimens less than specimens with higher concentrations of visible acid-fast bacilli (p = 0.02). Smears made from bleach-sedimented sputum were read more rapidly than conventional smears (9.6 versus 11.2 minutes, respectively, p = 0.03). 
Counting conventional acid-fast bacilli had high reliability (inter-observer agreement, r = 0.991) that was significantly reduced (p = 0.03) by bleach-sedimentation (to r = 0.707) because occasional strongly positive bleach-sedimented smears were misread as negative. Conclusions: Gravity bleach-sedimentation improved laboratory safety by sterilising sputum but decreased the concentration of acid-fast bacilli visible on microscopy, especially for sputum specimens containing high concentrations of M. tuberculosis. Bleach-sedimentation allowed examination of more of each specimen in the time available but decreased the inter-observer reliability with which slides were read. Thus bleach-sedimentation effects vary depending upon specimen characteristics and whether microscopy was done for a specified time, or until a specified number of microscopy fields had been read. These findings provide an explanation for the contradictory results of previous studies. PMID:21985457
DOE Office of Scientific and Technical Information (OSTI.GOV)

Levegruen, Sabine, E-mail: sabine.levegruen@uni-due.de; Poettgen, Christoph; Abu Jawad, Jehad

Purpose: To evaluate megavoltage computed tomography (MVCT)-based image guidance with helical tomotherapy in patients with vertebral tumors by analyzing factors influencing interobserver variability, considered as quality criterion of image guidance. Methods and Materials: Five radiation oncologists retrospectively registered 103 MVCTs in 10 patients to planning kilovoltage CTs by rigid transformations in 4 df. Interobserver variabilities were quantified using the standard deviations (SDs) of the distributions of the correction vector components about the observers' fraction mean. To assess intraobserver variabilities, registrations were repeated after ≥4 weeks. Residual deviations after setup correction due to uncorrectable rotational errors and elastic deformations were determined at 3 craniocaudal target positions. To differentiate observer-related variations in minimizing these residual deviations across the 3-dimensional MVCT from image resolution effects, 2-dimensional registrations were performed in 30 single transverse and sagittal MVCT slices. Axial and longitudinal MVCT image resolutions were quantified. For comparison, image resolution of kilovoltage cone-beam CTs (CBCTs) and interobserver variability in registrations of 43 CBCTs were determined. Results: Axial MVCT image resolution is 3.9 lp/cm. Longitudinal MVCT resolution amounts to 6.3 mm, assessed as full-width at half-maximum of thin objects in MVCTs with finest pitch. Longitudinal CBCT resolution is better (full-width at half-maximum, 2.5 mm for CBCTs with 1-mm slices). In MVCT registrations, interobserver variability in the craniocaudal direction (SD 1.23 mm) is significantly larger than in the lateral and ventrodorsal directions (SD 0.84 and 0.91 mm, respectively) and significantly larger compared with CBCT alignments (SD 1.04 mm). Intraobserver variabilities are significantly smaller than corresponding interobserver variabilities (variance ratio [VR] 1.8-3.1). 
Compared with 3-dimensional registrations, 2-dimensional registrations have significantly smaller interobserver variability in the lateral and ventrodorsal directions (VR 3.8 and 2.8, respectively) but not in the craniocaudal direction (VR 0.75). Conclusion: Tomotherapy image guidance precision is affected by image resolution and residual deviations after setup correction. Eliminating the effect of residual deviations yields small interobserver variabilities with submillimeter precision in the axial plane. In contrast, interobserver variability in the craniocaudal direction is dominated by the poorer longitudinal MVCT image resolution. Residual deviations after image guidance exist and need to be considered when dose gradients ultimately achievable with image guided radiation therapy techniques are analyzed.
Microarcsecond models for the celestial motions of the CIP and CEO

NASA Astrophysics Data System (ADS)

Capitaine, N.

2004-09-01

The Celestial intermediate pole (CIP) and Celestial ephemeris (or intermediate) origin (CEO/CIO) have been adopted by the IAU (cf. IAU 2000 Resolution B1.8) as the celestial pole and origin, respectively, to be used for realizing the intermediate celestial system between the International Terrestrial System (ITRS) and Geocentric Celestial Reference System (GCRS). Resolution B1.8 has also recommended that the International Earth Rotation and Reference Systems Service (IERS) continue to provide users with data and algorithms for the conventional transformation. The IAU 2000 Resolutions have been implemented in the IERS 2003 Conventions including Tables and routines that provide the celestial motions of the CIP and the CEO with a theoretical accuracy of one microarcsecond after one century using either the classical or the new transformation. This paper reports on the method used for achieving this accuracy in the positions of the CIP and CIO and on the difference between this rigorous procedure and the pre-2003 classical one.
Theoretical Study of Operational Limits of High-Speed Quantum Dot Lasers

DTIC Science & Technology

2012-09-09

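esc $- v^{L}_{n,\mathrm{capt}} n^{L} - b_1 B n^{L} p^{L}$, (1) $b_1 \, \partial p^{L} / \partial t = p^{L}_{\mathrm{QW}} / \tau^{L}_{p,\mathrm{esc}} - v^{L}_{p,\mathrm{capt}} p^{L} - b_1 B n^{L} p^{L}$, (2) for free holes and electrons on the right-hand side of...on the left-hand side of the OCL can be written as follows: $p^{L}_{\mathrm{QW}} / \tau_{p,\mathrm{esc}} = v^{L}_{p,\mathrm{capt}} p^{L} + b_1 B n^{L} p^{L}$. (28) Substituting $p^{L}_{\mathrm{QW}} / \tau_{p,\mathrm{esc}} - v^{L}_{p,\mathrm{capt}} p^{L} = b_1 B n^{L}$ ...$p^{L}$ in (6), we have $B_{2D} n^{L}_{\mathrm{QW}} p^{L}_{\mathrm{QW}} + b_1 B n^{L} p^{L} = w^{L}_{p,\mathrm{tunn}} p^{L,\mathrm{QW}}_{1} N_S f_p - w^{L}_{p,\mathrm{tunn}} N_S (1 - f_p) p^{L}_{\mathrm{QW}}$. (29) As seen from (29), bimolecular recombination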
A Microsoft Excel® 2010 Based Tool for Calculating Interobserver Agreement

PubMed Central

Azulay, Richard L

2011-01-01

This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel®) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work. PMID:22649578
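Several of the agreement algorithms listed in this abstract reduce to simple ratios over per-interval records. The sketch below uses definitions as they are commonly given in the behavior-analytic literature, which may differ in detail from the spreadsheet described in the report; the function names and the sample counts are hypothetical.

```python
def _check(a, b):
    if len(a) != len(b) or not a:
        raise ValueError("observer records must be non-empty and equal length")

def total_count_ioa(counts_a, counts_b):
    """Smaller session total divided by larger session total, as a percentage."""
    _check(counts_a, counts_b)
    ta, tb = sum(counts_a), sum(counts_b)
    return 100.0 if ta == tb else 100.0 * min(ta, tb) / max(ta, tb)

def exact_agreement_ioa(counts_a, counts_b):
    """Percentage of intervals in which both observers recorded the same count."""
    _check(counts_a, counts_b)
    agree = sum(1 for a, b in zip(counts_a, counts_b) if a == b)
    return 100.0 * agree / len(counts_a)

def interval_by_interval_ioa(counts_a, counts_b):
    """Percentage of intervals where both agree on occurrence vs. nonoccurrence."""
    _check(counts_a, counts_b)
    agree = sum(1 for a, b in zip(counts_a, counts_b) if (a > 0) == (b > 0))
    return 100.0 * agree / len(counts_a)

def scored_interval_ioa(counts_a, counts_b):
    """Agreement restricted to intervals that at least one observer scored."""
    _check(counts_a, counts_b)
    scored = [(a, b) for a, b in zip(counts_a, counts_b) if a > 0 or b > 0]
    if not scored:
        return None  # undefined when neither observer scored any interval
    both = sum(1 for a, b in scored if a > 0 and b > 0)
    return 100.0 * both / len(scored)

# Hypothetical response counts per interval for two observers.
observer_1 = [2, 0, 1, 3, 0]
observer_2 = [2, 1, 1, 2, 0]
print(total_count_ioa(observer_1, observer_2))
print(exact_agreement_ioa(observer_1, observer_2))
print(interval_by_interval_ioa(observer_1, observer_2))
print(scored_interval_ioa(observer_1, observer_2))
```

In this sample the session totals match even though the observers disagree in two intervals, which illustrates why total count agreement is usually considered the least stringent of these measures.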
A Microsoft Excel® 2010 based tool for calculating interobserver agreement.

PubMed

Reed, Derek D; Azulay, Richard L

2011-01-01

This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel®) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work.
Atlas-based segmentation technique incorporating inter-observer delineation uncertainty for whole breast

NASA Astrophysics Data System (ADS)

Bell, L. R.; Dowling, J. A.; Pogson, E. M.; Metcalfe, P.; Holloway, L.

2017-01-01

Accurate, efficient auto-segmentation methods are essential for the clinical efficacy of adaptive radiotherapy delivered with highly conformal techniques. Current atlas-based auto-segmentation techniques are adequate in this respect but fail to account for inter-observer variation. An atlas-based segmentation method that incorporates inter-observer variation is proposed. This method is validated for a whole breast radiotherapy cohort containing 28 CT datasets with CTVs delineated by eight observers. To optimise atlas accuracy, the cohort was divided into categories by mean body mass index and laterality, with atlases generated for each in a leave-one-out approach. Observer CTVs were merged and thresholded to generate an auto-segmentation model representing both inter-observer and inter-patient differences. For each category, the atlas was registered to the left-out dataset to enable propagation of the auto-segmentation from atlas space. Auto-segmentation time was recorded. The segmentation was compared to the gold-standard contour using the Dice similarity coefficient (DSC) and mean absolute surface distance (MASD). Comparison with the smallest and largest CTV was also made. This atlas-based auto-segmentation method incorporating inter-observer variation was shown to be efficient (<4 min) and accurate for whole breast radiotherapy, with good agreement (DSC >0.7, MASD <9.3 mm) between the auto-segmented contours and CTV volumes.
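The Dice similarity coefficient used in this abstract is twice the overlap of two segmentations divided by the sum of their sizes. A minimal sketch, assuming each contour has already been rasterised to a set of voxel coordinates; the function name `dice_coefficient` and the toy 2D masks are hypothetical:

```python
def dice_coefficient(seg_a, seg_b):
    """Dice similarity coefficient between two collections of voxel coordinates."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty segmentations agree
    return 2 * len(a & b) / (len(a) + len(b))

# Toy 2D "masks": two 4 x 4 squares offset by one voxel along x.
auto_segmentation = {(x, y) for x in range(0, 4) for y in range(4)}
gold_standard = {(x, y) for x in range(1, 5) for y in range(4)}

dsc = dice_coefficient(auto_segmentation, gold_standard)
print(f"DSC = {dsc:.2f}")
```

Because the coefficient counts overlapping volume only, it says nothing about how far the non-overlapping surface lies from the reference, which is why a surface distance such as MASD is typically reported alongside it.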
Reliability of plain radiographic parameters for developmental dysplasia of the hip in children.

PubMed

Upasani, Vidyadhar V; Bomar, James D; Parikh, Gaurav; Hosalkar, Harish

2012-07-01

Few studies have evaluated the reliability and reproducibility of the femoral neck-shaft angle (NSA), center-edge angle (CEA), and acetabular index (AI) in young children with developmental dysplasia of the hip (DDH). We wanted to determine whether these parameters could be used reliably by practitioners. Fifty radiographs from 21 children with DDH were reviewed. Analysis was performed by three observers, at two time periods. The intra- and inter-observer reliability for each measure was assessed. At time period one, we noted a "high" level of agreement between observers when measuring the NSA, a "low" level when measuring the CEA, and a "moderate" level when measuring the AI. At time period two, we noted a "very high" level of agreement between observers when measuring the NSA and a "high" level when measuring the CEA and AI. When comparing the measurements of observer 1 at the two different time periods, we noted nearly "very high" agreement when measuring the NSA, a "moderate" agreement when measuring the CEA, and a "high" agreement for the AI. In comparing the measurements of observer 2, we noted "very high" agreement for the NSA and "high" agreement for the CEA and AI. In comparing the measurements for observer 3, we noted nearly "very high" agreement for the NSA, nearly "high" agreement for the CEA, and "high" agreement for the AI. It is difficult to reliably measure three-dimensional pelvic morphology on a frontal plane radiograph, especially when important pelvic landmarks have yet to ossify.
Assessment of radial torsion using computed tomography in dogs with and without antebrachial limb deformity.

PubMed

Kroner, Kevin; Cooley, Katie; Hoey, Seamus; Hetzel, Scott J; Bleedorn, Jason A

2017-01-01

To evaluate the reliability of radial torsion assessment in dogs using computed tomography (CT). Cadaveric and retrospective observational clinical study. Thoracic limbs (n = 40) from bilateral normal cadaveric canine specimens (10 pairs) and unilateral antebrachial angular limb deformity (ALD) dogs (10 uniapical and 10 biapical deformities). Limbs were evaluated using CT. Frontal, sagittal, and axial plane (torsion) values were obtained using published guidelines and compared between groups and limbs. Radial torsion reliability was assessed among 3 observers using intraclass correlation coefficients (ICC). The mean (±SD) radial torsion of normal dogs was 3.6° ± 6.4° and contained a significant right to left limb variation of 2.6°. Mean radial torsion in uniapical ALD limbs (3.6° ± 18.7°) was not significantly different from biapical ALD limbs (8.9° ± 17.9°). There was a wide range of torsion values in normal and ALD limbs. The interobserver reliability was excellent (ICC > 0.8) for normal dogs, good (0.73) for uniapical, and excellent (0.89) for biapical ALD limbs. The intraobserver reliability was excellent (>0.8) for all groups. There was a small side-to-side variation of radial torsion in normal dogs. With directed training, torsion assessment using CT is reliable in dogs with and without antebrachial bone deformity. © 2016 The American College of Veterinary Surgeons.
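For readers unfamiliar with how an ICC across several observers is computed, the following sketch derives one common form, ICC(2,1) (two-way random effects, absolute agreement, single rater), from the two-way ANOVA mean squares. The abstract does not state which ICC form the authors used, so this choice is an assumption; the function name `icc_2_1` is hypothetical and the ratings are the small worked dataset from Shrout and Fleiss (1979), not torsion measurements.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table of >= 2 subjects and >= 2 raters")
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    denom = msr + (k - 1) * mse + k * (msc - mse) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (msr - mse) / denom

# Six subjects rated by four raters (Shrout and Fleiss worked example).
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(f"ICC(2,1) = {icc_2_1(scores):.2f}")
```

Other ICC forms (consistency rather than absolute agreement, or averaged rather than single ratings) use the same mean squares with a different denominator and can give noticeably different values on the same data.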
Facial Angiofibroma Severity Index (FASI): reliability assessment of a new tool developed to measure severity and responsiveness to therapy in tuberous sclerosis-associated facial angiofibroma.

PubMed

Salido-Vallejo, R; Ruano, J; Garnacho-Saucedo, G; Godoy-Gijón, E; Llorca, D; Gómez-Fernández, C; Moreno-Giménez, J C

2014-12-01

Tuberous sclerosis complex (TSC) is an autosomal dominant neurocutaneous disorder characterized by the development of multisystem hamartomatous tumours. Topical sirolimus has recently been suggested as a potential treatment for TSC-associated facial angiofibroma (FA). To validate a reproducible scale created for the assessment of clinical severity and treatment response in these patients. We developed a new tool, the Facial Angiofibroma Severity Index (FASI) to evaluate the grade of erythema and the size and extent of FAs. In total, 30 different photographs of patients with TSC were shown to 56 dermatologists at each evaluation. Three evaluations using the same photographs but in a different random order were performed 1 week apart. Test and retest reliability and interobserver reproducibility were determined. There was good agreement between the investigators. Inter-rater reliability showed strong correlations (> 0.98; range 0.97-0.99) with inter-rater correlation coefficients (ICCs) for the FASI. The global estimated kappa coefficient for the degree of intra-rater agreement (test-retest) was 0.94 (range 0.91-0.97). The FASI is a valid and reliable tool for measuring the clinical severity of TSC-associated FAs, which can be applied in clinical practice to evaluate the response to treatment in these patients. © 2014 British Association of Dermatologists.
Spanish translation and validation of the Scale for Contraversive Pushing to measure pusher behaviour.

PubMed

Martín-Nieto, A; Atín-Arratibel, M Á; Bravo-Llatas, C; Moreno-Bermejo, M I; Martín-Casas, P

2018-06-08

The aim of this study was to develop and validate a Spanish-language version of the Scale for Contraversive Pushing, used to diagnose and measure pusher behaviour in stroke patients. Translation-back translation was used to create the Spanish-language Scale for Contraversive Pushing; we subsequently evaluated its validity and reliability by administering it to a sample of patients. We also analysed its sensitivity to change in patients identified as pushers who received neurological physiotherapy. Experts indicated that the content of the scale was valid. Internal consistency was very good (Cronbach's alpha of 0.94). The intraclass correlation coefficient showed high intra- and interobserver reliability (0.999 and 0.994, respectively). The Kappa and weighted Kappa coefficients were used to measure the reliability of each item; the majority obtained values above 0.9. Lastly, the differences between baseline and final evaluations of pushers were significant (paired sample t test), showing that the scale is sensitive to changes obtained through physical therapy. The Spanish-language version of the Scale for Contraversive Pushing is valid and reliable for measuring pusher behaviour in stroke patients. In addition, it is able to evaluate the ongoing changes in patients who have received physical therapy. Copyright © 2018 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
The reliability of the Hendrich Fall Risk Model in a geriatric hospital.

PubMed

Heinze, Cornelia; Halfens, Ruud; Dassen, Theo

2008-12-01

Aims and objectives. The purpose of this study was to test the interrater reliability of the Hendrich Fall Risk Model, an instrument to identify patients in a hospital setting with a high risk of falling. Background. Falls are a serious problem in older patients. Valid and reliable fall risk assessment tools are required to identify high-risk patients and to take adequate preventive measures. Methods. Seventy older patients were independently and simultaneously assessed by six pairs of raters made up of nursing staff members. Consensus estimates were calculated using simple percentage agreement and consistency estimates using Spearman's rho and intra class coefficient. Results. Percentage agreement ranged from 0.70 to 0.92 between the six pairs of raters. Spearman's rho coefficients were between 0.54 and 0.80 and the intra class coefficients were between 0.46 and 0.92. Conclusions. Whereas some pairs of raters obtained considerable interobserver agreement and internal consistency, the others did not. Therefore, it is concluded that the Hendrich Fall Risk Model is not a reliable instrument. The use of more unambiguous operationalized items is preferred. Relevance to clinical practice. In practice, well operationalized fall risk assessment tools are necessary. Observer agreement should always be investigated after introducing a standardized measurement tool. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd.
Reliability of injury grading systems for patients with blunt splenic trauma.

PubMed

Olthof, D C; van der Vlies, C H; Scheerder, M J; de Haan, R J; Beenen, L F M; Goslings, J C; van Delden, O M

2014-01-01

The most widely used grading system for blunt splenic injury is the American Association for the Surgery of Trauma (AAST) organ injury scale. In 2007 a new grading system was developed. This 'Baltimore CT grading system' is superior to the AAST classification system in predicting the need for angiography and embolization or surgery. The objective of this study was to assess inter- and intraobserver reliability between radiologists in classifying splenic injury according to both grading systems. CT scans of 83 patients with blunt splenic injury admitted between 1998 and 2008 to an academic Level 1 trauma centre were retrospectively reviewed. Inter- and intrarater reliability were expressed as Cohen's or weighted kappa values. Overall weighted interobserver kappa coefficients for the AAST and 'Baltimore CT grading system' were respectively substantial (kappa=0.80) and almost perfect (kappa=0.85). Average weighted intraobserver kappa values were in the 'almost perfect' range (AAST: kappa=0.91, 'Baltimore CT grading system': kappa=0.81). The present study shows that overall the inter- and intraobserver reliability for grading splenic injury according to the AAST grading system and 'Baltimore CT grading system' are equally high. Because of the integration of vascular injury, the 'Baltimore CT grading system' supports clinical decision making. We therefore recommend use of this system in the classification of splenic injury. Copyright © 2012 Elsevier Ltd. All rights reserved.
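For ordinal grades such as those above, a weighted kappa gives partial credit when two readers are only one grade apart. The sketch below assumes linear weights and grades coded 0 to n_cat-1; the abstract does not state which weighting scheme was used, so both choices are assumptions made for illustration.

```python
def weighted_kappa(a, b, n_cat):
    """Linearly weighted Cohen's kappa for two raters; grades coded 0..n_cat-1."""
    n = len(a)
    # Observed joint proportions of (grade by rater A, grade by rater B).
    obs = [[0.0] * n_cat for _ in range(n_cat)]
    for x, y in zip(a, b):
        obs[x][y] += 1 / n
    row = [sum(obs[i]) for i in range(n_cat)]
    col = [sum(obs[i][j] for i in range(n_cat)) for j in range(n_cat)]
    d_obs = d_exp = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            # Disagreement weight grows linearly with the distance between grades.
            w = abs(i - j) / (n_cat - 1)
            d_obs += w * obs[i][j]
            d_exp += w * row[i] * col[j]
    # Undefined (division by zero) if both raters use a single identical grade throughout.
    return 1 - d_obs / d_exp

# Hypothetical grades for ten scans on a five-level scale (not study data).
reader_a = [0, 1, 1, 2, 3, 4, 2, 3, 0, 4]
reader_b = [0, 1, 2, 2, 3, 4, 2, 2, 0, 3]
print(weighted_kappa(reader_a, reader_b, n_cat=5))
```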
Intrarater and interrater reliability and validity in the assessment of the mechanism of injury and integrity of the posterior ligamentous complex: a novel injury severity scoring system for thoracolumbar injuries. Invited submission from the Joint Section Meeting On Disorders of the Spine and Peripheral Nerves, March 2005.

PubMed

Harrop, James S; Vaccaro, Alexander R; Hurlbert, R John; Wilsey, Jared T; Baron, Eli M; Shaffrey, Christopher I; Fisher, Charles G; Dvorak, Marcel F; Oner, F C; Wood, Kirkham B; Anand, Neel; Anderson, D Greg; Lim, Moe R; Lee, Joon Y; Bono, Christopher M; Arnold, Paul M; Rampersaud, Y Raja; Fehlings, Michael G

2006-02-01

A new classification and treatment algorithm for thoracolumbar injuries was recently introduced by Vaccaro and colleagues in 2005. A thoracolumbar injury severity scale (TLISS) was proposed for grading and guiding treatment for these injuries. The scale is based on the following: 1) the mechanism of injury; 2) the integrity of the posterior ligamentous complex (PLC); and 3) the patient's neurological status. The reliability and validity of assessing injury mechanism and the integrity of the PLC was assessed. Forty-eight spine surgeons, consisting of neurosurgeons and orthopedic surgeons, reviewed 56 clinical thoracolumbar injury case histories. Each was classified and scored to determine treatment recommendations according to a novel classification system. After 3 months the case histories were reordered and the physicians repeated the exercise. Validity of this classification was good among reviewers; the vast majority (> 90%) agreed with the system's treatment recommendations. Surgeons were unclear as to a cogent description of PLC disruption and fracture mechanism. The TLISS demonstrated acceptable reliability in terms of intra- and interobserver agreement on the algorithm's treatment recommendations. Replacing injury mechanism with a description of injury morphology and better definition of PLC injury will improve inter- and intraobserver reliability of this injury classification system.
Psychometric properties and cross-cultural adaptation of the Brazilian Quebec back pain disability scale questionnaire.

PubMed

Rodrigues, Marcelo F; Michel-Crosato, Edgard; Cardoso, Jefferson R; Traebert, Jefferson

2009-06-01

Cross-cultural translation and psychometric testing. To translate and cross-culturally adapt the Quebec Back Pain Disability Scale (QDS) to Brazilian Portuguese and to examine its validity and reliability. Current literature shows the need to adopt reliable and internationally standardized methods for the analysis of low back pain. To our knowledge, this specific questionnaire has not been translated and validated for Portuguese-speaking patients. The translation and cross-cultural adaptation of the QDS were developed in agreement with internationally recommended methodology, and the resulting product was evaluated in this study with 54 consecutive patients. Internal consistency was obtained through Cronbach's alpha; reliability was estimated through the intraclass correlation coefficient and the Bland and Altman agreement (d = mean difference). Validity was determined by correlating the scores of the Brazil-QDS with the Brazilian version of the Roland-Morris Questionnaire and Visual Analogue Pain Scale by means of the Spearman rank correlation coefficient. The internal consistency obtained was excellent (Cronbach's alpha = 0.97). Intraobserver and interobserver reliability were considered strong (ICC = 0.93, d = 0.68 and ICC = 0.96, d = 0.57, respectively). The correlation with the Brazilian Roland-Morris Questionnaire and with the Visual Analogue Scale was high (r = 0.857; r = 0.758, respectively). The data showed that the process of translation and cross-cultural adaptation was successful and that the adapted instrument demonstrated excellent psychometric properties.
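Cronbach's alpha, the internal-consistency statistic reported above, compares the sum of the item variances with the variance of the total score. A minimal sketch follows; the four-item response matrix is hypothetical and far smaller than a real questionnaire.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, one column per questionnaire item."""
    k = len(scores[0])
    items = list(zip(*scores))                      # transpose to one tuple per item
    item_var = sum(variance(col) for col in items)  # sum of per-item variances
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical responses: 5 respondents x 4 items, each scored 0-5 (not study data).
responses = [
    [0, 1, 0, 1],
    [2, 2, 1, 2],
    [3, 3, 3, 2],
    [4, 5, 4, 4],
    [5, 5, 4, 5],
]
print(cronbach_alpha(responses))
```

Alpha approaches 1 when items rise and fall together across respondents, and approaches 0 when item scores are unrelated to one another.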

Intra- and inter-observer agreement when using a descriptive classification scale for clinical assessment of faecal consistency in growing pigs.

PubMed

Pedersen, Ken Steen; Toft, Nils

2011-03-01

The objective of the current study was to evaluate intra- and inter-observer agreement using a descriptive classification scale with four categories, descriptive text and pictures for assessment of consistency in faecal samples from pigs post weaning. The four consistency categories were score one=firm and shaped, score two=soft and shaped, score three=loose and score four=watery. Five observers from the same veterinary practice examined 100 faecal samples using the scale with four categories. Four of the observers examined the 100 faecal samples twice within the same day. Within observers the difference in proportions for the individual consistency categories between two examinations was on average 0.04 (range: 0-0.10). The mean intra-observer agreement was 0.82 (range: 0.72-0.91) with a mean kappa value of 0.76 (range: 0.61-0.88). For inter-observer agreement overall kappa was 0.64. For the 10 pair-wise comparisons the mean inter-observer agreement was 0.73 (range: 0.61-0.90) with a mean kappa value of 0.64 (range: 0.48-0.87). The difference in proportions for the individual consistency categories was on average 0.08 (range: 0-0.17). In conclusion, the agreement observed for the descriptive classification scale with four categories, descriptive text and pictures may be categorized as a substantial to almost perfect intra-observer agreement and a moderate to almost perfect inter-observer agreement. However, more objective measures than clinical scales may still be needed to improve intra- and inter-observer agreement in research studies. Copyright © 2010 Elsevier B.V. All rights reserved.
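The abstract above reports both raw agreement and kappa for each observer pair, then describes the kappa ranges in words. The sketch below computes both quantities for two observers and maps kappa to a verbal label. The bands used are the commonly cited Landis and Koch cut-offs, which match the wording of the abstract, but the abstract does not name its source, so treat the mapping as an assumption. The scores are hypothetical.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Return (observed agreement, Cohen's kappa) for two observers."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance from each observer's category frequencies.
    p_exp = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return p_obs, (p_obs - p_exp) / (1 - p_exp)

def landis_koch(kappa):
    """Verbal label for a kappa value (assumed Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical consistency scores (1=firm ... 4=watery) for 12 samples.
obs_1 = [1, 1, 2, 2, 3, 3, 4, 4, 1, 2, 3, 4]
obs_2 = [1, 1, 2, 3, 3, 3, 4, 4, 1, 2, 2, 4]
agreement, kappa = cohen_kappa(obs_1, obs_2)
print(agreement, kappa, landis_koch(kappa))
```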
Interobserver variability in the radiological assessment of magnetic resonance imaging (MRI) including perfusion MRI in glioblastoma multiforme.

PubMed

Kerkhof, M; Hagenbeek, R E; van der Kallen, B F W; Lycklama À Nijeholt, G J; Dirven, L; Taphoorn, M J B; Vos, M J

2016-10-01

Conventional magnetic resonance imaging (MRI) has limited value for differentiation of true tumor progression and pseudoprogression in treated glioblastoma multiforme (GBM). Perfusion weighted imaging (PWI) may be helpful in the differentiation of these two phenomena. Here interobserver variability in routine radiological evaluation of GBM patients is assessed using MRI, including PWI. Three experienced neuroradiologists evaluated MR scans of 28 GBM patients during temozolomide chemoradiotherapy at three time points: preoperative (MR1) and postoperative (MR2) MR scan and the follow-up MR scan after three cycles of adjuvant temozolomide (MR3). Tumor size was measured both on T1 post-contrast and T2 weighted images according to the Response Assessment in Neuro-Oncology criteria. PW images of MR3 were evaluated by visual inspection of relative cerebral blood volume (rCBV) color maps and by quantitative rCBV measurements of enhancing areas with highest rCBV. Image interpretability of PW images was also scored. Finally, the neuroradiologists gave a conclusion on tumor status, based on the interpretation of both T1 and T2 weighted images (MR1, MR2 and MR3) in combination with PWI (MR3). Interobserver agreement on visual interpretation of rCBV maps was good (κ = 0.63) but poor on quantitative rCBV measurements and on interpretability of perfusion images (intraclass correlation coefficient 0.37 and κ = 0.23, respectively). Interobserver agreement on the overall conclusion of tumor status was moderate (κ = 0.48). Interobserver agreement on the visual interpretation of PWI color maps was good. However, overall interpretation of MR scans (using both conventional and PW images) showed considerable interobserver variability. Therefore, caution should be applied when interpreting MRI results during chemoradiation therapy. © 2016 EAN.
The 2014 updated version of the Confusion Assessment Method for the Intensive Care Unit compared to the 5th version of the Diagnostic and Statistical Manual of Mental Disorders and other current methods used by intensivists.

PubMed

Chanques, Gérald; Ely, E Wesley; Garnier, Océane; Perrigault, Fanny; Eloi, Anaïs; Carr, Julie; Rowan, Christine M; Prades, Albert; de Jong, Audrey; Moritz-Gasser, Sylvie; Molinari, Nicolas; Jaber, Samir

2018-03-01

One third of patients admitted to an intensive care unit (ICU) will develop delirium. However, delirium is under-recognized by bedside clinicians without the use of delirium screening tools, such as the Intensive Care Delirium Screening Checklist (ICDSC) or the Confusion Assessment Method for the ICU (CAM-ICU). The CAM-ICU was updated in 2014 to improve its use by clinicians throughout the world. It has never been validated compared to the new reference standard, the Diagnostic and Statistical Manual of Mental Disorders 5th version (DSM-5). We made a prospective psychometric study in a 16-bed medical-surgical ICU of a French academic hospital, to measure the diagnostic performance of the 2014 updated CAM-ICU compared to the DSM-5 as the reference standard. We included consecutive adult patients with a Richmond Agitation Sedation Scale (RASS) ≥ -3, without preexisting cognitive disorders, psychosis or cerebral injury. Delirium was independently assessed by neuropsychological experts using an operationalized approach to DSM-5, by investigators using the CAM-ICU and the ICDSC, by bedside clinicians and by ICU patients. The sensitivity, specificity, positive and negative predictive values were calculated considering neuropsychologist DSM-5 assessments as the reference standard (primary endpoint). CAM-ICU inter-observer agreement, as well as that between delirium diagnosis methods and the reference standard, was summarized using κ coefficients, which were subsequently compared using the Z-test. Delirium was diagnosed by experts in 38% of the 108 patients included for analysis. The CAM-ICU had a sensitivity of 83%, a specificity of 100%, a positive predictive value of 100% and a negative predictive value of 91%. 
Compared to the reference standard, the CAM-ICU had a significantly (p < 0.05) higher agreement (κ = 0.86 ± 0.05) than the physicians', residents' and nurses' diagnoses (κ = 0.65 ± 0.09; 0.63 ± 0.09; 0.61 ± 0.09, respectively), as well as the patient's own impression of feeling delirious (κ = 0.02 ± 0.11). Differences between the ICDSC (κ = 0.69 ± 0.07) and CAM-ICU were not significant (p = 0.054). The CAM-ICU demonstrated a high reliability for inter-observer agreement (κ = 0.87 ± 0.06). The 2014 updated version of the CAM-ICU is valid according to DSM-5 criteria and reliable regarding inter-observer agreement in a research setting. Delirium remains under-recognized by bedside clinicians.
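Sensitivity, specificity and the predictive values reported above all derive from one 2x2 table of test result against reference standard. The sketch below shows the arithmetic. The cell counts are hypothetical: they were chosen only because they reproduce the rounded percentages in the abstract for 108 patients, and they are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (test result vs reference standard)."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly flagged
        "specificity": tn / (tn + fp),  # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),          # positive tests that are true positives
        "npv": tn / (tn + fn),          # negative tests that are true negatives
    }

# Hypothetical counts (assumption, not study data): 41 of 108 patients delirious.
metrics = diagnostic_metrics(tp=34, fp=0, fn=7, tn=67)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Note that PPV and NPV depend on how common the condition is in the sample, so they do not transfer directly to a setting with a different delirium prevalence.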
Scoring haemophilic arthropathy on X-rays: improving inter- and intra-observer reliability and agreement using a consensus atlas.

PubMed

Foppen, Wouter; van der Schaaf, Irene C; Beek, Frederik J A; Verkooijen, Helena M; Fischer, Kathelijn

2016-06-01

The radiological Pettersson score (PS) is widely applied for classification of arthropathy to evaluate costly haemophilia treatment. This study aims to assess and improve inter- and intra-observer reliability and agreement of the PS. Two series of X-rays (bilateral elbows, knees, and ankles) of 10 haemophilia patients (120 joints) with haemophilic arthropathy were scored by three observers according to the PS (maximum score 13/joint). Subsequently, (dis-)agreement in scoring was discussed until consensus. Example images were collected in an atlas. Thereafter, a second series of 120 joints was scored using the atlas. One observer rescored the second series after three months. Reliability was assessed by intraclass correlation coefficients (ICC), agreement by limits of agreement (LoA). Median Pettersson score at joint level (PSjoint) of affected joints was 6 (interquartile range 3-9). Using the consensus atlas, inter-observer reliability of the PSjoint improved significantly from 0.94 (95% confidence interval (CI) 0.91-0.96) to 0.97 (CI 0.96-0.98). LoA improved from ±1.7 to ±1.1 for the PSjoint. Therefore, true differences in arthropathy were differences in the PSjoint of >2 points. Intra-observer reliability of the PSjoint was 0.98 (CI 0.97-0.98), intra-observer LoA were ±0.9 points. Reliability and agreement of the PS improved by using a consensus atlas. • Reliability of the Pettersson score significantly improved using the consensus atlas. • The presented consensus atlas improved the agreement among observers. • The consensus atlas could be recommended to obtain a reproducible Pettersson score.
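Limits of agreement, as used above, are conventionally computed with the Bland-Altman method: the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The sketch below assumes that convention (the abstract does not spell out the formula) and uses hypothetical joint scores.

```python
from statistics import mean, stdev

def limits_of_agreement(a, b):
    """Bland-Altman bias and 95% limits of agreement for paired scores."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)                    # systematic difference between observers
    half_width = 1.96 * stdev(diffs)      # assumes roughly normal differences
    return bias, bias - half_width, bias + half_width

# Hypothetical joint-level scores from two observers (not study data).
observer_1 = [3, 6, 9, 4, 7, 0, 11, 5]
observer_2 = [3, 5, 9, 5, 7, 1, 10, 5]
bias, lower, upper = limits_of_agreement(observer_1, observer_2)
print(f"bias={bias:.2f}, LoA=({lower:.2f}, {upper:.2f})")
```

Unlike the ICC, the limits are expressed in score points, which is what lets the authors state a smallest difference that exceeds measurement error.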
Reliability and validity of the Turkish version of the Berg Balance Scale.

PubMed

Sahin, Fusun; Yilmaz, Figen; Ozmaden, Asli; Kotevolu, Nurdan; Sahin, Tulay; Kuran, Banu

2008-01-01

The purpose of this study was to develop a Turkish version of the Berg Balance Scale (BBS) and assess its reliability and validity. Sixty healthy volunteers older than 65 years were included in the study. Subjects who had lower extremity amputation, or were armchair-bound or bedridden, were excluded. After the translation process, the Turkish version of the scale was administered to each participant twice with an interval of 2 weeks. The intraclass correlation coefficient (ICC) was calculated to assess intra- and inter-observer reliability. Cronbach's alpha was calculated to evaluate internal consistency of the total BBS score. The intraclass correlation coefficient was also calculated to examine test-retest reliability. Convergent validity was assessed by correlating the scale with the Modified Barthel Index (MBI) and the Timed Up and Go Test (TUG). Construct validity was assessed with factor analysis. The mean age of the participants was 77.00+/-5.67 years (range: 67-92 years). The ICC for intra- and inter-observer reliability was 0.98 (p<0.0001) and 0.97 (p<0.0001), respectively. Cronbach's alpha of the Turkish version of the BBS was 0.98. The test-retest reliability (ICC) of the Turkish version of the BBS was determined as 0.98 for the total score, and ranged from 0.86-0.99 for individual items. In terms of validity, the Turkish version of the BBS was correlated with the MBI (in positive direction) and TUG (in negative direction) (r=0.67 p<0.0001; r=-0.75 p<0.0001, respectively). The Turkish version of the BBS is a reliable and valid scale to be used in balance assessment of Turkish older adults.
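Several ICC variants exist, and the abstract above does not say which one was used. As an illustration only, the sketch below computes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1), from the mean squares of a subjects-by-raters table. The scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): ratings has one row per subject and one column per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical total scores for 6 participants from 2 observers (not study data).
scores = [[41, 42], [45, 45], [50, 49], [54, 55], [38, 38], [47, 48]]
print(icc_2_1(scores))
```

Because this form penalises a constant offset between raters, it can be lower than a consistency-type ICC computed on the same data.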
Interobserver Agreement on First-Stage Conversation Analytic Transcription

ERIC Educational Resources Information Center

Roberts, Felicia; Robinson, Jeffrey D.

2004-01-01

This investigation assesses interobserver agreement on conversation analytic (CA) transcription. Four professional CA transcribers spent a maximum of 3 hours transcribing 2.5 minutes of a previously unknown, naturally occurring, mundane telephone call. Researchers unitized transcripts into words, sounds, silences, inbreaths, outbreaths, and laugh…
Intraobserver and Interobserver Agreement of Structural and Functional Software Programs for Measuring Glaucoma Progression.

PubMed

Moreno-Montañés, Javier; Antón, Vanesa; Antón, Alfonso; Larrosa, José M; Martinez-de-la-Casa, José María; Rebolleda, Gema; Ussa, Fernando; García-Granero, Marta

2017-04-01

It is important to evaluate intraobserver and interobserver agreement using visual field (VF) testing and optical coherence tomography (OCT) software in order to understand whether the use of this software is sufficient to detect glaucoma progression and to make decisions regarding its treatment. To evaluate agreement in VF and OCT software among 5 glaucoma specialists. The printout pages from VF progression software and OCT progression software from 100 patients were randomized, and the 5 glaucoma specialists subjectively and independently evaluated them for glaucoma. Each image was classified as having no progression, questionable progression, or progression. The principal investigator classified the patients previously as without variability (normal) or with high variability among tests (difficult). Using both software, the specialists also evaluated whether the glaucoma damage had progressed and if treatment change was needed. One month later, the same observers reevaluated the patients in a different order to determine intraobserver reproducibility. Intraobserver and interobserver agreement was estimated using κ statistics and Gwet second-order agreement coefficient. The agreement was compared with other factors. Of the 100 observed patients, half were male and all were white; the mean (SD) age was 69.7 (14.1) years. Intraobserver agreement was substantial to almost perfect for VF software (overall κ [95% CI], 0.59 [0.46-0.72] to 0.87 [0.79-0.96]) and similar for OCT software (overall κ [95% CI], 0.59 [0.46-0.71] to 0.85 [0.76-0.94]). Interobserver agreement among the 5 glaucoma specialists with the VF progression software was moderate (κ, 0.48; 95% CI, 0.41-0.55) and similar to OCT progression software (κ, 0.52; 95% CI, 0.44-0.59). Interobserver agreement was substantial in images classified as having no progression but only fair in those classified as having questionable glaucoma progression or glaucoma progression. 
Interobserver agreement was fair regarding questions about glaucoma progression (κ, 0.39; 95% CI, 0.32-0.48) and consideration about treatment changes (κ, 0.39; 95% CI, 0.32-0.48). The factors associated with agreement were the glaucoma stage and case difficulty. There was substantial intraobserver agreement but moderate interobserver agreement among glaucoma specialists using 2 glaucoma progression software packages. These data suggest that these glaucoma progression software packages are insufficient to obtain high interobserver agreement in both devices except in patients with no progression. The low agreement regarding progression or treatment changes suggests that both software programs used in isolation are insufficient for decision making.
Diffusion-weighted magnetic resonance imaging in the characterization of testicular germ cell neoplasms: Effect of ROI methods on apparent diffusion coefficient values and interobserver variability.

PubMed

Tsili, Athina C; Ntorkou, Alexandra; Astrakas, Loukas; Xydis, Vasilis; Tsampalas, Stavros; Sofikitis, Nikolaos; Argyropoulou, Maria I

2017-04-01

To evaluate the difference in apparent diffusion coefficient (ADC) measurements at diffusion-weighted (DW) magnetic resonance imaging of differently shaped regions-of-interest (ROIs) in testicular germ cell neoplasms (TGCNs), the diagnostic ability of differently shaped ROIs in differentiating seminomas from nonseminomatous germ cell neoplasms (NSGCNs) and the interobserver variability. Thirty-three TGCNs were retrospectively evaluated. Patients underwent MR examinations, including DWI on a 1.5-T MR system. Two observers measured mean tumor ADCs using four distinct ROI methods: round, square, freehand and multiple small, round ROIs. The intraclass correlation coefficient was analyzed to assess interobserver variability. Statistical analysis was used to compare mean ADC measurements among observers, methods and histologic types. All ROI methods showed excellent interobserver agreement, with excellent correlation (P<0.001). Multiple, small ROIs provided the lowest mean ADC in TGCNs. Seminomas had lower mean ADC compared to NSGCNs for each ROI method (P<0.001). Round ROI proved the most accurate method in characterizing TGCNs. Interobserver agreement in ADC measurement is excellent, irrespective of the ROI shape. Multiple, small round ROIs and round ROI proved the more accurate methods for ADC measurement in the characterization of TGCNs and in the differentiation between seminomas and NSGCNs, respectively. Copyright © 2017 Elsevier B.V. All rights reserved.
Intraobserver and interobserver variability of the bone marrow burden (BMB) score for the assessment of disease severity in Gaucher disease. Possible impact of reporting experience.

PubMed

Lai, Jeffrey K C; Robertson, Patricia L; Goh, Christine; Szer, Jeff

2018-02-01

To evaluate the intraobserver and interobserver agreement for bone marrow burden (BMB) scores for individual examinations and for the change in BMB score over time in the same patient. A total of 119 sets of MR images of the lumbar spine and femora from 60 patients with Gaucher disease were included. Each set of MR images was scored using the BMB score independently by two experienced MSK radiologists. One radiologist performed a second read four weeks later. Intraobserver and interobserver agreement was assessed using Bland-Altman analysis and weighted kappa scores. BMB scores (n=119) demonstrated fair intraobserver agreement (weighted kappa=0.53) with a mean difference of -0.20 and 95% limits of agreement (LOA) of (-3.41, 3.01). Interobserver agreement was poor, with a weighted kappa of 0.28, a mean difference of -0.16 and 95% LOA of (-4.45, 4.11). Change in BMB scores over time (n=59) demonstrated poor/fair intraobserver agreement (weighted kappa 0.41, mean difference -0.20 and 95% LOA (-4.35, 3.94)). Interobserver agreement was poor (weighted kappa 0.25, mean difference -0.12 with wide 95% LOA (-6.23, 5.99)). Significant interobserver, and to a lesser extent intraobserver, variation occurs with blinded BMB scoring of Gaucher disease. Copyright © 2016 Elsevier Inc. All rights reserved.
A novel scoring system to measure radiographic abnormalities and related spirometric values in cured pulmonary tuberculosis.

PubMed

Báez-Saldaña, Renata; López-Arteaga, Yesenia; Bizarrón-Muro, Alma; Ferreira-Guerrero, Elizabeth; Ferreyra-Reyes, Leticia; Delgado-Sánchez, Guadalupe; Cruz-Hervert, Luis Pablo; Mongua-Rodríguez, Norma; García-García, Lourdes

2013-01-01

Despite chemotherapy, patients with cured pulmonary tuberculosis may result in lung functional impairment. To evaluate a novel scoring system based on the degree of radiographic abnormalities and related spirometric values in patients with cured pulmonary tuberculosis. One hundred and twenty seven patients with cured pulmonary tuberculosis were prospectively enrolled in a referral hospital specializing in respiratory diseases. Spirometry was performed and the extent of radiographic abnormalities was evaluated twice by each of two readers to generate a novel quantitative score. Scoring reproducibility was analyzed by the intra-class correlation coefficient (ICC) and the Bland-Altman method. Multiple linear regression models were performed to assess the association of the extent of radiographic abnormalities with spirometric values. The intra-observer agreement for scoring of radiographic abnormalities (SRA) showed an ICC of 0.81 (CI:95%, 0.67-0.95) and 0.78 (CI:95%, 0.65-0.92), for reader 1 and 2, respectively. Inter-observer reproducibility for the first measurement was 0.83 (CI:95%, 0.71-0.95), and for the second measurement was 0.74 (CI:95%, 0.58-0.90). The Bland-Altman analysis of the intra-observer agreement showed a mean bias of 0.87% and -0.55% and an inter-observer agreement of -0.35% and -1.78%, indicating a minor average systematic variability. After adjustment for age, gender, height, smoking status, pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and negatively associated with absolute and percent predicted values of FVC: -0.07 (CI:95%, -0.01 to -0.04); -2.48 (CI:95%, -3.45 to -1.50); and FEV1 -0.07 (CI:95%, -0.10 to -0.05); -2.92 (CI:95%, -3.87 to -1.97) respectively, in the patients studied. The extent of radiographic abnormalities, as evaluated through our novel scoring system, was inversely associated with spirometric values, and exhibited good reliability and reproducibility. 
As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable.
A Novel Scoring System to Measure Radiographic Abnormalities and Related Spirometric Values in Cured Pulmonary Tuberculosis

PubMed Central

Báez-Saldaña, Renata; López-Arteaga, Yesenia; Bizarrón-Muro, Alma; Ferreira-Guerrero, Elizabeth; Ferreyra-Reyes, Leticia; Delgado-Sánchez, Guadalupe; Cruz-Hervert, Luis Pablo; Mongua-Rodríguez, Norma; García-García, Lourdes

2013-01-01

Background Despite chemotherapy, patients with cured pulmonary tuberculosis may result in lung functional impairment. Objective To evaluate a novel scoring system based on the degree of radiographic abnormalities and related spirometric values in patients with cured pulmonary tuberculosis. Methods One hundred and twenty seven patients with cured pulmonary tuberculosis were prospectively enrolled in a referral hospital specializing in respiratory diseases. Spirometry was performed and the extent of radiographic abnormalities was evaluated twice by each of two readers to generate a novel quantitative score. Scoring reproducibility was analyzed by the intra-class correlation coefficient (ICC) and the Bland-Altman method. Multiple linear regression models were performed to assess the association of the extent of radiographic abnormalities with spirometric values. Results The intra-observer agreement for scoring of radiographic abnormalities (SRA) showed an ICC of 0.81 (CI:95%, 0.67–0.95) and 0.78 (CI:95%, 0.65–0.92), for reader 1 and 2, respectively. Inter-observer reproducibility for the first measurement was 0.83 (CI:95%, 0.71–0.95), and for the second measurement was 0.74 (CI:95%, 0.58–0.90). The Bland-Altman analysis of the intra-observer agreement showed a mean bias of 0.87% and -0.55% and an inter-observer agreement of -0.35% and -1.78%, indicating a minor average systematic variability. After adjustment for age, gender, height, smoking status, pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and negatively associated with absolute and percent predicted values of FVC: -0.07 (CI:95%, -0.01 to -0.04); -2.48 (CI:95%, -3.45 to -1.50); and FEV1 -0.07 (CI:95%, -0.10 to -0.05); -2.92 (CI:95%, -3.87 to -1.97) respectively, in the patients studied. 
Conclusion The extent of radiographic abnormalities, as evaluated through our novel scoring system, was inversely associated with spirometric values, and exhibited good reliability and reproducibility. As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable. PMID:24223865
Isolated glenohumeral range of motion, excluding side-to-side difference in humeral retroversion, in asymptomatic high-school baseball players.

PubMed

Mihata, Teruhisa; Takeda, Atsushi; Kawakami, Takeshi; Itami, Yasuo; Watanabe, Chisato; Doi, Munekazu; Neo, Masashi

2016-06-01

Glenohumeral range of motion is correlated with shoulder capsular condition and is thus considered to be predictive of shoulder pathology. However, in throwing athletes, a side-to-side difference in humeral retroversion makes it difficult to evaluate capsular condition on the basis of glenohumeral range of motion measured by using the conventional technique. The purpose of this study was to measure isolated glenohumeral rotation, excluding side-to-side differences in humeral retroversion, in asymptomatic high-school baseball players. A total of 195 high-school baseball players (52 pitchers and 143 position players; median age, 16 years) and 20 high-school non-throwing athletes (median age, 16 years) without any shoulder symptoms were enrolled in this study. Glenohumeral external and internal rotations were measured by using both a conventional technique and our ultrasound-assisted technique. In this technique, neutral rotation was standardized on the basis of the ultrasonographically visualized location of the bicipital groove to exclude side-to-side differences in humeral retroversion from the calculated rotation angle. Intra- and inter-observer agreements of rotational measurements were evaluated by using intra-class correlation coefficients (ICCs). Isolated glenohumeral rotation measurements, excluding side-to-side differences in humeral retroversion, demonstrated excellent intra-observer (ICC > 0.89) and inter-observer (ICC > 0.78) agreements. Isolated glenohumeral internal rotation was significantly less in the dominant shoulder than in the non-dominant shoulder in asymptomatic baseball players (P < 0.001). Isolated glenohumeral external rotation in baseball players was significantly greater than in non-throwing athletes (P < 0.05). 
In the baseball players, humeral torsion in the dominant shoulder was significantly greater than that in the non-dominant shoulder (P < 0.001), indicating that the retroversion angle was greater in dominant shoulders than in non-dominant shoulders. Isolated glenohumeral external and internal rotations can be measured with high intra- and inter-observer reliability with the exclusion of side-to-side differences in humeral retroversion. Capsular and muscular changes in the throwing shoulder may be better evaluated by using our ultrasound-assisted technique. Cross-sectional study, Level III.
Differentiation of periapical granulomas and cysts by using dental MRI: a pilot study.

PubMed

Juerchott, Alexander; Pfefferle, Thorsten; Flechtenmacher, Christa; Mente, Johannes; Bendszus, Martin; Heiland, Sabine; Hilgenfeld, Tim

2018-05-17

The purpose of this pilot study was to evaluate whether periapical granulomas can be differentiated from periapical cysts in vivo by using dental magnetic resonance imaging (MRI). Prior to apicoectomy, 11 patients with radiographically confirmed periapical lesions underwent dental MRI, including fat-saturated T2-weighted (T2wFS) images, non-contrast-enhanced T1-weighted images with and without fat saturation (T1w/T1wFS), and contrast-enhanced fat-saturated T1-weighted (T1wFS+C) images. Two independent observers performed structured image analysis of MRI datasets twice. A total of 15 diagnostic MRI criteria were evaluated, and histopathological results (6 granulomas and 5 cysts) were compared with MRI characteristics. Statistical analysis was performed using intraclass correlation coefficient (ICC), Cohen's kappa (κ), Mann-Whitney U-test and Fisher's exact test. Lesion identification and consecutive structured image analysis was possible on T2wFS and T1wFS+C MRI images. A high reproducibility was shown for MRI measurements of the maximum lesion diameter (intraobserver ICC = 0.996/0.998; interobserver ICC = 0.997), for the "peripheral rim" thickness (intraobserver ICC = 0.988/0.984; interobserver ICC = 0.970), and for all non-quantitative MRI criteria (intraobserver-κ = 0.990/0.995; interobserver-κ = 0.988). In accordance with histopathological results, six MRI criteria allowed for a clear differentiation between cysts and granulomas: (1) outer margin of lesion, (2) texture of "peripheral rim" in T1wFS+C, (3) texture of "lesion center" in T2wFS, (4) surrounding tissue involvement in T2wFS, (5) surrounding tissue involvement in T1wFS+C and (6) maximum "peripheral rim" thickness (all: P < 0.05). In conclusion, this pilot study indicates that radiation-free dental MRI enables a reliable differentiation between periapical cysts and granulomas in vivo. Thus, MRI may substantially improve treatment strategies and help to avoid unnecessary surgery in apical periodontitis.
Diagnostic Performance of MR Elastography and Vibration-controlled Transient Elastography in the Detection of Hepatic Fibrosis in Patients with Severe to Morbid Obesity

PubMed Central

Chen, Jun; Yin, Meng; Talwalkar, Jayant A.; Oudry, Jennifer; Glaser, Kevin J.; Smyrk, Thomas C.; Miette, Véronique; Sandrin, Laurent

2017-01-01

Purpose To evaluate the diagnostic performance and examination success rate of magnetic resonance (MR) elastography and vibration-controlled transient elastography (VCTE) in the detection of hepatic fibrosis in patients with severe to morbid obesity. Materials and Methods This prospective and HIPAA-compliant study was approved by the institutional review board. A total of 111 patients (71 women, 40 men) participated. Written informed consent was obtained from all patients. Patients underwent MR elastography with two readers and VCTE with three observers to acquire liver stiffness measurements for liver fibrosis assessment. The results were compared with those from liver biopsy. Each pathology specimen was evaluated by two hepatopathologists according to the METAVIR scoring system or Brunt classification when appropriate. All imaging observers were blinded to the biopsy results, and all hepatopathologists were blinded to the imaging results. Examination success rate, interobserver agreement, and diagnostic accuracy for fibrosis detection were assessed. Results In this obese patient population (mean body mass index = 40.3 kg/m²; 95% confidence interval [CI]: 38.7 kg/m², 41.8 kg/m²), the examination success rate was 95.8% (92 of 96 patients) for MR elastography and 81.3% (78 of 96 patients) or 88.5% (85 of 96 patients) for VCTE. Interobserver agreement was higher with MR elastography than with biopsy (intraclass correlation coefficient, 0.95 vs 0.89). In patients with successful MR elastography and VCTE examinations (excluding unreliable VCTE examinations), both MR elastography and VCTE had excellent diagnostic accuracy in the detection of clinically significant hepatic fibrosis (stage F2–F4) (mean area under the curve: 0.93 [95% CI: 0.85, 0.97] vs 0.91 [95% CI: 0.83, 0.96]; P = .551). Conclusion In this obese patient population, both MR elastography and VCTE had excellent diagnostic performance for assessing hepatic fibrosis; MR elastography was more technically reliable than VCTE and had a higher interobserver agreement than liver biopsy. © RSNA, 2016 Online supplemental material is available for this article. An earlier incorrect version of this article appeared online. This article was corrected on January 25, 2017. PMID:27861111
A Standardized DNA Variant Scoring System for Pathogenicity Assessments in Mendelian Disorders

PubMed Central

Karbassi, Izabela; Maston, Glenn A.; Love, Angela; DiVincenzo, Christina; Braastad, Corey D.; Elzinga, Christopher D.; Bright, Alison R.; Previte, Domenic; Zhang, Ke; Rowland, Charles M.; McCarthy, Michele; Lapierre, Jennifer L.; Dubois, Felicita; Medeiros, Katelyn A.; Batish, Sat Dev; Jones, Jeffrey; Liaquat, Khalida; Hoffman, Carol A.; Jaremko, Malgorzata; Wang, Zhenyuan; Sun, Weimin; Buller‐Burckle, Arlene; Strom, Charles M.; Keiles, Steven B.

2015-01-01

ABSTRACT We developed a rules‐based scoring system to classify DNA variants into five categories including pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign. Over 16,500 pathogenicity assessments on 11,894 variants from 338 genes were analyzed for pathogenicity based on prediction tools, population frequency, co‐occurrence, segregation, and functional studies collected from internal and external sources. Scores were calculated by trained scientists using a quantitative framework that assigned differential weighting to these five types of data. We performed descriptive and comparative statistics on the dataset and tested interobserver concordance among the trained scientists. Private variants defined as variants found within single families (n = 5,182), were either VUS (80.5%; n = 4,169) or likely pathogenic (19.5%; n = 1,013). The remaining variants (n = 6,712) were VUS (38.4%; n = 2,577) or likely benign/benign (34.7%; n = 2,327) or likely pathogenic/pathogenic (26.9%, n = 1,808). Exact agreement between the trained scientists on the final variant score was 98.5% [95% confidence interval (CI) (98.0, 98.9)] with an interobserver consistency of 97% [95% CI (91.5, 99.4)]. Variant scores were stable and showed increasing odds of being in agreement with new data when re‐evaluated periodically. This carefully curated, standardized variant pathogenicity scoring system provides reliable pathogenicity scores for DNA variants encountered in a clinical laboratory setting. PMID:26467025
A Standardized DNA Variant Scoring System for Pathogenicity Assessments in Mendelian Disorders.

PubMed

Karbassi, Izabela; Maston, Glenn A; Love, Angela; DiVincenzo, Christina; Braastad, Corey D; Elzinga, Christopher D; Bright, Alison R; Previte, Domenic; Zhang, Ke; Rowland, Charles M; McCarthy, Michele; Lapierre, Jennifer L; Dubois, Felicita; Medeiros, Katelyn A; Batish, Sat Dev; Jones, Jeffrey; Liaquat, Khalida; Hoffman, Carol A; Jaremko, Malgorzata; Wang, Zhenyuan; Sun, Weimin; Buller-Burckle, Arlene; Strom, Charles M; Keiles, Steven B; Higgins, Joseph J

2016-01-01

We developed a rules-based scoring system to classify DNA variants into five categories including pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign. Over 16,500 pathogenicity assessments on 11,894 variants from 338 genes were analyzed for pathogenicity based on prediction tools, population frequency, co-occurrence, segregation, and functional studies collected from internal and external sources. Scores were calculated by trained scientists using a quantitative framework that assigned differential weighting to these five types of data. We performed descriptive and comparative statistics on the dataset and tested interobserver concordance among the trained scientists. Private variants defined as variants found within single families (n = 5,182), were either VUS (80.5%; n = 4,169) or likely pathogenic (19.5%; n = 1,013). The remaining variants (n = 6,712) were VUS (38.4%; n = 2,577) or likely benign/benign (34.7%; n = 2,327) or likely pathogenic/pathogenic (26.9%, n = 1,808). Exact agreement between the trained scientists on the final variant score was 98.5% [95% confidence interval (CI) (98.0, 98.9)] with an interobserver consistency of 97% [95% CI (91.5, 99.4)]. Variant scores were stable and showed increasing odds of being in agreement with new data when re-evaluated periodically. This carefully curated, standardized variant pathogenicity scoring system provides reliable pathogenicity scores for DNA variants encountered in a clinical laboratory setting. © 2015 The Authors. Human Mutation published by Wiley Periodicals, Inc.
Clinical performance of a free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced pediatric abdominal MR angiography.

PubMed

Zhang, Tao; Yousaf, Ufra; Hsiao, Albert; Cheng, Joseph Y; Alley, Marcus T; Lustig, Michael; Pauly, John M; Vasanawala, Shreyas S

2015-10-01

Pediatric contrast-enhanced MR angiography is often limited by respiration, other patient motion and compromised spatiotemporal resolution. To determine the reliability of a free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced MR angiography method for depicting abdominal arterial anatomy in young children. With IRB approval and informed consent, we retrospectively identified 27 consecutive children (16 males and 11 females; mean age: 3.8 years, range: 14 days to 8.4 years) referred for contrast-enhanced MR angiography at our institution, who had undergone free-breathing spatiotemporally accelerated time-resolved contrast-enhanced MR angiography studies. A radio-frequency-spoiled gradient echo sequence with Cartesian variable density k-space sampling and radial view ordering, intrinsic motion navigation and intermittent fat suppression was developed. Images were reconstructed with soft-gated parallel imaging locally low-rank method to achieve both motion correction and high spatiotemporal resolution. Quality of delineation of 13 abdominal arteries in the reconstructed images was assessed independently by two radiologists on a five-point scale. Ninety-five percent confidence intervals of the proportion of diagnostically adequate cases were calculated. Interobserver agreements were also analyzed. Eleven out of 13 arteries achieved acceptable image quality (mean score range: 3.9-5.0) for both readers. Fair to substantial interobserver agreement was reached on nine arteries. Free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced MR angiography frequently yields diagnostic image quality for most abdominal arteries in young children.
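The abstract above mentions 95% confidence intervals for the proportion of diagnostically adequate cases but does not say which interval method was used. The sketch below assumes a Wilson score interval, one common choice for small samples. The function name `wilson_interval` and the count of 25 adequate cases out of 27 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 25 of 27 studies rated diagnostically adequate.
low, high = wilson_interval(25, 27)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside 0 to 1 even when nearly all cases are adequate, which matters with only 27 children.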
Validation of scores of use of inhalation devices: valoration of errors

PubMed Central

Zambelli-Simões, Letícia; Martins, Maria Cleusa; Possari, Juliana Carneiro da Cunha; Carvalho, Greice Borges; Coelho, Ana Carla Carvalho; Cipriano, Sonia Lucena; de Carvalho-Pinto, Regina Maria; Cukier, Alberto; Stelmach, Rafael

2015-01-01

Abstract Objective: To validate two scores quantifying the ability of patients to use metered dose inhalers (MDIs) or dry powder inhalers (DPIs); to identify the most common errors made during their use; and to identify the patients in need of an educational program for the use of these devices. Methods: This study was conducted in three phases: validation of the reliability of the inhaler technique scores; validation of the contents of the two scores using a convenience sample; and testing for criterion validation and discriminant validation of these instruments in patients who met the inclusion criteria. Results: The convenience sample comprised 16 patients. Interobserver disagreement was found in 19% and 25% of the DPI and MDI scores, respectively. After expert analysis on the subject, the scores were modified and were applied in 72 patients. The most relevant difficulty encountered during the use of both types of devices was the maintenance of total lung capacity after a deep inhalation. The degree of correlation of the scores by observer was 0.97 (p < 0.0001). There was good interobserver agreement in the classification of patients as able/not able to use a DPI (50%/50% and 52%/58%; p < 0.01) and an MDI (49%/51% and 54%/46%; p < 0.05). Conclusions: The validated scores allow the identification and correction of inhaler technique errors during consultations and, as a result, improvement in the management of inhalation devices. PMID:26398751
Interpretation of Post-operative Distal Humerus Radiographs After Internal Fixation: Prediction of Later Loss of Fixation.

PubMed

Claessen, Femke M A P; Stoop, Nicky; Doornberg, Job N; Guitton, Thierry G; van den Bekerom, Michel P J; Ring, David

2016-10-01

Stable fixation of distal humerus fracture fragments is necessary for adequate healing and maintenance of reduction. The purpose of this study was to measure the reliability and accuracy of interpretation of postoperative radiographs to predict which implants will loosen or break after operative treatment of bicolumnar distal humerus fractures. We also addressed agreement among surgeons regarding which fracture fixation will loosen or break and the influence of years in independent practice, location of practice, and so forth. A total of 232 orthopedic residents and surgeons from around the world evaluated 24 anteroposterior and lateral radiographs of distal humerus fractures on a Web-based platform to predict which implants would loosen or break. Agreement among observers was measured using the multi-rater kappa measure. The sensitivity of prediction of failure of fixation of distal humerus fracture on radiographs was 63%, specificity was 53%, positive predictive value was 36%, the negative predictive value was 78%, and accuracy was 56%. There was fair interobserver agreement (κ = 0.27) regarding predictions of failure of fixation of distal humerus fracture on radiographs. Interobserver variability did not change when assessed for the various subgroups. When experienced and skilled surgeons perform fixation of type C distal humerus fracture, the immediate postoperative radiograph is not predictive of fixation failure. Reoperation based on the probability of failure might not be advisable. Diagnostic III. Copyright © 2016 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
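Agreement among many raters, as in the study above, is often summarised with a multi-rater kappa. The sketch below implements Fleiss' kappa, which is one common variant. Whether the study used exactly this variant is not stated, and the function name `fleiss_kappa` and the counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who put subject i into category j.
    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total = n_subjects * n_raters
    # Share of all ratings that fell into each category.
    p_category = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-subject agreement: fraction of rater pairs that agree.
    p_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_subject) / n_subjects
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1.0:
        return 1.0  # every rating fell into a single category
    return (p_bar - p_expected) / (1.0 - p_expected)


# Hypothetical example: 4 radiographs, 5 raters, columns = [will fail, will hold].
votes = [[5, 0], [0, 5], [3, 2], [2, 3]]
kappa_multi = fleiss_kappa(votes)
print(f"Fleiss kappa = {kappa_multi:.2f}")
```

In this toy table the raters agree fully on two radiographs and split on the other two, which is enough to pull kappa well below 1.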
Brazilian version of the body dysmorphic disorder examination.

PubMed

Jorge, Renata Trajano Borges; Sabino Neto, Miguel; Natour, Jamil; Veiga, Daniela Francescato; Jones, Anamaria; Ferreira, Lydia Masako

2008-03-06

Body image improvement is considered to be the main reason for undergoing plastic surgery. The objective was to translate the Body Dysmorphic Disorder Examination (BDDE) into Brazilian Portuguese and to adapt and validate this questionnaire for use in Brazil. Cross-sectional survey, at the Department of Plastic Surgery of Universidade Federal de São Paulo. The BDDE was first translated into Portuguese and then back-translated into English. These translations were then discussed by healthcare professionals in order to establish the final Brazilian version. In a second stage, the validity and reliability of the BDDE were assessed. For this, patients were initially interviewed by two interviewers and subsequently, by only one of these interviewers. On the first occasion, in addition to the BDDE, the body shape questionnaire (BSQ) and the Rosenberg self-esteem scale were also applied. These questionnaires were applied to 90 patients. Six questions were modified during the assessment of cultural equivalence. Cronbach's alpha was 0.89 and the intraclass correlation coefficients for interobserver and test-retest reliability were 0.91 and 0.87, respectively. Pearson's coefficient showed no correlation between the BDDE and the Rosenberg self-esteem scale (0.22), whereas there was a moderate correlation between the BDDE and the BSQ (0.64). The BDDE was successfully translated and adapted, with good internal consistency, reliability and construct validity.
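Cronbach's alpha, reported above as a measure of internal consistency, can be computed directly from item-level scores. The sketch below uses the standard formula. The function name `cronbach_alpha` and the five respondents' answers are hypothetical and unrelated to the BDDE data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; responses[r][i] is respondent r's score on item i."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([resp[i] for resp in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(resp) for resp in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five respondents to a three-item scale.
answers = [[3, 4, 3], [5, 5, 4], [2, 2, 3], [4, 5, 5], [1, 2, 1]]
alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha rises when items move together across respondents, so a high value indicates that the items measure a common construct rather than that any single item is accurate.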

Reliable Alignment in Total Knee Arthroplasty by the Use of an iPod-Based Navigation System

PubMed Central

Koenen, Paola; Schneider, Marco M.; Fröhlich, Matthias; Driessen, Arne; Bouillon, Bertil; Bäthis, Holger

2016-01-01

Axial alignment is one of the main objectives in total knee arthroplasty (TKA). Computer-assisted surgery (CAS) is more accurate regarding limb alignment reconstruction compared to the conventional technique. The aim of this study was to analyse the precision of the innovative navigation system DASH® by Brainlab and to evaluate the reliability of intraoperatively acquired data. A retrospective analysis of 40 patients was performed, who underwent CAS TKA using the iPod-based navigation system DASH. Pre- and postoperative axial alignment were measured on standardized radiographs by two independent observers. These data were compared with the navigation data. Furthermore, interobserver reliability was measured. The duration of surgery was monitored. The mean difference between the preoperative mechanical axis by X-ray and the first intraoperatively measured limb axis by the navigation system was 2.4°. The postoperative X-rays showed a mean difference of 1.3° compared to the final navigation measurement. According to radiographic measurements, 88% of arthroplasties had a postoperative limb axis within ±3°. The mean additional time needed for navigation was 5 minutes. We could prove very good precision for the DASH system, which is comparable to established navigation devices with only negligible expenditure of time compared to conventional TKA. PMID:27313898
PubMed Central

Frémont, P.; Labrecque, M.; Légaré, F.; Baillargeon, L.; Misson, L.

2001-01-01

OBJECTIVE: To develop and test the reliability of a tool for rating websites that provide information on evidence-based medicine. DESIGN: For each site, 60% of the score was given for content (eight criteria) and 40% was given for organization and presentation (nine criteria). Five of 10 randomly selected sites met the inclusion criteria and were used by three observers to test the accuracy of the tool. Each site was rated twice by each observer, with a 3-week interval between ratings. SETTING: Laval University, Quebec city. PARTICIPANTS: Three observers. MAIN OUTCOME MEASURES: The intraclass correlation coefficient (ICC) was used to rate the reliability of the tool. RESULTS: Average overall scores for the five sites were 40%, 79%, 83%, 88%, and 89%. All three observers rated the same two sites in fourth and fifth place and gave the top three ratings to the other three sites. The overall rating of the five sites by the three observers yielded an ICC of 0.93 to 0.97. An ICC of 0.87 was obtained for the two overall ratings conducted 3 weeks apart. CONCLUSION: This new tool offers excellent intraobserver and interobserver measurement reliability and is an excellent means of distinguishing between medical websites of varying quality. For best results, we recommend that the tool be used simultaneously by two observers and that differences be resolved by consensus. PMID:11768925
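The abstract above reports ICCs without naming the ICC form. As an illustration, the sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single rating) from the usual ANOVA mean squares. The choice of form, the function name `icc_2_1` and the site scores are assumptions made for this example.

```python
def icc_2_1(ratings):
    """ICC(2,1); ratings[i][j] is rater j's score for subject i."""
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ratings do not vary; ICC is undefined")
    return (ms_rows - ms_error) / denom


# Hypothetical overall scores (%) for five websites from three observers.
site_scores = [
    [38, 42, 40],
    [80, 77, 79],
    [85, 82, 83],
    [88, 90, 87],
    [90, 88, 89],
]
icc = icc_2_1(site_scores)
print(f"ICC(2,1) = {icc:.3f}")
```

Because one site scores far below the others, between-site variance dominates here and the ICC is high. With sites of similar quality, the same rater disagreement would give a much lower ICC.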
A study of lip prints and its reliability as a forensic tool

PubMed Central

Verma, Yogendra; Einstein, Arouquiaswamy; Gondhalekar, Rajesh; Verma, Anoop K.; George, Jiji; Chandra, Shaleen; Gupta, Shalini; Samadi, Fahad M.

2015-01-01

Introduction: Lip prints, like fingerprints, are unique to an individual and can be easily recorded. Therefore, we compared direct and indirect lip print patterns in males and females of different age groups, studied the inter- and intraobserver bias in recording the data, and observed any changes in the lip print patterns over a period of time, thereby assessing the reliability of lip prints as a forensic tool. Materials and Methods: Fifty females and 50 males in the age group of 15 to 35 years were selected for the study. Lips with any deformity or scars were not included. Lip prints were registered by direct and indirect methods and transferred to a preformed registration sheet. The direct method of lip print registration was repeated after a six-month interval. All the recorded data were analyzed statistically. Results: The predominant patterns were vertical and branched. More females showed the branched pattern, and males showed an equal prevalence of vertical and reticular patterns. Interobserver agreement was 95%, and there was no change in the lip prints over time. Indirect registration of lip prints correlated with direct method prints. Conclusion: Lip prints can be used as a reliable forensic tool, considering the consistency of lip prints over time and the accurate correlation of indirect prints to direct prints. PMID:26668449
Towards a new protocol of scoliosis assessments and monitoring in clinical practice: A pilot study.

PubMed

Lukovic, Tanja; Cukovic, Sasa; Lukovic, Vanja; Devedzic, Goran; Djordjevic, Dusica

2015-01-01

Although intensively investigated, the procedures for assessment and monitoring of scoliosis are still a subject of controversy. The aim of this study was to assess the validity and reliability of a number of physiotherapeutic measurements that could be used for clinical monitoring of scoliosis. Fifteen healthy (symmetric) subjects underwent a set of measurements twice, by two experienced and two inexperienced physiotherapists. Intra-observer and inter-observer reliability of the measurements were determined. The following measurements were performed: body height and weight, chest girth on inspiration and expiration, the length of the legs, the spine translation, the lateral pelvic tilt, the equality of the shoulders, position of the scapulas, the equality of the stature triangles, the rib hump, the existence of m. iliopsoas contracture, Fröhner index, the size of lumbar lordosis and the angle of trunk rotation. The intraclass correlation coefficient was high (> 0.8) for the majority of measurements when experienced physiotherapists performed them, while inexperienced physiotherapists performed only the basic, easy measurements precisely. This pilot study on healthy subjects showed that the majority of basic physiotherapeutic measurements are valid and reliable when performed by a specialized physiotherapist, and it can be expected that this protocol will gain high value when measurements on subjects with scoliosis are performed.
An assessment of the intra- and inter-reliability of the lumbar paraspinal muscle parameters using CT scan and magnetic resonance imaging.

PubMed

Hu, Zhi-Jun; He, Jian; Zhao, Feng-Dong; Fang, Xiang-Qian; Zhou, Li-Na; Fan, Shun-Wu

2011-06-01

A reliability study was conducted. To estimate the intra- and intermeasurement errors in the measurements of functional cross-sectional area (FCSA), density, and T2 signal intensity of paraspinal muscles using computed tomography (CT) scan and magnetic resonance imaging (MRI). CT scan and MRI have been widely used to measure the cross-sectional area and degeneration of the back muscles in spine and muscle research. However, no systematic study has yet analyzed the reliability of these measurements. This study measured the FCSA and fatty infiltration (density on CT scan and T2 signal intensity on MRI) of the paraspinal muscles at L3-L4, L4-L5, and L5-S1 in 29 patients with chronic low back pain. Two experienced musculoskeletal radiologists and one superior spine surgeon traced the region of interest twice within 3 weeks for measurement of the intra- and interobserver reliability. The intraclass correlation coefficients (ICCs) of the intra-reliability ranged from fair to excellent for FCSA, and good to excellent for fatty infiltration. The ICCs of the inter-reliability ranged from fair to excellent for FCSA, and good to excellent for fatty infiltration. There were no significant differences between CT scan and MRI in reliability results, except in the relative standard error of fatty infiltration measurement. The ICCs of the FCSA measurement between CT scan and MRI ranged from poor to good. The reliabilities of the CT scan and MRI for measuring the FCSA and fatty infiltration of the atrophied lumbar paraspinal muscles were acceptable. Using a uniform one-image method was reliable within a single paraspinal muscle evaluation study. The authors recommend MRI rather than CT scan for paraspinal muscle measurements of FCSA and fatty infiltration.
Acute Respiratory Distress Syndrome Measurement Error. Potential Effect on Clinical Study Results

PubMed Central

Cooke, Colin R.; Iwashyna, Theodore J.; Hofer, Timothy P.

2016-01-01

Rationale: Identifying patients with acute respiratory distress syndrome (ARDS) is a recognized challenge. Experts often have only moderate agreement when applying the clinical definition of ARDS to patients. However, no study has fully examined the implications of low reliability measurement of ARDS on clinical studies. Objectives: To investigate how the degree of variability in ARDS measurement commonly reported in clinical studies affects study power, the accuracy of treatment effect estimates, and the measured strength of risk factor associations. Methods: We examined the effect of ARDS measurement error in randomized clinical trials (RCTs) of ARDS-specific treatments and cohort studies using simulations. We varied the reliability of ARDS diagnosis, quantified as the interobserver reliability (κ-statistic) between two reviewers. In RCT simulations, patients identified as having ARDS were enrolled, and when measurement error was present, patients without ARDS could be enrolled. In cohort studies, risk factors as potential predictors were analyzed using reviewer-identified ARDS as the outcome variable. Measurements and Main Results: Lower reliability measurement of ARDS during patient enrollment in RCTs seriously degraded study power. Holding effect size constant, the sample size necessary to attain adequate statistical power increased by more than 50% as reliability declined, although the result was sensitive to ARDS prevalence. In a 1,400-patient clinical trial, the sample size necessary to maintain similar statistical power increased to over 1,900 when reliability declined from perfect to substantial (κ = 0.72). Lower reliability measurement diminished the apparent effectiveness of an ARDS-specific treatment from a 15.2% (95% confidence interval, 9.4–20.9%) absolute risk reduction in mortality to 10.9% (95% confidence interval, 4.7–16.2%) when reliability declined to moderate (κ = 0.51). In cohort studies, the effect on risk factor associations was similar. Conclusions: ARDS measurement error can seriously degrade statistical power and effect size estimates of clinical studies. The reliability of ARDS measurement warrants careful attention in future ARDS clinical studies. PMID:27159648
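The mechanism described above, in which enrolling patients without the condition dilutes the apparent treatment effect and inflates the required sample size, can be sketched with a deliberately simplified model. This is not the authors' simulation. It assumes that the treatment has no effect in misclassified patients, that baseline mortality is the same in both groups, and it uses a textbook normal-approximation sample-size formula. All function names and numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def diluted_risk_reduction(true_arr, fraction_true_cases):
    """Expected absolute risk reduction when only a fraction of enrolled
    patients truly have the condition and the rest do not benefit."""
    return true_arr * fraction_true_cases


def sample_size_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / effect ** 2)


# Hypothetical trial: 40% control mortality, 10-point true risk reduction.
control_mortality = 0.40
true_arr = 0.10

n_perfect = sample_size_per_arm(control_mortality, control_mortality - true_arr)

# Assume only 80% of enrolled patients truly have the condition.
observed_arr = diluted_risk_reduction(true_arr, 0.80)
n_diluted = sample_size_per_arm(control_mortality, control_mortality - observed_arr)

print(n_perfect, n_diluted)
```

Because the required sample size grows roughly with the inverse square of the effect, even a modest share of misclassified patients raises it substantially, which is the qualitative pattern the abstract reports.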
Reliability of the Cardiff Test of basic life support and automated external defibrillation version 3.1.

PubMed

Whitfield, Richard H; Newcombe, Robert G; Woollard, Malcolm

2003-12-01

The introduction of the European Resuscitation Guidelines (2000) for cardiopulmonary resuscitation (CPR) and automated external defibrillation (AED) prompted the development of an up-to-date and reliable method of assessing the quality of performance of CPR in combination with the use of an AED. The Cardiff Test of basic life support (BLS) and AED version 3.1 was developed to meet this need and uses standardised checklists to retrospectively evaluate performance from analyses of video recordings and data drawn from a laptop computer attached to a training manikin. This paper reports the inter- and intra-observer reliability of this test. Data used to assess reliability were obtained from an investigation of CPR and AED skill acquisition in a lay responder AED training programme. Six observers were recruited to evaluate performance in 33 data sets, repeating their evaluation after a minimum interval of 3 weeks. More than 70% of the 42 variables considered in this study had a kappa score of 0.70 or above for inter-observer reliability or were drawn from computer data and therefore not subject to evaluator variability. 85% of the 42 variables had kappa scores for intra-observer reliability of 0.70 or above or were drawn from computer data. The standard deviations for inter- and intra-observer measures of time to first shock were 11.6 and 7.7 s, respectively. The inter- and intra-observer reliability for the majority of the variables in the Cardiff Test of BLS and AED version 3.1 is satisfactory. However, reliability is less acceptable with respect to shaking when checking for responsiveness, initial check/clearing of the airway, checks for signs of circulation, time to first shock and performance of interventions in the correct sequence. Further research is required to determine if modifications to the method of assessing these variables can increase reliability.
Ultrasound anatomy in the normal neonatal and infant foot: an anatomic introduction to ultrasound assessment of foot deformities.

PubMed

Aurell, Y; Johansson, A; Hansson, G; Wallander, H; Jonsson, K

2002-09-01

The aim of this study was to establish guidelines for US assessment of the talo-crural, the talo-navicular and the calcaneo-cuboid joints during the first year of life, which could serve as a reference while studying foot deformities. The feet of 54 healthy children were examined at birth and at the age of 4, 7 and 12 months by using three easily defined and reproducible US projections. With a medial projection, the position of the navicular in relation to the medial malleolus and the head of the talus was studied. A lateral projection revealed the calcaneo-cuboid relationship and a dorsal projection the talo-navicular alignment in the sagittal plane. Normal values for measurements of these cartilaginous relationships were established for the different age groups. Intra- and inter-observer reliability was assessed and found to be acceptable (r = 0.53-0.90, Pearson correlation coefficient). With US it is possible to obtain reproducible planes of investigation that give reliable information about the talo-crural, the talo-navicular and the calcaneo-cuboid relationships during the first year of life.
Assessment of fatty degeneration of the gluteal muscles in patients with THA using MRI: reliability and accuracy of the Goutallier and quartile classification systems.

PubMed

Engelken, Florian; Wassilew, Georgi I; Köhlitz, Torsten; Brockhaus, Sebastian; Hamm, Bernd; Perka, Carsten; Diederichs, Gerd

2014-01-01

The purpose of this study was to quantify the performance of the Goutallier classification for assessing fatty degeneration of the gluteus muscles from magnetic resonance (MR) images and to compare its performance to a newly proposed system. Eighty-four hips with clinical signs of gluteal insufficiency and 50 hips from asymptomatic controls were analyzed using a standard classification system (Goutallier) and a new scoring system (Quartile). Interobserver reliability and intraobserver repeatability were determined, and accuracy was assessed by comparing readers' scores with quantitative estimates of the proportion of intramuscular fat based on MR signal intensities (gold standard). The existing Goutallier classification system and the new Quartile system performed equally well in assessing fatty degeneration of the gluteus muscles, both showing excellent levels of interrater and intrarater agreement. While the Goutallier classification system has the advantage of being widely known, the benefit of the Quartile system is that it is based on more clearly defined grades of fatty degeneration. Copyright © 2014 Elsevier Inc. All rights reserved.
Validity and reliability of haemoglobin colour scale and its comparison with clinical signs in diagnosing anaemia in pregnancy in Ahmedabad, India.

PubMed

Bala, D V; Vyas, S; Shukla, A; Tiwari, H; Bhatt, G; Gupta, K

2012-07-01

This study compared the validity of the haemoglobin colour scale (HCS) and clinical signs in diagnosing anaemia against Sahli's haemoglobinometer method as the gold standard, and assessed the reliability of HCS. The sample comprised 129 pregnant women recruited from 6 urban health centres in Ahmedabad. The prevalence of anaemia was 69.8% by Sahli's method, 78.3% by HCS and 89.9% by clinical signs; there was no statistically significant difference between Sahli's method and HCS, whereas there was one between Sahli's method and clinical signs. The mean haemoglobin level by Sahli's method and HCS differed significantly. The sensitivity, specificity, positive predictive value and negative predictive value of HCS were 83.3%, 33.3%, 74.3% and 46.4% respectively, and those of clinical signs were 91.1%, 12.8%, 70.7% and 38.5% respectively. Interobserver agreement for HCS was moderate (κ = 0.43). Clinical signs are better than HCS for diagnosing anaemia. HCS can be used in the field provided assessors are adequately trained.
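The four validity measures quoted above all derive from a 2×2 table of test result against gold standard. The sketch below shows the arithmetic. The cell counts are back-calculated here from the reported percentages and the sample of 129 women. They are an assumption for illustration, not figures given in the abstract, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard validity measures from a 2x2 table against a gold standard."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among the diseased
        "specificity": tn / (tn + fp),  # true negatives among the healthy
        "ppv": tp / (tp + fp),          # diseased among test positives
        "npv": tn / (tn + fn),          # healthy among test negatives
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Assumed counts for HCS vs Sahli's method: 90 anaemic, 39 non-anaemic women.
hcs = diagnostic_metrics(tp=75, fp=26, fn=15, tn=13)

for name, value in hcs.items():
    print(f"{name}: {value * 100:.1f}%")
```

Under these assumed counts the low specificity means most non-anaemic women are flagged as anaemic, so a positive HCS reading adds little beyond the high baseline prevalence.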
Development of a pneumatic tensioning device for gap measurement during total knee arthroplasty.

PubMed

Kwak, Dai-Soon; Kong, Chae-Gwan; Han, Seung-Ho; Kim, Dong-Hyun; In, Yong

2012-09-01

Despite the importance of soft tissue balancing during total knee arthroplasty (TKA), all estimating techniques are dependent on a surgeon's manual distraction force or subjective feeling based on experience. We developed a new device for dynamic gap balancing, which can offer constant load to the gap between the femur and tibia, using pneumatic pressure during range of motion. To determine the amount of distraction force for the new device, 3 experienced surgeons' manual distraction force was measured using a conventional spreader. A new device called the consistent load pneumatic tensor was developed on the basis of the biomechanical tests. Reliability testing for the new device was performed using 5 cadaveric knees by the same surgeons. Intraclass correlation coefficients (ICCs) were calculated. The distraction force applied to the new pneumatic tensioning device was determined to be 150 N. The interobserver reliability was very good for the newly tested spreader device with ICCs between 0.828 and 0.881. The new pneumatic tensioning device can enable us to properly evaluate the soft tissue balance throughout the range of motion during TKA with acceptable reproducibility.
Development of a Valid and Reliable Knee Articular Cartilage Condition-Specific Study Methodological Quality Score.

PubMed

Harris, Joshua D; Erickson, Brandon J; Cvetanovich, Gregory L; Abrams, Geoffrey D; McCormick, Frank M; Gupta, Anil K; Verma, Nikhil N; Bach, Bernard R; Cole, Brian J

2014-02-01

Background: Condition-specific questionnaires are important components in evaluation of outcomes of surgical interventions. No condition-specific study methodological quality questionnaire exists for evaluation of outcomes of articular cartilage surgery in the knee. Purpose: To develop a reliable and valid knee articular cartilage-specific study methodological quality questionnaire. Study design: Cross-sectional study. Methods: A stepwise, a priori-designed framework was created for development of a novel questionnaire. Relevant items to the topic were identified and extracted from a recent systematic review of 194 investigations of knee articular cartilage surgery. In addition, relevant items from existing generic study methodological quality questionnaires were identified. Items for a preliminary questionnaire were generated. Redundant and irrelevant items were eliminated, and acceptable items modified. The instrument was pretested and items weighed. The instrument, the MARK score (Methodological quality of ARticular cartilage studies of the Knee), was tested for validity (criterion validity) and reliability (inter- and intraobserver). Results: A 19-item, 3-domain MARK score was developed. The 100-point scale score demonstrated face validity (focus group of 8 orthopaedic surgeons) and criterion validity (strong correlation to Cochrane Quality Assessment score and Modified Coleman Methodology Score). Interobserver reliability for the overall score was good (intraclass correlation coefficient [ICC], 0.842), and for all individual items of the MARK score, acceptable to perfect (ICC, 0.70-1.000). Intraobserver reliability ICC assessed over a 3-week interval was strong for 2 reviewers (≥0.90). Conclusion: The MARK score is a valid and reliable knee articular cartilage condition-specific study methodological quality instrument. This condition-specific questionnaire may be used to evaluate the quality of studies reporting outcomes of articular cartilage surgery in the knee.
Sonographic measurements of the achilles tendon, plantar fascia, and heel fat pad are reliable: A test-retest intra- and intertester study.

PubMed

Johannsen, Finn; Jensen, Signe; Stallknecht, Sandra E; Olsen, Lars Otto; Magnusson, S Peter

2016-10-01

To determine intra- and interobserver reliability and precision of sonographic (US) scanning in measuring thickness of the Achilles tendon, plantar fascia, and heel fat pad in patients with heel pain. Seventeen consecutive patients referred with heel pain were included. Two evaluators blinded to the diagnosis independently performed US scanning of both feet without any dialogue with the patient. The examiner left the room, and the next examiner entered. All patients had two US scans performed by each examiner. Two months later, the US images were randomly presented to the evaluators for measurements. Reliability and agreement were assessed by calculation of intraclass correlation coefficient (ICC), 95% limits of agreement (LOA), and typical error (TE). LOA was calculated as a percentage of the mean thickness of each structure to obtain a unitless parameter. We found excellent intratester reliability (ICC 0.78-0.98) and good intertester reliability using one measurement (ICC 0.72-0.91) and excellent (ICC 0.85-0.95) when using average of two measurements. The intratester agreements were good with LOA: 9.5-23.4% and TE: 3.4-8.4%. The intertester agreements were acceptable using one measurement with LOA: 16.1-36.4%, and better using two measurements with LOA: 14.4-33.2%. US is a reliable technique of measurement in the daily clinic, and one single measurement is sufficient. In research, we recommend that the same observer performs the US measurements, if one single scanning is preferred; if more researchers are involved, the average measurement of two US scans is recommended. © 2016 Wiley Periodicals, Inc. J Clin Ultrasound 44:480-486, 2016.
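The abstract does not spell out its formulas, so the following is only a sketch of one common way to obtain percentage limits of agreement and typical error from paired measurements. It assumes LOA is 1.96 times the standard deviation of the paired differences and TE is that standard deviation divided by the square root of 2, both expressed as a percentage of the mean thickness. The function name `agreement_stats` and the thickness values are hypothetical, not the study's data.

```python
from statistics import mean, stdev

def agreement_stats(first, second):
    """Return (bias, LOA %, TE %) for two paired series of measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    grand_mean = mean(first + second)       # mean thickness over both scans
    sd_diff = stdev(diffs)                  # sample SD of the paired differences
    bias = mean(diffs)                      # systematic difference between scans
    loa_pct = 100 * 1.96 * sd_diff / grand_mean        # half-width of the 95% LOA
    te_pct = 100 * (sd_diff / 2 ** 0.5) / grand_mean   # typical error
    return bias, loa_pct, te_pct

# HYPOTHETICAL thicknesses in mm from two scans of the same five tendons.
scan_1 = [4.1, 4.8, 5.2, 3.9, 4.5]
scan_2 = [4.3, 4.6, 5.0, 4.1, 4.4]

bias, loa_pct, te_pct = agreement_stats(scan_1, scan_2)
print(f"bias {bias:.2f} mm, LOA ±{loa_pct:.1f}%, TE {te_pct:.1f}%")
```

Under these definitions LOA is always about 2.77 times TE, which is roughly the ratio between the intratester ranges quoted above (9.5-23.4% against 3.4-8.4%).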
An Instrument to Assess the Obesogenic Environment of Child Care Centers

ERIC Educational Resources Information Center

Ward, Dianne; Hales, Derek; Haverly, Katie; Marks, Julie; Benjamin, Sara; Ball, Sarah; Trost, Stewart

2008-01-01

Objectives: To describe protocol and interobserver agreements of an instrument to evaluate nutrition and physical activity environments at child care. Methods: Interobserver data were collected from 9 child care centers, through direct observation and document review (17 observer pairs). Results: Mean agreement between observer pairs was 87.26%…
The inter-observer agreement in the assessment of carotid plaque neovascularization by contrast-enhanced ultrasonography: The impact of plaque thickness.

PubMed

Chen, Jian; Zhang, Yan-Ming; Song, Ze-Zhou; Fu, Yan-Fei; Geng, Yu

2018-04-10

The interobserver agreement in the assessment of the grade of carotid plaque neovascularization by contrast-enhanced ultrasonography is poorly established. We examined 140 carotid plaques in 66 patients (all patients had bilateral plaques, and 8 patients had 2 plaques on one side). We performed conventional and contrast-enhanced ultrasonography to analyze the presence of carotid plaque neovascularization, which was graded by two independent observers whose interobserver agreement (κ) was evaluated according to the thickness of carotid plaque. For all carotid plaques, the mean κ was 0.689 (95% confidence interval 0.604-0.774). It was 0.689 (0.569-0.808), 0.637 (0.487-0.787), and 0.740 (0.585-0.896), respectively for carotid plaques with maximal thickness <2 mm, from 2 mm to 3 mm, and >3 mm. The interobserver agreement for assessing carotid plaque neovascularization by using contrast-enhanced ultrasonography is substantial and acceptable for research purposes, regardless of the maximal thickness of the plaque. © 2018 Wiley Periodicals, Inc.
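For readers unfamiliar with κ, a minimal sketch of unweighted Cohen's kappa for two observers follows. The abstract does not say whether a weighted or unweighted statistic was used, so the unweighted form is an assumption, and the grades below are invented for illustration.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items."""
    n = len(rater_a)
    # Proportion of items on which the two raters give the same grade.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal grade frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# HYPOTHETICAL neovascularization grades (0-2) for ten plaques.
observer_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
observer_2 = [0, 0, 1, 2, 2, 2, 0, 1, 1, 1]

kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.3f}")
```

Kappa discounts the agreement two observers would reach by chance, which is why it is lower than the raw proportion of matching grades.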
Automatic training and reliability estimation for 3D ASM applied to cardiac MRI segmentation

NASA Astrophysics Data System (ADS)

Tobon-Gomez, Catalina; Sukno, Federico M.; Butakoff, Constantine; Huguet, Marina; Frangi, Alejandro F.

2012-07-01

Training active shape models requires collecting manual ground-truth meshes in a large image database. While shape information can be reused across multiple imaging modalities, intensity information needs to be imaging modality and protocol specific. In this context, this study has two main purposes: (1) to test the potential of using intensity models learned from MRI simulated datasets and (2) to test the potential of including a measure of reliability during the matching process to increase robustness. We used a population of 400 virtual subjects (XCAT phantom), and two clinical populations of 40 and 45 subjects. Virtual subjects were used to generate simulated datasets (MRISIM simulator). Intensity models were trained both on simulated and real datasets. The trained models were used to segment the left ventricle (LV) and right ventricle (RV) from real datasets. Segmentations were also obtained with and without reliability information. Performance was evaluated with point-to-surface and volume errors. Simulated intensity models obtained average accuracy comparable to inter-observer variability for LV segmentation. The inclusion of reliability information reduced volume errors in hypertrophic patients (EF errors from 17 ± 57% to 10 ± 18%; LV MASS errors from -27 ± 22 g to -14 ± 25 g), and in heart failure patients (EF errors from -8 ± 42% to -5 ± 14%). The RV model of the simulated images needs further improvement to better resemble image intensities around the myocardial edges. Both for real and simulated models, reliability information increased segmentation robustness without penalizing accuracy.
Defining ulnar variance in the adolescent wrist: measurement technique and interobserver reliability.

PubMed

Goldfarb, Charles A; Strauss, Nicole L; Wall, Lindley B; Calfee, Ryan P

2011-02-01

The measurement technique for ulnar variance in the adolescent population has not been well established. The purpose of this study was to assess the reliability of a standard ulnar variance assessment in the adolescent population. Four orthopedic surgeons measured 138 adolescent wrist radiographs for ulnar variance using a standard technique. There were 62 male and 76 female radiographs obtained in a standardized fashion for subjects aged 12 to 18 years. Skeletal age was used for analysis. We determined mean variance and assessed for differences related to age and gender. We also determined the interrater reliability. The mean variance was -0.7 mm for boys and -0.4 mm for girls; there was no significant difference between the 2 groups overall. When subdivided by age and gender, the younger group (≤ 15 y of age) was significantly less negative for girls (boys, -0.8 mm and girls, -0.3 mm, p < .05). There was no significant difference between boys and girls in the older group. The greatest difference between any 2 raters was 1 mm; exact agreement was obtained in 72 subjects. Correlations between raters were high (r(p) 0.87-0.97 in boys and 0.82-0.96 for girls). Interrater reliability was excellent (Cronbach's alpha, 0.97-0.98). Standard assessment techniques for ulnar variance are reliable in the adolescent population. Open growth plates did not interfere with this assessment. Young adolescent boys demonstrated a greater degree of negative ulnar variance compared with young adolescent girls. Copyright © 2011 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
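Cronbach's alpha, used above as the interrater reliability index, can be computed by treating each rater as an "item". The sketch below uses the standard formula; the function name `cronbach_alpha` and the ulnar variance values are hypothetical and do not come from the study.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """Cronbach's alpha; `ratings` holds one list per rater, one value per subject."""
    k = len(ratings)
    # Sum of each rater's sample variance across subjects.
    rater_variances = sum(variance(r) for r in ratings)
    # Variance of the per-subject totals summed over raters.
    totals = [sum(subject) for subject in zip(*ratings)]
    return k / (k - 1) * (1 - rater_variances / variance(totals))

# HYPOTHETICAL ulnar variance (mm) for six wrists measured by four raters.
ratings = [
    [-1.0, -0.5, 0.0, 0.5, -1.5, 1.0],
    [-1.0, -0.5, 0.5, 0.5, -1.0, 1.0],
    [-0.5, -0.5, 0.0, 1.0, -1.5, 1.0],
    [-1.0, 0.0, 0.0, 0.5, -1.5, 0.5],
]

alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha approaches 1 when the raters' measurements rise and fall together across subjects, so that little of the variance in the totals is rater-specific.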
Automatic training and reliability estimation for 3D ASM applied to cardiac MRI segmentation.

PubMed

Tobon-Gomez, Catalina; Sukno, Federico M; Butakoff, Constantine; Huguet, Marina; Frangi, Alejandro F

2012-07-07

Training active shape models requires collecting manual ground-truth meshes in a large image database. While shape information can be reused across multiple imaging modalities, intensity information needs to be imaging modality and protocol specific. In this context, this study has two main purposes: (1) to test the potential of using intensity models learned from MRI simulated datasets and (2) to test the potential of including a measure of reliability during the matching process to increase robustness. We used a population of 400 virtual subjects (XCAT phantom), and two clinical populations of 40 and 45 subjects. Virtual subjects were used to generate simulated datasets (MRISIM simulator). Intensity models were trained both on simulated and real datasets. The trained models were used to segment the left ventricle (LV) and right ventricle (RV) from real datasets. Segmentations were also obtained with and without reliability information. Performance was evaluated with point-to-surface and volume errors. Simulated intensity models obtained average accuracy comparable to inter-observer variability for LV segmentation. The inclusion of reliability information reduced volume errors in hypertrophic patients (EF errors from 17 ± 57% to 10 ± 18%; LV MASS errors from -27 ± 22 g to -14 ± 25 g), and in heart failure patients (EF errors from -8 ± 42% to -5 ± 14%). The RV model of the simulated images needs further improvement to better resemble image intensities around the myocardial edges. Both for real and simulated models, reliability information increased segmentation robustness without penalizing accuracy.
Reliability of Causality Assessment for Drug, Herbal and Dietary Supplement Hepatotoxicity in the Drug-Induced Liver Injury Network (DILIN)

PubMed Central

Hayashi, Paul H.; Barnhart, Huiman X.; Fontana, Robert J.; Chalasani, Naga; Davern, Timothy J.; Talwalkar, Jayant A.; Reddy, K. Rajender; Stolz, Andrew A.; Hoofnagle, Jay H.; Rockey, Don C.

2014-01-01

Background: Due to the lack of objective tests to diagnose drug induced liver injury (DILI), causality assessment is a matter of debate. Expert opinion is often used in research and industry but its test-retest reliability is unknown. Aims: To determine the test-retest reliability of the expert opinion process used by the Drug-Induced Liver Injury Network (DILIN). Methods: Three DILIN hepatologists adjudicate suspected hepatotoxicity cases to 1 of 5 categories representing levels of likelihood of DILI. Adjudication is based on retrospective assessment of gathered case data that includes prospective follow-up information. One hundred randomly selected DILIN cases were re-assessed using the same processes for initial assessment but by 3 different reviewers in 92% of cases. Results: The median time between assessments was 938 days (range: 140–2352). Thirty-one cases involved >1 agent. Weighted kappa statistics for overall case and individual agent category agreement were 0.60 (95% CI: 0.50–0.71) and 0.60 (0.52–0.68), respectively. Overall case adjudications were within one category of each other 93% of the time, while 5% differed by 2 categories and 2% differed by 3 categories. Fourteen percent crossed the 50% threshold of likelihood due to competing diagnoses or atypical timing between drug exposure and injury. Conclusions: The DILIN expert opinion causality assessment method has moderate inter-observer reliability but very good agreement within 1 category. A small but important proportion of cases could not be reliably diagnosed as ≥ 50% likely to be DILI. PMID:24661785
A simple method of measuring tibial tubercle to trochlear groove distance on MRI: description of a novel and reliable technique.

PubMed

Camp, Christopher L; Heidenreich, Mark J; Dahm, Diane L; Bond, Jeffrey R; Collins, Mark S; Krych, Aaron J

2016-03-01

Tibial tubercle-trochlear groove (TT-TG) distance is a variable that helps guide surgical decision-making in patients with patellar instability. The purpose of this study was to compare the accuracy and reliability of an MRI TT-TG measuring technique using a simple external alignment method to a previously validated gold standard technique that requires advanced software read by radiologists. TT-TG was calculated by MRI on 59 knees with a clinical diagnosis of patellar instability in a blinded and randomized fashion by two musculoskeletal radiologists using advanced software and by two orthopaedists using the study technique which utilizes measurements taken on a simple electronic imaging platform. Interrater reliability between the two radiologists and the two orthopaedists and intermethods reliability between the two techniques were calculated using intraclass correlation coefficients (ICC) and concordance correlation coefficients (CCC). ICC and CCC values greater than 0.75 were considered to represent excellent agreement. The mean TT-TG distance was 14.7 mm (standard deviation (SD) 4.87 mm) and 15.4 mm (SD 5.41 mm) as measured by the radiologists and orthopaedists, respectively. Excellent interobserver agreement was noted between the radiologists (ICC 0.941; CCC 0.941), the orthopaedists (ICC 0.978; CCC 0.976), and the two techniques (ICC 0.941; CCC 0.933). The simple TT-TG distance measurement technique analysed in this study resulted in excellent agreement and reliability as compared to the gold standard technique. This method can predictably be performed by orthopaedic surgeons without advanced radiologic software. Level of evidence: II.
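The concordance correlation coefficient penalizes both scatter and systematic offset between two sets of measurements. A minimal sketch of Lin's CCC follows, using population (1/n) moments; the function name `concordance_ccc` and the distances are hypothetical and are not the study's data.

```python
from statistics import mean

def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    mean_x, mean_y = mean(x), mean(y)
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference in the denominator penalizes systematic bias.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

# HYPOTHETICAL TT-TG distances (mm): the second method reads 2 mm higher throughout.
method_a = [10.0, 12.0, 14.0, 16.0]
method_b = [12.0, 14.0, 16.0, 18.0]

ccc = concordance_ccc(method_a, method_b)
print(f"CCC = {ccc:.3f}")
```

In this invented example the two methods are perfectly correlated, yet the CCC falls below 1 because of the constant 2 mm offset, which is the property that distinguishes it from a plain correlation coefficient.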
