Science.gov

Sample records for good interobserver agreement

  1. Describing Peripancreatic Collections According to the Revised Atlanta Classification of Acute Pancreatitis: An International Interobserver Agreement Study.

    PubMed

    Bouwense, Stefan A; van Brunschot, Sandra; van Santvoort, Hjalmar C; Besselink, Marc G; Bollen, Thomas L; Bakker, Olaf J; Banks, Peter A; Boermeester, Marja A; Cappendijk, Vincent C; Carter, Ross; Charnley, Richard; van Eijck, Casper H; Freeny, Patrick C; Hermans, John J; Hough, David M; Johnson, Colin D; Laméris, Johan S; Lerch, Markus M; Mayerle, Julia; Mortele, Koenraad J; Sarr, Michael G; Stedman, Brian; Vege, Santhi Swaroop; Werner, Jens; Dijkgraaf, Marcel G; Gooszen, Hein G; Horvath, Karen D

    2017-08-01

    Severe acute pancreatitis is associated with peripancreatic morphologic changes as seen on imaging. Uniform communication regarding these morphologic findings is crucial for accurate diagnosis and treatment. For the original 1992 Atlanta classification, interobserver agreement is poor. We hypothesized that for the revised Atlanta classification, interobserver agreement will be better. An international, interobserver agreement study was performed among expert and nonexpert radiologists (n = 14), surgeons (n = 15), and gastroenterologists (n = 8). Representative computed tomographies of all stages of acute pancreatitis were selected from 55 patients and were assessed according to the revised Atlanta classification. The interobserver agreement was calculated among all reviewers and subgroups, that is, expert and nonexpert reviewers; interobserver agreement was defined as poor (≤0.20), fair (0.21-0.40), moderate (0.41-0.60), good (0.61-0.80), or very good (0.81-1.00). Interobserver agreement among all reviewers was good (0.75 [standard deviation, 0.21]) for describing the type of acute pancreatitis and good (0.62 [standard deviation, 0.19]) for the type of peripancreatic collection. Expert radiologists showed the best and nonexpert clinicians the lowest interobserver agreement. Interobserver agreement was good for the revised Atlanta classification, supporting the importance of widespread adoption of this revised classification for clinical and research communications.
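
    The agreement bands quoted in this abstract can be expressed as a small lookup. The sketch below is illustrative only: the function name `agreement_label` is hypothetical, and treating each band's upper bound as inclusive (so that a value such as 0.205 counts as fair) is an assumption, since the abstract prints the bands to two decimals.

```python
def agreement_label(coefficient: float) -> str:
    """Map an agreement coefficient to the bands quoted in the abstract.

    Assumption: upper bounds are inclusive and gaps between printed bands
    (e.g. 0.205) fall into the next band up.
    """
    if coefficient > 1.0:
        raise ValueError("agreement coefficient cannot exceed 1.0")
    # (upper bound, label) pairs in ascending order.
    bands = [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good")]
    for upper, label in bands:
        if coefficient <= upper:
            return label
    return "very good"


# The two overall values reported in the abstract both fall in the "good" band.
print(agreement_label(0.75), agreement_label(0.62))
```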

  2. Inter-observer and intra-observer agreement on interpretation of uroflowmetry curves of kindergarten children.

    PubMed

    Chang, Shang-Jen; Yang, Stephen S D

    2008-12-01

    To evaluate the inter-observer and intra-observer agreement on the interpretation of uroflowmetry curves of children. Healthy kindergarten children were enrolled for evaluation of uroflowmetry. Uroflowmetry curves were classified as bell-shaped, tower, plateau, staccato and interrupted. Only the bell-shaped curves were regarded as normal. Two urodynamists evaluated the curves independently after reviewing the definitions of the different types of uroflowmetry curve. The senior urodynamist evaluated the curves twice 3 months apart. The final conclusion was made when consensus was reached. Agreement among observers was analyzed using kappa statistics. Of 190 uroflowmetry curves eligible for analysis, the intra-observer agreement in interpreting each type of curve and interpreting normalcy vs abnormality was good (kappa=0.71 and 0.68, respectively). Very good inter-observer agreement (kappa=0.81) on normalcy and good inter-observer agreement (kappa=0.73) on types of uroflowmetry were observed. Poor inter-observer agreement existed on the classification of specific types of abnormal uroflowmetry curves (kappa=0.07). Uroflowmetry is a good screening tool for normalcy of kindergarten children, while not a good tool to define the specific types of abnormal uroflowmetry.
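
    For two observers classifying the same items, the kappa statistic compares observed agreement with the agreement expected by chance. The sketch below implements Cohen's kappa, which is an assumption on our part: the abstract says only "kappa statistics". The function name and the sample labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Made-up curve labels for four recordings, for illustration only.
first = ["bell", "bell", "tower", "plateau"]
second = ["bell", "tower", "tower", "plateau"]
print(round(cohen_kappa(first, second), 3))
```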

  3. Prospective assessment of interobserver agreement for defecography in fecal incontinence.

    PubMed

    Dobben, Annette C; Wiersma, Tjeerd G; Janssen, Lucas W M; de Vos, Rien; Terra, Maaike P; Baeten, Cor G; Stoker, Jaap

    2005-11-01

    The primary aim of our study was to determine the interobserver agreement of defecography in diagnosing enterocele, anterior rectocele, intussusception, and anismus in fecal-incontinent patients. The subsidiary aim was to evaluate the influence of level of experience on interpreting defecography. Defecography was performed in 105 consecutive fecal-incontinent patients. Observers were classified by level of experience and their findings were compared with the findings of an expert radiologist. The quality of the expert radiologist's findings was evaluated by an intraobserver agreement procedure. Intraobserver agreement was good to very good except for anismus: incomplete evacuation after 30 sec (kappa, 0.55) and puborectalis impression (kappa, 0.54). Interobserver agreement for enterocele and rectocele was good (kappa, 0.66 for both) and for intussusception, fair (kappa, 0.29). Interobserver agreement for anismus: incomplete evacuation after 30 sec was moderate (kappa, 0.47), and for anismus: puborectalis impression was fair (kappa, 0.24). Agreement in grading of enterocele and rectocele was good (kappa, 0.64 and 0.72, respectively) and for intussusception, fair (kappa, 0.39). Agreement separated by experience level was very good for rectocele (kappa, 0.83) and grading of rectoceles (kappa, 0.83) and moderate for intussusception (kappa, 0.44) at the most experienced level. For enterocele and grading, experience level did not influence the reproducibility. Reproducibility for enterocele, anterior rectocele, and severity grading is good, but for intussusception is fair to moderate. For anismus, the diagnosis of incomplete evacuation after 30 sec is more reproducible than puborectalis impression. The level of experience seems to play a role in diagnosing anterior rectocele and its grading and in diagnosing intussusception.

  4. Intra- and inter-observer agreement on diagnosis of Dupuytren disease, measurements of severity of contracture, and disease extent.

    PubMed

    Broekstra, Dieuwke C; Lanting, Rosanne; Werker, Paul M N; van den Heuvel, Edwin R

    2015-08-01

    Dupuytren disease (DD) is a fibrosing disease affecting the palmar aponeurosis, and is mostly treated by surgery based on measurement of severity of flexion contracture of the fingers. Literature concerning the measurement reliability is scarce. This study aimed to determine the intra- and inter-observer agreement of four variables for diagnosing DD, determining severity of contracture, and disease extent. One of them is a new measurement on the area of nodules and cords for measuring the disease extent in early disease stages. An agreement study (n = 54) was performed by two trained investigators. Agreement was calculated per finger, based on an intraclass correlation coefficient (ICC) using a latent variable model on subjects for diagnosis and Tubiana stage. For total passive extension deficit (TPED) and the area of nodules and cords, agreement was calculated with an ICC using a one-way random effects model with subject as random effect. Inter-observer agreement was very good for diagnosing DD (ICC: 95.5%-99.9%) and good to very good for classifying Tubiana stage (ICC: 73.5%-94.9%). Agreements for area and TPED were moderate (middle finger) to very good (ICC: 48.4%-98.6% and 45.0%-99.5%, respectively). Intra-observer agreement was slightly higher on average than inter-observer agreement. Overall, the intra- and inter-observer agreement in diagnosing DD, and determining the severity of flexion contracture is high. Also, the newly introduced variable area of nodules and cords has high intra- and inter-observer agreement, indicating that it is suitable to measure disease extent.
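
    An ICC from a one-way random effects model with subject as the random effect can be computed from the between-subject and within-subject mean squares. The sketch below uses the standard single-measurement form, often written ICC(1,1); whether the authors used exactly this estimator is an assumption. The function name and the sample measurements are hypothetical.

```python
def icc_oneway(ratings):
    """ICC(1,1) from a one-way random-effects ANOVA.

    `ratings` is a list of rows, one row per subject, each holding the
    same number of repeated measurements (e.g. one per observer).
    """
    n = len(ratings)
    k = len(ratings[0]) if ratings else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 subjects with the same 2+ measurements each")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all measurements are identical")
    return (ms_between - ms_within) / denominator


# Made-up measurements: three subjects, two observers each.
print(icc_oneway([[1, 2], [2, 3], [3, 4]]))
```

    Multiplying the result by 100 gives the percentage form used in the abstract.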

  5. Gastritis staging: interobserver agreement by applying OLGA and OLGIM systems.

    PubMed

    Isajevs, Sergejs; Liepniece-Karele, Inta; Janciauskas, Dainius; Moisejevs, Georgijs; Putnins, Viesturs; Funka, Konrads; Kikuste, Ilze; Vanags, Aigars; Tolmanis, Ivars; Leja, Marcis

    2014-04-01

    Atrophic gastritis remains a difficult histopathological diagnosis with low interobserver agreement. The aim of our study was to compare gastritis staging and interobserver agreement between general and expert gastrointestinal (GI) pathologists using Operative Link for Gastritis Assessment (OLGA) and Operative Link on Gastric Intestinal Metaplasia (OLGIM). We enrolled 835 patients undergoing upper endoscopy in the study. Two general and two expert gastrointestinal pathologists graded biopsy specimens according to the Sydney classification, and the stage of gastritis was assessed by the OLGA and OLGIM systems. Using OLGA, 280 (33.4%) patients had gastritis (stage I-IV), whereas with OLGIM this was 167 (19.9%). OLGA stage III-IV gastritis was observed in 25 patients, whereas by OLGIM stage III-IV was found in 23 patients. Interobserver agreement between expert GI pathologists for atrophy in the antrum, incisura angularis, and corpus was moderate (kappa = 0.53, 0.57 and 0.41, respectively, p < 0.0001), but almost perfect for intestinal metaplasia (kappa = 0.82, 0.80 and 0.81, respectively, p < 0.0001). However, interobserver agreement between general pathologists was poor for atrophy, but moderate for intestinal metaplasia. OLGIM staging provided the highest interobserver agreement, but a substantial proportion of potentially high-risk individuals would be missed if only OLGIM staging is applied. Therefore, we recommend using a combination of OLGA and OLGIM for staging of chronic gastritis.

  6. A Microsoft Excel(®) 2010 based tool for calculating interobserver agreement.

    PubMed

    Reed, Derek D; Azulay, Richard L

    2011-01-01

    This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel(®)) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work.
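
    Several of the algorithms named in this abstract reduce to simple ratios. The sketch below shows four of them (total count, interval-by-interval, scored-interval, and unscored-interval) using their commonly taught definitions; it is not a port of the authors' spreadsheet, and the function names and sample records are hypothetical.

```python
def total_count_ioa(count_a, count_b):
    """Smaller total divided by larger total, as a percentage."""
    if count_a == count_b:
        return 100.0  # also covers the case where both observers recorded zero
    return 100.0 * min(count_a, count_b) / max(count_a, count_b)


def interval_by_interval_ioa(a, b):
    """Percentage of intervals on which both observers made the same record."""
    if len(a) != len(b) or not a:
        raise ValueError("records must cover the same non-empty set of intervals")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def _restricted_ioa(a, b, target):
    """Agreement restricted to intervals where either observer recorded `target`."""
    if len(a) != len(b):
        raise ValueError("records must cover the same intervals")
    relevant = [(x, y) for x, y in zip(a, b) if x == target or y == target]
    if not relevant:
        raise ValueError("no interval was recorded with the target value")
    return 100.0 * sum(x == y for x, y in relevant) / len(relevant)


def scored_interval_ioa(a, b):
    """Considers only intervals in which at least one observer scored an occurrence."""
    return _restricted_ioa(a, b, 1)


def unscored_interval_ioa(a, b):
    """Considers only intervals in which at least one observer scored a nonoccurrence."""
    return _restricted_ioa(a, b, 0)


# Made-up interval records (1 = occurrence, 0 = nonoccurrence).
obs_a = [1, 1, 0, 0, 1]
obs_b = [1, 0, 0, 0, 1]
print(interval_by_interval_ioa(obs_a, obs_b))
```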

  7. A Microsoft Excel® 2010 Based Tool for Calculating Interobserver Agreement

    PubMed Central

    Azulay, Richard L

    2011-01-01

    This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel®) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work. PMID:22649578

  8. Are distal radius fracture classifications reproducible? Intra and interobserver agreement.

    PubMed

    Belloti, João Carlos; Tamaoki, Marcel Jun Sugawara; Franciozi, Carlos Eduardo da Silveira; Santos, João Baptista Gomes dos; Balbachevsky, Daniel; Chap Chap, Eduardo; Albertoni, Walter Manna; Faloppa, Flávio

    2008-05-01

    Various classification systems have been proposed for fractures of the distal radius, but the reliability of these classifications is seldom addressed. For a fracture classification to be useful, it must provide prognostic significance, interobserver reliability and intraobserver reproducibility. The aim here was to evaluate the intraobserver and interobserver agreement of distal radius fracture classifications. This was a validation study on interobserver and intraobserver reliability. It was developed in the Department of Orthopedics and Traumatology, Universidade Federal de São Paulo - Escola Paulista de Medicina. X-rays from 98 cases of displaced distal radius fracture were evaluated by five observers: one third-year orthopedic resident (R3), one sixth-year undergraduate medical student (UG6), one radiologist physician (XRP), one orthopedic trauma specialist (OT) and one orthopedic hand surgery specialist (OHS). The radiographs were classified on three different occasions (times T1, T2 and T3) using the Universal (Cooney), Arbeitsgemeinschaft für Osteosynthesefragen/Association for the Study of Internal Fixation (AO/ASIF), Frykman and Fernández classifications. The kappa coefficient (kappa) was applied to assess the degree of agreement. Among the three occasions, the highest mean intraobserver k was observed in the Universal classification (0.61), followed by Fernández (0.59), Frykman (0.55) and AO/ASIF (0.49). The interobserver agreement was unsatisfactory in all classifications. The Fernández classification showed the best agreement (0.44) and the worst was the Frykman classification (0.26). The low agreement levels observed in this study suggest that there is still no classification method with high reproducibility.

  9. Interobserver agreement in analysis of cardiotocograms recorded during trial of labor after cesarean.

    PubMed

    Caning, M M; Thisted, D L A; Amer-Wählin, I; Laier, G H; Krebs, L

    2018-05-17

    To examine interobserver agreement in intrapartum cardiotocography (CTG) classification in women undergoing trial of labor after a cesarean section (TOLAC) at term with or without complete uterine rupture. Nineteen blinded and independent Danish obstetricians assessed CTG tracings from 47 women (174 individual pages) with a complete uterine rupture during TOLAC and 37 women (133 individual pages) with no uterine rupture during TOLAC. Individual pages with CTG tracings lasting at least 20 min were evaluated by three different assessors and counted as an individual case. The tracings were analyzed according to the modified version of the Federation of Gynaecology and Obstetrics (FIGO) guidelines elaborated for the use of STAN (ST-analysis). Occurrence of defined abnormalities was recorded and the tracings were classified as normal, suspicious, pathological, or preterminal. The interobserver agreement was evaluated using Fleiss' kappa. Agreement on classification of a preterminal CTG was almost perfect. The interobserver agreement on normal, suspicious or pathological CTG was moderate to substantial. Regarding the presence of severe variable decelerations, the agreement was moderate. No statistical difference was found in the interobserver agreement between classification of tracings from women undergoing TOLAC with and without complete uterine rupture. The interobserver agreement on classification of CTG tracings from high-risk deliveries during TOLAC is best for assessment of a preterminal CTG and the poorest for the identification of severe variable decelerations.
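
    Fleiss' kappa extends chance-corrected agreement to a fixed number of assessors per item, as with the three assessors per page described here. The sketch below is a minimal implementation of the standard formula; the function name and the sample counts are hypothetical and are not data from the study.

```python
def fleiss_kappa(table):
    """Fleiss' kappa.

    `table[i][j]` is the number of assessors who placed item i in category j.
    Every item must be rated by the same number of assessors.
    """
    if not table:
        raise ValueError("table must not be empty")
    n_items = len(table)
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("each item needs the same number (2+) of ratings")
    n_categories = len(table[0])
    total = n_items * n_raters
    # Share of all ratings that fell in each category.
    p_cat = [sum(row[j] for row in table) / total for j in range(n_categories)]
    # Per-item agreement: proportion of assessor pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_item) / n_items
    p_chance = sum(p * p for p in p_cat)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_chance) / (1.0 - p_chance)


# Made-up counts: 4 pages, 3 assessors, categories
# (normal, suspicious, pathological, preterminal).
pages = [[3, 0, 0, 0], [1, 2, 0, 0], [0, 1, 2, 0], [0, 0, 0, 3]]
print(round(fleiss_kappa(pages), 3))
```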

  10. Pre-operative Duplex Ultrasonography in Arteriovenous Fistula Creation: Intra- and Inter-observer Agreement.

    PubMed

    Zonnebeld, Niek; Maas, Tommy M G; Huberts, Wouter; van Loon, Magda M; Delhaas, Tammo; Tordoir, Jan H M

    2017-11-01

    Although clinical guidelines on arteriovenous fistula (AVF) creation advocate minimum luminal arterial and venous diameters, assessed by duplex ultrasonography (DUS), the clinical value of routine DUS examination is under debate. DUS might be an insufficiently repeatable and/or reproducible imaging modality because of its operator dependency. The present study aimed to assess intra- and inter-observer agreement of DUS examination in support of AVF surgery planning. Ten end stage renal disease patients were included, to assess intra- and inter-observer agreement of pre-operative DUS measurements. All measurements were performed by two trained and experienced vascular technicians, blinded to measurement readings. From the routine DUS protocol, representative measurements (venous diameters, and arterial diameters and volume flow in the upper arm and forearm) were selected. For intra-observer agreement the measurements were performed in triplicate, with the probe released from the skin between each. Intraclass correlation coefficients were calculated for intra- and inter-observer agreement, and Bland-Altman plots used to graphically display mean measurement differences and limits of agreement. Ten patients (6 male, 59.4±19.7 years) consented to participate, and all predefined measurements were obtained. Intraclass correlation coefficients for intra-observer agreement of diameter measurements were at least 0.90 (95% CI 0.74-0.97; radial artery). Inter-observer agreement was at least 0.83 (0.46-0.96; lateral diameter upper arm cephalic vein). The Bland-Altman plots showed acceptable mean measurement differences and limits of agreement. In experienced hands, excellent intra- and inter-observer agreement can be reached for the discrete pre-operative DUS measurements advocated in clinical guidelines. DUS is therefore a reliable imaging modality to support AVF surgery planning. The content of DUS protocols, however, needs further standardisation.
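
    The quantities a Bland-Altman plot displays are the mean difference between paired measurements and the limits of agreement around it. The sketch below computes them with the conventional mean ± 1.96 SD limits; that convention is an assumption, since the abstract does not state how the limits were derived. The function name and the sample diameters are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(first, second):
    """Return (mean difference, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(first, second)]
    bias = mean(differences)
    spread = stdev(differences)  # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Made-up vessel diameters (mm) from two observers, for illustration only.
observer_1 = [2.0, 3.0, 4.0]
observer_2 = [1.0, 3.0, 3.0]
bias, lower, upper = bland_altman_limits(observer_1, observer_2)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```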

  11. Interobserver agreement on histopathological lesions in class III or IV lupus nephritis.

    PubMed

    Wilhelmus, Suzanne; Cook, H Terence; Noël, Laure-Hélène; Ferrario, Franco; Wolterbeek, Ron; Bruijn, Jan A; Bajema, Ingeborg M

    2015-01-07

    To treat lupus nephritis effectively, proper identification of the histologic class is essential. Although the classification system for lupus nephritis is nearly 40 years old, remarkably few studies have investigated interobserver agreement. Interobserver agreement among nephropathologists was studied, particularly with respect to the recognition of class III/IV lupus nephritis lesions, and possible causes of disagreement were determined. A link to a survey containing pictures of 30 glomeruli was provided to all 360 members of the Renal Pathology Society; 34 responses were received from 12 countries (a response rate of 9.4%). The nephropathologist was asked whether glomerular lesions were present that would categorize the biopsy as class III/IV. If so, additional parameters were scored. To determine the interobserver agreement among the participants, κ or intraclass correlation values were calculated. The intraclass correlation or κ-value was also calculated for two separate levels of experience (specifically, nephropathologists who were new to the field or moderately experienced [less experienced] and nephropathologists who were highly experienced). Intraclass correlation for the presence of a class III/IV lesion was 0.39 (poor). The κ/intraclass correlation values for the additional parameters were as follows: active, chronic, or both: 0.36; segmental versus global: 0.39; endocapillary proliferation: 0.46; influx of inflammatory cells: 0.32; swelling of endothelial cells: 0.46; extracapillary proliferation: 0.57; type of crescent: 0.46; and wire loops: 0.35. The highly experienced nephropathologists had significantly less interobserver variability compared with the less experienced nephropathologists (P=0.004). There is generally poor agreement in terms of recognizing class III/IV lesions. Because experience clearly increases interobserver agreement, this agreement may be improved by training nephropathologists. These results also underscore the importance of

  12. Inter-observer and intra-observer agreement between embryologists during selection of a single Day 5 embryo for transfer: a multicenter study.

    PubMed

    Storr, Ashleigh; Venetis, Christos A; Cooke, Simon; Kilani, Suha; Ledger, William

    2017-02-01

    What is the inter-observer and intra-observer agreement between embryologists when selecting a single Day 5 embryo for transfer? The inter-observer and intra-observer agreement between embryologists when selecting a single Day 5 embryo for transfer was generally good, although not optimal, even among experienced embryologists. Previous research on the morphological assessment of early stage (two pronuclei to Day 3) embryos has shown varying levels of inter-observer and intra-observer agreement. However, single blastocyst transfer is now becoming increasingly popular and there are no published data that assess inter-observer and intra-observer agreement when selecting a single embryo for Day 5 transfer. This was a prospective study involving 10 embryologists working at five different IVF clinics within a single organization between July 2013 and November 2015. The top 10 embryologists were selected based on their yearly Quality Assurance Program scores for blastocyst grading and were asked to morphologically grade all Day 5 embryos and choose a single embryo for transfer in a survey of 100 cases using 2D images. A total of 1000 decisions were therefore assessed. For each case, Day 5 images were shown, followed by a Day 3 and Day 5 image of the same embryo. Subgroup analyses were also performed based on the following characteristics of embryologists: the level of clinical embryology experience in the laboratory; amount of research experience; number of days per week spent grading embryos. The agreement between these embryologists and the one that scored the embryos on the actual day of transfer was also evaluated. Inter-observer and intra-observer variability was assessed using the kappa coefficient to evaluate the extent of agreement. This study showed that all 10 embryologists agreed on the embryo chosen for transfer in 50 out of 100 cases. In 93 out of 100 cases, at least 6 out of the 10 embryologists agreed. The inter-observer and intra-observer agreement among

  13. Interobserver Agreement on First-Stage Conversation Analytic Transcription

    ERIC Educational Resources Information Center

    Roberts, Felicia; Robinson, Jeffrey D.

    2004-01-01

    This investigation assesses interobserver agreement on conversation analytic (CA) transcription. Four professional CA transcribers spent a maximum of 3 hours transcribing 2.5 minutes of a previously unknown, naturally occurring, mundane telephone call. Researchers unitized transcripts into words, sounds, silences, inbreaths, outbreaths, and laugh…

  14. Diagnosing Nodular Regenerative Hyperplasia of the Liver Is Thwarted by Low Interobserver Agreement.

    PubMed

    Jharap, Bindia; van Asseldonk, Dirk P; de Boer, Nanne K H; Bedossa, Pierre; Diebold, Joachim; Jonker, A Mieke; Leteurtre, Emmanuelle; Verheij, Joanne; Wendum, Dominique; Wrba, Fritz; Zondervan, Pieter E; Colombel, Jean-Frédéric; Reinisch, Walter; Mulder, Chris J J; Bloemena, Elisabeth; van Bodegraven, Adriaan A

    2015-01-01

    Nodular regenerative hyperplasia (NRH) of the liver is associated with several diseases and drugs. Clinical symptoms of NRH may vary from absence of symptoms to full-blown (non-cirrhotic) portal hypertension. However, diagnosing NRH is challenging. The objective of this study was to determine inter- and intraobserver agreement on the histopathologic diagnosis of NRH. Liver specimens (n=48) previously diagnosed as NRH, were reviewed for the presence of NRH by seven pathologists without prior knowledge of the original diagnosis or clinical background. The majority of the liver specimens were from thiopurine using inflammatory bowel disease patients. Histopathologic features contributing to NRH were also assessed. Criteria for NRH were modified by consensus and subsequently validated. Interobserver agreement was evaluated by using the standard kappa index. After review, definite NRH, inconclusive NRH and no NRH were found in 35% (23-40%), 21% (13-27%) and 44% (38-56%), respectively (median, IQR). The median interobserver agreement for NRH was poor (κ = 0.20, IQR 0.14-0.28). The intraobserver variability on NRH ranged between 14% and 71%. After modification of the criteria and exclusion of biopsies with technical shortcomings, the interobserver agreement on the diagnosis NRH was fair (κ = 0.45). The interobserver agreement on the histopathologic diagnosis of NRH was poor, even when assessed by well-experienced liver pathologists. Modification of the criteria of NRH based on consensus effort and exclusion of biopsies of poor quality led to a fairly increased interobserver agreement. The main conclusion of this study is that NRH is a clinicopathologic diagnosis that cannot reliably be based on histopathology alone.

  15. Interobserver agreement in CTG interpretation using the 2015 FIGO guidelines for intrapartum fetal monitoring.

    PubMed

    Rei, Mariana; Tavares, Sara; Pinto, Pedro; Machado, Ana P; Monteiro, Sofia; Costa, Antónia; Costa-Santos, Cristina; Bernardes, João; Ayres-De-Campos, Diogo

    2016-10-01

    Visual analysis of cardiotocographic (CTG) tracings has been shown to be prone to poor intra- and interobserver agreement when several interpretation guidelines are used, and this may have an important impact on the technology's performance. The aim of this study was to evaluate agreement in CTG interpretation using the new 2015 FIGO guidelines on intrapartum fetal monitoring. A pre-existing database of intrapartum CTG tracings was used to sequentially select 151 cases acquired with a fetal electrode, with duration exceeding 60 minutes, and signal loss less than 15%. These tracings were presented to six clinicians, three with more than 5 years' experience in the labor ward, and three with 5 or less years' experience. Observers were asked to evaluate tracings independently, to assess basic CTG features: baseline, variability, accelerations, decelerations, sinusoidal pattern, tachysystole, and to classify each tracing as normal, suspicious or pathologic, according to the 2015 FIGO guidelines on intrapartum fetal monitoring. Agreement between observers was evaluated using the proportions of agreement (Pa), with 95% confidence intervals (95%CI). A good interobserver agreement was found in the evaluation of most CTG features, but not bradycardia, reduced variability, saltatory pattern, absence of accelerations and absence of decelerations. For baseline classification Pa was 0.85 [0.82-0.90], for variability 0.82 [0.78-0.85], for accelerations 0.72 [0.68-0.75], for tachysystole 0.77 [0.74-0.81], for decelerations 0.92 [0.90-0.95], for variable decelerations 0.62 [0.58-0.65], for late decelerations 0.63 [0.59-0.66], for repetitive decelerations 0.73 [0.69-0.78], and for prolonged decelerations 0.81 [0.77-0.85]. For overall CTG classification, Pa were 0.60 [0.56-0.64], for classification as normal 0.67 [0.61-0.72], for suspicious 0.54 [0.48-0.60] and for pathologic 0.59 [0.51-0.66]. No differences in agreement according to the level of expertise were observed, except in the

  16. Interobserver Agreement on Arteriovenous Malformation Diffuseness Using Digital Subtraction Angiography.

    PubMed

    Braileanu, Maria; Yang, Wuyang; Caplan, Justin M; Lin, Li-Mei; Radvany, Martin G; Tamargo, Rafael J; Huang, Judy

    2016-11-01

    Arteriovenous malformation (AVM) diffuseness has been shown to be prognostic of treatment outcomes. We assessed interobserver agreement of AVM diffuseness among physicians of different specialty and training backgrounds using digital subtraction angiography (DSA). All research protocols were approved by the institutional review board for this retrospective chart review. In a single-blinded setting, 2 attending neurosurgeons, 1 attending interventional neuroradiologist, and 1 senior neurosurgical resident rated 80 DSA views of 36 AVMs as either compact or diffuse. Individual interobserver agreement and subgroup agreement were analyzed using κ agreement and intraclass correlation coefficient. Disagreement regarding AVM diffuseness occurred in 43.8% of all DSA views (n = 80). Interobserver κ agreement on AVM diffuseness using DSA views among 4 physicians ranged from fair (κ = 0.40 [95% confidence interval (CI) = 0.22-0.58]) to substantial (κ = 0.65 [95% CI = 0.48-0.81]), whereas total intraclass correlation coefficient was 0.81 (95% CI = 0.73-0.87). For the 36 AVMs, κ agreement ranged from fair (κ = 0.36 [95% CI = 0.13-0.60]) to moderate (κ = 0.57 [95% CI = 0.35-0.79]), whereas intraclass correlation coefficient among all 4 physicians was 0.68 (95% CI = 0.47-0.82). Moderate agreement on AVM diffuseness (n = 80) was found between attending and resident assessments (κ = 0.57 [95% CI = 0.39-0.75]) and between neurosurgeon and interventional neuroradiologist assessments (κ = 0.55 [95% CI = 0.37-0.73]). Agreement of individual physicians on AVM diffuseness varies from fair to substantial. Objective and three-dimensional measures of AVM diffuseness should be developed for consistent clinical application.

  17. Intraobserver and Interobserver Agreement of Structural and Functional Software Programs for Measuring Glaucoma Progression.

    PubMed

    Moreno-Montañés, Javier; Antón, Vanesa; Antón, Alfonso; Larrosa, José M; Martinez-de-la-Casa, José María; Rebolleda, Gema; Ussa, Fernando; García-Granero, Marta

    2017-04-01

    It is important to evaluate intraobserver and interobserver agreement using visual field (VF) testing and optical coherence tomography (OCT) software in order to understand whether the use of this software is sufficient to detect glaucoma progression and to make decisions regarding its treatment. To evaluate agreement in VF and OCT software among 5 glaucoma specialists. The printout pages from VF progression software and OCT progression software from 100 patients were randomized, and the 5 glaucoma specialists subjectively and independently evaluated them for glaucoma. Each image was classified as having no progression, questionable progression, or progression. The principal investigator classified the patients previously as without variability (normal) or with high variability among tests (difficult). Using both software, the specialists also evaluated whether the glaucoma damage had progressed and if treatment change was needed. One month later, the same observers reevaluated the patients in a different order to determine intraobserver reproducibility. Intraobserver and interobserver agreement was estimated using κ statistics and Gwet second-order agreement coefficient. The agreement was compared with other factors. Of the 100 observed patients, half were male and all were white; the mean (SD) age was 69.7 (14.1) years. Intraobserver agreement was substantial to almost perfect for VF software (overall κ [95% CI], 0.59 [0.46-0.72] to 0.87 [0.79-0.96]) and similar for OCT software (overall κ [95% CI], 0.59 [0.46-0.71] to 0.85 [0.76-0.94]). Interobserver agreement among the 5 glaucoma specialists with the VF progression software was moderate (κ, 0.48; 95% CI, 0.41-0.55) and similar to OCT progression software (κ, 0.52; 95% CI, 0.44-0.59). Interobserver agreement was substantial in images classified as having no progression but only fair in those classified as having questionable glaucoma progression or glaucoma progression. Interobserver agreement was fair

  18. Diagnosing Nodular Regenerative Hyperplasia of the Liver Is Thwarted by Low Interobserver Agreement

    PubMed Central

    Jharap, Bindia; van Asseldonk, Dirk P.; de Boer, Nanne K. H.; Bedossa, Pierre; Diebold, Joachim; Jonker, A. Mieke; Leteurtre, Emmanuelle; Verheij, Joanne; Wendum, Dominique; Wrba, Fritz; Zondervan, Pieter E.; Colombel, Jean-Frédéric; Reinisch, Walter; Mulder, Chris J. J.; Bloemena, Elisabeth; van Bodegraven, Adriaan A.

    2015-01-01

    Background and Aims: Nodular regenerative hyperplasia (NRH) of the liver is associated with several diseases and drugs. Clinical symptoms of NRH may vary from absence of symptoms to full-blown (non-cirrhotic) portal hypertension. However, diagnosing NRH is challenging. The objective of this study was to determine inter- and intraobserver agreement on the histopathologic diagnosis of NRH. Methods: Liver specimens (n=48) previously diagnosed as NRH, were reviewed for the presence of NRH by seven pathologists without prior knowledge of the original diagnosis or clinical background. The majority of the liver specimens were from thiopurine using inflammatory bowel disease patients. Histopathologic features contributing to NRH were also assessed. Criteria for NRH were modified by consensus and subsequently validated. Interobserver agreement was evaluated by using the standard kappa index. Results: After review, definite NRH, inconclusive NRH and no NRH were found in 35% (23-40%), 21% (13-27%) and 44% (38-56%), respectively (median, IQR). The median interobserver agreement for NRH was poor (κ = 0.20, IQR 0.14-0.28). The intraobserver variability on NRH ranged between 14% and 71%. After modification of the criteria and exclusion of biopsies with technical shortcomings, the interobserver agreement on the diagnosis NRH was fair (κ = 0.45). Conclusions: The interobserver agreement on the histopathologic diagnosis of NRH was poor, even when assessed by well-experienced liver pathologists. Modification of the criteria of NRH based on consensus effort and exclusion of biopsies of poor quality led to a fairly increased interobserver agreement. The main conclusion of this study is that NRH is a clinicopathologic diagnosis that cannot reliably be based on histopathology alone. PMID:26054009

  19. Lenke and King classification systems for adolescent idiopathic scoliosis: interobserver agreement and postoperative results

    PubMed Central

    Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali

    2011-01-01

    Purpose The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. Methods The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. Results A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Conclusion Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification’s priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method. PMID:22267934

  20. Lenke and King classification systems for adolescent idiopathic scoliosis: interobserver agreement and postoperative results.

    PubMed

    Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali

    2011-01-01

    The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification's priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method.

  1. Intra- and interobserver agreement for fetal cerebral measurements in 3D-ultrasonography.

    PubMed

    Albers, Maria E W A; Buisman, Erato T I A; Kahn, René S; Franx, Arie; Onland-Moret, N Charlotte; de Heus, Roel

    2018-04-10

    The aim of this study is to evaluate intra- and interobserver agreement for measurement of intracranial, cerebellar, and thalamic volume with the Virtual Organ Computer-aided AnaLysis (VOCAL) technique in three-dimensional ultrasound images, in comparison to two-dimensional measurements of these brain structures. Three-dimensional ultrasound images of the brains of 80 fetuses at 20-24 weeks' gestational age were obtained from YOUth, a Dutch prospective cohort study. Two observers performed offline measurement of the occipitofrontal diameter, intracranial volume, transcerebellar diameter, cerebellar volume, and thalamic width, area, and volume, independently. VOCAL was used for calculation of the volumes. The two-way random, single measures intraclass correlation coefficient (ICC) was used for analysis of agreement and Bland-Altman plots were constructed. Intra- and interobserver agreement was almost perfect for occipitofrontal diameter (intra ICC 0.88, 95% CI 0.82-0.92; inter ICC 0.91, 95% CI 0.85-0.94), intracranial volume (intra ICC 0.96, 95% CI 0.91-0.98; inter ICC 0.97, 95% CI 0.96-0.98) and transcerebellar diameter (intra ICC 0.91, 95% CI 0.86-0.94; inter ICC 0.86, 95% CI 0.78-0.91). For cerebellar volume, the intraobserver agreement was almost perfect (0.85, 95% CI 0.76-0.90), whereas the interobserver agreement was substantial (0.75, 95% CI 0.44-0.88). Agreement was only moderate for thalamic measurements. Bland-Altman plots for the volume measurements are normally distributed with acceptable mean differences and 95% limits of agreement. The intra- and interobserver agreement of the measurement of intracranial and cerebellar volume with VOCAL was almost perfect. These measurements are therefore reliable, and can be used to investigate fetal brain development. Thalamic measurements are not reliable enough. © 2018 Wiley Periodicals, Inc.
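
    The abstract names a two-way random, single measures ICC without giving the formula. The sketch below assumes the absolute-agreement form usually written ICC(2,1), built from the mean squares of a two-way ANOVA without replication. The function name and the ratings are illustrative only.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single measures, absolute agreement.

    ratings: list of rows, one row per subject, one column per observer.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), observers (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Illustrative ratings: six subjects scored by four observers.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

    Because this form penalizes systematic offsets between observers, it is generally lower than a consistency-type ICC on the same data.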

  2. Implementation of a Posted Schedule to Increase Class-Wide Interobserver Agreement Assessment

    ERIC Educational Resources Information Center

    Doucette, Stefanie; DiGennaro Reed, Florence D.; Reed, Derek D.; Maguire, Helena; Marquardt, Heidi

    2012-01-01

    The present study investigated the impact of an antecedent intervention in the form of a daily posted schedule on the interobserver agreement (IOA) assessment of educational goals implemented within a classroom at a private school serving individuals with disabilities. During baseline, the percentage of academic goals with interobserver agreement…

  3. Computed Tomography Assessment of Hepatic Metastases of Breast Cancer with Revised Response Evaluation Criteria in Solid Tumors (RECIST) Criteria (Version 1.1): Inter-Observer Agreement.

    PubMed

    Ghobrial, Fady Emil Ibrahim; Eldin, Manal Salah; Razek, Ahmed Abdel Khalek Abdel; Atwan, Nadia Ibrahim; Shamaa, Sameh Sayed Ahmed

    2017-01-01

    To assess inter-observer agreement of revised RECIST criteria (version 1.1) for computed tomography assessment of hepatic metastases of breast cancer. A prospective study was conducted in 28 female patients with breast cancer and with at least one measurable metastatic lesion in the liver that was treated with 3 cycles of anthracycline-based chemotherapy. All patients underwent computed tomography of the abdomen with 64-row multi-detector CT at baseline and after 3 cycles of chemotherapy for response assessment. Image analysis was performed by 2 observers, based on the RECIST criteria (version 1.1). Computed tomography revealed partial response of hepatic metastases in 7 patients (25%) by one observer and in 10 patients (35.7%) by the other observer, with good inter-observer agreement (k=0.75, percent agreement of 89.29%). Stable disease was detected in 19 patients (67.8%) by one observer and in 16 patients (57.1%) by the other observer, with good agreement (k=0.774, percent agreement of 89.29%). Progressive disease was detected in 2 patients (7.2%) by both observers, with perfect agreement (k=1, percent agreement of 100%). The overall inter-observer agreement in the CT-based response assessment of hepatic metastasis between the two observers was good (k=0.793, percent agreement of 89.29%). We concluded that computed tomography is a reliable and reproducible imaging modality for response assessment of hepatic metastases of breast cancer according to the RECIST criteria (version 1.1).
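
    The reported percent agreement and kappa for partial response can be checked with a short calculation. The abstract does not give the 2x2 cross-tabulation, so the cell counts below are an inference: they assume every patient rated as partial response by the first observer (7) was also rated so by the second (10), which leaves 3 discordant cases out of 28. The function name is hypothetical.

```python
def kappa_from_2x2(both_yes, only_a, only_b, both_no):
    """Percent agreement and Cohen's kappa from a 2x2 agreement table."""
    n = both_yes + only_a + only_b + both_no
    p_o = (both_yes + both_no) / n              # observed agreement
    a_yes = both_yes + only_a                   # observer A "yes" total
    b_yes = both_yes + only_b                   # observer B "yes" total
    # Agreement expected by chance from the two observers' marginals.
    p_e = (a_yes * b_yes + (n - a_yes) * (n - b_yes)) / n ** 2
    return p_o, (p_o - p_e) / (1 - p_e)


# Inferred table for partial response (not stated in the abstract).
p_o, kappa = kappa_from_2x2(both_yes=7, only_a=0, only_b=3, both_no=18)
print(f"{100 * p_o:.2f}% agreement, kappa = {kappa:.2f}")
# prints "89.29% agreement, kappa = 0.75"
```

    Under the same nesting assumption, the stable disease counts (16 concordant positive, 3 discordant, 9 concordant negative) give κ ≈ 0.774, which also matches the reported figure.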

  4. The Orientation of Gastric Biopsy Samples Improves the Inter-observer Agreement of the OLGA Staging System.

    PubMed

    Cotruta, Bogdan; Gheorghe, Cristian; Iacob, Razvan; Dumbrava, Mona; Radu, Cristina; Bancila, Ion; Becheanu, Gabriel

    2017-12-01

    Evaluation of severity and extension of gastric atrophy and intestinal metaplasia is recommended to identify subjects with a high risk for gastric cancer. The inter-observer agreement for the assessment of gastric atrophy is reported to be low. The aim of the study was to evaluate the inter-observer agreement for the assessment of severity and extension of gastric atrophy using oriented and unoriented gastric biopsy samples. Furthermore, the quality of biopsy specimens in oriented and unoriented samples was analyzed. A total of 35 subjects with dyspeptic symptoms referred for gastrointestinal endoscopy who agreed to enter the study were prospectively enrolled. The OLGA/OLGIM gastric biopsies protocol was used. From each subject two sets of biopsies were obtained (four from the antrum, two oriented and two unoriented; two from the gastric incisure, one oriented and one unoriented; four from the gastric body, two oriented and two unoriented). The orientation of the biopsy samples was completed using nitrocellulose filters (Endokit®, BioOptica, Milan, Italy). The samples were blindly examined by two experienced pathologists. Inter-observer agreement was evaluated using the kappa statistic for inter-rater agreement. The quality of histopathology specimens, taking into account the identification of lamina propria, was analyzed in oriented vs. unoriented samples. The samples with detectable lamina propria mucosae were defined as good quality specimens. Categorical data were analyzed using the chi-square test and a two-sided p value <0.05 was considered statistically significant. A total of 350 biopsy samples were analyzed (175 oriented / 175 unoriented). The kappa index values for oriented/unoriented OLGA 0/I/II/III and IV stages were 0.62/0.13, 0.70/0.20, 0.61/0.06, 0.62/0.46, and 0.77/0.50, respectively. For OLGIM 0/I/II/III stages the kappa index values for oriented/unoriented samples were 0.83/0.83, 0.88/0.89, 0.70/0.88 and 0.83/1, respectively. No case of OLGIM IV

  5. High inter-observer agreement of observer-perceived pain assessment in the emergency department.

    PubMed

    Hangaard, Martin Høhrmann; Malling, Brian; Mogensen, Christian Backer

    2018-02-21

    Triage is used to prioritize the patients in the emergency department. The majority of the triage systems include the patients' pain score to assess their level of acuity by using a combination of patient-reported pain and observer-perceived pain; the latter therefore requires a certain degree of inter-observer agreement. The aim of the present study was to assess the inter-observer agreement of perceived pain among emergency department nurses and to evaluate if it was influenced by predetermined factors like age and gender. A project assistant randomly recruited two nurses, who were not allowed to interact with each other, to assess patient pain intensity on the numeric rating scale (NRS). The project assistant afterwards entered the pain scores in a predesigned electronic questionnaire. We used weighted Fleiss-Cohen (quadratic) kappa statistics, Bland-Altman statistics and logistic regression analysis to assess the inter-observer agreement. One hundred and sixty-two patients were included. They had a median age of 38 years and 45% were females. 30% of the patients were acute surgical patients and 70% acute orthopedic patients. The average time between the pain assessments was 1.7 min. The Bland-Altman analysis found a mean difference in pain score of 0.2 and 95% limits of agreement of ±3 points. When the NRS scores were translated to commonly used pain categories (no, mild, moderate or severe pain) we found a 70% agreement with a mean difference in categories of 0.05 and 95% limits of agreement of ±1 category. Patient age, gender, localization of pain, examination room or presence of a significant other did not affect the inter-observer agreement. We found 70% agreement on pain category between the nurses and it is justified that nurse-perceived pain assessment is used for triage in the emergency department.
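
    The Bland-Altman figures quoted above (mean difference and 95% limits of agreement) follow from the paired differences between the two observers. A minimal sketch, assuming the conventional limits of mean difference ± 1.96 standard deviations; the function name and the scores are hypothetical, not the study's data:

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)                      # mean difference between observers
    half_width = 1.96 * stdev(diffs)        # sample SD of the differences
    return bias, bias - half_width, bias + half_width


# Hypothetical pain scores (0-10) from two nurses for four patients.
nurse_a = [5, 3, 7, 2]
nurse_b = [4, 4, 6, 3]
bias, lower, upper = bland_altman(nurse_a, nurse_b)
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

    The limits describe how far two observers' scores for the same patient can be expected to differ, which is a different question from the chance-corrected agreement that kappa answers.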

  6. Multicenter accuracy and interobserver agreement of spot sign identification in acute intracerebral hemorrhage.

    PubMed

    Huynh, Thien J; Flaherty, Matthew L; Gladstone, David J; Broderick, Joseph P; Demchuk, Andrew M; Dowlatshahi, Dar; Meretoja, Atte; Davis, Stephen M; Mitchell, Peter J; Tomlinson, George A; Chenkin, Jordan; Chia, Tze L; Symons, Sean P; Aviv, Richard I

    2014-01-01

    Rapid, accurate, and reliable identification of the computed tomography angiography spot sign is required to identify patients with intracerebral hemorrhage for trials of acute hemostatic therapy. We sought to assess the accuracy and interobserver agreement for spot sign identification. A total of 131 neurology, emergency medicine, and neuroradiology staff and fellows underwent imaging certification for spot sign identification before enrolling patients in 3 trials targeting spot-positive intracerebral hemorrhage for hemostatic intervention (STOP-IT, SPOTLIGHT, STOP-AUST). Ten intracerebral hemorrhage cases (spot-positive/negative ratio, 1:1) were presented for evaluation of spot sign presence, number, and mimics. True spot positivity was determined by consensus of 2 experienced neuroradiologists. Diagnostic performance, agreement, and differences by training level were analyzed. Mean accuracy, sensitivity, and specificity for spot sign identification were 87%, 78%, and 96%, respectively. Overall sensitivity was lower than specificity (P<0.001) because of true spot signs incorrectly perceived as spot mimics. Interobserver agreement for spot sign presence was moderate (k=0.60). When true spots were correctly identified, 81% correctly identified the presence of single or multiple spots. Median time needed to evaluate the presence of a spot sign was 1.9 minutes (interquartile range, 1.2-3.1 minutes). Diagnostic performance, interobserver agreement, and time needed for spot sign evaluation were similar among staff physicians and fellows. Accuracy for spot identification is high with opportunity for improvement in spot interpretation sensitivity and interobserver agreement particularly through greater reliance on computed tomography angiography source data and awareness of limitations of multiplanar images. Further prospective study is needed.
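
    Accuracy, sensitivity, and specificity as reported above are ratios of counts against the reference standard (here, the consensus of two neuroradiologists). A minimal sketch of the definitions; the function name and the counts are hypothetical and do not reproduce the study's figures:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positives among all positives
    specificity = tn / (tn + fp)            # true negatives among all negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical reader scoring ten cases (five spot-positive, five spot-negative).
sens, spec, acc = diagnostic_metrics(tp=4, fn=1, tn=5, fp=0)
print(sens, spec, acc)
```

    A true spot sign misread as a mimic counts as a false negative, which lowers sensitivity while leaving specificity unchanged.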

  7. Inter-Observer Agreement of Whole-Body Computed Tomography in Staging and Response Assessment in Lymphoma: The Lugano Classification.

    PubMed

    Razek, Ahmed Abdel Khalek Abdel; Shamaa, Sameh; Lattif, Mahmoud Abdel; Yousef, Hanan Hamid

    2017-01-01

    To assess inter-observer agreement of whole-body computed tomography (WBCT) in staging and response assessment in lymphoma according to the Lugano classification. Retrospective analysis was conducted of 115 consecutive patients with lymphomas (45 females, 70 males; mean age of 46 years). Patients underwent WBCT with a 64 multi-detector CT device for staging and response assessment after a complete course of chemotherapy. Image analysis was performed by 2 reviewers according to the Lugano classification for staging and response assessment. The overall inter-observer agreement of WBCT in staging of lymphoma was excellent (k=0.90, percent agreement=94.9%). There was an excellent inter-observer agreement for stage I (k=0.93, percent agreement=96.4%), stage II (k=0.90, percent agreement=94.8%), stage III (k=0.89, percent agreement=94.6%) and stage IV (k=0.88, percent agreement=94%). The overall inter-observer agreement in response assessment after a complete course of treatment was excellent (k=0.91, percent agreement=95.8%). There was an excellent inter-observer agreement in progressive disease (k=0.94, percent agreement=97.1%), stable disease (k=0.90, percent agreement=95%), partial response (k=0.96, percent agreement=98.1%) and complete response (k=0.87, percent agreement=93.3%). We concluded that WBCT is a reliable and reproducible imaging modality for staging and treatment assessment in lymphoma according to the Lugano classification.

  8. [Inter-observer agreement of Ishak and Metavir scores in histological evaluation of chronic viral hepatitis B and C].

    PubMed

    Rammeh, Soumaya; Khadra, Hajer Ben; Znaidi, Nadia Sabbegh; Romdhane, Neila Attia; Najjar, Taoufik; Bouzaidi, Slim; Zermani, Rachida

    2014-01-01

    Many classification systems are currently used for histological evaluation of the severity of chronic viral hepatitis, including the Ishak and Metavir scores, but there is no consensus classification. The objective of this work was to study the intra- and inter-observer agreement of these two scores in the histopathological analysis of liver biopsies in patients with chronic viral hepatitis B or C. Fifty-nine patients were included in the study; 26 had chronic hepatitis C and 33 had chronic hepatitis B. To investigate the inter-observer agreement, the liver biopsies were analyzed separately by two pathologists without prior consensus reading. The two pathologists then conducted a consensus reading before reviewing all cases independently. Cohen's kappa coefficient was calculated and, in case of asymmetry, Spearman's rho coefficient. Before the consensus reading, the agreement was moderate for the analysis of histological activity with both scores (Metavir: kappa=0.41, Ishak: rho=0.58). For the analysis of fibrosis, the agreement was good with both scores (Metavir: kappa=0.61, Ishak: rho=0.86). The consensus reading improved the reproducibility of activity grading, which became good with both scores (Metavir: kappa=0.77, Ishak: rho=0.76). For fibrosis, improvement was observed with the Ishak score, for which agreement became excellent (kappa=0.81). In conclusion, we recommend in routine practice a combined score, Metavir for activity and Ishak for fibrosis, with a double reading of each biopsy.

  9. Evaluating Random Error in Clinician-Administered Surveys: Theoretical Considerations and Clinical Applications of Interobserver Reliability and Agreement.

    PubMed

    Bennett, Rebecca J; Taljaard, Dunay S; Olaithe, Michelle; Brennan-Jones, Chris; Eikelboom, Robert H

    2017-09-18

    The purpose of this study is to raise awareness of interobserver concordance and the differences between interobserver reliability and agreement when evaluating the responsiveness of a clinician-administered survey and, specifically, to demonstrate the clinical implications of data types (nominal/categorical, ordinal, interval, or ratio) and statistical index selection (for example, Cohen's kappa, Krippendorff's alpha, or intraclass correlation). In this prospective cohort study, 3 clinical audiologists, who were masked to each other's scores, administered the Practical Hearing Aid Skills Test-Revised to 18 adult owners of hearing aids. Interobserver concordance was examined using a range of reliability and agreement statistical indices. The importance of selecting statistical measures of concordance was demonstrated with a worked example, wherein the level of interobserver concordance achieved varied from "no agreement" to "almost perfect agreement" depending on data types and statistical index selected. This study demonstrates that the methodology used to evaluate survey score concordance can influence the statistical results obtained and thus affect clinical interpretations.

  10. Coronary artery disease reporting and data system (CAD-RADS™): Inter-observer agreement for assessment categories and modifiers.

    PubMed

    Maroules, Christopher D; Hamilton-Craig, Christian; Branch, Kelley; Lee, James; Cury, Roberto C; Maurovich-Horvat, Pál; Rubinshtein, Ronen; Thomas, Dustin; Williams, Michelle; Guo, Yanshu; Cury, Ricardo C

    The Coronary Artery Disease Reporting and Data System (CAD-RADS) provides a lexicon and standardized reporting system for coronary CT angiography. To evaluate inter-observer agreement of the CAD-RADS among a panel of early career and expert readers. Four early career and four expert cardiac imaging readers prospectively and independently evaluated 50 coronary CT angiography cases using the CAD-RADS lexicon. All readers assessed image quality using a five-point Likert scale, with mean Likert score ≥4 designating high image quality, and <4 designating moderate/low image quality. All readers were blinded to medical history and invasive coronary angiography findings. Inter-observer agreement for CAD-RADS assessment categories and modifiers was assessed using intra-class correlation (ICC) and Fleiss' Kappa (κ). The impact of reader experience and image quality on inter-observer agreement was also examined. Inter-observer agreement for CAD-RADS assessment categories was excellent (ICC 0.958, 95% CI 0.938-0.974, p < 0.0001). Agreement among expert readers (ICC 0.925, 95% CI 0.884-0.954) was marginally stronger than for early career readers (ICC 0.904, 95% CI 0.852-0.941), both p < 0.0001. High image quality was associated with stronger agreement than moderate image quality (ICC 0.944, 95% CI 0.886-0.974 vs. ICC 0.887, 95% CI 0.775-0.95, both p < 0.0001). While excellent inter-observer agreement was observed for modifiers S (stent) and G (bypass graft) (both κ = 1.0), only fair agreement (κ = 0.40) was observed for modifier V (high risk plaque). Inter-observer reproducibility of CAD-RADS assessment categories and modifiers is excellent, except for high-risk plaque (modifier V) which demonstrates fair agreement. These results suggest CAD-RADS is feasible for clinical implementation. Copyright © 2017. Published by Elsevier Inc.
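
    Fleiss' kappa extends chance-corrected agreement to more than two readers, as with the eight readers above. A minimal sketch of the standard formula, assuming every case is rated by the same number of readers; the function name and the counts are hypothetical:

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] = number of readers who assigned case i to category j.
    Every row must sum to the same number of readers.
    """
    n_cases = len(counts)
    n_readers = sum(counts[0])
    n_categories = len(counts[0])

    # Per-case agreement: fraction of reader pairs that agree on the case.
    p_cases = [
        (sum(c * c for c in row) - n_readers) / (n_readers * (n_readers - 1))
        for row in counts
    ]
    p_bar = sum(p_cases) / n_cases

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_readers)
        for j in range(n_categories)
    ]
    p_e = sum(s * s for s in shares)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical modifier present/absent counts for eight readers on four cases.
unanimous = [[8, 0], [0, 8], [8, 0], [0, 8]]
print(fleiss_kappa(unanimous))
```

    Unanimous ratings on every case give κ = 1.0, the value reported for modifiers S and G; the statistic is undefined if all ratings fall into a single category.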

  11. Proposed Terminology for Anal Squamous Lesions: Its Application and Interobserver Agreement Among Pathologists in Academic and Community Hospitals.

    PubMed

    Roma, Andres A; Liu, Xiuli; Patil, Deepa T; Xie, Hao; Allende, Daniela

    2017-07-01

    To analyze interobserver reproducibility and compare practice patterns between academic and community settings of Lower Anogenital Squamous Terminology (LAST). In total, 132 anal biopsy slides were reviewed, as well as p16 immunostains. LAST was used in 49% of cases (academic center, 68%; satellite hospitals [community practice setting], 32%). After pathology review and consensus interpretation, 23 (17%) case diagnoses were reclassified: eight (34.8%) cases (benign or low-grade squamous intraepithelial lesion [LSIL]) were upgraded to high-grade squamous intraepithelial lesion (HSIL) (p16 confirmed ordered during review); four (17.4%) cases originally classified as HSIL were downgraded to LSIL (p16 originally ordered in one case). There was no significant difference in discrepancies between original and consensus diagnosis in the community vs academic setting or by subspecialty (gynecological vs gastrointestinal). Overall interobserver agreement among reviewers was substantial (κ = 0.63) and improved with the use of p16 immunostain in challenging cases (κ = 0.71; P < .001). This new terminology is not yet uniformly used by pathologists in anal/perianal biopsy specimens; this two-tier system has a good interobserver agreement and is further improved with p16 use in appropriate cases. © American Society for Clinical Pathology, 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

  12. Interobserver Agreement for Contrast-Enhanced Ultrasound (CEUS)-Based Standardized Algorithms for the Diagnosis of Hepatocellular Carcinoma in High-Risk Patients.

    PubMed

    Schellhaas, Barbara; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger Stephan; Neurath, Markus F; Strobel, Deike

    2018-06-07

    This pilot study aimed at assessing interobserver agreement with two contrast-enhanced ultrasound (CEUS) algorithms for the diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 55 high-risk patients were assessed independently by three blinded observers with two standardized CEUS algorithms: ESCULAP (Erlanger Synopsis of Contrast-Enhanced Ultrasound for Liver Lesion Assessment in Patients at risk) and ACR-CEUS-LI-RADSv.2016 (American College of Radiology CEUS-Liver Imaging Reporting and Data System). Lesions were categorized according to size and ultrasound contrast enhancement in the arterial, portal-venous and late phase. Interobserver agreement for assessment of enhancement pattern and categorization was compared between both CEUS algorithms. Additionally, diagnostic accuracy for the definitive diagnosis of HCC was compared. Histology and/or CE-MRI and follow-up served as reference standards. 55 patients were included in the study (male/female, 44/11; mean age: 65.9 years). 90.9% had cirrhosis. Histological findings were available in 39/55 lesions (70.9%). Reference standard of the 55 lesions revealed 48 HCCs, 2 intrahepatic cholangiocellular carcinomas (ICCs), and 5 non-HCC-non-ICC lesions. Interobserver agreement was moderate to substantial for arterial phase hyperenhancement (κ = 0.53-0.67), and fair to moderate for contrast washout in the portal-venous or late phase (κ = 0.33-0.53). Concerning the CEUS-based algorithms, the interreader agreement was substantial for the ESCULAP category (κ = 0.64-0.68) and fair for the CEUS-LI-RADS® category (κ = 0.3-0.39). Disagreement between observers was mostly due to different perception of washout. Interobserver agreement is better for ESCULAP than for CEUS-LI-RADS®. This is mostly due to the fact that perception of contrast washout varies between different observers. However, interobserver agreement is good for

  13. Inter-observer agreement for Crohn's disease sub-phenotypes using the Montreal Classification: How good are we? A multi-centre Australasian study.

    PubMed

    Krishnaprasad, Krupa; Andrews, Jane M; Lawrance, Ian C; Florin, Timothy; Gearry, Richard B; Leong, Rupert W L; Mahy, Gillian; Bampton, Peter; Prosser, Ruth; Leach, Peta; Chitti, Laurie; Cock, Charles; Grafton, Rachel; Croft, Anthony R; Cooke, Sharon; Doecke, James D; Radford-Smith, Graham L

    2012-04-01

    Crohn's disease (CD) exhibits significant clinical heterogeneity. Classification systems attempt to describe this; however, their utility and reliability depends on inter-observer agreement (IOA). We therefore sought to evaluate IOA using the Montreal Classification (MC). De-identified clinical records of 35 CD patients from 6 Australian IBD centres were presented to 13 expert practitioners from 8 Australia and New Zealand Inflammatory Bowel Disease Consortium (ANZIBDC) centres. Practitioners classified the cases using MC and forwarded data for central blinded analysis. IOA on smoking and medications was also tested. Kappa statistics, with pre-specified outcomes of κ>0.8 excellent; 0.61-0.8 good; 0.41-0.6 moderate and ≤0.4 poor, were used. 97% of study cases had colonoscopy reports, however, only 31% had undergone a complete set of diagnostic investigations (colonoscopy, histology, SB imaging). At diagnosis, IOA was excellent for age, κ=0.84; good for disease location, κ=0.73; only moderate for upper GI disease (κ=0.57) and disease behaviour, κ=0.54; and good for the presence of perianal disease, κ=0.6. At last follow-up, IOA was good for location, κ=0.68; only moderate for upper GI disease (κ=0.43) and disease behaviour, κ=0.46; but excellent for the presence/absence of perianal disease, κ=0.88. IOA for immunosuppressant use ever and presence of stricture were both good (κ=0.79 and 0.64 respectively). IOA using MC is generally good; however some areas are less consistent than others. Omissions and inaccuracies reduce the value of clinical data when comparing cohorts across different centres, and may impair the ability to translate genetic discoveries into clinical practice. Crown Copyright © 2011. Published by Elsevier B.V. All rights reserved.
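
    The pre-specified bands in this abstract (κ>0.8 excellent; 0.61-0.8 good; 0.41-0.6 moderate; ≤0.4 poor) can be written as a small lookup. This is a sketch under one assumption that the abstract does not state: kappa is rounded to two decimals before banding, so that values falling between the listed ranges are assigned unambiguously. The function name is hypothetical.

```python
def kappa_band(kappa):
    """Label a kappa value using the bands quoted in the abstract above."""
    k = round(kappa, 2)     # assumed: band on the two-decimal value
    if k > 0.80:
        return "excellent"
    if k >= 0.61:
        return "good"
    if k >= 0.41:
        return "moderate"
    return "poor"


for value in (0.84, 0.73, 0.57, 0.46):
    print(value, kappa_band(value))
```

    Under this rule a reported κ of 0.6 falls in the moderate band, whereas the abstract describes κ=0.6 for perianal disease at diagnosis as good; the abstract does not explain the difference, and other studies in this listing use different cut-offs and labels for the same values.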

  14. Evaluation of interobserver agreement for postoperative pain and sedation assessment in cats.

    PubMed

    Benito, Javier; Monteiro, Beatriz P; Beauchamp, Guy; Lascelles, B Duncan X; Steagall, Paulo V

    2017-09-01

    OBJECTIVE To evaluate agreement between observers with different training and experience for assessment of postoperative pain and sedation in cats by use of a dynamic and interactive visual analog scale (DIVAS) and for assessment of postoperative pain in the same cats with a multidimensional composite pain scale (MCPS). DESIGN Randomized, controlled, blinded study. ANIMALS 45 adult cats undergoing ovariohysterectomy. PROCEDURES Cats received 1 of 3 preoperative treatments: bupivacaine, IP; meloxicam, SC with saline (0.9% NaCl) solution, IP, (positive control); or saline solution only, IP (negative control). All cats received premedication with buprenorphine prior to general anesthesia. An experienced observer (observer 1; male; native language, Spanish) used scales in English, and an inexperienced observer (observer 2; female; native language, French) used scales in French to assess signs of sedation and pain. Rescue analgesia was administered according to MCPS scoring by observer 1. Mean pain and sedation scores per treatment and time point, proportions of cats in each group with MCPS scores necessitating rescue analgesia, and mean MCPS scores assigned at the time of rescue analgesia were compared between observers. Agreement was assessed by intraclass correlation coefficient determination. Percentage disagreement between observers on the need for rescue analgesia was calculated. RESULTS Interobserver agreements for pain scores were good, and that for sedation scores was fair. On the basis of observer 1's MCPS scores, a greater proportion of cats in the negative control group received rescue analgesia than in the bupivacaine or positive control groups. Scores from observer 2 indicated a greater proportion of cats in the negative control group than in the positive control group required rescue analgesia but identified no significant difference between the negative control and bupivacaine groups for this variable. Overall, disagreement regarding need for rescue

  15. Interobserver agreement on Poser's and the new McDonald's diagnostic criteria for multiple sclerosis.

    PubMed

    Zipoli, V; Portaccio, E; Siracusa, G; Pracucci, G; Sorbi, S; Amato, M P

    2003-10-01

    We assessed the interobserver agreement on the diagnosis of multiple sclerosis (MS) in a study sample consisting of 41 MS (15 relapsing remitting, two secondary progressive, five primary progressive and 19 presenting their first clinical attack) and three non-MS cases. Clinical and paraclinical information was recorded in standardized forms. Four neurologists were asked to make a diagnosis using Poser's and McDonald's criteria and to assess MRI scans according to the McDonald's guidelines. In terms of the kappa statistic (kappa), we found a moderate agreement on the overall diagnosis using both Poser's and McDonald's criteria (kappa, respectively 0.57 and 0.52). As for distinct diagnostic categories, we observed a moderate to substantial agreement for the three McDonald categories (range of kappa values 0.49-0.64) and a fair to substantial agreement for the nine Poser categories (range of kappa values 0.37-0.67). Taking into account clinical information, the agreement on dissemination over time was substantially higher (kappa = 0.69) than that found on dissemination over space (kappa = 0.46). In contrast, for MRI assessment, the agreement for spatial dissemination was substantial (kappa = 0.74) compared with the fair agreement (kappa = 0.25) yielded by dissemination over time. The new McDonald's criteria yield a good overall diagnostic reliability, and compare favourably with Poser's classification in terms of agreement on distinct diagnostic categories.

  16. Cervical Cancer Screening in Cameroon: Interobserver Agreement on the Interpretation of Digital Cervicography Results.

    PubMed

    Manga, Simon; Parham, Groesbeck; Benjamin, Nkoum; Nulah, Kathleen; Sheldon, Lisa Kennedy; Welty, Edith; Ogembo, Javier Gordon; Bradford, Leslie; Sando, Zacharie; Shields, Ray; Welty, Thomas

    2015-10-01

    The World Health Organization recommends visual inspection with acetic acid (VIA) for cervical cancer screening in resource-limited settings. In Cameroon, we use digital cervicography (DC) to capture images of the cervix after VIA. This study evaluated interobserver agreement of DC results, compared DC with histopathologic results, and examined interobserver agreement among screening methods. Three observers, blinded to each other's interpretations, evaluated 540 DC photographs as follows: (1) negative/positive for acetowhite lesions or cancer and (2) assigned a presumptive diagnosis of histopathologic lesion grade in the 91 cases that had a histopathologic diagnosis. Observer A was the actual screening nurse; B, a reproductive health nurse; C, a gynecologic oncologist; and D, the histopathologic diagnosis. We compared inter-rater agreement of DC impressions among observers A, B, and C, and with D, with Cohen kappas. For interpretations of DC, (negative/positive) strengths of agreement of paired observers were the following: A/B, moderate [K, 0.54; 95% confidence interval (CI), 0.47-0.61], A/C, fair (K, 0.37; 95% CI, 0.29-0.44), and B/C, moderate (K, 0.45; 95% CI, 0.37-0.53). For presumptive pathologic grading, strengths of agreement for weighted Ks were as follows: A/B, moderate (K, 0.42; 95% CI, 0.28-0.56); A/C, fair (K, 0.33; 95% CI, 0.20-0.46); B/C, fair (K, 0.54; 95% CI, 0.40-0.67); A/D, moderate (K, 0.59; 95% CI, 0.45-0.74); B/D, moderate (K, 0.58; 95% CI, 0.46-0.70); and C/D, moderate (K, 0.50; 95% CI, 0.37-0.63). Interobserver agreement of DC interpretations was mostly moderate among the 3 observers, between them and histopathology, and comparable to that of other visual-based screening methods, i.e., VIA, cytology, or colposcopy.
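
    Illustrative aside (not part of the study): the pairwise Cohen kappas reported above correct raw agreement for the agreement two observers would reach by chance. The sketch below shows the unweighted form for two raters; the function name `cohen_kappa` and the example ratings are hypothetical, and the weighted kappa used for the pathologic grading is not covered.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so this sketch reports complete agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical negative/positive readings of ten images by two observers.
observer_a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg"]
observer_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
kappa_ab = cohen_kappa(observer_a, observer_b)
```

    In this made-up example the observers agree on 8 of 10 images (0.80), chance agreement is 0.52, and kappa is therefore (0.80 - 0.52) / (1 - 0.52), about 0.58.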

  17. Intra- and inter-observer agreement when using a descriptive classification scale for clinical assessment of faecal consistency in growing pigs.

    PubMed

    Pedersen, Ken Steen; Toft, Nils

    2011-03-01

    The objective of the current study was to evaluate intra- and inter-observer agreement using a descriptive classification scale with four categories, descriptive text and pictures for assessment of consistency in faecal samples from pigs post weaning. The four consistency categories were score one=firm and shaped, score two=soft and shaped, score three=loose and score four=watery. Five observers from the same veterinary practice examined 100 faecal samples using the scale with four categories. Four of the observers examined the 100 faecal samples twice within the same day. Within observers the difference in proportions for the individual consistency categories between two examinations was on average 0.04 (range: 0-0.10). The mean intra-observer agreement was 0.82 (range: 0.72-0.91) with a mean kappa value of 0.76 (range: 0.61-0.88). For inter-observer agreement overall kappa was 0.64. For the 10 pair-wise comparisons the mean inter-observer agreement was 0.73 (range: 0.61-0.90) with a mean kappa value of 0.64 (range: 0.48-0.87). The difference in proportions for the individual consistency categories was on average 0.08 (range: 0-0.17). In conclusion, the agreement observed for the descriptive classification scale with four categories, descriptive text and pictures may be categorized as a substantial to almost perfect intra-observer agreement and a moderate to almost perfect inter-observer agreement. However, more objective measures than clinical scales may still be needed to improve intra- and inter-observer agreement in research studies. Copyright © 2010 Elsevier B.V. All rights reserved.
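
    Illustrative aside (not part of the study): the verbal labels used in this abstract ("moderate", "substantial", "almost perfect") match the commonly used Landis and Koch bands for kappa. The sketch below assumes those bands; the function name `landis_koch_label` is hypothetical, and other records in this listing use different scales (for example "good" and "very good").

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch descriptive band."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa ranges quoted in the abstract above.
intra_range = (landis_koch_label(0.61), landis_koch_label(0.88))
inter_range = (landis_koch_label(0.48), landis_koch_label(0.87))
```

    Applied to the ranges quoted above, 0.61-0.88 maps to substantial through almost perfect and 0.48-0.87 maps to moderate through almost perfect, consistent with the authors' wording.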

  18. Evaluation of inter-observer agreement when using a clinical respiratory scoring system in pre-weaned dairy calves.

    PubMed

    Buczinski, S; Faure, C; Jolivet, S; Abdallah, A

    2016-07-01

    To determine inter-observer agreement for a clinical scoring system for the detection of bovine respiratory disease complex in calves, and the impact of classification of calves as sick or healthy based on different cut-off values. Two third-year veterinary students (Observer 1 and 2) and one post-graduate student (Observer 3) received 4 hours of training on scoring dairy calves for signs of respiratory disease, including rectal temperature, cough, eye and nasal discharge, and ear position. Observers 1 and 2 scored 40 pre-weaning dairy calves 24 hours apart (80 observations) over three visits to a calf-rearing facility, and Observers 1, 2 and 3 scored 20 calves on one visit. Inter-observer agreement was assessed using percentage of agreement (PA) and Kappa statistics for individual clinical signs, comparing Observers 1 and 2. Agreement between the three observers for total clinical score was assessed using cut-off values of ≥4, ≥5 and ≥6 to indicate unhealthy calves. Inter-observer PA for rectal temperature was 0.68, for cough 0.78, for nasal discharge 0.62, for eye discharge 0.63, and for ear position 0.85. Kappa values for all clinical signs indicated slight to fair agreement (<0.4), except temperature, which had moderate agreement (0.6). The Fleiss' Kappa for total score, using cut-offs of ≥4, ≥5 and ≥6 to indicate unhealthy calves, was 0.35, 0.06 and 0.13, respectively, indicating slight to fair agreement. There were important inter-observer discrepancies in scoring clinical signs of respiratory disease, using relatively inexperienced observers. These disagreements may ultimately mean increased false negative or false positive diagnoses and incorrect treatment of cases. Visual assessment of clinical signs associated with bovine respiratory disease needs to be thoroughly validated when disease monitoring is based on the use of a clinical scoring system.
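
    Illustrative aside (not part of the study): Fleiss' kappa extends chance-corrected agreement to more than two observers, as used above for the three-observer total score. The sketch below assumes every subject is rated by the same number of raters; the function name `fleiss_kappa` and the example counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who put subject i in category j.
    """
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("at least one subject is required")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total_ratings = n_subjects * n_raters
    # Overall share of ratings falling in each category.
    p_category = [
        sum(row[j] for row in counts) / total_ratings for j in range(n_categories)
    ]
    # Per-subject agreement: share of rater pairs that agree on that subject.
    p_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subject) / n_subjects
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1.0:
        # Every rating fell in one category; kappa is undefined,
        # so this sketch reports complete agreement by convention.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical table: 5 calves, 3 observers, columns = [healthy, sick].
table = [[3, 0], [0, 3], [2, 1], [1, 2], [3, 0]]
kappa_three = fleiss_kappa(table)
```

    For this made-up table the mean per-calf agreement is about 0.73, chance agreement is 0.52, and kappa is about 0.44. Changing the cut-off that defines "sick" changes the table and therefore kappa, which is one way to read the differing values reported for cut-offs of ≥4, ≥5 and ≥6.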

  19. Interobserver Agreement Among Uveitis Experts on Uveitic Diagnoses: The Standardization of Uveitis Nomenclature Experience.

    PubMed

    Jabs, Douglas A; Dick, Andrew; Doucette, John T; Gupta, Amod; Lightman, Susan; McCluskey, Peter; Okada, Annabelle A; Palestine, Alan G; Rosenbaum, James T; Saleem, Sophia M; Thorne, Jennifer; Trusko, Brett

    2018-02-01

    To evaluate the interobserver agreement among uveitis experts on the diagnosis of the specific uveitic disease. Interobserver agreement analysis. Five committees, each comprised of 9 individuals and working in parallel, reviewed cases from a preliminary database of 25 uveitic diseases, collected by disease, and voted independently online whether the case was the disease in question or not. The agreement statistic, κ, was calculated for the 36 pairwise comparisons for each disease, and a mean κ was calculated for each disease. After the independent online voting, committee consensus conference calls, using nominal group techniques, reviewed all cases not achieving supermajority agreement (>75%) on the diagnosis in the online voting to attempt to arrive at a supermajority agreement. A total of 5766 cases for the 25 diseases were evaluated. The overall mean κ for the entire project was 0.39, with disease-specific variation ranging from 0.23 to 0.79. After the formalized consensus conference calls to address cases that did not achieve supermajority agreement in the online voting, supermajority agreement overall was reached on approximately 99% of cases, with disease-specific variation ranging from 96% to 100%. Agreement among uveitis experts on diagnosis is moderate at best but can be improved by discussion among them. These data suggest the need for validated and widely used classification criteria in the field of uveitis. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Factors to improve the interobserver agreement for gastric atrophy and intestinal metaplasia: consensus of definition and criteria.

    PubMed

    Kim, Sung Sun; Kook, Myeong-Cherl; Shin, Ok-Ran; Kim, Hee Sung; Bae, Han-Ik; Seo, An Na; Park, Do Youn; Choi, Il Ju; Kim, Young-Il; Nam, Byung Ho; Kim, Sohee

    2018-04-01

    Intestinal metaplasia and atrophy of the gastric mucosa are associated with Helicobacter pylori infection and are considered premalignant lesions. The updated Sydney system is used for these parameters, but experienced pathologists and consensus processes are required for interobserver agreement. We sought to determine the influence of the consensus process on the assessment of intestinal metaplasia and atrophy. Two study sets were used: consensus and validation. The consensus set was circulated and five gastrointestinal pathologists evaluated them independently using the updated Sydney system. The consensus of the definitions was then determined at the first consensus meeting. The same set was recirculated to determine the effect of the consensus. The second consensus meeting was held to standardise the grading criteria and the validation set was circulated to determine the influence. Two additional circulations were performed to assess the maintenance of consensus and intraobserver variability. Interobserver agreement of intestinal metaplasia and atrophy was improved through the consensus process (intestinal metaplasia: baseline κ = 0.52 versus final κ = 0.68, P = 0.006; atrophy: baseline κ = 0.19 versus final κ = 0.43, P < 0.001). Higher interobserver agreement in atrophy was observed after consensus regarding the definition (pre-consensus: κ = 0.19 versus post-consensus: κ = 0.34, P = 0.001). There was improved interobserver agreement in intestinal metaplasia after standardisation of the grading criteria (pre-standardisation: κ = 0.56 versus post-standardisation: κ = 0.71, P = 0.010). This study suggests that interobserver variability regarding intestinal metaplasia and atrophy may result from lack of a precise definition and fine criteria, and can be reduced by consensus of definition and standardisation of grading criteria. © 2017 John Wiley & Sons Ltd.

  1. Interobserver agreement in retrospective chart reviews for factors associated with cervical spine injuries in children.

    PubMed

    Olsen, Cody S; Kuppermann, Nathan; Jaffe, David M; Brown, Kathleen; Babcock, Lynn; Mahajan, Prashant V; Leonard, Julie C

    2015-04-01

    The objective was to describe the interobserver agreement between trained chart reviewers and physician reviewers in a multicenter retrospective chart review study of children with cervical spine injuries (CSIs). Medical records of children younger than 16 years old with cervical spine radiography from 17 Pediatric Emergency Care Applied Research Network (PECARN) hospitals from years 2000 through 2004 were abstracted by trained reviewers for a study aimed to identify predictors of CSIs in children. Independent physician-reviewers abstracted patient history and clinical findings from a random sample of study patient medical records at each hospital. Interobserver agreement was assessed using percent agreement and the weighted kappa (κ) statistic, with lower 95% confidence intervals. Moderate or better agreement (κ > 0.4) was achieved for most candidate CSI predictors, including altered mental status (κ = 0.87); focal neurologic findings (κ = 0.74); posterior midline neck tenderness (κ = 0.74); any neck tenderness (κ = 0.89); torticollis (κ = 0.79); complaint of neck pain (κ = 0.83); history of loss of consciousness (κ = 0.89); nonambulatory status (κ = 0.74); and substantial injuries to the head (κ = 0.50), torso/trunk (κ = 0.48), and extremities (κ = 0.59). High-risk mechanisms showed near-perfect agreement (diving, κ = 1.0; struck by car, κ = 0.93; other motorized vehicle crash, κ = 0.93; fall, κ = 0.92; high-risk motor vehicle collision, κ = 0.89; hanging, κ = 0.80). Fair agreement was found for clotheslining mechanisms (κ = 0.36) and substantial face injuries (κ = 0.40). Most retrospectively assessed variables thought to be predictive of CSIs in blunt trauma-injured children had at least moderate interobserver agreement, suggesting that these data are sufficiently valid for use in identifying potential predictors of CSI. © 2015 by the Society for Academic Emergency Medicine.

  2. Inter-observer agreement for diagnostic classification of esophageal motility disorders defined in high-resolution manometry.

    PubMed

    Fox, M R; Pandolfino, J E; Sweis, R; Sauter, M; Abreu Y Abreu, A T; Anggiansah, A; Bogte, A; Bredenoord, A J; Dengler, W; Elvevi, A; Fruehauf, H; Gellersen, S; Ghosh, S; Gyawali, C P; Heinrich, H; Hemmink, M; Jafari, J; Kaufman, E; Kessing, K; Kwiatek, M; Lubomyr, B; Banasiuk, M; Mion, F; Pérez-de-la-Serna, J; Remes-Troche, J M; Rohof, W; Roman, S; Ruiz-de-León, A; Tutuian, R; Uscinowicz, M; Valdovinos, M A; Vardar, R; Velosa, M; Waśko-Czopnik, D; Weijenborg, P; Wilshire, C; Wright, J; Zerbib, F; Menne, D

    2015-01-01

    High-resolution esophageal manometry (HRM) is a recent development used in the evaluation of esophageal function. Our aim was to assess the inter-observer agreement for diagnosis of esophageal motility disorders using this technology. Practitioners registered on the HRM Working Group website were invited to review and classify (i) 147 individual water swallows and (ii) 40 diagnostic studies comprising 10 swallows using a drop-down menu that followed the Chicago Classification system. Data were presented using a standardized format with pressure contours without a summary of HRM metrics. The sequence of swallows was fixed for each user but randomized between users to avoid sequence bias. Participants were blinded to other entries. (i) Individual swallows were assessed by 18 practitioners (13 institutions). Consensus agreement (≤ 2/18 dissenters) was present for most cases of normal peristalsis and achalasia but not for cases of peristaltic dysmotility. (ii) Diagnostic studies were assessed by 36 practitioners (28 institutions). Overall inter-observer agreement was 'moderate' (kappa 0.51) being 'substantial' (kappa > 0.7) for achalasia type I/II and no lower than 'fair-moderate' (kappa >0.34) for any diagnosis. Overall agreement was somewhat higher among those that had performed >400 studies (n = 9; kappa 0.55) and 'substantial' among experts involved in development of the Chicago Classification system (n = 4; kappa 0.66). This prospective, randomized, and blinded study reports an acceptable level of inter-observer agreement for HRM diagnoses across the full spectrum of esophageal motility disorders for a large group of clinicians working in a range of medical institutions. Suboptimal agreement for diagnosis of peristaltic motility disorders highlights contribution of objective HRM metrics. © 2014 International Society for Diseases of the Esophagus.

  3. Intraobserver and interobserver agreement on the radiographical diagnosis of canine cranial cruciate ligament rupture.

    PubMed

    Bogaerts, Evelien; Van der Vekens, Elke; Verhoeven, Geert; de Rooster, Hilde; Van Ryssen, Bernadette; Samoy, Yves; Putcuyps, Ingrid; Van Tilburg, Johan; Devriendt, Nausikaa; Weekers, Frederik; Bertal, Mileva; Houdellier, Blandine; Scheemaeker, Stephanie; Versteken, Jeroen; Lamerand, Maryline; Feenstra, Laurien; Peelman, Luc; Nieuwerburgh, Filip Van; Saunders, Jimmy H; Broeckx, Bart J G

    2018-04-28

    Even though radiography is one of the most frequently used imaging techniques for orthopaedic disorders, it has been demonstrated that the interpretation can vary between assessors. As such, the purpose of this study was to examine the intraobserver and interobserver agreement and the influence of level of expertise on the interpretation of radiographs of the stifle in dogs with and without cranial cruciate ligament rupture (CCLR). Sixteen observers, divided in four groups according to their level of experience, evaluated 30 radiographs (15 cases with CCLR and 15 control stifles) twice. Each observer was asked to evaluate joint effusion, presence and location of degenerative joint disease, joint instability and whether CCLR was present or absent. Overall, intraobserver and interobserver agreement ranged from fair to almost perfect with a trend towards increased agreement for more experienced observers. Additionally, it was found that stifles that were classified with high agreement have either overt disease characteristics or no disease characteristics at all, in comparison to the ones that are classified with a low agreement. Overall, the agreement on radiographic interpretation of CCLR was high, which is important, as it is the basis of a correct diagnosis and treatment. © British Veterinary Association (unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  4. Intra- and inter-observer agreement in histological assessment of canine soft tissue sarcoma.

    PubMed

    Yap, F W; Rasotto, R; Priestnall, S L; Parsons, K J; Stewart, J

    2017-12-01

    The diagnosis of canine soft tissue sarcoma (STS) is based on histological assessment. Assessment of criteria such as degree of differentiation, necrosis score and mitotic score gives rise to a final tumour grade, which is important in the recommendation of treatment and prognosis of patients. Previously diagnosed cases of STS were independently assessed by three board-certified veterinary pathologists. Participating pathologists were blinded to the original results. For the intra-observer study, the cases were assessed by a single pathologist six months apart and slides were randomized between readings. For the inter-observer study, the whole case series was assessed by a single pathologist before being passed onto the next pathologist. Intraclass correlation coefficient (ICC) and Fleiss's Kappa (κ) were calculated for the intra- (single observer) and inter-observer agreement. Strong agreement was observed for the intra-observer assessment in necrosis score, mitotic score, total score and tumour grading (ICC between 0.78 and 0.91). The intra-observer agreement for differentiation score was rated perfect (ICC 1.00). The agreement between pathologists for the diagnosis and grading of canine STS was moderate (κ = 0.60 and 0.43 respectively). Histological assessment of canine STS had high reproducibility by an individual pathologist. The agreement of diagnosis and grading of canine STS was moderate between pathologists. Future studies are required to investigate further assessment criteria to improve the specificity of STS diagnosis and the accuracy of the STS grading in dogs. © 2017 John Wiley & Sons Ltd.

  5. Interobserver agreement in the histologic diagnosis of colorectal polyps. the experience of the multicenter adenoma colorectal study (SMAC).

    PubMed

    Costantini, Massimo; Sciallero, Stefania; Giannini, Augusto; Gatteschi, Beatrice; Rinaldi, Paolo; Lanzanova, Giuseppe; Bonelli, Luigina; Casetti, Tino; Bertinelli, Elisabetta; Giuliani, Orietta; Castiglione, Guido; Mantellini, Paola; Naldoni, Carlo; Bruzzi, Paolo

    2003-03-01

    Current clinical practice guidelines for patients with colorectal polyps are mainly based on the histologic characteristics of their lesions. However, interobserver variability in the assessment of specific polyp characteristics was evaluated in very few studies. The purpose of this study was to evaluate the interobserver agreement of four pathologists in the diagnosis of histologic type of colorectal polyps and in the degree of dysplasia and of infiltrating carcinoma in adenomas. A stratified random sample of 100 polyps was obtained from the 4,889 polyps resected within the Multicentre Adenoma Colorectal Study (SMAC), and the slides were blindly reviewed by the four pathologists. Agreement was analyzed using kappa statistics. A median kappa of 0.89 (range 0.79-1.0) was estimated for the interobserver agreement for the diagnosis of hyperplastic polyp vs. adenoma. The agreement in the diagnosis of tubular, tubulovillous, and villous type was given by median kappa values of 0.50, 0.15, and 0.36, respectively. The median kappa for the diagnosis of infiltrating carcinoma was 0.78 (range 0.73-0.84). Agreement on diagnosis of adenoma histologic subtypes, degrees of dysplasia, or infiltrating carcinoma in adenoma was moderate. A simpler classification might help to better identify patients at different risk of colorectal cancer.

  6. Intra- and inter-observer agreement in MRI assessment of rotator cuff healing using the Sugaya classification 10 years after surgery.

    PubMed

    Niglis, L; Collin, P; Dosch, J-C; Meyer, N; Kempf, J-F

    2017-10-01

    The long-term outcomes of rotator cuff repair are unclear. Recurrent tears are common, although their reported frequency varies depending on the type and interpretation challenges of the imaging method used. The primary objective of this study was to assess the intra- and inter-observer reproducibility of the MRI assessment of rotator cuff repair using the Sugaya classification 10 years after surgery. The secondary objective was to determine whether poor reproducibility, if found, could be improved by using a simplified yet clinically relevant classification. Our hypothesis was that reproducibility was limited but could be improved by simplifying the classification. In a retrospective study, we assessed intra- and inter-observer agreement in interpreting 49 magnetic resonance imaging (MRI) scans performed 10 years after rotator cuff repair. These 49 scans were taken at random among 609 cases that underwent re-evaluation, with imaging, for the 2015 SoFCOT symposium on 10-year and 20-year clinical and anatomical outcomes of rotator cuff repair for full-thickness tears. Each of three observers read each of the 49 scans on two separate occasions. At each reading, they assessed the supra-spinatus tendon according to the Sugaya classification in five types. Intra-observer agreement for the Sugaya type was substantial (κ=0.64) but inter-observer agreement was only fair (κ=0.39). Agreement improved when the five Sugaya types were collapsed into two categories (1-2-3 and 4-5) (intra-observer κ=0.74 and inter-observer κ=0.68). Using the Sugaya classification to assess post-operative rotator cuff healing was associated with substantial intra-observer and fair inter-observer agreement. A simpler classification into two categories improved agreement while remaining clinically relevant. Level of evidence: II, prospective randomised low-power study. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  7. Mitosis Counting in Breast Cancer: Object-Level Interobserver Agreement and Comparison to an Automatic Method

    PubMed Central

    Veta, Mitko; van Diest, Paul J.; Jiwa, Mehdi; Al-Janabi, Shaimaa; Pluim, Josien P. W.

    2016-01-01

    Background: Tumor proliferation speed, most commonly assessed by counting of mitotic figures in histological slide preparations, is an important biomarker for breast cancer. Although mitosis counting is routinely performed by pathologists, it is a tedious and subjective task with poor reproducibility, particularly among non-experts. Inter- and intraobserver reproducibility of mitosis counting can be improved when a strict protocol is defined and followed. Previous studies have examined only the agreement in terms of the mitotic count or the mitotic activity score. Studies of the observer agreement at the level of individual objects, which can provide more insight into the procedure, have not been performed thus far. Methods: The development of automatic mitosis detection methods has received large interest in recent years. Automatic image analysis is viewed as a solution for the problem of subjectivity of mitosis counting by pathologists. In this paper we describe the results from an interobserver agreement study between three human observers and an automatic method, and make two unique contributions. For the first time, we present an analysis of the object-level interobserver agreement on mitosis counting. Furthermore, we train an automatic mitosis detection method that is robust with respect to staining appearance variability and compare it with the performance of expert observers on an “external” dataset, i.e. on histopathology images that originate from pathology labs other than the pathology lab that provided the training data for the automatic method. Results: The object-level interobserver study revealed that pathologists often do not agree on individual objects, even if this is not reflected in the mitotic count. The disagreement is larger for objects from smaller size, which suggests that adding a size constraint in the mitosis counting protocol can improve reproducibility. The automatic mitosis detection method can perform mitosis counting in an unbiased

  8. Mitosis Counting in Breast Cancer: Object-Level Interobserver Agreement and Comparison to an Automatic Method.

    PubMed

    Veta, Mitko; van Diest, Paul J; Jiwa, Mehdi; Al-Janabi, Shaimaa; Pluim, Josien P W

    2016-01-01

    Tumor proliferation speed, most commonly assessed by counting of mitotic figures in histological slide preparations, is an important biomarker for breast cancer. Although mitosis counting is routinely performed by pathologists, it is a tedious and subjective task with poor reproducibility, particularly among non-experts. Inter- and intraobserver reproducibility of mitosis counting can be improved when a strict protocol is defined and followed. Previous studies have examined only the agreement in terms of the mitotic count or the mitotic activity score. Studies of the observer agreement at the level of individual objects, which can provide more insight into the procedure, have not been performed thus far. The development of automatic mitosis detection methods has received large interest in recent years. Automatic image analysis is viewed as a solution for the problem of subjectivity of mitosis counting by pathologists. In this paper we describe the results from an interobserver agreement study between three human observers and an automatic method, and make two unique contributions. For the first time, we present an analysis of the object-level interobserver agreement on mitosis counting. Furthermore, we train an automatic mitosis detection method that is robust with respect to staining appearance variability and compare it with the performance of expert observers on an "external" dataset, i.e. on histopathology images that originate from pathology labs other than the pathology lab that provided the training data for the automatic method. The object-level interobserver study revealed that pathologists often do not agree on individual objects, even if this is not reflected in the mitotic count. The disagreement is larger for objects from smaller size, which suggests that adding a size constraint in the mitosis counting protocol can improve reproducibility. The automatic mitosis detection method can perform mitosis counting in an unbiased way, with substantial

  9. Intra- and interobserver agreement in the classification and treatment of distal third clavicle fractures.

    PubMed

    Bishop, Julie Y; Jones, Grant L; Lewis, Brian; Pedroza, Angela

    2015-04-01

    In treatment of distal third clavicle fractures, the Neer classification system, based on the location of the fracture in relation to the coracoclavicular ligaments, has traditionally been used to determine fracture pattern stability. To determine the intra- and interobserver reliability in the classification of distal third clavicle fractures via standard plain radiographs and the intra- and interobserver agreement in the preferred treatment of these fractures. Cohort study (Diagnosis); Level of evidence, 3. Thirty radiographs of distal clavicle fractures were randomly selected from patients treated for distal clavicle fractures between 2006 and 2011. The radiographs were distributed to 22 shoulder/sports medicine fellowship-trained orthopaedic surgeons. Fourteen surgeons responded and took part in the study. The evaluators were asked to measure the size of the distal fragment, classify the fracture pattern as stable or unstable, assign the Neer classification, and recommend operative versus nonoperative treatment. The radiographs were reordered and redistributed 3 months later. Inter- and intrarater agreement was determined for the distal fragment size, stability of the fracture, Neer classification, and decision to operate. Single variable logistic regression was performed to determine what factors could most accurately predict the decision for surgery. Interrater agreement was fair for distal fragment size, moderate for stability, fair for Neer classification, slight for type IIB and III fractures, and moderate for treatment approach. Intrarater agreement was moderate for distal fragment size categories (κ = 0.50, P < .001) and Neer classification (κ = 0.42, P < .001) and substantial for stable fracture (κ = 0.65, P < .001) and decision to operate (κ = 0.65, P < .001). Fracture stability was the best predictor of treatment, with 89% accuracy (P < .001). Fracture stability determination and the decision to operate had the highest interobserver agreement

  10. Interobserver and intermodality agreement of standardized algorithms for non-invasive diagnosis of hepatocellular carcinoma in high-risk patients: CEUS-LI-RADS versus MRI-LI-RADS.

    PubMed

    Schellhaas, Barbara; Hammon, Matthias; Strobel, Deike; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger S; Cavallaro, Alexander; Janka, Rolf; Neurath, Markus F; Uder, Michael; Seuss, Hannes

    2018-04-19

    We compared the interobserver agreement for the recently introduced contrast-enhanced ultrasound (CEUS)-based algorithm CEUS-LI-RADS (Liver Imaging Reporting and Data System) versus the well-established magnetic resonance imaging (MRI)-LI-RADS for non-invasive diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 50 high-risk patients (mean age 66.2 ± 11.8 years; 39 male) were assessed retrospectively with CEUS and MRI. Two independent observers reviewed CEUS and MRI examinations, separately, classifying observations according to CEUS-LI-RADSv.2016 and MRI-LI-RADSv.2014. Interobserver agreement was assessed with Cohen's kappa. Forty-three lesions were HCCs; two were intrahepatic cholangiocarcinomas; five were benign lesions. Arterial phase hyperenhancement was perceived less frequently with CEUS than with MRI (37/50 / 38/50 lesions = 74%/78% [CEUS; observer 1/observer 2] versus 46/50 / 44/50 lesions = 92%/88% [MRI; observer 1/observer 2]). Washout appearance was observed in 34/50 / 20/50 lesions = 68%/40% with CEUS and 31/50 / 31/50 lesions = 62%/62% with MRI. Interobserver agreement was moderate for arterial hyperenhancement (κ = 0.511/0.565 [CEUS/MRI]) and "washout" (κ = 0.490/0.582 [CEUS/MRI]), fair for CEUS-LI-RADS category (κ = 0.309) and substantial for MRI-LI-RADS category (κ = 0.609). Intermodality agreement was fair for arterial hyperenhancement (κ = 0.329), slight to fair for "washout" (κ = 0.202) and LI-RADS category (κ = 0.218). Conclusion: Interobserver agreement is substantial for MRI-LI-RADS and only fair for CEUS-LI-RADS. This is mostly because interobserver agreement in the perception of washout appearance is better in MRI than in CEUS. Further refinement of the LI-RADS algorithms and increasing education and practice may be necessary to improve the concordance between CEUS and MRI for the final LI-RADS categorization. • CEUS-LI-RADS and MRI-LI-RADS enable standardized non-invasive diagnosis of HCC in

  11. Interobserver agreement between primary graders and an expert grader in the Bristol and Weston diabetic retinopathy screening programme: a quality assurance audit.

    PubMed

    Patra, S; Gomm, E M W; Macipe, M; Bailey, C

    2009-08-01

    To assess the quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and to set standards for future interobserver agreement reports. A prospective audit of 213 image sets from six fully trained primary graders in the Bristol and Weston diabetic retinopathy screening programme was carried out over a 4-week period. All the images graded by the primary graders were regraded by an expert grader blinded to the primary grading results and the identity of the primary grader. The interobserver agreement between primary graders and the blinded expert grader and the corresponding Kappa coefficient was determined for overall grading, referable, non-referable and ungradable disease. The audit standard was set at 80% for interobserver agreement with a Kappa coefficient of 0.7. The interobserver agreement bettered the audit standard of 80% in all the categories. The Kappa coefficient was substantial (0.7) for the overall grading results and ranged from moderate to substantial (0.59-0.65) for referable, non-referable and ungradable disease categories. The main recommendation of the audit was to provide refresher training for the primary graders with focus on ungradable disease. The audit demonstrated an acceptable level of quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and provided a standard against which future interobserver agreement can be measured for quality assurance within a screening programme. Diabet. Med. 26, 820-823 (2009).
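
    The audit above reports two quantities for each grading category: raw percentage agreement and a Kappa coefficient. The abstract does not give its formulas or data, so the following is only a minimal sketch, assuming the standard unweighted Cohen's kappa for two raters; the function names and the grade labels in the usage note are hypothetical.

```python
import math
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty item list")
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per label.
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)
    if p_e == 1.0:
        return math.nan  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1.0 - p_e)
```

    For example, `cohen_kappa(primary, expert)` with two equal-length lists of labels such as `"R"` (referable), `"N"` (non-referable) and `"U"` (ungradable) returns the chance-corrected agreement, which can then be compared with an audit threshold such as 0.7.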

  12. Continuous Recording and Interobserver Agreement Algorithms Reported in the "Journal of Applied Behavior Analysis" (1995-2005)

    ERIC Educational Resources Information Center

    Mudford, Oliver C.; Taylor, Sarah Ann; Martin, Neil T.

    2009-01-01

    We reviewed all research articles in 10 recent volumes of the "Journal of Applied Behavior Analysis (JABA)": Vol. 28(3), 1995, through Vol. 38(2), 2005. Continuous recording was used in the majority (55%) of the 168 articles reporting data on free-operant human behaviors. Three methods for reporting interobserver agreement (exact agreement,…

  13. Standardized assessment of tumor-infiltrating lymphocytes in breast cancer: an evaluation of inter-observer agreement between pathologists.

    PubMed

    Tramm, Trine; Di Caterino, Tina; Jylling, Anne-Marie B; Lelkaitis, Giedrius; Lænkholm, Anne-Vibeke; Ragó, Péter; Tabor, Tomasz P; Talman, Maj-Lis M; Vouza, Emmanouela

    2018-01-01

    In breast cancer, there is a growing body of evidence that tumor-infiltrating lymphocytes (TILs) may have clinical utility and may be able to direct clinical decisions for subgroups of patients. Clinical utility is, however, not sufficient for warranting the implementation of a new biomarker in routine practice, and evaluation of the analytical validity is needed, including testing the reproducibility of decentralized assessment of TILs. The aim of this study was to evaluate the inter-observer agreement of TILs assessment using a standardized method, as proposed by the International TILs Working Group 2014, applied to a cohort of breast cancers reflecting an average breast cancer population. Stromal TILs were assessed using full slide sections from 124 breast cancers with varying histology, malignancy grade and ER- and HER2 status. TILs were estimated by nine dedicated breast pathologists using scanned hematoxylin-eosin stainings. TILs results were categorized using various cutoffs, and the inter-observer agreement was evaluated using the intraclass correlation coefficient (ICC), Kappa statistics as well as individual overall agreements with the median value of TILs. Evaluation of TILs led to an ICC of 0.71 (95% CI: 0.65-0.77) corresponding to an acceptable agreement. Kappa values were in the range of 0.38-0.46 corresponding to a fair to moderate agreement. The individual agreements increased when using only two categories ('high' vs. 'low' TILs) and a cutoff of 50-60%. The results of the present study are in accordance with previous studies, and show that the proposed methodology for standardized evaluation of TILs renders an acceptable inter-observer agreement. The findings, however, indicate that assessment of TILs needs further refinement, and are in support of the latest St. Gallen Consensus, that routine reporting of TILs for early breast cancer is not ready for implementation in a clinical setting.

  14. The inter-observer agreement in the assessment of carotid plaque neovascularization by contrast-enhanced ultrasonography: The impact of plaque thickness.

    PubMed

    Chen, Jian; Zhang, Yan-Ming; Song, Ze-Zhou; Fu, Yan-Fei; Geng, Yu

    2018-04-10

    The interobserver agreement in the assessment of the grade of carotid plaque neovascularization by contrast-enhanced ultrasonography is poorly established. We examined 140 carotid plaques in 66 patients (all patients had bilateral plaques, and 8 patients had 2 plaques on one side). We performed conventional and contrast-enhanced ultrasonography to analyze the presence of carotid plaque neovascularization, which was graded by two independent observers whose interobserver agreement (κ) was evaluated according to the thickness of carotid plaque. For all carotid plaques, the mean κ was 0.689 (95% confidence interval 0.604-0.774). It was 0.689 (0.569-0.808), 0.637 (0.487-0.787), and 0.740 (0.585-0.896), respectively for carotid plaques with maximal thickness <2 mm, from 2 mm to 3 mm, and >3 mm. The interobserver agreement for assessing carotid plaque neovascularization by using contrast-enhanced ultrasonography is substantial and acceptable for research purposes, regardless of the maximal thickness of the plaque. © 2018 Wiley Periodicals, Inc.

  15. Interobserver agreement and diagnostic accuracy of brain magnetic resonance imaging in dogs.

    PubMed

    Leclerc, Mylène-Kim; d'Anjou, Marc-André; Blond, Laurent; Carmel, Éric Norman; Dennis, Ruth; Kraft, Susan L; Matthews, Andrea R; Parent, Joane M

    2013-06-15

    To evaluate interobserver agreement and diagnostic accuracy of brain MRI in dogs. Evaluation study. 44 dogs. 5 board-certified veterinary radiologists with variable MRI experience interpreted transverse T2-weighted (T2w), T2w fluid-attenuated inversion recovery (FLAIR), and T1-weighted-FLAIR; transverse, sagittal, and dorsal T2w; and T1-weighted-FLAIR postcontrast brain sequences (1.5 T). Several imaging parameters were scored, including the following: lesion (present or absent), lesion characteristics (axial localization, mass effect, edema, hemorrhage, and cavitation), contrast enhancement characteristics, and most likely diagnosis (normal, neoplastic, inflammatory, vascular, metabolic or toxic, or other). Magnetic resonance imaging diagnoses were determined initially without patient information and then repeated, providing history and signalment. For all cases and readers, MRI diagnoses were compared with final diagnoses established with results from histologic examination (when available) or with other pertinent clinical data (CSF analysis, clinical response to treatment, or MRI follow-up). Magnetic resonance scores were compared between examiners with κ statistics. Reading agreement was substantial to almost perfect (0.64 < κ < 0.86) when identifying a brain lesion on MRI; fair to moderate (0.14 < κ < 0.60) when interpreting hemorrhage, edema, and pattern of contrast enhancement; fair to substantial (0.22 < κ < 0.74) for dural tail sign and categorization of margins of enhancement; and moderate to substantial (0.40 < κ < 0.78) for axial localization, presence of mass effect, cavitation, intensity, and distribution of enhancement. Interobserver agreement was moderate to substantial for categories of diagnosis (0.56 < κ < 0.69), and agreement with the final diagnosis was substantial regardless of whether patient information was (0.65 < κ < 0.76) or was not (0.65 < κ < 0.68) provided. The present study found that whereas some MRI features such as edema

  16. MR imaging of silicone breast implants: evaluation of prospective and retrospective interpretations and interobserver agreement.

    PubMed

    Quinn, S F; Neubauer, N M; Sheley, R C; Demlow, T A; Szumowski, J

    1996-01-01

    MR imaging was used to evaluate the integrity of silicone breast implants in 54 women with 108 implants. MR images were interpreted by relatively inexperienced readers who tried to reproduce the experiences reported in the literature. The study examines the interobserver agreement using different diagnostic signs and the influence of experience on interpretation errors. Prospective and retrospective interpretations were compared with surgical findings at the time of explantation. Diagnostic indicators, including the linguine sign, the inverted tear drop sign, the C sign, water droplets mixed with silicone, and extracapsular globules of silicone, were evaluated for diagnostic efficacy and interobserver agreement. The prospective sensitivity and specificity were 87% and 78%, respectively. With the retrospective interpretations, the sensitivity and specificity increased to 93% and 92%, respectively. Most of the prospective false-positive interpretations were due to misinterpreting radial folds as signs of implant rupture. Six implants interpreted retrospectively as false positives had gross amounts of silicone around the implants at surgery but there were no obvious rents in the implant shells. There was fair to excellent interobserver agreement with the individual diagnostic signs except for extracapsular globules of silicone. All of the signs had specificities of greater than 90%. The sensitivities of the individual signs were less than the overall retrospective sensitivity. With experience, the sensitivity improved from 87% to 93% and the specificity improved from 78% to 92%. This study helps substantiate the use of diagnostic signs used by other authors to detect silicone loss from breast implants by MR imaging; however, questions remain as to the clinical role of MR imaging in evaluating implants for silicone loss.

  17. Continuous recording and interobserver agreement algorithms reported in the Journal of Applied Behavior Analysis (1995-2005).

    PubMed

    Mudford, Oliver C; Taylor, Sarah Ann; Martin, Neil T

    2009-01-01

    We reviewed all research articles in 10 recent volumes of the Journal of Applied Behavior Analysis (JABA): Vol. 28(3), 1995, through Vol. 38(2), 2005. Continuous recording was used in the majority (55%) of the 168 articles reporting data on free-operant human behaviors. Three methods for reporting interobserver agreement (exact agreement, block-by-block agreement, and time-window analysis) were employed in more than 10 of the articles that reported continuous recording. Having identified these currently popular agreement computation algorithms, we explain them to assist researchers, software writers, and other consumers of JABA articles.
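
    Two of the three algorithms named above can be sketched briefly. The abstract does not state the formulas, so this is an illustration that assumes the commonly described definitions: the session is split into equal intervals, each observer's event count per interval is known, exact agreement scores an interval only when the counts match, and block-by-block agreement scores each interval as smaller count divided by larger count. Time-window analysis is omitted, and the function names are hypothetical.

```python
def _check(counts_a, counts_b):
    """Both observers must supply a count for every interval."""
    if len(counts_a) != len(counts_b) or not counts_a:
        raise ValueError("need two equal-length, non-empty lists of counts")


def exact_agreement(counts_a, counts_b):
    """Percentage of intervals in which both observers recorded the same count."""
    _check(counts_a, counts_b)
    same = sum(x == y for x, y in zip(counts_a, counts_b))
    return 100.0 * same / len(counts_a)


def block_by_block_agreement(counts_a, counts_b):
    """Mean of per-interval smaller/larger count ratios, as a percentage."""
    _check(counts_a, counts_b)
    scores = []
    for x, y in zip(counts_a, counts_b):
        if x == y:
            scores.append(1.0)  # identical counts, including 0 and 0
        else:
            scores.append(min(x, y) / max(x, y))
    return 100.0 * sum(scores) / len(scores)
```

    With per-interval counts `[0, 2, 1, 3]` and `[0, 1, 1, 3]`, the two observers match in three of four intervals, and the one mismatched interval still earns partial credit under the block-by-block rule, which is why that measure is the more lenient of the two.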

  18. Detection of intracavitary uterine pathology using offline analysis of three-dimensional ultrasound volumes: interobserver agreement and diagnostic accuracy.

    PubMed

    Van den Bosch, T; Valentin, L; Van Schoubroeck, D; Luts, J; Bignardi, T; Condous, G; Epstein, E; Leone, F P; Testa, A C; Van Huffel, S; Bourne, T; Timmerman, D

    2012-10-01

    To estimate the diagnostic accuracy and interobserver agreement in predicting intracavitary uterine pathology at offline analysis of three-dimensional (3D) ultrasound volumes of the uterus. 3D volumes (unenhanced ultrasound and gel infusion sonography with and without power Doppler, i.e. four volumes per patient) of 75 women presenting with abnormal uterine bleeding at a 'bleeding clinic' were assessed offline by six examiners. The sonologists were asked to provide a tentative diagnosis. A histological diagnosis was obtained by hysteroscopy with biopsy or operative hysteroscopy. Proliferative, secretory or atrophic endometrium was classified as 'normal' histology; endometrial polyps, intracavitary myomas, endometrial hyperplasia and endometrial cancer were classified as 'abnormal' histology. The diagnostic accuracy of the six sonologists with regard to normal/abnormal histology and interobserver agreement were estimated. Intracavitary pathology was diagnosed at histology in 39% of patients. Agreement between the ultrasound diagnosis and the histological diagnosis (normal vs abnormal) ranged from 67 to 83% for the six sonologists. In 45% of cases all six examiners agreed with regard to the presence/absence of intracavitary pathology. The percentage agreement between any two examiners ranged from 65 to 91% (Cohen's κ, 0.31-0.81). The Schouten κ for all six examiners was 0.51 (95% CI, 0.40-0.62), while the highest Schouten κ for any three examiners was 0.69. When analyzing stored 3D ultrasound volumes, agreement between sonologists with regard to classifying the endometrium/uterine cavity as normal or abnormal as well as the diagnostic accuracy varied substantially. Possible actions to improve interobserver agreement and diagnostic accuracy include optimization of image quality and the use of a consistent technique for analyzing the 3D volumes. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.

  19. Good Agreements Make Good Friends

    PubMed Central

    Han, The Anh; Pereira, Luís Moniz; Santos, Francisco C.; Lenaerts, Tom

    2013-01-01

    When starting a new collaborative endeavor, it pays to establish upfront how strongly your partner commits to the common goal and what compensation can be expected in case the collaboration is violated. Diverse examples in biological and social contexts have demonstrated the pervasiveness of making prior agreements on posterior compensations, suggesting that this behavior could have been shaped by natural selection. Here, we analyze the evolutionary relevance of such a commitment strategy and relate it to the costly punishment strategy, where no prior agreements are made. We show that when the cost of arranging a commitment deal lies within certain limits, substantial levels of cooperation can be achieved. Moreover, these levels are higher than that achieved by simple costly punishment, especially when one insists on sharing the arrangement cost. Not only do we show that good agreements make good friends, agreements based on shared costs result in even better outcomes. PMID:24045873

  20. 'Ease of interpretation' of cytological smears stained with modified ultrafast papanicolaou stain: Interobserver agreement and reproducibility.

    PubMed

    Uthamalingam, Preithy; Sathish Kumar, Thabasum; Venus, Albina; Sekar, Preethi; Muthusamy, Rajeshwari K; Mehta, Sangita

    2018-04-01

    Since its inception in 1995, the Ultrafast Papanicolaou (UFPAP) cytological stain has undergone a number of modifications to suit the local availability of reagents and cost in different setups. However, the reported results have been uniformly encouraging. We designed a study to investigate the inter-observer agreement in 'perceived ease of interpretation' of cytological smears stained with Modified Ultrafast Papanicolaou stain (MUFPAP). After a small pilot study, we prospectively stained air-dried fine needle aspirate smears (FNACs) and Body Fluid smears with the standardized MUFPAP stain. The MUFPAP stained slides were evaluated in tandem with other routine cytological stains as well as independently by two pathologists. Two-rater kappa was used to determine the agreement. The study included 93 fluids and 34 FNACs. A vast majority of the cases stained with MUFPAP were rated 'better' than the routine stains in terms of 'overall ease of interpretation' with considerable agreement. The agreement tended to be better for FNACs than fluid specimens. Cases with malignant pathology demonstrated a perfect agreement (kappa = 1) between the raters in terms of 'overall ease of interpretation' (91.7% cases were rated 'very good' by each pathologist) when compared to cases with benign pathology (kappa = 0.52). Nuclear characteristics were appreciated with a better agreement than other parameters. Modified UFPAP stain appears to be a quick, reliable, cost-effective alternative in cytology, especially for detecting malignant cells in smears with low cellularity. Its specific advantage is robust nuclear staining against a clear background. © 2018 Wiley Periodicals, Inc.

  1. Does experience in hysteroscopy improve accuracy and inter-observer agreement in the management of abnormal uterine bleeding?

    PubMed

    Bourdel, Nicolas; Modaffari, Paola; Tognazza, Enrica; Pertile, Riccardo; Chauvet, Pauline; Botchorishivili, Revaz; Savary, Dennis; Pouly, Jean Luc; Rabischong, Benoit; Canis, Michel

    2016-12-01

    Hysteroscopic reliability may be influenced by the experience of the operator and by a lack of morphological diagnostic criteria for endometrial malignant pathologies. The aim of this study was to evaluate the diagnostic accuracy and the inter-observer agreement (IOA) in the management of abnormal uterine bleeding (AUB) among gynecologists with different levels of experience. Each gynecologist, without any other clinical information, was asked to evaluate the anonymous video recordings of 51 consecutive patients who underwent hysteroscopy and endometrial resection for AUB. Expert (>500 hysteroscopies), senior (20-499 procedures) and junior (≤19 procedures) gynecologists were asked to judge endometrial macroscopic appearance (benign, suspicious or frankly malignant). They also had to propose the histological diagnosis (atrophic or proliferative endometrium; simple, glandulocystic or atypical endometrial hyperplasia and endometrial carcinoma). Observers were free to indicate whether the quality of the recordings was not good enough for adequate assessment. IOA (k coefficient), sensitivity, specificity, predictive value and the likelihood ratio were calculated. Five expert, five senior and six junior gynecologists were involved in the study. Considering endometrial cancer and endometrial atypical hyperplasia, sensitivity and specificity were respectively 55.5% and 84.5% for juniors, 66.6% and 81.2% for seniors and 86.6% and 87.3% for experts. Concerning endometrial macroscopic appearance, IOA was poor for juniors (k = 0.10) and fair for seniors and experts (k = 0.23 and 0.22, respectively). IOA was poor for juniors and experts (k = 0.18 and 0.20, respectively) and fair for seniors (k = 0.30) in predicting the histological diagnosis. Sensitivity improves with the observer's experience, but inter-observer agreement and reproducibility of hysteroscopy for endometrial malignancies are unsatisfactory regardless of the level of expertise. Therefore, an accurate and

  2. Interobserver Agreement on Endoscopic Classification of Oesophageal Varices in Children.

    PubMed

    D'Antiga, Lorenzo; Betalli, Pietro; De Angelis, Paola; Davenport, Mark; Di Giorgio, Angelo; McKiernan, Patrick J; McLin, Valerie; Ravelli, Paolo; Durmaz, Ozlem; Talbotec, Cecile; Sturm, Ekkehard; Woynarowski, Marek; Burroughs, Andrew K

    2015-08-01

    Data regarding agreement on endoscopic features of oesophageal varices in children with portal hypertension (PH) are scant. The aim of this study was to evaluate endoscopic visualisation and classification of oesophageal varices in children by several European clinicians, to build a rational basis for future multicentre trials. Endoscopic pictures of the distal oesophagus of 100 children with a clinical diagnosis of PH were distributed to 10 endoscopists. Observers were requested to classify variceal size according to a 3-degree scale (small, medium, and large, class A), a 2-degree scale (small and large, class B), and to recognise red wales (presence or absence, class Red). Overall agreement was considered fair if Fleiss and Cohen κ test was ≥0.30, good if ≥0.40, excellent if ≥0.60, and perfect if ≥0.80. Agreement between observers was fair with class A (κ = 0.34) and class B (κ = 0.38), and good with class Red (κ = 0.49). The agreement was good on presence versus absence of varices (class A = 0.53, class B = 0.48). The agreement among the observers was good in class A when endoscopic features of severe PH (medium and large sizes, red marks) were grouped and compared with mild features (absent and small varices) (κ = 0.58). Experts working in different centres show a fairly good agreement on endoscopic features of PH in children, although a better training of paediatric endoscopists may improve the agreement in grading severity of varices in this setting.
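
    The verbal labels in this record follow the study's own cut-offs (fair ≥0.30, good ≥0.40, excellent ≥0.60, perfect ≥0.80), which differ from the scales used in other records on this page. A small sketch of that mapping is given below; the function name is hypothetical, and the label for values under 0.30 is an assumption, since the abstract does not name that band.

```python
def label_kappa_varices_study(kappa):
    """Map a kappa value to the verbal label used in this study's scale."""
    if kappa >= 0.80:
        return "perfect"
    if kappa >= 0.60:
        return "excellent"
    if kappa >= 0.40:
        return "good"
    if kappa >= 0.30:
        return "fair"
    return "below fair"  # assumption: the abstract gives no label under 0.30


# Values reported in the abstract, keyed by the comparison they describe.
reported = {"class A": 0.34, "class B": 0.38, "class Red": 0.49}
labels = {name: label_kappa_varices_study(k) for name, k in reported.items()}
```

    Keeping the thresholds in one function makes it explicit that the same κ of 0.49 reads as "good" here but only as "moderate" under the more widely used 0.41-0.60 band.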

  3. Agreement between Gonioscopic Examination and Swept Source Fourier Domain Anterior Segment Optical Coherence Tomography Imaging

    PubMed Central

    Nguyen, Donna; Minnal, Vandana R.

    2016-01-01

    Purpose. To evaluate interobserver, intervisit, and interinstrument agreements for gonioscopy and Fourier domain anterior segment optical coherence tomography (FD ASOCT) for classifying open and narrow angle eyes. Methods. Eighty-six eyes with open or narrow anterior chamber angles were included. The superior angle was classified open or narrow by 2 of 5 glaucoma specialists using gonioscopy and imaged by FD ASOCT in the dark. The superior angle of each FD ASOCT image was graded as open or narrow by 2 masked readers. The same procedures were repeated within 6 months. Kappas for interobserver and intervisit agreements for each instrument and interinstrument agreements were calculated. Results. The mean age was 50.9 (±18.4) years. Interobserver agreements were moderate to good for both gonioscopy (0.57 and 0.69) and FD ASOCT (0.58 and 0.75). Intervisit agreements were moderate to excellent for both gonioscopy (0.53 to 0.86) and FD ASOCT (0.57 and 0.85). Interinstrument agreements were fair to good (0.34 to 0.63), with FD ASOCT classifying more angles as narrow than gonioscopy. Conclusions. Both gonioscopy and FD ASOCT examiners were internally consistent with similar interobserver and intervisit agreements for angle classification. Agreement between instruments was fair to good, with FD ASOCT classifying more angles as narrow than gonioscopy. PMID:27990300

  4. Learning process for performing and analyzing 3D/4D transperineal ultrasound imaging and interobserver reliability study.

    PubMed

    Siafarikas, F; Staer-Jensen, J; Braekken, I H; Bø, K; Engh, M Ellström

    2013-03-01

    To evaluate the learning process for acquiring three- and four-dimensional (3D/4D) transperineal ultrasound volumes of the levator hiatus (LH) dimensions at rest, during pelvic floor muscle (PFM) contraction and on Valsalva maneuver, and for analyzing the ultrasound volumes, as well as to perform an interobserver reliability study between two independent ultrasound examiners. This was a prospective study including 22 women. We monitored the learning process of an inexperienced examiner (IE) performing 3D/4D transperineal ultrasonography and analyzing the volumes. The examination included acquiring volumes during three PFM contractions and three Valsalva maneuvers. LH dimensions were determined in the axial plane. The learning process was documented by estimating agreement between the IE and an experienced examiner (E) using the intraclass correlation coefficient. Agreement was calculated in blocks of 10 ultrasound examinations and analyzed volumes. After the learning process was complete the interobserver reliability for the technique was calculated between these two independent examiners. For offline analysis of the first 10 ultrasound volumes obtained by E, good to very good agreement between E and IE was achieved for all LH measurements except for the left and right levator-urethra gap and pubic arc. For the next 10 analyzed volumes, agreement improved for all LH measurements. Volumes that had been obtained by IE and E were then re-evaluated by IE, and good to very good agreement was found for all LH measurements indicating consistency in volume acquisition. The interobserver reliability study showed excellent ICC values (ICC, 0.81-0.97) for all LH measurements except the pubic arc (ICC = 0.67). 3D/4D transperineal ultrasound is a reliable technique that can be learned in a short period of time. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.
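
    Agreement between the two examiners is summarised here with the intraclass correlation coefficient. The abstract does not say which ICC form was used, so the sketch below assumes one common choice, the two-way random-effects, absolute-agreement, single-measurement form often written ICC(2,1); the function name and input layout are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    n = len(ratings)        # number of subjects (e.g. ultrasound volumes)
    k = len(ratings[0])     # number of raters (e.g. examiners)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

    Because this form penalises systematic offsets between raters, an inexperienced examiner who consistently measures larger than the experienced one lowers the coefficient even when the two rank the subjects identically.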

  5. Inter-observer reliability of radiographic classifications and measurements in the assessment of Perthes' disease.

    PubMed

    Wiig, Ola; Terjesen, Terje; Svenningsen, Svein

    2002-10-01

    We evaluated the inter-observer agreement of radiographic methods when evaluating patients with Perthes' disease. The radiographs were assessed at the time of diagnosis and at the 1-year follow-up by local orthopaedic surgeons (O) and 2 experienced paediatric orthopaedic surgeons (TT and SS). The Catterall, Salter-Thompson, and Herring lateral pillar classifications were compared, and the femoral head coverage (FHC), center-edge angle (CE-angle), and articulo-trochanteric distance (ATD) were measured in the affected and normal hips. On the primary evaluation, the lateral pillar and Salter-Thompson classifications had a higher level of agreement among the observers than the Catterall classification, but none of the classifications showed good agreement (weighted kappa values between O and SS 0.56, 0.54, 0.49, respectively). Combining Catterall groups 1 and 2 into one group, and groups 3 and 4 into another resulted in better agreement (kappa 0.55) than with the original 4-group system. The agreement was also better (kappa 0.62-0.70) between experienced than between less experienced examiners for all classifications. The femoral head coverage was a more reliable and accurate measure than the CE-angle for quantifying the acetabular covering of the femoral head, as indicated by higher intraclass correlation coefficients (ICC) and smaller inter-observer differences. The ATD showed good agreement in all comparisons and had low inter-observer differences. We conclude that all classifications of femoral head involvement are adequate in clinical work if the radiographic assessment is done by experienced examiners. When the examiners are less experienced, a 2-group classification or the lateral pillar classification is more reliable. For evaluation of containment of the femoral head, FHC is more appropriate than the CE-angle.
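
    The classifications compared here are ordinal, which is why weighted kappa is reported: a disagreement between adjacent groups counts for less than one between distant groups. The abstract does not state the weighting scheme, so this sketch assumes linear weights; the function name and argument layout are hypothetical.

```python
import math


def linear_weighted_kappa(rater_a, rater_b, categories):
    """Weighted kappa with linear disagreement weights |i - j| / (k - 1)."""
    k = len(categories)
    n = len(rater_a)
    if k < 2 or n == 0 or n != len(rater_b):
        raise ValueError("need >= 2 ordered categories and paired ratings")
    index = {c: i for i, c in enumerate(categories)}

    # Observed contingency table: rows are rater A, columns are rater B.
    observed = [[0] * k for _ in range(k)]
    for x, y in zip(rater_a, rater_b):
        observed[index[x]][index[y]] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_totals[i] * col_totals[j] / n
    if expected_disagreement == 0.0:
        return math.nan  # undefined when both raters used a single category
    return 1.0 - observed_disagreement / expected_disagreement
```

    Calling it with `categories=[1, 2, 3, 4]` treats the four Catterall groups as ordered; merging groups into two before the call reproduces the kind of 2-group comparison the abstract describes, where the weighted and unweighted statistics coincide.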

  6. Intra- and interobserver agreement among obstetric experts in court regarding the review of abnormal fetal heart rate tracings and obstetrical management.

    PubMed

    Sabiani, Laura; Le Dû, Renaud; Loundou, Anderson; d'Ercole, Claude; Bretelle, Florence; Boubli, Léon; Carcopino, Xavier

    2015-12-01

    The objective of the study was to evaluate the intra- and interobserver agreement among obstetric experts in court regarding the retrospective review of abnormal fetal heart rate tracings and obstetrical management of patients with abnormal fetal heart rate during labor. A total of 22 French obstetric experts in court reviewed 30 cases of term deliveries of singleton pregnancies diagnosed with at least 1 hour of abnormal fetal heart rate, including 10 cases with adverse neonatal outcome. The experts reviewed all cases twice within a 3-month interval, with the first review being blinded to neonatal outcome. For each case reviewed, the experts were provided with the obstetric data and copies of the complete fetal heart rate recording and the partogram. The experts were asked to classify the abnormal fetal heart rate tracing and to express whether they agreed with the obstetrical management performed. When they disagreed, the experts were asked whether they concluded that an error had been made and whether they considered the obstetrical management as the cause of cerebral palsy in children if any. Compared with blinded review, the experts were significantly more likely to agree with the obstetric management performed (P < .001) and with the mode of delivery (P < .001) when informed about the neonatal outcome and were less likely to conclude that an error had been made (P < .001) or to establish a link with potential cerebral palsy (P = .003). The experts' intraobserver agreement for the review of abnormal fetal heart rate tracing and obstetrical management were both mediocre (kappa = 0.46-0.51 and kappa = 0.48-0.53, respectively). The interobserver agreement for the review of abnormal fetal heart rate tracing was low and was not improved by knowledge of the neonatal outcome (kappa = 0.11-0.18). The interobserver agreement for the interpretation of obstetrical management was also low (kappa = 0.08-0.19) but appeared to be improved by knowledge of the neonatal outcome

  7. Applanation tonometry: interobserver and prism agreement using the reusable Goldmann applanation prism and the Tonosafe disposable prism.

    PubMed

    Ajtony, Csilla; Elkarmouty, Ahmed; Barton, Keith; Kotecha, Aachal

    2016-06-01

    To evaluate the levels of agreement between the standard reusable prism and a disposable prism, and to examine the agreement between ophthalmologists, nursing and technical staff when measuring intraocular pressure (IOP) using the Goldmann applanation tonometer. Three hundred eyes of 300 patients were recruited. IOP measurements were made in a randomised order by three observer groups consisting of ophthalmologists and ophthalmic technicians/nurses taken from a pool of clinicians working within a busy outpatient clinic. Agreement was calculated by Bland-Altman analysis, showing the mean difference and 95% limits of agreement (LoA) of measurements. The mean difference between the reusable and disposable prism IOP measurements was <0.5 mm Hg. The LoA ranged from ±3.1 to ±4.9 mm Hg, depending on the observer group. The interobserver variability was <1 mm Hg across all observer groups; the LoA was slightly higher for observers using the reusable prism (range between ±4.3 and ±5.6 mm Hg) compared with using the disposable prism (range between ±3.7 and ±5.4 mm Hg) across observer groups. There is an acceptable agreement between IOP measurements made with the reusable Goldmann tonometer prism and the disposable Tonosafe prism. Interobserver variability in IOP measurements within an outpatient setting is larger than that found within a research setting, and may be of a level that impacts on clinical decision-making. Published by the BMJ Publishing Group Limited.
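
    The Bland-Altman summary used above reduces to the mean of the paired differences and that mean ± 1.96 standard deviations of the differences. The following is a minimal sketch of that calculation; the function name is hypothetical, and it assumes one paired measurement per eye and the sample (n - 1) standard deviation.

```python
from statistics import mean, stdev


def bland_altman(method_1, method_2):
    """Return (mean difference, lower 95% LoA, upper 95% LoA) for paired data."""
    if len(method_1) != len(method_2) or len(method_1) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [x - y for x, y in zip(method_1, method_2)]
    bias = mean(differences)
    # 95% limits of agreement: bias +/- 1.96 sample standard deviations.
    half_width = 1.96 * stdev(differences)
    return bias, bias - half_width, bias + half_width
```

    In the notation of the abstract, the reported "mean difference" corresponds to the first returned value and a figure such as "±3.1 mm Hg" corresponds to the half-width `1.96 * stdev(differences)`.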

  8. Interobserver reliability of echocardiography for prognostication of normotensive patients with pulmonary embolism

    PubMed Central

    2014-01-01

    Objectives. To evaluate the interobserver reliability of echocardiographic findings of right ventricle (RV) dysfunction for prognosticating normotensive patients with pulmonary embolism (PE). Methods. A central panel of cardiologists evaluated echocardiographic studies of 75 patients included in the PROTECT study for the following signs: RV diameter, RV/left ventricular (LV) diameter ratio, hypokinesis of the RV free wall, and tricuspid annular plane systolic excursion (TAPSE). Investigators used intraclass correlation to assess agreement between the measurements of the central panel and each of the local cardiologists. Investigators used the single weighted kappa statistic to test for agreement between readers on the interpretation of RV enlargement and RV hypokinesis. Results. The two observers had fair agreement (κ = 0.45) for RV enlargement assessed by the RV diameter, and good agreement (κ = 0.65) for RV enlargement assessed by the RV/LV diameter ratio. The interobserver reliability of the assessment whether hypokinesis of the RV free wall is present was good (κ = 0.70), and whether RV dysfunction (assessed by TAPSE measurement) is present was very good (κ = 0.86). The intraclass correlation for the RV/LV diameter ratio was fair (0.55; 95% confidence interval [CI], 0.37-0.69), for the RV diameter was good (0.70; 95% CI, 0.56-0.80), and for the TAPSE measurement was very good (0.85; 95% CI, 0.77-0.90). On Bland-Altman analysis, the mean differences for RV diameter, RV/LV diameter ratio and TAPSE measurement were 2.33 (±5.38), 0.06 (±0.23) and 0.08 (±2.20), respectively. Conclusion. TAPSE measurement is the least user dependent and most reproducible echocardiographic finding of RV dysfunction in normotensive patients with PE. PMID:25092465

  9. Interobserver variability in the radiological assessment of magnetic resonance imaging (MRI) including perfusion MRI in glioblastoma multiforme.

    PubMed

    Kerkhof, M; Hagenbeek, R E; van der Kallen, B F W; Lycklama À Nijeholt, G J; Dirven, L; Taphoorn, M J B; Vos, M J

    2016-10-01

    Conventional magnetic resonance imaging (MRI) has limited value for differentiation of true tumor progression and pseudoprogression in treated glioblastoma multiforme (GBM). Perfusion weighted imaging (PWI) may be helpful in the differentiation of these two phenomena. Here interobserver variability in routine radiological evaluation of GBM patients is assessed using MRI, including PWI. Three experienced neuroradiologists evaluated MR scans of 28 GBM patients during temozolomide chemoradiotherapy at three time points: preoperative (MR1) and postoperative (MR2) MR scan and the follow-up MR scan after three cycles of adjuvant temozolomide (MR3). Tumor size was measured both on T1 post-contrast and T2 weighted images according to the Response Assessment in Neuro-Oncology criteria. PW images of MR3 were evaluated by visual inspection of relative cerebral blood volume (rCBV) color maps and by quantitative rCBV measurements of enhancing areas with highest rCBV. Image interpretability of PW images was also scored. Finally, the neuroradiologists gave a conclusion on tumor status, based on the interpretation of both T1 and T2 weighted images (MR1, MR2 and MR3) in combination with PWI (MR3). Interobserver agreement on visual interpretation of rCBV maps was good (κ = 0.63) but poor on quantitative rCBV measurements and on interpretability of perfusion images (intraclass correlation coefficient 0.37 and κ = 0.23, respectively). Interobserver agreement on the overall conclusion of tumor status was moderate (κ = 0.48). Interobserver agreement on the visual interpretation of PWI color maps was good. However, overall interpretation of MR scans (using both conventional and PW images) showed considerable interobserver variability. Therefore, caution should be applied when interpreting MRI results during chemoradiation therapy. © 2016 EAN.

  10. Intra-observer and inter-observer agreements for the measurement of dual-input whole tumor computed tomography perfusion in patients with lung cancer: Influences of the size and inner-air density of tumors.

    PubMed

    Wang, Qingle; Zhang, Zhiyong; Shan, Fei; Shi, Yuxin; Xing, Wei; Shi, Liangrong; Zhang, Xingwei

    2017-09-01

    This study was conducted to assess intra-observer and inter-observer agreements for the measurement of dual-input whole tumor computed tomography perfusion (DCTP) in patients with lung cancer. A total of 88 patients who had undergone DCTP and had a proven diagnosis of primary lung cancer were grouped in two ways: (i) nodules (diameter ≤3 cm) and masses (diameter >3 cm) by size, and (ii) tumors with and without air density. Pulmonary flow, bronchial flow, and pulmonary index were measured in each group. Intra-observer and inter-observer agreements for measurement were assessed using intraclass correlation coefficient, within-subject coefficient of variation, and Bland-Altman analysis. In all lung cancers, the reproducibility coefficient for intra-observer agreement (range 26.1-38.3%) was superior to that for inter-observer agreement (range 38.1-81.2%). Further analysis revealed lower agreements for nodules compared to masses. Additionally, inner-air density reduced both agreements for lung cancer. The intra-observer agreement for measuring lung cancer DCTP was satisfactory, while the inter-observer agreement was limited. The effects of tumor size and inner-air density on agreement, especially between two observers, should be emphasized. In future, automatic computer-aided segmentation for measuring tumor perfusion values should be developed. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.

  11. Polyp morphology: an interobserver evaluation for the Paris classification among international experts.

    PubMed

    van Doorn, Sascha C; Hazewinkel, Y; East, James E; van Leerdam, Monique E; Rastogi, Amit; Pellisé, Maria; Sanduleanu-Dascalescu, Silvia; Bastiaansen, Barbara A J; Fockens, Paul; Dekker, Evelien

    2015-01-01

    The Paris classification is an international classification system for describing polyp morphology. Thus far, the validity and reproducibility of this classification have not been assessed. We aimed to determine the interobserver agreement for the Paris classification among seven Western expert endoscopists. A total of 85 short endoscopic video clips depicting polyps were created and assessed by seven expert endoscopists according to the Paris classification. After a digital training module, the same 85 polyps were assessed again. We calculated the interobserver agreement with a Fleiss kappa and as the proportion of pairwise agreement. The interobserver agreement of the Paris classification among seven experts was moderate with a Fleiss kappa of 0.42 and a mean pairwise agreement of 67%. The proportion of lesions assessed as "flat" by the experts ranged between 13 and 40% (P<0.001). After the digital training, the interobserver agreement did not change (kappa 0.38, pairwise agreement 60%). Our study is the first to validate the Paris classification for polyp morphology. We demonstrated only a moderate interobserver agreement among international Western experts for this classification system. Our data suggest that, in its current version, the use of this classification system in daily practice is questionable and it is unsuitable for comparative endoscopic research. We therefore suggest introduction of a simplification of the classification system.
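
    The two statistics reported in this abstract, Fleiss kappa and the proportion of pairwise agreement, can be computed from a table of category counts per rated item. The sketch below is illustrative only: it uses the standard Fleiss formula, the function name `fleiss_kappa` is hypothetical, and the small ratings table is invented rather than drawn from the study.

```python
def fleiss_kappa(counts):
    """Return (kappa, mean pairwise agreement) for a ratings table.

    counts[i][j] is the number of raters who put item i in category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item proportion of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(p_item) / n_items          # mean pairwise agreement
    expected = sum(p * p for p in p_cat)      # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected), observed


# Hypothetical table: 4 polyps, 3 raters, 2 categories (e.g. flat / not flat).
ratings = [[3, 0], [2, 1], [0, 3], [1, 2]]
kappa, pairwise = fleiss_kappa(ratings)
print(f"Fleiss kappa {kappa:.2f}, mean pairwise agreement {pairwise:.0%}")
```

    In this toy table two thirds of rater pairs agree, yet kappa is only about 0.33, because half of that agreement would be expected by chance. The same gap separates the 67% pairwise agreement and the kappa of 0.42 quoted above.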

  12. Investigating Various Thresholds as Immunohistochemistry Cutoffs for Observer Agreement.

    PubMed

    Ali, Asif; Bell, Sarah; Bilsland, Alan; Slavin, Jill; Lynch, Victoria; Elgoweini, Maha; Derakhshan, Mohammad H; Jamieson, Nigel B; Chang, David; Brown, Victoria; Denley, Simon; Orange, Clare; McKay, Colin; Carter, Ross; Oien, Karin A; Duthie, Fraser R

    2017-10-01

    Clinical translation of immunohistochemistry (IHC) biomarkers requires reliable and reproducible cutoffs or thresholds for interpretation of immunostaining. Most IHC biomarker research focuses on the clinical relevance (diagnostic, prognostic, or predictive utility) of cutoffs, with less emphasis on observer agreement using these cutoffs. From the literature, we identified 3 commonly used cutoffs of 10% positive epithelial cells, 20% positive epithelial cells, and moderate to strong staining intensity (+2/+3 hereafter) to use for investigating observer agreement. A series of 36 images of microarray cores stained for 4 different IHC biomarkers, with variable staining intensity and percentage of positive cells, was used for investigating interobserver and intraobserver agreement. Seven pathologists scored the immunostaining in each image using the 3 cutoffs for positive and negative staining. Kappa (κ) statistic was used to assess the strength of agreement for each cutoff. The interobserver agreement between all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.64, 0.59, and 0.62, respectively, for 10%, 20%, and +2/+3 cutoffs. A good agreement was observed for experienced pathologists using the 10% cutoff, and their agreement was statistically higher than for junior pathologists (P=0.02). In addition, the mean intraobserver agreement for all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.71, 0.60, and 0.73, respectively, for 10%, 20%, and +2/+3 cutoffs. For all 3 cutoffs, a positive correlation was observed with perceived ease of interpretation (P<0.003). Finally, cytoplasmic-only staining achieved higher agreement using all 3 cutoffs than mixed staining patterns. All 3 cutoffs investigated achieve reasonable strength of agreement, modestly decreasing interobserver and intraobserver variability in IHC interpretation. These cutoffs have previously been used in cancer pathology, and this study provides

  13. Intraobserver and interobserver agreement in the classification and treatment of midshaft clavicle fractures.

    PubMed

    Jones, Grant L; Bishop, Julie Y; Lewis, Brian; Pedroza, Angela D

    2014-05-01

    With the recent emphasis on performing open reduction and internal fixation on midshaft clavicle fractures with complete displacement, comminution, and >2 cm of shortening, it is important to determine the reliability of orthopaedic surgeons to assess these variables on standard plain radiographs and to determine the agreement among orthopaedic surgeons in choosing the treatment. To determine the intra- and interobserver reliability in the classification of midshaft clavicle fractures via standard plain radiographs and to determine the intra- and interobserver agreement in the treatment of these fractures. Cohort study (diagnosis); Level of evidence, 3. Charts of patients seen by the 2 senior authors from 2006 to 2011 were reviewed to identify patients treated for clavicle fractures (CPT codes 23500 and 23515). Anteroposterior and 30° cephalad radiographs were selected, representing midshaft clavicle fractures treated both operatively and nonoperatively. Thirty pairs of radiographs were included in the investigation. The radiographs were standardized for size to allow accurate measurements within a non-PACS (picture archiving and communications system) program, and a PDF document was created with all representative radiographs. Clinical scenarios were created for each set of radiographs, and the evaluators were asked to (1) measure the degree of shortening in millimeters, (2) determine the percentage displacement, (3) determine whether the fracture was comminuted, and (4) state whether they would treat the fracture operatively or nonoperatively. The radiographs, along with instructions on how to use the measuring tool with Adobe Reader, were distributed to 22 shoulder/sports medicine fellowship-trained orthopaedic surgeons, then reordered and redistributed approximately 3 months later. Sixteen surgeons completed 1 round of surveys, and 13 surgeons completed both rounds. Interrater agreement was moderate for displacement of 0%-49% (κ = 0.71, P < .001) and >100

  14. The Problems with the Kappa Statistic as a Metric of Interobserver Agreement on Lesion Detection Using a Third-reader Approach When Locations Are Not Prespecified.

    PubMed

    Shih, Joanna H; Greer, Matthew D; Turkbey, Baris

    2018-03-16

    To point out the problems with Cohen kappa statistic and to explore alternative metrics to determine interobserver agreement on lesion detection when locations are not prespecified. Use of kappa and two alternative methods, namely index of specific agreement (ISA) and modified kappa, for measuring interobserver agreement on the location of detected lesions are presented. These indices of agreement are illustrated by application to a retrospective multireader study in which nine readers detected and scored prostate cancer lesions in 163 consecutive patients (n = 110 cases, n = 53 controls) using the guideline of Prostate Imaging Reporting and Data System version 2 on multiparametric magnetic resonance imaging. The proposed modified kappa, which properly corrects for the amount of agreement by chance, is shown to be approximately equivalent to the ISA. In the prostate cancer data, average kappa, modified kappa, and ISA equaled 30%, 55%, and 57%, respectively, for all lesions and 20%, 87%, and 87%, respectively, for index lesions. The application of kappa could result in a substantial downward bias in reader agreement on lesion detection when locations are not prespecified. ISA is recommended for assessment of reader agreement on lesion detection. Published by Elsevier Inc.
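
    One way to see the downward bias described here is to compare Cohen kappa with a specific-agreement index on a 2×2 table in which "both readers found nothing" has no countable cell. The sketch below is an illustration under stated assumptions, not the authors' method: it takes the index of specific agreement in its usual positive-agreement form, 2a / (2a + b + c), sets the jointly negative cell to zero, and uses invented counts. The paper's modified kappa is not reproduced.

```python
def cohen_kappa(a, b, c, d):
    """Cohen kappa for a 2x2 table.

    a: both readers positive, b: only reader 1 positive,
    c: only reader 2 positive, d: both readers negative.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the two readers' marginal positive/negative rates.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (observed - expected) / (1.0 - expected)


def specific_agreement(a, b, c):
    """Positive specific agreement: 2a / (2a + b + c). Ignores cell d."""
    return 2 * a / (2 * a + b + c)


# Hypothetical lesion counts: 40 lesions marked by both readers, 10 by each
# reader alone. With no prespecified locations there is no natural count of
# jointly negative sites, so d is set to 0 here (an assumption).
a, b, c, d = 40, 10, 10, 0
kappa = cohen_kappa(a, b, c, d)
isa = specific_agreement(a, b, c)
print(f"kappa {kappa:.2f}, specific agreement {isa:.2f}")
```

    With these counts the readers agree on 80% of the lesions either of them marked, while kappa comes out negative. The example shows the direction of the effect only and makes no claim about the magnitudes reported in the study.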

  15. Inter-observer agreement of standard joint count examination and disease global assessment in a cohort of Egyptian Rheumatoid Arthritis patients.

    PubMed

    El-Hadidi, Khaled; Gamal, Sherif M; Saad, Sahar

    2017-12-21

    To assess the inter-observer agreement of standard joint count between an experienced Rheumatology professor (Prof) and a young Rheumatology fellow (candidate), and to compare disease global assessment between professor, candidate and patients. This study included one hundred rheumatoid arthritis patients. For all patients, independent clinical evaluation was done by two rheumatologists (professor and candidate) for detection of tenderness in 28 joints and swelling in 26 joints. The study also involved global assessment of disease activity by the provider (Prof and candidate) (EGA) as well as by the patient (PGA). The EGA was determined without previous knowledge of the patient's laboratory test results. A highly significant correlation between professor and candidate was found in both the number of tender joints (r=0.946, p<0.001) and the number of swollen joints (r=0.797, p<0.001). Regarding swollen joints, the highest agreement was in the right knee (0.929), while poor agreement was found in the right 5th MCP (0.049). Regarding tender joints, the highest agreement was in the right elbow (0.899), whereas the left 3rd PIP (0.462) showed the lowest. Kappa analysis of disease global assessment showed moderate agreement between professor and candidate (0.405), fair agreement between professor and patient (0.213), and fair agreement between candidate and patient (0.367). Inter-observer reliability was better for TJCs than SJCs. Regarding SJCs, agreement was better in large joints such as the knees than in small joints such as the MCPs. Disease global assessment may show discrepancy between patients and physicians. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Reumatología y Colegio Mexicano de Reumatología. All rights reserved.

  16. Interobserver Agreement in Clinical Grading of Vitreous Haze Using Alternative Grading Scales

    PubMed Central

    Hornbeak, Dana M; Payal, Abhishek; Pistilli, Maxwell; Biswas, Jyotirmay; Ganesh, Sudha K; Gupta, Vishali; Rathinam, Sivakumar R; Davis, Janet L; Kempen, John H

    2014-01-01

    Purpose To evaluate the reliability of clinical grading of vitreous haze using a new 9-step ordinal scale vs. the existing 6-step ordinal scale. Design Evaluation of Diagnostic Test (interobserver agreement study). Participants 119 consecutive patients (204 uveitic eyes) presenting for uveitis subspecialty care on the study day at one of three large uveitis centers. Methods Five pairs of uveitis specialists clinically graded vitreous haze in the same eyes, one after the other using the same equipment, using the 6- and 9-step scales. Main Outcome Measures Agreement in vitreous haze grade between each pair of specialists was evaluated by the κ statistic (exact agreement and agreement within one or two grades). Results The scales correlated well (Spearman’s ρ=0.84). Exact agreement was modest using both the 6-step and 9-step scales: average κ=0.46 (range 0.28–0.81) and κ=0.40 (range 0.15–0.63), respectively. Within-1-grade agreement was slightly more favorable for the scale with fewer steps, but values were excellent for both scales: κ=0.75 (range 0.66–0.96) and κ=0.62 (range 0.38–0.87), respectively. Within-2-grade agreement for the 9-step scale also was excellent [κ=0.85 (range 0.79–0.92)]. Two-fold more cases were potentially clinical trial eligible based on the 9- than the 6-step scale (p<0.001). Conclusions Both scales are sufficiently reproducible using clinical grading for clinical and research use with the appropriate threshold (a ≥2 and ≥3 step differences for the 6-step and 9-step scales respectively). The results suggest that more eyes are likely to meet eligibility criteria for trials using the 9-step scale. The 9-step scale appears to have higher reproducibility with Reading Center grading than clinical grading, suggesting Reading Center grading may be preferable for clinical trials. PMID:24697913

  17. 68Ga-PSMA-11 PET/CT Interobserver Agreement for Prostate Cancer Assessments: An International Multicenter Prospective Study.

    PubMed

    Fendler, Wolfgang Peter; Calais, Jeremie; Allen-Auerbach, Martin; Bluemel, Christina; Eberhardt, Nina; Emmett, Louise; Gupta, Pawan; Hartenbach, Markus; Hope, Thomas A; Okamoto, Shozo; Pfob, Christian Helmut; Pöppel, Thorsten D; Rischpler, Christoph; Schwarzenböck, Sarah; Stebner, Vanessa; Unterrainer, Marcus; Zacho, Helle D; Maurer, Tobias; Gratzke, Christian; Crispin, Alexander; Czernin, Johannes; Herrmann, Ken; Eiber, Matthias

    2017-10-01

    The interobserver agreement for 68Ga-PSMA-11 PET/CT study interpretations in patients with prostate cancer is unknown. Methods: 68Ga-PSMA-11 PET/CT was performed in 50 patients with prostate cancer for biochemical recurrence (n = 25), primary diagnosis (n = 10), biochemical persistence after primary therapy (n = 5), or staging of known metastatic disease (n = 10). Images were reviewed by 16 observers who used a standardized approach for interpretation of local (T), nodal (N), bone (Mb), or visceral (Mc) involvement. Observers were classified as having a low (<30 prior 68Ga-PSMA-11 PET/CT studies; n = 5), intermediate (30-300 studies; n = 5), or high level of experience (>300 studies; n = 6). Histopathology (n = 25, 50%), post-external-beam radiation therapy prostate-specific antigen response (n = 15, 30%), or follow-up PET/CT (n = 10, 20%) served as a standard of reference. Observer groups were compared by overall agreement (% patients matching the standard of reference) and Fleiss' κ with mean and corresponding 95% confidence interval (CI). Results: Agreement among all observers was substantial for T (κ = 0.62; 95% CI, 0.59-0.64) and N (κ = 0.74; 95% CI, 0.71-0.76) staging and almost perfect for Mb (κ = 0.88; 95% CI, 0.86-0.91) staging. Level of experience positively correlated with agreement for T (κ = 0.73/0.66/0.50 for high/intermediate/low experience, respectively), N (κ = 0.80/0.76/0.64, respectively), and Mc staging (κ = 0.61/0.46/0.36, respectively). Interobserver agreement for Mb was almost perfect irrespective of prior experience (κ = 0.87/0.91/0.88, respectively). Observers with low experience, when compared with intermediate and high experience, demonstrated significantly lower median overall agreement (54% vs. 66% and 76%, P = 0.041) and specificity for T staging (73% vs. 88% and 93%, P = 0.032). Conclusion: The interpretation of 68Ga-PSMA-11 PET/CT for prostate cancer staging is highly consistent among observers with high levels of

  18. Measurement of focal ground-glass opacity diameters on CT images: interobserver agreement in regard to identifying increases in the size of ground-glass opacities.

    PubMed

    Kakinuma, Ryutaro; Ashizawa, Kazuto; Kuriyama, Keiko; Fukushima, Aya; Ishikawa, Hiroyuki; Kamiya, Hisashi; Koizumi, Naoya; Maruyama, Yuichiro; Minami, Kazunori; Nitta, Norihisa; Oda, Seitaro; Oshiro, Yasuji; Kusumoto, Masahiko; Murayama, Sadayuki; Murata, Kiyoshi; Muramatsu, Yukio; Moriyama, Noriyuki

    2012-04-01

    To evaluate interobserver agreement in regard to measurements of focal ground-glass opacities (GGO) diameters on computed tomography (CT) images to identify increases in the size of GGOs. Approval by the institutional review board and informed consent by the patients were obtained. Ten GGOs (mean size, 10.4 mm; range, 6.5-15 mm), one each in 10 patients (mean age, 65.9 years; range, 58-78 years), were used to make the diameter measurements. Eleven radiologists independently measured the diameters of the GGOs on a total of 40 thin-section CT images (the first [n = 10], the second [n = 10], and the third [n = 10] follow-up CT examinations and remeasurement of the first [n = 10] follow-up CT examinations) without comparing time-lapse CT images. Interobserver agreement was assessed by means of Bland-Altman plots. The smallest range of the 95% limits of interobserver agreement between the members of the 55 pairs of the 11 radiologists in regard to maximal diameter was -1.14 to 1.72 mm, and the largest range was -7.7 to 1.7 mm. The mean value of the lower limit of the 95% limits of agreement was -3.1 ± 1.4 mm, and the mean value of their upper limit was 2.5 ± 1.1 mm. When measurements are made by any two radiologists, an increase in the length of the maximal diameter of more than 1.72 mm would be necessary in order to be able to state that the maximal diameter of a particular GGO had actually increased. Copyright © 2012 AUR. Published by Elsevier Inc. All rights reserved.

  19. 68Ga-DOTATATE PET/CT interobserver agreement for neuroendocrine tumor assessments: results from a prospective study on 50 patients

    PubMed Central

    Fendler, Wolfgang Peter; Barrio, Martin; Spick, Claudio; Allen-Auerbach, Martin; Ambrosini, Valentina; Benz, Matthias; Bluemel, Christina; Grewal, Ravinder Kaur; Lapa, Constantin; Miederer, Matthias; Nicolas, Guillaume; Schuster, Tibor; Czernin, Johannes; Herrmann, Ken

    2016-01-01

    We evaluated the observer agreement for 68Ga-DOTATATE PET/CT study interpretations in patients with neuroendocrine tumors (NET). Methods 68Ga-DOTATATE PET/CT was performed in 50 patients with known or suspected NET of the small bowel (n = 19), pancreas (n = 14), lung (n = 4) or other location (n = 13). Images were reviewed by seven observers who used a standardized approach for image interpretation. Observers were classified as having low (<500 scans or <5 years experience with 68Ga-DOTATATE PET/CT; n = 4) or high level of experience (≥500 scans and ≥5 years experience with 68Ga-DOTATATE PET/CT; n = 3). Interpretation by the primary nuclear medicine physician un-blinded to all clinical and imaging data served as reference standard. Interobserver agreement was determined by Cohen's κ and intraclass correlation coefficient (ICC) with corresponding 95% confidence interval (CI). Results Interobserver agreement was substantial and the median number of false findings (FF) was low for the overall scan result; i.e. positive versus negative study (κ = 0.80, 95%CI 0.74–0.86; FF = 3), organ involvement (κ = 0.70, 95%CI 0.64–0.76; FF = 5), and lymph node involvement (κ = 0.71, 95%CI 0.65–0.78; FF = 6). The interobserver agreement was substantial to almost-perfect and the average absolute difference (Δ) to the reference reader was low for number of organ and lymph node metastases (ICC = 0.84, 95%CI 0.77–0.89, Δ = 0.45 and ICC = 0.77, 95%CI 0.69–0.84, Δ = 0.45), tumor SUVmax (ICC = 0.99, 95%CI 0.97–0.99; Δ = 0.44) and reference SUV (SUVmean spleen: ICC = 0.81, Δ = 1.10; SUVmax liver ICC = 0.79, Δ = 0.62). Interpretations of the appropriateness for peptide-receptor radionuclide therapy (PRRT) varied more significantly among observers (κ = 0.64, 95%CI 0.57–0.70) and a higher frequency of false positive recommendations for PRRT occurred in observers with low versus high levels of experience (range, 7–12 versus 4–8). Conclusion The interpretation of

  20. Inter-observer agreement on a checklist to evaluate scientific publications in the field of animal reproduction.

    PubMed

    Simoneit, Céline; Heuwieser, Wolfgang; Arlt, Sebastian P

    2012-01-01

    This study's objective was to determine respondents' inter-observer agreement on a detailed checklist to evaluate three exemplars (one case report, one randomized controlled study without blinding, and one blinded, randomized controlled study) of the scientific literature in the field of bovine reproduction. Fourteen international scientists in the field of animal reproduction were provided with the three articles, three copies of the checklist, and a supplementary explanation. Overall, 13 responded to more than 90% of the items. Overall repeatability between respondents using Fleiss's κ was 0.35 (fair agreement). Combining the "strongly agree" and "agree" responses and the "strongly disagree" and "disagree" responses increased κ to 0.49 (moderate agreement). Evaluation of information given in the three articles on housing of the animals (35% identical answers) and preconditions or pretreatments (42%) varied widely. Even though the overall repeatability was fair, repeatability concerning the important categories was high (e.g., level of agreement=98%). Our data show that the checklist is a reasonable and practical supporting tool to assess the quality of publications. Therefore, it may be used in teaching and practicing evidence-based veterinary medicine. It can support training in systematic and critical appraisal of information and in clinical decision making.

  1. The learning curve, interobserver, and intraobserver agreement of endoscopic confocal laser endomicroscopy in the assessment of mucosal barrier defects.

    PubMed

    Chang, Jeff; Ip, Matthew; Yang, Michael; Wong, Brendon; Power, Theresa; Lin, Lisa; Xuan, Wei; Phan, Tri Giang; Leong, Rupert W

    2016-04-01

    Confocal laser endomicroscopy can dynamically assess intestinal mucosal barrier defects and increased intestinal permeability (IP). These are functional features that do not have corresponding appearance on histopathology. As such, previous pathology training may not be beneficial in learning these dynamic features. This study aims to evaluate the diagnostic accuracy, learning curve, inter- and intraobserver agreement for identifying features of increased IP in experienced and inexperienced analysts and pathologists. A total of 180 endoscopic confocal laser endomicroscopy (Pentax EC-3870FK; Pentax, Tokyo, Japan) images of the terminal ileum, subdivided into 6 sets of 30 were evaluated by 6 experienced analysts, 13 inexperienced analysts, and 2 pathologists, after a 30-minute teaching session. Cell-junction enhancement, fluorescein leak, and cell dropout were used to represent increased IP and were either present or absent in each image. For each image, the diagnostic accuracy, confidence, and quality were assessed. Diagnostic accuracy was significantly higher for experienced analysts compared with inexperienced analysts from the first set (96.7% vs 83.1%, P < .001) to the third set (95% vs 89.7%, P = .127). No differences in accuracy were noted between inexperienced analysts and pathologists. Confidence (odds ratio, 8.71; 95% confidence interval, 5.58-13.57) and good image quality (odds ratio, 1.58; 95% confidence interval, 1.22-2.03) were associated with improved interpretation. Interobserver agreement κ values were high and improved with experience (experienced analysts, 0.83; inexperienced analysts, 0.73; and pathologists, 0.62). Intraobserver agreement was >0.86 for experienced observers. Features representative of increased IP can be rapidly learned with high inter- and intraobserver agreement. Confidence and image quality were significant predictors of accurate interpretation. Previous pathology training did not have an effect on learning. Copyright © 2016

  2. Atlas-based segmentation technique incorporating inter-observer delineation uncertainty for whole breast

    NASA Astrophysics Data System (ADS)

    Bell, L. R.; Dowling, J. A.; Pogson, E. M.; Metcalfe, P.; Holloway, L.

    2017-01-01

    Accurate, efficient auto-segmentation methods are essential for the clinical efficacy of adaptive radiotherapy delivered with highly conformal techniques. Current atlas-based auto-segmentation techniques are adequate in this respect but fail to account for inter-observer variation. An atlas-based segmentation method that incorporates inter-observer variation is proposed. This method is validated for a whole breast radiotherapy cohort containing 28 CT datasets with CTVs delineated by eight observers. To optimise atlas accuracy, the cohort was divided into categories by mean body mass index and laterality, with atlases generated for each in a leave-one-out approach. Observer CTVs were merged and thresholded to generate an auto-segmentation model representing both inter-observer and inter-patient differences. For each category, the atlas was registered to the left-out dataset to enable propagation of the auto-segmentation from atlas space. Auto-segmentation time was recorded. The segmentation was compared to the gold-standard contour using the Dice similarity coefficient (DSC) and mean absolute surface distance (MASD). Comparison with the smallest and largest CTV was also made. This atlas-based auto-segmentation method incorporating inter-observer variation was shown to be efficient (<4min) and accurate for whole breast radiotherapy, with good agreement (DSC>0.7, MASD <9.3mm) between the auto-segmented contours and CTV volumes.
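
    The Dice similarity coefficient used here to compare contours is twice the overlap of two regions divided by the sum of their sizes. A minimal sketch follows, assuming each segmentation is given as a set of voxel index tuples; the function name `dice` and the toy masks are hypothetical, and returning 1.0 for two empty masks is a convention chosen for this example.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two sets of voxel indices.

    DSC = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks). Two empty masks are treated as identical.
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical 2D masks given as (row, column) voxel indices.
auto_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference_contour = {(0, 1), (1, 1), (1, 2)}

dsc = dice(auto_contour, reference_contour)
print(f"DSC = {dsc:.3f}")
```

    The toy masks share 2 voxels out of 4 and 3, giving a DSC of 4/7, which would fall below the DSC > 0.7 level reported above.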

  3. Application of Prostate Imaging Reporting and Data System Version 2 (PI-RADS v2): Interobserver Agreement and Positive Predictive Value for Localization of Intermediate- and High-Grade Prostate Cancers on Multiparametric Magnetic Resonance Imaging.

    PubMed

    Chen, Frank; Cen, Steven; Palmer, Suzanne

    2017-09-01

    To evaluate interobserver agreement with the use of and the positive predictive value (PPV) of Prostate Imaging Reporting and Data System version 2 (PI-RADS v2) for the localization of intermediate- and high-grade prostate cancers on multiparametric magnetic resonance imaging (mpMRI). In this retrospective, institutional review board-approved study, 131 consecutive patients who had mpMRI followed by transrectal ultrasound-MR imaging fusion-guided biopsy of the prostate were included. Two readers who were blinded to initial mpMRI reports, clinical data, and pathologic outcomes reviewed the MR images, identified all prostate lesions, and scored each lesion based on the PI-RADS v2. Interobserver agreement was assessed by intraclass correlation coefficient (ICC), and PPV was calculated for each PI-RADS category. PI-RADS v2 was found to have a moderate level of interobserver agreement between two readers of varying experience, with ICC of 0.74, 0.72, and 0.67 for all lesions, peripheral zone lesions, and transitional zone lesions, respectively. Despite only moderate interobserver agreement, the calculated PPV in the detection of intermediate- and high-grade prostate cancers for each PI-RADS category was very similar between the two readers, with approximate PPV of 0%, 12%, 64%, and 87% for PI-RADS categories 2, 3, 4, and 5, respectively. In our study, PI-RADS v2 has only moderate interobserver agreement, a similar finding in studies of the original PI-RADS and in initial studies of PI-RADS v2. Despite this, PI-RADS v2 appears to be a useful system to predict significant prostate cancer, with PI-RADS scores correlating well with the likelihood of intermediate- and high-grade cancers. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

  4. Interobserver variability of sonography for prediction of placenta accreta.

    PubMed

    Bowman, Zachary S; Eller, Alexandra G; Kennedy, Anne M; Richards, Douglas S; Winter, Thomas C; Woodward, Paula J; Silver, Robert M

    2014-12-01

    The sensitivity of sonography to predict accreta has been reported as higher than 90%. However, most studies are from single expert investigators. Our objective was to analyze interobserver variability of sonography for prediction of placenta accreta. Patients with previa with and without accreta were ascertained, and images with placental views were collected, deidentified, and placed in random sequence. Three radiologists and 3 maternal-fetal medicine specialists interpreted each study for the presence of accreta and specific findings reported to be associated with its diagnosis. Investigator-specific sensitivity, specificity, and accuracy were calculated. κ statistics were used to assess variability between individuals and types of investigators. A total of 229 sonographic studies from 55 patients with accreta and 56 control patients were examined. Accuracy ranged from 55.9% to 76.4%. Of imaging studies yielding diagnoses, sensitivity ranged from 53.4% to 74.4%, and specificity ranged from 70.8% to 94.8%. Overall interobserver agreement was moderate (mean κ ± SD = 0.47 ± 0.12). κ values between pairs of investigators ranged from 0.32 (fair agreement) to 0.73 (substantial agreement). Average individual agreement ranged from fair (κ = 0.35) to moderate (κ = 0.53). Blinded from clinical data, sonography has significant interobserver variability for the diagnosis of placenta accreta. © 2013 by the American Institute of Ultrasound in Medicine.

  5. Inter-Observer, Intra-Observer and Intra-Individual Reliability of Uroflowmetry Tests in Aged Men: A Generalizability Theory Approach.

    PubMed

    Liu, Ying-Buh; Yang, Stephen S; Hsieh, Cheng-Hsing; Lin, Chia-Da; Chang, Shang-Jen

    2014-05-01

    To evaluate the inter-observer, intra-observer and intra-individual reliability of uroflowmetry and post-void residual urine (PVR) tests in adult men. Healthy volunteers aged over 40 years were enrolled. Every participant underwent two sets of uroflowmetry and PVR tests with a 2-week interval between the tests. The uroflowmetry tests were interpreted by four urologists independently. Uroflowmetry curves were classified as bell-shaped, bell-shaped with tail, obstructive, restrictive, staccato, interrupted and tower-shaped and scored from 1 (highly abnormal) to 5 (absolutely normal). The agreements between the observers, interpretations and tests within individuals were analyzed using kappa statistics and intraclass correlation coefficients. Generalizability theory with decision analysis was used to determine how many observers, tests, and interpretations were needed to obtain an acceptable reliability (> 0.80). Of 108 volunteers, we randomly selected the uroflowmetry results from 25 participants for the evaluation of reliability. The mean age of the studied adults was 55.3 years. The intra-individual and intra-observer reliability on uroflowmetry tests ranged from good to very good. However, the inter-observer reliability on normalcy and specific type of flow pattern were relatively lower. In generalizability theory, three observers were needed to obtain an acceptable reliability on normalcy of uroflow pattern if the patient underwent uroflowmetry tests twice with one observation. The intra-individual and intra-observer reliability on uroflowmetry tests were good while the inter-observer reliability was relatively lower. To improve inter-observer reliability, the definition of uroflowmetry should be clarified by the International Continence Society. © 2013 Wiley Publishing Asia Pty Ltd.

  6. Validity and interobserver agreement of lower extremity local tissue water measurements in healthy women using tissue dielectric constant.

    PubMed

    Jensen, Mads R; Birkballe, Susanne; Nørregaard, Susan; Karlsmark, Tonny

    2012-07-01

    Tissue dielectric constant (TDC) measurement may become an important tool in the clinical evaluation of chronic lower extremity swelling in women; however, several factors are known to influence TDC measurements, and comparative data on healthy lower extremities are few. Thirty-four healthy women volunteered. Age, BMI, moisturizer use and hair removal were registered. Three blinded investigators performed TDC measurements in a randomized sequence on clearly marked locations on the foot, the ankle and the lower leg. The effective measuring depth was 2.5 mm. The mean TDC was 37.8 ± 5.5 (mean ± SD) on the foot, 29.0 ± 3.1 on the ankle and 30.5 ± 3.9 on the lower leg. TDC was highly dependent on measuring site (P<0.001) but did not vary significantly between investigators (P=0.127). Neither age, BMI, hair removal nor moisturizer use had any significant effect on the lower leg TDC. Intraclass correlation coefficients were 0.77 for the foot, 0.94 for the ankle and 0.94 for the lower leg. The TDC on the foot was significantly higher compared with ankle and lower leg values. Foot measurements should be interpreted cautiously because of questionable interobserver agreement. The interobserver agreement was high on lower leg and ankle measurements. Neither age, BMI, hair removal nor moisturizer use had any significant effect on the lower leg TDC. TDC values of 35.2 for the ankle and 38.3 for the lower leg are suggested as upper normal reference limits in women. © 2012 The Authors Clinical Physiology and Functional Imaging © 2012 Scandinavian Society of Clinical Physiology and Nuclear Medicine.

  7. Interobserver agreement of interim and end-of-treatment 18F-FDG PET/CT in diffuse large B-cell lymphoma (DLBCL): impact on clinical practice and trials.

    PubMed

    Burggraaff, Coreline N; Cornelisse, Alexander C; Hoekstra, Otto S; Lugtenburg, Pieternella J; de Keizer, Bart; Arens, Anne I J; Celik, Filiz; Huijbregts, Julia E; De Vet, Henrica C W; Zijlstra, Josee M

    2018-05-04

    We aimed to assess the interobserver agreement of Interim PET (I-PET) and End-of-Treatment PET (EoT-PET) using the Deauville 5-point scale (DS) in first-line DLBCL patients. Methods: I-PET and EoT-PET scans of DLBCL patients were performed in the HOVON84 study (2007-2012), an international multicenter randomized controlled trial. Patients received R-CHOP14 and were randomized to receive rituximab intensification in the first 4 cycles or not. I-PET was made after 4 cycles (for observational purposes), and EoT-PET scan after 6 or 8 cycles. Two independent central reviewers retrospectively scored all scans according to the DS-system, blinded to clinical outcomes. Results were dichotomised as 'negative' (DS: 1-3) or 'positive' (DS: 4-5). Besides percentage overall agreement we calculated agreement for positive and negative scores, expressed as positive agreement (PA) and negative agreement (NA), respectively. Results: 465 I-PET and 457 EoT-PET scans were centrally reviewed; baseline 18F-FDG PET(/CT) was available in 75-77%, and CT in the remaining cases. Percentage overall agreement for I-PET and EoT-PET were 87.7% and 91.7% (P = 0.049), with NA of 92.0% and 95.0% (P = 0.091), and PA of 73.7% and 76.3% (P = 0.656), respectively. Conclusion: Interobserver agreement using DS in DLBCL patients in I-PET and EoT-PET yields high overall and negative agreement. The lower positive agreement suggests that EoT-PET/CT treatment evaluation in daily practice and I-PET adapted trials may benefit from dual reads and central review, respectively. Copyright © 2018 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
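
    The abstract does not state how PA and NA were computed. A minimal Python sketch, assuming the commonly used "specific agreement" definitions for two readers (PA = 2a / (2a + b + c), NA = 2d / (2d + b + c), where a and d are the concordant positive and negative counts and b and c the discordant ones), is given below. The function name and the counts are hypothetical.

```python
def agreement_summary(both_pos, only_r1_pos, only_r2_pos, both_neg):
    """Overall, positive and negative agreement from a 2x2 two-reader table.

    Returns (overall, positive_agreement, negative_agreement); a specific
    agreement is None when its denominator is zero.
    """
    n = both_pos + only_r1_pos + only_r2_pos + both_neg
    if n == 0:
        raise ValueError("the table is empty")
    discordant = only_r1_pos + only_r2_pos
    overall = (both_pos + both_neg) / n

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    positive = ratio(2 * both_pos, 2 * both_pos + discordant)
    negative = ratio(2 * both_neg, 2 * both_neg + discordant)
    return overall, positive, negative


# Hypothetical counts (illustration only, not trial data).
overall, pa, na = agreement_summary(both_pos=40, only_r1_pos=6, only_r2_pos=4, both_neg=150)
print(f"overall = {overall:.3f}, PA = {pa:.3f}, NA = {na:.3f}")
```

    With few positive scans, the same number of discordant reads lowers PA far more than NA, which is the pattern the abstract reports.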

  8. Intra- and interobserver reliability estimates for identification and grading of upper respiratory tract abnormalities recorded in horses at rest and during overground endoscopy.

    PubMed

    McGivney, C L; Sweeney, J; David, F; O'Leary, J M; Hill, E W; Katz, L M

    2017-07-01

    Previous studies support good intra- and interobserver agreements for endoscopic evaluation of various upper respiratory tract (URT) diseases in horses. However, these studies mainly assessed resting endoscopic examination videos and/or focussed on a single URT abnormality. To estimate intra- and interobserver agreement for identification and grading of all URT abnormalities from resting and overground endoscopy (OGE) videos of Thoroughbreds. Blinded, fully crossed design. Resting and OGE URT videos for n = 43 Thoroughbreds were retrospectively chosen based on identification of common URT disorders. The videos were randomly evaluated in duplicate by 4 raters blinded to all information including prior URT disorder(s) diagnosis. Abnormalities were graded using well-described ordinal scales. Intra- and interobserver agreements were estimated using Cohen's weighted κ and Krippendorff's α, respectively. Intraobserver agreement was perfect/nearly perfect for arytenoid symmetry at exercise, epiglottic entrapment and epiglottic retroversion, substantial for arytenoid asymmetry at rest, palatal dysfunction (PD), medial deviation of the aryepiglottic folds (MDAF), pharyngeal mucus and epiglottic grade at exercise and moderate for vocal fold collapse (VFC), ventromedial luxation of the apex of the corniculate process of the arytenoid (VLAC), nasopharyngeal collapse (NPC) and epiglottic grade at rest. Interobserver agreement was substantial for arytenoid symmetry at exercise and PD and moderate for arytenoid asymmetry at rest, MDAF, VLAC and epiglottic entrapment. It was only fair for VFC, epiglottic grade at exercise, epiglottic retroversion, pharyngeal mucus and NPC and poor for epiglottic grade at rest. Sample size was insufficient to allow assessment of the effect of one abnormality on the grading of another abnormality. Observers were consistent in grading URT disorders. However, significant disparity in grading existed between observers for some conditions affecting

  9. Inter-observer agreement, diagnostic sensitivity and specificity of animal-based indicators of young lamb welfare.

    PubMed

    Phythian, C J; Toft, N; Cripps, P J; Michalopoulou, E; Winter, A C; Jones, P H; Grove-White, D; Duncan, J S

    2013-07-01

    A scientific literature review and consensus of expert opinion used the welfare definitions provided by the Farm Animal Welfare Council (FAWC) Five Freedoms as the framework for selecting a set of animal-based indicators that were sensitive to the current on-farm welfare issues of young lambs (aged ≤ 6 weeks). Ten animal-based indicators assessed by observation - demeanour, response to stimulation, shivering, standing ability, posture, abdominal fill, body condition, lameness, eye condition and salivation - were tested as part of the objective of developing valid, reliable and feasible animal-based measures of lamb welfare. The indicators were independently tested on 966 young lambs from 17 sheep flocks across Northwest England and Wales during December 2008 to April 2009 by four trained observers. Inter-observer reliability was assessed using Fleiss's kappa (κ), and the pair-wise agreement with an experienced observer designated as the 'test standard observer' (TSO) was examined using Cohen's κ. Latent class analysis (LCA) estimated the sensitivity (Se) and specificity (Sp) of each observer without assuming a gold standard and predicted the Se and Sp of randomly selected observers who may apply the indicators in the future. Overall, good levels of inter-observer reliability, and high levels of Sp were identified for demeanour (κ = 0.54, Se ≥ 0.70, Sp ≥ 0.98), stimulation (κ = 0.57, Se = 0.30 to 0.77, Sp ≥ 0.98), shivering (κ = 0.55, Se = 0.37 to 0.85, Sp ≥ 0.99), standing ability (κ = 0.54, Se ≥ 0.80, Sp ≥ 0.99), posture (κ = 0.45, Se ≥ 0.56, Sp = 0.99), abdominal fill (κ = 0.44, Se = 0.39 to 0.98, Sp = 0.99), body condition (κ = 0.72, Se ⩾ 0.38 to 0.90, Sp = 0.99), lameness (κ = 0.68, Se > 0.73, Sp = 1.00), and eye condition (κ = 0.72, Se ≥ 0.86, Sp = 0.99). LCA predicted that randomly selected observers had Se > 0.77 (acceptable), and Sp ≥ 0.98 (high) for assessments of demeanour, lameness, abdominal fill, posture, body condition and eye
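
    For readers unfamiliar with the multi-rater statistic named above, the following Python sketch computes Fleiss's kappa from a table of category counts per subject. It is a generic textbook formulation under the assumption that every subject is scored by the same number of raters; the example table is invented and is not the study's data.

```python
def fleiss_kappa(table):
    """Fleiss's kappa for a table of per-subject category counts.

    table[i][j] is the number of raters who assigned subject i to category j.
    Every row must sum to the same number of raters (at least 2).
    """
    if not table:
        raise ValueError("the table is empty")
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_items = len(table)
    n_categories = len(table[0])
    # Per-subject agreement: share of rater pairs that agree on that subject.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall share of ratings in each category.
    p_cat = [
        sum(row[j] for row in table) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_exp) / (1.0 - p_exp)


# Hypothetical: 6 lambs, 4 observers, columns = (indicator absent, indicator present).
example = [[4, 0], [4, 0], [0, 4], [0, 4], [2, 2], [3, 1]]
print(f"Fleiss kappa = {fleiss_kappa(example):.3f}")
```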

  10. Breast lesion shape and margin evaluation: BI-RADS based metrics understate radiologists' actual levels of agreement.

    PubMed

    Rawashdeh, Mohammad; Lewis, Sarah; Zaitoun, Maha; Brennan, Patrick

    2018-05-01

    While there is much literature describing the radiologic detection of breast cancer, there are limited data available on the agreement between experts when delineating and classifying breast lesions. The aim of this work is to measure the level of agreement between expert radiologists when delineating and classifying breast lesions as demonstrated through Breast Imaging Reporting and Data System (BI-RADS) and quantitative shape metrics. Forty mammographic images, each containing a single lesion, were presented to nine expert breast radiologists using a high specification interactive digital drawing tablet with stylus. Each reader was asked to manually delineate the breast masses using the tablet and stylus and then visually classify the lesion according to the American College of Radiology (ACR) BI-RADS lexicon. The delineated lesion compactness and elongation were computed using Matlab software. Intraclass Correlation Coefficient (ICC) and Cohen's kappa were used to assess inter-observer agreement for delineation and classification outcomes, respectively. Inter-observer agreement was fair for BI-RADS shape (kappa = 0.37) and moderate for margin (kappa = 0.58) assessments. Agreement for quantitative shape metrics was good for lesion elongation (ICC = 0.82) and excellent for compactness (ICC = 0.93). Fair to moderate levels of agreement were shown by radiologists for shape and margin classifications of cancers using the BI-RADS lexicon. When quantitative shape metrics were used to evaluate radiologists' delineation of lesions, good to excellent inter-observer agreement was found. The results suggest that qualitative descriptors such as BI-RADS lesion shape and margin understate the actual level of expert radiologist agreement. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. Gleason grade 4 prostate adenocarcinoma patterns: an interobserver agreement study among genitourinary pathologists.

    PubMed

    Kweldam, Charlotte F; Nieboer, Daan; Algaba, Ferran; Amin, Mahul B; Berney, Dan M; Billis, Athanase; Bostwick, David G; Bubendorf, Lukas; Cheng, Liang; Compérat, Eva; Delahunt, Brett; Egevad, Lars; Evans, Andrew J; Hansel, Donna E; Humphrey, Peter A; Kristiansen, Glen; van der Kwast, Theodorus H; Magi-Galluzzi, Cristina; Montironi, Rodolfo; Netto, George J; Samaratunga, Hemamali; Srigley, John R; Tan, Puay H; Varma, Murali; Zhou, Ming; van Leenders, Geert J L H

    2016-09-01

    To assess the interobserver reproducibility of individual Gleason grade 4 growth patterns. Twenty-three genitourinary pathologists participated in the evaluation of 60 selected high-magnification photographs. The selection included 10 cases of Gleason grade 3, 40 of Gleason grade 4 (10 per growth pattern), and 10 of Gleason grade 5. Participants were asked to select a single predominant Gleason grade per case (3, 4, or 5), and to indicate the predominant Gleason grade 4 growth pattern, if present. 'Consensus' was defined as at least 80% agreement, and 'favoured' as 60-80% agreement. Consensus on Gleason grading was reached in 47 of 60 (78%) cases, 35 of which were assigned to grade 4. In the 13 non-consensus cases, ill-formed (6/13, 46%) and fused (7/13, 54%) patterns were involved in the disagreement. Among the 20 cases where at least one pathologist assigned the ill-formed growth pattern, none (0%, 0/20) reached consensus. Consensus for fused, cribriform and glomeruloid glands was reached in 2%, 23% and 38% of cases, respectively. In nine of 35 (26%) consensus Gleason grade 4 cases, participants disagreed on the growth pattern. Six of these were characterized by large epithelial proliferations with delicate intervening fibrovascular cores, which were alternatively given the designation fused or cribriform growth pattern ('complex fused'). Consensus on Gleason grade 4 growth pattern was predominantly reached on cribriform and glomeruloid patterns, but rarely on ill-formed and fused glands. The complex fused glands seem to constitute a borderline pattern of unknown prognostic significance on which a consensus could not be reached. © 2016 John Wiley & Sons Ltd.

  12. Interobserver agreement for post mortem renal histopathology and diagnosis of acute tubular necrosis in critically ill patients.

    PubMed

    Glassford, Neil J; Skene, Alison; Guardiola, Maria B; Chan, Matthew J; Bagshaw, Sean M; Bellomo, Rinaldo; Solez, Kim

    2017-12-01

    The renal histopathology of critically ill patients dying with acute kidney injury (AKI) in intensive care units of high income countries remains uncertain. Retrospective observational assessment of interobserver agreement in the reporting of renal post mortem histopathology, and the ability of pathologists blinded to the clinical context to independently identify the presence of pre-mortem AKI from digital images of histological sections from 34 critically ill patients dying in teaching hospitals in Australia and Canada. We identified a heterogeneous cohort with a median age of 65 years (interquartile range [IQR], 56.5-77), APACHE II score of 27 (IQR, 19-33), and sepsis as the most common admission diagnosis (12/34; 35%). The most common proximate causes of death were cardiovascular (19/34; 56%) and respiratory (7/34; 21%) failure. AKI was common, with 23 patients (68%) developing RIFLE-F AKI, and 21 patients (62%) receiving renal replacement therapy. Structured reporting for tubular inflammation showed excellent agreement (kappa = 1), but no other subdomain demonstrated better than moderate agreement (kappa < 0.6). Only fair agreement (55.9% of cases; kappa = 0.23) was demonstrated on the diagnosis of moderate to severe acute tubular necrosis (ATN). Pathologist A predicted RIFLE-I or worse AKI with the diagnosis of ATN, with an overall accuracy of 61.8%; pathologist B predicted AKI with an accuracy of 35.3%. Post mortem assessment of the renal histopathology in critically ill patients is neither robust nor reproducible; independent pathologists agree poorly on the diagnosis of ATN, and their structural assessment appears dissociated from ante-mortem renal function.

  13. Utility of Interobserver Agreement Statistics in Establishing Radiology Resident Learning Curves During Self-directed Radiologic Anatomy Training.

    PubMed

    Tureli, Derya; Altas, Hilal; Cengic, Ismet; Ekinci, Gazanfer; Baltacioglu, Feyyaz

    2015-10-01

    The aim of the study was to ascertain the learning curves for the radiology residents when first introduced to an anatomic structure in magnetic resonance images (MRI) to which they have not been previously exposed. The iliolumbar ligament is a good marker for testing learning curves of radiology residents because the ligament is not part of a routine lumbar MRI reporting and has high variability in detection. Four radiologists, three residents without previous training and one mentor, studied standard axial T1- and T2-weighted images of routine lumbar MRI examinations. Radiologists had to define iliolumbar ligament while blinded to each other's findings. Interobserver agreement analyses, namely Cohen and Fleiss κ statistics, were performed for groups of 20 cases to evaluate the self-learning curve of radiology residents. Mean κ values of resident-mentor pairs were 0.431, 0.608, 0.604, 0.826, and 0.963 in the analysis of successive groups (P < .001). The results indicate that the concordance between the experienced and inexperienced radiologists started as weak (κ <0.5) and gradually became very acceptable (κ >0.8). Therefore, a junior radiology resident can obtain enough experience in identifying a rather ambiguous anatomic structure in routine MRI after a brief instruction of a few minutes by a mentor and studying approximately 80 cases by oneself. Implementing this methodology will help radiology educators obtain more concrete ideas on the optimal time and effort required for supported self-directed visual learning processes in resident education. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.

  14. Interobserver Variation in Response Evaluation Criteria in Solid Tumors 1.1.

    PubMed

    Karmakar, Arunabha; Kumtakar, Apeksha; Sehgal, Himanshu; Kumar, Savith; Kalyanpur, Arjun

    2018-06-19

    Response Evaluation Criteria in Solid Tumors (RECIST 1.1) is the gold standard for imaging response evaluation in cancer trials. We sought to evaluate consistency of applying RECIST 1.1 between 2 conventionally trained radiologists, designated as A and B; identify reasons for variation; and reconcile these differences for future studies. The study was approved as an institutional quality check exercise. Since no identifiable patient data was collected or used, a waiver of informed consent was granted. Imaging case report forms of a concluded multicentric breast cancer trial were retrospectively reviewed. Cohen's kappa was used to rate interobserver agreement in Response Evaluation Data (target response, nontarget response, new lesions, overall response). Significant variations were reassessed by a senior radiologist to extrapolate reasons for disagreement. Methods to improve agreement were similarly ascertained. Sixty-one cases with a total of 82 data-pairs were evaluated (35 data-pairs in visit 5, 47 in visit 9). Both radiologists showed moderate agreement in target response (n = 82; κ = 0.477; 95% confidence interval [CI]: 0.314-0.640), nontarget response (n = 82; κ = 0.578; 95% CI: 0.213-0.944) and overall response evaluation in both visits (n = 82; κ = 0.510; 95% CI: 0.344-0.676). Further assessment demonstrated "Prevalence effect" of Kappa in some cases which led to underestimation of agreement. Percent agreement of overall response was 74.39% while percent variation was 25.6%. Differences in interpreting RECIST 1.1 and in radiological image interpretation were the primary sources of variation. The commonest overall response was "Partial Response" (Rad A:45/82; Rad B:63/82). In spite of moderate interobserver agreement, qualitative interpretation differences in some cases increased interobserver variability. Protocols such as Adjudication, to reduce easily avoidable inconsistencies are or should be a part of the Standard Operating
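
    The "prevalence effect" mentioned above can be illustrated with a short Python sketch: two hypothetical 2x2 tables with the same percent agreement give very different κ values once one category dominates. All counts are invented. The final lines only check the arithmetic that 74.39% of 82 data-pairs corresponds to 61 agreeing pairs, a count inferred from the percentages rather than stated in the abstract.

```python
def kappa_2x2(both_pos, only_a_pos, only_b_pos, both_neg):
    """Percent agreement and Cohen's kappa from a two-rater 2x2 table."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    if n == 0:
        raise ValueError("the table is empty")
    p_obs = (both_pos + both_neg) / n
    # Marginal totals for raters A and B.
    a_pos, a_neg = both_pos + only_a_pos, only_b_pos + both_neg
    b_pos, b_neg = both_pos + only_b_pos, only_a_pos + both_neg
    p_exp = (a_pos * b_pos + a_neg * b_neg) / (n * n)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return p_obs, (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical tables, both with 80% raw agreement.
balanced = kappa_2x2(40, 10, 10, 40)  # categories used about equally
skewed = kappa_2x2(75, 10, 10, 5)     # one category dominates
print(f"balanced: agreement = {balanced[0]:.2f}, kappa = {balanced[1]:.2f}")
print(f"skewed:   agreement = {skewed[0]:.2f}, kappa = {skewed[1]:.2f}")

# Arithmetic check of the reported percentages (61 is an inferred count).
print(f"61/82 = {61 / 82:.2%}, 21/82 = {21 / 82:.2%}")
```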

  15. Dermoscopic patterns of Melanoma Metastases: inter-observer consistency and accuracy for metastases recognition

    PubMed Central

    Costa, J.; Ortiz-Ibañez, K.; Salerni, G.; Borges, V.; Carrera, C.; Puig, S.; Malvehy, J.

    2013-01-01

    Background: Cutaneous metastases of malignant melanoma (CMMM) can be confused with other skin lesions. Dermoscopy could be helpful in the differential diagnosis. Objective: To describe distinctive dermoscopic patterns that are reproducible and accurate in the identification of CMMM. Methods: A retrospective study of 146 dermoscopic images of CMMM from 42 patients attending a Melanoma Unit between 2002 and 2009 was performed. Firstly, two investigators established six dermoscopic patterns for CMMM. The correlation of 73 dermoscopic images with their distinctive patterns was assessed by four independent dermatologists to evaluate the reproducibility in the identification of the patterns. Finally, 163 dermoscopic images, including CMMM and non-metastatic lesions, were evaluated by the same four dermatologists to calculate the accuracy of the patterns in the recognition of CMMM. Results: Five CMMM dermoscopic patterns had a good inter-observer agreement (blue nevus-like, nevus-like, angioma-like, vascular and unspecific). When CMMM were classified according to these patterns, correlation between the investigators and the four dermatologists ranged from κ = 0.56 to 0.7. 71 CMMM, 16 angiomas, 22 blue nevus, 15 malignant melanoma, 11 seborrheic keratosis, 15 melanocytic nevus with globular pattern and 13 pink lesions with vascular pattern were evaluated according to the previously described CMMM dermoscopy patterns, showing an overall sensitivity of 68% (between 54.9–76%) and a specificity of 81% (between 68.6–93.5%) for the diagnosis of CMMM. Conclusion: Five dermoscopic patterns of CMMM with good inter-observer agreement obtained a high sensitivity and specificity in the diagnosis of metastasis, the accuracy varying according to the experience of the observer. PMID:23495915

  16. Interobserver variability and feasibility of polymerase chain reaction-based assay in distinguishing ischemic colitis from Clostridium difficile colitis in endoscopic mucosal biopsies.

    PubMed

    Wiland, Homer O; Procop, Gary W; Goldblum, John R; Tuohy, Marion; Rybicki, Lisa; Patil, Deepa T

    2013-06-01

    Polymerase chain reaction (PCR)-based assays using stool samples are currently the most effective method of detecting Clostridium difficile. This study examines the feasibility of this assay using mucosal biopsy samples and evaluates the interobserver reproducibility in diagnosing and distinguishing ischemic colitis from C difficile colitis. Thirty-eight biopsy specimens were reviewed and classified by 3 observers into C difficile and ischemic colitis. The findings were correlated with clinical data. PCR was performed on 34 cases using BD GeneOhm C difficile assay. The histologic interobserver agreement was excellent (κ = 0.86) and the agreement between histologic and clinical diagnosis was good (κ = 0.84). All 19 ischemic colitis cases tested negative (100% specificity) and 3 of 15 cases of C difficile colitis tested positive (20% sensitivity). C difficile colitis can be reliably distinguished from ischemic colitis using histologic criteria. The C difficile PCR test on endoscopic biopsy specimens has excellent specificity but limited sensitivity.

  17. Interobserver agreement on the echocardiographic parameters that estimate right ventricular systolic function in the early postoperative period of cardiac surgery.

    PubMed

    Olmos-Temois, S G; Santos-Martínez, L E; Álvarez-Álvarez, R; Gutiérrez-Delgado, L G; Baranda-Tovar, F M

    2016-11-01

    To determine the variability of transthoracic echocardiographic parameters that assess right ventricular systolic function by analyzing interobserver agreement in the early postoperative period of cardiovascular surgery. To assess the feasibility of these echocardiographic measurements. A cross-sectional, double-blind pilot study was carried out from May 2011 to February 2013. Cardiovascular postoperative critical care at the National Institute of Cardiology "Ignacio Chávez", Mexico City, Mexico. Consecutive, non-probabilistic sampling. Fifty-six patients were studied in the postoperative period of cardiac surgery. The first echocardiographic parameters were obtained between 6-8 hours after cardiac surgery, followed by blinded second measurements. Tricuspid annular plane systolic excursion (TAPSE), tricuspid annular peak systolic velocity on tissue Doppler imaging (VSPAT), diameters and right ventricular outflow area, tract fractional shortening. The agreement was analyzed by the Bland-Altman method, and its magnitude was assessed by the intraclass correlation coefficient (95% confidence interval). Both observers evaluated TAPSE and VSPAT in 48 patients (92%). The average TAPSE was 11.68 ± 4.53 mm (range 4-27 mm). Right ventricular systolic dysfunction was observed in 41 cases (85%) and normal TAPSE in 7 patients (15%). The average difference and its limits according to TAPSE were -0.917 ± 2.95 (-6.821, 4.988), with a magnitude of 0.725 (0.552, 0.837); the tricuspid annular peak systolic velocity on tissue Doppler imaging was -0.001 ± 0.015 (-0.031, 0.030), and its magnitude 0.825 (0.708, 0.898), respectively. VSPAT and TAPSE were estimated by both observers in 92% of the patients, these parameters exhibiting the lowest interobserver variability. Copyright © 2016 Elsevier España, S.L.U. y SEMICYUC. All rights reserved.
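
    The Bland-Altman figures above (mean difference with lower and upper limits) can be sketched in Python as follows. The paired measurements are invented; the multiplier k is left as a parameter because the abstract does not state it. The reported TAPSE limits (-6.821, 4.988) appear consistent with bias ± 2 SD, but that reading is an inference from the rounded values.

```python
from statistics import mean, stdev


def limits_from_summary(bias, sd, k=1.96):
    """Limits of agreement from a mean difference and its SD."""
    return bias - k * sd, bias + k * sd


def bland_altman(first, second, k=1.96):
    """Return (bias, sd, lower, upper) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower, upper = limits_from_summary(bias, sd, k)
    return bias, sd, lower, upper


# Hypothetical TAPSE readings in mm from two observers (illustration only).
observer_1 = [10.0, 12.0, 14.0, 16.0]
observer_2 = [11.0, 12.0, 15.0, 18.0]
bias, sd, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias = {bias:.2f}, SD = {sd:.2f}, limits = ({lower:.2f}, {upper:.2f})")

# Reproducing limits from the abstract's rounded summary, assuming k = 2.
print(limits_from_summary(-0.917, 2.95, k=2))
```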

  18. Challenges in Pathologic Staging of Renal Cell Carcinoma: A Study of Interobserver Variability Among Urologic Pathologists.

    PubMed

    Williamson, Sean R; Rao, Priya; Hes, Ondrej; Epstein, Jonathan I; Smith, Steven C; Picken, Maria M; Zhou, Ming; Tretiakova, Maria S; Tickoo, Satish K; Chen, Ying-Bei; Reuter, Victor E; Fleming, Stewart; Maclean, Fiona M; Gupta, Nilesh S; Kuroda, Naoto; Delahunt, Brett; Mehra, Rohit; Przybycin, Christopher G; Cheng, Liang; Eble, John N; Grignon, David J; Moch, Holger; Lopez, Jose I; Kunju, Lakshmi P; Tamboli, Pheroze; Srigley, John R; Amin, Mahul B; Martignoni, Guido; Hirsch, Michelle S; Bonsib, Stephen M; Trpkov, Kiril

    2018-06-06

    Staging criteria for renal cell carcinoma differ from many other cancers, in that renal tumors are often spherical with subtle, finger-like extensions into veins, renal sinus, or perinephric tissue. We sought to study interobserver agreement in pathologic stage categories for challenging cases. An online survey was circulated to urologic pathologists interested in kidney tumors, yielding 89% response (31/35). Most questions included 1 to 4 images, focusing on: vascular and renal sinus invasion (n=24), perinephric invasion (n=9), and gross pathology/specimen handling (n=17). Responses were collapsed for analysis into positive and negative/equivocal for upstaging. Consensus was regarded as an agreement of 67% (2/3) of participants, which was reached in 20/33 (61%) evaluable scenarios regarding renal sinus, perinephric, or vein invasion, of which 13/33 (39%) had ≥80% consensus. Lack of agreement was especially encountered regarding small tumor protrusions into a possible vascular lumen, close to the tumor leading edge. For gross photographs, most were interpreted as suspicious but requiring histologic confirmation. Most participants (61%) rarely used special stains to evaluate vascular invasion, usually endothelial markers (81%). Most agreed that a spherical mass bulging well beyond the kidney parenchyma into the renal sinus (71%) or perinephric fat (90%) did not necessarily indicate invasion. Interobserver agreement in pathologic staging of renal cancer is relatively good among urologic pathologists interested in kidney tumors, even when selecting cases that test the earliest and borderline thresholds for extrarenal extension. Disagreements remain, however, particularly for tumors with small, finger-like protrusions, closely juxtaposed to the main mass.

  19. Inter-observer variability in fetal biometric measurements.

    PubMed

    Kilani, Rami; Aleyadeh, Wesam; Atieleh, Luay Abu; Al Suleimat, Abdul Mane; Khadra, Maysa; Hawamdeh, Hassan M

    2018-02-01

    To evaluate inter-observer variability and reproducibility of ultrasound measurements for fetal biometric parameters. A prospective cohort study was implemented in two tertiary care hospitals in Amman, Jordan; Prince Hamza Hospital and Albashir Hospital. 192 women with a singleton pregnancy at a gestational age of 18-36 weeks were the participants in the study. Transabdominal scans for fetal biometric parameter measurement were performed on study participants from the period of November 2014 to March 2015. Women who agreed to participate in the study were administered two ultrasound scans for head circumference, abdominal circumference and femur length. The correlation coefficient was calculated. Bland-Altman plots were used to analyze the degree of measurement agreement between observers. Limits of agreement ± 2 SD for the differences in fetal biometry measurements in proportions of the mean of the measurements were derived. Main outcome measures examine the reproducibility of fetal biometric measurements by different observers. High inter-observer inter-class correlation coefficient (ICC) was found for femur length (0.990) and abdominal circumference (0.996) where Bland-Altman plots showed high degrees of agreement. The highest degrees of agreement were noted in the measurement of abdominal circumference followed by head circumference. The lowest degree of agreement was found for femur length measurement. We used a paired-sample t-test and found that the mean difference between duplicate measurements was not significant (P > 0.05). Biometric fetal parameter measurements may be reproducible by different operators in the clinical setting with similar results. Fetal head circumference, abdominal circumference and femur length were highly reproducible. Large organized studies are needed to ensure accurate fetal measurements due to the important clinical implications of inaccurate measurements. Copyright © 2018. Published by Elsevier B.V.

  20. Interobserver reliability of the 'Welfare Quality(®) Animal Welfare Assessment Protocol for Growing Pigs'.

    PubMed

    Czycholl, I; Kniese, C; Büttner, K; Beilage, E Grosse; Schrader, L; Krieter, J

    2016-01-01

    The present paper focuses on evaluating the interobserver reliability of the 'Welfare Quality(®) Animal Welfare Assessment Protocol for Growing Pigs'. The protocol for growing pigs mainly consists of a Qualitative Behaviour Assessment (QBA), direct behaviour observations (BO) carried out by instantaneous scan sampling and checks for different individual parameters (IP), e.g. presence of tail biting, wounds and bursitis. Three trained observers collected the data by performing 29 combined assessments, which were done at the same time and on the same animals; but they were carried out completely independently of each other. The findings were compared by the calculation of Spearman Rank Correlation Coefficients (RS), Intraclass Correlation Coefficients (ICC), Smallest Detectable Changes (SDC) and Limits of Agreements (LoA). There was no agreement found concerning the adjectives belonging to the QBA (e.g. active: RS: 0.50, ICC: 0.30, SDC: 0.38, LoA: -0.05 to 0.45; fearful: RS: 0.06, ICC: 0.0, SDC: 0.26, LoA: -0.20 to 0.30). In contrast, the BO showed good agreement (e.g. social behaviour: RS: 0.45, ICC: 0.50, SDC: 0.09, LoA: -0.09 to 0.03; use of enrichment material: RS: 0.75, ICC: 0.68, SDC: 0.06, LoA: -0.03 to 0.03). Overall, observers agreed well in the IP, e.g. tail biting (RS: 0.52, ICC: 0.88; SDC: 0.05, LoA: -0.01 to 0.02) and wounds (RS: 0.43, ICC: 0.59, SDC: 0.10, LoA: -0.09 to 0.10). The parameter bursitis showed great differences (RS: 0.10, ICC: 0.0, SDC: 0.35, LoA: -0.37 to 0.40), which can be explained by difficulties in the assessment when the animals moved around quickly or their legs were soiled. In conclusion, the interobserver reliability was good in the BO and most IP, but not for the parameter bursitis and the QBA.

  1. Power Doppler ultrasound of rheumatoid synovitis: quantification of vascular signal and analysis of interobserver variability.

    PubMed

    Kamishima, Tamotsu; Tanimura, Kazuhide; Henmi, Mihoko; Narita, Akihiro; Sakamoto, Fumihiko; Terae, Satoshi; Shirato, Hiroki

    2009-05-01

    The objective of this study was to assess interobserver uncertainties in power Doppler (PD) examination of the fingers of patients with rheumatoid arthritis (RA), by separating the source of the discrepancy into (1) acquisition of the images and (2) criteria for assessment of the images. Twenty patients who had been diagnosed with RA were enrolled in this study. Ultrasound examinations were performed by one inexperienced and two experienced sonographers. Interobserver variation was measured using a conventional semiquantitative image grading scale. Interobserver variation of the quantitative PD (QPD) index (the summation of the colored pixels in a region of interest) was also assessed. The agreement was higher between the two experienced sonographers (kappa value of 0.8) than between experienced and inexperienced sonographers (kappa value, 0.6-0.7) in the semiquantitative image grading scale. Results suggest that the difference in the assessment on the image grading scale was due more to the difference in the acquisition of the images than to variations in the grading criteria between sonographers. An excellent relationship was noted between the image grading scale and the QPD index for Doppler signal with a Spearman's coefficient of rank correlation of 0.83 (P < 0.0001). Interobserver discrepancies in the image grading and QPD index methods were due more to the difference in the acquisition of the image than to the grading criteria used. The QPD index seems to be as reliable as the image grading scale with reasonable interobserver agreement between experienced sonographers.

  2. Interobserver Variability in Histologic Evaluation of Liver Fibrosis Using Categorical and Quantitative Scores.

    PubMed

    Pavlides, Michael; Birks, Jacqueline; Fryer, Eve; Delaney, David; Sarania, Nikita; Banerjee, Rajarshi; Neubauer, Stefan; Barnes, Eleanor; Fleming, Kenneth A; Wang, Lai Mun

    2017-04-01

    The aim of the study was to investigate the interobserver agreement for categorical and quantitative scores of liver fibrosis. Sixty-five consecutive biopsy specimens from patients with mixed liver disease etiologies were assessed by three pathologists using the Ishak and nonalcoholic steatohepatitis Clinical Research Network (NASH CRN) scoring systems, and the fibrosis area (collagen proportionate area [CPA]) was estimated by visual inspection (visual-CPA). A subset of 20 biopsy specimens was analyzed using digital imaging analysis (DIA) for the measurement of CPA (DIA-CPA). The bivariate weighted κ between any two pathologists ranged from 0.57 to 0.67 for Ishak staging and from 0.47 to 0.57 for the NASH CRN staging. Bland-Altman analysis showed poor agreement between all possible pathologist pairings for visual-CPA but good agreement between all pathologist pairings for DIA-CPA. There was good agreement between the two pathologists who assessed biopsy specimens by visual-CPA and DIA-CPA. The intraclass correlation coefficient, which is equivalent to the κ statistic for continuous variables, was 0.78 for visual-CPA and 0.97 for DIA-CPA. These results suggest that DIA-CPA is the most robust method for assessing liver fibrosis followed by visual-CPA. Categorical scores perform less well than both the quantitative CPA scores assessed here. © American Society for Clinical Pathology, 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

  3. JOURNAL CLUB: Assessment of Interobserver Variability in the Peer Review Process: Should We Agree to Disagree?

    PubMed

    Verma, Nupur; Hippe, Daniel S; Robinson, Jeffrey D

    2016-12-01

    Peer review is an important and necessary part of radiology. There are several options to perform the peer review process. This study examines the reproducibility of peer review by comparing two scoring systems. American Board of Radiology-certified radiologists from various practice environments and subspecialties were recruited to score deidentified examinations on a web-based PACS with two scoring systems, RADPEER and Cleareview. Quantitative analysis of the scores was performed for interrater agreement. Interobserver variability was high for both the RADPEER and Cleareview scoring systems. The interobserver correlations (kappa values) were 0.17-0.23 for RADPEER and 0.10-0.16 for Cleareview. Interrater correlation was not statistically significantly different when comparing the RADPEER and Cleareview systems (p = 0.07-0.27). The kappa values were low for the Cleareview subscores when we evaluated for missed findings (0.26), satisfaction of search (0.17), and inadequate interpretation of findings (0.12). Our study confirms the previous report of low interobserver correlation when using the peer review process. There was low interobserver agreement seen when using both the RADPEER and the Cleareview scoring systems.

  4. Observer agreement for detection of cardiac arrhythmias on telemetric ECG recordings obtained at rest, during and after exercise in 10 Warmblood horses.

    PubMed

    Trachsel, D S; Bitschnau, C; Waldern, N; Weishaupt, M A; Schwarzwald, C C

    2010-11-01

    Frequent supraventricular or ventricular arrhythmias during and after exercise are considered pathological in horses. Prevalence of arrhythmias seen in apparently healthy horses is still a matter of debate and may depend on breed, athletic condition and exercise intensity. To determine intra- and interobserver agreement for detection of arrhythmias at rest, during and after exercise using a telemetric electrocardiography device. The electrocardiogram (ECG) recordings of 10 healthy Warmblood horses (5 of which had an intracardiac catheter in place) undergoing a standardised treadmill exercise test were analysed at rest (R), during warm-up (W), during exercise (E), as well as during 0-5 min (PE(0-5)) and 6-45 min (PE(6-45)) recovery after exercise. The number and time of occurrence of physiological and pathological 'rhythm events' were recorded. Events were classified according to origin and mode of conduction. The agreement of 3 independent, blinded observers with different experience in ECG reading was estimated considering time of occurrence and classification of events. For correct timing and classification, intraobserver agreement for observer 1 was 97% (R), 100% (W), 20% (E), 82% (PE(0-5)) and 100% (PE(6-45)). Interobserver agreement between observer 1 vs. observer 2 and between observer 1 vs. 3, respectively, was 96 and 92.6% (R), 83 and 31% (W), 0 and 13% (E), 23 and 18% (PE(0-5)), and 67 and 55% (PE(6-45)). When including the events with correct timing but disagreement for classification, the intraobserver agreement increased to 94% during PE(0-5) and the interobserver agreement reached 83 and 50% (W), 20 and 50% (E), 41 and 47% (PE(0-5)), and 83.5 and 65% (PE(6-45)). The interobserver agreement increased with observer experience. Intra- and interobserver agreement for recognition and classification of events was good at R, but poor during E and poor-moderate during recovery periods. These results highlight the limitations of stress ECG in horses and the

  5. Interobserver concordance of assessments of dysplasia and blast counts for the diagnosis of patients with cytopenia: From the Japanese central review study.

    PubMed

    Matsuda, Akira; Kawabata, Hiroshi; Tohyama, Kaoru; Maeda, Tomoya; Araseki, Kayano; Hata, Tomoko; Suzuki, Takahiro; Kayano, Hidekazu; Shimbo, Kei; Usuki, Kensuke; Chiba, Shigeru; Ishikawa, Takayuki; Arima, Nobuyoshi; Nohgawa, Masaharu; Ohta, Akiko; Miyazaki, Yasushi; Nakao, Sinnji; Ozawa, Keiya; Arai, Shunya; Kurokawa, Mineo; Mitani, Kinuko; Takaori-Kondo, Akifumi

    2018-06-07

    The diagnosis of myelodysplastic syndromes (MDS) is based on morphology and cytogenetics. However, limited information is currently available on the interobserver concordance of the assessment of dysplastic lineages (<10% or ≥10% in bone marrow (BM)). The revised International Prognostic Scoring System (IPSS-R) described a new threshold (2%) for BM blasts. However, the interobserver concordance of the categories (0-≤2% and >2-<5%) has limited data. The purpose of the present study was to investigate the assessment of dysplastic lineages and IPSS-R reproducibility. Our study was divided into two Steps. In each Step, the microscopic examinations were performed separately by two morphologists. Regarding the category of BM blasts ≤2% and >2-<5%, interobserver agreement was more than 'moderate' in all pairs (kappa test: 0.43-0.90). Regarding dysgranulopoiesis (dysG) and dyserythropoiesis (dysE) in BM, interobserver agreement was more than 'moderate' in all pairs (kappa test, dysG: 0.45-0.96, dysE: 0.45-0.81). Regarding the category of dysmegakaryopoiesis (dysMgk) in BM, interobserver agreement was more than moderate in 4 out of 5 pairs (kappa test: 0.58-1.00), and was fair for one pair (kappa test: 0.37). We consider that high interobserver concordance may be possible for the BM blast cell count (≤2% or >2-<5%) and dysplasia (<10% or ≥10%) of each lineage. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Training improves interobserver reliability for the diagnosis of scaphoid fracture displacement.

    PubMed

    Buijze, Geert A; Guitton, Thierry G; van Dijk, C Niek; Ring, David

    2012-07-01

    The diagnosis of displacement in scaphoid fractures is notorious for poor interobserver reliability. We tested whether training can improve interobserver reliability and sensitivity, specificity, and accuracy for the diagnosis of scaphoid fracture displacement on radiographs and CT scans. Sixty-four orthopaedic surgeons rated a set of radiographs and CT scans of 10 displaced and 10 nondisplaced scaphoid fractures for the presence of displacement, using a web-based rating application. Before rating, observers were randomized to a training group (34 observers) and a nontraining group (30 observers). The training group received an online training module before the rating session, and the nontraining group did not. Interobserver reliability for training and nontraining was assessed by Siegel's multirater kappa and the Z-test was used to test for significance. There was a small, but significant difference in the interobserver reliability for displacement ratings in favor of the training group compared with the nontraining group. Ratings of radiographs and CT scans combined resulted in moderate agreement for both groups. The average sensitivity, specificity, and accuracy of diagnosing displacement of scaphoid fractures were, respectively, 83%, 85%, and 84% for the nontraining group and 87%, 86%, and 87% for the training group. Assuming a 5% prevalence of fracture displacement, the positive predictive value was 0.23 in the nontraining group and 0.25 in the training group. The negative predictive value was 0.99 in both groups. Our results suggest training can improve interobserver reliability and sensitivity, specificity and accuracy for the diagnosis of scaphoid fracture displacement, but the improvements are slight. These findings are encouraging for future research regarding interobserver variation and how to reduce it further.
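
    The predictive values quoted in this abstract follow from Bayes' rule applied to the reported sensitivity, specificity, and the assumed 5% prevalence. A minimal sketch (the function name is hypothetical; the inputs are the figures from the abstract):

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported, prevalence assumed to be 5%.
nontraining = predictive_values(0.83, 0.85, 0.05)
training = predictive_values(0.87, 0.86, 0.05)
print([round(v, 2) for v in nontraining])  # [0.23, 0.99]
print([round(v, 2) for v in training])     # [0.25, 0.99]
```

    At low prevalence the false positives outnumber the true positives, which is why the positive predictive value stays near 0.25 even though sensitivity and specificity are both above 80%.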

  7. Cardiac valve calcifications on low-dose unenhanced ungated chest computed tomography: inter-observer and inter-examination reliability, agreement and variability.

    PubMed

    van Hamersvelt, Robbert W; Willemink, Martin J; Takx, Richard A P; Eikendal, Anouk L M; Budde, Ricardo P J; Leiner, Tim; Mol, Christian P; Isgum, Ivana; de Jong, Pim A

    2014-07-01

    To determine inter-observer and inter-examination variability for aortic valve calcification (AVC) and mitral valve and annulus calcification (MC) in low-dose unenhanced ungated lung cancer screening chest computed tomography (CT). We included 578 lung cancer screening trial participants who were examined by CT twice within 3 months to follow indeterminate pulmonary nodules. On these CTs, AVC and MC were measured in cubic millimetres. One hundred CTs were examined by five observers to determine the inter-observer variability. Reliability was assessed by kappa statistics (κ) and intra-class correlation coefficients (ICCs). Variability was expressed as the mean difference ± standard deviation (SD). Inter-examination reliability was excellent for AVC (κ = 0.94, ICC = 0.96) and MC (κ = 0.95, ICC = 0.90). Inter-examination variability was 12.7 ± 118.2 mm(3) for AVC and 31.5 ± 219.2 mm(3) for MC. Inter-observer reliability ranged from κ = 0.68 to κ = 0.92 for AVC and from κ = 0.20 to κ = 0.66 for MC. Inter-observer ICC was 0.94 for AVC and ranged from 0.56 to 0.97 for MC. Inter-observer variability ranged from -30.5 ± 252.0 mm(3) to 84.0 ± 240.5 mm(3) for AVC and from -95.2 ± 210.0 mm(3) to 303.7 ± 501.6 mm(3) for MC. AVC can be quantified with excellent reliability on ungated unenhanced low-dose chest CT, but manual detection of MC can be subject to substantial inter-observer variability. Lung cancer screening CT may be used for detection and quantification of cardiac valve calcifications. • Low-dose unenhanced ungated chest computed tomography can detect cardiac valve calcifications. • However, calcified cardiac valves are not reported by most radiologists. • Inter-observer and inter-examination variability of aortic valve calcifications is sufficient for longitudinal studies. • Volumetric measurement variability of mitral valve and annulus calcifications is substantial.

  8. Intraobserver and interobserver variability of the bone marrow burden (BMB) score for the assessment of disease severity in Gaucher disease. Possible impact of reporting experience.

    PubMed

    Lai, Jeffrey K C; Robertson, Patricia L; Goh, Christine; Szer, Jeff

    2018-02-01

    To evaluate the intraobserver and interobserver agreement for bone marrow burden (BMB) scores for individual examinations and for the change in BMB score over time in the same patient. A total of 119 sets of MR images of the lumbar spine and femora from 60 patients with Gaucher disease were included. Each set of MR images was scored using the BMB score independently by two experienced MSK radiologists. One radiologist performed a second read four weeks later. Intraobserver and interobserver agreement was assessed using Bland-Altman analysis and weighted kappa scores. BMB scores (n=119) demonstrated fair intraobserver agreement (weighted kappa=0.53) with a mean difference of -0.20 and 95% limits of agreement (LOA) of (-3.41, 3.01). Interobserver agreement was poor, with a weighted kappa of 0.28, a mean difference of -0.16 and 95% LOA of (-4.45, 4.11). Change in BMB scores over time (n=59) demonstrated poor/fair intraobserver agreement (weighted kappa 0.41, mean difference -0.20 and 95% LOA (-4.35, 3.94)). Interobserver agreement was poor (weighted kappa 0.25, mean difference -0.12 with wide 95% LOA (-6.23, 5.99)). Significant interobserver, and to a lesser extent intraobserver, variation occurs with blinded BMB scoring of Gaucher disease. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Inter-observer variability within BI-RADS and RANZCR mammographic density assessment schemes

    NASA Astrophysics Data System (ADS)

    Damases, Christine N.; Mello-Thoms, Claudia; McEntee, Mark F.

    2016-03-01

    This study compares variability associated with two visual mammographic density (MD) assessment methods using two separate samples of radiologists. The image test-set comprised images obtained from 20 women (age 42-89 years). The images were assessed for their MD by twenty American Board of Radiology (ABR) examiners and twenty-six radiologists registered with the Royal Australian and New Zealand College of Radiologists (RANZCR). Images were assessed using the same technology and conditions; however, the ABR radiologists used the BI-RADS and the RANZCR radiologists used the RANZCR breast density synoptic. Both scales use a 4-point assessment. The images were then grouped as low- and high-density; low including BI-RADS 1 and 2 or RANZCR 1 and 2 and high including BI-RADS 3 and 4 or RANZCR 3 and 4. Four-point BI-RADS and RANZCR showed no or negligible correlation (ρ=-0.029, p<0.859). The average inter-observer agreement on the BI-RADS scale had a Kappa of 0.565 [95% CI = 0.519 - 0.610] and ranged between 0.328-0.669, while the inter-observer agreement using the RANZCR scale had a Kappa of 0.360 [95% CI = 0.308 - 0.412] and a range of 0.078-0.499. Our findings show a wider range of inter-observer variability among RANZCR registered radiologists than the ABR examiners.

  10. A pilot study of clinical agreement in cardiovascular preparticipation examinations: how good is the standard of care?

    PubMed

    O'Connor, Francis G; Johnson, Jeremy D; Chapin, Mark; Oriscello, Ralph G; Taylor, Dean C

    2005-05-01

    To evaluate the interobserver agreement between physicians regarding an abnormal cardiovascular assessment on athletic preparticipation examinations. Cross-sectional clinical survey. Outpatient Clinic, United States Military Academy, West Point, NY. We randomly selected 101 out of 539 cadet-athletes presenting for a preparticipation examination. Two primary care sports medicine fellows and a cardiologist examined the cadets. After obtaining informed consent from all participants, all 3 physicians separately evaluated all 101 cadets. The physicians recorded their clinical findings and whether they thought further cardiovascular evaluation (echocardiography) was indicated. Rate of referral for further cardiovascular evaluation, clinical agreement between sports medicine fellows, and clinical agreement between sports medicine fellows and the cardiologist. Each fellow referred 6 of the 101 evaluated cadets (5.9%). The cardiologist referred none. Although each fellow referred 6 cadets, only 1 cadet was referred by both. The kappa statistic for clinical agreement between fellows is 0.114 (95% CI, -0.182 to 0.411). There was no clinical agreement between the fellows and the cardiologist. This pilot study reveals a low level of agreement between physicians regarding which athletes with an abnormal examination deserved further testing. It challenges the standard of care and questions whether there is a need for improved technologies or improved training in cardiovascular clinical assessment.
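
    The kappa of 0.114 between the two fellows can be reproduced from the counts given in this abstract: 101 cadets, 6 referrals per fellow, and 1 cadet referred by both, which implies 5 referred only by the first fellow, 5 only by the second, and 90 by neither. A minimal sketch of Cohen's kappa for such a 2x2 table (the function name is hypothetical, and the confidence interval is not reproduced here):

```python
def cohen_kappa_2x2(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two raters making a yes/no decision on the same cases."""
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n            # raw proportion of agreement
    a_yes = (both_yes + only_a) / n                # rater A's "yes" rate
    b_yes = (both_yes + only_b) / n                # rater B's "yes" rate
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # agreement by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Counts derived from the abstract: 1 joint referral, 5 + 5 discordant, 90 joint non-referrals.
kappa = cohen_kappa_2x2(both_yes=1, only_a=5, only_b=5, both_no=90)
print(round(kappa, 3))  # 0.114
```

    The fellows agreed on 91 of 101 cadets (about 90%), yet kappa is low because nearly all of that agreement is expected by chance when referrals are rare.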

  11. Interobserver variability in the assessment of aneurysm occlusion with the WEB aneurysm embolization system.

    PubMed

    Fiorella, David; Arthur, Adam; Byrne, James; Pierot, Laurent; Molyneux, Andy; Duckwiler, Gary; McCarthy, Thomas; Strother, Charles

    2015-08-01

    The WEB (WEB aneurysm embolization system, Sequent Medical, Aliso Viejo, California, USA) is a self-expanding, nitinol, mesh device designed to achieve aneurysm occlusion after endosaccular deployment. The WEB Occlusion Scale (WOS) is a standardized angiographic assessment scale for reporting aneurysm occlusion achieved with intrasaccular mesh implants. This study was performed to assess the interobserver variability of the WOS. Seven experienced neurovascular specialists were trained to apply the WOS. These physicians independently reviewed angiographic image sets from 30 patients treated with the WEB under blinded conditions. No additional clinical information was provided. Raters graded each image according to the WOS (complete occlusion, residual neck or residual aneurysm). Final statistics were calculated using the dichotomous outcomes of complete occlusion or incomplete occlusion. The interobserver agreement was measured by the generalized κ statistic. In this series of 30 test case aneurysms, observers rated 12-17 as completely occluded, 3-9 as nearly completely occluded, and 9-11 as demonstrating residual aneurysm filling. Agreement was perfect across all seven observers for the presence or absence of complete occlusion in 22 of 30 cases. Overall, interobserver agreement was substantial (κ statistic 0.779 with a 95% CI of 0.700 to 0.857). The WOS allows a consistent means of reporting angiographic occlusion for aneurysms treated with the WEB device. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  12. Improved inter-observer agreement of an expert review panel in an oncology treatment trial--Insights from a structured interventional process.

    PubMed

    Nestle, Ursula; Rischke, Hans Christian; Eschmann, Susanne Martina; Holl, Gabriele; Tosch, Marco; Miederer, Matthias; Plotkin, Michail; Essler, Markus; Puskas, Cornelia; Schimek-Jasch, Tanja; Duncker-Rohr, Viola; Rühl, Friederike; Leifert, Anja; Mix, Michael; Grosu, Anca-Ligia; König, Jochem; Vach, Werner

    2015-11-01

    Oncologic imaging is key to successful cancer treatment. While the quality assurance (QA) of image acquisition protocols has already received attention, QA of reading and reporting still offers room for improvement. The latter was addressed in the context of a prospective multicentre trial on fluoro-deoxyglucose (FDG)-positron-emission tomography (PET)/CT-based chemoradiotherapy for locally advanced non-small cell lung cancer (NSCLC). An expert panel was prospectively installed performing blinded reviews of mediastinal NSCLC involvement in FDG-PET/CT. Due to a high initial reporting inter-observer disagreement, the independent data monitoring committee (IDMC) triggered an interventional harmonisation process, which overall involved 11 experts making 6855 blinded diagnostic statements. After assessing the baseline inter-observer agreement (IOA) of a blinded re-review (phase 1), a discussion process led to improved reading criteria (phase 2). Those underwent a validation study (phase 3) and were then implemented into the study routine. After 2 months (phase 4) and 1 year (phase 5), the IOA was reassessed. The initial overall IOA was moderate (kappa 0.52 CT; 0.53 PET). After improvement of reading criteria, the kappa values improved substantially (kappa 0.61 CT; 0.66 PET), which was retained until the late reassessment (kappa 0.71 CT; 0.67 PET). Subjective uncertainty was highly predictive for low IOA. The IOA of an expert panel was significantly improved by a structured interventional harmonisation process which could be a model for future clinical trials. Furthermore, the low IOA in reporting nodal involvement in NSCLC may bear consequences for individual patient care. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Capillary refill time: a study of interobserver reliability among nurses and nurse assistants.

    PubMed

    Brabrand, Mikkel; Hosbond, Susanne; Folkestad, Lars

    2011-02-01

    The interobserver variability of capillary refill time (CRT) has been questioned. Earlier studies of interobserver variability of CRT have been on a large number of patients but with few observers. The objective of our study was to investigate how a large group of nurses and nurse assistants would grade CRT. We recorded a video of the index finger of six medical patients and these were shown to nurses and nurse assistants. They were asked to record the CRT and whether they found this value to be normal. The data were analyzed using the Fleiss Kappa Coefficient Analysis and graded according to the Landis and Koch scale. Correlation between the exact numbers was evaluated using interclass correlation. Nine nurse assistants and 37 nurses participated. The patients were aged between 44 and 87 years. All but one patient had a systolic blood pressure reading above 130 mmHg. All had arterial blood oxygen saturation above 92% and all but one had normal body temperature. The κ value for normality was 0.56. The interclass correlation of measurement of CRT was 0.62. This is the largest interobserver study of CRT when looking at the number of observers. We found only moderate agreement for the exact value of CRT and moderate agreement for normality. We believe that CRT should be used with caution in clinical practice.
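
    The Landis and Koch grading used in this abstract maps a kappa value to a verbal label. A minimal sketch, assuming the commonly cited bands (below 0 poor, 0.00-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect); the function name is hypothetical, and other records in this listing use different labels for the same ranges:

```python
def landis_koch_label(kappa):
    """Map a kappa value to the verbal label of the Landis and Koch bands."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # unreachable given the range check; kept for safety


print(landis_koch_label(0.56))  # moderate
```

    With κ = 0.56 the function returns "moderate", which matches the grading the authors report for normality of CRT.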

  14. Assessment of Intraobserver and Interobserver Agreement of a New Classification System for Retrograde Periimplantitis.

    PubMed

    Shah, Rucha; Thomas, Raison; Kumar, Tarun; Mehta, Dhoom Singh

    2016-12-01

    Retrograde periimplantitis (RPI) is the inflammatory disease that affects the apical part of an osseointegrated implant while the coronal portion of the implant sustains a normal bone-to-implant interface. The aim of the current study was to assess the intraexaminer and interexaminer reliability of a proposed new classification system for RPI. After a thorough electronic literature search, 56 intraoral periapical radiographs (IOPA) of implants with RPI were collected and were classified by 2 independent reviewers as per the new classification system into one of 3 classes (mild, moderate, and advanced) based on the amount of bone loss from the apex of the implant to the most coronal part as a percentage of the total implant length. The IOPAs were assessed twice by the same examiners and both were blinded to each other's observations. The intraobserver agreement ranged from 0.85 to 0.91, which falls under the category of almost perfect agreement. The interexaminer agreement was found to be 0.83, also considered as almost perfect agreement. The proposed classification shows good intraexaminer and interexaminer reliability and can be used for treatment planning and prognosis in cases of RPI.

  15. All over the map: An interobserver agreement study of tumor location based on the PI-RADSv2 sector map.

    PubMed

    Greer, Matthew D; Shih, Joanna H; Barrett, Tristan; Bednarova, Sandra; Kabakus, Ismail; Law, Yan Mee; Shebel, Haytham; Merino, Maria J; Wood, Bradford J; Pinto, Peter A; Choyke, Peter L; Turkbey, Baris

    2018-01-17

    Prostate imaging reporting and data system version 2 (PI-RADSv2) recommends a sector map for reporting findings of prostate cancer multiparametric MRI (mpMRI). Anecdotally, radiologists may demonstrate inconsistent reproducibility with this map. To evaluate interobserver agreement in defining prostate tumor location on mpMRI using the PI-RADSv2 sector map. Retrospective. Thirty consecutive patients who underwent mpMRI between October, 2013 and March, 2015 and who subsequently underwent prostatectomy with whole-mount processing. 3T mpMRI with T2W, diffusion-weighted imaging (DWI) (apparent diffusion coefficient [ADC] and b-2000), dynamic contrast-enhanced (DCE). Six radiologists (two high, two intermediate, and two low experience) from six institutions participated. Readers were blinded to lesion location and detected up to four lesions as per PI-RADSv2 guidelines. Readers marked the long-axis of lesions, saved screen-shots of each lesion, and then marked the lesion location on the PI-RADSv2 sector map. Whole-mount prostatectomy specimens registered to the MRI served as ground truth. Index lesions were defined as the highest grade lesion or largest lesion if grades were equivalent. Agreement was calculated for the exact, overlap, and proportion of agreement. Readers detected an average of 1.9 lesions per patient (range 1.6-2.3). 96.3% (335/348) of all lesions for all readers were scored PI-RADS ≥3. Readers defined a median of 2 (range 1-18) sectors per lesion. Agreement for detecting index lesions by screen shots was 83.7% (76.1%-89.9%) vs. 71.0% (63.1-78.3%) overlap agreement on the PI-RADS sector map (P < 0.001). Exact agreement for defining sectors of detected index lesions was only 21.2% (95% confidence interval [CI]: 14.4-27.7%) and rose to 49.0% (42.4-55.3%) when overlap was considered. Agreement on defining the same level of disease (ie, apex, mid, base) was 61.4% (95% CI 50.2-71.8%).
Readers are highly likely to detect the same index lesion on mpMRI, but

  16. Ultrasonographic assessment of tendon thickness, Doppler activity and bony spurs of the elbow in patients with lateral epicondylitis and healthy subjects: a reliability and agreement study.

    PubMed

    Krogh, T P; Fredberg, U; Christensen, R; Stengaard-Pedersen, K; Ellingsen, T

    2013-10-01

    Tennis elbow, also known as lateral epicondylitis (LE), is a common disorder often assessed by ultrasound. The aim of this study was to evaluate the ultrasonographic outcomes and methods used in LE research and clinical practice. This study was designed as an intra- and interobserver reliability and agreement study. Ultrasonographic examination of the common extensor tendon of the elbow was performed. The intraobserver study examined tendon thickness twice in 20 right elbows from 20 healthy individuals at an interval of 7 to 12 days. The interobserver study examined tendon thickness, color Doppler activity, and bony spurs in 18 right elbows in 9 healthy individuals and 9 patients with LE. Two trained rheumatologists performed the interobserver examinations with the same scanner on the same day. The main outcomes were intra- and interclass correlation (ICC) and agreement. In the intraobserver study, the ICC with regard to tendon thickness ranged from 0.76 to 0.81, depending on the measurement techniques used. The agreement ranged from 0.06 to 0.13 mm. In the interobserver study, the tendon thickness ICC ranged from 0.45 to 0.65 and the agreement ranged from -0.17 to 0.13 mm. The ICC for color Doppler activity was 0.93, with agreement in 14/18 (78 %) of the cases. A perfect reliability was demonstrated for bony spurs, with an ICC of 1 and exact agreement in 18/18 (100 %) of the cases. Good to excellent reliability was obtained for all measurements. The ultrasonographic techniques evaluated in this trial can be recommended for use in both research and clinical practice. © Georg Thieme Verlag KG Stuttgart · New York.

  17. Reliability of laser Doppler flowmetry curve reading for measurement of toe and ankle pressures: intra- and inter-observer variation.

    PubMed

    Høyer, C; Paludan, J P D; Pavar, S; Biurrun Manresa, J A; Petersen, L J

    2014-03-01

    To assess the intra- and inter-observer variation in laser Doppler flowmetry curve reading for measurement of toe and ankle pressures. A prospective single blinded diagnostic accuracy study was conducted on 200 patients with known or suspected peripheral arterial disease (PAD), with a total of 760 curve sets produced. The first curve reading for this study was performed by laboratory technologists blinded to clinical clues and previous readings at least 3 months after the primary data sampling. The pressure curves were later reassessed following another period of at least 3 months. Observer agreement in diagnostic classification according to TASC-II criteria was quantified using Cohen's kappa. Reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. The overall agreement in diagnostic classification (PAD/not PAD) was 173/200 (87%) for intra-observer (κ = .858) and 175/200 (88%) for inter-observer data (κ = .787). Reliability analysis confirmed excellent correlation for both intra- and inter-observer data (ICC all ≥.931). The coefficients of variance ranged from 2.27% to 6.44% for intra-observer and 2.39% to 8.42% for inter-observer data. Subgroup analysis showed lower observer-variation for reading of toe pressures in patients with diabetes and/or chronic kidney disease than patients not diagnosed with these conditions. Bland-Altman plots showed higher variation in toe pressure readings than ankle pressure readings. This study shows substantial intra- and inter-observer agreement in diagnostic classification and reading of absolute pressures when using laboratory technologists as observers. The study emphasises that observer variation for curve reading is an important factor concerning the overall reproducibility of the method. Our data suggest diabetes and chronic kidney disease have an influence on toe pressure reproducibility. Copyright © 2013 European Society for Vascular Surgery. Published

  18. Spine Instability Neoplastic Score: agreement across different medical and surgical specialties.

    PubMed

    Arana, Estanislao; Kovacs, Francisco M; Royuela, Ana; Asenjo, Beatriz; Pérez-Ramírez, Úrsula; Zamora, Javier

    2016-05-01

    Spinal instability is an acknowledged complication of spinal metastases; in spite of recent suggested criteria, it is not clearly defined in the literature. This study aimed to assess intra and interobserver agreement when using the Spine Instability Neoplastic Score (SINS) by all physicians involved in its management. Independent multicenter reliability study for the recently created SINS, undertaken with a panel of medical oncologists, neurosurgeons, radiologists, orthopedic surgeons, and radiation oncologists, was carried out. Ninety patients with biopsy-proven spinal metastases and magnetic resonance imaging, reviewed at the multidisciplinary tumor board of our institution, were included. Intraclass correlation coefficient (ICC) was used for SINS score agreement. Fleiss kappa statistic was used to assess agreement on the location of the most affected vertebral level; agreement on the SINS category ("stable," "potentially stable," or "unstable"); and overall agreement with the classification established by tumor board. Clinical data and imaging were provided to 83 specialists in 44 hospitals across 14 Spanish regions. No assessment criteria were pre-established. Each clinician assessed the SINS score twice, with a minimum 6-week interval. Clinicians were blinded to assessments made by other specialists and to their own previous assessment. Subgroup analyses were performed by clinicians' specialty, experience (≤7, 8-13, ≥14 years), and hospital category (four levels according to size and complexity). This study was supported by Kovacs Foundation. Intra and interobserver agreement on the location of the most affected levels was "almost perfect" (κ>0.94). Intra-observer agreement on the SINS score was "excellent" (ICC=0.77), whereas interobserver agreement was "moderate" (ICC=0.55). Intra-observer agreement in SINS category was "substantial" (k=0.61), whereas interobserver agreement was "moderate" (k=0.42). Overall agreement with the tumor board classification

  19. Inter-observer variability between general pathologists and a specialist in breast pathology in the diagnosis of lobular neoplasia, columnar cell lesions, atypical ductal hyperplasia and ductal carcinoma in situ of the breast

    PubMed Central

    2014-01-01

    Background This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. Methods A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Results Weak correlations were observed for the diagnoses of columnar cell change (CCC; Kappa = 0.38), columnar cell hyperplasia (CCH; Kappa = 0.32), while a moderate agreement (Kappa = 0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa = 0.62) and lobular carcinoma in situ (LCIS; Kappa = 0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa = 0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa = 0.44), low-grade DCIS (Kappa = 0.47), intermediate-grade DCIS (Kappa = 0.45), and DCIS with microinvasion (Kappa = 0.56). Good agreement was observed between the diagnoses of high-grade DCIS (Kappa = 0.68). Conclusions According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. Virtual Slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725. PMID:24948027

  20. Inter-observer variability between general pathologists and a specialist in breast pathology in the diagnosis of lobular neoplasia, columnar cell lesions, atypical ductal hyperplasia and ductal carcinoma in situ of the breast.

    PubMed

    Gomes, Douglas S; Porto, Simone S; Balabram, Débora; Gobbi, Helenice

    2014-06-19

    This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Weak correlations were observed for the diagnoses of columnar cell change (CCC; Kappa=0.38), columnar cell hyperplasia (CCH; Kappa=0.32), while a moderate agreement (Kappa=0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa=0.62) and lobular carcinoma in situ (LCIS; Kappa=0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa=0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa=0.44), low-grade DCIS (Kappa=0.47), intermediate-grade DCIS (Kappa=0.45), and DCIS with microinvasion (Kappa=0.56). Good agreement was observed between the diagnoses of high-grade DCIS (Kappa=0.68). According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725.

  1. Interobserver variability in identification of breast tumors in MRI and its implications for prognostic biomarkers and radiogenomics

    SciTech Connect

    Saha, Ashirbani, E-mail: as698@duke.edu; Grimm, La

    Purpose: To assess the interobserver variability of readers when outlining breast tumors in MRI, study the reasons behind the variability, and quantify the effect of the variability on algorithmic imaging features extracted from breast MRI. Methods: Four readers annotated breast tumors from the MRI examinations of 50 patients from one institution using a bounding box to indicate a tumor. All of the annotated tumors were biopsy proven cancers. The similarity of bounding boxes was analyzed using Dice coefficients. An automatic tumor segmentation algorithm was used to segment tumors from the readers’ annotations. The segmented tumors were then compared between readers using Dice coefficients as the similarity metric. Cases showing high interobserver variability (average Dice coefficient <0.8) after segmentation were analyzed by a panel of radiologists to identify the reasons causing the low level of agreement. Furthermore, an imaging feature, quantifying tumor and breast tissue enhancement dynamics, was extracted from each segmented tumor for a patient. Pearson’s correlation coefficients were computed between the features for each pair of readers to assess the effect of the annotation on the feature values. Finally, the authors quantified the extent of variation in feature values caused by each of the individual reasons for low agreement. Results: The average agreement between readers in terms of the overlap (Dice coefficient) of the bounding box was 0.60. Automatic segmentation of tumor improved the average Dice coefficient for 92% of the cases to the average value of 0.77. The mean agreement between readers expressed by the correlation coefficient for the imaging feature was 0.96. Conclusions: There is a moderate variability between readers when identifying the rectangular outline of breast tumors on MRI. This variability is alleviated by the automatic segmentation of the tumors. Furthermore, the moderate interobserver variability in terms of the
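    As an illustration of the overlap metric named in this record, the following is a minimal sketch of a Dice coefficient for two axis-aligned bounding boxes, plus the mean over all reader pairs. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the sample coordinates are assumptions made for this example; they are not taken from the study.

```python
from itertools import combinations


def box_area(box):
    """Area of an axis-aligned box (x_min, y_min, x_max, y_max); 0 if empty."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def dice_boxes(a, b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two bounding boxes."""
    # The intersection of two axis-aligned boxes is itself a box (possibly empty).
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    total = box_area(a) + box_area(b)
    if total == 0:
        return 0.0  # both boxes are empty; treated as no agreement here
    return 2 * box_area(overlap) / total


def mean_pairwise_dice(boxes):
    """Average Dice coefficient over every pair of readers' boxes."""
    pairs = list(combinations(boxes, 2))
    return sum(dice_boxes(a, b) for a, b in pairs) / len(pairs)


# Hypothetical annotations of one tumor by three readers.
reader_boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (0, 0, 10, 10)]
print(mean_pairwise_dice(reader_boxes))
```

    The same formula applies to segmented masks if areas are replaced by pixel counts.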

  2. Echocardiographic agreement in the diagnostic evaluation for infective endocarditis.

    PubMed

    Lauridsen, Trine Kiilerich; Selton-Suty, Christine; Tong, Steven; Afonso, Luis; Cecchi, Enrico; Park, Lawrence; Yow, Eric; Barnhart, Huiman X; Paré, Carlos; Samad, Zainab; Levine, Donald; Peterson, Gail; Stancoven, Amy Butler; Johansson, Magnus Carl; Dickerman, Stuart; Tamin, Syahidah; Habib, Gilbert; Douglas, Pamela S; Bruun, Niels Eske; Crowley, Anna Lisa

    2016-07-01

    Echocardiography is essential for the diagnosis and management of infective endocarditis (IE). However, the reproducibility for the echocardiographic assessment of variables relevant to IE is unknown. Objectives of this study were: (1) To define the reproducibility for IE echocardiographic variables and (2) to describe a methodology for assessing quality in an observational cohort containing site-interpreted data. IE reproducibility was assessed on a subset of echocardiograms from subjects enrolled in the International Collaboration on Endocarditis registry. Specific echocardiographic case report forms were used. Intra-observer agreement was assessed from six site readers on ten randomly selected echocardiograms. Inter-observer agreement between sites and an echocardiography core laboratory was assessed on a separate random sample of 110 echocardiograms. Agreement was determined using intraclass correlation (ICC), coverage probability (CP), and limits of agreement for continuous variables and kappa statistics (κweighted) and CP for categorical variables. Intra-observer agreement for LVEF was excellent [ICC = 0.93 ± 0.1 and all pairwise differences for LVEF (CP) were within 10 %]. For IE categorical echocardiographic variables, intra-observer agreement was best for aortic abscess (κweighted = 1.0, CP = 1.0 for all readers). Highest inter-observer agreement for IE categorical echocardiographic variables was obtained for vegetation location (κweighted = 0.95; 95 % CI 0.92-0.99) and lowest agreement was found for vegetation mobility (κweighted = 0.69; 95 % CI 0.62-0.86). Moderate to excellent intra- and inter-observer agreement is observed for echocardiographic variables in the diagnostic assessment of IE. A pragmatic approach for determining echocardiographic data reproducibility in a large, multicentre, site interpreted observational cohort is feasible.
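    For readers unfamiliar with the continuous-variable measures listed in this record, here is a minimal sketch of Bland-Altman limits of agreement and an empirical coverage probability. It assumes that coverage probability is estimated as the share of paired differences whose absolute value falls within a chosen tolerance; the function names and the LVEF readings are hypothetical and are not data from the registry.

```python
from statistics import mean, stdev


def limits_of_agreement(x, y):
    """Bland-Altman 95% limits: mean difference +/- 1.96 * SD of differences."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias + spread


def coverage_probability(x, y, tolerance):
    """Share of paired readings that differ by no more than `tolerance`."""
    diffs = [a - b for a, b in zip(x, y)]
    return sum(abs(d) <= tolerance for d in diffs) / len(diffs)


# Hypothetical LVEF readings (%) from a site reader and a core laboratory.
site = [55, 60, 35, 45, 50]
core = [57, 58, 40, 45, 62]

low, high = limits_of_agreement(site, core)
print(f"limits of agreement: {low:.1f} to {high:.1f}")
print(f"coverage probability within 10 points: {coverage_probability(site, core, 10)}")
```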

  3. Validity and inter-observer reliability of subjective hand-arm vibration assessments.

    PubMed

    Coenen, Pieter; Formanoy, Margriet; Douwes, Marjolein; Bosch, Tim; de Kraker, Heleen

    2014-07-01

    Exposure to mechanical vibrations at work (e.g., due to handling powered tools) is a potential occupational risk as it may cause upper extremity complaints. However, reliable and valid assessment methods for vibration exposure at work are lacking. Measuring hand-arm vibration objectively is often difficult and expensive, while often used information provided by manufacturers lacks detail. Therefore, a subjective hand-arm vibration assessment method was tested on validity and inter-observer reliability. In an experimental protocol, sixteen tasks handling powered tools were executed by two workers. Hand-arm vibration was assessed subjectively by 16 observers according to the proposed subjective assessment method. As a gold standard reference, hand-arm vibration was measured objectively using a vibration measurement device. Weighted κ's were calculated to assess validity, intra-class-correlation coefficients (ICCs) were calculated to assess inter-observer reliability. Inter-observer reliability of the subjective assessments depicting the agreement among observers can be expressed by an ICC of 0.708 (0.511-0.873). The validity of the subjective assessments as compared to the gold-standard reference can be expressed by a weighted κ of 0.535 (0.285-0.785). Besides, the percentage of exact agreement of the subjective assessment compared to the objective measurement was relatively low (i.e., 52% of all tasks). This study shows that subjectively assessed hand-arm vibrations are fairly reliable among observers and moderately valid. This assessment method is a first attempt to use subjective risk assessments of hand-arm vibration. Although, this assessment method can benefit from some future improvement, it can be of use in future studies and in field-based ergonomic assessments. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  4. Agreement in the assessment of metastatic spine disease using scoring systems.

    PubMed

    Arana, Estanislao; Kovacs, Francisco M; Royuela, Ana; Asenjo, Beatriz; Pérez-Ramírez, Ursula; Zamora, Javier

    2015-04-01

    To assess variability in the use of Tomita and modified Bauer scores in spine metastases. Clinical data and imaging from 90 patients with biopsy-proven spinal metastases, were provided to 83 specialists from 44 hospitals. Spinal levels involved and the Tomita and modified Bauer scores for each case were determined twice by each clinician, with a minimum of 6-week interval. Clinicians were blinded to every evaluation. Kappa statistic was used to assess intra and inter-observer agreement. Subgroup analyses were performed according to clinicians' specialty (medical oncology, neurosurgery, radiology, orthopedic surgery and radiation oncology), years of experience (⩽7, 8-13, ⩾14), and type of hospital (four levels). For metastases identification, intra-observer agreement was "substantial" (0.600.80) at the other levels. Inter-observer agreement was "almost perfect" at lumbar spine, and "substantial" at the other levels. Intra-observer agreement for the Tomita and Bauer scores was almost perfect. Inter-observer agreement was almost perfect for the Tomita score and substantial for the Bauer one. Results were similar across specialties, years of experience and type of hospital. Agreement in the assessment of metastatic spine disease is high. These scoring systems can improve communication among clinicians involved in oncology care. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  5. Interobserver variability and accuracy of high-definition endoscopic diagnosis for gastric intestinal metaplasia among experienced and inexperienced endoscopists.

    PubMed

    Hyun, Yil Sik; Han, Dong Soo; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo

    2013-05-01

    Accurate diagnosis of gastric intestinal metaplasia is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing gastric intestinal metaplasia (IM). The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Selected 50 cases, taken with HD endoscopy, were sent for a diagnostic inquiry of gastric IM through visual inspection to five experienced and five inexperienced endoscopists. The interobserver agreement between endoscopists was evaluated to verify the diagnostic reliability of HD endoscopy in diagnosing IM, and the diagnostic accuracy, sensitivity, and specificity were evaluated for validity of HD endoscopy in diagnosing IM. Interobserver agreement among the experienced endoscopists was "poor" (κ = 0.38) and it was also "poor" (κ = 0.33) among the inexperienced endoscopists. The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Since diagnosis through visual inspection is unreliable in the diagnosis of IM, all suspicious areas for gastric IM should be considered to be biopsied. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy of gastric IM.

  6. Evaluation of interobserver variability of parenchymal phase of Tc-99m mercaptoacetyltriglycine and Tc-99m dimercaptosuccinic acid renal scintigraphy

    PubMed Central

    Erdoğan, Zeynep; Abdülrezzak, Ümmühan; Silov, Güler; Özdal, Ayşegül; Turhal, Özgül

    2014-01-01

    Objective: The aim of this study was to investigate the variability in the interpretation of parenchymal abnormalities and to assess the differences in interpretation of routine renal scintigraphic findings on posterior view of technetium-99m dimercaptosuccinic acid (pvDMSA) scans and parenchymal phase of technetium-99m mercaptoacetyltriglycine (ppMAG3) scans by using standard criterions to make standardization and semiquantitative evaluation and to have more accurately correlation. Materials and Methods: Two experienced nuclear medicine physicians independently interpreted pvDMSA scans of 204 and ppMAG3 scans of 102 pediatric patients, retrospectively. Comparisons were made by visual inspection of pvDMSA scans, and ppMAG3 scans by using a grading system modified from Itoh et al. According to this, anatomical damage of the renal parenchyma was classified into six types: Grade 0-V. In the calculation of the agreement rates, Kendall correlation (tau-b) analysis was used. Results: According to our findings, excellent agreement was found for DMSA grade readings (DMSA-GR) (tau-b = 0.827) and good agreement for MAG3 grade readings (MAG3-GR) (tau-b = 0.790) between two observers. Most of clear parenchymal lesions detected on pvDMSA scans and ppMAG3 scans identified by observers equally. Studies with negative or minimal lesions reduced correlation degrees for both DMSA-GR and MAG3-GR. Conclusion: Our grading system can be used for standardization of the reports. We conclude that standardization of criteria and terminology in the interpretations may result in higher interobserver consistency, also improve low interobserver reproducibility and objectivity of renal scintigraphy reports. PMID:24761059

  7. Scapula fractures: interobserver reliability of classification and treatment.

    PubMed

    Neuhaus, Valentin; Bot, Arjan G J; Guitton, Thierry G; Ring, David C; Abdel-Ghany, Mahmoud I; Abrams, Jeffrey; Abzug, Joshua M; Adolfsson, Lars E; Balfour, George W; Bamberger, H Brent; Barquet, Antonio; Baskies, Michael; Batson, W Arnold; Baxamusa, Taizoon; Bayne, Grant J; Begue, Thierry; Behrman, Michael; Beingessner, Daphne; Biert, Jan; Bishop, Julius; Alves, Mateus Borges Oliveira; Boyer, Martin; Brilej, Drago; Brink, Peter R G; Brunton, Lance M; Buckley, Richard; Cagnone, Juan Carlos; Calfee, Ryan P; Campinhos, Luiz Augusto B; Cassidy, Charles; Catalano, Louis; Chivers, Karel; Choudhari, Pradeep; Cimerman, Matej; Conflitti, Joseph M; Costanzo, Ralph M; Crist, Brett D; Cross, Brian J; Dantuluri, Phani; Darowish, Michael; de Bedout, Ramon; DeCoster, Thomas; Dennison, David G; DeNoble, Peter H; DeSilva, Gregory; Dienstknecht, Thomas; Duncan, Scott F; Duralde, Xavier A; Durchholz, Holger; Egol, Kenneth; Ekholm, Carl; Elias, Nelson; Erickson, John M; Esparza, J Daniel Espinosa; Fernandes, C H; Fischer, Thomas J; Fischmeister, Martin; Forigua Jaime, E; Getz, Charles L; Gilbert, Richard S; Giordano, Vincenzo; Glaser, David L; Gosens, Taco; Grafe, Michael W; Filho, Jose Eduardo Grandi Ribeiro; Gray, Robert R L; Gulotta, Lawrence V; Gummerson, Nigel William; Hammerberg, Eric Mark; Harvey, Edward; Haverlag, R; Henry, Patrick D G; Hobby, Jonathan L; Hofmeister, Eric P; Hughes, Thomas; Itamura, John; Jebson, Peter; Jenkinson, Richard; Jeray, Kyle; Jones, Christopher M; Jones, Jedediah; Jubel, Axel; Kaar, Scott G; Kabir, K; Kaplan, F Thomas D; Kennedy, Stephen A; Kessler, Michael W; Kimball, Hervey L; Kloen, Peter; Klostermann, Cyrus; Kohut, Georges; Kraan, G A; Kristan, Anze; Loebenberg, Mark I; Malone, Kevin J; Marsh, L; Martineau, Paul A; McAuliffe, John; McGraw, Iain; Mehta, Samir; Merchant, Milind; Metzger, Charles; Meylaerts, S A; Miller, Anna N; Wolf, Jennifer Moriatis; Murachovsky, Joel; Murthi, Anand; Nancollas, Michael; Nolan, Betsy M; Omara, Timothy; Omid, Reza; Ortiz, Jose A; Overbeck, Joachim P; Castillo, Alberto Pérez; Pesantez, Rodrigo; Polatsch, Daniel; Porcellini, G; Prayson, Michael; Quell, M; Ragsdell, Matthew M; Reid, James G; Reuver, J M; Richard, Marc J; Richardson, Martin; Rizzo, Marco; Rowinski, Sergio; Rubio, Jorge; Guerrero, Carlos G Sánchez; Satora, Wojciech; Schandelmaier, Peter; Scheer, Johan H; Schmidt, Andrew; Schubkegel, Todd A; Schulte, Leah M; Schumer, Evan D; Sears, Benjamin W; Shafritz, Adam B; Shortt, Nicholas L; Siff, Todd; Silva, Dario Mejia; Smith, Raymond Malcolm; Spruijt, Sander; Stein, Jason A; Pemovska, Emilija Stojkovska; Streubel, Philipp N; Swigart, Carrie; Swiontkowski, Marc; Thomas, George; Tolo, Eric T; Turina, Matthias; Tyllianakis, Minos; van den Bekerom, Michel P J; van der Heide, Huub; van de Sande, M A J; van Eerten, P V; Verbeek, Diederik O F; Hoffmann, David Victoria; Vochteloo, A J H; Wagenmakers, Robert; Wall, Christopher J; Wallensten, Richard; Wascher, Daniel C; Weiss, Lawrence; Wiater, J Michael; Wills, Brian P D; Wint, Jeffrey; Wright, Thomas; Young, Jason P; Zalavras, Charalampos; Zura, Robert D; Zyto, Karol

    2014-03-01

    There is substantial variation in the classification and management of scapula fractures. The first purpose of this study was to analyze the interobserver reliability of the OTA/AO classification and the New International Classification for Scapula Fractures. The second purpose was to assess the proportion of agreement among orthopaedic surgeons on operative or nonoperative treatment. Web-based reliability study. Independent orthopaedic surgeons from several countries were invited to classify scapular fractures in an online survey. One hundred three orthopaedic surgeons evaluated 35 movies of three-dimensional computerized tomography reconstruction of selected scapular fractures, representing a full spectrum of fracture patterns. Fleiss kappa (κ) was used to assess the reliability of agreement between the surgeons. The overall agreement on the OTA/AO classification was moderate for the types (A, B, and C, κ = 0.54) with a 71% proportion of rater agreement (PA) and for the 9 groups (A1 to C3, κ = 0.47) with a 57% PA. For the New International Classification, the agreement about the intraarticular extension of the fracture (Fossa (F), κ = 0.79) was substantial and the agreement about a fractured body (Body (B), κ = 0.57) or process was moderate (Process (P), κ = 0.53); however, PAs were more than 81%. The agreement on the treatment recommendation was moderate (κ = 0.57) with a 73% PA. The New International Classification was more reliable. Body and process fractures generated more disagreement than intraarticular fractures and need further clear definitions.
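    The Fleiss kappa used in this record generalizes chance-corrected agreement to more than two raters. The sketch below shows the standard calculation from a subjects-by-categories count table, assuming every subject is rated by the same number of raters. The function name and the counts (five fractures, four surgeons, three types) are invented for illustration and do not come from the study.

```python
def fleiss_kappa(counts):
    """Fleiss kappa; counts[i][j] = raters who put subject i in category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement: sum of squared overall category proportions.
    total = n_subjects * n_raters
    p_chance = sum((sum(col) / total) ** 2 for col in zip(*counts))

    if p_chance == 1.0:
        return 1.0  # every rating fell in one category
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical table: rows are fractures, columns are types A, B, C.
table = [
    [4, 0, 0],
    [0, 4, 0],
    [2, 2, 0],
    [0, 1, 3],
    [3, 1, 0],
]
print(round(fleiss_kappa(table), 3))
```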

  8. Inter-Observer Reliability of DSM-5 Substance Use Disorders*

    PubMed Central

    Denis, Cécile M.; Gelernter, Joel; Hart, Amy B.; Kranzler, Henry R.

    2015-01-01

    Aims Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence of the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Methods Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Results Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. Conclusions For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. PMID:26048641

  9. Inter-observer reliability of DSM-5 substance use disorders.

    PubMed

    Denis, Cécile M; Gelernter, Joel; Hart, Amy B; Kranzler, Henry R

    2015-08-01

    Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence concerning the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  10. Interobserver Variability and Accuracy of High-Definition Endoscopic Diagnosis for Gastric Intestinal Metaplasia among Experienced and Inexperienced Endoscopists

    PubMed Central

    Hyun, Yil Sik; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo

    2013-01-01

    Accurate diagnosis of gastric intestinal metaplasia is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing gastric intestinal metaplasia (IM). The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Selected 50 cases, taken with HD endoscopy, were sent for a diagnostic inquiry of gastric IM through visual inspection to five experienced and five inexperienced endoscopists. The interobserver agreement between endoscopists was evaluated to verify the diagnostic reliability of HD endoscopy in diagnosing IM, and the diagnostic accuracy, sensitivity, and specificity were evaluated for validity of HD endoscopy in diagnosing IM. Interobserver agreement among the experienced endoscopists was "poor" (κ = 0.38) and it was also "poor" (κ = 0.33) among the inexperienced endoscopists. The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Since diagnosis through visual inspection is unreliable in the diagnosis of IM, all suspicious areas for gastric IM should be considered to be biopsied. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy of gastric IM. PMID:23678267

  11. Variability of inter-observer agreement on feasibility of partial nephrectomy before and after neoadjuvant axitinib for locally advanced renal cell carcinoma (RCC): independent analysis from a phase II trial.

    PubMed

    Karam, Jose A; Devine, Catherine E; Fellman, Bryan M; Urbauer, Diana L; Abel, E Jason; Allaf, Mohamad E; Bex, Axel; Lane, Brian R; Thompson, R Houston; Wood, Christopher G

    2016-04-01

    To evaluate how many patients could have undergone partial nephrectomy (PN) rather than radical nephrectomy (RN) before and after neoadjuvant axitinib therapy, as assessed by five independent urological oncologists, and to study the variability of inter-observer agreement. Pre- and post-systemic treatment computed tomography scans from 22 patients with clear cell renal cell carcinoma in a phase II neoadjuvant axitinib trial were reviewed by five independent urological oncologists. R.E.N.A.L. nephrometry score and κ statistics were calculated. The median R.E.N.A.L. nephrometry score changed from 11 before treatment to 10 after treatment (P = 0.002). Five tumours with moderate complexity before axitinib treatment remained moderate complexity after treatment. Of 17 tumours with high complexity before axitinib treatment, three became moderate complexity after treatment. The overall κ statistic was 0.611. Moderate-complexity κ was 0.611 vs a high-complexity κ of 0.428. Before axitinib treatment the κ was 0.550 vs 0.609 after treatment. After treatment with axitinib, all five reviewers agreed that only five patients required RN (instead of eight before treatment) and that 10 patients could now undergo PN (instead of three before treatment). The odds of PN feasibility were 22.8-times higher after treatment with axitinib. There is considerable variability in inter-observer agreement on the feasibility of PN in patients treated with neoadjuvant targeted therapy. Although more patients were candidates for PN after neoadjuvant axitinib therapy, it remains difficult to identify these patients a priori. © 2015 The Authors BJU International © 2015 BJU International Published by John Wiley & Sons Ltd.

  12. Inter- and intraobserver agreement in 24-hour combined multiple intraluminal impedance and pH measurement in children.

    PubMed

    Pilic, Denisa; Höfs, Carolin; Weitmann, Sandra; Nöh, Frank; Fröhlich, Thorsten; Skopnik, Heino; Köhler, Henrik; Wenzl, Tobias G; Schmidt-Choudhury, Anjona

    2011-09-01

    Assessment of intra- and interobserver agreement in multiple intraluminal impedance (MII) measurement between investigators from different institutions. Twenty-four 18- to 24-hour MII tracings were randomly chosen from 4 different institutions (6 per center). Software-aided automatic analysis was performed. Each result was validated by 2 independent investigators from the 4 different centers (4 investigator combinations). For intraobserver agreement, 6 measurements were analyzed twice by the same investigator. Agreement between investigators was calculated using the Cohen kappa coefficient. Interobserver agreement: 13 measurements showed a perfect agreement (kappa > 0.8); 9 had a substantial (kappa 0.61-0.8), 1 a moderate (kappa coefficient 0.41 to 0.6), and 1 a fair agreement (kappa coefficient 0.11-0.4). Median kappa value was 0.83. Intraobserver agreement: 5 tracings showed perfect and 1 showed a substantial agreement. The median kappa value was 0.88. Most measurements showed substantial to perfect intra- and interobserver agreement. Still, we found a few outliers presumably caused by poorer signal quality in some tracings rather than being observer dependent. An improvement of analysis results may be achieved by using a standard analysis protocol, a standardized method for judging tracing quality, better training options for method users, and more interaction between investigators from different institutions.
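    The Cohen kappa coefficient used in this record corrects the raw agreement between two investigators for the agreement expected by chance. A minimal sketch follows; the function name and the ten binary ratings are hypothetical and only illustrate the arithmetic.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(ratings_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical event calls (1 = reflux episode, 0 = none) by two investigators.
investigator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
investigator_2 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]
print(round(cohen_kappa(investigator_1, investigator_2), 2))
```

    In this made-up example the raters agree on 8 of 10 items and chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.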

  13. Interobserver variability for the WHO classification of pulmonary carcinoids.

    PubMed

    Swarts, Dorian R A; van Suylen, Robert-Jan; den Bakker, Michael A; van Oosterhout, Matthijs F M; Thunnissen, Frederik B J M; Volante, Marco; Dingemans, Anne-Marie C; Scheltinga, Marc R M; Bootsma, Gerben P; Pouwels, Harry M M; van den Borne, Ben E E M; Ramaekers, Frans C S; Speel, Ernst-Jan M

    2014-10-01

    Pulmonary carcinoids are neuroendocrine tumors histopathologically subclassified into typical (TC; no necrosis, <2 mitoses per 2 mm²) and atypical (AC; necrosis or 2 to 10 mitoses per 2 mm²). The reproducibility of lung carcinoid classification, however, has not been extensively studied and may be hampered by the presence of pyknotic apoptosis mimicking mitotic figures. Furthermore, prediction of prognosis based on histopathology varies, especially for ACs. We examined the presence of interobserver variation between 5 experienced pulmonary pathologists who reviewed 123 originally diagnosed pulmonary carcinoid cases. The tumors were subsequently redistributed over 3 groups: unanimously classified cases, consensus cases (4/5 pathologists rendered identical diagnosis), and disagreement cases (divergent diagnosis by ≥2 assessors). κ-values were calculated, and results were correlated with clinical follow-up and molecular data. When focusing on the 114/123 cases unanimously classified as pulmonary carcinoids, the interobserver agreement was only fair (κ=0.32). Of these 114 cases, 55% were unanimously classified, 25% reached consensus classification, and for 19% there was no consensus. ACs were significantly more often in the latter category (P=0.00038). The designation of TCs and ACs by ≥3 assessors was not associated with prognosis (P=0.11). However, when disagreement cases were allocated on the basis of Ki-67 proliferative index (<5%; ≥5%) or nuclear orthopedia homeobox immunostaining (+; -), correlation with prognosis improved significantly (P=0.00040 and 0.0024, respectively). In conclusion, there is a considerable interobserver variation in the histopathologic classification of lung carcinoids, in particular concerning ACs. Additional immunomarkers such as Ki-67 or orthopedia homeobox may improve classification and prediction of prognosis.

  14. Inter-observer agreement improves with PERCIST 1.0 as opposed to qualitative evaluation in non-small cell lung cancer patients evaluated with F-18-FDG PET/CT early in the course of chemo-radiotherapy.

    PubMed

    Fledelius, Joan; Khalil, Azza; Hjorthaug, Karin; Frøkiær, Jørgen

    2016-12-01

    The purpose of this study is to determine whether a qualitative approach or a semi-quantitative approach provides the most robust method for early response evaluation with 2'-deoxy-2'-[(18)F]fluoro-D-glucose (F-18-FDG) positron emission tomography combined with whole body computed tomography (PET/CT) in non-small cell lung cancer (NSCLC). In this study eight Nuclear Medicine consultants analyzed F-18-FDG PET/CT scans from 35 patients with locally advanced NSCLC. Scans were performed at baseline and after 2 cycles of chemotherapy. Each observer used two different methods for evaluation: (1) PET response criteria in solid tumors (PERCIST) 1.0 and (2) a qualitative approach. Both methods allocate patients into one of four response categories (complete and partial metabolic response (CMR and PMR) and stable and progressive metabolic disease (SMD and PMD)). The inter-observer agreement was evaluated using Fleiss' kappa for multiple raters, Cohen's kappa for comparison of the two methods, and intraclass correlation coefficients (ICC) for comparison of lean body mass corrected standardized uptake value (SUL) peak measurements. The agreement between observers when determining the percentage change in SULpeak was "almost perfect", with ICC = 0.959. There was a strong agreement among observers allocating patients to the different response categories with a Fleiss kappa of 0.76 (0.71-0.81). In 22 of the 35 patients, complete agreement was observed with PERCIST 1.0. The agreement was lower when using the qualitative method, moderate, having a Fleiss kappa of 0.60 (0.55-0.64). Complete agreement was achieved in only 10 of the 35 patients. The difference between the two methods was statistically significant (p < 0.005) (chi-squared). Comparing the two methods for each individual observer showed Cohen's kappa values ranging from 0.64 to 0.79, translating into a strong agreement between the two methods. PERCIST 1.0 provides a higher overall agreement between observers

  15. Assessment of the diagnostic performance and interobserver variability of endocytoscopy in Barrett’s esophagus: A pilot ex-vivo study

    PubMed Central

    Tomizawa, Yutaka; Iyer, Prasad G; Wongkeesong, Louis M; Buttar, Navtej S; Lutzke, Lori S; Wu, Tsung-Teh; Wang, Kenneth K

    2013-01-01

    AIM: To investigate a classification of endocytoscopy (ECS) images in Barrett’s esophagus (BE) and evaluate its diagnostic performance and interobserver variability. METHODS: ECS was applied to surveillance endoscopic mucosal resection (EMR) specimens of BE ex-vivo. The mucosal surface of each specimen was stained with 1% methylene blue and surveyed with a catheter-type endocytoscope. We selected still images that were most representative of the endoscopically suspect lesion and matched them with the final histopathological diagnosis to ensure accurate correlation. The diagnostic performance and inter-observer variability of the new classification scheme were assessed in a blinded fashion by physicians with expertise in both BE and ECS and inexperienced physicians with no prior exposure to ECS. RESULTS: Three staff physicians and 22 gastroenterology fellows classified eight randomly assigned unknown still ECS pictures (two images per classification) into one of four histopathologic categories as follows: (1) BEC1-squamous epithelium; (2) BEC2-BE without dysplasia; (3) BEC3-BE with dysplasia; and (4) BEC4-esophageal adenocarcinoma (EAC) in BE. Diagnostic accuracy for staff physicians and clinical fellows was, respectively, 100% and 99.4% for BEC1, 95.8% and 83.0% for BEC2, 91.7% and 83.0% for BEC3, and 95.8% and 98.3% for BEC4. Interobserver agreement of the faculty physicians and fellows in classifying each category was 0.932 and 0.897, respectively. CONCLUSION: This is the first study to investigate a classification system of ECS in BE. This ex-vivo pilot study demonstrated acceptable diagnostic accuracy and excellent interobserver agreement. PMID:24379583

  16. Classification of intertrochanteric fractures with computed tomography: a study of intraobserver and interobserver variability and prognostic value.

    PubMed

    Chapman, Cary B; Herrera, Mauricio F; Binenbaum, Gil; Schweppe, Michael; Staron, Ronald B; Feldman, Frieda; Rosenwasser, Melvin P

    2003-09-01

    The purpose of this prospective study was to determine the level of interobserver and intraobserver agreement among orthopedic surgeons and radiologists when computed tomography (CT) scans are used with plain radiographs to evaluate intertrochanteric fractures. In addition, the prognostic value of current classification systems concerning quality of life was evaluated. Sixty-one patients who presented with intertrochanteric fractures received open reduction and internal fixation with a compression hip screw. Three orthopedic surgeons and 2 radiologists independently classified the fractures according to 2 systems: Evans-Jensen and AO (Arbeitsgemeinschaft für Osteosynthesefragen). Fractures were initially graded with plain radiographs and then again in conjunction with CT. Results were analyzed using the kappa coefficient. The 36-item Short-Form Health Survey was administered at baseline, 3 months, and 1 year, and results were correlated with fracture grade. Mean kappa coefficients when comparing radiography alone with radiography and CT scan were 0.63 for the AO system and 0.59 for the Evans-Jensen system. Both represent "fair" agreements. Mean overall interobserver kappa coefficients were 0.67 for radiologists and 0.57 for orthopedic surgeons. Radiologists also had higher intraobserver kappa coefficients. No significant relationships were found between follow-up Short Form Health Survey results and intraoperative grading of fractures. When these classification schemes are compared, interobserver agreement does not appear to change dramatically when information from CT scans is added. This may suggest that (1) more data have been provided by CT with greater possibilities for misinterpretation and (2) these classification schemes may not be comprehensive in describing fracture pattern and displacement. Finally, both systems failed to provide any prognostic value.

  17. Prediction of Tubal Ectopic Pregnancy Using Offline Analysis of 3-Dimensional Transvaginal Ultrasonographic Data Sets: An Interobserver and Diagnostic Accuracy Study.

    PubMed

    Infante, Fernando; Espada Vaquero, Mercedes; Bignardi, Tommaso; Lu, Chuan; Testa, Antonia C; Fauchon, David; Epstein, Elisabeth; Leone, Francesco P G; Van den Bosch, Thierry; Martins, Wellington P; Condous, George

    2018-06-01

    To assess interobserver reproducibility in detecting tubal ectopic pregnancies by reading data sets from 3-dimensional (3D) transvaginal ultrasonography (TVUS) and comparing it with real-time 2-dimensional (2D) TVUS. Images were initially classified as showing pregnancies of unknown location or tubal ectopic pregnancies on real time 2D TVUS by an experienced sonologist, who acquired 5 3D volumes. Data sets were analyzed offline by 5 observers who had to classify each case as ectopic pregnancy or pregnancy of unknown location. The interobserver reproducibility was evaluated by the Fleiss κ statistic. The performance of each observer in predicting ectopic pregnancies was compared to that of the experienced sonologist. Women were followed until they were reclassified as follows: (1) failed pregnancy of unknown location; (2) intrauterine pregnancy; (3) ectopic pregnancy; or (4) persistent pregnancy of unknown location. Sixty-one women were included. The agreement between reading offline 3D data sets and the first real-time 2D TVUS was very good (80%-82%; κ = 0.89). The overall interobserver agreement among observers reading offline 3D data sets was moderate (κ = 0.52). The diagnostic performance of experienced observers reading offline 3D data sets had accuracy of 78.3% to 85.0%, sensitivity of 66.7% to 81.3%, specificity of 79.5% to 88.4%, positive predictive value of 57.1% to 72.2%, and negative predictive value of 87.5% to 91.3%, compared to the experienced sonologist's real-time 2D TVUS: accuracy of 94.5%, sensitivity of 94.4%, specificity of 94.5%, positive predictive value of 85.0%, and negative predictive value of 98.1%. The diagnostic accuracy of 3D TVUS by reading offline data sets for predicting ectopic pregnancies is dependent on experience. Reading only static 3D data sets without clinical information does not match the diagnostic performance of real time 2D TVUS combined with clinical information obtained during the scan. © 2017 by the American

  18. Reduction of Conflicts in Mining Development Using "Good Neighbor Agreements"

    NASA Astrophysics Data System (ADS)

    Masaitis, A.

    2013-05-01

    New environmental and social challenges for the mining industry in both developed and developing countries show the obvious need to implement "responsible" mining practices that include improved community involvement. Good Neighbor Agreements (GNA's) are a relatively new mechanism for improving communication and trust between a mining company and the community. The focus of a GNA will be to provide a written and enforceable agreement, negotiated between the concerned public and the respective mining company to respond to concerns from the public, and also provide a mechanism for conflict resolution, when there is mutual benefit to maintain a working relationship. Development of GNA's, a recently evolving process that promotes environmentally sound relationships between mines and the surrounding communities. Modify and apply the resulting GNA formulas to the developing countries and countries with transitional economies. This is particularly important for countries that have poorly functioning regulatory systems that cannot guarantee a healthy and safe environment for the communities. The fundamental questions addressed by this research. 1. This is a three-year research project started in August 2012 at the University of Nevada, Reno (UNR) to develop Good Neighbor Agreement standards as well as to investigate the details of mine development. 2. Identify spheres of possible cooperation between mining companies, government organizations, and the Non-Governmental Organizations (NGO's). Use this cooperation to develop international standards for the GNA, to promote exchange of environmental information, and exchange of successful environmental, health, and safety practices between mining operations from different countries. Discussion: The Good Neighbor Agreement currently evolving will address the following: 1. Provide an economically viable mechanism for developing a partnership between mining operations and the local communities that will increase mining industry

  19. Intra- and interobserver reliability of quantitative ultrasound measurement of the plantar fascia.

    PubMed

    Rathleff, Michael Skovdal; Moelgaard, Carsten; Lykkegaard Olesen, Jens

    2011-01-01

    To determine intra- and interobserver reliability and measurement precision of sonographic assessment of plantar fascia thickness when using one, the mean of two, or the mean of three measurements. Two experienced observers scanned 20 healthy subjects twice with 60 minutes between test and retest. A GE LOGIQe ultrasound scanner was used in the study. The built-in software in the scanner was used to measure the thickness of the plantar fascia (PF). Reliability was calculated using intraclass correlation coefficient (ICC) and limits of agreement (LOA). Intraobserver reliability (ICC) using one measurement was 0.50 for one observer and 0.52 for the other, and using the mean of three measurements intraobserver reliability increased up to 0.77 and 0.67, respectively. Interobserver reliability (ICC) when using one measurement was 0.62 and increased to 0.82 when using the average of three measurements. LOA showed that when using the average of three measurements, LOA decreased to 0.6 mm, corresponding to 17.5% of the mean thickness of the PF. The results showed that reliability increases when using the mean of three measurements compared with one. Limits of agreement based on intratester reliability show that changes in thickness that are larger than 0.6 mm can be considered actual changes in thickness and not a result of measurement error. Copyright © 2011 Wiley Periodicals, Inc.
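
    The abstract does not state how its limits of agreement were computed. A minimal sketch, assuming the common Bland-Altman form (mean difference ± 1.96 standard deviations of the test-retest differences), is given below; the function name and the thickness values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Return (lower, bias, upper) 95% limits of agreement for paired measurements."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need at least two paired measurements")
    # Per-subject difference between the first and second session.
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    # Assumed Bland-Altman convention: 1.96 sample standard deviations.
    half_width = 1.96 * stdev(diffs)
    return bias - half_width, bias, bias + half_width


# Hypothetical plantar fascia thickness values in mm (not data from the study).
session_1 = [3.0, 3.2, 3.4, 3.6]
session_2 = [3.1, 3.1, 3.5, 3.5]
lower, bias, upper = limits_of_agreement(session_1, session_2)
print(f"bias={bias:.3f} mm, LOA=({lower:.3f}, {upper:.3f}) mm")
```

    Under this convention, a change between sessions that falls outside the limits is unlikely to be explained by measurement error alone, which matches how the abstract interprets its 0.6 mm figure.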

  20. The sizing of hamstring grafts for anterior cruciate reconstruction: intra- and inter-observer reliability.

    PubMed

    Dwyer, Tim; Whelan, Daniel B; Khoshbin, Amir; Wasserstein, David; Dold, Andrew; Chahal, Jaskarndip; Nauth, Aaron; Murnaghan, M Lucas; Ogilvie-Harris, Darrell J; Theodoropoulos, John S

    2015-04-01

    The objective of this study was to establish the intra- and inter-observer reliability of hamstring graft measurement using cylindrical sizing tubes. Hamstring tendons (gracilis and semitendinosus) were harvested from ten cadavers by a single surgeon and whip stitched together to create ten 4-strand hamstring grafts. Ten sports medicine surgeons and fellows sized each graft independently using either hollow cylindrical sizers or block sizers in 0.5-mm increments; the sizing technique used was applied consistently to each graft. Surgeons moved sequentially from graft to graft and measured each hamstring graft twice. Surgeons were asked to state the measured proximal (femoral) and distal (tibial) diameter of each graft, as well as the diameter of the tibial and femoral tunnels that they would drill if performing an anterior cruciate ligament (ACL) reconstruction using that graft. Reliability was established using intra-class correlation coefficients. Overall, both the inter-observer and intra-observer agreement were >0.9, demonstrating excellent reliability. The inter-observer reliability for drill sizes was also excellent (>0.9). Excellent correlation was seen between cylindrical sizing and drill sizes (>0.9). Sizing of hamstring grafts by multiple surgeons demonstrated excellent intra-observer and inter-observer reliability, potentially validating clinical studies exploring ACL reconstruction outcomes by hamstring graft diameter when standard techniques are used. III.

  1. Interobserver and intraobserver variability in the identification of the Lenke classification lumbar modifier in adolescent idiopathic scoliosis.

    PubMed

    Duong, Luc; Cheriet, Farida; Labelle, Hubert; Cheung, Kenneth M C; Abel, Mark F; Newton, Peter O; McCall, Richard E; Lenke, Lawrence G; Stokes, Ian A F

    2009-08-01

    Interobserver and intraobserver reliability study for the identification of the Lenke classification lumbar modifier by a panel of experts compared with a computer algorithm. To measure the variability of the Lenke classification lumbar modifier and determine if computer assistance using 3-dimensional spine models can improve the reliability of classification. The lumbar modifier has been proposed to subclassify Lenke scoliotic curve types into A, B, and C on the basis of the relationship between the central sacral vertical line (CSVL) and the apical lumbar vertebra. Landmarks for identification of the CSVL have not been clearly defined, and the reliability of the actual CSVL position and lumbar modifier selection have never been tested independently. Therefore, the value of the lumbar modifier for curve classification remains unknown. The preoperative radiographs of 68 patients with adolescent idiopathic scoliosis presenting a Lenke type 1 curve were measured manually twice by 6 members of the Scoliosis Research Society 3-dimensional classification committee at a 6-month interval. Intraobserver and interobserver reliability was quantified using the percentage of agreement and kappa statistics. In addition, the lumbar curve of all subjects was reconstructed in 3 dimensions using a stereoradiographic technique and was submitted to a computer algorithm to infer the lumbar modifier according to measurements from the pedicles. Interobserver rates for the first trial showed a mean kappa value of 0.56. Second trial rates were higher with a mean kappa value of 0.64. Intraobserver rates were evaluated at a mean kappa value of 0.69. The computer algorithm was successful in identifying the lumbar curve type and was in agreement with the observers by a proportion up to 93%. Agreement between and within observers for the Lenke lumbar modifier is only moderate to substantial with manual methods. Computer assistance with 3-dimensional models of the spine has the potential to

  2. Acute myonecrosis at MRI: Etiologies in an oncologic cohort, and assessment of interobserver variability

    PubMed Central

    Cunningham, Jane; Sharma, Richa; Kirzner, Anna; Hwang, Sinchun; Lefkowitz, Robert; Greenspan, Daniel; Shapoval, Anton; Panicek, David M.

    2016-01-01

    Objective: To determine etiologies of myonecrosis in oncology patients and to assess interobserver variability in interpreting its MRI features. Materials and Methods: Pathology records in our tertiary cancer hospital were searched for proven myonecrosis, and MRIs of affected regions in those patients were identified. MRI reports that suggested myonecrosis also were identified. Each MRI was reviewed independently by two of six readers to assess anatomic site, size, and signal intensities of muscle changes, and presence of the previously reported stipple sign (enhancing foci within a region defined by rim enhancement). The stipple sign was assessed again, weeks after a training session. Cohen kappa and percent agreement were calculated. Medical records were reviewed for contemporaneous causes of myonecrosis. Results: MRI reports in 73 patients suggested the diagnosis of myonecrosis; pathologic proof was available in another two. Myonecrosis was frequently associated with radiotherapy (n=34 [45%] patients); less frequent causes included intraoperative immobilization, trauma, therapeutic embolization, ablation therapy, exercise, and diabetes. Myonecrosis usually involved lower extremity, pelvis, and upper extremity; mean size was 13.0 cm. Stipple sign was observed in 55–95% of patients at first assessment (k=0.09–0.42; 60–80% agreement) and 55–100% at second (k=0.0–0.58; 72–90% agreement). Enhancement surrounded myonecrosis in 55–100% of patients (k=0.03–0.32; 58–70% agreement). Conclusion: Myonecrosis in oncology patients usually occurred after radiotherapy, and less commonly after intraoperative immobilization, trauma, therapeutic embolization, ablation therapy, exercise, or diabetes. Although interobserver variability for MRI features of myonecrosis exists (even after focused training), a combination of findings facilitates diagnosis and conservative management. PMID:27105618

  3. Double sac sign and intradecidual sign in early pregnancy: interobserver reliability and frequency of occurrence.

    PubMed

    Doubilet, Peter M; Benson, Carol B

    2013-07-01

    To assess the interobserver agreement, frequency of occurrence, and prognostic importance of the double sac sign (DSS), intradecidual sign (IDS), and other sonographic findings in early intrauterine pregnancies. We retrospectively identified all sonograms obtained between January 1, 2006, and December 31, 2011, in which: (1) the scan demonstrated an intrauterine fluid collection without a yolk sac or embryo; (2) a follow-up scan confirmed an intrauterine pregnancy; and (3) the first-trimester outcome was known. Each coinvestigator characterized the 199 study sonograms as demonstrating or not demonstrating a DSS or an IDS, based on judgment about whether the scan met published criteria defining these signs. Interobserver agreement was poor for the DSS (κ = 0.24) and IDS (κ = 0.23). Scans frequently demonstrated neither sign: 150 cases (75.4%) if we considered a sign to be present when both investigators graded it as present and 69 cases (34.7%) using the looser criterion that either graded it as present. The presence of a DSS or an IDS was unrelated to the β-human chorionic gonadotropin (β-hCG) value (P > .05, t test, all comparisons). An inner echogenic ring was present in 158 cases (79.4%), and the decidua was brighter peripherally than centrally in 102 (51.3%). The first-trimester outcome was unrelated to the presence of a DSS or an IDS, presence of an inner echogenic ring, or decidual appearance (P > .05, χ², all comparisons). The sonographic appearance of early gestational sacs, before visualization of a yolk sac or embryo, is highly variable. The DSS and IDS are often absent; there is poor interobserver agreement regarding these signs; and the prognosis is unrelated to their presence or absence. A round or oval intrauterine fluid collection in a woman with positive β-hCG should be treated as a gestational sac until proven otherwise, regardless of whether it demonstrates a DSS or an IDS.

  4. Interobserver variability in recognizing arousal in respiratory sleep disorders.

    PubMed

    Drinnan, M J; Murray, A; Griffiths, C J; Gibson, G J

    1998-08-01

    Daytime sleepiness is a common consequence of repeated arousal in obstructive sleep apnea (OSA). Arousal indices are sometimes used to make decisions on treatment, but there is no evidence that arousals are detected similarly even by experienced observers. Using the American Sleep Disorders Association (ASDA) definition of arousal in terms of the accompanying electroencephalogram (EEG) changes, we have quantified interobserver agreement for arousal scoring and identified factors affecting it. Ten patients with suspected OSA were studied; three representative EEG events during each of light, slow-wave, and rapid-eye-movement (REM) sleep were extracted from each record (90 events total) and evaluated by experts in 14 sleep laboratories. Observers differed (ANOVA, p < 0.001) in the number of events scored as arousal (totals ranged from 23 to 53 of the 90 events). Overall agreement was moderate (kappa = 0.47), but it was best for events during slow-wave sleep, moderate for REM, and poor for light sleep (kappa = 0.60, 0.52, and 0.28, respectively). Agreement was unrelated to arousal duration. We conclude that the ASDA definition of arousal is only moderately repeatable. Account should be taken of this variability when results from different centers are compared.
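
    The abstract does not say which multi-observer kappa was used. As one common option when more than two observers score the same events, the sketch below computes Fleiss' kappa from a table of per-event category counts; the function name and the example counts are hypothetical and are not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters placing item i in category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])
    # Overall share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_categories)]
    # Per-item agreement: fraction of rater pairs that chose the same category.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_bar = sum(p_item) / n_items
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        return 1.0  # every rating fell into one category
    return (p_bar - p_exp) / (1.0 - p_exp)


# Hypothetical table: 4 events, 3 observers, columns = [arousal, no arousal].
table = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(round(fleiss_kappa(table), 3))
```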

  5. Interobserver reproducibility of The Paris System for Reporting Urinary Cytology.

    PubMed

    Long, Theresa; Layfield, Lester J; Esebua, Magda; Frazier, Shellaine R; Giorgadze, D Tamar; Schmidt, Robert L

    2017-01-01

    The Paris System for Reporting Urinary Cytology represents a significant improvement in classification of urinary specimens. The system acknowledges the difficulty in cytologically diagnosing low-grade urothelial carcinomas and has developed categories to deal with this issue. The system uses six categories: unsatisfactory, negative for high-grade urothelial carcinoma (NHGUC), atypical urothelial cells, suspicious for high-grade urothelial carcinoma, high-grade urothelial carcinoma, other malignancies and a seventh subcategory (low-grade urothelial neoplasm). Three hundred and fifty-seven urine specimens were independently reviewed by four cytopathologists unaware of the previous diagnoses. Each cytopathologist rendered a diagnosis according to the Paris System categories. Agreement was assessed using absolute agreement and weighted chance-corrected agreement (kappa). Disagreements were classified as low impact and high impact based on the potential impact of a misclassification on clinical management. The average absolute agreement was 65% with an average expected agreement of 44%. The average chance-corrected agreement (kappa) was 0.32. Nine hundred and ninety-nine of 1902 comparisons between rater pairs were in agreement, but 12% of comparisons differed by two or more categories for the category NHGUC. Approximately 15% of the disagreements were classified as high clinical impact. Our findings indicated that the scheme recommended by the Paris System shows adequate precision for the category NHGUC, but the other categories demonstrated unacceptable interobserver variability. This low level of diagnostic precision may negatively impact the applicability of the Paris System for widespread clinical application.

  6. [LiLa classification for paediatric long bone fractures. Intraobserver and interobserver reliability].

    PubMed

    Kamphaus, A; Rapp, M; Wessel, L M; Buchholz, M; Massalme, E; Schneidmüller, D; Roeder, C; Kaiser, M M

    2015-04-01

    There are two child-specific fracture classification systems for long bone fractures: the AO classification of pediatric long-bone fractures (PCCF) and the LiLa classification of pediatric fractures of long bones (LiLa classification). Both are still not widely established in comparison to the adult AO classification for long bone fractures. During a period of 12 months all long bone fractures in children were documented and classified according to the LiLa classification by experts and non-experts. Intraobserver and interobserver reliability were calculated according to Cohen (kappa). A total of 408 fractures were classified. The intraobserver reliability for location in the skeletal and bone segment showed an almost perfect agreement (K = 0.91-0.95) and also the morphology (joint/shaft fracture) (K = 0.87-0.93). Due to different judgment of the fracture displacement in the second classification round, the intraobserver reliability of the whole classification revealed moderate agreement (K = 0.53-0.58). Interobserver reliability showed moderate agreement (K = 0.55) often due to the low quality of the X-rays. Further differences occurred due to difficulties in assigning the precise transition from metaphysis to diaphysis. The LiLa classification is suitable and in most cases user-friendly for classifying long bone fractures in children. Reliability is higher than in established fracture specific classifications and comparable to the AO classification of pediatric long bone fractures. Some mistakes were due to a low quality of the X-rays and some due to difficulties to classify the fractures themselves. Improvements include a more precise definition of the metaphysis and the kind of displacement. Overall the LiLa classification should still be considered as an alternative for classifying pediatric long bone fractures.

  7. [Interobserver agreement on electrocardiographic diagnosis of left ventricular hypertrophy in hypertensive patients in Andalusia. PREHVIA study].

    PubMed

    Martín-Rioboó, Enrique; López Granados, Amador; Cea Calvo, Luis; Pérula De Torres, Luis Angel; García Criado, Emilio; Anguita Sánchez, Manuel P; García Matarín, Lisardo; Molina Díaz, Rafael; Ureña Fernández, Tomas

    2009-05-01

    To assess the agreement between Primary Care (PC) doctors and a cardiology specialist in diagnosing left ventricular hypertrophy on the electrocardiogram (LVH-ECG) in hypertensive patients. Cross-sectional, multicentre study. Andalusian Primary Care Centres. A total of 120 PC doctors who, using random sampling, selected patients aged 35 years or older with arterial hypertension (AHT) of at least 6 months' duration. PRIMARY VARIABLES: Demographic data, risk factors and cardiovascular diseases were recorded. The LVH-ECG was evaluated by applying Cornell voltage criteria, Cornell and Sokolow-Lyon product. The PC researchers read the ECG first and the cardiologist made a second, blinded reading. A total of 570 patients were included (mean ± SD age, 65 ± 11 years; 54.5% female); the LVH-ECG prevalence was 13.7% (95% CI, 10.8-16.6; 12.6% by Cornell and 1.6% by Sokolow-Lyon). The agreement in the diagnosis between the PC doctors and the cardiologist was 0.378 (95% CI, 0.272-0.486; disagreements in 15.5% of cases). The PC doctors slightly underestimated the LVH-ECG prevalence by Cornell and slightly overestimated it by the Sokolow-Lyon criteria. The agreement was also low for each criterion (kappa = 0.367; 95% CI, 0.252-0.482, for Cornell, and kappa = 0.274; 95% CI, 0.093-0.454, for Sokolow-Lyon). The agreement between the diagnosis by the PC doctors and the cardiologist was low. The implications of this study suggest the need to improve the reading of ECG among PC doctors. The use of computerised systems could be a good option.

  8. Observers' Agreement on Measurements in Fiberoptic Endoscopic Evaluation of Swallowing.

    PubMed

    Pilz, Walmari; Vanbelle, Sophie; Kremer, Bernd; van Hooren, Michel R; van Becelaere, Tine; Roodenburg, Nel; Baijens, Laura W J

    2016-04-01

    This study analyzed the effect that dysphagia etiology, different observers, and bolus consistency might have on the level of agreement for measurements in FEES images reached by independent versus consensus panel rating. Sixty patients were included and divided into two groups according to dysphagia etiology: neurological or head and neck oncological. All patients underwent standardized FEES examination using thin and thick liquid consistencies. Two observers scored the same exams, first independently and then in a consensus panel. Four ordinal FEES variables were analyzed. Statistical analysis was performed using a linear weighted kappa coefficient and Bayesian multilevel model. Intra- and interobserver agreement on FEES measurements ranged from 0.76 to 0.93 and from 0.61 to 0.88, respectively. Dysphagia etiology did not influence observers' agreement level. However, bolus consistency resulted in decreased interobserver agreement for all measured FEES variables during thin liquid swallows. When rating on the consensus panel, the observers deviated considerably from the scores they had previously given on the independent rating task. Observer agreement on measurements in FEES exams was influenced by bolus consistency, not by dysphagia etiology. Therefore, observer agreement on FEES measurements should be analyzed by taking bolus consistency into account, as it might affect the interpretation of the outcome. Identifying factors that might influence agreement levels could lead to better understanding of the rating process and assist in developing a more precise measurement scale that would ensure higher levels of observer agreement for measurements in FEES exams.
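
    For readers unfamiliar with the linear weighted kappa mentioned in this abstract, the following is a minimal sketch for two raters scoring an ordinal variable, in which a disagreement is penalised in proportion to how many scale steps apart the two scores are. The function name, the 0-based integer coding of the scale, and the sample scores are assumptions for illustration only.

```python
def linear_weighted_kappa(rater_a, rater_b, n_levels):
    """Linear weighted kappa for two raters on an ordinal scale coded 0..n_levels-1."""
    if len(rater_a) != len(rater_b) or not rater_a or n_levels < 2:
        raise ValueError("need paired ratings and at least two scale levels")
    n = len(rater_a)
    # Cross-tabulate the paired scores.
    table = [[0] * n_levels for _ in range(n_levels)]
    for a, b in zip(rater_a, rater_b):
        table[a][b] += 1
    row = [sum(table[i]) / n for i in range(n_levels)]
    col = [sum(table[i][j] for i in range(n_levels)) / n for j in range(n_levels)]
    observed = expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = abs(i - j) / (n_levels - 1)  # linear disagreement weight
            observed += weight * table[i][j] / n
            expected += weight * row[i] * col[j]
    if expected == 0.0:
        raise ValueError("kappa is undefined when both raters use one identical level")
    return 1.0 - observed / expected


# Hypothetical scores on a 4-level ordinal scale (not data from the study).
observer_1 = [0, 1, 2, 3]
observer_2 = [0, 1, 2, 2]
print(round(linear_weighted_kappa(observer_1, observer_2, 4), 3))
```

    With linear weights, a one-step disagreement on a 4-level scale costs one third as much as the largest possible disagreement, so near misses lower kappa less than they would under the unweighted statistic.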

  9. Total knee arthroplasty: good agreement of clinical severity scores between patients and consultants.

    PubMed

    Ebinesan, Ananthan D; Sarai, Bhupinder S; Walley, Gayle; Bridgman, Stephen; Maffulli, Nicola

    2006-07-31

    Nearly 20,000 patients per year in the UK receive total knee arthroplasty (TKA). One of the problems faced by the health services of many developed countries is the length of time patients spend waiting for elective treatment. We therefore report the results of a study in which the Salisbury Priority Scoring System (SPSS) was used by both the surgeon and their patients to ascertain whether there were differences between the surgeon generated and patient generated Salisbury Priority Scores. The Salisbury Priority Scoring System (SPSS) was used to assign relative priority to patients with knee osteoarthritis as part of a randomised controlled trial comparing the standard medial parapatellar approach versus the sub-vastus approach in TKA. The operating surgeons and each patient completed the SPSS at the same pre-assessment clinic. The SPSS assesses four criteria, namely progression of disease, pain or distress, disability or dependence on others, and loss of usual occupation. Crosstabs and agreement measures (Cohen's kappa) were performed. Overall, the four SPSS criteria showed a kappa value of 0.526, 0.796, 0.813, and 0.820, respectively, showing moderate to very good agreement between the patient and the operating consultant. Male patients showed better agreement than female patients. The Salisbury Priority Scoring System is a good means of assessing patients' needs in relation to elective surgery, with high agreement between the patient and the operating surgeon.
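
    The abstract above summarizes patient-versus-surgeon agreement with Cohen's kappa. A minimal sketch of the unweighted statistic follows, assuming two equal-length lists of category labels; the function name and the example scores are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items on which the two raters coincide.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical priority scores (1 = low, 2 = high) from a patient and a surgeon.
patient = [1, 1, 2, 2, 2, 1, 2, 1]
surgeon = [1, 1, 2, 2, 1, 1, 2, 2]
kappa = cohen_kappa(patient, surgeon)
```

    Kappa corrects raw agreement for the agreement expected by chance, so 6 of 8 matching scores in this invented example gives a coefficient of 0.5 rather than 0.75.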

  10. Agreement among Classroom Observers of Children's Stylistic Learning Behaviors.

    ERIC Educational Resources Information Center

    Buchanan, Helen Hamlet; McDermott, Paul A.; Schaefer, Barbara A.

    1998-01-01

    Investigates the interobserver agreement of the Learning Behavior Scale (LBS) by educators (n=16) observing students in special-education classes (n=72). No significant observer effect was found. Moreover, the LBS produced comparable levels of differential learning styles for assessments of individual children. (Author/MKA)

  11. An inter-observer agreement study of autofluorescence endoscopy in Barrett's esophagus among expert and non-expert endoscopists.

    PubMed

    Mannath, J; Subramanian, V; Telakis, E; Lau, K; Ramappa, V; Wireko, M; Kaye, P V; Ragunath, K

    2013-02-01

    Autofluorescence imaging (AFI), which is a "red flag" technique during Barrett's surveillance, is associated with significant false positive results. The aim of this study was to assess the inter-observer agreement (IOA) in identifying AFI-positive lesions and to assess the overall accuracy of AFI. Anonymized AFI and high resolution white light (HRE) images were prospectively collected. The AFI images were presented in random order, followed by corresponding AFI + HRE images. Three AFI experts and 3 AFI non-experts scored images after a training presentation. The IOA was calculated using kappa and accuracy was calculated with histology as gold standard. Seventy-four sets of images were prospectively collected from 63 patients (48 males, mean age 69 years). The IOA for number of AF positive lesions was fair when AFI images were presented. This improved to moderate with corresponding AFI and HRE images [experts 0.57 (0.44-0.70), non-experts 0.47 (0.35-0.62)]. The IOA for the site of AF lesion was moderate for experts and fair for non-experts using AF images, which improved to substantial for experts [κ = 0.62 (0.50-0.72)] but remained at fair for non-experts [κ = 0.28 (0.18-0.37)] with AFI + HRE. Among experts, the accuracy of identifying dysplasia was 0.76 (0.7-0.81) using AFI images and 0.85 (0.79-0.89) using AFI + HRE images. The accuracy was 0.69 (0.62-0.74) with AFI images alone and 0.75 (0.70-0.80) using AFI + HRE among non-experts. The IOA for AF positive lesions is fair to moderate using AFI images which improved with addition of HRE. The overall accuracy of identifying dysplasia was modest, and was better when AFI and HRE images were combined.

  12. Evaluation of interobserver variability and diagnostic performance of developed MRI-based radiological scoring system for invasive placenta previa.

    PubMed

    Ueno, Yoshiko; Maeda, Tetsuo; Tanaka, Utaru; Tanimura, Kenji; Kitajima, Kazuhiro; Suenaga, Yuko; Takahashi, Satoru; Yamada, Hideto; Sugimura, Kazuro

    2016-09-01

    To evaluate the interobserver variability and diagnostic performance of a developed magnetic resonance imaging (MRI)-based scoring system for invasive placenta previa. Prenatal MR images of 70 women were retrospectively evaluated, 18 of whom were diagnosed with invasive placenta. The six MR features (dark band on T2-weighted images, intraplacental abnormal vascularity, placental bulge, heterogeneous placenta, myometrial thinning, and placental protrusion sign) were scored separately on a 5-point Likert scale, and the cumulative radiological score (CRS) was defined as the sum of each score. Two more experienced radiologists (readers A and B) and two less experienced residents (readers C and D) calculated the CRS. Interobserver variability was assessed by measuring the intraclass correlation coefficient. Diagnostic performance was evaluated by means of receiver operating characteristic (ROC) analysis. Interobserver variability for CRS was excellent for the more experienced radiologists (0.85), and good for all readers (0.72) and the less experienced residents (0.66). The area under the ROC curve (Az) and accuracy (Acc) for CRS were significantly higher than or equivalent to those of other MR features for all readers (Az and Acc for reader A; CRS, 0.92, 91.4%; intraplacental T2 dark band, 0.83, P = 0.009, 81.4%, P = 0.03; intraplacental abnormal vascularity, 0.9, P = 0.3, 90.0%, P = 1.00; placental bulge, 0.81, P = 0.0008, 80.0%, P = 0.02; heterogeneous placenta, 0.85, P = 0.11, 74.3%, P = 0.002; myometrial thinning, 0.84, P = 0.06, 60.0%, P < 0.0001; placental protrusion sign, 0.81, P = 0.01, 81.4%, P = 0.26). This developed MRI-based scoring system demonstrated excellent or good interobserver variability, and good diagnostic performance for invasive placenta previa. J. Magn. Reson. Imaging 2016;44:573-583. © 2016 International Society for Magnetic Resonance in Medicine.

  13. Interobserver Agreement and Disagreement in Continuous Recording Exemplified by Measurement of Behavior State.

    ERIC Educational Resources Information Center

    Mudford, Oliver C.; Hogg, James; Roberts, Jessica

    1997-01-01

    Continuous observational recording over 57 hours evaluated behavior states of three adults with profound and multiple disabilities. Two independent observers also recorded for 22 hours. Although overall percentage agreement was satisfactory (above 80%), agreement on occurrence was unsatisfactory (mean of 65%). Agreement data were superimposed on…
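
    The abstract above contrasts overall percentage agreement with agreement on occurrence. The sketch below shows one common way the two figures can diverge, assuming interval-by-interval records in which each observer marks a behavior as present (True) or absent (False). That coding, the function names, and the data are assumptions for illustration; the study itself used continuous recording and may have computed agreement differently.

```python
def overall_agreement(obs_a, obs_b):
    """Percentage of intervals on which both observers gave the same score."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("observers must score the same non-empty set of intervals")
    agreed = sum(a == b for a, b in zip(obs_a, obs_b))
    return 100.0 * agreed / len(obs_a)


def occurrence_agreement(obs_a, obs_b):
    """Percentage agreement restricted to intervals where at least one observer
    scored the behavior as occurring."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("observers must score the same non-empty set of intervals")
    either = sum(1 for a, b in zip(obs_a, obs_b) if a or b)
    both = sum(1 for a, b in zip(obs_a, obs_b) if a and b)
    if either == 0:
        raise ValueError("no interval was scored as an occurrence")
    return 100.0 * both / either


# Hypothetical record of 20 intervals for a rarely occurring behavior state.
observer_a = [i in (0, 1, 2) for i in range(20)]
observer_b = [i in (1, 2, 3) for i in range(20)]
overall = overall_agreement(observer_a, observer_b)        # 90.0
occurrence = occurrence_agreement(observer_a, observer_b)  # 50.0
```

    When a behavior is rare, the many intervals in which both observers record nothing inflate the overall figure, so occurrence agreement is the more demanding check.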

  14. Significant inter-observer variation in the diagnosis of extrapancreatic necrosis and type of pancreatic collections in acute pancreatitis - An international multicenter evaluation of the revised Atlanta classification.

    PubMed

    Sternby, Hanna; Verdonk, Robert C; Aguilar, Guadalupe; Dimova, Alexandra; Ignatavicius, Povilas; Ilzarbe, Lucas; Koiva, Peeter; Lantto, Eila; Loigom, Tonis; Penttilä, Anne; Regnér, Sara; Rosendahl, Jonas; Strahinova, Vanya; Zackrisson, Sophia; Zviniene, Kristina; Bollen, Thomas L

    2016-01-01

    For consistent reporting and better comparison of data in research, the revised Atlanta classification (RAC) proposes new computed tomography (CT) criteria to describe the morphology of acute pancreatitis (AP). The aim of this study was to analyse the interobserver agreement among radiologists in evaluating CT morphology by using the new RAC criteria in patients with AP. Patients with a first episode of AP who obtained a CT were identified and consecutively enrolled at six European centres backwards from January 2013 to January 2012. A local radiologist at each center and a central expert radiologist scored the CTs separately using the RAC criteria. Center dependent and independent interobserver agreement was determined using Kappa statistics. In total, 285 patients with 388 CTs were included. For most CT criteria, interobserver agreement was moderate to substantial. In four categories, the center independent kappa values were fair: extrapancreatic necrosis (EXPN) (0.326), type of pancreatitis (0.370), characteristics of collections (0.408), and appropriate term of collections (0.356). The fair kappa values relate to discrepancies in the identification of extrapancreatic necrotic material. The local radiologists diagnosed EXPN (33% versus 59%, P < 0.0001) and non-homogeneous collections (35% versus 66%, P < 0.0001) significantly less frequently than the central expert. Cases read by the central expert showed superior correlation with clinical outcome. Diagnosis of EXPN and recognition of non-homogeneous collections show only fair agreement, potentially resulting in inconsistent reporting of morphologic findings. Copyright © 2016 IAP and EPC. Published by Elsevier B.V. All rights reserved.

  15. Subsolid Lung Nodule Classification: A CT Criterion for Improving Interobserver Agreement.

    PubMed

    Revel, Marie-Pierre; Mannes, Inès; Benzakoun, Joseph; Guinet, Claude; Léger, Thomas; Grenier, Philippe; Lupo, Audrey; Fournel, Ludovic; Chassagnon, Guillaume; Bommart, Sébastien

    2018-01-01

    Purpose To evaluate an objective computed tomographic (CT) criterion for distinguishing between part-solid (PS) and nonsolid (NS) lung nodules. Materials and Methods This study received institutional review board approval, and patients gave informed consent. Preoperative CT studies in all patients who underwent surgery for subsolid nodules between 2008 and 2015 were first reviewed by two senior radiologists, who subjectively classified the nodules as PS or NS. A second reading performed 1 month later used predefined classification criteria and involved a third senior radiologist as well as three junior radiologists. Subsolid nodules were classified as PS if a solid portion was detectable in the mediastinal window setting (nonmeasurable, < 50%, or > 50% of the entire nodule) and were otherwise classified as NS (subclassified as pure or heterogeneous). Interreader agreement was assessed with κ statistics and the intraclass correlation coefficient (ICC). Results A total of 99 nodules measuring a median of 20 mm (range, 5-47 mm) in lung window CT images were analyzed. Senior radiologist agreement on the PS/NS distinction increased from moderate (κ = 0.54; 95% confidence interval [CI]: 0.37, 0.71) to excellent (κ = 0.89; 95% CI: 0.80, 0.98) between the first and second readings. At the second readings, agreement among senior and junior radiologists was excellent for PS/NS distinction (ICC = 0.87; 95% CI: 0.83, 0.90) and for subcategorization (ICC = 0.82; 95% CI: 0.77, 0.87). When a solid portion was measurable in the mediastinal window, the specificity for adenocarcinoma invasiveness ranged from 86% to 96%. Conclusion Detection of a solid portion in the mediastinal window setting allows subsolid nodules to be classified as PS with excellent interreader agreement. If the solid portion is measurable, the specificity for adenocarcinoma invasiveness is high. © RSNA, 2017 Online supplemental material is available for this article.

  16. An inter-observer Ki67 reproducibility study applying two different assessment methods: on behalf of the Danish Scientific Committee of Pathology, Danish breast cancer cooperative group (DBCG).

    PubMed

    Laenkholm, Anne-Vibeke; Grabau, Dorthe; Møller Talman, Maj-Lis; Balslev, Eva; Bak Jylling, Anne Marie; Tabor, Tomasz Piotr; Johansen, Morten; Brügmann, Anja; Lelkaitis, Giedrius; Di Caterino, Tina; Mygind, Henrik; Poulsen, Thomas; Mertz, Henrik; Søndergaard, Gorm; Bruun Rasmussen, Birgitte

    2018-01-01

    In 2011, the St. Gallen Consensus Conference introduced the use of pathology to define the intrinsic breast cancer subtypes by application of immunohistochemical (IHC) surrogate markers ER, PR, HER2 and Ki67 with a specified Ki67 cutoff (>14%) for luminal B-like definition. Reports concerning impaired reproducibility of Ki67 estimation and threshold inconsistency led to the initiation of this quality assurance study (2013-2015). The aim of the study was to investigate inter-observer variation for Ki67 estimation in malignant breast tumors by two different quantification methods (assessment method and count method), including a measure of agreement between methods. Fourteen experienced breast pathologists from 12 pathology departments evaluated 118 slides from a consecutive series of malignant breast tumors. The staining interpretation was performed according to both the Danish and Swedish guidelines. Reproducibility was quantified by intra-class correlation coefficient (ICC) and Light's Kappa with dichotomization of observations at the larger than (>) 20% threshold. The agreement between observations by the two quantification methods was evaluated by Bland-Altman plot. For the fourteen raters the median ranged from 20% to 40% by the assessment method and from 22.5% to 36.5% by the count method. Light's Kappa was 0.664 for observation by the assessment method and 0.649 by the count method. The ICC was 0.82 (95% CI: 0.77-0.86) by the assessment method vs. 0.84 (95% CI: 0.80-0.87) by the count method. Although the study in general showed moderate to good inter-observer agreement according to both ICC and Light's Kappa, major discrepancies were still identified, especially in the mid-range of observations. Consequently, for now Ki67 estimation is not implemented in the DBCG treatment algorithm.
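
    The abstract above compares two quantification methods with a Bland-Altman plot. The numbers behind such a plot are the mean difference (bias) and the 95% limits of agreement, conventionally the bias plus or minus 1.96 standard deviations of the paired differences. A minimal sketch follows, with a hypothetical function name and invented Ki67 percentages that are not from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(method_1, method_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_1) != len(method_2) or len(method_1) < 2:
        raise ValueError("need at least two paired measurements")
    # Paired differences between the two methods on the same specimens.
    differences = [x - y for x, y in zip(method_1, method_2)]
    bias = mean(differences)
    # 1.96 sample standard deviations cover about 95% of differences
    # if the differences are roughly normally distributed.
    half_width = 1.96 * stdev(differences)
    return bias, bias - half_width, bias + half_width


# Invented Ki67 estimates (%) for four slides by two methods.
assessment = [10.0, 20.0, 30.0, 40.0]
count = [12.0, 18.0, 33.0, 37.0]
bias, lower, upper = bland_altman_limits(assessment, count)
```

    A bias near zero with wide limits, as in this invented example, means the methods agree on average but can differ substantially on an individual slide.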

  17. AO Distal Radius Fracture Classification: Global Perspective on Observer Agreement.

    PubMed

    Jayakumar, Prakash; Teunis, Teun; Giménez, Beatriz Bravo; Verstreken, Frederik; Di Mascio, Livio; Jupiter, Jesse B

    2017-02-01

    Background  The primary objective of this study was to test interobserver reliability when classifying fractures by consensus by AO types and groups among a large international group of surgeons. Secondarily, we assessed the difference in inter- and intraobserver agreement of the AO classification in relation to geographical location, level of training, and subspecialty. Methods  A randomized set of radiographic and computed tomographic images from a consecutive series of 96 distal radius fractures (DRFs), treated between October 2010 and April 2013, was classified using an electronic web-based portal by an invited group of participants on two occasions. Results  Interobserver reliability was substantial when classifying AO type A fractures but fair and moderate for type B and C fractures, respectively. No difference was observed by location, except for an apparent difference between participants from India and Australia classifying type B fractures. No statistically significant associations were observed comparing interobserver agreement by level of training and no differences were shown comparing subspecialties. Intra-rater reproducibility was "substantial" for fracture types and "fair" for fracture groups with no difference accounting for location, training level, or specialty. Conclusion  Improved definition of reliability and reproducibility of this classification may be achieved using large international groups of raters, empowering decision making on which system to utilize. Level of Evidence  Level III.

  18. AO Distal Radius Fracture Classification: Global Perspective on Observer Agreement

    PubMed Central

    Jayakumar, Prakash; Teunis, Teun; Giménez, Beatriz Bravo; Verstreken, Frederik; Di Mascio, Livio; Jupiter, Jesse B.

    2016-01-01

    Background The primary objective of this study was to test interobserver reliability when classifying fractures by consensus by AO types and groups among a large international group of surgeons. Secondarily, we assessed the difference in inter- and intraobserver agreement of the AO classification in relation to geographical location, level of training, and subspecialty. Methods A randomized set of radiographic and computed tomographic images from a consecutive series of 96 distal radius fractures (DRFs), treated between October 2010 and April 2013, was classified using an electronic web-based portal by an invited group of participants on two occasions. Results Interobserver reliability was substantial when classifying AO type A fractures but fair and moderate for type B and C fractures, respectively. No difference was observed by location, except for an apparent difference between participants from India and Australia classifying type B fractures. No statistically significant associations were observed comparing interobserver agreement by level of training and no differences were shown comparing subspecialties. Intra-rater reproducibility was “substantial” for fracture types and “fair” for fracture groups with no difference accounting for location, training level, or specialty. Conclusion Improved definition of reliability and reproducibility of this classification may be achieved using large international groups of raters, empowering decision making on which system to utilize. Level of Evidence Level III. PMID: 28119795

  19. Intra- and interobserver reliability of the Eaton classification for trapeziometacarpal arthritis: a systematic review.

    PubMed

    Berger, Aaron J; Momeni, Arash; Ladd, Amy L

    2014-04-01

    Trapeziometacarpal, or thumb carpometacarpal (CMC), arthritis is a common problem with a variety of treatment options. Although widely used, the Eaton radiographic staging system for CMC arthritis is of questionable clinical utility, as disease severity does not predictably correlate with symptoms or treatment recommendations. A possible reason for this is that the classification itself may not be reliable, but the literature on this has not, to our knowledge, been systematically reviewed. We therefore performed a systematic review to determine the intra- and interobserver reliability of the Eaton staging system. We systematically reviewed English-language studies published between 1973 and 2013 to assess the degree of intra- and interobserver reliability of the Eaton classification for determining the stage of trapeziometacarpal joint arthritis and pantrapezial arthritis based on plain radiographic imaging. Search engines included: PubMed, Scopus(®), and CINAHL. Four studies, which included a total of 163 patients, met our inclusion criteria and were evaluated. The level of evidence of the studies included in this analysis was determined using the Oxford Centre for Evidence Based Medicine Levels of Evidence Classification by two independent observers. A limited number of studies have been performed to assess intra- and interobserver reliability of the Eaton classification system. The four studies included were determined to be Level 3b. These studies collectively indicate that the Eaton classification demonstrates poor to fair interobserver reliability (kappa values: 0.11-0.56) and fair to moderate intraobserver reliability (kappa values: 0.54-0.657). Review of the literature demonstrates that radiographs assist in the assessment of CMC joint disease, but there is not a reliable system for classification of disease severity. Currently, diagnosis and treatment of thumb CMC arthritis are based on the surgeon's qualitative assessment combining history, physical

  20. Interobserver variability when employing the IUGA/ICS classification system for complications related to prostheses and grafts in female pelvic floor surgery.

    PubMed

    Gowda, Meghana; Kit, Laura Chang; Stuart Reynolds, W; Wang, Li; Dmochowski, Roger R; Kaufman, Melissa R

    2013-10-01

    To unify and organize reporting, an International Urogynecological Association (IUGA)/International Continence Society (ICS) expert consortium published terminology guidelines with a classification system for complications related to implants used in female pelvic surgery. We hypothesize that the complexity of the codification system may be a hindrance to precision, especially with decreasing levels of postgraduate expertise. Residents, fellows, and attending physicians were asked to code seven test cases taken from published literature. Category, timing, and site components of the classification system were assessed independently and according to the level of training. Interobserver reliability was calculated as percent agreement and Fleiss' kappa statistic. A total of 24 participants (6 attending physicians, 3 fellows, and 15 residents) were tested. The percent agreement showed significant variation when classified by level of training. In all categories, attending physicians had the greatest percentage agreement and largest kappa. The most agreement was seen when attending physicians classified mesh complications by time, 71% agreement with kappa 0.73 [95% confidence interval (CI) 0.58-0.88]. For the same task, the percentage agreement for fellows was 57%, kappa 0.55 (95% CI 0.23-0.87) and with residents 57%, kappa 0.71 (95% CI 0.64-0.78). Interestingly, the site component of the classification system had the least overall agreement and lowest kappa [0%, kappa 0.29 (95% CI 0.26-0.32)] followed by the category component [14%, kappa 0.48 (95% CI 0.46-0.5)]. The IUGA/ICS mesh complication classification system has poor interobserver reliability. This trended downward with decreasing postgraduate level; however, we did not have sufficient statistical power to show an association when stratifying by all training levels. This highlights the complex nature of the classification system in its current form and its limitation for widespread clinical and research

  1. Interobserver variation in the diagnosis of fibroepithelial lesions of the breast: a multicentre audit by digital pathology.

    PubMed

    Dessauvagie, Benjamin F; Lee, Andrew H S; Meehan, Katie; Nijhawan, Anju; Tan, Puay Hoon; Thomas, Jeremy; Tie, Bibiana; Treanor, Darren; Umar, Seemeen; Hanby, Andrew M; Millican-Slater, Rebecca

    2018-02-13

    Fibroepithelial lesions (FELs) of the breast span a morphological continuum including lesions where distinction between cellular fibroadenoma (FA) and benign phyllodes tumour (PT) is difficult. The distinction is clinically important with FAs managed conservatively while equivocal lesions and PTs are managed with surgery. We sought to audit core biopsy diagnoses of equivocal FELs by digital pathology and to investigate whether digital point counting is useful in clarifying FEL diagnoses. Scanned slide images from cores and subsequent excisions of 69 equivocal FELs were examined in a multicentre audit by eight pathologists to determine the agreement and accuracy of core needle biopsy (CNB) diagnoses and by digital point counting of stromal cellularity and expansion to determine if classification could be improved. Interobserver variation was high on CNB with a unanimous diagnosis from all pathologists in only eight cases of FA, diagnoses of both FA and PT on the same CNB in 15 and a 'weak' mean kappa agreement between pathologists (k=0.36). 'Moderate' agreement was observed on CNBs among breast specialists (k=0.44) and on excision samples (k=0.49). Up to 23% of lesions confidently diagnosed as FA on CNB were PT on excision and up to 30% of lesions confidently diagnosed as PT on CNB were FA on excision. Digital point counting did not aid in the classification of FELs. Accurate and reproducible diagnosis of equivocal FELs is difficult, particularly on CNB, resulting in poor interobserver agreement and suboptimal accuracy. Given the diagnostic difficulty, and surgical implications, equivocal FELs should be reported in consultation with experienced breast pathologists as a small number of benign FAs can be selected out from equivocal lesions. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  2. Agreement of three interpretation systems of intrapartum foetal heart rate monitoring by different levels of physicians.

    PubMed

    Pruksanusak, Ninlapa; Thongphanang, Putthaporn; Chainarong, Natthicha; Suntharasaj, Thitima; Kor-Anantakul, Ounjai; Suwanrath, Chitkasaem; Petpichetchian, Chusana

    2017-11-01

    A prospective study was conducted in a centre in Southern Thailand to evaluate agreement in EFM interpretation among various physicians in order to find the most practical system for daily use. We found strong agreement on very normal FHR tracings among the FIGO, NICHD 3-tier and 5-tier systems. The NICHD 3-tier system was more compatible with the FIGO system than the 5-tier system was. Overall inter-observer agreement was moderate for the NICHD 3-tier system, while inter-observer agreement for the 5-tier system was fair; the intra-observer agreement was also higher in the NICHD 3-tier system. The 3-tier systems are therefore more suitable than the 5-tier system in general obstetric practice. Impact statement What is already known on this subject: The 3-tier and 5-tier systems were widely used in general obstetrics practice. What the results of this study add: The inter- and intra-observer agreement of the NICHD 3-tier system was higher than that of the 5-tier system. What the implications are of these findings for clinical practice and/or further research: The 3-tier systems were more suitable than the 5-tier systems in general obstetrics practice.

  3. Validation of the Italian version of the Coma Recovery Scale-Revised (CRS-R).

    PubMed

    Sacco, Simona; Altobelli, Emma; Pistarini, Caterina; Cerone, Davide; Cazzulani, Benedetta; Carolei, Antonio

    2011-01-01

    To validate the Italian version of the Coma Recovery Scale-Revised (CRS-R). Two observers applied the Italian version of the CRS-R to selected patients. On day 1, observers A and B independently scored each patient; the comparison of their observations was used to evaluate inter-observer agreement. On day 2, observer A completed a second evaluation and the comparison of this observation with that obtained on day 1 by the same observer was used to evaluate test-re-test agreement. For each evaluation, the diagnostic impression (vegetative state/minimally conscious state) was also reported. Thirty-eight patients were evaluated (mean age ± SD, 58.9 ± 13.8 years). Inter-observer (ρ = 0.81; p < 0.001) as well as test-re-test agreement (ρ = 0.97; p < 0.001) for the total score was high. Inter-observer agreement was excellent for the communication sub-scale, good for the auditory, visual and motor sub-scales and moderate for the oromotor/verbal and arousal sub-scales. Test-re-test agreement was excellent for the visual, motor, oromotor/verbal and communication sub-scales, good for the auditory sub-scale and moderate for the arousal sub-scale. When considering the diagnostic impression, inter-observer agreement was good (κ = 0.75; p < 0.001) and test-re-test agreement was excellent (κ = 0.92; p < 0.001). The Italian version of the CRS-R can be administered reliably and can also be employed to discriminate between patients in a vegetative state and those in a minimally conscious state.

  4. 24 CFR 1000.244 - If the recipient has made a good-faith effort to negotiate a cooperation agreement and tax-exempt...

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ...-faith effort to negotiate a cooperation agreement and tax-exempt status but has been unsuccessful... recipient has made a good-faith effort to negotiate a cooperation agreement and tax-exempt status but has... recipient's Area ONAP. The request must detail a good faith effort by the recipient, identify the housing...

  5. 24 CFR 1000.244 - If the recipient has made a good-faith effort to negotiate a cooperation agreement and tax-exempt...

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ...-faith effort to negotiate a cooperation agreement and tax-exempt status but has been unsuccessful... recipient has made a good-faith effort to negotiate a cooperation agreement and tax-exempt status but has... recipient's Area ONAP. The request must detail a good faith effort by the recipient, identify the housing...

  6. Diffusion-weighted magnetic resonance imaging in the characterization of testicular germ cell neoplasms: Effect of ROI methods on apparent diffusion coefficient values and interobserver variability.

    PubMed

    Tsili, Athina C; Ntorkou, Alexandra; Astrakas, Loukas; Xydis, Vasilis; Tsampalas, Stavros; Sofikitis, Nikolaos; Argyropoulou, Maria I

    2017-04-01

    To evaluate the difference in apparent diffusion coefficient (ADC) measurements at diffusion-weighted (DW) magnetic resonance imaging of differently shaped regions-of-interest (ROIs) in testicular germ cell neoplasms (TGCNs), the diagnostic ability of differently shaped ROIs in differentiating seminomas from nonseminomatous germ cell neoplasms (NSGCNs) and the interobserver variability. Thirty-three TGCNs were retrospectively evaluated. Patients underwent MR examinations, including DWI on a 1.5-T MR system. Two observers measured mean tumor ADCs using four distinct ROI methods: round, square, freehand and multiple small, round ROIs. The interclass correlation coefficient was analyzed to assess interobserver variability. Statistical analysis was used to compare mean ADC measurements among observers, methods and histologic types. All ROI methods showed excellent interobserver agreement, with excellent correlation (P<0.001). Multiple, small ROIs provided the lower mean ADC in TGCNs. Seminomas had lower mean ADC compared to NSGCNs for each ROI method (P<0.001). Round ROI proved the most accurate method in characterizing TGCNs. Interobserver agreement in ADC measurement is excellent, irrespective of the ROI shape. Multiple, small round ROIs and round ROI proved the more accurate methods for ADC measurement in the characterization of TGCNs and in the differentiation between seminomas and NSGCNs, respectively. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Education in Northern Ireland since the Good Friday Agreement: Kabuki Theatre Meets "Danse Macabre"

    ERIC Educational Resources Information Center

    Gardner, John

    2016-01-01

    The Good Friday Agreement (1998) between the UK and Irish governments, and most of the political parties in Northern Ireland, heralded a significant step forward in securing peace and stability for this troubled region of the British Isles. From the new-found stability, the previous fits and starts of education reform were replaced by a…

  8. Inter-Observer and Intra-Observer Reliability of Clinical Assessments in Knee Osteoarthritis

    PubMed Central

    Maricar, Nasimah; Callaghan, Michael J; Parkes, Matthew J; Felson, David T; O’Neill, Terence W

    2016-01-01

    Background Clinical examination of the knee is subject to measurement error. The aim of this analysis was to determine inter- and intra-observer reliability of commonly used clinical tests in patients with knee osteoarthritis (OA). Methods We studied subjects with symptomatic knee OA who were participants in an open-label clinical trial of intra-articular steroid therapy. Following standardisation of the clinical test procedures, two clinicians assessed 25 subjects independently at the same visit, and the same clinician assessed 88 subjects over an interval period of 2–10 weeks; in both cases prior to the steroid intervention. Clinical examination included assessment of bony enlargement, crepitus, quadriceps wasting, knee effusion, joint-line and anserine tenderness and knee range of movement (ROM). Intra-class correlation coefficients (ICC), estimated kappa (κ), weighted kappa (κω) and Bland and Altman plots were used to determine inter- and intra-observer levels of agreement. Results Using Landis and Koch criteria, inter-observer kappa scores were moderate for patellofemoral joint (κ=0.53) and anserine tenderness (κ=0.48); good for bony enlargement (κ=0.66), quadriceps wasting (κ=0.78), crepitus (κ=0.78), medial tibiofemoral joint tenderness (κ=0.76), and effusion assessed by ballottement (κ=0.73) and bulge sign (κω=0.78); and excellent for lateral tibiofemoral joint tenderness (κ=1.00), flexion (ICC=0.97) and extension (ICC=0.87) ROM. Intra-observer kappa scores were moderate for lateral tibiofemoral joint tenderness (κ=0.60), good for crepitus (κ=0.78), effusion assessed by ballottement test (κ=0.77), patellofemoral joint (κ=0.66), medial tibiofemoral joint (κ=0.64) and anserine (κ=0.73) tenderness and excellent for effusion assessed by bulge sign (κω=0.83), bony enlargement (κ=0.98), quadriceps wasting (κ=0.83), flexion (ICC=0.99) and extension (ICC=0.96) ROM. Conclusion Among individuals with symptomatic knee OA, the reliability of clinical examination of the

  9. Interobserver and intraobserver reliability of the modified Waldenström classification system for staging of Legg-Calvé-Perthes disease.

    PubMed

    Hyman, Joshua E; Trupia, Evan P; Wright, Margaret L; Matsumoto, Hiroko; Jo, Chan-Hee; Mulpuri, Kishore; Joseph, Benjamin; Kim, Harry K W

    2015-04-15

    The absence of a reliable classification system for Legg-Calvé-Perthes disease has contributed to difficulty in establishing consistent management strategies and in interpreting outcome studies. The purpose of this study was to assess interobserver and intraobserver reliability of the modified Waldenström classification system among a large and diverse group of pediatric orthopaedic surgeons. Twenty surgeons independently completed the first two rounds of staging: two assessments of forty deidentified radiographs of patients with Legg-Calvé-Perthes disease in various stages. Ten of the twenty surgeons completed another two rounds of staging after the addition of a second pair of radiographs in sequence. Kappa values were calculated within and between each of the rounds. Interobserver kappa values for the classification for surveys 1, 2, 3, and 4 were 0.81, 0.82, 0.76, and 0.80, respectively (with 0.61 to 0.80 considered substantial agreement and 0.81 to 1.0, nearly perfect agreement). Intraobserver agreement for the classification was an average of 0.88 (range, 0.77 to 0.96) between surveys 1 and 2 and an average of 0.87 (range, 0.81 to 0.94) between surveys 3 and 4. The modified Waldenström classification system for staging of Legg-Calvé-Perthes disease demonstrated substantial to almost perfect agreement between and within observers across multiple rounds of study. In doing so, the results of this study provide a foundation for future validation studies, in which the classification stage will be associated with clinical outcomes. Copyright © 2015 by The Journal of Bone and Joint Surgery, Incorporated.
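    Several records in this listing interpret kappa values with descriptive bands such as "substantial" (0.61 to 0.80) and "almost perfect" (0.81 to 1.0). The sketch below shows one way to apply the commonly cited Landis and Koch bands in code. The function name `landis_koch_label` is hypothetical and the cut-offs are an assumption about which convention a given study follows; individual abstracts sometimes use other labels (for example "good" instead of "substantial").

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to a Landis and Koch style descriptive band.

    Assumed bands: <0 poor, 0-0.20 slight, 0.21-0.40 fair,
    0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect.
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Interobserver kappa values reported above for surveys 1 to 4.
survey_kappas = [0.81, 0.82, 0.76, 0.80]
labels = [landis_koch_label(k) for k in survey_kappas]
print(list(zip(survey_kappas, labels)))
```

    Applied to the four survey values above, this reproduces the abstract's own reading of "substantial to almost perfect" agreement.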

  10. The intra- and inter-observer reliability of the physical examination methods used to assess patients with patellofemoral joint instability.

    PubMed

    Smith, Toby O; Clark, Allan; Neda, Sophia; Arendt, Elizabeth A; Post, William R; Grelsamer, Ronald P; Dejour, David; Almqvist, Karl Fredrik; Donell, Simon T

    2012-08-01

    An accurate physical examination of patients with patellar instability is an important aspect of the diagnosis and treatment. While previous studies have assessed the diagnostic accuracy of such physical examination tests, little has been undertaken to assess the inter- and intra-tester reliability of such techniques. The purpose of this study was to determine the inter- and intra-tester reliability of the physical examination tests used for patients with patellar instability. Five patients (10 knees) with bilateral recurrent patellar instability were assessed by five members of the International Patellofemoral Study Group. Each surgeon assessed each patient twice using 18 reported physical examination tests. The inter- and intra-observer reliability was assessed using weighted Kappa statistics with 95% confidence intervals. The findings of the study suggested that there were very poor inter-observer reliability for the majority of the physical tests, with only the assessments of patellofemoral crepitus, foot arch position and the J-sign presenting with fair to moderate agreement respectively. The intra-observer reliability indicated largely moderate to substantial agreement between the first and second tests performed by each assessor, with the greatest agreement seen for the assessment of tibial torsion, popliteal angle and the Bassett's sign. For the common physical examination tests used in the management of patients with patellar instability inter-observer reliability is poor, while intra-observer reliability is moderate. Standardization of physical exam assessments and further study of these results among different clinicians and more divergent patient groups is indicated. Copyright © 2011 Elsevier B.V. All rights reserved.

  11. A Statistical Analysis of Reviewer Agreement and Bias in Evaluating Medical Abstracts

    PubMed Central

    Cicchetti, Domenic V.; Conn, Harold O.

    1976-01-01

    Observer variability affects virtually all aspects of clinical medicine and investigation. One important aspect, not previously examined, is the selection of abstracts for presentation at national medical meetings. In the present study, 109 abstracts, submitted to the American Association for the Study of Liver Disease, were evaluated by three “blind” reviewers for originality, design-execution, importance, and overall scientific merit. Of the 77 abstracts rated for all parameters by all observers, interobserver agreement ranged between 81 and 88%. However, corresponding intraclass correlations varied between 0.16 (approaching statistical significance) and 0.37 (p < 0.01). Specific tests of systematic differences in scoring revealed statistically significant levels of observer bias on most of the abstract components. Moreover, the mean differences in interobserver ratings were quite small compared to the standard deviations of these differences. These results emphasize the importance of evaluating the simple percentage of rater agreement within the broader context of observer variability and systematic bias. PMID:997596
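    The contrast above between high percentage agreement (81 to 88%) and low chance-corrected reliability can be illustrated with a small worked example. The sketch below uses Cohen's kappa for two raters rather than the intraclass correlation used in the study, and the ratings are invented purely for illustration; the function names are hypothetical.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which two raters give the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Undefined (division by zero) if chance agreement equals 1, i.e. when
    both raters use a single identical label for every item.
    """
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal label frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: 100 abstracts, each rater accepts only 8 of them.
rater_a = ["reject"] * 85 + ["accept"] * 1 + ["accept"] * 7 + ["reject"] * 7
rater_b = ["reject"] * 85 + ["accept"] * 1 + ["reject"] * 7 + ["accept"] * 7

print(round(percent_agreement(rater_a, rater_b), 2))  # 0.86
print(round(cohens_kappa(rater_a, rater_b), 3))       # 0.049
```

    Because both hypothetical raters reject most abstracts, they agree 86% of the time largely by chance, and kappa stays close to zero.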

  12. [Interobserver reliability of the Glasgow coma scale in critically ill patients with neurological and/or neurosurgical disease].

    PubMed

    Sánchez-Sánchez, M M; Sánchez-Izquierdo, R; Sánchez-Muñoz, E I; Martínez-Yegles, I; Fraile-Gamo, M P; Arias-Rivera, S

    2014-01-01

    The Glasgow coma scale (GCS) is a common tool used for neurological assessment of critically ill patients. Despite its widespread use, the GCS has some limitations, as different observers may sometimes score the same response differently. The aim was to evaluate interobserver agreement, among intensive care nurses with a minimum of 3 years' experience, both in the overall GCS estimate and for each of its components. This was a prospective observational study including 110 neurological and/or neurosurgical patients, conducted in an 18-bed critical care unit from October 2010 until December 2012. Registered variables: demographic characteristics, reason for admission, overall GCS and its components. The neurological evaluation was conducted by a minimum of 3 nurses. One of them applied an algorithm and consensual assessment technique, and all of them independently scored the response to stimuli. Interobserver agreement was measured using the intraclass correlation coefficient (ICC) with a 95% confidence interval (CI). The study was approved by the Ethics Committee for Clinical Trials. The ICC (95% CI) was: overall GCS: 0.989 (0.985-0.992); ocular response: 0.981 (0.974-0.986); verbal response: 0.971 (0.960-0.979); motor response: 0.987 (0.982-0.991). In our cohort of patients we observed a high level of consistency in the application of both the overall GCS and each of its components. Copyright © 2013 Elsevier España, S.L. y SEEIUC. All rights reserved.

  13. Interobserver delineation variation in lung tumour stereotactic body radiotherapy

    PubMed Central

    Persson, G F; Nygaard, D E; Hollensen, C; Munck af Rosenschöld, P; Mouritsen, L S; Due, A K; Berthelsen, A K; Nyman, J; Markova, E; Roed, A P; Roed, H; Korreman, S; Specht, L

    2012-01-01

    Objectives: In radiotherapy, delineation uncertainties are important as they contribute to systematic errors and can lead to geographical miss of the target. For margin computation, standard deviations (SDs) of all uncertainties must be included as SDs. The aim of this study was to quantify the interobserver delineation variation for stereotactic body radiotherapy (SBRT) of peripheral lung tumours using a cross-sectional study design. Methods: 22 consecutive patients with 26 tumours were included. Positron emission tomography/CT scans were acquired for planning of SBRT. Three oncologists and three radiologists independently delineated the gross tumour volume. The interobserver variation was calculated as a mean of multiple SDs of distances to a reference contour, and calculated for the transversal plane (SDtrans) and craniocaudal (CC) direction (SDcc) separately. Concordance indexes and volume deviations were also calculated. Results: Median tumour volume was 13.0 cm³, ranging from 0.3 to 60.4 cm³. The mean SDtrans was 0.15 cm (SD 0.08 cm) and the overall mean SDcc was 0.26 cm (SD 0.15 cm). Tumours with pleural contact had a significantly larger SDtrans than tumours surrounded by lung tissue. Conclusions: The interobserver delineation variation was very small in this systematic cross-sectional analysis, although significantly larger in the CC direction than in the transversal plane, stressing that anisotropic margins should be applied. This study is the first to make a systematic cross-sectional analysis of delineation variation for peripheral lung tumours referred for SBRT, establishing the evidence that interobserver variation is very small for these tumours. PMID:22919015

  14. Inter- and Intra-Observer Agreement in Ultrasound BI-RADS Classification and Real-Time Elastography Tsukuba Score Assessment of Breast Lesions.

    PubMed

    Schwab, Fabienne; Redling, Katharina; Siebert, Matthias; Schötzau, Andy; Schoenenberger, Cora-Ann; Zanetti-Dällenbach, Rosanna

    2016-11-01

    Our aim was to prospectively evaluate inter- and intra-observer agreement between Breast Imaging Reporting and Data System (BI-RADS) classifications and Tsukuba elasticity scores (TSs) of breast lesions. The study included 164 breast lesions (63 malignant, 101 benign). The BI-RADS classification and TS of each breast lesion were assessed by the examiner and twice by three reviewers at an interval of 2 months. Weighted κ values for inter-observer agreement ranged from moderate to substantial for BI-RADS classification (κ = 0.585-0.738) and were substantial for TS (κ = 0.608-0.779). Intra-observer agreement was almost perfect for ultrasound (US) BI-RADS (κ = 0.847-0.872) and TS (κ = 0.879-0.914). Overall, individual reviewers are highly self-consistent (almost perfect intra-observer agreement) with respect to BI-RADS classification and TS, whereas inter-observer agreement was moderate to substantial. Comprehensive training is essential for achieving high agreement and minimizing the impact of subjectivity. Our results indicate that breast US and real-time elastography can achieve high diagnostic performance. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  15. Radiographical measurements for distal intra-articular fractures of the radius using plain radiographs and cone beam computed tomography images.

    PubMed

    Suojärvi, Nora; Sillat, T; Lindfors, N; Koskinen, S K

    2015-12-01

    Operative treatment of an intra-articular distal radius fracture is one of the most common procedures in orthopedic and hand surgery. The intra- and interobserver agreement of common radiographical measurements of these fractures using cone beam computed tomography (CBCT) and plain radiographs was evaluated. Thirty-seven patients undergoing open reduction and volar fixation for a distal radius fracture were studied. Two radiologists analyzed the preoperative radiographs and CBCT images. Agreement of the measurements was assessed using the intra-class correlation coefficient (ICC) and Bland-Altman analyses. Plain radiographs provided a slightly poorer level of agreement. For fracture diastasis, excellent intraobserver agreement was achieved for radiographs and good or excellent agreement for CBCT, compared to poor interobserver agreement (ICC 0.334) for radiographs and good interobserver agreement (ICC 0.621) for CBCT images. The Bland-Altman analyses indicated a small mean difference between the measurements but rather large variation using both imaging methods, especially in angular measurements. For most of the measurements, radiographs do well, and may be used in clinical practice. Two different measurements by the same reader or by two different readers can lead to different decisions, and therefore a standardization of the measurements is imperative. More detailed analysis of the articular surface needs cross-sectional imaging modalities.

  16. Reliability and agreement on embryo assessment: 5 years of an external quality control programme.

    PubMed

    Martínez-Granados, Luis; Serrano, María; González-Utor, Antonio; Ortiz, Nereyda; Badajoz, Vicente; López-Regalado, María Luisa; Boada, Montserrat; Castilla, Jose A

    2018-03-01

    An external quality-control programme for morphology-based embryo quality assessment, incorporating a standardized embryo grading scheme, was evaluated over a period of 5 years to determine levels of inter-observer reliability and agreement between practising clinical embryologists at IVF centres and the opinions of a panel of experts. Following Guidelines for Reporting Reliability and Agreement Studies, the Gwet index and proportion of positive (Ppos) and negative agreement were calculated. For embryo morphology assessment, a substantial degree of reliability was measured between the centres and the panel of experts (Gwet index: 0.76; 95% CI 0.70 to 0.84). The agreement was higher for good- versus poor-quality embryos. When multinucleation or vacuoles were observed, low levels of reliability were obtained (Ppos: 0.56 and 0.43, respectively). In blastocysts, the characteristic that presented the largest discrepancy was that related to the inner cell mass. In decisions about the final disposition of the embryo, reliability between centre and the panel of experts was moderate (Gwet index: 0.51; 95% CI 0.41 to 0.60). In conclusion, the ability of clinical embryologists to evaluate the presence of multinucleation and vacuoles in the early cleavage embryo, and to determine the category of the inner cell mass in blastocysts, needs to be improved. Copyright © 2017 Reproductive Healthcare Ltd. All rights reserved.

  17. High resolution pituitary gland MRI at 7.0 tesla: a clinical evaluation in Cushing's disease.

    PubMed

    de Rotte, Alexandra A J; Groenewegen, Amy; Rutgers, Dik R; Witkamp, Theo; Zelissen, Pierre M J; Meijer, F J Anton; van Lindert, Erik J; Hermus, Ad; Luijten, Peter R; Hendrikse, Jeroen

    2016-01-01

    To evaluate the detection of pituitary lesions at 7.0 T compared to 1.5 T MRI in 16 patients with clinically and biochemically proven Cushing's disease. In seven patients, no lesion was detected on the initial 1.5 T MRI, and in nine patients it was uncertain whether there was a lesion. Firstly, two readers assessed both 1.5 T and 7.0 T MRI examinations unpaired in a random order for the presence of lesions. Consensus reading with a third neuroradiologist was used to define final lesions in all MRIs. Secondly, surgical outcome was evaluated. A comparison was made between the lesions visualized with MRI and the lesions found during surgery in 9/16 patients. The interobserver agreement for lesion detection was good at 1.5 T MRI (κ = 0.69) and 7.0 T MRI (κ = 0.62). In five patients, both the 1.5 T and 7.0 T MRI enabled visualization of a lesion on the correct side of the pituitary gland. In three patients, 7.0 T MRI detected a lesion on the correct side of the pituitary gland, while no lesion was visible at 1.5 T MRI. The interobserver agreement of image assessment for 7.0 T MRI in patients with Cushing's disease was good, and lesions were detected more accurately with 7.0 T MRI. Key points: interobserver agreement for lesion detection on 1.5 T MRI was good; interobserver agreement for lesion detection on 7.0 T MRI was good; 7.0 T enabled confirmation of unclear lesions at 1.5 T; 7.0 T enabled visualization of lesions not visible at 1.5 T.

  18. The META score for differentiating metastatic from osteoporotic vertebral fractures: an independent agreement assessment.

    PubMed

    Besa, Pablo; Urrutia, Julio; Campos, Mauricio; Mobarec, Sebastián; Cruz, Juan Pablo; Cikutovic, Pablo; Diaz, Gonzalo

    2018-04-27

    Differentiating osteoporotic vertebral fractures (OVFs) from metastatic vertebral fractures (MVFs) is an important clinical challenge. A novel magnetic resonance imaging (MRI)-based score (the META score) was described, aiming to differentiate OVF from MVF. This score showed an almost perfect agreement by the group developing it, but an independent agreement evaluation is pending. We aimed to perform an independent inter- and intraobserver agreement evaluation of the META score and to test the score's capability of differentiating OVF from MVF. This is an agreement study of the META score. Sixty-four patients with confirmed OVF or MVF were assessed by six independent evaluators (three spine surgeons and three fellowship-trained radiologists) using the META score. We used the intraclass correlation coefficient (ICC) to determine the overall inter- and intraobserver agreement, and the kappa statistic (κ) to express the agreement for each individual score criterion. The score accuracy was determined by calculating the area under the receiver operating characteristic curve. Finally, we used κ to evaluate the agreement among raters to determine whether the fracture was OVF or MVF. The overall interobserver agreement was poor [ICC = 0.10 (0.02-0.20)]; spine surgeons [ICC = 0.75 (0.66-0.83)] had better agreement than radiologists did [ICC = 0.05 (-0.08 to 0.21)]. The intraobserver agreement was poor [ICC = 0.17 (0.01-0.32)]; both spine surgeons [ICC = 0.21 (0.05-0.41)] and radiologists had a poor agreement [ICC = 0.03 (-0.29 to 0.27)]. The agreement for each specific criterion varied from κ = 0.24 to κ = 0.60. The area under the receiver operating characteristic curve was 0.58 (0.64 for spine surgeons and 0.52 for radiologists, p < .01). The interobserver agreement using the META score was adequate for spine surgeons but not for other potential users (radiologists); the intraobserver agreement was poor. Further studies are thus necessary before the use of this score is recommended

  19. Intra- and Inter-observer Variability of Measurements of the Laxity Index on Stress Radiographs Performed with the Vezzoni-Modified Badertscher Hip Distension Device.

    PubMed

    Bertal, Mileva; Vezzoni, Aldo; Houdellier, Blandine; Bogaerts, Evelien; Stock, Emmelie; Polis, Ingeborgh; Deforce, Dieter; Saunders, Jimmy H; Broeckx, Bart J G

    2018-06-02

    To describe and evaluate the accuracy and the intra- and inter-observer variability of the laxity index (LI), used to quantify hip laxity on stress radiographs obtained with the Vezzoni-modified Badertscher distension device (VMBDD). Stress radiographs of 10 dogs obtained with the VMBDD were measured three times by an experienced observer. Six participants with different backgrounds (two ECVDI residents, two PhD students, two veterinary assistants) followed a short presentation and subsequently performed the measurements four times in two separate sessions. The effect of self-learning, feedback and specialization on the accuracy of the measurements was assessed. While the intra- and inter-observer variability were in agreement with other studies, the results of the experienced observer indicated that the variability can be very low. Neither feedback nor self-learning improved the results. A high degree of experience in radiographic assessment was not necessary to perform the measurements correctly. As the LI measurements were acceptable after a short presentation, they support the use of VMBDD for a complete and correct in-house evaluation of the hip joint by trained clinicians. However, we propose that, in the context of screening, measurements should be performed by a limited number of experienced examiners, to limit the impact of the inter-observer variability. Schattauer GmbH Stuttgart.

  20. A score card for upper GI endoscopy: Evaluation of interobserver variability in examiners with various levels of experience.

    PubMed

    Neumann, M; Friedl, S; Meining, A; Egger, K; Heldwein, W; Rey, J F; Hochberger, J; Classen, M; Hohenberger, W; Rösch, T

    2002-10-01

    In most European countries, training in GI endoscopy has largely been based on hands-on acquisition of experience in patients rather than on a structured training programme. With the development of training models, systematic hands-on training in a variety of diagnostic and therapeutic endoscopy techniques was achieved. Little, however, is known about methods of objectively assessing trainees' performance. We therefore developed an assessment 'score card' for upper GI endoscopy and tested it in endoscopists with various levels of experience. The aim of the study was therefore to assess interobserver variations in the evaluation of trainees. On the basis of textbook and expert opinions, a consensus group of eight experienced endoscopists developed a score card for diagnostic upper GI endoscopy with biopsy. The score card includes an assessment of the single steps of the procedure as well as of the times needed to complete each step. This score card was then evaluated in a further conference including ten experts who blindly assessed videotapes of 15 endoscopists performing upper GI endoscopy in a training bio-simulation model (the 'Erlangen Endo-Trainer'). On the basis of their previous experience (i.e., the number of endoscopies performed), these 15 endoscopists were classified into four groups: very experienced, experienced, having some experience and inexperienced. Interobserver variability (IOV) was tested for the various score card parameters (Kendall's rank-correlation coefficient 0.0-0.5 poor, 0.5-1.0 good agreement). In addition, the correlation between the score card assessment and the examiners' experience levels was analysed. Despite poor IOV results for all the parameters tested (Kendall coefficient < 0.3), the assessment parameters correlated well when the examiners' different experience levels were taken into account (correlation coefficient 0.59-0.89, p < 0.05). The score card parameters were suitable for differentiating between the four groups of

  1. Suboptimal Agreement Among Cytopathologists in Diagnosis of Malignancy Based on Endoscopic Ultrasound Needle Aspirates of Solid Pancreatic Lesions: A Validation Study.

    PubMed

    Marshall, Carrie; Mounzer, Rawad; Hall, Matt; Simon, Violette; Centeno, Barbara; Dennis, Katie; Dhillon, Jasreman; Fan, Fang; Khazai, Laila; Klapman, Jason; Komanduri, Srinadh; Lin, Xiaoqi; Lu, David; Mehrotra, Sanjana; Muthusamy, V Raman; Nayar, Ritu; Paintal, Ajit; Rao, Jianyu; Sams, Sharon; Shah, Janak; Watson, Rabindra; Rastogi, Amit; Wani, Sachin

    2018-07-01

    Despite the widespread use of endoscopic ultrasound-guided fine-needle aspiration (EUS-FNA) to sample pancreatic lesions and the standardization of pancreaticobiliary cytopathologic nomenclature, there are few data on inter-observer agreement among cytopathologists evaluating pancreatic cytologic specimens obtained by EUS-FNA. We developed a scoring system to assess agreement among cytopathologists in overall diagnosis and quantitative and qualitative parameters, and evaluated factors associated with agreement. We performed a prospective study to validate results from our pilot study that demonstrated moderate to substantial inter-observer agreement among cytopathologists for the final cytologic diagnosis. In the first phase, 3 cytopathologists refined criteria for assessment of quantity and quality measures. During phase 2, EUS-FNA specimens of solid pancreatic lesions from 46 patients were evaluated by 11 cytopathologists at 5 tertiary care centers using a standardized scoring tool. Individual quantitative and qualitative measures were scored and an overall cytologic diagnosis was determined. Clinical and EUS parameters were assessed as predictors of unanimous agreement. Inter-observer agreement (IOA) was calculated using multi-rater kappa (κ) statistics and a logistic regression model was created to identify factors associated with unanimous agreement. The IOA for final diagnoses, based on cytologic analysis, was moderate (κ = 0.56; 95% CI, 0.43-0.70). Kappa values did not increase when categories of suspicious for malignancy, malignant, and neoplasm were combined. IOA was slight to moderate for individual quantitative (κ = 0.007; 95% CI, -0.03 to -0.04) and qualitative parameters (κ = 0.5; 95% CI, 0.47-0.53). Jaundice was the only factor associated with agreement among all cytopathologists on multivariate analysis (odds ratio for unanimous agreement, 5.3; 95% CI, 1.1-26.89). There is a suboptimal level of agreement among cytopathologists in the diagnosis

  2. Critical discussion of evaluation parameters for inter-observer variability in target definition for radiation therapy.

    PubMed

    Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D

    2012-02-01

    Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, overlap measure, and statistical measure of agreement or reliability analysis is required to fully report the interrater variability in delineation.
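    Of the volume overlap measures named above, the Jaccard coefficient is the simplest to compute: the size of the intersection of two contours divided by the size of their union. The sketch below represents each contour as a set of voxel coordinates; the function name and the two toy contours are assumptions made for illustration and do not come from the study.

```python
def jaccard(contour_a: set, contour_b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| for two sets of voxel coordinates."""
    union = contour_a | contour_b
    if not union:
        # Two empty contours are treated as identical (a convention, not a rule).
        return 1.0
    return len(contour_a & contour_b) / len(union)


# Toy 2D "contours": two 10 x 10 voxel squares, the second shifted by 2 voxels.
observer_1 = {(x, y) for x in range(0, 10) for y in range(0, 10)}
observer_2 = {(x, y) for x in range(2, 12) for y in range(0, 10)}

# Overlap is 80 voxels and the union is 120 voxels.
print(round(jaccard(observer_1, observer_2), 3))  # 0.667
```

    With more than two observers, one option is to report the mean of all pairwise values, alongside the descriptive statistics and agreement measures the abstract recommends.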

  3. An Independent Inter- and Intraobserver Agreement Evaluation of the AOSpine Subaxial Cervical Spine Injury Classification System.

    PubMed

    Urrutia, Julio; Zamora, Tomas; Yurac, Ratko; Campos, Mauricio; Palma, Joaquin; Mobarec, Sebastian; Prada, Carlos

    2017-03-01

    An agreement study. The aim of this study was to perform an independent interobserver and intraobserver agreement assessment of the AOSpine subaxial cervical spine injury classification system. The AOSpine subaxial cervical spine injury classification system was recently described. It showed substantial inter- and intraobserver agreement in the study describing it; however, an independent evaluation has not been performed. Anteroposterior and lateral radiographs, computed tomography scans, and magnetic resonance imaging of 65 patients with acute traumatic subaxial cervical spine injuries were selected and classified using the morphologic grading of the subaxial cervical spine injury classification system by 6 evaluators (3 spine surgeons and 3 orthopedic surgery residents). After a 6-week interval, the 65 cases were presented to the same evaluators in a random sequence for repeat evaluation. The kappa coefficient (κ) was used to determine the inter- and intraobserver agreement. The interobserver agreement was substantial when considering the fracture main types (A, B, C, or F), with κ = 0.61 (0.57-0.64), but moderate when considering the subtypes: κ = 0.57 (0.54-0.60). The intraobserver agreement was substantial considering the fracture types, with κ = 0.68 (0.62-0.74) and considering subtypes, κ = 0.62 (0.57-0.66). No significant differences were observed between spine surgeons and orthopedic residents in the overall inter- and intraobserver agreement, or in the inter- and intraobserver agreement of specific A, B, C, or F type of injuries. This classification allows adequate agreement among different observers and by the same observer on separate occasions. Future prospective studies should determine whether this classification allows surgeons to decide the best treatment for patients with subaxial cervical spine injuries. Level of Evidence: 3.

  4. Novel use of non-echo-planar diffusion weighted MRI in monitoring disease activity and treatment response in active Grave's orbitopathy: An initial observational cohort study.

    PubMed

    Lingam, Ravi Kumar; Mundada, Pravin; Lee, Vickie

    2018-01-10

    To examine the novel use of non-echo-planar diffusion weighted MRI (DWI) in depicting activity and treatment response in active Grave's orbitopathy (GO) by assessing, with inter-observer agreement, for a correlation between its apparent diffusion coefficients (ADCs) and conventional Short tau Inversion Recovery (STIR) MRI signal-intensity ratios (SIRs). A total of 23 actively inflamed muscles and 30 muscle response episodes were analysed in patients with active GO who underwent medical treatment. The orbital MRI scans, which included STIR sequences and non-echo-planar DWI, were evaluated. Two observers independently assessed the images qualitatively for the presence of activity in the extraocular muscles (EOMs) and recorded the STIR signal-intensity (SI), SIR (SI ratio of EOM/temporalis muscle), and ADC values of any actively inflamed muscle on the pre-treatment scans and their corresponding values on the subsequent post-treatment scans. Inter-observer agreement was examined. There was a significant positive correlation (0.57, p < 0.001) between ADC and both SIR and STIR SI of the actively inflamed EOM. There was also a significant positive correlation (0.75, p < 0.001) between SIR and ADC values depicting change in muscle activity associated with treatment response. There was good inter-observer agreement. Our preliminary results indicate that quantitative evaluation with non-echo-planar DWI ADC values correlates well with conventional STIR SIR in detecting active GO and monitoring its treatment response, with good inter-observer agreement.

  5. Stress-only myocardial perfusion scintigraphy: a prospective study on the accuracy and observer agreement with quantitative coronary angiography as the gold standard.

    PubMed

    Ejlersen, June A; May, Ole; Mortensen, Jesper; Nielsen, Gitte L; Lauridsen, Jeppe F; Johansen, Allan

    2017-11-01

    Patients with normal stress perfusion have an excellent prognosis. Prospective studies on the diagnostic accuracy of stress-only scans with contemporary, independent examinations as gold standards are lacking. A total of 109 patients with typical angina and no previous coronary artery disease underwent a 2-day stress (exercise)/rest, gated, and attenuation-corrected (AC), 99m-technetium-sestamibi perfusion study, followed by invasive coronary angiography. The stress datasets were evaluated twice by four physicians with two different training levels (expert and novice): familiar and unfamiliar with AC. The two experts also made a consensus reading of the integrated stress-rest datasets. The consensus reading and quantitative data from the invasive coronary angiography were applied as reference methods. The sensitivity/specificity were 0.92-1.00/0.73-0.90 (reference: expert consensus reading), 0.93-0.96/0.63-0.82 (reference: ≥1 stenosis>70%), and 0.75-0.88/0.70-0.88 (reference: ≥1 stenosis>50%). The four readers showed a high and fairly equal sensitivity independent of their familiarity with AC. The expert familiar with AC had the highest specificity independent of the reference method. The intraobserver and interobserver agreements on the stress-only readings were good (readers without AC experience) to excellent (readers with AC experience). AC stress-only images yielded a high sensitivity independent of the training level and experience with AC of the nuclear physician, whereas the specificity correlated positively with both. Interobserver and intraobserver agreements tended to be the best for physicians with AC experience.

  6. Multicentre evaluation of multidisciplinary team meeting agreement on diagnosis in diffuse parenchymal lung disease: a case-cohort study.

    PubMed

    Walsh, Simon L F; Wells, Athol U; Desai, Sujal R; Poletti, Venerino; Piciucchi, Sara; Dubini, Alessandra; Nunes, Hilario; Valeyre, Dominique; Brillet, Pierre Y; Kambouchner, Marianne; Morais, António; Pereira, José M; Moura, Conceição Souto; Grutters, Jan C; van den Heuvel, Daniel A; van Es, Hendrik W; van Oosterhout, Matthijs F; Seldenrijk, Cornelis A; Bendstrup, Elisabeth; Rasmussen, Finn; Madsen, Line B; Gooptu, Bibek; Pomplun, Sabine; Taniguchi, Hiroyuki; Fukuoka, Junya; Johkoh, Takeshi; Nicholson, Andrew G; Sayer, Charlie; Edmunds, Lilian; Jacob, Joseph; Kokosi, Maria A; Myers, Jeffrey L; Flaherty, Kevin R; Hansell, David M

    2016-07-01

    Diffuse parenchymal lung disease represents a diverse and challenging group of pulmonary disorders. A consistent diagnostic approach to diffuse parenchymal lung disease is crucial if clinical trial data are to be applied to individual patients. We aimed to evaluate inter-multidisciplinary team agreement for the diagnosis of diffuse parenchymal lung disease. We did a multicentre evaluation of clinical data of patients who presented to the interstitial lung disease unit of the Royal Brompton and Harefield NHS Foundation Trust (London, UK; host institution) and required multidisciplinary team meeting (MDTM) characterisation between March 1, 2010, and Aug 31, 2010. Only patients whose baseline clinical, radiological, and, if biopsy was taken, pathological data were undertaken at the host institution were included. Seven MDTMs, consisting of at least one clinician, radiologist, and pathologist, from seven countries (Denmark, France, Italy, Japan, Netherlands, Portugal, and the UK) evaluated cases of diffuse parenchymal lung disease in a two-stage process between Jan 1, and Oct 15, 2015. First, the clinician, radiologist, and pathologist (if lung biopsy was completed) independently evaluated each case, selected up to five differential diagnoses from a choice of diffuse lung diseases, and chose likelihoods (censored at 5% and summing to 100% in each case) for each of their differential diagnoses, without inter-disciplinary consultation. Second, these specialists convened at an MDTM and reviewed all data, selected up to five differential diagnoses, and chose diagnosis likelihoods. We compared inter-observer and inter-MDTM agreements on patient first-choice diagnoses using Cohen's kappa coefficient (κ). We then estimated inter-observer and inter-MDTM agreement on the probability of diagnosis using weighted kappa coefficient (κw). We compared inter-observer and inter-MDTM confidence of patient first-choice diagnosis. Finally, we evaluated the prognostic significance of a

  7. Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.

    PubMed

    Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A

    2016-03-01

    Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
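
    The two-way intraclass correlation mentioned above can be sketched from the usual ANOVA mean squares. The block below is a minimal illustration, not the SPSS procedure the authors ran: the function name `two_way_icc` and the small ratings table are hypothetical, and the formulas are the standard absolute-agreement and consistency forms for single and average measures.

```python
def two_way_icc(ratings):
    """Two-way ICCs for a table with one row per subject and one column per rater."""
    n = len(ratings)        # number of subjects (rows)
    k = len(ratings[0])     # number of raters (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with at least 2 subjects and 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subjects mean square
    msc = ss_cols / (k - 1)                 # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square

    return {
        "absolute_single": (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n),
        "absolute_average": (msr - mse) / (msr + (msc - mse) / n),
        "consistency_single": (msr - mse) / (msr + (k - 1) * mse),
        "consistency_average": (msr - mse) / msr,
    }


# Illustrative scores only (6 subjects x 4 raters); not data from the study.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
for name, value in two_way_icc(example).items():
    print(f"{name}: {value:.2f}")
```

    In this toy table the raters differ systematically in level, so the consistency coefficients come out well above the absolute-agreement ones; coefficients as close as the 0.990 and 0.991 reported above would suggest very little systematic offset between raters.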

  8. Measurement of the center edge angle and determination of the Severin classification using digital radiography, computer-assisted measurement tools, and a Severin algorithm: intraobserver and interobserver reliability revisited.

    PubMed

    Carroll, Kristen L; Murray, Kathleen A; MacLeod, Lynne M; Hennessey, Theresa A; Woiczik, Marcella R; Roach, James W

    2011-06-01

    Numerous studies underscore the poor intraobserver and interobserver reliability of both the center edge angle (CEA) and the Severin classification using plain film measurements. In this study, experienced observers applied a computer-assisted measurement program to determine the CEA in digital pelvic radiographs of adults who had been previously treated for developmental dysplasia of the hip (DDH). Using a teaching aid/algorithm of the Severin classification, the observers then assigned a Severin rating to these hips. Intraobserver and interobserver errors were then calculated on both the CEA measurements and the Severin classifications. Four pediatric orthopaedic surgeons and 1 pediatric radiologist calculated the CEAs using the OrthoView™ planning system and then determined the Severin classification on 41 blinded digital pelvic radiographs. The radiographs were evaluated by each examiner twice, with evaluations separated by 2 months. All examiners reviewed a Severin classification algorithm before making their Severin assignments. The intraobserver and interobserver reliability for both the CEA and the Severin classification were calculated using the intraclass correlation coefficients and Cohen and Fleiss κ scores, respectively. The intraobserver and interobserver reliability for CEA measurement was moderate to almost perfect. When we separated the Severin classification into 3 clinically relevant groups of good (Severin I and II), dysplastic (Severin III), and poor (Severin IV and above), our interobserver reliability neared almost perfect. The Severin classification is an extremely useful and oft-used radiographic measure for the success of DDH treatment. Our research found digital radiography, computer-aided measurement tools, the use of a Severin algorithm, and separating the Severin classification into 3 clinically relevant groups significantly increased the intraobserver and interobserver reliability of both the CEA and Severin classification. This finding will

  9. Evaluating agreement, histological features and relevance of separating pleomorphic and florid lobular carcinoma-in-situ subtypes.

    PubMed

    Singh, Kamaljeet; Paquette, Cherie; Kalife, Elizabeth T; Wang, Yihong; Mangray, Shamlal; Quddus, M Ruhul; Steinhoff, Margaret M

    2018-05-09

    Morphological variants of lobular carcinoma in situ (LCIS) include classical- (CLCIS), pleomorphic- (PLCIS) and florid-type (FLCIS). Treatment guidelines suggest managing PLCIS and FLCIS like ductal carcinoma in situ (DCIS); therefore accurate identification of LCIS subtypes is critical. However, the significance of separating PLCIS from FLCIS is not clear. Also, inter-observer agreement in identifying LCIS subtypes, using contemporary criteria, is not known. We aimed to evaluate inter-observer agreement amongst breast pathologists in diagnosing LCIS subtypes and use the agreement data to justify LCIS classification for management purposes. Six breast pathologists independently reviewed 50 hematoxylin and eosin stained slides comprised of a mix of LCIS subtypes. After reviewing published criteria, participants diagnosed PLCIS, CLCIS and apocrine change in a marked region of interest and FLCIS based on the entire section. PLCIS was identified in 8-37 slides with overall moderate agreement (Fleiss' κ = 0.565) and pairwise κ (Cohen's) ranging from -0.008 to 0.492. FLCIS was diagnosed in 15-26 slides with overall substantial agreement (Fleiss' κ = 0.687) and pairwise κ ranging from -0.068 to 0.706. Both FLCIS and PLCIS coexisted in 45% of slides with consensus on non-classical LCIS. Comedo-type necrosis (odds ratio = 5.5) and apoptosis (odds ratio = 1.8) predicted FLCIS. We found moderate and substantial agreement in diagnosing PLCIS and FLCIS respectively. Objective histological features linked with aggressive behavior were more frequent with FLCIS. PLCIS and FLCIS patterns frequently coexist, contain similar molecular aberrations, and are managed similarly (like DCIS); therefore combining FLCIS and PLCIS into one category (non-classical LCIS) should be considered. Copyright © 2018. Published by Elsevier Inc.
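
    A range of pairwise Cohen's κ values like the one reported above comes from computing κ for every pair of raters (15 pairs for six pathologists). The sketch below shows one way to do that; the helper names `cohen_kappa` and `pairwise_kappas` and the three-rater labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal label frequencies
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when only one label is ever used
    return (p_observed - p_expected) / (1 - p_expected)


def pairwise_kappas(ratings):
    """Map each unordered rater pair to its Cohen's kappa."""
    return {
        (r1, r2): cohen_kappa(ratings[r1], ratings[r2])
        for r1, r2 in combinations(sorted(ratings), 2)
    }


# Hypothetical calls (1 = subtype present, 0 = absent) on four slides
ratings = {
    "A": [1, 1, 0, 0],
    "B": [1, 0, 0, 0],
    "C": [1, 1, 0, 0],
}
kappas = pairwise_kappas(ratings)
print(min(kappas.values()), max(kappas.values()))
```

    Reporting the minimum and maximum over all pairs, as the abstract does, shows how far individual rater pairs can sit from the pooled multi-rater value.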

  10. Agreement and reliability of pelvic floor measurements during contraction using three-dimensional pelvic floor ultrasound and virtual reality.

    PubMed

    Speksnijder, L; Rousian, M; Steegers, E A P; Van Der Spek, P J; Koning, A H J; Steensma, A B

    2012-07-01

    Virtual reality is a novel method of visualizing ultrasound data with the perception of depth and offers possibilities for measuring non-planar structures. The levator ani hiatus has both convex and concave aspects. The aim of this study was to compare levator ani hiatus volume measurements obtained with conventional three-dimensional (3D) ultrasound and with a virtual reality measurement technique and to establish their reliability and agreement. One hundred symptomatic patients visiting a tertiary pelvic floor clinic with a normal intact levator ani muscle diagnosed on translabial ultrasound were selected. Datasets were analyzed using a rendered volume with a slice thickness of 1.5 cm at the level of minimal hiatal dimensions during contraction. The levator area (in cm²) was measured and multiplied by 1.5 to get the levator ani hiatus volume in conventional 3D ultrasound (in cm³). Levator ani hiatus volume measurements were then measured semi-automatically in virtual reality (cm³) using a segmentation algorithm. An intra- and interobserver analysis of reliability and agreement was performed in 20 randomly chosen patients. The mean difference between levator ani hiatus volume measurements performed using conventional 3D ultrasound and virtual reality was 0.10 (95% CI, -0.15 to 0.35) cm³. The intraclass correlation coefficient (ICC) comparing conventional 3D ultrasound with virtual reality measurements was > 0.96. Intra- and interobserver ICCs for conventional 3D ultrasound measurements were > 0.94 and for virtual reality measurements were > 0.97, indicating good reliability for both. Levator ani hiatus volume measurements performed using virtual reality were reliable and the results were similar to those obtained with conventional 3D ultrasonography. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.

  11. Observer Agreement for Measurements in Videolaryngostroboscopy.

    PubMed

    Brunings, Jan Wouter; Vanbelle, Sophie; Akkermans, Annemarie; Heemskerk, Nienke M M; Kremer, Bernd; Stokroos, Robert J; Baijens, Laura W J

    2017-11-06

    This study evaluated the levels of intraobserver and interobserver agreement for measurements of visuoperceptual variables in videolaryngostroboscopic examinations and compared the observers' behavior during independent versus consensus panel rating. This is a retrospective study. This study was conducted in a single-center tertiary care facility. Sixty-four patients with dysphonia of heterogeneous etiology were included. All subjects underwent a standardized videolaryngostroboscopic examination. Two experienced and trained observers scored exactly the same examinations, first independently and then on a consensus panel. Specific visuoperceptual variables and the clinical diagnosis (as recommended by the Committee on Phoniatrics and the Phonosurgery Committee of the European Laryngological Society and advised by the American Speech-Language-Hearing Association) were scored. Descriptive and kappa statistics were used. In general, intraobserver agreement was better than agreement between observers for measurements of several variables. The intrapanel observer agreement levels were slightly higher than the intraobserver agreement levels on the independent rating task. When rating on the consensus panel, the observers deviated considerably from the scores they had previously given on the independent rating task. Observer agreement in videolaryngostroboscopic assessment has important implications not only for the diagnosis and treatment of dysphonic patients but also for the interpretation of the results of scientific studies using videolaryngostroboscopic outcome parameters. The identification of factors that can influence the levels of observer agreement can provide a better understanding of the rating process and its limitations. The results of this study suggest that future research could achieve better agreement levels by rating the visuoperceptual variables in a panel setting. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  12. Inter-Observer Agreement on Subjects' Race and Race-Informative Characteristics

    PubMed Central

    Edgar, Heather J. H.; Daneshvari, Shamsi; Harris, Edward F.; Kroth, Philip J.

    2011-01-01

    Health and socioeconomic disparities tend to be experienced along racial and ethnic lines, but investigators are not sure how individuals are assigned to groups, or how consistent this process is. To address these issues, 1,919 orthodontic patient records were examined by at least two observers who estimated each individual's race and the characteristics that influenced each estimate. Agreement regarding race is high for African and European Americans, but not as high for Asian, Hispanic, and Native Americans. The indicator observers most often agreed upon as important in estimating group membership is name, especially for Asian and Hispanic Americans. The observers, who were almost all European American, most often agreed that skin color is an important indicator of race only when they also agreed the subject was European American. This suggests that in a diverse community, light skin color is associated with a particular group, while a range of darker shades can be associated with members of any other group. This research supports comparable studies showing that race estimations in medical records are likely reliable for African and European Americans, but are less so for other groups. Further, these results show that skin color is not consistently the primary indicator of an individual's race, but that other characteristics such as facial features add significant information. PMID:21897865

  13. Wheezes, crackles and rhonchi: simplifying description of lung sounds increases the agreement on their classification: a study of 12 physicians' classification of lung sounds from video recordings

    PubMed Central

    Melbye, Hasse; Garcia-Marcos, Luis; Brand, Paul; Everard, Mark; Priftis, Kostas; Pasterkamp, Hans

    2016-01-01

    Background: The European Respiratory Society (ERS) lung sounds repository contains 20 audiovisual recordings of children and adults. The present study aimed at determining the interobserver variation in the classification of sounds into detailed and broader categories of crackles and wheezes. Methods: Recordings from 10 children and 10 adults were classified into 10 predefined sounds by 12 observers, 6 paediatricians and 6 doctors for adult patients. Multirater kappa (Fleiss' κ) was calculated for each of the 10 adventitious sounds and for combined categories of sounds. Results: The majority of observers agreed on the presence of at least one adventitious sound in 17 cases. Poor to fair agreement (κ<0.40) was usually found for the detailed descriptions of the adventitious sounds, whereas moderate to good agreement was reached for the combined categories of crackles (κ=0.62) and wheezes (κ=0.59). The paediatricians did not reach better agreement on the child cases than the family physicians and specialists in adult medicine. Conclusions: Descriptions of auscultation findings in broader terms were more reliably shared between observers compared to more detailed descriptions. PMID:27158515
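
    Multirater (Fleiss') κ, as used above, works on a table that counts how many observers chose each category for each recording. The following is a minimal sketch under that assumption; the function name `fleiss_kappa` and the toy counts (three raters, two categories) are hypothetical and much smaller than the 12-observer design of the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects x categories table of rater counts.

    counts[i][j] is the number of raters who put subject i in category j;
    every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Per-subject agreement: share of rater pairs that agree on that subject
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall share of ratings in each category
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        return float("nan")  # undefined when every rating falls in one category
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical table: 4 recordings, 3 raters, columns = (sound present, sound absent)
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(round(fleiss_kappa(toy_counts), 3))
```

    Merging detailed categories into broader ones amounts to summing the corresponding columns before calling the function, which is one way to compare agreement on detailed versus combined sound categories.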

  14. Evaluating performance of a user-trained MR lung tumor autocontouring algorithm in the context of intra- and interobserver variations.

    PubMed

    Yip, Eugene; Yun, Jihyun; Gabos, Zsolt; Baker, Sarah; Yee, Don; Wachowicz, Keith; Rathee, Satyapal; Fallone, B Gino

    2018-01-01

    , that are comparable to multiple human experts (several seconds per contour), but at a much faster speed. At the same time, the agreement between autocontours and manual contours is comparable to the intra- and interobserver variations. This algorithm may be a key component of the real time tumor tracking workflow for our hybrid Linac-MR device in the future. © 2017 American Association of Physicists in Medicine.

  15. SU-E-T-509: Inter-Observer and Inter-Modality Contouring Analysis for Organs at Risk for HDR Gynecological Brachytherapy

    SciTech Connect

    Sadeghi, P; Smith, W; Tom Baker Cancer Centre, Calgary, AB

    2015-06-15

    Purpose: This study quantifies errors associated with MR-guided High Dose Rate (HDR) gynecological brachytherapy. Uncertainties in this treatment result from contouring, organ motion between imaging and treatment delivery, dose calculation, and dose delivery. We focus on interobserver and inter-modality variability in contouring and the motion of organs at risk (OARs) in the time span between the MR and CT scans (∼1 hour). We report the change in organ volume and position of center of mass (CM) between the two imaging modalities. Methods: A total of 8 patients treated with MR-guided HDR brachytherapy were included in this study. Two observers contoured the bladder and rectum on both MR and CT scans. The change in OAR volume and CM position between the MR and CT imaging sessions on both image sets were calculated. Results: The absolute mean bladder volume change between the two imaging modalities is 67.1cc. The absolute mean inter-observer difference in bladder volume is much lower at 15.5cc (MR) and 11.0cc (CT). This higher inter-modality volume difference suggests a real change in the bladder filling between the two imaging sessions. Change in rectum volume inter-observer standard error of means (SEM) is 3.18cc (MR) and 3.09cc (CT), while the inter-modality SEM is 3.65cc (observer 1), and 2.75cc (observer 2). The SEM for rectum CM position in the superior-inferior direction was approximately three times higher than in other directions for both the inter-observer (0.77 cm, 0.92 cm for observers 1 and 2, respectively) and inter-modality (0.91 cm, 0.95 cm for MR and CT, respectively) variability. Conclusion: Bladder contours display good consistency between different observers on both CT and MR images. For rectum contouring the highest inconsistency stems from the observers’ choice of the superior-inferior borders. A complete analysis of a larger patient cohort will enable us to separate the true organ motion from the inter-observer variability.

  16. High interobserver variability in the assessment of epsilon waves: Implications for diagnosis of arrhythmogenic right ventricular cardiomyopathy/dysplasia.

    PubMed

    Platonov, Pyotr G; Calkins, Hugh; Hauer, Richard N; Corrado, Domenico; Svendsen, Jesper H; Wichter, Thomas; Biernacka, Elżbieta Katarzyna; Saguner, Ardan M; Te Riele, Anneline S J M; Zareba, Wojciech

    2016-01-01

    Revision of the Task Force diagnostic criteria for arrhythmogenic right ventricular cardiomyopathy/dysplasia (ARVC/D) has increased their sensitivity for the diagnosis of early and familial forms of the disease. The epsilon wave is a major diagnostic criterion in the context of ARVC/D, which, however, remains not quantifiable and therefore may leave room for substantial subjective interpretation. The purpose of this study was to assess interobserver agreement in epsilon wave definition and epsilon wave importance for ARVC/D diagnosis. Electrocardiographic (ECG) tracings depicting leads V1, V2, and V3 collected from individuals evaluated for ARVC/D (n = 30) were given to panel members who were asked to respond to the question whether ECG patterns meet epsilon wave definition outlined by the Task Force diagnostic criteria. The prevalence and importance of epsilon waves for ARVC/D diagnosis were assessed in a pooled data set of patients with definite ARVC/D from European and American registries (n = 815). The number of ECG patterns identified as epsilon waves varied from 5 to 18 per reviewer (median 13 per reviewer). A unanimous agreement was reached for only 10 cases (33%), 2 of which qualified as epsilon waves and 8 as non-epsilon waves by all panel members. From a pooled data set, 106 patients reportedly had epsilon waves (13%). In 105 of 106 patients with epsilon waves (99%), exclusion of epsilon waves from the diagnostic score would not affect the "definite" diagnostic category. Interobserver variability in the assessment of epsilon waves is high; however, the impact of epsilon waves on ARVC/D diagnosis is negligibly low. The results urge to exercise caution in the assessment of epsilon waves, especially in patients who would not otherwise meet diagnostic criteria. Copyright © 2016 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.

  17. Inter-observer variance with the diagnosis of myelodysplastic syndromes (MDS) following the 2008 WHO classification.

    PubMed

    Font, P; Loscertales, J; Benavente, C; Bermejo, A; Callejas, M; Garcia-Alonso, L; Garcia-Marcilla, A; Gil, S; Lopez-Rubio, M; Martin, E; Muñoz, C; Ricard, P; Soto, C; Balsalobre, P; Villegas, A

    2013-01-01

    Morphology is the basis of the diagnosis of myelodysplastic syndromes (MDS). The WHO classification offers prognostic information and helps with the treatment decisions. However, morphological changes are subject to potential inter-observer variance. The aim of our study was to explore the reliability of the 2008 WHO classification of MDS, reviewing 100 samples previously diagnosed with MDS using the 2001 WHO criteria. Specimens were collected from 10 hospitals and were evaluated by 10 morphologists, working in five pairs. Each observer evaluated 20 samples, and each sample was analyzed independently by two morphologists. The second observer was blinded to the clinical and laboratory data, except for the peripheral blood (PB) counts. Nineteen cases were considered as unclassified MDS (MDS-U) by the 2001 WHO classification, but only three remained as MDS-U by the 2008 WHO proposal. Discordance was observed in 26 of the 95 samples considered suitable (27 %). Although there were a high number of observers taking part, the rate of discordance was quite similar among the five pairs. The inter-observer concordance was very good regarding refractory anemia with excess blasts type 1 (RAEB-1) (10 of 12 cases, 84 %), RAEB-2 (nine of 10 cases, 90 %), and also good regarding refractory cytopenia with multilineage dysplasia (37 of 50 cases, 74 %). However, the categories with unilineage dysplasia were not reproducible in most of the cases. The rate of concordance with refractory cytopenia with unilineage dysplasia was 40 % (two of five cases) and 25 % with RA with ring sideroblasts (two of eight). Our results show that the 2008 WHO classification gives a more accurate stratification of MDS but also illustrates the difficulty in diagnosing MDS with unilineage dysplasia.
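
    The percentages in this abstract can be re-derived from the counts it states. The short check below uses only the numbers quoted above; the dictionary layout and the helper name `percent` are just an illustrative way to organise them.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100 * numerator / denominator


# (cases, total, percentage as stated in the abstract)
stated = {
    "discordant samples": (26, 95, 27),
    "RAEB-1 concordance": (10, 12, 84),
    "RAEB-2 concordance": (9, 10, 90),
    "multilineage dysplasia concordance": (37, 50, 74),
    "unilineage dysplasia concordance": (2, 5, 40),
    "ring sideroblasts concordance": (2, 8, 25),
}

for label, (cases, total, reported) in stated.items():
    print(f"{label}: {cases}/{total} = {percent(cases, total):.1f}% (stated: {reported}%)")
```

    All values agree with the stated percentages after rounding except RAEB-1, where 10 of 12 works out to about 83.3% rather than the 84% given.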

  18. Quantitative blood flow measurements in gliomas using arterial spin-labeling at 3T: intermodality agreement and inter- and intraobserver reproducibility study.

    PubMed

    Hirai, T; Kitajima, M; Nakamura, H; Okuda, T; Sasao, A; Shigematsu, Y; Utsunomiya, D; Oda, S; Uetani, H; Morioka, M; Yamashita, Y

    2011-12-01

    QUASAR is a particular application of the ASL method and facilitates the user-independent quantification of brain perfusion. The purpose of this study was to assess the intermodality agreement of TBF measurements obtained with ASL and DSC MR imaging and the inter- and intraobserver reproducibility of glioma TBF measurements acquired by ASL at 3T. Two observers independently measured TBF in 24 patients with histologically proved glioma. ASL MR imaging with QUASAR and DSC MR imaging were performed on 3T scanners. The observers placed 5 regions of interest in the solid tumor on rCBF maps derived from ASL and DSC MR images and 1 region of interest in the contralateral brain and recorded the measured values. Maximum and average sTBF values were calculated. Intermodality and intra- and interobserver agreement were determined by using 95% Bland-Altman limits of agreement and ICCs. The intermodality agreement for maximum sTBF was good to excellent on DSC and ASL images; ICCs ranged from 0.718 to 0.884. The 95% limits of agreement ranged from 59.2% to 65.4% of the mean. ICCs for intra- and interobserver agreement for maximum sTBF ranged from 0.843 to 0.850 and from 0.626 to 0.665, respectively. The reproducibility of maximum sTBF measurements obtained by the two methods was similar. In the evaluation of sTBF in gliomas, ASL with QUASAR at 3T yielded measurements and reproducibility similar to those of DSC perfusion MR imaging.
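
    The 95% Bland-Altman limits of agreement referred to above are conventionally the mean paired difference plus or minus 1.96 standard deviations of the differences. The sketch below follows that convention; the function name `bland_altman`, the sample values, and the choice to express the half-width as a percentage of the overall mean are assumptions, since the abstract does not spell out how its percentages were derived.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Bias and 95% limits of agreement for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    grand_mean = mean([(x + y) / 2 for x, y in zip(a, b)])
    return {
        "bias": bias,
        "lower": bias - half_width,
        "upper": bias + half_width,
        # Assumed reading of "limits as a percentage of the mean"
        "half_width_pct_of_mean": 100 * half_width / grand_mean,
    }


# Hypothetical paired readings from two methods; not data from the study
method_1 = [10, 12, 14, 16]
method_2 = [9, 13, 13, 17]
print(bland_altman(method_1, method_2))
```

    Unlike an ICC, the limits of agreement are in the units of the measurement (or a percentage of it), so they show how far two readings of the same tumor may plausibly differ.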

  19. Accuracy of contrast-enhanced spectral mammography for estimating residual tumor size after neoadjuvant chemotherapy in patients with breast cancer: a feasibility study

    PubMed Central

    Barra, Filipe Ramos; de Souza, Fernanda Freire; Camelo, Rosimara Eva Ferreira Almeida; Ribeiro, Andrea Campos de Oliveira; Farage, Luciano

    2017-01-01

    Objective: To assess the feasibility of contrast-enhanced spectral mammography (CESM) of the breast for assessing the size of residual tumors after neoadjuvant chemotherapy (NAC). Materials and methods: In breast cancer patients who underwent NAC between 2011 and 2013, we evaluated residual tumor measurements obtained with CESM and full-field digital mammography (FFDM). We determined the concordance between the methods, as well as their level of agreement with the pathology. Three radiologists analyzed eight CESM and FFDM measurements separately, considering the size of the residual tumor at its largest diameter and correlating it with that determined in the pathological analysis. Interobserver agreement was also evaluated. Results: The sensitivity, specificity, positive predictive value, and negative predictive value were higher for CESM than for FFDM (83.33%, 100%, 100%, and 66% vs. 50%, 50%, 50%, and 25%, respectively). The CESM measurements showed a strong, consistent correlation with the pathological findings (correlation coefficient = 0.76-0.92; intraclass correlation coefficient = 0.692-0.886). The correlation between the FFDM measurements and the pathological findings was not statistically significant, with questionable consistency (intraclass correlation coefficient = 0.488-0.598). Agreement with the pathological findings was narrower for CESM measurements than for FFDM measurements. Interobserver agreement was higher for CESM than for FFDM (0.94 vs. 0.88). Conclusion: CESM is a feasible means of evaluating residual tumor size after NAC, showing a good correlation and good agreement with pathological findings. For CESM measurements, the interobserver agreement was excellent. PMID:28894329
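
    Sensitivity, specificity, and the predictive values all follow from a 2x2 table of test result against the reference standard. The sketch below shows the arithmetic; the function name `diagnostic_metrics` is hypothetical, and the counts used (5 true positives, 0 false positives, 1 false negative, 2 true negatives) are an inferred illustration that happens to reproduce the CESM percentages for eight cases, not figures given in the abstract.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Basic accuracy measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly detected
        "specificity": tn / (tn + fp),  # non-diseased cases correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Assumed counts for illustration only (8 cases in total)
metrics = diagnostic_metrics(tp=5, fp=0, fn=1, tn=2)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

    With so few cases, moving a single patient between cells shifts these percentages by many points, which is worth keeping in mind when reading the comparison with FFDM.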

  20. Accuracy of contrast-enhanced spectral mammography for estimating residual tumor size after neoadjuvant chemotherapy in patients with breast cancer: a feasibility study.

    PubMed

    Barra, Filipe Ramos; de Souza, Fernanda Freire; Camelo, Rosimara Eva Ferreira Almeida; Ribeiro, Andrea Campos de Oliveira; Farage, Luciano

    2017-01-01

    To assess the feasibility of contrast-enhanced spectral mammography (CESM) of the breast for assessing the size of residual tumors after neoadjuvant chemotherapy (NAC). In breast cancer patients who underwent NAC between 2011 and 2013, we evaluated residual tumor measurements obtained with CESM and full-field digital mammography (FFDM). We determined the concordance between the methods, as well as their level of agreement with the pathology. Three radiologists analyzed eight CESM and FFDM measurements separately, considering the size of the residual tumor at its largest diameter and correlating it with that determined in the pathological analysis. Interobserver agreement was also evaluated. The sensitivity, specificity, positive predictive value, and negative predictive value were higher for CESM than for FFDM (83.33%, 100%, 100%, and 66% vs. 50%, 50%, 50%, and 25%, respectively). The CESM measurements showed a strong, consistent correlation with the pathological findings (correlation coefficient = 0.76-0.92; intraclass correlation coefficient = 0.692-0.886). The correlation between the FFDM measurements and the pathological findings was not statistically significant, with questionable consistency (intraclass correlation coefficient = 0.488-0.598). Agreement with the pathological findings was narrower for CESM measurements than for FFDM measurements. Interobserver agreement was higher for CESM than for FFDM (0.94 vs. 0.88). CESM is a feasible means of evaluating residual tumor size after NAC, showing a good correlation and good agreement with pathological findings. For CESM measurements, the interobserver agreement was excellent.

  1. Prospective Validation of Intra- and Interobserver Reproducibility of a New Point Shear Wave Elastographic Technique for Assessing Liver Stiffness in Patients with Chronic Liver Disease.

    PubMed

    Ahn, Su Joa; Lee, Jeong Min; Chang, Won; Lee, Sang Min; Kang, Hyo-Jin; Yang, Hyunkyung; Yoon, Jeong Hee; Park, Sae Jin; Han, Joon Koo

    2017-01-01

    To assess intra- and inter-observer reproducibility of a new point shear wave elastography technique (pSWE, S-Shearwave, Samsung Medison) and compare its accuracy in assessing liver stiffness (LS) with an established pSWE technique (Virtual Touch Quantification, VTQ). Thirty-three patients were enrolled in this Institutional Review Board-approved prospective study. LS values were measured by VTQ on an Acuson S2000 system (Siemens Healthineer) and S-Shearwave on an RS-80A (Samsung Medison) in the same session, followed by two further S-Shearwave sessions for inter- and intra-observer variation at 8-hour intervals. The technical success rate (SR) and reliability of the measurements of both pSWE techniques were compared. The intra- and inter-observer reproducibility of S-Shearwave was determined by intraclass correlation coefficients (ICCs). LS values were measured by both methods of pSWE. The diagnostic performance in severe fibrosis (F ≥ 3) and cirrhosis (F = 4) was evaluated using the receiver operating characteristics curve analysis and the Obuchowski measure with the LS values of transient elastography as the referenced standard. The VTQ (100%, 33/33) and S-Shearwave (96.9%, 32/33) techniques did not display a significant difference in technical SR ( p = 0.63) or reliability of LS measurements (96.9%, 32/33; 93.9%, 30/32, respectively, p = 0.61). The inter- and intra-observer agreement for LS measurements using the S-Shearwave technique was excellent (ICC = 0.98 and 0.99, respectively). The mean LS values of both pSWE techniques were not significantly different and exhibited a good correlation (r = 0.78). To detect F ≥ 3 and F = 4, VTQ and S-Shearwave showed comparable diagnostic accuracy as indicated by the following outcomes: areas under receiver operating characteristics curve (AUROC) = 0.87 (95% confidence intervals [CI] 0.70-0.96), 0.89 for VTQ (95% CI 0.74-0.97), respectively; and AUROC = 0.84 (95% CI 0.67-0.94), 0.94 (95% CI 0.80-0.99) for S

  2. Interobserver reliability of video recording in the diagnosis of nocturnal frontal lobe seizures.

    PubMed

    Vignatelli, Luca; Bisulli, Francesca; Provini, Federica; Naldi, Ilaria; Pittau, Francesca; Zaniboni, Anna; Montagna, Pasquale; Tinuper, Paolo

    2007-08-01

    Nocturnal frontal lobe seizures (NFLS) show one or all of the following semeiological patterns: (1) paroxysmal arousals (PA: brief and sudden recurrent motor paroxysmal behavior); (2) hyperkinetic seizures (HS: motor attacks with complex dyskinetic features); (3) asymmetric bilateral tonic seizures (ATS: motor attacks with dystonic features); (4) epileptic nocturnal wanderings (ENW: stereotyped, prolonged ambulatory behavior). To estimate the interobserver reliability (IR) of video-recording diagnosis in patients with suspected NFLS among sleep medicine experts, epileptologists, and trainees in sleep medicine. Sixty-six patients with suspected NFLS were included. All underwent nocturnal video-polysomnographic recording. Six doctors (three experts and three trainees) independently classified each case as "NFLS ascertained" (according to the above specified subtypes: PA, HS, ATS, ENW) or "NFLS excluded". IR was calculated by means of Kappa statistics, and interpreted according to the standard classification (0.0-0.20 = slight agreement; 0.21-0.40 = fair; 0.41-0.60 = moderate; 0.61-0.80 = substantial; 0.81-1.00 = almost perfect). The observed raw agreement ranged from 63% to 79% between each pair of raters; the IR ranged from "moderate" (kappa = 0.50) to "substantial" (kappa = 0.72). A major source of variance was the disagreement in distinguishing between PA and nonepileptic arousals, without differences in the level of agreement between experts and trainees. Among sleep experts and trainees, IR of diagnosis of NFLS, based on videotaped observation of sleep phenomena, is not satisfactory. Explicit video-polysomnographic criteria for the classification of paroxysmal sleep motor phenomena are needed.
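    The kappa bands quoted in the abstract above can be made concrete with a short sketch. It computes Cohen's kappa for one pair of raters and maps the result onto those bands. The function names and the eight example ratings are hypothetical and are not taken from the study; the "below chance" label for negative values is also an assumption, since the quoted scale starts at 0.0.

```python
from collections import Counter

# Bands as quoted in the abstract: (upper bound, label).
BANDS = [
    (0.20, "slight"),
    (0.40, "fair"),
    (0.60, "moderate"),
    (0.80, "substantial"),
    (1.00, "almost perfect"),
]


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Map a kappa value onto the bands quoted in the abstract."""
    if kappa < 0.0:
        return "below chance"  # assumption: outside the quoted scale
    for upper, label in BANDS:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1.0")


# Hypothetical ratings for eight cases (not data from the study).
rater_1 = ["NFLS", "NFLS", "NFLS", "NFLS", "excluded", "excluded", "excluded", "excluded"]
rater_2 = ["NFLS", "NFLS", "NFLS", "excluded", "excluded", "excluded", "excluded", "NFLS"]

kappa = cohen_kappa(rater_1, rater_2)
print(kappa, interpret_kappa(kappa))
```

    With these made-up ratings the raw agreement is 6 of 8 (75%) while kappa is only 0.5, which illustrates why the study reports both raw agreement and kappa.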

  3. Inter-method agreement in retinal blood vessels diameter analysis between Dynamic Vessel Analyzer and optical coherence tomography.

    PubMed

    Benatti, Lucia; Corvi, Federico; Tomasso, Livia; Mercuri, Stefano; Querques, Lea; Ricceri, Fulvio; Bandello, Francesco; Querques, Giuseppe

    2017-06-01

    To analyze the inter-methods agreement in arteriovenous ratio (AVR) evaluation between spectral-domain optical coherence tomography (SD-OCT) and Dynamic Vessel Analyzer (DVA). Healthy volunteers underwent DVA and SD-OCT examination. AVR was measured by SD-OCT using the four external lines of the optic nerve head-centered 7-line cube and by DVA using an automated AVR estimation. The mean AVR was calculated, twice, separately by two independent readers for each tool. Twenty-two eyes of 11 healthy subjects (five women and six men, mean age 35) were included. AVR analysis by DVA showed high inter-observer agreement between reader 1 and 2, and high intra-observer agreement for both reader 1 and reader 2. With regard to AVR analysis on SD-OCT, we found high inter-observer agreement between reader 1 and 2, and low intra-observer agreement for reader 2 but high intra-observer agreement for reader 1. Overall, the mean AVR measured on SD-OCT turned out to be significantly higher than mean AVR measured through DVA (reader 1, 0.9023 ± 0.06 vs 0.8036 ± 0.08; p < 0.001, and reader 2, 0.9067 ± 0.06 vs 0.8083 ± 0.05; p = 0.003). No inter-method agreement in AVR could be detected in the present study due to bias in measurements (shift between DVA and SD-OCT). We found a significant difference between the two noninvasive methods for AVR measurement, with a tendency for SD-OCT to overestimate retinal vascular caliber in comparison to DVA. This may be useful for achieving greater accuracy in the evaluation of retinal vessels in ocular as well as systemic diseases.

  4. Pre- and postoperative radiotherapy for extremity soft tissue sarcoma: Evaluation of inter-observer target volume contouring variability among French sarcoma group radiation oncologists.

    PubMed

    Sargos, P; Charleux, T; Haas, R L; Michot, A; Llacer, C; Moureau-Zabotto, L; Vogin, G; Le Péchoux, C; Verry, C; Ducassou, A; Delannes, M; Mervoyer, A; Wiazzane, N; Thariat, J; Sunyach, M P; Benchalal, M; Laredo, J D; Kind, M; Gillon, P; Kantor, G

    2018-04-01

    The purpose of this study was to evaluate, during a national workshop, the inter-observer variability in target volume delineation for primary extremity soft tissue sarcoma radiation therapy. Six expert sarcoma radiation oncologists (members of French Sarcoma Group) received two extremity soft tissue sarcoma radiation therapy cases: one preoperative and one postoperative. The cases were distributed with instructions to contour the gross tumour volume or reconstructed gross tumour volume and the clinical target volume, and to propose a planning target volume. The preoperative radiation therapy case was a patient with a grade 1 extraskeletal myxoid chondrosarcoma of the thigh. The postoperative case was a patient with a grade 3 pleomorphic undifferentiated sarcoma of the thigh. Contour agreement analysis was performed using kappa statistics. For the preoperative case, contouring agreement regarding gross tumour volume, clinical target volume and planning target volume was substantial (kappa between 0.68 and 0.77). In the postoperative case, the agreement was only fair for reconstructed gross tumour volume (kappa: 0.38) but moderate for clinical target volume and planning target volume (kappa: 0.42). During the workshop discussion, consensus was reached on most of the contour divergences, especially clinical target volume longitudinal extension. The determination of a limited cutaneous cover was also discussed. Accurate delineation of target volume appears to be a crucial element to ensure multicenter clinical trial quality assessment, reproducibility and homogeneity in delivering radiation therapy. A quality assessment process should be proposed in this setting. We have shown in our study that preoperative radiation therapy of extremity soft tissue sarcoma has less inter-observer contouring variability. Copyright © 2018 Société française de radiothérapie oncologique (SFRO). Published by Elsevier SAS. All rights reserved.

  5. Interobserver variability of ultrasound elastography and the ultrasound BI-RADS lexicon of breast lesions.

    PubMed

    Park, Chang Suk; Kim, Sung Hun; Jung, Na Young; Choi, Jae Jung; Kang, Bong Joo; Jung, Hyun Seouk

    2015-03-01

    Elastography is a newly developed noninvasive imaging technique that uses ultrasound (US) to evaluate tissue stiffness. The interpretation of the same elastographic images may vary between reviewers. Because breast lesions are usually reported according to American College of Radiology Breast Imaging Reporting and Data System (ACR BI-RADS) lexicons and final category, we tried to compare observer variability between lexicons and final categorization of US BI-RADS and the elasticity score of US elastography. From April 2009 to February 2010, 1356 breast lesions in 1330 patients underwent ultrasound-guided core biopsy. Among them, 63 breast lesions in 55 patients (mean age, 45.7 years; range, 21-79 years) underwent both conventional ultrasound and elastography and were included in this study. Two radiologists independently performed conventional ultrasound and elastography, and another three observers reviewed conventional ultrasound images and elastography videos. Observers independently recorded the elasticity score for a 5-point scoring system proposed by Itoh et al., BI-RADS lexicons and final category using ultrasound BI-RADS. The histopathologic results were obtained and used as the reference standard. Interobserver variability was evaluated. Of the 63 lesions, 42 (66.7%) were benign, and 21 (33.3%) were malignant. The highest value of concordance among all variables was achieved for the elasticity score (k = 0.59), followed by shape (k = 0.54), final category (k = 0.48), posterior acoustic features (k = 0.44), echogenicity and orientation (k = 0.43). The lowest concordances were for margin (k = 0.26), lesion boundary (k = 0.29) and calcification (k = 0.3). Elasticity score showed a higher level of interobserver agreement for the diagnosis of breast lesions than BI-RADS lexicons and final category.

  6. Improving agreement in assessment of synovitis in rheumatoid arthritis.

    PubMed

    Cheung, Peter P; Dougados, Maxime; Andre, Vincent; Balandraud, Nathalie; Chales, Gérard; Chary-Valckenaere, Isabelle; Chatelus, Emmanuel; Dernis, Emmanuelle; Gill, Ghislaine; Gilson, Mélanie; Guis, Sandrine; Mouterde, Gael; Pavy, Stephan; Pouyol, François; Marhadour, Thierry; Richette, Pascal; Ruyssen-Witrand, Adeline; Soubrier, Martin; Gossec, Laure

    2013-03-01

    Synovitis assessment through evaluation of swollen joints is integral in steering treatment decisions in rheumatoid arthritis (RA). However, there is high inter-observer variation. The objective was to assess whether a short collegiate consensus would improve swollen joint agreement between rheumatologists and whether this was affected by experience. Eighteen rheumatologists from French university rheumatology units participated in three 30-minute rounds over a half-day meeting, evaluating joint counts of RA patients in small groups, followed by short consensus discussions. Agreement was evaluated at the end of each round as follows: (i) global agreement of swollen joints, (ii) swollen joint agreement according to level of experience of the rheumatologist, (iii) swollen joint count and (iv) agreement of disease activity state according to the Disease Activity Score (DAS28). Agreement was calculated using percentage agreement and kappa. Global agreement of swollen joints failed to improve (kappa 0.50 to 0.52) at the joint level. Agreement between seniors did not improve, but agreement between newly qualified rheumatologists and their senior peer, which was initially poor (kappa 0.28), improved significantly (to 0.54) at the end of the consensus exercises. Concordance of DAS28 activity states improved from 71% to 87%. Consensus exercises for swollen joint assessment are worthwhile and may potentially improve agreement between clinicians in clinical synovitis and disease activity state; benefit was mostly observed in newly qualified rheumatologists. Copyright © 2012 Société française de rhumatologie. Published by Elsevier SAS. All rights reserved.

  7. Good agreement between smart device and inertial sensor-based gait parameters during a 6-min walk.

    PubMed

    Proessl, F; Swanson, C W; Rudroff, T; Fling, B W; Tracy, B L

    2018-05-28

    Traditional laboratory-based kinetic and kinematic gait analyses are expensive, time-intensive, and impractical for clinical settings. Inertial sensors have gained popularity in gait analysis research and more recently smart devices have been employed to provide quantification of gait. However, no study to date has investigated the agreement between smart device and inertial sensor-based gait parameters during prolonged walking. Compare spatiotemporal gait metrics measured with a smart device versus previously validated inertial sensors. Twenty neurologically healthy young adults (7 women; age: 25.0 ± 3.7 years; BMI: 23.4 ± 2.9 kg/m²) performed a 6-min walk test (6MWT) wearing inertial sensors and smart devices to record stride duration, stride length, cadence, and gait speed. Pearson correlations were used to assess associations between spatiotemporal measures from the two devices and agreement between the two methods was assessed with Bland-Altman plots and limits of agreement. All spatiotemporal gait metrics (stride duration, cadence, stride length and gait speed) showed strong (r > 0.9) associations and good agreement between the two devices. Smart devices are capable of accurately reflecting many of the spatiotemporal gait metrics of inertial sensors. As the smart devices also accurately reflected individual leg output, future studies may apply this analytical strategy to clinical populations, to identify hallmarks of disability status and disease progression in a more ecologically valid environment. Copyright © 2018. Published by Elsevier B.V.
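    The Bland-Altman limits of agreement mentioned in the abstract above are conventionally the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The sketch below assumes that convention; the abstract does not state the exact multiplier used. The function name and the stride-duration values are hypothetical, not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    # Per-subject difference between the two measurement methods.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)    # systematic offset between the methods
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical stride durations in seconds for four walkers.
smart_device = [1.10, 1.05, 1.12, 1.08]
inertial_sensor = [1.08, 1.06, 1.10, 1.09]

bias, lower, upper = bland_altman(smart_device, inertial_sensor)
print(f"bias={bias:.3f} s, limits of agreement=({lower:.3f}, {upper:.3f}) s")
```

    A high Pearson correlation alone does not rule out a constant offset between devices, which is why the bias and limits are reported alongside r.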

  8. Interobserver and Intraobserver Agreement on Qualitative Assessments of Right Ventricular Dysfunction With Echocardiography in Patients With Pulmonary Embolism.

    PubMed

    Weekes, Anthony J; Oh, Laura; Thacker, Gregory; Johnson, Angela K; Runyon, Michael; Rose, Geoffrey; Johnson, Thomas; Templin, Megan; Norton, H James

    2016-10-01

    To evaluate observer agreement using qualitative goal-directed echocardiographic criteria for right ventricular (RV) dysfunction prognostication in submassive pulmonary embolism (PE). Two emergency physicians and 2 cardiologists independently reviewed 31 packets of goal-directed echocardiographic video clips consisting of at least 3 windows obtained by emergency physicians from normotensive patients with PE. Nine packets were repeated to assess for intraobserver agreement. Right ventricular dysfunction criteria on goal-directed echocardiography were as follows: RV enlargement was present, with a right-to-left ventricular basal diameter ratio of 1.0 or higher and blunting of the apex of the RV in 2 or more different windows; RV systolic dysfunction was present if the tricuspid annulus moved toward the apex 10 mm or less and there was RV free wall hypokinesis; and septal deviation was present with any flattening or deviation of the ventricular septum toward the left ventricle. Among the 4 participants, there was 83.9% agreement on the presence or absence of RV enlargement (κ = 0.84), 74.2% agreement on the presence or absence of RV systolic dysfunction (κ = 0.69), and 71.0% agreement on the presence or absence of septal deviation (κ = 0.59). Intraobserver agreement was 100% for each RV dysfunction variable for each observer (κ = 1.0). Agreement was substantial for both severe RV enlargement and RV systolic dysfunction and moderate for septal deviation. Right ventricular dysfunction assessment with qualitative goal-directed echocardiographic criteria is reproducible for PE risk stratification.

  9. Agreement and accuracy using the FIGO, ACOG and NICE cardiotocography interpretation guidelines.

    PubMed

    Santo, Susana; Ayres-de-Campos, Diogo; Costa-Santos, Cristina; Schnettler, William; Ugwumadu, Austin; Da Graça, Luís M

    2017-02-01

    One of the limitations reported with cardiotocography is the modest interobserver agreement observed in tracing interpretation. This study compared agreement, reliability and accuracy of cardiotocography interpretation using the International Federation of Gynecology and Obstetrics, American College of Obstetrics and Gynecology and National Institute for Health and Care Excellence guidelines. A total of 151 tracings were evaluated by 27 clinicians from three centers where International Federation of Gynecology and Obstetrics, American College of Obstetrics and Gynecology and National Institute for Health and Care Excellence guidelines were routinely used. Interobserver agreement was evaluated using the proportions of agreement and reliability with the κ statistic. The accuracy of tracings classified as "pathological/category III" was assessed for prediction of newborn acidemia. For all measures, 95% confidence intervals were calculated. Cardiotocography classifications were more distributed with International Federation of Gynecology and Obstetrics (9, 52, 39%) and National Institute for Health and Care Excellence (30, 33, 37%) than with American College of Obstetrics and Gynecology (13, 81, 6%). The category with the highest agreement was American College of Obstetrics and Gynecology category II (proportions of agreement = 0.73, 95% confidence interval 0.70-0.76), and the ones with the lowest agreement were American College of Obstetrics and Gynecology categories I and III. Reliability was significantly higher with International Federation of Gynecology and Obstetrics (κ = 0.37, 95% confidence interval 0.31-0.43), and National Institute for Health and Care Excellence (κ = 0.33, 95% confidence interval 0.28-0.39) than with American College of Obstetrics and Gynecology (κ = 0.15, 95% confidence interval 0.10-0.21); however, all represent only slight/fair reliability. International Federation of Gynecology and Obstetrics and National Institute for Health and Care

  10. Interobserver reliability of computed tomographic contouring of canine tonsils in radiation therapy treatment planning.

    PubMed

    Murakami, Keiko; Rancilio, Nicholas J; Plantenga, Jeannie Poulson; Moore, George E; Heng, Hock Gan; Lim, Chee Kin

    2018-05-01

    In radiation therapy (RT) treatment planning for canine head and neck cancer, the tonsils may be included as part of the treated volume. Delineation of tonsils on computed tomography (CT) scans is difficult. Error or uncertainty in the volume and location of contoured structures may result in treatment failure. The purpose of this prospective, observer agreement study was to assess the interobserver agreement of tonsillar contouring by two groups of trained observers. Thirty dogs undergoing pre- and post-contrast CT studies of the head were included. After the pre- and postcontrast CT scans, the tonsils were identified via direct visualization, barium paste was applied bilaterally to the visible tonsils, and a third CT scan was acquired. Data from each of the three CT scans were registered in an RT treatment planning system. Two groups of observers (one veterinary radiologist and one veterinary radiation oncologist in each group) contoured bilateral tonsils by consensus, obtaining three sets of contours. Tonsil volume and location data were obtained from both groups. The contour volumes and locations were compared between groups using mixed (fixed and random effect) linear models. There was no significant difference between each group's contours in terms of three-dimensional coordinates. However there was a significant difference between each group's contours in terms of the tonsillar volume (P < 0.0001). Pre- and postcontrast CT can be used to identify the location of canine tonsils with reasonable agreement between trained observers. Discrepancy in tonsillar volume between groups of trained observers may affect RT treatment outcome. © 2017 American College of Veterinary Radiology.

  11. Interobserver reliability of the young-burgess and tile classification systems for fractures of the pelvic ring.

    PubMed

    Koo, Henry; Leveridge, Mike; Thompson, Charles; Zdero, Rad; Bhandari, Mohit; Kreder, Hans J; Stephen, David; McKee, Michael D; Schemitsch, Emil H

    2008-07-01

    The purpose of this study was to measure interobserver reliability of 2 classification systems of pelvic ring fractures and to determine whether computed tomography (CT) improves reliability. The reliability of several radiographic findings was also tested. Thirty patients taken from a database at a Level I trauma facility were reviewed. For each patient, 3 radiographs (AP pelvis, inlet, and outlet) and CT scans were available. Six different reviewers (pelvic and acetabular specialist, orthopaedic traumatologist, or orthopaedic trainee) classified the injury according to Young-Burgess and Tile classification systems after reviewing plain radiographs and then after CT scans. The Kappa coefficient was used to determine interobserver reliability of these classification systems before and after CT scan. For plain radiographs, overall Kappa values for the Young-Burgess and Tile classification systems were 0.72 and 0.30, respectively. For CT scan and plain radiographs, the overall Kappa values for the Young-Burgess and Tile classification systems were 0.63 and 0.33, respectively. The pelvis/acetabular surgeons demonstrated the highest level of agreement using both classification systems. For individual questions, the addition of CT did significantly improve reviewer interpretation of fracture stability. The pre-CT and post-CT Kappa values for fracture stability were 0.59 and 0.93, respectively. The CT scan can improve the reliability of assessment of pelvic stability because of its ability to identify anatomical features of injury. The Young-Burgess system may be optimal for the learning surgeon. The Tile classification system is more beneficial for specialists in pelvic and acetabular surgery.

  12. Emergency Physicians Are Able to Detect Right Ventricular Dilation With Good Agreement Compared to Cardiology.

    PubMed

    Rutz, Matt A; Clary, Julie M; Kline, Jeffrey A; Russell, Frances M

    2017-07-01

    Focused cardiac ultrasound (FOCUS) is a useful tool in evaluating patients presenting to the emergency department (ED) with acute dyspnea. Prior work has shown that right ventricular (RV) dilation is associated with repeat hospitalizations and shorter life expectancy. Traditionally, RV assessment has been evaluated by cardiologist-interpreted comprehensive echocardiography. The primary goal of this study was to determine the inter-rater reliability between emergency physicians (EPs) and a cardiologist for determining RV dilation on FOCUS performed on ED patients with acute dyspnea. This was a prospective, observational study at two urban academic EDs; patients were enrolled if they had acute dyspnea and a computed tomographic pulmonary angiogram without acute disease. All patients had an EP-performed FOCUS to assess for RV dilation. RV dilation was defined as an RV to left ventricular ratio greater than 1. FOCUS interpretations were compared to a blinded cardiologist FOCUS interpretation using agreement and kappa statistics. Of 84 FOCUS examinations performed on 83 patients, 17% had RV dilation. Agreement and kappa, for EP-performed FOCUS for RV dilation were 89% (95% confidence interval [CI] 80-95%) and 0.68 (95% CI 0.48-0.88), respectively. Emergency physician sonographers are able to detect RV dilation with good agreement when compared to cardiology. These results support the wider use of EP-performed FOCUS to evaluate for RV dilation in ED patients with dyspnea. © 2017 by the Society for Academic Emergency Medicine.

  13. Information Technology Management: DoD Organization Information Assurance Management of Information Technology Goods and Services Acquired Through Interagency Agreements

    DTIC Science & Technology

    2006-02-23

    Information Technology Management Department of Defense Office of Inspector General February 23, 2006 AccountabilityIntegrityQuality DoD...Organization Information Assurance Management of Information Technology Goods and Services Acquired Through Interagency Agreements (D-2006-052) Report...REPORT TYPE 3. DATES COVERED 00-00-2006 to 00-00-2006 4. TITLE AND SUBTITLE Information Technology Management: DoD Organization Information

  14. A comparison of film and computer workstation measurements of degenerative spondylolisthesis: intraobserver and interobserver reliability.

    PubMed

    Bolesta, Michael J; Winslow, Lauren; Gill, Kevin

    2010-06-01

    A comparison of measurements of degenerative spondylolisthesis made on film and on computer workstations. To determine whether the 2 methodologies are comparable in some of the parameters used to assess lumbar degenerative spondylolisthesis. Digital radiology has been replacing analog radiographs. In scoliosis, several studies have shown that measurements made on digital and analog films are similar and that they are also similar to those made on computer workstations. Such work has not been done in spondylolisthesis. Twenty-four cases of lumbar degenerative spondylolisthesis were identified from our clinic practice. Three observers measured anterior displacement, sagittal rotation, and lumbar lordosis on digital films using the same protractor and pencil. The same parameters were measured on the same studies at clinical workstations. All measurements were repeated 2 weeks later. A statistician determined the intra- and interobserver reliability of the 2 measurement methods and the degree of agreement between the 2 methods. The differences between the first and second readings did reach statistical significance in some cases, but none of them were large enough to be clinically meaningful. The interclass correlation coefficients (ICCs) were ≥0.80 except for one (0.67). The difference among the 3 observers was similarly statistically significant in a few instances but not enough to influence clinical decisions and with good ICCs (0.67 and better). Similarly, the differences in the 2 methods were small, and ICCs ranged from 0.69 to 0.98. This study supports the use of computer workstation measurements in lumbar degenerative spondylolisthesis. The parameters used in this study were comparable, whether measured on film or at clinical workstations.

  15. Intra and interobserver variability of intrapartum transperineal ultrasound measurements with contraction and pushing.

    PubMed

    Sainz, José A; Fernández-Palacín, Ana; Borrero, Carlota; Aquise, Adriana; Ramos, Zenaida; García-Mejido, José A

    2018-04-01

    The aim of this study was to evaluate the inter- and intraobserver correlation of the different intrapartum-transperineal-ultrasound-parameters(ITU) (angle of progression (AoP), progression-distance (PD), head-direction (HD), midline-angle (MLA) and head-perineum distance (HPD)) with contraction and pushing. We evaluated 28 nulliparous women at full dilatation under epidural analgesia. We performed a transperineal ultrasound evaluating AoP and PD in the longitudinal plane, and MLA and HPD in the transverse plane. Interclass correlation coefficients (ICC) with 95% CIs and Bland-Altman analysis were used to assess intra- and interobserver measurement's repeatability. The ICC of the ITU for the same observer was adequate for all the parameters (p < .005) AoP 0.98 (95%CI, 0.96-0.99), PD 0.98 (95%CI, 0.97-0.99), MLA 0.99 (95%CI, 0.97-0.99), HPD 0.96 (95%CI, 0.88-0.99). The ICC of the ITU for interobserver was: AoP 0.93 (95%CI, 0.79-0.98), PD 0.92 (95%CI, 0.76-0.97), MLA 0.77 (95%CI, 0.42-0.92), HPD 0.47 (95%CI, -0.12-0.8). The HD had an interobserver correlation of 0.53 (95%CI, 0.1-0.9) (Kappa C). The mean difference of the AoP was 2.42°, of the PD 1 mm and 0.28° MLA (Bland-Altman test). ITU has an adequate intra- and interobserver correlation for its use with contraction and pushing under epidural analgesia. Impact statement What is already known on this subject: The intrapartum transperineal ultrasound parameters can be used with contraction and pushing under epidural analgesia. What the results of this study add to what we know: ITU may be used to evaluate the difficulty of instrumental delivery/to evaluate the difficulty of instrumentation in vaginal operative deliveries and this study concludes that ITU is reproducible during uterine contraction with pushing. What the implications are of these findings for clinical practice and/or further research: Therefore, ITU could be used without difficulty with an adequate intra- and interobserver correlation for the

  16. Assessment of interobserver concordance in polysomnography scoring of sleep bruxism

    PubMed Central

    Ferraz, Otávio; de Moura Guimarães, Thais; Maluly Filho, Milton; Dal-Fabbro, Cibele; Abraão Crosara Cunha, Thays; Cristina Lotaif, Ana; Cristina Barros Schütz, Teresa; Santos-Silva, Rogério; Tufik, Sergio; Bittencourt, Lia

    2015-01-01

    Introduction: Objective evaluation of sleep bruxism (SB) using whole-night polysomnography (PSG) is relevant for diagnostic confirmation. Nevertheless, the PSG electromyogram (EMG) scoring may give rise to controversy, particularly when audiovisual monitoring is not performed. Therefore, the present study assessed the concordance between two independent scorers in visual SB scoring on PSG performed without audiovisual monitoring. Methods: Fifty-six PSG tests were scored from individuals with clinical history and polysomnography criteria of SB. In addition to the protocol of conventional whole-night PSG, electrodes were also placed bilaterally on the masseter and temporal muscles. Visual EMG scoring without audiovisual monitoring was performed by two independent scorers (Dentist 1 and Dentist 2) according to the recommendations formulated in the AASM manual (2007). Kendall Tau correlation was used to assess interobserver concordance relative to the variables “total duration of events (seconds)”, “shortest events”, “longest events” and index in each phasic, tonic or mixed event. Results: The correlation was positive and significant relative to all the investigated variables, with T > 0.54. Conclusion: Good inter-examiner concordance was found in SB scoring in the absence of audiovisual monitoring. PMID:26779318

  17. Inter- and Intraobserver Agreement of 18F-FDG PET/CT Image Interpretation in Patients Referred for Assessment of Cardiac Sarcoidosis.

    PubMed

    Ohira, Hiroshi; Ardle, Brian Mc; deKemp, Robert A; Nery, Pablo; Juneau, Daniel; Renaud, Jennifer M; Klein, Ran; Clarkin, Owen; MacDonald, Karen; Leung, Eugene; Nair, Girish; Beanlands, Rob; Birnie, David

    2017-08-01

    Recent studies have reported the usefulness of 18F-FDG PET in aiding with the diagnosis and management of patients with cardiac sarcoidosis (CS). However, image interpretation of 18F-FDG PET for CS is sometimes challenging. We sought to investigate the inter- and intraobserver agreement and explore factors that led to important discrepancies between readers. Methods: We studied consecutive patients with no significant coronary artery disease who were referred for assessment of CS. Two experienced readers, masked to clinical information and imaging reports, independently reviewed 18F-FDG PET/CT images. 18F-FDG PET/CT images were interpreted according to a predefined standard operating procedure, with cardiac 18F-FDG uptake patterns categorized into 5 patterns: none, focal, focal on diffuse, diffuse, and isolated lateral wall or basal uptake. Overall image assessment was classified as either consistent with active CS or not. Results: One hundred scans were included from 71 patients. Of these, 46 underwent 18F-FDG PET/CT with a no-restriction diet (no-restriction group), and 54 underwent 18F-FDG PET/CT with a low-carbohydrate, high-fat and protein-permitted diet (low-carb group). There was agreement of the interpretation category in 74 of 100 scans. The κ-value of agreement among all 5 categories was 0.64, indicating moderate agreement. For overall clinical interpretation, there was agreement in 93 of 100 scans (κ = 0.85). When scans were divided into the preparation groups, there was a trend toward higher agreement in the low-carb group versus the no-restriction group (80% vs. 67%, P = 0.08). Regarding the overall clinical interpretation, there was also a trend toward greater agreement in the low-carb group versus the no-restriction group (96% vs. 89%, P = 0.08). Conclusion: The interobserver agreement of cardiac 18F-FDG uptake image patterns was moderate. However, agreement was better regarding overall interpretation of CS. Detailed prescan dietary

  18. Inter-observer variability in the classification of ovarian cancer cell type using microscopy: a pilot study

    NASA Astrophysics Data System (ADS)

    Gavrielides, Marios A.; Ronnett, Brigitte M.; Vang, Russell; Seidman, Jeffrey D.

    2015-03-01

    Studies have shown that different cell types of ovarian carcinoma have different molecular profiles, exhibit different behavior, and that patients could benefit from type-specific treatment. Different cell types display different histopathology features, and different criteria are used for each cell type classification. Inter-observer variability for the task of classifying ovarian cancer cell types is an under-examined area of research. This study served as a pilot study to quantify observer variability related to the classification of ovarian cancer cell types and to extract valuable data for designing a validation study of digital pathology (DP) for this task. Three observers with expertise in gynecologic pathology reviewed 114 cases of ovarian cancer with optical microscopy, with specific guidelines for classifications into distinct cell types. For 93 cases all 3 pathologists agreed on the same cell type, for 18 cases 2 out of 3 agreed, and for 3 cases there was no agreement. Across cell types with a minimum sample size of 10 cases, agreement between all three observers was {91.1%, 80.0%, 90.0%, 78.6%, 100.0%, 61.5%} for the high grade serous carcinoma, low grade serous carcinoma, endometrioid, mucinous, clear cell, and carcinosarcoma cell types respectively. These results indicate that unanimous agreement varied over a fairly wide range. However, additional research is needed to determine the importance of these differences in comparison studies. These results will be used to aid in the design and sizing of such a study comparing optical and digital pathology. In addition, the results will help in understanding the potential role computer-aided diagnosis has in helping to improve the agreement of pathologists for this task.
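    The case counts in the abstract above (93, 18 and 3 of 114) can be turned into overall proportions with a few lines of arithmetic. The sketch below uses only those reported counts; the function and variable names are illustrative.

```python
def agreement_shares(counts):
    """Convert case counts per agreement level into percentages (1 decimal)."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


# Counts as reported in the abstract for the 114 reviewed cases.
reported = {
    "all three agree": 93,
    "two of three agree": 18,
    "no agreement": 3,
}

assert sum(reported.values()) == 114  # the three groups cover every case
shares = agreement_shares(reported)
print(shares)
```

    On these counts, unanimous agreement is about 81.6% overall, which sits within the per-cell-type range of 61.5% to 100.0% quoted in the abstract.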

  19. Interobserver Reliability of the Berlin ARDS Definition and Strategies to Improve the Reliability of ARDS Diagnosis.

    PubMed

    Sjoding, Michael W; Hofer, Timothy P; Co, Ivan; Courey, Anthony; Cooke, Colin R; Iwashyna, Theodore J

    2018-02-01

    Failure to reliably diagnose ARDS may be a major driver of negative clinical trials and underrecognition and treatment in clinical practice. We sought to examine the interobserver reliability of the Berlin ARDS definition and examine strategies for improving the reliability of ARDS diagnosis. Two hundred five patients with hypoxic respiratory failure from four ICUs were reviewed independently by three clinicians, who evaluated whether patients had ARDS, the diagnostic confidence of the reviewers, whether patients met individual ARDS criteria, and the time when criteria were met. Interobserver reliability of an ARDS diagnosis was "moderate" (kappa = 0.50; 95% CI, 0.40-0.59). Sixty-seven percent of diagnostic disagreements between clinicians reviewing the same patient was explained by differences in how chest imaging studies were interpreted, with other ARDS criteria contributing less (identification of ARDS risk factor, 15%; cardiac edema/volume overload exclusion, 7%). Combining the independent reviews of three clinicians can increase reliability to "substantial" (kappa = 0.75; 95% CI, 0.68-0.80). When a clinician diagnosed ARDS with "high confidence," all other clinicians agreed with the diagnosis in 72% of reviews. There was close agreement between clinicians about the time when a patient met all ARDS criteria if ARDS developed within the first 48 hours of hospitalization (median difference, 5 hours). The reliability of the Berlin ARDS definition is moderate, driven primarily by differences in chest imaging interpretation. Combining independent reviews by multiple clinicians or improving methods to identify bilateral infiltrates on chest imaging are important strategies for improving the reliability of ARDS diagnosis. Copyright © 2017 American College of Chest Physicians. All rights reserved.

  20. Reliability testing of the Larsen and Sharp classifications for rheumatoid arthritis of the elbow.

    PubMed

    Jew, Nicholas B; Hollins, Anthony M; Mauck, Benjamin M; Smith, Richard A; Azar, Frederick M; Miller, Robert H; Throckmorton, Thomas W

    2017-01-01

    Two popular systems for classifying rheumatoid arthritis affecting the elbow are the Larsen and Sharp schemes. To our knowledge, no study has investigated the reliability of these 2 systems. We compared the intraobserver and interobserver agreement of the 2 systems to determine whether one is more reliable than the other. The radiographs of 45 patients diagnosed with rheumatoid arthritis affecting the elbow were evaluated. Anteroposterior and lateral radiographs were deidentified and distributed to 6 evaluators (4 fellowship-trained upper extremity surgeons and 2 orthopedic trainees). Each evaluator graded all 45 radiographs according to the Larsen and Sharp scoring methods on 2 occasions, at least 2 weeks apart. Overall intraobserver reliability was 0.93 (95% confidence interval [CI], 0.90-0.95) for the Larsen system and 0.92 (95% CI, 0.86-0.96) for the Sharp classification, both indicating substantial agreement. Overall interobserver reliability was 0.70 (95% CI, 0.60-0.80) for the Larsen classification and 0.68 (95% CI, 0.54-0.81) for the Sharp system, both indicating good agreement. There were no significant differences in the intraobserver or interobserver reliability of the systems overall and no significant differences in reliability between attending surgeons and trainees for either classification system. The Larsen and Sharp systems both show substantial intraobserver reliability and good interobserver agreement for the radiographic classification of rheumatoid arthritis affecting the elbow. Differences in training level did not result in substantial variances in reliability for either system. We conclude that both systems can be reliably used to evaluate rheumatoid arthritis of the elbow by observers of varying training levels. Copyright © 2017 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.

  1. Intra- and Interobserver Reliability of Three Classification Systems for Hallux Rigidus.

    PubMed

    Dillard, Sarita; Schilero, Christina; Chiang, Sharon; Pham, Peter

    2018-04-18

    There are over ten classification systems currently used in the staging of hallux rigidus. This results in confusion and inconsistency with radiographic interpretation and treatment. The reliability of hallux rigidus classification systems has not yet been tested. The purpose of this study was to evaluate intra- and interobserver reliability using three commonly used classifications for hallux rigidus. Twenty-one plain radiograph sets were presented to ten ACFAS board-certified foot and ankle surgeons. Each physician classified each radiograph based on clinical experience and knowledge according to the Regnauld, Roukis, and Hattrup and Johnson classification systems. The two-way mixed single-measure consistency intraclass correlation was used to calculate intra- and interrater reliability. The intrarater reliability of individual sets for the Roukis and Hattrup and Johnson classification systems was "fair to good" (Roukis, 0.62±0.19; Hattrup and Johnson, 0.62±0.28), whereas the intrarater reliability of individual sets for the Regnauld system bordered between "fair to good" and "poor" (0.43±0.24). The interrater reliability of the mean classification was "excellent" for all three classification systems. Conclusions: Reliable and reproducible classification systems are essential for treatment and prognostic implications in hallux rigidus. In our study, the Roukis classification system had the best intrarater reliability. Although there are various classification systems for hallux rigidus, our results indicate that all three of these classification systems show reliability and reproducibility.

  2. Megavoltage computed tomography image guidance with helical tomotherapy in patients with vertebral tumors: analysis of factors influencing interobserver variability.

    PubMed

    Levegrün, Sabine; Pöttgen, Christoph; Jawad, Jehad Abu; Berkovic, Katharina; Hepp, Rodrigo; Stuschke, Martin

    2013-02-01

    To evaluate megavoltage computed tomography (MVCT)-based image guidance with helical tomotherapy in patients with vertebral tumors by analyzing factors influencing interobserver variability, considered as quality criterion of image guidance. Five radiation oncologists retrospectively registered 103 MVCTs in 10 patients to planning kilovoltage CTs by rigid transformations in 4 df. Interobserver variabilities were quantified using the standard deviations (SDs) of the distributions of the correction vector components about the observers' fraction mean. To assess intraobserver variabilities, registrations were repeated after ≥4 weeks. Residual deviations after setup correction due to uncorrectable rotational errors and elastic deformations were determined at 3 craniocaudal target positions. To differentiate observer-related variations in minimizing these residual deviations across the 3-dimensional MVCT from image resolution effects, 2-dimensional registrations were performed in 30 single transverse and sagittal MVCT slices. Axial and longitudinal MVCT image resolutions were quantified. For comparison, image resolution of kilovoltage cone-beam CTs (CBCTs) and interobserver variability in registrations of 43 CBCTs were determined. Axial MVCT image resolution is 3.9 lp/cm. Longitudinal MVCT resolution amounts to 6.3 mm, assessed as full-width at half-maximum of thin objects in MVCTs with finest pitch. Longitudinal CBCT resolution is better (full-width at half-maximum, 2.5 mm for CBCTs with 1-mm slices). In MVCT registrations, interobserver variability in the craniocaudal direction (SD 1.23 mm) is significantly larger than in the lateral and ventrodorsal directions (SD 0.84 and 0.91 mm, respectively) and significantly larger compared with CBCT alignments (SD 1.04 mm). Intraobserver variabilities are significantly smaller than corresponding interobserver variabilities (variance ratio [VR] 1.8-3.1). Compared with 3-dimensional registrations, 2-dimensional registrations

  3. Interobserver error involved in independent attempts to measure cusp base areas of Pan M1s

    PubMed Central

    Bailey, Shara E; Pilbrow, Varsha C; Wood, Bernard A

    2004-01-01

    Cusp base areas measured from digitized images increase the amount of detailed quantitative information one can collect from post-canine crown morphology. Although this method is gaining wide usage for taxonomic analyses of extant and extinct hominoids, the techniques for digitizing images and taking measurements differ between researchers. The aim of this study was to investigate interobserver error in order to help assess the reliability of cusp base area measurement within extant and extinct hominoid taxa. Two of the authors measured individual cusp base areas and total cusp base area of 23 maxillary first molars (M1) of Pan. From these, relative cusp base areas were calculated. No statistically significant interobserver differences were found for either absolute or relative cusp base areas. On average the hypocone and paracone showed the least interobserver error (< 1%) whereas the protocone and metacone showed the most (2.6–4.5%). We suggest that the larger measurement error in the metacone/protocone is due primarily to either weakly defined fissure patterns and/or the presence of accessory occlusal features. Overall, levels of interobserver error are similar to those found for intraobserver error. The results of our study suggest that if certain prescribed standards are employed then cusp and crown base areas measured by different individuals can be pooled into a single database. PMID:15447691

  4. Assessment of Interobserver Reliability in Nutrition Studies that Use Direct Observation of School Meals

    PubMed Central

    BAGLIO, MICHELLE L.; BAXTER, SUZANNE DOMEL; GUINN, CAROLINE H.; THOMPSON, WILLIAM O.; SHAFFER, NICOLE M.; FRYE, FRANCESCA H. A.

    2005-01-01

    This article (a) provides a general review of interobserver reliability (IOR) and (b) describes our method for assessing IOR for items and amounts consumed during school meals for a series of studies regarding the accuracy of fourth-grade children's dietary recalls validated with direct observation of school meals. A widely used validation method for dietary assessment is direct observation of meals. Although many studies utilize several people to conduct direct observations, few published studies indicate whether IOR was assessed. Assessment of IOR is necessary to determine that the information collected does not depend on who conducted the observation. Two strengths of our method for assessing IOR are that IOR was assessed regularly throughout the data collection period and that IOR was assessed for foods at the item and amount level instead of at the nutrient level. Adequate agreement among observers is essential to the reasoning behind using observation as a validation tool. Readers are encouraged to question the results of studies that fail to mention and/or to include the results for assessment of IOR when multiple people have conducted observations. PMID:15354155

  5. The postoperative COFAS end-stage ankle arthritis classification system: interobserver and intraobserver reliability.

    PubMed

    Krause, Fabian G; Di Silvestro, Matthew; Penner, Murray J; Wing, Kevin J; Glazebrook, Mark A; Daniels, Timothy R; Lau, Johnny T C; Younger, Alastair S E

    2012-02-01

    End-stage ankle arthritis is operatively treated with numerous designs of total ankle replacement and different techniques for ankle fusion. For superior comparison of these procedures, outcome research requires a classification system to stratify patients appropriately. A postoperative 4-type classification system was designed by 6 fellowship-trained foot and ankle surgeons. Four surgeons reviewed blinded patient profiles and radiographs on 2 occasions to determine the interobserver and intraobserver reliability of the classification. Excellent interobserver reliability (κ = .89) and intraobserver reproducibility (κ = .87) were demonstrated for the postoperative classification system. In conclusion, the postoperative Canadian Orthopaedic Foot and Ankle Society (COFAS) end-stage ankle arthritis classification system appears to be a valid tool to evaluate the outcome of patients operated for end-stage ankle arthritis.

  6. 19 CFR 10.605 - Goods classifiable as goods put up in sets.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ...-Central America-United States Free Trade Agreement Rules of Origin § 10.605 Goods classifiable as goods... 19 Customs Duties 1 2010-04-01 2010-04-01 false Goods classifiable as goods put up in sets. 10.605 Section 10.605 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY...

  7. MRI of the wrist in juvenile idiopathic arthritis: proposal of a paediatric synovitis score by a consensus of an international working group. Results of a multicentre reliability study.

    PubMed

    Damasio, Maria Beatrice; Malattia, Clara; Tanturri de Horatio, Laura; Mattiuz, Chiara; Pistorio, Angela; Bracaglia, Claudia; Barbuti, Domenico; Boavida, Peter; Juhan, Karen Lambot; Ording, Lil Sophie Mueller; Rosendahl, Karen; Martini, Alberto; Magnano, GianMichele; Tomà, Paolo

    2012-09-01

    MRI is a sensitive tool for the evaluation of synovitis in juvenile idiopathic arthritis (JIA). The purpose of this study was to introduce a novel MRI-based score for synovitis in children and to examine its inter- and intraobserver variability in a multi-centre study. Wrist MRI was performed in 76 children with JIA. On postcontrast 3-D spoiled gradient-echo and fat-suppressed T2-weighted spin-echo images, joint recesses were scored for the degree of synovial enhancement, effusion and overall inflammation independently by two paediatric radiologists. Total-enhancement and inflammation-synovitis scores were calculated. Interobserver agreement was poor to moderate for enhancement and inflammation in all recesses, except in the radioulnar and radiocarpal joints. Intraobserver agreement was good to excellent. For enhancement and inflammation scores, mean differences (95% CI) between observers were -1.18 (-4.79 to 2.42) and -2.11 (-6.06 to 1.83). Intraobserver variability (reader 1) was 0 (-1.65 to 1.65) and 0.02 (-1.39 to 1.44). Intraobserver agreement was good. Except for the radioulnar and radiocarpal joints, interobserver agreement was not acceptable. Therefore, the proposed scoring system requires further refinement.

  8. 20 CFR 416.2171 - Duration of agreement.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... end the agreement; and (3) The State does not give a good reason for keeping the agreement in force beyond the ending date we selected. If the State does provide a good reason, the termination will be...

  9. Interobserver variation in CD30 immunohistochemistry interpretation; consequences for patient selection for targeted treatment.

    PubMed

    Koens, Lianne; van de Ven, Peter M; Hijmering, Nathalie J; Kersten, Marie José; Diepstra, Arjan; Chamuleau, Martine; de Jong, Daphne

    2018-05-14

    CD30 immunohistochemistry (IHC) in malignant lymphoma is used for selection of patients in clinical trials using brentuximab vedotin, an antibody-drug conjugate targeting the CD30 molecule. For reliable implementation in daily practice and meaningful selection of patients for clinical trials, information on technical variation and interobserver reproducibility of CD30 IHC staining is required. We conducted a 3-round reproducibility assessment of CD30 scoring for categorized frequency and intensity, including a technical validation, a "live polling" pre- and post-instruction scoring round, and a web-based round including individual scoring with additional IHC information to mimic daily diagnostic practice. Agreement in all three scoring rounds was poor to fair (κ=0.12 to 0.35 for CD30 positive tumor cell percentage, and κ=0.16 to 0.41 for staining intensity), even when allowing for one category of freedom in percentage of tumor cell positivity (κ=0.30 to 0.61). The first round with CD30 staining performed in 5 independent laboratories showed objective differences in staining intensity. In the second round, about half of the pathologists changed their opinion on CD30 frequency after a discussion on potential pitfalls, highlighting hesitancy in decision-making. Using fictional cut-off points for percentage of tumor cell positivity, agreement was still suboptimal (κ=0.35 to 0.60). Lack of agreement in cases with heterogeneous expression is shown to influence patient eligibility for treatment with brentuximab vedotin both in clinical practice and within the context of clinical trials, and limits the potential predictive value of the relative frequency of CD30 positive neoplastic cells for clinical response. This article is protected by copyright. All rights reserved.

  10. Digital image analysis of Ki67 proliferation index in breast cancer using virtual dual staining on whole tissue sections: clinical validation and inter-platform agreement.

    PubMed

    Koopman, Timco; Buikema, Henk J; Hollema, Harry; de Bock, Geertruida H; van der Vegt, Bert

    2018-05-01

    The Ki67 proliferation index is a prognostic and predictive marker in breast cancer. Manual scoring is prone to inter- and intra-observer variability. The aims of this study were to clinically validate digital image analysis (DIA) of Ki67 using virtual dual staining (VDS) on whole tissue sections and to assess inter-platform agreement between two independent DIA platforms. Serial whole tissue sections of 154 consecutive invasive breast carcinomas were stained for Ki67 and cytokeratin 8/18 with immunohistochemistry in a clinical setting. Ki67 proliferation index was determined using two independent DIA platforms, implementing VDS to identify tumor tissue. Manual Ki67 score was determined using a standardized manual counting protocol. Inter-observer agreement between manual and DIA scores and inter-platform agreement between both DIA platforms were determined and calculated using Spearman's correlation coefficients. Correlations and agreement were assessed with scatterplots and Bland-Altman plots. Spearman's correlation coefficients were 0.94 (p < 0.001) for inter-observer agreement between manual counting and platform A, 0.93 (p < 0.001) between manual counting and platform B, and 0.96 (p < 0.001) for inter-platform agreement. Scatterplots and Bland-Altman plots revealed no skewness within specific data ranges. In the few cases with ≥ 10% difference between manual counting and DIA, results by both platforms were similar. DIA using VDS is an accurate method to determine the Ki67 proliferation index in breast cancer, as an alternative to manual scoring of whole sections in clinical practice. Inter-platform agreement between two different DIA platforms was excellent, suggesting vendor-independent clinical implementability.

  11. Comparison of four morphometric definitions and a semiquantitative consensus reading for assessing prevalent vertebral fractures.

    PubMed

    Grados, F; Roux, C; de Vernejoul, M C; Utard, G; Sebert, J L; Fardellone, P

    2001-01-01

    The assessment of vertebral fracture in patients with osteoporosis by conventional radiography has been improved over the past 10 years using either the semiquantitative (SQ) method devised by Genant et al. or quantitative morphometry. However, there is still no internationally agreed definition for vertebral fracture and there have been few comparative studies between these different approaches. Our study assessed the reproducibility of the SQ method and of four commonly used morphometric algorithms (Melton's, Eastell's, Minne's and McCloskey's methods) for assessing prevalent vertebral fractures, and examined the agreement of each morphometric algorithm with a SQ consensus reading performed by three experts. With this consensus reading in place of a gold standard, we determined relative measures of sensitivity, specificity and optimal cutoff threshold for each morphometric algorithm. The study was conducted in 39 postmenopausal women who had at least one osteoporotic vertebral fracture. Normal values were derived from 84 healthy postmenopausal women with apparently normal vertebral bodies. Our results indicate that the concordance of SQ method was excellent (intraobserver agreement on serial radiographs = 96.4%, kappa = 0.91; agreement between individual readings and the consensus reading = 98%, kappa = 0.95). Three morphometric approaches demonstrated good intra- and interobserver concordance (Melton: intraobserver agreement on serial radiographs = 92.7%, kappa = 0.82, interobserver agreement = 91.1%, kappa = 0.79; Eastell: intraobserver agreement on serial radiographs = 87.6%, kappa = 0.66, interobserver agreement = 88.6%, kappa = 0.68; McCloskey: intraobserver agreement on serial radiographs = 91.5%, kappa = 0.72, interobserver agreement = 93.9%, kappa = 0.78). 
    Except for McCloskey's method, the optimal cutoff thresholds defined in our study by highest kappa score or Youden index in comparison with the SQ consensus reading were near the cutoff thresholds that

  12. A Comparison of Reliability Measures for Continuous and Discontinuous Recording Methods: Inflated Agreement Scores with Partial Interval Recording and Momentary Time Sampling for Duration Events

    ERIC Educational Resources Information Center

    Rapp, John T.; Carroll, Regina A.; Stangeland, Lindsay; Swanson, Greg; Higgins, William J.

    2011-01-01

    The authors evaluated the extent to which interobserver agreement (IOA) scores, using the block-by-block method for events scored with continuous duration recording (CDR), were higher when the data from the same sessions were converted to discontinuous methods. Sessions with IOA scores of 89% or less with CDR were rescored using 10-s partial…

  13. Agreement between auricular and rectal measurements of body temperature in healthy cats.

    PubMed

    Sousa, Marlos G; Carareto, Roberta; Pereira-Junior, Valdo A; Aquino, Monally C C

    2013-04-01

    Measurement of body temperature is a routine part of the clinical assessment of a patient. However, this procedure may be time-consuming and stressful to most animals because the standard site of temperature acquisition remains the rectal mucosa. Although an increasing number of clinicians have been using auricular temperature to estimate core body temperature, evidence is still lacking regarding agreement between these two methods in cats. In this investigation, we evaluated the agreement between temperatures measured in the rectum and ear in 29 healthy cats over a 2-week period. Temperatures were measured in the rectum (using digital and mercury-in-glass thermometers) and ear once a day for 14 consecutive days, producing 406 temperature readings for each thermometer. Mean temperature and confidence intervals were similar between methods, and Bland-Altman plots showed small biases and narrow limits of agreement acceptable for clinical purposes. The interobserver variability was also checked, which indicated a strong correlation between two near-simultaneous temperature readings. Results are consistent with auricular thermometry being a reliable alternative to rectal thermometry for assessing core body temperature in healthy cats.
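
    For readers unfamiliar with the Bland-Altman analysis mentioned in this abstract, the bias is the mean of the paired differences between two methods, and the 95% limits of agreement are conventionally taken as the bias plus or minus 1.96 standard deviations of those differences. The following is a minimal sketch of that calculation; the function name `bland_altman` and the temperature values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired readings in degrees Celsius (illustration only).
rectal = [38.5, 38.7, 38.2, 38.9]
auricular = [38.4, 38.8, 38.1, 38.7]

bias, lower, upper = bland_altman(rectal, auricular)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

    Whether limits of a given width are "acceptable for clinical purposes" is a clinical judgment that the calculation itself cannot settle.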

  14. Accuracy of endoscopic diagnosis of Helicobacter pylori infection according to level of endoscopic experience and the effect of training

    PubMed Central

    2013-01-01

    Background: Accurate prediction of Helicobacter pylori infection status on endoscopic images can contribute to early detection of gastric cancer, especially in Asia. We identified the diagnostic yield of endoscopy for H. pylori infection at various endoscopist career levels and the effect of two years of training on diagnostic yield. Methods: A total of 77 consecutive patients who underwent endoscopy were analyzed. H. pylori infection status was determined by histology, serology, and the urea breath test and categorized as H. pylori-uninfected, -infected, or -eradicated. Distinctive endoscopic findings were judged by six physicians at different career levels: beginner (<500 endoscopies), intermediate (1500–5000), and advanced (>5000). Diagnostic yield and inter- and intra-observer agreement on H. pylori infection status were evaluated. Values were compared between the two beginners after two years of training. The kappa (K) statistic was used to calculate agreement. Results: For all physicians, the diagnostic yield was 88.9% for H. pylori-uninfected, 62.1% for H. pylori-infected, and 55.8% for H. pylori-eradicated. Intra-observer agreement for H. pylori infection status was good (K > 0.6) for all physicians, while inter-observer agreement was lower (K = 0.46) for beginners than for intermediate and advanced (K > 0.6). For all physicians, good inter-observer agreement in endoscopic findings was seen for atrophic change (K = 0.69), regular arrangement of collecting venules (K = 0.63), and hemorrhage (K = 0.62). For beginners, the diagnostic yield of H. pylori-infected/eradicated status and inter-observer agreement of endoscopic findings were improved after two years of training. Conclusions: The diagnostic yield of endoscopic diagnosis was high for H. pylori-uninfected cases, but was low for H. pylori-eradicated cases. In beginners, daily training on endoscopic findings improved the low diagnostic yield. PMID:23947684
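
    The abstract reports kappa values without stating which variant was computed. As a hedged illustration of the general idea, the sketch below implements Cohen's kappa for two raters: observed agreement corrected for the agreement expected by chance from each rater's label frequencies. The function name `cohen_kappa` and the ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies,
    # summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1:
        return 1.0  # both raters used one identical label for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings for six cases (illustration only).
a = ["infected", "infected", "uninfected", "uninfected", "eradicated", "infected"]
b = ["infected", "uninfected", "uninfected", "uninfected", "eradicated", "eradicated"]
print(round(cohen_kappa(a, b), 3))
```

    With more than two raters, as in this study, a multi-rater statistic such as Fleiss' kappa or an average of pairwise values would be needed instead; which one the authors used is not stated here.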

  15. A teaching intervention in a contouring dummy run improved target volume delineation in locally advanced non-small cell lung cancer: Reducing the interobserver variability in multicentre clinical studies.

    PubMed

    Schimek-Jasch, Tanja; Troost, Esther G C; Rücker, Gerta; Prokic, Vesna; Avlar, Melanie; Duncker-Rohr, Viola; Mix, Michael; Doll, Christian; Grosu, Anca-Ligia; Nestle, Ursula

    2015-06-01

    Interobserver variability in the definition of target volumes (TVs) is a well-known confounding factor in (multicentre) clinical studies employing radiotherapy. Therefore, detailed contouring guidelines are provided in the prospective randomised multicentre PET-Plan (NCT00697333) clinical trial protocol. This trial compares strictly FDG-PET-based TV delineation with conventional TV delineation in patients with locally advanced non-small cell lung cancer (NSCLC). Despite detailed contouring guidelines, their interpretation by different radiation oncologists can vary considerably, leading to undesirable discrepancies in TV delineation. Considering this, as part of the PET-Plan study quality assurance (QA), a contouring dummy run (DR) consisting of two phases was performed to analyse the interobserver variability before and after teaching. In the first phase of the DR (DR1), radiation oncologists from 14 study centres were asked to delineate TVs as defined by the study protocol (gross TV, GTV; and two clinical TVs, CTV-A and CTV-B) in a test patient. A teaching session was held at a study group meeting, including a discussion of the results focussing on discordances in comparison to the per-protocol solution. Subsequently, the second phase of the DR (DR2) was performed in order to evaluate the impact of teaching. Teaching after DR1 resulted in a reduction of absolute TVs in DR2, as well as in better concordance of TVs. The Overall Kappa(κ) indices increased from 0.63 to 0.71 (GTV), 0.60 to 0.65 (CTV-A) and from 0.59 to 0.63 (CTV-B), demonstrating improvements in overall interobserver agreement. Contouring DRs and study group meetings as part of QA in multicentre clinical trials help to identify misinterpretations of per-protocol TV delineation. Teaching the correct interpretation of protocol contouring guidelines leads to a reduction in interobserver variability and to more consistent contouring, which should consequently improve the validity of the overall study

  16. Interobserver variability in target volume delineation of hepatocellular carcinoma : An analysis of the working group "Stereotactic Radiotherapy" of the German Society for Radiation Oncology (DEGRO).

    PubMed

    Gkika, E; Tanadini-Lang, S; Kirste, S; Holzner, P A; Neeff, H P; Rischke, H C; Reese, T; Lohaus, F; Duma, M N; Dieckmann, K; Semrau, R; Stockinger, M; Imhoff, D; Kremers, N; Häfner, M F; Andratschke, N; Nestle, U; Grosu, A L; Guckenberger, M; Brunner, T B

    2017-10-01

    Definition of gross tumor volume (GTV) in hepatocellular carcinoma (HCC) requires dedicated imaging in multiple contrast medium phases. The aim of this study was to evaluate the interobserver agreement (IOA) in gross tumor delineation of HCC in a multicenter panel. The analysis was performed within the "Stereotactic Radiotherapy" working group of the German Society for Radiation Oncology (DEGRO). The GTVs of three anonymized HCC cases were delineated by 16 physicians from nine centers using multiphasic CT scans. In the first case the tumor was well defined. The second patient had multifocal HCC (one conglomerate and one peripheral tumor) and was previously treated with transarterial chemoembolization (TACE). The peripheral lesion was adjacent to the previous TACE site. The last patient had an extensive HCC with a portal vein thrombosis (PVT) and an inhomogeneous liver parenchyma due to cirrhosis. The IOA was evaluated according to Landis and Koch. The IOA for the first case was excellent (kappa: 0.85); for the second case moderate (kappa: 0.48) for the peripheral tumor and substantial (kappa: 0.73) for the conglomerate. In the case of the peripheral tumor the inconsistency is most likely explained by the necrotic tumor cavity after TACE caudal to the viable tumor. In the last case the IOA was fair, with a kappa of 0.34, with significant heterogeneity concerning the borders of the tumor and the PVT. The IOA was very good among the cases where the tumor was well defined. In complex cases, where the tumor did not show the typical characteristics, or in cases with Lipiodol (Guerbet, Paris, France) deposits, the IOA was compromised.
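
    The Landis and Koch scale referred to in this abstract assigns verbal labels to kappa ranges: below 0 poor, 0.00-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, and 0.81-1.00 almost perfect. A minimal lookup sketch follows; the function name `landis_koch` is an assumption made for illustration. Note that the abstract describes 0.85 as "excellent", whereas the original scale's wording for that band is "almost perfect".

```python
def landis_koch(kappa):
    """Map a kappa value to its Landis and Koch (1977) verbal label."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# Kappa values quoted in the abstract above.
for value in (0.85, 0.48, 0.73, 0.34):
    print(value, landis_koch(value))
```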

  17. Inter-observer agreement of a multi-parameter campsite monitoring program on the Dixie National Forest, Utah

    Treesearch

    Nicholas J. Glidden; Martha E. Lee

    2007-01-01

    Precision is crucial to campsite monitoring programs. Yet, little empirical research has ever been published on the level of precision of this type of monitoring programs. The purpose of this study was to evaluate the level of agreement between observers of campsite impacts using a multi-parameter campsite monitoring program. Thirteen trained observers assessed 16...

  18. Development and implementation of the Good Neighbor Agreement (GNA) practice in the USA sustainable mining development.

    NASA Astrophysics Data System (ADS)

    Masaitis, Alexandra

    2014-05-01

    New economic, environmental and social challenges for the mining industry in the USA show the need to implement "responsible" mining practices that include improved community involvement. Conflicts which occur in the US territory and with US mining companies around the world are now common between the mining proponents, NGOs and communities. These conflicts can sometimes be alleviated by early development of modes of communication, and a formal discussion format that allows airing of concerns and potential resolution of problems. One of the methods that can formalize this process is to establish a Good Neighbor Agreement (GNA), which deals specifically with challenges in relationships between mining operations and the local communities. It is a new practice related to mining operations that are oriented toward social needs and concerns of local communities that arise during the normal life of a mine, which can achieve sustainable mining practices. The GNA project being currently developed at the University of Nevada, USA in cooperation with the Newmont Mining Corporation has a goal of creating an open company/community dialog that will help identify and address sociological and environmental concerns associated with mining. Discussion: The Good Neighbor Agreement currently evolving will address the following: 1. Identify spheres of possible cooperation between mining companies, government organizations, and NGOs. 2. Provide an economically viable mechanism for developing a partnership between mining operations and the local communities that will increase the mining industry's accountability and provide higher levels of confidence for the community that a mine is operated in a safe and sustainable manner. Implementation of the GNA can help identify and evaluate conflict criteria in mining/community relationships; determine the status of concerns; determine the role and responsibilities of stakeholders; analyze problem resolution feasibility; maintain the community

  19. Novel Semiquantitative Bone Marrow Oedema Score and Fracture Score for the Magnetic Resonance Imaging Assessment of the Active Charcot Foot in Diabetes

    PubMed Central

    Meacock, L.; Donaldson, Ana; Isaac, A.; Briody, A.; Ramnarine, R.; Edmonds, M. E.; Elias, D. A.

    2017-01-01

    There are no accepted methods to grade bone marrow oedema (BMO) and fracture on magnetic resonance imaging (MRI) scans in Charcot osteoarthropathy. The aim was to devise semiquantitative BMO and fracture scores on foot and ankle MRI scans in diabetic patients with active osteoarthropathy and to assess the agreement in using these scores. Three radiologists assessed 45 scans (Siemens Avanto 1.5T, dedicated foot and ankle coil) and scored independently twenty-two bones (proximal phalanges, medial and lateral sesamoids, metatarsals, tarsals, distal tibial plafond, and medial and lateral malleoli) for BMO (0—no oedema, 1—oedema < 50% of bone volume, and 2—oedema > 50% of bone volume) and fracture (0—no fracture, 1—fracture, and 2—collapse/fragmentation). Interobserver agreement and intraobserver agreement were measured using multilevel modelling and intraclass correlation (ICC). The interobserver agreement for the total BMO and fracture scores was very good (ICC = 0.83, 95% confidence intervals (CI) 0.76, 0.91) and good (ICC = 0.62; 95% CI 0.48, 0.76), respectively. The intraobserver agreement for the total BMO and fracture scores was good (ICC = 0.78, 95% CI 0.6, 0.95) and fair to moderate (ICC = 0.44; 95% CI 0.14, 0.74), respectively. The proposed BMO and fracture scores are reliable and can be used to grade the extent of bone damage in the active Charcot foot. PMID:29230422

  20. Assessment of the intraobserver and interobserver reliability of a communicating vessels volumeter to measure wrist-hand volume.

    PubMed

    de Carvalho, Rogério Mendonca; Perez, Maria Del Carmen Janerio; Miranda, Fausto

    2012-10-01

    Traditional volumetry based on Archimedes' principle is the gold standard for the measurement of limb volume, but the routine use of this technique is discouraged because of several disadvantages. The purpose of this study was to evaluate intraobserver and interobserver reliability of direct measurements of wrist-hand volume using a new communicating vessels volumeter based on Pascal's law. A reliability study was conducted. To evaluate the reliability of the communicating vessels volumeter in generating measurements, 30 hands of 15 participants (9 women, 6 men) were measured 3 times each by 3 observers, totaling 270 volumetric results. Measurement time was short (mean = 3 minutes 42 seconds). The intraclass correlation coefficient (ICC) was .9977 for observer 1 and .9976 for observers 2 and 3. The interobserver ICC was .9998. The standard error of measurement was about 3 mL for all observers; the interobserver result was 1 mL. The interrater coefficient of variance (CV) was 1.15% for the series of 9 measurements collected for each segment; the intrarater CV was 1.20%. Limitations: No swollen hands were measured, and measurements were not compared with the gold standard technique. Thus, accuracy of the new volumeter was not determined in this study. A new device has been developed for plethysmography of the extremities, and the results of its use to measure the volume of the wrist-hand segment were reliable in both intraobserver and interobserver analyses.

  1. Agreement among Magnetic Resonance Imaging/Magnetic Resonance Cholangiopancreatography (MRI-MRCP) and Endoscopic Ultrasound (EUS) in the evaluation of morphological features of Branch Duct Intraductal Papillary Mucinous Neoplasm (BD-IPMN).

    PubMed

    Uribarri-Gonzalez, Laura; Keane, Margaret G; Pereira, Stephen P; Iglesias-García, Julio; Dominguez-Muñoz, J Enrique; Lariño-Noia, Jose

    2018-03-01

    To evaluate the agreement between the imaging modalities MRI-MRCP and EUS in cystic lesions of the pancreas which were thought to be a BD-IPMN. Multicenter retrospective study included all patients between 2010 and 2015 with a suspected BD-IPMN who underwent an EUS and MRI-MRCP within 6 months or less of each other. Location, number, size, worrisome features and high-risk stigmata were evaluated. Interobserver agreement was evaluated by Kappa score. 173 patients were included (97 UHSC, 76 UCLH-RFH), mean age 65 (range 25-87 years), 66 males. When comparing both modalities there was good agreement for the location of the cyst. The median lesion size was larger by MRI-MRCP than EUS although it was not significant. With regards to worrisome features, there was moderate agreement for main PD of 5-9 mm and abrupt change (k = 0.45 and 0.52). Fair agreement was seen for the cyst wall thickening (k = 0.25). No agreement was seen between the presence of non-enhanced mural nodules or lymphadenopathy (k < 0). With regards to high-risk stigmata, poor agreement was obtained for the detection of an enhanced solid component (k = 0.12). No agreement was observed for main PD > 10 mm (k < 0). In this multicentre study of patients with a BD-IPMN under active surveillance, most disagreement between these modalities was seen in the proximal pancreas. There was generally only minimal concordance between the imaging findings of EUS and MRI-MRCP for the detection of high-risk stigmata and worrisome features. Copyright © 2018 IAP and EPC. All rights reserved.

  2. Good Agreement Between Transabdominal and Endoscopic Ultrasound of the Pancreas in Chronic Pancreatitis.

    PubMed

    Engjom, Trond; Pham, Khahn Do-Chong; Erchinger, Friedemann; Haldorsen, Ingfrid Salvesen; Gilja, Odd Helge; Dimcevski, Georg; Havre, Roald Flesland

    2018-03-26

    We aimed to evaluate the agreement of single criteria and dedicated scores from transabdominal ultrasound of the pancreas (US) compared to standards by endoscopic ultrasound (EUS) and computed tomography (CT). In this observational cohort study performed in a tertiary care center, US and EUS were performed in 110 patients referred for suspected CP. Based on the Mayo score, 52 patients were diagnosed with CP. The sonographic findings obtained by both methods were registered. The number of criteria was counted and scored according to the Rosemont score. Agreement between the number of detected US and EUS criteria was substantial (ICC = 0.74 [0.61 - 0.83]). Adding Rosemont weighting improved the agreement (ICC = 0.88 [0.81 - 0.92]). Regarding individual criteria, the agreement was substantial for the detection of calcifications (κ = 0.86) and moderate for cysts and irregular or dilated pancreatic duct (κ = 0.42 - 0.58). Agreement for the other criteria was poorer (κ ≤ 0.40). The diagnostic performance indices [95 % CI] of US for diagnosing CP (using Mayo score as reference standard) were for the unweighted score: Sensitivity: 0.65 [0.51 - 0.78], specificity: 0.97 [0.87 - 1.00]; and for Rosemont score: Sensitivity: 0.75 [0.61 - 0.86], specificity: 0.95 [0.83 - 0.99]. The agreement between US and EUS for the unweighted and weighted scores was substantial. For the features calcifications, cysts and main pancreatic duct (MPD) changes, agreement was moderate to substantial. For the other detected US criteria, the agreement with EUS was too poor to be clinically relevant. © Georg Thieme Verlag KG Stuttgart · New York.

  3. Application of good practices as described by the NEPSI agreement coincides with a strong decline in the exposure to respiratory crystalline silica in Finnish workplaces.

    PubMed

    Tuomi, Tapani; Linnainmaa, Markku; Väänänen, Virpi; Reijula, Kari

    2014-08-01

    To protect the health of those occupationally exposed to respirable crystalline silica, the main industries in the European Union associated with exposure to respirable silica agreed on appropriate measures for the improvement of working conditions through the application of good practices, as part of 'The Agreement on Workers Health Protection through the Good Handling and Use of Crystalline Silica and Products Containing it' (NEPSI agreement), signed in April 2006. The present paper examines trends in exposure to respirable crystalline silica in Finland prior to and following the implementation of the NEPSI agreement and includes a working example of the NEPSI approach in the concrete industry. Data derived from workplace exposure assessments during the years 1994-2013 are presented, including 2556 air samples collected mostly indoors, from either the breathing zone of workers or from stationary points usually at a height of 1.5 m above the floor, with the aim to estimate average exposure of workers to respirable crystalline silica during an 8-h working day. The aim was to find out how effective this unique approach has been in the management of one of the major occupational hazards in the concerned industries. Application of good practices as described by the NEPSI agreement coincides with a strong decline in the exposure to respirable crystalline silica in Finnish workplaces, as represented by the clientele of Finnish Institute of Occupational Health. During the years followed in the present study, we see a >10-fold decrease in the average and median exposures to respirable silica. Prior to the implementation of the NEPSI agreement, >50% of the workplace measurements yielded results above the OEL8 h (0.2 mg m⁻³). At present (2013), circa 10% of the measurements are at or above the OEL8 h (0.05 mg m⁻³). © The Author 2014. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.

  4. Morphology vs morphokinetics: a retrospective comparison of inter-observer and intra-observer agreement between embryologists on blastocysts with known implantation outcome.

    PubMed

    Adolfsson, Emma; Andershed, Anna Nowosad

    2018-06-18

    Our primary aim was to compare morphology and morphokinetics in terms of inter- and intra-observer agreement for blastocysts with known implantation outcome. Our secondary aim was to validate the morphokinetic parameters' ability to predict pregnancy using a previously published selection algorithm, and to compare this to standard morphology assessments. Two embryologists made independent blinded annotations on two occasions using time-lapse images and morphology evaluations using the Gardner Schoolcraft criteria of 99 blastocysts with known implantation outcome. Inter- and intra-observer agreement was calculated and compared using the two methods. The embryos were grouped based on their morphological score, and on their morphokinetic class using a previously published selection algorithm. The implantation rates for each group were calculated and compared. There was moderate agreement for morphology, with agreement on the same embryo score in 55 of 99 cases. The highest agreement rate was found for expansion grade, followed by trophectoderm and inner cell mass. Correlation with pregnancy was inconclusive. For morphokinetics, almost perfect agreement was found for early and late embryo development events, and strong agreement for day-2 and day-3 events. When applying the selection algorithm, the embryo distributions were uneven, and correlation to pregnancy was inconclusive. Time-lapse annotation is consistent and accurate, but our external validation of a previously published selection algorithm was unsuccessful.

  5. SU-E-J-266: Cone Beam Computed Tomography (CBCT) Inter-Scan and Inter-Observer Tumor Volume Variability Assessment in Patients Treated with Stereotactic Body Radiation Therapy (SBRT) for Early Stage Non-Small Cell Lung Cancer (NSCLC)

    SciTech Connect

    Hou, Y; Aileen, C; Kozono, D

    Purpose: Quantification of volume changes on CBCT during SBRT for NSCLC may provide a useful radiological marker for radiation response and adaptive treatment planning, but the reproducibility of CBCT volume delineation is a concern. This study is to quantify inter-scan/inter-observer variability in tumor volume delineation on CBCT. Methods: Twenty early-stage (stage I and II) NSCLC patients were included in this analysis. All patients were treated with SBRT with a median dose of 54 Gy in 3 to 5 fractions. Two physicians independently manually contoured the primary gross tumor volume on CBCTs taken immediately before SBRT treatment (Pre) and after the same SBRT treatment (Post). Absolute volume differences (AVD) were calculated between the Pre and Post CBCTs for a given treatment to quantify inter-scan variability, and then between the two observers for a given CBCT to quantify inter-observer variability. AVD was also normalized with respect to average volume to obtain relative volume differences (RVD). Bland-Altman approach was used to evaluate variability. All statistics were calculated with SAS version 9.4. Results: The 95% limit of agreement (mean ± 2SD) on AVD and RVD measurements between Pre and Post scans were −0.32 cc to 0.32 cc and −0.5% to 0.5% versus −1.9 cc to 1.8 cc and −15.9% to 15.3% for the two observers respectively. The 95% limit of agreement of AVD and RVD between the two observers were −3.3 cc to 2.3 cc and −42.4% to 28.2% respectively. The greatest variability in inter-scan RVD was observed with very small tumors (< 5 cc). Conclusion: Inter-scan variability in RVD is greatest with small tumors. Inter-observer variability was larger than inter-scan variability. The 95% limit of agreement for inter-observer and inter-scan variability (∼15–30%) helps define a threshold for clinically meaningful change in tumor volume to assess SBRT response, with larger thresholds needed for very small tumors. Part of the work was funded by a

  6. Evaluating the intra- and interobserver reliability of three-dimensional ultrasound and power Doppler angiography (3D-PDA) for assessment of placental volume and vascularity in the second trimester of pregnancy.

    PubMed

    Jones, Nia W; Raine-Fenning, Nick J; Mousa, Hatem A; Bradley, Eileen; Bugg, George J

    2011-03-01

    Three-dimensional (3-D) power Doppler angiography (3-D-PDA) allows visualisation of Doppler signals within the placenta and their quantification is possible by the generation of vascular indices by the 4-D View software programme. This study aimed to investigate intra- and interobserver reproducibility of 3-D-PDA analysis of stored datasets at varying gestations with the ultimate goal being to develop a tool for predicting placental dysfunction. Women with an uncomplicated, viable singleton pregnancy were scanned at 12, 16 or 20 weeks gestational age groups. 3-D-PDA datasets acquired of the whole placenta were analysed using the VOCAL software processing tool. Each volume was analysed by three observers twice in the A plane. Intra- and interobserver reliability was assessed by intraclass correlation coefficients (ICCs) and Bland Altman plots. At each gestational age group, 20 low risk women were scanned resulting in 60 datasets in total. The ICC demonstrated a high level of measurement reliability at each gestation with intraobserver values >0.90 and interobserver values of >0.6 for the vascular indices. Bland Altman plots also showed high levels of agreement. Systematic bias was seen at 20 weeks in the vascular indices obtained by different observers. This study demonstrates that 3-D-PDA data can be measured reliably by different observers from stored datasets up to 18 weeks gestation. Measurements become less reliable as gestation advances with bias between observers evident at 20 weeks. Copyright © 2011 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.

  7. Observer variability in the assessment of CT coronary angiography and coronary artery calcium score: substudy of the Scottish COmputed Tomography of the HEART (SCOT-HEART) trial

    PubMed Central

    Williams, Michelle C; Golay, Saroj K; Hunter, Amanda; Weir-McCall, Jonathan R; Mlynska, Lucja; Dweck, Marc R; Uren, Neal G; Reid, John H; Lewis, Steff C; Berry, Colin; van Beek, Edwin J R; Roditi, Giles; Newby, David E; Mirsadraee, Saeed

    2015-01-01

    Introduction: Observer variability can influence the assessment of CT coronary angiography (CTCA) and the subsequent diagnosis of angina pectoris due to coronary heart disease. Methods: We assessed 210 CTCAs from the Scottish COmputed Tomography of the HEART (SCOT-HEART) trial for intraobserver and interobserver variability. Calcium score, coronary angiography and image quality were evaluated. Coronary artery disease was defined as none (<10%), mild (10–49%), moderate (50–70%) and severe (>70%) luminal stenosis and classified as no (<10%), non-obstructive (10–70%) or obstructive (>70%) coronary artery disease. Post-CTCA diagnosis of angina pectoris due to coronary heart disease was classified as yes, probable, unlikely or no. Results: Patients had a mean body mass index of 29 (28, 30) kg/m², heart rate of 58 (57, 60)/min and 62% were men. Intraobserver and interobserver agreements for the presence or absence of coronary artery disease were excellent (95% agreement, κ 0.884 (0.817 to 0.951)) and good (91%, 0.791 (0.703 to 0.879)), respectively. Intraobserver and interobserver agreement for the presence or absence of angina pectoris due to coronary heart disease were excellent (93%, 0.842 (0.755 to 0.918)) and good (86%, 0.701 (0.603 to 0.799)), respectively. Observer variability of calcium score was excellent for calcium scores below 1000. More segments were categorised as uninterpretable with 64-multidetector compared to 320-multidetector CTCA (10.1% vs 2.6%, p<0.001) but there was no difference in observer variability. Conclusions: Multicentre multidetector CTCA has excellent agreement in patients under investigation for suspected angina due to coronary heart disease. Trial registration number NCT01149590. PMID:26019881
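
    The abstract above reports both percent agreement and Cohen's κ. As a rough illustration of how the two relate for a pair of raters, the following minimal Python sketch computes κ from two lists of categorical ratings. The function name and the toy ratings are hypothetical and are not taken from the SCOT-HEART substudy, which does not describe its software.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical ratings.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Toy data (invented): 3 of 4 cases agree, so percent agreement is 75%.
reader_1 = ["yes", "yes", "no", "no"]
reader_2 = ["yes", "no", "no", "no"]
print(cohen_kappa(reader_1, reader_2))
```

    In this toy case the raw agreement is 75% but κ is lower, because half of that agreement would be expected by chance given the raters' marginal frequencies.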

  8. Reproducibility and Agreement Between 2 Spectral Domain Optical Coherence Tomography Devices for Anterior Chamber Angle Measurements

    PubMed Central

    Marion, Kenneth M.; Maram, Jyotsna; Pan, Xiaojing; Dastiridou, Anna; Zhang, ZhouYuan; Ho, Alex; Francis, Brian A.; Sadda, Srinivas R.

    2015-01-01

    Purpose: To compare anterior chamber angle parameters based on the location of Schwalbe line (SL) from 2 spectral domain optical coherence tomography (SD-OCT) instruments and to measure their reproducibility. Methods: Forty-two eyes from 21 normal, healthy participants underwent imaging of the inferior irido-corneal angle with the Spectralis and Cirrus SD-OCT under tightly controlled low-light conditions. SL-angle opening distance (SL-AOD) and SL-trabecular iris space area (SL-TISA) were measured by masked, certified graders at the Doheny Imaging Reading Center using customized grading software. Interinstrument and intrainstrument, as well as interobserver and intraobserver reproducibility of SL-AOD and SL-TISA measurements were evaluated by intraclass correlation coefficients (ICCs) and Bland-Altman plots with limits of agreement (LoA). Results: The mean SL-AOD was 0.662±0.191 mm in Spectralis and 0.677±0.213 mm in Cirrus. The mean SL-TISA was 0.250±0.073 mm² in Spectralis and 0.256±0.082 mm² in Cirrus. The agreement for intrainstrument (ICCs>0.979), intragrader (ICCs>0.992), and intergrader (ICCs>0.929) was excellent. Excellent agreement between the 2 devices was also documented with a mean difference of −0.016 (LoA −0.125 to 0.092) mm for SL-AOD and −0.007 (LoA −0.056 to 0.043) mm² in SL-TISA. Conclusions: Both SD-OCTs provided comparable measurements and permitted calculation of SL-based angle metrics. There was excellent interinstrument and intrainstrument and intraobserver and interobserver reproducibility for Spectralis and Cirrus SD-OCTs, suggesting true interchangeability between SD-OCT devices. This has the potential to lead to development of standardized grading assessments and quantification of angle parameters that would be valid across various SD-OCT devices. PMID:26200742
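
    The mean difference and limits of agreement (LoA) quoted above are the usual outputs of a Bland-Altman analysis. A minimal sketch of that calculation is given below, assuming paired measurements from two devices and the common convention of mean difference ± 1.96 standard deviations; the abstract does not state which multiplier was used, and the function name and numbers are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(device_a, device_b, z=1.96):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    if len(device_a) != len(device_b) or len(device_a) < 2:
        raise ValueError("need at least two paired measurements")
    # Per-subject differences between the two devices.
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)   # mean difference (systematic offset)
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * spread, bias + z * spread


# Invented paired readings in mm, purely for illustration.
first_device = [0.61, 0.70, 0.55, 0.82]
second_device = [0.63, 0.69, 0.58, 0.85]
bias, lower, upper = limits_of_agreement(first_device, second_device)
print(f"bias={bias:.3f} mm, LoA {lower:.3f} to {upper:.3f} mm")
```

    A narrow LoA interval centred near zero, as reported in the abstract, is what supports treating two devices as interchangeable for a given measurement.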

  9. Efficacy of double inversion recovery magnetic resonance imaging for the evaluation of the synovium in the femoro-patellar joint without contrast enhancement.

    PubMed

    Son, Ye Na; Jin, Wook; Jahng, Geon-Ho; Cha, Jang Gyu; Park, Yong Sung; Yun, Seong Jong; Park, So Young; Park, Ji Seon; Ryu, Kyung Nam

    2018-02-01

    To investigate the efficacy of double inversion recovery (DIR) sequence for evaluating the synovium of the femoro-patellar joint without contrast enhancement (CE). Two radiologists independently evaluated the axial DIR and CE T1-weighted fat-saturated (CET1FS) images of 33 knees for agreement; the visualisation and distribution of the synovium were evaluated using a four-point visual scaling system at each of the five levels of the femoro-patellar joint and the location of the thickest synovium. The maximal synovial thickness at each sequence was measured by consensus. The interobserver agreement was good (κ = 0.736) for the four-point scale, and was excellent for the location of the thickest synovium on DIR and CET1FS (κ = 0.955 and 0.954). The intersequential agreement for the area with the thickest synovium was also excellent (κ = 0.845 and κ = 0.828). The synovial thickness on each sequence showed excellent correlation (r = 0.872). The DIR showed as good a correlation as CET1FS for the evaluation of the synovium at the femoro-patellar joint. DIR may be a useful MR technique for evaluating the synovium without CE. • DIR can be useful for evaluating the synovium of the femoro-patellar joint. • Interobserver and intersequential agreements between DIR and CET1FS were good. • Mean thickness of the synovium was significantly different between two sequences.

  10. Does clinical experience affect the reproducibility of cervical vertebrae maturation method?

    PubMed

    Rongo, Roberto; Valleta, Rosa; Bucci, Rosaria; Bonetti, Giulio Alessandri; Michelotti, Ambrosina; D'Antò, Vincenzo

    2015-09-01

    To assess interobserver and intraobserver reproducibility of the cervical vertebrae maturation method (CVMM) among three panels of judges with different levels of orthodontic experience (OE). Fifty individual lateral cephalograms of good quality with complete visualization of cervical vertebrae 1 to 4 were selected. Thirty clinicians, divided according to their OE into three groups (junior group, JU, OE ≤ 1 year; postgraduate group, PG, 2 ≤ OE ≤ 4 years; specialist group, SP, OE ≥ 7 years), evaluated the cephalograms in two sessions (T1 and T2) 3 weeks apart. Kendall's W and weighted Cohen's kappa (κ) coefficients were computed to assess interobserver and intraobserver agreement. The level of significance was set as P < .05. For both the interobserver and the intraobserver datasets, the percentage of perfect agreement (PPA) and the number of stages apart for each disagreement were calculated. Kendall's W at T1 was SP = 0.61, PG = 0.70, and JU = 0.87; at T2 it was SP = 0.78, PG = 0.85, and JU = 0.86. The percentage of total interobserver perfect agreement (Inter-PPA) was 42.3% at T1 and 46.3% at T2. The JU group had the highest Cohen's κ coefficient at 0.78, while the PG and SP had coefficients of 0.64 each. The percentage of total intraobserver perfect agreement (Intra-PPA) was 54.2%. The reproducibility of the method was not improved by the level of orthodontic experience. The group with the lowest level of orthodontic experience had the best performance.

  11. Concordance between local, institutional, and central pathology review in glioblastoma: implications for research and practice: a pilot study.

    PubMed

    Gupta, Tejpal; Nair, Vimoj; Epari, Sridhar; Pietsch, Torsten; Jalali, Rakesh

    2012-01-01

    There is significant inter-observer variation amongst the neuro-pathologists in the typing, subtyping, and grading of glial neoplasms for diagnosis. Centralized pathology review has been proposed to minimize this inter-observer variation and is now almost mandatory for accrual into multicentric trials. We sought to assess the concordance between neuro-pathologists on histopathological diagnosis of glioblastoma. Comparison of local, institutional, and central neuro-oncopathology reporting in a cohort of 34 patients with newly diagnosed supratentorial glioblastoma accrued consecutively at a tertiary-care institution on a prospective trial testing the addition of a new agent to standard chemo-radiation regimen. Concordance was sub-optimal between local histological diagnosis and central review, fair between local diagnosis and institutional review, and good between institutional and central review, with respect to histological typing/subtyping. Twelve (39%) of 31 patients with local histological diagnosis had identical tumor type, subtype and grade on central review. Overall agreement was modestly better (52%) between local diagnosis and institutional review. In contrast, 28 (83%) of 34 patients had completely concordant histopathologic diagnosis between institutional and central review. The inter-observer reliability test showed poor agreement between local and central review (kappa statistic=0.12, 95% confidence interval (CI): -0.03-0.32, P=0.043), but moderate agreement between institutional and central review (kappa statistic=0.51, 95%CI: 0.17-0.84, P=0.00003). Agreement between local diagnosis and institutional review was fair. There exists significant inter-observer variation regarding histopathological diagnosis of glioblastoma with significant implications for clinical research and practice. There is a need for more objective, quantitative, robust, and reproducible criteria for better subtyping for accurate diagnosis.

  12. Test-retest and interobserver reliability of quantitative sensory testing according to the protocol of the German Research Network on Neuropathic Pain (DFNS): a multi-centre study.

    PubMed

    Geber, Christian; Klein, Thomas; Azad, Shahnaz; Birklein, Frank; Gierthmühlen, Janne; Huge, Volker; Lauchart, Meike; Nitzsche, Dorothee; Stengel, Maike; Valet, Michael; Baron, Ralf; Maier, Christoph; Tölle, Thomas; Treede, Rolf-Detlef

    2011-03-01

    Quantitative sensory testing (QST) is an instrument to assess positive and negative sensory signs, helping to identify mechanisms underlying pathologic pain conditions. In this study, we evaluated the test-retest reliability (TR-R) and the interobserver reliability (IO-R) of QST in patients with sensory disturbances of different etiologies. In 4 centres, 60 patients (37 male and 23 female, 56.4±1.9 years) with lesions or diseases of the somatosensory system were included. QST comprised 13 parameters including detection and pain thresholds for thermal and mechanical stimuli. QST was performed in the clinically most affected test area and a less or unaffected control area in a morning and an afternoon session on 2 consecutive days by examiner pairs (4 QSTs/patient). For both, TR-R and IO-R, there were high correlations (r=0.80-0.93) at the affected test area, except for wind-up ratio (TR-R: r=0.67; IO-R: r=0.56) and paradoxical heat sensations (TR-R: r=0.35; IO-R: r=0.44). Mean IO-R (r=0.83, 31% unexplained variance) was slightly lower than TR-R (r=0.86, 26% unexplained variance, P<.05); the difference in variance amounted to 5%. There were no differences between study centres. In a subgroup with an unaffected control area (n=43), reliabilities were significantly better in the test area (TR-R: r=0.86; IO-R: r=0.83) than in the control area (TR-R: r=0.79; IO-R: r=0.71, each P<.01), suggesting that disease-related systematic variance enhances reliability of QST. We conclude that standardized QST performed by trained examiners is a valuable diagnostic instrument with good test-retest and interobserver reliability within 2 days. With standardized training, observer bias is much lower than random variance. Quantitative sensory testing performed by trained examiners is a valuable diagnostic instrument with good interobserver and test-retest reliability for use in patients with sensory disturbances of different etiologies to help identify mechanisms of neuropathic and non

  13. Multi-rater Agreement in the Assessment of Anterior Cruciate Ligament Reconstruction Failure. A Radiographic and Video Analysis of the MARS Cohort

    PubMed Central

    Matava, Matthew J.; Arciero, Robert A.; Baumgarten, Keith M.; Carey, James L.; DeBerardino, Thomas M.; Hame, Sharon L.; Hannafin, Jo A.; Miller, Bruce S.; Nissen, Carl W.; Taft, Timothy N.; Wolf, Brian R.; Wright, Rick W.

    2015-01-01

    Background: ACL reconstruction failure occurs in up to 10% of cases. Technical errors are considered the most common cause of graft failure despite the absence of validated studies. There is limited data regarding the agreement among orthopedic surgeons in terms of the etiology of primary ACL reconstruction failure and accuracy of graft tunnel placement. Purpose: The purpose of this study is to test the hypothesis that experienced knee surgeons have a high level of inter-observer reliability in the agreement of the etiology of the primary ACL reconstruction failure, anatomical graft characteristics, and tunnel placement. Methods: Twenty cases of revision ACL reconstruction were randomly selected from the MARS database. Each case included the patient's history, standardized radiographs, and a concise 30-second arthroscopic video taken at the time of revision demonstrating the graft remnant and location of the tunnel apertures. Ten MARS surgeons not involved with the primary surgery reviewed all 20 cases. Each surgeon completed a two-part questionnaire dealing with each surgeon's training and practice as well as the placement of the femoral and tibial tunnels, condition of the primary graft, and the surgeon's opinion as to the etiology of graft failure. Inter-rater agreement was determined for each question with the kappa coefficient and prevalence adjusted bias adjusted kappa (PABAK). Results: The 10 reviewers were in practice an average of 14 years. All performed at least 25 ACL reconstructions per year and 9 were fellowship-trained in sports medicine. There was wide variability in agreement among knee experts as to the specific etiology of ACL graft failure. When specifically asked about technical error as the cause for failure, inter-observer agreement was only slight (prevalence adjusted bias adjusted kappa [PABAK]: 0.26). There was fair overall agreement on ideal femoral tunnel placement (PABAK: 0.55), but only

  14. Segmentation precision of abdominal anatomy for MRI-based radiotherapy

    SciTech Connect

    Noel, Camille E.; Zhu, Fan; Lee, Andrew Y.

    2014-10-01

    The limited soft tissue visualization provided by computed tomography, the standard imaging modality for radiotherapy treatment planning and daily localization, has motivated studies on the use of magnetic resonance imaging (MRI) for better characterization of treatment sites, such as the prostate and head and neck. However, no studies have been conducted on MRI-based segmentation for the abdomen, a site that could greatly benefit from enhanced soft tissue targeting. We investigated the interobserver and intraobserver precision in segmentation of abdominal organs on MR images for treatment planning and localization. Manual segmentation of 8 abdominal organs was performed by 3 independent observers on MR images acquired from 14 healthy subjects. Observers repeated segmentation 4 separate times for each image set. Interobserver and intraobserver contouring precision was assessed by computing 3-dimensional overlap (Dice coefficient [DC]) and distance to agreement (Hausdorff distance [HD]) of segmented organs. The mean and standard deviation of intraobserver and interobserver DC and HD values were DC_intraobserver = 0.89 ± 0.12, HD_intraobserver = 3.6 mm ± 1.5, DC_interobserver = 0.89 ± 0.15, and HD_interobserver = 3.2 mm ± 1.4. Overall, metrics indicated good interobserver/intraobserver precision (mean DC > 0.7, mean HD < 4 mm). Results suggest that MRI offers good segmentation precision for abdominal sites. These findings support the utility of MRI for abdominal planning and localization, as emerging MRI technologies, techniques, and onboard imaging devices are beginning to enable MRI-based radiotherapy.
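
    The Dice coefficient used above as the overlap metric can be illustrated with a short sketch. It assumes each observer's segmentation is available as a collection of voxel coordinates; the representation, the function name and the toy contours are hypothetical and do not reflect how the study's software stored contours.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice overlap 2|A∩B| / (|A| + |B|) between two voxel collections."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        raise ValueError("Dice is undefined for two empty segmentations")
    return 2 * len(a & b) / (len(a) + len(b))


# Invented 3-voxel contours from two observers; two voxels are shared.
observer_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
observer_2 = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
print(round(dice_coefficient(observer_1, observer_2), 3))
```

    A value of 1 means identical segmentations and 0 means no overlap, which is why the abstract treats a mean DC above 0.7 as indicating good precision.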

  15. Sensitivity and specificity of monochromatic photography of the ocular fundus in differentiating optic nerve head drusen and optic disc oedema: optic disc drusen and oedema.

    PubMed

    Gili, Pablo; Flores-Rodríguez, Patricia; Yangüela, Julio; Orduña-Azcona, Javier; Martín-Ríos, María Dolores

    2013-03-01

    Evaluation of the efficacy of monochromatic photography of the ocular fundus in differentiating optic nerve head drusen (ONHD) and optic disc oedema (ODE). Sixty-six patients with ONHD, 31 patients with ODE and 70 healthy subjects were studied. Colour and monochromatic fundus photography with different filters (green, red and autofluorescence) were performed. The results were analysed blindly by two observers. The sensitivity, specificity and interobserver agreement (k) of each test were assessed. Colour photography offers 65.5 % sensitivity and 100 % specificity for the diagnosis of ONHD. Monochromatic photography improves sensitivity and specificity and provides similar results: green filter (71.20 % sensitivity, 96.70 % specificity), red filter (80.30 % sensitivity, 96.80 % specificity), and autofluorescence technique (87.8 % sensitivity, 100 % specificity). The interobserver agreement was good with all techniques used: autofluorescence (k = 0.957), green filter (k = 0.897), red filter (k = 0.818) and colour (k = 0.809). Monochromatic fundus photography permits ONHD and ODE to be differentiated, with good sensitivity and very high specificity. The best results were obtained with autofluorescence and red filter study.

  16. Diagnosing colorectal medullary carcinoma: interobserver variability and clinicopathological implications.

    PubMed

    Lee, Lik Hang; Yantiss, Rhonda K; Sadot, Eran; Ren, Bing; Calvacanti, Marcela Santos; Hechtman, Jaclyn F; Ivelja, Sinisa; Huynh, Be; Xue, Yue; Shitilbans, Tatiana; Guend, Hamza; Stadler, Zsofia K; Weiser, Martin R; Vakiani, Efsevia; Gönen, Mithat; Klimstra, David S; Shia, Jinru

    2017-04-01

    Colorectal medullary carcinoma, recognized by the World Health Organization as a distinct histologic subtype, is commonly regarded as a specific entity with an improved prognosis and unique molecular pathogenesis. A fundamental but as yet unaddressed question, however, is whether it can be diagnosed reproducibly. In this study, by analyzing 80 colorectal adenocarcinomas whose dominant growth pattern was solid (thus encompassing medullary carcinoma and its mimics), we provided a detailed description of the morphological spectrum from "classic medullary histology" to nonmedullary poorly differentiated histologies and demonstrated significant overlapping between categories. By assessing a selected subset (n=30) that represented the spectrum of histologies, we showed that the interobserver agreement for diagnosing medullary carcinoma by using 2010 World Health Organization criteria was poor; the κ value among 5 gastrointestinal pathologists was only 0.157 (95% confidence interval, 0.127-0.263; P=.001). When we arbitrarily classified the entire cohort into "classic" and "indeterminate" medullary tumors (group 1, n=19; group 2, n=26, respectively) and nonmedullary poorly differentiated tumors (group 3, n=35), groups 1 and 2 were more likely to exhibit mismatch repair protein deficiency than group 3 (P<.001); however, improved survival could not be detected in either group compared with group 3. Our findings suggest that the diagnosis of medullary carcinoma, as currently applied, may only serve as a morphological descriptor indicating an increased likelihood of mismatch-repair deficiency. Additional evidence including a more objective classification system is needed before medullary carcinoma can be regarded as a distinct entity with prognostic relevance. Until such evidence becomes available, caution should be exercised when making this diagnosis, as well as when comparing results across different studies. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Reliability of cervical vertebral maturation staging.

    PubMed

    Rainey, Billie-Jean; Burnside, Girvan; Harrison, Jayne E

    2016-07-01

    Growth and its prediction are important for the success of many orthodontic treatments. The aim of this study was to determine the reliability of the cervical vertebral maturation (CVM) method for the assessment of mandibular growth. A group of 20 orthodontic clinicians, inexperienced in CVM staging, was trained to use the improved version of the CVM method for the assessment of mandibular growth with a teaching program. They independently assessed 72 consecutive lateral cephalograms, taken at Liverpool University Dental Hospital, on 2 occasions. The cephalograms were presented in 2 different random orders and interspersed with 11 additional images for standardization. The intraobserver and interobserver agreement values were evaluated using the weighted kappa statistic. The intraobserver and interobserver agreement values were substantial (weighted kappa, 0.6-0.8). The overall intraobserver agreement was 0.70 (SE, 0.01), with average agreement of 89%. The interobserver agreement values were 0.68 (SE, 0.03) for phase 1 and 0.66 (SE, 0.03) for phase 2, with average interobserver agreement of 88%. The intraobserver and interobserver agreement values of classifying the vertebral stages with the CVM method were substantial. These findings demonstrate that this method of CVM classification is reproducible and reliable. Copyright © 2016 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.

  18. 19 CFR 10.451 - Originating goods.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... ARTICLES CONDITIONALLY FREE, SUBJECT TO A REDUCED RATE, ETC. United States-Chile Free Trade Agreement Rules of Origin § 10.451 Originating goods. A good imported into the customs territory of the United States...

  19. PI-RADS version 2: evaluation of diffusion-weighted imaging interpretation between b = 1000 and b = 1500 s mm⁻².

    PubMed

    Kwon, Mi-Ri; Kim, Chan Kyo; Kim, Jae-Hun

    2017-11-01

    To investigate the variability of diffusion-weighted imaging (DWI) interpretation of Prostate Imaging Reporting and Data System (PI-RADS) version 2 (v2) in evaluating prostate cancer (PCa). 154 patients with PCa underwent multiparametric 3T MRI, followed by radical prostatectomy. DWI with different b values (b = 0, 100, 1000 and 1500 s mm⁻²) was obtained. Using the PI-RADS v2, two radiologists independently scored suspicious lesions in each patient and compared DWI of b = 1000 (DWI_1000) with 1500 (DWI_1500) s mm⁻². On DWI_1000 and DWI_1500, the intermethod and interobserver agreements of DWI scores were excellent in all patients (κ ≥ 0.873). In each peripheral zone and transition zone DWI scores, both observers showed excellent intermethod agreement between DWI_1000 and DWI_1500 (κ ≥ 0.897), and interobserver agreement for DWI_1000 and DWI_1500 was good to excellent (κ ≥ 0.796). For estimating clinically significant cancer, the area under receiver operating characteristics curves of DWI_1000 and DWI_1500 were 0.710 and 0.724 for observer 1 (p = 0.11), and 0.649 and 0.656 for observer 2 (p = 0.12), respectively. The PI-RADS v2 scoring at 3T shows excellent agreement between DWI_1000 and DWI_1500 in evaluating PCa, with excellent inter-observer agreement. Advance in knowledge: DWI using b = 1000 s mm⁻² instead of b = 1500 s mm⁻² reduces examination time or image distortion, with an improved signal-to-noise ratio.

  20. Creation and validation of a visual macroscopic hematuria scale for optimal communication and an objective hematuria index.

    PubMed

    Wong, Lih-Ming; Chum, Jia-Min; Maddy, Peter; Chan, Steven T F; Travis, Douglas; Lawrentschuk, Nathan

    2010-07-01

    Macroscopic hematuria is a common symptom and sign that is challenging to quantify and describe. The degree of hematuria communicated is variable due to health worker experience combined with lack of a reliable grading tool. We produced a reliable, standardized visual scale to describe hematuria severity. Our secondary aim was to validate a new laboratory test to quantify hemoglobin in hematuria specimens. Nurses were surveyed to ascertain current hematuria descriptions. Blood and urine were titrated at varying concentrations and digitally photographed in catheter bag tubing. Photos were processed and printed on transparency paper to create a prototype swatch or card showing light, medium, heavy and old hematuria. Using the swatch 60 samples were rated by nurses and laymen. Interobserver variability was reported using the generalized kappa coefficient of agreement. Specimens were analyzed for hemolysis by measuring optical density at oxyhemoglobin absorption peaks. Interobserver agreement between nurses and laymen was good (kappa = 0.51, p <0.001). Subgroup analysis showed substantial agreement for light hematuria (kappa = 0.71). Overall agreement improved when the moderate (kappa = 0.28) and heavy (kappa = 0.53) hematuria categories were combined (kappa = 0.70). Compared to known blood concentrations the assay of optical density at oxyhemoglobin absorption peaks showed a linear trend. A simple visual scale to grade and communicate hematuria with adequate interobserver agreement is feasible. The test for optical density at oxyhemoglobin absorption peaks is a new method, validated in our study, to quantify hemoglobin in a hematuria specimen. Copyright (c) 2010 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.

  1. Inter-observer variation in identifying mammals from their tracks at enclosed track plate stations

    Treesearch

    William J. Zielinski; Fredrick V. Schlexer

    2009-01-01

    Enclosed track plate stations are a common method to detect mammalian carnivores. Studies rely on these data to make inferences about geographic range, population status and detectability. Despite their popularity, there has been no effort to document inter-observer variation in identifying the species that leave their tracks. Four previous field crew leaders...

  2. Effects of the change in cutoff values for human epidermal growth factor receptor 2 status by immunohistochemistry and fluorescence in situ hybridization: a study comparing conventional brightfield microscopy, image analysis-assisted microscopy, and interobserver variation.

    PubMed

    Atkinson, Roscoe; Mollerup, Jens; Laenkholm, Anne-Vibeke; Verardo, Mark; Hawes, Debra; Commins, Deborah; Engvad, Birte; Correa, Adrian; Ehlers, Charlotte Cort; Nielsen, Kirsten Vang

    2011-08-01

    New guidelines for HER2 testing have been introduced. To evaluate the difference in HER2 assessment after introduction of new cutoff levels for both immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH) and to compare interobserver agreement and time to score between image analysis and conventional microscopy. Samples from 150 patients with breast cancer were scored by 7 pathologists using conventional microscopy, with a cutoff of both 10% and 30% IHC-stained cells, and using automated microscopy with image analysis. The IHC results were compared individually and to HER2 status as determined by FISH, using both the approved cutoff of 2.0 and the recently introduced cutoff of 2.2. High concordance was found in IHC scoring among the 7 pathologists. The 30% cutoff led to slightly fewer positive IHC observations. Introduction of a FISH equivocal zone affected 4% of the FISH scores. If cutoff for FISH is kept at 2.0, no difference in patient selection is found between the 10% and the 30% IHC cutoff. Among the 150 breast cancer samples, the new 30% IHC and 2.2 FISH cutoff levels resulted in one case without a firm diagnosis because both IHC and FISH were equivocal. Automated microscopy and image analysis-assisted IHC led to significantly better interobserver agreement among the 7 pathologists, with an increase in mean scoring time of only about 30 seconds per slide. The change in cutoff levels led to a higher concordance between IHC and FISH, but fewer samples were classified as HER2 positive.

  3. Dysplastic naevus: histological criteria and their inter-observer reproducibility.

    PubMed

    Hastrup, N; Clemmensen, O J; Spaun, E; Søndergaard, K

    1994-06-01

    Forty melanocytic lesions were examined in a pilot study, which was followed by a final series of 100 consecutive melanocytic lesions, in order to evaluate the inter-observer reproducibility of the histological criteria proposed for the dysplastic naevus. The specimens were examined in a blind fashion by four observers. Analysis by kappa statistics showed poor reproducibility of nuclear features, while reproducibility of architectural features was acceptable, improving in the final series. Consequently, we cannot apply the combined criteria of cytological and architectural features with any confidence in the diagnosis of dysplastic naevus, and, until further studies have documented that architectural criteria alone will suffice in the diagnosis of dysplastic naevus, we, as pathologists, shall avoid this term.

  4. 46 CFR 390.5 - Agreement vessels.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... water-borne carriage of men, materials, goods or wares between: (i) Two points in the United States; (ii..., Great Lakes, noncontiguous domestic, or short sea transportation trade. (iv) Engaged primarily in the water-borne carriage of men, materials, goods or wares; and (v) Designated in the agreement as a...

  5. 46 CFR 390.5 - Agreement vessels.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... water-borne carriage of men, materials, goods or wares between: (i) Two points in the United States; (ii..., Great Lakes, noncontiguous domestic, or short sea transportation trade. (iv) Engaged primarily in the water-borne carriage of men, materials, goods or wares; and (v) Designated in the agreement as a...

  6. 46 CFR 390.5 - Agreement vessels.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... water-borne carriage of men, materials, goods or wares between: (i) Two points in the United States; (ii..., Great Lakes, noncontiguous domestic, or short sea transportation trade. (iv) Engaged primarily in the water-borne carriage of men, materials, goods or wares; and (v) Designated in the agreement as a...

  7. 46 CFR 390.5 - Agreement vessels.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... water-borne carriage of men, materials, goods or wares between: (i) Two points in the United States; (ii..., Great Lakes, noncontiguous domestic, or short sea transportation trade. (iv) Engaged primarily in the water-borne carriage of men, materials, goods or wares; and (v) Designated in the agreement as a...

  8. 46 CFR 390.5 - Agreement vessels.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... water-borne carriage of men, materials, goods or wares between: (i) Two points in the United States; (ii..., Great Lakes, noncontiguous domestic, or short sea transportation trade. (iv) Engaged primarily in the water-borne carriage of men, materials, goods or wares; and (v) Designated in the agreement as a...

  9. TU-H-CAMPUS-JeP2-01: Inter-Observer Delineation Comparison of Visible Glandular Breast Tissue On Magnetic Resonance Imaging and Computed Tomography (prone and Supine)

    SciTech Connect

    Pogson, EM; University of Wollongong, Wollongong, NSW; Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW

    2016-06-15

    Purpose: Breast cancers predominantly arise from Glandular Breast Tissue (GBT). If the GBT can be treated effectively post-operatively utilising radiotherapy this may be adequate volumetric coverage for adjuvant breast radiotherapy. Adequate imaging of the GBT is necessary and will be assessed between MRI and CT modalities. GBT visualisation is acknowledged to be qualitatively superior on Magnetic Resonance Imaging (MRI) compared to Computed Tomography (CT), the current radiotherapy imaging standard, however this has not been quantitatively assessed. For radiotherapy purposes it is important that any treatment volume can be consistently defined between observers. This study investigates the consistency of CT and MRI GBT contours for potential radiotherapy planning. Methods: Ten experts (9 breast radiation oncologists and 1 radiologist) contoured the extent of the visible GBT for 33 patients on MRI and CT (both without contrast), which was performed according to a contouring guideline in supine and prone patient positions. The GBT volume was not a conventional whole breast radiotherapy planning volume, but rather the extent of GBT that was indicated from the CT or MR imaging. Volumes were compared utilizing the dice similarity coefficient (DSC), kappa statistic, and Hausdorff Distances (HDs) to ascertain the modality that was most consistently volumed. Results: The inter-observer concordance was of substantial agreement (kappa above 0.6) for the CT supine, CT prone, MRI supine and MRI prone datasets. The MRI GBT volumes were larger than the CT GBT volumes (p<0.001). Inter-observer conformity was higher for CT than MRI, although the magnitude of this difference was small (VOI<0.04). Conformity between modalities (CT and MRI) was in agreement for both prone and supine, DSC=0.75. Prone GBT volumes were larger than supine for both MRI and CT. Conclusion: MRI improves the extent of GBT delineation. The role of MRI guided, GBT-targeted radiotherapy requires

  10. Magnetic resonance enterography has good inter-rater agreement and diagnostic accuracy for detecting inflammation in pediatric Crohn disease.

    PubMed

    Church, Peter C; Greer, Mary-Louise C; Cytter-Kuint, Ruth; Doria, Andrea S; Griffiths, Anne M; Turner, Dan; Walters, Thomas D; Feldman, Brian M

    2017-05-01

    signs and wPCDAI was higher than with CRP. AUC was highest (≥0.75) for ulcers, wall enhancement, wall thickening, wall T2 hyperintensity and wall DWI hyperintensity. Some MRE signs had good inter-rater agreement and AUC for detection of inflammation in children with Crohn disease.

  11. Development and testing of a de novo clinical staging system for podoconiosis (endemic non-filarial elephantiasis).

    PubMed

    Tekola, Fasil; Ayele, Zewdu; Mariam, Dereje Haile; Fuller, Claire; Davey, Gail

    2008-10-01

    To develop and test a robust clinical staging system for podoconiosis, a geochemical disease in individuals exposed to red clay soil. We adapted the Dreyer system for staging filarial lymphoedema and tested it in four re-iterative field tests conducted in an area of high-podoconiosis prevalence in Southern Ethiopia. The system has five stages according to proximal spread of disease and presence of dermal nodules, ridges and bands. We measured the 1-week repeatability and the inter-observer agreement of the final staging system. The five-stage system is readily understood by community workers with little health training. Kappa for 1-week repeatability was 0.88 (95% CI 0.80-0.96), for agreement between health professionals was 0.71 (95% CI 0.60-0.82), while that between health professionals and community podoconiosis agents without formal health training averaged 0.64 (95% CI 0.52-0.78). This simple staging system with good inter-observer agreement and repeatability can assist in the management and further study of podoconiosis.

  12. A comparison of logistic regression analysis and an artificial neural network using the BI-RADS lexicon for ultrasonography in conjunction with introbserver variability.

    PubMed

    Kim, Sun Mi; Han, Heon; Park, Jeong Mi; Choi, Yoon Jung; Yoon, Hoi Soo; Sohn, Jung Hee; Baek, Moon Hee; Kim, Yoon Nam; Chae, Young Moon; June, Jeon Jong; Lee, Jiwon; Jeon, Yong Hwan

    2012-10-01

    To determine which Breast Imaging Reporting and Data System (BI-RADS) descriptors for ultrasound are predictors for breast cancer using logistic regression (LR) analysis in conjunction with interobserver variability between breast radiologists, and to compare the performance of artificial neural network (ANN) and LR models in differentiation of benign and malignant breast masses. Five breast radiologists retrospectively reviewed 140 breast masses and described each lesion using BI-RADS lexicon and categorized final assessments. Interobserver agreements between the observers were measured by kappa statistics. The radiologists' responses for BI-RADS were pooled. The data were divided randomly into train (n = 70) and test sets (n = 70). Using train set, optimal independent variables were determined by using LR analysis with forward stepwise selection. The LR and ANN models were constructed with the optimal independent variables and the biopsy results as dependent variable. Performances of the models and radiologists were evaluated on the test set using receiver-operating characteristic (ROC) analysis. Among BI-RADS descriptors, margin and boundary were determined as the predictors according to stepwise LR showing moderate interobserver agreement. Area under the ROC curves (AUC) for both of LR and ANN were 0.87 (95% CI, 0.77-0.94). AUCs for the five radiologists ranged 0.79-0.91. There was no significant difference in AUC values among the LR, ANN, and radiologists (p > 0.05). Margin and boundary were found as statistically significant predictors with good interobserver agreement. Use of the LR and ANN showed similar performance to that of the radiologists for differentiation of benign and malignant breast masses.

  13. Agreement Analysis: What He Said, She Said Versus You Said.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-06-01

    Correlation and agreement are 2 concepts that are widely applied in the medical literature and clinical practice to assess for the presence and strength of an association. However, because correlation and agreement are conceptually distinct, they require the use of different statistics. Agreement is a concept that is closely related to but fundamentally different from and often confused with correlation. The idea of agreement refers to the notion of reproducibility of clinical evaluations or biomedical measurements. The intraclass correlation coefficient is a commonly applied measure of agreement for continuous data. The intraclass correlation coefficient can be validly applied specifically to assess intrarater reliability and interrater reliability. As its name implies, the Lin concordance correlation coefficient is another measure of agreement or concordance. In undertaking a comparison of a new measurement technique with an established one, it is necessary to determine whether they agree sufficiently for the new to replace the old. Bland and Altman demonstrated that using a correlation coefficient is not appropriate for assessing the interchangeability of 2 such measurement methods. They in turn described an alternative approach, the since widely applied graphical Bland-Altman Plot, which is based on a simple estimation of the mean and standard deviation of differences between measurements by the 2 methods. In reading a medical journal article that includes the interpretation of diagnostic tests and application of diagnostic criteria, attention is conventionally focused on aspects like sensitivity, specificity, predictive values, and likelihood ratios. However, if the clinicians who interpret the test cannot agree on its interpretation and resulting typically dichotomous or binary diagnosis, the test results will be of little practical use. Such agreement between observers (interobserver agreement) about a dichotomous or binary variable is often reported as the
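    The Bland-Altman approach summarized in this abstract, based on the mean and standard deviation of paired differences, can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name and the sample readings are hypothetical, and the 1.96 multiplier is the conventional choice for 95% limits of agreement under approximately normal differences.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    if len(method_a) < 2:
        raise ValueError("at least two pairs are needed for a standard deviation")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)       # systematic difference between the two methods
    spread = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - z * spread, bias + z * spread


# Hypothetical paired readings from an established and a new method.
established = [10, 12, 14, 16]
new_method = [9, 13, 13, 17]

bias, lower, upper = bland_altman_limits(established, new_method)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

    Whether limits of this width are acceptable is a clinical judgment that the statistic itself does not settle.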

  14. Feasibility and observer reproducibility of speckle tracking echocardiography in congenital heart disease patients.

    PubMed

    Mokhles, Palwasha; van den Bosch, Annemien E; Vletter-McGhie, Jackie S; Van Domburg, Ron T; Ruys, Titia P E; Kauer, Floris; Geleijnse, Marcel L; Roos-Hesselink, Jolien W

    2013-09-01

    The twisting motion of the heart has an important role in the function of the left ventricle. Speckle tracking echocardiography is able to quantify left ventricular (LV) rotation and twist. So far this new technique has not been used in congenital heart disease patients. The aim of our study was to investigate the feasibility and the intra- and inter-observer reproducibility of LV rotation parameters in adult patients with congenital heart disease. The study population consisted of 66 consecutive patients seen in the outpatient clinic (67% male, mean age 31 ± 7.7 years, NYHA class 1 ± 0.3) with a variety of congenital heart disease. First, feasibility was assessed in all patients. Intra- and inter-observer reproducibility was assessed for the patients in which speckle tracking echocardiography was feasible. Adequate image quality, for performing speckle echocardiography, was found in 80% of patients. The bias for the intra-observer reproducibility of the LV twist was 0.0°, with 95% limits of agreement of -2.5° and 2.5° and for interobserver reproducibility the bias was 0.0°, with 95% limits of agreement of -3.0° and 3.0°. Intra- and inter-observer measurements showed a strong correlation (0.86 and 0.79, respectively). Also a good repeatability was seen. The mean time to complete full analysis per subject for the first and second measurement was 9 and 5 minutes, respectively. Speckle tracking echocardiography is feasible in 80% of adult patients with congenital heart disease and shows excellent intra- and inter-observer reproducibility. © 2013, Wiley Periodicals, Inc.

  15. Is liver perfusion CT reproducible? A study on intra- and interobserver agreement of normal hepatic haemodynamic parameters obtained with two different software packages.

    PubMed

    Bretas, Elisa Almeida Sathler; Torres, Ulysses S; Torres, Lucas Rios; Bekhor, Daniel; Saito Filho, Celso Fernando; Racy, Douglas Jorge; Faggioni, Lorenzo; D'Ippolito, Giuseppe

    2017-10-01

    To evaluate the agreement between the measurements of perfusion CT parameters in normal livers by using two different software packages. This retrospective study was based on 78 liver perfusion CT examinations acquired for detecting suspected liver metastasis. Patients with any morphological or functional hepatic abnormalities were excluded. The final analysis included 37 patients (59.7 ± 14.9 y). Two readers (1 and 2) independently measured perfusion parameters using different software packages from two major manufacturers (A and B). Arterial perfusion (AP) and portal perfusion (PP) were determined using the dual-input vascular one-compartmental model. Inter-reader agreement for each package and intrareader agreement between both packages were assessed with intraclass correlation coefficients (ICC) and Bland-Altman statistics. Inter-reader agreement was substantial for AP using software A (ICC = 0.82) and B (ICC = 0.85-0.86), fair for PP using software A (ICC = 0.44) and fair to moderate for PP using software B (ICC = 0.56-0.77). Intrareader agreement between software A and B ranged from slight to moderate (ICC = 0.32-0.62) for readers 1 and 2 considering the AP parameters, and from fair to moderate (ICC = 0.40-0.69) for readers 1 and 2 considering the PP parameters. At best there was only moderate agreement between both software packages, resulting in some uncertainty and suboptimal reproducibility. Advances in knowledge: Software-dependent factors may contribute to variance in perfusion measurements, demanding further technical improvements. AP measurements seem to be the most reproducible parameter to be adopted when evaluating liver perfusion CT.

  16. Reliability and reproducibility of several methods of arthroscopic assessment of femoral tunnel position during anterior cruciate ligament reconstruction.

    PubMed

    Ilahi, Omer A; Mansfield, David J; Urrea, Luis H; Qadeer, Ali A

    2014-10-01

    improve intraobserver agreement for any estimation method. There does not appear to be any advantage of using a half clock face or compass for estimating femoral tunnel position compared with a whole clock-face analogy. Visual reference aids appear to improve interobserver agreement (reliability) of circular analogies. The linear quadrant appears to be the most reliable method (fair to moderate agreement) for estimating femoral tunnel position without a visual aid for reference, but even better reliability, ranging from fair to good agreement, may be obtained by using the whole clock-face analogy with a visual aid. Increasing femoral tunnel position reliability may improve outcomes of ACL reconstruction surgery. Copyright © 2014 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.

  17. Can emergency physicians accurately and reliably assess acute vertigo in the emergency department?

    PubMed

    Vanni, Simone; Nazerian, Peiman; Casati, Carlotta; Moroni, Federico; Risso, Michele; Ottaviani, Maddalena; Pecci, Rudi; Pepe, Giuseppe; Vannucchi, Paolo; Grifoni, Stefano

    2015-04-01

    To validate a clinical diagnostic tool, used by emergency physicians (EPs), to diagnose the central cause of patients presenting with vertigo, and to determine interrater reliability of this tool. A convenience sample of adult patients presenting to a single academic ED with isolated vertigo (i.e. vertigo without other neurological deficits) was prospectively evaluated with STANDING (SponTAneous Nystagmus, Direction, head Impulse test, standiNG) by five trained EPs. The first step focused on the presence of spontaneous nystagmus, the second on the direction of nystagmus, the third on head impulse test and the fourth on gait. The local standard practice, senior audiologist evaluation corroborated by neuroimaging when deemed appropriate, was considered the reference standard. Sensitivity and specificity of STANDING were calculated. On the first 30 patients, inter-observer agreement among EPs was also assessed. Five EPs with limited experience in nystagmus assessment volunteered to participate in the present study enrolling 98 patients. Their average evaluation time was 9.9 ± 2.8 min (range 6-17). Central acute vertigo was suspected in 16 (16.3%) patients. There were 13 true positives, three false positives, 81 true negatives and one false negative, with a high sensitivity (92.9%, 95% CI 70-100%) and specificity (96.4%, 95% CI 93-38%) for central acute vertigo according to senior audiologist evaluation. The Cohen's kappas of the first, second, third and fourth steps of the STANDING were 0.86, 0.93, 0.73 and 0.78, respectively. The whole test showed a good inter-observer agreement (k = 0.76, 95% CI 0.45-1). In the hands of EPs, STANDING showed a good inter-observer agreement and accuracy validated against the local standard of care. © 2015 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.
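    The sensitivity and specificity reported in this abstract follow directly from the four counts it gives. The short check below uses only those counts; the function and variable names are illustrative, and the confidence intervals are not recomputed.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts as stated in the abstract: 13 TP, 3 FP, 81 TN, 1 FN (98 patients).
sens, spec = sensitivity_specificity(tp=13, fp=3, tn=81, fn=1)

print(f"sensitivity = {sens:.1%}")  # 92.9%
print(f"specificity = {spec:.1%}")  # 96.4%
```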

  18. Voluntary environmental agreements: Good or bad news for environmental protection?

    SciTech Connect

    Segerson, K.; Miceli, T.J.

    1998-09-01

    There has been growing interest in the use of voluntary agreements (VAs) as an environmental policy tool. This article uses a simple model to determine whether VAs are likely to lead to efficient environmental protection. The authors consider cases where polluters are induced to participate either by a background threat of mandatory controls (the stick approach) or by cost-sharing subsidies (the carrot approach). The results suggest that the overall impact on environmental quality could be positive or negative, depending on a number of factors, including the allocation of bargaining power, the magnitude of the background threat, and the social cost of funds.

  19. Inter-observer and intra-observer reliability in the radiographic diagnosis of avascular necrosis of the femoral head following reconstructive hip surgery in children with cerebral palsy.

    PubMed

    Hesketh, Kim; Sankar, Wudbhav; Joseph, Benjamin; Narayanan, Unni; Mulpuri, Kishore

    2016-04-01

    The incidence of avascular necrosis (AVN) following reconstructive hip surgery in cerebral palsy (CP) ranges from 0 to 69 % in the current literature. The purpose of this study was to determine the inter- and intra-observer reliability of radiographically diagnosing AVN in children with CP after hip surgery. A retrospective review of 65 children with CP who had reconstructive hip surgery between 2009 and 2012 at BC Children's Hospital was completed. Anterior-posterior and lateral radiographs were presented to four pediatric orthopaedic surgeons over two rounds. Surgeons were asked to review the set of unidentified radiographs and comment 'yes' or 'no' for the presence of AVN. Two weeks later the same set of radiographs was sent in a different order and the surgeons were again asked to comment on AVN. Inter- and intra-observer reliability was determined using kappa statistics. The intra-observer reliability ranged from 0.65 to 0.88 with an average score of 0.76. Inter-observer reliability showed greater variability, ranging from 0.41 to 0.77 with an average score of 0.56 across all surgeons. Although the intra-rater reliability produced a strength of "good" and the inter-rater reliability a strength of "moderate" agreement, the variability within these scores is clinically important as it demonstrates the difficulty in identifying AVN. This may explain the variability in AVN that is reported in the literature. The need for further education and research in the diagnosis of AVN in children with CP who have undergone reconstructive hip surgery is clinically necessary.
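
    The abstract reports kappa statistics for yes/no readings but does not say how they were computed across four surgeons. As a hedged illustration only, the sketch below implements unweighted Cohen's kappa for a single pair of raters; the ratings are invented for the example and the name `cohen_kappa` is an assumption, not the study's code.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every case
    return (observed - expected) / (1 - expected)


# Hypothetical AVN readings for ten radiographs (not study data).
surgeon_1 = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "yes", "no"]
surgeon_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
print(round(cohen_kappa(surgeon_1, surgeon_2), 2))  # prints 0.58
```

    With more than two raters, one option is to average such pairwise values; whether the study did so is not stated in the abstract.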

  20. The Pfirrmann classification of lumbar intervertebral disc degeneration: an independent inter- and intra-observer agreement assessment.

    PubMed

    Urrutia, Julio; Besa, Pablo; Campos, Mauricio; Cikutovic, Pablo; Cabezon, Mario; Molina, Marcelo; Cruz, Juan Pablo

    2016-09-01

    Grading inter-vertebral disc degeneration (IDD) is important in the evaluation of many degenerative conditions, including patients with low back pain. Magnetic resonance imaging (MRI) is considered the best imaging instrument to evaluate IDD. The Pfirrmann classification is commonly used to grade IDD; the authors describing this classification showed an adequate agreement using it; however, there has been a paucity of independent agreement studies using this grading system. The aim of this study was to perform an independent inter- and intra-observer agreement study using the Pfirrmann classification. T2-weighted sagittal images of 79 patients consecutively studied with lumbar spine MRI were classified using the Pfirrmann grading system by six evaluators (three spine surgeons and three radiologists). After a 6-week interval, the 79 cases were presented to the same evaluators in a random sequence for repeat evaluation. The intra-class correlation coefficient (ICC) and the weighted kappa (wκ) were used to determine the inter- and intra-observer agreement. The inter-observer agreement was excellent, with an ICC = 0.94 (0.93-0.95) and wκ = 0.83 (0.74-0.91). There were no differences between spine surgeons and radiologists. Likewise, there were no differences in agreement evaluating the different lumbar discs. Most differences among observers were only of one grade. Intra-observer agreement was also excellent with ICC = 0.86 (0.83-0.89) and wκ = 0.89 (0.85-0.93). In this independent study, the Pfirrmann classification demonstrated an adequate agreement among different observers and by the same observer on separate occasions. Furthermore, it allows communication between radiologists and spine surgeons.

  1. Agreement in electrocardiogram interpretation in patients with septic shock.

    PubMed

    Mehta, Sangeeta; Granton, John; Lapinsky, Stephen E; Newton, Gary; Bandayrel, Kristofer; Little, Anjuli; Siau, Chuin; Cook, Deborah J; Ayers, Dieter; Singer, Joel; Lee, Terry C; Walley, Keith R; Storms, Michelle; Cooper, Jamie; Holmes, Cheryl L; Hebert, Paul; Gordon, Anthony C; Presneill, Jeff; Russell, James A

    2011-09-01

    The reliability of electrocardiogram interpretation to diagnose myocardial ischemia in critically ill patients is unclear. In adults with septic shock, we assessed intra- and inter-rater agreement of electrocardiogram interpretation, and the effect of knowledge of troponin values on these interpretations. Prospective substudy of a randomized trial of vasopressin vs. norepinephrine in septic shock. Nine Canadian intensive care units. Adults with septic shock requiring at least 5 μg/min of norepinephrine for 6 hrs. Twelve-lead electrocardiograms were recorded before study drug, and 6 hrs, 2 days, and 4 days after study drug initiation. Two physician readers, blinded to patient data and group, independently interpreted electrocardiograms on three occasions (first two readings were blinded to patient data; third reading was unblinded to troponin). To calibrate and refine definitions, both readers initially reviewed 25 trial electrocardiograms representing normal to abnormal. Cohen's Kappa and the φ statistic were used to analyze intra- and inter-rater agreement. One hundred twenty-one patients (62.2 ± 16.5 yrs, Acute Physiology and Chronic Health Evaluation II 28.6 ± 7.7) had 373 electrocardiograms. Blinded to troponin, readers 1 and 2 interpreted 46.4% and 30.0% of electrocardiograms as normal, and 15.3% and 12.3% as ischemic, respectively. Intrarater agreement was moderate for overall ischemia (κ 0.54 and 0.58), moderate/good for "normal" (κ 0.69 and 0.55), fair to good for specific signs of ischemia (ST elevation, T inversion, and Q waves, reader 1 κ 0.40 to 0.69; reader 2 κ 0.56 to 0.70); and good/very good for atrial arrhythmias (κ 0.84 and 0.79) and bundle branch block (κ 0.88 and 0.79). Inter-rater agreement was fair for ischemia (κ 0.29), moderate for ST elevation (κ 0.48), T inversion (κ 0.52), and Q waves (κ 0.44), good for bundle branch block (κ 0.78), and very good for atrial arrhythmias (κ 0.83). Inter-rater agreement for ischemia improved

  2. An Instrument to Assess the Obesogenic Environment of Child Care Centers

    ERIC Educational Resources Information Center

    Ward, Dianne; Hales, Derek; Haverly, Katie; Marks, Julie; Benjamin, Sara; Ball, Sarah; Trost, Stewart

    2008-01-01

    Objectives: To describe protocol and interobserver agreements of an instrument to evaluate nutrition and physical activity environments at child care. Methods: Interobserver data were collected from 9 child care centers, through direct observation and document review (17 observer pairs). Results: Mean agreement between observer pairs was 87.26%…

  3. Assessing distractors and teamwork during surgery: developing an event-based method for direct observation.

    PubMed

    Seelandt, Julia C; Tschan, Franziska; Keller, Sandra; Beldi, Guido; Jenni, Nadja; Kurmann, Anita; Candinas, Daniel; Semmer, Norbert K

    2014-11-01

    To develop a behavioural observation method to simultaneously assess distractors and communication/teamwork during surgical procedures through direct, on-site observations; to establish the reliability of the method for long (>3 h) procedures. Observational categories for an event-based coding system were developed based on expert interviews, observations and a literature review. Using Cohen's κ and the intraclass correlation coefficient, interobserver agreement was assessed for 29 procedures. Agreement was calculated for the entire surgery, and for the 1st hour. In addition, interobserver agreement was assessed between two tired observers and between a tired and a non-tired observer after 3 h of surgery. The observational system has five codes for distractors (door openings, noise distractors, technical distractors, side conversations and interruptions), eight codes for communication/teamwork (case-relevant communication, teaching, leadership, problem solving, case-irrelevant communication, laughter, tension and communication with external visitors) and five contextual codes (incision, last stitch, personnel changes in the sterile team, location changes around the table and incidents). Based on 5-min intervals, Cohen's κ was good to excellent for distractors (0.74-0.98) and for communication/teamwork (0.70-1). Based on frequency counts, intraclass correlation coefficient was excellent for distractors (0.86-0.99) and good to excellent for communication/teamwork (0.45-0.99). After 3 h of surgery, Cohen's κ was 0.78-0.93 for distractors, and 0.79-1 for communication/teamwork. The observational method developed allows a single observer to simultaneously assess distractors and communication/teamwork. Even for long procedures, high interobserver agreement can be achieved. Data collected with this method allow for investigating separate or combined effects of distractions and communication/teamwork on surgical performance and patient outcomes. Published by the

  4. Assessing the inter-observer variability of Computer-Aided Nodule Assessment and Risk Yield (CANARY) to characterize lung adenocarcinomas.

    PubMed

    Nakajima, Erica C; Frankland, Michael P; Johnson, Tucker F; Antic, Sanja L; Chen, Heidi; Chen, Sheau-Chiann; Karwoski, Ronald A; Walker, Ronald; Landman, Bennett A; Clay, Ryan D; Bartholmai, Brian J; Rajagopalan, Srinivasan; Peikert, Tobias; Massion, Pierre P; Maldonado, Fabien

    2018-01-01

    Lung adenocarcinoma (ADC), the most common lung cancer type, is recognized increasingly as a disease spectrum. To guide individualized patient care, a non-invasive means of distinguishing indolent from aggressive ADC subtypes is needed urgently. Computer-Aided Nodule Assessment and Risk Yield (CANARY) is a novel computed tomography (CT) tool that characterizes early ADCs by detecting nine distinct CT voxel classes, representing a spectrum of lepidic to invasive growth, within an ADC. CANARY characterization has been shown to correlate with ADC histology and patient outcomes. This study evaluated the inter-observer variability of CANARY analysis. Three novice observers segmented and analyzed independently 95 biopsy-confirmed lung ADCs from Vanderbilt University Medical Center/Nashville Veterans Administration Tennessee Valley Healthcare system (VUMC/TVHS) and the Mayo Clinic (Mayo). Inter-observer variability was measured using intra-class correlation coefficient (ICC). The average ICC for all CANARY classes was 0.828 (95% CI 0.76, 0.895) for the VUMC/TVHS cohort, and 0.852 (95% CI 0.804, 0.901) for the Mayo cohort. The most invasive voxel classes had the highest ICC values. To determine whether nodule size influenced inter-observer variability, an additional cohort of 49 sub-centimeter nodules from Mayo were also segmented by three observers, with similar ICC results. Our study demonstrates that CANARY ADC classification between novice CANARY users has an acceptably low degree of variability, and supports the further development of CANARY for clinical application.

  5. Segmentation precision of abdominal anatomy for MRI-based radiotherapy

    PubMed Central

    Noel, Camille E.; Zhu, Fan; Lee, Andrew Y.; Yanle, Hu; Parikh, Parag J.

    2014-01-01

    The limited soft tissue visualization provided by computed tomography, the standard imaging modality for radiotherapy treatment planning and daily localization, has motivated studies on the use of magnetic resonance imaging (MRI) for better characterization of treatment sites, such as the prostate and head and neck. However, no studies have been conducted on MRI-based segmentation for the abdomen, a site that could greatly benefit from enhanced soft tissue targeting. We investigated the interobserver and intraobserver precision in segmentation of abdominal organs on MR images for treatment planning and localization. Manual segmentation of 8 abdominal organs was performed by 3 independent observers on MR images acquired from 14 healthy subjects. Observers repeated segmentation 4 separate times for each image set. Interobserver and intraobserver contouring precision was assessed by computing 3-dimensional overlap (Dice coefficient [DC]) and distance to agreement (Hausdorff distance [HD]) of segmented organs. The mean and standard deviation of intraobserver and interobserver DC and HD values were as follows: intraobserver DC = 0.89 ± 0.12, intraobserver HD = 3.6 ± 1.5 mm, interobserver DC = 0.89 ± 0.15, and interobserver HD = 3.2 ± 1.4 mm. Overall, metrics indicated good interobserver/intraobserver precision (mean DC > 0.7, mean HD < 4 mm). Results suggest that MRI offers good segmentation precision for abdominal sites. These findings support the utility of MRI for abdominal planning and localization, as emerging MRI technologies, techniques, and onboard imaging devices are beginning to enable MRI-based radiotherapy. PMID:24726701
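
    The two metrics named in this abstract can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' pipeline: segmentations are represented as sets of voxel index tuples, distances are in voxel units rather than millimetres, and the Hausdorff distance is taken over all voxels instead of over organ surfaces. The function names are hypothetical.

```python
import math


def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for two collections of voxels."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty segmentations agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    a, b = list(points_a), list(points_b)
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from any source point to its nearest target point.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Two hypothetical 4x4 single-slice contours, offset by one voxel in x.
observer_1 = {(x, y, 0) for x in range(4) for y in range(4)}
observer_2 = {(x, y, 0) for x in range(1, 5) for y in range(4)}
print(dice_coefficient(observer_1, observer_2))    # prints 0.75
print(hausdorff_distance(observer_1, observer_2))  # prints 1.0
```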

  6. A Time for Flexible Donor Agreements.

    ERIC Educational Resources Information Center

    Fischer, Gerald B.

    2003-01-01

    Discusses why volatile markets and new donor expectations make now a good time to rework payout rates and gift agreements to bolster financial and strategic performance. Suggests seven options for action. (EV)

  7. A Novel Method for Measuring Anterior Segment Area of the Eye on Ultrasound Biomicroscopic Images Using Photoshop

    PubMed Central

    Wu, Ziqiang; Lin, Jialiu; Huang, Jingjing

    2015-01-01

    Purpose: To describe a novel method for quantitative measurement of area parameters in ocular anterior segment ultrasound biomicroscopy (UBM) images using Photoshop software and to assess its intraobserver and interobserver reproducibility. Methods: Twenty healthy volunteers with wide angles and twenty patients with narrow or closed angles were consecutively recruited. UBM images were obtained and analyzed using Photoshop software by two physicians with different-level training on two occasions. Borders of anterior segment structures including cornea, iris, lens, and zonules in the UBM image were semi-automatically defined by the Magnetic Lasso Tool in the Photoshop software according to the pixel contrast and modified by the observers. Anterior chamber area (ACA), posterior chamber area (PCA), iris cross-section area (ICA) and angle recess area (ARA) were drawn and measured. The intraobserver and interobserver reproducibilities of the anterior segment area parameters and scleral spur location were assessed by limits of agreement, coefficient of variation (CV), and intraclass correlation coefficient (ICC). Results: All of the parameters were successfully measured by Photoshop. The intraobserver and interobserver reproducibilities of ACA, PCA, and ICA were good, with no more than 5% CV and more than 0.95 ICC, while the CVs of ARA were within 20%. The intraobserver and interobserver reproducibilities for defining the spur location were more than 0.97 ICCs. Although the operating times for both observers were less than 3 minutes per image, there was significant difference in the measuring time between two observers with different levels of training (p<0.001). Conclusion: Measurements of ocular anterior segment areas on UBM images by Photoshop showed good intraobserver and interobserver reproducibilities. The methodology was easy to adopt and effective in measuring. PMID:25803857
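
    Of the reproducibility measures listed in this abstract, limits of agreement are the simplest to illustrate. The sketch below assumes the usual Bland-Altman form (mean difference ± 1.96 standard deviations of the paired differences); the abstract does not state which variant was used, and the measurements shown are invented.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)             # systematic offset between observers
    spread = 1.96 * stdev(diffs)   # half-width of the 95% limits
    return bias - spread, bias, bias + spread


# Hypothetical anterior chamber areas (mm^2) from two observers.
observer_1 = [20.1, 18.4, 22.3, 19.8, 21.0]
observer_2 = [19.9, 18.9, 22.0, 20.1, 20.8]
lower, bias, upper = limits_of_agreement(observer_1, observer_2)
print(f"bias = {bias:.2f}, limits = [{lower:.2f}, {upper:.2f}]")
```

    Narrow limits centred near zero indicate that the two observers can be used interchangeably for that parameter.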

  8. Caregiver person-centeredness and behavioral symptoms during mealtime interactions: development and feasibility of a coding scheme.

    PubMed

    Gilmore-Bykovskyi, Andrea L

    2015-01-01

    Mealtime behavioral symptoms are distressing and frequently interrupt eating for the individual experiencing them and others in the environment. A computer-assisted coding scheme was developed to measure caregiver person-centeredness and behavioral symptoms for nursing home residents with dementia during mealtime interactions. The purpose of this pilot study was to determine the feasibility, ease of use, and inter-observer reliability of the coding scheme, and to explore the clinical utility of the coding scheme. Trained observers coded 22 observations. Data collection procedures were acceptable to participants. Overall, the coding scheme proved to be feasible, easy to execute and yielded good to very good inter-observer agreement following observer re-training. The coding scheme captured clinically relevant, modifiable antecedents to mealtime behavioral symptoms, but would be enhanced by the inclusion of measures for resident engagement and consolidation of items for measuring caregiver person-centeredness that co-occurred and were difficult for observers to distinguish. Published by Elsevier Inc.

  9. Inter-observer variability in diagnosing radiological features of aneurysmal subarachnoid hemorrhage; a preliminary single centre study comparing observers from different specialties and levels of training.

    PubMed

    Siddiqui, Usman T; Khan, Anjum F; Shamim, Muhammad Shahzad; Hamid, Rana Shoaib; Alam, Muhammad Mehboob; Emaduddin, Muhammad

    2014-01-01

    A noncontrast computed tomography (CT) scan remains the initial radiological investigation of choice for a patient with suspected aneurysmal subarachnoid hemorrhage (aSAH). This initial scan may be used to derive key information about the underlying aneurysm which may aid in further management. The interpretation, however, is subject to the skill and experience of the interpreting individual. The authors here evaluate the interpretation of such CT scans by different individuals at different levels of training, and in two different specialties (Radiology and Neurosurgery). Initial noncontrast CT scan of 35 patients with aSAH was evaluated independently by four different observers. The observers selected for the study included two from Radiology and two from Neurosurgery at different levels of training; a resident currently in mid training and a resident who had recently graduated from training of each specialty. Measured variables included interpreter's suspicion of presence of subarachnoid blood, side of the subarachnoid hemorrhage, location of the aneurysm, the aneurysm's proximity to vessel bifurcation, number of aneurysm(s), contour of aneurysm(s), presence of intraventricular hemorrhage (IVH), intracerebral hemorrhage (ICH), infarction, hydrocephalus and midline shift. To determine the inter-observer variability (IOV), weighted kappa values were calculated. There was moderate agreement on most of the CT scan findings among all observers. Substantial agreement was found amongst all observers for hydrocephalus, IVH, and ICH. Lowest agreement rates were seen in the location of aneurysm being supra or infra tentorial. There were, however, some noteworthy exceptions. There was substantial to almost perfect agreement between the radiology graduate and radiology resident on most CT findings. The lowest agreement was found between the neurosurgery graduate and the radiology graduate. Our study suggests that although agreements were seen in the interpretation of some of

  10. Pena to review LHC agreement

    SciTech Connect

    Lawler, A.

    The US government plans to review its tentative agreement with Europe to help build the Large Hadron Collider (LHC), to make sure it is a good deal for this country. The review, announced last week by Energy Secretary Federico Pena, comes at the urging of Representative James Sensenbrenner (R-WI), who chairs the House Science Committee. Agency officials say they are confident that most of the lawmaker's concerns can be met with only minor changes to the proposed partnership, while European managers insist that the current agreement already addresses most of Sensenbrenner's worries.

  11. Accelerated convergence for synchronous approximate agreement

    NASA Technical Reports Server (NTRS)

    Kearns, J. P.; Park, S. K.; Sjogren, J. A.

    1988-01-01

    The protocol for synchronous approximate agreement presented by Dolev et al. exhibits the undesirable property that a faulty processor, by the dissemination of a value arbitrarily far removed from the values held by good processors, may delay the termination of the protocol by an arbitrary amount of time. Such behavior is clearly undesirable in a fault tolerant dynamic system subject to hard real-time constraints. A mechanism is presented by which editing data suspected of being from Byzantine-failed processors can lead to quicker, predictable convergence to an agreement value. Under specific assumptions about the nature of values transmitted by failed processors relative to those transmitted by good processors, a Monte Carlo simulation is presented whose qualitative results illustrate the trade-off between accelerated convergence and the accuracy of the value agreed upon.
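
    The idea of editing suspect values before averaging can be illustrated with a generic trimmed-mean round. This is a sketch of one common fault-tolerant averaging rule, assuming at most `max_faulty` bad values per round; it is not claimed to be the editing mechanism of the report, and the function name `edited_average` is hypothetical.

```python
def edited_average(values, max_faulty):
    """One round of averaging after discarding the extremes.

    Drops the max_faulty lowest and max_faulty highest received values,
    so that up to max_faulty arbitrary values cannot pull the result
    outside the range spanned by the good processors.
    """
    if max_faulty < 0 or len(values) <= 2 * max_faulty:
        raise ValueError("need more than 2 * max_faulty values")
    ordered = sorted(values)
    kept = ordered[max_faulty:len(ordered) - max_faulty]
    return sum(kept) / len(kept)


# Four good processors plus one value far removed from the rest.
good = [10.0, 10.2, 9.9, 10.1]
received = good + [1e6]
result = edited_average(received, max_faulty=1)
print(min(good) <= result <= max(good))  # prints True
```

    Without the editing step, the plain mean of `received` would be dominated by the outlier, which is the delay-inducing behavior the abstract describes.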

  12. Histological features associated with diagnostic agreement in atypical ductal hyperplasia of the breast: illustrative cases from the B-Path study.

    PubMed

    Allison, Kimberly H; Rendi, Mara H; Peacock, Sue; Morgan, Tom; Elmore, Joann G; Weaver, Donald L

    2016-12-01

    This study examined the case-specific characteristics associated with interobserver diagnostic agreement in atypical ductal hyperplasia (ADH) of the breast. Seventy-two test set cases with a consensus diagnosis of ADH from the B-Path study were evaluated. Cases were scored for 17 histological features, which were then correlated with the participant agreement with the consensus ADH diagnosis. Participating pathologists' perceptions of case difficulty, borderline features or whether they would obtain a second opinion were also examined for associations with agreement. Of the 2070 participant interpretations of the 72 consensus ADH cases, 48% were scored by participants as difficult and 45% as borderline between two diagnoses; the presence of both of these features was significantly associated with increased agreement (P < 0.001). A second opinion would have been obtained in 80% of interpretations, and this was associated with increased agreement (P < 0.001). Diagnostic agreement ranged from 10% to 89% on a case-by-case basis. Cases with papillary lesions, cribriform architecture and obvious cytological monotony were associated with higher agreement. Lower agreement rates were associated with solid or micropapillary architecture, borderline cytological monotony, or cases without a diagnostic area that was obvious on low power. The results of this study suggest that pathologists frequently recognize the challenge of ADH cases, with some cases being more prone to diagnostic variability. In addition, there are specific histological features associated with diagnostic agreement on ADH cases. Multiple example images from cases in this test set are provided to serve as educational illustrations of these challenges. © 2016 John Wiley & Sons Ltd.

  13. Histologic Features associated with Diagnostic Agreement in Atypical Ductal Hyperplasia of the Breast: Illustrative Cases from the B-Path Study

    PubMed Central

    Allison, Kimberly H.; Rendi, Mara H.; Peacock, Sue; Morgan, Tom; Elmore, Joann G.; Weaver, Donald L.

    2016-01-01

    Background: Case-specific characteristics associated with interobserver diagnostic agreement in atypical ductal hyperplasia (ADH) of the breast are poorly understood. Methods: Seventy-two test set cases with a consensus diagnosis of ADH from the B-Path study were evaluated. Cases were scored for 17 histologic features which were then correlated with the participant agreement with the consensus ADH diagnosis. Participating pathologists’ perceptions of case difficulty, borderline features, or if they would obtain a second opinion were also examined for associations with agreement. Results: Of the 2,070 participant interpretations on the 72 consensus ADH cases, 48% were scored by participants as difficult and 45% as borderline between two diagnoses; the presence of both of these features was significantly associated with increased agreement (p < 0.001). A second opinion would have been obtained in 80% of interpretations, and this was associated with increased agreement (p < 0.001). Diagnostic agreement ranged from 10–89% on a case-by-case basis. Cases with papillary lesions, cribriform architecture and obvious cytologic monotony were associated with higher agreement. Lower agreement rates were associated with solid or micro-papillary architecture, borderline cytologic monotony or cases without a diagnostic area that was obvious on low power. Conclusions: The results of this study suggest that pathologists frequently recognize the challenge of ADH cases with some cases more prone to diagnostic variability. In addition, there are specific histologic features associated with diagnostic agreement on ADH cases. Multiple example images from cases in this test set are provided to serve as educational illustrations of these challenges. PMID:27398812

  14. Intra- and Interobserver Variability of Cochlear Length Measurements in Clinical CT.

    PubMed

    Iyaniwura, John E; Elfarnawany, Mai; Riyahi-Alam, Sadegh; Sharma, Manas; Kassam, Zahra; Bureau, Yves; Parnes, Lorne S; Ladak, Hanif M; Agrawal, Sumit K

    2017-07-01

    The cochlear A-value measurement exhibits significant inter- and intraobserver variability, and its accuracy is dependent on the visualization method in clinical computed tomography (CT) images of the cochlea. An accurate estimate of the cochlear duct length (CDL) can be used to determine electrode choice, and frequency map the cochlea based on the Greenwood equation. Studies have described estimating the CDL using a single A-value measurement, however the observer variability has not been assessed. Clinical and micro-CT images of 20 cadaveric cochleae were acquired. Four specialists measured A-values on clinical CT images using both standard views and multiplanar reconstructed (MPR) views. Measurements were repeated to assess for intraobserver variability. Observer variabilities were evaluated using intra-class correlation and absolute differences. Accuracy was evaluated by comparison to the gold standard micro-CT images of the same specimens. Interobserver variability was good (average absolute difference: 0.77 ± 0.42 mm) using standard views and fair (average absolute difference: 0.90 ± 0.31 mm) using MPR views. Intraobserver variability had an average absolute difference of 0.31 ± 0.09 mm for the standard views and 0.38 ± 0.17 mm for the MPR views. MPR view measurements were more accurate than standard views, with average relative errors of 9.5 and 14.5%, respectively. There was significant observer variability in A-value measurements using both the standard and MPR views. Creating the MPR views increased variability between experts, however MPR views yielded more accurate results. Automated A-value measurement algorithms may help to reduce variability and increase accuracy in the future.

  15. Evaluation of the applicability of territorial arterial spin labeling in meningiomas for presurgical assessments compared with 3-dimensional time-of-flight magnetic resonance angiography.

    PubMed

    Lu, Yiping; Luan, Shihai; Liu, Li; Xiong, Ji; Wen, Jianbo; Qu, Jianxun; Geng, Daoying; Yin, Bo

    2017-10-01

    To prospectively evaluate the application of territorial arterial spin labelling (t-ASL) in comparison with unenhanced three-dimensional time-of-flight magnetic resonance angiography (3D-TOF-MRA) in the identification of the feeding vasculature of meningiomas. Thirty consecutive patients with suspected meningiomas underwent conventional MR imaging, unenhanced 3D-TOF-MRA and t-ASL scanning. Four experienced neuro-radiologists assessed the feeding vessels with different techniques separately. For the identification of the origin of the feeding arteries on t-ASL, the inter-observer agreement was excellent (κ = 0.913), while the inter-observer agreement of 3D-TOF-MRA was good (κ = 0.653). The inter-modality agreement between t-ASL and 3D-TOF-MRA for the feeding arteries was moderate (κ = 0.514). All 8 patients with motor or sensory disorders proved to have meningiomas supplied completely or partially by the internal carotid arteries, while all 14 patients with meningiomas supplied by the external carotid arteries or basilar arteries didn't show any symptoms concerning motor or sensory disorders (p = 0.003). T-ASL could complement unenhanced 3D-TOF-MRA and increase accuracy in the identification of the supplying arteries of meningiomas in a safe, intuitive, non-radioactive manner. The information about feeding arteries was potentially related to patients' symptoms and pathology, making it more crucial for neurosurgeons in planning surgery as well as evaluating prognosis. • A comprehensive understanding of feeding vasculature is helpful for optimized treatment decisions. • T-ASL could identify main supplying arteries of meningiomas with excellent inter-observer agreement. • The inter-modality agreement for identification of the main feeding arteries was moderate. • Blood supply from ICAs was related to motor or sensory disorders. • High-level meningiomas were found to have double main supplying arteries.

  16. Ultrasound as an Outcome Measure in Gout. A Validation Process by the OMERACT Ultrasound Working Group.

    PubMed

    Terslev, Lene; Gutierrez, Marwin; Schmidt, Wolfgang A; Keen, Helen I; Filippucci, Emilio; Kane, David; Thiele, Ralf; Kaeley, Gurjit; Balint, Peter; Mandl, Peter; Delle Sedie, Andrea; Hammer, Hilde Berner; Christensen, Robin; Möller, Ingrid; Pineda, Carlos; Kissin, Eugene; Bruyn, George A; Iagnocco, Annamaria; Naredo, Esperanza; D'Agostino, Maria Antonietta

    2015-11-01

    To summarize the work performed by the Outcome Measures in Rheumatology (OMERACT) Ultrasound (US) Working Group on the validation of US as a potential outcome measure in gout. Based on the lack of definitions, highlighted in a recent literature review on US as an outcome tool in gout, a series of iterative exercises were carried out to obtain consensus-based definitions on US elementary components in gout using a Delphi exercise and subsequently testing these definitions in static images and in patients with proven gout. Cohen's κ was used to test agreement, and values of 0-0.20 were considered poor, 0.20-0.40 fair, 0.40-0.60 moderate, 0.60-0.80 good, and 0.80-1 excellent. With an agreement of > 80%, consensus-based definitions were obtained for the 4 elementary lesions highlighted in the literature review: tophi, aggregates, erosions, and double contour (DC). In static images interobserver reliability ranged from moderate to almost perfect, and similar results were found for the intrareader reliability. In patients the intraobserver agreement was good for all lesions except DC (moderate). The interobserver agreement was poor for aggregates and DC but moderate for the other components. These first steps in evaluating the validity of US as an outcome measure for gout show that the reliability of the definitions ranged from moderate to excellent in static images and somewhat lower in patients, indicating that a standardized scanning technique may be needed, before testing the responsiveness of those definitions in a composite US score.
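
    The interpretation bands quoted in this abstract can be written as a small lookup. Because the stated ranges share their endpoints (0.20, 0.40, and so on), the sketch below assumes that each upper bound belongs to the lower band and that negative values count as poor; both choices are assumptions, as is the function name `describe_kappa`.

```python
def describe_kappa(kappa):
    """Map a Cohen's kappa value to the verbal bands used in the abstract."""
    if kappa > 1:
        raise ValueError("kappa cannot exceed 1")
    bands = [
        (0.20, "poor"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "good"),
        (1.00, "excellent"),
    ]
    for upper, label in bands:
        if kappa <= upper:  # assumption: upper bound is inclusive
            return label


for value in (0.15, 0.35, 0.56, 0.76, 0.83):
    print(value, describe_kappa(value))
```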

  17. Comparison of High-Resolution MR Imaging and Digital Subtraction Angiography for the Characterization and Diagnosis of Intracranial Artery Disease.

    PubMed

    Lee, N J; Chung, M S; Jung, S C; Kim, H S; Choi, C-G; Kim, S J; Lee, D H; Suh, D C; Kwon, S U; Kang, D-W; Kim, J S

    2016-12-01

    High-resolution MR imaging has recently been introduced as a promising diagnostic modality in intracranial artery disease. Our aim was to compare high-resolution MR imaging with digital subtraction angiography for the characterization and diagnosis of various intracranial artery diseases. Thirty-seven patients who had undergone both high-resolution MR imaging and DSA for intracranial artery disease were enrolled in our study (August 2011 to April 2014). The time interval between the high-resolution MR imaging and DSA was within 1 month. The degree of stenosis and the minimal luminal diameter were independently measured by 2 observers in both DSA and high-resolution MR imaging, and the results were compared. Two observers independently diagnosed intracranial artery diseases on DSA and high-resolution MR imaging. The time interval between the diagnoses on DSA and high-resolution MR imaging was 2 weeks. Interobserver diagnostic agreement for each technique and intermodality diagnostic agreement for each observer were acquired. High-resolution MR imaging showed moderate-to-excellent agreement (interclass correlation coefficient = 0.892-0.949; κ = 0.548-0.614) and significant correlations (R = 0.766-892) with DSA on the degree of stenosis and minimal luminal diameter. The interobserver diagnostic agreement was good for DSA (κ = 0.643) and excellent for high-resolution MR imaging (κ = 0.818). The intermodality diagnostic agreement was good (κ = 0.704) for observer 1 and moderate (κ = 0.579) for observer 2, respectively. High-resolution MR imaging may be an imaging method comparable with DSA for the characterization and diagnosis of various intracranial artery diseases. © 2016 by American Journal of Neuroradiology.

  18. 19 CFR 10.771 - Textile or apparel goods.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... Agreement Rules of Origin § 10.771 Textile or apparel goods. (a) De minimis. Except as provided in paragraph... specific rules specified in General Note 27(h), HTSUS, textile or apparel goods classifiable as goods put up in sets for retail sale as provided for in General Rule of Interpretation 3, HTSUS, will not be...

  19. Comparison of 3T and 7T susceptibility-weighted angiography of the substantia nigra in diagnosing Parkinson disease.

    PubMed

    Cosottini, M; Frosini, D; Pesaresi, I; Donatelli, G; Cecchi, P; Costagli, M; Biagi, L; Ceravolo, R; Bonuccelli, U; Tosetti, M

    2015-03-01

    Standard neuroimaging fails in defining the anatomy of the substantia nigra and has a marginal role in the diagnosis of Parkinson disease. Recently 7T MR target imaging of the substantia nigra has been useful in diagnosing Parkinson disease. We performed a comparative study to evaluate whether susceptibility-weighted angiography can diagnose Parkinson disease with a 3T scanner. Fourteen patients with Parkinson disease and 13 healthy subjects underwent MR imaging examination at 3T and 7T by using susceptibility-weighted angiography. Two expert blinded observers and 1 neuroradiology fellow evaluated the 3T and 7T images of the sample to identify substantia nigra abnormalities indicative of Parkinson disease. Diagnostic accuracy and intra- and interobserver agreement were calculated separately for 3T and 7T acquisitions. Susceptibility-weighted angiography 7T MR imaging can diagnose Parkinson disease with a mean sensitivity of 93%, specificity of 100%, and diagnostic accuracy of 96%. 3T MR imaging diagnosed Parkinson disease with a mean sensitivity of 79%, specificity of 94%, and diagnostic accuracy of 86%. Intraobserver and interobserver agreement was excellent at 7T. At 3T, intraobserver agreement was excellent for experts, and interobserver agreement ranged between good and excellent. The less expert reader obtained a diagnostic accuracy of 89% at 3T. Susceptibility-weighted angiography images obtained at 3T and 7T differentiate controls from patients with Parkinson disease with a higher diagnostic accuracy at 7T. The capability of 3T in diagnosing Parkinson disease might encourage its use in clinical practice. The use of the more accurate 7T should be supported by a dedicated cost-effectiveness study. © 2015 by American Journal of Neuroradiology.
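
    Sensitivity, specificity, and diagnostic accuracy of the kind quoted above are derived from a 2x2 table of reader calls against the reference diagnosis. The following is a minimal Python sketch under stated assumptions: the helper name `diagnostic_metrics` is hypothetical, and the counts are illustrative values of the same order as the cohort size, not the study's actual per-reader results.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from 2x2 counts.

    tp/fn: diseased subjects called positive/negative by the reader.
    tn/fp: healthy subjects called negative/positive by the reader.
    """
    diseased = tp + fn
    healthy = tn + fp
    if diseased == 0 or healthy == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    sensitivity = tp / diseased
    specificity = tn / healthy
    accuracy = (tp + tn) / (diseased + healthy)
    return sensitivity, specificity, accuracy


# Illustrative counts only (assumption): 14 patients, 13 controls.
sens, spec, acc = diagnostic_metrics(tp=13, fn=1, tn=13, fp=0)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

    Because the abstract reports means over several readers, a single table like this one would be computed per reader and then averaged.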

  20. Radiographic classifications in Perthes disease

    PubMed Central

    Huhnstock, Stefan; Svenningsen, Svein; Merckoll, Else; Catterall, Anthony; Terjesen, Terje; Wiig, Ola

    2017-01-01

    Background and purpose: Different radiographic classifications have been proposed for prediction of outcome in Perthes disease. We assessed whether the modified lateral pillar classification would provide more reliable interobserver agreement and prognostic value compared with the original lateral pillar classification and the Catterall classification. Patients and methods: 42 patients (38 boys) with Perthes disease were included in the interobserver study. Their mean age at diagnosis was 6.5 (3–11) years. 5 observers classified the radiographs in 2 separate sessions according to the Catterall classification, the original and the modified lateral pillar classifications. Interobserver agreement was analysed using weighted kappa statistics. We assessed the associations between the classifications and femoral head sphericity at 5-year follow-up in 37 non-operatively treated patients in a crosstable analysis (Gamma statistics for ordinal variables, γ). Results: The original lateral pillar and Catterall classifications showed moderate interobserver agreement (kappa 0.49 and 0.43, respectively) while the modified lateral pillar classification had fair agreement (kappa 0.40). The original lateral pillar classification was strongly associated with the 5-year radiographic outcome, with a mean γ correlation coefficient of 0.75 (95% CI: 0.61–0.95) among the 5 observers. The modified lateral pillar and Catterall classifications showed moderate associations (mean γ correlation coefficient 0.55 [95% CI: 0.38–0.66] and 0.64 [95% CI: 0.57–0.72], respectively). Interpretation: The Catterall classification and the original lateral pillar classification had sufficient interobserver agreement and association to late radiographic outcome to be suitable for clinical use. Adding the borderline B/C group did not increase the interobserver agreement or prognostic value of the original lateral pillar classification. PMID:28613966
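
    Weighted kappa, the statistic named in the abstract above, gives partial credit when two observers pick neighbouring ordinal grades. The sketch below is one common formulation (disagreement weights, linear by default); the function name `weighted_kappa`, the choice of weights, and the ratings are assumptions for illustration, since the abstract does not state which weighting scheme was used.

```python
def weighted_kappa(rater_a, rater_b, categories, power=1):
    """Weighted kappa for two raters on an ordinal scale.

    power=1 gives linear weights, power=2 quadratic weights.
    `categories` lists the grades in their ordinal order.
    """
    k = len(categories)
    n = len(rater_a)
    if k < 2 or n == 0 or n != len(rater_b):
        raise ValueError("need >= 2 categories and equal-length, non-empty ratings")
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions of (grade by A, grade by B).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[index[a]][index[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    chance_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal
            observed_disagreement += w * obs[i][j]
            chance_disagreement += w * row[i] * col[j]
    if chance_disagreement == 0:
        return float("nan")  # undefined when only one grade is ever used
    return 1.0 - observed_disagreement / chance_disagreement


# Hypothetical ratings on a three-grade ordinal scale (A < B < C).
obs_1 = ["A", "B", "C", "A", "B", "C"]
obs_2 = ["A", "B", "C", "B", "C", "C"]
print(round(weighted_kappa(obs_1, obs_2, ["A", "B", "C"]), 3))
```

    With more than two observers, as in the study, one would typically average the pairwise values or use a multi-rater statistic; the abstract does not say which approach was taken.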

  1. Reliability of joint count assessment in rheumatoid arthritis: a systematic literature review.

    PubMed

    Cheung, Peter P; Gossec, Laure; Mak, Anselm; March, Lyn

    2014-06-01

    Joint counts are central to the assessment of rheumatoid arthritis (RA) but reliability is an issue. To evaluate the reliability and agreement of joint counts (intra-observer and inter-observer) by health care professionals (physicians, nurses, and metrologists) and patients in RA, and the impact of training and standardization on joint count reliability through a systematic literature review. Articles reporting joint count reliability or agreement in RA in PubMed, EMBase, and the Cochrane library between 1960 and 2012 were selected. Data were extracted regarding tender joint counts (TJCs) and swollen joint counts (SJCs) derived by physicians, metrologists, or patients for intra-observer and inter-observer reliability. In addition, methods and effects of training or standardization were extracted. Statistics expressing reliability such as intraclass correlation coefficients (ICCs) were extracted. Data analysis was primarily descriptive due to high heterogeneity. Twenty-eight studies on health care professionals (HCP) and 20 studies on patients were included. Intra-observer reliability for TJCs and SJCs was good for HCPs and patients (range of ICC: 0.49-0.98). Inter-observer reliability between HCPs for TJCs was higher than for SJCs (range of ICC: 0.64-0.88 vs. 0.29-0.98). Patient inter-observer reliability with HCPs as comparators was better for TJCs (range of ICC: 0.31-0.91) compared to SJCs (0.16-0.64). Nine studies (7 with HCPs and 2 with patients) evaluated consensus or training, with improvement in reliability of TJCs but conflicting evidence for SJCs. Intra- and inter-observer reliability was high for TJCs for HCPs and patients: among all groups, reliability was better for TJCs than SJCs. Inter-observer reliability of SJCs was poorer for patients than HCPs. Data were inconclusive regarding the potential for training to improve SJC reliability. Overall, the results support further evaluation for patient-reported joint counts as an outcome measure. © 2013

  2. The effect of dental artifacts, contrast media, and experience on interobserver contouring variations in head and neck anatomy.

    PubMed

    O'Daniel, Jennifer C; Rosenthal, David I; Garden, Adam S; Barker, Jerry L; Ahamad, Anesa; Ang, K Kian; Asper, Joshua A; Blanco, Angel I; de Crevoisier, Renaud; Holsinger, F Christopher; Patel, Chirag B; Schwartz, David L; Wang, He; Dong, Lei

    2007-04-01

    To investigate interobserver variability in the delineation of head-and-neck (H&N) anatomic structures on CT images, including the effects of image artifacts and observer experience. Nine observers (7 radiation oncologists, 1 surgeon, and 1 physician assistant) with varying levels of H&N delineation experience independently contoured H&N gross tumor volumes and critical structures on radiation therapy treatment planning CT images alongside reference diagnostic CT images for 4 patients with oropharynx cancer. Image artifacts from dental fillings partially obstructed 3 images. Differences in the structure volumes, center-of-volume positions, and boundary positions (1 SD) were measured. In-house software created three-dimensional overlap distributions, including all observers. The effects of dental artifacts and observer experience on contouring precision were investigated, and the need for contrast media was assessed. In the absence of artifacts, all 9 participants achieved reasonable precision (1 SD ≤ 3 mm, all boundaries). The structures obscured by dental image artifacts had larger variations when measured by the 3 metrics (1 SD = 8 mm cranial/caudal boundary). Experience improved the interobserver consistency of contouring for structures obscured by artifacts (1 SD = 2 mm cranial/caudal boundary). Interobserver contouring variability for anatomic H&N structures, specifically oropharyngeal gross tumor volumes and parotid glands, was acceptable in the absence of artifacts. Dental artifacts increased the contouring variability, but experienced participants achieved reasonable precision even with artifacts present. With a staging contrast CT image as a reference, delineation on a noncontrast treatment planning CT image can achieve acceptable precision.

  3. Do Orthopaedic Oncologists Agree on the Diagnosis and Treatment of Cartilage Tumors of the Appendicular Skeleton?

    PubMed

    Zamora, Tomas; Urrutia, Julio; Schweitzer, Daniel; Amenabar, Pedro Pablo; Botello, Eduardo

    2017-09-01

    Distinguishing a benign enchondroma from a low-grade chondrosarcoma is a common diagnostic challenge for orthopaedic oncologists. Low interrater agreement has been observed for the diagnosis of cartilaginous neoplasms among radiologists and pathologists, but, to our knowledge, no study has evaluated inter- and intraobserver agreement among orthopaedic oncologists grading these lesions using initial clinical and imaging information. Determining such agreement is important since it reflects the certainty in the diagnosis by orthopaedic oncologists. Agreement also is important as it will guide future treatment and prognosis, considering that there is no gold standard for diagnosis of these lesions. (1) To determine inter- and intraobserver agreement among a multinational panel of expert orthopaedic oncologists in diagnosing cartilaginous neoplasms based on their assessment of clinical symptoms and imaging at diagnosis. (2) To describe the most important clinical and imaging features that experts use during the initial diagnostic process. (3) To determine interobserver agreement for proposed initial treatment strategies for cartilaginous neoplasms by this panel of evaluators. Thirty-nine patients with intramedullary cartilaginous neoplasms of the appendicular skeleton of various histopathologic grades were selected and classified as having benign, low-grade malignant, or intermediate- or high-grade malignant neoplasms by 10 experienced orthopaedic oncologists based on clinical and imaging information. Additionally, they chose the three most important clinical or imaging features for the diagnosis of these neoplasms, and they proposed a treatment strategy for each patient. The Kappa coefficient (κ) was used to determine inter- and intraobserver agreement. Inter- and intraobserver agreements were only fair to good, κ = 0.44 (95% CI, 0.41-0.48) and κ = 0.62 (95% CI, 0.52-0.72), respectively. The three factors most frequently identified as helpful in making the diagnosis

  4. Flat Urothelial Lesions With Atypia: Interobserver Concordance and Added Value of Immunohistochemical Profiling.

    PubMed

    Lawless, Margaret E; Tretiakova, Maria S; True, Lawrence D; Vakar-Lopez, Funda

    2018-03-01

    Distinguishing urothelial carcinoma in situ (CIS) from other flat lesions of the urinary bladder with cytologic atypia is critically important for the management of patients with bladder neoplasia. However, there is high interpathologist variability in making these distinctions. The aim of this study is to assess interobserver agreement between general and specialized genitourinary pathologists, and to compare these diagnoses with those rendered after an immunohistochemical panel is performed. We hypothesized that addition of a set of immunohistochemical stains would reduce the number of cases classified within intermediate categories of atypia of uncertain significance and low-grade dysplasia. Two genitourinary pathologists independently assessed haematoxylin and eosin (H&E)-stained sections of 127 bladder biopsies from each of the 4 International Society of Urological Pathology/World Health Organization categories of flat lesions diagnosed by general pathologists. A subset of biopsies from 49 patients was reassessed after staining with a 3-antibody panel (CD44, CK20, and p53) and the results were correlated with patient follow-up. Based on these immunohistochemistry (IHC) stains, 26 cases (53.1%) were recategorized. Of most clinical importance, 5 of 27 cases (18.5%) originally diagnosed as either atypia of uncertain significance or low-grade dysplasia were recategorized as CIS, and recurrent disease was identified on subsequent biopsies. None of the 10 cases diagnosed as CIS based on H&E stains were recategorized. This triad of IHC stains can improve the precision of pathologic diagnosis of histologically atypical urothelial lesions of flat bladder mucosa. We recommend that pathologists apply this set of IHC stains to such lesions they find problematic based on H&E stains.

  5. Intramodality and intermodality agreement in radiography and computed tomography of equine distal limb fractures.

    PubMed

    Crijns, C P; Martens, A; Bergman, H-J; van der Veen, H; Duchateau, L; van Bree, H J J; Gielen, I M V L

    2014-01-01

    Computed tomography (CT) is increasingly accessible in equine referral hospitals. To document the level of agreement within and between radiography and CT in characterising equine distal limb fractures. Retrospective descriptive study. Images from horses that underwent radiographic and CT evaluation for suspected distal limb fractures were reviewed, including 27 horses and 3 negative controls. Using Cohen's kappa and weighted kappa analysis, the level of agreement among 4 observers for a predefined set of diagnostic characteristics for radiography and CT separately and for the level of agreement between the 2 imaging modalities were documented. Both CT and radiography had very good intramodality agreement in identifying fractures, but intermodality agreement was lower. There was good intermodality and intramodality agreement for anatomical localisation and the identification of fracture displacement. Agreement for articular involvement, fracture comminution and fracture fragment number was towards the lower limit of good agreement. There was poor to fair intermodality agreement regarding fracture orientation, fracture width and coalescing cracks; intramodality agreement was higher for CT than for radiography for these features. Further studies, including comparisons with surgical and/or post mortem findings, are required to determine the sensitivity and specificity of CT and radiography in the diagnosis and characterisation of equine distal limb fractures. © 2013 EVJ Ltd.
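
    Cohen's kappa, used in the abstract above, corrects the raw proportion of matching calls for the agreement two observers would reach by chance. A minimal Python sketch, assuming a hypothetical helper `cohen_kappa` and made-up fracture/no-fracture calls (1 = fracture seen, 0 = not seen) that are not taken from the study:

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")

    # Proportion of cases on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        return float("nan")  # undefined: both raters used one identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls by two observers on ten image sets.
observer_1 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
observer_2 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
print(round(cohen_kappa(observer_1, observer_2), 3))
```

    Here the observers match on 8 of 10 cases, yet kappa is noticeably lower than 0.8 because about half of that agreement would be expected by chance alone.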

  6. Intra- and inter-observer reliability of ten major histological scoring systems used for the evaluation of in vivo cartilage repair.

    PubMed

    Bonasia, Davide Edoardo; Marmotti, Antongiulio; Massa, Alessandro Domenico Felice; Ferro, Andrea; Blonna, Davide; Castoldi, Filippo; Rossi, Roberto

    2015-09-01

    In the last two decades, many surgical techniques have been described for articular cartilage repair. Reliable histological scoring systems are fundamental tools to evaluate new procedures. Several histological scoring systems have been described, and these can be divided into elementary and comprehensive scores, according to the number of sub-items. The aim of this study was to test the inter- and intra-observer reliability of ten main scores used for the histological evaluation of in vivo cartilage repair. The authors tested the starting hypothesis that elementary scores would show superior intra- and inter-observer reliability compared with comprehensive scores. Fifty histological sections obtained from the trochlea of New Zealand Rabbit and stained with Safranin-O fast green were used. The histological sections were analysed by 4 observers: 2 experienced in cartilage histology and 2 inexperienced. Histological evaluations were performed at time 1 and time 2, separated by a 30-day interval. The following scores were used: Mankin, O'Driscoll, Pineda, Wakitani, Fortier, Sellers, ICRS, ICRSII, Oswestry (OsScore) and modified O'Driscoll. Intra- and inter-observer reliability were evaluated for each score. In addition, the floor-ceiling effect and the Bland-Altman Coefficient of Repeatability were then evaluated for each sub-item of every score. Intra-observer reliability was high for all observers in every score, even though the reliability was significantly lower for non-expert observers compared with expert counterparts. In terms of Coefficient of Repeatability, some scores performed better (O'Driscoll, Modified O'Driscoll and ICRSII) than others (Fortier, Sellers). Inter-observer reliability was high for all observers in every score, but significantly lower for non-expert compared with expert observers. In expert hands, all the scores showed high intra- and inter-observer reliability, independently of the complexity. Although every score has advantages and

  7. [Identification of adverse events in hospitalised influenza patients].

    PubMed

    Aranaz-Andrés, J M; Gea-Velázquez de Castro, M T; Jiménez-Pericás, F; Balbuena-Segura, A I; Meyer-García, M C; López-Fresneña, N; Miralles-Bueno, J J; Obón-Azuara, B; Moliner-Lahoz, J; Aibar-Remón, C

    2015-01-01

    To test the inter-observer agreement in identifying adverse events (AE) in patients hospitalized for flu and undergoing precautionary isolation measures. Historical cohort study, 50 patients undergoing isolation measures due to flu, and 50 patients without any isolation measures. The AE incidence ranges from 10 to 26% depending on the observer (26% [95%CI: 17.4%-34.60%], 10% [95%CI: 4.12%-15.88%], and 23% [95%CI: 14.75%-31.25%]). It was always lower in the cohort undergoing the isolation measures. This difference is statistically significant when the accurate definition of a case is applied. The agreement as regards the screening was good (higher than 76%; Kappa index between 0.29 and 0.81). The agreement as regards the accurate identification of AE related to care was lower (from 50 to 93.3%, Kappa index from 0.20 to 0.70). Before performing an epidemiological study on AE, interobserver concordance must be analyzed to improve the accuracy of the results and the validity of the study. Studies have different levels of reliability. Kappa index shows high levels for the screening guide, but not for the identification of AE. Without a good methodology the results achieved, and thus the decisions made from them, cannot be guaranteed. Researchers have to be sure of the method used, which should be as close as possible to the optimal achievable. Copyright © 2014 SECA. Published by Elsevier Espana. All rights reserved.

  8. Assessment of colon polyp morphology: Is education effective?

    PubMed Central

    Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja

    2017-01-01

    AIM: To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. METHODS: For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. RESULTS: The overall Fleiss’ kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. CONCLUSION: The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy. PMID:28974894
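
    Fleiss' kappa, reported in the abstract above, extends chance-corrected agreement to more than two raters who each assign one category per case. The sketch below assumes a hypothetical helper `fleiss_kappa` that takes, for each case, the number of raters choosing each category; the three morphology labels and the counts are invented for illustration and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a cases x categories table of rater counts.

    counts[i][j] = number of raters who put case i into category j.
    Every case must be rated by the same number of raters (>= 2).
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Overall share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Per-case agreement: fraction of rater pairs that chose the same category.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    mean_agreement = sum(p_case) / n_cases
    chance_agreement = sum(p * p for p in p_cat)
    if chance_agreement == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (mean_agreement - chance_agreement) / (1.0 - chance_agreement)


# Hypothetical table: 5 video clips, 4 raters, 3 morphology labels
# (columns: sessile, pedunculated, flat).
table = [
    [4, 0, 0],
    [0, 4, 0],
    [2, 2, 0],
    [0, 1, 3],
    [3, 0, 1],
]
print(round(fleiss_kappa(table), 3))
```

    Running the same function on pre-test, post-test, and follow-up tables would yield three values comparable to the ones quoted in the abstract.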

  9. Assessment of colon polyp morphology: Is education effective?

    PubMed

    Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja

    2017-09-14

    To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. The overall Fleiss' kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy.

  10. Scoring haemophilic arthropathy on X-rays: improving inter- and intra-observer reliability and agreement using a consensus atlas.

    PubMed

    Foppen, Wouter; van der Schaaf, Irene C; Beek, Frederik J A; Verkooijen, Helena M; Fischer, Kathelijn

    2016-06-01

    The radiological Pettersson score (PS) is widely applied for classification of arthropathy to evaluate costly haemophilia treatment. This study aims to assess and improve inter- and intra-observer reliability and agreement of the PS. Two series of X-rays (bilateral elbows, knees, and ankles) of 10 haemophilia patients (120 joints) with haemophilic arthropathy were scored by three observers according to the PS (maximum score 13/joint). Subsequently, (dis-)agreement in scoring was discussed until consensus. Example images were collected in an atlas. Thereafter, a second series of 120 joints was scored using the atlas. One observer rescored the second series after three months. Reliability was assessed by intraclass correlation coefficients (ICC), agreement by limits of agreement (LoA). Median Pettersson score at joint level (PSjoint) of affected joints was 6 (interquartile range 3-9). Using the consensus atlas, inter-observer reliability of the PSjoint improved significantly from 0.94 (95% confidence interval (CI) 0.91-0.96) to 0.97 (CI 0.96-0.98). LoA improved from ±1.7 to ±1.1 for the PSjoint. Therefore, true differences in arthropathy were differences in the PSjoint of >2 points. Intra-observer reliability of the PSjoint was 0.98 (CI 0.97-0.98), intra-observer LoA were ±0.9 points. Reliability and agreement of the PS improved by using a consensus atlas. • Reliability of the Pettersson score significantly improved using the consensus atlas. • The presented consensus atlas improved the agreement among observers. • The consensus atlas could be recommended to obtain a reproducible Pettersson score.
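
    Limits of agreement, the agreement measure named in the abstract above, are commonly computed in the Bland-Altman manner as the mean paired difference plus or minus 1.96 standard deviations of the differences. The sketch below assumes that convention (the abstract does not spell out its exact computation), a hypothetical helper `limits_of_agreement`, and invented joint scores.

```python
from statistics import mean, stdev


def limits_of_agreement(scores_a, scores_b):
    """Return (lower, bias, upper) Bland-Altman 95% limits of agreement."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two paired scores")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)                   # systematic offset between observers
    half_width = 1.96 * stdev(diffs)     # sample SD of the paired differences
    return bias - half_width, bias, bias + half_width


# Hypothetical joint-level scores from two observers (0-13 scale).
observer_1 = [6, 3, 9, 5, 7]
observer_2 = [5, 4, 9, 6, 6]
lower, bias, upper = limits_of_agreement(observer_1, observer_2)
print(f"bias={bias:.2f}, LoA=({lower:.2f}, {upper:.2f})")
```

    A narrower interval means two observers' scores on the same joint differ less, which is the sense in which the abstract reports an improvement from ±1.7 to ±1.1.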

  11. Interobserver Variability of Radiographic Assessment Using a Mobile Messaging Application as a Teleconsultation Tool

    PubMed Central

    Özkan, Sezai; Mellema, Jos J.; Ring, David; Chen, Neal C.

    2017-01-01

    Background: To examine whether interobserver reliability, decision-making, and confidence in decision-making in the treatment of distal radius fractures changes if radiographs are viewed on a messenger application on a mobile phone compared to a standard DICOM viewer. Methods: Radiographs of distal radius fractures were presented to surgeons on either a smart phone using a mobile messenger application or a laptop using a DICOM viewer application. Twenty observers participated: 10 (50%) were randomly assigned to the DICOM viewer group and 10 (50%) to the mobile messenger group. Each observer was asked to evaluate the cases and (1) classify the fracture type according to the AO classification, (2) recommend operative or conservative treatment and (3) rate their confidence about this decision. Results: There was no significant difference in interobserver reliability for AO classification and recommendation for surgery for distal radius fractures in both groups. The percentage of recommendation for surgery was significantly higher in the messenger application group compared to the DICOM viewer group (89% versus 78%, P=0.019) and the confidence for treatment decision was significantly higher in the mobile messenger group compared to the DICOM viewer group (8.9 versus 7.9, P=0.026). Conclusion: Messenger applications on mobile phones could facilitate remote decision-making for patients with distal radius fractures, but should be used with caution. PMID:29226202

  12. Identification of Nasal Bone Fractures on Conventional Radiography and Facial CT: Comparison of the Diagnostic Accuracy in Different Imaging Modalities and Analysis of Interobserver Reliability

    PubMed Central

    Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin

    2013-01-01

    Background: There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. Objectives: To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. Patients and Methods: A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Results: Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. Conclusion: For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee. PMID:24348599

  13. Identification of Nasal Bone Fractures on Conventional Radiography and Facial CT: Comparison of the Diagnostic Accuracy in Different Imaging Modalities and Analysis of Interobserver Reliability.

    PubMed

    Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin

    2013-09-01

    There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee.

  14. Good things come to those who wait: late first offers facilitate creative agreements in negotiation.

    PubMed

    Sinaceur, Marwan; Maddux, William W; Vasiljevic, Dimitri; Perez Nückel, Ricardo; Galinsky, Adam D

    2013-06-01

    Although previous research has shown that making the first offer leads to a distributive advantage in negotiations, the current research explored how the timing of first offers affects the creativity of negotiation agreements. We hypothesized that making the first offer later rather than earlier in the negotiation would facilitate the discovery of creative agreements that better meet the parties' underlying interests. Experiment 1 demonstrated that compared with early first offers, late first offers facilitated creative agreements that better met the parties' underlying interests. Experiments 2a and 2b controlled for the duration of the negotiation and conceptually replicated this effect. The last two studies also demonstrated that the beneficial effect of late first offers was mediated by greater information exchange. Thus, negotiators need to consider the timing of first offers to fully capitalize on the first offer advantage. Implications for our understanding of creativity, motivated information exchange, and timing in negotiations are discussed.

  15. Avoiding or restricting defectors in public goods games?

    PubMed

    Han, The Anh; Pereira, Luís Moniz; Lenaerts, Tom

    2015-02-06

    When creating a public good, strategies or mechanisms are required to handle defectors. We first show mathematically and numerically that prior agreements with posterior compensations provide a strategic solution that leads to substantial levels of cooperation in the context of public goods games, results that are corroborated by available experimental data. Notwithstanding this success, one cannot, as with other approaches, fully exclude the presence of defectors, raising the question of how they can be dealt with to avoid the demise of the common good. We show that both avoiding creation of the common good, whenever full agreement is not reached, and limiting the benefit that disagreeing defectors can acquire, using costly restriction mechanisms, are relevant choices. Nonetheless, restriction mechanisms are found the more favourable, especially in larger group interactions. Given decreasing restriction costs, introducing restraining measures to cope with public goods free-riding issues is the ultimate advantageous solution for all participants, rather than avoiding its creation. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  16. A comparative agreement evaluation of two subaxial cervical spine injury classification systems: the AOSpine and the Allen and Ferguson schemes.

    PubMed

    Urrutia, Julio; Zamora, Tomas; Campos, Mauricio; Yurac, Ratko; Palma, Joaquin; Mobarec, Sebastian; Prada, Carlos

    2016-07-01

    We performed an agreement study using two subaxial cervical spine classification systems: the AOSpine and the Allen and Ferguson (A&F) classifications. We sought to determine which scheme allows better agreement by different evaluators and by the same evaluator on different occasions. Complete imaging studies of 65 patients with subaxial cervical spine injuries were classified by six evaluators (three spine sub-specialists and three senior orthopaedic surgery residents) using the AOSpine subaxial cervical spine classification system and the A&F scheme. The cases were displayed in a random sequence after a 6-week interval for repeat evaluation. The Kappa coefficient (κ) was used to determine inter- and intra-observer agreement. Inter-observer: considering the main AO injury types, the agreement was substantial for the AOSpine classification [κ = 0.61 (0.57-0.64)]; using AO sub-types, the agreement was moderate [κ = 0.57 (0.54-0.60)]. For the A&F classification, the agreement [κ = 0.46 (0.42-0.49)] was significantly lower than using the AOSpine scheme. Intra-observer: the agreement was substantial considering injury types [κ = 0.68 (0.62-0.74)] and considering sub-types [κ = 0.62 (0.57-0.66)]. Using the A&F classification, the agreement was also substantial [κ = 0.66 (0.61-0.71)]. No significant differences were observed between spine surgeons and orthopaedic residents in the overall inter- and intra-observer agreement, or in the inter- and intra-observer agreement of specific type of injuries. The AOSpine classification (using the four main injury types or at the sub-types level) allows a significantly better agreement than the A&F classification. The A&F scheme does not allow reliable communication between medical professionals.
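
    As a rough illustration of the statistic reported above, the following Python sketch computes an unweighted Cohen's kappa for two raters and attaches the descriptive labels (moderate, substantial) that the abstract appears to use, which match the Landis and Koch bands. The function names and the ten example ratings are hypothetical; they are not data from the study, which pooled six evaluators.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


def landis_koch_label(kappa):
    """Descriptive label for a kappa value using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical injury types assigned by two evaluators to ten cases.
evaluator_1 = ["A", "A", "B", "B", "C", "A", "B", "C", "A", "B"]
evaluator_2 = ["A", "A", "B", "C", "C", "A", "B", "B", "A", "B"]
kappa = cohen_kappa(evaluator_1, evaluator_2)
print(round(kappa, 4), landis_koch_label(kappa))
```

    Under these bands the reported values of 0.61, 0.57, and 0.46 fall into the substantial, moderate, and moderate ranges respectively, consistent with the wording of the abstract.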

  17. The definition of radiological signs in gastric ulcer and assessment of their validity by inter-observer variation study.

    PubMed

    Schulman, A; Simpkins, K C

    1975-07-01

    The initial aim was to program a computer with information on the frequency of radiological signs in benign and malignant gastric ulcers in order to obtain a percentage probability of benignancy or malignancy in succeeding ulcers in clinical practice. However, only four of the many signs described in gastric ulcer were confirmed to be of validity (i.e. reliable existence) by an inter-observer variation study using two observers and the films from 69 barium meal examinations. These were projection or non-projection of the in-profile ulcer, presence or absence of adjacent mucosal folds, good or poor definition of the in-face ulcer's edge, and extension of radiating folds to the in-face ulcer's edge. A few more remained unassessed due to insufficient numbers of relevant cases. It is concluded that: as defined in the literature the majority of radiological signs in this field are of uncertain existence; and the four that were found to be valid do not fully describe the important appearances that may be seen in benign and malignant ulcers and would be inadequate to differentiate them to a sufficiently high degree of probability.

  18. Interrater agreement in the interpretation of neonatal electroencephalography in hypoxic-ischemic encephalopathy.

    PubMed

    Wusthoff, Courtney J; Sullivan, Joseph; Glass, Hannah C; Shellhaas, Renée A; Abend, Nicholas S; Chang, Taeun; Tsuchida, Tammy N

    2017-03-01

    Research using neonatal electroencephalography (EEG) has been limited by a lack of a standardized classification system and interpretation terminology. In 2013, the American Clinical Neurophysiology Society (ACNS) published a guideline for standardized terminology and categorization in the description of continuous EEG in neonates. We sought to assess interrater agreement for this neonatal EEG categorization system as applied by a group of pediatric neurophysiologists. A total of 60 neonatal EEG studies were collected from three institutions. All EEG segments were from term neonates with hypoxic-ischemic encephalopathy. Three pediatric neurophysiologists independently reviewed each record using the ACNS standardized scoring system. Unweighted kappa values were calculated for interrater agreement of categorical data across multiple observers. Interrater agreement was very good for identification of seizures (κ = 0.93, p < 0.001), with perfect agreement in 95% of records (57 of 60). Interrater agreement was moderate for classifying records as normal or having any abnormality (κ = 0.49, p < 0.001), with perfect agreement in 78% of records (47 of 60). Interrater agreement was good in classifying EEG backgrounds on a 5-category scale (normal, excessively discontinuous, burst suppression, status epilepticus, or electrocerebral inactivity) (κ = 0.70, p < 0.001), with perfect agreement in 72% of records (43 of 60). Other specific background features had lower agreement, including voltage (κ = 0.41, p < 0.001), variability (κ = 0.35, p < 0.001), symmetry (κ = 0.18, p = 0.01), presence of abnormal sharp waves (κ < 0.20, p < 0.05), and presence of brief rhythmic discharges (κ < 0.20, p < 0.05). We found good or very good interrater agreement applying the ACNS system for identification of seizures and classification of EEG background. Other specific EEG features showed limited interrater agreement. Of importance to both clinicians and
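
    The abstract does not name the multi-rater statistic it used. One common choice for unweighted agreement across more than two raters is Fleiss' kappa, and the Python sketch below assumes that choice. The function names and the ten example records are hypothetical and are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters who put record i in category j."""
    n_records = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every record needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Mean per-record agreement: share of rater pairs that agree on a record.
    per_record = [(sum(c * c for c in row) - n_raters)
                  / (n_raters * (n_raters - 1)) for row in counts]
    p_observed = sum(per_record) / n_records
    # Chance agreement from the overall share of ratings in each category.
    shares = [sum(row[j] for row in counts) / (n_records * n_raters)
              for j in range(n_categories)]
    p_chance = sum(s * s for s in shares)
    if p_chance == 1:
        raise ValueError("kappa is undefined when all ratings share a category")
    return (p_observed - p_chance) / (1 - p_chance)


def full_agreement_share(counts):
    """Fraction of records on which all raters chose the same category."""
    n_raters = sum(counts[0])
    return sum(max(row) == n_raters for row in counts) / len(counts)


# Hypothetical data: 3 raters, categories [seizure, no seizure], 10 records.
records = [[3, 0]] * 3 + [[0, 3]] * 6 + [[2, 1]]
print(round(fleiss_kappa(records), 3), full_agreement_share(records))
```

    Reporting the share of records with full agreement next to kappa, as the abstract does (for example 57 of 60), helps because kappa also depends on how common each category is.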

  19. 19 CFR 10.599 - Fungible goods and materials.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... America-United States Free Trade Agreement Rules of Origin § 10.599 Fungible goods and materials. (a... 19 Customs Duties 1 2010-04-01 2010-04-01 false Fungible goods and materials. 10.599 Section 10.599 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF...

  20. Interobserver Reliability of Peripheral Muscle Strength Tests and Short Physical Performance Battery in Patients With Chronic Obstructive Pulmonary Disease: A Prospective Observational Study.

    PubMed

    Medina-Mirapeix, Francesc; Bernabeu-Mora, Roberto; Llamazares-Herrán, Eduardo; Sánchez-Martínez, Ma Piedad; García-Vidal, José Antonio; Escolar-Reina, Pilar

    2016-11-01

    To evaluate the interobserver reliability of the Short Physical Performance Battery (SPPB) and hand dynamometry when measuring isometric muscle strength in people with chronic obstructive pulmonary disease (COPD). Reliability study. Each patient was assessed by a pulmonology physician and a physical therapist in 2 separate sessions 7 to 14 days apart (mean, 9.8±0.8d). Each rater was blinded to the other's results. Pneumology unit of a public hospital. Random sample of outpatients with stable COPD (N=30). Not applicable. SPPB and muscle strength (kg) using electronic handgrip and handheld dynamometers. Reliability was assessed with intraclass correlation coefficients (ICCs), standard error of measurement values, and Bland-Altman plots. ICCs were calculated for the SPPB summary score and for its 3 subscales. The ICCs for the overall reliability of the SPPB summary score and for grip and quadriceps strength were .82 (95% confidence interval [CI], .62-.91), .97 (95% CI, .93-.98), and .76 (95% CI, .49-.88), respectively. The standard error of measurement values were .55 points, 1.30kg, and 1.22kg, respectively. The mean differences between the raters' scores were near zero for grip strength and SPPB summary score measures. The ICCs for the SPPB subscales were .84 (95% CI, .66-.92) for the chair subscale, .75 (95% CI, .48-.88) for gait, and .33 (95% CI, -.42 to .68) for balance. Interobserver reliability was good for quadriceps and handgrip dynamometry and for the SPPB summary score and its chair stand and gait speed subscales. Both pulmonary physicians and physical therapists can obtain and exchange the scores. Because the reliability of the balance subscale was questionable, it is better to use the SPPB summary score. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
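
    Two of the quantities named above can be sketched briefly in Python. The Bland-Altman summary below uses the conventional bias plus or minus 1.96 standard deviations of the paired differences, and the standard error of measurement uses the common formula SD × sqrt(1 − ICC). The abstract does not say which formulas the authors applied, so both are assumptions, and the grip-strength values are invented for illustration.

```python
import math
import statistics


def bland_altman(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = statistics.mean(diffs)      # mean difference between the raters
    spread = statistics.stdev(diffs)   # sample SD of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


def standard_error_of_measurement(sd, icc):
    """SEM from a between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1 - icc)


# Hypothetical grip strength (kg) recorded by two raters for five patients.
physician = [30.0, 25.5, 41.0, 18.0, 33.5]
therapist = [29.0, 26.5, 40.0, 19.0, 33.5]
bias, lower, upper = bland_altman(physician, therapist)
print(bias, lower, upper)
print(standard_error_of_measurement(sd=2.0, icc=0.75))
```

    A bias near zero with narrow limits of agreement is what "mean differences near zero" describes; the SEM expresses the same reliability in the units of the measurement.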

  1. Comparison of translabial three-dimensional ultrasound with magnetic resonance imaging for measurement of levator hiatal biometry at rest.

    PubMed

    Vergeldt, T F M; Notten, K J B; Stoker, J; Fütterer, J J; Beets-Tan, R G; Vliegen, R F A; Schweitzer, K J; Mulder, F E M; van Kuijk, S M J; Roovers, J P W R; Kluivers, K B; Weemhoff, M

    2016-05-01

    To compare translabial three-dimensional (3D) ultrasound with magnetic resonance imaging (MRI) for the measurement of levator hiatal biometry at rest in women with pelvic organ prolapse, and to determine the interobserver reliability between two independent observers for ultrasound and MRI measurements. Data were derived from a multicenter prospective cohort study in which women scheduled for conventional anterior colporrhaphy underwent translabial 3D ultrasound and MRI prior to surgery. Intraclass correlation coefficients (ICCs) were calculated to estimate interobserver reliability between two independent observers and determine the agreement between ultrasound and MRI measurements. Bland-Altman plots were created to assess the agreement between ultrasound and MRI measurements. Data from 139 women from nine hospitals were included in the study. The interobserver reliability of ultrasound assessment at rest, during Valsalva maneuver and during contraction and of MRI assessment at rest was moderate or good. The agreement between ultrasound and MRI for the measurement of levator hiatal biometry at rest was moderate, with ICCs of 0.52 (95%CI, 0.32-0.66) for levator hiatal area, 0.44 (95%CI, 0.21-0.60) for anteroposterior diameter and 0.44 (95%CI, 0.22-0.60) for transverse diameter. Levator hiatal biometry measurements were statistically significantly larger on MRI than on translabial 3D ultrasound. The agreement between translabial 3D ultrasound and MRI for measurement of the levator hiatus at rest in women with pelvic organ prolapse was only moderate. The results of translabial 3D ultrasound and MRI should therefore not be used interchangeably in daily practice or in clinical research. Copyright © 2015 ISUOG. Published by John Wiley & Sons Ltd.

  2. 2 CFR 176.170 - Notice of Required Use of American Iron, Steel, and Manufactured Goods (covered under...

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ..., Steel, and Manufactured Goods (covered under International Agreements)-Section 1605 of the American... Required Use of American Iron, Steel, and Manufactured Goods (covered under International Agreements... repair of a public building or public work, and involve iron, steel, and/or manufactured goods covered...

  3. 2 CFR 176.170 - Notice of Required Use of American Iron, Steel, and Manufactured Goods (covered under...

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ..., Steel, and Manufactured Goods (covered under International Agreements)-Section 1605 of the American... American Iron, Steel, and Manufactured Goods (covered under International Agreements)—Section 1605 of the... building or public work, and involve iron, steel, and/or manufactured goods covered under international...

  4. 2 CFR 176.170 - Notice of Required Use of American Iron, Steel, and Manufactured Goods (covered under...

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ..., Steel, and Manufactured Goods (covered under International Agreements)-Section 1605 of the American... Required Use of American Iron, Steel, and Manufactured Goods (covered under International Agreements... repair of a public building or public work, and involve iron, steel, and/or manufactured goods covered...

  5. 2 CFR 176.170 - Notice of Required Use of American Iron, Steel, and Manufactured Goods (covered under...

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ..., Steel, and Manufactured Goods (covered under International Agreements)-Section 1605 of the American... Iron, Steel, and Manufactured Goods (covered under International Agreements)—Section 1605 of the... building or public work, and involve iron, steel, and/or manufactured goods covered under international...

  6. A novel scoring system to measure radiographic abnormalities and related spirometric values in cured pulmonary tuberculosis.

    PubMed

    Báez-Saldaña, Renata; López-Arteaga, Yesenia; Bizarrón-Muro, Alma; Ferreira-Guerrero, Elizabeth; Ferreyra-Reyes, Leticia; Delgado-Sánchez, Guadalupe; Cruz-Hervert, Luis Pablo; Mongua-Rodríguez, Norma; García-García, Lourdes

    2013-01-01

    Despite chemotherapy, patients with cured pulmonary tuberculosis may be left with impaired lung function. To evaluate a novel scoring system based on the degree of radiographic abnormalities and related spirometric values in patients with cured pulmonary tuberculosis. One hundred and twenty seven patients with cured pulmonary tuberculosis were prospectively enrolled in a referral hospital specializing in respiratory diseases. Spirometry was performed and the extent of radiographic abnormalities was evaluated twice by each of two readers to generate a novel quantitative score. Scoring reproducibility was analyzed by the intra-class correlation coefficient (ICC) and the Bland-Altman method. Multiple linear regression models were performed to assess the association of the extent of radiographic abnormalities with spirometric values. The intra-observer agreement for scoring of radiographic abnormalities (SRA) showed an ICC of 0.81 (CI:95%, 0.67-0.95) and 0.78 (CI:95%, 0.65-0.92), for reader 1 and 2, respectively. Inter-observer reproducibility for the first measurement was 0.83 (CI:95%, 0.71-0.95), and for the second measurement was 0.74 (CI:95%, 0.58-0.90). The Bland-Altman analysis of the intra-observer agreement showed a mean bias of 0.87% and -0.55% and an inter-observer agreement of -0.35% and -1.78%, indicating a minor average systematic variability. After adjustment for age, gender, height, smoking status, pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and negatively associated with absolute and percent predicted values of FVC: -0.07 (CI:95%, -0.01 to -0.04); -2.48 (CI:95%, -3.45 to -1.50); and FEV1 -0.07 (CI:95%, -0.10 to -0.05); -2.92 (CI:95%, -3.87 to -1.97) respectively, in the patients studied. The extent of radiographic abnormalities, as evaluated through our novel scoring system, was inversely associated with spirometric values, and exhibited good reliability and reproducibility. As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable.

  7. A Novel Scoring System to Measure Radiographic Abnormalities and Related Spirometric Values in Cured Pulmonary Tuberculosis

    PubMed Central

    Báez-Saldaña, Renata; López-Arteaga, Yesenia; Bizarrón-Muro, Alma; Ferreira-Guerrero, Elizabeth; Ferreyra-Reyes, Leticia; Delgado-Sánchez, Guadalupe; Cruz-Hervert, Luis Pablo; Mongua-Rodríguez, Norma; García-García, Lourdes

    2013-01-01

    exhibited good reliability and reproducibility. As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable. PMID:24223865

  8. Perme Intensive Care Unit Mobility Score and ICU Mobility Scale: translation into Portuguese and cross-cultural adaptation for use in Brazil

    PubMed Central

    Kawaguchi, Yurika Maria Fogaça; Nawa, Ricardo Kenji; Figueiredo, Thais Borgheti; Martins, Lourdes; Pires-Neto, Ruy Camargo

    2016-01-01

    ABSTRACT Objective: To translate the Perme Intensive Care Unit Mobility Score and the ICU Mobility Scale (IMS) into Portuguese, creating versions that are cross-culturally adapted for use in Brazil, and to determine the interobserver agreement and reliability for both versions. Methods: The processes of translation and cross-cultural validation consisted in the following: preparation, translation, reconciliation, synthesis, back-translation, review, approval, and pre-test. The Portuguese-language versions of both instruments were then used by two researchers to evaluate critically ill ICU patients. Weighted kappa statistics and Bland-Altman plots were used in order to verify interobserver agreement for the two instruments. In each of the domains of the instruments, interobserver reliability was evaluated with Cronbach's alpha coefficient. The correlation between the instruments was assessed by Spearman's correlation test. Results: The study sample comprised 103 patients-56 (54%) of whom were male-with a mean age of 52 ± 18 years. The main reason for ICU admission (in 44%) was respiratory failure. Both instruments showed excellent interobserver agreement (κ > 0.90) and reliability (α > 0.90) in all domains. Interobserver bias was low for the IMS and the Perme Score (−0.048 ± 0.350 and −0.06 ± 0.73, respectively). The 95% CIs for the same instruments ranged from −0.73 to 0.64 and −1.50 to 1.36, respectively. There was also a strong positive correlation between the two instruments (r = 0.941; p < 0.001). Conclusions: In their versions adapted for use in Brazil, both instruments showed high interobserver agreement and reliability. PMID:28117473
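
    For ordinal scores such as mobility levels, a weighted kappa gives partial credit when two raters are only one level apart. The Python sketch below is one way to compute it. The abstract does not state whether linear or quadratic weights were used, so the weighting is left as a parameter, and the function name and example scores are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_levels, power=1):
    """Weighted kappa for ordinal scores coded 0 .. n_levels - 1.

    power=1 gives linear weights, power=2 gives quadratic weights.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty score lists")
    n = len(rater_a)
    # Cross-tabulate the two raters' scores.
    counts = [[0] * n_levels for _ in range(n_levels)]
    for a, b in zip(rater_a, rater_b):
        counts[a][b] += 1
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][j] for i in range(n_levels))
                  for j in range(n_levels)]

    def disagreement(i, j):
        # 0 on the diagonal, 1 for scores at opposite ends of the scale.
        return (abs(i - j) / (n_levels - 1)) ** power

    cells = [(i, j) for i in range(n_levels) for j in range(n_levels)]
    observed = sum(disagreement(i, j) * counts[i][j] for i, j in cells) / n
    expected = sum(disagreement(i, j) * row_totals[i] * col_totals[j]
                   for i, j in cells) / (n * n)
    return 1 - observed / expected


# Hypothetical three-level scores from two researchers for ten patients.
researcher_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
researcher_2 = [0, 1, 1, 1, 2, 1, 0, 1, 2, 2]
print(weighted_kappa(researcher_1, researcher_2, n_levels=3))
print(weighted_kappa(researcher_1, researcher_2, n_levels=3, power=2))
```

    Quadratic weights penalise large disagreements more heavily than linear ones, so the two settings can give noticeably different values on the same data.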

  9. Agreement of angle closure assessments between gonioscopy, anterior segment optical coherence tomography and spectral domain optical coherence tomography.

    PubMed

    Tay, Elton Lik Tong; Yong, Vernon Khet Yau; Lim, Boon Ang; Sia, Stelson; Wong, Elizabeth Poh Ying; Yip, Leonard Wei Leon

    2015-01-01

    To determine angle closure agreements between gonioscopy and anterior segment optical coherence tomography (AS-OCT), as well as gonioscopy and spectral domain OCT (SD-OCT). A secondary objective was to quantify inter-observer agreements of AS-OCT and SD-OCT assessments. Seventeen consecutive subjects (33 eyes) were recruited from the study hospital's Glaucoma clinic. Gonioscopy was performed by a glaucomatologist masked to OCT results. OCT images were read independently by 2 other glaucomatologists masked to gonioscopy findings as well as each other's analyses of OCT images. In total, 84.8% and 45.5% of scleral spurs were visualized in AS-OCT and SD-OCT images respectively (P<0.01). The agreement for angle closure between AS-OCT and gonioscopy was fair at k=0.31 (95% confidence interval, CI: 0.03-0.59) and k=0.35 (95% CI: 0.07-0.63) for reader 1 and 2 respectively. The agreement for angle closure between SD-OCT and gonioscopy was fair at k=0.21 (95% CI: 0.07-0.49) and slight at k=0.17 (95% CI: 0.08-0.42) for reader 1 and 2 respectively. The inter-reader agreement for angle closure in AS-OCT images was moderate at 0.51 (95% CI: 0.13-0.88). The inter-reader agreement for angle closure in SD-OCT images was slight at 0.18 (95% CI: 0.08-0.45). A significant proportion of scleral spurs were not visualised with SD-OCT imaging, resulting in weaker inter-reader agreements. Identifying other angle landmarks in SD-OCT images will allow more consistent angle closure assessments. Gonioscopy and OCT imaging do not always agree in angle closure assessments but have their own advantages, and should be used together and not exclusively.

  10. Agreement of angle closure assessments between gonioscopy, anterior segment optical coherence tomography and spectral domain optical coherence tomography

    PubMed Central

    Tay, Elton Lik Tong; Yong, Vernon Khet Yau; Lim, Boon Ang; Sia, Stelson; Wong, Elizabeth Poh Ying; Yip, Leonard Wei Leon

    2015-01-01

    AIM To determine angle closure agreements between gonioscopy and anterior segment optical coherence tomography (AS-OCT), as well as gonioscopy and spectral domain OCT (SD-OCT). A secondary objective was to quantify inter-observer agreements of AS-OCT and SD-OCT assessments. METHODS Seventeen consecutive subjects (33 eyes) were recruited from the study hospital's Glaucoma clinic. Gonioscopy was performed by a glaucomatologist masked to OCT results. OCT images were read independently by 2 other glaucomatologists masked to gonioscopy findings as well as each other's analyses of OCT images. RESULTS In total, 84.8% and 45.5% of scleral spurs were visualized in AS-OCT and SD-OCT images respectively (P<0.01). The agreement for angle closure between AS-OCT and gonioscopy was fair at k=0.31 (95% confidence interval, CI: 0.03-0.59) and k=0.35 (95% CI: 0.07-0.63) for reader 1 and 2 respectively. The agreement for angle closure between SD-OCT and gonioscopy was fair at k=0.21 (95% CI: 0.07-0.49) and slight at k=0.17 (95% CI: 0.08-0.42) for reader 1 and 2 respectively. The inter-reader agreement for angle closure in AS-OCT images was moderate at 0.51 (95% CI: 0.13-0.88). The inter-reader agreement for angle closure in SD-OCT images was slight at 0.18 (95% CI: 0.08-0.45). CONCLUSION A significant proportion of scleral spurs were not visualised with SD-OCT imaging, resulting in weaker inter-reader agreements. Identifying other angle landmarks in SD-OCT images will allow more consistent angle closure assessments. Gonioscopy and OCT imaging do not always agree in angle closure assessments but have their own advantages, and should be used together and not exclusively. PMID:25938053

  11. Automated 3D ultrasound measurement of the angle of progression in labor.

    PubMed

    Montaguti, Elisa; Rizzo, Nicola; Pilu, Gianluigi; Youssef, Aly

    2018-01-01

    To assess the feasibility and reliability of an automated technique for the assessment of the angle of progression (AoP) in labor by using three-dimensional (3D) ultrasound. AoP was assessed by using 3D transperineal ultrasound by two operators in 52 women in active labor to evaluate intra- and interobserver reproducibility. Furthermore, intermethod agreement between automated and manual techniques on 3D images, and between automated technique on 3D vs 2D images were evaluated. Automated measurements were feasible in all cases. Automated measurements were considered acceptable in 141 (90.4%) out of the 156 on the first assessments and in all 156 after repeating measurements for unacceptable evaluations. The automated technique on 3D images demonstrated good intra- and interobserver reproducibility. The 3D-automated technique showed a very good agreement with the 3D manual technique. Notably, AoP calculated with the 3D automated technique were significantly wider in comparison with those measured manually on 3D images (133 ± 17° vs 118 ± 21°, p = 0.013). The assessment of the angle of progression through 3D ultrasound is highly reproducible. However, automated software leads to a systematic overestimation of AoP in comparison with the standard manual technique thus hindering its use in clinical practice in its present form.

  12. Contrast-Enhanced and Time-of-Flight MRA at 3T Compared with DSA for the Follow-Up of Intracranial Aneurysms Treated with the WEB Device.

    PubMed

    Timsit, C; Soize, S; Benaissa, A; Portefaix, C; Gauvrit, J-Y; Pierot, L

    2016-09-01

    Imaging follow-up at 3T of intracranial aneurysms treated with the WEB Device has not been evaluated yet. Our aim was to assess the diagnostic accuracy of 3D-time-of-flight MRA and contrast-enhanced MRA at 3T against DSA, as the criterion standard, for the follow-up of aneurysms treated with the Woven EndoBridge (WEB) system. From June 2011 to December 2014, patients treated with the WEB in our institution, then followed for ≥6 months after treatment by MRA at 3T (3D-TOF-MRA and contrast-enhanced MRA) and DSA within 48 hours were included. Aneurysm occlusion was assessed with a simplified 2-grade scale (adequate occlusion [total occlusion + neck remnant] versus aneurysm remnant). Interobserver and intermodality agreement was evaluated by calculating the linear weighted κ. MRA test characteristics and predictive values were calculated from a 2 × 2 contingency table, by using DSA data as the standard of reference. Twenty-six patients with 26 WEB-treated aneurysms were included. The interobserver reproducibility was good with DSA (κ = 0.71) and contrast-enhanced-MRA (κ = 0.65) compared with moderate with 3D-TOF-MRA (κ = 0.47). Intermodality agreement with DSA was fair with both contrast-enhanced MRA (κ = 0.36) and 3D-TOF-MRA (κ = 0.36) for the evaluation of total occlusion. For aneurysm remnant detection, the prevalence was low (15%), on the basis of DSA, and both MRA techniques showed low sensitivity (25%), high specificity (100%), very good positive predictive value (100%), and very good negative predictive value (88%). Despite acceptable interobserver reproducibility and predictive values, the low sensitivity of contrast-enhanced MRA and 3D-TOF-MRA for aneurysm remnant detection suggests that MRA is a useful screening procedure for WEB-treated aneurysms, but similar to stents and flow diverters, DSA remains the criterion standard for follow-up. © 2016 by American Journal of Neuroradiology.
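
    The test characteristics quoted above all derive from one 2 × 2 table against DSA. The Python sketch below shows the arithmetic. The abstract does not report the cell counts; the counts used here (1 true positive, 0 false positives, 3 false negatives, 22 true negatives) are an assumption, chosen as one combination of 26 aneurysms that reproduces the reported percentages.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Test characteristics from a 2 x 2 table against a reference standard."""
    total = tp + fp + fn + tn
    return {
        "prevalence": (tp + fn) / total,   # reference-positive share
        "sensitivity": tp / (tp + fn),     # positives correctly detected
        "specificity": tn / (tn + fp),     # negatives correctly cleared
        "ppv": tp / (tp + fp),             # positive predictive value
        "npv": tn / (tn + fn),             # negative predictive value
    }


# Assumed counts for aneurysm remnant detection (MRA versus DSA), n = 26.
metrics = diagnostic_metrics(tp=1, fp=0, fn=3, tn=22)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

    With a prevalence this low, the negative predictive value stays high even though three of four remnants would be missed, which is why the abstract stresses the low sensitivity rather than the predictive values.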

  13. Reliability of internal oblique elbow radiographs for measuring displacement of medial epicondyle humerus fractures: a cadaveric study.

    PubMed

    Gottschalk, Hilton P; Bastrom, Tracey P; Edmonds, Eric W

    2013-01-01

    Standard elbow radiographs (AP and lateral views) are not accurate enough to measure true displacement of medial epicondyle fractures of the humerus. The amount of perceived displacement has been used to determine treatment options. This study assesses the utility of internal oblique radiographs for measurement of true displacement in these fractures. A medial epicondyle fracture was created in a cadaveric specimen. Displacement of the fragment (mm) was set at 5, 10, and 15 in line with the vector of the flexor pronator mass. The fragment was sutured temporarily in place. Radiographs were obtained at 0 (AP), 15, 30, 45, 60, 75, and 90 degrees (lateral) of internal rotation, with the elbow in set positions of flexion. This was done with and without radio-opaque markers placed on the fragment and fracture bed. The 45 and 60 degrees internal oblique radiographs were then presented to 5 separate reviewers (of different levels of training) to evaluate intraobserver and interobserver agreement. Change in elbow position did not affect the perceived displacement (P=0.82) with excellent intraobserver reliability (intraclass correlation coefficient range, 0.979 to 0.988) and interobserver agreement of 0.953. The intraclass correlation coefficient for intraobserver reliability on 45 degrees internal oblique films for all groups ranged from 0.985 to 0.998, with interobserver agreement of 0.953. For predicting displacement, the observers were 60% accurate in predicting the true displacement on the 45 degrees internal oblique films and only 35% accurate using the 60 degrees internal oblique view. Standardizing to a 45 degrees internal oblique radiograph of the elbow (regardless of elbow flexion) can augment the treating surgeon's ability to determine true displacement. At this degree of rotation, the measured number can be multiplied by 1.4 to better estimate displacement. The addition of a 45 degrees internal oblique radiograph in medial humeral epicondyle fractures has good

  14. Three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions sequence for routine imaging of the spine: preliminary experience.

    PubMed

    Tins, B; Cassar-Pullicino, V; Haddaway, M; Nachtrab, U

    2012-08-01

    The bulk of spinal imaging is still performed with conventional two-dimensional sequences. This study assesses the suitability of three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions (SPACE) sequence for routine spinal imaging. 62 MRI examinations of the spine were evaluated by 2 examiners in consensus for the depiction of anatomy and presence of artefact. We noted pathologies that might be missed using the SPACE sequence only or the SPACE and a sagittal T(1) weighted sequence. The reference standards were sagittal and axial T(1) weighted and T(2) weighted sequences. At a later date the evaluation was repeated by one of the original examiners and an additional examiner. There was good agreement of the single evaluations and consensus evaluation for the conventional sequences: κ>0.8, confidence interval (CI)>0.6-1.0. For the SPACE sequence, depiction of anatomy was very good for 84% of cases, with high interobserver agreement, but there was poor interobserver agreement for other cases. For artefact assessment of SPACE, κ=0.92, CI=0.92-1.0. The SPACE sequence was superior to conventional sequences for depiction of anatomy and artefact resistance. The SPACE sequence occasionally missed bone marrow oedema. In conjunction with sagittal T(1) weighted sequences, no abnormality was missed. The isotropic SPACE sequence was superior to conventional sequences in imaging difficult anatomy such as in scoliosis and spondylolysis. The SPACE sequence allows excellent assessment of anatomy owing to high spatial resolution and resistance to artefact. The sensitivity for bone marrow abnormalities is limited.

  15. Concordance between (99m)Tc-ECD SPECT and 18F-FDG PET interpretations in patients with cognitive disorders diagnosed according to NIA-AA criteria.

    PubMed

    Ito, Kimiteru; Shimano, Yasumasa; Imabayashi, Etsuko; Nakata, Yasuhiro; Omachi, Yoshie; Sato, Noriko; Arima, Kunimasa; Matsuda, Hiroshi

    2014-10-01

    The purpose of this study was to clarify the concordance of diagnostic abilities and interobserver agreement between 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) and brain perfusion single photon-emission computed tomography (SPECT) in patients with Alzheimer's disease (AD) who were diagnosed according to the research criteria of the National Institute of Aging-Alzheimer's Association Workshop. Fifty-five patients with "AD and mild cognitive impairment (MCI)" (n = 40) and "non-AD" (n = 15) were evaluated with 18F-FDG PET and (99m)Tc-ethyl cysteinate dimer (ECD) SPECT during an 8-week period. Three radiologists independently graded the regional uptake in the frontal, temporal, parietal, and occipital lobes as well as the precuneus/posterior cingulate cortex in both images. Kappa values were used to determine the interobserver reliability regarding regional uptake. The regions with better interobserver reliability between 18F-FDG PET and (99m)Tc-ECD SPECT were the frontal, parietal, and temporal lobes. The (99m)Tc-ECD SPECT agreement in the occipital lobes was not significant. The frontal, temporal, and parietal lobes showed good correlations between 18F-FDG PET and (99m)Tc-ECD SPECT in the degree of uptake, but the occipital lobe and precuneus/posterior cingulate cortex did not show good correlations. The diagnostic accuracy rates of "AD and MCI" ranged from 60% to 70% in both of the techniques. The degree of uptake on 18F-FDG PET and (99m)Tc-ECD SPECT showed significant correlations in the frontal, temporal, and parietal lobes. The diagnostic abilities of 18F-FDG PET and (99m)Tc-ECD SPECT for "AD and MCI," when diagnosed according to the National Institute of Aging-Alzheimer's Association Workshop criteria, were nearly identical. Copyright © 2014 John Wiley & Sons, Ltd.

  16. Optimizing prevention of hospital-acquired venous thromboembolism (VTE): prospective validation of a VTE risk assessment model.

    PubMed

    Maynard, Gregory A; Morris, Timothy A; Jenkins, Ian H; Stone, Sarah; Lee, Joshua; Renvall, Marian; Fink, Ed; Schoenhaus, Robert

    2010-01-01

    Hospital-acquired (HA) venous thromboembolism (VTE) is a common source of morbidity/mortality. Prophylactic measures are underutilized. Available risk assessment models/protocols are not prospectively validated. Improve VTE prophylaxis, reduce HA VTE, and prospectively validate a VTE risk-assessment model. Observational design. Academic medical center. Adult inpatients on medical/surgical services. A simple VTE risk assessment linked to a menu of preferred VTE prophylaxis methods, embedded in order sets. Education, audit/feedback, and concurrent identification of nonadherence. Randomly sampled inpatient audits determined the percent of patients with "adequate" VTE prevention. HA VTE cases were identified concurrently via digital imaging system. Interobserver agreement for VTE risk level and judgment of adequate prophylaxis were calculated from 150 random audits. Interobserver agreement with 5 observers was high (kappa score for VTE risk level = 0.81, and for judgment of "adequate" prophylaxis = 0.90). The percent of patients on adequate prophylaxis improved each of the 3 years (58%, 78%, and 93%; P < 0.001) and reached 98% in the last 6 months of 2007; 361 cases of HA VTE occurred over 3 years. Significant reductions for the risk of HA VTE (risk ratio [RR] = 0.69; 95% confidence interval [CI] = 0.47-0.79) and preventable HA VTE (RR = 0.14; 95% CI = 0.06-0.31) occurred. We detected no increase in heparin-induced thrombocytopenia (HIT) or prophylaxis-related bleeding using administrative data/chart review. We prospectively validated a VTE risk-assessment/prevention protocol by demonstrating ease of use, good interobserver agreement, and effectiveness. Improved VTE prophylaxis resulted in a substantial reduction in HA VTE. (c) 2010 Society of Hospital Medicine.

  17. 19 CFR 10.594 - Originating goods.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF THE TREASURY ARTICLES CONDITIONALLY FREE, SUBJECT TO A REDUCED RATE, ETC. Dominican Republic-Central America-United States Free Trade Agreement Rules of Origin § 10.594 Originating goods. Except as otherwise provided in...

  18. Physiotherapist agreement when visually rating movement quality during lower extremity functional screening tests.

    PubMed

    Whatman, Chris; Hing, Wayne; Hume, Patria

    2012-05-01

    To investigate physiotherapist agreement in rating movement quality during lower extremity functional tests using two visual rating methods and physiotherapists with differing clinical experience. Clinical measurement. Six healthy individuals were rated by 44 physiotherapists. These raters were in three groups (inexperienced, novice, experienced). Video recordings of all six individuals performing four lower extremity functional tests were visually rated (dichotomous or ordinal scale) using two rating methods (overall or segment) on two occasions separated by 3-4 weeks. Intra and inter-rater agreement for physiotherapists was determined using overall percentage agreement (OPA) and the first order agreement coefficient (AC1). Intra-rater agreement for overall and segment methods ranged from slight to almost perfect (OPA: 29-96%, AC1: 0.01 to 0.96). AC1 agreement was better in the experienced group (84-99% likelihood) and for dichotomous rating (97-100% likelihood). Inter-rater agreement ranged from fair to good (OPA: 45-79%; AC1: 0.22-0.71). AC1 agreement was not influenced by clinical experience but was again better using dichotomous rating. Physiotherapists' visual rating of movement quality during lower extremity functional tests resulted in slight to almost perfect intra-rater agreement and fair to good inter-rater agreement. Agreement improved with increased level of clinical experience and use of dichotomous rating. Copyright © 2011 Elsevier Ltd. All rights reserved.
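
    The two agreement measures named in this abstract, overall percentage agreement (OPA) and the first-order agreement coefficient (AC1), can be sketched for the two-rater case as follows. This is a minimal illustration, not the authors' analysis: the function names and the example ratings are hypothetical, and the chance-agreement term follows the commonly cited Gwet formulation.

```python
from collections import Counter


def overall_percentage_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def gwet_ac1(rater_a, rater_b, categories=None):
    """Gwet's AC1 for two raters and q >= 2 nominal categories."""
    n = len(rater_a)
    cats = list(categories) if categories is not None else list(set(rater_a) | set(rater_b))
    q = len(cats)
    if q < 2:
        raise ValueError("AC1 needs at least two categories")
    p_obs = overall_percentage_agreement(rater_a, rater_b)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # pi_k: average share of ratings falling in category k across both raters
    pis = [(count_a[k] + count_b[k]) / (2 * n) for k in cats]
    p_chance = sum(pi * (1 - pi) for pi in pis) / (q - 1)
    return (p_obs - p_chance) / (1 - p_chance)


# Hypothetical dichotomous ratings (1 = good movement quality, 0 = poor)
rater_a = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
rater_b = [1, 1, 1, 0, 0, 1, 1, 1, 1, 1]
print(overall_percentage_agreement(rater_a, rater_b))
print(round(gwet_ac1(rater_a, rater_b), 3))
```

    AC1 is often preferred over kappa when one category dominates, because its chance-agreement term stays small in that situation.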

  19. Relationship between Two Types of Coil Packing Densities Relative to Aneurysm Size.

    PubMed

    Park, Keun Young; Kim, Byung Moon; Ihm, Eun Hyun; Baek, Jang Hyun; Kim, Dong Joon; Kim, Dong Ik; Huh, Seung Kon; Lee, Jae Whan

    2015-01-01

    Coil packing density (PD) can be calculated via a formula (PD_F) or software (PD_S). The two types of PD can differ from each other for the same aneurysm. This study aimed to evaluate the interobserver agreement and relationships between the 2 types of PD relative to aneurysm size. A total of 420 consecutive saccular aneurysms were treated with coiling. PD (PD_F, [coil volume]/[volume calculated by formula], and PD_S, [coil volume]/[volume measured by software]) was calculated and prospectively recorded. Interobserver agreement was evaluated between PD_F and PD_S. Additionally, the relationships between PD_F and PD_S relative to aneurysm size were subsequently analyzed. Interobserver agreement for PD_F and PD_S was excellent (intraclass correlation coefficient: PD_F, 0.967; PD_S, 0.998). The ratio of PD_F to PD_S was greater for smaller aneurysms and converged toward 1.0 as the maximum dimension (D_M) of the aneurysm increased. Compared with PD_S, PD_F was overestimated by a mean of 28% for D_M < 5 mm, by 17% for 5 mm ≤ D_M < 10 mm, and by 9% for D_M ≥ 10 mm (P < 0.01). Interobserver agreement for PD_F and PD_S was excellent. However, PD_F was overestimated in smaller aneurysms and converged to PD_S as aneurysm size increased. Copyright © 2014 by the American Society of Neuroimaging.
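
    The two packing densities compared in this abstract share one definition, coil volume divided by aneurysm volume, and differ only in how the aneurysm volume is obtained. The sketch below illustrates that relationship. The abstract does not state which volume formula was used, so the ellipsoid formula here is an assumption, and all function names and numbers are hypothetical.

```python
import math


def ellipsoid_volume_mm3(length_mm, width_mm, height_mm):
    """Assumed formula-based aneurysm volume: ellipsoid from three diameters."""
    return math.pi / 6.0 * length_mm * width_mm * height_mm


def packing_density(coil_volume_mm3, aneurysm_volume_mm3):
    """Packing density = coil volume / aneurysm volume (as a fraction)."""
    if aneurysm_volume_mm3 <= 0:
        raise ValueError("aneurysm volume must be positive")
    return coil_volume_mm3 / aneurysm_volume_mm3


# Hypothetical small aneurysm: 4 x 4 x 4 mm, 10 mm^3 of coil inserted
coil_volume = 10.0
formula_volume = ellipsoid_volume_mm3(4.0, 4.0, 4.0)
software_volume = 40.0  # hypothetical value read from segmentation software

pd_formula = packing_density(coil_volume, formula_volume)
pd_software = packing_density(coil_volume, software_volume)
# A ratio above 1 means the formula-based density exceeds the software-based one
print(round(pd_formula, 3), round(pd_software, 3), round(pd_formula / pd_software, 3))
```

    Because the coil volume cancels, the ratio of the two densities equals the software volume divided by the formula volume, so any underestimate of aneurysm volume by the formula inflates the formula-based density.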

  20. Standardized Reporting of Prostate MRI: Comparison of the Prostate Imaging Reporting and Data System (PI-RADS) Version 1 and Version 2

    PubMed Central

    Tewes, Susanne; Mokov, Nikolaj; Hartung, Dagmar; Schick, Volker; Peters, Inga; Schedl, Peter; Pertschy, Stefanie; Wacker, Frank; Voshage, Götz; Hueper, Katja

    2016-01-01

    Introduction Objective of our study was to determine the agreement between version 1 (v1) and v2 of the Prostate Imaging Reporting and Data System (PI-RADS) for evaluation of multiparametric prostate MRI (mpMRI) and to compare their diagnostic accuracy, their inter-observer agreement and practicability. Material and Methods mpMRI including T2-weighted imaging, diffusion-weighted imaging (DWI) and dynamic contrast-enhanced imaging (DCE) of 54 consecutive patients, who subsequently underwent MRI-guided in-bore biopsy were re-analyzed according to PI-RADS v1 and v2 by two independent readers. Diagnostic accuracy for detection of prostate cancer (PCa) was assessed using ROC-curve analysis. Agreement between PI-RADS versions and observers was calculated and the time needed for scoring was determined. Results MRI-guided biopsy revealed PCa in 31 patients. Diagnostic accuracy for detection of PCa was equivalent with both PI-RADS versions for reader 1 with sensitivities and specificities of 84%/91% (AUC = 0.91 95%CI[0.8–1]) for PI-RADS v1 and 100%/74% (AUC = 0.92 95% CI[0.8–1]) for PI-RADS v2. Reader 2 achieved similar diagnostic accuracy with sensitivity and specificity of 74%/91% (AUC = 0.88 95%CI[0.8–1]) for PI-RADS v1 and 81%/91% (AUC = 0.91 95%CI[0.8–1]) for PI-RADS v2. Agreement between scores determined with different PI-RADS versions was good (reader 1: κ = 0.62, reader 2: κ = 0.64). Inter-observer agreement was moderate with PI-RADS v2 (κ = 0.56) and fair with v1 (κ = 0.39). The time required for building the PI-RADS score was significantly lower with PI-RADS v2 compared to v1 (24.7±2.3 s vs. 41.9±2.6 s, p<0.001). Conclusion Agreement between PI-RADS versions was high and both versions revealed high diagnostic accuracy for detection of PCa. Due to better inter-observer agreement for malignant lesions and less time demand, the new PI-RADS version could be more practicable for clinical routine. PMID:27657729

  1. Prospective comparison of speckle tracking longitudinal bidimensional strain between two vendors.

    PubMed

    Castel, Anne-Laure; Szymanski, Catherine; Delelis, François; Levy, Franck; Menet, Aymeric; Mailliet, Amandine; Marotte, Nathalie; Graux, Pierre; Tribouilloy, Christophe; Maréchaux, Sylvestre

    2014-02-01

    Speckle tracking is a relatively new, largely angle-independent technique used for the evaluation of myocardial longitudinal strain (LS). However, significant differences have been reported between LS values obtained by speckle tracking with the first generation of software products. To compare LS values obtained with the most recently released equipment from two manufacturers. Systematic scanning with head-to-head acquisition with no modification of the patient's position was performed in 64 patients with equipment from two different manufacturers, with subsequent off-line post-processing for speckle tracking LS assessment (Philips QLAB 9.0 and General Electric [GE] EchoPAC BT12). The interobserver variability of each software product was tested on a randomly selected set of 20 echocardiograms from the study population. GE and Philips interobserver coefficients of variation (CVs) for global LS (GLS) were 6.63% and 5.87%, respectively, indicating good reproducibility. Reproducibility was very variable for regional and segmental LS values, with CVs ranging from 7.58% to 49.21% with both software products. The concordance correlation coefficient (CCC) between GLS values was high at 0.95, indicating substantial agreement between the two methods. While good agreement was observed between midwall and apical regional strains with the two software products, basal regional strains were poorly correlated. The agreement between the two software products at a segmental level was very variable; the highest correlation was obtained for the apical cap (CCC 0.90) and the poorest for basal segments (CCC range 0.31-0.56). A high level of agreement and reproducibility for global but not for basal regional or segmental LS was found with two vendor-dependent software products. This finding may help to reinforce clinical acceptance of GLS in everyday clinical practice. Copyright © 2014 Elsevier Masson SAS. All rights reserved.

  2. Stress echocardiography with smartphone: real-time remote reading for regional wall motion.

    PubMed

    Scali, Maria Chiara; de Azevedo Bellagamba, Clarissa Carmona; Ciampi, Quirino; Simova, Iana; de Castro E Silva Pretto, José Luis; Djordjevic-Dikic, Ana; Dodi, Claudio; Cortigiani, Lauro; Zagatina, Angela; Trambaiolo, Paolo; Torres, Marco R; Citro, Rodolfo; Colonna, Paolo; Paterni, Marco; Picano, Eugenio

    2017-11-01

    The diffusion of smart-phones offers access to the best remote expertise in stress echo (SE). To evaluate the reliability of SE based on smart-phone filming and reading. A set of 20 SE video-clips were read in random sequence with a multiple choice six-answer test by ten readers from five different countries (Italy, Brazil, Serbia, Bulgaria, Russia) of the "SE2020" study network. The gold standard to assess accuracy was a core-lab expert reader in agreement with angiographic verification (0 = wrong, 1 = right). The same set of 20 SE studies was read, in random order and >2 months apart, on desktop workstation and via smartphones by ten remote readers. Image quality was graded from 1 = poor but readable, to 3 = excellent. Kappa (k) statistics were used to assess intra- and inter-observer agreement. The image quality was comparable in desktop workstation vs. smartphone (2.0 ± 0.5 vs. 2.4 ± 0.7, p = NS). The average reading time per case was similar for desktop versus smartphone (90 ± 39 vs. 82 ± 54 s, p = NS). The overall diagnostic accuracy of the ten readers was similar for desktop workstation vs. smartphone (84 vs. 91%, p = NS). Intra-observer agreement (desktop vs. smartphone) was good (k = 0.81 ± 0.14). Inter-observer agreement was good and similar via desktop or smartphone (k = 0.69 vs. k = 0.72, p = NS). The diagnostic accuracy and consistency of SE reading among certified readers was high and similar via desktop workstation or via smartphone.

  3. 19 CFR 10.873 - Originating goods.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Originating goods. 10.873 Section 10.873 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF THE TREASURY ARTICLES CONDITIONALLY FREE, SUBJECT TO A REDUCED RATE, ETC. United States-Oman Free Trade Agreement Rules...

  4. 19 CFR 10.873 - Originating goods.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Originating goods. 10.873 Section 10.873 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF THE TREASURY ARTICLES CONDITIONALLY FREE, SUBJECT TO A REDUCED RATE, ETC. United States-Oman Free Trade Agreement Rules...

  5. 19 CFR 10.873 - Originating goods.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Originating goods. 10.873 Section 10.873 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF THE TREASURY ARTICLES CONDITIONALLY FREE, SUBJECT TO A REDUCED RATE, ETC. United States-Oman Free Trade Agreement Rules...

  6. 19 CFR 10.873 - Originating goods.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 1 2011-04-01 2011-04-01 false Originating goods. 10.873 Section 10.873 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF THE TREASURY ARTICLES CONDITIONALLY FREE, SUBJECT TO A REDUCED RATE, ETC. United States-Oman Free Trade Agreement Rules...

  7. High-frequency ultrasound imaging for cutaneous neurofibroma in patients with neurofibromatosis type I.

    PubMed

    Raffin, Delphine; Zaragoza, Julia; Georgescou, Gabriella; Mourtada, Youssef; Maruani, Annabel; Ossant, Frédéric; Patat, Frédéric; Vaillant, Loïc; Machet, Laurent

    2017-06-01

    Neurofibromas (NFs) are benign tumours arising from a nerve sheath, which are present in nearly all patients with neurofibromatosis type 1 (NF1). High-frequency ultrasound (HFU) systems, using frequencies over 20 MHz, were developed to improve visualization of skin tumours by means of increased resolution. To describe NFs by using HFU in patients with NF1. Anonymized HFU (25-MHz) images of NFs were randomized. Initially, two dermatologist investigators, with experience in HFU imaging of the skin, together described the ultrasound images and established eight criteria for NFs. The same task was then repeated by two other dermatologists, also with experience in HFU imaging of the skin, independently, to establish inter-observer agreement. A total of 108 NFs in 29 patients were included. Superficial and subcutaneous NFs were hypoechoic with a round to spindle shape. Plexiform NFs were ill-defined, consisting of multiple hypoechoic linear zones. Good to excellent inter-observer agreement was found for six of the eight criteria (k>0.6). This is the first series describing HFU skin imaging of NFs in patients with NF1. Lateral extension that may correspond to involvement of an adjacent nerve seems to be specific to NFs.

  8. Muscle MR Imaging in Tubular Aggregate Myopathy

    PubMed Central

    Beltrame, Valeria; Ortolan, Paolo; Coran, Alessandro; Zanato, Riccardo; Gazzola, Matteo; Frigo, Annachiara; Bello, Luca; Pegoraro, Elena; Stramare, Roberto

    2014-01-01

    Purpose To evaluate with Magnetic Resonance (MR) the degree of fatty replacement and edematous involvement in skeletal muscles in patients with Tubular Aggregate Myopathy (TAM). To assess the inter-observer agreement in evaluating muscle involvement and the symmetry index of fatty replacement. Materials and Methods 13 patients were evaluated by MR to ascertain the degree of fatty replacement (T1W sequences) according to Mercuri's scale, and edema score (STIR sequences) according to extent and site. Results Fatty replacement mainly affects the posterior superficial compartment of the leg; the anterior compartment is generally spared. Edema was generally poor and almost only in the superficial compartment of the leg. The inter-observer agreement is very good with a Krippendorff's coefficient >0.9. Data show a total symmetry in the muscular replacement (McNemar-Bowker test with p = 1). Conclusions MR reveals characteristic muscular involvement, and is a reproducible technique for evaluation of TAM. There may also be a characteristic involvement of the long and short heads of the biceps femoris. It is useful for aimed biopsies, diagnostic hypotheses and evaluation of disease progression. PMID:24722334

  9. 75 FR 72987 - Brokers of Household Goods Transportation by Motor Vehicle

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-11-29

    ... revise broker marketing materials, forms, and orders for service, including technical writing, Web site... are FMCSA-authorized household goods motor carriers with which the broker has a written agreement, as... broker has a written agreement, as required by Sec. 371.115. We agree that brokers should not...

  10. Reproducibility, interrater agreement, and age-related changes of fractional anisotropy measures at 3T in healthy subjects: effect of the applied b-value.

    PubMed

    Bisdas, S; Bohning, D E; Besenski, N; Nicholas, J S; Rumboldt, Z

    2008-06-01

    There is no reproducibility study of fractional anisotropy (FA) measurements at 3T using regions of interest (ROIs). Our purpose was to establish the extent and statistical significance of the interrater variability, the variability observed with 2 different b-values, and in 2 separate scanning sessions. Twelve healthy volunteers underwent MR imaging twice. MR imaging was performed on a 3T unit, and FA maps were analyzed independently by 2 observers using ROIs positioned in the corpus callosum, internal capsules, corticospinal tracts, and right thalamus. Changes in FA values (×10³) measured with 2 b-values (700 and 1000 s/mm²), age-related differences, interobserver agreement, and measurement reproducibility were assessed. In the right internal capsule genu (FA = 702/728; b = 1000/700 s/mm²) and the left anterior limb of the internal capsule (AIC; FA = 617/745; b = 1000/700 s/mm²), the FA values were significantly different between the 2 b-values (P = .02 and .05, respectively). Significant age-related differences in FA were observed in the genu of the corpus callosum and in the left AIC. Interrater measurements showed fair-to-moderate agreement for most anatomic structures. The lowest significant change for a single subject regarding any FA values between the 2 sessions was in the corpus callosum (4%), whereas the highest one was in the corticospinal tracts (27%). The Bland-Altman plot analysis showed that the 1000-s/mm² b-value gave satisfactorily reproducible measurements equally good or better than the 700-s/mm² b-value. The reproducibility of FA estimates using ROIs was satisfactory. Measurements with a b-value at 1000 s/mm² showed superior reproducibility in most anatomic locations.
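
    The Bland-Altman analysis mentioned in this abstract summarizes paired measurements by their mean difference (bias) and 95% limits of agreement. A minimal sketch under the usual assumptions (bias ± 1.96 standard deviations of the differences) is given below. The function name and the FA values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical FA values (x 1000) for four ROIs in two scanning sessions
session_1 = [700, 620, 540, 480]
session_2 = [690, 630, 550, 470]
bias, lower, upper = bland_altman(session_1, session_2)
print(bias, round(lower, 2), round(upper, 2))
```

    Narrower limits of agreement indicate better reproducibility, which is the sense in which one b-value can be "equally good or better" than another.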

  11. A novel magnetic resonance imaging segmentation technique for determining diffuse intrinsic pontine glioma tumor volume.

    PubMed

    Singh, Ranjodh; Zhou, Zhiping; Tisnado, Jamie; Haque, Sofia; Peck, Kyung K; Young, Robert J; Tsiouris, Apostolos John; Thakur, Sunitha B; Souweidane, Mark M

    2016-11-01

    OBJECTIVE Accurately determining diffuse intrinsic pontine glioma (DIPG) tumor volume is clinically important. The aims of the current study were to 1) measure DIPG volumes using methods that require different degrees of subjective judgment; and 2) evaluate interobserver agreement of measurements made using these methods. METHODS Eight patients from a Phase I clinical trial testing convection-enhanced delivery (CED) of a therapeutic antibody were included in the study. Pre-CED, post-radiation therapy axial T2-weighted images were analyzed using 2 methods requiring high degrees of subjective judgment (picture archiving and communication system [PACS] polygon and Volume Viewer auto-contour methods) and 1 method requiring a low degree of subjective judgment (k-means clustering segmentation) to determine tumor volumes. Lin's concordance correlation coefficients (CCCs) were calculated to assess interobserver agreement. RESULTS The CCCs of measurements made by 2 observers with the PACS polygon and the Volume Viewer auto-contour methods were 0.9465 (lower 1-sided 95% confidence limit 0.8472) and 0.7514 (lower 1-sided 95% confidence limit 0.3143), respectively. Both were considered poor agreement. The CCC of measurements made using k-means clustering segmentation was 0.9938 (lower 1-sided 95% confidence limit 0.9772), which was considered substantial strength of agreement. CONCLUSIONS The poor interobserver agreement of PACS polygon and Volume Viewer auto-contour methods highlighted the difficulty in consistently measuring DIPG tumor volumes using methods requiring high degrees of subjective judgment. k-means clustering segmentation, which requires a low degree of subjective judgment, showed better interobserver agreement and produced tumor volumes with delineated borders.
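
    Lin's concordance correlation coefficient (CCC), used in this abstract to assess interobserver agreement, penalizes both poor correlation and systematic offset between observers. The sketch below shows the point estimate only, using population (1/n) variances as in Lin's definition. It does not compute the one-sided confidence limits reported above, and the function name and example volumes are hypothetical.

```python
def lins_ccc(x, y):
    """Point estimate of Lin's concordance correlation coefficient."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when all values are identical")
    return 2 * cov_xy / denom


# Hypothetical tumor volumes (cm^3) from two observers; observer 2 reads 1 cm^3 higher
observer_1 = [1.0, 2.0, 3.0]
observer_2 = [2.0, 3.0, 4.0]
print(round(lins_ccc(observer_1, observer_2), 4))
```

    In this example the two observers are perfectly correlated, yet the constant offset pulls the CCC well below 1, which is why the CCC is preferred over Pearson correlation for agreement studies.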

  12. Development of an observational measure of healthcare worker hand-hygiene behaviour: the hand-hygiene observation tool (HHOT).

    PubMed

    McAteer, J; Stone, S; Fuller, C; Charlett, A; Cookson, B; Slade, R; Michie, S

    2008-03-01

    Previous observational measures of healthcare worker (HCW) hand-hygiene behaviour (HHB) fail to provide adequate standard operating procedures (SOPs), accounts of inter-rater agreement testing or evidence of sensitivity to change. This study reports the development of an observational tool in a way that addresses these deficiencies. Observational categories were developed systematically, guided by a clinical guideline, previous measures and pilot hand-hygiene behaviour observations (HHOs). The measure, a simpler version of the Geneva tool, consists of HHOs (before and after low-risk, high-risk or unobserved contact), HHBs (soap, alcohol hand rub, no action, unknown), and type of HCW. Inter-observer agreement for each category was assessed by observation of 298 HHOs and HHBs by two independent observers on acute elderly and intensive care units. Raw agreement (%) and Kappa were 77% and 0.68 for HHB; 83% and 0.77 for HHO; and 90% and 0.77 for HCW. Inter-observer agreement for overall compliance of a group of HCWs was assessed by observation of 1191 HHOs and HHBs by two pairs of independent observers. Overall agreement was good (intraclass correlation coefficient = 0.79). Sensitivity to change was examined by autoregressive time-series modelling of longitudinal observations for 8 months on the intensive therapy unit during an Acinetobacter baumannii outbreak and subsequent strengthening of infection control measures. Sensitivity to change was demonstrated by a rise in compliance from 80 to 98% with an odds ratio of increased compliance of 7.00 (95% confidence interval: 4.02-12.2) P < 0.001.
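
    This abstract reports both raw agreement (%) and kappa for two independent observers. The difference between the two can be sketched as follows, assuming unweighted Cohen's kappa for nominal categories. The function names and the observations are hypothetical, with category labels loosely modelled on the hand-hygiene behaviours listed above.

```python
from collections import Counter


def raw_agreement(obs_1, obs_2):
    """Fraction of observations coded identically by both observers."""
    if len(obs_1) != len(obs_2) or not obs_1:
        raise ValueError("need two non-empty code lists of equal length")
    return sum(a == b for a, b in zip(obs_1, obs_2)) / len(obs_1)


def cohens_kappa(obs_1, obs_2):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(obs_1)
    p_obs = raw_agreement(obs_1, obs_2)
    count_1, count_2 = Counter(obs_1), Counter(obs_2)
    # Chance agreement from each observer's own marginal frequencies
    p_chance = sum(count_1[k] * count_2[k] for k in set(obs_1) | set(obs_2)) / n ** 2
    if p_chance == 1:
        raise ValueError("kappa is undefined when both observers use one category")
    return (p_obs - p_chance) / (1 - p_chance)


# Hypothetical codes for eight observed hand-hygiene opportunities
observer_1 = ["soap", "rub", "none", "rub", "soap", "none", "rub", "rub"]
observer_2 = ["soap", "rub", "none", "soap", "soap", "none", "rub", "none"]
print(raw_agreement(observer_1, observer_2))
print(round(cohens_kappa(observer_1, observer_2), 3))
```

    Kappa is lower than raw agreement here because part of the observed agreement would be expected by chance alone.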

  13. Accuracy and reliability testing of two methods to measure internal rotation of the glenohumeral joint.

    PubMed

    Hall, Justin M; Azar, Frederick M; Miller, Robert H; Smith, Richard; Throckmorton, Thomas W

    2014-09-01

    We compared accuracy and reliability of a traditional method of measurement (most cephalad vertebral spinous process that can be reached by a patient with the extended thumb) to estimates made with the shoulder in abduction to determine if there were differences between the two methods. Six physicians with fellowship training in sports medicine or shoulder surgery estimated measurements in 48 healthy volunteers. Three were randomly chosen to make estimates of both internal rotation measurements for each volunteer. An independent observer made objective measurements on lateral scoliosis films (spinous process method) or with a goniometer (abduction method). Examiners were blinded to objective measurements as well as to previous estimates. Intraclass coefficients for interobserver reliability for the traditional method averaged 0.75, indicating good agreement among observers. The difference in vertebral level estimated by the examiner and the actual radiographic level averaged 1.8 levels. The intraclass coefficient for interobserver reliability for the abduction method averaged 0.81 for all examiners, indicating near-perfect agreement. Confidence intervals indicated that estimates were an average of 8° different from the objective goniometer measurements. Pearson correlation coefficients of intraobserver reliability for the abduction method averaged 0.94, indicating near-perfect agreement within observers. Confidence intervals demonstrated repeated estimates between 5° and 10° of the original. Internal rotation estimates made with the shoulder abducted demonstrated interobserver reliability superior to that of spinous process estimates, and reproducibility was high. On the basis of this finding, we now take glenohumeral internal rotation measurements with the shoulder in abduction and use a goniometer to maximize accuracy and objectivity. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.

  14. Articular cartilage grading of the knee: diagnostic performance of fat-suppressed 3D volume isotropic turbo spin-echo acquisition (VISTA) compared with 3D T1 high-resolution isovolumetric examination (THRIVE).

    PubMed

    Lee, Young Han; Hahn, Seok; Lim, Daekeon; Suh, Jin-Suck

    2017-02-01

    Background Conventionally, two-dimensional (2D) fast spin-echo (FSE) sequences have been widely used for clinical cartilage imaging as well as gradient (GRE) sequences. Recently, three-dimensional (3D) volumetric magnetic resonance imaging (MRI) has been introduced with one 3D volumetric scan, and this is replacing slice-by-slice 2D MR scans. Purpose To evaluate the image quality and diagnostic performance of two 3D sequences for abnormalities of knee cartilage: fat-suppressed (FS) FSE-based 3D volume isotropic turbo spin-echo acquisition (VISTA) and GRE-based 3D T1 high-resolution isovolumetric examination (THRIVE). Material and Methods The institutional review board approved the protocol of this retrospective review. This study enrolled 40 patients (41 knees) with arthroscopically confirmed abnormalities of cartilage. All patients underwent isovoxel 3D-VISTA and 3D-THRIVE MR sequences on 3T MRI. We assessed the cartilage grade on the two 3D sequences using arthroscopy as a gold standard. Inter-observer agreement for each technique was evaluated with the intraclass correlation coefficient (ICC). Differences in the area under the curve (AUC) were compared between the 3D-THRIVE and 3D-VISTA. Results Although inter-observer agreement for both sequences was excellent, the inter-observer agreement for 3D-VISTA was higher than for 3D-THRIVE for cartilage grading in all regions of the knee. There was no significant difference in the diagnostic performance ( P > 0.05) between the two sequences for detecting cartilage grade. Conclusion FSE-based 3D-VISTA images had good diagnostic performance that was comparable to GRE-based 3D-THRIVE images in the evaluation of knee cartilage, and can be used in routine knee MR protocols for the evaluation of cartilage.

  15. Clinical application of qualitative assessment for breast masses in shear-wave elastography.

    PubMed

    Gweon, Hye Mi; Youk, Ji Hyun; Son, Eun Ju; Kim, Jeong-Ah

    2013-11-01

    To evaluate the interobserver agreement and the diagnostic performance of various qualitative features in shear-wave elastography (SWE) for breast masses. A total of 153 breast lesions in 152 women who underwent B-mode ultrasound and SWE before biopsy were included. Qualitative analysis in SWE was performed using two different classifications: E values (Ecol; 6-point color score, Ehomo; homogeneity score and Esha; shape score) and a four-color pattern classification. Two radiologists reviewed five data sets: B-mode ultrasound, SWE, and combination of both for E values and four-color pattern. The BI-RADS categories were assessed for the B-mode and combined sets. Interobserver agreement was assessed using weighted κ statistics. Areas under the receiver operating characteristic curve (AUC), sensitivity, and specificity were analyzed. Interobserver agreement was substantial for Ecol (κ=0.79), Ehomo (κ=0.77) and four-color pattern (κ=0.64), and moderate for Esha (κ=0.56). Better-performing qualitative features were Ecol and four-color pattern (AUCs, 0.932 and 0.925) compared with Ehomo and Esha (AUCs, 0.857 and 0.864; P<0.05). The diagnostic performance of B-mode ultrasound (AUC, 0.950) was not significantly different from combined sets with E value and with four-color pattern (AUCs, 0.962 and 0.954). When all qualitative values were negative, leading to a downgrade of the BI-RADS category, the specificity increased significantly from 16.5% to 56.1% (E value) and 57.0% (four-color pattern) (P<0.001) without improvement in sensitivity. The qualitative SWE features were highly reproducible and showed good diagnostic performance in suspicious breast masses. Adding qualitative SWE to B-mode ultrasound increased specificity in decision making for biopsy recommendation. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  16. Development and testing of a de novo clinical staging system for podoconiosis (endemic non-filarial elephantiasis)

    PubMed Central

    Tekola, Fasil; Ayele, Zewdu; HaileMariam, Dereje; Fuller, Claire; Davey, Gail

    2010-01-01

    Summary Background Podoconiosis (endemic non-filarial elephantiasis) is a geochemical disease in individuals exposed to red-clay soil. Despite the prevalence and public health importance of podoconiosis, there is as yet no accepted clinical staging system. Objective We aimed to develop and test a robust clinical staging system for podoconiosis. Methods We adapted the Dreyer system for staging filarial lymphoedema and tested it in four re-iterative field tests conducted in an area of high podoconiosis prevalence in Southern Ethiopia. The system finally arrived at has five stages according to proximal spread of disease and presence of dermal nodules, ridges and bands. We measured the one-week repeatability and the inter-observer agreement of the final staging system. Results We have developed a five-stage system that is readily understood by community workers with little health training. Kappa for one-week repeatability was 0.88 (95% CI 0.80 to 0.96), Kappa for agreement between health professionals was 0.71 (95% CI 0.60 to 0.82), while that between health professionals and community podoconiosis agents without formal health training averaged 0.64 (95% CI 0.52 to 0.78). Conclusions A simple staging system with good inter-observer agreement and repeatability has been developed to assist in the management and further study of podoconiosis. PMID:18721188

  17. Three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions sequence for routine imaging of the spine: preliminary experience

    PubMed Central

    Tins, B; Cassar-Pullicino, V; Haddaway, M; Nachtrab, U

    2012-01-01

    Objectives The bulk of spinal imaging is still performed with conventional two-dimensional sequences. This study assesses the suitability of three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions (SPACE) sequence for routine spinal imaging. Methods 62 MRI examinations of the spine were evaluated by 2 examiners in consensus for the depiction of anatomy and presence of artefact. We noted pathologies that might be missed using the SPACE sequence only or the SPACE and a sagittal T1 weighted sequence. The reference standards were sagittal and axial T1 weighted and T2 weighted sequences. At a later date the evaluation was repeated by one of the original examiners and an additional examiner. Results There was good agreement of the single evaluations and consensus evaluation for the conventional sequences: κ>0.8, confidence interval (CI)>0.6–1.0. For the SPACE sequence, depiction of anatomy was very good for 84% of cases, with high interobserver agreement, but there was poor interobserver agreement for other cases. For artefact assessment of SPACE, κ=0.92, CI=0.92–1.0. The SPACE sequence was superior to conventional sequences for depiction of anatomy and artefact resistance. The SPACE sequence occasionally missed bone marrow oedema. In conjunction with sagittal T1 weighted sequences, no abnormality was missed. The isotropic SPACE sequence was superior to conventional sequences in imaging difficult anatomy such as in scoliosis and spondylolysis. Conclusion The SPACE sequence allows excellent assessment of anatomy owing to high spatial resolution and resistance to artefact. The sensitivity for bone marrow abnormalities is limited. PMID:22374284

  18. Interobserver repeatability of measurements on computed tomography images of lax canine hip joints from youth to maturity.

    PubMed

    Lopez, Mandi J; Davis, Kechia M; Jeffrey-Borger, Susan L; Markel, Mark D; Rettenmund, Christy

    2009-12-01

    To determine interobserver repeatability of measurements on computed tomography (CT) images of lax canine hip joints at different ages and in the presence of degenerative joint disease at maturity. Longitudinal observational investigation. Sibling crossbreed hounds. Pelvic CT was performed at 20, 24, 32, 48, 68, and 104 weeks of age. Measures were performed on 3 contiguous two-dimensional (2D) transverse CT images of both hips at each time point by 3 investigators. Center-edge angle (CEA), horizontal toit externe angle (HTEA), ventral (VASA), dorsal (DASA), and horizontal (HASA) acetabular sector angles, acetabular index (AI), and percent femoral head coverage (CPC) were measured. Interobserver repeatability was quantified with the intraclass correlation coefficient (ICC). Satisfactory repeatability was considered when ICC ≥ 0.75. DASA, CEA, and CPC were repeatable in all age groups. HASA and HTEA were repeatable for all but 1 time point. At 20 weeks of age, all measures but AI were repeatable, and at 104 weeks of age, DASA, CEA, CPC, and HASA were repeatable. Measures were repeatable in hips with and without degenerative changes with the exceptions of AI and HASA in normal hips and VASA and HTEA in osteoarthritic hips. Most 2D CT measurements examined were repeatable regardless of age or joint disease. Two-dimensional CT measures may augment current techniques for assessing joint changes in lax canine hips.

  19. Inter- and intra-observer agreement of BI-RADS-based subjective visual estimation of amount of fibroglandular breast tissue with magnetic resonance imaging: comparison to automated quantitative assessment.

    PubMed

    Wengert, G J; Helbich, T H; Woitek, R; Kapetas, P; Clauser, P; Baltzer, P A; Vogl, W-D; Weber, M; Meyer-Baese, A; Pinker, Katja

    2016-11-01

    To evaluate the inter-/intra-observer agreement of BI-RADS-based subjective visual estimation of the amount of fibroglandular tissue (FGT) with magnetic resonance imaging (MRI), and to investigate whether FGT assessment benefits from an automated, observer-independent, quantitative MRI measurement by comparing both approaches. Eighty women with no imaging abnormalities (BI-RADS 1 and 2) were included in this institutional review board (IRB)-approved prospective study. All women underwent un-enhanced breast MRI. Four radiologists independently assessed FGT with MRI by subjective visual estimation according to BI-RADS. Automated observer-independent quantitative measurement of FGT with MRI was performed using a previously described measurement system. Inter-/intra-observer agreements of qualitative and quantitative FGT measurements were assessed using Cohen's kappa (k). Inexperienced readers achieved moderate inter-/intra-observer agreement and experienced readers a substantial inter- and perfect intra-observer agreement for subjective visual estimation of FGT. Practice and experience reduced observer-dependency. Automated observer-independent quantitative measurement of FGT was successfully performed and revealed only fair to moderate agreement (k = 0.209-0.497) with subjective visual estimations of FGT. Subjective visual estimation of FGT with MRI shows moderate intra-/inter-observer agreement, which can be improved by practice and experience. Automated observer-independent quantitative measurements of FGT are necessary to allow a standardized risk evaluation. • Subjective FGT estimation with MRI shows moderate intra-/inter-observer agreement in inexperienced readers. • Inter-observer agreement can be improved by practice and experience. • Automated observer-independent quantitative measurements can provide reliable and standardized assessment of FGT with MRI.
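
    As an illustration of the statistic named in this record, the following is a minimal sketch of Cohen's kappa for one pair of readers rating the same cases. The function name and the sample ratings are hypothetical and are not data from the study; with more than two readers, kappa would have to be computed pairwise or replaced by a multi-rater statistic.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items with categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical FGT categories (a-d) from two readers; illustrative only.
reader_1 = ["a", "b", "b", "c", "d", "b", "c", "a"]
reader_2 = ["a", "b", "c", "c", "d", "b", "b", "a"]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3))
```

    Here the readers agree on 6 of 8 cases, and the chance-corrected value is lower than the raw 0.75 agreement.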

  20. 21 CFR 26.78 - Agreements with other countries.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 1 2011-04-01 2011-04-01 false Agreements with other countries. 26.78 Section 26... MUTUAL RECOGNITION OF PHARMACEUTICAL GOOD MANUFACTURING PRACTICE REPORTS, MEDICAL DEVICE QUALITY SYSTEM AUDIT REPORTS, AND CERTAIN MEDICAL DEVICE PRODUCT EVALUATION REPORTS: UNITED STATES AND THE EUROPEAN...

  1. 7 CFR 1599.7 - Transportation of goods.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... other goods such as bags that may be provided by FAS under the McGovern-Dole Program will be acquired under a specific agreement in the manner determined by FAS. Such transportation will be acquired by: (1) FAS in accordance with the Federal Acquisition Regulations (FAR), the Department's procurement...

  2. 7 CFR 1599.7 - Transportation of goods.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... other goods such as bags that may be provided by FAS under the McGovern-Dole Program will be acquired under a specific agreement in the manner determined by FAS. Such transportation will be acquired by: (1) FAS in accordance with the Federal Acquisition Regulations (FAR), the Department's procurement...

  3. 7 CFR 1599.7 - Transportation of goods.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... other goods such as bags that may be provided by FAS under the McGovern-Dole Program will be acquired under a specific agreement in the manner determined by FAS. Such transportation will be acquired by: (1) FAS in accordance with the Federal Acquisition Regulations (FAR), the Department's procurement...

  4. Caregiver Person-Centeredness and Behavioral Symptoms during Mealtime Interactions: Development and Feasibility of a Coding Scheme

    PubMed Central

    Gilmore-Bykovskyi, Andrea L.

    2015-01-01

    Mealtime behavioral symptoms are distressing and frequently interrupt eating for the individual experiencing them and others in the environment. In order to enable identification of potential antecedents to mealtime behavioral symptoms, a computer-assisted coding scheme was developed to measure caregiver person-centeredness and behavioral symptoms for nursing home residents with dementia during mealtime interactions. The purpose of this pilot study was to determine the acceptability and feasibility of procedures for video-capturing naturally-occurring mealtime interactions between caregivers and residents with dementia, to assess the feasibility, ease of use, and inter-observer reliability of the coding scheme, and to explore the clinical utility of the coding scheme. Trained observers coded 22 observations. Data collection procedures were feasible and acceptable to caregivers, residents and their legally authorized representatives. Overall, the coding scheme proved to be feasible, easy to execute and yielded good to very good inter-observer agreement following observer re-training. The coding scheme captured clinically relevant, modifiable antecedents to mealtime behavioral symptoms, but would be enhanced by the inclusion of measures for resident engagement and consolidation of items for measuring caregiver person-centeredness that co-occurred and were difficult for observers to distinguish. PMID:25784080

  5. Agreement between questionnaire and medical records on some health and socioeconomic problems among poisoning cases

    PubMed Central

    Fathelrahman, Ahmed I

    2009-01-01

    Background The main objective of the present study was to evaluate the agreement between questionnaire and medical records on some health and socioeconomic problems among poisoning cases. Methods Cross-sectional sample of 100 poisoning cases consecutively admitted to the Hospital Pulau Pinang, Malaysia during the period from September 2003 to February 2004 were studied. Data on health and socioeconomic problems were collected both by self-administered questionnaire and from medical records. Agreement between the two sets of data was assessed by calculating the concordance rate, Kappa (k) and PABAK. McNemar statistic was used to test differences between categories. Results Data collected by questionnaire and medical records showed excellent agreement on the "marital status"; good agreements on "chronic illness", "psychiatric illness", and "previous history of poisoning"; and fair agreements on "at least one health problem", and "boy-girl friends problem". PABAK values suggest better agreements' measures. Conclusion There were excellent to good agreements between questionnaire and medical records on the marital status and most of the health problems and fair to poor agreements on the majority of socioeconomic problems. The implications of those findings were discussed. PMID:19751526
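
    The concordance rate and PABAK mentioned in this record can be sketched as follows, assuming the common definition PABAK = (k * Po - 1) / (k - 1), which reduces to 2 * Po - 1 for yes/no items. The function names and the sample answers are hypothetical, not data from the study.

```python
def concordance_rate(source_a, source_b):
    """Proportion of cases on which the two data sources give the same answer."""
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("need two equally long, non-empty lists")
    return sum(a == b for a, b in zip(source_a, source_b)) / len(source_a)


def pabak(source_a, source_b, n_categories=2):
    """Prevalence-adjusted bias-adjusted kappa: (k * Po - 1) / (k - 1)."""
    po = concordance_rate(source_a, source_b)
    return (n_categories * po - 1) / (n_categories - 1)


# Hypothetical yes/no (1/0) answers for ten cases; illustrative only.
questionnaire = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
medical_record = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
po = concordance_rate(questionnaire, medical_record)
adjusted = pabak(questionnaire, medical_record)
print(po, round(adjusted, 3))
```

    Because PABAK depends only on the observed agreement, it is not pulled down by a rare category the way Cohen's kappa can be.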

  6. Reproducibility of dynamic contrast-enhanced MRI and dynamic susceptibility contrast MRI in the study of brain gliomas: a comparison of data obtained using different commercial software.

    PubMed

    Conte, Gian Marco; Castellano, Antonella; Altabella, Luisa; Iadanza, Antonella; Cadioli, Marcello; Falini, Andrea; Anzalone, Nicoletta

    2017-04-01

    Dynamic susceptibility contrast MRI (DSC) and dynamic contrast-enhanced MRI (DCE) are useful tools in the diagnosis and follow-up of brain gliomas; nevertheless, both techniques leave the open issue of data reproducibility. We evaluated the reproducibility of data obtained using two different commercial software for perfusion maps calculation and analysis, as one of the potential sources of variability can be the software itself. DSC and DCE analyses from 20 patients with gliomas were tested for both the intrasoftware (as intraobserver and interobserver reproducibility) and the intersoftware reproducibility, as well as the impact of different postprocessing choices [vascular input function (VIF) selection and deconvolution algorithms] on the quantification of perfusion biomarkers plasma volume (Vp), volume transfer constant (Ktrans) and rCBV. Data reproducibility was evaluated with the intraclass correlation coefficient (ICC) and Bland-Altman analysis. For all the biomarkers, the intra- and interobserver reproducibility resulted in almost perfect agreement in each software, whereas for the intersoftware reproducibility the value ranged from 0.311 to 0.577, suggesting fair to moderate agreement; Bland-Altman analysis showed high dispersion of data, thus confirming these findings. Comparisons of different VIF estimation methods for DCE biomarkers resulted in ICC of 0.636 for Ktrans and 0.662 for Vp; comparison of two deconvolution algorithms in DSC resulted in an ICC of 0.999. The use of single software ensures very good intraobserver and interobserver reproducibility. Caution should be taken when comparing data obtained using different software or different postprocessing within the same software, as reproducibility is not guaranteed anymore.

  7. Effect of clinical information and previous exam execution on observer agreement and reliability in the analysis of hysteroscopic video-recordings.

    PubMed

    Martinho, Margarida Suzel Lopes; da Costa Santos, Cristina Maria Nogueira; Silva Carvalho, João Luís Mendonça; Bernardes, João Francisco Montenegro Andrade Lima

    2018-02-01

    Inter-observer agreement and reliability in hysteroscopic image assessment remain uncertain and the type of factors that may influence it has only been studied in relation to the experience of hysteroscopists. We aim to assess the effect of clinical information and previous exam execution on observer agreement and reliability in the analysis of hysteroscopic video-recordings. Ninety hysteroscopies were video-recorded and randomized into a group without (Group 1) and with clinical information (Group 2). The videos were independently analyzed by three hysteroscopists, regarding lesion location, dimension, and type, as well as decision to perform a biopsy. One of the hysteroscopists had executed all the exams before. Proportions of agreement (PA) and kappa statistics (κ) with 95% confidence intervals (95% CI) were used. In Group 2, there was a higher proportion of a normal diagnosis (p < 0.001) and a lower proportion of biopsies recommended (p = 0.027). Observer agreement and reliability were better in Group 2, with the PA and κ ranging, respectively, from 0.73 (95% CI 0.62, 0.83) and 0.44 (95% CI 0.26, 0.63), for image quality, to 0.94 (95% CI 0.88, 0.99) and 0.85 (95% CI 0.65, 0.95), for the decision to perform a biopsy. Execution of the exams before the analysis of the video-recordings did not significantly affect the results. With clinical information, agreement and reliability in the overall analysis of hysteroscopic video-recordings may reach almost perfect results and this was not significantly affected by the execution of the exams before the analysis. However, there is still uncertainty in the analysis of specific endometrial cavity abnormalities.

  8. Comparison of 3D computer-aided with manual cerebral aneurysm measurements in different imaging modalities.

    PubMed

    Groth, M; Forkert, N D; Buhk, J H; Schoenfeld, M; Goebell, E; Fiehler, J

    2013-02-01

    To compare intra- and inter-observer reliability of aneurysm measurements obtained by a 3D computer-aided technique with standard manual aneurysm measurements in different imaging modalities. A total of 21 patients with 29 cerebral aneurysms were studied. All patients underwent digital subtraction angiography (DSA), contrast-enhanced (CE-MRA) and time-of-flight magnetic resonance angiography (TOF-MRA). Aneurysm neck and depth diameters were manually measured by two observers in each modality. Additionally, semi-automatic computer-aided diameter measurements were performed using 3D vessel surface models derived from CE- (CE-com) and TOF-MRA (TOF-com) datasets. Bland-Altman analysis (BA) and intra-class correlation coefficient (ICC) were used to evaluate intra- and inter-observer agreement. BA revealed the narrowest relative limits of intra- and inter-observer agreement for aneurysm neck and depth diameters obtained by TOF-com (ranging between ±5.3 % and ±28.3 %) and CE-com (ranging between ±23.3 % and ±38.1 %). Direct measurements in DSA, TOF-MRA and CE-MRA showed considerably wider limits of agreement. The highest ICCs were observed for TOF-com and CE-com (ICC values, 0.92 or higher for intra- as well as inter-observer reliability). Computer-aided aneurysm measurement in 3D offers improved intra- and inter-observer reliability and a reproducible parameter extraction, which may be used in clinical routine and as objective surrogate end-points in clinical trials.
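
    A minimal sketch of Bland-Altman limits of agreement for two observers, assuming the usual form of mean difference plus or minus 1.96 sample standard deviations of the differences. The record reports relative limits in percent, which this sketch does not reproduce; the function name and the diameters below are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(obs_1, obs_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(obs_1) != len(obs_2) or len(obs_1) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(obs_1, obs_2)]
    bias = mean(diffs)                # systematic difference between observers
    spread = 1.96 * stdev(diffs)      # half-width of the 95% limits of agreement
    return bias, bias - spread, bias + spread


# Hypothetical aneurysm neck diameters in mm from two observers; illustrative only.
observer_1 = [4.0, 5.0, 6.0, 7.0]
observer_2 = [4.5, 4.5, 6.5, 6.5]
bias, lower, upper = bland_altman_limits(observer_1, observer_2)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

    Narrower limits indicate that the two observers' measurements can be used more interchangeably.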

  9. 19 CFR 10.570 - Goods re-entered after repair or alteration in Singapore.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 1 2011-04-01 2011-04-01 false Goods re-entered after repair or alteration in... States-Singapore Free Trade Agreement Goods Returned After Repair Or Alteration § 10.570 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  10. 19 CFR 10.1034 - Goods re-entered after repair or alteration in Korea.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Korea Free Trade Agreement Goods Returned After Repair Or Alteration § 10.1034 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  11. 19 CFR 10.1034 - Goods re-entered after repair or alteration in Korea.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Korea Free Trade Agreement Goods Returned After Repair Or Alteration § 10.1034 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  12. 19 CFR 10.3034 - Goods re-entered after repair or alteration in Colombia.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Colombia Trade Promotion Agreement Goods Returned After Repair Or Alteration § 10.3034 Goods re... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  13. 19 CFR 10.570 - Goods re-entered after repair or alteration in Singapore.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Singapore Free Trade Agreement Goods Returned After Repair Or Alteration § 10.570 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  14. 19 CFR 10.3034 - Goods re-entered after repair or alteration in Colombia.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Colombia Trade Promotion Agreement Goods Returned After Repair Or Alteration § 10.3034 Goods re... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  15. 19 CFR 10.2034 - Goods re-entered after repair or alteration in Panama.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Panama Trade Promotion Agreement Goods Returned After Repair Or Alteration § 10.2034 Goods re... alteration” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  16. 19 CFR 10.570 - Goods re-entered after repair or alteration in Singapore.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Singapore Free Trade Agreement Goods Returned After Repair Or Alteration § 10.570 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  17. 19 CFR 10.570 - Goods re-entered after repair or alteration in Singapore.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Goods re-entered after repair or alteration in... States-Singapore Free Trade Agreement Goods Returned After Repair Or Alteration § 10.570 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  18. 19 CFR 10.570 - Goods re-entered after repair or alteration in Singapore.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Goods re-entered after repair or alteration in... States-Singapore Free Trade Agreement Goods Returned After Repair Or Alteration § 10.570 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  19. 19 CFR 10.1034 - Goods re-entered after repair or alteration in Korea.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Goods re-entered after repair or alteration in... States-Korea Free Trade Agreement Goods Returned After Repair Or Alteration § 10.1034 Goods re-entered... alterations” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment...

  20. Intra-observer reproducibility and interobserver reliability of the radiographic parameters in the Spinal Deformity Study Group's AIS Radiographic Measurement Manual.

    PubMed

    Dang, Natasha Radhika; Moreau, Marc J; Hill, Douglas L; Mahood, James K; Raso, James

    2005-05-01

    Retrospective cross-sectional assessment of the reproducibility and reliability of radiographic parameters. To measure the intra-examiner and interexaminer reproducibility and reliability of salient radiographic features. The management and treatment of adolescent idiopathic scoliosis (AIS) depends on accurate and reproducible radiographic measurements of the deformity. Ten sets of radiographs were randomly selected from a sample of patients with AIS, with initial curves between 20 degrees and 45 degrees. Fourteen measures of the deformity were measured from posteroanterior and lateral radiographs by 2 examiners, and were repeated 5 times at intervals of 3-5 days. Intra-examiner and interexaminer differences were examined. The parameters include measures of curve size, spinal imbalance, sagittal kyphosis and alignment, maximum apical vertebral rotation, T1 tilt, spondylolysis/spondylolisthesis, and skeletal age. Intra-examiner reproducibility was generally excellent for parameters measured from the posteroanterior radiographs but only fair to good for parameters from the lateral radiographs, in which some landmarks were not clearly visible. Of the 13 parameters observed, 7 had excellent interobserver reliability. The measurements from the lateral radiograph were less reproducible and reliable and, thus, may not add value to the assessment of AIS. Taking additional measures encourages a systematic and comprehensive assessment of spinal radiographs.

  1. Improvement of diagnostic agreement among pathologists in resolving an "atypical glands suspicious for cancer" diagnosis in prostate biopsies using a novel "Disease-Focused Diagnostic Review" quality improvement process.

    PubMed

    Shah, Rajal B; Leandro, Gioacchino; Romerocaces, Gloria; Bentley, James; Yoon, Jiyoon; Mendrinos, Savvas; Tadros, Yousef; Tian, Wei; Lash, Richard

    2016-10-01

    One of the major goals of an anatomic pathology laboratory quality program is to minimize unwarranted diagnostic variability and equivocal reporting. This study evaluated the utility of Miraca Life Sciences' "Disease-Focused Diagnostic Review" (DFDR) quality program in improving interobserver diagnostic reproducibility associated with classification of "atypical glands suspicious for adenocarcinoma" (ATYP) in prostate biopsies. Seventy-one selected prostate biopsies with a focus of ATYP were reviewed by 8 pathologists. Participants were blinded to the original diagnosis and were first asked to classify the ATYP as benign, atypical, or limited adenocarcinoma. DFDR comprised a "theoretical consensus" (in which pathologists first reached consensus on the morphological features they considered relevant for the diagnosis of limited prostatic adenocarcinoma), a didactic review including relevant literature, and "practical consensus" (pathologists performed joint microscopic sessions, reconciling each other's observations and positions evaluating a separate unique slide set). Participants were finally asked to reclassify the original 71 ATYP cases based on knowledge gleaned from DFDR. Pre- and post-DFDR interobserver reproducibility of overall diagnostic agreement was assessed. Interobserver reproducibility measured by Fleiss κ values of pre- and post-DFDR was 0.36 and 0.59, respectively (P=.006). Post-DFDR, there were significant improvement for "100% concordance" (P=.011) and reduction for "no consensus" (P=.0004) categories. Despite a lower pre-DFDR reproducibility for non-uropathology fellowship-trained (n=3, κ=0.38) versus uropathology fellowship-trained (n=5, κ=0.43) pathologists, both groups achieved similarly high post-DFDR κ levels (κ=0.58 and 0.56, respectively). DFDR represents an effective tool to formally achieve diagnostic consensus and reduce variability associated with critical diagnoses in an anatomic pathology practice. Copyright © 2016 Elsevier
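
    The Fleiss κ used in this record generalizes kappa to more than two raters. The sketch below assumes every case is rated by the same number of raters and takes a table of per-case category counts; the function name and the counts are hypothetical, not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a table: one row per case, one column per category.

    Each cell holds how many raters chose that category for that case.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])
    # Overall share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]
    # Per-case agreement: share of rater pairs that agree on that case.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_bar = sum(p_item) / n_items
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical counts for 4 biopsies rated by 8 pathologists as
# (benign, atypical, limited adenocarcinoma); illustrative only.
ratings = [[8, 0, 0], [0, 8, 0], [0, 0, 8], [4, 4, 0]]
kappa = fleiss_kappa(ratings)
print(round(kappa, 3))
```

    A single split case (4 versus 4) is enough to pull the value well below 1 in this small example.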

  2. A novel magnetic resonance imaging segmentation technique for determining diffuse intrinsic pontine glioma tumor volume

    PubMed Central

    Singh, Ranjodh; Zhou, Zhiping; Tisnado, Jamie; Haque, Sofia; Peck, Kyung K.; Young, Robert J.; Tsiouris, Apostolos John; Thakur, Sunitha B.; Souweidane, Mark M.

    2017-01-01

    OBJECTIVE Accurately determining diffuse intrinsic pontine glioma (DIPG) tumor volume is clinically important. The aims of the current study were to 1) measure DIPG volumes using methods that require different degrees of subjective judgment; and 2) evaluate interobserver agreement of measurements made using these methods. METHODS Eight patients from a Phase I clinical trial testing convection-enhanced delivery (CED) of a therapeutic antibody were included in the study. Pre-CED, post–radiation therapy axial T2-weighted images were analyzed using 2 methods requiring high degrees of subjective judgment (picture archiving and communication system [PACS] polygon and Volume Viewer auto-contour methods) and 1 method requiring a low degree of subjective judgment (k-means clustering segmentation) to determine tumor volumes. Lin’s concordance correlation coefficients (CCCs) were calculated to assess interobserver agreement. RESULTS The CCCs of measurements made by 2 observers with the PACS polygon and the Volume Viewer auto-contour methods were 0.9465 (lower 1-sided 95% confidence limit 0.8472) and 0.7514 (lower 1-sided 95% confidence limit 0.3143), respectively. Both were considered poor agreement. The CCC of measurements made using k-means clustering segmentation was 0.9938 (lower 1-sided 95% confidence limit 0.9772), which was considered substantial strength of agreement. CONCLUSIONS The poor interobserver agreement of PACS polygon and Volume Viewer auto-contour methods highlighted the difficulty in consistently measuring DIPG tumor volumes using methods requiring high degrees of subjective judgment. k-means clustering segmentation, which requires a low degree of subjective judgment, showed better interobserver agreement and produced tumor volumes with delineated borders. PMID:27391980
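
    Lin's concordance correlation coefficient, named in this record, can be sketched as below, assuming the standard point estimate 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2) with 1/n variances. The confidence limits reported in the abstract are not computed here; the function name and the volumes are hypothetical.

```python
from statistics import mean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(x)
    mx, my = mean(x), mean(y)
    # Population (1/n) variances and covariance.
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mx - my) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for constant, identical inputs")
    return 2 * cov / denom


# Hypothetical tumor volumes (cm^3) from two observers; illustrative only.
observer_1 = [10, 20, 30, 40]
observer_2 = [12, 18, 33, 41]
ccc = lins_ccc(observer_1, observer_2)
print(round(ccc, 4))
```

    Unlike a plain correlation coefficient, the squared mean difference in the denominator penalizes a systematic offset between observers.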

  3. 49 CFR 375.409 - May household goods brokers provide estimates?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... REGULATIONS TRANSPORTATION OF HOUSEHOLD GOODS IN INTERSTATE COMMERCE; CONSUMER PROTECTION REGULATIONS... there is a written agreement between the broker and you, the carrier, adopting the broker's estimate as...

  4. 19 CFR 10.827 - Goods re-entered after repair or alteration in Bahrain.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 1 2011-04-01 2011-04-01 false Goods re-entered after repair or alteration in... States-Bahrain Free Trade Agreement Goods Returned After Repair Or Alteration § 10.827 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  5. 19 CFR 10.934 - Goods re-entered after repair or alteration in Peru.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Peru Trade Promotion Agreement Goods Returned After Repair Or Alteration § 10.934 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment that...

  6. 19 CFR 10.890 - Goods re-entered after repair or alteration in Oman.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Goods re-entered after repair or alteration in... States-Oman Free Trade Agreement Goods Returned After Repair Or Alteration § 10.890 Goods re-entered...” means restoration, renovation, cleaning, re-sterilizing, or other treatment which does not destroy the...

  7. 19 CFR 10.827 - Goods re-entered after repair or alteration in Bahrain.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Goods re-entered after repair or alteration in... States-Bahrain Free Trade Agreement Goods Returned After Repair Or Alteration § 10.827 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  8. 19 CFR 10.934 - Goods re-entered after repair or alteration in Peru.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Goods re-entered after repair or alteration in... States-Peru Trade Promotion Agreement Goods Returned After Repair Or Alteration § 10.934 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment that...

  9. 19 CFR 10.890 - Goods re-entered after repair or alteration in Oman.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 1 2011-04-01 2011-04-01 false Goods re-entered after repair or alteration in... States-Oman Free Trade Agreement Goods Returned After Repair Or Alteration § 10.890 Goods re-entered...” means restoration, renovation, cleaning, re-sterilizing, or other treatment which does not destroy the...

  10. 19 CFR 10.787 - Goods re-entered after repair or alteration in Morocco.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Morocco Free Trade Agreement Goods Returned After Repair Or Alteration § 10.787 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  11. 19 CFR 10.787 - Goods re-entered after repair or alteration in Morocco.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Goods re-entered after repair or alteration in... States-Morocco Free Trade Agreement Goods Returned After Repair Or Alteration § 10.787 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  12. 19 CFR 10.490 - Goods re-entered after repair or alteration in Chile.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 1 2011-04-01 2011-04-01 false Goods re-entered after repair or alteration in... States-Chile Free Trade Agreement Goods Returned After Repair Or Alteration § 10.490 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  13. 19 CFR 10.787 - Goods re-entered after repair or alteration in Morocco.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 1 2011-04-01 2011-04-01 false Goods re-entered after repair or alteration in... States-Morocco Free Trade Agreement Goods Returned After Repair Or Alteration § 10.787 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  14. 19 CFR 10.490 - Goods re-entered after repair or alteration in Chile.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Goods re-entered after repair or alteration in... States-Chile Free Trade Agreement Goods Returned After Repair Or Alteration § 10.490 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  15. 19 CFR 10.827 - Goods re-entered after repair or alteration in Bahrain.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Bahrain Free Trade Agreement Goods Returned After Repair Or Alteration § 10.827 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  16. 19 CFR 10.890 - Goods re-entered after repair or alteration in Oman.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Oman Free Trade Agreement Goods Returned After Repair Or Alteration § 10.890 Goods re-entered...” means restoration, renovation, cleaning, re-sterilizing, or other treatment which does not destroy the...

  17. 19 CFR 10.827 - Goods re-entered after repair or alteration in Bahrain.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Goods re-entered after repair or alteration in... States-Bahrain Free Trade Agreement Goods Returned After Repair Or Alteration § 10.827 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  18. 19 CFR 10.827 - Goods re-entered after repair or alteration in Bahrain.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Bahrain Free Trade Agreement Goods Returned After Repair Or Alteration § 10.827 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  19. 19 CFR 10.890 - Goods re-entered after repair or alteration in Oman.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Oman Free Trade Agreement Goods Returned After Repair Or Alteration § 10.890 Goods re-entered...” means restoration, renovation, cleaning, re-sterilizing, or other treatment which does not destroy the...

  20. 19 CFR 10.787 - Goods re-entered after repair or alteration in Morocco.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Goods re-entered after repair or alteration in... States-Morocco Free Trade Agreement Goods Returned After Repair Or Alteration § 10.787 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  1. 19 CFR 10.490 - Goods re-entered after repair or alteration in Chile.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 1 2013-04-01 2013-04-01 false Goods re-entered after repair or alteration in... States-Chile Free Trade Agreement Goods Returned After Repair Or Alteration § 10.490 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  2. 19 CFR 10.934 - Goods re-entered after repair or alteration in Peru.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Peru Trade Promotion Agreement Goods Returned After Repair Or Alteration § 10.934 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment that...

  3. 19 CFR 10.787 - Goods re-entered after repair or alteration in Morocco.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Morocco Free Trade Agreement Goods Returned After Repair Or Alteration § 10.787 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  4. 19 CFR 10.490 - Goods re-entered after repair or alteration in Chile.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 1 2014-04-01 2014-04-01 false Goods re-entered after repair or alteration in... States-Chile Free Trade Agreement Goods Returned After Repair Or Alteration § 10.490 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  5. 19 CFR 10.490 - Goods re-entered after repair or alteration in Chile.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 1 2012-04-01 2012-04-01 false Goods re-entered after repair or alteration in... States-Chile Free Trade Agreement Goods Returned After Repair Or Alteration § 10.490 Goods re-entered...” means restoration, addition, renovation, re-dyeing, cleaning, re-sterilizing, or other treatment which...

  6. A review of interventions to reduce inter-observer variability in volume delineation in radiation oncology.

    PubMed

    Vinod, Shalini K; Min, Myo; Jameson, Michael G; Holloway, Lois C

    2016-06-01

    Inter-observer variability (IOV) in target volume and organ-at-risk (OAR) delineation is a source of potential error in radiation therapy treatment. The aims of this study were to identify interventions shown to reduce IOV in volume delineation. Medline and Pubmed databases were queried for relevant articles using various keywords to identify articles which evaluated IOV in target or OAR delineation for multiple (>2) observers. The search was limited to English language articles and to those published from 1 January 2000 to 31 December 2014. Reference lists of identified articles were scrutinised to identify relevant studies. Studies were included if they reported IOV in contouring before and after an intervention including the use of additional or alternative imaging. Fifty-six studies were identified. These were grouped into evaluation of guidelines (n = 9), teaching (n = 9), provision of an autocontour (n = 7) and the impact of imaging (n = 31) on IOV. Guidelines significantly reduced IOV in 7/9 studies. Teaching interventions reduced IOV in 8/9 studies, statistically significant in 4. The provision of an autocontour improved consistency of contouring in 6/7 studies, statistically significant in 5. The effect of additional imaging on IOV was variable. Pre-operative CT was useful in reducing IOV in contouring breast and liver cancers, PET scans in lung cancer, rectal cancer and lymphoma and MRI scans in OARs in head and neck cancers. Inter-observer variability in volume delineation can be reduced with the use of guidelines, provision of autocontours and teaching. The use of multimodality imaging is useful in certain tumour sites. © 2016 The Royal Australian and New Zealand College of Radiologists.

  7. Optimizing study design for interobserver reliability: IUGA-ICS classification of complications of prostheses and graft insertion.

    PubMed

    Haylen, Bernard T; Lee, Joseph; Maher, Chris; Deprest, Jan; Freeman, Robert

    2014-06-01

    Results of interobserver reliability studies for the International Urogynecological Association-International Continence Society (IUGA-ICS) Complication Classification coding can be greatly influenced by study design factors such as participant instruction, motivation, and test-question clarity. We attempted to optimize these factors. After a 15-min instructional lecture with eight clinical case examples (including images) and with classification/coding charts available, those clinicians attending an IUGA Surgical Complications workshop were presented with eight similar-style test cases over 10 min and asked to code them using the Category, Time and Site classification. Answers were compared to predetermined correct codes obtained by five instigators of the IUGA-ICS prostheses and grafts complications classification. Prelecture and postquiz participant confidence levels using a five-step Likert scale were assessed. Complete sets of answers to the questions (24 codings) were provided by 34 respondents, only three of whom reported prior use of the charts. Average score [n (%)] out of eight, as well as median score (range) for each coding category were: (i) Category: 7.3 (91 %); 7 (4-8); (ii) Time: 7.8 (98 %); 7 (6-8); (iii) Site: 7.2 (90 %); 7 (5-8). Overall, the equivalent calculations (out of 24) were 22.3 (93 %) and 22 (18-24). Mean prelecture confidence was 1.37 (out of 5), rising to 3.85 postquiz. Urogynecologists had the highest correlation with correct coding, followed closely by fellows and general gynecologists. Optimizing training and study design can lead to excellent results for interobserver reliability of the IUGA-ICS Complication Classification coding, with increased participant confidence in complication-coding ability.

  8. 19 CFR 10.607 - Goods eligible for tariff preference level claims.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... Republic-Central America-United States Free Trade Agreement Tariff Preference Level § 10.607 Goods eligible.... 10.607 Section 10.607 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND... apparel goods provided for in U.S. Note 15(b), Subchapter XV, Chapter 99, HTSUS, that are both cut (or...

  9. A Web-Based Education Program for Colorectal Lesion Diagnosis with Narrow Band Imaging Classification.

    PubMed

    Aihara, Hiroyuki; Kumar, Nitin; Thompson, Christopher C

    2018-04-19

    An education system for narrow band imaging (NBI) interpretation requires sufficient exposure to key features. However, access to didactic lectures by experienced teachers is limited in the United States. To develop and assess the effectiveness of a colorectal lesion identification tutorial. In the image analysis pretest, subjects including 9 experts and 8 trainees interpreted 50 white light (WL) and 50 NBI images of colorectal lesions. Results were not reviewed with subjects. Trainees then participated in an online tutorial emphasizing NBI interpretation in colorectal lesion analysis. A post-test was administered and diagnostic yields were compared to pre-education diagnostic yields. Under the NBI mode, experts showed higher diagnostic yields (sensitivity 91.5% [87.3-94.4], specificity 90.6% [85.1-94.2], and accuracy 91.1% [88.5-93.7] with substantial interobserver agreement [κ value 0.71]) compared to trainees (sensitivity 89.6% [84.8-93.0], specificity 80.6% [73.5-86.3], and accuracy 86.0% [82.6-89.2], with substantial interobserver agreement [κ value 0.69]). The online tutorial improved the diagnostic yields of trainees to the equivalent level of experts (sensitivity 94.1% [90.0-96.6], specificity 89.0% [83.0-93.2], and accuracy 92.0% [89.3-94.7], p < 0.001 with substantial interobserver agreement [κ value 0.78]). This short, online tutorial improved diagnostic performance and interobserver agreement. © 2018 S. Karger AG, Basel.

  10. Poor agreement between endoscopists and gastrointestinal pathologists for the interpretation of probe-based confocal laser endomicroscopy findings

    PubMed Central

    Peter, Shajan; Council, Leona; Bang, Ji Young; Neumann, Helmut; Mönkemüller, Klaus; Varadarajulu, Shyam; Wilcox, Charles Melbern

    2014-01-01

    AIM: To compare the interpretation of probe-based confocal laser endomicroscopy (pCLE) findings between endoscopists and gastrointestinal (GI) pathologists. METHODS: All pCLE procedures were undertaken and the endoscopist rendered an assessment. The same pCLE videos were then viewed offline by an expert GI pathologist. Histopathology was considered the gold standard for definitive diagnosis. The sensitivity, specificity and accuracy for diagnosis of dysplastic/neoplastic GI lesions and interobserver agreement between endoscopists and an experienced gastrointestinal pathologist for pCLE findings were analyzed. RESULTS: Of the 66 included patients, 40 (60.6%) had lesions in the esophagus, 7 (10.6%) in the stomach, 15 (22.7%) in the biliary tract, 3 (4.5%) in the ampulla and 1 (1.5%) in the colon. The overall sensitivity, specificity and accuracy for diagnosing dysplastic/neoplastic lesions using pCLE were higher for endoscopists than for the pathologist at 87.0% vs 69.6%, 80.0% vs 40.0% and 84.8% vs 60.6% (P = 0.0003), respectively. Area under the ROC curve (AUC) was greater for endoscopists than the pathologist (0.83 vs 0.55, P = 0.0001). Overall agreement between endoscopists and the pathologist was moderate for all GI lesions (K = 0.43; 95%CI: 0.26-0.61), luminal lesions (K = 0.40; 95%CI: 0.20-0.60) and those of dysplastic/neoplastic pathology (K = 0.55; 95%CI: 0.37-0.72); the agreement was poor for benign (K = 0.13; 95%CI: -0.097-0.36) and pancreaticobiliary lesions (K = 0.19; 95%CI: -0.26-0.63). CONCLUSION: There is a wide discrepancy in the interpretation of pCLE findings between endoscopists and the pathologist, particularly for benign and malignant pancreaticobiliary lesions. Further studies are needed to identify the cause of this poor agreement. PMID:25548499
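
    The sensitivity, specificity and accuracy figures in the record above follow the usual 2×2 definitions against a gold standard. A minimal Python sketch of that arithmetic is given here. The abstract does not report the underlying 2×2 table, so the counts used in the example are hypothetical: they were chosen only so that they total 66 patients and round to the percentages reported for endoscopists. The function name `diagnostic_metrics` is likewise an assumption made for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy from 2x2 counts.

    tp/fn: gold-standard positive cases read as positive/negative.
    tn/fp: gold-standard negative cases read as negative/positive.
    """
    if min(tp, fn, tn, fp) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts (NOT taken from the abstract), total = 66.
example = diagnostic_metrics(tp=40, fn=6, tn=16, fp=4)
for name, value in example.items():
    print(f"{name}: {value * 100:.1f}%")
```

    With these illustrative counts the three values round to 87.0%, 80.0% and 84.8%; other 2×2 tables could give the same rounded figures.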

  11. Inter-observer reliability of animal-based welfare indicators included in the Animal Welfare Indicators welfare assessment protocol for dairy goats.

    PubMed

    Vieira, A; Battini, M; Can, E; Mattiello, S; Stilwell, G

    2018-01-08

    This study was conducted within the context of the Animal Welfare Indicators (AWIN) project and the underlying scientific motivation for the development of the study was the scarcity of data regarding inter-observer reliability (IOR) of welfare indicators, particularly given the importance of reliability as a further step for developing on-farm welfare assessment protocols. The objective of this study is therefore to evaluate IOR of animal-based indicators (at group and individual-level) of the AWIN welfare assessment protocol (prototype) for dairy goats. In the design of the study, two pairs of observers, one in Portugal and another in Italy, visited 10 farms each and applied the AWIN prototype protocol. Farms in both countries were visited between January and March 2014, and all the observers received the same training before the farm visits were initiated. Data collected during farm visits, and analysed in this study, include group-level and individual-level observations. The results of our study allow us to conclude that most of the group-level indicators presented the highest IOR level ('substantial', 0.85 to 0.99) in both field studies, pointing to a usable set of animal-based welfare indicators that were therefore included in the first level of the final AWIN welfare assessment protocol for dairy goats. Inter-observer reliability of individual-level indicators was lower, but the majority of them still reached 'fair to good' (0.41 to 0.75) and 'excellent' (0.76 to 1) levels. In the paper we explore reasons for the differences found in IOR between the group and individual-level indicators, including how the number of individual-level indicators to be assessed on each animal and the restraining method may have affected the results. Furthermore, we discuss the differences found in the IOR of individual-level indicators in both countries: the Portuguese pair of observers reached a higher level of IOR, when compared with the Italian observers. We argue how the

  12. Agreement and reliability of pelvic floor measurements during rest and on maximum Valsalva maneuver using three-dimensional translabial ultrasound and virtual reality imaging.

    PubMed

    Speksnijder, L; Oom, D M J; Koning, A H J; Biesmeijer, C S; Steegers, E A P; Steensma, A B

    2016-08-01

    Imaging of the levator ani hiatus provides valuable information for the diagnosis and follow-up of patients with pelvic organ prolapse (POP). This study compared measurements of levator ani hiatal volume during rest and on maximum Valsalva, obtained using conventional three-dimensional (3D) translabial ultrasound and virtual reality imaging. Our objectives were to establish their agreement and reliability, and their relationship with prolapse symptoms and POP quantification (POP-Q) stage. One hundred women with an intact levator ani were selected from our tertiary clinic database. Information on clinical symptoms was obtained using standardized questionnaires. Ultrasound datasets were analyzed using a rendered volume with a slice thickness of 1.5 cm, at the level of minimal hiatal dimensions, during rest and on maximum Valsalva. The levator area (in cm²) was measured and multiplied by 1.5 to obtain the levator ani hiatal volume (in cm³) on conventional 3D ultrasound. Levator ani hiatal volume (in cm³) was measured semi-automatically by virtual reality imaging using a segmentation algorithm. Twenty patients were chosen randomly to analyze intra- and interobserver agreement. The mean difference between levator hiatal volume measurements on 3D ultrasound and by virtual reality was 1.52 cm³ (95% CI, 1.00-2.04 cm³) at rest and 1.16 cm³ (95% CI, 0.56-1.76 cm³) during maximum Valsalva (P < 0.001). Both intra- and interobserver intraclass correlation coefficients were ≥ 0.96 for conventional 3D ultrasound and > 0.99 for virtual reality. Patients with prolapse symptoms or POP-Q Stage ≥ 2 had significantly larger hiatal measurements than those without symptoms or POP-Q Stage < 2. Levator ani hiatal volume at rest and on maximum Valsalva is significantly smaller when using virtual reality compared with conventional 3D ultrasound; however, this difference does not seem clinically important. Copyright © 2015 ISUOG. Published by

  13. High-resolution dental magnetic resonance imaging for planning palatal graft surgery-a clinical pilot study.

    PubMed

    Hilgenfeld, Tim; Kästel, Thorsten; Heil, Alexander; Rammelsberg, Peter; Heiland, Sabine; Bendszus, Martin; Schwindling, Franz Sebastian

    2018-04-01

    To evaluate whether high-resolution, non-contrast-enhanced dental magnetic resonance imaging (MRI) can be used for accurate determination of palatal masticatory mucosa thickness (PMMT) and to locate the greater palatal artery (GPA). In five volunteers (four males, one female; mean age 30.2 ± 0.4 years), two independent raters measured PMMT by use of dental MRI in 180 positions. For comparison, clinical bone sounding was performed. The GPA was identified in time-of-flight (TOF) angiography and MSVAT-SPACE-prototype sequence. Intra- and inter-observer agreement for MRI measurements, agreement between MRI and bone sounding were analysed by intra-class correlation coefficient (ICC) and Cohen's kappa (κ). Reliability of dental MRI measurements was high (intra-observer-ICC 0.962; inter-observer ICC 0.959). Agreement of MRI measurements with bone sounding was moderate (ICC 0.744), and the GPA could be identified in 60% of measurement points using the TOF-angiography alone and in 85% with additional information of the MSVAT-SPACE. Good intra-observer agreement was observed for GPA identification (κ: 0.778). Palatal masticatory mucosa thickness measured by high-resolution, non-contrast enhanced dental MRI is comparable with that obtained by bone sounding. Dental MRI enables reliable, non-invasive and radiation-free planning of palatal tissue harvesting and can also be used for location of the GPA at 85% of measurement points, which might help reduce complications during surgery. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  14. Vaginal vault suspension during hysterectomy for benign indications: a prospective register study of agreement on terminology and surgical procedure.

    PubMed

    Bonde, Lisbeth; Noer, Mette Calundann; Møller, Lars Alling; Ottesen, Bent; Gimbel, Helga

    2017-07-01

    Several suspension methods are used to try to prevent pelvic organ prolapse (POP) after hysterectomy. We aimed to evaluate agreement on terminology and surgical procedure of these methods. We randomly chose 532 medical records of women with a history of hysterectomy from the Danish Hysterectomy and Hysteroscopy Database (DHHD). Additionally, we video-recorded 36 randomly chosen hysterectomies. The hysterectomies were registered in the DHHD. The material was categorized according to predefined suspension methods. Agreement compared suspension codes in DHHD (gynecologists' registrations) with medical records (gynecologists' descriptions) and with videos (reviewers' categorizations) respectively. Whether the vaginal vault was suspended (pooled suspension) or not (no suspension method + not described) was analyzed, in addition to each suspension method. Regarding medical records, agreement on terminology was good among patients undergoing pooled suspension in cases of hysterectomy via the abdominal and vaginal route (agreement 78.7, 92.3%). Regarding videos, agreement on surgical procedure was good among pooled suspension patients in cases of hysterectomy via the abdominal, laparoscopic, and vaginal routes (agreement 88.9, 97.8, 100%). Agreement on individual suspension methods differed regarding both medical records (agreement 0-90.1%) and videos (agreement 0-100%). Agreement on terminology and surgical procedure regarding suspension method was good in respect of pooled suspension. However, disagreement was observed when individual suspension methods and operative details were scrutinized. Better consensus of terminology and surgical procedure is warranted to enable further research aimed at preventing POP among women undergoing hysterectomy.

  15. The utility of dual-energy CT for metal artifact reduction from intracranial clipping and coiling.

    PubMed

    Mera Fernández, D; Santos Armentia, E; Bustos Fiore, A; Villanueva Campos, A M; Utrera Pérez, E; Souto Bayarri, M

    2018-04-23

    To assess the ability of dual-energy CT (DECT) to reduce metal-related artifacts in patients with clips and coils in head CT angiography, and to analyze the differences in this reduction between both types of devices. Thirteen patients (6 clips, 7 coils) were selected and retrospectively analyzed. Virtual monoenergetic images (MEI) with photon energies from 40 to 150 keV were obtained. Noise was measured at the area of maximum artifact. Subjective evaluation of streak artifact was performed by two radiologists independently. Differences between noise values in all groups were tested by using the ANOVA test. Mann-Whitney U test was used to compare the differences between clips and coils. Cohen's κ statistic was used to determine interobserver agreement. The lowest noise value was observed at high energy levels (p<0.05). Noise was higher in the coil group than in the clip group (p<0.001). Interobserver agreement was good (κ=0.72). DECT with MEI helps to minimize the artifact from clips and coils in patients who undergo head CT angiography. The reduction of the artifact is greater in patients with surgical clipping than in patients with endovascular coiling. Copyright © 2018 SERAM. Publicado por Elsevier España, S.L.U. All rights reserved.

  16. How Good Is Our College? First Edition

    ERIC Educational Resources Information Center

    Education Scotland, 2016

    2016-01-01

    The new quality framework, "How good is our college?" is a tool to support and enable colleges to evaluate the quality of provision and services alongside reporting on progress in relation to outcome agreements. It is designed to be used by all college staff. Colleges will evaluate the quality of their provision and services using the 12…

  17. Increasing Reliability of Direct Observation Measurement Approaches in Emotional and/or Behavioral Disorders Research Using Generalizability Theory

    ERIC Educational Resources Information Center

    Gage, Nicholas A.; Prykanowski, Debra; Hirn, Regina

    2014-01-01

    Reliability of direct observation outcomes ensures the results are consistent, dependable, and trustworthy. Typically, reliability of direct observation measurement approaches is assessed using interobserver agreement (IOA) and the calculation of observer agreement (e.g., percentage of agreement). However, IOA does not address intraobserver…

  18. Validity and sensitivity to change of the semi-quantitative OMERACT ultrasound scoring system for tenosynovitis in patients with rheumatoid arthritis.

    PubMed

    Ammitzbøll-Danielsen, Mads; Østergaard, Mikkel; Naredo, Esperanza; Terslev, Lene

    2016-12-01

    The aim was to evaluate the metric properties of the semi-quantitative OMERACT US scoring system vs a novel quantitative US scoring system for tenosynovitis, by testing its intra- and inter-reader reliability, sensitivity to change and comparison with clinical tenosynovitis scoring in a 6-month follow-up study. US and clinical assessments of the tendon sheaths of the clinically most affected hand and foot were performed at baseline, 3 and 6 months in 51 patients with RA. Tenosynovitis was assessed using the semi-quantitative scoring system (0-3) proposed by the OMERACT US group and a new quantitative US evaluation (0-100). A sum for US grey scale (GS), colour Doppler (CD) and pixel index (PI), respectively, was calculated for each patient. In 20 patients, intra- and inter-observer agreement was established between two independent investigators. A binary clinical tenosynovitis score was performed, calculating a sum score per patient. The intra- and inter-observer agreements for US tenosynovitis assessments were very good at baseline and for change for GS and CD, but less good for PI. The smallest detectable change was 0.97 for GS, 0.93 for CD and 30.1 for PI. The sensitivity to change from month 0 to 6 was high for GS and CD, and slightly higher than for clinical tenosynovitis score and PI. This study demonstrated an excellent intra- and inter-reader agreement between two investigators for the OMERACT US scoring system for tenosynovitis and a high ability to detect changes over time. Quantitative assessment by PI did not add further information. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  19. Agreement between PRE2DUP register data modeling method and comprehensive drug use interview among older persons

    PubMed Central

    Taipale, Heidi; Tanskanen, Antti; Koponen, Marjaana; Tolppanen, Anna-Maija; Tiihonen, Jari; Hartikainen, Sirpa

    2016-01-01

    Background: PRE2DUP is a modeling method that generates drug use periods (ie, when drug use started and ended) from drug purchases recorded in dispensing-based register data. It is based on the evaluation of personal drug purchasing patterns and considers hospital stays, possible stockpiling of drugs, and package information. Objective: The objective of this study was to investigate person-level agreement between self-reported drug use in the interview and drug use modeled from dispensing data with the PRE2DUP method for various drug classes used by older persons. Methods: Self-reported drug use was assessed from the GeMS Study including a random sample of persons aged ≥75 years from the city of Kuopio, Finland, in 2006. Drug purchases recorded in the Prescription register data of these persons were modeled to determine drug use periods with the PRE2DUP modeling method. Self-reported drug use on the interview date was compared with drug use calculated from register-based data for frequently used drugs and drug classes, and agreement was evaluated by Cohen’s kappa. Kappa values of 0.61–0.80 were considered to represent good agreement and 0.81–1.00 very good agreement. Results: Among 569 participants with mean age of 82 years, the agreement between interview and register data was very good for 75% and very good or good for 93% of the studied drugs or drug classes. Good or very good agreement was observed for drugs that are typically used on a regular basis, whereas “as needed” drugs showed poorer results. Conclusion: The PRE2DUP modeling method validly describes regular drug use among older persons. For most of the drug classes investigated, PRE2DUP-modeled register data described drug use as well as interview-based data, which are more time-consuming to collect. Further studies should be conducted by comparing it with other methods and in different drug user populations. PMID:27785101
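
    The record above labels kappa values of 0.61–0.80 as good and 0.81–1.00 as very good agreement. A minimal Python sketch of that mapping follows. The function name `agreement_label` is hypothetical, rounding to two decimals before comparison is an assumption (the abstract states its bands to two decimals), and values below 0.61 are returned without a specific label because this abstract does not name the lower bands.

```python
def agreement_label(kappa):
    """Map a kappa value to the bands stated in the abstract above.

    Only the two bands given there are labeled; anything lower is
    reported as "below good".
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    k = round(kappa, 2)  # assumption: bands are read at two decimals
    if k >= 0.81:
        return "very good"
    if k >= 0.61:
        return "good"
    return "below good"


for value in (0.85, 0.72, 0.44):
    print(value, agreement_label(value))
```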

  20. Cervical vertebrae maturation method morphologic criteria: poor reproducibility.

    PubMed

    Nestman, Trenton S; Marshall, Steven D; Qian, Fang; Holton, Nathan; Franciscus, Robert G; Southard, Thomas E

    2011-08-01

    The cervical vertebrae maturation (CVM) method has been advocated as a predictor of peak mandibular growth. A careful review of the literature showed potential methodologic errors that might influence the high reported reproducibility of the CVM method, and we recently established that the reproducibility of the CVM method was poor when these potential errors were eliminated. The purpose of this study was to further investigate the reproducibility of the individual vertebral patterns. In other words, the purpose was to determine which of the individual CVM vertebral patterns could be classified reliably and which could not. Ten practicing orthodontists, trained in the CVM method, evaluated the morphology of cervical vertebrae C2 through C4 from 30 cephalometric radiographs using questions based on the CVM method. The Fleiss kappa statistic was used to assess interobserver agreement when evaluating each cervical vertebrae morphology question for each subject. The Kendall coefficient of concordance was used to assess the level of interobserver agreement when determining a "derived CVM stage" for each subject. Interobserver agreement was high for assessment of the lower borders of C2, C3, and C4 that were either flat or curved in the CVM method, but interobserver agreement was low for assessment of the vertebral bodies of C3 and C4 when they were either trapezoidal, rectangular horizontal, square, or rectangular vertical; this led to the overall poor reproducibility of the CVM method. These findings were reflected in the Fleiss kappa statistic. Furthermore, nearly 30% of the time, individual morphologic criteria could not be combined to generate a final CVM stage because of incompatible responses to the 5 questions. Intraobserver agreement in this study was only 62%, on average, when the inconclusive stagings were excluded as disagreements. Intraobserver agreement was worse (44%) when the inconclusive stagings were included as disagreements. 
For the group of subjects
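
    The Fleiss kappa statistic named in the record above generalizes chance-corrected agreement to more than two raters. A minimal Python sketch of the standard formula is given here; it is not the authors' analysis code. The function name `fleiss_kappa` and the small rating table are hypothetical, and the sketch assumes every subject is rated by the same number of raters.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who put subject i in category j.
    Every row must sum to the same number of raters (at least 2).
    """
    if not counts or not counts[0]:
        raise ValueError("counts must be a non-empty table")
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per subject: proportion of agreeing rater pairs.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical table: 4 subjects, 3 raters, 2 categories.
table = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(round(fleiss_kappa(table), 3))
```

    For the design described above (10 raters, 30 radiographs), each row of the table would hold the 10 ratings for one radiograph on one morphology question.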

  1. Reliability of the MDi Psoriasis® Application to Aid Therapeutic Decision-Making in Psoriasis.

    PubMed

    Moreno-Ramírez, D; Herrerías-Esteban, J M; Ojeda-Vila, T; Carrascosa, J M; Carretero, G; de la Cueva, P; Ferrándiz, C; Galán, M; Rivera, R; Rodríguez-Fernández, L; Ruiz-Villaverde, R; Ferrándiz, L

    2017-09-01

    Therapeutic decisions in psoriasis are influenced by disease factors (e.g., severity or location), comorbidity, and demographic and clinical features. We aimed to assess the reliability of a mobile telephone application (MDi-Psoriasis) designed to help the dermatologist make decisions on how to treat patients with moderate to severe psoriasis. We analyzed interobserver agreement between the advice given by an expert panel and the recommendations of the MDi-Psoriasis application in 10 complex cases of moderate to severe psoriasis. The experts were asked their opinion on which treatments were most appropriate, possible, or inappropriate. Data from the same 10 cases were entered into the MDi-Psoriasis application. Agreement was analyzed in 3 ways: paired interobserver concordance (Cohen's κ), multiple interobserver concordance (Fleiss's κ), and percent agreement between recommendations. The mean percent agreement between the total of 1210 observations was 51.3% (95% CI, 48.5-54.1%). Cohen's κ statistic was 0.29 and Fleiss's κ was 0.28. Mean agreement between pairs of human observers only, excluding the MDi-Psoriasis recommendations, was 50.5% (95% CI, 47.6-53.5%). Paired agreement between the recommendations of the MDi-Psoriasis tool and the majority opinion of the expert panel (Cohen's κ) was 0.44 (68.2% agreement). The MDi-Psoriasis tool can generate recommendations that are comparable to those of experts in psoriasis. Copyright © 2017 AEDV. Publicado por Elsevier España, S.L.U. All rights reserved.
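
    The record above reports both percent agreement and Cohen's κ for pairs of observers. A minimal Python sketch of the two measures follows, to show why they can differ: κ subtracts the agreement expected by chance from the raw agreement. The function names and the example ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of items on which two observers give the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two observers."""
    p_o = percent_agreement(ratings_a, ratings_b)
    n = len(ratings_a)
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from each observer's marginal category frequencies.
    p_e = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both observers use one category")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ratings of four treatments by two observers.
expert = ["appropriate", "appropriate", "inappropriate", "inappropriate"]
tool = ["appropriate", "inappropriate", "inappropriate", "inappropriate"]
print(percent_agreement(expert, tool))  # 0.75
print(cohen_kappa(expert, tool))        # 0.5
```

    In this toy case the observers agree on 75% of items, but half of that agreement is expected by chance, so κ is 0.5.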

  2. Comparison of Inter-Observer Variability and Diagnostic Performance of the Fifth Edition of BI-RADS for Breast Ultrasound of Static versus Video Images.

    PubMed

    Youk, Ji Hyun; Jung, Inkyung; Yoon, Jung Hyun; Kim, Sung Hun; Kim, You Me; Lee, Eun Hye; Jeong, Sun Hye; Kim, Min Jung

    2016-09-01

    Our aim was to compare the inter-observer variability and diagnostic performance of the Breast Imaging Reporting and Data System (BI-RADS) lexicon for breast ultrasound of static and video images. Ninety-nine breast masses visible on ultrasound examination from 95 women 19-81 y of age at five institutions were enrolled in this study. They were scheduled to undergo biopsy or surgery or had been stable for at least 2 y of ultrasound follow-up after benign biopsy results or typically benign findings. For each mass, representative long- and short-axis static ultrasound images were acquired; real-time long- and short-axis B-mode video images through the mass area were separately saved as cine clips. Each image was reviewed independently by five radiologists who were asked to classify ultrasound features according to the fifth edition of the BI-RADS lexicon. Inter-observer variability was assessed using kappa (κ) statistics. Diagnostic performance on static and video images was compared using the area under the receiver operating characteristic curve. No significant difference was found in κ values between static and video images for any descriptor, although κ values of video images were higher than those of static images for shape, orientation, margin and calcifications. After receiver operating characteristic curve analysis, the video images (0.83, range: 0.77-0.87) had higher areas under the curve than the static images (0.80, range: 0.75-0.83; p = 0.08). Inter-observer variability and diagnostic performance of video images were similar to those of static images on breast ultrasonography according to the new edition of BI-RADS. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
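
    The area under the receiver operating characteristic curve used above can be read as the probability that a randomly chosen malignant mass receives a higher suspicion score than a randomly chosen benign one, with ties counted as one half. The sketch below computes it that way. It is an illustration only: the function name, the 0/1 label coding, and the example scores are assumptions, and it does not reproduce the study's statistical comparison between static and video images.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for malignant, 0 for benign (assumed coding).
    scores: reader-assigned suspicion scores, higher = more suspicious.
    Requires at least one positive and one negative case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical final-assessment scores for six masses (illustrative only).
labels = [0, 0, 0, 1, 1, 1]
scores = [2, 3, 4, 3, 4, 5]
print(roc_auc(labels, scores))
```

    The pairwise form is quadratic in the number of cases, which is acceptable for a data set of about a hundred masses; a rank-based formulation would be preferable for large samples.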

  3. Nomogram for sample size calculation on a straightforward basis for the kappa statistic.

    PubMed

    Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo

    2014-09-01

    Kappa is a widely used measure of agreement. However, it may not be straightforward in some situations, such as sample size calculation, because of the kappa paradox: high