diagnostic accuracy interobserver: Topics by Science.gov

Sample records for diagnostic accuracy interobserver

Interobserver Variability and Accuracy of High-Definition Endoscopic Diagnosis for Gastric Intestinal Metaplasia among Experienced and Inexperienced Endoscopists

PubMed Central

Hyun, Yil Sik; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo

2013-01-01

Accurate diagnosis of gastric intestinal metaplasia (IM) is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing it. The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Fifty selected cases, imaged with HD endoscopy, were sent to five experienced and five inexperienced endoscopists, who were asked to diagnose gastric IM by visual inspection. Interobserver agreement between endoscopists was evaluated to assess the diagnostic reliability of HD endoscopy for IM, and diagnostic accuracy, sensitivity, and specificity were evaluated to assess its validity. Interobserver agreement was "poor" among both the experienced endoscopists (κ = 0.38) and the inexperienced endoscopists (κ = 0.33). The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Because diagnosis of IM through visual inspection is unreliable, biopsy should be considered for all areas suspicious for gastric IM. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy for gastric IM. PMID:23678267
Interobserver variability and accuracy of high-definition endoscopic diagnosis for gastric intestinal metaplasia among experienced and inexperienced endoscopists.

PubMed

Hyun, Yil Sik; Han, Dong Soo; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo

2013-05-01

Accurate diagnosis of gastric intestinal metaplasia (IM) is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing it. The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Fifty selected cases, imaged with HD endoscopy, were sent to five experienced and five inexperienced endoscopists, who were asked to diagnose gastric IM by visual inspection. Interobserver agreement between endoscopists was evaluated to assess the diagnostic reliability of HD endoscopy for IM, and diagnostic accuracy, sensitivity, and specificity were evaluated to assess its validity. Interobserver agreement was "poor" among both the experienced endoscopists (κ = 0.38) and the inexperienced endoscopists (κ = 0.33). The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Because diagnosis of IM through visual inspection is unreliable, biopsy should be considered for all areas suspicious for gastric IM. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy for gastric IM.
Assessment of colon polyp morphology: Is education effective?

PubMed Central

Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja

2017-01-01

AIM: To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. METHODS: For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. RESULTS: The overall Fleiss’ kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. CONCLUSION: The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy. PMID:28974894
Assessment of colon polyp morphology: Is education effective?

PubMed

Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja

2017-09-14

To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. The overall Fleiss' kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy.
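The abstract reports Fleiss' kappa without showing how it is obtained. As a rough illustration only, the following Python sketch computes the statistic from a table of rating counts, using the standard definition (observed pairwise agreement corrected for chance agreement). The function name `fleiss_kappa` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every subject must have been rated by the same number of raters (>= 2).
    """
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("need at least one subject")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: fraction of rater pairs that chose the same category.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 4 video clips, 3 raters, 2 morphology categories.
toy = [[3, 0], [2, 1], [0, 3], [1, 2]]
print(round(fleiss_kappa(toy), 3))
```

In a design like the one described, each row would correspond to one video clip and each column to one morphology category, with a separate table for each test session.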
Interpretation of bedside chest X-rays in the ICU: is the radiologist still needed?

PubMed

Martini, Katharina; Ganter, Christoph; Maggiorini, Marco; Winklehner, Anna; Leupi-Skibinski, Katarzyna E; Frauenfelder, Thomas; Nguyen-Kim, Thi Dan Linh

2015-01-01

To compare the diagnostic accuracy of intensivists with that of radiologists in reading bedside chest X-rays. In a retrospective trial, 33 bedside chest X-rays were evaluated by five radiologists and five intensivists with different experience. Images were evaluated for devices and lung pathologies. Interobserver agreement and diagnostic accuracy were calculated. Computed tomography served as reference standard. Seniors had higher diagnostic accuracy than residents (mean-ExpB(Senior)=1.456; mean-ExpB(Resident)=1.635). Interobserver agreement for installations was more homogeneously distributed among radiologists than among intensivists (ExpB(Rad)=1.204-1.672; ExpB(Int)=1.005-2.368). Seniors of the two disciplines had comparable diagnostic accuracy, with no significant difference in diagnostic performance between them, whereas the resident intensivists might still benefit from an interdisciplinary dialogue. Copyright © 2015 Elsevier Inc. All rights reserved.
Detection of intracavitary uterine pathology using offline analysis of three-dimensional ultrasound volumes: interobserver agreement and diagnostic accuracy.

PubMed

Van den Bosch, T; Valentin, L; Van Schoubroeck, D; Luts, J; Bignardi, T; Condous, G; Epstein, E; Leone, F P; Testa, A C; Van Huffel, S; Bourne, T; Timmerman, D

2012-10-01

To estimate the diagnostic accuracy and interobserver agreement in predicting intracavitary uterine pathology at offline analysis of three-dimensional (3D) ultrasound volumes of the uterus. 3D volumes (unenhanced ultrasound and gel infusion sonography with and without power Doppler, i.e. four volumes per patient) of 75 women presenting with abnormal uterine bleeding at a 'bleeding clinic' were assessed offline by six examiners. The sonologists were asked to provide a tentative diagnosis. A histological diagnosis was obtained by hysteroscopy with biopsy or operative hysteroscopy. Proliferative, secretory or atrophic endometrium was classified as 'normal' histology; endometrial polyps, intracavitary myomas, endometrial hyperplasia and endometrial cancer were classified as 'abnormal' histology. The diagnostic accuracy of the six sonologists with regard to normal/abnormal histology and interobserver agreement were estimated. Intracavitary pathology was diagnosed at histology in 39% of patients. Agreement between the ultrasound diagnosis and the histological diagnosis (normal vs abnormal) ranged from 67 to 83% for the six sonologists. In 45% of cases all six examiners agreed with regard to the presence/absence of intracavitary pathology. The percentage agreement between any two examiners ranged from 65 to 91% (Cohen's κ, 0.31-0.81). The Schouten κ for all six examiners was 0.51 (95% CI, 0.40-0.62), while the highest Schouten κ for any three examiners was 0.69. When analyzing stored 3D ultrasound volumes, agreement between sonologists with regard to classifying the endometrium/uterine cavity as normal or abnormal as well as the diagnostic accuracy varied substantially. Possible actions to improve interobserver agreement and diagnostic accuracy include optimization of image quality and the use of a consistent technique for analyzing the 3D volumes. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.
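Pairwise percentage agreement and Cohen's κ, as reported for pairs of examiners, can be illustrated with a short Python sketch. It assumes two raters who each label the same cases; the function name `cohen_kappa` and the example labels are hypothetical, and the Schouten κ used for all six examiners is not covered here.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def cohen_kappa(
    rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]
) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two raters."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: share of cases on which the two raters match.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal category shares.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")

    return p_observed, (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels from two examiners on ten stored volumes.
a = ["abnormal", "abnormal", "normal", "normal", "abnormal",
     "normal", "abnormal", "abnormal", "normal", "normal"]
b = ["abnormal", "abnormal", "normal", "normal", "abnormal",
     "normal", "normal", "abnormal", "abnormal", "normal"]
agreement, kappa = cohen_kappa(a, b)
print(f"agreement={agreement:.0%}, kappa={kappa:.2f}")
```

The sketch makes visible why a high percentage agreement can coexist with a modest κ: the chance-agreement term grows when both raters use one category far more often than the other.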
A Web-Based Education Program for Colorectal Lesion Diagnosis with Narrow Band Imaging Classification.

PubMed

Aihara, Hiroyuki; Kumar, Nitin; Thompson, Christopher C

2018-04-19

An education system for narrow band imaging (NBI) interpretation requires sufficient exposure to key features. However, access to didactic lectures by experienced teachers is limited in the United States. To develop and assess the effectiveness of a colorectal lesion identification tutorial. In the image analysis pretest, subjects including 9 experts and 8 trainees interpreted 50 white light (WL) and 50 NBI images of colorectal lesions. Results were not reviewed with subjects. Trainees then participated in an online tutorial emphasizing NBI interpretation in colorectal lesion analysis. A post-test was administered and diagnostic yields were compared to pre-education diagnostic yields. Under the NBI mode, experts showed higher diagnostic yields (sensitivity 91.5% [87.3-94.4], specificity 90.6% [85.1-94.2], and accuracy 91.1% [88.5-93.7] with substantial interobserver agreement [κ value 0.71]) compared to trainees (sensitivity 89.6% [84.8-93.0], specificity 80.6% [73.5-86.3], and accuracy 86.0% [82.6-89.2], with substantial interobserver agreement [κ value 0.69]). The online tutorial improved the diagnostic yields of trainees to the equivalent level of experts (sensitivity 94.1% [90.0-96.6], specificity 89.0% [83.0-93.2], and accuracy 92.0% [89.3-94.7], p < 0.001 with substantial interobserver agreement [κ value 0.78]). This short, online tutorial improved diagnostic performance and interobserver agreement. © 2018 S. Karger AG, Basel.
Comparison of 3T and 7T susceptibility-weighted angiography of the substantia nigra in diagnosing Parkinson disease.

PubMed

Cosottini, M; Frosini, D; Pesaresi, I; Donatelli, G; Cecchi, P; Costagli, M; Biagi, L; Ceravolo, R; Bonuccelli, U; Tosetti, M

2015-03-01

Standard neuroimaging fails in defining the anatomy of the substantia nigra and has a marginal role in the diagnosis of Parkinson disease. Recently 7T MR target imaging of the substantia nigra has been useful in diagnosing Parkinson disease. We performed a comparative study to evaluate whether susceptibility-weighted angiography can diagnose Parkinson disease with a 3T scanner. Fourteen patients with Parkinson disease and 13 healthy subjects underwent MR imaging examination at 3T and 7T by using susceptibility-weighted angiography. Two expert blinded observers and 1 neuroradiology fellow evaluated the 3T and 7T images of the sample to identify substantia nigra abnormalities indicative of Parkinson disease. Diagnostic accuracy and intra- and interobserver agreement were calculated separately for 3T and 7T acquisitions. Susceptibility-weighted angiography 7T MR imaging can diagnose Parkinson disease with a mean sensitivity of 93%, specificity of 100%, and diagnostic accuracy of 96%. 3T MR imaging diagnosed Parkinson disease with a mean sensitivity of 79%, specificity of 94%, and diagnostic accuracy of 86%. Intraobserver and interobserver agreement was excellent at 7T. At 3T, intraobserver agreement was excellent for experts, and interobserver agreement ranged between good and excellent. The less expert reader obtained a diagnostic accuracy of 89% at 3T. Susceptibility-weighted angiography images obtained at 3T and 7T differentiate controls from patients with Parkinson disease with a higher diagnostic accuracy at 7T. The capability of 3T in diagnosing Parkinson disease might encourage its use in clinical practice. The use of the more accurate 7T should be supported by a dedicated cost-effectiveness study. © 2015 by American Journal of Neuroradiology.
Multicenter accuracy and interobserver agreement of spot sign identification in acute intracerebral hemorrhage.

PubMed

Huynh, Thien J; Flaherty, Matthew L; Gladstone, David J; Broderick, Joseph P; Demchuk, Andrew M; Dowlatshahi, Dar; Meretoja, Atte; Davis, Stephen M; Mitchell, Peter J; Tomlinson, George A; Chenkin, Jordan; Chia, Tze L; Symons, Sean P; Aviv, Richard I

2014-01-01

Rapid, accurate, and reliable identification of the computed tomography angiography spot sign is required to identify patients with intracerebral hemorrhage for trials of acute hemostatic therapy. We sought to assess the accuracy and interobserver agreement for spot sign identification. A total of 131 neurology, emergency medicine, and neuroradiology staff and fellows underwent imaging certification for spot sign identification before enrolling patients in 3 trials targeting spot-positive intracerebral hemorrhage for hemostatic intervention (STOP-IT, SPOTLIGHT, STOP-AUST). Ten intracerebral hemorrhage cases (spot-positive/negative ratio, 1:1) were presented for evaluation of spot sign presence, number, and mimics. True spot positivity was determined by consensus of 2 experienced neuroradiologists. Diagnostic performance, agreement, and differences by training level were analyzed. Mean accuracy, sensitivity, and specificity for spot sign identification were 87%, 78%, and 96%, respectively. Overall sensitivity was lower than specificity (P<0.001) because of true spot signs incorrectly perceived as spot mimics. Interobserver agreement for spot sign presence was moderate (κ = 0.60). When true spots were correctly identified, 81% correctly identified the presence of single or multiple spots. Median time needed to evaluate the presence of a spot sign was 1.9 minutes (interquartile range, 1.2-3.1 minutes). Diagnostic performance, interobserver agreement, and time needed for spot sign evaluation were similar among staff physicians and fellows. Accuracy for spot identification is high with opportunity for improvement in spot interpretation sensitivity and interobserver agreement particularly through greater reliance on computed tomography angiography source data and awareness of limitations of multiplanar images. Further prospective study is needed.
Assessment of the diagnostic performance and interobserver variability of endocytoscopy in Barrett’s esophagus: A pilot ex-vivo study

PubMed Central

Tomizawa, Yutaka; Iyer, Prasad G; Wongkeesong, Louis M; Buttar, Navtej S; Lutzke, Lori S; Wu, Tsung-Teh; Wang, Kenneth K

2013-01-01

AIM: To investigate a classification of endocytoscopy (ECS) images in Barrett’s esophagus (BE) and evaluate its diagnostic performance and interobserver variability. METHODS: ECS was applied to surveillance endoscopic mucosal resection (EMR) specimens of BE ex-vivo. The mucosal surface of each specimen was stained with 1% methylene blue and surveyed with a catheter-type endocytoscope. We selected still images that were most representative of the endoscopically suspect lesion and matched them with the final histopathological diagnosis to accomplish accurate correlation. The diagnostic performance and inter-observer variability of the new classification scheme were assessed in a blinded fashion by physicians with expertise in both BE and ECS and inexperienced physicians with no prior exposure to ECS. RESULTS: Three staff physicians and 22 gastroenterology fellows classified eight randomly assigned unknown still ECS pictures (two images per classification) into one of four histopathologic categories as follows: (1) BEC1-squamous epithelium; (2) BEC2-BE without dysplasia; (3) BEC3-BE with dysplasia; and (4) BEC4-esophageal adenocarcinoma (EAC) in BE. Accuracy of diagnosis in staff physicians and clinical fellows was, respectively, 100% and 99.4% for BEC1, 95.8% and 83.0% for BEC2, 91.7% and 83.0% for BEC3, and 95.8% and 98.3% for BEC4. Interobserver agreement of the faculty physicians and fellows in classifying each category was 0.932 and 0.897, respectively. CONCLUSION: This is the first study to investigate a classification system of ECS in BE. This ex-vivo pilot study demonstrated acceptable diagnostic accuracy and excellent interobserver agreement. PMID:24379583
A dual tracer (68)Ga-DOTANOC PET/CT and (18)F-FDG PET/CT pilot study for detection of cardiac sarcoidosis.

PubMed

Gormsen, Lars C; Haraldsen, Ate; Kramer, Stine; Dias, Andre H; Kim, Won Yong; Borghammer, Per

2016-12-01

Cardiac sarcoidosis (CS) is a potentially fatal condition lacking a single test with acceptable diagnostic accuracy. (18)F-FDG PET/CT has emerged as a promising imaging modality, but is challenged by physiological myocardial glucose uptake. An alternative tracer, (68)Ga-DOTANOC, binds to somatostatin receptors on inflammatory cells in sarcoid granulomas. We therefore aimed to conduct a proof-of-concept study using (68)Ga-DOTANOC to diagnose CS. In addition, we compared diagnostic accuracy and inter-observer variability of (68)Ga-DOTANOC vs. (18)F-FDG PET/CT. Nineteen patients (seven female) with suspected CS were prospectively recruited and dual tracer scanned within 7 days. PET images were reviewed by four expert readers for signs of CS and compared to the reference standard (Japanese ministry of Health and Welfare CS criteria). CS was diagnosed in 3/19 patients. By consensus, 11/19 (18)F-FDG scans and 0/19 (68)Ga-DOTANOC scans were rated as inconclusive. The sensitivity of (18)F-FDG PET for diagnosing CS was 33 %, specificity was 88 %, PPV was 33 %, NPV was 88 %, and diagnostic accuracy was 79 %. For (68)Ga-DOTANOC, accuracy was 100 %. Inter-observer agreement was poor for (18)F-FDG PET (Fleiss' combined kappa 0.27, NS) and significantly better for (68)Ga-DOTANOC (Fleiss' combined kappa 0.46, p = 0.001). Despite prolonged pre-scan fasting, a large proportion of (18)F-FDG PET/CT images were rated as inconclusive, resulting in low agreement among reviewers and correspondingly poor diagnostic accuracy. By contrast, (68)Ga-DOTANOC PET/CT had excellent diagnostic accuracy with the caveat that inter-observer variability was still significant. Nevertheless, (68)Ga-DOTANOC PET/CT looks very promising as an alternative CS PET tracer. Current Controlled Trials NCT01729169 .
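The percentages reported for (18)F-FDG PET can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the 2×2 counts (1 true positive, 2 false negatives, 2 false positives, 14 true negatives) are not stated in the abstract. They are an assumed reconstruction that is consistent with CS in 3/19 patients and with the reported values, and the handling of inconclusive scans is not modelled. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of an index test against a reference standard."""
    tp: int  # reference positive, test positive
    fn: int  # reference positive, test negative
    fp: int  # reference negative, test positive
    tn: int  # reference negative, test negative

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)


# ASSUMED counts (not given in the abstract): 3 of 19 patients have CS.
fdg = ConfusionCounts(tp=1, fn=2, fp=2, tn=14)
for name in ("sensitivity", "specificity", "ppv", "npv", "accuracy"):
    print(f"{name}: {getattr(fdg, name):.1%}")
```

With these assumed counts the sketch gives 33.3%, 87.5%, 33.3%, 87.5%, and 78.9%, which round to the reported 33 %, 88 %, 33 %, 88 %, and 79 %. With only three reference-positive patients, a single reclassified scan shifts sensitivity by 33 percentage points.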
Identification of Nasal Bone Fractures on Conventional Radiography and Facial CT: Comparison of the Diagnostic Accuracy in Different Imaging Modalities and Analysis of Interobserver Reliability.

PubMed

Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin

2013-09-01

There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee.
Prediction of Tubal Ectopic Pregnancy Using Offline Analysis of 3-Dimensional Transvaginal Ultrasonographic Data Sets: An Interobserver and Diagnostic Accuracy Study.

PubMed

Infante, Fernando; Espada Vaquero, Mercedes; Bignardi, Tommaso; Lu, Chuan; Testa, Antonia C; Fauchon, David; Epstein, Elisabeth; Leone, Francesco P G; Van den Bosch, Thierry; Martins, Wellington P; Condous, George

2018-06-01

To assess interobserver reproducibility in detecting tubal ectopic pregnancies by reading data sets from 3-dimensional (3D) transvaginal ultrasonography (TVUS) and comparing it with real-time 2-dimensional (2D) TVUS. Images were initially classified as showing pregnancies of unknown location or tubal ectopic pregnancies on real-time 2D TVUS by an experienced sonologist, who acquired five 3D volumes. Data sets were analyzed offline by 5 observers who had to classify each case as ectopic pregnancy or pregnancy of unknown location. The interobserver reproducibility was evaluated by the Fleiss κ statistic. The performance of each observer in predicting ectopic pregnancies was compared to that of the experienced sonologist. Women were followed until they were reclassified as follows: (1) failed pregnancy of unknown location; (2) intrauterine pregnancy; (3) ectopic pregnancy; or (4) persistent pregnancy of unknown location. Sixty-one women were included. The agreement between reading offline 3D data sets and the first real-time 2D TVUS was very good (80%-82%; κ = 0.89). The overall interobserver agreement among observers reading offline 3D data sets was moderate (κ = 0.52). The diagnostic performance of experienced observers reading offline 3D data sets had accuracy of 78.3% to 85.0%, sensitivity of 66.7% to 81.3%, specificity of 79.5% to 88.4%, positive predictive value of 57.1% to 72.2%, and negative predictive value of 87.5% to 91.3%, compared to the experienced sonologist's real-time 2D TVUS: accuracy of 94.5%, sensitivity of 94.4%, specificity of 94.5%, positive predictive value of 85.0%, and negative predictive value of 98.1%. The diagnostic accuracy of 3D TVUS by reading offline data sets for predicting ectopic pregnancies is dependent on experience. Reading only static 3D data sets without clinical information does not match the diagnostic performance of real-time 2D TVUS combined with clinical information obtained during the scan. © 2017 by the American Institute of Ultrasound in Medicine.
Identification of Nasal Bone Fractures on Conventional Radiography and Facial CT: Comparison of the Diagnostic Accuracy in Different Imaging Modalities and Analysis of Interobserver Reliability

PubMed Central

Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin

2013-01-01

Background: There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. Objectives: To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. Patients and Methods: A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Results: Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. Conclusion: For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee. PMID:24348599
Diagnostic performance of dual-energy contrast-enhanced subtracted mammography in dense breasts compared to mammography alone: interobserver blind-reading analysis.

PubMed

Cheung, Yun-Chung; Lin, Yu-Ching; Wan, Yung-Liang; Yeow, Kee-Min; Huang, Pei-Chin; Lo, Yung-Feng; Tsai, Hsiu-Pei; Ueng, Shir-Hwa; Chang, Chee-Jen

2014-10-01

To analyse the accuracy of dual-energy contrast-enhanced spectral mammography in dense breasts in comparison with contrast-enhanced subtracted mammography (CESM) and conventional mammography (Mx). CESM cases of dense breasts with histological proof were evaluated in the present study. Four radiologists with varying experience in mammography interpretation blindly read Mx first, followed by CESM. The diagnostic profiles, consistency and learning curve were analysed statistically. One hundred lesions (28 benign and 72 breast malignancies) in 89 females were analysed. Use of CESM improved the cancer diagnosis by 21.2 % in sensitivity (71.5 % to 92.7 %), by 16.1 % in specificity (51.8 % to 67.9 %) and by 19.8 % in accuracy (65.9 % to 85.8 %) compared with Mx. The interobserver diagnostic consistency was markedly higher using CESM than using Mx alone (0.6235 vs. 0.3869 using the kappa ratio). The probability of a correct prediction was elevated from 80 % to 90 % after 75 consecutive case readings. CESM provided additional information with consistent improvement of the cancer diagnosis in dense breasts compared to Mx alone. The prediction of the diagnosis could be improved by the interpretation of a significant number of cases in the presence of 6 % benign contrast enhancement in this study.
• DE-CESM improves the cancer diagnosis in dense breasts compared with mammography.
• DE-CESM shows greater consistency than mammography alone by interobserver blind reading.
• Diagnostic improvement of DE-CESM is independent of the mammographic reading experience.
Standardized Reporting of Prostate MRI: Comparison of the Prostate Imaging Reporting and Data System (PI-RADS) Version 1 and Version 2

PubMed Central

Tewes, Susanne; Mokov, Nikolaj; Hartung, Dagmar; Schick, Volker; Peters, Inga; Schedl, Peter; Pertschy, Stefanie; Wacker, Frank; Voshage, Götz; Hueper, Katja

2016-01-01

Introduction: The objective of our study was to determine the agreement between version 1 (v1) and v2 of the Prostate Imaging Reporting and Data System (PI-RADS) for evaluation of multiparametric prostate MRI (mpMRI) and to compare their diagnostic accuracy, their inter-observer agreement and practicability. Material and Methods: mpMRI including T2-weighted imaging, diffusion-weighted imaging (DWI) and dynamic contrast-enhanced imaging (DCE) of 54 consecutive patients, who subsequently underwent MRI-guided in-bore biopsy, were re-analyzed according to PI-RADS v1 and v2 by two independent readers. Diagnostic accuracy for detection of prostate cancer (PCa) was assessed using ROC-curve analysis. Agreement between PI-RADS versions and observers was calculated and the time needed for scoring was determined. Results: MRI-guided biopsy revealed PCa in 31 patients. Diagnostic accuracy for detection of PCa was equivalent with both PI-RADS versions for reader 1 with sensitivities and specificities of 84%/91% (AUC = 0.91, 95% CI [0.8–1]) for PI-RADS v1 and 100%/74% (AUC = 0.92, 95% CI [0.8–1]) for PI-RADS v2. Reader 2 achieved similar diagnostic accuracy with sensitivity and specificity of 74%/91% (AUC = 0.88, 95% CI [0.8–1]) for PI-RADS v1 and 81%/91% (AUC = 0.91, 95% CI [0.8–1]) for PI-RADS v2. Agreement between scores determined with different PI-RADS versions was good (reader 1: κ = 0.62, reader 2: κ = 0.64). Inter-observer agreement was moderate with PI-RADS v2 (κ = 0.56) and fair with v1 (κ = 0.39). The time required for building the PI-RADS score was significantly lower with PI-RADS v2 compared to v1 (24.7±2.3 s vs. 41.9±2.6 s, p<0.001). Conclusion: Agreement between PI-RADS versions was high and both versions revealed high diagnostic accuracy for detection of PCa. Due to better inter-observer agreement for malignant lesions and less time demand, the new PI-RADS version could be more practicable for clinical routine. PMID:27657729
Diagnostic accuracy and reproducibility of the Ottawa Knee Rule vs the Pittsburgh Decision Rule.

PubMed

Cheung, Tung C; Tank, Yeliz; Breederveld, Roelf S; Tuinebreijer, Wim E; de Lange-de Klerk, Elly S M; Derksen, Robert J

2013-04-01

The aim of this present study was to compare the diagnostic accuracy and reproducibility of 2 clinical decision rules (the Ottawa Knee Rules [OKR] and Pittsburgh Decision Rules [PDR]) developed for selective use of x-rays in the evaluation of isolated knee trauma. Application of a decision rule leads to a more efficient evaluation of knee injuries and a reduction in health care costs. The diagnostic accuracy and reproducibility are compared in this study. A cross-sectional interobserver study was conducted in the emergency department of an urban teaching hospital from October 2008 to July 2009. Two observer groups collected data on standardized case-report forms: emergency medicine residents and surgical residents. Standard knee radiographs were performed in each patient. Participants were patients 18 years and older with isolated knee injuries. Pooled sensitivity and specificity were compared using χ(2) statistics, and interobserver agreement was calculated by using κ statistics. Ninety injuries were assessed. Seven injuries concerned fractures (7.8%). For the OKR, the pooled sensitivity and specificity were 0.86 (95% confidence interval [CI], 0.57-0.96) and 0.27 (95% CI, 0.21-0.35), respectively. The PDR had a pooled sensitivity and specificity of 0.86 (95% CI, 0.57-0.96) and 0.51 (95% CI, 0.44-0.59). The PDR was significantly (P = .002) more specific. The κ values for the OKR and PDR were 0.51 (95% CI, 0.32-0.71) and 0.71 (95% CI, 0.57-0.86), respectively. The PDR was found to be more specific than the OKR, with equal sensitivity. Interobserver agreement was moderate for the OKR and substantial for the PDR. Copyright © 2013 Elsevier Inc. All rights reserved.
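The labels "moderate" and "substantial" attached to the κ values match the widely used Landis and Koch bands, although the abstract does not say which scale was applied. The Python sketch below assumes that scale; the function name `landis_koch_label` is hypothetical, and other published scales draw the boundaries differently.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to its Landis and Koch (1977) descriptive band."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # not reached for valid input; kept for safety


# Kappa point estimates reported for the two decision rules.
for rule, kappa in (("OKR", 0.51), ("PDR", 0.71)):
    print(rule, kappa, landis_koch_label(kappa))
```

Because the reported confidence intervals are wide (0.32-0.71 and 0.57-0.86), each interval spans more than one band, so a label based on the point estimate alone is best read with caution.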
Reliability of laser Doppler flowmetry curve reading for measurement of toe and ankle pressures: intra- and inter-observer variation.

PubMed

Høyer, C; Paludan, J P D; Pavar, S; Biurrun Manresa, J A; Petersen, L J

2014-03-01

To assess the intra- and inter-observer variation in laser Doppler flowmetry curve reading for measurement of toe and ankle pressures. A prospective single blinded diagnostic accuracy study was conducted on 200 patients with known or suspected peripheral arterial disease (PAD), with a total of 760 curve sets produced. The first curve reading for this study was performed by laboratory technologists blinded to clinical clues and previous readings at least 3 months after the primary data sampling. The pressure curves were later reassessed following another period of at least 3 months. Observer agreement in diagnostic classification according to TASC-II criteria was quantified using Cohen's kappa. Reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. The overall agreement in diagnostic classification (PAD/not PAD) was 173/200 (87%) for intra-observer (κ = .858) and 175/200 (88%) for inter-observer data (κ = .787). Reliability analysis confirmed excellent correlation for both intra- and inter-observer data (ICC all ≥.931). The coefficients of variance ranged from 2.27% to 6.44% for intra-observer and 2.39% to 8.42% for inter-observer data. Subgroup analysis showed lower observer-variation for reading of toe pressures in patients with diabetes and/or chronic kidney disease than patients not diagnosed with these conditions. Bland-Altman plots showed higher variation in toe pressure readings than ankle pressure readings. This study shows substantial intra- and inter-observer agreement in diagnostic classification and reading of absolute pressures when using laboratory technologists as observers. The study emphasises that observer variation for curve reading is an important factor concerning the overall reproducibility of the method. Our data suggest diabetes and chronic kidney disease have an influence on toe pressure reproducibility. Copyright © 2013 European Society for Vascular Surgery. 
Published by Elsevier Ltd. All rights reserved.
Assessment of Myometrial Invasion in Premenopausal Grade 1 Endometrial Carcinoma: Is Magnetic Resonance Imaging a Reliable Tool in Selecting Patients for Fertility-Preserving Therapy?

PubMed

Sakane, Makoto; Hori, Masatoshi; Onishi, Hiromitsu; Tsuboyama, Takahiro; Ota, Takashi; Tatsumi, Mitsuaki; Ueda, Yutaka; Kimura, Toshihiro; Kimura, Tadashi; Tomiyama, Noriyuki

The aim of this study was to evaluate the diagnostic ability of magnetic resonance imaging (MRI) in premenopausal women with G1 endometrial carcinoma. Twenty-six patients underwent T2W, diffusion weighted, and dynamic contrast-enhanced 3-T MRI. The degree of myometrial invasion was pathologically classified as no invasion, shallow invasion (3 mm or less), or deeper invasion. Two radiologists assessed myometrial invasion on MRI. Diagnostic accuracy, sensitivity, specificity, positive and negative predictive values, AUC, and interobserver agreement were analyzed. For assessing myometrial invasion, mean accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC, respectively, were as follows: 63%, 42%, 85%, 79%, 47%, and 0.75. Mean interobserver agreement was fair (κ = 0.36). Shallow invasions were underestimated as no invasion on MRI in all 6 cases. Magnetic resonance imaging produced false-negative results in half of the patients. The misjudgments tended to occur in patients with shallow invasion.
The Feasibility of Classifying Breast Masses Using a Computer-Assisted Diagnosis (CAD) System Based on Ultrasound Elastography and BI-RADS Lexicon.

PubMed

Fleury, Eduardo F C; Gianini, Ana Claudia; Marcomini, Karem; Oliveira, Vilmar

2018-01-01

To determine the applicability of a computer-aided diagnostic system for strain elastography to the classification of breast masses diagnosed by ultrasound and scored using the criteria proposed by the Breast Imaging Reporting and Data System (BI-RADS) ultrasound lexicon, and to determine the diagnostic accuracy and interobserver variability. This prospective study was conducted between March 1, 2016, and May 30, 2016. A total of 83 breast masses subjected to percutaneous biopsy were included. Ultrasound elastography images obtained before biopsy were interpreted by 3 radiologists with and without the aid of the computer-aided diagnostic system for strain elastography. The parameters evaluated for each radiologist were sensitivity, specificity, and diagnostic accuracy, with and without the computer-aided diagnostic system for strain elastography. Interobserver variability was assessed using a weighted κ test and an intraclass correlation coefficient. The areas under the receiver operating characteristic curves were also calculated. The areas under the receiver operating characteristic curve were 0.835, 0.801, and 0.765 for readers 1, 2, and 3, respectively, without the computer-aided diagnostic system for strain elastography, and 0.900, 0.926, and 0.868, respectively, with it. The intraclass correlation coefficient between the 3 readers was 0.6713 without the computer-aided diagnostic system for strain elastography and 0.811 with it. The proposed computer-aided diagnostic system for strain elastography has the potential to improve the diagnostic performance of radiologists in breast examination using ultrasound associated with elastography.

Emergency department CT screening of patients with nontraumatic neurological symptoms referred to the posterior fossa: comparison of thin versus thick slice images.

PubMed

Kamalian, Shervin; Atkinson, Wendy L; Florin, Lauren A; Pomerantz, Stuart R; Lev, Michael H; Romero, Javier M

2014-06-01

Evaluation of the posterior fossa (PF) on 5-mm-thick helical CT images (current default) has improved diagnostic accuracy compared to 5-mm sequential CT images; however, 5-mm-thick images may not be ideal for PF pathology due to volume averaging of rapid changes in anatomy in the Z-direction. Therefore, we sought to determine if routine review of 1.25-mm-thin helical CT images has superior accuracy in screening for nontraumatic PF pathology. MRI proof of diagnosis was obtained within 6 h of helical CT acquisition for 90 consecutive ED patients with, and 88 without, posterior fossa lesions. Helical CT images were post-processed at 1.25 and 5-mm-axial slice thickness. Two neuroradiologists blinded to the clinical/MRI findings reviewed both image sets. Interobserver agreement and accuracy were rated using Kappa statistics and ROC analysis, respectively. Of the 90/178 (51 %) who were MR positive, 60/90 (66 %) had stroke and 30/90 (33 %) had other etiologies. There was excellent interobserver agreement (κ > 0.97) for both thick and thin slice assessments. The accuracy, sensitivity, and specificity for 1.25-mm images were 65, 44, and 84 %, respectively, and for 5-mm images were 67, 45, and 85 %, respectively. The diagnostic accuracy was not significantly different (p > 0.5). In this cohort of patients with nontraumatic neurological symptoms referred to the posterior fossa, 1.25-mm-thin slice CT reformatted images do not have superior accuracy compared to 5-mm-thick images. This information has implications on optimizing resource utilizations and efficiency in a busy emergency room. Review of 1.25-mm-thin images may help diagnostic accuracy only when review of 5-mm-thick images as current default is inconclusive.
Spectral-domain Optical Coherence Tomography in Manifest Glaucoma: Its Additive Role in Structural Diagnosis.

PubMed

Kim, Ko Eun; Oh, Sohee; Jeoung, Jin Wook; Suh, Min Hee; Seo, Je Hyun; Kim, Martha; Park, Ki Ho; Kim, Dong Myung; Kim, Seok Hwan

2016-11-01

To investigate the additive role of spectral-domain optical coherence tomography (SDOCT) in the structural diagnosis in glaucoma. Reliability and validity analysis. Structural examinations from 109 eyes of 109 healthy individuals and 151 eyes of 151 glaucoma patients with different severities were included. Four structural-diagnostic examination sets were prepared using stereo-optic disc photography (SDP), red-free retinal nerve fiber layer photography (RNFLP), and SDOCT: (1) SDP (S), (2) SDP and SDOCT (SO), (3) SDP and RNFLP (SR), and (4) SDP, RNFLP, and SDOCT (SRO). Five glaucoma specialists were instructed to classify subjects as normal or glaucoma using each of the 4 diagnostic sets in the order S, SO, SR, and SRO, with a 1-month interval. The interobserver agreement was evaluated using kappa (κ) statistics. The additive effect of SDOCT on the diagnostic performance of the specialists was evaluated using the generalized estimating equation. Five glaucoma specialists showed an excellent level of interobserver agreement on the diagnostic assessments based on the 4 sets. In the comparison of the collective diagnostic performance of the specialists, addition of SDOCT to SDP showed an approximately 2-fold significant increase in the diagnostic accuracy. Adding SDOCT to SDP significantly enhanced the specialists' structural-diagnostic ability with respect to the moderate glaucoma, though not mild or advanced glaucoma. SDOCT significantly enhanced the diagnostic accuracy of the glaucoma specialists' performance, showing its additive diagnostic value in judging glaucomatous structural damage, especially in the moderate stage of glaucoma. Copyright © 2016 Elsevier Inc. All rights reserved.
Combining diffusion-weighted MRI with Gd-EOB-DTPA-enhanced MRI improves the detection of colorectal liver metastases.

PubMed

Koh, D-M; Collins, D J; Wallace, T; Chau, I; Riddell, A M

2012-07-01

To compare the diagnostic accuracy of gadolinium-ethoxybenzyl-diethylenetriaminepentaacetic acid (Gd-EOB-DTPA)-enhanced MRI, diffusion-weighted MRI (DW-MRI) and a combination of both techniques for the detection of colorectal hepatic metastases. 72 patients with suspected colorectal liver metastases underwent Gd-EOB-DTPA MRI and DW-MRI. Images were retrospectively reviewed with unenhanced T1- and T2-weighted images as Gd-EOB-DTPA image set, DW-MRI image set and combined image set by two independent radiologists. Each lesion detected was scored for size, location and likelihood of metastasis, and compared with surgery and follow-up imaging. Diagnostic accuracy was compared using receiver operating characteristics and interobserver agreement by kappa statistics. 417 lesions (310 metastases, 107 benign) were found in 72 patients. For both readers, diagnostic accuracy using the combined image set was higher [area under the curve (Az)=0.96, 0.97] than Gd-EOB-DTPA image set (Az=0.86, 0.89) or DW-MRI image set (Az=0.93, 0.92). Using combined image set improved identification of liver metastases compared with Gd-EOB-DTPA image set (p<0.001) or DW-MRI image set (p<0.001). There was very good interobserver agreement for lesion classification (κ=0.81-0.88). Combining DW-MRI with Gd-EOB-DTPA-enhanced T1-weighted MRI significantly improved the detection of colorectal liver metastases.
Accuracy of Four Imaging Techniques for Diagnosis of Posterior Pelvic Floor Disorders.

PubMed

van Gruting, Isabelle M A; Stankiewicz, Aleksandra; Kluivers, Kirsten; De Bin, Riccardo; Blake, Helena; Sultan, Abdul H; Thakar, Ranee

2017-11-01

To establish the diagnostic test accuracy of evacuation proctography, magnetic resonance imaging (MRI), transperineal ultrasonography, and endovaginal ultrasonography for detecting posterior pelvic floor disorders (rectocele, enterocele, intussusception, and anismus) in women with obstructed defecation syndrome and secondarily to identify the most patient-friendly imaging technique. In this prospective cohort study, 131 women with symptoms of obstructed defecation syndrome underwent evacuation proctogram, MRI, and transperineal and endovaginal ultrasonography. Images were analyzed by two blinded observers. In the absence of a reference standard, latent class analysis was used to assess diagnostic test accuracy of multiple tests with area under the curve (AUC) as the primary outcome measure. Secondary outcome measures were interobserver agreement calculated as Cohen's κ and patient acceptability using a visual analog scale. No significant differences in diagnostic accuracy were found among the imaging techniques for all the target conditions. Estimates of diagnostic test accuracy were highest for rectocele using MRI (AUC 0.79) or transperineal ultrasonography (AUC 0.85), for enterocele using transperineal (AUC 0.73) or endovaginal ultrasonography (AUC 0.87), for intussusception using evacuation proctography (AUC 0.76) or endovaginal ultrasonography (AUC 0.77), and for anismus using endovaginal (AUC 0.95) or transperineal ultrasonography (AUC 0.78). Interobserver agreement for the diagnosis of rectocele (κ 0.53-0.72), enterocele (κ 0.54-0.94) and anismus (κ 0.43-0.81) was moderate to excellent, but poor to fair for intussusception (κ -0.03 to 0.37) with all techniques. Patient acceptability was better for transperineal and endovaginal ultrasonography as compared with MRI and evacuation proctography (P<.001). Evacuation proctography, MRI, and transperineal and endovaginal ultrasonography were shown to have similar diagnostic test accuracy. 
Evacuation proctography is not the best available imaging technique. There is no one optimal test for the diagnosis of all posterior pelvic floor disorders. Because transperineal and endovaginal ultrasonography have good test accuracy and patient acceptability, we suggest these could be used for initial assessment of obstructed defecation syndrome. ClinicalTrials.gov, NCT02239302.
Diagnostic accuracy study of anorectal manometry for diagnosis of dyssynergic defaecation

PubMed Central

Grossi, Ugo; Carrington, Emma V; Bharucha, Adil E; Horrocks, Emma J; Scott, S Mark; Knowles, Charles H

2015-01-01

Objective The diagnostic accuracy of anorectal manometry (AM), which is necessary to diagnose functional defaecatory disorders (FDD), is unknown. Using blinded analysis and standardised reporting of diagnostic accuracy (STARD), we evaluated whether AM could discriminate between asymptomatic controls and patients with functional constipation (FC). Design Derived line-plots of anorectal pressure profiles during simulated defaecation were independently analysed in random order by 3 expert observers blinded to health status in 85 women with FC and 85 age-matched asymptomatic healthy volunteers (HV). Using accepted criteria, these pressure profiles were characterized as normal (i.e. increased rectal pressure coordinated with anal relaxation) or types I-IV dyssynergia. Inter-observer agreement and diagnostic accuracy were determined. Results Blinded consensus-based assessment disclosed a normal pattern in 16/170 (9%) of all participants and only 11/85 (13%) HV. The combined frequency of dyssynergic patterns (I-IV) was very similar in FC (80/85 [94%]) and HV (74/85 [87%]). Type I dyssynergia (‘paradoxical’ contraction) was less prevalent in FC (17/85 [20%]) than HV (31/85 [36.5%], p=0.03). After statistical correction, only type IV dyssynergia was moderately useful for discriminating between FC (39/85 [46%]) and HV (17/85 [20%], p=0.001, PPV=70.0%, positive LR=2.3). Inter-observer agreement was substantial or moderate for identifying a normal pattern, dyssynergia types I and IV, and FDD, and fair for types II and III. Conclusions While the interpretation of AM patterns is reproducible, nearly 90% of HV have a pattern that is currently regarded as “abnormal” by AM. Hence AM is of limited utility for distinguishing between FC and HV. PMID:25765461
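The positive likelihood ratio reported for type IV dyssynergia can be checked from the counts given in the abstract (39/85 in FC, 17/85 in HV). The sketch below is a plain recomputation from those raw counts, with illustrative variable names; it does not reproduce the "statistical correction" mentioned by the authors, which may explain why the raw PPV differs slightly from the reported 70.0%.

```python
# Counts for type IV dyssynergia as reported in the abstract.
fc_positive, fc_total = 39, 85   # patients with functional constipation
hv_positive, hv_total = 17, 85   # healthy volunteers

# Treating FC as "disease present" and HV as "disease absent".
sensitivity = fc_positive / fc_total            # true-positive rate
false_positive_rate = hv_positive / hv_total    # 1 - specificity

# Positive likelihood ratio = sensitivity / (1 - specificity).
positive_lr = sensitivity / false_positive_rate

# PPV from raw counts; only meaningful for this 1:1 case-control sample.
ppv = fc_positive / (fc_positive + hv_positive)

print(f"positive LR = {positive_lr:.1f}")   # positive LR = 2.3
print(f"raw PPV     = {100 * ppv:.1f}%")    # raw PPV     = 69.6%
```

Because the study enrolled equal numbers of patients and controls, the PPV here reflects the sample design rather than any clinical prevalence; the likelihood ratio does not depend on that ratio.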
Evaluation of interobserver variability and diagnostic performance of developed MRI-based radiological scoring system for invasive placenta previa.

PubMed

Ueno, Yoshiko; Maeda, Tetsuo; Tanaka, Utaru; Tanimura, Kenji; Kitajima, Kazuhiro; Suenaga, Yuko; Takahashi, Satoru; Yamada, Hideto; Sugimura, Kazuro

2016-09-01

To evaluate the interobserver variability and diagnostic performance of a developed magnetic resonance imaging (MRI)-based scoring system for invasive placenta previa. Prenatal MR images of 70 women were retrospectively evaluated, 18 of whom were diagnosed with invasive placenta. The six MR features (dark band on T2 -weighted images, intraplacental abnormal vascularity, placental bulge, heterogeneous placenta, myometrial thinning, and placental protrusion sign) were scored on 5-point Likert scale separately, and the cumulative radiological score (CRS) was defined as the sum of each score. Two more experienced radiologists (readers A and B) and two less experienced residents (readers C and D) calculated the CRS. Interobserver variability was assessed by measuring the intraclass correlation coefficient. Diagnostic performance was evaluated by means of receiver operating characteristic (ROC) analysis. Interobserver variability for CRS was excellent for the more experienced radiologists (0.85), and good for all readers (0.72) and the less experienced residents (0.66). The area under the ROC curve (Az) and accuracy (Acc) for CRS were significantly higher or equivalent to those of other MR features for all readers (Az and Acc for reader A; CRS, 0.92, 91.4%; intraplacental T2 dark band, 0.83, P = 0.009, 81.4%, P = 0.03; intraplacental abnormal vascularity, 0.9, P = 0.3, 90.0%, P = 1.00; placental bulge, 0.81, P = 0.0008, 80.0%, P = 0.02; heterogeneous placenta, 0.85, P = 0.11, 74.3%, P = 0.002; myometrial thinning, 0.84, P = 0.06, 60.0%, P < 0.0001; placental protrusion sign, 0.81, P = 0.01, 81.4%, P = 0.26). This developed MRI-based scoring system demonstrated excellent or good interobserver variability, and good diagnostic performance for invasive placenta previa. J. Magn. Reson. Imaging 2016;44:573-583. © 2016 International Society for Magnetic Resonance in Medicine.
Giant cell arteritis: diagnostic accuracy of MR imaging of superficial cranial arteries in initial diagnosis-results from a multicenter trial.

PubMed

Klink, Thorsten; Geiger, Julia; Both, Marcus; Ness, Thomas; Heinzelmann, Sonja; Reinhard, Matthias; Holl-Ulrich, Konstanze; Duwendag, Dirk; Vaith, Peter; Bley, Thorsten Alexander

2014-12-01

To assess the diagnostic accuracy of contrast material-enhanced magnetic resonance (MR) imaging of superficial cranial arteries in the initial diagnosis of giant cell arteritis (GCA). Following institutional review board approval and informed consent, 185 patients suspected of having GCA were included in a prospective three-university medical center trial. GCA was diagnosed or excluded clinically in all patients (reference standard [final clinical diagnosis]). In 53.0% of patients (98 of 185), temporal artery biopsy (TAB) was performed (diagnostic standard [TAB]). Two observers independently evaluated contrast-enhanced T1-weighted MR images of superficial cranial arteries by using a four-point scale. Diagnostic accuracy, involvement pattern, and systemic corticosteroid (sCS) therapy effects were assessed in comparison with the reference standard (total study cohort) and separately in comparison with the diagnostic standard TAB (TAB subcohort). Statistical analysis included diagnostic accuracy parameters, interobserver agreement, and receiver operating characteristic analysis. Sensitivity of MR imaging was 78.4% and specificity was 90.4% for the total study cohort, and sensitivity was 88.7% and specificity was 75.0% for the TAB subcohort (first observer). Diagnostic accuracy was comparable for both observers, with good interobserver agreement (TAB subcohort, κ = 0.718; total study cohort, κ = 0.676). MR imaging scores were significantly higher in patients with GCA-positive results than in patients with GCA-negative results (TAB subcohort and total study cohort, P < .001). 
Diagnostic accuracy of MR imaging was high in patients without and with sCS therapy for 5 days or fewer (area under the curve, ≥0.9) and was decreased in patients receiving sCS therapy for 6-14 days. In 56.5% of patients with TAB-positive results (35 of 62), MR imaging displayed symmetrical and simultaneous inflammation of arterial segments. MR imaging of superficial cranial arteries is accurate in the initial diagnosis of GCA. Sensitivity probably decreases after more than 5 days of sCS therapy; thus, imaging should not be delayed. Clinical trial registration no. DRKS00000594. © RSNA, 2014.
Reliability of mercury-in-silastic strain gauge plethysmography curve reading: influence of clinical clues and observer variation.

PubMed

Høyer, Christian; Pavar, Susanne; Pedersen, Begitte H; Biurrun Manresa, José A; Petersen, Lars J

2013-08-01

Mercury-in-silastic strain gauge plethysmography (SGP) is a well-established technique for blood flow and blood pressure measurements. The aim of this study was to examine (i) the possible influence of clinical clues, e.g. the presence of wounds and color changes during blood pressure measurements, and (ii) intra- and inter-observer variation of curve interpretation for segmental blood pressure measurements. A total of 204 patients with known or suspected peripheral arterial disease (PAD) were included in a diagnostic accuracy trial. Toe and ankle pressures were measured in both limbs, and primary observers analyzed a total of 804 pressure curve sets. The SGP curves were later reanalyzed separately by two observers blinded to clinical clues. Intra- and inter-observer agreement was quantified using Cohen's kappa and reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. There was an overall agreement regarding patient diagnostic classification (PAD/not PAD) in 202/204 (99.0%) for intra-observer (κ = 0.969, p < 0.001), and 201/204 (98.5%) for inter-observer readings (κ = 0.953, p < 0.001). Reliability analysis showed excellent correlation between blinded versus non-blinded and inter-observer readings for determination of absolute segmental pressures (all intraclass correlation coefficients ≥ 0.984). The coefficient of variance for determination of absolute segmental blood pressure ranged from 2.9-3.4% for blinded/non-blinded data and from 3.8-5.0% for inter-observer data. This study shows a low inter-observer variation among experienced laboratory technicians for reading strain gauge curves. The low variation between blinded/non-blinded readings indicates that SGP measurements are minimally biased by clinical clues.
Diagnostic value of MRI signs in differentiating Ewing sarcoma from osteomyelitis.

PubMed

Kasalak, Ömer; Overbosch, Jelle; Adams, Hugo Ja; Dammann, Amelie; Dierckx, Rudi Ajo; Jutte, Paul C; Kwee, Thomas C

2018-01-01

Background The value of magnetic resonance imaging (MRI) signs in differentiating Ewing sarcoma from osteomyelitis has not been thoroughly investigated. Purpose To investigate the value of various MRI signs in differentiating Ewing sarcoma from osteomyelitis. Material and Methods Forty-one patients who underwent MRI because of a bone lesion of unknown nature with a differential diagnosis that included both Ewing sarcoma and osteomyelitis were included. Two observers assessed several MRI signs, including the transition zone of the bone lesion, the presence of a soft-tissue mass, intramedullary and extramedullary fat globules, and the penumbra sign. Results Diagnostic accuracies for discriminating Ewing sarcoma from osteomyelitis were 82.4% and 79.4% for the presence of a soft-tissue mass, and 64.7% and 58.8% for a sharp transition zone of the bone lesion, for readers 1 and 2 respectively. Inter-observer agreement with regard to the presence of a soft-tissue mass and the transition zone of the bone lesion was moderate (κ = 0.470) and fair (κ = 0.307), respectively. Areas under the receiver operating characteristic curve of the diameter of the soft-tissue mass (if present) were 0.829 and 0.833, for readers 1 and 2 respectively. Mean inter-observer difference in soft-tissue mass diameter measurement ± limits of agreement was 35.0 ± 75.0 mm. Diagnostic accuracies of all other MRI signs were < 50%. Conclusion Presence and size of a soft-tissue mass, and sharpness of the transition zone, are useful MRI signs to differentiate Ewing sarcoma from osteomyelitis, but inter-observer agreement is relatively low. Other MRI signs are of no value in this setting.
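The "mean inter-observer difference ± limits of agreement" quoted above (35.0 ± 75.0 mm) is the kind of summary a Bland-Altman analysis produces. The abstract does not say how the limits were computed; the sketch below assumes the conventional definition (mean difference ± 1.96 standard deviations of the paired differences) and uses hypothetical diameter readings, not data from the study.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman_limits(
    reader_a: Sequence[float], reader_b: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumes the conventional 95% limits of agreement:
    bias +/- 1.96 * sample standard deviation of the paired differences.
    """
    differences = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias, bias - half_width, bias + half_width


# Hypothetical soft-tissue mass diameters (mm) measured by two readers.
reader_1 = [40.0, 55.0, 62.0, 30.0, 75.0]
reader_2 = [38.0, 50.0, 65.0, 28.0, 70.0]

bias, lower, upper = bland_altman_limits(reader_1, reader_2)
print(f"bias = {bias:.1f} mm, limits of agreement = {lower:.1f} to {upper:.1f} mm")
```

A wide interval relative to the measured diameters, as in the reported 35.0 ± 75.0 mm, indicates that two readers can disagree substantially on an individual lesion even when their average readings are similar.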
Diagnostic performance of focused cardiac ultrasound performed by emergency physicians for the assessment of ascending aorta dilation and aneurysm.

PubMed

Nazerian, Peiman; Vanni, Simone; Morello, Fulvio; Castelli, Matteo; Ottaviani, Maddalena; Casula, Claudia; Petrioli, Alessandra; Bartolucci, Maurizio; Grifoni, Stefano

2015-05-01

The diagnostic performance of transthoracic focused cardiac ultrasound (FoCUS) performed by emergency physicians (EP) to estimate ascending aorta dimensions in the acute setting has not been prospectively studied. The diagnostic accuracy and the interobserver variability of EP-performed FoCUS were investigated to estimate thoracic aortic dilation and aneurysm compared with the results of computed tomography angiography (CTA). This was a prospective single-center cohort study of a convenience sample of patients who underwent CTA in the emergency department for suspected aortic pathology. FoCUS was performed before CTA, and the maximum ascending aorta diameter was evaluated in parasternal long-axis view. Aorta diameter < 40 mm by visual estimation or by diameter measurement was considered normal. Measurements were recorded in all patients with aorta diameter ≥ 40 mm. Diagnostic accuracy of FoCUS for detection of aortic dilation (diameter ≥ 40 mm) and aneurysm (diameter ≥ 45 mm) was calculated considering the CTA result as reference standard. In a subgroup of patients, a second EP-sonographer performed FoCUS to evaluate interobserver agreement for the diagnosis of ascending aorta dilation. A total of 140 patients were enrolled in the study. Ascending aorta dilation and aneurysm were detected with FoCUS in 50 (35.7%) and in 27 (17.8%) patients, respectively. Sensitivity and specificity of FoCUS were 78.6% (95% confidence interval [CI] = 65.6% to 88.4%) and 92.9% (95% CI = 85.1% to 97.3%), respectively, for ascending aorta dilation and 64.7% (95% CI = 46.5% to 80.2%) and 95.3% (95% CI = 89.3% to 98.4%), respectively, for ascending aorta aneurysm. Interobserver agreement of FoCUS was κ = 0.82. FoCUS performed by EP is specific for ascending aorta dilation and aneurysm when compared to CTA and appears to be a reproducible technique. © 2015 by the Society for Academic Emergency Medicine.
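The abstract reports only percentages, but with 140 patients and 50 (dilation) and 27 (aneurysm) FoCUS-positive results, one set of 2×2 counts is consistent with all of the reported figures. The counts below are an inference made for illustration and are not stated in the source; the sketch simply shows how sensitivity and specificity follow from such a table.

```python
from typing import Dict, Tuple


def sensitivity_specificity(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Inferred (not reported) counts, with CTA as the reference standard.
# Each table sums to the 140 enrolled patients.
inferred_tables: Dict[str, Dict[str, int]] = {
    "dilation (>= 40 mm)": {"tp": 44, "fn": 12, "fp": 6, "tn": 78},
    "aneurysm (>= 45 mm)": {"tp": 22, "fn": 12, "fp": 5, "tn": 101},
}

for condition, cells in inferred_tables.items():
    sens, spec = sensitivity_specificity(**cells)
    focus_positive = cells["tp"] + cells["fp"]  # patients flagged by FoCUS
    total = sum(cells.values())
    print(
        f"{condition}: sensitivity {100 * sens:.1f}%, "
        f"specificity {100 * spec:.1f}%, "
        f"FoCUS-positive {focus_positive} of {total}"
    )
```

Under these assumed counts the output reproduces the reported 78.6%/92.9% and 64.7%/95.3%, along with the 50 and 27 FoCUS-positive patients, which suggests the published percentages are internally consistent.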
Factors Determining the Inter-observer Variability and Diagnostic Accuracy of High-resolution Manometry for Esophageal Motility Disorders.

PubMed

Kim, Ji Hyun; Kim, Sung Eun; Cho, Yu Kyung; Lim, Chul-Hyun; Park, Moo In; Hwang, Jin Won; Jang, Jae-Sik; Oh, Minkyung

2018-01-30

Although high-resolution manometry (HRM) has the advantage of visual intuitiveness, its diagnostic validity remains under debate. The aim of this study was to evaluate the diagnostic accuracy of HRM for esophageal motility disorders. Six staff members and 8 trainees were recruited for the study. In total, 40 patients enrolled in manometry studies at 3 institutes were selected. Captured images of 10 representative swallows and a single swallow in analyzing mode in both high-resolution pressure topography (HRPT) and conventional line tracing formats were provided with calculated metrics. Assessments of esophageal motility disorders showed fair agreement for HRPT and moderate agreement for conventional line tracing (κ = 0.40 and 0.58, respectively). With the HRPT format, the κ value was higher in category A (esophagogastric junction [EGJ] relaxation abnormality) than in categories B (major body peristalsis abnormalities with intact EGJ relaxation) and C (minor body peristalsis abnormalities or normal body peristalsis with intact EGJ relaxation). The overall exact diagnostic accuracy for the HRPT format was 58.8%, and the rater's position was an independent factor for exact diagnostic accuracy. The diagnostic accuracy for major disorders was 63.4% with the HRPT format. The frequency of major discrepancies was higher for category B disorders than for category A disorders (38.4% vs 15.4%; P < 0.001). The interpreter's experience significantly affected the exact diagnostic accuracy of HRM for esophageal motility disorders. The diagnostic accuracy for major disorders was higher for achalasia than for distal esophageal spasm and jackhammer esophagus.
Diagnostic Performance of MR Elastography and Vibration-controlled Transient Elastography in the Detection of Hepatic Fibrosis in Patients with Severe to Morbid Obesity

PubMed Central

Chen, Jun; Yin, Meng; Talwalkar, Jayant A.; Oudry, Jennifer; Glaser, Kevin J.; Smyrk, Thomas C.; Miette, Véronique; Sandrin, Laurent

2017-01-01

Purpose To evaluate the diagnostic performance and examination success rate of magnetic resonance (MR) elastography and vibration-controlled transient elastography (VCTE) in the detection of hepatic fibrosis in patients with severe to morbid obesity. Materials and Methods This prospective and HIPAA-compliant study was approved by the institutional review board. A total of 111 patients (71 women, 40 men) participated. Written informed consent was obtained from all patients. Patients underwent MR elastography with two readers and VCTE with three observers to acquire liver stiffness measurements for liver fibrosis assessment. The results were compared with those from liver biopsy. Each pathology specimen was evaluated by two hepatopathologists according to the METAVIR scoring system or Brunt classification when appropriate. All imaging observers were blinded to the biopsy results, and all hepatopathologists were blinded to the imaging results. Examination success rate, interobserver agreement, and diagnostic accuracy for fibrosis detection were assessed. Results In this obese patient population (mean body mass index = 40.3 kg/m2; 95% confidence interval [CI]: 38.7 kg/m2, 41.8 kg/m2]), the examination success rate was 95.8% (92 of 96 patients) for MR elastography and 81.3% (78 of 96 patients) or 88.5% (85 of 96 patients) for VCTE. Interobserver agreement was higher with MR elastography than with biopsy (intraclass correlation coefficient, 0.95 vs 0.89). In patients with successful MR elastography and VCTE examinations (excluding unreliable VCTE examinations), both MR elastography and VCTE had excellent diagnostic accuracy in the detection of clinically significant hepatic fibrosis (stage F2–F4) (mean area under the curve: 0.93 [95% CI: 0.85, 0.97] vs 0.91 [95% CI: 0.83, 0.96]; P = .551). 
Conclusion In this obese patient population, both MR elastography and VCTE had excellent diagnostic performance for assessing hepatic fibrosis; MR elastography was more technically reliable than VCTE and had a higher interobserver agreement than liver biopsy. © RSNA, 2016 Online supplemental material is available for this article. An earlier incorrect version of this article appeared online. This article was corrected on January 25, 2017. PMID:27861111
The relationship between transorbital ultrasound measurement of the optic nerve sheath diameter (ONSD) and invasively measured ICP in children : Part I: repeatability, observer variability and general analysis.

PubMed

Padayachy, Llewellyn C; Padayachy, Vaishali; Galal, Ushma; Gray, Rebecca; Fieggen, A Graham

2016-10-01

The aim of this study was to investigate the relationship between optic nerve sheath diameter (ONSD) measurement and invasively measured intracranial pressure (ICP) in children. ONSD measurement was performed prior to invasive measurement of ICP. The mean binocular ONSD measurement was compared to the ICP reading. Physiological variables including systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MAP), pulse rate, temperature, respiratory rate and end tidal carbon dioxide (ETCO2) level were recorded at the time of ONSD measurement. Diagnostic accuracy analysis was performed at various ICP thresholds, and repeatability, intra- and inter-observer variability, correlation between measurements in different imaging planes, as well as the relationship over the entire patient cohort were examined in part I of this study. Data from 174 patients were analysed. Repeatability and intra-observer variability were excellent (α = 0.97-0.99). Testing for inter-observer variability revealed good correlation (r = 0.89, p < 0.001). Imaging in the sagittal plane demonstrated a slightly better correlation with ICP (r = 0.66, p < 0.001). The ONSD measurement with the best diagnostic accuracy for detecting an ICP ≥ 20 mmHg over the entire patient cohort was 5.5 mm (sensitivity 93.2%, specificity 74%, odds ratio [OR] 39.3). Transorbital ultrasound measurement of the ONSD is a reliable and reproducible technique, demonstrating a good relationship with ICP and high diagnostic accuracy for detecting raised ICP.
Is forceps more useful than visualization for measurement of colon polyp size?

PubMed Central

Kim, Jae Hyun; Park, Seun Ja; Lee, Jong Hoon; Kim, Tae Oh; Kim, Hyun Jin; Kim, Hyung Wook; Lee, Sang Heon; Baek, Dong Hoon; Busan Ulsan Gyeongnam Intestinal Study Group Society (BIGS)

2016-01-01

AIM: To identify whether the forceps estimation is more useful than visual estimation in the measurement of colon polyp size. METHODS: We recorded colonoscopy video clips that included scenes visualizing the polyp and scenes using open biopsy forceps in association with the polyp, which were used for an exam. A total of 40 endoscopists from the Busan Ulsan Gyeongnam Intestinal Study Group Society (BIGS) participated in this study. Participants watched 40 pairs of video clips of the scenes for visual estimation and forceps estimation, and wrote down the estimated polyp size on the exam paper. When analyzing the results of the exam, we assessed inter-observer differences, diagnostic accuracy, and error range in the measurement of the polyp size. RESULTS: The overall intra-class correlation coefficients (ICC) of inter-observer agreement for forceps estimation and visual estimation were 0.804 (95%CI: 0.731-0.873, P < 0.001) and 0.743 (95%CI: 0.656-0.828, P < 0.001), respectively. The ICCs of each group for forceps estimation were higher than those for visual estimation (Beginner group, 0.761 vs 0.693; Expert group, 0.887 vs 0.840, respectively). The overall diagnostic accuracy for visual estimation was 0.639 and for forceps estimation was 0.754 (P < 0.001). In the beginner group and the expert group, the diagnostic accuracy for the forceps estimation was significantly higher than that of the visual estimation (Beginner group, 0.734 vs 0.613, P < 0.001; Expert group, 0.784 vs 0.680, P < 0.001, respectively). The overall error range for visual estimation and forceps estimation were 1.48 ± 1.18 and 1.20 ± 1.10, respectively (P < 0.001). The error ranges of each group for forceps estimation were significantly smaller than those for visual estimation (Beginner group, 1.38 ± 1.08 vs 1.68 ± 1.30, P < 0.001; Expert group, 1.12 ± 1.11 vs 1.42 ± 1.11, P < 0.001, respectively). 
CONCLUSION: Application of the open biopsy forceps method when measuring colon polyp size could help reduce inter-observer differences and error rates. PMID:27003999
Proximal pulmonary vein stenosis detection in pediatric patients: value of multiplanar and 3-D VR imaging evaluation.

PubMed

Lee, Edward Y; Jenkins, Kathy J; Muneeb, Muhammad; Marshall, Audrey C; Tracy, Donald A; Zurakowski, David; Boiselle, Phillip M

2013-08-01

One of the important benefits of using multidetector computed tomography (MDCT) is its capability to generate high-quality two-dimensional (2-D) multiplanar (MPR) and three-dimensional (3-D) images from volumetric and isotropic axial CT data. However, to the best of our knowledge, no results have been published on the potential diagnostic role of multiplanar and 3-D volume-rendered (VR) images in detecting pulmonary vein stenosis, a condition in which MDCT has recently assumed a role as the initial noninvasive imaging modality of choice. The purpose of this study was to compare diagnostic accuracy and interpretation time of axial, multiplanar and 3-D VR images for detection of proximal pulmonary vein stenosis in children, and to assess the potential added diagnostic value of multiplanar and 3-D VR images. We used our hospital information system to identify all consecutive children (< 18 years of age) with proximal pulmonary vein stenosis who had both a thoracic MDCT angiography study and a catheter-based conventional angiography within 2 months from June 2005 to February 2012. Two experienced pediatric radiologists independently reviewed each MDCT study for the presence of proximal pulmonary vein stenosis defined as ≥ 50% of luminal narrowing on axial, multiplanar and 3-D VR images. Final diagnosis was confirmed by angiographic findings. Diagnostic accuracy was compared using the z-test. Confidence level of diagnosis (scale 1-5, 5 = highest), perceived added diagnostic value (scale 1-5, 5 = highest), and interpretation time of multiplanar or 3-D VR images were compared using paired t-tests. Interobserver agreement was measured using the chance-corrected kappa coefficient. The final study population consisted of 28 children (15 boys and 13 girls; mean age: 5.2 months). 
Diagnostic accuracy based on 116 individual pulmonary veins for detection of proximal pulmonary vein stenosis was 72.4% (84 of 116) for axial MDCT images, 77.5% (90 of 116 cases) for multiplanar MDCT images, and 93% (108 of 116 cases) for 3-D VR images with significantly higher accuracy with 3-D VR compared to axial (z = 4.17, P < 0.001) and multiplanar (z = 3.34, P < 0.001) images. Confidence levels for detection of proximal pulmonary vein stenosis were significantly higher with 3-D VR images (mean level: 4.6) compared to axial MDCT images (mean level: 1.7) and multiplanar MDCT images (mean level: 2.0) (paired t-tests, P < 0.001). Thus, 3-D VR images (mean added diagnostic value: 4.7) were found to provide added diagnostic value for detecting proximal pulmonary vein stenosis (paired t-test, P < 0.001); however, multiplanar MDCT images did not provide added value (paired t-test, P = 0.89). Interpretation time was significantly longer and interobserver agreement was higher when using 3-D VR images than using axial MDCT images or MPR MDCT images for diagnosing proximal pulmonary vein stenosis (paired t-tests, P < 0.001). Use of 3-D VR images in the diagnosis of proximal pulmonary vein stenosis in children significantly increases accuracy, confidence level, added diagnostic value and interobserver agreement. Thus, the routine use of this technique should be encouraged despite its increased interpretation time.
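The z values reported above can be checked with a short sketch. The abstract does not say which form of z-test was used, so the pooled, unpaired two-proportion test below is an assumption; it reproduces the reported values to two decimals. The function name `two_proportion_z` is hypothetical. Because the same 116 veins were read in all three formats, a paired test such as McNemar's would also be a reasonable choice, but the abstract does not give the paired counts that test requires.

```python
import math


def two_proportion_z(successes_a, successes_b, n):
    """Pooled two-proportion z statistic for two groups of equal size n.

    Assumption: an unpaired, pooled test; the abstract does not specify this.
    """
    p_a = successes_a / n
    p_b = successes_b / n
    # Pooled proportion under the null hypothesis of equal accuracy.
    pooled = (successes_a + successes_b) / (2 * n)
    standard_error = math.sqrt(pooled * (1 - pooled) * (2 / n))
    return (p_a - p_b) / standard_error


# Correct diagnoses out of 116 veins, as reported in the abstract.
z_vr_vs_axial = two_proportion_z(108, 84, 116)        # 3-D VR vs axial
z_vr_vs_multiplanar = two_proportion_z(108, 90, 116)  # 3-D VR vs multiplanar

print(round(z_vr_vs_axial, 2), round(z_vr_vs_multiplanar, 2))  # 4.17 3.34
```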
Accuracy of endoscopic diagnosis of Helicobacter pylori infection according to level of endoscopic experience and the effect of training

PubMed Central

2013-01-01

Background Accurate prediction of Helicobacter pylori infection status on endoscopic images can contribute to early detection of gastric cancer, especially in Asia. We identified the diagnostic yield of endoscopy for H. pylori infection at various endoscopist career levels and the effect of two years of training on diagnostic yield. Methods A total of 77 consecutive patients who underwent endoscopy were analyzed. H. pylori infection status was determined by histology, serology, and the urea breath test and categorized as H. pylori-uninfected, -infected, or -eradicated. Distinctive endoscopic findings were judged by six physicians at different career levels: beginner (<500 endoscopies), intermediate (1500–5000), and advanced (>5000). Diagnostic yield and inter- and intra-observer agreement on H. pylori infection status were evaluated. Values were compared between the two beginners after two years of training. The kappa (K) statistic was used to calculate agreement. Results For all physicians, the diagnostic yield was 88.9% for H. pylori-uninfected, 62.1% for H. pylori-infected, and 55.8% for H. pylori-eradicated. Intra-observer agreement for H. pylori infection status was good (K > 0.6) for all physicians, while inter-observer agreement was lower (K = 0.46) for beginners than for intermediate and advanced (K > 0.6). For all physicians, good inter-observer agreement in endoscopic findings was seen for atrophic change (K = 0.69), regular arrangement of collecting venules (K = 0.63), and hemorrhage (K = 0.62). For beginners, the diagnostic yield of H. pylori-infected/eradicated status and inter-observer agreement of endoscopic findings were improved after two years of training. Conclusions The diagnostic yield of endoscopic diagnosis was high for H. pylori-uninfected cases, but was low for H. pylori-eradicated cases. In beginners, daily training on endoscopic findings improved the low diagnostic yield. PMID:23947684
The learning curve, interobserver, and intraobserver agreement of endoscopic confocal laser endomicroscopy in the assessment of mucosal barrier defects.

PubMed

Chang, Jeff; Ip, Matthew; Yang, Michael; Wong, Brendon; Power, Theresa; Lin, Lisa; Xuan, Wei; Phan, Tri Giang; Leong, Rupert W

2016-04-01

Confocal laser endomicroscopy can dynamically assess intestinal mucosal barrier defects and increased intestinal permeability (IP). These are functional features that do not have a corresponding appearance on histopathology. As such, previous pathology training may not be beneficial in learning these dynamic features. This study aims to evaluate the diagnostic accuracy, learning curve, and inter- and intraobserver agreement for identifying features of increased IP in experienced and inexperienced analysts and pathologists. A total of 180 endoscopic confocal laser endomicroscopy (Pentax EC-3870FK; Pentax, Tokyo, Japan) images of the terminal ileum, subdivided into 6 sets of 30, were evaluated by 6 experienced analysts, 13 inexperienced analysts, and 2 pathologists, after a 30-minute teaching session. Cell-junction enhancement, fluorescein leak, and cell dropout were used to represent increased IP and were either present or absent in each image. For each image, the diagnostic accuracy, confidence, and quality were assessed. Diagnostic accuracy was significantly higher for experienced analysts compared with inexperienced analysts from the first set (96.7% vs 83.1%, P < .001) to the third set (95% vs 89.7%, P = .127). No differences in accuracy were noted between inexperienced analysts and pathologists. Confidence (odds ratio, 8.71; 95% confidence interval, 5.58-13.57) and good image quality (odds ratio, 1.58; 95% confidence interval, 1.22-2.03) were associated with improved interpretation. Interobserver agreement κ values were high and improved with experience (experienced analysts, 0.83; inexperienced analysts, 0.73; and pathologists, 0.62). Intraobserver agreement was >0.86 for experienced observers. Features representative of increased IP can be rapidly learned with high inter- and intraobserver agreement. Confidence and image quality were significant predictors of accurate interpretation. Previous pathology training did not have an effect on learning.
Copyright © 2016 American Society for Gastrointestinal Endoscopy. Published by Elsevier Inc. All rights reserved.
Diagnostic accuracy of cone-beam computed tomography in detecting secondary caries under composite fillings: an in vitro study.

PubMed

Yildizer Keris, Elif; Demirel, Oguzhan; Ozdede, Melih; Altunkaynak, Bulent; Peker, Ilkay

2017-01-01

The aim of this in vitro study was to assess the diagnostic performance of cone-beam computed tomography (CBCT) in the detection of secondary carious lesions under composite resin fillings applied to different types of cavities. Occlusal cavities (O) (n=18), occlusal cavities with a mesial or distal component (MO/DO) (n=30), and mesial-occlusal-distal cavities (MOD) (n=30) were prepared in seventy-eight extracted human posterior teeth. In half of the cavities in each group, artificial secondary caries lesions were simulated. All cavities were restored using composite resin. All specimens were embedded in silicone and positioned to have approximal contacts. CBCT imaging was performed, and the data were evaluated twice, with a two-week interval, by two observers using a five-point confidence scale. Intra- and inter-observer agreements were calculated with kappa statistics (κ). The area under (Az) the receiver operating characteristic (ROC) curve was used to evaluate the diagnostic accuracy. Intra-observer (κ = 0.89) and inter-observer (κ = 0.79) agreements were found to be excellent. Az values were highest for the O restorations, followed by the MOD and DO/MO restorations. Az values for MOD and DO/MO restorations were very low, and no statistically significant difference was found between them. Sensitivity was lowest for DO/MO restorations and specificity was lowest for MOD restorations. Diagnostic performance of CBCT for secondary caries detection was higher in O composite restorations than in MOD and DO/MO restorations. Imaging methods other than CBCT may be useful for evaluating secondary caries under composite MOD and DO/MO restorations.
A new SPECT/CT reconstruction algorithm: reliability and accuracy in clinical routine for non-oncologic bone diseases.

PubMed

Delcroix, Olivier; Robin, Philippe; Gouillou, Maelenn; Le Duc-Pennec, Alexandra; Alavi, Zarrin; Le Roux, Pierre-Yves; Abgral, Ronan; Salaun, Pierre-Yves; Bourhis, David; Querellou, Solène

2018-02-12

xSPECT Bone® (xB) is a new reconstruction algorithm developed by Siemens® for bone hybrid imaging (SPECT/CT). A CT-based tissue segmentation is incorporated into SPECT reconstruction to provide SPECT images with bone anatomy appearance. The objectives of this study were to assess xB/CT reconstruction diagnostic reliability and accuracy in comparison with Flash 3D® (F3D)/CT in clinical routine. Two hundred thirteen consecutive patients referred to the Brest Nuclear Medicine Department for non-oncological bone diseases were evaluated retrospectively. Two hundred seven SPECT/CT were included. All SPECT/CT were independently interpreted by two nuclear medicine physicians (a junior and a senior expert) with xB/CT then with F3D/CT three months later. Inter-observer agreement (IOA) and diagnostic confidence were determined using the McNemar test and the unweighted kappa coefficient. The study objectives were then re-assessed for validation through > 18 months of clinical and paraclinical follow-up. No statistically significant differences between IOA xB and IOA F3D were found (p = 0.532). Agreement for xB after categorical classification of the diagnoses was high (κ xB = 0.89 [95% CI 0.84-0.93]) and did not differ significantly from that for F3D (κ F3D = 0.90 [95% CI 0.86-0.94]). Thirty-one (14.9%) inter-reconstruction diagnostic discrepancies were observed, of which 21 (10.1%) were classified as major. The follow-up confirmed the diagnosis of F3D in 10 cases and of xB in 6 cases, and was non-contributory in 5 cases. The xB reconstruction algorithm was found to be reliable, providing high interobserver agreement and similar diagnostic confidence to F3D reconstruction in clinical routine.

Arrhythmia discrimination by physician and defibrillator: importance of atrial channel.

PubMed

Diemberger, Igor; Martignani, Cristian; Biffi, Mauro; Frabetti, Lorenzo; Valzania, Cinzia; Cooke, Robin M T; Rapezzi, Claudio; Branzi, Angelo; Boriani, Giuseppe

2012-01-26

Many ICD carriers experience inappropriate shocks, but the relative merits of dual- /single-chamber devices for arrhythmia discrimination still remain unclear. We explored possible advantages of the atrial data provided by dual-chamber implantable defibrillators (ICD) for discrimination of real-life supraventricular/ventricular tachyarrhythmias (SVT/VT). 100 dual-chamber traces from 24 ICD were blindly reviewed in dual-chamber and simulated single-chamber (with/without discriminator data) reading modes by five electrophysiologists who determined chamber of origin and provided Likert-scale "confidence" ratings. We assessed 1) intra/interobserver concordance; 2) diagnostic accuracy, using expert diagnoses as a reference standard; 3) ROC curves of sensitivity/specificity of "likelihood perception" scores, generated by combining chamber-of-origin diagnostic judgments with Likert-scale "confidence" ratings. We also assessed diagnostic accuracy of automated discrimination by all possible dual-/single-chamber algorithm configurations. Interobserver concordance was "substantial" (modified Cohen kappa-test values for dual-/single-chamber, 0.79/0.68); intraobserver concordance "almost complete" (kappa ≥ 0.89). Dual-chamber mode provided best diagnostic sensitivity/specificity (99%/92%) and highest reader confidence (p<0.001). Area under ROC curves of sensitivity/specificity values for the "likelihood perception" score (representing electrophysiologists' perceptions of the likelihood that an episode was of ventricular origin) was highest in dual-chamber mode (0.98 vs. 0.93 for both single-chamber modes; p<0.001). Regarding automated discrimination, all four dual-chamber configurations conferred 100% sensitivity (specificity values ranged 39%-88%), whereas single-chamber configurations appeared inferior (best sensitivity/specificity combination, 89%/64%). 
Availability of the atrial channel helps in reducing inappropriate ICD therapies by providing relevant advantages in terms of both appropriate cardiologist's post-hoc discrimination of SVT/VT (improving program tailoring) and automated arrhythmia discrimination. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Joint line tenderness and McMurray tests for the detection of meniscal lesions: what is their real diagnostic value?

PubMed

Galli, Marco; Ciriello, Vincenzo; Menghi, Amerigo; Aulisa, Angelo G; Rabini, Alessia; Marzetti, Emanuele

2013-06-01

To assess the interobserver concordance of the joint line tenderness (JLT) and McMurray tests, and to determine their diagnostic efficiency for the detection of meniscal lesions. Prospective observational study. Orthopedics outpatient clinic, university hospital. Patients (N=60) with suspected nonacute meniscal lesions who underwent knee arthroscopy. Not applicable. Patients were examined by 3 independent observers with graded levels of experience (>10y, 3y, and 4mo of practice). The interobserver concordance was assessed by Cohen-Fleiss κ statistics. Accuracy, negative and positive predictive values for prevalence 10% to 90%, positive (LR+) and negative (LR-) likelihood ratios, and the Bayesian posttest probability with a positive or negative result were also determined. The diagnostic value of the 2 tests combined was assessed by logistic regression. Arthroscopy was used as the reference test. No interobserver concordance was determined for the JLT. The McMurray test showed higher interobserver concordance, which improved when judgments by the less experienced examiner were discarded. The whole series studied by the "best" examiner (experienced orthopedist) provided the following values: (1) JLT: sensitivity, 62.9%; specificity, 50%; LR+, 1.26; LR-, .74; (2) McMurray: sensitivity, 34.3%; specificity, 86.4%; LR+, 2.52; LR-, .76. The combination of the 2 tests did not offer advantages over the McMurray alone. The JLT alone is of little clinical usefulness. A negative McMurray test does not modify the pretest probability of a meniscal lesion, while a positive result has a fair predictive value. Hence, in a patient with a suspected meniscal lesion, a positive McMurray test indicates that arthroscopy should be performed. In case of a negative result, further examinations, including imaging, are needed. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
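The likelihood ratios quoted above follow directly from the reported sensitivity and specificity, and the Bayesian posttest probability follows from a pretest probability and a likelihood ratio. The sketch below shows both calculations. The function names are hypothetical, and the 50% pretest probability is an illustrative assumption, not a figure from the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Values reported for the experienced examiner.
jlt_pos, jlt_neg = likelihood_ratios(0.629, 0.50)
mcm_pos, mcm_neg = likelihood_ratios(0.343, 0.864)
print(round(jlt_pos, 2), round(jlt_neg, 2))  # 1.26 0.74
print(round(mcm_pos, 2), round(mcm_neg, 2))  # 2.52 0.76

# Illustrative only: an assumed pretest probability of 50%.
assumed_pretest = 0.50
print(posttest_probability(assumed_pretest, mcm_pos))  # after a positive McMurray
print(posttest_probability(assumed_pretest, mcm_neg))  # after a negative McMurray
```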
Ultrathin disposable gastroscope for screening and surveillance of gastroesophageal varices in patients with liver cirrhosis: a prospective comparative study.

PubMed

Huynh, Dep K; Toscano, Leanne; Phan, Vinh-An; Ow, Tsai-Wing; Schoeman, Mark; Nguyen, Nam Q

2017-06-01

This study aims to evaluate the role of unsedated, ultrathin disposable gastroscopy (TDG) against conventional gastroscopy (CG) in the screening and surveillance of gastroesophageal varices (GEVs) in patients with liver cirrhosis. Forty-eight patients (56.4 ± 1.3 years; 38 male, 10 female) with liver cirrhosis referred for screening (n = 12) or surveillance (n = 36) of GEVs were prospectively enrolled. Unsedated gastroscopy was initially performed with TDG, followed by CG with conscious sedation. The 2 gastroscopies were performed by different endoscopists blinded to the results of the previous examination. Video recordings of both gastroscopies were validated by an independent investigator in a random, blinded fashion. Endpoints were accuracy and interobserver agreement of detecting GEVs, safety, and potential cost saving. CG identified GEVs in 26 (54%) patients, 10 of whom (21%) had high-risk esophageal varices (HREV). Compared with CG, TDG had an accuracy of 92% for the detection of all GEVs, which increased to 100% for high-risk GEVs. The interobserver agreement for detecting all GEVs on TDG was 88% (κ = 0.74). This increased to 94% (κ = 0.82) for high-risk GEVs. There were no serious adverse events. Unsedated TDG is safe and has high diagnostic accuracy and interobserver reliability for the detection of GEVs. The use of clinic-based TDG would allow immediate determination of a follow-up plan, making it attractive for variceal screening and surveillance programs. (Clinical trial (ANZCTR) registration number: ACTRN12616001103459.). Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
Interobserver Agreement for Contrast-Enhanced Ultrasound (CEUS)-Based Standardized Algorithms for the Diagnosis of Hepatocellular Carcinoma in High-Risk Patients.

PubMed

Schellhaas, Barbara; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger Stephan; Neurath, Markus F; Strobel, Deike

2018-06-07

This pilot study aimed at assessing interobserver agreement with two contrast-enhanced ultrasound (CEUS) algorithms for the diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 55 high-risk patients were assessed independently by three blinded observers with two standardized CEUS algorithms: ESCULAP (Erlanger Synopsis of Contrast-Enhanced Ultrasound for Liver Lesion Assessment in Patients at risk) and ACR-CEUS-LI-RADSv.2016 (American College of Radiology CEUS-Liver Imaging Reporting and Data System). Lesions were categorized according to size and ultrasound contrast enhancement in the arterial, portal-venous and late phase. Interobserver agreement for assessment of enhancement pattern and categorization was compared between both CEUS algorithms. Additionally, diagnostic accuracy for the definitive diagnosis of HCC was compared. Histology and/or CE-MRI and follow-up served as reference standards. 55 patients were included in the study (male/female, 44/11; mean age: 65.9 years). 90.9% had cirrhosis. Histological findings were available in 39/55 lesions (70.9%). Reference standard of the 55 lesions revealed 48 HCCs, 2 intrahepatic cholangiocellular carcinomas (ICCs), and 5 non-HCC-non-ICC lesions. Interobserver agreement was moderate to substantial for arterial phase hyperenhancement (κ = 0.53-0.67), and fair to moderate for contrast washout in the portal-venous or late phase (κ = 0.33-0.53). Concerning the CEUS-based algorithms, the interreader agreement was substantial for the ESCULAP category (κ = 0.64-0.68) and fair for the CEUS-LI-RADS® category (κ = 0.3-0.39). Disagreement between observers was mostly due to different perception of washout. Interobserver agreement is better for ESCULAP than for CEUS-LI-RADS®. This is mostly due to the fact that perception of contrast washout varies between different observers.
However, interobserver agreement is good for arterial phase hyperenhancement, which is the key diagnostic feature for the diagnosis of HCC with CEUS in the cirrhotic liver. © Georg Thieme Verlag KG Stuttgart · New York.
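The verbal labels attached to the κ ranges above ("fair", "moderate", "substantial") are consistent with the widely used Landis and Koch bands, although the abstract does not name the scale, so that mapping is an assumption. A minimal sketch of the convention, with a hypothetical function name:

```python
def landis_koch_label(kappa):
    """Map a kappa value to its Landis and Koch (1977) agreement label.

    Assumption: the abstract above uses this scale; it does not say so.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0:
        return "poor"
    # Upper bound of each band paired with its label.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label


# Range endpoints reported in the abstract.
for value in (0.53, 0.67, 0.33, 0.64, 0.68, 0.3, 0.39):
    print(value, landis_koch_label(value))
```

Under this assumed scale, the endpoints 0.53 and 0.67 fall in "moderate" and "substantial", and 0.3 and 0.39 both fall in "fair", matching the wording of the abstract.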
Diagnostic Accuracy of Virtual Pathology vs Traditional Microscopy in a Large Dermatopathology Study.

PubMed

Kent, Michael N; Olsen, Thomas G; Feeser, Theresa A; Tesno, Katherine C; Moad, John C; Conroy, Michael P; Kendrick, Mary Jo; Stephenson, Sean R; Murchland, Michael R; Khan, Ayesha U; Peacock, Elizabeth A; Brumfiel, Alexa; Bottomley, Michael A

2017-12-01

Digital pathology represents a transformative technology that impacts dermatologists and dermatopathologists from residency to academic and private practice. Two concerns are accuracy of interpretation from whole-slide images (WSI) and effect on workflow. Studies of considerably large series involving single-organ systems are lacking. To evaluate whether diagnosis from WSI on a digital microscope is inferior to diagnosis of glass slides from traditional microscopy (TM) in a large cohort of dermatopathology cases with attention on image resolution, specifically eosinophils in inflammatory cases and mitotic figures in melanomas, and to measure the workflow efficiency of WSI compared with TM. Three dermatopathologists established interobserver ground truth consensus (GTC) diagnosis for 499 previously diagnosed cases proportionally representing the spectrum of diagnoses seen in the laboratory. Cases were distributed to 3 different dermatopathologists who diagnosed by WSI and TM with a minimum 30-day washout between methodologies. Intraobserver WSI/TM diagnoses were compared, followed by interobserver comparison with GTC. Concordance, major discrepancies, and minor discrepancies were calculated and analyzed by paired noninferiority testing. We also measured pathologists' read rates to evaluate workflow efficiency between WSI and TM. This retrospective study was carried out in an independent, national, university-affiliated dermatopathology laboratory. Intraobserver concordance of diagnoses between WSI and TM methods and interobserver variance from GTC, following College of American Pathologists guidelines. Mean intraobserver concordance between WSI and TM was 94%. Mean interobserver concordance was 94% for WSI and GTC and 94% for TM and GTC. Mean interobserver concordance between WSI, TM, and GTC was 91%. Diagnoses from WSI were noninferior to those from TM.
Whole-slide image read rates were commensurate with WSI experience, achieving parity with TM by the most experienced user. Diagnosis from WSI was found equivalent to diagnosis from glass slides using TM in this statistically powerful study of 499 dermatopathology cases. This study supports the viability of WSI for primary diagnosis in the clinical setting.
Prediction of Helicobacter pylori status by conventional endoscopy, narrow-band imaging magnifying endoscopy in stomach after endoscopic resection of gastric cancer.

PubMed

Yagi, Kazuyoshi; Saka, Akiko; Nozawa, Yujiro; Nakamura, Atsuo

2014-04-01

To reduce the incidence of metachronous gastric carcinoma after endoscopic resection of early gastric cancer, Helicobacter pylori eradication therapy has been endorsed. It is not unusual for such patients to be H. pylori negative after eradication or for other reasons. If it were possible to predict H. pylori status using endoscopy alone, it would be very useful in clinical practice. To clarify the accuracy of endoscopic judgment of H. pylori status, we evaluated it in the stomach after endoscopic submucosal dissection (ESD) of gastric cancer. Fifty-six patients treated by ESD were enrolled. The diagnostic criteria for H. pylori status by conventional endoscopy and narrow-band imaging (NBI)-magnifying endoscopy were decided, and H. pylori status was judged by two endoscopists. Based on the H. pylori stool antigen test as a diagnostic gold standard, conventional endoscopy and NBI-magnifying endoscopy were compared for their sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Interobserver agreement was assessed in terms of κ value. Interobserver agreement was moderate (0.56) for conventional endoscopy and substantial (0.77) for NBI-magnifying endoscopy. The sensitivity, specificity, PPV, and NPV were 0.79, 0.52, 0.70, and 0.63 for conventional endoscopy and 0.91, 0.83, 0.88, and 0.86 for NBI-magnifying endoscopy, respectively. Prediction of H. pylori status using NBI-magnifying endoscopy is practical, and interobserver agreement is substantial. © 2013 John Wiley & Sons Ltd.
Does experience in hysteroscopy improve accuracy and inter-observer agreement in the management of abnormal uterine bleeding?

PubMed

Bourdel, Nicolas; Modaffari, Paola; Tognazza, Enrica; Pertile, Riccardo; Chauvet, Pauline; Botchorishivili, Revaz; Savary, Dennis; Pouly, Jean Luc; Rabischong, Benoit; Canis, Michel

2016-12-01

Hysteroscopic reliability may be influenced by the experience of the operator and by a lack of morphological diagnostic criteria for endometrial malignant pathologies. The aim of this study was to evaluate the diagnostic accuracy and the inter-observer agreement (IOA) in the management of abnormal uterine bleeding (AUB) among gynecologists with different levels of experience. Each gynecologist, without any other clinical information, was asked to evaluate the anonymous video recordings of 51 consecutive patients who underwent hysteroscopy and endometrial resection for AUB. Expert (>500 hysteroscopies), senior (20-499 procedures) and junior (≤19 procedures) gynecologists were asked to judge endometrial macroscopic appearance (benign, suspicious or frankly malignant). They also had to propose the histological diagnosis (atrophic or proliferative endometrium; simple, glandulocystic or atypical endometrial hyperplasia and endometrial carcinoma). Observers were free to indicate whether the quality of the recordings was not good enough for adequate assessment. IOA (k coefficient), sensitivity, specificity, predictive value and the likelihood ratio were calculated. Five expert, five senior and six junior gynecologists were involved in the study. Considering endometrial cancer and endometrial atypical hyperplasia, sensitivity and specificity were respectively 55.5% and 84.5% for juniors, 66.6% and 81.2% for seniors and 86.6% and 87.3% for experts. Concerning endometrial macroscopic appearance, IOA was poor for juniors (k = 0.10) and fair for seniors and experts (k = 0.23 and 0.22, respectively). IOA was poor for juniors and experts (k = 0.18 and 0.20, respectively) and fair for seniors (k = 0.30) in predicting the histological diagnosis. Sensitivity improves with the observer's experience, but inter-observer agreement and reproducibility of hysteroscopy for endometrial malignancies are unsatisfactory regardless of the level of expertise.
Therefore, an accurate and complete endometrial sampling is still needed.
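The abstract above reports sensitivity and specificity per experience group and mentions likelihood ratios without giving them. As a minimal sketch, the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity can be applied to the reported percentages. The function name and the dictionary layout are illustrative assumptions, and the results are derived here, not quoted from the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test, using the standard definitions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity as reported in the abstract (as fractions).
reported = {
    "juniors": (0.555, 0.845),
    "seniors": (0.666, 0.812),
    "experts": (0.866, 0.873),
}

# Derived values; these are NOT figures published in the abstract.
ratios = {group: likelihood_ratios(se, sp) for group, (se, sp) in reported.items()}

for group, (lr_pos, lr_neg) in ratios.items():
    print(f"{group}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Because the inputs are rounded percentages, the derived ratios should be read as approximate.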
Interobserver agreement and diagnostic accuracy of brain magnetic resonance imaging in dogs.

PubMed

Leclerc, Mylène-Kim; d'Anjou, Marc-André; Blond, Laurent; Carmel, Éric Norman; Dennis, Ruth; Kraft, Susan L; Matthews, Andrea R; Parent, Joane M

2013-06-15

To evaluate interobserver agreement and diagnostic accuracy of brain MRI in dogs. Evaluation study. 44 dogs. 5 board-certified veterinary radiologists with variable MRI experience interpreted transverse T2-weighted (T2w), T2w fluid-attenuated inversion recovery (FLAIR), and T1-weighted-FLAIR; transverse, sagittal, and dorsal T2w; and T1-weighted-FLAIR postcontrast brain sequences (1.5 T). Several imaging parameters were scored, including the following: lesion (present or absent), lesion characteristics (axial localization, mass effect, edema, hemorrhage, and cavitation), contrast enhancement characteristics, and most likely diagnosis (normal, neoplastic, inflammatory, vascular, metabolic or toxic, or other). Magnetic resonance imaging diagnoses were determined initially without patient information and then repeated, providing history and signalment. For all cases and readers, MRI diagnoses were compared with final diagnoses established with results from histologic examination (when available) or with other pertinent clinical data (CSF analysis, clinical response to treatment, or MRI follow-up). Magnetic resonance scores were compared between examiners with κ statistics. Reading agreement was substantial to almost perfect (0.64 < κ < 0.86) when identifying a brain lesion on MRI; fair to moderate (0.14 < κ < 0.60) when interpreting hemorrhage, edema, and pattern of contrast enhancement; fair to substantial (0.22 < κ < 0.74) for dural tail sign and categorization of margins of enhancement; and moderate to substantial (0.40 < κ < 0.78) for axial localization, presence of mass effect, cavitation, intensity, and distribution of enhancement. Interobserver agreement was moderate to substantial for categories of diagnosis (0.56 < κ < 0.69), and agreement with the final diagnosis was substantial regardless of whether patient information was (0.65 < κ < 0.76) or was not (0.65 < κ < 0.68) provided. 
The present study found that whereas some MRI features such as edema and hemorrhage were interpreted less consistently, radiologists were reasonably constant and accurate when providing diagnoses.
Stress echocardiography with smartphone: real-time remote reading for regional wall motion.

PubMed

Scali, Maria Chiara; de Azevedo Bellagamba, Clarissa Carmona; Ciampi, Quirino; Simova, Iana; de Castro E Silva Pretto, José Luis; Djordjevic-Dikic, Ana; Dodi, Claudio; Cortigiani, Lauro; Zagatina, Angela; Trambaiolo, Paolo; Torres, Marco R; Citro, Rodolfo; Colonna, Paolo; Paterni, Marco; Picano, Eugenio

2017-11-01

The diffusion of smartphones offers access to the best remote expertise in stress echo (SE). This study aimed to evaluate the reliability of SE based on smartphone filming and reading. A set of 20 SE video-clips was read in random sequence with a multiple choice six-answer test by ten readers from five different countries (Italy, Brazil, Serbia, Bulgaria, Russia) of the "SE2020" study network. The gold standard to assess accuracy was a core-lab expert reader in agreement with angiographic verification (0 = wrong, 1 = right). The same set of 20 SE studies was read, in random order and >2 months apart, on desktop workstation and via smartphones by ten remote readers. Image quality was graded from 1 = poor but readable, to 3 = excellent. Kappa (k) statistics were used to assess intra- and inter-observer agreement. The image quality was comparable in desktop workstation vs. smartphone (2.0 ± 0.5 vs. 2.4 ± 0.7, p = NS). The average reading time per case was similar for desktop versus smartphone (90 ± 39 vs. 82 ± 54 s, p = NS). The overall diagnostic accuracy of the ten readers was similar for desktop workstation vs. smartphone (84 vs. 91%, p = NS). Intra-observer agreement (desktop vs. smartphone) was good (k = 0.81 ± 0.14). Inter-observer agreement was good and similar via desktop or smartphone (k = 0.69 vs. k = 0.72, p = NS). The diagnostic accuracy and consistency of SE reading among certified readers were high and similar via desktop workstation or via smartphone.
Cervical digital photography for screening of uterine cervix cancer and its precursor lesions in developing countries.

PubMed

Hillmann, Elise de Castro; Dos Reis, Ricardo; Monego, Heleusa; Appel, Márcia; Hammes, Luciano Serpa; Rivoire, Waldemar Augusto; Capp, Edison

2013-07-01

This study aims to evaluate and to compare the performance of cervical digital photography (CDP) to the visual inspection with acetic acid (VIA) and visual inspection with Lugol's iodine (VILI) methods for screening for uterine cervix cancer and its precursor lesions in developing countries. A cross-sectional study was performed in Brazil. 176 women were evaluated by VIA, VILI, CDP with acetic acid and CDP with Lugol's iodine. Kappa statistic was used to estimate the interobserver and intermethod agreement. Sensitivity, specificity and diagnostic accuracy of the four methods (VIA, VILI, CDP with acetic acid, CDP with Lugol's iodine) were calculated. Interobserver agreement for CDP with acetic acid was K = 0.441 and for CDP with Lugol's iodine was K = 0.533; intermethod agreement of VIA and CDP with acetic acid, K = 0.559; and of VILI and CDP with Lugol's iodine, K = 0.507. Sensitivity and specificity of CDP with acetic acid were 84.00 and 95.83 %, and of CDP with Lugol's iodine were 88.00 and 97.26 %, respectively. The diagnostic accuracy of CDP with acetic acid and CDP with Lugol's iodine was 92.78 and 94.90 %, respectively. This was the first study to assess the performance of CDP with Lugol's iodine, which was similar to that of CDP with acetic acid. CDP is considered a promising method for screening for uterine cervix cancer and its precursor lesions in developing countries.
A Comparative Study on Diagnostic Accuracy of Colour Coded Digital Images, Direct Digital Images and Conventional Radiographs for Periapical Lesions – An In Vitro Study

PubMed Central

Mubeen; K.R., Vijayalakshmi; Bhuyan, Sanat Kumar; Panigrahi, Rajat G; Priyadarshini, Smita R; Misra, Satyaranjan; Singh, Chandravir

2014-01-01

Objectives: The identification and radiographic interpretation of periapical bone lesions is important for accurate diagnosis and treatment. The present study was undertaken to study the feasibility and diagnostic accuracy of colour coded digital radiographs in terms of presence and size of lesion and to compare the diagnostic accuracy of colour coded digital images with direct digital images and conventional radiographs for assessing periapical lesions. Materials and Methods: Sixty human dry cadaver hemimandibles were obtained and periapical lesions were created in first and second premolar teeth at the junction of cancellous and cortical bone using a micromotor handpiece and carbide burs of sizes 2, 4 and 6. After each successive use of round burs, a conventional, RVG and colour coded image was taken for each specimen. All the images were evaluated by three observers. The diagnostic accuracy for each bur and image mode was calculated statistically. Results: Our results showed good interobserver (kappa > 0.61) agreement for the different radiographic techniques and for the different bur sizes. Conventional radiography outperformed digital radiography in diagnosing periapical lesions made with the size 2 bur. Both were equally diagnostic for lesions made with larger bur sizes. The colour coding method was the least accurate of all the techniques. Conclusion: Conventional radiography traditionally forms the backbone in the diagnosis, treatment planning and follow-up of periapical lesions. Direct digital imaging is an efficient technique in the diagnostic sense. Colour coding of digital radiography was feasible but less accurate; however, this imaging technique, like any other, needs to be studied continuously with the emphasis on safety of patients and diagnostic quality of images. PMID:25584318
Use of Digitally Stained Multimodal Confocal Mosaic Images to Screen for Nonmelanoma Skin Cancer

PubMed Central

Mu, Euphemia W.; Lewin, Jesse M.; Stevenson, Mary L.; Meehan, Shane A.; Carucci, John A.; Gareau, Daniel S.

2017-01-01

IMPORTANCE Confocal microscopy has the potential to provide rapid bedside pathologic analysis, but clinical adoption has been limited in part by the need for physician retraining to interpret grayscale images. Digitally stained confocal mosaics (DSCMs) mimic the colors of routine histologic specimens and may increase adaptability of this technology. OBJECTIVE To evaluate the accuracy and precision of 3 physicians using DSCMs before and after training to detect basal cell carcinoma (BCC) and squamous cell carcinoma (SCC) in Mohs micrographic surgery fresh-tissue specimens. DESIGN This retrospective study used 133 DSCMs from 64 Mohs tissue excisions, which included clear margins, residual BCC, or residual SCC. Discarded tissue from Mohs surgical excisions from the dermatologic surgery units at Memorial Sloan Kettering Cancer Center and Oregon Health & Science University were collected for confocal imaging from 2006 to 2011. Final data analysis and interpretation took place between 2014 and 2016. Two Mohs surgeons and a Mohs fellow, who were blinded to the correlating gold standard frozen section diagnoses, independently reviewed the DSCMs for residual nonmelanoma skin cancer (NMSC) before and after a brief training session (about 5 minutes). The 2 assessments were separated by a 6-month washout period. MAIN OUTCOMES AND MEASURES Diagnostic accuracy was characterized by sensitivity and specificity of detecting NMSC using DSCMs vs standard frozen histopathologic specimens. The diagnostic precision was calculated based on interobserver agreement and κ scores. Paired 2-sample t tests were used for comparative means analyses before and after training. 
RESULTS The average respective sensitivities and specificities of detecting NMSC were 90% (95% CI, 89%-91%) and 79% (95% CI, 52%-100%) before training and 99% (95% CI, 99%-99%) (P = .001) and 93% (95% CI, 90%-96%) (P = .18) after training; for BCC, they were 83% (95% CI, 59%-100%) and 92% (95% CI, 81%-100%) before training and 98% (95% CI, 98%-98%) (P = .18) and 97% (95% CI, 95%-100%) (P = .15) after training; for SCC, they were 73% (95% CI, 65%-81%) and 89% (95% CI, 72%-100%) before training and 100% (P = .004) and 98% (95% CI, 95%-100%) (P = .21) after training. The pretraining interobserver agreement was 72% (κ = 0.58), and the posttraining interobserver agreement was 98% (κ = 0.97) (P = .04). CONCLUSIONS AND RELEVANCE Diagnostic use of DSCMs shows promising correlation to frozen histologic analysis, but image quality was affected by variations in image contrast and mosaic-stitching artifact. With training, physicians were able to read DSCMs with significantly improved accuracy and precision to detect NMSC. PMID:27603676
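The abstract above pairs each raw interobserver agreement with a κ score (72% with κ = 0.58, 98% with κ = 0.97). A minimal sketch of how the two quantities relate, assuming the generic chance-corrected form κ = (p_o - p_e) / (1 - p_e): the abstract does not state which kappa variant was used for its three readers, so the function names and the back-calculated chance agreement below are illustrative assumptions, not study results.

```python
def cohen_kappa(p_observed, p_chance):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    return (p_observed - p_chance) / (1.0 - p_chance)


def implied_chance_agreement(p_observed, kappa):
    """Invert the kappa formula to recover the chance agreement p_e."""
    return (p_observed - kappa) / (1.0 - kappa)


# Reported pairs from the abstract: (observed agreement, kappa).
pre_training = (0.72, 0.58)
post_training = (0.98, 0.97)

for label, (p_o, kappa) in (("pre", pre_training), ("post", post_training)):
    p_e = implied_chance_agreement(p_o, kappa)
    print(f"{label}-training: implied chance agreement = {p_e:.2f}")
```

The post-training figure is very sensitive to rounding, because both the numerator and the denominator are close to zero when κ approaches 1.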
Demonstration of infective endocarditis by cardiac CT and transoesophageal echocardiography: comparison with intra-operative findings.

PubMed

Koo, Hyun Jung; Yang, Dong Hyun; Kang, Joon-Won; Lee, Joo Yeon; Kim, Dae-Hee; Song, Jong-Min; Kang, Duk-Hyun; Song, Jae-Kwan; Kim, Joon Bum; Jung, Sung-Ho; Choo, Suk Jung; Chung, Cheol Hyun; Lee, Jae-Won; Lim, Tae-Hwan

2018-02-01

We aimed to compare imaging findings of infective endocarditis between computed tomography (CT) and transoesophageal echocardiography (TEE) using surgical inspection as a reference standard. Forty-nine patients (aged 54 ± 17 years, 69% men) who underwent pre-operative CT and TEE for infective endocarditis were included. Twelve of these patients had prosthetic valve endocarditis. Imaging findings of infective endocarditis were classified as vegetation, leaflet perforation, abscess/pseudoaneurysm, and paravalvular leakage. Diagnostic performances of CT and TEE were evaluated using surgical inspection as a reference standard. Interobserver agreements for CT findings were obtained using Cohen's κ test. The detection rates of infective endocarditis per patient with CT and TEE were 93.9% (46/49) and 95.9% (47/49), respectively. In per-imaging analysis, the sensitivities of CT and TEE were not significantly different for both native and prosthetic valve infective endocarditis (sensitivity: vegetation, 100% in TEE and 90.9% in CT; leaflet perforation, 87.5% in TEE and 50.0% in CT; abscess/pseudoaneurysm, 40.0% in TEE and 60.0% in CT; paravalvular leakage, 100% in TEE and 50.0% in CT). Interobserver agreements for CT findings were substantial or excellent (0.79-0.88). Cardiac CT can accurately demonstrate infective endocarditis in pre-operative patients with a similar diagnostic accuracy to TEE. The interobserver agreements for the CT findings of infective endocarditis were excellent. Published on behalf of the European Society of Cardiology. All rights reserved. © The Author 2017.
Prospective Validation of Intra- and Interobserver Reproducibility of a New Point Shear Wave Elastographic Technique for Assessing Liver Stiffness in Patients with Chronic Liver Disease.

PubMed

Ahn, Su Joa; Lee, Jeong Min; Chang, Won; Lee, Sang Min; Kang, Hyo-Jin; Yang, Hyunkyung; Yoon, Jeong Hee; Park, Sae Jin; Han, Joon Koo

2017-01-01

To assess intra- and inter-observer reproducibility of a new point shear wave elastography technique (pSWE, S-Shearwave, Samsung Medison) and compare its accuracy in assessing liver stiffness (LS) with an established pSWE technique (Virtual Touch Quantification, VTQ). Thirty-three patients were enrolled in this Institutional Review Board-approved prospective study. LS values were measured by VTQ on an Acuson S2000 system (Siemens Healthineers) and S-Shearwave on an RS-80A (Samsung Medison) in the same session, followed by two further S-Shearwave sessions for inter- and intra-observer variation at 8-hour intervals. The technical success rate (SR) and reliability of the measurements of both pSWE techniques were compared. The intra- and inter-observer reproducibility of S-Shearwave was determined by intraclass correlation coefficients (ICCs). LS values were measured by both methods of pSWE. The diagnostic performance in severe fibrosis (F ≥ 3) and cirrhosis (F = 4) was evaluated using the receiver operating characteristic curve analysis and the Obuchowski measure with the LS values of transient elastography as the reference standard. The VTQ (100%, 33/33) and S-Shearwave (96.9%, 32/33) techniques did not display a significant difference in technical SR (p = 0.63) or reliability of LS measurements (96.9%, 32/33; 93.9%, 30/32, respectively, p = 0.61). The inter- and intra-observer agreement for LS measurements using the S-Shearwave technique was excellent (ICC = 0.98 and 0.99, respectively). The mean LS values of both pSWE techniques were not significantly different and exhibited a good correlation (r = 0.78).
For detecting F ≥ 3 and F = 4, VTQ and S-Shearwave showed comparable diagnostic accuracy: the areas under the receiver operating characteristic curve (AUROC) were 0.87 (95% confidence interval [CI] 0.70-0.96) and 0.89 (95% CI 0.74-0.97), respectively, for VTQ, and 0.84 (95% CI 0.67-0.94) and 0.94 (95% CI 0.80-0.99), respectively, for S-Shearwave (p > 0.48). The Obuchowski measures were similarly high for S-Shearwave and VTQ (0.94 vs. 0.95). S-Shearwave shows excellent inter- and intra-observer agreement and diagnostic effectiveness comparable to VTQ in detecting LS.
Three-phase bone scintigraphy for diagnosis of Charcot neuropathic osteoarthropathy in the diabetic foot - does quantitative data improve diagnostic value?

PubMed

Fosbøl, M; Reving, S; Petersen, E H; Rossing, P; Lajer, M; Zerahn, B

2017-01-01

To investigate whether inclusion of quantitative data on blood flow distribution compared with visual qualitative evaluation improves the reliability and diagnostic performance of (99m)Tc-hydroxymethylene diphosphate three-phase bone scintigraphy (TPBS) in patients with suspected Charcot neuropathic osteoarthropathy (CNO) of the foot. A retrospective cohort study of TPBS performed on 148 patients with suspected acute CNO referred from a single specialized diabetes care centre. The quantitative blood flow distribution was calculated based on the method described by Deutsch et al. All scintigraphies were re-evaluated by independent, blinded observers twice with and without quantitative data on blood flow distribution at ankle and focus level, respectively. The diagnostic validity of TPBS was determined by subsequent review of clinical data and radiological examinations. A total of 90 patients (61%) had confirmed diagnosis of CNO. The sensitivity, specificity and accuracy of three-phase bone scintigraphy without/with quantitative data were 89%/88%, 58%/62% and 77%/78%, respectively. The intra-observer agreement improved significantly by adding quantitative data in the evaluation (Kappa value 0.79/0.94). The interobserver agreement was not significantly improved. Adding quantitative data on blood flow distribution in the interpretation of TPBS improves intra-observer variation, whereas no difference in interobserver variation was observed. The sensitivity of TPBS in the diagnosis of CNO is high, but its specificity is limited. Diagnostic performance does not improve using quantitative data in the evaluation. This may be due to the reference intervals applied in the study or the absence of a proper gold standard diagnostic procedure for comparison. © 2015 Scandinavian Society of Clinical Physiology and Nuclear Medicine. Published by John Wiley & Sons Ltd.
Concordance between (99m)Tc-ECD SPECT and 18F-FDG PET interpretations in patients with cognitive disorders diagnosed according to NIA-AA criteria.

PubMed

Ito, Kimiteru; Shimano, Yasumasa; Imabayashi, Etsuko; Nakata, Yasuhiro; Omachi, Yoshie; Sato, Noriko; Arima, Kunimasa; Matsuda, Hiroshi

2014-10-01

The purpose of this study was to clarify the concordance of diagnostic abilities and interobserver agreement between 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) and brain perfusion single photon-emission computed tomography (SPECT) in patients with Alzheimer's disease (AD) who were diagnosed according to the research criteria of the National Institute of Aging-Alzheimer's Association Workshop. Fifty-five patients with "AD and mild cognitive impairment (MCI)" (n = 40) and "non-AD" (n = 15) were evaluated with 18F-FDG PET and (99m)Tc-ethyl cysteinate dimer (ECD) SPECT during an 8-week period. Three radiologists independently graded the regional uptake in the frontal, temporal, parietal, and occipital lobes as well as the precuneus/posterior cingulate cortex in both images. Kappa values were used to determine the interobserver reliability regarding regional uptake. The regions with better interobserver reliability between 18F-FDG PET and (99m)Tc-ECD SPECT were the frontal, parietal, and temporal lobes. The (99m)Tc-ECD SPECT agreement in the occipital lobes was not significant. The frontal, temporal, and parietal lobes showed good correlations between 18F-FDG PET and (99m)Tc-ECD SPECT in the degree of uptake, but the occipital lobe and precuneus/posterior cingulate cortex did not show good correlations. The diagnostic accuracy rates of "AD and MCI" ranged from 60% to 70% in both of the techniques. The degree of uptake on 18F-FDG PET and (99m)Tc-ECD SPECT showed significant correlations in the frontal, temporal, and parietal lobes. The diagnostic abilities of 18F-FDG PET and (99m)Tc-ECD SPECT for "AD and MCI," when diagnosed according to the National Institute of Aging-Alzheimer's Association Workshop criteria, were nearly identical. Copyright © 2014 John Wiley & Sons, Ltd.
Training improves interobserver reliability for the diagnosis of scaphoid fracture displacement.

PubMed

Buijze, Geert A; Guitton, Thierry G; van Dijk, C Niek; Ring, David

2012-07-01

The diagnosis of displacement in scaphoid fractures is notorious for poor interobserver reliability. We tested whether training can improve interobserver reliability and sensitivity, specificity, and accuracy for the diagnosis of scaphoid fracture displacement on radiographs and CT scans. Sixty-four orthopaedic surgeons rated a set of radiographs and CT scans of 10 displaced and 10 nondisplaced scaphoid fractures for the presence of displacement, using a web-based rating application. Before rating, observers were randomized to a training group (34 observers) and a nontraining group (30 observers). The training group received an online training module before the rating session, and the nontraining group did not. Interobserver reliability for training and nontraining was assessed by Siegel's multirater kappa and the Z-test was used to test for significance. There was a small, but significant difference in the interobserver reliability for displacement ratings in favor of the training group compared with the nontraining group. Ratings of radiographs and CT scans combined resulted in moderate agreement for both groups. The average sensitivity, specificity, and accuracy of diagnosing displacement of scaphoid fractures were, respectively, 83%, 85%, and 84% for the nontraining group and 87%, 86%, and 87% for the training group. Assuming a 5% prevalence of fracture displacement, the positive predictive value was 0.23 in the nontraining group and 0.25 in the training group. The negative predictive value was 0.99 in both groups. Our results suggest training can improve interobserver reliability and sensitivity, specificity and accuracy for the diagnosis of scaphoid fracture displacement, but the improvements are slight. These findings are encouraging for future research regarding interobserver variation and how to reduce it further.
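The abstract above converts sensitivity and specificity into predictive values under an assumed 5% prevalence. A minimal sketch of that arithmetic using Bayes' rule, with the reported group averages as inputs; the function and variable names are illustrative assumptions, not taken from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


PREVALENCE = 0.05  # the 5% prevalence assumed in the abstract

# Average sensitivity and specificity reported for each group.
nontraining = predictive_values(0.83, 0.85, PREVALENCE)
training = predictive_values(0.87, 0.86, PREVALENCE)

print(f"nontraining: PPV = {nontraining[0]:.2f}, NPV = {nontraining[1]:.2f}")
print(f"training:    PPV = {training[0]:.2f}, NPV = {training[1]:.2f}")
```

Rounded to two decimals, these reproduce the reported PPV of 0.23 and 0.25 and the NPV of 0.99 in both groups. The low PPV despite sensitivity and specificity above 80% follows from the low prevalence: most positive ratings come from the much larger nondisplaced group.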
A Pilot Comparative Study of Quantitative Ultrasound, Conventional Ultrasound, and MRI for Predicting Histology-Determined Steatosis Grade in Adult Nonalcoholic Fatty Liver Disease

PubMed Central

Paige, Jeremy S.; Bernstein, Gregory S.; Heba, Elhamy; Costa, Eduardo A. C.; Fereirra, Marilia; Wolfson, Tanya; Gamst, Anthony C.; Valasek, Mark A.; Lin, Grace Y.; Han, Aiguo; Erdman, John W.; O’Brien, William D.; Andre, Michael P.; Loomba, Rohit; Sirlin, Claude B.

2017-01-01

OBJECTIVE The purpose of this study is to explore the diagnostic performance of two investigational quantitative ultrasound (QUS) parameters, attenuation coefficient and backscatter coefficient, in comparison with conventional ultrasound (CUS) and MRI-estimated proton density fat fraction (PDFF) for predicting histology-confirmed steatosis grade in adults with nonalcoholic fatty liver disease (NAFLD). SUBJECTS AND METHODS In this prospectively designed pilot study, 61 adults with histology-confirmed NAFLD were enrolled from September 2012 to February 2014. Subjects underwent QUS, CUS, and MRI examinations within 100 days of clinical-care liver biopsy. QUS parameters (attenuation coefficient and backscatter coefficient) were estimated using a reference phantom technique by two analysts independently. Three-point ordinal CUS scores intended to predict steatosis grade (1, 2, or 3) were generated independently by two radiologists on the basis of QUS features. PDFF was estimated using an advanced chemical shift–based MRI technique. Using histologic examination as the reference standard, ROC analysis was performed. Optimal attenuation coefficient, backscatter coefficient, and PDFF cutoff thresholds were identified, and the accuracy of attenuation coefficient, backscatter coefficient, PDFF, and CUS to predict steatosis grade was determined. Interobserver agreement for attenuation coefficient, backscatter coefficient, and CUS was analyzed. RESULTS CUS had 51.7% grading accuracy. The raw and cross-validated steatosis grading accuracies were 61.7% and 55.0%, respectively, for attenuation coefficient, 68.3% and 68.3% for backscatter coefficient, and 76.7% and 71.3% for MRI-estimated PDFF. Interobserver agreements were 53.3% for CUS (κ = 0.61), 90.0% for attenuation coefficient (κ = 0.87), and 71.7% for backscatter coefficient (κ = 0.82) (p < 0.0001 for all). 
CONCLUSION Preliminary observations suggest that QUS parameters may be more accurate and provide higher interobserver agreement than CUS for predicting hepatic steatosis grade in patients with NAFLD. PMID:28267360
Differentiation between cavernous hemangiomas and untreated malignant neoplasms of the liver with free-breathing diffusion-weighted MR imaging: comparison with T2-weighted fast spin-echo MR imaging.

PubMed

Soyer, Philippe; Corno, Lucie; Boudiaf, Mourad; Aout, Mounir; Sirol, Marc; Placé, Vinciane; Duchat, Florent; Guerrache, Youcef; Fargeaudou, Yann; Vicaut, Eric; Pocard, Marc; Hamzi, Lounis

2011-11-01

To test interobserver variability of ADC measurements and compare the diagnostic performance of free-breathing diffusion-weighted (FBDW) MR imaging with that of T2-weighted FSE (T2WFSE) MR imaging for differentiating between cavernous hemangiomas and untreated malignant hepatic neoplasms. Thirty-five patients with cavernous hemangiomas and 35 with untreated hepatic malignant neoplasms had FBDW and T2WFSE MR imaging. Hepatic lesions were characterized with ADC measurement and visual evaluation. Interobserver agreement for ADC measurement was calculated. Association between ADC value and lesion type was assessed using univariate analysis. Sensitivity, specificity and accuracy of ADC values and visual evaluation of MR images for the diagnosis of untreated malignant hepatic neoplasm were compared. ADC measurements showed excellent interobserver correlation (intraclass correlation coefficient = 0.980). Malignant neoplasms had lower ADC values than hemangiomas for the two observers ((1.11 ± 0.21) × 10⁻³ mm²/s vs. (1.77 ± 0.29) × 10⁻³ mm²/s for observer 1 and (1.11 ± 0.19) × 10⁻³ mm²/s vs. (1.79 ± 0.32) × 10⁻³ mm²/s for observer 2) and univariate analysis found significant correlations between lesion type and ADC values. Depending on ADC threshold value, accuracy for the diagnosis of malignant neoplasm varied from 82.9% to 94.3%. Using visual evaluation, FBDW showed better specificity and accuracy than T2WFSE MR images for the diagnosis of malignant neoplasm (97.1% vs. 77.1% and 94.3% vs. 62.9%, respectively). FBDW imaging provides reproducible quantitative information and surpasses the value of T2WFSE MR imaging for differentiating between cavernous hemangiomas and untreated malignant hepatic neoplasms. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
A Pilot Comparative Study of Quantitative Ultrasound, Conventional Ultrasound, and MRI for Predicting Histology-Determined Steatosis Grade in Adult Nonalcoholic Fatty Liver Disease.

PubMed

Paige, Jeremy S; Bernstein, Gregory S; Heba, Elhamy; Costa, Eduardo A C; Fereirra, Marilia; Wolfson, Tanya; Gamst, Anthony C; Valasek, Mark A; Lin, Grace Y; Han, Aiguo; Erdman, John W; O'Brien, William D; Andre, Michael P; Loomba, Rohit; Sirlin, Claude B

2017-05-01

The purpose of this study is to explore the diagnostic performance of two investigational quantitative ultrasound (QUS) parameters, attenuation coefficient and backscatter coefficient, in comparison with conventional ultrasound (CUS) and MRI-estimated proton density fat fraction (PDFF) for predicting histology-confirmed steatosis grade in adults with nonalcoholic fatty liver disease (NAFLD). In this prospectively designed pilot study, 61 adults with histology-confirmed NAFLD were enrolled from September 2012 to February 2014. Subjects underwent QUS, CUS, and MRI examinations within 100 days of clinical-care liver biopsy. QUS parameters (attenuation coefficient and backscatter coefficient) were estimated using a reference phantom technique by two analysts independently. Three-point ordinal CUS scores intended to predict steatosis grade (1, 2, or 3) were generated independently by two radiologists on the basis of QUS features. PDFF was estimated using an advanced chemical shift-based MRI technique. Using histologic examination as the reference standard, ROC analysis was performed. Optimal attenuation coefficient, backscatter coefficient, and PDFF cutoff thresholds were identified, and the accuracy of attenuation coefficient, backscatter coefficient, PDFF, and CUS to predict steatosis grade was determined. Interobserver agreement for attenuation coefficient, backscatter coefficient, and CUS was analyzed. CUS had 51.7% grading accuracy. The raw and cross-validated steatosis grading accuracies were 61.7% and 55.0%, respectively, for attenuation coefficient, 68.3% and 68.3% for backscatter coefficient, and 76.7% and 71.3% for MRI-estimated PDFF. Interobserver agreements were 53.3% for CUS (κ = 0.61), 90.0% for attenuation coefficient (κ = 0.87), and 71.7% for backscatter coefficient (κ = 0.82) (p < 0.0001 for all). 
Preliminary observations suggest that QUS parameters may be more accurate and provide higher interobserver agreement than CUS for predicting hepatic steatosis grade in patients with NAFLD.

Validation of a new UNIX-based quantitative coronary angiographic system for the measurement of coronary artery lesions.

PubMed

Bell, M R; Britson, P J; Chu, A; Holmes, D R; Bresnahan, J F; Schwartz, R S

1997-01-01

We describe a method of validation of computerized quantitative coronary arteriography and report the results of a new UNIX-based quantitative coronary arteriography software program developed for rapid on-line (digital) and off-line (digital or cinefilm) analysis. The UNIX operating system is widely available in computer systems using very fast processors and has excellent graphics capabilities. The system is potentially compatible with any cardiac digital x-ray system for on-line analysis and has been designed to incorporate an integrated database, have on-line and immediate recall capabilities, and provide digital access to all data. The accuracy (mean signed differences of the observed minus the true dimensions) and precision (pooled standard deviations of the measurements) of the program were determined using x-ray vessel phantoms. Intra- and interobserver variabilities were assessed from in vivo studies during routine clinical coronary arteriography. Precision from the x-ray phantom studies (6-in. field of view) for digital images was 0.066 mm and for digitized cine images was 0.060 mm. Accuracy was 0.076 mm (overestimation) for digital images compared to 0.008 mm for digitized cine images. Diagnostic coronary catheters were also used for calibration; accuracy varied according to size of catheter and whether or not they were filled with iodinated contrast. Intra- and interobserver variabilities were excellent and indicated that coronary lesion measurements were relatively user-independent. Thus, this easy-to-use and very fast UNIX-based program appears to be robust with optimal accuracy and precision for clinical and research applications.
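The abstract above defines accuracy as the mean signed difference (observed minus true) and precision as a pooled standard deviation of the measurements. A minimal sketch of both definitions follows. The phantom diameters and readings are invented for illustration only, and the pooling formula (sample variances weighted by degrees of freedom across phantom sizes) is one common reading of "pooled standard deviation", assumed here and not confirmed by the abstract.

```python
from math import sqrt
from statistics import mean, variance

# HYPOTHETICAL repeated measurements (mm), keyed by true phantom diameter (mm).
# These numbers are made up for illustration and are not study data.
measurements = {
    1.0: [1.05, 1.10, 0.98, 1.07],
    2.0: [2.04, 2.11, 1.97, 2.08],
    3.0: [3.02, 3.09, 2.95, 3.10],
}


def accuracy(data):
    """Mean signed difference, observed minus true; positive means overestimation."""
    diffs = [obs - true for true, values in data.items() for obs in values]
    return mean(diffs)


def precision(data):
    """Pooled SD: sample variances weighted by their degrees of freedom."""
    weighted = sum((len(values) - 1) * variance(values) for values in data.values())
    dof = sum(len(values) - 1 for values in data.values())
    return sqrt(weighted / dof)


print(f"accuracy  = {accuracy(measurements):+.3f} mm")
print(f"precision = {precision(measurements):.3f} mm")
```

Keeping the sign in the accuracy figure is what lets the abstract describe the digital-image result as an overestimation; an absolute difference would lose that information.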
Operational accuracy and comparative persistent antigenicity of HRP2 rapid diagnostic tests for Plasmodium falciparum malaria in a hyperendemic region of Uganda.

PubMed

Kyabayinze, Daniel J; Tibenderana, James K; Odong, George W; Rwakimari, John B; Counihan, Helen

2008-10-29

Parasite-based diagnosis of malaria by microscopy requires laboratory skills that are generally unavailable at peripheral health facilities. Rapid diagnostic tests (RDTs) require less expertise, but accuracy under operational conditions has not been fully evaluated in Uganda. There are also concerns about RDTs that use the antigen histidine-rich protein 2 (HRP2) to detect Plasmodium falciparum, because this antigen can persist after effective treatment, giving false positive test results in the absence of infection. An assessment of the accuracy of Malaria Pf immuno-chromatographic test (ICT) and description of persistent antigenicity of HRP2 RDTs was undertaken in a hyperendemic area of Uganda. Using a cross-sectional design, a total of 357 febrile patients of all ages were tested using ICT, and compared to microscopy as the gold standard reference. Two independent RDT readings were used to assess accuracy and inter-observer reliability. With a longitudinal design to describe persistent antigenicity of ICT and Paracheck, 224 children aged 6-59 months were followed up at 7-day intervals until the HRP2 antigens were undetectable by the RDTs. Of the 357 patients tested during the cross-sectional component, 40% (139) had positive blood smears for asexual forms of P. falciparum. ICT had an overall sensitivity of 98%, a specificity of 72%, a negative predictive value (NPV) of 98% and a positive predictive value (PPV) of 69%. ICT showed a high inter-observer reliability under operational conditions, with 95% of readings having assigned the same results (kappa statistics 0.921, p < 0.001). In children followed up after successful antimalaria treatment, the mean duration of persistent antigenicity was 32 days, and this duration varied significantly depending on pre-treatment parasitaemia.
In patients with parasite density >50,000/microl, the mean duration of persistent antigenicity was 37 days compared to 26 days for parasitaemia less than 1,000/microl (log rank 21.9, p < 0.001). ICT is an accurate and appropriate test for operational use as a diagnostic tool where microscopy is unavailable. However, persistent antigenicity reduces the accuracy of this and other HRP2-based RDTs. The low specificity continues to be of concern, especially in children below five years of age. These pose limitations that need consideration, such as their use for diagnosis of patients returning with symptoms within two to four weeks of treatment. Good clinical skills are essential to interpret test results.
Operational accuracy and comparative persistent antigenicity of HRP2 rapid diagnostic tests for Plasmodium falciparum malaria in a hyperendemic region of Uganda

PubMed Central

Kyabayinze, Daniel J; Tibenderana, James K; Odong, George W; Rwakimari, John B; Counihan, Helen

2008-01-01

Background Parasite-based diagnosis of malaria by microscopy requires laboratory skills that are generally unavailable at peripheral health facilities. Rapid diagnostic tests (RDTs) require less expertise, but accuracy under operational conditions has not been fully evaluated in Uganda. There are also concerns about RDTs that use the antigen histidine-rich protein 2 (HRP2) to detect Plasmodium falciparum, because this antigen can persist after effective treatment, giving false positive test results in the absence of infection. An assessment of the accuracy of Malaria Pf™ immuno-chromatographic test (ICT) and description of persistent antigenicity of HRP2 RDTs was undertaken in a hyperendemic area of Uganda. Methods Using a cross-sectional design, a total of 357 febrile patients of all ages were tested using ICT, and compared to microscopy as the gold standard reference. Two independent RDT readings were used to assess accuracy and inter-observer reliability. With a longitudinal design to describe persistent antigenicity of ICT and Paracheck, 224 children aged 6–59 months were followed up at 7-day intervals until the HRP2 antigens were undetectable by the RDTs. Results Of the 357 patients tested during the cross-sectional component, 40% (139) had positive blood smears for asexual forms of P. falciparum. ICT had an overall sensitivity of 98%, a specificity of 72%, a negative predictive value (NPV) of 98% and a positive predictive value (PPV) of 69%. ICT showed a high inter-observer reliability under operational conditions, with 95% of readings having assigned the same results (kappa statistics 0.921, p < 0.001). In children followed up after successful antimalaria treatment, the mean duration of persistent antigenicity was 32 days, and this duration varied significantly depending on pre-treatment parasitaemia.
In patients with parasite density >50,000/μl, the mean duration of persistent antigenicity was 37 days compared to 26 days for parasitaemia less than 1,000/μl (log rank 21.9, p < 0.001). Conclusion ICT is an accurate and appropriate test for operational use as a diagnostic tool where microscopy is unavailable. However, persistent antigenicity reduces the accuracy of this and other HRP2-based RDTs. The low specificity continues to be of concern, especially in children below five years of age. These pose limitations that need consideration, such as their use for diagnosis of patients returning with symptoms within two to four weeks of treatment. Good clinical skills are essential to interpret test results. PMID:18959777
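The predictive values reported in this record depend on how common infection was in the sample, not only on sensitivity and specificity. As a rough consistency check (a sketch, not the authors' own calculation), Bayes' rule can be applied to the rounded figures from the abstract: sensitivity 98%, specificity 72%, and 139 smear-positive patients out of 357. The function name `predictive_values` is an illustrative assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded inputs from the abstract; prevalence = 139 smear-positive of 357 tested.
ppv, npv = predictive_values(0.98, 0.72, 139 / 357)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")  # prints "PPV = 69%, NPV = 98%"
```

With these inputs the sketch reproduces the reported PPV of 69% and NPV of 98%. The same test would give a lower PPV in a setting with lower prevalence.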
Clinical assessment of effusion in knee osteoarthritis—A systematic review

PubMed Central

Maricar, Nasimah; Callaghan, Michael J.; Parkes, Matthew J.; Felson, David T.; O'Neill, Terence W.

2016-01-01

Objective The aim of this systematic review was to determine the validity and inter- and intra-observer reliability of the assessment of knee joint effusion in osteoarthritis (OA) of the knee. Methods MEDLINE, Web of Knowledge, CINAHL, EMBASE, and AMED were searched from their inception to February 2015. Articles were included according to a priori defined criteria: samples containing participants with knee OA; prospective evaluation of clinical tests and assessments of knee effusion that included reliability, sensitivity, and specificity of these tests. Results A total of 10 publications were reviewed. Eight of these considered reliability and four on validity of clinical assessments against ultrasound effusion. It was not possible to undertake a meta-analysis of reliability or validity because of differences in study designs and the clinical tests. Intra-observer kappa agreement for visible swelling ranged from 0.37 (suprapatellar) to 1.0 (prepatellar); for bulge sign 0.47 and balloon sign 0.37. Inter-observer kappa agreement for visible swelling ranged from −0.02 (prepatellar) to 0.65 (infrapatellar), the balloon sign −0.11 to 0.82, patellar tap −0.02 to 0.75 and bulge sign kappa −0.04 to 0.14 or reliability coefficient 0.97. Reliability and diagnostic accuracy tended to be better in experienced observers. Very few data looked at performance of individual clinical tests with sensitivity ranging 18.2–85.7% and specificity 35.3–93.3%, both higher with larger effusions. Conclusion The majority of unstandardized clinical tests to assess joint effusion in knee OA had relatively low intra- and inter-observer reliability. There is some evidence experience improved reliability and diagnostic accuracy of tests. Currently there is insufficient evidence to recommend any particular test in clinical practice. PMID:26581486
Clinical assessment of effusion in knee osteoarthritis-A systematic review.

PubMed

Maricar, Nasimah; Callaghan, Michael J; Parkes, Matthew J; Felson, David T; O'Neill, Terence W

2016-04-01

The aim of this systematic review was to determine the validity and inter- and intra-observer reliability of the assessment of knee joint effusion in osteoarthritis (OA) of the knee. MEDLINE, Web of Knowledge, CINAHL, EMBASE, and AMED were searched from their inception to February 2015. Articles were included according to a priori defined criteria: samples containing participants with knee OA; prospective evaluation of clinical tests and assessments of knee effusion that included reliability, sensitivity, and specificity of these tests. A total of 10 publications were reviewed. Eight of these considered reliability and four on validity of clinical assessments against ultrasound effusion. It was not possible to undertake a meta-analysis of reliability or validity because of differences in study designs and the clinical tests. Intra-observer kappa agreement for visible swelling ranged from 0.37 (suprapatellar) to 1.0 (prepatellar); for bulge sign 0.47 and balloon sign 0.37. Inter-observer kappa agreement for visible swelling ranged from -0.02 (prepatellar) to 0.65 (infrapatellar), the balloon sign -0.11 to 0.82, patellar tap -0.02 to 0.75 and bulge sign kappa -0.04 to 0.14 or reliability coefficient 0.97. Reliability and diagnostic accuracy tended to be better in experienced observers. Very few data looked at performance of individual clinical tests with sensitivity ranging 18.2-85.7% and specificity 35.3-93.3%, both higher with larger effusions. The majority of unstandardized clinical tests to assess joint effusion in knee OA had relatively low intra- and inter-observer reliability. There is some evidence experience improved reliability and diagnostic accuracy of tests. Currently there is insufficient evidence to recommend any particular test in clinical practice. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Validation (in Spanish) of the Mini Nutritional Assessment survey to assess the nutritional status of patients over 65 years of age.

PubMed

Muñoz Díaz, Belén; Molina-Recio, Guillermo; Romero-Saldaña, Manuel; Redondo Sánchez, Juana; Aguado Taberné, Cristina; Arias Blanco, Carmen; Molina-Luque, Rafael; Martínez De La Iglesia, Jorge

2018-06-05

To validate the Mini Nutritional Assessment (MNA) in a Spanish population over 65 years of age with varying degrees of independence. This cross-sectional validation study used the Chang nutritional assessment method as a reference test. 248 subjects (75.4% female), with a mean age of 83, completed the study. They were classified into three groups: (i) autonomous patients who were able to take part in activities outside their home; (ii) patients who require help with daily self-care; (iii) patients living in a residential health care facility. Three health centres and three residential care homes situated in Cordoba (Spain). The kappa values for intra-observer and inter-observer agreement were 0.870 and 0.784, respectively. The intra-class correlation coefficient intra-observer was 0.874 and the inter-observer was 0.789. The sensitivity and specificity readings for the diagnostic accuracy of MNA were 63.2% and 72.9% in the total sample, respectively. The area under the curve was 0.726. For patients in Groups A, B and C, the sensitivity was 89.3%, 60.7% and 18.8%, and the specificity was 23.3%, 56.8% and 94.1%, respectively. The results for the reliability of the survey were excellent, and its internal consistency was acceptable. The diagnostic accuracy, as measured by the sensitivity and specificity readings, was lower than that obtained with the original survey. It can therefore be considered more suitable for a population with limited autonomy, and less appropriate for independent patients. The results may not be relevant to patients outside of the Cordoba region in Spain.
Accuracy of MSCT Coronary Angiography with 64 Row CT Scanner—Facing the Facts

PubMed Central

Wehrschuetz, M.; Wehrschuetz, E.; Schuchlenz, H.; Schaffler, G.

2010-01-01

Improvements in multislice computed tomography (MSCT) angiography of the coronary vessels have enabled the minimally invasive detection of coronary artery stenoses, while quantitative coronary angiography (QCA) is the accepted reference standard for evaluation thereof. Sixteen-slice MSCT showed promising diagnostic accuracy in detecting coronary artery stenoses haemodynamically and the subsequent introduction of 64-slice scanners promised excellent and fast results for coronary artery studies. This prompted us to evaluate the diagnostic accuracy, sensitivity, specificity, and the negative and positive predictive values of 64-slice MSCT in the detection of haemodynamically significant coronary artery stenoses. Thirty-seven consecutive subjects with suspected coronary artery disease were evaluated with MSCT angiography and the results compared with QCA. All vessels were considered for the assessment of significant coronary artery stenosis (diameter reduction ≥ 50%). Thirteen patients (35%) were identified as having significant coronary artery stenoses on QCA with 6.3% (35/555) affected segments. None of the coronary segments were excluded from analysis. Overall sensitivity for classifying stenoses of 64-slice MSCT was 69%, specificity was 92%, positive predictive value was 38% and negative predictive value was 98%. The interobserver variability for detection of significant lesions had a k-value of 0.43. Sixty-four-slice MSCT offers the diagnostic potential to detect coronary artery disease, to quantify haemodynamically significant coronary artery stenoses and to avoid unnecessary invasive coronary artery examinations. PMID:20567636
Diagnostic Performance of Ultrasonography for Pediatric Appendicitis: A Night and Day Difference?

PubMed

Mangona, Kate Louise M; Guillerman, R Paul; Mangona, Victor S; Carpenter, Jennifer; Zhang, Wei; Lopez, Monica; Orth, Robert C

2017-12-01

For imaging pediatric appendicitis, ultrasonography (US) is preferred because of its lack of ionizing radiation, but is limited by operator dependence. This study investigates the US diagnostic performance during night shifts covered by radiology trainees compared to day shifts covered by attending radiologists. Appy-Scores (1 = completely visualized normal appendix; 2 = partially visualized normal appendix; 3 = nonvisualized appendix with no inflammatory changes in the expected region of the appendix; 4 = equivocal; 5a = nonperforated appendicitis; 5b = perforated appendicitis) from 2935 US examinations (2161:774, day-to-night) from July 2013 to 2014 were correlated with the intraoperative diagnoses and the clinical follow-up. The diagnostic performance of trainees and attendings was compared with Fisher exact test. Interobserver agreement was measured by Cohen kappa coefficient. Appendicitis prevalence was 25.3% (day) and 22.5% (night). Sensitivity, specificity, accuracy, negative predictive value, and positive predictive value were 94.0%, 93.7%, 93.8%, 97.9%, and 83.4% during the day and 92.0%, 91.2%, 91.3%, 97.5%, and 75.2% at night. Specificity (P = .048) and positive predictive value (P = .011) differed, with more false positives at night (7%) than during the day (4.7%). Trainee and attending agreement was high (k = 0.995), with Appy-Scores of 1, 4, and 5a most frequently discordant. US has a high diagnostic performance and interobserver agreement for pediatric appendicitis when interpreted by radiology trainees during night shifts or attending radiologists during day shifts. However, lower specificity and positive predictive value at night warrant thorough trainee education to avoid false-positive examinations. Published by Elsevier Inc.
Diagnostic accuracy of MRI in the measurement of glenoid bone loss.

PubMed

Gyftopoulos, Soterios; Hasan, Saqib; Bencardino, Jenny; Mayo, Jason; Nayyar, Samir; Babb, James; Jazrawi, Laith

2012-10-01

The purpose of this study is to assess the accuracy of MRI quantification of glenoid bone loss and to compare the diagnostic accuracy of MRI to CT in the measurement of glenoid bone loss. MRI, CT, and 3D CT examinations of 18 cadaveric glenoids were obtained after the creation of defects along the anterior and anteroinferior glenoid. The defects were measured by three readers separately and blindly using the circle method. These measurements were compared with measurements made on digital photographic images of the cadaveric glenoids. Paired sample Student t tests were used to compare the imaging modalities. Concordance correlation coefficients were also calculated to measure interobserver agreement. Our data show that MRI could be used to accurately measure glenoid bone loss with a small margin of error (mean, 3.44%; range, 2.06-5.94%) in estimated percentage loss. MRI accuracy was similar to that of both CT and 3D CT for glenoid loss measurements in our study for the readers familiar with the circle method, with 1.3% as the maximum expected difference in accuracy of the percentage bone loss between the different modalities (95% confidence). Glenoid bone loss can be accurately measured on MRI using the circle method. The MRI quantification of glenoid bone loss compares favorably to measurements obtained using 3D CT and CT. The accuracy of the measurements correlates with the level of training, and a learning curve is expected before mastering this technique.
"Score the Core" Web-based pathologist training tool improves the accuracy of breast cancer IHC4 scoring.

PubMed

Engelberg, Jesse A; Retallack, Hanna; Balassanian, Ronald; Dowsett, Mitchell; Zabaglo, Lila; Ram, Arishneel A; Apple, Sophia K; Bishop, John W; Borowsky, Alexander D; Carpenter, Philip M; Chen, Yunn-Yi; Datnow, Brian; Elson, Sarah; Hasteh, Farnaz; Lin, Fritz; Moatamed, Neda A; Zhang, Yanhong; Cardiff, Robert D

2015-11-01

Hormone receptor status is an integral component of decision-making in breast cancer management. IHC4 score is an algorithm that combines hormone receptor, HER2, and Ki-67 status to provide a semiquantitative prognostic score for breast cancer. High accuracy and low interobserver variance are important to ensure the score is accurately calculated; however, few previous efforts have been made to measure or decrease interobserver variance. We developed a Web-based training tool, called "Score the Core" (STC) using tissue microarrays to train pathologists to visually score estrogen receptor (using the 300-point H score), progesterone receptor (percent positive), and Ki-67 (percent positive). STC used a reference score calculated from a reproducible manual counting method. Pathologists in the Athena Breast Health Network and pathology residents at associated institutions completed the exercise. By using STC, pathologists improved their estrogen receptor H score and progesterone receptor and Ki-67 proportion assessment and demonstrated a good correlation between pathologist and reference scores. In addition, we collected information about pathologist performance that allowed us to compare individual pathologists and measures of agreement. Pathologists' assessment of the proportion of positive cells was closer to the reference than their assessment of the relative intensity of positive cells. Careful training and assessment should be used to ensure the accuracy of breast biomarkers. This is particularly important as breast cancer diagnostics become increasingly quantitative and reproducible. Our training tool is a novel approach for pathologist training that can serve as an important component of ongoing quality assessment and can improve the accuracy of breast cancer prognostic biomarkers. Copyright © 2015 Elsevier Inc. All rights reserved.
Learning curve of office-based ultrasonography for rotator cuff tendons tears.

PubMed

Ok, Ji-Hoon; Kim, Yang-Soo; Kim, Jung-Man; Yoo, Tae-Wook

2013-07-01

To compare the accuracy of ultrasonography and MR arthrography (MRA) imaging in detecting rotator cuff tears with arthroscopic finding used as the reference standard. The ultrasonography and MRA findings of 51 shoulders that underwent arthroscopic surgery were prospectively analysed. Two orthopaedic doctors independently performed ultrasonography and interpreted the findings at the office. The tear size measured at ultrasonography and MRA was compared with the size measured at surgery using Pearson correlation coefficients (r). The sensitivity, specificity, accuracy, positive predictive value, negative predictive value and false-positive rate were calculated for a diagnosis of partial- and full-thickness rotator cuff tears. The kappa coefficient was calculated to verify the inter-observer agreement. The sensitivity of ultrasonography and MRA for detecting partial-thickness tears was 45.5% and 72.7%, and that for full-thickness tears was 80.0% and 100%, respectively. The accuracy of ultrasonography and MRA for detecting partial-thickness tears was 45.1% and 88.2%, and that for full-thickness tears was 82.4% and 98%, respectively. Tear size measured based on ultrasonography examination showed a poor correlation with the size measured at arthroscopic surgery (r = 0.21; p < 0.05). However, tear size estimated by MRA showed a strong correlation (r = 0.75; p < 0.05). The kappa coefficient was 0.47 between the two independent examiners. The accuracy of office-based ultrasonography for beginner orthopaedic surgeons to detect full-thickness rotator cuff tears was comparable to that of MRA but was less accurate for detecting partial-thickness tears and tear size measurement. Inter-observer agreement on the interpretation was fair. These results highlight the importance of the correct technique and experience in operation of ultrasonography in shoulder joint. Diagnostic study, Level II.
Metal artefacts severely hamper magnetic resonance imaging of the rotator cuff tendons after rotator cuff repair with titanium suture anchors.

PubMed

Schröder, Femke F; Huis In't Veld, Rianne; den Otter, Lydia A; van Raak, Sjoerd M; Ten Haken, Bennie; Vochteloo, Anne J H

2018-04-01

The rate of retear after rotator cuff surgery is 17%. Magnetic resonance imaging (MRI) scans are used for confirmative diagnosis of retear. However, because of the presence of titanium suture anchors, metal artefacts on the MRI are common. The present study evaluated the diagnostic value of MRI after rotator cuff tendon surgery with respect to assessing the integrity as well as the degeneration and atrophy of the rotator cuff tendons when titanium anchors are in place. Twenty patients who underwent revision surgery of the rotator cuff as a result of a clinically suspected retear between 2013 and 2015 were included. The MRI scans of these patients were retrospectively analyzed by four specialized shoulder surgeons and compared with intra-operative findings (gold standard). Sensitivity and interobserver agreement among the surgeons in assessing retears as well as the Goutallier and Warner classification were examined. In 36% (range 15% to 50%) of the pre-operative MRI scans, the observers could not review the rotator cuff tendons. When the rotator cuff tendons were assessable, a diagnostic accuracy with a mean sensitivity of 0.84 (0.70 to 1.0) across the surgeons was found, with poor interobserver agreement (kappa = 0.12). Metal artefacts prevented accurate diagnosis from MRI scans of rotator cuff retear in 36% of the patients studied.
Diagnostic accuracy of susceptibility-weighted magnetic resonance imaging for the evaluation of pineal gland calcification

PubMed Central

Böker, Sarah M.; Bender, Yvonne Y.; Diederichs, Gerd; Fallenberg, Eva M.; Wagner, Moritz; Hamm, Bernd; Makowski, Marcus R.

2017-01-01

Objectives To determine the diagnostic performance of susceptibility-weighted magnetic resonance imaging (SWMR) for the detection of pineal gland calcifications (PGC) compared to conventional magnetic resonance imaging (MRI) sequences, using computed tomography (CT) as a reference standard. Methods 384 patients who received a 1.5 Tesla MRI scan including SWMR sequences and a CT scan of the brain between January 2014 and October 2016 were retrospectively evaluated. 346 patients were included in the analysis, of which 214 showed PGC on CT scans. To assess correlation between imaging modalities, the maximum calcification diameter was used. Sensitivity and specificity and intra- and interobserver reliability were calculated for SWMR and conventional MRI sequences. Results SWMR reached a sensitivity of 95% (95% CI: 91%-97%) and a specificity of 96% (95% CI: 91%-99%) for the detection of PGC, whereas conventional MRI achieved a sensitivity of 43% (95% CI: 36%-50%) and a specificity of 96% (95% CI: 91%-99%). Detection rates for calcifications in SWMR and conventional MRI differed significantly (95% versus 43%, p<0.001). Diameter measurements between SWMR and CT showed a close correlation (R² = 0.85, p<0.001) with a slight but not significant overestimation of size (SWMR: 6.5 mm ± 2.5; CT: 5.9 mm ± 2.4, p = 0.02). Interobserver agreement for diameter measurements was excellent on SWMR (ICC = 0.984, p < 0.0001). Conclusions Combining SWMR magnitude and phase information enables the accurate detection of PGC and offers a better diagnostic performance than conventional MRI with CT as a reference standard. PMID:28278291
Diagnostic and Treatment Reproducibility of Cervical Intraepithelial Neoplasia / Squamous Intraepithelial Lesion and Factors Affecting the Diagnosis.

PubMed

Sağlam, Arzu; Usubütün, Alp; Dolgun, Anıl; Mutter, George L; Salman, M Coşkun; Kurtulan, Olcay; Akyol, Aytekin; Özkan, Eylem Akar; Baykara, Sema; Bülbül, Dilek; Calay, Zerrin; Eren, Funda; Gümürdülü, Derya; Haberal, Nihan; Ilvan, Şennur; Karaveli, Şeyda; Koyuncuoğlu, Meral; Müezzinoğlu, Bahar; Müftüoğlu, Kamil Hakan; Özen, Özlem; Özdemir, Necmettin; Peştereli, Elif; Ulukuş, Çağnur; Zekioğlu, Osman

2017-01-01

Inter-observer differences in the diagnosis of HPV related cervical lesions are problematic and response of gynecologists to these diagnostic entities is non-standardized. This study evaluated the diagnostic reproducibility of "cervical intraepithelial neoplasia" (CIN) and "squamous intraepithelial lesion" (SIL) diagnoses. 19 pathologists evaluated 66 cases once using H&E slides and once with immunohistochemical studies (p16, Ki-67 and Pro-ExC). Management response to diagnoses was evaluated amongst 12 gynecologists. Pathologists and gynecologists were also given a questionnaire about how additional information like smear results and age modify diagnosis and management. We show moderate interobserver diagnostic reproducibility amongst pathologists. The overall kappa value was 0.50 and 0.59 using the CIN and SIL classifications respectively. Impact of immunohistochemical evaluation on interpretation of cases differed and there was lack of statistically significant improvement of interobserver diagnostic reproducibility with the addition of immunohistochemistry. We saw that choice of treatment methods amongst gynecologists varied and overall concordance was only fair to moderate. The CIN2 diagnostic category was seen to have the lowest percentage agreement amongst both pathologists and gynecologists. We showed that pathologists had diagnostic "styles" and gynecologists had management "styles". In summary each pathologist had different diagnostic tendencies which were affected not only by histopathology and marker studies, but also by the patient management tendencies of the gynecologist that the pathologist worked with. The two-tiered modified Bethesda system improved diagnostic agreement. We concluded that immunohistochemistry should be used only to resolve problems in select cases and not for every case.
MR imaging of silicone breast implants: evaluation of prospective and retrospective interpretations and interobserver agreement.

PubMed

Quinn, S F; Neubauer, N M; Sheley, R C; Demlow, T A; Szumowski, J

1996-01-01

MR imaging was used to evaluate the integrity of silicone breast implants in 54 women with 108 implants. MR images were interpreted by relatively inexperienced readers who tried to reproduce the experiences reported in the literature. The study examines the interobserver agreement using different diagnostic signs and the influence of experience on interpretation errors. Prospective and retrospective interpretations were compared with surgical findings at the time of explantation. Diagnostic indicators, including the linguine sign, the inverted tear drop sign, the C sign, water droplets mixed with silicone, and extracapsular globules of silicone, were evaluated for diagnostic efficacy and interobserver agreement. The prospective sensitivity and specificity were 87% and 78%, respectively. With the retrospective interpretations, the sensitivity and specificity increased to 93% and 92%, respectively. Most of the prospective false-positive interpretations were due to misinterpreting radial folds as signs of implant rupture. Six implants interpreted retrospectively as false positives had gross amounts of silicone around the implants at surgery but there were no obvious rents in the implant shells. There was fair to excellent interobserver agreement with the individual diagnostic signs except for extracapsular globules of silicone. All of the signs had specificities of greater than 90%. The sensitivities of the individual signs were less than the overall retrospective sensitivity. With experience, the sensitivity improved from 87% to 93% and the specificity improved from 78% to 92%. This study helps substantiate the use of diagnostic signs used by other authors to detect silicone loss from breast implants by MR imaging; however, questions remain as to the clinical role of MR imaging in evaluating implants for silicone loss.
Magnetic Resonance Angiography in the Diagnosis of Cerebral Arteriovenous Malformation and Dural Arteriovenous Fistulas: Comparison of Time-Resolved Magnetic Resonance Angiography and Three Dimensional Time-of-Flight Magnetic Resonance Angiography

PubMed Central

Cheng, Yu-Ching; Chen, Hung-Chieh; Wu, Chen-Hao; Wu, Yi-Ying; Sun, Ming-His; Chen, Wen-Hsien; Chai, Jyh-Wen; Chi-Chang Chen, Clayton

2016-01-01

Background Traditional digital subtraction angiography (DSA) is currently the gold standard diagnostic method for the diagnosis and evaluation of cerebral arteriovenous malformation (AVM) and dural arteriovenous fistulas (dAVF). Objectives The aim of this study was to analyze different less invasive magnetic resonance angiography (MRA) images, time-resolved MRA (TR-MRA) and three-dimensional time-of-flight MRA (3D TOF MRA) to identify their diagnostic accuracy and to determine which approach is most similar to DSA. Patients and Methods A total of 41 patients with AVM and dAVF at their initial evaluation or follow-up after treatment were recruited in this study. We applied time-resolved angiography using keyhole (4D-TRAK) MRA to perform TR-MRA and 3D TOF MRA examinations simultaneously followed by DSA, which was considered as a standard reference. Two experienced neuroradiologists reviewed the images to compare the diagnostic accuracy, arterial feeder and venous drainage between these two MRA images. Inter-observer agreement for different MRA images was assessed by Kappa coefficient and the differences of diagnostic accuracy between MRA images were evaluated by the Wilcoxon rank sum test. Results Almost all vascular lesions (92.68%) were correctly diagnosed using 4D-TRAK MRA. However, 3D TOF MRA only diagnosed 26 patients (63.41%) accurately. There were statistically significant differences regarding lesion diagnostic accuracy (P = 0.008) and venous drainage identification (P < 0.0001) between 4D-TRAK MRA and 3D TOF MRA. The results indicate that 4D-TRAK MRA is superior to 3D TOF MRA in the assessment of lesions. Conclusion Compared with 3D TOF MRA, 4D-TRAK MRA proved to be a more reliable screening modality and follow-up method for the diagnosis of cerebral AVM and dAVF. PMID:27679690
Dual mobility hip arthroplasty wear measurement: Experimental accuracy assessment using radiostereometric analysis (RSA).

PubMed

Pineau, V; Lebel, B; Gouzy, S; Dutheil, J-J; Vielpeau, C

2010-10-01

The use of dual mobility cups is an effective method to prevent dislocations. However, the specific design of these implants can raise the suspicion of increased wear and subsequent periprosthetic osteolysis. Using radiostereometric analysis (RSA), migration of the femoral head inside the cup of a dual mobility implant can be defined to apprehend polyethylene wear rate. The study aimed to establish the precision of RSA measurement of femoral head migration in the cup of a dual mobility implant, and its intra- and interobserver variability. A total hip prosthesis phantom was implanted and placed under weight loading conditions in a simulator. Model-based RSA measurement of implant penetration involved specially machined polyethylene liners with increasing concentric wear (no wear, then 0.25, 0.5 and 0.75mm). Three examiners, blinded to the level of wear, analyzed (10 times) the radiostereometric films of the four liners. There was one experienced, one trained, and one inexperienced examiner. Statistical analysis measured the accuracy, precision, and intra- and interobserver variability by calculating Root Mean Square Error (RMSE), Concordance Correlation Coefficient (CCC), Intra Class correlation Coefficient (ICC), and Bland-Altman plots. Our protocol, that used a simple geometric model rather than the manufacturer's CAD files, showed precision of 0.072mm and accuracy of 0.034mm, comparable with machining tolerances with low variability. Correlation between wear measurement and true value was excellent with a CCC of 0.9772. Intraobserver reproducibility was very good with an ICC of 0.9856, 0.9883 and 0.9842, respectively for examiners 1, 2 and 3. Interobserver reproducibility was excellent with a CCC of 0.9818 between examiners 2 and 1, and 0.9713 between examiners 3 and 1. Quantification of wear is indispensable for the surveillance of dual mobility implants. This in vitro study validates our measurement method. 
Our results, and comparison with other studies using different measurement technologies (RSA, standard radiographs, Martell method) make model-based RSA the reference method for measuring the wear of total hip prostheses in vivo. Level 3. Prospective diagnostic study. Copyright © 2010 Elsevier Masson SAS. All rights reserved.
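Two of the statistics named in this record, root mean square error against a known value and Bland-Altman limits of agreement, can be sketched in a few lines. The measurements below are hypothetical values chosen for illustration around the four machined wear levels (0, 0.25, 0.5 and 0.75 mm); they are not data from the study, and the abstract does not state exactly how its accuracy and precision figures were derived. The function names are assumptions.

```python
import math
import statistics


def rmse(measured, truth):
    """Root mean square error of measurements against known true values."""
    squared = [(m - t) ** 2 for m, t in zip(measured, truth)]
    return math.sqrt(sum(squared) / len(squared))


def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) using bias +/- 1.96 SD of the paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical example data (mm), NOT taken from the study.
true_wear = [0.00, 0.25, 0.50, 0.75]
measured_wear = [0.02, 0.23, 0.53, 0.74]

error = rmse(measured_wear, true_wear)
bias, lower, upper = bland_altman_limits(measured_wear, true_wear)
print(f"RMSE = {error:.3f} mm, bias = {bias:.3f} mm, limits = [{lower:.3f}, {upper:.3f}] mm")
```

The same `bland_altman_limits` helper could be applied to two examiners' readings of the same films to describe interobserver differences instead of differences from the true value.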
Interobserver agreement between primary graders and an expert grader in the Bristol and Weston diabetic retinopathy screening programme: a quality assurance audit.

PubMed

Patra, S; Gomm, E M W; Macipe, M; Bailey, C

2009-08-01

To assess the quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and to set standards for future interobserver agreement reports. A prospective audit of 213 image sets from six fully trained primary graders in the Bristol and Weston diabetic retinopathy screening programme was carried out over a 4-week period. All the images graded by the primary graders were regraded by an expert grader blinded to the primary grading results and the identity of the primary grader. The interobserver agreement between primary graders and the blinded expert grader and the corresponding Kappa coefficient was determined for overall grading, referable, non-referable and ungradable disease. The audit standard was set at 80% for interobserver agreement with a Kappa coefficient of 0.7. The interobserver agreement bettered the audit standard of 80% in all the categories. The Kappa coefficient was substantial (0.7) for the overall grading results and ranged from moderate to substantial (0.59-0.65) for referable, non-referable and ungradable disease categories. The main recommendation of the audit was to provide refresher training for the primary graders with focus on ungradable disease. The audit demonstrated an acceptable level of quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and provided a standard against which future interobserver agreement can be measured for quality assurance within a screening programme. Diabet. Med. 26, 820-823 (2009).
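The audit reports both raw percentage agreement and a kappa coefficient. A minimal sketch of how the two relate is given below. The 2x2 table is hypothetical (the abstract does not give cell counts); only the total of 213 image sets is taken from the text.

```python
def cohen_kappa(table):
    """Return (observed agreement, Cohen's kappa) for a square rating table.

    table[i][j] = number of image sets graded i by the primary grader
    and j by the expert grader.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the two graders' marginal rates.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return observed, (observed - expected) / (1 - expected)


# Hypothetical counts: rows = primary grader, columns = expert grader,
# categories = (referable, non-referable); total is 213 image sets.
table = [[40, 15],
         [20, 138]]
agreement, kappa = cohen_kappa(table)
print(f"agreement={agreement:.3f} kappa={kappa:.3f}")
```

With these made-up counts the raw agreement is about 84% while kappa is about 0.58, which illustrates how a category can exceed an 80% agreement standard and still fall short of a kappa of 0.7.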
Detection of myocardial ischemia by automated, motion-corrected, color-encoded perfusion maps compared with visual analysis of adenosine stress cardiovascular magnetic resonance imaging at 3 T: a pilot study.

PubMed

Doesch, Christina; Papavassiliu, Theano; Michaely, Henrik J; Attenberger, Ulrike I; Glielmi, Christopher; Süselbeck, Tim; Fink, Christian; Borggrefe, Martin; Schoenberg, Stefan O

2013-09-01

The purpose of this study was to compare automated, motion-corrected, color-encoded (AMC) perfusion maps with qualitative visual analysis of adenosine stress cardiovascular magnetic resonance imaging for detection of flow-limiting stenoses. Myocardial perfusion measurements applying the standard adenosine stress imaging protocol and a saturation-recovery temporal generalized autocalibrating partially parallel acquisition (t-GRAPPA) turbo fast low angle shot (Turbo FLASH) magnetic resonance imaging sequence were performed in 25 patients using a 3.0-T MAGNETOM Skyra (Siemens Healthcare Sector, Erlangen, Germany). Perfusion studies were analyzed using AMC perfusion maps and qualitative visual analysis. Angiographically detected coronary artery (CA) stenoses greater than 75% or 50% or more with a myocardial perfusion reserve index less than 1.5 were considered as hemodynamically relevant. Diagnostic performance and time requirement for both methods were compared. Interobserver and intraobserver reliability were also assessed. A total of 29 CA stenoses were included in the analysis. Sensitivity, specificity, positive predictive value, negative predictive value, and accuracy for detection of ischemia on a per-patient basis were comparable using the AMC perfusion maps compared to visual analysis. On a per-CA territory basis, the attribution of an ischemia to the respective vessel was facilitated using the AMC perfusion maps. Interobserver and intraobserver reliability were better for the AMC perfusion maps (concordance correlation coefficient, 0.94 and 0.93, respectively) compared to visual analysis (concordance correlation coefficient, 0.73 and 0.79, respectively). In addition, in comparison to visual analysis, the AMC perfusion maps were able to significantly reduce analysis time from 7.7 (3.1) to 3.2 (1.9) minutes (P < 0.0001). The AMC perfusion maps yielded a diagnostic performance on a per-patient and on a per-CA territory basis comparable with the visual analysis. 
Furthermore, this approach demonstrated higher interobserver and intraobserver reliability as well as a better time efficiency when compared to visual analysis.
Interobserver variation in the diagnosis of fibroepithelial lesions of the breast: a multicentre audit by digital pathology.

PubMed

Dessauvagie, Benjamin F; Lee, Andrew H S; Meehan, Katie; Nijhawan, Anju; Tan, Puay Hoon; Thomas, Jeremy; Tie, Bibiana; Treanor, Darren; Umar, Seemeen; Hanby, Andrew M; Millican-Slater, Rebecca

2018-02-13

Fibroepithelial lesions (FELs) of the breast span a morphological continuum including lesions where distinction between cellular fibroadenoma (FA) and benign phyllodes tumour (PT) is difficult. The distinction is clinically important with FAs managed conservatively while equivocal lesions and PTs are managed with surgery. We sought to audit core biopsy diagnoses of equivocal FELs by digital pathology and to investigate whether digital point counting is useful in clarifying FEL diagnoses. Scanned slide images from cores and subsequent excisions of 69 equivocal FELs were examined in a multicentre audit by eight pathologists to determine the agreement and accuracy of core needle biopsy (CNB) diagnoses and by digital point counting of stromal cellularity and expansion to determine if classification could be improved. Interobserver variation was high on CNB with a unanimous diagnosis from all pathologists in only eight cases of FA, diagnoses of both FA and PT on the same CNB in 15 and a 'weak' mean kappa agreement between pathologists (k=0.36). 'Moderate' agreement was observed on CNBs among breast specialists (k=0.44) and on excision samples (k=0.49). Up to 23% of lesions confidently diagnosed as FA on CNB were PT on excision and up to 30% of lesions confidently diagnosed as PT on CNB were FA on excision. Digital point counting did not aid in the classification of FELs. Accurate and reproducible diagnosis of equivocal FELs is difficult, particularly on CNB, resulting in poor interobserver agreement and suboptimal accuracy. Given the diagnostic difficulty, and surgical implications, equivocal FELs should be reported in consultation with experienced breast pathologists as a small number of benign FAs can be selected out from equivocal lesions. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

Computed tomography arthrography using a radial plane view for the detection of triangular fibrocartilage complex foveal tears.

PubMed

Moritomo, Hisao; Arimitsu, Sayuri; Kubo, Nobuyuki; Masatomi, Takashi; Yukioka, Masao

2015-02-01

To classify triangular fibrocartilage complex (TFCC) foveal lesions on the basis of computed tomography (CT) arthrography using a radial plane view and to correlate the CT arthrography results with surgical findings. We also tested the interobserver and intra-observer reliability of the radial plane view. A total of 33 patients with a suspected TFCC foveal tear who had undergone wrist CT arthrography and subsequent surgical exploration were enrolled. We classified the configurations of TFCC foveal lesions into 5 types on the basis of CT arthrography with the radial plane view in which the image slices rotate clockwise centered on the ulnar styloid process. Sensitivity, specificity, and positive predictive values were calculated for each type of foveal lesion in CT arthrography to detect foveal tears. We determined interobserver and intra-observer agreements using kappa statistics. We also compared accuracies with the radial plane views with those with the coronal plane views. Among the tear types on CT arthrography, type 3, a roundish defect at the fovea, and type 4, a large defect at the overall ulnar insertion, had high specificity and positive predictive value for the detection of foveal tears. Specificity and positive predictive values were 90% and 89% for type 3 and 100% and 100% for type 4, respectively, whereas sensitivity was 35% for type 3 and 22% for type 4. Interobserver and intra-observer agreement was substantial and almost perfect, respectively. The radial plane view identified foveal lesion of each palmar and dorsal radioulnar ligament separately, but accuracy results with the radial plane views were not statistically different from those with the coronal plane views. Computed tomography arthrography with a radial plane view exhibited enhanced specificity and positive predictive value when a type 3 or 4 lesion was identified in the detection of a TFCC foveal tear compared with historical controls. Diagnostic II. 
Copyright © 2015 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
Comparison of Inter-Observer Variability and Diagnostic Performance of the Fifth Edition of BI-RADS for Breast Ultrasound of Static versus Video Images.

PubMed

Youk, Ji Hyun; Jung, Inkyung; Yoon, Jung Hyun; Kim, Sung Hun; Kim, You Me; Lee, Eun Hye; Jeong, Sun Hye; Kim, Min Jung

2016-09-01

Our aim was to compare the inter-observer variability and diagnostic performance of the Breast Imaging Reporting and Data System (BI-RADS) lexicon for breast ultrasound of static and video images. Ninety-nine breast masses visible on ultrasound examination from 95 women 19-81 y of age at five institutions were enrolled in this study. They were scheduled to undergo biopsy or surgery or had been stable for at least 2 y of ultrasound follow-up after benign biopsy results or typically benign findings. For each mass, representative long- and short-axis static ultrasound images were acquired; real-time long- and short-axis B-mode video images through the mass area were separately saved as cine clips. Each image was reviewed independently by five radiologists who were asked to classify ultrasound features according to the fifth edition of the BI-RADS lexicon. Inter-observer variability was assessed using kappa (κ) statistics. Diagnostic performance on static and video images was compared using the area under the receiver operating characteristic curve. No significant difference was found in κ values between static and video images for all descriptors, although κ values of video images were higher than those of static images for shape, orientation, margin and calcifications. After receiver operating characteristic curve analysis, the video images (0.83, range: 0.77-0.87) had higher areas under the curve than the static images (0.80, range: 0.75-0.83; p = 0.08). Inter-observer variability and diagnostic performance of video images was similar to that of static images on breast ultrasonography according to the new edition of BI-RADS. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Time-resolved imaging of contrast kinetics does not improve performance of follow-up MRA of embolized intracranial aneurysms.

PubMed

Serafin, Zbigniew; Strześniewski, Piotr; Lasek, Władysław; Beuth, Wojciech

2012-07-01

The use of contrast media and the time-resolved imaging of contrast kinetics (TRICKS) technique have some theoretical advantages over time-of-flight magnetic resonance angiography (TOF-MRA) in the follow-up of intracranial aneurysms after endovascular treatment. We prospectively compared the diagnostic performance of TRICKS and TOF-MRA with digital subtraction angiography (DSA) in the assessment of occlusion of embolized aneurysms. Seventy-two consecutive patients with 72 aneurysms were examined 3 months after embolization. Test characteristics of TOF-MRA and TRICKS were calculated for the detection of residual flow. The results of quantification of flow were compared with weighted kappa. Intraobserver and interobserver reproducibility was determined. The sensitivity of TOF-MRA was 85% (95% CI, 65-96%) and of TRICKS, 89% (95% CI, 70-97%). The specificity of both methods was 91% (95% CI, 79-98%). The accuracy of the flow quantification ranged from 0.76 (TOF-MRA) to 0.83 (TRICKS). There was no significant difference between the methods in the area under the ROC curve regarding both the detection and the quantification of flow. Intraobserver reproducibility was very good with both techniques (kappa, 0.86-0.89). The interobserver reproducibility was moderate for TOF-MRA and very good for TRICKS (kappa, 0.74-0.80). In this study, TOF-MRA and TRICKS presented similar diagnostic performance; therefore, the use of time-resolved contrast-enhanced MRA is not justified in the follow-up of embolized aneurysms.
Auscultation versus Point-of-care Ultrasound to Determine Endotracheal versus Bronchial Intubation: A Diagnostic Accuracy Study.

PubMed

Ramsingh, Davinder; Frank, Ethan; Haughton, Robert; Schilling, John; Gimenez, Kimberly M; Banh, Esther; Rinehart, Joseph; Cannesson, Maxime

2016-05-01

Unrecognized malposition of the endotracheal tube (ETT) can lead to severe complications in patients under general anesthesia. The focus of this double-blinded randomized study was to assess the accuracy of point-of-care ultrasound in verifying the correct position of the ETT and to compare it with the accuracy of auscultation. Forty-two adult patients requiring general anesthesia with ETT were consented. Patients were randomized to right main bronchus, left main bronchus, or tracheal intubation. After randomization, the ETT was placed via fiber-optic visualization. Next, the location of the ETT was assessed using auscultation by a separate blinded anesthesiologist, followed by an ultrasound performed by a third blinded anesthesiologist. Ultrasound examination included assessment of tracheal dilation via cuff inflation with air and evaluation of pleural lung sliding. Statistical analysis included sensitivity, specificity, positive predictive value, negative predictive value, and interobserver agreement for the ultrasound examination (95% CI). In differentiating tracheal versus bronchial intubations, auscultation showed a sensitivity of 66% (0.39 to 0.87) and a specificity of 59% (0.39 to 0.77), whereas ultrasound showed a sensitivity of 93% (0.66 to 0.99) and specificity of 96% (0.79 to 1). Identification of tracheal versus bronchial intubation was 62% (26 of 42) in the auscultation group and 95% (40 of 42) in the ultrasound group (P = 0.0005) (CI for difference, 0.15 to 0.52), and the McNemar comparison showed statistically significant improvement with ultrasound (P < 0.0001). Interobserver agreement of ultrasound findings was 100%. Assessment of trachea and pleura via point-of-care ultrasound is superior to auscultation in determining the location of ETT.
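Because every patient was assessed by both auscultation and ultrasound, the two methods are compared on paired data, which is what the McNemar test addresses. The sketch below shows an exact two-sided version of the test. The discordant-pair counts are hypothetical, since the abstract reports only the totals (26 of 42 and 40 of 42 correct), and the authors may have used a different variant of the test.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b = pairs where only method A was correct,
    c = pairs where only method B was correct.
    Concordant pairs carry no information about the difference and are ignored.
    """
    n = b + c
    if n == 0:
        return 1.0
    # Under the null hypothesis each discordant pair is a fair coin flip.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example: 9 patients where only ultrasound was correct,
# 1 patient where only auscultation was correct.
print(mcnemar_exact(9, 1))
```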
Interpretation of Post-operative Distal Humerus Radiographs After Internal Fixation: Prediction of Later Loss of Fixation.

PubMed

Claessen, Femke M A P; Stoop, Nicky; Doornberg, Job N; Guitton, Thierry G; van den Bekerom, Michel P J; Ring, David

2016-10-01

Stable fixation of distal humerus fracture fragments is necessary for adequate healing and maintenance of reduction. The purpose of this study was to measure the reliability and accuracy of interpretation of postoperative radiographs to predict which implants will loosen or break after operative treatment of bicolumnar distal humerus fractures. We also addressed agreement among surgeons regarding which fracture fixation will loosen or break and the influence of years in independent practice, location of practice, and so forth. A total of 232 orthopedic residents and surgeons from around the world evaluated 24 anteroposterior and lateral radiographs of distal humerus fractures on a Web-based platform to predict which implants would loosen or break. Agreement among observers was measured using the multi-rater kappa measure. The sensitivity of prediction of failure of fixation of distal humerus fracture on radiographs was 63%, specificity was 53%, positive predictive value was 36%, the negative predictive value was 78%, and accuracy was 56%. There was fair interobserver agreement (κ = 0.27) regarding predictions of failure of fixation of distal humerus fracture on radiographs. Interobserver variability did not change when assessed for the various subgroups. When experienced and skilled surgeons perform fixation of type C distal humerus fracture, the immediate postoperative radiograph is not predictive of fixation failure. Reoperation based on the probability of failure might not be advisable. Diagnostic III. Copyright © 2016 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
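The predictive values and accuracy reported here follow from sensitivity, specificity, and the proportion of fixations that actually failed. The sketch below shows that relationship. The prevalence of 0.30 is an assumption, not a figure from the abstract; it is chosen because it roughly reproduces the reported values.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV, NPV and accuracy from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": tp + tn,
    }


# Sensitivity and specificity from the abstract; prevalence is an assumed value.
result = predictive_values(sensitivity=0.63, specificity=0.53, prevalence=0.30)
print({name: round(value, 2) for name, value in result.items()})
```

Under that assumed failure rate, a positive prediction is wrong about two times in three, which is consistent with the authors' caution against reoperating on the basis of the radiograph alone.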
Evaluation of the Subscapularis Tendon Tears on 3T Magnetic Resonance Arthrography: Comparison of Diagnostic Performance of T1-Weighted Spectral Presaturation with Inversion-Recovery and T2-Weighted Turbo Spin-Echo Sequences.

PubMed

Lee, Hoseok; Ahn, Joong Mo; Kang, Yusuhn; Oh, Joo Han; Lee, Eugene; Lee, Joon Woo; Kang, Heung Sik

2018-01-01

To compare the T1-weighted spectral presaturation with inversion-recovery sequences (T1 SPIR) with T2-weighted turbo spin-echo sequences (T2 TSE) on 3T magnetic resonance arthrography (MRA) in the evaluation of the subscapularis (SSC) tendon tear with arthroscopic findings as the reference standard. This retrospective study included 120 consecutive patients who had undergone MRA within 3 months between April and December 2015. Two musculoskeletal radiologists blinded to the arthroscopic results evaluated T1 SPIR and T2 TSE images in separate sessions for the integrity of the SSC tendon, examining normal/articular-surface partial-thickness tear (PTTa)/full-thickness tear (FTT). Diagnostic performance of T1 SPIR and T2 TSE was calculated with arthroscopic results as the reference standard, and sensitivity, specificity, and accuracy were compared using the McNemar test. Interobserver agreement was measured with kappa (κ) statistics. There were 74 SSC tendon tears (36 PTTa and 38 FTT) confirmed by arthroscopy. Significant differences were found in the sensitivity and accuracy between T1 SPIR and T2 TSE using the McNemar test, with respective rates of 95.9-94.6% vs. 71.6-75.7% and 90.8-91.7% vs. 79.2-83.3% for detecting tear; 55.3% vs. 31.6-34.2% and 85.8% vs. 78.3-79.2%, respectively, for FTT; and 91.7-97.2% vs. 58.3-61.1% and 89% vs. 78-79.3%, respectively, for PTTa. Interobserver agreement was almost perfect for T1 SPIR (κ = 0.839) and substantial for T2 TSE (κ = 0.769). T1-weighted spectral presaturation with inversion-recovery sequences are more sensitive and accurate than T2 TSE in detecting SSC tendon tear on 3T MRA.
High resolution microendoscopy for classification of colorectal polyps.

PubMed

Chang, S S; Shukla, R; Polydorides, A D; Vila, P M; Lee, M; Han, H; Kedia, P; Lewis, J; Gonzalez, S; Kim, M K; Harpaz, N; Godbold, J; Richards-Kortum, R; Anandasabapathy, S

2013-07-01

It can be difficult to distinguish adenomas from benign polyps during routine colonoscopy. High resolution microendoscopy (HRME) is a novel method for imaging colorectal mucosa with subcellular detail. HRME criteria for the classification of colorectal neoplasia have not been previously described. Study goals were to develop criteria to characterize HRME images of colorectal mucosa (normal, hyperplastic polyps, adenomas, cancer) and to determine the accuracy and interobserver variability for the discrimination of neoplastic from non-neoplastic polyps when these criteria were applied by novice and expert microendoscopists. Two expert pathologists created consensus HRME image criteria using images from 68 patients with polyps who had undergone colonoscopy plus HRME. Using these criteria, HRME expert and novice microendoscopists were shown a set of training images and then tested to determine accuracy and interobserver variability. Expert microendoscopists identified neoplasia with sensitivity, specificity, and accuracy of 67 % (95 % confidence interval [CI] 58 % - 75 %), 97 % (94 % - 100 %), and 87 %, respectively. Nonexperts achieved sensitivity, specificity, and accuracy of 73 % (66 % - 80 %), 91 % (80 % - 100 %), and 85 %, respectively. Overall, neoplasia were identified with sensitivity 70 % (65 % - 76 %), specificity 94 % (87 % - 100 %), and accuracy 85 %. Kappa values were: experts 0.86; nonexperts 0.72; and overall 0.78. Using the new criteria, observers achieved high specificity and substantial interobserver agreement for distinguishing benign polyps from neoplasia. Increased expertise in HRME imaging improves accuracy. This low-cost microendoscopic platform may be an alternative to confocal microendoscopy in lower-resource or community-based settings.
Improvement of diagnostic agreement among pathologists in resolving an "atypical glands suspicious for cancer" diagnosis in prostate biopsies using a novel "Disease-Focused Diagnostic Review" quality improvement process.

PubMed

Shah, Rajal B; Leandro, Gioacchino; Romerocaces, Gloria; Bentley, James; Yoon, Jiyoon; Mendrinos, Savvas; Tadros, Yousef; Tian, Wei; Lash, Richard

2016-10-01

One of the major goals of an anatomic pathology laboratory quality program is to minimize unwarranted diagnostic variability and equivocal reporting. This study evaluated the utility of Miraca Life Sciences' "Disease-Focused Diagnostic Review" (DFDR) quality program in improving interobserver diagnostic reproducibility associated with classification of "atypical glands suspicious for adenocarcinoma" (ATYP) in prostate biopsies. Seventy-one selected prostate biopsies with a focus of ATYP were reviewed by 8 pathologists. Participants were blinded to the original diagnosis and were first asked to classify the ATYP as benign, atypical, or limited adenocarcinoma. DFDR comprised a "theoretical consensus" (in which pathologists first reached consensus on the morphological features they considered relevant for the diagnosis of limited prostatic adenocarcinoma), a didactic review including relevant literature, and a "practical consensus" (in which pathologists performed joint microscopic sessions, reconciling each other's observations and positions while evaluating a separate unique slide set). Participants were finally asked to reclassify the original 71 ATYP cases based on knowledge gleaned from DFDR. Pre- and post-DFDR interobserver reproducibility of overall diagnostic agreement was assessed. Interobserver reproducibility measured by Fleiss κ values of pre- and post-DFDR was 0.36 and 0.59, respectively (P=.006). Post-DFDR, there was significant improvement for the "100% concordance" category (P=.011) and a significant reduction for the "no consensus" category (P=.0004). Despite a lower pre-DFDR reproducibility for non-uropathology fellowship-trained (n=3, κ=0.38) versus uropathology fellowship-trained (n=5, κ=0.43) pathologists, both groups achieved similarly high post-DFDR κ levels (κ=0.58 and 0.56, respectively). DFDR represents an effective tool to formally achieve diagnostic consensus and reduce variability associated with critical diagnoses in an anatomic pathology practice. Copyright © 2016 Elsevier Inc. All rights reserved.
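Fleiss' κ, the statistic used in this study, extends chance-corrected agreement to more than two raters. The sketch below shows the standard calculation. The three-case table is hypothetical and merely mirrors the study layout of 8 pathologists and 3 categories (benign, atypical, limited adenocarcinoma).

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters placing case i in category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must be rated by the same number of raters")
    n_cat = len(counts[0])
    # Share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in counts) / (n_cases * n_raters) for j in range(n_cat)]
    # Per-case agreement: fraction of rater pairs that chose the same category.
    p_case = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_obs = sum(p_case) / n_cases
    p_exp = sum(p * p for p in p_cat)
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical ratings: 8 raters, categories (benign, atypical, adenocarcinoma).
ratings = [
    [6, 2, 0],
    [1, 5, 2],
    [0, 2, 6],
]
print(round(fleiss_kappa(ratings), 3))
```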
Measurement error of mean sac diameter and crown-rump length among pregnant women at Mulago hospital, Uganda.

PubMed

Ali, Sam; Byanyima, Rosemary Kusaba; Ononge, Sam; Ictho, Jerry; Nyamwiza, Jean; Loro, Emmanuel Lako Ernesto; Mukisa, John; Musewa, Angella; Nalutaaya, Annet; Ssenyonga, Ronald; Kawooya, Ismael; Temper, Benjamin; Katamba, Achilles; Kalyango, Joan; Karamagi, Charles

2018-05-04

Ultrasonography is essential in the prenatal diagnosis and care of pregnant mothers. However, the measurements obtained often contain a small percentage of unavoidable error that may have serious clinical implications if substantial. We therefore evaluated the level of intra- and inter-observer error in measuring mean sac diameter (MSD) and crown-rump length (CRL) in women between 6 and 10 weeks' gestation at Mulago hospital. This was a cross-sectional study conducted from January to March 2016. We enrolled 56 women with an intrauterine single viable embryo. The women were scanned using a transvaginal (TVS) technique by two observers who were blinded to each other's measurements. Each observer measured the CRL twice and the MSD once for each woman. Intra-class correlation coefficients (ICCs), 95% limits of agreement (LOA) and technical error of measurement (TEM) were used for analysis. Intra-observer ICCs for CRL measurements were 0.995 and 0.993 while inter-observer ICCs were 0.988 for CRL and 0.955 for MSD measurements. Intra-observer 95% LOA for CRL were ± 2.04 mm and ± 1.66 mm. Inter-observer LOA were ± 2.35 mm for CRL and ± 4.87 mm for MSD. The intra-observer relative TEM for CRL were 4.62% and 3.70% whereas inter-observer relative TEM were 5.88% and 5.93% for CRL and MSD respectively. Intra- and inter-observer error of CRL and MSD measurements among pregnant women at Mulago hospital were acceptable. This implies that at Mulago hospital, the error in pregnancy dating is within acceptable margins of ±3 days in first trimester, and the CRL and MSD cut-offs of ≥ 7 mm and ≥ 25 mm respectively are fit for diagnosis of miscarriage on TVS. These findings should be extrapolated to the whole country with caution. Sonographers can achieve acceptable and comparable diagnostic accuracy levels of MSD and CRL measurements with proper training and adherence to practice guidelines.
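Two of the error measures used in this study, the Bland-Altman 95% limits of agreement and the technical error of measurement, are simple functions of paired differences. The sketch below uses their common textbook forms; the CRL values are hypothetical, and the study's exact computation may differ.

```python
from statistics import fmean, stdev


def limits_of_agreement(a, b):
    """Bland-Altman 95% limits of agreement for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = fmean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias - half_width, bias + half_width


def technical_error(a, b):
    """Return (absolute TEM, relative TEM in %) for two measurements per subject."""
    n = len(a)
    tem = (sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * n)) ** 0.5
    grand_mean = fmean(list(a) + list(b))
    return tem, 100 * tem / grand_mean


# Hypothetical CRL readings (mm) from two observers on the same five women.
observer_1 = [8.1, 12.4, 15.0, 20.3, 25.6]
observer_2 = [8.6, 11.9, 15.8, 19.7, 26.4]
print(limits_of_agreement(observer_1, observer_2))
print(technical_error(observer_1, observer_2))
```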
Inaccuracy of Wolff-Parkinson-White accessory pathway localization algorithms in children and patients with congenital heart defects.

PubMed

Bar-Cohen, Yaniv; Khairy, Paul; Morwood, James; Alexander, Mark E; Cecchin, Frank; Berul, Charles I

2006-07-01

ECG algorithms used to localize accessory pathways (AP) in patients with Wolff-Parkinson-White (WPW) syndrome have been validated in adults, but less is known of their use in children, especially in patients with congenital heart disease (CHD). We hypothesize that these algorithms have low diagnostic accuracy in children and even lower in those with CHD. Pre-excited ECGs in 43 patients with WPW and CHD (median age 5.4 years [0.9-32 years]) were evaluated and compared to 43 consecutive WPW control patients without CHD (median age 14.5 years [1.8-18 years]). Two blinded observers predicted AP location using 2 adult and 1 pediatric WPW algorithms, and a third blinded observer served as a tiebreaker. Predicted locations were compared with ablation-verified AP location to identify (a) exact match for AP location and (b) match for laterality (left-sided vs right-sided AP). In control children, adult algorithms were accurate in only 56% and 60%, while the pediatric algorithm was correct in 77%. In 19 patients with Ebstein's anomaly, diagnostic accuracy was similar to controls with at times an even better ability to predict laterality. In non-Ebstein's CHD, however, the algorithms were markedly worse (29% for the adult algorithms and 42% for the pediatric algorithms). A relatively large degree of interobserver variability was seen (kappa values from 0.30 to 0.58). Adult localization algorithms have poor diagnostic accuracy in young patients with and without CHD. Both adult and pediatric algorithms are particularly misleading in non-Ebstein's CHD patients and should be interpreted with caution.
Magnetic Resonance Enterography to Assess Multifocal and Multicentric Bowel Endometriosis.

PubMed

Nyangoh Timoh, Krystel; Stewart, Zelda; Benjoar, Mikhael; Beldjord, Selma; Ballester, Marcos; Bazot, Marc; Thomassin-Naggara, Isabelle; Darai, Emile

To prospectively determine the accuracy of magnetic resonance enterography (MRE) compared with conventional magnetic resonance imaging (MRI) for multifocal (i.e., multiple lesions affecting the same digestive segment) and multicentric (i.e., multiple lesions affecting several digestive segments) bowel endometriosis. A prospective study (Canadian Task Force classification II-2). Tenon University Hospital, Paris, France. Patients with MRI-suspected colorectal endometriosis scheduled for colorectal resection from April 2014 to February 2016 were included. Patients underwent both 1.5-Tesla MRI and MRE as well as laparoscopically assisted and open colorectal resections. The diagnostic performance of MRI and MRE was evaluated for sensitivity, specificity, positive and negative predictive values, accuracy, and positive and negative likelihood ratios (LRs). The interobserver variability of the experienced and junior radiologists was quantified using weighted statistics. Forty-seven patients were included. Twenty-two (46.8%) patients had unifocal lesions, 14 (30%) had multifocal lesions, and 11 (23.4%) had multicentric lesions. The sensitivity, specificity, positive LR, and negative LR for the diagnosis of multifocal lesions were 0.29 (6/21), 1.00 (23/24), 15.36, and 0.71 for MRI and 0.57 (12/21), 0.89 (23/25), 4.95, and 0.58 for MRE. The sensitivity, specificity, positive LR, and negative LR for the diagnosis of multicentric lesions were 0.18 (1/11), 1.00 (1/1), 15, and 0.80 for MRI and 0.46 (5/11), 0.92 (33/36), 5.45, and 0.60 for MRE. Lower accuracies for MRI compared with MRE to diagnose multicentric (p = .01) and multifocal lesions (p = .004) were noted. The interobserver agreement for MRE was good for both multifocality (κ = 0.80) and multicentricity (κ = 0.61). MRE has better accuracy for diagnosing multifocal and multicentric bowel endometriosis than conventional MRI. Copyright © 2018. Published by Elsevier Inc.
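The likelihood ratios quoted above follow directly from sensitivity and specificity. The sketch below uses the counts the abstract gives for MRE in multicentric lesions (5/11 for sensitivity, 33/36 for specificity); the function name and argument layout are illustrative.

```python
def likelihood_ratios(tp, fn, tn, fp):
    """Return (sensitivity, specificity, LR+, LR-) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (tn + fp)
    # LR+ is undefined (infinite) when there are no false positives.
    lr_pos = sensitivity / false_positive_rate if fp else float("inf")
    lr_neg = (1 - sensitivity) / specificity
    return sensitivity, specificity, lr_pos, lr_neg


# MRE, multicentric lesions: 5 of 11 detected, 33 of 36 correctly ruled out.
sens, spec, lr_pos, lr_neg = likelihood_ratios(tp=5, fn=6, tn=33, fp=3)
print(f"sens={sens:.2f} spec={spec:.2f} LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
# prints "sens=0.45 spec=0.92 LR+=5.45 LR-=0.60"
```

The rounded sensitivity of 0.45 differs from the 0.46 quoted in the abstract only in rounding, and the likelihood ratios match the reported 5.45 and 0.60.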
3D SPECT/CT fusion using image data projection of bone SPECT onto 3D volume-rendered CT images: feasibility and clinical impact in the diagnosis of bone metastasis.

PubMed

Ogata, Yuji; Nakahara, Tadaki; Ode, Kenichi; Matsusaka, Yohji; Katagiri, Mari; Iwabuchi, Yu; Itoh, Kazunari; Ichimura, Akira; Jinzaki, Masahiro

2017-05-01

We developed a method of image data projection of bone SPECT into 3D volume-rendered CT images for 3D SPECT/CT fusion. The aims of our study were to evaluate its feasibility and clinical usefulness. Whole-body bone scintigraphy (WB) and SPECT/CT scans were performed in 318 cancer patients using a dedicated SPECT/CT system. Volume data of bone SPECT and CT were fused to obtain 2D SPECT/CT images. To generate our 3D SPECT/CT images, colored voxel data of bone SPECT were projected onto the corresponding location of the volume-rendered CT data after a semi-automatic bone extraction. Then, the resultant 3D images were blended with conventional volume-rendered CT images, making it possible to grasp the three-dimensional relationship between bone metabolism and anatomy. WB and SPECT (WB + SPECT), 2D SPECT/CT fusion, and 3D SPECT/CT fusion were evaluated by two independent reviewers in the diagnosis of bone metastasis. The inter-observer variability and diagnostic accuracy in these three image sets were investigated using a four-point diagnostic scale. Increased bone metabolism was found in 744 metastatic sites and 1002 benign changes. On a per-lesion basis, inter-observer agreements in the diagnosis of bone metastasis were 0.72 for WB + SPECT, 0.90 for 2D SPECT/CT, and 0.89 for 3D SPECT/CT. Receiver operating characteristic analyses for the diagnostic accuracy of bone metastasis showed that WB + SPECT, 2D SPECT/CT, and 3D SPECT/CT had an area under the curve of 0.800, 0.983, and 0.983 for reader 1, 0.865, 0.992, and 0.993 for reader 2, respectively (WB + SPECT vs. 2D or 3D SPECT/CT, p < 0.001; 2D vs. 3D SPECT/CT, n.s.). The durations of interpretation of WB + SPECT, 2D SPECT/CT, and 3D SPECT/CT images were 241 ± 75, 225 ± 73, and 182 ± 71 s for reader 1 and 207 ± 72, 190 ± 73, and 179 ± 73 s for reader 2, respectively. As a result, it took less time to read 3D SPECT/CT images than 2D SPECT/CT (p < 0.0001) or WB + SPECT images (p < 0.0001). 3D SPECT/CT fusion offers comparable diagnostic accuracy to 2D SPECT/CT fusion. The visual effect of 3D SPECT/CT fusion facilitates reduction of reading time compared to 2D SPECT/CT fusion.
Validity and Reliability of Dermoscopic Criteria Used to Differentiate Nevi From Melanoma: A Web-Based International Dermoscopy Society Study.

PubMed

Carrera, Cristina; Marchetti, Michael A; Dusza, Stephen W; Argenziano, Giuseppe; Braun, Ralph P; Halpern, Allan C; Jaimes, Natalia; Kittler, Harald J; Malvehy, Josep; Menzies, Scott W; Pellacani, Giovanni; Puig, Susana; Rabinovitz, Harold S; Scope, Alon; Soyer, H Peter; Stolz, Wilhelm; Hofmann-Wellenhof, Rainer; Zalaudek, Iris; Marghoob, Ashfaq A

2016-07-01

The comparative diagnostic performance of dermoscopic algorithms and their individual criteria is not well studied. To analyze the discriminatory power and reliability of dermoscopic criteria used in melanoma detection and compare the diagnostic accuracy of existing algorithms. This was a retrospective, observational study of 477 lesions (119 melanomas [24.9%] and 358 nevi [75.1%]), which were divided into 12 image sets that consisted of 39 or 40 images per set. A link on the International Dermoscopy Society website from January 1, 2011, through December 31, 2011, directed participants to the study website. Data analysis was performed from June 1, 2013, through May 31, 2015. Participants included physicians, residents, and medical students, and there were no specialty-type or experience-level restrictions. Participants were randomly assigned to evaluate 1 of the 12 image sets. Associations with melanoma and intraclass correlation coefficients (ICCs) were evaluated for the presence of dermoscopic criteria. Diagnostic accuracy measures were estimated for the following algorithms: the ABCD rule, the Menzies method, the 7-point checklist, the 3-point checklist, chaos and clues, and CASH (color, architecture, symmetry, and homogeneity). A total of 240 participants registered, and 103 (42.9%) evaluated all images. The 110 participants (45.8%) who evaluated fewer than 20 lesions were excluded, resulting in data from 130 participants (54.2%), 121 (93.1%) of whom were regular dermoscopy users. Criteria associated with melanoma included marked architectural disorder (odds ratio [OR], 6.6; 95% CI, 5.6-7.8), pattern asymmetry (OR, 4.9; 95% CI, 4.1-5.8), nonorganized pattern (OR, 3.3; 95% CI, 2.9-3.7), border score of 6 (OR, 3.3; 95% CI, 2.5-4.3), and contour asymmetry (OR, 3.2; 95% CI, 2.7-3.7) (P < .001 for all). Most dermoscopic criteria had poor to fair interobserver agreement. 
Criteria that reached moderate levels of agreement included comma vessels (ICC, 0.44; 95% CI, 0.40-0.49), absence of vessels (ICC, 0.46; 95% CI, 0.42-0.51), dark brown color (ICC, 0.40; 95% CI, 0.35-0.44), and architectural disorder (ICC, 0.43; 95% CI, 0.39-0.48). The Menzies method had the highest sensitivity for melanoma diagnosis (95.1%) but the lowest specificity (24.8%) compared with any other method (P < .001). The ABCD rule had the highest specificity (59.4%). All methods had similar areas under the receiver operating characteristic curves. Important dermoscopic criteria for melanoma recognition were revalidated by participants with varied experience. Six algorithms tested had similar but modest levels of diagnostic accuracy, and the interobserver agreement of most individual criteria was poor.
The intra- and inter-observer reliability of the physical examination methods used to assess patients with patellofemoral joint instability.

PubMed

Smith, Toby O; Clark, Allan; Neda, Sophia; Arendt, Elizabeth A; Post, William R; Grelsamer, Ronald P; Dejour, David; Almqvist, Karl Fredrik; Donell, Simon T

2012-08-01

An accurate physical examination of patients with patellar instability is an important aspect of the diagnosis and treatment. While previous studies have assessed the diagnostic accuracy of such physical examination tests, little has been undertaken to assess the inter- and intra-tester reliability of such techniques. The purpose of this study was to determine the inter- and intra-tester reliability of the physical examination tests used for patients with patellar instability. Five patients (10 knees) with bilateral recurrent patellar instability were assessed by five members of the International Patellofemoral Study Group. Each surgeon assessed each patient twice using 18 reported physical examination tests. The inter- and intra-observer reliability was assessed using weighted Kappa statistics with 95% confidence intervals. The findings of the study suggested that there was very poor inter-observer reliability for the majority of the physical tests, with only the assessments of patellofemoral crepitus, foot arch position and the J-sign presenting with fair to moderate agreement. The intra-observer reliability indicated largely moderate to substantial agreement between the first and second tests performed by each assessor, with the greatest agreement seen for the assessment of tibial torsion, popliteal angle and Bassett's sign. For the common physical examination tests used in the management of patients with patellar instability, inter-observer reliability is poor, while intra-observer reliability is moderate. Standardization of physical exam assessments and further study of these results among different clinicians and more divergent patient groups are indicated. Copyright © 2011 Elsevier B.V. All rights reserved.
Sensitivity and specificity of CT- and MRI-scanning in evaluation of occult fracture of the proximal femur.

PubMed

Haubro, M; Stougaard, C; Torfing, T; Overgaard, S

2015-08-01

To estimate sensitivity and specificity of CT and MRI examinations in patients with fractures of the proximal femur. To determine the interobserver agreement of the modalities among a senior consulting radiologist, a resident in radiology and a resident in orthopaedic surgery. 67 patients (27 males, 40 females, mean age 80.5) seen in the emergency room with hip pain after a fall, inability to stand and a primary X-ray without fracture were evaluated with both CT and MRI. The images were analysed by a senior consulting musculoskeletal radiologist, a resident in radiology and a resident in orthopaedic surgery. Sensitivity and specificity were estimated with MRI as the gold standard. Kappa values were used to assess the level of agreement in both MRI and CT findings. 15 fractures of the proximal femur were found (7 intertrochanteric, 3 femoral neck and 5 fractures of the greater trochanter). Two fractures were not identified by CT and four changed fracture location. Among those, three patients underwent surgery. Sensitivity of CT was 0.87; 95% CI [0.60; 0.98]. Kappa values for interobserver agreement for CT were 0.46; 95% CI [0.23; 0.76] and 0.67; 95% CI [0.42; 0.90]. For MRI they were 0.67; 95% CI [0.43; 0.91] and 0.69; 95% CI [0.45; 0.92]. MRI was observed to have a higher diagnostic accuracy than CT in detecting occult fractures of the hip. Interobserver analysis showed high kappa values corresponding to substantial agreement in both CT and MRI. Copyright © 2015 Elsevier Ltd. All rights reserved.
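The reported CT sensitivity can be reproduced from the counts in the abstract: 15 fractures on MRI, 2 of them missed by CT, so 13 of 15 detected. The sketch below is illustrative only. The abstract does not say how the confidence interval was obtained, so the exact (Clopper-Pearson) binomial interval used here is an assumption, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(func, target, increasing):
    """Bisection on [0, 1] for func(p) == target, with func monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for x successes out of n."""
    lower = 0.0 if x == 0 else _solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


# Counts taken from the abstract: 15 fractures on MRI, 2 not identified by CT.
detected, total = 13, 15
sensitivity = detected / total
ci_low, ci_high = clopper_pearson(detected, total)
print(f"sensitivity = {sensitivity:.2f}, 95% CI [{ci_low:.2f}; {ci_high:.2f}]")
```

Under this assumption the interval rounds to the same bounds as the abstract, which suggests, but does not confirm, that an exact method was used.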
Accuracy of clinical neurological examination in diagnosing lumbo-sacral radiculopathy: a systematic literature review.

PubMed

Tawa, Nassib; Rhoda, Anthea; Diener, Ina

2017-02-23

Lumbar radiculopathy remains a clinical challenge among primary care clinicians in both assessment and diagnosis. This often leads to misdiagnosis and inappropriate treatment of patients resulting in poor health outcomes, exacerbating this already debilitating condition. This review evaluated 12 primary diagnostic accuracy studies that specifically assessed the performance of various individual and grouped clinical neurological tests in detecting nerve root impingement, as established in the current literature. Eight electronic databases were searched for relevant articles from inception until July 2016. All primary diagnostic studies which investigated the accuracy of clinical neurological test(s) in diagnosing lumbar radiculopathy among patients with low back and referred leg symptoms were screened for inclusion. Qualifying studies were retrieved and independently assessed for methodological quality using the 'Quality Assessment of Diagnostic tests Accuracy Studies' criteria. A total of 12 studies which investigated standard components of clinical neurological examination (sensory, motor, tendon reflex and neuro-dynamics) of the lumbo-sacral spine were included. The mean inter-observer agreement on quality assessment by two independent reviewers was fair (k = 0.3 - 0.7). The diagnostic performance of sensory testing using MR imaging as a reference standard demonstrated a sensitivity (95% confidence interval) of 0.61 (0.47-0.73) and a specificity of 0.63 (0.38-0.84). Motor test sensitivity was poor to moderate, ranging from 0.13 (0.04-0.31) to 0.61 (0.36-0.83). Generally, the diagnostic performance of reflex testing was notably good with specificity (95% confidence interval) ranging from 0.60 (0.51-0.69) to 0.93 (0.87-0.97) and sensitivity ranging from 0.14 (0.09-0.21) to 0.67 (0.21-0.94). 
Femoral nerve stretch test had a high sensitivity (95% confidence interval) of 1.00 (0.40-1.00) and specificity of 0.83 (0.52-0.98) while SLR test recorded a mean sensitivity of 0.84 (0.72-0.92) and specificity of 0.78 (0.67-0.87). There is a scarcity of studies on the diagnostic accuracy of clinical neurological examination testing. Furthermore, there seems to be a disconnect among researchers regarding the diagnostic utility of lower limb neuro-dynamic tests, which include the Straight Leg Raise and Femoral Nerve tests for the sciatic and femoral nerves, respectively. Whether these tests are able to detect the presence of disc herniation and subsequent nerve root compression or hyper-sensitivity of the sacral and femoral plexus due to mechanical irritation still remains debatable.
Diagnostic significance of rib series in minor thorax trauma compared to plain chest film and computed tomography.

PubMed

Hoffstetter, Patrick; Dornia, Christian; Schäfer, Stephan; Wagner, Merle; Dendl, Lena M; Stroszczynski, Christian; Schreyer, Andreas G

2014-01-01

Rib series (RS) are a special radiological technique to improve the visualization of the bony parts of the chest. The aim of this study was to evaluate the diagnostic accuracy of rib series in minor thorax trauma. This was a retrospective study of 56 patients who received RS; 39 patients were additionally evaluated by plain chest film (PCF). All patients underwent a computed tomography (CT) of the chest. RS and PCF were re-read independently by three radiologists, and the results were compared with CT as the gold standard. Sensitivity, specificity, negative and positive predictive value were calculated. Significance in the differences of findings was determined by McNemar test, interobserver variability by Cohen's kappa. 56 patients were evaluated (34 men, 22 women, mean age 61 y). In 22 patients one or more rib fractures could be identified by CT. In 18 of these cases (82%) the correct diagnosis was made by RS, in 16 cases (73%) the correct number of involved ribs was detected. These differences were significant (p = 0.03). Specificity was 100%, negative and positive predictive value were 85% and 100%. Kappa values for the interobserver agreement were 0.92-0.96. Sensitivity of PCF was 46% and was significantly lower (p = 0.008) compared to CT. Rib series does not seem to be a useful examination in evaluating minor thorax trauma. CT seems to be the method of choice to detect rib fractures, but the clinical value of the radiological proof has to be discussed and investigated in larger follow-up studies.
Diagnosis of Acute Cellular Rejection Using Probe-Based Confocal Laser Endomicroscopy in Lung Transplant Recipients: a Prospective, Multicenter Trial.

PubMed

Keller, Cesar A; Khoor, Andras; Arenberg, Douglas A; Smith, Michael A; Islam, Shaheen U

2018-05-29

Acute cellular rejection (ACR) in lung transplant recipients requires demonstration of perivascular lymphocytic infiltration in alveolar tissue samples from transbronchial biopsies (TBBs). Probe-based confocal laser endomicroscopy (pCLE) allows in vivo observation of alveolar, vascular, and cellular microstructures in the lung with potential to identify ACR. The objective of our prospective, blinded, multicenter observational study was to identify pCLE findings in patients with ACR diagnosed histopathologically by TBB. Lung transplant recipients undergoing diagnostic bronchoscopies within 1 year posttransplant for suspected ACR had pCLE video imaging obtained immediately prior to tissue sampling via TBB. Findings of 2 pCLE criteria, abundant alveolar cellularity and perivascular cellularity (PVC), were assessed by 4 investigators familiar with pCLE and compared to histopathologic criteria of ACR to derive sensitivity, specificity, area under the receiver operating characteristic curve, and accuracy. Interobserver agreement was assessed by calculating the intraclass correlation coefficient and Fleiss κ. Findings were analyzed before and after a consensus meeting of investigators on interpreting images. Thirty pCLE procedures were performed on 24 patients, 8 showing ACR in TBB. Diagnostic performance and interobserver agreement using pCLE to identify PVC were significantly higher than those of abundant alveolar cellularity (P<.01). The number of blood vessels identified with PVC on pCLE was significantly correlated with histopathologic activity grading of ACR (P<.01). PVC agreement among investigators significantly improved after the consensus meeting (P<.01). Conclusions: When found on pCLE, PVC is a feasible and reproducible criterion for assessment of ACR in vivo, but there is a learning curve for image interpretation.
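Fleiss κ, named above as one of the agreement measures, extends chance-corrected agreement to more than two raters. A minimal sketch follows, assuming every recording is rated by the same number of raters. The rating counts are invented for illustration and are not data from the study; the function name is hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects-by-categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: share of rater pairs that agree on that subject.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(p_subj) / n_subjects
    expected = sum(p * p for p in p_cat)
    return (observed - expected) / (1 - expected)


# Hypothetical example: 4 raters judge 6 recordings as [PVC present, PVC absent].
ratings = [[4, 0], [3, 1], [0, 4], [4, 0], [1, 3], [0, 4]]
print(round(fleiss_kappa(ratings), 3))
```

With these made-up counts the observed agreement is 5/6 and the chance agreement is 1/2, so κ works out to 2/3.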
Diagnostic performance of direct traction MR arthrography of the hip: detection of chondral and labral lesions with arthroscopic comparison.

PubMed

Schmaranzer, Florian; Klauser, Andrea; Kogler, Michael; Henninger, Benjamin; Forstner, Thomas; Reichkendler, Markus; Schmaranzer, Ehrenfried

2015-06-01

To assess diagnostic performance of traction MR arthrography of the hip in detection and grading of chondral and labral lesions with arthroscopic comparison. Seventy-five MR arthrograms obtained ± traction of 73 consecutive patients (mean age, 34.5 years; range, 14-54 years) who underwent arthroscopy were included. Traction technique included weight-adapted traction (15-23 kg), a supporting plate for the contralateral leg, and intra-articular injection of 18-27 ml (local anaesthetic and contrast agent). Patients reported on neuropraxia and on pain. Two blinded readers independently assessed femoroacetabular cartilage and labrum lesions which were correlated with arthroscopy. Interobserver agreement was calculated using κ values. Joint distraction ± traction was evaluated in consensus. No procedure had to be stopped. There were no cases of neuropraxia. Accuracy for detection of labral lesions was 92 %/93 %, 91 %/83 % for acetabular lesions, and 92 %/88 % for femoral cartilage lesions for reader 1/reader 2, respectively. Interobserver agreement was moderate (κ = 0.58) for grading of labrum lesions and substantial (κ = 0.7, κ = 0.68) for grading of acetabular and femoral cartilage lesions. Joint distraction was achieved in 72/75 and 14/75 hips with/without traction, respectively. Traction MR arthrography safely enabled accurate detection and grading of labral and chondral lesions. • The used traction technique was well tolerated by most patients. • The used traction technique almost consistently achieved separation of cartilage layers. • Traction MR arthrography enabled accurate detection of chondral and labral lesions.
Next generation of optical diagnostics for bladder cancer using probe-based confocal laser endomicroscopy

NASA Astrophysics Data System (ADS)

Liu, Jen-Jane; Chang, Timothy C.; Pan, Ying; Hsiao, Shelly T.; Mach, Kathleen E.; Jensen, Kristin C.; Liao, Joseph C.

2012-02-01

Real-time imaging with confocal laser endomicroscopy (CLE) probes that fit in standard endoscopes has emerged as a clinically feasible technology for optical biopsy of bladder cancer. Confocal images of normal, inflammatory, and neoplastic urothelium obtained with intravesical fluorescein can be differentiated by morphologic characteristics. We compiled a confocal atlas of the urinary tract using these diagnostic criteria to be used in a prospective diagnostic accuracy study. Patients scheduled to undergo transurethral resection of bladder tumor underwent white light cystoscopy (WLC), followed by CLE, and histologic confirmation of resected tissue. Areas that appeared normal by WLC were imaged and biopsied as controls. We imaged and prospectively analyzed 135 areas in 57 patients. We show that CLE improves the diagnostic accuracy of WLC for diagnosing benign tissue, low and high grade cancer. Interobserver studies showed a moderate level of agreement by urologists and nonclinical researchers. Despite morphologic differences between inflammation and cancer, real-time differentiation can still be challenging. Identification of bladder cancer-specific contrast agents could provide molecular specificity to CLE. By using fluorescently-labeled antibodies or peptides that bind to proteins expressed in bladder cancer, we have identified putative molecular contrast agents for targeted imaging with CLE. We describe one candidate agent - anti-CD47 - that was instilled into bladder specimens. The tumor and normal urothelium were imaged with CLE, with increased fluorescent signal demonstrated in areas of tumor compared to normal areas. Thus, cancer-specificity can be achieved using molecular contrast agents ex vivo in conjunction with CLE.

Stress-only myocardial perfusion scintigraphy: a prospective study on the accuracy and observer agreement with quantitative coronary angiography as the gold standard.

PubMed

Ejlersen, June A; May, Ole; Mortensen, Jesper; Nielsen, Gitte L; Lauridsen, Jeppe F; Allan, Johansen

2017-11-01

Patients with normal stress perfusion have an excellent prognosis. Prospective studies on the diagnostic accuracy of stress-only scans with contemporary, independent examinations as gold standards are lacking. A total of 109 patients with typical angina and no previous coronary artery disease underwent a 2-day stress (exercise)/rest, gated, and attenuation-corrected (AC), 99m-technetium-sestamibi perfusion study, followed by invasive coronary angiography. The stress datasets were evaluated twice by four physicians with two different training levels (expert and novice): familiar and unfamiliar with AC. The two experts also made a consensus reading of the integrated stress-rest datasets. The consensus reading and quantitative data from the invasive coronary angiography were applied as reference methods. The sensitivity/specificity were 0.92-1.00/0.73-0.90 (reference: expert consensus reading), 0.93-0.96/0.63-0.82 (reference: ≥1 stenosis>70%), and 0.75-0.88/0.70-0.88 (reference: ≥1 stenosis>50%). The four readers showed a high and fairly equal sensitivity independent of their familiarity with AC. The expert familiar with AC had the highest specificity independent of the reference method. The intraobserver and interobserver agreements on the stress-only readings were good (readers without AC experience) to excellent (readers with AC experience). AC stress-only images yielded a high sensitivity independent of the training level and experience with AC of the nuclear physician, whereas the specificity correlated positively with both. Interobserver and intraobserver agreements tended to be the best for physicians with AC experience.
Interobserver variability of sonography for prediction of placenta accreta.

PubMed

Bowman, Zachary S; Eller, Alexandra G; Kennedy, Anne M; Richards, Douglas S; Winter, Thomas C; Woodward, Paula J; Silver, Robert M

2014-12-01

The sensitivity of sonography to predict accreta has been reported as higher than 90%. However, most studies are from single expert investigators. Our objective was to analyze interobserver variability of sonography for prediction of placenta accreta. Patients with previa with and without accreta were ascertained, and images with placental views were collected, deidentified, and placed in random sequence. Three radiologists and 3 maternal-fetal medicine specialists interpreted each study for the presence of accreta and specific findings reported to be associated with its diagnosis. Investigator-specific sensitivity, specificity, and accuracy were calculated. κ statistics were used to assess variability between individuals and types of investigators. A total of 229 sonographic studies from 55 patients with accreta and 56 control patients were examined. Accuracy ranged from 55.9% to 76.4%. Of imaging studies yielding diagnoses, sensitivity ranged from 53.4% to 74.4%, and specificity ranged from 70.8% to 94.8%. Overall interobserver agreement was moderate (mean κ ± SD = 0.47 ± 0.12). κ values between pairs of investigators ranged from 0.32 (fair agreement) to 0.73 (substantial agreement). Average individual agreement ranged from fair (κ = 0.35) to moderate (κ = 0.53). Blinded from clinical data, sonography has significant interobserver variability for the diagnosis of placenta accreta. © 2013 by the American Institute of Ultrasound in Medicine.
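The pairwise κ values above compare two readers at a time. As a minimal sketch, Cohen's κ for one pair can be computed as below. The two rating lists are invented for illustration, and the verbal labels follow the Landis and Koch bands, which match the wording used in this abstract (0.32 fair, 0.47 moderate, 0.73 substantial) but are an assumption about the authors' scale.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(
        freq_a[c] * freq_b[c] for c in set(rater_a) | set(rater_b)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal label for kappa using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical reads of 10 studies (1 = accreta suspected, 0 = not suspected).
reader_1 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
reader_2 = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]
k = cohen_kappa(reader_1, reader_2)
print(round(k, 2), landis_koch_label(k))
```

In this made-up example the readers agree on 8 of 10 studies, yet κ is only about 0.58 because half of that agreement is expected by chance.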
Diagnosis of rotator cuff tears using 3-Tesla MRI versus 3-Tesla MRA: a systematic review and meta-analysis.

PubMed

McGarvey, Ciaran; Harb, Ziad; Smith, Christian; Houghton, Russell; Corbett, Steven; Ajuied, Adil

2016-02-01

To compare the diagnostic accuracy of magnetic resonance imaging (MRI), 2-dimensional magnetic resonance arthrogram (MRA) and 3-dimensional isotropic MRA in the diagnosis of rotator cuff tears when performed exclusively at 3-T. A systematic review was undertaken of the Cochrane, MEDLINE and PubMed databases in accordance with the PRISMA guidelines. Studies comparing 3-T MRI or 3-T MRA (index tests) to arthroscopic surgical findings (reference test) were included. Methodological appraisal was performed using QUADAS 2. Pooled sensitivity and specificity were calculated and summary receiver-operating curves generated. Kappa coefficients quantified inter-observer reliability. Fourteen studies comprising 1332 patients were identified for inclusion. Twelve studies were retrospective and there were concerns regarding index test bias and applicability in nine and six studies respectively. Reference test bias was a concern in all studies. Both 3-T MRI and 3-T MRA showed similar excellent diagnostic accuracy for full-thickness supraspinatus tears. Concerning partial-thickness supraspinatus tears, 3-T 2D MRA was significantly more sensitive (86.6 vs. 80.5 %, p = 0.014) but significantly less specific (95.2 vs. 100 %, p < 0.001). There was a trend towards greater accuracy in the diagnosis of subscapularis tears with 3-T MRA. Three-Tesla 3D isotropic MRA showed similar accuracy to 3-T conventional 2D MRA. Three-Tesla MRI appeared equivalent to 3-T MRA in the diagnosis of full- and partial-thickness tears, although there was a trend towards greater accuracy in the diagnosis of subscapularis tears with 3-T MRA. Three-Tesla 3D isotropic MRA appears equivalent to 3-T 2D MRA for all types of tears.
Detection of prostate cancer with multiparametric MRI (mpMRI): effect of dedicated reader education on accuracy and confidence of index and anterior cancer diagnosis

PubMed Central

Garcia-Reyes, Kirema; Passoni, Niccolò M.; Palmeri, Mark L.; Kauffman, Christopher R.; Choudhury, Kingshuk Roy; Polascik, Thomas J.; Gupta, Rajan T.

2015-01-01

Purpose: To evaluate the impact of dedicated reader education on accuracy/confidence of peripheral zone index cancer and anterior prostate cancer (PCa) diagnosis with mpMRI; secondary aim was to assess the ability of readers to differentiate low-grade cancer (Gleason 6 or below) from high-grade cancer (Gleason 7+). Materials and methods: Five blinded radiology fellows evaluated 31 total prostate mpMRIs in this IRB-approved, HIPAA-compliant, retrospective study for index lesion detection, confidence in lesion diagnosis (1–5 scale), and Gleason grade (Gleason 6 or lower vs. Gleason 7+). Following a dedicated education program, readers reinterpreted cases after a memory extinction period, blinded to initial reads. Reference standard was established combining whole mount histopathology with mpMRI findings by a board-certified radiologist with 5 years of prostate mpMRI experience. Results: Index cancer detection: pre-education accuracy 74.2%; post-education accuracy 87.7% (p = 0.003). Confidence in index lesion diagnosis: pre-education 4.22 ± 1.04; post-education 3.75 ± 1.41 (p = 0.0004). Anterior PCa detection: pre-education accuracy 54.3%; post-education accuracy 94.3% (p = 0.001). Confidence in anterior PCa diagnosis: pre-education 3.22 ± 1.54; post-education 4.29 ± 0.83 (p = 0.0003). Gleason score accuracy: pre-education 54.8%; post-education 73.5% (p = 0.0005). Conclusions: A dedicated reader education program on PCa detection with mpMRI was associated with a statistically significant increase in diagnostic accuracy of index cancer and anterior cancer detection as well as Gleason grade identification as compared to pre-education values. This was also associated with a significant increase in reader diagnostic confidence. This suggests that substantial interobserver variability in mpMRI interpretation can potentially be reduced with a focus on education and that this can occur over a fellowship training year. PMID:25034558
Inter-observer agreement for diagnostic classification of esophageal motility disorders defined in high-resolution manometry.

PubMed

Fox, M R; Pandolfino, J E; Sweis, R; Sauter, M; Abreu Y Abreu, A T; Anggiansah, A; Bogte, A; Bredenoord, A J; Dengler, W; Elvevi, A; Fruehauf, H; Gellersen, S; Ghosh, S; Gyawali, C P; Heinrich, H; Hemmink, M; Jafari, J; Kaufman, E; Kessing, K; Kwiatek, M; Lubomyr, B; Banasiuk, M; Mion, F; Pérez-de-la-Serna, J; Remes-Troche, J M; Rohof, W; Roman, S; Ruiz-de-León, A; Tutuian, R; Uscinowicz, M; Valdovinos, M A; Vardar, R; Velosa, M; Waśko-Czopnik, D; Weijenborg, P; Wilshire, C; Wright, J; Zerbib, F; Menne, D

2015-01-01

High-resolution esophageal manometry (HRM) is a recent development used in the evaluation of esophageal function. Our aim was to assess the inter-observer agreement for diagnosis of esophageal motility disorders using this technology. Practitioners registered on the HRM Working Group website were invited to review and classify (i) 147 individual water swallows and (ii) 40 diagnostic studies comprising 10 swallows using a drop-down menu that followed the Chicago Classification system. Data were presented using a standardized format with pressure contours without a summary of HRM metrics. The sequence of swallows was fixed for each user but randomized between users to avoid sequence bias. Participants were blinded to other entries. (i) Individual swallows were assessed by 18 practitioners (13 institutions). Consensus agreement (≤ 2/18 dissenters) was present for most cases of normal peristalsis and achalasia but not for cases of peristaltic dysmotility. (ii) Diagnostic studies were assessed by 36 practitioners (28 institutions). Overall inter-observer agreement was 'moderate' (kappa 0.51) being 'substantial' (kappa > 0.7) for achalasia type I/II and no lower than 'fair-moderate' (kappa >0.34) for any diagnosis. Overall agreement was somewhat higher among those that had performed >400 studies (n = 9; kappa 0.55) and 'substantial' among experts involved in development of the Chicago Classification system (n = 4; kappa 0.66). This prospective, randomized, and blinded study reports an acceptable level of inter-observer agreement for HRM diagnoses across the full spectrum of esophageal motility disorders for a large group of clinicians working in a range of medical institutions. Suboptimal agreement for diagnosis of peristaltic motility disorders highlights contribution of objective HRM metrics. © 2014 International Society for Diseases of the Esophagus.
2D-speckle tracking right ventricular strain to assess right ventricular systolic function in systolic heart failure. Analysis of the right ventricular free and posterolateral walls.

PubMed

Mouton, Stéphanie; Ridon, Héléne; Fertin, Marie; Pentiah, Anju Duva; Goémine, Céline; Petyt, Grégory; Lamblin, Nicolas; Coisne, Augustin; Foucher-Hossein, Claude; Montaigne, David; de Groote, Pascal

2017-10-15

Right ventricular (RV) systolic function is a powerful prognostic factor in patients with systolic heart failure. The accurate estimation of RV function remains difficult. The aim of the study was to determine the diagnostic accuracy of 2D-speckle tracking RV strain in patients with systolic heart failure, analyzing both free and posterolateral walls. Seventy-six patients with dilated cardiopathy (left ventricular end-diastolic volume ≥75 ml/m²) and left ventricular ejection fraction ≤45% had an analysis of the RV strain. Feasibility, reproducibility and diagnostic accuracy of RV strain were analyzed and compared to other echocardiographic parameters of RV function. RV dysfunction was defined as an RV ejection fraction ≤40% measured by radionuclide angiography. RV strain feasibility was 93.9% for the free wall and 79.8% for the posterolateral wall. RV strain reproducibility was good (intra-observer and inter-observer bias and limits of agreement of 0.16±1.2% [-2.2 to 2.5] and 0.84±2.4 [-5.5 to 3.8], respectively). Patients with left heart failure have an RV systolic dysfunction that can be unmasked by advanced echocardiographic imaging: mean RV strain was -21±5.7% in patients without RV dysfunction and -15.8±5.1% in patients with RV dysfunction (p=0.0001). Mean RV strain showed the highest diagnostic accuracy to predict depressed RVEF (area under the curve (AUC) 0.75) with moderate sensitivity (60.5%) but high specificity (87.5%) using a cutoff value of -16%. RV strain seems to be a promising and more efficient measure than previous RV echocardiographic parameters for the diagnosis of RV systolic dysfunction. Copyright © 2017 Elsevier B.V. All rights reserved.
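The reproducibility figures above read like Bland-Altman statistics: a mean difference (bias) with its standard deviation, followed by limits of agreement. The sketch below assumes the bracketed values are 95% limits computed as bias ± 1.96 × SD, which the abstract does not state explicitly. The function names are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(bias, sd, z=1.96):
    """95% limits of agreement from a bias and the SD of the differences."""
    return bias - z * sd, bias + z * sd


def bland_altman(first, second):
    """Bias, SD and limits of agreement from paired repeated measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, sd, limits_of_agreement(bias, sd)


# Intra-observer values as printed in the abstract: bias 0.16, SD 1.2.
low, high = limits_of_agreement(0.16, 1.2)
print(round(low, 1), round(high, 1))
```

For the intra-observer values this reproduces the printed limits of -2.2 and 2.5. Applied to the inter-observer values as printed (0.84, 2.4), the same formula gives roughly -3.9 to 5.5, the mirror image of the printed -5.5 to 3.8. That would be consistent with a sign difference in the reported bias, but the abstract does not settle it.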
Quantitative assessment of tumour extraction from dermoscopy images and evaluation of computer-based extraction methods for an automatic melanoma diagnostic system.

PubMed

Iyatomi, Hitoshi; Oka, Hiroshi; Saito, Masataka; Miyake, Ayako; Kimoto, Masayuki; Yamagami, Jun; Kobayashi, Seiichiro; Tanikawa, Akiko; Hagiwara, Masafumi; Ogawa, Koichi; Argenziano, Giuseppe; Soyer, H Peter; Tanaka, Masaru

2006-04-01

The aims of this study were to provide a quantitative assessment of the tumour area extracted by dermatologists and to evaluate computer-based methods from dermoscopy images for refining a computer-based melanoma diagnostic system. Dermoscopic images of 188 Clark naevi, 56 Reed naevi and 75 melanomas were examined. Five dermatologists manually drew the border of each lesion with a tablet computer. The inter-observer variability was evaluated and the standard tumour area (STA) for each dermoscopy image was defined. Manual extractions by 10 non-medical individuals and by two computer-based methods were evaluated with STA-based assessment criteria: precision and recall. Our new computer-based method introduced the region-growing approach in order to yield results close to those obtained by dermatologists. The effectiveness of our extraction method with regard to diagnostic accuracy was evaluated. Two linear classifiers were built using the results of conventional and new computer-based tumour area extraction methods. The final diagnostic accuracy was evaluated by drawing the receiver operating curve (ROC) of each classifier, and the area under each ROC was evaluated. The standard deviations of the tumour area extracted by five dermatologists and 10 non-medical individuals were 8.9% and 10.7%, respectively. After assessment of the extraction results by dermatologists, the STA was defined as the area that was selected by more than two dermatologists. Dermatologists selected the melanoma area with statistically smaller divergence than that of Clark naevus or Reed naevus (P = 0.05). By contrast, non-medical individuals did not show this difference. Our new computer-based extraction algorithm showed superior performance (precision, 94.1%; recall, 95.3%) to the conventional thresholding method (precision, 99.5%; recall, 87.6%). 
These results indicate that our new algorithm extracted a tumour area close to that obtained by dermatologists and, in particular, the border part of the tumour was adequately extracted. With this refinement, the area under the ROC increased from 0.795 to 0.875 and the diagnostic accuracy showed an increase of approximately 20% in specificity when the sensitivity was 80%. It can be concluded that our computer-based tumour extraction algorithm extracted almost the same area as that obtained by dermatologists and provided improved computer-based diagnostic accuracy.
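The STA-based criteria in this abstract can be illustrated with a minimal sketch. It assumes each extraction is represented as a set of pixel coordinates, which the abstract does not specify; the function names and the toy data are hypothetical. The STA is taken as the pixels chosen by more than two of the five dermatologists, and precision and recall are the overlap of an extraction with the STA divided by the extraction size and the STA size, respectively.

```python
from collections import Counter

def standard_tumour_area(expert_masks, min_votes=3):
    """Pixels selected by at least `min_votes` experts (more than two of five)."""
    votes = Counter(pixel for mask in expert_masks for pixel in set(mask))
    return {pixel for pixel, n in votes.items() if n >= min_votes}

def precision_recall(extracted, sta):
    """Precision and recall of an extracted area against the STA."""
    overlap = len(extracted & sta)
    precision = overlap / len(extracted) if extracted else 0.0
    recall = overlap / len(sta) if sta else 0.0
    return precision, recall

# Hypothetical toy data: five expert outlines on a tiny pixel grid.
experts = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},
    {(0, 0), (0, 1), (1, 0)},
    {(0, 0), (0, 1), (1, 1)},
    {(0, 1), (1, 0), (1, 1), (2, 2)},
    {(0, 0), (1, 1)},
]
sta = standard_tumour_area(experts)

# A conservative extraction: everything it selects is tumour, but it misses half.
precision, recall = precision_recall({(0, 0), (0, 1)}, sta)
print(precision, recall)  # 1.0 0.5
```

The toy extraction mirrors the pattern reported for the thresholding method: very high precision with lower recall, meaning the border region of the lesion is under-extracted.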
Cardiac output and systemic vascular resistance: Clinical assessment compared with a noninvasive objective measurement in children with shock.

PubMed

Razavi, Asma; Newth, Christopher J L; Khemani, Robinder G; Beltramo, Fernando; Ross, Patrick A

2017-06-01

To evaluate physician assessment of cardiac output and systemic vascular resistance in patients with shock compared with an ultrasonic cardiac output monitor (USCOM). To explore potential changes in therapy decisions if USCOM data were available using physician intervention answers. Double-blinded, prospective, observational study in a tertiary hospital pediatric intensive care unit. Forty children (<18 years) admitted with shock, requiring ongoing volume resuscitation or inotropic support. Two to 3 physicians clinically assessed cardiac output and systemic vascular resistance, categorizing them as high, normal, or low. An investigator simultaneously measured cardiac index (CI) and systemic vascular resistance index (SVRI) with USCOM categorized as high, normal, or low. Overall agreement between physician and USCOM for CI (48.5% [κ = 0.18]) and SVRI (45.9% [κ = 0.16]) was poor. Interobserver agreement was also poor for CI (58.7% [κ = 0.33]) and SVRI (52.3% [κ = 0.28]). Comparing theoretical physician interventions to "acceptable" or "unacceptable" clinical interventions, based on USCOM measurement, 56 (21%) physician interventions were found to be "unacceptable." There is poor agreement between physician-assessed CI and SVRI and USCOM, with significant interobserver variability among physicians. Objective measurement of CI and SVRI may reduce variability and improve diagnostic accuracy. Copyright © 2016 Elsevier Inc. All rights reserved.
Poorly Differentiated Clusters Predict Colon Cancer Recurrence: An In-Depth Comparative Analysis of Invasive-Front Prognostic Markers.

PubMed

Konishi, Tsuyoshi; Shimada, Yoshifumi; Lee, Lik Hang; Cavalcanti, Marcela S; Hsu, Meier; Smith, Jesse Joshua; Nash, Garrett M; Temple, Larissa K; Guillem, José G; Paty, Philip B; Garcia-Aguilar, Julio; Vakiani, Efsevia; Gonen, Mithat; Shia, Jinru; Weiser, Martin R

2018-06-01

This study aimed to compare common histologic markers at the invasive front of colon adenocarcinoma in terms of prognostic accuracy and interobserver agreement. Consecutive patients who underwent curative resection for stages I to III colon adenocarcinoma at a single institution in 2007 to 2014 were identified. Poorly differentiated clusters (PDCs), tumor budding, perineural invasion, desmoplastic reaction, and Crohn-like lymphoid reaction at the invasive front, as well as the World Health Organization (WHO) grade of the entire tumor, were analyzed. Prognostic accuracies for recurrence-free survival (RFS) were compared, and interobserver agreement among 3 pathologists was assessed. The study cohort consisted of 851 patients. Although all the histologic markers except WHO grade were significantly associated with RFS (PDCs, tumor budding, perineural invasion, and desmoplastic reaction: P<0.001; Crohn-like lymphoid reaction: P=0.021), PDCs (grade 1 [G1]: n=581; G2: n=145; G3: n=125) showed the largest separation of 3-year RFS in the full cohort (G1: 94.1%; G3: 63.7%; hazard ratio [HR], 6.39; 95% confidence interval [CI], 4.11-9.95; P<0.001), stage II patients (G1: 94.0%; G3: 67.3%; HR, 4.15; 95% CI, 1.96-8.82; P<0.001), and stage III patients (G1: 89.0%; G3: 59.4%; HR, 4.50; 95% CI, 2.41-8.41; P<0.001). PDCs had the highest prognostic accuracy for RFS with the concordance probability estimate of 0.642, whereas WHO grade had the lowest. Interobserver agreement was the highest for PDCs, with a weighted kappa of 0.824. The risk of recurrence over time peaked earlier for worse PDCs grade. Our findings indicate that PDCs are the best invasive-front histologic marker in terms of prognostic accuracy and interobserver agreement. PDCs may replace WHO grade as a prognostic indicator.
Can emergency physicians accurately and reliably assess acute vertigo in the emergency department?

PubMed

Vanni, Simone; Nazerian, Peiman; Casati, Carlotta; Moroni, Federico; Risso, Michele; Ottaviani, Maddalena; Pecci, Rudi; Pepe, Giuseppe; Vannucchi, Paolo; Grifoni, Stefano

2015-04-01

To validate a clinical diagnostic tool, used by emergency physicians (EPs), to diagnose the central cause of patients presenting with vertigo, and to determine interrater reliability of this tool. A convenience sample of adult patients presenting to a single academic ED with isolated vertigo (i.e. vertigo without other neurological deficits) was prospectively evaluated with STANDING (SponTAneous Nystagmus, Direction, head Impulse test, standiNG) by five trained EPs. The first step focused on the presence of spontaneous nystagmus, the second on the direction of nystagmus, the third on head impulse test and the fourth on gait. The local standard practice, senior audiologist evaluation corroborated by neuroimaging when deemed appropriate, was considered the reference standard. Sensitivity and specificity of STANDING were calculated. On the first 30 patients, inter-observer agreement among EPs was also assessed. Five EPs with limited experience in nystagmus assessment volunteered to participate in the present study enrolling 98 patients. Their average evaluation time was 9.9 ± 2.8 min (range 6-17). Central acute vertigo was suspected in 16 (16.3%) patients. There were 13 true positives, three false positives, 81 true negatives and one false negative, with a high sensitivity (92.9%, 95% CI 70-100%) and specificity (96.4%, 95% CI 93-38%) for central acute vertigo according to senior audiologist evaluation. The Cohen's kappas of the first, second, third and fourth steps of the STANDING were 0.86, 0.93, 0.73 and 0.78, respectively. The whole test showed a good inter-observer agreement (k = 0.76, 95% CI 0.45-1). In the hands of EPs, STANDING showed a good inter-observer agreement and accuracy validated against the local standard of care. © 2015 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.
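The point estimates of sensitivity and specificity in this abstract follow directly from the reported 2×2 counts. The short sketch below recomputes them; the function name is illustrative, and the confidence intervals are not reproduced because the abstract does not state which interval method was used.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Counts reported in the abstract: 13 TP, 3 FP, 81 TN, 1 FN (98 patients).
sens, spec = sensitivity_specificity(tp=13, fp=3, tn=81, fn=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# sensitivity = 92.9%, specificity = 96.4%
```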
DOE Office of Scientific and Technical Information (OSTI.GOV)

Wehrschuetz, M.; Aschauer, M.; Portugaller, H.

The purpose of this study was to assess interobserver variability and accuracy in the evaluation of renal artery stenosis (RAS) with gadolinium-enhanced MR angiography (MRA) and digital subtraction angiography (DSA) in patients with hypertension. The authors found that source images are more accurate than maximum intensity projection (MIP) for depicting renal artery stenosis. Two independent radiologists reviewed MRA and DSA from 38 patients with hypertension. Studies were postprocessed to display images in MIP and source images. DSA was the standard for comparison in each patient. For each main renal artery, percentage stenosis was estimated for any stenosis detected by the two radiologists. To calculate sensitivity, specificity and accuracy, MRA studies and stenoses were categorized as normal, mild (1-39%), moderate (40-69%) or severe (≥70%), or occluded. DSA stenosis estimates of 70% or greater were considered hemodynamically significant. Analysis of variance demonstrated that MIP estimates of stenosis were greater than source image estimates for both readers. Differences in estimates for MIP versus DSA reached significance in one reader. The interobserver variance for MIP, source images and DSA was excellent (0.80 < κ ≤ 0.90). The specificity of source images was high (97%) but less for MIP (87%); average accuracy was 92% for MIP and 98% for source images. In this study, source images are significantly more accurate than MIP images in one reader, with a similar trend observed in the second reader. The interobserver variability was excellent. When renal artery stenosis is a consideration, high accuracy can only be obtained when source images are examined.
Accuracy of abdominal auscultation for bowel obstruction.

PubMed

Breum, Birger Michael; Rud, Bo; Kirkegaard, Thomas; Nordentoft, Tyge

2015-09-14

To investigate the accuracy and inter-observer variation of bowel sound assessment in patients with clinically suspected bowel obstruction. Bowel sounds were recorded in patients with suspected bowel obstruction using a Littmann® Electronic Stethoscope. The recordings were processed to yield 25-s sound sequences in random order on PCs. Observers, recruited from doctors within the department, classified the sound sequences as either normal or pathological. The reference tests for bowel obstruction were intraoperative and endoscopic findings and clinical follow up. Sensitivity and specificity were calculated for each observer and compared between junior and senior doctors. Interobserver variation was measured using the Kappa statistic. Bowel sound sequences from 98 patients were assessed by 53 (33 junior and 20 senior) doctors. Laparotomy was performed in 47 patients, 35 of whom had bowel obstruction. Two patients underwent colorectal stenting due to large bowel obstruction. The median sensitivity and specificity were 0.42 (range: 0.19-0.64) and 0.78 (range: 0.35-0.98), respectively. There was no significant difference in accuracy between junior and senior doctors. The median frequency with which doctors classified bowel sounds as abnormal did not differ significantly between patients with and without bowel obstruction (26% vs 23%, P = 0.08). The 53 doctors made up 1378 unique pairs and the median Kappa value was 0.29 (range: -0.15-0.66). Accuracy and inter-observer agreement were generally low. Clinical decisions in patients with possible bowel obstruction should not be based on auscultatory assessment of bowel sounds.
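The pairwise design described here (53 observers forming 1378 unique pairs, summarized by a median kappa) can be sketched as follows. This assumes the unweighted Cohen's kappa was computed for every pair of observers, which the abstract implies but does not spell out; the ratings are hypothetical toy data and the function names are illustrative.

```python
from itertools import combinations
from math import comb
from statistics import median

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

def pairwise_kappas(ratings):
    """Kappa for every unique pair of raters."""
    return [cohen_kappa(ratings[i], ratings[j])
            for i, j in combinations(range(len(ratings)), 2)]

# 53 observers give C(53, 2) = 1378 unique pairs, as stated in the abstract.
print(comb(53, 2))  # 1378

# Hypothetical ratings: 1 = pathological, 0 = normal, six recordings each.
ratings = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
]
kappas = pairwise_kappas(ratings)
print(median(kappas))
```

Reporting the median and range of the pairwise values, as the abstract does, avoids assuming that any single pair is representative of the whole group.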
Diagnostic Accuracy and Clinical Implications of Translabial Ultrasound for the Assessment of Levator Ani Defects and Levator Ani Biometry in Women With Pelvic Organ Prolapse: A Systematic Review.

PubMed

Notten, Kim J B; Vergeldt, Tineke F M; van Kuijk, Sander M J; Weemhoff, Mirjam; Roovers, Jan-Paul W R

The aim of this study was to assess the diagnostic accuracy and clinical implications of translabial 3-dimensional (3D) ultrasound for the assessment of levator ani defects and biometry in women with pelvic organ prolapse (POP). We performed a systematic literature search through computerized databases including MEDLINE (via PubMed), EMBASE (via OvidSP), and the Cochrane Library using both medical subject headings and text terms from January 1, 2003, to December 25, 2015. We included articles that reported on POP status and diagnostic accuracy measurements with translabial 3D ultrasound or transperineal ultrasound for the detection of levator ani defects or for measuring pelvic floor biometry, that is, levator ani hiatus, or reported on the clinical relevance of using translabial 3D ultrasound for levator ani defects or measuring pelvic floor biometry in women with POP. Thirty-one articles were selected in accordance with parts of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines that can be applied to studies of diagnostic accuracy. Twenty-two articles (71%) are coauthored by 1 expert in this field. Detecting levator ani defects with translabial 3D ultrasound compared with magnetic resonance imaging showed a moderate to good agreement, whereas measuring hiatal biometry on translabial 3D ultrasound compared with magnetic resonance imaging showed a moderate to very good agreement. The interobserver agreement for diagnosing levator ani defects and measuring the levator hiatal area showed a moderate to very good agreement. Furthermore, levator ani defects increase the risk of cystocele and uterine prolapse, and levator ani defects are associated with recurrent POP. Finally, a larger hiatus was associated with POP and recurrent POP. Translabial 3D ultrasound is reproducible for diagnosing levator ani defects and ballooning hiatus.
Both levator ani defects and a larger hiatal area are, in a selected population of patients with pelvic floor dysfunction, associated with POP and recurrent POP. More research is needed concerning external validation because most data in this article are coauthored by 1 expert in this field.
Validity and Reliability of Dermoscopic Criteria Used to Differentiate Nevi From Melanoma

PubMed Central

Carrera, Cristina; Marchetti, Michael A.; Dusza, Stephen W.; Argenziano, Giuseppe; Braun, Ralph P.; Halpern, Allan C.; Jaimes, Natalia; Kittler, Harald J.; Malvehy, Josep; Menzies, Scott W.; Pellacani, Giovanni; Puig, Susana; Rabinovitz, Harold S.; Scope, Alon; Soyer, H. Peter; Stolz, Wilhelm; Hofmann-Wellenhof, Rainer; Zalaudek, Iris; Marghoob, Ashfaq A.

2017-01-01

IMPORTANCE The comparative diagnostic performance of dermoscopic algorithms and their individual criteria are not well studied. OBJECTIVES To analyze the discriminatory power and reliability of dermoscopic criteria used in melanoma detection and compare the diagnostic accuracy of existing algorithms. DESIGN, SETTING, AND PARTICIPANTS This was a retrospective, observational study of 477 lesions (119 melanomas [24.9%] and 358 nevi [75.1%]), which were divided into 12 image sets that consisted of 39 or 40 images per set. A link on the International Dermoscopy Society website from January 1, 2011, through December 31, 2011, directed participants to the study website. Data analysis was performed from June 1, 2013, through May 31, 2015. Participants included physicians, residents, and medical students, and there were no specialty-type or experience-level restrictions. Participants were randomly assigned to evaluate 1 of the 12 image sets. MAIN OUTCOMES AND MEASURES Associations with melanoma and intraclass correlation coefficients (ICCs) were evaluated for the presence of dermoscopic criteria. Diagnostic accuracy measures were estimated for the following algorithms: the ABCD rule, the Menzies method, the 7-point checklist, the 3-point checklist, chaos and clues, and CASH (color, architecture, symmetry, and homogeneity). RESULTS A total of 240 participants registered, and 103 (42.9%) evaluated all images. The 110 participants (45.8%) who evaluated fewer than 20 lesions were excluded, resulting in data from 130 participants (54.2%), 121 (93.1%) of whom were regular dermoscopy users. Criteria associated with melanoma included marked architectural disorder (odds ratio [OR], 6.6; 95% CI, 5.6–7.8), pattern asymmetry (OR, 4.9; 95% CI, 4.1–5.8), nonorganized pattern (OR, 3.3; 95% CI, 2.9–3.7), border score of 6 (OR, 3.3; 95% CI, 2.5–4.3), and contour asymmetry (OR, 3.2; 95% CI, 2.7–3.7) (P < .001 for all). Most dermoscopic criteria had poor to fair interobserver agreement. 
Criteria that reached moderate levels of agreement included comma vessels (ICC, 0.44; 95% CI, 0.40–0.49), absence of vessels (ICC, 0.46; 95% CI, 0.42–0.51), dark brown color (ICC, 0.40; 95% CI, 0.35–0.44), and architectural disorder (ICC, 0.43; 95% CI, 0.39–0.48). The Menzies method had the highest sensitivity for melanoma diagnosis (95.1%) but the lowest specificity (24.8%) compared with any other method (P < .001). The ABCD rule had the highest specificity (59.4%). All methods had similar areas under the receiver operating characteristic curves. CONCLUSIONS AND RELEVANCE Important dermoscopic criteria for melanoma recognition were revalidated by participants with varied experience. Six algorithms tested had similar but modest levels of diagnostic accuracy, and the interobserver agreement of most individual criteria was poor. PMID:27074267
Intra- and Inter-observer Variability of Measurements of the Laxity Index on Stress Radiographs Performed with the Vezzoni-Modified Badertscher Hip Distension Device.

PubMed

Bertal, Mileva; Vezzoni, Aldo; Houdellier, Blandine; Bogaerts, Evelien; Stock, Emmelie; Polis, Ingeborgh; Deforce, Dieter; Saunders, Jimmy H; Broeckx, Bart J G

2018-06-02

To describe and evaluate the accuracy, intra- and inter-observer variability of the laxity index (LI), used to quantify hip laxity on stress radiographs obtained with the Vezzoni-modified Badertscher distension device (VMBDD). Stress radiographs of 10 dogs obtained with the VMBDD were measured three times by an experienced observer. Six participants with different backgrounds (two ECVDI residents, two PhD students, two veterinary assistants) followed a short presentation and subsequently performed the measurements four times in two separate sessions. The effect of self-learning, feedback and specialization on the accuracy of the measurements was assessed. While the intra- and inter-observer variability were in agreement with other studies, the results of the experienced observer indicated that the variability can be very low. Neither feedback nor self-learning improved the results. A high degree of experience in radiographic assessment was not necessary to perform the measurements correctly. As the LI measurements were acceptable after a short presentation, they support the use of VMBDD for a complete and correct in-house evaluation of the hip joint by trained clinicians. However, we propose that, in the context of screening, measurements should be performed by a limited number of experienced examiners, to limit the impact of the inter-observer variability. Schattauer GmbH Stuttgart.
Detection of artificial air space opacities with digital radiography: ex vivo study on enhanced latitude post-processing.

PubMed

Biederer, J; Bolte, H; Schmidt, T; Charalambous, N; Both, M; Kopp, U; Hoffmann, B; Freitag-Wolf, S; Van Metter, R; Heller, M

2010-03-01

To evaluate in a.-p. digital chest radiograms of an ex vivo system if increased latitude and enhanced image detail contrast (EVP) improve the accuracy of detecting artificial air space opacities in parts of the lung that are superimposed by the diaphragm. 19 porcine lungs were inflated inside a chest phantom, prepared with 20-50 ml gelatin-stabilized liquid to generate alveolar air space opacities, and examined with direct radiography (3.0 × 2.5 k detector/ 125 kVp/ 4 mAs). 276 a.-p. images with and without EVP of 1.0-3.0 were presented to 6 observers. 8 regions were read for opacities, the reference was defined by CT. Statistics included sensitivity/specificity, interobserver variability, and calculation of Az (area under ROC curve). Behind the diaphragm (opacities in 32/92 regions), the median sensitivity increased from 0.35 without EVP to 0.53-0.56 at EVP 1.5-3.0 (significant in 5/6 observers). The specificity decreased from 0.96 to 0.90 (significant in 6/6), and the Az value and interobserver correlation increased from 0.66 to 0.74 and 0.39 to 0.48, respectively. Above the diaphragm, the median sensitivity for artificial opacities (136/276 regions) increased from 0.71 to 0.77-0.82 with EVP (significant in 4/6 observers). The specificity and Az value decreased from 0.76 to 0.62 and 0.74 to 0.70, respectively, (significant in 3/6). In this ex vivo experiment, EVP improved the diagnostic accuracy for artificial air space opacities in the superimposed parts of the lung (area under the ROC curve). Above the diaphragm, the accuracy was not affected due to a tradeoff in sensitivity/specificity. © Georg Thieme Verlag KG Stuttgart · New York.
Vascular Pattern Analysis on Microvascular Sonography for Differentiation of Pleomorphic Adenomas and Warthin Tumors of Salivary Glands.

PubMed

Ryoo, Inseon; Suh, Sangil; Lee, Young Hen; Seo, Hyung Suk; Seol, Hae Young; Woo, Jeong-Soo; Kim, Soo Chin

2018-03-01

Pleomorphic adenomas and Warthin tumors are the most common salivary gland tumors. It is important to differentiate between them because at least a partial parotidectomy is necessary for pleomorphic adenomas, whereas enucleation is sufficient for Warthin tumors. This study aimed to evaluate the usefulness of vascular pattern analysis using microvascular sonography to differentiate between the tumors. Sixty-two patients with pathologically proven pleomorphic adenomas (n = 38) and Warthin tumors (n = 24) were included. For all tumors, grayscale, power Doppler, and microvascular sonographic examinations were performed. Differences in vascular patterns (vascular distribution and internal vascularity) on power Doppler and microvascular sonography as well as grayscale sonographic features (size, shape, border, echogenicity, heterogeneity, and cystic change) between pleomorphic adenomas and Warthin tumors were evaluated. A comparison of diagnostic performances of grayscale sonography with power Doppler sonography and grayscale sonography with microvascular sonography was performed. The level of interobserver agreement between 2 reviewers in diagnosing tumors was evaluated. No grayscale sonographic features showed a significant difference between the tumors. Vascular distributions and internal vascularity on power Doppler sonography (P = .01 and .002) and microvascular sonography (both P < .001) were all significantly different. The diagnostic accuracy of grayscale sonography with microvascular sonography (79.0%) was higher than that of grayscale sonography with power Doppler sonography (72.6%). This difference was significant according to the McNemar test (P = .004). Interobserver agreement was excellent in diagnosing tumors on both grayscale sonography with power Doppler sonography (κ = 0.83) and grayscale sonography with microvascular sonography (κ = 0.94). 
Vascular pattern analysis using microvascular sonography with other sonographic features is helpful for differentiating between pleomorphic adenomas and Warthin tumors. © 2017 by the American Institute of Ultrasound in Medicine.
High interobserver variability in the assessment of epsilon waves: Implications for diagnosis of arrhythmogenic right ventricular cardiomyopathy/dysplasia.

PubMed

Platonov, Pyotr G; Calkins, Hugh; Hauer, Richard N; Corrado, Domenico; Svendsen, Jesper H; Wichter, Thomas; Biernacka, Elżbieta Katarzyna; Saguner, Ardan M; Te Riele, Anneline S J M; Zareba, Wojciech

2016-01-01

Revision of the Task Force diagnostic criteria for arrhythmogenic right ventricular cardiomyopathy/dysplasia (ARVC/D) has increased their sensitivity for the diagnosis of early and familial forms of the disease. The epsilon wave is a major diagnostic criterion in the context of ARVC/D, which, however, remains not quantifiable and therefore may leave room for substantial subjective interpretation. The purpose of this study was to assess interobserver agreement in epsilon wave definition and epsilon wave importance for ARVC/D diagnosis. Electrocardiographic (ECG) tracings depicting leads V1, V2, and V3 collected from individuals evaluated for ARVC/D (n = 30) were given to panel members, who were asked whether the ECG patterns met the epsilon wave definition outlined by the Task Force diagnostic criteria. The prevalence and importance of epsilon waves for ARVC/D diagnosis were assessed in a pooled data set of patients with definite ARVC/D from European and American registries (n = 815). The number of ECG patterns identified as epsilon waves varied from 5 to 18 per reviewer (median 13 per reviewer). A unanimous agreement was reached for only 10 cases (33%), 2 of which qualified as epsilon waves and 8 as non-epsilon waves by all panel members. From a pooled data set, 106 patients reportedly had epsilon waves (13%). In 105 of 106 patients with epsilon waves (99%), exclusion of epsilon waves from the diagnostic score would not affect the "definite" diagnostic category. Interobserver variability in the assessment of epsilon waves is high; however, the impact of epsilon waves on ARVC/D diagnosis is negligibly low. The results urge caution in the assessment of epsilon waves, especially in patients who would not otherwise meet diagnostic criteria. Copyright © 2016 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.
Comparison of empirical estimate of clinical pretest probability with the Wells score for diagnosis of deep vein thrombosis.

PubMed

Wang, Bo; Lin, Yin; Pan, Fu-shun; Yao, Chen; Zheng, Zi-Yu; Cai, Dan; Xu, Xiang-dong

2013-01-01

Wells score has been validated for estimation of pretest probability in patients with suspected deep vein thrombosis (DVT). In clinical practice, many clinicians prefer to use empirical estimation rather than Wells score. However, which method is better to increase the accuracy of clinical evaluation is not well understood. Our present study compared empirical estimation of pretest probability with the Wells score to investigate the efficiency of empirical estimation in the diagnostic process of DVT. Five hundred and fifty-five patients were enrolled in this study. One hundred and fifty patients were assigned to examine the interobserver agreement for Wells score between emergency and vascular clinicians. The other 405 patients were assigned to evaluate the pretest probability of DVT on the basis of the empirical estimation and Wells score, respectively, and plasma D-dimer levels were then determined in the low-risk patients. All patients underwent venous duplex scans and had a 45-day follow up. Weighted Cohen's κ value for interobserver agreement between emergency and vascular clinicians of the Wells score was 0.836. Compared with Wells score evaluation, empirical assessment increased the sensitivity, specificity, Youden's index, positive likelihood ratio, and positive and negative predictive values, but decreased negative likelihood ratio. In addition, the appropriate D-dimer cutoff value based on Wells score was 175 μg/l and 108 patients were excluded. Empirical assessment increased the appropriate D-dimer cutoff point to 225 μg/l and 162 patients were ruled out. Our findings indicated that empirical estimation not only improves D-dimer assay efficiency for exclusion of DVT but also increases clinical judgement accuracy in the diagnosis of DVT.
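The summary measures compared in this abstract (Youden's index and the positive and negative likelihood ratios) are simple functions of sensitivity and specificity. The sketch below uses the standard definitions; the input values are hypothetical, since the abstract does not report the underlying sensitivities and specificities.

```python
import math

def youden_and_likelihood_ratios(sensitivity, specificity):
    """Standard definitions:
    Youden's J = sensitivity + specificity - 1
    LR+        = sensitivity / (1 - specificity)
    LR-        = (1 - sensitivity) / specificity
    """
    youden = sensitivity + specificity - 1
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    return youden, lr_pos, lr_neg

# Hypothetical inputs for illustration only (not taken from the study).
j, lr_pos, lr_neg = youden_and_likelihood_ratios(sensitivity=0.90, specificity=0.80)
print(round(j, 3), round(lr_pos, 3), round(lr_neg, 3))
```

A higher Youden's index and positive likelihood ratio together with a lower negative likelihood ratio all point in the same direction, which is the pattern the abstract reports for empirical assessment relative to the Wells score.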
Machine learning-based quantitative texture analysis of CT images of small renal masses: Differentiation of angiomyolipoma without visible fat from renal cell carcinoma.

PubMed

Feng, Zhichao; Rong, Pengfei; Cao, Peng; Zhou, Qingyu; Zhu, Wenwei; Yan, Zhimin; Liu, Qianyun; Wang, Wei

2018-04-01

To evaluate the diagnostic performance of machine-learning based quantitative texture analysis of CT images to differentiate small (≤ 4 cm) angiomyolipoma without visible fat (AMLwvf) from renal cell carcinoma (RCC). This single-institutional retrospective study included 58 patients with pathologically proven small renal mass (17 in AMLwvf and 41 in RCC groups). Texture features were extracted from the largest possible tumorous regions of interest (ROIs) by manual segmentation in preoperative three-phase CT images. Interobserver reliability and the Mann-Whitney U test were applied to select features preliminarily. Then support vector machine with recursive feature elimination (SVM-RFE) and synthetic minority oversampling technique (SMOTE) were adopted to establish discriminative classifiers, and the performance of classifiers was assessed. Of the 42 extracted features, 16 candidate features showed significant intergroup differences (P < 0.05) and had good interobserver agreement. An optimal feature subset including 11 features was further selected by the SVM-RFE method. The SVM-RFE+SMOTE classifier achieved the best performance in discriminating between small AMLwvf and RCC, with the highest accuracy, sensitivity, specificity and AUC of 93.9%, 87.8%, 100% and 0.955, respectively. Machine learning analysis of CT texture features can facilitate the accurate differentiation of small AMLwvf from RCC.
• Although conventional CT is useful for diagnosis of SRMs, it has limitations.
• Machine-learning based CT texture analysis facilitates differentiation of small AMLwvf from RCC.
• The highest accuracy of SVM-RFE+SMOTE classifier reached 93.9%.
• Texture analysis combined with machine-learning methods might spare unnecessary surgery for AMLwvf.

Stenosis detection in native hemodialysis fistulas with MDCT angiography.

PubMed

Heye, Sam; Maleux, Geert; Claes, Kathleen; Kuypers, Dirk; Oyen, Raymond

2009-04-01

The objective of our study was to assess the diagnostic value of 64-MDCT angiography in the evaluation of failing hemodialysis arteriovenous fistulas (AVFs) in comparison with conventional digital subtraction angiography (DSA). Thirty-six patients (22 men; mean age ± SD, 65 ± 15 years) with hemodialysis fistula dysfunction underwent MDCT angiography before DSA. Linear weighted kappa was used to calculate interobserver agreement for stenosis for both MDCT angiography and DSA on a 5-point scale. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for the detection of ≥ 50% stenosis or occlusion on MDCT angiography were calculated using DSA as the standard of reference. Wilcoxon's signed rank test and Mann-Whitney U test were used to compare differences in image quality between MDCT angiography and DSA and between MDCT angiography with the patient's arm stretched overhead or alongside the body, respectively. Interobserver agreement for detecting stenosis was excellent for both DSA (kappa = 0.86; 95% CI, 0.81-0.91) and MDCT angiography (kappa = 0.82; 95% CI, 0.77-0.87). Accuracy, sensitivity, specificity, PPV, and NPV of MDCT angiography for detecting ≥ 50% stenosis or occlusion were 92.0% (95% CI, 86.8-95.3%), 90.2% (77.8-96.3%), 92.8% (85.9-96.6%), 85.2% (72.3-92.9%), and 95.4% (89.0-98.3%), respectively. No significant difference in image quality was seen between MDCT angiography and DSA (p = 0.3008) or between MDCT angiography with the patient's arm stretched overhead or alongside the body (p = 0.2912). MDCT angiography is a reproducible and reliable imaging technique for detection of ≥ 50% stenosis or occlusion in dysfunctional hemodialysis fistulas.
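The linear weighted kappa used here for a 5-point stenosis scale penalizes disagreements in proportion to how many grades apart the two readers are. A minimal sketch follows, assuming the standard linear disagreement weights |i − j| / (k − 1); the grades and the function name are hypothetical, and the confidence intervals quoted in the abstract are not reproduced.

```python
def linear_weighted_kappa(a, b, categories):
    """Linear weighted kappa for two raters on an ordered scale.

    `categories` lists the scale levels in order; disagreement between
    levels i and j is weighted |i - j| / (k - 1).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions of (rater A grade, rater B grade).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[index[x]][index[y]] += 1 / n

    row = [sum(obs[i]) for i in range(k)]                        # rater A marginals
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]   # rater B marginals

    def weight(i, j):
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if d_exp == 0:
        return 1.0  # both raters used the same single grade throughout
    return 1 - d_obs / d_exp

# Hypothetical grades on a 5-point scale (0 = normal ... 4 = occluded).
reader_1 = [0, 1, 2, 3, 4]
reader_2 = [1, 1, 2, 3, 4]
print(linear_weighted_kappa(reader_1, reader_2, categories=[0, 1, 2, 3, 4]))
```

Compared with the unweighted statistic, a one-grade difference costs much less than a four-grade difference, which suits ordinal grading of stenosis severity.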
Diagnosis Performance of Different MR Imaging Signs of Cirrhosis: the Caudate to Right Lobe Ratio, the Posterior Right Hepatic Notch, and the Expanded Gallbladder Fossa

PubMed Central

Bolog, N.; Oancea, I.; Andreisek, G.; Mangrau, Angelica; Caruntu, F.

2009-01-01

Background & Aims The purpose of the study is to evaluate the accuracy of the C/RL, RPN, and EGF in diagnosing cirrhosis. Methods The study population included 95 cirrhotic patients in the cirrhosis group (56 men, 39 women, age range 14-76; mean age 52.3) and 57 subjects in the control group (26 men, 31 women, age range 18-83; mean age 51). All MR examinations were performed by using the same protocol. Two radiologists independently assessed data sets in two different reading sessions. The sensitivity, specificity, and accuracy and the relative risk of the signs in diagnosing cirrhosis were calculated. The diagnostic accuracy of the C/RL sign was calculated using the ROC curve. The statistical significance of any difference of each sign between different classes of cirrhosis was also calculated. Results The interobserver agreement between the readers was excellent (κ ≥ 0.81; 95% CI: 0.92, 1.0). There was a statistically significant difference in the diagnostic value of C/RL, RPN, and EGF between cirrhotic patients and control group (p<0.001). The sensitivity, specificity, and accuracy of C/RL were 72%, 87%, and 78%; 67%, 87%, and 75% for RPN; and 49%, 91%, and 65% for EGF. C/RL (OR=18.95) and RPN (OR=14.74) showed a higher risk for cirrhosis compared to EGF (OR=14.74). There was a statistically significant difference between C/RL and EGF (p=0.002) and between RPN and EGF for Child A class of cirrhosis (p=0.037). Conclusion The C/RL and RPN perform similarly in diagnosing cirrhosis, and both have a higher diagnostic performance than EGF. PMID:24778811
Diagnostic Accuracy of the FIGO and the 5-Tier Fetal Heart Rate Classification Systems in the Detection of Neonatal Acidemia.

PubMed

Martí Gamboa, Sabina; Giménez, Olga Redrado; Mancho, Jara Pascual; Moros, María Lapresta; Sada, Julia Ruiz; Mateo, Sergio Castan

2017-04-01

Objective The objective of this study was to determine ability to detect neonatal acidemia and interobserver agreement with the FIGO 3-tier and 5-tier fetal heart rate (FHR) classification systems. Design This was a case-control study. Setting This study was set at the University Medical Center. Population A total of 202 FHR tracings of 102 women who delivered an acidemic fetus (umbilical arterial cord gas pH ≤ 7.10 and BE < - 8) and 100 who delivered a nonacidemic fetus (umbilical arterial cord gas pH > 7.10) were assessed. A subanalysis was performed for those fetuses who suffered severe metabolic acidemia (pH ≤ 7.0 and BE < - 12). Methods Two reviewers blind to clinical and outcome data classified tracings according to the new 3-tier system proposed by the FIGO and the 5-tier system proposed by Parer and Ikeda. Main Outcome Measures Sensitivity and specificity for detecting neonatal acidemia and interobserver agreement in classifying FHR tracings into categories of both systems were studied. Results The 3-tier system showed a greater sensitivity and lower specificity to detect neonatal acidemia (43.6% sensitivity, 82.5% specificity) and severe metabolic acidemia (71.4% sensitivity, 74.0% specificity) compared with the 5-tier system (36.3% sensitivity, 88% specificity and 61.9% sensitivity, 80.1% specificity, respectively). Both systems were compared by area under the receiver-operating characteristic curve, with comparable predictive ability for detecting neonatal acidemia (FIGO-area under the curve [AUC]: 0.63 [95% confidence interval [CI]: 0.57-0.68] and Parer-AUC: 0.62 [95% CI: 0.56-0.67]). Interobserver agreement was moderate for both systems, but performance at each specific category showed a better agreement for the 5-tier system identifying a pathological tracing (orange or red, κ: 0.625 vs. pathological category, κ: 0.538). 
Conclusion Both systems presented a comparable ability to predict neonatal acidemia, although the 5-tier system showed a better interobserver agreement identifying pathological tracings.
A global comparative evaluation of commercial immunochromatographic rapid diagnostic tests for visceral leishmaniasis.

PubMed

Cunningham, Jane; Hasker, Epco; Das, Pradeep; El Safi, Sayda; Goto, Hiro; Mondal, Dinesh; Mbuchi, Margaret; Mukhtar, Maowia; Rabello, Ana; Rijal, Suman; Sundar, Shyam; Wasunna, Monique; Adams, Emily; Menten, Joris; Peeling, Rosanna; Boelaert, Marleen

2012-11-15

Poor access to diagnosis stymies control of visceral leishmaniasis (VL). Antibody-detecting rapid diagnostic tests (RDTs) can be performed in peripheral health settings. However, there are many brands available and published reports of variable accuracy. Commercial VL RDTs containing bound rK39 or rKE16 antigen were evaluated using archived human sera from confirmed VL cases (n = 750) and endemic non-VL controls (n = 754) in the Indian subcontinent (ISC), Brazil, and East Africa to assess sensitivity and specificity with 95% confidence intervals. A subset of RDTs were also evaluated after 60 days' heat incubation (37°C, 45°C). Interlot and interobserver variability was assessed. All test brands performed well against ISC panels (sensitivity range, 92.8%-100%; specificity range, 96%-100%); however, sensitivity was lower against Brazil and East African panels (61.5%-91% and 36.8%-87.2%, respectively). Specificity was consistently > 95% in Brazil and ranged between 90.8% and 98% in East Africa. Performance of some products was adversely affected by high temperatures. Agreement between lots and readers was good to excellent (κ > 0.73-0.99). Diagnostic accuracy of VL RDTs varies between the major endemic regions. Many tests performed well and showed good heat stability in the ISC; however, reduced sensitivity against Brazilian and East African panels suggests that in these regions, used alone, several RDTs are inadequate for excluding a VL diagnosis. More research is needed to assess ease of use and to compare performance using whole blood instead of serum and in patients coinfected with human immunodeficiency virus.
Accuracy of abdominal auscultation for bowel obstruction

PubMed Central

Breum, Birger Michael; Rud, Bo; Kirkegaard, Thomas; Nordentoft, Tyge

2015-01-01

AIM: To investigate the accuracy and inter-observer variation of bowel sound assessment in patients with clinically suspected bowel obstruction. METHODS: Bowel sounds were recorded in patients with suspected bowel obstruction using a Littmann® Electronic Stethoscope. The recordings were processed to yield 25-s sound sequences in random order on PCs. Observers, recruited from doctors within the department, classified the sound sequences as either normal or pathological. The reference tests for bowel obstruction were intraoperative and endoscopic findings and clinical follow up. Sensitivity and specificity were calculated for each observer and compared between junior and senior doctors. Interobserver variation was measured using the Kappa statistic. RESULTS: Bowel sound sequences from 98 patients were assessed by 53 (33 junior and 20 senior) doctors. Laparotomy was performed in 47 patients, 35 of whom had bowel obstruction. Two patients underwent colorectal stenting due to large bowel obstruction. The median sensitivity and specificity were 0.42 (range: 0.19-0.64) and 0.78 (range: 0.35-0.98), respectively. There was no significant difference in accuracy between junior and senior doctors. The median frequency with which doctors classified bowel sounds as abnormal did not differ significantly between patients with and without bowel obstruction (26% vs 23%, P = 0.08). The 53 doctors made up 1378 unique pairs and the median Kappa value was 0.29 (range: -0.15 to 0.66). CONCLUSION: Accuracy and inter-observer agreement were generally low. Clinical decisions in patients with possible bowel obstruction should not be based on auscultatory assessment of bowel sounds. PMID:26379407
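The pairwise agreement analysis described in the abstract above can be sketched in Python as follows. This is a minimal illustration that assumes unweighted Cohen's kappa computed for every pair of observers and then summarized by the median. The function names and the toy ratings are hypothetical, and the study's own software and data are not known from the abstract.

```python
from itertools import combinations
from math import comb
from statistics import median


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equally long lists of labels."""
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (observed - expected) / (1 - expected)


def pairwise_kappas(ratings):
    """Kappa for every unordered pair of observers (one rating list each)."""
    return [cohen_kappa(ratings[i], ratings[j])
            for i, j in combinations(range(len(ratings)), 2)]


# 53 observers give C(53, 2) unique pairs, matching the 1378 in the abstract.
print(comb(53, 2))

# Toy ratings for three observers (1 = pathological, 0 = normal).
toy = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1]]
print(median(pairwise_kappas(toy)))
```

Summarizing with the median and range, as the abstract does, keeps a few strongly disagreeing pairs from dominating the result.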
Application of classification trees for the qualitative differentiation of focal liver lesions suspicious for metastasis in gadolinium-EOB-DTPA-enhanced liver MR imaging.

PubMed

Schelhorn, J; Benndorf, M; Dietzel, M; Burmeister, H P; Kaiser, W A; Baltzer, P A T

2012-09-01

To evaluate the diagnostic accuracy of qualitative descriptors alone and in combination for the classification of focal liver lesions (FLLs) suspicious for metastasis in gadolinium-EOB-DTPA-enhanced liver MR imaging. Consecutive patients with clinically suspected liver metastases were eligible for this retrospective investigation. 50 patients met the inclusion criteria. All underwent Gd-EOB-DTPA-enhanced liver MRI (T2w, chemical shift T1w, dynamic T1w). Primary liver malignancies or treated lesions were excluded. All investigations were read by two blinded observers (O1, O2). Both independently identified the presence of lesions and evaluated predefined qualitative lesion descriptors (signal intensities, enhancement pattern and morphology). A reference standard was determined under consideration of all clinical and follow-up information. Statistical analysis besides contingency tables (chi square, kappa statistics) included descriptor combinations using classification trees (CHAID methodology) as well as ROC analysis. In 38 patients, 120 FLLs (52 benign, 68 malignant) were present. 115 (48 benign, 67 malignant) were identified by the observers. The enhancement pattern, relative SI upon T2w and late enhanced T1w images contributed significantly to the differentiation of FLLs. The overall classification accuracy was 91.3 % (O1) and 88.7 % (O2), kappa = 0.902. The combination of qualitative lesion descriptors proposed in this work revealed high diagnostic accuracy and interobserver agreement in the differentiation of focal liver lesions suspicious for metastases using Gd-EOB-DTPA-enhanced liver MRI. © Georg Thieme Verlag KG Stuttgart · New York.
Journal club: Acute abdominal pain in elderly patients: effect of radiologist awareness of clinicobiologic information on CT accuracy.

PubMed

Millet, Ingrid; Alili, Chakib; Bouic-Pages, Emmanuelle; Curros-Doyon, Fernanda; Nagot, Nicolas; Taourel, Patrice

2013-12-01

The purpose of this study was to assess whether the availability of clinicobiologic findings would affect the diagnostic performance of CT of elderly emergency department patients with nontraumatic acute abdominal pain. The cases of 333 consecutively registered patients 75 years old or older presenting to the emergency department with acute abdominal pain and who underwent CT were retrospectively reviewed by two radiologists blinded or not to the patient's clinicobiologic results. Diagnostic accuracy was calculated according to the level of correctly classified cases in both the entire cohort and a surgical subgroup and was compared between readings performed with and without knowledge of the clinicobiologic findings. Agreement between each reading and the reference diagnosis and interobserver agreement were assessed with kappa statistics. In both the entire cohort (87.4% vs 85.3%, p = 0.07) and the surgical group (94% vs 91%, p = 0.15), there was no significant difference in CT accuracy between diagnoses made when the radiologist was aware and those made when the radiologist was not aware of the clinicobiologic findings. Agreement between the CT diagnosis and the final diagnosis was excellent whether or not the radiologist was aware of the clinicobiologic findings. In the care of elderly patients, CT is accurate for diagnosing the cause of acute abdominal pain, particularly when it is of surgical origin, regardless of the availability of clinical and biologic findings. Thus CT interpretation should not be delayed until complete clinicobiologic data are available, and the images should be quickly transmitted to the emergency physician so that appropriate therapy can be begun.
COMFORT scale: a reliable and valid method to measure the amount of stress of ventilated preterm infants.

PubMed

Wielenga, J M; De Vos, R; de Leeuw, R; De Haan, R J

2004-01-01

Assessment of clinimetric properties and diagnostic quality of a stress measurement scale (COMFORT scale). Sample of an open population. Neonatology department (Neonatal Intensive Care Unit), Academic Medical Centre/Emma Children's Hospital, Amsterdam, The Netherlands. One clinical expert and 9 observers observed ventilated prematurely born babies simultaneously. Criterion validity was assessed by correlating the COMFORT scale with the clinical judgment regarding the amount of stress. Interobserver reliability was assessed on the clinical judgment as well as on the COMFORT scale. Diagnostic qualities were evaluated with a ROC curve. On 19 ventilated prematurely born babies (mean gestational age 30 weeks, mean birth weight 1385 g), one clinical expert and 9 observers made 30 paired observations. The criterion validity of the COMFORT scale was good (Pearson's r of 0.84). The interobserver reliability of the clinical judgment was very good (weighted Kappa 0.84). The interobserver reliability of each item varied from good to almost perfect (weighted Kappa of 0.64 for muscle tone to 1.00 for heart rate). The reliability of the total COMFORT scale score was satisfactory (intra-class correlation coefficient of 0.94). The diagnostic quality of the COMFORT scale was excellent: at a cut-off point of 20, the sensitivity was 100 percent, the specificity was 77 percent, and the area under the curve (AUC) was 0.95. In this first evaluation, the COMFORT scale appears to be a valid and reliable measurement tool to assess the stress of ventilated prematurely born babies.
Substituting Sodium Hydrosulfite with Sodium Metabisulfite Improves Long-Term Stability of a Distributable Paper-Based Test Kit for Point-of-Care Screening for Sickle Cell Anemia.

PubMed

Torabian, Kian; Lezzar, Dalia; Piety, Nathaniel Z; George, Alex; Shevkoplyas, Sergey S

2017-09-20

Sickle cell anemia (SCA) is a genetic blood disorder that is particularly lethal in early childhood. Universal newborn screening programs and subsequent early treatment are known to drastically reduce under-five SCA mortality. However, in resource-limited settings, cost and infrastructure constraints limit the effectiveness of laboratory-based SCA screening programs. To address this limitation our laboratory previously developed a low-cost, equipment-free, point-of-care, paper-based SCA test. Here, we improved the stability and performance of the test by replacing sodium hydrosulfite (HS), a key reducing agent in the hemoglobin solubility buffer which is not stable in aqueous solutions, with sodium metabisulfite (MS). The MS formulation of the test was compared to the HS formulation in a laboratory setting by inexperienced users (n = 3), to determine visual limit of detection (LOD), readout time, diagnostic accuracy, intra- and inter-observer agreement, and shelf life. The MS test was found to have a 10% sickle hemoglobin LOD, 21-min readout time, 97.3% sensitivity and 99.5% specificity for SCA, almost perfect intra- and inter-observer agreement, and at least 24 weeks of shelf stability at room temperature, and it could be packaged into self-contained, distributable test kits composed of off-the-shelf disposable components and food-grade reagents with a total cost of only $0.21 (USD).
Validation of the Italian version of the Coma Recovery Scale-Revised (CRS-R).

PubMed

Sacco, Simona; Altobelli, Emma; Pistarini, Caterina; Cerone, Davide; Cazzulani, Benedetta; Carolei, Antonio

2011-01-01

To validate the Italian version of the Coma Recovery Scale-Revised (CRS-R). Two observers applied the Italian version of the CRS-R to selected patients. On day 1, observer A and B independently scored each patient; the comparison of their observations was used to evaluate inter-observer agreement. On day 2, observer A completed a second evaluation and the comparison of this observation with that obtained on day 1 by the same observer was used to evaluate test-re-test agreement. For each evaluation, also diagnostic impression (vegetative state/minimally conscious state) was reported. Thirty-eight patients were evaluated (mean age ± SD, 58.9 ± 13.8 years). Inter-observer (ρ = 0.81; p < 0.001) as well as test-re-test agreement (ρ = 0.97; p < 0.001) for the total score was high. Inter-observer agreement was excellent for the communication sub-scale, good for the auditory, visual and motor sub-scales and moderate for the oromotor/verbal and arousal sub-scales. Test-re-test agreement was excellent for the visual, motor, oromotor/verbal and communication sub-scales, good for the auditory sub-scale and moderate for the arousal sub-scale. When considering the diagnostic impression, inter-observer agreement was good (κ = 0.75; p < 0.001) and test-re-test agreement was excellent (κ = 0.92; p < 0.001). The Italian version of the CRS-R can be administered reliably and can be also employed to discriminate patients in vegetative and in minimally conscious state.
Diagnostic accuracy of sequential co-registered PET+MR in comparison to PET/CT in local thoracic staging of malignant pleural mesothelioma.

PubMed

Martini, Katharina; Meier, Andreas; Opitz, Isabelle; Weder, Walter; Veit-Haibach, Patrick; Stahel, Rolf A; Frauenfelder, Thomas

2016-04-01

To investigate the diagnostic accuracy of sequential co-registered PET+MR (PET+MR) for local staging of malignant pleural mesothelioma (MPM) compared to PET/CT. In a prospective clinical trial 34 consecutive patients (median age 66 years; range 40-79 years; 1 female, 33 male) with known MPM, who underwent PET/CT and PET+MR exams for either staging or re-staging/follow-up were evaluated. Imaging was conducted using a tri-modality PET/CT-MR set-up (Discovery PET/CT 690, 3T Discovery MR 750w, both GE Healthcare, Waukesha, WI, USA). In 26 cases histopathology served as standard of reference. Two independent readers evaluated images for T and N stage, confidence level (sure to unsure; 1-3) and subjective overall image quality (very good to non-diagnostic; 1-4). Inter-observer agreement of T and N stages (Cohen's kappa) and interclass correlation coefficient (ICC) between PET/CT vs. PET+MR were calculated. Inter-observer agreement for evaluation of T and N stage in PET/CT images was excellent (k=0.844 and k=0.824, respectively), whereas PET+MR imaging showed substantial agreement in T and N stage (k=0.729 and k=0.691, respectively). The ICC of PET/CT vs. PET+MR for evaluation of both T and N stage was excellent (ICC=0.951 and ICC=0.93, respectively). Diagnostic confidence was scored significantly higher in PET+MR compared to PET/CT (mean score=1.66 and 1.93, respectively; p=0.004). Image quality was diagnostic for all image series. Comparing pT and pN stage vs cT and cN stage (n=26 cases), both imaging modalities showed excellent agreement for T stage (ICC PET+MR=0.888 vs. ICC PET/CT=0.853, respectively) and substantial to moderate agreement for N stage (ICC PET+MR=0.683 vs. ICC PET/CT=0.595, respectively). Our findings suggest that diagnostic accuracy of PET+MR is comparable to PET/CT for local staging of MPM, whereas radiologists felt significantly more confident staging PET+MR compared to PET/CT images (p=0.003), using dedicated sequences.
Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Bicuspid aortic valves: diagnostic accuracy of standard axial 64-slice chest CT compared to aortic valve image plane ECG-gated cardiac CT.

PubMed

Murphy, David J; McEvoy, Sinead H; Iyengar, Sri; Feuchtner, Gudrun; Cury, Ricardo C; Roobottom, Carl; Baumueller, Stephan; Alkadhi, Hatem; Dodd, Jonathan D

2014-08-01

To assess the diagnostic accuracy of standard axial 64-slice chest CT compared to aortic valve image plane ECG-gated cardiac CT for bicuspid aortic valves. The standard axial chest CT scans of 20 patients with known bicuspid aortic valves were blindly, randomly analyzed for (i) the appearance of the valve cusps, (ii) the largest aortic sinus area, (iii) the longest aortic cusp length, (iv) the thickest aortic valve cusp and (v) valve calcification. A second blinded reader independently analyzed the appearance of the valve cusps. Forty-two age- and sex-matched patients with known tricuspid aortic valves were used as controls. Retrospectively ECG-gated cardiac CT multiphase reconstructions of the aortic valve were used as the gold-standard. Fourteen (21%) scans were scored as unevaluable (7 bicuspid, 7 tricuspid). Of the remainder, there were 13 evaluable bicuspid valves, ten of which showed an aortic valve line sign, while the remaining three showed a normal Mercedes-Benz appearance owing to fused valve cusps. The 35 evaluable tricuspid aortic valves all showed a normal Mercedes-Benz appearance (P=0.001). Kappa was 0.62, indicating good interobserver agreement for the aortic valve cusp appearance. Aortic sinus areas, aortic cusp lengths and aortic cusp thicknesses of ≥ 3.8 cm², 3.2 cm and 1.6 mm respectively on standard axial chest CT best distinguished bicuspid from tricuspid aortic valves (P<0.0001 for all). Of evaluable scans, the sensitivity, specificity, positive and negative predictive values of standard axial chest CT in diagnosing bicuspid aortic valves were 77% (CI 0.54-1.0), 100%, 100% and 70% respectively. The aortic valve is evaluable in approximately 80% of standard chest 64-slice CT scans. Bicuspid aortic valves may be diagnosed on evaluable scans with good diagnostic accuracy. An aortic valve line sign, enlarged aortic sinuses and elongated, thickened valve cusps are specific CT features. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
The applied research of MRI with ASSET-EPI-FLAIR combined with 3D TOF MRA sequences in the assessment of patients with acute cerebral infarction.

PubMed

Lin, Zhichao; Guo, Zexiong; Qiu, Lin; Yang, Wanyoug; Lin, Mingxia

2016-12-01

Background To extend the time window for thrombolysis, reducing the time for diagnosis and detection of acute cerebral infarction seems to be warranted. Purpose To evaluate the feasibility of implementing an array spatial sensitivity technique (ASSET)-echo-planar imaging (EPI)-fluid attenuated inversion recovery (FLAIR) (AE-FLAIR) sequence into an acute cerebral infarction magnetic resonance (MR) evaluation protocol, and to assess the diagnostic value of AE-FLAIR combined with three-dimensional time-of-flight MR angiography (3D TOF MRA). Material and Methods A total of 100 patients (68 men, 32 women; age range, 44-82 years) with acute cerebral infarction, including 50 consecutive uncooperative and 50 cooperative patients, were evaluated with T1-weighted (T1W) imaging, T2-weighted (T2W) imaging, FLAIR, diffusion-weighted imaging (DWI), 3D TOF, EPI-FLAIR, and AE-FLAIR. Conventional FLAIR, EPI-FLAIR, and AE-FLAIR were assessed by two observers independently for image quality. The optimized group (AE-FLAIR and 3D TOF) and the control group (T1W imaging, T2W imaging, conventional FLAIR, DWI, and 3D TOF) were compared for evaluation time and diagnostic accuracy. Results One hundred and twenty-five lesions were detected, and adequate diagnostic image quality was obtained in 73% of conventional FLAIR, 62% of EPI-FLAIR, and 89% of AE-FLAIR images. The detection time was 12 ± 1 min with 76% accuracy and 4 ± 0.5 min with 100% accuracy in the control and the optimized groups, respectively. Inter-observer agreement was κ = 0.78 for the optimized group and κ = 0.81 for the control group. Conclusion With reduced acquisition time and better image quality, AE-FLAIR combined with 3D TOF may be used as a rapid diagnosis tool in patients with acute cerebral infarction, especially in uncooperative patients.
Making cytological diagnoses on digital images using the iPath network.

PubMed

Dalquen, Peter; Savic Prince, Spasenija; Spieler, Peter; Kunze, Dietmar; Neumann, Heinrich; Eppenberger-Castori, Serenella; Adams, Heiner; Glatz, Katharina; Bubendorf, Lukas

2014-01-01

The iPath telemedicine platform Basel is mainly used for histological and cytological consultations, but also serves as a valuable learning tool. To study the level of accuracy in making diagnoses based on still images achieved by experienced cytopathologists, to identify limiting factors, and to provide a cytological image series as a learning set. Images from 167 consecutive cytological specimens of different origin were uploaded on the iPath platform and evaluated by four cytopathologists. Only wet-fixed and well-stained specimens were used. The consultants made specific diagnoses and categorized each as benign, suspicious or malignant. For all consultants, specificity and sensitivity regarding categorized diagnoses were 83-92% and 85-93%, respectively; the overall accuracy was 88-90%. The interobserver agreement was substantial (κ = 0.791). The lowest rate of concordance was achieved in urine and bladder washings and in the identification of benign lesions. Using a digital image set for diagnostic purposes implies that even under optimal conditions the accuracy rate will not exceed 80-90%, mainly because of lacking supportive immunocytochemical or molecular tests. This limitation does not disqualify digital images for teleconsulting or as a learning aid. The series of images used for the study are open to the public at http://pathorama.wordpress.com/extragenital-cytology-2013/. © 2014 S. Karger AG, Basel.
Diagnostic accuracy of phosphor plate systems and conventional radiography in the detection of simulated internal root resorption.

PubMed

Vasconcelos, Karla de Faria; Rovaris, Karla; Nascimento, Eduarda Helena Leandro; Oliveira, Matheus Lima; Távora, Débora de Melo; Bóscolo, Frab Norberto

2017-11-01

To evaluate the performance of conventional radiography and photostimulable phosphor (PSP) plate in the detection of simulated internal root resorption (IRR) lesions in early stages. Twenty single-rooted teeth were X-rayed before and after having a simulated IRR early lesion. Three imaging systems were used: Kodak InSight dental film and two PSP digital systems, Digora Optime and VistaScan. The digital images were displayed on a 20.1″ LCD monitor using the native software of each system, and the conventional radiographs were evaluated on a masked light box. Two radiologists were asked to indicate the presence or absence of IRR and, after two weeks, all images were re-evaluated. Cohen's kappa coefficient was calculated to assess intra- and interobserver agreement. The three imaging systems were compared using the Kruskal-Wallis test. For interexaminer agreement, overall kappa values were 0.70, 0.65 and 0.70 for conventional film, Digora Optime and VistaScan, respectively. Both the conventional and digital radiography presented low sensitivity, specificity, accuracy, positive and negative predictive values with no significant difference between imaging systems (p = .0725). The performance of conventional and PSP was similar in the detection of simulated IRR lesions in early stages with low accuracy.
Atlas-based segmentation technique incorporating inter-observer delineation uncertainty for whole breast

NASA Astrophysics Data System (ADS)

Bell, L. R.; Dowling, J. A.; Pogson, E. M.; Metcalfe, P.; Holloway, L.

2017-01-01

Accurate, efficient auto-segmentation methods are essential for the clinical efficacy of adaptive radiotherapy delivered with highly conformal techniques. Current atlas-based auto-segmentation techniques are adequate in this respect but fail to account for inter-observer variation. An atlas-based segmentation method that incorporates inter-observer variation is proposed. This method is validated for a whole breast radiotherapy cohort containing 28 CT datasets with CTVs delineated by eight observers. To optimise atlas accuracy, the cohort was divided into categories by mean body mass index and laterality, with atlases generated for each in a leave-one-out approach. Observer CTVs were merged and thresholded to generate an auto-segmentation model representing both inter-observer and inter-patient differences. For each category, the atlas was registered to the left-out dataset to enable propagation of the auto-segmentation from atlas space. Auto-segmentation time was recorded. The segmentation was compared to the gold-standard contour using the Dice similarity coefficient (DSC) and mean absolute surface distance (MASD). Comparison with the smallest and largest CTV was also made. This atlas-based auto-segmentation method incorporating inter-observer variation was shown to be efficient (<4 min) and accurate for whole breast radiotherapy, with good agreement (DSC > 0.7, MASD < 9.3 mm) between the auto-segmented contours and CTV volumes.
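For readers unfamiliar with the Dice similarity coefficient used in the abstract above, a minimal Python sketch follows. It assumes each segmentation is represented as a set of voxel indices. The function name `dice_coefficient` and the toy voxel sets are hypothetical and are not taken from the paper.

```python
def dice_coefficient(seg_a, seg_b):
    """Dice similarity coefficient: 2 * |A ∩ B| / (|A| + |B|).

    seg_a and seg_b are sets of voxel indices belonging to each contour.
    Returns 1.0 for identical volumes and 0.0 for disjoint ones.
    """
    if not seg_a and not seg_b:
        return 1.0  # convention chosen here: two empty volumes agree fully
    overlap = len(seg_a & seg_b)
    return 2 * overlap / (len(seg_a) + len(seg_b))


# Toy example: two contours of 4 voxels each that share 2 voxels.
auto_contour = {(0, 0, 1), (0, 0, 2), (0, 1, 1), (0, 1, 2)}
gold_contour = {(0, 1, 1), (0, 1, 2), (0, 2, 1), (0, 2, 2)}
print(dice_coefficient(auto_contour, gold_contour))
```

DSC measures volume overlap only, which is presumably why the abstract pairs it with a surface-distance measure (MASD) that is sensitive to boundary errors.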
ROC analysis of diagnostic performance in liver scintigraphy.

PubMed

Fritz, S L; Preston, D F; Gallagher, J H

1981-02-01

Studies on the accuracy of liver scintigraphy for the detection of metastases were assembled from 38 sources in the medical literature. An ROC curve was fitted to the observed values of sensitivity and specificity using an algorithm developed by Ogilvie and Creelman. This ROC curve fitted the data better than average sensitivity and specificity values in each of four subsets of the data. For the subset dealing with Tc-99m sulfur colloid scintigraphy, performed for detection of suspected metastases and containing data on 2800 scans from 17 independent series, it was not possible to reject the hypothesis that interobserver variation was entirely due to the use of different decision thresholds by the reporting clinicians. Thus the ROC curve obtained is a reasonable baseline estimate of the performance potentially achievable in today's clinical setting. Comparison of new reports with these data is possible, but is limited by the small sample sizes in most reported series.
Diagnostic performance and reproducibility of T2w based and diffusion weighted imaging (DWI) based PI-RADSv2 lexicon descriptors for prostate MRI.

PubMed

Benndorf, Matthias; Hahn, Felix; Krönig, Malte; Jilg, Cordula Annette; Krauss, Tobias; Langer, Mathias; Dovi-Akué, Philippe

2017-08-01

To examine the diagnostic performance of PI-RADSv2 T2w and diffusion weighted imaging (DWI) based lexicon descriptors, inter-observer agreement for descriptor assignment and diagnostic accuracy of the PI-RADSv2 assessment categories for multiparametric prostate MRI. 176 lesions in 79 consecutive patients are analyzed, lesions are histopathologically verified by MRI-ultrasound fusion biopsy. All lesions are rated according to the PI-RADSv2 lexicon, descriptors for T2w and DWI sequences and resulting assessment categories are assigned by two independent blinded radiologists. We perform receiver-operating-characteristic analysis using the assessment categories. To analyze inter-observer agreement, we calculate weighted kappa values for assessment category assignment and unweighted kappa values for descriptor assignment. PI-RADSv2 assessment categories yield an area under the curve of 0.76/0.74 (radiologist 1/radiologist 2), P >0.05. Weighted kappa for agreement is 0.601 in the peripheral zone and 0.580 in the transition zone. We detect a difference in the cancer rate for PI-RADSv2 category 3 between peripheral zone (32%) and transition zone (12%), P <0.05. We obtain moderate agreement at most for descriptor assignment with kappa values ranging from 0.082 (T2w shape in the transition zone) to 0.407 (T2w signal intensity in the peripheral zone) and 0.493 (ADC pattern in the peripheral zone). Our analysis corroborates typical descriptors for benign/malignant lesions, but also reveals insights into potential pitfalls - T2w wedge shaped lesions in the peripheral zone have a considerable cancer rate, despite being labelled category 2 in the lexicon. Agreement for descriptor assignment in the PI-RADSv2 lexicon is at most moderate in our study. Typical descriptors for benign and malignant lesions are validated, whereas the discriminatory power of some descriptors is challenged. 
The difference in the cancer rate for PI-RADSv2 category 3 between peripheral zone and transition zone should be considered when management recommendations are linked to assessment categories in the future. Copyright © 2017 Elsevier B.V. All rights reserved.
Performance of unenhanced respiratory-gated 3D SSFP MRA to depict hepatic and visceral artery anatomy and variants.

PubMed

Puippe, Gilbert D; Alkadhi, Hatem; Hunziker, Roger; Nanz, Daniel; Pfammatter, Thomas; Baumueller, Stephan

2012-08-01

To prospectively evaluate the performance of unenhanced respiratory-gated magnetization-prepared 3D-SSFP inversion recovery MRA (unenhanced-MRA) to depict hepatic and visceral artery anatomy and variants in comparison to contrast-enhanced dynamic gradient-echo MRI (CE-MRI) and to digital subtraction angiography (DSA). Eighty-four patients (55.6±12.4 years) were imaged with CE-MRI (TR/TE 3.5/1.7ms, TI 1.7ms, flip-angle 15°) and unenhanced-MRA (TR/TE 4.4/2.2ms, TI 200ms, flip-angle 90°). Two independent readers assessed image quality of hepatic and visceral arteries on a 4-point-scale. Vessel contrast was measured by a third reader. In 28 patients arterial anatomy was compared to DSA. Interobserver agreement regarding image quality was good for CE-MRI (κ=0.77) and excellent for unenhanced-MRA (κ=0.83). Unenhanced-MRA yielded diagnostic image quality in 71.6% of all vessels, whereas CE-MRI provided diagnostic image quality in 90.6% (p<0.001). Vessel-based image quality was significantly superior for all vessels at CE-MRI compared to unenhanced-MRA (p<0.01). Vessel contrast was similar among both sequences (p=0.15). Compared to DSA, CE-MRI and unenhanced-MRA yielded equal accuracy of 92.9-96.4% for depiction of hepatic and visceral artery variants (p=0.93). Unenhanced-MRA provides diagnostic image quality in 72% of hepatic and visceral arteries with no significant difference in vessel contrast and similar accuracy to CE-MRI for depiction of hepatic and visceral anatomy. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Is computed tomography an accurate and reliable method for measuring total knee arthroplasty component rotation?

PubMed

Figueroa, José; Guarachi, Juan Pablo; Matas, José; Arnander, Magnus; Orrego, Mario

2016-04-01

Computed tomography (CT) is widely used to assess component rotation in patients with poor results after total knee arthroplasty (TKA). The purpose of this study was to simultaneously determine the accuracy and reliability of CT in measuring TKA component rotation. TKA components were implanted in dry-bone models and assigned to two groups. The first group (n = 7) had variable femoral component rotations, and the second group (n = 6) had variable tibial tray rotations. CT images were then used to assess component rotation. Accuracy of CT rotational assessment was determined by mean difference, in degrees, between implanted component rotation and CT-measured rotation. Intraclass correlation coefficient (ICC) was applied to determine intra-observer and inter-observer reliability. Femoral component accuracy showed a mean difference of 2.5° and the tibial tray a mean difference of 3.2°. There was good intra- and inter-observer reliability for both components, with a femoral ICC of 0.8 and 0.76, and tibial ICC of 0.68 and 0.65, respectively. CT rotational assessment accuracy can differ from true component rotation by approximately 3° for each component. It does, however, have good inter- and intra-observer reliability.

Diffusion-weighted magnetic resonance imaging in the characterization of testicular germ cell neoplasms: Effect of ROI methods on apparent diffusion coefficient values and interobserver variability.

PubMed

Tsili, Athina C; Ntorkou, Alexandra; Astrakas, Loukas; Xydis, Vasilis; Tsampalas, Stavros; Sofikitis, Nikolaos; Argyropoulou, Maria I

2017-04-01

To evaluate the difference in apparent diffusion coefficient (ADC) measurements at diffusion-weighted (DW) magnetic resonance imaging of differently shaped regions-of-interest (ROIs) in testicular germ cell neoplasms (TGCNs), the diagnostic ability of differently shaped ROIs in differentiating seminomas from nonseminomatous germ cell neoplasms (NSGCNs) and the interobserver variability. Thirty-three TGCNs were retrospectively evaluated. Patients underwent MR examinations, including DWI on a 1.5-T MR system. Two observers measured mean tumor ADCs using four distinct ROI methods: round, square, freehand and multiple small, round ROIs. The interclass correlation coefficient was analyzed to assess interobserver variability. Statistical analysis was used to compare mean ADC measurements among observers, methods and histologic types. All ROI methods showed excellent interobserver agreement, with excellent correlation (P<0.001). Multiple, small ROIs provided the lowest mean ADC in TGCNs. Seminomas had lower mean ADC compared to NSGCNs for each ROI method (P<0.001). Round ROI proved the most accurate method in characterizing TGCNs. Interobserver agreement in ADC measurement is excellent, irrespective of the ROI shape. Multiple, small round ROIs and round ROI proved the more accurate methods for ADC measurement in the characterization of TGCNs and in the differentiation between seminomas and NSGCNs, respectively. Copyright © 2017 Elsevier B.V. All rights reserved.
Computed tomography detection of extracapsular spread of squamous cell carcinoma of the head and neck in metastatic cervical lymph nodes.

PubMed

Carlton, Joshua A; Maxwell, Adam W; Bauer, Lyndsey B; McElroy, Sara M; Layfield, Lester J; Ahsan, Humera; Agarwal, Ajay

2017-06-01

Background and purpose In patients with squamous cell carcinoma of the head and neck (HNSCC), extracapsular spread (ECS) of metastases in cervical lymph nodes affects prognosis and therapy. We assessed the accuracy of intravenous contrast-enhanced computed tomography (CT) and the utility of imaging criteria for preoperative detection of ECS in metastatic cervical lymph nodes in patients with HNSCC. Materials and methods Preoperative intravenous contrast-enhanced neck CT images of 93 patients with histopathological HNSCC metastatic nodes were retrospectively assessed by two neuroradiologists for ECS status and ECS imaging criteria. Radiological assessments were compared with histopathological assessments of neck dissection specimens, and interobserver agreement on ECS status and ECS imaging criteria was measured. Results Sensitivity, specificity, positive predictive value, and accuracy for overall ECS assessment were 57%, 81%, 82% and 67% for observer 1, and 66%, 76%, 80% and 70% for observer 2, respectively. Correlating three or more ECS imaging criteria with histopathological ECS increased specificity and positive predictive value, but decreased sensitivity and accuracy. Interobserver agreement for overall ECS assessment demonstrated a kappa of 0.59. Central necrosis had the highest kappa of 0.74. Conclusion CT has moderate specificity for ECS assessment in HNSCC metastatic cervical nodes. Identifying three or more ECS imaging criteria raises specificity and positive predictive value; therefore, preoperative identification of multiple criteria may be clinically useful. Interobserver agreement is moderate for overall ECS assessment and substantial for central necrosis. Other ECS CT criteria had moderate agreement at best and therefore should not be used individually as criteria for detecting ECS by CT.
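The accuracy measures reported in the record above (sensitivity, specificity, positive predictive value, accuracy) are all ratios of the four cells of a 2x2 table that compares a reader's call with the reference standard. The following is a minimal sketch of that arithmetic. The function name and the counts are hypothetical and are not taken from the study, which does not report its raw cell counts here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard accuracy measures from the four cells of a 2x2 table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against the reference standard.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly excluded
        "ppv": tp / (tp + fp),           # positive calls that are correct
        "npv": tn / (tn + fn),           # negative calls that are correct
        "accuracy": (tp + tn) / total,   # all calls that are correct
    }


# Hypothetical counts for illustration only.
example = diagnostic_metrics(tp=30, fp=5, fn=20, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

A function like this raises ZeroDivisionError when a row or column of the table is empty, which is usually the desired signal that the measure is undefined for that sample.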
Articular cartilage grading of the knee: diagnostic performance of fat-suppressed 3D volume isotropic turbo spin-echo acquisition (VISTA) compared with 3D T1 high-resolution isovolumetric examination (THRIVE).

PubMed

Lee, Young Han; Hahn, Seok; Lim, Daekeon; Suh, Jin-Suck

2017-02-01

Background Conventionally, two-dimensional (2D) fast spin-echo (FSE) sequences have been widely used for clinical cartilage imaging as well as gradient (GRE) sequences. Recently, three-dimensional (3D) volumetric magnetic resonance imaging (MRI) has been introduced with one 3D volumetric scan, and this is replacing slice-by-slice 2D MR scans. Purpose To evaluate the image quality and diagnostic performance of two 3D sequences for abnormalities of knee cartilage: fat-suppressed (FS) FSE-based 3D volume isotropic turbo spin-echo acquisition (VISTA) and GRE-based 3D T1 high-resolution isovolumetric examination (THRIVE). Material and Methods The institutional review board approved the protocol of this retrospective review. This study enrolled 40 patients (41 knees) with arthroscopically confirmed abnormalities of cartilage. All patients underwent isovoxel 3D-VISTA and 3D-THRIVE MR sequences on 3T MRI. We assessed the cartilage grade on the two 3D sequences using arthroscopy as a gold standard. Inter-observer agreement for each technique was evaluated with the intraclass correlation coefficient (ICC). Differences in the area under the curve (AUC) were compared between the 3D-THRIVE and 3D-VISTA. Results Although inter-observer agreement for both sequences was excellent, the inter-observer agreement for 3D-VISTA was higher than for 3D-THRIVE for cartilage grading in all regions of the knee. There was no significant difference in the diagnostic performance ( P > 0.05) between the two sequences for detecting cartilage grade. Conclusion FSE-based 3D-VISTA images had good diagnostic performance that was comparable to GRE-based 3D-THRIVE images in the evaluation of knee cartilage, and can be used in routine knee MR protocols for the evaluation of cartilage.
[Comparison of the diagnostic utility from visual inspection with acetic acid and cervical cytology].

PubMed

Velázquez-Hernández, Nadia; Sánchez-Anguiano, Luis Francisco; Lares-Bayona, Edgar Felipe; Cisneros-Pérez, Vicente; Milla-Villeda, Reinaldo Humberto; Arreola-Herrera, Francisco de Asís; Navarrete-Flores, José Antonio; Aguilar-Durán, Maricela; Núñez-Márquez, Teresita; Rueda-Cisneros, Dora Alicia

2010-05-01

In Mexico, cervical cancer is the second leading cause of death in women after breast cancer. The human papillomavirus is associated with intraepithelial lesions and is detected in up to 99.7% of cervical carcinomas. Despite being easy to detect, it is a condition that many women suffer from. The objective was to determine the diagnostic utility of visual inspection of the uterine cervix with acetic acid compared with cervical cytology. This was a diagnostic test study. It was conducted at the Centro de Atención Materno Infantil y Planificación Familiar of the Instituto de Investigación Científica of the Juárez University of the State of Durango, Durango, Mexico, from August 23, 2005 to November 13, 2006. A total of 1,521 participants who attended consecutively for early detection of cervical cancer were examined. One physician performed the acetic acid test and cervical cytology on each participant and took one digital photograph, which was evaluated by three blinded observers. Participants who were positive on any of these tests were referred for colposcopy and/or biopsy; the same procedure was performed in a randomly selected 10% of the negative population. Sensitivity, specificity, positive and negative predictive values, and accuracy were determined. The kappa index was used for interobserver agreement. Sensitivity, specificity, positive and negative predictive values, and accuracy for visual inspection with acetic acid were 20, 97, 5 and 99%, respectively. For cervical cytology they were 80, 99, 57 and 99%, respectively. The strength of agreement among the observers was poor. In this study, cervical cytology was more useful than visual inspection with acetic acid for the timely detection of dysplasia or cervical cancer, because it detected all the true-positive cases confirmed by biopsy.
Clinical application of qualitative assessment for breast masses in shear-wave elastography.

PubMed

Gweon, Hye Mi; Youk, Ji Hyun; Son, Eun Ju; Kim, Jeong-Ah

2013-11-01

To evaluate the interobserver agreement and the diagnostic performance of various qualitative features in shear-wave elastography (SWE) for breast masses. A total of 153 breast lesions in 152 women who underwent B-mode ultrasound and SWE before biopsy were included. Qualitative analysis in SWE was performed using two different classifications: E values (Ecol; 6-point color score, Ehomo; homogeneity score and Esha; shape score) and a four-color pattern classification. Two radiologists reviewed five data sets: B-mode ultrasound, SWE, and combination of both for E values and four-color pattern. The BI-RADS categories were assessed for the B-mode and combined sets. Interobserver agreement was assessed using weighted κ statistics. Areas under the receiver operating characteristic curve (AUC), sensitivity, and specificity were analyzed. Interobserver agreement was substantial for Ecol (κ=0.79), Ehomo (κ=0.77) and four-color pattern (κ=0.64), and moderate for Esha (κ=0.56). Better-performing qualitative features were Ecol and four-color pattern (AUCs, 0.932 and 0.925) compared with Ehomo and Esha (AUCs, 0.857 and 0.864; P<0.05). The diagnostic performance of B-mode ultrasound (AUC, 0.950) was not significantly different from combined sets with E value and with four-color pattern (AUCs, 0.962 and 0.954). When all qualitative values were negative, leading to a downgrade of the BI-RADS category, the specificity increased significantly from 16.5% to 56.1% (E value) and 57.0% (four-color pattern) (P<0.001) without improvement in sensitivity. The qualitative SWE features were highly reproducible and showed good diagnostic performance in suspicious breast masses. Adding qualitative SWE to B-mode ultrasound increased specificity in decision making for biopsy recommendation. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Bronchoscopic Lung Cryobiopsy Increases Diagnostic Confidence in the Multidisciplinary Diagnosis of Idiopathic Pulmonary Fibrosis.

PubMed

Tomassetti, Sara; Wells, Athol U; Costabel, Ulrich; Cavazza, Alberto; Colby, Thomas V; Rossi, Giulio; Sverzellati, Nicola; Carloni, Angelo; Carretta, Elisa; Buccioli, Matteo; Tantalocco, Paola; Ravaglia, Claudia; Gurioli, Christian; Dubini, Alessandra; Piciucchi, Sara; Ryu, Jay H; Poletti, Venerino

2016-04-01

Surgical lung biopsy is often required for a confident multidisciplinary diagnosis of idiopathic pulmonary fibrosis (IPF). Alternative, less-invasive biopsy methods, such as bronchoscopic lung cryobiopsy (BLC), are highly desirable. To address the impact of BLC on diagnostic confidence in the multidisciplinary diagnosis of IPF. In this cross-sectional study we selected 117 patients with fibrotic interstitial lung disease without a typical usual interstitial pneumonia pattern on high-resolution computed tomography. All cases underwent lung biopsies: 58 were BLC, and 59 were surgical lung biopsy (SLB). Two clinicians, two radiologists, and two pathologists sequentially reviewed clinical-radiologic findings and biopsy results, recording at each step in the process their diagnostic impressions and confidence levels. We observed a major increase in diagnostic confidence after the addition of BLC, similar to SLB (from 29 to 63%, P = 0.0003 and from 30 to 65%, P = 0.0016 of high confidence IPF diagnosis, in the BLC group and SLB group, respectively). The overall interobserver agreement in IPF diagnosis was similar for both approaches (BLC overall kappa, 0.96; SLB overall kappa, 0.93). IPF was the most frequent diagnosis (50 and 39% in the BLC and SLB group, respectively; P = 0.23). After the addition of histopathologic information, 17% of cases in the BLC group and 19% of cases in the SLB group, mostly idiopathic nonspecific interstitial pneumonia and hypersensitivity pneumonitis, were reclassified as IPF. BLC is a new biopsy method that has a meaningful impact on diagnostic confidence in the multidisciplinary diagnosis of interstitial lung disease and may prove useful in the diagnosis of IPF. This study provides a robust rationale for future studies investigating the diagnostic accuracy of BLC compared with SLB.
Can routine chest radiography be used to diagnose mild COPD? A nested case-control study.

PubMed

den Harder, A M; Snoek, A M; Leiner, T; Suyker, W J; de Heer, L M; Budde, R P J; Lammers, J W J; de Jong, P A; Gondrie, M J A

2017-07-01

To determine whether mild stage chronic obstructive pulmonary disease (COPD) can be detected on chest radiography without substantial overdiagnosis. A retrospective nested case-control study (case:control, 1:1) was performed in 783 patients scheduled for cardiothoracic surgery who underwent both spirometry and a chest radiograph preoperatively. Diagnostic accuracy of chest radiography for diagnosing mild COPD was investigated using objective measurements and overall appearance specific for COPD on chest radiography. Inter-observer variability was investigated and variables with a kappa >0.40 as well as baseline characteristics were used to make a diagnostic model which was aimed at achieving a high positive predictive value (PPV). Twenty percent (155/783) had COPD. The PPV of overall appearance specific for COPD alone was low (37-55%). Factors in the diagnostic model were age, type of surgery, gender, distance of the right diaphragm apex to the first rib, retrosternal space, sternodiaphragmatic angle, maximum height right diaphragm (lateral view) and subjective impression of COPD (using both views). The model resulted in a PPV of 100%, negative predictive value (NPV) of 82%, sensitivity of 10% and specificity of 100% with an area under the curve of 0.811. Detection of mild COPD without substantial overdiagnosis was not feasible on chest radiographs in our cohort. Copyright © 2017 Elsevier B.V. All rights reserved.
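The predictive values quoted in the record above depend on prevalence as well as on sensitivity and specificity. The sketch below applies Bayes' rule to the rounded figures from the abstract (sensitivity 10%, specificity 100%, prevalence 20%) to show how such numbers relate to each other. The function name is hypothetical, and whether the authors derived their values this way is an assumption; the abstract does not say.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    All inputs are proportions between 0 and 1. The four products below
    are the expected fractions of the population in each 2x2 cell.
    """
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)   # undefined (division by zero) if nothing tests positive
    npv = tn / (tn + fn)   # undefined (division by zero) if nothing tests negative
    return ppv, npv


# Rounded inputs taken from the abstract; the result is illustrative only.
ppv, npv = predictive_values(sensitivity=0.10, specificity=1.00, prevalence=0.20)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

With perfect specificity there are no false positives, so the PPV is 100% regardless of how few cases are detected, which is why a very high PPV can coexist with a sensitivity of only 10%.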
PI-RADS v2: Current standing and future outlook.

PubMed

Smith, Clayton P; Türkbey, Barış

2018-05-01

The Prostate Imaging-Reporting and Data System (PI-RADS) was created in 2012 to establish standardization in prostate multiparametric magnetic resonance imaging (mpMRI) acquisition, interpretation, and reporting. In hopes of improving upon some of the PI-RADS v1 shortcomings, the PI-RADS Steering Committee released PI-RADS v2 in 2015. This paper reviews the accuracy, interobserver agreement, and clinical outcomes of PI-RADS v2 and comments on the limitations of the current literature. Overall, PI-RADS v2 shows improved sensitivity and similar specificity compared to PI-RADS v1. However, concerns exist regarding interobserver agreement and the heterogeneity of the study methodology.
PI-RADS v2: Current standing and future outlook

PubMed Central

Smith, Clayton P.

2018-01-01

The Prostate Imaging-Reporting and Data System (PI-RADS) was created in 2012 to establish standardization in prostate multiparametric magnetic resonance imaging (mpMRI) acquisition, interpretation, and reporting. In hopes of improving upon some of the PI-RADS v1 shortcomings, the PI-RADS Steering Committee released PI-RADS v2 in 2015. This paper reviews the accuracy, interobserver agreement, and clinical outcomes of PI-RADS v2 and comments on the limitations of the current literature. Overall, PI-RADS v2 shows improved sensitivity and similar specificity compared to PI-RADS v1. However, concerns exist regarding interobserver agreement and the heterogeneity of the study methodology. PMID:29733790
Interobserver agreement on Poser's and the new McDonald's diagnostic criteria for multiple sclerosis.

PubMed

Zipoli, V; Portaccio, E; Siracusa, G; Pracucci, G; Sorbi, S; Amato, M P

2003-10-01

We assessed the interobserver agreement on the diagnosis of multiple sclerosis (MS) in a study sample consisting of 41 MS (15 relapsing remitting, two secondary progressive, five primary progressive and 19 presenting their first clinical attack) and three non-MS cases. Clinical and paraclinical information was recorded in standardized forms. Four neurologists were asked to make a diagnosis using Poser's and McDonald's criteria and to assess MRI scans according to the McDonald's guidelines. In terms of the kappa statistic (kappa), we found a moderate agreement on the overall diagnosis using both Poser's and McDonald's criteria (kappa, respectively 0.57 and 0.52). As for distinct diagnostic categories, we observed a moderate to substantial agreement for the three McDonald categories (range of kappa values 0.49-0.64) and a fair to substantial agreement for the nine Poser categories (range of kappa values 0.37-0.67). Taking into account clinical information, the agreement on dissemination over time was substantially higher (kappa = 0.69) than that found on dissemination over space (kappa = 0.46). In contrast, for MRI assessment, the agreement for spatial dissemination was substantial (kappa = 0.74) compared with the fair agreement (kappa = 0.25) yielded by dissemination over time. The new McDonald's criteria yield a good overall diagnostic reliability, and compare favourably with Poser's classification in terms of agreement on distinct diagnostic categories.
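Kappa measures agreement beyond what would be expected by chance. The sketch below computes Cohen's kappa for two raters and attaches a verbal label. It is a simplification: the study above involved four neurologists, so a multi-rater statistic was presumably used. The cut-offs follow the widely cited Landis and Koch bands, which appear consistent with the labels in the abstract, but that mapping is an assumption. Function names and ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal label for kappa (assumed Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings (1 = MS, 0 = not MS) for illustration only.
k = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
print(round(k, 2), landis_koch_label(k))
```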
Accuracy and reliability testing of two methods to measure internal rotation of the glenohumeral joint.

PubMed

Hall, Justin M; Azar, Frederick M; Miller, Robert H; Smith, Richard; Throckmorton, Thomas W

2014-09-01

We compared accuracy and reliability of a traditional method of measurement (most cephalad vertebral spinous process that can be reached by a patient with the extended thumb) to estimates made with the shoulder in abduction to determine if there were differences between the two methods. Six physicians with fellowship training in sports medicine or shoulder surgery estimated measurements in 48 healthy volunteers. Three were randomly chosen to make estimates of both internal rotation measurements for each volunteer. An independent observer made objective measurements on lateral scoliosis films (spinous process method) or with a goniometer (abduction method). Examiners were blinded to objective measurements as well as to previous estimates. Intraclass coefficients for interobserver reliability for the traditional method averaged 0.75, indicating good agreement among observers. The difference in vertebral level estimated by the examiner and the actual radiographic level averaged 1.8 levels. The intraclass coefficient for interobserver reliability for the abduction method averaged 0.81 for all examiners, indicating near-perfect agreement. Confidence intervals indicated that estimates were an average of 8° different from the objective goniometer measurements. Pearson correlation coefficients of intraobserver reliability for the abduction method averaged 0.94, indicating near-perfect agreement within observers. Confidence intervals demonstrated repeated estimates between 5° and 10° of the original. Internal rotation estimates made with the shoulder abducted demonstrated interobserver reliability superior to that of spinous process estimates, and reproducibility was high. On the basis of this finding, we now take glenohumeral internal rotation measurements with the shoulder in abduction and use a goniometer to maximize accuracy and objectivity. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
Dermoscopic patterns of Melanoma Metastases: inter-observer consistency and accuracy for metastases recognition

PubMed Central

Costa, J.; Ortiz-Ibañez, K.; Salerni, G.; Borges, V.; Carrera, C.; Puig, S.; Malvehy, J.

2013-01-01

Background Cutaneous metastases of malignant melanoma (CMMM) can be confused with other skin lesions. Dermoscopy could be helpful in the differential diagnosis. Objective To describe distinctive dermoscopic patterns that are reproducible and accurate in the identification of CMMM. Methods A retrospective study of 146 dermoscopic images of CMMM from 42 patients attending a Melanoma Unit between 2002 and 2009 was performed. Firstly, two investigators established six dermoscopic patterns for CMMM. The correlation of 73 dermoscopic images with their distinctive patterns was assessed by four independent dermatologists to evaluate the reproducibility in the identification of the patterns. Finally, 163 dermoscopic images, including CMMM and non-metastatic lesions, were evaluated by the same four dermatologists to calculate the accuracy of the patterns in the recognition of CMMM. Results Five CMMM dermoscopic patterns had a good inter-observer agreement (blue nevus-like, nevus-like, angioma-like, vascular and unspecific). When CMMM were classified according to these patterns, correlation between the investigators and the four dermatologists ranged from κ = 0.56 to 0.7. 71 CMMM, 16 angiomas, 22 blue nevi, 15 malignant melanomas, 11 seborrheic keratoses, 15 melanocytic nevi with globular pattern and 13 pink lesions with vascular pattern were evaluated according to the previously described CMMM dermoscopy patterns, showing an overall sensitivity of 68% (range 54.9–76%) and a specificity of 81% (range 68.6–93.5%) for the diagnosis of CMMM. Conclusion Five dermoscopic patterns of CMMM with good inter-observer agreement obtained a high sensitivity and specificity in the diagnosis of metastasis, the accuracy varying according to the experience of the observer. PMID:23495915
Can we improve accuracy and reliability of MRI interpretation in children with optic pathway glioma? Proposal for a reproducible imaging classification.

PubMed

Lambron, Julien; Rakotonjanahary, Josué; Loisel, Didier; Frampas, Eric; De Carli, Emilie; Delion, Matthieu; Rialland, Xavier; Toulgoat, Frédérique

2016-02-01

Magnetic resonance (MR) images from children with optic pathway glioma (OPG) are complex. We initiated this study to evaluate the accuracy of MR imaging (MRI) interpretation and to propose a simple and reproducible imaging classification for MRI. We randomly selected 140 MRIs from among 510 MRIs performed on 104 children diagnosed with OPG in France from 1990 to 2004. These images were reviewed independently by three radiologists (F.T., 15 years of experience in neuroradiology; D.L., 25 years of experience in pediatric radiology; and J.L., 3 years of experience in radiology) using a classification derived from the Dodge and modified Dodge classifications. Intra- and interobserver reliabilities were assessed using the Bland-Altman method and the kappa coefficient. These reviews allowed the definition of reliable criteria for MRI interpretation. The reviews showed intraobserver variability and large discrepancies among the three radiologists (kappa coefficient varying from 0.11 to 1). These variabilities were too large for the interpretation to be considered reproducible over time or among observers. A consensual analysis, taking into account all observed variabilities, allowed the development of a definitive interpretation protocol. Using this revised protocol, we observed consistent intra- and interobserver results (kappa coefficient varying from 0.56 to 1). The mean interobserver difference for the solid portion of the tumor with contrast enhancement was 0.8 cm³ (limits of agreement = -16 to 17). We propose simple and precise rules for improving the accuracy and reliability of MRI interpretation for children with OPG. Further studies will be necessary to investigate the possible prognostic value of this approach.
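The Bland-Altman method named in the record above summarises paired measurements by the mean difference (bias) and the limits of agreement around it. The sketch below uses the conventional definition, bias plus or minus 1.96 sample standard deviations of the differences. The abstract does not state which variant the authors used, so that choice is an assumption, and the function name and volumes are hypothetical.

```python
import statistics


def bland_altman(measure_a, measure_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * SD of the paired differences,
    using the sample standard deviation.
    """
    if len(measure_a) != len(measure_b) or len(measure_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(measure_a, measure_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical tumor volumes (cm^3) from two readers, for illustration only.
bias, lower, upper = bland_altman([10, 12, 14], [9, 12, 15])
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

Wide limits relative to the quantity being measured indicate that two readers can disagree substantially on an individual case even when the average difference is close to zero.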
Assessment of the intraobserver and interobserver reliability of a communicating vessels volumeter to measure wrist-hand volume.

PubMed

de Carvalho, Rogério Mendonca; Perez, Maria Del Carmen Janerio; Miranda, Fausto

2012-10-01

Traditional volumetry based on Archimedes' principle is the gold standard for the measurement of limb volume, but the routine use of this technique is discouraged because of several disadvantages. The purpose of this study was to evaluate intraobserver and interobserver reliability of direct measurements of wrist-hand volume using a new communicating vessels volumeter based on Pascal's law. A reliability study was conducted. To evaluate the reliability of the communicating vessels volumeter in generating measurements, 30 hands of 15 participants (9 women, 6 men) were measured 3 times each by 3 observers, totaling 270 volumetric results. Measurement time was short (mean, 3 minutes 42 seconds). The intraclass correlation coefficient (ICC) was .9977 for observer 1 and .9976 for observers 2 and 3. The interobserver ICC was .9998. The standard error of measurement was about 3 mL for all observers; the interobserver result was 1 mL. The interrater coefficient of variation (CV) was 1.15% for the series of 9 measurements collected for each segment; the intrarater CV was 1.20%. Limitations No swollen hands were measured, and measurements were not compared with the gold standard technique. Thus, accuracy of the new volumeter was not determined in this study. A new device has been developed for plethysmography of the extremities, and the results of its use to measure the volume of the wrist-hand segment were reliable in both intraobserver and interobserver analyses.
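The record above reports a standard error of measurement (SEM) and a coefficient of variation alongside the ICC. The sketch below shows one common way these quantities are related: SEM as the standard deviation multiplied by the square root of (1 - ICC), and the coefficient of variation as the standard deviation divided by the mean. The abstract does not give the formulas the authors used, so both definitions are assumptions, and the function names and values are hypothetical.

```python
import math
import statistics


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), one common definition (assumed here)."""
    if not 0 <= icc <= 1:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1 - icc)


def coefficient_of_variation(values):
    """Sample SD as a percentage of the mean of repeated measurements."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical numbers for illustration only (volumes in mL).
print(round(standard_error_of_measurement(sd=10.0, icc=0.99), 2))
print(round(coefficient_of_variation([9, 10, 11]), 1))
```

Under this definition a very high ICC yields a small SEM even when between-subject spread is large, which is consistent in direction with the small SEM values in the abstract.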
Spectrally Encoded Confocal Microscopy (SECM) for Diagnosing of Breast Cancer in Excision and Margin Specimens

PubMed Central

Brachtel, Elena F.; Johnson, Nicole B.; Huck, Amelia E.; Rice-Stitt, Travis L.; Vangel, Mark G.; Smith, Barbara L.; Tearney, Guillermo J.; Kang, Dongkyun

2016-01-01

A large percentage of breast cancer patients treated with breast conserving surgery need to undergo multiple surgeries due to positive margins found during post-operative margin assessment. Carcinomas could be removed completely during the initial surgery and additional surgery avoided if positive margins can be determined intra-operatively. Spectrally-encoded confocal microscopy (SECM) is a high-speed reflectance confocal microscopy technology that has a potential to rapidly image the entire surgical margin at sub-cellular resolution and accurately determine margin status intra-operatively. In this paper, in order to test feasibility of using SECM for intra-operative margin assessment, we have evaluated the diagnostic accuracy of SECM for detecting various types of breast cancers. Forty-six surgically-removed breast specimens were imaged with a SECM system. Side-by-side comparison between SECM and histologic images showed that SECM images can visualize key histomorphologic patterns of normal/benign and malignant breast tissues. Small (500 µm × 500 µm) spatially-registered SECM and histologic images (n=124 for each) were diagnosed independently by three pathologists with expertise in breast pathology. Diagnostic accuracy of SECM for determining malignant tissues was high, average sensitivity of 0.91, specificity of 0.93, positive predictive value of 0.95, and negative predictive value of 0.87. Intra-observer agreement and inter-observer agreement for SECM were also high, 0.87 and 0.84, respectively. Results from this study suggest that SECM may be developed into an intra-operative margin assessment tool for guiding breast cancer excisions. PMID:26779830
Anastomotic leakage after colorectal surgery: diagnostic accuracy of CT.

PubMed

Kauv, Paul; Benadjaoud, Samir; Curis, Emmanuel; Boulay-Coletta, Isabelle; Loriau, Jerome; Zins, Marc

2015-12-01

To evaluate the diagnostic accuracy of CT in postoperative colorectal anastomotic leakage (AL). Two independent blinded radiologists reviewed 153 CTs performed for suspected AL within 60 days after surgery in 131 consecutive patients, with (n = 58) or without (n = 95) retrograde contrast enema (RCE). Results were compared to original interpretations. The reference standard was reoperation or consensus (a radiologist and a surgeon) regarding clinical, laboratory, radiological, and follow-up data after medical treatment. AL was confirmed in 34/131 patients. For the two reviewers and original interpretation, sensitivity of CT was 82%, 87%, and 71%, respectively; specificity was 84%, 84%, and 92%. RCE significantly increased the positive predictive value (from 40% to 88%, P = 0.0009; 41% to 92%, P = 0.0016; and 40% to 100%, P = 0.0006). Contrast extravasation was the most sensitive (reviewers, 83% and 83%) and specific (97% and 97%) sign and was significantly associated with AL by univariate analysis (P < 0.0001 and P < 0.0001). By multivariate analysis with recursive partitioning, CT with RCE was accurate to confirm or rule out AL with contrast extravasation. CT with RCE is accurate for diagnosing postoperative colorectal AL. Contrast extravasation is the most reliable sign. RCE should be performed during CT for suspected AL.
• CT accurately diagnosed clinically suspected colorectal AL and showed good interobserver agreement.
• Contrast extravasation was the most sensitive and specific CT sign.
• Retrograde contrast enema during CT improved positive predictive value.
• Retrograde contrast enema decreased false-negative or indeterminate original CT interpretations.
Radiographic evaluation of perching-joint angles in cockatiels (Nymphicus hollandicus), Hispaniolan Amazon parrots (Amazona ventralis), and barred owls (Strix varia).

PubMed

Bonin, Glen; Lauer, Susanne K; Guzman, David Sanchez-Migallon; Nevarez, Javier; Tully, Thomas N; Hosgood, Giselle; Gaschen, Lorrie

2009-06-01

Information on perching-joint angles in birds is limited. Joint immobilization in a physiologic perching angle has the potential to result more often in complete restoration of limb function. We evaluated perching-joint angles in 10 healthy cockatiels (Nymphicus hollandicus), 10 Hispaniolan Amazons (Amazona ventralis), and 9 barred owls (Strix varia) and determined intra- and interobserver variability for goniometric measurements in 2 different radiographic projections. Intra- and interobserver variation was less than 7% for all stifle and intertarsal joint measurements but frequently exceeded 10% for the hip-joint measurements. Hip, stifle, and intertarsal perching angles differed significantly among cockatiels, Hispaniolan Amazon parrots, and barred owls. The accuracy of measurements performed on straight lateral radiographic projections with superimposed limbs was not consistently superior to measurements on oblique projections with a slightly rotated pelvis. Stifle and intertarsal joint angles can be measured on radiographs by different observers with acceptable variability, but intra- and interobserver variability for hip-joint-angle measurements is higher.
Beyond triage: the diagnostic accuracy of emergency department nursing staff risk assessment in patients with suspected acute coronary syndromes.

PubMed

Carlton, Edward Watts; Khattab, Ahmed; Greaves, Kim

2016-02-01

To establish the accuracy of emergency department (ED) nursing staff risk assessment using an established chest pain risk score alone and when incorporated with presentation high-sensitivity troponin testing as part of an accelerated diagnostic protocol (ADP). Prospective observational study comparing nursing and physician risk assessment using the modified Goldman (m-Goldman) score and a predefined ADP, incorporating presentation high-sensitivity troponin. A UK District ED. Consecutive patients, aged ≥18, with suspected cardiac chest pain and non-ischaemic ECG, for whom the treating physician determined serial troponin testing was required. 30-day major adverse cardiac events (MACE). 960 participants were recruited. 912/960 (95.0%) had m-Goldman scores recorded by physicians and 745/960 (77.6%) by nursing staff. The area under the curve of the m-Goldman score in predicting 30-day MACE was 0.647 (95% CI 0.594 to 0.700) for physicians and 0.572 (95% CI 0.510 to 0.634) for nursing staff (p=0.09). When incorporated into an ADP, sensitivity for the rule-out of MACE was 99.2% (95% CI 94.8% to 100%) and 96.7% (90.3% to 99.2%) for physicians and nurses, respectively. One patient in the physician group (0.3%) and three patients (1.1%) in the nursing group were classified as low risk yet had MACE. There was fair agreement in the identification of low-risk patients (kappa 0.31, 95% CI 0.24 to 0.38). The diagnostic accuracy of ED nursing staff risk assessment is similar to that of ED physicians and interobserver reliability between assessor groups is fair. When incorporating high-sensitivity troponin testing, a nurse-led ADP has a miss rate of 1.1% for MACE at 30 days. Controlled Trials Database (ISRCTN no. 21109279). Published by the BMJ Publishing Group Limited.
Diagnosis of mild chronic pancreatitis (Cambridge classification): comparative study using secretin injection-magnetic resonance cholangiopancreatography and endoscopic retrograde pancreatography.

PubMed

Sai, Jin-Kan; Suyama, Masafumi; Kubokawa, Yoshihiro; Watanabe, Sumio

2008-02-28

To investigate the usefulness of secretin injection-MRCP for the diagnosis of mild chronic pancreatitis. Sixteen patients having mild chronic pancreatitis according to the Cambridge classification and 12 control subjects with no abnormal findings on the pancreatogram were examined for the diagnostic accuracy of secretin injection-MRCP regarding abnormal branch pancreatic ducts associated with mild chronic pancreatitis (Cambridge Classification), using endoscopic retrograde cholangiopancreatography (ERCP) for comparison. The sensitivity and specificity for abnormal branch pancreatic ducts determined by two reviewers were respectively 55%-63% and 75%-83% in the head, 57%-64% and 82%-83% in the body, and 44%-44% and 72%-76% in the tail of the pancreas. The sensitivity and specificity for mild chronic pancreatitis were 56%-63% and 92%-92%, respectively. Interobserver agreement (kappa statistics) concerning the diagnosis of an abnormal branch pancreatic duct and of mild chronic pancreatitis was good to excellent. Secretin injection-MRCP might be useful for the diagnosis of mild chronic pancreatitis.
Dermatological and morphological findings in quarter horses with hereditary equine regional dermal asthenia.

PubMed

Badial, Peres R; Oliveira-Filho, José P; Pantoja, José Carlos F; Moreira, José C L; Conceição, Lissandro G; Borges, Alexandre S

2014-12-01

Hereditary equine regional dermal asthenia (HERDA) is an autosomal recessive disorder affecting quarter horses (QHs); affected horses exhibit characteristic skin abnormalities related to abnormal collagen biosynthesis. To characterize the thickness and morphological abnormalities of the skin of HERDA-affected horses and to determine the interobserver agreement and the diagnostic accuracy of histopathological examination of skin biopsies from horses with HERDA. Six affected QHs, confirmed by DNA testing, from a research herd and five unaffected QHs from a stud farm. The skin thickness in 25 distinct body regions was measured on both sides in all affected and unaffected horses. Histopathological and ultrastructural evaluation of skin biopsies was performed. The average skin thickness in all of the evaluated regions was thinner in the affected horses. A statistically significant difference between skin thickness of the affected and unaffected animals was observed only when the average magnitude of difference was ≥38.7% (P = 0.038). The interobserver agreement for the histopathological evaluation was fair to substantial. The histopathological sensitivity for the diagnosis of HERDA was dependent on the evaluator and ranged from 73 to 88%, whereas the specificity was affected by the region sampled and ranged from 35 to 75%. Despite the regional pattern of the cutaneous signs, skin with decreased thickness was not regionally distributed in the HERDA-affected horses. Histopathological evaluation is informative but not conclusive for establishing the diagnosis. Samples of skin from the neck, croup or back are useful for diagnosis of HERDA. However, the final diagnosis must be confirmed using molecular testing. © 2014 ESVD and ACVD.

New endoscopic indicator of esophageal achalasia: "pinstripe pattern".

PubMed

Minami, Hitomi; Isomoto, Hajime; Miuma, Satoshi; Kobayashi, Yasutoshi; Yamaguchi, Naoyuki; Urabe, Shigetoshi; Matsushima, Kayoko; Akazawa, Yuko; Ohnita, Ken; Takeshima, Fuminao; Inoue, Haruhiro; Nakao, Kazuhiko

2015-01-01

Endoscopic diagnosis of esophageal achalasia lacking typical endoscopic features can be extremely difficult. The aim of this study was to identify a simple and reliable early indicator of esophageal achalasia. This single-center retrospective study included 56 cases of esophageal achalasia without previous treatment. As controls, 60 non-achalasia subjects, including patients with reflux esophagitis and superficial esophageal cancer, were also included in this study. Endoscopic findings were evaluated according to Descriptive Rules for Achalasia of the Esophagus as follows: (1) esophageal dilatation, (2) abnormal retention of liquid and/or food, (3) whitish change of the mucosal surface, (4) functional stenosis of the esophago-gastric junction, and (5) abnormal contraction. Additionally, the presence of longitudinal superficial wrinkles of the esophageal mucosa, the "pinstripe pattern" (PSP), was evaluated endoscopically. Then, inter-observer diagnostic agreement was assessed for each finding. The prevalence rates of the above-mentioned findings (1-5) were 41.1%, 41.1%, 16.1%, 94.6%, and 43.9%, respectively. PSP was observed in 60.7% of achalasia cases, while none of the controls was positive for PSP. PSP was observed in 26 (62.5%) of 35 cases with a shorter history (< 10 years), which usually lack typical findings such as severe esophageal dilation and tortuosity. Inter-observer agreement level was substantial for food/liquid remnant (k = 0.6861) and PSP (k = 0.6098), and was fair for abnormal contraction and white change. The accuracy, sensitivity, and specificity for achalasia were 83.8%, 64.7%, and 100%, respectively. "Pinstripe pattern" could be a reliable indicator for early discrimination of primary esophageal achalasia.
Fat-suppressed 3D spoiled gradient-echo MRI and MDCT arthrography of articular cartilage in patients with hip dysplasia.

PubMed

Nishii, Takashi; Tanaka, Hisashi; Nakanishi, Katsuyuki; Sugano, Nobuhiko; Miki, Hidenobu; Yoshikawa, Hideki

2005-08-01

Our objective was to assess the diagnostic ability of MDCT arthrography for acetabular and femoral cartilage lesions in patients with hip dysplasia. A disorder of the articular cartilage was evaluated in 20 hips of 18 patients with acetabular dysplasia who did not have osteoarthritis or who had early stage osteoarthritis before undergoing pelvic osteotomy surgery. The findings on fat-suppressed 3D fast spoiled gradient-echo MRI and MDCT arthrography of the hip were evaluated by two independent observers, and sensitivity, specificity, and accuracy were determined using arthroscopic findings as the standard of reference. Kappa values were calculated to quantify the level of interobserver agreement. The sensitivity and specificity for the detection of any cartilage disorder (grade 1 or higher) were (observer 1/observer 2) 49%/67% and 89%/76%, respectively, on MRI, and 67%/67% and 89%/82%, respectively, on CT arthrography. The sensitivity and specificity for the detection of cartilage lesions with substance loss (grade 2 or higher) were (observer 1/observer 2) 47%/53% and 92%/87%, respectively, on MRI, and 70%/79% and 93%/94%, respectively, on CT arthrography. CT arthrography provided significantly higher sensitivity in the detection of grade 2 or higher lesions than MRI for both observers. Interobserver agreement in the detection of grade 2 or higher cartilage lesions was moderate (kappa = 0.53) on MRI and substantial (kappa = 0.78) on CT. MDCT arthrography is a sensitive and reproducible method for assessing articular cartilage lesions with substance loss in patients with hip dysplasia.
Reliability and criterion validity of an observation protocol for working technique assessments in cash register work.

PubMed

Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina

2016-06-01

We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observations by one observer, such as often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend the protocol to be used for educational purposes only.
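The acceptability criterion quoted in the abstract above (proportional agreement >0.7 and kappa >0.4) can be illustrated with a minimal Python sketch. The ratings below are hypothetical and are not data from the study; the function names are likewise illustrative, and the sketch assumes the unweighted two-rater Cohen's kappa, which the abstract does not confirm.

```python
from collections import Counter


def proportional_agreement(ratings_a, ratings_b):
    """Share of items on which two observers gave the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: observed agreement corrected for chance."""
    p_observed = proportional_agreement(ratings_a, ratings_b)
    n = len(ratings_a)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from each observer's marginal category frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


def is_acceptable(ratings_a, ratings_b, min_agreement=0.7, min_kappa=0.4):
    """Apply the thresholds quoted in the abstract (assumed strict '>')."""
    return (proportional_agreement(ratings_a, ratings_b) > min_agreement
            and cohens_kappa(ratings_a, ratings_b) > min_kappa)


# Hypothetical ratings of 17 videos on one question (NOT study data).
observer_1 = ["good"] * 9 + ["poor"] * 8
observer_2 = ["good"] * 7 + ["poor"] * 2 + ["poor"] * 6 + ["good"] * 2

print(f"agreement = {proportional_agreement(observer_1, observer_2):.3f}")
print(f"kappa     = {cohens_kappa(observer_1, observer_2):.3f}")
print(f"acceptable: {is_acceptable(observer_1, observer_2)}")
```

In this made-up example the observers agree on 13 of 17 videos, yet kappa is noticeably lower than the raw agreement because about half of that agreement would be expected by chance. This is one reason both thresholds are applied together.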
Validation of Diagnoses of Transient Ischemic Attack in the Swedish Stroke Register (Riksstroke) TIA-Module.

PubMed

Buchwald, Fredrik; Ström, Jakob O; Norrving, Bo; Petersson, Jesper

2015-01-01

In 2010, the Swedish Stroke Register (Riksstroke; RS) established a module for transient ischemic attacks (RS-TIA). We report a diagnostic validation study of patients included in RS-TIA. During the first year, 7,825 patients were registered at 59 out of 74 Swedish hospitals. A time-based TIA definition was applied. A sample of 180 patients (30 patients each from 6 hospitals), with a similar distribution of age and sex as in RS-TIA, was prepared. Two independent observers assessed medical records for quality of documentation and assigned a diagnosis of likely, possible, unlikely TIA or ischemic stroke, according to prespecified criteria. The 2 observers agreed in 77% of cases that the event was a likely or possible TIA, in 3% that the event was an ischemic stroke, and in 2% that the event was an unlikely TIA. The observers disagreed in 8% of patients on TIA vs. ischemic stroke, and in 11% on a vascular vs. non-vascular cause. Quality of documentation was fair. There was interobserver agreement on diagnosis of TIA in the majority of patients included in RS-TIA. Diagnostic accuracy may be further improved by more systematic documentation of symptoms and signs. © 2015 S. Karger AG, Basel.
Does the Modified Gartland Classification Clarify Decision Making?

PubMed

Leung, Sophia; Paryavi, Ebrahim; Herman, Martin J; Sponseller, Paul D; Abzug, Joshua M

2018-01-01

The modified Gartland classification system for pediatric supracondylar fractures is often utilized as a communication tool to aid in determining whether or not a fracture warrants operative intervention. This study sought to determine the interobserver and intraobserver reliability of the Gartland classification system, as well as to determine whether there was agreement that a fracture warranted operative intervention regardless of the classification system. A total of 200 anteroposterior and lateral radiographs of pediatric supracondylar humerus fractures were retrospectively reviewed by 3 fellowship-trained pediatric orthopaedic surgeons and 2 orthopaedic residents and then classified as type I, IIa, IIb, or III. The surgeons then recorded whether they would treat the fracture nonoperatively or operatively. The κ coefficients were calculated to determine interobserver and intraobserver reliability. Overall, the Wilkins-modified Gartland classification has low-moderate interobserver reliability (κ=0.475) and high intraobserver reliability (κ=0.777). A low interobserver reliability was found when differentiating between type IIa and IIb (κ=0.240) among attendings. There was moderate-high interobserver reliability for the decision to operate (κ=0.691) and high intraobserver reliability (κ=0.760). Decreased interobserver reliability was present for decision to operate among residents. For fractures classified as type I, the decision to operate was made 3% of the time and 27% for type IIa. The decision was made to operate 99% of the time for type IIb and 100% for type III. There is almost full agreement for the nonoperative treatment of Type I fractures and operative treatment for type III fractures. There is agreement that type IIb fractures should be treated operatively and that the majority of type IIa fractures should be treated nonoperatively. However, the interobserver reliability for differentiating between type IIa and IIb fractures is low. 
Our results validate the Gartland classification system as a method to help direct treatment of pediatric supracondylar humerus fractures, although the modification of the system, IIa versus IIb, seems to have limited reliability and utility. Terminology based on decision to treat may lead to a more clinically useful classification system in the evaluation and treatment of pediatric supracondylar humerus fractures. Level III-diagnostic studies.
Contrast-Enhanced and Time-of-Flight MRA at 3T Compared with DSA for the Follow-Up of Intracranial Aneurysms Treated with the WEB Device.

PubMed

Timsit, C; Soize, S; Benaissa, A; Portefaix, C; Gauvrit, J-Y; Pierot, L

2016-09-01

Imaging follow-up at 3T of intracranial aneurysms treated with the WEB Device has not been evaluated yet. Our aim was to assess the diagnostic accuracy of 3D-time-of-flight MRA and contrast-enhanced MRA at 3T against DSA, as the criterion standard, for the follow-up of aneurysms treated with the Woven EndoBridge (WEB) system. From June 2011 to December 2014, patients treated with the WEB in our institution, then followed for ≥6 months after treatment by MRA at 3T (3D-TOF-MRA and contrast-enhanced MRA) and DSA within 48 hours were included. Aneurysm occlusion was assessed with a simplified 2-grade scale (adequate occlusion [total occlusion + neck remnant] versus aneurysm remnant). Interobserver and intermodality agreement was evaluated by calculating the linear weighted κ. MRA test characteristics and predictive values were calculated from a 2 × 2 contingency table, by using DSA data as the standard of reference. Twenty-six patients with 26 WEB-treated aneurysms were included. The interobserver reproducibility was good with DSA (κ = 0.71) and contrast-enhanced-MRA (κ = 0.65) compared with moderate with 3D-TOF-MRA (κ = 0.47). Intermodality agreement with DSA was fair with both contrast-enhanced MRA (κ = 0.36) and 3D-TOF-MRA (κ = 0.36) for the evaluation of total occlusion. For aneurysm remnant detection, the prevalence was low (15%), on the basis of DSA, and both MRA techniques showed low sensitivity (25%), high specificity (100%), very good positive predictive value (100%), and very good negative predictive value (88%). Despite acceptable interobserver reproducibility and predictive values, the low sensitivity of contrast-enhanced MRA and 3D-TOF-MRA for aneurysm remnant detection suggests that MRA is a useful screening procedure for WEB-treated aneurysms, but similar to stents and flow diverters, DSA remains the criterion standard for follow-up. © 2016 by American Journal of Neuroradiology.
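The test characteristics reported in the abstract above follow from a 2 × 2 contingency table against DSA. As a rough Python sketch, the cell counts below are inferred from the reported percentages (26 aneurysms, 15% prevalence, 25% sensitivity, 100% specificity). They are an assumption for illustration, not counts stated in the abstract, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard test characteristics from a 2 x 2 table (reference = DSA)."""

    def ratio(numerator, denominator):
        # Return None rather than raising when a margin of the table is empty.
        return numerator / denominator if denominator else None

    total = tp + fp + fn + tn
    return {
        "prevalence": ratio(tp + fn, total),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# ASSUMED counts, back-calculated from the reported percentages:
# 4 remnants on DSA (1 detected by MRA, 3 missed), 22 adequate occlusions.
mra = diagnostic_metrics(tp=1, fp=0, fn=3, tn=22)

for name, value in mra.items():
    print(f"{name:12s} {value:.0%}")
```

Under these assumed counts the positive predictive value rests on a single true-positive case, which helps explain how a 100% PPV can coexist with 25% sensitivity in a sample of this size.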
Evaluating causes of error in landmark-based data collection using scanners

PubMed Central

Shearer, Brian M.; Cooke, Siobhán B.; Halenar, Lauren B.; Reber, Samantha L.; Plummer, Jeannette E.; Delson, Eric

2017-01-01

In this study, we assess the precision, accuracy, and repeatability of craniodental landmarks (Types I, II, and III, plus curves of semilandmarks) on a single macaque cranium digitally reconstructed with three different surface scanners and a microCT scanner. Nine researchers with varying degrees of osteological and geometric morphometric knowledge landmarked ten iterations of each scan (40 total) to test the effects of scan quality, researcher experience, and landmark type on levels of intra- and interobserver error. Two researchers additionally landmarked ten specimens from seven different macaque species using the same landmark protocol to test the effects of the previously listed variables relative to species-level morphological differences (i.e., observer variance versus real biological variance). Error rates within and among researchers by scan type were calculated to determine whether or not data collected by different individuals or on different digitally rendered crania are consistent enough to be used in a single dataset. Results indicate that scan type does not impact rate of intra- or interobserver error. Interobserver error is far greater than intraobserver error among all individuals, and is similar in variance to that found among different macaque species. Additionally, experience with osteology and morphometrics both positively contribute to precision in multiple landmarking sessions, even where less experienced researchers have been trained in point acquisition. Individual training increases precision (although not necessarily accuracy), and is highly recommended in any situation where multiple researchers will be collecting data for a single project. PMID:29099867
Digital radiography with computerized conventional monitors compared to medical monitors in vertical root fracture diagnosis.

PubMed

Tofangchiha, Maryam; Adel, Mamak; Bakhshi, Mahin; Esfehani, Mahsa; Nazeman, Pantea; Ghorbani Elizeyi, Mojgan; Javadi, Amir

2013-01-01

Vertical root fracture (VRF) is a complication which is chiefly diagnosed radiographically. Recently, film-based radiography has been substituted with digital radiography. At the moment, there is a wide range of monitors available in the market for viewing digital images. The present study aims to compare the diagnostic accuracy, sensitivity and specificity of medical and conventional monitors in detection of vertical root fractures. In this in vitro study 228 extracted single-rooted human teeth were endodontically treated. Vertical root fractures were induced in 114 samples. The teeth were imaged by a digital charge-coupled device radiography using parallel technique. The images were evaluated by a radiologist and an endodontist on two medical and conventional liquid-crystal display (LCD) monitors twice. Z-test was used to analyze the sensitivity, accuracy and specificity of each monitor. Significance level was set at 0.05. Inter and intra observer agreements were calculated by Cohen's kappa. Accuracy, specificity and sensitivity for conventional monitor were calculated as 67.5%, 72%, 62.5% respectively; and data for medical grade monitor were 67.5%, 66.5% and 68% respectively. Statistical analysis showed no significant differences in detecting VRF between the two techniques. Inter-observer agreement for conventional and medical monitor was 0.47 and 0.55 respectively (moderate). Intra-observer agreement was 0.78 for medical monitor and 0.87 for conventional one (substantial). The type of monitor does not influence diagnosis of vertical root fractures.
Pilot study on probe-based confocal laser endomicroscopy for colorectal neoplasms: an initial experience in Japan.

PubMed

Abe, Seiichiro; Saito, Yutaka; Oono, Yasuhiro; Tanaka, Yusaku; Sakamoto, Taku; Yamada, Masayoshi; Nakajima, Takeshi; Matsuda, Takahisa; Ikematsu, Hiroaki; Yano, Tomonori; Sekine, Shigeki; Kojima, Motohiro; Yamagishi, Hidetsugu; Kato, Hiroyuki

2018-04-26

The aim of this pilot study is to investigate the diagnostic yield of probe-based confocal laser endomicroscopy (pCLE) in the evaluation of depth of invasion in colorectal lesions. Patients with colorectal lesions eligible for either endoscopic treatment or surgery were enrolled in the study. Tumor's depth of invasion was classified as mucosal or slight submucosal (M-SM1) and deep submucosal invasion or deeper (SM2 or deeper). White light endoscopy (WLE), magnifying narrow band imaging (M-NBI), and magnifying chromoendoscopy (M-CE) were used to assess colorectal lesions, and pCLE was used to identify tumor's features related to SM2 or deeper. The diagnostic classification of depth of invasion was obtained by correlating pCLE findings with histology results (on-site diagnosis). All colorectal lesions were stratified by a second endoscopist who was blinded to any clinical and histological information with the use of WLE, M-NBI, M-CE, and pCLE (off-line review). A total of 22 colorectal lesions were analyzed: seven were adenoma, ten intramucosal cancer, and five SM2 or deeper cancer. With respect to pCLE findings, loss of crypt structure was seen in all SM2 or deeper cancers and only in one M-SM1 lesion. Sensitivity, specificity, and accuracy of WLE, M-NBI, and M-CE in off-line review were 60/94/86, 60/94/86, and 80/94/91%, respectively. Sensitivity/specificity/accuracy of pCLE in off-line review were 80/94/91%, respectively. The inter-observer agreement of pCLE between on-site diagnosis and off-line review was 0.64 (95%CI 0.27-1.0). pCLE may represent a useful tool to evaluate the depth of invasion in colorectal lesions.
Quality control of regional wall motion analysis in stress Echo 2020.

PubMed

Ciampi, Quirino; Picano, Eugenio; Paterni, Marco; Daros, Clarissa Borguezan; Simova, Iana; de Castro E Silva Pretto, José Luis; Scali, Maria Chiara; Gaibazzi, Nicola; Severino, Sergio; Djordjevic-Dikic, Ana; Kasprzak, Jaroslaw D; Zagatina, Angela; Varga, Albert; Lowenstein, Jorge; Merlo, Pablo Martin; Amor, Miguel; Celutkiene, Jelena; Perez, Julio E; Di Salvo, Giovanni; Galderisi, Maurizio; Mori, Fabio; Costantino, Marco Fabio; Massa, Laura; Dekleva, Milica; Chaves, Daniel Quesada; Trambaiolo, Paolo; Citro, Rodolfo; Colonna, Paolo; Rigo, Fausto; Torres, Marco A R; Monte, Ines; Stankovic, Ivan; Neskovic, Aleksander; Cortigiani, Lauro; Re, Federica; Dodi, Claudio; D'Andrea, Antonello; Villari, Bruno; Arystan, Ayana; De Nes, Michele; Carpeggiani, Clara

2017-12-15

The trial "Stress Echo (SE) 2020" evaluates novel applications of SE beyond coronary artery disease. The aim of the study was to control quality and harmonize reading criteria. One reader from each of 78 centers of the SE 2020 network asked for credentials to read a set of 20 SE video-clips selected by the core lab. All aspiring centers met the pre-requisite of high volume, and the years of experience in SE ranged from 5 to 31 years (mean value 18 years). The diagnostic gold standard was a reading by the core lab. The a priori determined pass threshold was 18/20 (≥90%). Of the initial 78 who started, 57 completed the first attempt: individual readers' scores on the first attempt ranged from 7/20 to 20/20 (accuracy from 35% to 100%, mean 78.7±13%) and 44 readers passed it. There was a very poor correlation between years of experience and the reader's score on first attempt (r=-0.161, p=0.231). Of the 13 readers who failed the first attempt, 12 took it again after the web-based session and their accuracy improved (74% vs. 96%, p<0.001). The kappa inter-observer agreement was 0.59 on the first attempt, before web-based training, and rose to 0.91 on the last attempt, after training. In SE reading, the volume of activity or years of experience is not synonymous with diagnostic quality. Qualitative analysis and operator-dependence can become a limiting weakness in clinical practice, in the absence of strict pathways of learning, credentialing and audit. Copyright © 2017 Elsevier B.V. All rights reserved.
Diagnostic accuracy and reproducibility of pleural and lung ultrasound in discriminating cardiogenic causes of acute dyspnea in the emergency department.

PubMed

Cibinel, Gian Alfonso; Casoli, Giovanna; Elia, Fabrizio; Padoan, Monica; Pivetta, Emanuele; Lupia, Enrico; Goffi, Alberto

2012-02-01

Dyspnea is a common symptom in patients admitted to the Emergency Department (ED), and discriminating between cardiogenic and non-cardiogenic dyspnea is often a clinical dilemma. The initial diagnostic work-up may be inaccurate in defining the etiology and the underlying pathophysiology. The aim of this study was to evaluate the diagnostic accuracy and reproducibility of pleural and lung ultrasound (PLUS), performed by emergency physicians at the time of a patient's initial evaluation in the ED, in identifying cardiac causes of acute dyspnea. Between February and July 2007, 56 patients presenting to the ED with acute dyspnea were prospectively enrolled in this study. In all patients, PLUS was performed by emergency physicians with the purpose of identifying the presence of diffuse alveolar-interstitial syndrome (AIS) or pleural effusion. All scans were later reviewed by two other emergency physicians, expert in PLUS and blinded to clinical parameters, who were the ultimate judges of positivity for diffuse AIS and pleural effusion. A random set of 80 recorded scans was also reviewed by two inexperienced observers to assess inter-observer variability. The entire medical record was independently reviewed by two expert physicians (an emergency medicine physician and a cardiologist) blinded to the ultrasound (US) results, in order to determine whether, for each patient, dyspnea was due to heart failure or not. Sensitivity, specificity, and positive/negative predictive values were obtained; likelihood ratio (LR) test was used. Cohen's kappa was used to assess inter-observer agreement. The presence of diffuse AIS was highly predictive for cardiogenic dyspnea (sensitivity 93.6%, specificity 84%, positive predictive value 87.9%, negative predictive value 91.3%). On the contrary, US detection of pleural effusion was not helpful in the differential diagnosis (sensitivity 83.9%, specificity 52%, positive predictive value 68.4%, negative predictive value 72.2%). 
Finally, the coexistence of diffuse AIS and pleural effusion is less accurate than diffuse AIS alone for cardiogenic dyspnea (sensitivity 81.5%, specificity 82.8%, positive predictive value 81.5%, negative predictive value 82.8%). The positive LR was 5.8 for AIS [95% confidence interval (CI) 4.8-7.1] and 1.7 (95% CI 1.2-2.6) for pleural effusion; the negative LR was 0.1 (95% CI 0.0-0.4) for AIS and 0.3 (95% CI 0.1-0.8) for pleural effusion. Agreement between experienced and inexperienced operators was 92.2% (p < 0.01) and 95% (p < 0.01) for diagnosis of AIS and pleural effusion, respectively. In early evaluation of patients presenting to the ED with dyspnea, PLUS, performed with the purpose of identifying diffuse AIS, may represent an accurate and reproducible bedside tool in discriminating between cardiogenic and non-cardiogenic dyspnea. On the contrary, US detection of pleural effusions does not allow reliable discrimination between different causes of acute dyspnea in unselected ED patients.
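The likelihood ratios in the abstract above can be related to its sensitivity and specificity figures with the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The following Python sketch applies them to the rounded percentages quoted in the abstract. The authors presumably worked from raw counts, so small differences in the last digit are expected, and the function name is illustrative.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from proportions in [0, 1]."""
    # LR+ is unbounded when there are no false positives (specificity = 1).
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    # LR- is unbounded when specificity is 0.
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    return lr_pos, lr_neg


# Rounded values quoted in the abstract.
ais_pos, ais_neg = likelihood_ratios(sensitivity=0.936, specificity=0.84)
eff_pos, eff_neg = likelihood_ratios(sensitivity=0.839, specificity=0.52)

print(f"diffuse AIS:      LR+ = {ais_pos:.2f}, LR- = {ais_neg:.2f}")
print(f"pleural effusion: LR+ = {eff_pos:.2f}, LR- = {eff_neg:.2f}")
```

The computed values (roughly 5.85 and 0.08 for diffuse AIS, 1.75 and 0.31 for pleural effusion) are close to the reported 5.8, 0.1, 1.7, and 0.3. They show why pleural effusion, with an LR+ well below 2, shifts the probability of cardiogenic dyspnea only slightly.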
The Role of Ultrasound Compared to Biopsy of Temporal Arteries in the Diagnosis and Treatment of Giant Cell Arteritis (TABUL): a diagnostic accuracy and cost-effectiveness study.

PubMed

Luqmani, Raashid; Lee, Ellen; Singh, Surjeet; Gillett, Mike; Schmidt, Wolfgang A; Bradburn, Mike; Dasgupta, Bhaskar; Diamantopoulos, Andreas P; Forrester-Barker, Wulf; Hamilton, William; Masters, Shauna; McDonald, Brendan; McNally, Eugene; Pease, Colin; Piper, Jennifer; Salmon, John; Wailoo, Allan; Wolfe, Konrad; Hutchings, Andrew

2016-11-01

Giant cell arteritis (GCA) is a relatively common form of primary systemic vasculitis, which, if left untreated, can lead to permanent sight loss. We compared ultrasound as an alternative diagnostic test with temporal artery biopsy, which may be negative in 9-61% of true cases. To compare the clinical effectiveness and cost-effectiveness of ultrasound with biopsy in diagnosing patients with suspected GCA. Prospective multicentre cohort study. Secondary care. A total of 381 patients referred with newly suspected GCA. Sensitivity, specificity and cost-effectiveness of ultrasound compared with biopsy or ultrasound combined with biopsy for diagnosing GCA and interobserver reliability in interpreting scan or biopsy findings. We developed and implemented an ultrasound training programme for diagnosing suspected GCA. We recruited 430 patients with suspected GCA. We analysed 381 patients who underwent both ultrasound and biopsy within 10 days of starting treatment for suspected GCA and who attended a follow-up assessment (median age 71.1 years; 72% female). The sensitivity of biopsy was 39% [95% confidence interval (CI) 33% to 46%], which was significantly lower than previously reported and inferior to ultrasound (54%, 95% CI 48% to 60%); the specificity of biopsy (100%, 95% CI 97% to 100%) was superior to ultrasound (81%, 95% CI 73% to 88%). If we scanned all suspected patients and performed biopsies only on negative cases, sensitivity increased to 65% and specificity was maintained at 81%, reducing the need for biopsies by 43%. Strategies combining clinical judgement (clinician's assessment at 2 weeks) with the tests showed sensitivity and specificity of 91% and 81%, respectively, for biopsy and 93% and 77%, respectively, for ultrasound; cost-effectiveness (incremental net monetary benefit) was £485 per patient in favour of ultrasound with both cost savings and a small health gain. 
Inter-rater analysis revealed moderate agreement among sonographers (intraclass correlation coefficient 0.61, 95% CI 0.48 to 0.75), similar to pathologists (0.62, 95% CI 0.49 to 0.76). There is no independent gold standard diagnosis for GCA. The reference diagnosis used to determine accuracy was based on classification criteria for GCA that include clinical features at presentation and biopsy results. We have demonstrated the feasibility of providing training in ultrasound for the diagnosis of GCA. Our results indicate better sensitivity but poorer specificity of ultrasound compared with biopsy and suggest some scope for reducing the role of biopsy. The moderate interobserver agreement for both ultrasound and biopsy indicates scope for improving assessment and reporting of test results and challenges the assumption that a positive biopsy always represents GCA. Further research should address the issue of an independent reference diagnosis, standards for interpreting and reporting test results and the evaluation of ultrasound training, and should also explore the acceptability of these new diagnostic strategies in GCA. The National Institute for Health Research Health Technology Assessment programme.
Diagnostic accuracy of an iPhone DICOM viewer for the interpretation of magnetic resonance imaging of the knee.

PubMed

De Maio, Peter; White, Lawrence M; Bleakney, Robert; Menezes, Ravi J; Theodoropoulos, John

2014-07-01

To evaluate the diagnostic performance of viewing magnetic resonance (MR) images on a handheld mobile device compared with a conventional radiology workstation for the diagnosis of intra-articular knee pathology. Prospective comparison study. Tertiary care center. Fifty consecutive subjects who had MR imaging of the knee followed by knee arthroscopy were prospectively evaluated. Two musculoskeletal radiologists independently reviewed each MR study using 2 different viewers: the OsiriX DICOM viewer software on an Apple iPhone 3GS device and eFilm Workstation software on a conventional picture archiving and communications system workstation. Sensitivity and specificity of the iPhone and workstation interpretations were calculated using knee arthroscopy as the reference standard. Intraobserver concordance and agreement between the iPhone and workstation interpretations were determined. There was no statistically significant difference between the 2 devices for each paired comparison of diagnostic performance. For the iPhone interpretations, sensitivity ranged from 77% (13 of 17) for the lateral meniscus to 100% (17 of 17) for the anterior cruciate ligament. Specificity ranged from 74% (14 of 19) for cartilage to 100% (50 of 50) for the posterior cruciate ligament. There was a very high level of interobserver and intraobserver agreement between devices and readers. The iPhone reads took longer than the corresponding workstation reads, with a significant mean difference of 3.98 minutes (P < 0.001). The diagnostic performance of interpreting MR images on a handheld mobile device for the assessment of intra-articular knee pathology is similar to that of a conventional radiology workstation but requires a longer viewing time. Timely and accurate interpretation of complex medical images using mobile device solutions could result in new workflow efficiencies and ultimately improve patient care.
Detection of crestal radiolucencies around dental implants: an in vitro experimental study.

PubMed

Sirin, Yigit; Horasan, Sinan; Yaman, Duygu; Basegmez, Cansu; Tanyel, Cem; Aral, Ali; Guven, Koray

2012-07-01

The aim of this study was to compare the diagnostic potentials and practical advantages of different imaging modalities in detecting bone defects around dental implants. Crestal bone defects with sequentially larger diameters were randomly prepared around 100 implants that were inserted in bovine bone blocks. Conventional periapical radiography (PR), direct digital radiography (DDR), panoramic radiography (PANO), cone-beam computed tomography (CBCT), and multislice computed tomography (MSCT) were performed for all specimens. The diagnostic accuracies of the devices, confidence of the answers, subjective image quality, defect visibility in planar orientations, and duration of diagnosis were analyzed based on the interpretations of 7 calibrated observers. The agreement levels of intra- and interobserver scores were rated good. PR, DDR, and CBCT were mostly more accurate than PANO and MSCT (P < .05). Confidence levels were positively correlated with the defect size (ρ = 0.20, P < .01), and that of DDR was the highest (P < .05). The subjective image quality of PR and DDR was higher than that of CBCT, PANO, and MSCT (P < .05 for all comparisons). Axial-coronal-sagittal visibilities of the defects were higher for CBCT compared with MSCT (P < .05). The diagnostic time was shorter for DDR (P < .05) and longer for the tomographic systems (P < .05) than for the other devices. DDR may provide a faster and more confident diagnostic option that is as accurate as PR in detecting peri-implant radiolucencies. CBCT has a comparable potential to these intraoral systems but with slower decision making and lower image quality, whereas PANO and MSCT become more reliable when bone defects have a diameter that is at least 1.5 mm larger than that of the implant. Copyright © 2012 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
Comparison of High-Resolution MR Imaging and Digital Subtraction Angiography for the Characterization and Diagnosis of Intracranial Artery Disease.

PubMed

Lee, N J; Chung, M S; Jung, S C; Kim, H S; Choi, C-G; Kim, S J; Lee, D H; Suh, D C; Kwon, S U; Kang, D-W; Kim, J S

2016-12-01

High-resolution MR imaging has recently been introduced as a promising diagnostic modality in intracranial artery disease. Our aim was to compare high-resolution MR imaging with digital subtraction angiography (DSA) for the characterization and diagnosis of various intracranial artery diseases. Thirty-seven patients who had undergone both high-resolution MR imaging and DSA for intracranial artery disease were enrolled in our study (August 2011 to April 2014). The time interval between the high-resolution MR imaging and DSA was within 1 month. The degree of stenosis and the minimal luminal diameter were independently measured by 2 observers in both DSA and high-resolution MR imaging, and the results were compared. Two observers independently diagnosed intracranial artery diseases on DSA and high-resolution MR imaging. The time interval between the diagnoses on DSA and high-resolution MR imaging was 2 weeks. Interobserver diagnostic agreement for each technique and intermodality diagnostic agreement for each observer were acquired. High-resolution MR imaging showed moderate-to-excellent agreement (interclass correlation coefficient = 0.892-0.949; κ = 0.548-0.614) and significant correlations (R = 0.766-0.892) with DSA on the degree of stenosis and minimal luminal diameter. The interobserver diagnostic agreement was good for DSA (κ = 0.643) and excellent for high-resolution MR imaging (κ = 0.818). The intermodality diagnostic agreement was good (κ = 0.704) for observer 1 and moderate (κ = 0.579) for observer 2. High-resolution MR imaging may be an imaging method comparable with DSA for the characterization and diagnosis of various intracranial artery diseases. © 2016 by American Journal of Neuroradiology.
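The κ values quoted for interobserver and intermodality diagnostic agreement are chance-corrected agreement statistics. A minimal sketch of unweighted Cohen's kappa for two raters follows; the abstract does not say which kappa variant was used, so the unweighted form is an assumption, and the category labels and ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses for 10 patients (placeholder category names).
observer_1 = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
observer_2 = ["A"] * 4 + ["B"] + ["B"] * 3 + ["C", "A"]

print(round(cohen_kappa(observer_1, observer_2), 3))
```

Here the raters agree on 8 of 10 cases, but because 0.39 agreement is expected by chance, kappa is (0.80 - 0.39) / (1 - 0.39), or about 0.67.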
Has 4D transperineal ultrasound additional value over 2D transperineal ultrasound for diagnosing obstructed defaecation syndrome?

PubMed

van Gruting, I M A; Kluivers, K; Sultan, A H; De Bin, R; Stankiewicz, A; Blake, H; Thakar, R

2018-06-08

To establish the diagnostic test accuracy of both two-dimensional (2D) and four-dimensional (4D) transperineal ultrasound (TPUS), and to assess whether 4D ultrasound imaging provides additional value in the diagnosis of posterior pelvic floor disorders in women with obstructed defaecation syndrome (ODS). In this prospective cohort study, 121 consecutive women with obstructed defaecation syndrome were recruited. Symptoms of obstructed defaecation and signs of pelvic organ prolapse were assessed using validated methods. All women underwent both 2D transperineal ultrasound (Pro-focus, 8802 transducer, BK-medical) and 4D transperineal ultrasound (Voluson i, RAB4-8-RS transducer, GE). Imaging analysis was performed by two blinded observers. Pelvic floor disorders were dichotomised into presence or absence according to pre-defined cut-off values. In the absence of a reference standard, a composite reference standard was created from a combination of results of evacuation proctogram, magnetic resonance imaging and endovaginal ultrasound. Primary outcome measures were diagnostic test characteristics of 2D and 4D transperineal ultrasound for diagnosis of rectocele, enterocele, intussusception and anismus. Secondary outcome measures were interobserver agreement, agreement between the two techniques and correlation of signs and symptoms to imaging findings. For diagnosis of all four posterior pelvic floor disorders there was no difference in sensitivity and specificity between 2D and 4D TPUS (p = 0.131-1.000). A good agreement between 2D and 4D TPUS was found for the diagnosis of rectocele (κ 0.675) and a moderate agreement for diagnosis of enterocele, intussusception and anismus (κ 0.465-0.545). There was no difference in rectocele depth measurements between both TPUS techniques (19.9 mm vs 19.0 mm, p = 0.802). Interobserver agreement was comparable for both techniques; however, 2D TPUS had an excellent interobserver agreement for diagnosis of enterocele and rectocele depth measurements.
Diagnosis of rectocele and enterocele on both 2D and 4D TPUS correlated well with presence of posterior vaginal wall prolapse on clinical examination (OR 1.89 - 2.72). In this group of ODS patients, the imaging findings on both techniques did not correlate with severity of symptoms of ODS (OR 0.82 - 1.08). There is no evidence of a superiority of 4D ultrasound acquisition to dynamic 2D ultrasound acquisition for the diagnosis of posterior pelvic floor disorders. Both 2D and 4D TPUS could be used interchangeably to screen women with symptoms of obstructed defaecation. This article is protected by copyright. All rights reserved.
Physicians’ accuracy and interrator reliability for the diagnosis of unstable meniscal tears in patients having osteoarthritis of the knee

PubMed Central

Dervin, Geoffrey F.; Stiell, Ian G.; Wells, George A.; Rody, Kelly; Grabowski, Jenny

2001-01-01

Objective: To determine clinicians’ accuracy and reliability for the clinical diagnosis of unstable meniscus tears in patients with symptomatic osteoarthritis of the knee. Design: A prospective cohort study. Setting: A single tertiary care centre. Patients: One hundred and fifty-two patients with symptomatic osteoarthritis of the knee refractory to conservative medical treatment were selected for prospective evaluation of arthroscopic débridement. Intervention: Arthroscopic débridement of the knee, including meniscal tear and chondral flap resection, without abrasion arthroplasty. Outcome measures: A standardized assessment protocol was administered to each patient by 2 independent observers. Arthroscopic determination of unstable meniscal tears was recorded by 1 observer who reviewed a video recording and was blinded to preoperative data. Those variables that had the highest interobserver agreement and the strongest association with meniscal tear by univariate methods were entered into logistic regression to model the best prediction of resectable tears. Results: There were 92 meniscal tears (77 medial, 15 lateral). Interobserver agreement between clinical fellows and treating surgeons was poor to fair (κ < 0.4) for all clinical variables except radiographic measures, for which it was good. Fellows and surgeons predicted unstable meniscal tear preoperatively with equivalent accuracy of 60%. Logistic regression modelling revealed that a history of swelling and a ballottable effusion were negative predictors. A positive McMurray test was the only positive predictor of unstable meniscal tear. “Mechanical” symptoms were not reliable predictors in this prospective study. The model was 69% accurate for all patients and 76% for those with advanced medial compartment osteoarthritis defined by a joint space height of 2 mm or less.
Conclusions: This study underscored the difficulty in using clinical variables to predict unstable medial meniscal tears in patients with pre-existing osteoarthritis of the knee. The lack of interobserver agreement must be overcome to ensure that the findings can be generalized to other physician observers. PMID:11504260
New Endoscopic Indicator of Esophageal Achalasia: “Pinstripe Pattern”

PubMed Central

Minami, Hitomi; Isomoto, Hajime; Miuma, Satoshi; Kobayashi, Yasutoshi; Yamaguchi, Naoyuki; Urabe, Shigetoshi; Matsushima, Kayoko; Akazawa, Yuko; Ohnita, Ken; Takeshima, Fuminao; Inoue, Haruhiro; Nakao, Kazuhiko

2015-01-01

Background and Study Aims: Endoscopic diagnosis of esophageal achalasia lacking typical endoscopic features can be extremely difficult. The aim of this study was to identify a simple and reliable early indicator of esophageal achalasia. Patients and Methods: This single-center retrospective study included 56 cases of esophageal achalasia without previous treatment. As controls, 60 non-achalasia subjects, including patients with reflux esophagitis and superficial esophageal cancer, were also included in this study. Endoscopic findings were evaluated according to Descriptive Rules for Achalasia of the Esophagus as follows: (1) esophageal dilatation, (2) abnormal retention of liquid and/or food, (3) whitish change of the mucosal surface, (4) functional stenosis of the esophago-gastric junction, and (5) abnormal contraction. Additionally, the presence of longitudinal superficial wrinkles of the esophageal mucosa, the “pinstripe pattern” (PSP), was evaluated endoscopically. Then, inter-observer diagnostic agreement was assessed for each finding. Results: The prevalence rates of the above-mentioned findings (1–5) were 41.1%, 41.1%, 16.1%, 94.6%, and 43.9%, respectively. PSP was observed in 60.7% of achalasia cases, while none of the controls showed positivity for PSP. PSP was observed in 26 (62.5%) of 35 cases with a shorter history (< 10 years), which usually lack typical findings such as severe esophageal dilation and tortuosity. Inter-observer agreement level was substantial for food/liquid remnant (k = 0.6861) and PSP (k = 0.6098), and was fair for abnormal contraction and white change. The accuracy, sensitivity, and specificity for achalasia were 83.8%, 64.7%, and 100%, respectively. Conclusion: “Pinstripe pattern” could be a reliable indicator for early discrimination of primary esophageal achalasia. PMID:25664812
Updated ultrasound criteria for polycystic ovary syndrome: reliable thresholds for elevated follicle population and ovarian volume.

PubMed

Lujan, Marla E; Jarrett, Brittany Y; Brooks, Eric D; Reines, Jonathan K; Peppin, Andrew K; Muhn, Narry; Haider, Ehsan; Pierson, Roger A; Chizen, Donna R

2013-05-01

Do the ultrasonographic criteria for polycystic ovaries supported by the 2003 Rotterdam consensus adequately discriminate between the normal and polycystic ovary syndrome (PCOS) condition in light of recent advancements in imaging technology and reliable methods for estimating follicle populations in PCOS? Using newer ultrasound technology and a reliable grid system approach to count follicles, we concluded that a substantially higher threshold of follicle counts throughout the entire ovary (FNPO), 26 versus 12 follicles, is required to distinguish between women with PCOS and healthy women from the general population. The Rotterdam consensus defined the polycystic ovary as having 12 or more follicles, measuring between 2 and 9 mm (FNPO), and/or an ovarian volume (OV) >10 cm³. Since their initial proposal in 2003, a heightened prevalence of polycystic ovaries has been described in healthy women with regular menstrual cycles, which has called into question the accuracy of these criteria and marginalized the specificity of polycystic ovaries as a diagnostic criterion for PCOS. A diagnostic test study was performed using cross-sectional data, collected from 2006 to 2011, from 168 women prospectively evaluated by transvaginal ultrasonography. Receiver operating characteristic (ROC) curve analyses were performed to determine the appropriate diagnostic thresholds for: (i) FNPO, (ii) follicle counts in a single cross section (FNPS) and (iii) OV. The levels of intra- and inter-observer reliability when five observers used the proposed criteria on 100 ultrasound cases were also determined. The participants were 98 women diagnosed with PCOS by the National Institutes of Health criteria as having both oligo-amenorrhea and hyperandrogenism and 70 healthy female volunteers recruited from the general population.
Participants were evaluated by transvaginal ultrasonography at the Royal University Hospital within the Department of Obstetrics, Gynecology and Reproductive Sciences, University of Saskatchewan (Saskatoon, SK, Canada) and in the Division of Nutritional Sciences' Human Metabolic Research Unit, Cornell University (Ithaca, NY, USA). Diagnostic potential for PCOS was highest for FNPO (0.969), followed by FNPS (0.880) and OV (0.873) as judged by the area under the ROC curve. An FNPO threshold of 26 follicles had the best compromise between sensitivity (85%) and specificity (94%) when discriminating between controls and PCOS. Similarly, an FNPS threshold of nine follicles had a 69% sensitivity and 90% specificity, and an OV of 10 cm³ had an 81% sensitivity and 84% specificity. Levels of intra-observer reliability were 0.81, 0.80 and 0.86 when assessing FNPO, FNPS and OV, respectively. Inter-observer reliability was 0.71, 0.72 and 0.82, respectively. Thresholds proposed by this study should be limited to use in women aged between 18 and 35 years. Polycystic ovarian morphology has excellent diagnostic potential for detecting PCOS. FNPO has better diagnostic potential and yields greater diagnostic confidence compared with assessments of FNPS or OV. Whenever possible, images throughout the entire ovary should be collected for the ultrasonographic evaluation of PCOS. This study was funded by Cornell University and fellowship awards from the Saskatchewan Health Research Foundation and Canadian Institutes of Health Research. The authors have no conflicts of interest to disclose.
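The abstract reports thresholds offering the "best compromise" between sensitivity and specificity but does not name the selection criterion. One common choice, assumed here purely for illustration, is to maximise Youden's J (sensitivity + specificity - 1) over candidate cut-offs. The follicle counts below are hypothetical and the function name is invented.

```python
def best_threshold(values, has_condition):
    """Return (threshold, sensitivity, specificity) maximising Youden's J.

    A case is called positive when its value is >= threshold.
    Both groups (with and without the condition) must be non-empty.
    """
    best = None
    for t in sorted(set(values)):
        tp = sum(v >= t and c for v, c in zip(values, has_condition))
        fn = sum(v < t and c for v, c in zip(values, has_condition))
        tn = sum(v < t and not c for v, c in zip(values, has_condition))
        fp = sum(v >= t and not c for v, c in zip(values, has_condition))
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1.0
        # Keep the first (lowest) threshold that attains the largest J.
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical follicle counts per ovary (NOT the study data).
controls = [8, 10, 12, 15, 18, 20, 22, 28]
pcos = [20, 26, 27, 30, 35, 40, 45, 50]
counts = controls + pcos
labels = [False] * len(controls) + [True] * len(pcos)

print(best_threshold(counts, labels))
```

With these invented counts the selected cut-off is 26 with sensitivity and specificity of 0.875 each; the toy data were chosen only to make the mechanics visible and say nothing about the published estimates.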
Accuracy and variability of tumor burden measurement on multi-parametric MRI

NASA Astrophysics Data System (ADS)

Salarian, Mehrnoush; Gibson, Eli; Shahedi, Maysam; Gaed, Mena; Gómez, José A.; Moussa, Madeleine; Romagnoli, Cesare; Cool, Derek W.; Bastian-Jordan, Matthew; Chin, Joseph L.; Pautler, Stephen; Bauman, Glenn S.; Ward, Aaron D.

2014-03-01

Measurement of prostate tumour volume can inform prognosis and treatment selection, including an assessment of the suitability and feasibility of focal therapy, which can potentially spare patients the deleterious side effects of radical treatment. Prostate biopsy is the clinical standard for diagnosis but provides limited information regarding tumour volume due to sparse tissue sampling. A non-invasive means for accurate determination of tumour burden could be of clinical value and an important step toward reduction of overtreatment. Multi-parametric magnetic resonance imaging (MPMRI) is showing promise for prostate cancer diagnosis. However, the accuracy and inter-observer variability of prostate tumour volume estimation based on separate expert contouring of T2-weighted (T2W), dynamic contrast-enhanced (DCE), and diffusion-weighted (DW) MRI sequences acquired using an endorectal coil at 3T is currently unknown. We investigated this question using a histologic reference standard based on a highly accurate MPMRI-histology image registration and a smooth interpolation of planimetric tumour measurements on histology. Our results showed that prostate tumour volumes estimated based on MPMRI consistently overestimated histological reference tumour volumes. The variability of tumour volume estimates across the different pulse sequences exceeded interobserver variability within any sequence. Tumour volume estimates on DCE MRI provided the lowest inter-observer variability and the highest correlation with histology tumour volumes, whereas the apparent diffusion coefficient (ADC) maps provided the lowest volume estimation error. If validated on a larger data set, the observed correlations could support the development of automated prostate tumour volume segmentation algorithms as well as correction schemes for tumour burden estimation on MPMRI.

Journal Club: Comparison of assessment of preoperative pulmonary vasculature in patients with non-small cell lung cancer by non-contrast- and 4D contrast-enhanced 3-T MR angiography and contrast-enhanced 64-MDCT.

PubMed

Ohno, Yoshiharu; Nishio, Mizuho; Koyama, Hisanobu; Yoshikawa, Takeshi; Matsumoto, Sumiaki; Seki, Shinichiro; Sugimura, Kazuro

2014-03-01

The purpose of this article is to prospectively and directly compare the capabilities of non-contrast-enhanced MR angiography (MRA), 4D contrast-enhanced MRA, and contrast-enhanced MDCT for assessing pulmonary vasculature in patients with non-small cell lung cancer (NSCLC) before surgical treatment. A total of 77 consecutive patients (41 men and 36 women; mean age, 71 years) with pathologically proven and clinically assessed stage I NSCLC underwent thin-section contrast-enhanced MDCT, non-contrast-enhanced and contrast-enhanced MRA, and surgical treatment. The capability for anomaly assessment of the three methods was independently evaluated by two reviewers using a 5-point visual scoring system, and final assessment for each patient was made by consensus of the two readers. Interobserver agreement for pulmonary arterial and venous assessment was evaluated with the kappa statistic. Then, sensitivity, specificity, and accuracy for the detection of anomalies were directly compared among the three methods by use of the McNemar test. Interobserver agreement for pulmonary artery and vein assessment was substantial or almost perfect (κ=0.72-0.86). For pulmonary arterial and venous variation assessment, there were no significant differences in sensitivity, specificity, and accuracy among non-contrast-enhanced MRA (pulmonary arteries: sensitivity, 77.1%; specificity, 97.4%; accuracy, 87.7%; pulmonary veins: sensitivity, 50%; specificity, 98.5%; accuracy, 93.2%), 4D contrast-enhanced MRA (pulmonary arteries: sensitivity, 77.1%; specificity, 97.4%; accuracy, 87.7%; pulmonary veins: sensitivity, 62.5%; specificity, 100.0%; accuracy, 95.9%), and thin-section contrast-enhanced MDCT (pulmonary arteries: sensitivity, 91.4%; specificity, 89.5%; accuracy, 90.4%; pulmonary veins: sensitivity, 50%; specificity, 100.0%; accuracy, 95.9%) (p>0.05). 
Pulmonary vascular assessment of patients with NSCLC before surgical resection by non-contrast-enhanced MRA can be considered equivalent to that by 4D contrast-enhanced MRA and contrast-enhanced MDCT.
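The McNemar test named in this abstract compares paired classifications, using only the cases on which two methods disagree. Below is a minimal sketch of the exact (binomial) form; whether the authors used the exact or the chi-square version is not stated, and the discordant counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: cases where method 1 is correct and method 2 is wrong
    c: cases where method 1 is wrong and method 2 is correct
    Under the null hypothesis the b discordant cases follow Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts (NOT the study data).
print(mcnemar_exact(5, 2))
```

For 5 versus 2 discordant cases the p-value is 0.453125, so such a split would not indicate a significant difference between two methods.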
Reliability of classification for post-traumatic ankle osteoarthritis.

PubMed

Claessen, Femke M A P; Meijer, Diederik T; van den Bekerom, Michel P J; Gevers Deynoot, Barend D J; Mallee, Wouter H; Doornberg, Job N; van Dijk, C Niek

2016-04-01

The purpose of this study was to identify the most reliable classification system for clinical outcome studies to categorize post-traumatic fracture osteoarthritis. A total of 118 orthopaedic surgeons and residents, gathered in the Ankle Platform Study Collaborative Science of Variation Group, evaluated 128 anteroposterior and lateral radiographs of patients after a bi- or trimalleolar ankle fracture on a Web-based platform in order to rate post-traumatic osteoarthritis according to the classification systems coined by (1) van Dijk, (2) Kellgren, and (3) Takakura. Reliability was evaluated using Siegel and Castellan's multirater kappa measure. Differences between classification systems were compared using the two-sample Z-test. Interobserver agreement of surgeons who participated in the survey was fair for the van Dijk osteoarthritis scale (k = 0.24), and poor for the Takakura (k = 0.19) and the Kellgren systems (k = 0.18) according to the categorical rating of Landis and Koch. This difference in one categorical rating was found to be significant (p < 0.001, CI 0.046-0.053) with the high numbers of observers and cases available. This study documents fair interobserver agreement for the van Dijk osteoarthritis scale, and poor interobserver agreement for the Takakura and Kellgren osteoarthritis classification systems. Because of the low interobserver agreement for the van Dijk, Kellgren, and Takakura classification systems, those systems cannot be used for clinical decision-making. Level of evidence: development of diagnostic criteria on the basis of consecutive patients, Level II.
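The categorical rating of Landis and Koch referred to above maps a kappa value to a verbal label. A minimal sketch of that lookup follows, using the bands as published by Landis and Koch (below 0 poor, up to 0.20 slight, up to 0.40 fair, up to 0.60 moderate, up to 0.80 substantial, up to 1.00 almost perfect); the function name is invented for illustration. Note that the original scale labels 0.00 to 0.20 as "slight", the band this abstract describes as "poor".

```python
def landis_koch(kappa):
    """Return the Landis and Koch verbal label for a kappa value."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# Kappa values quoted in the abstract above.
for name, k in [("van Dijk", 0.24), ("Takakura", 0.19), ("Kellgren", 0.18)]:
    print(name, k, landis_koch(k))
```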
Accuracy of clinical pallor in the diagnosis of anaemia in children: a meta-analysis.

PubMed

Chalco, Juan P; Huicho, Luis; Alamo, Carlos; Carreazo, Nilton Y; Bada, Carlos A

2005-12-08

Anaemia is highly prevalent in children of developing countries. It is associated with impaired physical growth and mental development. Palmar pallor is recommended at primary level for diagnosing it, on the basis of few studies. The objective of the study was to systematically assess the accuracy of clinical signs in the diagnosis of anaemia in children. A systematic review on the accuracy of clinical signs of anaemia in children. We performed an Internet search in various databases and an additional reference tracking. Studies had to be on performance of clinical signs in the diagnosis of anaemia, using haemoglobin as the gold standard. We calculated pooled diagnostic likelihood ratios (LRs) and odds ratios (DORs) for each clinical sign at different haemoglobin thresholds. Eleven articles met the inclusion criteria. Most studies were performed in Africa, in children under five. Chi-square test for proportions and Cochran Q for DORs and for LRs showed heterogeneity. Type of observer and haemoglobin technique influenced the results. Pooling was done using the random effects model. Pooled DOR at haemoglobin <11 g/dL was 4.3 (95% CI 2.6-7.2) for palmar pallor, 3.7 (2.3-5.9) for conjunctival pallor, and 3.4 (1.8-6.3) for nailbed pallor. DORs and LRs were slightly better for nailbed pallor at all other haemoglobin thresholds. The accuracy did not vary substantially after excluding outliers. This meta-analysis did not document a highly accurate clinical sign of anaemia. In view of poor performance of clinical signs, universal iron supplementation may be an adequate control strategy in high prevalence areas. Further well-designed studies are needed in settings other than Africa. They should assess inter-observer variation, performance of combined clinical signs, phenotypic differences, and different degrees of anaemia.
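The likelihood ratios and diagnostic odds ratio (DOR) pooled in this meta-analysis are each derived from a single study's 2x2 table before pooling. A minimal sketch of that per-study step is given below; the pooling itself (random effects model) is not shown, and the counts and function name are hypothetical.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, LR+, LR- and DOR from a 2x2 table.

    All four cells must be non-zero; studies with empty cells would need a
    continuity correction, which this sketch does not attempt.
    """
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive")
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": sens / (1.0 - spec),   # how much a positive sign raises the odds
        "lr_neg": (1.0 - sens) / spec,   # how much a negative sign lowers the odds
        "dor": (tp * tn) / (fp * fn),    # equals lr_pos / lr_neg
    }


# Hypothetical study: pallor present or absent versus haemoglobin below threshold.
result = accuracy_measures(tp=60, fp=30, fn=40, tn=70)
for name, value in result.items():
    print(name, round(value, 2))
```

In this invented table the DOR is 3.5, which falls in the same modest range as the pooled values quoted in the abstract and shows why a DOR of that size corresponds to a sign that shifts the odds only weakly.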
Micro-anatomical quantitative optical imaging: toward automated assessment of breast tissues.

PubMed

Dobbs, Jessica L; Mueller, Jenna L; Krishnamurthy, Savitri; Shin, Dongsuk; Kuerer, Henry; Yang, Wei; Ramanujam, Nirmala; Richards-Kortum, Rebecca

2015-08-20

Pathologists currently diagnose breast lesions through histologic assessment, which requires fixation and tissue preparation. The diagnostic criteria used to classify breast lesions are qualitative and subjective, and inter-observer discordance has been shown to be a significant challenge in the diagnosis of selected breast lesions, particularly for borderline proliferative lesions. Thus, there is an opportunity to develop tools to rapidly visualize and quantitatively interpret breast tissue morphology for a variety of clinical applications. Toward this end, we acquired images of freshly excised breast tissue specimens from a total of 34 patients using confocal fluorescence microscopy and proflavine as a topical stain. We developed computerized algorithms to segment and quantify nuclear and ductal parameters that characterize breast architectural features. A total of 33 parameters were evaluated and used as input to develop a decision tree model to classify benign and malignant breast tissue. Benign features were classified in tissue specimens acquired from 30 patients and malignant features were classified in specimens from 22 patients. The decision tree model that achieved the highest accuracy for distinguishing between benign and malignant breast features used the following parameters: standard deviation of inter-nuclear distance and number of duct lumens. The model achieved 81% sensitivity and 93% specificity, corresponding to an area under the curve of 0.93 and an overall accuracy of 90%. The model classified invasive ductal carcinoma (IDC) and ductal carcinoma in situ (DCIS) with 92% and 96% accuracy, respectively. The cross-validated model achieved 75% sensitivity and 93% specificity and an overall accuracy of 88%.
These results suggest that proflavine staining and confocal fluorescence microscopy combined with image analysis strategies to segment morphological features could potentially be used to quantitatively diagnose freshly obtained breast tissue at the point of care without the need for tissue preparation.
Diagnostic accuracy assessment of cytopathological examination of feline sporotrichosis.

PubMed

Jessica, N; Sonia, R L; Rodrigo, C; Isabella, D F; Tânia, M P; Jeferson, C; Anna, B F; Sandro, A

2015-11-01

Sporotrichosis is an implantation mycosis caused by pathogenic species of the Sporothrix schenckii complex that affects humans and animals, especially cats. Its main forms of zoonotic transmission include scratching, biting and/or contact with the exudate from lesions of sick cats. In Brazil, an epidemic involving humans, dogs and cats has been ongoing since 1998. The definitive diagnosis of sporotrichosis is obtained by the isolation of the fungus in culture; however, the result can take up to four weeks, which may delay the beginning of antifungal treatment in some cases. Cytopathological examination is often used in feline sporotrichosis diagnosis, but accuracy parameters have not been established yet. The aim of this study was to evaluate the accuracy and reliability of cytopathological examination in the diagnosis of feline sporotrichosis. The present study included 244 cats from the metropolitan region of Rio de Janeiro, mostly males of reproductive age with three or more lesions at non-adjacent anatomical sites. To evaluate the inter-observer reliability, two different observers performed the microscopic examination of the slides blindly. Test sensitivity was 84.9%. The positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio and accuracy were 86.0%, 24.4%, 2.02, 0.26 and 82.8%, respectively. The reliability between the two observers was considered substantial. We conclude that the cytopathological examination is a sensitive, rapid and practical method to be used in feline sporotrichosis diagnosis in outbreaks of this mycosis. © The Author 2015. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved.
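This record reports sensitivity and both likelihood ratios but not specificity. Because LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity, the reported figures can be cross-checked with a little arithmetic. The sketch below does this; the implied specificity is an inference from rounded published values, not a figure stated in the abstract, and the variable names are invented.

```python
# Values as reported in the abstract.
sensitivity = 0.849
lr_pos_reported = 2.02
lr_neg_reported = 0.26

# Rearranging LR+ = sens / (1 - spec) gives the specificity implied by LR+.
implied_specificity = 1.0 - sensitivity / lr_pos_reported

# Recompute LR- from that specificity and compare with the reported value.
lr_neg_recomputed = (1.0 - sensitivity) / implied_specificity

print("implied specificity:", round(implied_specificity, 3))
print("recomputed LR-:", round(lr_neg_recomputed, 2), "reported:", lr_neg_reported)
```

The recomputed negative likelihood ratio rounds to the reported 0.26, so the sensitivity and the two likelihood ratios are mutually consistent with a specificity of roughly 58%.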
Reliability of internal oblique elbow radiographs for measuring displacement of medial epicondyle humerus fractures: a cadaveric study.

PubMed

Gottschalk, Hilton P; Bastrom, Tracey P; Edmonds, Eric W

2013-01-01

Standard elbow radiographs (AP and lateral views) are not accurate enough to measure true displacement of medial epicondyle fractures of the humerus. The amount of perceived displacement has been used to determine treatment options. This study assesses the utility of internal oblique radiographs for measurement of true displacement in these fractures. A medial epicondyle fracture was created in a cadaveric specimen. Displacement of the fragment (mm) was set at 5, 10, and 15 in line with the vector of the flexor pronator mass. The fragment was sutured temporarily in place. Radiographs were obtained at 0 (AP), 15, 30, 45, 60, 75, and 90 degrees (lateral) of internal rotation, with the elbow in set positions of flexion. This was done with and without radio-opaque markers placed on the fragment and fracture bed. The 45 and 60 degrees internal oblique radiographs were then presented to 5 separate reviewers (of different levels of training) to evaluate intraobserver and interobserver agreement. Change in elbow position did not affect the perceived displacement (P=0.82) with excellent intraobserver reliability (intraclass correlation coefficient range, 0.979 to 0.988) and interobserver agreement of 0.953. The intraclass correlation coefficient for intraobserver reliability on 45 degrees internal oblique films for all groups ranged from 0.985 to 0.998, with interobserver agreement of 0.953. For predicting displacement, the observers were 60% accurate in predicting the true displacement on the 45 degrees internal oblique films and only 35% accurate using the 60 degrees internal oblique view. Standardizing to a 45 degrees internal oblique radiograph of the elbow (regardless of elbow flexion) can augment the treating surgeon's ability to determine true displacement. At this degree of rotation, the measured number can be multiplied by 1.4 to better estimate displacement. 
The addition of a 45 degrees internal oblique radiograph in medial humeral epicondyle fractures has good intraobserver and interobserver reliability to more accurately estimate the true displacement of these fractures. Diagnostic study, Level II (Development of diagnostic study with universally applied reference "gold" standard).
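As a rough illustration of the correction mentioned in this abstract, the sketch below scales a displacement measured on a 45 degrees internal oblique film by the reported factor of 1.4. Only the factor comes from the abstract; the function name and the sample measurement are hypothetical.

```python
# Correction factor reported in the abstract for 45-degree internal oblique films.
CORRECTION_FACTOR_45_DEG = 1.4


def estimate_true_displacement_mm(measured_mm: float) -> float:
    """Scale a displacement measured on a 45-degree internal oblique radiograph."""
    if measured_mm < 0:
        raise ValueError("measured displacement must be non-negative")
    return measured_mm * CORRECTION_FACTOR_45_DEG


# Hypothetical measurement of 7 mm on the oblique film (not study data).
estimate = estimate_true_displacement_mm(7.0)
print(f"estimated true displacement: {estimate:.1f} mm")
```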
Diagnostic electrocardiography in epidemiological studies of Chagas' disease: multicenter evaluation of a standardized method.

PubMed

Lázzari, J O; Pereira, M; Antunes, C M; Guimarães, A; Moncayo, A; Chávez Domínguez, R; Hernández Pieretti, O; Macedo, V; Rassi, A; Maguire, J; Romero, A

1998-11-01

An electrocardiographic recording method with an associated reading guide, designed for epidemiological studies on Chagas' disease, was tested to assess its diagnostic reproducibility. Six cardiologists from five countries each read 100 electrocardiographic (ECG) tracings, including 30 from chronic chagasic patients, then reread them after an interval of 6 months. The readings were blind, with the tracings numbered randomly for the first reading and renumbered randomly for the second reading. The physicians, all experienced in interpreting ECGs from chagasic patients, followed printed instructions for reading the tracings. Reproducibility of the readings was evaluated using the kappa (κ) index for concordance. The results showed a high degree of interobserver concordance with respect to the diagnosis of normal vs. abnormal tracings (κ = 0.66; SE 0.02). While the interpretations of some categories of ECG abnormalities were highly reproducible, others, especially those having a low prevalence, showed lower levels of concordance. Intraobserver concordance was uniformly higher than interobserver concordance. The findings of this study justify the use by specialists of the recording of readings method proposed for epidemiological studies on Chagas' disease, but warrant caution in the interpretation of some categories of electrocardiographic alterations.
FISH analysis for diagnostic evaluation of challenging melanocytic lesions.

PubMed

Zimmermann, A K; Hirschmann, A; Pfeiffer, D; Paredes, B E; Diebold, J

2010-09-01

The differential diagnosis of malignant melanomas and atypical melanocytic nevi is still a diagnostic challenge. The currently accepted morphologic criteria show substantial interobserver variability; likewise, immunohistochemical studies are often not able to discriminate these lesions reliably. Techniques that support diagnostic accuracy are of the greatest importance considering the growing incidence of malignant melanomas and their increase in younger patients. In this study we analyzed the feasibility of fluorescence in situ hybridization (FISH) analysis for the discrimination of malignant and benign melanocytic tumors. A panel of DNA probes was used to detect chromosomal aberrations of chromosomes 6 and 11. On a series of 5 clearly malignant and benign melanocytic tumors we confirmed the applicability of the test. Then we focused on examination of ambiguous melanocytic lesions, where atypical cells are often difficult to relocalize in the 4',6-diamidino-2-phenylindole (DAPI) fluorescence stain. FISH analyses were conducted on destained H&E-stained slides. By comparison of the DAPI image with photos taken from the H&E stain, unambiguous assignment of the FISH results to the conspicuous groups of cells was possible. The results of FISH analysis were consistent with the conventional diagnosis in 11 of 14 small ambiguous lesions. Of the remaining 3 cases, 2 showed FISH results close to the cut-off level. Comparison of FISH results on thin and thick sections revealed that the cut-off values have to be adapted for 2 µm destained sections. In conclusion, FISH analysis is a useful and applicable tool for assessment of even the smallest melanocytic neoplasms, although there will remain unclear cases that cannot be solved even after additional FISH evaluation.
Echocardiographic agreement in the diagnostic evaluation for infective endocarditis.

PubMed

Lauridsen, Trine Kiilerich; Selton-Suty, Christine; Tong, Steven; Afonso, Luis; Cecchi, Enrico; Park, Lawrence; Yow, Eric; Barnhart, Huiman X; Paré, Carlos; Samad, Zainab; Levine, Donald; Peterson, Gail; Stancoven, Amy Butler; Johansson, Magnus Carl; Dickerman, Stuart; Tamin, Syahidah; Habib, Gilbert; Douglas, Pamela S; Bruun, Niels Eske; Crowley, Anna Lisa

2016-07-01

Echocardiography is essential for the diagnosis and management of infective endocarditis (IE). However, the reproducibility for the echocardiographic assessment of variables relevant to IE is unknown. Objectives of this study were: (1) To define the reproducibility for IE echocardiographic variables and (2) to describe a methodology for assessing quality in an observational cohort containing site-interpreted data. IE reproducibility was assessed on a subset of echocardiograms from subjects enrolled in the International Collaboration on Endocarditis registry. Specific echocardiographic case report forms were used. Intra-observer agreement was assessed from six site readers on ten randomly selected echocardiograms. Inter-observer agreement between sites and an echocardiography core laboratory was assessed on a separate random sample of 110 echocardiograms. Agreement was determined using intraclass correlation (ICC), coverage probability (CP), and limits of agreement for continuous variables and kappa statistics (κweighted) and CP for categorical variables. Intra-observer agreement for LVEF was excellent [ICC = 0.93 ± 0.1 and all pairwise differences for LVEF (CP) were within 10 %]. For IE categorical echocardiographic variables, intra-observer agreement was best for aortic abscess (κweighted = 1.0, CP = 1.0 for all readers). Highest inter-observer agreement for IE categorical echocardiographic variables was obtained for vegetation location (κweighted = 0.95; 95 % CI 0.92-0.99) and lowest agreement was found for vegetation mobility (κweighted = 0.69; 95 % CI 0.62-0.86). Moderate to excellent intra- and inter-observer agreement is observed for echocardiographic variables in the diagnostic assessment of IE. A pragmatic approach for determining echocardiographic data reproducibility in a large, multicentre, site interpreted observational cohort is feasible.
Reproducibility of atypia of undetermined significance/follicular lesion of undetermined significance category using the bethesda system for reporting thyroid cytology when reviewing slides from different institutions: A study of interobserver variability among cytopathologists.

PubMed

Padmanabhan, Vijayalakshmi; Marshall, Carrie B; Akdas Barkan, Guliz; Ghofrani, Mohiedean; Laser, Alice; Tolgay Ocal, Idris; David Sturgis, Charles; Souers, Rhona; Kurtycz, Daniel F I

2017-05-01

The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) offers a six-tiered diagnostic scheme for thyroid Fine Needle Aspiration (FNA): Benign, Atypia of Undetermined Significance/Follicular Lesion of Undetermined Significance (AUS/FLUS), suspicious for follicular neoplasm, suspicious for malignancy, malignant, and unsatisfactory, with an aim to standardize diagnostic criteria. The reported rate of the AUS/FLUS category in the literature has varied from 3% to 20.5%. The aim of this study was to assess interobserver variability among cytopathologists to assess reproducibility of the AUS/FLUS category. Seven cytopathologists brought FNA cases (a mixture of atypical and non-atypical FNA diagnoses) diagnosed using TBSRTC from their respective institutions, which were reviewed and diagnosed by the participants. The analysis assessed interobserver variability among 7 cytopathologists and determined characteristics on the slides which were associated with concordance to the institutional diagnosis. Seventy eight of 125 (62.4%) benign cases were classified as benign by the reviewers and 26 (21%) were called AUS/FLUS on review. A third of the AUS/FLUS cases were called benign on review and 28.2% were classified as suspicious for neoplasia/malignancy. Roughly a third each of the suspicious for follicular neoplasm/suspicious for malignancy cases were classified as AUS/FLUS. When pathologists from different institutions shared their slides, concordance was high for specimens with adequate cellularity and those that were clearly benign, but thresholds varied for the other indeterminate categories. Most definite categorization of the AUS/FLUS category was seen on review. Diagn. Cytopathol. 2017;45:399-405. © 2017 Wiley Periodicals, Inc.
Interobserver Reliability of the Berlin ARDS Definition and Strategies to Improve the Reliability of ARDS Diagnosis.

PubMed

Sjoding, Michael W; Hofer, Timothy P; Co, Ivan; Courey, Anthony; Cooke, Colin R; Iwashyna, Theodore J

2018-02-01

Failure to reliably diagnose ARDS may be a major driver of negative clinical trials and underrecognition and treatment in clinical practice. We sought to examine the interobserver reliability of the Berlin ARDS definition and examine strategies for improving the reliability of ARDS diagnosis. Two hundred five patients with hypoxic respiratory failure from four ICUs were reviewed independently by three clinicians, who evaluated whether patients had ARDS, the diagnostic confidence of the reviewers, whether patients met individual ARDS criteria, and the time when criteria were met. Interobserver reliability of an ARDS diagnosis was "moderate" (kappa = 0.50; 95% CI, 0.40-0.59). Sixty-seven percent of diagnostic disagreements between clinicians reviewing the same patient were explained by differences in how chest imaging studies were interpreted, with other ARDS criteria contributing less (identification of ARDS risk factor, 15%; cardiac edema/volume overload exclusion, 7%). Combining the independent reviews of three clinicians can increase reliability to "substantial" (kappa = 0.75; 95% CI, 0.68-0.80). When a clinician diagnosed ARDS with "high confidence," all other clinicians agreed with the diagnosis in 72% of reviews. There was close agreement between clinicians about the time when a patient met all ARDS criteria if ARDS developed within the first 48 hours of hospitalization (median difference, 5 hours). The reliability of the Berlin ARDS definition is moderate, driven primarily by differences in chest imaging interpretation. Combining independent reviews by multiple clinicians or improving methods to identify bilateral infiltrates on chest imaging are important strategies for improving the reliability of ARDS diagnosis. Copyright © 2017 American College of Chest Physicians. All rights reserved.
Evaluation of a New Handheld Instrument for the Detection of Counterfeit Artesunate by Visual Fluorescence Comparison

PubMed Central

Ranieri, Nicola; Tabernero, Patricia; Green, Michael D.; Verbois, Leigh; Herrington, James; Sampson, Eric; Satzger, R. Duane; Phonlavong, Chindaphone; Thao, Khamxay; Newton, Paul N.; Witkowski, Mark R.

2014-01-01

There is an urgent need for accurate and inexpensive handheld instruments for the evaluation of medicine quality in the field. A blinded evaluation of the diagnostic accuracy of the Counterfeit Detection Device 3 (CD-3), developed by the US Food and Drug Administration Forensic Chemistry Center, was conducted in the Lao People's Democratic Republic. Two hundred three samples of the oral antimalarial artesunate were compared with authentic products using the CD-3 by a trainer and two trainees. The specificity (95% confidence interval [95% CI]), sensitivity (95% CI), positive predictive value (95% CI), and negative predictive value (95% CI) of the CD-3 for detecting counterfeit (falsified) artesunate were 100% (93.8–100%), 98.4% (93.8–99.7%), 100% (96.2–100%), and 97.4% (90.2–99.6%), respectively. Interobserver agreement for 203 samples of artesunate was 100%. The CD-3 holds promise as a relatively inexpensive and easy to use instrument for field evaluation of medicines, potentially empowering drug inspectors, customs agents, and pharmacists. PMID:25266348
Assessment of Bowel Wall Enhancement for the Diagnosis of Intestinal Ischemia in Patients with Small Bowel Obstruction: Value of Adding Unenhanced CT to Contrast-enhanced CT.

PubMed

Chuong, Anh Minh; Corno, Lucie; Beaussier, Hélène; Boulay-Coletta, Isabelle; Millet, Ingrid; Hodel, Jérôme; Taourel, Patrice; Chatellier, Gilles; Zins, Marc

2016-07-01

Purpose: To determine whether adding unenhanced computed tomography (CT) to contrast material-enhanced CT improves the diagnostic performance of decreased bowel wall enhancement as a sign of ischemia complicating mechanical small bowel obstruction (SBO). Materials and Methods: This retrospective study was approved by the institutional review board, which waived the requirement for informed consent. Two gastrointestinal radiologists independently performed retrospective assessments of 164 unenhanced and contrast-enhanced CT studies from 158 consecutive patients (mean age, 71.2 years) with mechanical SBO. The reference standard was the intraoperative and/or histologic diagnosis (in 80 cases) or results from clinical follow-up in patients who did not undergo surgery (84 cases). Decreased bowel wall enhancement was evaluated first with contrast-enhanced images alone and then, 1 month later, with both unenhanced and contrast-enhanced images. Diagnostic performance of decreased bowel wall enhancement and confidence in the diagnosis were compared between the two readings by using McNemar and Wilcoxon signed rank tests. Interobserver agreement was assessed by using κ statistics and compared with bootstrapping. Results: Ischemia was diagnosed in 41 of 164 (25%) episodes of SBO. For both observers, adding unenhanced images improved decreased bowel wall enhancement sensitivity (observer 1: 46.3% [19 of 41] vs 65.8% [27 of 41], P = .02; observer 2: 56.1% [23 of 41] vs 63.4% [26 of 41], P = .45), Youden index (from 0.41 to 0.58 for observer 1 and from 0.42 to 0.61 for observer 2), and confidence score (P < .001 for both). Specificity significantly increased for observer 2 (84.5% [104 of 123] vs 94.3% [116 of 123], P = .002), and interobserver agreement significantly increased, from moderate (κ = 0.48) to excellent (κ = 0.89; P < .0001).
Conclusion: Adding unenhanced CT to contrast-enhanced CT improved the sensitivity, diagnostic confidence, and interobserver agreement of the diagnosis of ischemia, a complication of mechanical SBO, on the basis of decreased bowel wall enhancement. © RSNA, 2016.
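The sensitivity and specificity figures in this abstract are simple proportions of the quoted counts. The sketch below recomputes them for observer 2, assuming the bracketed counts are true positives out of 41 ischemic episodes and true negatives out of 123 non-ischemic episodes. The helper name and dictionary layout are hypothetical, and the published percentages may differ in the last digit because of rounding.

```python
def proportion(hits: int, total: int) -> float:
    """Return hits / total, rejecting empty denominators."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


# Observer 2 counts as quoted in the abstract.
ISCHEMIC, NON_ISCHEMIC = 41, 123
readings = {
    "contrast-enhanced only": {"tp": 23, "tn": 104},
    "unenhanced + contrast-enhanced": {"tp": 26, "tn": 116},
}

results = {}
for label, counts in readings.items():
    sensitivity = proportion(counts["tp"], ISCHEMIC)      # TP / all ischemic
    specificity = proportion(counts["tn"], NON_ISCHEMIC)  # TN / all non-ischemic
    results[label] = (sensitivity, specificity)
    print(f"{label}: sensitivity {sensitivity:.3f}, specificity {specificity:.3f}")
```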
Automated cerebral infarct volume measurement in follow-up noncontrast CT scans of patients with acute ischemic stroke.

PubMed

Boers, A M; Marquering, H A; Jochem, J J; Besselink, N J; Berkhemer, O A; van der Lugt, A; Beenen, L F; Majoie, C B

2013-08-01

Cerebral infarct volume as observed in follow-up CT is an important radiologic outcome measure of the effectiveness of treatment of patients with acute ischemic stroke. However, manual measurement of CIV is time-consuming and operator-dependent. The purpose of this study was to develop and evaluate a robust automated measurement of the CIV. The CIV in early follow-up CT images of 34 consecutive patients with acute ischemic stroke was segmented with an automated intensity-based region-growing algorithm, which includes partial volume effect correction near the skull, midline determination, and ventricle and hemorrhage exclusion. Two observers manually delineated the CIV. Interobserver variability of the manual assessments and the accuracy of the automated method were evaluated by using the Pearson correlation, Bland-Altman analysis, and Dice coefficients. The accuracy was defined as the correlation with the manual assessment as a reference standard. The Pearson correlation for the automated method compared with the reference standard was similar to the manual correlation (R = 0.98). The accuracy of the automated method was excellent with a mean difference of 0.5 mL with limits of agreement of -38.0-39.1 mL, which were more consistent than the interobserver variability of the 2 observers (-40.9-44.1 mL). However, the Dice coefficients were higher for the manual delineation. The automated method showed a strong correlation and accuracy with the manual reference measurement. This approach has the potential to become the standard in assessing the infarct volume as a secondary outcome measure for evaluating the effectiveness of treatment.
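For readers unfamiliar with the agreement measures named in this abstract, the following sketch shows the usual definitions of Bland-Altman limits of agreement (mean difference ± 1.96 SD, a common convention that the abstract does not spell out) and the Dice coefficient. The volumes and voxel sets are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # assumes roughly normal differences
    return bias, bias - spread, bias + spread


def dice(voxels_a, voxels_b):
    """Dice overlap of two segmentations given as collections of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty segmentations agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical infarct volumes in mL (automated vs manual).
automated = [10.0, 55.2, 120.4, 33.1, 80.0]
manual = [12.0, 50.0, 118.0, 35.5, 84.0]
bias, lower, upper = bland_altman(automated, manual)
print(f"bias {bias:.2f} mL, limits of agreement {lower:.2f} to {upper:.2f} mL")

# Hypothetical voxel index sets for two delineations.
overlap = dice({1, 2, 3, 4}, {3, 4, 5, 6})
print(f"Dice coefficient {overlap:.2f}")
```

A high correlation with a low Dice coefficient is possible, since correlation compares volumes while Dice compares spatial overlap.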
IOTA Simple Rules in Differentiating between Benign and Malignant Adnexal Masses by Non-expert Examiners.

PubMed

Tinnangwattana, Dangcheewan; Vichak-Ururote, Linlada; Tontivuthikul, Paponrad; Charoenratana, Cholaros; Lerthiranwong, Thitikarn; Tongsong, Theera

2015-01-01

To evaluate the diagnostic performance of IOTA simple rules in predicting malignant adnexal tumors by non-expert examiners. Five obstetric/gynecologic residents, who had never performed gynecologic ultrasound examination by themselves before, were trained in IOTA simple rules by an experienced examiner. One trained resident performed ultrasound examinations including IOTA simple rules on 100 women, who were scheduled for surgery due to ovarian masses, within 24 hours of surgery. The gold standard diagnosis was based on pathological or operative findings. The five trained residents performed IOTA simple rules on 30 patients for evaluation of inter-observer variability. A total of 100 patients underwent ultrasound examination for the IOTA simple rules. Of them, IOTA simple rules could be applied in 94 (94%) masses including 71 (71.0%) benign masses and 29 (29.0%) malignant masses. The diagnostic performance of IOTA simple rules showed sensitivity of 89.3% (95%CI, 77.8%; 100.7%), specificity 83.3% (95%CI, 74.3%; 92.3%). Inter-observer variability was analyzed using Cohen's kappa coefficient. Kappa indices of the four pairs of raters were 0.713-0.884 (0.722, 0.827, 0.713, and 0.884). IOTA simple rules have high diagnostic performance in discriminating adnexal masses even when applied by non-expert sonographers, though a training course may be required. Nevertheless, they should be further tested by a greater number of general practitioners before wide use.
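Cohen's kappa, the statistic used here for each pair of raters, compares observed agreement with the agreement expected by chance. A minimal sketch of the standard formula follows; the two rating lists are hypothetical (B for benign, M for malignant) and do not reproduce the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / n**2
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of ten masses by two residents.
resident_1 = ["B", "B", "M", "M", "B", "M", "B", "B", "M", "B"]
resident_2 = ["B", "B", "M", "B", "B", "M", "B", "M", "M", "B"]
kappa = cohens_kappa(resident_1, resident_2)
print(f"kappa = {kappa:.3f}")
```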
A critical appraisal of vertebral fracture assessment in paediatrics.

PubMed

Kyriakou, Andreas; Shepherd, Sheila; Mason, Avril; Faisal Ahmed, S

2015-12-01

There is a need to improve our understanding of the clinical utility of vertebral fracture assessment (VFA) in paediatrics and this requires a thorough evaluation of its readability, reproducibility, and accuracy for identifying VF. VFA was performed independently by two observers, in 165 children and adolescents with a median age of 13.4 years (range, 3.6, 18). In 20 of these subjects, VFA was compared to lateral vertebral morphometry assessment on lateral spine X-ray (LVM). 1528 (84%) of the vertebrae were adequately visualised by both observers for VFA. Interobserver agreement in vertebral readability was 94% (kappa, 0.73 [95% CI, 0.68, 0.73]). 93% of the non-readable vertebrae were located between T6 and T9. Interobserver agreement per-vertebra for the presence of VF was 99% (kappa, 0.85 [95% CI, 0.79, 0.91]). Interobserver agreement per-subject was 91% (kappa, 0.78 [95% CI, 0.66, 0.87]). Per-vertebra agreement between LVM and VFA was 95% (kappa 0.79 [95% CI, 0.62, 0.92]) and per-subject agreement was 95% (kappa, 0.88 [95% CI, 0.58, 1.0]). Accepting LVM as the gold standard, VFA had a positive predictive value (PPV) of 90% and a negative predictive value (NPV) of 95% in per-vertebra analysis and a PPV of 100% and NPV of 93% in per-subject analysis. VFA reaches an excellent level of agreement between observers and a high level of accuracy in identifying VF in a paediatric population. The readability of vertebrae at the mid thoracic region is suboptimal and interpretation at this level should be exercised with caution. Copyright © 2015 Elsevier Inc. All rights reserved.
Reproducibility of abdominal fat assessment by ultrasound and computed tomography

PubMed Central

Mauad, Fernando Marum; Chagas-Neto, Francisco Abaeté; Benedeti, Augusto César Garcia Saab; Nogueira-Barbosa, Marcello Henrique; Muglia, Valdair Francisco; Carneiro, Antonio Adilton Oliveira; Muller, Enrico Mattana; Elias Junior, Jorge

2017-01-01

Objective: To test the accuracy and reproducibility of ultrasound and computed tomography (CT) for the quantification of abdominal fat in correlation with the anthropometric, clinical, and biochemical assessments. Materials and Methods: Using ultrasound and CT, we determined the thickness of subcutaneous and intra-abdominal fat in 101 subjects, of whom 39 (38.6%) were men and 62 (61.4%) were women, with a mean age of 66.3 years (60-80 years). The ultrasound data were correlated with the anthropometric, clinical, and biochemical parameters, as well as with the areas measured by abdominal CT. Results: Intra-abdominal thickness was the variable for which the correlation with the areas of abdominal fat was strongest (i.e., the correlation coefficient was highest). We also tested the reproducibility of ultrasound and CT for the assessment of abdominal fat and found that CT measurements of abdominal fat showed greater reproducibility, having higher intraobserver and interobserver reliability than the ultrasound measurements. There was a significant correlation between ultrasound and CT, with a correlation coefficient of 0.71. Conclusion: In the assessment of abdominal fat, the intraobserver and interobserver reliability were greater for CT than for ultrasound, although both methods showed high accuracy and good reproducibility. PMID:28670024
Reproducibility of abdominal fat assessment by ultrasound and computed tomography.

PubMed

Mauad, Fernando Marum; Chagas-Neto, Francisco Abaeté; Benedeti, Augusto César Garcia Saab; Nogueira-Barbosa, Marcello Henrique; Muglia, Valdair Francisco; Carneiro, Antonio Adilton Oliveira; Muller, Enrico Mattana; Elias Junior, Jorge

2017-01-01

To test the accuracy and reproducibility of ultrasound and computed tomography (CT) for the quantification of abdominal fat in correlation with the anthropometric, clinical, and biochemical assessments. Using ultrasound and CT, we determined the thickness of subcutaneous and intra-abdominal fat in 101 subjects, of whom 39 (38.6%) were men and 62 (61.4%) were women, with a mean age of 66.3 years (60-80 years). The ultrasound data were correlated with the anthropometric, clinical, and biochemical parameters, as well as with the areas measured by abdominal CT. Intra-abdominal thickness was the variable for which the correlation with the areas of abdominal fat was strongest (i.e., the correlation coefficient was highest). We also tested the reproducibility of ultrasound and CT for the assessment of abdominal fat and found that CT measurements of abdominal fat showed greater reproducibility, having higher intraobserver and interobserver reliability than the ultrasound measurements. There was a significant correlation between ultrasound and CT, with a correlation coefficient of 0.71. In the assessment of abdominal fat, the intraobserver and interobserver reliability were greater for CT than for ultrasound, although both methods showed high accuracy and good reproducibility.
Toluidine Blue 0.05% Vital Staining for the Diagnosis of Ocular Surface Squamous Neoplasia in Kenya.

PubMed

Gichuhi, Stephen; Macharia, Ephantus; Kabiru, Joy; Zindamoyen, Alain M'bongo; Rono, Hilary; Ollando, Ernest; Wanyonyi, Leonard; Wachira, Joseph; Munene, Rhoda; Onyuma, Timothy; Jaoko, Walter G; Sagoo, Mandeep S; Weiss, Helen A; Burton, Matthew J

2015-11-01

Clinical features are unreliable for distinguishing ocular surface squamous neoplasia (OSSN) from benign conjunctival lesions. To evaluate the adverse effects, accuracy, and interobserver variation of toluidine blue 0.05% vital staining in distinguishing OSSN, confirmed by histopathology, from other conjunctival lesions. Cross-sectional study in Kenya from July 2012 through July 2014 of 419 adults with suspicious conjunctival lesions. Pregnant and breastfeeding women were excluded. Comprehensive ophthalmic slitlamp examination was conducted. Vital staining with toluidine blue 0.05% aqueous solution was performed before surgery. Initial safety testing was conducted on large tumors scheduled for exenteration looking for corneal toxicity on histology before testing smaller tumors. We asked about pain or discomfort after staining and evaluated the cornea at the slitlamp for epithelial defects. Lesions were photographed before and after staining. Diagnosis was confirmed by histopathology. Six examiners assessed photographs from a subset of 100 consecutive participants for staining and made a diagnosis of OSSN vs non-OSSN. Staining was compared with histopathology to estimate sensitivity, specificity, and predictive values. Adverse effects were enumerated. Interobserver agreement was estimated using the κ statistic. A total of 143 of 419 participants (34%) had OSSN by histopathology. The median age of all participants was 37 years (interquartile range, 32-45 years) and 278 (66%) were female. A total of 322 of the 419 participants had positive staining while 2 of 419 were equivocal. There was no histological evidence of corneal toxicity. Mild discomfort was reported by 88 (21%) and mild superficial punctate keratopathy seen in 7 (1.7%). For detecting OSSN, toluidine blue had a sensitivity of 92% (95% CI, 87%-96%), specificity of 31% (95% CI, 25%-36%), positive predictive value of 41% (95% CI, 35%-46%), and negative predictive value of 88% (95% CI, 80%-94%). 
Interobserver agreement was substantial for staining (κ = 0.76) and moderate for diagnosis (κ = 0.40). With the high sensitivity and low specificity for OSSN compared with histopathology among patients with conjunctival lesions, toluidine blue 0.05% vital staining is a good screening tool. However, it is not a good diagnostic tool owing to a high frequency of false-positives. The high negative predictive value suggests that a negative staining result indicates that OSSN is relatively unlikely.
The effect of dental artifacts, contrast media, and experience on interobserver contouring variations in head and neck anatomy.

PubMed

O'Daniel, Jennifer C; Rosenthal, David I; Garden, Adam S; Barker, Jerry L; Ahamad, Anesa; Ang, K Kian; Asper, Joshua A; Blanco, Angel I; de Crevoisier, Renaud; Holsinger, F Christopher; Patel, Chirag B; Schwartz, David L; Wang, He; Dong, Lei

2007-04-01

To investigate interobserver variability in the delineation of head-and-neck (H&N) anatomic structures on CT images, including the effects of image artifacts and observer experience. Nine observers (7 radiation oncologists, 1 surgeon, and 1 physician assistant) with varying levels of H&N delineation experience independently contoured H&N gross tumor volumes and critical structures on radiation therapy treatment planning CT images alongside reference diagnostic CT images for 4 patients with oropharynx cancer. Image artifacts from dental fillings partially obstructed 3 images. Differences in the structure volumes, center-of-volume positions, and boundary positions (1 SD) were measured. In-house software created three-dimensional overlap distributions, including all observers. The effects of dental artifacts and observer experience on contouring precision were investigated, and the need for contrast media was assessed. In the absence of artifacts, all 9 participants achieved reasonable precision (1 SD ≤ 3 mm all boundaries). The structures obscured by dental image artifacts had larger variations when measured by the 3 metrics (1 SD = 8 mm cranial/caudal boundary). Experience improved the interobserver consistency of contouring for structures obscured by artifacts (1 SD = 2 mm cranial/caudal boundary). Interobserver contouring variability for anatomic H&N structures, specifically oropharyngeal gross tumor volumes and parotid glands, was acceptable in the absence of artifacts. Dental artifacts increased the contouring variability, but experienced participants achieved reasonable precision even with artifacts present. With a staging contrast CT image as a reference, delineation on a noncontrast treatment planning CT image can achieve acceptable precision.

Diagnostic Accuracy of Lumbosacral Spine Magnetic Resonance Image Reading by Chiropractors, Chiropractic Radiologists, and Medical Radiologists.

PubMed

de Zoete, Annemarie; Ostelo, Raymond; Knol, Dirk L; Algra, Paul R; Wilmink, Jan T; van Tulder, Maurits W

2015-06-01

A cross-sectional diagnostic accuracy study was conducted in 2 sessions. It is important to know whether it is possible to accurately detect "specific findings" on lumbosacral magnetic resonance (MR) images and whether the results of different observers are comparable. Health care providers frequently use magnetic resonance imaging in the diagnostic process of patients with low back pain. The use of MR scans is increasing. This leads to an increase in costs and to an increase in risk of inaccurately labeling patients with an anatomical diagnosis that might not be the actual cause of symptoms. A set of 300 blinded MR images was read by medical radiologists, chiropractors, and chiropractic radiologists in 2 sessions. Each assessor read 100 scans in round 1 and 50 scans in round 2. The reference test was an expert panel. For all analyses, the magnetic resonance imaging findings were dichotomized into "specific findings" or "no specific findings." For agreement, percentage agreement and κ values were calculated; for validity, sensitivity and specificity were calculated. Sensitivity analysis was done for classifications A and B (prevalence of 31% and 57%, respectively). The intraobserver κ values for chiropractors, chiropractic radiologists, and medical radiologists were 0.46, 0.49, and 0.69 for A and 0.55, 0.75, and 0.64 for B, respectively. The interobserver κ values were lowest for chiropractors (0.28 for A, 0.37 for B) and highest for chiropractic radiologists (0.50 for A, 0.49 for B). The sensitivities of the medical radiologists, chiropractors, and chiropractic radiologists were 0.62, 0.71, and 0.75 for A and 0.70, 0.74, 0.84 for B, respectively. The specificities of medical radiologists, chiropractic radiologists, and chiropractors were 0.82, 0.77, and 0.70 for A and 0.74, 0.52, and 0.61 for B, respectively. Agreement and validity of MR image readings of chiropractors and chiropractic and medical radiologists are modest at best.
This study supports recommendations in clinical guidelines against routine use of magnetic resonance imaging in patients with low back pain. 3.
Transperineal ultrasound compared to evacuation proctography for diagnosing enteroceles and intussusceptions.

PubMed

Weemhoff, M; Kluivers, K B; Govaert, B; Evers, J L H; Kessels, A G H; Baeten, C G

2013-03-01

This study concerns the level of agreement between transperineal ultrasound and evacuation proctography for diagnosing enteroceles and intussusceptions. In a prospective observational study, 50 consecutive women who were planned to have an evacuation proctography underwent transperineal ultrasound too. Sensitivity, specificity, positive (PPV) and negative predictive value, as well as the positive and negative likelihood ratio of transperineal ultrasound were assessed in comparison to evacuation proctography. To determine the interobserver agreement of transperineal ultrasound, the quadratic weighted kappa was calculated. Furthermore, receiver operating characteristic curves were generated to show the diagnostic capability of transperineal ultrasound. For diagnosing intussusceptions (PPV 1.00), a positive finding on transperineal ultrasound was predictive of an abnormal evacuation proctography. Sensitivity of transperineal ultrasound was poor for intussusceptions (0.25). For diagnosing enteroceles, the positive likelihood ratio was 2.10 and the negative likelihood ratio, 0.85. There are many false-positive findings of enteroceles on ultrasonography (PPV 0.29). The interobserver agreement of the two ultrasonographers assessed as the quadratic weighted kappa of diagnosing enteroceles was 0.44 and that of diagnosing intussusceptions was 0.23. An intussusception on ultrasound is predictive of an abnormal evacuation proctography. For diagnosing enteroceles, the diagnostic quality of transperineal ultrasound was limited compared to evacuation proctography.
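The accuracy measures listed in this abstract all derive from a 2×2 table of index test against reference test. The sketch below uses the standard definitions, including LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The counts are hypothetical, chosen only to sum to 50 cases, and are not the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures from a 2x2 table (index test vs reference)."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Likelihood ratios; undefined (inf) when the denominator is zero.
        "lr_positive": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_negative": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
    }


# Hypothetical counts for 50 examinations (illustrative only).
metrics = diagnostic_metrics(tp=8, fp=4, fn=12, tn=26)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A positive likelihood ratio close to 1 means a positive result changes the probability of disease very little, which is one way to read the limited diagnostic quality reported for enteroceles.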
Comparison of C-arm Computed Tomography and Digital Subtraction Angiography in Patients with Chronic Thromboembolic Pulmonary Hypertension

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hinrichs, Jan B., E-mail: hinrichs.jan@mh-hannover.de; Marquardt, Steffen, E-mail: marquardt.steffen@mh-hannover.de; Falck, Christian von, E-mail: falck.christian.von@mh-hannover.de

Purpose: To assess the feasibility and diagnostic performance of contrast-enhanced, C-arm computed tomography (CACT) of the pulmonary arteries compared to digital subtraction angiography (DSA) in patients suffering from chronic thromboembolic pulmonary hypertension (CTEPH). Materials: Fifty-two patients with CTEPH underwent ECG-gated DSA and contrast-enhanced CACT. Two readers (R1, R2) independently evaluated pulmonary artery segments and their sub-segmental branching using DSA and CACT for optimal image quality. Afterwards, the diagnostic findings, i.e., intraluminal filling defects, stenosis, and occlusion, were compared. Inter-modality and inter-observer agreement was calculated, and subsequently consensus reading was done and correlated to a reference standard representing the overall consensus of both modalities. Fisher’s exact test and Cohen’s Kappa were applied. Results: A total of 1352 pulmonary segments were evaluated, of which 1255 (92.8 %) on DSA and 1256 (92.9 %) on CACT were rated to be fully diagnostic. The main causes of the non-diagnostic image quality were motion artifacts on CACT (R1:37, R2:78) and insufficient contrast enhancement on DSA (R1:59, R2:38). Inter-observer agreement was good for DSA (κ = 0.74) and CACT (κ = 0.75), while inter-modality agreement was moderate (R1: κ = 0.46, R2: κ = 0.47). Compared to the reference standard, the inter-modality agreement for CACT was excellent (κ = 0.96), whereas it was inferior for DSA (κ = 0.61) due to the higher number of abnormal consensus findings read as normal on DSA. Conclusion: CACT of the pulmonary arteries is feasible and provides additional information to DSA. CACT has the potential to improve the diagnostic work-up of patients with CTEPH and may be particularly useful prior to surgical or interventional treatment.
Accurate Classification of Diminutive Colorectal Polyps Using Computer-Aided Analysis.

PubMed

Chen, Peng-Jen; Lin, Meng-Chiung; Lai, Mei-Ju; Lin, Jung-Chun; Lu, Henry Horng-Shing; Tseng, Vincent S

2018-02-01

Narrow-band imaging is an image-enhanced form of endoscopy used to observe microstructures and capillaries of the mucosal epithelium, which allows for real-time prediction of histologic features of colorectal polyps. However, narrow-band imaging expertise is required to differentiate hyperplastic from neoplastic polyps with high levels of accuracy. We developed and tested a system of computer-aided diagnosis with a deep neural network (DNN-CAD) to analyze narrow-band images of diminutive colorectal polyps. We collected 1476 images of neoplastic polyps and 681 images of hyperplastic polyps, obtained from the picture archiving and communications system database in a tertiary hospital in Taiwan. Histologic findings from the polyps were also collected and used as the reference standard. The images and data were used to train the DNN. A test set of images (96 hyperplastic and 188 neoplastic polyps, smaller than 5 mm), obtained from patients who underwent colonoscopies from March 2017 through August 2017, was then used to test the diagnostic ability of the DNN-CAD vs endoscopists (2 expert and 4 novice), who were asked to classify the images of the test set as neoplastic or hyperplastic. Their classifications were compared with findings from histologic analysis. The primary outcome measures were diagnostic accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic time. The accuracy, sensitivity, specificity, PPV, NPV, and diagnostic time were compared among DNN-CAD, the novice endoscopists, and the expert endoscopists. The study was designed to detect a difference of 10% in accuracy by a 2-sided McNemar test. In the test set, the DNN-CAD identified neoplastic or hyperplastic polyps with 96.3% sensitivity, 78.1% specificity, a PPV of 89.6%, and a NPV of 91.5%. Fewer than half of the novice endoscopists classified polyps with a NPV of 90% (their NPVs ranged from 73.9% to 84.0%).
DNN-CAD classified polyps as neoplastic or hyperplastic in 0.45 ± 0.07 seconds, shorter than the time required by experts (1.54 ± 1.30 seconds) and nonexperts (1.77 ± 1.37 seconds) (both P < .001). DNN-CAD classified polyps with perfect intra-observer agreement (kappa score of 1). There was a low level of intra-observer and inter-observer agreement in classification among endoscopists. We developed a system called DNN-CAD to identify neoplastic or hyperplastic colorectal polyps less than 5 mm. The system classified polyps with a PPV of 89.6%, and a NPV of 91.5%, and in a shorter time than endoscopists. This deep-learning model has potential for not only endoscopic image recognition but for other forms of medical image analysis, including sonography, computed tomography, and magnetic resonance images. Copyright © 2018 AGA Institute. Published by Elsevier Inc. All rights reserved.
A comparison of digital tomosynthesis and chest radiography in evaluating airway lesions using computed tomography as a reference.

PubMed

Choo, Ji Yung; Lee, Ki Yeol; Yu, Ami; Kim, Je-Hyeong; Lee, Seung Heon; Choi, Jung Won; Kang, Eun-Young; Oh, Yu Whan

2016-09-01

To compare the diagnostic performance of digital tomosynthesis (DTS) and chest radiography for detecting airway abnormalities, using computed tomography (CT) as a reference. We evaluated 161 data sets from 149 patients (91 with and 70 without airway abnormalities) who had undergone radiography, DTS, and CT to detect airway problems. Radiographs and DTS were evaluated to localize and score the severity of the airway abnormalities, and to score the image quality using CT as a reference. Receiver operating characteristics (ROC), McNemar's test, weighted kappa, and the paired t-test were used for statistical analysis. The sensitivity of DTS was higher (reader 1, 93.51 %; reader 2, 94.29 %) than chest radiography (68.83 %; 71.43 %) in detecting airway lesions. The diagnostic accuracy of DTS (90.91 %; 94.70 %) was also significantly better than that of radiography (78.03 %; 82.58 %, all p < 0.05). DTS image quality was significantly better than chest radiography (1.83, 2.74; p < 0.05) in the results of both readers. The inter-observer agreement with respect to DTS findings was moderate and superior when compared to radiography findings. DTS is a more accurate and sensitive modality than radiography for detecting airway lesions that are easily obscured by soft tissue structures in the mediastinum. • Digital tomosynthesis offers new diagnostic options for airway lesions. • Digital tomosynthesis is more sensitive and accurate than radiography for airway lesions. • Digital tomosynthesis shows better image quality than radiography. • Assessment of lesion severity via tomosynthesis is comparable to computed tomography.
Image Quality and Stenosis Assessment of Non-Contrast-Enhanced 3-T Magnetic Resonance Angiography in Patients with Peripheral Artery Disease Compared with Contrast-Enhanced Magnetic Resonance Angiography and Digital Subtraction Angiography

PubMed Central

Liu, Jiayi; Zhang, Nan; Fan, Zhaoyang; Luo, Nan; Zhao, Yike; Bi, Xiaoming; An, Jing; Chen, Zhong; Liu, Dongting; Wen, Zhaoying; Fan, Zhanming; Li, Debiao

2016-01-01

Purpose: To evaluate the diagnostic performance of flow-sensitive dephasing (FSD)-prepared steady-state free precession (SSFP) magnetic resonance angiography (MRA) at 3 T for imaging infragenual arteries relative to contrast-enhanced MRA (CE-MRA) and digital subtraction angiography (DSA). Materials and Methods: A series of 16 consecutive patients with peripheral arterial disease (PAD) underwent a combined peripheral MRA protocol consisting of FSD-MRA for the calves and large field-of-view CE-MRA. DSA was performed on all patients within 1 week of the MR angiographies. Image quality and degree of stenosis were assessed by two experienced readers. Inter-observer agreement was determined using kappa statistics. Receiver operating characteristic (ROC) curve analysis determined the diagnostic value of FSD-MRA, CE-MRA, and CE-MRA combined with FSD-MRA (CE+FSD MRA) in predicting vascular stenosis. Results: At the calf station, no significant difference in subjective image quality scores was found between FSD-MRA and CE-MRA. Inter-reader agreement was excellent for both FSD-MRA and CE-MRA. Both FSD-MRA and CE-MRA carry a risk of overestimating stenosis relative to the DSA standard. With DSA as the reference standard, ROC curve analysis showed that the area under the curve was largest for CE+FSD MRA. The greatest sensitivity and specificity were obtained when a cut-off stenosis score of 2 was used. Conclusion: In patients with severe PAD, 3 T FSD-MRA provides good-quality diagnostic images without a contrast agent and is a good supplement for CE-MRA. CE+FSD MRA can improve the accuracy of vascular stenosis diagnosis. PMID:27861626
Computer-assisted assessment of ultrasound real-time elastography: initial experience in 145 breast lesions.

PubMed

Zhang, Xue; Xiao, Yang; Zeng, Jie; Qiu, Weibao; Qian, Ming; Wang, Congzhi; Zheng, Rongqin; Zheng, Hairong

2014-01-01

To develop and evaluate a computer-assisted method of quantifying the five-point elasticity scoring system based on ultrasound real-time elastography (RTE), for classifying benign and malignant breast lesions, with pathologic results as the reference standard. Conventional ultrasonography (US) and RTE images of 145 breast lesions (67 malignant, 78 benign) were obtained in this study. Each lesion was automatically contoured on the B-mode image by the level set method and mapped on the RTE image. The relative elasticity value of each pixel was reconstructed and classified into hard or soft by the fuzzy c-means clustering method. According to the hardness degree inside the lesion and its surrounding tissue, the elasticity score of the RTE image was computed in an automatic way. Visual assessments of the radiologists were used for comparing the diagnostic performance. Histopathologic examination was used as the reference standard. The Student's t test and receiver operating characteristic (ROC) curve analysis were performed for statistical analysis. Considering score 4 or higher as test positive for malignancy, the diagnostic accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were 93.8% (136/145), 92.5% (62/67), 94.9% (74/78), 93.9% (62/66), and 93.7% (74/79) for the computer-assisted scheme, and 89.7% (130/145), 85.1% (57/67), 93.6% (73/78), 92.0% (57/62), and 88.0% (73/83) for manual assessment. Area under ROC curve (Az value) for the proposed method was higher than the Az value for visual assessment (0.96 vs. 0.93). Computer-assisted quantification of the classical five-point scoring system can significantly eliminate the interobserver variability and thereby improve the diagnostic confidence of classifying the breast lesions to avoid unnecessary biopsy. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
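The fractions quoted for the computer-assisted scheme in the abstract above (62/67, 74/78, 62/66, 74/79, 136/145) are consistent with a 2×2 table of 62 true positives, 4 false positives, 5 false negatives, and 74 true negatives. As a short worked check, the sketch below recomputes the five reported measures from those counts; the function name `diagnostic_metrics` is an illustrative assumption, and the cell counts are inferred from the quoted fractions rather than stated as a table in the source.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positives among all diseased
        "specificity": tn / (tn + fp),   # true negatives among all non-diseased
        "ppv": tp / (tp + fp),           # diseased among all test positives
        "npv": tn / (tn + fn),           # non-diseased among all test negatives
    }


# Cell counts inferred from the fractions quoted for the computer-assisted scheme.
computer_assisted = diagnostic_metrics(tp=62, fp=4, fn=5, tn=74)
for name, value in computer_assisted.items():
    print(f"{name}: {value:.1%}")
```

Run as written, this reproduces the 93.8%, 92.5%, 94.9%, 93.9%, and 93.7% reported for accuracy, sensitivity, specificity, PPV, and NPV.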
Emerging enhanced imaging technologies of the esophagus: spectroscopy, confocal laser endomicroscopy, and optical coherence tomography.

PubMed

Robles, Lourdes Y; Singh, Satish; Fisichella, Piero Marco

2015-05-15

Despite advances in diagnoses and therapy, esophageal adenocarcinoma remains a highly lethal neoplasm. Hence, a great interest has been placed in detecting early lesions and in the detection of Barrett esophagus (BE). Advanced imaging technologies of the esophagus have then been developed with the aim of improving biopsy sensitivity and detection of preneoplastic and neoplastic cells. The purpose of this article was to review emerging imaging technologies for esophageal pathology, spectroscopy, confocal laser endomicroscopy (CLE), and optical coherence tomography (OCT). We conducted a PubMed search using the search string "esophagus or esophageal or oesophageal or oesophagus" and "Barrett or esophageal neoplasm" and "spectroscopy or optical spectroscopy" and "confocal laser endomicroscopy" and "confocal microscopy" and "optical coherence tomography." The first and senior author separately reviewed all articles. Our search identified: 19 in vivo studies with spectroscopy that accounted for 1021 patients and 4 ex vivo studies; 14 clinical CLE in vivo studies that accounted for 941 patients and 1 ex vivo study with 13 patients; and 17 clinical OCT in vivo studies that accounted for 773 patients and 2 ex vivo studies. Human studies using spectroscopy had a very high sensitivity and specificity for the detection of BE. CLE showed a high interobserver agreement in diagnosing esophageal pathology and an accuracy of predicting neoplasia. We also found several clinical studies that reported excellent diagnostic sensitivity and specificity for the detection of BE using OCT. Advanced imaging technology for the detection of esophageal lesions is a promising field that aims to improve the detection of early esophageal lesions. Although advancing imaging techniques improve diagnostic sensitivities and specificities, their integration into diagnostic protocols has yet to be perfected. Copyright © 2015 Elsevier Inc. All rights reserved.
Predictive diagnostic value of the tourniquet test for the diagnosis of dengue infection in adults

PubMed Central

Mayxay, Mayfong; Phetsouvanh, Rattanaphone; Moore, Catrin E; Chansamouth, Vilada; Vongsouvath, Manivanh; Sisouphone, Syho; Vongphachanh, Pankham; Thaojaikong, Thaksinaporn; Thongpaseuth, Soulignasack; Phongmany, Simmaly; Keolouangkhot, Valy; Strobel, Michel; Newton, Paul N

2011-01-01

Objective: To examine the accuracy of the admission tourniquet test in the diagnosis of dengue infection among Lao adults. Methods: Prospective assessment of the predictive diagnostic value of the tourniquet test for the diagnosis of dengue infection, as defined by IgM, IgG and NS1 ELISAs (Panbio Ltd, Australia), among Lao adult inpatients with clinically suspected dengue infection. Results: Of 234 patients with clinically suspected dengue infection on admission, 73% were serologically confirmed to have dengue, while 64 patients with negative dengue serology were diagnosed as having scrub typhus (39%), murine typhus (11%), undetermined typhus (12%), Japanese encephalitis virus (5%), undetermined flavivirus (5%) and typhoid fever (3%); 25% had no identifiable aetiology. The tourniquet test was positive in 29.1% (95% CI = 23.2–34.9%) of all patients and in 34.1% (95% CI = 27.0–41.2%) of dengue-seropositive patients, in 32.7% (95% CI = 23.5–41.8) of those with dengue fever and in 36.4% (95% CI = 24.7–48.0) of those with dengue haemorrhagic fever. Interobserver agreement for the tourniquet test was 90.2% (95% CI = 86.4–94.0) (Kappa = 0.76). Using ELISAs as the diagnostic gold standard, the sensitivity of the tourniquet test was 33.5–34%; its specificity was 84–91%. The positive and negative predictive values were 85–90% and 32.5–34%, respectively. Conclusions: The admission tourniquet test has low sensitivity and adds relatively little value to the diagnosis of dengue among Lao adult inpatients with suspected dengue. Although a positive tourniquet test suggests dengue and that treatment of alternative diagnoses may not be needed, a negative test result does not exclude dengue. PMID:20958892
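The two agreement figures reported for the tourniquet test (90.2% raw agreement and a kappa of 0.76) are linked by the definition of Cohen's kappa. As a rough worked check, assuming the reported value is an unweighted Cohen's kappa and ignoring rounding, the chance-expected agreement $p_e$ can be recovered from the observed agreement $p_o$; the value of $p_e$ is not reported in the abstract and is inferred here.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\quad\Longrightarrow\quad
p_e = \frac{p_o - \kappa}{1 - \kappa}
    = \frac{0.902 - 0.76}{1 - 0.76}
    = \frac{0.142}{0.24}
    \approx 0.59
```

Under these assumptions, roughly 59% agreement would be expected by chance alone, which is why a raw agreement of about 90% corresponds to a kappa of only 0.76.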
Test-retest and interobserver reliability of quantitative sensory testing according to the protocol of the German Research Network on Neuropathic Pain (DFNS): a multi-centre study.

PubMed

Geber, Christian; Klein, Thomas; Azad, Shahnaz; Birklein, Frank; Gierthmühlen, Janne; Huge, Volker; Lauchart, Meike; Nitzsche, Dorothee; Stengel, Maike; Valet, Michael; Baron, Ralf; Maier, Christoph; Tölle, Thomas; Treede, Rolf-Detlef

2011-03-01

Quantitative sensory testing (QST) is an instrument to assess positive and negative sensory signs, helping to identify mechanisms underlying pathologic pain conditions. In this study, we evaluated the test-retest reliability (TR-R) and the interobserver reliability (IO-R) of QST in patients with sensory disturbances of different etiologies. In 4 centres, 60 patients (37 male and 23 female, 56.4 ± 1.9 years) with lesions or diseases of the somatosensory system were included. QST comprised 13 parameters including detection and pain thresholds for thermal and mechanical stimuli. QST was performed in the clinically most affected test area and a less or unaffected control area in a morning and an afternoon session on 2 consecutive days by examiner pairs (4 QSTs/patient). For both TR-R and IO-R, there were high correlations (r=0.80-0.93) at the affected test area, except for wind-up ratio (TR-R: r=0.67; IO-R: r=0.56) and paradoxical heat sensations (TR-R: r=0.35; IO-R: r=0.44). Mean IO-R (r=0.83, 31% unexplained variance) was slightly lower than TR-R (r=0.86, 26% unexplained variance, P<.05); the difference in variance amounted to 5%. There were no differences between study centres. In a subgroup with an unaffected control area (n=43), reliabilities were significantly better in the test area (TR-R: r=0.86; IO-R: r=0.83) than in the control area (TR-R: r=0.79; IO-R: r=0.71, each P<.01), suggesting that disease-related systematic variance enhances reliability of QST. We conclude that standardized QST performed by trained examiners is a valuable diagnostic instrument with good test-retest and interobserver reliability within 2 days. With standardized training, observer bias is much lower than random variance.
Quantitative sensory testing performed by trained examiners is a valuable diagnostic instrument with good interobserver and test-retest reliability for use in patients with sensory disturbances of different etiologies to help identify mechanisms of neuropathic and non-neuropathic pain. Copyright © 2010 International Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
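The "unexplained variance" figures quoted in the abstract above are consistent with the usual relation between a correlation coefficient and the share of variance it leaves unexplained, 1 − r². The short sketch below checks the two reported pairs; the function name `unexplained_variance` is an illustrative assumption, and the abstract does not state that this exact formula was used.

```python
def unexplained_variance(r):
    """Share of variance not explained by a correlation of size r (1 - r^2)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return 1.0 - r ** 2


interobserver = unexplained_variance(0.83)  # mean IO-R reported in the abstract
test_retest = unexplained_variance(0.86)    # mean TR-R reported in the abstract
difference = interobserver - test_retest

print(f"IO-R unexplained variance: {interobserver:.0%}")
print(f"TR-R unexplained variance: {test_retest:.0%}")
print(f"Difference: {difference:.0%}")
```

The results round to 31%, 26%, and 5%, matching the values given in the abstract.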
Routine Use of Three-Dimensional Contrast-Enhanced Moving-Table MR Angiography in Patients with Peripheral Arterial Occlusive Disease: Comparison with Selective Digital Subtraction Angiography

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deutschmann, Hannes A.; Schoellnast, Helmut; Portugaller, Horst R.

2006-10-15

Purpose. To compare the diagnostic accuracy of contrast-enhanced (CE) three-dimensional (3D) moving-table magnetic resonance (MR) angiography with that of selective digital subtraction angiography (DSA) for routine clinical investigation in patients with peripheral arterial occlusive disease. Methods. Thirty-eight patients underwent CE 3D moving-table MR angiography of the pelvic and peripheral arteries. A commercially available large-field-of-view adapter and a dedicated peripheral vascular phased-array coil were used. MR angiograms were evaluated for grade of arterial stenosis, diagnostic quality, and presence of artifacts. MR imaging results for each patient were compared with those of selective DSA. Results. Two hundred and twenty-six arterial segments in 38 patients were evaluated by both selective DSA and MR angiography. No complications related to MR angiography were observed. There was agreement in stenosis classification in 204 (90.3%) segments; MR angiography overgraded 16 (7%) segments and undergraded 6 (2.7%) segments. Compared with selective DSA, MR angiography provided high sensitivity and specificity and excellent interobserver agreement for detection of severe stenosis (97% and 95%, κ = 0.9 ± 0.03) and moderate stenosis (96.5% and 94.3%, κ = 0.9 ± 0.03). Conclusion. Compared with selective DSA, moving-table MR angiography proved to be an accurate, noninvasive method for evaluation of peripheral arterial occlusive disease and may thus serve as an alternative to DSA in clinical routine.
"Eyeball test" of thermographic patterns for predicting a successful lateral infraclavicular block.

PubMed

Andreasen, Asger M; Linnet, Karen E; Asghar, Semera; Rothe, Christian; Rosenstock, Charlotte V; Lange, Kai H W; Lundstrøm, Lars H

2017-11-01

Increased distal skin temperature can be used to predict the success of lateral infraclavicular (LIC) block. We hypothesized that an "eyeball test" of specific infrared thermographic patterns after LIC block could be used to determine block success. In this observational study, five observers trained in four distinct thermographic patterns independently evaluated thermographic images of the hands of 40 patients at baseline and at one-minute intervals for 30 min after a LIC block. Sensitivity, specificity, and predictive values of a positive and a negative test were estimated to evaluate the validity of specific thermographic patterns for predicting a successful block. Sensory and motor block of the musculocutaneous, radial, ulnar, and median nerves defined block success. Fleiss' kappa statistics of multiple interobserver agreements were used to evaluate reliability. As a diagnostic test, the defined specific thermographic patterns of the hand predicted a successful block with increasing accuracy over the 30-min observation period. Block success was predicted with a sensitivity of 92.4% (95% confidence interval [CI], 86.8 to 96.2) and with a specificity of 84.0% (95% CI, 70.3 to 92.4) at min 30. The Fleiss' kappa for the five observers was 0.87 (95% CI, 0.77 to 0.96). We conclude that visual evaluation by an eyeball test of specific thermographic patterns of the blocked hands may be useful as a valid and reliable diagnostic test for predicting a successful LIC block.
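The abstract above uses Fleiss' kappa because agreement is assessed among five observers rather than two. A minimal sketch of the calculation follows, assuming every image is rated by the same number of observers; the function name `fleiss_kappa` and the six rows of counts are illustrative assumptions and are not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters who put subject i in category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of raters")
    # Agreement within each subject: proportion of agreeing rater pairs.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_subject) / n_subjects
    # Chance agreement from the overall share of ratings in each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(s * s for s in shares)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical counts for 6 images rated by 5 observers:
# column 0 = "pattern predicts success", column 1 = "pattern predicts failure".
ratings = [[5, 0], [4, 1], [0, 5], [5, 0], [1, 4], [0, 5]]
kappa = fleiss_kappa(ratings)
print(f"Fleiss' kappa = {kappa:.2f}")
```

Working from per-category counts rather than named observers reflects the fact that Fleiss' kappa does not require the same individuals to rate every subject, only the same number of raters.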
Feasibility of MDCT angiography for determination of tumor-feeding vessels in chemoembolization of hepatocellular carcinoma.

PubMed

Kim, Inwha; Kim, Dae Jung; Kim, Kyoung Ah; Yoon, Sang Wook; Lee, Jong Tae

2014-01-01

To investigate the feasibility and accuracy of multidetector computed tomography (MDCT) angiography for assessment of subsegmental tumor-feeding vessels in transarterial chemoembolization (TACE) of hepatocellular carcinoma (HCC). A total of 23 patients with 36 HCCs who underwent TACE during a 14-month period were enrolled. All patients underwent 3-phase dynamic MDCT within a month before TACE. Arterial phase MDCT images were retrospectively reformatted and analyzed for determination of single subsegmental tumor-feeding vessel using maximum intensity projection (MIP) and volume-rendering technique (VRT). Two radiologists independently assessed and scored the MIP and VRT images using 4-grade visual scores (grade 1, no depiction of tumor-feeding vessel; grade 2, indeterminate tumor-feeding vessel; grade 3, probable tumor-feeding vessel; and grade 4, good depiction of tumor-feeding vessel). The weighted kappa test was used to determine interobserver variability, and Wilcoxon signed rank test was used to differentiate visual scores of each technique. Results of digital subtraction angiography were defined as the criterion standard; therefore, assessment of subsegmental tumor-feeding vessel using MIP or VRT was compared with digital subtraction angiography, and the accuracy of each technique was calculated. Interobserver agreement (weighted kappa, 0.746 on VRT and 0.806 on MIP) was substantial to almost perfect. The visual scores for MIP (mean, 3.64 for reviewer 1 and 3.5 for reviewer 2) were higher than those for VRT (mean, 2.11 for reviewer 1 and 2.22 for reviewer 2; P < 0.001). The accuracy for assessing subsegmental tumor-feeding vessel was 22.2% for VRT and 77.8% for MIP. Multidetector CT angiography using MIP showed good imaging quality and high accuracy for determination of subsegmental tumor-feeding vessels.
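For ordinal scores such as the 4-grade visual scale described above, a weighted kappa penalizes a disagreement of one grade less than a disagreement of three grades. The sketch below shows one common formulation with linear or quadratic weights; the abstract does not say which weighting was used, and the function name `weighted_kappa` and the eight pairs of grades are illustrative assumptions rather than study data.

```python
def weighted_kappa(rater_a, rater_b, n_categories, weighting="quadratic"):
    """Weighted kappa for two raters using ordinal grades 1..n_categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")
    n, k = len(rater_a), n_categories
    # Cross-tabulate the two raters' grades.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a - 1][b - 1] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 2 if weighting == "quadratic" else 1
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the distance between grades.
            weight = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += weight * observed[i][j] / n
            expected_disagreement += weight * row_totals[i] * col_totals[j] / (n * n)
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical grades (1-4) from two reviewers for eight lesions.
reviewer_1 = [4, 4, 3, 4, 2, 3, 4, 1]
reviewer_2 = [4, 3, 3, 4, 2, 4, 4, 2]
kappa_quadratic = weighted_kappa(reviewer_1, reviewer_2, n_categories=4)
kappa_linear = weighted_kappa(reviewer_1, reviewer_2, 4, weighting="linear")
print(f"quadratic: {kappa_quadratic:.2f}, linear: {kappa_linear:.2f}")
```

In this toy example every disagreement is by a single grade, so the quadratic weighting yields a higher kappa than the linear weighting; reports of weighted kappa are easier to compare when the weighting scheme is stated.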
Rectosigmoid endometriosis: comparison between CT water enema and video laparoscopy.

PubMed

Stabile Ianora, A A; Moschetta, M; Lorusso, F; Lattarulo, S; Telegrafo, M; Rella, L; Scardapane, A

2013-09-01

To evaluate the accuracy of water enema computed tomography (CT) for predicting the location of endometriosis in patients with contraindications to magnetic resonance imaging (MRI), focusing on rectosigmoid lesions and having laparoscopic and histological data as the reference standard. Thirty-three women (mean age 33.4 ± 3.1 years) suspected of having deep pelvic endometriosis underwent 64-row CT and video laparoscopy within 4 weeks. Two radiologists blinded to the clinical data evaluated the CT images obtained after colonic retrograde distension using water as the contrast medium, and a comparison with laparoscopic and histological findings was performed. CT sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy were calculated. The radiation dose to patients was estimated. Cohen's weighted kappa (κ) test was used to evaluate the interobserver agreement. In 23 out of 33 patients (69%) intestinal implants were found at surgery and pathological examinations. CT confirmed the diagnosis of rectosigmoid endometriosis in 20 out of 23 implants. Three nodules located on the proximal sigmoid colon (two serosal lesions and one infiltrating the muscularis layer) with a diameter of less than 1 cm were not diagnosed. CT sensitivity, specificity, PPV, NPV, and accuracy values were 87, 100, 100, 77, and 91%, respectively. The mean effective dose estimate was 6.30 ± 1.7 mSv. Almost perfect agreement between the two readers was found (k = 0.84). Water enema CT can play a role in the diagnosis of bowel endometriosis and represents another accurate potential tool for video laparoscopic approaches, especially in patients for whom MRI is contraindicated. Copyright © 2013 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
Diagnostic accuracy of hepatorenal index in the detection and grading of hepatic steatosis.

PubMed

Chauhan, Anil; Sultan, Laith R; Furth, Emma E; Jones, Lisa P; Khungar, Vandana; Sehgal, Chandra M

2016-11-12

The objectives of our study were to assess the accuracy of hepatorenal index (HRI) in detection and grading of hepatic steatosis and to evaluate various factors that can affect the HRI measurement. Forty-five patients, who had undergone an abdominal sonographic examination within 30 days of liver biopsy, were enrolled. The HRI was calculated as the ratio of the mean brightness levels of the liver and renal parenchymas. The effect of the measurement technique on the HRI was evaluated by using various sizes, depths, and locations of the regions of interest (ROIs) in the liver. The measurements were obtained by two observers. The HRI was compared with the subjective grading of steatosis. The optimal HRI cutoff to detect steatosis was 2.01, yielding a sensitivity of 62.5% and specificity of 95.2%. Subjective grading had a sensitivity of 87.5% and specificity of 62.5%. HRIs of the hepatic steatosis group were statistically different from the no-steatosis group (p < 0.05). However, there was no statistically significant difference between mild steatosis and no-steatosis groups (p value = 0.72). There was a strong correlation between different HRIs based on variable placements of ROIs, except when the ROIs were positioned randomly. Intraclass correlation coefficient for measurements performed by two observers was 0.74 (confidence interval: 0.58-0.86). The HRI is an effective tool for detecting hepatic steatosis. It provides similar accuracy for different methods of ROI placement (except for random placement) and has good interobserver agreement. It, however, is unable to effectively differentiate between absent and mild steatosis. © 2016 Wiley Periodicals, Inc. J Clin Ultrasound 44:580-586, 2016.
Magnetic resonance angiography in infrapopliteal arterial disease: prospective comparison of 1.5 and 3 Tesla magnetic resonance imaging.

PubMed

Diehm, Nicolas; Kickuth, Ralph; Baumgartner, Iris; Srivastav, Sudesh K; Gretener, Silvia; Husmann, Marc J; Jaccard, Yves; Do, Do Dai; Triller, Juergen; Bonel, Harald M

2007-06-01

To prospectively determine the accuracy of 1.5 Tesla (T) and 3 T magnetic resonance angiography (MRA) versus digital subtraction angiography (DSA) in the depiction of infrageniculate arteries in patients with symptomatic peripheral arterial disease. A prospective 1.5 T, 3 T MRA, and DSA comparison was used to evaluate 360 vessel segments in 10 patients (15 limbs) with chronic symptomatic peripheral arterial disease. Selective DSA was performed within 30 days before both MRAs. The accuracy of 1.5 T and 3 T MRA was compared with DSA as the standard of reference by consensus agreement of 2 experienced readers. Signal-to-noise ratios (SNR) and signal-difference-to-noise ratios (SDNRs) were quantified. No significant difference in overall image quality, sufficiency for diagnosis, depiction of arterial anatomy, motion artifacts, and venous overlap was found comparing 1.5 T with 3 T MRA (P > 0.05 by Wilcoxon signed rank test and by Cohen κ test). Overall sensitivity of 1.5 and 3 T MRA for detection of significant arterial stenosis was 79% and 82%, and specificity was 87% and 87% for both modalities, respectively. Interobserver agreement was excellent (κ > 0.8, P < 0.05) for 1.5 T as well as for 3 T MRA. SNR and SDNR were significantly increased using the 3 T system (average increase: 36.5%, P < 0.032 by t test, and 38.5%, P < 0.037 respectively). Despite marked improvement of SDNR, 3 T MRA does not yet provide a significantly higher accuracy in diagnostic imaging of atherosclerotic lesions below the knee joint as compared with 1.5 T MRA.
SU-E-P-33: Critical Role of T2-Weighted Imaging Combined with Diffusion-Weighted Imaging of MRI in Diagnosis of Loco-Regional Recurrent Esophageal Cancer After Radical Surgery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deng, G; Qiao, L; Liang, N

Purpose: We performed this study to investigate the diagnostic efficacy of T2-weighted MRI (T2WI) and diffusion-weighted MRI (DWI) in confirming local relapses of esophageal cancer in patients highly suspected of recurrence after eradicating surgery. Methods: Forty-two postoperative esophageal cancer patients with clinical suspicions of cancer recurrence underwent 3.0T MRI applying axial, coronal, sagittal T2WI and axial DWI sequences. Two experienced radiologists (R1 and R2) both used two methods (T2WI, T2WI+DWI) to observe the images, and graded the patients from 1 to 5 to represent severity of the disease based on visual signal intensity (patients graded 3 or higher were considered to have recurrent disease). Results: 27/42 patients were verified as having recurrent disease by pathologic findings and/or imaging findings during follow-up. The sensitivity, specificity and accuracy of R1 applying T2WI+DWI were 96%, 87% and 93% versus 81%, 80% and 77% on T2WI; for R2, these figures were 96%, 93% and 95% versus 89%, 93% and 90%. The receiver operating curve (ROC) analyses suggest that both readers can obtain better accuracy when adding DWI to T2WI compared with T2WI alone. Kappa test between R1 and R2 indicates excellent inter-observer agreement on T2WI+DWI. Conclusion: Standard T2WI in combination with DWI can achieve better accuracy than T2WI alone in diagnosing local recurrence of esophageal cancer, and improve consistency between different readers.
First-year medical students use of ultrasound or physical examination to diagnose hepatomegaly and ascites: a randomized controlled trial.

PubMed

Arora, Samantha; Cheung, Angela C; Tarique, Usman; Agarwal, Arnav; Firdouse, Mohammed; Ailon, Jonathan

2017-09-01

To compare point-of-care ultrasound and physical examination (PEx), each performed by first-year medical students after brief teaching, for assessing ascites and hepatomegaly. Ultrasound and PEx were compared on: (1) reliability, validity and performance, (2) diagnostic confidence, ease of use, utility, and applicability. A single-center, randomized controlled trial was performed at a tertiary centre. First-year medical students were randomized to use ultrasound or PEx to assess for ascites and hepatomegaly. Cohen's kappa and intraclass correlation coefficient (ICC) were used to measure interrater reliability between trainee assessments and the reference standard (a same day ultrasound by a radiologist). Sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) were compared. A ten-point Likert scale was used to assess trainee diagnostic confidence and perceptions of utility. There were no significant differences in interobserver reliability, sensitivity, specificity, accuracy, PPV, or NPV between the ultrasound and PEx groups. However, students in the ultrasound group provided higher scores for perceived utility (ascites 8.38 ± 1.35 vs 7.08 ± 1.86, p = 0.008; hepatomegaly 7.68 ± 1.52 vs 5.36 ± 2.48, p < 0.001) and likelihood of adoption (ascites 8.67 ± 1.61 vs 7.46 ± 1.79, p = 0.02; hepatomegaly 8.12 ± 1.90 vs 5.92 ± 2.32, p = 0.001). When performed by first-year medical students, the validity and reliability of ultrasound are comparable to PEx, but with greater perceived utility and likelihood of adoption. With similarly brief instruction, point-of-care ultrasonography can be as effectively learned and performed as PEx, with a high degree of interest from trainees.
Non-invasive diagnosis of liver fibrosis in chronic hepatitis C

PubMed Central

Schiavon, Leonardo de Lucca; Narciso-Schiavon, Janaína Luz; de Carvalho-Filho, Roberto José

2014-01-01

Assessment of liver fibrosis in chronic hepatitis C virus (HCV) infection is considered a relevant part of patient care and key for decision making. Although liver biopsy has been considered the gold standard for staging liver fibrosis, it is an invasive technique and subject to sampling errors and significant intra- and inter-observer variability. Over the last decade, several noninvasive markers were proposed for liver fibrosis diagnosis in chronic HCV infection, with variable performance. Besides the clear advantage of being noninvasive, a more objective interpretation of test results may overcome the mentioned intra- and inter-observer variability of liver biopsy. In addition, these tests can theoretically offer a more accurate view of fibrogenic events occurring in the entire liver with the advantage of providing frequent fibrosis evaluation without additional risk. However, in general, these tests show low accuracy in discriminating between intermediate stages of fibrosis and may be influenced by several hepatic and extra-hepatic conditions. These methods are either serum markers (usually combined in a mathematical model) or imaging modalities that can be used separately or combined in algorithms to improve accuracy. In this review we will discuss the different noninvasive methods that are currently available for the evaluation of liver fibrosis in chronic hepatitis C, their advantages, limitations and application in clinical practice. PMID:24659877
Inter-observer variability between general pathologists and a specialist in breast pathology in the diagnosis of lobular neoplasia, columnar cell lesions, atypical ductal hyperplasia and ductal carcinoma in situ of the breast

PubMed Central

2014-01-01

Background This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. Methods A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Results Weak correlations were observed for the diagnoses of columnar cell change (CCC; Kappa = 0.38), columnar cell hyperplasia (CCH; Kappa = 0.32), while a moderate agreement (Kappa = 0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa = 0.62) and lobular carcinoma in situ (LCIS; Kappa = 0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa = 0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa = 0.44), low-grade DCIS (Kappa = 0.47), intermediate-grade DCIS (Kappa = 0.45), and DCIS with microinvasion (Kappa = 0.56). Good agreement was observed between the diagnoses of high-grade DCIS (Kappa = 0.68). Conclusions According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. Virtual Slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725. PMID:24948027

Inter-observer variability between general pathologists and a specialist in breast pathology in the diagnosis of lobular neoplasia, columnar cell lesions, atypical ductal hyperplasia and ductal carcinoma in situ of the breast.

PubMed

Gomes, Douglas S; Porto, Simone S; Balabram, Débora; Gobbi, Helenice

2014-06-19

This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Weak correlations were observed for the diagnoses of columnar cell change (CCC; Kappa=0.38), columnar cell hyperplasia (CCH; Kappa=0.32), while a moderate agreement (Kappa=0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa=0.62) and lobular carcinoma in situ (LCIS; Kappa=0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa=0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa=0.44), low-grade DCIS (Kappa=0.47), intermediate-grade DCIS (Kappa=0.45), and DCIS with microinvasion (Kappa=0.56). Good agreement was observed between the diagnoses of high-grade DCIS (Kappa=0.68). According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725.
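The Kappa index used in the study above corrects raw agreement for the agreement expected by chance. A minimal Python sketch of Cohen's kappa for two raters follows; the diagnosis labels are hypothetical illustration data, not taken from the study, and the function name is an assumption introduced here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical labels: original report vs. specialist review
original = ["ADH", "ADH", "DCIS", "DCIS", "ALH", "ADH", "DCIS", "ALH"]
review = ["ADH", "DCIS", "DCIS", "DCIS", "ALH", "ADH", "ADH", "ALH"]
kappa = cohens_kappa(original, review)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 6 of 8 cases (0.75), chance agreement is 22/64, and kappa works out to 13/21, about 0.62.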
Digital microscopy as valid alternative to conventional microscopy for histological evaluation of Barrett's esophagus biopsies.

PubMed

van der Wel, M J; Duits, L C; Seldenrijk, C A; Offerhaus, G J; Visser, M; Ten Kate, F J; de Boer, O J; Tijssen, J G; Bergman, J J; Meijer, S L

2017-11-01

Management of Barrett's esophagus (BE) relies heavily on histopathological assessment of biopsies, associated with significant intra- and interobserver variability. Guidelines recommend biopsy review by an expert in case of dysplasia. Conventional review of biopsies, however, is impractical and does not allow for teleconferencing or annotations. An expert digital review platform might overcome these limitations. We compared diagnostic agreement of digital and conventional microscopy for diagnosing BE ± dysplasia. Sixty BE biopsy glass slides (non-dysplastic BE (NDBE); n = 25, low-grade dysplasia (LGD); n = 20; high-grade dysplasia (HGD); n = 15) were scanned at ×20 magnification. The slides were assessed four times by five expert BE pathologists, all practicing histopathologists (range: 5-30 years), in 2 alternating rounds of digital and conventional microscopy, each in randomized order and sequence of slides. Intraobserver and pairwise interobserver agreement were calculated, using custom weighted Cohen's kappa, adjusted for the maximum possible kappa scores. Split into three categories (NDBE, IND, LGD+HGD), the mean intraobserver agreement was 0.75 and 0.84 for digital and conventional assessment, respectively (p = 0.35). Mean pairwise interobserver agreement was 0.80 for digital and 0.85 for conventional microscopy (p = 0.17). In 47/60 (78%) of digital microscopy reviews a majority vote of ≥3 pathologists was reached before consensus meeting. After group discussion, a majority vote was achieved in all cases (60/60). Diagnostic agreement of digital microscopy is comparable to that of conventional microscopy. These outcomes justify the use of digital slides in a nationwide, web-based BE revision platform in the Netherlands. This will overcome the practical issues associated with conventional histologic review by multiple pathologists. © The Authors 2017. Published by Oxford University Press on behalf of International Society for Diseases of the Esophagus. 
Pilot Study of an Open-source Image Analysis Software for Automated Screening of Conventional Cervical Smears.

PubMed

Sanyal, Parikshit; Ganguli, Prosenjit; Barui, Sanghita; Deb, Prabal

2018-01-01

The Pap stained cervical smear is a screening tool for cervical cancer. Commercial systems are used for automated screening of liquid based cervical smears. However, there is no image analysis software used for conventional cervical smears. The aim of this study was to develop and test the diagnostic accuracy of a software for analysis of conventional smears. The software was developed using Python programming language and open source libraries. It was standardized with images from Bethesda Interobserver Reproducibility Project. One hundred and thirty images from smears which were reported Negative for Intraepithelial Lesion or Malignancy (NILM), and 45 images where some abnormality has been reported, were collected from the archives of the hospital. The software was then tested on the images. The software was able to segregate images based on overall nuclear: cytoplasmic ratio, coefficient of variation (CV) in nuclear size, nuclear membrane irregularity, and clustering. 68.88% of abnormal images were flagged by the software, as well as 19.23% of NILM images. The major difficulties faced were segmentation of overlapping cell clusters and separation of neutrophils. The software shows potential as a screening tool for conventional cervical smears; however, further refinement in technique is required.
Comparison of MPEG-1 digital videotape with digitized sVHS videotape for quantitative echocardiographic measurements

NASA Technical Reports Server (NTRS)

Garcia, M. J.; Thomas, J. D.; Greenberg, N.; Sandelski, J.; Herrera, C.; Mudd, C.; Wicks, J.; Spencer, K.; Neumann, A.; Sankpal, B.

2001-01-01

Digital format is rapidly emerging as a preferred method for displaying and retrieving echocardiographic studies. The qualitative diagnostic accuracy of Moving Pictures Experts Group (MPEG-1) compressed digital echocardiographic studies has been previously reported. The goals of the present study were to compare quantitative measurements derived from MPEG-1 recordings with the super-VHS (sVHS) videotape clinical standard. Six reviewers performed blinded measurements from still-frame images selected from 20 echocardiographic studies that were simultaneously acquired in sVHS and MPEG-1 formats. Measurements were obtainable in 1401 (95%) of 1486 MPEG-1 variables compared with 1356 (91%) of 1486 sVHS variables (P <.001). Excellent agreement existed between MPEG-1 and sVHS 2-dimensional linear measurements (r = 0.97; MPEG-1 = 0.95[sVHS] + 1.1 mm; P <.001; Delta = 9% +/- 10%), 2-dimensional area measurements (r = 0.89), color jet areas (r = 0.87, p <.001), and Doppler velocities (r = 0.92, p <.001). Interobserver variability was similar for both sVHS and MPEG-1 readings. Our results indicate that quantitative off-line measurements from MPEG-1 digitized echocardiographic studies are feasible and comparable to those obtained from sVHS.

Diagnostic accuracy of sub-mSv prospective ECG-triggering cardiac CT in young infant with complex congenital heart disease.

PubMed

Gao, Wei; Zhong, Yu Min; Sun, Ai Min; Wang, Qian; Ouyang, Rong Zhen; Hu, Li Wei; Qiu, Han Sheng; Wang, Shi Yu; Li, Jian Ying

2016-06-01

To explore the clinical value and evaluate the diagnostic accuracy of sub-mSv low-dose prospective ECG-triggering cardiac CT (CCT) in young infants with complex congenital heart disease (CHD). A total of 102 consecutive infant patients (53 boys and 49 girls with mean age of 2.9 ± 2.4 m and weight less than 5 kg) with complex CHD were prospectively enrolled. Scans were performed on a 64-slice high definition CT scanner with low dose prospective ECG-triggering mode and reconstructed with 80 % adaptive statistical iterative reconstruction algorithm. All studies were performed during free breathing with sedation. The subjective image quality was evaluated by 5-point grading scale and interobserver variability was calculated. The objective image noise (standard deviation, SD) and contrast to noise ratio (CNR) was calculated. The effective radiation dose from the prospective ECG-triggering mode was recorded and compared with the virtual conventional retrospective ECG-gating mode. The detection rate for the origin of coronary artery was calculated. All patients also underwent echocardiography before CCT examination. 81 patients had surgery and their preoperative CCT and echocardiography findings were compared with the surgical results and sensitivity, specificity, positive and negative predictive values and accuracy were calculated for separate cardiovascular anomalies. Heart rates were 70-161 beats per minute (bpm) with mean value of 129.19 ± 14.52 bpm. The effective dose of 0.53 ± 0.15 mSv in the prospective ECG-triggering cardiac CT was lower than the calculated value in a conventional retrospective ECG-gating mode (2.00 ± 0.35 mSv) (p < 0.001). The mean CNR and SD were 28.19 ± 13.00 and 15.75 ± 3.61HU, respectively. The image quality scores were 4.31 ± 0.36 and 4.29 ± 0.41 from reviewer 1 and 2 respectively with an excellent agreement between them (Kappa = 0.85). The detection rate for the origins of the left and right coronary arteries was 96 and 90 %, respectively. 
The detection rates of the origins of left coronary artery and right coronary artery in all cases were 96 % (78/81) and 90 % (73/81), respectively. Twenty cases of conotruncal anomalies and ALCAPA were validated surgically and the accuracy of cardiac CT diagnosis was 95 % (19/20). The overall deformity based sensitivity, specificity, positive predictive value and negative predictive value were 94.0.1, 99.9, 98.6, 99.5 % respectively, by CCT, and 88.2, 99.9, 97.8, 99.0 %, respectively, by echocardiography. Prospective ECG-triggering CCT with sub-mSv effective dose provides excellent imaging quality and high diagnostic accuracy for young infants with complex CHD.
An inter-observer agreement study of autofluorescence endoscopy in Barrett's esophagus among expert and non-expert endoscopists.

PubMed

Mannath, J; Subramanian, V; Telakis, E; Lau, K; Ramappa, V; Wireko, M; Kaye, P V; Ragunath, K

2013-02-01

Autofluorescence imaging (AFI), which is a "red flag" technique during Barrett's surveillance, is associated with significant false positive results. The aim of this study was to assess the inter-observer agreement (IOA) in identifying AFI-positive lesions and to assess the overall accuracy of AFI. Anonymized AFI and high resolution white light (HRE) images were prospectively collected. The AFI images were presented in random order, followed by corresponding AFI + HRE images. Three AFI experts and 3 AFI non-experts scored images after a training presentation. The IOA was calculated using kappa and accuracy was calculated with histology as gold standard. Seventy-four sets of images were prospectively collected from 63 patients (48 males, mean age 69 years). The IOA for number of AF positive lesions was fair when AFI images were presented. This improved to moderate with corresponding AFI and HRE images [experts 0.57 (0.44-0.70), non-experts 0.47 (0.35-0.62)]. The IOA for the site of AF lesion was moderate for experts and fair for non-experts using AF images, which improved to substantial for experts [κ = 0.62 (0.50-0.72)] but remained at fair for non-experts [κ = 0.28 (0.18-0.37)] with AFI + HRE. Among experts, the accuracy of identifying dysplasia was 0.76 (0.7-0.81) using AFI images and 0.85 (0.79-0.89) using AFI + HRE images. The accuracy was 0.69 (0.62-0.74) with AFI images alone and 0.75 (0.70-0.80) using AFI + HRE among non-experts. The IOA for AF positive lesions is fair to moderate using AFI images which improved with addition of HRE. The overall accuracy of identifying dysplasia was modest, and was better when AFI and HRE images were combined.
Diagnostic accuracy of routine blood examinations and CSF lactate level for post-neurosurgical bacterial meningitis.

PubMed

Zhang, Yang; Xiao, Xiong; Zhang, Junting; Gao, Zhixian; Ji, Nan; Zhang, Liwei

2017-06-01

To evaluate the diagnostic accuracy of routine blood examinations and Cerebrospinal Fluid (CSF) lactate level for Post-neurosurgical Bacterial Meningitis (PBM) in a large sample of post-neurosurgical patients. The diagnostic accuracies of routine blood examinations and CSF lactate level to distinguish between PAM and PBM were evaluated with the values of the Area Under the Curve of the Receiver Operating Characteristic (AUC-ROC) by retrospectively analyzing the datasets of post-neurosurgical patients in the clinical information databases. The diagnostic accuracy of routine blood examinations was relatively low (AUC-ROC < 0.7). The CSF lactate level achieved rather high diagnostic accuracy (AUC-ROC = 0.891; 95% CI, 0.852-0.922). The variables of patient age, operation duration, surgical diagnosis and postoperative days (the interval in days between the neurosurgery and the examinations) were shown to affect the diagnostic accuracy of these examinations. The variables were integrated with routine blood examinations and CSF lactate level by Fisher discriminant analysis to improve their diagnostic accuracy. As a result, the diagnostic accuracy of blood examinations and CSF lactate level was significantly improved, with AUC-ROC values of 0.760 (95% CI, 0.737-0.782) and 0.921 (95% CI, 0.887-0.948), respectively. The PBM diagnostic accuracy of routine blood examinations was relatively low, whereas the accuracy of CSF lactate level was high. Some variables that are involved in the incidence of PBM can also affect the diagnostic accuracy for PBM. Taking into account the effects of these variables significantly improves the diagnostic accuracies of routine blood examinations and CSF lactate level. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Is it better to include necrosis in apparent diffusion coefficient (ADC) measurements? The necrosis/wall ADC ratio to differentiate malignant and benign necrotic lung lesions: Preliminary results.

PubMed

Karaman, Adem; Durur-Subasi, Irmak; Alper, Fatih; Durur-Karakaya, Afak; Subasi, Mahmut; Akgun, Metin

2017-10-01

To determine whether the use of necrosis/wall apparent diffusion coefficient (ADC) ratios in the differentiation of necrotic lung lesions is more reliable than measuring the wall alone. In this retrospective study, a total of 76 patients (54 males and 22 females, 71% vs. 29%, with a mean age of 53 ± 18 years, range, 18-84) were enrolled, 33 of whom had lung carcinoma and 43 of whom had a benign necrotic lung lesion. A 3T scanner was used. The calculation of the necrosis/wall ADC ratio was based on ADC values measured from the necrosis and the wall of the lesions by diffusion-weighted imaging (DWI). Statistical analyses were performed with the independent samples t-test and receiver operating characteristic analysis. Intraobserver and interobserver reliability were calculated for ADC values of wall and necrosis. The mean necrosis/wall ADC ratio was 1.67 ± 0.23 for malignant lesions and 0.75 ± 0.19 for benign lung lesions (P < 0.001). To estimate malignancy, the area under the curve (AUC) values for necrosis ADC, wall ADC, and the necrosis/wall ADC ratio were 0.720, 0.073, and 0.997, respectively. A wall/necrosis ADC ratio cutoff value of 1.12 demonstrated 100% sensitivity and 98% specificity in the estimation of malignancy. Positive predictive value was 100%, negative predictive value 98%, and diagnostic accuracy 99%. There was good intraobserver and interobserver reliability for wall and necrosis. The necrosis/wall ADC ratio appears to be a more reliable and promising tool for discriminating lung carcinoma from benign necrotic lung lesions than measuring the wall alone. Level of Evidence: 4. Technical Efficacy: Stage 2. J. Magn. Reson. Imaging 2017;46:1001-1006. © 2017 International Society for Magnetic Resonance in Medicine.
Small animal magnetic resonance imaging: an efficient tool to assess liver volume and intrahepatic vascular anatomy.

PubMed

Melloul, Emmanuel; Raptis, Dimitri A; Boss, Andreas; Pfammater, Thomas; Tschuor, Christoph; Tian, Yinghua; Graf, Rolf; Clavien, Pierre-Alain; Lesurtel, Mickael

2014-04-01

To develop a noninvasive technique to assess liver volumetry and intrahepatic portal vein anatomy in a mouse model of liver regeneration. Fifty-two C57BL/6 male mice underwent magnetic resonance imaging (MRI) of the liver using a 4.7 T small animal MRI system after no treatment, 70% partial hepatectomy (PH), or selective portal vein embolization. The protocol consisted of the following sequences: three-dimensional-encoded spoiled gradient-echo sequence (repetition time per echo time 15 per 2.7 ms, flip angle 20°) for volumetry, and two-dimensional-encoded time-of-flight angiography sequence (repetition time per echo time 18 per 6.4 ms, flip angle 80°) for vessel visualization. Liver volume and portal vein segmentation was performed using a dedicated postprocessing software. In animals with portal vein embolization, portography served as reference standard. True liver volume was measured after sacrificing the animals. Measurements were carried out by two independent observers with subsequent analysis by the Cohen κ-test for interobserver agreement. MRI liver volumetry highly correlated with the true liver volume measurement using a conventional method in both the untreated liver and the liver remnant after 70% PH with a high interobserver correlation coefficient of 0.94 (95% confidence interval, 0.80-0.98 for untreated liver [P < 0.001] and 0.90-0.97 after 70% PH [P < 0.001]). The diagnostic accuracy of magnetic resonance angiography for the occlusion of one branch of the portal vein was 0.95 (95% confidence interval, 0.84-1). The level of agreement between the two observers for the description of intrahepatic vascular anatomy was excellent (Cohen κ value = 0.925). This protocol may be used for noninvasive liver volumetry and visualization of portal vein anatomy in mice. It will serve the dynamic study of new strategies to enhance liver regeneration in vivo. Copyright © 2014 Elsevier Inc. All rights reserved.
Aortic valve type and calcification as assessed by transthoracic and transoesophageal echocardiography.

PubMed

Yousry, Mohamed; Rickenlund, Anette; Petrini, Johan; Jenner, Jonas; Liska, Jan; Eriksson, Per; Franco-Cereceda, Anders; Eriksson, Maria J; Caidahl, Kenneth

2015-07-01

Aortic valve calcification (AVC) may predict poor outcome. Bicuspid aortic valve (BAV) leads to several haemodynamic changes accelerating the progress of aortic valve (AV) disease. To compare the diagnostic accuracy of transoesophageal echocardiography (TEE) and transthoracic echocardiography (TTE) in the assessment of aortic valve phenotype and degree of AVC, with intra-operative evaluation as a reference. We examined 169 patients (median age 65 years, 51 women) without significant coronary artery disease undergoing AV and/or aortic root surgery. TTE was performed within a week prior to surgery and TEE at the time of surgery. Compared with surgical AVC assessment, visual evaluation using a 5-grade scoring system and real-time images showed a higher correlation (TTE r = 0·83 and TEE r = 0·82) than visual (TTE r = 0·64 and TEE 0·63) or grey scale mean (GSMn) (TTE r = 0·63 and TEE r = 0·52) assessment of end-diastolic still frames. AVC assessment using real-time images showed high intraclass correlation coefficients (TTE 0·94 and TEE 0·93). With regard to BAV, TEE was superior to TTE with a higher interobserver agreement, sensitivity and specificity (0·86, 92% and 94% versus 0·57, 77% and 82%, respectively). Semi-quantitative AVC assessment of real-time cine loops from both TEE and TTE correlated well with intra-operative evaluation of AVC. Applying a predefined scoring system for AVC evaluation assures a high interobserver correlation. TEE was superior to TTE for evaluation of valve phenotype and should be considered when a diagnosis of BAV is clinically important. © 2014 The Authors. Clinical Physiology and Functional Imaging published by John Wiley & Sons Ltd on behalf of Scandinavian Society of Clinical Physiology and Nuclear Medicine.
The feasibility of point-of-care ankle ultrasound examination in patients with recurrent ankle sprain and chronic ankle instability: Comparison with magnetic resonance imaging.

PubMed

Lee, Sun Hwa; Yun, Seong Jong

2017-10-01

To evaluate the feasibility of point-of-care ankle ultrasound compared with magnetic resonance imaging (MRI) for diagnosing major ligaments and Achilles tendon injuries in patients with recurrent ankle sprain and chronic instability, and to evaluate inter-observer reliability between an emergency physician and a musculoskeletal radiology fellow. A prospective cross-sectional study was conducted in an emergency department. Patients with recurrent ankle sprain and chronic instability were recruited. An emergency physician and a musculoskeletal radiology fellow independently evaluated the anterior talofibular ligament (ATFL), calcaneofibular ligament (CFL), distal anterior tibiofibular ligament (ATiFL), deltoid ligament, and Achilles tendon using point-of-care ankle ultrasound. Findings were classified normal, partial tear, and complete tear. MRI was used as the reference standard. We calculated diagnostic values for point-of-care ankle ultrasound for both reviewers and compared them using DeLong's test. Intra-class correlation coefficients (ICCs) were calculated for agreement between each reviewer and the reference standard, and between the two reviewers. Eighty-five patients were enrolled. Point-of-care ankle ultrasound showed acceptable sensitivity (96.4-100%), specificity (95.0-100%), and accuracy (96.5-100%); these performance markers did not differ significantly between reviewers. Agreement between each reviewer and the reference standard was excellent (emergency physician, ICC=0.846-1.000; musculoskeletal radiology fellow, ICC=0.930-1.000), as was inter-observer agreement (ICC=0.873-1.000). Point-of-care ankle ultrasound is as precise as MRI for detecting major ankle ligament and Achilles tendon injuries; it could be used for immediate diagnosis and further pre-operative imaging. Moreover, it may reduce the interval from emergency department admission to admission for surgical intervention, and may save costs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Accuracy of clinical pallor in the diagnosis of anaemia in children: a meta-analysis

PubMed Central

Chalco, Juan P; Huicho, Luis; Alamo, Carlos; Carreazo, Nilton Y; Bada, Carlos A

2005-01-01

Background Anaemia is highly prevalent in children of developing countries. It is associated with impaired physical growth and mental development. Palmar pallor is recommended at primary level for diagnosing it, on the basis of few studies. The objective of the study was to systematically assess the accuracy of clinical signs in the diagnosis of anaemia in children. Methods A systematic review on the accuracy of clinical signs of anaemia in children. We performed an Internet search in various databases and an additional reference tracking. Studies had to be on performance of clinical signs in the diagnosis of anaemia, using haemoglobin as the gold standard. We calculated pooled diagnostic likelihood ratios (LR's) and odds ratios (DOR's) for each clinical sign at different haemoglobin thresholds. Results Eleven articles met the inclusion criteria. Most studies were performed in Africa, in children underfive. Chi-square test for proportions and Cochran Q for DOR's and for LR's showed heterogeneity. Type of observer and haemoglobin technique influenced the results. Pooling was done using the random effects model. Pooled DOR at haemoglobin <11 g/dL was 4.3 (95% CI 2.6–7.2) for palmar pallor, 3.7 (2.3–5.9) for conjunctival pallor, and 3.4 (1.8–6.3) for nailbed pallor. DOR's and LR's were slightly better for nailbed pallor at all other haemoglobin thresholds. The accuracy did not vary substantially after excluding outliers. Conclusion This meta-analysis did not document a highly accurate clinical sign of anaemia. In view of poor performance of clinical signs, universal iron supplementation may be an adequate control strategy in high prevalence areas. Further well-designed studies are needed in settings other than Africa. They should assess inter-observer variation, performance of combined clinical signs, phenotypic differences, and different degrees of anaemia. PMID:16336667
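The likelihood ratios (LR) and diagnostic odds ratio (DOR) reported in the meta-analysis above are derived from sensitivity and specificity. The Python sketch below shows the single-study calculation only; it does not reproduce the random effects pooling, and the counts are hypothetical illustration values rather than data from any included study.

```python
def likelihood_ratios_and_dor(tp, fn, fp, tn):
    """Return (LR+, LR-, DOR) for one 2x2 diagnostic table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)   # how much a positive sign raises the odds
    lr_neg = (1 - sensitivity) / specificity   # how much a negative sign lowers the odds
    dor = lr_pos / lr_neg                      # equivalent to (tp * tn) / (fp * fn)
    return lr_pos, lr_neg, dor


# Hypothetical counts: pallor present/absent vs. anaemia by haemoglobin
lr_pos, lr_neg, dor = likelihood_ratios_and_dor(tp=60, fn=40, fp=30, tn=70)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.2f}")
```

A DOR of 1 means the sign does not discriminate at all; with these assumed counts the DOR is 3.5, the same order of magnitude as the pooled values quoted in the abstract.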
The added value of 68Ga-DOTA-TATE-PET to contrast-enhanced CT for primary site detection in CUP of neuroendocrine origin.

PubMed

Kazmierczak, Philipp M; Rominger, Axel; Wenter, Vera; Spitzweg, Christine; Auernhammer, Christoph; Angele, Martin K; Rist, Carsten; Cyran, Clemens C

2017-04-01

To quantify the additional value of 68Ga-DOTA-TATE PET/CT in comparison with contrast-enhanced CT alone for primary tumour detection in neuroendocrine cancer of unknown primary (CUP-NET). In total, 38 consecutive patients (27 men, 11 women; mean age 62 years) with histologically proven CUP-NET who underwent a contrast-enhanced 68Ga-DOTA-TATE PET/CT scan for primary tumour detection and staging between 2010 and 2014 were included in this IRB-approved retrospective study. Two blinded readers independently analysed the contrast-enhanced CT and 68Ga-DOTA-TATE PET datasets separately and noted from which modality they suspected a primary tumour. Consensus was reached if the results were divergent. Postoperative histopathology (24 patients) and follow-up 68Ga-DOTA-TATE PET/CT imaging (14 patients) served as the reference standards and statistical measures of diagnostic accuracy were calculated accordingly. The majority of confirmed primary tumours were located in the abdomen (ileum in 19 patients, pancreas in 12, lung in 2, small pelvis in 1). High interobserver agreement was noted regarding the suspected primary tumour site (Cohen's κ 0.90, p < 0.001). 68Ga-DOTA-TATE PET demonstrated a significantly higher sensitivity (94 % vs. 63 %, p = 0.005) and a significantly higher accuracy (87 % vs. 68 %, p = 0.003) than contrast-enhanced CT. 68Ga-DOTA-TATE PET/CT compared with contrast-enhanced CT alone provides an improvement in sensitivity of 50 % and an improvement in accuracy of 30 % in primary tumour detection in CUP-NET. • 68Ga-DOTA-TATE PET augments the sensitivity of contrast-enhanced CT by 50 % • 68Ga-DOTA-TATE PET augments the accuracy of contrast-enhanced CT by 30 % • Somatostatin receptor-targeted hybrid imaging optimizes primary tumour detection in CUP-NET.
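The 50 % and 30 % improvements quoted above appear to be relative gains over contrast-enhanced CT, rounded, rather than absolute differences in percentage points. That reading is an assumption; the short Python sketch below computes both from the sensitivity and accuracy values given in the abstract.

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction of baseline."""
    return (new - baseline) / baseline


# Values taken from the abstract: PET vs. contrast-enhanced CT
sens_gain = relative_improvement(0.94, 0.63)
acc_gain = relative_improvement(0.87, 0.68)

print(f"sensitivity: +{(0.94 - 0.63) * 100:.0f} points absolute, {sens_gain:.1%} relative")
print(f"accuracy:    +{(0.87 - 0.68) * 100:.0f} points absolute, {acc_gain:.1%} relative")
```

The relative gains come to roughly 49 % and 28 %, consistent with the rounded figures in the abstract, while the absolute differences are 31 and 19 percentage points.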
Inter-Rater Reliability and Downstream Financial Implications of Electrocardiography Screening in Young Athletes.

PubMed

Dhutia, Harshil; Malhotra, Aneil; Yeo, Tee Joo; Ster, Irina Chis; Gabus, Vincent; Steriotis, Alexandros; Dores, Helder; Mellor, Greg; García-Corrales, Carmen; Ensam, Bode; Jayalapan, Viknesh; Ezzat, Vivienne Anne; Finocchiaro, Gherardo; Gati, Sabiha; Papadakis, Michael; Tome-Esteban, Maria; Sharma, Sanjay

2017-08-01

Preparticipation screening for cardiovascular disease in young athletes with electrocardiography is endorsed by the European Society of Cardiology and several major sporting organizations. One of the concerns of the ECG as a screening test in young athletes relates to the potential for variation in interpretation. We investigated the degree of variation in ECG interpretation in athletes and its financial impact among cardiologists of differing experience. Eight cardiologists (4 with experience in screening athletes) each reported 400 ECGs of consecutively screened young athletes according to the 2010 European Society of Cardiology recommendations, Seattle criteria, and refined criteria. Cohen κ coefficient was used to calculate interobserver reliability. Cardiologists proposed secondary investigations after ECG interpretation, the costs of which were based on the UK National Health Service tariffs. Inexperienced cardiologists were more likely to classify an ECG as abnormal compared with experienced cardiologists (odds ratio, 1.44; 95% confidence interval, 1.03-2.02). Modification of ECG interpretation criteria improved interobserver reliability for categorizing an ECG as abnormal from poor (2010 European Society of Cardiology recommendations; κ=0.15) to moderate (refined criteria; κ=0.41) among inexperienced cardiologists; however, interobserver reliability was moderate for all 3 criteria among experienced cardiologists (κ=0.40-0.53). Inexperienced cardiologists were more likely to refer athletes for further evaluation compared with experienced cardiologists (odds ratio, 4.74; 95% confidence interval, 3.50-6.43) with poorer interobserver reliability (κ=0.22 versus κ=0.47). Interobserver reliability for secondary investigations after ECG interpretation ranged from poor to fair among inexperienced cardiologists (κ=0.15-0.30) and fair to moderate among experienced cardiologists (κ=0.21-0.46). 
The cost of cardiovascular evaluation per athlete was $175 (95% confidence interval, $142-$228) and $101 (95% confidence interval, $83-$131) for inexperienced and experienced cardiologists, respectively. Interpretation of the ECG in athletes and the resultant cascade of investigations are highly physician dependent even in experienced hands with important downstream financial implications, emphasizing the need for formal training and standardized diagnostic pathways. © 2017 American Heart Association, Inc.
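The κ values quoted above are chance-corrected agreement statistics. As a rough illustration only, the following sketch computes Cohen's κ for two readers; the function name and the ten ECG readings are hypothetical, and the abstract does not say how pairwise values were combined across the eight cardiologists.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical readings for ten ECGs (not data from the study).
reader_1 = ["abnormal", "normal", "normal", "abnormal", "normal",
            "normal", "normal", "abnormal", "normal", "normal"]
reader_2 = ["abnormal", "normal", "abnormal", "normal", "normal",
            "normal", "normal", "abnormal", "normal", "abnormal"]
kappa = cohen_kappa(reader_1, reader_2)
```

In this made-up example the readers agree on 7 of 10 ECGs, yet κ is much lower than 0.7 because a large part of that agreement is expected by chance.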
Patient-oriented cancer information on the internet: a comparison of wikipedia and a professionally maintained database.

PubMed

Rajagopalan, Malolan S; Khanna, Vineet K; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N; Dicker, Adam P; Lawrence, Yaacov R

2011-09-01

A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention.
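The Flesch-Kincaid grade levels reported above come from a fixed formula based on words per sentence and syllables per word. A minimal sketch follows, assuming the counts are already available; the function name and the sample counts are hypothetical, and syllable counting itself is left out because it is usually heuristic.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level from raw counts of a text."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(100, 5, 150)
```

Longer sentences and longer words both push the grade upward, which is consistent with the gap between 9.6 and 14.1 reported for the two sites.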
New methodology to reconstruct in 2-D the cuspal enamel of modern human lower molars.

PubMed

Modesto-Mata, Mario; García-Campos, Cecilia; Martín-Francés, Laura; Martínez de Pinillos, Marina; García-González, Rebeca; Quintino, Yuliet; Canals, Antoni; Lozano, Marina; Dean, M Christopher; Martinón-Torres, María; Bermúdez de Castro, José María

2017-08-01

In recent years, different methodologies have been developed to reconstruct worn teeth. In this article, we propose a new 2-D methodology to reconstruct the worn enamel of lower molars. Our main goals are to reconstruct molars with a high level of accuracy when measuring relevant histological variables and to validate the methodology by calculating the errors associated with the measurements. This methodology is based on polynomial regression equations, and has been validated using two different dental variables: cuspal enamel thickness and crown height of the protoconid. In order to perform the validation process, simulated worn modern human molars were employed. The associated errors of the measurements were also estimated by applying methodologies previously proposed by other authors. The mean percentage error estimated in reconstructed molars for these two variables in comparison with their own real values is -2.17% for the cuspal enamel thickness of the protoconid and -3.18% for the crown height of the protoconid. This error significantly improves on the results of other methodologies, both in the interobserver error and in the accuracy of the measurements. The new methodology based on polynomial regressions can be confidently applied to the reconstruction of cuspal enamel of lower molars, as it improves the accuracy of the measurements and reduces the interobserver error. The present study shows that it is important to validate all methodologies in order to know the associated errors. This new methodology can be easily exported to other modern human populations, the human fossil record and forensic sciences. © 2017 Wiley Periodicals, Inc.
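The mean percentage errors quoted above can be read as signed averages of relative differences. The sketch below assumes the sign convention "reconstructed minus real", which the abstract does not state explicitly; the function name and the measurements are hypothetical.

```python
def mean_percentage_error(reconstructed, real):
    """Signed mean of (reconstructed - real) / real, expressed in percent."""
    if len(reconstructed) != len(real) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [100.0 * (estimate - actual) / actual
              for estimate, actual in zip(reconstructed, real)]
    return sum(errors) / len(errors)

# Hypothetical enamel thickness values in millimetres (not study data).
real_thickness = [2.0, 4.0]
reconstructed_thickness = [1.9, 3.8]
mpe = mean_percentage_error(reconstructed_thickness, real_thickness)
```

Under this convention a negative value, as in the abstract, would mean that the reconstruction underestimates the real measurement on average.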
Assessment of accuracy and efficiency of atlas-based autosegmentation for prostate radiotherapy in a variety of clinical conditions.

PubMed

Simmat, I; Georg, P; Georg, D; Birkfellner, W; Goldner, G; Stock, M

2012-09-01

The goal of the current study was to evaluate the commercially available atlas-based autosegmentation software for clinical use in prostate radiotherapy. The accuracy was benchmarked against interobserver variability. A total of 20 planning computed tomographs (CTs) and 10 cone-beam CTs (CBCTs) were selected for prostate, rectum, and bladder delineation. The images varied with regard to individual (age, body mass index) and setup parameters (contrast agent, rectal balloon, implanted markers). Automatically created contours with ABAS(®) and iPlan(®) were compared to an expert's delineation by calculating the Dice similarity coefficient (DSC) and conformity index. Demo-atlases of both systems showed different results for bladder (DSC(ABAS) 0.86 ± 0.17, DSC(iPlan) 0.51 ± 0.30) and prostate (DSC(ABAS) 0.71 ± 0.14, DSC(iPlan) 0.57 ± 0.19). Rectum delineation (DSC(ABAS) 0.78 ± 0.11, DSC(iPlan) 0.84 ± 0.08) demonstrated differences between the systems but better correlation of the automatically drawn volumes. ABAS(®) was closest to the interobserver benchmark. Autosegmentation with iPlan(®), ABAS(®) and manual segmentation took 0.5, 4 and 15-20 min, respectively. Automatic contouring on CBCT showed high dependence on image quality (DSC bladder 0.54, rectum 0.42, prostate 0.34). For clinical routine, efforts are still necessary to either redesign algorithms implemented in autosegmentation or to optimize image quality for CBCT to guarantee required accuracy and time savings for adaptive radiotherapy.
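The Dice similarity coefficient used above measures the overlap between two contours as twice the shared volume divided by the sum of both volumes. A minimal sketch follows, assuming each contour is represented as a set of voxel coordinates; the function name and the toy masks are hypothetical and unrelated to the ABAS or iPlan software.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity coefficient for two collections of voxel coordinates."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty contours are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical 2-D masks: four voxels each, two of them shared.
auto_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
expert_contour = {(0, 1), (1, 1), (1, 2), (2, 1)}
dsc = dice_coefficient(auto_contour, expert_contour)
```

A value of 1 means perfect overlap and 0 means none, so the CBCT prostate result of 0.34 indicates that most of the automatic contour did not coincide with the expert's.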
Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement.

PubMed

McInnes, Matthew D F; Moher, David; Thombs, Brett D; McGrath, Trevor A; Bossuyt, Patrick M; Clifford, Tammy; Cohen, Jérémie F; Deeks, Jonathan J; Gatsonis, Constantine; Hooft, Lotty; Hunt, Harriet A; Hyde, Christopher J; Korevaar, Daniël A; Leeflang, Mariska M G; Macaskill, Petra; Reitsma, Johannes B; Rodin, Rachel; Rutjes, Anne W S; Salameh, Jean-Paul; Stevens, Adrienne; Takwoingi, Yemisi; Tonelli, Marcello; Weeks, Laura; Whiting, Penny; Willis, Brian H

2018-01-23

Systematic reviews of diagnostic test accuracy synthesize data from primary diagnostic studies that have evaluated the accuracy of 1 or more index tests against a reference standard, provide estimates of test performance, allow comparisons of the accuracy of different tests, and facilitate the identification of sources of variability in test accuracy. To develop the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagnostic test accuracy guideline as a stand-alone extension of the PRISMA statement. Modifications to the PRISMA statement reflect the specific requirements for reporting of systematic reviews and meta-analyses of diagnostic test accuracy studies and the abstracts for these reviews. Established standards from the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network were followed for the development of the guideline. The original PRISMA statement was used as a framework on which to modify and add items. A group of 24 multidisciplinary experts used a systematic review of articles on existing reporting guidelines and methods, a 3-round Delphi process, a consensus meeting, pilot testing, and iterative refinement to develop the PRISMA diagnostic test accuracy guideline. The final version of the PRISMA diagnostic test accuracy guideline checklist was approved by the group. The systematic review (produced 64 items) and the Delphi process (provided feedback on 7 proposed items; 1 item was later split into 2 items) identified 71 potentially relevant items for consideration. The Delphi process reduced these to 60 items that were discussed at the consensus meeting. Following the meeting, pilot testing and iterative feedback were used to generate the 27-item PRISMA diagnostic test accuracy checklist. To reflect specific or optimal contemporary systematic review methods for diagnostic test accuracy, 8 of the 27 original PRISMA items were left unchanged, 17 were modified, 2 were added, and 2 were omitted. 
The 27-item PRISMA diagnostic test accuracy checklist provides specific guidance for reporting of systematic reviews. The PRISMA diagnostic test accuracy guideline can facilitate the transparent reporting of reviews, and may assist in the evaluation of validity and applicability, enhance replicability of reviews, and make the results from systematic reviews of diagnostic test accuracy studies more useful.
Radiographic Blind Test of Curvature of the Posterior Border of the Mandibular Ramus as a Morphological Indicator of Gender.

PubMed

Peregrina, Alejandro; Azer, Shereen S; Tao, Erin E; Johnston, William M

2016-12-01

Curvature of the posterior border of the mandibular ramus at the occlusal plane has been described as a morphological trait for males. Controversy over the accuracy of this method remains among researchers; studies employing similar methods report accuracy rates for successful gender identification ranging from 59% to 99%. This blind study assessed evaluators' ability to determine gender based on the presence or absence of curvature of the posterior margin of the mandibular ramus through panoramic radiographs. Randomly selected panoramic radiographs were obtained from The Ohio State University College of Dentistry for 413 adult male (M) and female (F) subjects. Two evaluators separately assigned ratings to each subject on the right and left sides of mandibular rami, using a method similar to the Loth and Henneberg methodology. The ratings were based upon three criteria: (1) presence of curvature at the occlusal plane (M), (2) presence of curvature but not at the occlusal plane (F), and (3) lack of curvature (F). Pearson exact chi-squared test was used to evaluate the statistical strength of the ratings. The evaluators were only in agreement for both the right and left rami in roughly two-thirds (66.8%) of cases when there was no excessive tooth loss (ETL); however, the inter-observer agreement improved to 82.1% for those rami associated with ETL. Inter-observer agreement occurred in 72.9% of female rami and in only 64.4% of male rami. The results of this study indicated that assessment of posterior border curvature of mandibular rami through panoramic radiographs was not a reliable indicator of gender and was further plagued by unacceptably high levels of inter-observer disagreement. © 2016 by the American College of Prosthodontists.
Intraoperative Physical Examination for Diagnosis of Interosseous Ligament Rupture-Cadaveric Study.

PubMed

Kachooei, Amir Reza; Rivlin, Michael; Wu, Fei; Faghfouri, Aram; Eberlin, Kyle R; Ring, David

2015-09-01

To study the intraobserver and interobserver reliability of the diagnosis of interosseous ligament (IOL) rupture in a cadaver model. On 12 fresh frozen cadavers, radial heads were cut using an identical incision and osteotomy. After randomization, the soft tissues of the limbs were divided into 4 groups: both IOL and triangular fibrocartilage (TFCC) intact; IOL disruption but TFCC intact; both IOL and TFCC divided; and IOL intact but TFCC divided. All incisions had identical suturing. After standard instruction and demonstration of radius pull-push and radius lateral pull tests, 10 physician evaluators with different levels of experience examined the cadaver limbs in a standardized way (elbow at 90° with the forearm held in both supination and pronation) and were asked to classify them into one of the 4 groups. Next, the same examiners were asked to re-examine the limbs after randomly changing the order of examination. The interobserver reliability of agreement for the diagnosis of IOL injury (groups 2 and 3) was fair in both rounds of examination and the intraobserver reliability was moderate. The intra- and interobserver reliabilities of agreement for the 4 groups of injuries among the examiners were fair in both rounds of examination. The sensitivity, specificity, accuracy, positive, and negative predictive values were all around 70%. The likelihood of a positive test corresponding with the presence of IOL rupture (positive likelihood ratio) was 2.2. The likelihood of a negative test correctly diagnosing an intact IOL was 0.40. In cadavers, intraoperative tests had fair reliability and 70% accuracy for the diagnosis of IOL rupture using the push-pull and lateral pull maneuvers. The level of experience did not have any effect on the correct diagnosis of intact versus disrupted IOL. Although not common, some failure of surgeries for traumatic elbow fracture-dislocations is because of failure in timely diagnosis of IOL disruption. 
Copyright © 2015 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
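The likelihood ratios in the abstract above follow directly from sensitivity and specificity. The sketch below uses the rounded "around 70%" figures as inputs, which is an assumption because the exact values are not given; the function name is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for a binary diagnostic test."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

# Rounded inputs taken from the abstract ("all around 70%").
lr_pos, lr_neg = likelihood_ratios(0.70, 0.70)
```

With both inputs at exactly 0.70 the ratios come out near 2.3 and 0.43, close to the reported 2.2 and 0.40; the small gap is what one would expect if the unrounded sensitivity and specificity differ slightly from 70%.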

STARD 2015: An Updated List of Essential Items for Reporting Diagnostic Accuracy Studies.

PubMed

Bossuyt, Patrick M; Reitsma, Johannes B; Bruns, David E; Gatsonis, Constantine A; Glasziou, Paul P; Irwig, Les; Lijmer, Jeroen G; Moher, David; Rennie, Drummond; de Vet, Henrica C W; Kressel, Herbert Y; Rifai, Nader; Golub, Robert M; Altman, Douglas G; Hooft, Lotty; Korevaar, Daniël A; Cohen, Jérémie F

2015-12-01

Incomplete reporting has been identified as a major source of avoidable waste in biomedical research. Essential information is often not provided in study reports, impeding the identification, critical appraisal, and replication of studies. To improve the quality of reporting of diagnostic accuracy studies, the Standards for Reporting of Diagnostic Accuracy Studies (STARD) statement was developed. Here we present STARD 2015, an updated list of 30 essential items that should be included in every report of a diagnostic accuracy study. This update incorporates recent evidence about sources of bias and variability in diagnostic accuracy and is intended to facilitate the use of STARD. As such, STARD 2015 may help to improve completeness and transparency in reporting of diagnostic accuracy studies.
Acetate templating on digital images is more accurate than computer-based templating for total hip arthroplasty.

PubMed

Petretta, Robert; Strelzow, Jason; Ohly, Nicholas E; Misur, Peter; Masri, Bassam A

2015-12-01

Templating is an important aspect of preoperative planning for total hip arthroplasty and can help determine the size and positioning of the prosthesis. Historically, templating has been performed using acetate templates over printed radiographs. As a result of the increasing use of digital imaging, surgeons now either obtain additional printed radiographs solely for templating purposes or use specialized digital templating software, both of which carry additional cost. The purpose of this study was to compare acetate templating of digitally calibrated images on an LCD monitor to digital templating in terms of (1) accuracy; (2) reproducibility; and (3) time efficiency. Acetate onlay templating was performed directly over digital radiographs on an LCD monitor and was compared with digital templating. Five separate observers participated in this study templating on 52 total hip arthroplasties. For the acetate templating, the digital images were magnified to the scaled reference on the templates provided by the manufacturer (ratio 1.2:1) before templating using a 25-mm marker as a reference. Both the acetate and digital templating results were then compared with the actual implanted components to determine accuracy. Interobserver and intraobserver variability was determined by an intraclass correlation coefficient. Observers recorded time to complete templating from the time of complete upload of patients' imaging onto the system to completion of templating. Both acetate and digital templates demonstrated moderate accuracy in predicting within one size of the eventual implanted acetabular cup (77% [199 of 260]; 70% [181 of 260], respectively; p = 0.050; 95% confidence interval [CI], 0.058-0.32), whereas acetate templating was better at predicting the femoral stem compared to digital templating (75% [195 of 260]; 60% [155 of 260], respectively; p < 0.001; 95% CI, 0.084-0.32). 
Acetate templating showed moderate to substantial interobserver agreement (cup intraclass correlation coefficient [ICC] = 0.55; 95% CI, 0.14-0.86; femoral ICC = 0.75; 95% CI, 0.39-0.95) and both methods showed almost perfect intraobserver agreement in reproducibility (acetate cup ICC = 0.82; 95% CI, 0.66-0.97; acetate femoral ICC = 0.86; 95% CI, 0.74-0.97; digital cup ICC = 0.82; 95% CI, 0.68-0.97; digital femoral ICC = 0.88; 95% CI, 0.77-1.0). Acetate templating could be performed more quickly (acetate mean 119 seconds; range, 37-220 seconds versus 154 seconds; range, 73-343 seconds; p < 0.001). Acetate onlay templating on digitally calibrated images can be a reliable substitute for digital templating using specialized software. It is quicker to perform and much less expensive. Hospitals and practices need not purchase expensive software, particularly at lower volume centers. Level III, diagnostic study.
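The accuracy percentages in this abstract can be checked against the counts it reports. The short sketch below assumes that the denominator of 260 is the product of 5 observers and 52 hips, which the abstract implies but does not state; the variable names are hypothetical.

```python
OBSERVERS = 5
HIPS = 52
total_readings = OBSERVERS * HIPS  # assumed origin of the 260 readings

# Counts of predictions within one size of the implanted component,
# as quoted in the abstract.
within_one_size = {
    "acetate cup": 199,
    "digital cup": 181,
    "acetate stem": 195,
    "digital stem": 155,
}

# Convert each count to a whole-number percentage of all readings.
percentages = {
    name: round(100 * count / total_readings)
    for name, count in within_one_size.items()
}
```

The rounded results match the 77%, 70%, 75% and 60% given in the abstract.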
Assessing clinical reasoning (ASCLIRE): Instrument development and validation.

PubMed

Kunina-Habenicht, Olga; Hautz, Wolf E; Knigge, Michel; Spies, Claudia; Ahlers, Olaf

2015-12-01

Clinical reasoning is an essential competency in medical education. This study aimed at developing and validating a test to assess diagnostic accuracy, collected information, and diagnostic decision time in clinical reasoning. A norm-referenced computer-based test for the assessment of clinical reasoning (ASCLIRE) was developed, integrating the entire clinical decision process. In a cross-sectional study participants were asked to choose as many diagnostic measures as they deemed necessary to diagnose the underlying disease of six different cases with acute or sub-acute dyspnea and provide a diagnosis. 283 students and 20 content experts participated. In addition to diagnostic accuracy, respective decision time and number of used relevant diagnostic measures were documented as distinct performance indicators. The empirical structure of the test was investigated using a structural equation modeling approach. Experts showed higher accuracy rates and lower decision times than students. In a cross-sectional comparison, the diagnostic accuracy of students improved with the year of study. Wrong diagnoses provided by our sample were comparable to wrong diagnoses in practice. We found an excellent fit for a model with three latent factors (diagnostic accuracy, decision time, and choice of relevant diagnostic information), with diagnostic accuracy showing no significant correlation with decision time. ASCLIRE considers decision time as an important performance indicator besides diagnostic accuracy and provides evidence that clinical reasoning is a complex ability comprising diagnostic accuracy, decision time, and choice of relevant diagnostic information as three partly correlated but still distinct aspects.
Diagnostic Reproducibility: What Happens When the Same Pathologist Interprets the Same Breast Biopsy Specimen at Two Points in Time?

PubMed Central

Jackson, Sara L.; Frederick, Paul D.; Pepe, Margaret S.; Nelson, Heidi D.; Weaver, Donald L.; Allison, Kimberly H.; Carney, Patricia A.; Geller, Berta M.; Tosteson, Anna N. A.; Onega, Tracy; Elmore, Joann G.

2017-01-01

Background Surgeons may receive a different diagnosis when a breast biopsy is interpreted by a second pathologist. The extent to which diagnostic agreement by the same pathologist varies at two time points is unknown. Participants and Methods Pathologists from 8 U.S. states independently interpreted 60 breast specimens, one glass slide per case, on 2 occasions separated by ≥9 months. Reproducibility was assessed by comparing interpretations between the two time points; associations between reproducibility (intra-observer agreement rates) and characteristics of pathologists and cases were determined and also compared with inter-observer agreement of baseline interpretations. Results Sixty-five percent of invited, responding pathologists were eligible and consented; 49 interpreted glass slides in both study phases resulting in 2,940 interpretations. Intra-observer agreement rates between the two phases were 92% (95% CI 88%-95%) for invasive breast cancer, 84% (95% CI 81%-87%) for ductal carcinoma in situ (DCIS), 53% (95% CI 47%-59%) for atypia, and 84% (95% CI 81%-86%) for benign without atypia. When comparing all study participants' case interpretations at baseline, inter-observer agreement rates were 89% (95% CI 84%-92%) for invasive cancer, 79% (95% CI 76%-81%) for DCIS, 43% (95% CI 41%-45%) for atypia, and 77% (95% CI 74%-79%) for benign without atypia. Conclusions Interpretive agreement between two time points by the same individual pathologists was low for atypia, and similar to observed rates of agreement for atypia between different pathologists. Physicians and patients should be aware of the diagnostic challenges associated with a breast biopsy diagnosis of atypia when considering treatment and surveillance decisions. PMID:27913946
Diagnostic Reproducibility: What Happens When the Same Pathologist Interprets the Same Breast Biopsy Specimen at Two Points in Time?

PubMed

Jackson, Sara L; Frederick, Paul D; Pepe, Margaret S; Nelson, Heidi D; Weaver, Donald L; Allison, Kimberly H; Carney, Patricia A; Geller, Berta M; Tosteson, Anna N A; Onega, Tracy; Elmore, Joann G

2017-05-01

Surgeons may receive a different diagnosis when a breast biopsy is interpreted by a second pathologist. The extent to which diagnostic agreement by the same pathologist varies at two time points is unknown. Pathologists from eight U.S. states independently interpreted 60 breast specimens, one glass slide per case, on two occasions separated by ≥9 months. Reproducibility was assessed by comparing interpretations between the two time points; associations between reproducibility (intraobserver agreement rates); and characteristics of pathologists and cases were determined and also compared with interobserver agreement of baseline interpretations. Sixty-five percent of invited, responding pathologists were eligible and consented; 49 interpreted glass slides in both study phases, resulting in 2940 interpretations. Intraobserver agreement rates between the two phases were 92% [95% confidence interval (CI) 88-95] for invasive breast cancer, 84% (95% CI 81-87) for ductal carcinoma-in-situ, 53% (95% CI 47-59) for atypia, and 84% (95% CI 81-86) for benign without atypia. When comparing all study participants' case interpretations at baseline, interobserver agreement rates were 89% (95% CI 84-92) for invasive cancer, 79% (95% CI 76-81) for ductal carcinoma-in-situ, 43% (95% CI 41-45) for atypia, and 77% (95% CI 74-79) for benign without atypia. Interpretive agreement between two time points by the same individual pathologist was low for atypia and was similar to observed rates of agreement for atypia between different pathologists. Physicians and patients should be aware of the diagnostic challenges associated with a breast biopsy diagnosis of atypia when considering treatment and surveillance decisions.
CT angiography for one-year follow-up of intracranial aneurysms treated with the WEB device: Utility in evaluating aneurysm occlusion and WEB compression at one year.

PubMed

Raoult, Hélène; Eugène, François; Le Bras, Anthony; Mineur, Géraldine; Carsin-Nicol, Béatrice; Ferré, Jean-Christophe; Gauvrit, Jean-Yves

2018-03-07

The WEB is an innovative flow disruption device for cerebral aneurysm embolization with rapidly expanding indications. Our purpose was to evaluate the diagnostic performance of computed tomography angiography (CTA) at 1-year follow-up of aneurysms treated with the WEB. Between April 2014 and May 2016, the study prospectively included patients treated with the WEB at our institution, and followed up within 24 hours by CTA and at 1 year by CTA, time-of-flight magnetic resonance angiography (TOF MRA) and digital subtraction angiography (DSA). The diagnostic quality of imaging data was assessed based on the confidence index, artifacts, and WEB shape depiction. The imaging diagnostic performance was assessed using 3 criteria at 1 year: aneurysm occlusion status and worsening, and WEB shape compression. Interobserver and intermodality agreement was determined by calculating κ values. The study ultimately included 16 patients (9 women, mean age 53 ± 7.6 years). CTA quality confidence was scored as 2/2, artifacts 0.4/2 and WEB shape depiction 1.9/2, superior to TOF MRA for the latter two criteria. Aneurysm occlusion was adequate in 93.7% of patients, with CTA showing excellent interobserver reproducibility and agreement with DSA on a 4-grade scale (κ=1.00), while TOF MRA yielded good reproducibility (κ=0.76) and agreement with DSA (κ=0.69). CTA also identified aneurysm occlusion worsening (43.7%) and WEB compression (81.2%) in excellent agreement with DSA (κ=0.85 and 1.00). CTA is a reproducible and reliable technique for the follow-up of aneurysms treated with the WEB device. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Increased risk of malignancy for non-atypical urothelial cell groups compared to negative cytology in voided urine. Morphological changes with LBC.

PubMed

Granados, Rosario; Butrón, Mercedes; Santonja, Carlos; Rodríguez, José-María; Martín, Ana; Duarte, Joanny; Camarmo, Encarnación; Corrales, Teresa; Aramburu, José-Antonio

2016-07-01

Liquid-based cytology (LBC) has recently become the preferred method for urine cytology analysis, but differences with conventional cytology (CC) have been observed. The purpose of this study is to analyze these differences and the clinical relevance of non-atypical urothelial cell groups (UCG) in voided urine specimens. Reporting terminology is discussed. Initially, diagnostic categories from 619 LBC and 474 CC samples, reviewed by five different pathologists, were compared (phase 1). Five years after LBC was implemented and applying strict cytologic criteria for UCG diagnosis, 760 samples were analyzed (phase 2) and compared to previous LBC specimens. Diagnostic differences, interobserver variability and clinicopathological correlation with a 6-month follow-up, were analyzed. UCG increased from 6.5% with CC to 20.7% (218%, 3.2 fold, P < 0.0001) with LBC. This difference was not related to interobserver variability. Five years later, the rate of UCG had decreased to 13.2%. While 6% of cases with a negative cytology had urothelial carcinoma (UC) within 6 months of diagnosis, this percentage increased to 15.7% with UCG. The sensitivity of the UCG category for UC was low (30.4%), but the specificity and the negative predictive value (NPV) were high (87.1% and 94%, respectively). LBC increases UCG when compared to CC. This can be corrected with observer's experience and using set cytological criteria. Due to its association with carcinoma, the presence of UCG in voided urine should be framed in a diagnostic category other than "negative for malignancy." Diagn. Cytopathol. 2016;44:582-590. © 2016 Wiley Periodicals, Inc.
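The combination of low sensitivity with high specificity and high NPV reported above is typical when the target condition is uncommon. The sketch below derives these metrics from a 2x2 table; the function name and the counts are hypothetical round numbers chosen for illustration and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy metrics from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases flagged by the test
        "specificity": tn / (tn + fp),  # disease-free cases not flagged
        "ppv": tp / (tp + fp),          # flagged cases that are diseased
        "npv": tn / (tn + fn),          # unflagged cases that are disease-free
    }

# Hypothetical table: 100 carcinomas among 1000 samples.
metrics = diagnostic_metrics(tp=30, fp=100, fn=70, tn=800)
```

Here sensitivity is only 30%, yet the NPV stays above 90% because most unflagged samples come from the large disease-free group.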
Accuracy and Reliability of the Klales et al. (2012) Morphoscopic Pelvic Sexing Method.

PubMed

Lesciotto, Kate M; Doershuk, Lily J

2018-01-01

Klales et al. (2012) devised an ordinal scoring system for the morphoscopic pelvic traits described by Phenice (1969) and used for sex estimation of skeletal remains. The aim of this study was to test the accuracy and reliability of the Klales method using a large sample from the Hamann-Todd collection (n = 279). Two observers were blinded to sex, ancestry, and age and used the Klales et al. method to estimate the sex of each individual. Sex was correctly estimated for females with over 95% accuracy; however, the male allocation accuracy was approximately 50%. Weighted Cohen's kappa and intraclass correlation coefficient analysis for evaluating intra- and interobserver error showed moderate to substantial agreement for all traits. Although each trait can be reliably scored using the Klales method, low accuracy rates and high sex bias indicate better trait descriptions and visual guides are necessary to more accurately reflect the range of morphological variation. © 2017 American Academy of Forensic Sciences.
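Weighted Cohen's kappa, mentioned above, gives partial credit when two observers assign neighbouring ordinal scores. The following sketch uses linear disagreement weights, which is an assumption since the abstract does not say whether linear or quadratic weights were applied; the function name and the trait scores are hypothetical.

```python
def weighted_kappa(scores_a, scores_b, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("scores must be non-empty and of equal length")
    n = len(scores_a)
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    # Joint proportions of (rater A score, rater B score).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(scores_a, scores_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    if weighted_expected == 0.0:
        return 1.0  # both raters used a single identical category
    return 1 - weighted_observed / weighted_expected

# Hypothetical scores on a 1-5 ordinal scale for six specimens.
observer_1 = [1, 2, 3, 4, 5, 3]
observer_2 = [1, 2, 3, 4, 5, 4]
kappa_w = weighted_kappa(observer_1, observer_2, categories=[1, 2, 3, 4, 5])
```

The single disagreement in this example is between adjacent scores, so it costs less agreement than a disagreement between scores 1 and 5 would.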
Effects of disease severity distribution on the performance of quantitative diagnostic methods and proposal of a novel 'V-plot' methodology to display accuracy values.

PubMed

Petraco, Ricardo; Dehbi, Hakim-Moulay; Howard, James P; Shun-Shin, Matthew J; Sen, Sayan; Nijjer, Sukhjinder S; Mayet, Jamil; Davies, Justin E; Francis, Darrel P

2018-01-01

Diagnostic accuracy is widely accepted by researchers and clinicians as an optimal expression of a test's performance. The aim of this study was to evaluate the effects of disease severity distribution on values of diagnostic accuracy as well as propose a sample-independent methodology to calculate and display accuracy of diagnostic tests. We evaluated the diagnostic relationship between two hypothetical methods to measure serum cholesterol (Chol_rapid and Chol_gold) by generating samples with statistical software and (1) keeping the numerical relationship between methods unchanged and (2) changing the distribution of cholesterol values. Metrics of categorical agreement were calculated (accuracy, sensitivity and specificity). Finally, a novel methodology to display and calculate accuracy values was presented (the V-plot of accuracies). No single value of diagnostic accuracy can be used to describe the relationship between tests, as accuracy is a metric heavily affected by the underlying sample distribution. Our novel proposed methodology, the V-plot of accuracies, can be used as a sample-independent measure of a test's performance against a reference gold standard.
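The central claim above, that accuracy depends on how values are distributed around the cut-off, can be illustrated with a small simulation. This is a sketch under stated assumptions: the cut-off, the noise level, the normal distributions and all names are hypothetical, and it does not reproduce the authors' V-plot.

```python
import random

def categorical_accuracy(gold_values, noise_sd, cutoff, rng):
    """Share of samples that both methods place on the same side of the cut-off."""
    agree = 0
    for gold in gold_values:
        # The rapid method always equals the gold value plus the same kind of noise.
        rapid = gold + rng.gauss(0.0, noise_sd)
        if (gold >= cutoff) == (rapid >= cutoff):
            agree += 1
    return agree / len(gold_values)

rng = random.Random(0)  # fixed seed so the run is repeatable
CUTOFF = 5.0            # hypothetical decision threshold
NOISE_SD = 0.3          # identical measurement noise in both scenarios

# Scenario 1: values clustered near the cut-off.
near_cutoff = [rng.gauss(CUTOFF, 0.3) for _ in range(5000)]
# Scenario 2: values spread widely around the cut-off.
wide_spread = [rng.gauss(CUTOFF, 2.0) for _ in range(5000)]

accuracy_near = categorical_accuracy(near_cutoff, NOISE_SD, CUTOFF, rng)
accuracy_wide = categorical_accuracy(wide_spread, NOISE_SD, CUTOFF, rng)
```

The numerical relationship between the two methods is the same in both scenarios, yet the clustered sample yields clearly lower accuracy because more values sit close enough to the cut-off for noise to flip their classification.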
A systematic review of the PTSD Checklist's diagnostic accuracy studies using QUADAS.

PubMed

McDonald, Scott D; Brown, Whitney L; Benesek, John P; Calhoun, Patrick S

2015-09-01

Despite the popularity of the PTSD Checklist (PCL) as a clinical screening test, there has been no comprehensive quality review of studies evaluating its diagnostic accuracy. A systematic quality assessment of 22 diagnostic accuracy studies of the English-language PCL using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) assessment tool was conducted to examine (a) the quality of diagnostic accuracy studies of the PCL, and (b) whether quality has improved since the 2003 STAndards for the Reporting of Diagnostic accuracy studies (STARD) initiative regarding reporting guidelines for diagnostic accuracy studies. Three raters independently applied the QUADAS tool to each study, and a consensus among the 4 authors is reported. Findings indicated that although studies generally met standards in several quality areas, there is still room for improvement. Areas for improvement include establishing representativeness, adequately describing clinical and demographic characteristics of the sample, and presenting better descriptions of important aspects of test and reference standard execution. Only 2 studies met each of the 14 quality criteria. In addition, study quality has not appreciably improved since the publication of the STARD Statement in 2003. Recommendations for the improvement of diagnostic accuracy studies of the PCL are discussed. (c) 2015 APA, all rights reserved.
STARD 2015: An Updated List of Essential Items for Reporting Diagnostic Accuracy Studies.

PubMed

Bossuyt, Patrick M; Reitsma, Johannes B; Bruns, David E; Gatsonis, Constantine A; Glasziou, Paul P; Irwig, Les; Lijmer, Jeroen G; Moher, David; Rennie, Drummond; de Vet, Henrica C W; Kressel, Herbert Y; Rifai, Nader; Golub, Robert M; Altman, Douglas G; Hooft, Lotty; Korevaar, Daniël A; Cohen, Jérémie F

2015-12-01

Incomplete reporting has been identified as a major source of avoidable waste in biomedical research. Essential information is often not provided in study reports, impeding the identification, critical appraisal, and replication of studies. To improve the quality of reporting of diagnostic accuracy studies, the Standards for Reporting of Diagnostic Accuracy Studies (STARD) statement was developed. Here we present STARD 2015, an updated list of 30 essential items that should be included in every report of a diagnostic accuracy study. This update incorporates recent evidence about sources of bias and variability in diagnostic accuracy and is intended to facilitate the use of STARD. As such, STARD 2015 may help to improve completeness and transparency in reporting of diagnostic accuracy studies. © 2015 American Association for Clinical Chemistry.
BRIEF REPORT: Beyond Clinical Experience: Features of Data Collection and Interpretation That Contribute to Diagnostic Accuracy

PubMed Central

Nendaz, Mathieu R; Gut, Anne M; Perrier, Arnaud; Louis-Simonet, Martine; Blondon-Choa, Katherine; Herrmann, François R; Junod, Alain F; Vu, Nu V

2006-01-01

BACKGROUND Clinical experience, features of data collection process, or both, affect diagnostic accuracy, but their respective role is unclear. OBJECTIVE, DESIGN Prospective, observational study, to determine the respective contribution of clinical experience and data collection features to diagnostic accuracy. METHODS Six Internists, 6 second year internal medicine residents, and 6 senior medical students worked up the same 7 cases with a standardized patient. Each encounter was audiotaped and immediately assessed by the subjects who indicated the reasons underlying their data collection. We analyzed the encounters according to diagnostic accuracy, information collected, organ systems explored, diagnoses evaluated, and final decisions made, and we determined predictors of diagnostic accuracy by logistic regression models. RESULTS Several features significantly predicted diagnostic accuracy after correction for clinical experience: early exploration of correct diagnosis (odds ratio [OR] 24.35) or of relevant diagnostic hypotheses (OR 2.22) to frame clinical data collection, larger number of diagnostic hypotheses evaluated (OR 1.08), and collection of relevant clinical data (OR 1.19). CONCLUSION Some features of data collection and interpretation are related to diagnostic accuracy beyond clinical experience and should be explicitly included in clinical training and modeled by clinical teachers. Thoroughness in data collection should not be considered a privileged way to diagnostic success. PMID:17105525
Accuracy of the Interpretation of Chest Radiographs for the Diagnosis of Paediatric Pneumonia

PubMed Central

Elemraid, Mohamed A.; Muller, Michelle; Spencer, David A.; Rushton, Stephen P.; Gorton, Russell; Thomas, Matthew F.; Eastham, Katherine M.; Hampton, Fiona; Gennery, Andrew R.; Clark, Julia E.

2014-01-01

Introduction World Health Organization (WHO) radiological classification remains an important entry criterion in epidemiological studies of pneumonia in children. We report inter-observer variability in the interpretation of 169 chest radiographs in children suspected of having pneumonia. Methods An 18-month prospective aetiological study of pneumonia was undertaken in Northern England. Chest radiographs were performed on eligible children aged ≤16 years with clinical features of pneumonia. The initial radiology report was compared with a subsequent assessment by a consultant cardiothoracic radiologist. Chest radiographic changes were categorised according to the WHO classification. Results There was significant disagreement (22%) between the first and second reports (kappa = 0.70, P<0.001), notably in those aged <5 years (26%, kappa = 0.66, P<0.001). The most frequent sources of disagreement were the reporting of patchy and perihilar changes. Conclusion This substantial inter-observer variability highlights the need for experts from different countries to create a consensus to review the radiological definition of pneumonia in children. PMID:25148361
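The abstract above reports both a raw disagreement rate and a kappa value. As a minimal sketch of how the two quantities relate, the Python below computes Cohen's kappa from a square table of counts (rows: first reader, columns: second reader). The function name and the counts are hypothetical and are not the study's data; they only illustrate that the same percent agreement can give very different kappa values when the category frequencies differ.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table of counts.

    table[i][j] = number of cases rated category i by reader 1
    and category j by reader 2. Assumes a non-empty table in which
    chance agreement is below 1.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement from the row and column totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: both tables have 80% raw agreement.
balanced = [[40, 10], [10, 40]]   # categories equally common
skewed = [[80, 10], [10, 0]]      # one category dominates

print(round(cohens_kappa(balanced), 3))
print(round(cohens_kappa(skewed), 3))
```

With balanced categories the 80% raw agreement corresponds to a kappa of about 0.6, whereas with the skewed table chance agreement is already 82% and kappa turns slightly negative.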
Accuracy of MRI for the diagnosis of metastatic cervical lymphadenopathy in patients with thyroid cancer.

PubMed

Chen, Qinghua; Raghavan, Prashant; Mukherjee, Sugoto; Jameson, Mark J; Patrie, James; Xin, Wenjun; Xian, Junfang; Wang, Zhenchang; Levine, Paul A; Wintermark, Max

2015-10-01

The aim of this study was to systematically compare a comprehensive array of magnetic resonance (MR) imaging features in terms of their sensitivity and specificity to diagnose cervical lymph node metastases in patients with thyroid cancer. The study included 41 patients with thyroid malignancy who underwent surgical excision of cervical lymph nodes and had preoperative MR imaging ≤4weeks prior to surgery. Three head and neck neuroradiologists independently evaluated all the MR images. Using the pathology results as reference, the sensitivity, specificity and interobserver agreement of each MR imaging characteristic were calculated. On multivariate analysis, no single imaging feature was significantly correlated with metastasis. In general, imaging features demonstrated high specificity, but poor sensitivity and moderate interobserver agreement at best. Commonly used MR imaging features have limited sensitivity at correctly identifying cervical lymph node metastases in patients with thyroid cancer. A negative neck MR scan should not dissuade a surgeon from performing a neck dissection in patients with thyroid carcinomas.
Scaling digital radiographs for templating in total hip arthroplasty using conventional acetate templates independent of calibration markers.

PubMed

Brew, Christopher J; Simpson, Philip M; Whitehouse, Sarah L; Donnelly, William; Crawford, Ross W; Hubble, Matthew J W

2012-04-01

We describe a scaling method for templating digital radiographs using conventional acetate templates independent of template magnification without the need for a calibration marker. The mean magnification factor for the radiology department was determined (119.8%; range, 117%-123.4%). This fixed magnification factor was used to scale the radiographs by the method described. Thirty-two femoral heads on postoperative total hip arthroplasty radiographs were then measured and compared with the actual size. The mean absolute accuracy was within 0.5% of actual head size (range, 0%-3%) with a mean absolute difference of 0.16 mm (range, 0-1 mm; SD, 0.26 mm). Intraclass correlation coefficient showed excellent reliability for both interobserver and intraobserver measurements with intraclass correlation coefficient scores of 0.993 (95% CI, 0.988-0.996) for interobserver measurements and intraobserver measurements ranging between 0.990 and 0.993 (95% CI, 0.980-0.997). Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.
Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.

PubMed

Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A

2016-03-01

Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
Publishing nutrition research: validity, reliability, and diagnostic test assessment in nutrition-related research.

PubMed

Gleason, Philip M; Harris, Jeffrey; Sheean, Patricia M; Boushey, Carol J; Bruemmer, Barbara

2010-03-01

This is the sixth in a series of monographs on research design and analysis. The purpose of this article is to describe and discuss several concepts related to the measurement of nutrition-related characteristics and outcomes, including validity, reliability, and diagnostic tests. The article reviews the methodologic issues related to capturing the various aspects of a given nutrition measure's reliability, including test-retest, inter-item, and interobserver or inter-rater reliability. Similarly, it covers content validity, indicators of absolute vs relative validity, and internal vs external validity. With respect to diagnostic assessment, the article summarizes the concepts of sensitivity and specificity. The hope is that dietetics practitioners will be able to both use high-quality measures of nutrition concepts in their research and recognize these measures in research completed by others. Copyright 2010 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Quantitative contrast enhanced magnetic resonance imaging for the evaluation of peripheral arterial disease: a comparative study versus standard digital angiography.

PubMed

Pavlovic, Chris; Futamatsu, Hideki; Angiolillo, Dominick J; Guzman, Luis A; Wilke, Norbert; Siragusa, Daniel; Wludyka, Peter; Percy, Robert; Northrup, Martin; Bass, Theodore A; Costa, Marco A

2007-04-01

The purpose of this study is to evaluate the accuracy of semiautomated analysis of contrast enhanced magnetic resonance angiography (MRA) in patients who have undergone standard angiographic evaluation for peripheral vascular disease (PVD). Magnetic resonance angiography is an important tool for evaluating PVD. Although this technique is both safe and noninvasive, the accuracy and reproducibility of quantitative measurements of disease severity using MRA in the clinical setting have not been fully investigated. Forty-three lesions in 13 patients who underwent both MRA and digital subtraction angiography (DSA) of iliac and common femoral arteries within 6 months were analyzed using quantitative magnetic resonance angiography (QMRA) and quantitative vascular analysis (QVA). Analysis was repeated by a second operator and by the same operator approximately 1 month later. QMRA underestimated percent diameter stenosis (%DS) compared to measurements made with QVA by 2.47%. Limits of agreement between the two methods were +/- 9.14%. Interobserver variability in measurements of %DS was +/- 12.58% for QMRA and +/- 10.04% for QVA. Intraobserver variability of %DS for QMRA was +/- 4.6% and for QVA was +/- 8.46%. QMRA displays a high level of agreement with QVA when used to determine stenosis severity in iliac and common femoral arteries. Similar levels of interobserver and intraobserver variability are present with each method. Overall, QMRA represents a useful method to quantify severity of PVD.
Additive value of non-contrast MRA in the preoperative evaluation of potential liver donors.

PubMed

Luk, Lyndon; Shenoy-Bhangle, Anuradha S; Jimenez, Guillermo; Ahmed, Firas S; Prince, Martin R; Samstein, Benjamin; Hecht, Elizabeth M

The purpose of this study is to compare diagnostic quality, inter-observer variability and agreement of non-contrast enhanced MRA (NC-MRA) with contrast-enhanced MRA (CE-MRA) in the evaluation of hepatic arterial anatomy. 20 potential liver donors were included in this retrospective study. NC-MRA, CE-MRA and combined data sets were randomized and reviewed by two readers. Reference standard was consensus by two senior radiologists using all data including CTA. There was no difference in IQ or diagnostic confidence between NC-MRA, CE-MRA or combined data for either reader but the arterial origin of segment IV was successfully identified on NC-MRA when CE-MRA was suboptimal. Copyright © 2016 Elsevier Inc. All rights reserved.
Evaluating performance of a user-trained MR lung tumor autocontouring algorithm in the context of intra- and interobserver variations.

PubMed

Yip, Eugene; Yun, Jihyun; Gabos, Zsolt; Baker, Sarah; Yee, Don; Wachowicz, Keith; Rathee, Satyapal; Fallone, B Gino

2018-01-01

Real-time tracking of lung tumors using magnetic resonance imaging (MRI) has been proposed as a potential strategy to mitigate the ill-effects of breathing motion in radiation therapy. Several autocontouring methods have been evaluated against a "gold standard" of a single human expert user. However, contours drawn by experts have inherent intra- and interobserver variations. In this study, we aim to evaluate our user-trained autocontouring algorithm with manually drawn contours from multiple expert users, and to contextualize the accuracy of these autocontours within intra- and interobserver variations. Six non-small cell lung cancer patients were recruited, with institutional ethics approval. Patients were imaged with a clinical 3 T Philips MR scanner using a dynamic 2D balanced SSFP sequence under free breathing. Three radiation oncology experts, each in two separate sessions, contoured 130 dynamic images for each patient. For autocontouring, the first 30 images were used for algorithm training, and the remaining 100 images were autocontoured and evaluated. Autocontours were compared against manual contours in terms of Dice's coefficient (DC) and Hausdorff distances (d_H). Intra- and interobserver variations of the manual contours were also evaluated. When compared with the manual contours of the expert user who trained it, the algorithm generates autocontours whose evaluation metrics (same session: DC = 0.90(0.03), d_H = 3.8(1.6) mm; different session: DC = 0.88(0.04), d_H = 4.3(1.5) mm) are similar to or better than intraobserver variations (DC = 0.88(0.04), and d_H = 4.3(1.7) mm) between two sessions. The algorithm's autocontours are also compared to the manual contours from different expert users with evaluation metrics (DC = 0.87(0.04), d_H = 4.8(1.7) mm) similar to interobserver variations (DC = 0.87(0.04), d_H = 4.7(1.6) mm).
Our autocontouring algorithm delineates tumor contours (<20 ms per contour), in dynamic MRI of lung, that are comparable to multiple human experts (several seconds per contour), but at a much faster speed. At the same time, the agreement between autocontours and manual contours is comparable to the intra- and interobserver variations. This algorithm may be a key component of the real time tumor tracking workflow for our hybrid Linac-MR device in the future. © 2017 American Association of Physicists in Medicine.
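The two contour-comparison metrics named in the abstract above can be sketched as follows. This is a minimal illustration in plain Python, assuming each contour is represented as a set of (x, y) pixel coordinates of the enclosed region; the study may well have computed the Hausdorff distance on contour boundary points and in millimetres, so the representation, the function names, and the toy masks here are assumptions made only for the example.

```python
import math


def dice_coefficient(region_a, region_b):
    """Dice's coefficient: 2 * |A and B| / (|A| + |B|) for two pixel sets."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # two empty regions are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


# Toy example: a 4x4 "manual" region and the same region shifted by one pixel.
manual = {(x, y) for x in range(0, 4) for y in range(4)}
auto = {(x, y) for x in range(1, 5) for y in range(4)}

print(dice_coefficient(manual, auto))    # overlap-based agreement
print(hausdorff_distance(manual, auto))  # worst-case mismatch, in pixels
```

Dice's coefficient summarises overall overlap, while the Hausdorff distance reports the single worst local deviation, which is why the two are usually read together.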

Spectrally encoded confocal microscopy (SECM) for rapid assessment of breast excision specimens (Conference Presentation)

NASA Astrophysics Data System (ADS)

Brachtel, Elena F.; Johnson, Nicole B.; Huck, Amelia E.; Rice-Stitt, Travis L.; Vangel, Mark G.; Smith, Barbara L.; Tearney, Guillermo J.; Kang, DongKyun

2016-03-01

An unacceptably large percentage (20-40%) of breast cancer lumpectomy patients are required to undergo multiple surgeries when positive margins are found upon post-operative histologic assessment. If the margin status can be determined during surgery, the surgeon can resect additional tissue to achieve a tumor-free margin, which will reduce the need for additional surgeries. Spectrally encoded confocal microscopy (SECM) is a high-speed reflectance confocal microscopy technology that has the potential to image the entire surgical margin within a short procedural time. Previously, SECM was shown to rapidly image a large area (10 mm by 10 mm) of human esophageal tissue within a short procedural time (15 seconds). When used in lumpectomy, SECM will be able to image the entire margin surface of ~30 cm² in around 7.5 minutes. SECM images will then be used to determine margin status intra-operatively. In this paper, we present results from a study testing the accuracy of SECM for diagnosing malignant breast tissues. We have imaged freshly-excised breast specimens (N=46) with SECM. SECM images clearly visualized histomorphologic features associated with normal/benign and malignant breast tissues in a similar manner to histologic images. Diagnostic accuracy was tested by comparing SECM diagnoses made by three junior pathologists with corresponding histologic diagnoses made by a senior pathologist. SECM sensitivity and specificity were high, 0.91 and 0.93, respectively. Intra-observer agreement and inter-observer agreement were also high, 0.87 and 0.84, respectively. Results from this study showed that SECM has the potential to accurately determine margin status during breast cancer lumpectomy.
Effects of disease severity distribution on the performance of quantitative diagnostic methods and proposal of a novel ‘V-plot’ methodology to display accuracy values

PubMed Central

Dehbi, Hakim-Moulay; Howard, James P; Shun-Shin, Matthew J; Sen, Sayan; Nijjer, Sukhjinder S; Mayet, Jamil; Davies, Justin E; Francis, Darrel P

2018-01-01

Background Diagnostic accuracy is widely accepted by researchers and clinicians as an optimal expression of a test’s performance. The aim of this study was to evaluate the effects of disease severity distribution on values of diagnostic accuracy as well as propose a sample-independent methodology to calculate and display accuracy of diagnostic tests. Methods and findings We evaluated the diagnostic relationship between two hypothetical methods to measure serum cholesterol (Cholrapid and Cholgold) by generating samples with statistical software and (1) keeping the numerical relationship between methods unchanged and (2) changing the distribution of cholesterol values. Metrics of categorical agreement were calculated (accuracy, sensitivity and specificity). Finally, a novel methodology to display and calculate accuracy values was presented (the V-plot of accuracies). Conclusion No single value of diagnostic accuracy can be used to describe the relationship between tests, as accuracy is a metric heavily affected by the underlying sample distribution. Our novel proposed methodology, the V-plot of accuracies, can be used as a sample-independent measure of a test performance against a reference gold standard. PMID:29387424
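The central claim of the abstract above, that accuracy depends on the sample's distribution even when the relationship between the two methods is fixed, can be illustrated with a small deterministic sketch. Everything in it is an assumption made for illustration: the 5.0 decision threshold, the fixed alternating measurement error of 0.5, and the two sets of gold-standard values. It does not reproduce the authors' simulation or their V-plot.

```python
THRESHOLD = 5.0  # hypothetical decision threshold for "high cholesterol"
ERROR = 0.5      # hypothetical fixed size of the rapid method's error


def rapid_from_gold(gold_value, index):
    """Rapid-method reading: gold value plus an alternating +/- error.

    The same rule is applied to every sample, so the numerical
    relationship between the two methods never changes.
    """
    return gold_value + (ERROR if index % 2 == 0 else -ERROR)


def accuracy(gold_values):
    """Share of cases where both methods fall on the same side of the threshold."""
    agreements = 0
    for index, gold in enumerate(gold_values):
        rapid = rapid_from_gold(gold, index)
        if (rapid >= THRESHOLD) == (gold >= THRESHOLD):
            agreements += 1
    return agreements / len(gold_values)


# Same measurement rule, two different severity distributions.
far_from_threshold = [2.0, 2.5, 3.0, 3.5, 7.0, 7.5, 8.0, 8.5]
near_threshold = [4.6, 4.7, 4.8, 4.9, 5.1, 5.2, 5.3, 5.4]

print(accuracy(far_from_threshold))  # every case classified identically
print(accuracy(near_threshold))      # many cases flip across the threshold
```

In this toy setup the sample far from the threshold yields perfect agreement, while the sample clustered around the threshold yields agreement in only half of the cases, although the measurement error is identical in both.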
Postoperative imaging of orthopaedic hardware in the hand and wrist: is there an added value for tomosynthesis?

PubMed

De Silvestro, A; Martini, K; Becker, A S; Kim-Nguyen, T D L; Guggenberger, R; Calcagni, M; Frauenfelder, T

2018-02-01

To prospectively investigate digital tomosynthesis (DTS) as an alternative to digital radiography (DR) for postoperative imaging of orthopaedic hardware after trauma or arthrodesis in the hand and wrist. Thirty-six consecutive patients (12 female, median age 36 years, range 19-86 years) were included in this institutional review board approved clinical trial. Imaging was performed with DTS in dorso-palmar projection and DR was performed in dorso-palmar, lateral, and oblique views. Images were evaluated by two independent radiologists for qualitative and diagnosis-related imaging parameters using a four-point Likert scale (1=excellent, 4=not diagnostic) and nominal scale. Interobserver agreement between the two readers was assessed with Cohen's kappa (κ). Differences between DTS and CR were tested with Wilcoxon's signed-rank test. A p-value <0.05 was considered statistically significant. Regarding image quality, interobserver agreement was higher for DTS compared to DR, especially for fracture-related parameters (delineation osteosynthesis material [OSM]: κ DTS 0.96 versus κ DR 0.45; delineation fracture margins: κ DTS 0.78 versus κ DR 0.35). Delineation of fracture margins and delineation of adjacent joint spaces scored significantly better for DTS compared to DR (delineation fracture margins: DTS 1.54, DR 2.28, p0.001; delineation adjacent joint spaces: DTS 1.31, DR 2.24, p0.001). Regarding diagnosis-related findings, interobserver agreement was almost equal. DTS showed a significantly higher sharpness of fracture margins (DTS 1.94, DR 2.33, p0.04). Mean dose area product (DAP) for DTS was significantly higher compared to DR (mean DR 0.219 Gy·cm², mean DTS 0.903 Gy·cm², p0.001). Fracture healing is more visible and interobserver agreement is higher for DTS compared to DR in the postoperative assessment of orthopaedic hardware in the hand and wrist. Copyright © 2017 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
Reproducibility of right-to-left shunt quantification using transthoracic contrast echocardiography in hereditary haemorrhagic telangiectasia.

PubMed

Vorselaars, V M M; Velthuis, S; Huitema, M P; Hosman, A E; Westermann, C J J; Snijder, R J; Mager, J J; Post, M C

2018-04-01

Transthoracic contrast echocardiography (TTCE) is recommended for screening of pulmonary arteriovenous malformations (PAVMs) in hereditary haemorrhagic telangiectasia. Shunt quantification is used to find treatable PAVMs. So far, there has been no study investigating the reproducibility of this diagnostic test. Therefore, this study aimed to describe inter-observer and inter-injection variability of TTCE. We conducted a prospective single centre study. We included all consecutive persons screened for presence of PAVMs in association with hereditary haemorrhagic telangiectasia in 2015. The videos of two contrast injections per patient were divided and reviewed by two cardiologists blinded for patient data. Pulmonary right-to-left shunts were graded using a three-grade scale. Inter-observer and inter-injection agreement was calculated with κ statistics for the presence and grade of pulmonary right-to-left shunts. We included 107 persons (accounting for 214 injections) (49.5% male, mean age 45.0 ± 16.6 years). A pulmonary right-to-left shunt was present in 136 (63.6%) and 131 (61.2%) injections for observer 1 and 2, respectively. Inter-injection agreement for the presence of pulmonary right-to-left shunts was 0.96 (95% confidence interval (CI) 0.9-1.0) and 0.98 (95% CI 0.94-1.00) for observer 1 and 2, respectively. Inter-injection agreement for pulmonary right-to-left shunt grade was 0.96 (95% CI 0.93-0.99) and 0.95 (95% CI 0.92-0.98) respectively. There was disagreement in right-to-left shunt grade between the contrast injections in 11 patients (10.3%). Inter-observer variability for presence and grade of the pulmonary right-to-left shunt was 0.95 (95% CI 0.91-0.99) and 0.97 (95% CI 0.95-0.99) respectively. TTCE has an excellent inter-injection and inter-observer agreement for both the presence and grade of pulmonary right-to-left shunts.
Validity and reliability of the iPhone to measure rib hump in scoliosis.

PubMed

Balg, Frederic; Juteau, Mathieu; Theoret, Chantal; Svotelis, Amy; Grenier, Guillaume

2014-12-01

This was a prospective blinded validity and reliability analysis. The aim of this study was validation and reliability evaluation of the Scoligauge iPhone app. The scoliometer is used to clinically measure the rib hump in scoliosis as a means to evaluate the axial trunk rotation. The increasing availability of smartphones with built-in accelerometers led to the development of a vast number of applications to measure angles. Of these, the Scoligauge mimics a scoliometer. The aim of this study was to compare the validity of the Scoligauge iPhone application without an associated adapter with the traditional scoliometer and to test the reliability of the application in a clinical setting. Two observers measured the rib hump deformity on 34 consecutive patients with idiopathic scoliosis with an average Cobb angle of 24.2 ± 13.5 degrees (range, 4 to 65 degrees). Measurements were made with an iPhone without the adapter and with a scoliometer. The validity as well as the interobserver and intraobserver reliability were calculated using the intraclass coefficient (ICC) and the Bland-Altman test. The mean difference between the scoliometer and the Scoligauge application was 0.4 degrees [95% confidence interval (CI) of ± 3.1 degrees] with an ICC of 0.947 (P < 0.001). The intraobserver and interobserver ICC were 0.961 (P < 0.001) and 0.901 (P < 0.001), respectively. The mean intraobserver difference was 0.0 degrees (95% CI of ± 2.7 degrees) and the mean interobserver difference was 0.1 degrees (95% CI of ± 4.4 degrees). The intraobserver and interobserver reliability of the Scoligauge iPhone app, as well as its validity compared with the scoliometer, are excellent. The mean differences between measurements are small and clinically not significant. Thus, the Scoligauge application is valid for clinical evaluation even without a special adapter. Level I (Diagnostic Study).
Evaluating the accuracy of the XVI dual registration tool compared with manual soft tissue matching to localise tumour volumes for post-prostatectomy patients receiving radiotherapy.

PubMed

Campbell, Amelia; Owen, Rebecca; Brown, Elizabeth; Pryor, David; Bernard, Anne; Lehman, Margot

2015-08-01

Cone beam computerised tomography (CBCT) enables soft tissue visualisation to optimise matching in the post-prostatectomy setting, but is associated with inter-observer variability. This study assessed the accuracy and consistency of automated soft tissue localisation using XVI's dual registration tool (DRT). Sixty CBCT images from ten post-prostatectomy patients were matched using: (i) the DRT and (ii) manual soft tissue registration by six radiation therapists (RTs). Shifts in the three Cartesian planes were recorded. The accuracy of the match was determined by comparing shifts to matches performed by two genitourinary radiation oncologists (ROs). A Bland-Altman method was used to assess the 95% levels of agreement (LoA). A clinical threshold of 3 mm was used to define equivalence between methods of matching. The 95% LoA between DRT-ROs in the superior/inferior, left/right and anterior/posterior directions were -2.21 to +3.18 mm, -0.77 to +0.84 mm, and -1.52 to +4.12 mm, respectively. The 95% LoA between RTs-ROs in the superior/inferior, left/right and anterior/posterior directions were -1.89 to +1.86 mm, -0.71 to +0.62 mm and -2.8 to +3.43 mm, respectively. Five DRT CBCT matches (8.33%) were outside the 3-mm threshold, all in the setting of bladder underfilling or rectal gas. The mean time for manual matching was 82 versus 65 s for DRT. XVI's DRT is comparable with RTs manually matching soft tissue on CBCT. The DRT can minimise RT inter-observer variability; however, involuntary bladder and rectal filling can influence the tool's accuracy, highlighting the need for RT evaluation of the DRT match. © 2015 The Royal Australian and New Zealand College of Radiologists.
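The Bland-Altman analysis mentioned in the abstract above can be sketched as follows: the 95% limits of agreement are the mean of the paired differences plus and minus 1.96 standard deviations. The function names, the shift values, and the way the 3 mm threshold is applied to the limits are assumptions for illustration only, not the study's data or its exact equivalence rule.

```python
import statistics


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences (a - b)
    limits = bias -/+ 1.96 * sample standard deviation of the differences
    """
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)
    spread = 1.96 * statistics.stdev(differences)
    return bias, bias - spread, bias + spread


def within_threshold(lower, upper, threshold):
    """True if both limits of agreement lie inside +/- threshold."""
    return -threshold <= lower and upper <= threshold


# Hypothetical couch shifts in mm from two matching methods.
automated = [1.0, 2.0, 0.5, -1.0, 0.0]
reference = [0.5, 1.0, 1.0, -0.5, 0.5]

bias, lower, upper = bland_altman_limits(automated, reference)
print(round(bias, 2), round(lower, 2), round(upper, 2))
print(within_threshold(lower, upper, threshold=3.0))
```

Reporting the limits alongside a pre-specified clinical threshold, as the abstract does, makes the agreement judgement explicit instead of relying on a correlation coefficient.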
Development and validation of the SIMPLE endoscopic classification of diminutive and small colorectal polyps.

PubMed

Iacucci, Marietta; Trovato, Cristina; Daperno, Marco; Akinola, Oluseyi; Greenwald, David; Gross, Seth A; Hoffman, Arthur; Lee, Jeffrey; Lethebe, Brendan C; Lowerison, Mark; Nayor, Jennifer; Neumann, Helmut; Rath, Timo; Sanduleanu, Silvia; Sharma, Prateek; Kiesslich, Ralf; Ghosh, Subrata; Saltzman, John R

2018-03-23

Prediction of histology of small polyps facilitates colonoscopic treatment. The aims of this study were: 1) to develop a simplified polyp classification, 2) to evaluate its performance in predicting polyp histology, and 3) to evaluate the reproducibility of the classification by trainees using multiplatform endoscopic systems. In phase 1, a new simplified endoscopic classification for polyps - Simplified Identification Method for Polyp Labeling during Endoscopy (SIMPLE) - was created, using the new I-SCAN OE system (Pentax, Tokyo, Japan), by eight international experts. In phase 2, the accuracy, level of confidence, and interobserver agreement to predict polyp histology before and after training, and univariable/multivariable analysis of the endoscopic features, were performed. In phase 3, the reproducibility of SIMPLE by trainees using different endoscopy platforms was evaluated. Using the SIMPLE classification, the accuracy of experts in predicting polyps was 83% (95% confidence interval [CI] 77%-88%) before and 94% (95% CI 89%-97%) after training (P = 0.002). The sensitivity, specificity, positive predictive value, and negative predictive value after training were 97%, 88%, 95%, and 91%. The interobserver agreement of polyp diagnosis improved from 0.46 (95% CI 0.30-0.64) before to 0.66 (95% CI 0.48-0.82) after training. The trainees demonstrated that the SIMPLE classification is applicable across endoscopy platforms, with similar post-training accuracies for narrow-band imaging (NBI) classification (0.69; 95% CI 0.64-0.73) and SIMPLE (0.71; 95% CI 0.67-0.75). Using the I-SCAN OE system, the new SIMPLE classification demonstrated a high degree of accuracy for adenoma diagnosis, meeting the ASGE PIVI recommendations. We demonstrated that SIMPLE may be used with either I-SCAN OE or NBI. © Georg Thieme Verlag KG Stuttgart · New York.
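The performance measures listed in the abstract above (accuracy, sensitivity, specificity, positive and negative predictive value) all derive from one 2x2 table of test result against reference standard. A minimal sketch, assuming hypothetical counts that are not taken from the study and assuming every row and column of the table is non-empty:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2-table metrics for a test against a reference standard.

    tp/fn: reference-positive cases called positive/negative by the test
    fp/tn: reference-negative cases called positive/negative by the test
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly detected
        "specificity": tn / (tn + fp),  # negatives correctly excluded
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test at a given threshold, whereas the predictive values and overall accuracy also shift with how common the condition is in the sample.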
Dental measurements and Bolton index reliability and accuracy obtained from 2D digital, 3D segmented CBCT, and 3D intraoral laser scanner

PubMed Central

San José, Verónica; Bellot-Arcís, Carlos; Tarazona, Beatriz; Zamora, Natalia; O Lagravère, Manuel

2017-01-01

Background To compare the reliability and accuracy of direct and indirect dental measurements derived from two types of 3D virtual models: generated by intraoral laser scanning (ILS) and segmented cone beam computed tomography (CBCT), comparing these with a 2D digital model. Material and Methods One hundred patients were selected. All patients' records included initial plaster models, an intraoral scan and a CBCT. Patients' dental arches were scanned with the iTero® intraoral scanner while the CBCTs were segmented to create three-dimensional models. To obtain 2D digital models, plaster models were scanned using a conventional 2D scanner. When digital models had been obtained using these three methods, direct dental measurements were measured and indirect measurements were calculated. Differences between methods were assessed by means of paired t-tests and regression models. Intra- and inter-observer error were analyzed using Dahlberg's d and coefficients of variation. Results Intraobserver and interobserver error for the ILS model was less than 0.44 mm while for segmented CBCT models, the error was less than 0.97 mm. ILS models provided statistically and clinically acceptable accuracy for all dental measurements, while CBCT models showed a tendency to underestimate measurements in the lower arch, although within the limits of clinical acceptability. Conclusions ILS and CBCT segmented models are both reliable and accurate for dental measurements. Integrating ILS with CBCT scans would provide dental and skeletal information together. Key words: CBCT, intraoral laser scanner, 2D digital models, 3D models, dental measurements, reliability. PMID:29410764
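Dahlberg's d, named in the abstract above as the measure of observer error, is conventionally computed from repeated measurements as the square root of the sum of squared paired differences divided by twice the number of pairs. The sketch below assumes that standard formula; the function name and the tooth-width values are hypothetical and are not the study's data.

```python
import math


def dahlberg_d(first_measurements, second_measurements):
    """Dahlberg's error: sqrt(sum(d_i ** 2) / (2 * n)) over n repeated pairs."""
    if len(first_measurements) != len(second_measurements):
        raise ValueError("both series must contain the same number of measurements")
    squared = [(a - b) ** 2
               for a, b in zip(first_measurements, second_measurements)]
    return math.sqrt(sum(squared) / (2 * len(squared)))


# Hypothetical repeated tooth-width measurements in mm.
session_1 = [8.6, 7.1, 10.2, 9.8]
session_2 = [8.4, 7.3, 10.0, 10.0]

print(round(dahlberg_d(session_1, session_2), 3))
```

The result is in the same units as the measurements, which is why it can be compared directly with a clinical tolerance such as the sub-millimetre errors reported in the abstract.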
Patient-Oriented Cancer Information on the Internet: A Comparison of Wikipedia and a Professionally Maintained Database

PubMed Central

Rajagopalan, Malolan S.; Khanna, Vineet K.; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N.; Dicker, Adam P.; Lawrence, Yaacov R.

2011-01-01

Purpose: A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. Methods: For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Results: Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Conclusion: Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention. PMID:22211130
Intra- and Interobserver Variability of Cochlear Length Measurements in Clinical CT.

PubMed

Iyaniwura, John E; Elfarnawany, Mai; Riyahi-Alam, Sadegh; Sharma, Manas; Kassam, Zahra; Bureau, Yves; Parnes, Lorne S; Ladak, Hanif M; Agrawal, Sumit K

2017-07-01

The cochlear A-value measurement exhibits significant inter- and intraobserver variability, and its accuracy is dependent on the visualization method in clinical computed tomography (CT) images of the cochlea. An accurate estimate of the cochlear duct length (CDL) can be used to determine electrode choice, and frequency map the cochlea based on the Greenwood equation. Studies have described estimating the CDL using a single A-value measurement, however the observer variability has not been assessed. Clinical and micro-CT images of 20 cadaveric cochleae were acquired. Four specialists measured A-values on clinical CT images using both standard views and multiplanar reconstructed (MPR) views. Measurements were repeated to assess for intraobserver variability. Observer variabilities were evaluated using intra-class correlation and absolute differences. Accuracy was evaluated by comparison to the gold standard micro-CT images of the same specimens. Interobserver variability was good (average absolute difference: 0.77 ± 0.42 mm) using standard views and fair (average absolute difference: 0.90 ± 0.31 mm) using MPR views. Intraobserver variability had an average absolute difference of 0.31 ± 0.09 mm for the standard views and 0.38 ± 0.17 mm for the MPR views. MPR view measurements were more accurate than standard views, with average relative errors of 9.5 and 14.5%, respectively. There was significant observer variability in A-value measurements using both the standard and MPR views. Creating the MPR views increased variability between experts, however MPR views yielded more accurate results. Automated A-value measurement algorithms may help to reduce variability and increase accuracy in the future.
Methodological quality of diagnostic accuracy studies on non-invasive coronary CT angiography: influence of QUADAS (Quality Assessment of Diagnostic Accuracy Studies included in systematic reviews) items on sensitivity and specificity.

PubMed

Schueler, Sabine; Walther, Stefan; Schuetz, Georg M; Schlattmann, Peter; Dewey, Marc

2013-06-01

To evaluate the methodological quality of diagnostic accuracy studies on coronary computed tomography (CT) angiography using the QUADAS (Quality Assessment of Diagnostic Accuracy Studies included in systematic reviews) tool. Each QUADAS item was individually defined to adapt it to the special requirements of studies on coronary CT angiography. Two independent investigators analysed 118 studies using 12 QUADAS items. Meta-regression and pooled analyses were performed to identify possible effects of methodological quality items on estimates of diagnostic accuracy. The overall methodological quality of coronary CT studies was merely moderate. They fulfilled a median of 7.5 out of 12 items. Only 9 of the 118 studies fulfilled more than 75 % of possible QUADAS items. One QUADAS item ("Uninterpretable Results") showed a significant influence (P = 0.02) on estimates of diagnostic accuracy with "no fulfilment" increasing specificity from 86 to 90 %. Furthermore, pooled analysis revealed that each QUADAS item that is not fulfilled has the potential to change estimates of diagnostic accuracy. The methodological quality of studies investigating the diagnostic accuracy of non-invasive coronary CT is only moderate and was found to affect the sensitivity and specificity. An improvement is highly desirable because good methodology is crucial for adequately assessing imaging technologies. • Good methodological quality is a basic requirement in diagnostic accuracy studies. • Most coronary CT angiography studies have only been of moderate design quality. • Weak methodological quality will affect the sensitivity and specificity. • No improvement in methodological quality was observed over time. • Authors should consider the QUADAS checklist when undertaking accuracy studies.
SEMAC-VAT MR Imaging Unravels Peri-instrumentation Lesions in Patients With Attendant Symptoms After Spinal Surgery.

PubMed

Qi, Shun; Wu, Zhi-Gang; Mu, Yun-Feng; Gao, Lang-Lang; Yang, Jian; Zuo, Pan-Li; Nittka, Mathias; Liu, Ying; Wang, Hai-Qiang; Yin, Hong

2016-04-01

The study aimed to evaluate the diagnostic value of a 2D Turbo Spin Echo (TSE) magnetic resonance (MR) imaging sequence implementing slice-encoding metal artifact correction (SEMAC) and view-angle tilting (VAT) in patients with spinal instrumentation. Sixty-seven consecutive patients with an average age of 59.7 ± 17.8 years (range: 32-75 years) were enrolled in this study. Sagittal and axial T1-weighted and T2-weighted MR images were acquired with a standard TSE sequence and a high-bandwidth TSE sequence implementing the SEMAC and VAT techniques. Three continuous sections around the instrumentation in axial and sagittal images were selected for quantitative evaluation. The measurements included cumulative areas of signal void on axial images and the length of spinal canal obscuration on sagittal images. Three radiologists independently evaluated all images in a blinded fashion. The inter-observer reliability was evaluated with the inter-class coefficient. We defined uncomfortable symptoms caused by spinal instrumentation as a spinal instrumentation adverse reaction. Visualization of all periprosthetic anatomic structures was significantly better for SEMAC-VAT compared with standard imaging. For axial images, the area of signal void at the level of the instrumentation was significantly smaller with SEMAC-VAT TSE sequences than with standard TSE sequences for T2-weighted images (9.9 ± 2.6 cm vs 29.8 ± 14.7 cm, P < 0.001). For sagittal imaging, the length of spinal canal obscuration at the level of the instrumentation was reduced from 5.2 ± 2.0 cm to 1.2 ± 0.6 cm on T2-weighted images (P < 0.001), and from 4.8 ± 2.1 cm to 1.1 ± 0.5 cm on T1-weighted images with SEMAC-VAT sequences (P < 0.001). Interobserver agreement for visualization of anatomic structures and image quality was good for both SEMAC-VAT (k = 0.77 and 0.68, respectively) and standard (k = 0.74 and 0.80, respectively) imaging.
The number of abnormal findings noted on SEMAC images (59 findings) was significantly higher than the number detected on standard images (40 findings). The incidence rate of spinal instrumentation adverse reaction was 38.81%. MR images with SEMAC-VAT can significantly reduce metal artifacts for spinal instrumentation and improve delineation of the instrumentation and periprosthetic region. Furthermore, the SEMAC-VAT technique can improve diagnostic accuracy in patients with post-instrumentation spinal diseases.
Clinical performance of a free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced pediatric abdominal MR angiography.

PubMed

Zhang, Tao; Yousaf, Ufra; Hsiao, Albert; Cheng, Joseph Y; Alley, Marcus T; Lustig, Michael; Pauly, John M; Vasanawala, Shreyas S

2015-10-01

Pediatric contrast-enhanced MR angiography is often limited by respiration, other patient motion and compromised spatiotemporal resolution. To determine the reliability of a free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced MR angiography method for depicting abdominal arterial anatomy in young children. With IRB approval and informed consent, we retrospectively identified 27 consecutive children (16 males and 11 females; mean age: 3.8 years, range: 14 days to 8.4 years) referred for contrast-enhanced MR angiography at our institution, who had undergone free-breathing spatiotemporally accelerated time-resolved contrast-enhanced MR angiography studies. A radio-frequency-spoiled gradient echo sequence with Cartesian variable density k-space sampling and radial view ordering, intrinsic motion navigation and intermittent fat suppression was developed. Images were reconstructed with soft-gated parallel imaging locally low-rank method to achieve both motion correction and high spatiotemporal resolution. Quality of delineation of 13 abdominal arteries in the reconstructed images was assessed independently by two radiologists on a five-point scale. Ninety-five percent confidence intervals of the proportion of diagnostically adequate cases were calculated. Interobserver agreements were also analyzed. Eleven out of 13 arteries achieved acceptable image quality (mean score range: 3.9-5.0) for both readers. Fair to substantial interobserver agreement was reached on nine arteries. Free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced MR angiography frequently yields diagnostic image quality for most abdominal arteries in young children.
Quantitative estimation of the high-intensity zone in the lumbar spine: comparison between the symptomatic and asymptomatic population.

PubMed

Liu, Chao; Cai, Hong-Xin; Zhang, Jian-Feng; Ma, Jian-Jun; Lu, Yin-Jiang; Fan, Shun-Wu

2014-03-01

The high-intensity zone (HIZ) on magnetic resonance imaging (MRI) has been studied for more than 20 years, but its diagnostic value in low back pain (LBP) is limited by the high incidence in asymptomatic subjects. Little effort has been made to improve the objective assessment of HIZ. To develop quantitative measurements for HIZ and estimate intra- and interobserver reliability and to clarify different signal intensity of HIZ in patients with or without LBP. A measurement reliability and prospective comparative study. A consecutive series of patients with LBP between June 2010 and May 2011 (group A) and a successive series of asymptomatic controls during the same period (group B). Incidence of HIZ; quantitative measures, including area of disc, area and signal intensity of HIZ, and magnetic resonance imaging index; and intraclass correlation coefficients (ICCs) for intra- and interobserver reliability. On the basis of HIZ criteria, a series of quantitative dimension and signal intensity measures was developed for assessing HIZ. Two experienced spine surgeons traced the region of interest twice within 4 weeks for assessment of the intra- and interobserver reliability. The quantitative variables were compared between groups A and B. There were 72 patients with LBP and 79 asymptomatic controls enrolled in this study. The prevalence of HIZ in group A and group B was 45.8% and 20.2%, respectively. The intraobserver agreement was excellent for the quantitative measures (ICC=0.838-0.977) as well as interobserver reliability (ICC=0.809-0.935). The mean signal of HIZ in group A was significantly brighter than in group B (57.55±14.04% vs. 45.61±7.22%, p=.000). There was no statistically significant difference in the area of the disc or of the HIZ between the two groups. The magnetic resonance imaging index was found to be higher in group A when compared with group B (3.94±1.71 vs. 3.06±1.50), but with a p value of .050.
A series of quantitative measurements for HIZ was established and demonstrated excellent intra- and interobserver reliability. The signal intensity of HIZ differed between patients with and without LBP, and a significantly brighter signal was observed in symptomatic subjects. Copyright © 2014 Elsevier Inc. All rights reserved.
Reproducibility of cine displacement encoding with stimulated echoes (DENSE) cardiovascular magnetic resonance for measuring left ventricular strains, torsion, and synchrony in mice.

PubMed

Haggerty, Christopher M; Kramer, Sage P; Binkley, Cassi M; Powell, David K; Mattingly, Andrea C; Charnigo, Richard; Epstein, Frederick H; Fornwalt, Brandon K

2013-08-27

Advanced measures of cardiac function are increasingly important to clinical assessment due to their superior diagnostic and predictive capabilities. Cine DENSE cardiovascular magnetic resonance (CMR) is ideal for quantifying advanced measures of cardiac function based on its high spatial resolution and streamlined post-processing. While many studies have utilized cine DENSE in both humans and small-animal models, the inter-test and inter-observer reproducibility for quantification of advanced cardiac function in mice has not been evaluated. This represents a critical knowledge gap for both understanding the capabilities of this technique and for the design of future experiments. We hypothesized that cine DENSE CMR would show excellent inter-test and inter-observer reproducibility for advanced measures of left ventricular (LV) function in mice. Five normal mice (C57BL/6) and four mice with depressed cardiac function (diet-induced obesity) were imaged twice, two days apart, on a 7T ClinScan MR system. Images were acquired with 15-20 frames per cardiac cycle in three short-axis (basal, mid, apical) and two long-axis orientations (4-chamber and 2-chamber). LV strain, twist, torsion, and measures of synchrony were quantified. Images from both days were analyzed by one observer to quantify inter-test reproducibility, while inter-observer reproducibility was assessed by a second observer's analysis of day-1 images. The coefficient of variation (CoV) was used to quantify reproducibility. LV strains and torsion were highly reproducible on both inter-observer and inter-test bases with CoVs ≤ 15%, and inter-observer reproducibility was generally better than inter-test reproducibility. However, end-systolic twist angles showed much higher variance, likely due to the sensitivity of slice location within the sharp longitudinal gradient in twist angle. 
Measures of synchrony including the circumferential (CURE) and radial (RURE) uniformity of strain indices, showed excellent reproducibility with CoVs of 1% and 3%, respectively. Finally, peak measures (e.g., strains) were generally more reproducible than the corresponding rates of change (e.g., strain rate). Cine DENSE CMR is a highly reproducible technique for quantification of advanced measures of left ventricular cardiac function in mice including strains, torsion and measures of synchrony. However, myocardial twist angles are not reproducible and future studies should instead report torsion.
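The abstract above uses the coefficient of variation (CoV) to quantify reproducibility but does not spell out the exact estimator. The sketch below shows one common choice for paired repeat measurements, the mean of per-subject CoVs (standard deviation of the two readings divided by the absolute value of their mean); this choice, the function name, and any input values are assumptions for illustration, not the authors' method or data.

```python
from statistics import mean, stdev


def paired_cov_percent(first, second):
    """Mean per-subject coefficient of variation (%) for paired repeat measurements.

    Uses the absolute value of each pair mean so that negative quantities
    (such as shortening strains) still give a positive CoV.
    """
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    covs = []
    for a, b in zip(first, second):
        pair_mean = mean([a, b])
        if pair_mean == 0:
            raise ValueError("CoV is undefined when a pair mean is zero")
        covs.append(stdev([a, b]) / abs(pair_mean) * 100.0)
    return mean(covs)
```

Because the CoV divides by the mean, quantities whose mean is close to zero can show very large CoVs even when the absolute disagreement is small.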
Validity and reliability of the Paprosky acetabular defect classification.

PubMed

Yu, Raymond; Hofstaetter, Jochen G; Sullivan, Thomas; Costi, Kerry; Howie, Donald W; Solomon, Lucian B

2013-07-01

The Paprosky acetabular defect classification is widely used but has not been appropriately validated. Reliability of the Paprosky system has not been evaluated in combination with standardized techniques of measurement and scoring. This study evaluated the reliability, teachability, and validity of the Paprosky acetabular defect classification. Preoperative radiographs from a random sample of 83 patients undergoing 85 acetabular revisions were classified by four observers, and their classifications were compared with quantitative intraoperative measurements. Teachability of the classification scheme was tested by dividing the four observers into two groups. The observers in Group 1 underwent three teaching sessions; those in Group 2 underwent one session and the influence of teaching on the accuracy of their classifications was ascertained. Radiographic evaluation showed statistically significant relationships with intraoperative measurements of anterior, medial, and superior acetabular defect sizes. Interobserver reliability improved substantially after teaching and did not improve without it. The weighted kappa coefficient went from 0.56 at Occasion 1 to 0.79 after three teaching sessions in Group 1 observers, and from 0.49 to 0.65 after one teaching session in Group 2 observers. The Paprosky system is valid and shows good reliability when combined with standardized definitions of radiographic landmarks and a structured analysis. Level II, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.
Acoustic radiation force impulse (ARFI) ultrasound imaging of pancreatic cystic lesions.

PubMed

D'Onofrio, M; Gallotti, A; Salvia, R; Capelli, P; Mucelli, R Pozzi

2011-11-01

To evaluate the ARFI ultrasound imaging with Virtual Touch tissue quantification in studying pancreatic cystic lesions, compared with phantom fluid models. Different phantom fluids at different viscosity or density (water, iodinated contrast agent, and oil) were evaluated by two independent operators. From September to December 2008, 23 pancreatic cystic lesions were prospectively studied. All lesions were pathologically confirmed. Non-numerical values on water and numerical values on other phantoms were obtained. Inter-observer evaluation revealed a perfect correlation (rs=1.00; p<0.0001) between all measurements achieved by both operators per each balloon and fluid. Among the pancreatic cystic lesions, 14 mucinous cystadenomas, 4 pseudocysts, 3 intraductal papillary-mucinous neoplasms and 2 serous cystadenomas were studied. The values obtained ranged from XXXX/0 to 4.85 m/s in mucinous cystadenomas, from XXXX/0 to 3.11 m/s in pseudocysts, and from XXXX/0 to 4.57 m/s in intraductal papillary-mucinous neoplasms. In serous cystadenomas all values measured were XXXX/0 m/s. Diagnostic accuracy in benign and non-benign differentiation of pancreatic cystic lesions was 78%. Virtual Touch tissue quantification can be applied in the analysis of fluids and is potentially able to differentiate more complex (mucinous) from simple (serous) content in studying pancreatic cystic lesions. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Image processing and machine learning techniques to automate diagnosis of Lugol's iodine cervigrams for a low-cost point-of-care digital colposcope

NASA Astrophysics Data System (ADS)

Asiedu, Mercy Nyamewaa; Simhal, Anish; Lam, Christopher T.; Mueller, Jenna; Chaudhary, Usamah; Schmitt, John W.; Sapiro, Guillermo; Ramanujam, Nimmi

2018-02-01

The World Health Organization recommends visual inspection with acetic acid (VIA) and/or Lugol's Iodine (VILI) for cervical cancer screening in low-resource settings. Human interpretation of diagnostic indicators for visual inspection is qualitative, subjective, and has high inter-observer discordance, which could lead both to adverse outcomes for the patient and unnecessary follow-ups. In this work, we present a simple method for automatic feature extraction and classification for Lugol's Iodine cervigrams acquired with a low-cost, miniature, digital colposcope. Algorithms to preprocess expert physician-labelled cervigrams and to extract simple but powerful color-based features are introduced. The features are used to train a support vector machine model to classify cervigrams based on expert physician labels. The selected framework achieved a sensitivity, specificity, and accuracy of 89.2%, 66.7% and 80.6%, respectively, with majority diagnosis of the expert physicians in discriminating cervical intraepithelial neoplasia (CIN +) relative to normal tissues. The proposed classifier also achieved an area under the curve of 84 when trained with majority diagnosis of the expert physicians. The results suggest that utilizing simple color-based features may enable unbiased automation of VILI cervigrams, opening the door to a full system of low-cost data acquisition complemented with automatic interpretation.
Computer-automated ABCD versus dermatologists with different degrees of experience in dermoscopy.

PubMed

Piccolo, Domenico; Crisman, Giuliana; Schoinas, Spyridon; Altamura, Davide; Peris, Ketty

2014-01-01

Dermoscopy is a very useful and non-invasive technique for in vivo observation and preoperative diagnosis of pigmented skin lesions (PSLs) inasmuch as it enables analysis of surface and subsurface structures that are not discernible to the naked eye. The authors used the ABCD rule of dermoscopy to test the accuracy of melanoma diagnosis with respect to a panel of 165 PSLs and the intra- and inter-observer diagnostic agreement obtained between three dermatologists with different degrees of experience, one General Practitioner and a DDA for computer-assisted diagnosis (Nevuscreen(®), Arkè s.a.s., Avezzano, Italy). A total of 165 PSLs from 165 patients were selected. Histopathological examination revealed 132 benign melanocytic skin lesions and 33 melanomas. The kappa statistic, sensitivity, specificity and positive and negative predictive values were calculated to measure agreement between all the human observers and in comparison with the automated DDA. Our results revealed poor reproducibility of the semi-quantitative algorithm devised by Stolz et al. independently of observers' experience in dermoscopy. Nevuscreen(®) (Arkè s.a.s., Avezzano, Italy) proved to be 'user friendly' to all observers, thus enabling a more critical evaluation of each lesion and representing a helpful tool for clinicians without significant experience in dermoscopy in improving and achieving more accurate diagnosis of PSLs.
Diagnostic reproducibility of hydatidiform moles: ancillary techniques (p57 immunohistochemistry and molecular genotyping) improve morphologic diagnosis.

PubMed

Vang, Russell; Gupta, Mamta; Wu, Lee-Shu-Fune; Yemelyanova, Anna V; Kurman, Robert J; Murphy, Kathleen M; Descipio, Cheryl; Ronnett, Brigitte M

2012-03-01

Distinction of hydatidiform moles (HMs) from nonmolar specimens (NMs) and subclassification of HMs as complete hydatidiform moles (CHMs) and partial hydatidiform moles (PHMs) are important for clinical practice and investigational studies; yet, diagnosis based solely on morphology is affected by interobserver variability. Molecular genotyping can distinguish these entities by discerning androgenetic diploidy, diandric triploidy, and biparental diploidy to diagnose CHMs, PHMs, and NMs, respectively. Eighty genotyped cases (27 CHMs, 27 PHMs, and 26 NMs) were selected from a series of 200 potentially molar specimens previously diagnosed using p57 immunostaining and genotyping. Cases were classified by 3 gynecologic pathologists on the basis of H&E slides (masked to p57 immunostaining and genotyping results) into 1 of 3 categories (CHM, PHM, or NM) during 2 diagnostic rounds; a third round incorporating p57 immunostaining results was also conducted. Consensus diagnoses (those rendered by 2 of 3 pathologists) were determined. Genotyping results were used as the gold standard for assessing diagnostic performance. Sensitivity of a diagnosis of CHM ranged from 59% to 100% for individual pathologists and from 70% to 81% by consensus; specificity ranged from 91% to 96% for individuals and from 94% to 98% by consensus. Sensitivity of a diagnosis of PHM ranged from 56% to 93% for individual pathologists and from 70% to 78% by consensus; specificity ranged from 58% to 92% for individuals and from 74% to 85% by consensus. The percentage of correct classification of all cases by morphology ranged from 55% to 75% for individual pathologists and from 70% to 75% by consensus. The κ values for interobserver agreement ranged from 0.59 to 0.73 (moderate to good) for a diagnosis of CHM, from 0.15 to 0.43 (poor to moderate) for PHM, and from 0.13 to 0.42 (poor to moderate) for NM. The κ values for intraobserver agreement ranged from 0.44 to 0.67 (moderate to good). 
Addition of the p57 immunostain improved sensitivity of a diagnosis of CHM to a range of 93% to 96% for individual pathologists and 96% by consensus; specificity was improved to a range of 96% to 98% for individual pathologists and 96% by consensus; there was no substantial impact on diagnosis of PHMs and NMs. Interobserver agreement for interpretation of the p57 immunostain was 0.96 (almost perfect). Even with morphologic assessment by gynecologic pathologists and p57 immunohistochemistry, 20% to 30% of cases will be misclassified, and, in particular, distinction of PHMs and NMs will remain problematic.
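The κ values quoted in the record above measure agreement beyond what chance alone would produce. As a minimal sketch, assuming the unweighted two-rater (Cohen) form, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e the agreement expected from each rater's category frequencies. The function name and the tiny label lists in the usage note are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)
```

For example, `cohen_kappa(["CHM", "CHM", "PHM", "NM"], ["CHM", "PHM", "PHM", "NM"])` has 75% raw agreement but a κ of 7/11, roughly 0.64, because part of that agreement is expected by chance.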

Prospective evaluation of fluciclovine (18F) PET-CT and MRI in detection of recurrent prostate cancer in non-prostatectomy patients.

PubMed

Akin-Akintayo, Oladunni; Tade, Funmilayo; Mittal, Pardeep; Moreno, Courtney; Nieh, Peter T; Rossi, Peter; Patil, Dattatraya; Halkar, Raghuveer; Fei, Baowei; Master, Viraj; Jani, Ashesh B; Kitajima, Hiroumi; Osunkoya, Adeboye O; Ormenisan-Gherasim, Claudia; Goodman, Mark M; Schuster, David M

2018-05-01

To investigate the disease detection rate, diagnostic performance and interobserver agreement of fluciclovine (18F) PET-CT and multiparametric magnetic resonance imaging (mpMR) in recurrent prostate cancer. Twenty-four patients with biochemical failure after non-prostatectomy definitive therapy, 16/24 of whom had undergone brachytherapy, underwent fluciclovine PET-CT and mpMR with interpretation by expert readers blinded to patient history, PSA and other imaging results. Reference standard was established via a multidisciplinary truth panel utilizing histology and clinical follow-up (22.9 ± 10.5 months) and emphasizing biochemical control. The truth panel was blinded to investigative imaging results. Diagnostic performance and interobserver agreement (kappa) for the prostate and extraprostatic regions were calculated for each of 2 readers for PET-CT (P1 and P2) and 2 different readers for mpMR (M1 and M2). On a whole body basis, the detection rate for fluciclovine PET-CT was 94.7% (both readers), while it ranged from 31.6-36.8% for mpMR. Kappa for fluciclovine PET-CT was 0.90 in the prostate and 1.0 in the extraprostatic regions. For mpMR, kappa was 0.25 and 0.74, respectively. In the prostate, 22/24 patients met the reference standard with 13 malignant and 9 benign results. Sensitivity, specificity and positive predictive value (PPV) were 100.0%, 11.1% and 61.9%, respectively for both PET readers. For mpMR readers, values ranged from 15.4-38.5% for sensitivity, 55.6-77.8% for specificity and 50.0-55.6% for PPV. For extraprostatic disease determination, 18/24 patients met the reference standard. Sensitivity, specificity and PPV were 87.5%, 90.0% and 87.5%, respectively, for fluciclovine PET-CT, while for mpMR, sensitivity ranged from 50 to 75%, specificity 70-80% and PPV 57-75%. The disease detection rate for fluciclovine PET-CT in non-prostatectomy patients with biochemical failure was 94.7% versus 31.6-36.8% for mpMR.
For extraprostatic disease detection, fluciclovine PET-CT had overall better diagnostic performance than mpMR. For the treated prostate, fluciclovine PET-CT had high sensitivity though low specificity for disease detection, while mpMR had higher specificity, though low sensitivity. Interobserver agreement was also higher with fluciclovine PET-CT compared with mpMR. Copyright © 2018 Elsevier B.V. All rights reserved.
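The sensitivity, specificity and PPV figures in the record above follow from a 2×2 table of test results against the reference standard. The sketch below shows the arithmetic; the function name is hypothetical, and the counts in the usage note are a reconstruction inferred from the reported percentages (13 malignant and 9 benign prostates), since the abstract does not state the table itself.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity and positive predictive value from 2x2 counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly called positive
        "specificity": tn / (tn + fp),  # disease-free cases correctly called negative
        "ppv": tp / (tp + fp),          # positive calls that are truly diseased
    }
```

With the inferred PET-CT prostate counts, `diagnostic_metrics(tp=13, fp=8, tn=1, fn=0)` gives 100.0% sensitivity, 11.1% specificity and 61.9% PPV after rounding, matching the reported values: every malignant case was detected, but eight of the nine benign cases were also called positive.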
Intra- and inter-observer agreement in histological assessment of canine soft tissue sarcoma.

PubMed

Yap, F W; Rasotto, R; Priestnall, S L; Parsons, K J; Stewart, J

2017-12-01

The diagnosis of canine soft tissue sarcoma (STS) is based on histological assessment. Assessment of criteria such as degree of differentiation, necrosis score and mitotic score gives rise to a final tumour grade, which is important in the recommendation of treatment and prognosis of patients. Previously diagnosed cases of STS were independently assessed by three board-certified veterinary pathologists. Participating pathologists were blinded to the original results. For the intra-observer study, the cases were assessed by a single pathologist six months apart and slides were randomized between readings. For the inter-observer study, the whole case series was assessed by a single pathologist before being passed on to the next pathologist. Intraclass correlation coefficient (ICC) and Fleiss's kappa (κ) were calculated for the intra- (single observer) and inter-observer agreement. Strong agreement was observed for the intra-observer assessment in necrosis score, mitotic score, total score and tumour grading (ICC between 0.78 and 0.91). The intra-observer agreement for differentiation score was rated perfect (ICC 1.00). The agreement between pathologists for the diagnosis and grading of canine STS was moderate (κ = 0.60 and 0.43 respectively). Histological assessment of canine STS had high reproducibility by an individual pathologist. The agreement of diagnosis and grading of canine STS was moderate between pathologists. Future studies are required to investigate further assessment criteria to improve the specificity of STS diagnosis and the accuracy of the STS grading in dogs. © 2017 John Wiley & Sons Ltd.
Accuracy of contrast-enhanced spectral mammography for estimating residual tumor size after neoadjuvant chemotherapy in patients with breast cancer: a feasibility study.

PubMed

Barra, Filipe Ramos; de Souza, Fernanda Freire; Camelo, Rosimara Eva Ferreira Almeida; Ribeiro, Andrea Campos de Oliveira; Farage, Luciano

2017-01-01

To assess the feasibility of contrast-enhanced spectral mammography (CESM) of the breast for assessing the size of residual tumors after neoadjuvant chemotherapy (NAC). In breast cancer patients who underwent NAC between 2011 and 2013, we evaluated residual tumor measurements obtained with CESM and full-field digital mammography (FFDM). We determined the concordance between the methods, as well as their level of agreement with the pathology. Three radiologists analyzed eight CESM and FFDM measurements separately, considering the size of the residual tumor at its largest diameter and correlating it with that determined in the pathological analysis. Interobserver agreement was also evaluated. The sensitivity, specificity, positive predictive value, and negative predictive value were higher for CESM than for FFDM (83.33%, 100%, 100%, and 66% vs. 50%, 50%, 50%, and 25%, respectively). The CESM measurements showed a strong, consistent correlation with the pathological findings (correlation coefficient = 0.76-0.92; intraclass correlation coefficient = 0.692-0.886). The correlation between the FFDM measurements and the pathological findings was not statistically significant, with questionable consistency (intraclass correlation coefficient = 0.488-0.598). Agreement with the pathological findings was narrower for CESM measurements than for FFDM measurements. Interobserver agreement was higher for CESM than for FFDM (0.94 vs. 0.88). CESM is a feasible means of evaluating residual tumor size after NAC, showing a good correlation and good agreement with pathological findings. For CESM measurements, the interobserver agreement was excellent.
Intra- and interobserver agreement in the classification and treatment of distal third clavicle fractures.

PubMed

Bishop, Julie Y; Jones, Grant L; Lewis, Brian; Pedroza, Angela

2015-04-01

In treatment of distal third clavicle fractures, the Neer classification system, based on the location of the fracture in relation to the coracoclavicular ligaments, has traditionally been used to determine fracture pattern stability. To determine the intra- and interobserver reliability in the classification of distal third clavicle fractures via standard plain radiographs and the intra- and interobserver agreement in the preferred treatment of these fractures. Cohort study (Diagnosis); Level of evidence, 3. Thirty radiographs of distal clavicle fractures were randomly selected from patients treated for distal clavicle fractures between 2006 and 2011. The radiographs were distributed to 22 shoulder/sports medicine fellowship-trained orthopaedic surgeons. Fourteen surgeons responded and took part in the study. The evaluators were asked to measure the size of the distal fragment, classify the fracture pattern as stable or unstable, assign the Neer classification, and recommend operative versus nonoperative treatment. The radiographs were reordered and redistributed 3 months later. Inter- and intrarater agreement was determined for the distal fragment size, stability of the fracture, Neer classification, and decision to operate. Single variable logistic regression was performed to determine what factors could most accurately predict the decision for surgery. Interrater agreement was fair for distal fragment size, moderate for stability, fair for Neer classification, slight for type IIB and III fractures, and moderate for treatment approach. Intrarater agreement was moderate for distal fragment size categories (κ = 0.50, P < .001) and Neer classification (κ = 0.42, P < .001) and substantial for stable fracture (κ = 0.65, P < .001) and decision to operate (κ = 0.65, P < .001). Fracture stability was the best predictor of treatment, with 89% accuracy (P < .001). Fracture stability determination and the decision to operate had the highest interobserver agreement. 
Fracture stability was the key determinant of treatment, rather than the Neer classification system or the size of the distal fragment. © 2015 The Author(s).
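The κ values reported above are chance-corrected agreement coefficients. As a minimal sketch of how an intrarater κ of this kind could be computed, assuming Cohen's unweighted kappa for one reader rating the same radiographs in two sessions (the abstract does not state which kappa variant was used), the following uses invented ratings; the function name `cohen_kappa` and the data are illustrative assumptions, not taken from the study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's unweighted kappa for two equal-length lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    # Observed proportion of cases on which the two sessions agree
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal category frequencies
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both sessions used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: one reader's stability calls on 10 radiographs, two sessions
session_1 = ["stable", "unstable", "unstable", "stable", "stable",
             "unstable", "stable", "unstable", "stable", "stable"]
session_2 = ["stable", "unstable", "stable", "stable", "stable",
             "unstable", "stable", "unstable", "unstable", "stable"]

kappa = cohen_kappa(session_1, session_2)
print(f"kappa = {kappa:.2f}")
```

With these made-up ratings the two sessions agree on 8 of 10 radiographs, but half of that agreement is expected by chance, which is why κ is lower than the raw agreement rate.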
Computer-Aided Medical Diagnosis. Literature Review

DTIC Science & Technology

1978-12-15

Croft found a 13% difference in diagnostic accuracy. He considered this difference insignificant in relation to the diagnostic differences caused ...type of diseases diagnosed probably are the major cause of cross-study variability in diagnostic accuracy. The consistency of diagnostic accuracy...REFERENCES ALPEROVITCH, A. and FRAGU, P., A suggestion for an effective use of a computer-aided diagnosis system in screening for hyperthyroidism , Method
Muscle MR Imaging in Tubular Aggregate Myopathy

PubMed Central

Beltrame, Valeria; Ortolan, Paolo; Coran, Alessandro; Zanato, Riccardo; Gazzola, Matteo; Frigo, Annachiara; Bello, Luca; Pegoraro, Elena; Stramare, Roberto

2014-01-01

Purpose: To evaluate with Magnetic Resonance (MR) the degree of fatty replacement and edematous involvement in skeletal muscles in patients with Tubular Aggregate Myopathy (TAM), and to assess the inter-observer agreement in evaluating muscle involvement and the symmetry index of fatty replacement. Materials and Methods: 13 patients were evaluated by MR to ascertain the degree of fatty replacement (T1W sequences) according to Mercuri's scale, and edema score (STIR sequences) according to extent and site. Results: Fatty replacement mainly affects the posterior superficial compartment of the leg; the anterior compartment is generally spared. Edema was generally limited and found almost exclusively in the superficial compartment of the leg. The inter-observer agreement is very good, with a Krippendorff's coefficient >0.9. Data show a total symmetry in the muscular replacement (McNemar-Bowker test with p = 1). Conclusions: MR reveals characteristic muscular involvement, and is a reproducible technique for evaluation of TAM. There may also be a characteristic involvement of the long and short heads of the biceps femoris. It is useful for aimed biopsies, diagnostic hypotheses and evaluation of disease progression. PMID:24722334
Comprehension and reproducibility of the Judet and Letournel classification

PubMed Central

Polesello, Giancarlo Cavalli; Nunes, Marcus Aurelius Araujo; Azuaga, Thiago Leonardi; de Queiroz, Marcelo Cavalheiro; Honda, Emerson Kyoshi; Ono, Nelson Keiske

2012-01-01

Objective To evaluate the effectiveness of the method of radiographic interpretation of acetabular fractures, according to the classification of Judet and Letournel, used by a group of residents of Orthopedics at a university hospital. Methods We selected ten orthopedic residents, who were divided into two groups; one group received training in a methodology for the classification of acetabular fractures, which involves transposing the radiographic images to a graphic two-dimensional representation. We classified fifty cases of acetabular fracture on two separate occasions, and determined the intraobserver and interobserver agreement. Result The success rate was 16.2% (10-26%) for the trained group and 22.8% (10-36%) for the untrained group. The mean kappa coefficients for interobserver and intraobserver agreement in the trained group were 0.08 and 0.12, respectively, and for the untrained group, 0.14 and 0.29. Conclusion Training in the method of radiographic interpretation of acetabular fractures was not effective for assisting in the classification of acetabular fractures. Level of evidence I, Testing of previously developed diagnostic criteria on consecutive patients (with universally applied reference "gold" standard). PMID:24453583
In-class didactic versus self-directed teaching of the probe-based confocal laser endomicroscopy (pCLE) criteria for Barrett's esophagus.

PubMed

Rzouq, Fadi; Vennalaganti, Prashanth; Pakseresht, Kavous; Kanakadandi, Vijay; Parasa, Sravanthi; Mathur, Sharad C; Alsop, Benjamin R; Hornung, Benjamin; Gupta, Neil; Sharma, Prateek

2016-02-01

Optimal teaching methods for disease recognition using probe-based confocal laser endomicroscopy (pCLE) have not been developed. Our aim was to compare in-class didactic teaching vs. self-directed teaching of Barrett's neoplasia diagnosis using pCLE. This randomized controlled trial was conducted at a tertiary academic center. Study participants with no prior pCLE experience were randomized to in-class didactic (group 1) or self-directed teaching groups (group 2). For group 1, an expert conducted a classroom teaching session using standardized educational material. Participants in group 2 were provided with the same material on an audio PowerPoint. After initial training, all participants graded an initial set of 20 pCLE videos and reviewed correct responses with the expert (group 1) or on audio PowerPoint (group 2). Finally, all participants completed interpretations of a further 40 videos. Eighteen trainees (8 medical students, 10 gastroenterology trainees) participated in the study. Overall diagnostic accuracy for neoplasia prediction by pCLE was 77 % (95 % confidence interval [CI] 74.0 % - 79.2 %); of predictions made with high confidence (53 %), the accuracy was 85 % (95 %CI 81.8 % - 87.8 %). The overall accuracy and interobserver agreement was significantly higher in group 1 than in group 2 for all predictions (80.4 % vs. 73 %; P = 0.005) and for high confidence predictions (90 % vs. 80 %; P < 0.001). Following feedback (after the initial 20 videos), the overall accuracy improved from 73 % to 79 % (P = 0.04), mainly driven by a significant improvement in group 1 (74 % to 84 %; P < 0.01). Accuracy of prediction significantly improved with time in endoscopy training (72 % students, 77 % FY1, 82 % FY2, and 85 % FY3; P = 0.003). For novice trainees, in-class didactic teaching enables significantly better recognition of the pCLE features of Barrett's esophagus than self-directed teaching. 
The in-class didactic group had a shorter learning curve and were able to achieve 90 % accuracy for their high confidence predictions. © Georg Thieme Verlag KG Stuttgart · New York.
Systematic Review and Meta-Analysis of Studies Evaluating Diagnostic Test Accuracy: A Practical Review for Clinical Researchers-Part II. Statistical Methods of Meta-Analysis

PubMed Central

Lee, Juneyoung; Kim, Kyung Won; Choi, Sang Hyun; Huh, Jimi

2015-01-01

Meta-analysis of diagnostic test accuracy studies differs from the usual meta-analysis of therapeutic/interventional studies in that, it is required to simultaneously analyze a pair of two outcome measures such as sensitivity and specificity, instead of a single outcome. Since sensitivity and specificity are generally inversely correlated and could be affected by a threshold effect, more sophisticated statistical methods are required for the meta-analysis of diagnostic test accuracy. Hierarchical models including the bivariate model and the hierarchical summary receiver operating characteristic model are increasingly being accepted as standard methods for meta-analysis of diagnostic test accuracy studies. We provide a conceptual review of statistical methods currently used and recommended for meta-analysis of diagnostic test accuracy studies. This article could serve as a methodological reference for those who perform systematic review and meta-analysis of diagnostic test accuracy studies. PMID:26576107
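The paired nature of the outcome described above can be illustrated with a short sketch. It only prepares the per-study inputs (sensitivity, specificity, and their logit transforms, the scale on which the bivariate model is usually specified); fitting the hierarchical model itself needs mixed-model software and is not shown. The 2x2 counts and the names `sens_spec` and `logit` are hypothetical, chosen for illustration, and the sketch assumes no table has a zero cell.

```python
import math


def sens_spec(tp, fp, fn, tn):
    """Sensitivity and specificity from one study's 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Hypothetical (tp, fp, fn, tn) counts for three primary studies
studies = [(45, 10, 5, 90), (30, 20, 10, 80), (60, 5, 20, 95)]

# Each study contributes a pair of outcomes, not a single effect size
pairs = [sens_spec(*counts) for counts in studies]
logit_pairs = [(logit(se), logit(sp)) for se, sp in pairs]

for (se, sp), (lse, lsp) in zip(pairs, logit_pairs):
    print(f"sens={se:.2f} spec={sp:.2f} logit_sens={lse:.2f} logit_spec={lsp:.2f}")
```

Keeping the two values together per study is what lets a hierarchical model estimate their between-study correlation, which a separate pooling of sensitivity and of specificity would ignore.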
Tuberculin skin testing: Spectrum of adverse reactions.

PubMed

Praveen, Ramar; Bahuguna, Amit; Dhadwal, Bhumesh Singh

2015-01-01

Tuberculin skin testing (TST) is one of the primary diagnostic modalities for tuberculosis (TB) recommended by the World Health Organization (WHO) and by the National Institute for Health and Care Excellence (NICE) in the United Kingdom (UK). Even after acceptance as a diagnostic modality and strict standardization, TST has its own flaws, which include a spectrum of adverse reactions. We report a series of cases with a spectrum of adverse reactions occurring with a higher frequency than reported in the available evidence. The study has limitations, such as its retrospective design, interobserver variation, and lack of histopathological confirmation. The observation is presented to emphasize that adverse reactions are not a rarity and that further studies are required to establish their cause and exact incidence.
Describing Peripancreatic Collections According to the Revised Atlanta Classification of Acute Pancreatitis: An International Interobserver Agreement Study.

PubMed

Bouwense, Stefan A; van Brunschot, Sandra; van Santvoort, Hjalmar C; Besselink, Marc G; Bollen, Thomas L; Bakker, Olaf J; Banks, Peter A; Boermeester, Marja A; Cappendijk, Vincent C; Carter, Ross; Charnley, Richard; van Eijck, Casper H; Freeny, Patrick C; Hermans, John J; Hough, David M; Johnson, Colin D; Laméris, Johan S; Lerch, Markus M; Mayerle, Julia; Mortele, Koenraad J; Sarr, Michael G; Stedman, Brian; Vege, Santhi Swaroop; Werner, Jens; Dijkgraaf, Marcel G; Gooszen, Hein G; Horvath, Karen D

2017-08-01

Severe acute pancreatitis is associated with peripancreatic morphologic changes as seen on imaging. Uniform communication regarding these morphologic findings is crucial for accurate diagnosis and treatment. For the original 1992 Atlanta classification, interobserver agreement is poor. We hypothesized that for the revised Atlanta classification, interobserver agreement will be better. An international, interobserver agreement study was performed among expert and nonexpert radiologists (n = 14), surgeons (n = 15), and gastroenterologists (n = 8). Representative computed tomography scans of all stages of acute pancreatitis were selected from 55 patients and were assessed according to the revised Atlanta classification. The interobserver agreement was calculated among all reviewers and subgroups, that is, expert and nonexpert reviewers; interobserver agreement was defined as poor (≤0.20), fair (0.21-0.40), moderate (0.41-0.60), good (0.61-0.80), or very good (0.81-1.00). Interobserver agreement among all reviewers was good (0.75 [standard deviation, 0.21]) for describing the type of acute pancreatitis and good (0.62 [standard deviation, 0.19]) for the type of peripancreatic collection. Expert radiologists showed the best and nonexpert clinicians the lowest interobserver agreement. Interobserver agreement was good for the revised Atlanta classification, supporting the widespread adoption of this revised classification for clinical and research communications.
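The agreement bands quoted in this abstract can be written as a small lookup. This is a sketch only: the function name `agreement_label` is an assumption, and because the abstract lists bands to two decimals (for example 0.21-0.40), values falling between bands are assigned here to the next band up.

```python
def agreement_label(value):
    """Label an agreement coefficient using the cut-offs quoted in the abstract."""
    if value <= 0.20:
        return "poor"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "good"
    return "very good"


# The two overall coefficients reported in the abstract
reported = [
    ("type of acute pancreatitis", 0.75),
    ("type of peripancreatic collection", 0.62),
]
for item, value in reported:
    print(f"{item}: {value:.2f} -> {agreement_label(value)}")
```

Both reported values map to "good", matching the wording of the abstract; note that 0.62 sits only just above the "moderate" band.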
The STARD statement for reporting diagnostic accuracy studies: application to the history and physical examination.

PubMed

Simel, David L; Rennie, Drummond; Bossuyt, Patrick M M

2008-06-01

The Standards for Reporting of Diagnostic Accuracy (STARD) statement provided guidelines for investigators conducting diagnostic accuracy studies. We reviewed each item in the statement for its applicability to clinical examination diagnostic accuracy research, viewing each discrete aspect of the history and physical examination as a diagnostic test. Nonsystematic review of the STARD statement. Two former STARD Group participants and 1 editor of a journal series on clinical examination research reviewed each STARD item. Suggested interpretations and comments were shared to develop consensus. The STARD Statement applies generally well to clinical examination diagnostic accuracy studies. Three items are the most important for clinical examination diagnostic accuracy studies, and investigators should pay particular attention to their requirements: describe carefully the patient recruitment process, describe participant sampling and address if patients were from a consecutive series, and describe whether the clinicians were masked to the reference standard tests and whether the interpretation of the reference standard test was masked to the clinical examination components or overall clinical impression. The consideration of these and the other STARD items in clinical examination diagnostic research studies would improve the quality of investigations and strengthen conclusions reached by practicing clinicians. The STARD statement provides a very useful framework for diagnostic accuracy studies. The group correctly anticipated that there would be nuances applicable to studies of the clinical examination. We offer guidance that should enhance their usefulness to investigators embarking on original studies of a patient's history and physical examination.
Serological markers in inflammatory bowel disease: the pros and cons.

PubMed

Lerner, Aaron; Shoenfeld, Yehuda

2002-02-01

Accurate serological assays are desirable for the diagnosis of inflammatory bowel disease. Among several serological markers anti-Saccharomyces cerevisiae mannan antibodies and perinuclear antineutrophil cytoplasmic autoantibodies are highly disease specific for Crohn's disease and ulcerative colitis, respectively. Combining the two improves their specificity. Sensitivity, however, is still low. Due to lack of standardization and vast interobserver variability, they cannot be used as the only diagnostic criteria but can assist clinicians in diagnosing and categorizing patients with inflammatory bowel disease as well as in helping them to take therapeutic decisions.
The role of computerized diagnostic proposals in the interpretation of the 12-lead electrocardiogram by cardiology and non-cardiology fellows.

PubMed

Novotny, Tomas; Bond, Raymond; Andrsova, Irena; Koc, Lumir; Sisakova, Martina; Finlay, Dewar; Guldenring, Daniel; Spinar, Jindrich; Malik, Marek

2017-05-01

Most contemporary 12-lead electrocardiogram (ECG) devices offer computerized diagnostic proposals. The reliability of these automated diagnoses is limited. It has been suggested that incorrect computer advice can influence physician decision-making. This study analyzed the role of diagnostic proposals in the decision process by a group of fellows of cardiology and other internal medicine subspecialties. A set of 100 clinical 12-lead ECG tracings was selected covering both normal cases and common abnormalities. A team of 15 junior Cardiology Fellows and 15 Non-Cardiology Fellows interpreted the ECGs in 3 phases: without any diagnostic proposal, with a single diagnostic proposal (half of them intentionally incorrect), and with four diagnostic proposals (only one of them being correct) for each ECG. Self-rated confidence of each interpretation was collected. Availability of diagnostic proposals significantly increased the diagnostic accuracy (p<0.001). Nevertheless, in case of a single proposal (either correct or incorrect) the increase of accuracy was present in interpretations with correct diagnostic proposals, while the accuracy was substantially reduced with incorrect proposals. Confidence levels poorly correlated with interpretation scores (rho≈2, p<0.001). Logistic regression showed that an interpreter is most likely to be correct when the ECG offers a correct diagnostic proposal (OR=10.87) or multiple proposals (OR=4.43). Diagnostic proposals affect the diagnostic accuracy of ECG interpretations. The accuracy is significantly influenced especially when a single diagnostic proposal (either correct or incorrect) is provided. The study suggests that the presentation of multiple computerized diagnoses is likely to improve the diagnostic accuracy of interpreters. Copyright © 2017 Elsevier B.V. All rights reserved.
Regression Analysis of Optical Coherence Tomography Disc Variables for Glaucoma Diagnosis.

PubMed

Richter, Grace M; Zhang, Xinbo; Tan, Ou; Francis, Brian A; Chopra, Vikas; Greenfield, David S; Varma, Rohit; Schuman, Joel S; Huang, David

2016-08-01

To report diagnostic accuracy of optical coherence tomography (OCT) disc variables using both time-domain (TD) and Fourier-domain (FD) OCT, and to improve the use of OCT disc variable measurements for glaucoma diagnosis through regression analyses that adjust for optic disc size and axial length-based magnification error. Observational, cross-sectional. In total, 180 normal eyes of 112 participants and 180 eyes of 138 participants with perimetric glaucoma from the Advanced Imaging for Glaucoma Study. Diagnostic variables evaluated from TD-OCT and FD-OCT were: disc area, rim area, rim volume, optic nerve head volume, vertical cup-to-disc ratio (CDR), and horizontal CDR. These were compared with overall retinal nerve fiber layer thickness and ganglion cell complex. Regression analyses were performed that corrected for optic disc size and axial length. Area-under-receiver-operating curves (AUROC) were used to assess diagnostic accuracy before and after the adjustments. An index based on multiple logistic regression that combined optic disc variables with axial length was also explored with the aim of improving diagnostic accuracy of disc variables. Comparison of diagnostic accuracy of disc variables, as measured by AUROC. The unadjusted disc variables with the highest diagnostic accuracies were: rim volume for TD-OCT (AUROC=0.864) and vertical CDR (AUROC=0.874) for FD-OCT. Magnification correction significantly worsened diagnostic accuracy for rim variables, and while optic disc size adjustments partially restored diagnostic accuracy, the adjusted AUROCs were still lower. Axial length adjustments to disc variables in the form of multiple logistic regression indices led to a slight but insignificant improvement in diagnostic accuracy. Our various regression approaches were not able to significantly improve disc-based OCT glaucoma diagnosis. 
However, disc rim area and vertical CDR had very high diagnostic accuracy, and these disc variables can serve to complement additional OCT measurements for diagnosis of glaucoma.
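The AUROC values above can be read as the probability that a randomly chosen glaucomatous eye has a more abnormal value than a randomly chosen normal eye. A minimal sketch of that interpretation follows, using the pairwise (Mann-Whitney) form of the empirical AUROC. The cup-to-disc ratios are invented for illustration and the function name `auroc` is an assumption; the study's own estimation procedure is not described in the abstract.

```python
def auroc(diseased, healthy):
    """Empirical AUROC: share of (diseased, healthy) pairs ranked correctly.

    Assumes larger values indicate disease; tied pairs count as one half.
    """
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# Hypothetical vertical cup-to-disc ratios, not data from the study
glaucoma = [0.8, 0.7, 0.75, 0.6, 0.9]
normal = [0.4, 0.5, 0.6, 0.45, 0.7]

print(f"AUROC = {auroc(glaucoma, normal):.2f}")
```

For a variable where smaller values indicate disease, such as rim area, the inputs would be negated or the comparison reversed before applying the same calculation.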
An ROC-type measure of diagnostic accuracy when the gold standard is continuous-scale.

PubMed

Obuchowski, Nancy A

2006-02-15

ROC curves and summary measures of accuracy derived from them, such as the area under the ROC curve, have become the standard for describing and comparing the accuracy of diagnostic tests. Methods for estimating ROC curves rely on the existence of a gold standard which dichotomizes patients into disease present or absent. There are, however, many examples of diagnostic tests whose gold standards are not binary-scale, but rather continuous-scale. Unnatural dichotomization of these gold standards leads to bias and inconsistency in estimates of diagnostic accuracy. In this paper, we propose a non-parametric estimator of diagnostic test accuracy which does not require dichotomization of the gold standard. This estimator has an interpretation analogous to the area under the ROC curve. We propose a confidence interval for test accuracy and a statistical test for comparing accuracies of tests from paired designs. We compare the performance (i.e. CI coverage, type I error rate, power) of the proposed methods with several alternatives. An example is presented where the accuracies of two quick blood tests for measuring serum iron concentrations are estimated and compared.
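One way to picture an accuracy measure that avoids dichotomizing the gold standard is a pairwise concordance index: the proportion of patient pairs in which the test orders the two patients the same way as the gold standard. The sketch below is an illustration in that spirit and should not be taken as the exact estimator proposed in the paper; the function name `continuous_gold_accuracy`, the handling of ties, and the serum iron values are all assumptions.

```python
def continuous_gold_accuracy(test, gold):
    """Share of patient pairs ordered the same way by the test and the gold standard.

    Pairs tied on either measurement count as one half (an assumed convention).
    """
    if len(test) != len(gold) or len(test) < 2:
        raise ValueError("need two equal-length lists with at least two patients")
    score = 0.0
    pairs = 0
    for i in range(len(test)):
        for j in range(i + 1, len(test)):
            pairs += 1
            d_test = test[i] - test[j]
            d_gold = gold[i] - gold[j]
            if d_test == 0 or d_gold == 0:
                score += 0.5
            elif (d_test > 0) == (d_gold > 0):
                score += 1.0
    return score / pairs


# Hypothetical serum iron values: quick test versus reference method
quick_test = [10, 20, 30, 40]
reference = [12, 18, 35, 33]

print(f"accuracy = {continuous_gold_accuracy(quick_test, reference):.3f}")
```

As with the area under the ROC curve, a value of 0.5 corresponds to a test that orders patients no better than chance and 1.0 to a test that always agrees with the gold-standard ordering.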
The quadrant method measuring four points is as a reliable and accurate as the quadrant method in the evaluation after anatomical double-bundle ACL reconstruction.

PubMed

Mochizuki, Yuta; Kaneko, Takao; Kawahara, Keisuke; Toyoda, Shinya; Kono, Norihiko; Hada, Masaru; Ikegami, Hiroyasu; Musha, Yoshiro

2017-11-20

The quadrant method was described by Bernard et al. and it has been widely used for postoperative evaluation of anterior cruciate ligament (ACL) reconstruction. The purpose of this research is to further develop the quadrant method measuring four points, which we named four-point quadrant method, and to compare with the quadrant method. Three-dimensional computed tomography (3D-CT) analyses were performed in 25 patients who underwent double-bundle ACL reconstruction using the outside-in technique. The four points in this study's quadrant method were defined as point1-highest, point2-deepest, point3-lowest, and point4-shallowest, in femoral tunnel position. Value of depth and height in each point was measured. Antero-medial (AM) tunnel is (depth1, height2) and postero-lateral (PL) tunnel is (depth3, height4) in this four-point quadrant method. The 3D-CT images were evaluated independently by 2 orthopaedic surgeons. A second measurement was performed by both observers after a 4-week interval. Intra- and inter-observer reliability was calculated by means of intra-class correlation coefficient (ICC). Also, the accuracy of the method was evaluated against the quadrant method. Intra-observer reliability was almost perfect for both AM and PL tunnel (ICC > 0.81). Inter-observer reliability of AM tunnel was substantial (ICC > 0.61) and that of PL tunnel was almost perfect (ICC > 0.81). The AM tunnel position was 0.13% deep, 0.58% high and PL tunnel position was 0.01% shallow, 0.13% low compared to quadrant method. The four-point quadrant method was found to have high intra- and inter-observer reliability and accuracy. This method can evaluate the tunnel position regardless of the shape and morphology of the bone tunnel aperture for use of comparison and can provide measurement that can be compared with various reconstruction methods. 
The four-point quadrant method of this study is considered to have clinical relevance in that it is a detailed and accurate tool for evaluating femoral tunnel position after ACL reconstruction. Case series, Level IV.
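The abstract reports intra-class correlation coefficients without naming the ICC form. As a hedged sketch, the following computes ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single measurement), which is one common choice for inter-observer reliability; the choice of form, the function name `icc_2_1`, and the tunnel-depth values are assumptions for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings holds one row per subject and one column per rater.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares from the two-way ANOVA decomposition
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical tunnel depth (%) measured by two observers on five knees
depth = [[25.1, 24.8], [27.3, 27.9], [22.4, 22.0], [30.2, 29.5], [26.0, 26.4]]
print(f"ICC(2,1) = {icc_2_1(depth):.2f}")
```

Because this form penalizes systematic offsets between observers, it is stricter than consistency-type ICCs; a study using a different form would obtain different values from the same measurements.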
Delayed pneumothorax after stab wound to thorax and upper abdomen: Truth or myth?

PubMed

Zehtabchi, Shahriar; Morley, Eric J; Sajed, Dana; Greenberg, Oded; Sinert, Richard

2009-01-01

Stab wounds to the thorax and upper abdomen have the potential to cause pneumothorax (PTX). When a chest radiograph (CXR) obtained during initial resuscitation is negative, a second CXR (CXR-2) is commonly performed with the goal of identifying delayed PTX. The objective was to assess the diagnostic yield of the CXR-2 in identifying delayed PTX. Prospective observational study of patients (age ≥13 years) with stab wounds to the thorax (chest/back) and upper abdomen with suspected PTX, in a level 1 trauma centre. Patients were included if they had a negative initial CXR followed by a repeat CXR 3-6 h after the initial one. Patients were excluded if they died, were transferred out of the ED, or received chest tubes before the second CXR. The outcome of interest was delayed PTX. All CXRs were read by an attending radiologist. To test the inter-observer agreement, another blinded radiologist reviewed 20% of the CXRs. Continuous data are presented as mean ± standard deviation and categorical data as percentages with 95% confidence interval (CI). Kappa statistics were used to measure the inter-observer agreement between radiologists. Between January 2003 and December 2006 a total of 185 patients qualified for enrollment (mean age: 28 ± 10 years, age range: 13-65, 94% male). Only 2 patients (1.1%, 95% CI, 0.4-4.1%) had PTX on the CXR-2. Both patients received chest tubes. The inter-observer agreement for radiology reports was high (kappa: 0.79). Occurrence of delayed PTX in patients with stab wounds to the thorax and upper abdomen and negative triage CXR is rare.
Ultrasound definition of tendon damage in patients with rheumatoid arthritis. Results of a OMERACT consensus-based ultrasound score focussing on the diagnostic reliability.

PubMed

Bruyn, George A W; Hanova, Petra; Iagnocco, Annamaria; d'Agostino, Maria-Antonietta; Möller, Ingrid; Terslev, Lene; Backhaus, Marina; Balint, Peter V; Filippucci, Emilio; Baudoin, Paul; van Vugt, Richard; Pineda, Carlos; Wakefield, Richard; Garrido, Jesus; Pecha, Ondrej; Naredo, Esperanza

2014-11-01

To develop the first ultrasound scoring system of tendon damage in rheumatoid arthritis (RA) and assess its intraobserver and interobserver reliability. We conducted a Delphi study on ultrasound-defined tendon damage and ultrasound scoring system of tendon damage in RA among 35 international rheumatologists with experience in musculoskeletal ultrasound. Twelve patients with RA were included and assessed twice by 12 rheumatologists-sonographers. Ultrasound examination for tendon damage in B mode of five wrist extensor compartments (extensor carpi radialis brevis and longus; extensor pollicis longus; extensor digitorum communis; extensor digiti minimi; extensor carpi ulnaris) and one ankle tendon (tibialis posterior) was performed blindly, independently and bilaterally in each patient. Intraobserver and interobserver reliability were calculated by κ coefficients. A three-grade semiquantitative scoring system was agreed for scoring tendon damage in B mode. The mean intraobserver reliability for tendon damage scoring was excellent (κ value 0.91). The mean interobserver reliability assessment showed good κ values (κ value 0.75). The most reliable were the extensor digiti minimi, the extensor carpi ulnaris, and the tibialis posterior tendons. An ultrasound reference image atlas of tenosynovitis and tendon damage was also developed. Ultrasound is a reproducible tool for evaluating tendon damage in RA. This study strongly supports a new reliable ultrasound scoring system for tendon damage. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Virtual microscopy: an evaluation of its validity and diagnostic performance in routine histologic diagnosis of skin tumors.

PubMed

Nielsen, Patricia Switten; Lindebjerg, Jan; Rasmussen, Jan; Starklint, Henrik; Waldstrøm, Marianne; Nielsen, Bjarne

2010-12-01

Digitization of histologic slides is associated with many advantages, and its use in routine diagnosis holds great promise. Nevertheless, few articles evaluate virtual microscopy in routine settings. This study is an evaluation of the validity and diagnostic performance of virtual microscopy in routine histologic diagnosis of skin tumors. Our aim is to investigate whether conventional microscopy of skin tumors can be replaced by virtual microscopy. Ninety-six skin tumors and skin-tumor-like changes were consecutively gathered over a 1-week period. Specimens were routinely processed, and digital slides were captured on Mirax Scan (Carl Zeiss MicroImaging, Göttingen, Germany). Four pathologists evaluated the 96 virtual slides and the associated 96 conventional slides twice with intermediate time intervals of at least 3 weeks. Virtual slides that caused difficulties were reevaluated to identify possible reasons for this. The accuracy was 89.2% for virtual microscopy and 92.7% for conventional microscopy. All κ coefficients expressed very good intra- and interobserver agreement. The sensitivities were 85.7% (78.0%-91.0%) and 92.0% (85.5%-95.7%) for virtual and conventional microscopy, respectively. The difference between the sensitivities was 6.3% (0.8%-12.6%). The subsequent reevaluation showed that virtual slides were as useful as conventional slides when rendering a diagnosis. Differences seen are presumed to be due to the pathologists' lack of experience using the virtual microscope. We conclude that it is feasible to make histologic diagnosis on the skin tumor types represented in this study using virtual microscopy after pathologists have completed a period of training. Larger studies should be conducted to verify whether virtual microscopy can replace conventional microscopy in routine practice. Copyright © 2010 Elsevier Inc. All rights reserved.

Validation of in vivo 2D displacements from spiral cine DENSE at 3T.

PubMed

Wehner, Gregory J; Suever, Jonathan D; Haggerty, Christopher M; Jing, Linyuan; Powell, David K; Hamlet, Sean M; Grabau, Jonathan D; Mojsejenko, Walter Dimitri; Zhong, Xiaodong; Epstein, Frederick H; Fornwalt, Brandon K

2015-01-30

Displacement Encoding with Stimulated Echoes (DENSE) encodes displacement into the phase of the magnetic resonance signal. Due to the stimulated echo, the signal is inherently low and fades through the cardiac cycle. To compensate, a spiral acquisition has been used at 1.5T. This spiral sequence has not been validated at 3T, where the increased signal would be valuable, but field inhomogeneities may result in measurement errors. We hypothesized that spiral cine DENSE is valid at 3T and tested this hypothesis by measuring displacement errors at both 1.5T and 3T in vivo. Two-dimensional spiral cine DENSE and tagged imaging of the left ventricle were performed on ten healthy subjects at 3T and six healthy subjects at 1.5T. Intersection points were identified on tagged images near end-systole. Displacements from the DENSE images were used to project those points back to their origins. The deviation from a perfect grid was used as a measure of accuracy and quantified as root-mean-squared error. This measure was compared between 3T and 1.5T with the Wilcoxon rank sum test. Inter-observer variability of strains and torsion quantified by DENSE and agreement between DENSE and harmonic phase (HARP) were assessed by Bland-Altman analyses. The signal to noise ratio (SNR) at each cardiac phase was compared between 3T and 1.5T with the Wilcoxon rank sum test. The displacement accuracy of spiral cine DENSE was not different between 3T and 1.5T (1.2 ± 0.3 mm and 1.2 ± 0.4 mm, respectively). Both values were lower than the DENSE pixel spacing of 2.8 mm. There were no substantial differences in inter-observer variability of DENSE or agreement of DENSE and HARP between 3T and 1.5T. Relative to 1.5T, the SNR at 3T was greater by a factor of 1.4 ± 0.3. The spiral cine DENSE acquisition that has been used at 1.5T to measure cardiac displacements can be applied at 3T with equivalent accuracy. 
The inter-observer variability and agreement of DENSE-derived peak strains and torsion with HARP is also comparable at both field strengths. Future studies with spiral cine DENSE may take advantage of the additional SNR at 3T.
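Bland-Altman analysis, used above for inter-observer variability and for agreement between DENSE and HARP, summarizes paired measurements by the mean difference (bias) and the 95% limits of agreement. The sketch below shows that calculation under the usual assumption of approximately normally distributed differences; the strain values and the function name `bland_altman` are hypothetical and not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical peak circumferential strain (%) from two observers
observer_1 = [-18.0, -20.0, -17.5, -19.0]
observer_2 = [-17.0, -20.5, -18.5, -19.0]

bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

Comparing field strengths then amounts to running the same calculation separately on the 1.5T and 3T measurements and checking whether the bias and limits are of similar size.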
Comparative Diagnostic Accuracy of the ACE-III, MIS, MMSE, MoCA, and RUDAS for Screening of Alzheimer Disease.

PubMed

Matías-Guiu, Jordi A; Valles-Salgado, María; Rognoni, Teresa; Hamre-Gil, Frank; Moreno-Ramos, Teresa; Matías-Guiu, Jorge

2017-01-01

Our aim was to evaluate and compare the diagnostic properties of 5 screening tests for the diagnosis of mild Alzheimer disease (AD). We conducted a prospective and cross-sectional study of 92 patients with mild AD and of 68 healthy controls from our Department of Neurology. The diagnostic properties of the following tests were compared: Mini-Mental State Examination (MMSE), Addenbrooke's Cognitive Examination III (ACE-III), Memory Impairment Screen (MIS), Montreal Cognitive Assessment (MoCA), and Rowland Universal Dementia Assessment Scale (RUDAS). All tests yielded high diagnostic accuracy, with the ACE-III achieving the best diagnostic properties. The area under the curve was 0.897 for the ACE-III, 0.889 for the RUDAS, 0.874 for the MMSE, 0.866 for the MIS, and 0.856 for the MoCA. The Mini-ACE score from the ACE-III showed the highest diagnostic capacity (area under the curve 0.939). Memory scores of the ACE-III and of the RUDAS showed a better diagnostic accuracy than those of the MMSE and of the MoCA. All tests, especially the ACE-III, conveyed a higher diagnostic accuracy in patients with full primary education than in the less educated group. Implementing normative data improved the diagnostic accuracy of the ACE-III but not that of the other tests. The ACE-III achieved the highest diagnostic accuracy. This better discrimination was more evident in the more educated group. © 2017 S. Karger AG, Basel.
A systematic review of the passive straight leg raising test as a diagnostic aid for low back pain (1989 to 2000).

PubMed

Rebain, Richard; Baxter, G David; McDonough, Suzanne

2002-09-01

A systematic review. This systematic review sought papers (January 1989-January 2000) on the passive straight leg raising test (PSLR) as a diagnostic component for low back pain (LBP) to identify, summarize, and assess developments in the test procedure, the factors influencing PSLR outcome, and the clinical significance of that outcome. Previous studies suggested that the PSLR tractioned the sciatic nerve and that diminished leg elevation with reproduced pain indicated low lumbar intervertebral disc pathology. Searches on six computerized bibliographic databases identified publications written about the PSLR. Papers were excluded if they were published before January 1989, were non-English language papers, or employed either an active SLR or a PSLR for purposes other than LBP diagnosis. The references of qualifying papers (and the references of references) were searched. Contact with primary authors, and others known to be active in this field, was attempted. The PSLR procedure remains unchanged. The influence of hip rotation during the PSLR was discussed without consensus. Biomechanical devices improved intra- and interobserver reliability and so increased test reproducibility. Hamstrings were found to have a defensive role in protecting nerve roots by limiting PSLR range in cases of nerve root inflammation. A small diurnal variation in the PSLR may imply a poorer prognosis. A positive PSLR at 4 months after lumbar intervertebral disc surgery predicted poor reoperative outcome, and a negative 4-month PSLR predicted excellent outcome. The influence of psychosocial factors was not discussed, neither was the diagnostic significance of a negative PSLR outcome. There remains no standard PSLR procedure, no consensus on interpretation of results, and little recognition that a negative PSLR test outcome may be of greater diagnostic value than a positive one. The causal link between LBP pathology and hamstring action remains unclear. 
There is a need for research into the clinical use of the PSLR; its intra- and interobserver reliability; the influences of age, gender, diurnal variation, and psychosocial factors; and its predictive value in lumbar intervertebral disc surgery.
Liver fibrosis staging with a new 2D-shear wave elastography using comb-push technique: Applicability, reproducibility, and diagnostic performance

PubMed Central

Lee, Sang Min; Kang, Hyo-Jin; Yang, Hyung Kung; Yoon, Jeong Hee; Chang, Won; An, Su Joa; Lee, Kyoung Bun; Baek, Seung Yon

2017-01-01

Objective To evaluate the applicability, reproducibility, and diagnostic performance of a new 2D-shear wave elastography (SWE) using the comb-push technique (2D CP-SWE) for detection of hepatic fibrosis, using histopathology as the reference standard. Materials and methods This prospective study was approved by the institutional review board, and informed consent was obtained from all patients. The liver stiffness (LS) measurements were obtained from 140 patients, using the new 2D-SWE, which uses comb-push excitation to produce shear waves and a time-aligned sequential tracking method to detect shear wave signals. The applicability rate of 2D CP-SWE was estimated, and factors associated with its applicability were identified. Intraobserver reproducibility was evaluated in the 105 patients with histopathologic diagnosis, and interobserver reproducibility was assessed in 20 patients. Diagnostic performance of the 2D CP-SWE for hepatic fibrosis was evaluated by receiver operating characteristic (ROC) curve analysis. Results The applicability rate of 2D CP-SWE was 90.8% (109 of 120). There was a significant difference in age, presence or absence of ascites, and the distance from the transducer to the Glisson capsule between the patients with applicable LS measurements and patients with unreliable measurement or technical failure. The intraclass correlation of interobserver agreement was 0.87, and the value for the intraobserver agreement was 0.95. The area under the ROC curve of LS values for stage F2 fibrosis or greater, stage F3 or greater, and stage F4 fibrosis was 0.874 (95% confidence interval [CI]: 0.794–0.930), 0.905 (95% CI: 0.832–0.954), and 0.894 (95% CI: 0.819–0.946), respectively. Conclusion 2D CP-SWE can be employed as a reliable method for assessing hepatic fibrosis with a reasonably good diagnostic performance, and its applicability might be influenced by age, ascites, and the distance between the transducer and Glisson capsule. PMID:28510583
Assessment of Interobserver Reliability in Nutrition Studies that Use Direct Observation of School Meals

PubMed Central

BAGLIO, MICHELLE L.; BAXTER, SUZANNE DOMEL; GUINN, CAROLINE H.; THOMPSON, WILLIAM O.; SHAFFER, NICOLE M.; FRYE, FRANCESCA H. A.

2005-01-01

This article (a) provides a general review of interobserver reliability (IOR) and (b) describes our method for assessing IOR for items and amounts consumed during school meals for a series of studies regarding the accuracy of fourth-grade children's dietary recalls validated with direct observation of school meals. A widely used validation method for dietary assessment is direct observation of meals. Although many studies utilize several people to conduct direct observations, few published studies indicate whether IOR was assessed. Assessment of IOR is necessary to determine that the information collected does not depend on who conducted the observation. Two strengths of our method for assessing IOR are that IOR was assessed regularly throughout the data collection period and that IOR was assessed for foods at the item and amount level instead of at the nutrient level. Adequate agreement among observers is essential to the reasoning behind using observation as a validation tool. Readers are encouraged to question the results of studies that fail to mention and/or to include the results for assessment of IOR when multiple people have conducted observations. PMID:15354155
Development and validation of a paediatric long-bone fracture classification. A prospective multicentre study in 13 European paediatric trauma centres

PubMed Central

2011-01-01

Background The aim of this study was to develop a child-specific classification system for long bone fractures and to examine its reliability and validity on the basis of a prospective multicentre study. Methods Using the sequentially developed classification system, three samples of between 30 and 185 paediatric limb fractures from a pool of 2308 fractures documented in two multicenter studies were analysed in a blinded fashion by eight orthopaedic surgeons, on a total of 5 occasions. Intra- and interobserver reliability and accuracy were calculated. Results The reliability improved with successive simplification of the classification. The final version resulted in an overall interobserver agreement of κ = 0.71 with no significant difference between experienced and less experienced raters. Conclusions In conclusion, the evaluation of the newly proposed classification system resulted in a reliable and routinely applicable system, for which training in its proper use may further improve the reliability. It can be recommended as a useful tool for clinical practice and offers the option for developing treatment recommendations and outcome predictions in the future. PMID:21548939
Using Meta-Analysis to Inform the Design of Subsequent Studies of Diagnostic Test Accuracy

ERIC Educational Resources Information Center

Hinchliffe, Sally R.; Crowther, Michael J.; Phillips, Robert S.; Sutton, Alex J.

2013-01-01

An individual diagnostic accuracy study rarely provides enough information to make conclusive recommendations about the accuracy of a diagnostic test; particularly when the study is small. Meta-analysis methods provide a way of combining information from multiple studies, reducing uncertainty in the result and hopefully providing substantial…
STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration

PubMed Central

Cohen, Jérémie F; Korevaar, Daniël A; Altman, Douglas G; Bruns, David E; Gatsonis, Constantine A; Hooft, Lotty; Irwig, Les; Levine, Deborah; Reitsma, Johannes B; de Vet, Henrica C W; Bossuyt, Patrick M M

2016-01-01

Diagnostic accuracy studies are, like other clinical studies, at risk of bias due to shortcomings in design and conduct, and the results of a diagnostic accuracy study may not apply to other patient groups and settings. Readers of study reports need to be informed about study design and conduct, in sufficient detail to judge the trustworthiness and applicability of the study findings. The STARD statement (Standards for Reporting of Diagnostic Accuracy Studies) was developed to improve the completeness and transparency of reports of diagnostic accuracy studies. STARD contains a list of essential items that can be used as a checklist, by authors, reviewers and other readers, to ensure that a report of a diagnostic accuracy study contains the necessary information. STARD was recently updated. All updated STARD materials, including the checklist, are available at http://www.equator-network.org/reporting-guidelines/stard. Here, we present the STARD 2015 explanation and elaboration document. Through commented examples of appropriate reporting, we clarify the rationale for each of the 30 items on the STARD 2015 checklist, and describe what is expected from authors in developing sufficiently informative study reports. PMID:28137831
[Diagnostic value of cardiac magnetic resonance in patients with acute viral myocarditis].

PubMed

Ouyang, Haichun; Chen, Haixiong; Hu, Yunzhao; Wu, Yanxian; Li, Wensheng; Chen, Yuying; Cen, Yujian

2014-11-01

To assess the diagnostic value of cardiac magnetic resonance (CMR) in patients with acute viral myocarditis. Thirty patients with suspected acute viral myocarditis admitted to the First People's Hospital of Shunde from June 2011 to June 2013 were included in this prospective study. The diagnostic sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy of CMR for acute viral myocarditis were evaluated against clinical diagnosis. The diagnostic value of the different scan methods and of the Lake Louise criteria was compared. Acute viral myocarditis was diagnosed in 63.33% (19/30) of patients. Values for sensitivity, specificity, PPV, NPV, and diagnostic accuracy within the overall cohort were 57.89%, 72.73%, 78.57%, 50.00%, and 63.33%, respectively, by edema imaging (ER). Values for sensitivity, specificity, PPV, NPV, and diagnostic accuracy within the overall cohort were 78.95%, 63.64%, 78.95%, 63.64%, and 73.33%, respectively, using global relative enhancement (gRE). Values for sensitivity, specificity, PPV, NPV, and diagnostic accuracy within the overall cohort were 78.95%, 54.55%, 75.00%, 60.00%, and 70.00%, respectively, using late gadolinium enhancement (LGE) criteria. Values for sensitivity, specificity, PPV, NPV, and diagnostic accuracy within the overall cohort were 84.21%, 81.82%, 88.89%, 75.00%, and 83.33%, respectively, using Lake Louise criteria. The sensitivity, specificity, PPV, NPV, and diagnostic accuracy using Lake Louise criteria were significantly higher than using ER, gRE, or LGE alone (all P < 0.05). Specificity was higher using ER than using gRE and LGE (both P < 0.05). The sensitivity, NPV, and diagnostic accuracy were significantly higher using gRE than using ER (all P < 0.05) and were similar to those using LGE (all P > 0.05). Cardiac magnetic resonance is an excellent imaging modality for the diagnosis of acute viral myocarditis.
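As a worked check of the figures in this record, the following Python sketch recomputes sensitivity, specificity, PPV, NPV, and accuracy from a 2×2 table. The abstract does not report the underlying counts; the counts used here are assumptions, back-calculated from the stated percentages and the 19/30 split between patients with and without myocarditis. The function name `diagnostic_metrics` is illustrative only.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return standard accuracy measures (in percent) from 2x2 table counts."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),  # diseased patients testing positive
        "specificity": 100 * tn / (tn + fp),  # non-diseased patients testing negative
        "ppv": 100 * tp / (tp + fp),          # positive tests that are correct
        "npv": 100 * tn / (tn + fn),          # negative tests that are correct
        "accuracy": 100 * (tp + tn) / total,  # all correct calls
    }


# Assumed counts, inferred from the reported percentages
# (19 patients with myocarditis, 11 without); not stated in the abstract.
edema = diagnostic_metrics(tp=11, fn=8, fp=3, tn=8)
lake_louise = diagnostic_metrics(tp=16, fn=3, fp=2, tn=9)

for name, metrics in (("edema imaging", edema), ("Lake Louise", lake_louise)):
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```

With these assumed counts the rounded outputs match the percentages quoted for edema imaging and for the Lake Louise criteria, which suggests the reported values are internally consistent for a cohort of 30.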
The STARD Statement for Reporting Diagnostic Accuracy Studies: Application to the History and Physical Examination

PubMed Central

Rennie, Drummond; Bossuyt, Patrick M. M.

2008-01-01

Summary Objective The Standards for Reporting of Diagnostic Accuracy (STARD) statement provided guidelines for investigators conducting diagnostic accuracy studies. We reviewed each item in the statement for its applicability to clinical examination diagnostic accuracy research, viewing each discrete aspect of the history and physical examination as a diagnostic test. Setting Nonsystematic review of the STARD statement. Interventions Two former STARD Group participants and 1 editor of a journal series on clinical examination research reviewed each STARD item. Suggested interpretations and comments were shared to develop consensus. Measurements and Main Results The STARD Statement applies generally well to clinical examination diagnostic accuracy studies. Three items are the most important for clinical examination diagnostic accuracy studies, and investigators should pay particular attention to their requirements: describe carefully the patient recruitment process, describe participant sampling and address if patients were from a consecutive series, and describe whether the clinicians were masked to the reference standard tests and whether the interpretation of the reference standard test was masked to the clinical examination components or overall clinical impression. The consideration of these and the other STARD items in clinical examination diagnostic research studies would improve the quality of investigations and strengthen conclusions reached by practicing clinicians. Conclusions The STARD statement provides a very useful framework for diagnostic accuracy studies. The group correctly anticipated that there would be nuances applicable to studies of the clinical examination. We offer guidance that should enhance their usefulness to investigators embarking on original studies of a patient’s history and physical examination. PMID:18347878
DNA methylation-based classification of central nervous system tumours.

PubMed

Capper, David; Jones, David T W; Sill, Martin; Hovestadt, Volker; Schrimpf, Daniel; Sturm, Dominik; Koelsche, Christian; Sahm, Felix; Chavez, Lukas; Reuss, David E; Kratz, Annekathrin; Wefers, Annika K; Huang, Kristin; Pajtler, Kristian W; Schweizer, Leonille; Stichel, Damian; Olar, Adriana; Engel, Nils W; Lindenberg, Kerstin; Harter, Patrick N; Braczynski, Anne K; Plate, Karl H; Dohmen, Hildegard; Garvalov, Boyan K; Coras, Roland; Hölsken, Annett; Hewer, Ekkehard; Bewerunge-Hudler, Melanie; Schick, Matthias; Fischer, Roger; Beschorner, Rudi; Schittenhelm, Jens; Staszewski, Ori; Wani, Khalida; Varlet, Pascale; Pages, Melanie; Temming, Petra; Lohmann, Dietmar; Selt, Florian; Witt, Hendrik; Milde, Till; Witt, Olaf; Aronica, Eleonora; Giangaspero, Felice; Rushing, Elisabeth; Scheurlen, Wolfram; Geisenberger, Christoph; Rodriguez, Fausto J; Becker, Albert; Preusser, Matthias; Haberler, Christine; Bjerkvig, Rolf; Cryan, Jane; Farrell, Michael; Deckert, Martina; Hench, Jürgen; Frank, Stephan; Serrano, Jonathan; Kannan, Kasthuri; Tsirigos, Aristotelis; Brück, Wolfgang; Hofer, Silvia; Brehmer, Stefanie; Seiz-Rosenhagen, Marcel; Hänggi, Daniel; Hans, Volkmar; Rozsnoki, Stephanie; Hansford, Jordan R; Kohlhof, Patricia; Kristensen, Bjarne W; Lechner, Matt; Lopes, Beatriz; Mawrin, Christian; Ketter, Ralf; Kulozik, Andreas; Khatib, Ziad; Heppner, Frank; Koch, Arend; Jouvet, Anne; Keohane, Catherine; Mühleisen, Helmut; Mueller, Wolf; Pohl, Ute; Prinz, Marco; Benner, Axel; Zapatka, Marc; Gottardo, Nicholas G; Driever, Pablo Hernáiz; Kramm, Christof M; Müller, Hermann L; Rutkowski, Stefan; von Hoff, Katja; Frühwald, Michael C; Gnekow, Astrid; Fleischhack, Gudrun; Tippelt, Stephan; Calaminus, Gabriele; Monoranu, Camelia-Maria; Perry, Arie; Jones, Chris; Jacques, Thomas S; Radlwimmer, Bernhard; Gessi, Marco; Pietsch, Torsten; Schramm, Johannes; Schackert, Gabriele; Westphal, Manfred; Reifenberger, Guido; Wesseling, Pieter; Weller, Michael; Collins, Vincent Peter; Blümcke, Ingmar; 
Bendszus, Martin; Debus, Jürgen; Huang, Annie; Jabado, Nada; Northcott, Paul A; Paulus, Werner; Gajjar, Amar; Robinson, Giles W; Taylor, Michael D; Jaunmuktane, Zane; Ryzhova, Marina; Platten, Michael; Unterberg, Andreas; Wick, Wolfgang; Karajannis, Matthias A; Mittelbronn, Michel; Acker, Till; Hartmann, Christian; Aldape, Kenneth; Schüller, Ulrich; Buslei, Rolf; Lichter, Peter; Kool, Marcel; Herold-Mende, Christel; Ellison, David W; Hasselblatt, Martin; Snuderl, Matija; Brandner, Sebastian; Korshunov, Andrey; von Deimling, Andreas; Pfister, Stefan M

2018-03-22

Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.
Shunt flow evaluation in congenital heart disease based on two-dimensional speckle tracking.

PubMed

Fadnes, Solveig; Nyrnes, Siri Ann; Torp, Hans; Lovstakken, Lasse

2014-10-01

High-frame-rate ultrasound speckle tracking was used for quantification of peak velocity in shunt flows resulting from septal defects in congenital heart disease. In a duplex acquisition scheme implemented on a research scanner, unfocused transmit beams and full parallel receive beamforming were used to achieve a frame rate of 107 frames/s for full field-of-view flow images with high accuracy, while also ensuring high-quality focused B-mode tissue imaging. The setup was evaluated in vivo for neonates with atrial and ventricular septal defects. The shunt position was automatically tracked in B-mode images and further used in blood speckle tracking to obtain calibrated shunt flow velocities throughout the cardiac cycle. Validation toward color flow imaging and pulsed wave Doppler with manual angle correction indicated that blood speckle tracking could provide accurate estimates of shunt flow velocities. The approach was less biased by clutter filtering compared with color flow imaging and was able to provide velocity estimates beyond the Nyquist range. Possible placements of sample volumes (and angle corrections) for conventional Doppler resulted in peak shunt velocity variations of 0.49-0.56 m/s for the ventricular septal defect of patient 1 and 0.38-0.58 m/s for the atrial septal defect of patient 2. In comparison, the peak velocities found from speckle tracking were 0.77 and 0.33 m/s for patients 1 and 2, respectively. Results indicated that complex intraventricular flow velocity patterns could be quantified using high-frame-rate speckle tracking of both blood and tissue movement. This could potentially help increase diagnostic accuracy and decrease inter-observer variability when measuring peak velocity in shunt flows. Copyright © 2014 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Semiautomatic Assessment of the Terminal Ileum and Colon in Patients with Crohn Disease Using MRI (the VIGOR++ Project).

PubMed

Puylaert, Carl A J; Schüffler, Peter J; Naziroglu, Robiel E; Tielbeek, Jeroen A W; Li, Zhang; Makanyanga, Jesica C; Tutein Nolthenius, Charlotte J; Nio, C Yung; Pendsé, Douglas A; Menys, Alex; Ponsioen, Cyriel Y; Atkinson, David; Forbes, Alastair; Buhmann, Joachim M; Fuchs, Thomas J; Hatzakis, Haralambos; van Vliet, Lucas J; Stoker, Jaap; Taylor, Stuart A; Vos, Frans M

2018-02-07

The objective of this study was to develop and validate a predictive magnetic resonance imaging (MRI) activity score for ileocolonic Crohn disease activity based on both subjective and semiautomatic MRI features. An MRI activity score (the "virtual gastrointestinal tract [VIGOR]" score) was developed from 27 validated magnetic resonance enterography datasets, including subjective radiologist observation of mural T2 signal and semiautomatic measurements of bowel wall thickness, excess volume, and dynamic contrast enhancement (initial slope of increase). A second subjective score was developed based on only radiologist observations. For validation, two observers applied both scores and three existing scores to a prospective dataset of 106 patients (59 women, median age 33) with known Crohn disease, using the endoscopic Crohn's Disease Endoscopic Index of Severity (CDEIS) as a reference standard. The VIGOR score (17.1 × initial slope of increase + 0.2 × excess volume + 2.3 × mural T2) and other activity scores all had comparable correlation to the CDEIS scores (observer 1: r = 0.58 and 0.59, and observer 2: r = 0.34-0.40 and 0.43-0.51, respectively). The VIGOR score, however, improved interobserver agreement compared to the other activity scores (intraclass correlation coefficient = 0.81 vs 0.44-0.59). A diagnostic accuracy of 80%-81% was seen for the VIGOR score, similar to the other scores. The VIGOR score achieves comparable accuracy to conventional MRI activity scores, but with significantly improved reproducibility, favoring its use for disease monitoring and therapy evaluation. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
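The VIGOR score quoted in this record is a linear combination of three MRI features, which the following Python sketch evaluates. The coefficients are taken from the abstract; the function name, the argument names, and the example inputs are hypothetical, and the abstract does not state the units or valid ranges of the three features, so the sketch performs no range checking.

```python
def vigor_score(initial_slope_of_increase, excess_volume, mural_t2):
    """Evaluate the VIGOR formula as quoted in the abstract.

    Score = 17.1 * initial slope of increase
          + 0.2  * excess volume
          + 2.3  * mural T2 signal
    Units and scales of the inputs are not given in the abstract.
    """
    return (
        17.1 * initial_slope_of_increase
        + 0.2 * excess_volume
        + 2.3 * mural_t2
    )


# Hypothetical feature values, chosen only to show the arithmetic.
example = vigor_score(initial_slope_of_increase=1.0, excess_volume=10.0, mural_t2=2.0)
print(round(example, 1))  # 17.1 + 2.0 + 4.6
```

Because two of the three inputs are semiautomatic measurements rather than reader judgments, a formula of this shape leaves less room for reader-dependent variation, which is in line with the higher intraclass correlation the abstract reports.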
Development of CD3 cell quantitation algorithms for renal allograft biopsy rejection assessment utilizing open source image analysis software.

PubMed

Moon, Andres; Smith, Geoffrey H; Kong, Jun; Rogers, Thomas E; Ellis, Carla L; Farris, Alton B Brad

2018-02-01

Renal allograft rejection diagnosis depends on assessment of parameters such as interstitial inflammation; however, studies have shown interobserver variability regarding interstitial inflammation assessment. Since automated image analysis quantitation can be reproducible, we devised customized analysis methods for CD3+ T-cell staining density as a measure of rejection severity and compared them with established commercial methods along with visual assessment. Renal biopsy CD3 immunohistochemistry slides (n = 45), including renal allografts with various degrees of acute cellular rejection (ACR) were scanned for whole slide images (WSIs). Inflammation was quantitated in the WSIs using pathologist visual assessment, commercial algorithms (Aperio nuclear algorithm for CD3+ cells/mm² and Aperio positive pixel count algorithm), and customized open source algorithms developed in ImageJ with thresholding/positive pixel counting (custom CD3+%) and identification of pixels fulfilling "maxima" criteria for CD3 expression (custom CD3+ cells/mm²). Based on visual inspections of "markup" images, CD3 quantitation algorithms produced adequate accuracy. Additionally, CD3 quantitation algorithms correlated with each other and also with visual assessment in a statistically significant manner (r = 0.44 to 0.94, p = 0.003 to < 0.0001). Methods for assessing inflammation suggested a progression through the tubulointerstitial ACR grades, with statistically different results in borderline versus other ACR types, in all but the custom methods. Assessment of CD3-stained slides using various open source image analysis algorithms presents salient correlations with established methods of CD3 quantitation. These analysis techniques are promising and highly customizable, providing a form of on-slide "flow cytometry" that can facilitate additional diagnostic accuracy in tissue-based assessments.
Diagnostic Quality of 3D T2-SPACE Compared with T2-FSE in the Evaluation of Cervical Spine MRI Anatomy.

PubMed

Chokshi, F H; Sadigh, G; Carpenter, W; Allen, J W

2017-04-01

Spinal anatomy has been variably investigated using 3D MRI. We aimed to compare the diagnostic quality of T2 sampling perfection with application-optimized contrasts by using flip angle evolution (SPACE) with T2-FSE sequences for visualization of cervical spine anatomy. We predicted that T2-SPACE will be equivalent or superior to T2-FSE for visibility of anatomic structures. Adult patients undergoing cervical spine MR imaging with both T2-SPACE and T2-FSE sequences for radiculopathy or myelopathy between September 2014 and February 2015 were included. Two blinded subspecialty-trained radiologists independently assessed the visibility of 12 anatomic structures by using a 5-point scale and assessed CSF pulsation artifact by using a 4-point scale. Sagittal images and 6 axial levels from C2-T1 on T2-FSE were reviewed; 2 weeks later and after randomization, T2-SPACE was evaluated. Diagnostic quality for each structure and CSF pulsation artifact visibility on both sequences were compared by using a paired t test. Interobserver agreement was calculated (κ). Forty-five patients were included (mean age, 57 years; 40% male). The average visibility scores for intervertebral disc signal, neural foramina, ligamentum flavum, ventral rootlets, and dorsal rootlets were higher for T2-SPACE compared with T2-FSE for both reviewers (P < .001). Average scores for remaining structures were either not statistically different or the superiority of one sequence was discordant between reviewers. T2-SPACE showed a lesser degree of CSF flow artifact (P < .001). Interobserver agreement (κ) ranged from -0.02 to 0.20 for T2-SPACE and from -0.02 to 0.30 for T2-FSE (slight to fair agreement). T2-SPACE may be equivalent or superior to T2-FSE for the evaluation of cervical spine anatomic structures, and T2-SPACE shows a lower degree of CSF pulsation artifact. © 2017 by American Journal of Neuroradiology.
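Several records in this listing summarize interobserver agreement with κ. As a minimal sketch of how such a value can be obtained, the Python code below computes unweighted Cohen's κ for two raters and maps it to the Landis and Koch descriptive bands. The abstract above does not say whether a weighted or unweighted κ was used, so the unweighted form is an assumption, and the ratings in the example are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Descriptive label for kappa following the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings from two readers on four cases.
kappa = cohen_kappa([1, 1, 2, 2], [1, 2, 2, 2])
print(round(kappa, 2), landis_koch(kappa))
```

Under these bands a κ of 0.20 is labelled "slight" and a κ of 0.30 "fair", consistent with the wording in the record above; a slightly negative κ falls in the band below "slight". For ordinal scales such as 5-point visibility scores, a weighted κ is a common alternative and would give different values.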
Noninvasive measurement of burn wound depth applying infrared thermal imaging (Conference Presentation)

NASA Astrophysics Data System (ADS)

Jaspers, Mariëlle E.; Maltha, Ilse M.; Klaessens, John H.; Vet, Henrica C.; Verdaasdonk, Rudolf M.; Zuijlen, Paul P.

2016-02-01

In burn wounds, early discrimination between the different depths plays an important role in the treatment strategy. The remaining vasculature in the wound determines its healing potential. Non-invasive measurement tools that can identify the vascularization are therefore considered to be of high diagnostic importance. Thermography is a non-invasive technique that can accurately measure the temperature distribution over a large skin or tissue area; the temperature is a measure of the perfusion of that area. The aim of this study was to investigate the clinimetric properties (i.e. reliability and validity) of thermography for measuring burn wound depth. In a cross-sectional study with 50 burn wounds of 35 patients, the inter-observer reliability and the validity between thermography and Laser Doppler Imaging were studied. With ROC curve analyses, the ΔT cut-off points for different burn wound depths were determined. The inter-observer reliability, expressed by an intra-class correlation coefficient of 0.99, was found to be excellent. In terms of validity, a ΔT cut-off point of 0.96°C (sensitivity 71%; specificity 79%) differentiates between a superficial partial-thickness and deep partial-thickness burn. A ΔT cut-off point of -0.80°C (sensitivity 70%; specificity 74%) could differentiate between a deep partial-thickness and a full-thickness burn wound. This study demonstrates that thermography is a reliable method in the assessment of burn wound depths. In addition, thermography was reasonably able to discriminate among different burn wound depths, indicating its potential use as a diagnostic tool in clinical burn practice.
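The two ΔT cut-off points in this record can be read as a simple three-way decision rule, sketched below in Python. The cut-off values come from the abstract; everything else is an assumption made for illustration: that a higher ΔT corresponds to a better perfused and therefore shallower burn, that values exactly at a cut-off fall into the deeper category, and the function and constant names. Given the reported sensitivities and specificities of roughly 70% to 79%, such a rule would misclassify a sizeable share of wounds and is not a clinical tool.

```python
# Cut-off values in degrees Celsius, as reported in the abstract.
SUPERFICIAL_VS_DEEP_PARTIAL_CUTOFF = 0.96
DEEP_PARTIAL_VS_FULL_CUTOFF = -0.80


def classify_burn_depth(delta_t):
    """Map a thermographic delta-T value to a burn depth category.

    Assumption (not stated in the abstract): larger delta-T means a
    shallower burn, and boundary values go to the deeper category.
    """
    if delta_t > SUPERFICIAL_VS_DEEP_PARTIAL_CUTOFF:
        return "superficial partial-thickness"
    if delta_t > DEEP_PARTIAL_VS_FULL_CUTOFF:
        return "deep partial-thickness"
    return "full-thickness"


# Hypothetical delta-T readings, chosen to land in each category.
for reading in (1.5, 0.0, -1.2):
    print(reading, classify_burn_depth(reading))
```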
Diagnostic validity of alternative manual stress radiographic technique detecting subtalar instability with concomitant ankle instability.

PubMed

Lee, Byung Hoon; Choi, Kyung-Hwa; Seo, Dong Yeon; Choi, Sang Min; Kim, Gab Lae

2016-04-01

To incorporate a diagnostic technique for measuring subtalar motion, namely "talar rotation", into the manual supination-anterior drawer stress radiographs for evaluation of the severity of rotational instability, and to determine its clinical relevance. Sixty-six patients with combined injuries of the anterior talofibular (ATFL) and calcaneofibular ligament (CFL) underwent three bilateral manual stress radiographs, and mean increments of anterior talar translation (mm), talar tilt (°), and talar rotation (%) in the injured ankle compared to the normal opposite side were measured with the technique. Intraobserver and interobserver reliability of each measure was assessed, and the difference in the degree of increments was compared according to the presence of additional cervical ligament insufficiency. Ankle stress radiographic intraobserver and interobserver agreement was ICC = 0.91 and 0.82 for talar rotation (%), ICC = 0.64 and 0.51 for anterior talar translation, and ICC = 0.78 and 0.71 for talar tilt angle, respectively. In group 2 including patients with combined injuries of the ATFL and CFL along with additional cervical ligament insufficiency, a significantly higher increment of talar rotation, mean 6.4% (SD 3.4%), was observed compared to that of talar rotation, mean 4.1% (SD 2.7%), in the other group (group 1) with an intact cervical ligament (p < 0.001). A new comprehensive stress radiographic technique for diagnosis of chronic lateral ankle instability presented in this study might be a reliable and representable measurement tool to assess additional injury or instability of the subtalar joint. Prospective cohort study, Level II.
Clinical performance of a free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced pediatric abdominal MR angiography

PubMed Central

Yousaf, Ufra; Hsiao, Albert; Cheng, Joseph Y.; Alley, Marcus T.; Lustig, Michael; Pauly, John M.; Vasanawala, Shreyas S.

2015-01-01

Background Pediatric contrast-enhanced MR angiography is often limited by respiration, other patient motion and compromised spatiotemporal resolution. Objective To determine the reliability of a free-breathing spatiotemporally accelerated 3-D time-resolved contrast enhanced MR angiography method for depicting abdominal arterial anatomy in young children. Materials and methods With IRB approval and informed consent, we retrospectively identified 27 consecutive children (16 males and 11 females; mean age: 3.8 years, range: 14 days to 8.4 years) referred for contrast enhanced MR angiography at our institution, who had undergone free-breathing spatiotemporally accelerated time-resolved contrast enhanced MR angiography studies. A radio-frequency-spoiled gradient echo sequence with Cartesian variable density k-space sampling and radial view ordering, intrinsic motion navigation and intermittent fat suppression was developed. Images were reconstructed with soft-gated parallel imaging locally low-rank method to achieve both motion correction and high spatiotemporal resolution. Quality of delineation of 13 abdominal arteries in the reconstructed images was assessed independently by two radiologists on a five-point scale. Ninety-five percent confidence intervals of the proportion of diagnostically adequate cases were calculated. Interobserver agreements were also analyzed. Results Eleven out of 13 arteries achieved acceptable image quality (mean score range: 3.9–5.0) for both readers. Fair to substantial interobserver agreement was reached on nine arteries. Conclusion Free-breathing spatiotemporally accelerated 3-D time-resolved contrast enhanced MR angiography frequently yields diagnostic image quality for most abdominal arteries for pediatric contrast enhanced MR angiography. PMID:26040509
Hematoxylin and Eosin Counterstaining Protocol for Immunohistochemistry Interpretation and Diagnosis.

PubMed

Grosset, Andrée-Anne; Loayza-Vega, Kevin; Adam-Granger, Éloïse; Birlea, Mirela; Gilks, Blake; Nguyen, Bich; Soucy, Geneviève; Tran-Thanh, Danh; Albadine, Roula; Trudel, Dominique

2017-12-21

Hematoxylin and eosin (H&E) staining is a well-established technique in histopathology. However, immunohistochemistry (IHC) interpretation is done exclusively with hematoxylin counterstaining. Our goal was to investigate the potential of H&E as counterstaining (H&E-IHC) to allow for visualization of a marker while confirming the diagnosis on the same slide. The quality of immunostaining and the fast-technical performance were the main criteria to select the final protocol. We stained multiple diagnostic tissues with class I IHC tests with different subcellular localization markers (anti-CK7, CK20, synaptophysin, CD20, HMB45, and Ki-67) and with double-staining on prostate tissues with anti-high molecular weight keratins/p63 (DAB detection) and p504s (alkaline phosphatase detection). To validate the efficacy of the counterstaining, we stained tissue microarrays from the Canadian Immunohistochemistry Quality Control (cIQc) with class II IHC tests (ER, PR, HER2, and p53 markers). Interobserver and intraobserver concordance was assessed by κ statistics. Excellent agreement of H&E-IHC interpretation was observed in comparison with standard IHC from our laboratory (κ, 0.87 to 1.00), and with the cIQc reference values (κ, 0.81 to 1.00). Interobserver and intraobserver agreement was excellent (κ, 0.89 to 1.00 and 0.87 to 1.00, respectively). We therefore show for the first time the potential of using H&E counterstaining for IHC interpretation. We recommend the H&E-IHC protocol to enhance diagnostic precision for the clinical workflow and research studies.
Training and quality assurance with the Structured Clinical Interview for DSM-IV (SCID-I/P).

PubMed

Ventura, J; Liberman, R P; Green, M F; Shaner, A; Mintz, J

1998-06-15

Accuracy in psychiatric diagnosis is critical for evaluating the suitability of the subjects for entry into research protocols and for establishing comparability of findings across study sites. However, training programs in the use of diagnostic instruments for research projects are not well systematized. Furthermore, little information has been published on the maintenance of interrater reliability of diagnostic assessments. At the UCLA Research Center for Major Mental Illnesses, a Training and Quality Assurance Program for SCID interviewers was used to evaluate interrater reliability and diagnostic accuracy. Although clinically experienced interviewers achieved better interrater reliability and overall diagnostic accuracy than neophyte interviewers, both groups were able to achieve and maintain high levels of interrater reliability, diagnostic accuracy, and interviewer skill. At the first quality assurance check after training, there were no significant differences between experienced and neophyte interviewers in interrater reliability or diagnostic accuracy. Standardization of training and quality assurance procedures within and across research projects may make research findings from study sites more comparable.

Systematic Review of the Diagnostic Accuracy and Therapeutic Effectiveness of Sacroiliac Joint Interventions.

PubMed

Simopoulos, Thomas T; Manchikanti, Laxmaiah; Gupta, Sanjeeva; Aydin, Steve M; Kim, Chong Hwan; Solanki, Daneshvari; Nampiaparampil, Devi E; Singh, Vijay; Staats, Peter S; Hirsch, Joshua A

2015-01-01

The sacroiliac joint is well known as a cause of low back and lower extremity pain. Prevalence estimates are 10% to 25% in patients with persistent axial low back pain without disc herniation, discogenic pain, or radiculitis based on multiple diagnostic studies and systematic reviews. However, at present there are no definitive management options for treating sacroiliac joint pain. To evaluate the diagnostic accuracy and therapeutic effectiveness of sacroiliac joint interventions. A systematic review of the diagnostic accuracy and therapeutic effectiveness of sacroiliac joint interventions. The available literature on diagnostic and therapeutic sacroiliac joint interventions was reviewed. The quality assessment criteria utilized were the Quality Appraisal of Reliability Studies (QAREL) checklist for diagnostic accuracy studies, Cochrane review criteria to assess sources of risk of bias, and Interventional Pain Management Techniques-Quality Appraisal of Reliability and Risk of Bias Assessment (IPM-QRB) criteria for randomized therapeutic trials and Interventional Pain Management Techniques-Quality Appraisal of Reliability and Risk of Bias Assessment for Nonrandomized Studies (IPM-QRBNR) for observational therapeutic assessments. The level of evidence was based on a best evidence synthesis with modified grading of qualitative evidence from Level I to Level V. Data sources included relevant literature published from 1966 through March 2015 that were identified through searches of PubMed and EMBASE, manual searches of the bibliographies of known primary and review articles, and all other sources. For both the diagnostic accuracy assessment and the therapeutic modalities, the primary outcome measures were pain relief and improvement in functional status. A total of 11 diagnostic accuracy studies and 14 therapeutic studies were included. The evidence for diagnostic accuracy is Level II for dual diagnostic blocks with at least 70% pain relief as the criterion standard and Level III evidence for single diagnostic blocks with at least 75% pain relief as the criterion standard. The evidence for cooled radiofrequency neurotomy in managing sacroiliac joint pain is Level II to III. The evidence for conventional radiofrequency neurotomy, intraarticular steroid injections, and periarticular injections with steroids or botulinum toxin is limited: Level III or IV. The limitations of this systematic review include inconsistencies in diagnostic accuracy studies with a paucity of high quality, replicative, and consistent literature. The limitations for therapeutic interventions include variations in technique, variable diagnostic standards for inclusion criteria, and variable results. The evidence for the diagnostic accuracy and therapeutic effectiveness of sacroiliac joint interventions varied from Level II to Level IV.
Accuracy of contrast-enhanced spectral mammography for estimating residual tumor size after neoadjuvant chemotherapy in patients with breast cancer: a feasibility study

PubMed Central

Barra, Filipe Ramos; de Souza, Fernanda Freire; Camelo, Rosimara Eva Ferreira Almeida; Ribeiro, Andrea Campos de Oliveira; Farage, Luciano

2017-01-01

Objective To assess the feasibility of contrast-enhanced spectral mammography (CESM) of the breast for assessing the size of residual tumors after neoadjuvant chemotherapy (NAC). Materials and methods In breast cancer patients who underwent NAC between 2011 and 2013, we evaluated residual tumor measurements obtained with CESM and full-field digital mammography (FFDM). We determined the concordance between the methods, as well as their level of agreement with the pathology. Three radiologists analyzed eight CESM and FFDM measurements separately, considering the size of the residual tumor at its largest diameter and correlating it with that determined in the pathological analysis. Interobserver agreement was also evaluated. Results The sensitivity, specificity, positive predictive value, and negative predictive value were higher for CESM than for FFDM (83.33%, 100%, 100%, and 66% vs. 50%, 50%, 50%, and 25%, respectively). The CESM measurements showed a strong, consistent correlation with the pathological findings (correlation coefficient = 0.76-0.92; intraclass correlation coefficient = 0.692-0.886). The correlation between the FFDM measurements and the pathological findings was not statistically significant, with questionable consistency (intraclass correlation coefficient = 0.488-0.598). Agreement with the pathological findings was narrower for CESM measurements than for FFDM measurements. Interobserver agreement was higher for CESM than for FFDM (0.94 vs. 0.88). Conclusion CESM is a feasible means of evaluating residual tumor size after NAC, showing a good correlation and good agreement with pathological findings. For CESM measurements, the interobserver agreement was excellent. PMID:28894329
Validity and Diagnostic Accuracy of Scores from the Autism Diagnostic Observation Schedule-Generic

ERIC Educational Resources Information Center

Reid, Melissa A.

2012-01-01

The purpose of this study was to examine the internal structure, relationships with other variables, and diagnostic accuracy of scores on the Autism Diagnostic Observation Schedule-Generic (ADOS-G; Lord et al., 1999) for the purpose of diagnostic decision-making. Participants were 462 children enrolled in a public school district in the southern…
Reproducibility and diagnostic performance of shear wave elastography in evaluating breast solid mass.

PubMed

Hong, Sun; Woo, Ok Hee; Shin, Hye Seon; Hwang, Soon-Young; Cho, Kyu Ran; Seo, Bo Kyoung

Shear wave elastography (SWE) was performed independently by two radiologists in 264 solid breast masses. The images were reviewed for color overlay pattern (COP) classification by the two radiologists, double blinded to any information. The interobserver agreement of the COP was almost perfect (κ=0.908) and high in Emax (ICC=0.89). The AUC value of the COP (0.954) was significantly higher than that of Emax (0.915) (p=0.002) but not significantly different from that of Emax combined with COP (0.957) (p=0.098). The SWE color overlay pattern and Emax of breast masses were highly reproducible. The COP had better diagnostic ability than Emax, suggesting that COP may be a more reliable parameter for solid breast mass evaluation. Copyright © 2017 Elsevier Inc. All rights reserved.
Diagnostic accuracy of histopathologic and cytopathologic examination of Aspergillus species.

PubMed

Shah, Akeesha A; Hazen, Kevin C

2013-01-01

To assess the diagnostic accuracy of histopathologic and cytopathologic examination (HCE) of Aspergillus species (spp), we performed an 11-year retrospective review to correlate surgical/cytology cases with a diagnosis of Aspergillus spp with their concurrent fungal culture results. Diagnostic accuracy was defined as the percentage of cases with culture-proven Aspergillus spp divided by the number of cases diagnosed as Aspergillus spp on HCE that had growth on fungal culture. Ninety surgical/cytology cases with concurrent fungal culture were reviewed, 58 of which grew a fungal organism. Of these 58 cases, 45 grew an Aspergillus spp, whereas 13 grew an organism other than Aspergillus spp, including both common (Scedosporium, Fusarium, and Paecilomyces spp) and uncommon mimickers (Trichosporon loubieri), resulting in a diagnostic accuracy of 78%. The low diagnostic accuracy indicates that several fungal organisms can morphologically mimic Aspergillus spp and can only be distinguished by fungal culture and DNA sequencing.
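The 78% figure in the abstract above follows from its own definition: culture-proven Aspergillus cases divided by HCE-diagnosed cases that grew any organism. A minimal Python sketch of that arithmetic, using only the two counts quoted in the abstract (the function name is a hypothetical label, not something from the study):

```python
def culture_confirmed_fraction(confirmed: int, culture_positive: int) -> float:
    """Fraction of culture-positive cases in which culture confirmed the
    morphologic diagnosis (the abstract's definition of diagnostic accuracy)."""
    if culture_positive <= 0:
        raise ValueError("culture_positive must be a positive count")
    return confirmed / culture_positive


# Counts quoted in the abstract: 58 cases grew a fungal organism,
# and 45 of those grew an Aspergillus species.
accuracy = culture_confirmed_fraction(45, 58)
print(f"Diagnostic accuracy: {accuracy:.1%}")  # 45/58, reported as 78%
```

The 32 cases without fungal growth do not enter the denominator, so the figure describes only culture-positive specimens.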
Quality Assessment of Comparative Diagnostic Accuracy Studies: Our Experience Using a Modified Version of the QUADAS-2 Tool

ERIC Educational Resources Information Center

Wade, Ros; Corbett, Mark; Eastwood, Alison

2013-01-01

Assessing the quality of included studies is a vital step in undertaking a systematic review. The recently revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool (QUADAS-2), which is the only validated quality assessment tool for diagnostic accuracy studies, does not include specific criteria for assessing comparative studies. As…
The 2014 updated version of the Confusion Assessment Method for the Intensive Care Unit compared to the 5th version of the Diagnostic and Statistical Manual of Mental Disorders and other current methods used by intensivists.

PubMed

Chanques, Gérald; Ely, E Wesley; Garnier, Océane; Perrigault, Fanny; Eloi, Anaïs; Carr, Julie; Rowan, Christine M; Prades, Albert; de Jong, Audrey; Moritz-Gasser, Sylvie; Molinari, Nicolas; Jaber, Samir

2018-03-01

One third of patients admitted to an intensive care unit (ICU) will develop delirium. However, delirium is under-recognized by bedside clinicians without the use of delirium screening tools, such as the Intensive Care Delirium Screening Checklist (ICDSC) or the Confusion Assessment Method for the ICU (CAM-ICU). The CAM-ICU was updated in 2014 to improve its use by clinicians throughout the world. It has never been validated against the new reference standard, the Diagnostic and Statistical Manual of Mental Disorders 5th version (DSM-5). We conducted a prospective psychometric study in a 16-bed medical-surgical ICU of a French academic hospital, to measure the diagnostic performance of the 2014 updated CAM-ICU compared to the DSM-5 as the reference standard. We included consecutive adult patients with a Richmond Agitation Sedation Scale (RASS) ≥ -3, without preexisting cognitive disorders, psychosis or cerebral injury. Delirium was independently assessed by neuropsychological experts using an operationalized approach to DSM-5, by investigators using the CAM-ICU and the ICDSC, by bedside clinicians and by ICU patients. The sensitivity, specificity, positive and negative predictive values were calculated considering neuropsychologist DSM-5 assessments as the reference standard (primary endpoint). CAM-ICU inter-observer agreement, as well as that between delirium diagnosis methods and the reference standard, was summarized using κ coefficients, which were subsequently compared using the Z-test. Delirium was diagnosed by experts in 38% of the 108 patients included for analysis. The CAM-ICU had a sensitivity of 83%, a specificity of 100%, a positive predictive value of 100% and a negative predictive value of 91%. Compared to the reference standard, the CAM-ICU had a significantly (p < 0.05) higher agreement (κ = 0.86 ± 0.05) than the physicians', residents' and nurses' diagnoses (κ = 0.65 ± 0.09; 0.63 ± 0.09; 0.61 ± 0.09, respectively), as well as the patient's own impression of feeling delirious (κ = 0.02 ± 0.11). Differences between the ICDSC (κ = 0.69 ± 0.07) and CAM-ICU were not significant (p = 0.054). The CAM-ICU demonstrated a high reliability for inter-observer agreement (κ = 0.87 ± 0.06). The 2014 updated version of the CAM-ICU is valid according to DSM-5 criteria and reliable regarding inter-observer agreement in a research setting. Delirium remains under-recognized by bedside clinicians.
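The four accuracy measures reported for the CAM-ICU all derive from a single 2x2 table of index test against reference standard. The Python sketch below shows the arithmetic. The abstract does not report the cell counts; the ones used here (34, 0, 7, 67) are an assumption, reconstructed only because they are consistent with 108 patients, a 38% delirium prevalence and the reported percentages.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures from a 2x2 table (index test vs. reference)."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients the test detects
        "specificity": tn / (tn + fp),  # non-diseased patients the test clears
        "ppv": tp / (tp + fp),          # positive results that are correct
        "npv": tn / (tn + fn),          # negative results that are correct
    }


# ASSUMED counts, not taken from the abstract: chosen to match 108 patients,
# about 38% with delirium (41 patients), and the reported 83/100/100/91%.
metrics = diagnostic_metrics(tp=34, fp=0, fn=7, tn=67)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With no false positives, specificity and positive predictive value are both 100%, while the seven assumed false negatives account for the lower sensitivity and negative predictive value.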
Do Orthopaedic Oncologists Agree on the Diagnosis and Treatment of Cartilage Tumors of the Appendicular Skeleton?

PubMed

Zamora, Tomas; Urrutia, Julio; Schweitzer, Daniel; Amenabar, Pedro Pablo; Botello, Eduardo

2017-09-01

Distinguishing a benign enchondroma from a low-grade chondrosarcoma is a common diagnostic challenge for orthopaedic oncologists. Low interrater agreement has been observed for the diagnosis of cartilaginous neoplasms among radiologists and pathologists, but, to our knowledge, no study has evaluated inter- and intraobserver agreement among orthopaedic oncologists grading these lesions using initial clinical and imaging information. Determining such agreement is important since it reflects the certainty in the diagnosis by orthopaedic oncologists. Agreement also is important as it will guide future treatment and prognosis, considering that there is no gold standard for diagnosis of these lesions. (1) To determine inter- and intraobserver agreement among a multinational panel of expert orthopaedic oncologists in diagnosing cartilaginous neoplasms based on their assessment of clinical symptoms and imaging at diagnosis. (2) To describe the most important clinical and imaging features that experts use during the initial diagnostic process. (3) To determine interobserver agreement for proposed initial treatment strategies for cartilaginous neoplasms by this panel of evaluators. Thirty-nine patients with intramedullary cartilaginous neoplasms of the appendicular skeleton of various histopathologic grades were selected and classified as having benign, low-grade malignant, or intermediate- or high-grade malignant neoplasms by 10 experienced orthopaedic oncologists based on clinical and imaging information. Additionally, they chose the three most important clinical or imaging features for the diagnosis of these neoplasms, and they proposed a treatment strategy for each patient. The Kappa coefficient (κ) was used to determine inter- and intraobserver agreement. Inter- and intraobserver agreements were only fair to good, κ = 0.44 (95% CI, 0.41-0.48) and κ = 0.62 (95% CI, 0.52-0.72), respectively. The three factors most frequently identified as helpful in making the diagnosis by our panel were cortical involvement in 65% of evaluations (253/390), neoplasm size in 51% (198/390), and pain in 50% (194/390). The interobserver agreement for the proposed initial treatment strategy after diagnosis was poor (κ = 0.21; 95% CI, 0.18-0.24). This study showed barely fair interobserver and fair to good intraobserver agreement for grading of intramedullary cartilaginous neoplasms by orthopaedic oncologists using initial clinical and imaging findings. These results reflect the insufficient guidance for interpreting clinical and imaging features, and the limitations of the systems we use today when making these diagnoses. In the same way, they generate concern for the implications that this may have on different treatment strategies and the future prognosis of our patients. Future studies should build on these observations and focus on clarifying our criteria of diagnosis so that treatment recommendations are standardized regardless of the treating institution or oncologist. Level III, diagnostic study.
Efficient strategies to find diagnostic test accuracy studies in kidney journals.

PubMed

Rogerson, Thomas E; Ladhani, Maleeka; Mitchell, Ruth; Craig, Jonathan C; Webster, Angela C

2015-08-01

Nephrologists looking for quick answers to diagnostic clinical questions in MEDLINE can use a range of published search strategies or Clinical Query limits to improve the precision of their searches. We aimed to evaluate existing search strategies for finding diagnostic test accuracy studies in nephrology journals. We assessed the accuracy of 14 search strategies for retrieving diagnostic test accuracy studies from three nephrology journals indexed in MEDLINE. Two investigators hand searched the same journals to create a reference set of diagnostic test accuracy studies to compare search strategy results against. We identified 103 diagnostic test accuracy studies, accounting for 2.1% of all studies published. The most specific search strategy was the Narrow Clinical Queries limit (sensitivity: 0.20, 95% CI 0.13-0.29; specificity: 0.99, 95% CI 0.99-0.99). Using the Narrow Clinical Queries limit, a searcher would need to screen three (95% CI 2-6) articles to find one diagnostic study. The most sensitive search strategy was van der Weijden 1999 Extended (sensitivity: 0.95; 95% CI 0.89-0.98; specificity 0.55, 95% CI 0.53-0.56) but required a searcher to screen 24 (95% CI 23-26) articles to find one diagnostic study. Bachmann 2002 was the best balanced search strategy, which was sensitive (0.88, 95% CI 0.81-0.94), but also specific (0.74, 95% CI 0.73-0.75), with a number needed to screen of 15 (95% CI 14-17). Diagnostic studies are infrequently published in nephrology journals. The addition of a strategy for diagnostic studies to a subject search strategy in MEDLINE may reduce the records needed to screen while preserving adequate search sensitivity for routine clinical use. © 2015 Asian Pacific Society of Nephrology.
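The number needed to screen in the abstract above is the reciprocal of search precision, so it can be approximated from a strategy's sensitivity, its specificity and the share of diagnostic studies among all records. The Python sketch below does this with the rounded values quoted in the abstract. Using 2.1% as the prevalence is an assumption about how the authors counted records, so the results only approximate the published figures of 3, 24 and 15.

```python
def number_needed_to_screen(sensitivity: float, specificity: float,
                            prevalence: float) -> float:
    """Records a searcher must read per relevant record found (1 / precision)."""
    true_pos = sensitivity * prevalence               # relevant and retrieved
    false_pos = (1 - specificity) * (1 - prevalence)  # irrelevant but retrieved
    return (true_pos + false_pos) / true_pos


PREVALENCE = 0.021  # diagnostic studies as a share of all studies (from the abstract)

# (sensitivity, specificity) pairs as quoted in the abstract.
strategies = {
    "Narrow Clinical Queries": (0.20, 0.99),
    "van der Weijden 1999 Extended": (0.95, 0.55),
    "Bachmann 2002": (0.88, 0.74),
}
nns = {name: number_needed_to_screen(se, sp, PREVALENCE)
       for name, (se, sp) in strategies.items()}
for name, value in nns.items():
    print(f"{name}: about {value:.1f} records per diagnostic study")
```

Because prevalence is low, even a small drop in specificity multiplies the false positives, which is why the most sensitive strategy is also the most expensive to screen.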
Meta-epidemiologic study showed frequent time trends in summary estimates from meta-analyses of diagnostic accuracy studies.

PubMed

Cohen, Jérémie F; Korevaar, Daniël A; Wang, Junfeng; Leeflang, Mariska M; Bossuyt, Patrick M

2016-09-01

To evaluate changes over time in summary estimates from meta-analyses of diagnostic accuracy studies. We included 48 meta-analyses from 35 MEDLINE-indexed systematic reviews published between September 2011 and January 2012 (743 diagnostic accuracy studies; 344,015 participants). Within each meta-analysis, we ranked studies by publication date. We applied random-effects cumulative meta-analysis to follow how summary estimates of sensitivity and specificity evolved over time. Time trends were assessed by fitting a weighted linear regression model of the summary accuracy estimate against rank of publication. The median of the 48 slopes was -0.02 (-0.08 to 0.03) for sensitivity and -0.01 (-0.03 to 0.03) for specificity. Twelve of 96 (12.5%) time trends in sensitivity or specificity were statistically significant. We found a significant time trend in at least one accuracy measure for 11 of the 48 (23%) meta-analyses. Time trends in summary estimates are relatively frequent in meta-analyses of diagnostic accuracy studies. Results from early meta-analyses of diagnostic accuracy studies should be considered with caution. Copyright © 2016 Elsevier Inc. All rights reserved.
Assessment of interobserver concordance in polysomnography scoring of sleep bruxism

PubMed Central

Ferraz, Otávio; de Moura Guimarães, Thais; Maluly Filho, Milton; Dal-Fabbro, Cibele; Abraão Crosara Cunha, Thays; Cristina Lotaif, Ana; Cristina Barros Schütz, Teresa; Santos-Silva, Rogério; Tufik, Sergio; Bittencourt, Lia

2015-01-01

Introduction Objective evaluation of sleep bruxism (SB) using whole-night polysomnography (PSG) is relevant for diagnostic confirmation. Nevertheless, the PSG electromyogram (EMG) scoring may give rise to controversy, particularly when audiovisual monitoring is not performed. Therefore, the present study assessed the concordance between two independent scorers in the visual scoring of SB on PSG performed without audiovisual monitoring. Methods Fifty-six PSG tests from individuals with a clinical history and polysomnographic criteria of SB were scored. In addition to the protocol of conventional whole-night PSG, electrodes were also placed bilaterally on the masseter and temporal muscles. Visual EMG scoring without audiovisual monitoring was performed by two independent scorers (Dentist 1 and Dentist 2) according to the recommendations formulated in the AASM manual (2007). Kendall's tau correlation was used to assess interobserver concordance for the variables “total duration of events (seconds)”, “shortest events”, “longest events” and the index for each phasic, tonic or mixed event. Results The correlation was positive and significant for all the investigated variables (τ > 0.54). Conclusion Good inter-examiner concordance was found in SB scoring in the absence of audiovisual monitoring. PMID:26779318
Sample size in studies on diagnostic accuracy in ophthalmology: a literature survey.

PubMed

Bochmann, Frank; Johnson, Zoe; Azuara-Blanco, Augusto

2007-07-01

To assess the sample sizes used in studies on diagnostic accuracy in ophthalmology. Design and sources: A survey of literature published in 2005. The frequency of reporting calculations of sample sizes and the samples' sizes were extracted from the published literature. A manual search of five leading clinical journals in ophthalmology with the highest impact (Investigative Ophthalmology and Visual Science, Ophthalmology, Archives of Ophthalmology, American Journal of Ophthalmology and British Journal of Ophthalmology) was conducted by two independent investigators. A total of 1698 articles were identified, of which 40 studies were on diagnostic accuracy. One study reported that sample size was calculated before initiating the study. Another study reported consideration of sample size without calculation. The mean (SD) sample size of all diagnostic studies was 172.6 (218.9). The median prevalence of the target condition was 50.5%. Only a few studies consider sample size in their methods. Inadequate sample sizes in diagnostic accuracy studies may result in misleading estimates of test accuracy. An improvement over the current standards on the design and reporting of diagnostic studies is warranted.
Interobserver agreement for post mortem renal histopathology and diagnosis of acute tubular necrosis in critically ill patients.

PubMed

Glassford, Neil J; Skene, Alison; Guardiola, Maria B; Chan, Matthew J; Bagshaw, Sean M; Bellomo, Rinaldo; Solez, Kim

2017-12-01

The renal histopathology of critically ill patients dying with acute kidney injury (AKI) in intensive care units of high income countries remains uncertain. Retrospective observational assessment of interobserver agreement in the reporting of renal post mortem histopathology, and the ability of pathologists blinded to the clinical context to independently identify the presence of pre-mortem AKI from digital images of histological sections from 34 critically ill patients dying in teaching hospitals in Australia and Canada. We identified a heterogeneous cohort with a median age of 65 years (interquartile range [IQR], 56.5-77), APACHE II score of 27 (IQR, 19-33), and sepsis as the most common admission diagnosis (12/34; 35%). The most common proximate causes of death were cardiovascular (19/34; 56%) and respiratory (7/34; 21%) failure. AKI was common, with 23 patients (68%) developing RIFLE-F AKI, and 21 patients (62%) receiving renal replacement therapy. Structured reporting for tubular inflammation showed excellent agreement (kappa = 1), but no other subdomain demonstrated better than moderate agreement (kappa < 0.6). Only fair agreement (55.9% of cases; kappa = 0.23) was demonstrated on the diagnosis of moderate to severe acute tubular necrosis (ATN). Pathologist A predicted RIFLE-I or worse AKI with the diagnosis of ATN, with an overall accuracy of 61.8%; pathologist B predicted AKI with an accuracy of 35.3%. Post mortem assessment of the renal histopathology in critically ill patients is neither robust nor reproducible; independent pathologists agree poorly on the diagnosis of ATN, and their structural assessment appears dissociated from ante-mortem renal function.
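The gap between 55.9% raw agreement and a kappa of only 0.23 arises because Cohen's kappa subtracts the agreement expected by chance from the two raters' marginal frequencies. The Python sketch below makes that correction explicit. The 2x2 counts are hypothetical: the abstract does not give the actual table, and these were chosen only so that the arithmetic reproduces the two reported figures for 34 patients.

```python
def cohen_kappa(table: list[list[int]]) -> float:
    """Cohen's kappa for two raters; table[i][j] counts cases that rater A
    placed in category i and rater B placed in category j."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL counts (not from the study). Rows: pathologist A (ATN, no ATN);
# columns: pathologist B (ATN, no ATN). The raters agree on 9 + 10 = 19 of 34.
table = [[9, 14],
         [1, 10]]
raw_agreement = (table[0][0] + table[1][1]) / 34
kappa = cohen_kappa(table)
print(f"raw agreement {raw_agreement:.1%}, kappa {kappa:.2f}")
```

In this illustrative table one rater diagnoses ATN far more often than the other, so much of the raw agreement is what chance alone would produce, and kappa falls into the "fair" range.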
Ultrasound detection of cartilage calcification at knee level in calcium pyrophosphate deposition disease.

PubMed

Gutierrez, Marwin; Di Geso, Luca; Salaffi, Fausto; Carotti, Marina; Girolimetti, Rita; De Angelis, Rossella; Filippucci, Emilio; Grassi, Walter

2014-01-01

To determine the sensitivity, specificity, and accuracy of ultrasound (US) in the detection of cartilage calcification at knee level in patients with calcium pyrophosphate deposition disease (CPDD) and to assess the interobserver reliability. Seventy-four CPDD patients and 83 controls with other chronic arthritis were included. All patients underwent a clinical examination, synovial fluid analysis, and radiographic assessment of the knee. US examinations were performed in order to detect hyperechoic spots within the hyaline cartilage layer and hyperechoic areas within the meniscal fibrocartilage. Twenty patients were assessed by 2 operators in order to calculate the interobserver reliability. A total of 314 knees in 157 patients (74 with CPDD, 19 with rheumatoid arthritis, 17 with spondyloarthritis, 32 with osteoarthritis, and 15 with gout) were assessed. In the 74 patients with CPDD, hyaline cartilage spots were detected by US in at least 1 knee in 44 patients (59.5%), whereas radiography detected hyaline cartilage spots in 34 patients (45.9%) (P < 0.001). Meniscal fibrocartilage calcifications were detected by US in 67 of the 74 CPDD patients (90.5%), whereas conventional radiography detected calcifications in 62 patients (83.7%) (P = 0.011). The criterion validity expressed as percentage of sensitivity, specificity, and accuracy of US in the detection of articular cartilage calcification was high. Both kappa values and overall agreement percentages showed moderate to excellent agreement. US is an accurate and reliable imaging technique in the detection of articular cartilage calcification at knee level in patients with CPDD. Copyright © 2014 by the American College of Rheumatology.
Error Analysis: How Precise is Fused Deposition Modeling in Fabrication of Bone Models in Comparison to the Parent Bones?

PubMed

Reddy, M V; Eachempati, Krishnakiran; Gurava Reddy, A V; Mugalur, Aakash

2018-01-01

Rapid prototyping (RP) is used widely in dental and faciomaxillary surgery with anecdotal uses in orthopedics. The purview of RP in orthopedics is vast. However, there is no error analysis reported in the literature on bone models generated using office-based RP. This study evaluates the accuracy of fused deposition modeling (FDM) using standard tessellation language (STL) files and errors generated during the fabrication of bone models. Nine dry bones were selected and were computed tomography (CT) scanned. STL files were procured from the CT scans and three-dimensional (3D) models of the bones were printed using our in-house FDM based 3D printer using Acrylonitrile Butadiene Styrene (ABS) filament. Measurements were made on the bone and 3D models according to data collection procedures for forensic skeletal material. Statistical analysis was performed to establish interobserver correlation for measurements on dry bones and the 3D bone models. Statistical analysis was performed using SPSS version 13.0 software to analyze the collected data. The inter-observer reliability was established using the intraclass correlation coefficient for both the dry bones and the 3D models. The mean absolute difference was 0.4, which is minimal. The 3D models are comparable to the dry bones. STL file dependent FDM using ABS material produces near-anatomical 3D models. The high 3D accuracy holds promise in the clinical scenario for preoperative planning, mock surgery, and choice of implants and prostheses, especially in complicated acetabular trauma and complex hip surgeries.
Accuracy of both virtual and printed 3-dimensional models for volumetric measurement of alveolar clefts before grafting with alveolar bone compared with a validated algorithm: a preliminary investigation.

PubMed

Kasaven, C P; McIntyre, G T; Mossey, P A

2017-01-01

Our objective was to assess the accuracy of virtual and printed 3-dimensional models derived from cone-beam computed tomographic (CT) scans to measure the volume of alveolar clefts before bone grafting. Fifteen subjects with unilateral cleft lip and palate had i-CAT cone-beam CT scans recorded at 0.2mm voxel and sectioned transversely into slices 0.2mm thick using i-CAT Vision. Volumes of alveolar clefts were calculated using first a validated algorithm; secondly, commercially-available virtual 3-dimensional model software; and finally 3-dimensional printed models, which were scanned with microCT and analysed using 3-dimensional software. For inter-observer reliability, a two-way mixed model intraclass correlation coefficient (ICC) was used to evaluate the reproducibility of identification of the cranial and caudal limits of the clefts among three observers. We used a Friedman test to assess the significance of differences among the methods, and probabilities of less than 0.05 were accepted as significant. Inter-observer reliability was almost perfect (ICC=0.987). There were no significant differences among the three methods. Virtual and printed 3-dimensional models were as precise as the validated computer algorithm in the calculation of volumes of the alveolar cleft before bone grafting, but virtual 3-dimensional models were the most accurate with the smallest 95% CI and, subject to further investigation, could be a useful adjunct in clinical practice. Copyright © 2016 The British Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
Mitotic rate in primary melanoma: interobserver and intraobserver reliability, analyzed using H&E sections and immunohistochemistry.

PubMed

Garbe, Claus; Eigentler, Thomas K; Bauer, Jürgen; Blödorn-Schlicht, Norbert; Cerroni, Lorenzo; Fend, Falko; Hantschke, Markus; Kurschat, Peter; Kutzner, Heinz; Metze, Dieter; Mielke, Volker; Preßler, Harald; Reusch, Michael; Reusch, Ursula; Stadler, Rudolf; Tronnier, Michael; Yazdi, Amir; Metzler, Gisela

2016-09-01

In 2009, the AJCC issued a revised melanoma staging system. In addition to tumor thickness and ulceration, the mitotic rate was introduced as the third major prognostic parameter for the classification of primary cutaneous melanoma. Given that, according to the 2009 AJCC classification, the detection of one or more dermal tumor mitoses leads to an upstaging - from stage Ia to Ib - of melanomas with a tumor thickness of ≤ 1.0 mm, we set out to investigate the reproducibility of this new parameter. In order to assess interobserver reliability, 17 dermatopathologists and pathologists - all well versed in the diagnosis of cutaneous melanoma - analyzed the mitotic rate in 15 thin primary cutaneous melanomas (mean tumor thickness 0.91 mm) using identical slides. Mitotic rates were determined on H&E and phosphohistone H3 (Ser10)-stained samples. Without knowledge of their previous assessment, five of the aforementioned examiners reevaluated the samples after more than one year in order to ascertain intraobserver reliability. Interobserver reliability of the mitotic rate in thin primary melanomas is disappointing and independent of whether H&E or immunohistochemically stained samples are used (kappa value: 0.088 [H&E], 0.154 [IH], respectively). Kappa values improved to 0.345 (H&E) and 0.403 (IH) when using a cutoff of 0/1 vs. 2+ mitoses. Similarly unsatisfactory, kappa values for intraobserver reliability ranged from 0.18 to 0.348, depending on the individual examiner. Given the unsatisfactory reproducibility and large variations in assessing the mitotic rate, it remains a matter of debate whether this diagnostic parameter should play a role in therapeutic decisions. © 2016 Deutsche Dermatologische Gesellschaft (DDG). Published by John Wiley & Sons Ltd.
Improved inter-observer agreement of an expert review panel in an oncology treatment trial--Insights from a structured interventional process.

PubMed

Nestle, Ursula; Rischke, Hans Christian; Eschmann, Susanne Martina; Holl, Gabriele; Tosch, Marco; Miederer, Matthias; Plotkin, Michail; Essler, Markus; Puskas, Cornelia; Schimek-Jasch, Tanja; Duncker-Rohr, Viola; Rühl, Friederike; Leifert, Anja; Mix, Michael; Grosu, Anca-Ligia; König, Jochem; Vach, Werner

2015-11-01

Oncologic imaging is key to successful cancer treatment. While the quality assurance (QA) of image acquisition protocols has already received attention, QA of reading and reporting still offers room for improvement. The latter was addressed in the context of a prospective multicentre trial on fluoro-deoxyglucose (FDG)-positron-emission tomography (PET)/CT-based chemoradiotherapy for locally advanced non-small cell lung cancer (NSCLC). An expert panel was prospectively installed performing blinded reviews of mediastinal NSCLC involvement in FDG-PET/CT. Due to a high initial reporting inter-observer disagreement, the independent data monitoring committee (IDMC) triggered an interventional harmonisation process, which overall involved 11 experts making 6855 blinded diagnostic statements. After assessing the baseline inter-observer agreement (IOA) of a blinded re-review (phase 1), a discussion process led to improved reading criteria (phase 2). Those underwent a validation study (phase 3) and were then implemented into the study routine. After 2 months (phase 4) and 1 year (phase 5), the IOA was reassessed. The initial overall IOA was moderate (kappa 0.52 CT; 0.53 PET). After improvement of reading criteria, the kappa values improved substantially (kappa 0.61 CT; 0.66 PET), which was retained until the late reassessment (kappa 0.71 CT; 0.67 PET). Subjective uncertainty was highly predictive for low IOA. The IOA of an expert panel was significantly improved by a structured interventional harmonisation process which could be a model for future clinical trials. Furthermore, the low IOA in reporting nodal involvement in NSCLC may bear consequences for individual patient care. Copyright © 2015 Elsevier Ltd. All rights reserved.
International perception of lung sounds: a comparison of classification across some European borders

PubMed Central

Aviles-Solis, Juan Carlos; Vanbelle, Sophie; Halvorsen, Peder A; Francis, Nick; Cals, Jochen W L; Andreeva, Elena A; Marques, Alda; Piirilä, Päivi; Pasterkamp, Hans; Melbye, Hasse

2017-01-01

Introduction Lung auscultation is helpful in the diagnosis of lung and heart diseases; however, the diagnostic value of lung sounds may be questioned due to interobserver variation. This situation may also impair clinical research in this area to generate evidence-based knowledge about the role that chest auscultation has in a modern clinical setting. The recording and visual display of lung sounds is a method that is both repeatable and feasible to use in large samples, and the aim of this study was to evaluate interobserver agreement using this method. Methods With a microphone in a stethoscope tube, we collected digital recordings of lung sounds from six sites on the chest surface in 20 subjects aged 40 years or older with and without lung and heart diseases. A total of 120 recordings and their spectrograms were independently classified by 28 observers from seven different countries. We employed absolute agreement and kappa coefficients to explore interobserver agreement in classifying crackles and wheezes within and between subgroups of four observers. Results When evaluating agreement on crackles (inspiratory or expiratory) in each subgroup, observers agreed on between 65% and 87% of the cases. Conger’s kappa ranged from 0.20 to 0.58 and four out of seven groups reached a kappa of ≥0.49. In the classification of wheezes, we observed a probability of agreement between 69% and 99.6% and kappa values from 0.09 to 0.97. Four out of seven groups reached a kappa ≥0.62. Conclusions The kappa values we observed in our study ranged widely but, when addressing its limitations, we find the method of recording and presenting lung sounds with spectrograms sufficient for both clinic and research. Standardisation of terminology across countries would improve international communication on lung auscultation findings. PMID:29435344
International perception of lung sounds: a comparison of classification across some European borders.

PubMed

Aviles-Solis, Juan Carlos; Vanbelle, Sophie; Halvorsen, Peder A; Francis, Nick; Cals, Jochen W L; Andreeva, Elena A; Marques, Alda; Piirilä, Päivi; Pasterkamp, Hans; Melbye, Hasse

2017-01-01

Lung auscultation is helpful in the diagnosis of lung and heart diseases; however, the diagnostic value of lung sounds may be questioned due to interobserver variation. This situation may also impair clinical research in this area to generate evidence-based knowledge about the role that chest auscultation has in a modern clinical setting. The recording and visual display of lung sounds is a method that is both repeatable and feasible to use in large samples, and the aim of this study was to evaluate interobserver agreement using this method. With a microphone in a stethoscope tube, we collected digital recordings of lung sounds from six sites on the chest surface in 20 subjects aged 40 years or older with and without lung and heart diseases. A total of 120 recordings and their spectrograms were independently classified by 28 observers from seven different countries. We employed absolute agreement and kappa coefficients to explore interobserver agreement in classifying crackles and wheezes within and between subgroups of four observers. When evaluating agreement on crackles (inspiratory or expiratory) in each subgroup, observers agreed on between 65% and 87% of the cases. Conger's kappa ranged from 0.20 to 0.58 and four out of seven groups reached a kappa of ≥0.49. In the classification of wheezes, we observed a probability of agreement between 69% and 99.6% and kappa values from 0.09 to 0.97. Four out of seven groups reached a kappa ≥0.62. The kappa values we observed in our study ranged widely but, when addressing its limitations, we find the method of recording and presenting lung sounds with spectrograms sufficient for both clinic and research. Standardisation of terminology across countries would improve international communication on lung auscultation findings.
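
When more than two observers classify each recording, a multi-rater agreement statistic is needed. The sketch below computes Fleiss' kappa from a table of per-item category counts. Note that the study above reports Conger's kappa, a different multi-rater variant that handles the raters' marginal distributions differently, so this is an illustration of the general approach rather than the authors' computation. The counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Per-item agreement: share of rater pairs that agree on the item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_item) / n_items
    # Chance agreement from the pooled category proportions.
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical data: 5 recordings, 4 observers, columns = [crackles, no crackles].
counts = [[4, 0], [3, 1], [2, 2], [0, 4], [4, 0]]
fk = fleiss_kappa(counts)
print(round(fk, 3))
```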

Office-Based Point of Care Testing (IgA/IgG-Deamidated Gliadin Peptide) for Celiac Disease.

PubMed

Lau, Michelle S; Mooney, Peter D; White, William L; Rees, Michael A; Wong, Simon H; Hadjivassiliou, Marios; Green, Peter H R; Lebwohl, Benjamin; Sanders, David S

2018-06-19

Celiac disease (CD) is common yet under-detected. A point of care test (POCT) may improve CD detection. We aimed to assess the diagnostic performance of an IgA/IgG-deamidated gliadin peptide (DGP)-based POCT for CD detection, patient acceptability, and inter-observer variability of the POCT results. From 2013-2017, we prospectively recruited patients referred to secondary care with gastrointestinal symptoms, anemia and/or weight loss (group 1); and patients with self-reported gluten sensitivity with unknown CD status (group 2). All patients had concurrent POCT, IgA-tissue transglutaminase (IgA-TTG), IgA-endomysial antibodies (IgA-EMA), total IgA levels, and duodenal biopsies. Five hundred patients completed acceptability questionnaires, and inter-observer variability of the POCT results was compared among five clinical staff for 400 cases. Group 1: 1000 patients, 58.5% female, age 16-91, median age 57. Forty-one patients (4.1%) were diagnosed with CD. The sensitivities of the POCT, IgA-TTG, and IgA-EMA were 82.9, 78.1, and 70.7%; the specificities were 85.4, 96.3, and 99.8%. Group 2: 61 patients, 83% female; age 17-73, median age 35. The POCT had 100% sensitivity and negative predictive value in detecting CD in group 2. Most patients preferred the POCT to venepuncture (90.4% vs. 2.8%). There was good inter-observer agreement on the POCT results with a Fleiss Kappa coefficient of 0.895. The POCT had comparable sensitivities to serology, and correctly identified all CD cases in a gluten sensitive cohort. However, its low specificity may increase unnecessary investigations. Despite its advantage of convenience and rapid results, it may not add significant value to case finding in an office-based setting.
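
The concern about low specificity can be made concrete with Bayes' rule. The sketch below derives positive and negative predictive values from the group 1 sensitivity (82.9%), specificity (85.4%) and prevalence (4.1%) quoted in the abstract. The resulting predictive values are an illustrative calculation and are not figures reported by the authors; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Inputs taken from the abstract (group 1); outputs are derived, not reported.
ppv, npv = predictive_values(sensitivity=0.829, specificity=0.854, prevalence=0.041)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

At this prevalence most positive results are false positives even though the negative predictive value is high, which is consistent with the abstract's remark about unnecessary investigations.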
A score card for upper GI endoscopy: Evaluation of interobserver variability in examiners with various levels of experience.

PubMed

Neumann, M; Friedl, S; Meining, A; Egger, K; Heldwein, W; Rey, J F; Hochberger, J; Classen, M; Hohenberger, W; Rösch, T

2002-10-01

In most European countries, training in GI endoscopy has largely been based on hands-on acquisition of experience in patients rather than on a structured training programme. With the development of training models, systematic hands-on training in a variety of diagnostic and therapeutic endoscopy techniques was achieved. Little, however, is known about methods of objectively assessing trainees' performance. We therefore developed an assessment 'score card' for upper GI endoscopy and tested it in endoscopists with various levels of experience. The aim of the study was to assess interobserver variations in the evaluation of trainees. On the basis of textbook and expert opinions a consensus group of eight experienced endoscopists developed a score card for diagnostic upper GI endoscopy with biopsy. The score card includes an assessment of the single steps of the procedure as well as of the times needed to complete each step. This score card was then evaluated in a further conference including ten experts who blindly assessed videotapes of 15 endoscopists performing upper GI endoscopy in a training bio-simulation model (the 'Erlangen Endo-Trainer'). On the basis of their previous experience (i.e. the number of endoscopies performed) these 15 endoscopists were classified into four groups: very experienced, experienced, having some experience and inexperienced. Interobserver variability (IOV) was tested for the various score card parameters (Kendall's rank-correlation coefficient 0.0-0.5 poor, 0.5-1.0 good agreement). In addition, the correlation between the score card assessment and the examiners' experience levels was analysed. Despite poor IOV results for all the parameters tested (Kendall coefficient < 0.3), the assessment parameters correlated well when the examiners' different experience levels were taken into account (correlation coefficient 0.59-0.89, p < 0.05). The score card parameters were suitable for differentiating between the four groups of examiners with different levels of endoscopic experience. As expected with scores involving subjective assessment of performance, the variability between reviewers was substantial. Nevertheless, the assessment score was capable of distinguishing reliably between different experience levels in terms of a good individual observer consistency. The score card can therefore be used to document both training status and progress during endoscopy training courses using bio-simulation models, and this might be able to provide improved quality assurance in GI endoscopy training.
The Effect of Study Design Biases on the Diagnostic Accuracy of Magnetic Resonance Imaging to Detect Silicone Breast Implant Ruptures: A Meta-Analysis

PubMed Central

Song, Jae W.; Kim, Hyungjin Myra; Bellfi, Lillian T.; Chung, Kevin C.

2010-01-01

Background All silicone breast implant recipients are recommended by the US Food and Drug Administration to undergo serial screening to detect implant rupture with magnetic resonance imaging (MRI). We performed a systematic review of the literature to assess the quality of diagnostic accuracy studies utilizing MRI or ultrasound to detect silicone breast implant rupture and conducted a meta-analysis to examine the effect of study design biases on the estimation of MRI diagnostic accuracy measures. Method Studies investigating the diagnostic accuracy of MRI and ultrasound in evaluating ruptured silicone breast implants were identified using MEDLINE, EMBASE, ISI Web of Science, and Cochrane library databases. Two reviewers independently screened potential studies for inclusion and extracted data. Study design biases were assessed using the QUADAS tool and the STARD checklist. Meta-analyses estimated the influence of biases on diagnostic odds ratios. Results Among 1175 identified articles, 21 met the inclusion criteria. Most studies using MRI (n = 10 of 16) and ultrasound (n = 10 of 13) examined symptomatic subjects. Meta-analyses revealed that MRI studies evaluating symptomatic subjects had 14-fold higher diagnostic accuracy estimates compared to studies using an asymptomatic sample (RDOR 13.8; 95% CI 1.83–104.6) and 2-fold higher diagnostic accuracy estimates compared to studies using a screening sample (RDOR 1.89; 95% CI 0.05–75.7). Conclusion Many of the published studies utilizing MRI or ultrasound to detect silicone breast implant rupture are flawed with methodological biases. These methodological shortcomings may result in overestimated MRI diagnostic accuracy measures and should be interpreted with caution when applying the data to a screening population. PMID:21364405
Diagnostic accuracy of ultrasonography, MRI and MR arthrography in the characterisation of rotator cuff disorders: a systematic review and meta-analysis

PubMed Central

Roy, Jean-Sébastien; Braën, Caroline; Leblond, Jean; Desmeules, François; Dionne, Clermont E; MacDermid, Joy C; Bureau, Nathalie J; Frémont, Pierre

2015-01-01

Background Different diagnostic imaging modalities, such as ultrasonography (US), MRI and MR arthrography (MRA), are commonly used for the characterisation of rotator cuff (RC) disorders. Since the most recent systematic reviews on medical imaging, multiple diagnostic studies have been published, most using more advanced technological characteristics. The first objective was to perform a meta-analysis on the diagnostic accuracy of medical imaging for characterisation of RC disorders. Since US is used at the point of care in environments such as sports medicine, a secondary analysis assessed accuracy by radiologists and non-radiologists. Methods A systematic search in three databases was conducted. Two raters performed data extraction and evaluation of risk of bias independently, and agreement was achieved by consensus. A hierarchical summary receiver-operating characteristic package was used to calculate pooled estimates of included diagnostic studies. Results Diagnostic accuracy of US, MRI and MRA in the characterisation of full-thickness RC tears was high with overall estimates of sensitivity and specificity over 0.90. As for partial RC tears and tendinopathy, overall estimates of specificity were also high (>0.90), while sensitivity was lower (0.67–0.83). Diagnostic accuracy of US was similar whether a trained radiologist, sonographer or orthopaedist performed it. Conclusions Our results show the diagnostic accuracy of US, MRI and MRA in the characterisation of full-thickness RC tears. Since full-thickness tear constitutes a key consideration for surgical repair, this is an important characteristic when selecting an imaging modality for RC disorder. When considering accuracy, cost, and safety, US is the best option. PMID:25677796
Reporting completeness and transparency of meta-analyses of depression screening tool accuracy: A comparison of meta-analyses published before and after the PRISMA statement.

PubMed

Rice, Danielle B; Kloda, Lorie A; Shrier, Ian; Thombs, Brett D

2016-08-01

Meta-analyses that are conducted rigorously and reported completely and transparently can provide accurate evidence to inform the best possible healthcare decisions. Guideline makers have raised concerns about the utility of existing evidence on the diagnostic accuracy of depression screening tools. The objective of our study was to evaluate the transparency and completeness of reporting in meta-analyses of the diagnostic accuracy of depression screening tools using the PRISMA tool adapted for diagnostic test accuracy meta-analyses. We searched MEDLINE and PsycINFO from January 1, 2005 through March 13, 2016 for recent meta-analyses in any language on the diagnostic accuracy of depression screening tools. Two reviewers independently assessed the transparency in reporting using the PRISMA tool with appropriate adaptations made for studies of diagnostic test accuracy. We identified 21 eligible meta-analyses. Twelve of 21 meta-analyses complied with at least 50% of adapted PRISMA items. Of 30 adapted PRISMA items, 11 were fulfilled by ≥80% of included meta-analyses, 3 by 50-79% of meta-analyses, 7 by 25-45% of meta-analyses, and 9 by <25%. On average, post-PRISMA meta-analyses complied with 17 of 30 items compared to 13 of 30 items pre-PRISMA. Deficiencies in the transparency of reporting in meta-analyses of the diagnostic test accuracy of depression screening tools were identified. Authors, reviewers, and editors should adhere to the PRISMA statement to improve the reporting of meta-analyses of the diagnostic accuracy of depression screening tools. Copyright © 2016 Elsevier Inc. All rights reserved.
Digital Pathology: Data-Intensive Frontier in Medical Imaging

PubMed Central

Cooper, Lee A. D.; Carter, Alexis B.; Farris, Alton B.; Wang, Fusheng; Kong, Jun; Gutman, David A.; Widener, Patrick; Pan, Tony C.; Cholleti, Sharath R.; Sharma, Ashish; Kurc, Tahsin M.; Brat, Daniel J.; Saltz, Joel H.

2013-01-01

Pathology is a medical subspecialty that practices the diagnosis of disease. Microscopic examination of tissue reveals information enabling the pathologist to render accurate diagnoses and to guide therapy. The basic process by which anatomic pathologists render diagnoses has remained relatively unchanged over the last century, yet advances in information technology now offer significant opportunities in image-based diagnostic and research applications. Pathology has lagged behind other healthcare practices such as radiology where digital adoption is widespread. As devices that generate whole slide images become more practical and affordable, practices will increasingly adopt this technology and eventually produce an explosion of data that will quickly eclipse the already vast quantities of radiology imaging data. These advances are accompanied by significant challenges for data management and storage, but they also introduce new opportunities to improve patient care by streamlining and standardizing diagnostic approaches and uncovering disease mechanisms. Computer-based image analysis is already available in commercial diagnostic systems, but further advances in image analysis algorithms are warranted in order to fully realize the benefits of digital pathology in medical discovery and patient care. In coming decades, pathology image analysis will extend beyond the streamlining of diagnostic workflows and minimizing interobserver variability and will begin to provide diagnostic assistance, identify therapeutic targets, and predict patient outcomes and therapeutic responses. PMID:25328166
The diagnostic accuracy of 1.5T magnetic resonance imaging for detecting root avulsions in traumatic adult brachial plexus injuries.

PubMed

Wade, Ryckie G; Itte, Vinay; Rankine, James J; Ridgway, John P; Bourke, Grainne

2018-03-01

Identification of root avulsions is of critical importance in traumatic brachial plexus injuries because it alters the reconstruction and prognosis. Pre-operative magnetic resonance imaging is gaining popularity, but there is limited and conflicting data on its diagnostic accuracy for root avulsion. This cohort study describes consecutive patients requiring brachial plexus exploration following trauma between 2008 and 2016. The index test was magnetic resonance imaging at 1.5 Tesla and the reference test was operative exploration of the supraclavicular plexus. Complete data from 29 males were available. The diagnostic accuracy of magnetic resonance imaging for root avulsion(s) of C5-T1 was 79%. The diagnostic accuracy of a pseudomeningocoele as a surrogate marker of root avulsion(s) of C5-T1 was 68%. We conclude that pseudomeningocoeles were not a reliable sign of root avulsion and magnetic resonance imaging has modest diagnostic accuracy for root avulsions in the context of adult traumatic brachial plexus injuries. Level of evidence: III.
Radiological interpretation of images displayed on tablet computers: a systematic review.

PubMed

Caffery, L J; Armfield, N R; Smith, A C

2015-06-01

To review the published evidence and to determine if radiological diagnostic accuracy is compromised when images are displayed on a tablet computer and thereby inform practice on using tablet computers for radiological interpretation by on-call radiologists. We searched the PubMed and EMBASE databases for studies on the diagnostic accuracy or diagnostic reliability of images interpreted on tablet computers. Studies were screened for inclusion based on pre-determined inclusion and exclusion criteria. Studies were assessed for quality and risk of bias using Quality Appraisal of Diagnostic Reliability Studies or the revised Quality Assessment of Diagnostic Accuracy Studies tool. Treatment of studies was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). 11 studies met the inclusion criteria. 10 of these studies tested the Apple iPad(®) (Apple, Cupertino, CA). The included studies reported high sensitivity (84-98%), specificity (74-100%) and accuracy rates (98-100%) for radiological diagnosis. There was no statistically significant difference in accuracy between a tablet computer and a digital imaging and communication in medicine-calibrated control display. There was a near complete consensus from authors on the non-inferiority of diagnostic accuracy of images displayed on a tablet computer. All of the included studies were judged to be at risk of bias. Our findings suggest that the diagnostic accuracy of radiological interpretation is not compromised by using a tablet computer. This result is only relevant to the Apple iPad and to the modalities of CT, MRI and plain radiography. The iPad may be appropriate for an on-call radiologist to use for radiological interpretation.
Quality and reporting of diagnostic accuracy studies in TB, HIV and malaria: evaluation using QUADAS and STARD standards.

PubMed

Fontela, Patricia Scolari; Pant Pai, Nitika; Schiller, Ian; Dendukuri, Nandini; Ramsay, Andrew; Pai, Madhukar

2009-11-13

Poor methodological quality and reporting are known concerns with diagnostic accuracy studies. In 2003, the QUADAS tool and the STARD standards were published for evaluating the quality and improving the reporting of diagnostic studies, respectively. However, it is unclear whether these tools have been applied to diagnostic studies of infectious diseases. We performed a systematic review on the methodological and reporting quality of diagnostic studies in TB, malaria and HIV. We identified diagnostic accuracy studies of commercial tests for TB, malaria and HIV through a systematic search of the literature using PubMed and EMBASE (2004-2006). Original studies that reported sensitivity and specificity data were included. Two reviewers independently extracted data on study characteristics and diagnostic accuracy, and used QUADAS and STARD to evaluate the quality of methods and reporting, respectively. Ninety (38%) of 238 articles met inclusion criteria. All studies had design deficiencies. Study quality indicators that were met in less than 25% of the studies included adequate description of withdrawals (6%) and reference test execution (10%), absence of index test review bias (19%) and reference test review bias (24%), and report of uninterpretable results (22%). In terms of quality of reporting, 9 STARD indicators were reported in less than 25% of the studies: methods for calculation and estimates of reproducibility (0%), adverse effects of the diagnostic tests (1%), estimates of diagnostic accuracy between subgroups (10%), distribution of severity of disease/other diagnoses (11%), number of eligible patients who did not participate in the study (14%), blinding of the test readers (16%), and description of the team executing the test and management of indeterminate/outlier results (both 17%). The use of STARD was not explicitly mentioned in any study. Only 22% of 46 journals that published the studies included in this review required authors to use STARD. Recently published diagnostic accuracy studies on commercial tests for TB, malaria and HIV have moderate to low quality and are poorly reported. The more frequent use of tools such as QUADAS and STARD may be necessary to improve the methodological and reporting quality of future diagnostic accuracy studies in infectious diseases.
Quality and Reporting of Diagnostic Accuracy Studies in TB, HIV and Malaria: Evaluation Using QUADAS and STARD Standards

PubMed Central

Fontela, Patricia Scolari; Pant Pai, Nitika; Schiller, Ian; Dendukuri, Nandini; Ramsay, Andrew; Pai, Madhukar

2009-01-01

Background Poor methodological quality and reporting are known concerns with diagnostic accuracy studies. In 2003, the QUADAS tool and the STARD standards were published for evaluating the quality and improving the reporting of diagnostic studies, respectively. However, it is unclear whether these tools have been applied to diagnostic studies of infectious diseases. We performed a systematic review on the methodological and reporting quality of diagnostic studies in TB, malaria and HIV. Methods We identified diagnostic accuracy studies of commercial tests for TB, malaria and HIV through a systematic search of the literature using PubMed and EMBASE (2004–2006). Original studies that reported sensitivity and specificity data were included. Two reviewers independently extracted data on study characteristics and diagnostic accuracy, and used QUADAS and STARD to evaluate the quality of methods and reporting, respectively. Findings Ninety (38%) of 238 articles met inclusion criteria. All studies had design deficiencies. Study quality indicators that were met in less than 25% of the studies included adequate description of withdrawals (6%) and reference test execution (10%), absence of index test review bias (19%) and reference test review bias (24%), and report of uninterpretable results (22%). In terms of quality of reporting, 9 STARD indicators were reported in less than 25% of the studies: methods for calculation and estimates of reproducibility (0%), adverse effects of the diagnostic tests (1%), estimates of diagnostic accuracy between subgroups (10%), distribution of severity of disease/other diagnoses (11%), number of eligible patients who did not participate in the study (14%), blinding of the test readers (16%), and description of the team executing the test and management of indeterminate/outlier results (both 17%). The use of STARD was not explicitly mentioned in any study. Only 22% of 46 journals that published the studies included in this review required authors to use STARD. Conclusion Recently published diagnostic accuracy studies on commercial tests for TB, malaria and HIV have moderate to low quality and are poorly reported. The more frequent use of tools such as QUADAS and STARD may be necessary to improve the methodological and reporting quality of future diagnostic accuracy studies in infectious diseases. PMID:19915664
Diagnostic Accuracy of the Veteran Affairs' Traumatic Brain Injury Screen.

PubMed

Louise Bender Pape, Theresa; Smith, Bridget; Babcock-Parziale, Judith; Evans, Charlesnika T; Herrold, Amy A; Phipps Maieritsch, Kelly; High, Walter M

2018-01-31

To comprehensively estimate the diagnostic accuracy and reliability of the Department of Veterans Affairs (VA) Traumatic Brain Injury (TBI) Clinical Reminder Screen (TCRS). Cross-sectional, prospective, observational study using the Standards for Reporting of Diagnostic Accuracy criteria. Three VA Polytrauma Network Sites. Operation Iraqi Freedom, Operation Enduring Freedom veterans (N=433). TCRS, Comprehensive TBI Evaluation, Structured TBI Diagnostic Interview, Symptom Attribution and Classification Algorithm, and Clinician-Administered Posttraumatic Stress Disorder (PTSD) Scale. Forty-five percent of veterans screened positive on the TCRS for TBI. For detecting occurrence of historical TBI, the TCRS had a sensitivity of .56 to .74, a specificity of .63 to .93, a positive predictive value (PPV) of 25% to 45%, a negative predictive value (NPV) of 91% to 94%, and a diagnostic odds ratio (DOR) of 4 to 13. For accuracy of attributing active symptoms to the TBI, the TCRS had a sensitivity of .64 to .87, a specificity of .59 to .89, a PPV of 26% to 32%, an NPV of 92% to 95%, and a DOR of 6 to 9. The sensitivity was higher for veterans with PTSD (.80-.86) relative to veterans without PTSD (.57-.82). The specificity, however, was higher among veterans without PTSD (.75-.81) relative to veterans with PTSD (.36-.49). All indices of diagnostic accuracy changed when participants with questionably valid (QV) test profiles were eliminated from analyses. The utility of the TCRS to screen for mild TBI (mTBI) depends on the stringency of the diagnostic reference standard to which it is being compared, the presence/absence of PTSD, and QV test profiles. Further development, validation, and use of reproducible diagnostic algorithms for symptom attribution after possible mTBI would improve diagnostic accuracy. Published by Elsevier Inc.
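
The diagnostic odds ratio (DOR) quoted in the record above is a single-number summary that combines sensitivity and specificity. The sketch below shows the standard relationship using hypothetical input values. The abstract reports ranges rather than paired estimates, so its DOR values cannot be reproduced from the quoted figures; the function name is an assumption for illustration.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = odds of a positive test in the diseased / odds in the non-diseased."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    positive_lr = sensitivity / (1 - specificity)   # likelihood ratio of a positive result
    negative_lr = (1 - sensitivity) / specificity   # likelihood ratio of a negative result
    return positive_lr / negative_lr


# Hypothetical screening test, not values taken from the study.
dor = diagnostic_odds_ratio(sensitivity=0.70, specificity=0.80)
print(round(dor, 2))
```

A DOR of 1 means the test does not discriminate at all; larger values indicate better discrimination, independent of prevalence.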
Nuances of Morphology in Myelodysplastic Diseases in the Age of Molecular Diagnostics.

PubMed

Shaver, Aaron C; Seegmiller, Adam C

2017-10-01

Morphologic dysplasia is an important factor in diagnosis of myelodysplastic syndrome (MDS). However, the role of dysplasia is changing as new molecular genetic and genomic technologies take a more prominent place in diagnosis. This review discusses the role of morphology in the diagnosis of MDS and its interactions with cytogenetic and molecular testing. Recent changes in diagnostic criteria have attempted to standardize approaches to morphologic diagnosis of MDS, recognizing significant inter-observer variability in assessment of dysplasia. Definitive correlates between cytogenetic/molecular and morphologic findings have been described in only a small set of cases. However, these genetic and morphologic tools do play a complementary role in the diagnosis of both MDS and other myeloid neoplasms. Diagnosis of MDS requires a multi-factorial approach, utilizing both traditional morphologic as well as newer molecular genetic techniques. Understanding these tools, and the interplay between them, is crucial in the modern diagnosis of myeloid neoplasms.
Issues of diagnostic review in brain tumor studies: from the Brain Tumor Epidemiology Consortium.

PubMed

Davis, Faith G; Malmer, Beatrice S; Aldape, Ken; Barnholtz-Sloan, Jill S; Bondy, Melissa L; Brännström, Thomas; Bruner, Janet M; Burger, Peter C; Collins, V Peter; Inskip, Peter D; Kruchko, Carol; McCarthy, Bridget J; McLendon, Roger E; Sadetzki, Siegal; Tihan, Tarik; Wrensch, Margaret R; Buffler, Patricia A

2008-03-01

Epidemiologists routinely conduct centralized single pathology reviews to minimize interobserver diagnostic variability, but this practice does not facilitate the combination of studies across geographic regions and institutions where diagnostic practices differ. A meeting of neuropathologists and epidemiologists focused on brain tumor classification issues in the context of protocol needs for consortial studies (http://epi.grants.cancer.gov/btec/). It resulted in recommendations relevant to brain tumors and possibly other rare disease studies. Two categories of brain tumors have enough general agreement over time, across regions, and between individual pathologists that one can consider using existing diagnostic data without further review: glioblastomas and meningiomas (as long as uniform guidelines such as those provided by the WHO are used). Prospective studies of these tumors benefit from collection of pathology reports, at a minimum recording the pathology department and classification system used in the diagnosis. Other brain tumors, such as oligodendroglioma, are less distinct and require careful histopathologic review for consistent classification across study centers. Epidemiologic study protocols must consider the study-specific aims, diagnostic changes that have taken place over time, and other issues unique to the type(s) of tumor being studied. As diagnostic changes are being made rapidly, there are no readily available answers on disease classification issues. It is essential that epidemiologists and neuropathologists collaborate to develop appropriate study designs and protocols for specific hypotheses and populations.
Wheeze as an Adverse Event in Pediatric Vaccine and Drug Randomized Controlled Trials: A Systematic Review

PubMed Central

Marangu, Diana; Kovacs, Stephanie; Walson, Judd; Bonhoeffer, Jan; Ortiz, Justin R.; John-Stewart, Grace; Horne, David J.

2016-01-01

Introduction Wheeze is an important sign indicating a potentially severe adverse event in vaccine and drug trials, particularly in children. However, there are currently no consensus definitions of wheeze or associated respiratory compromise in randomized controlled trials (RCTs). Objective To identify definitions and severity grading scales of wheeze as an adverse event in vaccine and drug RCTs enrolling children <5 years and to determine their diagnostic performance based on sensitivity, specificity and inter-observer agreement. Methods We performed a systematic review of electronic databases and reference lists with restrictions for trial settings, English language and publication date ≥ 1970. Wheeze definitions and severity grading were abstracted and ranked by a diagnostic certainty score based on sensitivity, specificity and inter-observer agreement. Results Of 1,205 articles identified using our broad search terms, we identified 58 eligible trials conducted in 38 countries, mainly in high-income settings. Vaccines made up the majority (90%) of interventions, particularly influenza vaccines (65%). Only 15 trials provided explicit definitions of wheeze. Of 24 studies that described severity, 11 described wheeze severity in the context of an explicit wheeze definition. The remaining 13 studies described wheeze severity where wheeze was defined as part of a respiratory illness or a wheeze equivalent. Wheeze descriptions were elicited from caregiver reports (14%), physical examination by a health worker (45%) or a combination (41%). There were 21/58 studies in which wheeze definitions included combined caregiver report and healthcare worker assessment. The use of these two methods appeared to have the highest combined sensitivity and specificity. Conclusion Standardized wheeze definitions and severity grading scales for use in pediatric vaccine or drug trials are lacking. Standardized definitions of wheeze are needed for assessment of possible adverse events as new vaccines and drugs are evaluated. PMID:26319071
Impact of image quality on reliability of the measurements of left ventricular systolic function and global longitudinal strain in 2D echocardiography

PubMed Central

Nagata, Yasufumi; Kado, Yuichiro; Onoue, Takeshi; Otani, Kyoko; Nakazono, Akemi; Otsuji, Yutaka; Takeuchi, Masaaki

2018-01-01

Background Left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS) play important roles in diagnosis and management of cardiac diseases. However, the issue of the accuracy and reliability of LVEF and GLS remains to be solved. Image quality is one of the most important factors affecting measurement variability. The aim of this study was to investigate whether improved image quality could reduce observer variability. Methods Two sets of three apical images were acquired using relatively old- and new-generation ultrasound imaging systems (Vivid 7 and Vivid E95) in 308 subjects. Image quality was assessed by endocardial border delineation index (EBDI) using a 3-point scoring system. Three observers measured the LVEF and GLS, and these values and inter-observer variability were investigated. Results Image quality was significantly better with Vivid E95 (EBDI: 26.8 ± 5.9) than that with Vivid 7 (22.8 ± 6.3, P < 0.0001). Regarding the inter-observer variability of LVEF, the r-value, bias, 95% limit of agreement and intra-class correlation coefficient for Vivid 7 were comparable to those for Vivid E95. The % variabilities were significantly lower for Vivid E95 (5.3–6.5%) than those for Vivid 7 (6.5–7.5%). Regarding GLS, all observer variability parameters were better for Vivid E95 than for Vivid 7. Improvements in image quality yielded benefits to both LVEF and GLS measurement reliability. Multivariate analysis showed that image quality was indeed an important factor of observer variability in the measurement of LVEF and GLS. Conclusions The new-generation ultrasound imaging system offers improved image quality and reduces inter-observer variability in the measurement of LVEF and GLS. PMID:29432198
Echocardiographic Methods, Quality Review, and Measurement Accuracy in a Randomized Multicenter Clinical Trial of Marfan Syndrome

PubMed Central

Selamet Tierney, Elif Seda; Levine, Jami C.; Chen, Shan; Bradley, Timothy J.; Pearson, Gail D.; Colan, Steven D.; Sleeper, Lynn A.; Campbell, M. Jay; Cohen, Meryl S.; Backer, Julie De; Guey, Lin T.; Heydarian, Haleh; Lai, Wyman W.; Lewin, Mark B.; Marcus, Edward; Mart, Christopher R.; Pignatelli, Ricardo H.; Printz, Beth F.; Sharkey, Angela M.; Shirali, Girish S.; Srivastava, Shubhika; Lacro, Ronald V.

2013-01-01

Background The Pediatric Heart Network is conducting a large international randomized trial to compare aortic root growth and other cardiovascular outcomes in 608 subjects with Marfan syndrome randomized to receive atenolol or losartan for 3 years. The authors report here the echocardiographic methods and baseline echocardiographic characteristics of the randomized subjects, describe the interobserver agreement of aortic measurements, and identify factors influencing agreement. Methods Individuals aged 6 months to 25 years who met the original Ghent criteria and had body surface area–adjusted maximum aortic root diameter (ROOTmax) Z scores > 3 were eligible for inclusion. The primary outcome measure for the trial is the change over time in ROOTmax Z score. A detailed echocardiographic protocol was established and implemented across 22 centers, with an extensive training and quality review process. Results Interobserver agreement for the aortic measurements was excellent, with intraclass correlation coefficients ranging from 0.921 to 0.989. Lower interobserver percentage error in ROOTmax measurements was independently associated (model R² = 0.15) with better image quality (P = .002) and later study reading date (P < .001). Echocardiographic characteristics of the randomized subjects did not differ by treatment arm. Subjects with ROOTmax Z scores ≥ 4.5 (36%) were more likely to have mitral valve prolapse and dilation of the main pulmonary artery and left ventricle, but there were no differences in aortic regurgitation, aortic stiffness indices, mitral regurgitation, or left ventricular function compared with subjects with ROOTmax Z scores < 4.5. Conclusions The echocardiographic methodology, training, and quality review process resulted in a robust evaluation of aortic root dimensions, with excellent reproducibility. PMID:23582510
Assessment of cone beam CT registration for prostate radiation therapy: fiducial marker and soft tissue methods.

PubMed

Deegan, Timothy; Owen, Rebecca; Holt, Tanya; Fielding, Andrew; Biggs, Jennifer; Parfitt, Matthew; Coates, Alicia; Roberts, Lisa

2015-02-01

This investigation aimed to assess the consistency and accuracy of radiation therapists (RTs) performing cone beam computed tomography (CBCT) alignment to fiducial markers (FMs) (CBCTFM) and the soft tissue prostate (CBCTST). Six patients receiving prostate radiation therapy underwent daily CBCTs. Manual alignment of CBCTFM and CBCTST was performed by three RTs. Inter-observer agreement was assessed using a modified Bland-Altman analysis for each alignment method. Clinically acceptable 95% limits of agreement with the mean (LoAmean) were defined as ±2.0 mm for CBCTFM and ±3.0 mm for CBCTST. Differences between CBCTST alignment and the observer-averaged CBCTFM (AvCBCTFM) alignment were analysed. Clinically acceptable 95% LoA were defined as ±3.0 mm for the comparison of CBCTST and AvCBCTFM. CBCTFM and CBCTST alignments were performed for 185 images. The CBCTFM 95% LoAmean were within ±2.0 mm in all planes. CBCTST 95% LoAmean were within ±3.0 mm in all planes. Comparison of CBCTST with AvCBCTFM resulted in 95% LoA of -4.9 to 2.6, -1.6 to 2.5 and -4.7 to 1.9 mm in the superior-inferior, left-right and anterior-posterior planes, respectively. Significant differences were found between soft tissue alignment and the predicted FM position. FMs are useful in reducing inter-observer variability compared with soft tissue alignment. Consideration needs to be given to margin design when using soft tissue matching due to increased inter-observer variability. This study highlights some of the complexities of soft tissue guidance for prostate radiation therapy. © 2014 The Royal Australian and New Zealand College of Radiologists.
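Illustrative note: the record above reports 95% limits of agreement (LoA) from a modified Bland-Altman analysis. The sketch below shows only the classic two-observer form (mean difference ± 1.96 standard deviations of the paired differences), which may differ from the modified, compared-with-the-mean variant the authors used. The function name, the sample shifts, and the ±2.0 mm check are assumptions made for illustration and are not data from the study.

```python
from statistics import mean, stdev


def limits_of_agreement(obs_a, obs_b, z=1.96):
    """Return (bias, lower, upper) for paired measurements from two observers.

    Classic Bland-Altman form: bias is the mean paired difference and the
    limits are bias -/+ z times the sample SD of the differences.
    """
    if len(obs_a) != len(obs_b) or len(obs_a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [a - b for a, b in zip(obs_a, obs_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


# Hypothetical couch shifts in mm recorded by two radiation therapists.
rt1 = [1.0, -0.5, 2.0, 0.0, 1.5]
rt2 = [0.5, -1.0, 2.5, 0.5, 1.0]

bias, lower, upper = limits_of_agreement(rt1, rt2)
# Compare against an assumed clinical tolerance of +/-2.0 mm.
within_tolerance = -2.0 <= lower and upper <= 2.0
```

In this form, narrower limits indicate closer agreement between the two observers; whether they are clinically acceptable depends on the tolerance chosen in advance.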
Volumetric glioma quantification: comparison of manual and semi-automatic tumor segmentation for the quantification of tumor growth.

PubMed

Odland, Audun; Server, Andres; Saxhaug, Cathrine; Breivik, Birger; Groote, Rasmus; Vardal, Jonas; Larsson, Christopher; Bjørnerud, Atle

2015-11-01

Volumetric magnetic resonance imaging (MRI) is now widely available and routinely used in the evaluation of high-grade gliomas (HGGs). Ideally, volumetric measurements should be included in this evaluation. However, manual tumor segmentation is time-consuming and suffers from inter-observer variability. Thus, tools for semi-automatic tumor segmentation are needed. To present a semi-automatic method (SAM) for segmentation of HGGs and to compare this method with manual segmentation performed by experts. The inter-observer variability among experts manually segmenting HGGs using volumetric MRIs was also examined. Twenty patients with HGGs were included. All patients underwent surgical resection prior to inclusion. Each patient underwent several MRI examinations during and after adjuvant chemoradiation therapy. Three experts performed manual segmentation. The results of tumor segmentation by the experts and by the SAM were compared using Dice coefficients and kappa statistics. A relatively close agreement was seen among two of the experts and the SAM, while the third expert disagreed considerably with the other experts and the SAM. An important reason for this disagreement was a different interpretation of contrast enhancement as either surgically-induced or glioma-induced. The time required for manual tumor segmentation was an average of 16 min per scan. Editing of the tumor masks produced by the SAM required an average of less than 2 min per sample. Manual segmentation of HGG is very time-consuming and using the SAM could increase the efficiency of this process. However, the accuracy of the SAM ultimately depends on the expert doing the editing. Our study confirmed a considerable inter-observer variability among experts defining tumor volume from volumetric MRIs. © The Foundation Acta Radiologica 2014.
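Illustrative note: the record above compares segmentations using Dice coefficients. A minimal sketch of that overlap measure for two binary masks follows; the flattened toy masks and the function name are hypothetical and are not taken from the study.

```python
def dice(mask_a, mask_b):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two equally long binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    a = {i for i, v in enumerate(mask_a) if v}
    b = {i for i, v in enumerate(mask_b) if v}
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical flattened masks: 1 marks a voxel labelled as tumor.
expert_mask = [0, 1, 1, 1, 0, 0, 1, 0]
semi_auto_mask = [0, 1, 1, 0, 0, 1, 1, 0]

overlap = dice(expert_mask, semi_auto_mask)
```

A value of 1 means the two masks coincide exactly and 0 means they share no voxels.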
Impact of image quality on reliability of the measurements of left ventricular systolic function and global longitudinal strain in 2D echocardiography.

PubMed

Nagata, Yasufumi; Kado, Yuichiro; Onoue, Takeshi; Otani, Kyoko; Nakazono, Akemi; Otsuji, Yutaka; Takeuchi, Masaaki

2018-03-01

Left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS) play important roles in diagnosis and management of cardiac diseases. However, the issue of the accuracy and reliability of LVEF and GLS remains to be solved. Image quality is one of the most important factors affecting measurement variability. The aim of this study was to investigate whether improved image quality could reduce observer variability. Two sets of three apical images were acquired using relatively old- and new-generation ultrasound imaging systems (Vivid 7 and Vivid E95) in 308 subjects. Image quality was assessed by endocardial border delineation index (EBDI) using a 3-point scoring system. Three observers measured the LVEF and GLS, and these values and inter-observer variability were investigated. Image quality was significantly better with Vivid E95 (EBDI: 26.8 ± 5.9) than that with Vivid 7 (22.8 ± 6.3, P < 0.0001). Regarding the inter-observer variability of LVEF, the r-value, bias, 95% limit of agreement and intra-class correlation coefficient for Vivid 7 were comparable to those for Vivid E95. The % variabilities were significantly lower for Vivid E95 (5.3-6.5%) than those for Vivid 7 (6.5-7.5%). Regarding GLS, all observer variability parameters were better for Vivid E95 than for Vivid 7. Improvements in image quality yielded benefits to both LVEF and GLS measurement reliability. Multivariate analysis showed that image quality was indeed an important factor of observer variability in the measurement of LVEF and GLS. The new-generation ultrasound imaging system offers improved image quality and reduces inter-observer variability in the measurement of LVEF and GLS. © 2018 The authors.
Lumbar lordosis and sacral slope in lumbar spinal stenosis: standard values and measurement accuracy.

PubMed

Bredow, J; Oppermann, J; Scheyerer, M J; Gundlfinger, K; Neiss, W F; Budde, S; Floerkemeier, T; Eysel, P; Beyer, F

2015-05-01

Radiological study. To assess standard values, intra- and interobserver reliability and reproducibility of sacral slope (SS) and lumbar lordosis (LL) and the correlation of these parameters in patients with lumbar spinal stenosis (LSS). Anteroposterior and lateral X-rays of the lumbar spine of 102 patients with LSS were included in this retrospective, radiologic study. Measurements of SS and LL were carried out by five examiners. Intraobserver correlation and correlation between LL and SS were calculated with Pearson's r linear correlation coefficient and intraclass correlation coefficients (ICC) were calculated for inter- and intraobserver reliability. In addition, patients were examined in subgroups with respect to previous surgery and the current therapy. Lumbar lordosis averaged 45.6° (range 2.5°-74.9°; SD 14.2°), intraobserver correlation was between Pearson r = 0.93 and 0.98. The measurement of SS averaged 35.3° (range 13.8°-66.9°; SD 9.6°), intraobserver correlation was between Pearson r = 0.89 and 0.96. Intraobserver reliability ranged from 0.966 to 0.992 ICC in LL measurements and 0.944-0.983 ICC in SS measurements. There was an interobserver reliability ICC of 0.944 in LL and 0.990 in SS. Correlation between LL and SS averaged r = 0.79. No statistically significant differences were observed between the analyzed subgroups. Manual measurement of LL and SS in patients with LSS on lateral radiographs is easily performed with excellent intra- and interobserver reliability. Correlation between LL and SS is very high. Differences between patients with and without previous decompression were not statistically significant.

The use of atlas registration and graph cuts for prostate segmentation in magnetic resonance images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Korsager, Anne Sofie, E-mail: asko@hst.aau.dk; Østergaard, Lasse Riis; Fortunati, Valerio

2015-04-15

Purpose: An automatic method for 3D prostate segmentation in magnetic resonance (MR) images is presented for planning image-guided radiotherapy treatment of prostate cancer. Methods: A spatial prior based on intersubject atlas registration is combined with organ-specific intensity information in a graph cut segmentation framework. The segmentation is tested on 67 axial T2-weighted MR images in a leave-one-out cross validation experiment and compared with both manual reference segmentations and with multiatlas-based segmentations using majority voting atlas fusion. The impact of atlas selection is investigated in both the traditional atlas-based segmentation and the new graph cut method that combines atlas and intensity information in order to improve the segmentation accuracy. Best results were achieved using the method that combines intensity information, shape information, and atlas selection in the graph cut framework. Results: A mean Dice similarity coefficient (DSC) of 0.88 and a mean surface distance (MSD) of 1.45 mm with respect to the manual delineation were achieved. Conclusions: This approaches the interobserver DSC of 0.90 and interobserver MSD of 1.15 mm and is comparable to other studies performing prostate segmentation in MR.
Protocol for accuracy of point of care (POC) or in-office urine drug testing (immunoassay) in chronic pain patients: a prospective analysis of immunoassay and liquid chromatography tandem mass spectometry (LC/MS/MS).

PubMed

Manchikanti, Laxmaiah; Malla, Yogesh; Wargo, Bradley W; Cash, Kimberly A; Pampati, Vidyasagar; Damron, Kim S; McManus, Carla D; Brandon, Doris E

2010-01-01

Therapeutic use, overuse, abuse, and diversion of controlled substances in managing chronic non-cancer pain continues to be an issue for physicians and patients. It has been stated that physicians, along with the public and federal, state, and local government; professional associations; and pharmaceutical companies all share responsibility for preventing abuse of controlled prescription drugs. The challenge is to eliminate or significantly curtail abuse of controlled prescription drugs while still assuring the proper treatment of those patients. A number of techniques, instruments, and tools have been described to monitor controlled substance use and abuse. Thus, multiple techniques and tools available for adherence monitoring include urine drug testing in conjunction with prescription monitoring programs and other screening tests. However, urine drug testing is associated with multiple methodological flaws. Multiple authors have provided conflicting results in relation to diagnostic accuracy with differing opinions about how to monitor adherence in a non-systematic fashion. Thus far, there have not been any studies systematically assessing the diagnostic accuracy of immunoassay with laboratory testing. A diagnostic accuracy study of urine drug testing. An interventional pain management practice, a specialty referral center, a private practice setting in the United States. To compare the information obtained by point of care (POC) or in-office urine drug testing (index test) to the information found when all drugs and analytes are tested by liquid chromatography tandem mass spectroscopy (LC/MS/MS) reference test in the same urine sample. The study is designed to include 1,000 patients with chronic pain receiving controlled substances. The primary outcome measure is the diagnostic accuracy. Patients will be tested for various controlled substances, including opioids, benzodiazepines, and illicit drugs. 
The diagnostic accuracy study is performed utilizing the Standards for Reporting of Diagnostic Accuracy Studies (STARD) initiative which established reporting guidelines for diagnostic accuracy studies to improve the quality of reporting. The prototypical flow diagram of diagnostic accuracy study as described by STARD will be utilized. Results of diagnostic accuracy and correlation of clinical factors in relation to threshold levels, prevalence of abuse, false-positives, false-negatives, influence of other drugs, and demographic characteristics will be calculated. The limitations include lack of availability of POC testing with lower cutoff levels. This article presents a protocol for a diagnostic accuracy study of urine drug testing. The protocol also will permit correlation of various clinical factors in relation to threshold levels, prevalence of abuse, false-positives, false-negatives, influence of other drugs, and demographic characteristics. NCT 01052155.
Physical examination tests for screening and diagnosis of cervicogenic headache: A systematic review.

PubMed

Rubio-Ochoa, J; Benítez-Martínez, J; Lluch, E; Santacruz-Zaragozá, S; Gómez-Contreras, P; Cook, C E

2016-02-01

It has been suggested that differential diagnosis of headaches should consist of a robust subjective examination and a detailed physical examination of the cervical spine. Cervicogenic headache (CGH) is a form of headache that involves referred pain from the neck. To our knowledge, no studies have summarized the reliability and diagnostic accuracy of physical examination tests for CGH. The aim of this study was to summarize the reliability and diagnostic accuracy of physical examination tests used to diagnose CGH. A systematic review following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines was performed in four electronic databases (MEDLINE, Web of Science, Embase and Scopus). Full text reports concerning physical tests for the diagnosis of CGH which reported the clinometric properties for assessment of CGH, were included and screened for methodological quality. Quality Appraisal for Reliability Studies (QAREL) and Quality Assessment of Studies of Diagnostic Accuracy (QUADAS-2) scores were completed to assess article quality. Eight articles were retrieved for quality assessment and data extraction. Studies investigating diagnostic reliability of physical examination tests for CGH scored poorer on methodological quality (higher risk of bias) than those of diagnostic accuracy. There is sufficient evidence showing high levels of reliability and diagnostic accuracy of the selected physical examination tests for the diagnosis of CGH. The cervical flexion-rotation test (CFRT) exhibited both the highest reliability and the strongest diagnostic accuracy for the diagnosis of CGH. Copyright © 2015 Elsevier Ltd. All rights reserved.
A matter of accuracy. Nanobiochips in diagnostics and in research: ethical issues as value trade-offs.

PubMed

Le Roux, Ronan

2015-04-01

The paper deals with the introduction of nanotechnology in biochips. Based on interviews and theoretical reflections, it explores blind spots left by technology assessment and ethical investigations. These have focused on possible consequences of increased diffusability of a diagnostic device, neglecting both the context of research as well as increased accuracy, despite it being a more essential feature of nanobiochip projects. Also, rather than one of many parallel aspects (technical, legal and social) in innovation processes, ethics is considered here as a ubiquitous system of choices between sometimes antagonistic values. Thus, the paper investigates what is at stake when accuracy is balanced with other practical values in different contexts. Dramatic nanotechnological increase of accuracy in biochips can raise ethical issues, since it is at odds with other values such as diffusability and reliability. But those issues will not be as revolutionary as is often claimed: neither in diagnostics, because accuracy of measurements is not accuracy of diagnostics; nor in research, because a boost in measurement accuracy is not sufficient to overcome significance-chasing malpractices. The conclusion extends to methodological recommendations.
The "Down the PC" view - A new tool to assess screw positioning in the posterior column of the acetabulum.

PubMed

Osterhoff, G; Amiri, S; Unno, F; Dodd, A; Guy, P; O'Brien, P J; Lefaivre, K A

2015-08-01

Minimal-invasive placement of screws into the posterior column of the acetabulum (PC) is challenging. Due to the saddle-shaped curvature of the medial cortical border of the PC, the standard fluoroscopic views of the pelvis cannot provide the desired safety during screw insertion. The aim of this study was to define a view tangentially to the medial cortex of the PC and to evaluate its accuracy and inter-observer reproducibility. Radio-dense markers on the medial cortex of the PC along the axis of a PC screw were brought in line and landmarks of the new "Down the PC" view were determined. Kirschner wires were placed into the PC of a pelvis composite model and five pelvic cadaver specimens in a total of 34 different correct and incorrect positions. Based on either only the "Down the PC" view, only the standard views, or a combination of both, three fellowship-trained orthopaedic surgeons had to decide if the inserted wires were in bone in the posterior column or had exited cortex, and if they penetrated the acetabulum. Sensitivity, specificity, and the intra-class correlation coefficient were calculated. A view using three radiographic landmarks (pelvic brim, medial cortical wall of the body of the ischium, ischial spine) was found. Sensitivity and specificity to detect perforation out of the bone were 1.00 and 0.97 for the "Down the PC" view, 0.46 and 0.97 if only the standard views were used, and 1.00 and 0.95 for a combination of both. Sensitivity and specificity to detect intra-articular wire placement were 1.00 and 0.96 for the "Down the PC" view, 0.72 and 0.95 if only the standard views were used, and 0.94 and 0.99 for a combination of both. Inter-observer agreement using only the "Down the PC" view was excellent with an ICC of 0.92 for perforation and ICC of 0.82 for intra-articular wire placement. The "Down the PC" view is a useful addendum in the orthopaedic trauma surgeon's tool box. 
Using simple landmarks, it is easy to reproduce and thereby shows excellent accuracy and inter-observer agreement in order to detect medial perforation or intra-articular implant position. Copyright © 2015 Elsevier Ltd. All rights reserved.
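Illustrative note: the record above reports sensitivity and specificity for detecting wire perforation. As a minimal sketch, both can be computed from paired reference and reader labels as shown below. The ten labelled wires and the function name are hypothetical and are not the study's data.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) from paired boolean labels.

    truth[i] is the reference standard (True = perforation present),
    calls[i] is the reader's decision for the same case.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 4 wires truly perforating, 6 wires fully in bone.
truth = [True, True, True, True, False, False, False, False, False, False]
calls = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(truth, calls)
```

Here one missed perforation lowers sensitivity and one false alarm lowers specificity, which is why the two figures are reported separately.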
Diagnostic Accuracy of Natriuretic Peptides for Heart Failure in Patients with Pleural Effusion: A Systematic Review and Updated Meta-Analysis

PubMed Central

Cheng, Juan-Juan; Zhao, Shi-Di; Gao, Ming-Zhu; Huang, Hong-Yu; Gu, Bing; Ma, Ping; Chen, Yan; Wang, Jun-Hong; Yang, Cheng-Jian; Yan, Zi-He

2015-01-01

Background Previous studies have reported that natriuretic peptides in the blood and pleural fluid (PF) are effective diagnostic markers for heart failure (HF). These natriuretic peptides include N-terminal pro-brain natriuretic peptide (NT-proBNP), brain natriuretic peptide (BNP), and midregion pro-atrial natriuretic peptide (MR-proANP). This systematic review and meta-analysis evaluates the diagnostic accuracy of blood and PF natriuretic peptides for HF in patients with pleural effusion. Methods PubMed and EMBASE databases were searched to identify articles published in English that investigated the diagnostic accuracy of BNP, NT-proBNP, and MR-proANP for HF. The last search was performed on 9 October 2014. The quality of the eligible studies was assessed using the revised Quality Assessment of Diagnostic Accuracy Studies tool. The diagnostic performance characteristics (sensitivity, specificity, and other measures of accuracy) were pooled and examined using a bivariate model. Results In total, 14 studies were included in the meta-analysis, including 12 studies reporting the diagnostic accuracy of PF NT-proBNP and 4 studies evaluating blood NT-proBNP. The summary estimates of PF NT-proBNP for HF had a diagnostic sensitivity of 0.94 (95% confidence interval [CI]: 0.90–0.96), specificity of 0.91 (95% CI: 0.86–0.95), positive likelihood ratio of 10.9 (95% CI: 6.4–18.6), negative likelihood ratio of 0.07 (95% CI: 0.04–0.12), and diagnostic odds ratio of 157 (95% CI: 57–430). The overall sensitivity of blood NT-proBNP for diagnosis of HF was 0.92 (95% CI: 0.86–0.95), with a specificity of 0.88 (95% CI: 0.77–0.94), positive likelihood ratio of 7.8 (95% CI: 3.7–16.3), negative likelihood ratio of 0.10 (95% CI: 0.06–0.16), and diagnostic odds ratio of 81 (95% CI: 27–241). The diagnostic accuracy of PF MR-proANP and blood and PF BNP was not analyzed due to the small number of related studies. 
Conclusions BNP, NT-proBNP, and MR-proANP, either in blood or PF, are effective tools for diagnosis of HF. Additional studies are needed to rigorously evaluate the diagnostic accuracy of PF and blood MR-proANP and BNP for the diagnosis of HF. PMID:26244664
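Illustrative note: the likelihood ratios and diagnostic odds ratio quoted in the record above are functions of sensitivity and specificity. The sketch below plugs in the pooled pleural fluid NT-proBNP point estimates (sensitivity 0.94, specificity 0.91) to show the arithmetic. The function name is an assumption, and a simple plug-in calculation need not reproduce the published values exactly, because the meta-analysis estimated them with a bivariate model.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a sensitivity/specificity pair.

    LR+ = sens / (1 - spec), LR- = (1 - sens) / spec, DOR = LR+ / LR-.
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Pooled point estimates quoted in the abstract for pleural fluid NT-proBNP.
lr_pos, lr_neg, dor = likelihood_ratios(0.94, 0.91)
```

The plug-in results land close to the reported positive likelihood ratio of 10.9, negative likelihood ratio of 0.07, and diagnostic odds ratio of 157, with small differences expected from the joint modelling.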
Usual interstitial pneumonia: typical, possible, and “inconsistent” patterns

PubMed Central

Torres, Pedro Paulo Teixeira e Silva; Rabahi, Marcelo Fouad; Moreira, Maria Auxiliadora Carmo; Meirelles, Gustavo de Souza Portes; Marchiori, Edson

2017-01-01

ABSTRACT Idiopathic pulmonary fibrosis is a severe and progressive chronic fibrosing interstitial lung disease, a definitive diagnosis being established by specific combinations of clinical, radiological, and pathological findings. According to current international guidelines, HRCT plays a key role in establishing a diagnosis of usual interstitial pneumonia (UIP). Current guidelines describe three UIP patterns based on HRCT findings: a typical UIP pattern; a pattern designated “possible UIP”; and a pattern designated “inconsistent with UIP”, each pattern having important diagnostic implications. A typical UIP pattern on HRCT is highly accurate for the presence of histopathological UIP, being currently considered to be diagnostic of UIP. The remaining patterns require further diagnostic investigation. Other known causes of a UIP pattern include drug-induced interstitial lung disease, chronic hypersensitivity pneumonitis, occupational diseases (e.g., asbestosis), and connective tissue diseases, all of which should be included in the clinical differential diagnosis. Given the importance of CT studies in establishing a diagnosis and the possibility of interobserver variability, the objective of this pictorial essay was to illustrate all three UIP patterns on HRCT. PMID:29160385
Radiological interpretation of images displayed on tablet computers: a systematic review

PubMed Central

Armfield, N R; Smith, A C

2015-01-01

Objective: To review the published evidence and to determine if radiological diagnostic accuracy is compromised when images are displayed on a tablet computer and thereby inform practice on using tablet computers for radiological interpretation by on-call radiologists. Methods: We searched the PubMed and EMBASE databases for studies on the diagnostic accuracy or diagnostic reliability of images interpreted on tablet computers. Studies were screened for inclusion based on pre-determined inclusion and exclusion criteria. Studies were assessed for quality and risk of bias using Quality Appraisal of Diagnostic Reliability Studies or the revised Quality Assessment of Diagnostic Accuracy Studies tool. Treatment of studies was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). Results: 11 studies met the inclusion criteria. 10 of these studies tested the Apple iPad® (Apple, Cupertino, CA). The included studies reported high sensitivity (84–98%), specificity (74–100%) and accuracy rates (98–100%) for radiological diagnosis. There was no statistically significant difference in accuracy between a tablet computer and a digital imaging and communication in medicine-calibrated control display. There was a near complete consensus from authors on the non-inferiority of diagnostic accuracy of images displayed on a tablet computer. All of the included studies were judged to be at risk of bias. Conclusion: Our findings suggest that the diagnostic accuracy of radiological interpretation is not compromised by using a tablet computer. This result is only relevant to the Apple iPad and to the modalities of CT, MRI and plain radiography. Advances in knowledge: The iPad may be appropriate for an on-call radiologist to use for radiological interpretation. PMID:25882691
Meta-analysis of stratus OCT glaucoma diagnostic accuracy.

PubMed

Chen, Hsin-Yi; Chang, Yue-Cune

2014-09-01

To evaluate the diagnostic accuracy of glaucoma in different stages, different types of glaucoma, and different ethnic groups using Stratus optical coherence tomography (OCT). We searched MEDLINE to identify available articles on diagnostic accuracy of glaucoma published between January 2004 and December 2011. A PubMed (National Center for Biotechnology Information) search using medical subject headings and keywords was executed using the following terms: "diagnostic accuracy" or "receiver operator characteristic" or "area under curve" or "AUC" and "Stratus OCT" and "glaucoma." The search was subsequently limited to publications in English. The area under a receiver operator characteristic (AUC) curve was used to measure the diagnostic performance. A random-effects model was used to estimate the pooled AUC value of the 17 parameters (average retinal nerve fiber layer thickness, temporal quadrant, superior quadrant, nasal quadrant, inferior quadrant, and 1 to 12 o'clock). Meta-regression analysis was used to check the significance of some important factors: (1) glaucoma severity (five stages), (2) glaucoma types (four types), and (3) ethnicity (four categories). The orders of accuracy among those parameters were as follows: average > inferior > superior > 7 o'clock > 6 o'clock > 11 o'clock > 12 o'clock > 1 o'clock > 5 o'clock > nasal > temporal > 2 o'clock > 10 o'clock > 8 o'clock > 9 o'clock > 4 o'clock > 3 o'clock. After adjusting for the effects of age, glaucoma severity, glaucoma types, and ethnicity, the average retinal nerve fiber layer thickness provided highest accuracy compared with the other parameters of OCT. The diagnostic accuracy in Asian populations was significantly lower than that in whites and the other two ethnic types. Stratus OCT demonstrated good diagnostic capability in differentiating glaucomatous from normal eyes. However, we should be more cautious in applying this instrument in Asian groups in glaucoma management.
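Illustrative note: the record above pools AUC values with a random-effects model but does not state which estimator was used. The sketch below assumes the common DerSimonian-Laird approach purely for illustration. The function name, the per-study AUCs, and their variances are hypothetical.

```python
from math import sqrt


def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return pooled, sqrt(1.0 / sum(w_star)), tau2


# Hypothetical per-study AUCs for one OCT parameter, with their variances.
aucs = [0.85, 0.90, 0.78]
variances = [0.0004, 0.0009, 0.0016]

pooled_auc, se_auc, tau2 = pool_random_effects(aucs, variances)
```

When the studies disagree more than their sampling error explains, tau^2 becomes positive and the weights move toward equality, so small studies count for more than they would under a fixed-effect model.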
Accuracy of computer-aided diagnosis based on narrow-band imaging endocytoscopy for diagnosing colorectal lesions: comparison with experts.

PubMed

Misawa, Masashi; Kudo, Shin-Ei; Mori, Yuichi; Takeda, Kenichi; Maeda, Yasuharu; Kataoka, Shinichi; Nakamura, Hiroki; Kudo, Toyoki; Wakamura, Kunihiko; Hayashi, Takemasa; Katagiri, Atsushi; Baba, Toshiyuki; Ishida, Fumio; Inoue, Haruhiro; Nimura, Yukitaka; Oda, Masahiro; Mori, Kensaku

2017-05-01

Real-time characterization of colorectal lesions during colonoscopy is important for reducing medical costs, given that the need for a pathological diagnosis can be omitted if the accuracy of the diagnostic modality is sufficiently high. However, it is sometimes difficult for community-based gastroenterologists to achieve the required level of diagnostic accuracy. In this regard, we developed a computer-aided diagnosis (CAD) system based on endocytoscopy (EC) to evaluate cellular, glandular, and vessel structure atypia in vivo. The purpose of this study was to compare the diagnostic ability and efficacy of this CAD system with the performances of human expert and trainee endoscopists. We developed a CAD system based on EC with narrow-band imaging that allowed microvascular evaluation without dye (ECV-CAD). The CAD algorithm was programmed based on texture analysis and provided a two-class diagnosis of neoplastic or non-neoplastic, with probabilities. We validated the diagnostic ability of the ECV-CAD system using 173 randomly selected EC images (49 non-neoplasms, 124 neoplasms). The images were evaluated by the CAD and by four expert endoscopists and three trainees. The diagnostic accuracies for distinguishing between neoplasms and non-neoplasms were calculated. ECV-CAD had higher overall diagnostic accuracy than trainees (87.8 vs 63.4%; [Formula: see text]), but similar to experts (87.8 vs 84.2%; [Formula: see text]). With regard to high-confidence cases, the overall accuracy of ECV-CAD was also higher than trainees (93.5 vs 71.7%; [Formula: see text]) and comparable to experts (93.5 vs 90.8%; [Formula: see text]). ECV-CAD showed better diagnostic accuracy than trainee endoscopists and was comparable to that of experts. ECV-CAD could thus be a powerful decision-making tool for less-experienced endoscopists.
Methodology and reporting of diagnostic accuracy studies of automated perimetry in glaucoma: evaluation using a standardised approach.

PubMed

Fidalgo, Bruno M R; Crabb, David P; Lawrenson, John G

2015-05-01

To evaluate methodological and reporting quality of diagnostic accuracy studies of perimetry in glaucoma and to determine whether there had been any improvement since the publication of the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines. A systematic review of English language articles published between 1993 and 2013 reporting the diagnostic accuracy of perimetry in glaucoma. Articles were appraised for methodological quality using the 14-item Quality assessment tool for diagnostic accuracy studies (QUADAS) and evaluated for quality of reporting by applying the STARD checklist. Fifty-eight articles were appraised. Overall methodological quality of these studies was moderate with a median number of QUADAS items rated as 'yes' equal to nine (out of a maximum of 14) (IQR 7-10). The studies were often poorly reported; median score of STARD items fully reported was 11 out of 25 (IQR 10-14). A comparison of the studies published in 10-year periods before and after the publication of the STARD checklist in 2003 found quality of reporting had not substantially improved. Methodological and reporting quality of diagnostic accuracy studies of perimetry is sub-optimal and appears not to have improved substantially following the development of the STARD reporting guidance. This observation is consistent with previous studies in ophthalmology and in other medical specialities. © 2015 The Authors Ophthalmic & Physiological Optics © 2015 The College of Optometrists.
Routine Use of Adjunctive p16 Immunohistochemistry Improves Diagnostic Agreement of Cervical Biopsy Interpretation: Results From the CERTAIN Study.

PubMed

Stoler, Mark H; Wright, Thomas C; Ferenczy, Alex; Ranger-Moore, James; Fang, Qijun; Kapadia, Monesh; Ridder, Ruediger

2018-04-24

The diagnosis of squamous intraepithelial lesions in cervical tissue specimens is subject to substantial variability. Adjunctive immunohistochemical (IHC) staining for p16 has been shown to add objective biomarker information to morphologic interpretation of hematoxylin and eosin (H&E)-stained tissues. In the CERvical Tissue AdjunctIve aNalysis (CERTAIN) study, we systematically analyzed the impact of adjunctive p16 IHC on the accuracy (agreement with reference pathology results) of diagnosing cervical intraepithelial neoplasia of grade 2 or worse (CIN2+) in the United States. Eleven hundred cervical biopsies were divided into 4 sets of 275 cases by stratified randomization. All H&E slides from each set were interpreted by 17 to 18 individual surgical pathologists, for a total of 19,250 reads by 70 surgical pathologists. After a wash-out period and blinding to original results, cases were re-read by the same pathologists using H&E+p16-stained slides. Using expert consensus diagnoses on H&E+p16 as reference, adjunctive p16 IHC use significantly improved diagnostic agreement of surgical pathologists by 4.7% (95% confidence interval [CI], 3.9, 5.4; P<0.0001). This improvement was driven by an increase of 11.5% (95% CI, 9.3, 13.5; P<0.0001) in sensitivity and an increase of 3.0% (95% CI, 2.2, 3.7; P<0.0001) in specificity. Diagnostic performance was significantly increased as well when expert consensus diagnoses established on H&E only was used as reference. Furthermore, interobserver reliability improved significantly from moderate (H&E: κ=0.58) to substantial (H&E+p16: κ=0.73; P<0.0001). 
Adjunctive use of p16 IHC provides more accurate and reproducible diagnostic results in the interpretation of cervical biopsies, ensuring that more patients are treated correctly without treating more patients. This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. http://creativecommons.org/licenses/by-nc-nd/4.0/.
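The kappa values quoted above (0.58 "moderate", 0.73 "substantial") can be illustrated with a minimal sketch of Cohen's kappa for two raters. This is not the CERTAIN study's analysis: with 17 to 18 readers per case the study would have needed a multi-rater statistic, and the function names, the toy ratings, and the choice of the commonly used Landis and Koch bands are assumptions made here for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Verbal band for kappa, using the Landis and Koch convention (an assumption here)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings (1 = CIN2+, 0 = less than CIN2), not study data.
reader_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reader_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(reader_1, reader_2)
print(round(kappa, 2), landis_koch_label(kappa))
```

Under that convention the reported shift from 0.58 to 0.73 crosses from the "moderate" band into the "substantial" band, which matches the wording of the abstract.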
Implementation of a Posted Schedule to Increase Class-Wide Interobserver Agreement Assessment

ERIC Educational Resources Information Center

Doucette, Stefanie; DiGennaro Reed, Florence D.; Reed, Derek D.; Maguire, Helena; Marquardt, Heidi

2012-01-01

The present study investigated the impact of an antecedent intervention in the form of a daily posted schedule on the interobserver agreement (IOA) assessment of educational goals implemented within a classroom at a private school serving individuals with disabilities. During baseline, the percentage of academic goals with interobserver agreement…
Diagnostic accuracy of fractional exhaled nitric oxide measurement in predicting cough-variant asthma and eosinophilic bronchitis in adults with chronic cough: A systematic review and meta-analysis.

PubMed

Song, Woo-Jung; Kim, Hyun Jung; Shim, Ji-Su; Won, Ha-Kyeong; Kang, Sung-Yoon; Sohn, Kyoung-Hee; Kim, Byung-Keun; Jo, Eun-Jung; Kim, Min-Hye; Kim, Sang-Heon; Park, Heung-Woo; Kim, Sun-Sin; Chang, Yoon-Seok; Morice, Alyn H; Lee, Byung-Jae; Cho, Sang-Heon

2017-09-01

Individual studies have suggested the utility of fractional exhaled nitric oxide (Feno) measurement in detecting cough-variant asthma (CVA) and eosinophilic bronchitis (EB) in patients with chronic cough. We sought to obtain summary estimates of diagnostic test accuracy of Feno measurement in predicting CVA, EB, or both in adults with chronic cough. Electronic databases were searched for studies published until January 2016, without language restriction. Cross-sectional studies that reported the diagnostic accuracy of Feno measurement for detecting CVA or EB were included. Risk of bias was assessed with Quality Assessment of Diagnostic Accuracy Studies 2. Random effects meta-analyses were performed to obtain summary estimates of the diagnostic accuracy of Feno measurement. A total of 15 studies involving 2187 adults with chronic cough were identified. Feno measurement had a moderate diagnostic accuracy in predicting CVA in patients with chronic cough, showing the summary area under the curve to be 0.87 (95% CI, 0.83-0.89). Specificity was higher and more consistent than sensitivity (0.85 [95% CI, 0.81-0.88] and 0.72 [95% CI, 0.61-0.81], respectively). However, in the nonasthmatic population with chronic cough, the diagnostic accuracy to predict EB was found to be relatively lower (summary area under the curve, 0.81 [95% CI, 0.77-0.84]), and specificity was inconsistent. The present meta-analyses indicated the diagnostic potential of Feno measurement as a rule-in test for detecting CVA in adult patients with chronic cough. However, Feno measurement may not be useful to predict EB in nonasthmatic subjects with chronic cough. These findings warrant further studies to validate the roles of Feno measurement in clinical practice of patients with chronic cough. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Radiologists’ Interpretive Skills in Screening vs. Diagnostic Mammography: Are They Related?

PubMed Central

Elmore, Joann G.; Cook, Andrea J.; Bogart, Andy; Carney, Patricia A.; Geller, Berta; Taplin, Stephen; Buist, Diana SM; Onega, Tracy; Lee, Christoph I.; Miglioretti, Diana L.

2016-01-01

Purpose To determine whether radiologists who perform well in screening also perform well in interpreting diagnostic mammography. Materials & Methods We evaluated the accuracy of 468 radiologists interpreting 2,234,947 screening and 196,164 diagnostic mammograms. Adjusting for site, radiologist, and patient characteristics, we identified radiologists with performance in the highest tertile and compared to those with lower performance. Results A moderate correlation was noted for radiologists’ accuracy when interpreting screening versus their accuracy on diagnostic exams: sensitivity (Spearman r = 0.51, 95% CI: 0.22, 0.80; P=0.0006), specificity (Spearman r = 0.40, 95% CI: 0.30, 0.49; P<0.0001). Conclusion Different educational approaches to screening and diagnostic imaging should be considered. PMID:27438069
Does clinical pretest probability influence image quality and diagnostic accuracy in dual-source coronary CT angiography?

PubMed

Thomas, Christoph; Brodoefel, Harald; Tsiflikas, Ilias; Bruckner, Friederike; Reimann, Anja; Ketelsen, Dominik; Drosch, Tanja; Claussen, Claus D; Kopp, Andreas; Heuschmid, Martin; Burgstahler, Christof

2010-02-01

To prospectively evaluate the influence of the clinical pretest probability assessed by the Morise score on image quality and diagnostic accuracy in coronary dual-source computed tomography angiography (DSCTA). In 61 patients, DSCTA and invasive coronary angiography were performed. Subjective image quality and accuracy for stenosis detection (>50%) of DSCTA with invasive coronary angiography as gold standard were evaluated. The influence of pretest probability on image quality and accuracy was assessed by logistic regression and chi-square testing. Correlations of image quality and accuracy with the Morise score were determined using linear regression. Thirty-eight patients were categorized into the high, 21 into the intermediate, and 2 into the low probability group. Accuracies for the detection of significant stenoses were 0.94, 0.97, and 1.00, respectively. Logistic regressions and chi-square tests showed statistically significant correlations between Morise score and image quality (P < .0001 and P < .001) and accuracy (P = .0049 and P = .027). Linear regression revealed a cutoff Morise score for a good image quality of 16 and a cutoff for a barely diagnostic image quality beyond the upper Morise scale. Pretest probability is a weak predictor of image quality and diagnostic accuracy in coronary DSCTA. A sufficient image quality for diagnostic images can be reached with all pretest probabilities. Therefore, coronary DSCTA might be suitable also for patients with a high pretest probability. Copyright 2010 AUR. Published by Elsevier Inc. All rights reserved.
Proposed Diagnostic Criteria for Smartphone Addiction

PubMed Central

Lin, Yu-Hsuan; Chiang, Chih-Lin; Lin, Po-Hsien; Chang, Li-Ren; Ko, Chih-Hung; Lee, Yang-Han

2016-01-01

Background Global smartphone penetration has led to unprecedented addictive behaviors. The aims of this study are to develop diagnostic criteria of smartphone addiction and to examine the discriminative ability and the validity of the diagnostic criteria. Methods We developed twelve candidate criteria for characteristic symptoms of smartphone addiction and four criteria for functional impairment caused by excessive smartphone use. The participants consisted of 281 college students. Each participant was systematically assessed for smartphone-using behaviors by psychiatrist’s structured diagnostic interview. The sensitivity, specificity, and diagnostic accuracy of the candidate symptom criteria were analyzed with reference to the psychiatrists’ clinical global impression. The optimal model selection with its cutoff point of the diagnostic criteria differentiating the smartphone addicted subjects from non-addicted subjects was then determined by the best diagnostic accuracy. Results Six symptom criteria model with optimal cutoff point were determined based on the maximal diagnostic accuracy. The proposed smartphone addiction diagnostic criteria consisted of (1) six symptom criteria, (2) four functional impairment criteria and (3) exclusion criteria. Setting three symptom criteria as the cutoff point resulted in the highest diagnostic accuracy (84.3%), while the sensitivity and specificity were 79.4% and 87.5%, respectively. We suggested determining the functional impairment by two or more of the four domains considering the high accessibility and penetration of smartphone use. Conclusion The diagnostic criteria of smartphone addiction demonstrated the core symptoms “impaired control” paralleled with substance related and addictive disorders. The functional impairment involved multiple domains provide a strict standard for clinical assessment. PMID:27846211
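The cutoff selection described above (choosing the number of symptom criteria that gives the highest diagnostic accuracy against the reference judgement) can be sketched as a simple sweep. This is only an illustration of the general idea, not the authors' procedure; the function name, the tie-breaking rule, and the toy data are assumptions.

```python
def best_cutoff(criteria_met, is_case, max_criteria):
    """Return (cutoff, accuracy) maximising accuracy of the rule 'criteria met >= cutoff'.

    criteria_met: number of symptom criteria each participant meets
    is_case:      reference classification (True = addicted) for each participant
    Ties are resolved in favour of the lowest cutoff (an arbitrary choice).
    """
    if len(criteria_met) != len(is_case) or not is_case:
        raise ValueError("need two equal-length, non-empty lists")
    best = None
    for cutoff in range(1, max_criteria + 1):
        predicted = [n >= cutoff for n in criteria_met]
        correct = sum(p == t for p, t in zip(predicted, is_case))
        accuracy = correct / len(is_case)
        if best is None or accuracy > best[1]:
            best = (cutoff, accuracy)
    return best


# Hypothetical participants, not study data.
met = [0, 1, 2, 3, 4, 5, 6, 2]
reference = [False, False, False, True, True, True, True, False]
print(best_cutoff(met, reference, max_criteria=6))
```

In real data the accuracy curve is rarely this clean, so the sensitivity and specificity at the chosen cutoff are usually reported alongside it, as the abstract does.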
Proposed Diagnostic Criteria for Smartphone Addiction.

PubMed

Lin, Yu-Hsuan; Chiang, Chih-Lin; Lin, Po-Hsien; Chang, Li-Ren; Ko, Chih-Hung; Lee, Yang-Han; Lin, Sheng-Hsuan

2016-01-01

Global smartphone penetration has led to unprecedented addictive behaviors. The aims of this study are to develop diagnostic criteria of smartphone addiction and to examine the discriminative ability and the validity of the diagnostic criteria. We developed twelve candidate criteria for characteristic symptoms of smartphone addiction and four criteria for functional impairment caused by excessive smartphone use. The participants consisted of 281 college students. Each participant was systematically assessed for smartphone-using behaviors by psychiatrist's structured diagnostic interview. The sensitivity, specificity, and diagnostic accuracy of the candidate symptom criteria were analyzed with reference to the psychiatrists' clinical global impression. The optimal model selection with its cutoff point of the diagnostic criteria differentiating the smartphone addicted subjects from non-addicted subjects was then determined by the best diagnostic accuracy. Six symptom criteria model with optimal cutoff point were determined based on the maximal diagnostic accuracy. The proposed smartphone addiction diagnostic criteria consisted of (1) six symptom criteria, (2) four functional impairment criteria and (3) exclusion criteria. Setting three symptom criteria as the cutoff point resulted in the highest diagnostic accuracy (84.3%), while the sensitivity and specificity were 79.4% and 87.5%, respectively. We suggested determining the functional impairment by two or more of the four domains considering the high accessibility and penetration of smartphone use. The diagnostic criteria of smartphone addiction demonstrated the core symptoms "impaired control" paralleled with substance related and addictive disorders. The functional impairment involved multiple domains provide a strict standard for clinical assessment.
Cluster signal-to-noise analysis for evaluation of the information content in an image.

PubMed

Weerawanich, Warangkana; Shimizu, Mayumi; Takeshita, Yohei; Okamura, Kazutoshi; Yoshida, Shoko; Yoshiura, Kazunori

2018-01-01

(1) To develop an observer-free method of analysing image quality related to the observer performance in the detection task and (2) to analyse observer behaviour patterns in the detection of small mass changes in cone-beam CT images. 13 observers detected holes in a Teflon phantom in cone-beam CT images. Using the same images, we developed a new method, cluster signal-to-noise analysis, to detect the holes by applying various cut-off values using ImageJ and reconstructing cluster signal-to-noise curves. We then evaluated the correlation between cluster signal-to-noise analysis and the observer performance test. We measured the background noise in each image to evaluate the relationship with false positive rates (FPRs) of the observers. Correlations between mean FPRs and intra- and interobserver variations were also evaluated. Moreover, we calculated true positive rates (TPRs) and accuracies from background noise and evaluated their correlations with TPRs from observers. Cluster signal-to-noise curves were derived in cluster signal-to-noise analysis. They yield the detection of signals (true holes) related to noise (false holes). This method correlated highly with the observer performance test (R² = 0.9296). In noisy images, increasing background noise resulted in higher FPRs and larger intra- and interobserver variations. TPRs and accuracies calculated from background noise had high correlation with actual TPRs from observers; R² was 0.9244 and 0.9338, respectively. Cluster signal-to-noise analysis can simulate the detection performance of observers and thus replace the observer performance test in the evaluation of image quality. Erroneous decision-making increased with increasing background noise.
Can imaging criteria distinguish enchondroma from grade 1 chondrosarcoma?

PubMed

Crim, Julia; Schmidt, Robert; Layfield, Lester; Hanrahan, Christopher; Manaster, Betty Jean

2015-11-01

To minimize systematic bias and optimize agreement on imaging criteria in order to better define the accuracy of imaging criteria in the diagnosis of grade 1 chondrosarcoma. Study was IRB-approved and HIPAA compliant; informed consent was waived. Records were reviewed and disclosed 53 cases (38 women, 15 men ages 21-76) which were diagnosed as enchondroma or grade 1 chondrosarcoma and had available radiographs, contrast-enhanced MRI, and definitive diagnosis by histology or 5-year follow-up. 2 MSK radiologists read the studies independently after a session where they agreed on criteria for malignancy. Interobserver variability was determined as raw variability and with the kappa statistic. Accuracy was determined compared to final diagnosis. Reliability of imaging features of chondrosarcoma was determined using regression analysis. The correct diagnosis of enchondroma was made on radiographs in 43/64 (67.2%) of readings, and on MRI in 37/64 (57.8%). The correct diagnosis of chondrosarcoma was made on radiographs in 5/24 (20.8%) of readings, and on MRI in 14/24 (57.8%). A diagnosis of borderline lesion was made in 19/64 (29.7%) of enchondromas on radiographs and 18/64 (28.1%) on MRI. The false positive rate of radiographs for chondrosarcoma was 2/64 (3.1%) and the false positive rate of MRI was 9/64 (14.1%). There was substantial interobserver variability. Cortical thickening and bone expansion were rare but specific signs of chondrosarcoma. Both radiographs and MRI have limitations in the evaluation of low-grade cartilage lesions. MRI has an increased rate of both true-positive and false-positive diagnosis compared to radiographs. Differences in the findings of this study compared to previous literature may reflect the influence of systematic biases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

Fully Automatic Segmentation of Fluorescein Leakage in Subjects With Diabetic Macular Edema

PubMed Central

Rabbani, Hossein; Allingham, Michael J.; Mettu, Priyatham S.; Cousins, Scott W.; Farsiu, Sina

2015-01-01

Purpose. To create and validate software to automatically segment leakage area in real-world clinical fluorescein angiography (FA) images of subjects with diabetic macular edema (DME). Methods. Fluorescein angiography images obtained from 24 eyes of 24 subjects with DME were retrospectively analyzed. Both video and still-frame images were obtained using a Heidelberg Spectralis 6-mode HRA/OCT unit. We aligned early and late FA frames in the video by a two-step nonrigid registration method. To remove background artifacts, we subtracted early and late FA frames. Finally, after postprocessing steps, including detection and inpainting of the vessels, a robust active contour method was utilized to obtain leakage area in a 1500-μm-radius circular region centered at the fovea. Images were captured at different fields of view (FOVs) and were often contaminated with outliers, as is the case in real-world clinical imaging. Our algorithm was applied to these images with no manual input. Separately, all images were manually segmented by two retina specialists. The sensitivity, specificity, and accuracy of manual interobserver, manual intraobserver, and automatic methods were calculated. Results. The mean accuracy was 0.86 ± 0.08 for automatic versus manual, 0.83 ± 0.16 for manual interobserver, and 0.90 ± 0.08 for manual intraobserver segmentation methods. Conclusions. Our fully automated algorithm can reproducibly and accurately quantify the area of leakage of clinical-grade FA video and is congruent with expert manual segmentation. The performance was reliable for different DME subtypes. This approach has the potential to reduce time and labor costs and may yield objective and reproducible quantitative measurements of DME imaging biomarkers. PMID:25634978
Fully automatic segmentation of fluorescein leakage in subjects with diabetic macular edema.

PubMed

Rabbani, Hossein; Allingham, Michael J; Mettu, Priyatham S; Cousins, Scott W; Farsiu, Sina

2015-01-29

To create and validate software to automatically segment leakage area in real-world clinical fluorescein angiography (FA) images of subjects with diabetic macular edema (DME). Fluorescein angiography images obtained from 24 eyes of 24 subjects with DME were retrospectively analyzed. Both video and still-frame images were obtained using a Heidelberg Spectralis 6-mode HRA/OCT unit. We aligned early and late FA frames in the video by a two-step nonrigid registration method. To remove background artifacts, we subtracted early and late FA frames. Finally, after postprocessing steps, including detection and inpainting of the vessels, a robust active contour method was utilized to obtain leakage area in a 1500-μm-radius circular region centered at the fovea. Images were captured at different fields of view (FOVs) and were often contaminated with outliers, as is the case in real-world clinical imaging. Our algorithm was applied to these images with no manual input. Separately, all images were manually segmented by two retina specialists. The sensitivity, specificity, and accuracy of manual interobserver, manual intraobserver, and automatic methods were calculated. The mean accuracy was 0.86 ± 0.08 for automatic versus manual, 0.83 ± 0.16 for manual interobserver, and 0.90 ± 0.08 for manual intraobserver segmentation methods. Our fully automated algorithm can reproducibly and accurately quantify the area of leakage of clinical-grade FA video and is congruent with expert manual segmentation. The performance was reliable for different DME subtypes. This approach has the potential to reduce time and labor costs and may yield objective and reproducible quantitative measurements of DME imaging biomarkers. Copyright 2015 The Association for Research in Vision and Ophthalmology, Inc.
Vestibular schwannomas: Accuracy of tumor volume estimated by ice cream cone formula using thin-sliced MR images.

PubMed

Ho, Hsing-Hao; Li, Ya-Hui; Lee, Jih-Chin; Wang, Chih-Wei; Yu, Yi-Lin; Hueng, Dueng-Yuan; Ma, Hsin-I; Hsu, Hsian-He; Juan, Chun-Jung

2018-01-01

We estimated the volume of vestibular schwannomas by an ice cream cone formula using thin-sliced magnetic resonance images (MRI) and compared the estimation accuracy among different estimating formulas and between different models. The study was approved by a local institutional review board. A total of 100 patients with vestibular schwannomas examined by MRI between January 2011 and November 2015 were enrolled retrospectively. Informed consent was waived. Volumes of vestibular schwannomas were estimated by cuboidal, ellipsoidal, and spherical formulas based on a one-component model, and cuboidal, ellipsoidal, Linskey's, and ice cream cone formulas based on a two-component model. The estimated volumes were compared to the volumes measured by planimetry. Intraobserver reproducibility and interobserver agreement were tested. Estimation error, including absolute percentage error (APE) and percentage error (PE), was calculated. Statistical analysis included intraclass correlation coefficient (ICC), linear regression analysis, one-way analysis of variance, and paired t-tests with P < 0.05 considered statistically significant. Overall tumor size was 4.80 ± 6.8 mL (mean ± standard deviation). All ICCs were no less than 0.992, suggestive of high intraobserver reproducibility and high interobserver agreement. Cuboidal formulas significantly overestimated the tumor volume by a factor of 1.9 to 2.4 (P ≤ 0.001). The one-component ellipsoidal and spherical formulas overestimated the tumor volume with an APE of 20.3% and 29.2%, respectively. The two-component ice cream cone method, and ellipsoidal and Linskey's formulas significantly reduced the APE to 11.0%, 10.1%, and 12.5%, respectively (all P < 0.001). The ice cream cone method and other two-component formulas including the ellipsoidal and Linskey's formulas allow for estimation of vestibular schwannoma volume more accurately than all one-component formulas.
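The one-component formulas and the error measures named above can be sketched as follows. The standard geometric expressions for a box, an ellipsoid, and a sphere are used; the abstract does not define its exact measurement conventions, nor the ice cream cone or Linskey's formulas, so those are not reproduced. All function names and the example diameters and planimetric volume are assumptions for illustration.

```python
import math


def cuboidal_volume(a, b, c):
    """Volume of the bounding box from three orthogonal diameters (cm -> mL)."""
    return a * b * c


def ellipsoidal_volume(a, b, c):
    """Ellipsoid volume from three orthogonal diameters: (pi / 6) * a * b * c."""
    return math.pi / 6.0 * a * b * c


def spherical_volume(d):
    """Sphere volume from a single diameter: (pi / 6) * d^3."""
    return math.pi / 6.0 * d ** 3


def percentage_error(estimated, reference):
    """Signed percentage error (PE) relative to the reference volume."""
    return 100.0 * (estimated - reference) / reference


def absolute_percentage_error(estimated, reference):
    """Absolute percentage error (APE)."""
    return abs(percentage_error(estimated, reference))


# Hypothetical tumour: diameters in cm and a made-up planimetric reference volume in mL.
a, b, c = 2.0, 3.0, 4.0
reference_ml = 11.0
for name, est in [("cuboidal", cuboidal_volume(a, b, c)),
                  ("ellipsoidal", ellipsoidal_volume(a, b, c))]:
    print(name, round(est, 2), round(percentage_error(est, reference_ml), 1))
```

For the same three diameters the box volume is always 6/π (about 1.91) times the ellipsoid volume, which is in line with the lower end of the 1.9 to 2.4 overestimation factor reported for cuboidal formulas.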
Pulmonary tumor measurements from x-ray computed tomography in one, two, and three dimensions.

PubMed

Villemaire, Lauren; Owrangi, Amir M; Etemad-Rezai, Roya; Wilson, Laura; O'Riordan, Elaine; Keller, Harry; Driscoll, Brandon; Bauman, Glenn; Fenster, Aaron; Parraga, Grace

2011-11-01

We evaluated the accuracy and reproducibility of three-dimensional (3D) measurements of lung phantoms and patient tumors from x-ray computed tomography (CT) and compared these to one-dimensional (1D) and two-dimensional (2D) measurements. CT images of three spherical and three irregularly shaped tumor phantoms were evaluated by three observers who performed five repeated measurements. Additionally, three observers manually segmented 29 patient lung tumors five times each. Follow-up imaging was performed for 23 tumors and response criteria were compared. For a single subject, imaging was performed on nine occasions over 2 years to evaluate multidimensional tumor response. To evaluate measurement accuracy, we compared imaging measurements to ground truth using analysis of variance. For estimates of precision, intraobserver and interobserver coefficients of variation and intraclass correlations (ICC) were used. Linear regression and Pearson correlations were used to evaluate agreement and tumor response was descriptively compared. For spherical shaped phantoms, all measurements were highly accurate, but for irregularly shaped phantoms, only 3D measurements were in high agreement with ground truth measurements. All phantom and patient measurements showed high intra- and interobserver reproducibility (ICC >0.900). Over a 2-year period for a single patient, there was disagreement between tumor response classifications based on 3D measurements and those generated using 1D and 2D measurements. Tumor volume measurements were highly reproducible and accurate for irregular, spherical phantoms and patient tumors with nonuniform dimensions. Response classifications obtained from multidimensional measurements suggest that 3D measurements provide higher sensitivity to tumor response. Copyright © 2011 AUR. Published by Elsevier Inc. All rights reserved.
Pancreatic mucinous cystic neoplasm size using CT volumetry, spherical and ellipsoid formulas: validation study.

PubMed

Chalian, Hamid; Seyal, Adeel Rahim; Rezai, Pedram; Töre, Hüseyin Gürkan; Miller, Frank H; Bentrem, David J; Yaghmai, Vahid

2014-01-10

The accuracy for determining pancreatic cyst volume with commonly used spherical and ellipsoid methods is unknown. The role of CT volumetry in volumetric assessment of pancreatic cysts needs to be explored. To compare volumes of the pancreatic cysts by CT volumetry, spherical and ellipsoid methods and determine their accuracy by correlating with actual volume as determined by EUS-guided aspiration. Setting: This is a retrospective analysis performed at a tertiary care center. Patients: Seventy-eight pathologically proven pancreatic cysts evaluated with CT and endoscopic ultrasound (EUS) were included. Design: The volume of fourteen cysts that had been fully aspirated by EUS was compared to CT volumetry and the routinely used methods (ellipsoid and spherical volume). Two independent observers measured all cysts using commercially available software to evaluate inter-observer reproducibility for CT volumetry. The volume of pancreatic cysts as determined by various methods was compared using repeated measures analysis of variance. Bland-Altman plot and intraclass correlation coefficient were used to determine mean difference and correlation between observers and methods. The error was calculated as the percentage of the difference between the CT estimated volumes and the aspirated volume divided by the aspirated one. CT volumetry was comparable to aspirated volume (P=0.396) with very high intraclass correlation (r=0.891, P<0.001) and small mean difference (0.22 mL) and error (8.1%). Mean difference with aspirated volume and error were larger for ellipsoid (0.89 mL, 30.4%; P=0.024) and spherical (1.73 mL, 55.5%; P=0.004) volumes than CT volumetry. There was excellent inter-observer correlation in volumetry of the entire cohort (r=0.997, P<0.001). CT volumetry is accurate and reproducible. Ellipsoid and spherical volume overestimate the true volume of pancreatic cysts.
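The mean difference reported above comes from a Bland-Altman comparison. A minimal sketch of the numbers behind such a plot (bias and 95% limits of agreement) is given here; the function name and the paired volumes are hypothetical, and the conventional bias ± 1.96 SD limits are assumed because the abstract does not state how its limits were computed.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired volumes in mL (estimated vs aspirated), not study data.
estimated = [1.0, 2.0, 3.0, 4.0]
aspirated = [0.5, 2.5, 2.5, 3.5]
print(bland_altman(estimated, aspirated))
```

A positive bias means the first method reads higher on average, which is the sense in which the ellipsoid and spherical formulas are said to overestimate the aspirated volume.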
Evaluation of left ventricular function using electrocardiographically gated myocardial SPECT with (123)I-labeled fatty acid analog.

PubMed

Nanasato, M; Ando, A; Isobe, S; Nonokawa, M; Hirayama, H; Tsuboi, N; Ito, T; Hirai, M; Yokota, M; Saito, H

2001-12-01

Electrocardiographically (ECG) gated myocardial SPECT with (99m)Tc-tetrofosmin has been used widely to assess left ventricular (LV) function. However, the accuracy of variables using ECG gated myocardial SPECT with beta-methyl-p-(123)I-iodophenylpentadecanoic acid (BMIPP) has not been well defined. Thirty-six patients (29 men, 7 women; mean age, 61.6 +/- 15.6 y) with ischemic heart disease underwent ECG gated myocardial SPECT with (123)I-BMIPP and with (99m)Tc-tetrofosmin and left ventriculography (LVG) within 1 wk. LV ejection fraction (LVEF), LV end-diastolic volume (LVEDV), and LV end-systolic volume (LVESV) were determined on gated SPECT using commercially available software for automatic data analysis. These volume-related items on LVG were calculated with an area-length method and were estimated by 2 independent observers to evaluate interobserver validity. The regional wall motion with these methods was assessed visually. LVEF was 41.1% +/- 12.5% on gated SPECT with (123)I-BMIPP, 44.5% +/- 13.1% on gated SPECT with (99m)Tc-tetrofosmin, and 46.0% +/- 12.7% on LVG. Global LV function and regional wall motion between both gated SPECT procedures had excellent correlation (LVEF, r = 0.943; LVEDV, r = 0.934; LVESV, r = 0.952; regional wall motion, kappa = 0.92). However, the correlations of global LV function and regional wall motion between each gated SPECT and LVG were significantly lower. Gated SPECT with (123)I-BMIPP showed the same interobserver validity as gated SPECT with (99m)Tc-tetrofosmin. Gated SPECT with (123)I-BMIPP provides high accuracy with regard to LV function and is sufficiently applicable for use in clinical SPECT. This technique can simultaneously reveal myocardial fatty acid metabolism and LV function, which may be useful to evaluate various cardiac diseases.
Comparison between fine needle aspiration cytology (FNAC) and core needle biopsy (CNB) in the diagnosis of breast lesions.

PubMed

Moschetta, M; Telegrafo, M; Carluccio, D A; Jablonska, J P; Rella, L; Serio, Gabriella; Carrozzo, M; Stabile Ianora, A A; Angelelli, G

2014-01-01

To compare the diagnostic accuracy of fine-needle aspiration cytology (FNAC) and core needle biopsy (CNB) in patients with US-detected breast lesions. Between September 2011 and May 2013, 3469 consecutive breast US examinations were performed. 400 breast nodules were detected in 398 patients. 210 FNACs and 190 CNBs were performed. 183 out of 400 (46%) lesions were surgically removed within 30 days from diagnosis; in the remaining cases, a six-month follow-up US examination was performed. Sensitivity, specificity, diagnostic accuracy, positive predictive (PPV) and negative predictive (NPV) values were calculated for FNAC and CNB. 174 out of 400 (43%) malignant lesions were found while the remaining 226 resulted to be benign lesions. 166 out of 210 (79%) FNACs and 154 out of 190 (81%) CNBs provided diagnostic specimens. Sensitivity, specificity, diagnostic accuracy, PPV and NPV of 97%, 94%, 95%, 91% and 98% were found for FNAC, and values of 92%, 82%, 89%, 92% and 82% were obtained for CNB. Sensitivity, specificity, diagnostic accuracy, PPV and NPV of 97%, 96%, 96%, 97% and 96% were found for FNAC, and values of 97%, 96%, 96%, 97% and 96% were obtained for CNB. FNAC and CNB provide similar values of diagnostic accuracy.
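The five measures reported above all derive from a 2×2 table of test result against reference diagnosis. A minimal sketch using the standard definitions follows; the function name and the counts are hypothetical, since the abstract does not give the underlying cell counts.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures from true/false positive/negative counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly called positive
        "specificity": tn / (tn + fp),  # non-diseased cases correctly called negative
        "accuracy": (tp + tn) / total,  # all correct calls over all cases
        "ppv": tp / (tp + fp),          # positive calls that are truly diseased
        "npv": tn / (tn + fn),          # negative calls that are truly non-diseased
    }


# Hypothetical counts, not taken from the study.
for name, value in diagnostic_metrics(tp=90, fp=20, fn=10, tn=80).items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are properties of the test, whereas PPV and NPV also depend on how common malignancy is in the sample, so predictive values from one cohort do not transfer directly to a population with a different prevalence.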
The diagnostic accuracy of multiparametric MRI to determine pediatric brain tumor grades and types.

PubMed

Koob, Mériam; Girard, Nadine; Ghattas, Badih; Fellah, Slim; Confort-Gouny, Sylviane; Figarella-Branger, Dominique; Scavarda, Didier

2016-04-01

Childhood brain tumors show great histological variability. The goal of this retrospective study was to assess the diagnostic accuracy of multimodal MR imaging (diffusion, perfusion, MR spectroscopy) in the distinction of pediatric brain tumor grades and types. Seventy-six patients (range 1 month to 18 years) with brain tumors underwent multimodal MR imaging. Tumors were categorized by grade (I-IV) and by histological type (A-H). Multivariate statistical analysis was performed to evaluate the diagnostic accuracy of single and combined MR modalities, and of single imaging parameters to distinguish the different groups. The highest diagnostic accuracy for tumor grading was obtained with diffusion-perfusion (73.24%) and for tumor typing with diffusion-perfusion-MR spectroscopy (55.76%). The best diagnostic accuracy was obtained for tumor grading in I and IV and for tumor typing in embryonal tumor and pilocytic astrocytoma. Poor accuracy was seen in other grades and types. ADC and rADC were the best parameters for tumor grading and typing followed by choline level with an intermediate echo time, CBV for grading and Tmax for typing. Multiparametric MR imaging can be accurate in determining tumor grades (primarily grades I and IV) and types (mainly pilocytic astrocytomas and embryonal tumors) in children.
Attribute-Level and Pattern-Level Classification Consistency and Accuracy Indices for Cognitive Diagnostic Assessment

ERIC Educational Resources Information Center

Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang

2015-01-01

Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
Surgeon Reliability for the Assessment of Lumbar Spinal Stenosis on MRI: The Impact of Surgeon Experience.

PubMed

Marawar, Satyajit V; Madom, Ian A; Palumbo, Mark; Tallarico, Richard A; Ordway, Nathaniel R; Metkar, Umesh; Wang, Dongliang; Green, Adam; Lavelle, William F

2017-01-01

The treating surgeon's visual assessment of axial MRI images to ascertain the degree of stenosis has a critical impact on surgical decision-making. The purpose of this study was to prospectively analyze the impact of surgeon experience on inter-observer and intra-observer reliability of assessing severity of spinal stenosis on MRIs by spine surgeons directly involved in surgical decision-making. Seven fellowship-trained spine surgeons reviewed MRI studies of 30 symptomatic patients with lumbar stenosis and graded the stenosis in the central canal, the lateral recess and the foramen at T12-L1 to L5-S1 as none, mild, moderate or severe. No specific instructions were provided as to what constituted mild, moderate, or severe stenosis. Two surgeons were "senior" (>fifteen years of practice experience); two were "intermediate" (>four years of practice experience), and three "junior" (< one year of practice experience). The concordance correlation coefficient (CCC) was calculated to assess inter-observer reliability. Seven MRI studies were duplicated and randomly re-read to evaluate intra-observer reliability. Surgeon experience was found to be a strong predictor of inter-observer reliability. Inter-observer reliability in the senior group was significantly higher than in the "junior" group for central (p<0.001), foraminal (p=0.005) and lateral (p=0.001) stenosis. The senior group also showed significantly higher inter-observer reliability than the intermediate group in assessing foraminal stenosis (p=0.036). For intra-observer reliability, the results were contrary to those found for inter-observer reliability. Inter-observer reliability of assessing stenosis on MRIs increases with surgeon experience. Lower intra-observer reliability values among the senior group, although not clearly explained, may be due to the small number of MRIs evaluated and the quality of MRI images. Level of evidence: Level 3.
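For readers unfamiliar with the concordance correlation coefficient named in this abstract, the following is a minimal sketch of Lin's CCC for two raters, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The abstract does not say how the coefficient was aggregated across the seven surgeons, so the pairwise form, the function name `concordance_correlation`, the 0-3 coding of none/mild/moderate/severe and the example grades are all illustrative assumptions, not the authors' procedure.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two raters.

    x, y: equal-length sequences of numeric grades (hypothetical coding:
    0 = none, 1 = mild, 2 = moderate, 3 = severe).
    Uses population (divide-by-n) variances and covariance.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both raters give one identical constant grade")
    return 2 * cov_xy / denom


# Hypothetical grades from two surgeons for six spinal levels.
surgeon_a = [0, 1, 2, 3, 2, 1]
surgeon_b = [0, 1, 3, 3, 2, 2]
ccc = concordance_correlation(surgeon_a, surgeon_b)
```

Unlike Pearson's r, the CCC penalizes a systematic shift between raters: a rater who grades every level exactly one step higher than a colleague is perfectly correlated with that colleague but has a CCC below 1.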
Diagnostic value of a pancreatic mass on computed tomography in patients undergoing pancreatoduodenectomy for presumed pancreatic cancer.

PubMed

Gerritsen, Arja; Bollen, Thomas L; Nio, C Yung; Molenaar, I Quintus; Dijkgraaf, Marcel G W; van Santvoort, Hjalmar C; Offerhaus, G Johan; Brosens, Lodewijk A; Biermann, Katharina; Sieders, Egbert; de Jong, Koert P; van Dam, Ronald M; van der Harst, Erwin; van Goor, Harry; van Ramshorst, Bert; Bonsing, Bert A; de Hingh, Ignace H; Gerhards, Michael F; van Eijck, Casper H; Gouma, Dirk J; Borel Rinkes, Inne H M; Busch, Olivier R C; Besselink, Marc G H

2015-07-01

Previous studies have shown that 5-14% of patients undergoing pancreatoduodenectomy for suspected malignancy ultimately are diagnosed with benign disease. A "pancreatic mass" on computed tomography (CT) is considered to be the strongest predictor of malignancy, but studies describing its diagnostic value are lacking. The aim of this study was to determine the diagnostic value of a pancreatic mass on CT in patients with presumed pancreatic cancer, as well as the interobserver agreement among radiologists and the additional value of reassessment by expert-radiologists. Reassessment of preoperative CT scans was performed within a previously described multicenter retrospective cohort study in 344 patients undergoing pancreatoduodenectomy for suspected malignancy (2003-2010). Preoperative CT scans were reassessed by 2 experienced abdominal radiologists separately and subsequently in a consensus meeting, after defining a pancreatic mass as "a measurable space occupying soft tissue density, except for an enlarged papilla or focal steatosis". CT scans of 86 patients with benign and 258 patients with (pre)malignant disease were reassessed. In 66% of patients a pancreatic mass was reported in the original CT report, versus 48% and 50% on reassessment by the 2 expert radiologists separately and 44% in consensus (P < .001 vs original report). Interobserver agreement between the original CT report and expert consensus was fair (kappa = 0.32, 95% confidence interval 0.23-0.42). Among both expert-radiologists agreement was moderate (kappa = 0.47, 95% confidence interval 0.38-0.56), with disagreement on the presence of a pancreatic mass in 29% of cases. The specificity for malignancy of pancreatic masses identified in expert consensus was twice as high compared with the original CT report (87% vs 42%, respectively). Positive predictive value increased to 98% after expert consensus, but negative predictive value was low (12%). 
Clinicians need to be aware of potential considerable disagreement among radiologists about the presence of a pancreatic mass. The specificity for malignancy doubled by expert radiologist reassessment when a uniform definition of "pancreatic mass" was used. Copyright © 2015 Elsevier Inc. All rights reserved.
Inter- and Intra-Observer Agreement in Ultrasound BI-RADS Classification and Real-Time Elastography Tsukuba Score Assessment of Breast Lesions.

PubMed

Schwab, Fabienne; Redling, Katharina; Siebert, Matthias; Schötzau, Andy; Schoenenberger, Cora-Ann; Zanetti-Dällenbach, Rosanna

2016-11-01

Our aim was to prospectively evaluate inter- and intra-observer agreement between Breast Imaging Reporting and Data System (BI-RADS) classifications and Tsukuba elasticity scores (TSs) of breast lesions. The study included 164 breast lesions (63 malignant, 101 benign). The BI-RADS classification and TS of each breast lesion were assessed by the examiner and twice by three reviewers at an interval of 2 months. Weighted κ values for inter-observer agreement ranged from moderate to substantial for BI-RADS classification (κ = 0.585-0.738) and were substantial for TS (κ = 0.608-0.779). Intra-observer agreement was almost perfect for ultrasound (US) BI-RADS (κ = 0.847-0.872) and TS (κ = 0.879-0.914). Overall, individual reviewers are highly self-consistent (almost perfect intra-observer agreement) with respect to BI-RADS classification and TS, whereas inter-observer agreement was moderate to substantial. Comprehensive training is essential for achieving high agreement and minimizing the impact of subjectivity. Our results indicate that breast US and real-time elastography can achieve high diagnostic performance. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
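The weighted κ reported above gives partial credit when two readers disagree by only one ordinal category. The sketch below shows one common variant, linearly weighted kappa, for two raters. The abstract does not state whether linear or quadratic weights were used, so the weighting scheme, the function name `weighted_kappa` and the example scores are assumptions made for illustration.

```python
def weighted_kappa(rater1, rater2, n_categories):
    """Linearly weighted Cohen's kappa for two raters.

    rater1, rater2: equal-length sequences of integer categories
    in range(n_categories), treated as ordinal.
    kappa_w = 1 - (observed weighted disagreement / expected weighted disagreement),
    with disagreement weight |i - j| between categories i and j.
    """
    n = len(rater1)
    if n == 0 or n != len(rater2):
        raise ValueError("ratings must be non-empty and of equal length")
    # Marginal proportions of each category for each rater.
    p1 = [sum(1 for r in rater1 if r == c) / n for c in range(n_categories)]
    p2 = [sum(1 for r in rater2 if r == c) / n for c in range(n_categories)]
    observed = sum(abs(a - b) for a, b in zip(rater1, rater2)) / n
    expected = sum(
        p1[i] * p2[j] * abs(i - j)
        for i in range(n_categories)
        for j in range(n_categories)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single identical category")
    return 1 - observed / expected


# Hypothetical elasticity scores recoded to 0..4 for eight lesions.
reader_a = [0, 1, 1, 2, 3, 4, 4, 2]
reader_b = [0, 1, 2, 2, 3, 3, 4, 2]
kappa_w = weighted_kappa(reader_a, reader_b, n_categories=5)
```

With only two categories, linear weighting reduces to ordinary (unweighted) Cohen's kappa, because every disagreement then has the same weight.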
Magnetic resonance direct thrombus imaging differentiates acute recurrent ipsilateral deep vein thrombosis from residual thrombosis.

PubMed

Tan, Melanie; Mol, Gerben C; van Rooden, Cornelis J; Klok, Frederikus A; Westerbeek, Robin E; Iglesias Del Sol, Antonio; van de Ree, Marcel A; de Roos, Albert; Huisman, Menno V

2014-07-24

Accurate diagnostic assessment of suspected ipsilateral recurrent deep vein thrombosis (DVT) is a major clinical challenge because differentiating between acute recurrent thrombosis and residual thrombosis is difficult with compression ultrasonography (CUS). We evaluated noninvasive magnetic resonance direct thrombus imaging (MRDTI) in a prospective study of 39 patients with symptomatic recurrent ipsilateral DVT (incompressibility of a different proximal venous segment than at the prior DVT) and 42 asymptomatic patients with at least 6-month-old chronic residual thrombi and normal D-dimer levels. All patients were subjected to MRDTI. MRDTI images were judged by 2 independent radiologists blinded for the presence of acute DVT and a third in case of disagreement. The sensitivity, specificity, and interobserver reliability of MRDTI were determined. MRDTI demonstrated acute recurrent ipsilateral DVT in 37 of 39 patients and was normal in all 42 patients without symptomatic recurrent disease for a sensitivity of 95% (95% CI, 83% to 99%) and a specificity of 100% (95% CI, 92% to 100%). Interobserver agreement was excellent (κ = 0.98). MRDTI images were adequate for interpretation in 95% of the cases. MRDTI is a sensitive and reproducible method for distinguishing acute ipsilateral recurrent DVT from 6-month-old chronic residual thrombi in the leg veins. © 2014 by The American Society of Hematology.
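The sensitivity and specificity above follow directly from the reported counts (37 of 39 and 42 of 42). The abstract does not name the method behind the 95% confidence intervals; the sketch below assumes the exact (Clopper-Pearson) binomial interval, computed by bisection with the standard library only. The function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n trials."""

    def solve_increasing(f, target):
        # Bisection for f(p) = target, where f is increasing in p on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: smallest p with P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: largest p with P(X <= k | p) = alpha / 2,
    # i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


sens_ci = clopper_pearson(37, 39)  # sensitivity: 37 of 39 acute recurrent DVTs detected
spec_ci = clopper_pearson(42, 42)  # specificity: 42 of 42 chronic residual thrombi read as normal
```

Under that assumption, the bounds round to 83% to 99% for sensitivity and 92% to 100% for specificity, which agrees with the intervals quoted in the abstract.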
Sublingual Nitroglycerin Administration in Coronary Computed Tomography Angiography: a Systematic Review.

PubMed

Takx, Richard A P; Suchá, Dominika; Park, Jakob; Leiner, Tim; Hoffmann, Udo

2015-12-01

To systematically investigate the literature for the influence of sublingual nitroglycerin administration on coronary diameter, the number of evaluable segments, image quality, heart rate and blood pressure, and diagnostic accuracy of coronary computed tomography (CT) angiography. A systematic search was performed in PubMed, EMBASE and Web of Science. The studies were evaluated for the effect of sublingual nitroglycerin on coronary artery diameter, evaluable segments, objective and subjective image quality, systemic physiological effects and diagnostic accuracy. Due to the heterogeneous reporting of outcome measures, a narrative synthesis was applied. Of the 217 studies identified, nine met the inclusion criteria: seven reported on the effect of nitroglycerin on coronary artery diameter, six on evaluable segments, four on image quality, five on systemic physiological effects and two on diagnostic accuracy. Sublingual nitroglycerin administration resulted in an improved evaluation of more coronary segments, in particular, in smaller coronary branches, better image quality and improved diagnostic accuracy. Side effects were mild and were alleviated without medical intervention. Sublingual nitroglycerin improves the coronary diameter, the number of assessable segments, image quality and diagnostic accuracy of coronary CT angiography without major side effects or systemic physiological changes. • Sublingual nitroglycerin administration results in significant coronary artery dilatation. • Nitroglycerin increases the number of evaluable coronary branches. • Image quality is improved the most in smaller coronary branches. • Nitroglycerin increases the diagnostic accuracy of coronary CT angiography. • Most side effects are mild and do not require medical intervention.
Evaluation of diagnostic accuracy in detecting ordered symptom statuses without a gold standard

PubMed Central

Wang, Zheyu; Zhou, Xiao-Hua; Wang, Miqu

2011-01-01

Our research is motivated by 2 methodological problems in assessing diagnostic accuracy of traditional Chinese medicine (TCM) doctors in detecting a particular symptom whose true status has an ordinal scale and is unknown—imperfect gold standard bias and ordinal scale symptom status. In this paper, we proposed a nonparametric maximum likelihood method for estimating and comparing the accuracy of different doctors in detecting a particular symptom without a gold standard when the true symptom status had an ordered multiple class. In addition, we extended the concept of the area under the receiver operating characteristic curve to a hyper-dimensional overall accuracy for diagnostic accuracy and alternative graphs for displaying a visual result. The simulation studies showed that the proposed method had good performance in terms of bias and mean squared error. Finally, we applied our method to our motivating example on assessing the diagnostic abilities of 5 TCM doctors in detecting symptoms related to Chills disease. PMID:21209155
Design of Malaria Diagnostic Criteria for the Sysmex XE-2100 Hematology Analyzer

PubMed Central

Campuzano-Zuluaga, Germán; Álvarez-Sánchez, Gonzalo; Escobar-Gallo, Gloria Elcy; Valencia-Zuluaga, Luz Marina; Ríos-Orrego, Alexandra Marcela; Pabón-Vidal, Adriana; Miranda-Arboleda, Andrés Felipe; Blair-Trujillo, Silvia; Campuzano-Maya, Germán

2010-01-01

Thick film, the standard diagnostic procedure for malaria, is not always ordered promptly. A failsafe diagnostic strategy using an XE-2100 analyzer is proposed, and for this strategy, malaria diagnostic models for the XE-2100 were developed and tested for accuracy. Two hundred eighty-one samples were distributed into Plasmodium vivax, P. falciparum, and acute febrile syndrome groups for model construction. Model validation was performed using 60% of malaria cases and a composite control group of samples from AFS and healthy participants from endemic and non-endemic regions. For P. vivax, two observer-dependent models (accuracy = 95.3–96.9%), one non–observer-dependent model using built-in variables (accuracy = 94.7%), and one non–observer-dependent model using new and built-in variables (accuracy = 96.8%) were developed. For P. falciparum, two non–observer-dependent models (accuracies = 85% and 89%) were developed. These models could be used by health personnel or be integrated as a malaria alarm for the XE-2100 to prompt early malaria microscopic diagnosis. PMID:20207864
Interobserver and intermodality agreement of standardized algorithms for non-invasive diagnosis of hepatocellular carcinoma in high-risk patients: CEUS-LI-RADS versus MRI-LI-RADS.

PubMed

Schellhaas, Barbara; Hammon, Matthias; Strobel, Deike; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger S; Cavallaro, Alexander; Janka, Rolf; Neurath, Markus F; Uder, Michael; Seuss, Hannes

2018-04-19

We compared the interobserver agreement for the recently introduced contrast-enhanced ultrasound (CEUS)-based algorithm CEUS-LI-RADS (Liver Imaging Reporting and Data System) versus the well-established magnetic resonance imaging (MRI)-LI-RADS for non-invasive diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 50 high-risk patients (mean age 66.2 ± 11.8 years; 39 male) were assessed retrospectively with CEUS and MRI. Two independent observers reviewed CEUS and MRI examinations, separately, classifying observations according to CEUS-LI-RADSv.2016 and MRI-LI-RADSv.2014. Interobserver agreement was assessed with Cohen's kappa. Forty-three lesions were HCCs; two were intrahepatic cholangiocarcinomas; five were benign lesions. Arterial phase hyperenhancement was perceived less frequently with CEUS than with MRI (37/50 / 38/50 lesions = 74%/78% [CEUS; observer 1/observer 2] versus 46/50 / 44/50 lesions = 92%/88% [MRI; observer 1/observer 2]). Washout appearance was observed in 34/50 / 20/50 lesions = 68%/40% with CEUS and 31/50 / 31/50 lesions = 62%/62% with MRI. Interobserver agreement was moderate for arterial hyperenhancement (κ = 0.511/0.565 [CEUS/MRI]) and "washout" (κ = 0.490/0.582 [CEUS/MRI]), fair for CEUS-LI-RADS category (κ = 0.309) and substantial for MRI-LI-RADS category (κ = 0.609). Intermodality agreement was fair for arterial hyperenhancement (κ = 0.329), slight to fair for "washout" (κ = 0.202) and LI-RADS category (κ = 0.218). Interobserver agreement is substantial for MRI-LI-RADS and only fair for CEUS-LI-RADS. This is mostly because interobserver agreement in the perception of washout appearance is better in MRI than in CEUS. Further refinement of the LI-RADS algorithms and increasing education and practice may be necessary to improve the concordance between CEUS and MRI for the final LI-RADS categorization.
• CEUS-LI-RADS and MRI-LIRADS enable standardized non-invasive diagnosis of HCC in high-risk patients. • With CEUS, interobserver agreement is better for arterial hyperenhancement than for "washout". • Interobserver agreement for major features is moderate for both CEUS and MRI. • Interobserver agreement for LI-RADS category is substantial for MRI, and fair for CEUS. • Interobserver-agreement for CEUS-LI-RADS will presumably improve with ongoing use of the algorithm.
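As a reminder of how the Cohen's kappa values in this entry arise, the sketch below computes unweighted kappa for two observers and maps a value to the verbal labels of the widely used Landis and Koch scale (slight, fair, moderate, substantial, almost perfect). The abstract does not say which scale the authors applied or how borderline values were handled; the scale, the rounding to two decimals before labelling, the function names and the example readings are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters over the same items."""
    n = len(rater1)
    if n == 0 or n != len(rater2):
        raise ValueError("ratings must be non-empty and of equal length")
    # Observed agreement: fraction of items with identical labels.
    p_observed = sum(1 for a, b in zip(rater1, rater2) if a == b) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    p_expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if p_expected == 1:
        raise ValueError("kappa is undefined when both raters use a single identical label")
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch_label(kappa):
    """Verbal label on the Landis and Koch scale (kappa rounded to 2 decimals first)."""
    k = round(kappa, 2)
    if k < 0:
        return "poor"
    if k <= 0.20:
        return "slight"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical washout readings (1 = washout seen, 0 = not seen) for ten lesions.
observer_1 = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
observer_2 = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1]
kappa = cohens_kappa(observer_1, observer_2)
label = landis_koch_label(kappa)
```

With this rounding convention, the labels come out the same as the wording in the abstract: 0.309 is fair, 0.511 is moderate and 0.609 is substantial.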
The diagnostic test accuracy of magnetic resonance imaging, magnetic resonance arthrography and computer tomography in the detection of chondral lesions of the hip.

PubMed

Smith, Toby O; Simpson, Michael; Ejindu, Vivian; Hing, Caroline B

2013-04-01

The purpose of this study was to assess the diagnostic test accuracy of magnetic resonance imaging (MRI), magnetic resonance arthrography (MRA) and multidetector arrays in CT arthrography (MDCT) for assessing chondral lesions in the hip joint. A review of the published and unpublished literature databases was performed to identify all studies reporting the diagnostic test accuracy (sensitivity/specificity) of MRI, MRA or MDCT for the assessment of adults with chondral (cartilage) lesions of the hip with surgical comparison (arthroscopic or open) as the reference test. All included studies were reviewed using the quality assessment of diagnostic accuracy studies appraisal tool. Pooled sensitivity, specificity, likelihood ratios and diagnostic odds ratios were calculated with 95 % confidence intervals using a random-effects meta-analysis for MRI, MRA and MDCT imaging. Eighteen studies satisfied the eligibility criteria. These included 648 hips from 637 patients. MRI indicated a pooled sensitivity of 0.59 (95 % CI: 0.49-0.70) and specificity of 0.94 (95 % CI: 0.90-0.97), and MRA sensitivity and specificity values were 0.62 (95 % CI: 0.57-0.66) and 0.86 (95 % CI: 0.83-0.89), respectively. The diagnostic test accuracy for the detection of hip joint cartilage lesions is currently superior for MRI compared with MRA. There were insufficient data to perform meta-analysis for MDCT or CTA protocols. Based on the current limited diagnostic test accuracy of the use of magnetic resonance or CT, arthroscopy remains the most accurate method of assessing chondral lesions in the hip joint.
Interobserver concordance of assessments of dysplasia and blast counts for the diagnosis of patients with cytopenia: From the Japanese central review study.

PubMed

Matsuda, Akira; Kawabata, Hiroshi; Tohyama, Kaoru; Maeda, Tomoya; Araseki, Kayano; Hata, Tomoko; Suzuki, Takahiro; Kayano, Hidekazu; Shimbo, Kei; Usuki, Kensuke; Chiba, Shigeru; Ishikawa, Takayuki; Arima, Nobuyoshi; Nohgawa, Masaharu; Ohta, Akiko; Miyazaki, Yasushi; Nakao, Sinnji; Ozawa, Keiya; Arai, Shunya; Kurokawa, Mineo; Mitani, Kinuko; Takaori-Kondo, Akifumi

2018-06-07

The diagnosis of myelodysplastic syndromes (MDS) is based on morphology and cytogenetics. However, limited information is currently available on the interobserver concordance of the assessment of dysplastic lineages (<10% or ≥10% in bone marrow (BM)). The revised International Prognostic Scoring System (IPSS-R) described a new threshold (2%) for BM blasts. However, the interobserver concordance of the categories (0-≤2% and >2-<5%) has limited data. The purpose of the present study was to investigate the assessment of dysplastic lineages and IPSS-R reproducibility. Our study was divided into two Steps. In each Step, the microscopic examinations were performed separately by two morphologists. Regarding the category of BM blasts ≤2% and >2-<5%, interobserver agreement was more than 'moderate' in all pairs (kappa test: 0.43-0.90). Regarding dysgranulopoiesis (dysG) and dyserythropoiesis (dysE) in BM, interobserver agreement was more than 'moderate' in all pairs (kappa test, dysG: 0.45-0.96, dysE: 0.45-0.81). Regarding the category of dysmegakaryopoiesis (dysMgk) in BM, interobserver agreement was more than moderate in 4 out of 5 pairs (kappa test: 0.58-1.00), and was fair for one pair (kappa test: 0.37). We consider that high interobserver concordance may be possible for the BM blast cell count (≤2% or >2-<5%) and dysplasia (<10% or ≥10%) of each lineage. Copyright © 2018 Elsevier Ltd. All rights reserved.
Diagnostic Accuracy Assessment of Sensititre and Agar Disk Diffusion for Determining Antimicrobial Resistance Profiles of Bovine Clinical Mastitis Pathogens▿

PubMed Central

Saini, V.; Riekerink, R. G. M. Olde; McClure, J. T.; Barkema, H. W.

2011-01-01

Determining the accuracy and precision of a measuring instrument is pertinent in antimicrobial susceptibility testing. This study was conducted to predict the diagnostic accuracy of the Sensititre MIC mastitis panel (Sensititre) and agar disk diffusion (ADD) method with reference to the manual broth microdilution test method for antimicrobial resistance profiling of Escherichia coli (n = 156), Staphylococcus aureus (n = 154), streptococcal (n = 116), and enterococcal (n = 31) bovine clinical mastitis isolates. The activities of ampicillin, ceftiofur, cephalothin, erythromycin, oxacillin, penicillin, the penicillin-novobiocin combination, pirlimycin, and tetracycline were tested against the isolates. Diagnostic accuracy was determined by estimating the area under the receiver operating characteristic curve; intertest essential and categorical agreements were determined as well. Sensititre and the ADD method demonstrated moderate to highly accurate (71 to 99%) and moderate to perfect (71 to 100%) predictive accuracies for 74 and 76% of the isolate-antimicrobial MIC combinations, respectively. However, the diagnostic accuracy was low for S. aureus-ceftiofur/oxacillin combinations and other streptococcus-ampicillin combinations by either testing method. Essential agreement between Sensititre automatic MIC readings and MIC readings obtained by the broth microdilution test method was 87%. Essential agreement between Sensititre automatic and manual MIC reading methods was 97%. Furthermore, the ADD test method and Sensititre MIC method exhibited 92 and 91% categorical agreement (sensitive, intermediate, resistant) of results, respectively, compared with the reference method. However, both methods demonstrated lower agreement for E. coli-ampicillin/cephalothin combinations than for Gram-positive isolates. 
In conclusion, the Sensititre and ADD methods had moderate to high diagnostic accuracy and very good essential and categorical agreement for most udder pathogen-antimicrobial combinations and can be readily employed in veterinary diagnostic laboratories. PMID:21270215

Diagnostic accuracy of magnetic resonance imaging techniques for treatment response evaluation in patients with high-grade glioma, a systematic review and meta-analysis.

PubMed

van Dijken, Bart R J; van Laar, Peter Jan; Holtman, Gea A; van der Hoorn, Anouk

2017-10-01

Treatment response assessment in high-grade gliomas uses contrast enhanced T1-weighted MRI, but is unreliable. Novel advanced MRI techniques have been studied, but the accuracy is not well known. Therefore, we performed a systematic meta-analysis to assess the diagnostic accuracy of anatomical and advanced MRI for treatment response in high-grade gliomas. Databases were searched systematically. Study selection and data extraction were done by two authors independently. Meta-analysis was performed using a bivariate random effects model when ≥5 studies were included. Anatomical MRI (five studies, 166 patients) showed a pooled sensitivity and specificity of 68% (95%CI 51-81) and 77% (45-93), respectively. Pooled apparent diffusion coefficients (seven studies, 204 patients) demonstrated a sensitivity of 71% (60-80) and specificity of 87% (77-93). DSC-perfusion (18 studies, 708 patients) sensitivity was 87% (82-91) with a specificity of 86% (77-91). DCE-perfusion (five studies, 207 patients) sensitivity was 92% (73-98) and specificity was 85% (76-92). The sensitivity of spectroscopy (nine studies, 203 patients) was 91% (79-97) and specificity was 95% (65-99). Advanced techniques showed higher diagnostic accuracy than anatomical MRI, the highest for spectroscopy, supporting the use in treatment response assessment in high-grade gliomas. • Treatment response assessment in high-grade gliomas with anatomical MRI is unreliable • Novel advanced MRI techniques have been studied, but diagnostic accuracy is unknown • Meta-analysis demonstrates that advanced MRI showed higher diagnostic accuracy than anatomical MRI • Highest diagnostic accuracy for spectroscopy and perfusion MRI • Supports the incorporation of advanced MRI in high-grade glioma treatment response assessment.
Low-Dose Radiation 3D Intraoperative Imaging: How Low Can We Go? An O-Arm, CT Scan, Cadaveric Study.

PubMed

Sarwahi, Vishal; Payares, Monica; Wendolowski, Stephen; Maguire, Kathleen; Thornhill, Beverly; Lo, Yungtai; Amaral, Terry D

2017-11-15

MINI: The objective of this study was to evaluate the accuracy and reliability of pedicle screw placement using O-Arm at dosages below the manufacturer-recommended dose. O-Arm at reduced dose showed a 90% accuracy when compared with computed tomography; however, about 30% of medial breaches were misclassified. Cadaveric study. The objective was to evaluate O-Arm's ability at low-dose (LD) settings to assess intraoperative screw placement. Accurate placement of pedicle screws is crucial because of proximity to vital structures. Malposition of screws may result in significant morbidity and potential mortality. O-arm provides real-time, intraoperative imaging of the patient's anatomy and provides higher accuracy in scoliosis surgeries, avoiding risk to vital structures. We hypothesize that using LD or ultra-low doses (ULDs) to obtain intraoperative images allows for accurate assessment of screw placement, both minimizing radiation exposure and preventing screw misplacement. Eight cadavers were instrumented with pedicle screws bilaterally from T1 to S1. Screws were randomly placed using O-arm navigation into three positions: contained within the bone, OUT-anterior/lateral, and OUT-medial. O-arm images were obtained at three dosage settings: LD (kVp120/mAs125-lowest manufacturer recommended), very-low dose (VLD) (kVp120/mAs63), and ULD (kVp120/mAs39). Computed tomography (CT) scan was performed using the institution's LD protocol (kVp100/mAs50) and gross dissection to identify screw positions. LD, VLD, ULD, and CT for identifying "IN" screws relative to gross dissection had a mean (standard deviation) sensitivity of 84.2% (±5.7), specificity of 76.1% (±9.3), and accuracy of 79.9% (±3.1) from all three observers. Across the three observers, the interobserver agreement was 0.67 (0.61-0.72) for LD, 0.74 (0.69-0.79) for VLD, 0.61 (0.56-0.66) for ULD, and 0.79 (0.74-0.84) for CT. 
Effective doses of radiation (mSv) were 2.16 for the LD O-arm scan, 1.08 for VLD, 0.68 for ULD, and 1.05 for our LD CT protocol. Accuracy of pedicle screw placement is similar for O-arm at all doses and CT compared to gross dissection. Interobserver reliability was substantial for VLD and CT. Approximately 30% of medial screw breaches are, however, misclassified. ULD and VLDs can be used for intraoperative navigation and evaluation purposes within these limitations. N/A.
Diagnostic performance of 18F-FDG PET/CT and whole-body diffusion-weighted imaging with background body suppression (DWIBS) in detection of lymph node and bone metastases from pediatric neuroblastoma.

PubMed

Ishiguchi, Hiroaki; Ito, Shinji; Kato, Katsuhiko; Sakurai, Yusuke; Kawai, Hisashi; Fujita, Naotoshi; Abe, Shinji; Narita, Atsushi; Nishio, Nobuhiro; Muramatsu, Hideki; Takahashi, Yoshiyuki; Naganawa, Shinji

2018-06-01

Many recent studies have shown that whole-body "diffusion-weighted imaging with background body signal suppression" (DWIBS) appears to be a beneficial tool for pediatric tumors, offering higher tumor detection sensitivity without ionizing radiation exposure. In this study, we evaluated the diagnostic performance of whole-body DWIBS and 18F-FDG PET/CT for detecting lymph node and bone metastases in pediatric patients with neuroblastoma. Subjects in this retrospective study comprised 13 consecutive pediatric patients with neuroblastoma (7 males, 6 females; mean age, 2.9 ± 2.0 years old) who underwent both 18F-FDG PET/CT and whole-body DWIBS. All patients were diagnosed with neuroblastoma on the basis of pathological findings. Eight regions of lymph nodes and 17 segments of skeletons in all patients were evaluated. The images of 123I-MIBG scintigraphy/SPECT-CT, bone scintigraphy/SPECT, and CT were used to confirm the presence of lymph node and bone metastases. Two radiologists trained in nuclear medicine independently and visually evaluated the uptake of lesions in 18F-FDG PET/CT and the signal intensity of lesions in whole-body DWIBS. Interobserver differences were resolved through discussion to reach a consensus. The sensitivities, specificities, and overall accuracies of 18F-FDG PET/CT and whole-body DWIBS were compared using McNemar's test. Positive predictive values (PPVs) and negative predictive values (NPVs) of both modalities were compared using Fisher's exact test. The total numbers of lymph node regions and bone segments which were confirmed to have metastasis in the total 13 patients were 19 and 75, respectively. The sensitivity, specificity, overall accuracy, PPV, and NPV of 18F-FDG PET/CT for detecting lymph node metastasis from pediatric neuroblastoma were 100, 98.7, 98.9, 95.0, and 100%, respectively, and those for detecting bone metastasis were 90.7, 73.1, 80.3, 70.1, and 91.9%, respectively. 
In contrast, the sensitivity, specificity, overall accuracy, PPV, and NPV of whole-body DWIBS for detecting bone metastasis from pediatric neuroblastoma were 94.7, 24.0, 53.0, 46.4 and 86.7%, respectively, whereas those for detecting lymph node metastasis were 94.7, 85.3, 87.2, 62.1, and 98.5%, respectively. The low specificity, overall accuracy, and PPV of whole-body DWIBS for detecting bone metastasis were due to a high incidence of false-positive findings (82/108, 75.9%). The specificity, overall accuracy, and PPV of whole-body DWIBS for detecting lymph node metastasis were also significantly lower than those of 18F-FDG PET/CT for detecting lymph node metastasis, although the difference between these 2 modalities was less than that for detecting bone metastasis. The specificity, overall accuracy, and PPV of whole-body DWIBS are significantly lower than those of 18F-FDG PET/CT because of a high incidence of false-positive findings particularly for detecting bone metastasis, whereas whole-body DWIBS shows a similar level of sensitivities for detecting lymph node and bone metastases to those of 18F-FDG PET/CT. DWIBS should be carefully used for cancer staging in children because of its high incidence of false-positive findings in skeletons.
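The five DWIBS figures for bone metastasis can be tied back to a single 2 × 2 table. The abstract gives 75 confirmed metastatic bone segments and 82 false positives among 108 non-metastatic segments. The remaining cell counts used below (71 true positives, 4 false negatives, 26 true negatives) are not stated in the abstract; they are inferred here as the counts consistent with the reported percentages and should be read as an assumption. The function name is illustrative.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the four cells of a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # detected among truly positive
        "specificity": tn / (tn + fp),  # cleared among truly negative
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "ppv": tp / (tp + fp),          # truly positive among test-positive
        "npv": tn / (tn + fn),          # truly negative among test-negative
    }


# Whole-body DWIBS, bone segments. fp and (fp + tn) come from the abstract
# (82/108); tp, fn and tn are inferred from the reported percentages.
dwibs_bone = diagnostic_metrics(tp=71, fn=4, fp=82, tn=26)
percent = {name: round(value * 100, 1) for name, value in dwibs_bone.items()}
```

With these counts, sensitivity, accuracy, PPV and NPV round to the reported 94.7%, 53.0%, 46.4% and 86.7%. Specificity comes out as 24.1% against the reported 24.0%, a difference that is presumably due to rounding. The table also shows why PPV is so low despite high sensitivity: false positives (82) outnumber true positives (71).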
Performance characteristics of multicolor versus blue light and infrared imaging in the identification of reticular pseudodrusen.

PubMed

Badal, Josep; Biarnés, Marc; Monés, Jordi

2018-02-01

To describe the appearance of reticular pseudodrusen (RPD) on multicolor imaging and to evaluate its diagnostic accuracy as compared with the two modalities that may be considered the current reference standard, blue light and infrared imaging. Retrospective study in which all multicolor images (constructed from images acquired at 486 nm-blue, 518 nm-green and 815 nm-infrared) of 45 consecutive patients seen at a single center were reviewed. Inclusion criteria involved the presence of >1 reticular pseudodrusen on a 30° × 30° image centered on the fovea as seen with the blue light channel derived from the multicolor imaging. Three experienced observers, masked to each other's results with other imaging modalities, independently classified the number of reticular pseudodrusen with each modality. The median interobserver agreement (kappa) was 0.58 using blue light; 0.65 using infrared; and 0.64 using multicolor images. Multicolor and infrared modalities identified a higher number of reticular pseudodrusen than the blue light modality in all fields for all observers (p < 0.0001). Results were not different when multicolor and infrared were compared (p ≥ 0.27). These results suggest that multicolor and infrared are more sensitive and reproducible than blue light in the identification of RPD. Multicolor did not appear to add significant value to infrared in the evaluation of RPD. Clinicians using infrared do not need to incorporate multicolor for the identification and quantification of RPD.
Cytologic separation of branchial cleft cyst from metastatic cystic squamous cell carcinoma: A multivariate analysis of nineteen cytomorphologic features.

PubMed

Layfield, Lester J; Esebua, Magda; Schmidt, Robert L

2016-07-01

The separation of branchial cleft cysts from metastatic cystic squamous cell carcinomas in adults can be clinically and cytologically challenging. Diagnostic accuracy for separation is reported to be as low as 75% prompting some authors to recommend frozen section evaluation of suspected branchial cleft cysts before resection. We evaluated 19 cytologic features to determine which were useful in this distinction. Thirty-three cases (21 squamous carcinoma and 12 branchial cysts) of histologically confirmed cystic lesions of the lateral neck were graded for the presence or absence of 19 cytologic features by two cytopathologists. The cytologic features were analyzed for agreement between observers and underwent multivariate analysis for correlation with the diagnosis of carcinoma. Interobserver agreement was greatest for increased nuclear/cytoplasmic (N/C) ratio, pyknotic nuclei, and irregular nuclear membranes. Recursive partitioning analysis showed increased N/C ratio, small clusters of cells, and irregular nuclear membranes were the best discriminators. The distinction of branchial cleft cysts from cystic squamous cell carcinoma is cytologically difficult. Both digital image analysis and p16 testing have been suggested as aids in this separation, but analysis of cytologic features remains the main method for diagnosis. In an analysis of 19 cytologic features, we found that high nuclear cytoplasmic ratio, irregular nuclear membranes, and small cell clusters were most helpful in their distinction. Diagn. Cytopathol. 2016;44:561-567. © 2016 Wiley Periodicals, Inc.
Assessment of breast pathologies using nonlinear microscopy

PubMed Central

Tao, Yuankai K.; Shen, Dejun; Sheikine, Yuri; Ahsen, Osman O.; Wang, Helen H.; Schmolze, Daniel B.; Johnson, Nicole B.; Brooker, Jeffrey S.; Cable, Alex E.; Connolly, James L.; Fujimoto, James G.

2014-01-01

Rapid intraoperative assessment of breast excision specimens is clinically important because up to 40% of patients undergoing breast-conserving cancer surgery require reexcision for positive or close margins. We demonstrate nonlinear microscopy (NLM) for the assessment of benign and malignant breast pathologies in fresh surgical specimens. A total of 179 specimens from 50 patients was imaged with NLM using rapid extrinsic nuclear staining with acridine orange and intrinsic second harmonic contrast generation from collagen. Imaging was performed on fresh, intact specimens without the need for fixation, embedding, and sectioning required for conventional histopathology. A visualization method to aid pathological interpretation is presented that maps NLM contrast from two-photon fluorescence and second harmonic signals to features closely resembling histopathology using hematoxylin and eosin staining. Mosaicking is used to overcome trade-offs between resolution and field of view, enabling imaging of subcellular features over square-centimeter specimens. After NLM examination, specimens were processed for standard paraffin-embedded histology using a protocol that coregistered histological sections to NLM images for paired assessment. Blinded NLM reading by three pathologists achieved 95.4% sensitivity and 93.3% specificity, compared with paraffin-embedded histology, for identifying invasive cancer and ductal carcinoma in situ versus benign breast tissue. Interobserver agreement was κ = 0.88 for NLM and κ = 0.89 for histology. These results show that NLM achieves high diagnostic accuracy, can be rapidly performed on unfixed specimens, and is a promising method for intraoperative margin assessment. PMID:25313045
Non-invasive diagnosis of hepatitis B virus-related cirrhosis

PubMed Central

Lee, Sangheun; Kim, Do Young

2014-01-01

Chronic hepatitis B (CHB) infection is a major public health problem associated with significant morbidity and mortality worldwide. Twenty-three percent of patients with CHB progress naturally to liver cirrhosis, which was earlier thought to be irreversible. However, it is now known that cirrhosis can in fact be reversed by treatment with oral anti-nucleotide drugs. Thus, early and accurate diagnosis of cirrhosis is important to allow an appropriate treatment strategy to be chosen and to predict the prognosis of patients with CHB. Liver biopsy is the reference standard for assessment of liver fibrosis. However, the method is invasive, and is associated with pain and complications that can be fatal. In addition, intra- and inter-observer variability compromises the accuracy of liver biopsy data. Only small tissue samples are obtained and fibrosis is heterogeneous in such samples. This confounds the two types of observer variability mentioned above. Such limitations have encouraged development of non-invasive methods for assessment of fibrosis. These include measurements of serum biomarkers of fibrosis; and assessment of liver stiffness via transient elastography, acoustic radiation force impulse imaging, real-time elastography, or magnetic resonance elastography. Although significant advances have been made, most work to date has addressed the diagnostic utility of these techniques in the context of cirrhosis caused by chronic hepatitis C infection. In the present review, we examine the advantages afforded by use of non-invasive methods to diagnose cirrhosis in patients with CHB infections and the utility of such methods in clinical practice. PMID:24574713
Three-dimensional image technology in forensic anthropology: Assessing the validity of biological profiles derived from CT-3D images of the skeleton

NASA Astrophysics Data System (ADS)

Garcia de Leon Valenzuela, Maria Julia

This project explores the reliability of building a biological profile for an unknown individual based on three-dimensional (3D) images of the individual's skeleton. 3D imaging technology has been widely researched for medical and engineering applications, and it is increasingly being used as a tool for anthropological inquiry. While the question of whether a biological profile can be derived from 3D images of a skeleton with the same accuracy as achieved when using dry bones has been explored, bigger sample sizes, a standardized scanning protocol and more interobserver error data are needed before 3D methods can become widely and confidently used in forensic anthropology. 3D images of Computed Tomography (CT) scans were obtained from 130 innominate bones from Boston University's skeletal collection (School of Medicine). For each bone, both 3D images and original bones were assessed using the Phenice and Suchey-Brooks methods. Statistical analysis was used to determine the agreement between 3D image assessment versus traditional assessment. A pool of six individuals with varying experience in the field of forensic anthropology scored a subsample (n = 20) to explore interobserver error. While a high agreement was found for age and sex estimation for specimens scored by the author, the interobserver study shows that observers found it difficult to apply standard methods to 3D images. Higher levels of experience did not result in higher agreement between observers, as would be expected. Thus, a need for training in 3D visualization before applying anthropological methods to 3D bones is suggested. Future research should explore interobserver error using a larger sample size in order to test the hypothesis that training in 3D visualization will result in a higher agreement between scores. The need for the development of a standard scanning protocol focusing on the optimization of 3D image resolution is highlighted. 
Applications for this research include the possibility of digitizing skeletal collections in order to expand their use and for deriving skeletal collections from living populations and creating population-specific standards. Further research for the development of a standard scanning and processing protocol is needed before 3D methods in forensic anthropology are considered as reliable tools for generating biological profiles.
Inter-observer agreement, diagnostic sensitivity and specificity of animal-based indicators of young lamb welfare.

PubMed

Phythian, C J; Toft, N; Cripps, P J; Michalopoulou, E; Winter, A C; Jones, P H; Grove-White, D; Duncan, J S

2013-07-01

A scientific literature review and consensus of expert opinion used the welfare definitions provided by the Farm Animal Welfare Council (FAWC) Five Freedoms as the framework for selecting a set of animal-based indicators that were sensitive to the current on-farm welfare issues of young lambs (aged ≤ 6 weeks). Ten animal-based indicators assessed by observation (demeanour, response to stimulation, shivering, standing ability, posture, abdominal fill, body condition, lameness, eye condition and salivation) were tested as part of the objective of developing valid, reliable and feasible animal-based measures of lamb welfare. The indicators were independently tested on 966 young lambs from 17 sheep flocks across Northwest England and Wales during December 2008 to April 2009 by four trained observers. Inter-observer reliability was assessed using Fleiss's kappa (κ), and the pair-wise agreement with an experienced observer designated as the 'test standard observer' (TSO) was examined using Cohen's κ. Latent class analysis (LCA) estimated the sensitivity (Se) and specificity (Sp) of each observer without assuming a gold standard and predicted the Se and Sp of randomly selected observers who may apply the indicators in the future. Overall, good levels of inter-observer reliability and high levels of Sp were identified for demeanour (κ = 0.54, Se ≥ 0.70, Sp ≥ 0.98), stimulation (κ = 0.57, Se = 0.30 to 0.77, Sp ≥ 0.98), shivering (κ = 0.55, Se = 0.37 to 0.85, Sp ≥ 0.99), standing ability (κ = 0.54, Se ≥ 0.80, Sp ≥ 0.99), posture (κ = 0.45, Se ≥ 0.56, Sp = 0.99), abdominal fill (κ = 0.44, Se = 0.39 to 0.98, Sp = 0.99), body condition (κ = 0.72, Se = 0.38 to 0.90, Sp = 0.99), lameness (κ = 0.68, Se > 0.73, Sp = 1.00), and eye condition (κ = 0.72, Se ≥ 0.86, Sp = 0.99). LCA predicted that randomly selected observers had Se > 0.77 (acceptable), and Sp ≥ 0.98 (high) for assessments of demeanour, lameness, abdominal fill, posture, body condition and eye condition. 
The diagnostic performance of some indicators was influenced by the composition of the study population, and it would be useful to test the indicators on lambs with a greater level of outcomes associated with poor welfare. The findings presented in this paper could be applied in the selection of valid, reliable and feasible indicators used for the purposes of on-farm assessments of lamb welfare.
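The pair-wise agreement statistic named above, Cohen's κ, compares observed agreement with the agreement expected by chance from each observer's marginal frequencies. The sketch below is illustrative only: the function name `cohens_kappa` and the two rating lists are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two observers rating the same subjects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must rate the same non-empty set")
    n = len(ratings_a)
    # Proportion of subjects on which the two observers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each observer's marginal category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical binary scores (1 = indicator present) for ten lambs.
standard_observer = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
trained_observer = [1, 0, 0, 0, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(standard_observer, trained_observer)
print(f"kappa = {kappa:.2f}")
```

With these made-up scores the observers agree on 8 of 10 lambs, chance agreement is 0.58, and κ is roughly 0.52. Fleiss's κ, used in the study for more than two observers, generalizes the same observed-versus-chance comparison.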
Feasibility of Non-contrast-enhanced MR Angiography Using the Time-SLIP Technique for the Assessment of Pulmonary Arteriovenous Malformation

PubMed Central

HAMAMOTO, Kohei; MATSUURA, Katsuhiko; CHIBA, Emiko; OKOCHI, Tomohisa; TANNO, Keisuke; TANAKA, Osamu

2016-01-01

Purpose: The purpose of this study was to evaluate the diagnostic performance of non-contrast-enhanced magnetic resonance angiography with time-spatial labeling inversion pulse (time-SLIP MRA) in the assessment of pulmonary arteriovenous malformation (PAVM). Methods: Eleven consecutive patients with 38 documented PAVMs underwent time-SLIP MRA with a 3-tesla unit. Eight patients with 25 lesions were examined twice, once before and once after embolotherapy. The lesions were divided into two groups, initial diagnosis (n = 35) and follow-up (n = 28), corresponding to untreated and treated lesions, respectively, and were evaluated separately. To evaluate the initial diagnosis group, two reviewers assessed image quality for visualization of PAVMs by using a qualitative 4-point scale (1 = not assessable to 4 = excellent). The location and classification of PAVMs were also evaluated. The results were compared with those from digital subtraction angiography. For evaluation of the follow-up group, the reviewers assessed the status of treated PAVMs. Reperfusion and occlusion were defined as visualization and disappearance of the aneurysmal sac, respectively. The diagnostic accuracy of time-SLIP MRA was assessed and compared with standard reference images. Interobserver agreement was evaluated with the κ statistic. Results: In the initial diagnosis group, time-SLIP MRA correctly identified the PAVMs in all but one patient, who had one lesion and whose images were degraded by irregular breathing. Image quality was considered excellent (median = 4) and the κ coefficient was 0.85. Additionally, both readers could correctly localize and classify the PAVMs on time-SLIP MRA images, with a κ coefficient of 1.00 for each. In the follow-up group, the sensitivity and specificity of time-SLIP MRA for reperfusion of PAVMs were both 100%, and the κ coefficient was 1.00. 
Conclusion: Time-SLIP MRA is technically and clinically feasible and represents a promising technique for noninvasive pre- and post-treatment assessment of PAVMs. PMID:26841853
Optimization of Tube Current in Cone-beam Computed Tomography for the Detection of Vertical Root Fractures with Different Intracanal Materials.

PubMed

Gaêta-Araujo, Hugo; Silva de Souza, Gabriela Queiroz; Freitas, Deborah Queiroz; de Oliveira-Santos, Christiano

2017-10-01

There is no consensus about the accuracy of cone-beam computed tomography (CBCT) for detecting vertical root fractures (VRFs), nor is there certainty about the isolated effect of different tube current parameters on the diagnosis of VRF through CBCT scans. This study aimed to evaluate how tube current affects the detection of VRF on CBCT examinations in the absence of intracanal materials and in the presence of gutta-percha (GP) and metal (MP) or fiberglass (FP) intracanal posts. The sample consisted of 320 CBCT scans of tooth roots with and without VRF divided into 8 groups: no fracture/no intracanal material; no fracture + GP; no fracture + MP; no fracture + FP; fracture/no intracanal material; fracture + GP; fracture + MP; fracture + FP. The scans were acquired with an OP300 unit using 4 different milliamperes (4 mA, 8 mA, 10 mA, 13 mA). Five oral radiologists analyzed the images. The area under the receiver operating characteristic curve (Az), sensitivity, specificity, positive and negative predictive values, and interobserver agreement were calculated. Diagnostic performance for the different milliamperes tested was similar for teeth without root filling materials or with FP. Teeth with GP and MP showed the highest Az values for 8 mA and 10 mA, respectively. For teeth with MP, specificity was significantly higher when 10 mA was used. For teeth without root filling materials or with FP, the use of a reduced milliampere does not seem to influence the detection of VRF in a significant manner. For teeth with GP and MP, an increased milliampere may lead to increased diagnostic performance. Copyright © 2017 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
Measurement of compartment elasticity using pressure related ultrasound: a method to identify patients with potential compartment syndrome.

PubMed

Sellei, R M; Hingmann, S J; Kobbe, P; Weber, C; Grice, J E; Zimmerman, F; Jeromin, S; Gansslen, A; Hildebrand, F; Pape, H C

2015-01-01

PURPOSE OF THE STUDY Decision-making in treatment of an acute compartment syndrome is based on clinical assessment, supported by invasive monitoring. Thus, evolving compartment syndrome may require repeated pressure measurements. In suspected cases of potential compartment syndromes clinical assessment alone seems to be unreliable. The objective of this study was to investigate the feasibility of a non-invasive application estimating whole compartmental elasticity by ultrasound, which may improve accuracy of diagnostics. MATERIAL AND METHODS In an in-vitro model, using an artificial container simulating dimensions of the human anterior tibial compartment, intracompartmental pressures (p) were raised subsequently up to 80 mm Hg by infusion of saline solution. The compartmental depth (mm) in the cross-section view was measured before and after manual probe compression (100 mm Hg) upon the surface resulting in a linear compartmental displacement (Δd). This was repeated at rising compartmental pressures. The resulting displacements were related to the corresponding intra-compartmental pressures simulated in our model. A hypothesized relationship between pressures related compartmental displacement and the elasticity at elevated compartment pressures was investigated. RESULTS With rising compartmental pressures, a non-linear, reciprocal proportional relation between the displacement (mm) and the intra-compartmental pressure (mm Hg) occurred. The Pearson's coefficient showed a high correlation (r2 = -0.960). The intraobserver reliability value kappa resulted in a statistically high reliability (κ = 0.840). The inter-observer value indicated a fair reliability (κ = 0.640). CONCLUSIONS Our model reveals that a strong correlation between compartmental strain displacements assessed by ultrasound and the intra-compartmental pressure changes occurs. Further studies are required to prove whether this assessment is transferable to human muscle tissue. 
By determining the complete compartmental elasticity with ultrasound enhancement, this application may improve detection of early signs of potential compartment syndrome. Key words: compartment syndrome, intra-compartmental pressure, non-invasive diagnostic, elasticity measurement, elastography.
Indexing of Diagnostic Accuracy Studies in MEDLINE and EMBASE

PubMed Central

Wilczynski, Nancy L.; Haynes, R. Brian

2007-01-01

Background: STAndards for Reporting of Diagnostic Accuracy (STARD) were published in 2003 and endorsed by some journals but not others. Objective: To determine whether the quality of indexing of diagnostic accuracy studies in MEDLINE and EMBASE has improved since the STARD statement was published. Design: Evaluate the change in the mean number of “accurate index terms” assigned to diagnostic accuracy studies, comparing STARD (endorsing) and non-STARD (non-endorsing) journals, for 2 years before and after STARD publication. Results: In MEDLINE, no differences in indexing quality were found for STARD and non-STARD journals before or after the STARD statement was published in 2003. In EMBASE, indexing in STARD journals improved compared with non-STARD journals (p = 0.02). However, articles in STARD journals had half the number of accurate indexing terms as articles in non-STARD journals, both before and after STARD statement publication (p < 0.001). PMID:18693947
ROC curve analyses of eyewitness identification decisions: An analysis of the recent debate.

PubMed

Rotello, Caren M; Chen, Tina

2016-01-01

How should the accuracy of eyewitness identification decisions be measured, so that best practices for identification can be determined? This fundamental question is under intense debate. One side advocates for continued use of a traditional measure of identification accuracy, known as the diagnosticity ratio, whereas the other side argues that receiver operating characteristic curves (ROCs) should be used instead because diagnosticity is confounded with response bias. Diagnosticity proponents have offered several criticisms of ROCs, which we show are either false or irrelevant to the assessment of eyewitness accuracy. We also show that, like diagnosticity, Bayesian measures of identification accuracy confound response bias with witnesses' ability to discriminate guilty from innocent suspects. ROCs are an essential tool for distinguishing memory-based processes from decisional aspects of a response; simulations of different possible identification tasks and response strategies show that they offer important constraints on theory development.
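The claim that the diagnosticity ratio is confounded with response bias can be illustrated with a small simulation. This is a sketch under an assumed equal-variance Gaussian signal detection model, which is a common textbook simplification and not necessarily the model the authors simulated. The function name, the fixed discriminability `d_prime = 1.0`, and the criterion values are all hypothetical.

```python
from statistics import NormalDist


def identification_rates(d_prime, criterion):
    """Hit and false-alarm rates under an equal-variance Gaussian model."""
    standard_normal = NormalDist()
    # Guilty-suspect memory strength ~ N(d_prime, 1); innocent ~ N(0, 1).
    hit_rate = 1 - standard_normal.cdf(criterion - d_prime)
    false_alarm_rate = 1 - standard_normal.cdf(criterion)
    return hit_rate, false_alarm_rate


d_prime = 1.0  # assumed discriminability, held constant throughout
ratios = []
for criterion in (0.0, 0.5, 1.0, 1.5):  # increasingly conservative witnesses
    hit, false_alarm = identification_rates(d_prime, criterion)
    ratios.append(hit / false_alarm)  # diagnosticity ratio
    print(f"criterion={criterion:.1f}  hit={hit:.3f}  "
          f"false alarm={false_alarm:.3f}  ratio={ratios[-1]:.2f}")
```

Every point generated here lies on the same ROC curve because discriminability never changes, yet the diagnosticity ratio rises as the criterion becomes more conservative. In this toy model, a higher ratio therefore reflects stricter responding and not better memory.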
Diagnostic accuracy of scapular physical examination tests for shoulder disorders: a systematic review.

PubMed

Wright, Alexis A; Wassinger, Craig A; Frank, Mason; Michener, Lori A; Hegedus, Eric J

2013-09-01

To systematically review and critique the evidence regarding the diagnostic accuracy of physical examination tests for the scapula in patients with shoulder disorders. A systematic, computerised literature search of PubMED, EMBASE, CINAHL and the Cochrane Library databases (from database inception through January 2012) using keywords related to diagnostic accuracy of physical examination tests of the scapula. The Quality Assessment of Diagnostic Accuracy Studies tool was used to critique the quality of each paper. Eight articles met the inclusion criteria; three were considered to be of high quality. Of the three high-quality studies, two were in reference to a 'diagnosis' of shoulder pain. Only one high-quality article referenced specific shoulder pathology of acromioclavicular dislocation with reported sensitivity of 71% and 41% for the scapular dyskinesis and SICK scapula test, respectively. Overall, no physical examination test of the scapula was found to be useful in differentially diagnosing pathologies of the shoulder.
Using meta-analysis to inform the design of subsequent studies of diagnostic test accuracy.

PubMed

Hinchliffe, Sally R; Crowther, Michael J; Phillips, Robert S; Sutton, Alex J

2013-06-01

An individual diagnostic accuracy study rarely provides enough information to make conclusive recommendations about the accuracy of a diagnostic test; particularly when the study is small. Meta-analysis methods provide a way of combining information from multiple studies, reducing uncertainty in the result and hopefully providing substantial evidence to underpin reliable clinical decision-making. Very few investigators consider any sample size calculations when designing a new diagnostic accuracy study. However, it is important to consider the number of subjects in a new study in order to achieve a precise measure of accuracy. Sutton et al. have suggested previously that when designing a new therapeutic trial, it could be more beneficial to consider the power of the updated meta-analysis including the new trial rather than of the new trial itself. The methodology involves simulating new studies for a range of sample sizes and estimating the power of the updated meta-analysis with each new study added. Plotting the power values against the range of sample sizes allows the clinician to make an informed decision about the sample size of a new trial. This paper extends this approach from the trial setting and applies it to diagnostic accuracy studies. Several meta-analytic models are considered including bivariate random effects meta-analysis that models the correlation between sensitivity and specificity. Copyright © 2012 John Wiley & Sons, Ltd.
A structured proteomic approach identifies 14-3-3Sigma as a novel and reliable protein biomarker in panel based differential diagnostics of liver tumors.

PubMed

Reis, Henning; Pütter, Carolin; Megger, Dominik A; Bracht, Thilo; Weber, Frank; Hoffmann, Andreas-C; Bertram, Stefanie; Wohlschläger, Jeremias; Hagemann, Sascha; Eisenacher, Martin; Scherag, André; Schlaak, Jörg F; Canbay, Ali; Meyer, Helmut E; Sitek, Barbara; Baba, Hideo A

2015-06-01

Hepatocellular carcinoma (HCC) is a major lethal cancer worldwide. Despite sophisticated diagnostic algorithms, the differential diagnosis of small liver nodules still is difficult. While imaging techniques have advanced, adjuvant protein-biomarkers as glypican3 (GPC3), glutamine-synthetase (GS) and heat-shock protein 70 (HSP70) have enhanced diagnostic accuracy. The aim was to further detect useful protein-biomarkers of HCC with a structured systematic approach using differential proteome techniques, bring the results to practical application and compare the diagnostic accuracy of the candidates with the established biomarkers. After label-free and gel-based proteomics (n=18 HCC/corresponding non-tumorous liver tissue (NTLT)) biomarker candidates were tested for diagnostic accuracy in immunohistochemical analyses (n=14 HCC/NTLT). Suitable candidates were further tested for consistency in comparison to known protein-biomarkers in HCC (n=78), hepatocellular adenoma (n=25; HCA), focal nodular hyperplasia (n=28; FNH) and cirrhosis (n=28). Of all protein-biomarkers, 14-3-3Sigma (14-3-3S) exhibited the most pronounced up-regulation (58.8×) in proteomics and superior diagnostic accuracy (73.0%) in the differentiation of HCC from non-tumorous hepatocytes also compared to established biomarkers as GPC3 (64.7%) and GS (45.4%). 14-3-3S was part of the best diagnostic three-biomarker panel (GPC3, HSP70, 14-3-3S) for the differentiation of HCC and HCA which is of most important significance. Exclusion of GS and inclusion of 14-3-3S in the panel (>1 marker positive) resulted in a profound increase in specificity (+44.0%) and accuracy (+11.0%) while sensitivity remained stable (96.0%). 14-3-3S is an interesting protein biomarker with the potential to further improve the accuracy of differential diagnostic process of hepatocellular tumors. This article is part of a Special Issue entitled: Medical Proteomics. Copyright © 2014 Elsevier B.V. All rights reserved.
Instantaneous wave-free ratio as an alternative to fractional flow reserve in assessment of moderate coronary stenoses: A meta-analysis of diagnostic accuracy studies.

PubMed

Maini, Rohit; Moscona, John; Katigbak, Paul; Fernandez, Camilo; Sidhu, Gursukhmandeep; Saleh, Qusai; Irimpen, Anand; Samson, Rohan; LeJemtel, Thierry

2017-12-27

Fractional flow reserve (FFR) remains underutilized due to practical concerns related to the need for hyperemic agents. These concerns have prompted the study of instantaneous wave-free ratio (iFR), a vasodilator-free index of coronary stenosis. Non-inferior cardiovascular outcomes have been demonstrated in two recent randomized clinical trials. We performed this meta-analysis to provide a necessary update of the diagnostic accuracy of iFR referenced to FFR based on the addition of eight more recent studies and 3727 more lesions. We searched the PubMed, EMBASE, Central, ProQuest, and Web of Science databases for full text articles published through May 31, 2017 to identify studies addressing the diagnostic accuracy of iFR referenced to FFR≤0.80. The following keywords were used: "instantaneous wave-free ratio" OR "iFR" AND "fractional flow reserve" OR "FFR." In total, 16 studies comprising 5756 lesions were identified. Pooled diagnostic accuracy estimates of iFR versus FFR≤0.80 were: sensitivity, 0.78 (95% CI, 0.76-0.79); specificity, 0.83 (0.81-0.84); positive likelihood ratio, 4.54 (3.85-5.35); negative likelihood ratio, 0.28 (0.24-0.32); diagnostic odds ratio, 17.38 (14.16-21.34); area under the summary receiver-operating characteristic curve, 0.87; and an overall diagnostic accuracy of 0.81 (0.78-0.84). In conclusion, iFR showed excellent agreement with FFR as a resting index of coronary stenosis severity without the undesired effects and cost of hyperemic agents. When considered along with its clinical outcome data and ease of application, the diagnostic accuracy of iFR supports its use as a suitable alternative to FFR for physiology-guided revascularization of moderate coronary stenoses. We performed a meta-analysis of the diagnostic accuracy of iFR referenced to FFR. iFR showed excellent agreement with FFR as a resting index of coronary stenosis severity without the undesired effects and cost of hyperemic agents. 
This supports its use as a suitable alternative to FFR for physiology-guided revascularization of moderate coronary stenoses. Copyright © 2017. Published by Elsevier Inc.
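The likelihood ratios and diagnostic odds ratio listed above are related to sensitivity and specificity by simple formulas. The sketch below applies those textbook formulas to the pooled sensitivity (0.78) and specificity (0.83) purely as an arithmetic illustration; the function name is hypothetical, and the meta-analysis pooled each measure separately across studies, so its reported values need not equal these back-of-envelope figures.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive LR, negative LR, and diagnostic odds ratio from Se and Sp."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    diagnostic_odds_ratio = lr_positive / lr_negative
    return lr_positive, lr_negative, diagnostic_odds_ratio


# Pooled point estimates quoted in the abstract.
lr_pos, lr_neg, dor = likelihood_ratios(sensitivity=0.78, specificity=0.83)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.2f}")
```

The computed values (about 4.59, 0.27, and 17.3) are close to, but not identical with, the separately pooled estimates of 4.54, 0.28, and 17.38.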
Using Language Sample Analysis in Clinical Practice: Measures of Grammatical Accuracy for Identifying Language Impairment in Preschool and School-Aged Children.

PubMed

Eisenberg, Sarita; Guo, Ling-Yu

2016-05-01

This article reviews the existing literature on the diagnostic accuracy of two grammatical accuracy measures for differentiating children with and without language impairment (LI) at preschool and early school age based on language samples. The first measure, the finite verb morphology composite (FVMC), is a narrow grammatical measure that computes children's overall accuracy of four verb tense morphemes. The second measure, percent grammatical utterances (PGU), is a broader grammatical measure that computes children's accuracy in producing grammatical utterances. The extant studies show that FVMC demonstrates acceptable (i.e., 80 to 89% accurate) to good (i.e., 90% accurate or higher) diagnostic accuracy for children between 4;0 (years;months) and 6;11 in conversational or narrative samples. In contrast, PGU yields acceptable to good diagnostic accuracy for children between 3;0 and 8;11 regardless of sample types. Given the diagnostic accuracy shown in the literature, we suggest that FVMC and PGU can be used as one piece of evidence for identifying children with LI in assessment when appropriate. However, FVMC or PGU should not be used as therapy goals directly. Instead, when children are low in FVMC or PGU, we suggest that follow-up analyses should be conducted to determine the verb tense morphemes or grammatical structures that children have difficulty with. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Quantifying facial paralysis using the Kinect v2.

PubMed

Gaber, Amira; Taher, Mona F; Wahed, Manal Abdel

2015-01-01

Assessment of facial paralysis (FP) and quantitative grading of facial asymmetry are essential in order to quantify the extent of the condition as well as to follow its improvement or progression. As such, there is a need for an accurate quantitative grading system that is easy to use, inexpensive and has minimal inter-observer variability. A comprehensive automated system to quantify and grade FP is the main objective of this work. An initial prototype has been presented by the authors. The present research aims to enhance the accuracy and robustness of one of this system's modules: the resting symmetry module. This is achieved by including several modifications to the computation method of the symmetry index (SI) for the eyebrows, eyes and mouth. These modifications are the gamma correction technique, the area of the eyes, and the slope of the mouth. The system was tested on normal subjects and showed promising results. With the modified method, the mean SI of the eyebrows decreased slightly from 98.42% to 98.04%, while the mean SI for the eyes and mouth increased from 96.93% to 99.63% and from 95.6% to 98.11%, respectively. The system is easy to use, inexpensive, automated and fast, has no inter-observer variability and is thus well suited for clinical use.

Accuracy of clinical tests in the diagnosis of anterior cruciate ligament injury: a systematic review

PubMed Central

2014-01-01

Background: Numerous clinical tests are used in the diagnosis of anterior cruciate ligament (ACL) injury but their accuracy is unclear. The purpose of this study is to evaluate the diagnostic accuracy of clinical tests for the diagnosis of ACL injury. Methods: Study design: systematic review. The review protocol was registered through PROSPERO (CRD42012002069). Electronic databases (PubMed, MEDLINE, EMBASE, CINAHL) were searched up to 19th of June 2013 to identify diagnostic studies comparing the accuracy of clinical tests for ACL injury to an acceptable reference standard (arthroscopy, arthrotomy, or MRI). Risk of bias was appraised using the QUADAS-2 checklist. Index test accuracy was evaluated using a descriptive analysis of paired likelihood ratios and displayed as forest plots. Results: A total of 285 full-text articles were assessed for eligibility, from which 14 studies were included in this review. Included studies were deemed to be clinically and statistically heterogeneous, so a meta-analysis was not performed. Nine clinical tests from the history (popping sound at time of injury, giving way, effusion, pain, ability to continue activity) and four from physical examination (anterior draw test, Lachman’s test, prone Lachman’s test and pivot shift test) were investigated for diagnostic accuracy. Inspection of positive and negative likelihood ratios indicated that none of the individual tests provide useful diagnostic information in a clinical setting. Most studies were at risk of bias and reported imprecise estimates of diagnostic accuracy. Conclusion: Despite being widely used and accepted in clinical practice, the results of individual history items or physical tests do not meaningfully change the probability of ACL injury. In contrast, combinations of tests have higher diagnostic accuracy; however, the most accurate combination of clinical tests remains an area for future research. Clinical relevance: Clinicians should be aware of the limitations associated with the use of clinical tests for diagnosis of ACL injury. PMID:25187877
Accuracy of clinical tests in the diagnosis of anterior cruciate ligament injury: a systematic review.

PubMed

Swain, Michael S; Henschke, Nicholas; Kamper, Steven J; Downie, Aron S; Koes, Bart W; Maher, Chris G

2014-01-01

Numerous clinical tests are used in the diagnosis of anterior cruciate ligament (ACL) injury but their accuracy is unclear. The purpose of this study is to evaluate the diagnostic accuracy of clinical tests for the diagnosis of ACL injury. Systematic review. The review protocol was registered through PROSPERO (CRD42012002069). Electronic databases (PubMed, MEDLINE, EMBASE, CINAHL) were searched up to 19th of June 2013 to identify diagnostic studies comparing the accuracy of clinical tests for ACL injury to an acceptable reference standard (arthroscopy, arthrotomy, or MRI). Risk of bias was appraised using the QUADAS-2 checklist. Index test accuracy was evaluated using a descriptive analysis of paired likelihood ratios and displayed as forest plots. A total of 285 full-text articles were assessed for eligibility, from which 14 studies were included in this review. Included studies were deemed to be clinically and statistically heterogeneous, so a meta-analysis was not performed. Nine clinical tests from the history (popping sound at time of injury, giving way, effusion, pain, ability to continue activity) and four from physical examination (anterior draw test, Lachman's test, prone Lachman's test and pivot shift test) were investigated for diagnostic accuracy. Inspection of positive and negative likelihood ratios indicated that none of the individual tests provide useful diagnostic information in a clinical setting. Most studies were at risk of bias and reported imprecise estimates of diagnostic accuracy. Despite being widely used and accepted in clinical practice, the results of individual history items or physical tests do not meaningfully change the probability of ACL injury. In contrast combinations of tests have higher diagnostic accuracy; however the most accurate combination of clinical tests remains an area for future research. Clinicians should be aware of the limitations associated with the use of clinical tests for diagnosis of ACL injury.
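The abstract above evaluates tests through paired likelihood ratios and asks whether a result meaningfully changes the probability of ACL injury. As a minimal sketch of that arithmetic (the function names are illustrative and the input values in the usage note are hypothetical, not figures from the review), the standard definitions can be written as:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test, given proportions in the open interval (0, 1)."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)
```

For instance, a hypothetical test with sensitivity 0.80 and specificity 0.90 has LR+ of about 8 and LR- of about 0.22. A likelihood ratio close to 1 leaves the post-test probability almost equal to the pre-test probability, which is the sense in which a test "does not meaningfully change" the probability of disease.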
Reliability and accuracy analysis of a new semiautomatic radiographic measurement software in adult scoliosis.

PubMed

Aubin, Carl-Eric; Bellefleur, Christian; Joncas, Julie; de Lanauze, Dominic; Kadoury, Samuel; Blanke, Kathy; Parent, Stefan; Labelle, Hubert

2011-05-20

Radiographic software measurement analysis in adult scoliosis. To assess the accuracy as well as the intra- and interobserver reliability of measuring different indices on preoperative adult scoliosis radiographs using a novel measurement software that includes a calibration procedure and semiautomatic features to facilitate the measurement process. Scoliosis requires a careful radiographic evaluation to assess the deformity. Manual and computer radiographic process measures have been studied extensively to determine the reliability and reproducibility in adolescent idiopathic scoliosis. Most studies rely on comparing given measurements, which are repeated by the same user or by an expert user. A given measure with a small intra- or interobserver error might be deemed as good repeatability, but all measurements might not be truly accurate because the ground-truth value is often unknown. Thorough accuracy assessment of radiographic measures is necessary to assess scoliotic deformities, compare these measures at different stages or to permit valid multicenter studies. Thirty-four sets of adult scoliosis digital radiographs were measured two times by three independent observers using a novel radiographic measurement software that includes semiautomatic features to facilitate the measurement process. Twenty different measures taken from the Spinal Deformity Study Group radiographic measurement manual were performed on the coronal and sagittal images. Intra- and intermeasurer reliability for each measure was assessed. The accuracy of the measurement software was also assessed using a physical spine model in six different scoliotic configurations as a true reference. The majority of the measures demonstrated good to excellent intra- and intermeasurer reliability, except for sacral obliquity. 
The standard variation of all the measures was very small: ≤ 4.2° for Cobb angles, ≤ 4.2° for the kyphosis, ≤ 5.7° for the lordosis, ≤ 3.9° for the pelvic angles, and ≤5.3° for the sacral angles. The variability in the linear measurements (distances) was <4 mm. The variance of the measures was 1.7 and 2.6 times greater, respectively, for the angular and linear measures between the inter- and intrameasurer reliability. The image quality positively influenced the intermeasurer reliability especially for the proximal thoracic Cobb angle, T10-L2 lordosis, sacral slope and L5 seating. The accuracy study revealed that on average the difference in the angular measures was < 2° for the Cobb angles, and < 4° for the other angles, except T2-T12 kyphosis (5.3°). The linear measures were all <3.5 mm difference on average. The majority of the measures, which were analyzed in this study demonstrated good to excellent reliability and accuracy. The novel semiautomatic measurement software can be recommended for use for clinical, research or multicenter study purposes.
Transbronchial Lung Cryobiopsy and Video-assisted Thoracoscopic Lung Biopsy in the Diagnosis of Diffuse Parenchymal Lung Disease. A Meta-analysis of Diagnostic Test Accuracy.

PubMed

Iftikhar, Imran H; Alghothani, Lana; Sardi, Alejandro; Berkowitz, David; Musani, Ali I

2017-07-01

Transbronchial lung cryobiopsy is increasingly being used for the assessment of diffuse parenchymal lung diseases. Several studies have shown larger biopsy samples and higher yields compared with conventional transbronchial biopsies. However, the higher risk of bleeding and other complications has raised concerns for widespread use of this modality. To study the diagnostic accuracy and safety profile of transbronchial lung cryobiopsy and compare with video-assisted thoracoscopic surgery (VATS) by reviewing available evidence from the literature. Medline and PubMed were searched from inception until December 2016. Data on diagnostic performance were abstracted by constructing two-by-two contingency tables for each study. Data on a priori selected safety outcomes were collected. Risk of bias was assessed with the Quality Assessment of Diagnostic Accuracy Studies tool. Random effects meta-analyses were performed to obtain summary estimates of the diagnostic accuracy. The pooled diagnostic yield, pooled sensitivity, and pooled specificity of transbronchial lung cryobiopsy were 83.7% (76.9-88.8%), 87% (85-89%), and 57% (40-73%), respectively. The pooled diagnostic yield, pooled sensitivity, and pooled specificity of VATS were 92.7% (87.6-95.8%), 91.0% (89-92%), and 58% (31-81%), respectively. The incidence of grade 2 (moderate to severe) endobronchial bleeding after transbronchial lung cryobiopsy and of post-procedural pneumothorax was 4.9% (2.2-10.7%) and 9.5% (5.9-14.9%), respectively. Although the diagnostic test accuracy measures of transbronchial lung cryobiopsy lag behind those of VATS, with an acceptable safety profile and potential cost savings, the former could be considered as an alternative in the evaluation of patients with diffuse parenchymal lung diseases.
Evaluating Random Error in Clinician-Administered Surveys: Theoretical Considerations and Clinical Applications of Interobserver Reliability and Agreement.

PubMed

Bennett, Rebecca J; Taljaard, Dunay S; Olaithe, Michelle; Brennan-Jones, Chris; Eikelboom, Robert H

2017-09-18

The purpose of this study is to raise awareness of interobserver concordance and the differences between interobserver reliability and agreement when evaluating the responsiveness of a clinician-administered survey and, specifically, to demonstrate the clinical implications of data types (nominal/categorical, ordinal, interval, or ratio) and statistical index selection (for example, Cohen's kappa, Krippendorff's alpha, or interclass correlation). In this prospective cohort study, 3 clinical audiologists, who were masked to each other's scores, administered the Practical Hearing Aid Skills Test-Revised to 18 adult owners of hearing aids. Interobserver concordance was examined using a range of reliability and agreement statistical indices. The importance of selecting statistical measures of concordance was demonstrated with a worked example, wherein the level of interobserver concordance achieved varied from "no agreement" to "almost perfect agreement" depending on data types and statistical index selected. This study demonstrates that the methodology used to evaluate survey score concordance can influence the statistical results obtained and thus affect clinical interpretations.
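One of the indices named above, Cohen's kappa, corrects the raw agreement between two raters for the agreement expected by chance. The sketch below is a plain implementation for nominal labels from exactly two raters; it is not the analysis used in the study (which involved three audiologists and several indices), and the labels in the usage note are made up.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single identical label")
    return (observed - expected) / (1 - expected)
```

With hypothetical scores `["pass", "pass", "fail", "fail"]` and `["pass", "fail", "fail", "fail"]`, raw agreement is 75% but kappa is 0.5, which illustrates how the choice of index changes the apparent level of concordance.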
Reliability and accuracy of three imaging software packages used for 3D analysis of the upper airway on cone beam computed tomography images.

PubMed

Chen, Hui; van Eijnatten, Maureen; Wolff, Jan; de Lange, Jan; van der Stelt, Paul F; Lobbezoo, Frank; Aarab, Ghizlane

2017-08-01

The aim of this study was to assess the reliability and accuracy of three different imaging software packages for three-dimensional analysis of the upper airway using CBCT images. To assess the reliability of the software packages, 15 NewTom 5G ® (QR Systems, Verona, Italy) CBCT data sets were randomly and retrospectively selected. Two observers measured the volume, minimum cross-sectional area and the length of the upper airway using Amira ® (Visage Imaging Inc., Carlsbad, CA), 3Diagnosys ® (3diemme, Cantu, Italy) and OnDemand3D ® (CyberMed, Seoul, Republic of Korea) software packages. The intra- and inter-observer reliability of the upper airway measurements were determined using intraclass correlation coefficients and Bland & Altman agreement tests. To assess the accuracy of the software packages, one NewTom 5G ® CBCT data set was used to print a three-dimensional anthropomorphic phantom with known dimensions to be used as the "gold standard". This phantom was subsequently scanned using a NewTom 5G ® scanner. Based on the CBCT data set of the phantom, one observer measured the volume, minimum cross-sectional area, and length of the upper airway using Amira ® , 3Diagnosys ® , and OnDemand3D ® , and compared these measurements with the gold standard. The intra- and inter-observer reliability of the measurements of the upper airway using the different software packages were excellent (intraclass correlation coefficient ≥0.75). There was excellent agreement between all three software packages in volume, minimum cross-sectional area and length measurements. All software packages underestimated the upper airway volume by -8.8% to -12.3%, the minimum cross-sectional area by -6.2% to -14.6%, and the length by -1.6% to -2.9%. All three software packages offered reliable volume, minimum cross-sectional area and length measurements of the upper airway. The length measurements of the upper airway were the most accurate results in all software packages. 
All software packages underestimated the upper airway dimensions of the anthropomorphic phantom.
Polyp morphology: an interobserver evaluation for the Paris classification among international experts.

PubMed

van Doorn, Sascha C; Hazewinkel, Y; East, James E; van Leerdam, Monique E; Rastogi, Amit; Pellisé, Maria; Sanduleanu-Dascalescu, Silvia; Bastiaansen, Barbara A J; Fockens, Paul; Dekker, Evelien

2015-01-01

The Paris classification is an international classification system for describing polyp morphology. Thus far, the validity and reproducibility of this classification have not been assessed. We aimed to determine the interobserver agreement for the Paris classification among seven Western expert endoscopists. A total of 85 short endoscopic video clips depicting polyps were created and assessed by seven expert endoscopists according to the Paris classification. After a digital training module, the same 85 polyps were assessed again. We calculated the interobserver agreement with a Fleiss kappa and as the proportion of pairwise agreement. The interobserver agreement of the Paris classification among seven experts was moderate with a Fleiss kappa of 0.42 and a mean pairwise agreement of 67%. The proportion of lesions assessed as "flat" by the experts ranged between 13 and 40% (P<0.001). After the digital training, the interobserver agreement did not change (kappa 0.38, pairwise agreement 60%). Our study is the first to validate the Paris classification for polyp morphology. We demonstrated only a moderate interobserver agreement among international Western experts for this classification system. Our data suggest that, in its current version, the use of this classification system in daily practice is questionable and it is unsuitable for comparative endoscopic research. We therefore suggest introduction of a simplification of the classification system.
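The study above reports both a Fleiss kappa and a mean pairwise agreement. A minimal sketch of how the two relate is given below, assuming the ratings are summarized as a table with one row per polyp and one column per category, each cell holding the number of raters who chose that category. The function names and the toy table in the usage note are illustrative only.

```python
def mean_pairwise_agreement(counts):
    """Mean proportion of agreeing rater pairs per subject.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    return sum(per_subject) / len(counts)


def fleiss_kappa(counts):
    """Fleiss' kappa: pairwise agreement corrected for chance agreement."""
    observed = mean_pairwise_agreement(counts)
    total_ratings = len(counts) * sum(counts[0])
    n_categories = len(counts[0])
    # Overall share of ratings falling in each category.
    shares = [sum(row[j] for row in counts) / total_ratings for j in range(n_categories)]
    expected = sum(p * p for p in shares)
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (observed - expected) / (1 - expected)
```

For a toy table such as `[[2, 1], [1, 2]]` (three raters, two categories), pairwise agreement is one third, yet kappa is negative because chance alone would produce 50% agreement. This is why a pairwise agreement of 67% can coexist with a kappa of only 0.42.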
Diagnostic value of Doppler assessment of the hepatic and portal vessels and ultrasound of the spleen in liver disease.

PubMed

O'Donohue, John; Ng, Chaan; Catnach, Susan; Farrant, Patricia; Williams, Roger

2004-02-01

To investigate the clinical utility and the intra-observer and inter-observer variability of Doppler ultrasound assessment of the hepatic and portal vessels along with measurement of spleen size in the diagnosis of chronic liver disease and cirrhosis. Ultrasound measurements of portal vein diameter (PVD), portal vein velocity (PVV), hepatic arterial resistance index (HARI), hepatic vein profile (HVP), and spleen size were obtained in 49 controls and 45 patients with liver disease (23 with primary biliary cirrhosis, 22 with hepatitis C) by two experienced observers, who each performed three blinded measurements of each variable. Control values were derived from normal hospital workers. Percutaneous liver biopsies in 41 of the patients showed cirrhosis (14 patients), moderate/severe fibrosis (13 patients), and early disease (14 patients). Seventy-one percent of cirrhotic patients had splenomegaly (> 13.6 cm). The spleen size was significantly larger in cirrhotics (16.0 cm) than in non-cirrhotics (13.0 cm, P < 0.009) and healthy controls (10.7 cm, P < 0.00005), and was the only independent predictor of cirrhosis, with a threshold of 15 cm predicting cirrhosis with a specificity of 98%, positive predictive value of 93%, sensitivity of 57% and negative predictive value of 80%. HVP was abnormal in 76.9% of cirrhotics, 57.7% of non-cirrhotics and 2.1% of controls (P < 0.04). However, the mean PVV, PVD and HARI were no different between controls and patients or between cirrhotic and non-cirrhotic liver disease. There was significant inter-observer variability for PVV, but intra-observer and inter-observer variability was acceptable for the other measurements. Splenomegaly size and abnormal HVP are useful predictors of chronic liver disease and cirrhosis, and both can be measured reliably and reproducibly. However, Doppler measurements of PVV, PVD and HARI are not useful in distinguishing patients with chronic liver disease from normal controls.
Delineation and segmentation of cerebral tumors by mapping blood-brain barrier disruption with dynamic contrast-enhanced CT and tracer kinetics modeling-a feasibility study.

PubMed

Bisdas, S; Yang, X; Lim, C C T; Vogl, T J; Koh, T S

2008-01-01

Dynamic contrast-enhanced (DCE) imaging is a promising approach for in vivo assessment of tissue microcirculation. Twenty patients with clinical and routine computed tomography (CT) evidence of intracerebral neoplasm were examined with DCE-CT imaging. Using a distributed-parameter model for tracer kinetics modeling of DCE-CT data, voxel-level maps of cerebral blood flow (F), intravascular blood volume (vi) and intravascular mean transit time (t1) were generated. Permeability-surface area product (PS), extravascular extracellular blood volume (ve) and extraction ratio (E) maps were also calculated to reveal pathologic locations of tracer extravasation, which are indicative of disruptions in the blood-brain barrier (BBB). All maps were visually assessed for quality of tumor delineation and measurement of tumor extent by two radiologists. Kappa (κ) coefficients and their 95% confidence intervals (CI) were calculated to determine the interobserver agreement for each DCE-CT map. There was substantial agreement for the tumor delineation quality in the F, ve and t1 maps. The agreement for the quality of the tumor delineation was excellent for the vi, PS and E maps. Concerning the measurement of tumor extent, excellent and nearly excellent agreement was achieved only for E and PS maps, respectively. According to these results, we performed a segmentation of the cerebral tumors on the basis of the E maps. The interobserver agreement for the tumor extent quantification based on manual segmentation of tumor in the E maps vs. the computer-assisted segmentation was excellent (κ = 0.96, CI: 0.93-0.99). The interobserver agreement for the tumor extent quantification based on computer segmentation in the mean images and the E maps was substantial (κ = 0.52, CI: 0.42-0.59). This study illustrates the diagnostic usefulness of parametric maps associated with BBB disruption on a physiology-based approach and highlights the feasibility of automatic segmentation of cerebral tumors.
Investigating Various Thresholds as Immunohistochemistry Cutoffs for Observer Agreement.

PubMed

Ali, Asif; Bell, Sarah; Bilsland, Alan; Slavin, Jill; Lynch, Victoria; Elgoweini, Maha; Derakhshan, Mohammad H; Jamieson, Nigel B; Chang, David; Brown, Victoria; Denley, Simon; Orange, Clare; McKay, Colin; Carter, Ross; Oien, Karin A; Duthie, Fraser R

2017-10-01

Clinical translation of immunohistochemistry (IHC) biomarkers requires reliable and reproducible cutoffs or thresholds for interpretation of immunostaining. Most IHC biomarker research focuses on the clinical relevance (diagnostic, prognostic, or predictive utility) of cutoffs, with less emphasis on observer agreement using these cutoffs. From the literature, we identified 3 commonly used cutoffs of 10% positive epithelial cells, 20% positive epithelial cells, and moderate to strong staining intensity (+2/+3 hereafter) to use for investigating observer agreement. A series of 36 images of microarray cores stained for 4 different IHC biomarkers, with variable staining intensity and percentage of positive cells, was used for investigating interobserver and intraobserver agreement. Seven pathologists scored the immunostaining in each image using the 3 cutoffs for positive and negative staining. Kappa (κ) statistic was used to assess the strength of agreement for each cutoff. The interobserver agreement between all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.64, 0.59, and 0.62, respectively, for 10%, 20%, and +2/+3 cutoffs. A good agreement was observed for experienced pathologists using the 10% cutoff, and their agreement was statistically higher than for junior pathologists (P=0.02). In addition, the mean intraobserver agreement for all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.71, 0.60, and 0.73, respectively, for 10%, 20%, and +2/+3 cutoffs. For all 3 cutoffs, a positive correlation was observed with perceived ease of interpretation (P<0.003). Finally, cytoplasmic-only staining achieved higher agreement using all 3 cutoffs than mixed staining patterns. All 3 cutoffs investigated achieve reasonable strength of agreement, modestly decreasing interobserver and intraobserver variability in IHC interpretation. 
These cutoffs have previously been used in cancer pathology, and this study provides evidence that these cutoffs can be reproducible between practicing pathologists.
Performance of search strategies to retrieve systematic reviews of diagnostic test accuracy from the Cochrane Library.

PubMed

Huang, Yuansheng; Yang, Zhirong; Wang, Jing; Zhuo, Lin; Li, Zhixia; Zhan, Siyan

2016-05-06

To compare the performance of search strategies to retrieve systematic reviews of diagnostic test accuracy from The Cochrane Library. The CDSR and DARE databases in the Cochrane Library were searched for systematic reviews of diagnostic test accuracy published between 2008 and 2012 through nine search strategies. Each strategy consists of one group or a combination of groups of search filters about diagnostic test accuracy. Four groups of diagnostic filters were used. The strategy combining all the filters was used as the reference to determine the sensitivity, precision, and the sensitivity x precision product for the other eight strategies. The reference strategy retrieved 8029 records, of which 832 were eligible. The strategy composed only of MeSH terms about "accuracy measures" achieved the highest values in both precision (69.71%) and product (52.45%) with a moderate sensitivity (75.24%). The combination of MeSH terms and free text words about "accuracy measures" contributed little to increasing the sensitivity. Strategies composed of filters about "diagnosis" had similar sensitivity but lower precision and product than those composed of filters about "accuracy measures". The MeSH term "exp 'diagnosis'" achieved the lowest precision (9.78%) and product (7.91%), while its hyponym retrieved only half the number of records at the expense of missing 53 target articles. Precision was negatively correlated with sensitivity among the nine strategies. Compared to the filters about "diagnosis", the filters about "accuracy measures" achieved similar sensitivities but higher precision. When both sets of terms were combined, the sensitivity of the strategy increased markedly. Combining MeSH terms and free text words about the same concept did not appear to enhance sensitivity. This article is protected by copyright. All rights reserved.
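The three measures used above to rank search strategies are simple ratios. The sketch below shows one way to compute them; the function name is illustrative, and the counts in the usage note are toy values rather than numbers from the study.

```python
def search_performance(relevant_retrieved, total_retrieved, total_relevant):
    """Return (sensitivity, precision, product) for a search strategy.

    sensitivity = relevant records retrieved / all relevant records
    precision   = relevant records retrieved / all records retrieved
    product     = sensitivity * precision
    """
    if total_retrieved <= 0 or total_relevant <= 0:
        raise ValueError("totals must be positive")
    if relevant_retrieved > min(total_retrieved, total_relevant):
        raise ValueError("relevant_retrieved cannot exceed either total")
    sensitivity = relevant_retrieved / total_relevant
    precision = relevant_retrieved / total_retrieved
    return sensitivity, precision, sensitivity * precision
```

As a consistency check on the reported figures, multiplying the stated sensitivity (75.24%) by the stated precision (69.71%) gives about 52.45%, which matches the reported product. For a toy strategy that retrieves 40 records, 30 of them relevant, out of 50 relevant records overall, the function yields a sensitivity of 0.60, a precision of 0.75, and a product of 0.45.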
Reliability, Validity, and Classification Accuracy of the DSM-5 Diagnostic Criteria for Gambling Disorder and Comparison to DSM-IV.

PubMed

Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C

2016-09-01

The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was twofold: first, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD; second, to compare the DSM-5 and DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
Diagnostic Accuracy of Chinese Medicine Diagnosis Scale of Phlegm and Blood Stasis Syndrome in Coronary Heart Disease: A Study Protocol.

PubMed

Liu, Xiao-Qi; Peng, Dan-Hong; Wang, Yan-Ping; Xie, Rong; Chen, Xin-Lin; Yu, Chun-Quan; Li, Xian-Tao

2018-05-03

Phlegm and blood stasis syndrome (PBSS) is one of the main syndromes in coronary heart disease (CHD). Syndromes of Chinese medicine (CM) lack quantitative and easily implemented diagnostic standards. To quantify and standardize the diagnosis of PBSS, scales are usually applied. To evaluate the diagnostic accuracy of CM diagnosis scale of PBSS in CHD. Six hundred patients with stable angina pectoris of CHD, 300 in case group and 300 in control group, will be recruited from 5 hospitals across China. Diagnosis from 2 experts will be considered as the "gold standard". The study design consists of 2 phases: a pilot test is used to evaluate the reliability and validity, and a diagnostic test is used to assess the diagnostic accuracy of the scale, including sensitivity, specificity, likelihood ratio and area under the receiver operator characteristic (ROC) curve. This study will evaluate the diagnostic accuracy of CM diagnosis scale of PBSS in CHD. The consensus of 2 experts may not be ideal as a "gold standard", and itself still requires further study. (No. ChiCTR-OOC-15006599).
A Statistical Evaluation of the Diagnostic Performance of MEDAS-The Medical Emergency Decision Assistance System

PubMed Central

Georgakis, D. Christine; Trace, David A.; Naeymi-Rad, Frank; Evens, Martha

1990-01-01

Medical expert systems require comprehensive evaluation of their diagnostic accuracy. The usefulness of these systems is limited without established evaluation methods. We propose a new methodology for evaluating the diagnostic accuracy and the predictive capacity of a medical expert system. We have adapted to the medical domain measures that have been used in the social sciences to examine the performance of human experts in the decision making process. Thus, in addition to the standard summary measures, we use measures of agreement and disagreement, and Goodman and Kruskal's λ and τ measures of predictive association. This methodology is illustrated by a detailed retrospective evaluation of the diagnostic accuracy of the MEDAS system. In a study using 270 patients admitted to the North Chicago Veterans Administration Hospital, diagnoses produced by MEDAS are compared with the discharge diagnoses of the attending physicians. The results of the analysis confirm the high diagnostic accuracy and predictive capacity of the MEDAS system. Overall, the agreement of the MEDAS system with the “gold standard” diagnosis of the attending physician has reached a 90% level.
Added value of cost-utility analysis in simple diagnostic studies of accuracy: (18)F-fluoromethylcholine PET/CT in prostate cancer staging.

PubMed

Gerke, Oke; Poulsen, Mads H; Høilund-Carlsen, Poul Flemming

2015-01-01

Diagnostic studies of accuracy targeting sensitivity and specificity are commonly done in a paired design in which all modalities are applied in each patient, whereas cost-effectiveness and cost-utility analyses are usually assessed either directly alongside to or indirectly by means of stochastic modeling based on larger randomized controlled trials (RCTs). However the conduct of RCTs is hampered in an environment such as ours, in which technology is rapidly evolving. As such, there is a relatively limited number of RCTs. Therefore, we investigated as to which extent paired diagnostic studies of accuracy can be also used to shed light on economic implications when considering a new diagnostic test. We propose a simple decision tree model-based cost-utility analysis of a diagnostic test when compared to the current standard procedure and exemplify this approach with published data from lymph node staging of prostate cancer. Average procedure costs were taken from the Danish Diagnosis Related Groups Tariff in 2013 and life expectancy was estimated for an ideal 60 year old patient based on prostate cancer stage and prostatectomy or radiation and chemotherapy. Quality-adjusted life-years (QALYs) were deduced from the literature, and an incremental cost-effectiveness ratio (ICER) was used to compare lymph node dissection with respective histopathological examination (reference standard) and (18)F-fluoromethylcholine positron emission tomography/computed tomography (FCH-PET/CT). Lower bounds of sensitivity and specificity of FCH-PET/CT were established at which the replacement of the reference standard by FCH-PET/CT comes with a trade-off between worse effectiveness and lower costs. Compared to the reference standard in a diagnostic accuracy study, any imperfections in accuracy of a diagnostic test imply that replacing the reference standard generates a loss in effectiveness and utility. 
We conclude that diagnostic studies of accuracy can be put to a more extensive use, over and above a mere indication of sensitivity and specificity of an imaging test, and that health economic considerations should be undertaken when planning a prospective diagnostic accuracy study. These endeavors will prove especially fruitful when comparing several imaging techniques with one another, or the same imaging technique using different tracers, with an independent reference standard for the evaluation of results.
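The comparison described above rests on the incremental cost-effectiveness ratio (ICER), the difference in cost divided by the difference in effect (here QALYs) between a new strategy and the reference. A minimal sketch of that calculation follows; the function name and all numbers in the usage note are hypothetical and are not taken from the Danish tariff data or the published results.

```python
def icer(cost_new, cost_reference, effect_new, effect_reference):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect.

    Effects are typically expressed in quality-adjusted life-years (QALYs).
    """
    delta_cost = cost_new - cost_reference
    delta_effect = effect_new - effect_reference
    if delta_effect == 0:
        raise ValueError("ICER is undefined when both strategies have the same effect")
    return delta_cost / delta_effect
```

For example, a hypothetical new strategy costing 12000 with 8.5 QALYs against a reference costing 10000 with 8.0 QALYs gives an ICER of 4000 per QALY gained. When the new test is both cheaper and less effective, as in the trade-off described in the abstract, both differences are negative and the ratio is read as savings per QALY forgone, so the sign of each difference should be inspected alongside the ratio.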
The role of serum erythropoietin level and JAK2 V617F allele burden in the diagnosis of polycythaemia vera.

PubMed

Ancochea, Agueda; Alvarez-Larrán, Alberto; Morales-Indiano, Cristian; García-Pallarols, Francesc; Martínez-Avilés, Luz; Angona, Anna; Senín, Alicia; Bellosillo, Beatriz; Besses, Carles

2014-11-01

Low serum erythropoietin (EPO) is a minor criterion of Polycythaemia Vera (PV) but its diagnostic usefulness relies on studies performed before the discovery of JAK2 V617F mutation. The objective of the present study was to evaluate the diagnostic accuracy of serum EPO and JAK2 V617F allele burden as markers of PV as well as the combination of different diagnostic criteria in 287 patients (99 with PV, 137 with Essential Thrombocythaemia and 51 with non-clonal erythrocytosis). Low EPO showed good diagnostic accuracy as a marker for PV, with the area under the curve (AUC) of the chemiluminescent-enhanced enzyme immunoassay (CEIA) being better than that of radioimmunoassay (RIA) (0·87 and 0·76 for CEIA and RIA, respectively). JAK2 V617F quantification displayed an excellent diagnostic accuracy, with an AUC of 0·95. A haematocrit >52% (males) or >48% (females) plus the presence of the JAK2 V617F mutation had a sensitivity and specificity of 79% and 97%, respectively. Adding low EPO or the JAK2 V617F allele burden did not improve the diagnostic accuracy for PV, whereas the inclusion of both improved the sensitivity to 83% while maintaining 96% specificity. Haematocrit and qualitative JAK2 V617F mutation allow a reliable diagnosis of PV. Incorporation of EPO and/or JAK2 V617F mutant load does not improve the diagnostic accuracy. © 2014 John Wiley & Sons Ltd.
Diagnostic Accuracy of MRI-guided Percutaneous Transthoracic Needle Biopsy of Solitary Pulmonary Nodules

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Shangang, E-mail: 1198685580@qq.com; Li, Chengli, E-mail: chenglilichina@yeah.net; Yu, Xuejuan, E-mail: yuxuejuan2011@126.com

2015-04-15

Objective: The purpose of our study was to evaluate the diagnostic accuracy of MRI-guided percutaneous transthoracic needle biopsy (PTNB) of solitary pulmonary nodules (SPNs). Methods: A retrospective review of 69 patients who underwent MR-guided PTNB of SPNs was performed. Each case was reviewed for complications. The final diagnosis was established by surgical pathology of the nodule or clinical and imaging follow-up. Pneumothorax rate and diagnostic accuracy were compared between two groups according to nodule diameter (≤2 vs. >2 cm) using the χ² test and Fisher’s exact test, respectively. Results: The success rate of single puncture was 95.6 %. Twelve (17.4 %) patients had pneumothorax, with 1 (1.4 %) requiring chest tube insertion. Mild hemoptysis occurred in 7 (7.2 %) patients. All of the sample material was sufficient for histological diagnostic evaluation. Pathological analysis of biopsy specimens showed 46 malignant, 22 benign, and 1 nondiagnostic nodule. The final diagnoses were 49 malignant nodules and 20 benign nodules based on postoperative histopathology and clinical follow-up data. One nondiagnostic sample was excluded from calculating diagnostic performance. The sensitivity, specificity, accuracy, positive predictive value, and negative predictive value in diagnosing SPNs were 95.8, 100, 97.0, 100, and 90.9 %, respectively. Pneumothorax rate, diagnostic sensitivity, and accuracy were not significantly different between the two groups (P > 0.05). Conclusions: MRI-guided PTNB is a safe, feasible, and highly accurate diagnostic technique for pathologic diagnosis of pulmonary nodules.
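The performance figures in this abstract can be checked against a two-by-two table. The sketch below assumes the counts TP = 46, FN = 2, FP = 0, TN = 20; these are inferred here from the reported biopsy results (46 malignant, 22 benign) and final diagnoses after excluding the one nondiagnostic sample, and are not stated as a table in the abstract. The function name `two_by_two_metrics` is illustrative.

```python
def two_by_two_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),            # diseased correctly detected
        "specificity": tn / (tn + fp),            # non-diseased correctly cleared
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
    }


# Assumed counts, inferred from the abstract (see the note above).
metrics = two_by_two_metrics(tp=46, fn=2, fp=0, tn=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f} %")
```

Under these assumed counts the sensitivity, specificity, positive predictive value, and negative predictive value match the reported 95.8, 100, 100, and 90.9 %. Accuracy computes as 66/68, about 97.1 %, against the reported 97.0 %, a difference that may simply reflect rounding.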
Diagnostic test accuracy of nutritional tools used to identify undernutrition in patients with colorectal cancer: a systematic review.

PubMed

Håkonsen, Sasja Jul; Pedersen, Preben Ulrich; Bath-Hextall, Fiona; Kirkpatrick, Pamela

2015-05-15

Effective nutritional screening, nutritional care planning and nutritional support are essential in all settings, and there is no doubt that a health service seeking to increase safety and clinical effectiveness must take nutritional care seriously. Screening and early detection of malnutrition is crucial in identifying patients at nutritional risk. There is a high prevalence of malnutrition in hospitalized patients undergoing treatment for colorectal cancer. To synthesize the best available evidence regarding the diagnostic test accuracy of nutritional tools (sensitivity and specificity) used to identify malnutrition (specifically undernutrition) in patients with colorectal cancer (such as the Malnutrition Screening Tool and Nutritional Risk Index) compared to reference tests (such as the Subjective Global Assessment or Patient Generated Subjective Global Assessment). Patients with colorectal cancer requiring either (or all) surgery, chemotherapy and/or radiotherapy in secondary care. Focus of the review: The diagnostic test accuracy of validated assessment tools/instruments (such as the Malnutrition Screening Tool and Nutritional Risk Index) in the diagnosis of malnutrition (specifically under-nutrition) in patients with colorectal cancer, relative to reference tests (Subjective Global Assessment or Patient Generated Subjective Global Assessment). Types of studies: Diagnostic test accuracy studies regardless of study design. Studies published in English, German, Danish, Swedish and Norwegian were considered for inclusion in this review. Databases were searched from their inception to April 2014. Methodological quality was determined using the Quality Assessment of Diagnostic Accuracy Studies checklist. Data was collected using the data extraction form: the Standards for Reporting Studies of Diagnostic Accuracy checklist for the reporting of studies of diagnostic accuracy. 
The accuracy of diagnostic tests is presented in terms of sensitivity, specificity, positive and negative predictive values. In addition, the positive likelihood ratio (sensitivity/[1 - specificity]) and negative likelihood ratio ([1 - sensitivity]/specificity) were also calculated and presented in this review to provide information about the likelihood that a given test result would be expected when the target condition is present compared with the likelihood that the same result would be expected when the condition is absent. Not all trials reported true positive, true negative, false positive and false negative rates, therefore these rates were calculated based on the data in the published papers. A two-by-two truth table was reconstructed for each study, and sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio and negative likelihood ratio were calculated for each study. A summary receiver operator characteristics curve was constructed to determine the relationship between sensitivity and specificity, and the area under the summary receiver operator characteristics curve which measured the usefulness of a test was calculated. Meta-analysis was not considered appropriate, therefore data was synthesized in a narrative summary. 1. One study evaluated the Malnutrition Screening Tool against the reference standard Patient-Generated Subjective Global Assessment. The sensitivity was 56% and the specificity 84%. The positive likelihood ratio was 3.100, negative likelihood ratio was 0.59, the diagnostic odds ratio (CI 95%) was 5.20 (1.09-24.90) and the Area Under the Curve (AUC) represents only a poor to fair diagnostic test accuracy. A total of two studies evaluated the diagnostic accuracy of Malnutrition Universal Screening Tool (MUST) (index test) compared to both Subjective Global Assessment (SGA) (reference standard) and PG-SGA (reference standard) in patients with colorectal cancer. 
In MUST vs SGA the sensitivity of the tool was 96%, specificity was 75%, LR+ 3.826, LR- 0.058, diagnostic OR (CI 95%) 66.00 (6.61-659.24) and AUC represented excellent diagnostic accuracy. In MUST vs PG-SGA the sensitivity of the tool was 72%, specificity 48.9%, LR+ 1.382, LR- 0.579, diagnostic OR (CI 95%) 2.39 (0.87-6.58) and AUC indicated that the tool failed as a diagnostic test to identify patients with colorectal cancer at nutritional risk. The Nutrition Risk Index (NRI) was compared to SGA representing a sensitivity of 95.2%, specificity of 62.5%, LR+ 2.521, LR- 0.087, diagnostic OR (CI 95%) 28.89 (6.93-120.40) and AUC represented good diagnostic accuracy. In regard to NRI vs PG-SGA the sensitivity of the tool was 68%, specificity 64%, LR+ 1.947, LR- 0.487, diagnostic OR (CI 95%) 4.00 (1.23-13.01) and AUC indicated poor diagnostic test accuracy. There are no single, specific tools used to screen or assess the nutritional status of colorectal cancer patients. All tools showed varied diagnostic accuracies when compared to the reference standards SGA and PG-SGA. Hence clinical judgment combined with perhaps the SGA or PG-SGA should play a major role. The PG-SGA offers several advantages over the SGA tool: 1) the patient completes the medical history component, thereby decreasing the amount of time involved; 2) it contains more nutrition impact symptoms, which are important to the patient with cancer; and 3) it has a scoring system that allows patients to be triaged for nutritional intervention. Therefore, the PG-SGA could be used as a nutrition assessment tool as it allows quick identification and prioritization of colorectal cancer patients with malnutrition in combination with other parameters. This systematic review highlights the need for the following: Further studies need to investigate the diagnostic accuracy of already existing nutritional screening tools in the context of colorectal cancer patients. 
If new screening tools are developed, they should be developed and validated in the specific clinical context within the same patient population (colorectal cancer patients). The Joanna Briggs Institute.
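The likelihood-ratio definitions quoted in this review can be written out as a short sketch. The function name `likelihood_ratios` is illustrative, and the inputs are the rounded MUST vs SGA percentages from the abstract (sensitivity 96%, specificity 75%). Because those inputs are rounded, the outputs come out close to, but not identical with, the published LR+ 3.826, LR- 0.058 and diagnostic odds ratio 66.00, which were presumably derived from the underlying counts.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, diagnostic odds ratio) from sensitivity and specificity.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    DOR = LR+ / LR-
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        # At exactly 0 or 1 one of the ratios divides by zero.
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Rounded values from the abstract (MUST vs SGA).
lr_pos, lr_neg, dor = likelihood_ratios(0.96, 0.75)
print(f"LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}, DOR = {dor:.1f}")
```

An LR+ well above 1 and an LR- close to 0 together indicate a test that shifts the probability of disease substantially in both directions, which is consistent with the review's description of MUST vs SGA.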
Genetics and genomics of breast fibroadenomas.

PubMed

Loke, Benjamin Nathanael; Md Nasir, Nur Diyana; Thike, Aye Aye; Lee, Jonathan Yu Han; Lee, Cheok Soon; Teh, Bin Tean; Tan, Puay Hoon

2018-05-01

Fibroadenomas of the breast are benign fibroepithelial tumours most frequently encountered in women of reproductive age, although they may be diagnosed at any age. The fibroadenoma comprises a proliferation of both stromal and epithelial components. The mechanisms underlying fibroadenoma pathogenesis remain incompletely understood. In the clinical setting, distinguishing cellular fibroadenomas from benign phyllodes tumours is a common diagnostic challenge due to subjective histopathological criteria and interobserver differences. Recent sequencing studies have demonstrated the presence of highly recurrent mutations in fibroadenomas, and also delineated the genomic landscapes of fibroadenomas and the closely related phyllodes tumours, revealing differences at the gene level, which may be of potential adjunctive diagnostic use. The present article provides an overview of key studies uncovering genetic and genomic abnormalities in fibroadenomas, from initial karyotype reports revealing myriad cytogenetic aberrations to next-generation sequencing-based approaches that led to the discovery of highly recurrent MED12 mutations. A thorough understanding of these abnormalities is important to further elucidate the mechanisms by which fibroadenomas arise and to refine diagnostic assessment of this very common tumour. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Estimation of diagnostic test accuracy without full verification: a review of latent class methods

PubMed Central

Collins, John; Huynh, Minh

2014-01-01

The performance of a diagnostic test is best evaluated against a reference test that is without error. For many diseases, this is not possible, and an imperfect reference test must be used. However, diagnostic accuracy estimates may be biased if inaccurately verified status is used as the truth. Statistical models have been developed to handle this situation by treating disease as a latent variable. In this paper, we conduct a systematized review of statistical methods using latent class models for estimating test accuracy and disease prevalence in the absence of complete verification. PMID:24910172

Diagnostic accuracy of imaging devices in glaucoma: A meta-analysis.

PubMed

Fallon, Monica; Valero, Oliver; Pazos, Marta; Antón, Alfonso

Imaging devices such as the Heidelberg retinal tomograph-3 (HRT3), scanning laser polarimetry (GDx), and optical coherence tomography (OCT) play an important role in glaucoma diagnosis. A systematic search for evidence-based data was performed for prospective studies evaluating the diagnostic accuracy of HRT3, GDx, and OCT. The diagnostic odds ratio (DOR) was calculated. To compare the accuracy among instruments and parameters, a meta-analysis considering the hierarchical summary receiver-operating characteristic model was performed. The risk of bias was assessed using quality assessment of diagnostic accuracy studies, version 2. Studies in the context of screening programs were used for qualitative analysis. Eighty-six articles were included. The DOR values were 29.5 for OCT, 18.6 for GDx, and 13.9 for HRT. The heterogeneity analysis demonstrated a statistically significant influence of degree of damage and ethnicity. Studies analyzing patients with earlier glaucoma showed poorer results. The risk of bias was high for patient selection. Screening studies showed lower sensitivity values and similar specificity values when compared with those included in the meta-analysis. The classification capabilities of GDx, HRT, and OCT were high and similar across the 3 instruments. The highest estimated DOR was obtained with OCT. Diagnostic accuracy could be overestimated in studies including prediagnosed groups of subjects. Copyright © 2017 Elsevier Inc. All rights reserved.
Is diagnostic accuracy for detecting pulmonary nodules in chest CT reduced after a long day of reading?

NASA Astrophysics Data System (ADS)

Krupinski, Elizabeth A.; Berbaum, Kevin S.; Caldwell, Robert; Schartz, Kevin M.

2012-02-01

Radiologists are reading more cases with more images, especially in CT and MRI, and thus working longer hours than ever before. There have been concerns raised regarding fatigue and whether it impacts diagnostic accuracy. This study measured the impact of reader visual fatigue by assessing symptoms, visual strain via dark focus of accommodation, and diagnostic accuracy. Twenty radiologists and 20 radiology residents were given two diagnostic performance tests searching CT chest sequences for a solitary pulmonary nodule before (rested) and after (tired) a day of clinical reading. Ten cases used free search and navigation, and the other 100 cases used preset scrolling speed and duration. Subjects filled out the Swedish Occupational Fatigue Inventory (SOFI) and the oculomotor strain subscale of the Simulator Sickness Questionnaire (SSQ) before each session. Accuracy was measured using ROC techniques. Using Swensson's technique yields an ROC area = 0.86 rested vs. 0.83 tired, p (one-tailed) = 0.09. Using Swensson's LROC technique yields an area = 0.73 rested vs. 0.66 tired, p (one-tailed) = 0.09. Using Swensson's Loc Accuracy technique yields an area = 0.77 rested vs. 0.72 tired, p (one-tailed) = 0.13. Subjective measures of fatigue increased significantly from early to late reading. To date, the results support our findings with static images and detection of bone fractures. Radiologists at the end of a long work day experience greater levels of measurable visual fatigue or strain, contributing to a decrease in diagnostic accuracy. The decrease in accuracy was not as great, however, as with static images.
Effect of varying displays and room illuminance on caries diagnostic accuracy in digital dental radiographs.

PubMed

Pakkala, T; Kuusela, L; Ekholm, M; Wenzel, A; Haiter-Neto, F; Kortesniemi, M

2012-01-01

In clinical practice, digital radiographs taken for caries diagnostics are viewed on varying types of displays and usually in relatively high ambient lighting (room illuminance) conditions. Our purpose was to assess the effect of room illuminance and varying display types on caries diagnostic accuracy in digital dental radiographs. Previous studies have shown that the diagnostic accuracy of caries detection is significantly better in reduced lighting conditions. Our hypothesis was that higher display luminance could compensate for this in higher ambient lighting conditions. Extracted human teeth with approximal surfaces clinically ranging from sound to demineralized were radiographed and evaluated by 3 observers who detected carious lesions on 3 different types of displays in 3 different room illuminance settings ranging from low illumination, i.e. what is recommended for diagnostic viewing, to higher illumination levels corresponding to those found in an average dental office. Sectioning and microscopy of the teeth validated the presence or absence of a carious lesion. Sensitivity, specificity and accuracy were calculated for each modality and observer. Differences were estimated by analyzing the binary data assuming the added effects of observer and modality in a generalized linear model. The observers obtained higher sensitivities in lower illuminance settings than in higher illuminance settings. However, this was related to a reduction in specificity, which meant that there was no significant difference in overall accuracy. Contrary to our hypothesis, there were no significant differences between the accuracy of different display types. Therefore, different displays and room illuminance levels did not affect the overall accuracy of radiographic caries detection. Copyright © 2012 S. Karger AG, Basel.
Administrative database code accuracy did not vary notably with changes in disease prevalence.

PubMed

van Walraven, Carl; English, Shane; Austin, Peter C

2016-11-01

Previous mathematical analyses of diagnostic tests based on the categorization of a continuous measure have found that test sensitivity and specificity vary significantly by disease prevalence. This study determined if the accuracy of diagnostic codes varied by disease prevalence. We used data from two previous studies in which the true status of renal disease and primary subarachnoid hemorrhage, respectively, had been determined. In multiple stratified random samples from the two previous studies having varying disease prevalence, we measured the accuracy of diagnostic codes for each disease using sensitivity, specificity, and positive and negative predictive value. Diagnostic code sensitivity and specificity did not change notably within clinically sensible disease prevalence. In contrast, positive and negative predictive values changed significantly with disease prevalence. Disease prevalence had no important influence on the sensitivity and specificity of diagnostic codes in administrative databases. Copyright © 2016 Elsevier Inc. All rights reserved.
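The contrast this abstract draws, stable sensitivity and specificity but prevalence-dependent predictive values, follows from Bayes' rule and can be shown with a minimal sketch. The sensitivity (0.90), specificity (0.95), and prevalence values below are hypothetical and are not taken from the study; `predictive_values` is an illustrative name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    tp = sensitivity * prevalence                # true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)    # false-positive fraction
    fn = (1 - sensitivity) * prevalence          # false-negative fraction
    tn = specificity * (1 - prevalence)          # true-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical fixed test characteristics, evaluated at three prevalences.
results = {p: predictive_values(0.90, 0.95, p) for p in (0.01, 0.10, 0.50)}
for prevalence, (ppv, npv) in results.items():
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is the pattern the study reports for diagnostic codes.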
Bayesian modeling and inference for diagnostic accuracy and probability of disease based on multiple diagnostic biomarkers with and without a perfect reference standard.

PubMed

Jafarzadeh, S Reza; Johnson, Wesley O; Gardner, Ian A

2016-03-15

The area under the receiver operating characteristic (ROC) curve (AUC) is used as a performance metric for quantitative tests. Although multiple biomarkers may be available for diagnostic or screening purposes, diagnostic accuracy is often assessed individually rather than in combination. In this paper, we consider the interesting problem of combining multiple biomarkers for use in a single diagnostic criterion with the goal of improving the diagnostic accuracy above that of an individual biomarker. The diagnostic criterion created from multiple biomarkers is based on the predictive probability of disease, conditional on given multiple biomarker outcomes. If the computed predictive probability exceeds a specified cutoff, the corresponding subject is allocated as 'diseased'. This defines a standard diagnostic criterion that has its own ROC curve, namely, the combined ROC (cROC). The AUC metric for cROC, namely, the combined AUC (cAUC), is used to compare the predictive criterion based on multiple biomarkers to one based on fewer biomarkers. A multivariate random-effects model is proposed for modeling multiple normally distributed dependent scores. Bayesian methods for estimating ROC curves and corresponding (marginal) AUCs are developed when a perfect reference standard is not available. In addition, cAUCs are computed to compare the accuracy of different combinations of biomarkers for diagnosis. The methods are evaluated using simulations and are applied to data for Johne's disease (paratuberculosis) in cattle. Copyright © 2015 John Wiley & Sons, Ltd.
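The idea of comparing the AUC of a combined criterion with the AUCs of individual biomarkers can be illustrated with a small sketch. It is not the Bayesian random-effects model of the paper: it uses the empirical (rank-based) AUC, a plain sum of two scores as a stand-in for the predictive probability of disease, and toy data invented for this example. It also assumes the true disease status is known, which the paper's methods do not require.

```python
def empirical_auc(diseased_scores, healthy_scores):
    """Empirical AUC: probability that a diseased score exceeds a healthy one.

    Ties count as one half, as in the Mann-Whitney statistic.
    """
    wins = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased_scores) * len(healthy_scores))


# Toy (biomarker A, biomarker B) pairs, invented for illustration.
diseased = [(3, 1), (1, 3), (3, 3), (2, 2)]
healthy = [(2, 0), (0, 2), (1, 1), (2, 1)]

auc_a = empirical_auc([a for a, _ in diseased], [a for a, _ in healthy])
auc_b = empirical_auc([b for _, b in diseased], [b for _, b in healthy])
# Stand-in combined criterion: the sum of both biomarkers (an assumption).
auc_combined = empirical_auc([a + b for a, b in diseased],
                             [a + b for a, b in healthy])

print(f"AUC A={auc_a:.3f}  AUC B={auc_b:.3f}  combined={auc_combined:.3f}")
```

In this toy data neither biomarker separates the groups on its own, but their sum does, which is the kind of gain a combined AUC is meant to quantify.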
Is there any evidence for the validity of diagnostic criteria used for accommodative and nonstrabismic binocular dysfunctions?

PubMed Central

Cacho-Martínez, Pilar; García-Muñoz, Ángel; Ruiz-Cantero, María Teresa

2013-01-01

Purpose: To analyze the diagnostic criteria used in the scientific literature published in the past 25 years for accommodative and nonstrabismic binocular dysfunctions and to explore if the epidemiological analysis of diagnostic validity has been used to propose which clinical criteria should be used for diagnostic purposes. Methods: We carried out a systematic review of papers on accommodative and nonstrabismic binocular disorders published from 1986 to 2012, searching the MEDLINE, CINAHL, PsycINFO and FRANCIS databases. We admitted original articles about diagnosis of these anomalies in any population. We identified 839 articles and 12 studies were included. The quality of included articles was assessed using the QUADAS-2 tool. Results: The review shows a wide range of clinical signs and cut-off points between authors. Only 3 studies (regarding accommodative anomalies) assessed diagnostic accuracy of clinical signs. Their results suggest using the accommodative amplitude and monocular accommodative facility for diagnosing accommodative insufficiency and a high positive relative accommodation for accommodative excess. The remaining 9 articles did not analyze diagnostic accuracy, assessing a diagnosis with the criteria the authors considered. We also found differences between studies in the way of considering patients’ symptomatology. Three of the 12 studies analyzed performed a validation of a symptom survey used for convergence insufficiency. Conclusions: Scientific literature reveals differences between authors according to diagnostic criteria for accommodative and nonstrabismic binocular dysfunctions. Diagnostic accuracy studies show that there is only certain evidence for accommodative conditions. For binocular anomalies there is only evidence about a validated questionnaire for convergence insufficiency with no data of diagnostic accuracy. PMID:24646897
Systematic reviews of diagnostic tests in endocrinology: an audit of methods, reporting, and performance.

PubMed

Spencer-Bonilla, Gabriela; Singh Ospina, Naykky; Rodriguez-Gutierrez, Rene; Brito, Juan P; Iñiguez-Ariza, Nicole; Tamhane, Shrikant; Erwin, Patricia J; Murad, M Hassan; Montori, Victor M

2017-07-01

Systematic reviews provide clinicians and policymakers estimates of diagnostic test accuracy and their usefulness in clinical practice. We identified all available systematic reviews of diagnosis in endocrinology, summarized the diagnostic accuracy of the tests included, and assessed the credibility and clinical usefulness of the methods and reporting. We searched Ovid MEDLINE, EMBASE, and Cochrane CENTRAL from inception to December 2015 for systematic reviews and meta-analyses reporting accuracy measures of diagnostic tests in endocrinology. Experienced reviewers independently screened for eligible studies and collected data. We summarized the results, methods, and reporting of the reviews. We performed subgroup analyses to categorize diagnostic tests as most useful based on their accuracy. We identified 84 systematic reviews; half of the tests included were classified as helpful when positive, one-fourth as helpful when negative. Most authors adequately reported how studies were identified and selected and how their trustworthiness (risk of bias) was judged. Only one in three reviews, however, reported an overall judgment about trustworthiness and one in five reported using adequate meta-analytic methods. One in four reported contacting authors for further information and about half included only patients with diagnostic uncertainty. Up to half of the diagnostic endocrine tests in which the likelihood ratio was calculated or provided are likely to be helpful in practice when positive, as are one-quarter when negative. Most diagnostic systematic reviews in endocrinology lack methodological rigor and protection against bias, and offer limited credibility. Substantial efforts, therefore, seem necessary to improve the quality of diagnostic systematic reviews in endocrinology.
The biasing effect of clinical history on physical examination diagnostic accuracy.

PubMed

Sibbald, Matthew; Cavalcanti, Rodrigo B

2011-08-01

Literature on diagnostic test interpretation has shown that access to clinical history can both enhance diagnostic accuracy and increase diagnostic error. Knowledge of clinical history has also been shown to enhance the more complex cognitive task of physical examination diagnosis, possibly by enabling early hypothesis generation. However, it is unclear whether clinicians adhere to these early hypotheses in the face of unexpected physical findings, thus resulting in diagnostic error. A sample of 180 internal medicine residents received a short clinical history and conducted a cardiac physical examination on a high-fidelity simulator. Resident Doctors (Residents) were randomised to three groups based on the physical findings in the simulator. The concordant group received physical examination findings consistent with the diagnosis that was most probable based on the clinical history. Discordant groups received findings associated with plausible alternative diagnoses which either lacked expected findings (indistinct discordant) or contained unexpected findings (distinct discordant). Physical examination diagnostic accuracy and physical examination findings were analysed. Physical examination diagnostic accuracy varied significantly among groups (75 ± 44%, 2 ± 13% and 31 ± 47% in the concordant, indistinct discordant and distinct discordant groups, respectively; F(2,177) = 53, p < 0.0001). Of the 115 Residents who were diagnostically unsuccessful, 33% adhered to their original incorrect hypotheses. Residents verbalised an average of 12 findings (interquartile range: 10-14); 58 ± 17% were correct and the percentage of correct findings was similar in all three groups (p = 0.44). Residents showed substantially decreased diagnostic accuracy when faced with discordant physical findings. The majority of trainees given discordant physical findings rejected their initial hypotheses, but were still diagnostically unsuccessful. 
These results suggest that overcoming the bias induced by a misleading clinical history may involve two independent steps: rejection of the incorrect initial hypothesis, and selection of the correct diagnosis. Educational strategies focused solely on prompting clinicians to re-examine their hypotheses may be insufficient to reduce diagnostic error. © Blackwell Publishing Ltd 2011.
The Clinical Usefulness of Endoscopic Ultrasound-Guided Fine Needle Aspiration and Biopsy for Rectal and Perirectal Lesions

PubMed Central

Soh, Jae Seung; Lee, Ho-Su; Lee, Seohyun; Bae, Jungho; Lee, Hyo Jeong; Park, Sang Hyoung; Yang, Dong-Hoon; Kim, Kyung-Jo; Ye, Byong Duk; Myung, Seung-Jae; Yang, Suk-Kyun; Kim, Jin-Ho

2015-01-01

Background/Aims: Endoscopic ultrasound-guided fine needle aspiration and/or biopsy (EUS-FNA/B) have been used to diagnose subepithelial tumors (SETs) and extraluminal lesions in the gastrointestinal tract. Our group previously reported the usefulness of EUS-FNA/B for rectal and perirectal lesions. This study reports our expanded experience with EUS-FNA/B for rectal and perirectal lesions in terms of diagnostic accuracy and safety. We also included our new experience with EUS-FNB using the recently introduced ProCore needle. Methods: From April 2009 to March 2014, EUS-FNA/B for rectal and perirectal lesions was performed in 30 consecutive patients. We evaluated EUS-FNA/B performance by comparing histological diagnoses with final results. We also investigated factors affecting diagnostic accuracy. Results: Among 10 patients with SETs, EUS-FNA/B specimen results revealed a gastrointestinal stromal tumor in 4 patients and malignant lymphoma in 1 patient. The diagnostic accuracy of EUS-FNA/B was 50% for SETs (5/10). Among 20 patients with non-SET lesions, 8 patients were diagnosed with malignant disease and 7 were diagnosed with benign disease based on both EUS-FNA/B and the final results. The diagnostic accuracy of EUS-FNA/B for non-SET lesions was 75% (15/20). The size of lesions was the only factor related to diagnostic accuracy (P=0.027). Two complications of mild fever and asymptomatic pneumoperitoneum occurred after EUS-FNA/B. Conclusions: The overall diagnostic accuracy of EUS-FNA/B for rectal and perirectal lesions was 67% (20/30). EUS-FNA/B is a clinically useful method for cytological and histological diagnoses of rectal and perirectal lesions. PMID:25931998
Diagnostic accuracy of 3D-transvaginal ultrasound in detecting uterine cavity abnormalities in infertile patients as compared with hysteroscopy.

PubMed

Apirakviriya, Chayanis; Rungruxsirivorn, Tassawan; Phupong, Vorapong; Wisawasukmongchol, Wirach

2016-05-01

To assess diagnostic accuracy of 3D transvaginal ultrasound (3D-TVS) compared with hysteroscopy in detecting uterine cavity abnormalities in infertile women. This prospective observational cross-sectional study was conducted during the July 2013 to December 2013 study period. Sixty-nine women with infertility were enrolled. In the mid to late follicular phase of each subject's menstrual cycle, 3D transvaginal ultrasound and hysteroscopy were performed on the same day in each patient. Hysteroscopy is widely considered to be the gold standard method for investigation of the uterine cavity. Uterine cavity characteristics and abnormalities were recorded. Diagnostic accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and positive and negative likelihood ratios were evaluated. Hysteroscopy was successfully performed in all subjects. Hysteroscopy diagnosed pathological findings in 22 of 69 cases (31.8%). There were 18 endometrial polyps, 3 submucous myomas, and 1 septate uterus. Three-dimensional transvaginal ultrasound in comparison with hysteroscopy had 84.1% diagnostic accuracy, 68.2% sensitivity, 91.5% specificity, 79% positive predictive value, and 86% negative predictive value. The positive and negative likelihood ratios were 8.01 and 0.3, respectively. 3D-TVS successfully detected every case of submucous myoma and uterine anomaly. For detection of endometrial polyps, 3D-TVS had 61.1% sensitivity, 91.5% specificity, and 83.1% diagnostic accuracy. 3D-TVS demonstrated 84.1% diagnostic accuracy for detecting uterine cavity abnormalities in infertile women. A significant percentage of infertile patients had evidence of uterine cavity pathology. Hysteroscopy is, therefore, recommended for accurate detection and diagnosis of uterine cavity lesion. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Low clinical diagnostic accuracy of early vs advanced Parkinson disease: clinicopathologic study.

PubMed

Adler, Charles H; Beach, Thomas G; Hentz, Joseph G; Shill, Holly A; Caviness, John N; Driver-Dunckley, Erika; Sabbagh, Marwan N; Sue, Lucia I; Jacobson, Sandra A; Belden, Christine M; Dugger, Brittany N

2014-07-29

Determine diagnostic accuracy of a clinical diagnosis of Parkinson disease (PD) using neuropathologic diagnosis as the gold standard. Data from the Arizona Study of Aging and Neurodegenerative Disorders were used to determine the predictive value of a clinical PD diagnosis, using 2 clinical diagnostic confidence levels, PossPD (never treated or not clearly responsive) and ProbPD (responsive to medications). Neuropathologic diagnosis was the gold standard. Based on first visit, 9 of 34 (26%) PossPD cases had neuropathologically confirmed PD while 80 of 97 (82%) ProbPD cases had confirmed PD. PD was confirmed in 8 of 15 (53%) ProbPD cases with <5 years of disease duration and 72 of 82 (88%) with ≥5 years of disease duration. Using final diagnosis at time of death, 91 of 107 (85%) ProbPD cases had confirmed PD. Clinical variables that improved diagnostic accuracy were medication response, motor fluctuations, dyskinesias, and hyposmia. Using neuropathologic findings of PD as the gold standard, this study establishes the novel findings of only 26% accuracy for a clinical diagnosis of PD in untreated or not clearly responsive subjects, 53% accuracy in early PD responsive to medication (<5 years' duration), and >85% diagnostic accuracy of longer duration, medication-responsive PD. Caution is needed when interpreting clinical studies of PD, especially studies of early disease that do not have autopsy confirmation. The need for a tissue or other diagnostic biomarker is reinforced. This study provides Class II evidence that a clinical diagnosis of PD identifies patients who will have pathologically confirmed PD with a sensitivity of 88% and specificity of 68%. © 2014 American Academy of Neurology.
Diagnostic Accuracy of the Slump Test for Identifying Neuropathic Pain in the Lower Limb.

PubMed

Urban, Lawrence M; MacNeil, Brian J

2015-08-01

Diagnostic accuracy study with nonconsecutive enrollment. To assess the diagnostic accuracy of the slump test for neuropathic pain (NeP) in those with low to moderate levels of chronic low back pain (LBP), and to determine whether accuracy of the slump test improves by adding anatomical or qualitative pain descriptors. Neuropathic pain has been linked with poor outcomes, likely due to inadequate diagnosis, which precludes treatment specific for NeP. Current diagnostic approaches are time-consuming or lack accuracy. A convenience sample of 21 individuals with LBP, with or without radiating leg pain, was recruited. A standardized neurosensory examination was used to determine the reference diagnosis for NeP. Afterward, the slump test was administered to all participants. Reports of pain location and quality produced during the slump test were recorded. The neurosensory examination designated 11 of the 21 participants with LBP/sciatica as having NeP. The slump test displayed high sensitivity (0.91), moderate specificity (0.70), a positive likelihood ratio of 3.03, and a negative likelihood ratio of 0.13. Adding the criterion of pain below the knee significantly increased specificity to 1.00 (positive likelihood ratio = 11.9). Pain-quality descriptors did not improve diagnostic accuracy. The slump test was highly sensitive in identifying NeP within the study sample. Adding a pain-location criterion improved specificity. Combining the diagnostic outcomes was very effective in identifying all those without NeP and half of those with NeP. Limitations arising from the small and narrow spectrum of participants with LBP/sciatica sampled within the study prevent application of the findings to a wider population. Level of evidence: Diagnosis, level 4-.
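The likelihood ratios reported in the slump test abstract follow directly from sensitivity and specificity. The sketch below is illustrative only: the counts 10/11 and 7/10 are inferred from the rounded values in the abstract (0.91 and 0.70, with 11 NeP and 10 non-NeP participants) and are not reported by the authors; the function name is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    LR+ is returned as infinity when specificity equals 1.
    """
    if specificity < 1.0:
        lr_pos = sensitivity / (1.0 - specificity)
    else:
        lr_pos = float("inf")
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Assumed counts (inferred, not reported): 10 of 11 NeP cases test positive,
# 7 of 10 non-NeP cases test negative.
sens = 10 / 11
spec = 7 / 10
lr_pos, lr_neg = likelihood_ratios(sens, spec)
print(round(lr_pos, 2), round(lr_neg, 2))  # 3.03 0.13
```

With a specificity of exactly 1.00 the uncorrected positive likelihood ratio is undefined, so the finite value of 11.9 in the abstract presumably reflects a continuity correction that this sketch does not attempt to reproduce.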
Comparison of Laser Scanning Diagnostic Devices for Early Glaucoma Detection.

PubMed

Schulze, Andreas; Lamparter, Julia; Pfeiffer, Norbert; Berisha, Fatmire; Schmidtmann, Irene; Hoffmann, Esther M

2015-08-01

To compare the diagnostic accuracy and to evaluate the correlation of optic nerve head and retinal nerve fiber layer thickness values between Fourier-Domain optical coherence tomography (FD-OCT), confocal scanning laser ophthalmoscopy (CSLO), and scanning laser polarimetry (SLP) for early glaucoma detection. Ninety-three patients with early open-angle glaucoma, 58 patients with ocular hypertension, and 60 healthy control subjects were included in this observational, cross-sectional study. All study participants underwent FD-OCT (RTVue-100), CSLO (HRT3), and SLP (GDx VCC) imaging of the optic nerve head and the retinal nerve fiber layer. Area under the receiver operating characteristic curves (AUROC) and Bland-Altman analysis were performed. The parameters with the highest diagnostic accuracy were found for FD-OCT cup-to-disc ratio (AUROC=0.841), for SLP NFI (AUROC=0.835), and for CSLO cup-to-disc ratio (AUROC=0.789). Diagnostic accuracy of the best CSLO and SLP parameter was similar (P=0.259). There was a small statistically significant difference between the best CSLO and FD-OCT parameters for differentiating between glaucoma and healthy eyes (P=0.047). FD-OCT and SLP have a similarly good diagnostic ability to distinguish between early glaucoma and healthy subjects. The diagnostic accuracy of CSLO was comparable with SLP and marginally lower compared with FD-OCT.
Megavoltage computed tomography image guidance with helical tomotherapy in patients with vertebral tumors: analysis of factors influencing interobserver variability.

PubMed

Levegrün, Sabine; Pöttgen, Christoph; Jawad, Jehad Abu; Berkovic, Katharina; Hepp, Rodrigo; Stuschke, Martin

2013-02-01

To evaluate megavoltage computed tomography (MVCT)-based image guidance with helical tomotherapy in patients with vertebral tumors by analyzing factors influencing interobserver variability, considered as quality criterion of image guidance. Five radiation oncologists retrospectively registered 103 MVCTs in 10 patients to planning kilovoltage CTs by rigid transformations in 4 df. Interobserver variabilities were quantified using the standard deviations (SDs) of the distributions of the correction vector components about the observers' fraction mean. To assess intraobserver variabilities, registrations were repeated after ≥4 weeks. Residual deviations after setup correction due to uncorrectable rotational errors and elastic deformations were determined at 3 craniocaudal target positions. To differentiate observer-related variations in minimizing these residual deviations across the 3-dimensional MVCT from image resolution effects, 2-dimensional registrations were performed in 30 single transverse and sagittal MVCT slices. Axial and longitudinal MVCT image resolutions were quantified. For comparison, image resolution of kilovoltage cone-beam CTs (CBCTs) and interobserver variability in registrations of 43 CBCTs were determined. Axial MVCT image resolution is 3.9 lp/cm. Longitudinal MVCT resolution amounts to 6.3 mm, assessed as full-width at half-maximum of thin objects in MVCTs with finest pitch. Longitudinal CBCT resolution is better (full-width at half-maximum, 2.5 mm for CBCTs with 1-mm slices). In MVCT registrations, interobserver variability in the craniocaudal direction (SD 1.23 mm) is significantly larger than in the lateral and ventrodorsal directions (SD 0.84 and 0.91 mm, respectively) and significantly larger compared with CBCT alignments (SD 1.04 mm). Intraobserver variabilities are significantly smaller than corresponding interobserver variabilities (variance ratio [VR] 1.8-3.1). 
Compared with 3-dimensional registrations, 2-dimensional registrations have significantly smaller interobserver variability in the lateral and ventrodorsal directions (VR 3.8 and 2.8, respectively) but not in the craniocaudal direction (VR 0.75). Tomotherapy image guidance precision is affected by image resolution and residual deviations after setup correction. Eliminating the effect of residual deviations yields small interobserver variabilities with submillimeter precision in the axial plane. In contrast, interobserver variability in the craniocaudal direction is dominated by the poorer longitudinal MVCT image resolution. Residual deviations after image guidance exist and need to be considered when dose gradients ultimately achievable with image guided radiation therapy techniques are analyzed. Copyright © 2013 Elsevier Inc. All rights reserved.
External Validation and Evaluation of Reliability and Validity of the Modified Seoul National University Renal Stone Complexity Scoring System to Predict Stone-Free Status After Retrograde Intrarenal Surgery.

PubMed

Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong

2015-08-01

The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
Evaluation of the applicability of territorial arterial spin labeling in meningiomas for presurgical assessments compared with 3-dimensional time-of-flight magnetic resonance angiography.

PubMed

Lu, Yiping; Luan, Shihai; Liu, Li; Xiong, Ji; Wen, Jianbo; Qu, Jianxun; Geng, Daoying; Yin, Bo

2017-10-01

To prospectively evaluate the application of territorial arterial spin labelling (t-ASL) in comparison with unenhanced three-dimensional time-of-flight magnetic resonance angiography (3D-TOF-MRA) in the identification of the feeding vasculature of meningiomas. Thirty consecutive patients with suspected meningiomas underwent conventional MR imaging, unenhanced 3D-TOF-MRA and t-ASL scanning. Four experienced neuro-radiologists assessed the feeding vessels with different techniques separately. For the identification of the origin of the feeding arteries on t-ASL, the inter-observer agreement was excellent (κ = 0.913), while the inter-observer agreement of 3D-TOF-MRA was good (κ = 0.653). The inter-modality agreement between t-ASL and 3D-TOF-MRA for the feeding arteries was moderate (κ = 0.514). All 8 patients with motor or sensory disorders proved to have meningiomas supplied completely or partially by the internal carotid arteries, while all 14 patients with meningiomas supplied by the external carotid arteries or basilar arteries did not show any symptoms concerning motor or sensory disorders (p = 0.003). T-ASL could complement unenhanced 3D-TOF-MRA and increase accuracy in the identification of the supplying arteries of meningiomas in a safe, intuitive, non-radioactive manner. The information about feeding arteries was potentially related to patients' symptoms and pathology, making it more crucial for neurosurgeons in planning surgery as well as evaluating prognosis. • A comprehensive understanding of feeding vasculature is helpful for optimized treatment decisions. • T-ASL could identify main supplying arteries of meningiomas with excellent inter-observer agreement. • The inter-modality agreement for identification of the main feeding arteries was moderate. • Blood supply from ICAs was related to motor or sensory disorders. • High-level meningiomas were found to have double main supplying arteries.
Instrument for evaluation of sedentary lifestyle in patients with high blood pressure.

PubMed

Lopes, Marcos Venícios de Oliveira; da Silva, Viviane Martins; de Araujo, Thelma Leite; Guedes, Nirla Gomes; Martins, Larissa Castelo Guedes; Teixeira, Iane Ximenes

2015-01-01

This article describes the diagnostic accuracy of the International Physical Activity Questionnaire to identify the nursing diagnosis of sedentary lifestyle. A diagnostic accuracy study was conducted with 240 individuals with established high blood pressure. The analysis of diagnostic accuracy was based on measures of sensitivity, specificity, predictive values, likelihood ratios, efficiency, diagnostic odds ratio, Youden index, and area under the receiver-operating characteristic curve. Statistical differences between genders were observed for activities of moderate intensity and for total physical activity. Age was negatively correlated with activities of moderate intensity and total physical activity. The analysis of area under the receiver-operating characteristic curve for moderate intensity activities, walking, and total physical activity showed that the International Physical Activity Questionnaire has moderate capacity to correctly classify individuals with and without sedentary lifestyle.
Histological features associated with diagnostic agreement in atypical ductal hyperplasia of the breast: illustrative cases from the B-Path study.

PubMed

Allison, Kimberly H; Rendi, Mara H; Peacock, Sue; Morgan, Tom; Elmore, Joann G; Weaver, Donald L

2016-12-01

This study examined the case-specific characteristics associated with interobserver diagnostic agreement in atypical ductal hyperplasia (ADH) of the breast. Seventy-two test set cases with a consensus diagnosis of ADH from the B-Path study were evaluated. Cases were scored for 17 histological features, which were then correlated with the participant agreement with the consensus ADH diagnosis. Participating pathologists' perceptions of case difficulty, borderline features or whether they would obtain a second opinion were also examined for associations with agreement. Of the 2070 participant interpretations of the 72 consensus ADH cases, 48% were scored by participants as difficult and 45% as borderline between two diagnoses; the presence of both of these features was significantly associated with increased agreement (P < 0.001). A second opinion would have been obtained in 80% of interpretations, and this was associated with increased agreement (P < 0.001). Diagnostic agreement ranged from 10% to 89% on a case-by-case basis. Cases with papillary lesions, cribriform architecture and obvious cytological monotony were associated with higher agreement. Lower agreement rates were associated with solid or micropapillary architecture, borderline cytological monotony, or cases without a diagnostic area that was obvious on low power. The results of this study suggest that pathologists frequently recognize the challenge of ADH cases, with some cases being more prone to diagnostic variability. In addition, there are specific histological features associated with diagnostic agreement on ADH cases. Multiple example images from cases in this test set are provided to serve as educational illustrations of these challenges. © 2016 John Wiley & Sons Ltd.
Histologic Features associated with Diagnostic Agreement in Atypical Ductal Hyperplasia of the Breast: Illustrative Cases from the B-Path Study

PubMed Central

Allison, Kimberly H.; Rendi, Mara H.; Peacock, Sue; Morgan, Tom; Elmore, Joann G.; Weaver, Donald L.

2016-01-01

Background: Case-specific characteristics associated with interobserver diagnostic agreement in atypical ductal hyperplasia (ADH) of the breast are poorly understood. Methods: Seventy-two test set cases with a consensus diagnosis of ADH from the B-Path study were evaluated. Cases were scored for 17 histologic features which were then correlated with the participant agreement with the consensus ADH diagnosis. Participating pathologists’ perceptions of case difficulty, borderline features, or if they would obtain a second opinion were also examined for associations with agreement. Results: Of the 2,070 participant interpretations on the 72 consensus ADH cases, 48% were scored by participants as difficult and 45% as borderline between two diagnoses; the presence of both of these features was significantly associated with increased agreement (p < 0.001). A second opinion would have been obtained in 80% of interpretations, and this was associated with increased agreement (p < 0.001). Diagnostic agreement ranged from 10–89% on a case-by-case basis. Cases with papillary lesions, cribriform architecture and obvious cytologic monotony were associated with higher agreement. Lower agreement rates were associated with solid or micro-papillary architecture, borderline cytologic monotony or cases without a diagnostic area that was obvious on low power. Conclusions: The results of this study suggest that pathologists frequently recognize the challenge of ADH cases with some cases more prone to diagnostic variability. In addition, there are specific histologic features associated with diagnostic agreement on ADH cases. Multiple example images from cases in this test set are provided to serve as educational illustrations of these challenges. PMID:27398812
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment

ERIC Educational Resources Information Center

Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua

2012-01-01

This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…

Can mandibular bone resorption predict hip fracture in elderly women? A systematic review of diagnostic test accuracy.

PubMed

Devlin, Hugh; Whelton, Christopher

2015-09-01

The aim of this systematic review was to determine the diagnostic accuracy of the mandibular cortical width measurements and porosity in detecting hip osteoporosis. All of the included studies used measurements on panoramic radiographs. Studies were included if they compared the radiographic measurements (or index tests) with central dual energy X-ray absorptiometry (DXA) of the hip as the reference standard. A measure of diagnostic accuracy such as sensitivity and specificity or area under the receiver operating characteristic curve was also required for inclusion. Seven studies were identified. Meta-analysis was not possible because of the heterogeneity of the studies. The studies all demonstrated moderate diagnostic accuracy. If a patient with a thin or porous mandibular cortex is identified by a chance radiographic finding, additional clinical risk factors need to be considered and the patient referred for further investigation with DXA where necessary. © 2013 John Wiley & Sons A/S and The Gerodontology Society. Published by John Wiley & Sons Ltd.
Evaluation of the International Consensus Guidelines for the Surgical Resection of Intraductal Papillary Mucinous Neoplasms.

PubMed

Tsukagoshi, Mariko; Araki, Kenichiro; Saito, Fumiyoshi; Kubo, Norio; Watanabe, Akira; Igarashi, Takamichi; Ishii, Norihiro; Yamanaka, Takahiro; Shirabe, Ken; Kuwano, Hiroyuki

2018-04-01

International consensus guidelines for intraductal papillary mucinous neoplasms (IPMNs) were revised in 2012. We aimed to evaluate the clinical utility of each predictor in the 2006 and 2012 guidelines and validate the diagnostic value and surgical indications. Forty-two patients with surgically resected IPMNs were included. Each predictor was applied to evaluate its diagnostic value. The 2012 guidelines had greater accuracy for invasive carcinoma than the 2006 guidelines (64.3 vs. 31.0%). Moreover, the accuracy for high-grade dysplasia was also increased (77.1 vs. 48.6%). When the main pancreatic duct (MPD) size ≥8 mm was substituted for MPD size ≥10 mm in the 2012 guidelines, the accuracy for high-grade dysplasia was 80.0%. The 2012 guidelines exhibited increased diagnostic accuracy for invasive IPMN. It is important to consider surgical resection prior to invasive carcinoma, and high-risk stigmata might be a useful diagnostic criterion. Furthermore, MPD size ≥8 mm may be predictive of high-grade dysplasia.
Impact of time-resolved MRA on diagnostic accuracy in patients with symptomatic peripheral artery disease of the calf station.

PubMed

Hansmann, Jan; Michaely, Henrik J; Morelli, John N; Diehl, Steffen J; Meyer, Mathias; Schoenberg, Stefan O; Attenberger, Ulrike I

2013-12-01

The purpose of this article is to evaluate the added diagnostic accuracy of time-resolved MR angiography (MRA) of the calves compared with continuous-table-movement MRA in patients with symptomatic lower extremity peripheral artery disease (PAD) using digital subtraction angiography (DSA) correlation. Eighty-four consecutive patients with symptomatic PAD underwent a low-dose 3-T MRA protocol, consisting of continuous-table-movement MRA, acquired from the diaphragm to the calves, and an additional time-resolved MRA of the calves; 0.1 mmol/kg body weight (bw) of contrast material was used (0.07 mmol/kg bw for continuous-table-movement MRA and 0.03 mmol/kg bw for time-resolved MRA). Two radiologists rated image quality on a 4-point scale and stenosis degree on a 3-point scale. An additional assessment determined the degree of venous contamination and whether time-resolved MRA improved diagnostic confidence. The accuracy of stenosis gradation with continuous-table-movement and time-resolved MRA was compared with that of DSA as a correlation. Overall diagnostic accuracy was calculated for continuous-table-movement and time-resolved MRA. Median image quality was rated as good for 578 vessel segments with continuous-table-movement MRA and as excellent for 565 vessel segments with time-resolved MRA. Interreader agreement was excellent (κ = 0.80-0.84). Venous contamination interfered with diagnosis in more than 60% of continuous-table-movement MRA examinations. The degree of stenosis was assessed for 340 vessel segments. The diagnostic accuracies (continuous-table-movement MRA/time-resolved MRA) combined for the readers were obtained for the tibioperoneal trunk (84%/93%), anterior tibial (69%/87%), posterior tibial (85%/91%), and peroneal (67%/81%) arteries. The addition of time-resolved MRA improved diagnostic confidence in 69% of examinations. 
The addition of time-resolved MRA at the calf station improves diagnostic accuracy over continuous-table-movement MRA alone in symptomatic patients with PAD.
Diagnostic accuracy of functional, imaging and biochemical tests for patients presenting with chest pain to the emergency department: A systematic review and meta-analysis.

PubMed

Iannaccone, Mario; Gili, Sebastiano; De Filippo, Ovidio; D'Amico, Salvatore; Gagliardi, Marco; Bertaina, Maurizio; Mazzilli, Silvia; Rettegno, Sara; Bongiovanni, Federica; Gatti, Paolo; Ugo, Fabrizio; Boccuzzi, Giacomo G; Colangelo, Salvatore; Prato, Silvia; Moretti, Claudio; D'Amico, Maurizio; Noussan, Patrizia; Garbo, Roberto; Hildick-Smith, David; Gaita, Fiorenzo; D'Ascenzo, Fabrizio

2018-01-01

Non-invasive ischaemia tests and biomarkers are widely adopted to rule out acute coronary syndrome in the emergency department. Their diagnostic accuracy has yet to be precisely defined. Medline, Cochrane Library CENTRAL, EMBASE and Biomed Central were systematically screened (start date 1 September 2016, end date 1 December 2016). Prospective studies (observational or randomised controlled trial) comparing functional/imaging or biochemical tests for patients presenting with chest pain to the emergency department were included. Overall, 77 studies were included, for a total of 49,541 patients (mean age 59.9 years). Fast and six-hour highly sensitive troponin T protocols did not show significant differences in their ability to detect acute coronary syndromes, as they reported a sensitivity and specificity of 0.89 (95% confidence interval 0.79-0.94) and 0.84 (0.74-0.9) vs 0.89 (0.78-0.94) and 0.83 (0.70-0.92), respectively. The addition of copeptin to troponin increased sensitivity and reduced specificity, without improving diagnostic accuracy. The diagnostic value of non-invasive tests for patients without troponin increase was tested. Coronary computed tomography showed the highest level of diagnostic accuracy (sensitivity 0.93 (0.81-0.98) and specificity 0.90 (0.93-0.94)), along with myocardial perfusion scintigraphy (sensitivity 0.85 (0.77-0.91) and specificity 0.92 (0.83-0.96)). Stress echography was inferior to coronary computed tomography but non-inferior to myocardial perfusion scintigraphy, while exercise testing showed the lowest level of diagnostic accuracy. Fast and six-hour highly sensitive troponin T protocols provide an overall similar level of diagnostic accuracy to detect acute coronary syndrome. Among the non-invasive ischaemia tests for patients without troponin increase, coronary computed tomography and myocardial perfusion scintigraphy showed the highest sensitivity and specificity.
Gastritis staging: interobserver agreement by applying OLGA and OLGIM systems.

PubMed

Isajevs, Sergejs; Liepniece-Karele, Inta; Janciauskas, Dainius; Moisejevs, Georgijs; Putnins, Viesturs; Funka, Konrads; Kikuste, Ilze; Vanags, Aigars; Tolmanis, Ivars; Leja, Marcis

2014-04-01

Atrophic gastritis remains a difficult histopathological diagnosis with low interobserver agreement. The aim of our study was to compare gastritis staging and interobserver agreement between general and expert gastrointestinal (GI) pathologists using Operative Link for Gastritis Assessment (OLGA) and Operative Link on Gastric Intestinal Metaplasia (OLGIM). We enrolled 835 patients undergoing upper endoscopy in the study. Two general and two expert gastrointestinal pathologists graded biopsy specimens according to the Sydney classification, and the stage of gastritis was assessed by the OLGA and OLGIM systems. Using OLGA, 280 (33.4%) patients had gastritis (stage I-IV), whereas with OLGIM this was 167 (19.9%). OLGA stage III-IV gastritis was observed in 25 patients, whereas by OLGIM stage III-IV was found in 23 patients. Interobserver agreement between expert GI pathologists for atrophy in the antrum, incisura angularis, and corpus was moderate (kappa = 0.53, 0.57 and 0.41, respectively, p < 0.0001), but almost perfect for intestinal metaplasia (kappa = 0.82, 0.80 and 0.81, respectively, p < 0.0001). However, interobserver agreement between general pathologists was poor for atrophy, but moderate for intestinal metaplasia. OLGIM staging provided the highest interobserver agreement, but a substantial proportion of potentially high-risk individuals would be missed if only OLGIM staging is applied. Therefore, we recommend using a combination of OLGA and OLGIM for staging of chronic gastritis.
Interobserver agreement in analysis of cardiotocograms recorded during trial of labor after cesarean.

PubMed

Caning, M M; Thisted, D L A; Amer-Wählin, I; Laier, G H; Krebs, L

2018-05-17

To examine interobserver agreement in intrapartum cardiotocography (CTG) classification in women undergoing trial of labor after a cesarean section (TOLAC) at term with or without complete uterine rupture. Nineteen blinded and independent Danish obstetricians assessed CTG tracings from 47 women (174 individual pages) with a complete uterine rupture during TOLAC and 37 women (133 individual pages) with no uterine rupture during TOLAC. Individual pages with CTG tracings lasting at least 20 min were evaluated by three different assessors and counted as an individual case. The tracings were analyzed according to the modified version of the International Federation of Gynecology and Obstetrics (FIGO) guidelines elaborated for the use of STAN (ST-analysis). Occurrence of defined abnormalities was recorded and the tracings were classified as normal, suspicious, pathological, or preterminal. The interobserver agreement was evaluated using Fleiss' kappa. Agreement on classification of a preterminal CTG was almost perfect. The interobserver agreement on normal, suspicious or pathological CTG was moderate to substantial. Regarding the presence of severe variable decelerations, the agreement was moderate. No statistical difference was found in the interobserver agreement between classification of tracings from women undergoing TOLAC with and without complete uterine rupture. The interobserver agreement on classification of CTG tracings from high-risk deliveries during TOLAC is best for assessment of a preterminal CTG and poorest for the identification of severe variable decelerations.
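Fleiss' kappa, the agreement statistic named in the CTG abstract, can be computed from a table of rating counts. The following is a minimal sketch, not the authors' analysis: the function name and the toy ratings are hypothetical, and the only detail borrowed from the abstract is the layout of three assessors per page and four categories (normal, suspicious, pathological, preterminal).

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j. Every subject must have the same
    number of raters."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all ratings that fall into each category
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs, per subject
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects          # mean observed agreement
    p_e = sum(p * p for p in p_j)          # agreement expected by chance
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: 6 pages, 3 assessors each,
# columns = normal, suspicious, pathological, preterminal.
toy_ratings = [
    [3, 0, 0, 0],
    [2, 1, 0, 0],
    [0, 2, 1, 0],
    [0, 0, 3, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
]
kappa = fleiss_kappa(toy_ratings)
print(round(kappa, 3))
```

A table in which all three assessors always choose the same category gives a kappa of 1, while values near 0 indicate agreement no better than chance.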
Repeated significance tests of linear combinations of sensitivity and specificity of a diagnostic biomarker

PubMed Central

Wu, Mixia; Shu, Yu; Li, Zhaohai; Liu, Aiyi

2016-01-01

A sequential design is proposed to test whether the accuracy of a binary diagnostic biomarker meets the minimal level of acceptance. The accuracy of a binary diagnostic biomarker is a linear combination of the marker’s sensitivity and specificity. The objective of the sequential method is to minimize the maximum expected sample size under the null hypothesis that the marker’s accuracy is below the minimal level of acceptance. The exact results of two-stage designs based on Youden’s index and efficiency indicate that the maximum expected sample sizes are smaller than the sample sizes of the fixed designs. Exact methods are also developed for estimation, confidence interval and p-value concerning the proposed accuracy index upon termination of the sequential testing. PMID:26947768
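The two accuracy indices named in this abstract, Youden's index and efficiency, are both linear combinations of sensitivity and specificity. The sketch below uses the standard definitions (Youden's J = sensitivity + specificity - 1; efficiency = prevalence × sensitivity + (1 - prevalence) × specificity); the function names and the numeric inputs are hypothetical and are not taken from the paper, and the sequential testing procedure itself is not reproduced.

```python
def youden_index(sensitivity, specificity):
    """Youden's J: equal weights of 1 on sensitivity and specificity, minus 1."""
    return sensitivity + specificity - 1.0


def efficiency(sensitivity, specificity, prevalence):
    """Overall proportion correctly classified: sensitivity weighted by
    prevalence, specificity weighted by (1 - prevalence)."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Hypothetical marker: sensitivity 0.80, specificity 0.90, prevalence 0.50
j = youden_index(0.80, 0.90)
eff = efficiency(0.80, 0.90, 0.50)
print(round(j, 2), round(eff, 2))  # 0.7 0.85
```

A test of whether the marker "meets the minimal level of acceptance" would then compare such an index against a prespecified threshold.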
Diagnostic accuracy research in glaucoma is still incompletely reported: An application of Standards for Reporting of Diagnostic Accuracy Studies (STARD) 2015.

PubMed

Michelessi, Manuele; Lucenteforte, Ersilia; Miele, Alba; Oddone, Francesco; Crescioli, Giada; Fameli, Valeria; Korevaar, Daniël A; Virgili, Gianni

2017-01-01

Research has shown a modest adherence of diagnostic test accuracy (DTA) studies in glaucoma to the Standards for Reporting of Diagnostic Accuracy Studies (STARD). We have applied the updated 30-item STARD 2015 checklist to a set of studies included in a Cochrane DTA systematic review of imaging tools for diagnosing manifest glaucoma. Three pairs of reviewers, including one senior reviewer who assessed all studies, independently checked the adherence of each study to STARD 2015. Adherence was analyzed on an individual-item basis. Logistic regression was used to evaluate the effect of publication year and impact factor on adherence. We included 106 DTA studies, published between 2003 and 2014 in journals with a median impact factor of 2.6. Overall adherence was 54.1% for 3,286 individual ratings across 31 items, with a mean of 16.8 (SD: 3.1; range 8-23) items per study. Large variability in adherence to reporting standards was detected across individual STARD 2015 items, ranging from 0 to 100%. Nine items (1: identification as diagnostic accuracy study in title/abstract; 6: eligibility criteria; 10: index test (a) and reference standard (b) definition; 12: cut-off definitions for index test (a) and reference standard (b); 14: estimation of diagnostic accuracy measures; 21a: severity spectrum of diseased; 23: cross-tabulation of the index and reference standard results) were adequately reported in more than 90% of the studies. Conversely, 10 items (3: scientific and clinical background of the index test; 11: rationale for the reference standard; 13b: blinding of index test results; 17: analyses of variability; 18: sample size calculation; 19: study flow diagram; 20: baseline characteristics of participants; 28: registration number and registry; 29: availability of study protocol; 30: sources of funding) were adequately reported in less than 30% of the studies. 
Only four items showed a statistically significant improvement over time: missing data (16), baseline characteristics of participants (20), estimates of diagnostic accuracy (24) and sources of funding (30). Adherence to STARD 2015 among DTA studies in glaucoma research is incomplete, and only modestly increasing over time.
Diagnostic accuracy of physical examination tests of the ankle/foot complex: a systematic review.

PubMed

Schwieterman, Braun; Haas, Deniele; Columber, Kirby; Knupp, Darren; Cook, Chad

2013-08-01

Orthopedic special tests of the ankle/foot complex are routinely used during the physical examination process in order to help diagnose ankle/lower leg pathologies. The purpose of this systematic review was to investigate the diagnostic accuracy of ankle/lower leg special tests. A search of the current literature was conducted using PubMed, CINAHL, SPORTDiscus, ProQuest Nursing and Allied Health Sources, Scopus, and Cochrane Library. Studies were eligible if they included the following: 1) a diagnostic clinical test of musculoskeletal pathology in the ankle/foot complex, 2) description of the clinical test or tests, 3) a report of the diagnostic accuracy of the clinical test (e.g. sensitivity and specificity), and 4) an acceptable reference standard for comparison. The quality of included studies was determined by two independent reviewers using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. Nine diagnostic accuracy studies met the inclusion criteria for this systematic review, analyzing a total of 16 special tests of the ankle/foot complex. After assessment using the QUADAS-2, only one study had low risk of bias and low concerns regarding applicability. Most ankle/lower leg orthopedic special tests are confirmatory in nature and are best utilized at the end of the physical examination. Most of the studies included in this systematic review demonstrate notable biases, which suggest that results and recommendations in this review should be taken as a guide rather than an outright standard. There is need for future research with more stringent study design criteria so that more accurate diagnostic power of ankle/lower leg special tests can be determined. Level of evidence: 3a.
Preclinical study of diagnostic performances of contrast-enhanced spectral mammography versus MRI for breast diseases in China.

PubMed

Wang, Qingguo; Li, Kangan; Wang, Lihui; Zhang, Jianbing; Zhou, Zhiguo; Feng, Yan

2016-01-01

To evaluate the diagnostic performance of contrast-enhanced spectral mammography (CESM) for breast diseases in comparison with breast MRI in China. Sixty-eight patients with 77 breast lesions underwent MRI and CESM. Two radiologists interpreted either MRI or CESM images, separately and independently. BI-RADS 1-3 and BI-RADS 4-5 were classified into the suspicious benign and suspicious malignant groups. Diagnostic accuracy parameters were calculated. Receiver operating characteristic (ROC) curves were constructed for the two modalities. The agreement and correlation between maximum lesion diameter based on CESM and MRI, or CESM and pathology were analyzed. Diagnostic accuracy parameters for CESM were sensitivity 95.8 %, specificity 65.5 %, PPV 82.1 %, NPV 90.5 % and accuracy 84.4 %. The diagnostic accuracy parameters for breast MRI were sensitivity 93.8 %, specificity 82.8 %, PPV 88.2 %, NPV 92.3 % and accuracy 89.6 %. Area under the curve (AUC) of ROC was 0.96 for breast MRI and 0.88 for CESM. The Bland-Altman plots showed a mean difference of 0.7 mm with 95 % limits of agreement of 11.4 mm in tumor diameter measured using CESM and breast MRI. The differences in size measurement between CESM and breast MRI were significant, whereas no difference was observed between CESM and pathology or between breast MRI and pathology. CESM correlated better with pathological results than breast MRI did. Our study demonstrates that CESM possesses better diagnostic performance than breast MRI in terms of diagnostic sensitivity and lesion size assessment. CESM is a good alternative method for breast cancer screening in high-risk people.
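The Bland-Altman figures quoted in this abstract (a mean difference and 95 % limits of agreement) can be illustrated with a minimal sketch. It assumes the usual definition of the limits as the mean of the paired differences plus or minus 1.96 sample standard deviations; the function name and the diameters are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)                    # mean difference between methods
    half_width = 1.96 * stdev(diffs)      # 1.96 x sample SD of the differences
    return bias, bias - half_width, bias + half_width


# Hypothetical lesion diameters in mm (illustration only, NOT study data).
cesm_mm = [12.0, 18.5, 25.0, 31.0, 9.5, 22.0]
mri_mm = [11.0, 19.0, 23.5, 30.0, 10.0, 20.5]

bias, lower, upper = bland_altman(cesm_mm, mri_mm)
print(f"bias = {bias:.2f} mm, 95% limits of agreement = ({lower:.2f}, {upper:.2f}) mm")
```

A narrow interval around a bias near zero suggests the two methods can be used interchangeably for sizing; whether a given width is acceptable is a clinical judgment, not a statistical one.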
A meta-analysis of use of Prostate Imaging Reporting and Data System Version 2 (PI-RADS V2) with multiparametric MR imaging for the detection of prostate cancer.

PubMed

Zhang, Li; Tang, Min; Chen, Sipan; Lei, Xiaoyan; Zhang, Xiaoling; Huan, Yi

2017-12-01

This meta-analysis was undertaken to review the diagnostic accuracy of PI-RADS V2 for prostate cancer (PCa) detection with multiparametric MR (mp-MR). A comprehensive literature search of electronic databases was performed by two observers independently. Inclusion criteria were original research using the PI-RADS V2 system in reporting prostate MRI. The methodological quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Data necessary to complete 2 × 2 contingency tables were obtained from the included studies. Thirteen studies (2,049 patients) were analysed. This is an initial meta-analysis of PI-RADS V2 and the overall diagnostic accuracy in diagnosing PCa was as follows: pooled sensitivity, 0.85 (0.78-0.91); pooled specificity, 0.71 (0.60-0.80); pooled positive likelihood ratio (LR+), 2.92 (2.09-4.09); pooled negative likelihood ratio (LR-), 0.21 (0.14-0.31); pooled diagnostic odds ratio (DOR), 14.08 (7.93-25.01). Positive predictive values ranged from 0.54 to 0.97 and negative predictive values ranged from 0.26 to 0.92. Currently available evidence indicates that PI-RADS V2 appears to have good diagnostic accuracy in patients with PCa lesions with high sensitivity and moderate specificity. However, no recommendation regarding the best threshold can be provided because of heterogeneity. • PI-RADS V2 shows good diagnostic accuracy for PCa detection. • Initially pooled specificity of PI-RADS V2 remains moderate. • PCa detection is increased by experienced radiologists. • There is currently a high heterogeneity in prostate diagnostics with MRI.
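As a rough check on how the summary measures in this abstract relate to one another, the sketch below derives likelihood ratios and the diagnostic odds ratio from a single sensitivity/specificity pair using the textbook formulas. The function name is an assumption for illustration. The pooled values reported above come from the meta-analytic model, so the results of this simple calculation are expected to be close to them but not identical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) for one sensitivity/specificity pair."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)      # true-positive rate / false-positive rate
    lr_neg = (1 - sensitivity) / specificity      # false-negative rate / true-negative rate
    dor = lr_pos / lr_neg                         # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Pooled sensitivity and specificity as quoted in the abstract.
lr_pos, lr_neg, dor = likelihood_ratios(0.85, 0.71)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.2f}")
```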
Value of physical tests in diagnosing cervical radiculopathy: a systematic review.

PubMed

Thoomes, Erik J; van Geest, Sarita; van der Windt, Danielle A; Falla, Deborah; Verhagen, Arianne P; Koes, Bart W; Thoomes-de Graaf, Marloes; Kuijper, Barbara; Scholten-Peeters, Wendy G M; Vleggeert-Lankamp, Carmen L

2018-01-01

In clinical practice, the diagnosis of cervical radiculopathy is based on information from the patient's history, physical examination, and diagnostic imaging. Various physical tests may be performed, but their diagnostic accuracy is unknown. This study aimed to summarize and update the evidence on diagnostic performance of tests carried out during a physical examination for the diagnosis of cervical radiculopathy. A review of the accuracy of diagnostic tests was carried out. The study sample comprised diagnostic studies comparing results of tests performed during a physical examination in diagnosing cervical radiculopathy with a reference standard of imaging or surgical findings. Sensitivity, specificity, and likelihood ratios are presented, together with pooled results for sensitivity and specificity. A literature search up to March 2016 was performed in CENTRAL, PubMed (MEDLINE), Embase, CINAHL, Web of Science, and Google Scholar. The methodological quality of studies was assessed using the QUADAS-2. Five diagnostic accuracy studies were identified. Only Spurling's test was evaluated in more than one study, showing high specificity ranging from 0.89 to 1.00 (95% confidence interval [CI]: 0.59-1.00); sensitivity varied from 0.38 to 0.97 (95% CI: 0.21-0.99). No studies were found that assessed the diagnostic accuracy of widely used neurological tests such as key muscle strength, tendon reflexes, and sensory impairments. There is limited evidence for accuracy of physical examination tests for the diagnosis of cervical radiculopathy. When consistent with patient history, clinicians may use a combination of Spurling's, axial traction, and an Arm Squeeze test to increase the likelihood of a cervical radiculopathy, whereas the combined results of four negative neurodynamics tests and an Arm Squeeze test could be used to rule out the disorder. Copyright © 2017 Elsevier Inc. All rights reserved.
Towards improving diagnosis of memory loss in general practice: TIMeLi diagnostic test accuracy study protocol.

PubMed

Creavin, Sam T; Cullum, Sarah J; Haworth, Judy; Wye, Lesley; Bayer, Antony; Fish, Mark; Purdy, Sarah; Ben-Shlomo, Yoav

2016-07-19

People with cognitive problems, and their families, report distress and uncertainty whilst undergoing evaluation for dementia and perceive that traditional diagnostic evaluation in secondary care is insufficiently patient centred. The James Lind Alliance has prioritised research to investigate the role of primary care in supporting a more effective diagnostic pathway, and the topic is also of interest to health commissioners. However, there are very few studies that investigate the accuracy of diagnostic tests for dementia in primary care. We will conduct a prospective diagnostic test accuracy study to evaluate the accuracy of a range of simple tests for diagnosing all-cause-dementia in symptomatic people aged over 70 years who have consulted with their general practitioner (GP). We will invite eligible people to attend a research clinic where they will undergo a range of index tests that a GP could perform in the surgery and also be assessed by a specialist in memory disorders at the same appointment. Participating GPs will request neuroimaging and blood tests and otherwise manage patients in line with their usual clinical practice. The reference standard will be the consensus judgement of three experts (neurologist, psychiatrist and geriatrician) based on information from the specialist assessment, GP records and investigations, but not including items in the index test battery. The target condition will be all-cause dementia but we will also investigate diagnostic accuracy for sub-types where possible. We will use qualitative interviews with patients and focus groups with clinicians to help us understand the acceptability and feasibility of diagnosing dementia in primary care using the tests that we are investigating. Our results will help clinicians decide on which tests to perform in someone where there is concern about possible dementia and inform commissioning of diagnostic pathways.
DIAGNOSTIC ACCURACY OF PHYSICAL EXAMINATION TESTS OF THE ANKLE/FOOT COMPLEX: A SYSTEMATIC REVIEW

PubMed Central

Schwieterman, Braun; Haas, Deniele; Columber, Kirby; Knupp, Darren

2013-01-01

Background: Orthopedic special tests of the ankle/foot complex are routinely used during the physical examination process in order to help diagnose ankle/lower leg pathologies. Purpose: The purpose of this systematic review was to investigate the diagnostic accuracy of ankle/lower leg special tests. Methods: A search of the current literature was conducted using PubMed, CINAHL, SPORTDiscus, ProQuest Nursing and Allied Health Sources, Scopus, and Cochrane Library. Studies were eligible if they included the following: 1) a diagnostic clinical test of musculoskeletal pathology in the ankle/foot complex, 2) description of the clinical test or tests, 3) a report of the diagnostic accuracy of the clinical test (e.g. sensitivity and specificity), and 4) an acceptable reference standard for comparison. The quality of included studies was determined by two independent reviewers using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. Results: Nine diagnostic accuracy studies met the inclusion criteria for this systematic review; analyzing a total of 16 special tests of the ankle/foot complex. After assessment using the QUADAS-2, only one study had low risk of bias and low concerns regarding applicability. Conclusion: Most ankle/lower leg orthopedic special tests are confirmatory in nature and are best utilized at the end of the physical examination. Most of the studies included in this systematic review demonstrate notable biases, which suggest that results and recommendations in this review should be taken as a guide rather than an outright standard. There is need for future research with more stringent study design criteria so that more accurate diagnostic power of ankle/lower leg special tests can be determined. Level of Evidence: 3a PMID:24175128
Very Low Intravenous Contrast Volume Protocol for Computed Tomography Angiography Providing Comprehensive Cardiac and Vascular Assessment Prior to Transcatheter Aortic Valve Replacement in Patients with Chronic Kidney Disease

PubMed Central

Pulerwitz, Todd C.; Khalique, Omar K.; Nazif, Tamim N.; Rozenshtein, Anna; Pearson, Gregory D.N.; Hahn, Rebecca T.; Vahl, Torsten P.; Kodali, Susheel K.; George, Isaac; Leon, Martin B.; D'Souza, Belinda; Po, Ming Jack; Einstein, Andrew J.

2016-01-01

Background: Transcatheter aortic valve replacement (TAVR) is a lifesaving procedure for many patients at high risk for surgical aortic valve replacement. The prevalence of chronic kidney disease (CKD) is high in this population, and thus a very low contrast volume (VLCV) computed tomography angiography (CTA) protocol providing comprehensive cardiac and vascular imaging would be valuable. Methods: Fifty-two patients with severe, symptomatic aortic valve disease undergoing pre-TAVR CTA assessment from 2013 to 2014 at Columbia University Medical Center were studied, including all 26 patients with CKD (eGFR <30 mL/min) who underwent a novel VLCV protocol (20 mL of iohexol at 2.5 mL/s), and 26 standard-contrast-volume (SCV) protocol patients. Using a 320-slice volumetric scanner, the protocol included ECG-gated volume scanning of the aortic root followed by medium-pitch helical vascular scanning through the femoral arteries. Two experienced cardiologists performed aortic annulus and root measurements. Vascular image quality was assessed by two radiologists using a 4-point scale. Results: VLCV patients had mean (±SD) age 86±6.5, BMI 23.9±3.4 kg/m2 with 54% men; SCV patients age 83±8.8, BMI 28.7±5.3 kg/m2, 65% men. There was excellent intra- and inter-observer agreement for annular and root measurements, and excellent agreement with 3D-transesophageal echocardiographic measurements. Both radiologists found diagnostic-quality vascular imaging in 96% of VLCV and 100% of SCV cases, with excellent inter-observer agreement. Conclusions: This study is the first of its kind to report the feasibility and reproducibility of measurements for a VLCV protocol for comprehensive pre-TAVR CTA. There was excellent agreement of cardiac measurements and almost all studies were diagnostic quality for vascular access assessment. PMID:27061253
Six-minute magnetic resonance imaging protocol for evaluation of acute ischemic stroke: pushing the boundaries.

PubMed

Nael, Kambiz; Khan, Rihan; Choudhary, Gagandeep; Meshksar, Arash; Villablanca, Pablo; Tay, Jennifer; Drake, Kendra; Coull, Bruce M; Kidwell, Chelsea S

2014-07-01

If magnetic resonance imaging (MRI) is to compete with computed tomography for evaluation of patients with acute ischemic stroke, there is a need for further improvements in acquisition speed. Inclusion criteria for this prospective, single institutional study were symptoms of acute ischemic stroke within 24 hours onset, National Institutes of Health Stroke Scale ≥3, and absence of MRI contraindications. A combination of echo-planar imaging (EPI) and a parallel acquisition technique were used on a 3T magnetic resonance (MR) scanner to accelerate the acquisition time. Image analysis was performed independently by 2 neuroradiologists. A total of 62 patients met inclusion criteria. A repeat MRI scan was performed in 22 patients resulting in a total of 84 MRIs available for analysis. Diagnostic image quality was achieved in 100% of diffusion-weighted imaging, 100% EPI-fluid attenuation inversion recovery imaging, 98% EPI-gradient recalled echo, 90% neck MR angiography and 96% of brain MR angiography, and 94% of dynamic susceptibility contrast perfusion scans with interobserver agreements (k) ranging from 0.64 to 0.84. Fifty-nine patients (95%) had acute infarction. There was good interobserver agreement for EPI-fluid attenuation inversion recovery imaging findings (k=0.78; 95% confidence interval, 0.66-0.87) and for detection of mismatch classification using dynamic susceptibility contrast-Tmax (k=0.92; 95% confidence interval, 0.87-0.94). Thirteen acute intracranial hemorrhages were detected on EPI-gradient recalled echo by both observers. A total of 68 and 72 segmental arterial stenoses were detected on contrast-enhanced MR angiography of the neck and brain with k=0.93, 95% confidence interval, 0.84 to 0.96 and 0.87, 95% confidence interval, 0.80 to 0.90, respectively. 
A 6-minute multimodal MR protocol with good diagnostic quality is feasible for the evaluation of patients with acute ischemic stroke and can result in significant reduction in scan time rivaling that of the multimodal computed tomographic protocol. © 2014 American Heart Association, Inc.
Comparison of the severity of lower extremity arterial disease in smokers and patients with diabetes using a novel duplex Doppler scoring system.

PubMed

Hiremath, Rudresh; Gowda, Goutham; Ibrahim, Jebin; Reddy, Harish T; Chodiboina, Haritha; Shah, Rushit

2017-07-01

The aim of this study was to validate the diagnostic feasibility of a novel scoring system of peripheral arterial disease (PAD) in smokers and patients with diabetes depending on duplex Doppler sonographic features. Patients presenting with the symptomatology of PAD were divided into three groups: diabetes only, smoking only, and smokers with diabetes. The patients were clinically examined, a clinical severity score was obtained, and the subjects were categorized into the three extrapolated categories of mild, moderate, and severe. All 106 subjects also underwent a thorough duplex Doppler examination, and various aspects of PAD were assessed and tabulated. These components were used to create a novel duplex Doppler scoring system. Depending on the scores obtained, each individual was categorized as having mild, moderate, or severe illness. The Cohen kappa value was used to assess interobserver agreement between the two scoring systems. Interobserver agreement between the traditional Rutherford clinical scoring system and the newly invented duplex Doppler scoring system showed a kappa value of 0.83, indicating significant agreement between the two scoring systems (P<0.001). Duplex Doppler imaging is an effective screening investigation for lower extremity arterial disease, as it not only helps in its diagnosis, but also in the staging and grading of the disease, providing information that can be utilized for future management and treatment planning.
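The Cohen kappa statistic used in this abstract to compare the two severity gradings can be sketched as follows. This is a minimal illustration of the standard unweighted formula, kappa = (observed agreement − chance agreement) / (1 − chance agreement); the function name and the ten paired gradings are hypothetical and do not come from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Proportion of subjects on which the two gradings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical gradings of ten patients (illustration only, NOT study data).
clinical = ["mild", "mild", "moderate", "moderate", "severe",
            "severe", "mild", "moderate", "severe", "moderate"]
doppler = ["mild", "mild", "moderate", "severe", "severe",
           "severe", "mild", "moderate", "severe", "mild"]

kappa = cohen_kappa(clinical, doppler)
print(f"kappa = {kappa:.3f}")
```

Because mild, moderate, and severe are ordered categories, a weighted kappa that penalizes a mild/severe disagreement more than a mild/moderate one may be a reasonable alternative; the abstract does not say which variant was used.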
Detection of vascularity in wrist tenosynovitis: power doppler ultrasound compared with contrast-enhanced grey-scale ultrasound.

PubMed

Klauser, Andrea S; Franz, Magdalena; Arora, Rohit; Feuchtner, Gudrun M; Gruber, Johann; Schirmer, Michael; Jaschke, Werner R; Gabl, Markus F

2010-01-01

We sought to assess vascularity in wrist tenosynovitis by using power Doppler ultrasound (PDUS) and to compare detection of intra- and peritendinous vascularity with that of contrast-enhanced grey-scale ultrasound (CEUS). Twenty-six tendons of 24 patients (nine men, 15 women; mean age ± SD, 54.4 ± 11.8 years) with a clinical diagnosis of tenosynovitis were examined with B-mode ultrasonography, PDUS, and CEUS by using a second-generation contrast agent, SonoVue (Bracco Diagnostics, Milan, Italy) and a low-mechanical-index ultrasound technique. Thickness of synovitis, extent of vascularized pannus, intensity of peritendinous vascularisation, and detection of intratendinous vessels was incorporated in a 3-score grading system (grade 0 to 2). Interobserver variability was calculated. With CEUS, a significantly greater extent of vascularity could be detected than by using PDUS (P < 0.001). In terms of peri- and intratendinous vessels, CEUS was significantly more sensitive in the detection of vascularization compared with PDUS (P < 0.001). No significant correlation between synovial thickening and extent of vascularity could be found (P = 0.089 to 0.097). Interobserver reliability was calculated to be excellent when evaluating the grading score (κ = 0.811 to 1.00). CEUS is a promising tool to detect tendon vascularity with higher sensitivity than PDUS by improved detection of intra- and peritendinous vascularity.
Prenasal thickness to nasal bone length ratio: effectiveness as a second or third trimester marker for Down syndrome.

PubMed

Tournemire, A; Groussolles, M; Ehlinger, V; Lusque, A; Morin, M; Benevent, J B; Arnaud, C; Vayssière, C

2015-08-01

To assess the value of the prenasal thickness to nasal bone length ratio (PT/NBL) for detecting trisomy 21 (T21) after the first trimester. Two examiners blinded to fetal T21 status retrospectively measured prenasal thickness (PT) and nasal bone length (NBL) of T21 and control fetuses at 15-36 weeks' gestational age on two-dimensional images from all T21-screening ultrasounds from November 2010 to April 2013. ROC curve analysis and its diagnostic values determined the best cut-off value for the ratio. Interobserver reproducibility was assessed. Good quality ultrasound profile images were available for 26 fetuses with T21 compared to 91 normal fetuses. The median PT/NBL ratio was 1.28 for T21 and 0.73 for control fetuses (p<0.0001). The PT/NBL ratio performed significantly better (AUC 0.99; 95%CI 0.97-1) than either PT (0.82; 0.73-0.91) or NBL (0.91; 0.85-0.98). The optimal PT/NBL ratio cut-off was 0.98, with a sensitivity of 88.5% [76.2-100%] and a specificity of 100%. Interobserver variability was low. The PT/NBL ratio is a strong marker for detecting T21 in the second and third trimesters, significantly more effective than either indicator alone. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
A Systematic Review and Meta-analysis of the Diagnostic Accuracy of Prostate Health Index and 4-Kallikrein Panel Score in Predicting Overall and High-grade Prostate Cancer.

PubMed

Russo, Giorgio Ivan; Regis, Federica; Castelli, Tommaso; Favilla, Vincenzo; Privitera, Salvatore; Giardina, Raimondo; Cimino, Sebastiano; Morgia, Giuseppe

2017-08-01

Markers for prostate cancer (PCa) have progressed over recent years. In particular, the prostate health index (PHI) and the 4-kallikrein (4K) panel have been demonstrated to improve the diagnosis of PCa. We aimed to review the diagnostic accuracy of PHI and the 4K panel for PCa detection. We performed a systematic literature search of PubMed, EMBASE, Cochrane, and Academic One File databases until July 2016. We included diagnostic accuracy studies that used PHI or 4K panel for the diagnosis of PCa or high-grade PCa. The methodological quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Twenty-eight studies including 16,762 patients have been included for the analysis. The pooled data showed a sensitivity of 0.89 and 0.74 for PHI and 4K panel, respectively, for PCa detection and a pooled specificity of 0.34 and 0.60 for PHI and 4K panel, respectively. The derived area under the curve (AUC) from the hierarchical summary receiver operating characteristic (HSROC) showed an accuracy of 0.76 and 0.72 for PHI and 4K panel respectively. For high-grade PCa detection, the pooled sensitivity was 0.93 and 0.87 for PHI and 4K panel, respectively, whereas the pooled specificity was 0.34 and 0.61 for PHI and 4K panel, respectively. The derived AUC from the HSROC showed an accuracy of 0.82 and 0.81 for PHI and 4K panel, respectively. Both PHI and the 4K panel provided good diagnostic accuracy in detecting overall and high-grade PCa. Copyright © 2016 Elsevier Inc. All rights reserved.

Ontario multidetector computed tomographic coronary angiography study: field evaluation of diagnostic accuracy.

PubMed

Chow, Benjamin J W; Freeman, Michael R; Bowen, James M; Levin, Leslie; Hopkins, Robert B; Provost, Yves; Tarride, Jean-Eric; Dennie, Carole; Cohen, Eric A; Marcuzzi, Dan; Iwanochko, Robert; Moody, Alan R; Paul, Narinder; Parker, John D; O'Reilly, Daria J; Xie, Feng; Goeree, Ron

2011-06-13

Computed tomographic coronary angiography (CTCA) has gained clinical acceptance for the detection of obstructive coronary artery disease. Although single-center studies have demonstrated excellent accuracy, multicenter studies have yielded variable results. The true diagnostic accuracy of CTCA in the "real world" remains uncertain. We conducted a field evaluation comparing multidetector CTCA with invasive CA (ICA) to understand CTCA's diagnostic accuracy in a real-world setting. A multicenter cohort study of patients awaiting ICA was conducted between September 2006 and June 2009. All patients had either a low or an intermediate pretest probability for coronary artery disease and underwent CTCA and ICA within 10 days. The results of CTCA and ICA were interpreted visually by local expert observers who were blinded to all clinical data and imaging results. Using a patient-based analysis (diameter stenosis ≥50%) of 169 patients, the sensitivity, specificity, positive predictive value, and negative predictive value were 81.3% (95% confidence interval [CI], 71.0%-89.1%), 93.3% (95% CI, 85.9%-97.5%), 91.6% (95% CI, 82.5%-96.8%), and 84.7% (95% CI, 76.0%-91.2%), respectively; the area under receiver operating characteristic curve was 0.873. The diagnostic accuracy varied across centers (P < .001), with a sensitivity, specificity, positive predictive value, and negative predictive value ranging from 50.0% to 93.2%, 92.0% to 100%, 84.6% to 100%, and 42.9% to 94.7%, respectively. Compared with ICA, CTCA appears to have good accuracy; however, there was variability in diagnostic accuracy across centers. Factors affecting institutional variability need to be better understood before CTCA is universally adopted. Additional real-world evaluations are needed to fully understand the impact of CTCA on clinical care. clinicaltrials.gov Identifier: NCT00371891.
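The patient-based accuracy measures in this abstract all derive from a single 2 × 2 table of CTCA result against ICA result. The sketch below shows the arithmetic. The abstract does not report the cell counts: the values used here (TP = 65, FN = 15, FP = 6, TN = 83) are an assumption, chosen only because they sum to 169 patients and reproduce the reported percentages to within rounding. The function name is likewise hypothetical.

```python
def two_by_two_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the four cells of a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly detected
        "specificity": tn / (tn + fp),   # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),           # positive tests that are true positives
        "npv": tn / (tn + fn),           # negative tests that are true negatives
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# ASSUMED counts, not reported in the abstract (see the note above).
metrics = two_by_two_metrics(tp=65, fn=15, fp=6, tn=83)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity depend only on the test, whereas PPV and NPV also depend on disease prevalence in the sample, which is one reason predictive values can differ across centers even when the test performs similarly.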
Diagnostic accuracy of serological diagnosis of hepatitis C and B using dried blood spot samples (DBS): two systematic reviews and meta-analyses.

PubMed

Lange, Berit; Cohn, Jennifer; Roberts, Teri; Camp, Johannes; Chauffour, Jeanne; Gummadi, Nina; Ishizaki, Azumi; Nagarathnam, Anupriya; Tuaillon, Edouard; van de Perre, Philippe; Pichler, Christine; Easterbrook, Philippa; Denkinger, Claudia M

2017-11-01

Dried blood spots (DBS) are a convenient tool to enable diagnostic testing for viral diseases due to transport, handling and logistical advantages over conventional venous blood sampling. A better understanding of the performance of serological testing for hepatitis C (HCV) and hepatitis B virus (HBV) from DBS is important to enable more widespread use of this sampling approach in resource limited settings, and to inform the 2017 World Health Organization (WHO) guidance on testing for HBV/HCV. We conducted two systematic reviews and meta-analyses on the diagnostic accuracy of HCV antibody (HCV-Ab) and HBV surface antigen (HBsAg) from DBS samples compared to venous blood samples. MEDLINE, EMBASE, Global Health and Cochrane library were searched for studies that assessed diagnostic accuracy with DBS and agreement between DBS and venous sampling. Heterogeneity of results was assessed and where possible a pooled analysis of sensitivity and specificity was performed using a bivariate analysis with maximum likelihood estimate and 95% confidence intervals (95% CI). We conducted a narrative review on the impact of varying storage conditions or limits of detection in subsets of samples. The QUADAS-2 tool was used to assess risk of bias. For the diagnostic accuracy of HBsAg from DBS compared to venous blood, 19 studies were included in a quantitative meta-analysis, and 23 in a narrative review. Pooled sensitivity and specificity were 98% (95% CI: 95%-99%) and 100% (95% CI: 99%-100%), respectively. For the diagnostic accuracy of HCV-Ab from DBS, 19 studies were included in a pooled quantitative meta-analysis, and 23 studies were included in a narrative review. Pooled estimates of sensitivity and specificity were 98% (95% CI: 95%-99%) and 99% (95% CI: 98%-100%), respectively. Overall quality of studies and heterogeneity were rated as moderate in both systematic reviews. HCV-Ab and HBsAg testing using DBS compared to venous blood sampling was associated with excellent diagnostic accuracy. 
However, generalizability is limited as no uniform protocol was applied and most studies did not use fresh samples. Future studies on diagnostic accuracy should include an assessment of impact of environmental conditions common in low resource field settings. Manufacturers also need to formally validate their assays for DBS for use with their commercial assays.
Comparison and validation of International Consensus Diagnostic Criteria for diagnosis of autoimmune pancreatitis from pancreatic cancer in a Taiwanese cohort.

PubMed

Chang, Ming-Chu; Liang, Po-Chin; Jan, I-Shiow; Yang, Ching-Yao; Tien, Yu-Wen; Wei, Shu-Chen; Wong, Jau-Min; Chang, Yu-Ting

2014-08-18

The International Consensus Diagnostic Criteria (ICDC), designed to diagnose autoimmune pancreatitis (AIP), were proposed recently. The diagnostic performance of ICDC has not been previously evaluated in diffuse-type and focal-type AIP, respectively, in comparison with the revised HISORt and Asian criteria in Taiwan. Prospective, consecutive patient cohort. The largest tertiary referral centre managing pancreatic disease in Taiwan. 188 patients with AIP and 130 with tissue-proven pancreatic adenocarcinoma were consecutively recruited. The ICDC, as well as the revised HISORt and Asian criteria, were applied to each participant. Each diagnostic criterion of ICDC was validated with special reference to levels 1 and 2 in diffuse-type and focal-type AIP. Sensitivity, specificity and accuracy. Each diagnostic criterion of ICDC was validated with special reference to levels 1 and 2 in AIP and focal-type AIP. The sensitivity, specificity and accuracy of ICDC for all AIP were the best among the three criteria: 89.4%, 100% and 93.7%, respectively. The sensitivity, specificity and accuracy of ICDC for focal-type AIP (84.9%, 100% and 93.8%) were also the best among the three criteria. The area under the receiver operating characteristic curve of ICDC was 0.95 (95% CI 0.92 to 0.97) in all AIP and 0.93 (95% CI 0.88 to 0.97) in focal-type AIP. The sensitivity, specificity and accuracy of ICDC are higher than those of the revised HISORt and Asian criteria. The sensitivity, specificity and accuracy of each criterion are higher in diffuse-type AIP compared with focal-type AIP. Under the same specificity, the sensitivity and accuracy of ICDC are higher than those of other diagnostic criteria in focal-type AIP. ICDC has better diagnostic performance compared with previously proposed diagnostic criteria in diffuse-type and focal-type AIP. Published by the BMJ Publishing Group Limited. 
Diagnostic relevance of high field MRI in clinical neuroradiology: the advantages and challenges of driving a sports car.

PubMed

Wattjes, Mike P; Barkhof, Frederik

2012-11-01

High field MRI operating at 3 T is increasingly being used in the field of neuroradiology on the grounds that higher magnetic field strength should theoretically lead to a higher diagnostic accuracy in the diagnosis of several disease entities. This Editorial discusses the exhaustive review by Wardlaw and colleagues of research comparing 3 T MRI with 1.5 T MRI in the field of neuroradiology. Interestingly, the authors found no convincing evidence of improved image quality, diagnostic accuracy, or reduced total examination times using 3 T MRI instead of 1.5 T MRI. These findings are highly relevant since a new generation of high field MRI systems operating at 7 T has recently been introduced. • Higher magnetic field strengths do not necessarily lead to a better diagnostic accuracy. • Disadvantages of high field MR systems have to be considered in clinical practice. • Higher field strengths are needed for functional imaging, spectroscopy, etc. • Disappointingly there are few direct comparisons of 1.5 and 3 T MRI. • Whether the next high field MR generation (7 T) will improve diagnostic accuracy has to be investigated.
Joint confidence region estimation for area under ROC curve and Youden index.

PubMed

Yin, Jingjing; Tian, Lili

2014-03-15

In the field of diagnostic studies, the area under the ROC curve (AUC) serves as an overall measure of a biomarker/diagnostic test's accuracy. Youden index, defined as the overall correct classification rate minus one at the optimal cut-off point, is another popular index. For continuous biomarkers of binary disease status, although researchers mainly evaluate the diagnostic accuracy using AUC, for the purpose of making diagnosis, Youden index provides an important and direct measure of the diagnostic accuracy at the optimal threshold and hence should be taken into consideration in addition to AUC. Furthermore, AUC and Youden index are generally correlated. In this paper, we initiate the idea of evaluating diagnostic accuracy based on AUC and Youden index simultaneously. As the first step toward this direction, this paper only focuses on the confidence region estimation of AUC and Youden index for a single marker. We present both parametric and non-parametric approaches for estimating joint confidence region of AUC and Youden index. We carry out extensive simulation study to evaluate the performance of the proposed methods. In the end, we apply the proposed methods to a real data set. Copyright © 2013 John Wiley & Sons, Ltd.
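For orientation, the two quantities discussed in this abstract can be computed as point estimates from sample data as sketched below. This is only an empirical illustration and does not implement the joint confidence region methods the paper proposes. The AUC is taken as the Mann-Whitney estimate, and the Youden index as the maximum of sensitivity + specificity − 1 over candidate cut-offs, assuming higher marker values indicate disease. The function names and marker values are hypothetical.

```python
def empirical_auc(diseased, healthy):
    """Mann-Whitney estimate of P(diseased value > healthy value); ties count 1/2."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def youden_index(diseased, healthy):
    """Return (J, cutoff) maximizing sensitivity + specificity - 1.

    A subject is called positive when its value is >= cutoff.
    """
    best_j, best_cut = -1.0, None
    for cut in sorted(set(diseased) | set(healthy)):
        sens = sum(d >= cut for d in diseased) / len(diseased)
        spec = sum(h < cut for h in healthy) / len(healthy)
        j = sens + spec - 1
        if j > best_j:
            best_j, best_cut = j, cut
    return best_j, best_cut


# Hypothetical biomarker values (illustration only, not from the paper).
diseased = [2.1, 3.4, 3.9, 4.2, 5.0]
healthy = [1.0, 1.8, 2.5, 2.9, 4.0]

auc = empirical_auc(diseased, healthy)
j, cutoff = youden_index(diseased, healthy)
print(f"AUC = {auc:.2f}, Youden J = {j:.2f} at cutoff {cutoff}")
```

The AUC summarizes performance over all thresholds, while the Youden index describes performance at one chosen threshold, which is why the abstract argues for reporting both.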
Intelligent Diagnostic Assistant for Complicated Skin Diseases through C5's Algorithm.

PubMed

Jeddi, Fatemeh Rangraz; Arabfard, Masoud; Kermany, Zahra Arab

2017-09-01

An intelligent diagnostic assistant can be used for the complicated diagnosis of skin diseases, which are among the most common causes of disability. The aim of this study was to design and implement a computerized intelligent diagnostic assistant for complicated skin diseases through C5's Algorithm. An applied-developmental study was done in 2015. The knowledge base was developed based on interviews with dermatologists through questionnaires and checklists. Knowledge representation was obtained from the training data in the database using Microsoft Excel. Clementine software and C5's Algorithm were applied to draw the decision tree. Analysis of test accuracy was performed based on rules extracted using inference chains. The rules extracted from the decision tree were entered into the CLIPS programming environment, and the intelligent diagnostic assistant was then designed. The rules were defined using the forward chaining inference technique and were entered into the CLIPS programming environment as RULE. The accuracy and error rates obtained in the training phase from the decision tree were 99.56% and 0.44%, respectively. The accuracy of the decision tree was 98% and the error was 2% in the test phase. The intelligent diagnostic assistant can be used as a reliable system with high accuracy, sensitivity, specificity, and agreement.
[Risk on bias assessment: (6) A Revised Tool for the Quality Assessment on Diagnostic Accuracy Studies (QUADAS-2)].

PubMed

Qu, Y J; Yang, Z R; Sun, F; Zhan, S Y

2018-04-10

This paper introduces the Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2), including its development and a comparison with the original QUADAS, and illustrates the application of QUADAS-2 to a published diagnostic accuracy study that was included in a systematic review and meta-analysis. QUADAS-2 is a considerable improvement over the original tool. Confusing items included in QUADAS have been removed, and the quality assessment of the original study has been replaced by ratings of risk of bias and applicability. This is implemented through a description of four main domains with minimal overlap and by answering the signalling questions in each domain. Rating risk of bias and applicability as 'high', 'low' or 'unclear' is in line with the Cochrane risk of bias assessment for intervention studies and replaces the total quality score used in QUADAS. QUADAS-2 can also be applied to diagnostic accuracy studies in which follow-up without prognosis is part of the gold standard. It is useful for assessing the overall methodological quality of a study, although it is more time-consuming than the original QUADAS. However, QUADAS-2 needs to be modified for use in comparative studies of diagnostic accuracy, and we hope that users will follow the updates and give their feedback online.
Diagnostic accuracy of rectal mucosa biopsy testing for chronic wasting disease within white-tailed deer (Odocoileus virginianus) herds in North America: Effects of age, sex, polymorphism at PRNP codon 96, and disease progression

USDA-ARS's Scientific Manuscript database

An effective live animal diagnostic test is needed to assist in the control of chronic wasting disease (CWD), which has spread through captive and wild herds of white-tailed deer (Odocoileus virginianus) in Canada and the United States. In the present study, the diagnostic accuracy of rectal mucosa ...
Verification and classification bias interactions in diagnostic test accuracy studies for fine-needle aspiration biopsy.

PubMed

Schmidt, Robert L; Walker, Brandon S; Cohen, Michael B

2015-03-01

Reliable estimates of accuracy are important for any diagnostic test. Diagnostic accuracy studies are subject to unique sources of bias. Verification bias and classification bias are 2 sources of bias that commonly occur in diagnostic accuracy studies. Statistical methods are available to estimate the impact of these sources of bias when they occur alone. The impact of interactions when these types of bias occur together has not been investigated. We developed mathematical relationships to show the combined effect of verification bias and classification bias. A wide range of case scenarios were generated to assess the impact of bias components and interactions on total bias. Interactions between verification bias and classification bias caused overestimation of sensitivity and underestimation of specificity. Interactions had more effect on sensitivity than specificity. Sensitivity was overestimated by at least 7% in approximately 6% of the tested scenarios. Specificity was underestimated by at least 7% in less than 0.1% of the scenarios. Interactions between verification bias and classification bias create distortions in accuracy estimates that are greater than would be predicted from each source of bias acting independently. © 2014 American Cancer Society.
Evidence synthesis to inform model-based cost-effectiveness evaluations of diagnostic tests: a methodological review of health technology assessments.

PubMed

Shinkins, Bethany; Yang, Yaling; Abel, Lucy; Fanshawe, Thomas R

2017-04-14

Evaluations of diagnostic tests are challenging because of the indirect nature of their impact on patient outcomes. Model-based health economic evaluations of tests allow different types of evidence from various sources to be incorporated and enable cost-effectiveness estimates to be made beyond the duration of available study data. To parameterize a health-economic model fully, all the ways a test impacts on patient health must be quantified, including but not limited to diagnostic test accuracy. We assessed all UK NIHR HTA reports published May 2009-July 2015. Reports were included if they evaluated a diagnostic test, included a model-based health economic evaluation and included a systematic review and meta-analysis of test accuracy. From each eligible report we extracted information on the following topics: 1) what evidence aside from test accuracy was searched for and synthesised, 2) which methods were used to synthesise test accuracy evidence and how did the results inform the economic model, 3) how/whether threshold effects were explored, 4) how the potential dependency between multiple tests in a pathway was accounted for, and 5) for evaluations of tests targeted at the primary care setting, how evidence from differing healthcare settings was incorporated. The bivariate or HSROC model was implemented in 20/22 reports that met all inclusion criteria. Test accuracy data for health economic modelling was obtained from meta-analyses completely in four reports, partially in fourteen reports and not at all in four reports. Only 2/7 reports that used a quantitative test gave clear threshold recommendations. All 22 reports explored the effect of uncertainty in accuracy parameters but most of those that used multiple tests did not allow for dependence between test results. 7/22 tests were potentially suitable for primary care but the majority found limited evidence on test accuracy in primary care settings. 
The uptake of appropriate meta-analysis methods for synthesising evidence on diagnostic test accuracy in UK NIHR HTAs has improved in recent years. Future research should focus on other evidence requirements for cost-effectiveness assessment, threshold effects for quantitative tests and the impact of multiple diagnostic tests.
Is there any evidence for the validity of diagnostic criteria used for accommodative and nonstrabismic binocular dysfunctions?

PubMed

Cacho-Martínez, Pilar; García-Muñoz, Ángel; Ruiz-Cantero, María Teresa

2014-01-01

To analyze the diagnostic criteria used in the scientific literature published in the past 25 years for accommodative and nonstrabismic binocular dysfunctions and to explore whether the epidemiological analysis of diagnostic validity has been used to propose which clinical criteria should be used for diagnostic purposes. We carried out a systematic review of papers on accommodative and nonstrabismic binocular disorders published from 1986 to 2012, searching the MEDLINE, CINAHL, PsycINFO and FRANCIS databases. We admitted original articles about the diagnosis of these anomalies in any population. We identified 839 articles, and 12 studies were included. The quality of the included articles was assessed using the QUADAS-2 tool. The review shows a wide range of clinical signs and cut-off points between authors. Only 3 studies (regarding accommodative anomalies) assessed the diagnostic accuracy of clinical signs. Their results suggest using the accommodative amplitude and monocular accommodative facility for diagnosing accommodative insufficiency and a high positive relative accommodation for accommodative excess. The remaining 9 articles did not analyze diagnostic accuracy and established a diagnosis with the criteria the authors considered appropriate. We also found differences between studies in the way patients' symptomatology was considered. Three of the 12 studies analyzed validated a symptom survey used for convergence insufficiency. The scientific literature reveals differences between authors in the diagnostic criteria for accommodative and nonstrabismic binocular dysfunctions. Diagnostic accuracy studies show that there is some evidence only for accommodative conditions. For binocular anomalies there is only evidence about a validated questionnaire for convergence insufficiency, with no data on diagnostic accuracy. Copyright © 2012 Spanish General Council of Optometry. Published by Elsevier Espana. All rights reserved.
Systematic review of dermoscopy and digital dermoscopy/artificial intelligence for the diagnosis of melanoma.

PubMed

Rajpara, S M; Botello, A P; Townend, J; Ormerod, A D

2009-09-01

Dermoscopy improves diagnostic accuracy of the unaided eye for melanoma, and digital dermoscopy with artificial intelligence or computer diagnosis has also been shown to be useful for the diagnosis of melanoma. At present there is no clear evidence regarding the diagnostic accuracy of dermoscopy compared with artificial intelligence. To evaluate the diagnostic accuracy of dermoscopy and digital dermoscopy/artificial intelligence for melanoma diagnosis and to compare the diagnostic accuracy of the different dermoscopic algorithms with each other and with digital dermoscopy/artificial intelligence for the detection of melanoma. A literature search on dermoscopy and digital dermoscopy/artificial intelligence for melanoma diagnosis was performed using several databases. Titles and abstracts of the retrieved articles were screened using a literature evaluation form. A quality assessment form was developed to assess the quality of the included studies. Heterogeneity among the studies was assessed. Pooled data were analysed using meta-analytical methods and comparisons between different algorithms were performed. Of 765 articles retrieved, 30 studies were eligible for meta-analysis. Pooled sensitivity for artificial intelligence was slightly higher than for dermoscopy (91% vs. 88%; P = 0.076). Pooled specificity for dermoscopy was significantly better than for artificial intelligence (86% vs. 79%; P < 0.001). Pooled diagnostic odds ratio was 51.5 for dermoscopy and 57.8 for artificial intelligence, which were not significantly different (P = 0.783). There were no significant differences in diagnostic odds ratio among the different dermoscopic diagnostic algorithms. Dermoscopy and artificial intelligence performed equally well for diagnosis of melanocytic skin lesions. There was no significant difference in the diagnostic performance of various dermoscopy algorithms.
The three-point checklist, the seven-point checklist and Menzies score had better diagnostic odds ratios than the others; however, these results need to be confirmed by a large-scale high-quality population-based study.
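For readers unfamiliar with the diagnostic odds ratio (DOR) reported above, the sketch below shows the standard single-table definition, DOR = [sensitivity / (1 - sensitivity)] / [(1 - specificity) / specificity]. This is only an illustration: the function name is hypothetical, and a DOR pooled across studies in a meta-analysis is estimated from the individual study tables, so it need not equal the value obtained by plugging pooled sensitivity and specificity into this formula.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Diagnostic odds ratio from one sensitivity/specificity pair.

    Equals the positive likelihood ratio divided by the negative
    likelihood ratio. Both inputs are proportions strictly between 0 and 1.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    odds_positive_in_diseased = sensitivity / (1.0 - sensitivity)
    odds_positive_in_healthy = (1.0 - specificity) / specificity
    return odds_positive_in_diseased / odds_positive_in_healthy


# Pooled point estimates quoted in the abstract, used purely as inputs;
# the results are NOT expected to reproduce the pooled DORs reported there.
dor_dermoscopy = diagnostic_odds_ratio(0.88, 0.86)
dor_ai = diagnostic_odds_ratio(0.91, 0.79)
print(f"dermoscopy: {dor_dermoscopy:.1f}, artificial intelligence: {dor_ai:.1f}")
```

A DOR of 1 means the test does not discriminate at all; larger values indicate better discrimination.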
Validation of the Omron MIT Elite blood pressure device in a pregnant population with large arm circumference.

PubMed

James, Lauren; Nzelu, Diane; Hay, Anna; Shennan, Andrew; Kametas, Nikos A

2017-04-01

The aim of this study was to evaluate the accuracy of the Omron MIT Elite automated device in pregnant women with an arm circumference of or above 32 cm, using the British Hypertension Society validation protocol. Blood pressure was measured sequentially in 46 women of any gestation requiring the use of a large cuff (arm circumference ≥32 cm) alternating between the mercury sphygmomanometer and the Omron MIT Elite device. The Omron MIT Elite achieved an overall D/D grade with a mean of the device-observer difference being 7.17±6.67 and 9.31±6.59 for systolic and diastolic blood pressure respectively. Interobserver accuracy was 94.6% for systolic and 95% for diastolic readings within 5 mmHg. The Omron MIT Elite overestimates blood pressure and has failed the British Hypertension Society protocol requirements. Therefore, it cannot be recommended for use in pregnant women with an arm circumference of or above 32 cm.
Inter- and intraobserver reliability of the vertebral, local and segmental kyphosis in 120 traumatic lumbar and thoracic burst fractures: evaluation in lateral X-rays and sagittal computed tomographies

PubMed Central

Brunner, Alexander; Gühring, Markus; Schmälzle, Traude; Weise, Kuno; Badke, Andreas

2009-01-01

Evaluation of the kyphosis angle in thoracic and lumbar burst fractures is often used to indicate surgical procedures. The kyphosis angle can be measured as vertebral, segmental and local kyphosis according to the method of Cobb. The vertebral, segmental and local kyphosis according to the method of Cobb were measured on 120 lateral X-rays and sagittal computed tomographies of 60 thoracic and 60 lumbar burst fractures by 3 independent observers on 2 separate occasions. Osteoporotic fractures were excluded. The intra- and interobserver reliability of these angles in X-ray and computed tomography was evaluated using the intraclass correlation coefficient (ICC). The segmental kyphosis showed the highest reproducibility, followed by the vertebral kyphosis. In X-ray, segmental kyphosis showed “excellent” inter- and intraobserver reliabilities for thoracic fractures (ICC = 0.826, 0.802) and “good” to “excellent” inter- and intraobserver reliabilities for lumbar fractures (ICC = 0.790, 0.803). In computed tomography, the segmental kyphosis showed “excellent” inter- and intraobserver reliabilities (ICC = 0.824, 0.801) for thoracic and “excellent” inter- and intraobserver reliabilities (ICC = 0.874, 0.835) for lumbar fractures. Comparing the two diagnostic work-ups (X-ray and computed tomography), significant differences were found in interobserver reliabilities for vertebral kyphosis measured in lumbar fracture X-rays (p = 0.035) and in interobserver reliabilities for local kyphosis measured in thoracic fracture X-rays (p = 0.010). Comparing the two fracture localizations (thoracic and lumbar fractures), significant differences were found only in interobserver reliabilities for the local kyphosis measured in computed tomographies (p = 0.045) and in intraobserver reliabilities for the vertebral kyphosis measured in X-rays (p = 0.024).
“Good” to “excellent” inter- and intraobserver reliabilities for vertebral, segmental and local kyphosis in X-ray make these angles a helpful tool for indicating surgical procedures. For practical use in lateral X-ray, we recommend determining the segmental kyphosis because this angle has the highest reproducibility. “Good” to “excellent” inter- and intraobserver reliabilities for these three angles were also found in computed tomographies. Therefore, the use of these three angles also seems to be generally possible in computed tomography. Further studies are needed for a direct correlation of the results in lateral X-ray and in computed tomography. PMID:19953277
Cardiac valve calcifications on low-dose unenhanced ungated chest computed tomography: inter-observer and inter-examination reliability, agreement and variability.

PubMed

van Hamersvelt, Robbert W; Willemink, Martin J; Takx, Richard A P; Eikendal, Anouk L M; Budde, Ricardo P J; Leiner, Tim; Mol, Christian P; Isgum, Ivana; de Jong, Pim A

2014-07-01

To determine inter-observer and inter-examination variability for aortic valve calcification (AVC) and mitral valve and annulus calcification (MC) in low-dose unenhanced ungated lung cancer screening chest computed tomography (CT). We included 578 lung cancer screening trial participants who were examined by CT twice within 3 months to follow indeterminate pulmonary nodules. On these CTs, AVC and MC were measured in cubic millimetres. One hundred CTs were examined by five observers to determine the inter-observer variability. Reliability was assessed by kappa statistics (κ) and intra-class correlation coefficients (ICCs). Variability was expressed as the mean difference ± standard deviation (SD). Inter-examination reliability was excellent for AVC (κ = 0.94, ICC = 0.96) and MC (κ = 0.95, ICC = 0.90). Inter-examination variability was 12.7 ± 118.2 mm³ for AVC and 31.5 ± 219.2 mm³ for MC. Inter-observer reliability ranged from κ = 0.68 to κ = 0.92 for AVC and from κ = 0.20 to κ = 0.66 for MC. Inter-observer ICC was 0.94 for AVC and ranged from 0.56 to 0.97 for MC. Inter-observer variability ranged from -30.5 ± 252.0 mm³ to 84.0 ± 240.5 mm³ for AVC and from -95.2 ± 210.0 mm³ to 303.7 ± 501.6 mm³ for MC. AVC can be quantified with excellent reliability on ungated unenhanced low-dose chest CT, but manual detection of MC can be subject to substantial inter-observer variability. Lung cancer screening CT may be used for detection and quantification of cardiac valve calcifications. • Low-dose unenhanced ungated chest computed tomography can detect cardiac valve calcifications. • However, calcified cardiac valves are not reported by most radiologists. • Inter-observer and inter-examination variability of aortic valve calcifications is sufficient for longitudinal studies. • Volumetric measurement variability of mitral valve and annulus calcifications is substantial.
Identification of facilitators and barriers to residents' use of a clinical reasoning tool.

PubMed

DiNardo, Deborah; Tilstra, Sarah; McNeil, Melissa; Follansbee, William; Zimmer, Shanta; Farris, Coreen; Barnato, Amber E

2018-03-28

While there is some experimental evidence to support the use of cognitive forcing strategies to reduce diagnostic error in residents, the potential usability of such strategies in the clinical setting has not been explored. We sought to test the effect of a clinical reasoning tool on diagnostic accuracy and to obtain feedback on its usability and acceptability. We conducted a randomized behavioral experiment testing the effect of this tool on diagnostic accuracy on written cases among post-graduate year 3 (PGY-3) residents at a single internal medicine residency program in 2014. Residents completed written clinical cases in a proctored setting with and without prompts to use the tool. The tool encouraged reflection on concordant and discordant aspects of each case. We used random effects regression to assess the effect of the tool on diagnostic accuracy of the independent case sets, controlling for case complexity. We then conducted audiotaped structured focus group debriefing sessions and reviewed the tapes for facilitators and barriers to use of the tool. Of 51 eligible PGY-3 residents, 34 (67%) participated in the study. The average diagnostic accuracy increased from 52% to 60% with the tool, a difference that just met the test for statistical significance in adjusted analyses (p=0.05). Residents reported that the tool was generally acceptable and understandable but did not recognize its utility for use with simple cases, suggesting the presence of overconfidence bias. A clinical reasoning tool improved residents' diagnostic accuracy on written cases. Overconfidence bias is a potential barrier to its use in the clinical setting.
Diagnostic accuracy of optical coherence tomography in actinic keratosis and basal cell carcinoma.

PubMed

Olsen, J; Themstrup, L; De Carvalho, N; Mogensen, M; Pellacani, G; Jemec, G B E

2016-12-01

Early diagnosis of non-melanoma skin cancer (NMSC) is potentially possible using optical coherence tomography (OCT), which provides non-invasive, real-time images of skin with micrometre resolution and an imaging depth of up to 2 mm. OCT technology for skin imaging has undergone significant developments, improving image quality substantially. The diagnostic accuracy of any method is influenced by continuous technological development, making it necessary to re-evaluate methods regularly. The objective of this study is to estimate the diagnostic accuracy of OCT in basal cell carcinomas (BCC) and actinic keratosis (AK) as well as in differentiating these lesions from normal skin. A study set consisting of 142 OCT images meeting selection criteria for image quality and diagnosis of AK, BCC and normal skin was presented uniformly to two groups of blinded observers: 5 dermatologists experienced in OCT-image interpretation and 5 dermatologists with no experience in OCT. During the presentation of the study set the observers filled out a standardized questionnaire regarding the OCT diagnosis. Images were captured using a commercially available OCT machine (Vivosight®, Michelson Diagnostics, UK). Skilled OCT observers were able to diagnose BCC lesions with a sensitivity of 86% to 95% and a specificity of 81% to 98%. Skilled observers with at least one year of OCT experience showed an overall higher diagnostic accuracy compared to inexperienced observers. The study shows an improved diagnostic accuracy of OCT in differentiating AK and BCC from healthy skin using state-of-the-art technology compared to earlier OCT technology, especially concerning BCC diagnosis. Copyright © 2016 Elsevier B.V. All rights reserved.
Efficacy and cost-effectiveness of stereotactic vacuum-assisted core biopsy of nonpalpable breast lesions: analysis of 602 biopsies performed over 5 years.

PubMed

Luparia, A; Durando, M; Campanino, P; Regini, E; Lucarelli, D; Talenti, A; Mattone, G; Mariscotti, G; Sapino, A; Gandini, G

2011-04-01

The authors sought to evaluate the diagnostic accuracy and cost-effectiveness of vacuum-assisted core biopsy (VACB) in comparison with diagnostic surgical excision for characterisation of nonpalpable breast lesions classified as Breast Imaging Reporting and Data System (BI-RADS) categories R3 and R4. From January 2004 to December 2008, we conducted 602 stereotactic, 11-gauge, VACB procedures on 243 nonpalpable breast lesions categorised as BI-RADS R3, 346 categorised as BI-RADS R4 and 13 categorised as BI-RADS R5. We calculated the diagnostic accuracy and cost savings of VACB by subtracting the cost of the stereotactic biopsy from that of the diagnostic surgical procedure. A total of 56% of the lesions were benign and required no further assessment. Lesions of uncertain malignant potential (B3) (23.6%) were debated at multidisciplinary meetings, and diagnostic surgical biopsy was recommended for 83.1% of them. All malignant lesions (B4 and B5) underwent surgical excision. VACB had a sensitivity of 94.9%, specificity of 98.3% and diagnostic accuracy of 97.7%. The cost savings per VACB procedure were 464.00 euro; by obviating 335 surgical biopsies, the overall cost savings was 155,440.00 euro over 5 years. VACB proved to have high diagnostic accuracy for characterising abnormalities at low to intermediate risk of malignancy and obviated surgical excision in about half of the cases, allowing for considerable cost savings.
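The cost figures in the abstract above can be checked with simple arithmetic. The sketch below assumes, as the abstract's wording suggests, that the overall saving is the per-procedure saving multiplied by the number of surgical biopsies avoided; the variable and function names are hypothetical.

```python
# Figures quoted in the abstract.
SAVING_PER_VACB_EUR = 464.00       # cost saving per VACB procedure
SURGICAL_BIOPSIES_AVOIDED = 335    # surgical biopsies obviated over 5 years
TOTAL_VACB_PROCEDURES = 602        # VACB procedures performed


def overall_saving(saving_per_procedure, procedures_avoided):
    """Overall saving, assuming every avoided surgery saves the same amount."""
    return saving_per_procedure * procedures_avoided


def share_avoided(procedures_avoided, total_procedures):
    """Fraction of all VACB procedures that obviated a surgical biopsy."""
    return procedures_avoided / total_procedures


total = overall_saving(SAVING_PER_VACB_EUR, SURGICAL_BIOPSIES_AVOIDED)
share = share_avoided(SURGICAL_BIOPSIES_AVOIDED, TOTAL_VACB_PROCEDURES)
print(f"overall saving: {total:,.2f} euro; surgery avoided in {share:.1%} of procedures")
```

The product matches the 155,440.00 euro reported, and the share is consistent with the statement that surgical excision was obviated in about half of the cases.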
Clinical Validation of the "Sedentary Lifestyle" Nursing Diagnosis in Secondary School Students

ERIC Educational Resources Information Center

de Oliveira, Marcos Renato; da Silva, Viviane Martins; Guedes, Nirla Gomes; de Oliveira Lopes, Marcos Venícios

2016-01-01

This study clinically validated the nursing diagnosis of "sedentary lifestyle" (SL) among 564 Brazilian adolescents. Measures of diagnostic accuracy were calculated for defining characteristics, and Mantel-Haenszel analysis was used to identify related factors. The measures of diagnostic accuracy showed that the following defining…
Effects of Experience and Training on Diagnostic Accuracy.

ERIC Educational Resources Information Center

Brammer, Robert

The interview process was studied to uncover the relationship of expertise in psychotherapy to the likelihood of accurate diagnosis. Experience and training affect the number of diagnostic questions clinicians ask as compared to personal, family, social, occupational, and history questions; and this in turn affects the accuracy of the diagnoses…
