van Hamersvelt, Robbert W; Willemink, Martin J; Takx, Richard A P; Eikendal, Anouk L M; Budde, Ricardo P J; Leiner, Tim; Mol, Christian P; Isgum, Ivana; de Jong, Pim A
2014-07-01
To determine inter-observer and inter-examination variability for aortic valve calcification (AVC) and mitral valve and annulus calcification (MC) in low-dose unenhanced ungated lung cancer screening chest computed tomography (CT). We included 578 lung cancer screening trial participants who were examined by CT twice within 3 months to follow indeterminate pulmonary nodules. On these CTs, AVC and MC were measured in cubic millimetres. One hundred CTs were examined by five observers to determine the inter-observer variability. Reliability was assessed by kappa statistics (κ) and intra-class correlation coefficients (ICCs). Variability was expressed as the mean difference ± standard deviation (SD). Inter-examination reliability was excellent for AVC (κ = 0.94, ICC = 0.96) and MC (κ = 0.95, ICC = 0.90). Inter-examination variability was 12.7 ± 118.2 mm³ for AVC and 31.5 ± 219.2 mm³ for MC. Inter-observer reliability ranged from κ = 0.68 to κ = 0.92 for AVC and from κ = 0.20 to κ = 0.66 for MC. Inter-observer ICC was 0.94 for AVC and ranged from 0.56 to 0.97 for MC. Inter-observer variability ranged from -30.5 ± 252.0 mm³ to 84.0 ± 240.5 mm³ for AVC and from -95.2 ± 210.0 mm³ to 303.7 ± 501.6 mm³ for MC. AVC can be quantified with excellent reliability on ungated unenhanced low-dose chest CT, but manual detection of MC can be subject to substantial inter-observer variability. Lung cancer screening CT may be used for detection and quantification of cardiac valve calcifications.
• Low-dose unenhanced ungated chest computed tomography can detect cardiac valve calcifications.
• However, calcified cardiac valves are not reported by most radiologists.
• Inter-observer and inter-examination variability of aortic valve calcifications is sufficient for longitudinal studies.
• Volumetric measurement variability of mitral valve and annulus calcifications is substantial.
Levegrün, Sabine; Pöttgen, Christoph; Jawad, Jehad Abu; Berkovic, Katharina; Hepp, Rodrigo; Stuschke, Martin
2013-02-01
To evaluate megavoltage computed tomography (MVCT)-based image guidance with helical tomotherapy in patients with vertebral tumors by analyzing factors influencing interobserver variability, considered as quality criterion of image guidance. Five radiation oncologists retrospectively registered 103 MVCTs in 10 patients to planning kilovoltage CTs by rigid transformations in 4 df. Interobserver variabilities were quantified using the standard deviations (SDs) of the distributions of the correction vector components about the observers' fraction mean. To assess intraobserver variabilities, registrations were repeated after ≥4 weeks. Residual deviations after setup correction due to uncorrectable rotational errors and elastic deformations were determined at 3 craniocaudal target positions. To differentiate observer-related variations in minimizing these residual deviations across the 3-dimensional MVCT from image resolution effects, 2-dimensional registrations were performed in 30 single transverse and sagittal MVCT slices. Axial and longitudinal MVCT image resolutions were quantified. For comparison, image resolution of kilovoltage cone-beam CTs (CBCTs) and interobserver variability in registrations of 43 CBCTs were determined. Axial MVCT image resolution is 3.9 lp/cm. Longitudinal MVCT resolution amounts to 6.3 mm, assessed as full-width at half-maximum of thin objects in MVCTs with finest pitch. Longitudinal CBCT resolution is better (full-width at half-maximum, 2.5 mm for CBCTs with 1-mm slices). In MVCT registrations, interobserver variability in the craniocaudal direction (SD 1.23 mm) is significantly larger than in the lateral and ventrodorsal directions (SD 0.84 and 0.91 mm, respectively) and significantly larger compared with CBCT alignments (SD 1.04 mm). Intraobserver variabilities are significantly smaller than corresponding interobserver variabilities (variance ratio [VR] 1.8-3.1). 
Compared with 3-dimensional registrations, 2-dimensional registrations have significantly smaller interobserver variability in the lateral and ventrodorsal directions (VR 3.8 and 2.8, respectively) but not in the craniocaudal direction (VR 0.75). Tomotherapy image guidance precision is affected by image resolution and residual deviations after setup correction. Eliminating the effect of residual deviations yields small interobserver variabilities with submillimeter precision in the axial plane. In contrast, interobserver variability in the craniocaudal direction is dominated by the poorer longitudinal MVCT image resolution. Residual deviations after image guidance exist and need to be considered when dose gradients ultimately achievable with image guided radiation therapy techniques are analyzed. Copyright © 2013 Elsevier Inc. All rights reserved.
Variability in Cobb angle measurements using reformatted computerized tomography scans.
Adam, Clayton J; Izatt, Maree T; Harvey, Jason R; Askin, Geoffrey N
2005-07-15
Survey of intraobserver and interobserver measurement variability. To assess the use of reformatted computerized tomography (CT) images for manual measurement of coronal Cobb angles in idiopathic scoliosis. Cobb angle measurements in idiopathic scoliosis are traditionally made from standing radiographs, whereas CT is often used for assessment of vertebral rotation. Correlating Cobb angles from standing radiographs with vertebral rotations from supine CT is problematic because the geometry of the spine changes significantly from standing to supine positions, and 2 different imaging methods are involved. We assessed the use of reformatted thoracolumbar CT images for Cobb angle measurement. Preoperative CT of 12 patients with idiopathic scoliosis were used to generate reformatted coronal images. Five observers measured coronal Cobb angles on 3 occasions from each of the images. Intraobserver and interobserver variability associated with Cobb measurement from reformatted CT scans was assessed and compared with previous studies of measurement variability using plain radiographs. For major curves, 95% confidence intervals for intraobserver and interobserver variability were +/-6.6 degrees and +/-7.7 degrees, respectively. For minor curves, the intervals were +/-7.5 degrees and +/-8.2 degrees, respectively. Intraobserver and interobserver technical error of measurement was 2.4 degrees and 2.7 degrees, with reliability coefficients of 88% and 84%, respectively. There was no correlation between measurement variability and curve severity. Reformatted CT images may be used for manual measurement of coronal Cobb angles in idiopathic scoliosis with similar variability to manual measurement of plain radiographs.
Williams, Michelle C; Golay, Saroj K; Hunter, Amanda; Weir-McCall, Jonathan R; Mlynska, Lucja; Dweck, Marc R; Uren, Neal G; Reid, John H; Lewis, Steff C; Berry, Colin; van Beek, Edwin J R; Roditi, Giles; Newby, David E; Mirsadraee, Saeed
2015-01-01
Introduction: Observer variability can influence the assessment of CT coronary angiography (CTCA) and the subsequent diagnosis of angina pectoris due to coronary heart disease. Methods: We assessed 210 CTCAs from the Scottish COmputed Tomography of the HEART (SCOT-HEART) trial for intraobserver and interobserver variability. Calcium score, coronary angiography and image quality were evaluated. Coronary artery disease was defined as none (<10%), mild (10–49%), moderate (50–70%) and severe (>70%) luminal stenosis and classified as no (<10%), non-obstructive (10–70%) or obstructive (>70%) coronary artery disease. Post-CTCA diagnosis of angina pectoris due to coronary heart disease was classified as yes, probable, unlikely or no. Results: Patients had a mean body mass index of 29 (28, 30) kg/m², heart rate of 58 (57, 60)/min and 62% were men. Intraobserver and interobserver agreements for the presence or absence of coronary artery disease were excellent (95% agreement, κ 0.884 (0.817 to 0.951)) and good (91%, κ 0.791 (0.703 to 0.879)), respectively. Intraobserver and interobserver agreement for the presence or absence of angina pectoris due to coronary heart disease were excellent (93%, κ 0.842 (0.755 to 0.918)) and good (86%, κ 0.701 (0.603 to 0.799)), respectively. Observer variability of calcium score was excellent for calcium scores below 1000. More segments were categorised as uninterpretable with 64-multidetector compared to 320-multidetector CTCA (10.1% vs 2.6%, p<0.001) but there was no difference in observer variability. Conclusions: Multicentre multidetector CTCA has excellent agreement in patients under investigation for suspected angina due to coronary heart disease. Trial registration number: NCT01149590. PMID:26019881
Nakajima, Erica C; Frankland, Michael P; Johnson, Tucker F; Antic, Sanja L; Chen, Heidi; Chen, Sheau-Chiann; Karwoski, Ronald A; Walker, Ronald; Landman, Bennett A; Clay, Ryan D; Bartholmai, Brian J; Rajagopalan, Srinivasan; Peikert, Tobias; Massion, Pierre P; Maldonado, Fabien
2018-01-01
Lung adenocarcinoma (ADC), the most common lung cancer type, is recognized increasingly as a disease spectrum. To guide individualized patient care, a non-invasive means of distinguishing indolent from aggressive ADC subtypes is needed urgently. Computer-Aided Nodule Assessment and Risk Yield (CANARY) is a novel computed tomography (CT) tool that characterizes early ADCs by detecting nine distinct CT voxel classes, representing a spectrum of lepidic to invasive growth, within an ADC. CANARY characterization has been shown to correlate with ADC histology and patient outcomes. This study evaluated the inter-observer variability of CANARY analysis. Three novice observers segmented and analyzed independently 95 biopsy-confirmed lung ADCs from Vanderbilt University Medical Center/Nashville Veterans Administration Tennessee Valley Healthcare system (VUMC/TVHS) and the Mayo Clinic (Mayo). Inter-observer variability was measured using intra-class correlation coefficient (ICC). The average ICC for all CANARY classes was 0.828 (95% CI 0.76, 0.895) for the VUMC/TVHS cohort, and 0.852 (95% CI 0.804, 0.901) for the Mayo cohort. The most invasive voxel classes had the highest ICC values. To determine whether nodule size influenced inter-observer variability, an additional cohort of 49 sub-centimeter nodules from Mayo were also segmented by three observers, with similar ICC results. Our study demonstrates that CANARY ADC classification between novice CANARY users has an acceptably low degree of variability, and supports the further development of CANARY for clinical application.
Le Couteulx, S; Caudron, J; Dubourg, B; Cauchois, G; Dupré, M; Michelin, P; Durand, E; Eltchaninoff, H; Dacher, J-N
2018-05-01
To evaluate intra- and inter-observer variability of multidetector computed tomography (MDCT) sizing of the aortic annulus before transcatheter aortic valve replacement (TAVR) and the effect of observer experience, aortic valve calcification and image quality. MDCT examinations of 52 consecutive patients with tricuspid aortic valve (30 women, 22 men) with a mean age of 83±7 (SD) years (range: 64-93 years) were evaluated retrospectively. The maximum and minimum diameters, area and circumference of the aortic annulus were measured twice at diastole and systole with a standardized approach by three independent observers with different levels of experience (expert [observer 1]; resident with intensive 6 months practice [observer 2]; trained resident with starting experience [observer 3]). Observers were requested to recommend the valve prosthesis size. Calcification volume of the aortic valve and signal to noise ratio were evaluated. Intra- and inter-observer reproducibility was excellent for all aortic annulus dimensions, with an intraclass correlation coefficient ranging respectively from 0.84 to 0.98 and from 0.82 to 0.97. Agreement for selection of prosthesis size was almost perfect between the two most experienced observers (k=0.82) and substantial with the inexperienced observer (k=0.67). Aortic valve calcification did not influence intra-observer reproducibility. Image quality influenced reproducibility of the inexperienced observer. Intra- and inter-observer variability of aortic annulus sizing by MDCT is low. Nevertheless, the less experienced observer showed lower reliability suggesting a learning curve. Copyright © 2017. Published by Elsevier Masson SAS.
Figueroa, José; Guarachi, Juan Pablo; Matas, José; Arnander, Magnus; Orrego, Mario
2016-04-01
Computed tomography (CT) is widely used to assess component rotation in patients with poor results after total knee arthroplasty (TKA). The purpose of this study was to simultaneously determine the accuracy and reliability of CT in measuring TKA component rotation. TKA components were implanted in dry-bone models and assigned to two groups. The first group (n = 7) had variable femoral component rotations, and the second group (n = 6) had variable tibial tray rotations. CT images were then used to assess component rotation. Accuracy of CT rotational assessment was determined by mean difference, in degrees, between implanted component rotation and CT-measured rotation. Intraclass correlation coefficient (ICC) was applied to determine intra-observer and inter-observer reliability. Femoral component accuracy showed a mean difference of 2.5° and the tibial tray a mean difference of 3.2°. There was good intra- and inter-observer reliability for both components, with a femoral ICC of 0.8 and 0.76, and tibial ICC of 0.68 and 0.65, respectively. CT rotational assessment accuracy can differ from true component rotation by approximately 3° for each component. It does, however, have good inter- and intra-observer reliability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jodda, A; Piotrowski, T
2014-06-01
Purpose: The intra- and inter-observer variability in delineation of the parotids on kilo-voltage computed tomography (kVCT) and mega-voltage computed tomography (MVCT) was examined to establish its impact on the dose calculation during adaptive head and neck helical tomotherapy (HT). Methods: Three observers delineated left and right parotids for ten randomly selected patients with oropharynx cancer treated on HT. The pre-treatment kVCT and the MVCT from the first fraction of irradiation were selected for delineation. The delineation procedure was repeated three times by each observer. The parotids were delineated according to the institutional protocol. The analyses included intra-observer reproducibility and inter-structure, -observer and -modality variability of the volume and dose. Results: The differences between the left and right parotid outlines were not statistically significant (p>0.3). The reproducibility of the delineation was confirmed for each observer on the kVCT (p>0.2) and on the MVCT (p>0.1). The inter-observer variability of the outlines was significant (p<0.001) as well as the inter-modality variability (p<0.006). The parotids delineated on the MVCT were 10% smaller than on the kVCT. The inter-observer variability of the parotid delineation did not affect the average dose (p=0.096 on the kVCT and p=0.176 on the MVCT). The dose calculated on the MVCT was 3.3% higher than the dose from the kVCT (p=0.009). Conclusion: Usage of the institutional protocols for parotid delineation reduces intra-observer variability and increases reproducibility of the outlines. These protocols do not eliminate delineation differences between the observers, but these differences are not clinically significant and do not affect average doses in the parotids. The volumes of the parotids delineated on the MVCT are smaller than on the kVCT, which affects the differences in the calculated doses.
Chapman, Cary B; Herrera, Mauricio F; Binenbaum, Gil; Schweppe, Michael; Staron, Ronald B; Feldman, Frieda; Rosenwasser, Melvin P
2003-09-01
The purpose of this prospective study was to determine the level of interobserver and intraobserver agreement among orthopedic surgeons and radiologists when computed tomography (CT) scans are used with plain radiographs to evaluate intertrochanteric fractures. In addition, the prognostic value of current classification systems concerning quality of life was evaluated. Sixty-one patients who presented with intertrochanteric fractures received open reduction and internal fixation with compression hip screw. Three orthopedic surgeons and 2 radiologists independently classified the fractures according to 2 systems: Evans-Jensen and AO (Arbeitsgemeinschaft für Osteosynthesefragen). Fractures were initially graded with plain radiographs and then again in conjunction with CT. Results were analyzed using the kappa (κ) coefficient. The 36-item Short-Form Health Survey was administered at baseline, 3 months, and 1 year, and results were correlated with fracture grade. Mean kappa coefficients when comparing radiography alone with radiography and CT scan were 0.63 for the AO system and 0.59 for the Evans-Jensen system. Both represent "fair" agreements. Mean overall interobserver kappa coefficients were 0.67 for radiologists and 0.57 for orthopedic surgeons. Radiologists also had higher intraobserver kappa coefficients. No significant relationships were found between follow-up Short-Form Health Survey results and intraoperative grading of fractures. When these classification schemes are compared, interobserver agreement does not appear to change dramatically when information from CT scans is added. This may suggest that (1) more data have been provided by CT with greater possibilities for misinterpretation and (2) these classification schemes may not be comprehensive in describing fracture pattern and displacement. Finally, both systems failed to provide any prognostic value.
Ghobrial, Fady Emil Ibrahim; Eldin, Manal Salah; Razek, Ahmed Abdel Khalek Abdel; Atwan, Nadia Ibrahim; Shamaa, Sameh Sayed Ahmed
2017-01-01
To assess inter-observer agreement of revised RECIST criteria (version 1.1) for computed tomography assessment of hepatic metastases of breast cancer. A prospective study was conducted in 28 female patients with breast cancer and with at least one measurable metastatic lesion in the liver that was treated with 3 cycles of anthracycline-based chemotherapy. All patients underwent computed tomography of the abdomen with 64-row multidetector CT at baseline and after 3 cycles of chemotherapy for response assessment. Image analysis was performed by 2 observers, based on the RECIST criteria (version 1.1). Computed tomography revealed partial response of hepatic metastases in 7 patients (25%) by one observer and in 10 patients (35.7%) by the other observer, with good inter-observer agreement (k=0.75, percent agreement of 89.29%). Stable disease was detected in 19 patients (67.8%) by one observer and in 16 patients (57.1%) by the other observer, with good agreement (k=0.774, percent agreement of 89.29%). Progressive disease was detected in 2 patients (7.2%) by both observers, with perfect agreement (k=1, percent agreement of 100%). The overall inter-observer agreement in the CT-based response assessment of hepatic metastasis between the two observers was good (k=0.793, percent agreement of 89.29%). We concluded that computed tomography is a reliable and reproducible imaging modality for response assessment of hepatic metastases of breast cancer according to the RECIST criteria (version 1.1).
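The agreement figures in this abstract can be checked by hand. The sketch below computes Cohen's kappa and raw percent agreement from a cross-tabulation of the two observers' ratings. The abstract does not give that table, so the one used here is an assumption: it is inferred from the reported per-observer counts (7/19/2 versus 10/16/2 for partial response, stable disease and progressive disease) and the 89.29% agreement (25 of 28 patients), with all three disagreements placed as stable disease for one observer and partial response for the other. The function name `cohen_kappa` is illustrative.

```python
def cohen_kappa(table):
    """Return (kappa, observed agreement) for a square cross-tabulation.

    table[i][j] = number of cases rated category i by observer A
    and category j by observer B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the two observers' marginal proportions.
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected), observed


# ASSUMED table (not given in the abstract), categories PR, SD, PD.
# Rows: observer A; columns: observer B.
table = [
    [7, 0, 0],   # A: partial response
    [3, 16, 0],  # A: stable disease
    [0, 0, 2],   # A: progressive disease
]

kappa, agreement = cohen_kappa(table)
print(round(kappa, 3), round(100 * agreement, 2))  # prints: 0.793 89.29
```

Under that assumed table the overall kappa and percent agreement match the values reported in the abstract, which suggests the reconstruction is at least consistent with the published marginals.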
Schreiter, V; Steffen, I; Huebner, H; Bredow, J; Heimann, U; Kroencke, T J; Poellinger, A; Doellinger, F; Buchert, R; Hamm, B; Brenner, W; Schreiter, N F
2015-01-01
The purpose of this study was to evaluate the reproducibility of a new software based analysing system for ventilation/perfusion single-photon emission computed tomography/computed tomography (V/P SPECT/CT) in patients with pulmonary emphysema and to compare it to the visual interpretation. 19 patients (mean age: 68.1 years) with pulmonary emphysema who underwent V/P SPECT/CT were included. Data were analysed by two independent observers in visual interpretation (VI) and by software based analysis system (SBAS). SBAS PMOD version 3.4 (Technologies Ltd, Zurich, Switzerland) was used to assess counts and volume per lung lobe/per lung and to calculate the count density per lung, lobe ratio of counts and ratio of count density. VI was performed using a visual scale to assess the mean counts per lung lobe. Interobserver variability and association for SBAS and VI were analysed using Spearman's rho correlation coefficient. Interobserver agreement correlated highly in perfusion (rho: 0.982, 0.957, 0.90, 0.979) and ventilation (rho: 0.972, 0.924, 0.941, 0.936) for count/count density per lobe and ratio of counts/count density in SBAS. Interobserver agreement correlated clearly for perfusion (rho: 0.655) and weakly for ventilation (rho: 0.458) in VI. SBAS provides more reproducible measures than VI for the relative tracer uptake in V/P SPECT/CTs in patients with pulmonary emphysema. However, SBAS has to be improved for routine clinical use.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hou, Y; Aileen, C; Kozono, D
Purpose: Quantification of volume changes on CBCT during SBRT for NSCLC may provide a useful radiological marker for radiation response and adaptive treatment planning, but the reproducibility of CBCT volume delineation is a concern. This study quantifies inter-scan and inter-observer variability in tumor volume delineation on CBCT. Methods: Twenty early-stage (stage I and II) NSCLC patients were included in this analysis. All patients were treated with SBRT with a median dose of 54 Gy in 3 to 5 fractions. Two physicians independently manually contoured the primary gross tumor volume on CBCTs taken immediately before SBRT treatment (Pre) and after the same SBRT treatment (Post). Absolute volume differences (AVD) were calculated between the Pre and Post CBCTs for a given treatment to quantify inter-scan variability, and then between the two observers for a given CBCT to quantify inter-observer variability. AVD was also normalized with respect to average volume to obtain relative volume differences (RVD). The Bland-Altman approach was used to evaluate variability. All statistics were calculated with SAS version 9.4. Results: The 95% limits of agreement (mean ± 2SD) on AVD and RVD measurements between Pre and Post scans were −0.32 cc to 0.32 cc and −0.5% to 0.5% versus −1.9 cc to 1.8 cc and −15.9% to 15.3% for the two observers, respectively. The 95% limits of agreement of AVD and RVD between the two observers were −3.3 cc to 2.3 cc and −42.4% to 28.2%, respectively. The greatest variability in inter-scan RVD was observed with very small tumors (< 5 cc). Conclusion: Inter-scan variability in RVD is greatest with small tumors. Inter-observer variability was larger than inter-scan variability. The 95% limit of agreement for inter-observer and inter-scan variability (∼15–30%) helps define a threshold for clinically meaningful change in tumor volume to assess SBRT response, with larger thresholds needed for very small tumors.
Part of the work was funded by a Kaye award; Disclosure/Conflict of interest: Raymond H. Mak: Stock ownership: Celgene, Inc. Consulting: Boehringer-Ingelheim, Inc.
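The limits of agreement quoted in this abstract follow the Bland-Altman recipe: take paired differences, then report mean ± 2SD, as the abstract states. A minimal sketch of that calculation follows, together with the relative volume difference (difference divided by the pair's average volume). The study itself used SAS; this Python version is only an illustration. The function names and the example volumes are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(a, b, k=2.0):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits are bias -/+ k * SD of the paired differences; k=2 mirrors
    the "mean ± 2SD" convention quoted in the abstract.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - k * sd, bias + k * sd


def relative_differences(a, b):
    """Paired differences as a percentage of each pair's average value."""
    return [100.0 * (x - y) / ((x + y) / 2.0) for x, y in zip(a, b)]


# HYPOTHETICAL tumor volumes in cc from two observers (illustration only).
observer_1 = [1.0, 2.0, 3.0, 4.0]
observer_2 = [1.0, 1.0, 3.0, 5.0]

bias, lower, upper = bland_altman(observer_1, observer_2)
rvd = relative_differences(observer_1, observer_2)
rvd_bias, rvd_lower, rvd_upper = bland_altman(rvd, [0.0] * len(rvd))
print(f"AVD limits: {lower:.2f} to {upper:.2f} cc")
print(f"RVD limits: {rvd_lower:.1f}% to {rvd_upper:.1f}%")
```

Normalizing by the pair average is what makes small tumors stand out: the same absolute disagreement is a much larger percentage of a small volume, which fits the abstract's remark that very small tumors need larger thresholds.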
Reproducibility of abdominal fat assessment by ultrasound and computed tomography
Mauad, Fernando Marum; Chagas-Neto, Francisco Abaeté; Benedeti, Augusto César Garcia Saab; Nogueira-Barbosa, Marcello Henrique; Muglia, Valdair Francisco; Carneiro, Antonio Adilton Oliveira; Muller, Enrico Mattana; Elias Junior, Jorge
2017-01-01
Objective: To test the accuracy and reproducibility of ultrasound and computed tomography (CT) for the quantification of abdominal fat in correlation with the anthropometric, clinical, and biochemical assessments. Materials and Methods: Using ultrasound and CT, we determined the thickness of subcutaneous and intra-abdominal fat in 101 subjects (39 men [38.6%] and 62 women [61.4%]) with a mean age of 66.3 years (range, 60-80 years). The ultrasound data were correlated with the anthropometric, clinical, and biochemical parameters, as well as with the areas measured by abdominal CT. Results: Intra-abdominal thickness was the variable for which the correlation with the areas of abdominal fat was strongest (i.e., the correlation coefficient was highest). We also tested the reproducibility of ultrasound and CT for the assessment of abdominal fat and found that CT measurements of abdominal fat showed greater reproducibility, with higher intraobserver and interobserver reliability than the ultrasound measurements. There was a significant correlation between ultrasound and CT, with a correlation coefficient of 0.71. Conclusion: In the assessment of abdominal fat, the intraobserver and interobserver reliability were greater for CT than for ultrasound, although both methods showed high accuracy and good reproducibility. PMID:28670024
Area of ischemia assessed by physicians and software packages from myocardial perfusion scintigrams
2014-01-01
Background: The European Society of Cardiology recommends that patients with >10% area of ischemia should receive revascularization. We investigated inter-observer variability for the extent of ischemic defects reported by different physicians and by different software tools, and whether inter-observer variability was reduced when the physicians were provided with a computerized suggestion of the defects. Methods: Twenty-five myocardial perfusion single photon emission computed tomography (SPECT) patients who were regarded as ischemic according to the final report were included. Eleven physicians in nuclear medicine delineated the extent of the ischemic defects. After at least two weeks, they delineated the defects again, this time with a suggested defect delineation from EXINI Heart™ (EXINI). Summed difference scores and ischemic extent values were obtained from four software programs. Results: The median extent values obtained from the 11 physicians varied between 8% and 34%, and between 9% and 16% for the software programs. For all 25 patients, mean extent obtained from EXINI was 17.0% (± standard deviation (SD) 14.6%). Mean extent for physicians was 22.6% (± 15.6%) for the first delineation and 19.1% (± 14.9%) for the evaluation where they were provided with the computerized suggestion. Intra-class correlation (ICC) increased from 0.56 (95% confidence interval (CI) 0.41-0.72) to 0.81 (95% CI 0.71-0.90) between the first and the second delineation, and the SD between physicians was 7.8 (first delineation) and 5.9 (second delineation). Conclusions: There was large variability in the estimated ischemic defect size obtained both from different physicians and from different software packages. When the physicians were provided with a suggested delineation, the inter-observer variability decreased significantly. PMID:24479846
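Several abstracts in this listing, including the one above, report an intra-class correlation coefficient without saying which form was used. As a hedged illustration, the sketch below computes one common variant: the two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1), from the mean squares of a subjects-by-raters table. The choice of variant is an assumption, the function name `icc_2_1` is illustrative, and the ratings are a small textbook-style table used only to exercise the code. They are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] = score given to subject i by rater j (complete table).
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)            # between-subject mean square
    ms_cols = ss_cols / (k - 1)            # between-rater mean square
    ms_err = ss_err / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# ILLUSTRATIVE ratings: 6 subjects (rows) scored by 4 raters (columns).
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(demo), 2))
```

Because this variant penalizes systematic offsets between raters, a rater who consistently delineates larger defects lowers the coefficient even if the ranking of patients is identical. Other ICC forms treat that offset differently, so reported values are only comparable when the form is known.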
Moreno-Montañés, Javier; Antón, Vanesa; Antón, Alfonso; Larrosa, José M; Martinez-de-la-Casa, José María; Rebolleda, Gema; Ussa, Fernando; García-Granero, Marta
2017-04-01
It is important to evaluate intraobserver and interobserver agreement using visual field (VF) testing and optical coherence tomography (OCT) software in order to understand whether the use of this software is sufficient to detect glaucoma progression and to make decisions regarding its treatment. To evaluate agreement in VF and OCT software among 5 glaucoma specialists. The printout pages from VF progression software and OCT progression software from 100 patients were randomized, and the 5 glaucoma specialists subjectively and independently evaluated them for glaucoma. Each image was classified as having no progression, questionable progression, or progression. The principal investigator classified the patients previously as without variability (normal) or with high variability among tests (difficult). Using both software, the specialists also evaluated whether the glaucoma damage had progressed and if treatment change was needed. One month later, the same observers reevaluated the patients in a different order to determine intraobserver reproducibility. Intraobserver and interobserver agreement was estimated using κ statistics and Gwet second-order agreement coefficient. The agreement was compared with other factors. Of the 100 observed patients, half were male and all were white; the mean (SD) age was 69.7 (14.1) years. Intraobserver agreement was substantial to almost perfect for VF software (overall κ [95% CI], 0.59 [0.46-0.72] to 0.87 [0.79-0.96]) and similar for OCT software (overall κ [95% CI], 0.59 [0.46-0.71] to 0.85 [0.76-0.94]). Interobserver agreement among the 5 glaucoma specialists with the VF progression software was moderate (κ, 0.48; 95% CI, 0.41-0.55) and similar to OCT progression software (κ, 0.52; 95% CI, 0.44-0.59). Interobserver agreement was substantial in images classified as having no progression but only fair in those classified as having questionable glaucoma progression or glaucoma progression. 
Interobserver agreement was fair regarding questions about glaucoma progression (κ, 0.39; 95% CI, 0.32-0.48) and consideration about treatment changes (κ, 0.39; 95% CI, 0.32-0.48). The factors associated with agreement were the glaucoma stage and case difficulty. There was substantial intraobserver agreement but moderate interobserver agreement among glaucoma specialists using 2 glaucoma progression software packages. These data suggest that these glaucoma progression software packages are insufficient to obtain high interobserver agreement in both devices except in patients with no progression. The low agreement regarding progression or treatment changes suggests that both software programs used in isolation are insufficient for decision making.
Razek, Ahmed Abdel Khalek Abdel; Shamaa, Sameh; Lattif, Mahmoud Abdel; Yousef, Hanan Hamid
2017-01-01
To assess inter-observer agreement of whole-body computed tomography (WBCT) in staging and response assessment in lymphoma according to the Lugano classification. Retrospective analysis was conducted of 115 consecutive patients with lymphomas (45 females, 70 males; mean age of 46 years). Patients underwent WBCT with a 64 multi-detector CT device for staging and response assessment after a complete course of chemotherapy. Image analysis was performed by 2 reviewers according to the Lugano classification for staging and response assessment. The overall inter-observer agreement of WBCT in staging of lymphoma was excellent (k = 0.90, percent agreement = 94.9%). There was excellent inter-observer agreement for stage I (k = 0.93, percent agreement = 96.4%), stage II (k = 0.90, percent agreement = 94.8%), stage III (k = 0.89, percent agreement = 94.6%) and stage IV (k = 0.88, percent agreement = 94%). The overall inter-observer agreement in response assessment after a complete course of treatment was excellent (k = 0.91, percent agreement = 95.8%). There was excellent inter-observer agreement in progressive disease (k = 0.94, percent agreement = 97.1%), stable disease (k = 0.90, percent agreement = 95%), partial response (k = 0.96, percent agreement = 98.1%) and complete response (k = 0.87, percent agreement = 93.3%). We concluded that WBCT is a reliable and reproducible imaging modality for staging and treatment assessment in lymphoma according to the Lugano classification.
Yi, Ji Sook; Han, Jong Kyu; Kim, Hyun-Joo
2015-01-01
Objective: To assess inter-modality variability when evaluating cervical intervertebral disc herniation using 64-slice multidetector-row computed tomography (MDCT) and magnetic resonance imaging (MRI). Materials and Methods: Three musculoskeletal radiologists independently reviewed cervical spine 1.5-T MRI and 64-slice MDCT data on C2-3 through C6-7 of 51 patients in the context of intervertebral disc herniation. Interobserver and inter-modality agreements were expressed as unweighted kappa values. Weighted kappa statistics were used to assess the extent of agreement in terms of the number of involved segments (NIS) in disc herniation and epicenter measurements collected using MDCT and MRI. Results: The interobserver agreement for disc morphology among the three radiologists was fair to moderate (k = 0.39-0.53 for MDCT images; k = 0.45-0.56 for MRIs). When the disc morphology was categorized into two and four grades, the inter-modality agreement rates were moderate (k-value, 0.59) and substantial (k-value, 0.66), respectively. The inter-modality agreements for evaluations of the NIS (k-value, 0.78) and the epicenter (k-value, 0.79) were substantial. Also, the interobserver agreements for the NIS (CT; k-value, 0.85 and MRI; k-value, 0.88) and epicenter (CT; k-value, 0.74 and MRI; k-value, 0.70) evaluations by two readers were substantial. MDCT tended to underestimate the extent of herniated disc lesions compared with MRI. Conclusion: Multidetector-row computed tomography and MRI showed a moderate-to-substantial degree of inter-modality agreement for the assessment of herniated cervical discs. MDCT images have a tendency to underestimate the anterior/posterior extent of the herniated disc compared with MRI. PMID:26175589
Karam, Jose A; Devine, Catherine E; Fellman, Bryan M; Urbauer, Diana L; Abel, E Jason; Allaf, Mohamad E; Bex, Axel; Lane, Brian R; Thompson, R Houston; Wood, Christopher G
2016-04-01
To evaluate how many patients could have undergone partial nephrectomy (PN) rather than radical nephrectomy (RN) before and after neoadjuvant axitinib therapy, as assessed by five independent urological oncologists, and to study the variability of inter-observer agreement. Pre- and post-systemic treatment computed tomography scans from 22 patients with clear cell renal cell carcinoma in a phase II neoadjuvant axitinib trial were reviewed by five independent urological oncologists. R.E.N.A.L. nephrometry score and κ statistics were calculated. The median R.E.N.A.L. nephrometry score changed from 11 before treatment to 10 after treatment (P = 0.002). Five tumours with moderate complexity before axitinib treatment remained moderate complexity after treatment. Of 17 tumours with high complexity before axitinib treatment, three became moderate complexity after treatment. The overall κ statistic was 0.611. Moderate-complexity κ was 0.611 vs a high-complexity κ of 0.428. Before axitinib treatment the κ was 0.550 vs 0.609 after treatment. After treatment with axitinib, all five reviewers agreed that only five patients required RN (instead of eight before treatment) and that 10 patients could now undergo PN (instead of three before treatment). The odds of PN feasibility were 22.8-times higher after treatment with axitinib. There is considerable variability in inter-observer agreement on the feasibility of PN in patients treated with neoadjuvant targeted therapy. Although more patients were candidates for PN after neoadjuvant axitinib therapy, it remains difficult to identify these patients a priori. © 2015 The Authors BJU International © 2015 BJU International Published by John Wiley & Sons Ltd.
Bouwense, Stefan A; van Brunschot, Sandra; van Santvoort, Hjalmar C; Besselink, Marc G; Bollen, Thomas L; Bakker, Olaf J; Banks, Peter A; Boermeester, Marja A; Cappendijk, Vincent C; Carter, Ross; Charnley, Richard; van Eijck, Casper H; Freeny, Patrick C; Hermans, John J; Hough, David M; Johnson, Colin D; Laméris, Johan S; Lerch, Markus M; Mayerle, Julia; Mortele, Koenraad J; Sarr, Michael G; Stedman, Brian; Vege, Santhi Swaroop; Werner, Jens; Dijkgraaf, Marcel G; Gooszen, Hein G; Horvath, Karen D
2017-08-01
Severe acute pancreatitis is associated with peripancreatic morphologic changes as seen on imaging. Uniform communication regarding these morphologic findings is crucial for accurate diagnosis and treatment. For the original 1992 Atlanta classification, interobserver agreement is poor. We hypothesized that for the revised Atlanta classification, interobserver agreement will be better. An international, interobserver agreement study was performed among expert and nonexpert radiologists (n = 14), surgeons (n = 15), and gastroenterologists (n = 8). Representative computed tomographies of all stages of acute pancreatitis were selected from 55 patients and were assessed according to the revised Atlanta classification. The interobserver agreement was calculated among all reviewers and subgroups, that is, expert and nonexpert reviewers; interobserver agreement was defined as poor (≤0.20), fair (0.21-0.40), moderate (0.41-0.60), good (0.61-0.80), or very good (0.81-1.00). Interobserver agreement among all reviewers was good (0.75 [standard deviation, 0.21]) for describing the type of acute pancreatitis and good (0.62 [standard deviation, 0.19]) for the type of peripancreatic collection. Expert radiologists showed the best and nonexpert clinicians the lowest interobserver agreement. Interobserver agreement was good for the revised Atlanta classification, supporting the importance of widespread adoption of this revised classification for clinical and research communications.
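As a rough illustration of the agreement bands quoted in the abstract above, the following Python sketch computes Cohen's kappa for two raters and maps a kappa value onto the poor/fair/moderate/good/very good scale. The two-rater formula and the function names `cohen_kappa` and `agreement_label` are assumptions made for illustration; the abstract does not state which multi-rater statistic the study used.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases (illustrative)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty case list")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Map kappa onto the bands quoted in the abstract."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical labels from two reviewers for four scans.
print(agreement_label(cohen_kappa(["A", "A", "B", "B"], ["A", "A", "B", "A"])))
```

With more than two reviewers, as in the study, a multi-rater statistic or an average of pairwise values would be needed instead.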
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pogson, Elise M.; Liverpool and Macarthur Cancer Therapy Centres, Liverpool; Ingham Institute for Applied Medical Research, Liverpool
2016-11-15
Purpose: To determine whether T2-weighted MRI improves seroma cavity (SC) and whole breast (WB) interobserver conformity for radiation therapy purposes, compared with the gold standard of CT, both in the prone and supine positions. Methods and Materials: Eleven observers (2 radiologists and 9 radiation oncologists) delineated SC and WB clinical target volumes (CTVs) on T2-weighted MRI and CT supine and prone scans (4 scans per patient) for 33 patient datasets. Individual observer's volumes were compared using the Dice similarity coefficient, volume overlap index, center of mass shift, and Hausdorff distances. An average cavity visualization score was also determined. Results: Imaging modality did not affect interobserver variation for WB CTVs. Prone WB CTVs were larger in volume and more conformal than supine CTVs (on both MRI and CT). Seroma cavity volumes were larger on CT than on MRI. Seroma cavity volumes proved to be comparable in interobserver conformity in both modalities (volume overlap index of 0.57 (95% Confidence Interval (CI) 0.54-0.60) for CT supine and 0.52 (95% CI 0.48-0.56) for MRI supine, 0.56 (95% CI 0.53-0.59) for CT prone and 0.55 (95% CI 0.51-0.59) for MRI prone); however, after registering modalities together the intermodality variation (Dice similarity coefficient of 0.41 (95% CI 0.36-0.46) for supine and 0.38 (0.34-0.42) for prone) was larger than the interobserver variability for SC, despite the location typically remaining constant. Conclusions: Magnetic resonance imaging interobserver variation was comparable to CT for the WB CTV and SC delineation, in both prone and supine positions. Although the cavity visualization score and interobserver concordance was not significantly higher for MRI than for CT, the SCs were smaller on MRI, potentially owing to clearer SC definition, especially on T2-weighted MR images.
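The Dice similarity coefficient named in the abstract above can be sketched in a few lines of Python. This is a minimal illustration that assumes each contour is given as a collection of voxel indices; the function name `dice_coefficient` and the toy contours are hypothetical and are not taken from the study.

```python
def dice_coefficient(contour_a, contour_b):
    """Dice similarity coefficient: 2 * |A intersect B| / (|A| + |B|)."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty contours agree fully
    return 2 * len(a & b) / (len(a) + len(b))


# Two hypothetical observers outlining the same structure on a tiny voxel grid.
observer_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
observer_2 = {(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)}
print(dice_coefficient(observer_1, observer_2))
```

A value of 1 means the two contours coincide and 0 means they share no voxels, which is why the intermodality value of about 0.4 reported above indicates only partial overlap.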
Bashir, Usman; Azad, Gurdip; Siddique, Muhammad Musib; Dhillon, Saana; Patel, Nikheel; Bassett, Paul; Landau, David; Goh, Vicky; Cook, Gary
2017-12-01
Measures of tumour heterogeneity derived from 18-fluoro-2-deoxyglucose positron emission tomography/computed tomography (¹⁸F-FDG PET/CT) scans are increasingly reported as potential biomarkers of non-small cell lung cancer (NSCLC) for classification and prognostication. Several segmentation algorithms have been used to delineate tumours, but their effects on the reproducibility and predictive and prognostic capability of derived parameters have not been evaluated. The purpose of our study was to retrospectively compare various segmentation algorithms in terms of inter-observer reproducibility and prognostic capability of texture parameters derived from NSCLC ¹⁸F-FDG PET/CT images. Fifty-three NSCLC patients (mean age 65.8 years; 31 males) underwent pre-chemoradiotherapy ¹⁸F-FDG PET/CT scans. Three readers segmented tumours using freehand (FH), 40% of maximum intensity threshold (40P), and fuzzy locally adaptive Bayesian (FLAB) algorithms. Intraclass correlation coefficient (ICC) was used to measure the inter-observer variability of the texture features derived by the three segmentation algorithms. Univariate Cox regression was used on 12 commonly reported texture features to predict overall survival (OS) for each segmentation algorithm. Model quality was compared across segmentation algorithms using Akaike information criterion (AIC). 40P was the most reproducible algorithm (median ICC 0.9; interquartile range [IQR] 0.85-0.92) compared with FLAB (median ICC 0.83; IQR 0.77-0.86) and FH (median ICC 0.77; IQR 0.7-0.85). On univariate Cox regression analysis, 40P found 2 out of 12 variables, i.e., first-order entropy and grey-level co-occurrence matrix (GLCM) entropy, to be significantly associated with OS; FH and FLAB found 1, i.e., first-order entropy. For each tested variable, survival models for all three segmentation algorithms were of similar quality, exhibiting comparable AIC values with overlapping 95% CIs. 
Compared with both FLAB and FH, segmentation with 40P yields superior inter-observer reproducibility of texture features. Survival models generated by all three segmentation algorithms are of at least equivalent utility. Our findings suggest that a segmentation algorithm using a 40% of maximum threshold is acceptable for texture analysis of ¹⁸F-FDG PET in NSCLC.
Wang, Qingle; Zhang, Zhiyong; Shan, Fei; Shi, Yuxin; Xing, Wei; Shi, Liangrong; Zhang, Xingwei
2017-09-01
This study was conducted to assess intra-observer and inter-observer agreements for the measurement of dual-input whole tumor computed tomography perfusion (DCTP) in patients with lung cancer. A total of 88 patients with a proven diagnosis of primary lung cancer who had undergone DCTP were grouped in two ways: (i) by size, into nodules (diameter ≤3 cm) and masses (diameter >3 cm), and (ii) into tumors with and without air density. Pulmonary flow, bronchial flow, and pulmonary index were measured in each group. Intra-observer and inter-observer agreements for measurement were assessed using intraclass correlation coefficient, within-subject coefficient of variation, and Bland-Altman analysis. In all lung cancers, the reproducibility coefficient for intra-observer agreement (range 26.1-38.3%) was superior to that for inter-observer agreement (range 38.1-81.2%). Further analysis revealed lower agreements for nodules compared to masses. Additionally, inner-air density reduced both agreements for lung cancer. The intra-observer agreement for measuring lung cancer DCTP was satisfactory, while the inter-observer agreement was limited. The effects of tumor size and inner-air density on agreement, especially between two observers, should be emphasized. In the future, automatic computer-aided segmentation for measuring tumor perfusion values should be developed. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Brunner, Alexander; Gühring, Markus; Schmälzle, Traude; Weise, Kuno; Badke, Andreas
2009-01-01
Evaluation of the kyphosis angle in thoracic and lumbar burst fractures is often used to indicate surgical procedures. The kyphosis angle can be measured as vertebral, segmental, or local kyphosis according to the method of Cobb. These three angles were measured on 120 lateral X-rays and sagittal computed tomography (CT) images of 60 thoracic and 60 lumbar burst fractures by 3 independent observers on 2 separate occasions. Osteoporotic fractures were excluded. The intra- and interobserver reliability of these angles on X-ray and CT was evaluated using the intraclass correlation coefficient (ICC). Segmental kyphosis showed the highest reproducibility, followed by vertebral kyphosis. On X-ray, segmental kyphosis showed “excellent” inter- and intraobserver reliabilities for thoracic fractures (ICC = 0.826, 0.802) and “good” to “excellent” inter- and intraobserver reliabilities for lumbar fractures (ICC = 0.790, 0.803). On CT, segmental kyphosis showed “excellent” inter- and intraobserver reliabilities for thoracic fractures (ICC = 0.824, 0.801) and for lumbar fractures (ICC = 0.874, 0.835). Comparing the two imaging methods (X-ray and CT), significant differences were found in interobserver reliabilities for vertebral kyphosis measured on lumbar fracture X-rays (p = 0.035) and for local kyphosis measured on thoracic fracture X-rays (p = 0.010). Comparing the two fracture locations (thoracic and lumbar), significant differences were found only in interobserver reliabilities for local kyphosis measured on CT (p = 0.045) and in intraobserver reliabilities for vertebral kyphosis measured on X-rays (p = 0.024). 
“Good” to “excellent” inter- and intraobserver reliabilities for vertebral, segmental and local kyphosis on X-ray make these angles a helpful tool for indicating surgical procedures. For practical use on lateral X-ray, we recommend determining the segmental kyphosis because this angle has the highest reproducibility. “Good” to “excellent” inter- and intraobserver reliabilities for these three angles were also found on CT, so their use on CT also seems generally possible. Further studies are needed to correlate the results on lateral X-ray and CT directly. PMID:19953277
Intra- and Interobserver Variability of Cochlear Length Measurements in Clinical CT.
Iyaniwura, John E; Elfarnawany, Mai; Riyahi-Alam, Sadegh; Sharma, Manas; Kassam, Zahra; Bureau, Yves; Parnes, Lorne S; Ladak, Hanif M; Agrawal, Sumit K
2017-07-01
The cochlear A-value measurement exhibits significant inter- and intraobserver variability, and its accuracy is dependent on the visualization method in clinical computed tomography (CT) images of the cochlea. An accurate estimate of the cochlear duct length (CDL) can be used to determine electrode choice, and frequency map the cochlea based on the Greenwood equation. Studies have described estimating the CDL using a single A-value measurement, however the observer variability has not been assessed. Clinical and micro-CT images of 20 cadaveric cochleae were acquired. Four specialists measured A-values on clinical CT images using both standard views and multiplanar reconstructed (MPR) views. Measurements were repeated to assess for intraobserver variability. Observer variabilities were evaluated using intra-class correlation and absolute differences. Accuracy was evaluated by comparison to the gold standard micro-CT images of the same specimens. Interobserver variability was good (average absolute difference: 0.77 ± 0.42 mm) using standard views and fair (average absolute difference: 0.90 ± 0.31 mm) using MPR views. Intraobserver variability had an average absolute difference of 0.31 ± 0.09 mm for the standard views and 0.38 ± 0.17 mm for the MPR views. MPR view measurements were more accurate than standard views, with average relative errors of 9.5 and 14.5%, respectively. There was significant observer variability in A-value measurements using both the standard and MPR views. Creating the MPR views increased variability between experts, however MPR views yielded more accurate results. Automated A-value measurement algorithms may help to reduce variability and increase accuracy in the future.
Acar, Nihat; Karakasli, Ahmet; Karaarslan, Ahmet; Mas, Nermin Ng; Hapa, Onur
2017-01-01
Volumetric measurements of benign tumors enable surgeons to trace volume changes during follow-up periods. For a volumetric measurement technique to be applicable, it should be easy, rapid, and inexpensive and should carry a high interobserver reliability. We aimed to assess the interobserver reliability of a volumetric measurement technique using Cavalieri's principle of stereological methods. The computerized tomography (CT) scans of 15 patients with a histopathologically confirmed diagnosis of enchondroma with varying tumor sizes and localizations were retrospectively reviewed for interobserver reliability evaluation of the volumetric stereological measurement with Cavalieri's principle, V = t × [(SU × d)/SL]² × ΣP. The volumes of the 15 tumors collected by the observers are demonstrated in Table 1. There was no statistical significance between the first and second observers (p = 0.000 and intraclass correlation coefficient = 0.970) and between the first and third observers (p = 0.000 and intraclass correlation coefficient = 0.981). No statistical significance was detected between the second and third observers (p = 0.000 and intraclass correlation coefficient = 0.976). Cavalieri's principle with the stereological technique using CT scans is an easy, rapid, and inexpensive technique for volumetric evaluation of enchondromas, with trustworthy interobserver reliability.
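The point-counting volume formula quoted in the abstract above lends itself to a short worked sketch. The abstract does not define its symbols, so the readings used here are assumptions based on common stereology usage: t is the section thickness, SU the scale unit printed on the image, d the spacing between grid points, SL the measured length of that scale on the image, and ΣP the total number of grid points falling on the tumor across all sections. The function name and the sample numbers are hypothetical.

```python
def cavalieri_volume(t, su, d, sl, points_per_section):
    """Volume estimate V = t * ((SU * d) / SL)**2 * sum(P).

    Symbol meanings are assumed (see the lead-in); all lengths must share
    one unit, and the result is in that unit cubed.
    """
    if sl <= 0:
        raise ValueError("scale length must be positive")
    # Real-world area represented by one grid point after scale correction.
    area_per_point = ((su * d) / sl) ** 2
    return t * area_per_point * sum(points_per_section)


# Hypothetical example: 0.5 cm sections, 0.5 cm grid spacing, a 1 cm scale
# bar that measures 1 cm on the image, and point counts from three sections.
print(cavalieri_volume(t=0.5, su=1.0, d=0.5, sl=1.0, points_per_section=[4, 6, 5]))
```

Each observer repeats only the point count, which is one reason such a method can be compared across observers with an intraclass correlation coefficient.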
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kosztyla, Robert, E-mail: rkosztyla@bccancer.bc.ca; Chan, Elisa K.; Hsu, Fred
Purpose: The objective of this study was to compare recurrent tumor locations after radiation therapy with pretreatment delineations of high-grade gliomas from magnetic resonance imaging (MRI) and 3,4-dihydroxy-6-[¹⁸F]fluoro-L-phenylalanine (¹⁸F-FDOPA) positron emission tomography (PET) using contours delineated by multiple observers. Methods and Materials: Nineteen patients with newly diagnosed high-grade gliomas underwent computed tomography (CT), gadolinium contrast-enhanced MRI, and ¹⁸F-FDOPA PET/CT. The image sets (CT, MRI, and PET/CT) were registered, and 5 observers contoured gross tumor volumes (GTVs) using MRI and PET. Consensus contours were obtained by simultaneous truth and performance level estimation (STAPLE). Interobserver variability was quantified by the percentage of volume overlap. Recurrent tumor locations after radiation therapy were contoured by each observer using CT or MRI. Consensus recurrence contours were obtained with STAPLE. Results: The mean interobserver volume overlap for PET GTVs (42% ± 22%) and MRI GTVs (41% ± 22%) was not significantly different (P=.67). The mean consensus volume was significantly larger for PET GTVs (58.6 ± 52.4 cm³) than for MRI GTVs (30.8 ± 26.0 cm³, P=.003). More than 95% of the consensus recurrence volume was within the 95% isodose surface for 11 of 12 (92%) cases with recurrent tumor imaging. Ten (91%) of these cases extended beyond the PET GTV, and 9 (82%) were contained within a 2-cm margin on the MRI GTV. One recurrence (8%) was located outside the 95% isodose surface. Conclusions: High-grade glioma contours obtained with ¹⁸F-FDOPA PET had similar interobserver agreement to volumes obtained with MRI. 
Although PET-based consensus target volumes were larger than MRI-based volumes, treatment planning using PET-based volumes may not have yielded better treatment outcomes, given that all but 1 recurrence extended beyond the PET GTV and most were contained by a 2-cm margin on the MRI GTV.
Hanna, Gerard G; McAleese, Jonathan; Carson, Kathryn J; Stewart, David P; Cosgrove, Vivian P; Eakin, Ruth L; Zatari, Ashraf; Lynch, Tom; Jarritt, Peter H; Young, V A Linda; O'Sullivan, Joe M; Hounsell, Alan R
2010-05-01
Positron emission tomography (PET), in addition to computed tomography (CT), has an effect in target volume definition for radical radiotherapy (RT) for non-small-cell lung cancer (NSCLC). In previously PET-CT staged patients with NSCLC, we assessed the effect of using an additional planning PET-CT scan for gross tumor volume (GTV) definition. A total of 28 patients with Stage IA-IIIB NSCLC were enrolled. All patients had undergone staging PET-CT to ensure suitability for radical RT. Of the 28 patients, 14 received induction chemotherapy. In place of a RT planning CT scan, patients underwent scanning on a PET-CT scanner. In a virtual planning study, four oncologists independently delineated the GTV on the CT scan alone and then on the PET-CT scan. Intraobserver and interobserver variability were assessed using the concordance index (CI), and the results were compared using the Wilcoxon signed ranks test. PET-CT improved the CI between observers when defining the GTV using the PET-CT images compared with using CT alone for matched cases (median CI, 0.57 for CT and 0.64 for PET-CT, p = .032). The median of the mean percentage of volume change from GTV(CT) to GTV(FUSED) was -5.21% for the induction chemotherapy group and 18.88% for the RT-alone group. Using the Mann-Whitney U test, this was significantly different (p = .001). PET-CT RT planning scan, in addition to a staging PET-CT scan, reduces interobserver variability in GTV definition for NSCLC. The GTV size with PET-CT compared with CT in the RT-alone group increased and was reduced in the induction chemotherapy group.
A Multicenter Study of Volumetric Computed Tomography for Staging Malignant Pleural Mesothelioma
Rusch, Valerie W.; Gill, Ritu; Mitchell, Alan; Naidich, David; Rice, David C.; Pass, Harvey I.; Kindler, Hedy; De Perrot, Marc; Friedberg, Joseph
2016-01-01
Background: Standard imaging modalities are inaccurate in staging malignant pleural mesothelioma (MPM). Single institution studies suggest that volumetric computed tomography (VolCT) is more accurate but labor intensive. We established a multicenter network to test interobserver variability, accuracy (relative to pathologic stage) and prognostic significance of semi-automated VolCT. Methods: Six institutions electronically submitted clinical and pathologic data to an established multicenter database on patients with MPM who had surgery. Institutional radiologists reviewed preoperative CT scans for quality then submitted via electronic network (AG mednet) to biostatistical center (BC). Two reference radiologists, blinded to clinical data, performed semi-automated tumor volume calculations using commercially available software (Vitrea Enterprise 6.0), then submitted readings to BC. Study endpoints included: feasibility of network; interobserver variability for VolCT; correlation of tumor volume to pTN stages, and overall survival (OS). Results: Of 164 cases, 129 were analyzable and read by reference radiologists. Most tumors were <500 cm³. A small bias was observed between readers, as one provided consistently larger measurements than the other (mean difference=47.9, p=.0027), but for 80% of cases, the absolute difference was ≤ 200 cm³. Spearman correlation between readers was 0.822. Volume correlated with pTN stages and OS, best defined by 3 groups with average volumes of: 91.2, 245.3, 511.3 cm³, associated with median OS of 37, 18, 8 months respectively. Conclusions: For the first time, a multicenter network was established and initial correlations of tumor volume to pTN stages and OS shown. A larger multicenter international study is planned to confirm results and refine correlations. PMID:27596916
Huynh, Thien J; Flaherty, Matthew L; Gladstone, David J; Broderick, Joseph P; Demchuk, Andrew M; Dowlatshahi, Dar; Meretoja, Atte; Davis, Stephen M; Mitchell, Peter J; Tomlinson, George A; Chenkin, Jordan; Chia, Tze L; Symons, Sean P; Aviv, Richard I
2014-01-01
Rapid, accurate, and reliable identification of the computed tomography angiography spot sign is required to identify patients with intracerebral hemorrhage for trials of acute hemostatic therapy. We sought to assess the accuracy and interobserver agreement for spot sign identification. A total of 131 neurology, emergency medicine, and neuroradiology staff and fellows underwent imaging certification for spot sign identification before enrolling patients in 3 trials targeting spot-positive intracerebral hemorrhage for hemostatic intervention (STOP-IT, SPOTLIGHT, STOP-AUST). Ten intracerebral hemorrhage cases (spot-positive/negative ratio, 1:1) were presented for evaluation of spot sign presence, number, and mimics. True spot positivity was determined by consensus of 2 experienced neuroradiologists. Diagnostic performance, agreement, and differences by training level were analyzed. Mean accuracy, sensitivity, and specificity for spot sign identification were 87%, 78%, and 96%, respectively. Overall sensitivity was lower than specificity (P<0.001) because of true spot signs incorrectly perceived as spot mimics. Interobserver agreement for spot sign presence was moderate (k=0.60). When true spots were correctly identified, 81% correctly identified the presence of single or multiple spots. Median time needed to evaluate the presence of a spot sign was 1.9 minutes (interquartile range, 1.2-3.1 minutes). Diagnostic performance, interobserver agreement, and time needed for spot sign evaluation were similar among staff physicians and fellows. Accuracy for spot identification is high with opportunity for improvement in spot interpretation sensitivity and interobserver agreement particularly through greater reliance on computed tomography angiography source data and awareness of limitations of multiplanar images. Further prospective study is needed.
Nguyen, Donna; Minnal, Vandana R.
2016-01-01
Purpose. To evaluate interobserver, intervisit, and interinstrument agreements for gonioscopy and Fourier domain anterior segment optical coherence tomography (FD ASOCT) for classifying open and narrow angle eyes. Methods. Eighty-six eyes with open or narrow anterior chamber angles were included. The superior angle was classified open or narrow by 2 of 5 glaucoma specialists using gonioscopy and imaged by FD ASOCT in the dark. The superior angle of each FD ASOCT image was graded as open or narrow by 2 masked readers. The same procedures were repeated within 6 months. Kappas for interobserver and intervisit agreements for each instrument and interinstrument agreements were calculated. Results. The mean age was 50.9 (±18.4) years. Interobserver agreements were moderate to good for both gonioscopy (0.57 and 0.69) and FD ASOCT (0.58 and 0.75). Intervisit agreements were moderate to excellent for both gonioscopy (0.53 to 0.86) and FD ASOCT (0.57 and 0.85). Interinstrument agreements were fair to good (0.34 to 0.63), with FD ASOCT classifying more angles as narrow than gonioscopy. Conclusions. Both gonioscopy and FD ASOCT examiners were internally consistent with similar interobserver and intervisit agreements for angle classification. Agreement between instruments was fair to good, with FD ASOCT classifying more angles as narrow than gonioscopy. PMID:27990300
Deegan, Timothy; Owen, Rebecca; Holt, Tanya; Fielding, Andrew; Biggs, Jennifer; Parfitt, Matthew; Coates, Alicia; Roberts, Lisa
2015-02-01
This investigation aimed to assess the consistency and accuracy of radiation therapists (RTs) performing cone beam computed tomography (CBCT) alignment to fiducial markers (FMs) (CBCTFM) and the soft tissue prostate (CBCTST). Six patients receiving prostate radiation therapy underwent daily CBCTs. Manual alignment of CBCTFM and CBCTST was performed by three RTs. Inter-observer agreement was assessed using a modified Bland-Altman analysis for each alignment method. Clinically acceptable 95% limits of agreement with the mean (LoAmean) were defined as ±2.0 mm for CBCTFM and ±3.0 mm for CBCTST. Differences between CBCTST alignment and the observer-averaged CBCTFM (AvCBCTFM) alignment were analysed. Clinically acceptable 95% LoA were defined as ±3.0 mm for the comparison of CBCTST and AvCBCTFM. CBCTFM and CBCTST alignments were performed for 185 images. The CBCTFM 95% LoAmean were within ±2.0 mm in all planes. CBCTST 95% LoAmean were within ±3.0 mm in all planes. Comparison of CBCTST with AvCBCTFM resulted in 95% LoA of -4.9 to 2.6, -1.6 to 2.5 and -4.7 to 1.9 mm in the superior-inferior, left-right and anterior-posterior planes, respectively. Significant differences were found between soft tissue alignment and the predicted FM position. FMs are useful in reducing inter-observer variability compared with soft tissue alignment. Consideration needs to be given to margin design when using soft tissue matching due to increased inter-observer variability. This study highlights some of the complexities of soft tissue guidance for prostate radiation therapy. © 2014 The Royal Australian and New Zealand College of Radiologists.
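The limits-of-agreement check described in the abstract above can be sketched as follows. This is the classic Bland-Altman form (mean difference ± 1.96 standard deviations) applied to a list of paired differences; the study used a modified version relative to the observer mean, which is not reproduced here. The function names and the sample differences are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(differences, z=1.96):
    """Classic Bland-Altman 95% limits: mean difference +/- z * sample SD."""
    centre = mean(differences)
    spread = z * stdev(differences)  # sample SD, needs at least two values
    return centre - spread, centre + spread


def within_tolerance(limits, tolerance_mm):
    """True when both limits lie inside +/- tolerance_mm."""
    lower, upper = limits
    return lower >= -tolerance_mm and upper <= tolerance_mm


# Hypothetical shifts (mm) between two alignments of the same images.
shifts = [1.0, -1.0, 1.0, -1.0]
limits = limits_of_agreement(shifts)
print(limits, within_tolerance(limits, 3.0), within_tolerance(limits, 2.0))
```

Comparing the resulting interval with a preset clinical tolerance, such as the ±2.0 mm and ±3.0 mm thresholds quoted above, turns the statistic into a pass or fail decision for each plane.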
Tsili, Athina C; Ntorkou, Alexandra; Astrakas, Loukas; Xydis, Vasilis; Tsampalas, Stavros; Sofikitis, Nikolaos; Argyropoulou, Maria I
2017-04-01
To evaluate the difference in apparent diffusion coefficient (ADC) measurements at diffusion-weighted (DW) magnetic resonance imaging of differently shaped regions-of-interest (ROIs) in testicular germ cell neoplasms (TGCNs), the diagnostic ability of differently shaped ROIs in differentiating seminomas from nonseminomatous germ cell neoplasms (NSGCNs) and the interobserver variability. Thirty-three TGCNs were retrospectively evaluated. Patients underwent MR examinations, including DWI on a 1.5-T MR system. Two observers measured mean tumor ADCs using four distinct ROI methods: round, square, freehand and multiple small, round ROIs. The intraclass correlation coefficient was analyzed to assess interobserver variability. Statistical analysis was used to compare mean ADC measurements among observers, methods and histologic types. All ROI methods showed excellent interobserver agreement, with excellent correlation (P<0.001). Multiple, small ROIs provided the lowest mean ADC in TGCNs. Seminomas had lower mean ADC compared to NSGCNs for each ROI method (P<0.001). Round ROI proved the most accurate method in characterizing TGCNs. Interobserver agreement in ADC measurement is excellent, irrespective of the ROI shape. Multiple, small round ROIs and round ROI proved the more accurate methods for ADC measurement in the characterization of TGCNs and in the differentiation between seminomas and NSGCNs, respectively. Copyright © 2017 Elsevier B.V. All rights reserved.
Hoffstetter, Patrick; Dornia, Christian; Schäfer, Stephan; Wagner, Merle; Dendl, Lena M; Stroszczynski, Christian; Schreyer, Andreas G
2014-01-01
Rib series (RS) are a special radiological technique to improve the visualization of the bony parts of the chest. The aim of this study was to evaluate the diagnostic accuracy of rib series in minor thorax trauma. Retrospective study of 56 patients who received RS; 39 patients were additionally evaluated by plain chest film (PCF). All patients underwent computed tomography (CT) of the chest. RS and PCF were re-read independently by three radiologists, and the results were compared with CT as the gold standard. Sensitivity, specificity, negative and positive predictive value were calculated. Significance in the differences of findings was determined by the McNemar test, interobserver variability by Cohen's kappa. 56 patients were evaluated (34 men, 22 women, mean age 61 years). In 22 patients one or more rib fractures could be identified by CT. In 18 of these cases (82%) the correct diagnosis was made by RS, in 16 cases (73%) the correct number of involved ribs was detected. These differences were significant (p = 0.03). Specificity was 100%, negative and positive predictive value were 85% and 100%. Kappa values for the interobserver agreement were 0.92-0.96. Sensitivity of PCF was 46% and was significantly lower (p = 0.008) compared to CT. Rib series does not seem to be a useful examination in evaluating minor thorax trauma. CT seems to be the method of choice to detect rib fractures, but the clinical value of the radiological proof has to be discussed and investigated in larger follow-up studies.
Impact of Anatomical Location on Value of CT-PET Co-Registration for Delineation of Lung Tumors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fitton, Isabelle; Netherlands Cancer Institute-Antoni van Leeuwenhoek Hospital, Amsterdam; Steenbakkers, Roel J.H.M.
2008-04-01
Purpose: To derive guidelines for the need to use positron emission tomography (PET) for delineation of the primary tumor (PT) according to its anatomical location in the lung. Methods and Materials: In 22 patients with non-small-cell lung cancer, thoracic X-ray computed tomography (CT) and PET were performed. Eleven radiation oncologists delineated the PT on the CT and on the CT-PET registered scans. The PTs were classified into two groups. In Group I patients, the PT was surrounded by lung or visceral pleura, without venous invasion, without extension to chest wall or the mediastinum over more than one quarter of its surface. In Group II patients, the PT invaded the hilar region, heart, great vessels, pericardium, mediastinum over more than one quarter of its surface and/or associated with atelectasis. A comparison of interobserver variability for each group was performed and expressed as a local standard deviation. Results: The comparison of delineations showed a good reproducibility for Group I, with an average SD of 0.4 cm on CT and an average SD of 0.3 cm on CT-PET (p = 0.1628). There was also a significant improvement with CT-PET for Group II, with an average SD of 1.3 cm on CT and SD of 0.4 cm on CT-PET (p = 0.0003). The improvement was mainly located at the atelectasis/tumor interface. At the tumor/lung and tumor/hilum interfaces, the observer variation was similar with both modalities. Conclusions: Using PET for PT delineation is mandatory to decrease interobserver variability in the hilar region, heart, great vessels, pericardium, mediastinum, and/or the region associated with atelectasis; however it is not essential for delineation of PT surrounded by lung or visceral pleura, without venous invasion or extension to the chest wall.
Image analysis of pubic bone for age estimation in a computed tomography sample.
López-Alcaraz, Manuel; González, Pedro Manuel Garamendi; Aguilera, Inmaculada Alemán; López, Miguel Botella
2015-03-01
Radiology has demonstrated great utility for age estimation, but most of the studies are based on metrical and morphological methods in order to perform an identification profile. A simple image analysis-based method is presented, aimed to correlate the bony tissue ultrastructure with several variables obtained from the grey-level histogram (GLH) of computed tomography (CT) sagittal sections of the pubic symphysis surface and the pubic body, and relating them with age. The CT sample consisted of 169 hospital Digital Imaging and Communications in Medicine (DICOM) archives of known sex and age. The calculated multiple regression models showed a maximum R² of 0.533 for females and 0.726 for males, with a high intra- and inter-observer agreement. The method suggested is considered not only useful for performing an identification profile during virtopsy, but also for application in further studies in order to attach a quantitative correlation for tissue ultrastructure characteristics, without complex and expensive methods beyond image analysis.
Suojärvi, Nora; Sillat, T; Lindfors, N; Koskinen, S K
2015-12-01
Operative treatment of an intra-articular distal radius fracture is one of the most common procedures in orthopedic and hand surgery. The intra- and interobserver agreement of common radiographical measurements of these fractures using cone beam computed tomography (CBCT) and plain radiographs were evaluated. Thirty-seven patients undergoing open reduction and volar fixation for a distal radius fracture were studied. Two radiologists analyzed the preoperative radiographs and CBCT images. Agreement of the measurements was subjected to intra-class correlation coefficient and the Bland-Altman analyses. Plain radiographs provided a slightly poorer level of agreement. For fracture diastasis, excellent intraobserver agreement was achieved for radiographs and good or excellent agreement for CBCT, compared to poor interobserver agreement (ICC 0.334) for radiographs and good interobserver agreement (ICC 0.621) for CBCT images. The Bland-Altman analyses indicated a small mean difference between the measurements but rather large variation using both imaging methods, especially in angular measurements. For most of the measurements, radiographs do well, and may be used in clinical practice. Two different measurements by the same reader or by two different readers can lead to different decisions, and therefore a standardization of the measurements is imperative. More detailed analysis of articular surface needs cross-sectional imaging modalities.
Filli, Lukas; Marcon, Magda; Scholz, Bernhard; Calcagni, Maurizio; Finkenstädt, Tim; Andreisek, Gustav; Guggenberger, Roman
2014-12-01
The aim of this study was to evaluate a prototype correction algorithm to reduce metal artefacts in flat detector computed tomography (FDCT) of scaphoid fixation screws. FDCT has gained interest in imaging small anatomic structures of the appendicular skeleton. Angiographic C-arm systems with flat detectors allow fluoroscopy and FDCT imaging in a one-stop procedure emphasizing their role as an ideal intraoperative imaging tool. However, FDCT imaging can be significantly impaired by artefacts induced by fixation screws. Following ethical board approval, commercially available scaphoid fixation screws were inserted into six cadaveric specimens in order to fix artificially induced scaphoid fractures. FDCT images corrected with the algorithm were compared to uncorrected images both quantitatively and qualitatively by two independent radiologists in terms of artefacts, screw contour, fracture line visibility, bone visibility, and soft tissue definition. Normal distribution of variables was evaluated using the Kolmogorov-Smirnov test. In case of normal distribution, quantitative variables were compared using paired Student's t tests. The Wilcoxon signed-rank test was used for quantitative variables without normal distribution and all qualitative variables. A p value of < 0.05 was considered to indicate statistically significant differences. Metal artefacts were significantly reduced by the correction algorithm (p < 0.001), and the fracture line was more clearly defined (p < 0.01). The inter-observer reliability was "almost perfect" (intra-class correlation coefficient 0.85, p < 0.001). The prototype correction algorithm in FDCT for metal artefacts induced by scaphoid fixation screws may facilitate intra- and postoperative follow-up imaging. Flat detector computed tomography (FDCT) is a helpful imaging tool for scaphoid fixation. The correction algorithm significantly reduces artefacts in FDCT induced by scaphoid fixation screws. 
This may facilitate intra- and postoperative follow-up imaging.
Segmentation precision of abdominal anatomy for MRI-based radiotherapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noel, Camille E.; Zhu, Fan; Lee, Andrew Y.
2014-10-01
The limited soft tissue visualization provided by computed tomography, the standard imaging modality for radiotherapy treatment planning and daily localization, has motivated studies on the use of magnetic resonance imaging (MRI) for better characterization of treatment sites, such as the prostate and head and neck. However, no studies have been conducted on MRI-based segmentation for the abdomen, a site that could greatly benefit from enhanced soft tissue targeting. We investigated the interobserver and intraobserver precision in segmentation of abdominal organs on MR images for treatment planning and localization. Manual segmentation of 8 abdominal organs was performed by 3 independent observers on MR images acquired from 14 healthy subjects. Observers repeated segmentation 4 separate times for each image set. Interobserver and intraobserver contouring precision was assessed by computing 3-dimensional overlap (Dice coefficient [DC]) and distance to agreement (Hausdorff distance [HD]) of segmented organs. The mean and standard deviation of intraobserver and interobserver DC and HD values were DC_intraobserver = 0.89 ± 0.12, HD_intraobserver = 3.6 ± 1.5 mm, DC_interobserver = 0.89 ± 0.15, and HD_interobserver = 3.2 ± 1.4 mm. Overall, metrics indicated good interobserver/intraobserver precision (mean DC > 0.7, mean HD < 4 mm). Results suggest that MRI offers good segmentation precision for abdominal sites. These findings support the utility of MRI for abdominal planning and localization, as emerging MRI technologies, techniques, and onboard imaging devices are beginning to enable MRI-based radiotherapy.
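The two metrics named in the abstract above, the Dice coefficient and the Hausdorff distance, can be sketched as follows. This is a toy illustration under stated assumptions: segmentations are represented as sets of voxel coordinates, the Hausdorff distance is taken over all voxels rather than extracted organ surfaces, and distances are in voxel units. The function names and the two example masks are hypothetical.

```python
import math


def dice(seg_a, seg_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets (brute force)."""
    pts_a, pts_b = list(points_a), list(points_b)
    if not pts_a or not pts_b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(pts_a, pts_b), directed(pts_b, pts_a))


# Hypothetical 2D masks: a 4x4 square and the same square shifted by one voxel.
obs1 = {(x, y) for x in range(0, 4) for y in range(0, 4)}
obs2 = {(x, y) for x in range(1, 5) for y in range(0, 4)}

print("Dice:", dice(obs1, obs2))            # 12 shared voxels out of 16 + 16
print("Hausdorff:", hausdorff(obs1, obs2))  # in voxel units
```

The brute-force Hausdorff computation is quadratic in the number of points, which is fine for a toy example; real contours would normally be reduced to surface points and scaled by the voxel spacing to obtain millimetres.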
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saha, Ashirbani, E-mail: as698@duke.edu; Grimm, La
Purpose: To assess the interobserver variability of readers when outlining breast tumors in MRI, study the reasons behind the variability, and quantify the effect of the variability on algorithmic imaging features extracted from breast MRI. Methods: Four readers annotated breast tumors from the MRI examinations of 50 patients from one institution using a bounding box to indicate a tumor. All of the annotated tumors were biopsy proven cancers. The similarity of bounding boxes was analyzed using Dice coefficients. An automatic tumor segmentation algorithm was used to segment tumors from the readers’ annotations. The segmented tumors were then compared between readers using Dice coefficients as the similarity metric. Cases showing high interobserver variability (average Dice coefficient <0.8) after segmentation were analyzed by a panel of radiologists to identify the reasons causing the low level of agreement. Furthermore, an imaging feature, quantifying tumor and breast tissue enhancement dynamics, was extracted from each segmented tumor for a patient. Pearson’s correlation coefficients were computed between the features for each pair of readers to assess the effect of the annotation on the feature values. Finally, the authors quantified the extent of variation in feature values caused by each of the individual reasons for low agreement. Results: The average agreement between readers in terms of the overlap (Dice coefficient) of the bounding box was 0.60. Automatic segmentation of tumor improved the average Dice coefficient for 92% of the cases to the average value of 0.77. The mean agreement between readers expressed by the correlation coefficient for the imaging feature was 0.96. Conclusions: There is a moderate variability between readers when identifying the rectangular outline of breast tumors on MRI. This variability is alleviated by the automatic segmentation of the tumors. 
Furthermore, the moderate interobserver variability in terms of the bounding box does not translate into a considerable variability in terms of assessment of enhancement dynamics. The authors propose some additional ways to further reduce the interobserver variability.
Lopez, Mandi J; Davis, Kechia M; Jeffrey-Borger, Susan L; Markel, Mark D; Rettenmund, Christy
2009-12-01
To determine interobserver repeatability of measurements on computed tomography (CT) images of lax canine hip joints at different ages and in the presence of degenerative joint disease at maturity. Longitudinal observational investigation. Sibling crossbreed hounds. Pelvic CT was performed at 20, 24, 32, 48, 68, and 104 weeks of age. Measures were performed on 3 contiguous two-dimensional (2D) transverse CT images of both hips at each time point by 3 investigators. Center-edge angle (CEA), horizontal toit externe angle (HTEA), ventral (VASA), dorsal (DASA), and horizontal (HASA) acetabular sector angles, acetabular index (AI), and percent femoral head coverage (CPC) were measured. Interobserver repeatability was quantified with the intraclass correlation coefficient (ICC). Satisfactory repeatability was considered when ICC ≥0.75. DASA, CEA, and CPC were repeatable in all age groups. HASA and HTEA were repeatable for all but 1 time point. At 20 weeks of age, all measures but AI were repeatable, and at 104 weeks of age, DASA, CEA, CPC, and HASA were repeatable. Measures were repeatable in hips with and without degenerative changes with the exceptions of AI and HASA in normal hips and VASA and HTEA in osteoarthritic hips. Most 2D CT measurements examined were repeatable regardless of age or joint disease. Two-dimensional CT measures may augment current techniques for assessing joint changes in lax canine hips.
Bertal, Mileva; Vezzoni, Aldo; Houdellier, Blandine; Bogaerts, Evelien; Stock, Emmelie; Polis, Ingeborgh; Deforce, Dieter; Saunders, Jimmy H; Broeckx, Bart J G
2018-06-02
To describe and evaluate the accuracy, intra- and inter-observer variability of the laxity index (LI), used to quantify hip laxity on stress radiographs obtained with the Vezzoni-modified Badertscher distension device (VMBDD). Stress radiographs of 10 dogs obtained with the VMBDD were measured three times by an experienced observer. Six participants with different backgrounds (two ECVDI residents, two PhD students, two veterinary assistants) followed a short presentation and subsequently performed the measurements four times in two separate sessions. The effect of self-learning, feedback and specialization on the accuracy of the measurements was assessed. While the intra- and inter-observer variability were in agreement with other studies, the results of the experienced observer indicated that the variability can be very low. Neither feedback nor self-learning improved the results. A high degree of experience in radiographic assessment was not necessary to perform the measurements correctly. As the LI measurements were acceptable after a short presentation, they support the use of VMBDD for a complete and correct in-house evaluation of the hip joint by trained clinicians. However, we propose that, in the context of screening, measurements should be performed by a limited number of experienced examiners, to limit the impact of the inter-observer variability. Schattauer GmbH Stuttgart.
Interobserver variability of sonography for prediction of placenta accreta.
Bowman, Zachary S; Eller, Alexandra G; Kennedy, Anne M; Richards, Douglas S; Winter, Thomas C; Woodward, Paula J; Silver, Robert M
2014-12-01
The sensitivity of sonography to predict accreta has been reported as higher than 90%. However, most studies are from single expert investigators. Our objective was to analyze interobserver variability of sonography for prediction of placenta accreta. Patients with previa with and without accreta were ascertained, and images with placental views were collected, deidentified, and placed in random sequence. Three radiologists and 3 maternal-fetal medicine specialists interpreted each study for the presence of accreta and specific findings reported to be associated with its diagnosis. Investigator-specific sensitivity, specificity, and accuracy were calculated. κ statistics were used to assess variability between individuals and types of investigators. A total of 229 sonographic studies from 55 patients with accreta and 56 control patients were examined. Accuracy ranged from 55.9% to 76.4%. Of imaging studies yielding diagnoses, sensitivity ranged from 53.4% to 74.4%, and specificity ranged from 70.8% to 94.8%. Overall interobserver agreement was moderate (mean κ ± SD = 0.47 ± 0.12). κ values between pairs of investigators ranged from 0.32 (fair agreement) to 0.73 (substantial agreement). Average individual agreement ranged from fair (κ = 0.35) to moderate (κ = 0.53). Blinded from clinical data, sonography has significant interobserver variability for the diagnosis of placenta accreta. © 2013 by the American Institute of Ultrasound in Medicine.
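The pairwise κ values reported above can be illustrated with a minimal sketch of Cohen's kappa for two raters, together with the Landis and Koch descriptive bands that match the abstract's "fair", "moderate" and "substantial" wording. The function names and the ratings are hypothetical; the study's own data are not reproduced here.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters assigning categorical labels to the same cases."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / n**2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Descriptive label for kappa following the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical reads (1 = accreta suspected, 0 = not) by two investigators.
reader_a = [1, 1, 1, 1, 0, 0, 0, 0]
reader_b = [1, 1, 1, 0, 0, 0, 0, 0]

k = cohen_kappa(reader_a, reader_b)
print(f"kappa = {k:.2f} ({landis_koch(k)})")
```

Applied to the values in the abstract, `landis_koch(0.32)` gives "fair", `landis_koch(0.47)` gives "moderate" and `landis_koch(0.73)` gives "substantial", consistent with the labels the authors report.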
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wernicke, A. Gabriella, E-mail: gaw9008@med.cornell.ed; Parashar, Bhupesh; Kulidzhanov, Fridon
2011-05-01
Purpose: Accurate detection of radiation-induced fibrosis (RIF) is crucial in management of breast cancer survivors. Tissue compliance meter (TCM) has been validated in musculature. We validate TCM in healthy breast tissue with respect to interobserver and intraobserver variability before applying it in RIF. Methods and Materials: Three medical professionals obtained three consecutive TCM measurements in each of the four quadrants of the right and left breasts of 40 women with no breast disease or surgical intervention. The intraclass correlation coefficient (ICC) assessed interobserver variability. The paired t test and Pearson correlation coefficient (r) were used to assess intraobserver variability within each rater. Results: The median age was 45 years (range, 24-68 years). The median bra size was 35C (range, 32A-40DD). Of the participants, 27 were white (67%), 4 black (10%), 5 Asian (13%), and 4 Hispanic (10%). ICCs indicated excellent interrater reliability (low interobserver variability) among the three raters, by breast and quadrant (all ICC ≥0.99). The paired t test and Pearson correlation coefficient both indicated low intraobserver variability within each rater (right vs. left breast), stratified by quadrant (all r ≥ 0.94, p < 0.0001). Conclusions: The interobserver and intraobserver variability is small using TCM in healthy mammary tissue. We are now embarking on a prospective study using TCM in women with breast cancer at risk of developing RIF that may guide early detection, timely therapeutic intervention, and assessment of success of therapy for RIF.
Verma, Nupur; Hippe, Daniel S; Robinson, Jeffrey D
2016-12-01
Peer review is an important and necessary part of radiology. There are several options to perform the peer review process. This study examines the reproducibility of peer review by comparing two scoring systems. American Board of Radiology-certified radiologists from various practice environments and subspecialties were recruited to score deidentified examinations on a web-based PACS with two scoring systems, RADPEER and Cleareview. Quantitative analysis of the scores was performed for interrater agreement. Interobserver variability was high for both the RADPEER and Cleareview scoring systems. The interobserver correlations (kappa values) were 0.17-0.23 for RADPEER and 0.10-0.16 for Cleareview. Interrater correlation was not statistically significantly different when comparing the RADPEER and Cleareview systems (p = 0.07-0.27). The kappa values were low for the Cleareview subscores when we evaluated for missed findings (0.26), satisfaction of search (0.17), and inadequate interpretation of findings (0.12). Our study confirms the previous report of low interobserver correlation when using the peer review process. There was low interobserver agreement seen when using both the RADPEER and the Cleareview scoring systems.
Caning, M M; Thisted, D L A; Amer-Wählin, I; Laier, G H; Krebs, L
2018-05-17
To examine interobserver agreement in intrapartum cardiotocography (CTG) classification in women undergoing trial of labor after a cesarean section (TOLAC) at term with or without complete uterine rupture. Nineteen blinded and independent Danish obstetricians assessed CTG tracings from 47 women (174 individual pages) with a complete uterine rupture during TOLAC and 37 women (133 individual pages) with no uterine rupture during TOLAC. Individual pages with CTG tracings lasting at least 20 min were evaluated by three different assessors and counted as an individual case. The tracings were analyzed according to the modified version of the Federation of Gynaecology and Obstetrics (FIGO) guidelines elaborated for the use of STAN (ST-analysis). Occurrence of defined abnormalities was recorded and the tracings were classified as normal, suspicious, pathological, or preterminal. The interobserver agreement was evaluated using Fleiss' kappa. Agreement on classification of a preterminal CTG was almost perfect. The interobserver agreement on normal, suspicious or pathological CTG was moderate to substantial. Regarding the presence of severe variable decelerations, the agreement was moderate. No statistical difference was found in the interobserver agreement between classification of tracings from women undergoing TOLAC with and without complete uterine rupture. The interobserver agreement on classification of CTG tracings from high-risk deliveries during TOLAC is best for assessment of a preterminal CTG and the poorest for the identification of severe variable decelerations.
Broekstra, Dieuwke C; Lanting, Rosanne; Werker, Paul M N; van den Heuvel, Edwin R
2015-08-01
Dupuytren disease (DD) is a fibrosing disease affecting the palmar aponeurosis, and is mostly treated by surgery based on measurement of severity of flexion contracture of the fingers. Literature concerning the measurement reliability is scarce. This study aimed to determine the intra- and inter-observer agreement of four variables for diagnosing DD, determining severity of contracture, and disease extent. One of them is a new measurement on the area of nodules and cords for measuring the disease extent in early disease stages. An agreement study (n = 54) was performed by two trained investigators. Agreement was calculated per finger, based on an intraclass correlation coefficient (ICC) using a latent variable model on subjects for diagnosis and Tubiana stage. For total passive extension deficit (TPED) and the area of nodules and cords, agreement was calculated with an ICC using a one-way random effects model with subject as random effect. Inter-observer agreement was very good for diagnosing DD (ICC: 95.5%-99.9%) and good to very good for classifying Tubiana stage (ICC: 73.5%-94.9%). Agreements for area and TPED were moderate (middle finger) to very good (ICC: 48.4%-98.6% and 45.0%-99.5%, respectively). Intra-observer agreement was slightly higher on average than inter-observer agreement. Overall, the intra- and inter-observer agreement in diagnosing DD, and determining the severity of flexion contracture is high. Also, the newly introduced variable area of nodules and cords has high intra- and inter-observer agreement, indicating that it is suitable to measure disease extent. Copyright © 2015 Elsevier Ltd. All rights reserved.
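The abstract above mentions an ICC from a one-way random effects model with subject as the random effect. A minimal sketch of that estimator, ICC(1,1) = (MSB − MSW) / (MSB + (k − 1)·MSW), is given below. It assumes a balanced design (every subject measured the same number of times); the function name and the measurement values are hypothetical, and the latent variable model the authors used for diagnosis and Tubiana stage is not covered.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random effects ICC(1,1) for a balanced table.

    ratings: list of rows, one row per subject, each row holding k measurements.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least 2 subjects")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of measurements")

    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(ratings, row_means) for x in row) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    if denom == 0:
        raise ValueError("no variance in the data; ICC is undefined")
    return (msb - msw) / denom


# Hypothetical extension deficits (degrees) for four fingers, two observers each.
tped = [[10, 12], [20, 19], [30, 31], [40, 38]]
icc = icc_oneway(tped)
print(f"ICC(1,1) = {icc:.3f}")
```

Because the one-way model does not separate a systematic observer effect from random error, any consistent offset between observers lowers this ICC; two-way models treat that offset differently.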
Kerkhof, M; Hagenbeek, R E; van der Kallen, B F W; Lycklama À Nijeholt, G J; Dirven, L; Taphoorn, M J B; Vos, M J
2016-10-01
Conventional magnetic resonance imaging (MRI) has limited value for differentiation of true tumor progression and pseudoprogression in treated glioblastoma multiforme (GBM). Perfusion weighted imaging (PWI) may be helpful in the differentiation of these two phenomena. Here interobserver variability in routine radiological evaluation of GBM patients is assessed using MRI, including PWI. Three experienced neuroradiologists evaluated MR scans of 28 GBM patients during temozolomide chemoradiotherapy at three time points: preoperative (MR1) and postoperative (MR2) MR scan and the follow-up MR scan after three cycles of adjuvant temozolomide (MR3). Tumor size was measured both on T1 post-contrast and T2 weighted images according to the Response Assessment in Neuro-Oncology criteria. PW images of MR3 were evaluated by visual inspection of relative cerebral blood volume (rCBV) color maps and by quantitative rCBV measurements of enhancing areas with highest rCBV. Image interpretability of PW images was also scored. Finally, the neuroradiologists gave a conclusion on tumor status, based on the interpretation of both T1 and T2 weighted images (MR1, MR2 and MR3) in combination with PWI (MR3). Interobserver agreement on visual interpretation of rCBV maps was good (κ = 0.63) but poor on quantitative rCBV measurements and on interpretability of perfusion images (intraclass correlation coefficient 0.37 and κ = 0.23, respectively). Interobserver agreement on the overall conclusion of tumor status was moderate (κ = 0.48). Interobserver agreement on the visual interpretation of PWI color maps was good. However, overall interpretation of MR scans (using both conventional and PW images) showed considerable interobserver variability. Therefore, caution should be applied when interpreting MRI results during chemoradiation therapy. © 2016 EAN.
Syed, Mushabbar A; Oshinski, John N; Kitchen, Charles; Ali, Arshad; Charnigo, Richard J; Quyyumi, Arshed A
2009-08-01
Carotid MRI measurements are increasingly being employed in research studies for atherosclerosis imaging. The majority of carotid imaging studies use 1.5 T MRI. Our objective was to investigate intra-observer and inter-observer variability in carotid measurements using high resolution 3 T MRI. We performed 3 T carotid MRI on 10 patients (age 56 ± 8 years, 7 male) with atherosclerosis risk factors and ultrasound intima-media thickness ≥0.6 mm. A total of 20 transverse images of both right and left carotid arteries were acquired using T2 weighted black-blood sequence. The lumen and outer wall of the common carotid and internal carotid arteries were manually traced; vessel wall area, vessel wall volume, and average wall thickness measurements were then assessed for intra-observer and inter-observer variability. Pearson and intraclass correlations were used in these assessments, along with Bland-Altman plots. For inter-observer variability, Pearson correlations ranged from 0.936 to 0.996 and intraclass correlations from 0.927 to 0.991. For intra-observer variability, Pearson correlations ranged from 0.934 to 0.954 and intraclass correlations from 0.831 to 0.948. Calculations showed that inter-observer variability and other sources of error would inflate sample size requirements for a clinical trial by no more than 7.9%, indicating that 3 T MRI is nearly optimal in this respect. In patients with subclinical atherosclerosis, 3 T carotid MRI measurements are highly reproducible and have important implications for clinical trial design.
Online Studies on Variation in Orthopedic Surgery: Computed Tomography in MPEG4 Versus DICOM Format.
Mellema, Jos J; Mallee, Wouter H; Guitton, Thierry G; van Dijk, C Niek; Ring, David; Doornberg, Job N
2017-10-01
The purpose of this study was to compare the observer participation and satisfaction as well as interobserver reliability between two online platforms, Science of Variation Group (SOVG) and Traumaplatform Study Collaborative, for the evaluation of complex tibial plateau fractures using computed tomography in MPEG4 and DICOM format. A total of 143 observers started with the online evaluation of 15 complex tibial plateau fractures via either the SOVG or Traumaplatform Study Collaborative websites using MPEG4 videos or a DICOM viewer, respectively. Observers were asked to indicate the absence or presence of four tibial plateau fracture characteristics and to rate their satisfaction with the evaluation as provided by the respective online platforms. The observer participation rate was significantly higher in the SOVG (MPEG4 video) group compared to that in the Traumaplatform Study Collaborative (DICOM viewer) group (75 and 43%, respectively; P < 0.001). The median observer satisfaction with the online evaluation was seven (range, 0-10) using MPEG4 video compared to six (range, 1-9) using DICOM viewer (P = 0.11). The interobserver reliability for recognition of fracture characteristics in complex tibial plateau fractures was higher for the evaluation using MPEG4 video. In conclusion, observer participation and interobserver reliability for the characterization of tibial plateau fractures was greater with MPEG4 videos than with a standard DICOM viewer, while there was no difference in observer satisfaction. Future reliability studies should account for the method of delivering images.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balik, Salim; Weiss, Elisabeth; Jan, Nuzhat
2013-06-01
Purpose: To evaluate 2 deformable image registration (DIR) algorithms for the purpose of contour mapping to support image-guided adaptive radiation therapy with 4-dimensional cone-beam CT (4DCBCT). Methods and Materials: One planning 4D fan-beam CT (4DFBCT) and 7 weekly 4DCBCT scans were acquired for 10 locally advanced non-small cell lung cancer patients. The gross tumor volume was delineated by a physician in all 4D images. End-of-inspiration phase planning 4DFBCT was registered to the corresponding phase in weekly 4DCBCT images for day-to-day registrations. For phase-to-phase registration, the end-of-inspiration phase from each 4D image was registered to the end-of-expiration phase. Two DIR algorithms, small deformation inverse consistent linear elastic (SICLE) and Insight Toolkit diffeomorphic demons (DEMONS), were evaluated. Physician-delineated contours were compared with the warped contours by using the Dice similarity coefficient (DSC), average symmetric distance, and false-positive and false-negative indices. The DIR results are compared with rigid registration of tumor. Results: For day-to-day registrations, the mean DSC was 0.75 ± 0.09 with SICLE, 0.70 ± 0.12 with DEMONS, 0.66 ± 0.12 with rigid-tumor registration, and 0.60 ± 0.14 with rigid-bone registration. Results were comparable to intraobserver variability calculated from phase-to-phase registrations as well as measured interobserver variation for 1 patient. SICLE and DEMONS, when compared with rigid-bone (4.1 mm) and rigid-tumor (3.6 mm) registration, respectively reduced the average symmetric distance to 2.6 and 3.3 mm. On average, SICLE and DEMONS increased the DSC to 0.80 and 0.79, respectively, compared with rigid-tumor (0.78) registrations for 4DCBCT phase-to-phase registrations. Conclusions: Deformable image registration achieved comparable accuracy to reported interobserver delineation variability and higher accuracy than rigid-tumor registration. 
Deformable image registration performance varied with the algorithm and the patient.
Ueno, Yoshiko; Maeda, Tetsuo; Tanaka, Utaru; Tanimura, Kenji; Kitajima, Kazuhiro; Suenaga, Yuko; Takahashi, Satoru; Yamada, Hideto; Sugimura, Kazuro
2016-09-01
To evaluate the interobserver variability and diagnostic performance of a developed magnetic resonance imaging (MRI)-based scoring system for invasive placenta previa. Prenatal MR images of 70 women were retrospectively evaluated, 18 of whom were diagnosed with invasive placenta. The six MR features (dark band on T2-weighted images, intraplacental abnormal vascularity, placental bulge, heterogeneous placenta, myometrial thinning, and placental protrusion sign) were scored on 5-point Likert scale separately, and the cumulative radiological score (CRS) was defined as the sum of each score. Two more experienced radiologists (readers A and B) and two less experienced residents (readers C and D) calculated the CRS. Interobserver variability was assessed by measuring the intraclass correlation coefficient. Diagnostic performance was evaluated by means of receiver operating characteristic (ROC) analysis. Interobserver variability for CRS was excellent for the more experienced radiologists (0.85), and good for all readers (0.72) and the less experienced residents (0.66). The area under the ROC curve (Az) and accuracy (Acc) for CRS were significantly higher or equivalent to those of other MR features for all readers (Az and Acc for reader A; CRS, 0.92, 91.4%; intraplacental T2 dark band, 0.83, P = 0.009, 81.4%, P = 0.03; intraplacental abnormal vascularity, 0.9, P = 0.3, 90.0%, P = 1.00; placental bulge, 0.81, P = 0.0008, 80.0%, P = 0.02; heterogeneous placenta, 0.85, P = 0.11, 74.3%, P = 0.002; myometrial thinning, 0.84, P = 0.06, 60.0%, P < 0.0001; placental protrusion sign, 0.81, P = 0.01, 81.4%, P = 0.26). This developed MRI-based scoring system demonstrated excellent or good interobserver variability, and good diagnostic performance for invasive placenta previa. J. Magn. Reson. Imaging 2016;44:573-583. © 2016 International Society for Magnetic Resonance in Medicine.
Lai, Isabel; Mak, Heather; Lai, Gilda; Yu, Marco; Lam, Dennis S C; Leung, Christopher K S
2013-06-01
To investigate the use of swept-source optical coherence tomography (OCT) for measuring the area and degree of peripheral anterior synechia (PAS) involvement in patients with angle-closure glaucoma. Cross-sectional study. Twenty-three eyes with PAS (detected by indentation gonioscopy) from 20 patients with angle-closure glaucoma (20 eyes had primary angle-closure glaucoma and 3 eyes had angle-closure glaucoma secondary to chronic anterior uveitis [n = 2] and Axenfeld-Rieger syndrome [n = 1]). The anterior chamber angles were evaluated with indentation gonioscopy and imaged by swept-source OCT (Casia OCT, Tomey, Nagoya, Japan) in room light and in the dark using the "angle analysis" protocol, which was composed of 128 radial B-scans each with 512 A-scans (16-mm scan length). The area and degree of PAS involvement were measured in each eye after manual detection of the scleral spur and the anterior irido-angle adhesion by 2 masked observers. The interobserver variability of the PAS measurements was calculated. The agreement of PAS assessment by gonioscopy and OCT, the area and the degree of PAS involvement, and the intraclass correlation coefficient (ICC) of interobserver PAS measurements. The area of PAS (mean ± standard deviation) was 20.8 ± 16.9 mm(2) (range, 3.9-74.9 mm(2)), and the degree of PAS involvement was 186.5 ± 79.9 degrees (range, 42-314 degrees). There was no difference in the area of PAS (P = 0.90) and the degree of PAS involvement (P = 0.95) between images obtained in room light and in the dark. The interobserver ICCs were 0.99 (95% confidence interval [CI], 0.98-1.00) for the area of PAS and 0.99 (95% CI, 0.97-1.00) for the degree of PAS involvement. There was good agreement of PAS assessment between gonioscopy and OCT images (kappa = 0.79; 95% CI, 0.67-0.91). 
Swept-source OCT allows visualization and reproducible measurements of the area and degree of PAS involvement, providing a new paradigm for evaluation of PAS progression and risk assessment for development of angle-closure glaucoma. The author(s) have no proprietary or commercial interest in any materials discussed in this article. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
The Impact of Computed Tomography on Decision Making in Tibial Plateau Fractures.
Castiglia, Marcello Teixeira; Nogueira-Barbosa, Marcello Henrique; Messias, Andre Marcio Vieira; Salim, Rodrigo; Fogagnolo, Fabricio; Schatzker, Joseph; Kfuri, Mauricio
2018-02-14
Schatzker introduced one of the most used classification systems for tibial plateau fractures, based on plain radiographs. Computed tomography brought to attention the importance of coronal plane-oriented fractures. The goal of our study was to determine if the addition of computed tomography would affect the decision making of surgeons who usually use the Schatzker classification to assess tibial plateau fractures. Image studies of 70 patients who sustained tibial plateau fractures were uploaded to a dedicated homepage. Every patient was linked to a folder which contained two radiographic projections (anteroposterior and lateral), three interactive videos of computed tomography (axial, sagittal, and coronal), and eight pictures depicting tridimensional reconstructions of the tibial plateau. Ten attending orthopaedic surgeons, who were blinded to the cases, were granted access to the homepage and assessed each set of images in two different rounds, separated from each other by an interval of 2 weeks. Each case was evaluated in three steps, where surgeons had access, respectively, to radiographs, two-dimensional videos of computed tomography, and three-dimensional reconstruction images. After every step, surgeons were asked how they would classify the case using the Schatzker system and which surgical approaches would be appropriate. We evaluated the inter- and intraobserver reliability of the Schatzker classification using the Kappa concordance coefficient, as well as the impact of computed tomography on the decision making regarding the surgical approach for each case, by using the chi-square test and likelihood ratio. The interobserver concordance kappa coefficients after each assessment step were, respectively, 0.58, 0.62, and 0.64. For the intraobserver analysis, the coefficients were, respectively, 0.76, 0.75, and 0.78. Computed tomography changed the surgical approach selection for Schatzker types II, V, and VI (p < 0.01).
The addition of computed tomography scans to plain radiographs improved the interobserver reliability of Schatzker classification. Computed tomography had a statistically significant impact in the selection of surgical approaches for the lateral tibial plateau. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Moritomo, Hisao; Arimitsu, Sayuri; Kubo, Nobuyuki; Masatomi, Takashi; Yukioka, Masao
2015-02-01
To classify triangular fibrocartilage complex (TFCC) foveal lesions on the basis of computed tomography (CT) arthrography using a radial plane view and to correlate the CT arthrography results with surgical findings. We also tested the interobserver and intra-observer reliability of the radial plane view. A total of 33 patients with a suspected TFCC foveal tear who had undergone wrist CT arthrography and subsequent surgical exploration were enrolled. We classified the configurations of TFCC foveal lesions into 5 types on the basis of CT arthrography with the radial plane view in which the image slices rotate clockwise centered on the ulnar styloid process. Sensitivity, specificity, and positive predictive values were calculated for each type of foveal lesion in CT arthrography to detect foveal tears. We determined interobserver and intra-observer agreements using kappa statistics. We also compared the accuracy of the radial plane views with that of the coronal plane views. Among the tear types on CT arthrography, type 3, a roundish defect at the fovea, and type 4, a large defect at the overall ulnar insertion, had high specificity and positive predictive value for the detection of foveal tears. Specificity and positive predictive values were 90% and 89% for type 3 and 100% and 100% for type 4, respectively, whereas sensitivity was 35% for type 3 and 22% for type 4. Interobserver and intra-observer agreement was substantial and almost perfect, respectively. The radial plane view identified foveal lesions of the palmar and dorsal radioulnar ligaments separately, but accuracy results with the radial plane views were not statistically different from those with the coronal plane views. Computed tomography arthrography with a radial plane view exhibited enhanced specificity and positive predictive value when a type 3 or 4 lesion was identified in the detection of a TFCC foveal tear compared with historical controls. Diagnostic II.
Copyright © 2015 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
Bonin, Glen; Lauer, Susanne K; Guzman, David Sanchez-Migallon; Nevarez, Javier; Tully, Thomas N; Hosgood, Giselle; Gaschen, Lorrie
2009-06-01
Information on perching-joint angles in birds is limited. Joint immobilization in a physiologic perching angle has the potential to result more often in complete restoration of limb function. We evaluated perching-joint angles in 10 healthy cockatiels (Nymphicus hollandicus), 10 Hispaniolan Amazons (Amazona ventralis), and 9 barred owls (Strix varia) and determined intra- and interobserver variability for goniometric measurements in 2 different radiographic projections. Intra- and interobserver variation was less than 7% for all stifle and intertarsal joint measurements but frequently exceeded 10% for the hip-joint measurements. Hip, stifle, and intertarsal perching angles differed significantly among cockatiels, Hispaniolan Amazon parrots, and barred owls. The accuracy of measurements performed on straight lateral radiographic projections with superimposed limbs was not consistently superior to measurements on oblique projections with a slightly rotated pelvis. Stifle and intertarsal joint angles can be measured on radiographs by different observers with acceptable variability, but intra- and interobserver variability for hip-joint-angle measurements is higher.
NASA Astrophysics Data System (ADS)
Forsberg, Daniel; Lundström, Claes; Andersson, Mats; Vavruch, Ludvig; Tropp, Hans; Knutsson, Hans
2013-03-01
Reliable measurements of spinal deformities in idiopathic scoliosis are vital, since they are used for assessing the degree of scoliosis, deciding upon treatment and monitoring the progression of the disease. However, commonly used two-dimensional methods (e.g. the Cobb angle) do not fully capture the three-dimensional deformity at hand in scoliosis, of which axial vertebral rotation (AVR) is considered to be of great importance. There are manual methods for measuring the AVR, but they are often time-consuming and associated with high intra- and inter-observer variability. In this paper, we present a fully automatic method for estimating the AVR in images from computed tomography. The proposed method is evaluated on four scoliotic patients with 17 vertebrae each and compared with manual measurements performed by three observers using the standard method by Aaro-Dahlborn. The comparison shows that the difference in measured AVR between automatic and manual measurements is on the same level as the inter-observer difference. This is further supported by a high intraclass correlation coefficient (0.971-0.979), obtained when comparing the automatic measurements with the manual measurements of each observer. Hence, the provided results and the computational performance, only requiring approximately 10 to 15 s for processing an entire volume, demonstrate the potential clinical value of the proposed method.
Carlton, Joshua A; Maxwell, Adam W; Bauer, Lyndsey B; McElroy, Sara M; Layfield, Lester J; Ahsan, Humera; Agarwal, Ajay
2017-06-01
Background and purpose: In patients with squamous cell carcinoma of the head and neck (HNSCC), extracapsular spread (ECS) of metastases in cervical lymph nodes affects prognosis and therapy. We assessed the accuracy of intravenous contrast-enhanced computed tomography (CT) and the utility of imaging criteria for preoperative detection of ECS in metastatic cervical lymph nodes in patients with HNSCC. Materials and methods: Preoperative intravenous contrast-enhanced neck CT images of 93 patients with histopathological HNSCC metastatic nodes were retrospectively assessed by two neuroradiologists for ECS status and ECS imaging criteria. Radiological assessments were compared with histopathological assessments of neck dissection specimens, and interobserver agreement on ECS status and ECS imaging criteria was measured. Results: Sensitivity, specificity, positive predictive value, and accuracy for overall ECS assessment were 57%, 81%, 82% and 67% for observer 1, and 66%, 76%, 80% and 70% for observer 2, respectively. Correlating three or more ECS imaging criteria with histopathological ECS increased specificity and positive predictive value, but decreased sensitivity and accuracy. Interobserver agreement for overall ECS assessment demonstrated a kappa of 0.59. Central necrosis had the highest kappa of 0.74. Conclusion: CT has moderate specificity for ECS assessment in HNSCC metastatic cervical nodes. Identifying three or more ECS imaging criteria raises specificity and positive predictive value; therefore, preoperative identification of multiple criteria may be clinically useful. Interobserver agreement is moderate for overall ECS assessment and substantial for central necrosis. Other ECS CT criteria had moderate agreement at best and therefore should not be used individually as criteria for detecting ECS by CT.
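The four accuracy figures reported per observer above follow from a 2x2 table of CT reading against histopathology. A minimal Python sketch of that arithmetic is given here; the function name and the counts in the usage line are hypothetical illustrations, not data from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute sensitivity, specificity, PPV and accuracy from 2x2 counts.

    tp/fp/tn/fn are counts of CT readings judged against histopathology
    (the reference standard). All counts used below are hypothetical.
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("a denominator is zero; metric undefined")
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among diseased
        "specificity": tn / (tn + fp),   # true negatives among non-diseased
        "ppv": tp / (tp + fp),           # true positives among positive calls
        "accuracy": (tp + tn) / total,   # all correct calls
    }


# Hypothetical counts, chosen only to exercise the function.
example = diagnostic_metrics(tp=40, fp=20, tn=30, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because sensitivity and specificity use different denominators, a stricter rule for calling ECS (for example, requiring three or more criteria) can raise specificity and PPV while lowering sensitivity, which is the pattern the abstract describes.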
Schimek-Jasch, Tanja; Troost, Esther G C; Rücker, Gerta; Prokic, Vesna; Avlar, Melanie; Duncker-Rohr, Viola; Mix, Michael; Doll, Christian; Grosu, Anca-Ligia; Nestle, Ursula
2015-06-01
Interobserver variability in the definition of target volumes (TVs) is a well-known confounding factor in (multicentre) clinical studies employing radiotherapy. Therefore, detailed contouring guidelines are provided in the prospective randomised multicentre PET-Plan (NCT00697333) clinical trial protocol. This trial compares strictly FDG-PET-based TV delineation with conventional TV delineation in patients with locally advanced non-small cell lung cancer (NSCLC). Despite detailed contouring guidelines, their interpretation by different radiation oncologists can vary considerably, leading to undesirable discrepancies in TV delineation. Considering this, as part of the PET-Plan study quality assurance (QA), a contouring dummy run (DR) consisting of two phases was performed to analyse the interobserver variability before and after teaching. In the first phase of the DR (DR1), radiation oncologists from 14 study centres were asked to delineate TVs as defined by the study protocol (gross TV, GTV; and two clinical TVs, CTV-A and CTV-B) in a test patient. A teaching session was held at a study group meeting, including a discussion of the results focussing on discordances in comparison to the per-protocol solution. Subsequently, the second phase of the DR (DR2) was performed in order to evaluate the impact of teaching. Teaching after DR1 resulted in a reduction of absolute TVs in DR2, as well as in better concordance of TVs. The Overall Kappa(κ) indices increased from 0.63 to 0.71 (GTV), 0.60 to 0.65 (CTV-A) and from 0.59 to 0.63 (CTV-B), demonstrating improvements in overall interobserver agreement. Contouring DRs and study group meetings as part of QA in multicentre clinical trials help to identify misinterpretations of per-protocol TV delineation. 
Teaching the correct interpretation of protocol contouring guidelines leads to a reduction in interobserver variability and to more consistent contouring, which should consequently improve the validity of the overall study results.
Qiao, Jun; Xu, Leilei; Zhu, Zezhang; Zhu, Feng; Liu, Zhen; Qian, Bangping; Qiu, Yong
2014-10-11
Scoliogauge has been developed for the measurement of the angle of trunk rotation (ATR) on iPhone smartphones. This study aimed to evaluate the reliability of the smartphone-aided ATR measurement method and to compare its reliability with that of the manual method. Sixty-four adolescent idiopathic scoliosis (AIS) patients with a single thoracic or lumbar curve participated in this study. Of these patients, thirty-two had main thoracic scoliosis while the other thirty-two had main thoracolumbar/lumbar scoliosis. Two spine surgeons performed the measurements with Scoliometer and Scoliogauge. The Scoliogauge measurements were conducted on an iPhone 4 smartphone. The intraclass correlation coefficient (ICC) 2-way mixed model on absolute agreement was used to analyze the reliability categorized according to regions: thoracic or lumbar, and Cobb angles: <20 degrees and >40 degrees. ICC < 0.40 is considered as poor, 0.40-0.59 as fair, 0.60-0.74 as good, and 0.75-1.00 as excellent. The overall intraobserver ICC was 0.954 and the overall interobserver ICC was 0.943 for the scoliometer set, whereas the intraobserver ICC was 0.965 and the interobserver ICC was 0.964 for the scoliogauge set. Both the intraobserver and interobserver ICCs reached the excellent value in the 2 sets for both observers. The mean Cobb angle of thoracic curves in patients with main thoracic scoliosis was similar to that of lumbar curves in those with main thoracolumbar/lumbar scoliosis (35.7 degrees vs. 36.1 degrees). The intraobserver and interobserver reliability was similar between the two groups (thoracic vs. lumbar) in the 2 sets. There were 21 patients with Cobb angles <20 degrees and 20 patients with Cobb angles >40 degrees. The intraobserver and interobserver reliability was better in the severe curve (>40 degrees) group.
Smartphone-aided measurement of ATR showed excellent reliability. The reliability of measurement with either scoliometer or scoliogauge could be influenced by Cobb angle, with better reliability for curves with larger Cobb angles.
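The qualitative ICC bands quoted in this abstract (poor, fair, good, excellent) can be written as a small lookup. The sketch below is illustrative only: the function name is hypothetical, and the handling of values that fall between the published band edges (for example 0.595) is an assumption, since the abstract gives the bands to two decimals.

```python
def classify_icc(icc: float) -> str:
    """Map an ICC to the bands quoted in the abstract above.

    Bands: < 0.40 poor, 0.40-0.59 fair, 0.60-0.74 good, 0.75-1.00 excellent.
    Values between band edges are assigned to the lower band (assumption).
    """
    if icc > 1.0:
        raise ValueError("ICC cannot exceed 1")
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# ICC values as reported in the abstract.
reported = {
    "scoliometer, intraobserver": 0.954,
    "scoliometer, interobserver": 0.943,
    "scoliogauge, intraobserver": 0.965,
    "scoliogauge, interobserver": 0.964,
}
for label, value in reported.items():
    print(f"{label}: {value:.3f} -> {classify_icc(value)}")
```

All four reported values land in the excellent band, consistent with the abstract's conclusion.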
Tomizawa, Yutaka; Iyer, Prasad G; Wongkeesong, Louis M; Buttar, Navtej S; Lutzke, Lori S; Wu, Tsung-Teh; Wang, Kenneth K
2013-01-01
AIM: To investigate a classification of endocytoscopy (ECS) images in Barrett’s esophagus (BE) and evaluate its diagnostic performance and interobserver variability. METHODS: ECS was applied to surveillance endoscopic mucosal resection (EMR) specimens of BE ex-vivo. The mucosal surface of specimen was stained with 1% methylene blue and surveyed with a catheter-type endocytoscope. We selected still images that were most representative of the endoscopically suspect lesion and matched with the final histopathological diagnosis to accomplish accurate correlation. The diagnostic performance and inter-observer variability of the new classification scheme were assessed in a blinded fashion by physicians with expertise in both BE and ECS and inexperienced physicians with no prior exposure to ECS. RESULTS: Three staff physicians and 22 gastroenterology fellows classified eight randomly assigned unknown still ECS pictures (two images per each classification) into one of four histopathologic categories as follows: (1) BEC1-squamous epithelium; (2) BEC2-BE without dysplasia; (3) BEC3-BE with dysplasia; and (4) BEC4-esophageal adenocarcinoma (EAC) in BE. Accuracy of diagnosis in staff physicians and clinical fellows were, respectively, 100% and 99.4% for BEC1, 95.8% and 83.0% for BEC2, 91.7% and 83.0% for BEC3, and 95.8% and 98.3% for BEC4. Interobserver agreement of the faculty physicians and fellows in classifying each category were 0.932 and 0.897, respectively. CONCLUSION: This is the first study to investigate classification system of ECS in BE. This ex-vivo pilot study demonstrated acceptable diagnostic accuracy and excellent interobserver agreement. PMID:24379583
Inter-observer variability within BI-RADS and RANZCR mammographic density assessment schemes
NASA Astrophysics Data System (ADS)
Damases, Christine N.; Mello-Thoms, Claudia; McEntee, Mark F.
2016-03-01
This study compares variability associated with two visual mammographic density (MD) assessment methods using two separate samples of radiologists. The image test set comprised images obtained from 20 women (age 42-89 years). The images were assessed for their MD by twenty American Board of Radiology (ABR) examiners and twenty-six radiologists registered with the Royal Australian and New Zealand College of Radiologists (RANZCR). Images were assessed using the same technology and conditions; however, the ABR radiologists used BI-RADS and the RANZCR radiologists used the RANZCR breast density synoptic. Both scales use a 4-point assessment. The images were then grouped as low- and high-density, with low including BI-RADS 1 and 2 or RANZCR 1 and 2 and high including BI-RADS 3 and 4 or RANZCR 3 and 4. Four-point BI-RADS and RANZCR showed no or negligible correlation (ρ = -0.029, p < 0.859). The average inter-observer agreement on the BI-RADS scale had a kappa of 0.565 [95% CI = 0.519-0.610] and ranged between 0.328 and 0.669, while the inter-observer agreement using the RANZCR scale had a kappa of 0.360 [95% CI = 0.308-0.412] and a range of 0.078-0.499. Our findings show a wider range of inter-observer variability among RANZCR-registered radiologists than among the ABR examiners.
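One common way to obtain an average inter-observer kappa and its range is to compute Cohen's kappa for every pair of readers and then summarise the pairwise values. The sketch below assumes unweighted Cohen's kappa; the abstract does not state which kappa variant or averaging scheme was used, and the ratings shown are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b) -> float:
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("both raters must score the same non-empty case list")
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


def pairwise_kappas(ratings) -> list:
    """Kappa for every pair of raters; ratings is a list of per-rater lists."""
    return [cohen_kappa(a, b) for a, b in combinations(ratings, 2)]


# Hypothetical 4-point density scores from three readers on eight images.
readers = [
    [1, 2, 2, 3, 4, 4, 1, 3],
    [1, 2, 3, 3, 4, 4, 2, 3],
    [2, 2, 2, 3, 4, 3, 1, 3],
]
kappas = pairwise_kappas(readers)
print("mean kappa:", sum(kappas) / len(kappas))
print("range:", min(kappas), "to", max(kappas))
```

Reporting the minimum and maximum alongside the mean, as the abstract does, shows how far individual reader pairs depart from the average.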
Gabriele, Alex; Marco, Valeria; Gatto, Laura; Paoletti, Giulia; Di Vito, Luca; Castriota, Fausto; Romagnoli, Enrico; Ricciardi, Andrea; Prati, Francesco
2014-10-01
The optical coherence tomography (OCT) evaluation of the stent anatomy requires the inspection of sequential cross sections (CS). However, stent coils cannot be appreciated in the conventional format, as the OCT CS simply display stent struts, which are poorly representative of the stent architecture. The aim of the present study was to validate new software (Carpet View), which unfolds the stented segment, reconstructing it as an open structure and displaying the stent meshwork. Twenty-one patients were studied with frequency domain OCT after the deployment of different stents: seven bio-absorbable scaffolds (Dream), seven bare metal stents (Vision/Multilink8), and seven drug eluting stents (Cre8). Conventional CS reconstructions were post-processed with the Carpet View software and analyzed by the same reader twice (intra-observer variability) and by two different readers (inter-observer variability). A small average difference in the number of all struts was obtained with the two methods (conventional vs carpet view reconstruction). Using the carpet view, high intra-observer and inter-observer correlations were found for the number of struts obtained in each coil. The Pearson correlation values were 0.98 (p = 0.0001) and 0.96 (p = 0.0001), respectively. The same number of coils was found when analyses were repeated by the same reader or by a different reader, whilst mild differences in the count of stent junctions were reported. The Carpet View can be used to address the stent geometry with high reproducibility. This approach enables the matching of the same stent portion during serial time points and promises to improve the stent assessment.
Cunningham, Gregory; Freebody, John; Smith, Margaret M; Taha, Mohy E; Young, Allan A; Cass, Benjamin; Giuffre, Bruno
2018-05-16
Most glenoid version measurement methods have been validated on 3-dimensionally corrected axial computed tomography (CT) slices at the mid glenoid. Variability of the vault according to slice height and angulation has not yet been studied and is crucial for proper surgical implant positioning. The aim of this study was to analyze the variation of the glenoid vault compared with the Friedman angle according to different CT slice heights and angulations. The hypothesis was that the Friedman angle would show less variability. Sixty shoulder CT scans were retrieved from a hospital imaging database and were reconstructed in the plane of the scapula. Seven axial slices of different heights and coronal angulations were selected, and measurements were carried out by 3 observers. Mid-glenoid mean version was -8.0° (±4.9°; range, -19.6° to +7.0°) and -2.1° (±4.7°; range, -13.0° to +10.3°) using the vault method and Friedman angle, respectively. For both methods, decreasing slice height or angulation did not significantly alter version. Increasing slice height or angulation significantly increased anteversion for the vault method (P < .001). Both interobserver reliability and intraobserver reliability were significantly higher using the Friedman angle. Version at the mid and lower glenoid is similar using either method. The vault method shows less reliability and more variability according to slice height or angulation. Yet, as it significantly differs from the Friedman angle, it should still be used in situations where maximum bone purchase is sought with glenoid implants. For any other situation, the Friedman angle remains the method of choice. Copyright © 2018 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Westendorp, Hendrik; Surmann, Kathrin; van de Pol, Sandrine M G; Hoekstra, Carel J; Kattevilder, Robert A J; Nuver, Tonnis T; Moerland, Marinus A; Slump, Cornelis H; Minken, André W
The quality of permanent prostate brachytherapy can be increased by addition of imaging modalities in the intraoperative procedure. This addition involves image registration, which inherently has inter- and intraobserver variabilities. We sought to quantify the inter- and intraobserver variabilities in geometry and dosimetry for contouring and image registration and analyze the results for our dynamic 125I brachytherapy procedure. Five observers contoured 11 transrectal ultrasound (TRUS) data sets three times and 11 CT data sets one time. The observers registered 11 TRUS and MRI data sets to cone beam CT (CBCT) using fiducial gold markers. Geometrical and dosimetrical inter- and intraobserver variabilities were assessed. For the contouring study, structures were subdivided into three parts along the craniocaudal axis. We analyzed 165 observations. Interobserver geometrical variability for prostate was 1.1 mm, resulting in a dosimetric variability of 1.6% for V100 and 9.3% for D90. The geometric intraobserver variability was 0.6 mm with a V100 of 0.7% and D90 of 1.1%. TRUS-CBCT registration showed an interobserver variability in V100 of 2.0% and D90 of 3.1%. Intraobserver variabilities were 0.9% and 1.6%, respectively. For MRI-CBCT registration, V100 and D90 were 1.3% and 2.1%. Intraobserver variabilities were 0.7% and 1.1% for the same. Prostate dosimetry is affected by interobserver contouring and registration variability. The observed variability is smaller than underdosages that are adapted during our dynamic brachytherapy procedure. Copyright © 2017 American Brachytherapy Society. Published by Elsevier Inc. All rights reserved.
Kim, Sun Mi; Han, Heon; Park, Jeong Mi; Choi, Yoon Jung; Yoon, Hoi Soo; Sohn, Jung Hee; Baek, Moon Hee; Kim, Yoon Nam; Chae, Young Moon; June, Jeon Jong; Lee, Jiwon; Jeon, Yong Hwan
2012-10-01
To determine which Breast Imaging Reporting and Data System (BI-RADS) descriptors for ultrasound are predictors for breast cancer using logistic regression (LR) analysis in conjunction with interobserver variability between breast radiologists, and to compare the performance of artificial neural network (ANN) and LR models in differentiation of benign and malignant breast masses. Five breast radiologists retrospectively reviewed 140 breast masses and described each lesion using BI-RADS lexicon and categorized final assessments. Interobserver agreements between the observers were measured by kappa statistics. The radiologists' responses for BI-RADS were pooled. The data were divided randomly into train (n = 70) and test sets (n = 70). Using train set, optimal independent variables were determined by using LR analysis with forward stepwise selection. The LR and ANN models were constructed with the optimal independent variables and the biopsy results as dependent variable. Performances of the models and radiologists were evaluated on the test set using receiver-operating characteristic (ROC) analysis. Among BI-RADS descriptors, margin and boundary were determined as the predictors according to stepwise LR showing moderate interobserver agreement. Area under the ROC curves (AUC) for both of LR and ANN were 0.87 (95% CI, 0.77-0.94). AUCs for the five radiologists ranged 0.79-0.91. There was no significant difference in AUC values among the LR, ANN, and radiologists (p > 0.05). Margin and boundary were found as statistically significant predictors with good interobserver agreement. Use of the LR and ANN showed similar performance to that of the radiologists for differentiation of benign and malignant breast masses.
Interobserver delineation variation in lung tumour stereotactic body radiotherapy
Persson, G F; Nygaard, D E; Hollensen, C; Munck af Rosenschöld, P; Mouritsen, L S; Due, A K; Berthelsen, A K; Nyman, J; Markova, E; Roed, A P; Roed, H; Korreman, S; Specht, L
2012-01-01
Objectives: In radiotherapy, delineation uncertainties are important as they contribute to systematic errors and can lead to geographical miss of the target. For margin computation, standard deviations (SDs) of all uncertainties must be included as SDs. The aim of this study was to quantify the interobserver delineation variation for stereotactic body radiotherapy (SBRT) of peripheral lung tumours using a cross-sectional study design. Methods: 22 consecutive patients with 26 tumours were included. Positron emission tomography/CT scans were acquired for planning of SBRT. Three oncologists and three radiologists independently delineated the gross tumour volume. The interobserver variation was calculated as a mean of multiple SDs of distances to a reference contour, and calculated for the transversal plane (SDtrans) and craniocaudal (CC) direction (SDcc) separately. Concordance indexes and volume deviations were also calculated. Results: Median tumour volume was 13.0 cm3, ranging from 0.3 to 60.4 cm3. The mean SDtrans was 0.15 cm (SD 0.08 cm) and the overall mean SDcc was 0.26 cm (SD 0.15 cm). Tumours with pleural contact had a significantly larger SDtrans than tumours surrounded by lung tissue. Conclusions: The interobserver delineation variation was very small in this systematic cross-sectional analysis, although significantly larger in the CC direction than in the transversal plane, stressing that anisotropic margins should be applied. This study is the first to make a systematic cross-sectional analysis of delineation variation for peripheral lung tumours referred for SBRT, establishing the evidence that interobserver variation is very small for these tumours. PMID:22919015
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carillo, Viviana; Cozzarini, Cesare; Perna, Lucia
2012-11-01
Purpose: Within a multicenter study (DUE-01) focused on the search for predictors of erectile dysfunction and urinary toxicity after radiotherapy for prostate cancer, a dummy run exercise on penile bulb (PB) contouring on computed tomography (CT) images was carried out. The aim of this study was to quantitatively assess interobserver contouring variability by the application of the generalized DICE index. Methods and Materials: Fifteen physicians from different institutes drew the PB on CT images of 10 patients. The spread of DICE values was used to objectively select those observers who significantly disagreed with the others. The analyses were performed with a dedicated module in the VODCA software package. Results: DICE values were found to change significantly among observers and patients. The mean DICE value was 0.67, ranging between 0.43 and 0.80. The statistics of DICE coefficients identified 4 of 15 observers who systematically showed a value below the average (p value range, 0.013-0.059): mean DICE values were 0.62 for the 4 'bad' observers compared to 0.69 for the 11 'good' observers. For all bad observers, the main cause of the disagreement was identified. Average DICE values were significantly worse than the average in 2 of 10 patients (0.60 vs. 0.70, p < 0.05) because of the limited visibility of the PB. Excluding the bad observers and the 'bad' patients, the mean DICE value increased from 0.67 to 0.70; interobserver variability, expressed in terms of standard deviation of DICE spread, was also reduced. Conclusions: The obtained DICE values of around 0.7 show acceptable agreement, considering the small dimension of the PB. Additional strategies to improve this agreement are under consideration and include an additional tutorial for the so-called bad observers with a recontouring procedure, or the recontouring by a single observer of the PB for all patients included in the DUE-01 study.
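The overlap measure behind these numbers can be illustrated with the ordinary two-contour Dice coefficient, 2|A∩B| / (|A| + |B|), averaged over observer pairs. This is a minimal sketch under stated assumptions: the study used a generalized DICE index computed in the VODCA package, whose exact definition is not given in the abstract, so the pairwise averaging here is a stand-in, and the voxel sets are hypothetical.

```python
from itertools import combinations


def dice(a: set, b: set) -> float:
    """Dice coefficient between two contours given as sets of voxel indices."""
    if not a and not b:
        return 1.0  # two empty contours agree trivially (convention)
    return 2.0 * len(a & b) / (len(a) + len(b))


def mean_dice_per_observer(contours: dict) -> dict:
    """Mean Dice of each observer against every other observer."""
    scores = {name: [] for name in contours}
    for (n1, c1), (n2, c2) in combinations(contours.items(), 2):
        d = dice(c1, c2)
        scores[n1].append(d)
        scores[n2].append(d)
    return {name: sum(v) / len(v) for name, v in scores.items()}


# Hypothetical contours: each voxel is an (x, y, z) index tuple.
contours = {
    "obs1": {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)},
    "obs2": {(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 1, 0)},
    "obs3": {(5, 5, 0), (1, 0, 0), (0, 1, 0), (6, 5, 0)},
}
for name, score in mean_dice_per_observer(contours).items():
    print(f"{name}: mean Dice vs others = {score:.2f}")
```

An observer whose mean Dice against the others sits well below the group is a candidate for the kind of review and recontouring the abstract proposes.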
Schütze, Christopher; Teleky, Katharina; Baumann, Bernhard; Pircher, Michael; Götzinger, Erich; Hitzenberger, Christoph K; Schmidt-Erfurth, Ursula
2016-03-01
To examine the reproducibility of lesion dimensions of the retinal pigment epithelium (RPE) in neovascular age-related macular degeneration (AMD) with polarisation-sensitive optical coherence tomography (PS-OCT), specifically imaging the RPE. Twenty-six patients (28 eyes) with neovascular AMD were included in this study, and examined by a PS-OCT prototype. Each patient was scanned five times at a 1-day visit. The PS-OCT B-scan located closest to the macular centre presenting with RPE atrophy was identified, and the longitudinal diameter of the lesion was quantified manually using AutoCAD 2008. This procedure was followed for the identical B-scan position in all five scans per eye and patient. Reproducibility of qualitative changes in PS-OCT was evaluated. Interobserver variability was assessed. Results were compared with intensity-based spectral-domain OCT (SD-OCT) imaging. Mean variability of all atrophy lesion dimensions was 0.10 mm (SD = 0.06 mm). Coefficient of variation (SD/mean) was 0.06 on average (SD = 0.03). Interobserver variability assessment showed a mean difference of 0.02 mm across all patients regarding RPE lesion size evaluation (paired t test: p=0.38). Spearman correlation coefficient was r=0.98, p<0.001. Results revealed a good overall reproducibility of ∼90%. PS-OCT specifically detected the RPE in all eyes compared with conventional intensity-based SD-OCT, which was not capable of clearly identifying RPE atrophy in 25 eyes (89.3%, p<0.01). PS-OCT offers good reproducibility of RPE atrophy assessment in neovascular AMD, and may be suitable for precise RPE evaluation in clinical practice. PS-OCT unambiguously identifies RPE changes in choroidal neovascularisation compared with intensity-based SD-OCT, which does not identify the RPE status reliably. Published by the BMJ Publishing Group Limited.
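The coefficient of variation used above is the standard deviation of the repeated measurements divided by their mean, computed per eye and then averaged. A minimal Python sketch follows; the lesion diameters are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the abstract does not say which form was used.

```python
import statistics


def coefficient_of_variation(values) -> float:
    """Return SD / mean for repeated measurements of one lesion."""
    if len(values) < 2:
        raise ValueError("need at least two repeated measurements")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for mean 0")
    # statistics.stdev is the sample SD (n - 1 denominator); an assumption here.
    return statistics.stdev(values) / mean


# Hypothetical lesion diameters in mm, five repeated scans per eye.
eyes = {
    "eye_01": [1.52, 1.60, 1.48, 1.55, 1.58],
    "eye_02": [2.10, 2.31, 2.05, 2.20, 2.15],
}
cvs = {eye: coefficient_of_variation(d) for eye, d in eyes.items()}
for eye, cv in cvs.items():
    print(f"{eye}: CV = {cv:.3f}")
print("mean CV:", statistics.mean(cvs.values()))
```

Dividing by the mean makes the variability comparable between small and large lesions, which a raw SD in millimetres would not.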
Measurement of pelvic osteolytic lesions in follow-up studies after total hip arthroplasty
NASA Astrophysics Data System (ADS)
Castaneda, Benjamin; Tamez-Pena, Jose G.; Totterman, Saara; O'Keefe, Regis; Looney, R. John
2006-03-01
Previous studies have demonstrated the plausibility of using volumetric computerized tomography to provide an accurate representation and measurement of volume for pelvic osteolytic lesions following total hip joint replacement. These studies have been performed manually (or computer-assisted) by expert radiologists, with the disadvantage of poor reproducibility of the experiment. The purpose of this work is to minimize the effect of user interaction in these experiments by introducing Laplacian level set methods in the volume segmentation process and using temporal articulated registration in order to follow the evolution of a lesion over time. Laplacian level set methods reduce the inter- and intra-observer variability by attaching the segmented contour to edges defined in the image while keeping smoothness. The registration process allows the information of the lesion from the first visit to be used in the segmentation process of the current visit. This work compares the automated results on 7 volunteers versus the volume measured manually. Results have shown that the proposed technique is able to track osteolytic lesions and detect changes in volume over time. Intra-reader and inter-observer variabilities were reduced.
Proximal humeral fracture classification systems revisited.
Majed, Addie; Macleod, Iain; Bull, Anthony M J; Zyto, Karol; Resch, Herbert; Hertel, Ralph; Reilly, Peter; Emery, Roger J H
2011-10-01
This study evaluated several classification systems and expert surgeons' anatomic understanding of these complex injuries based on a consecutive series of patients. We hypothesized that current proximal humeral fracture classification systems, regardless of imaging methods, are not sufficiently reliable to aid clinical management of these injuries. Complex fractures in 96 consecutive patients were investigated by generation of rapid sequence prototyping models from computed tomography Digital Imaging and Communications in Medicine (DICOM) imaging data. Four independent senior observers were asked to classify each model using 4 classification systems: Neer, AO, Codman-Hertel, and a prototype classification system by Resch. Interobserver and intraobserver κ coefficient values were calculated for the overall classification system and for selected classification items. The κ coefficient values for the interobserver reliability were 0.33 for Neer, 0.11 for AO, 0.44 for Codman-Hertel, and 0.15 for Resch. Interobserver reliability κ coefficient values were 0.32 for the number of fragments and 0.30 for the anatomic segment involved using the Neer system, 0.30 for the AO type (A, B, C), and 0.53, 0.48, and 0.08 for the Resch impaction/distraction, varus/valgus and flexion/extension subgroups, respectively. Three-part fractures showed low reliability for the Neer and AO systems. Currently available evidence suggests that fracture classifications in use have poor intra- and inter-observer reliability regardless of the imaging modality used, which makes treating these injuries difficult and also affects scientific research. This study was undertaken to evaluate the reliability of several systems using rapid sequence prototype models. Overall interobserver κ values represented slight to moderate agreement. The most reliable interobserver scores were found with the Codman-Hertel classification, followed by elements of Resch's trial system. The AO system had the lowest values.
The higher interobserver reliability values for the Codman-Hertel system showed that it is the only comprehensive fracture description studied, whereas the novel classification by Resch showed clear definition in respect to varus/valgus and impaction/distraction angulation. Copyright © 2011 Journal of Shoulder and Elbow Surgery Board of Trustees. All rights reserved.
Nazerian, Peiman; Vanni, Simone; Morello, Fulvio; Castelli, Matteo; Ottaviani, Maddalena; Casula, Claudia; Petrioli, Alessandra; Bartolucci, Maurizio; Grifoni, Stefano
2015-05-01
The diagnostic performance of transthoracic focused cardiac ultrasound (FoCUS) performed by emergency physicians (EP) to estimate ascending aorta dimensions in the acute setting has not been prospectively studied. The diagnostic accuracy and the interobserver variability of EP-performed FoCUS were investigated to estimate thoracic aortic dilation and aneurysm compared with the results of computed tomography angiography (CTA). This was a prospective single-center cohort study of a convenience sample of patients who underwent CTA in the emergency department for suspected aortic pathology. FoCUS was performed before CTA, and the maximum ascending aorta diameter was evaluated in parasternal long-axis view. Aorta diameter < 40 mm by visual estimation or by diameter measurement was considered normal. Measurements were recorded in all patients with aorta diameter ≥ 40 mm. Diagnostic accuracy of FoCUS for detection of aortic dilation (diameter ≥ 40 mm) and aneurysm (diameter ≥ 45 mm) was calculated considering the CTA result as reference standard. In a subgroup of patients, a second EP-sonographer performed FoCUS to evaluate interobserver agreement for the diagnosis of ascending aorta dilation. A total of 140 patients were enrolled in the study. Ascending aorta dilation and aneurysm were detected with FoCUS in 50 (35.7%) and in 27 (17.8%) patients, respectively. Sensitivity and specificity of FoCUS were 78.6% (95% confidence interval [CI] = 65.6% to 88.4%) and 92.9% (95% CI = 85.1% to 97.3%), respectively, for ascending aorta dilation and 64.7% (95% CI = 46.5% to 80.2%) and 95.3% (95% CI = 89.3% to 98.4%), respectively, for ascending aorta aneurysm. Interobserver agreement of FoCUS was κ = 0.82. FoCUS performed by EP is specific for ascending aorta dilation and aneurysm when compared to CTA and appears to be a reproducible technique. © 2015 by the Society for Academic Emergency Medicine.
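As a reading aid for the accuracy figures above, the following is a minimal Python sketch of how sensitivity and specificity follow from a 2x2 table against a reference standard. The function name and the four counts are assumptions: the abstract does not report the table, and the counts were simply chosen so that 140 patients reproduce the stated 78.6% and 92.9% for aortic dilation.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from 2x2 counts against a reference standard."""
    sensitivity = tp / (tp + fn)  # index-test positives among reference-positive cases
    specificity = tn / (tn + fp)  # index-test negatives among reference-negative cases
    return sensitivity, specificity


# Illustrative counts only (assumption, not taken from the paper).
sens, spec = sensitivity_specificity(tp=44, fn=12, fp=6, tn=78)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

Confidence intervals such as those quoted in the abstract would need an additional binomial interval method, which the abstract does not name.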
López-Miguel, Alberto; Calabuig-Goena, María; Marqués-Fernández, Victoria; Fernández, Itziar; Alió, Jorge L; Maldonado, Miguel J
2016-11-04
To assess the reliability of corneal epithelial thickness (CET), nonepithelial central corneal thickness (NECCT), and central corneal thickness (CCT) measurements using Cirrus high-definition optical coherence tomography (HD-OCT) in patients who did and did not undergo cataract surgery. Forty patients who underwent uneventful phacoemulsification and 40 healthy participants were recruited to evaluate the intraobserver repeatability and interobserver reproducibility of CET, NECCT, and CCT measurements using Cirrus HD-OCT. To analyze repeatability, one examiner obtained 5 consecutive scans in each participant; for interobserver reproducibility, another examiner randomly obtained another scan. Within-subject standard deviation, coefficient of variation (CV), limits of agreement, and intraclass correlation coefficient (ICC) data were obtained. For intraobserver repeatability, the intrasession CV (CVw) and ICC values of the CET in the operated and nonoperated groups were 3.7% and 0.80 and 3.8% and 0.73, respectively; for NECCT, 0.7% and 0.98 and 0.8% and 0.97; and for CCT, 0.6% and 0.99 and 0.7% and 0.98. For interobserver reproducibility, the CVw and ICC values for the CET in the operated and nonoperated groups were 2.6% and 0.82 and 2.3% and 0.62, respectively; for NECCT, 0.7% and 0.98 and 0.5% and 0.98; and for CCT, 0.5% and 0.99 and 0.4% and 0.99. The corneal sublayer thickness can be measured reliably using Cirrus HD-OCT in patients who underwent cataract surgery and elderly participants; however, the CET consistency is poorer than the NECCT. Corneal epithelial thickness modifications exceeding 4% reflect true thickness changes instead of random error variations using HD-OCT.
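The within-subject standard deviation and within-subject coefficient of variation (CVw) named above can be sketched as follows. This is one common definition (square root of the mean per-subject variance, divided by the overall mean); the abstract does not state the exact formula used, and the thickness values below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def within_subject_cv(scans_by_subject):
    """Return (Sw, CVw) for repeated measurements grouped by subject.

    Sw is the square root of the mean per-subject sample variance;
    CVw is Sw divided by the overall mean of all measurements.
    """
    sw = sqrt(mean([variance(scans) for scans in scans_by_subject]))
    overall_mean = mean([x for scans in scans_by_subject for x in scans])
    return sw, sw / overall_mean


# Hypothetical epithelial thickness readings in micrometres, three scans per subject.
readings = [[50.0, 52.0, 51.0], [48.0, 48.0, 48.0], [55.0, 53.0, 54.0]]
sw, cvw = within_subject_cv(readings)
print(f"Sw={sw:.3f} um, CVw={cvw:.2%}")
```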
Fledelius, Joan; Khalil, Azza; Hjorthaug, Karin; Frøkiær, Jørgen
2016-12-01
The purpose of this study is to determine whether a qualitative approach or a semi-quantitative approach provides the most robust method for early response evaluation with 2'-deoxy-2'-[(18)F]fluoro-D-glucose (F-18-FDG) positron emission tomography combined with whole body computed tomography (PET/CT) in non-small cell lung cancer (NSCLC). In this study eight Nuclear Medicine consultants analyzed F-18-FDG PET/CT scans from 35 patients with locally advanced NSCLC. Scans were performed at baseline and after 2 cycles of chemotherapy. Each observer used two different methods for evaluation: (1) PET response criteria in solid tumors (PERCIST) 1.0 and (2) a qualitative approach. Both methods allocate patients into one of four response categories (complete and partial metabolic response (CMR and PMR) and stable and progressive metabolic disease (SMD and PMD)). The inter-observer agreement was evaluated using Fleiss' kappa for multiple raters, Cohen's kappa for comparison of the two methods, and intraclass correlation coefficients (ICC) for comparison of lean body mass corrected standardized uptake value (SUL) peak measurements. The agreement between observers when determining the percentage change in SULpeak was "almost perfect", with ICC = 0.959. There was a strong agreement among observers allocating patients to the different response categories with a Fleiss kappa of 0.76 (0.71-0.81). In 22 of the 35 patients, complete agreement was observed with PERCIST 1.0. The agreement was lower when using the qualitative method, moderate, having a Fleiss kappa of 0.60 (0.55-0.64). Complete agreement was achieved in only 10 of the 35 patients. The difference between the two methods was statistically significant (p < 0.005) (chi-squared). Comparing the two methods for each individual observer showed Cohen's kappa values ranging from 0.64 to 0.79, translating into a strong agreement between the two methods.
PERCIST 1.0 provides a higher overall agreement between observers than the qualitative approach in categorizing early treatment response in NSCLC patients. The inter-observer agreement is in fact strong when using PERCIST 1.0 even when the level of instruction is purposely kept to a minimum in order to mimic the everyday situation. The variability is largely owing to the subjective elements of the method.
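Fleiss' kappa, used above for agreement among multiple raters, can be computed from a table of category counts per patient. The sketch below follows the standard textbook formula; the function name and the small example table are hypothetical and are not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of counts.

    counts[i][j] is the number of raters who put subject i in category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: mean over subjects of the share of agreeing rater pairs.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall category proportions.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: 4 patients, 3 raters, categories (CMR, PMR, SMD, PMD).
table = [
    [3, 0, 0, 0],
    [0, 2, 1, 0],
    [0, 0, 3, 0],
    [0, 1, 1, 1],
]
print(round(fleiss_kappa(table), 3))
```

The formula is undefined when every rating falls in a single category (chance agreement equals 1), so a real analysis would need to guard against that case.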
Cohen, Julien G; Kim, Hyungjin; Park, Su Bin; van Ginneken, Bram; Ferretti, Gilbert R; Lee, Chang Hyun; Goo, Jin Mo; Park, Chang Min
2017-08-01
To evaluate the differences between filtered back projection (FBP) and model-based iterative reconstruction (MBIR) algorithms on semi-automatic measurements in subsolid nodules (SSNs). Unenhanced CT scans of 73 SSNs obtained using the same protocol and reconstructed with both FBP and MBIR algorithms were evaluated by two radiologists. Diameter, mean attenuation, mass and volume of whole nodules and their solid components were measured. Intra- and interobserver variability and differences between FBP and MBIR were then evaluated using Bland-Altman method and Wilcoxon tests. Longest diameter, volume and mass of nodules and those of their solid components were significantly higher using MBIR (p < 0.05) with mean differences of 1.1% (limits of agreement, -6.4 to 8.5%), 3.2% (-20.9 to 27.3%) and 2.9% (-16.9 to 22.7%) and 3.2% (-20.5 to 27%), 6.3% (-51.9 to 64.6%), 6.6% (-50.1 to 63.3%), respectively. The limits of agreement between FBP and MBIR were within the range of intra- and interobserver variability for both algorithms with respect to the diameter, volume and mass of nodules and their solid components. There were no significant differences in intra- or interobserver variability between FBP and MBIR (p > 0.05). Semi-automatic measurements of SSNs significantly differed between FBP and MBIR; however, the differences were within the range of measurement variability. • Intra- and interobserver reproducibility of measurements did not differ between FBP and MBIR. • Differences in SSNs' semi-automatic measurement induced by reconstruction algorithms were not clinically significant. • Semi-automatic measurement may be conducted regardless of reconstruction algorithm. • SSNs' semi-automated classification agreement (pure vs. part-solid) did not significantly differ between algorithms.
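The Bland-Altman summary reported above (mean difference with limits of agreement) can be sketched in a few lines. The 1.96 multiplier is the conventional choice for 95% limits. The `relative` option, which expresses each difference as a percentage of the pair mean, is an assumption about how percentage differences might be formed; the abstract does not say how its percentages were computed. The sample values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(a, b, relative=False):
    """Return (bias, lower_limit, upper_limit) for paired measurements a and b.

    With relative=True each difference is expressed in percent of the pair mean.
    """
    if relative:
        diffs = [100.0 * (x - y) / ((x + y) / 2.0) for x, y in zip(a, b)]
    else:
        diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical nodule diameters in mm from two reconstructions of the same scans.
mbir = [11.0, 13.0, 12.0]
fbp = [10.0, 10.0, 10.0]
bias, lower, upper = bland_altman(mbir, fbp)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```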
Youk, Ji Hyun; Jung, Inkyung; Yoon, Jung Hyun; Kim, Sung Hun; Kim, You Me; Lee, Eun Hye; Jeong, Sun Hye; Kim, Min Jung
2016-09-01
Our aim was to compare the inter-observer variability and diagnostic performance of the Breast Imaging Reporting and Data System (BI-RADS) lexicon for breast ultrasound of static and video images. Ninety-nine breast masses visible on ultrasound examination from 95 women 19-81 y of age at five institutions were enrolled in this study. They were scheduled to undergo biopsy or surgery or had been stable for at least 2 y of ultrasound follow-up after benign biopsy results or typically benign findings. For each mass, representative long- and short-axis static ultrasound images were acquired; real-time long- and short-axis B-mode video images through the mass area were separately saved as cine clips. Each image was reviewed independently by five radiologists who were asked to classify ultrasound features according to the fifth edition of the BI-RADS lexicon. Inter-observer variability was assessed using kappa (κ) statistics. Diagnostic performance on static and video images was compared using the area under the receiver operating characteristic curve. No significant difference was found in κ values between static and video images for all descriptors, although κ values of video images were higher than those of static images for shape, orientation, margin and calcifications. After receiver operating characteristic curve analysis, the video images (0.83, range: 0.77-0.87) had higher areas under the curve than the static images (0.80, range: 0.75-0.83; p = 0.08). Inter-observer variability and diagnostic performance of video images was similar to that of static images on breast ultrasonography according to the new edition of BI-RADS. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Zahnd, Guillaume; Karanasos, Antonios; van Soest, Gijs; Regar, Evelyn; Niessen, Wiro; Gijsen, Frank; van Walsum, Theo
2015-09-01
Fibrous cap thickness is the most critical component of plaque stability. Therefore, in vivo quantification of cap thickness could yield valuable information for estimating the risk of plaque rupture. In the context of preoperative planning and perioperative decision making, intracoronary optical coherence tomography imaging can provide a very detailed characterization of the arterial wall structure. However, visual interpretation of the images is laborious, subject to variability, and therefore not always sufficiently reliable for immediate decision of treatment. A novel semiautomatic segmentation method to quantify coronary fibrous cap thickness in optical coherence tomography is introduced. To cope with the most challenging issue when estimating cap thickness (namely the diffuse appearance of the anatomical abluminal interface to be detected), the proposed method is based on a robust dynamic programming framework using a geometrical a priori. To determine the optimal parameter settings, a training phase was conducted on 10 patients. Validated on a dataset of 179 images from 21 patients, the present framework could successfully extract the fibrous cap contours. When assessing minimal cap thickness, segmentation results from the proposed method were in good agreement with the reference tracings performed by a medical expert (mean absolute error and standard deviation of 22 ± 18 μm) and were similar to inter-observer reproducibility (21 ± 19 μm, R = .74), while being significantly faster and fully reproducible. The proposed framework demonstrated promising performances and could potentially be used for online identification of high-risk plaques.
On measuring bird habitat: influence of observer variability and sample size
William M. Block; Kimberly A. With; Michael L. Morrison
1987-01-01
We studied the effects of observer variability when estimating vegetation characteristics at 75 0.04-ha bird plots. Observer estimates were significantly different for 31 of 49 variables. Multivariate analyses showed significant interobserver differences for five of the seven classes of variables studied. Variable classes included the height, number, and diameter of...
Segmentation precision of abdominal anatomy for MRI-based radiotherapy
Noel, Camille E.; Zhu, Fan; Lee, Andrew Y.; Yanle, Hu; Parikh, Parag J.
2014-01-01
The limited soft tissue visualization provided by computed tomography, the standard imaging modality for radiotherapy treatment planning and daily localization, has motivated studies on the use of magnetic resonance imaging (MRI) for better characterization of treatment sites, such as the prostate and head and neck. However, no studies have been conducted on MRI-based segmentation for the abdomen, a site that could greatly benefit from enhanced soft tissue targeting. We investigated the interobserver and intraobserver precision in segmentation of abdominal organs on MR images for treatment planning and localization. Manual segmentation of 8 abdominal organs was performed by 3 independent observers on MR images acquired from 14 healthy subjects. Observers repeated segmentation 4 separate times for each image set. Interobserver and intraobserver contouring precision was assessed by computing 3-dimensional overlap (Dice coefficient [DC]) and distance to agreement (Hausdorff distance [HD]) of segmented organs. The mean and standard deviation of intraobserver and interobserver DC and HD values were DCintraobserver = 0.89 ± 0.12, HDintraobserver = 3.6 mm ± 1.5, DCinterobserver = 0.89 ± 0.15, and HDinterobserver = 3.2 mm ± 1.4. Overall, metrics indicated good interobserver/intraobserver precision (mean DC > 0.7, mean HD < 4 mm). Results suggest that MRI offers good segmentation precision for abdominal sites. These findings support the utility of MRI for abdominal planning and localization, as emerging MRI technologies, techniques, and onboard imaging devices are beginning to enable MRI-based radiotherapy. PMID:24726701
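The two contour-comparison metrics named above can be illustrated on small voxel sets. This is a simplified sketch: it works on sets of integer voxel coordinates and returns the Hausdorff distance in voxel units, whereas the study reports millimetres and may have computed the distance on organ surfaces. The function names and the two toy masks are hypothetical.

```python
from math import dist


def dice_coefficient(a, b):
    """Dice overlap of two non-empty voxel sets: 2|A and B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets (brute force)."""
    def directed(p, q):
        # Largest distance from any point in p to its nearest neighbour in q.
        return max(min(dist(x, y) for y in q) for x in p)

    return max(directed(a, b), directed(b, a))


# Two hypothetical 2x2 segmentations shifted by one voxel along the second axis.
observer_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
observer_2 = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(dice_coefficient(observer_1, observer_2))
print(hausdorff_distance(observer_1, observer_2))
```

The same functions accept 3-dimensional coordinate tuples; converting to millimetres would require scaling each axis by the voxel spacing before calling `hausdorff_distance`.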
O'Daniel, Jennifer C; Rosenthal, David I; Garden, Adam S; Barker, Jerry L; Ahamad, Anesa; Ang, K Kian; Asper, Joshua A; Blanco, Angel I; de Crevoisier, Renaud; Holsinger, F Christopher; Patel, Chirag B; Schwartz, David L; Wang, He; Dong, Lei
2007-04-01
To investigate interobserver variability in the delineation of head-and-neck (H&N) anatomic structures on CT images, including the effects of image artifacts and observer experience. Nine observers (7 radiation oncologists, 1 surgeon, and 1 physician assistant) with varying levels of H&N delineation experience independently contoured H&N gross tumor volumes and critical structures on radiation therapy treatment planning CT images alongside reference diagnostic CT images for 4 patients with oropharynx cancer. Image artifacts from dental fillings partially obstructed 3 images. Differences in the structure volumes, center-of-volume positions, and boundary positions (1 SD) were measured. In-house software created three-dimensional overlap distributions, including all observers. The effects of dental artifacts and observer experience on contouring precision were investigated, and the need for contrast media was assessed. In the absence of artifacts, all 9 participants achieved reasonable precision (1 SD ≤ 3 mm all boundaries). The structures obscured by dental image artifacts had larger variations when measured by the 3 metrics (1 SD = 8 mm cranial/caudal boundary). Experience improved the interobserver consistency of contouring for structures obscured by artifacts (1 SD = 2 mm cranial/caudal boundary). Interobserver contouring variability for anatomic H&N structures, specifically oropharyngeal gross tumor volumes and parotid glands, was acceptable in the absence of artifacts. Dental artifacts increased the contouring variability, but experienced participants achieved reasonable precision even with artifacts present. With a staging contrast CT image as a reference, delineation on a noncontrast treatment planning CT image can achieve acceptable precision.
Kim, Sung Sun; Kook, Myeong-Cherl; Shin, Ok-Ran; Kim, Hee Sung; Bae, Han-Ik; Seo, An Na; Park, Do Youn; Choi, Il Ju; Kim, Young-Il; Nam, Byung Ho; Kim, Sohee
2018-04-01
Intestinal metaplasia and atrophy of the gastric mucosa are associated with Helicobacter pylori infection and are considered premalignant lesions. The updated Sydney system is used for these parameters, but experienced pathologists and consensus processes are required for interobserver agreement. We sought to determine the influence of the consensus process on the assessment of intestinal metaplasia and atrophy. Two study sets were used: consensus and validation. The consensus set was circulated and five gastrointestinal pathologists evaluated them independently using the updated Sydney system. The consensus of the definitions was then determined at the first consensus meeting. The same set was recirculated to determine the effect of the consensus. The second consensus meeting was held to standardise the grading criteria and the validation set was circulated to determine the influence. Two additional circulations were performed to assess the maintenance of consensus and intraobserver variability. Interobserver agreement of intestinal metaplasia and atrophy was improved through the consensus process (intestinal metaplasia: baseline κ = 0.52 versus final κ = 0.68, P = 0.006; atrophy: baseline κ = 0.19 versus final κ = 0.43, P < 0.001). Higher interobserver agreement in atrophy was observed after consensus regarding the definition (pre-consensus: κ = 0.19 versus post-consensus: κ = 0.34, P = 0.001). There was improved interobserver agreement in intestinal metaplasia after standardisation of the grading criteria (pre-standardisation: κ = 0.56 versus post-standardisation: κ = 0.71, P = 0.010). This study suggests that interobserver variability regarding intestinal metaplasia and atrophy may result from lack of a precise definition and fine criteria, and can be reduced by consensus of definition and standardisation of grading criteria. © 2017 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Chen, Chieh-Li; Bojikian, Karine D.; Xin, Chen; Wen, Joanne C.; Gupta, Divakar; Zhang, Qinqin; Mudumbai, Raghu C.; Johnstone, Murray A.; Chen, Philip P.; Wang, Ruikang K.
2016-06-01
Optical coherence tomography angiography (OCTA) has increasingly become a clinically useful technique in ophthalmic imaging. We evaluate the repeatability and reproducibility of blood perfusion in the optic nerve head (ONH) measured using optical microangiography (OMAG)-based OCTA. Ten eyes from 10 healthy volunteers are recruited and scanned three times with a 68-kHz Cirrus HD-OCT 5000-based OMAG prototype system (Carl Zeiss Meditec Inc., Dublin, California) centered at the ONH involving two separate visits within six weeks. Vascular images are generated with OMAG processing by detecting the differences in OCT signals between consecutive B-scans acquired at the same retina location. ONH perfusion is quantified as flux, vessel area density, and normalized flux within the ONH for the prelaminar, lamina cribrosa, and the full ONH. Coefficient of variation (CV) and intraclass correlation coefficient (ICC) are used to evaluate intravisit and intervisit repeatability, and interobserver reproducibility. ONH perfusion measurements show high repeatability [CV≤3.7% (intravisit) and ≤5.2% (intervisit)] and interobserver reproducibility (ICC≤0.966) in all three layers by three metrics. OCTA provides a noninvasive method to visualize and quantify ONH perfusion in human eyes with excellent repeatability and reproducibility, which may add additional insight into ONH perfusion in clinical practice.
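The intraclass correlation coefficient used above comes in several forms, and the abstract does not say which one was applied. As an illustration only, the sketch below computes the one-way random-effects form, often written ICC(1,1), from a subjects-by-ratings table. The function name and the sample values are hypothetical.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    ratings[i] holds the k repeated ratings of subject i (same k for every subject).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = mean([x for row in ratings for x in row])
    row_means = [mean(row) for row in ratings]

    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical perfusion values for three eyes, each graded by two observers.
values = [[10.0, 11.0], [20.0, 19.0], [30.0, 31.0]]
print(round(icc_oneway(values), 4))
```

Two-way forms, which separate a systematic observer effect from random error, are often preferred for interobserver studies and would give a different value on the same data.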
Nagata, Yasufumi; Kado, Yuichiro; Onoue, Takeshi; Otani, Kyoko; Nakazono, Akemi; Otsuji, Yutaka; Takeuchi, Masaaki
2018-01-01
Background: Left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS) play important roles in diagnosis and management of cardiac diseases. However, the issue of the accuracy and reliability of LVEF and GLS remains to be solved. Image quality is one of the most important factors affecting measurement variability. The aim of this study was to investigate whether improved image quality could reduce observer variability. Methods: Two sets of three apical images were acquired using relatively old- and new-generation ultrasound imaging systems (Vivid 7 and Vivid E95) in 308 subjects. Image quality was assessed by endocardial border delineation index (EBDI) using a 3-point scoring system. Three observers measured the LVEF and GLS, and these values and inter-observer variability were investigated. Results: Image quality was significantly better with Vivid E95 (EBDI: 26.8 ± 5.9) than that with Vivid 7 (22.8 ± 6.3, P < 0.0001). Regarding the inter-observer variability of LVEF, the r-value, bias, 95% limit of agreement and intra-class correlation coefficient for Vivid 7 were comparable to those for Vivid E95. The % variabilities were significantly lower for Vivid E95 (5.3–6.5%) than those for Vivid 7 (6.5–7.5%). Regarding GLS, all observer variability parameters were better for Vivid E95 than for Vivid 7. Improvements in image quality yielded benefits to both LVEF and GLS measurement reliability. Multivariate analysis showed that image quality was indeed an important factor of observer variability in the measurement of LVEF and GLS. Conclusions: The new-generation ultrasound imaging system offers improved image quality and reduces inter-observer variability in the measurement of LVEF and GLS. PMID:29432198
Nagata, Yasufumi; Kado, Yuichiro; Onoue, Takeshi; Otani, Kyoko; Nakazono, Akemi; Otsuji, Yutaka; Takeuchi, Masaaki
2018-03-01
Left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS) play important roles in diagnosis and management of cardiac diseases. However, the issue of the accuracy and reliability of LVEF and GLS remains to be solved. Image quality is one of the most important factors affecting measurement variability. The aim of this study was to investigate whether improved image quality could reduce observer variability. Two sets of three apical images were acquired using relatively old- and new-generation ultrasound imaging systems (Vivid 7 and Vivid E95) in 308 subjects. Image quality was assessed by endocardial border delineation index (EBDI) using a 3-point scoring system. Three observers measured the LVEF and GLS, and these values and inter-observer variability were investigated. Image quality was significantly better with Vivid E95 (EBDI: 26.8 ± 5.9) than that with Vivid 7 (22.8 ± 6.3, P < 0.0001). Regarding the inter-observer variability of LVEF, the r-value, bias, 95% limit of agreement and intra-class correlation coefficient for Vivid 7 were comparable to those for Vivid E95. The % variabilities were significantly lower for Vivid E95 (5.3-6.5%) than those for Vivid 7 (6.5-7.5%). Regarding GLS, all observer variability parameters were better for Vivid E95 than for Vivid 7. Improvements in image quality yielded benefits to both LVEF and GLS measurement reliability. Multivariate analysis showed that image quality was indeed an important factor of observer variability in the measurement of LVEF and GLS. The new-generation ultrasound imaging system offers improved image quality and reduces inter-observer variability in the measurement of LVEF and GLS. © 2018 The authors.
Panzer, Stephanie; Mc Coy, Mark R; Hitzl, Wolfgang; Piombino-Mascali, Dario; Jankauskas, Rimantas; Zink, Albert R; Augat, Peter
2015-01-01
The purpose of this study was to develop a checklist for standardized assessment of soft tissue preservation in human mummies based on whole-body computed tomography examinations, and to add a scoring system to facilitate quantitative comparison of mummies. Computed tomography examinations of 23 mummies from the Capuchin Catacombs of Palermo, Sicily (17 adults, 6 children; 17 anthropogenically and 6 naturally mummified) and 7 mummies from the crypt of the Dominican Church of the Holy Spirit of Vilnius, Lithuania (5 adults, 2 children; all naturally mummified) were used to develop the checklist following previously published guidelines. The scoring system was developed by assigning equal scores for checkpoints with equivalent quality. The checklist was evaluated by intra- and inter-observer reliability. The finalized checklist was applied to compare the groups of anthropogenically and naturally mummified bodies. The finalized checklist contains 97 checkpoints and was divided into two main categories, "A. Soft Tissues of Head and Musculoskeletal System" and "B. Organs and Organ Systems", each including various subcategories. The complete checklist had an intra-observer reliability of 98% and an inter-observer reliability of 93%. Statistical comparison revealed significantly higher values in anthropogenically compared to naturally mummified bodies for the total score and for three subcategories. In conclusion, the developed checklist allows for a standardized assessment and documentation of soft tissue preservation in whole-body computed tomography examinations of human mummies. The scoring system facilitates a quantitative comparison of the soft tissue preservation status between single mummies or mummy collections.
Capillary refill time: a study of interobserver reliability among nurses and nurse assistants.
Brabrand, Mikkel; Hosbond, Susanne; Folkestad, Lars
2011-02-01
The interobserver variability of capillary refill time (CRT) has been questioned. Earlier studies of interobserver variability of CRT have been on a large number of patients but with few observers. The objective of our study was to investigate how a large group of nurses and nurse assistants would grade CRT. We recorded a video of the index finger of six medical patients and these were shown to nurses and nurse assistants. They were asked to record the CRT and whether they found this value to be normal. The data were analyzed using the Fleiss Kappa Coefficient Analysis and graded according to the Landis and Koch correlation. Correlation between the exact numbers was evaluated using interclass correlation. Nine nurse assistants and 37 nurses participated. The patients were aged between 44 and 87 years. All but one patient had a systolic blood pressure reading above 130 mmHg. All had arterial blood oxygen saturation above 92% and all but one had normal body temperature. The κ value for normality was 0.56. The interclass correlation of measurement of CRT was 0.62. This is the largest interobserver study of CRT when looking at the number of observers. We found only moderate agreement for both the exact value of CRT and the judgement of normality. We believe that CRT should be used with caution in clinical practice.
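The Landis and Koch grading mentioned above maps a kappa value to a verbal label. A small Python sketch of the commonly cited bands follows; the function name is hypothetical, and treating each band's upper bound as inclusive is an assumption about boundary handling.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) agreement label."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# The kappa for normality reported in the abstract above.
print(landis_koch_label(0.56))  # prints "moderate"
```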
Interpretation of bedside chest X-rays in the ICU: is the radiologist still needed?
Martini, Katharina; Ganter, Christoph; Maggiorini, Marco; Winklehner, Anna; Leupi-Skibinski, Katarzyna E; Frauenfelder, Thomas; Nguyen-Kim, Thi Dan Linh
2015-01-01
To compare diagnostic accuracy of intensivists to radiologists in reading bedside chest X-rays. In a retrospective trial, 33 bedside chest X-rays were evaluated by five radiologists and five intensivists with different experience. Images were evaluated for devices and lung pathologies. Interobserver agreement and diagnostic accuracy were calculated. Computed tomography served as reference standard. Seniors had higher diagnostic accuracy than residents (mean-ExpB(Senior)=1.456; mean-ExpB(Resident)=1.635). Interobserver agreement for installations was more homogeneously distributed between radiologists compared to intensivists (ExpB(Rad)=1.204-1.672; ExpB(Int)=1.005-2.368). Seniors had comparable diagnostic accuracy. No significant difference in diagnostic performance was seen between seniors of both disciplines, whereas the resident intensivists might still benefit from an interdisciplinary dialogue. Copyright © 2015 Elsevier Inc. All rights reserved.
Reproducibility of geometrical acquisition of intra-thoracic organs of children on CT scans.
Coulongeat, François; Jarrar, Mohamed-Salah; Serre, Thierry; Thollon, Lionel
2011-08-01
This paper analyses geometry of intra-thoracic organs from computed tomography (CT) scans performed on 20 children aged from 4 months to 16 years. A set of two measurements on lungs and heart were performed by the same observer. A third set was performed by a second observer. Thus, the intra- and inter-observer relative deviation of measurements was analysed. Multiple regressions were used in order to study the relationship between the CT properties (scanner, voltage, dose, pixel size, slice increment) and the relative deviation of measurements. There is a very low systematic intra- and inter-observer bias in measurements except for the volume of the heart. None of the CT data properties has a significant influence on the relative deviation of measurement. In the present paper, the measurements and 3D reconstruction protocol described can be applied to characterise the growth of the intra-thoracic organs.
Echocardiographic agreement in the diagnostic evaluation for infective endocarditis.
Lauridsen, Trine Kiilerich; Selton-Suty, Christine; Tong, Steven; Afonso, Luis; Cecchi, Enrico; Park, Lawrence; Yow, Eric; Barnhart, Huiman X; Paré, Carlos; Samad, Zainab; Levine, Donald; Peterson, Gail; Stancoven, Amy Butler; Johansson, Magnus Carl; Dickerman, Stuart; Tamin, Syahidah; Habib, Gilbert; Douglas, Pamela S; Bruun, Niels Eske; Crowley, Anna Lisa
2016-07-01
Echocardiography is essential for the diagnosis and management of infective endocarditis (IE). However, the reproducibility for the echocardiographic assessment of variables relevant to IE is unknown. Objectives of this study were: (1) To define the reproducibility for IE echocardiographic variables and (2) to describe a methodology for assessing quality in an observational cohort containing site-interpreted data. IE reproducibility was assessed on a subset of echocardiograms from subjects enrolled in the International Collaboration on Endocarditis registry. Specific echocardiographic case report forms were used. Intra-observer agreement was assessed from six site readers on ten randomly selected echocardiograms. Inter-observer agreement between sites and an echocardiography core laboratory was assessed on a separate random sample of 110 echocardiograms. Agreement was determined using intraclass correlation (ICC), coverage probability (CP), and limits of agreement for continuous variables and kappa statistics (κweighted) and CP for categorical variables. Intra-observer agreement for LVEF was excellent [ICC = 0.93 ± 0.1 and all pairwise differences for LVEF (CP) were within 10 %]. For IE categorical echocardiographic variables, intra-observer agreement was best for aortic abscess (κweighted = 1.0, CP = 1.0 for all readers). Highest inter-observer agreement for IE categorical echocardiographic variables was obtained for vegetation location (κweighted = 0.95; 95 % CI 0.92-0.99) and lowest agreement was found for vegetation mobility (κweighted = 0.69; 95 % CI 0.62-0.86). Moderate to excellent intra- and inter-observer agreement is observed for echocardiographic variables in the diagnostic assessment of IE. A pragmatic approach for determining echocardiographic data reproducibility in a large, multicentre, site interpreted observational cohort is feasible.
Campbell, Amelia; Owen, Rebecca; Brown, Elizabeth; Pryor, David; Bernard, Anne; Lehman, Margot
2015-08-01
Cone beam computerised tomography (CBCT) enables soft tissue visualisation to optimise matching in the post-prostatectomy setting, but is associated with inter-observer variability. This study assessed the accuracy and consistency of automated soft tissue localisation using XVI's dual registration tool (DRT). Sixty CBCT images from ten post-prostatectomy patients were matched using: (i) the DRT and (ii) manual soft tissue registration by six radiation therapists (RTs). Shifts in the three Cartesian planes were recorded. The accuracy of the match was determined by comparing shifts to matches performed by two genitourinary radiation oncologists (ROs). A Bland-Altman method was used to assess the 95% limits of agreement (LoA). A clinical threshold of 3 mm was used to define equivalence between methods of matching. The 95% LoA between DRT-ROs in the superior/inferior, left/right and anterior/posterior directions were -2.21 to +3.18 mm, -0.77 to +0.84 mm, and -1.52 to +4.12 mm, respectively. The 95% LoA between RTs-ROs in the superior/inferior, left/right and anterior/posterior directions were -1.89 to +1.86 mm, -0.71 to +0.62 mm and -2.8 to +3.43 mm, respectively. Five DRT CBCT matches (8.33%) were outside the 3-mm threshold, all in the setting of bladder underfilling or rectal gas. The mean time for manual matching was 82 versus 65 s for DRT. XVI's DRT is comparable with RTs manually matching soft tissue on CBCT. The DRT can minimise RT inter-observer variability; however, involuntary bladder and rectal filling can influence the tool's accuracy, highlighting the need for RT evaluation of the DRT match. © 2015 The Royal Australian and New Zealand College of Radiologists.
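The Bland-Altman analysis mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the use of the sample standard deviation with the conventional 1.96 multiplier, and the toy shift values are all hypothetical, and the abstract does not say exactly how the authors computed their limits or applied the 3 mm threshold.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement.

    Differences are taken as method_a - method_b for paired measurements.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


def fraction_outside(diffs, threshold):
    """Share of paired differences whose magnitude exceeds the threshold."""
    return sum(abs(d) > threshold for d in diffs) / len(diffs)


# Hypothetical shifts in mm from an automated match and a reference match.
automated = [1.0, 2.0, 3.0, 4.0]
reference = [0.0, 2.0, 2.0, 4.0]
bias, lower, upper = bland_altman_limits(automated, reference)
print(f"bias {bias:.2f} mm, 95% LoA {lower:.2f} to {upper:.2f} mm")
```

Reporting both the limits of agreement and the share of individual matches beyond a clinical threshold, as the abstract does, separates overall agreement from the frequency of clinically relevant outliers.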
Siddiqui, Usman T; Khan, Anjum F; Shamim, Muhammad Shahzad; Hamid, Rana Shoaib; Alam, Muhammad Mehboob; Emaduddin, Muhammad
2014-01-01
A noncontrast computed tomography (CT) scan remains the initial radiological investigation of choice for a patient with suspected aneurysmal subarachnoid hemorrhage (aSAH). This initial scan may be used to derive key information about the underlying aneurysm which may aid in further management. The interpretation, however, is subject to the skill and experience of the interpreting individual. The authors here evaluate the interpretation of such CT scans by different individuals at different levels of training, and in two different specialties (Radiology and Neurosurgery). The initial noncontrast CT scans of 35 patients with aSAH were evaluated independently by four different observers. The observers selected for the study included two from Radiology and two from Neurosurgery at different levels of training: from each specialty, a resident currently in mid training and a resident who had recently graduated from training. Measured variables included interpreter's suspicion of presence of subarachnoid blood, side of the subarachnoid hemorrhage, location of the aneurysm, the aneurysm's proximity to vessel bifurcation, number of aneurysm(s), contour of aneurysm(s), presence of intraventricular hemorrhage (IVH), intracerebral hemorrhage (ICH), infarction, hydrocephalus and midline shift. To determine the inter-observer variability (IOV), weighted kappa values were calculated. There was moderate agreement on most of the CT scan findings among all observers. Substantial agreement was found amongst all observers for hydrocephalus, IVH, and ICH. Lowest agreement rates were seen in the location of aneurysm being supra or infra tentorial. There were, however, some noteworthy exceptions. There was substantial to almost perfect agreement between the radiology graduate and radiology resident on most CT findings. The lowest agreement was found between the neurosurgery graduate and the radiology graduate. 
Our study suggests that although agreements were seen in the interpretation of some of the radiological features of aSAH, there is still considerable IOV in the interpretation of most features among physicians belonging to different levels of training and different specialties. Whether these might affect management or outcome is unclear.
van Vugt, Jeroen L A; Levolger, Stef; Gharbharan, Arvind; Koek, Marcel; Niessen, Wiro J; Burger, Jacobus W A; Willemsen, Sten P; de Bruin, Ron W F; IJzermans, Jan N M
2017-04-01
The association between body composition (e.g. sarcopenia or visceral obesity) and treatment outcomes, such as survival, using single-slice computed tomography (CT)-based measurements has recently been studied in various patient groups. These studies have been conducted with different software programmes, each with their specific characteristics, of which the inter-observer, intra-observer, and inter-software correlation are unknown. Therefore, a comparative study was performed. Fifty abdominal CT scans were randomly selected from 50 different patients and independently assessed by two observers. Cross-sectional muscle area (CSMA, i.e. rectus abdominis, oblique and transverse abdominal muscles, paraspinal muscles, and the psoas muscle), visceral adipose tissue area (VAT), and subcutaneous adipose tissue area (SAT) were segmented by using standard Hounsfield unit ranges and computed for regions of interest. The inter-software, intra-observer, and inter-observer agreement for CSMA, VAT, and SAT measurements using FatSeg, OsiriX, ImageJ, and sliceOmatic were calculated using intra-class correlation coefficients (ICCs) and Bland-Altman analyses. Cohen's κ was calculated for the agreement of sarcopenia and visceral obesity assessment. The Jaccard similarity coefficient was used to compare the similarity and diversity of measurements. Bland-Altman analyses and ICC indicated that the CSMA, VAT, and SAT measurements between the different software programmes were highly comparable (ICC 0.979-1.000, P < 0.001). All programmes adequately distinguished between the presence or absence of sarcopenia (κ = 0.88-0.96 for one observer and all κ = 1.00 for all comparisons of the other observer) and visceral obesity (all κ = 1.00). Furthermore, excellent intra-observer (ICC 0.999-1.000, P < 0.001) and inter-observer (ICC 0.998-0.999, P < 0.001) agreement for all software programmes were found. 
Accordingly, excellent Jaccard similarity coefficients were found for all comparisons (mean ≥ 0.964). FatSeg, OsiriX, ImageJ, and sliceOmatic showed an excellent agreement for CSMA, VAT, and SAT measurements on abdominal CT scans. Furthermore, excellent inter-observer and intra-observer agreement were achieved. Therefore, results of studies using these different software programmes can reliably be compared. © 2016 The Authors. Journal of Cachexia, Sarcopenia and Muscle published by John Wiley & Sons Ltd on behalf of the Society on Sarcopenia, Cachexia and Wasting Disorders.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadeghi, P; Smith, W; Tom Baker Cancer Centre, Calgary, AB
2015-06-15
Purpose: This study quantifies errors associated with MR-guided High Dose Rate (HDR) gynecological brachytherapy. Uncertainties in this treatment result from contouring, organ motion between imaging and treatment delivery, dose calculation, and dose delivery. We focus on inter-observer and inter-modality variability in contouring and the motion of organs at risk (OARs) in the time span between the MR and CT scans (∼1 hour). We report the change in organ volume and position of center of mass (CM) between the two imaging modalities. Methods: A total of 8 patients treated with MR-guided HDR brachytherapy were included in this study. Two observers contoured the bladder and rectum on both MR and CT scans. The change in OAR volume and CM position between the MR and CT imaging sessions on both image sets were calculated. Results: The absolute mean bladder volume change between the two imaging modalities is 67.1 cc. The absolute mean inter-observer difference in bladder volume is much lower at 15.5 cc (MR) and 11.0 cc (CT). This higher inter-modality volume difference suggests a real change in the bladder filling between the two imaging sessions. For the change in rectum volume, the inter-observer standard error of the mean (SEM) is 3.18 cc (MR) and 3.09 cc (CT), while the inter-modality SEM is 3.65 cc (observer 1) and 2.75 cc (observer 2). The SEM for rectum CM position in the superior-inferior direction was approximately three times higher than in other directions for both the inter-observer (0.77 cm, 0.92 cm for observers 1 and 2, respectively) and inter-modality (0.91 cm, 0.95 cm for MR and CT, respectively) variability. Conclusion: Bladder contours display good consistency between different observers on both CT and MR images. For rectum contouring the highest inconsistency stems from the observers' choice of the superior-inferior borders. A complete analysis of a larger patient cohort will enable us to separate the true organ motion from the inter-observer variability.
Kamishima, Tamotsu; Tanimura, Kazuhide; Henmi, Mihoko; Narita, Akihiro; Sakamoto, Fumihiko; Terae, Satoshi; Shirato, Hiroki
2009-05-01
The objective of this study was to assess interobserver uncertainties in power Doppler (PD) examination of the fingers of patients with rheumatoid arthritis (RA), by separating the source of the discrepancy into (1) acquisition of the images and (2) criteria for assessment of the images. Twenty patients who had been diagnosed with RA were enrolled in this study. Ultrasound examinations were performed by one inexperienced and two experienced sonographers. Interobserver variation was measured using a conventional semiquantitative image grading scale. Interobserver variation of the quantitative PD (QPD) index (the summation of the colored pixels in a region of interest) was also assessed. The agreement was higher between the two experienced sonographers (kappa value of 0.8) than between experienced and inexperienced sonographers (kappa value, 0.6-0.7) in the semiquantitative image grading scale. Results suggest that the difference in the assessment on the image grading scale was due more to the difference in the acquisition of the images than to variations in the grading criteria between sonographers. An excellent relationship was noted between the image grading scale and the QPD index for Doppler signal with a Spearman's coefficient of rank correlation of 0.83 (P < 0.0001). Interobserver discrepancies in the image grading and QPD index methods were due more to the difference in the acquisition of the image than to the grading criteria used. The QPD index seems to be as reliable as the image grading scale with reasonable interobserver agreement between experienced sonographers.
Accuracy and variability of tumor burden measurement on multi-parametric MRI
NASA Astrophysics Data System (ADS)
Salarian, Mehrnoush; Gibson, Eli; Shahedi, Maysam; Gaed, Mena; Gómez, José A.; Moussa, Madeleine; Romagnoli, Cesare; Cool, Derek W.; Bastian-Jordan, Matthew; Chin, Joseph L.; Pautler, Stephen; Bauman, Glenn S.; Ward, Aaron D.
2014-03-01
Measurement of prostate tumour volume can inform prognosis and treatment selection, including an assessment of the suitability and feasibility of focal therapy, which can potentially spare patients the deleterious side effects of radical treatment. Prostate biopsy is the clinical standard for diagnosis but provides limited information regarding tumour volume due to sparse tissue sampling. A non-invasive means for accurate determination of tumour burden could be of clinical value and an important step toward reduction of overtreatment. Multi-parametric magnetic resonance imaging (MPMRI) is showing promise for prostate cancer diagnosis. However, the accuracy and inter-observer variability of prostate tumour volume estimation based on separate expert contouring of T2-weighted (T2W), dynamic contrast-enhanced (DCE), and diffusion-weighted (DW) MRI sequences acquired using an endorectal coil at 3T is currently unknown. We investigated this question using a histologic reference standard based on a highly accurate MPMRI-histology image registration and a smooth interpolation of planimetric tumour measurements on histology. Our results showed that prostate tumour volumes estimated based on MPMRI consistently overestimated histological reference tumour volumes. The variability of tumour volume estimates across the different pulse sequences exceeded inter-observer variability within any sequence. Tumour volume estimates on DCE MRI provided the lowest inter-observer variability and the highest correlation with histology tumour volumes, whereas the apparent diffusion coefficient (ADC) maps provided the lowest volume estimation error. If validated on a larger data set, the observed correlations could support the development of automated prostate tumour volume segmentation algorithms as well as correction schemes for tumour burden estimation on MPMRI.
Robbrecht, Cedric; Claes, Steven; Cromheecke, Michiel; Mahieu, Peter; Kakavelakis, Kyriakos; Victor, Jan; Bellemans, Johan; Verdonk, Peter
2014-10-01
Post-operative widening of tibial and/or femoral bone tunnels is a common observation after ACL reconstruction, especially with soft-tissue grafts. There are no studies comparing tunnel widening in hamstring autografts versus tibialis anterior allografts. The goal of this study was to observe the difference in tunnel widening after the use of allograft vs. autograft for ACL reconstruction, by measuring it with a novel 3-D computed tomography based method. Thirty-five ACL-deficient subjects were included, underwent anatomic single-bundle ACL reconstruction and were evaluated at one year after surgery with the use of 3-D CT imaging. Three independent observers semi-automatically delineated femoral and tibial tunnel outlines, after which a best-fit cylinder was derived and the tunnel diameter was determined. Finally, intra- and inter-observer reliability of this novel measurement protocol was defined. In femoral tunnels, the intra-observer ICC was 0.973 (95% CI: 0.922-0.991) and the inter-observer ICC was 0.992 (95% CI: 0.982-0.996). In tibial tunnels, the intra-observer ICC was 0.955 (95% CI: 0.875-0.985). The combined inter-observer ICC was 0.970 (95% CI: 0.917-0.987). Tunnel widening was significantly higher in allografts compared to autografts, in the tibial tunnels (p=0.013) as well as in the femoral tunnels (p=0.007). To our knowledge, this is the first study to show that this novel, semi-automated 3D computed tomography image processing method yields highly reproducible results for the measurement of bone tunnel diameter and area. This series showed a significantly higher amount of tunnel widening in the allograft group at one-year follow-up. Level II, Prospective comparative study. Copyright © 2014 Elsevier B.V. All rights reserved.
Ito, Kimiteru; Shimano, Yasumasa; Imabayashi, Etsuko; Nakata, Yasuhiro; Omachi, Yoshie; Sato, Noriko; Arima, Kunimasa; Matsuda, Hiroshi
2014-10-01
The purpose of this study was to clarify the concordance of diagnostic abilities and interobserver agreement between 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) and brain perfusion single photon-emission computed tomography (SPECT) in patients with Alzheimer's disease (AD) who were diagnosed according to the research criteria of the National Institute of Aging-Alzheimer's Association Workshop. Fifty-five patients with "AD and mild cognitive impairment (MCI)" (n = 40) and "non-AD" (n = 15) were evaluated with 18F-FDG PET and (99m)Tc-ethyl cysteinate dimer (ECD) SPECT during an 8-week period. Three radiologists independently graded the regional uptake in the frontal, temporal, parietal, and occipital lobes as well as the precuneus/posterior cingulate cortex in both images. Kappa values were used to determine the interobserver reliability regarding regional uptake. The regions with better interobserver reliability between 18F-FDG PET and (99m)Tc-ECD SPECT were the frontal, parietal, and temporal lobes. The (99m)Tc-ECD SPECT agreement in the occipital lobes was not significant. The frontal, temporal, and parietal lobes showed good correlations between 18F-FDG PET and (99m)Tc-ECD SPECT in the degree of uptake, but the occipital lobe and precuneus/posterior cingulate cortex did not show good correlations. The diagnostic accuracy rates of "AD and MCI" ranged from 60% to 70% in both of the techniques. The degree of uptake on 18F-FDG PET and (99m)Tc-ECD SPECT showed significant correlations in the frontal, temporal, and parietal lobes. The diagnostic abilities of 18F-FDG PET and (99m)Tc-ECD SPECT for "AD and MCI," when diagnosed according to the National Institute of Aging-Alzheimer's Association Workshop criteria, were nearly identical. Copyright © 2014 John Wiley & Sons, Ltd.
Robust semi-automatic segmentation of pulmonary subsolid nodules in chest computed tomography scans
NASA Astrophysics Data System (ADS)
Lassen, B. C.; Jacobs, C.; Kuhnigk, J.-M.; van Ginneken, B.; van Rikxoort, E. M.
2015-02-01
The malignancy of lung nodules is most often detected by analyzing changes of the nodule diameter in follow-up scans. A recent study showed that comparing the volume or the mass of a nodule over time is much more significant than comparing the diameter. Since the survival rate is higher when the disease is still in an early stage, it is important to detect the growth rate as soon as possible. However, manual segmentation of a volume is time-consuming. Whereas there are several well-evaluated methods for the segmentation of solid nodules, less work has been done on subsolid nodules, which actually show a higher malignancy rate than solid nodules. In this work we present a fast, semi-automatic method for segmentation of subsolid nodules. As minimal user interaction, the method expects a user-drawn stroke on the largest diameter of the nodule. First, a threshold-based region growing is performed based on intensity analysis of the nodule region and surrounding parenchyma. In the next step the chest wall is removed by a combination of a connected component analysis and convex hull calculation. Finally, attached vessels are detached by morphological operations. The method was evaluated on all nodules of the publicly available LIDC/IDRI database that were manually segmented and rated as non-solid or part-solid by four radiologists (Dataset 1) and three radiologists (Dataset 2). For these 59 nodules the Jaccard index for the agreement of the proposed method with the manual reference segmentations was 0.52/0.50 (Dataset 1/Dataset 2) compared to an inter-observer agreement of the manual segmentations of 0.54/0.58 (Dataset 1/Dataset 2). Furthermore, the inter-observer agreement using the proposed method (i.e. different input strokes) was analyzed and gave a Jaccard index of 0.74/0.74 (Dataset 1/Dataset 2). 
The presented method provides satisfactory segmentation results with minimal observer effort in minimal time and can reduce the inter-observer variability for segmentation of subsolid nodules in clinical routine.
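The Jaccard index used above to compare segmentations is the size of the intersection of two voxel sets divided by the size of their union. A minimal Python sketch follows; representing each segmentation as a set of (z, y, x) voxel coordinates, the function name `jaccard_index`, and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def jaccard_index(mask_a, mask_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two segmentations given as voxel sets."""
    a, b = set(mask_a), set(mask_b)
    union = a | b
    if not union:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical toy masks: voxel coordinates marked as nodule by two readers.
reader_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
reader_2 = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
print(jaccard_index(reader_1, reader_2))  # 2 shared voxels out of 4 -> 0.5
```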
Fiorella, David; Arthur, Adam; Byrne, James; Pierot, Laurent; Molyneux, Andy; Duckwiler, Gary; McCarthy, Thomas; Strother, Charles
2015-08-01
The WEB (WEB aneurysm embolization system, Sequent Medical, Aliso Viejo, California, USA) is a self-expanding, nitinol, mesh device designed to achieve aneurysm occlusion after endosaccular deployment. The WEB Occlusion Scale (WOS) is a standardized angiographic assessment scale for reporting aneurysm occlusion achieved with intrasaccular mesh implants. This study was performed to assess the interobserver variability of the WOS. Seven experienced neurovascular specialists were trained to apply the WOS. These physicians independently reviewed angiographic image sets from 30 patients treated with the WEB under blinded conditions. No additional clinical information was provided. Raters graded each image according to the WOS (complete occlusion, residual neck or residual aneurysm). Final statistics were calculated using the dichotomous outcomes of complete occlusion or incomplete occlusion. The interobserver agreement was measured by the generalized κ statistic. In this series of 30 test case aneurysms, observers rated 12-17 as completely occluded, 3-9 as nearly completely occluded, and 9-11 as demonstrating residual aneurysm filling. Agreement was perfect across all seven observers for the presence or absence of complete occlusion in 22 of 30 cases. Overall, interobserver agreement was substantial (κ statistic 0.779 with a 95% CI of 0.700 to 0.857). The WOS allows a consistent means of reporting angiographic occlusion for aneurysms treated with the WEB device. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Yang, Ping-Liang; Wong, David T; Dai, Shuang-Bo; Song, Hai-Bo; Ye, Ling; Liu, Jin; Liu, Bin
2009-05-01
There is no reliable method to monitor renal blood flow intraoperatively. In this study, we evaluated the feasibility and reproducibility of left renal blood flow measurements using transesophageal echocardiography during cardiac surgery. In this prospective noninterventional study, left renal blood flow was measured with transesophageal echocardiography at three time points (pre-, intra-, and postcardiopulmonary bypass) in 60 patients undergoing cardiac surgery. Sonograms from 6 subjects were interpreted by 2 blinded independent assessors at the time of acquisition and 6 mo later. Interobserver and intraobserver reproducibility were quantified by calculating variability and intraclass correlation coefficients. Patients with Doppler angles of >30 degrees (20 of 60 subjects) were eliminated from renal blood flow measurements. Left renal blood flow was successfully measured and analyzed in 36 of 60 (60%) subjects. Both interobserver and intraobserver variability were <10%. Interobserver and intraobserver reproducibility in left renal blood flow measurements were good to excellent (intraclass correlation coefficients 0.604-0.999). For the pre-, intra-, and postcardiopulmonary bypass phases, left renal arterial luminal diameter ranged from 3.8 to 4.1 mm, renal arterial velocity from 25 to 35 cm/s, and left renal blood flow from 192 to 299 mL/min. In patients undergoing cardiac surgery, it was feasible in 60% of the subjects to measure left renal blood flow using intraoperative transesophageal echocardiography. The interobserver and intraobserver reproducibility of renal blood flow measurements was good to excellent.
Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D
2012-02-01
Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, overlap measure, and statistical measure of agreement or reliability analysis is required to fully report the interrater variability in delineation.
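For more than two observers, one of the overlap measures named above, the generalized conformity index, is commonly described as the sum of all pairwise intersection volumes divided by the sum of all pairwise union volumes. The Python sketch below follows that description as an assumption, since the abstract itself gives no formula; the function name, the set-of-voxels representation, and the toy contours are hypothetical.

```python
from itertools import combinations


def generalized_conformity_index(contours):
    """Sum of pairwise intersections over sum of pairwise unions.

    `contours` is a sequence of at least two voxel sets, one per observer.
    With exactly two observers this reduces to the Jaccard coefficient.
    """
    sets = [set(c) for c in contours]
    if len(sets) < 2:
        raise ValueError("need contours from at least two observers")
    intersection_total = 0
    union_total = 0
    for a, b in combinations(sets, 2):
        intersection_total += len(a & b)
        union_total += len(a | b)
    if union_total == 0:
        raise ValueError("all contours are empty")
    return intersection_total / union_total


# Hypothetical toy contours from three observers (voxel ids as integers).
observers = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
print(generalized_conformity_index(observers))  # (2 + 1 + 2) / (4 + 5 + 4)
```

Summing over pairs before dividing keeps a single outlying contour from dominating the result the way a plain intersection of all contours would.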
Kakinuma, Ryutaro; Ashizawa, Kazuto; Kuriyama, Keiko; Fukushima, Aya; Ishikawa, Hiroyuki; Kamiya, Hisashi; Koizumi, Naoya; Maruyama, Yuichiro; Minami, Kazunori; Nitta, Norihisa; Oda, Seitaro; Oshiro, Yasuji; Kusumoto, Masahiko; Murayama, Sadayuki; Murata, Kiyoshi; Muramatsu, Yukio; Moriyama, Noriyuki
2012-04-01
To evaluate interobserver agreement in regard to measurements of focal ground-glass opacities (GGO) diameters on computed tomography (CT) images to identify increases in the size of GGOs. Approval by the institutional review board and informed consent by the patients were obtained. Ten GGOs (mean size, 10.4 mm; range, 6.5-15 mm), one each in 10 patients (mean age, 65.9 years; range, 58-78 years), were used to make the diameter measurements. Eleven radiologists independently measured the diameters of the GGOs on a total of 40 thin-section CT images (the first [n = 10], the second [n = 10], and the third [n = 10] follow-up CT examinations and remeasurement of the first [n = 10] follow-up CT examinations) without comparing time-lapse CT images. Interobserver agreement was assessed by means of Bland-Altman plots. The smallest range of the 95% limits of interobserver agreement between the members of the 55 pairs of the 11 radiologists in regard to maximal diameter was -1.14 to 1.72 mm, and the largest range was -7.7 to 1.7 mm. The mean value of the lower limit of the 95% limits of agreement was -3.1 ± 1.4 mm, and the mean value of their upper limit was 2.5 ± 1.1 mm. When measurements are made by any two radiologists, an increase in the length of the maximal diameter of more than 1.72 mm would be necessary in order to be able to state that the maximal diameter of a particular GGO had actually increased. Copyright © 2012 AUR. Published by Elsevier Inc. All rights reserved.
Sternby, Hanna; Verdonk, Robert C; Aguilar, Guadalupe; Dimova, Alexandra; Ignatavicius, Povilas; Ilzarbe, Lucas; Koiva, Peeter; Lantto, Eila; Loigom, Tonis; Penttilä, Anne; Regnér, Sara; Rosendahl, Jonas; Strahinova, Vanya; Zackrisson, Sophia; Zviniene, Kristina; Bollen, Thomas L
2016-01-01
For consistent reporting and better comparison of data in research, the revised Atlanta classification (RAC) proposes new computed tomography (CT) criteria to describe the morphology of acute pancreatitis (AP). The aim of this study was to analyse the interobserver agreement among radiologists in evaluating CT morphology by using the new RAC criteria in patients with AP. Patients with a first episode of AP who obtained a CT were identified and consecutively enrolled at six European centres backwards from January 2013 to January 2012. A local radiologist at each center and a central expert radiologist scored the CTs separately using the RAC criteria. Center-dependent and center-independent interobserver agreement was determined using kappa statistics. In total, 285 patients with 388 CTs were included. For most CT criteria, interobserver agreement was moderate to substantial. In four categories, the center-independent kappa values were fair: extrapancreatic necrosis (EXPN) (0.326), type of pancreatitis (0.370), characteristics of collections (0.408), and appropriate term of collections (0.356). The fair kappa values relate to discrepancies in the identification of extrapancreatic necrotic material. The local radiologists diagnosed EXPN (33% versus 59%, P < 0.0001) and non-homogeneous collections (35% versus 66%, P < 0.0001) significantly less frequently than the central expert. Cases read by the central expert showed superior correlation with clinical outcome. Diagnosis of EXPN and recognition of non-homogeneous collections show only fair agreement, potentially resulting in inconsistent reporting of morphologic findings. Copyright © 2016 IAP and EPC. Published by Elsevier B.V. All rights reserved.
Patange Subba Rao, Sheethal Prasad; Lewis, James; Haddad, Ziad; Paringe, Vishal; Mohanty, Khitish
2014-10-01
The aim of the study was to evaluate inter-observer reliability and intra-observer reproducibility between the three-column classification and Schatzker classification systems using 2D and 3D CT models. Fifty-two consecutive patients with tibial plateau fractures were evaluated by five orthopaedic surgeons. All patients were classified into Schatzker and three-column classification systems using x-rays and 2D and 3D CT images. The inter-observer reliability was evaluated in the first round and the intra-observer reliability was determined during the second round 2 weeks later. The average intra-observer reproducibility for the three-column classification was from substantial to excellent in all sub classifications, as compared with Schatzker classification. The inter-observer kappa values increased from substantial to excellent in three-column classification and to moderate in Schatzker classification. The average values for three-column classification for all the categories are as follows: (I-III) k2D = 0.718, 95% CI 0.554-0.864, p < 0.0001 and average k3D = 0.874, 95% CI 0.754-0.890, p < 0.0001. For Schatzker classification system, the average values for all six categories are as follows: (I-VI) k2D = 0.536, 95% CI 0.365-0.685, p < 0.0001 and average k3D = 0.552, 95% CI 0.405-0.700, p < 0.0001. The values are statistically significant. Statistically significant inter-observer values in both rounds were noted with the three-column classification, making it statistically an excellent agreement. The intra-observer reproducibility for the three-column classification improved as compared with the Schatzker classification. The three-column classification seems to be an effective way to characterise and classify fractures of tibial plateau.
Diagnosing Nodular Regenerative Hyperplasia of the Liver Is Thwarted by Low Interobserver Agreement.
Jharap, Bindia; van Asseldonk, Dirk P; de Boer, Nanne K H; Bedossa, Pierre; Diebold, Joachim; Jonker, A Mieke; Leteurtre, Emmanuelle; Verheij, Joanne; Wendum, Dominique; Wrba, Fritz; Zondervan, Pieter E; Colombel, Jean-Frédéric; Reinisch, Walter; Mulder, Chris J J; Bloemena, Elisabeth; van Bodegraven, Adriaan A
2015-01-01
Nodular regenerative hyperplasia (NRH) of the liver is associated with several diseases and drugs. Clinical symptoms of NRH may vary from absence of symptoms to full-blown (non-cirrhotic) portal hypertension. However, diagnosing NRH is challenging. The objective of this study was to determine inter- and intraobserver agreement on the histopathologic diagnosis of NRH. Liver specimens (n=48) previously diagnosed as NRH, were reviewed for the presence of NRH by seven pathologists without prior knowledge of the original diagnosis or clinical background. The majority of the liver specimens were from thiopurine using inflammatory bowel disease patients. Histopathologic features contributing to NRH were also assessed. Criteria for NRH were modified by consensus and subsequently validated. Interobserver agreement was evaluated by using the standard kappa index. After review, definite NRH, inconclusive NRH and no NRH were found in 35% (23-40%), 21% (13-27%) and 44% (38-56%), respectively (median, IQR). The median interobserver agreement for NRH was poor (κ = 0.20, IQR 0.14-0.28). The intraobserver variability on NRH ranged between 14% and 71%. After modification of the criteria and exclusion of biopsies with technical shortcomings, the interobserver agreement on the diagnosis NRH was fair (κ = 0.45). The interobserver agreement on the histopathologic diagnosis of NRH was poor, even when assessed by well-experienced liver pathologists. Modification of the criteria of NRH based on consensus effort and exclusion of biopsies of poor quality led to a fairly increased interobserver agreement. The main conclusion of this study is that NRH is a clinicopathologic diagnosis that cannot reliably be based on histopathology alone.
Park, Chang Suk; Kim, Sung Hun; Jung, Na Young; Choi, Jae Jung; Kang, Bong Joo; Jung, Hyun Seouk
2015-03-01
Elastography is a newly developed noninvasive imaging technique that uses ultrasound (US) to evaluate tissue stiffness. The interpretation of the same elastographic images may vary between reviewers. Because breast lesions are usually reported according to American College of Radiology Breast Imaging Reporting and Data System (ACR BI-RADS) lexicons and final category, we tried to compare observer variability between lexicons and final categorization of US BI-RADS and the elasticity score of US elastography. From April 2009 to February 2010, 1356 breast lesions in 1330 patients underwent ultrasound-guided core biopsy. Among them, 63 breast lesions in 55 patients (mean age, 45.7 years; range, 21-79 years) underwent both conventional ultrasound and elastography and were included in this study. Two radiologists independently performed conventional ultrasound and elastography, and another three observers reviewed conventional ultrasound images and elastography videos. Observers independently recorded the elasticity score for a 5-point scoring system proposed by Itoh et al., BI-RADS lexicons and final category using ultrasound BI-RADS. The histopathologic results were obtained and used as the reference standard. Interobserver variability was evaluated. Of the 63 lesions, 42 (66.7%) were benign, and 21 (33.3%) were malignant. The highest value of concordance among all variables was achieved for the elasticity score (k = 0.59), followed by shape (k = 0.54), final category (k = 0.48), posterior acoustic features (k = 0.44), echogenicity and orientation (k = 0.43). The least concordances were margin (k = 0.26), lesion boundary (k = 0.29) and calcification (k = 0.3). Elasticity score showed a higher level of interobserver agreement for the diagnosis of breast lesions than BI-RADS lexicons and final category.
Lai, Jeffrey K C; Robertson, Patricia L; Goh, Christine; Szer, Jeff
2018-02-01
To evaluate the intraobserver and interobserver agreement for bone marrow burden (BMB) scores for individual examinations and for the change in BMB score over time in the same patient. A total of 119 sets of MR images of the lumbar spine and femora from 60 patients with Gaucher disease were included. Each set of MR images was scored using the BMB score independently by two experienced MSK radiologists. One radiologist performed a second read four weeks later. Intraobserver and interobserver agreement was assessed using Bland-Altman analysis and weighted kappa scores. BMB scores (n=119) demonstrated fair intraobserver agreement (weighted kappa 0.53) with a mean difference of -0.20 and 95% limits of agreement (LOA) of (-3.41, 3.01). Interobserver agreement was poor, with a weighted kappa of 0.28, a mean difference of -0.16 and 95% LOA of (-4.45, 4.11). Change in BMB scores over time (n=59) demonstrated poor/fair intraobserver agreement (weighted kappa 0.41, mean difference -0.20 and 95% LOA (-4.35, 3.94)). Interobserver agreement was poor (weighted kappa 0.25, mean difference -0.12 with wide 95% LOA (-6.23, 5.99)). Significant interobserver, and to a lesser extent intraobserver, variation occurs with blinded BMB scoring of Gaucher disease. Copyright © 2016 Elsevier Inc. All rights reserved.
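The Bland-Altman quantities reported above (mean difference and 95% limits of agreement) can be illustrated with a minimal Python sketch. It assumes the conventional definition, LOA = mean difference ± 1.96 × SD of the paired differences; the abstract does not state the exact formula the authors used, and the function name and sample scores below are hypothetical. Under that assumption, the reported intraobserver LOA of (-3.41, 3.01) around -0.20 would correspond to an SD of roughly 1.64.

```python
from statistics import mean, stdev


def bland_altman(read_a, read_b):
    """Return (bias, (lower_loa, upper_loa)) for two paired series of scores.

    Assumes the conventional 95% limits of agreement:
    bias +/- 1.96 * sample SD of the paired differences.
    """
    diffs = [a - b for a, b in zip(read_a, read_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical scores from two reads of the same four examinations.
bias, (lower, upper) = bland_altman([1, 2, 3, 4], [1, 1, 4, 4])
print(f"bias={bias:.2f}, 95% LOA=({lower:.2f}, {upper:.2f})")
```

Wide limits relative to the score range, as in the interobserver results above, indicate that two readers can disagree by several points on the same examination even when the mean difference is close to zero.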
Beiderwellen, Karsten J; Poeppel, Thorsten D; Hartung-Knemeyer, Verena; Buchbender, Christian; Kuehl, Hilmar; Bockisch, Andreas; Lauenstein, Thomas C
2013-05-01
The aim of this pilot study was to demonstrate the potential of simultaneously acquired 68-Gallium-DOTA-D-Phe1-Tyr3-octreotide (68Ga-DOTATOC) positron emission tomography/magnetic resonance imaging (PET/MRI) in comparison with 68Ga-DOTATOC PET/computed tomography (PET/CT) in patients with known gastroenteropancreatic neuroendocrine tumors (NETs). Eight patients (4 women and 4 men; mean [SD] age, 54 [17] years; median, 55 years; range 25-74 years) with histopathologically confirmed NET and scheduled 68Ga-DOTATOC PET/CT were prospectively enrolled for an additional integrated PET/MRI scan. Positron emission tomography/computed tomography was performed using a triple-phase contrast-enhanced full-dose protocol. Positron emission tomography/magnetic resonance imaging encompassed a diagnostic, contrast-enhanced whole-body MRI protocol. Two readers separately analyzed the PET/CT and PET/MRI data sets including their subscans in random order regarding lesion localization, count, and characterization on a 4-point ordinal scale (0, not visible; 1, benign; 2, indeterminate; and 3, malignant). In addition, each lesion was rated in consensus on a binary scale (allowing for benign/malignant only). Clinical imaging, existing prior examinations, and histopathology (if available) served as the standard of reference. In PET-positive lesions, the standardized uptake value (SUV max) was measured in consensus. A descriptive, case-oriented data analysis was performed, including determination of frequencies and percentages in detection of malignant, benign, and indeterminate lesions in connection to their localization. In addition, percentages in detection by a singular modality (such as PET, CT, or MRI) were calculated. Interobserver variability was calculated (Cohen's κ). The SUVs in the lesions in PET/CT and PET/MRI were measured, and the correlation coefficient (Pearson, 2-tailed) was calculated. 
According to the reference standard, 5 of the 8 patients had malignant NET lesions at the time of the examination. A total of 4 patients were correctly identified by PET/CT, with the PET and CT component correctly identifying 3 patients each. All 5 patients positive for NET disease were correctly identified by PET/MRI, with the MRI subscan identifying all 5 patients and the PET subscan identifying 3 patients. All lesions considered as malignant in PET/CT were equally depicted in and considered using PET/MRI. One liver lesion rated as "indetermined" in PET/CT was identified as metastasis in PET/MRI because of a diffusion restriction in diffusion-weighted imaging. Of the 4 lung lesions characterized in PET/CT, only 1 was depicted in PET/MRI. Of the 3 lymph nodes depicted in PET/CT, only 1 was characterized in PET/MRI. Interobserver reliability was equally very good in PET/CT (κ = 0.916) and PET/MRI (κ = 1.0). The SUV max measured in PET/CT and in PET/MRI showed a strong correlation (Pearson correlation coefficient, 0.996). This pilot study demonstrates the potential of 68Ga-DOTATOC PET/MRI in patients with gastroenteropancreatic NET, with special advantages in the characterization of abdominal lesions yet certain weaknesses inherent to MRI, such as lung metastases and hypersclerotic bone lesions.
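As an illustration of the Cohen's κ used above for two readers, the following Python sketch computes unweighted kappa from two lists of categorical ratings. The function name and the example ratings are hypothetical and are not taken from the study; the formula is the standard κ = (p_o − p_e) / (1 − p_e).

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater1)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from each rater's marginal label frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / n ** 2
    if p_e == 1:
        # Both raters used a single identical label; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical binary ratings (1 = malignant, 0 = benign) for four lesions.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

A value of 1.0, as reported for PET/MRI, means the two readers never disagreed on the rated items.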
Singleton, Neal; Agius, Lewis; Andrews, Stephen
2017-01-01
Various radiographic measurements that describe humeral head coverage by the acromion and the effect on rotator cuff pathology have been reported. This study aimed to describe and validate a new radiographic measurement, the acromiohumeral centre edge angle (ACEA). We compared the ACEA on computed tomography (CT) and plain X-ray to determine whether X-ray is accurate for measuring this angle. We then compared the results from this control population with 107 patients with acute rotator cuff tears. We compared functional outcomes in rotator cuff tear patients to determine whether the ACEA has any effect on outcome after surgery. An intra- and inter-observer variability analysis was performed and we compared the ACEA to the acromial index (AI) on rotation X-rays. The ACEA was comparable on CT and plain X-ray and was most accurate when true anteroposterior glenohumeral X-rays were used (15.94° vs. 15.87° on CT, p = 0.476). The ACEA showed high intra- and inter-observer reproducibility and was unchanged on internal and external rotation X-rays (20.48 vs. 20.47, p = 0.842), whereas the AI was significantly different (0.74 vs. 0.70, p < 0.001). The ACEA was significantly higher in our rotator cuff tear patients than the control population (23.9° vs. 16.6°, p < 0.001), although a higher ACEA was not associated with poorer outcomes. The ACEA is a valid measurement for describing humeral head coverage by the acromion and can be accurately measured on plain radiographs with good reproducibility. It is unaffected by shoulder rotation and was significantly higher in patients with acute rotator cuff tears.
Kim, Inwha; Kim, Dae Jung; Kim, Kyoung Ah; Yoon, Sang Wook; Lee, Jong Tae
2014-01-01
To investigate the feasibility and accuracy of multidetector computed tomography (MDCT) angiography for assessment of subsegmental tumor-feeding vessels in transarterial chemoembolization (TACE) of hepatocellular carcinoma (HCC). A total of 23 patients with 36 HCCs who underwent TACE during a 14-month period were enrolled. All patients underwent 3-phase dynamic MDCT within a month before TACE. Arterial phase MDCT images were retrospectively reformatted and analyzed for determination of single subsegmental tumor-feeding vessel using maximum intensity projection (MIP) and volume-rendering technique (VRT). Two radiologists independently assessed and scored the MIP and VRT images using 4-grade visual scores (grade 1, no depiction of tumor-feeding vessel; grade 2, indeterminate tumor-feeding vessel; grade 3, probable tumor-feeding vessel; and grade 4, good depiction of tumor-feeding vessel). The weighted kappa test was used to determine interobserver variability, and Wilcoxon signed rank test was used to differentiate visual scores of each technique. Results of digital subtraction angiography were defined as the criterion standard; therefore, assessment of subsegmental tumor-feeding vessel using MIP or VRT was compared with digital subtraction angiography, and the accuracy of each technique was calculated. Interobserver agreement (weighted kappa, 0.746 on VRT and 0.806 on MIP) was substantial to almost perfect. The visual scores for MIP (mean, 3.64 for reviewer 1 and 3.5 for reviewer 2) were higher than those for VRT (mean, 2.11 for reviewer 1 and 2.22 for reviewer 2; P = 0.000). The accuracy for assessing subsegmental tumor-feeding vessel was 22.2% for VRT and 77.8% for MIP. Multidetector CT angiography using MIP showed good imaging quality and high accuracy for determination of subsegmental tumor-feeding vessels.
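The weighted kappa used above for the ordinal 4-grade visual scores can be sketched as follows. This is only an illustration: the abstract does not say whether linear or quadratic weights were applied, so linear disagreement weights are an assumption here, and the function name and example grades are hypothetical.

```python
def weighted_kappa(rater1, rater2, categories):
    """Linearly weighted Cohen's kappa for ordinal ratings.

    `categories` lists the ordinal levels in increasing order
    (at least two levels are required).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)

    # Observed contingency table of rating pairs.
    obs = [[0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        obs[index[a]][index[b]] += 1
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # linear disagreement weight (assumed)
            observed += w * obs[i][j]
            expected += w * row[i] * col[j] / n
    # Undefined (division by zero) if both raters use one identical level.
    return 1 - observed / expected


# Hypothetical grades from two reviewers on a 3-level scale.
print(weighted_kappa([1, 2, 3], [1, 2, 2], categories=[1, 2, 3]))
```

Unlike unweighted kappa, this penalises a grade 1 versus grade 4 disagreement more heavily than a grade 3 versus grade 4 disagreement, which suits ordinal scales.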
Schelhorn, Juliane; Neudorf, Ulrich; Schemuth, Haemi; Nensa, Felix; Nassenstein, Kai; Schlosser, Thomas W
2015-11-01
Patients with corrected tetralogy of Fallot (cToF) are prone to develop pulmonary regurgitation and right ventricular enlargement resulting in long-term complications, thus correct right ventricular volumetric monitoring is crucial. However, it remains controversial which cardiovascular magnetic resonance imaging (CMRI) slice orientation is most appropriate in cToF for the analysis of the right ventricular volume. To investigate which slice orientation is most suited for right ventricular volumetry in cToF we compared short-axis and axial slices, and furthermore we compared right ventricular data between CMRI and echocardiography. Thirty CMRI examinations of 27 patients with cToF were included retrospectively. Right ventricular end-diastolic (EDV) and end-systolic volume (ESV) were derived from short-axis and axial cine CMRI planes. Furthermore, pulmonary trunk forward flow in phase-contrast CMRI and right ventricular inner diastolic diameter in echocardiography (RVIDdiast) were measured. Intra- and inter-observer agreement were assessed for cine CMRI data by Bland-Altman and variance analysis. CMRI cine and phase-contrast data, and CMRI cine and echocardiographic data, were compared by Pearson correlation. Intra- and inter-observer variability for right ventricular EDV were significantly lower in axial slices (P = 0.016, P = 0.010). For right ventricular ESV a trend towards a lower intra- and inter-observer variability in axial slices was found (P = 0.063, P = 0.138). Right ventricular stroke volume in short-axis (r = 0.872, P < 0.001) and in axial (r = 0.914, P < 0.001) planes correlated highly and very highly, respectively, with pulmonary trunk forward flow in phase-contrast CMRI. RVIDdiast correlated highly with right ventricular EDV assessed by short-axis and axial CMRI (P < 0.001, P < 0.001). Due to lower intra- and inter-observer variability, axial slices are recommended for right ventricular volumetry in cToF. © The Foundation Acta Radiologica 2014.
Duong, Luc; Cheriet, Farida; Labelle, Hubert; Cheung, Kenneth M C; Abel, Mark F; Newton, Peter O; McCall, Richard E; Lenke, Lawrence G; Stokes, Ian A F
2009-08-01
Interobserver and intraobserver reliability study for the identification of the Lenke classification lumbar modifier by a panel of experts compared with a computer algorithm. To measure the variability of the Lenke classification lumbar modifier and determine if computer assistance using 3-dimensional spine models can improve the reliability of classification. The lumbar modifier has been proposed to subclassify Lenke scoliotic curve types into A, B, and C on the basis of the relationship between the central sacral vertical line (CSVL) and the apical lumbar vertebra. Landmarks for identification of the CSVL have not been clearly defined, and the reliability of the actual CSVL position and lumbar modifier selection have never been tested independently. Therefore, the value of the lumbar modifier for curve classification remains unknown. The preoperative radiographs of 68 patients with adolescent idiopathic scoliosis presenting a Lenke type 1 curve were measured manually twice by 6 members of the Scoliosis Research Society 3-dimensional classification committee at 6 months interval. Intraobserver and interobserver reliability was quantified using the percentage of agreement and kappa statistics. In addition, the lumbar curve of all subjects was reconstructed in 3-dimension using a stereoradiographic technique and was submitted to a computer algorithm to infer the lumbar modifier according to measurements from the pedicles. Interobserver rates for the first trial showed a mean kappa value of 0.56. Second trial rates were higher with a mean kappa value of 0.64. Intraobserver rates were evaluated at a mean kappa value of 0.69. The computer algorithm was successful in identifying the lumbar curve type and was in agreement with the observers by a proportion up to 93%. Agreement between and within observers for the Lenke lumbar modifier is only moderate to substantial with manual methods. 
Computer assistance with 3-dimensional models of the spine has the potential to decrease this variability.
Podlesnikar, Tomaz; Prihadi, Edgard A; van Rosendael, Philippe J; Vollema, E Mara; van der Kley, Frank; de Weger, Arend; Ajmone Marsan, Nina; Naji, Franjo; Fras, Zlatko; Bax, Jeroen J; Delgado, Victoria
2018-01-01
Accurate aortic annulus sizing is key for selection of appropriate transcatheter aortic valve implantation (TAVI) prosthesis size. The present study compared novel automated 3-dimensional (3D) transesophageal echocardiography (TEE) software and multidetector row computed tomography (MDCT) for aortic annulus sizing and investigated the influence of the quantity of aortic valve calcium (AVC) on the selection of TAVI prosthesis size. A total of 83 patients with severe aortic stenosis undergoing TAVI were evaluated. Maximal and minimal aortic annulus diameter, perimeter, and area were measured. AVC was assessed with computed tomography. The low and high AVC burden groups were defined according to the median AVC score. Overall, 3D TEE measurements slightly underestimated the aortic annulus dimensions as compared with MDCT (mean differences between maximum, minimum diameter, perimeter, and area: -1.7 mm, 0.5 mm, -2.7 mm, and -13 mm², respectively). The agreement between 3D TEE and MDCT on aortic annulus dimensions was superior among patients with low AVC burden (<3,025 arbitrary units) compared with patients with high AVC burden (≥3,025 arbitrary units). The interobserver variability was excellent for both methods. 3D TEE and MDCT led to the same prosthesis size selection in 88%, 95%, and 81% of patients in the total population, the low, and the high AVC burden group, respectively. In conclusion, the novel automated 3D TEE imaging software allows accurate and highly reproducible measurements of the aortic annulus dimensions and shows excellent agreement with MDCT to determine the TAVI prosthesis size, particularly in patients with low AVC burden. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
The estimation of bone cyst volume using the Cavalieri principle on computed tomography images.
Say, Ferhat; Gölpınar, Murat; Kılınç, Cem Yalın; Şahin, Bünyamin
2018-01-01
To evaluate the volume of bone cyst using the planimetry method of the Cavalieri principle. A retrospective analysis was carried out on data from 25 computed tomography (CT) images of patients with bone cyst. The volume of the cysts was calculated by two independent observers using the planimetry method. The procedures were repeated 1 month later by each observer. The overall mean volume of the bone cyst was 29.25 ± 25.86 cm³. The mean bone cyst volumes calculated by the first observer for the first and second sessions were 29.18 ± 26.14 and 29.27 ± 26.19 cm³, respectively. The mean bone cyst volumes calculated by the second observer for the first and second sessions were 29.32 ± 26.36 and 29.23 ± 26.36 cm³, respectively. Statistical analysis showed no difference and high agreement between the first and second measurements of both observers. The Bland-Altman plots showed strong intraobserver and interobserver concordance in the measurement of the bone cyst volume. The mean total time necessary to obtain the cyst volume by the two observers was 5.27 ± 2.30 min. The bone cyst of the patients can be objectively evaluated using the planimetry method of the Cavalieri principle on CT. This method showed high interobserver and intraobserver agreement. This volume measurement can be used to evaluate cyst remodeling, including complete healing and cyst recurrence.
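The planimetry form of the Cavalieri principle referred to above estimates a volume as the slice thickness multiplied by the sum of the cross-sectional areas traced on consecutive sections. The Python sketch below shows that arithmetic only; the function name, the areas and the slice thickness are hypothetical, and it assumes contiguous slices of equal thickness, which the abstract does not specify.

```python
def cavalieri_volume(section_areas_cm2, slice_thickness_cm):
    """Estimate volume (cm^3) as thickness * sum of traced section areas.

    Assumes contiguous, equally thick slices with no inter-slice gap.
    """
    return slice_thickness_cm * sum(section_areas_cm2)


# Hypothetical cyst traced on four consecutive 0.5 cm CT slices.
areas = [1.0, 2.5, 3.0, 1.5]  # cm^2, one traced area per slice
print(cavalieri_volume(areas, slice_thickness_cm=0.5))  # 4.0
```

Because the estimate depends only on the traced areas, observer variability in this method comes from how each observer outlines the cyst boundary on each slice.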
Diagnosing Nodular Regenerative Hyperplasia of the Liver Is Thwarted by Low Interobserver Agreement
Jharap, Bindia; van Asseldonk, Dirk P.; de Boer, Nanne K. H.; Bedossa, Pierre; Diebold, Joachim; Jonker, A. Mieke; Leteurtre, Emmanuelle; Verheij, Joanne; Wendum, Dominique; Wrba, Fritz; Zondervan, Pieter E.; Colombel, Jean-Frédéric; Reinisch, Walter; Mulder, Chris J. J.; Bloemena, Elisabeth; van Bodegraven, Adriaan A.
2015-01-01
Background and Aims Nodular regenerative hyperplasia (NRH) of the liver is associated with several diseases and drugs. Clinical symptoms of NRH may vary from absence of symptoms to full-blown (non-cirrhotic) portal hypertension. However, diagnosing NRH is challenging. The objective of this study was to determine inter- and intraobserver agreement on the histopathologic diagnosis of NRH. Methods Liver specimens (n=48) previously diagnosed as NRH, were reviewed for the presence of NRH by seven pathologists without prior knowledge of the original diagnosis or clinical background. The majority of the liver specimens were from thiopurine using inflammatory bowel disease patients. Histopathologic features contributing to NRH were also assessed. Criteria for NRH were modified by consensus and subsequently validated. Interobserver agreement was evaluated by using the standard kappa index. Results After review, definite NRH, inconclusive NRH and no NRH were found in 35% (23-40%), 21% (13-27%) and 44% (38-56%), respectively (median, IQR). The median interobserver agreement for NRH was poor (κ = 0.20, IQR 0.14-0.28). The intraobserver variability on NRH ranged between 14% and 71%. After modification of the criteria and exclusion of biopsies with technical shortcomings, the interobserver agreement on the diagnosis NRH was fair (κ = 0.45). Conclusions The interobserver agreement on the histopathologic diagnosis of NRH was poor, even when assessed by well-experienced liver pathologists. Modification of the criteria of NRH based on consensus effort and exclusion of biopsies of poor quality led to a fairly increased interobserver agreement. The main conclusion of this study is that NRH is a clinicopathologic diagnosis that cannot reliably be based on histopathology alone. PMID:26054009
Radiographic classifications in Perthes disease
Huhnstock, Stefan; Svenningsen, Svein; Merckoll, Else; Catterall, Anthony; Terjesen, Terje; Wiig, Ola
2017-01-01
Background and purpose Different radiographic classifications have been proposed for prediction of outcome in Perthes disease. We assessed whether the modified lateral pillar classification would provide more reliable interobserver agreement and prognostic value compared with the original lateral pillar classification and the Catterall classification. Patients and methods 42 patients (38 boys) with Perthes disease were included in the interobserver study. Their mean age at diagnosis was 6.5 (3–11) years. 5 observers classified the radiographs in 2 separate sessions according to the Catterall classification, the original and the modified lateral pillar classifications. Interobserver agreement was analysed using weighted kappa statistics. We assessed the associations between the classifications and femoral head sphericity at 5-year follow-up in 37 non-operatively treated patients in a crosstable analysis (Gamma statistics for ordinal variables, γ). Results The original lateral pillar and Catterall classifications showed moderate interobserver agreement (kappa 0.49 and 0.43, respectively) while the modified lateral pillar classification had fair agreement (kappa 0.40). The original lateral pillar classification was strongly associated with the 5-year radiographic outcome, with a mean γ correlation coefficient of 0.75 (95% CI: 0.61–0.95) among the 5 observers. The modified lateral pillar and Catterall classifications showed moderate associations (mean γ correlation coefficient 0.55 [95% CI: 0.38–0.66] and 0.64 [95% CI: 0.57–0.72], respectively). Interpretation The Catterall classification and the original lateral pillar classification had sufficient interobserver agreement and association to late radiographic outcome to be suitable for clinical use. Adding the borderline B/C group did not increase the interobserver agreement or prognostic value of the original lateral pillar classification. PMID:28613966
A Probabilistic Method for Estimation of Bowel Wall Thickness in MR Colonography
Menys, Alex; Jaffer, Asif; Bhatnagar, Gauraang; Punwani, Shonit; Atkinson, David; Halligan, Steve; Hawkes, David J.; Taylor, Stuart A.
2017-01-01
MRI has recently been applied as a tool to quantitatively evaluate the response to therapy in patients with Crohn’s disease, and is the preferred choice for repeated imaging. Bowel wall thickness on MRI is an important biomarker of underlying inflammatory activity, being abnormally increased in the acute phase and reducing in response to successful therapy; however, a poor level of interobserver agreement of measured thickness is reported and therefore a system for accurate, robust and reproducible measurements is desirable. We propose a novel method for estimating bowel wall-thickness to improve the poor interobserver agreement of the manual procedure. We show that the variability of wall thickness measurement between the algorithm and observer measurements (0.25mm ± 0.81mm) has differences which are similar to observer variability (0.16mm ± 0.64mm). PMID:28072831
Sinclair, R C F; Danjoux, G R; Goodridge, V; Batterham, A M
2009-11-01
The variability between observers in the interpretation of cardiopulmonary exercise tests may impact upon clinical decision making and affect the risk stratification and peri-operative management of a patient. The purpose of this study was to quantify the inter-reader variability in the determination of the anaerobic threshold (V-slope method). A series of 21 cardiopulmonary exercise tests from patients attending a surgical pre-operative assessment clinic were read independently by nine experienced clinicians regularly involved in clinical decision making. The grand mean for the anaerobic threshold was 10.5 ml O(2).kg body mass(-1).min(-1). The technical error of measurement was 8.1% (circa 0.9 ml.kg(-1).min(-1); 90% confidence interval, 7.4-8.9%). The mean absolute difference between readers was 4.5% with a typical random error of 6.5% (6.0-7.2%). We conclude that the inter-observer variability for experienced clinicians determining the anaerobic threshold from cardiopulmonary exercise tests is acceptable.
Schuurmann, Richte C L; Overeem, Simon P; van Noort, Kim; de Vries, Bastiaan A; Slump, Cornelis H; de Vries, Jean-Paul P M
2018-04-01
To validate a novel methodology employing regular postoperative computed tomography angiography (CTA) scans to assess essential factors contributing to durable endovascular aneurysm repair (EVAR), including endograft deployment accuracy, neck adaptation to radial forces, and effective apposition of the fabric within the aortic neck. Semiautomatic calculation of the apposition surface between the endograft and the infrarenal aortic neck was validated in vitro by comparing the calculated surfaces over a cylindrical silicon model with known dimensions on CTA reconstructions with various slice thicknesses. Interobserver variabilities were assessed for calculating endograft position, apposition, and expansion in a retrospective series of 24 elective EVAR patients using the repeatability coefficient (RC) and the intraclass correlation coefficient (ICC). The variability of these calculations was compared with variability of neck length and diameter measurements on centerline reconstructions of the preoperative and first postoperative CTA scans. In vitro validation showed accurate calculation of apposition, with deviation of 2.8% from the true surface for scans with 1-mm slice thickness. Excellent agreement was achieved for calculation of the endograft dimensions (ICC 0.909 to 0.996). Variability was low for calculation of endograft diameter (RC 2.3 mm), fabric distances (RC 5.2 to 5.7 mm), and shortest apposition length (RC 4.1 mm), which was the same as variability of regular neck diameter (RC 0.9 to 1.1 mm) and length (RC 4.0 to 8.0 mm) measurements. This retrospective validation study showed that apposition surfaces between an endograft and the infrarenal neck can be calculated accurately and with low variability. Determination of the (ap)position of the endograft in the aortic neck and detection of subtle changes during follow-up are crucial to determining eventual failure after EVAR.
Assessment of colon polyp morphology: Is education effective?
Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja
2017-01-01
AIM To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. METHODS For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. RESULTS The overall Fleiss’ kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. CONCLUSION The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy. PMID:28974894
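Fleiss' kappa, the multi-rater agreement statistic reported above, can be sketched in Python as follows. The function name and the small count tables are hypothetical; the sketch assumes every subject (here, every video clip) is rated by the same number of raters, which is a requirement of the standard formula.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for multiple raters.

    counts[i][j] is the number of raters who assigned subject i to
    category j; each row must sum to the same number of raters (>= 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Overall proportion of all assignments falling in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    p_e = sum(p * p for p in p_j)
    # Undefined (division by zero) if every rating is the same category.
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: 2 clips, 3 raters, 2 morphology categories.
print(fleiss_kappa([[3, 0], [0, 3]]))  # 1.0, complete agreement
```

With 30 participants and 60 clips, as in the study, `counts` would be a 60-row table whose rows each sum to 30.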
Assessment of colon polyp morphology: Is education effective?
Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja
2017-09-14
To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. The overall Fleiss' kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy.
Faita, Francesco; Gemignani, Vincenzo; Bianchini, Elisabetta; Giannarelli, Chiara; Ghiadoni, Lorenzo; Demi, Marcello
2008-09-01
The purpose of this report is to describe an automatic real-time system for evaluation of the carotid intima-media thickness (CIMT) characterized by 3 main features: minimal interobserver and intraobserver variability, real-time capabilities, and great robustness against noise. One hundred fifty carotid B-mode ultrasound images were used to validate the system. Two skilled operators were involved in the analysis. Agreement with the gold standard, defined as the mean of 2 manual measurements of a skilled operator, and the interobserver and intraobserver variability were quantitatively evaluated by regression analysis and Bland-Altman statistics. The automatic measure of the CIMT showed a mean bias +/- SD of 0.001 +/- 0.035 mm toward the manual measurement. The intraobserver variability, evaluated with Bland-Altman plots, showed a bias that was not significantly different from 0, whereas the SD of the differences was greater in the manual analysis (0.038 mm) than in the automatic analysis (0.006 mm). For interobserver variability, the automatic measurement had a bias that was not significantly different from 0, with a satisfactory SD of the differences (0.01 mm), whereas in the manual measurement, a little bias was present (0.012 mm), and the SD of the differences was noticeably greater (0.044 mm). The CIMT has been accepted as a noninvasive marker of early vascular alteration. At present, the manual approach is largely used to estimate CIMT values. However, that method is highly operator dependent and time-consuming. For these reasons, we developed a new system for the CIMT measurement that conjugates precision with real-time analysis, thus providing considerable advantages in clinical practice.
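The bias and SD-of-differences figures quoted in this abstract are the usual Bland-Altman quantities. The sketch below shows how they are conventionally computed for paired measurements; the function name and sample numbers are illustrative assumptions, and the 95% limits of agreement (bias ± 1.96 SD) are the standard convention rather than something the abstract reports.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, sd, (lower, upper)) for paired measurements.

    bias is the mean of the pairwise differences, sd is their sample
    standard deviation, and the limits are bias +/- 1.96 * sd.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical thickness readings in mm from two operators.
operator_1 = [0.62, 0.71, 0.58, 0.80]
operator_2 = [0.60, 0.73, 0.57, 0.78]
print(bland_altman(operator_1, operator_2))
```

A bias close to 0 with a small SD, as reported for the automatic method, means the two series agree closely and without systematic offset.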
Vinod, Shalini K; Min, Myo; Jameson, Michael G; Holloway, Lois C
2016-06-01
Inter-observer variability (IOV) in target volume and organ-at-risk (OAR) delineation is a source of potential error in radiation therapy treatment. The aims of this study were to identify interventions shown to reduce IOV in volume delineation. Medline and Pubmed databases were queried for relevant articles using various keywords to identify articles which evaluated IOV in target or OAR delineation for multiple (>2) observers. The search was limited to English language articles and to those published from 1 January 2000 to 31 December 2014. Reference lists of identified articles were scrutinised to identify relevant studies. Studies were included if they reported IOV in contouring before and after an intervention including the use of additional or alternative imaging. Fifty-six studies were identified. These were grouped into evaluation of guidelines (n = 9), teaching (n = 9), provision of an autocontour (n = 7) and the impact of imaging (n = 31) on IOV. Guidelines significantly reduced IOV in 7/9 studies. Teaching interventions reduced IOV in 8/9 studies, statistically significant in 4. The provision of an autocontour improved consistency of contouring in 6/7 studies, statistically significant in 5. The effect of additional imaging on IOV was variable. Pre-operative CT was useful in reducing IOV in contouring breast and liver cancers, PET scans in lung cancer, rectal cancer and lymphoma and MRI scans in OARs in head and neck cancers. Inter-observer variability in volume delineation can be reduced with the use of guidelines, provision of autocontours and teaching. The use of multimodality imaging is useful in certain tumour sites. © 2016 The Royal Australian and New Zealand College of Radiologists.
van der Palen, Roel L F; Roest, Arno A W; van den Boogaard, Pieter J; de Roos, Albert; Blom, Nico A; Westenberg, Jos J M
2018-05-26
The aim was to investigate scan-rescan reproducibility and observer variability of segmental aortic 3D systolic wall shear stress (WSS) by phase-specific segmentation with 4D flow MRI in healthy volunteers. Ten healthy volunteers (age 26.5 ± 2.6 years) underwent aortic 4D flow MRI twice. Maximum 3D systolic WSS (WSSmax) and mean 3D systolic WSS (WSSmean) for five thoracic aortic segments over five systolic cardiac phases by phase-specific segmentations were calculated. Scan-rescan analysis and observer reproducibility analysis were performed. Scan-rescan data showed overall good reproducibility for WSSmean (coefficient of variation, COV 10-15%) with moderate-to-strong intraclass correlation coefficient (ICC 0.63-0.89). The variability in WSSmax was high (COV 16-31%) with moderate-to-good ICC (0.55-0.79) for different aortic segments. Intra- and interobserver reproducibility was good-to-excellent for regional aortic WSSmax (ICC ≥ 0.78; COV ≤ 17%) and strong-to-excellent for WSSmean (ICC ≥ 0.86; COV ≤ 11%). In general, ascending aortic segments showed more WSSmax/WSSmean variability compared to aortic arch or descending aortic segments for scan-rescan, intraobserver and interobserver comparison. Scan-rescan reproducibility was good for WSSmean and moderate for WSSmax for all thoracic aortic segments over multiple systolic phases in healthy volunteers. Intra/interobserver reproducibility for segmental WSS assessment was good-to-excellent. Variability of WSSmax is higher and should be taken into account in case of individual follow-up or in comparative rest-stress studies to avoid misinterpretation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pogson, EM; University of Wollongong, Wollongong, NSW; Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW
2016-06-15
Purpose: Breast cancers predominantly arise from Glandular Breast Tissue (GBT). If the GBT can be treated effectively post-operatively utilising radiotherapy this may be adequate volumetric coverage for adjuvant breast radiotherapy. Adequate imaging of the GBT is necessary and will be assessed between MRI and CT modalities. GBT visualisation is acknowledged to be qualitatively superior on Magnetic Resonance Image (MRI) compared to Computed Tomography (CT), the current radiotherapy imaging standard, however this has not been quantitatively assessed. For radiotherapy purposes it is important that any treatment volume can be consistently defined between observers. This study investigates the consistency of CT and MRI GBT contours for potential radiotherapy planning. Methods: Ten experts (9 breast radiation oncologists and 1 radiologist) contoured the extent of the visible GBT for 33 patients on MRI and CT (both without contrast), which was performed according to a contouring guideline in supine and prone patient positions. The GBT volume was not a conventional whole breast radiotherapy planning volume, but rather the extent of GBT that was indicated from the CT or MR imaging. Volumes were compared utilizing the dice similarity coefficient (DSC), kappa statistic, and Hausdorff Distances (HDs) to ascertain the modality that was most consistently volumed. Results: The inter-observer concordance was of substantial agreement (kappa above 0.6) for the CT supine, CT prone, MRI supine and MRI prone datasets. The MRI GBT volumes were larger than the CT GBT volumes (p<0.001). Inter-observer conformity was higher for CT than MRI, although the magnitude of this difference was small (VOI<0.04). Conformity between modalities (CT and MRI) was in agreement for both prone and supine, DSC=0.75. Prone GBT volumes were larger than supine for both MRI and CT. Conclusion: MRI improves the extent of GBT delineation. 
The role of MRI guided, GBT-targeted radiotherapy requires investigation in a clinical trial. This work was supported by a grant number APP1033237 from Cancer Australia and the National Breast Cancer Foundation.
Can computed tomography aid in diagnosis of intramural hematomas of the intestinal wall?
Ulusan, Serife; Pekoz, Burcak; Sariturk, Cagla
2015-12-01
We sought to use computed tomography (CT) data to support the correct differential diagnosis of patients with spontaneous intramural hematomas of the gastrointestinal tract, to aid in the clinical management of those using oral anticoagulants. Patient data were retrospectively analyzed and patients were divided into two groups. The first group contained 10 patients (5 females, 5 males, median age 65 years [range 35-79 years]) who had been diagnosed with spontaneous intramural hematomas of the gastrointestinal tract. The second group contained nine patients (5 females, 4 males, median age 41 years [range 24-56 years]) who exhibited intestinal wall thickening on CT, and who had been diagnosed with ulcerative colitis, Crohn's disease, ameboma, and lymphoma. The enhancement patterns in the CT images of the two groups were compared by an experienced and inexperienced radiologist. The differences in values were subjected to ROC analysis. Inter-observer variability was excellent (0.84) when post-contrast CT images were evaluated, as were the subtraction values (0.89). The subtracted values differed significantly between the two groups (p=0.0001). A cutoff of +31.5 HU was optimal in determining whether a hematoma was or was not present. Contrast enhancement of an intestinal wall hematoma is less than that of other intestinal wall pathologies associated with increased wall thickness. If the post-contrast enhancement of a thickened intestinal wall is less than +31.5 HU, a wall hematoma is possible. © Acta Gastro-Enterologica Belgica.
Iosca, Simona; Lumia, Domenico; Bracchi, Elena; Duka, Ejona; De Bon, Monica; Lekaj, Manjola; Uccella, Stefano; Ghezzi, Fabio; Fugazzola, Carlo
2013-01-01
This study evaluates retrospectively the accuracy and reproducibility of multislice computed tomography with colon water distension (MSCT-c) in diagnosing bowel (BE) and ureteral (UE) endometriosis. Sixty-four patients underwent MSCT-c and videolaparoscopic surgery. Two radiologists reviewed MSCT-c examinations: sensitivity and specificity were calculated, considering histological exam as reference standard. In the BE cases, the degree of bowel wall infiltration was also assessed. Sensitivity and specificity for both readers were 100% and 97.6% for BE and 72.2% and 100% for UE; the interobserver agreement was excellent. The degree of bowel wall involvement was correctly defined in 90.9% of cases. MSCT-c is an accurate and reproducible technique but-considering the age of the patients-delivers a nonnegligible radiation dose. © 2013 Elsevier Inc. All rights reserved.
Mutual information-based feature selection for radiomics
NASA Astrophysics Data System (ADS)
Oubel, Estanislao; Beaumont, Hubert; Iannessi, Antoine
2016-03-01
Background: The extraction and analysis of image features (radiomics) is a promising field in the precision medicine era, with applications to prognosis, prediction, and response to treatment quantification. In this work, we present a mutual information-based method for quantifying reproducibility of features, a necessary step for qualification before their inclusion in big data systems. Materials and Methods: Ten patients with Non-Small Cell Lung Cancer (NSCLC) lesions were followed over time (7 time points on average) with Computed Tomography (CT). Five observers segmented lesions by using a semi-automatic method and 27 features describing shape and intensity distribution were extracted. Inter-observer reproducibility was assessed by computing the multi-information (MI) of feature changes over time, and the variability of global extrema. Results: The highest MI values were obtained for volume-based features (VBF). The lesion mass (M), surface to volume ratio (SVR) and volume (V) presented statistically significantly higher values of MI than the rest of the features. Within the same VBF group, SVR also showed the lowest variability of extrema. The correlation coefficient (CC) of feature values was unable to distinguish between features. Conclusions: MI made it possible to discriminate three features (M, SVR, and V) from the rest in a statistically significant manner. This result is consistent with the order obtained when sorting features by increasing values of extrema variability. MI is a promising alternative for selecting features to be considered as surrogate biomarkers in a precision medicine context.
Hyun, Yil Sik; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo
2013-01-01
Accurate diagnosis of gastric intestinal metaplasia is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing gastric intestinal metaplasia (IM). The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Selected 50 cases, taken with HD endoscopy, were sent for a diagnostic inquiry of gastric IM through visual inspection to five experienced and five inexperienced endoscopists. The interobserver agreement between endoscopists was evaluated to verify the diagnostic reliability of HD endoscopy in diagnosing IM, and the diagnostic accuracy, sensitivity, and specificity were evaluated for validity of HD endoscopy in diagnosing IM. Interobserver agreement among the experienced endoscopists was "poor" (κ = 0.38) and it was also "poor" (κ = 0.33) among the inexperienced endoscopists. The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Since diagnosis through visual inspection is unreliable in the diagnosis of IM, all suspicious areas for gastric IM should be considered to be biopsied. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy of gastric IM. PMID:23678267
Costantini, Massimo; Sciallero, Stefania; Giannini, Augusto; Gatteschi, Beatrice; Rinaldi, Paolo; Lanzanova, Giuseppe; Bonelli, Luigina; Casetti, Tino; Bertinelli, Elisabetta; Giuliani, Orietta; Castiglione, Guido; Mantellini, Paola; Naldoni, Carlo; Bruzzi, Paolo
2003-03-01
Current clinical practice guidelines for patients with colorectal polyps are mainly based on the histologic characteristics of their lesions. However, interobserver variability in the assessment of specific polyp characteristics was evaluated in very few studies. The purpose of this study was to evaluate the interobserver agreement of four pathologists in the diagnosis of histologic type of colorectal polyps and in the degree of dysplasia and of infiltrating carcinoma in adenomas. A stratified random sample of 100 polyps was obtained from the 4,889 polyps resected within the Multicentre Adenoma Colorectal Study (SMAC), and the slides were blindly reviewed by the four pathologists. Agreement was analyzed using kappa statistics. A median kappa of 0.89 (range 0.79-1.0) was estimated for the interobserver agreement for the diagnosis of hyperplastic polyp vs. adenoma. The agreement in the diagnosis of tubular, tubulovillous, and villous type was given by median kappa values of 0.50, 0.15, and 0.36, respectively. The median kappa for the diagnosis of infiltrating carcinoma was 0.78 (range 0.73-0.84). Agreement on diagnosis of adenoma histologic subtypes, degrees of dysplasia, or infiltrating carcinoma in adenoma was moderate. A simpler classification might help to better identify patients at different risk of colorectal cancer.
Hyun, Yil Sik; Han, Dong Soo; Bae, Joong Ho; Park, Hye Sun; Eun, Chang Soo
2013-05-01
Accurate diagnosis of gastric intestinal metaplasia is important; however, conventional endoscopy is known to be an unreliable modality for diagnosing gastric intestinal metaplasia (IM). The aims of the study were to evaluate the interobserver variation in diagnosing IM by high-definition (HD) endoscopy and the diagnostic accuracy of this modality for IM among experienced and inexperienced endoscopists. Selected 50 cases, taken with HD endoscopy, were sent for a diagnostic inquiry of gastric IM through visual inspection to five experienced and five inexperienced endoscopists. The interobserver agreement between endoscopists was evaluated to verify the diagnostic reliability of HD endoscopy in diagnosing IM, and the diagnostic accuracy, sensitivity, and specificity were evaluated for validity of HD endoscopy in diagnosing IM. Interobserver agreement among the experienced endoscopists was "poor" (κ = 0.38) and it was also "poor" (κ = 0.33) among the inexperienced endoscopists. The diagnostic accuracy of the experienced endoscopists was superior to that of the inexperienced endoscopists (P = 0.003). Since diagnosis through visual inspection is unreliable in the diagnosis of IM, all suspicious areas for gastric IM should be considered to be biopsied. Furthermore, endoscopic experience and education are needed to raise the diagnostic accuracy of gastric IM.
Wiland, Homer O; Procop, Gary W; Goldblum, John R; Tuohy, Marion; Rybicki, Lisa; Patil, Deepa T
2013-06-01
Polymerase chain reaction (PCR)-based assays using stool samples are currently the most effective method of detecting Clostridium difficile. This study examines the feasibility of this assay using mucosal biopsy samples and evaluates the interobserver reproducibility in diagnosing and distinguishing ischemic colitis from C difficile colitis. Thirty-eight biopsy specimens were reviewed and classified by 3 observers into C difficile and ischemic colitis. The findings were correlated with clinical data. PCR was performed on 34 cases using BD GeneOhm C difficile assay. The histologic interobserver agreement was excellent (κ = 0.86) and the agreement between histologic and clinical diagnosis was good (κ = 0.84). All 19 ischemic colitis cases tested negative (100% specificity) and 3 of 15 cases of C difficile colitis tested positive (20% sensitivity). C difficile colitis can be reliably distinguished from ischemic colitis using histologic criteria. The C difficile PCR test on endoscopic biopsy specimens has excellent specificity but limited sensitivity.
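The specificity and sensitivity quoted in this abstract follow directly from the stated counts. As a small worked check, assuming the 19 ischemic colitis cases are the disease-negative group and the 15 C difficile colitis cases are the disease-positive group (function names are illustrative):

```python
def sensitivity(true_pos, false_neg):
    """Share of disease-positive cases that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of disease-negative cases that the test reports as negative."""
    return true_neg / (true_neg + false_pos)


# Counts from the abstract: 3 of 15 C difficile cases tested positive,
# and all 19 ischemic colitis cases tested negative.
print(f"sensitivity = {sensitivity(3, 15 - 3):.0%}")  # 20%
print(f"specificity = {specificity(19, 0):.0%}")      # 100%
```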
Rønjom, Marianne F; Brink, Carsten; Lorenzen, Ebbe L; Hegedüs, Laszlo; Johansen, Jørgen
2015-01-01
To examine the variations of risk-estimates of radiation-induced hypothyroidism (HT) from our previously developed normal tissue complication probability (NTCP) model in patients with head and neck squamous cell carcinoma (HNSCC) in relation to variability of delineation of the thyroid gland. In a previous study for development of an NTCP model for HT, the thyroid gland was delineated in 246 treatment plans of patients with HNSCC. Fifty of these plans were randomly chosen for re-delineation for a study of the intra- and inter-observer variability of thyroid volume, Dmean and estimated risk of HT. Bland-Altman plots were used for assessment of the systematic (mean) and random [standard deviation (SD)] variability of the three parameters, and a method for displaying the spatial variation in delineation differences was developed. Intra-observer variability resulted in a mean difference in thyroid volume and Dmean of 0.4 cm(3) (SD ± 1.6) and -0.5 Gy (SD ± 1.0), respectively, and 0.3 cm(3) (SD ± 1.8) and 0.0 Gy (SD ± 1.3) for inter-observer variability. The corresponding mean differences of NTCP values for radiation-induced HT due to intra- and inter-observer variations were insignificantly small, -0.4% (SD ± 6.0) and -0.7% (SD ± 4.8), respectively, but as the SDs show, for some patients the difference in estimated NTCP was large. For the entire study population, the variation in predicted risk of radiation-induced HT in head and neck cancer was small and our NTCP model was robust against observer variations in delineation of the thyroid gland. However, for the individual patient, there may be large differences in estimated risk which calls for precise delineation of the thyroid gland to obtain correct dose and NTCP estimates for optimized treatment planning in the individual patient.
Automatic delineation of tumor volumes by co-segmentation of combined PET/MR data
NASA Astrophysics Data System (ADS)
Leibfarth, S.; Eckert, F.; Welz, S.; Siegel, C.; Schmidt, H.; Schwenzer, N.; Zips, D.; Thorwarth, D.
2015-07-01
Combined PET/MRI may be highly beneficial for radiotherapy treatment planning in terms of tumor delineation and characterization. To standardize tumor volume delineation, an automatic algorithm for the co-segmentation of head and neck (HN) tumors based on PET/MR data was developed. Ten HN patient datasets acquired in a combined PET/MR system were available for this study. The proposed algorithm uses both the anatomical T2-weighted MR and FDG-PET data. For both imaging modalities tumor probability maps were derived, assigning each voxel a probability of being cancerous based on its signal intensity. A combination of these maps was subsequently segmented using a threshold level set algorithm. To validate the method, tumor delineations from three radiation oncologists were available. Inter-observer variabilities and variabilities between the algorithm and each observer were quantified by means of the Dice similarity index and a distance measure. Inter-observer variabilities and variabilities between observers and algorithm were found to be comparable, suggesting that the proposed algorithm is adequate for PET/MR co-segmentation. Moreover, taking into account combined PET/MR data resulted in more consistent tumor delineations compared to MR information only.
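The Dice similarity index used here to compare delineations is twice the overlap of two segmentations divided by the sum of their sizes. A minimal sketch, assuming each delineation is represented as a collection of voxel coordinates (the representation and names are illustrative and not the authors' implementation):

```python
def dice_index(voxels_a, voxels_b):
    """Dice similarity index of two segmentations given as voxel coordinates.

    Returns a value in [0, 1]; 1 means identical segmentations.
    Two empty segmentations are treated as identical by convention.
    """
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical 2D example: two contours that share one of two voxels each.
observer = [(0, 0), (0, 1)]
algorithm = [(0, 1), (1, 1)]
print(dice_index(observer, algorithm))  # 0.5
```

Computing the index for every observer pair and for every observer-algorithm pair gives the two sets of variabilities the abstract compares.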
Peng, Liqing; Yu, Jianqun; Li, Zhenlin; Li, Wanjiang; Cheng, Wei
2016-10-01
The purpose of this study was to explore the feasibility of dual-source computed tomography (DSCT) high-pitch scan mode in the preoperative evaluation of severe aortic stenosis (AS) referred to transcatheter aortic valve implantation (TAVI). Thirty patients with severe AS referred for TAVI underwent cervico-femoral artery joint DSCT angiography. Measurement and calculation of contrast, contrast noise ratio (CNR) and noise of aorta and access vessels were performed. The intra- and inter-observer reproducibilities for assessing aortic root and access vessels were evaluated. Evaluation of shape and plaques of aorta and access vessels was performed. The contrast, CNR and noise of aorta and access vessels were 348.2~457.9 HU, 12.2~30.3 HU and 19.1~48.1 HU, respectively. There were good intra- and inter-observer reproducibilities in assessing aortic root and access vessels by DSCT (mean difference: -0.73~0.79 mm, r = 0.90~0.98, P < 0.001; mean difference: -0.70~0.73 mm, r = 0.90~0.96, P < 0.001). In the 30 patients, the diameters of external iliac artery, femoral artery or subclavian artery were less than 7 mm in 5 cases (16.7%), marked calcification was found in bilateral common iliac arteries in 1 case (3.3%), and marked soft plaque was found in left common iliac artery in 1 case (3.3%). DSCT high-pitch scan mode was feasible in the preoperative evaluation of aorta and access vessels in patients with AS referred for TAVI.
Kim, Ko Eun; Oh, Sohee; Jeoung, Jin Wook; Suh, Min Hee; Seo, Je Hyun; Kim, Martha; Park, Ki Ho; Kim, Dong Myung; Kim, Seok Hwan
2016-11-01
To investigate the additive role of spectral-domain optical coherence tomography (SDOCT) in the structural diagnosis in glaucoma. Reliability and validity analysis. Structural examinations from 109 eyes of 109 healthy individuals and 151 eyes of 151 glaucoma patients with different severities were included. Four structural-diagnostic examination sets were prepared using stereo-optic disc photography (SDP), red-free retinal nerve fiber layer photography (RNFLP), and SDOCT: (1) SDP (S), (2) SDP and SDOCT (SO), (3) SDP and RNFLP (SR), and (4) SDP, RNFLP, and SDOCT (SRO). Five glaucoma specialists were instructed to classify subjects as normal or glaucoma using each of the 4 diagnostic sets in the order S, SO, SR, and SRO, with a 1-month interval. The interobserver agreement was evaluated using kappa (κ) statistics. The additive effect of SDOCT on the diagnostic performance of the specialists was evaluated using the generalized estimating equation. Five glaucoma specialists showed an excellent level of interobserver agreement on the diagnostic assessments based on the 4 sets. In the comparison of the collective diagnostic performance of the specialists, addition of SDOCT to SDP showed an approximately 2-fold significant increase in the diagnostic accuracy. Adding SDOCT to SDP significantly enhanced the specialists' structural-diagnostic ability with respect to the moderate glaucoma, though not mild or advanced glaucoma. SDOCT significantly enhanced the diagnostic accuracy of the glaucoma specialists' performance, showing its additive diagnostic value in judging glaucomatous structural damage, especially in the moderate stage of glaucoma. Copyright © 2016 Elsevier Inc. All rights reserved.
Kitzing, Yu Xuan; Ng, Bernard H K; Kitzing, Bjoern; Waugh, Richard; Kench, James G; Strasser, Simone I; McCormack, Samuel
2015-12-01
Washout is an important diagnostic imaging feature of hepatocellular carcinoma (HCC) on computed tomography (CT). The primary aim of this study is to evaluate the prevalence and the interobserver variation in the detection of portal venous phase (PVP) washout of HCCs using CT in a transplant population. The secondary aim is to evaluate factors influencing the detection of PVP washout. Forty-five patients who underwent CT liver imaging within the 60 days before transplantation had viable HCCs confirmed on pathology. Two radiologists retrospectively reviewed the images for HCCs including features of arterial enhancement and PVP washout. Clinical data, peak kilovoltage, imaging features of portal hypertension, region of interest attenuation measurements of the individual lesions, background liver parenchyma and portal vein were obtained. Liver to lesion attenuation ratio was also calculated. Statistical analysis was performed. The two readers identified 50 arterially enhancing HCCs in 45 patients. In consensus, the two readers identified washout in 60% of the HCCs with a substantial interobserver agreement (kappa = 0.633). PVP washout was associated with larger lesion size, increased background liver parenchyma attenuation, increased liver to lesion attenuation ratio, increased portal vein attenuation and hepatitis B viral status (P = 0.027, 0.008, 0.014, 0.017 and 0.037 respectively). In our transplant population, portal venous phase washout was seen in 60% of the hypervascular HCCs. Factors influencing the presence of PVP washout include lesion size as well as the liver and portal vein attenuation reflective of the portal haemodynamics. © 2015 The Royal Australian and New Zealand College of Radiologists.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dugas, Alexandre; Therasse, Eric; Kauffmann, Claude
2012-08-15
Purpose: To compare different methods measuring abdominal aortic aneurysm (AAA) maximal diameter (Dmax) and its progression on multidetector computed tomography (MDCT) scan. Materials and Methods: Forty AAA patients with two MDCT scans acquired at different times (baseline and follow-up) were included. Three observers measured AAA diameters by seven different methods: on axial images (anteroposterior, transverse, maximal, and short-axis views) and on multiplanar reformation (MPR) images (coronal, sagittal, and orthogonal views). Diameter measurement and progression were compared over time for the seven methods. Reproducibility of measurement methods was assessed by intraclass correlation coefficient (ICC) and Bland-Altman analysis. Results: Dmax, as measured on axial slices at baseline and follow-up (FU) MDCTs, was greater than that measured using the orthogonal method (p = 0.046 for baseline and 0.028 for FU), whereas Dmax measured with the orthogonal method was greater than those using all other measurement methods (p-value range: <0.0001-0.03) except anteroposterior diameter (p = 0.18 baseline and 0.10 FU). The greatest interobserver ICCs were obtained for the orthogonal and transverse methods (0.972) at baseline and for the orthogonal and sagittal MPR images at FU (0.973 and 0.977). Interobserver ICC of the orthogonal method to document AAA progression was greater (ICC = 0.833) than measurements taken on axial images (ICC = 0.662-0.780) and single-plane MPR images (0.772-0.817). Conclusion: AAA Dmax measured on MDCT axial slices overestimates aneurysm size. Diameter as measured by the orthogonal method is more reproducible, especially to document AAA progression.
Forensic postmortem computed tomography: volumetric measurement of the heart and liver.
Jakobsen, Lykke Schrøder; Lundemose, Sissel; Banner, Jytte; Lynnerup, Niels; Jacobsen, Christina
2016-12-01
The purpose of this study was to investigate the utility of postmortem computed tomography (PMCT) images in estimating organ sizes and to examine the use of the cardiothoracic ratio (CTR). We included 45 individuals (19 females), who underwent a medico-legal autopsy. Using the computer software program Mimics®, we determined in situ heart and liver volumes derived from linear measurements (width, height and depth) on a whole body PMCT-scan, and compared the volumes with ex vivo volumes derived by CT-scan of the eviscerated heart and liver. The ex vivo volumes were also compared with the organ weights. Further, we compared the CTR with the ex vivo heart volume and a heart weight-ratio (HWR). Intra- and inter-observer analyses were performed. We found no correlation between the in situ and ex vivo volumes of the heart and liver. However, a highly significant correlation was found between the ex vivo volumes and weights of the heart and liver. No correlation was found between the CTR and either the ex vivo heart volume or the HWR. Concerning cardiomegaly, we found no agreement between the CTR and HWR. The intra- and inter-observer analyses showed no significant differences. Noninvasive in situ PMCT methods for organ measuring, as performed in this study, are not useful tools in forensic pathology. The best method to estimate organ volume is a CT-scan of the eviscerated organ. PMCT-determined CTR seems to be useless for ascertaining cardiomegaly, as it correlated neither with the ex vivo heart volume nor with the HWR.
Interobserver agreement on histopathological lesions in class III or IV lupus nephritis.
Wilhelmus, Suzanne; Cook, H Terence; Noël, Laure-Hélène; Ferrario, Franco; Wolterbeek, Ron; Bruijn, Jan A; Bajema, Ingeborg M
2015-01-07
To treat lupus nephritis effectively, proper identification of the histologic class is essential. Although the classification system for lupus nephritis is nearly 40 years old, remarkably few studies have investigated interobserver agreement. Interobserver agreement among nephropathologists was studied, particularly with respect to the recognition of class III/IV lupus nephritis lesions, and possible causes of disagreement were determined. A link to a survey containing pictures of 30 glomeruli was provided to all 360 members of the Renal Pathology Society; 34 responses were received from 12 countries (a response rate of 9.4%). The nephropathologist was asked whether glomerular lesions were present that would categorize the biopsy as class III/IV. If so, additional parameters were scored. To determine the interobserver agreement among the participants, κ or intraclass correlation values were calculated. The intraclass correlation or κ-value was also calculated for two separate levels of experience (specifically, nephropathologists who were new to the field or moderately experienced [less experienced] and nephropathologists who were highly experienced). Intraclass correlation for the presence of a class III/IV lesion was 0.39 (poor). The κ/intraclass correlation values for the additional parameters were as follows: active, chronic, or both: 0.36; segmental versus global: 0.39; endocapillary proliferation: 0.46; influx of inflammatory cells: 0.32; swelling of endothelial cells: 0.46; extracapillary proliferation: 0.57; type of crescent: 0.46; and wire loops: 0.35. The highly experienced nephropathologists had significantly less interobserver variability compared with the less experienced nephropathologists (P=0.004). There is generally poor agreement in terms of recognizing class III/IV lesions. Because experience clearly increases interobserver agreement, this agreement may be improved by training nephropathologists. 
These results also underscore the importance of a central review by experienced nephropathologists in clinical trials. Copyright © 2015 by the American Society of Nephrology.
The ICI classification for calcaneal injuries: a validation study.
Frima, Herman; Eshuis, Rienk; Mulder, Paul; Leenen, Luke
2012-06-01
The integral classification of injuries (ICI) by Zwipp et al. has been developed as a classification system for injuries of the bones, joints, cartilage and ligaments of the foot. It follows the principles of the comprehensive classification of fractures by Müller et al. The ICI was developed for 'everyday use' and scientific purposes. Our aim was to perform a validation study for this classification system applied to calcaneal injuries. A panel of five experienced trauma and orthopaedic surgeons evaluated the ICI score in 20 calcaneal injuries. After 2 months, a second classification was performed in a different order. Inter- and intra-observer variability were evaluated by kappa statistics. Panel members were not able to evaluate capsule and ligamental injuries based on X-ray and computed tomography (CT) films. Two injuries were excluded for logistical reasons. The inter-observer agreement based on 18 injuries of bone and joints was slight; kappa 0.14 (90% confidence interval (CI): 0.05-0.22). The intra-observer agreement was fair; kappa 0.31 (90% CI: 0.22-0.41). Overall, the panel rated the system as very complicated and not practical. The ICI is a complicated classification system with slight to fair inter- and intra-observer agreement. It might not be a practical classification system for calcaneal injuries in 'everyday use' or for scientific purposes. Copyright © 2011 Elsevier Ltd. All rights reserved.
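The abstract above reports a single kappa for a panel of five raters without naming the variant. One common multi-rater choice is Fleiss' kappa; the sketch below assumes that variant, and the function name and the rating counts are hypothetical illustrations, not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters who put item i in category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total_ratings = n_items * n_raters
    # Share of all ratings that fell into each category.
    category_share = [
        sum(row[j] for row in counts) / total_ratings for j in range(n_categories)
    ]
    # Per-item agreement: fraction of rater pairs that chose the same category.
    item_agreement = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(item_agreement) / n_items
    expected = sum(p * p for p in category_share)
    if expected == 1.0:
        return 1.0  # every rating landed in a single category
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 4 injuries, 5 raters, 3 classification categories.
ratings = [[5, 0, 0], [2, 2, 1], [0, 5, 0], [1, 1, 3]]
kappa = fleiss_kappa(ratings)
print(round(kappa, 3))
```

Two of the four hypothetical injuries are rated unanimously, yet the chance-corrected value stays moderate because the split items pull the mean pairwise agreement down.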
Shade selection performed by novice dental professionals and colorimeter.
Klemetti, E; Matela, A-M; Haag, P; Kononen, M
2006-01-01
The objective of this study was to test inter-observer variability in shade selection for porcelain restorations, using three different shade guides: Vita Lumin Vacuum, Vita 3D-Master and Procera. Nineteen young dental professionals acted as observers. The results were also compared with those of a digital colorimeter (Shade Eye Ex; Shofu, Japan). Regarding repeatability, no significant differences were found between the three shade guides, although repeatability was relatively low (33-43%). Agreement with the colorimetric results was also low (8-34%). In conclusion, shade selection shows moderate to great inter-observer variation. In teaching and standardizing the shade selection procedure, a digital colorimeter may be a useful educational tool.
Koo, Hyun Jung; Yang, Dong Hyun; Kang, Joon-Won; Lee, Joo Yeon; Kim, Dae-Hee; Song, Jong-Min; Kang, Duk-Hyun; Song, Jae-Kwan; Kim, Joon Bum; Jung, Sung-Ho; Choo, Suk Jung; Chung, Cheol Hyun; Lee, Jae-Won; Lim, Tae-Hwan
2018-02-01
We aimed to compare imaging findings of infective endocarditis between computed tomography (CT) and transoesophageal echocardiography (TEE) using surgical inspection as a reference standard. Forty-nine patients (aged 54 ± 17 years, 69% men) who underwent pre-operative CT and TEE for infective endocarditis were included. Twelve of these patients had prosthetic valve endocarditis. Imaging findings of infective endocarditis were classified as vegetation, leaflet perforation, abscess/pseudoaneurysm, and paravalvular leakage. Diagnostic performances of CT and TEE were evaluated using surgical inspection as a reference standard. Interobserver agreements for CT findings were obtained using Cohen's κ test. The detection rates of infective endocarditis per patient with CT and TEE were 93.9% (46/49) and 95.9% (47/49), respectively. In per-imaging analysis, the sensitivities of CT and TEE were not significantly different for both native and prosthetic valve infective endocarditis (sensitivity: vegetation, 100% in TEE and 90.9% in CT; leaflet perforation, 87.5% in TEE and 50.0% in CT; abscess/pseudoaneurysm, 40.0% in TEE and 60.0% in CT; paravalvular leakage, 100% in TEE and 50.0% in CT). Interobserver agreements for CT findings were substantial or excellent (0.79-0.88). Cardiac CT can accurately demonstrate infective endocarditis in pre-operative patients with a similar diagnostic accuracy to TEE. The interobserver agreements for the CT findings of infective endocarditis were excellent. Published on behalf of the European Society of Cardiology. All rights reserved. © The Author 2017. For permissions, please email: journals.permissions@oup.com.
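For readers unfamiliar with the statistic named above, Cohen's κ compares the observed agreement between two readers with the agreement expected from their marginal rating frequencies. A minimal sketch follows; the function name and the reader data are hypothetical and do not come from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical readings: 1 = vegetation reported on CT, 0 = not reported.
reader_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3))
```

Here the readers agree on 8 of 10 cases, but because chance alone would produce 52% agreement with these marginals, κ is noticeably lower than the raw 80%.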
Koo, Henry; Leveridge, Mike; Thompson, Charles; Zdero, Rad; Bhandari, Mohit; Kreder, Hans J; Stephen, David; McKee, Michael D; Schemitsch, Emil H
2008-07-01
The purpose of this study was to measure interobserver reliability of 2 classification systems of pelvic ring fractures and to determine whether computed tomography (CT) improves reliability. The reliability of several radiographic findings was also tested. Thirty patients taken from a database at a Level I trauma facility were reviewed. For each patient, 3 radiographs (AP pelvis, inlet, and outlet) and CT scans were available. Six different reviewers (pelvic and acetabular specialist, orthopaedic traumatologist, or orthopaedic trainee) classified the injury according to Young-Burgess and Tile classification systems after reviewing plain radiographs and then after CT scans. The Kappa coefficient was used to determine interobserver reliability of these classification systems before and after CT scan. For plain radiographs, overall Kappa values for the Young-Burgess and Tile classification systems were 0.72 and 0.30, respectively. For CT scan and plain radiographs, the overall Kappa values for the Young-Burgess and Tile classification systems were 0.63 and 0.33, respectively. The pelvis/acetabular surgeons demonstrated the highest level of agreement using both classification systems. For individual questions, the addition of CT did significantly improve reviewer interpretation of fracture stability. The pre-CT and post-CT Kappa values for fracture stability were 0.59 and 0.93, respectively. The CT scan can improve the reliability of assessment of pelvic stability because of its ability to identify anatomical features of injury. The Young-Burgess system may be optimal for the learning surgeon. The Tile classification system is more beneficial for specialists in pelvic and acetabular surgery.
Kundu, S; Kuehnle, E; Schippert, C; von Ehr, J; Hillemanns, P; Staboulidou, Ismini
2017-11-01
The aim of this study was to analyze whether the umbilical artery pH value can be estimated throughout CTG assessment 60 min prior to delivery and if the estimated umbilical artery pH value correlates with the actual one. This includes analysis of correlation between CTG trace classification and actual umbilical artery pH value. Intra- and interobserver agreement and the impact of professional experience on visual analysis of fetal heart rate tracing were evaluated. This was a retrospective study. 300 CTG records of the last 60 min before delivery were picked randomly from the computer database with the following inclusion criteria: singleton pregnancy >37 weeks, no fetal anomalies, vaginal delivery either spontaneous or instrumental-assisted. Five obstetricians and two midwives of different professional experience classified 300 CTG traces according to the FIGO criteria and estimated the postnatal umbilical artery pH. The results showed a significant difference (p < 0.05) in estimated and actual pH value, independent of professional experience. Analysis and correlation of CTG assessment and actual umbilical artery pH value showed significantly (p < 0.05) diverging results. Intra- and interobserver variability was high. Intraobserver variability was significantly higher for the resident (p = 0.001). No significant differences were detected regarding interobserver variability. An estimation of the pH value and consequently of neonatal outcome on the basis of a present CTG seems to be difficult. Therefore, not only CTG training but also clinical experience and the collaboration and consultation within the whole team are important.
Sargos, P; Charleux, T; Haas, R L; Michot, A; Llacer, C; Moureau-Zabotto, L; Vogin, G; Le Péchoux, C; Verry, C; Ducassou, A; Delannes, M; Mervoyer, A; Wiazzane, N; Thariat, J; Sunyach, M P; Benchalal, M; Laredo, J D; Kind, M; Gillon, P; Kantor, G
2018-04-01
The purpose of this study was to evaluate, during a national workshop, the inter-observer variability in target volume delineation for primary extremity soft tissue sarcoma radiation therapy. Six expert sarcoma radiation oncologists (members of French Sarcoma Group) received two extremity soft tissue sarcoma radiation therapy cases: one preoperative and one postoperative. They were distributed with instructions for contouring gross tumour volume or reconstructed gross tumour volume, clinical target volume and to propose a planning target volume. The preoperative radiation therapy case was a patient with a grade 1 extraskeletal myxoid chondrosarcoma of the thigh. The postoperative case was a patient with a grade 3 pleomorphic undifferentiated sarcoma of the thigh. Contour agreement analysis was performed using kappa statistics. For the preoperative case, contouring agreement regarding gross tumour volume, clinical target volume and planning target volume was substantial (kappa between 0.68 and 0.77). In the postoperative case, the agreement was only fair for reconstructed gross tumour volume (kappa: 0.38) but moderate for clinical target volume and planning target volume (kappa: 0.42). During the workshop discussion, consensus was reached on most of the contour divergences, especially clinical target volume longitudinal extension. The determination of a limited cutaneous cover was also discussed. Accurate delineation of target volume appears to be a crucial element to ensure multicenter clinical trial quality assessment, reproducibility and homogeneity in delivering radiation therapy. A quality assessment process should be proposed in this setting. We have shown in our study that preoperative radiation therapy of extremity soft tissue sarcoma has less inter-observer contouring variability. Copyright © 2018 Société française de radiothérapie oncologique (SFRO). Published by Elsevier SAS. All rights reserved.
Odland, Audun; Server, Andres; Saxhaug, Cathrine; Breivik, Birger; Groote, Rasmus; Vardal, Jonas; Larsson, Christopher; Bjørnerud, Atle
2015-11-01
Volumetric magnetic resonance imaging (MRI) is now widely available and routinely used in the evaluation of high-grade gliomas (HGGs). Ideally, volumetric measurements should be included in this evaluation. However, manual tumor segmentation is time-consuming and suffers from inter-observer variability. Thus, tools for semi-automatic tumor segmentation are needed. To present a semi-automatic method (SAM) for segmentation of HGGs and to compare this method with manual segmentation performed by experts. The inter-observer variability among experts manually segmenting HGGs using volumetric MRIs was also examined. Twenty patients with HGGs were included. All patients underwent surgical resection prior to inclusion. Each patient underwent several MRI examinations during and after adjuvant chemoradiation therapy. Three experts performed manual segmentation. The results of tumor segmentation by the experts and by the SAM were compared using Dice coefficients and kappa statistics. A relatively close agreement was seen among two of the experts and the SAM, while the third expert disagreed considerably with the other experts and the SAM. An important reason for this disagreement was a different interpretation of contrast enhancement as either surgically-induced or glioma-induced. The time required for manual tumor segmentation was an average of 16 min per scan. Editing of the tumor masks produced by the SAM required an average of less than 2 min per sample. Manual segmentation of HGG is very time-consuming and using the SAM could increase the efficiency of this process. However, the accuracy of the SAM ultimately depends on the expert doing the editing. Our study confirmed a considerable inter-observer variability among experts defining tumor volume from volumetric MRIs. © The Foundation Acta Radiologica 2014.
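The Dice coefficient mentioned above measures the overlap of two segmentations as twice the shared volume divided by the sum of both volumes. A minimal sketch, assuming each mask is given as a collection of voxel coordinates; the function name and the toy masks are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two segmentations given as collections of voxel coordinates."""
    voxels_a, voxels_b = set(mask_a), set(mask_b)
    if not voxels_a and not voxels_b:
        return 1.0  # two empty masks are treated as identical
    overlap = len(voxels_a & voxels_b)
    return 2.0 * overlap / (len(voxels_a) + len(voxels_b))


# Hypothetical 2D masks: an expert outline and a semi-automatic result.
expert_mask = {(0, 0), (0, 1), (1, 0), (1, 1)}
sam_mask = {(0, 1), (1, 1), (1, 2)}
dice = dice_coefficient(expert_mask, sam_mask)
print(round(dice, 3))
```

A value of 1 means identical masks and 0 means no shared voxels; the toy masks share 2 voxels out of 4 and 3, giving 4/7.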
Lambron, Julien; Rakotonjanahary, Josué; Loisel, Didier; Frampas, Eric; De Carli, Emilie; Delion, Matthieu; Rialland, Xavier; Toulgoat, Frédérique
2016-02-01
Magnetic resonance (MR) images from children with optic pathway glioma (OPG) are complex. We initiated this study to evaluate the accuracy of MR imaging (MRI) interpretation and to propose a simple and reproducible imaging classification for MRI. We randomly selected 140 MRIs from among 510 MRIs performed on 104 children diagnosed with OPG in France from 1990 to 2004. These images were reviewed independently by three radiologists (F.T., 15 years of experience in neuroradiology; D.L., 25 years of experience in pediatric radiology; and J.L., 3 years of experience in radiology) using a classification derived from the Dodge and modified Dodge classifications. Intra- and interobserver reliabilities were assessed using the Bland-Altman method and the kappa coefficient. These reviews allowed the definition of reliable criteria for MRI interpretation. The reviews showed intraobserver variability and large discrepancies among the three radiologists (kappa coefficient varying from 0.11 to 1). These variabilities were too large for the interpretation to be considered reproducible over time or among observers. A consensual analysis, taking into account all observed variabilities, allowed the development of a definitive interpretation protocol. Using this revised protocol, we observed consistent intra- and interobserver results (kappa coefficient varying from 0.56 to 1). The mean interobserver difference for the solid portion of the tumor with contrast enhancement was 0.8 cm(3) (limits of agreement = -16 to 17). We propose simple and precise rules for improving the accuracy and reliability of MRI interpretation for children with OPG. Further studies will be necessary to investigate the possible prognostic value of this approach.
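The Bland-Altman method cited above summarises paired measurements by their mean difference (bias) and limits of agreement. The sketch below assumes the conventional limits of bias ± 1.96 standard deviations of the differences; the abstract does not state which multiplier was used, and the volumes shown are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(measure_a, measure_b, multiplier=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(measure_a) != len(measure_b) or len(measure_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(measure_a, measure_b)]
    bias = mean(differences)
    spread = stdev(differences)  # sample standard deviation of the differences
    return bias, bias - multiplier * spread, bias + multiplier * spread


# Hypothetical tumour volumes (cm^3) measured by two observers.
observer_1 = [10.0, 12.0, 8.0, 15.0]
observer_2 = [9.0, 13.0, 8.0, 13.0]
bias, lower, upper = bland_altman(observer_1, observer_2)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

A small bias with wide limits, as in the reported 0.8 cm(3) with limits of -16 to 17, indicates little systematic offset but large case-by-case scatter.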
Cunningham, Jane; Sharma, Richa; Kirzner, Anna; Hwang, Sinchun; Lefkowitz, Robert; Greenspan, Daniel; Shapoval, Anton; Panicek, David M.
2016-01-01
Objective To determine etiologies of myonecrosis in oncology patients and to assess interobserver variability in interpreting its MRI features. Materials and Methods Pathology records in our tertiary cancer hospital were searched for proven myonecrosis, and MRIs of affected regions in those patients were identified. MRI reports that suggested myonecrosis also were identified. Each MRI was reviewed independently by two of six readers to assess anatomic site, size, and signal intensities of muscle changes, and presence of the previously reported stipple sign (enhancing foci within a region defined by rim enhancement). The stipple sign was assessed again, weeks after a training session. Cohen kappa and percent agreement were calculated. Medical records were reviewed for contemporaneous causes of myonecrosis. Results MRI reports in 73 patients suggested the diagnosis of myonecrosis; pathologic proof was available in another two. Myonecrosis was frequently associated with radiotherapy (n=34 [45%] patients); less frequent causes included intraoperative immobilization, trauma, therapeutic embolization, ablation therapy, exercise, and diabetes. Myonecrosis usually involved lower extremity, pelvis, and upper extremity; mean size was 13.0 cm. Stipple sign was observed in 55–95% of patients at first assessment (k=0.09–0.42; 60–80% agreement) and 55–100% at second (k=0.0–0.58; 72–90% agreement). Enhancement surrounded myonecrosis in 55–100% of patients (k=0.03–0.32; 58–70% agreement). Conclusion Myonecrosis in oncology patients usually occurred after radiotherapy, and less commonly after intraoperative immobilization, trauma, therapeutic embolization, ablation therapy, exercise, or diabetes. Although interobserver variability for MRI features of myonecrosis exists (even after focused training), a combination of findings facilitates diagnosis and conservative management. PMID:27105618
Padmanabhan, Vijayalakshmi; Marshall, Carrie B; Akdas Barkan, Guliz; Ghofrani, Mohiedean; Laser, Alice; Tolgay Ocal, Idris; David Sturgis, Charles; Souers, Rhona; Kurtycz, Daniel F I
2017-05-01
The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) offers a six-tiered diagnostic scheme for thyroid Fine Needle Aspiration (FNA): Benign, Atypia of Undetermined Significance/Follicular Lesion of Undetermined Significance (AUS/FLUS), suspicious for follicular neoplasm, suspicious for malignancy, malignant, and unsatisfactory with an aim to standardize diagnostic criteria. Reported rate of AUS/FLUS category in the literature has varied from 3% to 20.5%. The aim of this study was to assess interobserver variability among cytopathologists to assess reproducibility of the AUS/FLUS category. Seven cytopathologists brought FNA cases (a mixture of atypical and non-atypical FNA diagnosis) diagnosed using TBSRTC from their respective institutions which were reviewed and diagnosed by the participants. The analysis assessed interobserver variability among 7 cytopathologists and determined characteristics on the slides which were associated with concordance to the institutional diagnosis. Seventy-eight of 125 (62.4%) benign cases were classified as benign by the reviewers and 26 (21%) were called AUS/FLUS on review. A third of the AUS/FLUS cases were called benign on review and 28.2% were classified as suspicious for neoplasia/malignancy. Roughly a third each of the suspicious for follicular neoplasm/suspicious for malignancy cases were classified as AUS/FLUS. When pathologists from different institutions shared their slides, concordance was high for specimens with adequate cellularity and those that were clearly benign but thresholds varied for the other indeterminate categories. Most definite categorization of the AUS/FLUS category was seen on review. Diagn. Cytopathol. 2017;45:399-405. © 2017 Wiley Periodicals, Inc.
Sieslack, Anne K; Dziallas, Peter; Nolte, Ingo; Wefstaedt, Patrick; Hungerbühler, Stephan O
2014-10-12
Right ventricular (RV) volume and function are important diagnostic and prognostic factors in dogs with primary or secondary right-sided heart failure. The complex shape of the right ventricle and its retrosternal position make the quantification of its volume difficult. For that reason, only a few studies exist that deal with the determination of RV volume parameters. In human medicine cardiac magnetic resonance imaging (CMRI) is considered to be the reference technique for RV volumetric measurement (Nat Rev Cardiol 7(10):551-563, 2010), but cardiac computed tomography (CCT) and three-dimensional echocardiography (3DE) are other non-invasive methods feasible for RV volume quantification. The purpose of this study was the comparison of 3DE and CCT with CMRI, the gold standard for RV volumetric quantification. 3DE showed significantly lower and CCT significantly higher right ventricular volumes than CMRI. Both techniques showed very good correlations (R > 0.8) with CMRI for the volumetric parameters end-diastolic volume (EDV) and end-systolic volume (ESV). Ejection fraction (EF) and stroke volume (SV) were not different when considering CCT and CMRI, whereas 3DE showed a significantly higher EF and lower SV than CMRI. The 3DE values showed excellent intra-observer variability (<3%) and still acceptable inter-observer variability (<13%). CCT provides an accurate image quality of the right ventricle with comparable results to the reference method CMRI. CCT overestimates the RV volumes; therefore, it is not an interchangeable method, and it has the additional disadvantage of needing general anaesthesia. 3DE underestimated the RV volumes, which could be explained by the worse image resolution. The excellent correlation between the methods indicates a close relationship between 3DE and CMRI, although the two are not directly comparable.
3DE is a promising technique for RV volumetric quantification, but further studies in awake dogs and dogs with heart disease are necessary to evaluate its usefulness in veterinary cardiology.
O'Donohue, John; Ng, Chaan; Catnach, Susan; Farrant, Patricia; Williams, Roger
2004-02-01
To investigate the clinical utility and the intra-observer and inter-observer variability of Doppler ultrasound assessment of the hepatic and portal vessels along with measurement of spleen size in the diagnosis of chronic liver disease and cirrhosis. Ultrasound measurements of portal vein diameter (PVD), portal vein velocity (PVV), hepatic arterial resistance index (HARI), hepatic vein profile (HVP), and spleen size were obtained in 49 controls and 45 patients with liver disease (23 with primary biliary cirrhosis, 22 with hepatitis C) by two experienced observers, who each performed three blinded measurements of each variable. Control values were derived from normal hospital workers. Percutaneous liver biopsies in 41 of the patients showed cirrhosis (14 patients), moderate/severe fibrosis (13 patients), and early disease (14 patients). Seventy-one percent of cirrhotic patients had splenomegaly (> 13.6 cm). The spleen size was significantly larger in cirrhotics (16.0 cm) than in non-cirrhotics (13.0 cm, P < 0.009) and healthy controls (10.7 cm, P < 0.00005), and was the only independent predictor of cirrhosis, with a threshold of 15 cm predicting cirrhosis with a specificity of 98%, positive predictive value of 93%, sensitivity of 57% and negative predictive value of 80%. HVP was abnormal in 76.9% of cirrhotics, 57.7% of non-cirrhotics and 2.1% of controls (P < 0.04). However, the mean PVV, PVD and HARI were no different between controls and patients or between cirrhotic and non-cirrhotic liver disease. There was significant inter-observer variability for PVV, but intra-observer and inter-observer variability was acceptable for the other measurements. Spleen size and abnormal HVP are useful predictors of chronic liver disease and cirrhosis, and both can be measured reliably and reproducibly. However, Doppler measurements of PVV, PVD and HARI are not useful in distinguishing patients with chronic liver disease from normal controls.
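The four accuracy figures quoted for the 15 cm spleen threshold all derive from one 2x2 table of test result against biopsy result. The sketch below shows the arithmetic; the function name is hypothetical and the counts are invented for illustration, since the abstract does not give the underlying table.

```python
def diagnostic_metrics(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity, specificity, PPV and NPV from the cells of a 2x2 table."""
    return {
        # Proportion of diseased patients the test flags.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Proportion of non-diseased patients the test clears.
        "specificity": true_neg / (true_neg + false_pos),
        # Proportion of positive tests that are truly diseased.
        "ppv": true_pos / (true_pos + false_pos),
        # Proportion of negative tests that are truly non-diseased.
        "npv": true_neg / (true_neg + false_neg),
    }


# Hypothetical counts, not taken from the study.
metrics = diagnostic_metrics(true_pos=40, false_pos=5, false_neg=10, true_neg=45)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

A high threshold typically trades sensitivity for specificity, which is the pattern the abstract reports (57% sensitivity, 98% specificity).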
Vorselaars, V M M; Velthuis, S; Huitema, M P; Hosman, A E; Westermann, C J J; Snijder, R J; Mager, J J; Post, M C
2018-04-01
Transthoracic contrast echocardiography (TTCE) is recommended for screening of pulmonary arteriovenous malformations (PAVMs) in hereditary haemorrhagic telangiectasia. Shunt quantification is used to find treatable PAVMs. So far, there has been no study investigating the reproducibility of this diagnostic test. Therefore, this study aimed to describe inter-observer and inter-injection variability of TTCE. We conducted a prospective single centre study. We included all consecutive persons screened for presence of PAVMs in association with hereditary haemorrhagic telangiectasia in 2015. The videos of two contrast injections per patient were divided and reviewed by two cardiologists blinded for patient data. Pulmonary right-to-left shunts were graded using a three-grade scale. Inter-observer and inter-injection agreement was calculated with κ statistics for the presence and grade of pulmonary right-to-left shunts. We included 107 persons (accounting for 214 injections) (49.5% male, mean age 45.0 ± 16.6 years). A pulmonary right-to-left shunt was present in 136 (63.6%) and 131 (61.2%) injections for observer 1 and 2, respectively. Inter-injection agreement for the presence of pulmonary right-to-left shunts was 0.96 (95% confidence interval (CI) 0.9-1.0) and 0.98 (95% CI 0.94-1.00) for observer 1 and 2, respectively. Inter-injection agreement for pulmonary right-to-left shunt grade was 0.96 (95% CI 0.93-0.99) and 0.95 (95% CI 0.92-0.98) respectively. There was disagreement in right-to-left shunt grade between the contrast injections in 11 patients (10.3%). Inter-observer agreement for presence and grade of the pulmonary right-to-left shunt was 0.95 (95% CI 0.91-0.99) and 0.97 (95% CI 0.95-0.99) respectively. TTCE has an excellent inter-injection and inter-observer agreement for both the presence and grade of pulmonary right-to-left shunts.
Accuracy of the Interpretation of Chest Radiographs for the Diagnosis of Paediatric Pneumonia
Elemraid, Mohamed A.; Muller, Michelle; Spencer, David A.; Rushton, Stephen P.; Gorton, Russell; Thomas, Matthew F.; Eastham, Katherine M.; Hampton, Fiona; Gennery, Andrew R.; Clark, Julia E.
2014-01-01
Introduction World Health Organization (WHO) radiological classification remains an important entry criterion in epidemiological studies of pneumonia in children. We report inter-observer variability in the interpretation of 169 chest radiographs in children suspected of having pneumonia. Methods An 18-month prospective aetiological study of pneumonia was undertaken in Northern England. Chest radiographs were performed on eligible children aged ≤16 years with clinical features of pneumonia. The initial radiology report was compared with a subsequent assessment by a consultant cardiothoracic radiologist. Chest radiographic changes were categorised according to the WHO classification. Results There was significant disagreement (22%) between the first and second reports (kappa = 0.70, P<0.001), notably in those aged <5 years (26%, kappa = 0.66, P<0.001). The most frequent sources of disagreement were the reporting of patchy and perihilar changes. Conclusion This substantial inter-observer variability highlights the need for experts from different countries to create a consensus to review the radiological definition of pneumonia in children. PMID:25148361
A Statistical Analysis of Reviewer Agreement and Bias in Evaluating Medical Abstracts
Cicchetti, Domenic V.; Conn, Harold O.
1976-01-01
Observer variability affects virtually all aspects of clinical medicine and investigation. One important aspect, not previously examined, is the selection of abstracts for presentation at national medical meetings. In the present study, 109 abstracts, submitted to the American Association for the Study of Liver Disease, were evaluated by three “blind” reviewers for originality, design-execution, importance, and overall scientific merit. Of the 77 abstracts rated for all parameters by all observers, interobserver agreement ranged between 81 and 88%. However, corresponding intraclass correlations varied between 0.16 (approaching statistical significance) and 0.37 (p < 0.01). Specific tests of systematic differences in scoring revealed statistically significant levels of observer bias on most of the abstract components. Moreover, the mean differences in interobserver ratings were quite small compared to the standard deviations of these differences. These results emphasize the importance of evaluating the simple percentage of rater agreement within the broader context of observer variability and systematic bias. PMID:997596
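The contrast drawn above, high percentage agreement alongside low intraclass correlation, arises when raters' scores are close but the rated items hardly differ from one another. The sketch below assumes the one-way random-effects form, ICC(1,1); the abstract does not say which ICC form was used, and both score tables are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1); ratings[i] holds all scores for subject i."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_subjects < 2 or n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("need >= 2 subjects, each with the same number (>= 2) of scores")
    grand_mean = sum(sum(row) for row in ratings) / (n_subjects * n_raters)
    row_means = [sum(row) / n_raters for row in ratings]
    # Mean square between subjects.
    ms_between = (
        n_raters * sum((m - grand_mean) ** 2 for m in row_means) / (n_subjects - 1)
    )
    # Mean square within subjects (rater disagreement about the same subject).
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n_subjects * (n_raters - 1))
    denominator = ms_between + (n_raters - 1) * ms_within
    if denominator == 0:
        raise ValueError("all scores are identical; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical scores from three reviewers; abstracts differ clearly in merit.
spread_scores = [[2, 3, 2], [4, 4, 5], [1, 2, 1], [5, 4, 5]]
# Hypothetical scores that never differ by more than 1 point but barely
# distinguish one abstract from another.
narrow_scores = [[4, 4, 5], [4, 5, 4], [5, 4, 4], [4, 4, 4], [5, 5, 4]]
icc_spread = icc_oneway(spread_scores)
icc_narrow = icc_oneway(narrow_scores)
print(round(icc_spread, 2), round(icc_narrow, 2))
```

In both tables the reviewers are within one point of each other on every abstract, yet the second table yields an ICC near or below zero because there is almost no between-abstract variance for the raters to agree on.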
Tewes, S; Rodt, T; Marquardt, S; Evangelidou, E; Wacker, F K; von Falck, C
2013-11-01
Evaluation of the potential usability of an iPad 3 with a high-resolution display in CT emergency diagnosis compared to a 3D PACS workstation. 3 readers used a 5-point Likert scale to evaluate 40 CCT scans and 40 CTPA scans to determine the detectability of early signs of infarction in CCT or segmental and subsegmental pulmonary embolisms in CT angiography of the pulmonary arteries (CTPA) on the iPad 3 (Apple Inc., USA) using an application for image viewing (Visage Ease, Visage Imaging GmbH, Berlin) and on a 3D PACS workstation (Visage 7.1, Visage Imaging, Berlin) using a certified monitor for image viewing. The results were compared using the Wilcoxon rank sum test, Spearman's correlation coefficient, and a kappa statistic. There was no significant difference in the median evaluations for the readings of both the CCT scans and the CTPA scans on the iPad 3 and on the workstation (p > 0.05) for all three readers. The mean Spearman's correlation coefficient for CCT and CTPA was 0.46 (± 0.2) and 0.69 (± 0.16), respectively, for the comparison iPad/PACS, 0.41 (± 0.16) and 0.68 (± 0.06), respectively, for the interobserver agreement on the iPad, and 0.35 (± 0.05) and 0.68 (± 0.10), respectively, for the interobserver agreement on the PACS. Mean kappa values for CCT of 0.52 (± 0.17) for the comparison iPad/PACS and 0.33 (± 0.16) and 0.32 (± 0.16), respectively, for the interobserver agreement on the iPad and the PACS were achieved. For CTPA average kappa values of 0.67 (± 0.19) were calculated for the comparison iPad/PACS and 0.69 (± 0.08) and 0.60 (± 0.14), respectively, for the interobserver concordance on the iPad 3 and the PACS. None of the differences were statistically significant (p > 0.05). The variability of the interpretation of typical emergency scans on an iPad 3 with a high-resolution display and on a 3D PACS workstation does not differ from the interobserver variability. © Georg Thieme Verlag KG Stuttgart · New York.
NASA Astrophysics Data System (ADS)
Diffey, Jenny; Berks, Michael; Hufton, Alan; Chung, Camilla; Verow, Rosanne; Morrison, Joanna; Wilson, Mary; Boggis, Caroline; Morris, Julie; Maxwell, Anthony; Astley, Susan
2010-04-01
Breast density is positively linked to the risk of developing breast cancer. We have developed a semi-automated, stepwedge-based method that has been applied to the mammograms of 1,289 women in the UK breast screening programme to measure breast density by volume and area. 116 images were analysed by three independent operators to assess inter-observer variability; 24 of these were analysed on 10 separate occasions by the same operator to determine intra-observer variability. 168 separate images were analysed using the stepwedge method and by two radiologists who independently estimated percentage breast density by area. There was little intra-observer variability in the stepwedge method (average coefficients of variation 3.49% - 5.73%). There were significant differences in the volumes of glandular tissue obtained by the three operators. This was attributed to variations in the operators' definition of the breast edge. For fatty and dense breasts, there was good correlation between breast density assessed by the stepwedge method and the radiologists. This was also observed between radiologists, despite significant inter-observer variation. Based on analysis of thresholds used in the stepwedge method, radiologists' definition of a dense pixel is one in which the percentage of glandular tissue is between 10 and 20% of the total thickness of tissue.
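The intra-observer figures above are coefficients of variation, that is, the standard deviation of repeated measurements expressed as a percentage of their mean. A minimal sketch, assuming the sample standard deviation is used; the function name and the repeated readings are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(measurements):
    """Sample standard deviation as a percentage of the mean."""
    if len(measurements) < 2:
        raise ValueError("need at least two repeated measurements")
    average = mean(measurements)
    if average == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100.0 * stdev(measurements) / average


# Hypothetical repeated glandular-volume readings (cm^3) of one image.
repeated_readings = [100.0, 104.0, 96.0, 102.0, 98.0]
cv_percent = coefficient_of_variation(repeated_readings)
print(f"{cv_percent:.2f}%")
```

Because the measure is relative to the mean, it allows repeatability to be compared across breasts of very different glandular volume.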
Opolski, Maksymilian P; Pregowski, Jerzy; Kruk, Mariusz; Kepka, Cezary; Staruch, Adam D; Witkowski, Adam
2014-07-01
The widespread clinical application of coronary computed tomography angiography (CCTA) has resulted in increased referral patterns of patients with intermediate coronary stenoses to invasive coronary angiography. We evaluated the application of advanced quantitative coronary angiography (A-QCA) for predicting fractional flow reserve (FFR) in intermediate coronary lesions detected on CCTA. Fifty-six patients with 66 single intermediate coronary lesions (≥ 50% to 80% stenosis) on CCTA prospectively underwent coronary angiography and FFR. A-QCA including calculation of the Poiseuille-based index defined as the ratio of lesion length to the fourth power of the minimal lumen diameter (MLD) was performed. Significant stenosis was defined as FFR ≤ 0.80. The mean FFR was 0.86 ± 0.09, and 18 lesions (27%) were functionally significant. FFR correlated with lesion length (R=-0.303, P=0.013), MLD (R=0.527, P<0.001), diameter stenosis (R=-0.404, P=0.001), minimum lumen area (MLA) (R=0.530, P<0.001), lumen stenosis (R=-0.400, P=0.001), and Poiseuille-based index (R=-0.602, P<0.001). The optimal cutoff values for MLD, MLA, diameter stenosis, and lumen stenosis were ≤ 1.3 mm, ≤ 1.5 mm, >44%, and >69%, respectively (maximum negative predictive value of 94% for MLA, maximum positive predictive value of 58% for diameter stenosis). The Poiseuille-based index was the most accurate (C statistic 0.86, sensitivity 100%, specificity 71%, positive predictive value 56%, and negative predictive value 100%) predictor of FFR ≤ 0.80, but showed the lowest interobserver agreement (intraclass correlation coefficient 0.37). A-QCA might be used to rule out significant ischemia in intermediate stenoses detected by CCTA. The diagnostic application of the Poiseuille-based angiographic index is precluded by its high interobserver variability.
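The Poiseuille-based index is defined above as lesion length divided by the fourth power of the minimal lumen diameter (MLD). The sketch below computes that ratio and illustrates one plausible reason for its poor reproducibility: the fourth power amplifies small MLD differences between observers. The function name and the input values are hypothetical, and the amplification argument is an interpretation, not a finding stated in the abstract.

```python
def poiseuille_index(lesion_length_mm, mld_mm):
    """Lesion length divided by MLD to the fourth power (units: mm^-3)."""
    if lesion_length_mm <= 0 or mld_mm <= 0:
        raise ValueError("length and diameter must be positive")
    return lesion_length_mm / mld_mm ** 4


# Hypothetical lesion: 20 mm long, MLD read as 1.3 mm by one observer.
index_a = poiseuille_index(20.0, 1.3)
# A second observer reads the MLD about 10% smaller.
index_b = poiseuille_index(20.0, 1.3 * 0.9)
relative_change = index_b / index_a
print(round(index_a, 2), round(index_b, 2), round(relative_change, 2))
```

With these inputs a 10% smaller diameter raises the index by a factor of 1/0.9^4, roughly 1.5, so modest measurement differences can move a lesion across any fixed cutoff.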
Dolz, J; Kirişli, H A; Fechter, T; Karnitzki, S; Oehlke, O; Nestle, U; Vermandel, M; Massoptier, L
2016-05-01
Accurate delineation of organs at risk (OARs) on computed tomography (CT) images is required for radiation treatment planning (RTP). Because manual delineation of OARs is time consuming and prone to high interobserver variability, many (semi-)automatic methods have been proposed. However, most of them are specific to a particular OAR. Here, an interactive computer-assisted system able to segment various OARs required for thoracic radiation therapy is introduced. Segmentation information (foreground and background seeds) is interactively added by the user in any of the three main orthogonal views of the CT volume and is subsequently propagated within the whole volume. The proposed method is based on the combination of watershed transformation and graph-cuts algorithm, which is used as a powerful optimization technique to minimize the energy function. The OARs considered for thoracic radiation therapy are the lungs, spinal cord, trachea, proximal bronchus tree, heart, and esophagus. The method was evaluated on multivendor CT datasets of 30 patients. Two radiation oncologists participated in the study and manual delineations from the original RTP were used as ground truth for evaluation. Delineation of the OARs obtained with the minimally interactive approach was approved to be usable for RTP in nearly 90% of the cases, excluding the esophagus, whose segmentation was mostly rejected, thus leading to a time gain ranging from 50% to 80% in RTP. Considering exclusively accepted cases, over all OARs, a Dice similarity coefficient higher than 0.7 and a Hausdorff distance below 10 mm with respect to the ground truth were achieved. In addition, the interobserver analysis did not highlight any statistically significant difference, with the exception of the segmentation of the heart, in terms of Hausdorff distance and volume difference.
An interactive, accurate, fast, and easy-to-use computer-assisted system able to segment various OARs required for thoracic radiation therapy has been presented and clinically evaluated. The introduction of the proposed system in clinical routine may offer a valuable new option to radiation oncologists in performing RTP.
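The two evaluation metrics named in this abstract, the Dice similarity coefficient and the Hausdorff distance, can be sketched as follows. This is an illustrative brute-force version that treats each segmentation as a set of voxel coordinates; it is not the authors' implementation, and a real evaluation would also have to account for voxel spacing in millimetres.

```python
import math


def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity: 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty segmentations agree fully
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets (brute force)."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def directed(source, target):
        # Largest distance from any source point to its nearest target point.
        return max(min(math.dist(p, q) for q in target) for p in source)

    return max(directed(points_a, points_b), directed(points_b, points_a))


# Hypothetical 2D masks for illustration.
manual = {(0, 0), (0, 1)}
automatic = {(0, 1), (1, 1)}
overlap = dice_coefficient(manual, automatic)
```

Dice rewards volume overlap, while the Hausdorff distance is driven by the single worst boundary error, so the two can disagree for the same pair of contours.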
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wehrschuetz, M., E-mail: martin.wehrschuetz@klinikum-graz.at; Aschauer, M.; Portugaller, H.
The purpose of this study was to assess interobserver variability and accuracy in the evaluation of renal artery stenosis (RAS) with gadolinium-enhanced MR angiography (MRA) and digital subtraction angiography (DSA) in patients with hypertension. The authors found that source images are more accurate than maximum intensity projection (MIP) for depicting renal artery stenosis. Two independent radiologists reviewed MRA and DSA from 38 patients with hypertension. Studies were postprocessed to display images in MIP and source images. DSA was the standard for comparison in each patient. For each main renal artery, percentage stenosis was estimated for any stenosis detected by the two radiologists. To calculate sensitivity, specificity and accuracy, MRA studies and stenoses were categorized as normal, mild (1-39%), moderate (40-69%) or severe (≥70%), or occluded. DSA stenosis estimates of 70% or greater were considered hemodynamically significant. Analysis of variance demonstrated that MIP estimates of stenosis were greater than source image estimates for both readers. Differences in estimates for MIP versus DSA reached significance in one reader. The interobserver variance for MIP, source images and DSA was excellent (0.80 < κ ≤ 0.90). The specificity of source images was high (97%) but less for MIP (87%); average accuracy was 92% for MIP and 98% for source images. In this study, source images were significantly more accurate than MIP images in one reader, with a similar trend observed in the second reader. The interobserver variability was excellent. When renal artery stenosis is a consideration, high accuracy can only be obtained when source images are examined.
High resolution microendoscopy for classification of colorectal polyps.
Chang, S S; Shukla, R; Polydorides, A D; Vila, P M; Lee, M; Han, H; Kedia, P; Lewis, J; Gonzalez, S; Kim, M K; Harpaz, N; Godbold, J; Richards-Kortum, R; Anandasabapathy, S
2013-07-01
It can be difficult to distinguish adenomas from benign polyps during routine colonoscopy. High resolution microendoscopy (HRME) is a novel method for imaging colorectal mucosa with subcellular detail. HRME criteria for the classification of colorectal neoplasia have not been previously described. Study goals were to develop criteria to characterize HRME images of colorectal mucosa (normal, hyperplastic polyps, adenomas, cancer) and to determine the accuracy and interobserver variability for the discrimination of neoplastic from non-neoplastic polyps when these criteria were applied by novice and expert microendoscopists. Two expert pathologists created consensus HRME image criteria using images from 68 patients with polyps who had undergone colonoscopy plus HRME. Using these criteria, HRME expert and novice microendoscopists were shown a set of training images and then tested to determine accuracy and interobserver variability. Expert microendoscopists identified neoplasia with sensitivity, specificity, and accuracy of 67 % (95 % confidence interval [CI] 58 % - 75 %), 97 % (94 % - 100 %), and 87 %, respectively. Nonexperts achieved sensitivity, specificity, and accuracy of 73 % (66 % - 80 %), 91 % (80 % - 100 %), and 85 %, respectively. Overall, neoplasia were identified with sensitivity 70 % (65 % - 76 %), specificity 94 % (87 % - 100 %), and accuracy 85 %. Kappa values were: experts 0.86; nonexperts 0.72; and overall 0.78. Using the new criteria, observers achieved high specificity and substantial interobserver agreement for distinguishing benign polyps from neoplasia. Increased expertise in HRME imaging improves accuracy. This low-cost microendoscopic platform may be an alternative to confocal microendoscopy in lower-resource or community-based settings.
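The kappa values quoted above measure agreement corrected for chance. A minimal sketch of Cohen's kappa for two raters is given below; the study involved several readers and may well have used a multi-rater variant, so this two-rater form is an assumption made for illustration, and the ratings are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical readings: 1 = neoplastic, 0 = non-neoplastic.
reader_1 = [1, 1, 0, 0]
reader_2 = [1, 0, 0, 0]
kappa = cohens_kappa(reader_1, reader_2)
```

Here the readers agree on three of four cases (0.75), but half of that agreement is expected by chance, giving a kappa of 0.5.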
Claessen, Femke M A P; van den Ende, Kimberly I M; Doornberg, Job N; Guitton, Thierry G; Eygendaal, Denise; van den Bekerom, Michel P J
2015-10-01
The radiographic appearance of osteochondritis dissecans (OCD) of the humeral capitellum varies according to the stage of the lesion. It is important to evaluate the stage of OCD lesion carefully to guide treatment. We compared the interobserver reliability of currently used classification systems for OCD of the humeral capitellum to identify the most reliable classification system. Thirty-two musculoskeletal radiologists and orthopaedic surgeons specialized in elbow surgery from several countries evaluated anteroposterior and lateral radiographs and corresponding computed tomography (CT) scans of 22 patients to classify the stage of OCD of the humeral capitellum according to the classification systems developed by (1) Minami, (2) Berndt and Harty, (3) Ferkel and Sgaglione, and (4) Anderson on a Web-based study platform including a Digital Imaging and Communications in Medicine viewer. Magnetic resonance imaging was not evaluated as part of this study. We measured agreement among observers using the Siegel and Castellan multirater κ. All OCD classification systems, except for Berndt and Harty, which had poor agreement among observers (κ = 0.20), had fair interobserver agreement: κ was 0.27 for the Minami, 0.23 for Anderson, and 0.22 for Ferkel and Sgaglione classifications. The Minami Classification was significantly more reliable than the other classifications (P < .001). The Minami Classification was the most reliable for classifying different stages of OCD of the humeral capitellum. However, it is unclear whether radiographic evidence of OCD of the humeral capitellum, as categorized by the Minami Classification, guides treatment in clinical practice as a result of this fair agreement. Copyright © 2015 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Yildizer Keris, Elif; Demirel, Oguzhan; Ozdede, Melih; Altunkaynak, Bulent; Peker, Ilkay
2017-01-01
The aim of this in vitro study was to assess the diagnostic performance of cone-beam computed tomography (CBCT) in the detection of secondary carious lesions under composite resin fillings applied to different types of cavities. Occlusal cavities (O) (n=18), occlusal cavities with mesial or distal component (MO/DO) (n=30), and mesial-occlusal-distal cavities (MOD) (n=30) were prepared in seventy eight extracted human posterior teeth. In half of the cavities in each group, artificial secondary caries lesions were simulated. All cavities were restored by using composite resin. All specimens were embedded in silicone and they were positioned to have approximal contacts. CBCT imaging was done and data were evaluated two times with two week interval by two observers, using a five-point confidence scale. Intra- and inter-observer agreements were calculated with Kappa statistics (κ). The area under (Az) the receiver operating characteristic (ROC) curve was used to evaluate the diagnostic accuracy. Intra- (κ =0.89) and inter-observer (κ = 0.79) agreements were found to be excellent. Az values were highest for the O restorations which is followed by the MOD and DO/MO restorations. Az values for MOD and DO/MO restorations were very low and no statistically significant difference was found. Sensitivity for DO/MO restorations and specificity for MOD restorations were found to be the lowest values. Diagnostic performance of CBCT was higher in O composite restorations than MOD and DO/MO restorations for secondary caries detection. The use of alternative imaging methods rather than CBCT may be useful for evaluating secondary caries under composite MOD and DO/MO restorations.
Neves, Frederico S; Vasconcelos, Taruska V; Campos, Paulo S F; Haiter-Neto, Francisco; Freitas, Deborah Q
2014-02-01
The aim of this study was to evaluate the effect of scan mode of the cone beam computed tomography (CBCT) in the preoperative dental implant measurements. Completely edentulous mandibles with entirely resorbed alveolar processes were selected for this study. Five regions were selected (incisor, canine, premolar, first molar, and second molar). The mandibles were scanned with Next Generation i-CAT CBCT unit (Imaging Sciences International, Inc, Hatfield, PA, USA) with half (180°) and full (360°) mode. Two oral radiologists performed vertical measurements in all selected regions; the measurements of half of the sample were repeated within an interval of 30 days. The mandibles were sectioned using an electrical saw in all evaluated regions to obtain the gold standard. The intraclass correlation coefficient was calculated for the intra- and interobserver agreement. Descriptive statistics were calculated as mean, median, and standard deviation. Wilcoxon signed rank test was used to determine the correlation between the measurements obtained in different scan mode with the gold standard. The significance level was 5%. The values of intra- and interobserver reproducibility indicated a strong agreement. In the dental implant measurements, except the bone height of the second molar region in full scan mode (P = 0.02), the Wilcoxon signed rank test did not show statistical significant difference with the gold standard (P > 0.05). Both modes provided real measures, necessary when performing implant planning; however, half scan mode uses smaller doses, following the principle of effectiveness. We believe that this method should be used because of the best dose-effect relationship and offer less risk to the patient. © 2012 John Wiley & Sons A/S.
Benatti, Lucia; Corvi, Federico; Tomasso, Livia; Mercuri, Stefano; Querques, Lea; Ricceri, Fulvio; Bandello, Francesco; Querques, Giuseppe
2017-06-01
To analyze the inter-methods agreement in arteriovenous ratio (AVR) evaluation between spectral-domain optical coherence tomography (SD-OCT) and Dynamic Vessel Analyzer (DVA). Healthy volunteers underwent DVA and SD-OCT examination. AVR was measured by SD-OCT using the four external lines of the optic nerve head-centered 7-line cube and by DVA using an automated AVR estimation. The mean AVR was calculated, twice, separately by two independent readers for each tool. Twenty-two eyes of 11 healthy subjects (five women and six men, mean age 35) were included. AVR analysis by DVA showed high inter-observer agreement between reader 1 and 2, and high intra-observer agreement for both reader 1 and reader 2. With regard to AVR analysis on SD-OCT, we found high inter-observer agreement between reader 1 and 2, and low intra-observer agreement for reader 2 but high intra-observer agreement for reader 1. Overall, the mean AVR measured on SD-OCT turned out to be significantly higher than mean AVR measured through DVA (reader 1, 0.9023 ± 0.06 vs 0.8036 ± 0.08; p < 0.001, and reader 2, 0.9067 ± 0.06 vs 0.8083 ± 0.05; p= 0.003). No inter-method agreement in AVR could be detected in the present study due to bias in measurements (shift between DVA and SD-OCT). We found significant difference in the two noninvasive methods for AVR measurement, with a tendency for SD-OCT to overestimate retinal vascular caliber in comparison to DVA. This may be useful for achieving greater accuracy in the evaluation of retinal vessel in ocular as well as systemic diseases.
Pulmonary tumor measurements from x-ray computed tomography in one, two, and three dimensions.
Villemaire, Lauren; Owrangi, Amir M; Etemad-Rezai, Roya; Wilson, Laura; O'Riordan, Elaine; Keller, Harry; Driscoll, Brandon; Bauman, Glenn; Fenster, Aaron; Parraga, Grace
2011-11-01
We evaluated the accuracy and reproducibility of three-dimensional (3D) measurements of lung phantoms and patient tumors from x-ray computed tomography (CT) and compared these to one-dimensional (1D) and two-dimensional (2D) measurements. CT images of three spherical and three irregularly shaped tumor phantoms were evaluated by three observers who performed five repeated measurements. Additionally, three observers manually segmented 29 patient lung tumors five times each. Follow-up imaging was performed for 23 tumors and response criteria were compared. For a single subject, imaging was performed on nine occasions over 2 years to evaluate multidimensional tumor response. To evaluate measurement accuracy, we compared imaging measurements to ground truth using analysis of variance. For estimates of precision, intraobserver and interobserver coefficients of variation and intraclass correlations (ICC) were used. Linear regression and Pearson correlations were used to evaluate agreement and tumor response was descriptively compared. For spherical shaped phantoms, all measurements were highly accurate, but for irregularly shaped phantoms, only 3D measurements were in high agreement with ground truth measurements. All phantom and patient measurements showed high intra- and interobserver reproducibility (ICC >0.900). Over a 2-year period for a single patient, there was disagreement between tumor response classifications based on 3D measurements and those generated using 1D and 2D measurements. Tumor volume measurements were highly reproducible and accurate for irregular, spherical phantoms and patient tumors with nonuniform dimensions. Response classifications obtained from multidimensional measurements suggest that 3D measurements provide higher sensitivity to tumor response. Copyright © 2011 AUR. Published by Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fleckenstein, Jochen; Hellwig, Dirk; Kremp, Stephanie
2011-11-15
Purpose: The integration of fluoro-deoxy-D-glucose positron emission tomography (FDG-PET) in the process of radiotherapy (RT) planning of locally advanced non-small-cell lung cancer (NSCLC) may improve diagnostic accuracy and minimize interobserver variability compared with target volume definition solely based on computed tomography. Furthermore, irradiating only FDG-PET-positive findings and omitting elective nodal regions may allow dose escalation by treating smaller volumes. The aim of this prospective pilot trial was to evaluate the therapeutic safety of FDG-PET-based RT treatment planning with an autocontour-derived delineation of the primary tumor. Methods and Materials: Eligible patients had Stages II-III inoperable NSCLC, and simultaneous, platinum-based radiochemotherapy was indicated. FDG-PET and computed tomography acquisitions in RT treatment planning position were coregistered. The clinical target volume (CTV) included the FDG-PET-defined primary tumor, which was autodelineated with a source-to-background algorithm, plus FDG-PET-positive lymph node stations. Limited by dose restrictions for normal tissues, prescribed total doses were in the range of 66.6 to 73.8 Gy. The primary endpoint was the rate of out-of-field isolated nodal recurrences (INR). Results: As per intent to treat, 32 patients received radiochemotherapy. In 15 of these patients, dose escalation above 66.6 Gy was achieved. No Grade 4 toxicities occurred. After a median follow-up time of 27.2 months, the estimated median survival time was 19.3 months. During the observation period, one INR was observed in 23 evaluable patients. Conclusions: FDG-PET-confined target volume definition in radiochemotherapy of NSCLC, based on a contrast-oriented source-to-background algorithm, was associated with a low risk of INR. It might provide improved tumor control because of dose escalation.
Rei, Mariana; Tavares, Sara; Pinto, Pedro; Machado, Ana P; Monteiro, Sofia; Costa, Antónia; Costa-Santos, Cristina; Bernardes, João; Ayres-De-Campos, Diogo
2016-10-01
Visual analysis of cardiotocographic (CTG) tracings has been shown to be prone to poor intra- and interobserver agreement when several interpretation guidelines are used, and this may have an important impact on the technology's performance. The aim of this study was to evaluate agreement in CTG interpretation using the new 2015 FIGO guidelines on intrapartum fetal monitoring. A pre-existing database of intrapartum CTG tracings was used to sequentially select 151 cases acquired with a fetal electrode, with duration exceeding 60minutes, and signal loss less than 15%. These tracings were presented to six clinicians, three with more than 5 years' experience in the labor ward, and three with 5 or less years' experience. Observers were asked to evaluate tracings independently, to assess basic CTG features: baseline, variability, accelerations, decelerations, sinusoidal pattern, tachysystole, and to classify each tracing as normal, suspicious or pathologic, according to the 2015 FIGO guidelines on intrapartum fetal monitoring. Agreement between observers was evaluated using the proportions of agreement (Pa), with 95% confidence intervals (95%CI). A good interobserver agreement was found in the evaluation of most CTG features, but not bradycardia, reduced variability, saltatory pattern, absence of accelerations and absence of decelerations. For baseline classification Pa was 0.85 [0.82-0.90], for variability 0.82 [0.78-0.85], for accelerations 0.72 [0.68-0.75], for tachysystole 0.77 [0.74-0.81], for decelerations 0.92 [0.90-0.95], for variable decelerations 0.62 [0.58-0.65], for late decelerations 0.63 [0.59-0.66], for repetitive decelerations 0.73 [0.69-0.78], and for prolonged decelerations 0.81 [0.77-0.85]. For overall CTG classification, Pa were 0.60 [0.56-0.64], for classification as normal 0.67 [0.61-0.72], for suspicious 0.54 [0.48-0.60] and for pathologic 0.59 [0.51-0.66]. 
No differences in agreement according to the level of expertise were observed, except in the identification of accelerations, where it was better in the more experienced group. A good interobserver agreement was found in evaluation of most CTG features and in overall tracing classification. Results were better than those reported in previous studies evaluating agreement in overall tracing classification. Observer experience did not appear to play a role in agreement. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
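The proportion of agreement (Pa) used in this study can be illustrated with a simple pairwise version: the fraction of observer pairs, pooled over all tracings, that assigned the same category. This pairwise pooling is an assumption made for the sketch; the abstract does not spell out the exact estimator or how the confidence intervals were obtained, and the ratings below are invented.

```python
from itertools import combinations


def proportion_of_agreement(ratings_by_case):
    """Fraction of observer pairs, pooled over all cases, that gave the same rating."""
    agreeing_pairs = 0
    total_pairs = 0
    for ratings in ratings_by_case:
        for first, second in combinations(ratings, 2):
            total_pairs += 1
            if first == second:
                agreeing_pairs += 1
    if total_pairs == 0:
        raise ValueError("need at least one case rated by two or more observers")
    return agreeing_pairs / total_pairs


# Hypothetical classifications of two tracings by three observers.
tracings = [
    ["normal", "normal", "normal"],
    ["normal", "suspicious", "normal"],
]
pa = proportion_of_agreement(tracings)
```

Unlike kappa, this measure is not corrected for chance agreement, so the two statistics are not directly comparable across studies.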
Saade, Charbel; Mayat, Ahmad; El-Merhi, Fadi
2016-01-01
Matching contrast injection timing with vessel dynamics significantly improves vessel opacification and reduces contrast dose in the assessment of pulmonary embolism during computed tomography (CT) pulmonary angiography. The aim of this study was to investigate opacification of the pulmonary vasculature (PV) during CT pulmonary angiography using a patient-specific contrast formula (PSCF) and exponentially decelerated contrast media (EDCM) injection rate. Institutional review board approved this retrospective study. Computed tomography pulmonary angiography was performed on 200 patients with suspected pulmonary embolism using a 64-channel CT scanner. Patient demographics were equally distributed. Patients were randomly assigned to 2 equal protocol groups: protocol A used a PSCF, and protocol B involved the use of a PSCF combined with EDCM. The mean cross-sectional opacification profile of 8 central and 11 peripheral PVs were measured for each patient, and arteriovenous contrast ratio was calculated. Protocols were compared using Mann-Whitney U nonparametric statistics. Jackknife alternative free-response receiver operating characteristic analyses were used to assess diagnostic efficacy. Interobserver variations were investigated using kappa methods. A number of pulmonary arteries demonstrated increases in opacification (P < 0.02) for protocol B compared with A, whereas opacification in all veins was reduced in protocol B (P < 0.03). Subsequently, increased arteriovenous contrast ratio in protocol B compared with A was observed at all anatomic locations (P < 0.0002). An increase in jackknife alternative free-response receiver operating characteristic figure of merit (P < 0.0002) and interobserver variation was observed with protocol B compared with protocol A (κ = 0.3-0.73). Mean contrast volume was reduced in protocol B (29 [4] mL) compared with protocol A (33 [9] mL). 
Mean effective radiation dose in protocol B (1.2 [0.4] mSv) was reduced by 14% compared with protocol A (1.4 [0.6] mSv). Significant improvements in visualization of the PV can be achieved with a low contrast volume using an EDCM and PSCF. The reduced risk of cancer induction is highlighted.
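The reported 14% dose reduction follows directly from the two mean values in the abstract (1.4 mSv for protocol A, 1.2 mSv for protocol B). A short check, with a hypothetical helper name, applied to the dose and also to the mean contrast volumes:

```python
def percent_reduction(reference, reduced):
    """Relative reduction of `reduced` with respect to `reference`, in percent."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (reference - reduced) / reference * 100.0


# Mean values taken from the abstract.
dose_reduction = percent_reduction(1.4, 1.2)       # effective dose, mSv
volume_reduction = percent_reduction(33.0, 29.0)   # contrast volume, mL
```

Rounded to whole percentages, these give about 14% for dose and about 12% for contrast volume.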
Pulerwitz, Todd C; Khalique, Omar K; Nazif, Tamim N; Rozenshtein, Anna; Pearson, Gregory D N; Hahn, Rebecca T; Vahl, Torsten P; Kodali, Susheel K; George, Isaac; Leon, Martin B; D'Souza, Belinda; Po, Ming Jack; Einstein, Andrew J
2016-01-01
Transcatheter aortic valve replacement (TAVR) is a lifesaving procedure for many patients at high risk for surgical aortic valve replacement. The prevalence of chronic kidney disease (CKD) is high in this population, and thus a very low contrast volume (VLCV) computed tomography angiography (CTA) protocol providing comprehensive cardiac and vascular imaging would be valuable. Fifty-two patients with severe, symptomatic aortic valve disease undergoing pre-TAVR CTA assessment from 2013 to 2014 at Columbia University Medical Center were studied, including all 26 patients with CKD (eGFR<30 mL/min) who underwent a novel VLCV protocol (20 mL of iohexol at 2.5 mL/s), and 26 standard-contrast-volume (SCV) protocol patients. Using a 320-slice volumetric scanner, the protocol included ECG-gated volume scanning of the aortic root followed by medium-pitch helical vascular scanning through the femoral arteries. Two experienced cardiologists performed aortic annulus and root measurements. Vascular image quality was assessed by two radiologists using a 4-point scale. VLCV patients had mean (±SD) age 86 ± 6.5, BMI 23.9 ± 3.4 kg/m(2) with 54% men; SCV patients age 83 ± 8.8, BMI 28.7 ± 5.3 kg/m(2), 65% men. There was excellent intra- and inter-observer agreement for annular and root measurements, and excellent agreement with 3D-transesophageal echocardiographic measurements. Both radiologists found diagnostic-quality vascular imaging in 96% of VLCV and 100% of SCV cases, with excellent inter-observer agreement. This study is the first of its kind to report the feasibility and reproducibility of measurements for a VLCV protocol for comprehensive pre-TAVR CTA. There was excellent agreement of cardiac measurements and almost all studies were diagnostic quality for vascular access assessment. Copyright © 2016 Society of Cardiovascular Computed Tomography. Published by Elsevier Inc. All rights reserved.
Reliability and concurrent validity of the Infant Motor Profile.
Heineman, Kirsten R; Middelburg, Karin J; Bos, Arend F; Eidhof, Lieke; La Bastide-Van Gemert, Sacha; Van Den Heuvel, Edwin R; Hadders-Algra, Mijna
2013-06-01
The Infant Motor Profile (IMP) is a qualitative assessment of motor behaviour in infancy. It consists of five domains: movement variation, variability, fluency, symmetry, and performance. The aim of this study was to assess interobserver reliability and concurrent validity of the IMP with the Alberta Infant Motor Scale (AIMS) and an age-specific neurological examination. Fifty-nine preterm infants (25 females, 34 males; median gestational age 29.7wks, median birthweight 1285g) and 146 term infants (74 females, 72 males; median gestational age 40.1wks, birthweight 3500g) were included. Assessments were performed at corrected ages of 4, 6, 10, 12, and 18 months and consisted of the IMP, AIMS, and an age-specific neurological examination. Interobserver reliability was investigated on a sample of 25 video recordings. Non-parametric statistics were used to analyse the data. Interobserver reliability was high (intraclass correlation coefficient 0.95). At all ages, AIMS scores correlated weakly to fairly with total IMP scores (Spearman's ρ 0.36-0.55), but moderately to strongly with scores on the performance domain of the IMP (Spearman's ρ 0.47-0.84). A clear relation was found between total IMP score and outcome of the neurological examination (Kruskal-Wallis p<0.001 at all ages). Interobserver reliability of the IMP is good. Concurrent validity with the AIMS is best for the IMP performance domain. Concurrent validity with age-specific neurological examination is very good. © The Authors. Developmental Medicine & Child Neurology © 2013 Mac Keith Press.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Monsky, Wayne L., E-mail: wayne.monsky@ucdmc.ucdavis.edu; Garza, Armando S.; Kim, Isaac
Purpose: The primary purpose of this study was to demonstrate intraobserver/interobserver reproducibility for novel semiautomated measurements of hepatic volume used for Yttrium-90 dose calculations as well as whole-liver and necrotic-liver (hypodense/nonenhancing) tumor volume after radioembolization. The secondary aim was to provide initial comparisons of tumor volumetric measurements with linear measurements, as defined by Response Evaluation Criteria in Solid Tumors criteria, and survival outcomes. Methods: Between 2006 and 2009, 23 consecutive radioembolization procedures were performed for 14 cases of hepatocellular carcinoma and 9 cases of hepatic metastases. Baseline and follow-up computed tomography obtained 1 month after treatment were retrospectively analyzed. Three observers measured liver, whole-tumor, and tumor-necrosis volumes twice using semiautomated software. Results: Good intraobserver/interobserver reproducibility was demonstrated (intraclass correlation [ICC] > 0.9) for tumor and liver volumes. Semiautomated measurements of liver volumes were statistically similar to those obtained with manual tracing (ICC = 0.868), but they required significantly less time to perform (p < 0.0001, ICC = 0.088). There was a positive association between change in linear tumor measurements and whole-tumor volume (p < 0.0001). However, linear measurements did not correlate with volume of necrosis (p > 0.05). Dose, change in tumor diameters, tumor volume, and necrotic volume did not correlate with survival (p > 0.05 in all instances). However, Kaplan-Meier curves suggest that a >10% increase in necrotic volume correlated with survival (p = 0.0472). Conclusion: Semiautomated volumetric analysis of liver, whole-tumor, and tumor-necrosis volume can be performed with good intraobserver/interobserver reproducibility. In this small retrospective study, measurements of tumor necrosis were suggested to correlate with survival.
Nestle, Ursula; Rischke, Hans Christian; Eschmann, Susanne Martina; Holl, Gabriele; Tosch, Marco; Miederer, Matthias; Plotkin, Michail; Essler, Markus; Puskas, Cornelia; Schimek-Jasch, Tanja; Duncker-Rohr, Viola; Rühl, Friederike; Leifert, Anja; Mix, Michael; Grosu, Anca-Ligia; König, Jochem; Vach, Werner
2015-11-01
Oncologic imaging is a key for successful cancer treatment. While the quality assurance (QA) of image acquisition protocols has already been focussed, QA of reading and reporting offers still room for improvement. The latter was addressed in the context of a prospective multicentre trial on fluoro-deoxyglucose (FDG)-positron-emission tomography (PET)/CT-based chemoradiotherapy for locally advanced non-small cell lung cancer (NSCLC). An expert panel was prospectively installed performing blinded reviews of mediastinal NSCLC involvement in FDG-PET/CT. Due to a high initial reporting inter-observer disagreement, the independent data monitoring committee (IDMC) triggered an interventional harmonisation process, which overall involved 11 experts uttering 6855 blinded diagnostic statements. After assessing the baseline inter-observer agreement (IOA) of a blinded re-review (phase 1), a discussion process led to improved reading criteria (phase 2). Those underwent a validation study (phase 3) and were then implemented into the study routine. After 2 months (phase 4) and 1 year (phase 5), the IOA was reassessed. The initial overall IOA was moderate (kappa 0.52 CT; 0.53 PET). After improvement of reading criteria, the kappa values improved substantially (kappa 0.61 CT; 0.66 PET), which was retained until the late reassessment (kappa 0.71 CT; 0.67 PET). Subjective uncertainty was highly predictive for low IOA. The IOA of an expert panel was significantly improved by a structured interventional harmonisation process which could be a model for future clinical trials. Furthermore, the low IOA in reporting nodal involvement in NSCLC may bear consequences for individual patient care. Copyright © 2015 Elsevier Ltd. All rights reserved.
Xu, Yi; Zhao, Shufan; Shi, Jiayu; Wang, Yan; Shi, Bing; Zheng, Qian; Lo, Lun-Jou
2013-08-01
This study investigated 3D differences of the pharynx in adult patients with unrepaired isolated cleft palate (ICP) versus normal adults using cone-beam computed tomography (CBCT). CBCT data of 32 adult patients with nonsyndromic unrepaired ICP and 30 normal controls were acquired. Image processing and analyses were performed using Mimics (Materialise NV, Leuven, Belgium). Linear, planar, and volumetric measurements and comparisons were performed between patients with ICP and controls. Interobserver and intraobserver reliabilities of 3D pharyngeal analysis were determined by the Pearson correlation coefficient. Statistical analyses comparing patients with ICP to normal adults were performed using independent-samples t test, with the significance threshold set at P = .05. Interobserver and intraobserver reliabilities were high. Pearson correlation coefficients ranged from 0.992 to 0.999 for interobserver measurements and from 0.994 to 0.999 for intraobserver measurements. Anterior height (P = .000), total depth (P = .003), and floor length (P = .034) of the bony nasopharynx; posteroanterior diameter of the pharyngeal airway at the palatal plane (P = .000); cross-sectional area of the pharyngeal airway at the palatal plane (P = .000); total volume (P = .031); volume above the palatal plane (P = .024); and the volume between the palatal plane and the plane of the most anterior point on the inferior margin of the outline of the body of the second cervical vertebra (P = .022) were larger in patients with ICP. This imaging study showed an enlarged nasopharynx in the sagittal plane and increased nasopharyngeal airway volume at the palatal plane in patients with ICP. Crown Copyright © 2013. Published by Elsevier Inc. All rights reserved.
Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin
2013-09-01
There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee.
A demonstration of lack of variability among six tuberculin skin test readers.
Perez-Stable, E J; Slutkin, G
1985-01-01
The variability of tuberculin skin test readings among six trained and experienced readers was evaluated using a modified sliding caliper method. Each of 537 tests was read independently by two readers. There were 23 disagreements between paired readers, resulting in an overall interobserver reliability of 95.7 per cent. In 82 per cent of the paired readings the results differed by 2 mm or less. The lack of observer variability was likely due to the training and experience of the readers. PMID:4051078
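The reliability figure quoted above follows directly from the counts in the abstract (537 paired readings, 23 disagreements). A minimal sketch of that arithmetic, assuming "interobserver reliability" here means simple percent agreement; the function name `percent_agreement` is illustrative and not from the study:

```python
def percent_agreement(n_pairs: int, n_disagreements: int) -> float:
    """Share of paired readings on which both readers agree, in per cent."""
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    if not 0 <= n_disagreements <= n_pairs:
        raise ValueError("n_disagreements must lie between 0 and n_pairs")
    return 100.0 * (n_pairs - n_disagreements) / n_pairs


# Counts taken from the abstract: 537 paired readings, 23 disagreements.
agreement = percent_agreement(537, 23)
print(f"{agreement:.1f} per cent")  # 95.7 per cent
```

Percent agreement does not correct for agreement expected by chance, which is why many of the other abstracts in this listing report kappa instead.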
Cho, Heeyoon; Pillai, Parvathy; Nicholson, Laura; Sobrin, Lucia
2016-01-01
To describe the clinical course of uveitis-associated inflammatory papillitis and evaluate the utility and reproducibility of optic nerve spectral domain optical coherence tomography (SD-OCT). Data on 22 eyes of 14 patients with uveitis-related papillitis and optic nerve imaging were reviewed. SD-OCT measure reproducibility was determined and parameters were compared in active vs. inactive uveitis. Papillitis resolution lagged behind uveitis resolution in three patients. For SD-OCT measures, the intraclass correlation coefficients were 99.1-100% and 86.9-100% for intraobserver and interobserver reproducibility, respectively. All SD-OCT optic nerve measures except inferior and nasal peripapillary retinal thicknesses were significantly higher in active vs. inactive uveitis after correction for multiple hypotheses testing. Mean optic nerve central thickness decreased from 545.1 to 362.9 µm (p = 0.01). Resolution of inflammatory papillitis can lag behind resolution of uveitis. SD-OCT assessment of papillitis is reproducible and correlates with presence vs. resolution of uveitis.
NASA Astrophysics Data System (ADS)
Ang, Teri; Harkness, Elaine F.; Maxwell, Anthony J.; Lim, Yit Y.; Emsley, Richard; Howell, Anthony; Evans, D. Gareth; Astley, Susan; Gadde, Soujanya
2017-03-01
Breast density is a strong risk factor for breast cancer and has potential use in breast cancer risk prediction, with subjective methods of density assessment providing a strong relationship with the development of breast cancer. This study aims to assess intra- and inter-observer variability in visual density assessment recorded on Visual Analogue Scales (VAS) among trained readers, and examine whether reader age, gender and experience are associated with assessed density. Eleven readers estimated the breast density of 120 mammograms on two occasions 3 years apart using VAS. Intra- and inter-observer agreement was assessed with Intraclass Correlation Coefficient (ICC) and variation between readers visualised on Bland-Altman plots. The mean scores of all mammograms per reader were used to analyse the effect of reader attributes on assessed density. Excellent intra-observer agreement (ICC>0.80) was found in the majority of the readers. All but one reader had a mean difference of <10 percentage points from the first to the second reading. Inter-observer agreement was excellent for consistency (ICC 0.82) and substantial for absolute agreement (ICC 0.69). However, the 95% limits of agreement for pairwise differences were -6.8 to 15.7 at the narrowest and 0.8 to 62.3 at the widest. No significant association was found between assessed density and reader age, experience or gender, or with reading time. Overall, the readers were consistent in their scores, although some large variations were observed. Reader evaluation and targeted training may alleviate this problem.
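The 95% limits of agreement reported above are the quantity normally drawn on a Bland-Altman plot: the mean pairwise difference plus or minus 1.96 sample standard deviations of the differences. A minimal sketch, assuming two readers scoring the same mammograms on a 0-100 VAS; the scores below are made up for illustration and are not study data:

```python
from statistics import mean, stdev


def limits_of_agreement(reader_a, reader_b):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    if len(reader_a) != len(reader_b) or len(reader_a) < 2:
        raise ValueError("need two equally long series with at least two readings")
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the pairwise differences
    return bias, bias - half_width, bias + half_width


# Hypothetical VAS density scores (percentage points) for four mammograms.
scores_a = [10, 20, 30, 40]
scores_b = [8, 22, 27, 41]
bias, lower, upper = limits_of_agreement(scores_a, scores_b)
print(f"bias {bias:.1f}, limits {lower:.1f} to {upper:.1f}")
```

Wide limits can coexist with a high ICC, which is the pattern the abstract describes: readers rank mammograms consistently while individual pairs still differ by many percentage points.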
Wong, Lih-Ming; Chum, Jia-Min; Maddy, Peter; Chan, Steven T F; Travis, Douglas; Lawrentschuk, Nathan
2010-07-01
Macroscopic hematuria is a common symptom and sign that is challenging to quantify and describe. The degree of hematuria communicated is variable due to health worker experience combined with lack of a reliable grading tool. We produced a reliable, standardized visual scale to describe hematuria severity. Our secondary aim was to validate a new laboratory test to quantify hemoglobin in hematuria specimens. Nurses were surveyed to ascertain current hematuria descriptions. Blood and urine were titrated at varying concentrations and digitally photographed in catheter bag tubing. Photos were processed and printed on transparency paper to create a prototype swatch or card showing light, medium, heavy and old hematuria. Using the swatch 60 samples were rated by nurses and laymen. Interobserver variability was reported using the generalized kappa coefficient of agreement. Specimens were analyzed for hemolysis by measuring optical density at oxyhemoglobin absorption peaks. Interobserver agreement between nurses and laymen was good (kappa = 0.51, p <0.001). Subgroup analysis showed substantial agreement for light hematuria (kappa = 0.71). Overall agreement improved when the moderate (kappa = 0.28) and heavy (kappa = 0.53) hematuria categories were combined (kappa = 0.70). Compared to known blood concentrations the assay of optical density at oxyhemoglobin absorption peaks showed a linear trend. A simple visual scale to grade and communicate hematuria with adequate interobserver agreement is feasible. The test for optical density at oxyhemoglobin absorption peaks is a new method, validated in our study, to quantify hemoglobin in a hematuria specimen. Copyright (c) 2010 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Hadlich, Marcelo Souza; Oliveira, Gláucia Maria Moraes; Feijóo, Raúl A; Azevedo, Clerio F; Tura, Bernardo Rangel; Ziemer, Paulo Gustavo Portela; Blanco, Pablo Javier; Pina, Gustavo; Meira, Márcio; Souza e Silva, Nelson Albuquerque de
2012-10-01
Images used in medicine were standardized in 1993 with the DICOM (Digital Imaging and Communications in Medicine) standard. Several tests use this standard and it is increasingly necessary to design software applications capable of handling this type of image; however, these software applications are not usually free and open-source, which hinders their adaptation to diverse interests. To develop and validate a free and open-source software application capable of handling DICOM coronary computed tomography angiography images. We developed and tested the ImageLab software in the evaluation of 100 tests randomly selected from a database. We carried out 600 tests divided between two observers using ImageLab and another software application sold with Philips Brilliance computed tomography appliances in the evaluation of coronary lesions and plaques around the left main coronary artery (LMCA) and the anterior descending artery (ADA). To evaluate intraobserver, interobserver and intersoftware agreement, we used simple agreement and kappa statistics. The agreement observed between software applications was classified as substantial or almost perfect in most comparisons. The ImageLab software agreed with the Philips software in the evaluation of coronary computed tomography angiography tests, especially in patients without lesions, with lesions < 50% in the LMCA and < 70% in the ADA. The agreement for lesions > 70% in the ADA was lower, but this is also observed when the anatomical reference standard is used.
Sources and magnitude of sampling error in redd counts for bull trout
Jason B. Dunham; Bruce Rieman
2001-01-01
Monitoring of salmonid populations often involves annual redd counts, but the validity of this method has seldom been evaluated. We conducted redd counts of bull trout Salvelinus confluentus in two streams in northern Idaho to address four issues: (1) relationships between adult escapements and redd counts; (2) interobserver variability in redd...
Hänsel, N H; Schubert, G A; Scholz, B; Nikoubashman, O; Othman, A E; Wiesmann, M; Pjontek, R; Brockmann, M A
2018-02-01
To compare the diagnostic quality of time-of-flight magnetic resonance angiography (TOF-MRA) and metal-artefact-reduction (MAR) flat-panel-detector computed tomography angiography (FPCTA) and to determine the imaging technique best suited for evaluating endovascularly and surgically treated aneurysms. The image quality of TOF-MRA and MAR-FPCTA of 44 intracranial implants (coiling: n=20; clipping: n=15; coiling + stenting: n=9) in a patient cohort of 25 was evaluated by two independent readers. Images obtained using MAR-FPCTA (20 second scan time, 496 projections, intravenous contrast medium administration; Artis Zee, Siemens Healthcare, Forchheim) were compared with TOF-MRA images (1.5 or 3 T). Nominal data were analysed using McNemar's chi-square test and ordinal variables using the Wilcoxon rank test. Compared to TOF-MRA, MAR-FPCTA was significantly better suited to detect aneurysm remnants and to evaluate parent vessels after clipping (p<0.01). For coil packages >160 mm³, TOF-MRA provided significantly better assessment than MAR-FPCTA (p<0.01). For small coil packages (<160 mm³), no significant difference between TOF-MRA and MAR-FPCTA (p=0.232) was observed. For different clip sizes (cut-off 492 mm³) likewise no significant differences were found. The interobserver comparison showed high interrater agreement. MAR-FPCTA is significantly better suited for follow-up examinations of clipped aneurysms, whereas for larger coil packages TOF-MRA is preferable. Smaller coil packages can be analysed using MAR-FPCTA or TOF-MRA. Copyright © 2017 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
Montillet, Marie; Baqué-Juston, Marie; Tasu, Jean-Pierre; Bertrand, Sandra; Berthier, Frédéric; Zarqane, Naïma; Brunner, Philippe
2018-03-01
The purpose of this study is to describe a new method to quickly estimate left atrial enlargement (LAE) on computed tomography. Left atrial (LA) volume was assessed with a 3D-threshold Hounsfield unit detection technique, including left atrial appendage and excluding pulmonary venous confluence, in 201 patients with ECG-gated 128-slice dual-source CT and indexed to body surface area. LA and vertebral axial diameter and area were measured at the bottom level of the right inferior pulmonary vein ostium. Ratios of LA diameter and area to those of the vertebra (LAVD and LAVA) were compared to LA volume. In accordance with the literature, a cutoff value of 78 ml/m² was chosen for maximal normal LA volume. 18% of left atria were enlarged. The best cutoff values for LAE assessment were 2.5 for LAVD (AUC: 0.65; 95% CI: 0.58-0.73; sensitivity: 57%; specificity: 71%), and 3 for LAVA (AUC: 0.78; 95% CI: 0.72-0.84; sensitivity: 67%; specificity: 79%), with higher accuracy for LAVA (P=0.015). Inter-observer and intra-observer variability were either good or excellent for LAVD and LAVA (respective intraclass coefficients: 0.792 and 0.910; 0.912 and 0.937). A left atrium area superior to three times the vertebral area indicates LAE with high specificity. • Left atrial enlargement is a frequent condition associated with poor cardiac outcome. • Left atrial enlargement is highly time-consuming to diagnose on CT. • The left atrio-vertebral ratio quickly assesses left atrial enlargement. • A left atrial area > three times vertebral area is highly specific.
Lofthag-Hansen, Sara; Thilander-Klang, Anne; Gröndahl, Kerstin
2011-11-01
To evaluate subjective image quality for two diagnostic tasks, periapical diagnosis and implant planning, for cone beam computed tomography (CBCT) using different exposure parameters and fields of view (FOVs). Examinations were performed in the posterior part of the jaws on a skull phantom with 3D Accuitomo (FOV 3 cm×4 cm) and 3D Accuitomo FPD (FOVs 4 cm×4 cm and 6 cm×6 cm). All combinations of 60, 65, 70, 75, 80 kV and 2, 4, 6, 8, 10 mA with a rotation of 180° and 360° were used. Dose-area product (DAP) value was determined for each combination. The images were presented, displaying the object in axial, cross-sectional and sagittal views, without scanning data in a random order for each FOV and jaw. Seven observers assessed image quality on a six-point rating scale. Intra-observer agreement was good (κw=0.76) and inter-observer agreement moderate (κw=0.52). Stepwise logistic regression showed kV, mA and diagnostic task to be the most important variables. Periapical diagnosis, regardless of jaw, required higher exposure parameters compared to implant planning. Implant planning in the lower jaw required higher exposure parameters compared to the upper jaw. Overall ranking of FOVs gave 4 cm×4 cm, 6 cm×6 cm followed by 3 cm×4 cm. This study has shown that exposure parameters should be adjusted according to diagnostic task. For this particular CBCT brand a rotation of 180° gave good subjective image quality, hence a substantial dose reduction can be achieved without loss of diagnostic information. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Gimber, Lana Hirai; Travis, R Ing; Takahashi, Jayme M; Goodman, Torrey L; Yoon, Hyo-Chun
2009-01-01
Pulmonary computed tomography angiography (CTA) and the Wells criteria both have interobserver variability in the assessment of pulmonary embolism (PE). Quantitative D-dimer assay findings have been shown to have a high negative predictive value in patients with low pretest probability of PE. Evaluate roles for clinical probability and CTA in Emergency Department (ED) patients suspected of acute PE but having a low serum D-dimer level. Prospective observational study of ED patients with possible PE who underwent pulmonary CTA and had D-dimer levels ≤1.0 µg/mL. Clinical probability of PE determined by ED physicians using standard published criteria; pulmonary CTAs read by initial and study radiologists kept unaware of D-dimer results. In 16 months, 744 patients underwent pulmonary CTA, with 347 study participants who had a D-dimer level ≤1.0 µg/mL. In one participant, CTA showed a PE that was agreed on by both the initial and study radiologists. In six participants, the initial findings were reported as positive for PE but were not interpreted as positive by the study radiologist. In none of these participants was PE diagnosed on the basis of clinical probability, of findings on ancillary studies and three-month follow-up examination, or by another radiologist, unaware of findings, acting as a tiebreaker. Pulmonary CTA findings positive for acute embolism should be viewed with caution, especially if the suspected PE is in a distal segmental or subsegmental artery in a patient with a serum D-dimer level of ≤1.0 µg/mL. Furthermore, the Wells criteria may be of limited additional value in this group of patients with low D-dimer levels because most will have low or intermediate clinical probability of PE.
Interstitial Lung Disease in India. Results of a Prospective Registry.
Singh, Sheetu; Collins, Bridget F; Sharma, Bharat B; Joshi, Jyotsna M; Talwar, Deepak; Katiyar, Sandeep; Singh, Nishtha; Ho, Lawrence; Samaria, Jai Kumar; Bhattacharya, Parthasarathi; Gupta, Rakesh; Chaudhari, Sudhir; Singh, Tejraj; Moond, Vijay; Pipavath, Sudhakar; Ahuja, Jitesh; Chetambath, Ravindran; Ghoshal, Aloke G; Jain, Nirmal K; Devi, H J Gayathri; Kant, Surya; Koul, Parvaiz; Dhar, Raja; Swarnakar, Rajesh; Sharma, Surendra K; Roy, Dhrubajyoti J; Sarmah, Kripesh R; Jankharia, Bhavin; Schmidt, Rodney; Katiyar, Santosh K; Jindal, Arpita; Mangal, Daya K; Singh, Virendra; Raghu, Ganesh
2017-03-15
Interstitial lung disease (ILD) is a heterogeneous group of acute and chronic inflammatory and fibrotic lung diseases. Existing ILD registries have had variable findings. Little is known about the clinical profile of ILDs in India. To characterize new-onset ILDs in India by creating a prospective ILD registry using multidisciplinary discussion (MDD) to validate diagnoses. Adult patients of Indian origin living in India with new-onset ILD (27 centers, 19 Indian cities, March 2012-June 2015) without malignancy or infection were included. All had connective tissue disease (CTD) serologies, spirometry, and high-resolution computed tomography of the chest. ILD pattern was defined by high-resolution computed tomography images. Three groups independently made diagnoses after review of clinical data including that from prompted case report forms: local site investigators, ILD experts at the National Data Coordinating Center (NDCC; Jaipur, India) with MDD, and experienced ILD experts at the Center for ILD (CILD; Seattle, WA) with MDD. Cohen's κ was used to assess reliability of interobserver agreement. A total of 1,084 patients were recruited. Final diagnosis: hypersensitivity pneumonitis in 47.3% (n = 513; exposure, 48.1% air coolers), CTD-ILD in 13.9%, and idiopathic pulmonary fibrosis in 13.7%. Cohen's κ: 0.351 site investigator/CILD, 0.519 site investigator/NDCC, and 0.618 NDCC/CILD. Hypersensitivity pneumonitis was the most common new-onset ILD in India, followed by CTD-ILD and idiopathic pulmonary fibrosis; diagnoses varied between site investigators and CILD experts, emphasizing the value of MDD in ILD diagnosis. Prompted case report forms including environmental exposures in prospective registries will likely provide further insight into the etiology and management of ILD worldwide.
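Cohen's κ, used above to compare pairs of diagnostic groups, corrects observed agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch for two raters assigning one category per case; the diagnosis labels below are hypothetical and are not registry data:

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per case."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses from two review groups for six cases.
site = ["HP", "HP", "IPF", "CTD-ILD", "HP", "IPF"]
panel = ["HP", "IPF", "IPF", "CTD-ILD", "HP", "HP"]
print(f"kappa = {cohens_kappa(site, panel):.3f}")  # kappa = 0.455
```

In this toy example the raters agree on four of six cases (67%), yet κ is only about 0.45 because both raters use the same few categories often, so substantial agreement would be expected by chance alone.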
Berthon, Beatrice; Marshall, Christopher; Evans, Mererid; Spezi, Emiliano
2016-07-07
Accurate and reliable tumour delineation on positron emission tomography (PET) is crucial for radiotherapy treatment planning. PET automatic segmentation (PET-AS) eliminates intra- and interobserver variability, but there is currently no consensus on the optimal method to use, as different algorithms appear to perform better for different types of tumours. This work aimed to develop a predictive segmentation model, trained to automatically select and apply the best PET-AS method, according to the tumour characteristics. ATLAAS, the automatic decision tree-based learning algorithm for advanced segmentation, is based on supervised machine learning using decision trees. The model includes nine PET-AS methods and was trained on 100 PET scans with known true contour. A decision tree was built for each PET-AS algorithm to predict its accuracy, quantified using the Dice similarity coefficient (DSC), according to the tumour volume, tumour peak to background SUV ratio and a regional texture metric. The performance of ATLAAS was evaluated for 85 PET scans obtained from fillable and printed subresolution sandwich phantoms. ATLAAS showed excellent accuracy across a wide range of phantom data and predicted the best or near-best segmentation algorithm in 93% of cases. ATLAAS outperformed all single PET-AS methods on fillable phantom data with a DSC of 0.881, while the DSC for H&N phantom data was 0.819. DSCs higher than 0.650 were achieved in all cases. ATLAAS is an advanced automatic image segmentation algorithm based on decision tree predictive modelling, which can be trained on images with known true contour, to predict the best PET-AS method when the true contour is unknown. ATLAAS provides robust and accurate image segmentation with potential applications to radiation oncology.
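The Dice similarity coefficient used above as the accuracy measure is DSC = 2|A ∩ B| / (|A| + |B|), where A and B are the voxel sets of the two contours. A minimal sketch, assuming each segmentation is represented as a collection of voxel coordinates; this representation and the toy masks are illustrative and are not how ATLAAS itself is implemented:

```python
def dice_coefficient(segmentation, reference):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    seg, ref = set(segmentation), set(reference)
    if not seg and not ref:
        return 1.0  # convention assumed here: two empty masks overlap perfectly
    return 2.0 * len(seg & ref) / (len(seg) + len(ref))


# Toy 2-D example: a 2x2 "true" contour and a partly shifted automatic contour.
true_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
auto_contour = {(0, 1), (1, 1), (1, 2)}
print(f"DSC = {dice_coefficient(auto_contour, true_contour):.3f}")  # DSC = 0.571
```

A DSC of 1 means identical contours and 0 means no overlap, so the values of 0.819 to 0.881 quoted in the abstract indicate that most, but not all, voxels were shared with the reference.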
Jia, Xiaoyang; Chen, Yanxi; Qiang, Minfei; Zhang, Kun; Li, Haobo; Jiang, Yuchen; Zhang, Yijie
2016-07-15
Accurate comprehension of the normal humeral morphology is crucial for anatomical reconstruction in shoulder arthroplasty. However, traditional morphological measurements of the humerus were mainly based on cadavers and radiography. The purpose of this study was to provide a series of precise and repeatable parameters of the normal proximal humerus for arthroplasty, based on three-dimensional (3-D) measurements. Radiographic and 3-D computed tomography (CT) measurements of the proximal humerus were performed in a sample of 120 consecutive adults. Differences between sexes, differences between the two imaging modalities, and correlations among the parameters were evaluated. Intra- and inter-observer reproducibility was evaluated using intraclass correlation coefficients (ICCs). In the male group, all parameters except the neck-shaft angle of the humerus, based on 3-D CT images, were greater than those in the female group (P < 0.05). All variables were significantly different between the two imaging modalities (P < 0.05). In 3-D CT measurement, all parameters except the neck-shaft angle correlated with each other (P < 0.001), particularly the two diameters of the humeral head (r = 0.907). All parameters in the 3-D CT measurement had excellent reproducibility (ICC range, 0.878 to 0.936) that was higher than that in the radiographs (ICC range, 0.741 to 0.858). The present study suggested that 3-D CT was more reproducible than plain radiography in the assessment of morphology of the normal proximal humerus. Therefore, this reproducible modality could be utilized in preoperative planning. Our data could serve as an effective guideline for humeral component selection and improve the design of shoulder prostheses.
Intra-tumoral heterogeneity of gemcitabine delivery and mass transport in human pancreatic cancer
NASA Astrophysics Data System (ADS)
Koay, Eugene J.; Baio, Flavio E.; Ondari, Alexander; Truty, Mark J.; Cristini, Vittorio; Thomas, Ryan M.; Chen, Rong; Chatterjee, Deyali; Kang, Ya'an; Zhang, Joy; Court, Laurence; Bhosale, Priya R.; Tamm, Eric P.; Qayyum, Aliya; Crane, Christopher H.; Javle, Milind; Katz, Matthew H.; Gottumukkala, Vijaya N.; Rozner, Marc A.; Shen, Haifa; Lee, Jeffrey E.; Wang, Huamin; Chen, Yuling; Plunkett, William; Abbruzzese, James L.; Wolff, Robert A.; Maitra, Anirban; Ferrari, Mauro; Varadhachary, Gauri R.; Fleming, Jason B.
2014-12-01
There is substantial heterogeneity in the clinical behavior of pancreatic cancer and in its response to therapy. Some of this variation may be due to differences in delivery of cytotoxic therapies between patients and within individual tumors. Indeed, in 12 patients with resectable pancreatic cancer, we previously demonstrated wide inter-patient variability in the delivery of gemcitabine as well as in the mass transport properties of tumors as measured by computed tomography (CT) scans. However, the variability of drug delivery and transport properties within pancreatic tumors is currently unknown. Here, we analyzed regional measurements of gemcitabine DNA incorporation in the tumors of the same 12 patients to understand the degree of intra-tumoral heterogeneity of drug delivery. We also developed a volumetric segmentation approach to measure mass transport properties from the CT scans of these patients and tested inter-observer agreement with this new methodology. Our results demonstrate significant heterogeneity of gemcitabine delivery within individual pancreatic tumors and across the patient cohort, with gemcitabine DNA incorporation in the inner portion of the tumors ranging from 38 to 74% of the total. Similarly, the CT-derived mass transport properties of the tumors had a high degree of heterogeneity, ranging from minimal difference to almost 200% difference between inner and outer portions of the tumor. Our quantitative method to derive transport properties from CT scans demonstrated less than 5% difference in gemcitabine prediction at the average CT-derived transport value across observers. These data illustrate significant inter-patient and intra-tumoral heterogeneity in the delivery of gemcitabine, and highlight how this variability can be reproducibly accounted for using principles of mass transport. 
With further validation as a biophysical marker, transport properties of tumors may be useful in patient selection for therapy and prediction of therapeutic outcome.
2014-01-01
Background This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. Methods A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Results Weak correlations were observed for the diagnoses of columnar cell change (CCC; Kappa = 0.38), columnar cell hyperplasia (CCH; Kappa = 0.32), while a moderate agreement (Kappa = 0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa = 0.62) and lobular carcinoma in situ (LCIS; Kappa = 0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa = 0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa = 0.44), low-grade DCIS (Kappa = 0.47), intermediate-grade DCIS (Kappa = 0.45), and DCIS with microinvasion (Kappa = 0.56). Good agreement was observed between the diagnoses of high-grade DCIS (Kappa = 0.68). Conclusions According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. Virtual Slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725. PMID:24948027
Dervin, Geoffrey F.; Stiell, Ian G.; Wells, George A.; Rody, Kelly; Grabowski, Jenny
2001-01-01
Objective To determine clinicians’ accuracy and reliability for the clinical diagnosis of unstable meniscus tears in patients with symptomatic osteoarthritis of the knee. Design A prospective cohort study. Setting A single tertiary care centre. Patients One hundred and fifty-two patients with symptomatic osteoarthritis of the knee refractory to conservative medical treatment were selected for prospective evaluation of arthroscopic débridement. Intervention Arthroscopic débridement of the knee, including meniscal tear and chondral flap resection, without abrasion arthroplasty. Outcome measures A standardized assessment protocol was administered to each patient by 2 independent observers. Arthroscopic determination of unstable meniscal tears was recorded by 1 observer who reviewed a video recording and was blinded to preoperative data. Those variables that had the highest interobserver agreement and the strongest association with meniscal tear by univariate methods were entered into logistic regression to model the best prediction of resectable tears. Results There were 92 meniscal tears (77 medial, 15 lateral). Interobserver agreement between clinical fellows and treating surgeons was poor to fair (κ < 0.4) for all clinical variables except radiographic measures, which were good. Fellows and surgeons predicted unstable meniscal tear preoperatively with equivalent accuracy of 60%. Logistic regression modelling revealed that a history of swelling and a ballottable effusion were negative predictors. A positive McMurray test was the only positive predictor of unstable meniscal tear. “Mechanical” symptoms were not reliable predictors in this prospective study. The model was 69% accurate for all patients and 76% for those with advanced medial compartment osteoarthritis defined by a joint space height of 2 mm or less. 
Conclusions This study underscored the difficulty in using clinical variables to predict unstable medial meniscal tears in patients with pre-existing osteoarthritis of the knee. The lack of interobserver agreement must be overcome to ensure that the findings can be generalized to other physician observers. PMID:11504260
Gomes, Douglas S; Porto, Simone S; Balabram, Débora; Gobbi, Helenice
2014-06-19
This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Weak correlations were observed for the diagnoses of columnar cell change (CCC; Kappa=0.38), columnar cell hyperplasia (CCH; Kappa=0.32), while a moderate agreement (Kappa=0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa=0.62) and lobular carcinoma in situ (LCIS; Kappa=0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa=0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa=0.44), low-grade DCIS (Kappa=0.47), intermediate-grade DCIS (Kappa=0.45), and DCIS with microinvasion (Kappa=0.56). Good agreement was observed between the diagnoses of high-grade DCIS (Kappa=0.68). According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725.
Non-invasive diagnosis of liver fibrosis in chronic hepatitis C
Schiavon, Leonardo de Lucca; Narciso-Schiavon, Janaína Luz; de Carvalho-Filho, Roberto José
2014-01-01
Assessment of liver fibrosis in chronic hepatitis C virus (HCV) infection is considered a relevant part of patient care and key for decision making. Although liver biopsy has been considered the gold standard for staging liver fibrosis, it is an invasive technique and subject to sampling errors and significant intra- and inter-observer variability. Over the last decade, several noninvasive markers were proposed for liver fibrosis diagnosis in chronic HCV infection, with variable performance. Besides the clear advantage of being noninvasive, a more objective interpretation of test results may overcome the mentioned intra- and inter-observer variability of liver biopsy. In addition, these tests can theoretically offer a more accurate view of fibrogenic events occurring in the entire liver with the advantage of providing frequent fibrosis evaluation without additional risk. However, in general, these tests show low accuracy in discriminating between intermediate stages of fibrosis and may be influenced by several hepatic and extra-hepatic conditions. These methods are either serum markers (usually combined in a mathematical model) or imaging modalities that can be used separately or combined in algorithms to improve accuracy. In this review we will discuss the different noninvasive methods that are currently available for the evaluation of liver fibrosis in chronic hepatitis C, their advantages, limitations and application in clinical practice. PMID:24659877
Soukup, Viktor; Čapoun, Otakar; Cohen, Daniel; Hernández, Virginia; Babjuk, Marek; Burger, Max; Compérat, Eva; Gontero, Paolo; Lam, Thomas; MacLennan, Steven; Mostafid, A Hugh; Palou, Joan; van Rhijn, Bas W G; Rouprêt, Morgan; Shariat, Shahrokh F; Sylvester, Richard; Yuan, Yuhong; Zigeuner, Richard
2017-11-01
Tumour grade is an important prognostic indicator in non-muscle-invasive bladder cancer (NMIBC). Histopathological classifications are limited by interobserver variability (reproducibility), which may have prognostic implications. European Association of Urology NMIBC guidelines suggest concurrent use of both 1973 and 2004/2016 World Health Organization (WHO) classifications. To compare the prognostic performance and reproducibility of the 1973 and 2004/2016 WHO grading systems for NMIBC. A systematic literature search was undertaken incorporating Medline, Embase, and the Cochrane Library. Studies were critically appraised for risk of bias (QUIPS). For prognosis, the primary outcome was progression to muscle-invasive or metastatic disease. Secondary outcomes were disease recurrence, and overall and cancer-specific survival. For reproducibility, the primary outcome was interobserver variability between pathologists. Secondary outcome was intraobserver variability (repeatability) by the same pathologist. Of 3593 articles identified, 20 were included in the prognostic review; three were eligible for the reproducibility review. Increasing tumour grade in both classifications was associated with higher disease progression and recurrence rates. Progression rates in grade 1 patients were similar to those in low-grade patients; progression rates in grade 3 patients were higher than those in high-grade patients. Survival data were limited. Reproducibility of the 2004/2016 system was marginally better than that of the 1973 system. Two studies on repeatability showed conflicting results. Most studies had a moderate to high risk of bias. Current grading classifications in NMIBC are suboptimal. The 1973 system identifies more aggressive tumours. Intra- and interobserver variability was slightly less in the 2004/2016 classification. We could not confirm that the 2004/2016 classification outperforms the 1973 classification in prediction of recurrence and progression. 
This article summarises the utility of two different grading systems for non-muscle-invasive bladder cancer. Both systems predict progression and recurrence, although pathologists vary in their reporting; suggestions for further improvements are made. Copyright © 2017 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Interobserver variability for the WHO classification of pulmonary carcinoids.
Swarts, Dorian R A; van Suylen, Robert-Jan; den Bakker, Michael A; van Oosterhout, Matthijs F M; Thunnissen, Frederik B J M; Volante, Marco; Dingemans, Anne-Marie C; Scheltinga, Marc R M; Bootsma, Gerben P; Pouwels, Harry M M; van den Borne, Ben E E M; Ramaekers, Frans C S; Speel, Ernst-Jan M
2014-10-01
Pulmonary carcinoids are neuroendocrine tumors histopathologically subclassified into typical (TC; no necrosis, <2 mitoses per 2 mm²) and atypical (AC; necrosis or 2 to 10 mitoses per 2 mm²). The reproducibility of lung carcinoid classification, however, has not been extensively studied and may be hampered by the presence of pyknotic apoptosis mimicking mitotic figures. Furthermore, prediction of prognosis based on histopathology varies, especially for ACs. We examined the presence of interobserver variation between 5 experienced pulmonary pathologists who reviewed 123 originally diagnosed pulmonary carcinoid cases. The tumors were subsequently redistributed over 3 groups: unanimously classified cases, consensus cases (4/5 pathologists rendered identical diagnosis), and disagreement cases (divergent diagnosis by ≥2 assessors). κ-values were calculated, and results were correlated with clinical follow-up and molecular data. When focusing on the 114/123 cases unanimously classified as pulmonary carcinoids, the interobserver agreement was only fair (κ=0.32). Of these 114 cases, 55% were unanimously classified, 25% reached consensus classification, and for 19% there was no consensus. ACs were significantly more often in the latter category (P=0.00038). The designation of TCs and ACs by ≥3 assessors was not associated with prognosis (P=0.11). However, when disagreement cases were allocated on the basis of Ki-67 proliferative index (<5%; ≥5%) or nuclear orthopedia homeobox immunostaining (+; -), correlation with prognosis improved significantly (P=0.00040 and 0.0024, respectively). In conclusion, there is a considerable interobserver variation in the histopathologic classification of lung carcinoids, in particular concerning ACs. Additional immunomarkers such as Ki-67 or orthopedia homeobox may improve classification and prediction of prognosis.
Rispoli, Marco; Savastano, Maria Cristina; Lumbroso, Bruno
2015-11-01
To analyze the foveal microvasculature features in eyes with branch retinal vein occlusion (BRVO) using optical coherence tomography angiography based on split spectrum amplitude decorrelation angiography technology. A total of 10 BRVO eyes (mean age 64.2 ± 8.02 years; range 52 to 76 years) were evaluated by optical coherence tomography angiography (XR-Avanti; Optovue). The macular angiography scan protocol covered a 3 mm × 3 mm area. The angiography analysis focused on two retinal layers: the superficial vascular network and the deep vascular network. The following vascular morphological congestion parameters were assessed in the vein occlusion area in both the superficial and deep networks: foveal avascular zone enlargement, capillary non-perfusion occurrence, microvascular abnormalities appearance, and vascular congestion signs. Image analyses were performed by 2 masked observers and interobserver agreement of image analyses was 0.90 (κ = 0.225, P < 0.01). In both the superficial and deep networks of BRVO eyes, a decrease in capillary density with foveal avascular zone enlargement, capillary non-perfusion occurrence, and microvascular abnormalities appearance was observed (P < 0.01). The deep network showed the main vascular congestion at the boundary between healthy and nonperfused retina. Optical coherence tomography angiography in BRVO allows detection of foveal avascular zone enlargement, capillary nonperfusion, microvascular abnormalities, and vascular congestion signs in both the superficial and deep capillary networks in all eyes. Optical coherence tomography angiography technology is a potential clinical tool for BRVO diagnosis and follow-up, providing stratigraphic vascular details that have not been previously observed by standard fluorescein angiography. The normal retinal vascular nets and areas of nonperfusion and congestion can be identified at various retinal levels. 
Optical coherence tomography angiography provides noninvasive images of the retinal capillaries and vascular networks.
Munarriz, Pablo M; Paredes, Igor; Alén, José F; Castaño-Leon, Ana M; Cepeda, Santiago; Hernandez-Lain, Aurelio; Lagares, Alfonso
The use of histological degeneration scores in surgically-treated herniated lumbar discs is not common in clinical practice and its use has been primarily restricted to research. The objective of this study is to evaluate if there is an association between a higher grade of histological degeneration when compared with clinical or radiological parameters. Retrospective consecutive analysis of 122 patients who underwent single-segment lumbar disc herniation surgery. Clinical information was available on all patients, while the histological study and preoperative magnetic resonance imaging were also retrieved for 75 patients. Clinical variables included age, duration of symptoms, neurological deficits, or affected deep tendon reflex. The preoperative magnetic resonance imaging was evaluated using Modic and Pfirrmann scores for the affected segment by 2 independent observers. Histological degeneration was evaluated using Weiler's score; the presence of inflammatory infiltrates and neovascularization, not included in the score, were also studied. Correlation and chi-square tests were used to assess the association between histological variables and clinical or radiological variables. Interobserver agreement was also evaluated for the MRI variables using weighted kappa. No statistically significant correlation was found between histological variables (histological degeneration score, inflammatory infiltrates or neovascularization) and clinical or radiological variables. Interobserver agreement for radiological scores resulted in a kappa of 0.79 for the Pfirrmann scale and 0.65 for the Modic scale, both statistically significant. In our series of patients, we could not demonstrate any correlation between the degree of histological degeneration or the presence of inflammatory infiltrates when compared with radiological degeneration scales or clinical variables such as the patient's age or duration of symptoms. Copyright © 2017 Sociedad Española de Neurocirugía. 
Published by Elsevier España, S.L.U. All rights reserved.
Sainz, José A; Fernández-Palacín, Ana; Borrero, Carlota; Aquise, Adriana; Ramos, Zenaida; García-Mejido, José A
2018-04-01
The aim of this study was to evaluate the inter- and intraobserver correlation of the different intrapartum transperineal ultrasound (ITU) parameters (angle of progression (AoP), progression distance (PD), head direction (HD), midline angle (MLA) and head-perineum distance (HPD)) with contraction and pushing. We evaluated 28 nulliparous women at full dilatation under epidural analgesia. We performed a transperineal ultrasound evaluating AoP and PD in the longitudinal plane, and MLA and HPD in the transverse plane. Intraclass correlation coefficients (ICC) with 95% CIs and Bland-Altman analysis were used to assess intra- and interobserver measurement repeatability. The ICC of the ITU for the same observer was adequate for all the parameters (p < .005): AoP 0.98 (95%CI, 0.96-0.99), PD 0.98 (95%CI, 0.97-0.99), MLA 0.99 (95%CI, 0.97-0.99), HPD 0.96 (95%CI, 0.88-0.99). The interobserver ICC of the ITU was: AoP 0.93 (95%CI, 0.79-0.98), PD 0.92 (95%CI, 0.76-0.97), MLA 0.77 (95%CI, 0.42-0.92), HPD 0.47 (95%CI, -0.12-0.8). The HD had an interobserver correlation of 0.53 (95%CI, 0.1-0.9) (Kappa C). The mean difference was 2.42° for the AoP, 1 mm for the PD and 0.28° for the MLA (Bland-Altman test). ITU has an adequate intra- and interobserver correlation for its use with contraction and pushing under epidural analgesia. Impact statement What is already known on this subject: The intrapartum transperineal ultrasound parameters can be used with contraction and pushing under epidural analgesia. What the results of this study add to what we know: ITU may be used to evaluate the difficulty of instrumentation in operative vaginal deliveries, and this study concludes that ITU is reproducible during uterine contraction with pushing. 
What the implications are of these findings for clinical practice and/or further research: Therefore, ITU could be used without difficulty with an adequate intra- and interobserver correlation for the prediction of instrumentation difficulty in operative vaginal deliveries.
Variable pixel size ionospheric tomography
NASA Astrophysics Data System (ADS)
Zheng, Dunyong; Zheng, Hongwei; Wang, Yanjun; Nie, Wenfeng; Li, Chaokui; Ao, Minsi; Hu, Wusheng; Zhou, Wei
2017-06-01
A novel ionospheric tomography technique based on variable pixel size was developed for the tomographic reconstruction of the ionospheric electron density (IED) distribution. In the variable pixel size computerized ionospheric tomography (VPSCIT) model, the IED distribution is parameterized by a decomposition of the lower and upper ionosphere with different pixel sizes. Thus, the lower and upper IED distributions may be determined very differently by the available data. Variable pixel size ionospheric tomography and constant pixel size tomography are similar in most other aspects. The two kinds of model differ in two respects: first, the segments of each GPS signal path must be assigned to the different kinds of pixel in the inversion; second, the smoothness constraint factor must be modified appropriately where the pixels change in size. For a real dataset, the variable pixel size method distinguishes different electron density distribution zones better than the constant pixel size method. Furthermore, it can be concluded that when effort is spent to identify the regions in a model with the best data coverage, the variable pixel size method can not only greatly improve the efficiency of inversion, but also produce IED images with the same high fidelity as a uniform pixel size method. In addition, variable pixel size tomography can reduce the underdetermined problem in an ill-posed inverse problem when the data coverage is irregular or sparse by adjusting the quantitative proportion of pixels with different sizes. In comparison with constant pixel size tomography models, the variable pixel size ionospheric tomography technique achieved relatively good results in a numerical simulation. A careful validation of the reliability and superiority of variable pixel size ionospheric tomography was performed. 
Finally, according to the results of the statistical analysis and quantitative comparison, the proposed method offers an improvement of 8% compared with conventional constant pixel size tomography models in the forward modeling.
Olmos-Temois, S G; Santos-Martínez, L E; Álvarez-Álvarez, R; Gutiérrez-Delgado, L G; Baranda-Tovar, F M
2016-11-01
To determine the variability of transthoracic echocardiographic parameters that assess right ventricular systolic function by analyzing interobserver agreement in the early postoperative period of cardiovascular surgery. To assess the feasibility of these echocardiographic measurements. A cross-sectional, double-blind pilot study was carried out from May 2011 to February 2013. Cardiovascular postoperative critical care at the National Institute of Cardiology "Ignacio Chávez", Mexico City, Mexico. Consecutive, non-probabilistic sampling. Fifty-six patients were studied in the postoperative period of cardiac surgery. The first echocardiographic parameters were obtained 6-8 hours after cardiac surgery, followed by blinded second measurements. Tricuspid annular plane systolic excursion (TAPSE), tricuspid annular peak systolic velocity on tissue Doppler imaging (VSPAT), right ventricular outflow tract diameters, area, and fractional shortening. The agreement was analyzed by the Bland-Altman method, and its magnitude was assessed by the intraclass correlation coefficient (95% confidence interval). Both observers evaluated TAPSE and VSPAT in 48 patients (92%). The average TAPSE was 11.68 ± 4.53 mm (range 4-27 mm). Right ventricular systolic dysfunction was observed in 41 cases (85%) and normal TAPSE in 7 patients (15%). The average difference and its limits of agreement for TAPSE were -0.917 ± 2.95 (-6.821, 4.988), with a magnitude of 0.725 (0.552, 0.837); for the tricuspid annular peak systolic velocity on tissue Doppler imaging they were -0.001 ± 0.015 (-0.031, 0.030), with a magnitude of 0.825 (0.708, 0.898). VSPAT and TAPSE were estimated by both observers in 92% of the patients, and these parameters exhibited the lowest interobserver variability. Copyright © 2016 Elsevier España, S.L.U. y SEMICYUC. All rights reserved.
Shah, Rajal B; Leandro, Gioacchino; Romerocaces, Gloria; Bentley, James; Yoon, Jiyoon; Mendrinos, Savvas; Tadros, Yousef; Tian, Wei; Lash, Richard
2016-10-01
One of the major goals of an anatomic pathology laboratory quality program is to minimize unwarranted diagnostic variability and equivocal reporting. This study evaluated the utility of Miraca Life Sciences' "Disease-Focused Diagnostic Review" (DFDR) quality program in improving interobserver diagnostic reproducibility associated with classification of "atypical glands suspicious for adenocarcinoma" (ATYP) in prostate biopsies. Seventy-one selected prostate biopsies with a focus of ATYP were reviewed by 8 pathologists. Participants were blinded to the original diagnosis and were first asked to classify the ATYP as benign, atypical, or limited adenocarcinoma. DFDR comprised a "theoretical consensus" (in which pathologists first reached consensus on the morphological features they considered relevant for the diagnosis of limited prostatic adenocarcinoma), a didactic review including relevant literature, and a "practical consensus" (pathologists performed joint microscopic sessions, reconciling each other's observations and positions while evaluating a separate unique slide set). Participants were finally asked to reclassify the original 71 ATYP cases based on knowledge gleaned from DFDR. Pre- and post-DFDR interobserver reproducibility of overall diagnostic agreement was assessed. Interobserver reproducibility measured by Fleiss κ values pre- and post-DFDR was 0.36 and 0.59, respectively (P=.006). Post-DFDR, there was significant improvement for the "100% concordance" category (P=.011) and a significant reduction for the "no consensus" category (P=.0004). Despite a lower pre-DFDR reproducibility for non-uropathology fellowship-trained (n=3, κ=0.38) versus uropathology fellowship-trained (n=5, κ=0.43) pathologists, both groups achieved similarly high post-DFDR κ levels (κ=0.58 and 0.56, respectively). DFDR represents an effective tool to formally achieve diagnostic consensus and reduce variability associated with critical diagnoses in an anatomic pathology practice. Copyright © 2016 Elsevier Inc. 
All rights reserved.
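Fleiss κ, the statistic reported in the record above, generalizes chance-corrected agreement to more than two raters. The following is a minimal sketch under stated assumptions: the function name `fleiss_kappa`, the input layout (one row per case, one column per category), and the example counts are hypothetical and are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters assigning case i to category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])
    # Share of all assignments that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_cats)
    ]
    # Per-case agreement: proportion of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_cases          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        return 1.0  # all ratings in a single category
    return (p_bar - p_e) / (1 - p_e)


# ASSUMPTION: made-up data, 4 cases rated by 3 raters into
# (benign, atypical, limited adenocarcinoma).
ratings = [[3, 0, 0], [0, 3, 0], [0, 0, 3], [1, 1, 1]]
k = fleiss_kappa(ratings)
print(f"Fleiss kappa = {k:.3f}")
```

In this toy table three cases are unanimous and one is split three ways, giving a mean observed agreement of 0.75 against a chance agreement of 1/3.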
Ajtony, Csilla; Elkarmouty, Ahmed; Barton, Keith; Kotecha, Aachal
2016-06-01
To evaluate the levels of agreement between the standard reusable prism and a disposable prism, and to examine the agreement between ophthalmologists, nursing and technical staff when measuring intraocular pressure (IOP) using the Goldmann applanation tonometer. Three hundred eyes of 300 patients were recruited. IOP measurements were made in a randomised order by three observer groups consisting of ophthalmologists and ophthalmic technicians/nurses taken from a pool of clinicians working within a busy outpatient clinic. Agreement was calculated by Bland-Altman analysis, showing the mean difference and 95% limits of agreement (LoA) of measurements. The mean difference between the reusable and disposable prism IOP measurements was <0.5 mm Hg. The LoA ranged from ±3.1 to ±4.9 mm Hg, depending on the observer group. The interobserver variability was <1 mm Hg across all observer groups; the LoA was slightly higher for observers using the reusable prism (range between ±4.3 and ±5.6 mm Hg) compared with using the disposable prism (range between ±3.7 and ±5.4 mm Hg) across observer groups. There is an acceptable agreement between IOP measurements made with the reusable Goldmann tonometer prism and the disposable Tonosafe prism. Interobserver variability in IOP measurements within an outpatient setting is larger than that found within a research setting, and may be of a level that impacts on clinical decision-making. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
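The Bland-Altman analysis described above summarizes paired measurements by their mean difference (bias) and 95% limits of agreement. A minimal sketch follows, assuming the common convention of bias ± 1.96 × SD of the differences; the function name `bland_altman` and the IOP values are hypothetical, not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    # Sample SD of the differences; 1.96 assumes roughly normal differences.
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# ASSUMPTION: made-up IOP readings in mm Hg for five eyes.
reusable = [14, 16, 18, 20, 22]
disposable = [15, 15, 19, 19, 22]
bias, lower, upper = bland_altman(reusable, disposable)
print(f"bias = {bias:.2f} mm Hg, LoA = ({lower:.2f}, {upper:.2f}) mm Hg")
```

With these values the differences are -1, 1, -1, 1 and 0, so the bias is 0 and the limits of agreement are ±1.96 mm Hg.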
Erdoğan, Zeynep; Abdülrezzak, Ümmühan; Silov, Güler; Özdal, Ayşegül; Turhal, Özgül
2014-01-01
Objective: The aim of this study was to investigate the variability in the interpretation of parenchymal abnormalities and to assess the differences in interpretation of routine renal scintigraphic findings on the posterior view of technetium-99m dimercaptosuccinic acid (pvDMSA) scans and the parenchymal phase of technetium-99m mercaptoacetyltriglycine (ppMAG3) scans, using standard criteria to allow standardization, semiquantitative evaluation and more accurate correlation. Materials and Methods: Two experienced nuclear medicine physicians independently and retrospectively interpreted pvDMSA scans of 204 and ppMAG3 scans of 102 pediatric patients. Comparisons were made by visual inspection of pvDMSA scans and ppMAG3 scans by using a grading system modified from Itoh et al. According to this system, anatomical damage of the renal parenchyma was classified into six types: Grade 0-V. In the calculation of the agreement rates, Kendall correlation (tau-b) analysis was used. Results: According to our findings, excellent agreement was found for DMSA grade readings (DMSA-GR) (tau-b = 0.827) and good agreement for MAG3 grade readings (MAG3-GR) (tau-b = 0.790) between the two observers. Most clear parenchymal lesions detected on pvDMSA scans and ppMAG3 scans were identified equally by both observers. Studies with negative or minimal lesions reduced the degree of correlation for both DMSA-GR and MAG3-GR. Conclusion: Our grading system can be used for standardization of the reports. We conclude that standardization of criteria and terminology in the interpretations may result in higher interobserver consistency and may also improve the low interobserver reproducibility and objectivity of renal scintigraphy reports. PMID:24761059
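Kendall's tau-b, used above for the ordinal Grade 0-V readings, counts concordant and discordant pairs and corrects for ties. A minimal quadratic-time sketch follows; the function name `kendall_tau_b` and the grade lists are hypothetical, and the authors' actual software is not stated in the abstract.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences of ordinal scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                  # tied on both: excluded from both terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when one sequence is constant")
    return (concordant - discordant) / denom


# ASSUMPTION: made-up grades (0-5) from two observers for six kidneys.
observer_1 = [0, 1, 2, 3, 4, 5]
observer_2 = [0, 1, 2, 3, 5, 4]
tau = kendall_tau_b(observer_1, observer_2)
print(f"tau-b = {tau:.3f}")
```

Only one of the 15 pairs is discordant here, so tau-b is 13/15. For real data a library routine such as `scipy.stats.kendalltau` would be the more usual choice.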
Office-Based Point of Care Testing (IgA/IgG-Deamidated Gliadin Peptide) for Celiac Disease.
Lau, Michelle S; Mooney, Peter D; White, William L; Rees, Michael A; Wong, Simon H; Hadjivassiliou, Marios; Green, Peter H R; Lebwohl, Benjamin; Sanders, David S
2018-06-19
Celiac disease (CD) is common yet under-detected. A point of care test (POCT) may improve CD detection. We aimed to assess the diagnostic performance of an IgA/IgG-deamidated gliadin peptide (DGP)-based POCT for CD detection, patient acceptability, and inter-observer variability of the POCT results. From 2013-2017, we prospectively recruited patients referred to secondary care with gastrointestinal symptoms, anemia and/or weight loss (group 1); and patients with self-reported gluten sensitivity with unknown CD status (group 2). All patients had concurrent POCT, IgA-tissue transglutaminase (IgA-TTG), IgA-endomysial antibodies (IgA-EMA), total IgA levels, and duodenal biopsies. Five hundred patients completed acceptability questionnaires, and inter-observer variability of the POCT results was compared among five clinical staff for 400 cases. Group 1: 1000 patients, 58.5% female, age 16-91, median age 57. Forty-one patients (4.1%) were diagnosed with CD. The sensitivities of the POCT, IgA-TTG, and IgA-EMA were 82.9, 78.1, and 70.7%; the specificities were 85.4, 96.3, and 99.8%. Group 2: 61 patients, 83% female; age 17-73, median age 35. The POCT had 100% sensitivity and negative predictive value in detecting CD in group 2. Most patients preferred the POCT to venepuncture (90.4% vs. 2.8%). There was good inter-observer agreement on the POCT results with a Fleiss Kappa coefficient of 0.895. The POCT had comparable sensitivities to serology, and correctly identified all CD cases in a gluten sensitive cohort. However, its low specificity may increase unnecessary investigations. Despite its advantage of convenience and rapid results, it may not add significant value to case finding in an office-based setting.
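The sensitivity and specificity reported for group 1 can be related to predictive values with a short calculation. The sketch below is illustrative only: the 2×2 counts are back-calculated from the rounded percentages in the abstract (41 CD cases among 1000 patients) and are not reported by the authors, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# ASSUMPTION: counts reconstructed from the rounded figures in the abstract
# (41 CD, 959 non-CD; POCT sensitivity 82.9%, specificity 85.4%).
poct = diagnostic_metrics(tp=34, fp=140, fn=7, tn=819)
for name, value in poct.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, roughly one in five positive POCT results would correspond to confirmed CD, which is consistent with the authors' concern that the low specificity may increase unnecessary investigations.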
Inter-observer variability in fetal biometric measurements.
Kilani, Rami; Aleyadeh, Wesam; Atieleh, Luay Abu; Al Suleimat, Abdul Mane; Khadra, Maysa; Hawamdeh, Hassan M
2018-02-01
To evaluate inter-observer variability and reproducibility of ultrasound measurements for fetal biometric parameters. A prospective cohort study was conducted in two tertiary care hospitals in Amman, Jordan: Prince Hamza Hospital and Albashir Hospital. The participants were 192 women with a singleton pregnancy at a gestational age of 18-36 weeks. Transabdominal scans for fetal biometric parameter measurement were performed on study participants from November 2014 to March 2015. Women who agreed to participate in the study underwent two ultrasound scans for head circumference, abdominal circumference and femur length. The correlation coefficient was calculated. Bland-Altman plots were used to analyze the degree of measurement agreement between observers. Limits of agreement ± 2 SD for the differences in fetal biometry measurements in proportions of the mean of the measurements were derived. The main outcome measure was the reproducibility of fetal biometric measurements by different observers. High inter-observer intra-class correlation coefficient (ICC) was found for femur length (0.990) and abdominal circumference (0.996) where Bland-Altman plots showed high degrees of agreement. The highest degrees of agreement were noted in the measurement of abdominal circumference followed by head circumference. The lowest degree of agreement was found for femur length measurement. We used a paired-sample t-test and found that the mean difference between duplicate measurements was not significant (P > 0.05). Biometric fetal parameter measurements may be reproducible by different operators in the clinical setting with similar results. Fetal head circumference, abdominal circumference and femur length were highly reproducible. Large organized studies are needed to ensure accurate fetal measurements due to the important clinical implications of inaccurate measurements. Copyright © 2018. Published by Elsevier B.V.
Superimposition of 3-dimensional cone-beam computed tomography models of growing patients
Cevidanes, Lucia H. C.; Heymann, Gavin; Cornelis, Marie A.; DeClerck, Hugo J.; Tulloch, J. F. Camilla
2009-01-01
Introduction The objective of this study was to evaluate a new method for superimposition of 3-dimensional (3D) models of growing subjects. Methods Cone-beam computed tomography scans were taken before and after Class III malocclusion orthopedic treatment with miniplates. Three observers independently constructed 18 3D virtual surface models from cone-beam computed tomography scans of 3 patients. Separate 3D models were constructed for soft-tissue, cranial base, maxillary, and mandibular surfaces. The anterior cranial fossa was used to register the 3D models of before and after treatment (about 1 year of follow-up). Results Three-dimensional overlays of superimposed models and 3D color-coded displacement maps allowed visual and quantitative assessment of growth and treatment changes. The range of interobserver errors for each anatomic region was 0.4 mm for the zygomatic process of maxilla, chin, condyles, posterior border of the rami, and lower border of the mandible, and 0.5 mm for the anterior maxilla soft-tissue upper lip. Conclusions Our results suggest that this method is a valid and reproducible assessment of treatment outcomes for growing subjects. This technique can be used to identify maxillary and mandibular positional changes and bone remodeling relative to the anterior cranial fossa. PMID:19577154
Choroidal thickness measurement in children using optical coherence tomography.
Bidaut-Garnier, Mélanie; Schwartz, Claire; Puyraveau, Marc; Montard, Michel; Delbosc, Bernard; Saleh, Maher
2014-04-01
To measure choroidal thickness (CT) in children of various ages by using spectral optical coherence tomography with enhanced depth imaging and to investigate the association between subfoveal CT and ocular axial length, age, gender, weight, and height in children. Healthy children were prospectively included between May and August 2012. Optical coherence tomography with the enhanced depth imaging system (Spectralis, Heidelberg, Germany) was used for choroidal imaging at nine defined points of the macula of both eyes. Axial length was measured using IOLMaster (Carl Zeiss Meditec, Dublin, CA). Height, weight, and refraction were recorded. Interobserver agreement in readings was also assessed by the Bland-Altman Method. Three hundred and forty-eight eyes from 174 children aged 3.5 years to 14.9 years were imaged. The mean subfoveal CT in right eyes was 341.96 ± 74.7 µm. Choroidal thickness increased with age (r = 0.24, P = 0.017), height, and weight but not with gender (P > 0.05). It was also inversely correlated to the axial length (r = 0.24, P = 0.001). The nasal choroid appeared thinner than in the temporal area (analysis of variance, P < 0.0001). In children, CT increases with age and is inversely correlated to axial length. There is a significant variation of CT between children of the same age.
Padayachy, Llewellyn C; Padayachy, Vaishali; Galal, Ushma; Gray, Rebecca; Fieggen, A Graham
2016-10-01
The aim of this study was to investigate the relationship between optic nerve sheath diameter (ONSD) measurement and invasively measured intracranial pressure (ICP) in children. ONSD measurement was performed prior to invasive measurement of ICP. The mean binocular ONSD measurement was compared to the ICP reading. Physiological variables including systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MAP), pulse rate, temperature, respiratory rate and end tidal carbon dioxide (ETCO2) level were recorded at the time of ONSD measurement. Diagnostic accuracy analysis was performed at various ICP thresholds, and repeatability, intra- and inter-observer variability, correlation between measurements in different imaging planes as well as the relationship over the entire patient cohort were examined in part I of this study. Data from 174 patients were analysed. Repeatability and intra-observer variability were excellent (α = 0.97-0.99). Testing for inter-observer variability revealed good correlation (r = 0.89, p < 0.001). Imaging in the sagittal plane demonstrated a slightly better correlation with ICP (r = 0.66, p < 0.001). The ONSD measurement with the best diagnostic accuracy for detecting an ICP ≥ 20 mmHg over the entire patient cohort was 5.5 mm, with a sensitivity of 93.2 %, a specificity of 74 % and an odds ratio (OR) of 39.3. Transorbital ultrasound measurement of the ONSD is a reliable and reproducible technique, demonstrating a good relationship with ICP and high diagnostic accuracy for detecting raised ICP.
Pulerwitz, Todd C.; Khalique, Omar K.; Nazif, Tamim N.; Rozenshtein, Anna; Pearson, Gregory D.N.; Hahn, Rebecca T.; Vahl, Torsten P.; Kodali, Susheel K.; George, Isaac; Leon, Martin B.; D'Souza, Belinda; Po, Ming Jack; Einstein, Andrew J.
2016-01-01
Background Transcatheter aortic valve replacement (TAVR) is a lifesaving procedure for many patients high risk for surgical aortic valve replacement. The prevalence of chronic kidney disease (CKD) is high in this population, and thus a very low contrast volume (VLCV) computed tomography angiography (CTA) protocol providing comprehensive cardiac and vascular imaging would be valuable. Methods 52 patients with severe, symptomatic aortic valve disease, undergoing pre-TAVR CTA assessment from 2013-4 at Columbia University Medical Center were studied, including all 26 patients with CKD (eGFR<30mL/min) who underwent a novel VLCV protocol (20mL of iohexol at 2.5mL/s), and 26 standard-contrast-volume (SCV) protocol patients. Using a 320-slice volumetric scanner, the protocol included ECG-gated volume scanning of the aortic root followed by medium-pitch helical vascular scanning through the femoral arteries. Two experienced cardiologists performed aortic annulus and root measurements. Vascular image quality was assessed by two radiologists using a 4-point scale. Results VLCV patients had mean(±SD) age 86±6.5, BMI 23.9±3.4 kg/m2 with 54% men; SCV patients age 83±8.8, BMI 28.7±5.3 kg/m2, 65% men. There was excellent intra- and inter-observer agreement for annular and root measurements, and excellent agreement with 3D-transesophageal echocardiographic measurements. Both radiologists found diagnostic-quality vascular imaging in 96% of VLCV and 100% of SCV cases, with excellent inter-observer agreement. Conclusions This study is the first of its kind to report the feasibility and reproducibility of measurements for a VLCV protocol for comprehensive pre-TAVR CTA. There was excellent agreement of cardiac measurements and almost all studies were diagnostic quality for vascular access assessment. PMID:27061253
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hinrichs, Jan B., E-mail: hinrichs.jan@mh-hannover.de; Marquardt, Steffen, E-mail: marquardt.steffen@mh-hannover.de; Falck, Christian von, E-mail: falck.christian.von@mh-hannover.de
Purpose: To assess the feasibility and diagnostic performance of contrast-enhanced, C-arm computed tomography (CACT) of the pulmonary arteries compared to digital subtraction angiography (DSA) in patients suffering from chronic thromboembolic pulmonary hypertension (CTEPH). Materials: Fifty-two patients with CTEPH underwent ECG-gated DSA and contrast-enhanced CACT. Two readers (R1, R2) independently evaluated pulmonary artery segments and their sub-segmental branching using DSA and CACT for optimal image quality. Afterwards, the diagnostic findings, i.e., intraluminal filling defects, stenosis, and occlusion, were compared. Inter-modality and inter-observer agreement was calculated, and subsequently consensus reading was done and correlated to a reference standard representing the overall consensus of both modalities. Fisher’s exact test and Cohen’s Kappa were applied. Results: A total of 1352 pulmonary segments were evaluated, of which 1255 (92.8 %) on DSA and 1256 (92.9 %) on CACT were rated to be fully diagnostic. The main causes of the non-diagnostic image quality were motion artifacts on CACT (R1:37, R2:78) and insufficient contrast enhancement on DSA (R1:59, R2:38). Inter-observer agreement was good for DSA (κ = 0.74) and CACT (κ = 0.75), while inter-modality agreement was moderate (R1: κ = 0.46, R2: κ = 0.47). Compared to the reference standard, the inter-modality agreement for CACT was excellent (κ = 0.96), whereas it was inferior for DSA (κ = 0.61) due to the higher number of abnormal consensus findings read as normal on DSA. Conclusion: CACT of the pulmonary arteries is feasible and provides additional information to DSA. CACT has the potential to improve the diagnostic work-up of patients with CTEPH and may be particularly useful prior to surgical or interventional treatment.
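Cohen's kappa, used above for both inter-observer and inter-modality agreement, corrects the observed proportion of agreement for the agreement expected by chance. A minimal sketch follows, assuming two readers who each assign one categorical label per segment; the function name and the toy ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(reader1, reader2):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(reader1) != len(reader2) or not reader1:
        raise ValueError("both readers must rate the same non-empty item list")
    n = len(reader1)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(reader1, reader2)) / n
    # Chance agreement: product of the readers' marginal label frequencies.
    c1, c2 = Counter(reader1), Counter(reader2)
    expected = sum(c1[label] * c2[label] for label in set(c1) | set(c2)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Toy ratings (1 = abnormal, 0 = normal); not taken from the study.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

With these toy ratings the observed agreement is 0.75 and the chance agreement is 0.5, so kappa comes out at 0.5.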
Marion, Kenneth M.; Maram, Jyotsna; Pan, Xiaojing; Dastiridou, Anna; Zhang, ZhouYuan; Ho, Alex; Francis, Brian A.; Sadda, Srinivas R.
2015-01-01
Purpose: To compare anterior chamber angle parameters based on the location of Schwalbe line (SL) from 2 spectral domain optical coherence tomography (SD-OCT) instruments and to measure their reproducibility. Methods: Forty-two eyes from 21 normal, healthy participants underwent imaging of the inferior irido-corneal angle with the Spectralis and Cirrus SD-OCT under tightly controlled low-light conditions. SL-angle opening distance (SL-AOD) and SL-trabecular iris space area (SL-TISA) were measured by masked, certified graders at the Doheny Imaging Reading Center using customized grading software. Interinstrument and intrainstrument, as well as interobserver and intraobserver reproducibility of SL-AOD and SL-TISA measurements were evaluated by intraclass correlation coefficients (ICCs) and Bland-Altman plots with limits of agreement (LoA). Results: The mean SL-AOD was 0.662±0.191 mm in Spectralis and 0.677±0.213 mm in Cirrus. The mean SL-TISA was 0.250±0.073 mm2 in Spectralis and 0.256±0.082 mm2 in Cirrus. The agreement for intrainstrument (ICCs>0.979), intragrader (ICCs>0.992), and intergrader (ICCs>0.929) was excellent. Excellent agreement between the 2 devices was also documented with a mean difference of −0.016 (LoA −0.125 to 0.092) mm for SL-AOD and −0.007 (LoA −0.056 to 0.043) mm2 in SL-TISA. Conclusions: Both SD-OCTs provided comparable measurements and permitted calculation of SL-based angle metrics. There was excellent interinstrument and intrainstrument and intraobserver and interobserver reproducibility for Spectralis and Cirrus SD-OCTs, suggesting true interchangeability between SD-OCT devices. This has the potential to lead to development of standardized grading assessments and quantification of angle parameters that would be valid across various SD-OCT devices. PMID:26200742
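The Bland-Altman limits of agreement reported above are conventionally the mean paired difference plus or minus 1.96 standard deviations of the differences. The sketch below assumes that convention and the sample standard deviation; the abstract does not state which variant was used, and the function name and inputs are illustrative.

```python
import statistics


def bland_altman(device_a, device_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    if len(device_a) != len(device_b) or len(device_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = statistics.mean(diffs)       # mean difference between devices
    sd = statistics.stdev(diffs)        # sample SD of the differences
    half_width = 1.96 * sd              # assumed 95% limits of agreement
    return bias, bias - half_width, bias + half_width
```

As a consistency check on the reported figures, the midpoint of the SL-AOD limits (−0.125 and 0.092 mm) is about −0.016 mm, which matches the stated mean difference.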
Feasibility of four-dimensional preoperative simulation for elbow debridement arthroplasty.
Yamamoto, Michiro; Murakami, Yukimi; Iwatsuki, Katsuyuki; Kurimoto, Shigeru; Hirata, Hitoshi
2016-04-02
Recent advances in imaging modalities have enabled three-dimensional preoperative simulation. A four-dimensional preoperative simulation system would be useful for debridement arthroplasty of primary degenerative elbow osteoarthritis because it would be able to detect the impingement lesions. We developed a four-dimensional simulation system by adding the anatomical axis to the three-dimensional computed tomography scan data of the affected arm in one position. Eleven patients with primary degenerative elbow osteoarthritis were included. A "two rings" method was used to calculate the flexion-extension axis of the elbow by converting the surface of the trochlea and capitellum into two rings. A four-dimensional simulation movie was created and showed the optimal range of motion and the impingement area requiring excision. To evaluate the reliability of the flexion-extension axis, interobserver and intraobserver reliabilities regarding the assessment of bony overlap volumes were calculated twice for each patient by two authors. Patients were treated by open or arthroscopic debridement arthroplasties. Pre- and postoperative examinations included elbow range of motion measurement, and completion of the patient-rated questionnaire Hand20, Japanese Orthopaedic Association-Japan Elbow Society Elbow Function Score, and the Mayo Elbow Performance Score. Measurement of the bony overlap volume showed an intraobserver intraclass correlation coefficient of 0.93 and 0.90, and an interobserver intraclass correlation coefficient of 0.94. The mean elbow flexion-extension arc significantly improved from 101° to 125°. The mean Hand20 score significantly improved from 52 to 22. The mean Japanese Orthopaedic Association-Japan Elbow Society Elbow Function Score significantly improved from 67 to 88. The mean Mayo Elbow Performance Score significantly improved from 71 to 91 at the final follow-up evaluation. 
We showed that four-dimensional, preoperative simulation can be generated by adding the rotation axis to the one-position, three-dimensional computed tomography image of the affected arm. This method is feasible for elbow debridement arthroplasty.
Dewes, Patricia; Frellesen, Claudia; Scholtz, Jan-Erik; Fischer, Sebastian; Vogl, Thomas J; Bauer, Ralf W; Schulz, Boris
2016-06-01
To evaluate a novel tin filter-based abdominal CT protocol for urolithiasis in terms of image quality and CT dose parameters. 130 consecutive patients with suspected urolithiasis underwent non-enhanced CT with three different protocols: 48 patients (group 1) were examined at tin-filtered 150kV (150kV Sn) on a third-generation dual-source-CT, 33 patients were examined with automated kV-selection (110-140kV) based on the scout view on the same CT-device (group 2), and 49 patients were examined on a second-generation dual-source-CT (group 3) with automated kV-selection (100-140kV). Automated exposure control was active in all groups. Image quality was subjectively evaluated on a 5-point Likert scale by two radiologists and interobserver agreement as well as signal-to-noise-ratio (SNR) was calculated. Dose-length-product (DLP) and volume CT dose index (CTDIvol) were compared. Image quality was rated in favour of the tin filter protocol with excellent interobserver agreement (ICC=0.86-0.91) and the difference reached statistical significance (p<0.001). SNR was significantly higher in group 1 and 2 compared to second-generation DSCT (p<0.001). On third-generation dual-source CT, there was no significant difference in SNR between the 150kV Sn and the automated kV selection protocol (p=0.5). The DLP of group 1 was 23% and 21% (p<0.002) lower in comparison to group 2 and 3, respectively. So was the CTDIvol of group 1 compared to group 2 (-36%) and 3 (-32%) (p<0.001). Additional shaping of a 150kV source spectrum by a tin filter substantially lowers patient exposure while improving image quality on un-enhanced abdominal computed tomography for urinary stone disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Garcia de Leon Valenzuela, Maria Julia
This project explores the reliability of building a biological profile for an unknown individual based on three-dimensional (3D) images of the individual's skeleton. 3D imaging technology has been widely researched for medical and engineering applications, and it is increasingly being used as a tool for anthropological inquiry. While the question of whether a biological profile can be derived from 3D images of a skeleton with the same accuracy as achieved when using dry bones has been explored, bigger sample sizes, a standardized scanning protocol and more interobserver error data are needed before 3D methods can become widely and confidently used in forensic anthropology. 3D images of Computed Tomography (CT) scans were obtained from 130 innominate bones from Boston University's skeletal collection (School of Medicine). For each bone, both 3D images and original bones were assessed using the Phenice and Suchey-Brooks methods. Statistical analysis was used to determine the agreement between 3D image assessment versus traditional assessment. A pool of six individuals with varying experience in the field of forensic anthropology scored a subsample (n = 20) to explore interobserver error. While a high agreement was found for age and sex estimation for specimens scored by the author, the interobserver study shows that observers found it difficult to apply standard methods to 3D images. Higher levels of experience did not result in higher agreement between observers, as would be expected. Thus, a need for training in 3D visualization before applying anthropological methods to 3D bones is suggested. Future research should explore interobserver error using a larger sample size in order to test the hypothesis that training in 3D visualization will result in a higher agreement between scores. The need for the development of a standard scanning protocol focusing on the optimization of 3D image resolution is highlighted. 
Applications for this research include the possibility of digitizing skeletal collections in order to expand their use and for deriving skeletal collections from living populations and creating population-specific standards. Further research for the development of a standard scanning and processing protocol is needed before 3D methods in forensic anthropology are considered as reliable tools for generating biological profiles.
Terslev, Lene; Naredo, Esperanza; Aegerter, Philippe; Wakefield, Richard J; Backhaus, Marina; Balint, Peter; Bruyn, George A W; Iagnocco, Annamaria; Jousse-Joulin, Sandrine; Schmidt, Wolfgang A; Szkudlarek, Marcin; Conaghan, Philip G; Filippucci, Emilio
2017-01-01
Objectives To test the reliability of new ultrasound (US) definitions and quantification of synovial hypertrophy (SH) and power Doppler (PD) signal, separately and in combination, in a range of joints in patients with rheumatoid arthritis (RA) using the European League Against Rheumatism–Outcome Measures in Rheumatology (EULAR-OMERACT) combined score for PD and SH. Methods A stepwise approach was used: (1) scoring static images of metacarpophalangeal (MCP) joints in a web-based exercise and subsequently when scanning patients; (2) scoring static images of wrist, proximal interphalangeal joints, knee and metatarsophalangeal joints in a web-based exercise and subsequently when scanning patients using different acquisitions (standardised vs usual practice). For reliability, kappa coefficients (κ) were used. Results Scoring MCP joints in static images showed substantial intraobserver variability but good to excellent interobserver reliability. In patients, intraobserver reliability was the same for the two acquisition methods. Interobserver reliability for SH (κ=0.87) and PD (κ=0.79) and the EULAR-OMERACT combined score (κ=0.86) were better when using a ‘standardised’ scan. For the other joints, the intraobserver reliability was excellent in static images for all scores (κ=0.8–0.97) and the interobserver reliability marginally lower. When using standardised scanning in patients, the intraobserver reliability was good (κ=0.64 for SH and the EULAR-OMERACT combined score, 0.66 for PD) and the interobserver reliability was also good especially for PD (κ range=0.41–0.92). Conclusion The EULAR-OMERACT score demonstrated moderate-good reliability in MCP joints using a standardised scan and is equally applicable in non-MCP joints. This scoring system should underpin improved reliability and consequently the responsiveness of US in RA clinical trials. PMID:28948984
An International Ki67 Reproducibility Study in Adrenal Cortical Carcinoma.
Papathomas, Thomas G; Pucci, Eugenio; Giordano, Thomas J; Lu, Hao; Duregon, Eleonora; Volante, Marco; Papotti, Mauro; Lloyd, Ricardo V; Tischler, Arthur S; van Nederveen, Francien H; Nose, Vania; Erickson, Lori; Mete, Ozgur; Asa, Sylvia L; Turchini, John; Gill, Anthony J; Matias-Guiu, Xavier; Skordilis, Kassiani; Stephenson, Timothy J; Tissier, Frédérique; Feelders, Richard A; Smid, Marcel; Nigg, Alex; Korpershoek, Esther; van der Spek, Peter J; Dinjens, Winand N M; Stubbs, Andrew P; de Krijger, Ronald R
2016-04-01
Despite the established role of Ki67 labeling index in prognostic stratification of adrenocortical carcinomas and its recent integration into treatment flow charts, the reproducibility of the assessment method has not been determined. The aim of this study was to investigate interobserver variability among endocrine pathologists using a web-based virtual microscopy approach. Ki67-stained slides of 76 adrenocortical carcinomas were analyzed independently by 14 observers, each according to their method of preference including eyeballing, formal manual counting, and digital image analysis. The interobserver variation was statistically significant (P<0.001) in the absence of any correlation between the various methods. Subsequently, 61 static images were distributed among 15 observers who were instructed to follow a category-based scoring approach. Low levels of interobserver (F=6.99; Fcrit=1.70; P<0.001) as well as intraobserver concordance (n=11; Cohen κ ranging from -0.057 to 0.361) were detected. To improve harmonization of Ki67 analysis, we tested the utility of an open-source Galaxy virtual machine application, namely Automated Selection of Hotspots, in 61 virtual slides. The software-provided Ki67 values were validated by digital image analysis in identical images, displaying a strong correlation of 0.96 (P<0.0001) and dividing the cases into 3 classes (cutoffs of 0%-15%-30% and/or 0%-10%-20%) with significantly different overall survivals (P<0.05). We conclude that current practices in Ki67 scoring assessment vary greatly, and interobserver variation sets particular limitations to its clinical utility, especially around clinically relevant cutoff values. Novel digital microscopy-enabled methods could provide critical aid in reducing variation, increasing reproducibility, and improving reliability in the clinical setting.
Assessment of interobserver concordance in polysomnography scoring of sleep bruxism☆
Ferraz, Otávio; de Moura Guimarães, Thais; Maluly Filho, Milton; Dal-Fabbro, Cibele; Abraão Crosara Cunha, Thays; Cristina Lotaif, Ana; Cristina Barros Schütz, Teresa; Santos-Silva, Rogério; Tufik, Sergio; Bittencourt, Lia
2015-01-01
Introduction Objective evaluation of sleep bruxism (SB) using whole-night polysomnography (PSG) is relevant for diagnostic confirmation. Nevertheless, the PSG electromyogram (EMG) scoring may give rise to controversy, particularly when audiovisual monitoring is not performed. Therefore, the present study assessed the concordance between two independent scorers in the visual scoring of SB on PSG performed without audiovisual monitoring. Methods Fifty-six PSG tests were scored from individuals with a clinical history and polysomnography criteria of SB. In addition to the protocol of conventional whole-night PSG, electrodes were also placed bilaterally on the masseter and temporal muscles. Visual EMG scoring without audiovisual monitoring was performed by two independent scorers (Dentist 1 and Dentist 2) according to the recommendations formulated in the AASM manual (2007). Kendall Tau correlation was used to assess interobserver concordance relative to the variables “total duration of events (seconds)”, “shortest events”, “longest events” and index in each phasic, tonic or mixed event. Results The correlation was positive and significant relative to all the investigated variables, being T>0.54. Conclusion A good inter-examiner concordance rate was found in SB scoring in the absence of audiovisual monitoring. PMID:26779318
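Kendall's tau, used above to compare the two scorers, counts how many pairs of recordings are ranked in the same order by both scorers versus the opposite order. The abstract does not say which variant was applied; the sketch below implements the simplest one (tau-a, where tied pairs count as neither concordant nor discordant), with a hypothetical function name.

```python
from itertools import combinations


def kendall_tau_a(scorer1, scorer2):
    """Kendall's tau-a between two equally long lists of scores."""
    n = len(scorer1)
    if n != len(scorer2) or n < 2:
        raise ValueError("need two equally long lists with at least two items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(scorer1, scorer2), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both scorers order the pair the same way
        elif sign < 0:
            discordant += 1      # scorers order the pair in opposite ways
    return (concordant - discordant) / (n * (n - 1) / 2)
```

For data with many tied scores, a tie-corrected variant such as tau-b is usually preferred; that choice is left open here.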
Interobserver variability in recognizing arousal in respiratory sleep disorders.
Drinnan, M J; Murray, A; Griffiths, C J; Gibson, G J
1998-08-01
Daytime sleepiness is a common consequence of repeated arousal in obstructive sleep apnea (OSA). Arousal indices are sometimes used to make decisions on treatment, but there is no evidence that arousals are detected similarly even by experienced observers. Using the American Sleep Disorders Association (ASDA) definition of arousal in terms of the accompanying electroencephalogram (EEG) changes, we have quantified interobserver agreement for arousal scoring and identified factors affecting it. Ten patients with suspected OSA were studied; three representative EEG events during each of light, slow-wave, and rapid-eye-movement (REM) sleep were extracted from each record (90 events total) and evaluated by experts in 14 sleep laboratories. Observers differed (ANOVA, p < 0.001) in the number of events scored as arousal (totals ranged from 23 to 53 of the 90 events). Overall agreement was moderate (kappa = 0.47), but it was best for events during slow-wave sleep, moderate for REM, and poor for light sleep (kappa = 0.60, 0.52, and 0.28, respectively). Agreement was unrelated to arousal duration. We conclude that the ASDA definition of arousal is only moderately repeatable. Account should be taken of this variability when results from different centers are compared.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dolz, J., E-mail: jose.dolz.upv@gmail.com; Kirişli, H. A.; Massoptier, L.
2016-05-15
Purpose: Accurate delineation of organs at risk (OARs) on computed tomography (CT) image is required for radiation treatment planning (RTP). Manual delineation of OARs being time consuming and prone to high interobserver variability, many (semi-) automatic methods have been proposed. However, most of them are specific to a particular OAR. Here, an interactive computer-assisted system able to segment various OARs required for thoracic radiation therapy is introduced. Methods: Segmentation information (foreground and background seeds) is interactively added by the user in any of the three main orthogonal views of the CT volume and is subsequently propagated within the whole volume. The proposed method is based on the combination of watershed transformation and graph-cuts algorithm, which is used as a powerful optimization technique to minimize the energy function. The OARs considered for thoracic radiation therapy are the lungs, spinal cord, trachea, proximal bronchus tree, heart, and esophagus. The method was evaluated on multivendor CT datasets of 30 patients. Two radiation oncologists participated in the study and manual delineations from the original RTP were used as ground truth for evaluation. Results: Delineation of the OARs obtained with the minimally interactive approach was approved to be usable for RTP in nearly 90% of the cases, excluding the esophagus, whose segmentation was mostly rejected, thus leading to a gain of time ranging from 50% to 80% in RTP. Considering exclusively accepted cases, over all OARs, a Dice similarity coefficient higher than 0.7 and a Hausdorff distance below 10 mm with respect to the ground truth were achieved. In addition, the interobserver analysis did not highlight any statistically significant difference, with the exception of the segmentation of the heart, in terms of Hausdorff distance and volume difference. 
Conclusions: An interactive, accurate, fast, and easy-to-use computer-assisted system able to segment various OARs required for thoracic radiation therapy has been presented and clinically evaluated. The introduction of the proposed system in clinical routine may offer a valuable new option to radiation oncologists in performing RTP.
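The two overlap metrics quoted in the results, the Dice similarity coefficient and the Hausdorff distance, can be illustrated on small sets of voxel coordinates. This is a brute-force sketch under the assumption that each segmentation is given as a collection of integer coordinate tuples; the function names are hypothetical, and distances are in voxel units until multiplied by the voxel spacing.

```python
import math


def dice_coefficient(seg_a, seg_b):
    """Dice similarity: 2|A ∩ B| / (|A| + |B|) for two voxel sets."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0                       # two empty masks agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(seg_a, seg_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    a, b = list(seg_a), list(seg_b)
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from any source point to its nearest target point.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))
```

The pairwise search is quadratic in the number of points, so for full CT volumes one would normally restrict the Hausdorff computation to surface voxels or use a distance transform.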
Pavlovic, Chris; Futamatsu, Hideki; Angiolillo, Dominick J; Guzman, Luis A; Wilke, Norbert; Siragusa, Daniel; Wludyka, Peter; Percy, Robert; Northrup, Martin; Bass, Theodore A; Costa, Marco A
2007-04-01
The purpose of this study is to evaluate the accuracy of semiautomated analysis of contrast enhanced magnetic resonance angiography (MRA) in patients who have undergone standard angiographic evaluation for peripheral vascular disease (PVD). Magnetic resonance angiography is an important tool for evaluating PVD. Although this technique is both safe and noninvasive, the accuracy and reproducibility of quantitative measurements of disease severity using MRA in the clinical setting have not been fully investigated. 43 lesions in 13 patients who underwent both MRA and digital subtraction angiography (DSA) of iliac and common femoral arteries within 6 months were analyzed using quantitative magnetic resonance angiography (QMRA) and quantitative vascular analysis (QVA). Analysis was repeated by a second operator and by the same operator in approximately 1 month time. QMRA underestimated percent diameter stenosis (%DS) compared to measurements made with QVA by 2.47%. Limits of agreement between the two methods were +/- 9.14%. Interobserver variability in measurements of %DS were +/- 12.58% for QMRA and +/- 10.04% for QVA. Intraobserver variability of %DS for QMRA was +/- 4.6% and for QVA was +/- 8.46%. QMRA displays a high level of agreement to QVA when used to determine stenosis severity in iliac and common femoral arteries. Similar levels of interobserver and intraobserver variability are present with each method. Overall, QMRA represents a useful method to quantify severity of PVD.
Automatic segmentation of the choroid in enhanced depth imaging optical coherence tomography images.
Tian, Jing; Marziliano, Pina; Baskaran, Mani; Tun, Tin Aung; Aung, Tin
2013-03-01
Enhanced Depth Imaging (EDI) optical coherence tomography (OCT) provides high-definition cross-sectional images of the choroid in vivo, and hence is used in many clinical studies. However, the quantification of the choroid depends on the manual labelings of two boundaries, Bruch's membrane and the choroidal-scleral interface. This labeling process is tedious and subject to inter-observer differences; hence, automatic segmentation of the choroid layer is highly desirable. In this paper, we present a fast and accurate algorithm that could segment the choroid automatically. Bruch's membrane is detected by searching the pixel with the biggest gradient value above the retinal pigment epithelium (RPE) and the choroidal-scleral interface is delineated by finding the shortest path of the graph formed by valley pixels using Dijkstra's algorithm. The experiments comparing automatic segmentation results with the manual labelings are conducted on 45 EDI-OCT images and the average of Dice's Coefficient is 90.5%, which shows good consistency of the algorithm with the manual labelings. The processing time for each image is about 1.25 seconds.
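The shortest-path step can be pictured as tracing the cheapest left-to-right route through a per-pixel cost image. The sketch below is a simplified stand-in, not the authors' graph of valley pixels: it assumes a small grid of non-negative costs, allows each step to move one column right and at most one row up or down, and runs Dijkstra's algorithm with a binary heap. All names and the grid layout are assumptions made for illustration.

```python
import heapq


def shortest_boundary(cost):
    """Cheapest left-to-right path through a grid of non-negative costs.

    Returns (total_cost, rows), where rows[c] is the row chosen in column c.
    """
    rows, cols = len(cost), len(cost[0])
    best = {(r, 0): cost[r][0] for r in range(rows)}   # any row may start
    prev = {}
    heap = [(cost[r][0], r, 0) for r in range(rows)]
    heapq.heapify(heap)
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > best.get((r, c), float("inf")):
            continue                                   # stale heap entry
        if c == cols - 1:
            # First settled node in the last column is optimal.
            path, node = [r], (r, c)
            while node in prev:
                node = prev[node]
                path.append(node[0])
            return d, path[::-1]
        for dr in (-1, 0, 1):                          # up-right, right, down-right
            nr, nc = r + dr, c + 1
            if 0 <= nr < rows:
                nd = d + cost[nr][nc]
                if nd < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, nr, nc))
    raise ValueError("cost grid must have at least one row and one column")


# Toy cost image: low values mark likely boundary pixels.
total, boundary_rows = shortest_boundary([[5, 5, 5],
                                          [1, 9, 1],
                                          [5, 1, 5]])
```

In a real image the cost would typically be derived from intensity or gradient so that boundary pixels are cheap, which is an implementation choice the abstract does not spell out.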
Use of cone beam computed tomography in identifying postmenopausal women with osteoporosis.
Brasileiro, C B; Chalub, L L F H; Abreu, M H N G; Barreiros, I D; Amaral, T M P; Kakehasi, A M; Mesquita, R A
2017-12-01
The aim of this study is to correlate radiometric indices from cone beam computed tomography (CBCT) images and bone mineral density (BMD) in postmenopausal women. Quantitative CBCT indices can be used to screen for women with low BMD. Osteoporosis is a disease characterized by the deterioration of bone tissue and the consequent decrease in BMD and increase in bone fragility. Several studies have been performed to assess radiometric indices in panoramic images as low-BMD predictors. Sixty postmenopausal women with indications for dental implants and CBCT evaluation were selected. Dual-energy X-ray absorptiometry (DXA) was performed, and the patients were divided into normal, osteopenia, and osteoporosis groups, according to the World Health Organization (WHO) criteria. Cross-sectional images were used to evaluate the computed tomography mandibular index (CTMI), the computed tomography index (inferior) (CTI (I)) and computed tomography index (superior) (CTI (S)). Student's t test was used to compare the indices between the groups, and agreement was assessed with the intraclass correlation coefficient (ICC). Statistical analysis showed a high degree of interobserver and intraobserver agreement for all measurements (ICC > 0.80). The mean values of CTMI, CTI (S), and CTI (I) were lower in the osteoporosis group than in osteopenia and normal patients (p < 0.05). In comparing normal patients and women with osteopenia, there was no statistically significant difference in the mean value of CTI (I) (p = 0.075). Quantitative CBCT indices may help dentists to screen for women with low spinal and femoral bone mineral density so that they can refer postmenopausal women for bone densitometry.
Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin
2013-01-01
Background There has been no study to compare the diagnostic accuracy of an experienced radiologist with a trainee in nasal bone fracture. Objectives To compare the diagnostic accuracy between conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. Patients and Methods A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Results Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate as compared to conventional radiography for the identification of nasal bone fractures as determined by both readers (P <0.05), all diagnostic indices of an experienced radiologist were similar to or higher than those of a trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in the assessment of interobserver reliability for both imaging modalities in the identification of nasal bone fractures. Conclusion For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although a staff radiologist showed better values in the identification of nasal bone fracture and differentiation between depressed and non-depressed fractures than a trainee, there was no statistically significant difference in the interpretation of conventional radiography and CT between a radiologist and a trainee. PMID:24348599
Bisdas, S; Yang, X; Lim, C C T; Vogl, T J; Koh, T S
2008-01-01
Dynamic contrast-enhanced (DCE) imaging is a promising approach for in vivo assessment of tissue microcirculation. Twenty patients with clinical and routine computed tomography (CT) evidence of intracerebral neoplasm were examined with DCE-CT imaging. Using a distributed-parameter model for tracer kinetics modeling of DCE-CT data, voxel-level maps of cerebral blood flow (F), intravascular blood volume (vi) and intravascular mean transit time (t1) were generated. Permeability-surface area product (PS), extravascular extracellular blood volume (ve) and extraction ratio (E) maps were also calculated to reveal pathologic locations of tracer extravasation, which are indicative of disruptions in the blood-brain barrier (BBB). All maps were visually assessed for quality of tumor delineation and measurement of tumor extent by two radiologists. Kappa (κ) coefficients and their 95% confidence intervals (CI) were calculated to determine the interobserver agreement for each DCE-CT map. There was a substantial agreement for the tumor delineation quality in the F, ve and t1 maps. The agreement for the quality of the tumor delineation was excellent for the vi, PS and E maps. Concerning the measurement of tumor extent, excellent and nearly excellent agreement was achieved only for E and PS maps, respectively. According to these results, we performed a segmentation of the cerebral tumors on the basis of the E maps. The interobserver agreement for the tumor extent quantification based on manual segmentation of tumor in the E maps vs. the computer-assisted segmentation was excellent (κ = 0.96, CI: 0.93-0.99). The interobserver agreement for the tumor extent quantification based on computer segmentation in the mean images and the E maps was substantial (κ = 0.52, CI: 0.42-0.59). 
This study illustrates the diagnostic usefulness of parametric maps associated with BBB disruption on a physiology-based approach and highlights the feasibility for automatic segmentation of cerebral tumors.
NASA Astrophysics Data System (ADS)
Aklan, Bassim; Hartmann, Josefin; Zink, Diana; Siavooshhaghighi, Hadi; Merten, Ricarda; Putz, Florian; Ott, Oliver; Fietkau, Rainer; Bert, Christoph
2017-06-01
The aim of this study was to systematically investigate the influence of the inter- and intra-observer segmentation variation of tumors and organs at risk on the simulated temperature coverage of the target. CT scans of six patients with tumors in the pelvic region acquired for radiotherapy treatment planning were used for hyperthermia treatment planning. To study the effect of inter-observer variation, three observers manually segmented in the CT images of each patient the following structures: fat, muscle, bone and the bladder. The gross tumor volumes (GTV) were contoured by three radiation oncology residents and used as the hyperthermia target volumes. For intra-observer variation, one of the observers of each group contoured the structures of each patient three times with a time span of one week between the segmentations. Moreover, the impact of segmentation variations in organs at risk (OARs) between the three inter-observers was investigated on simulated temperature distributions using only one GTV. The spatial overlap between individual segmentations was assessed by the Dice similarity coefficient (DSC) and the mean surface distance (MSD). Additionally, the temperatures T90/T10 delivered to 90%/10% of the GTV, respectively, were assessed for each observer combination. The results of the segmentation similarity evaluation showed that the DSC of the inter-observer variation of fat, muscle, the bladder, bone and the target was 0.68 ± 0.12, 0.88 ± 0.05, 0.73 ± 0.14, 0.91 ± 0.04 and 0.64 ± 0.11, respectively. Similar results were found for the intra-observer variation. The MSD results were similar to the DSCs for both observer variations. A statistically significant difference (p < 0.05) was found for T90 and T10 in the predicted target temperature due to the observer variability. The conclusion is that intra- and inter-observer variations have a significant impact on the temperature coverage of the target. 
Furthermore, OARs, such as bone and the bladder, may essentially influence the homogeneity of the simulated target temperature distribution.
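The Dice similarity coefficient used above to compare segmentations can be illustrated with a minimal sketch. The voxel sets, the function name `dice_coefficient` and the toy geometry are hypothetical and are not taken from the study; the sketch only shows the standard definition DSC = 2|A ∩ B| / (|A| + |B|).

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two collections of voxel indices."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty segmentations agree fully
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical toy contours: voxels labelled as the same structure by two observers.
observer_1 = {(x, y) for x in range(0, 10) for y in range(0, 10)}  # 100 voxels
observer_2 = {(x, y) for x in range(2, 12) for y in range(0, 10)}  # same shape, shifted by 2

dsc = dice_coefficient(observer_1, observer_2)
print(f"DSC = {dsc:.2f}")
```

In this toy case the two contours share 80 of their 100 voxels each, so a shift of only two voxels already lowers the overlap noticeably, which is in line with small structures tending to score lower DSC values.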
Varga, Zsuzsanna; Cassoly, Estelle; Li, Qiyu; Oehlschlegel, Christian; Tapia, Coya; Lehr, Hans Anton; Klingbiel, Dirk; Thürlimann, Beat; Ruhstaller, Thomas
2015-01-01
Background: Proliferative activity (Ki-67 Labelling Index) in breast cancer increasingly serves as an additional tool in the decision for or against adjuvant chemotherapy in midrange hormone receptor positive breast cancer. Ki-67 Index has been previously shown to suffer from high inter-observer variability especially in midrange (G2) breast carcinomas. In this study we conducted a systematic approach using different Ki-67 assessments on large tissue sections in order to identify the method with the highest reliability and the lowest variability. Materials and Methods: Five breast pathologists retrospectively analyzed proliferative activity of 50 G2 invasive breast carcinomas using large tissue sections by assessing Ki-67 immunohistochemistry. Ki-67 assessments were done on light microscopy and on digital images following these methods: 1) assessing five regions, 2) assessing only darkly stained nuclei and 3) considering only condensed proliferative areas (‘hotspots’). An individual review (the first described assessment from 2008) was also performed. The assessments on light microscopy were done by estimation. All measurements were performed three times. Inter-observer and intra-observer reliabilities were calculated using the approach proposed by Eliasziw et al. Clinical cutoffs (14% and 20%) were tested using Fleiss’ Kappa. Results: There was a good intra-observer reliability in 5 of 7 methods (ICC: 0.76–0.89). The two highest inter-observer reliabilities were fair to moderate (ICC: 0.71 and 0.74), obtained with 2 methods (region-analysis and individual-review) on light microscopy. Fleiss’ kappa values (14% cut-off) were the highest (moderate) using the original recommendation on light microscopy (Kappa 0.58). Fleiss’ kappa values (20% cut-off) were the highest (Kappa 0.48 each) in analyzing hotspots on light microscopy and digital analysis. No methodologies using digital analysis were superior to the methods on light microscopy.
Conclusion: Our results show that all methods on light microscopy for Ki-67 assessment in large tissue sections resulted in a good intra-observer reliability. Region analysis and individual review (the original recommendation) on light microscopy yielded the highest inter-observer reliability. These results show a slight improvement over previously published data on poor reproducibility and thus might be a practical, pragmatic way for routine assessment of Ki-67 Index in G2 breast carcinomas. PMID:25885288
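Testing a clinical cutoff with Fleiss' kappa, as described above, can be sketched as follows. The Ki-67 readings are invented for illustration, and treating a value of 14% or more as "high" is an assumption; the abstract does not state on which side of the cutoff the boundary value falls.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters placing subject i in category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Share of all ratings that fall into each category.
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(n_categories)]
    # Agreement among rater pairs for each subject.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical Ki-67 indices (%) from five pathologists for four tumours.
readings = [
    [10, 12, 11, 13, 9],
    [20, 25, 18, 22, 30],
    [12, 15, 13, 16, 14],
    [15, 13, 18, 12, 20],
]
CUTOFF = 14  # assumed rule: >= 14 % counts as "high"
counts = [[sum(r < CUTOFF for r in row), sum(r >= CUTOFF for r in row)]
          for row in readings]
kappa = fleiss_kappa(counts)
print(f"Fleiss' kappa at the {CUTOFF} % cutoff: {kappa:.2f}")
```

Readings that cluster around the cutoff (the last two tumours here) are the ones that pull kappa down, even when the raw percentages differ by only a few points.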
Holland-Letz, Tim; Endres, Heinz G; Biedermann, Stefanie; Mahn, Matthias; Kunert, Joachim; Groh, Sabine; Pittrow, David; von Bilderling, Peter; Sternitzky, Reinhardt; Diehm, Curt
2007-05-01
The reliability of ankle-brachial index (ABI) measurements performed by different observer groups in primary care has not yet been determined. The aims of the study were to provide precise estimates for all effects influencing the variability of the ABI (patients' individual variability, intra- and inter-observer variability), with particular focus on the performance of different observer groups. Using a partially balanced incomplete block design, 144 unselected individuals aged ≥ 65 years underwent double ABI measurements by one vascular surgeon or vascular physician, one family physician and one nurse with training in Doppler sonography. Three groups comprising a total of 108 individuals were analyzed (only two with ABI < 0.90). Errors for two repeated measurements for all three observer groups did not differ (experts 8.5%, family physicians 7.7%, and nurses 7.5%, p = 0.39). There was no relevant bias among observer groups. Intra-observer variability expressed as standard deviation divided by the mean was 8%, and inter-observer variability was 9%. In conclusion, reproducibility of the ABI measurement was good in this cohort of elderly patients who almost all had values in the normal range. The mean error of 8-9% within or between observers is smaller than with established screening measures. Since there were no differences among observers with different training backgrounds, our study confirms the appropriateness of ABI assessment for screening peripheral arterial disease (PAD) and generalized atherosclerosis in the primary care setting. Given the importance of the early detection and management of PAD, this diagnostic tool should be used routinely as a standard for PAD screening. Additional studies will be required to confirm our observations in patients with PAD of various severities.
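The variability measure quoted above (standard deviation divided by the mean) is a coefficient of variation. A minimal sketch follows, using invented ABI readings and assuming the sample standard deviation; the study itself estimated these effects from a block design, which this toy example does not reproduce.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed in percent."""
    return 100 * stdev(values) / mean(values)


# Hypothetical repeated ABI readings for one individual.
abi_readings = [1.00, 1.10, 0.90, 1.00]
cv = coefficient_of_variation(abi_readings)
print(f"CV = {cv:.1f} %")
```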
Tian, Bing; Xu, Bing; Lu, Jianping; Liu, Qi; Wang, Li; Wang, Minjie
2015-06-01
This study aimed to evaluate the usefulness of four-dimensional CTA before and after embolization treatment with ONYX-18 in eleven patients with cranial dural arteriovenous fistulas, and to compare the results with those of the reference standard DSA. Eleven patients with cranial dural arteriovenous fistulas detected on DSA underwent transarterial embolization with ONYX-18. Four-dimensional CTA was performed an average of 2 days before and 4 days after DSA. Four-dimensional CTA and DSA images were reviewed by two neuroradiologists for identification of feeding arteries and drainage veins and for determining treatment effects. Interobserver and intermodality agreement between four-dimensional CTA and DSA were assessed. Forty-two feeding arteries were identified for 14 fistulas in the 11 patients. Of these, 36 (85.71%) were detected on four-dimensional CTA. After transarterial embolization, the fistula in one patient was partly embolized, and the fistulas in the remaining 10 patients were completely occluded. The interobserver agreement for four-dimensional CTA and intermodality agreement between four-dimensional CTA and DSA were excellent (κ=1) for shunt location, identification of drainage veins, and fistula occlusion after treatment. Four-dimensional CTA images are highly accurate when compared with DSA images both before and after transarterial embolization treatment. Four-dimensional CTA can be used for diagnosis as well as follow-up of cranial dural arteriovenous fistulas in clinical settings. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Storr, Ashleigh; Venetis, Christos A; Cooke, Simon; Kilani, Suha; Ledger, William
2017-02-01
What is the inter-observer and intra-observer agreement between embryologists when selecting a single Day 5 embryo for transfer? The inter-observer and intra-observer agreement between embryologists when selecting a single Day 5 embryo for transfer was generally good, although not optimal, even among experienced embryologists. Previous research on the morphological assessment of early stage (two pronuclei to Day 3) embryos has shown varying levels of inter-observer and intra-observer agreement. However, single blastocyst transfer is now becoming increasingly popular and there are no published data that assess inter-observer and intra-observer agreement when selecting a single embryo for Day 5 transfer. This was a prospective study involving 10 embryologists working at five different IVF clinics within a single organization between July 2013 and November 2015. The top 10 embryologists were selected based on their yearly Quality Assurance Program scores for blastocyst grading and were asked to morphologically grade all Day 5 embryos and choose a single embryo for transfer in a survey of 100 cases using 2D images. A total of 1000 decisions were therefore assessed. For each case, Day 5 images were shown, followed by a Day 3 and Day 5 image of the same embryo. Subgroup analyses were also performed based on the following characteristics of embryologists: the level of clinical embryology experience in the laboratory; amount of research experience; number of days per week spent grading embryos. The agreement between these embryologists and the one that scored the embryos on the actual day of transfer was also evaluated. Inter-observer and intra-observer variability was assessed using the kappa coefficient to evaluate the extent of agreement. This study showed that all 10 embryologists agreed on the embryo chosen for transfer in 50 out of 100 cases. In 93 out of 100 cases, at least 6 out of the 10 embryologists agreed. 
The inter-observer and intra-observer agreement among embryologists when selecting a single Day 5 embryo for transfer was generally good as assessed by the kappa scores (kappa = 0.734, 95% CI: 0.665-0.791 and 0.759, 95% CI: 0.622-0.833, respectively). The subgroup analyses did not substantially alter the inter-observer and intra-observer agreement among embryologists. The agreement when Day 3 images were included alongside Day 5 images of the same embryos resulted in a change of mind at least three times by each embryologist (on average for <10% of cases) and resulted in a small decrease in inter-observer and intra-observer agreement between embryologists (kappa = 0.676, 95% CI: 0.617-0.724 and 0.752, 95% CI: 0.656-0.808, respectively). The assessment of the inter-observer agreement with regard to morphological grading of Day 5 embryos showed only a fair-to-moderate agreement, which was observed across all subgroup analyses. The highest overall kappa coefficient was seen for the grading of the developmental stage of an embryo (0.513; 95% CI: 0.492-0.538). The findings were similar when the individual embryologists were compared with the embryologist who made the morphological assessments of the available embryos on the actual day of transfer. All embryologists had already completed their training and were working under one organization with similar policies between the five clinics. Therefore, the inter-observer agreement might not be as high between embryologists working in clinics with different policies or with different levels of training. The generally good, although not optimal, uniformity between participating embryologists when selecting a Day 5 embryo for transfer, as well as the surprisingly low agreement when morphologically grading Day 5 embryos, could be improved, potentially resulting in increased pregnancy rates. Future studies need to be directed toward technologies that can help achieve this. None declared. Not applicable. © The Author 2016.
Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Pineau, V; Lebel, B; Gouzy, S; Dutheil, J-J; Vielpeau, C
2010-10-01
The use of dual mobility cups is an effective method to prevent dislocations. However, the specific design of these implants can raise the suspicion of increased wear and subsequent periprosthetic osteolysis. Using radiostereometric analysis (RSA), migration of the femoral head inside the cup of a dual mobility implant can be measured to estimate the polyethylene wear rate. The study aimed to establish the precision of RSA measurement of femoral head migration in the cup of a dual mobility implant, and its intra- and interobserver variability. A total hip prosthesis phantom was implanted and placed under weight loading conditions in a simulator. Model-based RSA measurement of implant penetration involved specially machined polyethylene liners with increasing concentric wear (no wear, then 0.25, 0.5 and 0.75 mm). Three examiners, blinded to the level of wear, analyzed (10 times) the radiostereometric films of the four liners. There was one experienced, one trained, and one inexperienced examiner. Statistical analysis measured the accuracy, precision, and intra- and interobserver variability by calculating Root Mean Square Error (RMSE), Concordance Correlation Coefficient (CCC), Intraclass Correlation Coefficient (ICC), and Bland-Altman plots. Our protocol, which used a simple geometric model rather than the manufacturer's CAD files, showed precision of 0.072 mm and accuracy of 0.034 mm, comparable with machining tolerances with low variability. Correlation between wear measurement and true value was excellent with a CCC of 0.9772. Intraobserver reproducibility was very good with an ICC of 0.9856, 0.9883 and 0.9842, respectively for examiners 1, 2 and 3. Interobserver reproducibility was excellent with a CCC of 0.9818 between examiners 2 and 1, and 0.9713 between examiners 3 and 1. Quantification of wear is indispensable for the surveillance of dual mobility implants. This in vitro study validates our measurement method.
Our results, and comparison with other studies using different measurement technologies (RSA, standard radiographs, Martell method) make model-based RSA the reference method for measuring the wear of total hip prostheses in vivo. Level 3. Prospective diagnostic study. Copyright © 2010 Elsevier Masson SAS. All rights reserved.
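Two of the statistics named above, RMSE and the concordance correlation coefficient, can be sketched in a few lines. The "measured" values below are invented; only the nominal wear levels (0, 0.25, 0.5 and 0.75 mm) come from the abstract. The CCC follows Lin's definition with 1/n moments, which is assumed here and may differ from the software the authors used.

```python
from math import sqrt


def rmse(measured, reference):
    """Root mean square error of measurements against reference values."""
    n = len(measured)
    return sqrt(sum((m - r) ** 2 for m, r in zip(measured, reference)) / n)


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient (population moments, 1/n)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference penalises systematic bias, unlike Pearson's r.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


true_wear = [0.0, 0.25, 0.5, 0.75]        # machined wear levels in mm
measured_wear = [0.03, 0.27, 0.48, 0.80]  # hypothetical RSA readings in mm

error = rmse(measured_wear, true_wear)
ccc = concordance_correlation(measured_wear, true_wear)
print(f"RMSE = {error:.3f} mm, CCC = {ccc:.4f}")
```

A constant offset between two examiners leaves Pearson's correlation at 1 but lowers the CCC, which is why the CCC is the more suitable choice for agreement with a known true value.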
Cunningham, Devin P; Mostafa, Ayman A; Gordan-Evans, Wanda J; Boudrieau, Randy J; Griffon, Dominique J
2017-08-14
We recently reported that a conformation score derived from the tibial plateau angle (TPA) and the femoral anteversion angle (FAA) best discriminates limbs predisposed to, or affected by, cranial cruciate ligament disease (CCLD) from those that are at low risk for CCLD. The specificity and sensitivity of this score were high enough to support further investigations toward its use for large-scale screening of dogs by veterinarians. The next step, which is the objective of the current study, is to determine inter-observer variability of that CCLD score in a large population of Labrador Retrievers. A total of 167 Labradors were enrolled in this cross-sectional study. Limbs of normal dogs over 6 years of age with no history of CCLD were considered at low risk for CCLD. Limbs of dogs with CCLD were considered at high risk for CCLD. Tibial plateau and femoral anteversion angles were measured independently by two investigators to calculate a CCLD score for each limb. Kappa statistics were used to determine the extent of agreement between investigators. Pearson's correlation and intraclass coefficients were calculated to evaluate the correlation between investigators and the relative contribution of each measurement to the variability of the CCLD score. The correlation between CCLD scores calculated by investigators was good (correlation coefficient = 0.68, p < 0.0001). However, interobserver agreement with regard to the predicted status of limbs was fair (kappa value = 0.28), with 37% of limbs being assigned divergent classifications. Variations in CCLD scores correlated best with those of TPA, which was the least consistent parameter between investigators. Absolute interobserver differences were two times greater for FAAs (4.19° ± 3.15) than TPAs (2.23° ± 1.91). The reproducibility of the CCLD score between investigators is fair, justifying caution when interpreting individual scores.
Future studies should focus on improving the reproducibility of TPA and FAA measurements, as strategies to improve the agreement between CCLD scores.
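The contrast reported above between a good correlation and only fair kappa agreement is easier to see with a small worked example of Cohen's kappa for two raters. The limb classifications below are invented and much smaller than the study sample; the function name is illustrative.

```python
from collections import Counter


def cohen_kappa(ratings_1, ratings_2):
    """Cohen's kappa for two raters classifying the same items."""
    n = len(ratings_1)
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    freq_1, freq_2 = Counter(ratings_1), Counter(ratings_2)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(freq_1[c] * freq_2[c]
                   for c in set(ratings_1) | set(ratings_2)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical predicted status of ten limbs ("H" = high risk, "L" = low risk).
investigator_1 = list("HHHHHLLLLL")
investigator_2 = list("HHHLLLLLHH")

kappa = cohen_kappa(investigator_1, investigator_2)
print(f"kappa = {kappa:.2f}")
```

Here the investigators agree on 6 of 10 limbs, yet half of that agreement is expected by chance alone, so kappa is only 0.2.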
Error quantification of osteometric data in forensic anthropology.
Langley, Natalie R; Meadows Jantz, Lee; McNulty, Shauna; Maijanen, Heli; Ousley, Stephen D; Jantz, Richard L
2018-06-01
This study evaluates the reliability of osteometric data commonly used in forensic case analyses, with specific reference to the measurements in Data Collection Procedures 2.0 (DCP 2.0). Four observers took a set of 99 measurements four times on a sample of 50 skeletons (each measurement was taken 200 times by each observer). Two-way mixed ANOVAs and repeated measures ANOVAs with pairwise comparisons were used to examine interobserver (between-subjects) and intraobserver (within-subjects) variability. Relative technical error of measurement (TEM) was calculated for measurements with significant ANOVA results to examine the error among a single observer repeating a measurement multiple times (e.g. repeatability or intraobserver error), as well as the variability between multiple observers (interobserver error). Two general trends emerged from these analyses: (1) maximum lengths and breadths have the lowest error across the board (TEM<0.5), and (2) maximum and minimum diameters at midshaft are more reliable than their positionally-dependent counterparts (i.e. sagittal, vertical, transverse, dorso-volar). Therefore, maxima and minima are specified for all midshaft measurements in DCP 2.0. Twenty-two measurements were flagged for excessive variability (either interobserver, intraobserver, or both); 15 of these measurements were part of the standard set of measurements in Data Collection Procedures for Forensic Skeletal Material, 3rd edition. Each measurement was examined carefully to determine the likely source of the error (e.g. data input, instrumentation, observer's method, or measurement definition). For several measurements (e.g. anterior sacral breadth, distal epiphyseal breadth of the tibia) only one observer differed significantly from the remaining observers, indicating a likely problem with the measurement definition as interpreted by that observer; these definitions were clarified in DCP 2.0 to eliminate this confusion. 
Other measurements were taken from landmarks that are difficult to locate consistently (e.g. pubis length, ischium length); these measurements were omitted from DCP 2.0. This manual is available for free download online (https://fac.utk.edu/wp-content/uploads/2016/03/DCP20_webversion.pdf), along with an accompanying instructional video (https://www.youtube.com/watch?v=BtkLFl3vim4). Copyright © 2018 Elsevier B.V. All rights reserved.
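The relative technical error of measurement referred to above can be sketched with the commonly used anthropometric formula for K repeated measurements on N specimens. Whether the authors used exactly this variant is an assumption, and the femur lengths below are invented.

```python
from math import sqrt


def technical_error(measurements):
    """TEM for N specimens, each measured K times (rows = specimens)."""
    n = len(measurements)
    k = len(measurements[0])
    # Within-specimen sum of squares, pooled over all specimens.
    within = sum(sum(x * x for x in row) - sum(row) ** 2 / k
                 for row in measurements)
    return sqrt(within / (n * (k - 1)))


def relative_technical_error(measurements):
    """TEM as a percentage of the grand mean."""
    values = [x for row in measurements for x in row]
    return 100 * technical_error(measurements) / (sum(values) / len(values))


# Hypothetical maximum femur lengths (mm): three skeletons, two repeats each.
femur_lengths = [[450, 452], [430, 430], [470, 468]]

tem = technical_error(femur_lengths)
rel_tem = relative_technical_error(femur_lengths)
print(f"TEM = {tem:.2f} mm, relative TEM = {rel_tem:.2f} %")
```

For two repeats the expression reduces to the familiar sqrt(sum of squared differences / 2N). Dividing by the grand mean is what lets long measurements such as maximum lengths be compared with small diameters on one scale.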
Özkan, Sezai; Mellema, Jos J.; Ring, David; Chen, Neal C.
2017-01-01
Background: To examine whether interobserver reliability, decision-making, and confidence in decision-making in the treatment of distal radius fractures changes if radiographs are viewed on a messenger application on a mobile phone compared to a standard DICOM viewer. Methods: Radiographs of distal radius fractures were presented to surgeons on either a smart phone using a mobile messenger application or a laptop using a DICOM viewer application. Twenty observers participated: 10 (50%) were randomly assigned to the DICOM viewer group and 10 (50%) to the mobile messenger group. Each observer was asked to evaluate the cases and (1) classify the fracture type according to the AO classification, (2) recommend operative or conservative treatment and (3) rate their confidence about this decision. Results: There was no significant difference in interobserver reliability for AO classification and recommendation for surgery for distal radius fractures in both groups. The percentage of recommendation for surgery was significantly higher in the messenger application group compared to the DICOM viewer group (89% versus 78%, P=0.019) and the confidence for treatment decision was significantly higher in the mobile messenger group compared to the DICOM viewer group (8.9 versus 7.9, P=0.026). Conclusion: Messenger applications on mobile phones could facilitate remote decision-making for patients with distal radius fractures, but should be used with caution. PMID:29226202
Agreement in the assessment of metastatic spine disease using scoring systems.
Arana, Estanislao; Kovacs, Francisco M; Royuela, Ana; Asenjo, Beatriz; Pérez-Ramírez, Ursula; Zamora, Javier
2015-04-01
To assess variability in the use of Tomita and modified Bauer scores in spine metastases. Clinical data and imaging from 90 patients with biopsy-proven spinal metastases were provided to 83 specialists from 44 hospitals. Spinal levels involved and the Tomita and modified Bauer scores for each case were determined twice by each clinician, with a minimum interval of 6 weeks. Clinicians were blinded to every evaluation. The kappa statistic was used to assess intra- and inter-observer agreement. Subgroup analyses were performed according to clinicians' specialty (medical oncology, neurosurgery, radiology, orthopedic surgery and radiation oncology), years of experience (⩽7, 8-13, ⩾14), and type of hospital (four levels). For metastases identification, intra-observer agreement was "substantial" (0.60
Identifying and classifying hyperostosis frontalis interna via computerized tomography.
May, Hila; Peled, Nathan; Dar, Gali; Hay, Ori; Abbas, Janan; Masharawi, Youssef; Hershkovitz, Israel
2010-12-01
The aim of this study was to recognize the radiological characteristics of hyperostosis frontalis interna (HFI) and to establish a valid and reliable method for its identification and classification. A reliability test was carried out on 27 individuals who had undergone a head computerized tomography (CT) scan. Intra-observer reliability was obtained by examining the images three times, by the same researcher, with a 2-week interval between each sample ranking. The inter-observer test was performed by three independent researchers. A validity test was carried out using two methods for identifying and classifying HFI: 46 cadaver skullcaps were ranked twice via computerized tomography scans and then by direct observation. Reliability and validity were calculated using Kappa test (SPSS 15.0). Reliability tests of ranking HFI via CT scans demonstrated good results (K > 0.7). As for validity, a very good consensus was obtained between the CT and direct observation, when moderate and advanced types of HFI were present (K = 0.82). The suggested classification method for HFI, using CT, demonstrated a sensitivity of 84%, specificity of 90.5%, and positive predictive value of 91.3%. In conclusion, volume rendering is a reliable and valid tool for identifying HFI. The suggested three-scale classification is most suitable for radiological diagnosis of the phenomenon. Considering the increasing awareness of HFI as an early indicator of a developing malady, this study may assist radiologists in identifying and classifying the phenomenon.
Accuracy of MSCT Coronary Angiography with 64 Row CT Scanner—Facing the Facts
Wehrschuetz, M.; Wehrschuetz, E.; Schuchlenz, H.; Schaffler, G.
2010-01-01
Improvements in multislice computed tomography (MSCT) angiography of the coronary vessels have enabled the minimally invasive detection of coronary artery stenoses, while quantitative coronary angiography (QCA) is the accepted reference standard for evaluation thereof. Sixteen-slice MSCT showed promising diagnostic accuracy in detecting haemodynamically significant coronary artery stenoses, and the subsequent introduction of 64-slice scanners promised excellent and fast results for coronary artery studies. This prompted us to evaluate the diagnostic accuracy, sensitivity, specificity, and the negative and positive predictive value of 64-slice MSCT in the detection of haemodynamically significant coronary artery stenoses. Thirty-seven consecutive subjects with suspected coronary artery disease were evaluated with MSCT angiography and the results compared with QCA. All vessels were considered for the assessment of significant coronary artery stenosis (diameter reduction ≥ 50%). Thirteen patients (35%) were identified as having significant coronary artery stenoses on QCA with 6.3% (35/555) affected segments. None of the coronary segments were excluded from analysis. Overall sensitivity for classifying stenoses of 64-slice MSCT was 69%, specificity was 92%, positive predictive value was 38% and negative predictive value was 98%. The interobserver variability for detection of significant lesions had a κ-value of 0.43. Sixty-four-slice MSCT offers the diagnostic potential to detect coronary artery disease, to quantify haemodynamically significant coronary artery stenoses and to avoid unnecessary invasive coronary artery examinations. PMID:20567636
Automatic tissue segmentation of head and neck MR images for hyperthermia treatment planning
NASA Astrophysics Data System (ADS)
Fortunati, Valerio; Verhaart, René F.; Niessen, Wiro J.; Veenland, Jifke F.; Paulides, Margarethus M.; van Walsum, Theo
2015-08-01
A hyperthermia treatment requires accurate, patient-specific treatment planning. This planning is based on 3D anatomical models which are generally derived from computed tomography. Because of its superior soft tissue contrast, magnetic resonance imaging (MRI) information can be introduced to improve the quality of these 3D patient models and therefore the treatment planning itself. Thus, we present here an automatic atlas-based segmentation algorithm for MR images of the head and neck. Our method combines multiatlas local weighting fusion with intensity modelling. The accuracy of the method was evaluated using a leave-one-out cross validation experiment over a set of 11 patients for which manual delineations were available. The accuracy of the proposed method was high both in terms of the Dice similarity coefficient (DSC) and the 95th percentile Hausdorff surface distance (HSD) with median DSC higher than 0.8 for all tissues except sclera. For all tissues, except the spine tissues, the accuracy was approaching the interobserver agreement/variability both in terms of DSC and HSD. The positive effect of adding the intensity modelling to the multiatlas fusion decreased when a more accurate atlas fusion method was used. Using the proposed approach we improved the performance of the approach previously presented for H&N hyperthermia treatment planning, making the method suitable for clinical application.
Streitberger, Andrea; Hocke, Verena; Modler, Peter
2013-09-01
To evaluate the feasibility of measuring pulmonary transit time (PTT) in healthy cats by transthoracic echocardiography using the ultrasound contrast agent Sonovue®. To determine normalized PTT (nPTT) values in 42 healthy cats and to estimate the interobserver variability and the within-day repeatability of nPTT measurements. Forty-two privately owned healthy cats of different breeds, gender and age presented for cardiac examination. A bolus injection of contrast agent (Sonovue®) was administered intravenously. The right parasternal short axis echocardiographic view was used to record the contrast agent's transit time from the pulmonary artery to the left atrium. Pulmonary transit time and nPTT were determined independently by three examiners with different levels of experience. Normalized PTT was 4.12 ± 1.0 (mean ± SD) in our population. The median interobserver variability across our population was 6.8%; the median within-day variabilities for the three observers were 13.1%, 12.7% and 13%. No effect of the observer's experience on nPTT measurement was identified. Age, sex and body weight did not significantly influence nPTT. This study demonstrates that nPTT measurement is feasible in cats using ultrasound and the blood pool contrast medium Sonovue®. Measurements of nPTT can be performed in a clinical setting. Normalized PTT values in healthy cats are comparable with those reported in healthy dogs. Copyright © 2013 Elsevier B.V. All rights reserved.
Razavi, Asma; Newth, Christopher J L; Khemani, Robinder G; Beltramo, Fernando; Ross, Patrick A
2017-06-01
To evaluate physician assessment of cardiac output and systemic vascular resistance in patients with shock compared with an ultrasonic cardiac output monitor (USCOM). To explore potential changes in therapy decisions if USCOM data were available using physician intervention answers. Double-blinded, prospective, observational study in a tertiary hospital pediatric intensive care unit. Forty children (<18 years) admitted with shock, requiring ongoing volume resuscitation or inotropic support. Two to 3 physicians clinically assessed cardiac output and systemic vascular resistance, categorizing them as high, normal, or low. An investigator simultaneously measured cardiac index (CI) and systemic vascular resistance index (SVRI) with USCOM, categorized as high, normal, or low. Overall agreement between physician and USCOM for CI (48.5% [κ = 0.18]) and SVRI (45.9% [κ = 0.16]) was poor. Interobserver agreement was also poor for CI (58.7% [κ = 0.33]) and SVRI (52.3% [κ = 0.28]). Comparing theoretical physician interventions to "acceptable" or "unacceptable" clinical interventions, based on USCOM measurement, 56 (21%) physician interventions were found to be "unacceptable." There is poor agreement between physician-assessed CI and SVRI and USCOM, with significant interobserver variability among physicians. Objective measurement of CI and SVRI may reduce variability and improve diagnostic accuracy. Copyright © 2016 Elsevier Inc. All rights reserved.
Damasio, Maria Beatrice; Malattia, Clara; Tanturri de Horatio, Laura; Mattiuz, Chiara; Pistorio, Angela; Bracaglia, Claudia; Barbuti, Domenico; Boavida, Peter; Juhan, Karen Lambot; Ording, Lil Sophie Mueller; Rosendahl, Karen; Martini, Alberto; Magnano, GianMichele; Tomà, Paolo
2012-09-01
MRI is a sensitive tool for the evaluation of synovitis in juvenile idiopathic arthritis (JIA). The purpose of this study was to introduce a novel MRI-based score for synovitis in children and to examine its inter- and intraobserver variability in a multi-centre study. Wrist MRI was performed in 76 children with JIA. On postcontrast 3-D spoiled gradient-echo and fat-suppressed T2-weighted spin-echo images, joint recesses were scored for the degree of synovial enhancement, effusion and overall inflammation independently by two paediatric radiologists. Total-enhancement and inflammation-synovitis scores were calculated. Interobserver agreement was poor to moderate for enhancement and inflammation in all recesses, except in the radioulnar and radiocarpal joints. Intraobserver agreement was good to excellent. For enhancement and inflammation scores, mean differences (95 % CI) between observers were -1.18 (-4.79 to 2.42) and -2.11 (-6.06 to 1.83). Intraobserver variability (reader 1) was 0 (-1.65 to 1.65) and 0.02 (-1.39 to 1.44). Intraobserver agreement was good. Except for the radioulnar and radiocarpal joints, interobserver agreement was not acceptable. Therefore, the proposed scoring system requires further refinement.
Quantifying facial paralysis using the Kinect v2.
Gaber, Amira; Taher, Mona F; Wahed, Manal Abdel
2015-01-01
Assessment of facial paralysis (FP) and quantitative grading of facial asymmetry are essential in order to quantify the extent of the condition as well as to follow its improvement or progression. As such, there is a need for an accurate quantitative grading system that is easy to use, inexpensive and has minimal inter-observer variability. A comprehensive automated system to quantify and grade FP is the main objective of this work. An initial prototype has been presented by the authors. The present research aims to enhance the accuracy and robustness of one of this system's modules: the resting symmetry module. This is achieved by including several modifications to the computation method of the symmetry index (SI) for the eyebrows, eyes and mouth. These modifications are the gamma correction technique, the area of the eyes, and the slope of the mouth. The system was tested on normal subjects and showed promising results. The mean SI of the eyebrows decreased slightly from 98.42% to 98.04% using the modified method while the mean SI for the eyes and mouth increased from 96.93% to 99.63% and from 95.6% to 98.11% respectively while using the modified method. The system is easy to use, inexpensive, automated and fast, has no inter-observer variability and is thus well suited for clinical use.
Gormsen, Lars C; Haraldsen, Ate; Kramer, Stine; Dias, Andre H; Kim, Won Yong; Borghammer, Per
2016-12-01
Cardiac sarcoidosis (CS) is a potentially fatal condition lacking a single test with acceptable diagnostic accuracy. (18)F-FDG PET/CT has emerged as a promising imaging modality, but is challenged by physiological myocardial glucose uptake. An alternative tracer, (68)Ga-DOTANOC, binds to somatostatin receptors on inflammatory cells in sarcoid granulomas. We therefore aimed to conduct a proof-of-concept study using (68)Ga-DOTANOC to diagnose CS. In addition, we compared diagnostic accuracy and inter-observer variability of (68)Ga-DOTANOC vs. (18)F-FDG PET/CT. Nineteen patients (seven female) with suspected CS were prospectively recruited and dual tracer scanned within 7 days. PET images were reviewed by four expert readers for signs of CS and compared to the reference standard (Japanese Ministry of Health and Welfare CS criteria). CS was diagnosed in 3/19 patients. By consensus, 11/19 (18)F-FDG scans and 0/19 (68)Ga-DOTANOC scans were rated as inconclusive. The sensitivity of (18)F-FDG PET for diagnosing CS was 33 %, specificity was 88 %, PPV was 33 %, NPV was 88 %, and diagnostic accuracy was 79 %. For (68)Ga-DOTANOC, accuracy was 100 %. Inter-observer agreement was poor for (18)F-FDG PET (Fleiss' combined kappa 0.27, NS) and significantly better for (68)Ga-DOTANOC (Fleiss' combined kappa 0.46, p = 0.001). Despite prolonged pre-scan fasting, a large proportion of (18)F-FDG PET/CT images were rated as inconclusive, resulting in low agreement among reviewers and correspondingly poor diagnostic accuracy. By contrast, (68)Ga-DOTANOC PET/CT had excellent diagnostic accuracy with the caveat that inter-observer variability was still significant. Nevertheless, (68)Ga-DOTANOC PET/CT looks very promising as an alternative CS PET tracer. Current Controlled Trials NCT01729169.
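The (18)F-FDG percentages above can be checked against a 2×2 table. The counts used below (1 true positive, 2 false negatives, 2 false positives, 14 true negatives) are not given in the abstract; they are one set of counts for 19 patients with 3 CS cases that happens to reproduce the reported figures, and how inconclusive scans were classified is an open assumption.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the cells of a 2x2 table."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Inferred, not reported: counts consistent with 19 patients and 3 CS cases.
metrics = diagnostic_metrics(tp=1, fn=2, fp=2, tn=14)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f} %")
```

With only three diseased patients, a single reclassified scan moves sensitivity by 33 percentage points, so these estimates carry wide uncertainty.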
Platonov, Pyotr G; Calkins, Hugh; Hauer, Richard N; Corrado, Domenico; Svendsen, Jesper H; Wichter, Thomas; Biernacka, Elżbieta Katarzyna; Saguner, Ardan M; Te Riele, Anneline S J M; Zareba, Wojciech
2016-01-01
Revision of the Task Force diagnostic criteria for arrhythmogenic right ventricular cardiomyopathy/dysplasia (ARVC/D) has increased their sensitivity for the diagnosis of early and familial forms of the disease. The epsilon wave is a major diagnostic criterion in the context of ARVC/D; however, it remains unquantifiable and may therefore leave room for substantial subjective interpretation. The purpose of this study was to assess interobserver agreement in epsilon wave definition and the importance of epsilon waves for ARVC/D diagnosis. Electrocardiographic (ECG) tracings depicting leads V1, V2, and V3 collected from individuals evaluated for ARVC/D (n = 30) were given to panel members, who were asked whether the ECG patterns met the epsilon wave definition outlined by the Task Force diagnostic criteria. The prevalence and importance of epsilon waves for ARVC/D diagnosis were assessed in a pooled data set of patients with definite ARVC/D from European and American registries (n = 815). The number of ECG patterns identified as epsilon waves varied from 5 to 18 per reviewer (median 13 per reviewer). Unanimous agreement was reached for only 10 cases (33%), 2 of which qualified as epsilon waves and 8 as non-epsilon waves by all panel members. In the pooled data set, 106 patients reportedly had epsilon waves (13%). In 105 of 106 patients with epsilon waves (99%), exclusion of epsilon waves from the diagnostic score would not affect the "definite" diagnostic category. Interobserver variability in the assessment of epsilon waves is high; however, the impact of epsilon waves on ARVC/D diagnosis is negligibly low. These results call for caution in the assessment of epsilon waves, especially in patients who would not otherwise meet diagnostic criteria. Copyright © 2016 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.
Investigating Various Thresholds as Immunohistochemistry Cutoffs for Observer Agreement.
Ali, Asif; Bell, Sarah; Bilsland, Alan; Slavin, Jill; Lynch, Victoria; Elgoweini, Maha; Derakhshan, Mohammad H; Jamieson, Nigel B; Chang, David; Brown, Victoria; Denley, Simon; Orange, Clare; McKay, Colin; Carter, Ross; Oien, Karin A; Duthie, Fraser R
2017-10-01
Clinical translation of immunohistochemistry (IHC) biomarkers requires reliable and reproducible cutoffs or thresholds for interpretation of immunostaining. Most IHC biomarker research focuses on the clinical relevance (diagnostic, prognostic, or predictive utility) of cutoffs, with less emphasis on observer agreement using these cutoffs. From the literature, we identified 3 commonly used cutoffs of 10% positive epithelial cells, 20% positive epithelial cells, and moderate to strong staining intensity (+2/+3 hereafter) to use for investigating observer agreement. A series of 36 images of microarray cores stained for 4 different IHC biomarkers, with variable staining intensity and percentage of positive cells, was used for investigating interobserver and intraobserver agreement. Seven pathologists scored the immunostaining in each image using the 3 cutoffs for positive and negative staining. Kappa (κ) statistic was used to assess the strength of agreement for each cutoff. The interobserver agreement between all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.64, 0.59, and 0.62, respectively, for 10%, 20%, and +2/+3 cutoffs. A good agreement was observed for experienced pathologists using the 10% cutoff, and their agreement was statistically higher than for junior pathologists (P=0.02). In addition, the mean intraobserver agreement for all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.71, 0.60, and 0.73, respectively, for 10%, 20%, and +2/+3 cutoffs. For all 3 cutoffs, a positive correlation was observed with perceived ease of interpretation (P<0.003). Finally, cytoplasmic-only staining achieved higher agreement using all 3 cutoffs than mixed staining patterns. All 3 cutoffs investigated achieve reasonable strength of agreement, modestly decreasing interobserver and intraobserver variability in IHC interpretation. 
These cutoffs have previously been used in cancer pathology, and this study provides evidence that these cutoffs can be reproducible between practicing pathologists.
Gimber, Lana Hirai; Travis, R Ing; Takahashi, Jayme M; Goodman, Torrey L; Yoon, Hyo-Chun
2009-01-01
Context: Pulmonary computed tomography angiography (CTA) and the Wells criteria both have interobserver variability in the assessment of pulmonary embolism (PE). Quantitative D-dimer assay findings have been shown to have a high negative predictive value in patients with low pretest probability of PE. Objective: Evaluate roles for clinical probability and CTA in Emergency Department (ED) patients suspected of acute PE but having a low serum D-dimer level. Design: Prospective observational study of ED patients with possible PE who underwent pulmonary CTA and had D-dimer levels ≤1.0 μg/mL. Main Outcome: Clinical probability of PE determined by ED physicians using standard published criteria; pulmonary CTAs read by initial and study radiologists kept unaware of D-dimer results. Results: In 16 months, 744 patients underwent pulmonary CTA, with 347 study participants who had a D-dimer level ≤ 1.0 μg/mL. In one participant, CTA showed a PE that was agreed on by both the initial and study radiologists. In six participants, the initial findings were reported as positive for PE but were not interpreted as positive by the study radiologist. In none of these participants was PE diagnosed on the basis of clinical probability, of findings on ancillary studies and three-month follow-up examination, or by another radiologist, unaware of findings, acting as a tiebreaker. Conclusion: Pulmonary CTA findings positive for acute embolism should be viewed with caution, especially if the suspected PE is in a distal segmental or subsegmental artery in a patient with a serum D-dimer level of ≤1.0 μg/mL. Furthermore, the Wells criteria may be of limited additional value in this group of patients with low D-dimer levels because most will have low or intermediate clinical probability of PE. PMID:20740096
Belli, Maria Luisa; Mori, Martina; Broggi, Sara; Cattaneo, Giovanni Mauro; Bettinardi, Valentino; Dell'Oca, Italo; Fallanca, Federico; Passoni, Paolo; Vanoli, Emilia Giovanna; Calandrino, Riccardo; Di Muzio, Nadia; Picchio, Maria; Fiorino, Claudio
2018-05-01
To investigate the robustness of PET radiomic features (RF) against tumour delineation uncertainty in two clinically relevant situations. Twenty-five head-and-neck (HN) and 25 pancreatic cancer patients previously treated with (18)F-Fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT)-based planning optimization were considered. Seven FDG-based contours were delineated for tumour (T) and positive lymph nodes (N, for HN patients only) following manual (2 observers), semi-automatic (based on SUV maximum gradient: PET_Edge) and automatic (40%, 50%, 60%, 70% SUV_max thresholds) methods. Seventy-three RF (14 of first order and 59 of higher order) were extracted using the CGITA software (v.1.4). The impact of delineation on volume agreement and RF was assessed by DICE and Intra-class Correlation Coefficients (ICC). A large disagreement between the manual and SUV_max methods was found for thresholds ≥50%. Inter-observer variability showed median DICE values between 0.81 (HN-T) and 0.73 (pancreas). Volumes defined by PET_Edge were more consistent with the manual ones than those defined by SUV40%. Regarding RF, 19%/19%/47% of the features showed ICC < 0.80 between observers for HN-N/HN-T/pancreas, mostly in the Voxel-alignment matrix and in the intensity-size zone matrix families. RFs with ICC < 0.80 against manual delineation (taking the worst value) increased to 44%/36%/61% for PET_Edge and to 69%/53%/75% for SUV40%. About 80%/50% of 72 RF were consistent between observers for HN/pancreas patients. PET_Edge was sufficiently robust against manual delineation while SUV40% showed a worse performance. This result suggests the possibility of replacing manual with semi-automatic delineation of HN and pancreas tumours in studies including PET radiomic analyses. Copyright © 2018 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
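The DICE values quoted above measure the overlap between two delineations of the same tumour. A minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), follows; representing each contour as a set of voxel index tuples is an assumption made for illustration, the two masks are invented toy data, and nothing here reflects how the CGITA software or the authors computed it.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two collections of voxel indices."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty contours agree fully
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical contours: a 10 x 10 voxel square, and the same square
# shifted by 2 voxels along x by a second observer.
observer_1 = {(x, y, 0) for x in range(10) for y in range(10)}
observer_2 = {(x, y, 0) for x in range(2, 12) for y in range(10)}

print(dice(observer_1, observer_2))  # 80 shared voxels out of 100 + 100
```

In this toy case a shift of two voxels on a ten-voxel-wide contour gives a Dice value of 0.8, which is in the range of the median inter-observer values reported for HN tumours.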
Overeem, Simon P; Donselaar, Esmé J; Boersen, Jorrit T; Groot Jebbink, Erik; Slump, Cornelis H; de Vries, Jean-Paul P M; Reijnen, Michel M P J
2018-03-01
To assess the dynamic behavior of chimney grafts during the cardiac cycle. Three chimney endovascular aneurysm repair (EVAR) stent-graft configurations (Endurant and Advanta V12, Endurant and Viabahn, and Endurant and BeGraft) were placed in silicone aneurysm models and subjected to physiologic flow. Electrocardiography (ECG)-gated contrast-enhanced computed tomography was used to visualize geometric changes during the cardiac cycle. Endograft and chimney graft surface, gutter volume, chimney graft angulation over the center lumen line, and the D-ratio (the ratio between the lengths of the major and minor axes) were independently assessed by 2 observers at 10 time points in the cardiac cycle. Both gutter volumes and chimney graft geometry changed significantly during the cardiac cycle in all 3 configurations (p<0.001). Gutters and endoleaks were observed in all configurations. The largest gutter volume (232.8 mm(3)) and change in volume (20.7 mm(3)) between systole and diastole were observed in the Endurant-Advanta configuration. These values were 2.7- and 3.0-fold higher, respectively, than in the Endurant-Viabahn configuration and 1.7- and 1.6-fold higher than in the Endurant-BeGraft configuration. The Endurant-Viabahn configuration had the highest D-ratio (right, 1.26-1.35; left, 1.33-1.48), while the Endurant-BeGraft configuration had the lowest (right, 1.11-1.17; left, 1.08-1.15). Assessment of the interobserver variability showed a high correlation (intraclass correlation >0.935) between measurements. Gutter volumes and stent compression are dynamic phenomena that reshape during the cardiac cycle. Compelling differences were observed during the cardiac cycle in all configurations, with the self-expanding (Endurant-Viabahn) chimney EVAR configurations having smaller gutters and less variation in gutter volume during the cardiac cycle yet more stent compression without affecting the chimney graft surface.
Yeung, Debby; Sorbara, Luigina
2018-01-01
It is important to be able to accurately estimate the central corneal clearance when fitting scleral contact lenses. Tools available have intrinsic biases due to the angle of viewing, and therefore an idea of the amount of error in estimation will benefit the fitter. To compare the accuracy of observers' ability to estimate scleral contact lens central corneal clearance (CCC) with biomicroscopy to measurements using slit-lamp imaging and anterior segment optical coherence tomography (AS-OCT). In a Web-based survey with images of four scleral lens fits obtained with a slit-lamp video imaging system, participants were asked to estimate the CCC. Responses were compared with known values of CCC of these images determined with an image-processing program (digital CCC) and using the AS-OCT (AS-OCT CCC). Bland-Altman plots and concordance correlation coefficients were used to assess the agreement of CCC measured by the various methods. Sixty-six participants were categorized for analysis based on the amount of experience with scleral lens fitting into novice, intermediate, or advanced fitters. Comparing the estimated CCC to the digital CCC, all three groups overestimated by an average of +27.3 ± 67.3 μm. The estimated CCC was highly correlated to the digital CCC (0.79, 0.92, and 0.94 for each group, respectively). Compared with the CCC measurements using AS-OCT, the three groups of participants overestimated by +103.3 μm and had high correlations (0.79, 0.93, and 0.94 for each group). Results from this study validate the ability of contact lens practitioners to observe and estimate the CCC in scleral lens fittings through the use of biomicroscopic viewing. Increasing experience with scleral lens fitting does not improve the correlation with measured CCC from digital or the AS-OCT. However, the intermediate and advanced groups display significantly less inter-observer variability compared with the novice group.
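The overestimates above are reported as a mean difference ± standard deviation, which is the form a Bland-Altman analysis produces. A minimal sketch of that calculation follows, assuming the conventional 95% limits of agreement (bias ± 1.96 SD); the clearance values are invented for illustration and are not data from the study, and `bland_altman` is a hypothetical helper name.

```python
from statistics import mean, stdev


def bland_altman(estimates, references):
    """Return (bias, sd, (lower, upper)) for paired measurements."""
    if len(estimates) != len(references):
        raise ValueError("paired measurements must have equal length")
    diffs = [e - r for e, r in zip(estimates, references)]
    bias = mean(diffs)   # mean difference: positive means overestimation
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical central corneal clearance values in micrometres.
estimated = [250, 300, 180, 420]   # observer estimates
measured = [230, 260, 200, 380]    # reference measurements

bias, sd, (lower, upper) = bland_altman(estimated, measured)
print(f"bias = {bias:+.1f} um, SD = {sd:.1f} um")
print(f"95% limits of agreement: {lower:.1f} to {upper:.1f} um")
```

A high correlation coefficient does not rule out a systematic bias, which is why the abstract can report both correlations above 0.9 and a mean overestimate of about 100 μm against AS-OCT.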
Implementation of a Posted Schedule to Increase Class-Wide Interobserver Agreement Assessment
ERIC Educational Resources Information Center
Doucette, Stefanie; DiGennaro Reed, Florence D.; Reed, Derek D.; Maguire, Helena; Marquardt, Heidi
2012-01-01
The present study investigated the impact of an antecedent intervention in the form of a daily posted schedule on the interobserver agreement (IOA) assessment of educational goals implemented within a classroom at a private school serving individuals with disabilities. During baseline, the percentage of academic goals with interobserver agreement…
Doubilet, Peter M; Benson, Carol B
2013-07-01
To assess the interobserver agreement, frequency of occurrence, and prognostic importance of the double sac sign (DSS), intradecidual sign (IDS), and other sonographic findings in early intrauterine pregnancies. We retrospectively identified all sonograms obtained between January 1, 2006, and December 31, 2011, in which: (1) the scan demonstrated an intrauterine fluid collection without a yolk sac or embryo; (2) a follow-up scan confirmed an intrauterine pregnancy; and (3) the first-trimester outcome was known. Each coinvestigator characterized the 199 study sonograms as demonstrating or not demonstrating a DSS or an IDS, based on judgment about whether the scan met published criteria defining these signs. Interobserver agreement was poor for the DSS (κ= 0.24) and IDS (κ= 0.23). Scans frequently demonstrated neither sign: 150 cases (75.4%) if we considered a sign to be present when both investigators graded it as present and 69 cases (34.7%) using the looser criterion that either graded it as present. The presence of a DSS or an IDS was unrelated to the β-human chorionic gonadotropin (β-hCG) value (P > .05, t test, all comparisons). An inner echogenic ring was present in 158 cases (79.4%), and the decidua was brighter peripherally than centrally in 102 (51.3%). The first-trimester outcome was unrelated to the presence of a DSS or an IDS, presence of an inner echogenic ring, or decidual appearance (P > .05, χ(2), all comparisons). The sonographic appearance of early gestational sacs, before visualization of a yolk sac or embryo, is highly variable. The DSS and IDS are often absent; there is poor interobserver agreement regarding these signs; and the prognosis is unrelated to their presence or absence. A round or oval intrauterine fluid collection in a woman with positive β-hCG should be treated as a gestational sac until proven otherwise, regardless of whether it demonstrates a DSS or an IDS.
Veta, Mitko; van Diest, Paul J.; Jiwa, Mehdi; Al-Janabi, Shaimaa; Pluim, Josien P. W.
2016-01-01
Background: Tumor proliferation speed, most commonly assessed by counting of mitotic figures in histological slide preparations, is an important biomarker for breast cancer. Although mitosis counting is routinely performed by pathologists, it is a tedious and subjective task with poor reproducibility, particularly among non-experts. Inter- and intraobserver reproducibility of mitosis counting can be improved when a strict protocol is defined and followed. Previous studies have examined only the agreement in terms of the mitotic count or the mitotic activity score. Studies of the observer agreement at the level of individual objects, which can provide more insight into the procedure, have not been performed thus far. Methods: The development of automatic mitosis detection methods has received large interest in recent years. Automatic image analysis is viewed as a solution for the problem of subjectivity of mitosis counting by pathologists. In this paper we describe the results from an interobserver agreement study between three human observers and an automatic method, and make two unique contributions. For the first time, we present an analysis of the object-level interobserver agreement on mitosis counting. Furthermore, we train an automatic mitosis detection method that is robust with respect to staining appearance variability and compare it with the performance of expert observers on an “external” dataset, i.e. on histopathology images that originate from pathology labs other than the pathology lab that provided the training data for the automatic method. Results: The object-level interobserver study revealed that pathologists often do not agree on individual objects, even if this is not reflected in the mitotic count. The disagreement is larger for objects of smaller size, which suggests that adding a size constraint in the mitosis counting protocol can improve reproducibility. 
The automatic mitosis detection method can perform mitosis counting in an unbiased way, with substantial agreement with human experts. PMID:27529701
Margossian, Renee; Schwartz, Marcy L; Prakash, Ashwin; Wruck, Lisa; Colan, Steven D; Atz, Andrew M; Bradley, Timothy J; Fogel, Mark A; Hurwitz, Lynne M; Marcus, Edward; Powell, Andrew J; Printz, Beth F; Puchalski, Michael D; Rychik, Jack; Shirali, Girish; Williams, Richard; Yoo, Shi-Joon; Geva, Tal
2009-08-01
Assessment of the size and function of a functional single ventricle (FSV) is a key element in the management of patients after the Fontan procedure. Measurement variability of ventricular mass, volume, and ejection fraction (EF) among observers by echocardiography and cardiac magnetic resonance imaging (CMR) and their reproducibility among readers in these patients have not been described. From the 546 patients enrolled in the Pediatric Heart Network Fontan Cross-Sectional Study (mean age 11.9 +/- 3.4 years), 100 echocardiograms and 50 CMR studies were assessed for measurement reproducibility; 124 subjects with paired studies were selected for comparison between modalities. Interobserver agreement for qualitative grading of ventricular function by echocardiography was modest for left ventricular (LV) morphology (kappa = 0.42) and weak for right ventricular (RV) morphology (kappa = 0.12). For quantitative assessment, high intraclass correlation coefficients were found for echocardiographic interobserver agreement (LV 0.87 to 0.92, RV 0.82 to 0.85) of systolic and diastolic volumes, respectively. In contrast, intraclass correlation coefficients for LV and RV mass were moderate (LV 0.78, RV 0.72). The corresponding intraclass correlation coefficients by CMR were high (LV 0.96, RV 0.85). Volumes by echocardiography averaged 70% of CMR values. Interobserver reproducibility for the EF was similar for the 2 modalities. Although the absolute mean difference between modalities for the EF was small (<2%), 95% limits of agreement were wide. In conclusion, agreement between observers of qualitative FSV function by echocardiography is modest. Measurements of FSV volume by 2-dimensional echocardiography underestimate CMR measurements, but their reproducibility is high. Echocardiographic and CMR measurements of FSV EF demonstrate similar interobserver reproducibility, whereas measurements of FSV mass and LV diastolic volume are more reproducible by CMR.
Bishop, Julie Y; Jones, Grant L; Lewis, Brian; Pedroza, Angela
2015-04-01
In treatment of distal third clavicle fractures, the Neer classification system, based on the location of the fracture in relation to the coracoclavicular ligaments, has traditionally been used to determine fracture pattern stability. To determine the intra- and interobserver reliability in the classification of distal third clavicle fractures via standard plain radiographs and the intra- and interobserver agreement in the preferred treatment of these fractures. Cohort study (Diagnosis); Level of evidence, 3. Thirty radiographs of distal clavicle fractures were randomly selected from patients treated for distal clavicle fractures between 2006 and 2011. The radiographs were distributed to 22 shoulder/sports medicine fellowship-trained orthopaedic surgeons. Fourteen surgeons responded and took part in the study. The evaluators were asked to measure the size of the distal fragment, classify the fracture pattern as stable or unstable, assign the Neer classification, and recommend operative versus nonoperative treatment. The radiographs were reordered and redistributed 3 months later. Inter- and intrarater agreement was determined for the distal fragment size, stability of the fracture, Neer classification, and decision to operate. Single variable logistic regression was performed to determine what factors could most accurately predict the decision for surgery. Interrater agreement was fair for distal fragment size, moderate for stability, fair for Neer classification, slight for type IIB and III fractures, and moderate for treatment approach. Intrarater agreement was moderate for distal fragment size categories (κ = 0.50, P < .001) and Neer classification (κ = 0.42, P < .001) and substantial for stable fracture (κ = 0.65, P < .001) and decision to operate (κ = 0.65, P < .001). Fracture stability was the best predictor of treatment, with 89% accuracy (P < .001). Fracture stability determination and the decision to operate had the highest interobserver agreement. 
Fracture stability was the key determinant of treatment, rather than the Neer classification system or the size of the distal fragment. © 2015 The Author(s).
Olsen, Cody S; Kuppermann, Nathan; Jaffe, David M; Brown, Kathleen; Babcock, Lynn; Mahajan, Prashant V; Leonard, Julie C
2015-04-01
The objective was to describe the interobserver agreement between trained chart reviewers and physician reviewers in a multicenter retrospective chart review study of children with cervical spine injuries (CSIs). Medical records of children younger than 16 years old with cervical spine radiography from 17 Pediatric Emergency Care Applied Research Network (PECARN) hospitals from years 2000 through 2004 were abstracted by trained reviewers for a study aimed to identify predictors of CSIs in children. Independent physician-reviewers abstracted patient history and clinical findings from a random sample of study patient medical records at each hospital. Interobserver agreement was assessed using percent agreement and the weighted kappa (κ) statistic, with lower 95% confidence intervals. Moderate or better agreement (κ > 0.4) was achieved for most candidate CSI predictors, including altered mental status (κ = 0.87); focal neurologic findings (κ = 0.74); posterior midline neck tenderness (κ = 0.74); any neck tenderness (κ = 0.89); torticollis (κ = 0.79); complaint of neck pain (κ = 0.83); history of loss of consciousness (κ = 0.89); nonambulatory status (κ = 0.74); and substantial injuries to the head (κ = 0.50), torso/trunk (κ = 0.48), and extremities (κ = 0.59). High-risk mechanisms showed near-perfect agreement (diving, κ = 1.0; struck by car, κ = 0.93; other motorized vehicle crash, κ = 0.93; fall, κ = 0.92; high-risk motor vehicle collision, κ = 0.89; hanging, κ = 0.80). Fair agreement was found for clotheslining mechanisms (κ = 0.36) and substantial face injuries (κ = 0.40). Most retrospectively assessed variables thought to be predictive of CSIs in blunt trauma-injured children had at least moderate interobserver agreement, suggesting that these data are sufficiently valid for use in identifying potential predictors of CSI. © 2015 by the Society for Academic Emergency Medicine.
Liu, Chao; Cai, Hong-Xin; Zhang, Jian-Feng; Ma, Jian-Jun; Lu, Yin-Jiang; Fan, Shun-Wu
2014-03-01
The high-intensity zone (HIZ) on magnetic resonance imaging (MRI) has been studied for more than 20 years, but its diagnostic value in low back pain (LBP) is limited by the high incidence in asymptomatic subjects. Little effort has been made to improve the objective assessment of HIZ. To develop quantitative measurements for HIZ and estimate intra- and interobserver reliability and to clarify different signal intensity of HIZ in patients with or without LBP. A measurement reliability and prospective comparative study. A consecutive series of patients with LBP between June 2010 and May 2011 (group A) and a successive series of asymptomatic controls during the same period (group B). Incidence of HIZ; quantitative measures, including area of disc, area and signal intensity of HIZ, and magnetic resonance imaging index; and intraclass correlation coefficients (ICCs) for intra- and interobserver reliability. On the basis of HIZ criteria, a series of quantitative dimension and signal intensity measures was developed for assessing HIZ. Two experienced spine surgeons traced the region of interest twice within 4 weeks for assessment of the intra- and interobserver reliability. The quantitative variables were compared between groups A and B. There were 72 patients with LBP and 79 asymptomatic controls enrolled in this study. The prevalence of HIZ in group A and group B was 45.8% and 20.2%, respectively. The intraobserver agreement was excellent for the quantitative measures (ICC=0.838-0.977), as was the interobserver reliability (ICC=0.809-0.935). The mean signal of HIZ in group A was significantly brighter than in group B (57.55±14.04% vs. 45.61±7.22%, p=.000). There was no statistical difference in the area of disc and HIZ between the two groups. The magnetic resonance imaging index was found to be higher in group A when compared with group B (3.94±1.71 vs. 3.06±1.50), but with a p value of .050. 
A series of quantitative measurements for HIZ was established and demonstrated excellent intra- and interobserver reliability. The signal intensity of HIZ differed between patients with and without LBP, and a significantly brighter signal was observed in symptomatic subjects. Copyright © 2014 Elsevier Inc. All rights reserved.
Interobserver Variation in Response Evaluation Criteria in Solid Tumors 1.1.
Karmakar, Arunabha; Kumtakar, Apeksha; Sehgal, Himanshu; Kumar, Savith; Kalyanpur, Arjun
2018-06-19
Response Evaluation Criteria in Solid Tumors (RECIST 1.1) is the gold standard for imaging response evaluation in cancer trials. We sought to evaluate consistency of applying RECIST 1.1 between 2 conventionally trained radiologists, designated as A and B; identify reasons for variation; and reconcile these differences for future studies. The study was approved as an institutional quality check exercise. Since no identifiable patient data were collected or used, a waiver of informed consent was granted. Imaging case report forms of a concluded multicentric breast cancer trial were retrospectively reviewed. Cohen's kappa was used to rate interobserver agreement in Response Evaluation Data (target response, nontarget response, new lesions, overall response). Significant variations were reassessed by a senior radiologist to extrapolate reasons for disagreement. Methods to improve agreement were similarly ascertained. Sixty-one cases with a total of 82 data-pairs were evaluated (35 data-pairs in visit 5, 47 in visit 9). Both radiologists showed moderate agreement in target response (n = 82; κ = 0.477; 95% confidence interval [CI]: 0.314-0.640), nontarget response (n = 82; κ = 0.578; 95% CI: 0.213-0.944) and overall response evaluation in both visits (n = 82; κ = 0.510; 95% CI: 0.344-0.676). Further assessment demonstrated a "prevalence effect" of kappa in some cases, which led to underestimation of agreement. Percent agreement of overall response was 74.39% while percent variation was 25.6%. Differences in interpreting RECIST 1.1 and in radiological image interpretation were the primary sources of variation. The commonest overall response was "Partial Response" (Rad A: 45/82; Rad B: 63/82). In spite of moderate interobserver agreement, qualitative interpretation differences in some cases increased interobserver variability. 
Protocols such as adjudication, which reduce easily avoidable inconsistencies, are or should be part of the standard operating procedure in imaging institutions. Based on our findings, a standard checklist has been developed to help reduce the interpretation error margin for future studies. Such checklists may improve interobserver agreement in the preadjudication phase, thereby improving quality of results and reducing the adjudication-per-case ratio. Improving data reliability when using RECIST 1.1 will be reflected in better cancer clinical trial outcomes. A checklist can be of use to imaging centers to assess and improve their own processes. Copyright © 2018. Published by Elsevier Inc.
Varela, Gonzalo; Jiménez, Marcelo F; Novoa, Nuria Maria; Aranda, José Luis
2009-01-01
Since there are no data in the literature regarding variability in the management of postoperative pleural drainages, we have designed a prospective randomized study aimed at measuring inter-observer variability in deciding when to withdraw chest tubes after lung resection and to evaluate if the use of an electronic device to measure postoperative air leak decreases clinical practice variations. Sixty-one patients undergoing pulmonary resection were randomly assigned to one of the following groups: digital group (electronic measure of pleural air leak using Millicore AB DigiVent chest drainage system) or traditional group (standard water seal pleural chamber). Chest tube withdrawal criteria were established in advance. During morning rounds, two thoracic surgeons with comparable clinical experience and blinded to the decision of their counterpart, evaluated chest tube withdrawal criteria and noted whether the tube should be withdrawn or not. Inter-observer variability kappa index and global, positive, and negative agreement rates were calculated on 2 x 2 tables. Each observation episode was considered in the calculation. Fifty-four observations were recorded in the traditional group. Kappa coefficient was 0.37 (overall agreement rate: 0.58; positive agreement rate: 0.72; and negative agreement rate: 0.64). In the digital group, 67 observations were recorded. Kappa coefficient was 0.88 (overall agreement rate: 0.94; positive agreement rate 0.94; and negative agreement rate 0.94). We have demonstrated a high rate of disagreement related to the indication to remove chest tubes after lung resection and the improvement of the agreement rate with the use of an electronic device to measure postoperative air leak and pleural pressures.
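The kappa index and the overall, positive and negative agreement rates named above all come from the same 2 x 2 table of paired decisions. The sketch below uses the standard formulas; the cell counts are a hypothetical table with 67 observations, chosen only because it reproduces the rounded values reported for the digital group, and it is not the table from the study. The name `agreement_2x2` is likewise an assumption.

```python
def agreement_2x2(a, b, c, d):
    """Agreement statistics for two observers making a yes/no decision.

    a: both say yes          b: observer 1 yes, observer 2 no
    c: observer 1 no, observer 2 yes          d: both say no
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance from each observer's marginal rates.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return {
        # Undefined (division by zero) when expected == 1, i.e. when both
        # observers give the same answer for every case.
        "kappa": (observed - expected) / (1 - expected),
        "overall": observed,
        "positive": 2 * a / (2 * a + b + c),
        "negative": 2 * d / (2 * d + b + c),
    }


# Hypothetical counts, NOT reported in the abstract (67 observations).
stats = agreement_2x2(a=31, b=2, c=2, d=32)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Kappa is lower than the overall agreement rate because it subtracts the agreement two observers would reach by chance given how often each says yes, which in this balanced table is about one half.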
Morbach, Caroline; Gelbrich, Götz; Breunig, Margret; Tiffe, Theresa; Wagner, Martin; Heuschmann, Peter U; Störk, Stefan
2018-02-14
Variability related to image acquisition and interpretation is an important issue of echocardiography in clinical trials. Nevertheless, there is no broadly accepted standard method for quality assessment of echocardiography in clinical research reports. We present analyses based on the echocardiography quality-assurance program of the ongoing STAAB cohort study (characteristics and course of heart failure stages A-B and determinants of progression). In 43 healthy individuals (mean age 50 ± 14 years; 18 females), duplicate echocardiography scans were acquired and mutually interpreted by one of three trained sonographers and an EACVI certified physician, respectively. Acquisition (AcV), interpretation (InV), and inter-observer variability (IOV; i.e., variability between the acquisition-interpretation sequences of two different observers) were determined for selected M-mode, B-mode, and Doppler parameters. We calculated Bland-Altman upper 95% limits of absolute differences, implying that 95% of measurement differences were smaller than or equal to the given value: e.g. LV end-diastolic volume (mL): 25.0, 25.0, 27.9; septal e' velocity (cm/s): 3.03, 1.25, 3.58. Further, 90, 85, and 80% upper limits of absolute differences were determined for the respective parameters. Both acquisition and interpretation independently and sizably contributed to IOV. As such, separate assessment of AcV and InV is likely to aid in echocardiography training and quality-assurance. Our results further suggest routinely determining IOV in clinical trials as a comprehensive measure of imaging quality. The derived 95, 90, 85, and 80% upper limits of absolute differences are suggested as reproducibility targets for future studies, thus contributing to the international efforts of standardization in quality-assurance.
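The "upper limit of absolute differences" described above can be read as the value below which a given share of absolute paired differences falls. The sketch below assumes an empirical nearest-rank percentile; the abstract does not say whether the authors used an empirical or a parametric estimate, and the function and parameter names are hypothetical.

```python
def upper_limit_abs_diff(first, second, coverage_pct=95):
    """Smallest value v such that at least coverage_pct percent of the
    absolute paired differences |first[i] - second[i]| are <= v."""
    diffs = sorted(abs(x - y) for x, y in zip(first, second))
    # Nearest-rank percentile, computed with integer arithmetic.
    rank = (coverage_pct * len(diffs) + 99) // 100
    return diffs[rank - 1]


# Hypothetical duplicate measurements (e.g. a volume in mL).
scan_1 = [110, 95, 102, 120, 88, 131, 99, 105, 115, 92]
scan_2 = [104, 99, 110, 118, 90, 120, 101, 100, 117, 95]
for pct in (95, 90, 85, 80):
    print(pct, upper_limit_abs_diff(scan_1, scan_2, pct))
```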
Sánchez-Sánchez, M M; Sánchez-Izquierdo, R; Sánchez-Muñoz, E I; Martínez-Yegles, I; Fraile-Gamo, M P; Arias-Rivera, S
2014-01-01
The Glasgow coma scale (GCS) is a common tool used for neurological assessment of critically ill patients. Despite its widespread use, the GCS has some limitations, as different observers may sometimes score the same response differently. To evaluate the interobserver agreement, among intensive care nurses with a minimum of 3 years' experience, both in the overall estimate of GCS and for each of its components. Prospective observational study including 110 neurological and/or neurosurgical patients conducted in an 18-bed critical care unit, from October 2010 until December 2012. Registered variables: demographic characteristics, reason for admission, overall GCS and its components. The neurological evaluation was conducted by a minimum of 3 nurses. One of them applied an algorithm and consensual assessment technique and all, independently, scored the response to stimuli. Interobserver agreement was measured using the intraclass correlation coefficient (ICC) with a 95% confidence interval (CI). The study was approved by the Ethics Committee for Clinical Trials. The intraclass correlation coefficient (confidence interval) for the scale was: Overall GCS: 0.989 (0.985-0.992); ocular response: 0.981 (0.974-0.986); verbal response: 0.971 (0.960-0.979); motor response: 0.987 (0.982-0.991). In our cohort of patients we observed a high level of consistency in the application of both the overall GCS and each of its components. Copyright © 2013 Elsevier España, S.L. y SEEIUC. All rights reserved.
Evaluating causes of error in landmark-based data collection using scanners
Shearer, Brian M.; Cooke, Siobhán B.; Halenar, Lauren B.; Reber, Samantha L.; Plummer, Jeannette E.; Delson, Eric
2017-01-01
In this study, we assess the precision, accuracy, and repeatability of craniodental landmarks (Types I, II, and III, plus curves of semilandmarks) on a single macaque cranium digitally reconstructed with three different surface scanners and a microCT scanner. Nine researchers with varying degrees of osteological and geometric morphometric knowledge landmarked ten iterations of each scan (40 total) to test the effects of scan quality, researcher experience, and landmark type on levels of intra- and interobserver error. Two researchers additionally landmarked ten specimens from seven different macaque species using the same landmark protocol to test the effects of the previously listed variables relative to species-level morphological differences (i.e., observer variance versus real biological variance). Error rates within and among researchers by scan type were calculated to determine whether or not data collected by different individuals or on different digitally rendered crania are consistent enough to be used in a single dataset. Results indicate that scan type does not impact rate of intra- or interobserver error. Interobserver error is far greater than intraobserver error among all individuals, and is similar in variance to that found among different macaque species. Additionally, experience with osteology and morphometrics both positively contribute to precision in multiple landmarking sessions, even where less experienced researchers have been trained in point acquisition. Individual training increases precision (although not necessarily accuracy), and is highly recommended in any situation where multiple researchers will be collecting data for a single project. PMID:29099867
Scapula fractures: interobserver reliability of classification and treatment.
Neuhaus, Valentin; Bot, Arjan G J; Guitton, Thierry G; Ring, David C; Abdel-Ghany, Mahmoud I; Abrams, Jeffrey; Abzug, Joshua M; Adolfsson, Lars E; Balfour, George W; Bamberger, H Brent; Barquet, Antonio; Baskies, Michael; Batson, W Arnold; Baxamusa, Taizoon; Bayne, Grant J; Begue, Thierry; Behrman, Michael; Beingessner, Daphne; Biert, Jan; Bishop, Julius; Alves, Mateus Borges Oliveira; Boyer, Martin; Brilej, Drago; Brink, Peter R G; Brunton, Lance M; Buckley, Richard; Cagnone, Juan Carlos; Calfee, Ryan P; Campinhos, Luiz Augusto B; Cassidy, Charles; Catalano, Louis; Chivers, Karel; Choudhari, Pradeep; Cimerman, Matej; Conflitti, Joseph M; Costanzo, Ralph M; Crist, Brett D; Cross, Brian J; Dantuluri, Phani; Darowish, Michael; de Bedout, Ramon; DeCoster, Thomas; Dennison, David G; DeNoble, Peter H; DeSilva, Gregory; Dienstknecht, Thomas; Duncan, Scott F; Duralde, Xavier A; Durchholz, Holger; Egol, Kenneth; Ekholm, Carl; Elias, Nelson; Erickson, John M; Esparza, J Daniel Espinosa; Fernandes, C H; Fischer, Thomas J; Fischmeister, Martin; Forigua Jaime, E; Getz, Charles L; Gilbert, Richard S; Giordano, Vincenzo; Glaser, David L; Gosens, Taco; Grafe, Michael W; Filho, Jose Eduardo Grandi Ribeiro; Gray, Robert R L; Gulotta, Lawrence V; Gummerson, Nigel William; Hammerberg, Eric Mark; Harvey, Edward; Haverlag, R; Henry, Patrick D G; Hobby, Jonathan L; Hofmeister, Eric P; Hughes, Thomas; Itamura, John; Jebson, Peter; Jenkinson, Richard; Jeray, Kyle; Jones, Christopher M; Jones, Jedediah; Jubel, Axel; Kaar, Scott G; Kabir, K; Kaplan, F Thomas D; Kennedy, Stephen A; Kessler, Michael W; Kimball, Hervey L; Kloen, Peter; Klostermann, Cyrus; Kohut, Georges; Kraan, G A; Kristan, Anze; Loebenberg, Mark I; Malone, Kevin J; Marsh, L; Martineau, Paul A; McAuliffe, John; McGraw, Iain; Mehta, Samir; Merchant, Milind; Metzger, Charles; Meylaerts, S A; Miller, Anna N; Wolf, Jennifer Moriatis; Murachovsky, Joel; Murthi, Anand; Nancollas, Michael; Nolan, Betsy M; Omara, Timothy; Omid, Reza; 
Ortiz, Jose A; Overbeck, Joachim P; Castillo, Alberto Pérez; Pesantez, Rodrigo; Polatsch, Daniel; Porcellini, G; Prayson, Michael; Quell, M; Ragsdell, Matthew M; Reid, James G; Reuver, J M; Richard, Marc J; Richardson, Martin; Rizzo, Marco; Rowinski, Sergio; Rubio, Jorge; Guerrero, Carlos G Sánchez; Satora, Wojciech; Schandelmaier, Peter; Scheer, Johan H; Schmidt, Andrew; Schubkegel, Todd A; Schulte, Leah M; Schumer, Evan D; Sears, Benjamin W; Shafritz, Adam B; Shortt, Nicholas L; Siff, Todd; Silva, Dario Mejia; Smith, Raymond Malcolm; Spruijt, Sander; Stein, Jason A; Pemovska, Emilija Stojkovska; Streubel, Philipp N; Swigart, Carrie; Swiontkowski, Marc; Thomas, George; Tolo, Eric T; Turina, Matthias; Tyllianakis, Minos; van den Bekerom, Michel P J; van der Heide, Huub; van de Sande, M A J; van Eerten, P V; Verbeek, Diederik O F; Hoffmann, David Victoria; Vochteloo, A J H; Wagenmakers, Robert; Wall, Christopher J; Wallensten, Richard; Wascher, Daniel C; Weiss, Lawrence; Wiater, J Michael; Wills, Brian P D; Wint, Jeffrey; Wright, Thomas; Young, Jason P; Zalavras, Charalampos; Zura, Robert D; Zyto, Karol
2014-03-01
There is substantial variation in the classification and management of scapula fractures. The first purpose of this study was to analyze the interobserver reliability of the OTA/AO classification and the New International Classification for Scapula Fractures. The second purpose was to assess the proportion of agreement among orthopaedic surgeons on operative or nonoperative treatment. Web-based reliability study. Independent orthopaedic surgeons from several countries were invited to classify scapular fractures in an online survey. One hundred three orthopaedic surgeons evaluated 35 movies of three-dimensional computerized tomography reconstruction of selected scapular fractures, representing a full spectrum of fracture patterns. Fleiss kappa (κ) was used to assess the reliability of agreement between the surgeons. The overall agreement on the OTA/AO classification was moderate for the types (A, B, and C, κ = 0.54) with a 71% proportion of rater agreement (PA) and for the 9 groups (A1 to C3, κ = 0.47) with a 57% PA. For the New International Classification, the agreement about the intraarticular extension of the fracture (Fossa (F), κ = 0.79) was substantial and the agreement about a fractured body (Body (B), κ = 0.57) or process was moderate (Process (P), κ = 0.53); however, PAs were more than 81%. The agreement on the treatment recommendation was moderate (κ = 0.57) with a 73% PA. The New International Classification was more reliable. Body and process fractures generated more disagreement than intraarticular fractures and need further clear definitions.
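Fleiss' kappa, used above for agreement among many raters, can be computed from a table of category counts per case. The following is a minimal sketch of the standard formula, assuming every case is rated by the same number of raters; the function name and the toy data are illustrative and not taken from the study.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who put case i into category j.
    Every row must sum to the same number of raters."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Share of all ratings that fall in each category.
    p_cat = [sum(row[j] for row in counts) / (n_cases * n_raters)
             for j in range(n_categories)]
    # Per-case agreement: fraction of rater pairs that agree.
    p_case = [(sum(c * c for c in row) - n_raters)
              / (n_raters * (n_raters - 1)) for row in counts]
    observed = sum(p_case) / n_cases
    expected = sum(p * p for p in p_cat)
    return (observed - expected) / (1 - expected)


# Toy data: 4 cases, 5 raters, 3 categories (e.g. types A, B, C).
ratings = [[5, 0, 0], [3, 2, 0], [0, 4, 1], [1, 1, 3]]
print(round(fleiss_kappa(ratings), 3))
```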
Plain film measurement error in acute displaced midshaft clavicle fractures
Archer, Lori Anne; Hunt, Stephen; Squire, Daniel; Moores, Carl; Stone, Craig; O’Dea, Frank; Furey, Andrew
2016-01-01
Background Clavicle fractures are common and optimal treatment remains controversial. Recent literature suggests operative fixation of acute displaced mid-shaft clavicle fractures (DMCFs) shortened more than 2 cm improves outcomes. We aimed to identify correlation between plain film and computed tomography (CT) measurement of displacement and the inter- and intraobserver reliability of repeated radiographic measurements. Methods We obtained radiographs and CT scans of patients with acute DMCFs. Three orthopedic staff and 3 residents measured radiographic displacement at time zero and 2 weeks later. The CT measurements identified absolute shortening in 3 dimensions (by subtracting the length of the fractured from the intact clavicle). We then compared shortening measured on radiographs and shortening measured in 3 dimensions on CT. Interobserver and intraobserver reliability were calculated. Results We reviewed the fractures of 22 patients. Bland–Altman repeatability coefficient calculations indicated that radiograph and CT measurements of shortening could not be correlated owing to an unacceptable amount of measurement error (6 cm). Interobserver reliability for plain radiograph measurements was excellent (Cronbach α = 0.90). Likewise, intraobserver reliabilities for plain radiograph measurements as calculated with paired t tests indicated excellent correlation (p > 0.05 in all but 1 observer [p = 0.04]). Conclusion To establish shortening as an indication for DMCF fixation, reliable measurement tools are required. The low correlation between plain film and CT measurements we observed suggests further research is necessary to establish what imaging modality reliably predicts shortening. Our results indicate weak correlation between radiograph and CT measurement of acute DMCF shortening. PMID:27438054
Marawar, Satyajit V; Madom, Ian A; Palumbo, Mark; Tallarico, Richard A; Ordway, Nathaniel R; Metkar, Umesh; Wang, Dongliang; Green, Adam; Lavelle, William F
2017-01-01
Treating surgeon's visual assessment of axial MRI images to ascertain the degree of stenosis has a critical impact on surgical decision-making. The purpose of this study was to prospectively analyze the impact of surgeon experience on inter-observer and intra-observer reliability of assessing severity of spinal stenosis on MRIs by spine surgeons directly involved in surgical decision-making. Seven fellowship-trained spine surgeons reviewed MRI studies of 30 symptomatic patients with lumbar stenosis and graded the stenosis in the central canal, the lateral recess and the foramen at T12-L1 to L5-S1 as none, mild, moderate or severe. No specific instructions were provided as to what constituted mild, moderate, or severe stenosis. Two surgeons were "senior" (>fifteen years of practice experience), two were "intermediate" (>four years of practice experience), and three were "junior" (<one year of practice experience). The concordance correlation coefficient (CCC) was calculated to assess inter-observer reliability. Seven MRI studies were duplicated and randomly re-read to evaluate intra-observer reliability. Surgeon experience was found to be a strong predictor of inter-observer reliability. Inter-observer reliability in the senior group was significantly higher than in the "junior" group when assessing central (p<0.001), foraminal (p=0.005) and lateral (p=0.001) stenosis. The senior group also showed significantly higher inter-observer reliability than the intermediate group when assessing foraminal stenosis (p=0.036). For intra-observer reliability, the results were contrary to those found for inter-observer reliability. Inter-observer reliability of assessing stenosis on MRIs increases with surgeon experience. Lower intra-observer reliability values among the senior group, although not clearly explained, may be due to the small number of MRIs evaluated and the quality of the MRI images. Level of evidence: Level 3.
Gerritsen, Arja; Bollen, Thomas L; Nio, C Yung; Molenaar, I Quintus; Dijkgraaf, Marcel G W; van Santvoort, Hjalmar C; Offerhaus, G Johan; Brosens, Lodewijk A; Biermann, Katharina; Sieders, Egbert; de Jong, Koert P; van Dam, Ronald M; van der Harst, Erwin; van Goor, Harry; van Ramshorst, Bert; Bonsing, Bert A; de Hingh, Ignace H; Gerhards, Michael F; van Eijck, Casper H; Gouma, Dirk J; Borel Rinkes, Inne H M; Busch, Olivier R C; Besselink, Marc G H
2015-07-01
Previous studies have shown that 5-14% of patients undergoing pancreatoduodenectomy for suspected malignancy ultimately are diagnosed with benign disease. A "pancreatic mass" on computed tomography (CT) is considered to be the strongest predictor of malignancy, but studies describing its diagnostic value are lacking. The aim of this study was to determine the diagnostic value of a pancreatic mass on CT in patients with presumed pancreatic cancer, as well as the interobserver agreement among radiologists and the additional value of reassessment by expert-radiologists. Reassessment of preoperative CT scans was performed within a previously described multicenter retrospective cohort study in 344 patients undergoing pancreatoduodenectomy for suspected malignancy (2003-2010). Preoperative CT scans were reassessed by 2 experienced abdominal radiologists separately and subsequently in a consensus meeting, after defining a pancreatic mass as "a measurable space occupying soft tissue density, except for an enlarged papilla or focal steatosis". CT scans of 86 patients with benign and 258 patients with (pre)malignant disease were reassessed. In 66% of patients a pancreatic mass was reported in the original CT report, versus 48% and 50% on reassessment by the 2 expert radiologists separately and 44% in consensus (P < .001 vs original report). Interobserver agreement between the original CT report and expert consensus was fair (kappa = 0.32, 95% confidence interval 0.23-0.42). Among both expert-radiologists agreement was moderate (kappa = 0.47, 95% confidence interval 0.38-0.56), with disagreement on the presence of a pancreatic mass in 29% of cases. The specificity for malignancy of pancreatic masses identified in expert consensus was twice as high compared with the original CT report (87% vs 42%, respectively). Positive predictive value increased to 98% after expert consensus, but negative predictive value was low (12%). 
Clinicians need to be aware of potential considerable disagreement among radiologists about the presence of a pancreatic mass. The specificity for malignancy doubled by expert radiologist reassessment when a uniform definition of "pancreatic mass" was used. Copyright © 2015 Elsevier Inc. All rights reserved.
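The diagnostic measures reported above (specificity, positive and negative predictive value) all derive from a 2 x 2 table of CT finding against final diagnosis. A minimal sketch with hypothetical counts follows; the abstract does not give the underlying cell counts, so the numbers here are for illustration only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """tp: mass reported, malignant    fp: mass reported, benign
    fn: no mass, malignant             tn: no mass, benign"""
    return {
        "sensitivity": tp / (tp + fn),  # malignant cases with a mass
        "specificity": tn / (tn + fp),  # benign cases without a mass
        "ppv": tp / (tp + fp),          # reported masses that are malignant
        "npv": tn / (tn + fn),          # "no mass" reads that are benign
    }


# Hypothetical counts, not taken from the study.
print(diagnostic_metrics(tp=90, fp=10, fn=30, tn=70))
```

A high positive predictive value combined with a low negative predictive value, as in the abstract, is what one would expect when most patients in the cohort have malignant disease, since predictive values depend on prevalence while sensitivity and specificity do not.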
Tane, Shinya; Ohno, Yoshiharu; Hokka, Daisuke; Ogawa, Hiroyuki; Tauchi, Shunsuke; Nishio, Wataru; Yoshimura, Masahiro; Okita, Yutaka; Maniwa, Yoshimasa
2013-12-01
The purpose of this study was to compare the efficacy of 320-detector row computed tomography (CT) with that of 64-detector row CT for three-dimensional assessment of pulmonary vasculature of candidates for pulmonary segmentectomy. We included 32 patients who underwent both 320- and 64-detector CT before pulmonary segmentectomy, which was performed by cutting the pulmonary artery and bronchi of the affected segment followed by dissection of the intersegmental plane along the intersegmental vein. Before the operation, three-dimensional pulmonary vasculature images were obtained for each patient, and the arteries and intersegmental veins of the affected segments were identified. Two thoracic surgeons independently assessed the vessels with visual scoring systems, and kappa analysis was used to determine interobserver agreement. The Wilcoxon signed-rank test was used to compare the visual scores for the assessment of the visualization capabilities of the two methods. In addition, the final determination of pulmonary vasculature at a given site was made by consensus from thoracic surgeons during operation, and receiver operating characteristic analysis was performed to compare the efficacy of the two methods for pulmonary vasculature assessment. Sensitivity, specificity and accuracy of either method were also compared by means of McNemar's test. Among the 32 cases, there were no operative complications, but 1 patient died of postoperative idiopathic interstitial pneumonia. Visualization scores for the pulmonary vessels were significantly higher for 320- than those for 64-detector CT (P < 0.0001 for the affected arteries and P < 0.0001 for the intersegmental veins). As for pulmonary vasculature assessment, the areas under the curve showed no statistically significant differences between the two methods, while the specificity and accuracy of intersegmental vein assessment were significantly better for 320- than those for 64-detector row CT (P < 0.05). 
Interobserver agreement for the assessment yielded by either method was almost perfect for all cases. Three hundred and twenty-detector row CT is more useful than conventional 64-detector row CT for preoperative three-dimensional assessment of pulmonary vasculature, especially when we identify the intersegmental veins, in candidates for pulmonary segmentectomy.
In vivo analysis of the iris thickness by spectral domain optical coherence tomography.
Invernizzi, Alessandro; Cigada, Mario; Savoldi, Luisa; Cavuto, Silvio; Fontana, Luigi; Cimino, Luca
2014-09-01
To assess the effectiveness of spectral domain optical coherence tomography (SD-OCT) in providing in vivo measurements of iris thickness in healthy and pathological subjects. 14 healthy volunteers and 14 patients with unilateral Fuchs' uveitis were enrolled in the study. The two groups were comparable for age, gender and race. Each subject underwent complete clinical examination and anterior segment SD-OCT imaging in both eyes. SD-OCT scans of the iris were performed following a cross-sectional pattern. Iris thickness values were obtained using a purposely developed software-based analysis of OCT images. Measurements were carried out twice by two trained independent operators to assess intraobserver and interobserver repeatability. Analysis of iris thickness was conducted in four main quadrants: superior, inferior, nasal and temporal. Iris thickness values from normal subjects were compared with the ones measured in the affected and fellow eyes of patients with Fuchs' uveitis. Iris thickness measurements showed good intraobserver and interobserver repeatability (intraclass correlation coefficient >0.971). Superior and temporal iris sectors showed respectively thickest and thinnest values in all groups. In healthy eyes, iris thickness ranged from 327.92±37.29 μm temporally to 405.25±48.49 μm superiorly. Iris thickness measurements in the affected eyes of Fuchs' uveitis patients ranged from 285.48±56.02 μm temporally to 376.12±60.97 μm superiorly. Multiple comparison analysis showed iris thickness values to be significantly lower in eyes affected by Fuchs' uveitis than both in fellow eyes (p<0.001) of the same patients and in healthy eyes (p=0.0074). SD-OCT is a suitable technique for iris thickness assessment. Thickness analysis must be carried out using a sectorial approach, taking into consideration anatomical variations existing between different iris regions. 
SD-OCT is a potentially useful tool for detecting iris thickness variations induced by pathological conditions such as Fuchs' uveitis. Published by the BMJ Publishing Group Limited.
Schellhaas, Barbara; Hammon, Matthias; Strobel, Deike; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger S; Cavallaro, Alexander; Janka, Rolf; Neurath, Markus F; Uder, Michael; Seuss, Hannes
2018-04-19
We compared the interobserver agreement for the recently introduced contrast-enhanced ultrasound (CEUS)-based algorithm CEUS-LI-RADS (Liver Imaging Reporting and Data System) versus the well-established magnetic resonance imaging (MRI)-LI-RADS for non-invasive diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 50 high-risk patients (mean age 66.2 ± 11.8 years; 39 male) were assessed retrospectively with CEUS and MRI. Two independent observers reviewed CEUS and MRI examinations, separately, classifying observations according to CEUS-LI-RADSv.2016 and MRI-LI-RADSv.2014. Interobserver agreement was assessed with Cohen's kappa. Forty-three lesions were HCCs; two were intrahepatic cholangiocarcinomas; five were benign lesions. Arterial phase hyperenhancement was perceived less frequently with CEUS than with MRI (37/50 / 38/50 lesions = 74%/78% [CEUS; observer 1/observer 2] versus 46/50 / 44/50 lesions = 92%/88% [MRI; observer 1/observer 2]). Washout appearance was observed in 34/50 / 20/50 lesions = 68%/40% with CEUS and 31/50 / 31/50 lesions = 62%/62% with MRI. Interobserver agreement was moderate for arterial hyperenhancement (κ = 0.511/0.565 [CEUS/MRI]) and "washout" (κ = 0.490/0.582 [CEUS/MRI]), fair for CEUS-LI-RADS category (κ = 0.309) and substantial for MRI-LI-RADS category (κ = 0.609). Intermodality agreement was fair for arterial hyperenhancement (κ = 0.329), slight to fair for "washout" (κ = 0.202) and LI-RADS category (κ = 0.218). Conclusion: Interobserver agreement is substantial for MRI-LI-RADS and only fair for CEUS-LI-RADS. This is mostly because interobserver agreement in the perception of washout appearance is better in MRI than in CEUS. Further refinement of the LI-RADS algorithms and increasing education and practice may be necessary to improve the concordance between CEUS and MRI for the final LI-RADS categorization. 
• CEUS-LI-RADS and MRI-LI-RADS enable standardized non-invasive diagnosis of HCC in high-risk patients. • With CEUS, interobserver agreement is better for arterial hyperenhancement than for "washout". • Interobserver agreement for major features is moderate for both CEUS and MRI. • Interobserver agreement for LI-RADS category is substantial for MRI, and fair for CEUS. • Interobserver agreement for CEUS-LI-RADS will presumably improve with ongoing use of the algorithm.
Three-dimensional analysis of third molar development to estimate age of majority.
Márquez-Ruiz, Ana Belén; Treviño-Tijerina, María Concepción; González-Herrera, Lucas; Sánchez, Belén; González-Ramírez, Amanda Rocío; Valenzuela, Aurora
2017-09-01
Third molars are one of the few biological markers available for age estimation in undocumented juveniles close to the legal age of majority, assuming an age of 18 years as the most frequent legal demarcation between child and adult status. To obtain more accurate visualization and evaluation of third molar mineralization patterns from computed tomography images, a new software application, DentaVol©, was developed. Third molar mineralization according to qualitative (Demirjian's maturational stage) and quantitative parameters (third molar volume) of dental development was assessed in multi-slice helical computed tomography images of both maxillary arches displayed by DentaVol© from 135 individuals (62 females and 73 males) aged between 14 and 23 years. Intra- and inter-observer agreement values were remarkably high for both evaluation procedures and for all third molars. A linear correlation between third molar mineralization and chronological age was found, with third molar maturity occurring earlier in males than in females. Assessment of dental development with both procedures, by using DentaVol© software, can be considered a good indicator of age of majority (18 years or older) in all third molars. Our results indicated that virtual computed tomography imaging can be considered a valid alternative to orthopantomography for evaluations of third molar mineralization, and therefore a complementary tool for determining the age of majority. Copyright © 2017 The Chartered Society of Forensic Sciences. Published by Elsevier B.V. All rights reserved.
O'Neill, Marisol; Huang, Gene O; Lamb, Dolores J
2017-12-01
The murine penis model has enriched our understanding of anomalous penile development. The morphologic characterization of the murine penis using conventional serial sectioning methods is labor intensive and prone to errors. To develop a novel application of micro-computerized tomography (micro-CT) with iodine staining for rapid, non-destructive morphologic study of murine penis structure. Penises were dissected from 10 adult wild-type mice and imaged using micro-CT with iodine staining. Images were acquired at 5-μm spatial resolution on a Bruker SkyScan 1272 micro-CT system. After images were acquired, the specimens were washed of any remaining iodine and embedded in paraffin for conventional histologic examination. Histologic and micro-CT measurements for all specimens were made by 2 independent observers. Measurements of penile structures were made on virtual micro-CT sections and histologic slides. The Lin concordance correlation coefficient demonstrated almost perfect strength of agreement for interobserver variability for histologic section (0.9995, 95% CI = 0.9990-0.9997) and micro-CT section (0.9982, 95% CI = 0.9963-0.9991) measurements. Bland-Altman analysis for agreement between the 2 modalities of measurement demonstrated mean differences of -0.029, 0.022, and -0.068 mm for male urogenital mating protuberance, baculum, and penile glans length, respectively. There did not appear to be a bias for overestimation or underestimation of measured lengths and limits of agreement were narrow. The enhanced ability offered by micro-CT to phenotype the murine penis has the potential to improve translational studies examining the molecular pathways contributing to anomalous penile development. The present study describes the first reported use of micro-CT with iodine staining for imaging the murine penis. 
Producing repeated histologic sections of identical orientation was limited by inherent imperfections in mounting and tissue sectioning, but this was compensated for by using micro-CT reconstructions to identify matching virtual sections. This study demonstrates the successful use of micro-CT with iodine staining, which has the potential for submicron spatial resolution, as a non-destructive method of characterizing murine penile morphology. O'Neill M, Huang GO, Lamb DJ. Novel Application of Micro-Computerized Tomography for Morphologic Characterization of the Murine Penis. J Sex Med 2017;14:1533-1539. Copyright © 2017. Published by Elsevier Inc.
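The Lin concordance correlation coefficient reported above measures how closely paired measurements fall on the line of identity, penalizing both scatter and systematic offset. The sketch below uses the standard population-moment form of the coefficient; the function name and the sample values are hypothetical.

```python
def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (divide-by-n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical lengths (mm) measured by two observers.
observer_1 = [1.20, 2.35, 3.10, 4.05, 5.50]
observer_2 = [1.22, 2.30, 3.15, 4.00, 5.48]
print(round(lin_ccc(observer_1, observer_2), 4))
```

Unlike the Pearson correlation, the coefficient drops when one observer reads consistently higher than the other, because the squared difference in means appears in the denominator.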
Stylus/tablet user input device for MRI heart wall segmentation: efficiency and ease of use.
Taslakian, Bedros; Pires, Antonio; Halpern, Dan; Babb, James S; Axel, Leon
2018-05-02
To determine whether use of a stylus user input device (UID) would be superior to a mouse for CMR segmentation. Twenty-five consecutive clinical cardiac magnetic resonance (CMR) examinations were selected. Image analysis was independently performed by four observers. Manual tracing of left (LV) and right (RV) ventricular endocardial contours was performed twice in 10 randomly assigned sessions, each session using only one UID. Segmentation time and the ventricular function variables were recorded. The mean segmentation time and time reduction were calculated for each method. Intraclass correlation coefficients (ICC) and Bland-Altman plots of function variables were used to assess intra- and interobserver variability and agreement between methods. Observers completed a Likert-type questionnaire. The mean segmentation time (in seconds) was significantly less with the stylus compared to the mouse, averaging 206±108 versus 308±125 (p<0.001) and 225±140 versus 353±162 (p<0.001) for LV and RV segmentation, respectively. The intra- and interobserver agreement rates were excellent (ICC≥0.75) regardless of the UID. There was an excellent agreement between measurements derived from manual segmentation using different UIDs (ICC≥0.75), with few exceptions. Observers preferred the stylus. The study shows a significant reduction in segmentation time using the stylus, a subjective preference, and excellent agreement between the methods. • Using a stylus for MRI ventricular segmentation is faster compared to mouse • A stylus is easier to use and results in less fatigue • There is excellent agreement between stylus and mouse UIDs.
New methodology to reconstruct in 2-D the cuspal enamel of modern human lower molars.
Modesto-Mata, Mario; García-Campos, Cecilia; Martín-Francés, Laura; Martínez de Pinillos, Marina; García-González, Rebeca; Quintino, Yuliet; Canals, Antoni; Lozano, Marina; Dean, M Christopher; Martinón-Torres, María; Bermúdez de Castro, José María
2017-08-01
In recent years, different methodologies have been developed to reconstruct worn teeth. In this article, we propose a new 2-D methodology to reconstruct the worn enamel of lower molars. Our main goals are to reconstruct molars with a high level of accuracy when measuring relevant histological variables and to validate the methodology by calculating the errors associated with the measurements. This methodology is based on polynomial regression equations, and has been validated using two different dental variables: cuspal enamel thickness and crown height of the protoconid. In order to perform the validation process, simulated worn modern human molars were employed. The associated errors of the measurements were also estimated by applying methodologies previously proposed by other authors. The mean percentage error estimated in reconstructed molars for these two variables in comparison with their own real values is -2.17% for the cuspal enamel thickness of the protoconid and -3.18% for the crown height of the protoconid. This error significantly improves on the results of other methodologies, both in the interobserver error and in the accuracy of the measurements. The new methodology based on polynomial regressions can be confidently applied to the reconstruction of cuspal enamel of lower molars, as it improves the accuracy of the measurements and reduces the interobserver error. The present study shows that it is important to validate all methodologies in order to know the associated errors. This new methodology can be easily exported to other modern human populations, the human fossil record and forensic sciences. © 2017 Wiley Periodicals, Inc.
Fractal analysis for assessing tumour grade in microscopic images of breast tissue
NASA Astrophysics Data System (ADS)
Tambasco, Mauro; Costello, Meghan; Newcomb, Chris; Magliocco, Anthony M.
2007-03-01
In 2006, breast cancer is expected to continue as the leading form of cancer diagnosed in women, and the second leading cause of cancer mortality in this group. A method that has proven useful for guiding the choice of treatment strategy is the assessment of histological tumor grade. The grading is based upon the mitosis count, nuclear pleomorphism, and tubular formation, and is known to be subject to inter-observer variability. Since cancer grade is one of the most significant predictors of prognosis, errors in grading can affect patient management and outcome. Hence, there is a need to develop a breast cancer-grading tool that is minimally operator dependent to reduce variability associated with the current grading system, and thereby reduce uncertainty that may impact patient outcome. In this work, we explored the potential of a computer-based approach using fractal analysis as a quantitative measure of cancer grade for breast specimens. More specifically, we developed and optimized computational tools to compute the fractal dimension of low- versus high-grade breast sections and found them to be significantly different, 1.3+/-0.10 versus 1.49+/-0.10, respectively (Kolmogorov-Smirnov test, p<0.001). These results indicate that fractal dimension (a measure of morphologic complexity) may be a useful tool for demarcating low- versus high-grade cancer specimens, and has potential as an objective measure of breast cancer grade. Such prognostic value could provide more sensitive and specific information that would reduce inter-observer variability by aiding the pathologist in grading cancers.
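The abstract above does not say which estimator of fractal dimension was used. As a minimal sketch, assuming the common box-counting estimator applied to a binary image, the dimension can be taken as the slope of log N(s) against log(1/s), where N(s) is the number of s-by-s boxes that contain foreground pixels. The function name, the test shapes and the box sizes are illustrative choices, not the authors' tools.

```python
import math


def box_counting_dimension(pixels, box_sizes):
    """Estimate fractal dimension as the slope of log N(s) versus log(1/s).

    pixels: set of (row, col) foreground coordinates of a binary image.
    box_sizes: iterable of box edge lengths in pixels (hypothetical choice).
    """
    xs, ys = [], []
    for s in box_sizes:
        # Count distinct s-by-s boxes that contain at least one foreground pixel.
        boxes = {(r // s, c // s) for r, c in pixels}
        xs.append(math.log(1.0 / s))
        ys.append(math.log(len(boxes)))
    # Ordinary least-squares slope of ys on xs.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Sanity check on shapes with known dimension (not tissue data).
sizes = [1, 2, 4, 8, 16, 32]
line = {(i, i) for i in range(64)}
square = {(r, c) for r in range(64) for c in range(64)}
print(round(box_counting_dimension(line, sizes), 2))
print(round(box_counting_dimension(square, sizes), 2))
```

A straight line and a filled square should come out near 1 and 2 respectively; values in between, such as the 1.3 and 1.49 reported above, indicate intermediate morphologic complexity.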
Enhancing reproducibility of ultrasonic measurements by new users
NASA Astrophysics Data System (ADS)
Pramanik, Manojit; Gupta, Madhumita; Krishnan, Kajoli Banerjee
2013-03-01
Operator perception influences ultrasound image acquisition and processing. Lower costs are attracting new users to medical ultrasound. Anticipating an increase in this trend, we conducted a study to quantify the variability in ultrasonic measurements made by novice users and identify methods to reduce it. We designed a protocol with four presets and trained four new users to scan and manually measure the head circumference of a fetal phantom with an ultrasound scanner. In the first phase, the users followed this protocol in seven distinct sessions. They then received feedback on the quality of the scans from an expert. In the second phase, two of the users repeated the entire protocol aided by visual cues provided to them during scanning. We performed off-line measurements on all the images using a fully automated algorithm capable of measuring the head circumference from fetal phantom images. The ground truth (198.1±1.6 mm) was based on sixteen scans and measurements made by an expert. Our analysis shows that: (1) the inter-observer variability of manual measurements was 5.5 mm, whereas the inter-observer variability of automated measurements was only 0.6 mm in the first phase; (2) consistency of image appearance improved and mean manual measurements were 4-5 mm closer to the ground truth in the second phase; and (3) automated measurements were more precise, more accurate and less sensitive to different presets than manual measurements in both phases. Our results show that visual aids and automation can bring more reproducibility to ultrasonic measurements made by new users.
Normal values of 3 methods to determine patellar height in children from 6 to 12 years.
Vergara-Amador, E; Davalos Herrera, D; Guevara, O A
2018-03-26
The aim of the study was to compare three methods for patellar height measurement in children (Caton-Deschamps, Blackburne-Peel and Koshino-Sugimoto) and to determine the normal value of each method in a group of normal children. This was a cross-sectional study of knee X-rays of normal children. Three orthopaedic surgeons measured the Caton-Deschamps, Blackburne-Peel and Koshino-Sugimoto indices. Concordance was assessed using the intraclass correlation coefficient. For interobserver variability, the measurements of each observer for each index were compared; for intraobserver variability, the coefficient was calculated between 2 measurements made by the same observer at 2 different times. A total of 140 knee X-rays divided into 4 age groups were obtained. For the Blackburne-Peel index, the average median of the 3 observers was 1.07, with P5-P95 of 0.76-1.60. For the Caton-Deschamps index, the average median of the 3 observers was 1.22, with P5-P95 of 0.91-1.70. For the Koshino-Sugimoto index, the average median of the 3 observers was 1.16, with P5-P95 of 0.99-1.36. This study shows that the Koshino-Sugimoto index had the highest reliability, reproducibility and similarity in the population studied, both intra-observer and inter-observer. The other methods evaluated also had variability indices to be taken into account, but were inferior to the Koshino-Sugimoto index. Copyright © 2018 SECOT. Published by Elsevier España, S.L.U. All rights reserved.
Persson, A; Brismar, T B; Lundström, C; Dahlström, N; Othberg, F; Smedby, O
2006-03-01
To compare three methods for standardizing volume rendering technique (VRT) protocols by studying aortic diameter measurements in magnetic resonance angiography (MRA) datasets. Datasets from 20 patients previously examined with gadolinium-enhanced MRA and with digital subtraction angiography (DSA) for abdominal aortic aneurysm were retrospectively evaluated by three independent readers. The MRA datasets were viewed using VRT with three different standardized transfer functions: the percentile method (Pc-VRT), the maximum-likelihood method (ML-VRT), and the partial range histogram method (PRH-VRT). The aortic diameters obtained with these three methods were compared with freely chosen VRT parameters (F-VRT) and with maximum intensity projection (MIP) concerning inter-reader variability and agreement with the reference method DSA. F-VRT parameters and PRH-VRT gave significantly higher diameter values than DSA, whereas Pc-VRT gave significantly lower values than DSA. The highest interobserver variability was found for F-VRT parameters and MIP, and the lowest for Pc-VRT and PRH-VRT. All standardized VRT methods were significantly superior to both MIP and F-VRT in this respect. The agreement with DSA was best for PRH-VRT, which was the only method with a mean error below 1 mm and which also had the narrowest limits of agreement (95% of cases between 2.1 mm below and 3.1 mm above DSA). All the standardized VRT methods compare favorably with MIP and VRT with freely selected parameters as regards interobserver variability. The partial range histogram method, although systematically overestimating vessel diameters, gives results closest to those of DSA.
Neumann, M; Friedl, S; Meining, A; Egger, K; Heldwein, W; Rey, J F; Hochberger, J; Classen, M; Hohenberger, W; Rösch, T
2002-10-01
In most European countries, training in GI endoscopy has largely been based on hands-on acquisition of experience in patients rather than on a structured training programme. With the development of training models systematic hands-on training in a variety of diagnostic and therapeutic endoscopy techniques was achieved. Little, however, is known about methods of objectively assessing trainees' performance. We therefore developed an assessment 'score card' for upper GI endoscopy and tested it in endoscopists with various levels of experience. The aim of the study was therefore to assess interobserver variations in the evaluation of trainees. On the basis of textbook and expert opinions a consensus group of eight experienced endoscopists developed a score card for diagnostic upper GI endoscopy with biopsy. The score card includes an assessment of the single steps of the procedure as well as of the times needed to complete each step. This score card was then evaluated in a further conference including ten experts who blindly assessed videotapes of 15 endoscopists performing upper GI endoscopy in a training bio-simulation model (the 'Erlangen Endo-Trainer'). On the basis of their previous experience (i. e. the number of endoscopies performed) these 15 endoscopists were classified into four groups: very experienced, experienced, having some experience and inexperienced. Interobserver variability (IOV) was tested for the various score card parameters (Kendall's rank-correlation coefficient 0.0-0.5 poor, 0.5-1.0 good agreement). In addition, the correlation between the score card assessment and the examiners' experience levels was analysed. Despite poor IOV results for all the parameters tested (Kendall coefficient < 0.3), the assessment parameters correlated well when the examiners' different experience levels were taken into account (correlation coefficient 0.59-0.89, p < 0.05). 
The score card parameters were suitable for differentiating between the four groups of examiners with different levels of endoscopic experience. As expected with scores involving subjective assessment of performance, the variability between reviewers was substantial. Nevertheless, the assessment score was capable of distinguishing reliably between different experience levels in terms of a good individual observer consistency. The score card can therefore be used to document both training status and progress during endoscopy training courses using bio-simulation models, and this might be able to provide improved quality assurance in GI endoscopy training.
Matsuda, Akira; Kawabata, Hiroshi; Tohyama, Kaoru; Maeda, Tomoya; Araseki, Kayano; Hata, Tomoko; Suzuki, Takahiro; Kayano, Hidekazu; Shimbo, Kei; Usuki, Kensuke; Chiba, Shigeru; Ishikawa, Takayuki; Arima, Nobuyoshi; Nohgawa, Masaharu; Ohta, Akiko; Miyazaki, Yasushi; Nakao, Sinnji; Ozawa, Keiya; Arai, Shunya; Kurokawa, Mineo; Mitani, Kinuko; Takaori-Kondo, Akifumi
2018-06-07
The diagnosis of myelodysplastic syndromes (MDS) is based on morphology and cytogenetics. However, limited information is currently available on the interobserver concordance of the assessment of dysplastic lineages (<10% or ≥10% in bone marrow (BM)). The revised International Prognostic Scoring System (IPSS-R) described a new threshold (2%) for BM blasts. However, the interobserver concordance of the categories (0-≤2% and >2-<5%) has limited data. The purpose of the present study was to investigate the assessment of dysplastic lineages and IPSS-R reproducibility. Our study was divided into two Steps. In each Step, the microscopic examinations were performed separately by two morphologists. Regarding the category of BM blasts ≤2% and >2-<5%, interobserver agreement was more than 'moderate' in all pairs (kappa test: 0.43-0.90). Regarding dysgranulopoiesis (dysG) and dyserythropoiesis (dysE) in BM, interobserver agreement was more than 'moderate' in all pairs (kappa test, dysG: 0.45-0.96, dysE: 0.45-0.81). Regarding the category of dysmegakaryopoiesis (dysMgk) in BM, interobserver agreement was more than moderate in 4 out of 5 pairs (kappa test: 0.58-1.00), and was fair for one pair (kappa test: 0.37). We consider that high interobserver concordance may be possible for the BM blast cell count (≤2% or >2-<5%) and dysplasia (<10% or ≥10%) of each lineage. Copyright © 2018 Elsevier Ltd. All rights reserved.
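The pairwise kappa values reported above compare two morphologists on a binary category. A minimal sketch of Cohen's kappa for two raters follows, assuming unweighted kappa (the abstract does not say which variant was used). The labels and counts are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical 50 marrow samples scored as dysplasia ">=10%" or "<10%".
pairs = (
    [(">=10%", ">=10%")] * 20   # both raters say >=10%
    + [(">=10%", "<10%")] * 5   # rater A only
    + [("<10%", ">=10%")] * 10  # rater B only
    + [("<10%", "<10%")] * 15   # both raters say <10%
)
a_labels = [a for a, _ in pairs]
b_labels = [b for _, b in pairs]
print(round(cohen_kappa(a_labels, b_labels), 2))
```

Here raw agreement is 70%, but half of that is expected by chance from the marginals, which is why kappa is much lower than the raw percentage.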
Mochizuki, Yuta; Kaneko, Takao; Kawahara, Keisuke; Toyoda, Shinya; Kono, Norihiko; Hada, Masaru; Ikegami, Hiroyasu; Musha, Yoshiro
2017-11-20
The quadrant method was described by Bernard et al. and it has been widely used for postoperative evaluation of anterior cruciate ligament (ACL) reconstruction. The purpose of this research is to further develop the quadrant method measuring four points, which we named four-point quadrant method, and to compare with the quadrant method. Three-dimensional computed tomography (3D-CT) analyses were performed in 25 patients who underwent double-bundle ACL reconstruction using the outside-in technique. The four points in this study's quadrant method were defined as point1-highest, point2-deepest, point3-lowest, and point4-shallowest, in femoral tunnel position. Value of depth and height in each point was measured. Antero-medial (AM) tunnel is (depth1, height2) and postero-lateral (PL) tunnel is (depth3, height4) in this four-point quadrant method. The 3D-CT images were evaluated independently by 2 orthopaedic surgeons. A second measurement was performed by both observers after a 4-week interval. Intra- and inter-observer reliability was calculated by means of intra-class correlation coefficient (ICC). Also, the accuracy of the method was evaluated against the quadrant method. Intra-observer reliability was almost perfect for both AM and PL tunnel (ICC > 0.81). Inter-observer reliability of AM tunnel was substantial (ICC > 0.61) and that of PL tunnel was almost perfect (ICC > 0.81). The AM tunnel position was 0.13% deep, 0.58% high and PL tunnel position was 0.01% shallow, 0.13% low compared to quadrant method. The four-point quadrant method was found to have high intra- and inter-observer reliability and accuracy. This method can evaluate the tunnel position regardless of the shape and morphology of the bone tunnel aperture for use of comparison and can provide measurement that can be compared with various reconstruction methods. 
The four-point quadrant method of this study is considered to have clinical relevance in that it is a detailed and accurate tool for evaluating femoral tunnel position after ACL reconstruction. Case series, Level IV.
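The abstract above reports ICCs without naming the model. As a hedged sketch, assuming the two-way random-effects, absolute-agreement, single-rater form often written ICC(2,1), the coefficient can be computed from the mean squares of a subjects-by-raters table. The input is the six-subject, four-rater example table commonly attributed to Shrout and Fleiss (1979), not tunnel-position data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] is the value given to subject i by rater j.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

The absolute-agreement form penalises systematic offsets between raters; a consistency form of the ICC would give a higher value on the same table, so the choice of model matters when comparing reported ICCs.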
Chuong, Anh Minh; Corno, Lucie; Beaussier, Hélène; Boulay-Coletta, Isabelle; Millet, Ingrid; Hodel, Jérôme; Taourel, Patrice; Chatellier, Gilles; Zins, Marc
2016-07-01
Purpose To determine whether adding unenhanced computed tomography (CT) to contrast material-enhanced CT improves the diagnostic performance of decreased bowel wall enhancement as a sign of ischemia complicating mechanical small bowel obstruction (SBO). Materials and Methods This retrospective study was approved by the institutional review board, which waived the requirement for informed consent. Two gastrointestinal radiologists independently performed retrospective assessments of 164 unenhanced and contrast-enhanced CT studies from 158 consecutive patients (mean age, 71.2 years) with mechanical SBO. The reference standard was the intraoperative and/or histologic diagnosis (in 80 cases) or results from clinical follow-up in patients who did not undergo surgery (84 cases). Decreased bowel wall enhancement was evaluated first with contrast-enhanced images and then, 1 month later, with both unenhanced and contrast-enhanced images. Diagnostic performance of decreased bowel wall enhancement and confidence in the diagnosis were compared between the two readings by using McNemar and Wilcoxon signed rank tests. Interobserver agreement was assessed by using κ statistics and compared with bootstrapping. Results Ischemia was diagnosed in 41 of 164 (25%) episodes of SBO. For both observers, adding unenhanced images improved the sensitivity of decreased bowel wall enhancement (observer 1: 46.3% [19 of 41] vs 65.8% [27 of 41], P = .02; observer 2: 56.1% [23 of 41] vs 63.4% [26 of 41], P = .45), Youden index (from 0.41 to 0.58 for observer 1 and from 0.42 to 0.61 for observer 2), and confidence score (P < .001 for both). Specificity significantly increased for observer 2 (84.5% [104 of 123] vs 94.3% [116 of 123], P = .002), and interobserver agreement significantly increased, from moderate (κ = 0.48) to excellent (κ = 0.89; P < .0001). 
Conclusion Adding unenhanced CT to contrast-enhanced CT improved the sensitivity, diagnostic confidence, and interobserver agreement of the diagnosis of ischemia, a complication of mechanical SBO, on the basis of decreased bowel wall enhancement. (©) RSNA, 2016.
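Sensitivity, specificity and the Youden index quoted above all derive from a 2 x 2 table of reader calls against the reference standard, with Youden's J defined as sensitivity + specificity - 1. The sketch below shows that arithmetic; the counts are hypothetical and chosen only to make the numbers easy to follow.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Return (sensitivity, specificity, Youden index) from 2 x 2 counts."""
    sensitivity = tp / (tp + fn)   # true positives among all diseased cases
    specificity = tn / (tn + fp)   # true negatives among all non-diseased cases
    youden = sensitivity + specificity - 1
    return sensitivity, specificity, youden


# Hypothetical reading: 40 ischemic and 100 non-ischemic episodes.
sens, spec, j = diagnostic_summary(tp=30, fn=10, tn=90, fp=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} Youden={j:.2f}")
```

Because J combines both error types, it can rise even when only one of sensitivity or specificity improves.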
Artificial intelligence in mitral valve analysis.
Jeganathan, Jelliffe; Knio, Ziyad; Amador, Yannis; Hai, Ting; Khamooshian, Arash; Matyal, Robina; Khabbaz, Kamal R; Mahmood, Feroze
2017-01-01
Echocardiographic analysis of the mitral valve (MV) has become essential for diagnosis and management of patients with MV disease. Currently, the various software packages used for MV analysis require manual input and are prone to interobserver variability in the measurements. The aim of this study is to determine the interobserver variability of an automated software package that uses artificial intelligence for MV analysis. This was a retrospective analysis of intraoperative three-dimensional transesophageal echocardiography data acquired from four patients with normal MV undergoing coronary artery bypass graft surgery in a tertiary hospital. Echocardiographic data were analyzed using the eSie Valve Software (Siemens Healthcare, Mountain View, CA, USA). Three examiners analyzed three end-systolic (ES) frames from each of the four patients. A total of 36 ES frames were analyzed and included in the study. A multiple mixed-effects ANOVA model was constructed to determine if the examiner, the patient, and the loop had a significant effect on the average value of each parameter. A Bonferroni correction was used to correct for multiple comparisons, and P < 0.0083 was considered to be significant. Examiners did not have an effect on any of the six parameters tested. Patient and loop had an effect on the average parameter value for each of the six parameters as expected (P < 0.0083 for both). We were able to conclude that using automated analysis, it is possible to obtain results with good reproducibility, which only requires minimal user intervention.
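The 0.0083 threshold in the abstract above is consistent with a Bonferroni correction over the six parameters tested, assuming a family-wise significance level of 0.05 (the abstract does not state that level explicitly). A short sketch of the calculation follows; the p-values in the usage example are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def significant_flags(p_values, alpha=0.05):
    """Flag each p-value as significant after correcting for len(p_values) tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


threshold = bonferroni_threshold(0.05, 6)   # assumed alpha, six parameters
print(round(threshold, 4))

# Hypothetical p-values for six parameters.
flags = significant_flags([0.001, 0.02, 0.2, 0.008, 0.5, 0.0084])
print(flags)
```

Note that a p-value of 0.02 would pass an uncorrected 0.05 cut-off but fails the corrected one.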
Pavlides, Michael; Birks, Jacqueline; Fryer, Eve; Delaney, David; Sarania, Nikita; Banerjee, Rajarshi; Neubauer, Stefan; Barnes, Eleanor; Fleming, Kenneth A; Wang, Lai Mun
2017-04-01
The aim of the study was to investigate the interobserver agreement for categorical and quantitative scores of liver fibrosis. Sixty-five consecutive biopsy specimens from patients with mixed liver disease etiologies were assessed by three pathologists using the Ishak and nonalcoholic steatohepatitis Clinical Research Network (NASH CRN) scoring systems, and the fibrosis area (collagen proportionate area [CPA]) was estimated by visual inspection (visual-CPA). A subset of 20 biopsy specimens was analyzed using digital imaging analysis (DIA) for the measurement of CPA (DIA-CPA). The bivariate weighted κ between any two pathologists ranged from 0.57 to 0.67 for Ishak staging and from 0.47 to 0.57 for the NASH CRN staging. Bland-Altman analysis showed poor agreement between all possible pathologist pairings for visual-CPA but good agreement between all pathologist pairings for DIA-CPA. There was good agreement between the two pathologists who assessed biopsy specimens by visual-CPA and DIA-CPA. The intraclass correlation coefficient, which is equivalent to the κ statistic for continuous variables, was 0.78 for visual-CPA and 0.97 for DIA-CPA. These results suggest that DIA-CPA is the most robust method for assessing liver fibrosis followed by visual-CPA. Categorical scores perform less well than both the quantitative CPA scores assessed here. © American Society for Clinical Pathology, 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
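Bland-Altman analysis, used above to compare pathologist pairs, summarises paired measurements by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. The sketch below assumes that conventional definition; the CPA percentages are hypothetical.

```python
import statistics


def bland_altman(measure_a, measure_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(measure_a, measure_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)   # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical CPA estimates (%) from two pathologists on four biopsies.
pathologist_1 = [10, 12, 14, 16]
pathologist_2 = [9, 13, 13, 17]
bias, lower, upper = bland_altman(pathologist_1, pathologist_2)
print(f"bias={bias:.2f} limits=({lower:.2f}, {upper:.2f})")
```

Narrow limits of agreement indicate that two observers can be used interchangeably; a bias far from zero indicates a systematic offset between them.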
Rajyalakshmi, R.; Prakash, Winston D.; Ali, Mohammad Javed; Naik, Milind N.
2017-01-01
Purpose: To assess the reliability and repeatability of periorbital biometric measurements using ImageJ software and to assess if the horizontal visible iris diameter (HVID) serves as a reliable scale for facial measurements. Methods: This study was a prospective, single-blind, comparative study. Two clinicians performed 12 periorbital measurements on 100 standardised face photographs. Each individual’s HVID was determined by Orbscan IIz and used as a scale for measurements using ImageJ software. All measurements were repeated using the ‘average’ HVID of the study population as a measurement scale. Intraclass correlation coefficient (ICC) and Pearson product-moment coefficient were used as statistical tests to analyse the data. Results: The range of ICC for intra- and interobserver variability was 0.79–0.99 and 0.86–0.99, respectively. Test-retest reliability ranged from 0.66–1.0 to 0.77–0.98, respectively. When average HVID of the study population was used as scale, ICC ranged from 0.83 to 0.99, and the test-retest reliability ranged from 0.83 to 0.96 and the measurements correlated well with recordings done with individual Orbscan HVID measurements. Conclusion: Periorbital biometric measurements using ImageJ software are reproducible and repeatable. Average HVID of the population as measured by Orbscan is a reliable scale for facial measurements. PMID:29403183
Research on ionospheric tomography based on variable pixel height
NASA Astrophysics Data System (ADS)
Zheng, Dunyong; Li, Peiqing; He, Jie; Hu, Wusheng; Li, Chaokui
2016-05-01
A novel ionospheric tomography technique based on variable pixel height was developed for the tomographic reconstruction of the ionospheric electron density distribution. The method considers the height of each pixel as an unknown variable, which is retrieved during the inversion process together with the electron density values. In contrast to conventional computerized ionospheric tomography (CIT), which parameterizes the model with a fixed pixel height, the variable-pixel-height computerized ionospheric tomography (VHCIT) model applies a disturbance to the height of each pixel. In comparison with conventional CIT models, the VHCIT technique achieved superior results in a numerical simulation. A careful validation of the reliability and superiority of VHCIT was performed. According to the results of the statistical analysis of the average root mean square errors, the proposed model offers an improvement by 15% compared with conventional CIT models.
Bennett, Rebecca J; Taljaard, Dunay S; Olaithe, Michelle; Brennan-Jones, Chris; Eikelboom, Robert H
2017-09-18
The purpose of this study is to raise awareness of interobserver concordance and the differences between interobserver reliability and agreement when evaluating the responsiveness of a clinician-administered survey and, specifically, to demonstrate the clinical implications of data types (nominal/categorical, ordinal, interval, or ratio) and statistical index selection (for example, Cohen's kappa, Krippendorff's alpha, or interclass correlation). In this prospective cohort study, 3 clinical audiologists, who were masked to each other's scores, administered the Practical Hearing Aid Skills Test-Revised to 18 adult owners of hearing aids. Interobserver concordance was examined using a range of reliability and agreement statistical indices. The importance of selecting statistical measures of concordance was demonstrated with a worked example, wherein the level of interobserver concordance achieved varied from "no agreement" to "almost perfect agreement" depending on data types and statistical index selected. This study demonstrates that the methodology used to evaluate survey score concordance can influence the statistical results obtained and thus affect clinical interpretations.
van Doorn, Sascha C; Hazewinkel, Y; East, James E; van Leerdam, Monique E; Rastogi, Amit; Pellisé, Maria; Sanduleanu-Dascalescu, Silvia; Bastiaansen, Barbara A J; Fockens, Paul; Dekker, Evelien
2015-01-01
The Paris classification is an international classification system for describing polyp morphology. Thus far, the validity and reproducibility of this classification have not been assessed. We aimed to determine the interobserver agreement for the Paris classification among seven Western expert endoscopists. A total of 85 short endoscopic video clips depicting polyps were created and assessed by seven expert endoscopists according to the Paris classification. After a digital training module, the same 85 polyps were assessed again. We calculated the interobserver agreement with a Fleiss kappa and as the proportion of pairwise agreement. The interobserver agreement of the Paris classification among seven experts was moderate with a Fleiss kappa of 0.42 and a mean pairwise agreement of 67%. The proportion of lesions assessed as "flat" by the experts ranged between 13 and 40% (P<0.001). After the digital training, the interobserver agreement did not change (kappa 0.38, pairwise agreement 60%). Our study is the first to validate the Paris classification for polyp morphology. We demonstrated only a moderate interobserver agreement among international Western experts for this classification system. Our data suggest that, in its current version, the use of this classification system in daily practice is questionable and it is unsuitable for comparative endoscopic research. We therefore suggest introduction of a simplification of the classification system.
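Fleiss' kappa, used above for seven raters, generalises chance-corrected agreement to any fixed number of raters per case. A minimal sketch follows, assuming every case is rated by the same number of raters. The small table is hypothetical (three raters, three categories) and does not reproduce the study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters assigning case i to category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])          # assumed equal for every case
    n_categories = len(counts[0])
    total = n_cases * n_raters
    # Overall proportion of assignments falling in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-case proportion of agreeing rater pairs.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_case) / n_cases      # mean pairwise agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical: four polyps, three raters, three morphology categories.
table = [
    [3, 0, 0],
    [0, 3, 0],
    [0, 0, 3],
    [1, 1, 1],
]
print(round(fleiss_kappa(table), 3))
```

The intermediate quantity `p_bar` is the mean proportion of agreeing rater pairs, which is why a study can report both a kappa and a raw pairwise agreement from the same table.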
Kwon, Mi-Ri; Kim, Chan Kyo; Kim, Jae-Hun
2017-11-01
To investigate the variability of diffusion-weighted imaging (DWI) interpretation of Prostate Imaging Reporting and Data System (PI-RADS) version 2 (v2) in evaluating prostate cancer (PCa). 154 patients with PCa underwent multiparametric 3T MRI, followed by radical prostatectomy. DWI with different b values (b = 0, 100, 1000 and 1500 s mm⁻²) was obtained. Using the PI-RADS v2, two radiologists independently scored suspicious lesions in each patient and compared DWI of b = 1000 (DWI₁₀₀₀) with 1500 (DWI₁₅₀₀) s mm⁻². On DWI₁₀₀₀ and DWI₁₅₀₀, the intermethod and interobserver agreements of DWI scores were excellent in all patients (κ ≥ 0.873). In each peripheral zone and transition zone DWI scores, both observers showed excellent intermethod agreement between DWI₁₀₀₀ and DWI₁₅₀₀ (κ ≥ 0.897), and interobserver agreement for DWI₁₀₀₀ and DWI₁₅₀₀ was good to excellent (κ ≥ 0.796). For estimating clinically significant cancer, the areas under the receiver operating characteristic curves of DWI₁₀₀₀ and DWI₁₅₀₀ were 0.710 and 0.724 for observer 1 (p = 0.11), and 0.649 and 0.656 for observer 2 (p = 0.12), respectively. The PI-RADS v2 scoring at 3T shows excellent agreement between DWI₁₀₀₀ and DWI₁₅₀₀ in evaluating PCa, with excellent inter-observer agreement. Advance in knowledge: DWI using b = 1000 s mm⁻² instead of b = 1500 s mm⁻² reduces examination time or image distortion, with improved signal-to-noise ratio.
Williamson, Sean R; Rao, Priya; Hes, Ondrej; Epstein, Jonathan I; Smith, Steven C; Picken, Maria M; Zhou, Ming; Tretiakova, Maria S; Tickoo, Satish K; Chen, Ying-Bei; Reuter, Victor E; Fleming, Stewart; Maclean, Fiona M; Gupta, Nilesh S; Kuroda, Naoto; Delahunt, Brett; Mehra, Rohit; Przybycin, Christopher G; Cheng, Liang; Eble, John N; Grignon, David J; Moch, Holger; Lopez, Jose I; Kunju, Lakshmi P; Tamboli, Pheroze; Srigley, John R; Amin, Mahul B; Martignoni, Guido; Hirsch, Michelle S; Bonsib, Stephen M; Trpkov, Kiril
2018-06-06
Staging criteria for renal cell carcinoma differ from many other cancers, in that renal tumors are often spherical with subtle, finger-like extensions into veins, renal sinus, or perinephric tissue. We sought to study interobserver agreement in pathologic stage categories for challenging cases. An online survey was circulated to urologic pathologists interested in kidney tumors, yielding 89% response (31/35). Most questions included 1 to 4 images, focusing on: vascular and renal sinus invasion (n=24), perinephric invasion (n=9), and gross pathology/specimen handling (n=17). Responses were collapsed for analysis into positive and negative/equivocal for upstaging. Consensus was regarded as an agreement of 67% (2/3) of participants, which was reached in 20/33 (61%) evaluable scenarios regarding renal sinus, perinephric, or vein invasion, of which 13/33 (39%) had ≥80% consensus. Lack of agreement was especially encountered regarding small tumor protrusions into a possible vascular lumen, close to the tumor leading edge. For gross photographs, most were interpreted as suspicious but requiring histologic confirmation. Most participants (61%) rarely used special stains to evaluate vascular invasion, usually endothelial markers (81%). Most agreed that a spherical mass bulging well beyond the kidney parenchyma into the renal sinus (71%) or perinephric fat (90%) did not necessarily indicate invasion. Interobserver agreement in pathologic staging of renal cancer is relatively good among urologic pathologists interested in kidney tumors, even when selecting cases that test the earliest and borderline thresholds for extrarenal extension. Disagreements remain, however, particularly for tumors with small, finger-like protrusions, closely juxtaposed to the main mass.
Ercan, Ertuğrul; Kırılmaz, Bahadır; Kahraman, İsmail; Bayram, Vildan; Doğan, Hüseyin
2012-11-01
Flow-mediated dilation (FMD) is used to evaluate endothelial function. Computer-assisted analysis utilizing edge detection permits continuous measurements along the vessel wall. We have developed a new fully automated software program to allow accurate and reproducible measurement. FMD was measured and analyzed in 18 coronary artery disease (CAD) patients and 17 controls, both manually and with the newly developed (computer-supported) software. The agreement between methods was assessed by Bland-Altman analysis. The mean age, body mass index and cardiovascular risk factors were higher in the CAD group. Automated FMD% measurement was 18.3±8.5 for the control subjects and 6.8±6.5 for the CAD group (p=0.0001). The intraobserver and interobserver correlation for automated measurement was high (r=0.974, r=0.981, r=0.937, r=0.918, respectively). Manual FMD% at the 60th second was correlated with automated FMD% (r=0.471, p=0.004). The new fully automated software can be used for precise measurement of FMD, with lower intra- and interobserver variability than manual assessment.
Awad, I A; Katz, A; Hahn, J F; Kong, A K; Ahl, J; Lüders, H
1989-01-01
The extent of resection was assessed in 45 temporal lobectomies for medically intractable epilepsy with mapped temporal lobe foci. Postoperative magnetic resonance imaging (MRI) in the coronal plane was used to quantify the extent of resection of superior lateral, inferior lateral, basal, and medial structures, including the amygdalohippocampal complex. A new 20-compartment model of the temporal lobe was used for this assessment. Blinded interobserver variability was minimal. Intraoperative measurements and maps routinely overestimated the actual extent of resection, especially of medial structures. One year after surgery, 70% of patients remained seizure-free (except for auras). Seizure-free outcome was accomplished despite varying degrees of resection, but was more likely achieved with more extensive resections in all compartments. Among patients with mesiobasal foci, seizure-free outcome correlated significantly with extent of resection of amygdalohippocampal complex. We conclude that assessment of extent of resection by postoperative MRI provides an objective basis of evaluating outcome after temporal lobectomy. It allows a rational approach to understanding of operative failures and is potentially useful in comparing efficacy of various surgical approaches.
Verhaart, René F; Fortunati, Valerio; Verduijn, Gerda M; van der Lugt, Aad; van Walsum, Theo; Veenland, Jifke F; Paulides, Margarethus M
2014-12-01
In current clinical practice, head and neck (H&N) hyperthermia treatment planning (HTP) is solely based on computed tomography (CT) images. Magnetic resonance imaging (MRI) provides superior soft-tissue contrast over CT. The purpose of the authors' study is to investigate the relevance of using MRI in addition to CT for patient modeling in H&N HTP. CT and MRI scans were acquired for 11 patients in an immobilization mask. Three observers manually segmented the following thermo-sensitive tissues on CT, T1-weighted MRI (MRI-T1w), and T2-weighted MRI (MRI-T2w) images: cerebrum, cerebellum, brainstem, myelum, sclera, lens, vitreous humor, and the optical nerve. For these tissues, which are used for patient modeling in H&N HTP, the interobserver variation of manual tissue segmentation in CT and MRI was quantified with the mean surface distance (MSD). Next, the authors compared the impact of CT based and combined CT and MRI based patient models on the predicted temperatures. For each tissue, the modality that led to the lowest observer variation was selected and, after a deformable image registration, inserted into the combined CT and MRI based patient model (CT and MRI). In addition, a patient model with a detailed segmentation of brain tissues (including white matter, gray matter, and cerebrospinal fluid) was created (CT and MRIdb). To quantify the relevance of MRI based segmentation for H&N HTP, the authors compared the predicted maximum temperatures in the segmented tissues (Tmax) and the corresponding specific absorption rate (SAR) of the patient models based on (1) CT, (2) CT and MRI, and (3) CT and MRIdb. In MRI, a similar or reduced interobserver variation was found compared to CT (maximum of median MSD in CT: 0.93 mm, MRI-T1w: 0.72 mm, MRI-T2w: 0.66 mm). Only for the optical nerve was the interobserver variation significantly lower in CT than in MRI (median MSD in CT: 0.58 mm, MRI-T1w: 1.27 mm, MRI-T2w: 1.40 mm). Patient models based on CT (Tmax: 38.0 °C) and CT and MRI (Tmax: 38.1 °C) resulted in similar simulated temperatures, while CT and MRIdb (Tmax: 38.5 °C) resulted in significantly higher temperatures. The SAR corresponding to these temperatures did not differ significantly. Although MR imaging reduces the interobserver variation in most tissues, it does not affect simulated local tissue temperatures. However, the improved soft-tissue contrast provided by MRI allows generating a detailed brain segmentation, which has a strong impact on the predicted local temperatures and hence may improve simulation guided hyperthermia.
Aznar, Marianne C; Girinsky, Theodore; Berthelsen, Anne Kiil; Aleman, Berthe; Beijert, Max; Hutchings, Martin; Lievens, Yolande; Meijnders, Paul; Meidahl Petersen, Peter; Schut, Deborah; Maraldo, Maja V; van der Maazen, Richard; Specht, Lena
2017-04-01
In early-stage classical Hodgkin lymphoma (HL) the target volume nowadays consists of the volume of the originally involved nodes. Delineation of this volume on a post-chemotherapy CT-scan is challenging. We report on the interobserver variability in target volume definition and its impact on resulting treatment plans. Two representative cases were selected (1: male, stage IB, localization: left axilla; 2: female, stage IIB, localizations: mediastinum and bilateral neck). Eight experienced observers individually defined the clinical target volume (CTV) using involved-node radiotherapy (INRT) as defined by the EORTC-GELA guidelines for the H10 trial. A consensus contour was generated and the standard deviation computed. We investigated the overlap between observer and consensus contour [Sørensen-Dice coefficient (DSC)] and the magnitude of gross deviations between the surfaces of the observer and consensus contour (Hausdorff distance). 3D-conformal (3D-CRT) and intensity-modulated radiotherapy (IMRT) plans were calculated for each contour in order to investigate the impact of interobserver variability on each treatment modality. Similar target coverage was enforced for all plans. The median CTV was 120 cm³ (IQR: 95-173 cm³) for Case 1, and 255 cm³ (IQR: 183-293 cm³) for Case 2. DSC values were generally high (>0.7), and Hausdorff distances were about 30 mm. The SDs between all observer contours, providing an estimate of the systematic error associated with delineation uncertainty, ranged from 1.9 to 3.8 mm (median: 3.2 mm). Variations in mean dose resulting from different observer contours were small and were not higher in IMRT plans than in 3D-CRT plans. We observed considerable differences in target volume delineation, but the systematic delineation uncertainty of around 3 mm is comparable to that reported in other tumour sites. This report is a first step towards calculating an evidence-based planning target volume margin for INRT in HL.
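The two contour-comparison metrics named in this abstract can be illustrated with a minimal sketch. It assumes each contour is represented as a set of voxel coordinates, which is a simplification: the abstract describes the Hausdorff distance between contour surfaces, and the function names and toy data below are hypothetical, not taken from the study.

```python
from math import dist

def dice(a, b):
    """Sørensen-Dice coefficient between two sets of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty contours are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(p, q):
        # Largest distance from a point in p to its nearest point in q.
        return max(min(dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

# Toy 2D example (hypothetical data): two 2x2 squares shifted by one voxel.
observer = {(0, 0), (0, 1), (1, 0), (1, 1)}
consensus = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(dice(observer, consensus))       # 0.5
print(hausdorff(observer, consensus))  # 1.0
```

The Dice coefficient summarizes overall overlap, while the Hausdorff distance is driven by the single worst local deviation, which is why the abstract can report high DSC values alongside Hausdorff distances of about 30 mm.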
Training improves interobserver reliability for the diagnosis of scaphoid fracture displacement.
Buijze, Geert A; Guitton, Thierry G; van Dijk, C Niek; Ring, David
2012-07-01
The diagnosis of displacement in scaphoid fractures is notorious for poor interobserver reliability. We tested whether training can improve interobserver reliability and sensitivity, specificity, and accuracy for the diagnosis of scaphoid fracture displacement on radiographs and CT scans. Sixty-four orthopaedic surgeons rated a set of radiographs and CT scans of 10 displaced and 10 nondisplaced scaphoid fractures for the presence of displacement, using a web-based rating application. Before rating, observers were randomized to a training group (34 observers) and a nontraining group (30 observers). The training group received an online training module before the rating session, and the nontraining group did not. Interobserver reliability for training and nontraining was assessed by Siegel's multirater kappa and the Z-test was used to test for significance. There was a small, but significant difference in the interobserver reliability for displacement ratings in favor of the training group compared with the nontraining group. Ratings of radiographs and CT scans combined resulted in moderate agreement for both groups. The average sensitivity, specificity, and accuracy of diagnosing displacement of scaphoid fractures were, respectively, 83%, 85%, and 84% for the nontraining group and 87%, 86%, and 87% for the training group. Assuming a 5% prevalence of fracture displacement, the positive predictive value was 0.23 in the nontraining group and 0.25 in the training group. The negative predictive value was 0.99 in both groups. Our results suggest training can improve interobserver reliability and sensitivity, specificity and accuracy for the diagnosis of scaphoid fracture displacement, but the improvements are slight. These findings are encouraging for future research regarding interobserver variation and how to reduce it further.
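The predictive values quoted in this abstract follow from sensitivity, specificity, and the assumed 5% prevalence via Bayes' rule. The sketch below reproduces that arithmetic; the function name is illustrative and not from the paper.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    tp = sensitivity * prevalence              # true positives per unit population
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)        # true negatives
    fn = (1 - sensitivity) * prevalence        # false negatives
    return tp / (tp + fp), tn / (tn + fn)

# Sensitivity and specificity as reported in the abstract, prevalence 5%.
ppv_nt, npv_nt = predictive_values(0.83, 0.85, 0.05)  # nontraining group
ppv_tr, npv_tr = predictive_values(0.87, 0.86, 0.05)  # training group

print(round(ppv_nt, 2), round(npv_nt, 2))  # 0.23 0.99
print(round(ppv_tr, 2), round(npv_tr, 2))  # 0.25 0.99
```

At low prevalence the false positives from the large non-displaced population dominate, which is why a test with roughly 85% sensitivity and specificity still yields a positive predictive value of only about one in four.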
Strampe, Margaret R; Huckenpahler, Alison L; Higgins, Brian P; Tarima, Sergey; Visotcky, Alexis; Stepien, Kimberly E; Kay, Christine N; Carroll, Joseph
2018-05-01
To examine repeatability and reproducibility of ellipsoid zone (EZ) width measurements in patients with retinitis pigmentosa (RP) using a longitudinal reflectivity profile (LRP) analysis. We examined Bioptigen optical coherence tomography (OCT) scans from 48 subjects with RP or Usher syndrome. Nominal scan lengths were 6, 7, or 10 mm, and the lateral scale of each scan was calculated using axial length measurements. LRPs were generated from OCT line scans, and the peak corresponding to EZ was manually identified using ImageJ. The locations at which the EZ peak disappeared were used to calculate EZ width. Each scan was analyzed twice by each of two observers, who were masked to their previous measurements and those of the other observer. On average, horizontal width (HW) was significantly greater than vertical width (VW), and there was high interocular symmetry for both HW and VW. We observed excellent intraobserver repeatability with intraclass correlation coefficients (ICCs) ranging from 0.996 to 0.998 for HW and VW measurements. Interobserver reproducibility was also excellent for both HW (ICC = 0.989; 95% confidence interval [CI] = 0.983-0.995) and VW (ICC = 0.991; 95% CI = 0.985-0.996), with no significant bias observed between observers. EZ width can be measured using LRPs with excellent repeatability and reproducibility. Our observation of greater HW than VW is consistent with previous observations in RP, though the reason for this anisotropy remains unclear. We describe repeatability and reproducibility of a method for measuring EZ width in patients with RP or Usher syndrome. This approach could facilitate measurement of retinal band thickness and/or intensity.
Reddy, M V; Eachempati, Krishnakiran; Gurava Reddy, A V; Mugalur, Aakash
2018-01-01
Rapid prototyping (RP) is used widely in dental and faciomaxillary surgery with anecdotal uses in orthopedics. The purview of RP in orthopedics is vast. However, there is no error analysis reported in the literature on bone models generated using office-based RP. This study evaluates the accuracy of fused deposition modeling (FDM) using standard tessellation language (STL) files and errors generated during the fabrication of bone models. Nine dry bones were selected and were computed tomography (CT) scanned. STL files were procured from the CT scans and three-dimensional (3D) models of the bones were printed using our in-house FDM based 3D printer using Acrylonitrile Butadiene Styrene (ABS) filament. Measurements were made on the bone and 3D models according to data collection procedures for forensic skeletal material. Statistical analysis was performed using SPSS version 13.0 software to establish interobserver correlation for measurements on the dry bones and the 3D bone models. The inter-observer reliability was established using the intraclass correlation coefficient for both the dry bones and the 3D models. The mean absolute difference was 0.4, which is minimal. The 3D models are comparable to the dry bones. STL file dependent FDM using ABS material produces near-anatomical 3D models. The high 3D accuracy holds promise in the clinical scenario for preoperative planning, mock surgery, and choice of implants and prostheses, especially in complicated acetabular trauma and complex hip surgeries.
Huguet, Audrey; Latournerie, Marianne; Debry, Pauline Houssel; Jezequel, Caroline; Legros, Ludivine; Rayar, Michel; Boudjema, Karim; Guyader, Dominique; Jacquet, Edouard Bardou; Thibault, Ronan
2018-02-09
Malnutrition impairs prognosis in liver cirrhosis. Our aims were to determine (1) if transversal (TPTI) and axial (APTI) psoas thickness indices predict mortality in cirrhotic patients and (2) the feasibility and reproducibility of transversal (TDPM) and axial (ADPM) diameters of the psoas muscle measurements. This was a retrospective study. Inclusion criteria included cirrhosis diagnosis, on liver transplantation waiting list, and abdominal computed tomography (CT) scan within the 3 months preceding list inscription. TDPM and ADPM were measured on a single umbilicus-targeted CT image by non-expert and expert operators. TPTI or APTI (mm/m) were calculated as TDPM or ADPM/height (m). Area under the receiver operating characteristic curve (AUC) and Cox proportional hazard models were assessed. Interobserver agreement for TPTI and APTI was assessed with the κ statistic. A total of 173 patients were included. Low TPTI was associated with increased mortality: AUC = 0.66 (95% confidence interval, 0.51-0.80). TPTI was the only factor associated with mortality (hazard ratio = 0.87, 95% confidence interval 0.76-0.99, P = 0.034). There was an almost perfect interobserver agreement between the two operators: TDPM, κ = 0.97; ADPM, κ = 0.94; P < 0.0001. TPTI measured on umbilicus-targeted CT scan before inscription on the waiting list for liver transplantation predicts mortality of cirrhotic patients. TPTI measurement is easy and reliable, even by a non-trained operator, and this is highly feasible in daily clinical practice. Copyright © 2018 Elsevier Inc. All rights reserved.
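The index definition in this abstract (psoas diameter in mm divided by height in m) is simple enough to state as a short sketch. The function name and the input values below are hypothetical and are not patient data from the study.

```python
def psoas_thickness_index(diameter_mm, height_m):
    """Psoas thickness index in mm/m: psoas diameter (mm) divided by height (m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return diameter_mm / height_m

# Hypothetical patient: transversal psoas diameter 30 mm, height 1.75 m.
tpti = psoas_thickness_index(30.0, 1.75)
print(round(tpti, 1))  # 17.1
```

The same function covers both TPTI and APTI, since the two indices differ only in which diameter (TDPM or ADPM) is measured.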
Böker, Sarah M.; Bender, Yvonne Y.; Diederichs, Gerd; Fallenberg, Eva M.; Wagner, Moritz; Hamm, Bernd; Makowski, Marcus R.
2017-01-01
Objectives: To determine the diagnostic performance of susceptibility-weighted magnetic resonance imaging (SWMR) for the detection of pineal gland calcifications (PGC) compared to conventional magnetic resonance imaging (MRI) sequences, using computed tomography (CT) as a reference standard. Methods: 384 patients who received a 1.5 Tesla MRI scan including SWMR sequences and a CT scan of the brain between January 2014 and October 2016 were retrospectively evaluated. 346 patients were included in the analysis, of which 214 showed PGC on CT scans. To assess correlation between imaging modalities, the maximum calcification diameter was used. Sensitivity and specificity and intra- and interobserver reliability were calculated for SWMR and conventional MRI sequences. Results: SWMR reached a sensitivity of 95% (95% CI: 91%-97%) and a specificity of 96% (95% CI: 91%-99%) for the detection of PGC, whereas conventional MRI achieved a sensitivity of 43% (95% CI: 36%-50%) and a specificity of 96% (95% CI: 91%-99%). Detection rates for calcifications in SWMR and conventional MRI differed significantly (95% versus 43%, p<0.001). Diameter measurements between SWMR and CT showed a close correlation (R2 = 0.85, p<0.001) with a slight but not significant overestimation of size (SWMR: 6.5 mm ± 2.5; CT: 5.9 mm ± 2.4, p = 0.02). Interobserver agreement for diameter measurements was excellent on SWMR (ICC = 0.984, p < 0.0001). Conclusions: Combining SWMR magnitude and phase information enables the accurate detection of PGC and offers a better diagnostic performance than conventional MRI with CT as a reference standard. PMID:28278291
Yue, Dong; Fan Rong, Cheng; Ning, Cai; Liang, Hu; Ai Lian, Liu; Ru Xin, Wang; Ya Hong, Luo
2018-07-01
Background: The evaluation of hip arthroplasty is a challenge in computed tomography (CT). The virtual monochromatic spectral (VMS) images with metal artifact reduction software (MARs) in spectral CT can reduce the artifacts and improve the image quality. Purpose: To evaluate the effects of VMS images and MARs for metal artifact reduction in patients with unilateral hip arthroplasty. Material and Methods: Thirty-five patients underwent dual-energy CT. Four sets of VMS images without MARs and four sets of VMS images with MARs were obtained. Artifact index (AI), CT number, and SD value were assessed at the periprosthetic region and the pelvic organs. The scores of two observers for different images and the inter-observer agreement were evaluated. Results: The AIs in 120 and 140 keV images were significantly lower than those in 80 and 100 keV images. The AIs of the periprosthetic region in VMS images with MARs were significantly lower than those in VMS images without MARs, while the AIs of pelvic organs were not significantly different. VMS images with MARs improved the accuracy of CT numbers for the periprosthetic region. The inter-observer agreements were good for all the images. VMS images with MARs at 120 and 140 keV had higher subjective scores and could improve the image quality, leading to reliable diagnosis of prosthesis-related problems. Conclusion: VMS images with MARs at 120 and 140 keV could significantly reduce the artifacts from hip arthroplasty and improve the image quality at the periprosthetic region but had no obvious advantage for pelvic organs.
Murakami, Keiko; Rancilio, Nicholas J; Plantenga, Jeannie Poulson; Moore, George E; Heng, Hock Gan; Lim, Chee Kin
2018-05-01
In radiation therapy (RT) treatment planning for canine head and neck cancer, the tonsils may be included as part of the treated volume. Delineation of tonsils on computed tomography (CT) scans is difficult. Error or uncertainty in the volume and location of contoured structures may result in treatment failure. The purpose of this prospective, observer agreement study was to assess the interobserver agreement of tonsillar contouring by two groups of trained observers. Thirty dogs undergoing pre- and postcontrast CT studies of the head were included. After the pre- and postcontrast CT scans, the tonsils were identified via direct visualization, barium paste was applied bilaterally to the visible tonsils, and a third CT scan was acquired. Data from each of the three CT scans were registered in an RT treatment planning system. Two groups of observers (one veterinary radiologist and one veterinary radiation oncologist in each group) contoured bilateral tonsils by consensus, obtaining three sets of contours. Tonsil volume and location data were obtained from both groups. The contour volumes and locations were compared between groups using mixed (fixed and random effect) linear models. There was no significant difference between each group's contours in terms of three-dimensional coordinates. However, there was a significant difference between each group's contours in terms of the tonsillar volume (P < 0.0001). Pre- and postcontrast CT can be used to identify the location of canine tonsils with reasonable agreement between trained observers. Discrepancy in tonsillar volume between groups of trained observers may affect RT treatment outcome. © 2017 American College of Veterinary Radiology.
Cooper, David T; Behrens, Claus F
2016-01-01
Objective: In cervical radiotherapy, it is essential that the uterine position is correctly determined prior to treatment delivery. The aim of this study was to evaluate an autoscan ultrasound (A-US) probe, a motorized transducer creating three-dimensional (3D) images by sweeping, by comparing it with a conventional ultrasound (C-US) probe, where manual scanning is required to acquire 3D images. Methods: Nine healthy volunteers were scanned by seven operators, using the Clarity® system (Elekta, Stockholm, Sweden). In total, 72 scans, 36 scans from the C-US and 36 scans from the A-US probes, were acquired. Two observers delineated the uterine structure, using the software-assisted segmentation in the Clarity workstation. The data of uterine volume, uterine centre of mass (COM) and maximum uterine lengths, in three orthogonal directions, were analyzed. Results: In 53% of the C-US scans, the whole uterus was captured, compared with 89% using the A-US. F-test on 36 scans demonstrated statistically significant differences in interobserver COM standard deviation (SD) when comparing the C-US with the A-US probe for the inferior–superior (p < 0.006), left–right (p < 0.012) and anteroposterior directions (p < 0.001). The median of the interobserver COM distance (Euclidean distance for 36 scans) was reduced from 8.5 (C-US) to 6.0 mm (A-US). An F-test on the 36 scans showed strong significant differences (p < 0.001) in the SD of the Euclidean interobserver distance when comparing the C-US with the A-US scans. The average Dice coefficient when comparing the two observers was 0.67 (C-US) and 0.75 (A-US). The predictive interval demonstrated better interobserver delineation concordance using the A-US probe. Conclusion: The A-US probe imaging might be a better choice of image-guided radiotherapy system for correcting for daily uterine positional changes in cervical radiotherapy. 
Advances in knowledge: Using a novel A-US probe might reduce the uncertainty in interoperator variability during ultrasound scanning. PMID:27452268
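The interobserver centre-of-mass (COM) comparison described in this abstract amounts to a Euclidean distance per scan, summarized by a median over scans. The sketch below shows that calculation on hypothetical coordinates; the variable names and values are illustrative and are not the study data.

```python
from math import dist
from statistics import median

def com_distances(coms_a, coms_b):
    """Euclidean distance between two observers' COM estimates, one per scan."""
    return [dist(a, b) for a, b in zip(coms_a, coms_b)]

# Hypothetical COM coordinates in mm (left-right, anteroposterior, inferior-superior).
observer_1 = [(0.0, 0.0, 0.0), (10.0, 5.0, 2.0), (1.0, 1.0, 1.0)]
observer_2 = [(3.0, 4.0, 0.0), (10.0, 5.0, 8.0), (1.0, 1.0, 3.0)]

d = com_distances(observer_1, observer_2)
print(d)          # [5.0, 6.0, 2.0]
print(median(d))  # 5.0
```

Using the median rather than the mean keeps the summary robust to the occasional scan where the two delineations differ grossly.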
Baker, Mariwan; Cooper, David T; Behrens, Claus F
2016-10-01
In cervical radiotherapy, it is essential that the uterine position is correctly determined prior to treatment delivery. The aim of this study was to evaluate an autoscan ultrasound (A-US) probe, a motorized transducer creating three-dimensional (3D) images by sweeping, by comparing it with a conventional ultrasound (C-US) probe, where manual scanning is required to acquire 3D images. Nine healthy volunteers were scanned by seven operators, using the Clarity® system (Elekta, Stockholm, Sweden). In total, 72 scans, 36 scans from the C-US and 36 scans from the A-US probes, were acquired. Two observers delineated the uterine structure, using the software-assisted segmentation in the Clarity workstation. The data of uterine volume, uterine centre of mass (COM) and maximum uterine lengths, in three orthogonal directions, were analyzed. In 53% of the C-US scans, the whole uterus was captured, compared with 89% using the A-US. F-test on 36 scans demonstrated statistically significant differences in interobserver COM standard deviation (SD) when comparing the C-US with the A-US probe for the inferior-superior (p < 0.006), left-right (p < 0.012) and anteroposterior directions (p < 0.001). The median of the interobserver COM distance (Euclidean distance for 36 scans) was reduced from 8.5 (C-US) to 6.0 mm (A-US). An F-test on the 36 scans showed strong significant differences (p < 0.001) in the SD of the Euclidean interobserver distance when comparing the C-US with the A-US scans. The average Dice coefficient when comparing the two observers was 0.67 (C-US) and 0.75 (A-US). The predictive interval demonstrated better interobserver delineation concordance using the A-US probe. The A-US probe imaging might be a better choice of image-guided radiotherapy system for correcting for daily uterine positional changes in cervical radiotherapy. Using a novel A-US probe might reduce the uncertainty in interoperator variability during ultrasound scanning.
Gastritis staging: interobserver agreement by applying OLGA and OLGIM systems.
Isajevs, Sergejs; Liepniece-Karele, Inta; Janciauskas, Dainius; Moisejevs, Georgijs; Putnins, Viesturs; Funka, Konrads; Kikuste, Ilze; Vanags, Aigars; Tolmanis, Ivars; Leja, Marcis
2014-04-01
Atrophic gastritis remains a difficult histopathological diagnosis with low interobserver agreement. The aim of our study was to compare gastritis staging and interobserver agreement between general and expert gastrointestinal (GI) pathologists using Operative Link for Gastritis Assessment (OLGA) and Operative Link on Gastric Intestinal Metaplasia (OLGIM). We enrolled 835 patients undergoing upper endoscopy in the study. Two general and two expert gastrointestinal pathologists graded biopsy specimens according to the Sydney classification, and the stage of gastritis was assessed by the OLGA and OLGIM systems. Using OLGA, 280 (33.4%) patients had gastritis (stage I-IV), whereas with OLGIM this was 167 (19.9%). OLGA stage III-IV gastritis was observed in 25 patients, whereas by OLGIM stage III-IV was found in 23 patients. Interobserver agreement between expert GI pathologists for atrophy in the antrum, incisura angularis, and corpus was moderate (kappa = 0.53, 0.57 and 0.41, respectively, p < 0.0001), but almost perfect for intestinal metaplasia (kappa = 0.82, 0.80 and 0.81, respectively, p < 0.0001). However, interobserver agreement between general pathologists was poor for atrophy, but moderate for intestinal metaplasia. OLGIM staging provided the highest interobserver agreement, but a substantial proportion of potentially high-risk individuals would be missed if only OLGIM staging were applied. Therefore, we recommend using a combination of OLGA and OLGIM for staging of chronic gastritis.
NASA Astrophysics Data System (ADS)
Gavrielides, Marios A.; Ronnett, Brigitte M.; Vang, Russell; Seidman, Jeffrey D.
2015-03-01
Studies have shown that different cell types of ovarian carcinoma have different molecular profiles, exhibit different behavior, and that patients could benefit from type-specific treatment. Different cell types display different histopathology features, and different criteria are used for each cell type classification. Inter-observer variability for the task of classifying ovarian cancer cell types is an under-examined area of research. This study served as a pilot study to quantify observer variability related to the classification of ovarian cancer cell types and to extract valuable data for designing a validation study of digital pathology (DP) for this task. Three observers with expertise in gynecologic pathology reviewed 114 cases of ovarian cancer with optical microscopy, with specific guidelines for classifications into distinct cell types. For 93 cases all 3 pathologists agreed on the same cell type, for 18 cases 2 out of 3 agreed, and for 3 cases there was no agreement. Across cell types with a minimum sample size of 10 cases, agreement between all three observers was {91.1%, 80.0%, 90.0%, 78.6%, 100.0%, 61.5%} for the high grade serous carcinoma, low grade serous carcinoma, endometrioid, mucinous, clear cell, and carcinosarcoma cell types respectively. These results indicate that unanimous agreement varied over a fairly wide range. However, additional research is needed to determine the importance of these differences in comparison studies. These results will be used to aid in the design and sizing of such a study comparing optical and digital pathology. In addition, the results will help in understanding the potential role computer-aided diagnosis has in helping to improve the agreement of pathologists for this task.
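The three-way breakdown in this abstract (all three agree, two of three agree, no agreement) can be tallied as in the sketch below. The helper names and the toy case labels are hypothetical. The only figure taken from the abstract is the 93 of 114 unanimous cases, from which the overall unanimous rate is derived.

```python
from collections import Counter

def agreement_level(labels):
    """Classify one case by how many of the raters chose the most common label."""
    top = Counter(labels).most_common(1)[0][1]
    if top == len(labels):
        return "unanimous"
    if top > 1:
        return "partial"  # for three raters: 2 out of 3 agree
    return "none"

def summarize(cases):
    """Count cases at each agreement level."""
    return Counter(agreement_level(labels) for labels in cases)

# Hypothetical ratings from three observers for four cases.
cases = [
    ("serous", "serous", "serous"),
    ("mucinous", "mucinous", "endometrioid"),
    ("clear cell", "clear cell", "clear cell"),
    ("serous", "mucinous", "endometrioid"),
]
summary = summarize(cases)
print(summary["unanimous"], summary["partial"], summary["none"])  # 2 1 1

# Overall unanimous rate implied by the counts in the abstract.
print(round(100 * 93 / 114, 1))  # 81.6
```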
Hesketh, Kim; Sankar, Wudbhav; Joseph, Benjamin; Narayanan, Unni; Mulpuri, Kishore
2016-04-01
The incidence of avascular necrosis (AVN) following reconstructive hip surgery in cerebral palsy (CP) ranges from 0 to 69% in the current literature. The purpose of this study was to determine the inter- and intra-observer reliability of radiographically diagnosing AVN in children with CP after hip surgery. A retrospective review of 65 children with CP who had reconstructive hip surgery between 2009 and 2012 at BC Children's Hospital was completed. Anterior-posterior and lateral radiographs were presented to four pediatric orthopaedic surgeons over two rounds. Surgeons were asked to review the set of unidentified radiographs and comment 'yes' or 'no' for the presence of AVN. Two weeks later the same set of radiographs was sent in a different order and the surgeons were again asked to comment on AVN. Inter- and intra-observer reliability was determined using kappa statistics. The intra-observer reliability ranged from 0.65 to 0.88 with an average score of 0.76. Inter-observer reliability showed greater variability, ranging from 0.41 to 0.77 with an average score of 0.56 across all surgeons. Although the intra-rater reliability produced a strength of "good" and the inter-rater reliability a strength of "moderate" agreement, the variability within these scores is clinically important as it demonstrates the difficulty in identifying AVN. This may explain the variability in AVN that is reported in the literature. Further education and research on the diagnosis of AVN in children with CP who have undergone reconstructive hip surgery are needed.
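For a yes/no rating task like the one in this abstract, agreement between a pair of raters is commonly summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a minimal illustration under that assumption. The abstract only says "kappa statistics", so the exact variant used in the study is not known, and the ratings shown are hypothetical.

```python
from collections import Counter

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(r1)
    # Observed proportion of cases on which the raters agree.
    p_o = sum(a == b for a, b in zip(r1, r2)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum(c1[k] * c2[k] for k in c1) / n ** 2
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)

# Hypothetical AVN ratings by two surgeons for ten radiographs.
surgeon_a = ["yes"] * 4 + ["no"] * 6
surgeon_b = ["yes", "yes", "yes", "no", "no", "no", "no", "no", "no", "yes"]

kappa = cohen_kappa(surgeon_a, surgeon_b)
print(round(kappa, 2))  # 0.58
```

In this toy example the raters agree on 8 of 10 cases (80%), yet kappa is only 0.58 because 52% agreement would be expected by chance alone.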
Duncan, James R; Kline, Benjamin; Glaiberman, Craig B
2007-04-01
To create and test methods of extracting efficiency data from recordings of simulated renal stent procedures. Task analysis was performed and used to design a standardized testing protocol. Five experienced angiographers then performed 16 renal stent simulations using the Simbionix AngioMentor angiographic simulator. Audio and video recordings of these simulations were captured from multiple vantage points. The recordings were synchronized and compiled. A series of efficiency metrics (procedure time, contrast volume, and tool use) were then extracted from the recordings. The intraobserver and interobserver variability of these individual metrics was also assessed. The metrics were converted to costs and aggregated to determine the fixed and variable costs of a procedure segment or the entire procedure. Task analysis and pilot testing led to a standardized testing protocol suitable for performance assessment. Task analysis also identified seven checkpoints that divided the renal stent simulations into six segments. Efficiency metrics for these different segments were extracted from the recordings and showed excellent intra- and interobserver correlations. Analysis of the individual and aggregated efficiency metrics demonstrated large differences between segments as well as between different angiographers. These differences persisted when efficiency was expressed as either total or variable costs. Task analysis facilitated both protocol development and data analysis. Efficiency metrics were readily extracted from recordings of simulated procedures. Aggregating the metrics and dividing the procedure into segments revealed potential insights that could be easily overlooked because the simulator currently does not attempt to aggregate the metrics and only provides data derived from the entire procedure. The data indicate that analysis of simulated angiographic procedures will be a powerful method of assessing performance in interventional radiology.
Ullman, Karen L; Ning, Holly; Susil, Robert C; Ayele, Asna; Jocelyn, Lucresse; Havelos, Jan; Guion, Peter; Xie, Huchen; Li, Guang; Arora, Barbara C; Cannon, Angela; Miller, Robert W; Norman Coleman, C; Camphausen, Kevin; Ménard, Cynthia
2006-01-01
Background: We sought to determine the intra- and inter-radiation therapist reproducibility of a previously established matching technique for daily verification and correction of isocenter position relative to intraprostatic fiducial markers (FM). Materials and methods: With the patient in the treatment position, anterior-posterior and left lateral electronic images are acquired on an amorphous silicon flat panel electronic portal imaging device. After each portal image is acquired, the therapist manually translates and aligns the fiducial markers in the image to the marker contours on the digitally reconstructed radiograph. The distance between the planned and actual isocenter locations is displayed. In order to determine the reproducibility of this technique, four therapists repeated and recorded this operation two separate times on 20 previously acquired portal image datasets from two patients. The data were analyzed to obtain the mean variability in the distances measured between and within observers. Results: The mean and median intra-observer variability ranged from 0.4 to 0.7 mm and 0.3 to 0.6 mm respectively with a standard deviation of 0.4 to 1.0 mm. Inter-observer results were similar with a mean variability of 0.9 mm, a median of 0.6 mm, and a standard deviation of 0.7 mm. When using a 5 mm threshold, only 0.5% of treatments will undergo a table shift due to intra- or inter-observer error, increasing to an error rate of 2.4% if this threshold were reduced to 3 mm. Conclusion: We have found high reproducibility with a previously established method for daily verification and correction of isocenter position relative to prostatic fiducial markers using electronic portal imaging. PMID:16722575
Can imaging criteria distinguish enchondroma from grade 1 chondrosarcoma?
Crim, Julia; Schmidt, Robert; Layfield, Lester; Hanrahan, Christopher; Manaster, Betty Jean
2015-11-01
To minimize systematic bias and optimize agreement on imaging criteria in order to better define the accuracy of imaging criteria in the diagnosis of grade 1 chondrosarcoma. Study was IRB-approved and HIPAA compliant; informed consent was waived. A review of records identified 53 cases (38 women, 15 men, ages 21-76) that were diagnosed as enchondroma or grade 1 chondrosarcoma and had available radiographs, contrast-enhanced MRI, and definitive diagnosis by histology or 5-year follow-up. Two musculoskeletal (MSK) radiologists read the studies independently after a session where they agreed on criteria for malignancy. Interobserver variability was determined as raw variability and with the kappa statistic. Accuracy was determined compared to final diagnosis. Reliability of imaging features of chondrosarcoma was determined using regression analysis. The correct diagnosis of enchondroma was made on radiographs in 43/64 (67.2%) of readings, and on MRI in 37/64 (57.8%). The correct diagnosis of chondrosarcoma was made on radiographs in 5/24 (20.8%) of readings, and on MRI in 14/24 (57.8%). A diagnosis of borderline lesion was made in 19/64 (29.7%) of enchondromas on radiographs and 18/64 (28.1%) on MRI. The false positive rate of radiographs for chondrosarcoma was 2/64 (3.1%) and the false positive rate of MRI was 9/64 (14.1%). There was substantial interobserver variability. Cortical thickening and bone expansion were rare but specific signs of chondrosarcoma. Both radiographs and MRI have limitations in the evaluation of low-grade cartilage lesions. MRI has an increased rate of both true-positive and false-positive diagnosis compared to radiographs. Differences in the findings of this study compared to previous literature may reflect the influence of systematic biases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Prospective comparison of speckle tracking longitudinal bidimensional strain between two vendors.
Castel, Anne-Laure; Szymanski, Catherine; Delelis, François; Levy, Franck; Menet, Aymeric; Mailliet, Amandine; Marotte, Nathalie; Graux, Pierre; Tribouilloy, Christophe; Maréchaux, Sylvestre
2014-02-01
Speckle tracking is a relatively new, largely angle-independent technique used for the evaluation of myocardial longitudinal strain (LS). However, significant differences have been reported between LS values obtained by speckle tracking with the first generation of software products. To compare LS values obtained with the most recently released equipment from two manufacturers. Systematic scanning with head-to-head acquisition with no modification of the patient's position was performed in 64 patients with equipment from two different manufacturers, with subsequent off-line post-processing for speckle tracking LS assessment (Philips QLAB 9.0 and General Electric [GE] EchoPAC BT12). The interobserver variability of each software product was tested on a randomly selected set of 20 echocardiograms from the study population. GE and Philips interobserver coefficients of variation (CVs) for global LS (GLS) were 6.63% and 5.87%, respectively, indicating good reproducibility. Reproducibility was very variable for regional and segmental LS values, with CVs ranging from 7.58% to 49.21% with both software products. The concordance correlation coefficient (CCC) between GLS values was high at 0.95, indicating substantial agreement between the two methods. While good agreement was observed between midwall and apical regional strains with the two software products, basal regional strains were poorly correlated. The agreement between the two software products at a segmental level was very variable; the highest correlation was obtained for the apical cap (CCC 0.90) and the poorest for basal segments (CCC range 0.31-0.56). A high level of agreement and reproducibility for global but not for basal regional or segmental LS was found with two vendor-dependent software products. This finding may help to reinforce clinical acceptance of GLS in everyday clinical practice. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
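The concordance correlation coefficient (CCC) reported in this abstract measures agreement between two methods, penalizing both scatter and systematic offset. A minimal sketch of Lin's CCC is given below, assuming population (1/n) variances. The abstract does not state which estimator the authors used, and the strain values shown are hypothetical.

```python
from statistics import fmean

def concordance_cc(x, y):
    """Lin's concordance correlation coefficient using 1/n (population) moments."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    n = len(x)
    mx, my = fmean(x), fmean(y)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    s_xx = sum((a - mx) ** 2 for a in x) / n
    s_yy = sum((b - my) ** 2 for b in y) / n
    return 2 * s_xy / (s_xx + s_yy + (mx - my) ** 2)

# Hypothetical paired readings from two software products.
vendor_a = [1.0, 2.0, 3.0]
vendor_b = [2.0, 3.0, 4.0]  # perfectly correlated, but offset by 1

ccc = concordance_cc(vendor_a, vendor_b)
print(round(ccc, 3))  # 0.571
```

The example shows why CCC is preferred over Pearson correlation for method comparison: the two series have a Pearson correlation of 1, but the constant offset lowers the CCC to 4/7.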
Quantitative characterization of color Doppler images: reproducibility, accuracy, and limitations.
Delorme, S; Weisser, G; Zuna, I; Fein, M; Lorenz, A; van Kaick, G
1995-01-01
A computer-based quantitative analysis for color Doppler images of complex vascular formations is presented. The red-green-blue-signal from an Acuson XP10 is frame-grabbed and digitized. By matching each image pixel with the color bar, color pixels are identified and assigned to the corresponding flow velocity (color value). Data analysis consists of delineation of a region of interest and calculation of the relative number of color pixels in this region (color pixel density) as well as the mean color value. The mean color value was compared to flow velocities in a flow phantom. The thyroid and carotid artery in a volunteer were repeatedly examined by a single examiner to assess intra-observer variability. The thyroids in five healthy controls were examined by three experienced physicians to assess the extent of inter-observer variability and observer bias. The correlation between the mean color value and flow velocity ranged from 0.94 to 0.96 for a range of velocities determined by pulse repetition frequency. The average deviation of the mean color value from the flow velocity was 22% to 41%, depending on the selected pulse repetition frequency (range of deviations, -46% to +66%). Flow velocity was underestimated with inadequately low pulse repetition frequency, or inadequately high reject threshold. An overestimation occurred with inadequately high pulse repetition frequency. The highest intra-observer variability was 22% (relative standard deviation) for the color pixel density, and 9.1% for the mean color value. The inter-observer variation was approximately 30% for the color pixel density, and 20% for the mean color value. In conclusion, computer assisted image analysis permits an objective description of color Doppler images. However, the user must be aware that image acquisition under in vivo conditions as well as physical and instrumental factors may considerably influence the results.
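The two summary measures described above (color pixel density and mean color value) can be sketched as follows. This is an illustration only: the input representation, the function name, and the choice to average over color pixels alone are assumptions, not details given in the abstract.

```python
def summarize_roi(roi_pixels):
    """Summarize a region of interest from a color Doppler frame.

    roi_pixels: one entry per pixel inside the region of interest.
        None  -> pixel carries no color (gray-scale tissue)
        float -> color value (flow velocity matched from the color bar)
    Returns (color_pixel_density, mean_color_value).
    Assumption: the mean is taken over color pixels only.
    """
    if not roi_pixels:
        raise ValueError("empty region of interest")
    color_values = [v for v in roi_pixels if v is not None]
    density = len(color_values) / len(roi_pixels)
    mean_value = sum(color_values) / len(color_values) if color_values else None
    return density, mean_value
```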
Kushnir, Vladimir M; Wani, Sachin B; Fowler, Kathryn; Menias, Christine; Varma, Rakesh; Narra, Vamsi; Hovis, Christine; Murad, Faris M; Mullady, Daniel K; Jonnalagadda, Sreenivasa S; Early, Dayna S; Edmundowicz, Steven A; Azar, Riad R
2013-04-01
There are limited data comparing imaging modalities in the diagnosis of pancreas divisum. We aimed to: (1) evaluate the sensitivity of endoscopic ultrasound (EUS), magnetic resonance cholangiopancreatography (MRCP), and multidetector computed tomography (MDCT) for pancreas divisum; and (2) assess interobserver agreement (IOA) among expert radiologists for detecting pancreas divisum on MDCT and MRCP. For this retrospective cohort study, we identified 45 consecutive patients with pancreaticobiliary symptoms and pancreas divisum established by endoscopic retrograde pancreatography who underwent EUS and cross-sectional imaging. The control group was composed of patients without pancreas divisum who underwent endoscopic retrograde pancreatography and cross-sectional imaging. The sensitivity of EUS for pancreas divisum was 86.7%, significantly higher than the sensitivity reported in the medical records for MDCT (15.5%) or MRCP (60%) (P < 0.001 for each). On review by expert radiologists, the sensitivity of MDCT increased to 83.3% in cases where the pancreatic duct was visualized, with fair IOA (κ = 0.34). Expert review of MRCPs did not identify any additional cases of pancreas divisum; IOA was moderate (κ = 0.43). Endoscopic ultrasound is a sensitive test for diagnosing pancreas divisum and is superior to MDCT and MRCP. Review of MDCT studies by expert radiologists substantially raises its sensitivity for pancreas divisum.
Petrović, Kosta; Turkalj, Ivan; Stojanović, Sanja; Vucaj-Cirilović, Viktorija; Nikolić, Olivera; Stojiljković, Dragana
2013-08-01
Computerized tomography (CT), especially multidetector CT (MDCT), has had a revolutionary impact on diagnostics in traumatized patients. The aim of the study was to identify and compare the frequency of injuries to bone structures of the thorax displayed with 5-mm-thick axial CT slices and with thin-slice (MDCT) examination using 3D reconstructions, primarily multiplanar reformations (MPR). This prospective study included 61 patients with blunt trauma who underwent a CT scan of the thorax as initial assessment. Two experienced radiologists independently and separately described the findings for 5-mm-thick axial CT slices (5 mm CT), as in monoslice CT examination, and for MPR and other 3D reconstructions along with the thin-slice axial sections available in modern MDCT technologies. Where the findings on thin-slice examination disagreed, the examiners redescribed the thin-slice examination together, and this joint reading was ultimately considered the true finding. No statistically significant difference in interobserver evaluation of 5 mm CT examination was recorded (p > 0.05). Evaluation of fractures of the sternum with 5 mm CT and MDCT showed a statistically significant difference (p < 0.05) in favor of better display of injury by MDCT examination. MDCT is a powerful diagnostic tool that can identify a higher number of bone fractures of the chest in traumatized patients than 5 mm CT, especially in the region of the sternum, for which statistical significance was obtained using MPR. Moreover, the importance of MDCT also lies in easier and more accurate determination of the level of bone injury.
Observers' Agreement on Measurements in Fiberoptic Endoscopic Evaluation of Swallowing.
Pilz, Walmari; Vanbelle, Sophie; Kremer, Bernd; van Hooren, Michel R; van Becelaere, Tine; Roodenburg, Nel; Baijens, Laura W J
2016-04-01
This study analyzed the effect that dysphagia etiology, different observers, and bolus consistency might have on the level of agreement for measurements in FEES images reached by independent versus consensus panel rating. Sixty patients were included and divided into two groups according to dysphagia etiology: neurological or head and neck oncological. All patients underwent standardized FEES examination using thin and thick liquid consistencies. Two observers scored the same exams, first independently and then in a consensus panel. Four ordinal FEES variables were analyzed. Statistical analysis was performed using a linear weighted kappa coefficient and Bayesian multilevel model. Intra- and interobserver agreement on FEES measurements ranged from 0.76 to 0.93 and from 0.61 to 0.88, respectively. Dysphagia etiology did not influence observers' agreement level. However, bolus consistency resulted in decreased interobserver agreement for all measured FEES variables during thin liquid swallows. When rating on the consensus panel, the observers deviated considerably from the scores they had previously given on the independent rating task. Observer agreement on measurements in FEES exams was influenced by bolus consistency, not by dysphagia etiology. Therefore, observer agreement on FEES measurements should be analyzed by taking bolus consistency into account, as it might affect the interpretation of the outcome. Identifying factors that might influence agreement levels could lead to better understanding of the rating process and assist in developing a more precise measurement scale that would ensure higher levels of observer agreement for measurements in FEES exams.
Artificial Intelligence in Mitral Valve Analysis
Jeganathan, Jelliffe; Knio, Ziyad; Amador, Yannis; Hai, Ting; Khamooshian, Arash; Matyal, Robina; Khabbaz, Kamal R; Mahmood, Feroze
2017-01-01
Background: Echocardiographic analysis of mitral valve (MV) has become essential for diagnosis and management of patients with MV disease. Currently, the various software used for MV analysis require manual input and are prone to interobserver variability in the measurements. Aim: The aim of this study is to determine the interobserver variability in an automated software that uses artificial intelligence for MV analysis. Settings and Design: Retrospective analysis of intraoperative three-dimensional transesophageal echocardiography data acquired from four patients with normal MV undergoing coronary artery bypass graft surgery in a tertiary hospital. Materials and Methods: Echocardiographic data were analyzed using the eSie Valve Software (Siemens Healthcare, Mountain View, CA, USA). Three examiners analyzed three end-systolic (ES) frames from each of the four patients. A total of 36 ES frames were analyzed and included in the study. Statistical Analysis: A multiple mixed-effects ANOVA model was constructed to determine if the examiner, the patient, and the loop had a significant effect on the average value of each parameter. A Bonferroni correction was used to correct for multiple comparisons, and P = 0.0083 was considered to be significant. Results: Examiners did not have an effect on any of the six parameters tested. Patient and loop had an effect on the average parameter value for each of the six parameters as expected (P < 0.0083 for both). Conclusion: We were able to conclude that using automated analysis, it is possible to obtain results with good reproducibility, which only requires minimal user intervention. PMID:28393769
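The significance threshold quoted above is consistent with a Bonferroni correction over six comparisons. The sketch below reproduces the arithmetic, assuming a family-wise alpha of 0.05 and one comparison per parameter; the abstract states only the six parameters and the resulting threshold, so both inputs are assumptions.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return alpha / n_comparisons


# Assumed inputs: family-wise alpha of 0.05, six tested parameters.
threshold = bonferroni_threshold(0.05, 6)
print(f"{threshold:.4f}")  # prints 0.0083
```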
Tinnangwattana, Dangcheewan; Vichak-Ururote, Linlada; Tontivuthikul, Paponrad; Charoenratana, Cholaros; Lerthiranwong, Thitikarn; Tongsong, Theera
2015-01-01
To evaluate the diagnostic performance of IOTA simple rules in predicting malignant adnexal tumors when applied by non-expert examiners. Five obstetric/gynecologic residents, who had never performed gynecologic ultrasound examinations by themselves before, were trained in the IOTA simple rules by an experienced examiner. One trained resident performed ultrasound examinations including IOTA simple rules on 100 women, who were scheduled for surgery due to ovarian masses, within 24 hours of surgery. The gold standard diagnosis was based on pathological or operative findings. The five trained residents applied the IOTA simple rules to 30 patients for evaluation of inter-observer variability. A total of 100 patients underwent ultrasound examination for the IOTA simple rules. Of them, IOTA simple rules could be applied in 94 (94%) masses including 71 (71.0%) benign masses and 29 (29.0%) malignant masses. The diagnostic performance of IOTA simple rules showed sensitivity of 89.3% (95%CI, 77.8%; 100.7%), specificity 83.3% (95%CI, 74.3%; 92.3%). Inter-observer variability was analyzed using Cohen's kappa coefficient. Kappa indices for the four pairs of raters ranged from 0.713 to 0.884 (0.722, 0.827, 0.713, and 0.884). IOTA simple rules have high diagnostic performance in discriminating adnexal masses even when applied by non-expert sonographers, though a training course may be required. Nevertheless, they should be further tested by a greater number of general practitioners before wide use.
Automated consensus contour building for prostate MRI.
Khalvati, Farzad
2014-01-01
Inter-observer variability is the lack of agreement among clinicians in contouring a given organ or tumour in a medical image. The variability in medical image contouring is a source of uncertainty in radiation treatment planning. Consensus contour of a given case, which was proposed to reduce the variability, is generated by combining the manually generated contours of several clinicians. However, having access to several clinicians (e.g., radiation oncologists) to generate a consensus contour for one patient is costly. This paper presents an algorithm that automatically generates a consensus contour for a given case using the atlases of different clinicians. The algorithm was applied to prostate MR images of 15 patients manually contoured by 5 clinicians. The automatic consensus contours were compared to manual consensus contours where a median Dice similarity coefficient (DSC) of 88% was achieved.
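The Dice similarity coefficient used above to compare automatic and manual consensus contours is twice the overlap divided by the summed sizes of the two regions. A minimal sketch follows, assuming each contour has already been rasterized to a set of pixel coordinates; the function name and the handling of two empty masks are illustrative choices, not taken from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two segmentations.

    mask_a, mask_b: iterables of (row, col) coordinates of the pixels
    enclosed by each contour.
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Assumed convention: two empty masks are treated as identical.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical example: two 2x2 squares sharing one column of pixels.
square_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
square_2 = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(dice_coefficient(square_1, square_2))  # prints 0.5
```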
Heiberg, Einar; Ugander, Martin; Engblom, Henrik; Götberg, Matthias; Olivecrona, Göran K; Erlinge, David; Arheden, Håkan
2008-02-01
Ethics committees approved human and animal study components; informed written consent was provided (prospective human study [20 men; mean age, 62 years]) or waived (retrospective human study [16 men, four women; mean age, 59 years]). The purpose of this study was to prospectively evaluate a clinically applicable method, accounting for the partial volume effect, to automatically quantify myocardial infarction from delayed contrast material-enhanced magnetic resonance images. Pixels were weighted according to signal intensity to calculate infarct fraction for each pixel. Mean bias +/- variability (or standard deviation), expressed as percentage left ventricular myocardium (%LVM), were -0.3 +/- 1.3 (animals), -1.2 +/- 1.7 (phantoms), and 0.3 +/- 2.7 (patients), respectively. Algorithm had lower variability than dichotomous approach (2.7 vs 7.7 %LVM, P < .01) and did not differ from interobserver variability for bias (P = .31) or variability (P = .38). The weighted approach provides automatic quantification of myocardial infarction with higher accuracy and lower variability than a dichotomous algorithm. (c) RSNA, 2007.
Patra, S; Gomm, E M W; Macipe, M; Bailey, C
2009-08-01
To assess the quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and to set standards for future interobserver agreement reports. A prospective audit of 213 image sets from six fully trained primary graders in the Bristol and Weston diabetic retinopathy screening programme was carried out over a 4-week period. All the images graded by the primary graders were regraded by an expert grader blinded to the primary grading results and the identity of the primary grader. The interobserver agreement between primary graders and the blinded expert grader and the corresponding Kappa coefficient was determined for overall grading, referable, non-referable and ungradable disease. The audit standard was set at 80% for interobserver agreement with a Kappa coefficient of 0.7. The interobserver agreement bettered the audit standard of 80% in all the categories. The Kappa coefficient was substantial (0.7) for the overall grading results and ranged from moderate to substantial (0.59-0.65) for referable, non-referable and ungradable disease categories. The main recommendation of the audit was to provide refresher training for the primary graders with focus on ungradable disease. The audit demonstrated an acceptable level of quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and provided a standard against which future interobserver agreement can be measured for quality assurance within a screening programme. Diabet. Med. 26, 820-823 (2009).
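The audit above sets separate standards for raw agreement (80%) and for the kappa coefficient (0.7), because kappa discounts the agreement expected by chance. The sketch below shows how both figures come from the same pair of gradings, assuming the standard Cohen's kappa for two raters; the function name and example grades are hypothetical.

```python
from collections import Counter


def agreement_and_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two raters' labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty label sequences")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is undefined.
        return observed, float("nan")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical gradings: 1 = referable, 0 = non-referable.
primary = [1, 1, 0, 0]
expert = [1, 0, 0, 0]
print(agreement_and_kappa(primary, expert))  # prints (0.75, 0.5)
```

In this toy case 75% raw agreement corresponds to a kappa of only 0.5, which illustrates why a screening programme might track both numbers.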
Validation of in vivo 2D displacements from spiral cine DENSE at 3T.
Wehner, Gregory J; Suever, Jonathan D; Haggerty, Christopher M; Jing, Linyuan; Powell, David K; Hamlet, Sean M; Grabau, Jonathan D; Mojsejenko, Walter Dimitri; Zhong, Xiaodong; Epstein, Frederick H; Fornwalt, Brandon K
2015-01-30
Displacement Encoding with Stimulated Echoes (DENSE) encodes displacement into the phase of the magnetic resonance signal. Due to the stimulated echo, the signal is inherently low and fades through the cardiac cycle. To compensate, a spiral acquisition has been used at 1.5T. This spiral sequence has not been validated at 3T, where the increased signal would be valuable, but field inhomogeneities may result in measurement errors. We hypothesized that spiral cine DENSE is valid at 3T and tested this hypothesis by measuring displacement errors at both 1.5T and 3T in vivo. Two-dimensional spiral cine DENSE and tagged imaging of the left ventricle were performed on ten healthy subjects at 3T and six healthy subjects at 1.5T. Intersection points were identified on tagged images near end-systole. Displacements from the DENSE images were used to project those points back to their origins. The deviation from a perfect grid was used as a measure of accuracy and quantified as root-mean-squared error. This measure was compared between 3T and 1.5T with the Wilcoxon rank sum test. Inter-observer variability of strains and torsion quantified by DENSE and agreement between DENSE and harmonic phase (HARP) were assessed by Bland-Altman analyses. The signal to noise ratio (SNR) at each cardiac phase was compared between 3T and 1.5T with the Wilcoxon rank sum test. The displacement accuracy of spiral cine DENSE was not different between 3T and 1.5T (1.2 ± 0.3 mm and 1.2 ± 0.4 mm, respectively). Both values were lower than the DENSE pixel spacing of 2.8 mm. There were no substantial differences in inter-observer variability of DENSE or agreement of DENSE and HARP between 3T and 1.5T. Relative to 1.5T, the SNR at 3T was greater by a factor of 1.4 ± 0.3. The spiral cine DENSE acquisition that has been used at 1.5T to measure cardiac displacements can be applied at 3T with equivalent accuracy. 
The inter-observer variability of DENSE-derived peak strains and torsion, and their agreement with HARP, are also comparable at both field strengths. Future studies with spiral cine DENSE may take advantage of the additional SNR at 3T.
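Bland-Altman analysis, used above for inter-observer variability and for agreement between DENSE and HARP, summarizes paired differences as a bias with limits of agreement. The following is a minimal sketch, assuming the conventional limits of bias ± 1.96 sample standard deviations; the function name and the example values are hypothetical and are not data from the study.

```python
from statistics import fmean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumption: 95% limits of agreement computed as bias +/- 1.96 times the
    sample standard deviation of the paired differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long sequences with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = fmean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical peak strain readings (%) from two observers.
observer_1 = [10, 12, 14, 16]
observer_2 = [9, 12, 15, 14]
bias, lower, upper = bland_altman(observer_1, observer_2)
```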
DeAngelis, Lisa M.; Brandes, Alba A.; Peereboom, David M.; Galanis, Evanthia; Lin, Nancy U.; Soffietti, Riccardo; Macdonald, David R.; Chamberlain, Marc; Perry, James; Jaeckle, Kurt; Mehta, Minesh; Stupp, Roger; Muzikansky, Alona; Pentsova, Elena; Cloughesy, Timothy; Iwamoto, Fabio M.; Tonn, Joerg-Christian; Vogelbaum, Michael A.; Wen, Patrick Y.; van den Bent, Martin J.; Reardon, David A.
2017-01-01
Abstract Background. The Macdonald criteria and the Response Assessment in Neuro-Oncology (RANO) criteria define radiologic parameters to classify therapeutic outcome among patients with malignant glioma and specify that clinical status must be incorporated and prioritized for overall assessment. But neither provides specific parameters to do so. We hypothesized that a standardized metric to measure neurologic function will permit more effective overall response assessment in neuro-oncology. Methods. An international group of physicians including neurologists, medical oncologists, radiation oncologists, and neurosurgeons with expertise in neuro-oncology drafted the Neurologic Assessment in Neuro-Oncology (NANO) scale as an objective and quantifiable metric of neurologic function evaluable during a routine office examination. The scale was subsequently tested in a multicenter study to determine its overall reliability, inter-observer variability, and feasibility. Results. The NANO scale is a quantifiable evaluation of 9 relevant neurologic domains based on direct observation and testing conducted during routine office visits. The score defines overall response criteria. A prospective, multinational study noted a >90% inter-observer agreement rate with kappa statistic ranging from 0.35 to 0.83 (fair to almost perfect agreement), and a median assessment time of 4 minutes (interquartile range, 3–5). Conclusion. The NANO scale provides an objective clinician-reported outcome of neurologic function with high inter-observer agreement. It is designed to combine with radiographic assessment to provide an overall assessment of outcome for neuro-oncology patients in clinical trials and in daily practice. Furthermore, it complements existing patient-reported outcomes and cognition testing to combine for a global clinical outcome assessment of well-being among brain tumor patients. PMID:28453751
Conte, Gian Marco; Castellano, Antonella; Altabella, Luisa; Iadanza, Antonella; Cadioli, Marcello; Falini, Andrea; Anzalone, Nicoletta
2017-04-01
Dynamic susceptibility contrast MRI (DSC) and dynamic contrast-enhanced MRI (DCE) are useful tools in the diagnosis and follow-up of brain gliomas; nevertheless, both techniques leave open the issue of data reproducibility. We evaluated the reproducibility of data obtained using two different commercial software packages for perfusion map calculation and analysis, as one potential source of variability is the software itself. DSC and DCE analyses from 20 patients with gliomas were tested for both the intrasoftware (as intraobserver and interobserver reproducibility) and the intersoftware reproducibility, as well as the impact of different postprocessing choices [vascular input function (VIF) selection and deconvolution algorithms] on the quantification of perfusion biomarkers plasma volume (Vp), volume transfer constant (Ktrans) and rCBV. Data reproducibility was evaluated with the intraclass correlation coefficient (ICC) and Bland-Altman analysis. For all the biomarkers, the intra- and interobserver reproducibility resulted in almost perfect agreement within each software package, whereas for the intersoftware reproducibility the value ranged from 0.311 to 0.577, suggesting fair to moderate agreement; Bland-Altman analysis showed high dispersion of data, thus confirming these findings. Comparisons of different VIF estimation methods for DCE biomarkers resulted in an ICC of 0.636 for Ktrans and 0.662 for Vp; comparison of two deconvolution algorithms in DSC resulted in an ICC of 0.999. The use of a single software package ensures very good intraobserver and interobserver reproducibility. Caution should be taken when comparing data obtained using different software or different postprocessing within the same software, as reproducibility is no longer guaranteed.
Huang, Qi-Fang; Wei, Fang-Fei; Zhang, Zhen-Yu; Raaijmakers, Anke; Asayama, Kei; Thijs, Lutgarde; Yang, Wen-Yi; Mujaj, Blerim; Allegaert, Karel; Verhamme, Peter; Struijker-Boudier, Harry A J; Li, Yan; Staessen, Jan A
2018-03-10
Retinal microvascular traits predict adverse health outcomes. The Singapore I Vessel Assessment (SIVA) software improved automated postprocessing of retinal photographs. In addition to microvessel caliber, it generates measures of arteriolar and venular geometry. Few studies addressed the reproducibility of SIVA measurements across a wide age range. In the current study, 2 blinded graders read images obtained by nonmydriatic retinal photography twice in 20 11-year-old children, born prematurely (n = 10) or at term (n = 10) and in 60 adults (age range, 18.9-86.1 years). Former preterm compared with term children had lower microvessel diameter and disorganized vessel geometry with no differences in intraobserver and interobserver variability. Among adults, microvessel caliber decreased with age and blood pressure and arteriolar geometry was inversely correlated with female sex and age. Intraobserver differences estimated by the Bland-Altman method did not reach significance for any measurement. Across measurements, median reproducibility (RM) expressed as percent of the average trait value was 8.8% in children (median intraclass correlation coefficient [ICC], 0.94) and 8.0% (0.97) in adults. Likewise, interobserver differences did not reach significance with RM (ICC) of 10.6% (0.85) in children and 10.4% (0.93) in adults. Reproducibility was best for microvessel caliber (intraobserver/interobserver RM, 4.7%/6.0%; ICC, 0.98/0.96), worst for venular geometry (17.0%/18.8%; 0.93/0.84), and intermediate for arteriolar geometry (10.9%/14.9%; 0.95/0.86). SIVA produces repeatable measures of the retinal microvasculature in former preterm and term children and in adults, thereby proving its usability from childhood to old age.
Chang, Shang-Jen; Yang, Stephen S D
2008-12-01
To evaluate the inter-observer and intra-observer agreement on the interpretation of uroflowmetry curves of children. Healthy kindergarten children were enrolled for evaluation of uroflowmetry. Uroflowmetry curves were classified as bell-shaped, tower, plateau, staccato and interrupted. Only the bell-shaped curves were regarded as normal. Two urodynamists evaluated the curves independently after reviewing the definitions of the different types of uroflowmetry curve. The senior urodynamist evaluated the curves twice 3 months apart. The final conclusion was made when consensus was reached. Agreement among observers was analyzed using kappa statistics. Of 190 uroflowmetry curves eligible for analysis, the intra-observer agreement in interpreting each type of curve and interpreting normalcy vs abnormality was good (kappa=0.71 and 0.68, respectively). Very good inter-observer agreement (kappa=0.81) on normalcy and good inter-observer agreement (kappa=0.73) on types of uroflowmetry were observed. Poor inter-observer agreement existed on the classification of specific types of abnormal uroflowmetry curves (kappa=0.07). Uroflowmetry is a good screening tool for normalcy of kindergarten children, while not a good tool to define the specific types of abnormal uroflowmetry.
Maroules, Christopher D; Hamilton-Craig, Christian; Branch, Kelley; Lee, James; Cury, Roberto C; Maurovich-Horvat, Pál; Rubinshtein, Ronen; Thomas, Dustin; Williams, Michelle; Guo, Yanshu; Cury, Ricardo C
The Coronary Artery Disease Reporting and Data System (CAD-RADS) provides a lexicon and standardized reporting system for coronary CT angiography. To evaluate inter-observer agreement of the CAD-RADS among a panel of early career and expert readers. Four early career and four expert cardiac imaging readers prospectively and independently evaluated 50 coronary CT angiography cases using the CAD-RADS lexicon. All readers assessed image quality using a five-point Likert scale, with mean Likert score ≥4 designating high image quality, and <4 designating moderate/low image quality. All readers were blinded to medical history and invasive coronary angiography findings. Inter-observer agreement for CAD-RADS assessment categories and modifiers was assessed using intra-class correlation (ICC) and Fleiss' Kappa (κ). The impact of reader experience and image quality on inter-observer agreement was also examined. Inter-observer agreement for CAD-RADS assessment categories was excellent (ICC 0.958, 95% CI 0.938-0.974, p < 0.0001). Agreement among expert readers (ICC 0.925, 95% CI 0.884-0.954) was marginally stronger than for early career readers (ICC 0.904, 95% CI 0.852-0.941), both p < 0.0001. High image quality was associated with stronger agreement than moderate image quality (ICC 0.944, 95% CI 0.886-0.974 vs. ICC 0.887, 95% CI 0.775-0.95, both p < 0.0001). While excellent inter-observer agreement was observed for modifiers S (stent) and G (bypass graft) (both κ = 1.0), only fair agreement (κ = 0.40) was observed for modifier V (high risk plaque). Inter-observer reproducibility of CAD-RADS assessment categories and modifiers is excellent, except for high-risk plaque (modifier V) which demonstrates fair agreement. These results suggest CAD-RADS is feasible for clinical implementation. Copyright © 2017. Published by Elsevier Inc.
Carroll, Kristen L; Murray, Kathleen A; MacLeod, Lynne M; Hennessey, Theresa A; Woiczik, Marcella R; Roach, James W
2011-06-01
Numerous studies underscore the poor intraobserver and interobserver reliability of both the center edge angle (CEA) and the Severin classification using plain film measurements. In this study, experienced observers applied a computer-assisted measurement program to determine the CEA in digital pelvic radiographs of adults who had been previously treated for dysplasia of the hip (DDH). Using a teaching aid/algorithm of the Severin classification, the observers then assigned a Severin rating to these hips. Intraobserver and interobserver errors were then calculated on both the CEA measurements and the Severin classifications. Four pediatric orthopaedic surgeons and 1 pediatric radiologist calculated the CEAs using the OrthoView™ planning system and then determined the Severin classification on 41 blinded digital pelvic radiographs. The radiographs were evaluated by each examiner twice, with evaluations separated by 2 months. All examiners reviewed a Severin classification algorithm before making their Severin assignments. The intraobserver and interobserver reliability for both the CEA and the Severin classification were calculated using the interclass correlation coefficients and Cohen and Fleiss κ scores, respectively. The intraobserver and interobserver reliability for CEA measurement was moderate to almost perfect. When we separated the Severin classification into 3 clinically relevant groups of good (Severin I and II), dysplastic (Severin III), and poor (Severin IV and above), our interobserver reliability neared almost perfect. The Severin classification is an extremely useful and oft-used radiographic measure for the success of DDH treatment. Our research found digital radiography, computer-aided measurement tools, the use of a Severin algorithm, and separating the Severin classification into 3 clinically relevant groups significantly increased the intraobserver and interobserver reliability of both the CEA and Severin classification.
This finding will assist future studies using the CEA and Severin classification in the radiographic assessment of DDH treatment outcomes.
2015-07-01
Smith, I. M.; Naumann, D. N.; Guyver, P.; Bishop, J.; Davies, S.; Lundy, J. B.; Bowley, D. M.
Serological markers in inflammatory bowel disease: the pros and cons.
Lerner, Aaron; Shoenfeld, Yehuda
2002-02-01
Accurate serological assays are desirable for the diagnosis of inflammatory bowel disease. Among several serological markers anti-Saccharomyces cerevisiae mannan antibodies and perinuclear antineutrophil cytoplasmic autoantibodies are highly disease specific for Crohn's disease and ulcerative colitis, respectively. Combining the two improves their specificity. Sensitivity, however, is still low. Due to lack of standardization and vast interobserver variability, they cannot be used as the only diagnostic criteria but can assist clinicians in diagnosing and categorizing patients with inflammatory bowel disease as well as in helping them to take therapeutic decisions.
Pitcher, Brandon; Alaqla, Ali; Noujeim, Marcel; Wealleans, James A; Kotsakis, Georgios; Chrepa, Vanessa
2017-03-01
Cone-beam computed tomographic (CBCT) analysis allows for 3-dimensional assessment of periradicular lesions and may facilitate preoperative periapical cyst screening. The purpose of this study was to develop and assess the predictive validity of a cyst screening method based on CBCT volumetric analysis alone or combined with designated radiologic criteria. Three independent examiners evaluated 118 presurgical CBCT scans from cases that underwent apicoectomies and had an accompanying gold standard histopathological diagnosis of either a cyst or granuloma. Lesion volume, density, and specific radiologic characteristics were assessed using specialized software. Logistic regression models with histopathological diagnosis as the dependent variable were constructed for cyst prediction, and receiver operating characteristic curves were used to assess the predictive validity of the models. A conditional inference binary decision tree based on a recursive partitioning algorithm was constructed to facilitate preoperative screening. Interobserver agreement was excellent for volume and density, but it varied from poor to good for the radiologic criteria. Volume and root displacement were strong predictors for cyst screening in all analyses. The binary decision tree classifier determined that if the volume of the lesion was >247 mm³, there was 80% probability of a cyst. If volume was <247 mm³ and root displacement was present, cyst probability was 60% (78% accuracy). The good accuracy and high specificity of the decision tree classifier renders it a useful preoperative cyst screening tool that can aid in clinical decision making but not a substitute for definitive histopathological diagnosis after biopsy. Confirmatory studies are required to validate the present findings. Published by Elsevier Inc.
Zhang, Rui-Fang; Fu, Yu-Chuan; Lu, Yi; Zhang, Xiao-Xia; Hu, Yu-Min; Zhou, Yong-Jin; Tian, Nai-Feng; He, Jia-Wei; Yan, Zhi-Han
2017-02-01
Accurately evaluating the extent of trunk imbalance in the coronal plane is significant for patients before and after treatment. We preliminarily practiced a new method, axis-line-angle technique (ALAT), for evaluating coronal trunk imbalance with excellent intra-observer and interobserver reliability. Radiologists and surgeons were encouraged to use this method in clinical practice. However, the optimal cutoff value of the ALAT for determination of the extent of coronal trunk imbalance has not been calculated up to now. The purpose of this study was to identify the cutoff value of the ALAT that best predicts a positive measurement point to assess coronal balance or imbalance. A retrospective study at a university affiliated hospital was carried out. A total of 130 patients with C7-central sacral vertical line (CSVL) >0 mm and aged 10-18 years were recruited in this study from September 2013 to December 2014. Data were analyzed to determine the optimal cutoff value of the ALAT measurement. The C7-CSVL and ALAT measurements were conducted respectively twice on plain film within a 2-week interval by two radiologists. The optimal cutoff value of the ALAT was analyzed via receiver operating characteristic (ROC) curve. Comparison variables were performed with chi-square test between the C7-CSVL and ALAT measurements for evaluating trunk imbalance. Kappa agreement coefficient method was used to test the intra-observer and interobserver agreement of C7-CSVL and ALAT. The ROC curve area for the ALAT was 0.82 (95% confidence interval: 0.753-0.894, p<.001). The maximum Youden index was 0.51, and the corresponding cutoff point was 2.59°. No statistical difference was found between the C7-CSVL and ALAT measurements for evaluating trunk imbalance (p>.05). 
Intra-observer agreement values for the C7-CSVL measurements by observers 1 and 2 were 0.79 and 0.91 (p<.001), respectively, whereas intra-observer agreement values for the ALAT measurements were both 0.89 by observers 1 and 2 (p<.001). The interobserver agreement values for the first and second measurements with the C7-CSVL were 0.78 and 0.85 (p<.001), respectively, whereas the interobserver agreement values for the first and second measurements with the ALAT were 0.91 and 0.88 (p<.001), respectively. The newly developed ALAT provided an acceptable optimal cutoff value for evaluating trunk imbalance in the coronal plane with a high level of intra-observer and interobserver agreement, which suggests that the ALAT is suitable for clinical use. Copyright © 2016 Elsevier Inc. All rights reserved.
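The cutoff in this abstract is chosen by maximizing the Youden index, J = sensitivity + specificity - 1, along the ROC curve. A minimal sketch of that selection follows; the function name, the rule that a value at or above the cutoff counts as positive, and the sample data are all assumptions made for illustration, not details from the study.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumes labels are 1 (imbalanced) or 0 (balanced) and that a measurement
    greater than or equal to the cutoff is called positive.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
        true_neg = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Invented angle measurements (degrees) and reference labels.
angles = [1.0, 2.0, 3.0, 4.0]
imbalanced = [0, 1, 0, 1]
print(youden_cutoff(angles, imbalanced))  # (2.0, 0.5)
```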
San José, Verónica; Bellot-Arcís, Carlos; Tarazona, Beatriz; Zamora, Natalia; O Lagravère, Manuel
2017-01-01
Background: To compare the reliability and accuracy of direct and indirect dental measurements derived from two types of 3D virtual models, generated by intraoral laser scanning (ILS) and by segmented cone beam computed tomography (CBCT), with those from a 2D digital model. Material and Methods: One hundred patients were selected. All patients' records included initial plaster models, an intraoral scan and a CBCT. Patients' dental arches were scanned with the iTero® intraoral scanner while the CBCTs were segmented to create three-dimensional models. To obtain 2D digital models, plaster models were scanned using a conventional 2D scanner. Once digital models had been obtained using these three methods, direct dental measurements were taken and indirect measurements were calculated. Differences between methods were assessed by means of paired t-tests and regression models. Intra- and inter-observer error were analyzed using Dahlberg's d and coefficients of variation. Results: Intraobserver and interobserver error for the ILS model was less than 0.44 mm while for segmented CBCT models, the error was less than 0.97 mm. ILS models provided statistically and clinically acceptable accuracy for all dental measurements, while CBCT models showed a tendency to underestimate measurements in the lower arch, although within the limits of clinical acceptability. Conclusions: ILS and CBCT segmented models are both reliable and accurate for dental measurements. Integrating ILS with CBCT scans would provide dental and skeletal information together. Key words: CBCT, intraoral laser scanner, 2D digital models, 3D models, dental measurements, reliability. PMID:29410764
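Dahlberg's d, the method error used in this abstract, is conventionally computed from n pairs of repeated measurements as the square root of the summed squared differences divided by 2n. The sketch below assumes that standard formula; the function name and the sample values are illustrative and do not come from the study.

```python
import math


def dahlberg_error(first, second):
    """Dahlberg's d = sqrt(sum((x1 - x2)^2) / (2 * n)) for paired measurements."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))


# Invented repeated tooth-width measurements in millimetres.
session_1 = [10.0, 8.0]
session_2 = [9.0, 9.0]
print(round(dahlberg_error(session_1, session_2), 4))  # 0.7071
```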
Galea, Angela; Adlan, Tarig; Gay, David; Roobottom, Carl; Dubbins, Paul; Riordan, Richard
2015-09-01
The aim of this study was to compare the sensitivity and specificity of chest digital tomosynthesis (DTS) with chest radiography (CXR) for the detection of noncalcified pulmonary nodules and hilar lesions using computed tomography (CT) as the reference standard. A total of 78 patients with suspected noncalcified pulmonary lesions on CXR were included in the study. Two radiologists, blinded to the history and CT, analyzed the CXR and the DTS images (separately), whereas a third radiologist analyzed the CXR and DTS images together. Noncalcified intrapulmonary nodules and hilar lesions were recorded for analysis. The interobserver agreement for CXR and DTS was assessed, and the time taken to report the images was recorded. A total of 202 lesions were recorded in 78 patients. There were 111 true lesions confirmed on CT in 53 patients; in 25 patients subsequent CT excluded a lesion. The overall sensitivity was 32% for CXR and 49% for DTS. This improved to 54% when the posteroanterior CXR and DTS were reviewed together (CXR-DTS). The overall specificities for CXR, DTS, and CXR-DTS were 49%, 96%, and 98%, respectively. There were 56 suspected hilar lesions with subgroup sensitivities of 76% for CXR, 65% for DTS, and 76% for CXR-DTS. The specificity for hilar lesions was 59%, 92%, and 97% for CXR, DTS, and CXR-DTS, respectively. DTS significantly improves the detectability of noncalcified nodules when compared with and when used in combination with CXR. The specificity and interobserver agreement of DTS in the diagnosis of suspected noncalcified pulmonary nodules and hilar lesions are significantly better than those of CXR and approaches those of CT.
Tublin, Mitchell E; Murphy, Michael E; Delong, David M; Tessler, Franklin N; Kliewer, Mark A
2002-10-01
To determine the effects of calculus size, composition, and technique (kilovolt and milliampere settings) on the conspicuity of renal calculi at unenhanced helical computed tomography (CT). The authors performed unenhanced CT of a phantom containing 188 renal calculi of varying size and chemical composition (brushite, cystine, struvite, weddellite, whewellite, and uric acid) at 24 combinations of four kilovolt (80-140 kV) and six milliampere (200-300 mA) levels. Two radiologists, who were unaware of the location and number of calculi, reviewed the CT images and recorded where stones were detected. These observations were compared with the known positions of calculi to generate true-positive and false-positive rates. Logistic regression analysis was performed to investigate the effects of stone size, composition, and technique and to generate probability estimates of detection. Interobserver agreement was estimated with kappa statistics. Interobserver agreement was high: the mean kappa value for the two observers was 0.86. The conspicuity of stone fragments increased with increasing kilovolt and milliampere levels for all stone types. At the highest settings (140 kV and 300 mA), the detection threshold size (ie, the size of calculus that had a 50% probability of being detected) ranged from 0.81 mm + 0.03 (weddellite) to 1.3 mm + 0.1 (uric acid). Detection threshold size for each type of calculus increased up to 1.17-fold at lower kilovolt settings and up to 1.08-fold at lower milliampere settings. The conspicuity of small renal calculi at CT increases with higher kilovolt and milliampere settings, with higher kilovolts being particularly important. Small uric acid calculi may be imperceptible, even with maximal CT technique.
Unenhanced CT imaging is highly sensitive to exclude pheochromocytoma: a multicenter study.
Buitenwerf, Edward; Korteweg, Tijmen; Visser, Anneke; Haag, Charlotte M S C; Feelders, Richard A; Timmers, Henri J L M; Canu, Letizia; Haak, Harm R; Bisschop, Peter H L T; Eekhoff, Elisabeth M W; Corssmit, Eleonora P M; Krak, Nanda C; Rasenberg, Elise; van den Bergh, Janneke; Stoker, Jaap; Greuter, Marcel J W; Dullaart, Robin P F; Links, Thera P; Kerstens, Michiel N
2018-05-01
A substantial proportion of all pheochromocytomas is currently detected during the evaluation of an adrenal incidentaloma. Recently, it has been suggested that biochemical testing to rule out pheochromocytoma is unnecessary in case of an adrenal incidentaloma with an unenhanced attenuation value ≤10 Hounsfield Units (HU) at computed tomography (CT). We aimed to determine the sensitivity of the 10 HU threshold value to exclude a pheochromocytoma. Retrospective multicenter study with systematic reassessment of preoperative unenhanced CT scans performed in patients in whom a histopathologically proven pheochromocytoma had been diagnosed. Unenhanced attenuation values were determined independently by two experienced radiologists. Sensitivity of the 10 HU threshold was calculated, and interobserver consistency was assessed using the intraclass correlation coefficient (ICC). 214 patients were identified harboring a total number of 222 pheochromocytomas. Maximum tumor diameter was 51 (39-74) mm. The mean attenuation value within the region of interest was 36 ± 10 HU. Only one pheochromocytoma demonstrated an attenuation value ≤10 HU, resulting in a sensitivity of 99.6% (95% CI: 97.5-99.9). ICC was 0.81 (95% CI: 0.75-0.86) with a standard error of measurement of 7.3 HU between observers. The likelihood of a pheochromocytoma with an unenhanced attenuation value ≤10 HU on CT is very low. The interobserver consistency in attenuation measurement is excellent. Our study supports the recommendation that in patients with an adrenal incidentaloma biochemical testing for ruling out pheochromocytoma is only indicated in adrenal tumors with an unenhanced attenuation value >10 HU. © 2018 European Society of Endocrinology.
Urrutia, Julio; Zamora, Tomas; Yurac, Ratko; Campos, Mauricio; Palma, Joaquin; Mobarec, Sebastian; Prada, Carlos
2017-03-01
An agreement study. The aim of this study was to perform an independent interobserver and intraobserver agreement assessment of the AOSpine subaxial cervical spine injury classification system. The AOSpine subaxial cervical spine injury classification system was recently described. It showed substantial inter- and intraobserver agreement in the study describing it; however, an independent evaluation has not been performed. Anteroposterior and lateral radiographs, computed tomography scans, and magnetic resonance imaging of 65 patients with acute traumatic subaxial cervical spine injuries were selected and classified using the morphologic grading of the subaxial cervical spine injury classification system by 6 evaluators (3 spine surgeons and 3 orthopedic surgery residents). After a 6-week interval, the 65 cases were presented to the same evaluators in a random sequence for repeat evaluation. The kappa coefficient (κ) was used to determine the inter- and intraobserver agreement. The interobserver agreement was substantial when considering the fracture main types (A, B, C, or F), with κ = 0.61 (0.57-0.64), but moderate when considering the subtypes: κ = 0.57 (0.54-0.60). The intraobserver agreement was substantial considering the fracture types, with κ = 0.68 (0.62-0.74) and considering subtypes, κ = 0.62 (0.57-0.66). No significant differences were observed between spine surgeons and orthopedic residents in the overall inter- and intraobserver agreement, or in the inter- and intraobserver agreement of specific A, B, C, or F type of injuries. This classification allows adequate agreement among different observers and by the same observer on separate occasions. Future prospective studies should determine whether this classification allows surgeons to decide the best treatment for patients with subaxial cervical spine injuries. 3.
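Kappa corrects raw agreement for the agreement expected by chance: κ = (p_o - p_e) / (1 - p_e). The sketch below shows the two-rater (Cohen) form only; the study pooled six evaluators, which would normally call for a multi-rater statistic, so this is an illustration of the idea rather than the authors' calculation. The function name and the example gradings are assumptions.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per case."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Invented main-type gradings (A, B) for four cases.
print(cohen_kappa(["A", "A", "B", "B"], ["A", "A", "B", "A"]))  # 0.5
```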
Raoult, Hélène; Eugène, François; Le Bras, Anthony; Mineur, Géraldine; Carsin-Nicol, Béatrice; Ferré, Jean-Christophe; Gauvrit, Jean-Yves
2018-03-07
The WEB is an innovative flow disruption device for cerebral aneurysm embolization with rapidly expanding indications. Our purpose was to evaluate the diagnostic performance of computed tomography angiography (CTA) at 1-year follow-up of aneurysms treated with the WEB. Between April 2014 and May 2016, the study prospectively included patients treated with the WEB at our institution, and followed up within 24 hours by CTA and at 1 year by CTA, time-of-flight magnetic resonance angiography (TOF MRA) and digital subtraction angiography (DSA). The diagnostic quality of imaging data was assessed based on the confidence index, artifacts, and WEB shape depiction. The imaging diagnostic performance was assessed using 3 criteria at 1 year: aneurysm occlusion status and worsening, and WEB shape compression. Interobserver and intermodality agreement was determined by calculating κ values. The study ultimately included 16 patients (9 women, mean age 53 ± 7.6 years). CTA quality confidence was scored as 2/2, artifacts 0.4/2 and WEB shape depiction 1.9/2, superior to TOF MRA for the latter two criteria. Aneurysm occlusion was adequate in 93.7% of patients, with CTA showing excellent interobserver reproducibility and agreement with DSA on a 4-grade scale (κ=1.00), while TOF MRA yielded good reproducibility (κ=0.76) and agreement with DSA (κ=0.69). CTA also identified aneurysm occlusion worsening (43.7%) and WEB compression (81.2%) in excellent agreement with DSA (κ=0.85 and 1.00). CTA is a reproducible and reliable technique for the follow-up of aneurysms treated with the WEB device. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Extra-hepatic sarcoma metastasis surveillance in the liver: is arterial phase imaging necessary?
Harri, Peter A; Chung, Alex; Tridandapani, Srini; Nandwana, Sadhna; Ibraheem, Oluwayemisi O; Cox, Kelly; Murphy, Fredrick; Mittal, Pardeep; Small, William
2017-06-01
To assess the value of arterial phase imaging (ART) in the detection of liver metastases on CT compared to portal venous phase imaging (PV) alone in patients with primary sarcomas. Multiphasic abdominal computed tomography (CT) images of patients with tissue-proven sarcomas were reviewed by five abdominal radiologists in a staggered fashion. Up to three of the largest or most conspicuous liver lesions were characterized on a four-point confidence level for PV independently, followed by PV + ART. Inter-observer reliability was evaluated with kappa statistics. Change in characterization of lesions by the addition of ART was calculated. Follow-up imaging was used to determine if index lesion characterization was valid. 55 of 149 patients had 470 liver lesion characterizations by the five readers with follow-up. Inter-observer agreement was κ = 0.62 on PV and κ = 0.58 on PV + ART. The intra-observer agreement between PV and ART interpretations of the same lesion was κ = 0.93. 426 lesion characterizations were possible on both PV and ART. Only 6 characterizations were changed after the addition of ART; 4 of the 6 changes were incorrect when compared to follow-up. Only 6 lesion characterizations could be made on ART alone (missed by PV), with all the malignant lesions arising from primary leiomyosarcomas. For the lesions seen on PV alone, the sensitivity, specificity, PPV, NPV, and accuracy were 98.8%, 100%, 100%, 99.3%, and 99.6%, respectively. After the addition of ART, they were 98.8%, 98.7%, 97.5%, 99.4%, and 98.7%, respectively. ART adds marginal value to PV for characterization of metastatic liver lesions in patients with primary sarcomas, except possibly in primary leiomyosarcomas.
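Sensitivity, specificity, PPV, NPV, and accuracy, as reported in this abstract, all derive from the four cells of a 2×2 table against the reference standard. The sketch below uses invented counts because the abstract does not give the underlying cell values; the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic accuracy measures from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all diseased
        "specificity": tn / (tn + fp),  # true negatives among all non-diseased
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Invented counts for illustration only.
for name, value in diagnostic_metrics(tp=80, fp=10, tn=90, fn=20).items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on prevalence in the sample, they do not transfer directly to populations with a different proportion of metastatic lesions.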
Yamada, Shigeki; Hashimoto, Kenji; Ogata, Hideki; Watanabe, Yoshihiko; Oshima, Marie; Miyake, Hidenori
2014-02-01
A simple rating scale for calcification in the cervical arteries and the aortic arch on multi-detector computed tomography angiography (MDCTA) was evaluated for reliability and validity. Additionally, we investigated which location is most representative for evaluating the calcification risk of carotid bifurcation stenosis and atherosclerotic infarction across the cervical arteries, from the aortic arch to the carotid bifurcation. The extent of calcification in the aortic arch and cervical arteries of 518 patients (292 men, 226 women) was evaluated using a 4-point grading scale for MDCTA. Reliability, validity and the concomitant risk with vascular stenosis and atherosclerotic infarction were assessed. Calcification was most frequently observed in the aortic arch itself, the orifices from the aortic arch, and the carotid bifurcation. Compared with the bilateral carotid bifurcations, the aortic arch itself had a stronger inter-observer agreement for the calcification score (Fleiss' kappa coefficient, 0.77), but weaker associations with stenosis and atherosclerotic infarction. Calcification at the orifices of the aortic arch branches had a stronger inter-observer agreement (0.74) and sufficient associations with carotid bifurcation stenosis and intracranial stenosis. In addition, extensive calcification at the orifices from the aortic arch was significantly associated with atherosclerotic infarction, similar to calcification at the bilateral carotid bifurcations. The orifices of the aortic arch branches were a novel representative location of the aortic arch and overall cervical arteries for evaluating the extent of calcification. Thus, calcification at the aortic arch should be evaluated with a focus on the orifices of the 3 main branches. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Correlation of ultra-low dose chest CT findings with physiologic measures of asbestosis.
Manners, David; Wong, Patrick; Murray, Conor; Teh, Joelin; Kwok, Yi Jin; de Klerk, Nick; Alfonso, Helman; Franklin, Peter; Reid, Alison; Musk, A W Bill; Brims, Fraser J H
2017-08-01
The correlation between ultra low dose computed tomography (ULDCT)-detected parenchymal lung changes and pulmonary function abnormalities is not well described. This study aimed to determine the relationship between ULDCT-detected interstitial lung disease (ILD) and measures of pulmonary function in an asbestos-exposed population. Two thoracic radiologists independently categorised prone ULDCT scans from 143 participants for ILD appearances as absent (score 0), probable (1) or definite (2) without knowledge of asbestos exposure or lung function. Pulmonary function measures included spirometry and diffusing capacity to carbon monoxide (DLCO). Participants were 92% male with a median age of 73.0 years. CT dose index volume was between 0.6 and 1.8 mGy. Probable or definite ILD was reported in 63 (44.1%) participants. Inter-observer agreement was good (κ = 0.613, p < 0.001). There was a statistically significant correlation between the ILD score and both forced expiratory volume in 1 second (FEV1) and forced vital capacity (FVC) (r = -0.17, p = 0.04 and r = -0.20, p = 0.02). There was a strong correlation between ILD score and DLCO (r = -0.34, p < 0.0001). Changes consistent with ILD on ULDCT correlate well with corresponding reductions in gas transfer, similar to standard CT. In asbestos-exposed populations, ULDCT may be adequate to detect radiological changes consistent with asbestosis. • Interobserver agreement for the ILD score using prone ULDCT is good. • Prone ULDCT appearances of ILD correlate with changes in spirometric observations. • Prone ULDCT appearances of ILD correlate strongly with changes in gas transfer. • Prone ULDCT may provide sufficient radiological evidence to inform the diagnosis of asbestosis.
Kuya, Keita; Shinohara, Yuki; Kato, Ayumi; Sakamoto, Makoto; Kurosaki, Masamichi; Ogawa, Toshihide
2017-03-01
The aim of this study is to assess the value of adaptive statistical iterative reconstruction (ASIR) and model-based iterative reconstruction (MBIR) for reduction of metal artifacts due to dental hardware in carotid CT angiography (CTA). Thirty-seven patients with dental hardware who underwent carotid CTA were included. CTA was performed with a GE Discovery CT750 HD scanner and reconstructed with filtered back projection (FBP), ASIR, and MBIR. We measured the standard deviation at the cervical segment of the internal carotid artery that was affected most by dental metal artifacts ($\mathrm{SD}_1$) and the standard deviation at the common carotid artery that was not affected by the artifact ($\mathrm{SD}_2$). We calculated the artifact index (AI) as follows: $\mathrm{AI} = \sqrt{\mathrm{SD}_1^2 - \mathrm{SD}_2^2}$ and compared each AI for FBP, ASIR, and MBIR. Visual assessment of the internal carotid artery was also performed by two neuroradiologists using a five-point scale for each axial and reconstructed sagittal image. The inter-observer agreement was analyzed using weighted kappa analysis. MBIR significantly improved AI compared with FBP and ASIR (p < 0.001, each). We found no significant difference in AI between FBP and ASIR (p = 0.502). The visual score of MBIR was significantly better than those of FBP and ASIR (p < 0.001, each), whereas the scores of ASIR were the same as those of FBP. Kappa values indicated good inter-observer agreements in all reconstructed images (0.747-0.778). MBIR resulted in a significant reduction in artifact from dental hardware in carotid CTA.
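The artifact index defined in this abstract is the square root of the difference between the squared standard deviation in the artifact-affected region and the squared standard deviation in the unaffected reference region. A minimal sketch follows; the function and parameter names are hypothetical, the sample values are invented, and raising an error when the reference noise exceeds the artifact noise is an assumption, since the abstract does not say how that case was handled.

```python
import math


def artifact_index(sd_artifact, sd_reference):
    """AI = sqrt(SD1^2 - SD2^2), with SD1 from the affected region, SD2 from the reference."""
    difference = sd_artifact ** 2 - sd_reference ** 2
    if difference < 0:
        # Assumption: treat reference noise above artifact noise as invalid input.
        raise ValueError("reference SD exceeds artifact SD; index is undefined")
    return math.sqrt(difference)


# Invented standard deviations in Hounsfield units.
print(artifact_index(5.0, 3.0))  # 4.0
```

Subtracting in quadrature removes the background image noise, so the index reflects only the additional variation attributable to the metal artifact.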
Uzun, Ismail; Gunduz, Kaan; Celenk, Peruze; Avsever, Hakan; Orhan, Kaan; Canitezer, Gozde; Ozmen, Bilal; Cicek, Ersan; Egrioglu, Erol
2015-01-01
Background: Teeth with undiagnosed vertical root fractures (VRFs) are likely to receive endodontic treatment or retreatment, leading to frustration and inappropriate endodontic therapies. Moreover, many cases of VRFs cannot be diagnosed definitively until the extraction of the tooth. Objectives: This study aimed to assess the use of different voxel resolutions of two different cone beam computerized tomography (CBCT) units in the detection of VRFs in vitro. Materials and Methods: The study material comprised 74 extracted human mandibular single rooted premolar teeth without root fractures that had not undergone any root-canal treatment. Images were obtained by two different CBCT units. Four image sets were obtained as follows: 1) 3D Accuitomo 170, 4 × 4 cm field of view (FOV) (0.080 mm³); 2) 3D Accuitomo 170, 6 × 6 cm FOV (0.125 mm³); 3) NewTom 3G, 6˝ (0.16 mm³) and 4) NewTom 3G, 9˝ FOV (0.25 mm³). Kappa coefficients were calculated to assess both intra- and inter-observer agreements for each image set. Results: No significant differences were found among observers or voxel sizes, with high average Z (Az) results being reported for all groups. Both intra- and inter-observer agreement values were relatively better for 3D Accuitomo 170 images than the images from NewTom 3G. The highest Az and kappa values were obtained with 3D Accuitomo 170, 4 × 4 cm FOV (0.080 mm³) images. Conclusion: No significant differences were found among observers or voxel sizes, with high Az results reported for all groups. PMID:26557279
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verhaart, René F., E-mail: r.f.verhaart@erasmusmc.nl; Paulides, Margarethus M.; Fortunati, Valerio
Purpose: In current clinical practice, head and neck (H and N) hyperthermia treatment planning (HTP) is solely based on computed tomography (CT) images. Magnetic resonance imaging (MRI) provides superior soft-tissue contrast over CT. The purpose of the authors’ study is to investigate the relevance of using MRI in addition to CT for patient modeling in H and N HTP. Methods: CT and MRI scans were acquired for 11 patients in an immobilization mask. Three observers manually segmented on CT, MRI T1 weighted (MRI-T1w), and MRI T2 weighted (MRI-T2w) images the following thermo-sensitive tissues: cerebrum, cerebellum, brainstem, myelum, sclera, lens, vitreous humor, and the optical nerve. For these tissues that are used for patient modeling in H and N HTP, the interobserver variation of manual tissue segmentation in CT and MRI was quantified with the mean surface distance (MSD). Next, the authors compared the impact of CT and CT and MRI based patient models on the predicted temperatures. For each tissue, the modality was selected that led to the lowest observer variation and inserted this in the combined CT and MRI based patient model (CT and MRI), after a deformable image registration. In addition, a patient model with a detailed segmentation of brain tissues (including white matter, gray matter, and cerebrospinal fluid) was created (CT and MRI_db). To quantify the relevance of MRI based segmentation for H and N HTP, the authors compared the predicted maximum temperatures in the segmented tissues (T_max) and the corresponding specific absorption rate (SAR) of the patient models based on (1) CT, (2) CT and MRI, and (3) CT and MRI_db. Results: In MRI, a similar or reduced interobserver variation was found compared to CT (maximum of median MSD in CT: 0.93 mm, MRI-T1w: 0.72 mm, MRI-T2w: 0.66 mm). 
Only for the optical nerve the interobserver variation is significantly lower in CT compared to MRI (median MSD in CT: 0.58 mm, MRI-T1w: 1.27 mm, MRI-T2w: 1.40 mm). Patient models based on CT (T_max: 38.0 °C) and CT and MRI (T_max: 38.1 °C) result in similar simulated temperatures, while CT and MRI_db (T_max: 38.5 °C) resulted in significantly higher temperatures. The SAR corresponding to these temperatures did not differ significantly. Conclusions: Although MR imaging reduces the interobserver variation in most tissues, it does not affect simulated local tissue temperatures. However, the improved soft-tissue contrast provided by MRI allows generating a detailed brain segmentation, which has a strong impact on the predicted local temperatures and hence may improve simulation guided hyperthermia.
A Microsoft Excel® 2010 Based Tool for Calculating Interobserver Agreement
Azulay, Richard L
2011-01-01
This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel®) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work. PMID:22649578
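Three of the count-based algorithms named in this report can be sketched outside a spreadsheet as follows. The definitions used here are the conventional ones (total count: smaller total over larger total; exact agreement: share of intervals with identical counts; partial agreement-within-intervals: mean of the per-interval smaller-over-larger ratios), and they are an assumption about what the tool computes, since the abstract does not spell out the formulas. All function names and the sample counts are illustrative.

```python
def total_count_ioa(obs_1, obs_2):
    """Smaller session total divided by larger session total, as a percentage."""
    total_1, total_2 = sum(obs_1), sum(obs_2)
    if total_1 == total_2:
        return 100.0  # also covers the case where both observers recorded nothing
    return min(total_1, total_2) / max(total_1, total_2) * 100


def exact_agreement_ioa(obs_1, obs_2):
    """Percentage of intervals in which both observers recorded the same count."""
    matches = sum(a == b for a, b in zip(obs_1, obs_2))
    return matches / len(obs_1) * 100


def partial_agreement_ioa(obs_1, obs_2):
    """Mean of per-interval smaller/larger count ratios, as a percentage."""
    ratios = [1.0 if a == b else min(a, b) / max(a, b) for a, b in zip(obs_1, obs_2)]
    return sum(ratios) / len(ratios) * 100


# Invented response counts per interval for two observers.
observer_1 = [2, 0, 3, 1]
observer_2 = [2, 1, 3, 2]
print(total_count_ioa(observer_1, observer_2))        # 75.0
print(exact_agreement_ioa(observer_1, observer_2))    # 50.0
print(partial_agreement_ioa(observer_1, observer_2))  # 62.5
```

The example shows why the choice of algorithm matters: the same two records yield 75%, 50%, or 62.5% agreement depending on how strictly interval-level matches are required.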
A microsoft excel(®) 2010 based tool for calculating interobserver agreement.
Reed, Derek D; Azulay, Richard L
2011-01-01
This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel(®)) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work.
Reliability of cervical vertebral maturation staging.
Rainey, Billie-Jean; Burnside, Girvan; Harrison, Jayne E
2016-07-01
Growth and its prediction are important for the success of many orthodontic treatments. The aim of this study was to determine the reliability of the cervical vertebral maturation (CVM) method for the assessment of mandibular growth. A group of 20 orthodontic clinicians, inexperienced in CVM staging, was trained to use the improved version of the CVM method for the assessment of mandibular growth with a teaching program. They independently assessed 72 consecutive lateral cephalograms, taken at Liverpool University Dental Hospital, on 2 occasions. The cephalograms were presented in 2 different random orders and interspersed with 11 additional images for standardization. The intraobserver and interobserver agreement values were evaluated using the weighted kappa statistic. The intraobserver and interobserver agreement values were substantial (weighted kappa, 0.6-0.8). The overall intraobserver agreement was 0.70 (SE, 0.01), with average agreement of 89%. The interobserver agreement values were 0.68 (SE, 0.03) for phase 1 and 0.66 (SE, 0.03) for phase 2, with average interobserver agreement of 88%. The intraobserver and interobserver agreement values of classifying the vertebral stages with the CVM method were substantial. These findings demonstrate that this method of CVM classification is reproducible and reliable. Copyright © 2016 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Bell, L. R.; Dowling, J. A.; Pogson, E. M.; Metcalfe, P.; Holloway, L.
2017-01-01
Accurate, efficient auto-segmentation methods are essential for the clinical efficacy of adaptive radiotherapy delivered with highly conformal techniques. Current atlas-based auto-segmentation techniques are adequate in this respect but fail to account for inter-observer variation. An atlas-based segmentation method that incorporates inter-observer variation is proposed. This method is validated for a whole breast radiotherapy cohort containing 28 CT datasets with CTVs delineated by eight observers. To optimise atlas accuracy, the cohort was divided into categories by mean body mass index and laterality, with atlases generated for each in a leave-one-out approach. Observer CTVs were merged and thresholded to generate an auto-segmentation model representing both inter-observer and inter-patient differences. For each category, the atlas was registered to the left-out dataset to enable propagation of the auto-segmentation from atlas space. Auto-segmentation time was recorded. The segmentation was compared to the gold-standard contour using the dice similarity coefficient (DSC) and mean absolute surface distance (MASD). Comparison with the smallest and largest CTV was also made. This atlas-based auto-segmentation method incorporating inter-observer variation was shown to be efficient (<4 min) and accurate for whole breast radiotherapy, with good agreement (DSC >0.7, MASD <9.3 mm) between the auto-segmented contours and CTV volumes.
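The Dice similarity coefficient used here to compare contours is twice the overlap of two segmentations divided by the sum of their sizes, DSC = 2|A ∩ B| / (|A| + |B|). The sketch below represents each segmentation as a set of voxel coordinates, which is an illustrative choice rather than the authors' data structure; returning 1.0 for two empty segmentations is likewise an assumed convention.

```python
def dice_coefficient(segmentation_a, segmentation_b):
    """DSC = 2 * |A intersect B| / (|A| + |B|) for two sets of voxel coordinates."""
    size_sum = len(segmentation_a) + len(segmentation_b)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty segmentations agree perfectly
    overlap = len(segmentation_a & segmentation_b)
    return 2 * overlap / size_sum


# Invented 2D voxel sets standing in for an automatic and a reference contour.
auto_contour = {(0, 0), (0, 1), (1, 0)}
reference_contour = {(0, 1), (1, 0), (1, 1)}
print(round(dice_coefficient(auto_contour, reference_contour), 3))  # 0.667
```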
Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali
2011-01-01
Purpose The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. Methods The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. Results A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Conclusion Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification’s priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method. PMID:22267934
Choo, Ji Yung; Lee, Ki Yeol; Yu, Ami; Kim, Je-Hyeong; Lee, Seung Heon; Choi, Jung Won; Kang, Eun-Young; Oh, Yu Whan
2016-09-01
To compare the diagnostic performance of digital tomosynthesis (DTS) and chest radiography for detecting airway abnormalities, using computed tomography (CT) as a reference. We evaluated 161 data sets from 149 patients (91 with and 70 without airway abnormalities) who had undergone radiography, DTS, and CT to detect airway problems. Radiographs and DTS were evaluated to localize and score the severity of the airway abnormalities, and to score the image quality using CT as a reference. Receiver operating characteristics (ROC), McNemar's test, weighted kappa, and the paired t-test were used for statistical analysis. The sensitivity of DTS was higher (reader 1, 93.51 %; reader 2, 94.29 %) than that of chest radiography (68.83 %; 71.43 %) in detecting airway lesions. The diagnostic accuracy of DTS (90.91 %; 94.70 %) was also significantly better than that of radiography (78.03 %; 82.58 %, all p < 0.05). DTS image quality was significantly better than that of chest radiography (1.83, 2.74; p < 0.05) for both readers. The inter-observer agreement with respect to DTS findings was moderate and superior when compared to radiography findings. DTS is a more accurate and sensitive modality than radiography for detecting airway lesions that are easily obscured by soft tissue structures in the mediastinum. • Digital tomosynthesis offers new diagnostic options for airway lesions. • Digital tomosynthesis is more sensitive and accurate than radiography for airway lesions. • Digital tomosynthesis shows better image quality than radiography. • Assessment of lesion severity via tomosynthesis is comparable to computed tomography.
Tay, Elton Lik Tong; Yong, Vernon Khet Yau; Lim, Boon Ang; Sia, Stelson; Wong, Elizabeth Poh Ying; Yip, Leonard Wei Leon
2015-01-01
To determine angle closure agreements between gonioscopy and anterior segment optical coherence tomography (AS-OCT), as well as gonioscopy and spectral domain OCT (SD-OCT). A secondary objective was to quantify inter-observer agreements of AS-OCT and SD-OCT assessments. Seventeen consecutive subjects (33 eyes) were recruited from the study hospital's Glaucoma clinic. Gonioscopy was performed by a glaucomatologist masked to OCT results. OCT images were read independently by 2 other glaucomatologists masked to gonioscopy findings as well as each other's analyses of OCT images. In total, 84.8% and 45.5% of scleral spurs were visualized in AS-OCT and SD-OCT images, respectively (P<0.01). The agreement for angle closure between AS-OCT and gonioscopy was fair at k=0.31 (95% confidence interval, CI: 0.03-0.59) and k=0.35 (95% CI: 0.07-0.63) for readers 1 and 2, respectively. The agreement for angle closure between SD-OCT and gonioscopy was fair at k=0.21 (95% CI: 0.07-0.49) and slight at k=0.17 (95% CI: 0.08-0.42) for readers 1 and 2, respectively. The inter-reader agreement for angle closure in AS-OCT images was moderate at 0.51 (95% CI: 0.13-0.88). The inter-reader agreement for angle closure in SD-OCT images was slight at 0.18 (95% CI: 0.08-0.45). A significant proportion of scleral spurs was not visualised with SD-OCT imaging, resulting in weaker inter-reader agreement. Identifying other angle landmarks in SD-OCT images will allow more consistent angle closure assessments. Gonioscopy and OCT imaging do not always agree in angle closure assessments but have their own advantages, and should be used together and not exclusively.
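The kappa values reported above measure agreement between two readers after correcting for agreement expected by chance. A minimal sketch of unweighted Cohen's kappa follows. The function name `cohen_kappa` and the ten example ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(ratings_1)
    if n == 0 or n != len(ratings_2):
        raise ValueError("both raters must score the same non-empty item set")
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical angle assessments for ten eyes by two readers.
reader_1 = ["closed", "closed", "open", "open", "open",
            "closed", "open", "open", "open", "open"]
reader_2 = ["closed", "open", "open", "open", "closed",
            "closed", "open", "open", "open", "open"]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the readers agree on 8 of 10 eyes, but because "open" dominates both readers' ratings, a large share of that agreement is expected by chance, which is why kappa comes out well below 0.8.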
Robles, Lourdes Y; Singh, Satish; Fisichella, Piero Marco
2015-05-15
Despite advances in diagnoses and therapy, esophageal adenocarcinoma remains a highly lethal neoplasm. Hence, a great interest has been placed in detecting early lesions and in the detection of Barrett esophagus (BE). Advanced imaging technologies of the esophagus have then been developed with the aim of improving biopsy sensitivity and detection of preplastic and neoplastic cells. The purpose of this article was to review emerging imaging technologies for esophageal pathology, spectroscopy, confocal laser endomicroscopy (CLE), and optical coherence tomography (OCT). We conducted a PubMed search using the search string "esophagus or esophageal or oesophageal or oesophagus" and "Barrett or esophageal neoplasm" and "spectroscopy or optical spectroscopy" and "confocal laser endomicroscopy" and "confocal microscopy" and "optical coherence tomography." The first and senior author separately reviewed all articles. Our search identified: 19 in vivo studies with spectroscopy that accounted for 1021 patients and 4 ex vivo studies; 14 clinical CLE in vivo studies that accounted for 941 patients and 1 ex vivo study with 13 patients; and 17 clinical OCT in vivo studies that accounted for 773 patients and 2 ex vivo studies. Human studies using spectroscopy had a very high sensitivity and specificity for the detection of BE. CLE showed a high interobserver agreement in diagnosing esophageal pathology and an accuracy of predicting neoplasia. We also found several clinical studies that reported excellent diagnostic sensitivity and specificity for the detection of BE using OCT. Advanced imaging technology for the detection of esophageal lesions is a promising field that aims to improve the detection of early esophageal lesions. Although advancing imaging techniques improve diagnostic sensitivities and specificities, their integration into diagnostic protocols has yet to be perfected. Copyright © 2015 Elsevier Inc. All rights reserved.
Variable Grid Traveltime Tomography for Near-surface Seismic Imaging
NASA Astrophysics Data System (ADS)
Cai, A.; Zhang, J.
2017-12-01
We present a new traveltime tomography algorithm for imaging the subsurface with automated variable grids that follow geological structures. Nonlinear traveltime tomography with Tikhonov regularization, solved by the conjugate gradient method, is a conventional method for near-surface imaging. However, model regularization on a regular, even grid assumes uniform resolution. From a geophysical point of view, long-wavelength, large-scale structures can be reliably resolved, whereas the details along geological boundaries are difficult to resolve. Therefore, we solve a traveltime tomography problem that automatically identifies large-scale structures and aggregates grids within the structures for inversion. As a result, the number of velocity unknowns is reduced significantly, and the inversion aims to resolve small-scale structures or the boundaries of large-scale structures. The approach is demonstrated by tests on both synthetic and field data. One synthetic model is a buried basalt model with one horizontal layer. With variable grid traveltime tomography, the resulting model is more accurate in the top-layer velocity and the basalt blocks, and it uses fewer grids. The field data were collected in an oil field in China. The survey was performed in an area where the subsurface structures were predominantly layered. The data set includes 476 shots with a 10 meter spacing and 1735 receivers with a 10 meter spacing. The first-arrival traveltime of the seismogram is picked for tomography. The reciprocal errors of most shots are between 2 ms and 6 ms. The conventional tomography results in fluctuations in layers and some artifacts in the velocity model. In comparison, the new method with a proper threshold provides a blocky model with a resolved flat layer and fewer artifacts. In addition, the number of grids is reduced from 205,656 to 4,930, and the inversion produces higher resolution because of fewer unknowns and relatively fine grids in small structures.
The variable grid traveltime tomography provides an alternative imaging solution for blocky structures in the subsurface and builds a good starting model for waveform inversion and statics.
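The conventional baseline named in this abstract, Tikhonov-regularized inversion solved with the conjugate gradient method, can be illustrated on a toy linear problem. The sketch below is not the authors' variable-grid algorithm. It assumes a linearized problem with a fixed ray-path matrix, zeroth-order (identity) regularization, and an invented three-cell model; the names `tikhonov_cg`, `path_lengths`, and `true_slowness` are hypothetical.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tikhonov_cg(A, d, lam, iterations=50, tol=1e-20):
    """Solve (A^T A + lam * I) m = A^T d by conjugate gradients."""
    At = [list(col) for col in zip(*A)]

    def normal_op(m):
        # Apply the regularized normal-equation operator to m.
        return [g + lam * mi for g, mi in zip(matvec(At, matvec(A, m)), m)]

    m = [0.0] * len(A[0])
    r = matvec(At, d)          # residual for the zero starting model
    p = r[:]
    rs = dot(r, r)
    for _ in range(iterations):
        if rs < tol:
            break
        Hp = normal_op(p)
        alpha = rs / dot(p, Hp)
        m = [mi + alpha * pi for mi, pi in zip(m, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return m


# Invented example: 4 rays crossing 3 cells; entries are path lengths.
path_lengths = [[1.0, 1.0, 0.0],
                [0.0, 1.0, 1.0],
                [1.0, 0.0, 1.0],
                [1.0, 1.0, 1.0]]
true_slowness = [0.5, 0.25, 0.25]
traveltimes = matvec(path_lengths, true_slowness)
estimate = tikhonov_cg(path_lengths, traveltimes, lam=1e-8)
print([round(s, 4) for s in estimate])
```

In this setting the number of unknowns equals the number of cells, so aggregating cells inside large structures, as the abstract describes, would shrink the system that the solver has to handle.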
Dibble, Elizabeth H; Lourenco, Ana P; Baird, Grayson L; Ward, Robert C; Maynard, A Stanley; Mainiero, Martha B
2018-01-01
To compare interobserver variability (IOV), reader confidence, and sensitivity/specificity in detecting architectural distortion (AD) on digital mammography (DM) versus digital breast tomosynthesis (DBT). This IRB-approved, HIPAA-compliant reader study used a counterbalanced experimental design. We searched radiology reports for AD on screening mammograms from 5 March 2012-27 November 2013. Cases were consensus-reviewed. Controls were selected from demographically matched non-AD examinations. Two radiologists and two fellows blinded to outcomes independently reviewed images from two patient groups in two sessions. Readers recorded presence/absence of AD and confidence level. Agreement and differences in confidence and sensitivity/specificity between DBT versus DM and attendings versus fellows were examined using weighted Kappa and generalised mixed modeling, respectively. There were 59 AD patients and 59 controls for 1,888 observations (59 × 2 (cases and controls) × 2 breasts × 2 imaging techniques × 4 readers). For all readers, agreement improved with DBT versus DM (0.61 vs. 0.37). Confidence was higher with DBT, p = .001. DBT achieved higher sensitivity (.59 vs. .32), p < .001; specificity remained high (>.90). DBT achieved higher positive likelihood ratio values, smaller negative likelihood ratio values, and larger ROC values. DBT decreases IOV, increases confidence, and improves sensitivity while maintaining high specificity in detecting AD. • Digital breast tomosynthesis decreases interobserver variability in the detection of architectural distortion. • Digital breast tomosynthesis increases reader confidence in the detection of architectural distortion. • Digital breast tomosynthesis improves sensitivity in the detection of architectural distortion.
Automatic algorithm for monitoring systolic pressure variation and difference in pulse pressure.
Pestel, Gunther; Fukui, Kimiko; Hartwich, Volker; Schumacher, Peter M; Vogt, Andreas; Hiltebrand, Luzius B; Kurz, Andrea; Fujita, Yoshihisa; Inderbitzin, Daniel; Leibundgut, Daniel
2009-06-01
Difference in pulse pressure (dPP) reliably predicts fluid responsiveness in patients. We have developed a respiratory variation (RV) monitoring device (RV monitor), which continuously records both airway pressure and arterial blood pressure (ABP). We compared the RV monitor measurements with manual dPP measurements. ABP and airway pressure (PAW) from 24 patients were recorded. Data were fed to the RV monitor to calculate dPP and systolic pressure variation in two different ways: (a) considering both ABP and PAW (RV algorithm) and (b) ABP only (RV(slim) algorithm). Additionally, ABP and PAW were recorded intraoperatively in 10-min intervals for later calculation of dPP by manual assessment. Interobserver variability was determined. Manual dPP assessments were used for comparison with automated measurements. To estimate the importance of the PAW signal, RV(slim) measurements were compared with RV measurements. For the 24 patients, 174 measurements (6-10 per patient) were recorded. Six observers assessed dPP manually in the first 8 patients (10-min interval, 53 measurements); no interobserver variability occurred using a computer-assisted method. Bland-Altman analysis showed acceptable bias and limits of agreement of the 2 automated methods compared with the manual method (RV: -0.33% +/- 8.72% and RV(slim): -1.74% +/- 7.97%). The difference between RV measurements and RV(slim) measurements is small (bias -1.05%, limits of agreement 5.67%). Measurements of the automated device are comparable with measurements obtained by human observers, who use a computer-assisted method. The importance of the PAW signal is questionable.
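The Bland-Altman comparison mentioned above summarizes paired measurements from two methods by their mean difference (bias) and limits of agreement. The sketch below assumes the common convention of bias ± 1.96 standard deviations of the differences; the abstract does not state which convention was used, and the paired values here are invented.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample SD of the differences
    return bias, bias - spread, bias + spread


# Invented dPP values (%) from an automated and a manual method.
automated = [10.0, 12.0, 8.0, 15.0, 11.0]
manual = [11.0, 11.0, 9.0, 14.0, 12.0]
bias, lower, upper = bland_altman(automated, manual)
print(f"bias = {bias:.2f}%, limits of agreement = {lower:.2f}% to {upper:.2f}%")
```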
Boers, A M; Marquering, H A; Jochem, J J; Besselink, N J; Berkhemer, O A; van der Lugt, A; Beenen, L F; Majoie, C B
2013-08-01
Cerebral infarct volume as observed in follow-up CT is an important radiologic outcome measure of the effectiveness of treatment of patients with acute ischemic stroke. However, manual measurement of CIV is time-consuming and operator-dependent. The purpose of this study was to develop and evaluate a robust automated measurement of the CIV. The CIV in early follow-up CT images of 34 consecutive patients with acute ischemic stroke was segmented with an automated intensity-based region-growing algorithm, which includes partial volume effect correction near the skull, midline determination, and ventricle and hemorrhage exclusion. Two observers manually delineated the CIV. Interobserver variability of the manual assessments and the accuracy of the automated method were evaluated by using the Pearson correlation, Bland-Altman analysis, and Dice coefficients. The accuracy was defined as the correlation with the manual assessment as a reference standard. The Pearson correlation for the automated method compared with the reference standard was similar to the manual correlation (R = 0.98). The accuracy of the automated method was excellent with a mean difference of 0.5 mL with limits of agreement of -38.0-39.1 mL, which were more consistent than the interobserver variability of the 2 observers (-40.9-44.1 mL). However, the Dice coefficients were higher for the manual delineation. The automated method showed a strong correlation and accuracy with the manual reference measurement. This approach has the potential to become the standard in assessing the infarct volume as a secondary outcome measure for evaluating the effectiveness of treatment.
Rajagopalan, Malolan S; Khanna, Vineet K; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N; Dicker, Adam P; Lawrence, Yaacov R
2011-09-01
A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention.
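The Flesch-Kincaid grade level reported above combines average sentence length with average syllables per word. A minimal sketch of the standard formula follows; it takes pre-computed counts, since the abstract does not describe how words, sentences, or syllables were counted, and the counts used in the example are invented.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw text counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Invented counts for a short passage.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(f"grade level = {grade:.2f}")
```

Longer sentences and longer words both raise the grade, so a text can score as harder to read even when its content is equally accurate.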
Bell, M R; Britson, P J; Chu, A; Holmes, D R; Bresnahan, J F; Schwartz, R S
1997-01-01
We describe a method of validation of computerized quantitative coronary arteriography and report the results of a new UNIX-based quantitative coronary arteriography software program developed for rapid on-line (digital) and off-line (digital or cinefilm) analysis. The UNIX operating system is widely available in computer systems using very fast processors and has excellent graphics capabilities. The system is potentially compatible with any cardiac digital x-ray system for on-line analysis and has been designed to incorporate an integrated database, have on-line and immediate recall capabilities, and provide digital access to all data. The accuracy (mean signed differences of the observed minus the true dimensions) and precision (pooled standard deviations of the measurements) of the program were determined using x-ray vessel phantoms. Intra- and interobserver variabilities were assessed from in vivo studies during routine clinical coronary arteriography. Precision from the x-ray phantom studies (6-in. field of view) for digital images was 0.066 mm and for digitized cine images was 0.060 mm. Accuracy was 0.076 mm (overestimation) for digital images compared to 0.008 mm for digitized cine images. Diagnostic coronary catheters were also used for calibration; accuracy varied according to size of catheter and whether or not they were filled with iodinated contrast. Intra- and interobserver variabilities were excellent and indicated that coronary lesion measurements were relatively user-independent. Thus, this easy-to-use and very fast UNIX-based program appears to be robust with optimal accuracy and precision for clinical and research applications.
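The two definitions given in this abstract, accuracy as the mean signed difference (observed minus true) and precision as the pooled standard deviation of repeated measurements, can be sketched as below. The phantom diameters and repeated readings are invented, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def accuracy(observed, true_values):
    """Mean signed difference, observed minus true (positive = overestimate)."""
    return mean([o - t for o, t in zip(observed, true_values)])


def pooled_sd(groups):
    """Pooled standard deviation across groups of repeated measurements."""
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return sqrt(numerator / denominator)


# Invented repeated readings (mm) of two phantoms of known diameter.
readings = {1.0: [1.1, 1.0, 1.2], 2.0: [2.0, 2.1, 1.9]}
observed = [x for group in readings.values() for x in group]
truth = [d for d, group in readings.items() for _ in group]
acc = accuracy(observed, truth)
prec = pooled_sd(list(readings.values()))
print(f"accuracy = {acc:.3f} mm, precision = {prec:.3f} mm")
```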
de Heide, John; Vroegh, C J; Szili Torok, T; Gobbens, R J J; Zijlstra, F; Takens-Lameijer, M; Lenzen, M J; Yap, S C; Scholte Op Reimer, W J M
Postprocedural complications after elective cardiac interventions include hematomas and infections. Telemedical wound assessment using mobile phones with integrated cameras may improve quality of care and help reduce costs. We aimed to study the feasibility of telemedical wound assessment using a mobile phone. The primary aim was the number of patients who were able to upload their pictures. Secondary aims were image interpretability, agreement between nurse practitioners, and patient evaluation of the intervention. This is a prospective study of all consecutive patients who underwent an elective cardiac intervention. Patients were instructed to photograph their wound or puncture site after hospital discharge and upload the pictures to a secure email address 6 days after hospital discharge. Received photos were assessed by 2 nurse practitioners. The intervention was evaluated using a peer-reviewed questionnaire and photo assessment scheme. In total, 46 eligible patients were included in the study, with 5 screen failures (eg, clinical stay ≥ 6 days) and 1 patient lost to follow-up. Thirty-three of 40 patients (83%) were able to upload their pictures. Smartphone users were more successful in uploading their pictures compared with feature phone users (93% vs 55%, P < .01). Eighty-eight percent of the clinical pictures were interpretable. The interobserver variability had an agreement between 93% and 97%. Patients are able to take and upload the mobile clinical photos to the secure email address, and the vast majority was interpretable. Smartphone users were more successful than feature phone users in uploading their pictures. The interobserver variability was good.
Fleury, Eduardo F C; Gianini, Ana Claudia; Marcomini, Karem; Oliveira, Vilmar
2018-01-01
To determine the applicability of a computer-aided diagnostic system for strain elastography in the classification of breast masses diagnosed by ultrasound and scored using the criteria proposed by the Breast Imaging Reporting and Data System ultrasound lexicon, and to determine the diagnostic accuracy and interobserver variability. This prospective study was conducted between March 1, 2016, and May 30, 2016. A total of 83 breast masses subjected to percutaneous biopsy were included. Ultrasound elastography images before biopsy were interpreted by 3 radiologists with and without the aid of the computer-aided diagnostic system for strain elastography. The parameters evaluated for each radiologist were sensitivity, specificity, and diagnostic accuracy, with and without the computer-aided diagnostic system for strain elastography. Interobserver variability was assessed using a weighted κ test and an intraclass correlation coefficient. The areas under the receiver operating characteristic curves were also calculated. The areas under the receiver operating characteristic curve were 0.835, 0.801, and 0.765 for readers 1, 2, and 3, respectively, without the computer-aided diagnostic system for strain elastography, and 0.900, 0.926, and 0.868, respectively, with it. The intraclass correlation coefficient between the 3 readers was 0.6713 without the computer-aided diagnostic system for strain elastography and 0.811 with it. The proposed computer-aided diagnostic system for strain elastography has the potential to improve the diagnostic performance of radiologists in breast examination using ultrasound associated with elastography.
Nyns, Emile C A; Dragulescu, Andreea; Yoo, Shi-Joon; Grosse-Wortmann, Lars
2016-09-01
Right ventricular (RV) volume and function evaluation is essential in the follow-up of patients after arterial switch operation (ASO) for dextro-transposition of the great arteries (d-TGA). Cardiac magnetic resonance (CMR) imaging using the Simpson's method is the gold-standard for measuring these parameters. However, this method can be challenging and time-consuming, especially in congenital heart disease. Knowledge-based reconstruction (KBR) is an alternative method to derive volumes from CMR datasets. It is based on the identification of a finite number of anatomical RV landmarks in various planes, followed by computer-based reconstruction of the endocardial contours by matching these landmarks with a reference library of representative RV shapes. The purpose of this study was to evaluate the feasibility, accuracy, reproducibility and labor intensity of KBR for RV volumetry in patients after ASO for d-TGA. The CMR datasets of 17 children and adolescents (males 11, median age 15) were studied for RV volumetry using both KBR and Simpson's method. The intraobserver, interobserver and intermethod variabilities were assessed using Bland-Altman analyses. Good correlation between KBR and Simpson's method was noted. Intraobserver and interobserver variability for KBR showed excellent agreement. Volume and function assessment using KBR was faster when compared with the Simpson's method (5.1 ± 0.6 vs. 6.7 ± 0.9 min, p < 0.001). KBR is a feasible, accurate, reproducible and fast method for measuring RV volumes and function derived from CMR in patients after ASO for d-TGA.
Zhang, Ling; Chen, Siping; Chin, Chien Ting; Wang, Tianfu; Li, Shengli
2012-08-01
To assist radiologists and decrease interobserver variability when using 2D ultrasonography (US) to locate the standardized plane of early gestational sac (SPGS) and to perform gestational sac (GS) biometric measurements. In this paper, the authors report the design of the first automatic solution, called "intelligent scanning" (IS), for selecting SPGS and performing biometric measurements using real-time 2D US. First, the GS is efficiently and precisely located in each ultrasound frame by exploiting a coarse to fine detection scheme based on the training of two cascade AdaBoost classifiers. Next, the SPGS are automatically selected by eliminating false positives. This is accomplished using local context information based on the relative position of anatomies in the image sequence. Finally, a database-guided multiscale normalized cuts algorithm is proposed to generate the initial contour of the GS, based on which the GS is automatically segmented for measurement by a modified snake model. This system was validated on 31 ultrasound videos involving 31 pregnant volunteers. The differences between system performance and radiologist performance with respect to SPGS selection and length and depth (diameter) measurements are 7.5% ± 5.0%, 5.5% ± 5.2%, and 6.5% ± 4.6%, respectively. Additional validations prove that the IS precision is in the range of interobserver variability. Our system can display the SPGS along with biometric measurements in approximately three seconds after the video ends, when using a 1.9 GHz dual-core computer. IS of the GS from 2D real-time US is a practical, reproducible, and reliable approach.
Kushnir, Vladimir M.; Wani, Sachin B.; Fowler, Kathryn; Menias, Christine; Varma, Rakesh; Narra, Vamsi; Hovis, Christine; Murad, Faris; Mullady, Daniel; Jonnalagadda, Sreenivasa S.; Early, Dayna S.; Edmundowicz, Steven A.; Azar, Riad R.
2014-01-01
OBJECTIVES There are limited data comparing imaging modalities in the diagnosis of pancreas divisum. We aimed to (1) evaluate the sensitivity of endoscopic ultrasound (EUS), magnetic resonance cholangiopancreatography (MRCP) and multi-detector computed tomography (MDCT) for pancreas divisum, and (2) assess interobserver agreement (IOA) among expert radiologists for detecting pancreas divisum on MDCT and MRCP. METHODS For this retrospective cohort study, we identified 45 consecutive patients with pancreaticobiliary symptoms and pancreas divisum established by endoscopic retrograde pancreatography (ERP) who underwent EUS and cross-sectional imaging. The control group was composed of patients without pancreas divisum who underwent ERP and cross-sectional imaging. RESULTS The sensitivity of EUS for pancreas divisum was 86.7%, significantly higher than sensitivity reported in the medical records for MDCT (15.5%) or MRCP (60%) [p<0.001 for each]. On review by expert radiologists the sensitivity of MDCT increased to 83.3% in cases where the pancreatic duct was visualized, with fair IOA (κ=0.34). Expert review of MRCPs did not identify any additional cases of pancreas divisum; IOA was moderate (κ=0.43). CONCLUSIONS EUS is a sensitive test for diagnosing pancreas divisum and is superior to MDCT and MRCP. Review of MDCT studies by expert radiologists substantially raises its sensitivity for pancreas divisum. PMID:23211370
Delage Royle, Audrey; Balg, Frédéric; Bouliane, Martin J; Canet-Silvestri, Fanny; Garant-Saine, Laurianne; Sheps, David M; Lapner, Peter; Rouleau, Dominique M
2017-10-01
Quantifying glenohumeral bone loss is key in preoperative surgical planning for a successful Bankart repair. Simple radiographs can accurately measure bone defects in cases of recurrent shoulder instability. Cohort study (diagnosis); Level of evidence, 2. A true anteroposterior (AP) view, alone and in combination with an axillary view, was used to evaluate the diagnostic properties of radiographs compared with computed tomography (CT) scan, the current gold standard, to predict significant bone defects in 70 patients. Sensitivity, specificity, and positive and negative predictive values were evaluated and compared. Detection of glenoid bone loss on plain film radiographs, with and without axillary view, had a sensitivity of 86% for both views and a specificity of 73% and 64% with and without the axillary view, respectively. For detection of humeral bone loss, the sensitivity was 8% and 17% and the specificity was 98% and 91% with and without the axillary view, respectively. Regular radiographs would have missed 1 instance of significant bone loss on the glenoid side and 20 on the humeral side. Interobserver reliabilities were moderate for glenoid detection (κ = 0.473-0.503) and poor for the humeral side (κ = 0.278-0.336). Regular radiographs showed suboptimal sensitivity, specificity, and reliability. Therefore, CT scan should be considered in the treatment algorithm for accurate quantification of bone loss to prevent high rates of recurrent instability.
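The four diagnostic properties evaluated above all derive from a 2×2 table that cross-tabulates the index test (radiographs) against the reference standard (CT). A minimal sketch follows; the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected among those with the defect
        "specificity": tn / (tn + fp),  # cleared among those without it
        "ppv": tp / (tp + fp),          # true defects among positive tests
        "npv": tn / (tn + fn),          # truly clear among negative tests
    }


# Hypothetical counts: 50 patients with a defect on CT, 50 without.
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

A test can show high specificity and very low sensitivity at the same time, which is the pattern the abstract reports for humeral bone loss on radiographs.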
Feng, Xin; Li, Gang; Qu, Zhenyu; Liu, Lin; Näsström, Karin; Shi, Xie-Qi
2015-02-01
In this study, we aimed to evaluate the adenoidal nasopharyngeal ratio (ANR) on lateral cephalograms by assessing upper airway volumes using cone-beam computed tomography (CBCT) images as the validation method. Fifty-five patients were included in the study, and it was essential that the lateral cephalograms and CBCT images taken at their examinations were not more than 1 week apart. There were 32 subjects in group A (age ≤15 years) and 23 subjects in group B (age >15 years). The ANR was measured on the lateral cephalograms. The area and volumetric measurements of the nasopharynx and the total upper airway were obtained from CBCT images. Repeated measurements of the ANR and airway volume were performed on 10 subjects by 2 observers. Group A had a higher correlation (r = -0.78) between the ANR and the nasopharynx volume than did group B (r = -0.57). The ANR had a weak correlation with the total upper airway volume (group A, r = -0.48; group B, r = -0.32). Both measurements made on lateral cephalograms and CBCT were highly reproducible in terms of intraobserver and interobserver agreement. Based on our results, the measurement of the ANR on lateral cephalograms can be used as an initial screening method to estimate the nasopharynx volumes of younger patients (age ≤15 years). Copyright © 2015 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
Lee, Sungwon; Jee, Won-Hee; Jung, Joon-Yong; Lee, So-Yeon; Ryu, Kyeung-Sik; Ha, Kee-Yong
2015-02-01
Three-dimensional (3D) fast spin-echo sequence with variable flip-angle refocusing pulse allows retrospective alignments of magnetic resonance imaging (MRI) in any desired plane. To compare isotropic 3D T2-weighted (T2W) turbo spin-echo sequence (TSE-SPACE) with standard two-dimensional (2D) T2W TSE imaging for evaluating lumbar spine pathology at 3.0 T MRI. Forty-two patients who had spine surgery for disk herniation and had 3.0 T spine MRI were included in this study. In addition to standard 2D T2W TSE imaging, sagittal 3D T2W TSE-SPACE was obtained to produce multiplanar (MPR) images. Each set of MR images from 3D T2W TSE-SPACE and 2D T2W TSE was independently scored for the degree of lumbar neural foraminal stenosis, central spinal stenosis, and nerve compression by two reviewers. These scores were compared with operative findings and the sensitivities were evaluated by McNemar test. Inter-observer agreements and the correlation with symptom laterality were assessed with kappa statistics. The 3D T2W TSE-SPACE and 2D T2W TSE had similar sensitivity in detecting foraminal stenosis (78.9% versus 78.9% in 32 foramen levels), spinal stenosis (100% versus 100% in 42 spinal levels), and nerve compression (92.9% versus 81.8% in 59 spinal nerves). The inter-observer agreements (κ = 0.849 vs. 0.451 for foraminal stenosis, κ = 0.809 vs. 0.503 for spinal stenosis, and κ = 0.681 vs. 0.429 for nerve compression) and symptom correlation (κ = 0.449 vs. κ = 0.242) were better in 3D TSE-SPACE compared to 2D TSE. 3D TSE-SPACE with oblique coronal MPR images demonstrated better inter-observer agreements compared to 3D TSE-SPACE without oblique coronal MPR images (κ = 0.930 vs. κ = 0.681). Isotropic 3D T2W TSE-SPACE at 3.0 T was comparable to 2D T2W TSE for detecting foraminal stenosis, central spinal stenosis, and nerve compression with better inter-observer agreements and symptom correlation.
© The Foundation Acta Radiologica 2014.
von Arx, Thomas; Janner, Simone F M; Hänni, Stefan; Bornstein, Michael M
2016-02-01
Conventional 2-dimensional radiography uses defined criteria for outcome assessment of apical surgery. However, these radiographic healing criteria are not applicable for 3-dimensional radiography. The present study evaluated the repeatability and reproducibility of new cone-beam computed tomographic (CBCT)-based healing criteria for the judgment of periapical healing 1 year after apical surgery. CBCT scans taken 1 year after apical surgery (61 roots of 54 teeth in 54 patients, mean age = 54.4 years) were evaluated by 3 blinded and calibrated observers using 4 different indices. Reformatted buccolingual CBCT sections through the longitudinal axis of the treated roots were analyzed. Radiographic healing was assessed at the resection plane (R index), within the apical area (A index), of the cortical plate (C index), and regarding a combined apical-cortical area (B index). All readings were performed twice to calculate the intraobserver agreement (repeatability). Second-time readings were used for analyzing the interobserver agreement (reproducibility). Various statistical tests (Cohen, kappa, Fisher, and Spearman) were performed to measure the intra- and interobserver concurrence, the variability of score ratios, and the correlation of indices. For all indices, the rates of identical first- and second-time scores were always higher than 80% (intraobserver Cohen κ values ranging from 0.793 to 0.963). The B index (94.0%) showed the highest intraobserver agreement. Regarding interobserver agreement, the highest rate was found for the B index (72.1%). The Fleiss' κ values for R and B indices exhibited substantial agreement (0.626 and 0.717, respectively), whereas the values for A and C indices showed moderate agreement (0.561 and 0.573, respectively). The Spearman correlation coefficients for R, A, C, and B indices all exhibited a moderate to very strong correlation with the highest correlation found between C and B indices (rs = 0.8069). 
All indices showed an excellent intraobserver agreement (repeatability). With regard to interobserver agreement (reproducibility), the B index (healing of apical and cortical defects combined) and the R index (healing on the resection plane) showed substantial congruence and thus are to be recommended in future studies when using buccolingual CBCT sections for radiographic outcome assessment of apical surgery. Copyright © 2016 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
Kawaguchi, Yurika Maria Fogaça; Nawa, Ricardo Kenji; Figueiredo, Thais Borgheti; Martins, Lourdes; Pires-Neto, Ruy Camargo
2016-01-01
ABSTRACT Objective: To translate the Perme Intensive Care Unit Mobility Score and the ICU Mobility Scale (IMS) into Portuguese, creating versions that are cross-culturally adapted for use in Brazil, and to determine the interobserver agreement and reliability for both versions. Methods: The processes of translation and cross-cultural validation consisted of the following steps: preparation, translation, reconciliation, synthesis, back-translation, review, approval, and pre-test. The Portuguese-language versions of both instruments were then used by two researchers to evaluate critically ill ICU patients. Weighted kappa statistics and Bland-Altman plots were used to verify interobserver agreement for the two instruments. In each of the domains of the instruments, interobserver reliability was evaluated with Cronbach's alpha coefficient. The correlation between the instruments was assessed by Spearman's correlation test. Results: The study sample comprised 103 patients, 56 (54%) of whom were male, with a mean age of 52 ± 18 years. The main reason for ICU admission (in 44%) was respiratory failure. Both instruments showed excellent interobserver agreement (κ > 0.90) and reliability (α > 0.90) in all domains. Interobserver bias was low for the IMS and the Perme Score (−0.048 ± 0.350 and −0.06 ± 0.73, respectively). The 95% CIs for the same instruments ranged from −0.73 to 0.64 and −1.50 to 1.36, respectively. There was also a strong positive correlation between the two instruments (r = 0.941; p < 0.001). Conclusions: In their versions adapted for use in Brazil, both instruments showed high interobserver agreement and reliability. PMID:28117473
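The bias and range figures reported above appear consistent with the usual Bland-Altman construction, in which the 95% limits of agreement are the mean difference plus or minus 1.96 standard deviations. The sketch below is only an illustration under that assumption; the function name `limits_of_agreement` is hypothetical and the inputs are the rounded values quoted in the abstract, so small rounding differences from the published ranges are expected.

```python
def limits_of_agreement(bias, sd, z=1.96):
    """Return (lower, upper) Bland-Altman limits: bias -/+ z * SD.

    Assumes the differences between observers are roughly normal,
    which is the standard assumption behind the 1.96 multiplier.
    """
    half_width = z * sd
    return bias - half_width, bias + half_width


# Rounded bias and SD values as quoted in the abstract.
ims = limits_of_agreement(-0.048, 0.350)
perme = limits_of_agreement(-0.06, 0.73)

print("IMS limits:   %.2f to %.2f" % ims)
print("Perme limits: %.2f to %.2f" % perme)
```

For the IMS the computed limits round to −0.73 and 0.64, matching the reported range; for the Perme Score they fall within about 0.01 of the reported −1.50 to 1.36.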
Amini, Michael H; Sykes, Joshua B; Olson, Stephen T; Smith, Richard A; Mauck, Benjamin M; Azar, Frederick M; Throckmorton, Thomas W
2015-03-01
The severity of elbow arthritis is one of many factors that surgeons must evaluate when considering treatment options for a given patient. Elbow surgeons have historically used the Broberg and Morrey (BM) and Hastings and Rettig (HR) classification systems to radiographically stage the severity of post-traumatic arthritis (PTA) and primary osteoarthritis (OA). We proposed to compare the intraobserver and interobserver reliability between systems for patients with either PTA or OA. The radiographs of 45 patients were evaluated at least 2 weeks apart by 6 evaluators of different levels of training. Intraobserver and interobserver reliability were calculated by Spearman correlation coefficients with 95% confidence intervals. Agreement was considered almost perfect for coefficients >0.80 and substantial for coefficients of 0.61 to 0.80. In patients with both PTA and OA, intraobserver reliability and interobserver reliability were substantial, with no difference between classification systems. There were no significant differences in intraobserver or interobserver reliability between attending physicians and trainees for either classification system (all P > .10). The presence of fracture implants did not affect reliability in the BM system but did substantially worsen reliability in the HR system (intraobserver P = .04 and interobserver P = .001). The BM and HR classifications both showed substantial intraobserver and interobserver reliability for PTA and OA. Training level differences did not affect reliability for either system. Both trainees and fellowship-trained surgeons may easily and reliably apply each classification system to the evaluation of primary elbow OA and PTA, although the HR system was less reliable in the presence of fracture implants. Copyright © 2015 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
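The two cut-offs quoted above (almost perfect above 0.80, substantial from 0.61 to 0.80) match the upper bands of the widely used Landis and Koch benchmark scale. As a hedged illustration, the following sketch maps a coefficient to a label; the lower bands (slight, fair, moderate) come from that general convention and are an assumption here, not something stated in the abstract, and the function name is hypothetical.

```python
def landis_koch_label(coefficient):
    """Map an agreement coefficient to a Landis-Koch style label.

    Only the top two bands are quoted in the abstract; the lower
    bands follow the conventional scale and are assumed here.
    """
    if coefficient < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if coefficient <= upper:
            return label
    return "almost perfect"


for value in (0.35, 0.61, 0.80, 0.81):
    print(value, landis_koch_label(value))
```

Such labels are descriptive conventions rather than statistical tests, so the same coefficient may be worded differently by other authors.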
Høyer, Christian; Pavar, Susanne; Pedersen, Begitte H; Biurrun Manresa, José A; Petersen, Lars J
2013-08-01
Mercury-in-silastic strain gauge plethysmography (SGP) is a well-established technique for blood flow and blood pressure measurements. The aim of this study was to examine (i) the possible influence of clinical clues, e.g. the presence of wounds and color changes during blood pressure measurements, and (ii) intra- and inter-observer variation of curve interpretation for segmental blood pressure measurements. A total of 204 patients with known or suspected peripheral arterial disease (PAD) were included in a diagnostic accuracy trial. Toe and ankle pressures were measured in both limbs, and primary observers analyzed a total of 804 pressure curve sets. The SGP curves were later reanalyzed separately by two observers blinded to clinical clues. Intra- and inter-observer agreement was quantified using Cohen's kappa and reliability was quantified using intra-class correlation coefficients, coefficients of variation, and Bland-Altman analysis. There was an overall agreement regarding patient diagnostic classification (PAD/not PAD) in 202/204 (99.0%) for intra-observer (κ = 0.969, p < 0.001), and 201/204 (98.5%) for inter-observer readings (κ = 0.953, p < 0.001). Reliability analysis showed excellent correlation between blinded versus non-blinded and inter-observer readings for determination of absolute segmental pressures (all intraclass correlation coefficients ≥ 0.984). The coefficient of variation for determination of absolute segmental blood pressure ranged from 2.9-3.4% for blinded/non-blinded data and from 3.8-5.0% for inter-observer data. This study shows a low inter-observer variation among experienced laboratory technicians for reading strain gauge curves. The low variation between blinded/non-blinded readings indicates that SGP measurements are minimally biased by clinical clues.
Pedersen, Ken Steen; Toft, Nils
2011-03-01
The objective of the current study was to evaluate intra- and inter-observer agreement using a descriptive classification scale with four categories, descriptive text and pictures for assessment of consistency in faecal samples from pigs post weaning. The four consistency categories were score one=firm and shaped, score two=soft and shaped, score three=loose and score four=watery. Five observers from the same veterinary practice examined 100 faecal samples using the scale with four categories. Four of the observers examined the 100 faecal samples twice within the same day. Within observers the difference in proportions for the individual consistency categories between two examinations was on average 0.04 (range: 0-0.10). The mean intra-observer agreement was 0.82 (range: 0.72-0.91) with a mean kappa value of 0.76 (range: 0.61-0.88). For inter-observer agreement overall kappa was 0.64. For the 10 pair-wise comparisons the mean inter-observer agreement was 0.73 (range: 0.61-0.90) with a mean kappa value of 0.64 (range: 0.48-0.87). The difference in proportions for the individual consistency categories was on average 0.08 (range: 0-0.17). In conclusion, the agreement observed for the descriptive classification scale with four categories, descriptive text and pictures may be categorized as a substantial to almost perfect intra-observer agreement and a moderate to almost perfect inter-observer agreement. However, more objective measures than clinical scales may still be needed to improve intra- and inter-observer agreement in research studies. Copyright © 2010 Elsevier B.V. All rights reserved.
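To make the pair-wise design concrete: five observers give 10 distinct observer pairs, and for each pair both the raw proportion of agreement and Cohen's kappa can be computed. The following is a minimal sketch under stated assumptions; the observer names and the scores are invented for illustration and are not the study data, and `cohen_kappa` is a hypothetical helper implementing the standard unweighted formula.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of categories."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of marginal proportions, summed over categories.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical consistency scores (1 = firm ... 4 = watery) for 8 samples.
scores = {
    "A": [1, 2, 2, 3, 4, 1, 3, 4],
    "B": [1, 2, 3, 3, 4, 1, 3, 4],
    "C": [1, 2, 2, 3, 4, 2, 3, 4],
    "D": [1, 1, 2, 3, 4, 1, 3, 3],
    "E": [2, 2, 2, 3, 4, 1, 4, 4],
}

pairs = list(combinations(scores, 2))  # 10 pairs for 5 observers
kappas = [cohen_kappa(scores[p], scores[q]) for p, q in pairs]
mean_kappa = sum(kappas) / len(kappas)
print(len(pairs), "pairs, mean kappa = %.2f" % mean_kappa)
```

Because the scale is ordinal, a weighted kappa that penalises one-category disagreements less than larger ones would be a reasonable alternative; the unweighted form is used here only to keep the sketch short.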
Folks, Russell D; Savir-Baruch, Bital; Garcia, Ernest V; Verdes, Liudmila; Taylor, Andrew T
2012-12-01
Our objective was to design and implement a clinical history database capable of linking to our database of quantitative results from (99m)Tc-mercaptoacetyltriglycine (MAG3) renal scans and export a data summary for physicians or our software decision support system. For database development, we used a commercial program. Additional software was developed in Interactive Data Language. MAG3 studies were processed using an in-house enhancement of a commercial program. The relational database has 3 parts: a list of all renal scans (the RENAL database), a set of patients with quantitative processing results (the Q2 database), and a subset of patients from Q2 containing clinical data manually transcribed from the hospital information system (the CLINICAL database). To test interobserver variability, a second physician transcriber reviewed 50 randomly selected patients in the hospital information system and tabulated 2 clinical data items: hydronephrosis and presence of a current stent. The CLINICAL database was developed in stages and contains 342 fields comprising demographic information, clinical history, and findings from up to 11 radiologic procedures. A scripted algorithm is used to reliably match records present in both Q2 and CLINICAL. An Interactive Data Language program then combines data from the 2 databases into an XML (extensible markup language) file for use by the decision support system. A text file is constructed and saved for review by physicians. RENAL contains 2,222 records, Q2 contains 456 records, and CLINICAL contains 152 records. The interobserver variability testing found a 95% match between the 2 observers for presence or absence of ureteral stent (κ = 0.52), a 75% match for hydronephrosis based on narrative summaries of hospitalizations and clinical visits (κ = 0.41), and a 92% match for hydronephrosis based on the imaging report (κ = 0.84). 
We have developed a relational database system to integrate the quantitative results of MAG3 image processing with clinical records obtained from the hospital information system. We also have developed a methodology for formatting clinical history for review by physicians and export to a decision support system. We identified several pitfalls, including the fact that important textual information extracted from the hospital information system by knowledgeable transcribers can show substantial interobserver variation, particularly when record retrieval is based on the narrative clinical records.
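The combination of a high percentage match with only a moderate kappa, as reported for the stent item, is typical when one category is rare, because chance agreement is then already high. The sketch below illustrates that effect with invented 2x2 counts (they are not the study data); `kappa_2x2` is a hypothetical helper using the standard Cohen's kappa formula.

```python
def kappa_2x2(both_yes, only_first, only_second, both_no):
    """Return (percent agreement, Cohen's kappa) from 2x2 agreement counts."""
    n = both_yes + only_first + only_second + both_no
    observed = (both_yes + both_no) / n
    first_yes = (both_yes + only_first) / n
    second_yes = (both_yes + only_second) / n
    # Agreement expected by chance from each observer's marginal rates.
    expected = first_yes * second_yes + (1 - first_yes) * (1 - second_yes)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical counts: same 95% raw agreement, different prevalence.
rare = kappa_2x2(both_yes=1, only_first=1, only_second=1, both_no=37)
balanced = kappa_2x2(both_yes=19, only_first=1, only_second=1, both_no=19)

print("rare finding:     agreement %.2f, kappa %.2f" % rare)      # kappa ~0.47
print("balanced finding: agreement %.2f, kappa %.2f" % balanced)  # kappa 0.90
```

Both tables agree on 95% of cases, yet kappa drops from 0.90 to about 0.47 when the finding is present in only a few patients, which is one plausible reading of the stent result.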
Interobserver Agreement on First-Stage Conversation Analytic Transcription
ERIC Educational Resources Information Center
Roberts, Felicia; Robinson, Jeffrey D.
2004-01-01
This investigation assesses interobserver agreement on conversation analytic (CA) transcription. Four professional CA transcribers spent a maximum of 3 hours transcribing 2.5 minutes of a previously unknown, naturally occurring, mundane telephone call. Researchers unitized transcripts into words, sounds, silences, inbreaths, outbreaths, and laugh…
Roberson, David W; Kentala, Erna; Forbes, Peter
2005-12-01
The goals of this project were 1) to develop and validate an objective instrument to measure surgical performance at tonsillectomy, 2) to assess its interobserver and interobservation reliability and construct validity, and 3) to select those items with best reliability and most independent information to design a simplified form suitable for routine use in otolaryngology surgical evaluation. Prospective, observational data collection for an educational quality improvement project. The evaluation instrument was based on previous instruments developed in general surgery with input from attending otolaryngologic surgeons and experts in medical education. It was pilot tested and subjected to iterative improvements. After the instrument was finalized, a total of 55 tonsillectomies were observed and scored during academic year 2002 to 2003: 45 cases by residents at different points during their rotation, 5 by fellows, and 5 by faculty. Results were assessed for interobserver reliability, interobservation reliability, and construct validity. Factor analysis was used to identify items with independent information. Interobserver and interobservation reliability was high. On technical items, faculty substantially outperformed fellows, who in turn outperformed residents (P < .0001 for both comparisons). On the "global" scale (overall assessment), residents improved an average of 1 full point (on a 5 point scale) during a 3 month rotation (P = .01). In the subscale of "patient care," results were less clear cut: fellows outperformed residents, who in turn outperformed faculty, but only the fellows to faculty comparison was statistically significant (P = .04), and residents did not clearly improve over time (P = .36). Factor analysis demonstrated that technical items and patient care items factor separately and thus represent separate skill domains in surgery. 
It is possible to objectively measure surgical skill at tonsillectomy with high reliability and good construct validity. Factor analysis demonstrated that patient care is a distinct domain in surgical skill. Although the interobserver reliability for some patient care items reached statistical significance, it was not high enough for "high stakes testing" purposes. Using reliability and factor analysis results, we propose a simplified instrument for use in evaluating trainees in otolaryngologic surgery.
Does the Modified Gartland Classification Clarify Decision Making?
Leung, Sophia; Paryavi, Ebrahim; Herman, Martin J; Sponseller, Paul D; Abzug, Joshua M
2018-01-01
The modified Gartland classification system for pediatric supracondylar fractures is often utilized as a communication tool to aid in determining whether or not a fracture warrants operative intervention. This study sought to determine the interobserver and intraobserver reliability of the Gartland classification system, as well as to determine whether there was agreement that a fracture warranted operative intervention regardless of the classification system. A total of 200 anteroposterior and lateral radiographs of pediatric supracondylar humerus fractures were retrospectively reviewed by 3 fellowship-trained pediatric orthopaedic surgeons and 2 orthopaedic residents and then classified as type I, IIa, IIb, or III. The surgeons then recorded whether they would treat the fracture nonoperatively or operatively. The κ coefficients were calculated to determine interobserver and intraobserver reliability. Overall, the Wilkins-modified Gartland classification has low-moderate interobserver reliability (κ=0.475) and high intraobserver reliability (κ=0.777). A low interobserver reliability was found when differentiating between type IIa and IIb (κ=0.240) among attendings. There was moderate-high interobserver reliability for the decision to operate (κ=0.691) and high intraobserver reliability (κ=0.760). Decreased interobserver reliability was present for decision to operate among residents. For fractures classified as type I, the decision to operate was made 3% of the time and 27% for type IIa. The decision was made to operate 99% of the time for type IIb and 100% for type III. There is almost full agreement for the nonoperative treatment of Type I fractures and operative treatment for type III fractures. There is agreement that type IIb fractures should be treated operatively and that the majority of type IIa fractures should be treated nonoperatively. However, the interobserver reliability for differentiating between type IIa and IIb fractures is low. 
Our results validate the Gartland classification system as a method to help direct treatment of pediatric supracondylar humerus fractures, although the modification of the system, IIa versus IIb, seems to have limited reliability and utility. Terminology based on decision to treat may lead to a more clinically useful classification system in the evaluation and treatment of pediatric supracondylar humerus fractures. Level III-diagnostic studies.
Reliability analysis for digital adolescent idiopathic scoliosis measurements.
Kuklo, Timothy R; Potter, Benjamin K; O'Brien, Michael F; Schroeder, Teresa M; Lenke, Lawrence G; Polly, David W
2005-04-01
Analysis of adolescent idiopathic scoliosis (AIS) requires a thorough clinical and radiographic evaluation to completely assess the three-dimensional deformity. Recently, these radiographic parameters have been analyzed for reliability and reproducibility following manual measurements; however, most of these parameters have not been analyzed with regard to digital measurements. The purpose of this study is to determine the intra- and interobserver reliability of common scoliosis radiographic parameters using a digital software measurement program. Thirty sets of preoperative (posteroanterior [PA], lateral, and side-bending [SB]) and postoperative (PA and lateral) radiographs were analyzed by three independent observers on two separate occasions using a software measurement program (PhDx, Albuquerque, NM). Coronal measures included main thoracic (MT) and thoracolumbar-lumbar (TL/L) Cobb, SB MT Cobb, MT and TL/L apical vertical translation (AVT), C7 to center sacral vertical line (CSVL), T1 tilt, LIV tilt, disk below lowest instrumented vertebra (LIV), coronal balance, and Risser, whereas sagittal measures included T2-T5, T5-T12, T2-T12, T10-L2, T12-S1, and sagittal balance. Analysis of variance for repeated measures or Cohen three-way kappa correlation coefficient analysis was performed as appropriate to calculate the intra- and interobserver reliability for each parameter. The majority of the radiographic parameters assessed demonstrated good or excellent intra- and interobserver reliability. 
The relationship of the LIV to the CSVL (intraobserver kappa = 0.48-0.78, fair to excellent; interobserver kappa = 0.34-0.41, fair to poor), interobserver measurement of AVT (rho = 0.49-0.73, low to good), Risser grade (intraobserver rho = 0.41-0.97, low to excellent; interobserver rho = 0.60-0.70, fair to good), intraobserver measurement of the angulation of the disk inferior to the LIV (rho = 0.53-0.88, fair to good), apical Nash-Moe vertebral rotation (intraobserver rho = 0.50-0.85, fair to good; interobserver rho = 0.53-0.59, fair), and especially regional thoracic kyphosis from T2 to T5 (intraobserver rho = 0.22-0.65, poor to fair; interobserver rho = 0.33-0.47, low) demonstrated lesser reliability. In general, preoperative measures demonstrated greater reliability than postoperative measures, and coronal angular measures were more reliable than sagittal measures. Most common radiographic parameters for AIS assessment demonstrated good or excellent reliability for digital measurement and can be recommended for routine clinical and academic use. Preoperative assessments and coronal measures may be more reliable than postoperative and sagittal measurements. The reliability of digital measurements will be increasingly important as digital radiographic viewing becomes commonplace.
Interobserver error involved in independent attempts to measure cusp base areas of Pan M1s
Bailey, Shara E; Pilbrow, Varsha C; Wood, Bernard A
2004-01-01
Cusp base areas measured from digitized images increase the amount of detailed quantitative information one can collect from post-canine crown morphology. Although this method is gaining wide usage for taxonomic analyses of extant and extinct hominoids, the techniques for digitizing images and taking measurements differ between researchers. The aim of this study was to investigate interobserver error in order to help assess the reliability of cusp base area measurement within extant and extinct hominoid taxa. Two of the authors measured individual cusp base areas and total cusp base area of 23 maxillary first molars (M1) of Pan. From these, relative cusp base areas were calculated. No statistically significant interobserver differences were found for either absolute or relative cusp base areas. On average the hypocone and paracone showed the least interobserver error (< 1%) whereas the protocone and metacone showed the most (2.6–4.5%). We suggest that the larger measurement error in the metacone/protocone is due primarily to either weakly defined fissure patterns and/or the presence of accessory occlusal features. Overall, levels of interobserver error are similar to those found for intraobserver error. The results of our study suggest that if certain prescribed standards are employed then cusp and crown base areas measured by different individuals can be pooled into a single database. PMID:15447691
Ploner, Stefan B; Moult, Eric M; Choi, WooJhon; Waheed, Nadia K; Lee, ByungKun; Novais, Eduardo A; Cole, Emily D; Potsaid, Benjamin; Husvogt, Lennart; Schottenhamml, Julia; Maier, Andreas; Rosenfeld, Philip J; Duker, Jay S; Hornegger, Joachim; Fujimoto, James G
2016-12-01
Currently available optical coherence tomography angiography systems provide information about blood flux but only limited information about blood flow speed. The authors develop a method for mapping the previously proposed variable interscan time analysis (VISTA) algorithm into a color display that encodes relative blood flow speed. Optical coherence tomography angiography was performed with a 1,050 nm, 400 kHz A-scan rate, swept source optical coherence tomography system using a 5 repeated B-scan protocol. Variable interscan time analysis was used to compute the optical coherence tomography angiography signal from B-scan pairs having 1.5 millisecond and 3.0 milliseconds interscan times. The resulting VISTA data were then mapped to a color space for display. The authors evaluated the VISTA visualization algorithm in normal eyes (n = 2), nonproliferative diabetic retinopathy eyes (n = 6), proliferative diabetic retinopathy eyes (n = 3), geographic atrophy eyes (n = 4), and exudative age-related macular degeneration eyes (n = 2). All eyes showed blood flow speed variations, and all eyes with pathology showed abnormal blood flow speeds compared with controls. The authors developed a novel method for mapping VISTA into a color display, allowing visualization of relative blood flow speeds. The method was found useful, in a small case series, for visualizing blood flow speeds in a variety of ocular diseases and serves as a step toward quantitative optical coherence tomography angiography.
Development and initial validation of the Classification of Early-Onset Scoliosis (C-EOS).
Williams, Brendan A; Matsumoto, Hiroko; McCalla, Daren J; Akbarnia, Behrooz A; Blakemore, Laurel C; Betz, Randal R; Flynn, John M; Johnston, Charles E; McCarthy, Richard E; Roye, David P; Skaggs, David L; Smith, John T; Snyder, Brian D; Sponseller, Paul D; Sturm, Peter F; Thompson, George H; Yazici, Muharrem; Vitale, Michael G
2014-08-20
Early-onset scoliosis is a heterogeneous condition, with highly variable manifestations and natural history. No standardized classification system exists to describe and group patients, to guide optimal care, or to prognosticate outcomes within this population. A classification system for early-onset scoliosis is thus a necessary prerequisite to the timely evolution of care of these patients. Fifteen experienced surgeons participated in a nominal group technique designed to achieve a consensus-based classification system for early-onset scoliosis. A comprehensive list of factors important in managing early-onset scoliosis was generated using a standardized literature review, semi-structured interviews, and open forum discussion. Three group meetings and two rounds of surveying guided the selection of classification components, subgroupings, and cut-points. Initial validation of the system was conducted using an interobserver reliability assessment based on the classification of a series of thirty cases. Nominal group technique was used to identify three core variables (major curve angle, etiology, and kyphosis) with high group content validity scores. Age and curve progression ranked slightly lower. Participants evaluated the cases of thirty patients with early-onset scoliosis for reliability testing. The mean kappa value for etiology (0.64) was substantial, while the mean kappa values for major curve angle (0.95) and kyphosis (0.93) indicated almost perfect agreement. The final classification consisted of a continuous age prefix, etiology (congenital or structural, neuromuscular, syndromic, and idiopathic), major curve angle (1, 2, 3, or 4), and kyphosis (-, N, or +) variables, and an optional progression modifier (P0, P1, or P2). 
Utilizing formal consensus-building methods in a large group of surgeons experienced in treating early-onset scoliosis, a novel classification system for early-onset scoliosis was developed with all core components demonstrating substantial to excellent interobserver reliability. This classification system will serve as a foundation to guide ongoing research efforts and standardize communication in the clinical setting. Copyright © 2014 by The Journal of Bone and Joint Surgery, Incorporated.
Lee, Kyoung Min; Lee, Jaebong; Chung, Chin Youb; Ahn, Soyeon; Sung, Ki Hyuk; Kim, Tae Won; Lee, Hui Jong; Park, Moon Seok
2012-06-01
Intra-class correlation coefficients (ICCs) provide a statistical means of testing the reliability. However, their interpretation is not well documented in the orthopedic field. The purpose of this study was to investigate the use of ICCs in the orthopedic literature and to demonstrate pitfalls regarding their use. First, orthopedic articles that used ICCs were retrieved from the Pubmed database, and journal demography, ICC models and concurrent statistics used were evaluated. Second, reliability test was performed on three common physical examinations in cerebral palsy, namely, the Thomas test, the Staheli test, and popliteal angle measurement. Thirty patients were assessed by three orthopedic surgeons to explore the statistical methods testing reliability. Third, the factors affecting the ICC values were examined by simulating the data sets based on the physical examination data where the ranges, slopes, and interobserver variability were modified. Of the 92 orthopedic articles identified, 58 articles (63%) did not clarify the ICC model used, and only 5 articles (5%) described all models, types, and measures. In reliability testing, although the popliteal angle showed a larger mean absolute difference than the Thomas test and the Staheli test, the ICC of popliteal angle was higher, which was believed to be contrary to the context of measurement. In addition, the ICC values were affected by the model, type, and measures used. In simulated data sets, the ICC showed higher values when the range of data sets were larger, the slopes of the data sets were parallel, and the interobserver variability was smaller. Care should be taken when interpreting the absolute ICC values, i.e., a higher ICC does not necessarily mean less variability because the ICC values can also be affected by various factors. The authors recommend that researchers clarify ICC models used and ICC values are interpreted in the context of measurement.
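The dependence of the ICC on the range of the data can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a two-way random-effects, absolute-agreement, single-rater model (often written ICC(2,1)), uses invented data for 30 subjects and 3 raters, and applies the same rater noise to a narrow and a wide spread of true values.

```python
import random


def icc_2_1(scores):
    """ICC(2,1) from a subjects x raters table (list of equal-length rows)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_subjects, n_raters = 30, 3
noise = [[rng.gauss(0, 2) for _ in range(n_raters)] for _ in range(n_subjects)]
true_values = [4 * i / (n_subjects - 1) for i in range(n_subjects)]  # spans 0 to 4

# Identical rater noise; only the spread of the true values differs.
narrow = [[t + e for e in row] for t, row in zip(true_values, noise)]
wide = [[10 * t + e for e in row] for t, row in zip(true_values, noise)]

icc_narrow, icc_wide = icc_2_1(narrow), icc_2_1(wide)
print("narrow range ICC: %.2f" % icc_narrow)
print("wide range ICC:   %.2f" % icc_wide)
```

The absolute disagreement between raters is the same in both data sets, yet the wide-range set yields the higher ICC, which is the point made above: a higher ICC does not by itself mean less measurement variability.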
Liu, Lin; He, Yihua; Li, Zhian; Gu, Xiaoyan; Zhang, Ye; Zhang, Lianzhong
2014-07-01
The use of low-frequency high-definition power Doppler in assessing and defining pulmonary venous connections was investigated. Study A included 260 fetuses at gestational ages ranging from 18 to 36 weeks. Pulmonary veins were assessed by performing two-dimensional B-mode imaging, color Doppler flow imaging (CDFI), and low-frequency high-definition power Doppler. A score of 1 was assigned if one pulmonary vein was visualized, 2 if two pulmonary veins were visualized, 3 if three pulmonary veins were visualized, and 4 if four pulmonary veins were visualized. The detection rate between Exam-1 and Exam-2 (intra-observer variability) and between Exam-1 and Exam-3 (inter-observer variability) was compared. In study B, five cases with abnormal pulmonary venous connection were diagnosed and compared to their anatomical examination. In study A, there was a significant difference between CDFI and low-frequency high-definition power Doppler for the four pulmonary veins observed (P < 0.05). The detection rate of each pulmonary vein when employing low-frequency high-definition power Doppler was higher than that when employing two-dimensional B-mode imaging or CDFI. There was no significant difference between the intra- and inter-observer variabilities using low-frequency high-definition power Doppler display of pulmonary veins (P > 0.05). The correlation coefficient between Exam-1 and Exam-2 was 0.844, and the correlation coefficient between Exam-1 and Exam-3 was 0.821. In study B, one case of total anomalous pulmonary venous return and four cases of partial anomalous pulmonary venous return were diagnosed by low-frequency high-definition power Doppler and confirmed by autopsy. The assessment of pulmonary venous connections by low-frequency high-definition power Doppler is advantageous. Pulmonary venous anatomy can and should be monitored during fetal heart examination.
Izatt, Maree T; Bateman, Gary R; Adam, Clayton J
2012-07-30
Vertebral rotation found in structural scoliosis contributes to trunkal asymmetry which is commonly measured with a simple Scoliometer device on a patient's thorax in the forward flexed position. The new generation of mobile 'smartphones' have an integrated accelerometer, making accurate angle measurement possible, which provides a potentially useful clinical tool for assessing rib hump deformity. This study aimed to compare rib hump angle measurements performed using a Smartphone and traditional Scoliometer on a set of plaster torsos representing the range of torsional deformities seen in clinical practice. Nine observers measured the rib hump found on eight plaster torsos moulded from scoliosis patients with both a Scoliometer and an Apple iPhone on separate occasions. Each observer repeated the measurements at least a week after the original measurements, and were blinded to previous results. Intra-observer reliability and inter-observer reliability were analysed using the method of Bland and Altman and 95% confidence intervals were calculated. The Intra-Class Correlation Coefficients (ICC) were calculated for repeated measurements of each of the eight plaster torso moulds by the nine observers. Mean absolute difference between pairs of iPhone/Scoliometer measurements was 2.1 degrees, with a small (1 degrees) bias toward higher rib hump angles with the iPhone. 95% confidence intervals for intra-observer variability were +/- 1.8 degrees (Scoliometer) and +/- 3.2 degrees (iPhone). 95% confidence intervals for inter-observer variability were +/- 4.9 degrees (iPhone) and +/- 3.8 degrees (Scoliometer). The measurement errors and confidence intervals found were similar to or better than the range of previously published thoracic rib hump measurement studies. The iPhone is a clinically equivalent rib hump measurement tool to the Scoliometer in spinal deformity patients. 
The novel use of plaster torsos as rib hump models avoids the variables of patient fatigue and discomfort, inconsistent positioning and deformity progression using human subjects in a single or multiple measurement sessions.
2012-01-01
Background Vertebral rotation found in structural scoliosis contributes to trunkal asymmetry which is commonly measured with a simple Scoliometer device on a patient's thorax in the forward flexed position. The new generation of mobile 'smartphones' have an integrated accelerometer, making accurate angle measurement possible, which provides a potentially useful clinical tool for assessing rib hump deformity. This study aimed to compare rib hump angle measurements performed using a Smartphone and traditional Scoliometer on a set of plaster torsos representing the range of torsional deformities seen in clinical practice. Methods Nine observers measured the rib hump found on eight plaster torsos moulded from scoliosis patients with both a Scoliometer and an Apple iPhone on separate occasions. Each observer repeated the measurements at least a week after the original measurements, and was blinded to previous results. Intra-observer reliability and inter-observer reliability were analysed using the method of Bland and Altman and 95% confidence intervals were calculated. The Intra-Class Correlation Coefficients (ICC) were calculated for repeated measurements of each of the eight plaster torso moulds by the nine observers. Results Mean absolute difference between pairs of iPhone/Scoliometer measurements was 2.1 degrees, with a small (1 degree) bias toward higher rib hump angles with the iPhone. 95% confidence intervals for intra-observer variability were +/- 1.8 degrees (Scoliometer) and +/- 3.2 degrees (iPhone). 95% confidence intervals for inter-observer variability were +/- 4.9 degrees (iPhone) and +/- 3.8 degrees (Scoliometer). The measurement errors and confidence intervals found were similar to or better than the range of previously published thoracic rib hump measurement studies. Conclusions The iPhone is a clinically equivalent rib hump measurement tool to the Scoliometer in spinal deformity patients. 
The novel use of plaster torsos as rib hump models avoids the variables of patient fatigue and discomfort, inconsistent positioning and deformity progression using human subjects in a single or multiple measurement sessions. PMID:22846346
An Instrument to Assess the Obesogenic Environment of Child Care Centers
ERIC Educational Resources Information Center
Ward, Dianne; Hales, Derek; Haverly, Katie; Marks, Julie; Benjamin, Sara; Ball, Sarah; Trost, Stewart
2008-01-01
Objectives: To describe protocol and interobserver agreements of an instrument to evaluate nutrition and physical activity environments at child care. Methods: Interobserver data were collected from 9 child care centers, through direct observation and document review (17 observer pairs). Results: Mean agreement between observer pairs was 87.26%…
Chen, Jian; Zhang, Yan-Ming; Song, Ze-Zhou; Fu, Yan-Fei; Geng, Yu
2018-04-10
The interobserver agreement in the assessment of the grade of carotid plaque neovascularization by contrast-enhanced ultrasonography is poorly established. We examined 140 carotid plaques in 66 patients (all patients had bilateral plaques, and 8 patients had 2 plaques on one side). We performed conventional and contrast-enhanced ultrasonography to analyze the presence of carotid plaque neovascularization, which was graded by two independent observers whose interobserver agreement (κ) was evaluated according to the thickness of carotid plaque. For all carotid plaques, the mean κ was 0.689 (95% confidence interval 0.604-0.774). It was 0.689 (0.569-0.808), 0.637 (0.487-0.787), and 0.740 (0.585-0.896), respectively for carotid plaques with maximal thickness <2 mm, from 2 mm to 3 mm, and >3 mm. The interobserver agreement for assessing carotid plaque neovascularization by using contrast-enhanced ultrasonography is substantial and acceptable for research purposes, regardless of the maximal thickness of the plaque. © 2018 Wiley Periodicals, Inc.
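Note: the κ values in the abstract above summarise how often two observers assign the same grade once chance agreement is discounted. The following is a minimal sketch of that calculation, assuming the unweighted Cohen's kappa (the abstract does not say whether a weighted variant was used). The function name and the grade lists are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Observed agreement: share of items given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal grade frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[grade] * freq_b[grade] for grade in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical neovascularization grades (0-2) for ten plaques.
grades_observer_1 = [0, 0, 1, 1, 2, 2, 1, 0, 2, 1]
grades_observer_2 = [0, 1, 1, 1, 2, 2, 0, 0, 2, 2]
kappa = cohen_kappa(grades_observer_1, grades_observer_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the observers agree on 7 of 10 plaques, but chance alone would produce 33% agreement, so kappa is noticeably lower than the raw 70%.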
Venskutonis, Tadas; Plotino, Gianluca; Tocci, Luigi; Gambarini, Gianluca; Maminskas, Julius; Juodzbalys, Gintaras
2015-02-01
The purpose of this study was to present a new periapical and endodontic status scale (PESS) that is based on the complex periapical index (COPI), which was designed for the identification and classification of periapical bone lesions in cases of apical periodontitis, and the endodontically treated tooth index, which was designed for endodontic treatment quality evaluation by means of cone-beam computed tomographic (CBCT) analysis. Periapical and endodontic status parameters were selected from the already known indexes and scientific literature for radiologic evaluation. Radiographic images (CBCT imaging, digital orthopantomography [DOR], and digital periapical radiography) from 55 patients were analyzed. All parameters were evaluated on CBCT, DOR, and digital periapical radiographic images by 2 external observers. The statistical analysis was performed with software SPSS version 19.0 (SPSS Inc, Chicago, IL). Chi-square tests were used to compare frequencies of qualitative variables. The level of significance was set at P ≤ .05. Overall intraobserver and interobserver agreements were very good and good, respectively. CBCT analysis found more lesions and lesions of bigger dimension (P < .001). CBCT imaging was also superior in locating lesions in the apical part on the side compared with DOR and in the diagnosis of cortical bone destruction compared with both methods (P < .001). Through CBCT analysis, more root canals and more canals associated with lesions were found. The most informative and reproducible periapical and endodontic status parameters were selected, and a new PESS was proposed. The classification proposed in the present study seems to be reproducible and objective and adds helpful information with respect to the existing indexes. Future studies need to be conducted to validate PESS. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
Morais, Pedro; Vilaça, João L; Queirós, Sandro; Marchi, Alberto; Bourier, Felix; Deisenhofer, Isabel; D'hooge, Jan; Tavares, João Manuel R S
2018-07-01
Image-fusion strategies have been applied to improve inter-atrial septal (IAS) wall minimally-invasive interventions. Hereto, several landmarks are initially identified on richly-detailed datasets throughout the planning stage and then combined with intra-operative images, enhancing the relevant structures and easing the procedure. Nevertheless, such planning is still performed manually, which is time-consuming and not necessarily reproducible, hampering its regular application. In this article, we present a novel automatic strategy to segment the atrial region (left/right atrium and aortic tract) and the fossa ovalis (FO). The method starts by initializing multiple 3D contours based on an atlas-based approach with global transforms only and refining them to the desired anatomy using a competitive segmentation strategy. The obtained contours are then applied to estimate the FO by evaluating both IAS wall thickness and the expected FO spatial location. The proposed method was evaluated in 41 computed tomography datasets, by comparing the atrial region segmentation and FO estimation results against manually delineated contours. The automatic segmentation method presented a performance similar to the state-of-the-art techniques and a high feasibility, failing only in the segmentation of one aortic tract and of one right atrium. The FO estimation method presented an acceptable result in all the patients with a performance comparable to the inter-observer variability. Moreover, it was faster and fully user-interaction free. Hence, the proposed method proved to be feasible to automatically segment the anatomical models for the planning of IAS wall interventions, making it exceptionally attractive for use in the clinical practice. Copyright © 2018 Elsevier B.V. All rights reserved.
Dutch population specific sex estimation formulae using the proximal femur.
Colman, K L; Janssen, M C L; Stull, K E; van Rijn, R R; Oostra, R J; de Boer, H H; van der Merwe, A E
2018-05-01
Sex estimation techniques are frequently applied in forensic anthropological analyses of unidentified human skeletal remains. While morphological sex estimation methods are able to endure population differences, the classification accuracy of metric sex estimation methods are population-specific. No metric sex estimation method currently exists for the Dutch population. The purpose of this study is to create Dutch population specific sex estimation formulae by means of osteometric analyses of the proximal femur. Since the Netherlands lacks a representative contemporary skeletal reference population, 2D plane reconstructions, derived from clinical computed tomography (CT) data, were used as an alternative source for a representative reference sample. The first part of this study assesses the intra- and inter-observer error, or reliability, of twelve measurements of the proximal femur. The technical error of measurement (TEM) and relative TEM (%TEM) were calculated using 26 dry adult femora. In addition, the agreement, or accuracy, between the dry bone and CT-based measurements was determined by percent agreement. Only reliable and accurate measurements were retained for the logistic regression sex estimation formulae; a training set (n=86) was used to create the models while an independent testing set (n=28) was used to validate the models. Due to high levels of multicollinearity, only single variable models were created. Cross-validated classification accuracies ranged from 86% to 92%. The high cross-validated classification accuracies indicate that the developed formulae can contribute to the biological profile and specifically in sex estimation of unidentified human skeletal remains in the Netherlands. Furthermore, the results indicate that clinical CT data can be a valuable alternative source of data when representative skeletal collections are unavailable. Copyright © 2017 Elsevier B.V. All rights reserved.
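Note: the abstract above reports TEM and %TEM without stating the formulae. A minimal sketch follows, assuming the conventional two-measurement definition, TEM = sqrt(sum of squared differences / 2n), with %TEM expressed relative to the grand mean. The function name and the femoral measurements are hypothetical.

```python
import math


def technical_error_of_measurement(first, second):
    """Return (TEM, %TEM) for paired repeat measurements of the same specimens.

    Assumes the conventional two-measurement formula; the study itself
    does not spell out which variant it used.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty measurement lists")
    n = len(first)
    sum_squared_diff = sum((a - b) ** 2 for a, b in zip(first, second))
    tem = math.sqrt(sum_squared_diff / (2 * n))
    grand_mean = (sum(first) + sum(second)) / (2 * n)
    return tem, 100.0 * tem / grand_mean


# Hypothetical femoral head diameters (mm) measured twice on four femora.
session_1 = [45.0, 47.0, 43.0, 50.0]
session_2 = [46.0, 47.0, 42.0, 52.0]
tem, relative_tem = technical_error_of_measurement(session_1, session_2)
print(f"TEM = {tem:.3f} mm, %TEM = {relative_tem:.2f}%")
```

Because %TEM is scaled by the mean, it allows measurements of different magnitudes to be compared on one scale, which is presumably why both figures are reported.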
Groth, M; Forkert, N D; Buhk, J H; Schoenfeld, M; Goebell, E; Fiehler, J
2013-02-01
To compare intra- and inter-observer reliability of aneurysm measurements obtained by a 3D computer-aided technique with standard manual aneurysm measurements in different imaging modalities. A total of 21 patients with 29 cerebral aneurysms were studied. All patients underwent digital subtraction angiography (DSA), contrast-enhanced (CE-MRA) and time-of-flight magnetic resonance angiography (TOF-MRA). Aneurysm neck and depth diameters were manually measured by two observers in each modality. Additionally, semi-automatic computer-aided diameter measurements were performed using 3D vessel surface models derived from CE- (CE-com) and TOF-MRA (TOF-com) datasets. Bland-Altman analysis (BA) and intra-class correlation coefficient (ICC) were used to evaluate intra- and inter-observer agreement. BA revealed the narrowest relative limits of intra- and inter-observer agreement for aneurysm neck and depth diameters obtained by TOF-com (ranging between ±5.3 % and ±28.3 %) and CE-com (ranging between ±23.3 % and ±38.1 %). Direct measurements in DSA, TOF-MRA and CE-MRA showed considerably wider limits of agreement. The highest ICCs were observed for TOF-com and CE-com (ICC values, 0.92 or higher for intra- as well as inter-observer reliability). Computer-aided aneurysm measurement in 3D offers improved intra- and inter-observer reliability and a reproducible parameter extraction, which may be used in clinical routine and as objective surrogate end-points in clinical trials.
de Carvalho, Rogério Mendonca; Perez, Maria Del Carmen Janerio; Miranda, Fausto
2012-10-01
Traditional volumetry based on Archimedes' principle is the gold standard for the measurement of limb volume, but the routine use of this technique is discouraged because of several disadvantages. The purpose of this study was to evaluate intraobserver and interobserver reliability of direct measurements of wrist-hand volume using a new communicating vessels volumeter based on Pascal's law. A reliability study was conducted. To evaluate the reliability of the communicating vessels volumeter in generating measurements, 30 hands of 15 participants (9 women, 6 men) were measured 3 times each by 3 observers, totaling 270 volumetric results. Measurement time was short (mean, 3 minutes 42 seconds). The intraclass correlation coefficient (ICC) was .9977 for observer 1 and .9976 for observers 2 and 3. The interobserver ICC was .9998. The standard error of measurement was about 3 mL for all observers; the interobserver result was 1 mL. The interrater coefficient of variation (CV) was 1.15% for the series of 9 measurements collected for each segment; the intrarater CV was 1.20%. Limitations: No swollen hands were measured, and measurements were not compared with the gold standard technique. Thus, accuracy of the new volumeter was not determined in this study. A new device has been developed for plethysmography of the extremities, and the results of its use to measure the volume of the wrist-hand segment were reliable in both intraobserver and interobserver analyses.
Evaluation of interobserver agreement for postoperative pain and sedation assessment in cats.
Benito, Javier; Monteiro, Beatriz P; Beauchamp, Guy; Lascelles, B Duncan X; Steagall, Paulo V
2017-09-01
OBJECTIVE To evaluate agreement between observers with different training and experience for assessment of postoperative pain and sedation in cats by use of a dynamic and interactive visual analog scale (DIVAS) and for assessment of postoperative pain in the same cats with a multidimensional composite pain scale (MCPS). DESIGN Randomized, controlled, blinded study. ANIMALS 45 adult cats undergoing ovariohysterectomy. PROCEDURES Cats received 1 of 3 preoperative treatments: bupivacaine, IP; meloxicam, SC with saline (0.9% NaCl) solution, IP, (positive control); or saline solution only, IP (negative control). All cats received premedication with buprenorphine prior to general anesthesia. An experienced observer (observer 1; male; native language, Spanish) used scales in English, and an inexperienced observer (observer 2; female; native language, French) used scales in French to assess signs of sedation and pain. Rescue analgesia was administered according to MCPS scoring by observer 1. Mean pain and sedation scores per treatment and time point, proportions of cats in each group with MCPS scores necessitating rescue analgesia, and mean MCPS scores assigned at the time of rescue analgesia were compared between observers. Agreement was assessed by intraclass correlation coefficient determination. Percentage disagreement between observers on the need for rescue analgesia was calculated. RESULTS Interobserver agreements for pain scores were good, and that for sedation scores was fair. On the basis of observer 1's MCPS scores, a greater proportion of cats in the negative control group received rescue analgesia than in the bupivacaine or positive control groups. Scores from observer 2 indicated a greater proportion of cats in the negative control group than in the positive control group required rescue analgesia but identified no significant difference between the negative control and bupivacaine groups for this variable. 
Overall, disagreement regarding need for rescue analgesia was identified for 22 of 360 (6.1%) paired observations. CONCLUSIONS AND CLINICAL RELEVANCE Interobserver differences in assessing pain can lead to different conclusions regarding treatment effectiveness.
Validation of Morphometric Analyses of Small-Intestinal Biopsy Readouts in Celiac Disease
Taavela, Juha; Koskinen, Outi; Huhtala, Heini; Lähdeaho, Marja-Leena; Popp, Alina; Laurila, Kaija; Collin, Pekka; Kaukinen, Katri; Kurppa, Kalle; Mäki, Markku
2013-01-01
Background Assessment of the gluten-induced small-intestinal mucosal injury remains the cornerstone of celiac disease diagnosis. Usually the injury is evaluated using grouped classifications (e.g. Marsh groups), but this is often too imprecise and ignores minor but significant changes in the mucosa. Consequently, there is a need for validated continuous variables in everyday practice and in academic and pharmacological research. Methods We studied the performance of our standard operating procedure (SOP) on 93 selected biopsy specimens from adult celiac disease patients and non-celiac disease controls. The specimens, which comprised different grades of gluten-induced mucosal injury, were evaluated by morphometric measurements. Specimens with tangential cutting resulting from poorly oriented biopsies were included. Two accredited evaluators performed the measurements in blinded fashion. The intraobserver and interobserver variations for villus height and crypt depth ratio (VH:CrD) and densities of intraepithelial lymphocytes (IELs) were analyzed by the Bland-Altman method and intraclass correlation. Results Unevaluable biopsies according to our SOP were correctly identified. The intraobserver analysis of VH:CrD showed a mean difference of 0.087 with limits of agreement from −0.398 to 0.224; the standard deviation (SD) was 0.159. The mean difference in interobserver analysis was 0.070, limits of agreement −0.516 to 0.375, and SD 0.227. The intraclass correlation coefficient in intraobserver variation was 0.983 and that in interobserver variation 0.978. CD3+ IEL density countings in the paraffin-embedded and frozen biopsies showed SDs of 17.1% and 16.5%; the intraclass correlation coefficients were 0.961 and 0.956, respectively. Conclusions Using our SOP, quantitative, reliable and reproducible morphometric results can be obtained on duodenal biopsy specimens with different grades of gluten-induced injury. 
Clinically significant changes were defined according to the error margins (2SD) of the analyses in VH:CrD as 0.4 and in CD3+-stained IELs as 30%. PMID:24146832
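Note: the Bland-Altman figures in the abstract above can be checked with simple arithmetic. The sketch below assumes the usual 95% limits of agreement, bias ± 1.96 × SD of the paired differences. The reported limits are reproduced only when the mean difference is entered with a negative sign, so that sign convention is an assumption here, as are the function names and the sample ratios.

```python
import statistics


def limits_of_agreement(bias, sd, z=1.96):
    """95% limits of agreement from a mean difference and its SD."""
    return bias - z * sd, bias + z * sd


def bland_altman(first, second, z=1.96):
    """Bias, SD of differences and limits of agreement for paired readings."""
    differences = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample SD, needs >= 2 pairs
    return bias, sd, limits_of_agreement(bias, sd, z)


# Reported VH:CrD summary values; the negative sign is assumed (see note).
intra_low, intra_high = limits_of_agreement(-0.087, 0.159)
inter_low, inter_high = limits_of_agreement(-0.070, 0.227)
print(f"intraobserver: {intra_low:.3f} to {intra_high:.3f}")
print(f"interobserver: {inter_low:.3f} to {inter_high:.3f}")

# Hypothetical paired VH:CrD readings, for illustration only.
bias, sd, (low, high) = bland_altman([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

Under that assumption the computed intervals agree with the reported limits (−0.398 to 0.224 and −0.516 to 0.375) to within rounding.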
Claßen, Anne Christine; Kneissl, Sibylle; Lang, Johann; Tichy, Alexander; Pakozdy, Akos
2016-08-11
Hippocampal necrosis in cats has been reported to be associated with epileptic seizures. Magnetic resonance imaging (MRI) features of temporal lobe (TL) abnormalities in epileptic cats have been described but MR images from epileptic and non-epileptic individuals have not yet been systematically compared. TL abnormalities are highly variable in shape, size and signal, and therefore may lead to varying evaluations by different specialists. The aim of this study was to investigate whether there were differences in the appearance of the TL between epileptic and non-epileptic cats, and whether there were any relationships between TL abnormalities and seizure semiologies or other clinical findings. We also investigated interobserver agreement among three specialists. The MR images of 46 cats were reviewed independently by three observers, who were blinded to patient data, examination findings and the review of the other observers. Images were evaluated using a multiparametric scoring system developed for this study. Mann-Whitney U-tests and chi-square were used to analyse the differences between observers' evaluations. The kappa coefficient (k) and Fleiss' kappa coefficient were used to quantify interobserver agreement. The overall interobserver agreement was moderate to good (k =0.405 to 0.615). The MR scores between epileptic and non-epileptic cats did not differ significantly. However, there was a significant difference between the MR scores of epileptic cats with and without orofacial involvement according to all three observers. Likewise, MR scores of cats with cluster seizures were higher than those of cats without clusters. Cats presenting with recurrent epileptic seizures with orofacial involvement are more likely to have hippocampal pathologies, which suggests that TL abnormalities are not merely unspecific epileptic findings, but are associated with a certain type of epilepsy. TL signal alterations are more likely to be detected on FLAIR sequences. 
In contrast to severe changes in the TL which were described similarly among specialists, mild TL abnormalities may be difficult to interpret, thus leading to different assessments among observers.
Gowda, Meghana; Kit, Laura Chang; Stuart Reynolds, W; Wang, Li; Dmochowski, Roger R; Kaufman, Melissa R
2013-10-01
To unify and organize reporting, an International Urogynecological Association (IUGA)/International Continence Society (ICS) expert consortium published terminology guidelines with a classification system for complications related to implants used in female pelvic surgery. We hypothesize that the complexity of the codification system may be a hindrance to precision, especially with decreasing levels of postgraduate expertise. Residents, fellows, and attending physicians were asked to code seven test cases taken from published literature. Category, timing, and site components of the classification system were assessed independently and according to the level of training. Interobserver reliability was calculated as percent agreement and Fleiss' kappa statistic. A total of 24 participants (6 attending physicians, 3 fellows, and 15 residents) were tested. The percent agreement showed significant variation when classified by level of training. In all categories, attending physicians had the greatest percentage agreement and largest kappa. The most agreement was seen when attending physicians classified mesh complications by time, 71% agreement with kappa 0.73 [95% confidence interval (CI) 0.58-0.88]. For the same task, the percentage agreement for fellows was 57%, kappa 0.55 (95% CI 0.23-0.87) and with residents 57%, kappa 0.71 (95% CI 0.64-0.78). Interestingly, the site component of the classification system had the least overall agreement and lowest kappa [0%, kappa 0.29 (95% CI 0.26-0.32)] followed by the category component [14%, kappa 0.48 (95% CI 0.46-0.5)]. The IUGA/ICS mesh complication classification system has poor interobserver reliability. This trended downward with decreasing postgraduate level; however, we did not have sufficient statistical power to show an association when stratifying by all training levels. 
This highlights the complex nature of the classification system in its current form and its limitation for widespread clinical and research application.
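Note: the abstract above pairs percent agreement with Fleiss' kappa, and the two can diverge sharply (for example 0% agreement alongside kappa 0.29). The sketch below shows one way both could be computed from a cases-by-codes count table. It assumes "percent agreement" means the share of cases on which every rater chose the same code, which the abstract does not state; the reported 71%, 57% and 14% are at least consistent with 5, 4 and 1 of the seven cases. Function names and the count table are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters assigning case i to code j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number (>= 2) of raters")
    # Per-case agreement: fraction of rater pairs that chose the same code.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_case) / n_cases
    # Chance agreement from the overall share of ratings given to each code.
    n_codes = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_codes)
    ]
    expected = sum(s * s for s in shares)
    if expected == 1.0:
        return 1.0  # only one code was ever used
    return (observed - expected) / (1.0 - expected)


def complete_agreement(counts):
    """Share of cases on which all raters chose the same code (assumed definition)."""
    n_raters = sum(counts[0])
    return sum(max(row) == n_raters for row in counts) / len(counts)


# Hypothetical table: 4 cases, 3 raters, 2 possible codes.
table = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(table)
unanimous = complete_agreement(table)
print(f"kappa = {kappa:.3f}, unanimous cases = {unanimous:.0%}")
```

A single dissenting rater is enough to make a case non-unanimous, whereas kappa still credits the pairs of raters who agreed, which is one way the two measures can tell different stories with many raters and few cases.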
Neukamm, Christian; Try, Kirsti; Norgård, Gunnar; Brun, Henrik
2014-01-01
A technique that uses two-dimensional images to create a knowledge-based, three-dimensional model was tested and compared to magnetic resonance imaging. Measurement of right ventricular volumes and function is important in the follow-up of patients after pulmonary valve replacement. Magnetic resonance imaging is the gold standard for volumetric assessment. Echocardiographic methods have been validated and are attractive alternatives. Thirty patients with tetralogy of Fallot (25 ± 14 years) after pulmonary valve replacement were examined. Magnetic resonance imaging volumetric measurements and echocardiography-based three-dimensional reconstruction were performed. End-diastolic volume, end-systolic volume, and ejection fraction were measured, and the results were compared. Magnetic resonance imaging measurements gave coefficients of variation in the intraobserver study of 3.5, 4.6, and 5.3 and in the interobserver study of 3.6, 5.9, and 6.7 for end-diastolic volume, end-systolic volume, and ejection fraction, respectively. Echocardiographic three-dimensional reconstruction was highly feasible (97%). In the intraobserver study, the corresponding values were 6.0, 7.0, and 8.9 and in the interobserver study 7.4, 10.8, and 13.4. In comparison of the methods, correlations with magnetic resonance imaging were r = 0.91, 0.91, and 0.38, and the corresponding coefficients of variation were 9.4, 10.8, and 14.7. Echocardiography-derived volumes (mL/m(2)) were significantly higher than magnetic resonance imaging volumes in end-diastolic volume 13.7 ± 25.6 and in end-systolic volume 9.1 ± 17.0 (both P < .05). The knowledge-based three-dimensional right ventricular volume method was highly feasible. Intra- and interobserver variabilities were satisfactory. Agreement with magnetic resonance imaging measurements for volumes was reasonable but unsatisfactory for ejection fraction. 
Knowledge-based reconstruction may replace magnetic resonance imaging measurements for serial follow-up, whereas magnetic resonance imaging should be used for surgical decision making.
Soyer, Philippe; Corno, Lucie; Boudiaf, Mourad; Aout, Mounir; Sirol, Marc; Placé, Vinciane; Duchat, Florent; Guerrache, Youcef; Fargeaudou, Yann; Vicaut, Eric; Pocard, Marc; Hamzi, Lounis
2011-11-01
To test interobserver variability of ADC measurements and compare the diagnostic performance of free-breathing diffusion-weighted (FBDW) MR imaging with that of T2-weighted FSE (T2WFSE) MR imaging for differentiating between cavernous hemangiomas and untreated malignant hepatic neoplasms. Thirty-five patients with cavernous hemangiomas and 35 with untreated hepatic malignant neoplasms had FBDW and T2WFSE MR imaging. Hepatic lesions were characterized with ADC measurement and visual evaluation. Interobserver agreement for ADC measurement was calculated. Association between ADC value and lesion type was assessed using univariate analysis. Sensitivity, specificity and accuracy of ADC values and visual evaluation of MR images for the diagnosis of untreated malignant hepatic neoplasm were compared. ADC measurements showed excellent interobserver correlation (intraclass correlation coefficient = 0.980). Malignant neoplasms had lower ADC values than hemangiomas for the two observers (1.11 × 10⁻³ mm²/s ± 0.21 × 10⁻³ vs. 1.77 × 10⁻³ mm²/s ± 0.29 × 10⁻³ for observer 1 and 1.11 × 10⁻³ mm²/s ± 0.19 × 10⁻³ vs. 1.79 × 10⁻³ mm²/s ± 0.32 × 10⁻³ for observer 2) and univariate analysis found significant correlations between lesion type and ADC values. Depending on ADC threshold value, accuracy for the diagnosis of malignant neoplasm varied from 82.9% to 94.3%. Using visual evaluation, FBDW showed better specificity and accuracy than T2WFSE MR images for the diagnosis of malignant neoplasm (97.1% vs. 77.1% and 94.3% vs. 62.9%, respectively). FBDW imaging provides reproducible quantitative information and surpasses the value of T2WFSE MR imaging for differentiating between cavernous hemangiomas and untreated malignant hepatic neoplasms. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
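Note: the specificity and accuracy percentages in the abstract above follow from a 2 × 2 table of reader calls against the final diagnosis. The sketch below shows the arithmetic. The counts are back-calculated assumptions, not figures given in the abstract: with 35 hemangiomas and 35 malignant lesions, 34 correct negative calls and 66 correct calls overall reproduce the reported 97.1% and 94.3% for FBDW visual evaluation. The function name is hypothetical.

```python
def diagnostic_performance(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from 2 x 2 table counts."""
    sensitivity = tp / (tp + fn)   # malignant lesions correctly called malignant
    specificity = tn / (tn + fp)   # hemangiomas correctly called non-malignant
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (back-calculated, see note): 35 malignant, 35 hemangiomas.
sensitivity, specificity, accuracy = diagnostic_performance(tp=32, fn=3, tn=34, fp=1)
print(f"specificity = {specificity:.1%}, accuracy = {accuracy:.1%}")
```

The sensitivity this table implies is not a value reported in the abstract and should be read only as a consequence of the assumed counts.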
Taffin, Elien Rl; Paepe, Dominique; Campos, Miguel; Duchateau, Luc; Goris, Nesya; De Roover, Katrien; Daminet, Sylvie
2016-11-01
Objectives The Karnofsky score (KS) modified for cats, a scoring system to rate health and quality of life (QOL) in cats, is used in clinical trials, but its reliability and validity are yet to be determined. The present study aims to evaluate the scientific robustness of the KS when adapted for use in a hospital setting. Methods A list of variables to consider during the physical examination, which informs the clinician's score (CS) part of the KS, was added and clinicians were allowed to choose a score anywhere between 0 and 50. The Karnofsky QOL questionnaire was adapted for use in a hospital setting. F-tests with Bonferroni correction and Spearman rank correlation coefficients were used to evaluate reliability and validity of the KS to assess the health and wellbeing of cats in a hospital setting. The records of 54 feline immunodeficiency virus-positive cats, which were recruited for a clinical trial and hospitalised for 6 weeks, were reviewed. Four veterinarians scored the CS, and one veterinarian and a veterinary nurse assessed the QOL score. Results Mean absolute difference between observers was significantly larger for the CS than for the QOL score ( P <0.001) and two veterinarians scored significantly higher than the remaining two veterinarians ( P <0.001). Inter-observer correlation ranged from 0.45-0.75 for the CS. For the QOL score, the absolute difference between observers was small, no significant difference was found between observers and a high degree of inter-observer correlation was noted (r = 0.91). Conclusions and relevance The results indicate low inter-observer reliability for the CS, requiring additional modifications to this part of the KS. The QOL score seems more reliable, and the questionnaire may serve as a reliable tool in the assessment of QOL in cats in a hospital setting. Consequently, further adaptation of the KS is mandatory when simultaneous assessment of both the cat's clinical health and perceived wellbeing is required.
Interobserver agreement and diagnostic accuracy of brain magnetic resonance imaging in dogs.
Leclerc, Mylène-Kim; d'Anjou, Marc-André; Blond, Laurent; Carmel, Éric Norman; Dennis, Ruth; Kraft, Susan L; Matthews, Andrea R; Parent, Joane M
2013-06-15
To evaluate interobserver agreement and diagnostic accuracy of brain MRI in dogs. Evaluation study. 44 dogs. 5 board-certified veterinary radiologists with variable MRI experience interpreted transverse T2-weighted (T2w), T2w fluid-attenuated inversion recovery (FLAIR), and T1-weighted-FLAIR; transverse, sagittal, and dorsal T2w; and T1-weighted-FLAIR postcontrast brain sequences (1.5 T). Several imaging parameters were scored, including the following: lesion (present or absent), lesion characteristics (axial localization, mass effect, edema, hemorrhage, and cavitation), contrast enhancement characteristics, and most likely diagnosis (normal, neoplastic, inflammatory, vascular, metabolic or toxic, or other). Magnetic resonance imaging diagnoses were determined initially without patient information and then repeated, providing history and signalment. For all cases and readers, MRI diagnoses were compared with final diagnoses established with results from histologic examination (when available) or with other pertinent clinical data (CSF analysis, clinical response to treatment, or MRI follow-up). Magnetic resonance scores were compared between examiners with κ statistics. Reading agreement was substantial to almost perfect (0.64 < κ < 0.86) when identifying a brain lesion on MRI; fair to moderate (0.14 < κ < 0.60) when interpreting hemorrhage, edema, and pattern of contrast enhancement; fair to substantial (0.22 < κ < 0.74) for dural tail sign and categorization of margins of enhancement; and moderate to substantial (0.40 < κ < 0.78) for axial localization, presence of mass effect, cavitation, intensity, and distribution of enhancement. Interobserver agreement was moderate to substantial for categories of diagnosis (0.56 < κ < 0.69), and agreement with the final diagnosis was substantial regardless of whether patient information was (0.65 < κ < 0.76) or was not (0.65 < κ < 0.68) provided. 
The present study found that whereas some MRI features such as edema and hemorrhage were interpreted less consistently, radiologists were reasonably constant and accurate when providing diagnoses.
Nayak, Lakshmi; DeAngelis, Lisa M; Brandes, Alba A; Peereboom, David M; Galanis, Evanthia; Lin, Nancy U; Soffietti, Riccardo; Macdonald, David R; Chamberlain, Marc; Perry, James; Jaeckle, Kurt; Mehta, Minesh; Stupp, Roger; Muzikansky, Alona; Pentsova, Elena; Cloughesy, Timothy; Iwamoto, Fabio M; Tonn, Joerg-Christian; Vogelbaum, Michael A; Wen, Patrick Y; van den Bent, Martin J; Reardon, David A
2017-05-01
The Macdonald criteria and the Response Assessment in Neuro-Oncology (RANO) criteria define radiologic parameters to classify therapeutic outcome among patients with malignant glioma and specify that clinical status must be incorporated and prioritized for overall assessment. But neither provides specific parameters to do so. We hypothesized that a standardized metric to measure neurologic function will permit more effective overall response assessment in neuro-oncology. An international group of physicians including neurologists, medical oncologists, radiation oncologists, and neurosurgeons with expertise in neuro-oncology drafted the Neurologic Assessment in Neuro-Oncology (NANO) scale as an objective and quantifiable metric of neurologic function evaluable during a routine office examination. The scale was subsequently tested in a multicenter study to determine its overall reliability, inter-observer variability, and feasibility. The NANO scale is a quantifiable evaluation of 9 relevant neurologic domains based on direct observation and testing conducted during routine office visits. The score defines overall response criteria. A prospective, multinational study noted a >90% inter-observer agreement rate with kappa statistic ranging from 0.35 to 0.83 (fair to almost perfect agreement), and a median assessment time of 4 minutes (interquartile range, 3-5). The NANO scale provides an objective clinician-reported outcome of neurologic function with high inter-observer agreement. It is designed to combine with radiographic assessment to provide an overall assessment of outcome for neuro-oncology patients in clinical trials and in daily practice. Furthermore, it complements existing patient-reported outcomes and cognition testing to combine for a global clinical outcome assessment of well-being among brain tumor patients. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved. 
Hakimé, Antoine; Peddi, Himaja; Hines-Peralta, Andrew U; Wilcox, Carol J; Kruskal, Jonathan; Lin, Shezhang; de Baere, Thierry; Raptopoulos, Vassilios D; Goldberg, S Nahum
2007-06-01
To prospectively compare single- and multisection computed tomographic (CT) perfusion for tumor blood flow determination in an animal model. All animal protocols and experiments were approved by the institutional animal care and use committee before the study was initiated. R3230 mammary adenocarcinoma was implanted in 11 rats. Tumors (18-20 mm) were scanned with dynamic 16-section CT at baseline and after administration of arsenic trioxide, which is known to cause acute reduction in blood flow. The concentration of arsenic was titrated (0-6 mg of arsenic per kilogram of body weight) to achieve a defined blood flow reduction (0%-75%) from baseline levels at 60 minutes, as determined with correlative laser Doppler flowmetry. The mean blood flow was calculated for each of four 5-mm sections that covered the entire tumor, as well as for the entire tumor after multiple sections were processed. Measurements obtained with both methods were correlated with laser Doppler flowmetry measurements. Interobserver agreement was determined for two blinded radiologists, who calculated the percentage of blood flow reduction for the "most representative" single sections at baseline and after arsenic administration. These results were compared with the interobserver variability of the same radiologists obtained by summing blood flow changes for the entire tumor volume. Overall correlations for acute blood flow reduction were demonstrated between laser Doppler flowmetry and the two CT perfusion approaches (single-section CT, r=0.85 and r(2)=0.73; multisection CT, r=0.93 and r(2)=0.87; pooled data, P=.01). CT perfusion disclosed marked heterogeneity of blood flow, with variations of 36% +/- 13 between adjacent 5-mm sections. Given these marked differences, interobserver agreement was much lower for single-section CT (standard deviation, 0.22) than for multisection CT (standard deviation, 0.10; P=.01). 
Multisection CT perfusion techniques may provide an accurate and more reproducible method of tumor perfusion surveillance than comparison of single representative tumor sections. (c) RSNA, 2007.
van der Wel, M J; Duits, L C; Seldenrijk, C A; Offerhaus, G J; Visser, M; Ten Kate, F J; de Boer, O J; Tijssen, J G; Bergman, J J; Meijer, S L
2017-11-01
Management of Barrett's esophagus (BE) relies heavily on histopathological assessment of biopsies, associated with significant intra- and interobserver variability. Guidelines recommend biopsy review by an expert in case of dysplasia. Conventional review of biopsies, however, is impractical and does not allow for teleconferencing or annotations. An expert digital review platform might overcome these limitations. We compared diagnostic agreement of digital and conventional microscopy for diagnosing BE ± dysplasia. Sixty BE biopsy glass slides (non-dysplastic BE (NDBE); n = 25, low-grade dysplasia (LGD); n = 20; high-grade dysplasia (HGD); n = 15) were scanned at ×20 magnification. The slides were assessed four times by five expert BE pathologists, all practicing histopathologists (range: 5-30 years), in 2 alternating rounds of digital and conventional microscopy, each in randomized order and sequence of slides. Intraobserver and pairwise interobserver agreement were calculated, using custom weighted Cohen's kappa, adjusted for the maximum possible kappa scores. Split into three categories (NDBE, IND, LGD+HGD), the mean intraobserver agreement was 0.75 and 0.84 for digital and conventional assessment, respectively (p = 0.35). Mean pairwise interobserver agreement was 0.80 for digital and 0.85 for conventional microscopy (p = 0.17). In 47/60 (78%) of digital microscopy reviews a majority vote of ≥3 pathologists was reached before consensus meeting. After group discussion, a majority vote was achieved in all cases (60/60). Diagnostic agreement of digital microscopy is comparable to that of conventional microscopy. These outcomes justify the use of digital slides in a nationwide, web-based BE revision platform in the Netherlands. This will overcome the practical issues associated with conventional histologic review by multiple pathologists. © The Authors 2017. Published by Oxford University Press on behalf of International Society for Diseases of the Esophagus. 
Kuwahara, Y; Shima, Y; Shirayama, D; Kawai, M; Hagihara, K; Hirano, T; Arimitsu, J; Ogata, A; Tanaka, T; Kawase, I
2008-07-01
No objective method to measure skin involvement in SSc has been established. We developed a novel method using a computer-linked device to simultaneously quantify physical properties of the skin such as hardness, elasticity and viscosity. Skin hardness was calculated by measuring the depth of an indenter pressed onto the skin. The Voigt model was used to calculate skin elasticity, viscosity, visco-elastic ratio and relaxation time by analysing the waveform of skin surface behaviour. The results were compared with the modified Rodnan skin score (mRSS) obtained at 17 sites on the bodies of 20 SSc patients and 20 healthy controls. A functional assessment questionnaire was administered to determine how skin hardness represents a patient's disability. We also examined intra- and inter-observer variability to determine the reliability of this method. The crude hardness obtained with this device correlated well with the standard hardness specified by the American Society for Testing and Materials (ASTM, r = 0.957). A close relationship between hardness and total mRSS was also observed (r = 0.832). Skin elasticity correlated positively, and relaxation time negatively with mRSS. Functional disability correlated more closely with skin hardness (r = 0.643) than with mRSS (r = 0.517). Intra- and inter-observer variabilities were 7.63 and 19.76%, respectively, which were lower than those reported for mRSS. Increases in hardness and elasticity as well as shortening of relaxation time constitute objective characteristics of skin involvement in SSc. The system devised by us proved to be able to assess skin abnormalities of SSc with high reliability.
Maltez de Almeida, João Ricardo; Gomes, André Boechat; Barros, Thomas Pitangueira; Fahel, Paulo Eduardo; de Seixas Rocha, Mário
2015-07-01
The purposes of this study were to investigate whether dynamic contrast-enhanced MRI is adequate for subcategorization of suspicious lesions (BI-RADS category 4) and to evaluate whether use of DWI improves diagnostic performance. The study group was composed of 103 suspicious lesions found in 83 subjects. Patient ages and lesion sizes were compiled, and two radiologists reanalyzed the images; subcategorized the findings as BI-RADS 4A, 4B, or 4C; and calculated apparent diffusion coefficient (ADC) values. The stratified variables were tested by univariate analysis and inserted in two multivariate predictive models, which were used to generate ROC curves and compare AUCs. Positive predictive values (PPVs) for each subcategory and ADC level were calculated, and interobserver agreement was tested. Forty-four (42.7%) suspicious findings proved malignant. Except for age (p = 0.08), all stratified predictor variables were significant in univariate analyses (p < 0.01). Logistic regression models did not differ substantially after comparison of the ROC curves (p = 0.09), but the one including ADC values was slightly better: AUC of 0.89 (95% CI, 0.82-0.95) against AUC of 0.85 (95% CI, 0.78-0.93). PPV increased progressively in each BI-RADS 4 subcategory (4A, 0.15; 4B, 0.37; 4C, 0.84). ADC values of 1.10 × 10(-3) mm(2)/s or less had the second highest PPV (0.77). Interobserver agreement was substantial at a kappa value of 0.80 (95% CI, 0.70-0.90; p < 0.01). Risk stratification of suspicious lesions (BI-RADS category 4) can be satisfactorily performed with DCE-MRI and slightly improved when DWI is introduced.
Volpe, P; Contro, E; Fanelli, T; Muto, B; Pilu, G; Gentile, M
2016-06-01
To describe the sonographic appearance of fetal posterior fossa anatomy at 11-14 weeks of pregnancy and to assess the outcome of fetuses with increased intracranial translucency (IT) and/or brainstem-to-occipital bone (BSOB) diameter. Reference ranges for brainstem (BS), IT and cisterna magna (CM) measurements, BSOB diameter and the BS : BSOB ratio were obtained from the first-trimester ultrasound examination of 233 fetuses with normal postnatal outcome (control group). The intraobserver and interobserver variability of measurements were investigated using 73 stored ultrasound images. In addition, a study group of 17 fetuses with increased IT and/or BSOB diameter was selected to assess outcome. No significant intraobserver or interobserver variability was found for any measurement in the control group. In the study group, IT was increased in all cases and BSOB diameter was above the 95th centile of the calculated normal range in all but two (88%) cases. In 13/17 study cases, only two of the three posterior brain spaces were recognized on ultrasound. These 13 fetuses had a larger BSOB diameter than did the four cases that showed all three posterior brain spaces, and had severe associated anomalies including Dandy-Walker malformation (DWM) and/or chromosomal anomalies. Visualization of the fetal posterior fossa anatomy at 11-14 weeks' gestation is feasible. Increased fluid in the posterior brain at 11-14 weeks, particularly in the case of non-visibility of the septation that divides the future fourth ventricle from the CM, is an important risk factor for cystic posterior fossa malformations, in particular DWM, and/or chromosomal aberrations. Copyright © 2015 ISUOG. Published by John Wiley & Sons Ltd.
Braun, Martin; Kirsten, Robert; Rupp, Niels J; Moch, Holger; Fend, Falko; Wernert, Nicolas; Kristiansen, Glen; Perner, Sven
2013-05-01
Quantification of protein expression based on immunohistochemistry (IHC) is an important step for translational research and clinical routine. Several manual ('eyeballing') scoring systems are used in order to semi-quantify protein expression based on chromogenic intensities and distribution patterns. However, manual scoring systems are time-consuming and subject to significant intra- and interobserver variability. The aim of our study was to explore whether new image analysis software proves to be sufficient as an alternative tool to quantify protein expression. For IHC experiments, one nucleus specific marker (i.e., ERG antibody), one cytoplasmic specific marker (i.e., SLC45A3 antibody), and one marker expressed in both compartments (i.e., TMPRSS2 antibody) were chosen. Stainings were applied on TMAs, containing tumor material of 630 prostate cancer patients. A pathologist visually quantified all IHC stainings in a blinded manner, applying a four-step scoring system. For digital quantification, image analysis software (Tissue Studio v.2.1, Definiens AG, Munich, Germany) was applied to obtain a continuous spectrum of average staining intensity. For each of the three antibodies we found a strong correlation of the manual protein expression score and the score of the image analysis software. Spearman's rank correlation coefficient was 0.94, 0.92, and 0.90 for ERG, SLC45A3, and TMPRSS2, respectively (p < 0.01). Our data suggest that the image analysis software Tissue Studio is a powerful tool for quantification of protein expression in IHC stainings. Further, since the digital analysis is precise and reproducible, computer supported protein quantification might help to overcome intra- and interobserver variability and increase objectivity of IHC based protein assessment.
Rajagopalan, Malolan S.; Khanna, Vineet K.; Leiter, Yaacov; Stott, Meghan; Showalter, Timothy N.; Dicker, Adam P.; Lawrence, Yaacov R.
2011-01-01
Purpose: A wiki is a collaborative Web site, such as Wikipedia, that can be freely edited. Because of a wiki's lack of formal editorial control, we hypothesized that the content would be less complete and accurate than that of a professional peer-reviewed Web site. In this study, the coverage, accuracy, and readability of cancer information on Wikipedia were compared with those of the patient-orientated National Cancer Institute's Physician Data Query (PDQ) comprehensive cancer database. Methods: For each of 10 cancer types, medically trained personnel scored PDQ and Wikipedia articles for accuracy and presentation of controversies by using an appraisal form. Reliability was assessed by using interobserver variability and test-retest reproducibility. Readability was calculated from word and sentence length. Results: Evaluators were able to rapidly assess articles (18 minutes/article), with a test-retest reliability of 0.71 and interobserver variability of 0.53. For both Web sites, inaccuracies were rare, less than 2% of information examined. PDQ was significantly more readable than Wikipedia: Flesch-Kincaid grade level 9.6 versus 14.1. There was no difference in depth of coverage between PDQ and Wikipedia (29.9, 34.2, respectively; maximum possible score 72). Controversial aspects of cancer care were relatively poorly discussed in both resources (2.9 and 6.1 for PDQ and Wikipedia, respectively, NS; maximum possible score 18). A planned subanalysis comparing common and uncommon cancers demonstrated no difference. Conclusion: Although the wiki resource had similar accuracy and depth as the professionally edited database, it was significantly less readable. Further research is required to assess how this influences patients' understanding and retention. PMID:22211130
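The Flesch-Kincaid grade level reported above is conventionally derived from average sentence length and average syllables per word. The sketch below applies the standard published formula; the function name and the example counts are illustrative assumptions rather than values from the study, and the abstract does not state which tool the authors used to obtain their scores.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Standard Flesch-Kincaid grade level computed from raw text counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical passage (not study data): 100 words, 5 sentences, 150 syllables.
example_grade = flesch_kincaid_grade(100, 5, 150)
print(f"Flesch-Kincaid grade level: {example_grade:.2f}")
```

Longer sentences and more syllables per word both raise the grade level, which is the direction of the difference reported between Wikipedia and PDQ.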
Han, Guangming; Soslow, Robert A; Wethington, Stephanie; Levine, Douglas A; Bogomolniy, Faina; Clement, Philip B; Köbel, Martin; Gilks, Blake; DeLair, Deborah
2015-07-01
Endometrial clear cell carcinoma (CC) is an uncommon tumor and often carries a poor prognosis. It has histologic features that overlap with other endometrial carcinomas and is frequently misclassified. Accurate classification is crucial, however, to improve treatment options. The objectives of this study were (1) to assess diagnostic interobserver variability among 5 gynecologic pathologists for tumors originally diagnosed as CC or with a component of CC (n=44); (2) to determine the utility of immunohistochemical markers estrogen receptor and HNF-1β; and (3) to detect mutations in select genes. Clinical data and morphologic features were also recorded. Agreement among reviewers was only moderate: only 46% of the original CC remained classified as such. After reclassification, estrogen receptor was positive in 8% of CC, 67% of endometrioid carcinomas (EC), and 47% of serous carcinomas (SC). Sensitivities of HNF-1β in CC, SC, and EC were 62%, 27%, and 17%, respectively, whereas specificity for CC versus EC or SC was 78%. Mutations in PIK3CA, PIK3R1, PTEN, KRAS, and NRAS were detected in 41% of 37 cases that had adequate material for study. At least 1 mutation was identified in 33% of CC, 67% of EC, and 33% of SC. This group of patients had poor outcomes: 72% of the patients with follow-up information had died of disease. In summary, this study suggests that the current pool of CC is a heterogeneous group of tumors from the morphologic, immunophenotypic, and molecular points of view and that only a percentage of them represent true CC.
Raske, Matthew; Weisse, Chick; Berent, Allyson C; McDougall, Renee; Lamb, Kenneth
2018-03-01
Intraluminal tracheal stenting is a minimally invasive procedure shown to have variable degrees of success in managing clinical signs associated with tracheal collapse syndrome (CTCS) in dogs. Identify immediate post-stent changes in tracheal diameter, determine the extent of stent migration, and stent shortening after stent placement in the immediate-, short-, and long-term periods, and evaluate inter-observer reliability of radiographic measurements. Fifty client-owned dogs. Retrospective study in which medical records were reviewed in dogs with CTCS treated with an intraluminal tracheal stent. Data collected included signalment, location, and type of collapse, stent diameter and length, and post-stent placement radiographic follow-up times. Radiographs were used to obtain pre-stent tracheal measurements and post-stent placement measurements. Immediate mean percentage change was 5.14%, 5.49%, and 21.64% for cervical, thoracic inlet, and intra-thoracic tracheal diameters, respectively. Ultimate mean follow-up time was 446 days, with mean percentage change of 2.55%, 15.09%, and 8.65% for cervical, thoracic inlet, and intra-thoracic tracheal diameters, respectively. Initial mean stent length was 26.72% higher than nominal length and ultimate long-term tracheal mean stent shortening was only 9.90%. No significant stent migration was identified in the immediate, short-, or long-term periods. Good inter-observer agreement of radiographic measurements was found among observers of variable experience level. Use of an intraluminal tracheal stent for CTCS is associated with minimal stent shortening with no clinically relevant stent migration after fluoroscopic placement. Precise stent sizing and placement techniques likely play important roles in avoiding these reported complications. Copyright © 2018 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.
Granados, Rosario; Butrón, Mercedes; Santonja, Carlos; Rodríguez, José-María; Martín, Ana; Duarte, Joanny; Camarmo, Encarnación; Corrales, Teresa; Aramburu, José-Antonio
2016-07-01
Liquid-based cytology (LBC) has recently become the preferred method for urine cytology analysis, but differences with conventional cytology (CC) have been observed. The purpose of this study is to analyze these differences and the clinical relevance of non-atypical urothelial cell groups (UCG) in voided urine specimens. Reporting terminology is discussed. Initially, diagnostic categories from 619 LBC and 474 CC samples, reviewed by five different pathologists, were compared (phase 1). Five years after LBC was implemented and applying strict cytologic criteria for UCG diagnosis, 760 samples were analyzed (phase 2) and compared to previous LBC specimens. Diagnostic differences, interobserver variability and clinicopathological correlation with a 6-month follow-up were analyzed. UCG increased from 6.5% with CC to 20.7% (218%, 3.2 fold, P < 0.0001) with LBC. This difference was not related to interobserver variability. Five years later, the rate of UCG had decreased to 13.2%. While 6% of cases with a negative cytology had urothelial carcinoma (UC) within 6 months of diagnosis, this percentage increased to 15.7% with UCG. The sensitivity of the UCG category for UC was low (30.4%), but the specificity and the negative predictive value (NPV) were high (87.1% and 94%, respectively). LBC increases UCG when compared to CC. This can be corrected with observer's experience and using set cytological criteria. Due to its association with carcinoma, the presence of UCG in voided urine should be framed in a diagnostic category other than "negative for malignancy." Diagn. Cytopathol. 2016;44:582-590. © 2016 Wiley Periodicals, Inc.
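Sensitivity, specificity, and predictive values such as those quoted above all come from the same 2x2 table of test result against disease status. The sketch below shows the standard definitions; the function name and the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying table.

```python
def diagnostic_metrics(true_pos, false_neg, false_pos, true_neg):
    """Standard accuracy measures from a 2x2 table of test result vs. disease."""
    return {
        "sensitivity": true_pos / (true_pos + false_neg),  # diseased cases flagged
        "specificity": true_neg / (true_neg + false_pos),  # healthy cases cleared
        "ppv": true_pos / (true_pos + false_pos),          # flagged cases that are diseased
        "npv": true_neg / (true_neg + false_neg),          # cleared cases that are healthy
    }


# Hypothetical counts (not study data): 100 diseased and 900 non-diseased samples.
metrics = diagnostic_metrics(true_pos=30, false_neg=70, false_pos=100, true_neg=800)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a low disease prevalence, a category can combine low sensitivity with a high negative predictive value, which is the pattern the abstract describes for UCG.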
Spherical subjective refraction with a novel 3D virtual reality based system.
Pujol, Jaume; Ondategui-Parra, Juan Carlos; Badiella, Llorenç; Otero, Carles; Vilaseca, Meritxell; Aldaba, Mikel
To conduct a clinical validation of a virtual reality-based experimental system that is able to assess the spherical subjective refraction simplifying the methodology of ocular refraction. For the agreement assessment, spherical refraction measurements were obtained from 104 eyes of 52 subjects using three different methods: subjectively with the experimental prototype (Subj.E) and the classical subjective refraction (Subj.C); and objectively with the WAM-5500 autorefractor (WAM). To evaluate precision (intra- and inter-observer variability) of each refractive tool independently, 26 eyes were measured on four occasions. With regard to agreement, the mean difference (±SD) for the spherical equivalent (M) between the new experimental subjective method (Subj.E) and the classical subjective refraction (Subj.C) was -0.034D (±0.454D). The corresponding 95% Limits of Agreement (LoA) were (-0.856D, 0.924D). In relation to precision, intra-observer mean difference for the M component was 0.034±0.195D for the Subj.C, 0.015±0.177D for the WAM and 0.072±0.197D for the Subj.E. Inter-observer variability showed worse precision values, although still clinically valid (below 0.25D) in all instruments. The spherical equivalent obtained with the new experimental system was precise and in good agreement with the classical subjective routine. The algorithm implemented in this new system and its optical configuration has been shown to be a first valid step for spherical error correction in a semiautomated way. Copyright © 2016 Spanish General Council of Optometry. Published by Elsevier España, S.L.U. All rights reserved.
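Limits of agreement of the kind reported above are conventionally obtained with the Bland-Altman approach: the mean of the paired differences plus and minus 1.96 standard deviations of those differences. The sketch below assumes that conventional definition; the function name and the refraction values are hypothetical, and the abstract does not confirm the exact procedure the authors followed.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)          # mean difference between methods
    spread = stdev(differences)       # sample SD of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical spherical equivalents in dioptres (not study data).
new_method = [-1.25, -0.50, 0.25, -2.00, 0.75, -3.25]
classic_method = [-1.00, -0.75, 0.25, -2.25, 0.50, -3.00]
bias, lower_limit, upper_limit = bland_altman_limits(new_method, classic_method)
print(f"bias {bias:+.3f} D, 95% LoA ({lower_limit:+.3f} D, {upper_limit:+.3f} D)")
```

Under this definition the limits are symmetric about the mean difference, so the interval width depends only on the standard deviation of the differences.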
Vrtovec, Tomaž; Pernuš, Franjo; Likar, Boštjan
2014-10-01
In this study, sagittal vertebral inclination (SVI) was systematically evaluated for 28 vertebrae (segments between T4 and L5) in magnetic resonance (MR) images of one normal and one scoliotic subject to compare the performance of manual and computerized measurements, and identify the most reproducible and reliable measurements. Manual measurements were performed by three observers, who identified on two occasions the distinctive anatomical landmarks required to evaluate SVI by six measurement methods, i.e. the superior tangents, inferior tangents, anterior tangents, posterior tangents, mid-endplate lines and mid-wall lines. Computerized measurements were performed by automatically evaluating SVI from the symmetry of vertebral anatomical structures in two-dimensional (2D) sagittal cross-sections and in three-dimensional (3D) volumetric images. The mid-wall lines and posterior tangents proved to be the manual measurements with the lowest intra-observer (standard deviation, SD, of 1.4° and 1.7°, respectively) and inter-observer variability (SD of 1.9° and 2.4°, respectively). The strongest inter-method agreement was found between the mid-wall lines and posterior tangents (SD of 2.0°). Computerized measurements in 2D and in 3D resulted in intra-observer (SD of 2.8° and 3.1°, respectively) and inter-observer variability (SD of 3.8° and 5.2°, respectively) that were comparable to those of the superior tangents (SD of 2.6° and 3.7°) and inferior tangents (SD of 3.2° and 4.5°), which represent standard Cobb angle measurements. It can be concluded that computerized measurements of SVI should be based on the inclination of vertebral body walls. Copyright © 2014 Elsevier Ltd. All rights reserved.
Reliability of joint count assessment in rheumatoid arthritis: a systematic literature review.
Cheung, Peter P; Gossec, Laure; Mak, Anselm; March, Lyn
2014-06-01
Joint counts are central to the assessment of rheumatoid arthritis (RA) but reliability is an issue. To evaluate the reliability and agreement of joint counts (intra-observer and inter-observer) by health care professionals (physicians, nurses, and metrologists) and patients in RA, and the impact of training and standardization on joint count reliability through a systematic literature review. Articles reporting joint count reliability or agreement in RA in PubMed, EMBase, and the Cochrane library between 1960 and 2012 were selected. Data were extracted regarding tender joint counts (TJCs) and swollen joint counts (SJCs) derived by physicians, metrologists, or patients for intra-observer and inter-observer reliability. In addition, methods and effects of training or standardization were extracted. Statistics expressing reliability such as intraclass correlation coefficients (ICCs) were extracted. Data analysis was primarily descriptive due to high heterogeneity. Twenty-eight studies on health care professionals (HCP) and 20 studies on patients were included. Intra-observer reliability for TJCs and SJCs was good for HCPs and patients (range of ICC: 0.49-0.98). Inter-observer reliability between HCPs for TJCs was higher than for SJCs (range of ICC: 0.64-0.88 vs. 0.29-0.98). Patient inter-observer reliability with HCPs as comparators was better for TJCs (range of ICC: 0.31-0.91) compared to SJCs (0.16-0.64). Nine studies (7 with HCPs and 2 with patients) evaluated consensus or training, with improvement in reliability of TJCs but conflicting evidence for SJCs. Intra- and inter-observer reliability was high for TJCs for HCPs and patients: among all groups, reliability was better for TJCs than SJCs. Inter-observer reliability of SJCs was poorer for patients than HCPs. Data were inconclusive regarding the potential for training to improve SJC reliability. Overall, the results support further evaluation for patient-reported joint counts as an outcome measure. 
© 2013 Published by Elsevier Inc.
Reliability of a four-column classification for tibial plateau fractures.
Martínez-Rondanelli, Alfredo; Escobar-González, Sara Sofía; Henao-Alzate, Alejandro; Martínez-Cano, Juan Pablo
2017-09-01
A four-column classification system offers a different way of evaluating tibial plateau fractures. The aim of this study is to compare the intra-observer and inter-observer reliability between four-column and classic classifications. This is a reliability study, which included patients presenting with tibial plateau fractures between January 2013 and September 2015 in a level-1 trauma centre. Four orthopaedic surgeons blindly classified each fracture according to four different classifications: AO, Schatzker, Duparc and four-column. Kappa, intra-observer and inter-observer concordance were calculated for the reliability analysis. Forty-nine patients were included. The mean age was 39 ± 14.2 years, with no gender predominance (men: 51%; women: 49%), and 67% of the fractures included at least one of the posterior columns. The intra-observer and inter-observer concordance were calculated for each classification: four-column (84%/79%), Schatzker (60%/71%), AO (50%/59%) and Duparc (48%/58%), with a statistically significant difference among them (p = 0.001/p = 0.003). Kappa coefficient for intra-observer and inter-observer evaluations: Schatzker 0.48/0.39, four-column 0.61/0.34, Duparc 0.37/0.23, and AO 0.34/0.11. The proposed four-column classification showed the highest intra- and inter-observer agreement. When taking into account the agreement that occurs by chance, Schatzker classification showed the highest inter-observer kappa, but again the four-column had the highest intra-observer kappa value. The proposed classification is a more inclusive classification for the posteromedial and posterolateral fractures. We suggest, therefore, that it be used in addition to one of the classic classifications in order to better understand the fracture pattern, as it allows more attention to be paid to the posterior columns, it improves the surgical planning and allows the surgical approach to be chosen more accurately.
Høyer, C; Paludan, J P D; Pavar, S; Biurrun Manresa, J A; Petersen, L J
2014-03-01
To assess the intra- and inter-observer variation in laser Doppler flowmetry curve reading for measurement of toe and ankle pressures. A prospective single blinded diagnostic accuracy study was conducted on 200 patients with known or suspected peripheral arterial disease (PAD), with a total of 760 curve sets produced. The first curve reading for this study was performed by laboratory technologists blinded to clinical clues and previous readings at least 3 months after the primary data sampling. The pressure curves were later reassessed following another period of at least 3 months. Observer agreement in diagnostic classification according to TASC-II criteria was quantified using Cohen's kappa. Reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. The overall agreement in diagnostic classification (PAD/not PAD) was 173/200 (87%) for intra-observer (κ = .858) and 175/200 (88%) for inter-observer data (κ = .787). Reliability analysis confirmed excellent correlation for both intra- and inter-observer data (ICC all ≥.931). The coefficients of variance ranged from 2.27% to 6.44% for intra-observer and 2.39% to 8.42% for inter-observer data. Subgroup analysis showed lower observer-variation for reading of toe pressures in patients with diabetes and/or chronic kidney disease than patients not diagnosed with these conditions. Bland-Altman plots showed higher variation in toe pressure readings than ankle pressure readings. This study shows substantial intra- and inter-observer agreement in diagnostic classification and reading of absolute pressures when using laboratory technologists as observers. The study emphasises that observer variation for curve reading is an important factor concerning the overall reproducibility of the method. Our data suggest diabetes and chronic kidney disease have an influence on toe pressure reproducibility. Copyright © 2013 European Society for Vascular Surgery. 
Published by Elsevier Ltd. All rights reserved.
Zonnebeld, Niek; Maas, Tommy M G; Huberts, Wouter; van Loon, Magda M; Delhaas, Tammo; Tordoir, Jan H M
2017-11-01
Although clinical guidelines on arteriovenous fistula (AVF) creation advocate minimum luminal arterial and venous diameters, assessed by duplex ultrasonography (DUS), the clinical value of routine DUS examination is under debate. DUS might be an insufficiently repeatable and/or reproducible imaging modality because of its operator dependency. The present study aimed to assess intra- and inter-observer agreement of DUS examination in support of AVF surgery planning. Ten end stage renal disease patients were included, to assess intra- and inter-observer agreement of pre-operative DUS measurements. All measurements were performed by two trained and experienced vascular technicians, blinded to measurement readings. From the routine DUS protocol, representative measurements (venous diameters, and arterial diameters and volume flow in the upper arm and forearm) were selected. For intra-observer agreement the measurements were performed in triplicate, with the probe released from the skin between each. Intraclass correlation coefficients were calculated for intra- and inter-observer agreement, and Bland-Altman plots used to graphically display mean measurement differences and limits of agreement. Ten patients (6 male, 59.4±19.7 years) consented to participate, and all predefined measurements were obtained. Intraclass correlation coefficients for intra-observer agreement of diameter measurements were at least 0.90 (95% CI 0.74-0.97; radial artery). Inter-observer agreement was at least 0.83 (0.46-0.96; lateral diameter upper arm cephalic vein). The Bland-Altman plots showed acceptable mean measurement differences and limits of agreement. In experienced hands, excellent intra- and inter-observer agreement can be reached for the discrete pre-operative DUS measurements advocated in clinical guidelines. DUS is therefore a reliable imaging modality to support AVF surgery planning. The content of DUS protocols, however, needs further standardisation. 
Copyright © 2017 European Society for Vascular Surgery. Published by Elsevier Ltd. All rights reserved.
Chen, Frank; Cen, Steven; Palmer, Suzanne
2017-09-01
To evaluate interobserver agreement with the use of and the positive predictive value (PPV) of Prostate Imaging Reporting and Data System version 2 (PI-RADS v2) for the localization of intermediate- and high-grade prostate cancers on multiparametric magnetic resonance imaging (mpMRI). In this retrospective, institutional review board-approved study, 131 consecutive patients who had mpMRI followed by transrectal ultrasound-MR imaging fusion-guided biopsy of the prostate were included. Two readers who were blinded to initial mpMRI reports, clinical data, and pathologic outcomes reviewed the MR images, identified all prostate lesions, and scored each lesion based on the PI-RADS v2. Interobserver agreement was assessed by intraclass correlation coefficient (ICC), and PPV was calculated for each PI-RADS category. PI-RADS v2 was found to have a moderate level of interobserver agreement between two readers of varying experience, with ICC of 0.74, 0.72, and 0.67 for all lesions, peripheral zone lesions, and transitional zone lesions, respectively. Despite only moderate interobserver agreement, the calculated PPV in the detection of intermediate- and high-grade prostate cancers for each PI-RADS category was very similar between the two readers, with approximate PPV of 0%, 12%, 64%, and 87% for PI-RADS categories 2, 3, 4, and 5, respectively. In our study, PI-RADS v2 has only moderate interobserver agreement, a similar finding in studies of the original PI-RADS and in initial studies of PI-RADS v2. Despite this, PI-RADS v2 appears to be a useful system to predict significant prostate cancer, with PI-RADS scores correlating well with the likelihood of intermediate- and high-grade cancers. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Niglis, L; Collin, P; Dosch, J-C; Meyer, N; Kempf, J-F
2017-10-01
The long-term outcomes of rotator cuff repair are unclear. Recurrent tears are common, although their reported frequency varies depending on the type and interpretation challenges of the imaging method used. The primary objective of this study was to assess the intra- and inter-observer reproducibility of the MRI assessment of rotator cuff repair using the Sugaya classification 10 years after surgery. The secondary objective was to determine whether poor reproducibility, if found, could be improved by using a simplified yet clinically relevant classification. Our hypothesis was that reproducibility was limited but could be improved by simplifying the classification. In a retrospective study, we assessed intra- and inter-observer agreement in interpreting 49 magnetic resonance imaging (MRI) scans performed 10 years after rotator cuff repair. These 49 scans were taken at random among 609 cases that underwent re-evaluation, with imaging, for the 2015 SoFCOT symposium on 10-year and 20-year clinical and anatomical outcomes of rotator cuff repair for full-thickness tears. Each of three observers read each of the 49 scans on two separate occasions. At each reading, they assessed the supra-spinatus tendon according to the Sugaya classification in five types. Intra-observer agreement for the Sugaya type was substantial (κ=0.64) but inter-observer agreement was only fair (κ=0.39). Agreement improved when the five Sugaya types were collapsed into two categories (1-2-3 and 4-5) (intra-observer κ=0.74 and inter-observer κ=0.68). Using the Sugaya classification to assess post-operative rotator cuff healing was associated with substantial intra-observer and fair inter-observer agreement. A simpler classification into two categories improved agreement while remaining clinically relevant. II, prospective randomised low-power study. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Low-Dose Radiation 3D Intraoperative Imaging: How Low Can We Go? An O-Arm, CT Scan, Cadaveric Study.
Sarwahi, Vishal; Payares, Monica; Wendolowski, Stephen; Maguire, Kathleen; Thornhill, Beverly; Lo, Yungtai; Amaral, Terry D
2017-11-15
MINI: The objective of this study was to evaluate the accuracy and reliability of pedicle screw placement using O-Arm at dosages below the manufacturer-recommended dose. O-Arm at reduced dose showed a 90% accuracy when compared with computed tomography; however, about 30% medial breaches were misclassified. Cadaveric study. The objective was to evaluate O-Arm's ability at low-dose (LD) settings to assess intraoperative screw placement. Accurate placement of pedicle screws is crucial because of proximity to vital structures. Malposition of screws may result in significant morbidity and potential mortality. O-arm provides real-time, intraoperative imaging of patient's anatomy and provides higher accuracy in scoliosis surgeries, avoiding risk to vital structures. We hypothesize that using LD or ultra-low doses (ULDs) to obtain intraoperative images allows for accurate assessment of screw placement, both minimizing radiation exposure and preventing screw misplacement. Eight cadavers were instrumented with pedicle screws bilaterally from T1 to S1. Screws were randomly placed using O-arm navigation into three positions: contained within the bone, OUT-anterior/lateral, and OUT-medial. O-arm images were obtained at three dosage settings: LD (kVp120/mAs125-lowest manufacturer recommended), very-low dose (VLD) (kVp120/mAs63), and ULD (kVp120/mAs39). Computed tomography (CT) scan was performed using institution's LD protocol (kVp100/mAs50) and gross dissection to identify screw positions. LD, VLD, ULD, and CT for identifying "IN" screws relative to gross dissection had a mean (standard deviation) sensitivity of 84.2% (±5.7), specificity of 76.1% (±9.3), and accuracy of 79.9% (±3.1) from all three observers. Across the three observers, the interobserver agreement was 0.67 (0.61-0.72) for LD, 0.74 (0.69-0.79) for VLD, 0.61 (0.56-0.66) for ULD, and 0.79 (0.74-0.84) for CT.
Effective radiation doses (mSv) were 2.16 for the LD O-arm scan, 1.08 for VLD, 0.68 for ULD, and 1.05 for our LD CT protocol. Accuracy of pedicle screw placement is similar for O-arm at all doses and CT compared to gross dissection. Interobserver reliability was substantial for VLD and CT. Approximately 30% of medial screw breaches are, however, misclassified. ULD and VLDs can be used for intraoperative navigation and evaluation purposes within these limitations. N/A.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dowling, Jason A., E-mail: jason.dowling@csiro.au; University of Newcastle, Callaghan, New South Wales; Sun, Jidi
Purpose: To validate automatic substitute computed tomography (sCT) scans generated from standard T2-weighted (T2w) magnetic resonance (MR) pelvic scans for MR-Sim prostate treatment planning. Patients and Methods: A Siemens Skyra 3T MR imaging (MRI) scanner with laser bridge, flat couch, and pelvic coil mounts was used to scan 39 patients scheduled for external beam radiation therapy for localized prostate cancer. For sCT generation a whole-pelvis MRI scan (1.6 mm 3-dimensional isotropic T2w SPACE [Sampling Perfection with Application optimized Contrasts using different flip angle Evolution] sequence) was acquired. Three additional small field of view scans were acquired: T2w, T2*w, and T1w flip angle 80° for gold fiducials. Patients received a routine planning CT scan. Manual contouring of the prostate, rectum, bladder, and bones was performed independently on the CT and MR scans. Three experienced observers contoured each organ on MRI, allowing interobserver quantification. To generate a training database, each patient CT scan was coregistered to their whole-pelvis T2w using symmetric rigid registration and structure-guided deformable registration. A new multi-atlas local weighted voting method was used to generate automatic contours and sCT results. Results: The mean error in Hounsfield units between the sCT and corresponding patient CT (within the body contour) was 0.6 ± 14.7 (mean ± 1 SD), with a mean absolute error of 40.5 ± 8.2 Hounsfield units. Automatic contouring results were very close to the expert interobserver level (Dice similarity coefficient): prostate 0.80 ± 0.08, bladder 0.86 ± 0.12, rectum 0.84 ± 0.06, bones 0.91 ± 0.03, and body 1.00 ± 0.003. The change in monitor units between the sCT-based plans relative to the gold standard CT plan for the same dose prescription was found to be 0.3% ± 0.8%. The 3-dimensional γ pass rate was 1.00 ± 0.00 (2 mm/2%).
Conclusions: The MR-Sim setup and automatic sCT generation methods using standard MR sequences generate realistic contours and electron densities for prostate cancer radiation therapy dose planning and digitally reconstructed radiograph generation.
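The Dice similarity coefficient used above to compare automatic and manual contours is twice the overlap of two regions divided by the sum of their sizes. The following is a minimal sketch, assuming each contour is represented as a set of voxel coordinates; the voxel sets and the function name are hypothetical and do not reproduce the study's pipeline.

```python
def dice(region_a, region_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty regions agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical voxel sets for a manual and an automatic contour.
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
automatic = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
score = dice(manual, automatic)
print(score)
```

A value of 1 means identical regions and 0 means no overlap, so the organ-level values reported above sit toward the high end of the scale.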
Inter-observer variation in identifying mammals from their tracks at enclosed track plate stations
William J. Zielinski; Fredrick V. Schlexer
2009-01-01
Enclosed track plate stations are a common method to detect mammalian carnivores. Studies rely on these data to make inferences about geographic range, population status and detectability. Despite their popularity, there has been no effort to document inter-observer variation in identifying the species that leave their tracks. Four previous field crew leaders...
Longo, F; Nicetto, T; Banzato, T; Savio, G; Drigo, M; Meneghello, R; Concheri, G; Isola, M
2018-02-01
The aim of this ex vivo study was to test a novel three-dimensional (3D) automated computer-aided design (CAD) method (aCAD) for the computation of femoral angles in dogs from 3D reconstructions of computed tomography (CT) images. The repeatability and reproducibility of manual radiography, manual CT reconstructions and the aCAD method for the measurement of three femoral angles were evaluated: (1) anatomical lateral distal femoral angle (aLDFA); (2) femoral neck angle (FNA); and (3) femoral torsion angle (FTA). Femoral angles of 22 femurs obtained from 16 cadavers were measured by three blinded observers. Measurements were repeated three times by each observer for each diagnostic technique. Femoral angle measurements were analysed using a mixed effects linear model for repeated measures to determine the levels of intra-observer agreement (repeatability) and inter-observer agreement (reproducibility). Repeatability and reproducibility of measurements using the aCAD method were excellent (intra-class correlation coefficients, ICCs≥0.98) for all three angles assessed. Manual radiography and CT exhibited excellent agreement for the aLDFA measurement (ICCs≥0.90). However, FNA repeatability and reproducibility were poor (ICCs<0.8), whereas FTA measurement showed slightly higher ICC values, except for the radiographic reproducibility, which was poor (ICCs<0.8). The 3D aCAD method provided the highest repeatability and reproducibility among the tested methodologies. Copyright © 2017 Elsevier Ltd. All rights reserved.
Safi, Yaser; Aghdasi, Mohammad Mehdi; Ezoddini-Ardakani, Fatemeh; Beiraghi, Samira; Vasegh, Zahra
2015-01-01
Vertical root fracture (VRF) is common in endodontically treated teeth. Conventional and digital radiographies have limitations for detection of VRFs. Cone-beam computed tomography (CBCT) offers greater detection accuracy of VRFs in comparison with conventional radiography. This study compared the effects of metal artifacts on detection of VRFs by using two CBCT systems. Eighty extracted premolars were selected and sectioned at the level of the cementoenamel junction (CEJ). After preparation, root canals were filled with gutta-percha. Subsequently, two thirds of the root fillings were removed for post space preparation and a custom-made post was cemented into each canal. The teeth were randomly divided into two groups (n=40). In the test group, root fracture was created with an Instron universal testing machine. The control teeth remained intact. CBCT scans of all teeth were obtained with either New Tom VGI or Soredex Scanora 3D. Three observers analyzed the images for detection of VRF. The sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for VRF detection and percentage of probable cases were calculated for each imaging system and compared using non-parametric tests considering the non-normal distribution of data. The inter-observer reproducibility was calculated using the weighted kappa coefficient. There were no statistically significant differences in sensitivity, specificity, PPV and NPV between the two CBCT systems. The effect of metal artifacts on VRF detection was not significantly different between the two CBCT systems.
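The four accuracy measures named above (sensitivity, specificity, PPV and NPV) all derive from a 2×2 table of reader calls against the known fracture status. A minimal sketch follows; the counts are hypothetical, chosen only to match the design of 40 fractured and 40 intact teeth, and are not results from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard accuracy measures from 2x2 confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # fractured teeth correctly detected
        "specificity": tn / (tn + fp),  # intact teeth correctly cleared
        "ppv": tp / (tp + fp),          # positive calls that are true fractures
        "npv": tn / (tn + fn),          # negative calls that are truly intact
    }

# Hypothetical reader results: 40 fractured teeth (tp + fn), 40 intact (tn + fp).
metrics = diagnostic_metrics(tp=30, fp=5, fn=10, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on the proportion of fractured teeth in the sample, values obtained from a balanced ex vivo design such as this one would not transfer directly to a clinical population.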
Kroner, Kevin; Cooley, Katie; Hoey, Seamus; Hetzel, Scott J; Bleedorn, Jason A
2017-01-01
To evaluate the reliability of radial torsion assessment in dogs using computed tomography (CT). Cadaveric and retrospective observational clinical study. Thoracic limbs (n = 40) from bilateral normal cadaveric canine specimens (10 pairs) and unilateral antebrachial angular limb deformity (ALD) dogs (10 uniapical and 10 biapical deformities). Limbs were evaluated using CT. Frontal, sagittal, and axial plane (torsion) values were obtained using published guidelines and compared between groups and limbs. Radial torsion reliability was assessed among 3 observers using intraclass correlation coefficients (ICC). The mean (±SD) radial torsion of normal dogs was 3.6° ± 6.4° and contained a significant right to left limb variation of 2.6°. Mean radial torsion in uniapical ALD limbs (3.6° ± 18.7°) was not significantly different from biapical ALD limbs (8.9° ± 17.9°). There was a wide range of torsion values in normal and ALD limbs. The interobserver reliability was excellent (ICC > 0.8) for normal dogs, good (0.73) for uniapical, and excellent (0.89) for biapical ALD limbs. The intraobserver reliability was excellent (>0.8) for all groups. There was a small side-to-side variation of radial torsion in normal dogs. With directed training, torsion assessment using CT is reliable in dogs with and without antebrachial bone deformity. © 2016 The American College of Veterinary Surgeons.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davies, Rhian Siân, E-mail: rhian.s.davies@wales.nhs.uk; Perrett, Teresa; Powell, Jane
A study was performed to establish whether transrectal ultrasound (TRUS)-based postimplant dosimetry (PID) is both practically feasible and comparable to computed tomography (CT)-based PID, recommended in current published guidelines. In total, 22 patients treated consecutively at a single cancer center with low-dose-rate (LDR) brachytherapy for early-stage prostate cancer had a transrectal ultrasound performed immediately after implant (d0-TRUS) and computed tomography scan 30 days after implant (d30-CT). Postimplant dosimetry planning was performed on both image sets and the results were compared. The interobserver reproducibility of the transrectal ultrasound postimplant dosimetry planning technique was also assessed. There was no significant difference in mean prostate D90 (136.5 Gy and 144.4 Gy, p = 0.2197), V100 (86.4% and 89.1%, p = 0.1480) and V150 (52.0% and 47.8%, p = 0.1657) for d30-CT and d0-TRUS, respectively. Rectal doses were significantly higher for d0-TRUS than d30-CT. Urethral doses were available with d0-TRUS only. We have shown that d0-TRUS PID is a useful tool for assessing the quality of an implant after low-dose-rate prostate brachytherapy and is comparable to d30-CT PID. There are clear advantages to its use in terms of resource and time efficiency both for the clinical team and the patient.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kanoun, Salim, E-mail: Salim.kanoun@gmail.com; LE2I UMR6306, Centre national de la recherche scientifique, Arts et Métiers, Université Bourgogne Franche-Comté, Dijon; MRI Unit, Centre Hospitalier Régional Universitaire, Hôpital François Mitterrand, Dijon
Purpose: To compare the diagnostic performance of 18F-fluorocholine positron emission tomography/computed tomography (FCH-PET/CT), multiparametric prostate magnetic resonance imaging (mpMRI), and a combination of both techniques for the detection of local recurrence of prostate cancer initially treated by radiation therapy. Methods and Materials: This was a retrospective, single-institution study of 32 patients with suspected prostate cancer recurrence who underwent both FCH-PET/CT and 3T mpMRI within 3 months of one another for the detection of recurrence. All included patients had to be cleared for metastatic recurrence. The reference procedure was systematic 3-dimensional (3D)-transperineal prostate biopsy for the final assessment of local recurrence. Both imaging modalities were analyzed by 2 experienced readers blinded to clinical data. The analysis was made per-patient and per-segment using a 4-segment model. Results: The median prostate-specific antigen value at the time of imaging was 2.92 ng/mL. The mean prostate-specific antigen doubling time was 14 months. Of the 32 patients, 31 had a positive 3D-transperineal mapping biopsy for a local relapse. On a patient-based analysis, the detection rate was 71% (22 of 31) for mpMRI and 74% (23 of 31) for FCH-PET/CT. On a segment-based analysis, the sensitivity and specificity were, respectively, 32% and 87% for mpMRI, 34% and 87% for FCH-PET/CT, and 43% and 83% for the combined analysis of both techniques. Accuracy was 64%, 65%, and 66%, respectively. The interobserver agreement was κ = 0.92 for FCH-PET/CT and κ = 0.74 for mpMRI. Conclusions: Both mpMRI and FCH-PET/CT show limited sensitivity but good specificity for the detection of local cancer recurrence after radiation therapy, when compared with 3D-transperineal mapping biopsy. Prostate biopsy still seems to be mandatory to diagnose local relapse and select patients who could benefit from local salvage therapy.
Analysis of intensity variability in multislice and cone beam computed tomography.
Nackaerts, Olivia; Maes, Frederik; Yan, Hua; Couto Souza, Paulo; Pauwels, Ruben; Jacobs, Reinhilde
2011-08-01
The aim of this study was to evaluate the variability of intensity values in cone beam computed tomography (CBCT) imaging compared with multislice computed tomography Hounsfield units (MSCT HU) in order to assess the reliability of density assessments using CBCT images. A quality control phantom was scanned with an MSCT scanner and five CBCT scanners. In one CBCT scanner, the phantom was scanned repeatedly in the same and in different positions. Images were analyzed using registration to a mathematical model. MSCT images were used as a reference. Density profiles of MSCT showed stable HU values, whereas in CBCT imaging the intensity values were variable over the profile. Repositioning of the phantom resulted in large fluctuations in intensity values. The use of intensity values in CBCT images is not reliable, because the values are influenced by device, imaging parameters and positioning. © 2011 John Wiley & Sons A/S.
Towards the automated analysis and database development of defibrillator data from cardiac arrest.
Eftestøl, Trygve; Sherman, Lawrence D
2014-01-01
During resuscitation of cardiac arrest victims a variety of information in electronic format is recorded as part of the documentation of the patient care contact and in order to be provided for case review for quality improvement. Such review requires considerable effort and resources. There is also the problem of interobserver effects. We show that it is possible to efficiently analyze resuscitation episodes automatically using a minimal set of the available information. A minimal set of variables is defined which describe therapeutic events (compression sequences and defibrillations) and corresponding patient response events (annotated rhythm transitions). From this a state sequence representation of the resuscitation episode is constructed and an algorithm is developed for reasoning with this representation and extract review variables automatically. As a case study, the method is applied to the data abstraction process used in the King County EMS. The automatically generated variables are compared to the original ones with accuracies ≥ 90% for 18 variables and ≥ 85% for the remaining four variables. It is possible to use the information present in the CPR process data recorded by the AED along with rhythm and chest compression annotations to automate the episode review.
Zheng, Yuanda; Sun, Xiaojiang; Wang, Jian; Zhang, Lingnan; DI, Xiaoyun; Xu, Yaping
2014-04-01
18F-fluorodeoxyglucose (FDG)-positron emission tomography (PET)/computed tomography (CT) has the potential to improve the staging and radiation treatment (RT) planning of various tumor sites. However, from a clinical standpoint, questions remain with regard to what extent PET/CT changes the target volume and whether PET/CT reduces interobserver variability in target volume delineation. The present study analyzed the use of FDG-PET/CT images for staging and evaluated the impact of FDG-PET/CT on the radiotherapy volume delineation compared with CT in patients with non-small cell lung cancer (NSCLC) who were candidates for radiotherapy. Intraobserver variation in delineating tumor volumes was also observed. In total, 23 patients with stage I-III NSCLC were enrolled and treated with fractionated RT-based therapy with or without chemotherapy. FDG-PET/CT scans were acquired within two weeks prior to RT. PET and CT data sets were sent to the treatment planning system, Pinnacle, through compact discs. The CT and PET images were subsequently fused by means of a dedicated RT planning system. Gross tumor volume (GTV) was contoured by four radiation oncologists on CT (GTV-CT) and PET/CT images (GTV-PET/CT). The resulting volumes were analyzed and compared. For the first phase, two radiation oncologists outlined the contours together, achieving a final consensus. Based on PET/CT, changes in tumor-node-metastasis categories occurred in 8/23 cases (35%). Radiation targeting with fused FDG-PET and CT images resulted in alterations in radiation therapy planning in 12/20 patients (60%) in comparison with CT targeting. The most prominent changes in GTV were observed in cases with atelectasis. For the second phase, the variation in delineating tumor volumes was assessed by four observers. The mean ratio of largest to smallest CT-based GTV was 2.31 (range, 1.01-5.96). The addition of the PET results reduced the mean ratio to 1.46 (range, 1.02-2.27).
PET/CT fusion images may have a potential impact on tumor staging and treatment planning. Implementing matched PET/CT results reduced observer variation in delineating tumor volumes significantly with respect to CT only.
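The observer-variation measure reported above is, for each patient, the ratio of the largest to the smallest GTV drawn by the observers, averaged over patients. A minimal sketch of that arithmetic follows; the patient labels and volumes are hypothetical and are not data from the study.

```python
from statistics import mean

def max_min_ratio(volumes):
    """Ratio of the largest to the smallest delineated volume for one patient."""
    return max(volumes) / min(volumes)

# Hypothetical GTVs (cm^3) contoured by four observers for two patients.
gtv_by_patient = {
    "patient_a": [10.0, 20.0, 15.0, 12.0],
    "patient_b": [30.0, 33.0, 36.0, 45.0],
}
ratios = {pid: max_min_ratio(v) for pid, v in gtv_by_patient.items()}
mean_ratio = mean(list(ratios.values()))
print(ratios, mean_ratio)
```

A ratio of 1 would mean all observers drew the same volume, so a mean ratio closer to 1 indicates less observer variation.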
Batchelor, Connor; Pordeli, Pooneh; d'Esterre, Christopher D; Najm, Mohamed; Al-Ajlan, Fahad S; Boesen, Mari E; McDougall, Connor; Hur, Lisa; Fainardi, Enrico; Shankar, Jai Jai Shiva; Rubiera, Marta; Khaw, Alexander V; Hill, Michael D; Demchuk, Andrew M; Sajobi, Tolulope T; Goyal, Mayank; Lee, Ting-Yim; Aviv, Richard I; Menon, Bijoy K
2017-06-01
Intracerebral hemorrhage is a feared complication of intravenous alteplase therapy in patients with acute ischemic stroke. We explore the use of multimodal computed tomography in predicting this complication. All patients were administered intravenous alteplase with/without intra-arterial therapy. An age- and sex-matched case-control design with classic and conditional logistic regression techniques was chosen for analyses. Outcome was parenchymal hemorrhage on 24- to 48-hour imaging. Exposure variables were imaging (noncontrast computed tomography hypoattenuation degree, relative volume of very low cerebral blood volume, relative volume of cerebral blood flow ≤7 mL/min·per 100 g, relative volume of Tmax ≥16 s with all volumes standardized to z axis coverage, mean permeability surface area product values within Tmax ≥8 s volume, and mean permeability surface area product values within ipsilesional hemisphere) and clinical variables (NIHSS [National Institutes of Health Stroke Scale], onset to imaging time, baseline systolic blood pressure, blood glucose, serum creatinine, treatment type, and reperfusion status). One-hundred eighteen subjects (22 patients with parenchymal hemorrhage versus 96 without, median baseline NIHSS score of 15) were included in the final analysis. In multivariable regression, noncontrast computed tomography hypoattenuation grade (P<0.006) and computerized tomography perfusion white matter relative volume of very low cerebral blood volume (P=0.04) were the only significant variables associated with parenchymal hemorrhage on follow-up imaging (area under the curve, 0.73; 95% confidence interval, 0.63-0.83). Interrater reliability for noncontrast computed tomography hypoattenuation grade was moderate (κ=0.6).
Baseline hypoattenuation on noncontrast computed tomography and very low cerebral blood volume on computerized tomography perfusion are associated with development of parenchymal hemorrhage in patients with acute ischemic stroke receiving intravenous alteplase. © 2017 American Heart Association, Inc.
Sánchez, Guillermo; Nova, John; Arias, Nilsa; Peña, Bibiana
2008-12-01
The Fitzpatrick phototype scale has been used to determine skin sensitivity to ultraviolet light. The reliability of this scale in estimating sensitivity permits risk evaluation of skin cancer based on phototype. Reliability and changes in intra- and inter-observer concordance were determined for the Fitzpatrick phototype scale after the assessment methods for establishing the phototype were standardized. An analytical study of intra- and inter-observer concordance was performed. The Fitzpatrick phototype scale was standardized using focus group methodology. To determine intra- and inter-observer agreement, the weighted kappa statistical method was applied. The standardization effect was measured using the equal kappa contrast hypothesis and Wald test for dependent measurements. The phototype scale was applied to 155 patients over 15 years of age who were assessed four times by two independent observers. The sample was drawn from patients of the Centro Dermatológico Federico Lleras Acosta. During the pre-standardization phase, the baseline and six-week inter-observer weighted kappa values were 0.31 and 0.40, respectively. The intra-observer kappa values for observers A and B were 0.47 and 0.51, respectively. After the standardization process, the baseline and six-week inter-observer weighted kappa values were 0.77 and 0.82, respectively. Intra-observer kappa coefficients for observers A and B were 0.78 and 0.82. Statistically significant differences were found between coefficients before and after standardization (p<0.001) in all comparisons. Following a standardization exercise, the Fitzpatrick phototype scale yielded reliable, reproducible and consistent results.
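Weighted kappa, the agreement statistic used above, penalises disagreements on an ordinal scale in proportion to how far apart the two ratings are. The sketch below assumes linear weights, which the abstract does not specify; the ratings and the function name are hypothetical and are not data from the study.

```python
from collections import Counter

def weighted_kappa(ratings_1, ratings_2, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale."""
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_1)
    observed = Counter(zip(ratings_1, ratings_2))   # joint rating counts
    marginal_1, marginal_2 = Counter(ratings_1), Counter(ratings_2)
    span = len(categories) - 1
    obs_disagreement = exp_disagreement = 0.0
    for a in categories:
        for b in categories:
            weight = abs(index[a] - index[b]) / span  # 0 on the diagonal
            obs_disagreement += weight * observed[(a, b)] / n
            exp_disagreement += weight * (marginal_1[a] / n) * (marginal_2[b] / n)
    return 1 - obs_disagreement / exp_disagreement

# Hypothetical phototype ratings (I-VI coded 1-6) from two observers.
phototypes = [1, 2, 3, 4, 5, 6]
observer_a = [1, 2, 3, 4]
observer_b = [1, 2, 3, 3]
kappa = weighted_kappa(observer_a, observer_b, phototypes)
print(round(kappa, 3))
```

The function divides by the expected disagreement, so it is undefined when both observers assign every patient to the same single category.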
Error in geometric morphometric data collection: Combining data from multiple sources.
Robinson, Chris; Terhune, Claire E
2017-09-01
This study compares two- and three-dimensional morphometric data to determine the extent to which intra- and interobserver and intermethod error influence the outcomes of statistical analyses. Data were collected five times for each method and observer on 14 anthropoid crania using calipers, a MicroScribe, and 3D models created from NextEngine and microCT scans. ANOVA models were used to examine variance in the linear data at the level of genus, species, specimen, observer, method, and trial. Three-dimensional data were analyzed using geometric morphometric methods; principal components analysis was employed to examine how trials of all specimens were distributed in morphospace and Procrustes distances among trials were calculated and used to generate UPGMA trees to explore whether all trials of the same individual grouped together regardless of observer or method. Most variance in the linear data was at the genus level, with greater variance at the observer than method levels. In the 3D data, interobserver and intermethod error were similar to intraspecific distances among Callicebus cupreus individuals, with interobserver error being higher than intermethod error. Generally, taxa separate well in morphospace, with different trials of the same specimen typically grouping together. However, trials of individuals in the same species overlapped substantially with one another. Researchers should be cautious when compiling data from multiple methods and/or observers, especially if analyses are focused on intraspecific variation or closely related species, as in these cases, patterns among individuals may be obscured by interobserver and intermethod error. Conducting interobserver and intermethod reliability assessments prior to the collection of data is recommended. © 2017 Wiley Periodicals, Inc.
Are distal radius fracture classifications reproducible? Intra and interobserver agreement.
Belloti, João Carlos; Tamaoki, Marcel Jun Sugawara; Franciozi, Carlos Eduardo da Silveira; Santos, João Baptista Gomes dos; Balbachevsky, Daniel; Chap Chap, Eduardo; Albertoni, Walter Manna; Faloppa, Flávio
2008-05-01
Various classification systems have been proposed for fractures of the distal radius, but the reliability of these classifications is seldom addressed. For a fracture classification to be useful, it must provide prognostic significance, interobserver reliability and intraobserver reproducibility. The aim here was to evaluate the intraobserver and interobserver agreement of distal radius fracture classifications. This was a validation study on interobserver and intraobserver reliability. It was developed in the Department of Orthopedics and Traumatology, Universidade Federal de São Paulo - Escola Paulista de Medicina. X-rays from 98 cases of displaced distal radius fracture were evaluated by five observers: one third-year orthopedic resident (R3), one sixth-year undergraduate medical student (UG6), one radiologist physician (XRP), one orthopedic trauma specialist (OT) and one orthopedic hand surgery specialist (OHS). The radiographs were classified on three different occasions (times T1, T2 and T3) using the Universal (Cooney), Arbeitsgemeinschaft für Osteosynthesefragen/Association for the Study of Internal Fixation (AO/ASIF), Frykman and Fernández classifications. The kappa coefficient (κ) was applied to assess the degree of agreement. Among the three occasions, the highest mean intraobserver κ was observed in the Universal classification (0.61), followed by Fernández (0.59), Frykman (0.55) and AO/ASIF (0.49). The interobserver agreement was unsatisfactory in all classifications. The Fernández classification showed the best agreement (0.44) and the worst was the Frykman classification (0.26). The low agreement levels observed in this study suggest that there is still no classification method with high reproducibility.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karki, K; Hugo, G; Saraiya, S
Purpose: Target delineation in lung cancer radiotherapy has, in general, large variability. MRI has so far not been investigated in detail for lung cancer delineation variability. The purpose of this study is to investigate delineation variability for lung tumors using MRI and compare it to CT alone and PET-CT based delineations. Methods: Seven physicians delineated the primary tumor volumes of nine patients for the following scenarios: (1) CT only; (2) post-contrast T1-weighted MRI registered with diffusion-weighted MRI; and (3) PET-CT fusion images. To compute interobserver variability, the median surface was generated from all observers’ contours and used as the reference surface. A single physician labeled the interface types (tumor to lung, atelectasis (collapsed lung), hilum, mediastinum, or chest-wall) on the median surface. Volume variation (normalized to PET-CT volume), minimum distance (MD), and bidirectional local distance (BLD) between individual observers’ contours and the reference contour were measured. Results: CT- and MRI-based normalized volumes were 1.61±0.76 (mean±SD) and 1.38±0.44, respectively, both significantly larger than PET-CT (p<0.05, paired t-test). The overall uncertainty (root mean square of SD values over all points) of both BLD and MD measures of the observers for the interfaces were not significantly different (p>0.05, two-samples t-test) for all imaging modalities except between tumor-mediastinum and tumor-atelectasis in PET-CT. The largest mean overall uncertainty was observed for tumor-atelectasis interface, the smallest for tumor-mediastinum and tumor-lung interfaces for all modalities. The whole tumor uncertainties for both BLD and MD were not significantly different between any two modalities (p>0.05, paired t-test). Overall uncertainties for the interfaces using BLD were similar to using MD. Conclusion: Large volume variations were observed between the three imaging modalities.
Contouring variability appeared to depend on the interface type. This study will be useful for understanding the delineation uncertainty for radiotherapy planning of lung cancer using different imaging modalities. Disclosures: Research agreement with Philips Healthcare (GH and EW), National Institutes of Health Licensing agreement with Varian Medical Systems (GH and EW), research grants from the National Institute of Health (GH and EW), UpToDate royalties (EW), and none (others). Authors have no potential conflicts of interest to disclose.
ERIC Educational Resources Information Center
Mudford, Oliver C.; Taylor, Sarah Ann; Martin, Neil T.
2009-01-01
We reviewed all research articles in 10 recent volumes of the "Journal of Applied Behavior Analysis (JABA)": Vol. 28(3), 1995, through Vol. 38(2), 2005. Continuous recording was used in the majority (55%) of the 168 articles reporting data on free-operant human behaviors. Three methods for reporting interobserver agreement (exact agreement,…
Krause, Fabian G; Di Silvestro, Matthew; Penner, Murray J; Wing, Kevin J; Glazebrook, Mark A; Daniels, Timothy R; Lau, Johnny T C; Younger, Alastair S E
2012-02-01
End-stage ankle arthritis is operatively treated with numerous designs of total ankle replacement and different techniques for ankle fusion. For superior comparison of these procedures, outcome research requires a classification system to stratify patients appropriately. A postoperative 4-type classification system was designed by 6 fellowship-trained foot and ankle surgeons. Four surgeons reviewed blinded patient profiles and radiographs on 2 occasions to determine the interobserver and intraobserver reliability of the classification. Excellent interobserver reliability (κ = .89) and intraobserver reproducibility (κ = .87) were demonstrated for the postoperative classification system. In conclusion, the postoperative Canadian Orthopaedic Foot and Ankle Society (COFAS) end-stage ankle arthritis classification system appears to be a valid tool to evaluate the outcome of patients operated for end-stage ankle arthritis.
Gill, Ritu R; Naidich, David P; Mitchell, Alan; Ginsberg, Michelle; Erasmus, Jeremy; Armato, Samuel G; Straus, Christopher; Katz, Sharyn; Patios, Demetrois; Richards, William G; Rusch, Valerie W
2016-08-01
Clinical tumor (T), node, and metastasis staging is based on a qualitative assessment of features defining T descriptors and has been found to be suboptimal for predicting the prognosis of patients with malignant pleural mesothelioma (MPM). Previous work suggests that volumetric computed tomography (VolCT) is prognostic and, if found practical and reproducible, could improve clinical MPM classification. Six North American institutions electronically submitted clinical, pathologic, and imaging data on patients with stages I to IV MPM to an established multicenter database and biostatistical center. Two reference radiologists blinded to clinical data independently reviewed the scans; calculated clinical T, node, and metastasis stage by standard criteria; performed semiautomated tumor volume calculations using commercially available software; and submitted the findings to the biostatistical center. Study end points included the feasibility of a multi-institutional VolCT network, concordance of independent VolCT assessments, and association of VolCT with pathological T classification. Of 164 submitted cases, 129 were evaluated by both reference radiologists. Discordant clinical staging of most cases confirmed the inadequacy of current criteria. The overall correlation between VolCT estimates was good (Spearman correlation 0.822), but some were significantly discordant. Root cause analysis of the most discordant estimates identified four common sources of variability. Despite these limitations, median tumor volume estimates were similar within subgroups of cases representing each pathological T descriptor and increased monotonically for each reference radiologist with increasing pathological T status. The good correlation between VolCT estimates obtained for most cases reviewed by two independent radiologists and qualitative association of VolCT with pathological T status combine to encourage further study. 
The identified sources of user error will inform design of a follow-up prospective trial to more formally assess interobserver variability of VolCT and its potential contribution to clinical MPM staging. Copyright © 2016 International Association for the Study of Lung Cancer. Published by Elsevier Inc. All rights reserved.
Satriano, Alessandro; Guenther, Zachary; White, James A; Merchant, Naeem; Di Martino, Elena S; Al-Qoofi, Faisal; Lydell, Carmen P; Fine, Nowell M
2018-05-02
Functional impairment of the aorta is a recognized complication of aortic and aortic valve disease. Aortic strain measurement provides effective quantification of mechanical aortic function, and 3-dimensional (3D) approaches may be desirable for serial evaluation. Computerized tomographic angiography (CTA) is routinely performed for various clinical indications, and offers the unique potential to study 3D aortic deformation. We sought to investigate the feasibility of performing 3D aortic strain analysis in a candidate population of patients undergoing transcatheter aortic valve replacement (TAVR). Twenty-one patients with severe aortic valve stenosis (AS) referred for TAVR underwent ECG-gated CTA and echocardiography. CTA images were analyzed using a 3D feature-tracking based technique to construct a dynamic aortic mesh model to perform peak principal strain amplitude (PPSA) analysis. Segmental strain values were correlated against clinical, hemodynamic and echocardiographic variables. Reproducibility analysis was performed. The mean patient age was 81±6 years. Mean left ventricular ejection fraction was 52±14%, aortic valve area (AVA) 0.6±0.3 cm² and mean AS pressure gradient (MG) 44±11 mmHg. CTA-based 3D PPSA analysis was feasible in all subjects. Mean PPSA values for the global thoracic aorta, ascending aorta, aortic arch and descending aorta segments were 6.5±3.0, 10.2±6.0, 6.1±2.9 and 3.3±1.7%, respectively. 3D PPSA values demonstrated significantly more impairment with measures of worsening AS severity, including AVA and MG for the global thoracic aorta and ascending segment (p<0.001 for all). 3D PPSA was independently associated with AVA by multivariable modelling. Coefficients of variation for intra- and inter-observer variability were 5.8 and 7.2%, respectively. Three-dimensional aortic PPSA analysis is clinically feasible from routine ECG-gated CTA. Appropriate reductions in PPSA were identified with increasing AS hemodynamic severity.
Expanded study of 3D aortic PPSA for patients with various forms of aortic disease is warranted.
Hong, Theodore S; Bosch, Walter R; Krishnan, Sunil; Kim, Tae K; Mamon, Harvey J; Shyn, Paul; Ben-Josef, Edgar; Seong, Jinsil; Haddock, Michael G; Cheng, Jason C; Feng, Mary U; Stephans, Kevin L; Roberge, David; Crane, Christopher; Dawson, Laura A
2014-07-15
Defining hepatocellular carcinoma (HCC) gross tumor volume (GTV) requires multimodal imaging, acquired in different perfusion phases. The purposes of this study were to evaluate the variability in contouring and to establish guidelines and educational recommendations for reproducible HCC contouring for treatment planning. Anonymous, multiphasic planning computed tomography scans obtained from 3 patients with HCC were identified and distributed to a panel of 11 gastrointestinal radiation oncologists. Panelists were asked the number of HCC cases they treated in the past year. Case 1 had no vascular involvement, case 2 had extensive portal vein involvement, and case 3 had minor branched portal vein involvement. The agreement between the contoured total GTVs (primary + vascular GTV) was assessed using the generalized kappa statistic. Agreement interpretation was evaluated using Landis and Koch's interpretation of strength of agreement. The S95 contour, defined using the simultaneous truth and performance level estimation (STAPLE) algorithm consensus at the 95% confidence level, was created for each case. Of the 11 panelists, 3 had treated >25 cases in the past year, 2 had treated 10 to 25 cases, 2 had treated 5 to 10 cases, 2 had treated 1 to 5 cases, 1 had treated 0 cases, and 1 did not respond. Near perfect agreement was seen for case 1, and substantial agreement was seen for cases 2 and 3. For case 2, there was significant heterogeneity in the volume identified as tumor thrombus (range 0.58-40.45 cc). For case 3, 2 panelists did not include the branched portal vein thrombus, and 7 panelists contoured thrombus separately from the primary tumor, also showing significant heterogeneity in volume of tumor thrombus (range 4.52-34.27 cc). In a group of experts, excellent agreement was seen in contouring total GTV. Heterogeneity exists in the definition of portal vein thrombus that may impact treatment planning, especially if differential dosing is contemplated. 
Guidelines for HCC GTV contouring are recommended. Copyright © 2014. Published by Elsevier Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, Theodore S., E-mail: tshong1@mgh.harvard.edu; Bosch, Walter R.; Krishnan, Sunil
2014-07-15
Purpose: Defining hepatocellular carcinoma (HCC) gross tumor volume (GTV) requires multimodal imaging, acquired in different perfusion phases. The purposes of this study were to evaluate the variability in contouring and to establish guidelines and educational recommendations for reproducible HCC contouring for treatment planning. Methods and Materials: Anonymous, multiphasic planning computed tomography scans obtained from 3 patients with HCC were identified and distributed to a panel of 11 gastrointestinal radiation oncologists. Panelists were asked the number of HCC cases they treated in the past year. Case 1 had no vascular involvement, case 2 had extensive portal vein involvement, and case 3 had minor branched portal vein involvement. The agreement between the contoured total GTVs (primary + vascular GTV) was assessed using the generalized kappa statistic. Agreement interpretation was evaluated using Landis and Koch's interpretation of strength of agreement. The S95 contour, defined using the simultaneous truth and performance level estimation (STAPLE) algorithm consensus at the 95% confidence level, was created for each case. Results: Of the 11 panelists, 3 had treated >25 cases in the past year, 2 had treated 10 to 25 cases, 2 had treated 5 to 10 cases, 2 had treated 1 to 5 cases, 1 had treated 0 cases, and 1 did not respond. Near perfect agreement was seen for case 1, and substantial agreement was seen for cases 2 and 3. For case 2, there was significant heterogeneity in the volume identified as tumor thrombus (range 0.58-40.45 cc). For case 3, 2 panelists did not include the branched portal vein thrombus, and 7 panelists contoured thrombus separately from the primary tumor, also showing significant heterogeneity in volume of tumor thrombus (range 4.52-34.27 cc). Conclusions: In a group of experts, excellent agreement was seen in contouring total GTV.
Heterogeneity exists in the definition of portal vein thrombus that may impact treatment planning, especially if differential dosing is contemplated. Guidelines for HCC GTV contouring are recommended.
Rybacka, Anna; Goździk-Spychalska, Joanna; Rybacki, Adam; Piorunek, Tomasz; Batura-Gabryel, Halina; Karmelita-Katulska, Katarzyna
2018-05-04
In cystic fibrosis, pulmonary function tests (PFTs) and computed tomography are used to assess lung function and structure, respectively. Although both techniques of assessment are congruent, there are lingering doubts about which PFT variables show the best congruence with computed tomography scoring. In this study we addressed the issue by reinvestigating the association between PFT variables and the score of changes seen in computed tomography scans in patients with cystic fibrosis with and without pulmonary exacerbation. This retrospective study comprised 40 patients in whom PFTs and computed tomography were performed no longer than 3 weeks apart. Images (inspiratory: 0.625 mm slice thickness, 0.625 mm interval; expiratory: 1.250 mm slice thickness, 10 mm interval) were evaluated with the Bhalla scoring system. The most frequent structural abnormalities found in scans were bronchiectases and peribronchial thickening. The strongest relationship was found between the Bhalla score and forced expiratory volume in 1 s (FEV1). The Bhalla score was also related to forced vital capacity (FVC), FEV1/FVC ratio, residual volume (RV), and RV/total lung capacity (TLC) ratio. We conclude that lung structural data obtained from the computed tomography examination are highly congruent with lung function data. Thus, computed tomography imaging may supersede functional assessment in cases of poor compliance with spirometry procedures in the elderly or children. Computed tomography also seems more sensitive than PFTs in the assessment of cystic fibrosis progression. Moreover, in early phases of cystic fibrosis, computed tomography, due to its excellent resolution, may be irreplaceable in monitoring pulmonary damage.
Electrical resistivity tomography to delineate greenhouse soil variability
NASA Astrophysics Data System (ADS)
Rossi, R.; Amato, M.; Bitella, G.; Bochicchio, R.
2013-03-01
Appropriate management of soil spatial variability is an important tool for optimizing farming inputs, with the result of yield increase and reduction of the environmental impact in field crops. Under greenhouses, several factors such as non-uniform irrigation and localized soil compaction can severely affect yield and quality. Additionally, if soil spatial variability is not taken into account, yield deficiencies are often compensated by extra volumes of crop inputs; as a result, over-irrigation and over-fertilization in some parts of the field may occur. Technology for spatially sound management of greenhouse crops is therefore needed to increase yield and quality and to address sustainability. In this experiment, 2D electrical resistivity tomography was used as an exploratory tool to characterize greenhouse soil variability and its relations to wild rocket yield. Soil resistivity matched biomass variation well (R² = 0.70), and was linked to differences in soil bulk density (R² = 0.90) and clay content (R² = 0.77). Electrical resistivity tomography shows great potential in horticulture, where there is a growing demand for sustainability coupled with the necessity of stabilizing yield and product quality.
Doppler aortic flow velocity measurement in healthy children.
Sohn, S.; Kim, H. S.
2001-01-01
To determine normal values for Doppler parameters of left ventricular function, ascending aortic blood flow velocity was measured by pulsed wave Doppler echocardiography in 63 healthy children with body surface area (BSA) < 1 m² (age < 10 yr). Peak velocity was independent of sex, but increased with body size. Mean acceleration was related to peak velocity (r = 0.75, p < 0.0001). Both stroke distance and ejection time had strong negative correlations with heart rate and positive correlations with BSA, suggesting that these parameters should be evaluated in relation to heart rate and body size. Mean intra- and interobserver variability for peak velocity, ejection time, stroke and minute distance ranged from 3 to 7%, whereas variability for acceleration time was 9 to 13%. These data may be used as reference values for the assessment of hemodynamic states in young children with cardiac disease. PMID:11306737
Liu, Haiping; Chen, Ping; Wroblewski, Kristen; Hou, Peng; Zhang, Chen-Peng; Jiang, Yulei; Pu, Yonglin
2016-01-01
The objective of this study was to test the hypothesis that the metabolic tumor volume (MTV) of primary non-small-cell lung cancer is not sensitive to differences in 18F-fluorodeoxyglucose (18F-FDG) uptake time, and to compare this consistency of MTV measurements with that of standardized uptake value (SUV) and total lesion glycolysis (TLG). Under Institutional Review Board approval, 134 consecutive patients with histologically proven non-small-cell lung cancer underwent 18F-FDG PET/computed tomography scanning at about 1 h (early) and 2 h (delayed) after intravenous injection of 18F-FDG. MTV, SUV, and TLG of the primary tumor were all measured. Student's t-test and Wilcoxon's signed-rank test for paired data were used to compare MTV, SUV, and TLG between the two scans. The intraclass correlation coefficient (ICC) was used to assess agreement in PET parameters between the two scans and between the measurements made by two observers. MTV was not significantly different (P=0.17) between the two scans. However, SUVmax, SUVmean, SUVpeak, and TLG increased significantly from the early to the delayed scans (P<0.0001 for all). The median percentage change between the two scans in MTV (1.65%) was smaller than in SUVmax (11.76%), SUVmean (10.57%), SUVpeak (13.51%), and TLG (14.34%); the ICC of MTV (0.996) was greater than that of SUVmax (0.933), SUVmean (0.952), SUVpeak (0.928), and TLG (0.982). Interobserver agreement between the two radiologists was excellent for MTV, SUV, and TLG on both scans (ICC: 0.934-0.999). MTV is not sensitive to common clinical variations in 18F-FDG uptake time, its consistency is greater than that of SUVmax, SUVmean, SUVpeak, and TLG, and it has excellent interobserver agreement.
Bae, Yun Jung; Kim, Jong-Min; Kim, Kyeong Joon; Kim, Eunhee; Park, Hyun Soo; Kang, Seo Young; Yoon, In-Young; Lee, Jee-Young; Jeon, Beomseok; Kim, Sang Eun
2018-04-01
Purpose: To examine whether the loss of nigral hyperintensity (NH) on 3.0-T susceptibility-weighted (SW) magnetic resonance (MR) images can help identify high synucleinopathy risk in patients with idiopathic rapid eye movement sleep behavior disorder (iRBD). Materials and Methods: Between March 2014 and April 2015, 18 consecutively recruited patients with iRBD were evaluated with 3.0-T SW imaging and iodine 123-2β-carbomethoxy-3β-(4-iodophenyl)-N-(3-fluoropropyl)-nortropane (123I-FP-CIT) single photon emission computed tomography and compared with 18 healthy subjects and 18 patients with Parkinson disease (PD). Two readers blinded to clinical diagnosis independently assessed the images. 123I-FP-CIT uptake ratios were compared by using the Kruskal-Wallis test, and intra- and interobserver agreements were assessed with the Cohen κ. The synucleinopathy conversion according to NH status was evaluated in patients with iRBD after follow-up. Results: NH was intact in seven patients with iRBD and lost in 11. The 123I-FP-CIT uptake ratios were comparable between those with intact NH (mean, 3.22 ± 0.47) and healthy subjects (mean, 3.37 ± 0.47) (P = .495). The 123I-FP-CIT uptake ratios in the 11 patients with iRBD and NH loss (mean, 2.48 ± 0.44) were significantly lower than those in healthy subjects (mean, 3.37 ± 0.47; P < .001) but higher than those in patients with PD (mean, 1.80 ± 0.33; P < .001). The intra- and interobserver agreements were excellent (κ > 0.9). Five patients with iRBD and NH loss developed symptoms of parkinsonism or dementia 18 months after neuroimaging. Conclusion: NH loss at 3.0-T SW imaging may be a promising marker for short-term synucleinopathy risk in iRBD. © RSNA, 2017
Nael, Kambiz; Khan, Rihan; Choudhary, Gagandeep; Meshksar, Arash; Villablanca, Pablo; Tay, Jennifer; Drake, Kendra; Coull, Bruce M; Kidwell, Chelsea S
2014-07-01
If magnetic resonance imaging (MRI) is to compete with computed tomography for evaluation of patients with acute ischemic stroke, there is a need for further improvements in acquisition speed. Inclusion criteria for this prospective, single institutional study were symptoms of acute ischemic stroke within 24 hours onset, National Institutes of Health Stroke Scale ≥3, and absence of MRI contraindications. A combination of echo-planar imaging (EPI) and a parallel acquisition technique were used on a 3T magnetic resonance (MR) scanner to accelerate the acquisition time. Image analysis was performed independently by 2 neuroradiologists. A total of 62 patients met inclusion criteria. A repeat MRI scan was performed in 22 patients resulting in a total of 84 MRIs available for analysis. Diagnostic image quality was achieved in 100% of diffusion-weighted imaging, 100% EPI-fluid attenuation inversion recovery imaging, 98% EPI-gradient recalled echo, 90% neck MR angiography and 96% of brain MR angiography, and 94% of dynamic susceptibility contrast perfusion scans with interobserver agreements (k) ranging from 0.64 to 0.84. Fifty-nine patients (95%) had acute infarction. There was good interobserver agreement for EPI-fluid attenuation inversion recovery imaging findings (k=0.78; 95% confidence interval, 0.66-0.87) and for detection of mismatch classification using dynamic susceptibility contrast-Tmax (k=0.92; 95% confidence interval, 0.87-0.94). Thirteen acute intracranial hemorrhages were detected on EPI-gradient recalled echo by both observers. A total of 68 and 72 segmental arterial stenoses were detected on contrast-enhanced MR angiography of the neck and brain with k=0.93, 95% confidence interval, 0.84 to 0.96 and 0.87, 95% confidence interval, 0.80 to 0.90, respectively. 
A 6-minute multimodal MR protocol with good diagnostic quality is feasible for the evaluation of patients with acute ischemic stroke and can result in significant reduction in scan time rivaling that of the multimodal computed tomographic protocol. © 2014 American Heart Association, Inc.
A tri-modality image fusion method for target delineation of brain tumors in radiotherapy.
Guo, Lu; Shen, Shuming; Harris, Eleanor; Wang, Zheng; Jiang, Wei; Guo, Yu; Feng, Yuanming
2014-01-01
To develop a tri-modality image fusion method for better target delineation in image-guided radiotherapy for patients with brain tumors. A new method of tri-modality image fusion was developed, which can fuse and display all image sets in one panel and one operation. A feasibility study of gross tumor volume (GTV) delineation was conducted using data from three patients with brain tumors, which included images from simulation CT, MRI, and 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) examinations before radiotherapy. Tri-modality image fusion was implemented after image registrations of CT+PET and CT+MRI, and the transparency weight of each modality could be adjusted and set by users. Three radiation oncologists delineated GTVs for all patients using dual-modality (MRI/CT) and tri-modality (MRI/CT/PET) image fusion, respectively. Inter-observer variation was assessed by the coefficient of variation (COV), the average distance between surface and centroid (ADSC), and the local standard deviation (SDlocal). Analysis of COV was also performed to evaluate intra-observer volume variation. The inter-observer variation analysis showed that the mean COV was 0.14 (± 0.09) and 0.07 (± 0.01) for dual-modality and tri-modality, respectively; the standard deviation of ADSC was significantly reduced (p<0.05) with tri-modality; SDlocal averaged over the median GTV surface was reduced in patient 2 (from 0.57 cm to 0.39 cm) and patient 3 (from 0.42 cm to 0.36 cm) with the new method. The intra-observer volume variation was also significantly reduced (p = 0.00) with the tri-modality method as compared with the dual-modality method. With the new tri-modality image fusion method, smaller inter- and intra-observer variation in GTV definition for brain tumors can be achieved, which improves the consistency and accuracy of target delineation in individualized radiotherapy.
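The coefficient of variation (COV) used above for inter-observer volume variation can be illustrated with a minimal sketch. It assumes the COV is the sample standard deviation of the observers' GTV volumes divided by their mean; the abstract does not state which SD estimator was used, and the function name and example volumes are hypothetical, not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(volumes):
    """Return SD / mean for one patient's volumes across observers.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    if len(volumes) < 2:
        raise ValueError("need volumes from at least two observers")
    average = mean(volumes)
    if average == 0:
        raise ValueError("mean volume is zero; COV is undefined")
    return stdev(volumes) / average


# Hypothetical GTV volumes (cm^3) drawn by three observers for one patient.
example_volumes = [20.0, 22.0, 24.0]
cov = coefficient_of_variation(example_volumes)
print(round(cov, 4))
```

A lower COV means the observers' volumes cluster more tightly around their mean, which is the direction of change the abstract reports for the tri-modality method.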
Fujimori, Takahito; Iwasaki, Motoki; Nagamoto, Yukitaka; Kashii, Masafumi; Takao, Masaki; Sugiura, Tsuyoshi; Yoshikawa, Hideki
2017-02-01
Reliability and agreement study. To assess the reliability of intraoperative 3-dimensional imaging with a mobile C-arm (3D C-arm) equipped with a flat-panel detector. Pedicle screws are widely used in spinal surgery. Postoperative computed tomography (CT) is the most reliable method to detect screw misplacement. Recent advances in imaging devices have enabled surgeons to acquire 3D images of the spine during surgery. However, the reliability of these imaging devices is not known. A total of 203 screws were used in 22 consecutive patients who underwent surgery for scoliosis. Screw position was read twice with a 3D C-arm and twice with CT in a blinded manner by 2 independent observers. Screw positions were classified into 4 categories at every 2 mm and then into 2 simpler categories of acceptable or unacceptable. The degree of agreement with respect to screw positions between the double readings was evaluated by κ value. With unanimous agreement between 2 observers regarding postoperative CT readings considered the gold standard, the sensitivity of the 3D C-arm for determining screw misplacement was calculated. A total of 804 readings were performed. For the 4-category classification, the mean κ value for the 2 interobserver readings was 0.52 for the 3D C-arm and 0.46 for CT. For the 2-category classification, the mean κ value for the 2 interobserver readings was 0.80 for the 3D C-arm and 0.66 for CT. The sensitivity, specificity, positive predictive value, and negative predictive value of intraoperative imaging with the 3D C-arm were 70%, 95%, 44%, and 98%, respectively. With respect to screws with perforation ≥4 mm, the sensitivity was 83%. No revision surgery was performed. Intraoperative imaging with a 3D C-arm was reliable for detecting screw misplacement and helpful in decreasing the rate of revision surgery for screw misplacement.
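The agreement and accuracy measures reported above can be sketched as follows. This is a minimal illustration of the standard unweighted Cohen κ for two readers and of sensitivity, specificity, and predictive values from a 2x2 table; the abstract does not say whether a weighted κ was used for the 4-category grading, and the function names, labels, and counts below are hypothetical rather than study data.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen kappa for two readers rating the same items."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must rate the same non-empty set of items")
    n = len(reader_a)
    # Observed proportion of items on which both readers agree.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance from each reader's category frequencies.
    count_a, count_b = Counter(reader_a), Counter(reader_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical readings: two readers grading four screws.
reader_1 = ["acceptable", "acceptable", "unacceptable", "unacceptable"]
reader_2 = ["acceptable", "acceptable", "unacceptable", "acceptable"]
print(cohen_kappa(reader_1, reader_2))

# Hypothetical counts against a reference standard.
print(diagnostic_metrics(tp=8, fp=2, fn=2, tn=88))
```

When misplacement is rare, as it is for screws in this setting, the positive predictive value can be low even with high specificity, because the few false positives are compared against a small number of true positives.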
Schellhaas, Barbara; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger Stephan; Neurath, Markus F; Strobel, Deike
2018-06-07
This pilot study aimed at assessing interobserver agreement with two contrast-enhanced ultrasound (CEUS) algorithms for the diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 55 high-risk patients were assessed independently by three blinded observers with two standardized CEUS algorithms: ESCULAP (Erlanger Synopsis of Contrast-Enhanced Ultrasound for Liver Lesion Assessment in Patients at risk) and ACR-CEUS-LI-RADSv.2016 (American College of Radiology CEUS-Liver Imaging Reporting and Data System). Lesions were categorized according to size and ultrasound contrast enhancement in the arterial, portal-venous and late phase. Interobserver agreement for assessment of enhancement pattern and categorization was compared between both CEUS algorithms. Additionally, diagnostic accuracy for the definitive diagnosis of HCC was compared. Histology and/or CE-MRI and follow-up served as reference standards. 55 patients were included in the study (male/female, 44/11; mean age: 65.9 years). 90.9% had cirrhosis. Histological findings were available in 39/55 lesions (70.9%). Reference standard of the 55 lesions revealed 48 HCCs, 2 intrahepatic cholangiocellular carcinomas (ICCs), and 5 non-HCC-non-ICC lesions. Interobserver agreement was moderate to substantial for arterial phase hyperenhancement (κ = 0.53-0.67), and fair to moderate for contrast washout in the portal-venous or late phase (κ = 0.33-0.53). Concerning the CEUS-based algorithms, the interreader agreement was substantial for the ESCULAP category (κ = 0.64-0.68) and fair for the CEUS-LI-RADS® category (κ = 0.3-0.39). Disagreement between observers was mostly due to different perception of washout. Interobserver agreement is better for ESCULAP than for CEUS-LI-RADS®. This is mostly due to the fact that perception of contrast washout varies between different observers.
However, interobserver agreement is good for arterial phase hyperenhancement, which is the key diagnostic feature for the diagnosis of HCC with CEUS in the cirrhotic liver. © Georg Thieme Verlag KG Stuttgart · New York.
Min, James K; Swaminathan, Rajesh V; Vass, Melissa; Gallagher, Scott; Weinsaft, Jonathan W
2009-01-01
The assessment of coronary stents with present-generation 64-detector row computed tomography scanners that use filtered backprojection and operate at a standard definition of 0.5-0.75 mm (standard definition, SDCT) is limited by imaging artifacts and noise. We evaluated the performance of a novel, high-definition 64-slice CT scanner (HDCT), with improved spatial resolution (0.23 mm) and applied statistical iterative reconstruction (ASIR) for evaluation of coronary artery stents. HDCT and SDCT stent imaging was performed with the use of an ex vivo phantom. HDCT was compared with SDCT with both smooth and sharp kernels for stent intraluminal diameter, intraluminal area, and image noise. Intrastent visualization was assessed with an ASIR algorithm on HDCT scans, compared with the filtered backprojection algorithms by SDCT. Six coronary stents (2.5, 2.5, 2.75, 3.0, 3.5, 4.0 mm) were analyzed by 2 independent readers. Interobserver correlation was high for both HDCT and SDCT. HDCT yielded substantially larger luminal area visualization compared with SDCT, both for smooth (29.4±14.5 versus 20.1±13.0; P<0.001) and sharp (32.0±15.2 versus 25.5±12.0; P<0.001) kernels. Stent diameter was higher with HDCT compared with SDCT, for both smooth (1.54±0.59 versus 1.00±0.50; P<0.0001) and detailed (1.47±0.65 versus 1.08±0.54; P<0.0001) kernels. With detailed kernels, HDCT scans that used algorithms showed a trend toward decreased image noise compared with SDCT-filtered backprojection algorithms. On the basis of this ex vivo study, HDCT provides superior detection of intrastent luminal area and diameter visualization, compared with SDCT. ASIR image reconstruction techniques for HDCT scans enhance the in-stent assessment while decreasing image noise.
Tay, Elton Lik Tong; Yong, Vernon Khet Yau; Lim, Boon Ang; Sia, Stelson; Wong, Elizabeth Poh Ying; Yip, Leonard Wei Leon
2015-01-01
AIM: To determine angle closure agreements between gonioscopy and anterior segment optical coherence tomography (AS-OCT), as well as gonioscopy and spectral domain OCT (SD-OCT). A secondary objective was to quantify inter-observer agreements of AS-OCT and SD-OCT assessments. METHODS: Seventeen consecutive subjects (33 eyes) were recruited from the study hospital's Glaucoma clinic. Gonioscopy was performed by a glaucomatologist masked to OCT results. OCT images were read independently by 2 other glaucomatologists masked to gonioscopy findings as well as each other's analyses of OCT images. RESULTS: In total, 84.8% and 45.5% of scleral spurs were visualized in AS-OCT and SD-OCT images, respectively (P<0.01). The agreement for angle closure between AS-OCT and gonioscopy was fair at k=0.31 (95% confidence interval, CI: 0.03-0.59) and k=0.35 (95% CI: 0.07-0.63) for reader 1 and 2, respectively. The agreement for angle closure between SD-OCT and gonioscopy was fair at k=0.21 (95% CI: 0.07-0.49) and slight at k=0.17 (95% CI: 0.08-0.42) for reader 1 and 2, respectively. The inter-reader agreement for angle closure in AS-OCT images was moderate at 0.51 (95% CI: 0.13-0.88). The inter-reader agreement for angle closure in SD-OCT images was slight at 0.18 (95% CI: 0.08-0.45). CONCLUSION: A significant proportion of scleral spurs were not visualised with SD-OCT imaging, resulting in weaker inter-reader agreements. Identifying other angle landmarks in SD-OCT images will allow more consistent angle closure assessments. Gonioscopy and OCT imaging do not always agree in angle closure assessments but have their own advantages, and should be used together and not exclusively. PMID:25938053
Molecular imaging of malignant tumor metabolism: whole-body image fusion of DWI/CT vs. PET/CT.
Reiner, Caecilia S; Fischer, Michael A; Hany, Thomas; Stolzmann, Paul; Nanz, Daniel; Donati, Olivio F; Weishaupt, Dominik; von Schulthess, Gustav K; Scheffel, Hans
2011-08-01
To prospectively investigate the technical feasibility and performance of image fusion for whole-body diffusion-weighted imaging (wbDWI) and computed tomography (CT) to detect metastases using hybrid positron emission tomography/computed tomography (PET/CT) as reference standard. Fifty-two patients (60 ± 14 years; 18 women) with different malignant tumor disease examined by PET/CT for clinical reasons consented to undergo additional wbDWI at 1.5 Tesla. WbDWI was performed using a diffusion-weighted single-shot echo-planar imaging sequence during free breathing. Images at b = 0 s/mm² and b = 700 s/mm² were acquired and apparent diffusion coefficient (ADC) maps were generated. Image fusion of wbDWI and CT (from the PET/CT scan) was performed, yielding wbDWI/CT fused image data. One radiologist rated the success of image fusion and diagnostic image quality. The presence or absence of metastases on wbDWI/CT fused images was evaluated together with the separate wbDWI and CT images by two different, independent radiologists blinded to results from PET/CT. Detection rate and positive predictive values for diagnosing metastases were calculated. PET/CT examinations were used as reference standard. PET/CT identified 305 malignant lesions in 39 of 52 (75%) patients. WbDWI/CT image fusion was technically successful and yielded diagnostic image quality in 73% and 92% of patients, respectively. Interobserver agreement for the evaluation of wbDWI/CT images was κ = 0.78. WbDWI/CT identified 270 metastases in 43 of 52 (83%) patients. Overall detection rate and positive predictive value of wbDWI/CT were 89% (95% CI, 0.85-0.92) and 94% (95% CI, 0.92-0.97), respectively. WbDWI/CT image fusion is technically feasible in a clinical setting and allows the diagnostic assessment of metastatic tumor disease, detecting nine of 10 lesions as compared with PET/CT. Copyright © 2011 AUR. Published by Elsevier Inc. All rights reserved.
Sumitsuji, Satoru; Ide, Seiko; Siegrist, Patrick T; Salah, Youssef; Yokoi, Kensuke; Yoshida, Masatoki; Awata, Masaki; Yamasaki, Keita; Tachibana, Kouichi; Kaneda, Hideaki; Nanto, Shinsuke; Sakata, Yasushi
2016-07-01
To select the best revascularization strategy a correct understanding of the ischemic territory and the coronary anatomy is crucial. Stress myocardial perfusion single photon emission computed tomography (SPECT) is the gold standard to assess ischemia, however, SPECT has important limitations such as lack of coronary anatomical information or false negative results due to balanced ischemia in multi-vessel disease. Angiographic scores are based on anatomical characteristics of coronary arteries but they lack information on the extent of jeopardized myocardium. Cardiac computed tomography (CCT) has the ability to evaluate the coronary anatomy and myocardium in one sequence, which is theoretically the ideal method to assess the myocardial mass at risk (MMAR) for any target lesion located at any point in the coronary tree. In this study we analyzed MMAR of the three main coronary arteries and three major side branches; diagonal (Dx), obtuse marginal (OM), and posterior descending artery (PDA) in 42 patients with normal coronary arteries using an algorithm based on the Voronoi method. The distribution of MMAR among the three main coronary arteries was 44.3 ± 5.6 % for the left anterior descending artery, 28.2 ± 7.3 % for the left circumflex artery, and 26.8 ± 8.6 % for the right coronary artery. MMAR of the three major side branches was 11.3 ± 3.9 % for the Dx, 12.6 ± 5.2 % for the OM and 10.2 ± 3.4 % for the PDA. Intra- and inter-observer analysis showed excellent correlation (r = 0.97; p < 0.0001 and r = 0.95; p < 0.0001, respectively). In conclusion, CCT-based MMAR assessment is reliable and may offer important information for selection of the optimal revascularization procedure.
Optimized tomography of continuous variable systems using excitation counting
NASA Astrophysics Data System (ADS)
Shen, Chao; Heeres, Reinier W.; Reinhold, Philip; Jiang, Luyao; Liu, Yi-Kai; Schoelkopf, Robert J.; Jiang, Liang
2016-11-01
We propose a systematic procedure to optimize quantum state tomography protocols for continuous variable systems based on excitation counting preceded by a displacement operation. Compared with conventional tomography based on Husimi or Wigner function measurement, the excitation counting approach can significantly reduce the number of measurement settings. We investigate both informational completeness and robustness, and provide a bound of reconstruction error involving the condition number of the sensing map. We also identify the measurement settings that optimize this error bound, and demonstrate that the improved reconstruction robustness can lead to an order-of-magnitude reduction of estimation error with given resources. This optimization procedure is general and can incorporate prior information of the unknown state to further simplify the protocol.
Tumour auto-contouring on 2d cine MRI for locally advanced lung cancer: A comparative study.
Fast, Martin F; Eiben, Björn; Menten, Martin J; Wetscherek, Andreas; Hawkes, David J; McClelland, Jamie R; Oelfke, Uwe
2017-12-01
Radiotherapy guidance based on magnetic resonance imaging (MRI) is currently becoming a clinical reality. Fast 2d cine MRI sequences are expected to increase the precision of radiation delivery by facilitating tumour delineation during treatment. This study compares four auto-contouring algorithms for the task of delineating the primary tumour in six locally advanced (LA) lung cancer patients. Twenty-two cine MRI sequences were acquired using either a balanced steady-state free precession or a spoiled gradient echo imaging technique. Contours derived by the auto-contouring algorithms were compared against manual reference contours. A selection of eight image data sets was also used to assess the inter-observer delineation uncertainty. Algorithmically derived contours agreed well with the manual reference contours (median Dice similarity index: ⩾0.91). Multi-template matching and deformable image registration performed significantly better than feature-driven registration and the pulse-coupled neural network (PCNN). Neither MRI sequence nor image orientation was a conclusive predictor for algorithmic performance. Motion significantly degraded the performance of the PCNN. The inter-observer variability was of the same order of magnitude as the algorithmic performance. Auto-contouring of tumours on cine MRI is feasible in LA lung cancer patients. Despite large variations in implementation complexity, the different algorithms all have relatively similar performance. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
Maroto, A; Illescas, T; Meléndez, M; Arévalo, S; Rodó, C; Peiró, J L; Belfort, M; Cuxart, A; Carreras, E
2017-10-01
To assess the reliability of the interpretation of a new technique for the ultrasound evaluation of the level of neurological lesion in fetuses with myelomeningocele. Observational study including myelomeningocele fetuses, referred to our center for the sonographic assessment of the fetal lower-limb movements, made and recorded by an expert in maternal-fetal medicine and a specialist in rehabilitation. Two observers, with different levels of expertise and blinded to each other's results, interpreted each recorded scan twice. The agreement for the segmental levels assigned between the observers and the gold standard, and the inter-observer and intra-observer reproducibility, were tested using the weighted kappa (wκ) index. Twenty-eight scans were recorded and evaluated. The agreement between the observers and the gold standard remained constant for the expert observer (wκ = 0.82) and increased (from wκ = 0.66 to wκ = 0.72) for the other one. The inter-observer and the intra-observer variability for the expert observer were wκ = 0.72 and wκ = 0.94, respectively. The agreement for the prenatal evaluation of the segmental neurological level was excellent, after a short training period, for observers with different degrees of expertise. The interpretation of this technique is reproducible enough and this supports its value for the prediction of postnatal motor function in myelomeningocele fetuses.
Heineman, Kirsten R; Bos, Arend F; Hadders-Algra, Mijna
2008-04-01
A reliable and valid instrument to assess neuromotor condition in infancy is a prerequisite for early detection of developmental motor disorders. We developed a video-based assessment of motor behaviour, the Infant Motor Profile (IMP), to evaluate motor abilities, movement variability, ability to select motor strategies, movement symmetry, and fluency. The IMP consists of 80 items and is applicable in children from 3 to 18 months. The present study aimed to test intra- and interobserver reliability and concurrent validity of the IMP with the Alberta Infant Motor Scale (AIMS) and Touwen neurological examination. The study group consisted of 40 low-risk term (median gestational age [GA] 40 wks, range 38-42 wks) and 40 high-risk preterm infants (median GA 29.6 wks, range 26-33 wks) with corrected ages 4 to 18 months (31 females, 49 males). Intra- and interobserver agreement of the IMP were satisfactory (Spearman's rho=0.9). Concurrent validity of IMP and AIMS was good (Spearman's rho=0.8, p<0.005). The IMP was able to differentiate between infants with normal neurological condition, simple minor neurological dysfunction (MND), complex MND, and abnormal neurological condition (p<0.005). This means that the IMP may be a promising tool to evaluate neurological integrity during infancy, a suggestion that needs confirmation by means of assessment of larger groups of infants with heterogeneous neurological conditions.
Kassam, A M; Tillotson, L; Schranz, P J; Mandalia, V I
2015-01-01
The aim of the study is to show, on an MRI scan, that the posterior border of the anterior horn of the lateral meniscus (AHLM) could guide tibial tunnel position in the sagittal plane and provide anatomical graft position. One hundred MRI scans with normal cruciate ligaments and no evidence of meniscal injury were analysed. We measured the distance between the posterior border of the AHLM and the midpoint of the ACL by superimposing sagittal images. The mean distance between the posterior border of the AHLM and the ACL midpoint was -0.1 mm (i.e. 0.1 mm posterior to the ACL midpoint). The range was 5 mm to -4.6 mm. The median value was 0.0 mm. The 95% confidence interval was from -0.5 to 0.3 mm. A normal, parametric distribution was observed, and intra- and inter-observer variability showed significant correlation (p<0.05) using Pearson's correlation test (intra-observer) and interclass correlation (inter-observer). The posterior border of the AHLM is a reproducible and anatomical marker for the midpoint of the ACL footprint in the majority of cases. It can be used intra-operatively as a guide for tibial tunnel insertion and graft placement, allowing anatomical reconstruction. There will inevitably be some anatomical variation. Pre-operative MRI assessment of the relationship between AHLM and ACL footprint is advised to improve surgical planning. Level 4.
Klauser, Andrea S; Franz, Magdalena; Arora, Rohit; Feuchtner, Gudrun M; Gruber, Johann; Schirmer, Michael; Jaschke, Werner R; Gabl, Markus F
2010-01-01
We sought to assess vascularity in wrist tenosynovitis by using power Doppler ultrasound (PDUS) and to compare detection of intra- and peritendinous vascularity with that of contrast-enhanced grey-scale ultrasound (CEUS). Twenty-six tendons of 24 patients (nine men, 15 women; mean age ± SD, 54.4 ± 11.8 years) with a clinical diagnosis of tenosynovitis were examined with B-mode ultrasonography, PDUS, and CEUS by using a second-generation contrast agent, SonoVue (Bracco Diagnostics, Milan, Italy) and a low-mechanical-index ultrasound technique. Thickness of synovitis, extent of vascularized pannus, intensity of peritendinous vascularisation, and detection of intratendinous vessels was incorporated in a 3-score grading system (grade 0 to 2). Interobserver variability was calculated. With CEUS, a significantly greater extent of vascularity could be detected than by using PDUS (P < 0.001). In terms of peri- and intratendinous vessels, CEUS was significantly more sensitive in the detection of vascularization compared with PDUS (P < 0.001). No significant correlation between synovial thickening and extent of vascularity could be found (P = 0.089 to 0.097). Interobserver reliability was calculated to be excellent when evaluating the grading score (κ = 0.811 to 1.00). CEUS is a promising tool to detect tendon vascularity with higher sensitivity than PDUS by improved detection of intra- and peritendinous vascularity.
Tournemire, A; Groussolles, M; Ehlinger, V; Lusque, A; Morin, M; Benevent, J B; Arnaud, C; Vayssière, C
2015-08-01
To assess the value of the prenasal thickness to nasal bone length ratio (PT/NBL) for detecting trisomy 21 (T21) after the first trimester. Two examiners blinded to fetal T21 status retrospectively measured prenasal thickness (PT) and nasal bone length (NBL) of T21 and control fetuses at 15-36 weeks' gestational age on two-dimensional images from all T21-screening ultrasounds from November 2010 to April 2013. ROC curve analysis and its diagnostic values determined the best cut-off value for the ratio. Interobserver reproducibility was assessed. Good quality ultrasound profile images were available for 26 fetuses with T21 compared to 91 normal fetuses. The median PT/NBL ratio was 1.28 for T21 and 0.73 for control fetuses (p<0.0001). The PT/NBL ratio performed significantly better (AUC 0.99; 95%CI 0.97-1) than either PT (0.82; 0.73-0.91) or NBL (0.91; 0.85-0.98). The optimal PT/NBL ratio cut-off was 0.98, with a sensitivity of 88.5% [76.2-100%] and a specificity of 100%. Interobserver variability was low. The PT/NBL ratio is a strong marker for detecting T21 in the second and third trimesters, significantly more effective than either indicator alone. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
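The cut-off analysis described above can be illustrated with a minimal sketch. It assumes that a PT/NBL ratio at or above the cut-off counts as screen-positive and that the "best" cut-off maximises Youden's J (sensitivity + specificity - 1); the abstract states neither choice, and the function names are hypothetical.

```python
def sensitivity_specificity(affected, control, cutoff):
    """Sensitivity and specificity when ratio >= cutoff is called positive.

    `affected` and `control` are lists of PT/NBL ratios (assumed inputs).
    """
    if not affected or not control:
        raise ValueError("both groups must contain at least one ratio")
    true_pos = sum(r >= cutoff for r in affected)
    true_neg = sum(r < cutoff for r in control)
    return true_pos / len(affected), true_neg / len(control)


def youden_cutoff(affected, control):
    """Observed ratio that maximises Youden's J (an assumed criterion)."""
    candidates = sorted(set(affected) | set(control))

    def youden_j(cutoff):
        sens, spec = sensitivity_specificity(affected, control, cutoff)
        return sens + spec - 1.0

    # On ties, max() keeps the smallest candidate because the list is sorted.
    return max(candidates, key=youden_j)
```

With real data the two lists would hold the measured ratios of the T21 and control fetuses; the reported cut-off of 0.98 comes from the study itself, not from this sketch.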
Pearson, Richard
2011-03-01
To assess the possibility of estimating the refractive index of rigid contact lenses on the basis of measurements of their back vertex power (BVP) in air and when immersed in liquid. First, a spreadsheet model was used to quantify the magnitude of errors arising from simulated inaccuracies in the variables required to calculate refractive index. Then, refractive index was calculated from in-air and in-liquid measurements of BVP of 21 lenses that had been made in three negative BVPs from materials with seven different nominal refractive index values. The power measurements were made by two operators on two occasions. Intraobserver reliability showed a mean difference of 0.0033±0.0061 (t = 0.544, P = 0.59), interobserver reliability showed a mean difference of 0.0043±0.0061 (t = 0.707, P = 0.48), and the mean difference between the nominal and calculated refractive index values was -0.0010±0.0111 (t = -0.093, P = 0.93). The spreadsheet prediction that low-powered lenses might be subject to greater errors in the calculated values of refractive index was substantiated by the experimental results. This method shows good intra and interobserver reliabilities and can be used easily in a clinical setting to provide an estimate of the refractive index of rigid contact lenses having a BVP of 3 D or more.
Liu, Ying-Buh; Yang, Stephen S; Hsieh, Cheng-Hsing; Lin, Chia-Da; Chang, Shang-Jen
2014-05-01
To evaluate the inter-observer, intra-observer and intra-individual reliability of uroflowmetry and post-void residual urine (PVR) tests in adult men. Healthy volunteers aged over 40 years were enrolled. Every participant underwent two sets of uroflowmetry and PVR tests with a 2-week interval between the tests. The uroflowmetry tests were interpreted by four urologists independently. Uroflowmetry curves were classified as bell-shaped, bell-shaped with tail, obstructive, restrictive, staccato, interrupted and tower-shaped and scored from 1 (highly abnormal) to 5 (absolutely normal). The agreements between the observers, interpretations and tests within individuals were analyzed using kappa statistics and intraclass correlation coefficients. Generalizability theory with decision analysis was used to determine how many observers, tests, and interpretations were needed to obtain an acceptable reliability (> 0.80). Of 108 volunteers, we randomly selected the uroflowmetry results from 25 participants for the evaluation of reliability. The mean age of the studied adults was 55.3 years. The intra-individual and intra-observer reliability on uroflowmetry tests ranged from good to very good. However, the inter-observer reliability on normalcy and specific type of flow pattern were relatively lower. In generalizability theory, three observers were needed to obtain an acceptable reliability on normalcy of uroflow pattern if the patient underwent uroflowmetry tests twice with one observation. The intra-individual and intra-observer reliability on uroflowmetry tests were good while the inter-observer reliability was relatively lower. To improve inter-observer reliability, the definition of uroflowmetry should be clarified by the International Continence Society. © 2013 Wiley Publishing Asia Pty Ltd.
Mokhles, Palwasha; van den Bosch, Annemien E; Vletter-McGhie, Jackie S; Van Domburg, Ron T; Ruys, Titia P E; Kauer, Floris; Geleijnse, Marcel L; Roos-Hesselink, Jolien W
2013-09-01
The twisting motion of the heart has an important role in the function of the left ventricle. Speckle tracking echocardiography is able to quantify left ventricular (LV) rotation and twist. So far this new technique has not been used in congenital heart disease patients. The aim of our study was to investigate the feasibility and the intra- and inter-observer reproducibility of LV rotation parameters in adult patients with congenital heart disease. The study population consisted of 66 consecutive patients seen in the outpatient clinic (67% male, mean age 31 ± 7.7 years, NYHA class 1 ± 0.3) with a variety of congenital heart disease. First, feasibility was assessed in all patients. Intra- and inter-observer reproducibility was assessed for the patients in which speckle tracking echocardiography was feasible. Adequate image quality, for performing speckle echocardiography, was found in 80% of patients. The bias for the intra-observer reproducibility of the LV twist was 0.0°, with 95% limits of agreement of -2.5° and 2.5° and for interobserver reproducibility the bias was 0.0°, with 95% limits of agreement of -3.0° and 3.0°. Intra- and inter-observer measurements showed a strong correlation (0.86 and 0.79, respectively). Also a good repeatability was seen. The mean time to complete full analysis per subject for the first and second measurement was 9 and 5 minutes, respectively. Speckle tracking echocardiography is feasible in 80% of adult patients with congenital heart disease and shows excellent intra- and inter-observer reproducibility. © 2013, Wiley Periodicals, Inc.
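The bias and 95% limits of agreement reported above are the usual Bland-Altman quantities. A minimal sketch follows, assuming the limits are computed as bias ± 1.96 × SD of the paired differences (the abstract does not spell out the formula); the function name is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes limits of agreement = bias +/- 1.96 * sample SD of differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

For LV twist, `first` and `second` would be the two sets of twist angles in degrees, measured either by the same observer twice or by two observers.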
Wu, Ziqiang; Lin, Jialiu; Huang, Jingjing
2015-01-01
Purpose: To describe a novel method for quantitative measurement of area parameters in ocular anterior segment ultrasound biomicroscopy (UBM) images using Photoshop software and to assess its intraobserver and interobserver reproducibility. Methods: Twenty healthy volunteers with wide angles and twenty patients with narrow or closed angles were consecutively recruited. UBM images were obtained and analyzed using Photoshop software by two physicians with different levels of training on two occasions. Borders of anterior segment structures including cornea, iris, lens, and zonules in the UBM image were semi-automatically defined by the Magnetic Lasso Tool in the Photoshop software according to the pixel contrast and modified by the observers. Anterior chamber area (ACA), posterior chamber area (PCA), iris cross-section area (ICA) and angle recess area (ARA) were drawn and measured. The intraobserver and interobserver reproducibilities of the anterior segment area parameters and scleral spur location were assessed by limits of agreement, coefficient of variation (CV), and intraclass correlation coefficient (ICC). Results: All of the parameters were successfully measured by Photoshop. The intraobserver and interobserver reproducibilities of ACA, PCA, and ICA were good, with a CV of no more than 5% and an ICC of more than 0.95, while the CVs of ARA were within 20%. The intraobserver and interobserver ICCs for defining the spur location were more than 0.97. Although the operating times for both observers were less than 3 minutes per image, there was a significant difference in the measuring time between the two observers with different levels of training (p<0.001). Conclusion: Measurements of ocular anterior segment areas on UBM images by Photoshop showed good intraobserver and interobserver reproducibilities. The methodology was easy to adopt and effective in measuring. PMID: 25803857
Intra- and interobserver agreement for fetal cerebral measurements in 3D-ultrasonography.
Albers, Maria E W A; Buisman, Erato T I A; Kahn, René S; Franx, Arie; Onland-Moret, N Charlotte; de Heus, Roel
2018-04-10
The aim of this study is to evaluate intra- and interobserver agreement for measurement of intracranial, cerebellar, and thalamic volume with the Virtual Organ Computer-aided AnaLysis (VOCAL) technique in three-dimensional ultrasound images, in comparison to two-dimensional measurements of these brain structures. Three-dimensional ultrasound images of the brains of 80 fetuses at 20-24 weeks' gestational age were obtained from YOUth, a Dutch prospective cohort study. Two observers performed offline measurement of the occipitofrontal diameter, intracranial volume, transcerebellar diameter, cerebellar volume, and thalamic width, area, and volume, independently. VOCAL was used for calculation of the volumes. The two-way random, single measures intraclass correlation coefficient (ICC) was used for analysis of agreement and Bland-Altman plots were configured. Intra- and interobserver agreement was almost perfect for occipitofrontal diameter (intra ICC 0.88, 95% CI 0.82-0.92; inter ICC 0.91, 95% CI 0.85-0.94), intracranial volume (intra ICC 0.96, 95% CI 0.91-0.98; inter ICC 0.97, 95% CI 0.96-0.98) and transcerebellar diameter (intra ICC 0.91, 95% CI 0.86-0.94; inter ICC 0.86, 95% CI 0.78-0.910). For cerebellar volume, the intraobserver agreement was almost perfect (0.85, 95% CI 0.76-0.90), whereas the interobserver agreement was substantial (0.75, 95% CI 0.44-0.88). Agreement was only moderate for thalamic measurements. Bland-Altman plots for the volume measurements are normally distributed with acceptable mean differences and 95% limits of agreement. The intra- and interobserver agreement of the measurement of intracranial and cerebellar volume with VOCAL was almost perfect. These measurements are therefore reliable, and can be used to investigate fetal brain development. Thalamic measurements are not reliable enough. © 2018 Wiley Periodicals, Inc.
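The two-way random, single measures ICC named above can be sketched from the classical ANOVA mean squares. This sketch assumes the absolute-agreement form, often written ICC(2,1); the abstract does not say whether agreement or consistency was used, and the function name is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    `ratings` holds one row per subject and one column per observer.
    The result is undefined (division by zero) if every value is identical.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table, >= 2 subjects, >= 2 observers")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for the two-way layout without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)                 # between-subject mean square
    ms_cols = ss_cols / (k - 1)                 # between-observer mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

In the study's setting each row would be one fetus and each column one observer's volume measurement.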
McGivney, C L; Sweeney, J; David, F; O'Leary, J M; Hill, E W; Katz, L M
2017-07-01
Previous studies support good intra- and interobserver agreements for endoscopic evaluation of various upper respiratory tract (URT) diseases in horses. However, these studies mainly assessed resting endoscopic examination videos and/or focussed on a single URT abnormality. To estimate intra- and interobserver agreement for identification and grading of all URT abnormalities from resting and overground endoscopy (OGE) videos of Thoroughbreds. Blinded, fully crossed design. Resting and OGE URT videos for n = 43 Thoroughbreds were retrospectively chosen based on identification of common URT disorders. The videos were randomly evaluated in duplicate by 4 raters blinded to all information including prior URT disorder(s) diagnosis. Abnormalities were graded using well-described ordinal scales. Intra- and interobserver agreements were estimated using Cohen's weighted κ and Krippendorff's α, respectively. Intraobserver agreement was perfect/nearly perfect for arytenoid symmetry at exercise, epiglottic entrapment and epiglottic retroversion, substantial for arytenoid asymmetry at rest, palatal dysfunction (PD), medial deviation of the aryepiglottic folds (MDAF), pharyngeal mucus and epiglottic grade at exercise and moderate for vocal fold collapse (VFC), ventromedial luxation of the apex of the corniculate process of the arytenoid (VLAC), nasopharyngeal collapse (NPC) and epiglottic grade at rest. Interobserver agreement was substantial for arytenoid symmetry at exercise and PD and moderate for arytenoid asymmetry at rest, MDAF, VLAC and epiglottic entrapment. It was only fair for VFC, epiglottic grade at exercise, epiglottic retroversion, pharyngeal mucus and NPC and poor for epiglottic grade at rest. Sample size was insufficient to allow assessment of the effect of one abnormality on the grading of another abnormality. Observers were consistent in grading URT disorders. However, significant disparity in grading existed between observers for some conditions affecting reliability. © 2016 EVJ Ltd.
Detection of MET amplification in gastroesophageal tumor specimens using IQFISH.
Jørgensen, Jan Trøst; Nielsen, Karsten Bork; Mollerup, Jens; Jepsen, Anna; Go, Ning
2017-12-01
The gene mesenchymal epithelial transition factor (MET) is a proto-oncogene that encodes a transmembrane receptor with intrinsic tyrosine kinase activity known as Met or cMet. MET is found to be amplified in several human cancers including gastroesophageal cancer. Here we report the MET amplification prevalence data from 159 consecutive tumor specimens from patients with gastric (G), gastroesophageal junction (GEJ) and esophageal (E) adenocarcinoma, using a novel fluorescence in situ hybridization (FISH) assay, MET/CEN-7 IQFISH Probe Mix [an investigational use only (IUO) assay]. MET amplification was defined as a MET/CEN-7 ratio ≥2.0. Furthermore, the link between the MET signal distribution and amplification status was investigated. The prevalence of MET amplification was found to be 6.9%. The FISH assay demonstrated a high inter-observer reproducibility. The inter-observer results showed a 100% overall agreement with respect to the MET status (amplified/non-amplified). The inter-observer CV was estimated to be 11.8% (95% CI: 10.2-13.4). For the signal distribution, the inter-observer agreement was reported to be 98.7%. We also report an association of MET amplification and a unique signal distribution pattern in the G/GEJ/E tumor specimens. We found that the prevalence of MET amplification was markedly higher in tumor specimens with a heterogeneous (66.7%) versus homogeneous (2.0%) signal distribution. Furthermore, specimens with a heterogeneous signal distribution had a statistically significantly higher median MET/CEN-7 ratio (2.35 versus 1.04; P<0.0001). The novel FISH assay showed a high inter-observer reproducibility both with respect to amplification status and signal distribution. Based on the findings of the study, it is suggested that MET amplification is mainly associated with tumor cells that are represented by a heterogeneous growth pattern.
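The amplification rule stated above (MET/CEN-7 ratio ≥2.0) can be written as a short sketch. It assumes the ratio is the total MET signal count divided by the total CEN-7 signal count over the scored nuclei; the abstract does not describe the counting protocol, and the names below are hypothetical.

```python
AMPLIFICATION_THRESHOLD = 2.0  # cut-off taken from the abstract


def met_cen7_ratio(met_signals, cen7_signals):
    """Ratio of summed MET signals to summed CEN-7 signals (assumed inputs)."""
    if cen7_signals <= 0:
        raise ValueError("CEN-7 signal count must be positive")
    return met_signals / cen7_signals


def is_amplified(met_signals, cen7_signals):
    """True when the ratio meets or exceeds the 2.0 threshold."""
    return met_cen7_ratio(met_signals, cen7_signals) >= AMPLIFICATION_THRESHOLD
```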
Accelerometric gait analysis for use in hospital outpatients.
Auvinet, B; Chaleil, D; Barrey, E
1999-01-01
To provide clinicians with a quantitative human gait analysis tool suitable for routine use. We evaluated the reproducibility, sensitivity, and specificity of gait analysis based on measurements of acceleration at a point near the center of gravity of the body. Two accelerometers held over the middle of the low back by a semi-elastic belt were used to record craniocaudal and side-to-side accelerations at a frequency of 50 Hz. Subjects were asked to walk at their normal speed to the end of a straight 40 meter-long hospital corridor and back. A 20-second period of stabilized walking was used to calculate cycle frequency, stride symmetry, and stride regularity. Symmetry and regularity were each derived from an auto-correlation coefficient; to convert their distribution from nonnormal to normal, Fisher's Z transformation was applied to the auto-correlation coefficients for these two variables. Intraobserver reproducibility was evaluated by asking the same observer to test 16 controls on three separate occasions at two-day intervals and interobserver reproducibility by asking four different observers to each test four controls (Latin square). Specificity and sensitivity were determined by testing 139 controls and 63 patients. The 139 controls (70 women and 69 men) were divided into five age groups (third through seventh decades of life). The 63 patients had a noninflammatory musculoskeletal condition predominating on one side. ROC curves were used to determine the best cutoffs for separating normal from abnormal values. Neither intra- nor interobserver variability was significant (P > 0.05). Cycle frequency was significantly higher in female than in male controls (1.05 +/- 0.06 versus 0.98 +/- 0.05 cycles/s; P < 0.001). Neither symmetry nor regularity was influenced by gender in the controls; both variables were also unaffected by age, although nonsignificant decreases were found in the 61 to 70-year age group, which included only nine subjects. In the ROC curve analysis, the area under the curve was high for all three variables (frequency, 0.81 +/- 0.04; symmetry, 0.85 +/- 0.03; and regularity, 0.88 +/- 0.03), establishing that there was a good compromise between sensitivity and specificity. Our gait analysis method offers satisfactory reproducibility and is sufficiently sensitive and specific to be used by clinicians in the quantitative evaluation of gait abnormalities.
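Fisher's Z transformation, which the abstract applies to the symmetry and regularity auto-correlation coefficients, is z = atanh(r). A minimal sketch follows; the function names are hypothetical and the study's own processing pipeline is not described in the abstract.

```python
import math


def fisher_z(r):
    """Fisher's Z transformation of a correlation coefficient r in (-1, 1)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)  # equals 0.5 * ln((1 + r) / (1 - r))


def inverse_fisher_z(z):
    """Map a transformed value back to the correlation scale."""
    return math.tanh(z)
```

Group means and comparisons would then be computed on the transformed values and mapped back with `inverse_fisher_z` for reporting.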
Dorniak, Karolina; Heiberg, Einar; Hellmann, Marcin; Rawicz-Zegrzda, Dorota; Wesierska, Maria; Galaska, Rafal; Sabisz, Agnieszka; Szurowska, Edyta; Dudziak, Maria; Hedström, Erik
2016-05-26
Pulse wave velocity (PWV) is a biomarker for arterial stiffness, clinically assessed by applanation tonometry (AT). Increased use of phase-contrast cardiac magnetic resonance (CMR) imaging allows for PWV assessment with minor routine protocol additions. The aims were to investigate the acquired temporal resolution needed for accurate and precise measurements of CMR-PWV, and to develop a tool for CMR-PWV measurements. Computer phantoms were generated for PWV = 2-20 m/s based on human CMR-PWV data. The PWV measurements were performed in 13 healthy young subjects and 13 patients at risk for cardiovascular disease. The CMR-PWV was measured by through-plane phase-contrast CMR in the ascending aorta and at the diaphragm level. Centre-line aortic distance was determined between flow planes. The AT-PWV was assessed within 2 h after CMR. Three observers (CMR experience: 15, 4, and <1 year) determined CMR-PWV. The developed tool was based on the flow-curve foot transit time for PWV quantification. Computer phantoms showed bias 0.27 ± 0.32 m/s for a temporal resolution of at least 30 ms. Intraobserver variability for CMR-PWV was 0 ± 0.03 m/s (15 years), -0.04 ± 0.33 m/s (4 years), and -0.02 ± 0.30 m/s (<1 year). Interobserver variability for CMR-PWV was below 0.02 ± 0.38 m/s. The AT-PWV overestimated CMR-PWV by 1.1 ± 0.7 m/s in healthy young subjects and 1.6 ± 2.7 m/s in patients. An acquired temporal resolution of at least 30 ms should be used to obtain accurate and precise thoracic aortic phase-contrast CMR-PWV. A new freely available research tool was used to measure PWV in healthy young subjects and in patients, showing low intra- and interobserver variability also for less experienced CMR observers.
Relationship between Two Types of Coil Packing Densities Relative to Aneurysm Size.
Park, Keun Young; Kim, Byung Moon; Ihm, Eun Hyun; Baek, Jang Hyun; Kim, Dong Joon; Kim, Dong Ik; Huh, Seung Kon; Lee, Jae Whan
2015-01-01
Coil packing density (PD) can be calculated via a formula (PDF) or software (PDS). The two types of PD can differ for the same aneurysm. This study aimed to evaluate the interobserver agreement and relationships between the 2 types of PD relative to aneurysm size. A total of 420 consecutive saccular aneurysms were treated with coiling. PD (PDF, [coil volume]/[volume calculated by formula] and PDS, [coil volume]/[volume measured by software]) was calculated and prospectively recorded. Interobserver agreement was evaluated for PDF and PDS. Additionally, the relationships between PDF and PDS relative to aneurysm size were subsequently analyzed. Interobserver agreement for PDF and PDS was excellent (intraclass correlation coefficient: PDF, 0.967; PDS, 0.998). The ratio of PDF to PDS was greater for smaller aneurysms and converged toward 1.0 as the maximum dimension (DM) of the aneurysm increased. Compared with PDS, PDF was overestimated by a mean of 28% for DM < 5 mm, by 17% for 5 mm ≤ DM < 10 mm, and by 9% for DM ≥ 10 mm (P < 0.01). Interobserver agreement for PDF and PDS was excellent. However, PDF was overestimated in smaller aneurysms and converged to PDS as aneurysm size increased. Copyright © 2014 by the American Society of Neuroimaging.
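Both packing densities defined above share the same numerator and differ only in how the aneurysm volume is obtained. The sketch below assumes the formula-based volume is that of an ellipsoid built from three orthogonal diameters; the abstract does not name the formula, so this choice and the function names are hypothetical.

```python
import math


def ellipsoid_volume_mm3(d1_mm, d2_mm, d3_mm):
    """Volume of an ellipsoid from three diameters (assumed formula model)."""
    return math.pi / 6.0 * d1_mm * d2_mm * d3_mm


def packing_density(coil_volume_mm3, aneurysm_volume_mm3):
    """Packing density as the fraction coil volume / aneurysm volume."""
    if aneurysm_volume_mm3 <= 0:
        raise ValueError("aneurysm volume must be positive")
    return coil_volume_mm3 / aneurysm_volume_mm3
```

PDF would use `ellipsoid_volume_mm3(...)` as the denominator, while PDS would use the volume reported by the measurement software for the same aneurysm.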
Roma, Andres A; Liu, Xiuli; Patil, Deepa T; Xie, Hao; Allende, Daniela
2017-07-01
To analyze interobserver reproducibility and compare practice patterns between academic and community settings of Lower Anogenital Squamous Terminology (LAST). In total, 132 anal biopsy slides were revised as well as p16 immunostains. LAST was used in 49% of cases (academic center, 68%; satellite hospitals [community practice setting], 32%). After pathology review and consensus interpretation, 23 (17%) case diagnoses were reclassified: eight (34.8%) cases (benign or low-grade squamous intraepithelial lesion [LSIL]) were upgraded to high-grade squamous intraepithelial lesion (HSIL) (p16 confirmed ordered during review); four (17.4%) cases originally classified as HSIL were downgraded to LSIL (p16 originally ordered in one case). There was no significant difference in discrepancies between original and consensus diagnosis in the community vs academic setting or by subspecialty (gynecological vs gastrointestinal). Overall interobserver agreement among reviewers was substantial (κ = 0.63) and improved with the use of p16 immunostain in challenging cases (κ = 0.71; P < .001). This new terminology is not yet uniformly used by pathologists in anal/perianal biopsy specimens; this two-tier system has a good interobserver agreement and is further improved with p16 use in appropriate cases. © American Society for Clinical Pathology, 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina
2016-06-01
We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observations by one observer, such as often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend the protocol to be used for educational purposes only.
Dwyer, Tim; Whelan, Daniel B; Khoshbin, Amir; Wasserstein, David; Dold, Andrew; Chahal, Jaskarndip; Nauth, Aaron; Murnaghan, M Lucas; Ogilvie-Harris, Darrell J; Theodoropoulos, John S
2015-04-01
The objective of this study was to establish the intra- and inter-observer reliability of hamstring graft measurement using cylindrical sizing tubes. Hamstring tendons (gracilis and semitendinosus) were harvested from ten cadavers by a single surgeon and whip stitched together to create ten 4-strand hamstring grafts. Ten sports medicine surgeons and fellows sized each graft independently using either hollow cylindrical sizers or block sizers in 0.5-mm increments; the sizing technique used was applied consistently to each graft. Surgeons moved sequentially from graft to graft and measured each hamstring graft twice. Surgeons were asked to state the measured proximal (femoral) and distal (tibial) diameter of each graft, as well as the diameter of the tibial and femoral tunnels that they would drill if performing an anterior cruciate ligament (ACL) reconstruction using that graft. Reliability was established using intra-class correlation coefficients. Overall, both the inter-observer and intra-observer agreement were >0.9, demonstrating excellent reliability. The inter-observer reliability for drill sizes was also excellent (>0.9). Excellent correlation was seen between cylindrical sizing and drill sizes (>0.9). Sizing of hamstring grafts by multiple surgeons demonstrated excellent inter-observer and intra-observer reliability, potentially validating clinical studies exploring ACL reconstruction outcomes by hamstring graft diameter when standard techniques are used. III.
MAUDGIL, D. D.; FREE, S. L.; SISODIYA, S. M.; LEMIEUX, L.; WOERMANN, F. G.; FISH, D. R.; SHORVON, S. D.
1998-01-01
Guided by a review of the anatomical literature, 36 sulci on the human cerebral cortical surface were designated as homologous. These sulci were assessed for visibility on 3-dimensional images reconstructed from magnetic resonance imaging scans of the brains of 20 normal volunteers by 2 independent observers. Those sulci that were found to be reproducibly identifiable were used to define 24 landmarks around the cortical surface. The interobserver and intraobserver variabilities of measurement of the 24 landmarks were calculated. These reliably reproducible landmarks can be used for detailed morphometric analysis, and may prove helpful in the analysis of suspected cerebral cortical structured abnormalities in patients with such conditions as epilepsy. PMID:10029189
He, Xiaowei; Liang, Jimin; Wang, Xiaorui; Yu, Jingjing; Qu, Xiaochao; Wang, Xiaodong; Hou, Yanbin; Chen, Duofang; Liu, Fang; Tian, Jie
2010-11-22
In this paper, we present an incomplete variables truncated conjugate gradient (IVTCG) method for bioluminescence tomography (BLT). Considering the sparse characteristic of the light source and insufficient surface measurement in the BLT scenarios, we combine a sparseness-inducing (ℓ1 norm) regularization term with a quadratic error term in the IVTCG-based framework for solving the inverse problem. By limiting the number of variables updated at each iterative and combining a variable splitting strategy to find the search direction more efficiently, it obtains fast and stable source reconstruction, even without a priori information of the permissible source region and multispectral measurements. Numerical experiments on a mouse atlas validate the effectiveness of the method. In vivo mouse experimental results further indicate its potential for a practical BLT system.
Lupidi, Marco; Coscas, Florence; Cagini, Carlo; Fiore, Tito; Spaccini, Elisa; Fruttini, Daniela; Coscas, Gabriel
2016-09-01
To describe a new automated quantitative technique for displaying and analyzing macular vascular perfusion using optical coherence tomography angiography (OCT-A) and to determine a normative data set, which might be used as reference in identifying progressive changes due to different retinal vascular diseases. Reliability study. A retrospective review of 47 eyes of 47 consecutive healthy subjects imaged with a spectral-domain OCT-A device was performed in a single institution. Full-spectrum amplitude-decorrelation angiography generated OCT angiograms of the retinal superficial and deep capillary plexuses. A fully automated custom-built software was used to provide quantitative data on the foveal avascular zone (FAZ) features and the total vascular and avascular surfaces. A comparative analysis between central macular thickness (and volume) and FAZ metrics was performed. Repeatability and reproducibility were also assessed in order to establish the feasibility and reliability of the method. The comparative analysis between the superficial capillary plexus and the deep capillary plexus revealed a statistically significant difference (P < .05) in terms of FAZ perimeter, surface, and major axis and a not statistically significant difference (P > .05) when considering total vascular and avascular surfaces. A linear correlation was demonstrated between central macular thickness (and volume) and the FAZ surface. Coefficients of repeatability and reproducibility were less than 0.4, thus demonstrating high intraobserver repeatability and interobserver reproducibility for all the examined data. A quantitative approach on retinal vascular perfusion, which is visible on Spectralis OCT angiography, may offer an objective and reliable method for monitoring disease progression in several retinal vascular diseases. Copyright © 2016 Elsevier Inc. All rights reserved.
Gür Güngör, Sirel; Akman, Ahmet; Sarıgül Sezenöz, Almila; Tanrıaşıkı, Gülşah
2016-12-01
The presence of retinal nerve fiber layer (RNFL) split bundles was recently described in normal eyes scanned using scanning laser polarimetry and by histologic studies. Split bundles may resemble RNFL loss in healthy eyes. The aim of our study was to determine the prevalence of nerve fiber layer split bundles in healthy people. We imaged 718 eyes of 359 healthy persons with spectral domain optical coherence tomography in this cross-sectional study. All eyes had intraocular pressure of 21 mmHg or less, normal appearance of the optic nerve head, and normal visual fields (Humphrey Field Analyzer 24-2 full threshold program). In our study, a bundle was defined as 'split' when there was a localized defect not resembling a wedge defect in the RNFL deviation map with a symmetrically divided RNFL appearance on the RNFL thickness map. The classification was performed by two independent observers who used an identical set of reference examples to standardize the classification. Inter-observer consensus was reached in all cases. Bilateral superior split bundles were seen in 19 cases (5.29%) and unilateral superior split was observed in 15 cases (4.16%). In 325 cases (90.52%) there was no split bundle. Split nerve fiber layer bundles, in contrast to single nerve fiber layer bundles, are not common findings in healthy eyes. In eyes with normal optic disc appearance, especially when a superior RNFL defect is observed in the RNFL deviation map, the RNFL thickness map and graphs should also be examined for split nerve fiber layer bundles.
DOE R&D Accomplishments Database
Phelps, M. E.; Hoffman, E. J.; Huang, S. C.; Schelbert, H. R.; Kuhl, D. E.
1978-01-01
Emission computed tomography can provide a quantitative in vivo measurement of regional tissue radionuclide tracer concentrations. This facility when combined with physiologic models and radioactively labeled physiologic tracers that behave in a predictable manner allow measurement of a wide variety of physiologic variables. This integrated technique has been referred to as Physiologic Tomography (PT). PT requires labeled compounds which trace physiologic processes in a known and predictable manner, and physiologic models which are appropriately formulated and validated to derive physiologic variables from ECT data. In order to effectively achieve this goal, PT requires an ECT system that is capable of performing truly quantitative or analytical measurements of tissue tracer concentrations and which has been well characterized in terms of spatial resolution, sensitivity and signal to noise ratios in the tomographic image. This paper illustrates the capabilities of emission computed tomography and provides examples of physiologic tomography for the regional measurement of cerebral and myocardial metabolic rate for glucose, regional measurement of cerebral blood volume, gated cardiac blood pools and capillary perfusion in brain and heart. Studies on patients with stroke and myocardial ischemia are also presented.
McIver, Kerry L.; Brown, William H.; Pfeiffer, Karin A.; Dowda, Marsha; Pate, Russell R.
2016-01-01
Purpose This study describes the development and pilot testing of the Observational System for Recording Physical Activity-Elementary School (OSRAC-E) version. Methods This system was developed to observe and document the levels and types of physical activity and physical and social contexts of physical activity in elementary school students during the school day. Inter-observer agreement scores and summary data were calculated. Results All categories had Kappa statistics above 0.80, with the exception of the activity initiator category. Inter-observer agreement scores were 96% or greater. The OSRAC-E was shown to be a reliable observation system that allows researchers to assess physical activity behaviors, the contexts of those behaviors, and the effectiveness of physical activity interventions in the school environment. Conclusion The OSRAC-E can yield data with high interobserver reliability and provide relatively extensive contextual information about physical activity of students in elementary schools. PMID:26889587
Direct state tomography using continuous variable measuring device
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, Xuanmin; Wei, Qun
Compared with the conventional quantum state tomography (QST), the efficiency of the direct state tomography (DST) using weak value is very low. However, DST is easily manipulated in experiments. We modify the direct state tomography by using coupling-deformed observables. The modified direct state measurement is valid for arbitrarily large measurement strength. The optimal measurement strengths are obtained to attain the highest efficiency. The efficiency of DST is significantly improved in the modified strategy, and the reconstructed state has no inherent bias. The state reconstruction strategy investigated in this paper might be useful in actual experiments.
Dückelmann, A M; Bamberg, C; Michaelis, S A M; Lange, J; Nonnenmacher, A; Dudenhausen, J W; Kalache, K D
2010-02-01
To assess whether ultrasound experience or fetal head station affects the reliability of measurement of fetal head descent using the angle of progression on intrapartum ultrasound images obtained by a single experienced operator, and to determine reliability of measurements when images were acquired by different operators with variable ultrasound experience. One experienced obstetrician performed 44 transperineal ultrasound examinations of women at term and in prolonged second stage of labor with the fetus in the occipitoanterior position. Three midwives without ultrasound experience, three obstetricians with < 5 years' experience and three obstetricians with > 10 years' experience measured fetal head descent based on the angle of progression in the images obtained. The angle of progression was measured by two obstetricians in independent ultrasound examinations of 24 laboring women at term with the fetus in the cephalic position to allow assessment of the reliability of image acquisition. Intraclass correlation coefficients (ICCs) with 95% confidence interval (CI) were used to evaluate interobserver reliability and Bland-Altman analysis was used to assess interobserver agreement. In total, 444 measurements were performed and compared. Interobserver reliability with respect to offline image analysis was substantial (overall ICC, 0.72; 95% CI, 0.63-0.81). ICCs were 0.82 (95% CI, 0.70-0.89), 0.81 (95% CI, 0.71-0.88) and 0.61 (95% CI, 0.43-0.74) for observers with > 10 years', < 5 years' and no ultrasound experience, respectively. There were no significant differences between ICCs among observer groups according to ultrasound experience. Fetal head station did not affect reliability. Bland-Altman analysis indicated reasonable agreement between measurements obtained by two different operators with > 10 years' and < 5 years' ultrasound experience (bias, -1.09 degrees; 95% limits of agreement, -8.76 to 6.58).
The reliability of measurement of the angle of progression following separate image acquisition by two experienced operators was similar to the reliability of offline image analysis (ICC, 0.86; 95% CI, 0.70-0.93). Measurement of the angle of progression on transperineal ultrasound imaging is reliable regardless of fetal head station or the clinician's level of ultrasound experience.
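The bias and 95% limits of agreement reported above follow the usual Bland-Altman construction: the mean of the paired differences plus or minus 1.96 standard deviations. The sketch below illustrates that calculation under the assumption that the conventional 1.96 multiplier was used; the function names and the sample measurements are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(first: list[float], second: list[float]) -> tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences (first - second)
    limits = bias +/- 1.96 * sample SD of the differences
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def implied_sd(lower_limit: float, upper_limit: float) -> float:
    """SD of the differences implied by reported 95% limits of agreement."""
    return (upper_limit - lower_limit) / (2 * 1.96)


if __name__ == "__main__":
    # Hypothetical angle-of-progression readings (degrees) from two operators.
    operator_a = [100.0, 110.0, 120.0, 130.0]
    operator_b = [102.0, 109.0, 123.0, 131.0]
    bias, lower, upper = bland_altman(operator_a, operator_b)
    print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")

    # Limits reported in the abstract: -8.76 to 6.58 degrees.
    print(f"implied SD of differences = {implied_sd(-8.76, 6.58):.2f} degrees")
```

As a consistency check, the midpoint of the reported limits, (-8.76 + 6.58) / 2 = -1.09, matches the reported bias.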
Báez-Saldaña, Renata; López-Arteaga, Yesenia; Bizarrón-Muro, Alma; Ferreira-Guerrero, Elizabeth; Ferreyra-Reyes, Leticia; Delgado-Sánchez, Guadalupe; Cruz-Hervert, Luis Pablo; Mongua-Rodríguez, Norma; García-García, Lourdes
2013-01-01
Background Despite chemotherapy, patients with cured pulmonary tuberculosis may develop lung functional impairment. Objective To evaluate a novel scoring system based on the degree of radiographic abnormalities and related spirometric values in patients with cured pulmonary tuberculosis. Methods One hundred and twenty seven patients with cured pulmonary tuberculosis were prospectively enrolled in a referral hospital specializing in respiratory diseases. Spirometry was performed and the extent of radiographic abnormalities was evaluated twice by each of two readers to generate a novel quantitative score. Scoring reproducibility was analyzed by the intra-class correlation coefficient (ICC) and the Bland-Altman method. Multiple linear regression models were performed to assess the association of the extent of radiographic abnormalities with spirometric values. Results The intra-observer agreement for scoring of radiographic abnormalities (SRA) showed an ICC of 0.81 (CI:95%, 0.67–0.95) and 0.78 (CI:95%, 0.65–0.92), for reader 1 and 2, respectively. Inter-observer reproducibility for the first measurement was 0.83 (CI:95%, 0.71–0.95), and for the second measurement was 0.74 (CI:95%, 0.58–0.90). The Bland-Altman analysis of the intra-observer agreement showed a mean bias of 0.87% and -0.55% and an inter-observer agreement of -0.35% and -1.78%, indicating a minor average systematic variability. After adjustment for age, gender, height, smoking status, pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and negatively associated with absolute and percent predicted values of FVC: -0.07 (CI:95%, -0.01 to -0.04); -2.48 (CI:95%, -3.45 to -1.50); and FEV1 -0.07 (CI:95%, -0.10 to -0.05); -2.92 (CI:95%, -3.87 to -1.97) respectively, in the patients studied. 
Conclusion The extent of radiographic abnormalities, as evaluated through our novel scoring system, was inversely associated with spirometric values, and exhibited good reliability and reproducibility. As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable. PMID:24223865
McMillan, Matthew; Brearley, Jacqueline
2013-05-01
To evaluate the interobserver variability in the assignment of the American Society of Anesthesiologists Physical Status Classification (ASA-PSC) to compromised small animal patients amongst a group of veterinary anaesthetists. Anonymous internet survey. Hypothetical case presentations. Sixteen hypothetical small animal cases with differing degrees of physiological or patho-physiological compromise were presented as part of an internet survey. Respondents were asked to assign a single ASA-PSC to each case and also to answer a number of demographic questions. ASA-PSC scores were considered separately and then grouped as scores of I-II and III-V. Agreement was analysed using the modified kappa statistic for multiple observers. Data were then sorted into various demographic groups for further analysis. There were 144 respondents of which 60 (~42%) were anaesthesia diplomates, 24 (~17%) were post-residency (nondiploma holders), 24 (~17%) were current anaesthesia residents, 21 (~15%) were general practitioners, 12 (~8%) were veterinary nurses or technicians, and 3 (~2%) were interns. Although there was a majority agreement (>50% in a single category) in 15 of the 16 cases, ASA-PSC were spread over at least three ASA-PS classifications for every case. Overall agreement was considered only fair (κ = 0.24, mean ± SD agreement 46 ± 7%). When comparing grouped data (ASA-PSC I-II versus III-V) overall agreement remained fair (κ = 0.36, mean ± SD agreement 69 ± 19%). There was no difference in ASA-PSC assignment between any of the demographic groups investigated. This study suggests major discrepancies can occur between observers given identical information when using the ASA-PSC to categorise health status in compromised small animal patients. The significant potential for interobserver variability in classification allocation should be borne in mind when the ASA-PSC is used for clinical, scientific and statistical purposes. © 2013 The Authors. 
Veterinary Anaesthesia and Analgesia © 2013 Association of Veterinary Anaesthetists and the American College of Veterinary Anesthesia and Analgesia.
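Agreement among many raters who each assign one category per case, as in the survey above, is commonly summarised with a multi-rater kappa. The abstract does not specify which "modified kappa statistic for multiple observers" was used, so the sketch below swaps in Fleiss' kappa as an assumption; the function name and the toy count tables are hypothetical.

```python
def fleiss_kappa(counts: list[list[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] = number of raters who assigned subject i to category j.
    Every subject must be rated by the same number of raters (>= 2).
    ASSUMPTION: Fleiss' kappa stands in for the unspecified statistic
    named in the abstract.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total_ratings = n_subjects * n_raters

    # Share of all ratings that fell into each category.
    category_share = [
        sum(row[j] for row in counts) / total_ratings for j in range(n_categories)
    ]
    # Per-subject agreement: proportion of rater pairs that agree.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_subject) / n_subjects
    expected = sum(p * p for p in category_share)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy table: 2 cases, 3 raters, 2 categories (e.g. ASA I-II vs III-V).
    print(fleiss_kappa([[3, 0], [0, 3]]))  # complete agreement
    print(fleiss_kappa([[2, 1], [1, 2]]))  # agreement below chance level
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why a majority agreement in most cases can still correspond to a kappa that is only fair.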
Chokshi, F H; Sadigh, G; Carpenter, W; Allen, J W
2017-04-01
Spinal anatomy has been variably investigated using 3D MRI. We aimed to compare the diagnostic quality of T2 sampling perfection with application-optimized contrasts by using flip angle evolution (SPACE) with T2-FSE sequences for visualization of cervical spine anatomy. We predicted that T2-SPACE would be equivalent or superior to T2-FSE for visibility of anatomic structures. Adult patients undergoing cervical spine MR imaging with both T2-SPACE and T2-FSE sequences for radiculopathy or myelopathy between September 2014 and February 2015 were included. Two blinded subspecialty-trained radiologists independently assessed the visibility of 12 anatomic structures by using a 5-point scale and assessed CSF pulsation artifact by using a 4-point scale. Sagittal images and 6 axial levels from C2-T1 on T2-FSE were reviewed; 2 weeks later and after randomization, T2-SPACE was evaluated. Diagnostic quality for each structure and CSF pulsation artifact visibility on both sequences were compared by using a paired t test. Interobserver agreement was calculated (κ). Forty-five patients were included (mean age, 57 years; 40% male). The average visibility scores for intervertebral disc signal, neural foramina, ligamentum flavum, ventral rootlets, and dorsal rootlets were higher for T2-SPACE compared with T2-FSE for both reviewers (P < .001). Average scores for remaining structures were either not statistically different or the superiority of one sequence was discordant between reviewers. T2-SPACE showed a lower degree of CSF flow artifact (P < .001). Interobserver κ values ranged from -0.02 to 0.20 for T2-SPACE and from -0.02 to 0.30 for T2-FSE (slight to fair agreement). T2-SPACE may be equivalent or superior to T2-FSE for the evaluation of cervical spine anatomic structures, and T2-SPACE shows a lower degree of CSF pulsation artifact. © 2017 by American Journal of Neuroradiology.
Matsuki, Keisuke; Watanabe, Atsuya; Ochiai, Shunsuke; Kenmoku, Tomonori; Ochiai, Nobuyasu; Obata, Takayuki; Toyone, Tomoaki; Wada, Yuichi; Okubo, Toshiyuki
2014-05-01
Although fatty degeneration of the rotator cuff muscles has been reported to affect the outcomes of rotator cuff repairs, only a few studies have attempted to quantitatively evaluate this degeneration. T2 mapping is a quantitative magnetic resonance imaging technique that potentially evaluates the concentration of fat in muscles. The purpose of this study was to investigate fatty degeneration of the rotator cuff muscles by using T2 mapping, as well as to evaluate the reliability of T2 measurement. We obtained magnetic resonance images including T2 mapping from 184 shoulders (180 patients; 110 male patients [112 shoulders] and 70 female patients [72 shoulders]; mean age, 62 years [range, 16-84 years]). Eighty-three shoulders had no rotator cuff tear (group A), whereas 101 shoulders had tears, of which 62 were incomplete to medium (group B) and 39 were large to massive (group C). T2 values of the supraspinatus and infraspinatus muscles were measured and compared among groups. Intraobserver and interobserver variabilities also were examined. The mean T2 values of the supraspinatus in groups A, B, and C were 36.3 ± 4.7 milliseconds, 44.2 ± 11.3 milliseconds, and 57.0 ± 18.8 milliseconds, respectively. The mean T2 values of the infraspinatus in groups A, B, and C were 36.1 ± 5.1 milliseconds, 40.0 ± 11.1 milliseconds, and 51.9 ± 18.2 milliseconds, respectively. The T2 value significantly increased with the extent of the tear in both muscles. Both intraobserver and interobserver variabilities were more than 0.99. T2 mapping can be a reliable tool to quantify fatty degeneration of the rotator cuff muscles. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
Castel, Anne-Laure; Menet, Aymeric; Ennezat, Pierre-Vladimir; Delelis, François; Le Goffic, Caroline; Binda, Camille; Guerbaai, Raphaëlle-Ashley; Levy, Franck; Graux, Pierre; Tribouilloy, Christophe; Maréchaux, Sylvestre
2016-01-01
Speckle tracking can be used to measure left ventricular global longitudinal strain (GLS). To study the effect of speckle tracking software product upgrades on GLS values and intervendor consistency. Subjects (patients or healthy volunteers) underwent systematic echocardiography with equipment from Philips and GE, without a change in their position. Off-line post-processing for GLS assessment was performed with the former and most recent upgrades from these two vendors (Philips QLAB 9.0 and 10.2; GE EchoPAC 12.1 and 13.1.1). GLS was obtained in three myocardial layers with EchoPAC 13.1.1. Intersoftware and intervendor consistency was assessed. Interobserver variability was tested in a subset of patients. Among 73 subjects (65 patients and 8 healthy volunteers), absolute values of GLS were higher with QLAB 10.2 compared with 9.0 (intraclass correlation coefficient [ICC]: 0.88; bias: 2.2%). Agreement between EchoPAC 13.1.1 and 12.1 varied by myocardial layer (13.1.1 only): midwall (ICC: 0.95; bias: -1.1%), endocardium (ICC: 0.93; bias: 1.6%) and epicardial (ICC: 0.80; bias: -3.3%). Although GLS was comparable for QLAB 9.0 versus EchoPAC 12.1 (ICC: 0.95; bias: 0.5%), the agreement was lower between QLAB 10.2 and EchoPAC 13.1.1 endocardial (ICC: 0.91; bias: 1.1%), midwall (ICC: 0.73; bias: 3.9%) and epicardial (ICC: 0.54; bias: 6.0%). Interobserver variability of all software products in a subset of 20 patients was excellent (ICC: 0.97-0.99; bias: -0.8 to 1.0%). Upgrades of speckle tracking software may be associated with significant changes in GLS values, which could affect intersoftware and intervendor consistency. This finding has important clinical implications for the longitudinal follow-up of patients with speckle tracking echocardiography. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Ha, Richard; Mema, Eralda; Guo, Xiaotao; Mango, Victoria; Desperito, Elise; Ha, Jason; Wynn, Ralph; Zhao, Binsheng
2016-04-01
The amount of fibroglandular tissue (FGT) has been linked to breast cancer risk based on mammographic density studies. Currently, the qualitative assessment of FGT on mammogram (MG) and magnetic resonance imaging (MRI) is prone to intra and inter-observer variability. The purpose of this study is to develop an objective quantitative FGT measurement tool for breast MRI that could provide significant clinical value. An IRB-approved study was performed. Sixty breast MRI cases with qualitative assessment of mammographic breast density and MRI FGT were randomly selected for quantitative analysis from routine breast MRIs performed at our institution from 1/2013 to 12/2014. Blinded to the qualitative data, whole breast and FGT contours were delineated on T1-weighted pre contrast sagittal images using an in-house, proprietary segmentation algorithm which combines the region-based active contours and a level set approach. FGT (%) was calculated by: [segmented volume of FGT (mm(3)) / segmented volume of whole breast (mm(3))] × 100. Statistical correlation analysis was performed between quantified FGT (%) on MRI and qualitative assessments of mammographic breast density and MRI FGT. There was a significant positive correlation between quantitative MRI FGT assessment and qualitative MRI FGT (r=0.809, n=60, P<0.001) and mammographic density assessment (r=0.805, n=60, P<0.001). There was a significant correlation between qualitative MRI FGT assessment and mammographic density assessment (r=0.725, n=60, P<0.001). The four qualitative assessment categories of FGT correlated with the calculated mean quantitative FGT (%) of 4.61% (95% CI, 0-12.3%), 8.74% (7.3-10.2%), 18.1% (15.1-21.1%), 37.4% (29.5-45.3%). Quantitative measures of FGT (%) were computed with data derived from breast MRI and correlated significantly with conventional qualitative assessments. 
This quantitative technique may prove to be a valuable tool in clinical use by providing computer generated standardized measurements with limited intra or inter-observer variability.
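The FGT percentage described above is a ratio of two segmented volumes. The sketch below shows only that final arithmetic step, assuming both volumes are already available in cubic millimetres; it does not reproduce the proprietary segmentation algorithm, and the function name and example volumes are hypothetical.

```python
def fgt_percent(fgt_volume_mm3: float, breast_volume_mm3: float) -> float:
    """FGT (%) = segmented FGT volume / segmented whole-breast volume * 100."""
    if breast_volume_mm3 <= 0:
        raise ValueError("whole-breast volume must be positive")
    if not 0 <= fgt_volume_mm3 <= breast_volume_mm3:
        raise ValueError("FGT volume must lie between 0 and the breast volume")
    return 100.0 * fgt_volume_mm3 / breast_volume_mm3


if __name__ == "__main__":
    # Hypothetical segmented volumes, for illustration only.
    print(f"FGT = {fgt_percent(50_000.0, 500_000.0):.1f}%")
```

Guarding against an FGT volume larger than the breast volume catches the most likely input mistake, swapping the two arguments.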
Pincus, Daniel; Kuhn, John E; Sheth, Ujash; Rizzone, Katie; Colbenson, Kristi; Dwyer, Tim; Karpinos, Ashley; Marks, Paul H; Wasserstein, David
2017-05-01
Clinical practice guidelines (CPGs) are published by several sports medicine institutions. A systematic evaluation can help identify the highest quality CPGs for clinical use and identify any deficiencies that remain. To identify and appraise CPGs relevant to clinical sports medicine professionals. Systematic review. Predetermined selection criteria were utilized by 2 reviewers who independently identified published CPGs before January 1, 2014. CPGs were excluded if they focused on injured workers, radiological criteria, medical pathology, or the axial skeleton (back/neck). The remaining guidelines were scored by 6 reviewers with different clinical backgrounds using the Appraisal of Guidelines for Research and Evaluation II (AGREE II). Scores lower than 50% indicated deficiency. Scores were also stratified by the publishing institution and anatomic location and compared using Kruskal-Wallis tests. The Spearman correlation coefficient was used to assess the range of interobserver agreement between the evaluators. Seventeen CPGs met the inclusion criteria. The majority of guidelines pertained to the knee, ankle, or shoulder. Interobserver agreement was strong ( r = 0.548-0.740), and mean total scores between nonsurgical (107.8) and surgical evaluators (109.3) were not statistically different. Overall guideline quality was variable but not deficient for 16 of 17 guidelines (>50%), except regarding clinical "applicability" and "editorial independence." No difference was found between CPGs of the knee, shoulder, foot/ankle, or chronic conditions. However, CPG publishing institutions had significantly different scores; the American Academy of Orthopaedic Surgeons (AAOS) guidelines scored significantly higher (141.4) than the total mean score (108.0). The overall quality of sports medicine CPGs was variable but generally not deficient, except regarding applicability and editorial independence. Bias through poor editorial independence is a concern. 
To improve future guideline quality, authors should pay particular attention to these areas and use existing highest quality guidelines, or the AGREE II instrument, as templates. CPGs dedicated to anatomic areas other than the knee, ankle, and shoulder are needed.
Visual judgements of steadiness in one-legged stance: reliability and validity.
Haupstein, T; Goldie, P
2000-01-01
There is a paucity of information about the validity and reliability of clinicians' visual judgements of steadiness in one-legged stance. Such judgements are used frequently in clinical practice to support decisions about treatment in the fields of neurology, sports medicine, paediatrics and orthopaedics. The aim of the present study was to address the validity and reliability of visual judgements of steadiness in one-legged stance in a group of physiotherapists. A videotape of 20 five-second performances was shown to 14 physiotherapists with median clinical experience of 6.75 years. Validity of visual judgement was established by correlating scores obtained from an 11-point rating scale with criterion scores obtained from a force platform. In addition, partial correlations were used to control for the potential influence of body weight on the relationship between the visual judgements and criterion scores. Inter-observer reliability was quantified between the physiotherapists; intra-observer reliability was quantified between two tests four weeks apart. Mean criterion-related validity was high, regardless of whether body weight was controlled for statistically (Pearson's r = 0.84, 0.83, respectively). The standard error of estimating the criterion score was 3.3 newtons. Inter-observer reliability was high (ICC (2,1) = 0.81 at Test 1 and 0.82 at Test 2). Intra-observer reliability was high (on average ICC (2,1) = 0.88; Pearson's r = 0.90). The standard error of measurement for the 11-point scale was one unit. The finding of higher accuracy of making visual judgements than previously reported may be due to several aspects of design: use of a criterion score derived from the variability of the force signal which is more discriminating than variability of centre of pressure; use of a discriminating visual rating scale; specificity and clear definition of the phenomenon to be rated.
Bonasia, Davide Edoardo; Marmotti, Antongiulio; Massa, Alessandro Domenico Felice; Ferro, Andrea; Blonna, Davide; Castoldi, Filippo; Rossi, Roberto
2015-09-01
In the last two decades, many surgical techniques have been described for articular cartilage repair. Reliable histological scoring systems are fundamental tools to evaluate new procedures. Several histological scoring systems have been described, and these can be divided into elementary and comprehensive scores, according to the number of sub-items. The aim of this study was to test the inter- and intra-observer reliability of ten main scores used for the histological evaluation of in vivo cartilage repair. The authors tested the starting hypothesis that elementary scores would show superior intra- and inter-observer reliability compared with comprehensive scores. Fifty histological sections obtained from the trochlea of New Zealand rabbits and stained with Safranin-O fast green were used. The histological sections were analysed by 4 observers: 2 experienced in cartilage histology and 2 inexperienced. Histological evaluations were performed at time 1 and time 2, separated by a 30-day interval. The following scores were used: Mankin, O'Driscoll, Pineda, Wakitani, Fortier, Sellers, ICRS, ICRSII, Oswestry (OsScore) and modified O'Driscoll. Intra- and inter-observer reliability were evaluated for each score. In addition, the floor-ceiling effect and the Bland-Altman Coefficient of Repeatability were evaluated for each sub-item of every score. Intra-observer reliability was high for all observers in every score, even though the reliability was significantly lower for non-expert observers compared with expert counterparts. In terms of Coefficient of Repeatability, some scores performed better (O'Driscoll, Modified O'Driscoll and ICRSII) than others (Fortier, Sellers). Inter-observer reliability was high for all observers in every score, but significantly lower for non-expert compared with expert observers. In expert hands, all the scores showed high intra- and inter-observer reliability, independently of the complexity. 
Although every score has advantages and disadvantages, ICRSII, O'Driscoll and Modified O'Driscoll scores should be preferred for the evaluation of in vivo cartilage repair in animal models.
Buczinski, S; Faure, C; Jolivet, S; Abdallah, A
2016-07-01
To determine inter-observer agreement for a clinical scoring system for the detection of bovine respiratory disease complex in calves, and the impact of classification of calves as sick or healthy based on different cut-off values. Two third-year veterinary students (Observer 1 and 2) and one post-graduate student (Observer 3) received 4 hours of training on scoring dairy calves for signs of respiratory disease, including rectal temperature, cough, eye and nasal discharge, and ear position. Observers 1 and 2 scored 40 pre-weaning dairy calves 24 hours apart (80 observations) over three visits to a calf-rearing facility, and Observers 1, 2 and 3 scored 20 calves on one visit. Inter-observer agreement was assessed using percentage of agreement (PA) and Kappa statistics for individual clinical signs, comparing Observers 1 and 2. Agreement between the three observers for total clinical score was assessed using cut-off values of ≥4, ≥5 and ≥6 to indicate unhealthy calves. Inter-observer PA for rectal temperature was 0.68, for cough 0.78, for nasal discharge 0.62, for eye discharge 0.63, and for ear position 0.85. Kappa values for all clinical signs indicated slight to fair agreement (<0.4), except temperature, which had moderate agreement (0.6). The Fleiss' Kappa for total score, using cut-offs of ≥4, ≥5 and ≥6 to indicate unhealthy calves, was 0.35, 0.06 and 0.13, respectively, indicating slight to fair agreement. There were important inter-observer discrepancies in scoring clinical signs of respiratory disease, using relatively inexperienced observers. These disagreements may ultimately mean increased false negative or false positive diagnoses and incorrect treatment of cases. Visual assessment of clinical signs associated with bovine respiratory disease needs to be thoroughly validated when disease monitoring is based on the use of a clinical scoring system.
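As an illustration of the Fleiss' Kappa used above for three observers, the following is a minimal sketch of the standard formula. The function name and the example counts are hypothetical and are not data from the study; the input is assumed to be a table giving, for each calf, how many observers chose each category (here: healthy, unhealthy).

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table counts[i][j] = number of raters
    who assigned subject i to category j (same rater count per subject)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all assignments that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per subject, then averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance (undefined case: p_e == 1).
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical ratings: 4 calves, 3 observers, columns = [healthy, unhealthy].
example = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(round(fleiss_kappa(example), 3))
```

Because the classification of each calf depends on the cut-off applied to the total score, the count table, and therefore kappa, has to be recomputed for each cut-off.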
High inter-observer agreement of observer-perceived pain assessment in the emergency department.
Hangaard, Martin Høhrmann; Malling, Brian; Mogensen, Christian Backer
2018-02-21
Triage is used to prioritize the patients in the emergency department. The majority of the triage systems include the patients' pain score to assess their level of acuity by using a combination of patient-reported pain and observer-perceived pain; the latter therefore requires a certain degree of inter-observer agreement. The aim of the present study was to assess the inter-observer agreement of perceived pain among emergency department nurses and to evaluate if it was influenced by predetermined factors like age and gender. A project assistant randomly recruited two nurses, who were not allowed to interact with each other, to assess patient pain intensity on the numeric rating scale (NRS). The project assistant afterwards entered the pain scores in a predesigned electronic questionnaire. We used weighted Fleiss-Cohen (quadratic) kappa statistics, Bland-Altman statistics and logistic regression analysis to assess the inter-observer agreement. One hundred and sixty-two patients were included. They had a median age of 38 years and 45% were females. 30% of the patients were acute surgical patients and 70% acute orthopedic patients. The average time between the pain assessments was 1.7 min. The Bland-Altman analysis found a mean difference in pain score of 0.2 and 95% limits of agreement of ± 3 points. When the NRS scores were translated to commonly used pain categories (no, mild, moderate or severe pain) we found a 70% agreement with a mean difference in categories of 0.05 and 95% limits of agreement of ± 1 category. Patient age, gender, localization of pain, examination room or presence of a significant other did not affect the inter-observer agreement. We found 70% agreement on pain category between the nurses, and we conclude that the use of nurse-perceived pain assessment for triage in the emergency department is justified.
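The Bland-Altman quantities reported above (mean difference and 95% limits of agreement) can be sketched as follows. This is only an illustration under stated assumptions: the function name and the paired scores are hypothetical, and the limits are taken as the mean difference ± 1.96 sample standard deviations of the paired differences, which is the conventional definition rather than a detail confirmed by the abstract.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) for paired ratings.

    bias   = mean of the paired differences (a - b)
    limits = bias +/- 1.96 * sample SD of the differences
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical NRS scores (0-10) given by two nurses to five patients.
nurse_1 = [5, 7, 3, 8, 6]
nurse_2 = [4, 7, 5, 6, 6]
bias, lower, upper = bland_altman(nurse_1, nurse_2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```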
Berger, Aaron J; Momeni, Arash; Ladd, Amy L
2014-04-01
Trapeziometacarpal, or thumb carpometacarpal (CMC), arthritis is a common problem with a variety of treatment options. Although widely used, the Eaton radiographic staging system for CMC arthritis is of questionable clinical utility, as disease severity does not predictably correlate with symptoms or treatment recommendations. A possible reason for this is that the classification itself may not be reliable, but the literature on this has not, to our knowledge, been systematically reviewed. We therefore performed a systematic review to determine the intra- and interobserver reliability of the Eaton staging system. We systematically reviewed English-language studies published between 1973 and 2013 to assess the degree of intra- and interobserver reliability of the Eaton classification for determining the stage of trapeziometacarpal joint arthritis and pantrapezial arthritis based on plain radiographic imaging. Search engines included PubMed, Scopus®, and CINAHL. Four studies, which included a total of 163 patients, met our inclusion criteria and were evaluated. The level of evidence of the studies included in this analysis was determined using the Oxford Centre for Evidence Based Medicine Levels of Evidence Classification by two independent observers. A limited number of studies have been performed to assess intra- and interobserver reliability of the Eaton classification system. The four studies included were determined to be Level 3b. These studies collectively indicate that the Eaton classification demonstrates poor to fair interobserver reliability (kappa values: 0.11-0.56) and fair to moderate intraobserver reliability (kappa values: 0.54-0.657). Review of the literature demonstrates that radiographs assist in the assessment of CMC joint disease, but there is not a reliable system for classification of disease severity. 
Currently, diagnosis and treatment of thumb CMC arthritis are based on the surgeon's qualitative assessment combining history, physical examination, and radiographic evaluation. Inconsistent agreement using the current common radiographic classification system suggests a need for better radiographic tools to quantify disease severity.
Detection of MET amplification in gastroesophageal tumor specimens using IQFISH
Nielsen, Karsten Bork; Mollerup, Jens; Jepsen, Anna; Go, Ning
2017-01-01
Background: The gene mesenchymal epithelial transition factor (MET) is a proto-oncogene that encodes a transmembrane receptor with intrinsic tyrosine kinase activity known as Met or cMet. MET is found to be amplified in several human cancers including gastroesophageal cancer. Methods: Here we report the MET amplification prevalence data from 159 consecutive tumor specimens from patients with gastric (G), gastroesophageal junction (GEJ) and esophageal (E) adenocarcinoma, using a novel fluorescence in situ hybridization (FISH) assay, MET/CEN-7 IQFISH Probe Mix [an investigational use only (IUO) assay]. MET amplification was defined as a MET/CEN-7 ratio ≥2.0. Furthermore, the link between the MET signal distribution and amplification status was investigated. Results: The prevalence of MET amplification was found to be 6.9%. The FISH assay demonstrated a high inter-observer reproducibility. The inter-observer results showed a 100% overall agreement with respect to the MET status (amplified/non-amplified). The inter-observer CV was estimated at 11.8% (95% CI: 10.2–13.4). For the signal distribution, the inter-observer agreement was reported to be 98.7%. We also report an association of MET amplification and a unique signal distribution pattern in the G/GEJ/E tumor specimens. We found that the prevalence of MET amplification was markedly higher in tumor specimens with a heterogeneous (66.7%) versus homogeneous (2.0%) signal distribution. Furthermore, specimens with a heterogeneous signal distribution had a statistically significantly higher median MET/CEN-7 ratio (2.35 versus 1.04; P<0.0001). Conclusions: The novel FISH assay showed a high inter-observer reproducibility both with respect to amplification status and signal distribution. Based on the findings in the study it is suggested that MET amplification mainly is associated with tumor cells that are represented by a heterogeneous growth pattern. PMID:29285491
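The amplification rule stated above (MET/CEN-7 ratio ≥2.0) can be written as a short sketch. The function name and the signal counts are hypothetical, and the assumption that the ratio is formed from total MET and CEN-7 signal counts over the scored nuclei is ours; the abstract does not describe the counting procedure.

```python
def met_status(met_signals, cen7_signals, threshold=2.0):
    """Classify a specimen from total MET and CEN-7 signal counts.

    Assumes the ratio is total MET signals divided by total CEN-7
    signals; the >= 2.0 cut-off is the one given in the abstract.
    """
    ratio = met_signals / cen7_signals
    status = "amplified" if ratio >= threshold else "non-amplified"
    return ratio, status


# Hypothetical counts chosen to give ratios of 2.35 and 1.04.
print(met_status(94, 40))
print(met_status(52, 50))
```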
Kent, Michael N; Olsen, Thomas G; Feeser, Theresa A; Tesno, Katherine C; Moad, John C; Conroy, Michael P; Kendrick, Mary Jo; Stephenson, Sean R; Murchland, Michael R; Khan, Ayesha U; Peacock, Elizabeth A; Brumfiel, Alexa; Bottomley, Michael A
2017-12-01
Digital pathology represents a transformative technology that impacts dermatologists and dermatopathologists from residency to academic and private practice. Two concerns are accuracy of interpretation from whole-slide images (WSI) and effect on workflow. Studies of considerably large series involving single-organ systems are lacking. To evaluate whether diagnosis from WSI on a digital microscope is inferior to diagnosis of glass slides from traditional microscopy (TM) in a large cohort of dermatopathology cases with attention on image resolution, specifically eosinophils in inflammatory cases and mitotic figures in melanomas, and to measure the workflow efficiency of WSI compared with TM. Three dermatopathologists established interobserver ground truth consensus (GTC) diagnosis for 499 previously diagnosed cases proportionally representing the spectrum of diagnoses seen in the laboratory. Cases were distributed to 3 different dermatopathologists who diagnosed by WSI and TM with a minimum 30-day washout between methodologies. Intraobserver WSI/TM diagnoses were compared, followed by interobserver comparison with GTC. Concordance, major discrepancies, and minor discrepancies were calculated and analyzed by paired noninferiority testing. We also measured pathologists' read rates to evaluate workflow efficiency between WSI and TM. This retrospective study was carried out in an independent, national, university-affiliated dermatopathology laboratory. Intraobserver concordance of diagnoses between WSI and TM methods and interobserver variance from GTC, following College of American Pathologists guidelines. Mean intraobserver concordance between WSI and TM was 94%. Mean interobserver concordance was 94% for WSI and GTC and 94% for TM and GTC. Mean interobserver concordance between WSI, TM, and GTC was 91%. Diagnoses from WSI were noninferior to those from TM. 
Whole-slide image read rates were commensurate with WSI experience, achieving parity with TM by the most experienced user. Diagnosis from WSI was found equivalent to diagnosis from glass slides using TM in this statistically powerful study of 499 dermatopathology cases. This study supports the viability of WSI for primary diagnosis in the clinical setting.
Reduction of variable-truncation artifacts from beam occlusion during in situ x-ray tomography
NASA Astrophysics Data System (ADS)
Borg, Leise; Jørgensen, Jakob S.; Frikel, Jürgen; Sporring, Jon
2017-12-01
Many in situ x-ray tomography studies require experimental rigs which may partially occlude the beam and cause parts of the projection data to be missing. In a study of fluid flow in porous chalk using a percolation cell with four metal bars, drastic streak artifacts arise in the filtered backprojection (FBP) reconstruction at certain orientations. Projections with non-trivial variable truncation caused by the metal bars are the source of these variable-truncation artifacts. To understand the artifacts, a mathematical model of variable-truncation data as a function of metal bar radius and distance to sample is derived and verified numerically and with experimental data. The model accurately describes the arising variable-truncation artifacts across simulated variations of the experimental setup. Three variable-truncation artifact-reduction methods are proposed, all aimed at addressing sinogram discontinuities that are shown to be the source of the streaks. The ‘reduction to limited angle’ (RLA) method simply keeps only non-truncated projections; the ‘detector-directed smoothing’ (DDS) method smooths the discontinuities; while the ‘reflexive boundary condition’ (RBC) method enforces a zero derivative at the discontinuities. Experimental results using both simulated and real data show that the proposed methods effectively reduce variable-truncation artifacts. The RBC method is found to provide the best artifact reduction and preservation of image features using both visual and quantitative assessment. The analysis and artifact-reduction methods are designed in the context of FBP reconstruction, motivated by computational efficiency practical for large, real synchrotron data. While a specific variable-truncation case is considered, the proposed methods can be applied to general data cut-offs arising in different in situ x-ray tomography experiments.
Interobserver reproducibility of The Paris System for Reporting Urinary Cytology.
Long, Theresa; Layfield, Lester J; Esebua, Magda; Frazier, Shellaine R; Giorgadze, D Tamar; Schmidt, Robert L
2017-01-01
The Paris System for Reporting Urinary Cytology represents a significant improvement in classification of urinary specimens. The system acknowledges the difficulty in cytologically diagnosing low-grade urothelial carcinomas and has developed categories to deal with this issue. The system uses six categories: unsatisfactory, negative for high-grade urothelial carcinoma (NHGUC), atypical urothelial cells, suspicious for high-grade urothelial carcinoma, high-grade urothelial carcinoma, other malignancies and a seventh subcategory (low-grade urothelial neoplasm). Three hundred and fifty-seven urine specimens were independently reviewed by four cytopathologists unaware of the previous diagnoses. Each cytopathologist rendered a diagnosis according to the Paris System categories. Agreement was assessed using absolute agreement and weighted chance-corrected agreement (kappa). Disagreements were classified as low impact and high impact based on the potential impact of a misclassification on clinical management. The average absolute agreement was 65% with an average expected agreement of 44%. The average chance-corrected agreement (kappa) was 0.32. Nine hundred and ninety-nine of 1902 comparisons between rater pairs were in agreement, but 12% of comparisons differed by two or more categories for the category NHGUC. Approximately 15% of the disagreements were classified as high clinical impact. Our findings indicated that the scheme recommended by the Paris System shows adequate precision for the category NHGUC, but the other categories demonstrated unacceptable interobserver variability. This low level of diagnostic precision may negatively impact the applicability of the Paris System for widespread clinical application.
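To make the relation between absolute, expected and chance-corrected agreement concrete, here is a minimal sketch of unweighted Cohen's kappa for one pair of raters. The function names and the example labels are hypothetical. Plugging the averaged figures above (65% observed, 44% expected) into the formula gives 0.375; this need not match the reported average of 0.32, since the study used weighted kappa and averaging pairwise kappas is not the same as applying the formula to averaged agreement.

```python
from collections import Counter


def kappa_from_agreement(p_observed, p_expected):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    return (p_observed - p_expected) / (1 - p_expected)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' category labels."""
    n = len(rater_a)
    # Observed agreement: share of cases with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return kappa_from_agreement(p_o, p_e)


print(round(kappa_from_agreement(0.65, 0.44), 3))

# Hypothetical labels for four specimens from two cytopathologists.
print(cohen_kappa(["NHGUC", "NHGUC", "atypical", "atypical"],
                  ["NHGUC", "atypical", "atypical", "atypical"]))
```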
Zhang, Tao; Yousaf, Ufra; Hsiao, Albert; Cheng, Joseph Y; Alley, Marcus T; Lustig, Michael; Pauly, John M; Vasanawala, Shreyas S
2015-10-01
Pediatric contrast-enhanced MR angiography is often limited by respiration, other patient motion and compromised spatiotemporal resolution. To determine the reliability of a free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced MR angiography method for depicting abdominal arterial anatomy in young children. With IRB approval and informed consent, we retrospectively identified 27 consecutive children (16 males and 11 females; mean age: 3.8 years, range: 14 days to 8.4 years) referred for contrast-enhanced MR angiography at our institution, who had undergone free-breathing spatiotemporally accelerated time-resolved contrast-enhanced MR angiography studies. A radio-frequency-spoiled gradient echo sequence with Cartesian variable density k-space sampling and radial view ordering, intrinsic motion navigation and intermittent fat suppression was developed. Images were reconstructed with soft-gated parallel imaging locally low-rank method to achieve both motion correction and high spatiotemporal resolution. Quality of delineation of 13 abdominal arteries in the reconstructed images was assessed independently by two radiologists on a five-point scale. Ninety-five percent confidence intervals of the proportion of diagnostically adequate cases were calculated. Interobserver agreements were also analyzed. Eleven out of 13 arteries achieved acceptable image quality (mean score range: 3.9-5.0) for both readers. Fair to substantial interobserver agreement was reached on nine arteries. Free-breathing spatiotemporally accelerated 3-D time-resolved contrast-enhanced MR angiography frequently yields diagnostic image quality for most abdominal arteries in young children.
Téllez-Zenteno, Jose F; Hernández-Ronquillo, Lizbeth; Buckley, Samantha; Zahagun, Ricardo; Rizvi, Syed
2014-06-01
To establish applicability, the recently proposed International League Against Epilepsy (ILAE) consensus on drug-resistant epilepsy (DRE) requires testing in clinical and research settings. This study evaluates the reliability and validity of these criteria in a clinical population. In phase I, two independent evaluators reviewed 97 randomly selected medical records of patients with epilepsy at two separate intervals. Both ILAE consensus and standard diagnostic criteria were employed. Kappa, weighted kappa, and intraclass correlation coefficient (ICC) were used to determine interobserver and intraobserver variability. In phase II, ILAE consensus criteria were applied to 250 patients with epilepsy to determine risk factors associated with development of DRE and to calculate point prevalence. The interobserver agreement of the four definitions was as follows: Berg (0.56), Kwan and Brodie (0.58), Camfield and Camfield (0.69), and ILAE (0.77). The intraobserver agreement of the four definitions was as follows: Berg (0.81), Kwan and Brodie (0.82), Camfield and Camfield (0.72), and ILAE (0.82). The prevalence of DRE was 28.4% with the Berg definition, 34% with Kwan and Brodie, 37% with Camfield and Camfield, and 33% with ILAE. This is the first study to establish reliability and validity of ILAE criteria for the diagnosis of DRE. This new definition compares favorably with previously established constructs, which continue to retain clinical significance. Wiley Periodicals, Inc. © 2014 International League Against Epilepsy.
Fischbach-Boulanger, C; Fitsiori, A; Noblet, V; Baloglu, S; Oesterle, H; Draghici, S; Philippi, N; Duron, E; Hanon, O; Dietemann, J-L; Blanc, F; Kremer, S
2018-05-01
Magnetic resonance imaging is part of the diagnostic criteria for Alzheimer's disease (AD) through the evaluation of hippocampal atrophy. The objective of this study was to evaluate which sequence of T1-weighted (T1WI) and T2-weighted (T2WI) imaging allowed the best visual evaluation of hippocampal atrophy. Visual qualitative ratings of the hippocampus of 100 patients with mild cognitive impairment (MCI) and 50 patients with AD were made independently by four operators according to the medial temporal lobe atrophy score based either on T1WI or T2WI. These two evaluations were compared in terms of interobserver reproducibility, concordance with a quantitative volumetric measure, discrimination power between AD and MCI groups, and correlation with several neuropsychological tests. The medial temporal lobe atrophy score evaluated on either T1WI or T2WI exhibited similar interobserver variability and accordance with quantitative volumetric evaluation. However, the visual evaluation on T2WI seemed to provide better discrimination power between AD and MCI groups for both left (T1WI, P = 0.0001; T2WI, P = 7.072 × 10⁻⁵) and right (T1WI, P = 0.008; T2WI, P = 0.001) hippocampus, and a higher overall correlation with neuropsychological tests. The present study suggests that T2WI provides a more adequate visual rating of hippocampal atrophy. © 2018 EAN.
Claessen, Femke M A P; Stoop, Nicky; Doornberg, Job N; Guitton, Thierry G; van den Bekerom, Michel P J; Ring, David
2016-10-01
Stable fixation of distal humerus fracture fragments is necessary for adequate healing and maintenance of reduction. The purpose of this study was to measure the reliability and accuracy of interpretation of postoperative radiographs to predict which implants will loosen or break after operative treatment of bicolumnar distal humerus fractures. We also addressed agreement among surgeons regarding which fracture fixation will loosen or break and the influence of years in independent practice, location of practice, and so forth. A total of 232 orthopedic residents and surgeons from around the world evaluated 24 anteroposterior and lateral radiographs of distal humerus fractures on a Web-based platform to predict which implants would loosen or break. Agreement among observers was measured using the multi-rater kappa measure. The sensitivity of prediction of failure of fixation of distal humerus fracture on radiographs was 63%, specificity was 53%, positive predictive value was 36%, the negative predictive value was 78%, and accuracy was 56%. There was fair interobserver agreement (κ = 0.27) regarding predictions of failure of fixation of distal humerus fracture on radiographs. Interobserver variability did not change when assessed for the various subgroups. When experienced and skilled surgeons perform fixation of type C distal humerus fracture, the immediate postoperative radiograph is not predictive of fixation failure. Reoperation based on the probability of failure might not be advisable. Diagnostic III. Copyright © 2016 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
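The diagnostic performance measures quoted above all derive from a 2×2 table of predictions against outcomes. The sketch below shows the standard definitions; the function name is hypothetical, and the counts are invented for illustration only, chosen so that the results land close to the reported percentages. They are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures from counts of true/false
    positives and negatives (positive = 'fixation predicted to fail')."""
    return {
        "sensitivity": tp / (tp + fn),  # failures correctly predicted
        "specificity": tn / (tn + fp),  # non-failures correctly predicted
        "ppv": tp / (tp + fp),          # predicted failures that did fail
        "npv": tn / (tn + fn),          # predicted successes that held
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts per 1000 ratings (assumed 30% failure rate).
metrics = diagnostic_metrics(tp=189, fp=329, fn=111, tn=371)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With a low failure rate, a modest specificity produces many false positives, which is why the positive predictive value sits well below the sensitivity in an example like this one.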