Sample records for observer agreement study

  1. Observer Agreement for Measurements in Videolaryngostroboscopy.

    PubMed

    Brunings, Jan Wouter; Vanbelle, Sophie; Akkermans, Annemarie; Heemskerk, Nienke M M; Kremer, Bernd; Stokroos, Robert J; Baijens, Laura W J

    2017-11-06

    This study evaluated the levels of intraobserver and interobserver agreement for measurements of visuoperceptual variables in videolaryngostroboscopic examinations and compared the observers' behavior during independent versus consensus panel rating. This retrospective study was conducted in a single-center tertiary care facility. Sixty-four patients with dysphonia of heterogeneous etiology were included. All subjects underwent a standardized videolaryngostroboscopic examination. Two experienced and trained observers scored exactly the same examinations, first independently and then on a consensus panel. Specific visuoperceptual variables and the clinical diagnosis (as recommended by the Committee on Phoniatrics and the Phonosurgery Committee of the European Laryngological Society and advised by the American Speech-Language-Hearing Association) were scored. Descriptive and kappa statistics were used. In general, intraobserver agreement was better than agreement between observers for measurements of several variables. The intrapanel observer agreement levels were slightly higher than the intraobserver agreement levels on the independent rating task. When rating on the consensus panel, the observers deviated considerably from the scores they had previously given on the independent rating task. Observer agreement in videolaryngostroboscopic assessment has important implications not only for the diagnosis and treatment of dysphonic patients but also for the interpretation of the results of scientific studies using videolaryngostroboscopic outcome parameters. The identification of factors that can influence the levels of observer agreement can provide a better understanding of the rating process and its limitations. The results of this study suggest that future research could achieve better agreement levels by rating the visuoperceptual variables in a panel setting. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  2. The Pfirrmann classification of lumbar intervertebral disc degeneration: an independent inter- and intra-observer agreement assessment.

    PubMed

    Urrutia, Julio; Besa, Pablo; Campos, Mauricio; Cikutovic, Pablo; Cabezon, Mario; Molina, Marcelo; Cruz, Juan Pablo

    2016-09-01

    Grading inter-vertebral disc degeneration (IDD) is important in the evaluation of many degenerative conditions, including patients with low back pain. Magnetic resonance imaging (MRI) is considered the best imaging instrument to evaluate IDD. The Pfirrmann classification is commonly used to grade IDD; the authors describing this classification showed an adequate agreement using it; however, there has been a paucity of independent agreement studies using this grading system. The aim of this study was to perform an independent inter- and intra-observer agreement study using the Pfirrmann classification. T2-weighted sagittal images of 79 patients consecutively studied with lumbar spine MRI were classified using the Pfirrmann grading system by six evaluators (three spine surgeons and three radiologists). After a 6-week interval, the 79 cases were presented to the same evaluators in a random sequence for repeat evaluation. The intra-class correlation coefficient (ICC) and the weighted kappa (wκ) were used to determine the inter- and intra-observer agreement. The inter-observer agreement was excellent, with an ICC = 0.94 (0.93-0.95) and wκ = 0.83 (0.74-0.91). There were no differences between spine surgeons and radiologists. Likewise, there were no differences in agreement evaluating the different lumbar discs. Most differences among observers were only of one grade. Intra-observer agreement was also excellent with ICC = 0.86 (0.83-0.89) and wκ = 0.89 (0.85-0.93). In this independent study, the Pfirrmann classification demonstrated an adequate agreement among different observers and by the same observer on separate occasions. Furthermore, it allows communication between radiologists and spine surgeons.

  3. Intra- and inter-observer agreement when using a descriptive classification scale for clinical assessment of faecal consistency in growing pigs.

    PubMed

    Pedersen, Ken Steen; Toft, Nils

    2011-03-01

    The objective of the current study was to evaluate intra- and inter-observer agreement using a descriptive classification scale with four categories, descriptive text and pictures for assessment of consistency in faecal samples from pigs post weaning. The four consistency categories were score one=firm and shaped, score two=soft and shaped, score three=loose and score four=watery. Five observers from the same veterinary practice examined 100 faecal samples using the scale with four categories. Four of the observers examined the 100 faecal samples twice within the same day. Within observers the difference in proportions for the individual consistency categories between two examinations was on average 0.04 (range: 0-0.10). The mean intra-observer agreement was 0.82 (range: 0.72-0.91) with a mean kappa value of 0.76 (range: 0.61-0.88). For inter-observer agreement overall kappa was 0.64. For the 10 pair-wise comparisons the mean inter-observer agreement was 0.73 (range: 0.61-0.90) with a mean kappa value of 0.64 (range: 0.48-0.87). The difference in proportions for the individual consistency categories was on average 0.08 (range: 0-0.17). In conclusion, the agreement observed for the descriptive classification scale with four categories, descriptive text and pictures may be categorized as a substantial to almost perfect intra-observer agreement and a moderate to almost perfect inter-observer agreement. However, more objective measures than clinical scales may still be needed to improve intra- and inter-observer agreement in research studies. Copyright © 2010 Elsevier B.V. All rights reserved.
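    The abstract above reports both raw agreement and kappa for a four-category scale. As a minimal sketch of how the two quantities relate, assuming two rating passes over the same samples, the following computes the observed agreement and Cohen's kappa. The function name `cohen_kappa` and the ten scores are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Return (observed agreement, Cohen's kappa) for two equal-length rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Proportion of samples on which both ratings coincide.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rating's marginal category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return p_observed, (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical consistency scores (1=firm and shaped ... 4=watery), made up for illustration.
first_pass = [1, 1, 2, 2, 3, 3, 4, 4, 2, 3]
second_pass = [1, 1, 2, 3, 3, 3, 4, 4, 2, 2]
observed, kappa = cohen_kappa(first_pass, second_pass)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

    Kappa comes out lower than the raw agreement because it discounts the matches that the marginal frequencies alone would produce by chance.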

  4. Intra-observer and inter-observer agreements for the measurement of dual-input whole tumor computed tomography perfusion in patients with lung cancer: Influences of the size and inner-air density of tumors.

    PubMed

    Wang, Qingle; Zhang, Zhiyong; Shan, Fei; Shi, Yuxin; Xing, Wei; Shi, Liangrong; Zhang, Xingwei

    2017-09-01

    This study was conducted to assess intra-observer and inter-observer agreements for the measurement of dual-input whole tumor computed tomography perfusion (DCTP) in patients with lung cancer. A total of 88 patients who had undergone DCTP, which had proved a diagnosis of primary lung cancer, were divided into two groups: (i) nodules (diameter ≤3 cm) and masses (diameter >3 cm) by size, and (ii) tumors with and without air density. Pulmonary flow, bronchial flow, and pulmonary index were measured in each group. Intra-observer and inter-observer agreements for measurement were assessed using intraclass correlation coefficient, within-subject coefficient of variation, and Bland-Altman analysis. In all lung cancers, the reproducibility coefficient for intra-observer agreement (range 26.1-38.3%) was superior to that for inter-observer agreement (range 38.1-81.2%). Further analysis revealed lower agreements for nodules compared to masses. Additionally, inner-air density reduced both agreements for lung cancer. The intra-observer agreement for measuring lung cancer DCTP was satisfactory, while the inter-observer agreement was limited. The effects of tumoral size and inner-air density on agreement, especially between two observers, should be emphasized. In the future, automatic computer-aided segmentation of the perfusion value of the tumor should be developed. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
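    Bland-Altman analysis, one of the methods named in the abstract above, summarizes paired measurements by their mean difference (bias) and limits of agreement. The sketch below assumes the conventional limits of bias ± 1.96 standard deviations of the differences; the abstract does not state which multiplier was used. The function name `bland_altman` and the measurement values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(measure_1, measure_2, multiplier=1.96):
    """Return (bias, lower limit, upper limit) for paired continuous measurements."""
    if len(measure_1) != len(measure_2) or len(measure_1) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(measure_1, measure_2)]
    bias = mean(differences)
    # Sample standard deviation (n - 1 denominator) of the paired differences.
    spread = stdev(differences)
    return bias, bias - multiplier * spread, bias + multiplier * spread


# Hypothetical perfusion readings by two observers on the same four tumors.
observer_1 = [10.0, 12.0, 14.0, 16.0]
observer_2 = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

    Wider limits indicate poorer agreement, which is one way to express the kind of intra- versus inter-observer difference the abstract describes.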

  5. Intra- and inter-observer agreement on diagnosis of Dupuytren disease, measurements of severity of contracture, and disease extent.

    PubMed

    Broekstra, Dieuwke C; Lanting, Rosanne; Werker, Paul M N; van den Heuvel, Edwin R

    2015-08-01

    Dupuytren disease (DD) is a fibrosing disease affecting the palmar aponeurosis, and is mostly treated by surgery based on measurement of severity of flexion contracture of the fingers. Literature concerning the measurement reliability is scarce. This study aimed to determine the intra- and inter-observer agreement of four variables for diagnosing DD, determining severity of contracture, and disease extent. One of them is a new measurement on the area of nodules and cords for measuring the disease extent in early disease stages. An agreement study (n = 54) was performed by two trained investigators. Agreement was calculated per finger, based on an intraclass correlation coefficient (ICC) using a latent variable model on subjects for diagnosis and Tubiana stage. For total passive extension deficit (TPED) and the area of nodules and cords, agreement was calculated with an ICC using a one-way random effects model with subject as random effect. Inter-observer agreement was very good for diagnosing DD (ICC: 95.5%-99.9%) and good to very good for classifying Tubiana stage (ICC: 73.5%-94.9%). Agreements for area and TPED were moderate (middle finger) to very good (ICC: 48.4%-98.6% and 45.0%-99.5%, respectively). Intra-observer agreement was slightly higher on average than inter-observer agreement. Overall, the intra- and inter-observer agreement in diagnosing DD, and determining the severity of flexion contracture is high. Also, the newly introduced variable area of nodules and cords has high intra- and inter-observer agreement, indicating that it is suitable to measure disease extent. Copyright © 2015 Elsevier Ltd. All rights reserved.
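    For the continuous variables, the abstract above describes an ICC from a one-way random effects model with subject as the random effect. A minimal sketch of that estimator follows, assuming the single-measurement form often written ICC(1,1); the abstract does not say whether the single or averaged form was used. The function name `icc_oneway` and the data are hypothetical, and the result is a proportion, whereas the abstract reports percentages.

```python
def icc_oneway(data):
    """Single-measurement ICC from a one-way random effects ANOVA.

    `data` is a list of subjects, each a list of k repeated measurements.
    """
    n = len(data)
    k = len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 measurements")
    grand_mean = sum(sum(row) for row in data) / (n * k)
    subject_means = [sum(row) / k for row in data]
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(data, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical extension-deficit readings (degrees) by two investigators on three fingers.
readings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
icc = icc_oneway(readings)
print(f"ICC = {icc:.3f}")
```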

  6. Graphical aids for visualizing and interpreting patterns in departures from agreement in ordinal categorical observer agreement data.

    PubMed

    Bangdiwala, Shrikant I

    2017-01-01

    When studying the agreement between two observers rating the same n units into the same k discrete ordinal categories, Bangdiwala (1985) proposed using the "agreement chart" to visually assess agreement. This article proposes that often it is more interesting to focus on the patterns of disagreement and visually understanding the departures from perfect agreement. The article reviews the use of graphical techniques for descriptively assessing agreement and disagreements, and also reviews some of the available summary statistics that quantify such relationships.

  7. Computed Tomography Assessment of Hepatic Metastases of Breast Cancer with Revised Response Evaluation Criteria in Solid Tumors (RECIST) Criteria (Version 1.1): Inter-Observer Agreement.

    PubMed

    Ghobrial, Fady Emil Ibrahim; Eldin, Manal Salah; Razek, Ahmed Abdel Khalek Abdel; Atwan, Nadia Ibrahim; Shamaa, Sameh Sayed Ahmed

    2017-01-01

    To assess inter-observer agreement of revised RECIST criteria (version 1.1) for computed tomography assessment of hepatic metastases of breast cancer. A prospective study was conducted in 28 female patients with breast cancer and with at least one measurable metastatic lesion in the liver that was treated with 3 cycles of anthracycline-based chemotherapy. All patients underwent computed tomography of the abdomen with 64-row multi-detector CT at baseline and after 3 cycles of chemotherapy for response assessment. Image analysis was performed by 2 observers, based on the RECIST criteria (version 1.1). Computed tomography revealed partial response of hepatic metastases in 7 patients (25%) by one observer and in 10 patients (35.7%) by the other observer, with good inter-observer agreement (k=0.75, percent agreement of 89.29%). Stable disease was detected in 19 patients (67.8%) by one observer and in 16 patients (57.1%) by the other observer, with good agreement (k=0.774, percent agreement of 89.29%). Progressive disease was detected in 2 patients (7.2%) by both observers, with perfect agreement (k=1, percent agreement of 100%). The overall inter-observer agreement in the CT-based response assessment of hepatic metastasis between the two observers was good (k=0.793, percent agreement of 89.29%). We concluded that computed tomography is a reliable and reproducible imaging modality for response assessment of hepatic metastases of breast cancer according to the RECIST criteria (version 1.1).

  8. Intra- and inter-observer agreement in histological assessment of canine soft tissue sarcoma.

    PubMed

    Yap, F W; Rasotto, R; Priestnall, S L; Parsons, K J; Stewart, J

    2017-12-01

    The diagnosis of canine soft tissue sarcoma (STS) is based on histological assessment. Assessment of criteria such as degree of differentiation, necrosis score and mitotic score gives rise to a final tumour grade, which is important in the recommendation of treatment and prognosis of patients. Previously diagnosed cases of STS were independently assessed by three board-certified veterinary pathologists. Participating pathologists were blinded to the original results. For the intra-observer study, the cases were assessed by a single pathologist six months apart and slides were randomized between readings. For the inter-observer study, the whole case series was assessed by a single pathologist before being passed onto the next pathologist. Intraclass correlation coefficient (ICC) and Fleiss's kappa (κ) were used to assess the intra- (single observer) and inter-observer agreement. Strong agreement was observed for the intra-observer assessment in necrosis score, mitotic score, total score and tumour grading (ICC between 0.78 and 0.91). The intra-observer agreement for differentiation score was rated perfect (ICC 1.00). The agreement between pathologists for the diagnosis and grading of canine STS was moderate (κ = 0.60 and 0.43 respectively). Histological assessment of canine STS had high reproducibility by an individual pathologist. The agreement of diagnosis and grading of canine STS was moderate between pathologists. Future studies are required to investigate further assessment criteria to improve the specificity of STS diagnosis and the accuracy of the STS grading in dogs. © 2017 John Wiley & Sons Ltd.
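    Fleiss's kappa, used above for agreement among three pathologists, extends chance-corrected agreement to more than two raters. The following is a minimal sketch of the standard estimator, assuming every case is rated by the same number of raters. The function name `fleiss_kappa` and the count table are hypothetical and are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa from a table where counts[i][j] is the number of raters
    who assigned case i to category j. Every row must sum to the same total."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_cases == 0 or n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    # Per-case agreement: share of rater pairs that chose the same category.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases
    # Chance agreement from the overall share of ratings in each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical grades (columns: grade I, II, III) from three raters on four cases.
grade_counts = [[3, 0, 0], [2, 1, 0], [0, 3, 0], [0, 1, 2]]
kappa_fleiss = fleiss_kappa(grade_counts)
print(f"Fleiss's kappa = {kappa_fleiss:.3f}")
```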

  9. Inter- and intra-observer agreement of BI-RADS-based subjective visual estimation of amount of fibroglandular breast tissue with magnetic resonance imaging: comparison to automated quantitative assessment.

    PubMed

    Wengert, G J; Helbich, T H; Woitek, R; Kapetas, P; Clauser, P; Baltzer, P A; Vogl, W-D; Weber, M; Meyer-Baese, A; Pinker, Katja

    2016-11-01

    To evaluate the inter-/intra-observer agreement of BI-RADS-based subjective visual estimation of the amount of fibroglandular tissue (FGT) with magnetic resonance imaging (MRI), and to investigate whether FGT assessment benefits from an automated, observer-independent, quantitative MRI measurement by comparing both approaches. Eighty women with no imaging abnormalities (BI-RADS 1 and 2) were included in this institutional review board (IRB)-approved prospective study. All women underwent un-enhanced breast MRI. Four radiologists independently assessed FGT with MRI by subjective visual estimation according to BI-RADS. Automated observer-independent quantitative measurement of FGT with MRI was performed using a previously described measurement system. Inter-/intra-observer agreements of qualitative and quantitative FGT measurements were assessed using Cohen's kappa (k). Inexperienced readers achieved moderate inter-/intra-observer agreement and experienced readers a substantial inter- and perfect intra-observer agreement for subjective visual estimation of FGT. Practice and experience reduced observer-dependency. Automated observer-independent quantitative measurement of FGT was successfully performed and revealed only fair to moderate agreement (k = 0.209-0.497) with subjective visual estimations of FGT. Subjective visual estimation of FGT with MRI shows moderate intra-/inter-observer agreement, which can be improved by practice and experience. Automated observer-independent quantitative measurements of FGT are necessary to allow a standardized risk evaluation. • Subjective FGT estimation with MRI shows moderate intra-/inter-observer agreement in inexperienced readers. • Inter-observer agreement can be improved by practice and experience. • Automated observer-independent quantitative measurements can provide reliable and standardized assessment of FGT with MRI.

  10. A comparative agreement evaluation of two subaxial cervical spine injury classification systems: the AOSpine and the Allen and Ferguson schemes.

    PubMed

    Urrutia, Julio; Zamora, Tomas; Campos, Mauricio; Yurac, Ratko; Palma, Joaquin; Mobarec, Sebastian; Prada, Carlos

    2016-07-01

    We performed an agreement study using two subaxial cervical spine classification systems: the AOSpine and the Allen and Ferguson (A&F) classifications. We sought to determine which scheme allows better agreement by different evaluators and by the same evaluator on different occasions. Complete imaging studies of 65 patients with subaxial cervical spine injuries were classified by six evaluators (three spine sub-specialists and three senior orthopaedic surgery residents) using the AOSpine subaxial cervical spine classification system and the A&F scheme. The cases were displayed in a random sequence after a 6-week interval for repeat evaluation. The Kappa coefficient (κ) was used to determine inter- and intra-observer agreement. Inter-observer: considering the main AO injury types, the agreement was substantial for the AOSpine classification [κ = 0.61 (0.57-0.64)]; using AO sub-types, the agreement was moderate [κ = 0.57 (0.54-0.60)]. For the A&F classification, the agreement [κ = 0.46 (0.42-0.49)] was significantly lower than using the AOSpine scheme. Intra-observer: the agreement was substantial considering injury types [κ = 0.68 (0.62-0.74)] and considering sub-types [κ = 0.62 (0.57-0.66)]. Using the A&F classification, the agreement was also substantial [κ = 0.66 (0.61-0.71)]. No significant differences were observed between spine surgeons and orthopaedic residents in the overall inter- and intra-observer agreement, or in the inter- and intra-observer agreement of specific type of injuries. The AOSpine classification (using the four main injury types or at the sub-types level) allows a significantly better agreement than the A&F classification. The A&F scheme does not allow reliable communication between medical professionals.

  11. Indices of agreement between neurosurgeons and a radiologist in interpreting tomography scans in an emergency department.

    PubMed

    Dourado, Jules Carlos; Pereira, Júlio Leonardo Barbosa; Albuquerque, Lucas Alverne Freitas de; Carvalho, Gervásio Teles Cardos de; Dias, Patrícia; Dias, Laura; Bicalho, Marcos; Magalhães, Pollyana; Dellaretti, Marcos

    2015-08-01

    The power of interpretation in the analysis of cranial computed tomography (CCT) among neurosurgeons and radiologists has rarely been studied. This study aimed to assess the rate of agreement in the interpretation of CCTs between neurosurgeons and a radiologist in an emergency department. 227 CCT were independently analyzed by two neurosurgeons (NS1 and NS2) and a radiologist (RAD). The level of agreement in interpreting the examination was studied. The Kappa values obtained between NS1 and NS2 and RAD were considered nearly perfect and substantial agreement. The highest levels of agreement when evaluating abnormalities were observed in the identification of tumors, hydrocephalus and intracranial hematomas. The worst levels of agreement were observed for leukoaraiosis and reduced brain volume. For diseases in which the emergency room procedure must be determined, agreement in the interpretation of CCTs between the radiologist and neurosurgeons was satisfactory.

  12. Reliability of plain radiographic parameters for developmental dysplasia of the hip in children.

    PubMed

    Upasani, Vidyadhar V; Bomar, James D; Parikh, Gaurav; Hosalkar, Harish

    2012-07-01

    Few studies have evaluated the reliability and reproducibility of the femoral neck-shaft angle (NSA), center-edge angle (CEA), and acetabular index (AI) in young children with developmental dysplasia of the hip (DDH). We wanted to determine whether these parameters could be used reliably by practitioners. Fifty radiographs from 21 children with DDH were reviewed. Analysis was performed by three observers, at two time periods. The intra- and inter-observer reliability for each measure was assessed. At time period one, we noted a "high" level of agreement between observers when measuring the NSA, a "low" level when measuring the CEA, and a "moderate" level when measuring the AI. At time period two, we noted a "very high" level of agreement between observers when measuring the NSA and a "high" level when measuring the CEA and AI. When comparing the measurements of observer 1 at the two different time periods, we noted nearly "very high" agreement when measuring the NSA, a "moderate" agreement when measuring the CEA, and a "high" agreement for the AI. In comparing the measurements of observer 2, we noted "very high" agreement for the NSA and "high" agreement for the CEA and AI. In comparing the measurements for observer 3, we noted nearly "very high" agreement for the NSA, nearly "high" agreement for the CEA, and "high" agreement for the AI. It is difficult to reliably measure three-dimensional pelvic morphology on a frontal plane radiograph, especially when important pelvic landmarks have yet to ossify.

  13. Observers' Agreement on Measurements in Fiberoptic Endoscopic Evaluation of Swallowing.

    PubMed

    Pilz, Walmari; Vanbelle, Sophie; Kremer, Bernd; van Hooren, Michel R; van Becelaere, Tine; Roodenburg, Nel; Baijens, Laura W J

    2016-04-01

    This study analyzed the effect that dysphagia etiology, different observers, and bolus consistency might have on the level of agreement for measurements in FEES images reached by independent versus consensus panel rating. Sixty patients were included and divided into two groups according to dysphagia etiology: neurological or head and neck oncological. All patients underwent standardized FEES examination using thin and thick liquid consistencies. Two observers scored the same exams, first independently and then in a consensus panel. Four ordinal FEES variables were analyzed. Statistical analysis was performed using a linear weighted kappa coefficient and Bayesian multilevel model. Intra- and interobserver agreement on FEES measurements ranged from 0.76 to 0.93 and from 0.61 to 0.88, respectively. Dysphagia etiology did not influence observers' agreement level. However, bolus consistency resulted in decreased interobserver agreement for all measured FEES variables during thin liquid swallows. When rating on the consensus panel, the observers deviated considerably from the scores they had previously given on the independent rating task. Observer agreement on measurements in FEES exams was influenced by bolus consistency, not by dysphagia etiology. Therefore, observer agreement on FEES measurements should be analyzed by taking bolus consistency into account, as it might affect the interpretation of the outcome. Identifying factors that might influence agreement levels could lead to better understanding of the rating process and assist in developing a more precise measurement scale that would ensure higher levels of observer agreement for measurements in FEES exams.
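    The abstract above uses a linear weighted kappa for ordinal variables, which gives partial credit when two ratings differ by only a few scale steps. The sketch below assumes linear weights of the form 1 - |i - j| / (k - 1) for k ordered categories. The function name `linear_weighted_kappa` and the ratings are hypothetical.

```python
def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Linear weighted kappa for two raters; `categories` lists the ordinal
    levels from lowest to highest."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    position = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    # Joint proportion table: rows are rater A's level, columns rater B's level.
    table = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        table[position[a]][position[b]] += 1.0 / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_observed = 0.0
    p_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)  # 1 on the diagonal, 0 at the extremes
            p_observed += weight * table[i][j]
            p_expected += weight * row_totals[i] * col_totals[j]
    if p_expected == 1.0:
        raise ValueError("weighted kappa is undefined for these marginals")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical three-level ordinal scores from two observers on six swallows.
scores_a = [0, 0, 1, 1, 2, 2]
scores_b = [0, 1, 1, 1, 2, 1]
kappa_w = linear_weighted_kappa(scores_a, scores_b, categories=[0, 1, 2])
print(f"linear weighted kappa = {kappa_w:.3f}")
```

    With these weights a one-step disagreement on a three-level scale counts as half agreement, so the statistic penalizes distant disagreements more than adjacent ones.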

  14. Agreement of three interpretation systems of intrapartum foetal heart rate monitoring by different levels of physicians.

    PubMed

    Pruksanusak, Ninlapa; Thongphanang, Putthaporn; Chainarong, Natthicha; Suntharasaj, Thitima; Kor-Anantakul, Ounjai; Suwanrath, Chitkasaem; Petpichetchian, Chusana

    2017-11-01

    A prospective study was conducted in a centre in Southern Thailand to evaluate agreement in EFM interpretation among various physicians in order to find the most practical system for daily use. We found strong agreement of very normal FHR tracings among the FIGO, NICHD 3-tier and 5-tier systems. The NICHD 3-tier system was more compatible with the FIGO system than the 5-tier system. Overall inter-observer agreement was moderate for the NICHD 3-tier system, while inter-observer agreement of the 5-tier system was fair; the intra-observer agreement was also higher in the NICHD 3-tier system. So the 3-tier systems are more suitable than the 5-tier system in general obstetric practice. Impact statement: What is already known on this subject: The 3-tier and 5-tier systems were widely used in general obstetrics practice. What the results of this study add: The inter- and intra-observer agreement of the NICHD 3-tier system was higher than that of the 5-tier system. What the implications are of these findings for clinical practice and/or further research: The 3-tier systems were more suitable than the 5-tier systems in general obstetrics practice.

  15. Intra- and inter-observer agreement in MRI assessment of rotator cuff healing using the Sugaya classification 10 years after surgery.

    PubMed

    Niglis, L; Collin, P; Dosch, J-C; Meyer, N; Kempf, J-F

    2017-10-01

    The long-term outcomes of rotator cuff repair are unclear. Recurrent tears are common, although their reported frequency varies depending on the type and interpretation challenges of the imaging method used. The primary objective of this study was to assess the intra- and inter-observer reproducibility of the MRI assessment of rotator cuff repair using the Sugaya classification 10 years after surgery. The secondary objective was to determine whether poor reproducibility, if found, could be improved by using a simplified yet clinically relevant classification. Our hypothesis was that reproducibility was limited but could be improved by simplifying the classification. In a retrospective study, we assessed intra- and inter-observer agreement in interpreting 49 magnetic resonance imaging (MRI) scans performed 10 years after rotator cuff repair. These 49 scans were taken at random among 609 cases that underwent re-evaluation, with imaging, for the 2015 SoFCOT symposium on 10-year and 20-year clinical and anatomical outcomes of rotator cuff repair for full-thickness tears. Each of three observers read each of the 49 scans on two separate occasions. At each reading, they assessed the supra-spinatus tendon according to the Sugaya classification in five types. Intra-observer agreement for the Sugaya type was substantial (κ=0.64) but inter-observer agreement was only fair (κ=0.39). Agreement improved when the five Sugaya types were collapsed into two categories (1-2-3 and 4-5) (intra-observer κ=0.74 and inter-observer κ=0.68). Using the Sugaya classification to assess post-operative rotator cuff healing was associated with substantial intra-observer and fair inter-observer agreement. A simpler classification into two categories improved agreement while remaining clinically relevant. II, prospective randomised low-power study. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  16. Echocardiographic agreement in the diagnostic evaluation for infective endocarditis.

    PubMed

    Lauridsen, Trine Kiilerich; Selton-Suty, Christine; Tong, Steven; Afonso, Luis; Cecchi, Enrico; Park, Lawrence; Yow, Eric; Barnhart, Huiman X; Paré, Carlos; Samad, Zainab; Levine, Donald; Peterson, Gail; Stancoven, Amy Butler; Johansson, Magnus Carl; Dickerman, Stuart; Tamin, Syahidah; Habib, Gilbert; Douglas, Pamela S; Bruun, Niels Eske; Crowley, Anna Lisa

    2016-07-01

    Echocardiography is essential for the diagnosis and management of infective endocarditis (IE). However, the reproducibility for the echocardiographic assessment of variables relevant to IE is unknown. Objectives of this study were: (1) To define the reproducibility for IE echocardiographic variables and (2) to describe a methodology for assessing quality in an observational cohort containing site-interpreted data. IE reproducibility was assessed on a subset of echocardiograms from subjects enrolled in the International Collaboration on Endocarditis registry. Specific echocardiographic case report forms were used. Intra-observer agreement was assessed from six site readers on ten randomly selected echocardiograms. Inter-observer agreement between sites and an echocardiography core laboratory was assessed on a separate random sample of 110 echocardiograms. Agreement was determined using intraclass correlation (ICC), coverage probability (CP), and limits of agreement for continuous variables and kappa statistics (κweighted) and CP for categorical variables. Intra-observer agreement for LVEF was excellent [ICC = 0.93 ± 0.1 and all pairwise differences for LVEF (CP) were within 10 %]. For IE categorical echocardiographic variables, intra-observer agreement was best for aortic abscess (κweighted = 1.0, CP = 1.0 for all readers). Highest inter-observer agreement for IE categorical echocardiographic variables was obtained for vegetation location (κweighted = 0.95; 95 % CI 0.92-0.99) and lowest agreement was found for vegetation mobility (κweighted = 0.69; 95 % CI 0.62-0.86). Moderate to excellent intra- and inter-observer agreement is observed for echocardiographic variables in the diagnostic assessment of IE. A pragmatic approach for determining echocardiographic data reproducibility in a large, multicentre, site interpreted observational cohort is feasible.

  17. Inter-observer and intra-observer agreement between embryologists during selection of a single Day 5 embryo for transfer: a multicenter study.

    PubMed

    Storr, Ashleigh; Venetis, Christos A; Cooke, Simon; Kilani, Suha; Ledger, William

    2017-02-01

    What is the inter-observer and intra-observer agreement between embryologists when selecting a single Day 5 embryo for transfer? The inter-observer and intra-observer agreement between embryologists when selecting a single Day 5 embryo for transfer was generally good, although not optimal, even among experienced embryologists. Previous research on the morphological assessment of early stage (two pronuclei to Day 3) embryos has shown varying levels of inter-observer and intra-observer agreement. However, single blastocyst transfer is now becoming increasingly popular and there are no published data that assess inter-observer and intra-observer agreement when selecting a single embryo for Day 5 transfer. This was a prospective study involving 10 embryologists working at five different IVF clinics within a single organization between July 2013 and November 2015. The top 10 embryologists were selected based on their yearly Quality Assurance Program scores for blastocyst grading and were asked to morphologically grade all Day 5 embryos and choose a single embryo for transfer in a survey of 100 cases using 2D images. A total of 1000 decisions were therefore assessed. For each case, Day 5 images were shown, followed by a Day 3 and Day 5 image of the same embryo. Subgroup analyses were also performed based on the following characteristics of embryologists: the level of clinical embryology experience in the laboratory; amount of research experience; number of days per week spent grading embryos. The agreement between these embryologists and the one that scored the embryos on the actual day of transfer was also evaluated. Inter-observer and intra-observer variability was assessed using the kappa coefficient to evaluate the extent of agreement. This study showed that all 10 embryologists agreed on the embryo chosen for transfer in 50 out of 100 cases. In 93 out of 100 cases, at least 6 out of the 10 embryologists agreed. 
The inter-observer and intra-observer agreement among embryologists when selecting a single Day 5 embryo for transfer was generally good as assessed by the kappa scores (kappa = 0.734, 95% CI: 0.665-0.791 and 0.759, 95% CI: 0.622-0.833, respectively). The subgroup analyses did not substantially alter the inter-observer and intra-observer agreement among embryologists. The agreement when Day 3 images were included alongside Day 5 images of the same embryos resulted in a change of mind at least three times by each embryologist (on average for <10% of cases) and resulted in a small decrease in inter-observer and intra-observer agreement between embryologists (kappa = 0.676, 95% CI: 0.617-0.724 and 0.752, 95% CI: 0.656-0.808, respectively). The assessment of the inter-observer agreement with regard to morphological grading of Day 5 embryos showed only a fair-to-moderate agreement, which was observed across all subgroup analyses. The highest overall kappa coefficient was seen for the grading of the developmental stage of an embryo (0.513; 95% CI: 0.492-0.538). The findings were similar when the individual embryologists were compared with the embryologist who made the morphological assessments of the available embryos on the actual day of transfer. All embryologists had already completed their training and were working under one organization with similar policies between the five clinics. Therefore, the inter-observer agreement might not be as high between embryologists working in clinics with different policies or with different levels of training. The generally good, although not optimal, uniformity between participating embryologists when selecting a Day 5 embryo for transfer, as well as the surprisingly low agreement when morphologically grading Day 5 embryos, could be improved, potentially resulting in increased pregnancy rates. Future studies need to be directed toward technologies that can help achieve this. None declared. Not applicable. © The Author 2016. 
Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  18. Inter-observer variability between general pathologists and a specialist in breast pathology in the diagnosis of lobular neoplasia, columnar cell lesions, atypical ductal hyperplasia and ductal carcinoma in situ of the breast

    PubMed Central

    2014-01-01

    Background: This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. Methods: A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Results: Weak agreement was observed for the diagnoses of columnar cell change (CCC; Kappa = 0.38) and columnar cell hyperplasia (CCH; Kappa = 0.32), while a moderate agreement (Kappa = 0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa = 0.62) and lobular carcinoma in situ (LCIS; Kappa = 0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa = 0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa = 0.44), low-grade DCIS (Kappa = 0.47), intermediate-grade DCIS (Kappa = 0.45), and DCIS with microinvasion (Kappa = 0.56). Good agreement was observed for the diagnoses of high-grade DCIS (Kappa = 0.68). Conclusions: According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. Virtual Slides: The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725. PMID:24948027

  19. Inter-observer variability between general pathologists and a specialist in breast pathology in the diagnosis of lobular neoplasia, columnar cell lesions, atypical ductal hyperplasia and ductal carcinoma in situ of the breast.

    PubMed

    Gomes, Douglas S; Porto, Simone S; Balabram, Débora; Gobbi, Helenice

    2014-06-19

    This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Weak agreement was observed for the diagnoses of columnar cell change (CCC; Kappa = 0.38) and columnar cell hyperplasia (CCH; Kappa = 0.32), while a moderate agreement (Kappa = 0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa = 0.62) and lobular carcinoma in situ (LCIS; Kappa = 0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa = 0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa = 0.44), low-grade DCIS (Kappa = 0.47), intermediate-grade DCIS (Kappa = 0.45), and DCIS with microinvasion (Kappa = 0.56). Good agreement was observed for the diagnoses of high-grade DCIS (Kappa = 0.68). According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725.
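
    The Kappa index reported in this record corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of Cohen's kappa for two raters, assuming one categorical label per specimen from each rater. The function name `cohen_kappa` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items that received identical labels.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical labels (not study data): original report vs. specialist review.
original = ["ADH", "ADH", "DCIS", "DCIS"]
review = ["ADH", "DCIS", "DCIS", "DCIS"]
print(cohen_kappa(original, review))
```

    In this toy example the raters agree on 3 of 4 items (0.75) while chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.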

  20. Inter- and Intra-Observer Agreement in Ultrasound BI-RADS Classification and Real-Time Elastography Tsukuba Score Assessment of Breast Lesions.

    PubMed

    Schwab, Fabienne; Redling, Katharina; Siebert, Matthias; Schötzau, Andy; Schoenenberger, Cora-Ann; Zanetti-Dällenbach, Rosanna

    2016-11-01

    Our aim was to prospectively evaluate inter- and intra-observer agreement for Breast Imaging Reporting and Data System (BI-RADS) classifications and Tsukuba elasticity scores (TSs) of breast lesions. The study included 164 breast lesions (63 malignant, 101 benign). The BI-RADS classification and TS of each breast lesion were assessed by the examiner and twice by three reviewers at an interval of 2 months. Weighted κ values for inter-observer agreement ranged from moderate to substantial for BI-RADS classification (κ = 0.585-0.738) and were substantial for TS (κ = 0.608-0.779). Intra-observer agreement was almost perfect for ultrasound (US) BI-RADS (κ = 0.847-0.872) and TS (κ = 0.879-0.914). Overall, individual reviewers are highly self-consistent (almost perfect intra-observer agreement) with respect to BI-RADS classification and TS, whereas inter-observer agreement was moderate to substantial. Comprehensive training is essential for achieving high agreement and minimizing the impact of subjectivity. Our results indicate that breast US and real-time elastography can achieve high diagnostic performance. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
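
    Weighted kappa, as used in this record for ordinal scores, penalizes a disagreement more the further apart the two ratings are. The sketch below is one common formulation with linear or quadratic disagreement weights. The abstract does not state which weighting scheme was used, so that choice, the function name `weighted_kappa`, and the example scores are all assumptions.

```python
def weighted_kappa(ratings_a, ratings_b, categories, quadratic=False):
    """Weighted kappa for two raters on an ordered scale.

    `categories` lists the scale levels in ascending order.
    """
    k = len(categories)
    if k < 2 or len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need at least two categories and equal-length, non-empty ratings")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Cross-tabulate the two raters.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # 0 on the diagonal, 1 for the two extreme categories.
        d = abs(i - j) / (k - 1)
        return d * d if quadratic else d

    weighted_observed = sum(
        disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    weighted_expected = sum(
        disagreement(i, j) * row_totals[i] * col_totals[j] / n
        for i in range(k)
        for j in range(k)
    )
    if weighted_expected == 0:
        raise ValueError("weighted kappa is undefined when only one category is used")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ordinal scores on a 1-5 scale (not study data).
first_reading = [1, 2, 3, 4, 5, 3]
second_reading = [1, 2, 4, 4, 5, 2]
print(weighted_kappa(first_reading, second_reading, categories=[1, 2, 3, 4, 5]))
```

    With only two categories the weights reduce to 0 and 1, so the result coincides with unweighted Cohen's kappa.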

  1. An Independent Inter- and Intraobserver Agreement Evaluation of the AOSpine Subaxial Cervical Spine Injury Classification System.

    PubMed

    Urrutia, Julio; Zamora, Tomas; Yurac, Ratko; Campos, Mauricio; Palma, Joaquin; Mobarec, Sebastian; Prada, Carlos

    2017-03-01

    An agreement study. The aim of this study was to perform an independent interobserver and intraobserver agreement assessment of the AOSpine subaxial cervical spine injury classification system. The AOSpine subaxial cervical spine injury classification system was recently described. It showed substantial inter- and intraobserver agreement in the study describing it; however, an independent evaluation has not been performed. Anteroposterior and lateral radiographs, computed tomography scans, and magnetic resonance imaging of 65 patients with acute traumatic subaxial cervical spine injuries were selected and classified using the morphologic grading of the subaxial cervical spine injury classification system by 6 evaluators (3 spine surgeons and 3 orthopedic surgery residents). After a 6-week interval, the 65 cases were presented to the same evaluators in a random sequence for repeat evaluation. The kappa coefficient (κ) was used to determine the inter- and intraobserver agreement. The interobserver agreement was substantial when considering the fracture main types (A, B, C, or F), with κ = 0.61 (0.57-0.64), but moderate when considering the subtypes: κ = 0.57 (0.54-0.60). The intraobserver agreement was substantial considering the fracture types, with κ = 0.68 (0.62-0.74) and considering subtypes, κ = 0.62 (0.57-0.66). No significant differences were observed between spine surgeons and orthopedic residents in the overall inter- and intraobserver agreement, or in the inter- and intraobserver agreement of specific A, B, C, or F type of injuries. This classification allows adequate agreement among different observers and by the same observer on separate occasions. Future prospective studies should determine whether this classification allows surgeons to decide the best treatment for patients with subaxial cervical spine injuries. 3.

  2. The use of video clips in teleconsultation for preschool children with movement disorders.

    PubMed

    Gorter, Hetty; Lucas, Cees; Groothuis-Oudshoorn, Karin; Maathuis, Carel; van Wijlen-Hempel, Rietje; Elvers, Hans

    2013-01-01

    To investigate the reliability and validity of video clips in assessing movement disorders in preschool children. The study group included 27 children with neuromotor concerns. The explorative validity group included children with motor problems (n = 21) or with typical development (n = 9). Hempel screening was used for live observation of the child, full recording, and short video clips. The explorative study tested the validity of the clinical classifications "typical" or "suspect." Agreement between live observation and the full recording was almost perfect; agreement for the clinical classification "typical" or "suspect" was substantial. Agreement between the full recording and short video clips was substantial to moderate. The explorative validity study, based on short video clips and the presence of a neuromotor developmental disorder, showed substantial agreement. Hempel screening enables reliable and valid observation of video clips, but further research is necessary to demonstrate the predictive value.

  3. A Pilot Study Comparing Observational and Questionnaire Surrogate Measures of Pesticide Exposure Among Residents Impacted by the Ecuadorian Flower Industry.

    PubMed

    Handal, Alexis J; McGough-Maduena, Alison; Páez, Maritza; Skipper, Betty; Rowland, Andrew S; Fenske, Richard A; Harlow, Siobán D

    2015-01-01

    Self-reported measures of residential pesticide exposure are commonly used in epidemiological studies, especially when financial and logistical resources are limited. However, self-reporting is prone to misclassification bias. This pilot study assesses the agreement between self-report of residential pesticide exposure and direct observation measures, in an agricultural region of Ecuador, as a cross-validation method in 26 participants (16 rose workers and 10 controls), with percent agreement and kappa statistics calculated. Proximity of homes to nearby flower farms was found to have only fair agreement (kappa = 0.35). The use of discarded plastics (kappa = 0.06) and wood (kappa = 0.13) were found to have little agreement. Results indicate that direct observation or measurement may provide more accurate appraisals of residential exposures, such as proximity to industrial farmland and the use of discarded materials obtained from the flower farms.

  4. Relationship and inter observer agreement of tooth and face forms in a Saudi subpopulation.

    PubMed

    Habib, Syed Rashid; Shiddi, Ibraheem Al; Al-Sufyani, Mohammed D; Althobaiti, Fahad A

    2015-04-01

    To determine the relationship of tooth form with face form as judged by different observers, and to investigate the inter-observer agreement on tooth forms, face forms, and their relationship among male Saudis. A comparative cross-sectional study. Department of Prosthodontics, College of Dentistry, King Saud University, Riyadh, KSA, from February to August 2013. Ninety-four male participants aged 18-35 years were randomly recruited for the study. Full-face and anterior teeth (intraoral) digital photographs in the frontal plane were recorded. The outline tracings of the face and the tooth were obtained using AutoCAD (version 2010) software. The outline of the tooth was enlarged proportionately, without altering the length-to-width ratio, to fit the face outline. The outlines were then evaluated visually by 6 prosthodontists and results were tabulated. The most common type of face form (49.65%) and tooth form (56.38%) was square tapering. Using the visual method, a good relationship (31.41%), moderate relationship (35.31%), weak relationship (19.68%) and no relationship (13.65%) between the tooth form and face form was found by the observers. Overall kappa for inter-observer agreement on face form, tooth form, and their relationship was 0.24, 0.17, and 0.26, respectively. The kappa values showed a fair agreement between the observers. The study results indicated that there was no highly defined relationship between the tooth form and face form in the studied Saudi subpopulation. A fair agreement was found between the observers for classifying the tooth forms, face forms, and their relationship.

  5. Cross-Cultural Agreement in Facial Attractiveness Preferences: The Role of Ethnicity and Gender

    PubMed Central

    Coetzee, Vinet; Greeff, Jaco M.; Stephen, Ian D.; Perrett, David I.

    2014-01-01

    Previous work showed high agreement in facial attractiveness preferences within and across cultures. The aims of the current study were twofold. First, we tested cross-cultural agreement in the attractiveness judgements of White Scottish and Black South African students for own- and other-ethnicity faces. Results showed significant agreement between White Scottish and Black South African observers' attractiveness judgements, providing further evidence of strong cross-cultural agreement in facial attractiveness preferences. Second, we tested whether cross-cultural agreement is influenced by the ethnicity and/or the gender of the target group. White Scottish and Black South African observers showed significantly higher agreement for Scottish than for African faces, presumably because both groups are familiar with White European facial features, but the Scottish group are less familiar with Black African facial features. Further work investigating this discordance in cross-cultural attractiveness preferences for African faces shows that Black South African observers rely more heavily on colour cues when judging African female faces for attractiveness, while White Scottish observers rely more heavily on shape cues. Results also show higher cross-cultural agreement for female, compared to male faces, albeit not significantly higher. The findings shed new light on the factors that influence cross-cultural agreement in attractiveness preferences. PMID:24988325

  6. Cross-cultural agreement in facial attractiveness preferences: the role of ethnicity and gender.

    PubMed

    Coetzee, Vinet; Greeff, Jaco M; Stephen, Ian D; Perrett, David I

    2014-01-01

    Previous work showed high agreement in facial attractiveness preferences within and across cultures. The aims of the current study were twofold. First, we tested cross-cultural agreement in the attractiveness judgements of White Scottish and Black South African students for own- and other-ethnicity faces. Results showed significant agreement between White Scottish and Black South African observers' attractiveness judgements, providing further evidence of strong cross-cultural agreement in facial attractiveness preferences. Second, we tested whether cross-cultural agreement is influenced by the ethnicity and/or the gender of the target group. White Scottish and Black South African observers showed significantly higher agreement for Scottish than for African faces, presumably because both groups are familiar with White European facial features, but the Scottish group are less familiar with Black African facial features. Further work investigating this discordance in cross-cultural attractiveness preferences for African faces shows that Black South African observers rely more heavily on colour cues when judging African female faces for attractiveness, while White Scottish observers rely more heavily on shape cues. Results also show higher cross-cultural agreement for female, compared to male faces, albeit not significantly higher. The findings shed new light on the factors that influence cross-cultural agreement in attractiveness preferences.

  7. Specific agreement on dichotomous outcomes can be calculated for more than two raters.

    PubMed

    de Vet, Henrica C W; Dikmans, Rieky E; Eekhout, Iris

    2017-03-01

    For assessing interrater agreement, the concepts of observed agreement and specific agreement have been proposed. The situation of two raters and dichotomous outcomes has been described, whereas often, multiple raters are involved. We aim to extend it for more than two raters and examine how to calculate agreement estimates and 95% confidence intervals (CIs). As an illustration, we used a reliability study that includes the scores of four plastic surgeons classifying photographs of breasts of 50 women after breast reconstruction into "satisfied" or "not satisfied." In a simulation study, we checked the hypothesized sample size for calculation of 95% CIs. For m raters, all pairwise tables [ie, m (m - 1)/2] were summed. Then, the discordant cells were averaged before observed and specific agreements were calculated. The total number (N) in the summed table is m (m - 1)/2 times larger than the number of subjects (n), in the example, N = 300 compared to n = 50 subjects times m = 4 raters. A correction of n√(m - 1) was appropriate to find 95% CIs comparable to bootstrapped CIs. The concept of observed agreement and specific agreement can be extended to more than two raters with a valid estimation of the 95% CIs. Copyright © 2017 Elsevier Inc. All rights reserved.
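
    The pooling procedure described in this record (sum all pairwise 2x2 tables, average the discordant cells, then compute observed and specific agreement) can be sketched as follows. The formulas for positive and negative specific agreement, 2a/(2a+b+c) and 2d/(2d+b+c), are the standard two-rater definitions. The function name `multirater_agreement` and the example scores are hypothetical, and the confidence-interval correction mentioned in the abstract is not implemented here.

```python
from itertools import combinations


def multirater_agreement(scores):
    """Observed and specific agreement for m raters and a 0/1 outcome.

    `scores` holds one list per subject with the m dichotomous ratings
    (1 = positive, e.g. "satisfied"; 0 = negative).
    """
    both_pos = both_neg = discordant = 0
    for subject in scores:
        # Every pair of raters contributes one entry to the summed 2x2 table,
        # so each subject adds m * (m - 1) / 2 entries.
        for x, y in combinations(subject, 2):
            if x == 1 and y == 1:
                both_pos += 1
            elif x == 0 and y == 0:
                both_neg += 1
            else:
                discordant += 1
    total = both_pos + both_neg + discordant
    if total == 0:
        raise ValueError("need at least one subject with two or more raters")
    # Averaging the two discordant cells makes the summed table symmetric.
    b = c = discordant / 2

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "N": total,
        "observed": (both_pos + both_neg) / total,
        "positive": ratio(2 * both_pos, 2 * both_pos + b + c),
        "negative": ratio(2 * both_neg, 2 * both_neg + b + c),
    }


# Hypothetical scores (not study data): 50 subjects, 4 raters each.
example = [[1, 1, 1, 1]] * 30 + [[1, 1, 1, 0]] * 10 + [[0, 0, 0, 0]] * 10
result = multirater_agreement(example)
print(result["N"], result["observed"], result["positive"], result["negative"])
```

    With 50 subjects and 4 raters the summed table contains 50 x 6 = 300 entries, matching the N = 300 quoted in the abstract.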

  8. Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT)

    PubMed Central

    Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S.A.

    2016-01-01

    Objective: To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Design: Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Setting: Study 2 included 10 cancer MDMs in England. Participants: Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. Intervention: None. Main Outcome Measures: Tool development, validity, reliability/agreement and variability in MDT performance. Results: Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6–7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). Conclusions: MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. PMID:27084499

  9. Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT).

    PubMed

    Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S A

    2016-06-01

    To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Study 2 included 10 cancer MDMs in England. Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. None. Tool development, validity, reliability/agreement and variability in MDT performance. Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6-7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
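
    The two percentage measures reported in this record, absolute agreement and agreement within 1 point, differ only in the tolerance allowed between the two observers' scores. A minimal sketch, assuming numeric ratings on a shared scale; the function name `percent_agreement` and the ratings are hypothetical, not study data.

```python
def percent_agreement(ratings_a, ratings_b, tolerance=0):
    """Percentage of items on which two raters differ by at most `tolerance` points."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    within = sum(abs(a - b) <= tolerance for a, b in zip(ratings_a, ratings_b))
    return 100 * within / len(ratings_a)


# Hypothetical domain scores from a clinical and a non-clinical observer.
clinical = [1, 2, 3, 4, 5]
non_clinical = [1, 3, 3, 2, 5]
print(percent_agreement(clinical, non_clinical))               # exact agreement
print(percent_agreement(clinical, non_clinical, tolerance=1))  # within 1 point
```

    Neither percentage corrects for chance or for how much the rated items actually vary, which is one reason such figures can look high while a reliability coefficient stays low.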

  10. [A systematic social observation tool: methods and results of inter-rater reliability].

    PubMed

    Freitas, Eulilian Dias de; Camargos, Vitor Passos; Xavier, César Coelho; Caiaffa, Waleska Teixeira; Proietti, Fernando Augusto

    2013-10-01

    Systematic social observation has been used as a health research methodology for collecting information from the neighborhood physical and social environment. The objectives of this article were to describe the operationalization of direct observation of the physical and social environment in urban areas and to evaluate the instrument's reliability. The systematic social observation instrument was designed to collect information in several domains. A total of 1,306 street segments belonging to 149 different neighborhoods in Belo Horizonte, Minas Gerais, Brazil, were observed. For the reliability study, 149 segments (1 per neighborhood) were re-audited, and Fleiss kappa was used to assess inter-rater agreement. Mean agreement was 0.57 (SD = 0.24); 53% had substantial or almost perfect agreement, and 20.4%, moderate agreement. The instrument appears to be appropriate for observing neighborhood characteristics that are not time-dependent, especially urban services, property characterization, pedestrian environment, and security.

  11. Pre-operative Duplex Ultrasonography in Arteriovenous Fistula Creation: Intra- and Inter-observer Agreement.

    PubMed

    Zonnebeld, Niek; Maas, Tommy M G; Huberts, Wouter; van Loon, Magda M; Delhaas, Tammo; Tordoir, Jan H M

    2017-11-01

    Although clinical guidelines on arteriovenous fistula (AVF) creation advocate minimum luminal arterial and venous diameters, assessed by duplex ultrasonography (DUS), the clinical value of routine DUS examination is under debate. DUS might be an insufficiently repeatable and/or reproducible imaging modality because of its operator dependency. The present study aimed to assess intra- and inter-observer agreement of DUS examination in support of AVF surgery planning. Ten end stage renal disease patients were included, to assess intra- and inter-observer agreement of pre-operative DUS measurements. All measurements were performed by two trained and experienced vascular technicians, blinded to measurement readings. From the routine DUS protocol, representative measurements (venous diameters, and arterial diameters and volume flow in the upper arm and forearm) were selected. For intra-observer agreement the measurements were performed in triplicate, with the probe released from the skin between each. Intraclass correlation coefficients were calculated for intra- and inter-observer agreement, and Bland-Altman plots used to graphically display mean measurement differences and limits of agreement. Ten patients (6 male, 59.4±19.7 years) consented to participate, and all predefined measurements were obtained. Intraclass correlation coefficients for intra-observer agreement of diameter measurements were at least 0.90 (95% CI 0.74-0.97; radial artery). Inter-observer agreement was at least 0.83 (0.46-0.96; lateral diameter upper arm cephalic vein). The Bland-Altman plots showed acceptable mean measurement differences and limits of agreement. In experienced hands, excellent intra- and inter-observer agreement can be reached for the discrete pre-operative DUS measurements advocated in clinical guidelines. DUS is therefore a reliable imaging modality to support AVF surgery planning. The content of DUS protocols, however, needs further standardisation. 
Copyright © 2017 European Society for Vascular Surgery. Published by Elsevier Ltd. All rights reserved.
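
    The Bland-Altman summary mentioned in this record reduces to the mean difference between paired measurements (the bias) and limits of agreement around it. A minimal sketch follows, assuming one paired measurement per patient from each observer and the conventional 1.96 x SD limits; the abstract does not state the multiplier used. The function name `bland_altman` and the diameters are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(measure_a, measure_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(measure_a) != len(measure_b) or len(measure_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(measure_a, measure_b)]
    bias = mean(differences)
    spread = stdev(differences)  # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical vessel diameters in mm from two observers (not study data).
observer_1 = [2.1, 3.4, 2.8, 4.0, 3.1]
observer_2 = [2.0, 3.6, 2.7, 4.1, 3.0]
bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias={bias:.3f} mm, limits of agreement=({lower:.3f}, {upper:.3f}) mm")
```

    Whether the resulting limits are acceptable is a clinical judgment, not something the calculation decides.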

  12. Abdominal auscultation does not provide clear clinical diagnoses.

    PubMed

    Durup-Dickenson, Maja; Christensen, Marie Kirk; Gade, John

    2013-05-01

    Abdominal auscultation is a part of the clinical examination of patients, but the determining factors in bowel sound evaluation are poorly described. The aim of this study was to assess inter- and intra-observer agreement in physicians' evaluation of pitch, intensity and quantity in abdominal auscultation. A total of 100 physicians were presented with 20 bowel sound recordings in a blinded set-up. Recordings had been made in a mix of healthy volunteers and emergency patients. They evaluated pitch, intensity and quantity of bowel sounds in a questionnaire with three, three and four categories of answers, respectively. Fleiss' multi-rater kappa (κ) coefficients were calculated for inter-observer agreement; for intra-observer agreement, calculation of probability was performed. Inter-observer agreement regarding pitch, intensity and quantity yielded κ-values of 0.19 (p < 0.0001), 0.30 (p < 0.0001) and 0.24 (p < 0.0001), respectively, corresponding to slight, fair and fair agreement. Regarding intra-observer agreement, the probability of agreement was 0.55 (95% confidence interval (CI): 0.51-0.59), 0.45 (95% CI: 0.42-0.49) and 0.41 (95% CI: 0.38-0.45) for pitch, intensity and quantity, respectively. Although relatively poor, observer agreement was slight to fair and thus better than expected by chance. Since the diagnostic value of auscultation increases with addition of history and clinics, and may be further improved by systematic training, it should still be used in the examination of patients with acute abdominal pain. not relevant. not relevant.
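
    Fleiss' multi-rater kappa, used in this record for inter-observer agreement, generalizes chance-corrected agreement to any fixed number of raters per item. A minimal sketch follows, assuming a count table in which each row is one item (for example, one recording) and each column one answer category. The function name `fleiss_kappa` and the counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a table where counts[i][j] is the number of
    raters who assigned item i to category j. Every row must sum to the
    same number of raters (at least 2)."""
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("need at least one item")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_category = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: proportion of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_chance = sum(p * p for p in p_category)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical table: 4 recordings, 5 raters, 3 pitch categories.
table = [
    [5, 0, 0],
    [2, 3, 0],
    [1, 1, 3],
    [0, 4, 1],
]
print(fleiss_kappa(table))
```

    Values near 0 indicate agreement close to chance, which is how κ-values such as 0.19 to 0.30 come to be described as slight to fair.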

  13. Inter-observer agreement for diagnostic classification of esophageal motility disorders defined in high-resolution manometry.

    PubMed

    Fox, M R; Pandolfino, J E; Sweis, R; Sauter, M; Abreu Y Abreu, A T; Anggiansah, A; Bogte, A; Bredenoord, A J; Dengler, W; Elvevi, A; Fruehauf, H; Gellersen, S; Ghosh, S; Gyawali, C P; Heinrich, H; Hemmink, M; Jafari, J; Kaufman, E; Kessing, K; Kwiatek, M; Lubomyr, B; Banasiuk, M; Mion, F; Pérez-de-la-Serna, J; Remes-Troche, J M; Rohof, W; Roman, S; Ruiz-de-León, A; Tutuian, R; Uscinowicz, M; Valdovinos, M A; Vardar, R; Velosa, M; Waśko-Czopnik, D; Weijenborg, P; Wilshire, C; Wright, J; Zerbib, F; Menne, D

    2015-01-01

    High-resolution esophageal manometry (HRM) is a recent development used in the evaluation of esophageal function. Our aim was to assess the inter-observer agreement for diagnosis of esophageal motility disorders using this technology. Practitioners registered on the HRM Working Group website were invited to review and classify (i) 147 individual water swallows and (ii) 40 diagnostic studies comprising 10 swallows using a drop-down menu that followed the Chicago Classification system. Data were presented using a standardized format with pressure contours without a summary of HRM metrics. The sequence of swallows was fixed for each user but randomized between users to avoid sequence bias. Participants were blinded to other entries. (i) Individual swallows were assessed by 18 practitioners (13 institutions). Consensus agreement (≤ 2/18 dissenters) was present for most cases of normal peristalsis and achalasia but not for cases of peristaltic dysmotility. (ii) Diagnostic studies were assessed by 36 practitioners (28 institutions). Overall inter-observer agreement was 'moderate' (kappa 0.51) being 'substantial' (kappa > 0.7) for achalasia type I/II and no lower than 'fair-moderate' (kappa >0.34) for any diagnosis. Overall agreement was somewhat higher among those that had performed >400 studies (n = 9; kappa 0.55) and 'substantial' among experts involved in development of the Chicago Classification system (n = 4; kappa 0.66). This prospective, randomized, and blinded study reports an acceptable level of inter-observer agreement for HRM diagnoses across the full spectrum of esophageal motility disorders for a large group of clinicians working in a range of medical institutions. Suboptimal agreement for diagnosis of peristaltic motility disorders highlights contribution of objective HRM metrics. © 2014 International Society for Diseases of the Esophagus.

  14. Inter and intra-observer concordance for the diagnosis of portal hypertension gastropathy.

    PubMed

    Casas, Meritxell; Vergara, Mercedes; Brullet, Enric; Junquera, Félix; Martínez-Bauer, Eva; Miquel, Mireia; Sánchez-Delgado, Jordi; Dalmau, Blai; Campo, Rafael; Calvet, Xavier

    2018-03-01

    At present there is no fully accepted endoscopic classification for the assessment of the severity of portal hypertensive gastropathy (PHG). Few studies have evaluated inter and intra-observer concordance or the degree of concordance between different endoscopic classifications. To evaluate inter and intra-observer agreement for the presence of portal hypertensive gastropathy and enteropathy using different endoscopic classifications. Patients with liver cirrhosis were included into the study. Enteroscopy was performed under sedation. The location of lesions and their severity was recorded. Images were videotaped and subsequently evaluated independently by three different endoscopists, one of whom was the initial endoscopist. The agreement between observations was assessed using the kappa index. Seventy-four patients (mean age 63.2 years, 53 males and 21 females) were included. The agreement between the three endoscopists regarding the presence or absence of PHG using the Tanoue and McCormack classifications was very low (kappa scores = 0.16 and 0.27, respectively). The current classifications of portal hypertensive gastropathy have a very low degree of intra and inter-observer agreement for the diagnosis and assessment of gastropathy severity.

  15. Inter-observer agreement improves with PERCIST 1.0 as opposed to qualitative evaluation in non-small cell lung cancer patients evaluated with F-18-FDG PET/CT early in the course of chemo-radiotherapy.

    PubMed

    Fledelius, Joan; Khalil, Azza; Hjorthaug, Karin; Frøkiær, Jørgen

    2016-12-01

    The purpose of this study is to determine whether a qualitative approach or a semi-quantitative approach provides the most robust method for early response evaluation with 2'-deoxy-2'-[(18)F]fluoro-D-glucose (F-18-FDG) positron emission tomography combined with whole body computed tomography (PET/CT) in non-small cell lung cancer (NSCLC). In this study eight Nuclear Medicine consultants analyzed F-18-FDG PET/CT scans from 35 patients with locally advanced NSCLC. Scans were performed at baseline and after 2 cycles of chemotherapy. Each observer used two different methods for evaluation: (1) PET response criteria in solid tumors (PERCIST) 1.0 and (2) a qualitative approach. Both methods allocate patients into one of four response categories (complete and partial metabolic response (CMR and PMR) and stable and progressive metabolic disease (SMD and PMD)). The inter-observer agreement was evaluated using Fleiss' kappa for multiple raters, Cohen's kappa for comparison of the two methods, and intraclass correlation coefficients (ICC) for comparison of lean body mass corrected standardized uptake value (SUL) peak measurements. The agreement between observers when determining the percentage change in SULpeak was "almost perfect", with ICC = 0.959. There was a strong agreement among observers allocating patients to the different response categories with a Fleiss' kappa of 0.76 (0.71-0.81). In 22 of the 35 patients, complete agreement was observed with PERCIST 1.0. The agreement was lower, at a moderate level, when using the qualitative method, with a Fleiss' kappa of 0.60 (0.55-0.64). Complete agreement was achieved in only 10 of the 35 patients. The difference between the two methods was statistically significant (p < 0.005) (chi-squared). Comparing the two methods for each individual observer showed Cohen's kappa values ranging from 0.64 to 0.79, translating into a strong agreement between the two methods. 
    PERCIST 1.0 provides a higher overall agreement between observers than the qualitative approach in categorizing early treatment response in NSCLC patients. The inter-observer agreement is in fact strong when using PERCIST 1.0 even when the level of instruction is purposely kept to a minimum in order to mimic the everyday situation. The variability is largely owing to the subjective elements of the method.
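
    The Fleiss' kappa statistic used in this record for several observers and four response categories can be sketched as follows. This is only a minimal illustration, assuming every patient is rated by the same number of observers; the function name fleiss_kappa and the toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list of category labels.

    Assumes every subject was rated by the same number of raters (>= 2).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    counts = [Counter(row) for row in ratings]
    categories = sorted({label for row in ratings for label in row})

    # Per-subject agreement: share of rater pairs that chose the same category.
    per_subject = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(c[cat] for c in counts) / (n_subjects * n_raters) for cat in categories
    ]
    p_chance = sum(s * s for s in shares)

    if p_chance == 1.0:
        return float("nan")  # only one category was ever used; kappa is undefined
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy data: 4 patients, 3 observers, response categories as labels.
toy = [
    ["PMR", "PMR", "PMR"],
    ["SMD", "SMD", "PMR"],
    ["CMR", "CMR", "CMR"],
    ["PMD", "SMD", "PMD"],
]
print(f"Fleiss' kappa (toy data): {fleiss_kappa(toy):.2f}")
```

    A value of 1 indicates complete agreement on every patient; values near 0 indicate agreement no better than expected from the category frequencies alone.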

  16. Agreement studies in radiology research.

    PubMed

    Farzin, B; Gentric, J-C; Pham, M; Tremblay-Paquet, S; Brosseau, L; Roy, C; Jamali, S; Chagnon, M; Darsaut, T E; Guilbert, F; Naggara, O; Raymond, J

    2017-03-01

    The goal of this study was to estimate the frequency and the quality of agreement studies published in diagnostic imaging journals. All studies published between January 2011 and December 2012 in four radiology journals were reviewed. Four trained readers evaluated agreement studies using a 24-item form that included the 15 items of the Guidelines for Reporting Reliability and Agreement Studies criteria. Of 2229 source titles, 280 studies (13%) reported agreement. The mean number of patients per study was 81 ± 99 (SD) (range, 0-180). Justification for sample size was found in 9 studies (3%). The number of raters was ≤2 in 226 studies (81%). No intra-observer study was performed in 212 (76%) articles. Confidence intervals and interpretation of statistical estimates were provided in 98 (35%) and 147 (53%) of the studies, respectively. In 168 studies (60%), the agreement study was not mentioned in the discussion section. In 8 studies (3%), reporting of the agreement study was judged to be adequate. Twenty studies (7%) were dedicated to agreement. Agreement studies are preliminary and not adequately reported. Studies dedicated to agreement are infrequent. They are research opportunities that should be promoted. Copyright © 2016 Éditions françaises de radiologie. Published by Elsevier Masson SAS. All rights reserved.

  17. Comparison of three methods for evaluation of work postures in a truck assembly plant.

    PubMed

    Zare, Mohsen; Biau, Sophie; Brunet, Rene; Roquelaure, Yves

    2017-11-01

    This study compared the results of three risk assessment tools (self-reported questionnaire, observational tool, direct measurement method) for the upper limbs and back in a truck assembly plant at two cycle times (11 and 8 min). The weighted Kappa factor showed fair agreement between the observational and direct measurement method for the arm (0.39) and back (0.47). The weighted Kappa factor for these methods was poor for the neck (0) and wrist (0) but the observed proportional agreement (Po) was 0.78 for the neck and 0.83 for the wrist. The weighted Kappa factor between questionnaire and direct measurement showed poor or slight agreement (0) for different body segments in both cycle times. The results revealed moderate agreement between the observational tool and the direct measurement method, and poor agreement between the self-reported questionnaire and direct measurement. Practitioner Summary: This study provides risk exposure measurement by different common ergonomic methods in the field. The results help to develop valid measurements and improve exposure evaluation. Hence, the ergonomist/practitioners should apply the methods with caution, or at least knowing what the issues/errors are.
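
    The combination reported here, a kappa of 0 alongside a high observed proportional agreement, can arise when one category dominates both ratings. The sketch below illustrates this with plain (unweighted) Cohen's kappa rather than the weighted form used in the study; the function name agreement_and_kappa and the counts are invented for illustration and are not taken from the paper.

```python
from collections import Counter


def agreement_and_kappa(ratings_a, ratings_b):
    """Return (observed agreement, Cohen's kappa) for two paired rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)

    # Observed proportional agreement: share of items rated identically.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement from the product of each method's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(freq_a) | set(freq_b)
    p_chance = sum(freq_a[lab] * freq_b[lab] for lab in labels) / n**2

    if p_chance == 1.0:
        return p_observed, float("nan")  # kappa undefined with a single category
    return p_observed, (p_observed - p_chance) / (1.0 - p_chance)


# Invented example: 100 postures, each method scores 90 "low" and 10 "high",
# but the rare "high" scores almost never coincide.
method_a = ["low"] * 81 + ["low"] * 9 + ["high"] * 9 + ["high"] * 1
method_b = ["low"] * 81 + ["high"] * 9 + ["low"] * 9 + ["high"] * 1

p_o, kappa = agreement_and_kappa(method_a, method_b)
print(f"observed agreement = {p_o:.2f}, kappa = {kappa:.2f}")
```

    In this invented example the two methods agree on 82 of 100 items, yet chance agreement from the marginals is also 0.82, so kappa is 0. Reporting both figures, as the abstract does, makes this situation visible.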

  18. Inter-observer agreement of a multi-parameter campsite monitoring program on the Dixie National Forest, Utah

    Treesearch

    Nicholas J. Glidden; Martha E. Lee

    2007-01-01

    Precision is crucial to campsite monitoring programs. Yet, little empirical research has ever been published on the level of precision of this type of monitoring program. The purpose of this study was to evaluate the level of agreement between observers of campsite impacts using a multi-parameter campsite monitoring program. Thirteen trained observers assessed 16...

  19. Threshold level for measurement of UV sensitivity: reproducibility of phototest.

    PubMed

    Lock-Andersen, J; Wulf, H C

    1996-08-01

    The ultraviolet (UV) sensitivity is determined by a phototest where the skin is exposed to well-defined doses of UV radiation and the resulting erythema is graded by visual scoring after 20-24 h. In this study we wanted to estimate the reproducibility of erythema assessment in phototesting. Twenty-one healthy Caucasians with skin types I to IV were phototested on UV un-exposed buttock skin using a xenon lamp solar simulator. Twenty-four hours after UV exposure eight physicians independently graded the erythema reactions two times. Data were analysed using inter- and intra-observer agreement and kappa statistics, which adjust for agreement that could be caused by chance alone. Observed agreement and kappa statistics were found to decrease with increasing intensity of erythema and to be lower for skin types III and IV compared to skin types I and II. Intra-observer agreement was uniformly better than inter-observer agreement. The difference between observers' assessments could be as much as three clinical erythema grades. Physicians' previous experience with phototesting had only a minor influence on agreement. In conclusion, phototesting is based on subjective assessment of erythema and is not as precise and reproducible as expected. Agreement was better for barely perceptible erythema than for erythema with a well-defined border and we therefore recommend that the barely perceptible erythema reaction should be used for measurement of the minimal erythema dose.

  20. EasyDIAg: A tool for easy determination of interrater agreement.

    PubMed

    Holle, Henning; Rein, Robert

    2015-09-01

    Reliable measurements are fundamental for the empirical sciences. In observational research, measurements often consist of observers categorizing behavior into nominal-scaled units. Since the categorization is the outcome of a complex judgment process, it is important to evaluate the extent to which these judgments are reproducible, by having multiple observers independently rate the same behavior. A challenge in determining interrater agreement for timed-event sequential data is to develop clear objective criteria to determine whether two raters' judgments relate to the same event (the linking problem). Furthermore, many studies presently report only raw agreement indices, without considering the degree to which agreement can occur by chance alone. Here, we present a novel, free, and open-source toolbox (EasyDIAg) designed to assist researchers with the linking problem, while also providing chance-corrected estimates of interrater agreement. Additional tools are included to facilitate the development of coding schemes and rater training.

  1. Diagnostic Reproducibility: What Happens When the Same Pathologist Interprets the Same Breast Biopsy Specimen at Two Points in Time?

    PubMed Central

    Jackson, Sara L.; Frederick, Paul D.; Pepe, Margaret S.; Nelson, Heidi D.; Weaver, Donald L.; Allison, Kimberly H.; Carney, Patricia A.; Geller, Berta M.; Tosteson, Anna N. A.; Onega, Tracy; Elmore, Joann G.

    2017-01-01

    Background: Surgeons may receive a different diagnosis when a breast biopsy is interpreted by a second pathologist. The extent to which diagnostic agreement by the same pathologist varies at two time points is unknown. Participants and Methods: Pathologists from 8 U.S. states independently interpreted 60 breast specimens, one glass slide per case, on 2 occasions separated by ≥9 months. Reproducibility was assessed by comparing interpretations between the two time points; associations between reproducibility (intra-observer agreement rates) and characteristics of pathologists and cases were determined and also compared with inter-observer agreement of baseline interpretations. Results: Sixty-five percent of invited, responding pathologists were eligible and consented; 49 interpreted glass slides in both study phases resulting in 2,940 interpretations. Intra-observer agreement rates between the two phases were 92% (95% CI 88%-95%) for invasive breast cancer, 84% (95% CI 81%-87%) for ductal carcinoma in situ (DCIS), 53% (95% CI 47%-59%) for atypia, and 84% (95% CI 81%-86%) for benign without atypia. When comparing all study participants' case interpretations at baseline, inter-observer agreement rates were 89% (95% CI 84%-92%) for invasive cancer, 79% (95% CI 76%-81%) for DCIS, 43% (95% CI 41%-45%) for atypia, and 77% (95% CI 74%-79%) for benign without atypia. Conclusions: Interpretive agreement between two time points by the same individual pathologists was low for atypia, and similar to observed rates of agreement for atypia between different pathologists. Physicians and patients should be aware of the diagnostic challenges associated with a breast biopsy diagnosis of atypia when considering treatment and surveillance decisions. PMID:27913946

  2. High inter-observer agreement of observer-perceived pain assessment in the emergency department.

    PubMed

    Hangaard, Martin Høhrmann; Malling, Brian; Mogensen, Christian Backer

    2018-02-21

    Triage is used to prioritize the patients in the emergency department. The majority of the triage systems include the patients' pain score to assess their level of acuity by using a combination of patient reported pain and observer-perceived pain; the latter therefore requires a certain degree of inter-observer agreement. The aim of the present study was to assess the inter-observer agreement of perceived pain among emergency department nurses and to evaluate if it was influenced by predetermined factors like age and gender. A project assistant randomly recruited two nurses, who were not allowed to interact with each other, to assess patient pain intensity on the numeric rating scale (NRS). The project assistant afterwards entered the pain scores in a predesigned electronic questionnaire. We used weighted Fleiss-Cohen (quadratic) kappa statistics, Bland-Altman statistics and logistic regression analysis to assess the inter-observer agreement. One hundred and sixty-two patients were included. They had a median age of 38 years and 45% were females. 30% of the patients were acute surgical patients and 70% acute orthopedic patients. The average time between the pain assessments was 1.7 min. The Bland-Altman analysis found a mean difference in pain score of 0.2 and 95% limits of agreement of ±3 points. When the NRS scores were translated to commonly used pain categories (no, mild, moderate or severe pain) we found a 70% agreement with a mean difference in categories of 0.05 and 95% limits of agreement of ±1 category. Patient age, gender, localization of pain, examination room or presence of a significant other did not affect the inter-observer agreement. We found 70% agreement on pain category between the nurses and it is justified that nurse-perceived pain assessment is used for triage in the emergency department.
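
    The quadratic (Fleiss-Cohen) weighted kappa named in this record penalizes disagreements by the squared distance between ordered categories, so a mild versus severe mismatch counts more than a mild versus moderate one. The sketch below is a hedged illustration for two raters; the function name quadratic_weighted_kappa, the 0-3 coding of the four pain categories and the toy ratings are assumptions, not the study's data or software.

```python
def quadratic_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with quadratic weights for two raters.

    Ratings are integers 0 .. n_categories - 1 on an ordered scale.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n, k = len(ratings_a), n_categories

    # Cross-tabulate the paired ratings.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        table[a][b] += 1
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed_penalty = 0.0
    expected_penalty = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # disagreement weight
            observed_penalty += weight * table[i][j]
            expected_penalty += weight * row_totals[i] * col_totals[j] / n

    if expected_penalty == 0.0:
        return float("nan")  # both raters used one single category throughout
    return 1.0 - observed_penalty / expected_penalty


# Assumed coding: 0 = no pain, 1 = mild, 2 = moderate, 3 = severe (toy data).
nurse_1 = [0, 1, 2, 3, 2, 1, 3, 0]
nurse_2 = [0, 1, 3, 3, 2, 2, 3, 1]
print(f"weighted kappa (toy data): {quadratic_weighted_kappa(nurse_1, nurse_2, 4):.2f}")
```

    With identical ratings the observed penalty is zero and the statistic equals 1; when the observed penalty matches the penalty expected by chance it equals 0.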

  3. Scoring haemophilic arthropathy on X-rays: improving inter- and intra-observer reliability and agreement using a consensus atlas.

    PubMed

    Foppen, Wouter; van der Schaaf, Irene C; Beek, Frederik J A; Verkooijen, Helena M; Fischer, Kathelijn

    2016-06-01

    The radiological Pettersson score (PS) is widely applied for classification of arthropathy to evaluate costly haemophilia treatment. This study aims to assess and improve inter- and intra-observer reliability and agreement of the PS. Two series of X-rays (bilateral elbows, knees, and ankles) of 10 haemophilia patients (120 joints) with haemophilic arthropathy were scored by three observers according to the PS (maximum score 13/joint). Subsequently, (dis-)agreement in scoring was discussed until consensus. Example images were collected in an atlas. Thereafter, a second series of 120 joints was scored using the atlas. One observer rescored the second series after three months. Reliability was assessed by intraclass correlation coefficients (ICC), agreement by limits of agreement (LoA). Median Pettersson score at joint level (PSjoint) of affected joints was 6 (interquartile range 3-9). Using the consensus atlas, inter-observer reliability of the PSjoint improved significantly from 0.94 (95% confidence interval (CI) 0.91-0.96) to 0.97 (CI 0.96-0.98). LoA improved from ±1.7 to ±1.1 for the PSjoint. Therefore, true differences in arthropathy were differences in the PSjoint of >2 points. Intra-observer reliability of the PSjoint was 0.98 (CI 0.97-0.98), intra-observer LoA were ±0.9 points. Reliability and agreement of the PS improved by using a consensus atlas. • Reliability of the Pettersson score significantly improved using the consensus atlas. • The presented consensus atlas improved the agreement among observers. • The consensus atlas could be recommended to obtain a reproducible Pettersson score.

  4. Inter-observer variability in fetal biometric measurements.

    PubMed

    Kilani, Rami; Aleyadeh, Wesam; Atieleh, Luay Abu; Al Suleimat, Abdul Mane; Khadra, Maysa; Hawamdeh, Hassan M

    2018-02-01

    To evaluate inter-observer variability and reproducibility of ultrasound measurements for fetal biometric parameters. A prospective cohort study was implemented in two tertiary care hospitals in Amman, Jordan: Prince Hamza Hospital and Albashir Hospital. 192 women with a singleton pregnancy at a gestational age of 18-36 weeks were the participants in the study. Transabdominal scans for fetal biometric parameter measurement were performed on study participants from the period of November 2014 to March 2015. Women who agreed to participate in the study were administered two ultrasound scans for head circumference, abdominal circumference and femur length. The correlation coefficient was calculated. Bland-Altman plots were used to analyze the degree of measurement agreement between observers. Limits of agreement ± 2 SD for the differences in fetal biometry measurements in proportions of the mean of the measurements were derived. Main outcome measures examine the reproducibility of fetal biometric measurements by different observers. High inter-observer intraclass correlation coefficient (ICC) was found for femur length (0.990) and abdominal circumference (0.996) where Bland-Altman plots showed high degrees of agreement. The highest degrees of agreement were noted in the measurement of abdominal circumference followed by head circumference. The lowest degree of agreement was found for femur length measurement. We used a paired-sample t-test and found that the mean difference between duplicate measurements was not significant (P > 0.05). Biometric fetal parameter measurements may be reproducible by different operators in the clinical setting with similar results. Fetal head circumference, abdominal circumference and femur length were highly reproducible. Large organized studies are needed to ensure accurate fetal measurements due to the important clinical implications of inaccurate measurements. Copyright © 2018. Published by Elsevier B.V.
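
    The Bland-Altman limits of agreement mentioned in this record summarize paired measurements by the mean difference (bias) plus or minus a multiple of the standard deviation of the differences. The following is a minimal sketch on absolute differences, using the ± 2 SD multiplier quoted in the abstract (1.96 is also common); the study expressed differences as proportions of the mean, which is not reproduced here. The function name bland_altman_limits and the measurements are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(first, second, multiplier=2.0):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(first, second)]
    bias = mean(differences)      # systematic difference between observers
    spread = stdev(differences)   # sample SD of the paired differences
    return bias, bias - multiplier * spread, bias + multiplier * spread


# Hypothetical femur length measurements (mm) by two observers.
observer_1 = [31.2, 40.5, 52.1, 60.3, 66.8]
observer_2 = [30.9, 41.0, 51.6, 60.9, 66.5]

bias, lower, upper = bland_altman_limits(observer_1, observer_2)
print(f"bias = {bias:.2f} mm, limits of agreement = {lower:.2f} to {upper:.2f} mm")
```

    Narrow limits centred near zero suggest that the two observers can be used interchangeably for the quantity in question; whether the limits are narrow enough is a clinical judgement rather than a statistical one.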

  5. Measuring symptoms and functioning of youth with ADHD in middle schools.

    PubMed

    Evans, Steven W; Allen, Jessica; Moore, Sheryle; Strauss, Victoria

    2005-12-01

    The identification of reliable and valid means for evaluating the effectiveness of school-based treatments and completing diagnostic evaluations of middle school aged students is needed. The present study examined the inter-rater agreement of teacher ratings and the relationship between ratings and observational data in a middle school setting. The data are interpreted in the context of differences between a secondary and elementary school setting. Teacher ratings and observational data were collected regularly over the course of two academic years for middle school students diagnosed with ADHD. The results indicate low rates of inter-rater agreement as well as low rates of agreement between teachers and observational data, and between observational data collected in different classrooms. Inter-rater agreement was lowest in late fall and gradually increased over the second half of the year. Implications for conducting treatment outcome evaluations of school-based treatment programs and diagnostic evaluations are discussed.

  6. Standardized assessment of tumor-infiltrating lymphocytes in breast cancer: an evaluation of inter-observer agreement between pathologists.

    PubMed

    Tramm, Trine; Di Caterino, Tina; Jylling, Anne-Marie B; Lelkaitis, Giedrius; Lænkholm, Anne-Vibeke; Ragó, Péter; Tabor, Tomasz P; Talman, Maj-Lis M; Vouza, Emmanouela

    2018-01-01

    In breast cancer, there is a growing body of evidence that tumor-infiltrating lymphocytes (TILs) may have clinical utility and may be able to direct clinical decisions for subgroups of patients. Clinical utility is, however, not sufficient for warranting the implementation of a new biomarker in the routine practice, and evaluation of the analytical validity is needed, including testing the reproducibility of decentralized assessment of TILs. The aim of this study was to evaluate the inter-observer agreement of TILs assessment using a standardized method, as proposed by the International TILs Working Group 2014, applied to a cohort of breast cancers reflecting an average breast cancer population. Stromal TILs were assessed using full slide sections from 124 breast cancers with varying histology, malignancy grade and ER- and HER2 status. TILs were estimated by nine dedicated breast pathologists using scanned hematoxylin-eosin stainings. TILs results were categorized using various cutoffs, and the inter-observer agreement was evaluated using the intraclass correlation coefficient (ICC), Kappa statistics as well as individual overall agreements with the median value of TILs. Evaluation of TILs led to an ICC of 0.71 (95% CI: 0.65-0.77) corresponding to an acceptable agreement. Kappa values were in the range of 0.38-0.46 corresponding to a fair to moderate agreement. The individual agreements increased when using only two categories ('high' vs. 'low' TILs) and a cutoff of 50-60%. The results of the present study are in accordance with previous studies, and show that the proposed methodology for standardized evaluation of TILs renders an acceptable inter-observer agreement. The findings, however, indicate that assessment of TILs needs further refinement, and are in support of the latest St. Gallen Consensus, that routine reporting of TILs for early breast cancer is not ready for implementation in a clinical setting.
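
    The abstract does not state which form of the intraclass correlation coefficient was used. As one possible illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) from the mean squares of a subjects-by-raters table. The choice of form, the function name icc_2_1 and the toy percentages are assumptions and do not come from the study.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(scores)        # subjects
    k = len(scores[0])     # raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    rater_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Toy stromal TILs percentages: 5 tumours scored by 3 hypothetical pathologists.
toy_scores = [
    [10, 15, 10],
    [40, 50, 45],
    [5, 5, 10],
    [70, 60, 65],
    [20, 30, 25],
]
print(f"ICC(2,1) on toy data: {icc_2_1(toy_scores):.2f}")
```

    Because this form includes the rater mean square in the denominator, a pathologist who scores systematically higher than the others lowers the coefficient, which is the behaviour usually wanted when absolute agreement matters.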

  7. Global Water Cycle Agreement in the Climate Models Assessed in the IPCC AR4

    NASA Technical Reports Server (NTRS)

    Waliser, D.; Seo, K. -W.; Schubert, S.; Njoku, E.

    2007-01-01

    This study examines the fidelity of the global water cycle in the climate model simulations assessed in the IPCC Fourth Assessment Report. The results demonstrate good model agreement in quantities that have had a robust global observational basis and that are physically unambiguous. The worst agreement occurs for quantities that have both poor observational constraints and whose model representations can be physically ambiguous. In addition, components involving water vapor (frozen water) typically exhibit the best (worst) agreement, and fluxes typically exhibit better agreement than reservoirs. These results are discussed in relation to the importance of obtaining accurate model representation of the water cycle and its role in climate change. Recommendations are also given for facilitating the needed model improvements.

  8. Investigating Various Thresholds as Immunohistochemistry Cutoffs for Observer Agreement.

    PubMed

    Ali, Asif; Bell, Sarah; Bilsland, Alan; Slavin, Jill; Lynch, Victoria; Elgoweini, Maha; Derakhshan, Mohammad H; Jamieson, Nigel B; Chang, David; Brown, Victoria; Denley, Simon; Orange, Clare; McKay, Colin; Carter, Ross; Oien, Karin A; Duthie, Fraser R

    2017-10-01

    Clinical translation of immunohistochemistry (IHC) biomarkers requires reliable and reproducible cutoffs or thresholds for interpretation of immunostaining. Most IHC biomarker research focuses on the clinical relevance (diagnostic, prognostic, or predictive utility) of cutoffs, with less emphasis on observer agreement using these cutoffs. From the literature, we identified 3 commonly used cutoffs of 10% positive epithelial cells, 20% positive epithelial cells, and moderate to strong staining intensity (+2/+3 hereafter) to use for investigating observer agreement. A series of 36 images of microarray cores stained for 4 different IHC biomarkers, with variable staining intensity and percentage of positive cells, was used for investigating interobserver and intraobserver agreement. Seven pathologists scored the immunostaining in each image using the 3 cutoffs for positive and negative staining. Kappa (κ) statistic was used to assess the strength of agreement for each cutoff. The interobserver agreement between all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.64, 0.59, and 0.62, respectively, for 10%, 20%, and +2/+3 cutoffs. A good agreement was observed for experienced pathologists using the 10% cutoff, and their agreement was statistically higher than for junior pathologists (P=0.02). In addition, the mean intraobserver agreement for all 7 pathologists using the 3 cutoffs was reasonably good, with mean κ scores of 0.71, 0.60, and 0.73, respectively, for 10%, 20%, and +2/+3 cutoffs. For all 3 cutoffs, a positive correlation was observed with perceived ease of interpretation (P<0.003). Finally, cytoplasmic-only staining achieved higher agreement using all 3 cutoffs than mixed staining patterns. All 3 cutoffs investigated achieve reasonable strength of agreement, modestly decreasing interobserver and intraobserver variability in IHC interpretation. 
    These cutoffs have previously been used in cancer pathology, and this study provides evidence that these cutoffs can be reproducible between practicing pathologists.

  9. Assessment of colon polyp morphology: Is education effective?

    PubMed Central

    Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja

    2017-01-01

    AIM To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. METHODS For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. RESULTS The overall Fleiss’ kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. CONCLUSION The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy. PMID:28974894

  10. Assessment of colon polyp morphology: Is education effective?

    PubMed

    Kim, Jae Hyun; Nam, Kyoung Sik; Kwon, Hye Jung; Choi, Youn Jung; Jung, Kyoungwon; Kim, Sung Eun; Moon, Won; Park, Moo In; Park, Seun Ja

    2017-09-14

    To determine the inter-observer variability for colon polyp morphology and to identify whether education can improve agreement among observers. For purposes of the tests, we recorded colonoscopy video clips that included scenes visualizing the polyps. A total of 15 endoscopists and 15 nurses participated in the study. Participants watched 60 video clips of the polyp morphology scenes and then estimated polyp morphology (pre-test). After education for 20 min, participants performed a second test in which the order of 60 video clips was changed (post-test). To determine if the effectiveness of education was sustained, four months later, a third, follow-up test was performed with the same participants. The overall Fleiss' kappa value of the inter-observer agreement was 0.510 in the pre-test, 0.618 in the post-test, and 0.580 in the follow-up test. The overall diagnostic accuracy of the estimation for polyp morphology in the pre-, post-, and follow-up tests was 0.662, 0.797, and 0.761, respectively. After education, the inter-observer agreement and diagnostic accuracy of all participants improved. However, after four months, the inter-observer agreement and diagnostic accuracy of expert groups were markedly decreased, and those of beginner and nurse groups remained similar to pre-test levels. The education program used in this study can improve inter-observer agreement and diagnostic accuracy in assessing the morphology of colon polyps; it is especially effective when first learning endoscopy.

  11. 68Ga-PSMA-11 PET/CT Interobserver Agreement for Prostate Cancer Assessments: An International Multicenter Prospective Study.

    PubMed

    Fendler, Wolfgang Peter; Calais, Jeremie; Allen-Auerbach, Martin; Bluemel, Christina; Eberhardt, Nina; Emmett, Louise; Gupta, Pawan; Hartenbach, Markus; Hope, Thomas A; Okamoto, Shozo; Pfob, Christian Helmut; Pöppel, Thorsten D; Rischpler, Christoph; Schwarzenböck, Sarah; Stebner, Vanessa; Unterrainer, Marcus; Zacho, Helle D; Maurer, Tobias; Gratzke, Christian; Crispin, Alexander; Czernin, Johannes; Herrmann, Ken; Eiber, Matthias

    2017-10-01

    The interobserver agreement for 68Ga-PSMA-11 PET/CT study interpretations in patients with prostate cancer is unknown. Methods: 68Ga-PSMA-11 PET/CT was performed in 50 patients with prostate cancer for biochemical recurrence (n = 25), primary diagnosis (n = 10), biochemical persistence after primary therapy (n = 5), or staging of known metastatic disease (n = 10). Images were reviewed by 16 observers who used a standardized approach for interpretation of local (T), nodal (N), bone (Mb), or visceral (Mc) involvement. Observers were classified as having a low (<30 prior 68Ga-PSMA-11 PET/CT studies; n = 5), intermediate (30-300 studies; n = 5), or high level of experience (>300 studies; n = 6). Histopathology (n = 25, 50%), post-external-beam radiation therapy prostate-specific antigen response (n = 15, 30%), or follow-up PET/CT (n = 10, 20%) served as a standard of reference. Observer groups were compared by overall agreement (% patients matching the standard of reference) and Fleiss' κ with mean and corresponding 95% confidence interval (CI). Results: Agreement among all observers was substantial for T (κ = 0.62; 95% CI, 0.59-0.64) and N (κ = 0.74; 95% CI, 0.71-0.76) staging and almost perfect for Mb (κ = 0.88; 95% CI, 0.86-0.91) staging. Level of experience positively correlated with agreement for T (κ = 0.73/0.66/0.50 for high/intermediate/low experience, respectively), N (κ = 0.80/0.76/0.64, respectively), and Mc staging (κ = 0.61/0.46/0.36, respectively). Interobserver agreement for Mb was almost perfect irrespective of prior experience (κ = 0.87/0.91/0.88, respectively). Observers with low experience, when compared with intermediate and high experience, demonstrated significantly lower median overall agreement (54% vs. 66% and 76%, P = 0.041) and specificity for T staging (73% vs. 88% and 93%, P = 0.032).
    Conclusion: The interpretation of 68Ga-PSMA-11 PET/CT for prostate cancer staging is highly consistent among observers with high levels of experience, especially for nodal and bone assessments. Initial training on at least 30 patient cases is recommended to ensure acceptable performance. © 2017 by the Society of Nuclear Medicine and Molecular Imaging.

  12. Inter-Observer Agreement of Whole-Body Computed Tomography in Staging and Response Assessment in Lymphoma: The Lugano Classification.

    PubMed

    Razek, Ahmed Abdel Khalek Abdel; Shamaa, Sameh; Lattif, Mahmoud Abdel; Yousef, Hanan Hamid

    2017-01-01

    To assess inter-observer agreement of whole-body computed tomography (WBCT) in staging and response assessment in lymphoma according to the Lugano classification. Retrospective analysis was conducted of 115 consecutive patients with lymphomas (45 females, 70 males; mean age of 46 years). Patients underwent WBCT with a 64 multi-detector CT device for staging and response assessment after a complete course of chemotherapy. Image analysis was performed by 2 reviewers according to the Lugano classification for staging and response assessment. The overall inter-observer agreement of WBCT in staging of lymphoma was excellent (k = 0.90, percent agreement = 94.9%). There was an excellent inter-observer agreement for stage I (k = 0.93, percent agreement = 96.4%), stage II (k = 0.90, percent agreement = 94.8%), stage III (k = 0.89, percent agreement = 94.6%) and stage IV (k = 0.88, percent agreement = 94%). The overall inter-observer agreement in response assessment after a complete course of treatment was excellent (k = 0.91, percent agreement = 95.8%). There was an excellent inter-observer agreement in progressive disease (k = 0.94, percent agreement = 97.1%), stable disease (k = 0.90, percent agreement = 95%), partial response (k = 0.96, percent agreement = 98.1%) and complete response (k = 0.87, percent agreement = 93.3%). We concluded that WBCT is a reliable and reproducible imaging modality for staging and treatment assessment in lymphoma according to the Lugano classification.

  13. Do thoraco-lumbar spinal injuries classification systems exhibit lower inter- and intra-observer agreement than other fractures classifications?: A comparison using fractures of the trochanteric area of the proximal femur as contrast model.

    PubMed

    Urrutia, Julio; Zamora, Tomas; Klaber, Ianiv; Carmona, Maximiliano; Palma, Joaquin; Campos, Mauricio; Yurac, Ratko

    2016-04-01

    It has been postulated that the complex patterns of spinal injuries have prevented adequate agreement using thoraco-lumbar spinal injuries (TLSI) classifications; however, limb fracture classifications have also shown variable agreements. This study compared agreement using two TLSI classifications with agreement using two classifications of fractures of the trochanteric area of the proximal femur (FTAPF). Six evaluators classified the radiographs and computed tomography scans of 70 patients with acute TLSI using the Denis and the new AO Spine thoraco-lumbar injury classifications. Additionally, six evaluators classified the radiographs of 70 patients with FTAPF using the Tronzo and the AO schemes. Six weeks later, all cases were presented in a random sequence for repeat assessment. The Kappa coefficient (κ) was used to determine agreement. Inter-observer agreement: For TLSI, using the AOSpine classification, the mean κ was 0.62 (0.57-0.66) considering fracture types, and 0.55 (0.52-0.57) considering sub-types; using the Denis classification, κ was 0.62 (0.59-0.65). For FTAPF, with the AO scheme, the mean κ was 0.58 (0.54-0.63) considering fracture types and 0.31 (0.28-0.33) considering sub-types; for the Tronzo classification, κ was 0.54 (0.50-0.57). Intra-observer agreement: For TLSI, using the AOSpine scheme, the mean κ was 0.77 (0.72-0.83) considering fracture types, and 0.71 (0.67-0.76) considering sub-types; for the Denis classification, κ was 0.76 (0.71-0.81). For FTAPF, with the AO scheme, the mean κ was 0.75 (0.69-0.81) considering fracture types and 0.45 (0.39-0.51) considering sub-types; for the Tronzo classification, κ was 0.64 (0.58-0.70). Using the main types of AO classifications, inter- and intra-observer agreement of TLSI were comparable to agreement evaluating FTAPF; including sub-types, inter- and intra-observer agreement evaluating TLSI were significantly better than assessing FTAPF. 
    Inter- and intra-observer agreements using the Denis classification were also significantly better than agreement using the Tronzo scheme. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Development of an observational measure of healthcare worker hand-hygiene behaviour: the hand-hygiene observation tool (HHOT).

    PubMed

    McAteer, J; Stone, S; Fuller, C; Charlett, A; Cookson, B; Slade, R; Michie, S

    2008-03-01

    Previous observational measures of healthcare worker (HCW) hand-hygiene behaviour (HHB) fail to provide adequate standard operating procedures (SOPs), accounts of inter-rater agreement testing or evidence of sensitivity to change. This study reports the development of an observational tool in a way that addresses these deficiencies. Observational categories were developed systematically, guided by a clinical guideline, previous measures and pilot hand-hygiene behaviour observations (HHOs). The measure, a simpler version of the Geneva tool, consists of HHOs (before and after low-risk, high-risk or unobserved contact), HHBs (soap, alcohol hand rub, no action, unknown), and type of HCW. Inter-observer agreement for each category was assessed by observation of 298 HHOs and HHBs by two independent observers on acute elderly and intensive care units. Raw agreement (%) and Kappa were 77% and 0.68 for HHB; 83% and 0.77 for HHO; and 90% and 0.77 for HCW. Inter-observer agreement for overall compliance of a group of HCWs was assessed by observation of 1191 HHOs and HHBs by two pairs of independent observers. Overall agreement was good (intraclass correlation coefficient = 0.79). Sensitivity to change was examined by autoregressive time-series modelling of longitudinal observations for 8 months on the intensive therapy unit during an Acinetobacter baumannii outbreak and subsequent strengthening of infection control measures. Sensitivity to change was demonstrated by a rise in compliance from 80 to 98% with an odds ratio of increased compliance of 7.00 (95% confidence interval: 4.02-12.2) P < 0.001.
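
    The raw agreement and kappa figures quoted in this record follow standard definitions. The following is a minimal sketch in Python, assuming the two observers' category labels are available as parallel lists; the function names and any labels passed to them are hypothetical and are not taken from the study.

```python
from collections import Counter


def raw_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = raw_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every category either rater used.
    p_e = sum(counts_a[c] * counts_b[c]
              for c in counts_a.keys() | counts_b.keys()) / n ** 2
    if p_e == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)
```

    Raw agreement ignores agreement expected by chance, which is why a study can report a high percentage alongside a noticeably lower kappa.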

  15. Home safety practices in an urban low-income population: level of agreement between parental self-report and observed behaviors.

    PubMed

    Lee, Lois K; Walia, Taranjeev; Forbes, Peter W; Osganian, Stavroula K; Samuels, Ronald; Cox, Joanne E; Mooney, David P

    2012-12-01

    Home-related injuries are overrepresented in children from low-income households. The objectives of this study were to determine frequencies of home safety behaviors and the level of agreement between parental self-report and observed safety practices in low-income homes. In a prospective, interventional home injury prevention study of 49 low-income families with children <5 years old, a trained home visitor administered baseline parental home safety behavior questionnaires and assessments. There was high agreement between caregiver self-report and home visitor observation for lack of cabinet latch (99%, 95% confidence interval [CI] = 88%-99%) and stair gate use (100%, 95% CI = 88%-100%). There was lower agreement for the safe storage of cleaning supplies (62%, 95% CI = 46%-75%), sharps (74%, 95% CI = 59%-85%), and medicines/vitamins (83%, 95% CI = 69%-92%) because of the overreporting of safe practices. Self-reports of some home safety behaviors are relatively accurate, but certain practices may need to be verified by direct assessment.
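
    Percent agreement with a confidence interval, as reported in this record, can be computed in several ways. The abstract does not say which interval method was used, so the sketch below uses the Wilson score interval purely as an assumption; the function name is hypothetical and the code is not intended to reproduce the published intervals.

```python
import math


def wilson_interval(agreements, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= agreements <= n:
        raise ValueError("need 0 <= agreements <= n and n > 0")
    p = agreements / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)
```

    Unlike the simple normal approximation, this interval stays inside 0%-100% even when observed agreement is at or near 100%, which matters for small samples.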

  16. Comparative Study of the Diagnostic Value of Panoramic and Conventional Radiography of the Wrist in Scaphoid Fractures

    PubMed Central

    Ezoddini Ardakani, Fatemeh; Zangoie Booshehri, Maryam; Banadaki, Seyed Hossein Saeed; Nafisi-Moghadam, Reza

    2012-01-01

    Background: Scaphoid fractures are the most common type of carpal fractures. Objectives: The aim of the study was to compare the diagnostic value of panoramic and conventional radiographs of the wrist in scaphoid fractures. Patients and Methods: The panoramic and conventional radiographs of 122 patients with acute and chronic wrist trauma were studied. The radiographs were analyzed and examined by two independent radiologist observers: one physician radiologist and one maxillofacial radiologist. The final diagnosis was made by an orthopedic specialist. The kappa test was used for statistical calculations, inter- and intra-observer agreement, and correlation between the two techniques. Results: Wrist panoramic radiography was more accurate than conventional radiography for ruling out scaphoid fractures. There was agreement in 85% or more of the cases. Inter- and intra-observer agreement values were higher for panoramic examinations than for conventional radiographic examinations. Conclusion: The panoramic examination of the wrist is a useful technique for the diagnosis and follow-up of scaphoid fractures. Its use is recommended as a complement to conventional radiography in cases with inconclusive findings. PMID:23599708

  17. [Inter-observes agreement of Ishak and Metavir scores in histological evaluation of chronic viral hepatitis B and C].

    PubMed

    Rammeh, Soumaya; Khadra, Hajer Ben; Znaidi, Nadia Sabbegh; Romdhane, Neila Attia; Najjar, Taoufik; Bouzaidi, Slim; Zermani, Rachida

    2014-01-01

    Many classification systems are currently used for histological evaluation of the severity of chronic viral hepatitis, including the Ishak and Metavir scores, but there is no consensus classification. The objective of this work was to study the intra- and inter-observer agreement of these two scores in the histopathological analysis of liver biopsies in patients with chronic viral hepatitis B or C. Fifty-nine patients were included in the study: 26 had chronic hepatitis C and 33 had chronic hepatitis B. To investigate inter-observer agreement, the liver biopsies were analyzed separately by two pathologists without prior consensus reading. The two pathologists then conducted a consensus reading before reviewing all cases independently. Cohen's kappa coefficient was calculated and, in case of asymmetry, Spearman's rho coefficient. Before the consensus reading, agreement was moderate for the analysis of histological activity with both scores (Metavir: kappa=0.41, Ishak: rho=0.58). For the analysis of fibrosis, agreement was good with both scores (Metavir: kappa=0.61, Ishak: rho=0.86). The consensus reading improved the reproducibility of the activity assessment, which became good with both scores (Metavir: kappa=0.77, Ishak: rho=0.76). For fibrosis, improvement was observed with the Ishak score, for which agreement became excellent (kappa=0.81). In conclusion, we recommend, in routine practice, a combined score (Metavir for activity and Ishak for fibrosis) and a double reading of each biopsy.

  18. Psychometric properties of a sign language version of the Mini International Neuropsychiatric Interview (MINI).

    PubMed

    Øhre, Beate; Saltnes, Hege; von Tetzchner, Stephen; Falkum, Erik

    2014-05-22

    There is a need for psychiatric assessment instruments that enable reliable diagnoses in persons with hearing loss who have sign language as their primary language. The objective of this study was to assess the validity of the Norwegian Sign Language (NSL) version of the Mini International Neuropsychiatric Interview (MINI). The MINI was translated into NSL. Forty-one signing patients consecutively referred to two specialised psychiatric units were assessed with a diagnostic interview by clinical experts and with the MINI. Inter-rater reliability was assessed with Cohen's kappa and "observed agreement". There was 65% agreement between MINI diagnoses and clinical expert diagnoses. Kappa values indicated fair to moderate agreement, and observed agreement was above 76% for all diagnoses. The MINI diagnosed more co-morbid conditions than did the clinical expert interview (mean diagnoses: 1.9 versus 1.2). Kappa values indicated moderate to substantial agreement, and "observed agreement" was above 88%. The NSL version performs similarly to other MINI versions and demonstrates adequate reliability and validity as a diagnostic instrument for assessing mental disorders in persons who have sign language as their primary and preferred language.

  19. Teacher and TA Ratings of Preschoolers' Externalizing Behavior: Agreement and Associations with Observed Classroom Behavior

    ERIC Educational Resources Information Center

    Wolcott, Catherine Sanger; Williford, Amanda P.

    2015-01-01

    The present study investigated teachers' and teacher aides' (TAs) agreement in their ratings of preschoolers' externalizing behavior and their associations with observed classroom behavior for a sample of children at risk of developing a disruptive behavior disorder. One hundred twenty-two teachers rated 360 students' externalizing behavior in the…

  20. Tooth shade measurements under standard and nonstandard illumination and their agreement with skin color.

    PubMed

    Al-Dwairi, Ziad; Shaweesh, Ashraf; Kamkarfar, Sohrab; Kamkarfar, Shahrzad; Borzabadi-Farahani, Ali; Lynch, Edward

    2014-01-01

    The purpose of this study was to examine the relationship between skin color (shade) and tooth shade under standard and nonstandard illumination sources. Four hundred Jordanian participants (200 males, 200 females, 20 to 50 years of age) were studied. Skin colors were assessed and categorized using the L'Oreal and Revlon foundation shade guides (light, medium, dark). The Vita Pan Classical Shade Guide (VPCSG; Vident) and digital Vita EasyShade Intraoral Dental Spectrophotometer (VESIDS; Vident) were used to select shades in the middle thirds of maxillary central incisors; tooth shades were classified into four categories (highest, high, medium, low). Significant gender differences were observed for skin colors (P = .000) and tooth shade guide systems (P = .001 and .050 for VPCSG and VESIDS, respectively). The observed agreement was 100% and 93% for skin and tooth shade guides, respectively. The corresponding kappa statistic values were 1.00 and 0.79, respectively (substantial agreement, P < .001). The observed agreement between skin color and tooth shades (VPCSG and VESIDS) was approximately 50%. The digital tooth shade guide system can be a satisfactory substitute for classical tooth shade guides and clinical shade matching. There was only moderate agreement between skin color and tooth shade.

  1. A comparative study of software programmes for cross-sectional skeletal muscle and adipose tissue measurements on abdominal computed tomography scans of rectal cancer patients.

    PubMed

    van Vugt, Jeroen L A; Levolger, Stef; Gharbharan, Arvind; Koek, Marcel; Niessen, Wiro J; Burger, Jacobus W A; Willemsen, Sten P; de Bruin, Ron W F; IJzermans, Jan N M

    2017-04-01

    The association between body composition (e.g. sarcopenia or visceral obesity) and treatment outcomes, such as survival, using single-slice computed tomography (CT)-based measurements has recently been studied in various patient groups. These studies have been conducted with different software programmes, each with their specific characteristics, of which the inter-observer, intra-observer, and inter-software correlation are unknown. Therefore, a comparative study was performed. Fifty abdominal CT scans were randomly selected from 50 different patients and independently assessed by two observers. Cross-sectional muscle area (CSMA, i.e. rectus abdominis, oblique and transverse abdominal muscles, paraspinal muscles, and the psoas muscle), visceral adipose tissue area (VAT), and subcutaneous adipose tissue area (SAT) were segmented by using standard Hounsfield unit ranges and computed for regions of interest. The inter-software, intra-observer, and inter-observer agreement for CSMA, VAT, and SAT measurements using FatSeg, OsiriX, ImageJ, and sliceOmatic were calculated using intra-class correlation coefficients (ICCs) and Bland-Altman analyses. Cohen's κ was calculated for the agreement of sarcopenia and visceral obesity assessment. The Jaccard similarity coefficient was used to compare the similarity and diversity of measurements. Bland-Altman analyses and ICC indicated that the CSMA, VAT, and SAT measurements between the different software programmes were highly comparable (ICC 0.979-1.000, P < 0.001). All programmes adequately distinguished between the presence or absence of sarcopenia (κ = 0.88-0.96 for one observer and all κ = 1.00 for all comparisons of the other observer) and visceral obesity (all κ = 1.00). Furthermore, excellent intra-observer (ICC 0.999-1.000, P < 0.001) and inter-observer (ICC 0.998-0.999, P < 0.001) agreement for all software programmes were found. 
    Accordingly, excellent Jaccard similarity coefficients were found for all comparisons (mean ≥ 0.964). FatSeg, OsiriX, ImageJ, and sliceOmatic showed an excellent agreement for CSMA, VAT, and SAT measurements on abdominal CT scans. Furthermore, excellent inter-observer and intra-observer agreement were achieved. Therefore, results of studies using these different software programmes can reliably be compared. © 2016 The Authors. Journal of Cachexia, Sarcopenia and Muscle published by John Wiley & Sons Ltd on behalf of the Society on Sarcopenia, Cachexia and Wasting Disorders.
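
    Two of the measures named in this record, Bland-Altman analysis and the Jaccard similarity coefficient, are straightforward to compute. The following is a minimal sketch, assuming paired area measurements are given as two equal-length lists and segmentations as collections of pixel coordinates; the function names are hypothetical, and the 1.96 multiplier is the conventional choice for 95% limits of agreement rather than a detail confirmed by the abstract.

```python
import statistics


def bland_altman(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # mean difference between the methods
    sd = statistics.stdev(diffs)    # sample SD of the differences
    return bias, bias - z * sd, bias + z * sd


def jaccard(region_a, region_b):
    """Jaccard similarity |A n B| / |A u B| of two segmented regions."""
    a, b = set(region_a), set(region_b)
    union = a | b
    if not union:
        return 1.0  # two empty regions are treated as identical
    return len(a & b) / len(union)
```

    The Bland-Altman limits describe how far two methods may differ on an individual scan, whereas a correlation-type coefficient such as the ICC summarises consistency across the whole sample.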

  2. A comparative study of software programmes for cross‐sectional skeletal muscle and adipose tissue measurements on abdominal computed tomography scans of rectal cancer patients

    PubMed Central

    Levolger, Stef; Gharbharan, Arvind; Koek, Marcel; Niessen, Wiro J.; Burger, Jacobus W.A.; Willemsen, Sten P.; de Bruin, Ron W.F.

    2016-01-01

    Background: The association between body composition (e.g. sarcopenia or visceral obesity) and treatment outcomes, such as survival, using single‐slice computed tomography (CT)‐based measurements has recently been studied in various patient groups. These studies have been conducted with different software programmes, each with their specific characteristics, of which the inter‐observer, intra‐observer, and inter‐software correlation are unknown. Therefore, a comparative study was performed. Methods: Fifty abdominal CT scans were randomly selected from 50 different patients and independently assessed by two observers. Cross‐sectional muscle area (CSMA, i.e. rectus abdominis, oblique and transverse abdominal muscles, paraspinal muscles, and the psoas muscle), visceral adipose tissue area (VAT), and subcutaneous adipose tissue area (SAT) were segmented by using standard Hounsfield unit ranges and computed for regions of interest. The inter‐software, intra‐observer, and inter‐observer agreement for CSMA, VAT, and SAT measurements using FatSeg, OsiriX, ImageJ, and sliceOmatic were calculated using intra‐class correlation coefficients (ICCs) and Bland–Altman analyses. Cohen's κ was calculated for the agreement of sarcopenia and visceral obesity assessment. The Jaccard similarity coefficient was used to compare the similarity and diversity of measurements. Results: Bland–Altman analyses and ICC indicated that the CSMA, VAT, and SAT measurements between the different software programmes were highly comparable (ICC 0.979–1.000, P < 0.001). All programmes adequately distinguished between the presence or absence of sarcopenia (κ = 0.88–0.96 for one observer and all κ = 1.00 for all comparisons of the other observer) and visceral obesity (all κ = 1.00). Furthermore, excellent intra‐observer (ICC 0.999–1.000, P < 0.001) and inter‐observer (ICC 0.998–0.999, P < 0.001) agreement for all software programmes were found. 
    Accordingly, excellent Jaccard similarity coefficients were found for all comparisons (mean ≥ 0.964). Conclusions: FatSeg, OsiriX, ImageJ, and sliceOmatic showed an excellent agreement for CSMA, VAT, and SAT measurements on abdominal CT scans. Furthermore, excellent inter‐observer and intra‐observer agreement were achieved. Therefore, results of studies using these different software programmes can reliably be compared. PMID:27897414

  3. A Local Agreement Pattern Measure Based on Hazard Functions for Survival Outcomes

    PubMed Central

    Dai, Tian; Guo, Ying; Peng, Limin; Manatunga, Amita K.

    2017-01-01

    Summary: Assessing agreement is often of interest in biomedical and clinical research when measurements are obtained on the same subjects by different raters or methods. Most classical agreement methods have been focused on global summary statistics, which cannot be used to describe various local agreement patterns. The objective of this work is to study the local agreement pattern between two continuous measurements subject to censoring. In this paper, we propose a new agreement measure based on bivariate hazard functions to characterize the local agreement pattern between two correlated survival outcomes. The proposed measure naturally accommodates censored observations, fully captures the dependence structure between bivariate survival times and provides detailed information on how the strength of agreement evolves over time. We develop a nonparametric estimation method for the proposed local agreement pattern measure and study theoretical properties including strong consistency and asymptotical normality. We then evaluate the performance of the estimator through simulation studies and illustrate the method using a prostate cancer data example. PMID:28724196

  4. A local agreement pattern measure based on hazard functions for survival outcomes.

    PubMed

    Dai, Tian; Guo, Ying; Peng, Limin; Manatunga, Amita K

    2018-03-01

    Assessing agreement is often of interest in biomedical and clinical research when measurements are obtained on the same subjects by different raters or methods. Most classical agreement methods have been focused on global summary statistics, which cannot be used to describe various local agreement patterns. The objective of this work is to study the local agreement pattern between two continuous measurements subject to censoring. In this article, we propose a new agreement measure based on bivariate hazard functions to characterize the local agreement pattern between two correlated survival outcomes. The proposed measure naturally accommodates censored observations, fully captures the dependence structure between bivariate survival times and provides detailed information on how the strength of agreement evolves over time. We develop a nonparametric estimation method for the proposed local agreement pattern measure and study theoretical properties including strong consistency and asymptotical normality. We then evaluate the performance of the estimator through simulation studies and illustrate the method using a prostate cancer data example. © 2017, The International Biometric Society.

  5. Inter-method agreement in retinal blood vessels diameter analysis between Dynamic Vessel Analyzer and optical coherence tomography.

    PubMed

    Benatti, Lucia; Corvi, Federico; Tomasso, Livia; Mercuri, Stefano; Querques, Lea; Ricceri, Fulvio; Bandello, Francesco; Querques, Giuseppe

    2017-06-01

    To analyze the inter-method agreement in arteriovenous ratio (AVR) evaluation between spectral-domain optical coherence tomography (SD-OCT) and Dynamic Vessel Analyzer (DVA). Healthy volunteers underwent DVA and SD-OCT examination. AVR was measured by SD-OCT using the four external lines of the optic nerve head-centered 7-line cube and by DVA using an automated AVR estimation. The mean AVR was calculated twice, separately, by two independent readers for each tool. Twenty-two eyes of 11 healthy subjects (five women and six men, mean age 35) were included. AVR analysis by DVA showed high inter-observer agreement between readers 1 and 2, and high intra-observer agreement for both reader 1 and reader 2. With regard to AVR analysis on SD-OCT, we found high inter-observer agreement between readers 1 and 2, and low intra-observer agreement for reader 2 but high intra-observer agreement for reader 1. Overall, the mean AVR measured on SD-OCT turned out to be significantly higher than the mean AVR measured through DVA (reader 1, 0.9023 ± 0.06 vs 0.8036 ± 0.08; p < 0.001, and reader 2, 0.9067 ± 0.06 vs 0.8083 ± 0.05; p = 0.003). No inter-method agreement in AVR could be detected in the present study due to bias in measurements (shift between DVA and SD-OCT). We found a significant difference between the two noninvasive methods for AVR measurement, with a tendency for SD-OCT to overestimate retinal vascular caliber in comparison to DVA. This may be useful for achieving greater accuracy in the evaluation of retinal vessels in ocular as well as systemic diseases.

  6. Postoperative chest tube management: measuring air leak using an electronic device decreases variability in the clinical practice.

    PubMed

    Varela, Gonzalo; Jiménez, Marcelo F; Novoa, Nuria Maria; Aranda, José Luis

    2009-01-01

    Since there are no data in the literature regarding variability in the management of postoperative pleural drainages, we have designed a prospective randomized study aimed at measuring inter-observer variability in deciding when to withdraw chest tubes after lung resection and to evaluate if the use of an electronic device to measure postoperative air leak decreases clinical practice variations. Sixty-one patients undergoing pulmonary resection were randomly assigned to one of the following groups: digital group (electronic measure of pleural air leak using Millicore AB DigiVent chest drainage system) or traditional group (standard water seal pleural chamber). Chest tube withdrawal criteria were established in advance. During morning rounds, two thoracic surgeons with comparable clinical experience and blinded to the decision of their counterpart, evaluated chest tube withdrawal criteria and noted whether the tube should be withdrawn or not. Inter-observer variability kappa index and global, positive, and negative agreement rates were calculated on 2 x 2 tables. Each observation episode was considered in the calculation. Fifty-four observations were recorded in the traditional group. Kappa coefficient was 0.37 (overall agreement rate: 0.58; positive agreement rate: 0.72; and negative agreement rate: 0.64). In the digital group, 67 observations were recorded. Kappa coefficient was 0.88 (overall agreement rate: 0.94; positive agreement rate 0.94; and negative agreement rate 0.94). We have demonstrated a high rate of disagreement related to the indication to remove chest tubes after lung resection and the improvement of the agreement rate with the use of an electronic device to measure postoperative air leak and pleural pressures.
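
    The kappa index and the global, positive, and negative agreement rates mentioned in this record can all be derived from a single 2 x 2 table. The sketch below uses the commonly cited "specific agreement" formulas; whether the authors used exactly these definitions is an assumption, and the function name and cell labels (a, b, c, d) are hypothetical.

```python
def two_by_two_agreement(a, b, c, d):
    """Agreement summary for a 2 x 2 table of two observers' decisions.

    a: both say "withdraw"; d: both say "keep";
    b and c: the two kinds of disagreement.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")

    def ratio(num, den):
        # Undefined rates (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    overall = (a + d) / n
    positive = ratio(2 * a, 2 * a + b + c)   # agreement on "withdraw"
    negative = ratio(2 * d, 2 * d + b + c)   # agreement on "keep"
    # Chance agreement from the row and column totals.
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = ratio(overall - p_e, 1 - p_e)
    return {"overall": overall, "positive": positive,
            "negative": negative, "kappa": kappa}
```

    Reporting positive and negative agreement alongside kappa shows whether disagreement is concentrated in one of the two decisions, which a single kappa value cannot reveal.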

  7. Suboptimal Agreement Among Cytopathologists in Diagnosis of Malignancy Based on Endoscopic Ultrasound Needle Aspirates of Solid Pancreatic Lesions: A Validation Study.

    PubMed

    Marshall, Carrie; Mounzer, Rawad; Hall, Matt; Simon, Violette; Centeno, Barbara; Dennis, Katie; Dhillon, Jasreman; Fan, Fang; Khazai, Laila; Klapman, Jason; Komanduri, Srinadh; Lin, Xiaoqi; Lu, David; Mehrotra, Sanjana; Muthusamy, V Raman; Nayar, Ritu; Paintal, Ajit; Rao, Jianyu; Sams, Sharon; Shah, Janak; Watson, Rabindra; Rastogi, Amit; Wani, Sachin

    2018-07-01

    Despite the widespread use of endoscopic ultrasound-guided fine-needle aspiration (EUS-FNA) to sample pancreatic lesions and the standardization of pancreaticobiliary cytopathologic nomenclature, there are few data on inter-observer agreement among cytopathologists evaluating pancreatic cytologic specimens obtained by EUS-FNA. We developed a scoring system to assess agreement among cytopathologists in overall diagnosis and quantitative and qualitative parameters, and evaluated factors associated with agreement. We performed a prospective study to validate results from our pilot study that demonstrated moderate to substantial inter-observer agreement among cytopathologists for the final cytologic diagnosis. In the first phase, 3 cytopathologists refined criteria for assessment of quantity and quality measures. During phase 2, EUS-FNA specimens of solid pancreatic lesions from 46 patients were evaluated by 11 cytopathologists at 5 tertiary care centers using a standardized scoring tool. Individual quantitative and qualitative measures were scored and an overall cytologic diagnosis was determined. Clinical and EUS parameters were assessed as predictors of unanimous agreement. Inter-observer agreement (IOA) was calculated using multi-rater kappa (κ) statistics and a logistic regression model was created to identify factors associated with unanimous agreement. The IOA for final diagnoses, based on cytologic analysis, was moderate (κ = 0.56; 95% CI, 0.43-0.70). Kappa values did not increase when categories of suspicious for malignancy, malignant, and neoplasm were combined. IOA was slight to moderate for individual quantitative (κ = 0.007; 95% CI, -0.03 to -0.04) and qualitative parameters (κ = 0.5; 95% CI, 0.47-0.53). Jaundice was the only factor associated with agreement among all cytopathologists on multivariate analysis (odds ratio for unanimous agreement, 5.3; 95% CI, 1.1-26.89). 
    There is a suboptimal level of agreement among cytopathologists in the diagnosis of malignancy based on analysis of EUS-FNA specimens obtained from solid pancreatic masses. Strategies are needed to refine the cytologic criteria for diagnosis of malignancy and enhance tissue acquisition techniques to improve diagnostic reproducibility among cytopathologists. Copyright © 2018 AGA Institute. Published by Elsevier Inc. All rights reserved.

  8. Inter-observer agreement of standard joint count examination and disease global assessment in a cohort of Egyptian Rheumatoid Arthritis patients.

    PubMed

    El-Hadidi, Khaled; Gamal, Sherif M; Saad, Sahar

    2017-12-21

    To assess the inter-observer agreement of standard joint counts between an experienced Rheumatology professor (Prof) and a young Rheumatology fellow (candidate), and to compare disease global assessment between professor, candidate and patients. This study included one hundred rheumatoid arthritis patients. For all patients, independent clinical evaluation was done by two rheumatologists (professor and candidate) for detection of tenderness in 28 joints and swelling in 26 joints. The study also involved global assessment of disease activity by the provider (Prof and candidate) (EGA) as well as by the patient (PGA). The EGA was determined without previous knowledge of the patient's laboratory test results. A highly significant correlation between professor and candidate was found in both the number of tender joints (r=0.946, p<0.001) and the number of swollen joints (r=0.797, p<0.001). Regarding swollen joints, the highest agreement was in the right knee (0.929), while poor agreement was found in the right 5th MCP (0.049). Regarding tender joints, the highest agreement was in the right elbow (0.899), in contrast to the left 3rd PIP (0.462), which showed the lowest. Kappa values for disease global assessment showed moderate agreement between professor and candidate (0.405), fair agreement between professor and patient (0.213), and fair agreement between candidate and patient (0.367). Inter-observer reliability was better for TJCs than SJCs. Regarding SJCs, agreement was better in large joints such as the knees than in small joints such as the MCPs. Disease global assessment may show discrepancy between patients and physicians. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Reumatología y Colegio Mexicano de Reumatología. All rights reserved.

  9. Inter-observer variability in diagnosing radiological features of aneurysmal subarachnoid hemorrhage; a preliminary single centre study comparing observers from different specialties and levels of training.

    PubMed

    Siddiqui, Usman T; Khan, Anjum F; Shamim, Muhammad Shahzad; Hamid, Rana Shoaib; Alam, Muhammad Mehboob; Emaduddin, Muhammad

    2014-01-01

    A noncontrast computed tomography (CT) scan remains the initial radiological investigation of choice for a patient with suspected aneurysmal subarachnoid hemorrhage (aSAH). This initial scan may be used to derive key information about the underlying aneurysm which may aid in further management. The interpretation, however, is subject to the skill and experience of the interpreting individual. The authors here evaluate the interpretation of such CT scans by different individuals at different levels of training, and in two different specialties (Radiology and Neurosurgery). Initial noncontrast CT scans of 35 patients with aSAH were evaluated independently by four different observers. The observers selected for the study included two from Radiology and two from Neurosurgery at different levels of training; a resident currently in mid training and a resident who had recently graduated from training of each specialty. Measured variables included interpreter's suspicion of presence of subarachnoid blood, side of the subarachnoid hemorrhage, location of the aneurysm, the aneurysm's proximity to vessel bifurcation, number of aneurysm(s), contour of aneurysm(s), presence of intraventricular hemorrhage (IVH), intracerebral hemorrhage (ICH), infarction, hydrocephalus and midline shift. To determine the inter-observer variability (IOV), weighted kappa values were calculated. There was moderate agreement on most of the CT scan findings among all observers. Substantial agreement was found amongst all observers for hydrocephalus, IVH, and ICH. Lowest agreement rates were seen in the location of aneurysm being supra or infra tentorial. There were, however, some noteworthy exceptions. There was substantial to almost perfect agreement between the radiology graduate and radiology resident on most CT findings. The lowest agreement was found between the neurosurgery graduate and the radiology graduate. 
    Our study suggests that although agreements were seen in the interpretation of some of the radiological features of aSAH, there is still considerable IOV in the interpretation of most features among physicians belonging to different levels of training and different specialties. Whether these might affect management or outcome is unclear.
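
    Weighted kappa, used in this record, gives partial credit when two observers choose neighbouring ordinal categories. The abstract does not state which weighting scheme was applied, so the sketch below offers both linear and quadratic disagreement weights as assumptions; the function name and its parameters are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Weighted kappa for ordinal ratings; `categories` lists levels in order."""
    k = len(categories)
    if k < 2 or len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need >= 2 categories and equal-length ratings")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Cross-tabulate the two raters' labels.
    table = [[0] * k for _ in range(k)]
    for x, y in zip(rater_a, rater_b):
        table[index[x]][index[y]] += 1
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power   # disagreement weight
            observed += w * table[i][j] / n
            expected += w * row[i] * col[j] / n ** 2
    if expected == 0:
        return float("nan")  # all ratings in one category; undefined
    return 1 - observed / expected
```

    Quadratic weights penalise large disagreements more heavily than linear weights, so the same ratings usually yield a higher kappa when most disagreements are between adjacent categories.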

  10. RELIABILITY AND VALIDITY OF SUBJECTIVE ASSESSMENT OF LUMBAR LORDOSIS IN CONVENTIONAL RADIOGRAPHY.

    PubMed

    Ruhinda, E; Byanyima, R K; Mugerwa, H

    2014-10-01

    Reliability and validity studies of different lumbar curvature analysis and measurement techniques have been documented; however, there is limited literature on the reliability and validity of subjective visual analysis. Radiological assessment of the lumbar lordotic curve aids in early diagnosis of conditions even before neurologic changes set in. The objective was to ascertain the level of reliability and validity of subjective assessment of lumbar lordosis in conventional radiography. A blinded, repeated-measures diagnostic test was carried out on lumbar spine x-ray radiographs at the Radiology Department of the Joint Clinical Research Centre (JCRC), Mengo-Kampala, Uganda. Seventy (70) lateral lumbar x-ray films were used for this study and were obtained from the archive of the JCRC radiology department at Butikiro house, Mengo-Kampala. Poor observer agreement, both inter- and intra-observer, with kappa values of 0.16, was found. Inter-observer agreement was poorer than intra-observer agreement. Kappa values rose significantly when the lumbar lordosis was clustered into four categories without grading each abnormality. The results confirm that subjective assessment of lumbar lordosis has low reliability and validity. Film quality has limited influence on the observer reliability. This study further shows that fewer scale categories of lordosis abnormalities produce better observer reliability.

  11. Inter-observer and intra-observer agreement on interpretation of uroflowmetry curves of kindergarten children.

    PubMed

    Chang, Shang-Jen; Yang, Stephen S D

    2008-12-01

    To evaluate the inter-observer and intra-observer agreement on the interpretation of uroflowmetry curves of children. Healthy kindergarten children were enrolled for evaluation of uroflowmetry. Uroflowmetry curves were classified as bell-shaped, tower, plateau, staccato and interrupted. Only the bell-shaped curves were regarded as normal. Two urodynamists evaluated the curves independently after reviewing the definitions of the different types of uroflowmetry curve. The senior urodynamist evaluated the curves twice 3 months apart. The final conclusion was made when consensus was reached. Agreement among observers was analyzed using kappa statistics. Of 190 uroflowmetry curves eligible for analysis, the intra-observer agreement in interpreting each type of curve and interpreting normalcy vs abnormality was good (kappa=0.71 and 0.68, respectively). Very good inter-observer agreement (kappa=0.81) on normalcy and good inter-observer agreement (kappa=0.73) on types of uroflowmetry were observed. Poor inter-observer agreement existed on the classification of specific types of abnormal uroflowmetry curves (kappa=0.07). Uroflowmetry is a good screening tool for normalcy of kindergarten children, while not a good tool to define the specific types of abnormal uroflowmetry.

  12. Counseling and Knowledge of Danger Signs of Pregnancy Complications in Haiti, Malawi, and Senegal.

    PubMed

    Assaf, Shireen

    2018-06-23

    Objectives Providing counseling on danger signs of pregnancy complications as part of visits for antenatal care (ANC) can raise expecting women's awareness so that if danger signs occur they can seek assistance in time. The study examines the level of agreement in counseling on danger signs between observation of the provider during the ANC visit and the client's report in the exit interview, and the association of this agreement with the client's level of knowledge on danger signs. Methods The analysis used data from service provision and assessment (SPA) surveys in Haiti, Malawi, and Senegal. Agreement between the observation and client's report was measured by Cohen's kappa and percent agreement. Regressions were performed on the number of danger signs the client knew, with the level of agreement on the counseling on danger signs as the main independent variable. Results The study found little agreement between the observation of counseling and the client's report that the counseling occurred, despite the fact that the exit interview with the client was performed immediately following the ANC visit with the provider. The level of positive agreement between observation and client's report was 17% in Haiti, 33% in Malawi, and 23% in Senegal. Clients' overall knowledge of danger signs was low; in all three countries the mean number of danger signs known was 1.5 or less. The regression analysis found that, in order to show a significant increase in knowledge of danger signs, it was important for the client to report that it took place. Conclusions Ideally, there should be 100% positive agreement that counseling occurred. To achieve this level requires raising both the level of counseling on danger signs of pregnancy complications and its quality. While challenges exist, providing counseling that is more client-centered and focuses on the client's needs could improve quality and thus could increase the client's knowledge of danger signs.
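
    The two measures named in this abstract, percent agreement and Cohen's kappa, can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names and the ten paired labels are hypothetical, and the formula is the standard kappa = (p_o - p_e) / (1 - p_e).

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which the two sources give the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty, equal-length label sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of each source's marginal proportions.
    p_e = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data (not from the study): 1 = counseling recorded, 0 = not.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
reported = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]

print(percent_agreement(observed, reported))
print(cohens_kappa(observed, reported))
```

    For these made-up labels the two sources agree on 7 of 10 items, yet kappa is only about 0.4, because half of that agreement would be expected by chance.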

  13. Are photographic records reliable for orthodontic screening?

    PubMed

    Mandall, N A

    2002-06-01

    The aim of the study was to evaluate the reliability of a panel of orthodontists for accepting new patient referrals based on clinical photographs. Eight orthodontists from Greater Manchester, Lancashire, Chester, and Derbyshire observed clinical photographs of 40 consecutive new patients attending the orthodontic department, Hope Hospital, Salford. They recorded whether or not they would accept the patient, as a new patient referral, in their department. Each consultant was asked to take into account factors, such as oral hygiene, dental development, and severity of the malocclusion. Kappa statistic for multiple-rater agreement and kappa statistic for intra-observer reliability were calculated. Inter-observer panel agreement for accepting new patient referrals based on photographic information was low (multiple rater kappa score 0.37). Intra-examiner agreement was better (kappa range 0.34-0.90). Clinician agreement for screening and accepting orthodontic referrals based on clinical photographs is comparable to that previously reported for other clinical decision making.
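
    The abstract does not say which multiple-rater kappa was used. One common choice is Fleiss' kappa, sketched below under that assumption; the function name and the input layout (one row of category counts per patient) are illustrative only.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of category counts.

    counts[i][j] is the number of raters who placed subject i in category j.
    Every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Overall proportion of all ratings that fall in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: the share of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical table: 8 raters, columns are (accept, reject).
unanimous = [[8, 0], [0, 8], [8, 0], [0, 8]]
evenly_split = [[4, 4], [4, 4], [4, 4]]
print(fleiss_kappa(unanimous), fleiss_kappa(evenly_split))
```

    Unanimous panels give a kappa of 1, while panels that split evenly on every patient give a slightly negative value, that is, agreement no better than chance.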

  14. Inter-observer variability in the classification of ovarian cancer cell type using microscopy: a pilot study

    NASA Astrophysics Data System (ADS)

    Gavrielides, Marios A.; Ronnett, Brigitte M.; Vang, Russell; Seidman, Jeffrey D.

    2015-03-01

    Studies have shown that different cell types of ovarian carcinoma have different molecular profiles, exhibit different behavior, and that patients could benefit from type-specific treatment. Different cell types display different histopathology features, and different criteria are used for each cell type classification. Inter-observer variability for the task of classifying ovarian cancer cell types is an under-examined area of research. This study served as a pilot study to quantify observer variability related to the classification of ovarian cancer cell types and to extract valuable data for designing a validation study of digital pathology (DP) for this task. Three observers with expertise in gynecologic pathology reviewed 114 cases of ovarian cancer with optical microscopy, with specific guidelines for classifications into distinct cell types. For 93 cases all 3 pathologists agreed on the same cell type, for 18 cases 2 out of 3 agreed, and for 3 cases there was no agreement. Across cell types with a minimum sample size of 10 cases, agreement between all three observers was {91.1%, 80.0%, 90.0%, 78.6%, 100.0%, 61.5%} for the high grade serous carcinoma, low grade serous carcinoma, endometrioid, mucinous, clear cell, and carcinosarcoma cell types respectively. These results indicate that unanimous agreement varied over a fairly wide range. However, additional research is needed to determine the importance of these differences in comparison studies. These results will be used to aid in the design and sizing of such a study comparing optical and digital pathology. In addition, the results will help in understanding the potential role computer-aided diagnosis has in helping to improve the agreement of pathologists for this task.

  15. Agreement between histopathological results in clinically diagnosed cases of indeterminate leprosy in São Paulo, Brazil.

    PubMed

    Lombardi, C; Cohen, S; Leiker, D L; Souza, J M; Cunha, P R; Martelli, C M; Andrade, A L; Zicker, F

    1994-01-01

    Histopathological slides from skin biopsies of fifty-seven self-reporting patients diagnosed as indeterminate leprosy by the Leprosy Control Programme in São Paulo were sent to three independent histopathologists. Agreement between the reports was based on the following diagnoses: "indeterminate leprosy", "suggestive leprosy" or "no leprosy". A great variation was observed in the interpretation of the histopathological examination. The three pathologists reported "indeterminate leprosy" in 7.0%, 54.4% and 84.2% of the cases studied, respectively. The kappa index of agreement between any two pathologists ranged from 0.08 to 0.32, showing poor agreement between observers. Agreement improved by pooling together the reports "suggestive leprosy" and "indeterminate leprosy". The three pathologists agreed on the results of 24 biopsies of the 27 classified as leprosy by any one of the three observers. Eight cases were considered as "no leprosy" by all pathologists. Higher agreement indices were obtained for positive and negative proportionate concordance between any two examiners. The implications of the variation in the diagnosis of indeterminate leprosy and early leprosy are discussed in the context of public health and case-management.

  16. Stool frequency recording in severe acute malnutrition ('StoolSAM'); an agreement study comparing maternal recall versus direct observation using diapers.

    PubMed

    Voskuijl, Wieger; Potani, Isabel; Bandsma, Robert; Baan, Anne; White, Sarah; Bourdon, Celine; Kerac, Marko

    2017-06-07

    Approximately 50% of the deaths of children under the age of 5 can be attributed to undernutrition, which also encompasses severe acute malnutrition (SAM). Diarrhoea is strongly associated with these deaths and is commonly diagnosed solely based on stool frequency and consistency obtained through maternal recall. This trial aims to determine whether this approach is equivalent to a 'directly observed method' in which a health care worker directly observed stool frequency using diapers in hospitalised children with complicated SAM. This study was conducted at 'Moyo' Nutritional Rehabilitation Unit, Queen Elizabeth Central Hospital, Malawi. Participants were children aged 5-59 months admitted with SAM. We compared 2 days of stool frequency data obtained with next-day maternal-recall versus a 'gold standard' in which a health care worker observed stool frequency every 2 h using diapers. After study completion, guardians were asked their preferred method and their level of education. We found poor agreement between maternal recall and the 'gold standard' of directly observed diapers. The sensitivity to detect diarrhoea based on maternal recall was poor, with only 75 and 56% of diarrhoea cases identified on days 1 and 2, respectively. However, the specificity was higher with more than 80% of children correctly classified as not having diarrhoea. On day 1, the mean stool frequency difference between the two methods was -0.17 (SD; 1.68) with limits of agreement (of stool frequency) of -3.55 and 3.20 and, similarly on day 2, the mean difference was -0.2 (SD; 1.59) with limits of agreement of -3.38 and 2.98. These limits extend beyond the pre-specified 'acceptable' limits of agreement (±1.5 stool per day) and indicate that the 2 methods are non-equivalent. The higher the stool frequency, the more discrepant the two methods were. Most primary care givers strongly preferred using diapers. 
This study shows a lack of agreement between the assessment of stool frequency in SAM patients using maternal recall and direct observation of diapers. When designing studies, one should consider using diapers to determine diarrhoea incidence/prevalence in SAM patients, especially when accuracy is essential. ISRCTN11571116 (registered 29/11/2013).
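
    The limits of agreement quoted in this abstract follow the usual Bland-Altman form, mean difference plus or minus a multiple of the SD of the differences. The sketch below is an illustration with hypothetical function names. The abstract does not state the multiplier: the conventional 1.96 gives slightly narrower limits than those reported, whereas a multiplier of 2 reproduces the day 2 limits, so 2 is used here as an assumption.

```python
def limits_of_agreement(mean_diff, sd_diff, multiplier=1.96):
    """Return (lower, upper) limits: mean_diff -/+ multiplier * sd_diff."""
    half_width = multiplier * sd_diff
    return mean_diff - half_width, mean_diff + half_width


def within_acceptable(limits, acceptable):
    """True when both limits fall inside the band [-acceptable, +acceptable]."""
    lower, upper = limits
    return -acceptable <= lower and upper <= acceptable


# Day 2 figures quoted in the abstract: mean difference -0.2, SD 1.59.
# The multiplier of 2.0 is an assumption, not stated in the abstract.
day2 = limits_of_agreement(-0.2, 1.59, multiplier=2.0)
print([round(x, 2) for x in day2])
# Pre-specified acceptable limits from the abstract: +/- 1.5 stools per day.
print(within_acceptable(day2, 1.5))
```

    With these inputs the limits come out at -3.38 and 2.98, well outside the pre-specified band of ±1.5 stools per day, which matches the abstract's conclusion that the two methods are not equivalent.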

  17. A Statistical Analysis of Reviewer Agreement and Bias in Evaluating Medical Abstracts

    PubMed Central

    Cicchetti, Domenic V.; Conn, Harold O.

    1976-01-01

    Observer variability affects virtually all aspects of clinical medicine and investigation. One important aspect, not previously examined, is the selection of abstracts for presentation at national medical meetings. In the present study, 109 abstracts, submitted to the American Association for the Study of Liver Disease, were evaluated by three “blind” reviewers for originality, design-execution, importance, and overall scientific merit. Of the 77 abstracts rated for all parameters by all observers, interobserver agreement ranged between 81 and 88%. However, corresponding intraclass correlations varied between 0.16 (approaching statistical significance) and 0.37 (p < 0.01). Specific tests of systematic differences in scoring revealed statistically significant levels of observer bias on most of the abstract components. Moreover, the mean differences in interobserver ratings were quite small compared to the standard deviations of these differences. These results emphasize the importance of evaluating the simple percentage of rater agreement within the broader context of observer variability and systematic bias. PMID:997596

  18. Hand assessment in older adults with musculoskeletal hand problems: a reliability study.

    PubMed

    Myers, Helen L; Thomas, Elaine; Hay, Elaine M; Dziedzic, Krysia S

    2011-01-07

    Musculoskeletal hand pain is common in the general population. This study aims to investigate the inter- and intra-observer reliability of two trained observers conducting a simple clinical interview and physical examination for hand problems in older adults. The reliability of applying the American College of Rheumatology (ACR) criteria for hand osteoarthritis to community-dwelling older adults will also be investigated. Fifty-five participants aged 50 years and over with a current self-reported hand problem and registered with one general practice were recruited from a previous health questionnaire study. Participants underwent a standardised, structured clinical interview and physical examination by two independent trained observers and again by one of these observers a month later. Agreement beyond chance was summarised using Kappa statistics and intra-class correlation coefficients. Median values for inter- and intra-observer reliability for clinical interview questions were found to be "substantial" and "moderate" respectively [median agreement beyond chance (Kappa) was 0.75 (range: -0.03, 0.93) for inter-observer ratings and 0.57 (range: -0.02, 1.00) for intra-observer ratings]. Inter- and intra-observer reliability for physical examination items was variable, with good reliability observed for some items, such as grip and pinch strength, and poor reliability observed for others, notably assessment of altered sensation, pain on resisted movement and judgements based on observation and palpation of individual features at single joints, such as bony enlargement, nodes and swelling. Moderate agreement was observed both between and within observers when applying the ACR criteria for hand osteoarthritis. Standardised, structured clinical interview is reliable for taking a history in community-dwelling older adults with self reported hand problems. Agreement between and within observers for physical examination items is variable. 
Low Kappa values may have resulted, in part, from a low prevalence of clinical signs and symptoms in the study participants. The decision to use clinical interview and hand assessment variables in clinical practice or further research in primary care should include consideration of clinical applicability and training alongside reliability. Further investigation is required to determine the relationship between these clinical questions and assessments and the clinical course of hand pain and hand problems in community-dwelling older adults.
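
    The verbal labels used in this abstract ("substantial" for 0.75, "moderate" for 0.57) are consistent with the widely used Landis and Koch (1977) benchmarks, although the abstract does not name the scale. A minimal lookup under that assumption, with a hypothetical function name:

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) descriptive label."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


for value in (0.75, 0.57, -0.03):
    print(value, landis_koch_label(value))
```

    Other records in this list use a different vocabulary ("good", "very good"), so the labels are only comparable across studies once the underlying scale is known.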

  19. Are distal radius fracture classifications reproducible? Intra and interobserver agreement.

    PubMed

    Belloti, João Carlos; Tamaoki, Marcel Jun Sugawara; Franciozi, Carlos Eduardo da Silveira; Santos, João Baptista Gomes dos; Balbachevsky, Daniel; Chap Chap, Eduardo; Albertoni, Walter Manna; Faloppa, Flávio

    2008-05-01

    Various classification systems have been proposed for fractures of the distal radius, but the reliability of these classifications is seldom addressed. For a fracture classification to be useful, it must provide prognostic significance, interobserver reliability and intraobserver reproducibility. The aim here was to evaluate the intraobserver and interobserver agreement of distal radius fracture classifications. This was a validation study on interobserver and intraobserver reliability. It was developed in the Department of Orthopedics and Traumatology, Universidade Federal de São Paulo - Escola Paulista de Medicina. X-rays from 98 cases of displaced distal radius fracture were evaluated by five observers: one third-year orthopedic resident (R3), one sixth-year undergraduate medical student (UG6), one radiologist physician (XRP), one orthopedic trauma specialist (OT) and one orthopedic hand surgery specialist (OHS). The radiographs were classified on three different occasions (times T1, T2 and T3) using the Universal (Cooney), Arbeitsgemeinschaft für Osteosynthesefragen/Association for the Study of Internal Fixation (AO/ASIF), Frykman and Fernández classifications. The kappa coefficient (kappa) was applied to assess the degree of agreement. Among the three occasions, the highest mean intraobserver k was observed in the Universal classification (0.61), followed by Fernández (0.59), Frykman (0.55) and AO/ASIF (0.49). The interobserver agreement was unsatisfactory in all classifications. The Fernández classification showed the best agreement (0.44) and the worst was the Frykman classification (0.26). The low agreement levels observed in this study suggest that there is still no classification method with high reproducibility.

  20. On Deviations between Observed and Theoretically Estimated Values on Additivity-Law Failures

    NASA Astrophysics Data System (ADS)

    Nayatani, Yoshinobu; Sobagaki, Hiroaki

    The authors have reported in previous studies that the average observed results are about half of the corresponding predictions in experiments with large additivity-law failures. One of the reasons for the deviations is studied and clarified by using the original observed data on additivity-law failures in the Nakano experiment. The observations and their analyses showed that it was essentially difficult to obtain good agreement between the average observed results and the corresponding theoretical predictions in experiments with large additivity-law failures. This is caused by a kind of unavoidable psychological pressure on the subjects who participated in the experiments. We should be satisfied with the agreement in trend between them.

  1. Application of psychometric theory to the measurement of voice quality using rating scales.

    PubMed

    Shrivastav, Rahul; Sapienza, Christine M; Nandur, Vuday

    2005-04-01

    Rating scales are commonly used to study voice quality. However, recent research has demonstrated that perceptual measures of voice quality obtained using rating scales suffer from poor interjudge agreement and reliability, especially in the mid-range of the scale. These findings, along with those obtained using multidimensional scaling (MDS), have been interpreted to show that listeners perceive voice quality in an idiosyncratic manner. Based on psychometric theory, the present research explored an alternative explanation for the poor interlistener agreement observed in previous research. This approach suggests that poor agreement between listeners may result, in part, from measurement errors related to a variety of factors rather than true differences in the perception of voice quality. In this study, 10 listeners rated breathiness for 27 vowel stimuli using a 5-point rating scale. Each stimulus was presented to the listeners 10 times in random order. Interlistener agreement and reliability were calculated from these ratings. Agreement and reliability were observed to improve when multiple ratings of each stimulus from each listener were averaged and when standardized scores were used instead of absolute ratings. The probability of exact agreement was found to be approximately .9 when using averaged ratings and standardized scores. In contrast, the probability of exact agreement was only .4 when a single rating from each listener was used to measure agreement. These findings support the hypothesis that poor agreement reported in past research partly arises from errors in measurement rather than individual differences in the perception of voice quality.

  2. Do children report differently from their parents and from observed data? Cross-sectional data on fruit, water, sugar-sweetened beverages and break-time foods.

    PubMed

    van de Gaar, V M; Jansen, W; van der Kleij, M J J; Raat, H

    2016-04-18

    Reliable assessment of children's dietary behaviour is needed for research purposes. The aim of this study was (1) to investigate the level of agreement between observed and child-reported break-time food items; and (2) to investigate the level of agreement between children's reports and those of their parents regarding children's overall consumption of fruit, water and sugar-sweetened beverages (SSB). The children in this study were 9-13 years old, attending primary schools in Rotterdam, the Netherlands. Children were observed with respect to foods brought for break-time at school. On the same day, children completed a questionnaire in which they were asked to recall the food(s) they brought to school to consume during break-time. Only paired data (observed and child-reported) were included in the analyses (n = 407 pairs). To determine each child's daily consumption and average amounts of fruit, water and SSB consumed, children and their parents completed parallel questionnaires. Only paired data (parent-reported and child-reported) were included in the analyses (n = 275 pairs). The main statistical measures were level of agreement between break-time foods, fruit, water and SSB; and Intra-class Correlation Coefficients (ICC). More children reported bringing sandwiches and snacks for break-time than was observed (73% vs 51% observed and 84% vs 33% observed). The overall agreement between observed and child-reported break-time foods was poor to fair, with ICC range 0.16-0.39 (p < 0.05). Children reported higher average amounts of SSB consumed than did their parents (1.3 vs 0.9 L SSB, p < 0.001). Child and parent estimations of the child's water and fruit consumption were similar. ICC between parent and child reports was poor to good (range 0.22-0.62, p < 0.05). Children report higher amounts of break-time foods than were observed, and children's reports of SSB consumption are higher than those of their parents.
Since the level of agreement between the observed break-time foods and those reported by children, and the agreement between parent and child reports of the child's intake, are relatively weak, future studies should focus on improving methods of evaluating children's consumption behaviour or on how best to use and interpret multiple-source dietary intake data. Current Controlled Trials NTR3400.

  3. Accuracy of infrared thermometers in very low birth weight infants and impact on newborn behavioural states.

    PubMed

    Jarvis, Melanie; Guy, Katelyn J; König, Kai

    2013-06-01

    To study the impact on newborn behavioural states and accuracy of three infrared thermometers compared with digital axillary thermometer measurements in very low birth weight infants. Single-centre prospective observational study. Preterm infants with a birth weight <1500 g were eligible. Infants were observed for pre-measurement behaviour state using a five-point neonatal behaviour observation tool. One infrared temperature was taken from each of the devices, followed by an axillary measurement. Further behaviour-state observations were recorded following infrared and axillary measurements. One hundred measurements were collected from each infrared device among a cohort of 42 very low birth weight infants. Only one infrared device showed satisfactory agreement with bias -0.071 (95% limits of agreement -0.68 to 0.54). The other two devices demonstrated poor agreement: bias -1.34; 95% limits of agreement -2.62 to -0.5 and bias -0.56; 95% limits of agreement -1.38 to 0.25. Neonatal behavioural scores showed only minimal changes when infrared measurements were performed but increased significantly following axillary measurements. The difference between the two modalities was statistically significant with a mean increase of 1.44 points following axillary measurements (95% confidence interval 1.21 to 1.67, P < 0.001). Temperature measurements taken with infrared thermometers caused less disruption to preterm infants' behavioural state; however, the accuracy of the devices varied. © 2013 The Authors. Journal of Paediatrics and Child Health © 2013 Paediatrics and Child Health Division (Royal Australasian College of Physicians).

  4. Observer performance in diagnosing osteoporosis by dental panoramic radiographs: results from the osteoporosis screening project in dentistry (OSPD).

    PubMed

    Taguchi, A; Asano, A; Ohtsuka, M; Nakamoto, T; Suei, Y; Tsuda, M; Kudo, Y; Inagaki, K; Noguchi, T; Tanimoto, K; Jacobs, R; Klemetti, E; White, S C; Horner, K

    2008-07-01

    Mandibular cortical erosion detected on dental panoramic radiographs (DPRs) may be useful for identifying women with osteoporosis, but little is known about the variation in diagnostic efficacy of observers worldwide. The purpose of this study was to measure the accuracy in identifying women at risk for osteoporosis in a worldwide group of observers using DPRs. We constructed a website that included background information about osteoporosis screening and instructions regarding the interpretation of mandibular cortical erosion. DPRs of 100 Japanese postmenopausal women aged 50 years or older who had completed skeletal bone mineral measurements by dual energy X-ray absorptiometry were digitized at 300 dpi. These were displayed on the website and used for the evaluation of diagnostic efficacy. Sixty observers aged 25 to 66 years recruited from 16 countries participated in this study. These observers classified cortical erosion into one of three groups (none, mild to moderate, and severe) on the website via the Internet, twice with an approximately 2-week interval. The diagnostic efficacy of the Osteoporosis Self-Assessment Tool (OST), a simple clinical decision rule based on age and weight, was also calculated and compared with that of cortical erosion. The overall mean sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the 60 observers in identifying women with osteoporosis by cortical erosion on DPRs were 82.5, 46.2, 46.7, and 84.0%, respectively. Those same values by the OST index were 82.9, 43.1, 43.9, and 82.4%, respectively. The intra-observer agreement in classifying cortical erosion on DPRs was sufficient (weighted kappa values>0.6) in 36 (60%) observers. This was significantly increased in observers who specialized in oral radiology (P<0.05). 
In the 36 observers with sufficient intra-observer agreement, the overall mean sensitivity, specificity, PPV, and NPV in identifying women with osteoporosis by any cortical erosion were 83.5, 48.7, 48.3, and 85.7%, respectively. The mean PPV and NPV were significantly higher in the 36 observers with sufficient intra-observer agreement than in the 24 observers with insufficient intra-observer agreement. Our results reconfirm the efficacy of cortical erosion findings in identifying postmenopausal women at risk for osteoporosis, among observers with sufficient intra-observer agreement. Information gathered from radiographic examination is at least as useful as that gathered from the OST index.
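
    The four accuracy measures reported in this abstract all derive from a 2x2 table of test result against reference diagnosis. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 confusion counts."""
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly flagged
        "specificity": tn / (tn + fp),  # healthy cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged cases that are diseased
        "npv": tn / (tn + fn),          # cleared cases that are healthy
    }


# Hypothetical counts for 100 radiographs (not the study's data).
example = diagnostic_metrics(tp=40, fp=30, fn=10, tn=20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

    Unlike sensitivity and specificity, PPV and NPV depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.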

  5. WhatsApp Messenger is useful and reproducible in the assessment of tibial plateau fractures: inter- and intra-observer agreement study.

    PubMed

    Giordano, Vincenzo; Koch, Hilton Augusto; Mendes, Carlos Henrique; Bergamin, André; de Souza, Felipe Serrão; do Amaral, Ney Pecegueiro

    2015-02-01

    The aim of this study was to evaluate the inter- and intra-observer agreement in the initial diagnosis and classification by means of plain radiographs and CT scans of tibial plateau fractures photographed and sent via WhatsApp Messenger. The increasing popularity of smartphones has driven the development of technology for data transmission and imaging and generated a growing interest in the use of these devices as diagnostic tools. The emergence of WhatsApp Messenger technology, which is available for various platforms used by smartphones, has led to an improvement in the quality and resolution of images sent and received. The images (plain radiographs and CT scans) were obtained from 13 cases of tibial plateau fractures using the iPhone 5 (Apple Inc., Cupertino, CA, USA) and were sent to six observers via the WhatsApp Messenger application. The observers were asked to determine the standard deviation and type of injury, the classification according to the Schatzker and the Luo classifications schemes, and whether the CT scan changed the classification. The six observers independently assessed the images on two separate occasions, 15 days apart. The inter- and intra-observer agreement for both periods of the study ranged from excellent to perfect (0.75<κ<1.0) across all survey questions. When asked if the inclusion of the CT images would change their final X-ray classification (Schatzker or Luo), the inter- and intra-observer agreement was perfect (k=1) on both assessment occasions. We found an excellent inter- and intra-observer agreement in the imaging assessment of tibial plateau fractures sent via WhatsApp Messenger. The authors now propose the systematic use of the application to facilitate faster documentation and obtaining the opinion of an experienced consultant when not on call. 
Finally, we think the use of the WhatsApp Messenger as an adjuvant tool could be broadened to other clinical centres to assess its viability in other skeletal and non-skeletal trauma situations. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  6. Ultrasound functional evaluation of fetuses with myelomeningocele: study of the interpretation of results.

    PubMed

    Maroto, A; Illescas, T; Meléndez, M; Arévalo, S; Rodó, C; Peiró, J L; Belfort, M; Cuxart, A; Carreras, E

    2017-10-01

    To assess the reliability of the interpretation of a new technique for the ultrasound evaluation of the level of neurological lesion in fetuses with myelomeningocele. Observational study including myelomeningocele fetuses, referred to our center for the sonographic assessment of the fetal lower-limb movements, made and recorded by an expert in Maternal-fetal medicine and a specialist in Rehabilitation. Two observers, with different levels of expertise and blinded to each other's results, interpreted each recorded scan two different times. The agreement for the segmental levels assigned between the observers and the gold standard, the inter-observer and intra-observer reproducibility were tested using the weighted kappa (wκ) index. Twenty-eight scans were recorded and evaluated. The agreement between the observers and the gold standard remained constant for the expert observer (wκ = 0.82) and increased (wκ = 0.66-wκ = 0.72) for the other one. The inter-observer and the intra-observer variability for the expert observer were wκ = 0.72 and wκ = 0.94, respectively. The agreement for the prenatal evaluation of the segmental neurological level was excellent, after a short training period, for observers with different degrees of expertise. The interpretation of this technique is reproducible enough and this supports its value for the prediction of postnatal motor function in myelomeningocele fetuses.

  7. Inter- and intra- observer reliability of risk assessment of repetitive work without an explicit method.

    PubMed

    Eliasson, Kristina; Palm, Peter; Nyman, Teresia; Forsman, Mikael

    2017-07-01

    A common way to conduct practical risk assessments is to observe a job and report the observed long term risks for musculoskeletal disorders. The aim of this study was to evaluate the inter- and intra-observer reliability of ergonomists' risk assessments without the support of an explicit risk assessment method. Twenty-one experienced ergonomists assessed the risk level (low, moderate, high risk) of eight upper body regions, as well as the global risk of 10 video recorded work tasks. Intra-observer reliability was assessed by having nine of the ergonomists repeat the procedure at least three weeks after the first assessment. The ergonomists made their risk assessments based on their own experience and knowledge. The statistical parameters of reliability included agreement in %, kappa, linearly weighted kappa, intraclass correlation and Kendall's coefficient of concordance. The average inter-observer agreement of the global risk was 53% and the corresponding weighted kappa (Kw) was 0.32, indicating fair reliability. The intra-observer agreement was 61% and 0.41 (Kw). This study indicates that risk assessments of the upper body, without the use of an explicit observational method, have non-acceptable reliability. It is therefore recommended to use systematic risk assessment methods to a higher degree. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
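
    Linearly weighted kappa, one of the statistics listed in this abstract, gives partial credit when two ratings on an ordinal scale are close but not identical. The sketch below is a minimal two-rater illustration; the function name, the integer coding (0 = low, 1 = moderate, 2 = high) and the six paired ratings are assumptions, not the study's data or code.

```python
def linear_weighted_kappa(a, b, n_levels):
    """Linearly weighted kappa for two raters.

    Ratings are integers 0 .. n_levels - 1 on an ordinal scale.
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("need two non-empty, equal-length rating sequences")
    if n_levels < 2:
        raise ValueError("need at least two ordinal levels")

    def weight(i, j):
        # Disagreement weight grows linearly with the distance between levels.
        return abs(i - j) / (n_levels - 1)

    observed = sum(weight(x, y) for x, y in zip(a, b)) / n
    marg_a = [a.count(k) / n for k in range(n_levels)]
    marg_b = [b.count(k) / n for k in range(n_levels)]
    # Expected weighted disagreement if the two raters were independent.
    expected = sum(
        weight(i, j) * marg_a[i] * marg_b[j]
        for i in range(n_levels)
        for j in range(n_levels)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single level")
    return 1 - observed / expected


# Hypothetical risk ratings of six work tasks by two ergonomists.
rater_1 = [0, 1, 2, 0, 1, 2]
rater_2 = [0, 1, 2, 1, 1, 1]
print(linear_weighted_kappa(rater_1, rater_2, n_levels=3))
```

    Here the two disagreements are each only one level apart, so they are penalised half as much as a low-versus-high disagreement would be.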

  8. Intraobserver and interobserver agreement on the radiographical diagnosis of canine cranial cruciate ligament rupture.

    PubMed

    Bogaerts, Evelien; Van der Vekens, Elke; Verhoeven, Geert; de Rooster, Hilde; Van Ryssen, Bernadette; Samoy, Yves; Putcuyps, Ingrid; Van Tilburg, Johan; Devriendt, Nausikaa; Weekers, Frederik; Bertal, Mileva; Houdellier, Blandine; Scheemaeker, Stephanie; Versteken, Jeroen; Lamerand, Maryline; Feenstra, Laurien; Peelman, Luc; Nieuwerburgh, Filip Van; Saunders, Jimmy H; Broeckx, Bart J G

    2018-04-28

    Even though radiography is one of the most frequently used imaging techniques for orthopaedic disorders, it has been demonstrated that the interpretation can vary between assessors. As such, the purpose of this study was to examine the intraobserver and interobserver agreement and the influence of level of expertise on the interpretation of radiographs of the stifle in dogs with and without cranial cruciate ligament rupture (CCLR). Sixteen observers, divided into four groups according to their level of experience, evaluated 30 radiographs (15 cases with CCLR and 15 control stifles) twice. Each observer was asked to evaluate joint effusion, presence and location of degenerative joint disease, joint instability and whether CCLR was present or absent. Overall, intraobserver and interobserver agreement ranged from fair to almost perfect with a trend towards increased agreement for more experienced observers. Additionally, it was found that stifles that were classified with high agreement had either overt disease characteristics or no disease characteristics at all, in comparison to the ones that were classified with low agreement. Overall, the agreement on radiographic interpretation of CCLR was high, which is important, as it is the basis of a correct diagnosis and treatment. © British Veterinary Association (unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  9. Increasing Reliability of Direct Observation Measurement Approaches in Emotional and/or Behavioral Disorders Research Using Generalizability Theory

    ERIC Educational Resources Information Center

    Gage, Nicholas A.; Prykanowski, Debra; Hirn, Regina

    2014-01-01

    Reliability of direct observation outcomes ensures the results are consistent, dependable, and trustworthy. Typically, reliability of direct observation measurement approaches is assessed using interobserver agreement (IOA) and the calculation of observer agreement (e.g., percentage of agreement). However, IOA does not address intraobserver…

  10. Measuring agreement of multivariate discrete survival times using a modified weighted kappa coefficient.

    PubMed

    Guo, Ying; Manatunga, Amita K

    2009-03-01

    Assessing agreement is often of interest in clinical studies to evaluate the similarity of measurements produced by different raters or methods on the same subjects. We present a modified weighted kappa coefficient to measure agreement between bivariate discrete survival times. The proposed kappa coefficient accommodates censoring by redistributing the mass of censored observations within the grid where the unobserved events may potentially happen. A generalized modified weighted kappa is proposed for multivariate discrete survival times. We estimate the modified kappa coefficients nonparametrically through a multivariate survival function estimator. The asymptotic properties of the kappa estimators are established and the performance of the estimators are examined through simulation studies of bivariate and trivariate survival times. We illustrate the application of the modified kappa coefficient in the presence of censored observations with data from a prostate cancer study.

  11. The 27-28 October 1986 FIRE IFO Cirrus case study: Comparison of radiative transfer theory with observations by satellite and aircraft

    NASA Technical Reports Server (NTRS)

    Wielicki, Bruce A.; Suttles, J. T.; Heymsfield, Andrew J.; Welch, Ronald M.; Spinhirne, James D.; Wu, Man-Li C.; Starr, David OC.; Parker, Lindsay; Arduini, Robert F.

    1989-01-01

    Observations of cirrus and altocumulus clouds during the First International Satellite Cloud Climatology Project Regional Experiment (FIRE) are compared to theoretical models of cloud radiative properties. Three tests are performed. First, LANDSAT radiances are used to compare the relationship between nadir reflectance at 0.83 micron and beam emittance at 11.5 microns with that predicted for model calculations using spherical and nonspherical phase functions. Good agreement is found between observations and theory when water droplets dominate. Poor agreement is found when ice particles dominate, especially using scattering phase functions for spherical particles. Even when compared to a laboratory measured ice particle phase function, the observations show increased side scattered radiation relative to the theoretical calculations. Second, the anisotropy of conservatively scattered radiation is examined using simultaneous multiple angle views of the cirrus from LANDSAT and ER-2 aircraft radiometers. Observed anisotropy gives good agreement with theoretical calculations using the laboratory measured ice particle phase function and poor agreement with a spherical particle phase function. Third, LANDSAT radiances at 0.83, 1.65, and 2.21 microns are used to infer particle phase and particle size. For water droplets, good agreement is found with King Air FSSP particle probe measurements in the cloud. For ice particles, the LANDSAT radiance observations predict an effective radius of 60 microns versus aircraft observations of about 200 microns. It is suggested that this discrepancy may be explained by uncertainty in the imaginary index of ice and by inadequate measurements of small ice particles by microphysical probes.

  12. Agreement between self-reported and general practitioner-reported chronic conditions among multimorbid patients in primary care - results of the MultiCare Cohort Study

    PubMed Central

    2014-01-01

    Background Multimorbidity is a common phenomenon in primary care. Until now, no clinical guidelines for multimorbidity exist. For the development of these guidelines, it is necessary to know whether or not patients are aware of their diseases and to what extent they agree with their doctor. The objectives of this paper are to analyze the agreement of self-reported and general practitioner-reported chronic conditions among multimorbid patients in primary care, and to discover which patient characteristics are associated with positive agreement. Methods The MultiCare Cohort Study is a multicenter, prospective, observational cohort study of 3,189 multimorbid patients, ages 65 to 85. Data was collected in personal interviews with patients and GPs. The prevalence proportions for 32 diagnosis groups, kappa coefficients and proportions of specific agreement were calculated in order to examine the agreement of patient self-reported and general practitioner-reported chronic conditions. Logistic regression models were calculated to analyze which patient characteristics can be associated with positive agreement. Results We identified four chronic conditions with good agreement (e.g. diabetes mellitus κ = 0.80; PA = 0.87), seven with moderate agreement (e.g. cerebral ischemia/chronic stroke κ = 0.55; PA = 0.60), seventeen with fair agreement (e.g. cardiac insufficiency κ = 0.24; PA = 0.36) and four with poor agreement (e.g. gynecological problems κ = 0.05; PA = 0.10). Factors associated with positive agreement concerning different chronic diseases were sex, age, education, income, disease count, depression, EQ VAS score and nursing care dependency. For example: Women had higher odds ratios for positive agreement with their GP regarding osteoporosis (OR = 7.16). The odds ratios for positive agreement increase with increasing multimorbidity in almost all of the observed chronic conditions (OR = 1.22-2.41). Conclusions For multimorbidity research, the knowledge of diseases with high disagreement levels between the patients’ perceived illnesses and their physicians’ reports is important. The analysis shows that different patient characteristics have an impact on the agreement. Findings from this study should be included in the development of clinical guidelines for multimorbidity aiming to optimize health care. Further research is needed to identify more reasons for disagreement and their consequences in health care. Trial registration ISRCTN89818205 PMID:24580758
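
    The pairing of kappa with a proportion of positive agreement (PA) reported above can be sketched for a single condition as follows. This is an illustration under assumptions, not the study's code: the abstract does not state its formula, so the usual proportion of specific positive agreement, 2a / (2a + b + c), is assumed, and the function name `agreement_2x2` and the counts are hypothetical.

```python
def agreement_2x2(both_yes, patient_only, gp_only, both_no):
    """Cohen's kappa and positive agreement from a 2x2 table.

    both_yes     -- condition reported by patient and GP
    patient_only -- reported by the patient only
    gp_only      -- reported by the GP only
    both_no      -- reported by neither
    """
    n = both_yes + patient_only + gp_only + both_no
    p_obs = (both_yes + both_no) / n

    # Chance agreement from each source's marginal "yes" proportion.
    p_patient = (both_yes + patient_only) / n
    p_gp = (both_yes + gp_only) / n
    p_exp = p_patient * p_gp + (1 - p_patient) * (1 - p_gp)

    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Assumed definition: proportion of specific positive agreement.
    positive_agreement = 2 * both_yes / (2 * both_yes + patient_only + gp_only)
    return kappa, positive_agreement


# Hypothetical counts for one condition among 100 patients.
kappa, pa = agreement_2x2(both_yes=40, patient_only=5, gp_only=5, both_no=50)
print(round(kappa, 3), round(pa, 3))  # 0.798 0.889
```

    Unlike kappa, positive agreement ignores the cell in which both sources report no disease, which is one reason the two measures can be reported side by side.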

  13. International perception of lung sounds: a comparison of classification across some European borders

    PubMed Central

    Aviles-Solis, Juan Carlos; Vanbelle, Sophie; Halvorsen, Peder A; Francis, Nick; Cals, Jochen W L; Andreeva, Elena A; Marques, Alda; Piirilä, Päivi; Pasterkamp, Hans; Melbye, Hasse

    2017-01-01

    Introduction Lung auscultation is helpful in the diagnosis of lung and heart diseases; however, the diagnostic value of lung sounds may be questioned due to interobserver variation. This situation may also impair clinical research in this area to generate evidence-based knowledge about the role that chest auscultation has in a modern clinical setting. The recording and visual display of lung sounds is a method that is both repeatable and feasible to use in large samples, and the aim of this study was to evaluate interobserver agreement using this method. Methods With a microphone in a stethoscope tube, we collected digital recordings of lung sounds from six sites on the chest surface in 20 subjects aged 40 years or older with and without lung and heart diseases. A total of 120 recordings and their spectrograms were independently classified by 28 observers from seven different countries. We employed absolute agreement and kappa coefficients to explore interobserver agreement in classifying crackles and wheezes within and between subgroups of four observers. Results When evaluating agreement on crackles (inspiratory or expiratory) in each subgroup, observers agreed on between 65% and 87% of the cases. Conger’s kappa ranged from 0.20 to 0.58 and four out of seven groups reached a kappa of ≥0.49. In the classification of wheezes, we observed a probability of agreement between 69% and 99.6% and kappa values from 0.09 to 0.97. Four out of seven groups reached a kappa ≥0.62. Conclusions The kappa values we observed in our study ranged widely but, when addressing its limitations, we find the method of recording and presenting lung sounds with spectrograms sufficient for both clinic and research. Standardisation of terminology across countries would improve international communication on lung auscultation findings. PMID:29435344

  14. International perception of lung sounds: a comparison of classification across some European borders.

    PubMed

    Aviles-Solis, Juan Carlos; Vanbelle, Sophie; Halvorsen, Peder A; Francis, Nick; Cals, Jochen W L; Andreeva, Elena A; Marques, Alda; Piirilä, Päivi; Pasterkamp, Hans; Melbye, Hasse

    2017-01-01

    Lung auscultation is helpful in the diagnosis of lung and heart diseases; however, the diagnostic value of lung sounds may be questioned due to interobserver variation. This situation may also impair clinical research in this area to generate evidence-based knowledge about the role that chest auscultation has in a modern clinical setting. The recording and visual display of lung sounds is a method that is both repeatable and feasible to use in large samples, and the aim of this study was to evaluate interobserver agreement using this method. With a microphone in a stethoscope tube, we collected digital recordings of lung sounds from six sites on the chest surface in 20 subjects aged 40 years or older with and without lung and heart diseases. A total of 120 recordings and their spectrograms were independently classified by 28 observers from seven different countries. We employed absolute agreement and kappa coefficients to explore interobserver agreement in classifying crackles and wheezes within and between subgroups of four observers. When evaluating agreement on crackles (inspiratory or expiratory) in each subgroup, observers agreed on between 65% and 87% of the cases. Conger's kappa ranged from 0.20 to 0.58 and four out of seven groups reached a kappa of ≥0.49. In the classification of wheezes, we observed a probability of agreement between 69% and 99.6% and kappa values from 0.09 to 0.97. Four out of seven groups reached a kappa ≥0.62. The kappa values we observed in our study ranged widely but, when addressing its limitations, we find the method of recording and presenting lung sounds with spectrograms sufficient for both clinic and research. Standardisation of terminology across countries would improve international communication on lung auscultation findings.

  15. Spine Instability Neoplastic Score: agreement across different medical and surgical specialties.

    PubMed

    Arana, Estanislao; Kovacs, Francisco M; Royuela, Ana; Asenjo, Beatriz; Pérez-Ramírez, Úrsula; Zamora, Javier

    2016-05-01

    Spinal instability is an acknowledged complication of spinal metastases; in spite of recent suggested criteria, it is not clearly defined in the literature. This study aimed to assess intra- and interobserver agreement when using the Spine Instability Neoplastic Score (SINS) by all physicians involved in its management. Independent multicenter reliability study for the recently created SINS, undertaken with a panel of medical oncologists, neurosurgeons, radiologists, orthopedic surgeons, and radiation oncologists, was carried out. Ninety patients with biopsy-proven spinal metastases and magnetic resonance imaging, reviewed at the multidisciplinary tumor board of our institution, were included. Intraclass correlation coefficient (ICC) was used for SINS score agreement. Fleiss kappa statistic was used to assess agreement on the location of the most affected vertebral level; agreement on the SINS category ("stable," "potentially stable," or "unstable"); and overall agreement with the classification established by tumor board. Clinical data and imaging were provided to 83 specialists in 44 hospitals across 14 Spanish regions. No assessment criteria were pre-established. Each clinician assessed the SINS score twice, with a minimum 6-week interval. Clinicians were blinded to assessments made by other specialists and to their own previous assessment. Subgroup analyses were performed by clinicians' specialty, experience (≤7, 8-13, ≥14 years), and hospital category (four levels according to size and complexity). This study was supported by Kovacs Foundation. Intra- and interobserver agreement on the location of the most affected levels was "almost perfect" (κ>0.94). Intra-observer agreement on the SINS score was "excellent" (ICC=0.77), whereas interobserver agreement was "moderate" (ICC=0.55). Intra-observer agreement in SINS category was "substantial" (κ=0.61), whereas interobserver agreement was "moderate" (κ=0.42). Overall agreement with the tumor board classification was "substantial" (κ=0.61). Results were similar across specialties, years of experience, and hospital category. Agreement on the assessment of metastatic spine instability is moderate. The SINS can help improve communication among clinicians in oncology care. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. Coronary artery disease reporting and data system (CAD-RADS™): Inter-observer agreement for assessment categories and modifiers.

    PubMed

    Maroules, Christopher D; Hamilton-Craig, Christian; Branch, Kelley; Lee, James; Cury, Roberto C; Maurovich-Horvat, Pál; Rubinshtein, Ronen; Thomas, Dustin; Williams, Michelle; Guo, Yanshu; Cury, Ricardo C

    The Coronary Artery Disease Reporting and Data System (CAD-RADS) provides a lexicon and standardized reporting system for coronary CT angiography. To evaluate inter-observer agreement of the CAD-RADS among a panel of early career and expert readers. Four early career and four expert cardiac imaging readers prospectively and independently evaluated 50 coronary CT angiography cases using the CAD-RADS lexicon. All readers assessed image quality using a five-point Likert scale, with mean Likert score ≥4 designating high image quality, and <4 designating moderate/low image quality. All readers were blinded to medical history and invasive coronary angiography findings. Inter-observer agreement for CAD-RADS assessment categories and modifiers was assessed using intra-class correlation (ICC) and Fleiss' kappa (κ). The impact of reader experience and image quality on inter-observer agreement was also examined. Inter-observer agreement for CAD-RADS assessment categories was excellent (ICC 0.958, 95% CI 0.938-0.974, p < 0.0001). Agreement among expert readers (ICC 0.925, 95% CI 0.884-0.954) was marginally stronger than for early career readers (ICC 0.904, 95% CI 0.852-0.941), both p < 0.0001. High image quality was associated with stronger agreement than moderate image quality (ICC 0.944, 95% CI 0.886-0.974 vs. ICC 0.887, 95% CI 0.775-0.95, both p < 0.0001). While excellent inter-observer agreement was observed for modifiers S (stent) and G (bypass graft) (both κ = 1.0), only fair agreement (κ = 0.40) was observed for modifier V (high risk plaque). Inter-observer reproducibility of CAD-RADS assessment categories and modifiers is excellent, except for high-risk plaque (modifier V) which demonstrates fair agreement. These results suggest CAD-RADS is feasible for clinical implementation. Copyright © 2017. Published by Elsevier Inc.

  17. Wheezes, crackles and rhonchi: simplifying description of lung sounds increases the agreement on their classification: a study of 12 physicians' classification of lung sounds from video recordings

    PubMed Central

    Melbye, Hasse; Garcia-Marcos, Luis; Brand, Paul; Everard, Mark; Priftis, Kostas; Pasterkamp, Hans

    2016-01-01

    Background The European Respiratory Society (ERS) lung sounds repository contains 20 audiovisual recordings of children and adults. The present study aimed at determining the interobserver variation in the classification of sounds into detailed and broader categories of crackles and wheezes. Methods Recordings from 10 children and 10 adults were classified into 10 predefined sounds by 12 observers, 6 paediatricians and 6 doctors for adult patients. Multirater kappa (Fleiss' κ) was calculated for each of the 10 adventitious sounds and for combined categories of sounds. Results The majority of observers agreed on the presence of at least one adventitious sound in 17 cases. Poor to fair agreement (κ<0.40) was usually found for the detailed descriptions of the adventitious sounds, whereas moderate to good agreement was reached for the combined categories of crackles (κ=0.62) and wheezes (κ=0.59). The paediatricians did not reach better agreement on the child cases than the family physicians and specialists in adult medicine. Conclusions Descriptions of auscultation findings in broader terms were more reliably shared between observers compared to more detailed descriptions. PMID:27158515

  18. Log-Linear Modeling of Agreement among Expert Exposure Assessors

    PubMed Central

    Hunt, Phillip R.; Friesen, Melissa C.; Sama, Susan; Ryan, Louise; Milton, Donald

    2015-01-01

    Background: Evaluation of expert assessment of exposure depends, in the absence of a validation measurement, upon measures of agreement among the expert raters. Agreement is typically measured using Cohen’s Kappa statistic, however, there are some well-known limitations to this approach. We demonstrate an alternate method that uses log-linear models designed to model agreement. These models contain parameters that distinguish between exact agreement (diagonals of agreement matrix) and non-exact associations (off-diagonals). In addition, they can incorporate covariates to examine whether agreement differs across strata. Methods: We applied these models to evaluate agreement among expert ratings of exposure to sensitizers (none, likely, high) in a study of occupational asthma. Results: Traditional analyses using weighted kappa suggested potential differences in agreement by blue/white collar jobs and office/non-office jobs, but not case/control status. However, the evaluation of the covariates and their interaction terms in log-linear models found no differences in agreement with these covariates and provided evidence that the differences observed using kappa were the result of marginal differences in the distribution of ratings rather than differences in agreement. Differences in agreement were predicted across the exposure scale, with the likely moderately exposed category more difficult for the experts to differentiate from the highly exposed category than from the unexposed category. Conclusions: The log-linear models provided valuable information about patterns of agreement and the structure of the data that were not revealed in analyses using kappa. The models’ lack of dependence on marginal distributions and the ease of evaluating covariates allow reliable detection of observational bias in exposure data. PMID:25748517

  19. Measuring agreement between cervical vertebrae and hand-wrist maturation in determining skeletal age: reassessing the theory in patients with short stature.

    PubMed

    Danaei, Shahla Momeni; Karamifar, Amirali; Sardarian, Ahmadreza; Shahidi, Shoaleh; Karamifar, Hamdollah; Alipour, Abbas; Ghodsi Boushehri, Sahar

    2014-09-01

    The objective of this study was to determine the degree of agreement between hand-wrist radiography and cervical vertebral maturation analysis in patients diagnosed with short stature. A cross-sectional study was designed; 178 patients (90 girls, 88 boys) diagnosed with short stature and seeking treatment were selected. The patients were divided into 2 groups (76 with familial short stature, 102 with nonfamilial short stature). Hand-wrist and lateral cephalometric radiographs were obtained from the patients. The hand-wrist radiographs were analyzed using the Fishman method, and the lateral cephalometric views were categorized according to the method of Hassel and Farman. The degree of agreement between the 2 methods of predicting skeletal maturation was measured by calculating the contingency coefficient and the weighted kappa statistic. A high degree of agreement was observed between the 2 methods of analyzing skeletal maturation. It was also observed that agreement was higher in girls in the familial short-stature group, whereas boys had higher agreement in the nonfamilial short-stature group. Cervical vertebral maturation can be a valuable substitute for hand-wrist radiography in patients with short stature. Copyright © 2014 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.

  20. Interobserver Agreement and Disagreement in Continuous Recording Exemplified by Measurement of Behavior State.

    ERIC Educational Resources Information Center

    Mudford, Oliver C.; Hogg, James; Roberts, Jessica

    1997-01-01

    Continuous observational recording over 57 hours evaluated behavior states of three adults with profound and multiple disabilities. Two independent observers also recorded for 22 hours. Although overall percentage agreement was satisfactory (above 80%), agreement on occurrence was unsatisfactory (mean of 65%). Agreement data were superimposed on…

  1. International study on inter-reader variability for circulating tumor cells in breast cancer.

    PubMed

    Ignatiadis, Michail; Riethdorf, Sabine; Bidard, François-Clement; Vaucher, Isabelle; Khazour, Mustapha; Rothé, Françoise; Metallo, Jessica; Rouas, Ghizlane; Payne, Rachel E; Coombes, Raoul; Teufel, Ingrid; Andergassen, Ulrich; Apostolaki, Stella; Politaki, Eleni; Mavroudis, Dimitris; Bessi, Silvia; Pestrin, Marta; Di Leo, Angelo; Campion, Michael; Reinholz, Monica; Perez, Edith; Piccart, Martine; Borgen, Elin; Naume, Bjorn; Jimenez, Jose; Aura, Claudia; Zorzino, Laura; Cassatella, Maria; Sandri, Maria; Mostert, Bianca; Sleijfer, Stefan; Kraan, Jaco; Janni, Wolfgang; Fehm, Tanja; Rack, Brigitte; Terstappen, Leon; Repollet, Madeline; Pierga, Jean-Yves; Miller, Craig; Sotiriou, Christos; Michiels, Stefan; Pantel, Klaus

    2014-04-23

    Circulating tumor cells (CTCs) have been studied in breast cancer with the CellSearch® system. Given the low CTC counts in non-metastatic breast cancer, it is important to evaluate the inter-reader agreement. CellSearch® images (N = 272) of either CTCs or white blood cells or artifacts from 109 non-metastatic (M0) and 22 metastatic (M1) breast cancer patients from reported studies were sent to 22 readers from 15 academic laboratories and 8 readers from two Veridex laboratories. Each image was scored as No CTC vs CTC HER2- vs CTC HER2+. The 8 Veridex readers were summarized to a Veridex Consensus (VC) to compare each academic reader using % agreement and kappa (κ) statistics. Agreement was compared according to disease stage and CTC counts using the Wilcoxon signed rank test. For CTC definition (No CTC vs CTC), the median agreement between academic readers and VC was 92% (range 69 to 97%) with a median κ of 0.83 (range 0.37 to 0.93). Lower agreement was observed in images from M0 (median 91%, range 70 to 96%) compared to M1 (median 98%, range 64 to 100%) patients (P < 0.001) and from M0 and <3CTCs (median 87%, range 66 to 95%) compared to M0 and ≥3CTCs samples (median 95%, range 77 to 99%), (P < 0.001). For CTC HER2 expression (HER2- vs HER2+), the median agreement was 87% (range 51 to 95%) with a median κ of 0.74 (range 0.25 to 0.90). The inter-reader agreement for CTC definition was high. Reduced agreement was observed in M0 patients with low CTC counts. Continuous training and independent image review are required.

  2. Interobserver Agreement on Endoscopic Classification of Oesophageal Varices in Children.

    PubMed

    D'Antiga, Lorenzo; Betalli, Pietro; De Angelis, Paola; Davenport, Mark; Di Giorgio, Angelo; McKiernan, Patrick J; McLin, Valerie; Ravelli, Paolo; Durmaz, Ozlem; Talbotec, Cecile; Sturm, Ekkehard; Woynarowski, Marek; Burroughs, Andrew K

    2015-08-01

    Data regarding agreement on endoscopic features of oesophageal varices in children with portal hypertension (PH) are scant. The aim of this study was to evaluate endoscopic visualisation and classification of oesophageal varices in children by several European clinicians, to build a rational basis for future multicentre trials. Endoscopic pictures of the distal oesophagus of 100 children with a clinical diagnosis of PH were distributed to 10 endoscopists. Observers were requested to classify variceal size according to a 3-degree scale (small, medium, and large, class A), a 2-degree scale (small and large, class B), and to recognise red wales (presence or absence, class Red). Overall agreement was considered fair if Fleiss and Cohen κ test was ≥0.30, good if ≥0.40, excellent if ≥0.60, and perfect if ≥0.80. Agreement between observers was fair with class A (κ = 0.34) and class B (κ = 0.38), and good with class Red (κ = 0.49). The agreement was good on presence versus absence of varices (class A = 0.53, class B = 0.48). The agreement among the observers was good in class A when endoscopic features of severe PH (medium and large sizes, red marks) were grouped and compared with mild features (absent and small varices) (κ = 0.58). Experts working in different centres show a fairly good agreement on endoscopic features of PH in children, although a better training of paediatric endoscopists may improve the agreement in grading severity of varices in this setting.
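
    The cut-offs stated in the abstract above (fair ≥0.30, good ≥0.40, excellent ≥0.60, perfect ≥0.80) differ from other commonly used kappa scales, so a small lookup can make the labelling explicit. The sketch below is only an illustration: the function name `label_kappa` is hypothetical, and the label returned for values under 0.30 is a placeholder because the abstract does not name that range.

```python
# Cut-offs as stated in the abstract, checked from highest to lowest.
THRESHOLDS = [
    (0.80, "perfect"),
    (0.60, "excellent"),
    (0.40, "good"),
    (0.30, "fair"),
]


def label_kappa(kappa):
    """Return the agreement label for a kappa value under the study's cut-offs."""
    for cutoff, label in THRESHOLDS:
        if kappa >= cutoff:
            return label
    return "below fair"  # placeholder: this range is not named in the abstract


# Kappa values reported in the abstract.
for name, value in [("class A", 0.34), ("class B", 0.38), ("class Red", 0.49)]:
    print(name, value, label_kappa(value))
# class A 0.34 fair
# class B 0.38 fair
# class Red 0.49 good
```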

  3. Evaluation of inter-observer agreement when using a clinical respiratory scoring system in pre-weaned dairy calves.

    PubMed

    Buczinski, S; Faure, C; Jolivet, S; Abdallah, A

    2016-07-01

    To determine inter-observer agreement for a clinical scoring system for the detection of bovine respiratory disease complex in calves, and the impact of classification of calves as sick or healthy based on different cut-off values. Two third-year veterinary students (Observer 1 and 2) and one post-graduate student (Observer 3) received 4 hours of training on scoring dairy calves for signs of respiratory disease, including rectal temperature, cough, eye and nasal discharge, and ear position. Observers 1 and 2 scored 40 pre-weaning dairy calves 24 hours apart (80 observations) over three visits to a calf-rearing facility, and Observers 1, 2 and 3 scored 20 calves on one visit. Inter-observer agreement was assessed using percentage of agreement (PA) and Kappa statistics for individual clinical signs, comparing Observers 1 and 2. Agreement between the three observers for total clinical score was assessed using cut-off values of ≥4, ≥5 and ≥6 to indicate unhealthy calves. Inter-observer PA for rectal temperature was 0.68, for cough 0.78, for nasal discharge 0.62, for eye discharge 0.63, and for ear position 0.85. Kappa values for all clinical signs indicated slight to fair agreement (<0.4), except temperature, which had moderate agreement (0.6). The Fleiss' Kappa for total score, using cut-offs of ≥4, ≥5 and ≥6 to indicate unhealthy calves, was 0.35, 0.06 and 0.13, respectively, indicating slight to fair agreement. There were important inter-observer discrepancies in scoring clinical signs of respiratory disease, using relatively inexperienced observers. These disagreements may ultimately mean increased false negative or false positive diagnoses and incorrect treatment of cases. Visual assessment of clinical signs associated with bovine respiratory disease needs to be thoroughly validated when disease monitoring is based on the use of a clinical scoring system.

  4. Validity of a measure to assess healthy eating and physical activity policies and practices in Australian childcare services.

    PubMed

    Dodds, Pennie; Wyse, Rebecca; Jones, Jannah; Wolfenden, Luke; Lecathelinais, Christophe; Williams, Amanda; Yoong, Sze Lin; Finch, Meghan; Nathan, Nicole; Gillham, Karen; Wiggers, John

    2014-06-09

    Childcare services represent a valuable obesity prevention opportunity, providing access to a large portion of children at a vital point in their development. Few rigorously validated measures exist to measure healthy eating and physical activity policies and practices in this setting, and no such measures exist that are specific to the childcare setting in Australia. This was a cross-sectional study, comparing two measures (pen and paper survey and observation) of healthy eating and physical activity policies and practices in childcare services. Research assistants attended consenting childcare services (n = 42) across the Hunter region of New South Wales, Australia and observed practices for one day. Nominated Supervisors and Room Leaders of the service also completed a pen and paper survey during the day of observation. Kappa statistics and proportion agreement were calculated for a total of 43 items relating to healthy eating and physical activity policies and practices. Agreement ranged from 38% to 100%. Fifty-one percent of items showed agreement of greater than or equal to 80%. Items assessing the frequency with which staff joined in active play with children reported the lowest percent agreement, while items assessing availability of beverages such as juice, milk and cordial, as well as the provision of foods such as popcorn, pretzels and sweet biscuits, reported the highest percent agreement. Kappa scores ranged from -0.06 (poor agreement) to 1 (perfect agreement). Of the 43 items assessed, 27 were found to have moderate or greater agreement. The study found that Nominated Supervisors and Room Leaders were able to accurately report on a number of healthy eating and physical activity policies and practices. Items assessing healthy eating practices tended to have higher kappa scores than those assessing physical activity related policies or practices. The tool represents a useful instrument for public health researchers and policy makers working in this setting.
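
The two statistics named in this record, proportion agreement and the kappa statistic, can be illustrated with a minimal Python sketch. It assumes a single yes/no item answered once by survey and once by observation per service; the function names and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of paired ratings that are identical."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from the product of each method's marginal proportions.
    p_chance = sum(counts_a[c] * counts_b[c] for c in set(counts_a) | set(counts_b)) / (n * n)
    if p_chance == 1.0:
        return float("nan")  # undefined when both methods use one identical category
    return (p_observed - p_chance) / (1.0 - p_chance)


if __name__ == "__main__":
    # Hypothetical item: 1 = practice reported/observed, 0 = not.
    survey = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
    observed = [1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
    print(percent_agreement(survey, observed), cohen_kappa(survey, observed))
```

Because kappa subtracts chance agreement, an item can show high percent agreement and still have a low or even negative kappa when one answer dominates, which is consistent with the wide kappa range reported above.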

  5. Inter-observer agreement on a checklist to evaluate scientific publications in the field of animal reproduction.

    PubMed

    Simoneit, Céline; Heuwieser, Wolfgang; Arlt, Sebastian P

    2012-01-01

    This study's objective was to determine respondents' inter-observer agreement on a detailed checklist to evaluate three exemplars (one case report, one randomized controlled study without blinding, and one blinded, randomized controlled study) of the scientific literature in the field of bovine reproduction. Fourteen international scientists in the field of animal reproduction were provided with the three articles, three copies of the checklist, and a supplementary explanation. Overall, 13 responded to more than 90% of the items. Overall repeatability between respondents using Fleiss's κ was 0.35 (fair agreement). Combining the "strongly agree" and "agree" responses and the "strongly disagree" and "disagree" responses increased κ to 0.49 (moderate agreement). Evaluation of information given in the three articles on housing of the animals (35% identical answers) and preconditions or pretreatments (42%) varied widely. Even though the overall repeatability was fair, repeatability concerning the important categories was high (e.g., level of agreement=98%). Our data show that the checklist is a reasonable and practical supporting tool to assess the quality of publications. Therefore, it may be used in teaching and practicing evidence-based veterinary medicine. It can support training in systematic and critical appraisal of information and in clinical decision making.
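
As a rough illustration of why merging response categories can raise Fleiss's κ, the following Python sketch computes κ from a subjects-by-categories count table and then recomputes it after collapsing categories. The four-point scale, the helper names and the counts are assumptions made for the example; the record does not state the checklist's actual scale layout.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for a table where counts[i][j] is the number of raters
    who put item i into category j (every item rated by the same number of raters)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])
    # Overall proportion of ratings falling in each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters) for j in range(n_cats)]
    # Per-item agreement: share of rater pairs that agree on that item.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1)) for row in counts]
    p_bar = sum(p_item) / n_items
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1.0 - p_exp)


def collapse(counts, groups):
    """Merge categories; groups is a list of column-index lists, one per merged category."""
    return [[sum(row[j] for j in g) for g in groups] for row in counts]


if __name__ == "__main__":
    # Hypothetical columns: strongly agree, agree, disagree, strongly disagree (3 raters).
    table = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
    merged = collapse(table, [[0, 1], [2, 3]])
    print(fleiss_kappa(table), fleiss_kappa(merged))
```

In this toy table the raters never disagree on direction, only on strength, so collapsing removes all disagreement; real data would normally show a smaller gain, as in the record above.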

  6. Combining Decision Rules from Classification Tree Models and Expert Assessment to Estimate Occupational Exposure to Diesel Exhaust for a Case-Control Study

    PubMed Central

    Friesen, Melissa C.; Wheeler, David C.; Vermeulen, Roel; Locke, Sarah J.; Zaebst, Dennis D.; Koutros, Stella; Pronk, Anjoeka; Colt, Joanne S.; Baris, Dalsu; Karagas, Margaret R.; Malats, Nuria; Schwenn, Molly; Johnson, Alison; Armenti, Karla R.; Rothman, Nathanial; Stewart, Patricia A.; Kogevinas, Manolis; Silverman, Debra T.

    2016-01-01

    Objectives: To efficiently and reproducibly assess occupational diesel exhaust exposure in a Spanish case-control study, we examined the utility of applying decision rules that had been extracted from expert estimates and questionnaire response patterns using classification tree (CT) models from a similar US study. Methods: First, previously extracted CT decision rules were used to obtain initial ordinal (0–3) estimates of the probability, intensity, and frequency of occupational exposure to diesel exhaust for the 10 182 jobs reported in a Spanish case-control study of bladder cancer. Second, two experts reviewed the CT estimates for 350 jobs randomly selected from strata based on each CT rule’s agreement with the expert ratings in the original study [agreement rate, from 0 (no agreement) to 1 (perfect agreement)]. Their agreement with each other and with the CT estimates was calculated using weighted kappa (κw) and guided our choice of jobs for subsequent expert review. Third, an expert review comprised all jobs with lower confidence (low-to-moderate agreement rates or discordant assignments, n = 931) and a subset of jobs with a moderate to high CT probability rating and with moderately high agreement rates (n = 511). Logistic regression was used to examine the likelihood that an expert provided a different estimate than the CT estimate based on the CT rule agreement rates, the CT ordinal rating, and the availability of a module with diesel-related questions. Results: Agreement between estimates made by two experts and between estimates made by each of the experts and the CT estimates was very high for jobs with estimates that were determined by rules with high CT agreement rates (κw: 0.81–0.90). For jobs with estimates based on rules with lower agreement rates, moderate agreement was observed between the two experts (κw: 0.42–0.67) and poor-to-moderate agreement was observed between the experts and the CT estimates (κw: 0.09–0.57). In total, the expert review of 1442 jobs changed 156 probability estimates, 128 intensity estimates, and 614 frequency estimates. The expert was more likely to provide a different estimate when the CT rule agreement rate was <0.8, when the CT ordinal ratings were low to moderate, or when a module with diesel questions was available. Conclusions: Our reliability assessment provided important insight into where to prioritize additional expert review; as a result, only 14% of the jobs underwent expert review, substantially reducing the exposure assessment burden. Overall, we found that we could efficiently, reproducibly, and reliably apply CT decision rules from one study to assess exposure in another study. PMID:26732820

  7. Visual assessment of breast density using Visual Analogue Scales: observer variability, reader attributes and reading time

    NASA Astrophysics Data System (ADS)

    Ang, Teri; Harkness, Elaine F.; Maxwell, Anthony J.; Lim, Yit Y.; Emsley, Richard; Howell, Anthony; Evans, D. Gareth; Astley, Susan; Gadde, Soujanya

    2017-03-01

    Breast density is a strong risk factor for breast cancer and has potential use in breast cancer risk prediction, with subjective methods of density assessment providing a strong relationship with the development of breast cancer. This study aims to assess intra- and inter-observer variability in visual density assessment recorded on Visual Analogue Scales (VAS) among trained readers, and examine whether reader age, gender and experience are associated with assessed density. Eleven readers estimated the breast density of 120 mammograms on two occasions 3 years apart using VAS. Intra- and inter-observer agreement was assessed with Intraclass Correlation Coefficient (ICC) and variation between readers visualised on Bland-Altman plots. The mean scores of all mammograms per reader were used to analyse the effect of reader attributes on assessed density. Excellent intra-observer agreement (ICC>0.80) was found in the majority of the readers. All but one reader had a mean difference of <10 percentage points from the first to the second reading. Inter-observer agreement was excellent for consistency (ICC 0.82) and substantial for absolute agreement (ICC 0.69). However, the 95% limits of agreement for pairwise differences were -6.8 to 15.7 at the narrowest and 0.8 to 62.3 at the widest. No significant association was found between assessed density and reader age, experience or gender, or with reading time. Overall, the readers were consistent in their scores, although some large variations were observed. Reader evaluation and targeted training may alleviate this problem.
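
The 95% limits of agreement quoted in this record are conventionally the mean pairwise difference plus or minus 1.96 standard deviations of the differences (the Bland-Altman approach). A minimal Python sketch under that assumption follows; the function name and the VAS scores are hypothetical, and the study's exact computation may differ.

```python
from statistics import mean, stdev


def limits_of_agreement(reader_a, reader_b):
    """Return (lower limit, mean difference, upper limit) for paired scores,
    using mean difference +/- 1.96 * sample SD of the differences."""
    if len(reader_a) != len(reader_b) or len(reader_a) < 2:
        raise ValueError("need at least two paired scores")
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias, bias + spread


if __name__ == "__main__":
    # Hypothetical VAS density scores (percentage points) from two readers.
    reader_a = [10, 20, 30, 40]
    reader_b = [8, 21, 27, 41]
    lower, bias, upper = limits_of_agreement(reader_a, reader_b)
    print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

Limits that do not straddle zero, such as the widest pair reported above, indicate that one reader scores systematically higher than the other rather than merely more variably.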

  8. Radiological findings for hip dysplasia at skeletal maturity. Validation of digital and manual measurement techniques.

    PubMed

    Engesæter, Ingvild Øvstebø; Laborie, Lene Bjerke; Lehmann, Trude Gundersen; Sera, Francesco; Fevang, Jonas; Pedersen, Douglas; Morcuende, José; Lie, Stein Atle; Engesæter, Lars Birger; Rosendahl, Karen

    2012-07-01

    To report on intra-observer, inter-observer, and inter-method reliability and agreement for radiological measurements used in the diagnosis of hip dysplasia at skeletal maturity, as obtained by a manual and a digital measurement technique. Pelvic radiographs from 95 participants (56 females) in a follow-up hip study of 18- to 19-year-old patients were included. Eleven radiological measurements relevant for hip dysplasia (Sharp's, Wiberg's, and Ogata's angles; acetabular roof angle of Tönnis; articulo-trochanteric distance; acetabular depth-width ratio; femoral head extrusion index; maximum teardrop width; and the joint space width in three different locations) were validated. Three observers measured the radiographs using both a digital measurement program and manually in AgfaWeb1000. Inter-method and inter- and intra-observer agreement were analyzed using the mean differences between the readings/readers, establishing the 95% limits of agreement. We also calculated the minimum detectable change and the intra-class correlation coefficient. Large variations among different radiological measurements were demonstrated. However, the variation was not related to the use of either the manual or digital measurement technique. For measurements with greater absolute values (Sharp's angle, femoral head extrusion index, and acetabular depth-width ratio) the inter- and intra-observer and inter-method agreements were better as compared to measurements with lower absolute values (acetabular roof angle, teardrop and joint space width). The inter- and intra-observer variation differs notably across different radiological measurements relevant for hip dysplasia at skeletal maturity, a fact that should be taken into account in clinical practice. The agreement between the manual and digital methods is good.

  9. Visual-search models for location-known detection tasks

    NASA Astrophysics Data System (ADS)

    Gifford, H. C.; Karbaschi, Z.; Banerjee, K.; Das, M.

    2017-03-01

    Lesion-detection studies that analyze a fixed target position are generally considered predictive of studies involving lesion search, but the extent of the correlation often goes untested. The purpose of this work was to develop a visual-search (VS) model observer for location-known tasks that, coupled with previous work on localization tasks, would allow efficient same-observer assessments of how search and other task variations can alter study outcomes. The model observer featured adjustable parameters to control the search radius around the fixed lesion location and the minimum separation between suspicious locations. Comparisons were made against human observers, a channelized Hotelling observer and a nonprewhitening observer with eye filter in a two-alternative forced-choice study with simulated lumpy background images containing stationary anatomical and quantum noise. These images modeled single-pinhole nuclear medicine scans with different pinhole sizes. When the VS observer's search radius was optimized with training images, close agreement was obtained with human-observer results. Some performance differences between the humans could be explained by varying the model observer's separation parameter. The range of optimal pinhole sizes identified by the VS observer was in agreement with the range determined with the channelized Hotelling observer.

  10. What is the optimal cutoff value of the axis-line-angle technique for evaluating trunk imbalance in coronal plane?

    PubMed

    Zhang, Rui-Fang; Fu, Yu-Chuan; Lu, Yi; Zhang, Xiao-Xia; Hu, Yu-Min; Zhou, Yong-Jin; Tian, Nai-Feng; He, Jia-Wei; Yan, Zhi-Han

    2017-02-01

    Accurately evaluating the extent of trunk imbalance in the coronal plane is important for patients before and after treatment. We previously piloted a new method, the axis-line-angle technique (ALAT), for evaluating coronal trunk imbalance, and it showed excellent intra-observer and interobserver reliability. Radiologists and surgeons were encouraged to use this method in clinical practice. However, the optimal cutoff value of the ALAT for determining the extent of coronal trunk imbalance has not yet been calculated. The purpose of this study was to identify the cutoff value of the ALAT that best predicts a positive measurement point to assess coronal balance or imbalance. A retrospective study at a university affiliated hospital was carried out. A total of 130 patients with C7-central sacral vertical line (CSVL) >0 mm and aged 10-18 years were recruited in this study from September 2013 to December 2014. Data were analyzed to determine the optimal cutoff value of the ALAT measurement. The C7-CSVL and ALAT measurements were each made twice on plain film within a 2-week interval by two radiologists. The optimal cutoff value of the ALAT was analyzed via a receiver operating characteristic (ROC) curve. The C7-CSVL and ALAT classifications of trunk imbalance were compared with the chi-square test. The kappa agreement coefficient was used to test the intra-observer and interobserver agreement of C7-CSVL and ALAT. The ROC curve area for the ALAT was 0.82 (95% confidence interval: 0.753-0.894, p<.001). The maximum Youden index was 0.51, and the corresponding cutoff point was 2.59°. No statistical difference was found between the C7-CSVL and ALAT measurements for evaluating trunk imbalance (p>.05). Intra-observer agreement values for the C7-CSVL measurements by observers 1 and 2 were 0.79 and 0.91 (p<.001), respectively, whereas intra-observer agreement values for the ALAT measurements were 0.89 for both observers (p<.001). The interobserver agreement values for the first and second measurements with the C7-CSVL were 0.78 and 0.85 (p<.001), respectively, whereas the interobserver agreement values for the first and second measurements with the ALAT were 0.91 and 0.88 (p<.001), respectively. The newly developed ALAT provided an acceptable optimal cutoff value for evaluating trunk imbalance in the coronal plane with a high level of intra-observer and interobserver agreement, which suggests that the ALAT is suitable for clinical use. Copyright © 2016 Elsevier Inc. All rights reserved.
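
The cutoff selection described in this record rests on the Youden index, J = sensitivity + specificity − 1, maximised over candidate thresholds on the ROC curve. The Python sketch below shows that selection on made-up data; the function name, the "score at or above threshold counts as positive" convention, and the sample angles are assumptions for illustration and are not the study's data.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.
    A case is called positive when its score is >= threshold."""
    n_pos = sum(1 for l in labels if l)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_threshold, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, l in zip(scores, labels) if l and s >= t)
        true_neg = sum(1 for s, l in zip(scores, labels) if not l and s < t)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # keep the lowest threshold among ties
            best_threshold, best_j = t, j
    return best_threshold, best_j


if __name__ == "__main__":
    # Hypothetical angles (degrees) and reference labels (1 = imbalanced).
    angles = [1.0, 1.5, 2.0, 2.6, 3.0, 3.5]
    imbalanced = [0, 0, 0, 1, 1, 1]
    print(youden_cutoff(angles, imbalanced))
```

A maximum J of 0.51, as reported above, means the best threshold still misclassifies a substantial share of cases, which is why the authors describe the cutoff as acceptable rather than definitive.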

  11. Agreement among undergraduate and graduate veterinary students and veterinary anesthesiologists on pain assessment in cats and dogs: A preliminary study

    PubMed Central

    Doodnaught, Graeme M.; Benito, Javier; Monteiro, Beatriz P.; Beauchamp, Guy; Grasso, Stefania C.; Steagall, Paulo V.

    2017-01-01

    This study investigated agreement among undergraduate and graduate veterinary students and veterinary anesthesiologists on video pain assessment at the University of Montreal. Pain assessment in dogs and cats appeared to be affected by gender, previous experience, and degree of training despite a small population of observers. PMID:28761184

  12. Interjudge agreement in videofluoroscopic studies of swallowing.

    PubMed

    Wilcox, F; Liss, J M; Siegel, G M

    1996-02-01

    Videofluoroscopic swallowing examinations of 3 patients with dysphagia were reviewed independently by 10 speech-language pathologists. Prior to viewing each video, clinicians were provided with information about the patient's history, the results of a bedside swallow examination, and oral-facial and oral motor control examinations. Clinicians completed a swallowing observation protocol as they viewed each video. They then recommended, from a list of treatment strategies, intervention techniques that would be most appropriate for each patient. Interjudge agreement was calculated by determining how many clinicians observed a given swallowing event or deficit, and how many recommended a given treatment strategy. Results suggest that the level of interjudge agreement for videofluoroscopic evaluations is not encouragingly high.

  13. Cervical Cancer Screening in Cameroon: Interobserver Agreement on the Interpretation of Digital Cervicography Results.

    PubMed

    Manga, Simon; Parham, Groesbeck; Benjamin, Nkoum; Nulah, Kathleen; Sheldon, Lisa Kennedy; Welty, Edith; Ogembo, Javier Gordon; Bradford, Leslie; Sando, Zacharie; Shields, Ray; Welty, Thomas

    2015-10-01

    The World Health Organization recommends visual inspection with acetic acid (VIA) for cervical cancer screening in resource-limited settings. In Cameroon, we use digital cervicography (DC) to capture images of the cervix after VIA. This study evaluated interobserver agreement of DC results, compared DC with histopathologic results, and examined interobserver agreement among screening methods. Three observers, blinded to each other's interpretations, evaluated 540 DC photographs as follows: (1) negative/positive for acetowhite lesions or cancer and (2) assigned a presumptive diagnosis of histopathologic lesion grade in the 91 cases that had a histopathologic diagnosis. Observer A was the actual screening nurse; B, a reproductive health nurse; C, a gynecologic oncologist; and D, the histopathologic diagnosis. We compared inter-rater agreement of DC impressions among observers A, B, and C, and with D, with Cohen kappas. For interpretations of DC, (negative/positive) strengths of agreement of paired observers were the following: A/B, moderate [K, 0.54; 95% confidence interval (CI), 0.47-0.61], A/C, fair (K, 0.37; 95% CI, 0.29-0.44), and B/C, moderate (K, 0.45; 95% CI, 0.37-0.53). For presumptive pathologic grading, strengths of agreement for weighted Ks were as follows: A/B, moderate (K, 0.42; 95% CI, 0.28-0.56); A/C, fair (K, 0.33; 95% CI, 0.20-0.46); B/C, fair (K, 0.54; 95% CI, 0.40-0.67); A/D, moderate (K, 0.59; 95% CI, 0.45-0.74); B/D, moderate (K, 0.58; 95% CI, 0.46-0.70); and C/D, moderate (K, 0.50; 95% CI, 0.37-0.63). Interobserver agreement of DC interpretations was mostly moderate among the 3 observers, between them and histopathology, and comparable to that of other visual-based screening methods, i.e., VIA, cytology, or colposcopy.

  14. The development of a reliable amateur boxing performance analysis template.

    PubMed

    Thomson, Edward; Lamb, Kevin; Nicholas, Ceri

    2013-01-01

    The aim of this study was to devise a valid performance analysis system for the assessment of the movement characteristics associated with competitive amateur boxing and assess its reliability using analysts of varying experience of the sport and performance analysis. Key performance indicators to characterise the demands of an amateur contest (offensive, defensive and feinting) were developed and notated using a computerised notational analysis system. Data were subjected to intra- and inter-observer reliability assessment using median sign tests and calculating the proportion of agreement within predetermined limits of error. For all performance indicators, intra-observer reliability revealed non-significant differences between observations (P > 0.05) and high agreement was established (80-100%) regardless of whether exact or the reference value of ±1 was applied. Inter-observer reliability was less impressive for both analysts (amateur boxer and experienced analyst), with the proportion of agreement ranging from 33-100%. Nonetheless, there was no systematic bias between observations for any indicator (P > 0.05), and the proportion of agreement within the reference range (±1) was 100%. A reliable performance analysis template has been developed for the assessment of amateur boxing performance and is available for use by researchers, coaches and athletes to classify and quantify the movement characteristics of amateur boxing.

  15. Mitosis Counting in Breast Cancer: Object-Level Interobserver Agreement and Comparison to an Automatic Method

    PubMed Central

    Veta, Mitko; van Diest, Paul J.; Jiwa, Mehdi; Al-Janabi, Shaimaa; Pluim, Josien P. W.

    2016-01-01

    Background: Tumor proliferation speed, most commonly assessed by counting of mitotic figures in histological slide preparations, is an important biomarker for breast cancer. Although mitosis counting is routinely performed by pathologists, it is a tedious and subjective task with poor reproducibility, particularly among non-experts. Inter- and intraobserver reproducibility of mitosis counting can be improved when a strict protocol is defined and followed. Previous studies have examined only the agreement in terms of the mitotic count or the mitotic activity score. Studies of the observer agreement at the level of individual objects, which can provide more insight into the procedure, have not been performed thus far. Methods: The development of automatic mitosis detection methods has received large interest in recent years. Automatic image analysis is viewed as a solution for the problem of subjectivity of mitosis counting by pathologists. In this paper we describe the results from an interobserver agreement study between three human observers and an automatic method, and make two unique contributions. For the first time, we present an analysis of the object-level interobserver agreement on mitosis counting. Furthermore, we train an automatic mitosis detection method that is robust with respect to staining appearance variability and compare it with the performance of expert observers on an “external” dataset, i.e. on histopathology images that originate from pathology labs other than the pathology lab that provided the training data for the automatic method. Results: The object-level interobserver study revealed that pathologists often do not agree on individual objects, even if this is not reflected in the mitotic count. The disagreement is larger for objects from smaller size, which suggests that adding a size constraint in the mitosis counting protocol can improve reproducibility. The automatic mitosis detection method can perform mitosis counting in an unbiased way, with substantial agreement with human experts. PMID:27529701

  16. Mitosis Counting in Breast Cancer: Object-Level Interobserver Agreement and Comparison to an Automatic Method.

    PubMed

    Veta, Mitko; van Diest, Paul J; Jiwa, Mehdi; Al-Janabi, Shaimaa; Pluim, Josien P W

    2016-01-01

    Tumor proliferation speed, most commonly assessed by counting of mitotic figures in histological slide preparations, is an important biomarker for breast cancer. Although mitosis counting is routinely performed by pathologists, it is a tedious and subjective task with poor reproducibility, particularly among non-experts. Inter- and intraobserver reproducibility of mitosis counting can be improved when a strict protocol is defined and followed. Previous studies have examined only the agreement in terms of the mitotic count or the mitotic activity score. Studies of the observer agreement at the level of individual objects, which can provide more insight into the procedure, have not been performed thus far. The development of automatic mitosis detection methods has received large interest in recent years. Automatic image analysis is viewed as a solution for the problem of subjectivity of mitosis counting by pathologists. In this paper we describe the results from an interobserver agreement study between three human observers and an automatic method, and make two unique contributions. For the first time, we present an analysis of the object-level interobserver agreement on mitosis counting. Furthermore, we train an automatic mitosis detection method that is robust with respect to staining appearance variability and compare it with the performance of expert observers on an "external" dataset, i.e. on histopathology images that originate from pathology labs other than the pathology lab that provided the training data for the automatic method. The object-level interobserver study revealed that pathologists often do not agree on individual objects, even if this is not reflected in the mitotic count. The disagreement is larger for objects from smaller size, which suggests that adding a size constraint in the mitosis counting protocol can improve reproducibility. 
    The automatic mitosis detection method can perform mitosis counting in an unbiased way, with substantial agreement with human experts.

  17. On the Agreement between Manual and Automated Methods for Single-Trial Detection and Estimation of Features from Event-Related Potentials

    PubMed Central

    Biurrun Manresa, José A.; Arguissain, Federico G.; Medina Redondo, David E.; Mørch, Carsten D.; Andersen, Ole K.

    2015-01-01

    The agreement between humans and algorithms on whether an event-related potential (ERP) is present or not and the level of variation in the estimated values of its relevant features are largely unknown. Thus, the aim of this study was to determine the categorical and quantitative agreement between manual and automated methods for single-trial detection and estimation of ERP features. To this end, ERPs were elicited in sixteen healthy volunteers using electrical stimulation at graded intensities below and above the nociceptive withdrawal reflex threshold. Presence/absence of an ERP peak (categorical outcome) and its amplitude and latency (quantitative outcome) in each single-trial were evaluated independently by two human observers and two automated algorithms taken from existing literature. Categorical agreement was assessed using percentage positive and negative agreement and Cohen’s κ, whereas quantitative agreement was evaluated using Bland-Altman analysis and the coefficient of variation. Typical values for the categorical agreement between manual and automated methods were derived, as well as reference values for the average and maximum differences that can be expected if one method is used instead of the others. Results showed that the human observers presented the highest categorical and quantitative agreement, and there were significantly large differences between detection and estimation of quantitative features among methods. In conclusion, substantial care should be taken in the selection of the detection/estimation approach, since factors like stimulation intensity and expected number of trials with/without response can play a significant role in the outcome of a study. PMID:26258532

  18. Validation of the Italian version of the Coma Recovery Scale-Revised (CRS-R).

    PubMed

    Sacco, Simona; Altobelli, Emma; Pistarini, Caterina; Cerone, Davide; Cazzulani, Benedetta; Carolei, Antonio

    2011-01-01

    To validate the Italian version of the Coma Recovery Scale-Revised (CRS-R). Two observers applied the Italian version of the CRS-R to selected patients. On day 1, observer A and B independently scored each patient; the comparison of their observations was used to evaluate inter-observer agreement. On day 2, observer A completed a second evaluation and the comparison of this observation with that obtained on day 1 by the same observer was used to evaluate test-re-test agreement. For each evaluation, also diagnostic impression (vegetative state/minimally conscious state) was reported. Thirty-eight patients were evaluated (mean age ± SD, 58.9 ± 13.8 years). Inter-observer (ρ = 0.81; p < 0.001) as well as test-re-test agreement (ρ = 0.97; p < 0.001) for the total score was high. Inter-observer agreement was excellent for the communication sub-scale, good for the auditory, visual and motor sub-scales and moderate for the oromotor/verbal and arousal sub-scales. Test-re-test agreement was excellent for the visual, motor, oromotor/verbal and communication sub-scales, good for the auditory sub-scale and moderate for the arousal sub-scale. When considering the diagnostic impression, inter-observer agreement was good (κ = 0.75; p < 0.001) and test-re-test agreement was excellent (κ = 0.92; p < 0.001). The Italian version of the CRS-R can be administered reliably and can be also employed to discriminate patients in vegetative and in minimally conscious state.

  19. Semiautomatic estimation of breast density with DM-Scan software.

    PubMed

    Martínez Gómez, I; Casals El Busto, M; Antón Guirao, J; Ruiz Perales, F; Llobet Azpitarte, R

    2014-01-01

    To evaluate the reproducibility of the calculation of breast density with DM-Scan software, which is based on the semiautomatic segmentation of fibroglandular tissue, and to compare it with the reproducibility of estimation by visual inspection. The study included 655 direct digital mammograms acquired using craniocaudal projections. Three experienced radiologists analyzed the density of the mammograms using DM-Scan, and the inter- and intra-observer agreement between pairs of radiologists for the Boyd and BI-RADS® scales were calculated using the intraclass correlation coefficient. The Kappa index was used to compare the inter- and intra-observer agreements with those obtained previously for visual inspection in the same set of images. For visual inspection, the mean interobserver agreement was 0.876 (95% CI: 0.873-0.879) on the Boyd scale and 0.823 (95% CI: 0.818-0.829) on the BI-RADS® scale. The mean intraobserver agreement was 0.813 (95% CI: 0.796-0.829) on the Boyd scale and 0.770 (95% CI: 0.742-0.797) on the BI-RADS® scale. For DM-Scan, the mean inter- and intra-observer agreement was 0.92, considerably higher than the agreement for visual inspection. The semiautomatic calculation of breast density using DM-Scan software is more reliable and reproducible than visual estimation and reduces the subjectivity and variability in determining breast density. Copyright © 2012 SERAM. Published by Elsevier Espana. All rights reserved.

  20. Interobserver agreement in CTG interpretation using the 2015 FIGO guidelines for intrapartum fetal monitoring.

    PubMed

    Rei, Mariana; Tavares, Sara; Pinto, Pedro; Machado, Ana P; Monteiro, Sofia; Costa, Antónia; Costa-Santos, Cristina; Bernardes, João; Ayres-De-Campos, Diogo

    2016-10-01

    Visual analysis of cardiotocographic (CTG) tracings has been shown to be prone to poor intra- and interobserver agreement when several interpretation guidelines are used, and this may have an important impact on the technology's performance. The aim of this study was to evaluate agreement in CTG interpretation using the new 2015 FIGO guidelines on intrapartum fetal monitoring. A pre-existing database of intrapartum CTG tracings was used to sequentially select 151 cases acquired with a fetal electrode, with duration exceeding 60 minutes, and signal loss less than 15%. These tracings were presented to six clinicians, three with more than 5 years' experience in the labor ward, and three with 5 or less years' experience. Observers were asked to evaluate tracings independently, to assess basic CTG features: baseline, variability, accelerations, decelerations, sinusoidal pattern, tachysystole, and to classify each tracing as normal, suspicious or pathologic, according to the 2015 FIGO guidelines on intrapartum fetal monitoring. Agreement between observers was evaluated using the proportions of agreement (Pa), with 95% confidence intervals (95%CI). A good interobserver agreement was found in the evaluation of most CTG features, but not bradycardia, reduced variability, saltatory pattern, absence of accelerations and absence of decelerations. For baseline classification Pa was 0.85 [0.82-0.90], for variability 0.82 [0.78-0.85], for accelerations 0.72 [0.68-0.75], for tachysystole 0.77 [0.74-0.81], for decelerations 0.92 [0.90-0.95], for variable decelerations 0.62 [0.58-0.65], for late decelerations 0.63 [0.59-0.66], for repetitive decelerations 0.73 [0.69-0.78], and for prolonged decelerations 0.81 [0.77-0.85]. For overall CTG classification, Pa was 0.60 [0.56-0.64], for classification as normal 0.67 [0.61-0.72], for suspicious 0.54 [0.48-0.60] and for pathologic 0.59 [0.51-0.66]. No differences in agreement according to the level of expertise were observed, except in the identification of accelerations, where it was better in the more experienced group. A good interobserver agreement was found in evaluation of most CTG features and in overall tracing classification. Results were better than those reported in previous studies evaluating agreement in overall tracing classification. Observer experience did not appear to play a role in agreement. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  1. Measures of Agreement Between Many Raters for Ordinal Classifications

    PubMed Central

    Nelson, Kerrie P.; Edwards, Don

    2015-01-01

    Screening and diagnostic procedures often require a physician's subjective interpretation of a patient's test result using an ordered categorical scale to define the patient's disease severity. Due to wide variability observed between physicians’ ratings, many large-scale studies have been conducted to quantify agreement between multiple experts’ ordinal classifications in common diagnostic procedures such as mammography. However, very few statistical approaches are available to assess agreement in these large-scale settings. Existing summary measures of agreement rely on extensions of Cohen's kappa [1 - 5]. These are prone to prevalence and marginal distribution issues, become increasingly complex for more than three experts or are not easily implemented. Here we propose a model-based approach to assess agreement in large-scale studies based upon a framework of ordinal generalized linear mixed models. A summary measure of agreement is proposed for multiple experts assessing the same sample of patients’ test results according to an ordered categorical scale. This measure avoids some of the key flaws associated with Cohen's kappa and its extensions. Simulation studies are conducted to demonstrate the validity of the approach with comparison to commonly used agreement measures. The proposed methods are easily implemented using the software package R and are applied to two large-scale cancer agreement studies. PMID:26095449

  2. The social perception of emotional abilities: expanding what we know about observer ratings of emotional intelligence.

    PubMed

    Elfenbein, Hillary Anger; Barsade, Sigal G; Eisenkraft, Noah

    2015-02-01

    We examine the social perception of emotional intelligence (EI) through the use of observer ratings. Individuals frequently judge others' emotional abilities in real-world settings, yet we know little about the properties of such ratings. This article examines the social perception of EI and expands the evidence to evaluate its reliability and cross-judge agreement, as well as its convergent, divergent, and predictive validity. Three studies use real-world colleagues as observers and data from 2,521 participants. Results indicate significant consensus across observers about targets' EI, moderate but significant self-observer agreement, and modest but relatively consistent discriminant validity across the components of EI. Observer ratings significantly predicted interdependent task performance, even after controlling for numerous factors. Notably, predictive validity was greater for observer-rated than for self-rated or ability-tested EI. We discuss the minimal associations of observer ratings with ability-tested EI, study limitations, future directions, and practical implications. PsycINFO Database Record (c) 2015 APA, all rights reserved.

  3. Intra- and inter-rater agreement between an ophthalmologist and mid-level ophthalmic personnel to diagnose retinal diseases based on fundus photographs at a primary eye center in Nepal: the Bhaktapur Retina Study.

    PubMed

    Thapa, Raba; Bajimaya, Sanyam; Bouman, Renske; Paudyal, Govinda; Khanal, Shankar; Tan, Stevie; Thapa, Suman S; van Rens, Ger

    2016-07-18

    Early detection can reduce irreversible blindness from retinal diseases. This study aims to assess the intra- and inter-rater agreement of retinal pathologies observed on fundus photographs between an ophthalmologist and two mid-level ophthalmic personnel (MLOPs). A population-based, cross-sectional study was conducted among subjects 60 years and above in the Bhaktapur district of Nepal. Fundus photographs of 500 eyes of 500 subjects were assessed. The macula-centered 45-degree photographs were graded twice by one ophthalmologist and two MLOPs. Intra-rater and inter-rater agreements were assessed for the ophthalmologist and the MLOPs. Mean age was 70.22 years ± 6.94 (SD). Retinal pathologies were observed in 55.6 % of photographs (age-related macular degeneration: 34.2 %; diabetic retinopathy: 4.2 %; retinal vein occlusion: 3.8 %). Twelve (2.4 %) fundus pictures were non-gradable. The intra-rater agreement for overall retinal pathologies, retinal hemorrhage, and maculopathy was substantial for both the ophthalmologist and the MLOPs. There was moderate inter-rater agreement between the ophthalmologist and the first MLOP on second rating for overall retinal pathologies [kappa (k); 95 % CI = 0.59 (0.51-0.66)], retinal hemorrhage [k; 95 % CI = 0.60 (0.41-0.78)], and maculopathy [k; 95 % CI = 0.52 (0.43-0.60)]. Inter-rater agreement between the ophthalmologist and the second MLOP for second rating was moderate for overall retinal pathologies [k; 95 % CI = 0.52 (0.44-0.60)], substantial for retinal hemorrhage [k; 95 % CI = 0.68 (0.52-0.84)], and moderate for maculopathy [k; 95 % CI = 0.59 (0.50-0.67)]. There is moderate agreement between the MLOPs and the ophthalmologist in grading fundus photographs for retinal hemorrhages and maculopathy.

  4. The intra- and inter-observer reliability of the physical examination methods used to assess patients with patellofemoral joint instability.

    PubMed

    Smith, Toby O; Clark, Allan; Neda, Sophia; Arendt, Elizabeth A; Post, William R; Grelsamer, Ronald P; Dejour, David; Almqvist, Karl Fredrik; Donell, Simon T

    2012-08-01

    An accurate physical examination of patients with patellar instability is an important aspect of the diagnosis and treatment. While previous studies have assessed the diagnostic accuracy of such physical examination tests, little has been undertaken to assess the inter- and intra-tester reliability of such techniques. The purpose of this study was to determine the inter- and intra-tester reliability of the physical examination tests used for patients with patellar instability. Five patients (10 knees) with bilateral recurrent patellar instability were assessed by five members of the International Patellofemoral Study Group. Each surgeon assessed each patient twice using 18 reported physical examination tests. The inter- and intra-observer reliability was assessed using weighted Kappa statistics with 95% confidence intervals. The findings of the study suggested that there was very poor inter-observer reliability for the majority of the physical tests, with only the assessments of patellofemoral crepitus, foot arch position and the J-sign presenting with fair to moderate agreement respectively. The intra-observer reliability indicated largely moderate to substantial agreement between the first and second tests performed by each assessor, with the greatest agreement seen for the assessment of tibial torsion, popliteal angle and the Bassett's sign. For the common physical examination tests used in the management of patients with patellar instability, inter-observer reliability is poor, while intra-observer reliability is moderate. Standardization of physical exam assessments and further study of these results among different clinicians and more divergent patient groups is indicated. Copyright © 2011 Elsevier B.V. All rights reserved.

  5. A Reliable, Feasible Method to Observe Neighborhoods at High Spatial Resolution

    PubMed Central

    Kepper, Maura M.; Sothern, Melinda S.; Theall, Katherine P.; Griffiths, Lauren A.; Scribner, Richard; Tseng, Tung-Sung; Schaettle, Paul; Cwik, Jessica M.; Felker-Kantor, Erica; Broyles, Stephanie T.

    2016-01-01

    Introduction Systematic social observation (SSO) methods traditionally measure neighborhoods at street level and have been performed reliably using virtual applications to increase feasibility. Research indicates that collection at even higher spatial resolution may better elucidate the health impact of neighborhood factors, but whether virtual applications can reliably capture social determinants of health at the smallest geographic resolution (parcel level) remains uncertain. This paper presents a novel, parcel-level SSO methodology and assesses whether this new method can be collected reliably using Google Street View and is feasible. Methods Multiple raters (N=5) observed 42 neighborhoods. In 2016, inter-rater reliability (observed agreement and kappa coefficient) was compared for four SSO methods: (1) street-level in person; (2) street-level virtual; (3) parcel-level in person; and (4) parcel-level virtual. Intra-rater reliability (observed agreement and kappa coefficient) was calculated to determine whether parcel-level methods produce results comparable to traditional street-level observation. Results Substantial levels of inter-rater agreement were documented across all four methods; all methods had >70% of items with at least substantial agreement. Only physical decay showed higher levels of agreement (83% of items with >75% agreement) for direct versus virtual rating source. Intra-rater agreement comparing street- versus parcel-level methods resulted in observed agreement >75% for all but one item (90%). Conclusions Results support the use of Google Street View as a reliable, feasible tool for performing SSO at the smallest geographic resolution. Validation of a new parcel-level method collected virtually may improve the assessment of social determinants contributing to disparities in health behaviors and outcomes. PMID:27989289

  6. Intramodality and intermodality agreement in radiography and computed tomography of equine distal limb fractures.

    PubMed

    Crijns, C P; Martens, A; Bergman, H-J; van der Veen, H; Duchateau, L; van Bree, H J J; Gielen, I M V L

    2014-01-01

    Computed tomography (CT) is increasingly accessible in equine referral hospitals. To document the level of agreement within and between radiography and CT in characterising equine distal limb fractures. Retrospective descriptive study. Images from horses that underwent radiographic and CT evaluation for suspected distal limb fractures were reviewed, including 27 horses and 3 negative controls. Using Cohen's kappa and weighted kappa analysis, the level of agreement among 4 observers for a predefined set of diagnostic characteristics for radiography and CT separately and for the level of agreement between the 2 imaging modalities were documented. Both CT and radiography had very good intramodality agreement in identifying fractures, but intermodality agreement was lower. There was good intermodality and intramodality agreement for anatomical localisation and the identification of fracture displacement. Agreement for articular involvement, fracture comminution and fracture fragment number was towards the lower limit of good agreement. There was poor to fair intermodality agreement regarding fracture orientation, fracture width and coalescing cracks; intramodality agreement was higher for CT than for radiography for these features. Further studies, including comparisons with surgical and/or post mortem findings, are required to determine the sensitivity and specificity of CT and radiography in the diagnosis and characterisation of equine distal limb fractures. © 2013 EVJ Ltd.

  7. New definitions of 6 clinical signs of perceptual disorder in children with cerebral palsy: an observational study through reliability measures.

    PubMed

    Ferrari, A; Sghedoni, A; Alboresi, S; Pedroni, E; Lombardi, F

    2014-12-01

    Recently authors have begun to emphasize the non-motor aspects of Cerebral Palsy and their influence on motor control and recovery prognosis. Much has been written about single clinical signs (i.e., startle reaction) but so far no definitions of the six perceptual signs presented in this study have appeared in literature. This study defines 6 signs (startle reaction, upper limbs in startle position, frequent eye blinking, posture freezing, averted eye gaze, grimacing) suggestive of perceptual disorders in children with cerebral palsy and measures agreement on sign recognition among independent observers and consistency of opinions over time. Observational study with both cross-sectional and prospective components. Fifty-six videos presented to observers in random order. Videos were taken from 19 children with a bilateral form of cerebral palsy referred to the Children Rehabilitation Unit in Reggio Emilia. Thirty-five rehabilitation professionals from all over Italy: 9 doctors and 26 physiotherapists. Measure of agreement among 35 independent observers was compiled from a sample of 56 videos. Interobserver reliability was determined using the K index of Fleiss and intra-observer reliability was calculated by the Spearman correlation index between ranks (rho - ρ). Percentage of agreement between observers and Gold Standard was used as criterion validity. Interobserver reliability was moderate for startle reaction, upper limb in startle position, averted eye gaze and eye-blinking and fair for posture freezing and grimacing. Intraobserver reliability remained consistent over time. Criterion validity revealed very high agreement between independent observer evaluation and gold standard. Semiotics of perceptual disorders can be used as a specific and sensitive instrument in order to identify a new class of patients within existing heterogeneous clinical types of bilateral cerebral palsy forms and could help clinicians in identifying functional prognosis. To provide clinicians with a definition of 6 clinical signs found in children with cerebral palsy in routine rehabilitation settings. Future research should explore the link between these signs and motor prognosis (i.e., time to independent walking).
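For readers unfamiliar with the "K index of Fleiss" used in this record, the following is a minimal sketch of the standard Fleiss' kappa formula for many raters. It is not the study's code, and the input counts are hypothetical. Each row of `counts` is one subject (for example one video) and each column is a category; a cell holds the number of raters who chose that category, so every row sums to the number of raters.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects x categories matrix of rater counts."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")
    n_categories = len(counts[0])
    # Overall share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: proportion of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_exp = sum(p * p for p in p_j)    # agreement expected by chance
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical data: 4 subjects, 3 raters, 2 categories (sign present / absent).
unanimous = [[3, 0], [0, 3], [3, 0], [0, 3]]
split = [[2, 1], [1, 2], [2, 1], [1, 2]]

print(fleiss_kappa(unanimous))
print(fleiss_kappa(split))
```

In this toy input, unanimous ratings give a kappa of 1, while two-to-one splits on every subject give a negative value, meaning agreement below what chance alone would produce.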

  8. Digital image analysis of Ki67 proliferation index in breast cancer using virtual dual staining on whole tissue sections: clinical validation and inter-platform agreement.

    PubMed

    Koopman, Timco; Buikema, Henk J; Hollema, Harry; de Bock, Geertruida H; van der Vegt, Bert

    2018-05-01

    The Ki67 proliferation index is a prognostic and predictive marker in breast cancer. Manual scoring is prone to inter- and intra-observer variability. The aims of this study were to clinically validate digital image analysis (DIA) of Ki67 using virtual dual staining (VDS) on whole tissue sections and to assess inter-platform agreement between two independent DIA platforms. Serial whole tissue sections of 154 consecutive invasive breast carcinomas were stained for Ki67 and cytokeratin 8/18 with immunohistochemistry in a clinical setting. Ki67 proliferation index was determined using two independent DIA platforms, implementing VDS to identify tumor tissue. Manual Ki67 score was determined using a standardized manual counting protocol. Inter-observer agreement between manual and DIA scores and inter-platform agreement between both DIA platforms were determined and calculated using Spearman's correlation coefficients. Correlations and agreement were assessed with scatterplots and Bland-Altman plots. Spearman's correlation coefficients were 0.94 (p < 0.001) for inter-observer agreement between manual counting and platform A, 0.93 (p < 0.001) between manual counting and platform B, and 0.96 (p < 0.001) for inter-platform agreement. Scatterplots and Bland-Altman plots revealed no skewness within specific data ranges. In the few cases with ≥ 10% difference between manual counting and DIA, results by both platforms were similar. DIA using VDS is an accurate method to determine the Ki67 proliferation index in breast cancer, as an alternative to manual scoring of whole sections in clinical practice. Inter-platform agreement between two different DIA platforms was excellent, suggesting vendor-independent clinical implementability.
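The Bland-Altman analysis mentioned in this record summarizes agreement between two measurement methods by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the paired differences. The sketch below shows that calculation under those conventional assumptions; the paired scores are invented and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired Ki67 scores (%): manual count vs. a digital platform.
manual = [10, 20, 30, 40]
digital = [12, 18, 33, 39]

bias, lower, upper = bland_altman_limits(manual, digital)
print(bias, round(lower, 2), round(upper, 2))
```

A correlation coefficient such as Spearman's rho can be high even when one method reads systematically higher than the other, which is why a bias and limits of agreement are often reported alongside it.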

  9. Spasticity, dyskinesia and ataxia in cerebral palsy: Are we sure we can differentiate them?

    PubMed

    Eggink, H; Kremer, D; Brouwer, O F; Contarino, M F; van Egmond, M E; Elema, A; Folmer, K; van Hoorn, J F; van de Pol, L A; Roelfsema, V; Tijssen, M A J

    2017-09-01

    Cerebral palsy (CP) can be classified as spastic, dyskinetic, ataxic or combined. Correct classification is essential for symptom-targeted treatment. This study aimed to investigate agreement among professionals on the phenotype of children with CP based on standardized videos. In a prospective, observational pilot study, videos of fifteen CP patients (8 boys, mean age 11 ± 5 y) were rated by three pediatric neurologists, three rehabilitation physicians and three movement disorder specialists. They scored the presence and severity of spasticity, ataxia or dyskinesias/dystonia. Inter- and intraobserver agreement were calculated using Cohen's and Fleiss' kappa. We found a fair inter-observer (κ = 0.36) and moderate intra-observer agreement (κ = 0.51) for the predominant motor symptom. This only slightly differed within the three groups of specialists (κ = 0.33-0.55). A large variability in the phenotyping of CP children was detected, not only between but also within clinicians, calling for a discussion on the operational definitions of spasticity, dystonia and ataxia. In addition, the low agreement found in our study questions the reliability of use of videos to measure intervention outcomes, such as deep brain stimulation in dystonic CP. Future studies should include functional domains to assess the true impact of management options in this highly challenging patient population. Copyright © 2017 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.

  10. Mismatch between perceived and objectively measured environmental obesogenic features in European neighbourhoods.

    PubMed

    Roda, C; Charreire, H; Feuillet, T; Mackenbach, J D; Compernolle, S; Glonti, K; Ben Rebah, M; Bárdos, H; Rutter, H; McKee, M; De Bourdeaudhuij, I; Brug, J; Lakerveld, J; Oppert, J-M

    2016-01-01

    Findings from research on the association between the built environment and obesity remain equivocal but may be partly explained by differences in approaches used to characterize the built environment. Findings obtained using subjective measures may differ substantially from those measured objectively. We investigated the agreement between perceived and objectively measured obesogenic environmental features to assess (1) the extent of agreement between individual perceptions and observable characteristics of the environment and (2) the agreement between aggregated perceptions and observable characteristics, and whether this varied by type of characteristic, region or neighbourhood. Cross-sectional data from the SPOTLIGHT project (n = 6037 participants from 60 neighbourhoods in five European urban regions) were used. Residents' perceptions were self-reported, and objectively measured environmental features were obtained by a virtual audit using Google Street View. Percent agreement and Kappa statistics were calculated. The mismatch was quantified at neighbourhood level by a distance metric derived from a factor map. The extent to which the mismatch metric varied by region and neighbourhood was examined using linear regression models. Overall, agreement was moderate (agreement < 82%, kappa < 0.3) and varied by obesogenic environmental feature, region and neighbourhood. Highest agreement was found for food outlets and outdoor recreational facilities, and lowest agreement was obtained for aesthetics. In general, a better match was observed in high-residential density neighbourhoods characterized by a high density of food outlets and recreational facilities. Future studies should combine perceived and objectively measured built environment qualities to better understand the potential impact of the built environment on health, particularly in low residential density neighbourhoods. © 2016 World Obesity.

  11. Assessing the influence of rater and subject characteristics on measures of agreement for ordinal ratings.

    PubMed

    Nelson, Kerrie P; Mitani, Aya A; Edwards, Don

    2017-09-10

    Widespread inconsistencies are commonly observed between physicians' ordinal classifications in screening tests results such as mammography. These discrepancies have motivated large-scale agreement studies where many raters contribute ratings. The primary goal of these studies is to identify factors related to physicians and patients' test results, which may lead to stronger consistency between raters' classifications. While ordered categorical scales are frequently used to classify screening test results, very few statistical approaches exist to model agreement between multiple raters. Here we develop a flexible and comprehensive approach to assess the influence of rater and subject characteristics on agreement between multiple raters' ordinal classifications in large-scale agreement studies. Our approach is based upon the class of generalized linear mixed models. Novel summary model-based measures are proposed to assess agreement between all, or a subgroup of raters, such as experienced physicians. Hypothesis tests are described to formally identify factors such as physicians' level of experience that play an important role in improving consistency of ratings between raters. We demonstrate how unique characteristics of individual raters can be assessed via conditional modes generated during the modeling process. Simulation studies are presented to demonstrate the performance of the proposed methods and summary measure of agreement. The methods are applied to a large-scale mammography agreement study to investigate the effects of rater and patient characteristics on the strength of agreement between radiologists. Copyright © 2017 John Wiley & Sons, Ltd.

  12. Seeing Eye to Eye: Predicting Teacher-Student Agreement on Classroom Social Networks

    ERIC Educational Resources Information Center

    Neal, Jennifer Watling; Cappella, Elise; Wagner, Caroline; Atkins, Marc S.

    2011-01-01

    This study examines the association between classroom characteristics and teacher-student agreement in perceptions of students' classroom peer networks. Social network, peer nomination, and observational data were collected from a sample of second through fourth grade teachers (N = 33) and students (N = 669) in 33 classrooms across five…

  13. Development of a set of activities to evaluate the arm and hand function in children with obstetric brachial plexus lesion.

    PubMed

    Boeschoten, K H; Folmer, K B; van der Lee, J H; Nollet, F

    2007-02-01

    To develop an observational instrument that can be used to evaluate the quality of arm and hand skills in daily functional activities in children with obstetric brachial plexus lesion (OBPL). A set of functional activities was constructed and standardized, and the intra-observer reliability of the assessment of this set of activities was studied. Department of Occupational Therapy and Department of Rehabilitation Medicine, VU University Medical Centre. Twenty-six children with OBPL in the age range of 4-6 years. The children were asked to perform 47 bimanual activities, which were recorded on videotape. The videotapes were scored twice by the same occupational therapist. The percentage of agreement in scoring 'hand-use', 'speed' and 'assistance' was over 80% for a substantial number of activities, indicating a strong agreement. However, in scoring 'deviations in movements and body posture' the percentage of agreement was insufficient in most activities. This set of activities has good potential for assessment of the performance of functional activities in children with OBPL. This study, however, showed a number of difficulties in observing and scoring the activities that have to be considered when developing a standardized video observation.

  14. Reliability of laser Doppler flowmetry curve reading for measurement of toe and ankle pressures: intra- and inter-observer variation.

    PubMed

    Høyer, C; Paludan, J P D; Pavar, S; Biurrun Manresa, J A; Petersen, L J

    2014-03-01

    To assess the intra- and inter-observer variation in laser Doppler flowmetry curve reading for measurement of toe and ankle pressures. A prospective single blinded diagnostic accuracy study was conducted on 200 patients with known or suspected peripheral arterial disease (PAD), with a total of 760 curve sets produced. The first curve reading for this study was performed by laboratory technologists blinded to clinical clues and previous readings at least 3 months after the primary data sampling. The pressure curves were later reassessed following another period of at least 3 months. Observer agreement in diagnostic classification according to TASC-II criteria was quantified using Cohen's kappa. Reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. The overall agreement in diagnostic classification (PAD/not PAD) was 173/200 (87%) for intra-observer (κ = .858) and 175/200 (88%) for inter-observer data (κ = .787). Reliability analysis confirmed excellent correlation for both intra- and inter-observer data (ICC all ≥.931). The coefficients of variance ranged from 2.27% to 6.44% for intra-observer and 2.39% to 8.42% for inter-observer data. Subgroup analysis showed lower observer-variation for reading of toe pressures in patients with diabetes and/or chronic kidney disease than patients not diagnosed with these conditions. Bland-Altman plots showed higher variation in toe pressure readings than ankle pressure readings. This study shows substantial intra- and inter-observer agreement in diagnostic classification and reading of absolute pressures when using laboratory technologists as observers. The study emphasises that observer variation for curve reading is an important factor concerning the overall reproducibility of the method. Our data suggest diabetes and chronic kidney disease have an influence on toe pressure reproducibility. Copyright © 2013 European Society for Vascular Surgery. Published by Elsevier Ltd. All rights reserved.

  15. Simple Rules, Not So Simple: The Use of International Ovarian Tumor Analysis (IOTA) Terminology and Simple Rules in Inexperienced Hands in a Prospective Multicenter Cohort Study.

    PubMed

    Meys, Evelyne; Rutten, Iris; Kruitwagen, Roy; Slangen, Brigitte; Lambrechts, Sandrina; Mertens, Helen; Nolting, Ernst; Boskamp, Dieuwke; Van Gorp, Toon

    2017-12-01

    To analyze how well untrained examiners - without experience in the use of International Ovarian Tumor Analysis (IOTA) terminology or simple ultrasound-based rules (simple rules) - are able to apply IOTA terminology and simple rules and to assess the level of agreement between non-experts and an expert. This prospective multicenter cohort study enrolled women with ovarian masses. Ultrasound was performed by non-expert examiners and an expert. Ultrasound features were recorded using IOTA nomenclature, and used for classifying the mass by simple rules. Interobserver agreement was evaluated with Fleiss' kappa and percentage agreement between observers. 50 consecutive women were included. We observed 46 discrepancies in the description of ovarian masses when non-experts utilized IOTA terminology. Tumor type was misclassified often (n = 22), resulting in poor interobserver agreement between the non-experts and the expert (kappa = 0.39, 95 % CI 0.244 - 0.529, percentage of agreement = 52.0 %). Misinterpretation of simple rules by non-experts was observed 57 times, resulting in an erroneous diagnosis in 15 patients (30 %). The agreement for classifying the mass as benign, malignant or inconclusive by simple rules was only moderate between the non-experts and the expert (kappa = 0.50, 95 % CI 0.300 - 0.704, percentage of agreement = 70.0 %). The level of agreement for all 10 simple rules features varied greatly (kappa index range: -0.08 - 0.74, percentage of agreement 66 - 94 %). Although simple rules are useful to distinguish benign from malignant adnexal masses, they are not that simple for untrained examiners. Training with both IOTA terminology and simple rules is necessary before simple rules can be introduced into guidelines and daily clinical practice. © Georg Thieme Verlag KG Stuttgart · New York.

  16. Effect of clinical information and previous exam execution on observer agreement and reliability in the analysis of hysteroscopic video-recordings.

    PubMed

    Martinho, Margarida Suzel Lopes; da Costa Santos, Cristina Maria Nogueira; Silva Carvalho, João Luís Mendonça; Bernardes, João Francisco Montenegro Andrade Lima

    2018-02-01

    Inter-observer agreement and reliability in hysteroscopic image assessment remain uncertain and the type of factors that may influence it has only been studied in relation to the experience of hysteroscopists. We aim to assess the effect of clinical information and previous exam execution on observer agreement and reliability in the analysis of hysteroscopic video-recordings. Ninety hysteroscopies were video-recorded and randomized into a group without (Group 1) and with clinical information (Group 2). The videos were independently analyzed by three hysteroscopists, regarding lesion location, dimension, and type, as well as decision to perform a biopsy. One of the hysteroscopists had executed all the exams before. Proportions of agreement (PA) and kappa statistics (κ) with 95% confidence intervals (95% CI) were used. In Group 2, there was a higher proportion of a normal diagnosis (p < 0.001) and a lower proportion of biopsies recommended (p = 0.027). Observer agreement and reliability were better in Group 2, with the PA and κ ranging, respectively, from 0.73 (95% CI 0.62, 0.83) and 0.44 (95% CI 0.26, 0.63), for image quality, to 0.94 (95% CI 0.88, 0.99) and 0.85 (95% CI 0.65, 0.95), for the decision to perform a biopsy. Execution of the exams before the analysis of the video-recordings did not significantly affect the results. With clinical information, agreement and reliability in the overall analysis of hysteroscopic video-recordings may reach almost perfect results and this was not significantly affected by the execution of the exams before the analysis. However, there is still uncertainty in the analysis of specific endometrial cavity abnormalities.

  17. Reproducibility of right-to-left shunt quantification using transthoracic contrast echocardiography in hereditary haemorrhagic telangiectasia.

    PubMed

    Vorselaars, V M M; Velthuis, S; Huitema, M P; Hosman, A E; Westermann, C J J; Snijder, R J; Mager, J J; Post, M C

    2018-04-01

    Transthoracic contrast echocardiography (TTCE) is recommended for screening of pulmonary arteriovenous malformations (PAVMs) in hereditary haemorrhagic telangiectasia. Shunt quantification is used to find treatable PAVMs. So far, there has been no study investigating the reproducibility of this diagnostic test. Therefore, this study aimed to describe inter-observer and inter-injection variability of TTCE. We conducted a prospective single centre study. We included all consecutive persons screened for presence of PAVMs in association with hereditary haemorrhagic telangiectasia in 2015. The videos of two contrast injections per patient were divided and reviewed by two cardiologists blinded for patient data. Pulmonary right-to-left shunts were graded using a three-grade scale. Inter-observer and inter-injection agreement was calculated with κ statistics for the presence and grade of pulmonary right-to-left shunts. We included 107 persons (accounting for 214 injections) (49.5% male, mean age 45.0 ± 16.6 years). A pulmonary right-to-left shunt was present in 136 (63.6%) and 131 (61.2%) injections for observer 1 and 2, respectively. Inter-injection agreement for the presence of pulmonary right-to-left shunts was 0.96 (95% confidence interval (CI) 0.9-1.0) and 0.98 (95% CI 0.94-1.00) for observer 1 and 2, respectively. Inter-injection agreement for pulmonary right-to-left shunt grade was 0.96 (95% CI 0.93-0.99) and 0.95 (95% CI 0.92-0.98) respectively. There was disagreement in right-to-left shunt grade between the contrast injections in 11 patients (10.3%). Inter-observer variability for presence and grade of the pulmonary right-to-left shunt was 0.95 (95% CI 0.91-0.99) and 0.97 (95% CI 0.95-0.99) respectively. TTCE has an excellent inter-injection and inter-observer agreement for both the presence and grade of pulmonary right-to-left shunts.

  18. Seeing Eye to Eye: Predicting Teacher-Student Agreement on Classroom Social Networks

    PubMed Central

    Neal, Jennifer Watling; Cappella, Elise; Wagner, Caroline; Atkins, Marc S.

    2010-01-01

    This study examines the association between classroom characteristics and teacher-student agreement in perceptions of students’ classroom peer networks. Social network, peer nomination, and observational data were collected from a sample of second through fourth grade teachers (N=33) and students (N=669) in 33 classrooms across five high poverty urban schools. Results demonstrate that variation in teacher-student agreement on the structure of students’ peer networks can be explained, in part, by developmental factors and classroom characteristics. Developmental increases in network density partially mediated the positive relationship between grade level and teacher-student agreement. Larger class sizes and higher levels of normative aggressive behavior resulted in lower levels of teacher-student agreement. Teachers’ levels of classroom organization had mixed influences, with behavior management negatively predicting agreement, and productivity positively predicting agreement. These results underscore the importance of the classroom context in shaping teacher and student perceptions of peer networks. PMID:21666768

  19. Quantitative blood flow measurements in gliomas using arterial spin-labeling at 3T: intermodality agreement and inter- and intraobserver reproducibility study.

    PubMed

    Hirai, T; Kitajima, M; Nakamura, H; Okuda, T; Sasao, A; Shigematsu, Y; Utsunomiya, D; Oda, S; Uetani, H; Morioka, M; Yamashita, Y

    2011-12-01

    QUASAR is a particular application of the ASL method and facilitates the user-independent quantification of brain perfusion. The purpose of this study was to assess the intermodality agreement of TBF measurements obtained with ASL and DSC MR imaging and the inter- and intraobserver reproducibility of glioma TBF measurements acquired by ASL at 3T. Two observers independently measured TBF in 24 patients with histologically proved glioma. ASL MR imaging with QUASAR and DSC MR imaging were performed on 3T scanners. The observers placed 5 regions of interest in the solid tumor on rCBF maps derived from ASL and DSC MR images and 1 region of interest in the contralateral brain and recorded the measured values. Maximum and average sTBF values were calculated. Intermodality and intra- and interobserver agreement were determined by using 95% Bland-Altman limits of agreement and ICCs. The intermodality agreement for maximum sTBF was good to excellent on DSC and ASL images; ICCs ranged from 0.718 to 0.884. The 95% limits of agreement ranged from 59.2% to 65.4% of the mean. ICCs for intra- and interobserver agreement for maximum sTBF ranged from 0.843 to 0.850 and from 0.626 to 0.665, respectively. The reproducibility of maximum sTBF measurements obtained by the two methods was similar. In the evaluation of sTBF in gliomas, ASL with QUASAR at 3T yielded measurements and reproducibility similar to those of DSC perfusion MR imaging.
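
    The 95% Bland-Altman limits of agreement mentioned above are conventionally the mean paired difference plus or minus 1.96 standard deviations of the differences. A minimal sketch follows, assuming paired measurements in the same units; the function name and the numbers are hypothetical. The abstract expresses its limits as a percentage of the mean, whereas this sketch returns them in the original units.

```python
import statistics


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # mean difference between the methods
    sd = statistics.stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired readings from two modalities.
asl = [10.0, 12.0, 14.0, 16.0]
dsc = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman_limits(asl, dsc)
print(bias, round(lower, 3), round(upper, 3))
```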

  20. A Population-Based Assessment of the Agreement Between Grading of Goniophotographic Images and Gonioscopy in the Chinese-American Eye Study (CHES).

    PubMed

    Murakami, Yohko; Wang, Dandan; Burkemper, Bruce; Lin, Shan C; Varma, Rohit

    2016-08-01

    To compare grading of goniophotographic images and gonioscopy in assessing the iridocorneal angle. In a population-based, cross-sectional study, participants underwent gonioscopy and goniophotographic imaging during the same visit. The iridocorneal angle was classified as closed if the posterior trabecular meshwork could not be seen. A single masked observer graded the goniophotographic images, and each eye was classified as having angle closure based on the number of closed quadrants. Agreement between the methods was analyzed by calculating kappa (κ) and first-order agreement coefficient (AC1) statistics and comparison of area under receiver operating characteristic curves (AUC). A total of 4149 Chinese Americans (3994 eyes) were included in this study. The agreement for angle closure diagnosis between gonioscopy and EyeCam was moderate to excellent (κ = 0.60, AC1 0.90, AUC 0.76-0.80). Detection of iridocorneal angle closure based on goniophotographic imaging shows moderate to very good agreement with angle closure assessment using gonioscopy.
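
    The gap between κ and AC1 in this record is typical when one category dominates, because the two statistics estimate chance agreement differently. A minimal sketch of Gwet's first-order agreement coefficient for two raters follows; the function name and the example gradings are hypothetical.

```python
def gwet_ac1(ratings_a, ratings_b):
    """Gwet's AC1 for two raters over nominal categories."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        return float("nan")  # undefined when only one category is ever used
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement uses the average prevalence of each category.
    chance = 0.0
    for c in categories:
        pi_c = (ratings_a.count(c) + ratings_b.count(c)) / (2 * n)
        chance += pi_c * (1.0 - pi_c)
    chance /= q - 1
    return (observed - chance) / (1.0 - chance)


# Hypothetical angle-closure gradings (1 = closed, 0 = open).
gonioscopy = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
photograph = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
ac1 = gwet_ac1(gonioscopy, photograph)
print(round(ac1, 3))
```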

  1. International study on inter-reader variability for circulating tumor cells in breast cancer

    PubMed Central

    2014-01-01

    Introduction Circulating tumor cells (CTCs) have been studied in breast cancer with the CellSearch® system. Given the low CTC counts in non-metastatic breast cancer, it is important to evaluate the inter-reader agreement. Methods CellSearch® images (N = 272) of either CTCs or white blood cells or artifacts from 109 non-metastatic (M0) and 22 metastatic (M1) breast cancer patients from reported studies were sent to 22 readers from 15 academic laboratories and 8 readers from two Veridex laboratories. Each image was scored as No CTC vs CTC HER2- vs CTC HER2+. The 8 Veridex readers were summarized to a Veridex Consensus (VC) to compare each academic reader using % agreement and kappa (κ) statistics. Agreement was compared according to disease stage and CTC counts using the Wilcoxon signed rank test. Results For CTC definition (No CTC vs CTC), the median agreement between academic readers and VC was 92% (range 69 to 97%) with a median κ of 0.83 (range 0.37 to 0.93). Lower agreement was observed in images from M0 (median 91%, range 70 to 96%) compared to M1 (median 98%, range 64 to 100%) patients (P < 0.001) and from M0 and <3CTCs (median 87%, range 66 to 95%) compared to M0 and ≥3CTCs samples (median 95%, range 77 to 99%), (P < 0.001). For CTC HER2 expression (HER2- vs HER2+), the median agreement was 87% (range 51 to 95%) with a median κ of 0.74 (range 0.25 to 0.90). Conclusions The inter-reader agreement for CTC definition was high. Reduced agreement was observed in M0 patients with low CTC counts. Continuous training and independent image review are required. PMID:24758318

  2. Reliability of the Robinson classification for displaced comminuted midshaft clavicular fractures.

    PubMed

    Stegeman, Sylvia A; Fernandes, Nicole C; Krijnen, Pieta; Schipper, Inger B

    2015-01-01

    This study aimed to assess the reliability of the Robinson classification for displaced comminuted midshaft fractures. A total of 102 surgeons and 52 radiologists classified 15 displaced comminuted midshaft clavicular fractures on anteroposterior (AP) and 30-degree caudocephalad radiographs twice. For both surgeons and radiologists, inter-observer and intra-observer agreement significantly improved after showing the 30-degree caudocephalad view in addition to the AP view. Radiologists had significantly higher inter- and intra-observer agreement than surgeons after judging both radiographs (κmultirater of 0.81 vs. 0.56; κintra-observer of 0.73 vs. 0.44). We advise to use two-plane radiography and to routinely incorporate the Robinson classification in the radiology reports. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Effects of Type of Agreement Violation and Utterance Position on the Auditory Processing of Subject-Verb Agreement: An ERP Study

    PubMed Central

    Dube, Sithembinkosi; Kung, Carmen; Peter, Varghese; Brock, Jon; Demuth, Katherine

    2016-01-01

    Previous ERP studies have often reported two ERP components—LAN and P600—in response to subject-verb (S-V) agreement violations (e.g., the boys *runs). However, the latency, amplitude and scalp distribution of these components have been shown to vary depending on various experiment-related factors. One factor that has not received attention is the extent to which the relative perceptual salience related to either the utterance position (verbal inflection in utterance-medial vs. utterance-final contexts) or the type of agreement violation (errors of omission vs. errors of commission) may influence the auditory processing of S-V agreement. The lack of reports on these effects in ERP studies may be due to the fact that most studies have used the visual modality, which does not reveal acoustic information. To address this gap, we used ERPs to measure the brain activity of Australian English-speaking adults while they listened to sentences in which the S-V agreement differed by type of agreement violation and utterance position. We observed early negative and positive clusters (AN/P600 effects) for the overall grammaticality effect. Further analysis revealed that the mean amplitude and distribution of the P600 effect was only significant in contexts where the S-V agreement violation occurred utterance-finally, regardless of type of agreement violation. The mean amplitude and distribution of the negativity did not differ significantly across types of agreement violation and utterance position. These findings suggest that the increased perceptual salience of the violation in utterance final position (due to phrase-final lengthening) influenced how S-V agreement violations were processed during sentence comprehension. Implications for the functional interpretation of language-related ERPs and experimental design are discussed. PMID:27625617

  4. Effects of Type of Agreement Violation and Utterance Position on the Auditory Processing of Subject-Verb Agreement: An ERP Study.

    PubMed

    Dube, Sithembinkosi; Kung, Carmen; Peter, Varghese; Brock, Jon; Demuth, Katherine

    2016-01-01

    Previous ERP studies have often reported two ERP components-LAN and P600-in response to subject-verb (S-V) agreement violations (e.g., the boys (*) runs). However, the latency, amplitude and scalp distribution of these components have been shown to vary depending on various experiment-related factors. One factor that has not received attention is the extent to which the relative perceptual salience related to either the utterance position (verbal inflection in utterance-medial vs. utterance-final contexts) or the type of agreement violation (errors of omission vs. errors of commission) may influence the auditory processing of S-V agreement. The lack of reports on these effects in ERP studies may be due to the fact that most studies have used the visual modality, which does not reveal acoustic information. To address this gap, we used ERPs to measure the brain activity of Australian English-speaking adults while they listened to sentences in which the S-V agreement differed by type of agreement violation and utterance position. We observed early negative and positive clusters (AN/P600 effects) for the overall grammaticality effect. Further analysis revealed that the mean amplitude and distribution of the P600 effect was only significant in contexts where the S-V agreement violation occurred utterance-finally, regardless of type of agreement violation. The mean amplitude and distribution of the negativity did not differ significantly across types of agreement violation and utterance position. These findings suggest that the increased perceptual salience of the violation in utterance final position (due to phrase-final lengthening) influenced how S-V agreement violations were processed during sentence comprehension. Implications for the functional interpretation of language-related ERPs and experimental design are discussed.

  5. Agreement on underlying causes of infant death between original records and after investigation: analysis of two biennia in the years 2000.

    PubMed

    dos Santos, Hellen Geremias; de Andrade, Selma Maffei; Silva, Ana Maria Rigo; de Carvalho, Wladithe Organ; Mesas, Arthur Eumann; González, Alberto Durán

    2014-01-01

    To analyze the agreement between underlying causes of infant deaths obtained from Death Certificates (DC) with those defined after investigation by the Municipal Committee for the Prevention of Maternal and Infant Mortality (CMPMMI), in Londrina, Paraná State, in the biennia 2000-2001 and 2007-2008. DC of infants and records of investigations were obtained from the CMPMMI. The causes of death registered in both sources were coded according to the International Classification of Diseases, tenth revision (ICD-10), and the underlying causes of deaths were selected. Agreement between underlying causes of deaths was verified by Kappa's (k) test and analyzed according to ICD-10 chapters and blocks of categories in both biennia. In 2000/2001, according to ICD-10 chapters, high agreement rates were observed for conditions originated in the perinatal period (k = 0.85) and for external causes (k = 0.84), while, for congenital malformations, there was a substantial agreement (k = 0.71). In 2007/2008, agreement was considered poor for all analyzed chapters. For blocks of categories, high or substantial agreement rates were observed only in the first biennium for "congenital malformations of the circulatory system" (k = 0.78) and for "other external causes of accidental injury" (k = 0.91). A decrease in agreement between the sources during the study period indicates either an improvement in the process of investigation of infant death by the CMPMMI and/or a worsening in the quality of the DC information.

  6. Observer agreement for detection of cardiac arrhythmias on telemetric ECG recordings obtained at rest, during and after exercise in 10 Warmblood horses.

    PubMed

    Trachsel, D S; Bitschnau, C; Waldern, N; Weishaupt, M A; Schwarzwald, C C

    2010-11-01

    Frequent supraventricular or ventricular arrhythmias during and after exercise are considered pathological in horses. Prevalence of arrhythmias seen in apparently healthy horses is still a matter of debate and may depend on breed, athletic condition and exercise intensity. To determine intra- and interobserver agreement for detection of arrhythmias at rest, during and after exercise using a telemetric electrocardiography device. The electrocardiogram (ECG) recordings of 10 healthy Warmblood horses (5 of which had an intracardiac catheter in place) undergoing a standardised treadmill exercise test were analysed at rest (R), during warm-up (W), during exercise (E), as well as during 0-5 min (PE(0-5)) and 6-45 min (PE(6-45)) recovery after exercise. The number and time of occurrence of physiological and pathological 'rhythm events' were recorded. Events were classified according to origin and mode of conduction. The agreement of 3 independent, blinded observers with different experience in ECG reading was estimated considering time of occurrence and classification of events. For correct timing and classification, intraobserver agreement for observer 1 was 97% (R), 100% (W), 20% (E), 82% (PE(0-5)) and 100% (PE(6-45)). Interobserver agreement between observer 1 vs. observer 2 and between observer 1 vs. 3, respectively, was 96 and 92.6% (R), 83 and 31% (W), 0 and 13% (E), 23 and 18% (PE(0-5)), and 67 and 55% (PE(6-45)). When including the events with correct timing but disagreement for classification, the intraobserver agreement increased to 94% during PE(0-5) and the interobserver agreement reached 83 and 50% (W), 20 and 50% (E), 41 and 47% (PE(0-5)), and 83.5 and 65% (PE(6-45)). The interobserver agreement increased with observer experience. Intra- and interobserver agreement for recognition and classification of events was good at R, but poor during E and poor-moderate during recovery periods. 
    These results highlight the limitations of stress ECG in horses and the need for high-quality recordings and adequate observer training. © 2010 EVJ Ltd.

  7. Comparison of High-Resolution MR Imaging and Digital Subtraction Angiography for the Characterization and Diagnosis of Intracranial Artery Disease.

    PubMed

    Lee, N J; Chung, M S; Jung, S C; Kim, H S; Choi, C-G; Kim, S J; Lee, D H; Suh, D C; Kwon, S U; Kang, D-W; Kim, J S

    2016-12-01

    High-resolution MR imaging has recently been introduced as a promising diagnostic modality in intracranial artery disease. Our aim was to compare high-resolution MR imaging with digital subtraction angiography for the characterization and diagnosis of various intracranial artery diseases. Thirty-seven patients who had undergone both high-resolution MR imaging and DSA for intracranial artery disease were enrolled in our study (August 2011 to April 2014). The time interval between the high-resolution MR imaging and DSA was within 1 month. The degree of stenosis and the minimal luminal diameter were independently measured by 2 observers in both DSA and high-resolution MR imaging, and the results were compared. Two observers independently diagnosed intracranial artery diseases on DSA and high-resolution MR imaging. The time interval between the diagnoses on DSA and high-resolution MR imaging was 2 weeks. Interobserver diagnostic agreement for each technique and intermodality diagnostic agreement for each observer were acquired. High-resolution MR imaging showed moderate-to-excellent agreement (intraclass correlation coefficient = 0.892-0.949; κ = 0.548-0.614) and significant correlations (R = 0.766-0.892) with DSA on the degree of stenosis and minimal luminal diameter. The interobserver diagnostic agreement was good for DSA (κ = 0.643) and excellent for high-resolution MR imaging (κ = 0.818). The intermodality diagnostic agreement was good (κ = 0.704) for observer 1 and moderate (κ = 0.579) for observer 2. High-resolution MR imaging may be an imaging method comparable with DSA for the characterization and diagnosis of various intracranial artery diseases. © 2016 by American Journal of Neuroradiology.

  8. Interobserver and intermodality agreement of standardized algorithms for non-invasive diagnosis of hepatocellular carcinoma in high-risk patients: CEUS-LI-RADS versus MRI-LI-RADS.

    PubMed

    Schellhaas, Barbara; Hammon, Matthias; Strobel, Deike; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger S; Cavallaro, Alexander; Janka, Rolf; Neurath, Markus F; Uder, Michael; Seuss, Hannes

    2018-04-19

    We compared the interobserver agreement for the recently introduced contrast-enhanced ultrasound (CEUS)-based algorithm CEUS-LI-RADS (Liver Imaging Reporting and Data System) versus the well-established magnetic resonance imaging (MRI)-LI-RADS for non-invasive diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 50 high-risk patients (mean age 66.2 ± 11.8 years; 39 male) were assessed retrospectively with CEUS and MRI. Two independent observers reviewed CEUS and MRI examinations, separately, classifying observations according to CEUS-LI-RADSv.2016 and MRI-LI-RADSv.2014. Interobserver agreement was assessed with Cohen's kappa. Forty-three lesions were HCCs; two were intrahepatic cholangiocarcinomas; five were benign lesions. Arterial phase hyperenhancement was perceived less frequently with CEUS than with MRI (37/50 / 38/50 lesions = 74%/78% [CEUS; observer 1/observer 2] versus 46/50 / 44/50 lesions = 92%/88% [MRI; observer 1/observer 2]). Washout appearance was observed in 34/50 / 20/50 lesions = 68%/40% with CEUS and 31/50 / 31/50 lesions = 62%/62% with MRI. Interobserver agreement was moderate for arterial hyperenhancement (κ = 0.511/0.565 [CEUS/MRI]) and "washout" (κ = 0.490/0.582 [CEUS/MRI]), fair for CEUS-LI-RADS category (κ = 0.309) and substantial for MRI-LI-RADS category (κ = 0.609). Intermodality agreement was fair for arterial hyperenhancement (κ = 0.329), slight to fair for "washout" (κ = 0.202) and LI-RADS category (κ = 0.218). CONCLUSION: Interobserver agreement is substantial for MRI-LI-RADS and only fair for CEUS-LI-RADS. This is mostly because interobserver agreement in the perception of washout appearance is better in MRI than in CEUS. Further refinement of the LI-RADS algorithms and increasing education and practice may be necessary to improve the concordance between CEUS and MRI for the final LI-RADS categorization.
    • CEUS-LI-RADS and MRI-LI-RADS enable standardized non-invasive diagnosis of HCC in high-risk patients. • With CEUS, interobserver agreement is better for arterial hyperenhancement than for "washout". • Interobserver agreement for major features is moderate for both CEUS and MRI. • Interobserver agreement for LI-RADS category is substantial for MRI, and fair for CEUS. • Interobserver agreement for CEUS-LI-RADS will presumably improve with ongoing use of the algorithm.

  9. A validity study of self-reported daily texting frequency, cell phone characteristics, and texting styles among young adults.

    PubMed

    Gold, Judith E; Rauscher, Kimberly J; Zhu, Motao

    2015-04-02

    Texting is associated with adverse health effects including musculoskeletal disorders, sleep disturbances, and traffic crashes. Many studies have relied on self-reported texting frequency, yet the validity of self-reports is unknown. Our objective was to provide some of the first data on the validity of self-reported texting frequency, cell phone characteristics including input device (e.g. touchscreen), key configuration (e.g., QWERTY), and texting styles including phone orientation (e.g., horizontal) and hands holding the phone while texting. Data were collected using a self-administered questionnaire and observation of a texting task among college students ages 18 to 24. To gauge agreement between self-reported and phone bill-derived categorical number of daily text messages sent, we calculated percent of agreement, Spearman correlation coefficient, and a linear weighted kappa statistic. For agreement between self-reported and observed cell phone characteristics and texting styles we calculated percentages of agreement. We used chi-square tests to detect significant differences (α = 0.05) by gender and study protocol. There were 106 participants; 87 of which had complete data for texting frequency analyses. Among these 87, there was 26% (95% CI: 21-31) agreement between self-reported and phone bill-derived number of daily text messages sent with a Spearman's rho of 0.48 and a weighted kappa of 0.17 (95% CI: 0.06-0.27). Among those who did not accurately report the number of daily texts sent, 81% overestimated this number. Among the full sample (n = 106), there was high agreement between self-reported and observed texting input device (96%, 95% CI: 91-99), key configuration (89%, 95% CI: 81-94), and phone orientation while texting (93%, 95% CI: 86-97). No differences were found by gender or study protocol among any items. 
    While young adults correctly reported their cell phone's characteristics and phone orientation while texting, most incorrectly estimated the number of daily text messages they sent. This suggests that while self-reported texting frequency may be useful for studies where relative ordering is adequate, it should not be used in epidemiologic studies to identify a risk threshold. For these studies, it is recommended that a less biased measure, such as a cell phone bill, be utilized.
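
    A linear weighted kappa, as used above for the ordered texting-frequency categories, gives partial credit to near misses. A minimal sketch follows, assuming categories coded as integers 0 to k-1; the function name and the example codes are hypothetical.

```python
def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linear weighted kappa for ordinal ratings coded 0..n_categories-1."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two ordered categories")
    n = len(ratings_a)
    k = n_categories

    def weight(i, j):
        # Full credit for exact agreement, none for maximal disagreement.
        return 1.0 - abs(i - j) / (k - 1)

    observed = sum(weight(a, b) for a, b in zip(ratings_a, ratings_b)) / n
    marg_a = [ratings_a.count(c) / n for c in range(k)]
    marg_b = [ratings_b.count(c) / n for c in range(k)]
    expected = sum(
        weight(i, j) * marg_a[i] * marg_b[j] for i in range(k) for j in range(k)
    )
    if expected == 1.0:
        return float("nan")  # undefined when chance agreement is already perfect
    return (observed - expected) / (1.0 - expected)


# Hypothetical frequency categories: self-report versus phone bill.
self_report = [0, 1, 2, 2, 1, 0]
phone_bill = [0, 1, 1, 2, 0, 0]
kappa_w = linear_weighted_kappa(self_report, phone_bill, 3)
print(round(kappa_w, 3))
```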

  10. Multi-rater Agreement in the Assessment of Anterior Cruciate Ligament Reconstruction Failure. A Radiographic and Video Analysis of the MARS Cohort

    PubMed Central

    Matava, Matthew J.; Arciero, Robert A.; Baumgarten, Keith M.; Carey, James L.; DeBerardino, Thomas M.; Hame, Sharon L.; Hannafin, Jo A.; Miller, Bruce S.; Nissen, Carl W.; Taft, Timothy N.; Wolf, Brian R.; Wright, Rick W.

    2015-01-01

    Background ACL reconstruction failure occurs in up to 10% of cases. Technical errors are considered the most common cause of graft failure despite the absence of validated studies. There is limited data regarding the agreement among orthopedic surgeons in terms of the etiology of primary ACL reconstruction failure and accuracy of graft tunnel placement. Purpose The purpose of this study is to test the hypothesis that experienced knee surgeons have a high level of inter-observer reliability in the agreement of the etiology of the primary ACL reconstruction failure, anatomical graft characteristics, and tunnel placement. Methods Twenty cases of revision ACL reconstruction were randomly selected from the MARS database. Each case included the patient's history, standardized radiographs, and a concise 30-second arthroscopic video taken at the time of revision demonstrating the graft remnant and location of the tunnel apertures. Ten MARS surgeons not involved with the primary surgery reviewed all 20 cases. Each surgeon completed a two-part questionnaire dealing with each surgeon's training and practice as well as the placement of the femoral and tibial tunnels, condition of the primary graft, and the surgeon's opinion as to the etiology of graft failure. Inter-rater agreement was determined for each question with the kappa coefficient and prevalence adjusted bias adjusted kappa (PABAK). Results The 10 reviewers were in practice an average of 14 years. All performed at least 25 ACL reconstructions per year and 9 were fellowship-trained in sports medicine. There was wide variability in agreement among knee experts as to the specific etiology of ACL graft failure. When specifically asked about technical error as the cause for failure, inter-observer agreement was only slight (prevalence adjusted bias adjusted kappa [PABAK]: 0.26). There was fair overall agreement on ideal femoral tunnel placement (PABAK: 0.55), but only slight agreement whether a femoral tunnel was too anterior (PABAK: 0.24) and fair agreement whether it was too vertical (PABAK: 0.46). There was poor overall agreement for ideal tibial tunnel placement (PABAK: 0.17). Conclusion This study suggests that more objective criteria are needed to accurately determine the etiology of primary ACL graft failure as well as the ideal femoral and tibial tunnel placement in patients undergoing revision ACL reconstruction. PMID:25537942
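
    PABAK replaces the data-driven chance term of kappa with a fixed 1/k, so it depends only on the observed agreement. A minimal two-rater sketch follows; the function name and the example answers are hypothetical, and how the study combined its 10 raters is not stated in the abstract.

```python
def pabak(ratings_a, ratings_b, n_categories=2):
    """Prevalence-adjusted bias-adjusted kappa for two raters."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    k = n_categories
    # With chance agreement fixed at 1/k, kappa reduces to this expression;
    # for two categories it is simply 2 * observed - 1.
    return (k * observed - 1.0) / (k - 1)


# Hypothetical yes (1) / no (0) answers to "was technical error the cause?"
surgeon_1 = [1, 1, 0, 1, 0, 0, 1, 1]
surgeon_2 = [1, 0, 0, 1, 0, 1, 1, 1]
value = pabak(surgeon_1, surgeon_2)
print(value)
```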

  11. Agreement in the assessment of metastatic spine disease using scoring systems.

    PubMed

    Arana, Estanislao; Kovacs, Francisco M; Royuela, Ana; Asenjo, Beatriz; Pérez-Ramírez, Ursula; Zamora, Javier

    2015-04-01

    To assess variability in the use of Tomita and modified Bauer scores in spine metastases. Clinical data and imaging from 90 patients with biopsy-proven spinal metastases, were provided to 83 specialists from 44 hospitals. Spinal levels involved and the Tomita and modified Bauer scores for each case were determined twice by each clinician, with a minimum of 6-week interval. Clinicians were blinded to every evaluation. Kappa statistic was used to assess intra and inter-observer agreement. Subgroup analyses were performed according to clinicians' specialty (medical oncology, neurosurgery, radiology, orthopedic surgery and radiation oncology), years of experience (⩽7, 8-13, ⩾14), and type of hospital (four levels). For metastases identification, intra-observer agreement was "substantial" (0.600.80) at the other levels. Inter-observer agreement was "almost perfect" at lumbar spine, and "substantial" at the other levels. Intra-observer agreement for the Tomita and Bauer scores was almost perfect. Inter-observer agreement was almost perfect for the Tomita score and substantial for the Bauer one. Results were similar across specialties, years of experience and type of hospital. Agreement in the assessment of metastatic spine disease is high. These scoring systems can improve communication among clinicians involved in oncology care. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  12. Validity of an activity monitor in young people with cerebral palsy gross motor function classification system level I.

    PubMed

    O' Donoghue, Deirdre; Kennedy, Norelee

    2014-11-01

    The activPAL™ activity monitor has potential for use in youth with Cerebral Palsy (CP) as it has demonstrated acceptable validity for the assessment of sedentary and physical activity in other populations. This study determined the validity of the activPAL™ activity monitor for the measurement of sitting, standing, walking time, transitions and step count for both legs in young people with hemiplegic and asymmetric diplegic CP. Seventeen participants with CP Gross Motor Function Classification System level I completed two video recorded test protocols that involved wearing an activPAL™ activity monitor on alternate legs. Agreement between observed video recorded data and activPAL™ activity monitor data was assessed using the Bland and Altman (BA) method and intraclass correlation coefficients (ICC 3,1). There was perfect agreement for transitions and high agreement for sitting (BA mean differences (MD): -1.8 and -1.8 s; ICCs: 0.49 and 0.95) standing (MD: 0.8 and 0.1 s; ICCs: 0.59 and 0.98) walking (MD: 1 and 1.1 s; ICCs: 0.99 and 0.94) timings and low agreement for step count (MD: 4.1 and 2.8 steps; ICCs: 0.96 and 0.95) for both legs. This study found clinically acceptable agreement with direct observation for all activPAL™ activity monitor functions, except for step count measurement with respect to the range of measurement values obtained for both legs in this study population.

  13. Digital Paper Prints as Replacement for LASER Films: A Study of Intra-Observer Agreement for Wrist Radiographic Findings in Rickets

    PubMed Central

    Jain, Abhinav; Anand, Surinder Pal Singh; Dang, Archana

    2016-01-01

    Introduction Replacement of conventional LASER films with digital paper prints as supplement to radiology reports may serve as an economical and environment friendly method. However, it is essential that such a change does not compromise patient’s intended diagnostic outcome. Aim The aim of this study was to assess the reliability and acceptability of digital paper prints for the radiographic images by the treating physicians and radiologists. Materials and Methods This observational analytical study was done at a tertiary care hospital of New Delhi, India. A total of 58 consecutively ordered wrist radiographs of paediatric patients (6 months to 12 years of age) for ruling out rickets were retrieved from the PACS (Picture Archiving and Communication System). These 58 radiographs, out of which 21 (36.2%) had radiological evidence of rickets over PACS were printed on two different media i.e., LASER films and glossy photographic paper. An objective scoring for the severity of rickets was done on both LASER films and paper prints by six observers independently. Overall comfort level with paper prints was rated on a 1-5 point Likert scale. Data was analysed using STATA 14.0 (Stata Corp, College Station, TX). Results Intra-observer percentage agreement and value of Cohen’s kappa for PACS vs. LASER films and PACS vs. paper prints was equal i.e., 98.3% and 0.97, respectively. Intra-observer agreement between LASER films and paper prints for all six observers was excellent, ranging from 0.92 to 1.00; percentage agreement ranging from 94.8% to 100%. Fracture of ulna/radius present in 4 sets of the X-rays was well demonstrated in both LASER films and paper prints. Comfort level with paper prints was rated as 5 out of 5 by all due to no requirement of any special illuminated view box and dark room. 
    Conclusion This study concludes that the use of paper prints may serve as a reliable alternative to LASER films to communicate the report of wrist radiographs for the treating physicians without any compromise over diagnostic information in cases of rickets. PMID:27656525

  14. Morphology vs morphokinetics: a retrospective comparison of inter-observer and intra-observer agreement between embryologists on blastocysts with known implantation outcome.

    PubMed

    Adolfsson, Emma; Andershed, Anna Nowosad

    2018-06-18

    Our primary aim was to compare morphology and morphokinetics in terms of inter- and intra-observer agreement for blastocysts with known implantation outcome. Our secondary aim was to validate the morphokinetic parameters' ability to predict pregnancy using a previously published selection algorithm, and to compare this to standard morphology assessments. Two embryologists made independent blinded annotations on two occasions using time-lapse images and morphology evaluations using the Gardner Schoolcraft criteria of 99 blastocysts with known implantation outcome. Inter- and intra-observer agreement was calculated and compared using the two methods. The embryos were grouped based on their morphological score, and on their morphokinetic class using a previously published selection algorithm. The implantation rates for each group were calculated and compared. There was moderate agreement for morphology, with agreement on the same embryo score in 55 of 99 cases. The highest agreement rate was found for expansion grade, followed by trophectoderm and inner cell mass. Correlation with pregnancy was inconclusive. For morphokinetics, almost perfect agreement was found for early and late embryo development events, and strong agreement for day-2 and day-3 events. When applying the selection algorithm, the embryo distributions were uneven, and correlation to pregnancy was inconclusive. Time-lapse annotation is consistent and accurate, but our external validation of a previously published selection algorithm was unsuccessful.

  15. Novel use of non-echo-planar diffusion weighted MRI in monitoring disease activity and treatment response in active Grave's orbitopathy: An initial observational cohort study.

    PubMed

    Lingam, Ravi Kumar; Mundada, Pravin; Lee, Vickie

    2018-01-10

    To examine the novel use of non-echo-planar diffusion weighted MRI (DWI) in depicting activity and treatment response in active Grave's orbitopathy (GO) by assessing, with inter-observer agreement, for a correlation between its apparent diffusion coefficients (ADCs) and conventional Short tau Inversion Recovery (STIR) MRI signal-intensity ratios (SIRs). A total of 23 actively inflamed muscles and 30 muscle response episodes were analysed in patients with active GO who underwent medical treatment. The MRI orbit scans, which included STIR sequences and non-echo-planar DWI, were evaluated. Two observers independently assessed the images qualitatively for the presence of activity in the extraocular muscles (EOMs) and recorded the STIR signal-intensity (SI), SIR (SI ratio of EOM/temporalis muscle), and ADC values of any actively inflamed muscle on the pre-treatment scans and their corresponding values on the subsequent post-treatment scans. Inter-observer agreement was examined. There was a significant positive correlation (0.57, p < 0.001) between ADC and both SIR and STIR SI of the actively inflamed EOM. There was also a significant positive correlation (0.75, p < 0.001) between SIR and ADC values depicting change in muscle activity associated with treatment response. There was good inter-observer agreement. Our preliminary results indicate that quantitative evaluation with non-echo-planar DWI ADC values correlates well with conventional STIR SIR in detecting active GO and monitoring its treatment response, with good inter-observer agreement.

  16. Blast and ballistic trajectories in combat casualties: a preliminary analysis using a cartesian positioning system with MDCT.

    PubMed

    Folio, Les R; Fischer, Tatjana; Shogan, Paul; Frew, Michael; Dwyer, Andrew; Provenzale, James M

    2011-08-01

    The purpose of this study is to determine the agreement with which radiologists identify wound paths in vivo on MDCT and calculate missile trajectories on the basis of Cartesian coordinates using a Cartesian positioning system (CPS). Three radiologists retrospectively identified 25 trajectories on MDCT in 19 casualties who sustained penetrating trauma in Iraq. Trajectories were described qualitatively in terms of directional path descriptors and quantitatively as trajectory vectors. Directional descriptors, trajectory angles, and angles between trajectories were calculated based on Cartesian coordinates of entrance and terminus or exit recorded in x, y image and table space (z) using a Trajectory Calculator created using spreadsheet software. The consistency of qualitative descriptor determinations was assessed in terms of frequency of observer agreement and multirater kappa statistics. Consistency of trajectory vectors was evaluated in terms of distribution of magnitude of the angles between vectors and the differences between their paraaxial and parasagittal angles. In 68% of trajectories, the observers' visual assessment of qualitative descriptors was congruent. Calculated descriptors agreed across observers in 60% of the trajectories. Estimated kappa also showed good agreement (0.65-0.79, p < 0.001); 70% of calculated paraaxial and parasagittal angles were within 20° across observers, and 61.3% of angles between trajectory vectors were within 20° across observers. Results show agreement of visually assessed and calculated qualitative descriptors and trajectory angles among observers. The Trajectory Calculator describes trajectories qualitatively similar to radiologists' visual assessment, showing the potential feasibility of automated trajectory analysis.
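
    The angle between two observers' trajectory vectors, as described in this abstract, can be sketched in a few lines. This is a minimal illustration and not the authors' Trajectory Calculator, which the abstract says was built in spreadsheet software; the function names and the sample coordinates are hypothetical.

```python
import math


def trajectory_vector(entrance, terminus):
    """Vector from the entrance point to the terminus (or exit) point, as (x, y, z)."""
    return tuple(t - e for e, t in zip(entrance, terminus))


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates recorded by two observers for the same wound path.
observer_1 = trajectory_vector((10.0, 20.0, 5.0), (40.0, 25.0, 15.0))
observer_2 = trajectory_vector((11.0, 19.0, 5.0), (38.0, 30.0, 14.0))
print(angle_between_deg(observer_1, observer_2) <= 20.0)
```

    The final line applies the 20° threshold quoted in the abstract to a single pair of vectors; a study-level summary would repeat this for every observer pair and trajectory.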

  17. Assisting Hand Assessment and Children's Hand-Use Experience Questionnaire -Observed Versus Perceived Bimanual Performance in Children with Unilateral Cerebral Palsy.

    PubMed

    Ryll, Ulrike C; Bastiaenen, Carolien H G; Eliasson, Ann-Christin

    2017-05-01

    To explore the differences, relationship, and extent of agreement between the Assisting Hand Assessment (AHA), measuring observed ability to perform bimanual tasks, and the Children's Hand-Use Experience Questionnaire (CHEQ), assessing experienced bimanual performance. This study investigates a convenience sample of 34 children (16 girls) with unilateral cerebral palsy aged 6-18 years (mean 12.1, SD 3.9) in a cross-sectional design. The AHA and CHEQ subscales share 8-25% of their variance (R²). Bland-Altman plots for AHA and all three CHEQ subscales indicate good average agreement, with a mean difference approaching zero but large 95% confidence intervals. Limits of agreement were extremely wide, indicating considerable disagreement between AHA and CHEQ subscales. AHA and CHEQ seem to measure different though somewhat related constructs of bimanual performance. Results of this investigation reinforce the recommendation to use both instruments to obtain complementary information about bimanual performance including observed and perceived performance of children with unilateral cerebral palsy.
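
    The Bland-Altman quantities mentioned in this abstract (mean difference and 95% limits of agreement) follow the standard definition: bias ± 1.96 times the standard deviation of the paired differences. The sketch below assumes both instruments have already been placed on a common scale; the scores are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for paired measurements x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired scores on a common 0-100 scale.
method_a = [50, 60, 70, 80]
method_b = [60, 50, 85, 65]
bias, lower, upper = bland_altman(method_a, method_b)
print(bias, lower, upper)
```

    In this toy data the bias is zero while the limits span roughly ±29 points, which mirrors the pattern the abstract reports: good average agreement alongside wide limits of agreement.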

  18. The Orientation of Gastric Biopsy Samples Improves the Inter-observer Agreement of the OLGA Staging System.

    PubMed

    Cotruta, Bogdan; Gheorghe, Cristian; Iacob, Razvan; Dumbrava, Mona; Radu, Cristina; Bancila, Ion; Becheanu, Gabriel

    2017-12-01

    Evaluation of severity and extension of gastric atrophy and intestinal metaplasia is recommended to identify subjects with a high risk for gastric cancer. The inter-observer agreement for the assessment of gastric atrophy is reported to be low. The aim of the study was to evaluate the inter-observer agreement for the assessment of severity and extension of gastric atrophy using oriented and unoriented gastric biopsy samples. Furthermore, the quality of biopsy specimens in oriented and unoriented samples was analyzed. A total of 35 subjects with dyspeptic symptoms referred for gastrointestinal endoscopy who agreed to enter the study were prospectively enrolled. The OLGA/OLGIM gastric biopsy protocol was used. From each subject two sets of biopsies were obtained (four from the antrum, two oriented and two unoriented; two from the gastric incisure, one oriented and one unoriented; four from the gastric body, two oriented and two unoriented). The orientation of the biopsy samples was completed using nitrocellulose filters (Endokit®, BioOptica, Milan, Italy). The samples were blindly examined by two experienced pathologists. Inter-observer agreement was evaluated using kappa statistic for inter-rater agreement. The quality of histopathology specimens taking into account the identification of lamina propria was analyzed in oriented vs. unoriented samples. The samples with detectable lamina propria mucosae were defined as good quality specimens. Categorical data were analyzed using chi-square test and a two-sided p value <0.05 was considered statistically significant. A total of 350 biopsy samples were analyzed (175 oriented / 175 unoriented). The kappa index values for oriented/unoriented OLGA 0/I/II/III and IV stages were 0.62/0.13, 0.70/0.20, 0.61/0.06, 0.62/0.46, and 0.77/0.50, respectively. For OLGIM 0/I/II/III stages the kappa index values for oriented/unoriented samples were 0.83/0.83, 0.88/0.89, 0.70/0.88 and 0.83/1, respectively. No case of OLGIM IV stage was found in the present case series. Good quality histopathology specimens were described in 95.43% of the oriented biopsy samples, and in 89.14% of the unoriented biopsy samples, respectively (p=0.0275). The orientation of gastric biopsy specimens improves the inter-observer agreement for the assessment of gastric atrophy.

  19. Development and Testing of the Observational System for Recording Physical Activity in Children: Elementary School

    PubMed Central

    McIver, Kerry L.; Brown, William H.; Pfeiffer, Karin A.; Dowda, Marsha; Pate, Russell R.

    2016-01-01

    Purpose This study describes the development and pilot testing of the Observational System for Recording Physical Activity-Elementary School (OSRAC-E) version. Methods This system was developed to observe and document the levels and types of physical activity and physical and social contexts of physical activity in elementary school students during the school day. Inter-observer agreement scores and summary data were calculated. Results All categories had Kappa statistics above 0.80, with the exception of the activity initiator category. Inter-observer agreement scores were 96% or greater. The OSRAC-E was shown to be a reliable observation system that allows researchers to assess physical activity behaviors, the contexts of those behaviors, and the effectiveness of physical activity interventions in the school environment. Conclusion The OSRAC-E can yield data with high interobserver reliability and provide relatively extensive contextual information about physical activity of students in elementary schools. PMID:26889587

  20. Evaluation of interobserver agreement for postoperative pain and sedation assessment in cats.

    PubMed

    Benito, Javier; Monteiro, Beatriz P; Beauchamp, Guy; Lascelles, B Duncan X; Steagall, Paulo V

    2017-09-01

    OBJECTIVE To evaluate agreement between observers with different training and experience for assessment of postoperative pain and sedation in cats by use of a dynamic and interactive visual analog scale (DIVAS) and for assessment of postoperative pain in the same cats with a multidimensional composite pain scale (MCPS). DESIGN Randomized, controlled, blinded study. ANIMALS 45 adult cats undergoing ovariohysterectomy. PROCEDURES Cats received 1 of 3 preoperative treatments: bupivacaine, IP; meloxicam, SC with saline (0.9% NaCl) solution, IP, (positive control); or saline solution only, IP (negative control). All cats received premedication with buprenorphine prior to general anesthesia. An experienced observer (observer 1; male; native language, Spanish) used scales in English, and an inexperienced observer (observer 2; female; native language, French) used scales in French to assess signs of sedation and pain. Rescue analgesia was administered according to MCPS scoring by observer 1. Mean pain and sedation scores per treatment and time point, proportions of cats in each group with MCPS scores necessitating rescue analgesia, and mean MCPS scores assigned at the time of rescue analgesia were compared between observers. Agreement was assessed by intraclass correlation coefficient determination. Percentage disagreement between observers on the need for rescue analgesia was calculated. RESULTS Interobserver agreements for pain scores were good, and that for sedation scores was fair. On the basis of observer 1's MCPS scores, a greater proportion of cats in the negative control group received rescue analgesia than in the bupivacaine or positive control groups. Scores from observer 2 indicated a greater proportion of cats in the negative control group than in the positive control group required rescue analgesia but identified no significant difference between the negative control and bupivacaine groups for this variable. Overall, disagreement regarding need for rescue analgesia was identified for 22 of 360 (6.1%) paired observations. CONCLUSIONS AND CLINICAL RELEVANCE Interobserver differences in assessing pain can lead to different conclusions regarding treatment effectiveness.

  1. Lenke and King classification systems for adolescent idiopathic scoliosis: interobserver agreement and postoperative results

    PubMed Central

    Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali

    2011-01-01

    Purpose The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. Methods The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. Results A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Conclusion Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification’s priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method. PMID:22267934

  2. Lenke and King classification systems for adolescent idiopathic scoliosis: interobserver agreement and postoperative results.

    PubMed

    Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali

    2011-01-01

    The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification's priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method.

  3. Comparative evaluation of RetCam vs. gonioscopy images in congenital glaucoma.

    PubMed

    Azad, Raj V; Chandra, Parijat; Chandra, Anuradha; Gupta, Aparna; Gupta, Viney; Sihota, Ramanjit

    2014-02-01

    To compare clarity, exposure and quality of anterior chamber angle visualization in congenital glaucoma patients, using RetCam and indirect gonioscopy images. Cross-sectional study. Participants were congenital glaucoma patients over the age of 5 years. A prospective consecutive pilot study was done in congenital glaucoma patients who were older than 5 years. The methods used were indirect gonioscopy and RetCam imaging. Clarity of the image, extent of angle visible and details of angle structures seen were graded for both methods, on digitally recorded images, in each eye, by two masked observers. Outcome measures were image clarity and interobserver agreement. 40 eyes of 25 congenital glaucoma patients were studied. RetCam image had excellent clarity in 77.5% of patients versus 47.5% by gonioscopy. The extent of angle seen was similar by both methods. Agreement between RetCam and gonioscopy images regarding details of angle structures was 72.50% by observer 1 and 65.00% by observer 2. There was good agreement between RetCam and indirect gonioscopy images in detecting angle structures of congenital glaucoma patients. However, RetCam provided greater clarity, with better quality, and higher magnification images. RetCam can be a useful alternative to gonioscopy in infants and small children without the need for general anesthesia.

  4. Comparative evaluation of RetCam vs. gonioscopy images in congenital glaucoma

    PubMed Central

    Azad, Raj V; Chandra, Parijat; Chandra, Anuradha; Gupta, Aparna; Gupta, Viney; Sihota, Ramanjit

    2014-01-01

    Purpose: To compare clarity, exposure and quality of anterior chamber angle visualization in congenital glaucoma patients, using RetCam and indirect gonioscopy images. Design: Cross-sectional study. Participants: Congenital glaucoma patients over the age of 5 years. Materials and Methods: A prospective consecutive pilot study was done in congenital glaucoma patients who were older than 5 years. The methods used were indirect gonioscopy and RetCam imaging. Clarity of the image, extent of angle visible and details of angle structures seen were graded for both methods, on digitally recorded images, in each eye, by two masked observers. Outcome Measures: Image clarity, interobserver agreement. Results: 40 eyes of 25 congenital glaucoma patients were studied. RetCam image had excellent clarity in 77.5% of patients versus 47.5% by gonioscopy. The extent of angle seen was similar by both methods. Agreement between RetCam and gonioscopy images regarding details of angle structures was 72.50% by observer 1 and 65.00% by observer 2. Conclusions: There was good agreement between RetCam and indirect gonioscopy images in detecting angle structures of congenital glaucoma patients. However, RetCam provided greater clarity, with better quality, and higher magnification images. RetCam can be a useful alternative to gonioscopy in infants and small children without the need for general anesthesia. PMID:24008788

  5. Chemical models of interstellar gas-grain processes. II - The effect of grain-catalysed methane on gas phase evolution

    NASA Technical Reports Server (NTRS)

    Brown, Paul D.; Charnley, S. B.

    1991-01-01

    The effects on gas phase chemistry which result from the continuous desorption of methane molecules from grain surfaces are studied. Significant and sustained enhancements in the abundances of several complex hydrocarbon molecules are found, in good agreement with their observed values in TMC-1. The overall agreement is, however, just as good for the case of zero CH4 desorption efficiency. It is thus impossible to determine from the models whether or not the grain-surface production of methane is responsible for the observed abundances of some hydrocarbon molecules.

  6. Do proposed facial expressions of contempt, shame, embarrassment, and compassion communicate the predicted emotion?

    PubMed

    Widen, Sherri C; Christy, Anita M; Hewett, Kristen; Russell, James A

    2011-08-01

    Shame, embarrassment, compassion, and contempt have been considered candidates for the status of basic emotions on the grounds that each has a recognisable facial expression. In two studies (N=88, N=60) on recognition of these four facial expressions, observers showed moderate agreement on the predicted emotion when assessed with forced choice (58%; 42%), but low agreement when assessed with free labelling (18%; 16%). Thus, even though some observers endorsed the predicted emotion when it was presented in a list, over 80% spontaneously interpreted these faces in a way other than the predicted emotion.

  7. Reliability of a rapid hematology stain for sputum cytology*

    PubMed Central

    Gonçalves, Jéssica; Pizzichini, Emilio; Pizzichini, Marcia Margaret Menezes; Steidle, Leila John Marques; Rocha, Cristiane Cinara; Ferreira, Samira Cardoso; Zimmermann, Célia Tânia

    2014-01-01

    Objective: To determine the reliability of a rapid hematology stain for the cytological analysis of induced sputum samples. Methods: This was a cross-sectional study comparing the standard technique (May-Grünwald-Giemsa stain) with a rapid hematology stain (Diff-Quik). Of the 50 subjects included in the study, 21 had asthma, 19 had COPD, and 10 were healthy (controls). From the induced sputum samples collected, we prepared four slides: two were stained with May-Grünwald-Giemsa, and two were stained with Diff-Quik. The slides were read independently by two trained researchers blinded to the identification of the slides. The reliability for cell counting using the two techniques was evaluated by determining the intraclass correlation coefficients (ICCs) for intraobserver and interobserver agreement. Agreement in the identification of neutrophilic and eosinophilic sputum between the observers and between the stains was evaluated with kappa statistics. Results: In our comparison of the two staining techniques, the ICCs indicated almost perfect interobserver agreement for neutrophil, eosinophil, and macrophage counts (ICC: 0.98-1.00), as well as substantial agreement for lymphocyte counts (ICC: 0.76-0.83). Intraobserver agreement was almost perfect for neutrophil, eosinophil, and macrophage counts (ICC: 0.96-0.99), whereas it was moderate to substantial for lymphocyte counts (ICC = 0.65 and 0.75 for the two observers, respectively). Interobserver agreement for the identification of eosinophilic and neutrophilic sputum using the two techniques ranged from substantial to almost perfect (kappa range: 0.91-1.00). Conclusions: The use of Diff-Quik can be considered a reliable alternative for the processing of sputum samples. PMID:25029648

  8. Interobserver Agreement for Contrast-Enhanced Ultrasound (CEUS)-Based Standardized Algorithms for the Diagnosis of Hepatocellular Carcinoma in High-Risk Patients.

    PubMed

    Schellhaas, Barbara; Pfeifer, Lukas; Kielisch, Christian; Goertz, Ruediger Stephan; Neurath, Markus F; Strobel, Deike

    2018-06-07

    This pilot study aimed at assessing interobserver agreement with two contrast-enhanced ultrasound (CEUS) algorithms for the diagnosis of hepatocellular carcinoma (HCC) in high-risk patients. Focal liver lesions in 55 high-risk patients were assessed independently by three blinded observers with two standardized CEUS algorithms: ESCULAP (Erlanger Synopsis of Contrast-Enhanced Ultrasound for Liver Lesion Assessment in Patients at risk) and ACR-CEUS-LI-RADSv.2016 (American College of Radiology CEUS-Liver Imaging Reporting and Data System). Lesions were categorized according to size and ultrasound contrast enhancement in the arterial, portal-venous and late phase. Interobserver agreement for assessment of enhancement pattern and categorization was compared between both CEUS algorithms. Additionally, diagnostic accuracy for the definitive diagnosis of HCC was compared. Histology and/or CE-MRI and follow-up served as reference standards. 55 patients were included in the study (male/female, 44/11; mean age: 65.9 years). 90.9% had cirrhosis. Histological findings were available in 39/55 lesions (70.9%). Reference standard of the 55 lesions revealed 48 HCCs, 2 intrahepatic cholangiocellular carcinomas (ICCs), and 5 non-HCC-non-ICC lesions. Interobserver agreement was moderate to substantial for arterial phase hyperenhancement (κ = 0.53-0.67), and fair to moderate for contrast washout in the portal-venous or late phase (κ = 0.33-0.53). Concerning the CEUS-based algorithms, the interreader agreement was substantial for the ESCULAP category (κ = 0.64-0.68) and fair for the CEUS-LI-RADS® category (κ = 0.3-0.39). Disagreement between observers was mostly due to different perception of washout. Interobserver agreement is better for ESCULAP than for CEUS-LI-RADS®. This is mostly due to the fact that perception of contrast washout varies between different observers. However, interobserver agreement is good for arterial phase hyperenhancement, which is the key diagnostic feature for the diagnosis of HCC with CEUS in the cirrhotic liver. © Georg Thieme Verlag KG Stuttgart · New York.
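
    The verbal labels in this abstract (fair, moderate, substantial) match the widely used Landis and Koch bands for kappa. The abstract does not name the scale it used, so the mapping below is an assumption offered only to make the labels easier to check against the reported values; the function name is hypothetical.

```python
def landis_koch_label(kappa):
    """Map a kappa value to its Landis and Koch (1977) descriptive band."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Kappa ranges quoted in the abstract above.
for low, high in [(0.53, 0.67), (0.33, 0.53), (0.64, 0.68), (0.30, 0.39)]:
    print(low, high, landis_koch_label(low), landis_koch_label(high))
```

    These bands are a convention rather than a statistical test, so two kappas falling in the same band can still differ meaningfully.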

  9. A Population-Based Assessment of the Agreement Between Grading of Goniophotographic Images and Gonioscopy in the Chinese-American Eye Study (CHES)

    PubMed Central

    Murakami, Yohko; Wang, Dandan; Burkemper, Bruce; Lin, Shan C.; Varma, Rohit

    2016-01-01

    Purpose To compare grading of goniophotographic images and gonioscopy in assessing the iridocorneal angle. Methods In a population-based, cross-sectional study, participants underwent gonioscopy and goniophotographic imaging during the same visit. The iridocorneal angle was classified as closed if the posterior trabecular meshwork could not be seen. A single masked observer graded the goniophotographic images, and each eye was classified as having angle closure based on the number of closed quadrants. Agreement between the methods was analyzed by calculating kappa (κ) and first-order agreement coefficient (AC1) statistics and comparison of area under receiver operating characteristic curves (AUC). Results A total of 4149 Chinese Americans (3994 eyes) were included in this study. The agreement for angle closure diagnosis between gonioscopy and EyeCam was moderate to excellent (κ = 0.60, AC1 0.90, AUC 0.76–0.80). Conclusions Detection of iridocorneal angle closure based on goniophotographic imaging shows moderate to very good agreement with angle closure assessment using gonioscopy. PMID:27571018

  10. High Agreement and High Prevalence: The Paradox of Cohen's Kappa.

    PubMed

    Zec, Slavica; Soriani, Nicola; Comoretto, Rosanna; Baldi, Ileana

    2017-01-01

    Cohen's Kappa is the most used agreement statistic in the literature. However, under certain conditions, it is affected by a paradox which returns biased estimates of the statistic itself. The aim of the study is to provide sufficient information which allows the reader to make an informed choice of the correct agreement measure, by underlining some optimal properties of Gwet's AC1 in comparison to Cohen's Kappa, using a real data example. During the process of literature review, we have asked a panel of three evaluators to come up with a judgment on the quality of 57 randomized controlled trials assigning a score to each trial using the Jadad scale. The quality was evaluated according to the following dimensions: adopted design, randomization unit, type of primary endpoint. With respect to each of the above described features, the agreement between the three evaluators has been calculated using Cohen's Kappa statistic and Gwet's AC1 statistic and, finally, the values have been compared with the observed agreement. The values of the Cohen's Kappa statistic would lead one to believe that the agreement levels for the variables Unit, Design and Primary Endpoints are totally unsatisfactory. The AC1 statistic, on the contrary, shows plausible values which are in line with the respective values of the observed concordance. We conclude that it would always be appropriate to adopt the AC1 statistic, thus bypassing any risk of incurring the paradox and drawing wrong conclusions about the results of agreement analysis.
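
    The paradox described in this abstract can be reproduced with a small worked example. The sketch below is a simplification under stated assumptions: it uses two raters and a binary rating, whereas the study had three evaluators, and the 2x2 counts are invented. It uses the standard two-rater formulas for Cohen's kappa and Gwet's AC1, which differ only in how chance agreement is estimated.

```python
def kappa_and_ac1(a, b, c, d):
    """Agreement statistics for a 2x2 table of two raters.

    a = both rate 'yes', d = both rate 'no',
    b = rater 1 'yes' / rater 2 'no', c = rater 1 'no' / rater 2 'yes'.
    Returns (observed agreement, Cohen's kappa, Gwet's AC1).
    """
    n = a + b + c + d
    p_obs = (a + d) / n
    r1_yes = (a + b) / n
    r2_yes = (a + c) / n
    # Cohen: chance agreement from the product of each rater's marginals.
    pe_kappa = r1_yes * r2_yes + (1 - r1_yes) * (1 - r2_yes)
    kappa = (p_obs - pe_kappa) / (1 - pe_kappa)
    # Gwet: chance agreement from the average prevalence across raters.
    pi = (r1_yes + r2_yes) / 2
    pe_ac1 = 2 * pi * (1 - pi)
    ac1 = (p_obs - pe_ac1) / (1 - pe_ac1)
    return p_obs, kappa, ac1


# Hypothetical high-prevalence table: 90 joint 'yes', 10 disagreements, 0 joint 'no'.
print(kappa_and_ac1(90, 5, 5, 0))
```

    With these counts the observed agreement is 0.90, yet kappa comes out slightly negative (about -0.05) while AC1 is about 0.89, which is the pattern the abstract describes.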

  11. Assessing agreement with relative area under the coverage probability curve.

    PubMed

    Barnhart, Huiman X

    2016-08-15

    There has been substantial statistical literature in the last several decades on assessing agreement, and the coverage probability approach was selected as a preferred index for assessing and improving measurement agreement in a core laboratory setting. With this approach, a satisfactory agreement is based on pre-specified high satisfactory coverage probability (e.g., 95%), given one pre-specified acceptable difference. In practice, we may want to have quality control on more than one pre-specified difference, or we may simply want to summarize the agreement based on differences up to a maximum acceptable difference. We propose to assess agreement via the coverage probability curve that provides a full spectrum of measurement error at various differences/disagreement. Relative area under the coverage probability curve is proposed for the summary of overall agreement, and this new summary index can be used for comparison of different intra-methods or inter-methods/labs/observers' agreement. Simulation studies and a blood pressure example are used for illustration of the methodology. Copyright © 2016 John Wiley & Sons, Ltd.
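
    One way to read the index described in this abstract: the coverage probability at a difference d is the share of paired differences with absolute value at most d, and the relative area is the area under that curve from 0 to the maximum acceptable difference, divided by that maximum. The sketch below is an empirical illustration under that reading; the paper's own estimator may differ, and the function names, the trapezoidal integration, and the sample differences are assumptions.

```python
def coverage_probability(diffs, d):
    """Fraction of paired differences whose absolute value is at most d."""
    return sum(abs(x) <= d for x in diffs) / len(diffs)


def relative_area_under_cp_curve(diffs, d_max, steps=1000):
    """Trapezoidal area under CP(d) on [0, d_max], divided by d_max.

    The result lies between 0 and 1; values near 1 mean most differences
    are small relative to d_max.
    """
    h = d_max / steps
    cp = [coverage_probability(diffs, i * h) for i in range(steps + 1)]
    area = h * (sum(cp) - 0.5 * (cp[0] + cp[-1]))
    return area / d_max


# Hypothetical paired differences (e.g., mmHg between two readers).
differences = [-3.0, 1.0, 0.5, 4.0, -2.0, 6.0, -1.0, 0.0]
print(coverage_probability(differences, 5.0))
print(relative_area_under_cp_curve(differences, 10.0))
```

    Unlike a single coverage probability, the relative area summarizes agreement across every threshold up to the maximum, which is the motivation the abstract gives for the index.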

  12. Agreement among Classroom Observers of Children's Stylistic Learning Behaviors.

    ERIC Educational Resources Information Center

    Buchanan, Helen Hamlet; McDermott, Paul A.; Schaefer, Barbara A.

    1998-01-01

    Investigates the interobserver agreement of the Learning Behavior Scale (LBS) by educators (n=16) observing students in special-education classes (n=72). No significant observer effect was found. Moreover, the LBS produced comparable levels of differential learning styles for assessments of individual children. (Author/MKA)

  13. Inter-rater reliability of the Full Outline of UnResponsiveness score and the Glasgow Coma Scale in critically ill patients: a prospective observational study

    PubMed Central

    2010-01-01

    Introduction The Glasgow Coma Scale (GCS) is the most widely used scoring system for comatose patients in intensive care. Limitations of the GCS include the impossibility to assess the verbal score in intubated or aphasic patients, and an inconsistent inter-rater reliability. The FOUR (Full Outline of UnResponsiveness) score, a new coma scale not reliant on verbal response, was recently proposed. The aim of the present study was to compare the inter-rater reliability of the GCS and the FOUR score among unselected patients in general critical care. A further aim was to compare the inter-rater reliability of neurologists with that of intensive care unit (ICU) staff. Methods In this prospective observational study, scoring of GCS and FOUR score was performed by neurologists and ICU staff on 267 consecutive patients admitted to intensive care. Results In a total of 437 pairwise ratings the exact inter-rater agreement for the GCS was 71%, and for the FOUR score 82% (P = 0.0016); the inter-rater agreement within a range of ± 1 score point for the GCS was 90%, and for the FOUR score 92% (P = ns). The exact inter-rater agreement among neurologists was superior to that among ICU staff for the FOUR score (87% vs. 79%, P = 0.04) but not for the GCS (73% vs. 73%). Neurologists and ICU staff did not significantly differ in the inter-rater agreement within a range of ± 1 score point for both GCS (88% vs. 93%) and the FOUR score (91% vs. 88%). Conclusions The FOUR score performed better than the GCS for exact inter-rater agreement, but not for the clinically more relevant agreement within the range of ± 1 score point. Though neurologists outperformed ICU staff with regard to exact inter-rater agreement, the inter-rater agreement of ICU staff within the clinically more relevant range of ± 1 score point equalled that of the neurologists. The small advantage in inter-rater reliability of the FOUR score is most likely insufficient to replace the GCS, a score with a long tradition in intensive care. PMID:20398274
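
    The two agreement measures reported in this abstract, exact agreement and agreement within ± 1 score point, are simple proportions over paired ratings. A minimal sketch, with a hypothetical function name and invented scores rather than study data:

```python
def agreement_rates(rater_a, rater_b, tolerance=1):
    """Return (exact agreement, agreement within +/- tolerance) as proportions."""
    pairs = list(zip(rater_a, rater_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    within = sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)
    return exact, within


# Hypothetical GCS totals (range 3-15) from two raters on four patients.
neurologist = [15, 10, 8, 3]
icu_nurse = [15, 11, 6, 3]
print(agreement_rates(neurologist, icu_nurse))
```

    Neither proportion corrects for chance agreement, so they are not directly comparable to kappa values reported elsewhere in this list.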

  14. How to determine leg dominance: The agreement between self-reported and observed performance in healthy adults

    PubMed Central

    Meddeler, Bart M.; Hoogeboom, Thomas J.; Nijhuis-van der Sanden, Maria W. G.; van Cingel, Robert E. H.

    2017-01-01

    Context For decades, leg dominance has been suggested to be important in rehabilitation and return to play in athletes with anterior cruciate ligament injuries. However, an ideal method to determine leg dominance in relation to task performance is still lacking. Objective To test the agreement between self-reported and observed leg dominance in bilateral mobilizing and unilateral stabilizing tasks, and to assess whether the dominant leg switches between bilateral mobilizing tasks and unilateral stabilizing tasks. Design Cross-sectional study. Participants Forty-one healthy adults: 21 men aged 36 ± 17 years old and 20 women aged 36 ± 15 years old. Measurement and analysis Participants self-reported leg dominance in the Waterloo Footedness Questionnaire-Revised (WFQ-R), and leg dominance was observed during performance of four bilateral mobilizing tasks and two unilateral stabilizing tasks. Descriptive statistics and crosstabs were used to report the percentages of agreement. Results The leg used to kick a ball had 100% agreement between the self-reported and observed dominant leg for both men and women. The dominant leg in kicking a ball and standing on one leg was the same in 66.7% of the men and 85.0% of the women. The agreement with jumping with one leg was lower: 47.6% for men and 70.0% for women. Conclusions It is appropriate to ask healthy adults: “If you would shoot a ball on a target, which leg would you use to shoot the ball?” to determine leg dominance in bilateral mobilizing tasks. However, a considerable number of the participants switched the dominant leg in a unilateral stabilizing task. PMID:29287067

  15. Gait Deviation Index, Gait Profile Score and Gait Variable Score in children with spastic cerebral palsy: Intra-rater reliability and agreement across two repeated sessions.

    PubMed

    Rasmussen, Helle Mätzke; Nielsen, Dennis Brandborg; Pedersen, Niels Wisbech; Overgaard, Søren; Holsgaard-Larsen, Anders

    2015-07-01

    The Gait Deviation Index (GDI) and Gait Profile Score (GPS) are the most used summary measures of gait in children with cerebral palsy (CP). However, the reliability and agreement of these indices have not been investigated, limiting their clinimetric quality for research and clinical practice. The aim of this study was to investigate the intra-rater reliability and agreement of summary measures of gait (GDI; GPS; and the Gait Variable Score (GVS) derived from the GPS). The intra-rater reliability and agreement were investigated across two repeated sessions in 18 children aged 5-12 years diagnosed with spastic CP. No systematic bias was observed between the sessions and no heteroscedasticity was observed in Bland-Altman plots. For the GDI and GPS, excellent reliability with intraclass correlation coefficient (ICC) values of 0.8-0.9 was found, while the GVS was found to have fair to good reliability with ICCs of 0.4-0.7. The agreement for the GDI and the logarithmically transformed GPS, in terms of the standard error of measurement as a percentage of the grand mean (SEM%) varied from 4.1 to 6.7%, whilst the smallest detectable change in percent (SDC%) ranged from 11.3 to 18.5%. For the logarithmically transformed GVS, we found a fair to large variation in SEM% from 7 to 29% and in SDC% from 18 to 81%. The GDI and GPS demonstrated excellent reliability and acceptable agreement proving that they can both be used in research and clinical practice. However, the observed large variability for some of the GVS requires cautious consideration when selecting outcome measures. Copyright © 2015 Elsevier B.V. All rights reserved.
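
    The abstract reports SEM% and SDC% but does not state the formulas used. A minimal sketch, assuming the commonly used definitions SEM = SD × sqrt(1 − ICC) and SDC = 1.96 × sqrt(2) × SEM; the function names are illustrative and the inputs are not study data beyond the reported 4.1% figure:

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a between-subject SD and an ICC.

    Assumes the common definition SEM = SD * sqrt(1 - ICC).
    """
    return sd * math.sqrt(1.0 - icc)


def sdc_from_sem(sem: float, z: float = 1.96) -> float:
    """Smallest detectable change for a test-retest design.

    Assumes SDC = z * sqrt(2) * SEM; sqrt(2) reflects two measurements.
    """
    return z * math.sqrt(2.0) * sem


# A SEM% of 4.1 (lower bound reported in the abstract) maps to an SDC% of about 11.4
# under these assumed formulas, close to the reported lower bound of 11.3.
sdc_percent = sdc_from_sem(4.1)
```

    Under these assumptions the SDC is always about 2.77 times the SEM, which is roughly the ratio between the SEM% and SDC% ranges quoted above.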

  16. An observational study of agreement between percentage pain reduction calculated from visual analog or numerical rating scales versus that reported by parturients during labor epidural analgesia.

    PubMed

    Pratici, E; Nebout, S; Merbai, N; Filippova, J; Hajage, D; Keita, H

    2017-05-01

    This study aimed to determine the level of agreement between calculated percentage pain reduction, derived from visual analog or numerical rating scales, and patient-reported percentage pain reduction in patients having labor epidural analgesia. In a prospective observational study, parturients were asked to rate their pain intensity on a visual analog scale and numerical rating scale, before and 30 min after initiation of epidural analgesia. The percentage pain reduction 30 min after epidural analgesia was calculated by the formula: 100 × (score before epidural analgesia − score 30 min after epidural analgesia) / score before epidural analgesia. To evaluate agreement between calculated percentage pain reduction and patient-reported percentage pain reduction, we computed the concordance correlation coefficient and performed Bland-Altman analysis. Ninety-seven women in labor were enrolled in the study, most of whom were nulliparous, with a singleton fetus and in spontaneous labor. The concordance correlation coefficient with patient-reported percentage pain reduction was 0.76 (95% CI 0.6 to 0.8) and 0.77 (95% CI 0.6 to 0.8) for the visual analog and numerical rating scale, respectively. The Bland-Altman mean difference between calculated percentage pain reduction and patient-reported percentage pain reduction for the visual analog and numerical rating scales was -2.0% (limits of agreement at 29.8%) and 0 (limits of agreement at 28.2%), respectively. The agreement between calculated percentage pain reduction from a visual analog or numerical rating scale and patient-reported percentage pain reduction in the context of labor epidural analgesia was moderate. The difference could range up to 30%. Patient-reported percentage pain reduction has advantages as a measurement tool for assessing pain management for childbirth but differences compared with other assessment methods should be taken into account. Copyright © 2017 Elsevier Ltd. All rights reserved.
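
    The percentage pain reduction formula quoted in the abstract can be written out directly. A minimal sketch; the function name and the example scores are hypothetical and are not study data:

```python
def percent_pain_reduction(score_before: float, score_after: float) -> float:
    """100 x (score before - score after) / score before, as in the abstract."""
    if score_before <= 0:
        # The formula is undefined when there is no baseline pain.
        raise ValueError("baseline score must be positive")
    return 100.0 * (score_before - score_after) / score_before


# Hypothetical numerical rating scale scores (0-10) before and 30 min after.
example_reduction = percent_pain_reduction(8, 2)
```

    The guard on the baseline score is a design choice of this sketch: a parturient who reports no pain before analgesia has no defined percentage reduction.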

  17. Lactate - Arterial and Venous Agreement in Sepsis: a prospective observational study.

    PubMed

    Datta, Deepankar; Grahamslaw, Julia; Gray, Alasdair J; Graham, Catriona; Walker, Craig A

    2018-04-01

    Sepsis is a common condition in the emergency department (ED). Lactate measurement is an important part of management: arterial lactate (A-LACT) measurement is the gold standard. There is increasing use of peripheral venous lactate (PV-LACT); however, there is little research supporting the interchangeability of the two measures. If PV-LACT has good agreement with A-LACT, it would significantly reduce patient discomfort and the risks of arterial sampling for a large group of acutely unwell patients, while allowing faster and wider screening, with potential reduced costs to the healthcare system. The aim of this study is to determine the agreement between PV-LACT and A-LACT in septic patients attending the ED. We carried out a prospective observational cohort study of 304 consented patients presenting with sepsis to a single UK NHS ED (110 000 adult attendances annually) taking paired PV-LACT and A-LACT. Bland-Altman analysis was carried out to determine agreement. Receiver operating characteristic curves and 2×2 tables were constructed to explore the predictive value of PV-LACT for A-LACT. The mean difference (PV-LACT minus A-LACT) is 0.4 mmol/l [95% confidence interval (CI): 0.37-0.45], with 95% limits of agreement from -0.4 (95% CI: -0.45 to -0.32) to 1.2 (95% CI: 1.14-1.27). A PV-LACT of at least 2 mmol/l predicts an A-LACT of at least 2 with 100% sensitivity (95% CI: 89-100%) and 83% specificity (95% CI: 77-87%). This study is the largest comparing the two measurements, and shows good clinical agreement. We recommend using PV-LACT in the routine screening of septic patients. A PV-LACT less than 2 mmol/l is predictive of an A-LACT less than 2 mmol/l.
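
    The Bland-Altman quantities reported above (mean difference and 95% limits of agreement) can be sketched as follows. This assumes the standard construction of bias ± 1.96 × SD of the paired differences; the paired values are hypothetical and are not the study's data:

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of the paired differences (a - b); the 95% limits of
    agreement are bias +/- 1.96 times the sample SD of those differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired lactate values in mmol/l (venous, arterial).
venous = [1.5, 2.4, 3.1, 0.9]
arterial = [1.1, 2.0, 2.5, 0.7]
bias, lower, upper = bland_altman(venous, arterial)
```

    A positive bias here means the first method reads higher on average, which matches the sign convention (PV-LACT minus A-LACT) used in the abstract.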

  18. An inter-observer Ki67 reproducibility study applying two different assessment methods: on behalf of the Danish Scientific Committee of Pathology, Danish breast cancer cooperative group (DBCG).

    PubMed

    Laenkholm, Anne-Vibeke; Grabau, Dorthe; Møller Talman, Maj-Lis; Balslev, Eva; Bak Jylling, Anne Marie; Tabor, Tomasz Piotr; Johansen, Morten; Brügmann, Anja; Lelkaitis, Giedrius; Di Caterino, Tina; Mygind, Henrik; Poulsen, Thomas; Mertz, Henrik; Søndergaard, Gorm; Bruun Rasmussen, Birgitte

    2018-01-01

    In 2011, the St. Gallen Consensus Conference introduced the use of pathology to define the intrinsic breast cancer subtypes by application of immunohistochemical (IHC) surrogate markers ER, PR, HER2 and Ki67 with a specified Ki67 cutoff (>14%) for luminal B-like definition. Reports concerning impaired reproducibility of Ki67 estimation and threshold inconsistency led to the initiation of this quality assurance study (2013-2015). The aim of the study was to investigate inter-observer variation for Ki67 estimation in malignant breast tumors by two different quantification methods (assessment method and count method) including measure of agreement between methods. Fourteen experienced breast pathologists from 12 pathology departments evaluated 118 slides from a consecutive series of malignant breast tumors. The staining interpretation was performed according to both the Danish and Swedish guidelines. Reproducibility was quantified by intra-class correlation coefficient (ICC) and Light's Kappa with dichotomization of observations at the larger than (>) 20% threshold. The agreement between observations by the two quantification methods was evaluated by Bland-Altman plot. For the fourteen raters the median ranged from 20% to 40% by the assessment method and from 22.5% to 36.5% by the count method. Light's Kappa was 0.664 for observation by the assessment method and 0.649 by the count method. The ICC was 0.82 (95% CI: 0.77-0.86) by the assessment method vs. 0.84 (95% CI: 0.80-0.87) by the count method. Although the study in general showed a moderate to good inter-observer agreement according to both ICC and Light's Kappa, major discrepancies were still identified, especially in the mid-range of observations. Consequently, for now Ki67 estimation is not implemented in the DBCG treatment algorithm.
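
    Light's kappa is the mean of Cohen's kappa over all pairs of raters. A minimal sketch of that calculation after dichotomizing at the >20% threshold mentioned above; the function names and Ki67 values are hypothetical, and the study's actual software is not stated in the abstract:

```python
from itertools import combinations


def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters scoring the same items."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    categories = set(r1) | set(r2)
    p_exp = sum((r1.count(c) / n) * (r2.count(c) / n) for c in categories)
    if p_exp == 1.0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return (p_obs - p_exp) / (1.0 - p_exp)


def lights_kappa(ratings_by_rater):
    """Mean of Cohen's kappa over every pair of raters."""
    pairs = list(combinations(ratings_by_rater, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical Ki67 percentages from three raters on four slides.
raw = [[10, 25, 30, 15], [12, 22, 18, 14], [10, 25, 30, 15]]
# Dichotomize at the "larger than 20%" threshold.
dichotomized = [[value > 20 for value in rater] for rater in raw]
kappa = lights_kappa(dichotomized)
```

    In this toy data the first and third raters agree perfectly (kappa 1) and each agrees with the second at kappa 0.5, so the average is 2/3.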

  19. Comparison of superior vena cava and femoroiliac vein pressure according to intra-abdominal pressure.

    PubMed

    Ait-Oufella, Hafid; Boelle, Pierre-Yves; Galbois, Arnaud; Baudel, Jean-Luc; Margetis, Dimitri; Alves, Mikael; Offenstadt, Georges; Maury, Eric; Guidet, Bertrand

    2012-06-28

    Previous studies have shown a good agreement between central venous pressure (CVP) measurements from catheters placed in superior vena cava and catheters placed in the abdominal cava/common iliac vein. However, the influence of intra-abdominal pressure on such measurements remains unknown. We conducted a prospective, observational study in a tertiary teaching hospital. We enrolled patients who had indwelling catheters in both superior vena cava (double lumen catheter) and femoroiliac veins (dialysis catheter) and into the bladder. Pressures were measured from all the sites: CVP, femoroiliac venous pressure (FIVP), and intra-abdominal pressure. A total of 30 patients were enrolled (age 62 ± 14 years; SAPS II 62 (52-76)). Fifty complete sets of measurements were performed. All of the studied patients were mechanically ventilated (PEP 3 cmH2O (2-5)). We observed that the concordance between CVP and FIVP decreased when intra-abdominal pressure increased. We identified 14 mmHg as the best intra-abdominal pressure cutoff, and we found that CVP and FIVP were significantly more in agreement below this threshold than above (94% versus 50%, P = 0.002). We reported that intra-abdominal pressure affected agreement between CVP measurements from catheter placed in superior vena cava and catheters placed in the femoroiliac vein. Agreement was excellent when intra-abdominal pressure was below 14 mmHg.

  20. Clinical evaluation of the 3M Littmann Electronic Stethoscope Model 3200 in 150 cats.

    PubMed

    Blass, Keith A; Schober, Karsten E; Bonagura, John D; Scansen, Brian A; Visser, Lance C; Lu, Jennifer; Smith, Danielle N; Ward, Jessica L

    2013-10-01

    Detection of murmurs and gallops may help to identify cats with heart disease. However, auscultatory findings may be subject to clinically relevant observer variation. The objective of this study was to evaluate an electronic stethoscope (ES) in cats. We hypothesized that the ES would perform at least as well as a conventional stethoscope (CS) in the detection of abnormal heart sounds. One hundred and fifty consecutive cats undergoing echocardiography were enrolled prospectively. Cats were ausculted with a CS (WA Tycos Harvey Elite) by two observers, and heart sounds were recorded digitally using an ES (3M Littmann Stethoscope Model 3200) for off-line analysis. Echocardiography was used as the clinical standard method for validation of auscultatory findings. Additionally, digital recordings (DRs) were assessed by eight independent observers with various levels of expertise, and compared using interclass correlation and Cohen's weighted kappa analyses. Using the CS, a heart murmur (n = 88 cats) or gallop sound (n = 17) was identified in 105 cats, whereas 45 cats lacked abnormal heart sounds. There was good total agreement (83-90%) between the two observers using the CS. In contrast, there was only moderate agreement (P <0.001) between results from the CS and the DRs for murmurs, and poor agreement for gallops. The CS was more sensitive compared with the DRs with regard to murmurs and gallops. Agreement among the eight observers was good-to-excellent for murmur detection (81%). In conclusion, DRs made with the ES are less sensitive but comparably specific to a CS at detecting abnormal heart sounds in cats.

  1. High-resolution dental magnetic resonance imaging for planning palatal graft surgery-a clinical pilot study.

    PubMed

    Hilgenfeld, Tim; Kästel, Thorsten; Heil, Alexander; Rammelsberg, Peter; Heiland, Sabine; Bendszus, Martin; Schwindling, Franz Sebastian

    2018-04-01

    To evaluate whether high-resolution, non-contrast-enhanced dental magnetic resonance imaging (MRI) can be used for accurate determination of palatal masticatory mucosa thickness (PMMT) and to locate the greater palatal artery (GPA). In five volunteers (four males, one female; mean age 30.2 ± 0.4 years), two independent raters measured PMMT by use of dental MRI in 180 positions. For comparison, clinical bone sounding was performed. The GPA was identified in time-of-flight (TOF) angiography and MSVAT-SPACE-prototype sequence. Intra- and inter-observer agreement for MRI measurements, agreement between MRI and bone sounding were analysed by intra-class correlation coefficient (ICC) and Cohen's kappa (κ). Reliability of dental MRI measurements was high (intra-observer-ICC 0.962; inter-observer ICC 0.959). Agreement of MRI measurements with bone sounding was moderate (ICC 0.744), and the GPA could be identified in 60% of measurement points using the TOF-angiography alone and in 85% with additional information of the MSVAT-SPACE. Good intra-observer agreement was observed for GPA identification (κ: 0.778). Palatal masticatory mucosa thickness measured by high-resolution, non-contrast enhanced dental MRI is comparable with that obtained by bone sounding. Dental MRI enables reliable, non-invasive and radiation-free planning of palatal tissue harvesting and can also be used for location of the GPA at 85% of measurement points, which might help reduce complications during surgery. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. Constraints on the Evolution of the Galaxy Stellar Mass Function I: Role of Star Formation, Mergers, and Stellar Stripping

    NASA Astrophysics Data System (ADS)

    Contini, E.; Kang, Xi; Romeo, A. D.; Xia, Q.

    2017-03-01

    We study the connection between the observed star formation rate-stellar mass (SFR-M*) relation and the evolution of the stellar mass function (SMF) by means of a subhalo abundance matching technique coupled to merger trees extracted from an N-body simulation. Our approach, which considers both galaxy mergers and stellar stripping, is to force the model to match the observed SMF at redshift z > 2, and let it evolve down to the present time according to the observed SFR-M* relation. In this study, we use two different sets of SMFs and two SFR-M* relations: a simple power law and a relation with a mass-dependent slope. Our analysis shows that the evolution of the SMF is more consistent with an SFR-M* relation with a mass-dependent slope, in agreement with predictions from other models of galaxy evolution and recent observations. In order to fully and realistically describe the evolution of the SMF, both mergers and stellar stripping must be considered, and we find that both have almost equal effects on the evolution of SMF at the massive end. Taking into account the systematic uncertainties in the observed data, the high-mass end of the SMF obtained by considering stellar stripping results in good agreement with recent observational data from the Sloan Digital Sky Survey. At log M* < 11.2, our prediction at z = 0.1 is close to Li & White data, but the high-mass end (log M* > 11.2) is in better agreement with D’Souza et al. data which account for more massive galaxies.

  3. Inter and intra-observer reliability in assessment of the position of the lateral sesamoid in determining the severity of hallux valgus.

    PubMed

    Panchani, Sunil; Reading, Jonathan; Mehta, Jaysheel

    2016-06-01

    The position of the lateral sesamoid on standard dorso-plantar weight bearing radiographs, with respect to the lateral cortex of the first metatarsal, has been shown to correlate well with the degree of the hallux valgus angle. This study aimed to assess the inter- and intra-observer error of this new classification system. Five orthopaedic consultants and five trainee orthopaedic surgeons were recruited to assess and document the degree of displacement of the lateral sesamoid on 144 weight-bearing dorso-plantar radiographs on two separate occasions. The severity of hallux valgus was defined as normal (0%), mild (≤50%), moderate (51-≤99%) or severe (≥100%) depending on the percentage displacement of the lateral sesamoid body from the lateral cortical border of the first metatarsal. Consultant intra-observer variability showed good agreement between repeated assessment of the radiographs (mean Kappa=0.75). Intra-observer variability for trainee orthopaedic surgeons also showed good agreement with a mean Kappa=0.73. Intraclass correlations for consultants and trainee surgeons was also high. The new classification system of assessing the severity of hallux valgus shows high inter- and intra-observer variability with good agreement and reproducibility between surgeons of consultant and trainee grades. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Reproducibility of the water drinking test.

    PubMed

    Muñoz, C R; Macias, J H; Hartleben, C

    2015-11-01

    To investigate the reproducibility of the water drinking test in determining intraocular pressure peaks and fluctuation. It has been suggested that there is limited agreement between the water drinking test and diurnal tension curve. This may be because it has only been compared with a 10-hour modified diurnal tension curve, missing 70% of IOP peaks that occurred during night. This was a prospective, analytical and comparative study that assesses the correlation, agreement, sensitivity and specificity of the water drinking test. The correlation between the water drinking test and diurnal tension curve was significant and strong (r=0.93, 95% confidence interval 0.79 to 0.96, p<.01). A moderate agreement was observed between these measurements (ρc=0.93, 95% confidence interval 0.87 to 0.95, p<.01). The agreement was within ±2 mmHg in 89% of the tests. Our study found a moderate agreement between the water drinking test and diurnal tension curve, in contrast with the poor agreement found in other studies, possibly due to the absence of nocturnal IOP peaks. These findings suggest that the water drinking test could be used to determine IOP peaks, as well as for determining baseline IOP. Copyright © 2014 Sociedad Española de Oftalmología. Published by Elsevier España, S.L.U. All rights reserved.

  5. Digital replication of chest radiographs without altering diagnostic observer performance

    NASA Astrophysics Data System (ADS)

    Flynn, Michael J.; Davies, Eric; Spizarny, David; Beute, Gordon H.; Peterson, Edward; Eyler, William R.; Gross, Barry; Chen, Ji

    1991-05-01

    A study to test the ability of a high-fidelity system to digitize chest radiographs, store the data in a computer, and reprint the film without altering diagnostic observer performance is reported. Two hundred and fifty-two (252) chest films with subtle image features indicative of interstitial disease, pulmonary nodule, or pneumothorax, along with 36 normal chest films were used in the study. Films were selected from a key word search on a computerized report archive and were graded by two experienced radiologists. Each film was digitized with 86 micron pixels and stored in 4000 X 5000 arrays using a research instrument. Replicates were printed using a commercial laser film printer (Eastman Kodak Company) having 80 micron pixels. Originals and replicates were observed separately by two different experienced radiologists. Each indicated a graded response for the three possible pathologies. The agreement of observers between responses for replicates and originals was described by the kappa statistic and compared to the agreement when rereading the original film. The final result of this study supports a hypothesis that the replicate is indistinguishable from the original.

  6. Reliability of a survey tool for measuring consumer nutrition environment in urban food stores.

    PubMed

    Hosler, Akiko S; Dharssi, Aliza

    2011-01-01

    Despite the increase in the volume and importance of food environment research, there is a general lack of reliable measurement tools. This study presents the development and reliability assessment of a tool for measuring consumer nutrition environment in urban food stores. Cross-sectional design. A racially diverse downtown portion (6 ZIP code areas) in Albany, New York. A sample of 39 food stores was visited by our research team in 2009 to 2010. These stores were randomly selected from 123 eligible food stores identified through multiple government lists and ground-truthing. The Food Retail Outlet Survey Tool was developed to assess the presence of selected food and nonfood items, placement, milk prices, physical characteristics of the store, policy implementation, and advertisements on outside windows. For in-store items, agreement of observations between experienced and lightly trained surveyors was assessed. For window advertisement assessments, inter-method agreement (on-site sketch vs digital photo), and inter-rater agreement (both on-site) among lightly trained surveyors were evaluated. Percent agreement, Kappa, and prevalence-adjusted bias-adjusted kappa were calculated for in-store observations. Intraclass correlation coefficients were calculated for window observations. Twenty-seven of the 47 in-store items had 100% agreement. The prevalence-adjusted bias-adjusted kappa indicated excellent agreement (≥0.90) on all items, except aisle width (0.74) and dark-green/orange colored fresh vegetables (0.85). The store type (nonconvenience store), the order of visits (first half), and the time to complete survey (>10 minutes) were associated with lower reliability in these 2 items. Both the inter-method and inter-rater agreements for window advertisements were uniformly high (intraclass correlation coefficient ranged 0.94-1.00), indicating high reliability. The Food Retail Outlet Survey Tool is a reliable tool for quickly measuring consumer nutrition environment. It can be effectively used by an individual who attended a 30-minute group briefing and practiced with 3 to 4 stores.

  7. Live Versus Televised Observations of Social Behavior in Preschool Children.

    ERIC Educational Resources Information Center

    Paulson, F. Leon

    1972-01-01

    An investigation to compare systematic behavioral observations made live with those made on television was conducted. The study was designed to answer three questions: (1) Is there a difference in the agreement between observers (Os) when both view an event Live and when both view the same event on Television? (2) Is there a difference in…

  8. Diagnostic value of computed tomography in dogs with chronic nasal disease.

    PubMed

    Saunders, Jimmy H; van Bree, Henri; Gielen, Ingrid; de Rooster, Hilde

    2003-01-01

    Computed tomographic (CT) studies of 80 dogs with chronic nasal disease (nasal neoplasia (n = 19), nasal aspergillosis (n = 46), nonspecific rhinitis (n = 11), and foreign body rhinitis (n = 4)) were reviewed retrospectively by two independent observers. Each observer filled out a custom-designed list to record his or her interpretation of the CT signs and selected a diagnosis. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for the diagnosis of each disease. The agreement between observers was evaluated. The CT signs corresponded to those previously described in the literature. CT had an accuracy greater than 90% for each observer in all disease processes. The sensitivity, specificity, PPV, and NPV were greater than 80% in all dogs with the exception of the PPV of foreign body rhinitis (80% for observer A and 44% for observer B). There was a substantial, to almost perfect, agreement between the two observers regarding the CT signs and diagnosis. This study indicates a high accuracy of CT for diagnosis of dogs with chronic nasal disease. The differentiation between nasal aspergillosis restricted to the nasal passages and foreign body rhinitis may be difficult when the foreign body is not visible.

  9. The efficacy of the Microsoft Kinect™ to assess human bimanual coordination.

    PubMed

    Liddy, Joshua J; Zelaznik, Howard N; Huber, Jessica E; Rietdyk, Shirley; Claxton, Laura J; Samuel, Arjmand; Haddad, Jeffrey M

    2017-06-01

    The Microsoft Kinect has been used in studies examining posture and gait. Despite the advantages of portability and low cost, this device has not been used to assess interlimb coordination. Fundamental insights into movement control, variability, health, and functional status can be gained by examining coordination patterns. In this study, we investigated the efficacy of the Microsoft Kinect to capture bimanual coordination relative to a research-grade motion capture system. Twenty-four healthy adults performed coordinated hand movements in two patterns (in-phase and antiphase) at eight movement frequencies (1.00-3.33 Hz). Continuous relative phase (CRP) and discrete relative phase (DRP) were used to quantify the means (mCRP and mDRP) and variability (sdCRP and sdDRP) of coordination patterns. Between-device agreement was assessed using Bland-Altman bias with 95 % limits of agreement, concordance correlation coefficients (absolute agreement), and Pearson correlation coefficients (relative agreement). Modest-to-excellent relative and absolute agreements were found for mCRP in all conditions. However, mDRP showed poor agreement for the in-phase pattern at low frequencies, due to large between-device differences in a subset of participants. By contrast, poor absolute agreement was observed for both sdCRP and sdDRP, while relative agreement ranged from poor to excellent. Overall, the Kinect captures the macroscopic patterns of bimanual coordination better than coordination variability.
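
    The abstract distinguishes absolute agreement (concordance correlation) from relative agreement (Pearson correlation). A minimal sketch of that difference, assuming Lin's concordance correlation coefficient with 1/n variance estimates; the data are hypothetical and are not from the study:

```python
import math


def _moments(x, y):
    """Means, population variances and covariance of two paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation: sensitive to linear association only."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / math.sqrt(sxx * syy)


def concordance_ccc(x, y):
    """Lin's concordance correlation: also penalizes shifts in location and scale."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical readings where device B is offset from device A by a constant.
device_a = [1.0, 2.0, 3.0, 4.0]
device_b = [2.0, 3.0, 4.0, 5.0]
r = pearson_r(device_a, device_b)
ccc = concordance_ccc(device_a, device_b)
```

    With a constant offset the Pearson correlation stays at 1 while the concordance coefficient drops to about 0.71, which is how relative agreement can be excellent while absolute agreement is poor.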

  10. Inter-rater reliability of direct observations of the physical and psychosocial working conditions in eldercare: An evaluation in the DOSES project.

    PubMed

    Karstad, Kristina; Rugulies, Reiner; Skotte, Jørgen; Munch, Pernille Kold; Greiner, Birgit A; Burdorf, Alex; Søgaard, Karen; Holtermann, Andreas

    2018-05-01

    The aim of the study was to develop and evaluate the reliability of the "Danish observational study of eldercare work and musculoskeletal disorders" (DOSES) observation instrument to assess physical and psychosocial risk factors for musculoskeletal disorders (MSD) in eldercare work. During 1.5 years, sixteen raters conducted 117 inter-rater observations from 11 nursing homes. Reliability was evaluated using percent agreement and Gwet's AC1 coefficient. Of the 18 examined items, inter-rater reliability was excellent for 7 items (AC1 >0.75), fair to good for 7 items (AC1 0.40-0.75), and poor for 2 items (AC1 0-0.40). For 2 items there was no agreement between the raters (AC1 <0). The reliability did not differ between the first and second half of the data collection period and the inter-rater observations were representative regarding occurrence of events in eldercare work. The instrument is appropriate for assessing physical and psychosocial risk factors for MSD among eldercare workers. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
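
    Gwet's AC1, used above alongside percent agreement, can be sketched for the two-rater case. This assumes the usual definition in which chance agreement is the sum of π(1 − π) over categories divided by (q − 1), where π is the mean of the two raters' marginal proportions; the ratings are hypothetical and are not DOSES data:

```python
def gwet_ac1(r1, r2, categories):
    """Gwet's AC1 for two raters over a fixed list of q >= 2 categories."""
    n = len(r1)
    q = len(categories)
    # Observed agreement: share of items given the same category by both raters.
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    # Average marginal proportion of each category across the two raters.
    pi = [(r1.count(c) + r2.count(c)) / (2.0 * n) for c in categories]
    p_chance = sum(p * (1.0 - p) for p in pi) / (q - 1)
    return (p_obs - p_chance) / (1.0 - p_chance)


# Hypothetical ratings of a rarely absent event (1 = observed, 0 = not observed).
rater_1 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
rater_2 = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
ac1 = gwet_ac1(rater_1, rater_2, categories=(0, 1))
```

    Passing the category list explicitly keeps the coefficient defined even when one category never appears in the sample. In this toy data percent agreement is 80% and AC1 is about 0.76, whereas Cohen's kappa would be slightly negative because one category dominates.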

  11. Comparison of Aerosol Optical Depth from Four Solar Radiometers During the Fall 1997 ARM Intensive Observation Period

    NASA Technical Reports Server (NTRS)

    Schmid, B.; Michalsky, J.; Halthore, R.; Beauharnois, M.; Harrison, L.; Livingston, J.; Russell, P.; Holben, B.; Eck, T.; Smirnov, A.

    2000-01-01

    In the Fall of 1997 the Atmospheric Radiation Measurement (ARM) program conducted an Intensive Observation Period (IOP) to study aerosols. Five sun-tracking radiometers were present to measure the total column aerosol optical depth. This comparison performed on the Southern Great Plains (SGP) demonstrates the capabilities and limitations of modern tracking sunphotometers at a location typical of where aerosol measurements are required. The key result was agreement in aerosol optical depth measured by 4 of the 5 instruments within 0.015 (rms). The key to this level of agreement was meticulous care in the calibrations of the instruments.

  12. Localizing value of pain distribution patterns in cervical spondylosis.

    PubMed

    Bunyaratavej, Krishnapundha; Montriwiwatnchai, Peerapong; Siwanuwatn, Rungsak; Khaoroptham, Surachai

    2015-04-01

    Prospective observational study. To investigate the value of pain distribution in localizing appropriate surgical levels in patients with cervical spondylosis. Previous studies have investigated the value of pain drawings in its correlation with various features in degenerative spine diseases including surgical outcome, magnetic resonance imaging findings, discographic study, and psychogenic issues. However, there is no previous study on the value of pain drawings in identifying symptomatic levels for the surgery in cervical spondylosis. The study collected data from patients with cervical spondylosis who underwent surgical treatment between August 2009 and July 2012. Pain diagrams drawn separately by each patient and physician were collected. Pain distribution patterns among various levels of surgery were analyzed by the chi-square test. Agreement between different pairs of data, including pain diagrams drawn by each patient and physician, intra-examiner agreement on interpretation of pain diagrams, inter-examiner agreement on interpretation of pain diagrams, interpretation of pain diagram by examiners and actual surgery, was analyzed by Kappa statistics. The study group consisted of 19 men and 28 women with an average age of 55.2 years. Average duration of symptoms was 16.8 months. There was no difference in the pain distribution pattern at any level of surgery. The agreement between pain diagram drawn by each patient and physician was moderate. Intra-examiner agreement was moderate. There was slight agreement of inter-examiners, examiners versus actual surgery. Pain distribution pattern by itself has limited value in identifying surgical levels in patients with cervical spondylosis.

  13. Evaluation of missing value methods for predicting ambient BTEX concentrations in two neighbouring cities in Southwestern Ontario Canada

    NASA Astrophysics Data System (ADS)

    Miller, Lindsay; Xu, Xiaohong; Wheeler, Amanda; Zhang, Tianchu; Hamadani, Mariam; Ejaz, Unam

    2018-05-01

    High density air monitoring campaigns provide spatial patterns of pollutant concentrations which are integral in exposure assessment. Such analysis can assist with the determination of links between air quality and health outcomes; however, problems due to missing data can threaten to compromise these studies. This research evaluates four methods (mean value imputation, inverse distance weighting (IDW), inter-species ratios, and regression) to address missing spatial concentration data ranging from one missing data point up to 50% missing data. BTEX (benzene, toluene, ethylbenzene, and xylenes) concentrations were measured in Windsor and Sarnia, Ontario in the fall of 2005. Concentrations and inter-species ratios were generally similar between the two cities. Benzene (B) was observed to be higher in Sarnia, whereas toluene (T) and the T/B ratios were higher in Windsor. Using these urban, industrialized cities as case studies, this research demonstrates that using inter-species ratios or regression of the data for which there is complete information, along with one measured concentration (i.e. benzene) to predict for missing concentrations (i.e. TEX) results in good agreement between predicted and measured values. In both cities, the general trend remains that best agreement is observed for the leave-one-out scenario, followed by 10% and 25% missing, and the least agreement for the 50% missing cases. In the absence of any known concentrations IDW can provide reasonable agreement between observed and estimated concentrations for the BTEX species, and was superior to mean value imputation, which was not able to preserve the spatial trend. The proposed methods can be used to fill in missing data, while preserving the general characteristics and rank order of the data which are sufficient for epidemiologic studies.
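
    Inverse distance weighting, one of the four methods named above, estimates a missing value as a weighted average of nearby measurements with weights 1/d^p. A minimal sketch; the power p = 2, the coordinates and the concentrations are assumptions for illustration, since the abstract does not give the study's settings:

```python
import math


def idw_estimate(target, sites, power=2.0):
    """Estimate a concentration at `target` from (x, y, value) monitoring sites.

    Each site is weighted by 1 / distance**power, so closer sites count more.
    """
    numerator = 0.0
    denominator = 0.0
    for x, y, value in sites:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a monitored site: use its value directly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical sites: (x, y, concentration).
sites = [(0.0, 0.0, 1.0), (2.0, 0.0, 3.0)]
midpoint_estimate = idw_estimate((1.0, 0.0), sites)
```

    Unlike mean value imputation, which assigns the same value everywhere, this estimate changes with location, which is why it can preserve a spatial trend.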

  14. An Instrument to Assess the Obesogenic Environment of Child Care Centers

    ERIC Educational Resources Information Center

    Ward, Dianne; Hales, Derek; Haverly, Katie; Marks, Julie; Benjamin, Sara; Ball, Sarah; Trost, Stewart

    2008-01-01

    Objectives: To describe protocol and interobserver agreements of an instrument to evaluate nutrition and physical activity environments at child care. Methods: Interobserver data were collected from 9 child care centers, through direct observation and document review (17 observer pairs). Results: Mean agreement between observer pairs was 87.26%…

  15. Clinical examination of pregnant women by paramedical and medical personnel: an assessment of consistency of findings in a field study.

    PubMed

    Srinivasan, K; Prakasam, C P; Rajaretnam, T; Praharaj, Purujit

    2006-01-01

    As a part of a project to improve the maternal and child health services in 4 primary health centres (PHCs) in Bellary and Raichur districts of Karnataka, we assessed the consistency in recording symptoms, signs and some clinical observations of pregnant women by three examiners: the junior health assistant, medical officer of the PHC and a private medical practitioner. One hundred seventy-four pregnant women were examined independently by the three examiners on the same day for 4 symptoms reported by the women themselves, 4 signs assessed by the examining person and 9 simple clinical observations. Agreement rates in each examiner pair for each parameter were assessed. We found poor rates of agreement in assessment of various parameters by each observer pair. The disagreement rates were lower between the two doctors compared with those between the junior health assistant and each doctor. The agreement rates between various healthcare personnel in assessing pregnant women are low. There is a need for measures to correct this situation.

  16. Validation of a Brief Questionnaire Against Direct Observation to Assess Adolescents' School Lunchtime Beverage Consumption.

    PubMed

    Grummon, Anna H; Hampton, Karla E; Hecht, Amelie; Oliva, Ariana; McCulloch, Charles E; Brindis, Claire D; Patel, Anisha I

    Beverage consumption is an important determinant of youth health outcomes. Beverage interventions often occur in schools, yet no brief validated questionnaires exist to assess whether these efforts improve in-school beverage consumption. This study validated a brief questionnaire to assess beverage consumption during school lunch. Researchers observed middle school students' (n = 25) beverage consumption during school lunchtime using a standardized tool. After lunch, students completed questionnaires regarding their lunchtime beverage consumption. Kappa statistics compared self-reported with observed beverage consumption across 15 beverage categories. Eight beverages showed at least fair agreement (kappa [κ] > 0.20) for both type and amount consumed, with most showing substantial agreement (κ > 0.60). One beverage had high raw agreement but κ < 0.20. Six beverages had too few ratings to compute κ's. This brief questionnaire was useful for assessing school lunchtime consumption of many beverages and provides a low-cost tool for evaluating school-based beverage interventions. Copyright © 2017 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
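
    The abstract's observation that one beverage had high raw agreement but κ < 0.20 follows from how Cohen's kappa corrects for chance agreement. The sketch below illustrates this with invented ratings; the data, the function name `cohen_kappa`, and the two-category coding are assumptions for illustration and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Return (raw agreement, Cohen's kappa) for two equal-length rating lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both ratings match.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two marginal proportions per category.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return p_observed, (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical rare beverage: 25 students, almost nobody drank it.
self_report = ["no"] * 23 + ["yes", "no"]
observation = ["no"] * 23 + ["no", "yes"]
raw_agreement, kappa = cohen_kappa(self_report, observation)
```

    In this invented case the two sources agree on 23 of 25 students, yet kappa is slightly negative because nearly all of that agreement is expected by chance when one category dominates.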

  17. Questionnaire layout and wording influence prevalence and risk estimates of respiratory symptoms in a population cohort.

    PubMed

    Ekerljung, Linda; Rönmark, Eva; Lötvall, Jan; Wennergren, Göran; Torén, Kjell; Lundbäck, Bo

    2013-01-01

    Results of epidemiological studies are greatly influenced by the chosen methodology. The study aims to investigate how two frequently used questionnaires (Qs), with partly different layout, influence the prevalence of respiratory symptoms. A booklet containing two Qs, the Global Allergy and Asthma European Network Q and the Obstructive Lung Disease in Northern Sweden Q, was mailed to 30,000 subjects aged 16-75 years in West Sweden; 62% responded. Sixteen questions were included in the analysis: seven identical between the Qs, four different in set-up and five with the same layout but different wording. Comparisons were made using differences in proportions, observed agreement and Kappa statistics. Identical questions yielded similar prevalences with high observed agreement and kappa values. Questions with different set-up or differences in wording resulted in significantly different prevalences with lower observed agreement and kappa values. In general, the use of follow-up questions, excluding subjects answering no to the initial question, resulted in 2.9-6.7% units lower prevalence. The question set-up has a great influence on epidemiological results, and specifically questions that are set up to be excluded based on a previous no answer lead to lower prevalence compared with detached questions. Therefore, Q layout and exact wording of questions have to be carefully considered when comparing studies. © 2012 Blackwell Publishing Ltd.

  18. Comparative measurement of collagen bundle orientation by Fourier analysis and semiquantitative evaluation: reliability and agreement in Masson's trichrome, Picrosirius red and confocal microscopy techniques.

    PubMed

    Marcos-Garcés, V; Harvat, M; Molina Aguilar, P; Ferrández Izquierdo, A; Ruiz-Saurí, A

    2017-08-01

    Measurement of collagen bundle orientation in histopathological samples is a widely used and useful technique in many research and clinical scenarios. Fourier analysis is the preferred method for performing this measurement, but the most appropriate staining and microscopy technique remains unclear. Some authors advocate the use of Haematoxylin-Eosin (H&E) and confocal microscopy, but there are no studies comparing this technique with other classical collagen stainings. In our study, 46 human skin samples were collected, processed for histological analysis and stained with Masson's trichrome, Picrosirius red and H&E. Five microphotographs of the reticular dermis were taken with a 200× magnification with light microscopy, polarized microscopy and confocal microscopy, respectively. Two independent observers measured collagen bundle orientation with semiautomated Fourier analysis with the Image-Pro Plus 7.0 software and three independent observers performed a semiquantitative evaluation of the same parameter. The average orientation for each case was calculated with the values of the five pictures. We analyzed the interrater reliability, the consistency between Fourier analysis and average semiquantitative evaluation and the consistency between measurements in Masson's trichrome, Picrosirius red and H&E-confocal. Statistical analysis for reliability and agreement was performed with the SPSS 22.0 software and consisted of intraclass correlation coefficient (ICC), Bland-Altman plots and limits of agreement and coefficient of variation. Interrater reliability was almost perfect (ICC > 0.8) with all three histological and microscopy techniques and always superior in Fourier analysis than in average semiquantitative evaluation. 
    Measurements were consistent between Fourier analysis by one observer and average semiquantitative evaluation by three observers, with an almost perfect agreement with Masson's trichrome and Picrosirius red techniques (ICC > 0.8) and a strong agreement with H&E-confocal (0.7 < ICC < 0.8). Comparison of measurements between the three techniques for the same observer showed an almost perfect agreement (ICC > 0.8), better with Fourier analysis than with semiquantitative evaluation (single and average). These results in nonpathological skin samples were also confirmed in a preliminary analysis in eight scleroderma skin samples. Our results show that Masson's trichrome and Picrosirius red are consistent with H&E-confocal for measuring collagen bundle orientation in histological samples and could thus be used indistinctly for this purpose. Fourier analysis is superior to average semiquantitative evaluation and should keep being used as the preferred method. © 2017 The Authors Journal of Microscopy © 2017 Royal Microscopical Society.

  19. Magnetic resonance imaging versus computed tomography to plan hemilaminectomies in chondrodystrophic dogs with intervertebral disc extrusion.

    PubMed

    Noyes, Julie A; Thomovsky, Stephanie A; Chen, Annie V; Owen, Tina J; Fransson, Boel A; Carbonneau, Kira J; Matthew, Susan M

    2017-10-01

    To determine the influence of preoperative computed tomography (CT) versus magnetic resonance (MR) on hemilaminectomies planned to treat thoracolumbar (TL) intervertebral disc (IVD) extrusions in chondrodystrophic dogs. Prospective clinical study. Forty chondrodystrophic dogs with TL IVD extrusion and preoperative CT and MR studies. MR and CT images were randomized and reviewed by 4 observers masked to the dog's identity and corresponding imaging studies. Observers planned the location along the spine, side, and extent (number of articular facets to be removed) based on individual reviews of CT and MR studies. Intra-observer agreement was determined between overall surgical plan, location, side, and size of the hemilaminectomy planned on CT versus MR of the same dog. Similar surgical plans were developed based on MR versus CT in 43.5%-66.6% of dogs, depending on the observer. Intra-observer agreement in location, side, and size of the planned hemilaminectomy based on CT versus MR ranged between 48.7%-66.6%, 87%-92%, and 51.2%-71.7% of dogs, respectively. Observers tended to plan larger laminectomy defects based on MR versus CT of the same dog. Findings from this study indicated considerable differences in hemilaminectomies planned on preoperative MR versus CT imaging. Surgical location and size varied the most; the side of planned hemilaminectomies was most consistent between imaging modalities. © 2017 The American College of Veterinary Surgeons.

  20. 32 CFR 37.575 - What are my responsibilities for determining milestone payment amounts?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... SECRETARY OF DEFENSE DoD GRANT AND AGREEMENT REGULATIONS TECHNOLOGY INVESTMENT AGREEMENTS Pre-Award Business... agreement or in separate instructions to the post-award administrative agreements officer. That will help..., observable and verifiable technical outcomes (e.g., demonstrations, tests, or data analysis) that you...

  1. 32 CFR 37.575 - What are my responsibilities for determining milestone payment amounts?

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... SECRETARY OF DEFENSE DoD GRANT AND AGREEMENT REGULATIONS TECHNOLOGY INVESTMENT AGREEMENTS Pre-Award Business... agreement or in separate instructions to the post-award administrative agreements officer. That will help..., observable and verifiable technical outcomes (e.g., demonstrations, tests, or data analysis) that you...

  2. 32 CFR 37.575 - What are my responsibilities for determining milestone payment amounts?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... SECRETARY OF DEFENSE DoD GRANT AND AGREEMENT REGULATIONS TECHNOLOGY INVESTMENT AGREEMENTS Pre-Award Business... agreement or in separate instructions to the post-award administrative agreements officer. That will help..., observable and verifiable technical outcomes (e.g., demonstrations, tests, or data analysis) that you...

  3. 32 CFR 37.575 - What are my responsibilities for determining milestone payment amounts?

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... SECRETARY OF DEFENSE DoD GRANT AND AGREEMENT REGULATIONS TECHNOLOGY INVESTMENT AGREEMENTS Pre-Award Business... agreement or in separate instructions to the post-award administrative agreements officer. That will help..., observable and verifiable technical outcomes (e.g., demonstrations, tests, or data analysis) that you...

  4. 32 CFR 37.575 - What are my responsibilities for determining milestone payment amounts?

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... SECRETARY OF DEFENSE DoD GRANT AND AGREEMENT REGULATIONS TECHNOLOGY INVESTMENT AGREEMENTS Pre-Award Business... agreement or in separate instructions to the post-award administrative agreements officer. That will help..., observable and verifiable technical outcomes (e.g., demonstrations, tests, or data analysis) that you...

  5. Enhanced Two-Factor Authentication and Key Agreement Using Dynamic Identities in Wireless Sensor Networks.

    PubMed

    Chang, I-Pin; Lee, Tian-Fu; Lin, Tsung-Hung; Liu, Chuan-Ming

    2015-11-30

    Key agreements that use only password authentication are convenient in communication networks, but these key agreement schemes often fail to resist possible attacks, and therefore provide poor security compared with some other authentication schemes. To increase security, many authentication and key agreement schemes use smartcard authentication in addition to passwords. Thus, two-factor authentication and key agreement schemes using smartcards and passwords are widely adopted in many applications. Vaidya et al. recently presented a two-factor authentication and key agreement scheme for wireless sensor networks (WSNs). Kim et al. observed that the Vaidya et al. scheme fails to resist gateway node bypassing and user impersonation attacks, and then proposed an improved scheme for WSNs. This study analyzes the weaknesses of the two-factor authentication and key agreement scheme of Kim et al., which include vulnerability to impersonation attacks, lost smartcard attacks and man-in-the-middle attacks, violation of session key security, and failure to protect user privacy. An efficient and secure authentication and key agreement scheme for WSNs based on the scheme of Kim et al. is then proposed. The proposed scheme not only solves the weaknesses of previous approaches, but also increases security requirements while maintaining low computational cost.

  6. Validation and Continued Development of Methods for Spheromak Simulation

    NASA Astrophysics Data System (ADS)

    Benedett, Thomas

    2016-10-01

    The HIT-SI experiment has demonstrated stable sustainment of spheromaks. Determining how the underlying physics extrapolate to larger, higher-temperature regimes is of prime importance in determining the viability of the inductively-driven spheromak. It is thus prudent to develop and validate a computational model that can be used to study current results and study the effect of possible design choices on plasma behavior. A zero-beta Hall-MHD model has shown good agreement with experimental data at 14.5 kHz injector operation. Experimental observations at higher frequency, where the best performance is achieved, indicate pressure effects are important and likely required to attain quantitative agreement with simulations. Efforts to extend the existing validation to high frequency (36-68 kHz) using an extended MHD model implemented in the PSI-TET arbitrary-geometry 3D MHD code will be presented. An implementation of anisotropic viscosity, a feature observed to improve agreement between NIMROD simulations and experiment, will also be presented, along with investigations of flux conserver features and their impact on density control for future SIHI experiments. Work supported by DoE.

  7. Behavior States Are Real and Observable.

    ERIC Educational Resources Information Center

    Guess, Doug; Roberts, Sally; Rues, Jane

    2000-01-01

    This article critiques the research methodology used by Mudford, Hogg, and Roberts (1999) that resulted in a failure to achieve inter-observer agreement on adults with mental retardation when using an experimental, 13-category behavior state code. Arguments are provided on why their videotape study does not meet requirements of acceptable…

  8. Stylus/tablet user input device for MRI heart wall segmentation: efficiency and ease of use.

    PubMed

    Taslakian, Bedros; Pires, Antonio; Halpern, Dan; Babb, James S; Axel, Leon

    2018-05-02

    To determine whether use of a stylus user input device (UID) would be superior to a mouse for CMR segmentation. Twenty-five consecutive clinical cardiac magnetic resonance (CMR) examinations were selected. Image analysis was independently performed by four observers. Manual tracing of left (LV) and right (RV) ventricular endocardial contours was performed twice in 10 randomly assigned sessions, each session using only one UID. Segmentation time and the ventricular function variables were recorded. The mean segmentation time and time reduction were calculated for each method. Intraclass correlation coefficients (ICC) and Bland-Altman plots of function variables were used to assess intra- and interobserver variability and agreement between methods. Observers completed a Likert-type questionnaire. The mean segmentation time (in seconds) was significantly less with the stylus compared to the mouse, averaging 206±108 versus 308±125 (p<0.001) and 225±140 versus 353±162 (p<0.001) for LV and RV segmentation, respectively. The intra- and interobserver agreement rates were excellent (ICC≥0.75) regardless of the UID. There was an excellent agreement between measurements derived from manual segmentation using different UIDs (ICC≥0.75), with few exceptions. Observers preferred the stylus. The study shows a significant reduction in segmentation time using the stylus, a subjective preference, and excellent agreement between the methods. • Using a stylus for MRI ventricular segmentation is faster compared to mouse • A stylus is easier to use and results in less fatigue • There is excellent agreement between stylus and mouse UIDs.

  9. Effect of Picture Archiving and Communication System Image Manipulation on the Agreement of Chest Radiograph Interpretation in the Neonatal Intensive Care Unit.

    PubMed

    Castro, Denise A; Naqvi, Asad Ahmed; Vandenkerkhof, Elizabeth; Flavin, Michael P; Manson, David; Soboleski, Donald

    2016-01-01

    Variability in image interpretation has been attributed to differences in the interpreters' knowledge base, experience level, and access to the clinical scenario. Picture archiving and communication system (PACS) has allowed the user to manipulate the images while developing their impression of the radiograph. The aim of this study was to determine the agreement of chest radiograph (CXR) impressions among radiologists and neonatologists and help determine the effect of image manipulation with PACS on report impression. Prospective cohort study included 60 patients from the Neonatal Intensive Care Unit undergoing CXRs. Three radiologists and three neonatologists reviewed two consecutive frontal CXRs of each patient. Each physician was allowed manipulation of images as needed to provide a decision of "improved," "unchanged," or "disease progression" lung disease for each patient. Each physician repeated the process once more; this time, they were not allowed to individually manipulate the images, but an independent radiologist preset the image brightness and contrast to best optimize the CXR appearance. Percent agreement and opposing reporting views were calculated between all six physicians for each of the two methods (allowing and not allowing image manipulation). One hundred percent agreement in image impression between all six observers was only seen in 5% of cases when allowing image manipulation; 100% agreement was seen in 13% of the cases when there was no manipulation of the images. Agreement in CXR interpretation is poor; the ability to manipulate the images on PACS results in a decrease in agreement in the interpretation of these studies. New methods to standardize image appearance and allow improved comparison with previous studies should be sought to improve clinician agreement in interpretation consistency and advance patient care.

  10. [Evaluation of echocardiographic left ventricular wall motion analysis supported by internet picture viewing system].

    PubMed

    Hirano, Yutaka; Ikuta, Shin-Ichiro; Nakano, Manabu; Akiyama, Seita; Nakamura, Hajime; Nasu, Masataka; Saito, Futoshi; Nakagawa, Junichi; Matsuzaki, Masashi; Miyazaki, Shunichi

    2007-02-01

    Assessment of deterioration of regional wall motion by echocardiography is not only subjective but also features difficulties with interobserver agreement. Progress in digital communication technology has made it possible to send video images from a distant location via the Internet. The possibility of evaluating left ventricular wall motion using video images sent via the Internet to distant institutions was evaluated. Twenty-two subjects were randomly selected. Four sets of video images (parasternal long-axis view, parasternal short-axis view, apical four-chamber view, and apical two-chamber view) were taken for one cardiac cycle. The images were sent via the Internet to two institutions (observer C in facility A and observers D and E in facility B) for evaluation. Great care was taken to prevent disclosure of patient information to these observers. Parasternal long-axis images were divided into four segments, and the parasternal short-axis view, apical four-chamber view, and apical two-chamber view were divided into six segments. One of the following assessments, normokinesis, hypokinesis, akinesis, or dyskinesis, was assigned to each segment. The interobserver rates of agreement in judgments between observers C and D, observers D and E, and observers C and E, and the intraobserver agreement rate (for observer D) were calculated. The rate of interobserver agreement was 85.7% (394/460 segments; Kappa = 0.65) between observers C and D, 76.7% (353/460 segments; Kappa = 0.39) between observers D and E, and 76.3% (351/460 segments; Kappa = 0.36) between observers C and E, and intraobserver agreement was 94.3% (434/460; Kappa = 0.86). Discordant judgments between observers C and D were normokinesis-hypokinesis (62.1%), hypokinesis-akinesis (33.3%), akinesis-dyskinesis (3.0%), and normokinesis-akinesis (1.5%). Wall motion can be evaluated at remote institutions via the Internet.

  11. Comparison of 3D computer-aided with manual cerebral aneurysm measurements in different imaging modalities.

    PubMed

    Groth, M; Forkert, N D; Buhk, J H; Schoenfeld, M; Goebell, E; Fiehler, J

    2013-02-01

    To compare intra- and inter-observer reliability of aneurysm measurements obtained by a 3D computer-aided technique with standard manual aneurysm measurements in different imaging modalities. A total of 21 patients with 29 cerebral aneurysms were studied. All patients underwent digital subtraction angiography (DSA), contrast-enhanced (CE-MRA) and time-of-flight magnetic resonance angiography (TOF-MRA). Aneurysm neck and depth diameters were manually measured by two observers in each modality. Additionally, semi-automatic computer-aided diameter measurements were performed using 3D vessel surface models derived from CE- (CE-com) and TOF-MRA (TOF-com) datasets. Bland-Altman analysis (BA) and intra-class correlation coefficient (ICC) were used to evaluate intra- and inter-observer agreement. BA revealed the narrowest relative limits of intra- and inter-observer agreement for aneurysm neck and depth diameters obtained by TOF-com (ranging between ±5.3 % and ±28.3 %) and CE-com (ranging between ±23.3 % and ±38.1 %). Direct measurements in DSA, TOF-MRA and CE-MRA showed considerably wider limits of agreement. The highest ICCs were observed for TOF-com and CE-com (ICC values, 0.92 or higher for intra- as well as inter-observer reliability). Computer-aided aneurysm measurement in 3D offers improved intra- and inter-observer reliability and a reproducible parameter extraction, which may be used in clinical routine and as objective surrogate end-points in clinical trials.
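
    The Bland-Altman limits of agreement used in this abstract can be sketched as the mean difference between paired measurements plus or minus a multiple of the standard deviation of the differences. The example below is a minimal illustration under stated assumptions: the 1.96 multiplier is the conventional choice and is not given in the abstract, the limits are absolute rather than the relative (percentage) limits the authors report, and the function name `bland_altman_limits` and the diameters are hypothetical.

```python
import statistics


def bland_altman_limits(first, second, multiplier=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(differences)       # mean difference between raters
    spread = statistics.stdev(differences)    # sample standard deviation
    return bias, bias - multiplier * spread, bias + multiplier * spread


# Hypothetical neck diameters (mm) measured by two observers.
observer_1 = [5.0, 4.0, 7.0, 6.0]
observer_2 = [4.0, 5.0, 6.0, 7.0]
bias, lower, upper = bland_altman_limits(observer_1, observer_2)
```

    Narrower limits indicate that two observers, or two methods, can be expected to differ by less on a new case, which is the sense in which the abstract describes the computer-aided measurements as having the narrowest limits.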

  12. Inconsistent identification of pit bull-type dogs by shelter staff.

    PubMed

    Olson, K R; Levy, J K; Norby, B; Crandall, M M; Broadhurst, J E; Jacks, S; Barton, R C; Zimmerman, M S

    2015-11-01

    Shelter staff and veterinarians routinely make subjective dog breed identification based on appearance, but their accuracy regarding pit bull-type breeds is unknown. The purpose of this study was to measure agreement among shelter staff in assigning pit bull-type breed designations to shelter dogs and to compare breed assignments with DNA breed signatures. In this prospective cross-sectional study, four staff members at each of four different shelters recorded their suspected breed(s) for 30 dogs; there was a total of 16 breed assessors and 120 dogs. The terms American pit bull terrier, American Staffordshire terrier, Staffordshire bull terrier, pit bull, and their mixes were included in the study definition of 'pit bull-type breeds.' Using visual identification only, the median inter-observer agreements and kappa values in pair-wise comparisons of each of the staff breed assignments for pit bull-type breed vs. not pit bull-type breed ranged from 76% to 83% and from 0.44 to 0.52 (moderate agreement), respectively. Whole blood was submitted to a commercial DNA testing laboratory for breed identification. Whereas DNA breed signatures identified only 25 dogs (21%) as pit bull-type, shelter staff collectively identified 62 (52%) dogs as pit bull-type. Agreement between visual and DNA-based breed assignments varied among individuals, with sensitivity for pit bull-type identification ranging from 33% to 75% and specificity ranging from 52% to 100%. The median kappa value for inter-observer agreement with DNA results at each shelter ranged from 0.1 to 0.48 (poor to moderate). Lack of consistency among shelter staff indicated that visual identification of pit bull-type dogs was unreliable. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  13. Indicators for evaluating European population health: a Delphi selection process.

    PubMed

    Freitas, Ângela; Santana, Paula; Oliveira, Mónica D; Almendra, Ricardo; Bana E Costa, João C; Bana E Costa, Carlos A

    2018-04-27

    Indicators are essential instruments for monitoring and evaluating population health. The selection of a multidimensional set of indicators should not only reflect the scientific evidence on health outcomes and health determinants, but also the views of health experts and stakeholders. The aim of this study is to describe the Delphi selection process designed to promote agreement on indicators considered relevant to evaluate population health at the European regional level. Indicators were selected in a Delphi survey conducted using a web-platform designed to implement and monitor participatory processes. It involved a panel of 51 experts and 30 stakeholders from different areas of knowledge and geographies. In three consecutive rounds the panel indicated their level of agreement or disagreement with the indicators' relevance for evaluating population health in Europe. Inferential statistics were applied to draw conclusions on observed level of agreement (Scott's Pi interrater reliability coefficient) and opinion change (McNemar Chi-square test). Multivariate analysis of variance was conducted to check if the field of expertise influenced the panellist responses (Wilks' Lambda test). The panel participated extensively in the study (overall response rate: 80%). Eighty indicators reached group agreement for selection in the areas of: economic and social environment (12); demographic change (5); lifestyle and health behaviours (8); physical environment (6); built environment (12); healthcare services (11) and health outcomes (26). Higher convergence of group opinion towards agreement on the relevance of indicators was seen for lifestyle and health behaviours, healthcare services, and health outcomes. The panellists' field of expertise influenced responses: statistically significant differences were found for economic and social environment (p < 0.05 in rounds 1 and 2), physical environment (p < 0.01 in round 1) and health outcomes (p < 0.01 in round 3). The high levels of participation observed in this study, by involving experts and stakeholders and ascertaining their views, underpinned the added value of using a transparent Web-Delphi process to promote agreement on what indicators are relevant to appraise population health.

  14. Concordance between VDU-users' ratings of comfort and perceived exertion with experts' observations of workplace layout and working postures.

    PubMed

    Lindegård, A; Karlberg, C; Wigaeus Tornqvist, E; Toomingas, A; Hagberg, M

    2005-05-01

    The aim of the present study was to evaluate the concordance (agreement) between VDU-users' ratings of comfort and ergonomists' observations of workplace layout, and the concordance between VDU-users' ratings of perceived exertion and ergonomists' observations of working postures during VDU-work. The study population consisted of 853 symptom free subjects. Data on perceived comfort in different dimensions and data regarding perceived exertion in different body locations were collected by means of a questionnaire. Data concerning workplace layout and working postures were collected with an observation protocol, by an ergonomist. Concordance between ratings of comfort and observations of workplace layout was reasonably good for the chair and the keyboard (0.60, 0.58) and good regarding the screen and the input device (0.72, 0.61). Concordance between ratings of perceived exertion and observations of working postures indicated good agreement (0.63-0.77) for all measured body locations (neck, shoulder, wrist and trunk). In conclusion ratings of comfort and perceived exertion could be used as cost-efficient and user-friendly methods for practitioners to identify high exposure to poor workplace layout and poor working postures.

  15. Testing Models for Perceptual Discrimination Using Repeatable Noise

    NASA Technical Reports Server (NTRS)

    Ahumada, Albert J., Jr.; Null, Cynthia H. (Technical Monitor)

    1998-01-01

    Adding noise to stimuli to be discriminated allows estimation of observer classification functions based on the correlation between observer responses and relevant features of the noisy stimuli. Examples will be presented of stimulus features that are found in auditory tone detection and visual Vernier acuity. Using the standard signal detection model (Thurstone scaling), we derive formulas to estimate the proportion of the observer's decision variable variance that is controlled by the added noise. One is based on the probability of agreement of the observer with him/herself on trials with the same noise sample. Another is based on the relative performance of the observer and the model. When these do not agree, the model can be rejected. A second derivation gives the probability of agreement of observer and model when the observer follows the model except for internal noise. Agreement significantly less than this amount allows rejection of the model.

  16. Endodontic radiography: who is reading the digital radiograph?

    PubMed

    Tewary, Shalini; Luzzo, Joseph; Hartwell, Gary

    2011-07-01

    Digital radiographic imaging systems have undergone tremendous improvements since their introduction. Advantages of digital radiographs over conventional films include lower radiation doses compared with conventional films, instantaneous images, archiving and sharing images easily, and manipulation of several radiographic properties that might help in diagnosis. A total of 6 observers including 2 endodontic residents, 3 endodontists, and 1 oral radiologist evaluated 150 molar digital periapical radiographs to determine which of the following conditions existed: normal periapical tissue, widened periodontal ligament, or presence of periapical radiolucency. The evaluators had full control over the radiograph's parameters of the Planmeca Dimaxis software program. All images were viewed on the same computer monitor with ideal vie-wing conditions. The same 6 observers evaluated the same 150 digital images 3 months later. The data were analyzed to determine how well the evaluators agreed with each other (interobserver agreement) for 2 rounds of observations and with themselves (intraobserver agreement). Fleiss kappa statistical analysis was used to measure the level of agreement among multiple raters. The overall Fleiss kappa value for interobserver agreement for the first round of interpretation was 0.34 (P < .001). The overall Fleiss kappa value for interobserver agreement for the second round of interpretation was 0.35 (P < .001). This resulted in fair (0.2-0.4) agreement among the 6 raters at both observation periods. A weighted kappa analysis was used to determine intraobserver agreement, which showed on average a moderate agreement. The results indicate that the interpretation of a dental radiograph is subjective, irrespective of whether conventional or digital radiographs are used. The factors that appeared to have the most impact were the years of experience of the examiner and familiarity of the operator with a given digital system. 
Copyright © 2011 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
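
    The Fleiss kappa named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `fleiss_kappa` and the example counts are assumptions made here, and the data are hypothetical rather than taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j (same number of raters per subject)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Share of all assignments that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: fraction of rater pairs that chose the same category.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_subj) / n_subjects   # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical counts (NOT study data): 4 radiographs, 6 raters, 3 categories
# (normal, widened periodontal ligament, periapical radiolucency).
example = [
    [6, 0, 0],
    [3, 3, 0],
    [1, 2, 3],
    [0, 1, 5],
]
print(round(fleiss_kappa(example), 3))
```

    Each row must sum to the number of raters; the statistic is undefined when every rating falls into a single category, because chance agreement then equals 1.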

  17. Localizing Value of Pain Distribution Patterns in Cervical Spondylosis

    PubMed Central

    Montriwiwatnchai, Peerapong; Siwanuwatn, Rungsak; Khaoroptham, Surachai

    2015-01-01

    Study Design Prospective observational study. Purpose To investigate the value of pain distribution in localizing appropriate surgical levels in patients with cervical spondylosis. Overview of Literature Previous studies have investigated the correlation of pain drawings with various features in degenerative spine diseases, including surgical outcome, magnetic resonance imaging findings, discographic study, and psychogenic issues. However, there is no previous study on the value of pain drawings in identifying symptomatic levels for surgery in cervical spondylosis. Methods The study collected data from patients with cervical spondylosis who underwent surgical treatment between August 2009 and July 2012. Pain diagrams drawn separately by each patient and physician were collected. Pain distribution patterns among various levels of surgery were analyzed by the chi-square test. Agreement between different pairs of data, including pain diagrams drawn by each patient and physician, intra-examiner agreement on interpretation of pain diagrams, inter-examiner agreement on interpretation of pain diagrams, and interpretation of pain diagrams by examiners versus actual surgery, was analyzed by Kappa statistics. Results The study group consisted of 19 men and 28 women with an average age of 55.2 years. Average duration of symptoms was 16.8 months. There was no difference in the pain distribution pattern at any level of surgery. The agreement between pain diagrams drawn by each patient and physician was moderate. Intra-examiner agreement was moderate. Agreement between examiners, and between examiners' interpretations and the actual surgery, was slight. Conclusions Pain distribution pattern by itself has limited value in identifying surgical levels in patients with cervical spondylosis. PMID:25901232

  18. Do Orthopaedic Oncologists Agree on the Diagnosis and Treatment of Cartilage Tumors of the Appendicular Skeleton?

    PubMed

    Zamora, Tomas; Urrutia, Julio; Schweitzer, Daniel; Amenabar, Pedro Pablo; Botello, Eduardo

    2017-09-01

    Distinguishing a benign enchondroma from a low-grade chondrosarcoma is a common diagnostic challenge for orthopaedic oncologists. Low interrater agreement has been observed for the diagnosis of cartilaginous neoplasms among radiologists and pathologists, but, to our knowledge, no study has evaluated inter- and intraobserver agreement among orthopaedic oncologists grading these lesions using initial clinical and imaging information. Determining such agreement is important since it reflects the certainty in the diagnosis by orthopaedic oncologists. Agreement also is important as it will guide future treatment and prognosis, considering that there is no gold standard for diagnosis of these lesions. (1) To determine inter- and intraobserver agreement among a multinational panel of expert orthopaedic oncologists in diagnosing cartilaginous neoplasms based on their assessment of clinical symptoms and imaging at diagnosis. (2) To describe the most important clinical and imaging features that experts use during the initial diagnostic process. (3) To determine interobserver agreement for proposed initial treatment strategies for cartilaginous neoplasms by this panel of evaluators. Thirty-nine patients with intramedullary cartilaginous neoplasms of the appendicular skeleton of various histopathologic grades were selected and classified as having benign, low-grade malignant, or intermediate- or high-grade malignant neoplasms by 10 experienced orthopaedic oncologists based on clinical and imaging information. Additionally, they chose the three most important clinical or imaging features for the diagnosis of these neoplasms, and they proposed a treatment strategy for each patient. The Kappa coefficient (κ) was used to determine inter- and intraobserver agreement. Inter- and intraobserver agreements were only fair to good, κ = 0.44 (95% CI, 0.41-0.48) and κ = 0.62 (95% CI, 0.52-0.72), respectively.
The three factors most frequently identified as helpful in making the diagnosis by our panel were cortical involvement in 65% of evaluations (253/390), neoplasm size in 51% (198/390), and pain in 50% (194/390). The interobserver agreement for the proposed initial treatment strategy after diagnosis was poor (κ = 0.21; 95% CI, 0.18-0.24). This study showed barely fair interobserver and fair to good intraobserver agreement for grading of intramedullary cartilaginous neoplasms by orthopaedic oncologists using initial clinical and imaging findings. These results reflect the insufficient guidance interpreting clinical and imaging features, and the limitations of the systems we use today when making these diagnoses. In the same way, they generate concern for the implications that this may have on different treatment strategies and the future prognosis of our patients. Future studies should build on these observations and focus on clarifying our criteria of diagnosis so that treatment recommendations are standardized regardless of the treating institution or oncologist. Level III, diagnostic study.

  19. Repeatability of Diagnostic Features and Scoring Systems for Hepatocellular Carcinoma by Using MR Imaging

    PubMed Central

    Khalatbari, Shokoufeh; Liu, Peter S. C.; Maturen, Katherine E.; Kaza, Ravi K.; Wasnik, Ashish P.; Al-Hawary, Mahmoud M.; Glazer, Daniel I.; Stein, Erica B.; Patel, Jeet; Somashekar, Deepak K.; Viglianti, Benjamin L.; Hussain, Hero K.

    2014-01-01

    Purpose To determine for expert and novice radiologists repeatability of major diagnostic features and scoring systems (ie, Liver Imaging Reporting and Data System [LI-RADS], Organ Procurement and Transplantation Network [OPTN], and American Association for the Study of Liver Diseases [AASLD]) for hepatocellular carcinoma (HCC) by using magnetic resonance (MR) imaging. Materials and Methods Institutional review board approval was obtained and patient consent was waived for this HIPAA-compliant, retrospective study. The LI-RADS discussed in this article refers to version 2013.1. Ten blinded readers reviewed 100 liver MR imaging studies that demonstrated observations preliminarily assigned LI-RADS scores of LR1–LR5. Diameter and major HCC features (arterial hyperenhancement, washout appearance, pseudocapsule) were recorded for each observation. LI-RADS, OPTN, and AASLD scores were assigned. Interreader agreement was assessed by using intraclass correlation coefficients and κ statistics. Scoring rates were compared by using McNemar test. Results Overall interreader agreement was substantial for arterial hyperenhancement (0.67 [95% confidence interval {CI}: 0.65, 0.69]), moderate for washout appearance (0.48 [95% CI: 0.46, 0.50]), moderate for pseudocapsule (0.52 [95% CI: 0.50, 0.54]), fair for LI-RADS (0.35 [95% CI: 0.34, 0.37]), fair for AASLD (0.39 [95% CI: 0.37, 0.42]), and moderate for OPTN (0.53 [95% CI: 0.51, 0.56]). Agreement for measured diameter was almost perfect (range, 0.95–0.97). There was substantial agreement for most scores consistent with HCC. Experts agreed significantly more than did novices and were significantly more likely than were novices to assign a diagnosis of HCC (P < .001). Conclusion Two of three major features for HCC (washout appearance and pseudocapsule) have only moderate interreader agreement. Experts and novices who assigned scores consistent with HCC had substantial but not perfect agreement.
Expert agreement is substantial for OPTN, but moderate for LI-RADS and AASLD. Novices were less consistent and less likely to diagnose HCC than were experts. © RSNA, 2014 Online supplemental material is available for this article. PMID:24555636

  20. The Surgical Safety Checklist and Teamwork Coaching Tools: a study of inter-rater reliability.

    PubMed

    Huang, Lyen C; Conley, Dante; Lipsitz, Stu; Wright, Christopher C; Diller, Thomas W; Edmondson, Lizabeth; Berry, William R; Singer, Sara J

    2014-08-01

    To assess the inter-rater reliability (IRR) of two novel observation tools for measuring surgical safety checklist performance and teamwork. Surgical safety checklists can promote adherence to standards of care and improve teamwork in the operating room. Their use has been associated with reductions in mortality and other postoperative complications. However, checklist effectiveness depends on how well they are performed. Authors from the Safe Surgery 2015 initiative developed a pair of novel observation tools through literature review, expert consultation and end-user testing. In one South Carolina hospital participating in the initiative, two observers jointly attended 50 surgical cases and independently rated surgical teams using both tools. We used descriptive statistics to measure checklist performance and teamwork at the hospital. We assessed IRR by measuring percent agreement, Cohen's κ, and weighted κ scores. The overall percent agreement and κ between the two observers was 93% and 0.74 (95% CI 0.66 to 0.79), respectively, for the Checklist Coaching Tool and 86% and 0.84 (95% CI 0.77 to 0.90) for the Surgical Teamwork Tool. Percent agreement for individual sections of both tools was 79% or higher. Additionally, κ scores for six of eight sections on the Checklist Coaching Tool and for two of five domains on the Surgical Teamwork Tool achieved the desired 0.7 threshold. However, teamwork scores were high and variation was limited. There were no significant changes in the percent agreement or κ scores between the first 10 and last 10 cases observed. Both tools demonstrated substantial IRR and required limited training to use. These instruments may be used to observe checklist performance and teamwork in the operating room. However, further refinement and calibration of observer expectations, particularly in rating teamwork, could improve the utility of the tools. Published by the BMJ Publishing Group Limited.

  1. Interobserver reliability of computed tomographic contouring of canine tonsils in radiation therapy treatment planning.

    PubMed

    Murakami, Keiko; Rancilio, Nicholas J; Plantenga, Jeannie Poulson; Moore, George E; Heng, Hock Gan; Lim, Chee Kin

    2018-05-01

    In radiation therapy (RT) treatment planning for canine head and neck cancer, the tonsils may be included as part of the treated volume. Delineation of tonsils on computed tomography (CT) scans is difficult. Error or uncertainty in the volume and location of contoured structures may result in treatment failure. The purpose of this prospective, observer agreement study was to assess the interobserver agreement of tonsillar contouring by two groups of trained observers. Thirty dogs undergoing pre- and postcontrast CT studies of the head were included. After the pre- and postcontrast CT scans, the tonsils were identified via direct visualization, barium paste was applied bilaterally to the visible tonsils, and a third CT scan was acquired. Data from each of the three CT scans were registered in an RT treatment planning system. Two groups of observers (one veterinary radiologist and one veterinary radiation oncologist in each group) contoured bilateral tonsils by consensus, obtaining three sets of contours. Tonsil volume and location data were obtained from both groups. The contour volumes and locations were compared between groups using mixed (fixed and random effect) linear models. There was no significant difference between the two groups' contours in terms of three-dimensional coordinates. However, there was a significant difference between the two groups' contours in terms of tonsillar volume (P < 0.0001). Pre- and postcontrast CT can be used to identify the location of canine tonsils with reasonable agreement between trained observers. Discrepancy in tonsillar volume between groups of trained observers may affect RT treatment outcome. © 2017 American College of Veterinary Radiology.

  2. Multisite Assessment of Aging-Related Tau Astrogliopathy (ARTAG).

    PubMed

    Kovacs, Gabor G; Xie, Sharon X; Lee, Edward B; Robinson, John L; Caswell, Carrie; Irwin, David J; Toledo, Jon B; Johnson, Victoria E; Smith, Douglas H; Alafuzoff, Irina; Attems, Johannes; Bencze, Janos; Bieniek, Kevin F; Bigio, Eileen H; Bodi, Istvan; Budka, Herbert; Dickson, Dennis W; Dugger, Brittany N; Duyckaerts, Charles; Ferrer, Isidro; Forrest, Shelley L; Gelpi, Ellen; Gentleman, Stephen M; Giaccone, Giorgio; Grinberg, Lea T; Halliday, Glenda M; Hatanpaa, Kimmo J; Hof, Patrick R; Hofer, Monika; Hortobágyi, Tibor; Ironside, James W; King, Andrew; Kofler, Julia; Kövari, Enikö; Kril, Jillian J; Love, Seth; Mackenzie, Ian R; Mao, Qinwen; Matej, Radoslav; McLean, Catriona; Munoz, David G; Murray, Melissa E; Neltner, Janna; Nelson, Peter T; Ritchie, Diane; Rodriguez, Roberta D; Rohan, Zdenek; Rozemuller, Annemieke; Sakai, Kenji; Schultz, Christian; Seilhean, Danielle; Smith, Vanessa; Tacik, Pawel; Takahashi, Hitoshi; Takao, Masaki; Rudolf Thal, Dietmar; Weis, Serge; Wharton, Stephen B; White, Charles L; Woulfe, John M; Yamada, Masahito; Trojanowski, John Q

    2017-07-01

    Aging-related tau astrogliopathy (ARTAG) is a recently introduced terminology. To facilitate the consistent identification of ARTAG and to distinguish it from astroglial tau pathologies observed in the primary frontotemporal lobar degeneration tauopathies we evaluated how consistently neuropathologists recognize (1) different astroglial tau immunoreactivities, including those of ARTAG and those associated with primary tauopathies (Study 1); (2) ARTAG types (Study 2A); and (3) ARTAG severity (Study 2B). Microphotographs and scanned sections immunostained for phosphorylated tau (AT8) were made available for download and preview. Percentage of agreement and kappa values with 95% confidence interval (CI) were calculated for each evaluation. The overall agreement for Study 1 was >60% with a kappa value of 0.55 (95% CI 0.433-0.645). Moderate agreement (>90%, kappa 0.48, 95% CI 0.457-0.900) was reached in Study 2A for the identification of ARTAG pathology for each ARTAG subtype (kappa 0.37-0.72), whereas fair agreement (kappa 0.40, 95% CI 0.341-0.445) was reached for the evaluation of ARTAG severity. The overall assessment of ARTAG showed moderate agreement (kappa 0.60, 95% CI 0.534-0.653) among raters. Our study supports the application of the current harmonized evaluation strategy for ARTAG with a slight modification of the evaluation of its severity. © 2017 American Association of Neuropathologists, Inc. All rights reserved.

  3. Reliability and Validity of Observational Risk Screening in Evaluating Dynamic Knee Valgus

    PubMed Central

    Ekegren, Christina L.; Miller, William C.; Celebrini, Richard G.; Eng, Janice J.; MacIntyre, Donna L.

    2012-01-01

    Study Design Nonexperimental methodological study. Objectives To determine the interrater and intrarater reliability and validity of using observational risk screening guidelines to evaluate dynamic knee valgus. Background A deficiency in the neuromuscular control of the hip has been identified as a key risk factor for non-contact anterior cruciate ligament (ACL) injury in post pubescent females. This deficiency can manifest itself as a valgus knee alignment during tasks involving hip and knee flexion. There are currently no scientifically tested methods to screen for dynamic knee valgus in the clinic or on the field. Methods Three physiotherapists used observational risk screening guidelines to rate 40 adolescent female soccer players according to their risk of ACL injury. The rating was based on the amount of dynamic knee valgus observed on a drop jump landing. Ratings were evaluated for intrarater and interrater agreement using kappa coefficients. Sensitivity and specificity of ratings were evaluated by comparing observational ratings with measurements obtained using 3-dimensional (3D) motion analysis. Results Kappa coefficients for intrarater and interrater agreement ranged from 0.75 to 0.85, indicating that ratings were reasonably consistent over time and between physiotherapists. Sensitivity values were inadequate, ranging from 67–87%. This indicated that raters failed to detect up to a third of “truly high risk” individuals. Specificity values ranged from 60–72% which was considered adequate for the purposes of the screen. Conclusion Observational risk screening is a practical and cost-effective method of screening for ACL injury risk. Rater agreement and specificity were acceptable for this method but sensitivity was not. To detect a greater proportion of individuals at risk of ACL injury, coaches and clinicians should ensure that they include additional tests for other high risk characteristics in their screening protocols. PMID:19721212
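
    The sensitivity and specificity figures reported above compare each observational rating against a reference measurement (3D motion analysis in the study). A minimal sketch of that calculation is given below; the function name and the example labels are assumptions made for illustration and are not study data.

```python
def sensitivity_specificity(observed, reference):
    """Return (sensitivity, specificity) for paired binary labels.

    observed  -- rater's calls (True = rated high risk)
    reference -- reference-standard labels (True = truly high risk)
    """
    pairs = list(zip(observed, reference))
    tp = sum(1 for o, r in pairs if o and r)          # high risk, detected
    fn = sum(1 for o, r in pairs if not o and r)      # high risk, missed
    tn = sum(1 for o, r in pairs if not o and not r)  # low risk, cleared
    fp = sum(1 for o, r in pairs if o and not r)      # low risk, flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical ratings for six players (NOT study data).
rater = [True, True, False, False, True, False]
motion_analysis = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(rater, motion_analysis)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    A sensitivity of 67% corresponds to the situation the abstract describes, in which a rater misses about a third of the individuals the reference method classifies as high risk.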

  4. Various methods for assessing static lower extremity alignment: implications for prospective risk-factor screenings.

    PubMed

    Nguyen, Anh-Dung; Boling, Michelle C; Slye, Carrie A; Hartley, Emily M; Parisi, Gina L

    2013-01-01

    Accurate, efficient, and reliable measurement methods are essential to prospectively identify risk factors for knee injuries in large cohorts. To determine tester reliability using digital photographs for the measurement of static lower extremity alignment (LEA) and whether values quantified with an electromagnetic motion-tracking system are in agreement with those quantified with clinical methods and digital photographs. Descriptive laboratory study. Laboratory. Thirty-three individuals participated and included 17 (10 women, 7 men; age = 21.7 ± 2.7 years, height = 163.4 ± 6.4 cm, mass = 59.7 ± 7.8 kg, body mass index = 23.7 ± 2.6 kg/m²) in study 1, in which we examined the reliability between clinical measures and digital photographs in 1 trained and 1 novice investigator, and 16 (11 women, 5 men; age = 22.3 ± 1.6 years, height = 170.3 ± 6.9 cm, mass = 72.9 ± 16.4 kg, body mass index = 25.2 ± 5.4 kg/m²) in study 2, in which we examined the agreement among clinical measures, digital photographs, and an electromagnetic tracking system. We evaluated measures of pelvic angle, quadriceps angle, tibiofemoral angle, genu recurvatum, femur length, and tibia length. Clinical measures were assessed using clinically accepted methods. Frontal- and sagittal-plane digital images were captured and imported into a computer software program. Anatomic landmarks were digitized using an electromagnetic tracking system to calculate static LEA. Intraclass correlation coefficients and standard errors of measurement were calculated to examine tester reliability. We calculated 95% limits of agreement and used Bland-Altman plots to examine agreement among clinical measures, digital photographs, and an electromagnetic tracking system.
Using digital photographs, fair to excellent intratester (intraclass correlation coefficient range = 0.70-0.99) and intertester (intraclass correlation coefficient range = 0.75-0.97) reliability were observed for static knee alignment and limb-length measures. An acceptable level of agreement was observed between clinical measures and digital pictures for limb-length measures. When comparing clinical measures and digital photographs with the electromagnetic tracking system, an acceptable level of agreement was observed in measures of static knee angles and limb-length measures. The use of digital photographs and an electromagnetic tracking system appears to be an efficient and reliable method to assess static knee alignment and limb-length measurements.
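
    The 95% limits of agreement mentioned above are conventionally computed as the mean difference between two methods plus or minus 1.96 standard deviations of the differences. The sketch below assumes that conventional Bland-Altman definition; the function name and the measurements are illustrative assumptions, not values from the study.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (lower, bias, upper) for paired measurements of the same subjects.

    bias is the mean difference (a - b); the limits are bias +/- 1.96 * SD of
    the differences, using the sample standard deviation.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias, bias + spread


# Hypothetical femur lengths in cm (NOT study data): clinical tape measure
# versus digital photograph for the same four participants.
clinical = [41.0, 43.5, 39.8, 45.2]
photo = [40.6, 43.9, 39.5, 44.8]
low, bias, high = limits_of_agreement(clinical, photo)
print(f"bias={bias:.2f} cm, limits=({low:.2f}, {high:.2f}) cm")
```

    Whether the resulting interval counts as "acceptable" is a clinical judgment about how large a disagreement matters, not something the calculation decides.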

  5. Evaluation of a numerical model's ability to predict bed load transport observed in braided river experiments

    NASA Astrophysics Data System (ADS)

    Javernick, Luke; Redolfi, Marco; Bertoldi, Walter

    2018-05-01

    New data collection techniques offer numerical modelers the ability to gather and utilize high quality data sets with high spatial and temporal resolution. Such data sets are currently needed for calibration, verification, and to fuel future model development, particularly morphological simulations. This study explores the use of high quality spatial and temporal data sets of observed bed load transport in braided river flume experiments to evaluate the ability of a two-dimensional model, Delft3D, to predict bed load transport. This study uses a fixed bed model configuration and examines the model's shear stress calculations, which are the foundation to predict the sediment fluxes necessary for morphological simulations. The evaluation is conducted for three flow rates, and model setup used highly accurate Structure-from-Motion (SfM) topography and discharge boundary conditions. The model was hydraulically calibrated using bed roughness, and performance was evaluated based on depth and inundation agreement. Model bed load performance was evaluated in terms of critical shear stress exceedance area compared to maps of observed bed mobility in a flume. Following the standard hydraulic calibration, bed load performance was tested for sensitivity to horizontal eddy viscosity parameterization and bed morphology updating. Simulations produced depth errors equal to the SfM inherent errors, inundation agreement of 77-85%, and critical shear stress exceedance in agreement with 49-68% of the observed active area. This study provides insight into the ability of physically based, two-dimensional simulations to accurately predict bed load as well as the effects of horizontal eddy viscosity and bed updating. Further, this study highlights how using high spatial and temporal data to capture the physical processes at work during flume experiments can help to improve morphological modeling.

  6. Depression recognition and capacity for self-report among ethnically diverse nursing homes residents: Evidence of disparities in screening.

    PubMed

    Chun, Audrey; Reinhardt, Joann P; Ramirez, Mildred; Ellis, Julie M; Silver, Stephanie; Burack, Orah; Eimicke, Joseph P; Cimarolli, Verena; Teresi, Jeanne A

    2017-12-01

    To examine agreement between Minimum Data Set clinician ratings and researcher assessments of depression among ethnically diverse nursing home residents using the 9-item Patient Health Questionnaire. Although depression is common among nursing home residents, its recognition remains a challenge. Observational baseline data from a longitudinal intervention study. Sample of 155 residents from 12 long-term care units in one US facility; 50 were interviewed in Spanish. Convergence between clinician and researcher ratings was examined for (i) self-report capacity, (ii) suicidal ideation, (iii) at least moderate depression, (iv) Patient Health Questionnaire severity scores. Experiences by clinical raters using the depression assessment were analysed. The intraclass correlation coefficient was used to examine concordance and Cohen's kappa to examine agreement between clinicians and researchers. Moderate agreement (κ = 0.52) was observed in determination of capacity and poor to fair agreement in reporting suicidal ideation (κ = 0.10-0.37) across time intervals. Poor agreement was observed in classification of at least moderate depression (κ = -0.02 to 0.24), lower than the maximum kappa obtainable (0.58-0.85). Eight assessors indicated problems assessing Spanish-speaking residents. Among Spanish speakers, researchers identified 16% with Patient Health Questionnaire scores of 10 or greater, and 14% with thoughts of self-harm whilst clinicians identified 6% and 0%, respectively. This study advances the field of depression recognition in long-term care by identification of possible challenges in assessing Spanish speakers. Use of the Patient Health Questionnaire requires further investigation, particularly among non-English speakers.
Depression screening for ethnically diverse nursing home residents is required, as underreporting of depression and suicidal ideation among Spanish speakers may result in lack of depression recognition and referral for evaluation and treatment. Training in depression recognition is imperative to improve the recognition, evaluation and treatment of depression in older people living in nursing homes. © 2017 John Wiley & Sons Ltd.
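
    The abstract above reports Cohen's kappa alongside the maximum kappa obtainable. The sketch below shows one common way to compute both from a cross-tabulation: the maximum is taken as the kappa that would result if the two raters agreed as often as their marginal totals allow. The function name and the table are assumptions for illustration and do not reproduce the study's data or software.

```python
def cohen_kappa_and_max(table):
    """Return (kappa, kappa_max) for a square cross-tabulation.

    table[i][j] = number of subjects placed in category i by rater A
    and in category j by rater B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    row_p = [sum(table[i]) / n for i in range(k)]                       # rater A marginals
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]  # rater B marginals

    p_obs = sum(table[i][i] for i in range(k)) / n              # observed agreement
    p_exp = sum(row_p[i] * col_p[i] for i in range(k))          # chance agreement
    p_max = sum(min(row_p[i], col_p[i]) for i in range(k))      # best agreement the marginals permit

    kappa = (p_obs - p_exp) / (1 - p_exp)
    kappa_max = (p_max - p_exp) / (1 - p_exp)
    return kappa, kappa_max


# Hypothetical 2x2 table (NOT study data): rows = clinician, columns = researcher,
# categories = (depressed, not depressed).
kappa, kappa_max = cohen_kappa_and_max([[20, 5], [10, 15]])
print(f"kappa={kappa:.2f}, maximum attainable={kappa_max:.2f}")
```

    Comparing kappa with its attainable maximum separates disagreement caused by differing marginal rates (one rater flags depression more often) from disagreement on individual residents.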

  7. Spatial correspondence of 4D CT ventilation and SPECT pulmonary perfusion defects in patients with malignant airway stenosis

    NASA Astrophysics Data System (ADS)

    Castillo, Richard; Castillo, Edward; McCurdy, Matthew; Gomez, Daniel R.; Block, Alec M.; Bergsma, Derek; Joy, Sarah; Guerrero, Thomas

    2012-04-01

    To determine the spatial overlap agreement between four-dimensional computed tomography (4D CT) ventilation and single photon emission computed tomography (SPECT) perfusion hypo-functioning pulmonary defect regions in a patient population with malignant airway stenosis. Treatment planning 4D CT images were obtained retrospectively for ten lung cancer patients with radiographically demonstrated airway obstruction due to gross tumor volume. Each patient also received a SPECT perfusion study within one week of the planning 4D CT, and prior to the initiation of treatment. Deformable image registration was used to map corresponding lung tissue elements between the extreme component phase images, from which quantitative three-dimensional (3D) images representing the local pulmonary specific ventilation were constructed. Semi-automated segmentation of the percentile perfusion distribution was performed to identify regional defects distal to the known obstructing lesion. Semi-automated segmentation was similarly performed by multiple observers to delineate corresponding defect regions depicted on 4D CT ventilation. Normalized Dice similarity coefficient (NDSC) indices were determined for each observer between SPECT perfusion and 4D CT ventilation defect regions to assess spatial overlap agreement. Tidal volumes determined from 4D CT ventilation were evaluated versus measurements obtained from lung parenchyma segmentation. Linear regression resulted in a linear fit with slope = 1.01 (R² = 0.99). Respective values for the average DSC, NDSC (1 mm), and NDSC (2 mm) for all cases and multiple observers were 0.78, 0.88 and 0.99, indicating that, on average, spatial overlap agreement between ventilation and perfusion defect regions was comparable to the threshold for agreement within 1-2 mm uncertainty. Corresponding coefficients of variation for all metrics were similarly in the range 0.10%-19%.
This study is the first to quantitatively assess 3D spatial overlap agreement between clinically acquired SPECT perfusion and specific ventilation from 4D CT. Results suggest high correlation between methods within the sub-population of lung cancer patients with malignant airway stenosis.
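
    The Dice similarity coefficient (DSC) reported above measures the overlap of two segmented regions as twice the shared volume divided by the sum of the two volumes. The sketch below computes the plain DSC on sets of voxel indices; it does not implement the distance-tolerant normalized variant (NDSC) used in the study, and the function name and voxel sets are assumptions for illustration.

```python
def dice_coefficient(region_a, region_b):
    """Dice similarity coefficient between two collections of voxel indices."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # two empty regions are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical defect regions as (x, y, z) voxel indices (NOT study data).
ventilation_defect = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
perfusion_defect = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
print(round(dice_coefficient(ventilation_defect, perfusion_defect), 3))
```

    A value of 1 means the regions coincide exactly and 0 means they share no voxels.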

  8. Many participants in inpatient rehabilitation can quantify their exercise dosage accurately: an observational study.

    PubMed

    Scrivener, Katharine; Sherrington, Catherine; Schurr, Karl; Treacy, Daniel

    2011-01-01

    Are inpatients undergoing rehabilitation who appear able to count exercises able to quantify accurately the amount of exercise they undertake? Observational study. Inpatients in an aged care rehabilitation unit and a neurological rehabilitation unit, who appeared able to count their exercises during a 1-2 min observation by their treating physiotherapist. Participants were observed for 30 min by an external observer while they exercised in the physiotherapy gymnasium. Both the participants and the observer counted exercise repetitions with a hand-held tally counter and the two tallies were compared. Of the 60 people admitted for aged care rehabilitation during the study period, 49 (82%) were judged by their treating therapist to be able to count their own exercise repetitions accurately. Of the 30 people admitted for neurological rehabilitation during the study period, 20 (67%) were judged by their treating therapist to be able to count their repetitions accurately. Of the 69 people judged to be accurate, 40 underwent observation while exercising. There was excellent agreement between these participants' counts of their exercise repetitions and the observers' counts, ICC (3,1) of 0.99 (95% CI 0.98 to 0.99). Eleven participants (28%) were in complete agreement with the observer. A further 19 participants (48%) varied from the observer by less than 10%. Therapists were able to identify a group of rehabilitation participants who were accurate in counting their exercise repetitions. Counting of exercise repetitions by therapist-selected patients is a valid means of quantifying exercise dosage during inpatient rehabilitation. Copyright © 2011 Australian Physiotherapy Association. Published by .. All rights reserved.
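
    The ICC (3,1) cited above is, in the Shrout and Fleiss notation, a two-way mixed-effects, single-measure, consistency coefficient. A minimal sketch of that formula from the two-way ANOVA mean squares follows; the function name and the tallies are assumptions for illustration, not the study's data or software.

```python
def icc_3_1(ratings):
    """ICC(3,1) for ratings[i][j] = score given to subject i by rater j."""
    n = len(ratings)     # subjects
    k = len(ratings[0])  # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


# Hypothetical repetition counts (NOT study data): columns are
# (participant's tally, observer's tally) for five participants.
tallies = [[120, 118], [85, 85], [200, 204], [64, 60], [150, 151]]
print(round(icc_3_1(tallies), 3))
```

    Because ICC(3,1) measures consistency, a participant who always counts a fixed number of repetitions more than the observer would still score highly; absolute agreement would need a different ICC form.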

  9. Agreement in cardiovascular risk rating based on anthropometric parameters

    PubMed Central

    Dantas, Endilly Maria da Silva; Pinto, Cristiane Jordânia; Freitas, Rodrigo Pegado de Abreu; de Medeiros, Anna Cecília Queiroz

    2015-01-01

    Objective To investigate the agreement in evaluation of risk of developing cardiovascular diseases based on anthropometric parameters in young adults. Methods The study included 406 students; weight, height, and waist and neck circumferences were measured, and the waist-to-height ratio and the conicity index were calculated. The kappa coefficient was used to assess agreement in risk classification for cardiovascular diseases. The positive and negative specific agreement values were calculated as well. The Pearson chi-square (χ2) test was used to assess associations between categorical variables (p<0.05). Results The majority of the parameters assessed (44%) showed slight (k=0.21 to 0.40) and/or poor agreement (k<0.20), with low values of negative specific agreement. The best agreement was observed between waist circumference and waist-to-height ratio both for the general population (k=0.88) and between sexes (k=0.93 to 0.86). There was a significant association (p<0.001) between the risk of cardiovascular diseases and females when using waist circumference and conicity index, and with males when using neck circumference. This resulted in a wide variation in the prevalence of cardiovascular disease risk (5.5%-36.5%), depending on the parameter and the sex that was assessed. Conclusion The results indicate variability in agreement in assessing risk for cardiovascular diseases, based on anthropometric parameters, and which also seems to be influenced by sex. Further studies in the Brazilian population are required to better understand this issue. PMID:26466060
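
    The positive and negative specific agreement values mentioned above are usually computed from a 2x2 cross-classification of two methods. The sketch below assumes the standard definitions (twice the concordant cell divided by twice the concordant cell plus both discordant cells); the function name and the counts are illustrative assumptions, not study data.

```python
def specific_agreement(both_pos, a_only, b_only, both_neg):
    """Return (positive_agreement, negative_agreement) for a 2x2 table.

    both_pos -- classified at risk by both parameters
    a_only   -- at risk by parameter A only
    b_only   -- at risk by parameter B only
    both_neg -- classified not at risk by both parameters
    """
    discordant = a_only + b_only
    positive = 2 * both_pos / (2 * both_pos + discordant)
    negative = 2 * both_neg / (2 * both_neg + discordant)
    return positive, negative


# Hypothetical counts (NOT study data): waist circumference (A) versus
# neck circumference (B) risk classification in 100 students.
pa, na = specific_agreement(both_pos=40, a_only=5, b_only=5, both_neg=50)
print(f"positive agreement={pa:.3f}, negative agreement={na:.3f}")
```

    Reporting the two values separately shows whether a pair of parameters agrees mainly on who is at risk or mainly on who is not, which a single kappa value can hide.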

  10. 68Ga-DOTATATE PET/CT interobserver agreement for neuroendocrine tumor assessments: results from a prospective study on 50 patients

    PubMed Central

    Fendler, Wolfgang Peter; Barrio, Martin; Spick, Claudio; Allen-Auerbach, Martin; Ambrosini, Valentina; Benz, Matthias; Bluemel, Christina; Grewal, Ravinder Kaur; Lapa, Constantin; Miederer, Matthias; Nicolas, Guillaume; Schuster, Tibor; Czernin, Johannes; Herrmann, Ken

    2016-01-01

    We evaluated the observer agreement for 68Ga-DOTATATE PET/CT study interpretations in patients with neuroendocrine tumors (NET). Methods 68Ga-DOTATATE PET/CT was performed in 50 patients with known or suspected NET of the small bowel (n = 19), pancreas (n = 14), lung (n = 4) or other location (n = 13). Images were reviewed by seven observers who used a standardized approach for image interpretation. Observers were classified as having low (<500 scans or <5 years experience with 68Ga-DOTATATE PET/CT; n = 4) or high level of experience (≥500 scans and ≥5 years experience with 68Ga-DOTATATE PET/CT; n = 3). Interpretation by the primary nuclear medicine physician un-blinded to all clinical and imaging data served as reference standard. Interobserver agreement was determined by Cohen's κ and intraclass correlation coefficient (ICC) with corresponding 95% confidence interval (CI). Results Interobserver agreement was substantial and the median number of false findings (FF) was low for the overall scan result; i.e. positive versus negative study (κ = 0.80, 95%CI 0.74–0.86; FF = 3), organ involvement (κ = 0.70, 95%CI 0.64–0.76; FF = 5), and lymph node involvement (κ = 0.71, 95%CI 0.65–0.78; FF = 6). The interobserver agreement was substantial to almost-perfect and the average absolute difference (Δ) to the reference reader was low for number of organ and lymph node metastases (ICC = 0.84, 95%CI 0.77–0.89, Δ = 0.45 and ICC = 0.77, 95%CI 0.69–0.84, Δ = 0.45), tumor SUVmax (ICC = 0.99, 95%CI 0.97–0.99; Δ = 0.44) and reference SUV (SUVmean spleen: ICC = 0.81, Δ = 1.10; SUVmax liver ICC = 0.79, Δ = 0.62). Interpretations of the appropriateness for peptide-receptor radionuclide therapy (PRRT) varied more significantly among observers (κ = 0.64, 95%CI 0.57–0.70) and a higher frequency of false positive recommendations for PRRT occurred in observers with low versus high levels of experience (range, 7–12 versus 4–8). 
Conclusion The interpretation of 68Ga-DOTATATE PET/CT for NET staging is consistent among readers with low and high levels of experience. However, image based recommendations for or against PRRT require experience and training. PMID:27539839

  11. Standardized Reporting of Prostate MRI: Comparison of the Prostate Imaging Reporting and Data System (PI-RADS) Version 1 and Version 2

    PubMed Central

    Tewes, Susanne; Mokov, Nikolaj; Hartung, Dagmar; Schick, Volker; Peters, Inga; Schedl, Peter; Pertschy, Stefanie; Wacker, Frank; Voshage, Götz; Hueper, Katja

    2016-01-01

    Introduction Objective of our study was to determine the agreement between version 1 (v1) and v2 of the Prostate Imaging Reporting and Data System (PI-RADS) for evaluation of multiparametric prostate MRI (mpMRI) and to compare their diagnostic accuracy, their inter-observer agreement and practicability. Material and Methods mpMRI including T2-weighted imaging, diffusion-weighted imaging (DWI) and dynamic contrast-enhanced imaging (DCE) of 54 consecutive patients, who subsequently underwent MRI-guided in-bore biopsy were re-analyzed according to PI-RADS v1 and v2 by two independent readers. Diagnostic accuracy for detection of prostate cancer (PCa) was assessed using ROC-curve analysis. Agreement between PI-RADS versions and observers was calculated and the time needed for scoring was determined. Results MRI-guided biopsy revealed PCa in 31 patients. Diagnostic accuracy for detection of PCa was equivalent with both PI-RADS versions for reader 1 with sensitivities and specificities of 84%/91% (AUC = 0.91 95%CI[0.8–1]) for PI-RADS v1 and 100%/74% (AUC = 0.92 95% CI[0.8–1]) for PI-RADS v2. Reader 2 achieved similar diagnostic accuracy with sensitivity and specificity of 74%/91% (AUC = 0.88 95%CI[0.8–1]) for PI-RADS v1 and 81%/91% (AUC = 0.91 95%CI[0.8–1]) for PI-RADS v2. Agreement between scores determined with different PI-RADS versions was good (reader 1: κ = 0.62, reader 2: κ = 0.64). Inter-observer agreement was moderate with PI-RADS v2 (κ = 0.56) and fair with v1 (κ = 0.39). The time required for building the PI-RADS score was significantly lower with PI-RADS v2 compared to v1 (24.7±2.3 s vs. 41.9±2.6 s, p<0.001). Conclusion Agreement between PI-RADS versions was high and both versions revealed high diagnostic accuracy for detection of PCa. Due to better inter-observer agreement for malignant lesions and less time demand, the new PI-RADS version could be more practicable for clinical routine. PMID:27657729

  12. Kappa statistic to measure agreement beyond chance in free-response assessments.

    PubMed

    Carpentier, Marc; Combescure, Christophe; Merlini, Laura; Perneger, Thomas V

    2017-04-19

    The usual kappa statistic requires that all observations be enumerated. However, in free-response assessments, only positive (or abnormal) findings are notified, but negative (or normal) findings are not. This situation occurs frequently in imaging or other diagnostic studies. We propose here a kappa statistic that is suitable for free-response assessments. We derived the equivalent of Cohen's kappa statistic for two raters under the assumption that the number of possible findings for any given patient is very large, as well as a formula for sampling variance that is applicable to independent observations (for clustered observations, a bootstrap procedure is proposed). The proposed statistic was applied to a real-life dataset, and compared with the common practice of collapsing observations within a finite number of regions of interest. The free-response kappa is computed from the total numbers of discordant (b and c) and concordant positive (d) observations made in all patients, as 2d/(b + c + 2d). In 84 full-body magnetic resonance imaging procedures in children that were evaluated by 2 independent raters, the free-response kappa statistic was 0.820. Aggregation of results within regions of interest resulted in overestimation of agreement beyond chance. The free-response kappa provides an estimate of agreement beyond chance in situations where only positive findings are reported by raters.
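
    The formula quoted in this abstract, 2d/(b + c + 2d), can be checked with a few lines of Python. This is a minimal sketch: the function name and the example counts are hypothetical and are not taken from the study.

```python
def free_response_kappa(b: int, c: int, d: int) -> float:
    """Free-response kappa as stated in the abstract: 2d / (b + c + 2d).

    b, c: findings reported by only one of the two raters (discordant).
    d:    findings reported by both raters (concordant positive).
    """
    denominator = b + c + 2 * d
    if denominator == 0:
        raise ValueError("neither rater reported any positive finding")
    return 2 * d / denominator


# Hypothetical counts for illustration only.
kappa_fr = free_response_kappa(b=6, c=4, d=20)
print(f"free-response kappa = {kappa_fr:.3f}")
```

    Negative findings never enter the computation, which is why the statistic can be used when only positive findings are recorded.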

  13. Evaluation of the overall efficacy of the Omron office digital blood pressure HEM-907 monitor in adults.

    PubMed

    White, W B; Anwar, Y A

    2001-04-01

    Non-invasive self blood pressure monitoring has become increasingly popular. To assure the accuracy of devices used for this purpose, all need to be validated independently prior to marketing. The objective of this study was to assess the accuracy of the HEM-907, a new semi-automatic, non-invasive, oscillometric blood pressure monitoring device specifically designed to be used in the clinic or physician's office setting. Blood pressure measurements taken employing this device were compared with the results obtained by two experienced observers using a mercury sphygmomanometer on 100 subjects and patients (384 measurements). The limits of agreement were calculated for the device compared with the results of the two observers according to the standards of the Association for the Advancement of Medical Instrumentation (AAMI). The agreement between the two observers was -0.36 ± 2.32 mmHg for systolic blood pressure and 0.02 ± 2.42 mmHg for diastolic blood pressure. The agreement between the device and the observers was 1.56 ± 4.42 mmHg and 3.49 ± 4.61 mmHg for systolic and diastolic blood pressure, respectively. The Omron HEM-907 satisfied the AAMI criteria for accuracy for a non-invasive blood pressure monitoring device.
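
    The agreement figures above are a mean difference and its standard deviation. The sketch below shows how such a summary could be computed from paired readings. The thresholds (mean difference within 5 mmHg, standard deviation at most 8 mmHg) are the commonly cited AAMI criterion and are an assumption here, since the abstract does not state them; the paired readings are hypothetical.

```python
from statistics import mean, stdev


def difference_summary(device, reference):
    """Return (mean, sample SD) of the paired differences device - reference."""
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need at least two paired readings")
    diffs = [d - r for d, r in zip(device, reference)]
    return mean(diffs), stdev(diffs)


def meets_aami(mean_diff, sd_diff, max_mean=5.0, max_sd=8.0):
    """Assumed AAMI-style check: |mean difference| <= 5 mmHg and SD <= 8 mmHg."""
    return abs(mean_diff) <= max_mean and sd_diff <= max_sd


# Hypothetical paired systolic readings in mmHg.
device_readings = [122, 118, 135, 141]
observer_readings = [120, 119, 131, 138]
mean_diff, sd_diff = difference_summary(device_readings, observer_readings)

# The device-versus-observer values reported in the abstract.
systolic_ok = meets_aami(1.56, 4.42)
diastolic_ok = meets_aami(3.49, 4.61)
```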

  14. Critical discussion of evaluation parameters for inter-observer variability in target definition for radiation therapy.

    PubMed

    Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D

    2012-02-01

    Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, overlap measure, and statistical measure of agreement or reliability analysis is required to fully report the interrater variability in delineation.
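
    Of the overlap measures named above, the Jaccard coefficient is the simplest to state: the volume common to two contours divided by the volume covered by either. A minimal sketch follows, assuming each contour has already been rasterised to a set of voxel coordinates; the two rectangular "contours" are hypothetical.

```python
def jaccard(contour_a: set, contour_b: set) -> float:
    """Jaccard coefficient |A intersect B| / |A union B| for two voxel sets."""
    union = contour_a | contour_b
    if not union:
        raise ValueError("both contours are empty")
    return len(contour_a & contour_b) / len(union)


# Two hypothetical 10 x 10 voxel contours, shifted by 2 voxels along x.
observer_1 = {(x, y) for x in range(0, 10) for y in range(10)}
observer_2 = {(x, y) for x in range(2, 12) for y in range(10)}
overlap = jaccard(observer_1, observer_2)
print(f"Jaccard coefficient = {overlap:.3f}")
```

    With more than two observers, the pairwise values would still need to be combined into a single index; the abstract's generalized conformity index is one such measure and is not reproduced here.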

  15. Validity of covering-up sun-protection habits: Association of observations and self-report

    PubMed Central

    O'Riordan, David L.; Nehl, Eric; Gies, Peter; Bundy, Lucja; Burgess, Kristen; Davis, Erica; Glanz, Karen

    2013-01-01

    Background Few studies have reported the accuracy of measures used to assess sun-protection practices. Valid measures are critical to the internal validity and use of skin cancer control research. Objectives We sought to validate self-reported covering-up practices of pool-goers. Methods A total of 162 lifeguards and 201 parent/child pairs from 16 pools in 4 metropolitan regions in the United States completed a survey and a 4-day sun-habits diary. Observations of sun-protective behaviors were conducted on two occasions. Results Agreement between observations and diaries ranged from slight to substantial, with most values in the fair to moderate range. Highest agreement was observed for parent hat use (κ = 0.58–0.70). There was no systematic pattern of over- or under-reporting among the 3 study groups. Limitations Potential reactivity and a relatively affluent sample are limitations. Conclusion There was little over-reporting and no systematic bias, which increases confidence in reliance on verbal reports of these behaviors in surveys and intervention research. PMID:19278750

  16. Performing both propensity score and instrumental variable analyses in observational studies often leads to discrepant results: a systematic review.

    PubMed

    Laborde-Castérot, Hervé; Agrinier, Nelly; Thilly, Nathalie

    2015-10-01

    Propensity score (PS) and instrumental variable (IV) are analytical techniques used to adjust for confounding in observational research. More and more, they seem to be used simultaneously in studies evaluating health interventions. The present review aimed to analyze the agreement between PS and IV results in medical research published to date. Review of all published observational studies that evaluated a clinical intervention using simultaneously PS and IV analyses, as identified in MEDLINE and Web of Science. Thirty-seven studies, most of them published during the previous 5 years, reported 55 comparisons between results from PS and IV analyses. There was a slight/fair agreement between the methods [Cohen's kappa coefficient = 0.21 (95% confidence interval: 0.00, 0.41)]. In 23 cases (42%), results were nonsignificant for one method and significant for the other, and IV analysis results were nonsignificant in most situations (87%). Discrepancies are frequent between PS and IV analyses and can be interpreted in various ways. This suggests that researchers should carefully consider their analytical choices, and readers should be cautious when interpreting results, until further studies clarify the respective roles of the two methods in observational comparative effectiveness research. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Increasing consistency of disease biomarker prediction across datasets.

    PubMed

    Chikina, Maria D; Sealfon, Stuart C

    2014-01-01

    Microarray studies with human subjects often have limited sample sizes which hampers the ability to detect reliable biomarkers associated with disease and motivates the need to aggregate data across studies. However, human gene expression measurements may be influenced by many non-random factors such as genetics, sample preparations, and tissue heterogeneity. These factors can contribute to a lack of agreement among related studies, limiting the utility of their aggregation. We show that it is feasible to carry out an automatic correction of individual datasets to reduce the effect of such 'latent variables' (without prior knowledge of the variables) in such a way that datasets addressing the same condition show better agreement once each is corrected. We build our approach on the method of surrogate variable analysis but we demonstrate that the original algorithm is unsuitable for the analysis of human tissue samples that are mixtures of different cell types. We propose a modification to SVA that is crucial to obtaining the improvement in agreement that we observe. We develop our method on a compendium of multiple sclerosis data and verify it on an independent compendium of Parkinson's disease datasets. In both cases, we show that our method is able to improve agreement across varying study designs, platforms, and tissues. This approach has the potential for wide applicability to any field where lack of inter-study agreement has been a concern.

  18. Reliability and agreement on embryo assessment: 5 years of an external quality control programme.

    PubMed

    Martínez-Granados, Luis; Serrano, María; González-Utor, Antonio; Ortiz, Nereyda; Badajoz, Vicente; López-Regalado, María Luisa; Boada, Montserrat; Castilla, Jose A

    2018-03-01

    An external quality-control programme for morphology-based embryo quality assessment, incorporating a standardized embryo grading scheme, was evaluated over a period of 5 years to determine levels of inter-observer reliability and agreement between practising clinical embryologists at IVF centres and the opinions of a panel of experts. Following Guidelines for Reporting Reliability and Agreement Studies, the Gwet index and proportion of positive (Ppos) and negative agreement were calculated. For embryo morphology assessment, a substantial degree of reliability was measured between the centres and the panel of experts (Gwet index: 0.76; 95% CI 0.70 to 0.84). The agreement was higher for good- versus poor-quality embryos. When multinucleation or vacuoles were observed, low levels of reliability were obtained (Ppos: 0.56 and 0.43, respectively). In blastocysts, the characteristic that presented the largest discrepancy was that related to the inner cell mass. In decisions about the final disposition of the embryo, reliability between centre and the panel of experts was moderate (Gwet index: 0.51; 95% CI 0.41 to 0.60). In conclusion, the ability of clinical embryologists to evaluate the presence of multinucleation and vacuoles in the early cleavage embryo, and to determine the category of the inner cell mass in blastocysts, needs to be improved. Copyright © 2017 Reproductive Healthcare Ltd. All rights reserved.
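
    For a two-rater, two-category table, the proportions of positive and negative agreement and a Gwet-type index can be sketched as follows. The abstract does not say which Gwet coefficient was used or how many categories the grading scheme had, so the AC1 form for a 2 x 2 table is an assumption, and the counts are hypothetical.

```python
def specific_agreement(both_pos, only_first, only_second, both_neg):
    """Return (Ppos, Pneg) for a 2 x 2 agreement table."""
    discordant = only_first + only_second
    p_pos = 2 * both_pos / (2 * both_pos + discordant)
    p_neg = 2 * both_neg / (2 * both_neg + discordant)
    return p_pos, p_neg


def gwet_ac1(both_pos, only_first, only_second, both_neg):
    """Gwet's AC1 for two raters and two categories (assumed variant)."""
    n = both_pos + only_first + only_second + both_neg
    observed = (both_pos + both_neg) / n
    # Average proportion of "positive" ratings across the two raters.
    pi = ((both_pos + only_first) + (both_pos + only_second)) / (2 * n)
    chance = 2 * pi * (1 - pi)
    return (observed - chance) / (1 - chance)


# Hypothetical counts: centre versus expert panel.
p_pos, p_neg = specific_agreement(40, 5, 5, 50)
ac1 = gwet_ac1(40, 5, 5, 50)
```

    Reporting Ppos and Pneg separately shows whether agreement is concentrated in one category, which a single summary index hides.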

  19. Methods to systematically review and meta-analyse observational studies: a systematic scoping review of recommendations.

    PubMed

    Mueller, Monika; D'Addario, Maddalena; Egger, Matthias; Cevallos, Myriam; Dekkers, Olaf; Mugglin, Catrina; Scott, Pippa

    2018-05-21

    Systematic reviews and meta-analyses of observational studies are frequently performed, but no widely accepted guidance is available at present. We performed a systematic scoping review of published methodological recommendations on how to systematically review and meta-analyse observational studies. We searched online databases and websites and contacted experts in the field to locate potentially eligible articles. We included articles that provided any type of recommendation on how to conduct systematic reviews and meta-analyses of observational studies. We extracted and summarised recommendations on pre-defined key items: protocol development, research question, search strategy, study eligibility, data extraction, dealing with different study designs, risk of bias assessment, publication bias, heterogeneity, statistical analysis. We summarised recommendations by key item, identifying areas of agreement and disagreement as well as areas where recommendations were missing or scarce. The searches identified 2461 articles of which 93 were eligible. Many recommendations for reviews and meta-analyses of observational studies were transferred from guidance developed for reviews and meta-analyses of RCTs. Although there was substantial agreement in some methodological areas there was also considerable disagreement on how evidence synthesis of observational studies should be conducted. Conflicting recommendations were seen on topics such as the inclusion of different study designs in systematic reviews and meta-analyses, the use of quality scales to assess the risk of bias, and the choice of model (e.g. fixed vs. random effects) for meta-analysis. There is a need for sound methodological guidance on how to conduct systematic reviews and meta-analyses of observational studies, which critically considers areas in which there are conflicting recommendations.

  20. Vaginal vault suspension during hysterectomy for benign indications: a prospective register study of agreement on terminology and surgical procedure.

    PubMed

    Bonde, Lisbeth; Noer, Mette Calundann; Møller, Lars Alling; Ottesen, Bent; Gimbel, Helga

    2017-07-01

    Several suspension methods are used to try to prevent pelvic organ prolapse (POP) after hysterectomy. We aimed to evaluate agreement on terminology and surgical procedure of these methods. We randomly chose 532 medical records of women with a history of hysterectomy from the Danish Hysterectomy and Hysteroscopy Database (DHHD). Additionally, we video-recorded 36 randomly chosen hysterectomies. The hysterectomies were registered in the DHHD. The material was categorized according to predefined suspension methods. Agreement compared suspension codes in DHHD (gynecologists' registrations) with medical records (gynecologists' descriptions) and with videos (reviewers' categorizations) respectively. Whether the vaginal vault was suspended (pooled suspension) or not (no suspension method + not described) was analyzed, in addition to each suspension method. Regarding medical records, agreement on terminology was good among patients undergoing pooled suspension in cases of hysterectomy via the abdominal and vaginal route (agreement 78.7, 92.3%). Regarding videos, agreement on surgical procedure was good among pooled suspension patients in cases of hysterectomy via the abdominal, laparoscopic, and vaginal routes (agreement 88.9, 97.8, 100%). Agreement on individual suspension methods differed regarding both medical records (agreement 0-90.1%) and videos (agreement 0-100%). Agreement on terminology and surgical procedure regarding suspension method was good in respect of pooled suspension. However, disagreement was observed when individual suspension methods and operative details were scrutinized. Better consensus of terminology and surgical procedure is warranted to enable further research aimed at preventing POP among women undergoing hysterectomy.

  1. Magnetic force microscopy study of domain walls in Co2Z ferrite

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Qin, Lang; Verweij, Henk, E-mail: verweij.1@osu.edu

    2014-03-01

    Highlights: • Hexaferrite Co2Z is synthesized through the modified Pechini method. • Magnetic domains are observed in anisotropic Co2Z single grain using MFM. • Observed single grain domain thickness is in good agreement with Dotsh model. Abstract: Hexaferrite Co2Z was synthesized through the modified Pechini method. Partially oriented samples were obtained after consolidation with uniaxial pressing and calcination/sintering at 1300 °C/1330 °C. The sample composition and morphology were identified with X-ray diffractometry (XRD) and scanning electron microscopy (SEM) with energy-dispersive X-ray spectrometry (EDS). MFM studies of the single grains revealed a domain structure with a width of 0.7 μm. The Co2Z static magnetization was measured with a vibrating sample magnetometer (VSM), and was used to calculate a single grain domain with a thickness of 4.8 μm. This result is in good agreement with SEM observations of the single grain thickness.

  2. Brief report: Agreement between parent and adolescent autonomy expectations and its relationship to adolescent adjustment.

    PubMed

    Pérez, J Carola; Cumsille, Patricio; Martínez, M Loreto

    2016-12-01

    While disagreement in autonomy expectations between parents and their adolescent children is normative, it may also compromise adolescent adjustment. This study examines the association between parents' and adolescents' agreement on autonomy expectations by cognitive social domains and adolescent adjustment. A sample of 211 Chilean dyads of adolescents (57% female, mean age = 15.29 years) and one of their parents (82% mothers, mean age = 44.36 years) reported their expectations for the age at which adolescents should decide on their own regarding different issues in their life. Indexes of parent-adolescent agreement on autonomy expectations were estimated for issues of personal and prudential domains. Greater agreement in the prudential than in the personal domain was observed. For boys and girls, higher agreement in adolescent-parent autonomy expectations in the personal domain was associated with lower substance use. A negative association between level of agreement in adolescent-parent autonomy expectations in the prudential domain and externalizing behaviors and substance use was found. Copyright © 2016 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.

  3. A multiple reader scoring system for Nasal Potential Difference parameters.

    PubMed

    Solomon, George M; Liu, Bo; Sermet-Gaudelus, Isabelle; Fajac, Isabelle; Wilschanski, Michael; Vermeulen, Francois; Rowe, Steven M

    2017-09-01

    Nasal Potential Difference (NPD) is a biomarker of CFTR activity used to diagnose CF and monitor experimental therapies. Limited studies have been performed to assess agreement between expert readers of NPD interpretation using a scoring algorithm. We developed a standardized scoring algorithm for "interpretability" and "confidence" for PD (potential difference) measures, and sought to determine the degree of agreement on NPD parameters between trained readers. There was excellent agreement for interpretability between NPD readers for CF and fair agreement for normal tracings but slight agreement of interpretability in indeterminate tracings. Amongst interpretable tracings, excellent correlation of mean scores for Ringer's Baseline PD, Δamiloride, and ΔCl-free+Isoproterenol was observed. There was slight agreement regarding confidence of the interpretable PD tracings, resulting in divergence of the Ringer's, Δamiloride, and ΔCl-free+Isoproterenol PDs between "high" and "low" confidence CF tracings. A multi-reader process with adjudication is important for scoring NPDs for diagnosis and in monitoring of CF clinical trials. Copyright © 2017 European Cystic Fibrosis Society. Published by Elsevier B.V. All rights reserved.

  4. An X-ray absorption spectroscopy study of the interactions of Ni2+ with yeast enolase.

    PubMed

    Wang, S; Scott, R A; Lebioda, L; Zhou, Z H; Brewer, J M

    1995-05-15

    An x-ray absorption spectroscopy (XAS) study was carried out at pH 7.6 on solutions of Ni2+ and yeast enolase depleted of its physiological cofactor (Mg2+) in the presence or absence of substrate/product, the very strongly bound competitive inhibitor 2-phosphonoacetohydroxamate and Mg2+. Both "conformational" and "catalytic" Ni2+ are distorted octahedral in coordination, in agreement with several spectroscopic studies but in contrast to the coordination in the crystal at pH 6.0. The data are consistent with direct coordination of what must be the catalytic Ni2+ to the phosphate of the substrate, in agreement with some previous data but in disagreement with recent interpretations by other workers. The ligands around the metal ions obtained from the x-ray structure give simulated XAS spectra in good agreement with the observed spectra.

  5. Improving agreement in assessment of synovitis in rheumatoid arthritis.

    PubMed

    Cheung, Peter P; Dougados, Maxime; Andre, Vincent; Balandraud, Nathalie; Chales, Gérard; Chary-Valckenaere, Isabelle; Chatelus, Emmanuel; Dernis, Emmanuelle; Gill, Ghislaine; Gilson, Mélanie; Guis, Sandrine; Mouterde, Gael; Pavy, Stephan; Pouyol, François; Marhadour, Thierry; Richette, Pascal; Ruyssen-Witrand, Adeline; Soubrier, Martin; Gossec, Laure

    2013-03-01

    Synovitis assessment through evaluation of swollen joints is integral in steering treatment decisions in rheumatoid arthritis (RA). However, there is high inter-observer variation. The objective was to assess whether a short collegiate consensus would improve swollen joint agreement between rheumatologists and whether this was affected by experience. Eighteen rheumatologists from French university rheumatology units participated in three 30-minute rounds over a half-day meeting, evaluating joint counts of RA patients in small groups, followed by short consensus discussions. Agreement was evaluated at the end of each round as follows: (i) global agreement of swollen joints, (ii) swollen joint agreement according to level of experience of the rheumatologist, (iii) swollen joint count, and (iv) agreement of disease activity state according to the Disease Activity Score (DAS28). Agreement was calculated using percentage agreement and kappa. Global agreement of swollen joints failed to improve (kappa 0.50 to 0.52) at the joint level. Agreement between seniors did not improve, but agreement between newly qualified rheumatologists and their senior peer, which was initially poor (kappa 0.28), improved significantly (to 0.54) at the end of the consensus exercises. Concordance of DAS28 activity states improved from 71% to 87%. Consensus exercises for swollen joint assessment are worthwhile and may improve agreement between clinicians on clinical synovitis and disease activity state; the benefit was mostly observed in newly qualified rheumatologists. Copyright © 2012 Société française de rhumatologie. Published by Elsevier SAS. All rights reserved.

  6. Levels-of-growing-stock cooperative study in Douglas-fir: report no. 09—Some comparisons of DFSIM estimates with growth in the levels-of-growing-stock study.

    Treesearch

    Robert O. Curtis

    1987-01-01

    Initial stand statistics for the levels-of-growing-stock study installations were projected by the Douglas-fir stand simulation program (DFSIM) over the available periods of observation. Estimates were compared with observed volume and basal area growth, diameter change, and mortality. Overall agreement was reasonably good, although results indicate some biases and a...

  7. Effect of two X-ray tube voltages on detection of approximal caries in digital radiographs. An in vitro study.

    PubMed

    Hellén-Halme, Kristina

    2011-04-01

    This study evaluated the effect of two different tube voltages on clinicians' ability to diagnose approximal carious lesions in digital radiographs. One hundred extracted teeth were radiographed twice at two voltage settings, 60 and 70 kV, using a standardized procedure. Seven observers evaluated the radiographs on a standard color monitor pre-calibrated according to DICOM part 14. Evaluations were made at ambient light levels below 50 lx. All observations were analyzed with receiver operating characteristic curves. A histological examination of the teeth served as the criterion standard. A paired t test compared the effects of the two voltages. The significance level was set to p < 0.05. Weighted kappa statistics estimated intra-observer agreement. No significant difference in accuracy of approximal carious lesion diagnosis was found between the two voltage settings. However, five observers rated dentin lesions on radiographs exposed at 70 kV better than on radiographs exposed at 60 kV. Intra-observer agreement ranged from fair to moderate. There was no significant difference in accuracy of approximal carious lesion diagnosis between digital radiographs exposed with 60 or 70 kV.
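
    Weighted kappa gives partial credit when two ratings on an ordinal scale are close but not identical. The sketch below uses linear weights, which is an assumption because the abstract does not say whether linear or quadratic weights were applied; the ratings (0 = sound, 1 = enamel lesion, 2 = dentin lesion) are hypothetical.

```python
def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    Ratings are integers 0 .. n_categories - 1 on an ordinal scale.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    n = len(ratings_a)
    k = n_categories
    # Marginal proportion of each category in each rating session.
    p_a = [ratings_a.count(c) / n for c in range(k)]
    p_b = [ratings_b.count(c) / n for c in range(k)]
    # Observed weighted disagreement.
    observed = sum(abs(a - b) / (k - 1) for a, b in zip(ratings_a, ratings_b)) / n
    # Weighted disagreement expected if the two sessions were independent.
    expected = sum(
        abs(i - j) / (k - 1) * p_a[i] * p_b[j]
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1 - observed / expected


# Hypothetical scores from one observer reading the same six images twice.
first_session = [0, 0, 1, 1, 2, 2]
second_session = [0, 1, 1, 1, 2, 1]
kappa_w = linear_weighted_kappa(first_session, second_session, n_categories=3)
```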

  8. Magnetic resonance features of the feline hippocampus in epileptic and non-epileptic cats: a blinded, retrospective, multi-observer study.

    PubMed

    Claßen, Anne Christine; Kneissl, Sibylle; Lang, Johann; Tichy, Alexander; Pakozdy, Akos

    2016-08-11

    Hippocampal necrosis in cats has been reported to be associated with epileptic seizures. Magnetic resonance imaging (MRI) features of temporal lobe (TL) abnormalities in epileptic cats have been described but MR images from epileptic and non-epileptic individuals have not yet been systematically compared. TL abnormalities are highly variable in shape, size and signal, and therefore may lead to varying evaluations by different specialists. The aim of this study was to investigate whether there were differences in the appearance of the TL between epileptic and non-epileptic cats, and whether there were any relationships between TL abnormalities and seizure semiologies or other clinical findings. We also investigated interobserver agreement among three specialists. The MR images of 46 cats were reviewed independently by three observers, who were blinded to patient data, examination findings and the review of the other observers. Images were evaluated using a multiparametric scoring system developed for this study. Mann-Whitney U-tests and chi-square tests were used to analyse the differences between observers' evaluations. The kappa coefficient (k) and Fleiss' kappa coefficient were used to quantify interobserver agreement. The overall interobserver agreement was moderate to good (k = 0.405 to 0.615). The MR scores between epileptic and non-epileptic cats did not differ significantly. However, there was a significant difference between the MR scores of epileptic cats with and without orofacial involvement according to all three observers. Likewise, MR scores of cats with cluster seizures were higher than those of cats without clusters. Cats presenting with recurrent epileptic seizures with orofacial involvement are more likely to have hippocampal pathologies, which suggests that TL abnormalities are not merely unspecific epileptic findings, but are associated with a certain type of epilepsy. TL signal alterations are more likely to be detected on FLAIR sequences.
In contrast to severe changes in the TL which were described similarly among specialists, mild TL abnormalities may be difficult to interpret, thus leading to different assessments among observers.
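
    When more than two observers rate every case, as with the three specialists here, Fleiss' kappa summarises agreement across all of them at once. The following is a minimal sketch of the standard formula; the input layout and the example table are hypothetical and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every subject must be rated by the same number of raters.
    """
    if not counts:
        raise ValueError("no subjects")
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    # Agreement among rater pairs within each subject.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Overall share of ratings falling in each category.
    n_categories = len(counts[0])
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical table: 4 cats, 3 observers, categories (normal, abnormal).
ratings_table = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa_fleiss = fleiss_kappa(ratings_table)
```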

  9. Deep infiltrating endometriosis: Should rectal and vaginal opacification be systematically used in MR imaging?

    PubMed

    Uyttenhove, F; Langlois, C; Collinet, P; Rubod, C; Verpillat, P; Bigot, J; Kerdraon, O; Faye, N

    2016-06-01

    To evaluate the usefulness of rectal and vaginal filling in vaginal and recto-sigmoid endometriosis with MR imaging. To compare the results between a senior and a junior radiologist review. Sixty-seven patients with clinically suspected deep infiltrating endometriosis were included in our MRI protocol consisting of repeated T2-weighted sequences (axial and sagittal) before and after rectal and vaginal marking with ultrasonography gel. Vaginal and recto-sigmoid endometriosis lesions were analyzed before and after opacification. The inter-reader agreement between senior and junior scores was studied. For vaginal involvement and for colonic involvement of the muscularis and beyond, no significant difference (P=0.32) was observed and the inter-reader agreement was excellent (K=0.96 and 0.97, respectively). For serosal colonic lesions, a significant difference was observed (P=0.01) and the inter-reader agreement was poor (K=0). Rectal and vaginal filling in endometriosis staging with MRI is not necessary, regardless of reader experience. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  10. Micromagnetic study of auto-oscillation modes in spin-Hall nano-oscillators

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ulrichs, H., E-mail: henning.ulrichs@uni-muenster.de; Demidov, V. E.; Demokritov, S. O.

    2014-01-27

    We present a numerical study of magnetization dynamics in a recently introduced spin torque nano-oscillator, whose operational principle relies on the spin-Hall effect—spin-Hall nano-oscillators. Our numerical results show good agreement with the experimentally observed behaviors and provide detailed information about the features of the primary auto-oscillation mode observed in the experiments. They also clarify the physical nature of the secondary auto-oscillation mode, which was experimentally observed under certain conditions only.

  11. Enhanced Two-Factor Authentication and Key Agreement Using Dynamic Identities in Wireless Sensor Networks

    PubMed Central

    Chang, I-Pin; Lee, Tian-Fu; Lin, Tsung-Hung; Liu, Chuan-Ming

    2015-01-01

    Key agreements that use only password authentication are convenient in communication networks, but these key agreement schemes often fail to resist possible attacks, and therefore provide poor security compared with some other authentication schemes. To increase security, many authentication and key agreement schemes use smartcard authentication in addition to passwords. Thus, two-factor authentication and key agreement schemes using smartcards and passwords are widely adopted in many applications. Vaidya et al. recently presented a two-factor authentication and key agreement scheme for wireless sensor networks (WSNs). Kim et al. observed that the Vaidya et al. scheme fails to resist gateway node bypassing and user impersonation attacks, and then proposed an improved scheme for WSNs. This study analyzes the weaknesses of the two-factor authentication and key agreement scheme of Kim et al., which include vulnerability to impersonation attacks, lost smartcard attacks and man-in-the-middle attacks, violation of session key security, and failure to protect user privacy. An efficient and secure authentication and key agreement scheme for WSNs based on the scheme of Kim et al. is then proposed. The proposed scheme not only solves the weaknesses of previous approaches, but also increases security requirements while maintaining low computational cost. PMID:26633396

  12. Remote Digital Preoperative Assessments for Cleft Lip and Palate May Improve Clinical and Economic Impact in Global Plastic Surgery.

    PubMed

    Hughes, Christopher; Campbell, Jacob; Mukhopadhyay, Swagoto; McCormack, Susan; Silverman, Richard; Lalikos, Janice; Babigian, Alan; Castiglione, Charles

    2017-09-01

    Reconstructive surgical care can play a vital role in the resource-poor settings of low- and middle-income countries. Telemedicine platforms can improve the efficiency and effectiveness of surgical care. The purpose of this study is to determine whether remote digital video evaluations are reliable in the context of a short-term plastic surgical intervention. The setting for this study was a district hospital located in Latacunga, Ecuador. Participants were 27 consecutive patients who presented for operative repair of cleft lip and palate. We calculated kappa coefficients for reliability between in-person and remote digital video assessments for the classification of cleft lip and palate between two separate craniofacial surgeons. We hypothesized that the technology would be a reliable method of preoperative assessment for cleft disease. Of the 27 participants, 22 (81.4%) received operative treatment for their cleft disorder. Mean age was 11.1 ± 8.3 years. Patients presented with a spectrum of disorders, including cleft lip (24 of 27, 88.9%), cleft palate (19 of 27, 70.4%), and alveolar cleft (19 of 27, 70.4%). We found a 95.7% agreement between observers for cleft lip with substantial reliability (κ = .78, P < .01). There was an 82.6% agreement between observers for cleft palate, with a moderate interrater reliability (κ = .55, P = .01). We found only a 47.8% agreement between observers for alveolar cleft with a nonsignificant, weak kappa agreement (κ = .06, P = .74). Remote digital assessments are a reliable way to preoperatively diagnose cleft lip and palate in the context of short-term plastic surgical interventions in low- and middle-income countries. Future work will evaluate the potential for real-time, telemedicine assessments to reduce cost and improve clinical effectiveness in global plastic surgery.
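
    The abstract above reports percent agreement alongside kappa for two raters. As a minimal sketch of how the two quantities relate, assuming the standard two-rater Cohen's kappa (the abstract does not name the variant), the function below computes both from paired labels. The function name and the sample labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Return (percent_agreement, kappa) for two equal-length label lists.

    Hypothetical helper: standard two-rater Cohen's kappa on nominal labels.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label: kappa is undefined.
        return observed, float("nan")
    return observed, (observed - expected) / (1.0 - expected)


# Illustrative labels only (not taken from the study).
in_person = ["cleft"] * 8 + ["none"] * 2
remote = ["cleft"] * 7 + ["none"] * 3
agreement, kappa = cohens_kappa(in_person, remote)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

    High raw agreement can coexist with low kappa when one category dominates, because most of that agreement is then expected by chance; this is one plausible reading of the alveolar cleft result.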

  13. Stress echocardiography with smartphone: real-time remote reading for regional wall motion.

    PubMed

    Scali, Maria Chiara; de Azevedo Bellagamba, Clarissa Carmona; Ciampi, Quirino; Simova, Iana; de Castro E Silva Pretto, José Luis; Djordjevic-Dikic, Ana; Dodi, Claudio; Cortigiani, Lauro; Zagatina, Angela; Trambaiolo, Paolo; Torres, Marco R; Citro, Rodolfo; Colonna, Paolo; Paterni, Marco; Picano, Eugenio

    2017-11-01

    The diffusion of smart-phones offers access to the best remote expertise in stress echo (SE). The aim was to evaluate the reliability of SE based on smart-phone filming and reading. A set of 20 SE video-clips was read in random sequence with a multiple choice six-answer test by ten readers from five different countries (Italy, Brazil, Serbia, Bulgaria, Russia) of the "SE2020" study network. The gold standard to assess accuracy was a core-lab expert reader in agreement with angiographic verification (0 = wrong, 1 = right). The same set of 20 SE studies was read, in random order and >2 months apart, on a desktop workstation and via smartphones by ten remote readers. Image quality was graded from 1 = poor but readable, to 3 = excellent. Kappa (k) statistics were used to assess intra- and inter-observer agreement. The image quality was comparable in desktop workstation vs. smartphone (2.0 ± 0.5 vs. 2.4 ± 0.7, p = NS). The average reading time per case was similar for desktop versus smartphone (90 ± 39 vs. 82 ± 54 s, p = NS). The overall diagnostic accuracy of the ten readers was similar for desktop workstation vs. smartphone (84 vs. 91%, p = NS). Intra-observer agreement (desktop vs. smartphone) was good (k = 0.81 ± 0.14). Inter-observer agreement was good and similar via desktop or smartphone (k = 0.69 vs. k = 0.72, p = NS). The diagnostic accuracy and consistency of SE reading among certified readers was high and similar via desktop workstation or via smartphone.

  14. Assessing physical activity during youth sport: the Observational System for Recording Activity in Children: Youth Sports.

    PubMed

    Cohen, Alysia; McDonald, Samantha; McIver, Kerry; Pate, Russell; Trost, Stewart

    2014-05-01

    The purpose of this study was to evaluate the validity and interrater reliability of the Observational System for Recording Activity in Children: Youth Sports (OSRAC:YS). Children (N = 29) participating in a parks and recreation soccer program were observed during regularly scheduled practices. Physical activity (PA) intensity and contextual factors were recorded by momentary time-sampling procedures (10-second observe, 20-second record). Two observers simultaneously observed and recorded children's PA intensity, practice context, social context, coach behavior, and coach proximity. Interrater reliability was based on agreement (Kappa) between the observers' coding for each category, and the Intraclass Correlation Coefficient (ICC) for percent of time spent in MVPA. Validity was assessed by calculating the correlation between OSRAC:YS estimated and objectively measured MVPA. Kappa statistics for each category demonstrated substantial to almost perfect interobserver agreement (Kappa = 0.67-0.93). The ICC for percent time in MVPA was 0.76 (95% C.I. = 0.49-0.90). A significant correlation (r = .73) was observed for MVPA recorded by observation and MVPA measured via accelerometry. The results indicate the OSRAC:YS is a reliable and valid tool for measuring children's PA and contextual factors during a youth soccer practice.

  15. Quantitative MR Imaging of Hepatic Steatosis: Validation in Ex Vivo Human Livers

    PubMed Central

    Bannas, Peter; Kramer, Harald; Hernando, Diego; Agni, Rashmi; Cunningham, Ashley M.; Mandal, Rakesh; Motosugi, Utaroh; Sharma, Samir D.; del Rio, Alejandro Munoz; Fernandez, Luis; Reeder, Scott B.

    2015-01-01

    Emerging magnetic resonance imaging (MRI) biomarkers of hepatic steatosis have demonstrated tremendous promise for accurate quantification of hepatic triglyceride concentration. These methods quantify the “proton density fat-fraction” (PDFF), which reflects the concentration of triglycerides in tissue. Previous in vivo studies have compared MRI-PDFF with histologic steatosis grading for assessment of hepatic steatosis. However, the correlation of MRI-PDFF with the underlying hepatic triglyceride content remained unknown. The aim of this ex vivo study was to validate the accuracy of MRI-PDFF as an imaging biomarker of hepatic steatosis. Using ex vivo human livers, we compared MRI-PDFF with magnetic resonance spectroscopy-PDFF (MRS-PDFF), biochemical triglyceride extraction and histology as three independent reference standards. A secondary aim was to compare the precision of MRI-PDFF relative to biopsy for the quantification of hepatic steatosis. MRI-PDFF was prospectively performed at 1.5T in 13 explanted human livers. We performed co-localized paired evaluation of liver fat content in all nine Couinaud segments using single-voxel MRS-PDFF (n=117), tissue wedges for biochemical triglyceride extraction (n=117), and five core biopsies performed in each segment for histologic grading (n=585). Accuracy of MRI-PDFF was assessed through linear regression with MRS-PDFF, triglyceride extraction and histology. Intra-observer agreement, inter-observer agreement and repeatability of MRI-PDFF and histologic grading were assessed through Bland-Altman analyses. MRI-PDFF showed an excellent correlation with MRS-PDFF (r=0.984; CI: 0.978–0.989) and strong correlation with histology (r=0.850; CI: 0.791–0.894) and triglyceride extraction (r=0.871; CI: 0.818–0.909). Intra-observer agreement, inter-observer agreement and repeatability showed a significantly smaller variance for MRI-PDFF than for histologic steatosis grading (all p<0.001). 
Conclusion: MRI-PDFF is an accurate, precise and reader-independent non-invasive imaging biomarker of liver triglyceride content, capable of steatosis quantification over the entire liver. PMID:26224591
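
    The abstract above assesses observer agreement and repeatability through Bland-Altman analyses. The sketch below shows the basic computation under the usual textbook assumptions (mean difference as bias, limits of agreement at ±1.96 sample standard deviations of the paired differences). The study's exact procedure is not given in the abstract, and the function name and readings are hypothetical.

```python
import statistics


def bland_altman(first, second):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Hypothetical helper: bias is the mean paired difference and the limits
    of agreement are bias +/- 1.96 * sample SD of the differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Illustrative fat-fraction readings (%) from two hypothetical readers.
reader_1 = [10.0, 12.0, 14.0, 16.0]
reader_2 = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman(reader_1, reader_2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

    Narrower limits of agreement correspond to the smaller variance of differences that the abstract reports for MRI-PDFF relative to histologic grading.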

  16. Influence of intra-tumoral heterogeneity on the evaluation of BCL2, E-cadherin, EGFR, EMMPRIN, and Ki-67 expression in tissue microarrays from breast cancer.

    PubMed

    Tramm, Trine; Kyndi, Marianne; Sørensen, Flemming B; Overgaard, Jens; Alsner, Jan

    2018-01-01

    The influence of intra-tumoral heterogeneity on the evaluation of immunohistochemical (IHC) biomarker expression may affect the analytical validity of new biomarkers substantially and hence compromise the clinical utility. The aim of this study was to examine the influence of intra-tumoral heterogeneity as well as inter-observer variability on the evaluation of various IHC markers with potential prognostic impact in breast cancer (BCL2, E-cadherin, EGFR, EMMPRIN and Ki-67). From each of 27 breast cancer patients, two tumor-containing paraffin blocks were chosen. Intra-tumoral heterogeneity was evaluated (1) within a single tumor-containing paraffin block ('intra-block agreement') by comparing information from a central, a peripheral tissue microarray (TMA) core and a whole slide section (WS), (2) between two different tumor-containing blocks from the same primary tumor ('inter-block agreement') by comparing information from TMA cores (central/peripheral) and WS. IHC markers on WS and TMA cores were evaluated by two observers independently, and agreements were estimated by Kappa statistics. For BCL2, E-cadherin and EGFR, an almost perfect intra- and inter-block agreement was found. EMMPRIN and Ki-67 showed a more heterogeneous expression with moderate to substantial intra-block agreements. For both stainings, there was a moderate inter-block agreement that improved slightly for EMMPRIN, when using WS instead of TMA cores. Inter-observer agreements were found to be almost perfect for BCL2, E-cadherin and EGFR (WS: κ > 0.82, TMAs: κ > 0.90), substantial for EMMPRIN (κ > 0.63), but only fair to moderate for Ki-67 (WS: κ = 0.54, TMAs: κ = 0.33). BCL2, E-cadherin and EGFR were found to be homogeneously expressed, whereas EMMPRIN and Ki-67 showed a more pronounced degree of intra-tumoral heterogeneity. 
The results emphasize the importance of securing the analytical validity of new biomarkers by examining the intra-tumoral heterogeneity of immunohistochemical stainings applied to TMA cores individually in each type of cancer.

  17. The Illustris simulation: supermassive black hole-galaxy connection beyond the bulge

    NASA Astrophysics Data System (ADS)

    Mutlu-Pakdil, Burçin; Seigar, Marc S.; Hewitt, Ian B.; Treuthardt, Patrick; Berrier, Joel C.; Koval, Lauren E.

    2018-02-01

    We study the spiral arm morphology of a sample of the local spiral galaxies in the Illustris simulation and explore the supermassive black hole-galaxy connection beyond the bulge (e.g. spiral arm pitch angle, total stellar mass, dark matter mass, and total halo mass), finding good agreement with other theoretical studies and observational constraints. It is important to study the properties of supermassive black holes and their host galaxies through both observations and simulations and compare their results in order to understand their physics and formative histories. We find that Illustris prediction for supermassive black hole mass relative to pitch angle is in rather good agreement with observations and that barred and non-barred galaxies follow similar scaling relations. Our work shows that Illustris presents very tight correlations between supermassive black hole mass and large-scale properties of the host galaxy, not only for early-type galaxies but also for low-mass, blue and star-forming galaxies. These tight relations beyond the bulge suggest that halo properties determine those of a disc galaxy and its supermassive black hole.

  18. Examining Interrater Agreement Analyses of a Pilot Special Education Observation Tool

    ERIC Educational Resources Information Center

    Johnson, Evelyn S.; Semmelroth, Carrie L.

    2012-01-01

    This paper reports the results of interrater agreement analyses on a pilot special education teacher evaluation instrument, the Recognizing Effective Special Education Teachers (RESET) Observation Tool (OT). Using evidence-based instructional practices as the basis for the evaluation, the RESET OT is designed for the spectrum of different…

  19. 'Ease of interpretation' of cytological smears stained with modified ultrafast papanicolaou stain: Interobserver agreement and reproducibility.

    PubMed

    Uthamalingam, Preithy; Sathish Kumar, Thabasum; Venus, Albina; Sekar, Preethi; Muthusamy, Rajeshwari K; Mehta, Sangita

    2018-04-01

    Since its inception in 1995, the Ultrafast Papanicolaou (UFPAP) cytological stain has undergone a number of modifications to suit the local availability of reagents and cost in different setups. However, the reported results have been uniformly encouraging. We designed a study to investigate the inter-observer agreement in 'perceived ease of interpretation' of cytological smears stained with Modified Ultrafast Papanicolaou stain (MUFPAP). After a small pilot study, we prospectively stained air-dried fine needle aspirate smears (FNACs) and Body Fluid smears with the standardized MUFPAP stain. The MUFPAP stained slides were evaluated in tandem with other routine cytological stains as well as independently by two pathologists. Two rater kappa was used to determine the agreement. The study included 93 fluids and 34 FNACs. A vast majority of the cases stained with MUFPAP were rated 'better' than the routine stains in terms of 'overall ease of interpretation' with considerable agreement. The agreement tended to be better for FNACs than fluid specimens. Cases with malignant pathology demonstrated a perfect agreement (kappa = 1) between the raters in terms of 'overall ease of interpretation' (91.7% cases were rated 'very good' by each pathologist) when compared to cases with benign pathology (kappa = 0.52). Nuclear characteristics were appreciated with a better agreement than other parameters. Modified UFPAP stain appears to be a quick, reliable, cost-effective alternative in cytology, especially for detecting malignant cells in smears with low cellularity. Its specific advantage is robust nuclear staining against a clear background. © 2018 Wiley Periodicals, Inc.

  20. Effect of ergonomics training on agreement between expert and nonexpert ratings of the potential for musculoskeletal harm in manufacturing tasks.

    PubMed

    Fethke, Nathan B; Merlino, Linda; Gerr, Fred

    2013-12-01

    The objective was to evaluate the effect of ergonomics training on non-ergonomists' ability to recognize and characterize the potential for musculoskeletal harm in manufacturing tasks. Ergonomics training was delivered to members of a participatory ergonomics team in a manufacturing facility. Before and after training, participatory ergonomics team members and the research team rated the potential for musculoskeletal harm for each of 30 tasks. Measures of agreement included Pearson, concordance, and intraclass correlation coefficients. Measures of agreement generally improved after training. The greatest agreement was observed for ratings of the potential for musculoskeletal harm to the low back. The greatest improvement in agreement was observed for ratings of the potential for musculoskeletal harm to the neck/shoulder. The training seemed to improve non-experts' ability to identify the potential for musculoskeletal harm.
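
    The abstract above lists Pearson and concordance correlation coefficients among its agreement measures. The sketch below contrasts the two, assuming that "concordance" refers to Lin's concordance correlation coefficient computed with population (1/n) moments; the abstract does not spell this out. The function name and ratings are hypothetical.

```python
import math


def pearson_and_ccc(x, y):
    """Return (pearson_r, lins_ccc) for paired ratings.

    Hypothetical helper. Pearson r measures linear association only, whereas
    Lin's CCC also penalises a systematic shift between the two raters.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired ratings")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("ratings must not be constant")
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    pearson = cov / math.sqrt(var_x * var_y)
    ccc = 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)
    return pearson, ccc


# Illustrative ratings: the non-expert scores every task one point higher.
expert = [1.0, 2.0, 3.0, 4.0]
non_expert = [2.0, 3.0, 4.0, 5.0]
r, ccc = pearson_and_ccc(expert, non_expert)
print(f"pearson={r:.3f}, ccc={ccc:.3f}")
```

    In this illustration the ratings are perfectly correlated yet never identical, so the concordance coefficient falls below the Pearson coefficient; this is the usual reason for reporting both.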

  1. Interobserver agreement on Poser's and the new McDonald's diagnostic criteria for multiple sclerosis.

    PubMed

    Zipoli, V; Portaccio, E; Siracusa, G; Pracucci, G; Sorbi, S; Amato, M P

    2003-10-01

    We assessed the interobserver agreement on the diagnosis of multiple sclerosis (MS) in a study sample consisting of 41 MS (15 relapsing remitting, two secondary progressive, five primary progressive and 19 presenting their first clinical attack) and three non-MS cases. Clinical and paraclinical information was recorded in standardized forms. Four neurologists were asked to make a diagnosis using Poser's and McDonald's criteria and to assess MRI scans according to the McDonald's guidelines. In terms of the kappa statistic (kappa), we found a moderate agreement on the overall diagnosis using both Poser's and McDonald's criteria (kappa, respectively 0.57 and 0.52). As for distinct diagnostic categories, we observed a moderate to substantial agreement for the three McDonald categories (range of kappa values 0.49-0.64) and a fair to substantial agreement for the nine Poser categories (range of kappa values 0.37-0.67). Taking into account clinical information, the agreement on dissemination over time was substantially higher (kappa = 0.69) than that found on dissemination over space (kappa = 0.46). In contrast, for MRI assessment, the agreement for spatial dissemination was substantial (kappa = 0.74) compared with the fair agreement (kappa = 0.25) yielded by dissemination over time. The new McDonald's criteria yield a good overall diagnostic reliability, and compare favourably with Poser's classification in terms of agreement on distinct diagnostic categories.
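
    The abstract above reports a single kappa across four neurologists. The abstract does not say which multi-rater kappa was used; the sketch below assumes Fleiss' kappa, a common choice when every case is rated by the same number of raters. The function name and the example ratings are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Return Fleiss' kappa.

    `ratings` is a list with one entry per case; each entry is a list
    holding one nominal label per rater (same number of raters per case).
    """
    if not ratings:
        raise ValueError("no cases supplied")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    n_cases = len(ratings)
    label_totals = Counter()
    per_case_agreement = 0.0
    for case in ratings:
        counts = Counter(case)
        label_totals.update(counts)
        # Share of rater pairs that agree on this case.
        pairs = sum(c * c for c in counts.values()) - n_raters
        per_case_agreement += pairs / (n_raters * (n_raters - 1))
    observed = per_case_agreement / n_cases
    # Chance agreement from the pooled label proportions.
    expected = sum((t / (n_cases * n_raters)) ** 2 for t in label_totals.values())
    if expected == 1.0:
        return float("nan")  # only one label ever used: kappa undefined
    return (observed - expected) / (1.0 - expected)


# Illustrative diagnoses from four raters on three cases (not study data).
cases = [
    ["MS", "MS", "MS", "MS"],
    ["MS", "MS", "not MS", "not MS"],
    ["not MS", "not MS", "not MS", "not MS"],
]
print(f"kappa={fleiss_kappa(cases):.3f}")
```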

  2. Assessment of a new web-based sexual concurrency measurement tool for men who have sex with men.

    PubMed

    Rosenberg, Eli S; Rothenberg, Richard B; Kleinbaum, David G; Stephenson, Rob B; Sullivan, Patrick S

    2014-11-10

    Men who have sex with men (MSM) are the most affected risk group in the United States' human immunodeficiency virus (HIV) epidemic. Sexual concurrency, the overlapping of partnerships in time, accelerates HIV transmission in populations and has been documented at high levels among MSM. However, concurrency is challenging to measure empirically and variations in assessment techniques used (primarily the date overlap and direct question approaches) and the outcomes derived from them have led to heterogeneity and questionable validity of estimates among MSM and other populations. The aim was to evaluate a novel Web-based and interactive partnership-timing module designed for measuring concurrency among MSM, and to compare outcomes measured by the partnership-timing module to those of typical approaches in an online study of MSM. In an online study of MSM aged ≥18 years, we assessed concurrency by using the direct question method and by gathering the dates of first and last sex, with enhanced programming logic, for each reported partner in the previous 6 months. From these methods, we computed multiple concurrency cumulative prevalence outcomes: direct question, day resolution / date overlap, and month resolution / date overlap including both 1-month ties and excluding ties. We additionally computed variants of the UNAIDS point prevalence outcome. The partnership-timing module was also administered. It uses an interactive month resolution calendar to improve recall and follow-up questions to resolve temporal ambiguities, and it combines elements of the direct question and date overlap approaches. The agreement between the partnership-timing module and other concurrency outcomes was assessed with percent agreement, kappa statistic (κ), and matched odds ratios at the individual, dyad, and triad levels of analysis. 
Among 2737 MSM who completed the partnership section of the partnership-timing module, 41.07% (1124/2737) of individuals had concurrent partners in the previous 6 months. The partnership-timing module had the highest degree of agreement with the direct question. Agreement was lower with date overlap outcomes (agreement range 79%-81%, κ range .55-.59) and lowest with the UNAIDS outcome at 5 months before interview (65% agreement, κ=.14, 95% CI .12-.16). All agreements declined after excluding individuals with 1 sex partner (always classified as not engaging in concurrency), although the highest agreement was still observed with the direct question technique (81% agreement, κ=.59, 95% CI .55-.63). Similar patterns in agreement were observed with dyad- and triad-level outcomes. The partnership-timing module showed strong concurrency detection ability and agreement with previous measures. These levels of agreement were greater than others have reported among previous measures. The partnership-timing module may be well suited to quantifying concurrency among MSM at multiple levels of analysis.
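
    The date overlap approach described above classifies a respondent as concurrent when the first-sex/last-sex intervals of two partners overlap, at day or month resolution and with or without ties. The sketch below is one possible reading of that rule and not the study's programming logic: a "tie" is assumed to mean that one partnership ends in the same time unit in which another begins. All names and dates are hypothetical.

```python
from datetime import date
from itertools import combinations


def _month_index(d):
    """Map a date to a running month number so months can be compared."""
    return d.year * 12 + (d.month - 1)


def is_concurrent(partnerships, resolution="day", count_ties=True):
    """Return True if any two (first_sex, last_sex) intervals overlap.

    Assumption for illustration: a tie is an overlap of exactly one shared
    boundary unit (one partnership's last unit equals another's first unit).
    """
    if resolution == "day":
        spans = [(s.toordinal(), e.toordinal()) for s, e in partnerships]
    elif resolution == "month":
        spans = [(_month_index(s), _month_index(e)) for s, e in partnerships]
    else:
        raise ValueError("resolution must be 'day' or 'month'")
    for (s1, e1), (s2, e2) in combinations(spans, 2):
        if count_ties:
            overlap = s1 <= e2 and s2 <= e1
        else:
            overlap = s1 < e2 and s2 < e1
        if overlap:
            return True
    return False


# Illustrative respondent with two partners in the recall window.
partners = [
    (date(2014, 1, 5), date(2014, 3, 10)),
    (date(2014, 3, 20), date(2014, 5, 1)),
]
print(is_concurrent(partners, "day"))                      # no shared day
print(is_concurrent(partners, "month", count_ties=True))   # both touch March
print(is_concurrent(partners, "month", count_ties=False))  # March tie excluded
```

    The example shows why the resolution and tie rule matter: the same respondent is classified differently depending on which outcome definition is applied, which is consistent with the differing agreement levels the abstract reports across outcomes.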

  3. Masking by Gratings Predicted by an Image Sequence Discriminating Model: Testing Models for Perceptual Discrimination Using Repeatable Noise

    NASA Technical Reports Server (NTRS)

    Ahumada, Albert J., Jr.; Null, Cynthia H. (Technical Monitor)

    1998-01-01

    Adding noise to stimuli to be discriminated allows estimation of observer classification functions based on the correlation between observer responses and relevant features of the noisy stimuli. Examples will be presented of stimulus features that are found in auditory tone detection and visual vernier acuity. Using the standard signal detection model (Thurstone scaling), we derive formulas to estimate the proportion of the observer's decision variable variance that is controlled by the added noise. One is based on the probability of agreement of the observer with him/herself on trials with the same noise sample. Another is based on the relative performance of the observer and the model. When these do not agree, the model can be rejected. A second derivation gives the probability of agreement of observer and model when the observer follows the model except for internal noise. Agreement significantly less than this amount allows rejection of the model.

  4. Breast lesion shape and margin evaluation: BI-RADS based metrics understate radiologists' actual levels of agreement.

    PubMed

    Rawashdeh, Mohammad; Lewis, Sarah; Zaitoun, Maha; Brennan, Patrick

    2018-05-01

    While there is much literature describing the radiologic detection of breast cancer, there are limited data available on the agreement between experts when delineating and classifying breast lesions. The aim of this work is to measure the level of agreement between expert radiologists when delineating and classifying breast lesions as demonstrated through Breast Imaging Reporting and Data System (BI-RADS) and quantitative shape metrics. Forty mammographic images, each containing a single lesion, were presented to nine expert breast radiologists using a high specification interactive digital drawing tablet with stylus. Each reader was asked to manually delineate the breast masses using the tablet and stylus and then visually classify the lesion according to the American College of Radiology (ACR) BI-RADS lexicon. The delineated lesion compactness and elongation were computed using Matlab software. Intraclass Correlation Coefficient (ICC) and Cohen's kappa were used to assess inter-observer agreement for delineation and classification outcomes, respectively. Inter-observer agreement was fair for BI-RADS shape (kappa = 0.37) and moderate for margin (kappa = 0.58) assessments. Agreement for quantitative shape metrics was good for lesion elongation (ICC = 0.82) and excellent for compactness (ICC = 0.93). Fair to moderate levels of agreement were shown by radiologists for shape and margin classifications of cancers using the BI-RADS lexicon. When quantitative shape metrics were used to evaluate radiologists' delineation of lesions, good to excellent inter-observer agreement was found. The results suggest that qualitative descriptors such as BI-RADS lesion shape and margin understate the actual level of expert radiologist agreement. Copyright © 2018 Elsevier Ltd. All rights reserved.
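
    The abstract above derives quantitative shape metrics from hand-drawn outlines but does not define them. As an illustration only, the sketch below computes one widely used compactness measure, 4πA/P², for an outline given as polygon vertices. This definition is an assumption and may differ from the study's Matlab implementation; all names are hypothetical.

```python
import math


def polygon_area_perimeter(points):
    """Return (area, perimeter) of a simple polygon given as (x, y) vertices."""
    n = len(points)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1  # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def compactness(points):
    """Assumed definition: 4*pi*area / perimeter**2 (a circle scores 1)."""
    area, perimeter = polygon_area_perimeter(points)
    return 4.0 * math.pi * area / perimeter ** 2


# Illustrative outlines: a square and an elongated rectangle.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (10, 0), (10, 1), (0, 1)]
print(f"square={compactness(square):.3f}, strip={compactness(strip):.3f}")
```

    Because such a metric is a continuous number computed from the drawn outline, agreement between readers can be summarized with an ICC, whereas the categorical BI-RADS descriptors call for kappa.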

  5. Photon counting statistics analysis of biophotons from hands.

    PubMed

    Jung, Hyun-Hee; Woo, Won-Myung; Yang, Joon-Mo; Choi, Chunho; Lee, Jonghan; Yoon, Gilwon; Yang, Jong S; Soh, Kwang-Sup

    2003-05-01

    The photon counting statistics of biophotons emitted from hands is studied with a view to testing its agreement with the Poisson distribution. The moments of observed probability up to seventh order have been evaluated. The moments of biophoton emission from hands are in good agreement with the theoretical values of the Poisson distribution, while those of the dark counts of the photomultiplier tube show large deviations. The present results are consistent with the conventional delta-value analysis of the second moment of probability.
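
    The abstract above compares observed moments up to seventh order with Poisson values. The sketch below shows how such a comparison could be set up, assuming raw (non-central) moments; the abstract does not say which kind of moment was used. For a Poisson variable with mean λ, the n-th raw moment is the sum over k of S(n, k)·λ^k, where S(n, k) are Stirling numbers of the second kind. The function names and counts are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind, S(n, k)."""
    if n == k:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def poisson_raw_moment(lam, order):
    """Theoretical raw moment E[X**order] of a Poisson(lam) variable."""
    return sum(stirling2(order, k) * lam ** k for k in range(order + 1))


def sample_raw_moment(counts, order):
    """Empirical raw moment of a list of photon counts per time bin."""
    if not counts:
        raise ValueError("no counts supplied")
    return sum(c ** order for c in counts) / len(counts)


# Illustrative counts per bin (not measured data).
counts = [2, 0, 3, 1, 2, 4, 1, 3, 2, 2]
lam = sample_raw_moment(counts, 1)  # the sample mean estimates lambda
for order in range(1, 8):
    observed = sample_raw_moment(counts, order)
    expected = poisson_raw_moment(lam, order)
    print(order, round(observed, 2), round(expected, 2))
```

    Large gaps between the observed and expected columns at higher orders would indicate a departure from Poisson statistics, as the abstract reports for the photomultiplier dark counts.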

  6. The critical role of the routing scheme in simulating peak river discharge in global hydrological models

    NASA Astrophysics Data System (ADS)

    Zhao, F.; Veldkamp, T.; Frieler, K.; Schewe, J.; Ostberg, S.; Willner, S. N.; Schauberger, B.; Gosling, S.; Mueller Schmied, H.; Portmann, F. T.; Leng, G.; Huang, M.; Liu, X.; Tang, Q.; Hanasaki, N.; Biemans, H.; Gerten, D.; Satoh, Y.; Pokhrel, Y. N.; Stacke, T.; Ciais, P.; Chang, J.; Ducharne, A.; Guimberteau, M.; Wada, Y.; Kim, H.; Yamazaki, D.

    2017-12-01

    Global hydrological models (GHMs) have been applied to assess global flood hazards, but their capacity to capture the timing and amplitude of peak river discharge—which is crucial in flood simulations—has traditionally not been the focus of examination. Here we evaluate to what degree the choice of river routing scheme affects simulations of peak discharge and may help to provide better agreement with observations. To this end we use runoff and discharge simulations of nine GHMs forced by observational climate data (1971-2010) within the ISIMIP2a project. The runoff simulations were used as input for the global river routing model CaMa-Flood. The simulated daily discharge was compared to the discharge generated by each GHM using its native river routing scheme. For each GHM both versions of simulated discharge were compared to monthly and daily discharge observations from 1701 GRDC stations as a benchmark. CaMa-Flood routing shows a general reduction of peak river discharge and a delay of about two to three weeks in its occurrence, likely induced by the buffering capacity of floodplain reservoirs. For a majority of river basins, discharge produced by CaMa-Flood resulted in a better agreement with observations. In particular, maximum daily discharge was adjusted, with a multi-model averaged reduction in bias over about 2/3 of the analysed basin area. The increase in agreement was obtained in both managed and near-natural basins. Overall, this study demonstrates the importance of routing scheme choice in peak discharge simulation, where CaMa-Flood routing accounts for floodplain storage and backwater effects that are not represented in most GHMs. Our study provides important hints that an explicit parameterisation of these processes may be essential in future impact studies.

  7. Multidecadal Changes in the UTLS Ozone from the MERRA-2 Reanalysis and the GMI Chemistry Model

    NASA Technical Reports Server (NTRS)

    Wargan, Krzysztof; Orbe, Clara; Pawson, Steven; Ziemke, Jerald R.; Oman, Luke; Olsen, Mark; Coy, Lawrence; Knowland, Emma

    2018-01-01

    Long-term changes of ozone in the UTLS (Upper Troposphere / Lower Stratosphere) reflect the response to decreases in the stratospheric concentrations of ozone-depleting substances as well as changes in the stratospheric circulation induced by climate change. To date, studies of UTLS ozone changes and variability have relied mainly on satellite and in-situ observations as well as chemistry-climate model simulations. By comparison, the potential of reanalysis ozone data remains relatively untapped. This is despite evidence from recent studies, including detailed analyses conducted under the SPARC (Stratosphere-troposphere Processes And their Role in Climate) Reanalysis Intercomparison Project (S-RIP), that demonstrate that stratospheric ozone fields from modern atmospheric reanalyses exhibit good agreement with independent data while delineating issues related to inhomogeneities in the assimilated observations. In this presentation, we will explore the possibility of inferring long-term geographically and vertically resolved behavior of the lower stratospheric (LS) ozone from NASA's MERRA-2 (Modern-Era Retrospective analysis for Research and Applications, Version 2) reanalysis after accounting for the few known discontinuities and gaps in its assimilated input data. This work builds upon previous studies that have documented excellent agreement between MERRA-2 ozone and ozonesonde observations in the LS. Of particular importance is a relatively good vertical resolution of MERRA-2 allowing precise separation of tropospheric and stratospheric ozone contents. We also compare the MERRA-2 LS ozone results with the recently completed 37-year simulation produced using Goddard Earth Observing System in "replay" mode coupled with the GMI (Global Modeling Initiative) chemistry mechanism. Replay mode dynamically constrains the model with the MERRA-2 reanalysis winds, temperature, and pressure. 
We will emphasize the areas of agreement of the reanalysis and replay and interpret differences between them in the context of our increasing understanding of model transport driven by assimilated winds.

  8. Comparison of ionospheric profile parameters with IRI-2012 model over Jicamarca

    NASA Astrophysics Data System (ADS)

    Bello, S. A.; Abdullah, M.; Hamid, N. S. A.; Reinisch, B. W.

    2017-05-01

    We used the hourly ionogram data obtained from Jicamarca station (12° S, 76.9° W, dip latitude: 1.0° N), in the equatorial region, to study the variation of the electron density profile parameters: maximum height of F2-layer (hmF2), bottomside thickness (B0) and shape (B1) parameter of F-layer. The period of study is the year 2010 (solar minimum period). The diurnal monthly averages of these parameters are compared with the updated IRI-2012 model. The results show that hmF2 is higher during the daytime than at nighttime. The variation in hmF2 was observed to modulate the thickness of the bottomside F2-layer. The observed hmF2 and B0 post-sunset peak is a result of the upward drift velocity of ionospheric plasma. We found a close agreement between the IRI-CCIR hmF2 model and observed hmF2 during 0000-0700 LT, while outside this period the model predictions deviate significantly from the observational values. Significant discrepancies are observed between the IRI model options for B0 and the observed B0 values. Specifically, the modeled values do not show the B0 post-sunset peak. A fairly good agreement was observed between the observed B1 and IRI model options (ABT-2009 and Bil-2000) for B1.

  9. Polar Ozone Loss Rates: Comparison Of Match Observations With Simulations Of 3-D Chemical Transport Model And Box Model

    NASA Astrophysics Data System (ADS)

    Tripathi, O. P.; Godin-Beekmann, S.; Lefevre, F.; Marchand, M.; Pazmino, A.; Hauchecorne, A.

    2005-12-01

    Model simulations of ozone loss rates during recent Arctic and Antarctic winters are compared with the observed ozone loss rates from the match technique. Arctic winters 1994/1995, 1999/2000, 2002/2003 and the Antarctic winter 2003 were considered for the analysis. We use a high resolution chemical transport model MIMOSA-CHIM and REPROBUS box model for the calculation of ozone loss rates. Trajectory model calculations show that the ozone loss rates are dependent on the initialization fields. When chemical fields are initialized by UCAM (University of Cambridge SLIMCAT model simulated fields), the loss rates are underestimated by a factor of two, whereas when they are initialized by UL (University of Leeds) fields, the model loss rates are in very good agreement with match loss rates at lower levels. The study shows a very good agreement between MIMOSA-CHIM simulation and match observation in the 1999/2000 winter at both levels, 450 and 500 K, except for a slight underestimation in March at 500 K. In January, however, the agreement is very good. This is also true for 1994/1995 when we consider simulated ozone loss rate in view of the ECMWF wind deficiency, assuming that match observations were not made on isolated trajectories. Sensitivity tests performed for the Arctic winter 1999/2000, by changing the JCl2O2 value, particle number density and heating rates, show that we need to improve our understanding of particle number density and the heating rate calculation mechanism. Burkholder JCl2O2 has improved the comparison of MIMOSA-CHIM model results with observations (Tripathi et al., 2005). In the same study the comparison results were shown to improve by changing heating rates and number density through NAT particle sedimentation.

  10. EXTREMELY METAL-POOR STARS AND A HIERARCHICAL CHEMICAL EVOLUTION MODEL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Komiya, Yutaka

    2011-07-20

    Early phases of the chemical evolution of the Galaxy and formation history of extremely metal-poor (EMP) stars are investigated using hierarchical galaxy formation models. We build a merger tree of the Galaxy according to the extended Press-Schechter theory. We follow the chemical evolution along the tree and compare the model results to the metallicity distribution function and abundance ratio distribution of the Milky Way halo. We adopt three different initial mass functions (IMFs). In a previous study, we argued that the typical mass, M_md, of EMP stars should be high, M_md ≈ 10 M_sun, based on studies of binary origin carbon-rich EMP stars. In this study, we show that only the high-mass IMF can explain an observed small number of EMP stars. For relative element abundances, the high-mass IMF and the Salpeter IMF predict similar distributions. We also investigate dependence on nucleosynthetic yields of supernovae (SNe). The theoretical SN yields by Kobayashi et al. and Chieffi and Limongi show reasonable agreement with observations for α-elements. Our model predicts a significant scatter of element abundances at [Fe/H] < -3. We adopted the stellar yields derived in the work of Francois et al., which produce the best agreement between the observational data and the one-zone chemical evolution model. Their yields well reproduce a trend of the averaged abundances of EMP stars but predict much larger scatter than do the observations. The model with hypernovae predicts Zn abundance, in agreement with the observations, but other models predict lower [Zn/Fe]. Ejecta from the hypernovae with large explosion energy is mixed in large mass and decreases the scatter of the element abundances.

  11. Intra- and interobserver agreement for fetal cerebral measurements in 3D-ultrasonography.

    PubMed

    Albers, Maria E W A; Buisman, Erato T I A; Kahn, René S; Franx, Arie; Onland-Moret, N Charlotte; de Heus, Roel

    2018-04-10

    The aim of this study is to evaluate intra- and interobserver agreement for measurement of intracranial, cerebellar, and thalamic volume with the Virtual Organ Computer-aided AnaLysis (VOCAL) technique in three-dimensional ultrasound images, in comparison to two-dimensional measurements of these brain structures. Three-dimensional ultrasound images of the brains of 80 fetuses at 20-24 weeks' gestational age were obtained from YOUth, a Dutch prospective cohort study. Two observers performed offline measurement of the occipitofrontal diameter, intracranial volume, transcerebellar diameter, cerebellar volume, and thalamic width, area, and volume, independently. VOCAL was used for calculation of the volumes. The two-way random, single measures intraclass correlation coefficient (ICC) was used for analysis of agreement and Bland-Altman plots were constructed. Intra- and interobserver agreement was almost perfect for occipitofrontal diameter (intra ICC 0.88, 95% CI 0.82-0.92; inter ICC 0.91, 95% CI 0.85-0.94), intracranial volume (intra ICC 0.96, 95% CI 0.91-0.98; inter ICC 0.97, 95% CI 0.96-0.98) and transcerebellar diameter (intra ICC 0.91, 95% CI 0.86-0.94; inter ICC 0.86, 95% CI 0.78-0.91). For cerebellar volume, the intraobserver agreement was almost perfect (0.85, 95% CI 0.76-0.90), whereas the interobserver agreement was substantial (0.75, 95% CI 0.44-0.88). Agreement was only moderate for thalamic measurements. Bland-Altman plots for the volume measurements are normally distributed with acceptable mean differences and 95% limits of agreement. The intra- and interobserver agreement of the measurement of intracranial and cerebellar volume with VOCAL was almost perfect. These measurements are therefore reliable, and can be used to investigate fetal brain development. Thalamic measurements are not reliable enough. © 2018 Wiley Periodicals, Inc.

  12. Agreement among Magnetic Resonance Imaging/Magnetic Resonance Cholangiopancreatography (MRI-MRCP) and Endoscopic Ultrasound (EUS) in the evaluation of morphological features of Branch Duct Intraductal Papillary Mucinous Neoplasm (BD-IPMN).

    PubMed

    Uribarri-Gonzalez, Laura; Keane, Margaret G; Pereira, Stephen P; Iglesias-García, Julio; Dominguez-Muñoz, J Enrique; Lariño-Noia, Jose

    2018-03-01

    To evaluate the agreement between the imaging modalities MRI-MRCP and EUS in cystic lesions of the pancreas which were thought to be a BD-IPMN. Multicenter retrospective study included all patients between 2010 and 2015 with a suspected BD-IPMN who underwent an EUS and MRI-MRCP within 6 months or less of each other. Location, number, size, worrisome features and high-risk stigmata were evaluated. Interobserver agreement was evaluated by Kappa score. 173 patients were included (97 UHSC, 76 UCLH-RFH), mean age 65 (range 25-87 years), 66 males. When comparing both modalities there was good agreement for the location of the cyst. The median lesion size was larger by MRI-MRCP than EUS although it was not significant. With regards to worrisome features, there was moderate agreement for main PD of 5-9 mm and abrupt change (k = 0.45 and 0.52). Fair agreement was seen for the cyst wall thickening (k = 0.25). No agreement was seen between the presence of non-enhanced mural nodules or lymphadenopathy (k < 0). With regards to high-risk stigmata, poor agreement was obtained for the detection of an enhanced solid component (k = 0.12). No agreement was observed for main PD > 10 mm (k < 0). In this multicentre study of patients with a BD-IPMN under active surveillance, most disagreement between these modalities was seen in the proximal pancreas. There was generally only minimal concordance between the imaging findings of EUS and MRI-MRCP for the detection of high-risk stigmata and worrisome features. Copyright © 2018 IAP and EPC. All rights reserved.

  13. Applanation tonometry: interobserver and prism agreement using the reusable Goldmann applanation prism and the Tonosafe disposable prism.

    PubMed

    Ajtony, Csilla; Elkarmouty, Ahmed; Barton, Keith; Kotecha, Aachal

    2016-06-01

    To evaluate the levels of agreement between the standard reusable prism and a disposable prism, and to examine the agreement between ophthalmologists, nursing and technical staff when measuring intraocular pressure (IOP) using the Goldmann applanation tonometer. Three hundred eyes of 300 patients were recruited. IOP measurements were made in a randomised order by three observer groups consisting of ophthalmologists and ophthalmic technicians/nurses taken from a pool of clinicians working within a busy outpatient clinic. Agreement was calculated by Bland-Altman analysis, showing the mean difference and 95% limits of agreement (LoA) of measurements. The mean difference between the reusable and disposable prism IOP measurements was <0.5 mm Hg. The LoA ranged from ±3.1 to ±4.9 mm Hg, depending on the observer group. The interobserver variability was <1 mm Hg across all observer groups; the LoA was slightly higher for observers using the reusable prism (range between ±4.3 and ±5.6 mm Hg) compared with using the disposable prism (range between ±3.7 and ±5.4 mm Hg) across observer groups. There is an acceptable agreement between IOP measurements made with the reusable Goldmann tonometer prism and the disposable Tonosafe prism. Interobserver variability in IOP measurements within an outpatient setting is larger than that found within a research setting, and may be of a level that impacts on clinical decision-making. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
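
The Bland-Altman summary described in this record (mean difference plus 95% limits of agreement) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `bland_altman` and the IOP readings are hypothetical and are not taken from the study, and the limits use the conventional mean ± 1.96 × SD of the paired differences.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower LoA, upper LoA) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)                # mean difference between the two methods
    half_width = 1.96 * stdev(diffs)  # 1.96 x sample SD of the differences
    return bias, bias - half_width, bias + half_width


# Hypothetical paired IOP readings in mm Hg (illustrative values only).
reusable = [14.0, 16.0, 18.0, 20.0, 22.0]
disposable = [13.0, 17.0, 18.0, 19.0, 23.0]

bias, lower, upper = bland_altman(reusable, disposable)
print(f"bias={bias:.2f} mm Hg, LoA=({lower:.2f}, {upper:.2f})")
```

A bias near zero with wide limits, as in the record above, means the two prisms agree on average while individual paired readings can still differ by several mm Hg.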

  14. Observer agreement in the reporting of knee and lumbar spine magnetic resonance (MR) imaging examinations: selectively trained MR radiographers and consultant radiologists compared with an index radiologist.

    PubMed

    Brealey, S; Piper, K; King, D; Bland, M; Caddick, J; Campbell, P; Gibbon, A; Highland, A; Jenkins, N; Petty, D; Warren, D

    2013-10-01

    To assess agreement between trained radiographers and consultant radiologists compared with an index radiologist when reporting on magnetic resonance imaging (MRI) examinations of the knee and lumbar spine and to examine the subsequent effect of discordant reports on patient management and outcome. At York Hospital two MR radiographers, two consultant radiologists and an index radiologist reported on a prospective, random sample of 326 MRI examinations. The radiographers reported in clinical practice conditions and the radiologists during clinical practice. An independent consultant radiologist compared these reports with the index radiologist report for agreement. Orthopaedic surgeons then assessed whether the discordance between reports was clinically important. Overall observer agreement with the index radiologist was comparable between observers and ranged from 54% to 58%; for the knee it was 46-57% and for the lumbar spine it was 56-66%. There was a very small observed difference of 0.6% (95% CI -11.9 to 13.0) in mean agreement between the radiographers and radiologists (P=0.860). For the knee, lumbar spine and overall, radiographers' discordant reports, when compared with the index radiologist, were less likely to have a clinically important effect on patient outcome than the radiologists' discordant reports. Less than 10% of observers' reports were sufficiently discordant with the index radiologist's reports to be clinically important. Carefully selected MR radiographers with postgraduate education and training reported in clinical practice conditions on specific MRI examinations of the knee and lumbar spine to a level of agreement comparable with non-musculoskeletal consultant radiologists. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  15. Intra- and interobserver reliability estimates for identification and grading of upper respiratory tract abnormalities recorded in horses at rest and during overground endoscopy.

    PubMed

    McGivney, C L; Sweeney, J; David, F; O'Leary, J M; Hill, E W; Katz, L M

    2017-07-01

    Previous studies support good intra- and interobserver agreements for endoscopic evaluation of various upper respiratory tract (URT) diseases in horses. However, these studies mainly assessed resting endoscopic examination videos and/or focussed on a single URT abnormality. To estimate intra- and interobserver agreement for identification and grading of all URT abnormalities from resting and overground endoscopy (OGE) videos of Thoroughbreds. Blinded, fully crossed design. Resting and OGE URT videos for n = 43 Thoroughbreds were retrospectively chosen based on identification of common URT disorders. The videos were randomly evaluated in duplicate by 4 raters blinded to all information including prior URT disorder(s) diagnosis. Abnormalities were graded using well-described ordinal scales. Intra- and interobserver agreements were estimated using Cohen's weighted κ and Krippendorff's α, respectively. Intraobserver agreement was perfect/nearly perfect for arytenoid symmetry at exercise, epiglottic entrapment and epiglottic retroversion, substantial for arytenoid asymmetry at rest, palatal dysfunction (PD), medial deviation of the aryepiglottic folds (MDAF), pharyngeal mucus and epiglottic grade at exercise and moderate for vocal fold collapse (VFC), ventromedial luxation of the apex of the corniculate process of the arytenoid (VLAC), nasopharyngeal collapse (NPC) and epiglottic grade at rest. Interobserver agreement was substantial for arytenoid symmetry at exercise and PD and moderate for arytenoid asymmetry at rest, MDAF, VLAC and epiglottic entrapment. It was only fair for VFC, epiglottic grade at exercise, epiglottic retroversion, pharyngeal mucus and NPC and poor for epiglottic grade at rest. Sample size was insufficient to allow assessment of the effect of one abnormality on the grading of another abnormality. Observers were consistent in grading URT disorders. 
However, significant disparity in grading existed between observers for some conditions affecting reliability. © 2016 EVJ Ltd.

  16. Defining intractability: comparisons among published definitions.

    PubMed

    Berg, Anne T; Kelly, Molly M

    2006-02-01

    Intractable epilepsy is the focus of much research; however, this concept is defined in no single way. Individual studies use different definitions, creating difficulties for comparisons of results across studies. A head-to-head comparison of definitions could highlight these differences and motivate the development of consensus guidelines. Within a single prospective study of 613 children in Connecticut with newly diagnosed epilepsy (1993-1997), six different published definitions or indicators for intractability were applied and compared. All definitions were assessed at various times within the first 5 years after diagnosis, with the exact timing reflecting how they were used in their initial reports. Observed and chance-adjusted agreement (kappa) were computed. The associations of each definition with remission status 7-10 years after diagnosis were quantified with a relative risk. Depending on the specific definition, the epilepsy of 9-24% of children was considered intractable. Observed agreements among the definitions ranged from a low of 0.83 to a high of 0.96. Kappas ranged from a low of 0.45 to a high of 0.79. More similar definitions had higher levels of agreement. All definitions were strongly associated with remission status as of last follow-up. Agreement among the different definitions is strong but imperfect. All definitions were significantly associated with longer-term outcome. No single preferred definition of intractable epilepsy exists. Some discussion within the field of epilepsy and a consensus process should be considered as a future step for enhancing comparability of research efforts and clinical guidelines. Consideration should be given to whether a single definition will suit all purposes or whether different types of definitions are needed for different purposes.
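
The gap this record reports between high observed agreement (0.83-0.96) and lower kappas (0.45-0.79) follows from how chance-adjusted agreement is defined. The sketch below computes both quantities for two binary classifications; the function name `cohen_kappa` and the data are hypothetical illustrations, not values from the study.

```python
def cohen_kappa(r1, r2):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    n = len(r1)
    # Proportion of items on which the two classifications agree.
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    # Agreement expected by chance from each classification's marginal frequencies.
    categories = set(r1) | set(r2)
    p_exp = sum((r1.count(c) / n) * (r2.count(c) / n) for c in categories)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return p_obs, (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical: two definitions applied to 10 children (1 = intractable).
definition_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
definition_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

p_obs, kappa = cohen_kappa(definition_a, definition_b)
print(f"observed agreement={p_obs:.2f}, kappa={kappa:.3f}")
```

When one category is rare, as with a 9-24% prevalence of intractability, most agreement falls on the common category and chance agreement is high, so kappa sits well below the raw agreement proportion.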

  17. Telecytology: Is it possible with smartphone images?

    PubMed

    Sahin, Davut; Hacisalihoglu, Uguray Payam; Kirimlioglu, Saime Hale

    2018-01-01

    This study aimed to discuss smartphone usage in telecytology and determine intraobserver concordance between microscopic cytopathological diagnoses and diagnoses derived via static smartphone images. The study was conducted with 172 cytologic specimens. A pathologist captured static images of the cytology slides from the ocular lens of a microscope using a smartphone. The images were transferred via WhatsApp® to a cytopathologist working in another center who had made all the microscopic cytopathological diagnoses 5-27 months earlier. The cytopathologist diagnosed images on a smartphone without knowledge of their previous microscopic diagnoses. The Kappa agreement between microscopic cytopathological diagnoses and smartphone image diagnoses was determined. The average image capturing, transfer, and remote cytopathological diagnostic time for one case was 6.20 minutes. The percentage of cases whose microscopic and smartphone image diagnoses were concordant was 84.30%, and the percentage of those whose diagnoses were discordant was 15.69%. The highest Kappa agreement was observed in endoscopic ultrasound-guided fine needle aspiration (1.000), and the lowest agreement was observed in urine cytology (0.665). Patient management changed with smartphone image diagnoses in 11.04% of cases. This study showed that easy, fast, and high-quality image capturing and transfer is possible from cytology slides using smartphones. The intraobserver Kappa agreement between the microscopic cytopathological diagnoses and remote smartphone image diagnoses was high. It was found that remote diagnosis due to difficulties in telecytology might change patient management. The developments in smartphone camera technology and transfer software make them efficient telepathology and telecytology tools. © 2017 Wiley Periodicals, Inc.

  18. Acoustic solitons in waveguides with Helmholtz resonators: transmission line approach.

    PubMed

    Achilleos, V; Richoux, O; Theocharis, G; Frantzeskakis, D J

    2015-02-01

    We report experimental results and study theoretically soliton formation and propagation in an air-filled acoustic waveguide side loaded with Helmholtz resonators. We propose a theoretical modeling of the system, which relies on a transmission-line approach, leading to a nonlinear dynamical lattice model. The latter allows for an analytical description of the various soliton solutions for the pressure, which are found by means of dynamical systems and multiscale expansion techniques. These solutions include Boussinesq-like and Korteweg-de Vries pulse-shaped solitons that are observed in the experiment, as well as nonlinear Schrödinger envelope solitons, that are predicted theoretically. The analytical predictions are in excellent agreement with direct numerical simulations and in qualitative agreement with the experimental observations.

  19. Quasiparticle Energy in a Strongly Interacting Homogeneous Bose-Einstein Condensate.

    PubMed

    Lopes, Raphael; Eigen, Christoph; Barker, Adam; Viebahn, Konrad G H; Robert-de-Saint-Vincent, Martin; Navon, Nir; Hadzibabic, Zoran; Smith, Robert P

    2017-05-26

    Using two-photon Bragg spectroscopy, we study the energy of particlelike excitations in a strongly interacting homogeneous Bose-Einstein condensate, and observe dramatic deviations from Bogoliubov theory. In particular, at large scattering length a the shift of the excitation resonance from the free-particle energy changes sign from positive to negative. For an excitation with wave number q, this sign change occurs at a≈4/(πq), in agreement with the Feynman energy relation and the static structure factor expressed in terms of the two-body contact. For a≳3/q we also see a breakdown of this theory, and better agreement with calculations based on the Wilson operator product expansion. Neither theory explains our observations across all interaction regimes, inviting further theoretical efforts.

  20. The Proposed Anti-Counterfeiting Trade Agreement: Background and Key Issues

    DTIC Science & Technology

    2010-03-12

    or is negotiating regional and bilateral FTAs with several of the ACTA participants. Some observers have questioned why countries designated in...Notably, China, India, and Brazil currently are not participants to this agreement. Negotiations on the agreement are ongoing; participants have expressed...bilateral free trade agreements (FTAs). In negotiating recent FTAs, the USTR frequently has sought levels of protection that exceed the minimum standards

  1. Concordance between local, institutional, and central pathology review in glioblastoma: implications for research and practice: a pilot study.

    PubMed

    Gupta, Tejpal; Nair, Vimoj; Epari, Sridhar; Pietsch, Torsten; Jalali, Rakesh

    2012-01-01

    There is significant inter-observer variation amongst the neuro-pathologists in the typing, subtyping, and grading of glial neoplasms for diagnosis. Centralized pathology review has been proposed to minimize this inter-observer variation and is now almost mandatory for accrual into multicentric trials. We sought to assess the concordance between neuro-pathologists on histopathological diagnosis of glioblastoma. Comparison of local, institutional, and central neuro-oncopathology reporting in a cohort of 34 patients with newly diagnosed supratentorial glioblastoma accrued consecutively at a tertiary-care institution on a prospective trial testing the addition of a new agent to standard chemo-radiation regimen. Concordance was sub-optimal between local histological diagnosis and central review, fair between local diagnosis and institutional review, and good between institutional and central review, with respect to histological typing/subtyping. Twelve (39%) of 31 patients with local histological diagnosis had identical tumor type, subtype and grade on central review. Overall agreement was modestly better (52%) between local diagnosis and institutional review. In contrast, 28 (83%) of 34 patients had completely concordant histopathologic diagnosis between institutional and central review. The inter-observer reliability test showed poor agreement between local and central review (kappa statistic=0.12, 95% confidence interval (CI): -0.03-0.32, P=0.043), but moderate agreement between institutional and central review (kappa statistic=0.51, 95%CI: 0.17-0.84, P=0.00003). Agreement between local diagnosis and institutional review was fair. There exists significant inter-observer variation regarding histopathological diagnosis of glioblastoma with significant implications for clinical research and practice. There is a need for more objective, quantitative, robust, and reproducible criteria for better subtyping for accurate diagnosis.

  2. Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading

    PubMed Central

    Ellis, Ian O.; Green, Andrew R.; Hanka, Rudolf

    2008-01-01

    Background We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. Methodology/Principal Findings We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. Conclusions/Significance Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies. 
PMID:18698346
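
The first method in this record, a non-chance-corrected score based on the proportion of a rater's grades that match the per-sample consensus, can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: the nested-dictionary layout, the names `consensus_grades` and `agreement_scores`, and the toy grades are all hypothetical, the consensus is taken to be the modal grade, and ties are broken by first occurrence. Missing ratings are simply absent, so each rater is scored only on the samples they graded.

```python
from collections import Counter


def consensus_grades(ratings):
    """Modal grade per sample; ratings maps sample -> {rater: grade}."""
    return {
        sample: Counter(by_rater.values()).most_common(1)[0][0]
        for sample, by_rater in ratings.items()
    }


def agreement_scores(ratings):
    """Per rater: share of their own ratings that match the sample consensus."""
    consensus = consensus_grades(ratings)
    hits, totals = Counter(), Counter()
    for sample, by_rater in ratings.items():
        for rater, grade in by_rater.items():
            totals[rater] += 1
            hits[rater] += grade == consensus[sample]
    return {rater: hits[rater] / totals[rater] for rater in totals}


# Hypothetical grades on a 1-3 scale; rater C did not grade sample s2.
ratings = {
    "s1": {"A": 1, "B": 1, "C": 2},
    "s2": {"A": 3, "B": 3},
    "s3": {"A": 1, "B": 2, "C": 2},
}

scores = agreement_scores(ratings)
print(scores)
```

Because each score depends on which samples a rater happened to see, it can be biased upwards for raters given relatively easy samples, which is the limitation the record's Bayesian latent trait model is meant to address.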

  3. The definition of polytrauma: variable interrater versus intrarater agreement--a prospective international study among trauma surgeons.

    PubMed

    Butcher, Nerida E; Enninghorst, Natalie; Sisak, Krisztian; Balogh, Zsolt J

    2013-03-01

    The international trauma community has recognized the lack of a validated consensus definition of "polytrauma." We hypothesized that using a subjective definition, trauma surgeons will not have substantial agreement; thus, an objective definition is needed. A prospective observational study was conducted between December 2010 and June 2011 (John Hunter Hospital, Level I trauma center). Inclusion criteria were all trauma call patients with subsequent intensive care unit admission. The study was composed of four stages as follows: (1) four trauma surgeons assessed patients until 24 hours, then coded as either "yes" or "no" for polytrauma, and results compared for agreement; (2) eight trauma surgeons representing the United States, Germany, and the Netherlands graded the same prospectively assessed patients and coded as either "yes" or "no" for polytrauma; (3) 12 months later, the original four trauma surgeons repeated assessment via data sheets to test intrarater variability; and (4) individual subjective definitions were compared with three anatomic scores, namely, (a) Injury Severity Score (ISS) of greater than 15, (b) ISS of greater than 17, and (c) Abbreviated Injury Scale (AIS) score of greater than 2 in at least two ISS body regions. A total of 52 trauma patients were included. Results for each stage were as follows: (1) κ score of 0.50, moderate agreement; (2) κ score of 0.41, moderate agreement; (3) Rater 1 had moderate intrarater agreement (κ score, 0.59), while Raters 2, 3, 4 had substantial intrarater agreement (κ scores, 0.75, 0.66, and 0.71, respectively); and (4) none had most agreement with ISS of greater than 15 (κ score, 0.16), while both definitions ISS greater than 17 and Abbreviated Injury Scale (AIS) score of greater than 2 in at least two ISS body regions had on average fair agreement (κ scores, 0.27 and 0.39, respectively). 
Based on subjective assessments, trauma surgeons do not agree on the definition of polytrauma, with the subjective definition differing both within and across institutions.

  4. A combined pulmonary-radiology workshop for visual evaluation of COPD: study design, chest CT findings and concordance with quantitative evaluation.

    PubMed

    Barr, R Graham; Berkowitz, Eugene A; Bigazzi, Francesca; Bode, Frederick; Bon, Jessica; Bowler, Russell P; Chiles, Caroline; Crapo, James D; Criner, Gerard J; Curtis, Jeffrey L; Dass, Chandra; Dirksen, Asger; Dransfield, Mark T; Edula, Goutham; Erikkson, Leif; Friedlander, Adam; Galperin-Aizenberg, Maya; Gefter, Warren B; Gierada, David S; Grenier, Philippe A; Goldin, Jonathan; Han, MeiLan K; Hanania, Nicola A; Hansel, Nadia N; Jacobson, Francine L; Kauczor, Hans-Ulrich; Kinnula, Vuokko L; Lipson, David A; Lynch, David A; MacNee, William; Make, Barry J; Mamary, A James; Mann, Howard; Marchetti, Nathaniel; Mascalchi, Mario; McLennan, Geoffrey; Murphy, James R; Naidich, David; Nath, Hrudaya; Newell, John D; Pistolesi, Massimo; Regan, Elizabeth A; Reilly, John J; Sandhaus, Robert; Schroeder, Joyce D; Sciurba, Frank; Shaker, Saher; Sharafkhaneh, Amir; Silverman, Edwin K; Steiner, Robert M; Strange, Charlton; Sverzellati, Nicola; Tashjian, Joseph H; van Beek, Edwin J R; Washington, Lacey; Washko, George R; Westney, Gloria; Wood, Susan A; Woodruff, Prescott G

    2012-04-01

    The purposes of this study were: to describe chest CT findings in normal non-smoking controls and cigarette smokers with and without COPD; to compare the prevalence of CT abnormalities with severity of COPD; and to evaluate concordance between visual and quantitative chest CT (QCT) scoring. Volumetric inspiratory and expiratory CT scans of 294 subjects, including normal non-smokers, smokers without COPD, and smokers with GOLD Stage I-IV COPD, were scored at a multi-reader workshop using a standardized worksheet. There were 58 observers (33 pulmonologists, 25 radiologists); each scan was scored by 9-11 observers. Interobserver agreement was calculated using kappa statistic. Median score of visual observations was compared with QCT measurements. Interobserver agreement was moderate for the presence or absence of emphysema and for the presence of panlobular emphysema; fair for the presence of centrilobular, paraseptal, and bullous emphysema subtypes and for the presence of bronchial wall thickening; and poor for gas trapping, centrilobular nodularity, mosaic attenuation, and bronchial dilation. Agreement was similar for radiologists and pulmonologists. The prevalence on CT readings of most abnormalities (e.g. emphysema, bronchial wall thickening, mosaic attenuation, expiratory gas trapping) increased significantly with greater COPD severity, while the prevalence of centrilobular nodularity decreased. Concordances between visual scoring and quantitative scoring of emphysema, gas trapping and airway wall thickening were 75%, 87% and 65%, respectively. Despite substantial inter-observer variation, visual assessment of chest CT scans in cigarette smokers provides information regarding lung disease severity; visual scoring may be complementary to quantitative evaluation.

  5. 3-D microphysical model studies of Arctic denitrification: comparison with observations

    NASA Astrophysics Data System (ADS)

    Davies, S.; Mann, G. W.; Carslaw, K. S.; Chipperfield, M. P.; Kettleborough, J. A.; Santee, M. L.; Oelhaf, H.; Wetzel, G.; Sasano, Y.; Sugita, T.

    2005-11-01

    Simulations of Arctic denitrification using a 3-D chemistry-microphysics transport model are compared with observations for the winters 1994/95, 1996/97 and 1999/2000. The model of Denitrification by Lagrangian Particle Sedimentation (DLAPSE) couples the full chemical scheme of the 3-D chemical transport model, SLIMCAT, with a nitric acid trihydrate (NAT) growth and sedimentation scheme. We use observations from the Microwave Limb Sounder (MLS) and Improved Limb Atmospheric Sounder (ILAS) satellite instruments, the balloon-borne Michelsen Interferometer for Passive Atmospheric Sounding (MIPAS-B), and the in situ NOy instrument on-board the ER-2. As well as directly comparing model results with observations, we also assess the extent to which these observations are able to validate the modelling approach taken. For instance, in 1999/2000 the model captures the temporal development of denitrification observed by the ER-2 from late January into March. However, in this winter the vortex was already highly denitrified by late January so the observations do not provide a strong constraint on the modelled rate of denitrification. The model also reproduces the MLS observations of denitrification in early February 2000. In 1996/97 the model captures the timing and magnitude of denitrification as observed by ILAS, although the lack of observations north of ~67° N in the beginning of February make it difficult to constrain the actual timing of onset. The comparison for this winter does not support previous conclusions that denitrification must be caused by an ice-mediated process. In 1994/95 the model notably underestimates the magnitude of denitrification observed during a single balloon flight of the MIPAS-B instrument. Agreement between model and MLS HNO3 at 68 hPa in mid-February 1995 is significantly better. 
Sensitivity tests show that a 1.5 K overall decrease in vortex temperatures, or a factor 4 increase in assumed NAT nucleation rates, produce the best statistical fit to MLS observations. Both adjustments would be required to bring the model into agreement with the MIPAS-B observations. The agreement between the model and observations suggests that a NAT-only denitrification scheme (without ice), which was discounted by previous studies, must now be considered as one mechanism for the observed Arctic denitrification. The timing of onset and the rate of denitrification remain poorly constrained by the available observations.

  6. 3-D microphysical model studies of Arctic denitrification: comparison with observations

    NASA Astrophysics Data System (ADS)

    Davies, S.; Mann, G. W.; Carslaw, K. S.; Chipperfield, M. P.; Kettleborough, J. A.; Santee, M. L.; Oelhaf, H.; Wetzel, G.; Sasano, Y.; Sugita, T.

    2005-01-01

    Simulations of Arctic denitrification using a 3-D chemistry-microphysics transport model are compared with observations for the winters 1994/1995, 1996/1997 and 1999/2000. The model of Denitrification by Lagrangian Particle Sedimentation (DLAPSE) couples the full chemical scheme of the 3-D chemical transport model, SLIMCAT, with a nitric acid trihydrate (NAT) growth and sedimentation scheme. We use observations from the Microwave Limb Sounder (MLS) and Improved Limb Atmospheric Sounder (ILAS) satellite instruments, the balloon-borne Michelsen Interferometer for Passive Atmospheric Sounding (MIPAS-B), and the in situ NOy instrument on-board the ER-2. As well as directly comparing model results with observations, we also assess the extent to which these observations are able to validate the modelling approach taken. For instance, in 1999/2000 the model captures the temporal development of denitrification observed by the ER-2 from late January into March. However, in this winter the vortex was already highly denitrified by late January so the observations do not provide a strong constraint on the modelled rate of denitrification. The model also reproduces the MLS observations of denitrification in early February 2000. In 1996/1997 the model captures the timing and magnitude of denitrification as observed by ILAS, although the lack of observations north of ~67° N makes it difficult to constrain the actual timing of onset. The comparison for this winter does not support previous conclusions that denitrification must be caused by an ice-mediated process. In 1994/1995 the model notably underestimates the magnitude of denitrification observed during a single balloon flight of the MIPAS-B instrument. Agreement between model and MLS HNO3 at 68 hPa in mid-February 1995 was significantly better. Sensitivity tests show that a 1.5 K overall decrease in vortex temperatures or a factor of 4 increase in assumed NAT nucleation rates produces the best statistical fit to MLS observations. Both adjustments would be required to bring the model into agreement with the MIPAS-B observations. The agreement between the model and observations suggests that a NAT-only denitrification scheme (without ice), which was discounted by previous studies, must now be considered as one mechanism for the observed Arctic denitrification. The timing of onset and the rate of denitrification remain poorly constrained by the available observations.

  7. Quality of pharmaceutical care at the pharmacy counter: patients' experiences versus video observation.

    PubMed

    Koster, Ellen S; Blom, Lyda; Overbeeke, Marloes R; Philbert, Daphne; Vervloet, Marcia; Koopman, Laura; van Dijk, Liset

    2016-01-01

    Consumer Quality Index questionnaires are used to assess quality of care from patients' experiences. This study aimed to provide insight into the agreement about quality of pharmaceutical care as measured both by a patient questionnaire and by video observations. Pharmaceutical encounters in four pharmacies were video-recorded. After the encounter, patients completed a questionnaire based upon the Consumer Quality Index Pharmaceutical Care, containing questions about patients' experiences regarding information provision, medication counseling, and pharmacy staff's communication style. An observation protocol was used to code the recorded encounters. Agreement between video observation and patients' experiences was calculated. In total, 109 encounters were included for analysis. For the domains "medication counseling" and "communication style", agreement between patients' experiences and observations was very high (>90%). Less agreement (45%) was found for "information provision", which was rated more positively by patients than in the observations, especially for the topic of encouraging patients' questioning behavior. A questionnaire is useful to assess the quality of medication counseling and pharmacy staff's communication style, but might be less suitable to evaluate information provision and pharmacy staff's encouragement of patients' questioning behavior. Although patients may believe that they have received all necessary information to use their new medicine, some information on specific instructions was not addressed during the encounter. When using questionnaires to get insight into information provision, observations of encounters are very informative to validate the patient questionnaires and make necessary adjustments.

  8. Agreement between coding schemas used to identify bleeding-related hospitalizations in claims analyses of nonvalvular atrial fibrillation patients.

    PubMed

    Coleman, Craig I; Vaitsiakhovich, Tatsiana; Nguyen, Elaine; Weeda, Erin R; Sood, Nitesh A; Bunz, Thomas J; Schaefer, Bernhard; Meinecke, Anna-Katharina; Eriksson, Daniel

    2018-01-01

    Schemas to identify bleeding-related hospitalizations in claims data differ in billing codes used and coding positions allowed. We assessed agreement across bleeding-related hospitalization coding schemas for claims analyses of nonvalvular atrial fibrillation (NVAF) patients on oral anticoagulation (OAC). We hypothesized that prior coding schemas used to identify bleeding-related hospitalizations in claim database studies would provide varying levels of agreement in incidence rates. Within MarketScan data, we identified adults newly started on OAC for NVAF from January 2012 to June 2015. Billing code schemas developed by Cunningham et al., the US Food and Drug Administration (FDA) Mini-Sentinel program, and Yao et al. were used to identify bleeding-related hospitalizations as a surrogate for major bleeding. Bleeds were subcategorized as intracranial hemorrhage (ICH), gastrointestinal (GI), or other. Schema agreement was assessed by comparing incidence, rates of events/100 person-years (PYs), and Cohen's kappa statistic. We identified 151 738 new users of OAC with NVAF (CHA2DS2-VASc score = 3 [interquartile range = 2-4] and median HAS-BLED score = 3 [interquartile range = 2-3]). The Cunningham, FDA Mini-Sentinel, and Yao schemas identified any bleeding-related hospitalizations in 1.87% (95% confidence interval [CI]: 1.81-1.94), 2.65% (95% CI: 2.57-2.74), and 4.66% (95% CI: 4.55-4.76) of patients (corresponding rates = 3.45, 4.90, and 8.65 events/100 PYs). Kappa agreement across schemas was weak-to-moderate (κ = 0.47-0.66) for any bleeding hospitalization. Near-perfect agreement (κ = 0.99) was observed with the FDA Mini-Sentinel and Yao schemas for ICH-related hospitalizations, but agreement was weak when comparing Cunningham to FDA Mini-Sentinel or Yao (κ = 0.52-0.53). FDA Mini-Sentinel and Yao agreement was moderate (κ = 0.62) for GI bleeding, but agreement was weak when comparing Cunningham to FDA Mini-Sentinel or Yao (κ = 0.44-0.56). For other bleeds, agreement across schemas was minimal (κ = 0.14-0.38). We observed varying levels of agreement among 3 bleeding-related hospitalization schemas in NVAF patients. © 2018 Wiley Periodicals, Inc.
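
    The Cohen's kappa statistic used above to compare coding schemas can be sketched as follows. This is a minimal illustration, not the study's code: the function name and the two flag lists are hypothetical, with 1 meaning "flagged as a bleeding-related hospitalization" by a given schema.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    # Observed agreement: share of items given the same label by both sources.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        return float("nan")  # both sources use one identical label; kappa is undefined
    return (p_o - p_e) / (1 - p_e)


# Hypothetical flags for ten patients under two schemas (illustrative only).
schema_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
schema_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(schema_a, schema_b)
print(f"kappa = {kappa:.3f}")
```

    In this toy data the schemas agree on 9 of 10 patients, yet kappa is noticeably lower than 0.9 because most of that agreement is expected by chance when events are rare.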

  9. The angular structure of jet quenching within a hybrid strong/weak coupling model

    NASA Astrophysics Data System (ADS)

    Casalderrey-Solana, Jorge; Gulhan, Doga Can; Milhano, José Guilherme; Pablos, Daniel; Rajagopal, Krishna

    2017-08-01

    Building upon the hybrid strong/weak coupling model for jet quenching, we incorporate and study the effects of transverse momentum broadening and medium response of the plasma to jets on a variety of observables. For inclusive jet observables, we find little sensitivity to the strength of broadening. To constrain those dynamics, we propose new observables constructed from ratios of differential jet shapes, in which particles are binned in momentum, which are sensitive to the in-medium broadening parameter. We also investigate the effect of the back-reaction of the medium on the angular structure of jets as reconstructed with different cone radii R. Finally we provide results for the so-called "missing-pt", finding a qualitative agreement between our model calculations and data in many respects, although a quantitative agreement is beyond our simplified treatment of the hadrons originating from the hydrodynamic wake.

  10. Prevalence of Metabolic Syndrome in Type 2 Diabetes Mellitus Using NCEP-ATPIII, IDF and WHO Definition and Its Agreement in Gwalior Chambal Region of Central India

    PubMed Central

    Yadav, Dhananjay; Mahajan, Sunil; Subramanian, Senthil K.; Bisen, Prakash Singh; Chung, Choon Hee; Prasad, GBKS

    2013-01-01

    The aim of the study was to determine the prevalence of metabolic syndrome (MetS) in people with type 2 diabetes mellitus (T2DM). The National Cholesterol Education Program (NCEP) ATPIII criteria, International Diabetes Federation (IDF) and World Health Organization (WHO) definitions were used to quantify metabolic syndrome, and the concordance between these three criteria was also assessed. Methods: This cross-sectional study involved 700 type 2 diabetic subjects from the urban areas of Gwalior Chambal region (Central India). Subjects in the age group of 28-87 yrs were included in the study. Type I diabetics, pregnant women and those with chronic viral and bacterial infections and serious metabolic disorders were excluded from the study. Fasting blood glucose and blood lipids (T-cholesterol, triglyceride, HDL-cholesterol) were assessed, and anthropometry and blood pressure were measured in all the subjects. Results: The prevalence of metabolic syndrome was found to be 45.8%, 57.7% and 28% following the NCEP-ATPIII criteria, IDF and WHO definitions, respectively. Using all three definitions, the prevalence was higher in women in all age groups. ATP III and IDF criteria showed good agreement (κ 0.68) compared to ATP III with WHO (κ 0.54) and IDF with WHO (κ 0.34) criteria. The highest prevalence was observed following the IDF definition. Conclusions: A good agreement was observed between ATPIII and IDF criteria. Maximum prevalence of metabolic syndrome was recorded when the IDF criteria were followed. The NCEP-ATPIII criteria for the diagnosis of MetS gave equal importance to every variable and showed good agreement with the other criteria used. PMID:24171882

  11. [Identification of adverse events in hospitalised influenza patients].

    PubMed

    Aranaz-Andrés, J M; Gea-Velázquez de Castro, M T; Jiménez-Pericás, F; Balbuena-Segura, A I; Meyer-García, M C; López-Fresneña, N; Miralles-Bueno, J J; Obón-Azuara, B; Moliner-Lahoz, J; Aibar-Remón, C

    2015-01-01

    To test the inter-observer agreement in identifying adverse events (AE) in patients hospitalized with flu and undergoing precautionary isolation measures. Historical cohort study of 50 patients undergoing isolation measures due to flu and 50 patients without any isolation measures. The AE incidence ranged from 10 to 26% depending on the observer (26% [95%CI: 17.4%-34.60%], 10% [95%CI: 4.12%-15.88%], and 23% [95%CI: 14.75%-31.25%]). It was always lower in the cohort undergoing the isolation measures. This difference is statistically significant when the accurate definition of a case is applied. The agreement as regards the screening was good (higher than 76%; Kappa index between 0.29 and 0.81). The agreement as regards the accurate identification of AE related to care was lower (from 50 to 93.3%, Kappa index from 0.20 to 0.70). Before performing an epidemiological study on AE, interobserver concordance must be analyzed to improve the accuracy of the results and the validity of the study. Studies have different levels of reliability. Kappa index shows high levels for the screening guide, but not for the identification of AE. Without a good methodology the results achieved, and thus the decisions made from them, cannot be guaranteed. Researchers have to be sure of the method used, which should be as close as possible to the optimal achievable. Copyright © 2014 SECA. Published by Elsevier Espana. All rights reserved.
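
    The three confidence intervals quoted above are numerically consistent with a normal-approximation (Wald) interval on n = 100 patients (the two cohorts of 50 combined). The abstract does not state the method, so both the formula and n = 100 are assumptions in this sketch; the function name is hypothetical.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) 95% interval for a proportion p observed on n subjects."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Incidences reported per observer in the abstract; n = 100 is an assumption.
for p in (0.26, 0.10, 0.23):
    low, high = wald_ci(p, 100)
    print(f"{p:.0%}: {low:.2%} to {high:.2%}")
```

    Under these assumptions the printed bounds match the reported intervals to two decimals, which also shows how wide the uncertainty is for incidences estimated from 100 patients.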

  12. Gastritis staging: interobserver agreement by applying OLGA and OLGIM systems.

    PubMed

    Isajevs, Sergejs; Liepniece-Karele, Inta; Janciauskas, Dainius; Moisejevs, Georgijs; Putnins, Viesturs; Funka, Konrads; Kikuste, Ilze; Vanags, Aigars; Tolmanis, Ivars; Leja, Marcis

    2014-04-01

    Atrophic gastritis remains a difficult histopathological diagnosis with low interobserver agreement. The aim of our study was to compare gastritis staging and interobserver agreement between general and expert gastrointestinal (GI) pathologists using Operative Link for Gastritis Assessment (OLGA) and Operative Link on Gastric Intestinal Metaplasia (OLGIM). We enrolled 835 patients undergoing upper endoscopy in the study. Two general and two expert gastrointestinal pathologists graded biopsy specimens according to the Sydney classification, and the stage of gastritis was assessed by the OLGA and OLGIM systems. Using OLGA, 280 (33.4%) patients had gastritis (stage I-IV), whereas with OLGIM this was 167 (19.9%). OLGA stage III-IV gastritis was observed in 25 patients, whereas by OLGIM stage III-IV was found in 23 patients. Interobserver agreement between expert GI pathologists for atrophy in the antrum, incisura angularis, and corpus was moderate (kappa = 0.53, 0.57 and 0.41, respectively, p < 0.0001), but almost perfect for intestinal metaplasia (kappa = 0.82, 0.80 and 0.81, respectively, p < 0.0001). However, interobserver agreement between general pathologists was poor for atrophy, but moderate for intestinal metaplasia. OLGIM staging provided the highest interobserver agreement, but a substantial proportion of potentially high-risk individuals would be missed if only OLGIM staging is applied. Therefore, we recommend using a combination of OLGA and OLGIM for staging of chronic gastritis.

  13. Analysis of deep inferior epigastric perforator (DIEP) arteries by using MDCTA: Comparison between 2 post-processing techniques.

    PubMed

    Saba, Luca; Atzeni, Matteo; Ribuffo, Diego; Mallarini, Giorgio; Suri, Jasjit S

    2012-08-01

    Our purpose was to compare two post-processing techniques, Maximum-Intensity-Projection (MIP) and Volume Rendering (VR), for the study of perforator arteries. Thirty patients who underwent Multi-Detector-Row CT Angiography (MDCTA) between February 2010 and May 2010 were retrospectively analyzed. For each patient and for each reconstruction method, the image quality was evaluated and the inter- and intra-observer agreement was calculated according to the Cohen statistics. The Hounsfield Unit (HU) value in the common femoral artery was quantified and the correlation (Pearson Statistic) between image quality and HU value was explored. The Pearson r between the right and left common femoral artery was excellent (r=0.955). The highest image quality score was obtained using MIP for both observers (total value 75, with a mean value 2.67 for observer 1 and total value of 79 and a mean value of 2.82 for observer 2). The highest agreement between the two observers was detected using the MIP protocol with a Cohen kappa value of 0.856. The ROC area under the curve (Az) for the VR is 0.786 (0.086 SD; p value=0.0009) whereas the ROC area under the curve (Az) for the MIP is 0.0928 (0.051 SD; p value=0.0001). MIP showed the optimal inter- and intra-observer agreement and the highest quality scores and therefore should be used as the post-processing technique in the analysis of perforating arteries. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  14. Investigations of the reliability of observational gait analysis for the assessment of lameness in horses.

    PubMed

    Hewetson, M; Christley, R M; Hunt, I D; Voute, L C

    2006-06-24

    The objectives of this study were to assess the reliability of a numerical rating scale (NRS) and a verbal rating scale (VRS) for the assessment of lameness in horses and to determine whether they can be used interchangeably. Sixteen independent observers graded the severity of lameness in 20 videotaped horses, and the agreement between and within observers, correlation and bias were determined for each scale. The observers agreed with each other in 56 per cent of the observations with the NRS and in 60 per cent of the observations with the VRS, and the associated Kendall coefficient of concordance was high. Similar trends were evident in the agreement between two observations by each observer. The correlation between and within observers was high for both scales. There were no significant differences (bias) among the observers' mean scores when using either scale. There was a significant correlation between the lameness scores attributed when using the two scales, but the differences between the scores when plotted against their overall mean were unacceptable for clinical purposes. The results indicate that the NRS and VRS are only moderately reliable when used to assess lameness severity in the horse, and that they should not be used interchangeably.

  15. The Use of Actigraphy to Study Sleep Disorders in Preschoolers: Some Concerns about Detection of Nighttime Awakenings

    PubMed Central

    Sitnick, Stephanie L.; Goodlin-Jones, Beth L.; Anders, Thomas F.

    2008-01-01

    Study Objectives: This study compared actigraphy with videosomnography in preschool-aged children, with special emphasis on the accuracy of detection of nighttime awakenings. Design: Fifty-eight participants wore an actigraph for 1 week and were videotaped for 2 nights while wearing the actigraph. Setting: Participants were solitary sleepers, studied in their homes. Participants: One group (n = 22) was diagnosed with autism, another group (n = 11) had developmental delays without autism, and a third group (n = 25) were typically developing children; age ranged from 28 to 73 months (mean age 47 months); 29 boys and 29 girls. Interventions: N/A. Measurements and Results: Nocturnal sleep and wakefulness were scored from simultaneously recorded videosomnography and actigraphy. The accuracy of actigraphy was examined in an epoch-by-epoch comparison with videosomnography. Findings were 94% overall agreement, 97% sensitivity, and 24% specificity. Statistical corrections for overall agreement and specificity resulted in an 89% weighted-agreement and 27% adjusted specificity. Conclusions: Actigraphy has poor agreement for detecting nocturnal awakenings, compared with video observations, in preschool-aged children. Citation: Sitnick SL; Goodlin-Jones BL; Anders TF. The use of actigraphy to study sleep disorders in preschoolers: some concerns about detection of nighttime awakenings. SLEEP 2008;31(3):395-401. PMID:18363316
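
    The combination of 94% overall agreement with only 24% specificity follows from class imbalance in an epoch-by-epoch comparison. A minimal sketch with hypothetical epoch counts, chosen only so that they reproduce the three reported percentages (sleep is treated as the positive class, an assumption; the function name is hypothetical):

```python
def agreement_metrics(tp, fn, tn, fp):
    """Overall agreement, sensitivity and specificity from a 2x2 epoch table.

    tp: sleep epochs scored as sleep by both methods
    fn: sleep epochs (reference) scored as wake by the device
    tn: wake epochs scored as wake by both methods
    fp: wake epochs (reference) scored as sleep by the device
    """
    total = tp + fn + tn + fp
    return {
        "overall": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts: 7000 sleep epochs and 300 wake epochs (illustrative only).
m = agreement_metrics(tp=6790, fn=210, tn=72, fp=228)
for name, value in m.items():
    print(f"{name}: {value:.0%}")
```

    Because wake epochs are a small share of the night in this toy table, missing most of them barely lowers overall agreement, which is why overall agreement alone says little about detection of awakenings.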

  16. Relationships among different subjective measurements of consumer health information retrieval performance.

    PubMed

    Zeng, Qing T; Kogan, Sandra; Ngo, Long; Greenes, Robert A

    2004-01-01

    Millions of consumers perform health information retrieval (HIR) online. To better understand the consumers' perspective on HIR performance, we conducted an observation and interview study of 97 health information consumers. Consumers were asked to perform HIR tasks and we recorded their view regarding performance using several different subjective measurements: finding the desired information, usefulness of the information found, satisfaction with the information, and intention to continue searching. Statistical analysis was applied to verify if the multiple subjective measurements were redundant. The measurements ranged from slight agreement to no agreement among them. A number of reasons were identified for this lack of agreement. Although related, the four subjective measurements of HIR performance are distinct from each other and carried different useful information.

  17. Development and validation of self-reported line drawings of the modified Beighton score for the assessment of generalised joint hypermobility.

    PubMed

    Cooper, Dale J; Scammell, Brigitte E; Batt, Mark E; Palmer, Debbie

    2018-01-17

    The impracticalities and comparative expense of carrying out a clinical assessment are an obstacle in many large epidemiological studies. The purpose of this study was to develop and validate a series of electronic self-reported line drawing instruments based on the modified Beighton scoring system for the assessment of self-reported generalised joint hypermobility. Five sets of line drawings were created to depict the 9-point Beighton score criteria. Each instrument consisted of an explanatory question whereby participants were asked to select the line drawing which best represented their joints. Fifty participants completed the self-report online instrument on two occasions, before attending a clinical assessment. A blinded expert clinical observer then assessed participants on two occasions, using a standardised goniometry measurement protocol. Validity of the instrument was assessed by participant-observer agreement and reliability by participant repeatability and observer repeatability using unweighted Cohen's kappa (k). Validity and reliability were assessed for each item in the self-reported instrument separately, and for the sum of the total scores. An aggregate score for generalised joint hypermobility was determined based on a Beighton score of 4 or more out of 9. Observer-repeatability between the two clinical assessments demonstrated perfect agreement (k 1.00; 95% CI 1.00, 1.00). Self-reported participant-repeatability was lower but it was still excellent (k 0.91; 95% CI 0.74, 1.00). The participant-observer agreement was excellent (k 0.96; 95% CI 0.87, 1.00). Validity was excellent for the self-report instrument, with a good sensitivity of 0.87 (95% CI 0.81, 0.91) and excellent specificity of 0.99 (95% CI 0.98, 1.00). The self-reported instrument provides a valid and reliable assessment of the presence of generalised joint hypermobility and may have practical use in epidemiological studies.

  18. Inter-rater agreement on PIVC-associated phlebitis signs, symptoms and scales.

    PubMed

    Marsh, Nicole; Mihala, Gabor; Ray-Barruel, Gillian; Webster, Joan; Wallis, Marianne C; Rickard, Claire M

    2015-10-01

    Many peripheral intravenous catheter (PIVC) infusion phlebitis scales and definitions are used internationally, although no existing scale has demonstrated comprehensive reliability and validity. We examined inter-rater agreement between registered nurses on signs, symptoms and scales commonly used in phlebitis assessment. Seven PIVC-associated phlebitis signs/symptoms (pain, tenderness, swelling, erythema, palpable venous cord, purulent discharge and warmth) were observed daily by two raters (a research nurse and registered nurse). These data were modelled into phlebitis scores using 10 different tools. Proportions of agreement (e.g. positive, negative), observed and expected agreements, Cohen's kappa, the maximum achievable kappa, prevalence- and bias-adjusted kappa were calculated. Two hundred ten patients were recruited across three hospitals, with 247 sets of paired observations undertaken. The second rater was blinded to the first's findings. The Catney and Rittenberg scales were the most sensitive (phlebitis in >20% of observations), whereas the Curran, Lanbeck and Rickard scales were the most restrictive (≤2% phlebitis). Only tenderness and the Catney (one of pain, tenderness, erythema or palpable cord) and Rittenberg scales (one of erythema, swelling, tenderness or pain) had acceptable (more than two-thirds, 66.7%) levels of inter-rater agreement. Inter-rater agreement for phlebitis assessment signs/symptoms and scales is low. This likely contributes to the high degree of variability in phlebitis rates in literature. We recommend further research into assessment of infrequent signs/symptoms and the Catney or Rittenberg scales. New approaches to evaluating vein irritation that are valid, reliable and based on their ability to predict complications need exploration. © 2015 John Wiley & Sons, Ltd.
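
    The agreement statistics listed above (observed and expected agreement, positive and negative agreement, Cohen's kappa, maximum achievable kappa, and prevalence- and bias-adjusted kappa) can all be derived from one 2x2 table of paired ratings. A minimal sketch using the standard textbook formulas; the function name and the counts are hypothetical and are not the study's data.

```python
def two_rater_summary(a, b, c, d):
    """Agreement statistics for two raters and a binary sign.

    a: both raters positive, b: rater 1 only, c: rater 2 only, d: both negative.
    """
    n = a + b + c + d
    r1_pos, r2_pos = a + b, a + c
    r1_neg, r2_neg = c + d, b + d
    p_o = (a + d) / n                                    # observed agreement
    p_e = (r1_pos * r2_pos + r1_neg * r2_neg) / n ** 2   # chance-expected agreement
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    # Largest observed agreement the two sets of marginals would allow.
    p_o_max = (min(r1_pos, r2_pos) + min(r1_neg, r2_neg)) / n
    return {
        "observed": p_o,
        "expected": p_e,
        "positive_agreement": 2 * a / (2 * a + b + c),
        "negative_agreement": 2 * d / (2 * d + b + c),
        "kappa": (p_o - p_e) / (1 - p_e),
        "kappa_max": (p_o_max - p_e) / (1 - p_e),
        "pabak": 2 * p_o - 1,  # prevalence- and bias-adjusted kappa (binary case)
    }


# Hypothetical paired observations of an infrequent sign (illustrative only).
stats = two_rater_summary(a=2, b=3, c=4, d=91)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

    With a rare sign, raw agreement in this toy table is 93% while kappa is about 0.33 and positive agreement about 0.36, which illustrates why several complementary statistics are reported for infrequent signs and symptoms.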

  19. Comparative study of optic disc measurement by Copernicus optical coherence tomography and Heidelberg retinal tomography.

    PubMed

    Yang, Qing-Song; Yu, Ya-Jie; Li, Shu-Ning; Liu, Juan; Hao, Ying-Juan

    2012-08-01

    Copernicus optical coherence tomography (SOCT) is a new, ultra high-speed and high-resolution instrument available for clinical evaluation of the optic nerve. The purpose of the study was to assess the agreement between SOCT and Heidelberg retinal tomography (HRT). A total of 44 healthy normal volunteers were recruited in this study. One eye in each subject was selected randomly. Agreement between SOCT and HRT-3 in measuring optic disc area was assessed using Bland-Altman plots. Relationships between measurements of optic nerve head parameters obtained by SOCT and HRT-3 were assessed by Pearson correlation. There was no significant difference in the average cup area (0.306 vs. 0.355 mm, P = 0.766), cup volume (0.158 vs. 0.130 mm, P = 0.106) and cup/disc ratio (0.394 vs. 0.349 mm, P = 0.576) measured by the two instruments. However, other optic disc parameters from SOCT were significantly lower compared with HRT-3. The Bland-Altman plot revealed good agreement of cup area and cup volume measured by SOCT and HRT-3. Poor agreement for disc area, rim area, rim volume and cup/disc ratio was found between SOCT and HRT-3. The highest correlations between the two instruments were observed for cup area (r² = 0.783, P = 0.000) and cup/disc ratio (r² = 0.669, P = 0.000), whereas the lowest correlations were observed for disc area (r² = 0.100, P = 0.037), rim area (r² = 0.275, P = 0.000), cup volume (r² = 0.005, P = 0.391) and rim volume (r² = 0.021, P = 0.346). There was poor agreement between SOCT and HRT-3 for measurement of optic nerve parameters except cup area and cup volume. Measurement results of the two instruments are not interchangeable.
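
    A Bland-Altman analysis of the kind used above summarises paired measurements by the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. A minimal sketch of that calculation; the function name and the paired values are hypothetical and are not measurements from the study.

```python
import statistics


def bland_altman(x, y, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


# Hypothetical paired readings from two instruments on five eyes (illustrative only).
instrument_1 = [2.0, 2.4, 1.8, 2.6, 2.2]
instrument_2 = [1.9, 2.1, 1.8, 2.2, 2.0]
bias, lower, upper = bland_altman(instrument_1, instrument_2)
print(f"bias = {bias:.3f}, limits of agreement = {lower:.3f} to {upper:.3f}")
```

    Two instruments can correlate strongly and still have wide limits of agreement, which is why interchangeability is judged from the limits rather than from the correlation coefficient.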

  20. Effect of Auditory-Perceptual Training With Natural Voice Anchors on Vocal Quality Evaluation.

    PubMed

    Dos Santos, Priscila Campos Martins; Vieira, Maurílio Nunes; Sansão, João Pedro Hallack; Gama, Ana Cristina Côrtes

    2018-01-10

    To analyze the effects of auditory-perceptual training with anchor stimuli of natural voices on inter-rater agreement during the assessment of vocal quality. This is a quantitative study. An auditory-perceptual training site was developed consisting of Programming Interface A, an auditory training activity, and Programming Interface B, a control activity. Each interface had three stages: pre-training/pre-interval evaluation, training/interval, and post-training/post-interval evaluation. Two experienced evaluators classified 381 voices according to the GRBASI scale (G-grade, R-roughness, B-breathiness, A-asthenia, S-strain, I-instability). Voices were selected that received the same evaluation by both evaluators: 57 voices for evaluation and 56 for training were selected, with varying degrees of deviation across parameters. Fifteen inexperienced evaluators were then selected. In the pre-training, post-training, pre-interval, and post-interval stages, evaluators listened to the voices and classified them via the GRBASI scale. In the interval stage, evaluators read a text. In the training stage, each parameter was trained separately. Evaluators analyzed the degrees of deviation of the GRBASI parameters based on anchor stimuli, and could only advance after correctly classifying the voices. To quantify inter-rater agreement and provide statistical analyses, the AC1 coefficient, confidence intervals, and percentage variation of agreement were employed. Except for the asthenia parameter, decreased agreement was observed in the control condition. Improved agreement was observed with auditory training, but this improvement did not achieve statistical significance. Training with natural voice anchors suggests increased inter-rater agreement during perceptual voice analysis, potentially indicating that new internal references were established. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  1. Inter-rater agreement between trachoma graders: comparison of grades given in field conditions versus grades from photographic review

    PubMed Central

    Gebresillasie, Sintayehu; Tadesse, Zerihun; Shiferaw, Ayalew; Yu, Sun N.; Stoller, Nicole E.; Zhou, Zhaoxia; Emerson, Paul M.; Gaynor, Bruce D.; Lietman, Thomas M.; Keenan, Jeremy D.

    2016-01-01

    Purpose: Trachoma surveillance is most commonly performed by direct observation, usually by non-ophthalmologists using the World Health Organization (WHO) simplified grading system. However, conjunctival photographs may offer several benefits over direct clinical observation, including the potential for greater inter-rater agreement. This study assesses whether inter-rater agreement of trachoma grading differs when trained graders review conjunctival photographs versus when they perform conjunctival examinations in the field. Methods: 3 trained trachoma graders each performed an independent examination of the everted right tarsal conjunctiva of 269 children aged 0-9 years, and then reviewed photographs of these same conjunctivae in a random order. For each eye, the grader documented the presence or absence of follicular trachoma (TF) and intense trachomatous inflammation (TI) according to the WHO simplified grading system. Results: Inter-rater agreement for grade of TF was significantly higher in the field (kappa coefficient, κ, 0.73, 95% confidence interval, CI 0.67-0.80) than by photographic review (κ=0.55, 95% CI 0.49-0.63; difference in κ between field grading and photo grading 0.18, 95% CI 0.09-0.26). When field and photographic grades were each assessed as the consensus grade from the 3 graders, agreement between in-field and photographic graders was high for TF (κ=0.75, 95% CI 0.68-0.84). Conclusions: In an area with hyperendemic trachoma, inter-rater agreement was lower for photographic assessment of trachoma than for in-field assessment. However, the trachoma grade reached by a consensus of photographic graders agreed well with the grade given by a consensus of in-field graders. PMID:26158573

  2. Radiographic classifications in Perthes disease

    PubMed Central

    Huhnstock, Stefan; Svenningsen, Svein; Merckoll, Else; Catterall, Anthony; Terjesen, Terje; Wiig, Ola

    2017-01-01

    Background and purpose: Different radiographic classifications have been proposed for prediction of outcome in Perthes disease. We assessed whether the modified lateral pillar classification would provide more reliable interobserver agreement and prognostic value compared with the original lateral pillar classification and the Catterall classification. Patients and methods: 42 patients (38 boys) with Perthes disease were included in the interobserver study. Their mean age at diagnosis was 6.5 (3–11) years. 5 observers classified the radiographs in 2 separate sessions according to the Catterall classification, the original and the modified lateral pillar classifications. Interobserver agreement was analysed using weighted kappa statistics. We assessed the associations between the classifications and femoral head sphericity at 5-year follow-up in 37 non-operatively treated patients in a crosstable analysis (Gamma statistics for ordinal variables, γ). Results: The original lateral pillar and Catterall classifications showed moderate interobserver agreement (kappa 0.49 and 0.43, respectively) while the modified lateral pillar classification had fair agreement (kappa 0.40). The original lateral pillar classification was strongly associated with the 5-year radiographic outcome, with a mean γ correlation coefficient of 0.75 (95% CI: 0.61–0.95) among the 5 observers. The modified lateral pillar and Catterall classifications showed moderate associations (mean γ correlation coefficient 0.55 [95% CI: 0.38–0.66] and 0.64 [95% CI: 0.57–0.72], respectively). Interpretation: The Catterall classification and the original lateral pillar classification had sufficient interobserver agreement and association to late radiographic outcome to be suitable for clinical use. Adding the borderline B/C group did not increase the interobserver agreement or prognostic value of the original lateral pillar classification. PMID:28613966
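
    For ordinal classifications such as these, weighted kappa penalises a disagreement by how far apart the two grades are. The abstract does not state the weighting scheme or how the 5 observers were combined, so this minimal sketch assumes linear weights and a single pair of observers; the function name, grade labels and ratings are hypothetical.

```python
def linear_weighted_kappa(a, b, categories):
    """Linearly weighted kappa for two raters; `categories` lists the grades in order."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)
    # Cross-tabulate the paired grades.
    table = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        table[index[x]][index[y]] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Disagreement weight |i - j|: adjacent grades cost 1, two grades apart cost 2, ...
    observed = sum(abs(i - j) * table[i][j] for i in range(k) for j in range(k))
    expected = sum(abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k)) / n
    if expected == 0:
        return float("nan")  # both raters used one identical grade throughout
    return 1 - observed / expected


# Hypothetical grades for six radiographs from two observers (illustrative only).
obs_1 = ["A", "A", "B", "B", "C", "C"]
obs_2 = ["A", "B", "B", "B", "C", "A"]
kw = linear_weighted_kappa(obs_1, obs_2, categories=["A", "B", "C"])
print(f"weighted kappa = {kw:.2f}")
```

    Here the single A-versus-C disagreement counts twice as much as the A-versus-B one, which is the behaviour that distinguishes weighted from unweighted kappa.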

  3. Impact of training on concordance among rheumatologists and dermatologists in the assessment of patients with psoriasis and psoriatic arthritis.

    PubMed

    Salvarani, Carlo; Girolomoni, Giampiero; Di Lernia, Vito; Gisondi, Paolo; Tripepi, Giovanni; Egan, Colin Gerard; Marchesoni, Antonio

    2016-12-01

    To evaluate the impact of training on the reliability among dermatologists and rheumatologists in the assessment of psoriatic arthritis (PsA) patients. Overall, 9 hospital-based rheumatologists and 8 hospital-based dermatologists met in Reggio Emilia, Italy in October 2015 to assess 17 PsA patients. After 1 month, physicians underwent a 3-h training session by 4 recognized experts and then assessed 19 different PsA patients according to a modified Latin square design. Measures included tender (TJC) and swollen joint count (SJC), dactylitis, enthesitis, Schober test, psoriasis body surface area (BSA), Psoriasis Area and Severity Index (PASI), Nail Psoriasis Severity Index (NAPSI), and static physician's global assessment of PsA disease activity (sPGA). Variance components analyses were performed to estimate the intraclass correlation coefficient (ICC). TJC and enthesitis measured pre-training by dermatologists or rheumatologists revealed moderate-substantial agreement (ICC: 0.4-0.8). In contrast, SJC and Schober test showed fair (ICC: 0.2-0.4) and moderate agreement (ICC: 0.4-0.6), respectively, while dactylitis showed poor agreement (ICC: 0-0.2). Moderate-substantial (ICC: 0.4-0.8) agreement was observed for most skin measures by dermatologists and rheumatologists, apart from BSA, where fair agreement (ICC: 0.2-0.4) was observed. Agreement levels were similar before and after training for arthritis measures. In contrast, levels of agreement after training for 3 of the 4 skin measures were increased for dermatologists and all 4 skin measures were increased for rheumatologists. Substantial to excellent agreement was observed for TJC, enthesitis, PASI, and sPGA. Rheumatologists benefited from training to a greater extent. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Reliability of self-reported weight and height among state bank employees.

    PubMed

    Chor, D; Coutinho, E da S; Laurenti, R

    1999-02-01

    Self-reported weight and height were compared with direct measurements in order to evaluate the agreement between the two sources. Data were obtained from a cross-sectional study on health status from a probabilistic sample of 1,183 employees of a bank, in Rio de Janeiro State, Brazil. Direct measurements were made of 322 employees. Differences between the two sources were evaluated using mean differences, limits of agreement and intraclass correlation coefficient (ICC). Men and women tended to underestimate their weight while differences between self-reported and measured height were insignificant. Body mass index (BMI) mean differences were smaller than those observed for weight. ICC was over 0.98 for weight and 0.95 for BMI, expressing close agreement. Combining a graphical method with ICC may be useful in pilot studies to detect populational groups capable of providing reliable information on weight and height, thus minimizing resources needed for field work.
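
    The mean difference and limits of agreement named in this abstract can be illustrated with a minimal sketch. It assumes the conventional Bland-Altman form (mean difference plus or minus 1.96 standard deviations of the paired differences); the abstract does not state the multiplier the authors used. The function name `limits_of_agreement` and the weights below are hypothetical and are not study data.

```python
from statistics import mean, stdev


def limits_of_agreement(self_reported, measured):
    """Return (mean difference, lower limit, upper limit) for paired values."""
    if len(self_reported) != len(measured) or len(measured) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Difference is self-reported minus measured, so a negative mean
    # difference indicates under-reporting.
    diffs = [s - m for s, m in zip(self_reported, measured)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical weights in kg (illustration only, not from the study).
reported = [70.0, 82.0, 64.5, 90.0, 75.0]
measured = [71.0, 83.5, 65.0, 91.0, 75.5]
bias, lower, upper = limits_of_agreement(reported, measured)
print(f"mean difference {bias:.2f} kg, limits {lower:.2f} to {upper:.2f} kg")
```

    In this made-up sample the mean difference is negative, which is the direction the abstract reports for weight (under-reporting).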

  5. Preliminary Study on Diffraction Enhanced Radiographic Imaging for a Canine Model of Cartilage Damage

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Muehleman,C.; Li, J.; Zhong, Z.

    2006-01-01

    Objective: To demonstrate the ability of a novel radiographic technique, Diffraction Enhanced Radiographic Imaging (DEI), to render high contrast images of canine knee joints for identification of cartilage lesions in situ. Methods: DEI was carried out at the X-15A beamline at Brookhaven National Laboratory on intact canine knee joints with varying levels of cartilage damage. Two independent observers graded the DE images for lesions and these grades were correlated to the gross morphological grade. Results: The correlation of gross visual grades with DEI grades for the 18 canine knee joints as determined by observer 1 (r²=0.8856, P=0.001) and observer 2 (r²=0.8818, P=0.001) was high. The overall weighted κ value for inter-observer agreement was 0.93, thus considered high agreement. Conclusion: The present study is the first study for the efficacy of DEI for cartilage lesions in an animal joint, from very early signs through erosion down to subchondral bone, representing the spectrum of cartilage changes occurring in human osteoarthritis (OA). Here we show that DEI allows the visualization of cartilage lesions in intact canine knee joints with good accuracy. Hence, DEI may be applicable for following joint degeneration in animal models of OA.

  6. Comparison of Algorithm-based Estimates of Occupational Diesel Exhaust Exposure to Those of Multiple Independent Raters in a Population-based Case–Control Study

    PubMed Central

    Friesen, Melissa C.

    2013-01-01

    Objectives: Algorithm-based exposure assessments based on patterns in questionnaire responses and professional judgment can readily apply transparent exposure decision rules to thousands of jobs quickly. However, we need to better understand how algorithms compare to a one-by-one job review by an exposure assessor. We compared algorithm-based estimates of diesel exhaust exposure to those of three independent raters within the New England Bladder Cancer Study, a population-based case–control study, and identified conditions under which disparities occurred in the assessments of the algorithm and the raters. Methods: Occupational diesel exhaust exposure was assessed previously using an algorithm and a single rater for all 14 983 jobs reported by 2631 study participants during personal interviews conducted from 2001 to 2004. Two additional raters independently assessed a random subset of 324 jobs that were selected based on strata defined by the cross-tabulations of the algorithm and the first rater’s probability assessments for each job, oversampling their disagreements. The algorithm and each rater assessed the probability, intensity and frequency of occupational diesel exhaust exposure, as well as a confidence rating for each metric. Agreement among the raters, their aggregate rating (average of the three raters’ ratings) and the algorithm were evaluated using proportion of agreement, kappa and weighted kappa (κw). Agreement analyses on the subset used inverse probability weighting to extrapolate the subset to estimate agreement for all jobs. Classification and Regression Tree (CART) models were used to identify patterns in questionnaire responses that predicted disparities in exposure status (i.e., unexposed versus exposed) between the first rater and the algorithm-based estimates. 
Results: For the probability, intensity and frequency exposure metrics, moderate to moderately high agreement was observed among raters (κw = 0.50–0.76) and between the algorithm and the individual raters (κw = 0.58–0.81). For these metrics, the algorithm estimates had consistently higher agreement with the aggregate rating (κw = 0.82) than with the individual raters. For all metrics, the agreement between the algorithm and the aggregate ratings was highest for the unexposed category (90–93%) and was poor to moderate for the exposed categories (9–64%). Lower agreement was observed for jobs with a start year <1965 versus ≥1965. For the confidence metrics, the agreement was poor to moderate among raters (κw = 0.17–0.45) and between the algorithm and the individual raters (κw = 0.24–0.61). CART models identified patterns in the questionnaire responses that predicted a fair-to-moderate (33–89%) proportion of the disagreements between the raters’ and the algorithm estimates. Discussion: The agreement between any two raters was similar to the agreement between an algorithm-based approach and individual raters, providing additional support for using the more efficient and transparent algorithm-based approach. CART models identified some patterns in disagreements between the first rater and the algorithm. Given the absence of a gold standard for estimating exposure, these patterns can be reviewed by a team of exposure assessors to determine whether the algorithm should be revised for future studies. PMID:23184256
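
    The weighted kappa (κw) reported in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: it uses linear disagreement weights (the abstract does not say whether linear or quadratic weights were applied), it omits the inverse probability weighting described in the Methods, and the function name and ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linearly weighted kappa for two raters.

    Ratings are ordinal codes 0 .. n_categories - 1.
    """
    n = len(ratings_a)
    k = n_categories
    if n == 0 or n != len(ratings_b) or k < 2:
        raise ValueError("need equal-length, non-empty ratings and k >= 2")

    # Observed joint proportions for each (rater A, rater B) category pair.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions give the chance-expected joint proportions.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical exposure ratings (0 = none, 1 = low, 2 = high) for six jobs.
rater = [0, 1, 2, 2, 1, 0]
algorithm = [0, 1, 2, 1, 1, 0]
kw = weighted_kappa(rater, algorithm, n_categories=3)
print(round(kw, 2))
```

    With linear weights, a one-category disagreement is penalized half as much as a two-category disagreement, which is why κw suits ordinal metrics such as probability, intensity and frequency.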

  7. Development of a reactive-dispersive plume model

    NASA Astrophysics Data System (ADS)

    Kim, Hyun S.; Kim, Yong H.; Song, Chul H.

    2017-04-01

    A reactive-dispersive plume model (RDPM) was developed in this study. The RDPM can consider two main components of large-scale point source plume: i) turbulent dispersion and ii) photochemical reactions. In order to evaluate the simulation performance of newly developed RDPM, the comparisons between the model-predicted and observed mixing ratios were made using the TexAQS II 2006 (Texas Air Quality Study II 2006) power-plant experiment data. Statistical analyses show good correlation (0.61≤R≤0.92), and good agreement with the Index of Agreement (0.70≤R≤0.95). The chemical NOx lifetimes for two power-plant plumes (Monticello and Welsh power plants) were also estimated.

  8. Validity of parent's self-reported responses to home safety questions.

    PubMed

    Osborne, Jodie M; Shibl, Rania; Cameron, Cate M; Kendrick, Denise; Lyons, Ronan A; Spinks, Anneliese B; Sipe, Neil; McClure, Roderick J

    2016-09-01

    The aim of the study was to describe the validity of parent's self-reported responses to questions on home safety practices for children of 2-4 years. A cross-sectional validation study compared parent's self-administered responses to items in the Home Injury Prevention Survey with home observations undertaken by trained researchers. The relationship between the questionnaire and observation results was assessed using percentage agreement, sensitivity, specificity, positive predictive value, negative predictive value and intraclass correlation coefficients. Percentage agreements ranged from 44% to 100% with 40 of the total 45 items scoring higher than 70%. Sensitivities ranged from 0% to 100%, with 27 items scoring at least 70%. Specificities also ranged from 0% to 100%, with 33 items scoring at least 70%. As such, the study identified a series of self-administered home safety questions that have sensitivities, specificities and predictive values sufficiently high to allow the information to be useful in research and injury prevention practice.
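
    The item-level measures listed in this abstract (percentage agreement, sensitivity, specificity and predictive values) all derive from a 2x2 table of questionnaire response against home observation. The sketch below shows the standard definitions, treating the observation as the reference; the function name and counts are hypothetical and are not taken from the study.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Agreement measures for one questionnaire item against observation.

    tp: reported yes, observed yes    fp: reported yes, observed no
    fn: reported no,  observed yes    tn: reported no,  observed no
    """
    def ratio(numerator, denominator):
        # Return None when a measure is undefined (empty denominator).
        return numerator / denominator if denominator else None

    total = tp + fp + fn + tn
    return {
        "percent_agreement": ratio(tp + tn, total),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical counts for a single home safety item.
summary = diagnostic_summary(tp=40, fp=5, fn=10, tn=45)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```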

  9. Hydrodynamic studies of oxygen, neon, and magnesium novae

    NASA Technical Reports Server (NTRS)

    Starrfield, Sumner; Sparks, W. M.; Truran, J. W.

    1987-01-01

    Results are presented from recent theoretical studies that have examined the properties of nova outbursts on ONeMg white dwarfs. These outbursts are much more violent and occur much more frequently than outbursts on CO white dwarfs. Hydrodynamic simulations of both kinds of outbursts are in excellent agreement with the observations.

  10. Patient-physician agreement on tobacco and alcohol consumption: a multilevel analysis of GPs' characteristics.

    PubMed

    Thebault, Jean-Laurent; Falcoff, Hector; Favre, Madeleine; Noël, Frédérique; Rigal, Laurent

    2015-03-18

    Data about tobacco and alcohol consumption are essential in many types of studies. These data can be obtained by directly questioning patients or by using the information collected from physicians. Agreement between these two sources varies according to the characteristics of patients but probably also those of physicians. The purpose of this study was to analyze the characteristics of general practitioners (GPs) associated with agreement between them and their patients about the patients' consumption of alcohol and tobacco. Data came from an observational survey among GPs who were internship supervisors in the Paris metropolitan area. Fifty-two volunteer GPs completed a self-administered questionnaire about the organization of their practice and their training. For each GP, a random sample of 70 patients, aged 40 to 74 years, answered questions about their personal tobacco and alcohol consumption. GPs simultaneously answered similar questions about each patient. We used a mixed logistic model to assess the association between physicians' characteristics and agreement for patients' smoking status and alcohol consumption. Data were collected from both patient and physician for 2599 patients. The agreement between patients and their physicians was 60.4% for smoking status and 48.7% for alcohol consumption. Physicians with continuing medical education in management of smokers and those reporting specific skill in managing hypertension had the best agreement for smoking. Physicians who taught courses at the university medical school and those reporting specific skill in managing alcoholism had the best agreement for alcohol consumption. Agreement increases with physicians' training and skills in management of patients with tobacco and alcohol problems. It supports the importance of professional training for improving the quality of epidemiologic data in general practice. 
Researchers who use GPs as a source of information about patients' tobacco and alcohol consumption must assess the physicians' characteristics.

  11. Reliability and agreement in student ratings of the class environment.

    PubMed

    Nelson, Peter M; Christ, Theodore J

    2016-09-01

    The current study estimated the reliability and agreement of student ratings of the classroom environment obtained using the Responsive Environmental Assessment for Classroom Teaching (REACT; Christ, Nelson, & Demers, 2012; Nelson, Demers, & Christ, 2014). Coefficient alpha, class-level reliability, and class agreement indices were evaluated as each index provides important information for different interpretations and uses of student rating scale data. Data for 84 classes across 29 teachers in a suburban middle school were sampled to derive reliability and agreement indices for the REACT subscales across 4 class sizes: 25, 20, 15, and 10. All participating teachers were White and a larger number of 6th-grade classes were included (42%) relative to 7th- (33%) or 8th- (23%) grade classes. Teachers were responsible for a variety of content areas, including language arts (26%), science (26%), math (20%), social studies (19%), communications (6%), and Spanish (3%). Coefficient alpha estimates were generally high across all subscales and class sizes (α = .70-.95); class-mean estimates were greatly impacted by the number of students sampled from each class, with class-level reliability values generally falling below .70 when class size was reduced from 25 to 20. Further, within-class student agreement varied widely across the REACT subscales (mean agreement = .41-.80). Although coefficient alpha and test-retest reliability are commonly reported in research with student rating scales, class-level reliability and agreement are not. The observed differences across coefficient alpha, class-level reliability, and agreement indices provide evidence for evaluating students' ratings of the class environment according to their intended use (e.g., differentiating between classes, class-level instructional decisions). (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  12. Questionnaire layout and wording influence prevalence and risk estimates of respiratory symptoms in a population cohort

    PubMed Central

    Ekerljung, Linda; Rönmark, Eva; Lötvall, Jan; Wennergren, Göran; Torén, Kjell; Lundbäck, Bo

    2013-01-01

    Objective: Results of epidemiological studies are greatly influenced by the chosen methodology. The study aims to investigate how two frequently used questionnaires (Qs), with partly different layout, influence the prevalence of respiratory symptoms. Study Design and Setting: A booklet containing two Qs, the Global Allergy and Asthma European Network Q and the Obstructive Lung Disease in Northern Sweden Q, was mailed to 30 000 subjects aged 16–75 years in West Sweden; 62% responded. Sixteen questions were included in the analysis: seven identical between the Qs, four different in set-up and five with the same layout but different wording. Comparisons were made using differences in proportions, observed agreement and Kappa statistics. Results: Identical questions yielded similar prevalences with high observed agreement and kappa values. Questions with different set-up or differences in wording resulted in significantly different prevalences with lower observed agreement and kappa values. In general, the use of follow-up questions, excluding subjects answering no to the initial question, resulted in 2.9–6.7% units lower prevalence. Conclusion: The question set-up has a great influence on epidemiological results; specifically, questions that are set up to be excluded based on a previous no answer lead to lower prevalence compared with detached questions. Therefore, Q layout and exact wording of questions have to be carefully considered when comparing studies. Please cite this paper as: Ekerljung L, Rönmark E, Lötvall J, Wennergren G, Torén K and Lundbäck B. Questionnaire layout and wording influence prevalence and risk estimates of respiratory symptoms in a population cohort. Clin Respir J 2013; 7: 53–63. PMID:22243692

  13. Digital radiography with computerized conventional monitors compared to medical monitors in vertical root fracture diagnosis.

    PubMed

    Tofangchiha, Maryam; Adel, Mamak; Bakhshi, Mahin; Esfehani, Mahsa; Nazeman, Pantea; Ghorbani Elizeyi, Mojgan; Javadi, Amir

    2013-01-01

    Vertical root fracture (VRF) is a complication which is chiefly diagnosed radiographically. Recently, film-based radiography has been substituted with digital radiography. At the moment, there is a wide range of monitors available in the market for viewing digital images. The present study aims to compare the diagnostic accuracy, sensitivity and specificity of medical and conventional monitors in detection of vertical root fractures. In this in vitro study 228 extracted single-rooted human teeth were endodontically treated. Vertical root fractures were induced in 114 samples. The teeth were imaged by a digital charge-coupled device radiography using parallel technique. The images were evaluated by a radiologist and an endodontist on two medical and conventional liquid-crystal display (LCD) monitors twice. Z-test was used to analyze the sensitivity, accuracy and specificity of each monitor. Significance level was set at 0.05. Inter and intra observer agreements were calculated by Cohen's kappa. Accuracy, specificity and sensitivity for conventional monitor were calculated as 67.5%, 72%, 62.5% respectively; and data for medical grade monitor were 67.5%, 66.5% and 68% respectively. Statistical analysis showed no significant differences in detecting VRF between the two techniques. Inter-observer agreement for conventional and medical monitor was 0.47 and 0.55 respectively (moderate). Intra-observer agreement was 0.78 for medical monitor and 0.87 for conventional one (substantial). The type of monitor does not influence diagnosis of vertical root fractures.
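
    Cohen's kappa, used in this abstract for inter- and intra-observer agreement, corrects the observed proportion of agreement for the agreement expected by chance. A minimal sketch of the unweighted two-rater form follows; the function name and the ratings are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters over the same items."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("need two equal-length, non-empty rating lists")

    # Observed proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement from each rater's marginal category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_expected == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical readings of eight radiographs by two observers.
radiologist = ["VRF", "VRF", "none", "none", "VRF", "none", "none", "VRF"]
endodontist = ["VRF", "none", "none", "none", "VRF", "none", "VRF", "VRF"]
kappa = cohens_kappa(radiologist, endodontist)
print(round(kappa, 2))
```

    In this made-up example the observers agree on 6 of 8 images (75%), but half of that agreement is expected by chance, so kappa is lower than the raw agreement.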

  14. Diffuse intrinsic pontine glioma: is MRI surveillance improved by region of interest volumetry?

    PubMed

    Riley, Garan T; Armitage, Paul A; Batty, Ruth; Griffiths, Paul D; Lee, Vicki; McMullan, John; Connolly, Daniel J A

    2015-02-01

    Paediatric diffuse intrinsic pontine glioma (DIPG) is noteworthy for its fibrillary infiltration through neuroparenchyma and its resultant irregular shape. Conventional volumetry methods aim to approximate such irregular tumours to a regular ellipse, which could be less accurate when assessing treatment response on surveillance MRI. Region-of-interest (ROI) volumetry methods, using manually traced tumour profiles on contiguous imaging slices and subsequent computer-aided calculations, may prove more reliable. To evaluate whether the reliability of MRI surveillance of DIPGs can be improved by the use of ROI-based volumetry. We investigated the use of ROI- and ellipsoid-based methods of volumetry for paediatric DIPGs in a retrospective review of 22 MRI examinations. We assessed the inter- and intraobserver variability of the two methods when performed by four observers. ROI- and ellipsoid-based methods strongly correlated for all four observers. The ROI-based volumes showed slightly better agreement both between and within observers than the ellipsoid-based volumes (inter-[intra-]observer agreement 89.8% [92.3%] and 83.1% [88.2%], respectively). Bland-Altman plots show tighter limits of agreement for the ROI-based method. Both methods are reproducible and transferrable among observers. ROI-based volumetry appears to perform better with greater intra- and interobserver agreement for complex-shaped DIPG.

  15. Imperfection Insensitivity Analyses of Advanced Composite Tow-Steered Shells

    NASA Technical Reports Server (NTRS)

    Wu, K. Chauncey; Farrokh, Babak; Stanford, Bret K.; Weaver, Paul M.

    2016-01-01

    Two advanced composite tow-steered shells, one with tow overlaps and another without overlaps, were previously designed, fabricated and tested in end compression, both without cutouts, and with small and large cutouts. In each case, good agreement was observed between experimental buckling loads and supporting linear bifurcation buckling analyses. However, previous buckling tests and analyses have shown historically poor correlation, perhaps due to the presence of geometric imperfections that serve as failure initiators. For the tow-steered shells, their circumferential variation in axial stiffness may have suppressed this sensitivity to imperfections, leading to the agreement noted between tests and analyses. To investigate this further, a numerical investigation was performed in this study using geometric imperfections measured from both shells. Finite element models of both shells were analyzed first without, and then with, measured imperfections that were then superposed in different orientations around the shell longitudinal axis. Small variations in both the axial prebuckling stiffness and global buckling load were observed for the range of imperfections studied here, which suggests that the tow steering, and resulting circumferentially varying axial stiffness, may result in the test-analysis correlation observed for these shells.

  16. AO Distal Radius Fracture Classification: Global Perspective on Observer Agreement.

    PubMed

    Jayakumar, Prakash; Teunis, Teun; Giménez, Beatriz Bravo; Verstreken, Frederik; Di Mascio, Livio; Jupiter, Jesse B

    2017-02-01

    Background: The primary objective of this study was to test interobserver reliability when classifying fractures by consensus by AO types and groups among a large international group of surgeons. Secondarily, we assessed the difference in inter- and intraobserver agreement of the AO classification in relation to geographical location, level of training, and subspecialty. Methods: A randomized set of radiographic and computed tomographic images from a consecutive series of 96 distal radius fractures (DRFs), treated between October 2010 and April 2013, was classified using an electronic web-based portal by an invited group of participants on two occasions. Results: Interobserver reliability was substantial when classifying AO type A fractures but fair and moderate for type B and C fractures, respectively. No difference was observed by location, except for an apparent difference between participants from India and Australia classifying type B fractures. No statistically significant associations were observed comparing interobserver agreement by level of training and no differences were shown comparing subspecialties. Intra-rater reproducibility was "substantial" for fracture types and "fair" for fracture groups with no difference accounting for location, training level, or specialty. Conclusion: Improved definition of reliability and reproducibility of this classification may be achieved using large international groups of raters, empowering decision making on which system to utilize. Level of Evidence: Level III.

  17. AO Distal Radius Fracture Classification: Global Perspective on Observer Agreement

    PubMed Central

    Jayakumar, Prakash; Teunis, Teun; Giménez, Beatriz Bravo; Verstreken, Frederik; Di Mascio, Livio; Jupiter, Jesse B.

    2016-01-01

    Background: The primary objective of this study was to test interobserver reliability when classifying fractures by consensus by AO types and groups among a large international group of surgeons. Secondarily, we assessed the difference in inter- and intraobserver agreement of the AO classification in relation to geographical location, level of training, and subspecialty. Methods: A randomized set of radiographic and computed tomographic images from a consecutive series of 96 distal radius fractures (DRFs), treated between October 2010 and April 2013, was classified using an electronic web-based portal by an invited group of participants on two occasions. Results: Interobserver reliability was substantial when classifying AO type A fractures but fair and moderate for type B and C fractures, respectively. No difference was observed by location, except for an apparent difference between participants from India and Australia classifying type B fractures. No statistically significant associations were observed comparing interobserver agreement by level of training and no differences were shown comparing subspecialties. Intra-rater reproducibility was “substantial” for fracture types and “fair” for fracture groups with no difference accounting for location, training level, or specialty. Conclusion: Improved definition of reliability and reproducibility of this classification may be achieved using large international groups of raters, empowering decision making on which system to utilize. Level of Evidence: Level III. PMID:28119795

  18. Development and testing of a de novo clinical staging system for podoconiosis (endemic non-filarial elephantiasis).

    PubMed

    Tekola, Fasil; Ayele, Zewdu; Mariam, Dereje Haile; Fuller, Claire; Davey, Gail

    2008-10-01

    To develop and test a robust clinical staging system for podoconiosis, a geochemical disease in individuals exposed to red clay soil. We adapted the Dreyer system for staging filarial lymphoedema and tested it in four re-iterative field tests conducted in an area of high-podoconiosis prevalence in Southern Ethiopia. The system has five stages according to proximal spread of disease and presence of dermal nodules, ridges and bands. We measured the 1-week repeatability and the inter-observer agreement of the final staging system. The five-stage system is readily understood by community workers with little health training. Kappa for 1-week repeatability was 0.88 (95% CI 0.80-0.96), for agreement between health professionals was 0.71 (95% CI 0.60-0.82), while that between health professionals and community podoconiosis agents without formal health training averaged 0.64 (95% CI 0.52-0.78). This simple staging system with good inter-observer agreement and repeatability can assist in the management and further study of podoconiosis.

  19. Methodology for turbulence code validation: Quantification of simulation-experiment agreement and application to the TORPEX experiment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ricci, Paolo; Theiler, C.; Fasoli, A.

    A methodology for plasma turbulence code validation is discussed, focusing on quantitative assessment of the agreement between experiments and simulations. The present work extends the analysis carried out in a previous paper [P. Ricci et al., Phys. Plasmas 16, 055703 (2009)] where the validation observables were introduced. Here, it is discussed how to quantify the agreement between experiments and simulations with respect to each observable, how to define a metric to evaluate this agreement globally, and, finally, how to assess the quality of a validation procedure. The methodology is then applied to the simulation of the basic plasma physics experiment TORPEX [A. Fasoli et al., Phys. Plasmas 13, 055902 (2006)], considering both two-dimensional and three-dimensional simulation models.

  20. Use of dynamic 3-dimensional transvaginal and transrectal ultrasonography to assess posterior pelvic floor dysfunction related to obstructed defecation.

    PubMed

    Murad-Regadas, Sthela M; Regadas Filho, Francisco Sergio Pinheiro; Regadas, Francisco Sergio Pinheiro; Rodrigues, Lusmar Veras; de J R Pereira, Jacyara; da S Fernandes, Graziela Olivia; Dealcanfreitas, Iris Daiana; Mendonca Filho, Jose Jader

    2014-02-01

    New ultrasound techniques may complement current diagnostic tools, and combined techniques may help to overcome the limitations of individual techniques for the diagnosis of anorectal dysfunction. A high degree of agreement has been demonstrated between echodefecography (dynamic 3-dimensional anorectal ultrasonography) and conventional defecography. Our aim was to evaluate the ability of a combined approach consisting of dynamic 3-dimensional transvaginal and transrectal ultrasonography by using a 3-dimensional biplane endoprobe to assess posterior pelvic floor dysfunctions related to obstructed defecation syndrome in comparison with echodefecography. This was a prospective, observational cohort study conducted at a tertiary-care hospital. Consecutive female patients with symptoms of obstructed defecation were eligible. Each patient underwent assessment of posterior pelvic floor dysfunctions with a combination of dynamic 3-dimensional transvaginal and transrectal ultrasonography by using a biplane transducer and with echodefecography. Kappa (κ) was calculated as an index of agreement between the techniques. Diagnostic accuracy (sensitivity, specificity, and positive and negative predictive values) of the combined technique in detection of posterior dysfunctions was assessed with echodefecography as the standard for comparison. A total of 33 women were evaluated. Substantial agreement was observed regarding normal relaxation and anismus. In detecting the absence or presence of rectocele, the 2 methods agreed in all cases. Near-perfect agreement was found for rectocele grade I, grade II, and grade III. Perfect agreement was found for entero/sigmoidocele, with near-perfect agreement for rectal intussusception. Using echodefecography as the standard for comparison, we found high diagnostic accuracy of transvaginal and transrectal ultrasonography in the detection of posterior dysfunctions. 
This combined technique should be compared with other dynamic techniques and validated with conventional defecography. Dynamic 3-dimensional transvaginal and transrectal ultrasonography is a simple and fast ultrasound technique that shows strong agreement with echodefecography and may be used as an alternative method to assess patients with obstructed defecation syndrome.

  1. Quality of pharmaceutical care at the pharmacy counter: patients’ experiences versus video observation

    PubMed Central

    Koster, Ellen S; Blom, Lyda; Overbeeke, Marloes R; Philbert, Daphne; Vervloet, Marcia; Koopman, Laura; van Dijk, Liset

    2016-01-01

    Introduction: Consumer Quality Index questionnaires are used to assess quality of care from patients’ experiences. Objective: To provide insight into the agreement about quality of pharmaceutical care, measured both by a patient questionnaire and video observations. Methods: Pharmaceutical encounters in four pharmacies were video-recorded. Patients completed a questionnaire based upon the Consumer Quality Index Pharmaceutical Care after the encounter containing questions about patients’ experiences regarding information provision, medication counseling, and pharmacy staff’s communication style. An observation protocol was used to code the recorded encounters. Agreement between video observation and patients’ experiences was calculated. Results: In total, 109 encounters were included for analysis. For the domains “medication counseling” and “communication style”, agreement between patients’ experiences and observations was very high (>90%). Less agreement (45%) was found for “information provision”, which was rated more positively by patients compared to the observations, especially for the topic of encouragement of patients’ questioning behavior. Conclusion: A questionnaire is useful to assess the quality of medication counseling and pharmacy staff’s communication style, but might be less suitable to evaluate information provision and pharmacy staff’s encouragement of patients’ questioning behavior. Although patients may believe that they have received all necessary information to use their new medicine, some information on specific instructions was not addressed during the encounter. When using questionnaires to get insight into information provision, observations of encounters are very informative to validate the patient questionnaires and make necessary adjustments. PMID:27042025

  2. A study of the utilization of ERTS-1 data from the Wabash River Basin. [crop identification, water resources, urban land use, soil mapping, and atmospheric modeling

    NASA Technical Reports Server (NTRS)

    Landgrebe, D. A. (Principal Investigator)

    1974-01-01

    The author has identified the following significant results. The most significant results were obtained in the water resources research, urban land use mapping, and soil association mapping projects. ERTS-1 data was used to classify water bodies to determine acreages and high agreement was obtained with USGS figures. Quantitative evaluation was achieved of urban land use classifications from ERTS-1 data and an overall test accuracy of 90.3% was observed. ERTS-1 data classifications of soil test sites were compared with soil association maps scaled to match the computer produced map and good agreement was observed. In some cases the ERTS-1 results proved to be more accurate than the soil association map.

  3. Does parent-child agreement vary based on presenting problems? Results from a UK clinical sample.

    PubMed

    Cleridou, Kalia; Patalay, Praveetha; Martin, Peter

    2017-01-01

    Discrepancies are often found between child and parent reports of child psychopathology; nevertheless, the role of the child's presenting difficulties in relation to these is underexplored. This study investigates whether parent-child agreement on the conduct and emotional scales of the Strengths and Difficulties Questionnaire (SDQ) varied as a result of certain child characteristics, including the child's presenting problems to clinical services, age and gender. The UK-based sample consisted of 16,754 clinical records of children aged 11-17, the majority of which were female (57%) and White (76%). The dataset was provided by the Child Outcomes Research Consortium, which collects outcome measures from child services across the UK. Clinicians reported the child's presenting difficulties, and parents and children completed the SDQ. Using correlation analysis, the main findings indicated that agreement varied as a result of the child's difficulties for reports of conduct problems, and this seemed to be related to the presence or absence of externalising difficulties in the child's presentation. This was not the case for reports of emotional difficulties. In addition, agreement was higher when reporting problems not consistent with the child's presentation; for instance, agreement on conduct problems was greater for children presenting with internalising problems. Lastly, the children's age and gender did not seem to have an impact on agreement. These findings demonstrate that certain child presenting difficulties, and in particular conduct problems, may be related to informant agreement and need to be considered in clinical practice and research. Trial Registration This study was observational and as such did not require trial registration.

  4. The development and validity of the Salford Gait Tool: an observation-based clinical gait assessment tool.

    PubMed

    Toro, Brigitte; Nester, Christopher J; Farren, Pauline C

    2007-03-01

    To develop the construct, content, and criterion validity of the Salford Gait Tool (SF-GT) and to evaluate agreement between gait observations using the SF-GT and kinematic gait data. Tool development and comparative evaluation. University in the United Kingdom. For designing construct and content validity, convenience samples of 10 children with hemiplegic, diplegic, and quadriplegic cerebral palsy (CP) and 152 physical therapy students and 4 physical therapists were recruited. For developing criterion validity, kinematic gait data of 13 gait clusters containing 56 children with hemiplegic, diplegic, and quadriplegic CP and 11 neurologically intact children was used. For clinical evaluation, a convenience sample of 23 pediatric physical therapists participated. We developed a sagittal plane observational gait assessment tool through a series of design, test, and redesign iterations. The tool's grading system was calibrated using kinematic gait data of 13 gait clusters and was evaluated by comparing the agreement of gait observations using the SF-GT with kinematic gait data. Criterion standard kinematic gait data. There was 58% mean agreement based on grading categories and 80% mean agreement based on degree estimations evaluated with the least significant difference method. The new SF-GT has good concurrent criterion validity.

  5. Evaluation of agreement among dermatologists in the assessment of the color of port wine stains and their clearance after treatment with the flashlamp-pumped dye laser.

    PubMed

    Pérez, B; Abraira, V; Núñez, M; Boixeda, P; Perez Corral, F; Ledo, A

    1997-01-01

    Color classification and its subjective clearance evaluation in response to treatment are essential in the management of patients with port wine stains (PWS). But color perception by physicians is not an objective measurement so that it can change among observers. Agreement among physicians is essential for the reliability of the color classification and the clinical assessment of the response to laser treatment. The purpose of our study was to determine the reliability of the clinical color classification of port wine stains and of their color change or clearance in response to laser treatment. The study was not designed to evaluate the outcome of laser treatment in PWS or the factors that could predict the final response. We used the kappa index to evaluate the proportion of agreement in color and clearance perception among dermatologists. Six dermatologists classified the initial color of PWS in 80 patients. Three of them also assessed the amount of clearance achieved after treatment with the flashlamp-pumped dye laser. These three dermatologists were usually dedicated to treat patients with PWS, while the other three were not. The kappa index showed a substantial agreement in both cases. No difference in the initial color perception was observed between the group of dermatologists specialized in PWS and the other three dermatologists. These results favor the reliability of the clinical method in the assessment of PWS before and after laser treatment. So, although subjective, color perception by physicians can be used in the study of laser treatment outcome in PWS and its related factors, and the results of different authors can be compared.

  6. Breast MRI radiomics: comparison of computer- and human-extracted imaging phenotypes.

    PubMed

    Sutton, Elizabeth J; Huang, Erich P; Drukker, Karen; Burnside, Elizabeth S; Li, Hui; Net, Jose M; Rao, Arvind; Whitman, Gary J; Zuley, Margarita; Ganott, Marie; Bonaccio, Ermelinda; Giger, Maryellen L; Morris, Elizabeth A

    2017-01-01

    In this study, we sought to investigate if computer-extracted magnetic resonance imaging (MRI) phenotypes of breast cancer could replicate human-extracted size and Breast Imaging-Reporting and Data System (BI-RADS) imaging phenotypes using MRI data from The Cancer Genome Atlas (TCGA) project of the National Cancer Institute. Our retrospective interpretation study involved analysis of Health Insurance Portability and Accountability Act-compliant breast MRI data from The Cancer Imaging Archive, an open-source database from the TCGA project. This study was exempt from institutional review board approval at Memorial Sloan Kettering Cancer Center and the need for informed consent was waived. Ninety-one pre-operative breast MRIs with verified invasive breast cancers were analysed. Three fellowship-trained breast radiologists evaluated the index cancer in each case according to size and the BI-RADS lexicon for shape, margin, and enhancement (human-extracted image phenotypes [HEIP]). Human inter-observer agreement was analysed by the intra-class correlation coefficient (ICC) for size and Krippendorff's α for other measurements. Quantitative MRI radiomics of computerised three-dimensional segmentations of each cancer generated computer-extracted image phenotypes (CEIP). Spearman's rank correlation coefficients were used to compare HEIP and CEIP. Inter-observer agreement for HEIP varied, with the highest agreement seen for size (ICC 0.679) and shape (ICC 0.527). The computer-extracted maximum linear size replicated the human measurement with p < 10^-12. CEIP of shape, specifically sphericity and irregularity, replicated HEIP with both p values < 0.001. CEIP did not demonstrate agreement with HEIP of tumour margin or internal enhancement. Quantitative radiomics of breast cancer may replicate human-extracted tumour size and BI-RADS imaging phenotypes, thus enabling precision medicine.

  7. Medial tibial stress syndrome can be diagnosed reliably using history and physical examination.

    PubMed

    Winters, M; Bakker, E W P; Moen, M H; Barten, C C; Teeuwen, R; Weir, A

    2017-02-08

    The majority of sporting injuries are clinically diagnosed using history and physical examination as the cornerstone. There are no studies supporting the reliability of making a clinical diagnosis of medial tibial stress syndrome (MTSS). Our aim was to assess if MTSS can be diagnosed reliably, using history and physical examination. We also investigated if clinicians were able to reliably identify concurrent lower leg injuries. A clinical reliability study was performed at multiple sports medicine sites in The Netherlands. Athletes with non-traumatic lower leg pain were assessed for having MTSS by two clinicians, who were blinded to each other's diagnoses. We calculated the prevalence, percentage of agreement, observed percentage of positive agreement (Ppos), observed percentage of negative agreement (Pneg) and Kappa-statistic with 95%CI. Forty-nine athletes participated in this study, of whom 46 completed both assessments. The prevalence of MTSS was 74%. The percentage of agreement was 96%, with Ppos and Pneg of 97% and 92%, respectively. The inter-rater reliability was almost perfect; k=0.89 (95% CI 0.74 to 1.00), p<0.000001. Of the 34 athletes with MTSS, 11 (32%) had a concurrent lower leg injury, which was reliably noted by our clinicians, k=0.73, 95% CI 0.48 to 0.98, p<0.0001. Our findings show that MTSS can be reliably diagnosed clinically using history and physical examination, in clinical practice and research settings. We also found that concurrent lower leg injuries are common in athletes with MTSS. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
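
    The agreement measures named in this abstract (percentage of agreement, Ppos, Pneg, and Cohen's kappa) can all be derived from a 2x2 table of two raters' yes/no diagnoses. The following is a minimal sketch using the standard textbook formulas, not the authors' analysis code. The function name and the cell counts are hypothetical; the counts were chosen only so that the summary values fall close to those quoted above and are not taken from the paper.

```python
def two_rater_agreement(a, b, c, d):
    """Agreement statistics for two raters and a binary diagnosis.

    a: both raters positive      b: rater 1 positive, rater 2 negative
    c: rater 1 negative, rater 2 positive      d: both raters negative
    """
    n = a + b + c + d
    p_obs = (a + d) / n                       # overall proportion of agreement
    p_pos = 2 * a / (2 * a + b + c)           # proportion of positive agreement
    p_neg = 2 * d / (2 * d + b + c)           # proportion of negative agreement
    # Agreement expected by chance, from the raters' marginal totals
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)     # Cohen's kappa
    return {"p_obs": p_obs, "p_pos": p_pos, "p_neg": p_neg, "kappa": kappa}


# Hypothetical counts for illustration only (46 rated subjects in total)
stats = two_rater_agreement(a=33, b=1, c=1, d=11)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

    Reporting Ppos and Pneg alongside kappa is useful because kappa alone is sensitive to prevalence: when most subjects are positive, chance agreement is high and kappa can look modest even when raw agreement is high.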

  8. Molecular dynamics simulation study of solvent effects on conformation and dynamics of polyethylene oxide and polypropylene oxide chains in water and in common organic solvents.

    PubMed

    Hezaveh, Samira; Samanta, Susruta; Milano, Giuseppe; Roccatano, Danilo

    2012-03-28

    In this paper, the conformation and dynamics properties of polyethylene oxide (PEO) and polypropylene oxide (PPO) polymer chains at 298 K have been studied in the melt and at infinite dilution condition in water, methanol, chloroform, carbon tetrachloride, and n-heptane using molecular dynamics simulations. The calculated density of PEO melt with chain lengths of n = 2, 3, 4, 5 and, for PPO, n = 7 are in good agreement with the available experimental data. The conformational properties of PEO and PPO show an increasing gauche preference for the O-C-C-O dihedral in the following order water>methanol>chloroform>carbon tetrachloride = n-heptane. On the contrary, the preference for trans conformation has a maximum in carbon tetrachloride and n-heptane followed in the order by chloroform, methanol, and water. The PEO conformational preferences are in qualitative agreement with results of NMR studies. PEO chains formed different types of hydrogen bonds with polar solvent molecules. In particular, the occurrence of bifurcated hydrogen bonding in chloroform was also observed. Radii of gyration of PEO chains of length larger than n = 9 monomers showed a good agreement with light scattering data in water and in methanol. For the shorter chains the observed deviations are probably due to the enhanced hydrophobic effects caused by the terminal methyl groups. For PEO the fitting of end-to-end distance distributions with the semi-flexible chain model at 298 K provided persistence lengths of 0.375 and 0.387 nm in water and methanol, respectively. Finally, the radius of gyration of Pluronic P85 turned out to be 2.25 ± 0.4 nm at 293 K in water in agreement with experimental data.

  9. Molecular dynamics simulation study of solvent effects on conformation and dynamics of polyethylene oxide and polypropylene oxide chains in water and in common organic solvents

    NASA Astrophysics Data System (ADS)

    Hezaveh, Samira; Samanta, Susruta; Milano, Giuseppe; Roccatano, Danilo

    2012-03-01

    In this paper, the conformation and dynamics properties of polyethylene oxide (PEO) and polypropylene oxide (PPO) polymer chains at 298 K have been studied in the melt and at infinite dilution condition in water, methanol, chloroform, carbon tetrachloride, and n-heptane using molecular dynamics simulations. The calculated density of PEO melt with chain lengths of n = 2, 3, 4, 5 and, for PPO, n = 7 are in good agreement with the available experimental data. The conformational properties of PEO and PPO show an increasing gauche preference for the O-C-C-O dihedral in the following order water>methanol>chloroform>carbon tetrachloride = n-heptane. On the contrary, the preference for trans conformation has a maximum in carbon tetrachloride and n-heptane followed in the order by chloroform, methanol, and water. The PEO conformational preferences are in qualitative agreement with results of NMR studies. PEO chains formed different types of hydrogen bonds with polar solvent molecules. In particular, the occurrence of bifurcated hydrogen bonding in chloroform was also observed. Radii of gyration of PEO chains of length larger than n = 9 monomers showed a good agreement with light scattering data in water and in methanol. For the shorter chains the observed deviations are probably due to the enhanced hydrophobic effects caused by the terminal methyl groups. For PEO the fitting of end-to-end distance distributions with the semi-flexible chain model at 298 K provided persistence lengths of 0.375 and 0.387 nm in water and methanol, respectively. Finally, the radius of gyration of Pluronic P85 turned out to be 2.25 ± 0.4 nm at 293 K in water in agreement with experimental data.

  10. The Effects of Blade Count on Boundary Layer Development in a Low-Pressure Turbine

    NASA Technical Reports Server (NTRS)

    Dorney, Daniel J.; Flitan, Horia C.; Ashpis, David E.; Solomon, William J.

    2000-01-01

    Experimental data from jet-engine tests have indicated that turbine efficiencies at takeoff can be as much as two points higher than those at cruise conditions. Recent studies have shown that Reynolds number effects contribute to the lower efficiencies at cruise conditions. In the current study numerical simulations have been performed to study the boundary layer development in a two-stage low-pressure turbine, and to evaluate the models available for low Reynolds number flows in turbomachinery. In a previous study using the same geometry the predicted time-averaged boundary layer quantities showed excellent agreement with the experimental data, but the predicted unsteady results showed only fair agreement with the experimental data. It was surmised that the blade count approximation used in the numerical simulations generated more unsteadiness than was observed in the experiments. In this study a more accurate blade approximation has been used to model the turbine, and the method of post-processing the boundary layer information has been modified to more closely resemble the process used in the experiments. The predicted results show improved agreement with the unsteady experimental data.

  11. Evaluation of Multiclass Model Observers in PET LROC Studies

    NASA Astrophysics Data System (ADS)

    Gifford, H. C.; Kinahan, P. E.; Lartizien, C.; King, M. A.

    2007-02-01

    A localization ROC (LROC) study was conducted to evaluate nonprewhitening matched-filter (NPW) and channelized NPW (CNPW) versions of a multiclass model observer as predictors of human tumor-detection performance with PET images. Target localization is explicitly performed by these model observers. Tumors were placed in the liver, lungs, and background soft tissue of a mathematical phantom, and the data simulation modeled a full-3D acquisition mode. Reconstructions were performed with the FORE+AWOSEM algorithm. The LROC study measured observer performance with 2D images consisting of either coronal, sagittal, or transverse views of the same set of cases. Versions of the CNPW observer based on two previously published difference-of-Gaussian channel models demonstrated good quantitative agreement with human observers. One interpretation of these results treats the CNPW observer as a channelized Hotelling observer with implicit internal noise

  12. Agreement Analysis: What He Said, She Said Versus You Said.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-06-01

    Correlation and agreement are 2 concepts that are widely applied in the medical literature and clinical practice to assess for the presence and strength of an association. However, because correlation and agreement are conceptually distinct, they require the use of different statistics. Agreement is a concept that is closely related to but fundamentally different from and often confused with correlation. The idea of agreement refers to the notion of reproducibility of clinical evaluations or biomedical measurements. The intraclass correlation coefficient is a commonly applied measure of agreement for continuous data. The intraclass correlation coefficient can be validly applied specifically to assess intrarater reliability and interrater reliability. As its name implies, the Lin concordance correlation coefficient is another measure of agreement or concordance. In undertaking a comparison of a new measurement technique with an established one, it is necessary to determine whether they agree sufficiently for the new to replace the old. Bland and Altman demonstrated that using a correlation coefficient is not appropriate for assessing the interchangeability of 2 such measurement methods. They in turn described an alternative approach, the since widely applied graphical Bland-Altman Plot, which is based on a simple estimation of the mean and standard deviation of differences between measurements by the 2 methods. In reading a medical journal article that includes the interpretation of diagnostic tests and application of diagnostic criteria, attention is conventionally focused on aspects like sensitivity, specificity, predictive values, and likelihood ratios. However, if the clinicians who interpret the test cannot agree on its interpretation and resulting typically dichotomous or binary diagnosis, the test results will be of little practical use. 
    Such agreement between observers (interobserver agreement) about a dichotomous or binary variable is often reported as the kappa statistic. Assessing the interrater agreement between observers, in the case of ordinal variables and data, also has important biomedical applicability. Typically, this situation calls for use of the Cohen weighted kappa. Questionnaires, psychometric scales, and diagnostic tests are widespread and increasingly used by not only researchers but also clinicians in their daily practice. It is essential that these questionnaires, scales, and diagnostic tests have a high degree of agreement between observers. It is therefore vital that biomedical researchers and clinicians apply the appropriate statistical measures of agreement to assess the reproducibility and quality of these measurement instruments and decision-making processes.
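
    The Bland-Altman approach described in this abstract rests on the mean and standard deviation of the paired differences between two measurement methods. A minimal sketch of that calculation follows. It assumes the conventional 95% limits of agreement (mean difference plus or minus 1.96 standard deviations), which the abstract itself does not spell out; the function name and the sample readings are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (method_a - method_b);
    the limits are bias +/- 1.96 * SD of the differences, the usual
    95% limits of agreement (an assumption, see the note above).
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods must measure the same subjects")
    diffs = [x - y for x, y in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired readings from an established and a new method
established = [10.0, 12.0, 14.0, 16.0]
new_method = [9.0, 12.0, 15.0, 15.0]
bias, lower, upper = bland_altman(established, new_method)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

    Whether the two methods are interchangeable is then a clinical judgment: the limits are compared against the largest difference that would be acceptable in practice, rather than against a significance threshold.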

  13. Best-Quality Vessel Identification Using Vessel Quality Measure in Multiple-Phase Coronary CT Angiography.

    PubMed

    Hadjiiski, Lubomir; Liu, Jordan; Chan, Heang-Ping; Zhou, Chuan; Wei, Jun; Chughtai, Aamer; Kuriakose, Jean; Agarwal, Prachi; Kazerooni, Ella

    2016-01-01

    The detection of stenotic plaques strongly depends on the quality of the coronary arterial tree imaged with coronary CT angiography (cCTA). However, it is time consuming for the radiologist to select the best-quality vessels from the multiple-phase cCTA for interpretation in clinical practice. We are developing an automated method for selection of the best-quality vessels from coronary arterial trees in multiple-phase cCTA to facilitate radiologist's reading or computerized analysis. Our automated method consists of vessel segmentation, vessel registration, corresponding vessel branch matching, vessel quality measure (VQM) estimation, and automatic selection of best branches based on VQM. For every branch, the VQM was calculated as the average radial gradient. An observer preference study was conducted to visually compare the quality of the selected vessels. 167 corresponding branch pairs were evaluated by two radiologists. The agreement between the first radiologist and the automated selection was 76% with kappa of 0.49. The agreement between the second radiologist and the automated selection was also 76% with kappa of 0.45. The agreement between the two radiologists was 81% with kappa of 0.57. The observer preference study demonstrated the feasibility of the proposed automated method for the selection of the best-quality vessels from multiple cCTA phases.

  14. A comparative analysis of Patient-Reported Expanded Disability Status Scale tools.

    PubMed

    Collins, Christian DE; Ivry, Ben; Bowen, James D; Cheng, Eric M; Dobson, Ruth; Goodin, Douglas S; Lechner-Scott, Jeannette; Kappos, Ludwig; Galea, Ian

    2016-09-01

    Patient-Reported Expanded Disability Status Scale (PREDSS) tools are an attractive alternative to the Expanded Disability Status Scale (EDSS) during long term or geographically challenging studies, or in pressured clinical service environments. Because the studies reporting these tools have used different metrics to compare the PREDSS and EDSS, we undertook an individual patient data level analysis of all available tools. Spearman's rho and the Bland-Altman method were used to assess correlation and agreement respectively. A systematic search for validated PREDSS tools covering the full EDSS range identified eight such tools. Individual patient data were available for five PREDSS tools. Excellent correlation was observed between EDSS and PREDSS with all tools. A higher level of agreement was observed with increasing levels of disability. In all tools, the 95% limits of agreement were greater than the minimum EDSS difference considered to be clinically significant. However, the intra-class coefficient was greater than that reported for EDSS raters of mixed seniority. The visual functional system was identified as the most significant predictor of the PREDSS-EDSS difference. This analysis will (1) enable researchers and service providers to make an informed choice of PREDSS tool, depending on their individual requirements, and (2) facilitate improvement of current PREDSS tools. © The Author(s), 2015.

  15. The Neurologic Assessment in Neuro-Oncology (NANO) scale: a tool to assess neurologic function for integration into the Response Assessment in Neuro-Oncology (RANO) criteria

    PubMed Central

    DeAngelis, Lisa M.; Brandes, Alba A.; Peereboom, David M.; Galanis, Evanthia; Lin, Nancy U.; Soffietti, Riccardo; Macdonald, David R.; Chamberlain, Marc; Perry, James; Jaeckle, Kurt; Mehta, Minesh; Stupp, Roger; Muzikansky, Alona; Pentsova, Elena; Cloughesy, Timothy; Iwamoto, Fabio M.; Tonn, Joerg-Christian; Vogelbaum, Michael A.; Wen, Patrick Y.; van den Bent, Martin J.; Reardon, David A.

    2017-01-01

    Background. The Macdonald criteria and the Response Assessment in Neuro-Oncology (RANO) criteria define radiologic parameters to classify therapeutic outcome among patients with malignant glioma and specify that clinical status must be incorporated and prioritized for overall assessment. But neither provides specific parameters to do so. We hypothesized that a standardized metric to measure neurologic function will permit more effective overall response assessment in neuro-oncology. Methods. An international group of physicians including neurologists, medical oncologists, radiation oncologists, and neurosurgeons with expertise in neuro-oncology drafted the Neurologic Assessment in Neuro-Oncology (NANO) scale as an objective and quantifiable metric of neurologic function evaluable during a routine office examination. The scale was subsequently tested in a multicenter study to determine its overall reliability, inter-observer variability, and feasibility. Results. The NANO scale is a quantifiable evaluation of 9 relevant neurologic domains based on direct observation and testing conducted during routine office visits. The score defines overall response criteria. A prospective, multinational study noted a >90% inter-observer agreement rate with kappa statistic ranging from 0.35 to 0.83 (fair to almost perfect agreement), and a median assessment time of 4 minutes (interquartile range, 3–5). Conclusion. The NANO scale provides an objective clinician-reported outcome of neurologic function with high inter-observer agreement. It is designed to combine with radiographic assessment to provide an overall assessment of outcome for neuro-oncology patients in clinical trials and in daily practice. Furthermore, it complements existing patient-reported outcomes and cognition testing to combine for a global clinical outcome assessment of well-being among brain tumor patients. PMID:28453751

  16. Does extensive genotyping and nasal potential difference testing clarify the diagnosis of cystic fibrosis among patients with single-organ manifestations of cystic fibrosis?

    PubMed

    Ooi, Chee Y; Dupuis, Annie; Ellis, Lynda; Jarvi, Keith; Martin, Sheelagh; Ray, Peter N; Steele, Leslie; Kortan, Paul; Gonska, Tanja; Dorfman, Ruslan; Solomon, Melinda; Zielenski, Julian; Corey, Mary; Tullis, Elizabeth; Durie, Peter

    2014-03-01

    The phenotypic spectrum of cystic fibrosis (CF) has expanded to include patients affected by single-organ diseases. Extensive genotyping and nasal potential difference (NPD) testing have been proposed to assist in the diagnosis of CF when sweat testing is inconclusive. However, the diagnostic yield of extensive genotyping and NPD and the concordance between NPD and the sweat test have not been carefully evaluated. We evaluated the diagnostic outcomes of genotyping (with 122 mutations included as disease causing), sweat testing and NPD in a prospectively ascertained cohort of undiagnosed patients who presented with chronic sino-pulmonary disease (RESP), chronic/recurrent pancreatitis (PANC) or obstructive azoospermia (AZOOSP). 202 patients (68 RESP, 42 PANC and 92 AZOOSP) were evaluated; 17.3%, 22.8% and 59.9% had abnormal, borderline and normal sweat chloride results, respectively. Only 17 (8.4%) patients were diagnosable as having CF by genotyping. Compared to sweat testing, NPD identified more patients as having CF (33.2%) with fewer borderline results (18.8%). The level of agreement according to kappa statistics (and the observed percentage of agreement) between sweat chloride and NPD in RESP, PANC and AZOOSP subjects was 'moderate' (65% observed agreement), 'poor' (33% observed agreement) and 'fair' (28% observed agreement), respectively. The degree of agreement only improved marginally when subjects with borderline sweat chloride results were excluded from the analysis. The diagnosis of CF or its exclusion is not always straightforward and may remain elusive even with comprehensive evaluation, particularly among individuals who present at an older age with single-organ manifestations suggestive of CF.

  17. Inter-observer reliability of radiographic classifications and measurements in the assessment of Perthes' disease.

    PubMed

    Wiig, Ola; Terjesen, Terje; Svenningsen, Svein

    2002-10-01

    We evaluated the inter-observer agreement of radiographic methods when evaluating patients with Perthes' disease. The radiographs were assessed at the time of diagnosis and at the 1-year follow-up by local orthopaedic surgeons (O) and 2 experienced pediatric orthopedic surgeons (TT and SS). The Catterall, Salter-Thompson, and Herring lateral pillar classifications were compared, and the femoral head coverage (FHC), center-edge angle (CE-angle), and articulo-trochanteric distance (ATD) were measured in the affected and normal hips. On the primary evaluation, the lateral pillar and Salter-Thompson classifications had a higher level of agreement among the observers than the Catterall classification, but none of the classifications showed good agreement (weighted kappa values between O and SS 0.56, 0.54, 0.49, respectively). Combining Catterall groups 1 and 2 into one group, and groups 3 and 4 into another resulted in better agreement (kappa 0.55) than with the original 4-group system. The agreement was also better (kappa 0.62-0.70) between experienced than between less experienced examiners for all classifications. The femoral head coverage was a more reliable and accurate measure than the CE-angle for quantifying the acetabular covering of the femoral head, as indicated by higher intraclass correlation coefficients (ICC) and smaller inter-observer differences. The ATD showed good agreement in all comparisons and had low interobserver differences. We conclude that all classifications of femoral head involvement are adequate in clinical work if the radiographic assessment is done by experienced examiners. When they are less experienced examiners, a 2-group classification or the lateral pillar classification is more reliable. For evaluation of containment of the femoral head, FHC is more appropriate than the CE-angle.

  18. Dynamical description of the fission process using the TD-BCS theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scamps, Guillaume; Simenel, Cédric; Lacroix, Denis

    2015-10-15

    The description of fission remains a challenge for nuclear microscopic theories. The time-dependent Hartree-Fock approach with BCS pairing is applied to study the last stage of the fission process. A good agreement is found for the one-body observables: the total kinetic energy and the average mass asymmetry. The non-physical dependence of two-body observables with the initial shape is discussed.

  19. Direct observation of the protonation of acetone ketyl radical by conductometric pulse radiolysis. [8-MeV electrons]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Janata, E.; Schuler, R.H.

    1980-12-11

    Improvements in conductometric pulse radiolysis methods allow direct observation of the protonation of the acetone ketyl radical anion on the 10-ns time scale. The protonation period of 9.7 ± 0.5 ns determined here is in good agreement with that estimated from the ESR line broadening studies of Laroff and Fessenden (J. Phys. Chem., 77, 1283 (1973)).

  20. Accuracy of endoscopic diagnosis of Helicobacter pylori infection according to level of endoscopic experience and the effect of training

    PubMed Central

    2013-01-01

    Background Accurate prediction of Helicobacter pylori infection status on endoscopic images can contribute to early detection of gastric cancer, especially in Asia. We identified the diagnostic yield of endoscopy for H. pylori infection at various endoscopist career levels and the effect of two years of training on diagnostic yield. Methods A total of 77 consecutive patients who underwent endoscopy were analyzed. H. pylori infection status was determined by histology, serology, and the urea breath test and categorized as H. pylori-uninfected, -infected, or -eradicated. Distinctive endoscopic findings were judged by six physicians at different career levels: beginner (<500 endoscopies), intermediate (1500–5000), and advanced (>5000). Diagnostic yield and inter- and intra-observer agreement on H. pylori infection status were evaluated. Values were compared between the two beginners after two years of training. The kappa (K) statistic was used to calculate agreement. Results For all physicians, the diagnostic yield was 88.9% for H. pylori-uninfected, 62.1% for H. pylori-infected, and 55.8% for H. pylori-eradicated. Intra-observer agreement for H. pylori infection status was good (K > 0.6) for all physicians, while inter-observer agreement was lower (K = 0.46) for beginners than for intermediate and advanced (K > 0.6). For all physicians, good inter-observer agreement in endoscopic findings was seen for atrophic change (K = 0.69), regular arrangement of collecting venules (K = 0.63), and hemorrhage (K = 0.62). For beginners, the diagnostic yield of H. pylori-infected/eradicated status and inter-observer agreement of endoscopic findings were improved after two years of training. Conclusions The diagnostic yield of endoscopic diagnosis was high for H. pylori-uninfected cases, but was low for H. pylori-eradicated cases. In beginners, daily training on endoscopic findings improved the low diagnostic yield. PMID:23947684

  1. Reliability of psychiatric diagnosis in hospitalized adolescents. Interrater agreement using DSM-III.

    PubMed

    Strober, M; Green, J; Carlson, G

    1981-02-01

    To determine the reliability of psychiatric diagnosis in hospitalized adolescents, 95 consecutively admitted patients were diagnosed independently by two experienced clinicians using DSM-III criteria. Diagnostic judgments were based on joint interview of the patient via a structured mental-status examination, nursing observations, and referral materials. Concordance was analyzed by the kappa coefficient. A total of 13 DSM-III categories were used to classify this cohort, with the majority of categories representing traditional syndromes of functional psychopathology. There was complete agreement between the raters for more than three fourths of the patients. Levels of agreement for the categories of schizophrenia and major affective disorder were similar to values obtained in recent studies of adult patients. The results are discussed in relation to historical conceptions of adolescent psychopathology.
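
    The kappa coefficient used above for two raters can be sketched as follows. This is a minimal illustration of Cohen's kappa, assuming each clinician assigns exactly one category per patient; the function name and the sample diagnoses are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per subject."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of subjects given the same category by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over all categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical diagnoses for six patients (illustration only).
clinician_1 = ["schizophrenia", "affective", "affective", "conduct", "conduct", "affective"]
clinician_2 = ["schizophrenia", "affective", "conduct", "conduct", "conduct", "affective"]
print(round(cohen_kappa(clinician_1, clinician_2), 3))
```

    Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it can be much lower than the percentage of patients on whom the raters agree.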

  2. Theoretical and experimental studies of the atmospheric sodium layer

    NASA Technical Reports Server (NTRS)

    Richter, E. S.; Sechrist, C. F., Jr.

    1978-01-01

    Atmospheric atomic sodium was studied with a laser radar system. Photocount data were processed using a digital filter to obtain continuous estimates of the sodium concentration versus altitude. Wave-like structures in the sodium layer were observed, and there was evidence for the presence of a standing wave in the layer. The bottomside of the layer was observed to undulate with a period of about 2 1/2 hours, and the layer was observed to broaden through the night. A meteor ablation-cluster ion theory of sodium was developed. The theory shows good agreement with existing atmospheric observations as well as laboratory measurements of rate constants.

  3. A novel scoring system to measure radiographic abnormalities and related spirometric values in cured pulmonary tuberculosis.

    PubMed

    Báez-Saldaña, Renata; López-Arteaga, Yesenia; Bizarrón-Muro, Alma; Ferreira-Guerrero, Elizabeth; Ferreyra-Reyes, Leticia; Delgado-Sánchez, Guadalupe; Cruz-Hervert, Luis Pablo; Mongua-Rodríguez, Norma; García-García, Lourdes

    2013-01-01

    Despite chemotherapy, patients with cured pulmonary tuberculosis may be left with impaired lung function. To evaluate a novel scoring system based on the degree of radiographic abnormalities and related spirometric values in patients with cured pulmonary tuberculosis. One hundred and twenty seven patients with cured pulmonary tuberculosis were prospectively enrolled in a referral hospital specializing in respiratory diseases. Spirometry was performed and the extent of radiographic abnormalities was evaluated twice by each of two readers to generate a novel quantitative score. Scoring reproducibility was analyzed by the intra-class correlation coefficient (ICC) and the Bland-Altman method. Multiple linear regression models were performed to assess the association of the extent of radiographic abnormalities with spirometric values. The intra-observer agreement for scoring of radiographic abnormalities (SRA) showed an ICC of 0.81 (CI:95%, 0.67-0.95) and 0.78 (CI:95%, 0.65-0.92), for reader 1 and 2, respectively. Inter-observer reproducibility for the first measurement was 0.83 (CI:95%, 0.71-0.95), and for the second measurement was 0.74 (CI:95%, 0.58-0.90). The Bland-Altman analysis of the intra-observer agreement showed a mean bias of 0.87% and -0.55% and an inter-observer agreement of -0.35% and -1.78%, indicating a minor average systematic variability. After adjustment for age, gender, height, smoking status, pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and negatively associated with absolute and percent predicted values of FVC: -0.07 (CI:95%, -0.01 to -0.04); -2.48 (CI:95%, -3.45 to -1.50); and FEV1 -0.07 (CI:95%, -0.10 to -0.05); -2.92 (CI:95%, -3.87 to -1.97) respectively, in the patients studied. The extent of radiographic abnormalities, as evaluated through our novel scoring system, was inversely associated with spirometric values, and exhibited good reliability and reproducibility. As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable.

  4. A Novel Scoring System to Measure Radiographic Abnormalities and Related Spirometric Values in Cured Pulmonary Tuberculosis

    PubMed Central

    Báez-Saldaña, Renata; López-Arteaga, Yesenia; Bizarrón-Muro, Alma; Ferreira-Guerrero, Elizabeth; Ferreyra-Reyes, Leticia; Delgado-Sánchez, Guadalupe; Cruz-Hervert, Luis Pablo; Mongua-Rodríguez, Norma; García-García, Lourdes

    2013-01-01

    Background Despite chemotherapy, patients with cured pulmonary tuberculosis may be left with impaired lung function. Objective To evaluate a novel scoring system based on the degree of radiographic abnormalities and related spirometric values in patients with cured pulmonary tuberculosis. Methods One hundred and twenty seven patients with cured pulmonary tuberculosis were prospectively enrolled in a referral hospital specializing in respiratory diseases. Spirometry was performed and the extent of radiographic abnormalities was evaluated twice by each of two readers to generate a novel quantitative score. Scoring reproducibility was analyzed by the intra-class correlation coefficient (ICC) and the Bland-Altman method. Multiple linear regression models were performed to assess the association of the extent of radiographic abnormalities with spirometric values. Results The intra-observer agreement for scoring of radiographic abnormalities (SRA) showed an ICC of 0.81 (CI:95%, 0.67–0.95) and 0.78 (CI:95%, 0.65–0.92), for reader 1 and 2, respectively. Inter-observer reproducibility for the first measurement was 0.83 (CI:95%, 0.71–0.95), and for the second measurement was 0.74 (CI:95%, 0.58–0.90). The Bland-Altman analysis of the intra-observer agreement showed a mean bias of 0.87% and -0.55% and an inter-observer agreement of -0.35% and -1.78%, indicating a minor average systematic variability. After adjustment for age, gender, height, smoking status, pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and negatively associated with absolute and percent predicted values of FVC: -0.07 (CI:95%, -0.01 to -0.04); -2.48 (CI:95%, -3.45 to -1.50); and FEV1 -0.07 (CI:95%, -0.10 to -0.05); -2.92 (CI:95%, -3.87 to -1.97) respectively, in the patients studied. Conclusion The extent of radiographic abnormalities, as evaluated through our novel scoring system, was inversely associated with spirometric values, and exhibited good reliability and reproducibility. As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable. PMID:24223865

  5. Development of Two-Moment Cloud Microphysics for Liquid and Ice Within the NASA Goddard Earth Observing System Model (GEOS-5)

    NASA Technical Reports Server (NTRS)

    Barahona, Donifan; Molod, Andrea M.; Bacmeister, Julio; Nenes, Athanasios; Gettelman, Andrew; Morrison, Hugh; Phillips, Vaughan; Eichmann, Andrew F.

    2013-01-01

    This work presents the development of a two-moment cloud microphysics scheme within the version 5 of the NASA Goddard Earth Observing System (GEOS-5). The scheme includes the implementation of a comprehensive stratiform microphysics module, a new cloud coverage scheme that allows ice supersaturation and a new microphysics module embedded within the moist convection parameterization of GEOS-5. Comprehensive physically-based descriptions of ice nucleation, including homogeneous and heterogeneous freezing, and liquid droplet activation are implemented to describe the formation of cloud particles in stratiform clouds and convective cumulus. The effect of preexisting ice crystals on the formation of cirrus clouds is also accounted for. A new parameterization of the subgrid scale vertical velocity distribution accounting for turbulence and gravity wave motion is developed. The implementation of the new microphysics significantly improves the representation of liquid water and ice in GEOS-5. Evaluation of the model shows agreement of the simulated droplet and ice crystal effective and volumetric radius with satellite retrievals and in situ observations. The simulated global distribution of supersaturation is also in agreement with observations. It was found that when using the new microphysics the fraction of condensate that remains as liquid follows a sigmoidal increase with temperature which differs from the linear increase assumed in most models and is in better agreement with available observations. The performance of the new microphysics in reproducing the observed total cloud fraction, longwave and shortwave cloud forcing, and total precipitation is similar to the operational version of GEOS-5 and in agreement with satellite retrievals. However the new microphysics tends to underestimate the coverage of persistent low level stratocumulus. 
Sensitivity studies showed that the simulated cloud properties are robust to moderate variation in cloud microphysical parameters. However significant sensitivity in ice cloud properties was found to variation in the dispersion of the ice crystal size distribution and the critical size for ice autoconversion. The implementation of the new microphysics leads to a more realistic representation of cloud processes in GEOS-5 and allows the linkage of cloud properties to aerosol emissions.

  6. Novel Zero-Heat-Flux Deep Body Temperature Measurement in Lower Extremity Vascular and Cardiac Surgery.

    PubMed

    Mäkinen, Marja-Tellervo; Pesonen, Anne; Jousela, Irma; Päivärinta, Janne; Poikajärvi, Satu; Albäck, Anders; Salminen, Ulla-Stina; Pesonen, Eero

    2016-08-01

    The aim of this study was to compare deep body temperature obtained using a novel noninvasive continuous zero-heat-flux temperature measurement system with core temperatures obtained using conventional methods. A prospective, observational study. Operating room of a university hospital. The study comprised 15 patients undergoing vascular surgery of the lower extremities and 15 patients undergoing cardiac surgery with cardiopulmonary bypass. Zero-heat-flux thermometry on the forehead and standard core temperature measurements. Body temperature was measured using a new thermometry system (SpotOn; 3M, St. Paul, MN) on the forehead and with conventional methods in the esophagus during vascular surgery (n = 15), and in the nasopharynx and pulmonary artery during cardiac surgery (n = 15). The agreement between SpotOn and the conventional methods was assessed using the Bland-Altman random-effects approach for repeated measures. The mean difference between SpotOn and the esophageal temperature during vascular surgery was +0.08°C (95% limits of agreement -0.25 to +0.40°C). During cardiac surgery, while off CPB, the mean difference between SpotOn and the pulmonary arterial temperature was -0.05°C (95% limits of agreement -0.56 to +0.47°C). Throughout cardiac surgery (on and off CPB), the mean difference between SpotOn and the nasopharyngeal temperature was -0.12°C (95% limits of agreement -0.94 to +0.71°C). Poor agreement between the SpotOn and nasopharyngeal temperatures was detected in hypothermia below approximately 32°C. According to this preliminary study, the deep body temperature measured using the zero-heat-flux system was in good agreement with standard core temperatures during lower extremity vascular and cardiac surgery. However, agreement was questionable during hypothermia below 32°C. Copyright © 2016 Elsevier Inc. All rights reserved.
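
    The mean difference and 95% limits of agreement reported above can be illustrated with a basic Bland-Altman calculation. Note the swap: the study used a random-effects approach for repeated measures, whereas this sketch is the simpler classic version that assumes one independent paired reading per subject. The function name and the temperature readings are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Classic Bland-Altman limits of agreement: bias +/- 1.96 * SD of the
    paired differences. Assumes independent pairs (no repeated measures).
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements of equal length")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    sd = stdev(differences)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired temperatures in degrees Celsius (illustration only).
forehead_sensor = [36.4, 36.1, 35.8, 36.9, 36.5]
esophageal_probe = [36.3, 36.2, 35.6, 36.8, 36.6]
bias, lower, upper = bland_altman(forehead_sensor, esophageal_probe)
print(f"bias={bias:+.2f}, limits of agreement {lower:+.2f} to {upper:+.2f}")
```

    With several readings per patient, as in the study, the differences are correlated within patients, so the simple standard deviation used here would generally misstate the width of the limits.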

  7. Rater agreement of visual lameness assessment in horses during lungeing

    PubMed Central

    Hammarberg, M.; Egenvall, A.; Pfau, T.

    2015-01-01

    Summary Reasons for performing study Lungeing is an important part of lameness examinations as the circular path may accentuate low‐grade lameness. Movement asymmetries related to the circular path, to compensatory movements and to pain make the lameness evaluation complex. Scientific studies have shown high inter‐rater variation when assessing lameness during straight line movement. Objectives The aim was to estimate inter‐ and intra‐rater agreement of equine veterinarians evaluating lameness from videos of sound and lame horses during lungeing and to investigate the influence of veterinarians’ experience and the objective degree of movement asymmetry on rater agreement. Study design Cross‐sectional observational study. Methods Video recordings and quantitative gait analysis with inertial sensors were performed in 23 riding horses of various breeds. The horses were examined at trot on a straight line and during lungeing on soft or hard surfaces in both directions. One video sequence was recorded per condition and the horses were classified as forelimb lame, hindlimb lame or sound from objective straight line symmetry measurements. Equine veterinarians (n = 86), including 43 with >5 years of orthopaedic experience, participated in a web‐based survey and were asked to identify the lamest limb on 60 videos, including 10 repeats. The agreements between (inter‐rater) and within (intra‐rater) veterinarians were analysed with κ statistics (Fleiss, Cohen). Results Inter‐rater agreement κ was 0.31 (0.38/0.25 for experienced/less experienced) and higher for forelimb (0.33) than for hindlimb lameness (0.11) or soundness (0.08) evaluation. Median intra‐rater agreement κ was 0.57. Conclusions Inter‐rater agreement was poor for less experienced raters, and for all raters when evaluating hindlimb lameness. 
Since identification of the lame limb/limbs is a prerequisite for successful diagnosis, treatment and recovery, the high inter‐rater variation when evaluating lameness on the lunge is likely to influence the accuracy and repeatability of lameness examinations and, indirectly, the efficacy of treatment. PMID:25399722
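
    For agreement among many raters, as with the 86 veterinarians above, Fleiss' kappa is the usual multi-rater statistic. The following is a minimal sketch, assuming every video is rated by the same number of raters and that the input is a table of counts per category; the function name and the sample table are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who put subject i in category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total = n_subjects * n_raters
    # Overall proportion of assignments falling in each category.
    p_category = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-subject agreement: fraction of rater pairs that agree on that subject.
    p_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subject) / n_subjects
    p_chance = sum(p * p for p in p_category)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical table: 4 videos, 5 raters, categories (forelimb, hindlimb, sound).
ratings = [
    [5, 0, 0],
    [3, 1, 1],
    [1, 2, 2],
    [0, 1, 4],
]
print(round(fleiss_kappa(ratings), 3))
```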

  8. Estimating irrigated areas from satellite and model soil moisture data over the contiguous US

    NASA Astrophysics Data System (ADS)

    Zaussinger, Felix; Dorigo, Wouter; Gruber, Alexander

    2017-04-01

    Information about irrigation is crucial for a number of applications such as drought and yield management and contributes to a better understanding of the water cycle, land-atmosphere interactions as well as climate projections. Currently, irrigation is mainly quantified by national agricultural statistics, which do not include spatial information. The digital Global Map of Irrigated Areas (GMIA) has been the first effort to quantify irrigation at the global scale by merging these statistics with remote sensing data. Also, the MODIS-Irrigated Agriculture Dataset (MIrAD-US) was created by merging annual peak MODIS-NDVI with US county level irrigation statistics. In this study we aim to map irrigated areas by confronting time series of various satellite soil moisture products with soil moisture from the ERA-Interim/Land reanalysis product. We follow the assumption that irrigation signals are not modelled in the reanalysis product, nor contributing to its forcing data, but affecting the spatially continuous remote sensing observations. Based on this assumption, spatial patterns of irrigation are derived from differences between the temporal slopes of the modelled and remotely sensed time series during the irrigation season. Results show that a combination of ASCAT and ERA-Interim/Land shows spatial patterns which are in good agreement with the MIrAD-US, particularly within the Mississippi Delta, Texas and eastern Nebraska. In contrast, AMSRE shows weak agreement, plausibly due to a higher vegetation dependency of the soil moisture signal. There is no significant agreement with the MIrAD-US in California, which is possibly related to higher crop diversity and smaller field sizes. Also, a strong signal in the region of the Great Corn Belt is observed, which is generally not outlined as an irrigated area. 
It is not yet clear to what extent the signal obtained in the Mississippi Delta is related to re-reflection effects caused by standing water due to flood or furrow irrigation practices. Consequently, future research should focus on the specific effects of different irrigation practices and crop types. This study is supported by the European Union's FP7 EartH2Observe "Global Earth Observation for Integrated Water Resource Assessment" project (grant agreement number 603608).

  9. Statistical analysis of the mesospheric inversion layers over two symmetrical tropical sites: Réunion (20.8° S, 55.5° E) and Mauna Loa (19.5° N, 155.6° W)

    NASA Astrophysics Data System (ADS)

    Bègue, Nelson; Mbatha, Nkanyiso; Bencherif, Hassan; Tato Loua, René; Sivakumar, Venkataraman; Leblanc, Thierry

    2017-11-01

    In this investigation a statistical analysis of the characteristics of mesospheric inversion layers (MILs) over tropical regions is presented. This study involves the analysis of 16 years of lidar observations recorded at Réunion (20.8° S, 55.5° E) and 21 years of lidar observations recorded at Mauna Loa (19.5° N, 155.6° W) together with SABER observations at these two locations. MILs appear in 10 and 9.3 % of the observed temperature profiles recorded by Rayleigh lidar at Réunion and Mauna Loa, respectively. The parameters defining MILs show a semi-annual cycle over the two selected sites with maxima occurring near the equinoxes and minima occurring during the solstices. Over both sites, the maximum mean amplitude is observed in April and October, and this corresponds to a value greater than 35 K. According to lidar observations, the maximum and minimum mean of the base height ranged from 79 to 80.5 km and from 76 to 77.5 km, respectively. The MILs at Réunion appear on average ~1 km thinner and ~1 km lower, with an amplitude ~2 K higher than at Mauna Loa. Generally, the statistical results for these two tropical locations as presented in this investigation are in fairly good agreement with previous studies. When compared to lidar measurements, on average SABER observations show MILs with greater amplitude, thickness and base altitudes of 4 K, 0.75 and 1.1 km, respectively. Taking into account the temperature error by SABER in the mesosphere, it can therefore be concluded that the measurements obtained from lidar and SABER observations are in significant agreement. The frequency spectrum analysis based on the lidar profiles and the 60-day averaged profile from SABER confirms the presence of the semi-annual oscillation where the magnitude maximum is found to coincide with the height range of the temperature inversion zone. 
This connection between increases in the semi-annual component close to the inversion zone is in agreement with most previously reported studies over tropics based on satellite observations. Results presented in this study confirm through the use of the ground-based Rayleigh lidar at Réunion and Mauna Loa that the semi-annual oscillation contributes to the formation of MILs over the tropical region.

  10. Automated computation of femoral angles in dogs from three-dimensional computed tomography reconstructions: Comparison with manual techniques.

    PubMed

    Longo, F; Nicetto, T; Banzato, T; Savio, G; Drigo, M; Meneghello, R; Concheri, G; Isola, M

    2018-02-01

    The aim of this ex vivo study was to test a novel three-dimensional (3D) automated computer-aided design (CAD) method (aCAD) for the computation of femoral angles in dogs from 3D reconstructions of computed tomography (CT) images. The repeatability and reproducibility of manual radiography, manual CT reconstructions and the aCAD method for the measurement of three femoral angles were evaluated: (1) anatomical lateral distal femoral angle (aLDFA); (2) femoral neck angle (FNA); and (3) femoral torsion angle (FTA). Femoral angles of 22 femurs obtained from 16 cadavers were measured by three blinded observers. Measurements were repeated three times by each observer for each diagnostic technique. Femoral angle measurements were analysed using a mixed effects linear model for repeated measures to determine the levels of intra-observer agreement (repeatability) and inter-observer agreement (reproducibility). Repeatability and reproducibility of measurements using the aCAD method were excellent (intra-class correlation coefficients, ICCs≥0.98) for all three angles assessed. Manual radiography and CT exhibited excellent agreement for the aLDFA measurement (ICCs≥0.90). However, FNA repeatability and reproducibility were poor (ICCs<0.8), whereas FTA measurement showed slightly higher ICC values, except for the radiographic reproducibility, which was poor (ICCs<0.8). The 3D aCAD method provided the highest repeatability and reproducibility among the tested methodologies. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Behavior States: Now You See Them, Now You Don't.

    ERIC Educational Resources Information Center

    Mudford, Oliver C.; Hogg, James; Roberts, Jessie

    1999-01-01

    A study attempted to replicate a previous study that presented reliability data from recordings of behavior state using a 13-category coding system. Replication was unsuccessful. Obtained mean percentage agreement on occurrence for individual behavior state and participants (n=34) ranged across observer pairs from 0 to 58 percent. (Contains 13…
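
    Percentage agreement on occurrence, as reported above, can be sketched as follows. The abstract does not say how the figure was computed; this sketch assumes a common convention in behavioral observation, in which the intervals where both observers recorded the behavior state are divided by the intervals where at least one of them did. The function name and the sample records are hypothetical.

```python
def occurrence_agreement(observer_a, observer_b):
    """Percentage agreement on occurrence for two interval records.

    Each record is a sequence of booleans, one per observation interval,
    True when the observer scored the behavior state as occurring.
    Returns None when neither observer scored any occurrence.
    """
    if len(observer_a) != len(observer_b):
        raise ValueError("records must cover the same intervals")
    pairs = list(zip(observer_a, observer_b))
    both = sum(1 for a, b in pairs if a and b)      # scored by both observers
    either = sum(1 for a, b in pairs if a or b)     # scored by at least one
    if either == 0:
        return None
    return 100.0 * both / either


# Hypothetical records for eight intervals (illustration only).
first_observer = [True, True, False, False, True, False, False, True]
second_observer = [True, False, False, False, True, True, False, False]
print(occurrence_agreement(first_observer, second_observer))
```

    Because intervals that both observers scored as non-occurrence are ignored, this measure is not inflated by rarely occurring states, which is one reason it can fall to very low values.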

  12. Comparison of variability in breast density assessment by BI-RADS category according to the level of experience.

    PubMed

    Eom, Hye-Joung; Cha, Joo Hee; Kang, Ji-Won; Choi, Woo Jung; Kim, Han Jun; Go, EunChae

    2018-05-01

    Background Only a few studies have assessed variability in the results obtained by readers with different experience levels in comparison with automated volumetric breast density measurements. Purpose To examine the variations in breast density assessment according to BI-RADS categories among readers with different experience levels and to compare them with the results of automated quantitative measurements. Material and Methods Density assignment was done for 1000 screening mammograms by six readers with three different experience levels (breast-imaging experts, general radiologists, and students). Agreement level between the results obtained by the readers and the Volpara automated volumetric breast density measurements was assessed. The agreement analysis using two categories (non-dense and dense breast tissue) was also performed. Results Intra-reader agreement for experts, general radiologists, and students was almost perfect or substantial (k = 0.74-0.95). The agreement between visual assessments of the breast-imaging experts and volumetric assessments by Volpara was substantial (k = 0.77). The agreement was moderate between the experts and general radiologists (k = 0.67) and slight between the students and Volpara (k = 0.01). The agreement for the two category groups (non-dense and dense) was almost perfect between the experts and Volpara (k = 0.83). The agreement was substantial between the experts and general radiologists (k = 0.78). Conclusion We observed similar high agreement levels between visual assessments of breast density performed by radiologists and the volumetric assessments. However, agreement levels were substantially lower for the untrained readers.

  13. Translation and cultural adaptation of the Brazilian Portuguese version of the Behavioral Pain Scale.

    PubMed

    Morete, Márcia Carla; Mofatto, Sarah Camargo; Pereira, Camila Alves; Silva, Ana Paula; Odierna, Maria Tereza

    2014-01-01

    The objective of this study was to translate and culturally adapt the Behavioral Pain Scale to Brazilian Portuguese and to evaluate the psychometric properties of this scale. This study was conducted in two phases: the Behavioral Pain Scale was translated and culturally adapted to Brazilian Portuguese and the psychometric properties of this scale were subsequently assessed (reliability and clinical utility). The study sample consisted of 100 patients who were older than 18 years of age, admitted to an intensive care unit, intubated, mechanically ventilated, and subjected or not to sedation and analgesia from July 2012 to December 2012. Pediatric and non-intubated patients were excluded. The study was conducted at a large private hospital that was situated in the city of São Paulo (SP). Regarding reproducibility, the results revealed that the observed agreement between the two evaluators was 92.08% for the pain descriptor "adaptation to mechanical ventilation", 88.1% for "upper limbs", and 90.1% for "facial expression". The kappa coefficient of agreement for "adaptation to mechanical ventilation" assumed a value of 0.740. Good agreement was observed between the evaluators with an intraclass correlation coefficient of 0.807 (95% confidence interval: 0.727-0.866). The Behavioral Pain Scale was easy to administer and reproduce. Additionally, this scale had adequate internal consistency. The Behavioral Pain Scale was satisfactorily adapted to Brazilian Portuguese for the assessment of pain in critically ill patients.

  14. Scales of degree of facial paralysis: analysis of agreement.

    PubMed

    Fonseca, Kércia Melo de Oliveira; Mourão, Aline Mansueto; Motta, Andréa Rodrigues; Vicente, Laelia Cristina Caseiro

    2015-01-01

    It has become common to use scales to measure the degree of involvement of facial paralysis in phonoaudiological clinics. To analyze the inter- and intra-rater agreement of the scales of degree of facial paralysis and to elicit the point of view of the appraisers regarding their use. Cross-sectional observational clinical study of the Chevalier and House & Brackmann scales performed by five speech therapists with clinical experience, who analyzed the facial expression of 30 adult subjects with impaired facial movements two times, with a one-week interval between evaluations. The kappa analysis was employed. There was excellent inter-rater agreement for both scales (kappa>0.80), and on the Chevalier scale a substantial intra-rater agreement in the first assessment (kappa=0.792) and an excellent agreement in the second assessment (kappa=0.928). The House & Brackmann scale showed excellent agreement at both assessments (kappa=0.850 and 0.857). As for the appraisers' point of view, one appraiser thought prior training is necessary for the Chevalier scale, and four appraisers felt that training is important for the House & Brackmann scale. Both scales have good inter- and intra-rater agreement and most of the appraisers agree on the ease and relevance of the application of these scales. Copyright © 2014 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.

  15. Patient's Perception on the Esthetic Outcome of Anterior Fixed Prosthetic Treatment.

    PubMed

    Alshiddi, Ibraheem F; BinSaleh, Saad M; Alhawas, Yasser

    2015-11-01

    A patient's perception of the esthetic result of the treatment received can differ from a dentist's opinion. Understanding the patient's opinion, demands and expectations is part of a successful treatment procedure. The purpose of this study was to investigate patients' opinions about the esthetic result of the fixed prosthetic treatment received in upper anterior teeth. About 90 volunteer subjects, 58 males and 32 females, were given a self-evaluation questionnaire with 11 questions to be answered Yes or No. The questions concerned the esthetic result of a fixed prosthodontic treatment received for their upper anterior teeth. The same questionnaire was completed for each subject by three clinicians using clinical photographs showing different views of the subject's smile. Agreement between patients and clinicians was calculated for all subjects to evaluate patients' perception of their esthetic results. An agreement of 47.8 to 72.2% was observed between patients and clinicians, and the average agreement was 53.64 to 60%. The highest agreement was related to satisfaction with the color of the crown and/or bridge margin while the least agreement was related to satisfaction with the natural look of the restoration. There was variability in the agreement between the patients and the dentists on satisfaction with the esthetic result of anterior restoration. Factors such as gender, age and educational level may affect the results of the agreement.

  16. The rational clinical examination. Does this infant have pneumonia?

    PubMed

    Margolis, P; Gadomski, A

    1998-01-28

    Acute lower respiratory tract illness is common among children seen in primary care. We reviewed the accuracy and precision of the clinical examination in detecting pneumonia in children. Although most cases are viral, it is important to identify bacterial pneumonia to provide appropriate therapy. Studies were identified by searching MEDLINE from 1982 to 1995, reviewing reference lists, reviewing a published compendium of studies of the clinical examination, and consulting experts. Observer agreement is good for most signs on the clinical examination. Each study was reviewed by 2 observers and graded for methodologic quality. There is better agreement about signs that can be observed (eg, use of accessory muscles, color, attentiveness; kappa, 0.48-0.66) than signs that require auscultation of the chest (eg, adventitious sounds; kappa, 0.3). Measurements of the respiratory rate are enhanced by counting for 60 seconds. The best individual finding for ruling out pneumonia is the absence of tachypnea. Chest indrawing, and other signs of increased work of breathing, increases the likelihood of pneumonia. If all clinical signs (respiratory rate, auscultation, and work of breathing) are negative, the chest x-ray findings are unlikely to be positive. Studies are needed to assess the value of clinical findings when they are used together.

  17. Agreement protocol between the CNES (National French Space Study Center) and the Swedish Space Commission

    NASA Technical Reports Server (NTRS)

    1979-01-01

    The detailed arrangements made between France and Sweden to develop a satellite (and the associated receiving stations) which will perform systematic, repetitive observations of land masses, with the purpose of terrestrial resource exploration are described.

  18. Comparison of the manual, semiautomatic, and automatic selection and leveling of hot spots in whole slide images for Ki-67 quantification in meningiomas.

    PubMed

    Swiderska, Zaneta; Korzynska, Anna; Markiewicz, Tomasz; Lorent, Malgorzata; Zak, Jakub; Wesolowska, Anna; Roszkowiak, Lukasz; Slodkowska, Janina; Grala, Bartlomiej

    2015-01-01

    Background. This paper presents the study concerning hot-spot selection in the assessment of whole slide images of tissue sections collected from meningioma patients. The samples were immunohistochemically stained to determine the Ki-67/MIB-1 proliferation index used for prognosis and treatment planning. Objective. The observer performance was examined by comparing results of the proposed method of automatic hot-spot selection in whole slide images, results of traditional scoring under a microscope, and results of a pathologist's manual hot-spot selection. Methods. The results of scoring the Ki-67 index using optical scoring under a microscope, software for Ki-67 index quantification based on hot spots selected by two pathologists (resp., once and three times), and the same software but on hot spots selected by proposed automatic methods were compared using Kendall's tau-b statistics. Results. Results show intra- and interobserver agreement. The agreement between Ki-67 scoring with manual and automatic hot-spot selection is high, while agreement between Ki-67 index scoring results in whole slide images and traditional microscopic examination is lower. Conclusions. The agreement observed for the three scoring methods shows that automation of area selection is an effective tool in supporting physicians and in increasing the reliability of Ki-67 scoring in meningioma.
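
    Kendall's tau-b, the statistic used above to compare scoring methods, measures rank agreement between two sets of scores while correcting for ties. The following is a minimal quadratic-time sketch; the function name and the sample Ki-67 values are hypothetical and are not taken from the study.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation for two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: drops out of both factors
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical Ki-67 indices (%) from two scoring methods (illustration only).
manual_hot_spots = [4.0, 7.5, 2.0, 12.0, 7.5, 9.0]
automatic_hot_spots = [5.0, 8.0, 2.5, 11.0, 6.0, 11.0]
print(round(kendall_tau_b(manual_hot_spots, automatic_hot_spots), 3))
```

    Because tau-b depends only on the ordering of the scores, two methods can agree well by this measure even when one gives systematically higher index values than the other.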

  19. Comparison of the Manual, Semiautomatic, and Automatic Selection and Leveling of Hot Spots in Whole Slide Images for Ki-67 Quantification in Meningiomas

    PubMed Central

    Swiderska, Zaneta; Korzynska, Anna; Markiewicz, Tomasz; Lorent, Malgorzata; Zak, Jakub; Wesolowska, Anna; Roszkowiak, Lukasz; Slodkowska, Janina; Grala, Bartlomiej

    2015-01-01

    Background. This paper presents the study concerning hot-spot selection in the assessment of whole slide images of tissue sections collected from meningioma patients. The samples were immunohistochemically stained to determine the Ki-67/MIB-1 proliferation index used for prognosis and treatment planning. Objective. The observer performance was examined by comparing results of the proposed method of automatic hot-spot selection in whole slide images, results of traditional scoring under a microscope, and results of a pathologist's manual hot-spot selection. Methods. The results of scoring the Ki-67 index using optical scoring under a microscope, software for Ki-67 index quantification based on hot spots selected by two pathologists (resp., once and three times), and the same software but on hot spots selected by proposed automatic methods were compared using Kendall's tau-b statistics. Results. Results show intra- and interobserver agreement. The agreement between Ki-67 scoring with manual and automatic hot-spot selection is high, while agreement between Ki-67 index scoring results in whole slide images and traditional microscopic examination is lower. Conclusions. The agreement observed for the three scoring methods shows that automation of area selection is an effective tool in supporting physicians and in increasing the reliability of Ki-67 scoring in meningioma. PMID:26240787

  20. Reliability of the OSCE for Physical and Occupational Therapists

    PubMed Central

    Sakurai, Hiroaki; Kanada, Yoshikiyo; Sugiura, Yoshito; Motoya, Ikuo; Wada, Yosuke; Yamada, Masayuki; Tomita, Masao; Tanabe, Shigeo; Teranishi, Toshio; Tsujimura, Toru; Sawa, Syunji; Okanishi, Tetsuo

    2014-01-01

    [Purpose] To examine agreement rates between faculty members and clinical supervisors as OSCE examiners. [Subjects] The OSCE examinees were physical and occupational therapists who had worked in clinical environments for 1 to 5 years after graduating from training schools; the examiners were a physical or occupational therapy faculty member and a clinical supervisor. Another clinical supervisor acted as a simulated patient. [Methods] The agreement rate between the examiners for each OSCE item was calculated based on Cohen’s kappa coefficient to confirm inter-rater reliability. [Results] The agreement rates for the behavioral aspects of the items were higher in the second than in the first examination. Similar increases were also observed in the agreement rates for the technical aspects until the initiation of each activity; however, the rates decreased during the middle to terminal stages of continuous movements. [Conclusion] The results may reflect the recent implementation of measures for the integration of therapist education in training schools and clinical training facilities. PMID:25202170
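
    The record above names Cohen's kappa as its agreement statistic without showing the calculation. The following is a minimal sketch of the standard two-rater formula, kappa = (p_o - p_e) / (1 - p_e), not the authors' actual analysis. The function name `cohens_kappa` and the pass/fail scores are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical item scores from two examiners (not data from the study).
faculty = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass"]
supervisor = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]
kappa = cohens_kappa(faculty, supervisor)
print(f"kappa = {kappa:.3f}")
```

    Here the two examiners agree on 6 of 8 items, but because both rate "pass" most of the time, a good part of that agreement is expected by chance, which is why kappa is well below the raw agreement rate.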

  1. Evaluation of the interpretative skills of participants of a limited transthoracic echocardiography training course (H.A.R.T.scan course).

    PubMed

    Royse, C F; Haji, D L; Faris, J G; Veltman, M G; Kumar, A; Royse, A G

    2012-05-01

    Limited transthoracic echocardiography performed by treating physicians may facilitate assessment of haemodynamic abnormalities in perioperative and critical care patients. The interpretative skills of one hundred participants who completed an education program in limited transthoracic echocardiography were assessed by reporting five pre-recorded case studies. A high level of agreement was observed in ventricular volume assessment (left 95%, right 96%), systolic function (left 99%, right 96%), left atrial pressure (96%) and haemodynamic state (97%). The highest failure to report answers (that is, no answer given) was for right ventricular volume and function. For moderate or severe valve lesions, agreement ranged from 90 to 98%, with failure to report <5% in all cases except for mitral stenosis (18%). For mild valve lesions, the range of agreement was lower (53 to 100%) due to overestimation of severity. Medical practitioners who completed the structured educational program showed good agreement with experts in interpretation of valve and ventricular function.

  2. Input-based structure-specific proficiency predicts the neural mechanism of adult L2 syntactic processing.

    PubMed

    Deng, Taiping; Zhou, Huixia; Bi, Hong-Yan; Chen, Baoguo

    2015-06-12

    This study used Event-Related Potentials (ERPs) to explore the role of input-based structure-specific proficiency in L2 syntactic processing, using English subject-verb agreement structures as the stimuli. A pre-test/trainings/post-test paradigm of experimental and control groups was employed, and Chinese speakers who learned English as a second language (L2) participated in the experiment. At pre-test, no ERP component related to the subject-verb agreement structures violations was observed in either group. At training session, the experimental group learned the subject-verb agreement structures, while the control group learned other syntactic structures. After two continuously intensive input trainings, at post-test, a significant P600 component related to the subject-verb agreement structures violations was elicited in the experimental group, but not in the control group. These findings suggest that input training improves structure-specific proficiency, which is reflected in the neural mechanism of L2 syntactic processing. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Interobserver agreement of interim and end-of-treatment 18F-FDG PET/CT in diffuse large B-cell lymphoma (DLBCL): impact on clinical practice and trials.

    PubMed

    Burggraaff, Coreline N; Cornelisse, Alexander C; Hoekstra, Otto S; Lugtenburg, Pieternella J; de Keizer, Bart; Arens, Anne I J; Celik, Filiz; Huijbregts, Julia E; De Vet, Henrica C W; Zijlstra, Josee M

    2018-05-04

    We aimed to assess the interobserver agreement of Interim PET (I-PET) and End-of-Treatment PET (EoT-PET) using the Deauville 5-point scale (DS) in first-line DLBCL patients. Methods: I-PET and EoT-PET scans of DLBCL patients were performed in the HOVON84 study (2007-2012), an international multicenter randomized controlled trial. Patients received R-CHOP14 and were randomized to receive rituximab intensification in the first 4 cycles or not. I-PET was made after 4 cycles (for observational purposes), and EoT-PET scan after 6 or 8 cycles. Two independent central reviewers retrospectively scored all scans according to the DS-system, blinded to clinical outcomes. Results were dichotomised as 'negative' (DS: 1-3) or 'positive' (DS: 4-5). Besides percentage overall agreement we calculated agreement for positive and negative scores, expressed as positive agreement (PA) and negative agreement (NA), respectively. Results: 465 I-PET and 457 EoT-PET scans were centrally reviewed; baseline 18F-FDG PET(/CT) was available in 75-77%, and CT in the remaining cases. Percentage overall agreement for I-PET and EoT-PET were 87.7% and 91.7% (P = 0.049), with NA of 92.0% and 95.0% (P = 0.091), and PA of 73.7% and 76.3% (P = 0.656), respectively. Conclusion: Interobserver agreement using DS in DLBCL patients in I-PET and EoT-PET yields high overall and negative agreement. The lower positive agreement suggests that EoT-PET/CT treatment evaluation in daily practice and I-PET adapted trials may benefit from dual reads and central review, respectively. Copyright © 2018 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
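
    The abstract above reports overall, positive and negative agreement but does not state the formulas. The sketch below assumes the commonly used definitions of specific agreement for a 2x2 table, PA = 2a / (2a + b + c) and NA = 2d / (2d + b + c). The cell counts are hypothetical: the abstract gives no 2x2 table, and these values were picked only because they sum to the 465 I-PET scans and reproduce the reported percentages after rounding. The function name `specific_agreement` is likewise an assumption.

```python
def specific_agreement(both_pos, a_only_pos, b_only_pos, both_neg):
    """Overall, positive and negative agreement for two readers (2x2 table)."""
    total = both_pos + a_only_pos + b_only_pos + both_neg
    disagree = a_only_pos + b_only_pos
    overall = (both_pos + both_neg) / total
    positive = 2 * both_pos / (2 * both_pos + disagree)
    negative = 2 * both_neg / (2 * both_neg + disagree)
    return overall, positive, negative


# Hypothetical counts (assumption, not taken from the paper): 80 scans positive
# for both readers, 328 negative for both, 57 discordant; total 465.
overall, pa, na = specific_agreement(both_pos=80, a_only_pos=28,
                                     b_only_pos=29, both_neg=328)
print(f"overall {overall:.1%}, PA {pa:.1%}, NA {na:.1%}")
```

    Because negative reads dominate the sample, overall agreement sits close to NA, while PA is noticeably lower. The abstract shows the same pattern.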

  4. Characterization of Severe Arterial Phase Respiratory Motion Artifact on Gadoxetate Disodium-Enhanced MRI - Assessment of Interrater Agreement and Reliability.

    PubMed

    Ringe, Kristina Imeen; Luetkens, Julian A; Fimmers, Rolf; Hammerstingl, Renate Maria; Layer, Günter; Maurer, Martin H; Nähle, Claas Philip; Michalik, Sabine; Reimer, Peter; Schraml, Christina; Schreyer, Andreas G; Stumpp, Patrick; Vogl, Thomas J; Wacker, Frank K; Willinek, Winfried; Kukuk, Guido Mattias

    2018-04-01

    To assess the interrater agreement and reliability of experienced abdominal radiologists in the characterization and grading of arterial phase gadoxetate disodium-related respiratory motion artifact on liver MRI. This prospective multicenter study was initiated by the working group for abdominal imaging within the German Roentgen Society (DRG), and approved by the local IRB of each participating center. 11 board-certified radiologists independently reviewed 40 gadoxetate disodium-enhanced liver MRI datasets. Motion artifacts in the arterial phase were assessed on a 5-point scale. Interrater agreement and reliability were calculated using the intraclass correlation coefficient (ICC) and Kendall coefficient of concordance (W), with p < 0.05 deemed significant. The ICC for interrater agreement and reliability were 0.983 (CI 0.973 - 0.990) and 0.985 (CI 0.978 - 0.991), respectively (both p < 0.0001), indicating excellent agreement and reliability. Kendall's W for interrater agreement was 0.865. A severe motion artifact, defined as a mean motion score ≥ 4 in the arterial phase was observed in 12 patients. In these specific cases, a motion score ≥ 4 was assigned by all readers in 75 % (n = 9/12 cases). Differentiation and grading of arterial phase respiratory motion artifact is possible with a high level of inter-/intrarater agreement and interrater reliability, which is crucial for assessing the incidence of this phenomenon in larger multicenter studies. · Inter- and intrarater agreement for motion artifact scoring is excellent among experienced readers. · Interrater reliability for motion artifact scoring is excellent among experienced readers. · Characterization of severe motion artifacts proved feasible in this multicenter study. · Ringe KI, Luetkens JA, Fimmers R et al. Characterization of Severe Arterial Phase Respiratory Motion Artifact on Gadoxetate Disodium-Enhanced MRI - Assessment of Interrater Agreement and Reliability. Fortschr Röntgenstr 2017; 190: 341 - 347. © Georg Thieme Verlag KG Stuttgart · New York.

  5. Comparisons of Methods for Predicting Community Annoyance Due to Sonic Booms

    NASA Technical Reports Server (NTRS)

    Hubbard, Harvey H.; Shepherd, Kevin P.

    1996-01-01

    Two approaches to the prediction of community response to sonic boom exposure are examined and compared. The first approach is based on the wealth of data concerning community response to common transportation noises coupled with results of a sonic boom/aircraft noise comparison study. The second approach is based on limited field studies of community response to sonic booms. Substantial differences between indoor and outdoor listening conditions are observed. Reasonable agreement is observed between predicted community responses and available measured responses.

  6. The Role of Satellite Earth Observation Data in Monitoring and Verifying International Environmental Treaties

    NASA Technical Reports Server (NTRS)

    Johnston, Shaida

    2004-01-01

    The term verification implies compliance verification in the language of treaty negotiation and implementation, particularly in the fields of disarmament and arms control. The term monitoring, on the other hand, in both environmental and arms control treaties, has a much broader interpretation which allows for use of supporting data sources that are not necessarily acceptable or adequate for direct verification. There are many ways that satellite Earth observation (EO) data can support international environmental agreements, from national forest inventories to use in geographic information system (GIS) tools. Though only a few references to satellite EO data and their use exist in the treaties themselves, an expanding list of applications can be considered in support of multilateral environmental agreements (MEAs). This paper explores the current uses of satellite Earth observation data which support monitoring activities of major environmental treaties and draws conclusions about future missions and their data use. The scope of the study includes all phases of environmental treaty fulfillment - development, monitoring, and enforcement - and includes a multinational perspective on the use of satellite Earth observation data for treaty support.

  7. Development and reliability of an observation method to assess food intake of young children in child care.

    PubMed

    Ball, Sarah C; Benjamin, Sara E; Ward, Dianne S

    2007-04-01

    To our knowledge, a direct observation protocol for assessing dietary intake among young children in child care has not been published. This article reviews the development and testing of a diet observation system for child care facilities that occurred during a larger intervention trial. Development of this system was divided into five phases, done in conjunction with a larger intervention study: (a) protocol development, (b) training of field staff, (c) certification of field staff in a laboratory setting, (d) implementation in a child-care setting, and (e) certification of field staff in a child-care setting. During the certification phases, methods were used to assess the accuracy and reliability of all observers at estimating types and amounts of food and beverages commonly served in child care. Tests of agreement show strong agreement among five observers, as well as strong accuracy between the observers and 20 measured portions of foods and beverages with a mean intraclass correlation coefficient value of 0.99. This structured observation system shows promise as a valid and reliable approach for assessing dietary intake of children in child care and makes a valuable contribution to the growing body of literature on the dietary assessment of young children.

  8. Three-dimensional ultrasonography of the breast; An adequate replacement for MRI in neoadjuvant chemotherapy tumour response evaluation? - RESPONDER trial.

    PubMed

    van Egdom, L S E; Lagendijk, M; Heijkoop, E H M; Koning, A H J; van Deurzen, C H M; Jager, A; van Lankeren, W; Koppert, L B

    2018-07-01

    Accurate measurement of tumour response during and after neoadjuvant chemotherapy (NAC) is important and may influence treatment decisions in invasive breast cancer patients. Breast MRI forms the gold standard but is more burdensome, time consuming and costly. In this study response measurement was done with 3-D ultrasound by Automated Breast Volume Scanner (ABVS) and compared to breast MRI. Moreover, patient satisfaction with both techniques was compared. A single-institution, prospective observational pilot study evaluating tumour response by ABVS in addition to breast MRI (standard care) was performed in 25 invasive breast cancer patients receiving NAC. Tumour response was evaluated comparing longest tumour diameters as well as tumour volumes at predefined time points using Bland-Altman analysis. Volume measurements for breast MRI were obtained using a fully immersive virtual reality system (a Barco I-Space) and V-Scope software. The same software was used to obtain ABVS volume measurements using an in-house developed desktop VR system. Inter- and intra-observer agreement was evaluated by Intraclass Correlation Coefficient (ICC). Twenty-five patients were eligible for baseline measurement, 20 for a mid-NAC response evaluation, and five for a post-NAC response evaluation. MRI and ABVS showed absolute concordance in 73% of patients for the mid-NAC evaluation, with a 'good' correlation for the difference in longest diameter measurement (ICC 0.73, p < 0.01) as compared to baseline assessment. The difference in volume measurement showed a 'fair' correlation in the mid-NAC response evaluation (ICC 0.52, p < 0.01) and an 'excellent' correlation in the post-NAC response evaluation (ICC 0.98, p < 0.01). 'Excellent' inter- and intra-observer agreement was found (ICC 0.88, p < 0.01) with comparable limits of agreement (LOA) for observer 1 and 2 in both diameter and volume measurement. Patient satisfaction was higher for ABVS compared to breast MRI, 93% versus 12% respectively. ABVS showed 'good' correlation with MRI tumour response evaluation in breast cancer patients during NAC with 'excellent' inter- and intra-observer agreement. ABVS has patients' preference over breast MRI and could be considered as an alternative to breast MRI, provided that an ongoing prospective trial confirms these results (NTR6799). Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Interobserver agreement on the echocardiographic parameters that estimate right ventricular systolic function in the early postoperative period of cardiac surgery.

    PubMed

    Olmos-Temois, S G; Santos-Martínez, L E; Álvarez-Álvarez, R; Gutiérrez-Delgado, L G; Baranda-Tovar, F M

    2016-11-01

    To determine the variability of transthoracic echocardiographic parameters that assess right ventricular systolic function by analyzing interobserver agreement in the early postoperative period of cardiovascular surgery. To assess the feasibility of these echocardiographic measurements. A cross-sectional, double-blind pilot study was carried out from May 2011 to February 2013. Cardiovascular postoperative critical care at the National Institute of Cardiology "Ignacio Chávez", Mexico City, Mexico. Consecutive, non-probabilistic sampling. Fifty-six patients were studied in the postoperative period of cardiac surgery. The first echocardiographic parameters were obtained between 6-8 hours after cardiac surgery, followed by blinded second measurements. Tricuspid annular plane systolic excursion (TAPSE), tricuspid annular peak systolic velocity on tissue Doppler imaging (VSPAT), diameters and right ventricular outflow area, tract fractional shortening. The agreement was analyzed by the Bland-Altman method, and its magnitude was assessed by the intraclass correlation coefficient (95% confidence interval). Both observers evaluated TAPSE and VSPAT in 48 patients (92%). The average TAPSE was 11.68 ± 4.53 mm (range 4-27 mm). Right ventricular systolic dysfunction was observed in 41 cases (85%) and normal TAPSE in 7 patients (15%). The average difference and its limits according to TAPSE were -0.917 ± 2.95 (-6.821, 4.988), with a magnitude of 0.725 (0.552, 0.837); the tricuspid annular peak systolic velocity on tissue Doppler imaging was -0.001 ± 0.015 (-0.031, 0.030), and its magnitude 0.825 (0.708, 0.898), respectively. VSPAT and TAPSE were estimated by both observers in 92% of the patients, these parameters exhibiting the lowest interobserver variability. Copyright © 2016 Elsevier España, S.L.U. y SEMICYUC. All rights reserved.
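
    The record above summarises interobserver agreement as a mean difference with limits of agreement (the Bland-Altman method). A minimal sketch of that calculation follows, assuming the usual form bias ± k × SD of the paired differences. The multiplier `k` defaults to 1.96 here as an assumption; the abstract does not say which multiplier was used. The function name `bland_altman` and the paired readings are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second, k=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)   # mean difference between the two observers
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - k * sd, bias + k * sd


# Hypothetical paired readings in mm from two observers (not study data).
observer_1 = [10.0, 12.0, 9.0, 15.0, 11.0]
observer_2 = [11.0, 12.0, 11.0, 14.0, 13.0]
bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias {bias:.2f} mm, limits of agreement ({lower:.2f}, {upper:.2f}) mm")
```

    A negative bias means the first observer reads lower on average; the width of the limits indicates how far any single pair of readings can be expected to differ.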

  10. Ethical and Legal Observations on Contract Cheating Services as an Agreement

    ERIC Educational Resources Information Center

    Tauginiene, Loreta; Jurkevicius, Vaidas

    2017-01-01

    In this paper we cast light on one form of dishonest behaviour in academia--contract cheating services. We examine how an agreement between a student and a contract cheating services provider is viewed from ethical and legal perspectives. For this purpose we carried out an analysis of contract cheating services as an agreement which, in Lithuania,…

  11. Direct in situ observations of single Fe atom catalytic processes and anomalous diffusion at graphene edges

    PubMed Central

    Zhao, Jiong; Deng, Qingming; Avdoshenko, Stanislav M.; Fu, Lei; Eckert, Jürgen; Rümmeli, Mark H.

    2014-01-01

    Single-atom catalysts are of great interest because of their high efficiency. In the case of chemically deposited sp2 carbon, the implementation of a single transition metal atom for growth can provide crucial insight into the formation mechanisms of graphene and carbon nanotubes. This knowledge is particularly important if we are to overcome fabrication difficulties in these materials and fully take advantage of their distinct band structures and physical properties. In this work, we present atomically resolved transmission EM in situ investigations of single Fe atoms at graphene edges. Our in situ observations show individual iron atoms diffusing along an edge either removing or adding carbon atoms (viz., catalytic action). The experimental observations of the catalytic behavior of a single Fe atom are in excellent agreement with supporting theoretical studies. In addition, the kinetics of Fe atoms at graphene edges are shown to exhibit anomalous diffusion, which again, is in agreement with our theoretical investigations. PMID:25331874

  12. MRI of the wrist in juvenile idiopathic arthritis: proposal of a paediatric synovitis score by a consensus of an international working group. Results of a multicentre reliability study.

    PubMed

    Damasio, Maria Beatrice; Malattia, Clara; Tanturri de Horatio, Laura; Mattiuz, Chiara; Pistorio, Angela; Bracaglia, Claudia; Barbuti, Domenico; Boavida, Peter; Juhan, Karen Lambot; Ording, Lil Sophie Mueller; Rosendahl, Karen; Martini, Alberto; Magnano, GianMichele; Tomà, Paolo

    2012-09-01

    MRI is a sensitive tool for the evaluation of synovitis in juvenile idiopathic arthritis (JIA). The purpose of this study was to introduce a novel MRI-based score for synovitis in children and to examine its inter- and intraobserver variability in a multi-centre study. Wrist MRI was performed in 76 children with JIA. On postcontrast 3-D spoiled gradient-echo and fat-suppressed T2-weighted spin-echo images, joint recesses were scored for the degree of synovial enhancement, effusion and overall inflammation independently by two paediatric radiologists. Total-enhancement and inflammation-synovitis scores were calculated. Interobserver agreement was poor to moderate for enhancement and inflammation in all recesses, except in the radioulnar and radiocarpal joints. Intraobserver agreement was good to excellent. For enhancement and inflammation scores, mean differences (95 % CI) between observers were -1.18 (-4.79 to 2.42) and -2.11 (-6.06 to 1.83). Intraobserver variability (reader 1) was 0 (-1.65 to 1.65) and 0.02 (-1.39 to 1.44). Intraobserver agreement was good. Except for the radioulnar and radiocarpal joints, interobserver agreement was not acceptable. Therefore, the proposed scoring system requires further refinement.

  13. The Critical Role of the Routing Scheme in Simulating Peak River Discharge in Global Hydrological Models

    NASA Technical Reports Server (NTRS)

    Zhao, Fang; Veldkamp, Ted I. E.; Frieler, Katja; Schewe, Jacob; Ostberg, Sebastian; Willner, Sven; Schauberger, Bernhard; Gosling, Simon N.; Schmied, Hannes Muller; Portmann, Felix T.

    2017-01-01

    Global hydrological models (GHMs) have been applied to assess global flood hazards, but their capacity to capture the timing and amplitude of peak river discharge which is crucial in flood simulations has traditionally not been the focus of examination. Here we evaluate to what degree the choice of river routing scheme affects simulations of peak discharge and may help to provide better agreement with observations. To this end we use runoff and discharge simulations of nine GHMs forced by observational climate data (1971-2010) within the ISIMIP2a (Inter-Sectoral Impact Model Intercomparison Project phase 2a) project. The runoff simulations were used as input for the global river routing model CaMa-Flood (Catchment-based Macro-scale Floodplain). The simulated daily discharge was compared to the discharge generated by each GHM using its native river routing scheme. For each GHM both versions of simulated discharge were compared to monthly and daily discharge observations from 1701 GRDC (Global Runoff Data Centre) stations as a benchmark. CaMa-Flood routing shows a general reduction of peak river discharge and a delay of about two to three weeks in its occurrence, likely induced by the buffering capacity of floodplain reservoirs. For a majority of river basins, discharge produced by CaMa-Flood resulted in a better agreement with observations. In particular, maximum daily discharge was adjusted, with a multi-model averaged reduction in bias over about two-thirds of the analysed basin area. The increase in agreement was obtained in both managed and near-natural basins. Overall, this study demonstrates the importance of routing scheme choice in peak discharge simulation, where CaMa-Flood routing accounts for floodplain storage and backwater effects that are not represented in most GHMs. Our study provides important hints that an explicit parameterisation of these processes may be essential in future impact studies.

  14. Global two-fluid simulations of geodesic acoustic modes in strongly shaped tight aspect ratio tokamak plasmas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robinson, J. R.; Hnat, B.; Thyagaraja, A.

    2013-05-15

    Following recent observations suggesting the presence of the geodesic acoustic mode (GAM) in ohmically heated discharges in the Mega Amp Spherical Tokamak (MAST) [J. R. Robinson et al., Plasma Phys. Controlled Fusion 54, 105007 (2012)], the behaviour of the GAM is studied numerically using the two fluid, global code CENTORI [P. J. Knight et al. Comput. Phys. Commun. 183, 2346 (2012)]. We examine mode localisation and effects of magnetic geometry, given by aspect ratio, elongation, and safety factor, on the observed frequency of the mode. An excellent agreement between simulations and experimental data is found for simulation plasma parameters matched to those of MAST. Increasing aspect ratio yields good agreement between the GAM frequency found in the simulations and an analytical result obtained for elongated large aspect ratio plasmas.

  15. An Analytical Diffusion–Expansion Model for Forbush Decreases Caused by Flux Ropes

    NASA Astrophysics Data System (ADS)

    Dumbović, Mateja; Heber, Bernd; Vršnak, Bojan; Temmer, Manuela; Kirin, Anamarija

    2018-06-01

    We present an analytical diffusion–expansion Forbush decrease (FD) model ForbMod, which is based on the widely used approach of an initially empty, closed magnetic structure (i.e., flux rope) that fills up slowly with particles by perpendicular diffusion. The model is restricted to explaining only the depression caused by the magnetic structure of the interplanetary coronal mass ejection (ICME). We use remote CME observations and a 3D reconstruction method (the graduated cylindrical shell method) to constrain initial boundary conditions of the FD model and take into account CME evolutionary properties by incorporating flux rope expansion. Several flux rope expansion modes are considered, which can lead to different FD characteristics. In general, the model is qualitatively in agreement with observations, whereas quantitative agreement depends on the diffusion coefficient and the expansion properties (interplay of the diffusion and expansion). A case study was performed to explain the FD observed on 2014 May 30. The observed FD was fitted quite well by ForbMod for all expansion modes using only the diffusion coefficient as a free parameter, where the diffusion parameter was found to correspond to an expected range of values. Our study shows that, in general, the model is able to explain the global properties of an FD caused by a flux rope and can thus be used to help understand the underlying physics in case studies.

  16. Reliability of visual and instrumental color matching.

    PubMed

    Igiel, Christopher; Lehmann, Karl Martin; Ghinea, Razvan; Weyhrauch, Michael; Hangx, Ysbrand; Scheller, Herbert; Paravina, Rade D

    2017-09-01

    The aim of this investigation was to evaluate intra-rater and inter-rater reliability of visual and instrumental shade matching. Forty individuals with normal color perception participated in this study. The right maxillary central incisor of a teaching model was prepared and restored with 10 feldspathic all-ceramic crowns of different shades. A shade matching session consisted of the observer (rater) visually selecting the best match by using VITA classical A1-D4 (VC) and VITA Toothguide 3D Master (3D) shade guides and the VITA Easyshade Advance intraoral spectrophotometer (ES) to obtain both VC and 3D matches. Three shade matching sessions were held with 4 to 6 weeks between sessions. Intra-rater reliability was assessed based on the percentage of agreement for the three sessions for the same observer, whereas the inter-rater reliability was calculated as mean percentage of agreement between different observers. The Fleiss' Kappa statistical analysis was used to evaluate visual inter-rater reliability. The mean intra-rater reliability for the visual shade selection was 64(11) for VC and 48(10) for 3D. The corresponding ES values were 96(4) for both VC and 3D. The percentages of observers who matched the same shade with VC and 3D were 55(10) and 43(12), respectively, while corresponding ES values were 88(8) for VC and 92(4) for 3D. The results for visual shade matching exhibited a high to moderate level of inconsistency for both intra-rater and inter-rater comparisons. The VITA Easyshade Advance intraoral spectrophotometer exhibited significantly better reliability compared with visual shade selection. This study evaluates the ability of observers to consistently match the same shade visually and with a dental spectrophotometer in different sessions. The intra-rater and inter-rater reliability (agreement of repeated shade matching) of visual and instrumental tooth color matching strongly suggest the use of color matching instruments as a supplementary tool in everyday dental practice to enhance the esthetic outcome. © 2017 Wiley Periodicals, Inc.

  17. Does experience in hysteroscopy improve accuracy and inter-observer agreement in the management of abnormal uterine bleeding?

    PubMed

    Bourdel, Nicolas; Modaffari, Paola; Tognazza, Enrica; Pertile, Riccardo; Chauvet, Pauline; Botchorishivili, Revaz; Savary, Dennis; Pouly, Jean Luc; Rabischong, Benoit; Canis, Michel

    2016-12-01

    Hysteroscopic reliability may be influenced by the experience of the operator and by a lack of morphological diagnostic criteria for endometrial malignant pathologies. The aim of this study was to evaluate the diagnostic accuracy and the inter-observer agreement (IOA) in the management of abnormal uterine bleeding (AUB) among gynecologists with different levels of experience. Each gynecologist, without any other clinical information, was asked to evaluate the anonymous video recordings of 51 consecutive patients who underwent hysteroscopy and endometrial resection for AUB. Expert (>500 hysteroscopies), senior (20-499 procedures) and junior (≤19 procedures) gynecologists were asked to judge endometrial macroscopic appearance (benign, suspicious or frankly malignant). They also had to propose the histological diagnosis (atrophic or proliferative endometrium; simple, glandulocystic or atypical endometrial hyperplasia and endometrial carcinoma). Observers were free to indicate whether the quality of the recordings was not good enough for adequate assessment. IOA (k coefficient), sensitivity, specificity, predictive value and the likelihood ratio were calculated. Five expert, five senior and six junior gynecologists were involved in the study. Considering endometrial cancer and endometrial atypical hyperplasia, sensitivity and specificity were respectively 55.5% and 84.5% for juniors, 66.6% and 81.2% for seniors and 86.6% and 87.3% for experts. Concerning endometrial macroscopic appearance, IOA was poor for juniors (k = 0.10) and fair for seniors and experts (k = 0.23 and 0.22, respectively). IOA was poor for juniors and experts (k = 0.18 and 0.20, respectively) and fair for seniors (k = 0.30) in predicting the histological diagnosis. Sensitivity improves with the observer's experience, but inter-observer agreement and reproducibility of hysteroscopy for endometrial malignancies are not satisfactory regardless of the level of expertise. Therefore, an accurate and complete endometrial sampling is still needed.
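
    The sensitivity and specificity figures quoted in this abstract can be converted into likelihood ratios with the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The following is a minimal Python sketch of that arithmetic, not the authors' analysis: the function name and the dictionary layout are illustrative assumptions, and the inputs are the rounded percentages from the abstract, so the results are approximate.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for proportions strictly between 0 and 1."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie in (0, 1)")
    lr_pos = sensitivity / (1 - specificity)   # how much a positive call raises the odds
    lr_neg = (1 - sensitivity) / specificity   # how much a negative call lowers the odds
    return lr_pos, lr_neg


# Rounded values taken from the abstract (sensitivity, specificity).
reported = {
    "juniors": (0.555, 0.845),
    "seniors": (0.666, 0.812),
    "experts": (0.866, 0.873),
}

for group, (sens, spec) in reported.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{group}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

    Because the inputs are rounded, the printed ratios should be read as indicative only; the abstract itself does not report the likelihood ratios it calculated.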

  18. Sensitivity and Specificity of Laser-Scanning In Vivo Confocal Microscopy for Filamentous Fungal Keratitis: Role of Observer Experience.

    PubMed

    Kheirkhah, Ahmad; Syed, Zeba A; Satitpitakul, Vannarut; Goyal, Sunali; Müller, Rodrigo; Tu, Elmer Y; Dana, Reza

    2017-07-01

    To determine sensitivity and specificity of laser-scanning in vivo confocal microscopy (LS-IVCM) for detection of filamentous fungi in patients with microbial keratitis and to evaluate the effect of observer's imaging experience on these parameters. Retrospective reliability study. This study included 21 patients with filamentous fungal keratitis and 24 patients with bacterial keratitis (as controls). The etiology of infection was confirmed based on the response to specific therapy regardless of culture results. All patients had undergone full-thickness corneal imaging by a LS-IVCM (Heidelberg Retina Tomograph 3 with Rostock Cornea Module; Heidelberg Engineering, Heidelberg, Germany). The images were evaluated for the presence of fungal filaments by 2 experienced observers and 2 inexperienced observers. All observers were masked to the clinical and microbiologic data. The mean number of images obtained per eye was 917 ± 353. The average sensitivity of LS-IVCM for detecting fungal filaments was 71.4% ± 0% for the experienced observers and 42.9% ± 6.7% for the inexperienced observers. The average specificity was 89.6% ± 3.0% and 87.5% ± 17.7% for these 2 groups of observers, respectively. Although there was a good agreement between the 2 experienced observers (κ = 0.77), the inexperienced observers showed only a moderate interobserver agreement (κ = 0.51). The LS-IVCM sensitivity was higher in patients with fungal infections who had positive culture or longer duration of the disease. Although LS-IVCM has a high specificity for diagnosing filamentous fungal keratitis, its sensitivity is moderate and highly dependent on the level of the observer's experience and training with this imaging modality. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Diagnostic Reproducibility: What Happens When the Same Pathologist Interprets the Same Breast Biopsy Specimen at Two Points in Time?

    PubMed

    Jackson, Sara L; Frederick, Paul D; Pepe, Margaret S; Nelson, Heidi D; Weaver, Donald L; Allison, Kimberly H; Carney, Patricia A; Geller, Berta M; Tosteson, Anna N A; Onega, Tracy; Elmore, Joann G

    2017-05-01

    Surgeons may receive a different diagnosis when a breast biopsy is interpreted by a second pathologist. The extent to which diagnostic agreement by the same pathologist varies at two time points is unknown. Pathologists from eight U.S. states independently interpreted 60 breast specimens, one glass slide per case, on two occasions separated by ≥9 months. Reproducibility was assessed by comparing interpretations between the two time points; associations between reproducibility (intraobserver agreement rates); and characteristics of pathologists and cases were determined and also compared with interobserver agreement of baseline interpretations. Sixty-five percent of invited, responding pathologists were eligible and consented; 49 interpreted glass slides in both study phases, resulting in 2940 interpretations. Intraobserver agreement rates between the two phases were 92% [95% confidence interval (CI) 88-95] for invasive breast cancer, 84% (95% CI 81-87) for ductal carcinoma-in-situ, 53% (95% CI 47-59) for atypia, and 84% (95% CI 81-86) for benign without atypia. When comparing all study participants' case interpretations at baseline, interobserver agreement rates were 89% (95% CI 84-92) for invasive cancer, 79% (95% CI 76-81) for ductal carcinoma-in-situ, 43% (95% CI 41-45) for atypia, and 77% (95% CI 74-79) for benign without atypia. Interpretive agreement between two time points by the same individual pathologist was low for atypia and was similar to observed rates of agreement for atypia between different pathologists. Physicians and patients should be aware of the diagnostic challenges associated with a breast biopsy diagnosis of atypia when considering treatment and surveillance decisions.

  20. Agreement between the Turkey Guidelines and the Fracture Risk Assessment Tool®-based Intervention Threshold

    PubMed Central

    Aydogan, Nevres Hurriyet; Tosun, Kursad

    2018-01-01

    Background: The aim of this study was to evaluate the agreement between the fracture-risk assessment tool (FRAX®)-based intervention strategy in Turkey and the recommendations published in the Healthcare Practices Statement (HPS). Methods: This descriptive cross-sectional study included individuals aged 40 to 90 years who were previously diagnosed as having osteoporosis but had not received any treatment. The intervention thresholds recommended by the National Osteoporosis Foundation for treatment were used. The criteria necessary for the start of administration of pharmacological agents in osteoporosis treatment were evaluated on the basis of the HPS guidelines. Results: Of the 1,255 patients evaluated, 161 (12.8%) were male and 1,094 (87.2%) were female. In the evaluation, according to HPS, treatment was recommended for 783 patients (62.4%; HPS+) and not recommended for 472 (37.6%; HPS−). Of the 783 HPS+ patients, 391 (49.9%) were FRAX+, and of the 472 HPS− patients, 449 (95.1%) were FRAX−. A statistically significant difference was observed between the treatment recommendations of HPS and FRAX® (P<0.001). In the age group of 75 to 90 years, excellent agreement was found between the two strategies (Gwet's agreement coefficient AC1 = 0.94). As age increased, the agreement between the two treatment strategies also increased. Conclusions: The FRAX® model has different treatment recommendation rates from the HPS. The agreement between the two is at a minimal level. However, as age increased, so did the agreement between the FRAX® and the HPS treatment recommendations. In the recommendation to start pharmacological treatment primarily based on age, non-medical interventions that preserve bone density should be evaluated. PMID:29900157
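
    The counts in this abstract are enough to rebuild the overall 2x2 table: 391 patients were HPS+/FRAX+, 783 - 391 = 392 were HPS+/FRAX-, 472 - 449 = 23 were HPS-/FRAX+, and 449 were HPS-/FRAX-. The sketch below computes the observed agreement, Cohen's kappa and Gwet's AC1 for that table using the usual two-rater, two-category formulas. It is an illustration only: the function name is an assumption, and the abstract reports AC1 only for the 75 to 90 age group, so the whole-sample values computed here do not appear in the source.

```python
def two_rater_agreement(a, b, c, d):
    """Agreement statistics for a 2x2 table.

    a = both positive, b = first positive only,
    c = second positive only, d = both negative.
    Returns (observed agreement, Cohen's kappa, Gwet's AC1).
    """
    n = a + b + c + d
    p_obs = (a + d) / n
    p1 = (a + b) / n          # positive rate of the first method
    p2 = (a + c) / n          # positive rate of the second method

    # Cohen's kappa: chance agreement from the product of the marginals.
    pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)
    kappa = (p_obs - pe_kappa) / (1 - pe_kappa)

    # Gwet's AC1: chance agreement from the mean positive rate.
    pi = (p1 + p2) / 2
    pe_ac1 = 2 * pi * (1 - pi)
    ac1 = (p_obs - pe_ac1) / (1 - pe_ac1)
    return p_obs, kappa, ac1


# Counts reconstructed from the abstract (HPS as first method, FRAX as second).
p_obs, kappa, ac1 = two_rater_agreement(a=391, b=392, c=23, d=449)
print(f"observed agreement = {p_obs:.3f}, kappa = {kappa:.3f}, AC1 = {ac1:.3f}")
```

    On these counts the observed agreement is roughly 0.67, kappa roughly 0.39 and AC1 roughly 0.34, which is in line with the abstract's description of overall agreement as minimal.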

  1. The Moneron Tsunami of September 5, 1971, and Its Manifestation on the Sakhalin Island Coast: Numerical Simulation Results

    NASA Astrophysics Data System (ADS)

    Kostenko, I. S.; Zaytsev, A. I.; Minaev, D. D.; Kurkin, A. A.; Pelinovsky, E. N.; Oshmarina, O. E.

    2018-01-01

    Observation data on the September 5, 1971, earthquake that occurred near the Moneron Island (Sakhalin) have been analyzed and a numerical simulation of the tsunami induced by this earthquake is conducted. The tsunami source identified in this study indicates that the observational data are in good agreement with the results of calculations performed on the basis of shallow-water equations.

  2. Reproducibility and predictive value of scoring stromal tumour infiltrating lymphocytes in triple-negative breast cancer: a multi-institutional study.

    PubMed

    O'Loughlin, Mark; Andreu, Xavier; Bianchi, Simonetta; Chemielik, Ewa; Cordoba, Alicia; Cserni, Gábor; Figueiredo, Paulo; Floris, Giuseppe; Foschini, Maria P; Heikkilä, Päivi; Kulka, Janina; Liepniece-Karele, Inta; Regitnig, Peter; Reiner, Angelika; Ryska, Ales; Sapino, Anna; Shalaby, Aliaa; Stovgaard, Elisabeth Specht; Quinn, Cecily; Walsh, Elaine M; Zolota, Vicky; Glynn, Sharon A; Callagy, Grace

    2018-05-17

    Several studies have demonstrated a prognostic role for stromal tumour infiltrating lymphocytes (sTILs) in triple-negative breast cancer (TNBC). The reproducibility of scoring sTILs is variable with potentially excellent concordance being achievable using a software tool. We examined agreement between breast pathologists across Europe scoring sTILs on H&E-stained sections without software, an approach that is easily applied in clinical practice. The association between sTILs and response to anthracycline-taxane NACT was also examined. Pathologists from the European Working Group for Breast Screening Pathology scored sTILs in 84 slides from 75 TNBCs using the immune-oncology biomarker working group guidance in two circulations. There were 16 participants in the first and 19 in the second circulation. Moderate agreement was achieved for absolute sTILs scores (intraclass correlation coefficient (ICC) = 0.683, 95% CI 0.601-0.767, p-value < 0.001). Agreement was less when a 25% threshold was used (ICC 0.509, 95% CI 0.416-0.614, p-value < 0.001) and for lymphocyte predominant breast cancer (LPBC) (ICC 0.504, 95% CI 0.412-0.610, p-value < 0.001). Intra-observer agreement was strong for absolute sTIL values (Spearman ρ = 0.727); fair for sTILs ≥ 25% (κ = 0.53) and for LPBC (κ = 0.49), but poor for sTILs as 10% increments (κ = 0.24). Increasing sTILs was significantly associated with an increased likelihood of a pathological complete response (pCR) on multivariable analysis. Increasing sTILs in TNBCs improves the likelihood of a pCR. However, inter-observer agreement is such that H&E-based assessment is not sufficiently reproducible for clinical application. Other methodologies should be explored, but may be at the cost of ease of application.

  3. Gender Agreement Attraction in Russian: Production and Comprehension Evidence

    PubMed Central

    Slioussar, Natalia; Malko, Anton

    2016-01-01

    Agreement attraction errors (such as the number error in the example “The key to the cabinets are rusty”) have been the object of many studies in the last 20 years. So far, almost all production experiments and all comprehension experiments looked at binary features (primarily at number in Germanic, Romance, and some other languages, in several cases at gender in Romance languages). Among other things, it was noted that both in production and in comprehension, attraction effects are much stronger for some feature combinations than for the others: they can be observed in the sentences with singular heads and plural dependent nouns (e.g., “The key to the cabinets…”), but not in the sentences with plural heads and singular dependent nouns (e.g., “The keys to the cabinet…”). Almost all proposed explanations of this asymmetry appeal to feature markedness, but existing findings do not allow teasing different approaches to markedness apart. We report the results of four experiments (one on production and three on comprehension) studying subject-verb gender agreement in Russian, a language with three genders. Firstly, we found attraction effects both in production and in comprehension, but, unlike in the case of number agreement, they were not parallel (in production, feminine gender triggered the strongest effects and neuter the weakest, while in comprehension, masculine triggered the weakest effects). Secondly, in the comprehension experiments attraction was observed for all dependent noun genders, but only for a subset of head noun genders. This goes against the traditional assumption that the features of the dependent noun are crucial for attraction, showing the features of the head are more important. We demonstrate that this approach can be extended to previous findings on attraction and that there exists other evidence for it. In total, these findings let us reconsider the question of which properties of features are crucial for agreement attraction in production and in comprehension. PMID:27867365

  4. Gender Agreement Attraction in Russian: Production and Comprehension Evidence.

    PubMed

    Slioussar, Natalia; Malko, Anton

    2016-01-01

    Agreement attraction errors (such as the number error in the example "The key to the cabinets are rusty") have been the object of many studies in the last 20 years. So far, almost all production experiments and all comprehension experiments looked at binary features (primarily at number in Germanic, Romance, and some other languages, in several cases at gender in Romance languages). Among other things, it was noted that both in production and in comprehension, attraction effects are much stronger for some feature combinations than for the others: they can be observed in the sentences with singular heads and plural dependent nouns (e.g., "The key to the cabinets…"), but not in the sentences with plural heads and singular dependent nouns (e.g., "The keys to the cabinet…"). Almost all proposed explanations of this asymmetry appeal to feature markedness, but existing findings do not allow teasing different approaches to markedness apart. We report the results of four experiments (one on production and three on comprehension) studying subject-verb gender agreement in Russian, a language with three genders. Firstly, we found attraction effects both in production and in comprehension, but, unlike in the case of number agreement, they were not parallel (in production, feminine gender triggered the strongest effects and neuter the weakest, while in comprehension, masculine triggered the weakest effects). Secondly, in the comprehension experiments attraction was observed for all dependent noun genders, but only for a subset of head noun genders. This goes against the traditional assumption that the features of the dependent noun are crucial for attraction, showing the features of the head are more important. We demonstrate that this approach can be extended to previous findings on attraction and that there exists other evidence for it. In total, these findings let us reconsider the question of which properties of features are crucial for agreement attraction in production and in comprehension.

  5. Portable gamma camera guidance in sentinel lymph node biopsy: prospective observational study of consecutive cases.

    PubMed

    Peral Rubio, F; de La Riva, P; Moreno-Ramírez, D; Ferrándiz-Pulido, L

    2015-06-01

    Sentinel lymph node biopsy is the most important tool available for node staging in patients with melanoma. To analyze sentinel lymph node detection and dissection with radio guidance from a portable gamma camera. To assess the number of complications attributable to this biopsy technique. Prospective observational study of a consecutive series of patients undergoing radioguided sentinel lymph node biopsy. We analyzed agreement between nodes detected by presurgical lymphography, those detected by the gamma camera, and those finally dissected. A total of 29 patients (17 women [62.5%] and 12 men [37.5%]) were enrolled. The mean age was 52.6 years (range, 26-82 years). The sentinel node was dissected from all patients; secondary nodes were dissected from some. In 16 cases (55.2%), there was agreement between the number of nodes detected by lymphography, those detected by the gamma camera, and those finally dissected. The only complications observed were seromas (3.64%). No cases of wound dehiscence, infection, hematoma, or hemorrhage were observed. Portable gamma-camera radio guidance may be of use in improving the detection and dissection of sentinel lymph nodes and may also reduce complications. These goals are essential in a procedure whose purpose is melanoma staging. Copyright © 2014 Elsevier España, S.L.U. and AEDV. All rights reserved.

  6. Sediment porewater toxicity assessment studies in the vicinity of offshore oil and gas production platforms in the Gulf of Mexico

    USGS Publications Warehouse

    Carr, R.S.; Chapman, D.C.; Presley, B.J.; Biedenbach, J.M.; Robertson, L.; Boothe, P.; Kilada, R.; Wade, T.; Montagna, P.

    1996-01-01

    As part of a multidisciplinary program to assess the potential long-term impacts of offshore oil and gas exploration and production activities in the Gulf of Mexico, sediment chemical analyses and porewater toxicity tests were conducted in the vicinity of five offshore platforms. Based on data from sea urchin fertilization and embryological development assays, toxicity was observed near four of the five platforms sampled; the majority of the toxic samples were collected within 150 m of a platform. There was excellent agreement among the results of porewater tests with three different species (sea urchin embryological development, polychaete reproduction, and copepod nauplii survival). The sediment concentrations of several metals were well in excess of sediment quality assessment guidelines at a number of stations, and good agreement was observed between predicted and observed toxicity. Porewater metal concentrations compared with EC50, LOEC, and NOEC values generated for water-only exposures indicated that the porewater concentrations for several metals were high enough to account for the observed toxicity. Results of these studies utilizing highly sensitive toxicity tests suggest that the contaminant-induced impacts from offshore platforms are limited to a localized area in the immediate vicinity of the platforms. 

  7. Development and Reliability Testing of a Fast-Food Restaurant Observation Form.

    PubMed

    Rimkus, Leah; Ohri-Vachaspati, Punam; Powell, Lisa M; Zenk, Shannon N; Quinn, Christopher M; Barker, Dianne C; Pugach, Oksana; Resnick, Elissa A; Chaloupka, Frank J

    2015-01-01

    To develop a reliable observational data collection instrument to measure characteristics of the fast-food restaurant environment likely to influence consumer behaviors, including product availability, pricing, and promotion. The study used observational data collection. Restaurants were in the Chicago Metropolitan Statistical Area. A total of 131 chain fast-food restaurant outlets were included. Interrater reliability was measured for product availability, pricing, and promotion measures on a fast-food restaurant observational data collection instrument. Analysis was done with Cohen's κ coefficient and proportion of overall agreement for categorical variables and intraclass correlation coefficient (ICC) for continuous variables. Interrater reliability, as measured by average κ coefficient, was .79 for menu characteristics, .84 for kids' menu characteristics, .92 for food availability and sizes, .85 for beverage availability and sizes, .78 for measures on the availability of nutrition information, .75 for characteristics of exterior advertisements, and .62 and .90 for exterior and interior characteristics measures, respectively. For continuous measures, average ICC was .88 for food pricing measures, .83 for beverage prices, and .65 for counts of exterior advertisements. Over 85% of measures demonstrated substantial or almost perfect agreement. Although some measures required revision or protocol clarification, results from this study suggest that the instrument may be used to reliably measure the fast-food restaurant environment.

  8. Industrial companies' demand for energy based on a micro panel database -- Effects of CO2 taxation and agreements on energy savings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bjoerner, T.B.; Togeby, M.

    1999-07-01

    An econometric panel data analysis of industrial demand for electricity and energy is presented. In the panel, energy consumption, production and value added are observed at company level. The authors estimate price and production elasticities for electricity and total energy (i.e. the X per cent change in demand for, say, electricity resulting from a one per cent increase in the price of electricity). The estimated price and production elasticities are allowed to vary according to company characteristics such as industrial sub-sector, company size, energy intensity and type of ownership. Most previous econometric studies on industrial energy demand use aggregate data, while a couple of micro level studies mainly employ cross-section analysis. To the authors' knowledge this is only the second econometric study on industrial energy demand based on a large micro panel database. More than 2,700 Danish industrial companies during the period 1983 to 1995 are included in the model (covering the majority of all Danish industrial energy consumption). One advantage of micro data is that these data can be used to estimate the effect of an instrument like voluntary energy agreements. By entering a voluntary energy agreement a Danish company avoids paying the usual CO2 tax. Preliminary analyses suggest that there is a large positive gross reduction of electricity and total energy consumption of companies with energy agreements. However, the authors also find that companies would have had about the same reduction in electricity consumption if they had not entered into an agreement, but instead paid the full CO2 tax. Thus, the analysis suggests that the net effect on electricity use of the voluntary energy agreements is very low (perhaps even negative).

  9. Laboratory blood analysis in Strigiformes-Part II: plasma biochemistry reference intervals and agreement between the Abaxis Vetscan V2 and the Roche Cobas c501.

    PubMed

    Ammersbach, Mélanie; Beaufrère, Hugues; Gionet Rollick, Annick; Tully, Thomas

    2015-03-01

    Limited plasma biochemical information is available in Strigiformes. Only one study investigated the agreement between a point-of-care with a reference laboratory analyzer for biochemistry variables in birds. The objective was to report reference intervals (RI) for plasma biochemistry variables in Strigiformes, and to assess agreement between the Abaxis Vetscan V2 and Roche Cobas c501. A prospective study was designed to assess plasma biochemistry RI for concentration of calcium, phosphorus, total protein, albumin, globulin, glucose, bilirubin, uric acid, bile acids, sodium, potassium, and chloride, and activities of AST, GGT, CK, amylase, lipase, LDH, and GLDH. In addition, the agreement between the Vetscan and the Cobas in owl species was assessed. A total of 190 individuals were sampled belonging to 12 Strigiformes species including Barn Owls, Barred Owls, Great Horned Owls, Eurasian Eagle Owls, Spectacled Owls, Eastern Screech Owls, Long-Eared Owls, Short-Eared Owls, Great Gray Owls, Snowy Owls, Northern Saw-Whet Owls, and Northern Hawk-Owls. Order-, species-, and method-specific RI were determined on both analyzers. Although Vetscan data were not equivalent to the Cobas, 4 analytes (glucose, AST, CK, and total protein, with correction for bias) were within acceptable agreement, 3 analytes (uric acid, calcium, and phosphorus) were within close agreement, and the remaining analytes were in strong disagreement. Species-specific differences were observed notably for the concentration of glucose in Barn Owls and electrolytes in Northern Saw-Whet Owls. Overall, this study suggests that the Vetscan has acceptable clinical performance in Strigiformes for some analytes and highlights discrepancies for several analytes. © 2015 American Society for Veterinary Clinical Pathology.

  10. Impact of 4D-(18)FDG-PET/CT imaging on target volume delineation in SBRT patients with central versus peripheral lung tumors. Multi-reader comparative study.

    PubMed

    Chirindel, Alin; Adebahr, Sonja; Schuster, Daniel; Schimek-Jasch, Tanja; Schanne, Daniel H; Nemer, Ursula; Mix, Michael; Meyer, Philipp; Grosu, Anca-Ligia; Brunner, Thomas; Nestle, Ursula

    2015-06-01

    Evaluation of the effect of co-registered 4D-(18)FDG-PET/CT for SBRT target delineation in patients with central versus peripheral lung tumors. Analysis of internal target volume (ITV) delineation of central and peripheral lung lesions in 21 SBRT-patients. Manual delineation was performed by 4 observers in 2 contouring phases: on respiratory gated 4DCT with diagnostic 3DPET available aside (CT-ITV) and on co-registered 4DPET/CT (PET/CT-ITV). Comparative analysis of volumes and inter-reader agreement. 11 cases of peripheral and 10 central lesions were evaluated. In peripheral lesions, average CT-ITV was 6.2 cm(3) and PET/CT-ITV 8.6 cm(3), resembling a mean change in hypothetical radius of 2 mm. For both CT-ITVs and PET/CT-ITVs inter-reader agreement was good and unchanged (0.733 and 0.716; p=0.58). All PET/CT-ITVs stayed within the PTVs derived from CT-ITVs. In central lesions, average CT-ITVs were 42.1 cm(3), PET/CT-ITVs 44.2 cm(3), without significant overall volume changes. Inter-reader agreement improved significantly (0.665 and 0.750; p<0.05). 2/10 PET/CT-ITVs exceeded the PTVs derived from CT-ITVs by >1 ml on average for all observers. The addition of co-registered 4DPET data to 4DCT based target volume delineation for SBRT of centrally located lung tumors increases the inter-observer agreement and may help to avoid geographic misses. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  11. Comparison of two methods based on photoplethysmography for the diagnosis of peripheral arterial disease.

    PubMed

    Høyer, Christian; Nielsen, Nikolaj Schandorph; Jordansen, Malene Kragh Overvad; Zacho, Helle Damgaard

    2017-12-01

    To examine the interchangeability of two methods for distal pressure measurement based on photoplethysmography using a truncated or full display of the arterial inflow curve, respectively. Toe and ankle pressures were obtained from 69 patients suspected of peripheral arterial disease (PAD). Observer reproducibility of the curve readings was examined by blinded reassessment of the pressure curves in a randomly selected subgroup (60 limbs). There were no significant differences in mean pressures between the two methods (p for all > .455). The limits of agreement for the differences were -15.0 to 15.4 mmHg for right toe pressures, -16.3 to 16.2 mmHg for left toe pressures, -14.2 to 15.7 mmHg for right ankle pressures, and -18.3 to 17.7 mmHg for left ankle pressures. Correlation analysis revealed intraclass correlation coefficients ≥0.960 for all measuring sites. Cohen's Kappa showed excellent agreement in diagnostic classification, with κ = 0.930 for the diagnosis of PAD and perfect agreement in the diagnosis of critical limb ischemia (κ = 1.000). The analysis of intra-observer variation for curve reading showed limits of agreement of -3.9 to 4.0 for toe pressures and -7.6 to 7.7 for ankle pressures for the method involving truncated display and -3.1 to 3.2 for toe pressures and -6.3 to 8.6 for ankle pressures for the method involving full display of the signal. The present study shows minimal differences in diagnostic classification, as well as in ankle and toe pressures, between the full display and the truncated display of the photoplethysmographic pulse signal. Furthermore, the inter-observer variation was low for both of the photoplethysmographic methods investigated.
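
    Limits of agreement such as those quoted in this abstract are conventionally obtained with the Bland-Altman approach: take the paired differences between the two methods, then report the mean difference (bias) plus and minus 1.96 standard deviations of the differences. The sketch below shows that calculation in Python. The function name and the pressure values are hypothetical and chosen purely for illustration; they are not data from the study, and the abstract does not state which software or variant of the method the authors used.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)            # systematic difference between the methods
    spread = z * stdev(diffs)     # sample SD of the differences, scaled by z
    return bias, bias - spread, bias + spread


# Hypothetical toe pressures in mmHg (illustrative values only).
truncated_display = [92.0, 110.0, 75.0, 130.0, 101.0, 88.0]
full_display = [95.0, 104.0, 79.0, 126.0, 103.0, 85.0]

bias, lower, upper = bland_altman_limits(truncated_display, full_display)
print(f"bias = {bias:.1f} mmHg, limits of agreement = {lower:.1f} to {upper:.1f} mmHg")
```

    A bias near zero with narrow limits suggests the two methods can be used interchangeably; whether the limits are narrow enough is a clinical judgement rather than a statistical one.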

  12. Intra-rater reliability and agreement of various methods of measurement to assess dorsiflexion in the Weight Bearing Dorsiflexion Lunge Test (WBLT) among female athletes.

    PubMed

    Langarika-Rocafort, Argia; Emparanza, José Ignacio; Aramendi, José F; Castellano, Julen; Calleja-González, Julio

    2017-01-01

    To examine the intra-observer reliability and agreement between five methods of measurement for dorsiflexion during Weight Bearing Dorsiflexion Lunge Test and to assess the degree of agreement between three methods in female athletes. Repeated measurements study design. Volleyball club. Twenty-five volleyball players. Dorsiflexion was evaluated using five methods: heel-wall distance, first toe-wall distance, inclinometer at tibia, inclinometer at Achilles tendon and the dorsiflexion angle obtained by a simple trigonometric function. For the statistical analysis, agreement was studied using the Bland-Altman method, the Standard Error of Measurement and the Minimum Detectable Change. Reliability analysis was performed using the Intraclass Correlation Coefficient. Measurement methods using the inclinometer had more than 6° of measurement error. The angle calculated by trigonometric function had 3.28° error. The inclinometer-based methods had ICC values < 0.90. Distance-based methods and trigonometric angle measurement had ICC values > 0.90. Concerning the agreement between methods, bias ranged from 1.93° to 14.42° and random error from 4.24° to 7.96°. To assess DF angle in WBLT, the angle calculated by a trigonometric function is the most repeatable method. The methods of measurement cannot be used interchangeably. Copyright © 2016 Elsevier Ltd. All rights reserved.
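
    The Standard Error of Measurement (SEM) and Minimum Detectable Change (MDC) named in this abstract are commonly derived from the ICC as SEM = SD x sqrt(1 - ICC) and MDC95 = 1.96 x sqrt(2) x SEM. The abstract does not say which formulas the authors applied, so the Python sketch below assumes these widely used definitions; the function names and the input values (a between-subject SD of 5.0° and an ICC of 0.93) are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), with ICC expected in [0, 1]."""
    if not 0 <= icc <= 1:
        raise ValueError("ICC must lie in [0, 1]")
    return sd * math.sqrt(1 - icc)


def minimum_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%)."""
    # sqrt(2) accounts for measurement error in both test and retest.
    return z * math.sqrt(2) * sem


# Hypothetical inputs: SD of 5.0 degrees, ICC of 0.93.
sem = standard_error_of_measurement(sd=5.0, icc=0.93)
mdc95 = minimum_detectable_change(sem)
print(f"SEM = {sem:.2f} degrees, MDC95 = {mdc95:.2f} degrees")
```

    Under these assumed inputs a change smaller than the MDC95 could not be distinguished from measurement error, which is why a method with a lower ICC needs a larger observed change before it can be interpreted.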

  13. Damage Precursor Detection in Polymer Matrix Composites Using Novel Smart Composite Particles

    DTIC Science & Technology

    2016-09-20

    during the deformation test. Good agreement was observed with experimental results: the intensity of fluorescence was found to be directly proportional to...agreement is observed with experimental results, for which the intensity of fluorescence was found to be directly proportional to the deformation. Epoxy...the estimated Tgs of both neat epoxy and the smart polymer were compared with the experimental results obtained by DSC. Unit cell preparation

  14. Interobserver and Intraobserver Agreement on Qualitative Assessments of Right Ventricular Dysfunction With Echocardiography in Patients With Pulmonary Embolism.

    PubMed

    Weekes, Anthony J; Oh, Laura; Thacker, Gregory; Johnson, Angela K; Runyon, Michael; Rose, Geoffrey; Johnson, Thomas; Templin, Megan; Norton, H James

    2016-10-01

    To evaluate observer agreement using qualitative goal-directed echocardiographic criteria for right ventricular (RV) dysfunction prognostication in submassive pulmonary embolism (PE). Two emergency physicians and 2 cardiologists independently reviewed 31 packets of goal-directed echocardiographic video clips consisting of at least 3 windows obtained by emergency physicians from normotensive patients with PE. Nine packets were repeated to assess for intraobserver agreement. Right ventricular dysfunction criteria on goal-directed echocardiography were as follows: RV enlargement was present, with a right-to-left ventricular basal diameter ratio of 1.0 or higher and blunting of the apex of the RV in 2 or more different windows; RV systolic dysfunction was present if the tricuspid annulus moved toward the apex 10 mm or less and there was RV free wall hypokinesis; and septal deviation was present with any flattening or deviation of the ventricular septum toward the left ventricle. Among the 4 participants, there was 83.9% agreement on the presence or absence of RV enlargement (κ = 0.84), 74.2% agreement on the presence or absence of RV systolic dysfunction (κ = 0.69), and 71.0% agreement on the presence or absence of septal deviation (κ = 0.59). Intraobserver agreement was 100% for each RV dysfunction variable for each observer (κ = 1.0). Agreement was substantial for both severe RV enlargement and RV systolic dysfunction and moderate for septal deviation. Right ventricular dysfunction assessment with qualitative goal-directed echocardiographic criteria is reproducible for PE risk stratification.

  15. Comparison between the Temperature Measurements by TIMED/SABER and Lidar in the Mid-Latitude

    NASA Technical Reports Server (NTRS)

    Xu, Jiyao; She, C. Y.; Yuan, Wei; Mertens, Chris; Mlynczak, Marty; Russell, James

    2005-01-01

    Comparisons of monthly-mean nighttime temperature profiles observed by the Sodium Lidar at Colorado State University and TIMED/SABER overpasses are made. In the altitude range from 85 km to about 100 km, the two observations are in excellent agreement. Though within each other's error bars, important differences occur below 85 km in the entire year and above 100 km in the summer season. Possible reasons for these differences are high photon noise below 85 km in lidar observations, and less than accurate assumptions in the concentration of important chemical species like oxygen (and its quenching rate) in the SABER retrieval above 100 km. However, the two techniques both show the two-level mesopause thermal structure, with the times of change from one level to the other in excellent agreement. Comparison indicates that the high-level (winter) mesopause altitudes are also in excellent agreement between the two observations, though some difference may exist in the low-level (summer) mesopause altitudes between ground-based and satellite-borne data.

  16. Interobserver variability of radiation therapists aligning to fiducial markers for prostate radiation therapy.

    PubMed

    Deegan, Timothy; Owen, Rebecca; Holt, Tanya; Roberts, Lisa; Biggs, Jennifer; McCarthy, Alicia; Parfitt, Matthew; Fielding, Andrew

    2013-08-01

    As the use of fiducial markers (FMs) for the localisation of the prostate during external beam radiation therapy (EBRT) has become part of routine practice, radiation therapists (RTs) have become increasingly responsible for online image interpretation. The aim of this investigation was to quantify the limits of agreement (LoA) between RTs when localising to FMs with orthogonal kilovoltage (kV) imaging. Six patients receiving prostate EBRT utilising FMs were included in this study. Treatment localisation was performed using kV imaging prior to each fraction. Online stereoscopic assessment of FMs, performed by the treating RTs, was compared with the offline assessment by three RTs. Observer agreement was determined by pairwise Bland-Altman analysis. Stereoscopic analysis of 225 image pairs was performed online at the time of treatment, and offline by three RT observers. Eighteen pairwise Bland-Altman analyses were completed to assess the level of agreement between observers. Localisation by RTs was found to be within clinically acceptable 95% LoAs. Small differences between RTs, in both the online and offline setting, were found to be within clinically acceptable limits. RTs were able to make consistent and reliable judgements when matching FMs on planar kV imaging. © 2013 The Authors. Journal of Medical Imaging and Radiation Oncology © 2013 The Royal Australian and New Zealand College of Radiologists.
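
    The pairwise Bland-Altman analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of the 95% limits of agreement (mean difference plus or minus 1.96 standard deviations of the differences); the function name and the sample values are hypothetical and are not taken from the study.

```python
import statistics


def bland_altman_limits(observer_a, observer_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes the usual 95% limits of agreement: bias +/- 1.96 * SD of the
    paired differences. Inputs are equal-length sequences of numbers.
    """
    if len(observer_a) != len(observer_b) or len(observer_a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [a - b for a, b in zip(observer_a, observer_b)]
    bias = statistics.mean(diffs)   # mean difference between observers
    sd = statistics.stdev(diffs)    # sample standard deviation of differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical couch shifts (mm) recorded by two observers for four images.
shifts_a = [1.0, 2.0, 3.0, 4.0]
shifts_b = [0.5, 2.5, 2.5, 4.5]
bias, lower, upper = bland_altman_limits(shifts_a, shifts_b)
print(f"bias={bias:.2f} mm, 95% LoA=({lower:.2f}, {upper:.2f}) mm")
```

    Whether such limits are acceptable is a clinical judgement made against a pre-specified tolerance, not something the calculation itself decides.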

  17. Nutritional status of children and adolescents based on body mass index: agreement between World Health Organization and International Obesity Task Force

    PubMed Central

    Cavazzotto, Timothy Gustavo; Brasil, Marcos Roberto; Oliveira, Vinicius Machado; da Silva, Schelyne Ribas; Ronque, Enio Ricardo V.; Queiroga, Marcos Roberto; Serassuelo, Helio

    2014-01-01

    Objective: To investigate the agreement between two international criteria for classifying the nutritional status of children and adolescents. Methods: The study included 778 girls and 863 boys aged six to 13 years. Body mass and height were measured and used to calculate the body mass index. Nutritional status was classified according to the cut-off points defined by the World Health Organization and the International Obesity Task Force. The agreement was evaluated using the Kappa statistic and weighted Kappa. Results: For classification of nutritional status, the agreement between the criteria was higher for boys (Kappa 0.77) than for girls (Kappa 0.61). The weighted Kappa was also higher for boys (0.85) than for girls (0.77). The Kappa index varied according to age. When nutritional status was classified in only two categories, appropriate (thinness + accentuated thinness + eutrophy) and overweight (overweight + obesity + severe obesity), the Kappa index presented higher values than those for the classification in six categories. Conclusions: Substantial agreement was observed between the criteria, being higher in males and varying according to age. PMID:24676189
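
    Many records in this listing report the Kappa statistic. As a minimal sketch, assuming the standard unweighted Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), it can be computed as follows; the function name and the example labels are hypothetical and do not reproduce any study's data.

```python
from collections import Counter


def cohens_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("need two non-empty, equal-length rating lists")
    n = len(ratings_1)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    categories = counts_1.keys() | counts_2.keys()
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / n ** 2
    if expected == 1.0:
        return float("nan")  # both raters used one identical category only
    return (observed - expected) / (1.0 - expected)


# Hypothetical classifications of four children by two criteria.
criterion_a = ["normal", "normal", "overweight", "overweight"]
criterion_b = ["normal", "normal", "overweight", "normal"]
print(cohens_kappa(criterion_a, criterion_b))
```

    The weighted Kappa also reported in the abstract additionally gives partial credit for near-misses between ordered categories; it is not covered by this sketch.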

  18. Assessment of Intraobserver and Interobserver Agreement of a New Classification System for Retrograde Periimplantitis.

    PubMed

    Shah, Rucha; Thomas, Raison; Kumar, Tarun; Mehta, Dhoom Singh

    2016-12-01

    Retrograde periimplantitis (RPI) is the inflammatory disease that affects the apical part of an osseointegrated implant while the coronal portion of the implant sustains a normal bone-to-implant interface. The aim of the current study was to assess the intraexaminer and interexaminer reliability of a proposed new classification system for RPI. After a thorough electronic literature search, 56 intraoral periapical radiographs (IOPA) of implants with RPI were collected and classified by 2 independent reviewers, as per the new classification system, into one of 3 classes (mild, moderate, and advanced) based on the amount of bone loss from the apex of the implant to the most coronal part as a percentage of the total implant length. The IOPAs were assessed twice by the same examiners, and both were blinded to each other's observations. The intraobserver agreement ranged from 0.85 to 0.91, which falls under the category of almost perfect agreement. The interexaminer agreement was found to be 0.83, also considered almost perfect agreement. The proposed classification shows good intraexaminer and interexaminer reliability and can be used for treatment planning and prognosis in cases of RPI.

  19. Kinetics of carbon clustering in detonation of high explosives: Does theory match experiment?

    NASA Astrophysics Data System (ADS)

    Velizhanin, Kirill; Watkins, Erik; Dattelbaum, Dana; Gustavsen, Richard; Aslam, Tariq; Podlesak, David; Firestone, Millicent; Huber, Rachel; Ringstrand, Bryan; Willey, Trevor; Bagge-Hansen, Michael; Hodgin, Ralph; Lauderbach, Lisa; van Buuren, Tony; Sinclair, Nicholas; Rigg, Paulo; Seifert, Soenke; Gog, Thomas

    2017-06-01

    Chemical reactions in detonation of carbon-rich high explosives yield carbon clusters as major constituents of the products. Efforts to model carbon clustering as a diffusion-limited irreversible coagulation of carbon clusters go back to the seminal paper by Shaw and Johnson. However, first direct experimental observations of the kinetics of clustering yielded cluster growth one to two orders of magnitude slower than theoretical predictions. Multiple efforts were undertaken to test and revise the basic assumptions of the model in order to achieve better agreement with experiment. We discuss our very recent direct experimental observations of carbon clustering dynamics and demonstrate that these new results are in much better agreement with the modified Shaw-Johnson model. The implications of this much better agreement on our present understanding of detonation carbon clustering processes and possible ways to increase the agreement between theory and experiment even further are discussed.

  20. INFLUENCES OF RESPONSE RATE AND DISTRIBUTION ON THE CALCULATION OF INTEROBSERVER RELIABILITY SCORES

    PubMed Central

    Rolider, Natalie U.; Iwata, Brian A.; Bullock, Christopher E.

    2012-01-01

    We examined the effects of several variations in response rate on the calculation of total, interval, exact-agreement, and proportional reliability indices. Trained observers recorded computer-generated data that appeared on a computer screen. In Study 1, target responses occurred at low, moderate, and high rates during separate sessions so that reliability results based on the four calculations could be compared across a range of values. Total reliability was uniformly high, interval reliability was spuriously high for high-rate responding, proportional reliability was somewhat lower for high-rate responding, and exact-agreement reliability was the lowest of the measures, especially for high-rate responding. In Study 2, we examined the separate effects of response rate per se, bursting, and end-of-interval responding. Response rate and bursting had little effect on reliability scores; however, the distribution of some responses at the end of intervals decreased interval reliability somewhat, proportional reliability noticeably, and exact-agreement reliability markedly. PMID:23322930

  1. Psychophysical Laws and the Superorganism.

    PubMed

    Reina, Andreagiovanni; Bose, Thomas; Trianni, Vito; Marshall, James A R

    2018-03-12

    Through theoretical analysis, we show how a superorganism may react to stimulus variations according to psychophysical laws observed in humans and other animals. We investigate an empirically-motivated honeybee house-hunting model, which describes a value-sensitive decision process over potential nest-sites, at the level of the colony. In this study, we show how colony decision time increases with the number of available nests, in agreement with the Hick-Hyman law of psychophysics, and decreases with mean nest quality, in agreement with Piéron's law. We also show that colony error rate depends on mean nest quality, and difference in quality, in agreement with Weber's law. Psychophysical laws, particularly Weber's law, have been found in diverse species, including unicellular organisms. Our theoretical results predict that superorganisms may also exhibit such behaviour, suggesting that these laws arise from fundamental mechanisms of information processing and decision-making. Finally, we propose a combined psychophysical law which unifies Hick-Hyman's law and Piéron's law, traditionally studied independently; this unified law makes predictions that can be empirically tested.

  2. Comparison of echocardiographic and cardiac magnetic resonance imaging measurements of functional single ventricular volumes, mass, and ejection fraction (from the Pediatric Heart Network Fontan Cross-Sectional Study).

    PubMed

    Margossian, Renee; Schwartz, Marcy L; Prakash, Ashwin; Wruck, Lisa; Colan, Steven D; Atz, Andrew M; Bradley, Timothy J; Fogel, Mark A; Hurwitz, Lynne M; Marcus, Edward; Powell, Andrew J; Printz, Beth F; Puchalski, Michael D; Rychik, Jack; Shirali, Girish; Williams, Richard; Yoo, Shi-Joon; Geva, Tal

    2009-08-01

    Assessment of the size and function of a functional single ventricle (FSV) is a key element in the management of patients after the Fontan procedure. Measurement variability of ventricular mass, volume, and ejection fraction (EF) among observers by echocardiography and cardiac magnetic resonance imaging (CMR) and their reproducibility among readers in these patients have not been described. From the 546 patients enrolled in the Pediatric Heart Network Fontan Cross-Sectional Study (mean age 11.9 +/- 3.4 years), 100 echocardiograms and 50 CMR studies were assessed for measurement reproducibility; 124 subjects with paired studies were selected for comparison between modalities. Interobserver agreement for qualitative grading of ventricular function by echocardiography was modest for left ventricular (LV) morphology (kappa = 0.42) and weak for right ventricular (RV) morphology (kappa = 0.12). For quantitative assessment, high intraclass correlation coefficients were found for echocardiographic interobserver agreement (LV 0.87 to 0.92, RV 0.82 to 0.85) of systolic and diastolic volumes, respectively. In contrast, intraclass correlation coefficients for LV and RV mass were moderate (LV 0.78, RV 0.72). The corresponding intraclass correlation coefficients by CMR were high (LV 0.96, RV 0.85). Volumes by echocardiography averaged 70% of CMR values. Interobserver reproducibility for the EF was similar for the 2 modalities. Although the absolute mean difference between modalities for the EF was small (<2%), 95% limits of agreement were wide. In conclusion, agreement between observers of qualitative FSV function by echocardiography is modest. Measurements of FSV volume by 2-dimensional echocardiography underestimate CMR measurements, but their reproducibility is high. Echocardiographic and CMR measurements of FSV EF demonstrate similar interobserver reproducibility, whereas measurements of FSV mass and LV diastolic volume are more reproducible by CMR.

  3. The Power of Flash Mob Research: Conducting a Nationwide Observational Clinical Study on Capillary Refill Time in a Single Day.

    PubMed

    Alsma, Jelmer; van Saase, Jan L C M; Nanayakkara, Prabath W B; Schouten, W E M Ineke; Baten, Anique; Bauer, Martijn P; Holleman, Frits; Ligtenberg, Jack J M; Stassen, Patricia M; Kaasjager, Karin H A H; Haak, Harm R; Bosch, Frank H; Schuit, Stephanie C E

    2017-05-01

    Capillary refill time (CRT) is a clinical test used to evaluate the circulatory status of patients; various methods are available to assess CRT. Conventional clinical research often demands large numbers of patients, making it costly, labor-intensive, and time-consuming. We studied the interobserver agreement on CRT in a nationwide study by using a novel method of research called flash mob research (FMR). Physicians in the Netherlands were recruited by using word-of-mouth referrals, conventional media, and social media to participate in a nationwide, single-day, "nine-to-five," multicenter, cross-sectional, observational study to evaluate CRT. Patients aged ≥ 18 years presenting to the ED or who were hospitalized were eligible for inclusion. CRT was measured independently (by two investigators) at the patient's sternum and distal phalanx after application of pressure for 5 s (5s) and 15 s (15s). On October 29, 2014, a total of 458 investigators in 38 Dutch hospitals enrolled 1,734 patients. The mean CRT measured at the distal phalanx was 2.3 s (5s, SD 1.1) and 2.4 s (15s, SD 1.3). The mean CRT measured at the sternum was 2.6 s (5s, SD 1.1) and 2.7 s (15s, SD 1.1). Interobserver agreement was higher for the distal phalanx (κ value, 0.40) than for the sternum (κ value, 0.30). Interobserver agreement on CRT is, at best, moderate. CRT measured at the distal phalanx yielded higher interobserver agreement compared with sternal CRT measurements. FMR proved a valuable instrument to investigate a relatively simple clinical question in an inexpensive, quick, and reliable manner. Copyright © 2016 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.

  4. Reliability of mercury-in-silastic strain gauge plethysmography curve reading: influence of clinical clues and observer variation.

    PubMed

    Høyer, Christian; Pavar, Susanne; Pedersen, Begitte H; Biurrun Manresa, José A; Petersen, Lars J

    2013-08-01

    Mercury-in-silastic strain gauge plethysmography (SGP) is a well-established technique for blood flow and blood pressure measurements. The aim of this study was to examine (i) the possible influence of clinical clues, e.g. the presence of wounds and color changes during blood pressure measurements, and (ii) intra- and inter-observer variation of curve interpretation for segmental blood pressure measurements. A total of 204 patients with known or suspected peripheral arterial disease (PAD) were included in a diagnostic accuracy trial. Toe and ankle pressures were measured in both limbs, and primary observers analyzed a total of 804 pressure curve sets. The SGP curves were later reanalyzed separately by two observers blinded to clinical clues. Intra- and inter-observer agreement was quantified using Cohen's kappa and reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. There was an overall agreement regarding patient diagnostic classification (PAD/not PAD) in 202/204 (99.0%) for intra-observer (κ = 0.969, p < 0.001), and 201/204 (98.5%) for inter-observer readings (κ = 0.953, p < 0.001). Reliability analysis showed excellent correlation between blinded versus non-blinded and inter-observer readings for determination of absolute segmental pressures (all intraclass correlation coefficients ≥ 0.984). The coefficient of variance for determination of absolute segmental blood pressure ranged from 2.9-3.4% for blinded/non-blinded data and from 3.8-5.0% for inter-observer data. This study shows a low inter-observer variation among experienced laboratory technicians for reading strain gauge curves. The low variation between blinded/non-blinded readings indicates that SGP measurements are minimally biased by clinical clues.

  5. Role of deuterium desorption kinetics on the thermionic emission properties of polycrystalline diamond films with respect to kinetic isotope effects

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paxton, W. F., E-mail: william.f.paxton@vanderbilt.edu; Howell, M.; Kang, W. P.

    2014-06-21

    The desorption kinetics of deuterium from polycrystalline chemical vapor deposited diamond films were characterized by monitoring the isothermal thermionic emission current behavior. The reaction was observed to follow a first-order trend as evidenced by the decay rate of the thermionic emission current over time, which is in agreement with previously reported studies. However, an Arrhenius plot of the reaction rates at each tested temperature did not exhibit the typical linear behavior, which appears to contradict past observations of the hydrogen (or deuterium) desorption reaction from diamond. This observed deviation from linearity, specifically at lower temperatures, has been attributed to non-classical processes. Though no known previous studies reported similar deviations, a reanalysis of the data obtained in the present study was performed to account for tunneling, which appeared to add merit to this hypothesis. Additional investigations were performed by reevaluating previously reported data involving the desorption of hydrogen (as opposed to deuterium) from diamond, which further indicated this reaction to be dominated by tunneling at the temperatures tested in this study (<775 °C). An activation energy of 3.19 eV and a pre-exponential constant of 2.3 × 10¹² s⁻¹ were determined for the desorption reaction of deuterium from diamond, which is in agreement with previously reported studies.
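
    For orientation, the classical Arrhenius rate implied by the reported activation energy and pre-exponential constant can be sketched as below. This assumes the textbook form k = A exp(-Ea / (kB T)) together with first-order decay; the function names and the example temperature are illustrative assumptions, and the abstract itself reports deviations from this classical behavior at lower temperatures.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_rate(temp_c, activation_ev=3.19, prefactor_per_s=2.3e12):
    """Classical Arrhenius rate constant (1/s) at a temperature in deg C.

    Defaults are the activation energy and pre-exponential constant quoted
    in the abstract; the classical form itself is an assumption here.
    """
    temp_k = temp_c + 273.15
    if temp_k <= 0:
        raise ValueError("temperature must be above absolute zero")
    return prefactor_per_s * math.exp(-activation_ev / (BOLTZMANN_EV_PER_K * temp_k))


def first_order_fraction_remaining(rate_per_s, seconds):
    """Fraction of adsorbate left after `seconds` of first-order desorption."""
    return math.exp(-rate_per_s * seconds)


# Illustrative evaluation at 700 deg C (temperature chosen arbitrarily).
k = arrhenius_rate(700.0)
print(f"k = {k:.3e} 1/s, remaining after 1 h: "
      f"{first_order_fraction_remaining(k, 3600.0):.3f}")
```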

  6. Rule-based exposure assessment versus case-by-case expert assessment using the same information in a community-based study.

    PubMed

    Peters, Susan; Glass, Deborah C; Milne, Elizabeth; Fritschi, Lin

    2014-03-01

    Retrospective exposure assessment in community-based studies is largely reliant on questionnaire information. Expert assessment is often used to assess lifetime occupational exposures, but these assessments generally lack transparency and are very time-consuming. We explored the agreement between a rule-based assessment approach and case-by-case expert assessment of occupational exposures in a community-based study. We used data from a case-control study of childhood acute lymphoblastic leukaemia in which parental occupational exposures were originally assigned by expert assessment. Key questions were identified from the completed parent questionnaires and, on the basis of these, rules were written to assign exposure levels to diesel exhaust, pesticides and solvents. We estimated exposure prevalence separately for fathers and mothers, and used κ statistics to assess the agreement between the two exposure assessment methods. Exposures were assigned to 5829 jobs among 1079 men and 6189 jobs among 1234 women. For both sexes, agreement was good for the two assessment methods of exposure to diesel exhaust at a job level (κ=0.70 for men and κ=0.71 for women) and at a person level (κ=0.74 and κ=0.75). The agreement was good to excellent for pesticide exposure among men (κ=0.74 for jobs and κ=0.84 at a person level) and women (κ=0.68 and κ=0.71 at a job and person level, respectively). Moderate to good agreement was observed for assessment of solvent exposure, which was better for women than men. The rule-based assessment approach appeared to be an efficient alternative for assigning occupational exposures in a community-based study for a selection of occupational exposures.

  7. Validity and reliability of acoustic analysis of respiratory sounds in infants

    PubMed Central

    Elphick, H; Lancaster, G; Solis, A; Majumdar, A; Gupta, R; Smyth, R

    2004-01-01

    Objective: To investigate the validity and reliability of computerised acoustic analysis in the detection of abnormal respiratory noises in infants. Methods: Blinded, prospective comparison of acoustic analysis with stethoscope examination. Validity and reliability of acoustic analysis were assessed by calculating the degree of observer agreement using the κ statistic with 95% confidence intervals (CI). Results: 102 infants under 18 months were recruited. Convergent validity for agreement between stethoscope examination and acoustic analysis was poor for wheeze (κ = 0.07 (95% CI, –0.13 to 0.26)) and rattles (κ = 0.11 (–0.05 to 0.27)) and fair for crackles (κ = 0.36 (0.18 to 0.54)). Both the stethoscope and acoustic analysis distinguished well between sounds (discriminant validity). Agreement between observers for the presence of wheeze was poor for both stethoscope examination and acoustic analysis. Agreement for rattles was moderate for the stethoscope but poor for acoustic analysis. Agreement for crackles was moderate using both techniques. Within-observer reliability for all sounds using acoustic analysis was moderate to good. Conclusions: The stethoscope is unreliable for assessing respiratory sounds in infants. This has important implications for its use as a diagnostic tool for lung disorders in infants, and confirms that it cannot be used as a gold standard. Because of the unreliability of the stethoscope, the validity of acoustic analysis could not be demonstrated, although it could discriminate between sounds well and showed good within-observer reliability. For acoustic analysis, targeted training and the development of computerised pattern recognition systems may improve reliability so that it can be used in clinical practice. PMID:15499065

  8. PI-RADS version 2: evaluation of diffusion-weighted imaging interpretation between b = 1000 and b = 1500 s mm⁻².

    PubMed

    Kwon, Mi-Ri; Kim, Chan Kyo; Kim, Jae-Hun

    2017-11-01

    To investigate the variability of diffusion-weighted imaging (DWI) interpretation of Prostate Imaging Reporting and Data System (PI-RADS) version 2 (v2) in evaluating prostate cancer (PCa). 154 patients with PCa underwent multiparametric 3T MRI, followed by radical prostatectomy. DWI with different b values (b = 0, 100, 1000 and 1500 s mm⁻²) was obtained. Using the PI-RADS v2, two radiologists independently scored suspicious lesions in each patient and compared DWI of b = 1000 s mm⁻² (DWI₁₀₀₀) with DWI of b = 1500 s mm⁻² (DWI₁₅₀₀). On DWI₁₀₀₀ and DWI₁₅₀₀, the intermethod and interobserver agreements of DWI scores were excellent in all patients (κ ≥ 0.873). In each peripheral zone and transition zone DWI scores, both observers showed excellent intermethod agreement between DWI₁₀₀₀ and DWI₁₅₀₀ (κ ≥ 0.897), and interobserver agreement for DWI₁₀₀₀ and DWI₁₅₀₀ was good to excellent (κ ≥ 0.796). For estimating clinically significant cancer, the areas under the receiver operating characteristic curves of DWI₁₀₀₀ and DWI₁₅₀₀ were 0.710 and 0.724 for observer 1 (p = 0.11), and 0.649 and 0.656 for observer 2 (p = 0.12), respectively. The PI-RADS v2 scoring at 3T shows excellent agreement between DWI₁₀₀₀ and DWI₁₅₀₀ in evaluating PCa, with excellent inter-observer agreement. Advance in knowledge: DWI using b = 1000 s mm⁻² instead of b = 1500 s mm⁻² reduces examination time or image distortion, with improved signal-to-noise ratio.

  9. Evidence-based dentistry: analysis of dental anxiety scales for children.

    PubMed

    Al-Namankany, A; de Souza, M; Ashley, P

    2012-03-09

    To review paediatric dental anxiety measures (DAMs) and assess the statistical methods used for validation and their clinical implications. A search of four computerised databases between 1960 and January 2011 associated with DAMs, using pre-specified search terms, to assess the method of validation including the reliability as intra-observer agreement 'repeatability or stability' and inter-observer agreement 'reproducibility' and all types of validity. Fourteen paediatric DAMs were predominantly validated in schools and not in the clinical setting while five of the DAMs were not validated at all. The DAMs that were validated were done so against other paediatric DAMs which may not have been validated previously. Reliability was not assessed in four of the DAMs. However, all of the validated studies assessed reliability which was usually 'good' or 'acceptable'. None of the current DAMs used a formal sample size technique. Diversity was seen between the studies ranging from a few simple pictograms to lists of questions reported by either the individual or an observer. To date there is no scale that can be considered as a gold standard, and there is a need to further develop an anxiety scale with a cognitive component for children and adolescents.

  10. A unified solution to the small scale problems of the ΛCDM model II: introducing parent-satellite interaction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Popolo, A. Del; Delliou, M. Le, E-mail: adelpopolo@oact.inaf.it, E-mail: delliou@ift.unesp.br

    2014-12-01

    We continue the study of the impact of baryon physics on the small scale problems of the ΛCDM model, based on a semi-analytical model (Del Popolo, 2009). With this model, we show how the cusp/core, missing satellite (MSP), Too Big to Fail (TBTF) problems and the angular momentum catastrophe can be reconciled with observations, adding parent-satellite interaction. Such interaction between dark matter (DM) and baryons through dynamical friction (DF) can sufficiently flatten the inner cusp of the density profiles to solve the cusp/core problem. Combining, in our model, a Zolotov et al. (2012)-like correction, similarly to Brooks et al. (2013), and effects of UV heating and tidal stripping, the number of massive, luminous satellites, as seen in the Via Lactea 2 (VL2) subhaloes, is in agreement with the numbers observed in the MW, thus resolving the MSP and TBTF problems. The model also produces a distribution of the angular spin parameter and angular momentum in agreement with observations of the dwarfs studied by van den Bosch, Burkert, and Swaters (2001).

  11. Interobserver variability in recognizing arousal in respiratory sleep disorders.

    PubMed

    Drinnan, M J; Murray, A; Griffiths, C J; Gibson, G J

    1998-08-01

    Daytime sleepiness is a common consequence of repeated arousal in obstructive sleep apnea (OSA). Arousal indices are sometimes used to make decisions on treatment, but there is no evidence that arousals are detected similarly even by experienced observers. Using the American Sleep Disorders Association (ASDA) definition of arousal in terms of the accompanying electroencephalogram (EEG) changes, we have quantified interobserver agreement for arousal scoring and identified factors affecting it. Ten patients with suspected OSA were studied; three representative EEG events during each of light, slow-wave, and rapid-eye-movement (REM) sleep were extracted from each record (90 events total) and evaluated by experts in 14 sleep laboratories. Observers differed (ANOVA, p < 0.001) in the number of events scored as arousal (totals ranged from 23 to 53 of the 90 events). Overall agreement was moderate (kappa = 0.47), but it was best for events during slow-wave sleep, moderate for REM, and poor for light sleep (kappa = 0.60, 0.52, and 0.28, respectively). Agreement was unrelated to arousal duration. We conclude that the ASDA definition of arousal is only moderately repeatable. Account should be taken of this variability when results from different centers are compared.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ade, P. A. R.; Aghanim, N.; Arnaud, M.

    The Virgo cluster is the largest Sunyaev-Zeldovich (SZ) source in the sky, both in terms of angular size and total integrated flux. Planck's wide angular scale and frequency coverage, together with its high sensitivity, enable a detailed study of this big object through the SZ effect. Virgo is well resolved by Planck, showing an elongated structure that correlates well with the morphology observed from X-rays, but extends beyond the observed X-ray signal. We find good agreement between the SZ signal (or Compton parameter, y_c) observed by Planck and the expected signal inferred from X-ray observations and simple analytical models. Owing to its proximity to us, the gas beyond the virial radius in Virgo can be studied with unprecedented sensitivity by integrating the SZ signal over tens of square degrees. In this paper, we study the signal in the outskirts of Virgo and compare it with analytical models and a constrained simulation of the environment of Virgo. Planck data suggest that significant amounts of low-density plasma surround Virgo, out to twice the virial radius. We find the SZ signal in the outskirts of Virgo to be consistent with a simple model that extrapolates the inferred pressure at lower radii, while assuming that the temperature stays in the keV range beyond the virial radius. The observed signal is also consistent with simulations and points to a shallow pressure profile in the outskirts of the cluster. This reservoir of gas at large radii can be linked with the hottest phase of the elusive warm/hot intergalactic medium. Taking the lack of symmetry of Virgo into account, we find that a prolate model is favoured by the combination of SZ and X-ray data, in agreement with predictions. In conclusion, based on the combination of the same SZ and X-ray data, we constrain the total amount of gas in Virgo. Under the hypothesis that the abundance of baryons in Virgo is representative of the cosmic average, we also infer a distance for Virgo of approximately 18 Mpc, in good agreement with previous estimates.

  13. Planck intermediate results. XL. The Sunyaev-Zeldovich signal from the Virgo cluster

    NASA Astrophysics Data System (ADS)

    Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Arnaud, M.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Battaner, E.; Benabed, K.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Catalano, A.; Chamballu, A.; Chiang, H. C.; Christensen, P. R.; Churazov, E.; Clements, D. L.; Colombo, L. P. L.; Combet, C.; Comis, B.; Couchot, F.; Coulais, A.; Crill, B. P.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Dickinson, C.; Diego, J. M.; Dolag, K.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Galeotta, S.; Galli, S.; Ganga, K.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Harrison, D. L.; Helou, G.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Hornstrup, A.; Hovest, W.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Keihänen, E.; Keskitalo, R.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maffei, B.; Maggio, G.; Maino, D.; Mandolesi, N.; Mangilli, A.; Marcos-Caballero, A.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; Mazzotta, P.; Meinhold, P. R.; Melchiorri, A.; Mennella, A.; Migliaccio, M.; Mitra, S.; Miville-Deschênes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Noviello, F.; Novikov, D.; Novikov, I.; Oppermann, N.; Oxborrow, C. A.; Pagano, L.; Pajot, F.; Paoletti, D.; Pasian, F.; Pearson, T. J.; Perdereau, O.; Perotto, L.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Ponthieu, N.; Pratt, G. W.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Santos, D.; Savelainen, M.; Savini, G.; Schaefer, B. M.; Scott, D.; Soler, J. D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sunyaev, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Umana, G.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; Weller, J.; Yvon, D.; Zacchei, A.; Zonca, A.

    2016-12-01

    The Virgo cluster is the largest Sunyaev-Zeldovich (SZ) source in the sky, both in terms of angular size and total integrated flux. Planck's wide angular scale and frequency coverage, together with its high sensitivity, enable a detailed study of this big object through the SZ effect. Virgo is well resolved by Planck, showing an elongated structure that correlates well with the morphology observed from X-rays, but extends beyond the observed X-ray signal. We find good agreement between the SZ signal (or Compton parameter, yc) observed by Planck and the expected signal inferred from X-ray observations and simple analytical models. Owing to its proximity to us, the gas beyond the virial radius in Virgo can be studied with unprecedented sensitivity by integrating the SZ signal over tens of square degrees. We study the signal in the outskirts of Virgo and compare it with analytical models and a constrained simulation of the environment of Virgo. Planck data suggest that significant amounts of low-density plasma surround Virgo, out to twice the virial radius. We find the SZ signal in the outskirts of Virgo to be consistent with a simple model that extrapolates the inferred pressure at lower radii, while assuming that the temperature stays in the keV range beyond the virial radius. The observed signal is also consistent with simulations and points to a shallow pressure profile in the outskirts of the cluster. This reservoir of gas at large radii can be linked with the hottest phase of the elusive warm/hot intergalactic medium. Taking the lack of symmetry of Virgo into account, we find that a prolate model is favoured by the combination of SZ and X-ray data, in agreement with predictions. Finally, based on the combination of the same SZ and X-ray data, we constrain the total amount of gas in Virgo. 
Under the hypothesis that the abundance of baryons in Virgo is representative of the cosmic average, we also infer a distance for Virgo of approximately 18 Mpc, in good agreement with previous estimates.

  14. Planck intermediate results: XL. The Sunyaev-Zeldovich signal from the Virgo cluster

    DOE PAGES

    Ade, P. A. R.; Aghanim, N.; Arnaud, M.; ...

    2016-12-12

    The Virgo cluster is the largest Sunyaev-Zeldovich (SZ) source in the sky, both in terms of angular size and total integrated flux. Planck’s wide angular scale and frequency coverage, together with its high sensitivity, enable a detailed study of this big object through the SZ effect. Virgo is well resolved by Planck, showing an elongated structure that correlates well with the morphology observed from X-rays, but extends beyond the observed X-ray signal. We find good agreement between the SZ signal (or Compton parameter, yc) observed by Planck and the expected signal inferred from X-ray observations and simple analytical models. Owing to its proximity to us, the gas beyond the virial radius in Virgo can be studied with unprecedented sensitivity by integrating the SZ signal over tens of square degrees. In this paper, we study the signal in the outskirts of Virgo and compare it with analytical models and a constrained simulation of the environment of Virgo. Planck data suggest that significant amounts of low-density plasma surround Virgo, out to twice the virial radius. We find the SZ signal in the outskirts of Virgo to be consistent with a simple model that extrapolates the inferred pressure at lower radii, while assuming that the temperature stays in the keV range beyond the virial radius. The observed signal is also consistent with simulations and points to a shallow pressure profile in the outskirts of the cluster. This reservoir of gas at large radii can be linked with the hottest phase of the elusive warm/hot intergalactic medium. Taking the lack of symmetry of Virgo into account, we find that a prolate model is favoured by the combination of SZ and X-ray data, in agreement with predictions. In conclusion, based on the combination of the same SZ and X-ray data, we constrain the total amount of gas in Virgo. 
Under the hypothesis that the abundance of baryons in Virgo is representative of the cosmic average, we also infer a distance for Virgo of approximately 18 Mpc, in good agreement with previous estimates.

  15. Good Agreement Between Transabdominal and Endoscopic Ultrasound of the Pancreas in Chronic Pancreatitis.

    PubMed

    Engjom, Trond; Pham, Khahn Do-Chong; Erchinger, Friedemann; Haldorsen, Ingfrid Salvesen; Gilja, Odd Helge; Dimcevski, Georg; Havre, Roald Flesland

    2018-03-26

    We aimed to evaluate the agreement of single criteria and dedicated scores from transabdominal ultrasound of the pancreas (US) compared to standards by endoscopic ultrasound (EUS) and computed tomography (CT). In this observational cohort study performed in a tertiary care center, US and EUS were performed in 110 patients referred for suspected chronic pancreatitis (CP). Based on the Mayo score, 52 patients were diagnosed with CP. The sonographic findings obtained by both methods were registered. The number of criteria was counted and scored according to the Rosemont score. Agreement between the number of detected US and EUS criteria was substantial (ICC = 0.74 [0.61 - 0.83]). Adding Rosemont weighting improved the agreement (ICC = 0.88 [0.81 - 0.92]). Regarding individual criteria, the agreement was substantial for the detection of calcifications (κ = 0.86) and moderate for cysts and irregular or dilated pancreatic duct (κ = 0.42 - 0.58). Agreement for the other criteria was poorer (κ ≤ 0.40). The diagnostic performance indices [95 % CI] of US for diagnosing CP (using Mayo score as reference standard) were for the unweighted score: Sensitivity: 0.65 [0.51 - 0.78], specificity: 0.97 [0.87 - 1.00]; and for Rosemont score: Sensitivity: 0.75 [0.61 - 0.86], specificity: 0.95 [0.83 - 0.99]. The agreement between US and EUS for the unweighted and weighted scores was substantial. For the features calcifications, cysts and main pancreatic duct (MPD) changes, agreement was moderate to substantial. For the other detected US criteria, the agreement with EUS was too poor to be clinically relevant. © Georg Thieme Verlag KG Stuttgart · New York.

  16. A novel magnetic resonance imaging segmentation technique for determining diffuse intrinsic pontine glioma tumor volume.

    PubMed

    Singh, Ranjodh; Zhou, Zhiping; Tisnado, Jamie; Haque, Sofia; Peck, Kyung K; Young, Robert J; Tsiouris, Apostolos John; Thakur, Sunitha B; Souweidane, Mark M

    2016-11-01

    OBJECTIVE Accurately determining diffuse intrinsic pontine glioma (DIPG) tumor volume is clinically important. The aims of the current study were to 1) measure DIPG volumes using methods that require different degrees of subjective judgment; and 2) evaluate interobserver agreement of measurements made using these methods. METHODS Eight patients from a Phase I clinical trial testing convection-enhanced delivery (CED) of a therapeutic antibody were included in the study. Pre-CED, post-radiation therapy axial T2-weighted images were analyzed using 2 methods requiring high degrees of subjective judgment (picture archiving and communication system [PACS] polygon and Volume Viewer auto-contour methods) and 1 method requiring a low degree of subjective judgment (k-means clustering segmentation) to determine tumor volumes. Lin's concordance correlation coefficients (CCCs) were calculated to assess interobserver agreement. RESULTS The CCCs of measurements made by 2 observers with the PACS polygon and the Volume Viewer auto-contour methods were 0.9465 (lower 1-sided 95% confidence limit 0.8472) and 0.7514 (lower 1-sided 95% confidence limit 0.3143), respectively. Both were considered poor agreement. The CCC of measurements made using k-means clustering segmentation was 0.9938 (lower 1-sided 95% confidence limit 0.9772), which was considered substantial strength of agreement. CONCLUSIONS The poor interobserver agreement of PACS polygon and Volume Viewer auto-contour methods highlighted the difficulty in consistently measuring DIPG tumor volumes using methods requiring high degrees of subjective judgment. k-means clustering segmentation, which requires a low degree of subjective judgment, showed better interobserver agreement and produced tumor volumes with delineated borders.
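
    The abstract above reports Lin's concordance correlation coefficients for paired tumor-volume measurements by 2 observers. As an illustration only, a minimal sketch of the standard CCC point estimate follows; the function name lins_ccc and the example volumes are hypothetical, the study's own software is not stated in the abstract, and the one-sided confidence limits it reports are not computed here.

```python
from statistics import fmean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Uses population (1/n) variances and covariance, following Lin's
    original definition: 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference penalizes systematic offset between observers.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    # Hypothetical volumes (cm^3) from two observers, not data from the study.
    observer_1 = [21.4, 30.2, 18.9, 25.0, 33.1]
    observer_2 = [22.0, 29.5, 19.4, 26.1, 32.0]
    print(round(lins_ccc(observer_1, observer_2), 4))
```

    Unlike a Pearson correlation, the coefficient drops when one observer reads consistently higher than the other, which is why it is used for agreement rather than association.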

  17. Using Web-Based Questionnaires and Obstetric Records to Assess General Health Characteristics Among Pregnant Women: A Validation Study

    PubMed Central

    Schouten, Naomi PE; Merkus, Peter JFM; Verhaak, Chris M; Roeleveld, Nel; Roukema, Jolt

    2015-01-01

    Background: Self-reported medical history information is included in many studies. However, data on the validity of Web-based questionnaires assessing medical history are scarce. If proven to be valid, Web-based questionnaires may provide researchers with an efficient means to collect data on this parameter in large populations. Objective: The aim of this study was to assess the validity of a Web-based questionnaire on chronic medical conditions, allergies, and blood pressure readings against obstetric records and data from general practitioners. Methods: Self-reported questionnaire data were compared with obstetric records for 519 pregnant women participating in the Dutch PRegnancy and Infant DEvelopment (PRIDE) Study from July 2011 through November 2012. These women completed Web-based questionnaires around their first prenatal care visit and in gestational weeks 17 and 34. We calculated kappa statistics (κ) and the observed proportions of positive and negative agreement between the baseline questionnaire and obstetric records for chronic conditions and allergies. In case of inconsistencies between these 2 data sources, medical records from the woman’s general practitioner were consulted as the reference standard. For systolic and diastolic blood pressure, intraclass correlation coefficients (ICCs) were calculated for multiple data points. Results: Agreement between the baseline questionnaire and the obstetric record was substantial (κ=.61) for any chronic condition and moderate for any allergy (κ=.51). For specific conditions, we found high observed proportions of negative agreement (range 0.88-1.00) and on average moderate observed proportions of positive agreement with a wide range (range 0.19-0.90). 
Using the reference standard, the sensitivity of the Web-based questionnaire for chronic conditions and allergies was comparable to or even better than the sensitivity of the obstetric records, in particular for migraine (0.90 vs 0.40, P=.02), asthma (0.86 vs 0.61, P=.04), inhalation allergies (0.92 vs 0.74, P=.003), hay fever (0.90 vs 0.64, P=.001), and allergies to animals (0.89 vs 0.53, P=.01). However, some overreporting of allergies was observed in the questionnaire and for some nonsomatic conditions sensitivity of both measurement instruments was low. The ICCs for blood pressure readings ranged between 0.72 and 0.92 with very small mean differences between the 2 methods of data collection. Conclusions: Web-based questionnaires can be used to validly collect data on many chronic disorders, allergies, and blood pressure readings among pregnant women. PMID:26081990

  18. Using Web-Based Questionnaires and Obstetric Records to Assess General Health Characteristics Among Pregnant Women: A Validation Study.

    PubMed

    van Gelder, Marleen M H J; Schouten, Naomi P E; Merkus, Peter J F M; Verhaak, Chris M; Roeleveld, Nel; Roukema, Jolt

    2015-06-16

    Self-reported medical history information is included in many studies. However, data on the validity of Web-based questionnaires assessing medical history are scarce. If proven to be valid, Web-based questionnaires may provide researchers with an efficient means to collect data on this parameter in large populations. The aim of this study was to assess the validity of a Web-based questionnaire on chronic medical conditions, allergies, and blood pressure readings against obstetric records and data from general practitioners. Self-reported questionnaire data were compared with obstetric records for 519 pregnant women participating in the Dutch PRegnancy and Infant DEvelopment (PRIDE) Study from July 2011 through November 2012. These women completed Web-based questionnaires around their first prenatal care visit and in gestational weeks 17 and 34. We calculated kappa statistics (κ) and the observed proportions of positive and negative agreement between the baseline questionnaire and obstetric records for chronic conditions and allergies. In case of inconsistencies between these 2 data sources, medical records from the woman's general practitioner were consulted as the reference standard. For systolic and diastolic blood pressure, intraclass correlation coefficients (ICCs) were calculated for multiple data points. Agreement between the baseline questionnaire and the obstetric record was substantial (κ=.61) for any chronic condition and moderate for any allergy (κ=.51). For specific conditions, we found high observed proportions of negative agreement (range 0.88-1.00) and on average moderate observed proportions of positive agreement with a wide range (range 0.19-0.90). 
Using the reference standard, the sensitivity of the Web-based questionnaire for chronic conditions and allergies was comparable to or even better than the sensitivity of the obstetric records, in particular for migraine (0.90 vs 0.40, P=.02), asthma (0.86 vs 0.61, P=.04), inhalation allergies (0.92 vs 0.74, P=.003), hay fever (0.90 vs 0.64, P=.001), and allergies to animals (0.89 vs 0.53, P=.01). However, some overreporting of allergies was observed in the questionnaire and for some nonsomatic conditions sensitivity of both measurement instruments was low. The ICCs for blood pressure readings ranged between 0.72 and 0.92 with very small mean differences between the 2 methods of data collection. Web-based questionnaires can be used to validly collect data on many chronic disorders, allergies, and blood pressure readings among pregnant women.
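
    The abstract above reports observed proportions of positive and negative agreement between two data sources. The sketch below assumes the commonly used definitions for a 2 x 2 table (twice the concordant count over twice the concordant count plus the discordant counts); the abstract does not spell out its formula, and the function name and counts are hypothetical.

```python
def positive_negative_agreement(both_yes, only_first, only_second, both_no):
    """Observed proportions of positive and negative agreement (2 x 2 table).

    both_yes / both_no: records where the two sources agree (present / absent).
    only_first / only_second: records where exactly one source reports "present".
    """
    discordant = only_first + only_second
    if both_yes + discordant == 0 or both_no + discordant == 0:
        raise ValueError("agreement proportion is undefined for this table")
    p_pos = 2 * both_yes / (2 * both_yes + discordant)
    p_neg = 2 * both_no / (2 * both_no + discordant)
    return p_pos, p_neg


if __name__ == "__main__":
    # Hypothetical counts for one condition, not data from the study.
    p_pos, p_neg = positive_negative_agreement(10, 5, 5, 80)
    print(f"positive agreement {p_pos:.2f}, negative agreement {p_neg:.2f}")
```

    For a rare condition the negative agreement is dominated by the many records in which both sources report absence, which is consistent with the high negative and much more variable positive agreement described in the abstract.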

  19. Assessment of the variation in American Society of Anesthesiologists [corrected] Physical Status Classification assignment in small animal anaesthesia.

    PubMed

    McMillan, Matthew; Brearley, Jacqueline

    2013-05-01

    To evaluate the interobserver variability in the assignment of the American Society of Anesthesiologists Physical Status Classification (ASA-PSC) to compromised small animal patients amongst a group of veterinary anaesthetists. Anonymous internet survey. Hypothetical case presentations. Sixteen hypothetical small animal cases with differing degrees of physiological or patho-physiological compromise were presented as part of an internet survey. Respondents were asked to assign a single ASA-PSC to each case and also to answer a number of demographic questions. ASA-PSC scores were considered separately and then grouped as scores of I-II and III-V. Agreement was analysed using the modified kappa statistic for multiple observers. Data were then sorted into various demographic groups for further analysis. There were 144 respondents of which 60 (~42%) were anaesthesia diplomates, 24 (~17%) were post-residency (nondiploma holders), 24 (~17%) were current anaesthesia residents, 21 (~15%) were general practitioners, 12 (~8%) were veterinary nurses or technicians, and 3 (~2%) were interns. Although there was a majority agreement (>50% in a single category) in 15 of the 16 cases, ASA-PSC were spread over at least three ASA-PS classifications for every case. Overall agreement was considered only fair (κ = 0.24, mean ± SD agreement 46 ± 7%). When comparing grouped data (ASA-PSC I-II versus III-V) overall agreement remained fair (κ = 0.36, mean ± SD agreement 69 ± 19%). There was no difference in ASA-PSC assignment between any of the demographic groups investigated. This study suggests major discrepancies can occur between observers given identical information when using the ASA-PSC to categorise health status in compromised small animal patients. The significant potential for interobserver variability in classification allocation should be borne in mind when the ASA-PSC is used for clinical, scientific and statistical purposes. © 2013 The Authors. 
Veterinary Anaesthesia and Analgesia © 2013 Association of Veterinary Anaesthetists and the American College of Veterinary Anesthesia and Analgesia.

  20. Measurement agreement between a new biometer based on partial coherence interferometry and a validated biometer based on optical low-coherence reflectometry.

    PubMed

    Li, Junhua; Chen, Hao; Savini, Giacomo; Lu, Weicong; Yu, Xinxin; Bao, Fangjun; Wang, Qinmei; Huang, Jinhai

    2016-01-01

    To evaluate the agreement of ocular measurements obtained with a new optical biometer (AL-Scan) and a previously validated optical biometer (Lenstar). Eye Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China. Prospective observational cross-sectional study. For a comprehensive comparison between the partial coherence interferometry (PCI) device and the optical low-coherence reflectometry (OLCR) device, the axial length (AL), central corneal thickness (CCT), anterior chamber depth (ACD), aqueous depth, mean keratometry (K), astigmatism, white-to-white (WTW), and pupil diameter were measured 3 times per device in eyes with cataract. The sequence of the device was in random order. The mean values were compared and 95% limits of agreement (LoA) were assessed. Ninety-two eyes of 92 cataract patients were included. Bland-Altman analysis showed excellent agreement between the PCI device and the OLCR device for AL, CCT, ACD and aqueous depth measurements with narrow 95% LoA (-0.05 to 0.06 mm, -13.39 to 15.61 μm, -0.11 to 0.10 mm, and -0.12 to 0.10 mm, respectively), and the P values were more than 0.05. The mean K, astigmatism, and WTW values provided by the PCI device were in good agreement with the OLCR device, although statistically significant differences were detected. A major difference was observed in the pupil diameter measurement, with a 95% LoA of -0.73 to 1.21 mm. The PCI device biometer provided ocular measurements similar to those provided by the OLCR device for most parameters, especially for AL, CCT, and ACD. The pupil diameter values obtained with the PCI device were in poor agreement with the OLCR device, and these measurements should be interpreted with necessary adjustment. None of the authors has a proprietary or financial interest in any material or method mentioned. Copyright © 2016 ASCRS and ESCRS. Published by Elsevier Inc. All rights reserved.
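
    The abstract above summarizes device agreement with Bland-Altman 95% limits of agreement (LoA). A minimal sketch of that calculation follows, assuming one value per eye per device (the study averaged 3 readings per device) and approximately normal differences; the function name and the axial-length values are hypothetical.

```python
from statistics import fmean, stdev


def limits_of_agreement(device_a, device_b, z=1.96):
    """Bland-Altman bias and 95% limits of agreement for paired readings.

    Returns (bias, lower, upper), where bias is the mean of the paired
    differences and the limits are bias -/+ z times their sample SD.
    """
    if len(device_a) != len(device_b) or len(device_a) < 2:
        raise ValueError("readings must be paired and hold at least two values")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = fmean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


if __name__ == "__main__":
    # Hypothetical axial lengths (mm) from two biometers, not study data.
    pci = [23.41, 24.10, 22.87, 25.02, 23.65]
    olcr = [23.40, 24.12, 22.85, 25.00, 23.68]
    bias, lower, upper = limits_of_agreement(pci, olcr)
    print(f"bias {bias:.3f} mm, 95% LoA {lower:.3f} to {upper:.3f} mm")
```

    Whether limits of a given width count as "narrow" is a clinical judgment about acceptable measurement error, not something the calculation itself decides.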

  1. Visual-search model observer for assessing mass detection in CT

    NASA Astrophysics Data System (ADS)

    Karbaschi, Zohreh; Gifford, Howard C.

    2017-03-01

    Our aim is to devise model observers (MOs) to evaluate acquisition protocols in medical imaging. To optimize protocols for human observers, an MO must reliably interpret images containing quantum and anatomical noise under aliasing conditions. In this study of sampling parameters for simulated lung CT, the lesion-detection performance of human observers was compared with that of visual-search (VS) observers, a channelized nonprewhitening (CNPW) observer, and a channelized Hotelling (CH) observer. Scans of a mathematical torso phantom modeled single-slice parallel-hole CT with varying numbers of detector pixels and angular projections. Circular lung lesions had a fixed radius. Two-dimensional FBP reconstructions were performed. A localization ROC study was conducted with the VS, CNPW and human observers, while the CH observer was applied in a location-known ROC study. Changing the sampling parameters had negligible effect on the CNPW and CH observers, whereas several VS observers demonstrated a sensitivity to sampling artifacts that was in agreement with how the humans performed.

  2. Assessing distractors and teamwork during surgery: developing an event-based method for direct observation.

    PubMed

    Seelandt, Julia C; Tschan, Franziska; Keller, Sandra; Beldi, Guido; Jenni, Nadja; Kurmann, Anita; Candinas, Daniel; Semmer, Norbert K

    2014-11-01

    To develop a behavioural observation method to simultaneously assess distractors and communication/teamwork during surgical procedures through direct, on-site observations; to establish the reliability of the method for long (>3 h) procedures. Observational categories for an event-based coding system were developed based on expert interviews, observations and a literature review. Using Cohen's κ and the intraclass correlation coefficient, interobserver agreement was assessed for 29 procedures. Agreement was calculated for the entire surgery, and for the 1st hour. In addition, interobserver agreement was assessed between two tired observers and between a tired and a non-tired observer after 3 h of surgery. The observational system has five codes for distractors (door openings, noise distractors, technical distractors, side conversations and interruptions), eight codes for communication/teamwork (case-relevant communication, teaching, leadership, problem solving, case-irrelevant communication, laughter, tension and communication with external visitors) and five contextual codes (incision, last stitch, personnel changes in the sterile team, location changes around the table and incidents). Based on 5-min intervals, Cohen's κ was good to excellent for distractors (0.74-0.98) and for communication/teamwork (0.70-1). Based on frequency counts, intraclass correlation coefficient was excellent for distractors (0.86-0.99) and good to excellent for communication/teamwork (0.45-0.99). After 3 h of surgery, Cohen's κ was 0.78-0.93 for distractors, and 0.79-1 for communication/teamwork. The observational method developed allows a single observer to simultaneously assess distractors and communication/teamwork. Even for long procedures, high interobserver agreement can be achieved. Data collected with this method allow for investigating separate or combined effects of distractions and communication/teamwork on surgical performance and patient outcomes. 
Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  3. Tapering Practices of Strongman Athletes: Test-Retest Reliability Study

    PubMed Central

    Pritchard, Hayden J; Keogh, Justin WL

    2017-01-01

    Background: Little is currently known about the tapering practices of strongman athletes. We have developed an Internet-based comprehensive self-report questionnaire examining the training and tapering practices of strongman athletes. Objective: The objective of this study was to document the test-retest reliability of questions associated with the Internet-based comprehensive self-report questionnaire on the tapering practices of strongman athletes. The information will provide insight on the reliability and usefulness of the online questionnaire for use with strongman athletes. Methods: Invitations to complete an Internet questionnaire were sent via Facebook Messenger to identified strongman athletes. The survey consisted of four main areas of inquiry, including demographics and background information, training practices, tapering, and tapering practices. Of the 454 athletes that completed the survey over the 8-week period, 130 athletes responded on Facebook Messenger indicating that they intended to complete, or had completed, the survey. These participants were asked if they could complete the online questionnaire a second time for a test-retest reliability analysis. Sixty-four athletes (mean age 33.3 years, standard deviation [SD] 7.7; mean height 178.2 cm, SD 11.0; mean body mass 103.7 kg, SD 24.8) accepted this invitation and completed the survey for the second time after a minimum 7-day period from the date of their first completion. Agreement between athlete responses was measured using intraclass correlation coefficients (ICCs) and kappa statistics. Confidence intervals (at 95%) were reported for all measures and significance was set at P<.05. Results: Test-retest reliability for demographic and training practices items was significant (P<.001) and showed excellent (ICC range=.84 to .98) and fair to almost perfect agreement (κ range=.37-.85). 
Moderate to excellent agreements (ICC range=.56-.84; P<.01) were observed for all tapering practice measures except for the number of days athletes started their usual taper before a strongman competition (ICC=.30). When the number of days were categorized with additional analyses, moderate reliability was observed (κ=.43; P<.001). Fair to substantial agreement was observed for the majority of tapering practices measures (κ range=.38-.73; P<.001) except for how training frequency (κ=.26) and the percentage and type of resistance training performed (κ=.20) changed in the taper. Good to excellent agreement (ICC=.62-.93; P<.05) was observed for items relating to strongman events and traditional exercises performed during the taper. Only the time at which the Farmer’s Walk was last performed before competition showed poor reliability (ICC=.27). Conclusions: We have developed a low cost, self-reported, online retrospective questionnaire, which provided stable and reliable answers for most of the demographic, training, and tapering practice questions. The results of this study support the inferences drawn from the Tapering Practices of Strongman Athletes Study. PMID:29089292

  4. Bullying at school: Agreement between caregivers' and children's perception.

    PubMed

    Durán, Lucas G; Scherñuk Schroh, Jordán C; Panizoni, Estefanía P; Jouglard, Ezequiel F; Serralunga, M Gabriela; Esandi, M Eugenia

    2017-02-01

    Bullying at school is usually kept secret from adults, making them unaware of the situation. To describe caregivers' and children's perception and assess their agreement in terms of bullying situations. Cross-sectional study in children aged 8-12 years old attending public schools and their caregivers. The questionnaire on preconceptions of intimidation and bullying among peers (PRECONCIMEI) (child/caregiver version) was used. Studied outcome measures: Scale of bullying, causes of bullying, child involvement in bullying, communication in bullying situations. Univariate and bivariate analyses were done and agreement was estimated using the Kappa index. A total of 529 child/caregiver dyads participated. Among caregivers, 35% stated that bullying occurred in their children's schools. Among children, 133 (25%) admitted to being involved: 70 (13%) were victims of bullying, 40 (8%) were bullies, and 23 (4%) were bullied and perpetrated bullying. Among the 63 caregivers of children who admitted to being bullies, 78% did not consider their children capable of perpetrating bullying. Among children who were bullied or who both suffered bullying and bullied others, 69.9% (65/93) indicated that "if they were the victims of bullying, they would tell their family." However, 89.2% (83/93) of caregivers considered that their children would tell them if they were ever involved in these situations. Agreement in terms of positive communication about school bullying was observed in 62.6% (57/91) of the child/caregiver dyads (Kappa = -0.04). Disagreement was observed between children and their caregivers in relation to the frequency and communication of bullying situations. Few caregivers whose children admitted to being involved in these situations believed it was a possibility. Sociedad Argentina de Pediatría

  5. Osteochondritis dissecans of the humeral capitellum: reliability of four classification systems using radiographs and computed tomography.

    PubMed

    Claessen, Femke M A P; van den Ende, Kimberly I M; Doornberg, Job N; Guitton, Thierry G; Eygendaal, Denise; van den Bekerom, Michel P J

    2015-10-01

    The radiographic appearance of osteochondritis dissecans (OCD) of the humeral capitellum varies according to the stage of the lesion. It is important to evaluate the stage of OCD lesion carefully to guide treatment. We compared the interobserver reliability of currently used classification systems for OCD of the humeral capitellum to identify the most reliable classification system. Thirty-two musculoskeletal radiologists and orthopaedic surgeons specialized in elbow surgery from several countries evaluated anteroposterior and lateral radiographs and corresponding computed tomography (CT) scans of 22 patients to classify the stage of OCD of the humeral capitellum according to the classification systems developed by (1) Minami, (2) Berndt and Harty, (3) Ferkel and Sgaglione, and (4) Anderson on a Web-based study platform including a Digital Imaging and Communications in Medicine viewer. Magnetic resonance imaging was not evaluated as part of this study. We measured agreement among observers using the Siegel and Castellan multirater κ. All OCD classification systems, except for Berndt and Harty, which had poor agreement among observers (κ = 0.20), had fair interobserver agreement: κ was 0.27 for the Minami, 0.23 for Anderson, and 0.22 for Ferkel and Sgaglione classifications. The Minami Classification was significantly more reliable than the other classifications (P < .001). The Minami Classification was the most reliable for classifying different stages of OCD of the humeral capitellum. However, it is unclear whether radiographic evidence of OCD of the humeral capitellum, as categorized by the Minami Classification, guides treatment in clinical practice as a result of this fair agreement. Copyright © 2015 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.

  6. Laboratory evaluation on the sensitivity and specificity of a novel and rapid detection method for malaria diagnosis based on magneto-optical technology (MOT).

    PubMed

    Mens, Petra F; Matelon, Raphael J; Nour, Bakri Y M; Newman, Dave M; Schallig, Henk D F H

    2010-07-19

    This study describes the laboratory evaluation of a novel diagnostic platform for malaria. The Magneto Optical Test (MOT) is based on the bio-physical detection of haemozoin in clinical samples. Having an assay time of around one minute, it offers the potential of high throughput screening. Blood samples of confirmed malaria patients from different regions of Africa, patients with other diseases and healthy non-endemic controls were used in the present study. The samples were analysed with two reference tests, i.e. a histidine-rich protein-2 based rapid diagnostic test (RDT) and a conventional Pan-Plasmodium PCR, and the MOT as index test. Data were entered in 2 x 2 tables and analysed for sensitivity and specificity. The agreement between microscopy, RDT and PCR and the MOT assay was determined by calculating Kappa values with a 95% confidence interval. The observed sensitivity/specificity of the MOT test in comparison with clinical description, RDT or PCR ranged from 77.2 - 78.8% (sensitivity) and from 72.5 - 74.6% (specificity). In general, the agreement between MOT and the other assays is around 0.5, indicating a moderate agreement between the reference and the index test. However, when RDT and PCR are compared to each other, an almost perfect agreement can be observed (k = 0.97) with a sensitivity and specificity of >95%. Although MOT sensitivity and specificity are not yet at a level that competes with other diagnostic tests, such as PCR and RDTs, it has the potential to rapidly screen patients for malaria in endemic as well as non-endemic countries.
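
    The abstract above describes entering results in 2 x 2 tables and deriving sensitivity, specificity and Kappa values. A minimal sketch of those three quantities follows, using the standard unweighted Cohen's kappa; the function name and counts are hypothetical, and the 95% confidence interval mentioned in the abstract is not computed.

```python
def two_by_two_summary(tp, fp, fn, tn):
    """Sensitivity, specificity and Cohen's kappa from a 2 x 2 table.

    tp / fn: reference-positive samples the index test calls positive / negative.
    fp / tn: reference-negative samples the index test calls positive / negative.
    """
    n = tp + fp + fn + tn
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both reference classes must be represented")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of each test.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    sens, spec, kappa = two_by_two_summary(tp=40, fp=10, fn=10, tn=40)
    print(f"sensitivity {sens:.2f}, specificity {spec:.2f}, kappa {kappa:.2f}")
```

    Kappa rescales observed agreement by what the marginal totals alone would produce, so an index test can reach sensitivity and specificity in the 70-80% range and still show only moderate kappa, as the abstract reports.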

  7. Comparison between electromagnetic transponders and radiographic imaging for prostate localization: A pelvic phantom study with rotations and translations.

    PubMed

    Hamilton, Daniel G; McKenzie, Dean P; Perkins, Anne E

    2017-09-01

    The aim of this study was to evaluate the differences in target localization between Calypso®, kV orthogonal imaging and cone-beam computed tomography (CBCT) for combined translations and rotations of an anthropomorphic pelvic phantom. The phantom was localized using all three systems in 50 different positions, with applied translational and rotational offsets randomly sampled from representative normal distributions of prostate motion. Lin's concordance correlation coefficient (ρc) and 95% confidence intervals were calculated to assess the agreement between the localization systems. Mean differences and difference vectors between the three systems were also calculated. Agreement between systems for lateral, vertical, and longitudinal translations was excellent, with ρc values of greater than 0.98 between all three systems in all axes. There was excellent agreement between the systems for rotations around the lateral axis (pitch) (ρc > 0.99), and around the vertical axis (yaw) (ρc > 0.97). However, somewhat poorer agreement for rotations around the longitudinal axis (roll) was observed, with the lowest correlation observed between Calypso and kV orthogonal imaging (ρc = 0.895). Mean differences between the phantom position reported by Calypso and the radiographic systems were less than 1 mm and 1° for all translations and rotations. The results for translations are consistent with the publications of previous authors. There is no comparable published data for rotations. While there is lower correlation between the three systems for roll than for the other angles, the mean differences in reported rotations are not clinically significant. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.

  8. Comparison of translabial three-dimensional ultrasound with magnetic resonance imaging for measurement of levator hiatal biometry at rest.

    PubMed

    Vergeldt, T F M; Notten, K J B; Stoker, J; Fütterer, J J; Beets-Tan, R G; Vliegen, R F A; Schweitzer, K J; Mulder, F E M; van Kuijk, S M J; Roovers, J P W R; Kluivers, K B; Weemhoff, M

    2016-05-01

    To compare translabial three-dimensional (3D) ultrasound with magnetic resonance imaging (MRI) for the measurement of levator hiatal biometry at rest in women with pelvic organ prolapse, and to determine the interobserver reliability between two independent observers for ultrasound and MRI measurements. Data were derived from a multicenter prospective cohort study in which women scheduled for conventional anterior colporrhaphy underwent translabial 3D ultrasound and MRI prior to surgery. Intraclass correlation coefficients (ICCs) were calculated to estimate interobserver reliability between two independent observers and determine the agreement between ultrasound and MRI measurements. Bland-Altman plots were created to assess the agreement between ultrasound and MRI measurements. Data from 139 women from nine hospitals were included in the study. The interobserver reliability of ultrasound assessment at rest, during Valsalva maneuver and during contraction and of MRI assessment at rest was moderate or good. The agreement between ultrasound and MRI for the measurement of levator hiatal biometry at rest was moderate, with ICCs of 0.52 (95%CI, 0.32-0.66) for levator hiatal area, 0.44 (95%CI, 0.21-0.60) for anteroposterior diameter and 0.44 (95%CI, 0.22-0.60) for transverse diameter. Levator hiatal biometry measurements were statistically significantly larger on MRI than on translabial 3D ultrasound. The agreement between translabial 3D ultrasound and MRI for measurement of the levator hiatus at rest in women with pelvic organ prolapse was only moderate. The results of translabial 3D ultrasound and MRI should therefore not be used interchangeably in daily practice or in clinical research. Copyright © 2015 ISUOG. Published by John Wiley & Sons Ltd.
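
    The Bland-Altman approach mentioned in this abstract can be sketched as follows, assuming the usual definition of the 95% limits of agreement as the mean difference plus or minus 1.96 sample standard deviations of the paired differences. The function name is hypothetical and the abstract does not state how the authors configured their plots.

```python
from statistics import fmean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = fmean(diffs)   # systematic difference between the methods
    spread = stdev(diffs)  # sample SD (n - 1) of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread
```

    A bias clearly different from zero would correspond to one method reading systematically larger than the other, and wide limits indicate that the two methods are not interchangeable even when the bias is small.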

  9. High usability of a smartphone application for reporting symptoms in adults with cystic fibrosis.

    PubMed

    Wood, Jamie; Jenkins, Sue; Putrino, David; Mulrennan, Siobhain; Morey, Sue; Cecins, Nola; Hill, Kylie

    2017-01-01

    Introduction In cystic fibrosis, exacerbations impair lung function and health-related quality of life, increase healthcare costs and reduce survival. Delayed reporting of worsening symptoms can result in more severe exacerbations and worse clinical outcomes; therefore there is a need for a novel approach to facilitate the early identification and treatment of exacerbations in this population. This study investigated the usability of a smartphone application to report symptoms in adults with cystic fibrosis, and the observer agreement in clinical decision-making between senior clinicians interpreting smartphone application responses. Methods Adults with cystic fibrosis used the smartphone application weekly for four weeks. The application comprised 10 yes/no questions regarding respiratory symptoms and two regarding emotional well-being. Usability was measured with the System Usability Scale. Observer agreement was tested by providing a cystic fibrosis physician and a nurse practitioner with 45 clinical scenarios. For each scenario the clinicians, who were blinded to each other's responses, were asked to indicate whether or not they would: (i) initiate telephone contact, and/or (ii) request a clinic visit for the individual. Results Ten participants (five female), aged mean (SD) 33 (11) years, FEV1 49 (27)% predicted completed the study. The mean (SD) System Usability Scale score was 94 (6). There was perfect agreement between clinicians for initiating contact with the participant (κ = 1.0, p < 0.001), and near-perfect for requesting a clinic visit (κ = 0.86, p < 0.001). Discussion The use of a smartphone application for reporting symptoms in adults with cystic fibrosis has excellent usability and near-perfect agreement between senior clinicians when interpreting the application responses.
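
    The κ values reported here are chance-corrected agreement statistics. A minimal sketch of Cohen's kappa for two raters is given below; the function name is hypothetical, and any yes/no responses passed to it would be illustrative rather than the study's 45 scenarios, which the abstract does not list.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length label sequences")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical label")
    return (observed - expected) / (1 - expected)
```

    Kappa equals 1 only when the two raters agree on every item, and it falls below the raw percent agreement whenever some agreement would be expected by chance alone.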

  10. Species and temperature predictions in a semi-industrial MILD furnace using a non-adiabatic conditional source-term estimation formulation

    NASA Astrophysics Data System (ADS)

    Labahn, Jeffrey William; Devaud, Cecile

    2017-05-01

    A Reynolds-Averaged Navier-Stokes (RANS) simulation of the semi-industrial International Flame Research Foundation (IFRF) furnace is performed using a non-adiabatic Conditional Source-term Estimation (CSE) formulation. This represents the first time that a CSE formulation, which accounts for the effect of radiation on the conditional reaction rates, has been applied to a large scale semi-industrial furnace. The objective of the current study is to assess the capabilities of CSE to accurately reproduce the velocity field, temperature, species concentration and nitrogen oxides (NOx) emission for the IFRF furnace. The flow field is solved using the standard k-ε turbulence model and detailed chemistry is included. NOx emissions are calculated using two different methods. Predicted velocity profiles are in good agreement with the experimental data. The predicted peak temperature occurs closer to the centreline, as compared to the experimental observations, suggesting that the mixing between the fuel jet and vitiated air jet may be overestimated. Good agreement between the species concentrations, including NOx, and the experimental data is observed near the burner exit. Farther downstream, the centreline oxygen concentration is found to be underpredicted. Predicted NOx concentrations are in good agreement with experimental data when calculated using the method of Peters and Weber. The current study indicates that RANS-CSE can accurately predict the main characteristics seen in a semi-industrial IFRF furnace.

  11. Do patient and proxy agree? Long-term changes in multiple sclerosis physical impact and walking ability on patient-reported outcome scales.

    PubMed

    Sonder, Judith M; Balk, Lisanne J; Bosma, Libertje V A E; Polman, Chris H; Uitdehaag, Bernard M J

    2014-10-01

    Patient-reported outcome scales (PROs) are useful in monitoring changes in multiple sclerosis (MS) over time. Although these scales are reliable and valid measures in longitudinal studies in MS patients, it is unknown what the impact is when obtaining longitudinal data from proxies. The objective of this paper is to compare longitudinal changes in patient and proxy responses on PROs assessing physical impact of MS and walking ability. In a prospective observational study, data on the Multiple Sclerosis Impact Scale (MSIS-29 physical) and Multiple Sclerosis Walking Scale (MSWS-12) were obtained from 137 patient-proxy couples at baseline and at two-year follow-up. Demographic and disease-related variables explaining agreement or disagreement between patients and proxies were investigated using linear regression analyses. Full agreement was found in 56% (MSIS) and 62% (MSWS) of the patient-proxy couples. Complete disagreement was very rare for both scales (2% MSIS, 5% MSWS). When patients were more positive than proxies, a higher age, longer disease duration, longer patient-proxy relationship and increased levels of depression, anxiety and caregiver burden in proxies were observed. In the majority of the patient-proxy couples there was agreement. Proxies can serve as a valuable source of information, but caution remains essential when using scores from proxies. © The Author(s), 2014.

  12. Delayed pneumothorax after stab wound to thorax and upper abdomen: Truth or myth?

    PubMed

    Zehtabchi, Shahriar; Morley, Eric J; Sajed, Dana; Greenberg, Oded; Sinert, Richard

    2009-01-01

    Stab wounds to the thorax and upper abdomen have the potential to cause pneumothorax (PTX). When a chest X-ray (CXR) obtained during initial resuscitation is negative, a second CXR (CXR-2) is commonly performed with the goal of identifying delayed PTX. The objective was to assess the diagnostic yield of the CXR-2 in identifying delayed PTX. This was a prospective observational study of patients (age ≥13 years) with stab wounds to the thorax (chest/back) and upper abdomen with suspected PTX, in a level 1 trauma centre. Patients were included if they had a negative initial CXR followed by a repeat CXR 3-6 h after the initial one. Patients who died, were transferred out of the ED, or received chest tubes before the second CXR were excluded. The outcome of interest was delayed PTX. All CXRs were read by an attending radiologist. To test the inter-observer agreement, another blinded radiologist reviewed 20% of CXRs. Continuous data are presented as mean ± standard deviation and categorical data as percentages with 95% confidence interval (CI). Kappa statistics were used to measure the inter-observer agreement between radiologists. Between January 2003 and December 2006 a total of 185 patients qualified for enrollment (mean age: 28 ± 10 years, age range: 13-65, 94% male). Only 2 patients (1.1%, 95% CI, 0.4-4.1%) had PTX on the CXR-2. Both patients received chest tubes. The inter-observer agreement for radiology reports was high (kappa: 0.79). Occurrence of delayed PTX in patients with stab wounds to the thorax and upper abdomen and negative triage CXR is rare.

  13. Validation of existing diagnosis of autism in mainland China using standardised diagnostic instruments.

    PubMed

    Sun, Xiang; Allison, Carrie; Auyeung, Bonnie; Zhang, Zhixiang; Matthews, Fiona E; Baron-Cohen, Simon; Brayne, Carol

    2015-11-01

    Research to date in mainland China has mainly focused on children with autistic disorder rather than Autism Spectrum Conditions and the diagnosis largely depended on clinical judgment without the use of diagnostic instruments. Whether children who have been diagnosed in China before meet the diagnostic criteria of Autism Spectrum Conditions is not known nor how many such children would meet these criteria. The aim of this study was to evaluate children with a known diagnosis of autism in mainland China using the Autism Diagnostic Observation Schedule and the Autism Diagnostic Interview-Revised to verify that children who were given a diagnosis of autism made by Chinese clinicians in China were mostly children with severe autism. Of 50 children with an existing diagnosis of autism made by Chinese clinicians, 47 children met the diagnosis of autism on the Autism Diagnostic Observation Schedule algorithm and 44 children met the diagnosis of autism on the Autism Diagnostic Interview-Revised algorithm. Using the Gwet's alternative chance-corrected statistic, the agreement between the Chinese diagnosis and the Autism Diagnostic Observation Schedule diagnosis was very good (AC1 = 0.94, p < 0.005, 95% confidence interval (0.86, 1.00)), so was the agreement between the Chinese diagnosis and the Autism Diagnostic Interview-Revised (AC1 = 0.91, p < 0.005, 95% confidence interval (0.81, 1.00)). The agreement between the Autism Diagnostic Observation Schedule and the Autism Diagnostic Interview-Revised was lower but still very good (AC1 = 0.83, p < 0.005). © The Author(s) 2015.
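
    Gwet's AC1, the statistic used in this abstract, can be sketched for two raters as below. The function name is hypothetical. The example input treats the 50 children as a complete table (all 50 labelled autism by the Chinese diagnosis, 47 of them by the Autism Diagnostic Observation Schedule); that is an assumption, since the abstract does not give the full cross-tabulation, so the output is illustrative and need not equal the published values.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 chance-corrected agreement for two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length label sequences")
    categories = set(rater_a) | set(rater_b)
    if len(categories) < 2:
        raise ValueError("AC1 needs at least two categories")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # pi_k: share of category k averaged over the two raters.
    shares = [(count_a[c] + count_b[c]) / (2 * n) for c in categories]
    chance = sum(p * (1 - p) for p in shares) / (len(categories) - 1)
    return (observed - chance) / (1 - chance)


# Illustrative input only (assumed table, see the note above).
clinical = ["autism"] * 50
instrument = ["autism"] * 47 + ["not autism"] * 3
ac1 = gwet_ac1(clinical, instrument)
```

    Unlike Cohen's kappa, this statistic stays high when almost every case falls in one category, which is the situation in a sample made up of children who already carry a diagnosis.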

  14. Validation of existing diagnosis of autism in mainland China using standardised diagnostic instruments

    PubMed Central

    Sun, Xiang; Allison, Carrie; Auyeung, Bonnie; Zhang, Zhixiang; Matthews, Fiona E; Baron-Cohen, Simon; Brayne, Carol

    2016-01-01

    Research to date in mainland China has mainly focused on children with autistic disorder rather than Autism Spectrum Conditions and the diagnosis largely depended on clinical judgment without the use of diagnostic instruments. Whether children who have been diagnosed in China before meet the diagnostic criteria of Autism Spectrum Conditions is not known nor how many such children would meet these criteria. The aim of this study was to evaluate children with a known diagnosis of autism in mainland China using the Autism Diagnostic Observation Schedule and the Autism Diagnostic Interview–Revised to verify that children who were given a diagnosis of autism made by Chinese clinicians in China were mostly children with severe autism. Of 50 children with an existing diagnosis of autism made by Chinese clinicians, 47 children met the diagnosis of autism on the Autism Diagnostic Observation Schedule algorithm and 44 children met the diagnosis of autism on the Autism Diagnostic Interview–Revised algorithm. Using the Gwet’s alternative chance-corrected statistic, the agreement between the Chinese diagnosis and the Autism Diagnostic Observation Schedule diagnosis was very good (AC1 = 0.94, p < 0.005, 95% confidence interval (0.86, 1.00)), so was the agreement between the Chinese diagnosis and the Autism Diagnostic Interview–Revised (AC1 = 0.91, p < 0.005, 95% confidence interval (0.81, 1.00)). The agreement between the Autism Diagnostic Observation Schedule and the Autism Diagnostic Interview–Revised was lower but still very good (AC1 = 0.83, p < 0.005). PMID:25757721

  15. Fluorescence excitation and emission spectroscopy of the $X\,^1A' \rightarrow A\,^1A''$ system of CHI and CDI.

    PubMed

    Tao, Chong; Ebben, Carlena; Reid, Scott A

    2009-11-26

    We report on the first detailed studies of the spectroscopy of an iodocarbene, measuring fluorescence excitation and emission spectra of the $X\,^1A' \rightarrow A\,^1A''$ system of :CHI and the deuterated isotopomer :CDI. Due to similar bending and C-I stretching frequencies in the upper state, fluorescence excitation spectra of :CHI show polyads composed of members of the $2_0^{n-x}3_0^{x}$ progressions with x = 0-3. For :CDI, only progressions with x = 0, 1 are observed. Extrapolation of the $2_0^{n}$ term energies for both isotopomers to a common origin places the electronic origin of the $X\,^1A' \rightarrow A\,^1A''$ system near 10500 cm$^{-1}$, in good agreement with theoretical predictions. Rotational analysis of the 16 observed bands for :CHI and 13 observed bands for :CDI yields rotational constants for the upper and lower states that are also in good agreement with theory. To investigate the controversial issue of the ground state multiplicity of :CHI, we measured single vibronic level emission spectra from many $A\,^1A''$ levels. These spectra show conclusively that the ground state is a singlet, as for both isotopomers the $\tilde{a}\,^3A''$ origin is observed, lying well above the origin of the $X\,^1A'$ state. At energies above the $\tilde{a}\,^3A''$ origin, the spin-orbit mixing is so severe that few vibrational assignments can be made. Analysis of the emission spectra provides a lower limit on the singlet-triplet gap of 4.1 kcal mol$^{-1}$, in excellent agreement with theoretical predictions.

  16. Accuracy and reliability of 3D stereophotogrammetry: A comparison to direct anthropometry and 2D photogrammetry.

    PubMed

    Dindaroğlu, Furkan; Kutlu, Pınar; Duran, Gökhan Serhat; Görgülü, Serkan; Aslan, Erhan

    2016-05-01

    To evaluate the accuracy of three-dimensional (3D) stereophotogrammetry by comparing it with the direct anthropometry and digital photogrammetry methods. The reliability of 3D stereophotogrammetry was also examined. Six profile and four frontal parameters were directly measured on the faces of 80 participants. The same measurements were repeated using two-dimensional (2D) photogrammetry and 3D stereophotogrammetry (3dMDflex System, 3dMD, Atlanta, Ga) to obtain images of the subjects. Another observer made the same measurements for images obtained with 3D stereophotogrammetry, and interobserver reproducibility was evaluated for 3D images. Both observers remeasured the 3D images 1 month later, and intraobserver reproducibility was evaluated. Statistical analysis was conducted using the paired samples t-test, intraclass correlation coefficient, and Bland-Altman limits of agreement. The highest mean difference was 0.30 mm between direct measurement and photogrammetry, 0.21 mm between direct measurement and 3D stereophotogrammetry, and 0.5 mm between photogrammetry and 3D stereophotogrammetry. The lowest agreement value was 0.965 in the Sn-Pro parameter between the photogrammetry and 3D stereophotogrammetry methods. Agreement between the two observers varied from 0.90 (Ch-Ch) to 0.99 (Sn-Me) in linear measurements. For intraobserver agreement, the highest difference between means was 0.33 for observer 1 and 1.42 mm for observer 2. Measurements obtained using 3D stereophotogrammetry indicate that it may be an accurate and reliable imaging method for use in orthodontics.

  17. Dependence of solid-liquid interface free energy on liquid structure

    NASA Astrophysics Data System (ADS)

    Wilson, S. R.; Mendelev, M. I.

    2014-09-01

    The Turnbull relation is widely believed to enable prediction of solid-liquid interface (SLI) free energies from measurements of the latent heat and the solid density. Ewing proposed an additional contribution to the SLI free energy to account for variations in liquid structure near the interface. In the present study, molecular dynamics (MD) simulations were performed to investigate whether SLI free energy depends on liquid structure. Analysis of the MD simulation data for 11 fcc metals demonstrated that the Turnbull relation is only a rough approximation for highly ordered liquids, whereas much better agreement is observed with Ewing's theory. A modification to Ewing's relation is proposed in this study that was found to provide excellent agreement with MD simulation data.

  18. Consensus-Based Attributes for Identifying Patients With Spasmodic Dysphonia and Other Voice Disorders.

    PubMed

    Ludlow, Christy L; Domangue, Rickie; Sharma, Dinesh; Jinnah, H A; Perlmutter, Joel S; Berke, Gerald; Sapienza, Christine; Smith, Marshall E; Blumin, Joel H; Kalata, Carrie E; Blindauer, Karen; Johns, Michael; Hapner, Edie; Harmon, Archie; Paniello, Randal; Adler, Charles H; Crujido, Lisa; Lott, David G; Bansberg, Stephen F; Barone, Nicholas; Drulia, Teresa; Stebbins, Glenn

    2018-06-21

    A roadblock for research on adductor spasmodic dysphonia (ADSD), abductor SD (ABSD), voice tremor (VT), and muscular tension dysphonia (MTD) is the lack of criteria for selecting patients with these disorders. To determine the agreement among experts not using standard guidelines to classify patients with ABSD, ADSD, VT, and MTD, and develop expert consensus attributes for classifying patients for research. From 2011 to 2016, a multicenter observational study examined agreement among blinded experts when classifying patients with ADSD, ABSD, VT or MTD (first study). Subsequently, a 4-stage Delphi method study used reiterative stages of review by an expert panel and 46 community experts to develop consensus on attributes to be used for classifying patients with the 4 disorders (second study). The study used a convenience sample of 178 patients clinically diagnosed with ADSD, ABSD, VT, MTD, vocal fold paresis/paralysis, psychogenic voice disorders, or hypophonia secondary to Parkinson disease. Participants were aged 18 years or older, without laryngeal structural disease or surgery for ADSD and underwent speech and nasolaryngoscopy video recordings following a standard protocol. Specialists at 4 sites classified 178 patients into 11 categories. Four international experts independently classified 75 patients using the same categories without guidelines after viewing speech and nasolaryngoscopy video recordings. Each member from the 4 sites also classified 50 patients from other sites after viewing video clips of voice/laryngeal tasks. Interrater κ less than 0.40 indicated poor classification agreement among rater pairs and across recruiting sites. Consequently, a Delphi panel of 13 experts identified and ranked speech and laryngeal movement attributes for classifying ADSD, ABSD, VT, and MTD, which were reviewed by 46 community specialists. 
Based on the median attribute rankings, a final attribute list was created for each disorder. When classifying patients without guidelines, raters differed in their classification distributions (likelihood ratio, χ2 = 107.66), had poor interrater agreement, and poor agreement with site categories. For 11 categories, the highest agreement was 34%, with no κ values greater than 0.26. In external rater pairs, the highest κ was 0.23 and the highest agreement was 38.5%. Using 6 categories, the highest percent agreement was 73.3% and the highest κ was 0.40. The Delphi method yielded 18 attributes for classifying disorders from speech and nasolaryngoscopic examinations. Specialists without guidelines had poor agreement when classifying patients for research, leading to a Delphi-based development of the Spasmodic Dysphonia Attributes Inventory for classifying patients with ADSD, ABSD, VT, and MTD for research.

  19. Regional Geoid Modeling Compared to Ocean Surface Observations

    NASA Astrophysics Data System (ADS)

    Roman, D. R.; Saleh, J.; Wang, Y. M.

    2007-05-01

    Aerogravity over a limited coastal region of the northern Gulf of Mexico enhanced and rectified the local gravity field signal. In turn, these data improved the derived geoid height model based on comparison with dynamic ocean topography (DOT) and tide gage information at eleven stations. Additionally, lidar observations were analyzed along nearly 50 profiles to estimate the reliability of these models into the offshore region. The overall comparison shows dm-level agreement between the various geoid and DOT models and ocean surface observations. An approximate 30 cm bias must still be explained; however, the results of this study point to the potential for further cooperative studies between oceanographers and geodesists.

  20. Comparison of transcoelomic, contrast transcoelomic, and transesophageal echocardiography in anesthetized red-tailed hawks (Buteo jamaicensis).

    PubMed

    Beaufrère, Hugues; Pariaut, Romain; Rodriguez, Daniel; Nevarez, Javier G; Tully, Thomas N

    2012-10-01

    To assess the agreement and reliability of cardiac measurements obtained with 3 echocardiographic techniques in anesthetized red-tailed hawks (Buteo jamaicensis). 10 red-tailed hawks. Transcoelomic, contrast transcoelomic, and transesophageal echocardiographic evaluations of the hawks were performed, and cineloops of imaging planes were recorded. Three observers performed echocardiographic measurements of cardiac variables 3 times on 3 days. The order in which hawks were assessed and echocardiographic techniques were used was randomized. Results were analyzed with linear mixed modeling, agreement was assessed with intraclass correlation coefficients, and variation was estimated with coefficients of variation. Significant differences were evident among the 3 echocardiographic methods for most measurements, and the agreement among findings was generally low. Interobserver agreement was generally low to medium. Intraobserver agreement was generally medium to high. Overall, better agreement was achieved for the left ventricular measurements and for the transesophageal approach than for other measurements and techniques. Echocardiographic measurements in hawks were not reliable, except when the left ventricle was measured by the same observer. Furthermore, cardiac morphometric measurements may not be clinically important. When measurements are required, one needs to consider that follow-up measurements should be performed by the same echocardiographer and should show at least a 20% difference from initial measurements to be confident that any difference is genuine.

  1. Assessing agreement between malaria slide density readings.

    PubMed

    Alexander, Neal; Schellenberg, David; Ngasala, Billy; Petzold, Max; Drakeley, Chris; Sutherland, Colin

    2010-01-04

    Several criteria have been used to assess agreement between replicate slide readings of malaria parasite density. Such criteria may be based on percent difference, or absolute difference, or a combination. Neither the rationale for choosing between these types of criteria, nor that for choosing the magnitude of difference which defines acceptable agreement, are clear. The current paper seeks a procedure which avoids the disadvantages of these current options and whose parameter values are more clearly justified. Variation of parasite density within a slide is expected, even when it has been prepared from a homogeneous sample. This places lower limits on sensitivity and observer agreement, quantified by the Poisson distribution. This means that, if a criterion of fixed percent difference is used for satisfactory agreement, the number of discrepant readings is over-estimated at low parasite densities. With a criterion of fixed absolute difference, the same happens at high parasite densities. For an ideal slide, following the Poisson distribution, a criterion based on a constant difference in square root counts would apply for all densities. This can be back-transformed to a difference in absolute counts, which, as expected, gives a wider range of acceptable agreement at higher average densities. In an example dataset from Tanzania, observed differences in square root counts correspond to 95% limits of agreement of -2,800 and +2,500 parasites/microl at average density of 2,000 parasites/microl, and -6,200 and +5,700 parasites/microl at 10,000 parasites/microl. However, there were more outliers beyond those ranges at higher densities, meaning that actual coverage of these ranges was not a constant 95%, but decreased with density. In a second study, a trial of microscopist training, the corresponding ranges of agreement are wider and asymmetrical: -8,600 to +5,200/microl, and -19,200 to +11,700/microl, respectively. 
By comparison, the optimal limits of agreement, corresponding to Poisson variation, are +/- 780 and +/- 1,800 parasites/microl, respectively. The focus of this approach on the volume of blood read leads to other conclusions. For example, no matter how large a volume of blood is read, some densities are too low to be reliably detected, which in turn means that disagreements on slide positivity may simply result from within-slide variation, rather than reading errors. The proposed method defines limits of acceptable agreement in a way which allows for the natural increase in variability with parasite density. This includes defining the levels of between-reader variability, which are consistent with random variation: disagreements within these limits should not trigger additional readings. This approach merits investigation in other settings, in order to determine both the extent of its applicability, and appropriate numerical values for limits of agreement.
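
    The square-root argument in this abstract can be sketched numerically. The square root of a Poisson count has a variance close to 1/4 whatever the mean, so the difference between two square-root counts has a roughly constant spread, and that constant limit can be back-transformed to a density-dependent limit. The function name is hypothetical, and the blood volume of 0.025 microl per reading used in the note below is an assumption (the abstract does not state the volume read).

```python
import math


def poisson_limits_of_agreement(density_per_ul, volume_read_ul, z=1.96):
    """Approximate +/- limit (parasites/microl) between two readings of an ideal slide."""
    if density_per_ul <= 0 or volume_read_ul <= 0:
        raise ValueError("density and volume must be positive")
    mean_count = density_per_ul * volume_read_ul
    # Difference of two independent sqrt-counts: variance ~ 1/4 + 1/4 = 1/2.
    sqrt_limit = z * math.sqrt(0.5)
    # Back-transform: (s + d/2)^2 - (s - d/2)^2 = 2*s*d, with s ~ sqrt(mean_count).
    count_limit = 2 * math.sqrt(mean_count) * sqrt_limit
    # Convert the limit on counts back to a limit on density.
    return count_limit / volume_read_ul
```

    With the assumed 0.025 microl per reading, the sketch gives about ±784 parasites/microl at a density of 2,000 and about ±1,753 at 10,000, the same order as the ±780 and ±1,800 quoted in the abstract; the volume is a guess made for illustration, not a value from the paper.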

  2. Assessment of Bowel Wall Enhancement for the Diagnosis of Intestinal Ischemia in Patients with Small Bowel Obstruction: Value of Adding Unenhanced CT to Contrast-enhanced CT.

    PubMed

    Chuong, Anh Minh; Corno, Lucie; Beaussier, Hélène; Boulay-Coletta, Isabelle; Millet, Ingrid; Hodel, Jérôme; Taourel, Patrice; Chatellier, Gilles; Zins, Marc

    2016-07-01

    Purpose To determine whether adding unenhanced computed tomography (CT) to contrast material-enhanced CT improves the diagnostic performance of decreased bowel wall enhancement as a sign of ischemia complicating mechanical small bowel obstruction (SBO). Materials and Methods This retrospective study was approved by the institutional review board, which waived the requirement for informed consent. Two gastrointestinal radiologists independently performed retrospective assessments of 164 unenhanced and contrast-enhanced CT studies from 158 consecutive patients (mean age, 71.2 years) with mechanical SBO. The reference standard was the intraoperative and/or histologic diagnosis (in 80 cases) or results from clinical follow-up in patients who did not undergo surgery (84 cases). Decreased bowel wall enhancement was evaluated first with contrast-enhanced images alone and then, 1 month later, with both unenhanced and contrast-enhanced images. Diagnostic performance of decreased bowel wall enhancement and confidence in the diagnosis were compared between the two readings by using McNemar and Wilcoxon signed rank tests. Interobserver agreement was assessed by using κ statistics and compared with bootstrapping. Results Ischemia was diagnosed in 41 of 164 (25%) episodes of SBO. For both observers, adding unenhanced images improved decreased bowel wall enhancement sensitivity (observer 1: 46.3% [19 of 41] vs 65.8% [27 of 41], P = .02; observer 2: 56.1% [23 of 41] vs 63.4% [26 of 41], P = .45), Youden index (from 0.41 to 0.58 for observer 1 and from 0.42 to 0.61 for observer 2), and confidence score (P < .001 for both). Specificity significantly increased for observer 2 (84.5% [104 of 123] vs 94.3% [116 of 123], P = .002), and interobserver agreement significantly increased, from moderate (κ = 0.48) to excellent (κ = 0.89; P < .0001). 
Conclusion Adding unenhanced CT to contrast-enhanced CT improved the sensitivity, diagnostic confidence, and interobserver agreement of the diagnosis of ischemia, a complication of mechanical SBO, on the basis of decreased bowel wall enhancement. © RSNA, 2016.
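
    The diagnostic performance measures quoted in this abstract follow from simple counts. The sketch below shows the arithmetic for sensitivity, specificity and the Youden index (sensitivity + specificity - 1). The function name is hypothetical; the counts are the ones the abstract lists for observer 2 (41 ischemic and 123 non-ischemic episodes), and the result is whatever those counts imply rather than a re-analysis of the study.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Return (sensitivity, specificity, Youden index) from a 2x2 table."""
    sensitivity = tp / (tp + fn)  # true positives among all diseased cases
    specificity = tn / (tn + fp)  # true negatives among all non-diseased cases
    return sensitivity, specificity, sensitivity + specificity - 1


# Contrast-enhanced images only: 23 of 41 detected, 104 of 123 correctly negative.
before = diagnostic_summary(tp=23, fn=18, tn=104, fp=19)
# Unenhanced plus contrast-enhanced images: 26 of 41 and 116 of 123.
after = diagnostic_summary(tp=26, fn=15, tn=116, fp=7)
```

    Comparing the third element of each tuple shows how a gain in either sensitivity or specificity raises the Youden index.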

  3. THEORETICAL RESEARCH OF THE OPTICAL SPECTRA AND EPR PARAMETERS FOR Cs2NaYCl6:Dy3+ CRYSTAL

    NASA Astrophysics Data System (ADS)

    Dong, Hui-Ning; Dong, Meng-Ran; Li, Jin-Jin; Li, Deng-Feng; Zhang, Yi

    2013-09-01

    The calculated EPR parameters are in reasonable agreement with the observed values. The important material Cs2NaYCl6 doped with rare earth ions has received much attention because of its excellent optical and magnetic properties. Based on the superposition model, in this paper the crystal field energy levels, the electron paramagnetic resonance (EPR) g factors of Dy3+ and the hyperfine structure constants of the 161Dy3+ and 163Dy3+ isotopes in Cs2NaYCl6 crystal are studied by diagonalizing the 42 × 42 energy matrix. In the calculations, the contributions of various admixtures and interactions such as the J-mixing, the mixtures among the states with the same J-value, and the covalence are all considered. The calculated results are in reasonable agreement with the observed values. The results are discussed.

  4. Mathematical Modelling of Drying Kinetics of Wheat in Electron Fired Fluidized Bed Drying System

    NASA Astrophysics Data System (ADS)

    Deomore, Dayanand N.; Yarasu, Ravindra B.

    2018-02-01

    The conventional method of electrical heating is replaced by an electron firing system. The drying kinetics of wheat are studied using an electron-fired fluidized bed dryer. The results are simulated using ANSYS. The resulting graphs were observed to be in agreement with each other. Therefore, the proposed electron firing system can be employed instead of electrical heating. With electrical heating, the drop in relative humidity was 68.75%, with the temperature reaching 70 °C in 67 s at a pressure drop of 13 psi; with the electron firing system, the drop was 67.6%, with the temperature reaching 70 °C in 70 s at a pressure drop of 12.67 psi. As these results are in agreement with each other, it was concluded that for grains such as wheat, which have a low initial moisture content, both systems can be used.

  5. All Together Now: Measuring Staff Cohesion in Special Education Classrooms

    ERIC Educational Resources Information Center

    Kratz, Hilary E.; Locke, Jill; Piotrowski, Zinnia; Ouellette, Rachel R.; Xie, Ming; Stahmer, Aubyn C.; Mandell, David S.

    2015-01-01

    This study sought to validate a new measure, the Classroom Cohesion Survey (CCS), designed to examine the relationship between teachers and classroom assistants in autism support classrooms. Teachers, classroom assistants, and external observers showed good inter-rater agreement on the CCS and good internal consistency for all scales. Simple…

  6. All Together Now: Measuring Staff Cohesion in Special Education Classrooms

    ERIC Educational Resources Information Center

    Kratz, Hilary E.; Locke, Jill; Piotrowski, Zinnia; Ouellette, Rachel R.; Xie, Ming; Stahmer, Aubyn C.; Mandell, David S.

    2014-01-01

    This study sought to validate a new measure, the Classroom Cohesion Survey (CCS), designed to examine the relationship between teachers and classroom assistants in autism support classrooms. Teachers, classroom assistants, and external observers showed good inter-rater agreement on the CCS and good internal consistency for all scales. Simple…

  7. Agreement between self-report and prescription data in medical records for pregnant women.

    PubMed

    Sarangarm, Preeyaporn; Young, Bonnie; Rayburn, William; Jaiswal, Pallavi; Dodd, Melanie; Phelan, Sharon; Bakhireva, Ludmila

    2012-03-01

    BACKGROUND Clinical teratology studies often rely on patient reports of medication use in pregnancy with or without other sources of information. Electronic medical records (EMRs), administrative databases, pharmacy dispensing records, drug registries, and patients' self-reports are all widely used sources of information to assess potential teratogenic effect of medications. The objective of this study was to assess comparability of self-reported and prescription medication data in EMRs for the most common therapeutic classes. METHODS The study population included 404 pregnant women prospectively recruited from five prenatal care clinics affiliated with the University of New Mexico. Self-reported information on prescription medications taken since the last menstrual period (LMP) was obtained by semistructured interviews in either English or Spanish. For validation purposes, EMRs were reviewed to abstract information on medications prescribed between the LMP and the date of the interview. Agreement was estimated by calculating a kappa (κ) coefficient, sensitivity, and specificity. RESULTS In this sample of socially-disadvantaged (i.e., 67.9% high school education or less, 48.5% no health insurance), predominantly Latina (80.4%) pregnant women, antibiotics and antidiabetic agents were the most prevalent therapeutic classes. The agreement between the two sources substantially varied by therapeutic class, with the highest level of agreement seen among antidiabetic and thyroid medications (κ ≥0.8) and the lowest among opioid analgesics (κ = 0.35). CONCLUSIONS Results indicate a high concordance between self-report and prescription data for therapeutic classes used chronically, while poor agreement was observed for medications used intermittently, on an "as needed" basis, or in short courses. Copyright © 2012 Wiley Periodicals, Inc.

  8. Observation of chain stretching in Langmuir diblock copolymer monolayers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Factor, B.J.; Lee, L.; Kent, M.S.

    1993-10-01

    We report observations of chain stretching in diblock copolymer monolayers on the surface of a selective solvent. Using neutron reflectivity, we have studied the concentration profile of the submerged block over a large range of surface density σ (chains per area) for two different molecular weights. The observed increase in the layer thickness is weaker than the σ^(1/3) prediction of mean-field and scaling theories for the limiting behavior, but is in agreement with recent numerical self-consistent-field calculations by Whitmore and Noolandi [Macromolecules 23, 3321 (1990)].

  9. A comparison of NNLO QCD predictions with 7 TeV ATLAS and CMS data for V+jet processes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boughezal, Radja; Liu, Xiaohui; Petriello, Frank

    2016-06-17

    Here, we perform a detailed comparison of next-to-next-to-leading order (NNLO) QCD predictions for the W+jet and Z+jet processes with 7 TeV experimental data from ATLAS and CMS. We observe excellent agreement between theory and data for most studied observables, which span several orders of magnitude in both cross section and energy. For some observables, such as the HT distribution, the NNLO QCD corrections are essential for resolving existing discrepancies between theory and data.

  10. Source amplitudes of NTS explosions inferred from Rayleigh waves at Albuquerque and Tucson. Topical report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bache, T.C.; Rodi, W.L.; Mason, B.F.

    1978-06-01

    Comparing observed and synthetic seismograms, source amplitudes of NTS explosions are inferred from Rayleigh wave recordings from the WWSSN stations at Albuquerque, New Mexico (ALQ) and Tucson, Arizona (TUC). The potential influence of source complexities, particularly surface spallation and related phenomena, is studied in detail. As described in earlier work by Bache, Rodi and Harkrider, the earth model for the synthetic were converted from observations at ALQ and TUC. The agreement of observed and synthetic seismograms is quite good and is sensitive to important features of the source.

  11. Agreement between PRE2DUP register data modeling method and comprehensive drug use interview among older persons

    PubMed Central

    Taipale, Heidi; Tanskanen, Antti; Koponen, Marjaana; Tolppanen, Anna-Maija; Tiihonen, Jari; Hartikainen, Sirpa

    2016-01-01

    Background PRE2DUP is a modeling method that generates drug use periods (ie, when drug use started and ended) from drug purchases recorded in dispensing-based register data. It is based on the evaluation of personal drug purchasing patterns and considers hospital stays, possible stockpiling of drugs, and package information. Objective The objective of this study was to investigate person-level agreement between self-reported drug use in the interview and drug use modeled from dispensing data with the PRE2DUP method for various drug classes used by older persons. Methods Self-reported drug use was assessed from the GeMS Study including a random sample of persons aged ≥75 years from the city of Kuopio, Finland, in 2006. Drug purchases recorded in the Prescription register data of these persons were modeled to determine drug use periods with the PRE2DUP modeling method. Agreement between self-reported drug use on the interview date and drug use calculated from register-based data was evaluated with Cohen’s kappa for the frequently used drugs and drug classes. Kappa values 0.61–0.80 were considered to represent good and 0.81–1.00 very good agreement. Results Among 569 participants with mean age of 82 years, the agreement between interview and register data was very good for 75% and very good or good for 93% of the studied drugs or drug classes. Good or very good agreement was observed for drugs that are typically used on a regular basis, whereas “as needed” drugs represented poorer results. Conclusion The PRE2DUP modeling method validly describes regular drug use among older persons. For most of the drug classes investigated, PRE2DUP-modeled register data described drug use as well as interview-based data, which are more time-consuming to collect. Further studies should be conducted by comparing it with other methods and in different drug user populations. PMID:27785101
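
    As a rough illustration of the statistic named in this abstract, the sketch below computes Cohen's kappa for two sets of yes/no ratings and maps the value onto the bands quoted above (0.61–0.80 good, 0.81–1.00 very good). The function names, the toy ratings, and the "below good" label are assumptions made for illustration; they are not taken from the study, which does not describe its software.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two sources agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each source's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both sources used one identical category throughout; kappa is
        # undefined here, and returning 1.0 is a convention assumed for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Bands quoted in the abstract; 'below good' is a hypothetical catch-all."""
    if kappa >= 0.81:
        return "very good"
    if kappa >= 0.61:
        return "good"
    return "below good"


# Hypothetical toy data (1 = drug in use, 0 = not in use), not study data.
interview = [1, 1, 0, 0]
register = [1, 0, 0, 0]
kappa = cohen_kappa(interview, register)
print(kappa, agreement_label(kappa))
```

    With the toy ratings, three of four items agree (0.75 observed) against 0.5 expected by chance, so kappa is 0.5.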

  12. Misclassification of acute respiratory distress syndrome after traumatic injury: The cost of less rigorous approaches.

    PubMed

    Hendrickson, Carolyn M; Dobbins, Sarah; Redick, Brittney J; Greenberg, Molly D; Calfee, Carolyn S; Cohen, Mitchell Jay

    2015-09-01

    Adherence to rigorous research protocols for identifying adult respiratory distress syndrome (ARDS) after trauma is variable. To examine how misclassification of ARDS may bias observational studies in trauma populations, we evaluated the agreement of two methods for adjudicating ARDS after trauma: the current gold standard, direct review of chest radiographs, and review of dictated radiology reports, a commonly used alternative. This nested cohort study included 123 mechanically ventilated patients between 2005 and 2008, with at least one PaO2/FIO2 less than 300 within the first 8 days of admission. Two blinded physician investigators adjudicated ARDS by two methods. The investigators directly reviewed all chest radiographs to evaluate for bilateral infiltrates. Several months later, blinded to their previous assessments, they adjudicated ARDS using a standardized rubric to classify radiology reports. A κ statistic was calculated. Regression analyses quantified the association between established risk factors as well as important clinical outcomes and ARDS determined by the aforementioned methods as well as hypoxemia as a surrogate marker. The κ was 0.47 for the observed agreement between ARDS adjudicated by direct review of chest radiographs and ARDS adjudicated by review of radiology reports. Both the magnitude and direction of bias on the estimates of association between ARDS and established risk factors as well as clinical outcomes varied by method of adjudication. Classification of ARDS by review of dictated radiology reports had only moderate agreement with the current gold standard, ARDS adjudicated by direct review of chest radiographs. While the misclassification of ARDS had varied effects on the estimates of associations with established risk factors, it tended to weaken the association of ARDS with important clinical outcomes. A standardized approach to ARDS adjudication after trauma by direct review of chest radiographs will minimize misclassification bias in future observational studies. Diagnostic study, level II.

  13. Wave and Beach Processes Modeling for Sabine Pass to Galveston Bay, Texas, Shoreline Erosion Feasibility Study

    DTIC Science & Technology

    2007-08-01

    local wind field, led to the development of an appropriate alternative procedure which produced GENESIS results in agreement with observations...River field site wave data...Table 17...been abandoned since 1989 due to shoreline erosion. From east to west, the inlets in the study area include Sabine Pass, Rollover Pass near the

  14. An application of actuarial methods in psychiatric diagnosis.

    PubMed

    Overall, J E; Higgins, C W

    1977-10-01

    An actuarial program for psychiatric diagnosis is evaluated for agreement with final clinical diagnosis in a series of 288 patients. The actuarial program provides a probability differential diagnosis based on an analysis of history and background data, symptom rating profiles, and MMPI clinical scale profiles. The observed agreement with final clinical diagnosis is approximately 50% higher than previously reported for psychological testing in this same setting. The results emphasize the importance for psychologists of clinical interview and observation skills.

  15. Correlation and agreement of self-assessed and objective skin disease severity in a cross-sectional study of patients with acne, psoriasis, and atopic eczema.

    PubMed

    Magin, Parker J; Pond, C Dimity; Smith, Wayne T; Watson, Alan B; Goode, Susan M

    2011-12-01

    Previous studies have shown variable correlation of patients' self-assessed skin severity measures and clinician-assessed objective measures of severity. But, generally, correlation has not been as good as might be expected for conditions in which the objective physical extent of skin disease is apparent to the sufferer to an extent that is not applicable in many other diseases. This paper reports agreement and correlation of self-assessed and objective severity measures in a study of 108 subjects with acne, psoriasis, or atopic eczema. The study was a cross-sectional study examining psychological associations of these skin diseases. Objective severity was assessed with the Leeds technique (acne), the Psoriasis Area and Severity Index, and Six Area Six Sign Atopic Dermatitis instruments. Agreement is a more appropriate measure than correlation in this situation and was measured with weighted kappa, while correlation was measured with Spearman's rank correlation. There was a modest correlation of ρ = 0.46 and similarly very modest agreement of 0.35 (weighted kappa) of self-assessed and clinician-assessed disease severity. Furthermore, self-assessed (but not clinician-assessed) severity was statistically associated with psychological morbidity in this study, i.e., depression, anxiety, and overall psychological morbidity. Clinicians should consider psychological sequelae of skin disease, not only in those with objectively more severe disease but in patients across the severity spectrum. Both observational and interventional studies of skin disease should include both clinician-assessed and self-assessed measures of severity among assessed variables. © 2011 The International Society of Dermatology.
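
    To make the weighted kappa mentioned in this abstract more concrete, here is a minimal sketch for ordinal severity grades. The abstract does not say which weighting scheme was used, so the linear disagreement weights, the 0-based grade coding, the function name, and the toy grades are all assumptions for illustration only.

```python
def weighted_kappa(ratings_a, ratings_b, n_levels):
    """Linearly weighted kappa for ordinal ratings coded 0 .. n_levels - 1.

    Linear weights are an assumption of this sketch; the study does not
    report its weighting scheme.
    """
    ratings_a, ratings_b = list(ratings_a), list(ratings_b)
    if len(ratings_a) != len(ratings_b) or not ratings_a or n_levels < 2:
        raise ValueError("need equal-length, non-empty ratings and >= 2 levels")
    n = len(ratings_a)

    def weight(i, j):
        # Disagreement weight: 0 for identical grades, 1 for opposite extremes.
        return abs(i - j) / (n_levels - 1)

    # Mean observed disagreement over the rated subjects.
    observed = sum(weight(a, b) for a, b in zip(ratings_a, ratings_b)) / n
    # Disagreement expected by chance from the two marginal distributions.
    marg_a = [ratings_a.count(k) / n for k in range(n_levels)]
    marg_b = [ratings_b.count(k) / n for k in range(n_levels)]
    expected = sum(
        weight(i, j) * marg_a[i] * marg_b[j]
        for i in range(n_levels)
        for j in range(n_levels)
    )
    if expected == 0:
        # Both raters used one identical grade; 1.0 is a convention assumed here.
        return 1.0
    return 1.0 - observed / expected


# Hypothetical toy grades (0 = mild, 1 = moderate, 2 = severe), not study data.
self_assessed = [0, 1, 2, 2]
clinician = [0, 2, 2, 1]
print(weighted_kappa(self_assessed, clinician, n_levels=3))
```

    Unlike unweighted kappa, this version penalises a mild-versus-severe disagreement twice as heavily as a disagreement between adjacent grades.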

  16. Assessment of mastication in healthy children and children with cerebral palsy: a validity and consistency study.

    PubMed

    Remijn, L; Speyer, R; Groen, B E; Holtus, P C M; van Limbeek, J; Nijhuis-van der Sanden, M W G

    2013-05-01

    The aim of this study was to develop the Mastication Observation and Evaluation instrument for observing and assessing the chewing ability of children eating solid and lumpy foods. This study describes the process of item definition and item selection and reports the content validity, reproducibility and consistency of the instrument. In the developmental phase, 15 experienced speech therapists assessed item relevance and descriptions over three Delphi rounds. Potential items were selected based on the results from a literature review. At the initial Delphi round, 17 potential items were included. After three Delphi rounds, 14 items that were regarded as providing distinctive value in assessment of mastication (consensus >75%) were included in the Mastication Observation and Evaluation instrument. To test item reproducibility and consistency, two experts and five students evaluated video recordings of 20 children (10 children with cerebral palsy aged 29-65 months and 10 healthy children aged 11-42 months) eating bread and a biscuit. Reproducibility was estimated by means of the intraclass correlation coefficient (ICC). With the exception of one item concerning chewing duration, all items showed good to excellent intra-observer agreement (ICC students: 0.73-1.0). With the exception of chewing duration and number of swallows, inter-observer agreement was fair to excellent for all items (ICC experts: 0.68-1.0 and ICC students: 0.42-1.0). Results indicate that this tool is a feasible instrument and could be used in clinical practice after further research is completed on the reliability of the tool. © 2013 Blackwell Publishing Ltd.

  17. Replication Validity of Initial Association Studies: A Comparison between Psychiatry, Neurology and Four Somatic Diseases.

    PubMed

    Dumas-Mallet, Estelle; Button, Katherine; Boraud, Thomas; Munafo, Marcus; Gonon, François

    2016-01-01

    There are growing concerns about effect size inflation and replication validity of association studies, but few observational investigations have explored the extent of these problems. Using meta-analyses to measure the reliability of initial studies and explore whether this varies across biomedical domains and study types (cognitive/behavioral, brain imaging, genetic and "others"). We analyzed 663 meta-analyses describing associations between markers or risk factors and 12 pathologies within three biomedical domains (psychiatry, neurology and four somatic diseases). We collected the effect size, sample size, publication year and Impact Factor of initial studies, largest studies (i.e., with the largest sample size) and the corresponding meta-analyses. Initial studies were considered as replicated if they were in nominal agreement with meta-analyses and if their effect size inflation was below 100%. Nominal agreement between initial studies and meta-analyses regarding the presence of a significant effect was not better than chance in psychiatry, whereas it was somewhat better in neurology and somatic diseases. Whereas effect sizes reported by largest studies and meta-analyses were similar, most of those reported by initial studies were inflated. Among the 256 initial studies reporting a significant effect (p<0.05) and paired with significant meta-analyses, 97 effect sizes were inflated by more than 100%. Nominal agreement and effect size inflation varied with the biomedical domain and study type. Indeed, the replication rate of initial studies reporting a significant effect ranged from 6.3% for genetic studies in psychiatry to 86.4% for cognitive/behavioral studies. Comparison between eight subgroups shows that replication rate decreases with sample size and "true" effect size. We observed no evidence of association between replication rate and publication year or Impact Factor. 
    The differences in reliability between biological psychiatry, neurology and somatic diseases suggest that there is room for improvement, at least in some subdomains.

  18. Replication Validity of Initial Association Studies: A Comparison between Psychiatry, Neurology and Four Somatic Diseases

    PubMed Central

    Dumas-Mallet, Estelle; Button, Katherine; Boraud, Thomas; Munafo, Marcus; Gonon, François

    2016-01-01

    Context There are growing concerns about effect size inflation and replication validity of association studies, but few observational investigations have explored the extent of these problems. Objective Using meta-analyses to measure the reliability of initial studies and explore whether this varies across biomedical domains and study types (cognitive/behavioral, brain imaging, genetic and “others”). Methods We analyzed 663 meta-analyses describing associations between markers or risk factors and 12 pathologies within three biomedical domains (psychiatry, neurology and four somatic diseases). We collected the effect size, sample size, publication year and Impact Factor of initial studies, largest studies (i.e., with the largest sample size) and the corresponding meta-analyses. Initial studies were considered as replicated if they were in nominal agreement with meta-analyses and if their effect size inflation was below 100%. Results Nominal agreement between initial studies and meta-analyses regarding the presence of a significant effect was not better than chance in psychiatry, whereas it was somewhat better in neurology and somatic diseases. Whereas effect sizes reported by largest studies and meta-analyses were similar, most of those reported by initial studies were inflated. Among the 256 initial studies reporting a significant effect (p<0.05) and paired with significant meta-analyses, 97 effect sizes were inflated by more than 100%. Nominal agreement and effect size inflation varied with the biomedical domain and study type. Indeed, the replication rate of initial studies reporting a significant effect ranged from 6.3% for genetic studies in psychiatry to 86.4% for cognitive/behavioral studies. Comparison between eight subgroups shows that replication rate decreases with sample size and “true” effect size. We observed no evidence of association between replication rate and publication year or Impact Factor. 
    Conclusion The differences in reliability between biological psychiatry, neurology and somatic diseases suggest that there is room for improvement, at least in some subdomains. PMID:27336301

  19. Differences between the human eye and the spectrophotometer in the shade matching of tooth colour.

    PubMed

    Gómez-Polo, Cristina; Gómez-Polo, Miguel; Celemin-Viñuela, Alicia; Martínez Vázquez De Parga, Juan Antonio

    2014-06-01

    The aim of this work was to assess the agreement between instrumental and visual colour matching. Shade selection with the 3DMaster Toothguide (Vita-Zahnfabrik) was performed for 1361 maxillary central incisors and compared with the shade obtained with the EasyShade Compact (Vita-Zahnfabrik) spectrophotometer. We observed a greater correlation between the objective method and the subjective one in the colour dimension of lightness (Kappa 0.6587), followed by hue (Kappa 0.4337) and finally chroma (Kappa 0.3578). The colour dimension in which the greatest agreement is seen between the operator and the spectrophotometer is value or lightness. This study reveals differences between the measurement of colour via spectrophotometry and the visual shade selection method. According to our results, there is better agreement in the value or lightness colour dimension, which is the most important one in the choice of tooth colour. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. EXPECTING GENDER: AN EVENT RELATED BRAIN POTENTIAL STUDY ON THE ROLE OF GRAMMATICAL GENDER IN COMPREHENDING A LINE DRAWING WITHIN A WRITTEN SENTENCE IN SPANISH

    PubMed Central

    Wicha, Nicole Y. Y.; Moreno, Eva M.; Kutas, Marta

    2012-01-01

    Event-related brain potentials (ERPs) were used to examine the role of grammatical gender in written sentence comprehension. Native Spanish speakers read sentences in which a drawing depicting a target noun was either congruent or incongruent with sentence meaning, and either agreed or disagreed in gender with that of the preceding article. The gender-agreement violation at the drawing was associated with an enhanced negativity between 500 and 700 msec post-stimulus onset. Semantically incongruent drawings elicited a larger N400 than congruent drawings regardless of gender (dis)agreement, indicating little effect of grammatical gender agreement on contextual integration of a picture into a written sentence context. We also observed an enhanced negativity for articles with unexpected relative to expected gender based on prior sentence context indicating that readers generate expectations for specific nouns and their articles. PMID:12870823

  1. Pre- and postoperative radiotherapy for extremity soft tissue sarcoma: Evaluation of inter-observer target volume contouring variability among French sarcoma group radiation oncologists.

    PubMed

    Sargos, P; Charleux, T; Haas, R L; Michot, A; Llacer, C; Moureau-Zabotto, L; Vogin, G; Le Péchoux, C; Verry, C; Ducassou, A; Delannes, M; Mervoyer, A; Wiazzane, N; Thariat, J; Sunyach, M P; Benchalal, M; Laredo, J D; Kind, M; Gillon, P; Kantor, G

    2018-04-01

    The purpose of this study was to evaluate, during a national workshop, the inter-observer variability in target volume delineation for primary extremity soft tissue sarcoma radiation therapy. Six expert sarcoma radiation oncologists (members of French Sarcoma Group) received two extremity soft tissue sarcoma radiation therapy cases: one preoperative and one postoperative. They were distributed with instructions for contouring gross tumour volume or reconstructed gross tumour volume, clinical target volume and to propose a planning target volume. The preoperative radiation therapy case was a patient with a grade 1 extraskeletal myxoid chondrosarcoma of the thigh. The postoperative case was a patient with a grade 3 pleomorphic undifferentiated sarcoma of the thigh. Contour agreement analysis was performed using kappa statistics. For the preoperative case, contouring agreement regarding gross tumour volume, clinical target volume and planning target volume was substantial (kappa between 0.68 and 0.77). In the postoperative case, the agreement was only fair for reconstructed gross tumour volume (kappa: 0.38) but moderate for clinical target volume and planning target volume (kappa: 0.42). During the workshop discussion, consensus was reached on most of the contour divergences, especially clinical target volume longitudinal extension. The determination of a limited cutaneous cover was also discussed. Accurate delineation of target volume appears to be a crucial element to ensure multicenter clinical trial quality assessment, reproducibility and homogeneity in delivering radiation therapy. A quality assessment process should be proposed in this setting. We have shown in our study that preoperative radiation therapy of extremity soft tissue sarcoma has less inter-observer contouring variability. Copyright © 2018 Société française de radiothérapie oncologique (SFRO). Published by Elsevier SAS. All rights reserved.

  2. Reliability of the MDi Psoriasis® Application to Aid Therapeutic Decision-Making in Psoriasis.

    PubMed

    Moreno-Ramírez, D; Herrerías-Esteban, J M; Ojeda-Vila, T; Carrascosa, J M; Carretero, G; de la Cueva, P; Ferrándiz, C; Galán, M; Rivera, R; Rodríguez-Fernández, L; Ruiz-Villaverde, R; Ferrándiz, L

    2017-09-01

    Therapeutic decisions in psoriasis are influenced by disease factors (e.g., severity or location), comorbidity, and demographic and clinical features. We aimed to assess the reliability of a mobile telephone application (MDi-Psoriasis) designed to help the dermatologist make decisions on how to treat patients with moderate to severe psoriasis. We analyzed interobserver agreement between the advice given by an expert panel and the recommendations of the MDi-Psoriasis application in 10 complex cases of moderate to severe psoriasis. The experts were asked their opinion on which treatments were most appropriate, possible, or inappropriate. Data from the same 10 cases were entered into the MDi-Psoriasis application. Agreement was analyzed in 3 ways: paired interobserver concordance (Cohen's κ), multiple interobserver concordance (Fleiss's κ), and percent agreement between recommendations. The mean percent agreement between the total of 1210 observations was 51.3% (95% CI, 48.5-54.1%). Cohen's κ statistic was 0.29 and Fleiss's κ was 0.28. Mean agreement between pairs of human observers only, excluding the MDi-Psoriasis recommendations, was 50.5% (95% CI, 47.6-53.5%). Paired agreement between the recommendations of the MDi-Psoriasis tool and the majority opinion of the expert panel (Cohen's κ) was 0.44 (68.2% agreement). The MDi-Psoriasis tool can generate recommendations that are comparable to those of experts in psoriasis. Copyright © 2017 AEDV. Publicado por Elsevier España, S.L.U. All rights reserved.
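
    For readers unfamiliar with the multi-rater statistic cited here, the following sketch computes Fleiss's kappa from a table of per-case category counts. The function name, the input layout (one row per case, one column per category, each row summing to the number of raters), and the toy table are assumptions for illustration and do not reproduce the study's data or tooling.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for a table counts[case][category] = number of raters.

    Every case must be rated by the same number of raters (at least two).
    """
    if not counts:
        raise ValueError("need at least one rated case")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of raters")
    n_cases = len(counts)
    n_categories = len(counts[0])

    # Per-case agreement: share of rater pairs that chose the same category.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    mean_agreement = sum(per_case) / n_cases

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    chance = sum(p * p for p in shares)
    if chance == 1.0:
        # All ratings fell in one category; 1.0 is a convention assumed here.
        return 1.0
    return (mean_agreement - chance) / (1.0 - chance)


# Hypothetical toy table: 2 cases, 3 raters, 2 categories (not study data).
print(fleiss_kappa([[2, 1], [1, 2]]))
```

    In the toy table the raters split two to one in opposite directions on the two cases, which is less agreement than chance would predict, so kappa comes out negative.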

  3. Agreement between different methods of measuring height in elderly patients.

    PubMed

    Frid, H; Adolfsson, E Thors; Rosenblad, A; Nydahl, M

    2013-10-01

    The present study aimed to examine the agreement between measurements of standing height and self-reported height, height measured with a sliding caliper, and height estimated from either demispan or knee height in elderly patients. Fifty-five patients (mean age 79 years) at a Swedish hospital were included in this observational study. The participants' heights were evaluated as the standing height, self-reported height, height measured in a recumbent position with a sliding caliper, and height estimated from the demispan or knee height. The measurements made with a sliding caliper in the recumbent position agreed most closely with the standing height. Ninety-five percent of the individuals' differences from standing height were within an interval of +1.1 to -4.8 cm (limits of agreement). Self-reported height and height estimated from knee height differed relatively strongly from standing height. The limits of agreement were +5.2 to -9.8 cm and +9.4 to -6.2 cm, respectively. The widest distribution of differences was found in the height estimated from the demispan, with limits of agreements from +11.2 to -9.3 cm. When measuring the height of patients who find it difficult to stand upright, a sliding caliper should be the method of choice, and the second choice should be self-reported height or the height estimated from knee height. Estimating height from the demispan should be the method of last resort. © 2013 The Authors Journal of Human Nutrition and Dietetics © 2013 The British Dietetic Association Ltd.
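
    The limits of agreement reported in this abstract are commonly obtained with the Bland-Altman approach: the mean of the paired differences plus or minus 1.96 standard deviations. The abstract does not state the exact computation used, so the sketch below is an assumption-laden illustration; the function name, the 1.96 multiplier, and the toy heights are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (lower limit, mean difference, upper limit) for paired measurements.

    Assumes roughly normally distributed differences, as in the usual
    Bland-Altman analysis; z = 1.96 covers about 95% of differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias - z * spread, bias, bias + z * spread


# Hypothetical toy heights in cm (alternative method vs. standing height).
alternative = [171.0, 165.5, 158.0, 180.0]
standing = [170.0, 166.0, 157.0, 178.5]
lower, bias, upper = limits_of_agreement(alternative, standing)
print(f"bias {bias:.2f} cm, limits {lower:.2f} to {upper:.2f} cm")
```

    A narrow interval centred near zero indicates that the alternative method can stand in for standing height; a wide or shifted interval, as reported for demispan, indicates that it cannot.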

  4. Very Low Intravenous Contrast Volume Protocol for Computed Tomography Angiography Providing Comprehensive Cardiac and Vascular Assessment Prior to Transcatheter Aortic Valve Replacement in Patients with Chronic Kidney Disease

    PubMed Central

    Pulerwitz, Todd C.; Khalique, Omar K.; Nazif, Tamim N.; Rozenshtein, Anna; Pearson, Gregory D.N.; Hahn, Rebecca T.; Vahl, Torsten P.; Kodali, Susheel K.; George, Isaac; Leon, Martin B.; D'Souza, Belinda; Po, Ming Jack; Einstein, Andrew J.

    2016-01-01

    Background Transcatheter aortic valve replacement (TAVR) is a lifesaving procedure for many patients high risk for surgical aortic valve replacement. The prevalence of chronic kidney disease (CKD) is high in this population, and thus a very low contrast volume (VLCV) computed tomography angiography (CTA) protocol providing comprehensive cardiac and vascular imaging would be valuable. Methods 52 patients with severe, symptomatic aortic valve disease, undergoing pre-TAVR CTA assessment from 2013-4 at Columbia University Medical Center were studied, including all 26 patients with CKD (eGFR<30mL/min) who underwent a novel VLCV protocol (20mL of iohexol at 2.5mL/s), and 26 standard-contrast-volume (SCV) protocol patients. Using a 320-slice volumetric scanner, the protocol included ECG-gated volume scanning of the aortic root followed by medium-pitch helical vascular scanning through the femoral arteries. Two experienced cardiologists performed aortic annulus and root measurements. Vascular image quality was assessed by two radiologists using a 4-point scale. Results VLCV patients had mean(±SD) age 86±6.5, BMI 23.9±3.4 kg/m2 with 54% men; SCV patients age 83±8.8, BMI 28.7±5.3 kg/m2, 65% men. There was excellent intra- and inter-observer agreement for annular and root measurements, and excellent agreement with 3D-transesophageal echocardiographic measurements. Both radiologists found diagnostic-quality vascular imaging in 96% of VLCV and 100% of SCV cases, with excellent inter-observer agreement. Conclusions This study is the first of its kind to report the feasibility and reproducibility of measurements for a VLCV protocol for comprehensive pre-TAVR CTA. There was excellent agreement of cardiac measurements and almost all studies were diagnostic quality for vascular access assessment. PMID:27061253

  5. The validity and reliability of a simple semantic classification of foot posture.

    PubMed

    Cross, Hugh A; Lehman, Linda

    2008-12-01

    The Simple Semantic Classification (SSC) is described as a pragmatic method to assist in the assessment of the weight bearing foot. It was designed for application by therapists and technicians working in underdeveloped situations, after they have had basic orientation in foot function. To present evidence of the validity and inter-observer reliability of the SSC. 13 physiotherapists from LEPRA India projects and 12 physical therapists functioning within the National Programme for the Elimination of Hansen's Disease (PNEH), Brazil, participated in an inter-observer exercise. Inter-observer agreement was gauged using the Kappa statistic. The results of the inter-observer exercise were dependent on observations of foot posture made from photographs. This was necessary to ensure that the procedure was standardised for participants in different countries. The method had limitations which were partly reflected in the results. The level of agreement between the principal investigator and Indian physiotherapists was Kappa = 0.58. The level of agreement between Brazilian physical therapists and the principal investigator was Kappa = 0.70. The authors opine that the results were sufficiently compelling to suggest that the Simple Semantic Classification can be used as a field method to identify people at increased risk of foot pathologies.

  6. A critical appraisal of vertebral fracture assessment in paediatrics.

    PubMed

    Kyriakou, Andreas; Shepherd, Sheila; Mason, Avril; Faisal Ahmed, S

    2015-12-01

    There is a need to improve our understanding of the clinical utility of vertebral fracture assessment (VFA) in paediatrics and this requires a thorough evaluation of its readability, reproducibility, and accuracy for identifying VF. VFA was performed independently by two observers, in 165 children and adolescents with a median age of 13.4 years (range, 3.6, 18). In 20 of these subjects, VFA was compared to lateral vertebral morphometry assessment on lateral spine X-ray (LVM). 1528 (84%) of the vertebrae were adequately visualised by both observers for VFA. Interobserver agreement in vertebral readability was 94% (kappa, 0.73 [95% CI, 0.68, 0.73]). 93% of the non-readable vertebrae were located between T6 and T9. Interobserver agreement per-vertebra for the presence of VF was 99% (kappa, 0.85 [95% CI, 0.79, 0.91]). Interobserver agreement per-subject was 91% (kappa, 0.78 [95% CI, 0.66, 0.87]). Per-vertebra agreement between LVM and VFA was 95% (kappa 0.79 [95% CI, 0.62, 0.92]) and per-subject agreement was 95% (kappa, 0.88 [95% CI, 0.58, 1.0]). Accepting LVM as the gold standard, VFA had a positive predictive value (PPV) of 90% and a negative predictive value (NPV) of 95% in per-vertebra analysis and a PPV of 100% and NPV of 93% in per-subject analysis. VFA reaches an excellent level of agreement between observers and a high level of accuracy in identifying VF in a paediatric population. The readability of vertebrae at the mid thoracic region is suboptimal and interpretation at this level should be exercised with caution. Copyright © 2015 Elsevier Inc. All rights reserved.
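
    The predictive values quoted in this record follow from a 2x2 table of index-test results against the reference standard. The sketch below shows the arithmetic under that assumption; the function name is hypothetical and the counts in the usage note are illustrative, not the study's data.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (index test vs. reference standard).

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives.
    """
    return {
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

    With illustrative counts `diagnostic_summary(tp=8, fp=2, fn=1, tn=9)`, the PPV is 8/10 and the NPV is 9/10. Unlike sensitivity and specificity, both predictive values depend on how common the condition is in the sample.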

  7. Relationship of salivary and plasma cortisol levels in preterm infants: results of a prospective observational study and systematic review of the literature.

    PubMed

    Maas, Christoph; Ringwald, Christine; Weber, Karin; Engel, Corinna; Poets, Christian F; Binder, Gerhard; Bassler, Dirk

    2014-01-01

    (1) To investigate the relationship of salivary and plasma cortisol levels in preterm infants with a focus on the usability of salivary cortisol in diagnostic work-up of infants at risk of adrenal insufficiency. (2) To perform a systematic review addressing this question. Clinical study: We conducted a prospective observational single-center study in preterm infants. We analyzed plasma and saliva cortisol concentrations by enzyme immunoassay. Correlation analysis was used to determine the relation between salivary and plasma cortisol levels and the agreement of the measurement methods was analyzed according to Bland-Altman. Systematic review: A systematic literature search (PubMed and Embase) on the relationship of salivary and plasma cortisol levels in neonates was performed in November 2012. Clinical study: We enrolled 58 preterm infants (median (interquartile range) gestational age at birth was 31.4 (28.1-32.7) weeks, birth weight 1,340 (974-1,745) g, respectively). Correlation analyses revealed a relationship of plasma cortisol and salivary cortisol levels. Rank correlation coefficient was 0.6. Estimating plasma cortisol levels based on measured salivary cortisol levels showed poor agreement of the two methods for determining plasma cortisol levels (direct and via salivary cortisol). Sensitivity and specificity of salivary cortisol for the detection of adrenal insufficiency were 0.66 and 0.62, respectively. Systematic review: Six studies in preterm infants and term neonates depicting the correlation of salivary and plasma cortisol were identified with a range of saliva-plasma correlation coefficients from 0.44 to 0.83. Substitution of plasma cortisol by salivary cortisol determination cannot be recommended in preterm infants because of unsatisfactory agreement between methods.
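
    The Bland-Altman analysis named in this record summarises agreement between two measurement methods by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the paired differences. A minimal sketch, assuming paired measurements on the same subjects; the function name and the sample values in the test are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    # Per-subject difference between the two methods.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

    A correlation coefficient alone does not show agreement: two methods can correlate well and still differ systematically, which is why the bias and the width of the limits are reported separately.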

  8. Investigating Autism-Related Symptoms in Children with Prader-Willi Syndrome: A Case Study

    PubMed Central

    Bennett, Jeffrey A.; Hodgetts, Sandra; Mackenzie, Michelle L.; Haqq, Andrea M.; Zwaigenbaum, Lonnie

    2017-01-01

    Prader-Willi syndrome (PWS), a rare genetic disorder caused by the lack of expression of paternal genes from chromosome 15q11-13, has been investigated for autism spectrum disorder (ASD) symptomatology in various studies. However, previous findings have been variable, and no studies investigating ASD symptomatology in PWS have exclusively studied children. We aimed to characterize social communication functioning and other ASD-related symptoms in children with PWS, and assessed agreement across measures and rates of ASD diagnosis. Measures included the Autism Diagnostic Observation Schedule-2 (ADOS-2), the Social Communication Questionnaire (SCQ), Social Responsiveness Scale-2 (SRS-2), Social Skills Improvement System-Rating Scales (SSIS-RS), and the Vineland Adaptive Behavioral Scales-II (VABS-II). General adaptive and intellectual skills were also assessed. Clinical best estimate (CBE) diagnosis was determined by an experienced developmental pediatrician, based on history and review of all available study measures, and taking into account overall developmental level. Participants included 10 children with PWS, aged 3 to 12 years. Three of the 10 children were male and genetic subtypes were two deletion (DEL) and eight uniparental disomy (UPD) (with a total of 6 female UPD cases). Although 8 of the 10 children exceeded cut-offs on at least one of the ASD assessments, agreement between parent questionnaires (SCQ, SRS-2, SSIS-RS) and observational assessment (ADOS-2) was very poor. None of the children were assigned a CBE diagnosis of ASD, with the caveat that the risk may have been lower because of the predominance of girls in the sample. The lack of agreement between the assessments emphasizes the complexity of interpreting ASD symptom measures in children with PWS. PMID:28264487

  9. Investigating Autism-Related Symptoms in Children with Prader-Willi Syndrome: A Case Study.

    PubMed

    Bennett, Jeffrey A; Hodgetts, Sandra; Mackenzie, Michelle L; Haqq, Andrea M; Zwaigenbaum, Lonnie

    2017-02-28

    Prader-Willi syndrome (PWS), a rare genetic disorder caused by the lack of expression of paternal genes from chromosome 15q11-13, has been investigated for autism spectrum disorder (ASD) symptomatology in various studies. However, previous findings have been variable, and no studies investigating ASD symptomatology in PWS have exclusively studied children. We aimed to characterize social communication functioning and other ASD-related symptoms in children with PWS, and assessed agreement across measures and rates of ASD diagnosis. Measures included the Autism Diagnostic Observation Schedule-2 (ADOS-2), the Social Communication Questionnaire (SCQ), Social Responsiveness Scale-2 (SRS-2), Social Skills Improvement System-Rating Scales (SSIS-RS), and the Vineland Adaptive Behavioral Scales-II (VABS-II). General adaptive and intellectual skills were also assessed. Clinical best estimate (CBE) diagnosis was determined by an experienced developmental pediatrician, based on history and review of all available study measures, and taking into account overall developmental level. Participants included 10 children with PWS, aged 3 to 12 years. Three of the 10 children were male and genetic subtypes were two deletion (DEL) and eight uniparental disomy (UPD) (with a total of 6 female UPD cases). Although 8 of the 10 children exceeded cut-offs on at least one of the ASD assessments, agreement between parent questionnaires (SCQ, SRS-2, SSIS-RS) and observational assessment (ADOS-2) was very poor. None of the children were assigned a CBE diagnosis of ASD, with the caveat that the risk may have been lower because of the predominance of girls in the sample. The lack of agreement between the assessments emphasizes the complexity of interpreting ASD symptom measures in children with PWS.

  10. Trace Gases and Aerosols Simulated Over the Indian Domain: Evaluation of the Model Wrf-Chem

    NASA Astrophysics Data System (ADS)

    Michael, M.; Yadav, A.; Tripathi, S. N.; Venkataraman, C.; Kanawade, V. P.

    2012-12-01

    As the anthropogenic emissions from the Asian countries contribute substantially to the global aerosol loading, the study of the distribution of trace gases and aerosols over this region has received increasing attention in recent years. In the present work, the aerosol properties over the Indian domain during the pre-monsoon season have been addressed. The "online" meteorological and chemical transport Weather Research and Forecasting-Chemistry (WRF-Chem) model has been implemented over the Indian subcontinent for three consecutive summers in 2008, 2009 and 2010. The initial and boundary conditions are obtained from NCAR reanalysis data. The global emission inventories (REanalysis of the TROpospheric chemical composition (RETRO) and Emissions Database for Global Atmospheric Research (EDGAR)) have been used and are projected for the period of study using the method provided in Ohara et al. (2007). The emission rates of sulfur dioxide, black carbon, organic carbon and PM2.5 available in the global inventory are replaced with the high resolution emission inventory developed over India for the present study. The model simulates meteorological parameters, trace gases and particulate matter. Simulated mixing ratios of trace gases (ozone, carbon monoxide, nitrogen oxides, and SO2) are compared with ground based as well as satellite observations over India with specific focus on the Indo-Gangetic Plain. Simulated aerosol optical depths are in good agreement with those observed by the Aerosol Robotic Network (AERONET). The vertical profiles of extinction coefficient have been compared with the Micro Pulse Lidar Network (MPLnet) data. The simulated mass concentration of BC shows very good agreement with that observed at a few ground stations. The vertical profiles of BC have also been compared with aircraft observations carried out during summer of 2008 and 2009, resulting in good agreement. This study shows that the WRF-Chem model captures many important features of the observations and therefore can be used for understanding and forecasting regional weather patterns over the Indian subcontinent. Acknowledgements: The author MM was supported by the DST-Fast Track fellowship. References: Ohara, T., H. Akimoto, J. Kurokawa, N. Horii, K. Yamaji, X. Yan, and T. Hayasaka, An Asian emission inventory of anthropogenic emission sources for the period 1980-2020, Atmos. Chem. Phys., 7, 4419, doi:10.5194/acp-7-4419-2007, 2007.

  11. Inverse modeling of ground surface uplift and pressure with iTOUGH-PEST and TOUGH-FLAC: The case of CO2 injection at In Salah, Algeria

    NASA Astrophysics Data System (ADS)

    Rinaldi, Antonio P.; Rutqvist, Jonny; Finsterle, Stefan; Liu, Hui-Hai

    2017-11-01

    Ground deformation, commonly observed in storage projects, carries useful information about processes occurring in the injection formation. The Krechba gas field at In Salah (Algeria) is one of the best-known sites for studying ground surface deformation during geological carbon storage. At this first industrial-scale on-shore CO2 demonstration project, satellite-based ground-deformation monitoring data of high quality are available and used to study the large-scale hydrological and geomechanical response of the system to injection. In this work, we carry out coupled fluid flow and geomechanical simulations to understand the uplift at three different CO2 injection wells (KB-501, KB-502, KB-503). Previous numerical studies focused on the KB-502 injection well, where a double-lobe uplift pattern has been observed in the ground-deformation data. The observed uplift patterns at KB-501 and KB-503 have single-lobe patterns, but they can also indicate a deep fracture zone mechanical response to the injection. The current study improves the previous modeling approach by introducing an injection reservoir and a fracture zone, both responding to a Mohr-Coulomb failure criterion. In addition, we model a stress-dependent permeability and bulk modulus, according to a dual continuum model. Mechanical and hydraulic properties are determined through inverse modeling by matching the simulated spatial and temporal evolution of uplift to InSAR observations as well as by matching simulated and measured pressures. The numerical simulations are in agreement with both spatial and temporal observations. The estimated values for the parameterized mechanical and hydraulic properties are in good agreement with previous numerical results. In addition, the formal joint inversion of hydrogeological and geomechanical data provides measures of the estimation uncertainty.

  12. Repeatability and reproducibility of corneal thickness using SOCT Copernicus HR.

    PubMed

    Vidal, Silvia; Viqueira, Valentín; Mas, David; Domenech, Begoña

    2013-05-01

    The aim of this study is to determine the reliability of corneal thickness measurements derived from SOCT Copernicus HR (Fourier domain OCT). Thirty healthy eyes of 30 subjects were evaluated. One eye of each patient was chosen randomly. Images were obtained of the central (up to 2.0 mm from the corneal apex) and paracentral (2.0 to 4.0 mm) cornea. We assessed corneal thickness (central and paracentral) and epithelium thickness. The intra-observer repeatability data were analysed using the intra-class correlation coefficient (ICC) for a range of 95 per cent within-subject standard deviation (S(W)) and the within-subject coefficient of variation (C(W)). The level of agreement by Bland-Altman analysis was also represented for the study of the reproducibility between observers and agreement between methods of measurement (automatic versus manual). The mean value of the central corneal thickness (CCT) was 542.4 ± 30.1 μm (SD). There was a high intra-observer agreement, finding the best result in the central sector with an intra-class correlation coefficient of 0.99, 95 per cent CI (0.989 to 0.997) and the worst, in the minimum corneal thickness, with an intra-class correlation coefficient of 0.672, 95 per cent CI (0.417 to 0.829). Reproducibility between observers was very high. The best result was found in the central sector thickness obtained both manually and automatically with an intra-class correlation coefficient of 0.990 in both cases and the worst result in the maximum corneal thickness with an intra-class correlation coefficient of 0.827. The agreement between measurement methods was also very high with intra-class correlation coefficient greater than 0.91. On the other hand the repeatability and reproducibility for epithelial measurements was poor. Pachymetric mapping with SOCT Copernicus HR was found to be highly repeatable and reproducible. 
    We found that the device lacks an appropriate ergonomic design as proper focusing of the laser beam onto the cornea for anterior segment scanning required that patients were positioned slightly farther away from the machine head-rest than in the setup for retinal imaging. © 2013 The Authors. Clinical and Experimental Optometry © 2013 Optometrists Association Australia.

  13. Reliability of joint count assessment in rheumatoid arthritis: a systematic literature review.

    PubMed

    Cheung, Peter P; Gossec, Laure; Mak, Anselm; March, Lyn

    2014-06-01

    Joint counts are central to the assessment of rheumatoid arthritis (RA) but reliability is an issue. To evaluate the reliability and agreement of joint counts (intra-observer and inter-observer) by health care professionals (physicians, nurses, and metrologists) and patients in RA, and the impact of training and standardization on joint count reliability through a systematic literature review. Articles reporting joint count reliability or agreement in RA in PubMed, EMBase, and the Cochrane library between 1960 and 2012 were selected. Data were extracted regarding tender joint counts (TJCs) and swollen joint counts (SJCs) derived by physicians, metrologists, or patients for intra-observer and inter-observer reliability. In addition, methods and effects of training or standardization were extracted. Statistics expressing reliability such as intraclass correlation coefficients (ICCs) were extracted. Data analysis was primarily descriptive due to high heterogeneity. Twenty-eight studies on health care professionals (HCP) and 20 studies on patients were included. Intra-observer reliability for TJCs and SJCs was good for HCPs and patients (range of ICC: 0.49-0.98). Inter-observer reliability between HCPs for TJCs was higher than for SJCs (range of ICC: 0.64-0.88 vs. 0.29-0.98). Patient inter-observer reliability with HCPs as comparators was better for TJCs (range of ICC: 0.31-0.91) compared to SJCs (0.16-0.64). Nine studies (7 with HCPs and 2 with patients) evaluated consensus or training, with improvement in reliability of TJCs but conflicting evidence for SJCs. Intra- and inter-observer reliability was high for TJCs for HCPs and patients: among all groups, reliability was better for TJCs than SJCs. Inter-observer reliability of SJCs was poorer for patients than HCPs. Data were inconclusive regarding the potential for training to improve SJC reliability. Overall, the results support further evaluation for patient-reported joint counts as an outcome measure. 
    © 2013 Published by Elsevier Inc.

  14. Contouring Variability of the Penile Bulb on CT Images: Quantitative Assessment Using a Generalized Concordance Index

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carillo, Viviana; Cozzarini, Cesare; Perna, Lucia

    2012-11-01

    Purpose: Within a multicenter study (DUE-01) focused on the search for predictors of erectile dysfunction and urinary toxicity after radiotherapy for prostate cancer, a dummy run exercise on penile bulb (PB) contouring on computed tomography (CT) images was carried out. The aim of this study was to quantitatively assess interobserver contouring variability by the application of the generalized DICE index. Methods and Materials: Fifteen physicians from different Institutes drew the PB on CT images of 10 patients. The spread of DICE values was used to objectively select those observers who significantly disagreed with the others. The analyses were performed with a dedicated module in the VODCA software package. Results: DICE values were found to significantly change among observers and patients. The mean DICE value was 0.67, ranging between 0.43 and 0.80. The statistics of DICE coefficients identified 4 of 15 observers who systematically showed a value below the average (p value range, 0.013 - 0.059): Mean DICE values were 0.62 for the 4 'bad' observers compared to 0.69 of the 11 'good' observers. For all bad observers, the main cause of the disagreement was identified. Average DICE values were significantly worse than the average in 2 of 10 patients (0.60 vs. 0.70, p < 0.05) because of the limited visibility of the PB. Excluding the bad observers and the 'bad' patients, the mean DICE value increased from 0.67 to 0.70; interobserver variability, expressed in terms of standard deviation of DICE spread, was also reduced. Conclusions: The obtained values of DICE around 0.7 show an acceptable agreement, considering the small dimension of the PB. Additional strategies to improve this agreement are under consideration and include an additional tutorial of the so-called bad observers with a recontouring procedure, or the recontouring by a single observer of the PB for all patients included in the DUE-01 study.
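
    The DICE index in this record measures the overlap between contours. The sketch below shows the standard pairwise Dice coefficient on sets of voxel indices, plus the mean over all observer pairs as a simple summary. This is an illustration only: it is not the generalized concordance index of the study, nor the VODCA implementation, and the function names and voxel sets are hypothetical.

```python
from itertools import combinations


def dice(contour_a, contour_b):
    """Pairwise Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        # Two empty contours: treated as perfect overlap in this sketch.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_pairwise_dice(contours):
    """Average Dice over every pair of observers' contours."""
    pairs = list(combinations(contours, 2))
    if not pairs:
        raise ValueError("need at least two contours")
    return sum(dice(a, b) for a, b in pairs) / len(pairs)
```

    For two hypothetical voxel sets `{1, 2, 3}` and `{2, 3, 4}`, the overlap is 2 voxels out of 3 + 3, giving a Dice value of 2/3. For small structures a shift of a few voxels changes the value considerably, which is consistent with the abstract's remark about the small dimension of the PB.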

  15. Planetary Boundary Layer Simulation Using TASS

    NASA Technical Reports Server (NTRS)

    Schowalter, David G.; DeCroix, David S.; Lin, Yuh-Lang; Arya, S. Pal; Kaplan, Michael

    1996-01-01

    Boundary conditions to an existing large-eddy simulation model have been changed in order to simulate turbulence in the atmospheric boundary layer. Several options are now available, including the use of a surface energy balance. In addition, we compare convective boundary layer simulations with the Wangara and Minnesota field experiments as well as with other model results. We find excellent agreement of modelled mean profiles of wind and temperature with observations and good agreement for velocity variances. Neutral boundary simulation results are compared with theory and with previously used models. Agreement with theory is reasonable, while agreement with previous models is excellent.

  16. Gamma ray sources observation with the ARGO-YBJ detector

    NASA Astrophysics Data System (ADS)

    Vernetto, S.; ARGO-YBJ Collaboration

    2011-02-01

    In this paper we report on the observations of TeV gamma ray sources performed by the air shower detector ARGO-YBJ. The objects studied in this work are the blazar Markarian 421 and the extended galactic source MGROJ1908+06, monitored during ~2 years of operation. Mrk421 has been detected by ARGO-YBJ with a statistical significance of ~11 standard deviations. The observed TeV emission was highly variable, showing large enhancements of the flux during active periods. The study of the spectral behaviour during flares revealed a positive correlation of the hardness with the flux, as already reported in the past by the Whipple telescope, suggesting that this is a long term property of the source. ARGO-YBJ observed a strong correlation between TeV gamma rays and the X-ray flux measured by RXTE/ASM and SWIFT/BAT during the whole period, with a time lag compatible with zero, supporting the one-zone SSC model to describe the emission mechanism. MGROJ1908+06 has been detected by ARGO-YBJ with a significance of ~5 standard deviations. From our data the source appears extended and the measured extension is σext = 0.48° (+0.26, −0.28), in agreement with a previous HESS observation. The average flux is in marginal agreement with that reported by MILAGRO, but significantly higher than that obtained by HESS, suggesting a possible flux variability.

  17. Systematic review of methods for quantifying teamwork in the operating theatre

    PubMed Central

    Marshall, D.; Sykes, M.; McCulloch, P.; Shalhoub, J.; Maruthappu, M.

    2018-01-01

    Background Teamwork in the operating theatre is becoming increasingly recognized as a major factor in clinical outcomes. Many tools have been developed to measure teamwork. Most fall into two categories: self‐assessment by theatre staff and assessment by observers. A critical and comparative analysis of the validity and reliability of these tools is lacking. Methods MEDLINE and Embase databases were searched following PRISMA guidelines. Content validity was assessed using measurements of inter‐rater agreement, predictive validity and multisite reliability, and interobserver reliability using statistical measures of inter‐rater agreement and reliability. Quantitative meta‐analysis was deemed unsuitable. Results Forty‐eight articles were selected for final inclusion; self‐assessment tools were used in 18 and observational tools in 28, and there were two qualitative studies. Self‐assessment of teamwork by profession varied with the profession of the assessor. The most robust self‐assessment tool was the Safety Attitudes Questionnaire (SAQ), although this failed to demonstrate multisite reliability. The most robust observational tool was the Non‐Technical Skills (NOTECHS) system, which demonstrated both test–retest reliability (P > 0·09) and interobserver reliability (Rwg = 0·96). Conclusion Self‐assessment of teamwork by the theatre team was influenced by professional differences. Observational tools, when used by trained observers, circumvented this.

  18. Interrelationship between Autism Diagnostic Observation Schedule-Generic (ADOS-G), Autism Diagnostic Interview-Revised (ADI-R), and the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR) Classification in Children and Adolescents with Mental Retardation

    ERIC Educational Resources Information Center

    de Bildt, Annelies; Sytema, Sjoerd; Ketelaars, Cees; Kraijer, Dirk; Mulder, Erik; Volkmar, Fred; Minderaa, Ruud

    2004-01-01

    The interrelationship between the Autism Diagnostic Interview-Revised (ADI-R), Autism Diagnostic Observation Schedule-Generic (ADOS-G) and clinical classification was studied in 184 children and adolescents with Mental Retardation (MR). The agreement between the ADI-R and ADOS-G was fair, with a substantial difference between younger and older…

  19. A case study of the aurora, high-latitude ionosphere, and particle precipitation during near-steady state conditions

    NASA Technical Reports Server (NTRS)

    Winningham, J. D.; Anger, C. D.; Shepherd, G. G.; Weber, E. J.; Wagner, R. A.

    1978-01-01

    An Isis 2 pass studied in related experiments was singled out for a detailed examination of the particle fluxes, optical emissions, and ionospheric parameters observed during a quiescent period (late recovery) between two substorms. Since both long-duration measurements (aircraft) and transient snapshot (spacecraft) data are available, space and time effects can, on a macroscopic level, be separated. The latitudinal morphology observed by the satellite is found to be basically spatial in nature. It is suggested that the observed particle fluxes can be explained in terms of precipitation from the quiet time plasma sheet without intervening acceleration. The agreement of the observed optical emissions and ionospheric parameters with the electron fluxes is discussed.

  20. The ICI classification for calcaneal injuries: a validation study.

    PubMed

    Frima, Herman; Eshuis, Rienk; Mulder, Paul; Leenen, Luke

    2012-06-01

    The integral classification of injuries (ICI), by Zwipp et al. has been developed as a classification system for injuries of the bones, joints, cartilage and ligaments of the foot. It follows the principles of the comprehensive classification of fractures by Müller et al. The ICI was developed for 'everyday use' and scientific purposes. Our aim was to perform a validation study for this classification system applied to the calcaneal injuries. A panel of five experienced trauma and orthopaedic surgeons evaluated the ICI score in 20 calcaneal injuries. After 2 months, a second classification was performed in a different order. Inter- and intra-observer variability were evaluated by kappa statistics. Panel members were not able to evaluate capsule and ligamental injuries based on X-ray and computed tomography (CT) films. Two injuries were excluded for logistical reasons. The inter-observer agreement based on 18 injuries of bone and joints was slight; kappa 0.14 (90% confidence interval (CI): 0.05-0.22). The intra-observer agreement was fair; kappa 0.31 (90% CI: 0.22-0.41). Overall, the panel rated the system as very complicated and not practical. The ICI is a complicated classification system with slight to fair inter- and intra-observer variabilities. It might not be a practical classification system for calcaneal injuries in 'everyday use' or scientific purposes. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. Effects of subject-case marking on agreement processing: ERP evidence from Basque.

    PubMed

    Chow, Wing-Yee; Nevins, Andrew; Carreiras, Manuel

    2018-02-01

    Previous cross-linguistic research has found that comprehenders are immediately sensitive to various kinds of agreement violations across languages. We focused on Basque, a verb-final ergative language with both subject-verb (SV) and object-verb (OV) agreement. We compared the effects of SV agreement violations on comprehenders' event-related brain potentials (ERPs) in transitive sentences (where OV agreement is present, and the subject is ergative) and intransitive sentences (where OV agreement is absent, and the subject is absolutive). We observed a P600 effect in both cases, but only violations with intransitive subjects elicited an early posterior negativity. Such a qualitative difference suggests that distinct neurocognitive mechanisms are involved in processing agreement with transitive subjects (which are marked with ergative case) versus intransitive subjects (which bear absolutive case). Building on theoretical proposals that in languages such as Basque, true agreement occurs with absolutive subjects but not with ergative subjects, we submit that the early posterior negativity may be an electrophysiological signature for true agreement. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Accuracy of a technology-assisted eye exam in evaluation of referable diabetic retinopathy and concomitant ocular diseases.

    PubMed

    Conlin, Paul R; Asefzadeh, Baharak; Pasquale, Louis R; Selvin, Gerald; Lamkin, Rebecca; Cavallerano, Anthony A

    2015-12-01

    Digital retinal imaging using store-and-forward technology is used to screen for diabetic retinopathy (DR). Its usefulness in detecting non-diabetic eye diseases is uncertain. We determined the level of agreement between teleretinal imaging supplemented with visual acuity and intraocular pressure (IOP) measurements (ie, technology-assisted eye (TAE) exam) and a comprehensive eye exam in evaluation for DR and non-diabetic ocular conditions. We conducted a prospective, observational study with two parallel evaluations. Patients with diabetes (n=317) had a TAE exam and a comprehensive eye exam on the same day. A subset of participants with normal baseline exams (n=72) had follow-up exams 1 year later. We measured the level of agreement for referable ocular findings. Agreement for referable ocular findings was moderate (n=389, agreement: 77%; κ: 0.55), due in part to ungradable exams (22%). However, about half of the ungradable exams had findings that warranted referral. There was substantial agreement for follow-up exams (n=72, agreement: 93%; κ: 0.63). Among all gradable exams (n=303), the TAE exam had 86% sensitivity and 84% specificity for referable ocular findings, with high agreement (≥94%) for DR and other major ocular diagnoses. There was moderate-to-substantial agreement between a TAE exam and a comprehensive eye exam for referable ocular findings in patients with diabetes. Ungradable exams were a frequent marker of ocular pathology. Teleretinal imaging may be a useful evaluation for both diabetic and non-diabetic ocular conditions. Published by the BMJ Publishing Group Limited.

  3. Agreement of angle closure assessments between gonioscopy, anterior segment optical coherence tomography and spectral domain optical coherence tomography.

    PubMed

    Tay, Elton Lik Tong; Yong, Vernon Khet Yau; Lim, Boon Ang; Sia, Stelson; Wong, Elizabeth Poh Ying; Yip, Leonard Wei Leon

    2015-01-01

    To determine angle closure agreements between gonioscopy and anterior segment optical coherence tomography (AS-OCT), as well as gonioscopy and spectral domain OCT (SD-OCT). A secondary objective was to quantify inter-observer agreements of AS-OCT and SD-OCT assessments. Seventeen consecutive subjects (33 eyes) were recruited from the study hospital's Glaucoma clinic. Gonioscopy was performed by a glaucomatologist masked to OCT results. OCT images were read independently by 2 other glaucomatologists masked to gonioscopy findings as well as each other's analyses of OCT images. In total, 84.8% and 45.5% of scleral spurs were visualized in AS-OCT and SD-OCT images respectively (P<0.01). The agreement for angle closure between AS-OCT and gonioscopy was fair at k=0.31 (95% confidence interval, CI: 0.03-0.59) and k=0.35 (95% CI: 0.07-0.63) for reader 1 and 2 respectively. The agreement for angle closure between SD-OCT and gonioscopy was fair at k=0.21 (95% CI: 0.07-0.49) and slight at k=0.17 (95% CI: 0.08-0.42) for reader 1 and 2 respectively. The inter-reader agreement for angle closure in AS-OCT images was moderate at 0.51 (95% CI: 0.13-0.88). The inter-reader agreement for angle closure in SD-OCT images was slight at 0.18 (95% CI: 0.08-0.45). A significant proportion of scleral spurs were not visualised with SD-OCT imaging, resulting in weaker inter-reader agreements. Identifying other angle landmarks in SD-OCT images will allow more consistent angle closure assessments. Gonioscopy and OCT imaging do not always agree in angle closure assessments but have their own advantages, and should be used together and not exclusively.
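
    The verbal labels attached to kappa in this record (slight, fair, moderate) match the widely used Landis and Koch bands. The abstract does not name the scale, so that attribution is an assumption. A minimal sketch of the mapping, with a hypothetical function name:

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) descriptive band."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
```

    Applied to the values reported above, 0.31, 0.35 and 0.21 fall in the "fair" band, 0.17 and 0.18 in "slight", and 0.51 in "moderate". Other records in this listing use different wording (for example "good" and "very good"), so labels are comparable across studies only when the same scale is used.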

  4. Prospective assessment of interobserver agreement for defecography in fecal incontinence.

    PubMed

    Dobben, Annette C; Wiersma, Tjeerd G; Janssen, Lucas W M; de Vos, Rien; Terra, Maaike P; Baeten, Cor G; Stoker, Jaap

    2005-11-01

    The primary aim of our study was to determine the interobserver agreement of defecography in diagnosing enterocele, anterior rectocele, intussusception, and anismus in fecal-incontinent patients. The subsidiary aim was to evaluate the influence of level of experience on interpreting defecography. Defecography was performed in 105 consecutive fecal-incontinent patients. Observers were classified by level of experience and their findings were compared with the findings of an expert radiologist. The quality of the expert radiologist's findings was evaluated by an intraobserver agreement procedure. Intraobserver agreement was good to very good except for anismus: incomplete evacuation after 30 sec (kappa, 0.55) and puborectalis impression (kappa, 0.54). Interobserver agreement for enterocele and rectocele was good (kappa, 0.66 for both) and for intussusception, fair (kappa, 0.29). Interobserver agreement for anismus: incomplete evacuation after 30 sec was moderate (kappa, 0.47), and for anismus: puborectalis impression was fair (kappa, 0.24). Agreement in grading of enterocele and rectocele was good (kappa, 0.64 and 0.72, respectively) and for intussusception, fair (kappa, 0.39). Agreement separated by experience level was very good for rectocele (kappa, 0.83) and grading of rectoceles (kappa, 0.83) and moderate for intussusception (kappa, 0.44) at the most experienced level. For enterocele and grading, experience level did not influence the reproducibility. Reproducibility for enterocele, anterior rectocele, and severity grading is good, but for intussusception is fair to moderate. For anismus, the diagnosis of incomplete evacuation after 30 sec is more reproducible than puborectalis impression. The level of experience seems to play a role in diagnosing anterior rectocele and its grading and in diagnosing intussusception.

  5. The Neurologic Assessment in Neuro-Oncology (NANO) scale: a tool to assess neurologic function for integration into the Response Assessment in Neuro-Oncology (RANO) criteria.

    PubMed

    Nayak, Lakshmi; DeAngelis, Lisa M; Brandes, Alba A; Peereboom, David M; Galanis, Evanthia; Lin, Nancy U; Soffietti, Riccardo; Macdonald, David R; Chamberlain, Marc; Perry, James; Jaeckle, Kurt; Mehta, Minesh; Stupp, Roger; Muzikansky, Alona; Pentsova, Elena; Cloughesy, Timothy; Iwamoto, Fabio M; Tonn, Joerg-Christian; Vogelbaum, Michael A; Wen, Patrick Y; van den Bent, Martin J; Reardon, David A

    2017-05-01

    The Macdonald criteria and the Response Assessment in Neuro-Oncology (RANO) criteria define radiologic parameters to classify therapeutic outcome among patients with malignant glioma and specify that clinical status must be incorporated and prioritized for overall assessment. But neither provides specific parameters to do so. We hypothesized that a standardized metric to measure neurologic function will permit more effective overall response assessment in neuro-oncology. An international group of physicians including neurologists, medical oncologists, radiation oncologists, and neurosurgeons with expertise in neuro-oncology drafted the Neurologic Assessment in Neuro-Oncology (NANO) scale as an objective and quantifiable metric of neurologic function evaluable during a routine office examination. The scale was subsequently tested in a multicenter study to determine its overall reliability, inter-observer variability, and feasibility. The NANO scale is a quantifiable evaluation of 9 relevant neurologic domains based on direct observation and testing conducted during routine office visits. The score defines overall response criteria. A prospective, multinational study noted a >90% inter-observer agreement rate with kappa statistic ranging from 0.35 to 0.83 (fair to almost perfect agreement), and a median assessment time of 4 minutes (interquartile range, 3-5). The NANO scale provides an objective clinician-reported outcome of neurologic function with high inter-observer agreement. It is designed to combine with radiographic assessment to provide an overall assessment of outcome for neuro-oncology patients in clinical trials and in daily practice. Furthermore, it complements existing patient-reported outcomes and cognition testing to combine for a global clinical outcome assessment of well-being among brain tumor patients. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved. 

  6. Reliability of classification for post-traumatic ankle osteoarthritis.

    PubMed

    Claessen, Femke M A P; Meijer, Diederik T; van den Bekerom, Michel P J; Gevers Deynoot, Barend D J; Mallee, Wouter H; Doornberg, Job N; van Dijk, C Niek

    2016-04-01

    The purpose of this study was to identify the most reliable classification system for clinical outcome studies to categorize post-traumatic-fracture-osteoarthritis. A total of 118 orthopaedic surgeons and residents, gathered in the Ankle Platform Study Collaborative Science of Variation Group, evaluated 128 anteroposterior and lateral radiographs of patients after a bi- or trimalleolar ankle fracture on a Web-based platform in order to rate post-traumatic osteoarthritis according to the classification systems coined by (1) van Dijk, (2) Kellgren, and (3) Takakura. Reliability was evaluated using Siegel and Castellan's multirater kappa measure. Differences between classification systems were compared using the two-sample Z-test. Interobserver agreement of surgeons who participated in the survey was fair for the van Dijk osteoarthritis scale (k = 0.24), and poor for the Takakura (k = 0.19) and the Kellgren systems (k = 0.18) according to the categorical rating of Landis and Koch. This difference in one categorical rating was found to be significant (p < 0.001, CI 0.046-0.053) with the high numbers of observers and cases available. This study documents fair interobserver agreement for the van Dijk osteoarthritis scale, and poor interobserver agreement for the Takakura and Kellgren osteoarthritis classification systems. Because of the low interobserver agreement for the van Dijk, Kellgren, and Takakura classification systems, those systems cannot be used for clinical decision-making. Development of diagnostic criteria on the basis of consecutive patients, Level II.

  7. Interrater agreement in the interpretation of neonatal electroencephalography in hypoxic-ischemic encephalopathy.

    PubMed

    Wusthoff, Courtney J; Sullivan, Joseph; Glass, Hannah C; Shellhaas, Renée A; Abend, Nicholas S; Chang, Taeun; Tsuchida, Tammy N

    2017-03-01

    Research using neonatal electroencephalography (EEG) has been limited by a lack of a standardized classification system and interpretation terminology. In 2013, the American Clinical Neurophysiology Society (ACNS) published a guideline for standardized terminology and categorization in the description of continuous EEG in neonates. We sought to assess interrater agreement for this neonatal EEG categorization system as applied by a group of pediatric neurophysiologists. A total of 60 neonatal EEG studies were collected from three institutions. All EEG segments were from term neonates with hypoxic-ischemic encephalopathy. Three pediatric neurophysiologists independently reviewed each record using the ACNS standardized scoring system. Unweighted kappa values were calculated for interrater agreement of categorical data across multiple observers. Interrater agreement was very good for identification of seizures (κ = 0.93, p < 0.001), with perfect agreement in 95% of records (57 of 60). Interrater agreement was moderate for classifying records as normal or having any abnormality (κ = 0.49, p < 0.001), with perfect agreement in 78% of records (47 of 60). Interrater agreement was good in classifying EEG backgrounds on a 5-category scale (normal, excessively discontinuous, burst suppression, status epilepticus, or electrocerebral inactivity) (κ = 0.70, p < 0.001), with perfect agreement in 72% of records (43 of 60). Other specific background features had lower agreement, including voltage (κ = 0.41, p < 0.001), variability (κ = 0.35, p < 0.001), symmetry (κ = 0.18, p = 0.01), presence of abnormal sharp waves (κ < 0.20, p < 0.05), and presence of brief rhythmic discharges (κ < 0.20, p < 0.05). We found good or very good interrater agreement applying the ACNS system for identification of seizures and classification of EEG background. Other specific EEG features showed limited interrater agreement. 
Of importance to both clinicians and researchers, our findings support using the ACNS system in identifying seizures and classifying backgrounds of neonatal EEG recordings, but also suggest limited reproducibility for certain other EEG features. Wiley Periodicals, Inc. © 2017 International League Against Epilepsy.

  8. Air Quality Forecasting through Different Statistical and Artificial Intelligence Techniques

    NASA Astrophysics Data System (ADS)

    Mishra, D.; Goyal, P.

    2014-12-01

    Urban air pollution forecasting has emerged as an acute problem in recent years because of severe environmental degradation caused by the increase in harmful air pollutants in the ambient atmosphere. In this study, different statistical and artificial intelligence techniques are used for forecasting and analysis of air pollution over the Delhi urban area. These techniques are principal component analysis (PCA), multiple linear regression (MLR) and artificial neural network (ANN), and their forecasts are in good agreement with the concentrations observed by the Central Pollution Control Board (CPCB) at different locations in Delhi. However, such methods provide limited accuracy because they are unable to predict extreme points, i.e., the pollution maximum and minimum cut-offs cannot be determined using this approach. Such methods are also an inefficient approach for better output forecasting. With the advancement in technology and research, an alternative to the above traditional methods has been proposed: coupling statistical techniques with artificial intelligence (AI) for forecasting purposes. The coupling of PCA, ANN and fuzzy logic is used for forecasting air pollutants over the Delhi urban area. The statistical measures, e.g., correlation coefficient (R), normalized mean square error (NMSE), fractional bias (FB) and index of agreement (IOA), show that the proposed model is in better agreement than all other models. Hence, the coupling of statistical and artificial intelligence techniques can be used for forecasting air pollutants over urban areas.
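
    The abstract names four evaluation measures (R, NMSE, FB, IOA) without giving formulas. The sketch below uses definitions that are common in air quality model evaluation; whether the authors used exactly these forms is an assumption. In particular, the sign convention for fractional bias varies between papers, and the index of agreement here is Willmott's original form. The function name is hypothetical.

```python
from math import sqrt


def forecast_metrics(observed, predicted):
    """Return R, NMSE, FB and IOA for paired observed/predicted values."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n

    # Pearson correlation coefficient R.
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(observed, predicted))
    o_ss = sum((o - o_mean) ** 2 for o in observed)
    p_ss = sum((p - p_mean) ** 2 for p in predicted)
    r = cov / sqrt(o_ss * p_ss)

    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))

    # Normalized mean square error: MSE scaled by the product of the means.
    nmse = (sq_err / n) / (o_mean * p_mean)

    # Fractional bias (assumed convention: observed minus predicted).
    fb = 2 * (o_mean - p_mean) / (o_mean + p_mean)

    # Willmott's index of agreement (1 means perfect agreement).
    potential = sum(
        (abs(p - o_mean) + abs(o - o_mean)) ** 2
        for o, p in zip(observed, predicted)
    )
    ioa = 1 - sq_err / potential

    return {"R": r, "NMSE": nmse, "FB": fb, "IOA": ioa}


# Made-up concentrations: a perfect forecast and one that doubles every value.
perfect = forecast_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
doubled = forecast_metrics([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

    The doubled forecast keeps R at 1 but gives a nonzero NMSE and FB, which is why correlation alone is usually reported alongside bias and error measures.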

  9. A novel magnetic resonance imaging segmentation technique for determining diffuse intrinsic pontine glioma tumor volume

    PubMed Central

    Singh, Ranjodh; Zhou, Zhiping; Tisnado, Jamie; Haque, Sofia; Peck, Kyung K.; Young, Robert J.; Tsiouris, Apostolos John; Thakur, Sunitha B.; Souweidane, Mark M.

    2017-01-01

    OBJECTIVE Accurately determining diffuse intrinsic pontine glioma (DIPG) tumor volume is clinically important. The aims of the current study were to 1) measure DIPG volumes using methods that require different degrees of subjective judgment; and 2) evaluate interobserver agreement of measurements made using these methods. METHODS Eight patients from a Phase I clinical trial testing convection-enhanced delivery (CED) of a therapeutic antibody were included in the study. Pre-CED, post–radiation therapy axial T2-weighted images were analyzed using 2 methods requiring high degrees of subjective judgment (picture archiving and communication system [PACS] polygon and Volume Viewer auto-contour methods) and 1 method requiring a low degree of subjective judgment (k-means clustering segmentation) to determine tumor volumes. Lin’s concordance correlation coefficients (CCCs) were calculated to assess interobserver agreement. RESULTS The CCCs of measurements made by 2 observers with the PACS polygon and the Volume Viewer auto-contour methods were 0.9465 (lower 1-sided 95% confidence limit 0.8472) and 0.7514 (lower 1-sided 95% confidence limit 0.3143), respectively. Both were considered poor agreement. The CCC of measurements made using k-means clustering segmentation was 0.9938 (lower 1-sided 95% confidence limit 0.9772), which was considered substantial strength of agreement. CONCLUSIONS The poor interobserver agreement of PACS polygon and Volume Viewer auto-contour methods highlighted the difficulty in consistently measuring DIPG tumor volumes using methods requiring high degrees of subjective judgment. k-means clustering segmentation, which requires a low degree of subjective judgment, showed better interobserver agreement and produced tumor volumes with delineated borders. PMID:27391980
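
    Lin's concordance correlation coefficient, used in the abstract above, penalizes both scatter and systematic offset between two observers. A minimal sketch of the point estimate follows, assuming the usual formula with 1/n variances; the function name and the volumes are hypothetical, and the confidence limits reported in the abstract are not reproduced here.

```python
def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # A shift between the observers' means enlarges the denominator.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Made-up tumor volumes from two observers (arbitrary units).
identical = concordance_ccc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = concordance_ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

    In the shifted case the two observers are perfectly correlated, yet the coefficient falls to 4/7 because one observer reads consistently higher.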

  10. Judgments of subtle facial expressions of emotion.

    PubMed

    Matsumoto, David; Hwang, Hyisung C

    2014-04-01

    Most studies on judgments of facial expressions of emotion have primarily utilized prototypical, high-intensity expressions. This paper examines judgments of subtle facial expressions of emotion, including not only low-intensity versions of full-face prototypes but also variants of those prototypes. A dynamic paradigm was used in which observers were shown a neutral expression followed by the target expression to judge, and then the neutral expression again, allowing for a simulation of the emergence of the expression from and then return to a baseline. We also examined how signal and intensity clarities of the expressions (explained more fully in the Introduction) were associated with judgment agreement levels. Low-intensity, full-face prototypical expressions of emotion were judged as the intended emotion at rates significantly greater than chance. A number of the proposed variants were also judged as the intended emotions. Both signal and intensity clarities were individually associated with agreement rates; when their interrelationships were taken into account, signal clarity independently predicted agreement rates but intensity clarity did not. The presence or absence of specific muscles appeared to be more important to agreement rates than their intensity levels, with the exception of the intensity of zygomatic major, which was positively correlated with agreement rates for judgments of joy.

  11. Assessment of Interobserver Reliability in Nutrition Studies that Use Direct Observation of School Meals

    PubMed Central

    BAGLIO, MICHELLE L.; BAXTER, SUZANNE DOMEL; GUINN, CAROLINE H.; THOMPSON, WILLIAM O.; SHAFFER, NICOLE M.; FRYE, FRANCESCA H. A.

    2005-01-01

    This article (a) provides a general review of interobserver reliability (IOR) and (b) describes our method for assessing IOR for items and amounts consumed during school meals for a series of studies regarding the accuracy of fourth-grade children's dietary recalls validated with direct observation of school meals. A widely used validation method for dietary assessment is direct observation of meals. Although many studies utilize several people to conduct direct observations, few published studies indicate whether IOR was assessed. Assessment of IOR is necessary to determine that the information collected does not depend on who conducted the observation. Two strengths of our method for assessing IOR are that IOR was assessed regularly throughout the data collection period and that IOR was assessed for foods at the item and amount level instead of at the nutrient level. Adequate agreement among observers is essential to the reasoning behind using observation as a validation tool. Readers are encouraged to question the results of studies that fail to mention and/or to include the results for assessment of IOR when multiple people have conducted observations. PMID:15354155

  12. Histopathological grading of breast ductal carcinoma in situ: validation of a web-based survey through intra-observer reproducibility analysis.

    PubMed

    Schuh, Fernando; Biazús, Jorge Villanova; Resetkova, Erika; Benfica, Camila Zanella; Ventura, Alessandra de Freitas; Uchoa, Diego; Graudenz, Márcia; Edelweiss, Maria Isabel Albano

    2015-07-10

    Histopathological grading diagnosis of ductal carcinoma in situ (DCIS) of the breast may be very difficult even for experts, and it is important for therapeutic decisions. The challenge may be due to the inaccurate and/or subjective application of the diagnosis criteria. The aim of this study was to investigate the intra-observer agreement between a traditional method and a developed web-based questionnaire for scoring breast DCIS. A cross-sectional study was carried out to evaluate the diagnostic agreement of an electronic questionnaire and its point scoring system with the subjective reading of digital images for 3 different DCIS grading systems: Holland, Van Nuys and modified Black nuclear grade system. Three pathologists analyzed the same set of digitized images from 43 DCIS cases using two different web-based programs. In the first phase, they accessed a website with a newly created questionnaire and scoring system developed to allow the determination of the histological grade of the cases. After at least 6 months, the pathologists read the same images again, but without the help of the questionnaire, indicating the diagnoses subjectively. The intra-observer agreement analysis was employed to validate this innovative web-based survey. Overall, diagnostic reproducibility was similar for all histologic grading classification systems, with kappa values of 0.57 ± 0.10, 0.67 ± 0.09 and 0.67 ± 0.09 for Holland, Van Nuys classification and modified Black nuclear grade system respectively. Only two 2-step diagnostic disagreements were found, one for Holland and another for Van Nuys. Both cases were overestimated by the web-based survey. The diagnostic agreement between the web-based questionnaire and a traditional method, both using digital images, is moderate to good for Holland, Van Nuys and modified Black nuclear grade system. The use of a scoring point system does not appear to pose a major risk of presenting large (2-step) diagnostic disagreements. 
These findings indicate that the use of this point scoring system in this web-based survey to grade objectively DCIS lesions is a useful diagnostic tool.

  13. Occlusion assessment of intracranial aneurysms treated with the WEB device.

    PubMed

    Caroff, Jildaz; Mihalea, Cristian; Tuilier, Titien; Barreau, Xavier; Cognard, Christophe; Desal, Hubert; Pierot, Laurent; Arnoux, Armelle; Moret, Jacques; Spelle, Laurent

    2016-09-01

    The Woven EndoBridge (WEB) system is an innovative device under evaluation for its capacity to treat wide-neck bifurcation intracranial aneurysms. The purpose of this study is to evaluate the use of the different occlusion scales available in clinical practice. Seven WEB-experienced neurointerventionalists were provided with 30 angiographic follow-up data sets and asked to grade each evaluation point according to the Bicêtre Occlusion Scale Score (BOSS), first based on DSA images only and then using additional C-Arm VasoCT analysis. This BOSS evaluation was then converted into the WEB Occlusion Scale (WOS) and into a dichotomized scale (complete occlusion or not). To estimate the inter-rater agreement among the seven raters, an overall kappa coefficient [1] and its standard error (SE) were computed. Using the five-grade BOSS, raters showed "moderate" agreement (kappa = 0.56). Using the three-grade WOS, agreement appeared slightly better (kappa = 0.59). The strongest inter-rater agreement was observed with a dichotomized version of the scale (complete occlusion or not), which enabled an "almost perfect" agreement (kappa = 0.88). VasoCT consistently enhanced the agreement, particularly with regard to depicting intra-WEB residual filling. The WOS is a consistent means to angiographically evaluate the WEB device efficiency. But the five-grade BOSS scale allows identification of aneurysm subgroups with differing risks of recurrence and/or rehemorrhage, which need to be separated, especially at the initial phase of evaluation of this innovative device. The additional use of VasoCT allows better inter-rater agreement in evaluating occlusion and especially in depicting intra-WEB persistent filling.

  14. Waist Circumference, Pedometer Placement, and Step-Counting Accuracy in Youth

    ERIC Educational Resources Information Center

    Abel, Mark G.; Hannon, James C.; Eisenman, Patricia A.; Ransdell, Lynda B.; Pett, Marjorie; Williams, Daniel P.

    2009-01-01

    This study examined whether differences in waist circumference (WC) and pedometer placement (anterior vs. midaxillary vs. posterior) affect the agreement between pedometer and observed steps during treadmill and self-paced walking. Participants included 19 pairs of youth (9-15 years old) who were matched for sex, race, and height and stratified by…

  15. Experimental assessment of theory for refraction of sound by a shear layer

    NASA Technical Reports Server (NTRS)

    Schlinker, R. H.; Amiet, R. K.

    1978-01-01

    The refraction angle and amplitude changes associated with sound transmission through a circular, open-jet shear layer were studied in a 0.91 m diameter open jet acoustic research tunnel. Free stream Mach number was varied from 0.1 to 0.4. Good agreement between refraction angle correction theory and experiment was obtained over the test Mach number, frequency and angle measurement range for all on-axis acoustic source locations. For off-axis source positions, good agreement was obtained at a source-to-shear layer separation distance greater than the jet radius. Measurable differences between theory and experiment occurred at a source-to-shear layer separation distance less than one jet radius. A shear layer turbulence scattering experiment was conducted at 90 deg to the open jet axis for the same free stream Mach numbers and axial source locations used in the refraction study. Significant discrete tone spectrum broadening and tone amplitude changes were observed at open jet Mach numbers above 0.2 and at acoustic source frequencies greater than 5 kHz. More severe turbulence scattering was observed for downstream source locations.

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ghosh, Subimal; Das, Debasish; Kao, Shih-Chieh

    Recent studies disagree on how rainfall extremes over India have changed in space and time over the past half century, as well as on whether the changes observed are due to global warming or regional urbanization. Although a uniform and consistent decrease in moderate rainfall has been reported, a lack of agreement about trends in heavy rainfall may be due in part to differences in the characterization and spatial averaging of extremes. Here we use extreme value theory to examine trends in Indian rainfall over the past half century in the context of long-term, low-frequency variability. We show that when generalized extreme value theory is applied to annual maximum rainfall over India, no statistically significant spatially uniform trends are observed, in agreement with previous studies using different approaches. Furthermore, our space time regression analysis of the return levels points to increasing spatial variability of rainfall extremes over India. Our findings highlight the need for systematic examination of global versus regional drivers of trends in Indian rainfall extremes, and may help to inform flood hazard preparedness and water resource management in the region.

  17. Dependence of solid-liquid interface free energy on liquid structure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wilson, S R; Mendelev, M I

    2014-09-01

    The Turnbull relation is widely believed to enable prediction of solid–liquid interface (SLI) free energies from measurements of the latent heat and the solid density. Ewing proposed an additional contribution to the SLI free energy to account for variations in liquid structure near the interface. In the present study, molecular dynamics (MD) simulations were performed to investigate whether SLI free energy depends on liquid structure. Analysis of the MD simulation data for 11 fcc metals demonstrated that the Turnbull relation is only a rough approximation for highly ordered liquids, whereas much better agreement is observed with Ewing’s theory. A modification to Ewing’s relation is proposed in this study that was found to provide excellent agreement with MD simulation data.

  18. Self-Reports versus Parental Perceptions of Health-Related Quality of Life among Deaf Children and Adolescents

    ERIC Educational Resources Information Center

    Pardo-Guijarro, María Jesús; Martínez-Andrés, María; Notario-Pacheco, Blanca; Solera-Martínez, Montserrat; Sánchez-López, Mairena; Martínez-Vizcaíno, Vicente

    2015-01-01

    The aim of this study was to assess the agreement between deaf children's and adolescents' self-ratings of health-related quality of life (HRQoL) and their parents' proxy reports. This observational cross-sectional study included 114 deaf 8- to 18-years-old students and proxy family members. HRQoL was measured using the KIDSCREEN-27 questionnaire,…

  19. Direct observations of a flare related coronal and solar wind disturbance

    NASA Technical Reports Server (NTRS)

    Gosling, J. T.; Hildner, E.; Macqueen, R. M.; Munro, R. H.; Poland, A. I.; Ross, C. L.

    1975-01-01

    Numerous mass ejections from the sun have been detected with orbiting coronagraphs. Here for the first time we document and discuss the direct association of a coronagraph observed mass ejection, which followed a 2B flare, with a large interplanetary shock wave disturbance observed at 1 AU. Estimates of the mass and energy content of the coronal disturbance are in reasonably good agreement with estimates of the mass and energy content of the solar wind disturbance at 1 AU. The energy estimates as well as the transit time of the disturbance are also in good agreement with numerical models of shock wave propagation in the solar wind.

  20. Detection of retinal lesions in diabetic retinopathy: comparative evaluation of 7-field digital color photography versus red-free photography.

    PubMed

    Venkatesh, Pradeep; Sharma, Reetika; Vashist, Nagender; Vohra, Rajpal; Garg, Satpal

    2015-10-01

    Red-free light allows better detection of vascular lesions as this wavelength is absorbed by hemoglobin; however, the current gold standard for the detection and grading of diabetic retinopathy remains 7-field color fundus photography. The goal of this study was to compare the ability of 7-field fundus photography using red-free light to detect retinopathy lesions with corresponding images captured using standard 7-field color photography. Non-stereoscopic standard 7-field 30° digital color fundus photography and 7-field 30° digital red-free fundus photography were performed in 200 eyes of 103 patients with various grades of diabetic retinopathy ranging from mild to moderate non-proliferative diabetic retinopathy to proliferative diabetic retinopathy. The color images (n = 1,400) were studied with corresponding red-free images (n = 1,400) by one retina consultant (PV) and two senior residents training in retina. The various retinal lesions [microaneurysms, hemorrhages, hard exudates, soft exudates, intra-retinal microvascular anomalies (IRMA), neovascularization of the retina elsewhere (NVE), and neovascularization of the disc (NVD)] detected by all three observers in each of the photographs were noted followed by determination of agreement scores using κ values (range 0-1). Kappa coefficient was categorized as poor (≤0), slight (0.01-0.20), fair (0.21-0.40), moderate (0.41-0.60), substantial (0.61-0.80), and almost perfect (0.81-1). The number of lesions detected by red-free images alone was higher for all observers and all abnormalities except hard exudates. Detection of IRMA was especially higher for all observers with red-free images. 
Between image pairs, there was substantial agreement for detection of hard exudates (average κ = 0.62, range 0.60-0.65) and moderate agreement for detection of hemorrhages (average κ = 0.52, range 0.45-0.58), soft exudates (average κ = 0.51, range 0.42-0.61), NVE (average κ = 0.47, range 0.39-0.53), and NVD (average κ = 0.51, range 0.45-0.54). Fair agreement was noted for detection of microaneurysms (average κ = 0.29, range 0.20-0.39) and IRMA (average κ = 0.23, range 0.23-0.24). Inter-observer agreement with color images was substantial for hemorrhages (average κ = 0.72), soft exudates (average κ = 0.65), and NVD (average κ = 0.65); moderate for microaneurysms (average κ = 0.42), NVE (average κ = 0.44), and hard exudates (average κ = 0.59) and fair for IRMA (average κ = 0.21). Inter-observer agreement with red-free images was substantial for hard exudates (average κ = 0.63) and moderate for detection of hemorrhages (average κ = 0.56), SE (average κ = 0.60), IRMA (average κ = 0.50), NVE (average κ = 0.44), and NVD (average κ = 0.45). Digital red-free photography has a higher level of detection ability for all retinal lesions of diabetic retinopathy. More advanced grades of retinopathy are likely to be detected earlier with red-free imaging because of its better ability to detect IRMA, NVE, and NVD. Red-free monochromatic imaging of the retina is a more effective and less costly alternative for detection of vision-threatening diabetic retinopathy.
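
    The kappa bands listed in the abstract above (poor, slight, fair, moderate, substantial, almost perfect) can be applied mechanically. The sketch below maps a kappa value to its band; the function name is hypothetical, and treating the bands as contiguous (so that, for example, a value between 0 and 0.01 counts as slight) is an assumption.

```python
def agreement_label(kappa):
    """Map a kappa value to the agreement band used in the abstract."""
    if kappa <= 0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Average kappa values quoted in the abstract for agreement between image pairs.
labels = {
    "hard exudates": agreement_label(0.62),
    "hemorrhages": agreement_label(0.52),
    "microaneurysms": agreement_label(0.29),
    "IRMA": agreement_label(0.23),
}
```

    The resulting labels match the wording of the abstract: substantial for hard exudates, moderate for hemorrhages, and fair for microaneurysms and IRMA.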
