Choosing the best index for the average score intraclass correlation coefficient.
Shieh, Gwowen
2016-09-01
The intraclass correlation coefficient index ICC(2) from a one-way random effects model is widely used to describe the reliability of mean ratings in behavioral, educational, and psychological research. Despite its apparent utility, the essential property of ICC(2) as a point estimator of the average score intraclass correlation coefficient is seldom mentioned. This article considers several potential measures and compares their performance with ICC(2). Analytical derivations and numerical examinations are presented to assess the bias and mean square error of the alternative estimators. The results suggest that more advantageous indices can be recommended over ICC(2) for their theoretical implication and computational ease.
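For orientation, the following is a minimal sketch of how the single-score and average-score intraclass correlations are commonly computed from one-way ANOVA mean squares. It assumes a balanced design (every target rated the same number of times) and the usual textbook estimators; the function name and the data are illustrative and are not taken from the article.

```python
def one_way_icc(ratings):
    """Return (single-score ICC, average-score ICC) for a balanced design.

    ratings: list of rows, one row per rated target, each row holding
    the k ratings that target received (assumed equal k for all rows).
    """
    n = len(ratings)        # number of targets (groups)
    k = len(ratings[0])     # ratings per target
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-target and within-target sums of squares
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(ratings, row_means) for x in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_average = (ms_between - ms_within) / ms_between
    return icc_single, icc_average


# Hypothetical data: 3 targets, 2 ratings each
example = [[1, 2], [3, 4], [5, 6]]
icc_single, icc_average = one_way_icc(example)
```

With these made-up ratings the average-score value is larger than the single-score value, which is the usual pattern when the single-score ICC is positive.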
Peirce, Deborah; Brown, Janie; Corkish, Victoria; Lane, Marguerite; Wilson, Sally
2016-06-01
To compare two methods of calculating interrater agreement while determining content validity of the Paediatric Pain Knowledge and Attitudes Questionnaire for use with Australian nurses. Paediatric pain assessment and management documentation was found to be suboptimal revealing a need to assess paediatric nurses' knowledge and attitude to pain. The Paediatric Pain Knowledge and Attitudes Questionnaire was selected as it had been reported as valid and reliable in the United Kingdom with student nurses. The questionnaire required content validity determination prior to use in the Australian context. A two phase process of expert review. Ten paediatric nurses completed a relevancy rating of all 68 questionnaire items. In phase two, five pain experts reviewed the items of the questionnaire that scored an unacceptable item level content validity. Item and scale level content validity indices and intraclass correlation coefficients were calculated. In phase one, 31 items received an item level content validity index <0·78 and the scale level content validity index average was 0·80 which were below levels required for acceptable validity. The intraclass correlation coefficient was 0·47. In phase two, 10 items were amended and four items deleted. The revised questionnaire provided a scale level content validity index average >0·90 and an intraclass correlation coefficient of 0·94 demonstrating excellent agreement between raters therefore acceptable content validity. Equivalent outcomes were achieved using the content validity index and the intraclass correlation coefficient. To assess content validity the content validity index has the advantage of providing an item level score and is a simple calculation. The intraclass correlation coefficient requires statistical knowledge, or support, and has the advantage of accounting for the possibility of chance agreement. © 2016 John Wiley & Sons Ltd.
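The content validity index calculation mentioned above can be sketched as follows. This assumes a 4-point relevance scale on which ratings of 3 or 4 count as "relevant", a common convention that the abstract does not state; the ratings and function names are invented for illustration.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level CVI: share of experts rating the item as relevant."""
    return sum(1 for r in ratings if r >= relevant_min) / len(ratings)


def scale_cvi_average(ratings_by_item):
    """Scale-level CVI (average method): mean of the item-level CVIs."""
    cvis = [item_cvi(r) for r in ratings_by_item]
    return sum(cvis) / len(cvis)


# Hypothetical relevance ratings from 10 experts for 3 items
ratings_by_item = [
    [4, 4, 3, 4, 3, 4, 4, 3, 4, 4],  # all 10 experts rate 3 or 4
    [3, 2, 4, 3, 2, 4, 3, 3, 2, 4],  # 7 of 10 rate 3 or 4
    [4, 3, 3, 4, 4, 2, 4, 3, 4, 3],  # 9 of 10 rate 3 or 4
]

item_scores = [item_cvi(r) for r in ratings_by_item]
# Items falling below the 0.78 threshold quoted in the abstract
flagged = [i for i, score in enumerate(item_scores) if score < 0.78]
s_cvi_ave = scale_cvi_average(ratings_by_item)
```

Here only the second item would be flagged for review, which mirrors the kind of item-level feedback the abstract credits to the index.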
Goldstein, Seth D; Lindeman, Brenessa; Colbert-Getz, Jorie; Arbella, Trisha; Dudas, Robert; Lidor, Anne; Sacks, Bethany
2014-02-01
The clinical knowledge of medical students on a surgery clerkship is routinely assessed via subjective evaluations from faculty members and residents. Interpretation of these ratings should ideally be valid and reliable. However, prior literature has questioned the correlation between subjective and objective components when assessing students' clinical knowledge. Retrospective cross-sectional data were collected from medical student records at The Johns Hopkins University School of Medicine from July 2009 through June 2011. Surgical faculty members and residents rated students' clinical knowledge on a 5-point, Likert-type scale. Interrater reliability was assessed using intraclass correlation coefficients for students with ≥4 attending surgeon evaluations (n = 216) and ≥4 resident evaluations (n = 207). Convergent validity was assessed by correlating average evaluation ratings with scores on the National Board of Medical Examiners (NBME) clinical subject examination for surgery. Average resident and attending surgeon ratings were also compared by NBME quartile using analysis of variance. There were high degrees of reliability for resident ratings (intraclass correlation coefficient, .81) and attending surgeon ratings (intraclass correlation coefficient, .76). Resident and attending surgeon ratings shared a moderate degree of variance (19%). However, average resident ratings and average attending surgeon ratings shared a small degree of variance with NBME surgery examination scores (ρ² ≤ .09). When ratings were compared among NBME quartile groups, the only significant difference was for residents' ratings of students with the lower 25th percentile of scores compared with the top 25th percentile of scores (P = .007). Although high interrater reliability suggests that attending surgeons and residents rate students with consistency, the lack of convergent validity suggests that these ratings may not be reflective of actual clinical knowledge. 
Both faculty members and residents may benefit from training in knowledge assessment, which will likely increase opportunities to recognize deficiencies and make student evaluation a more valuable tool. Copyright © 2014 Elsevier Inc. All rights reserved.
Larson, Tomas; Kerekes, Nóra; Selinus, Eva Norén; Lichtenstein, Paul; Gumpert, Clara Hellner; Anckarsäter, Henrik; Nilsson, Thomas; Lundström, Sebastian
2014-02-01
The Autism-Tics, AD/HD, and other Comorbidities (A-TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A-TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A-TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's kappa. A-TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A-TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's kappa indicated acceptable reliability. The current study provides statistical evidence that the A-TAC yields good test-retest reliability in a population-based cohort of children.
One Iota Fills the Quota: A Paradox in Multifacet Reliability Coefficients.
ERIC Educational Resources Information Center
Conger, Anthony J.
1983-01-01
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Labocha, Marta K.; Sadowska, Edyta T.; Baliga, Katarzyna; Semer, Aleksandra K.; Koteja, Paweł
2004-01-01
Basal metabolic rate (BMR) is a fundamental energetic trait and has been measured in hundreds of birds and mammals. Nevertheless, little is known about the consistency of the population-average BMR or its repeatability at the level of individual variation. Here, we report that average mass-independent BMR did not differ between two generations of bank voles or between two trials separated by one month. Individual differences in BMR were highly repeatable across the one month interval: the coefficient of intraclass correlation was 0.70 for absolute log-transformed values and 0.56 for mass-independent values. Thus, BMR can be a meaningful measure of an individual physiological characteristic and can be used to test hypotheses concerning relationships between BMR and other traits. On the other hand, mass-independent BMR did not differ significantly across families, and the coefficient of intraclass correlation for full sibs did not differ from zero, which suggests that heritability of BMR in voles is not high. PMID:15101695
The Variance of Intraclass Correlations in Three- and Four-Level Models
ERIC Educational Resources Information Center
Hedges, Larry V.; Hedberg, E. C.; Kuyper, Arend M.
2012-01-01
Intraclass correlations are used to summarize the variance decomposition in populations with multilevel hierarchical structure. There has recently been considerable interest in estimating intraclass correlations from surveys or designed experiments to provide design parameters for planning future large-scale randomized experiments. The large…
The Variance of Intraclass Correlations in Three and Four Level
ERIC Educational Resources Information Center
Hedges, Larry V.; Hedberg, Eric C.; Kuyper, Arend M.
2012-01-01
Intraclass correlations are used to summarize the variance decomposition in populations with multilevel hierarchical structure. There has recently been considerable interest in estimating intraclass correlations from surveys or designed experiments to provide design parameters for planning future large-scale randomized experiments. The large…
Reliability Generalization of the Psychopathy Checklist Applied in Youthful Samples
ERIC Educational Resources Information Center
Campbell, Justin S.; Pulos, Steven; Hogan, Mike; Murry, Francie
2005-01-01
This study examines the average reliability of Hare Psychopathy Checklists (PCLs) adapted for use in samples of youthful offenders (aged 12 to 21 years). Two forms of reliability are examined: 18 alpha estimates of internal consistency and 18 intraclass correlation (two or more raters) estimates of interrater reliability. The results, an average…
Test-Retest Reliability of the Salutogenic Wellness Promotion Scale (SWPS)
ERIC Educational Resources Information Center
Anderson, L. M.; Moore, J. B.; Hayden, B. M.; Becker, C. M.
2014-01-01
Objective: This study examined the temporal stability (i.e. test-retest reliability) of the Salutogenic Wellness Promotion Scale (SWPS) using intraclass correlation coefficients (ICC). Current intraclass results were also compared to previously published interclass correlations to support the use of the intraclass method for test-retest…
Braschel, Melissa C; Svec, Ivana; Darlington, Gerarda A; Donner, Allan
2016-04-01
Many investigators rely on previously published point estimates of the intraclass correlation coefficient rather than on their associated confidence intervals to determine the required size of a newly planned cluster randomized trial. Although confidence interval methods for the intraclass correlation coefficient that can be applied to community-based trials have been developed for a continuous outcome variable, fewer methods exist for a binary outcome variable. The aim of this study is to evaluate confidence interval methods for the intraclass correlation coefficient applied to binary outcomes in community intervention trials enrolling a small number of large clusters. Existing methods for confidence interval construction are examined and compared to a new ad hoc approach based on dividing clusters into a large number of smaller sub-clusters and subsequently applying existing methods to the resulting data. Monte Carlo simulation is used to assess the width and coverage of confidence intervals for the intraclass correlation coefficient based on Smith's large sample approximation of the standard error of the one-way analysis of variance estimator, an inverted modified Wald test for the Fleiss-Cuzick estimator, and intervals constructed using a bootstrap-t applied to a variance-stabilizing transformation of the intraclass correlation coefficient estimate. In addition, a new approach is applied in which clusters are randomly divided into a large number of smaller sub-clusters with the same methods applied to these data (with the exception of the bootstrap-t interval, which assumes large cluster sizes). These methods are also applied to a cluster randomized trial on adolescent tobacco use for illustration. When applied to a binary outcome variable in a small number of large clusters, existing confidence interval methods for the intraclass correlation coefficient provide poor coverage. 
However, confidence intervals constructed using the new approach combined with Smith's method provide nominal or close to nominal coverage when the intraclass correlation coefficient is small (<0.05), as is the case in most community intervention trials. This study concludes that when a binary outcome variable is measured in a small number of large clusters, confidence intervals for the intraclass correlation coefficient may be constructed by dividing existing clusters into sub-clusters (e.g. groups of 5) and using Smith's method. The resulting confidence intervals provide nominal or close to nominal coverage across a wide range of parameters when the intraclass correlation coefficient is small (<0.05). Application of this method should provide investigators with a better understanding of the uncertainty associated with a point estimator of the intraclass correlation coefficient used for determining the sample size needed for a newly designed community-based trial. © The Author(s) 2015.
Syed, Mushabbar A; Oshinski, John N; Kitchen, Charles; Ali, Arshad; Charnigo, Richard J; Quyyumi, Arshed A
2009-08-01
Carotid MRI measurements are increasingly being employed in research studies for atherosclerosis imaging. The majority of carotid imaging studies use 1.5 T MRI. Our objective was to investigate intra-observer and inter-observer variability in carotid measurements using high resolution 3 T MRI. We performed 3 T carotid MRI on 10 patients (age 56 ± 8 years, 7 male) with atherosclerosis risk factors and ultrasound intima-media thickness ≥0.6 mm. A total of 20 transverse images of both right and left carotid arteries were acquired using T2 weighted black-blood sequence. The lumen and outer wall of the common carotid and internal carotid arteries were manually traced; vessel wall area, vessel wall volume, and average wall thickness measurements were then assessed for intra-observer and inter-observer variability. Pearson and intraclass correlations were used in these assessments, along with Bland-Altman plots. For inter-observer variability, Pearson correlations ranged from 0.936 to 0.996 and intraclass correlations from 0.927 to 0.991. For intra-observer variability, Pearson correlations ranged from 0.934 to 0.954 and intraclass correlations from 0.831 to 0.948. Calculations showed that inter-observer variability and other sources of error would inflate sample size requirements for a clinical trial by no more than 7.9%, indicating that 3 T MRI is nearly optimal in this respect. In patients with subclinical atherosclerosis, 3 T carotid MRI measurements are highly reproducible and have important implications for clinical trial design.
A comparison of two indices for the intraclass correlation coefficient.
Shieh, Gwowen
2012-12-01
In the present study, we examined the behavior of two indices for measuring the intraclass correlation in the one-way random effects model: the prevailing ICC(1) (Fisher, 1938) and the corrected eta-squared (Bliese & Halverson, 1998). These two procedures differ both in their methods of estimating the variance components that define the intraclass correlation coefficient and in their performance of bias and mean squared error in the estimation of the intraclass correlation coefficient. In contrast with the natural unbiased principle used to construct ICC(1), in the present study it was analytically shown that the corrected eta-squared estimator is identical to the maximum likelihood estimator and the pairwise estimator under equal group sizes. Moreover, the empirical results obtained from the present Monte Carlo simulation study across various group structures revealed the mutual dominance relationship between their truncated versions for negative values. The corrected eta-squared estimator performs better than the ICC(1) estimator when the underlying population intraclass correlation coefficient is small. Conversely, ICC(1) has a clear advantage over the corrected eta-squared for medium and large magnitudes of population intraclass correlation coefficient. The conceptual description and numerical investigation provide guidelines to help researchers choose between the two indices for more accurate reliability analysis in multilevel research.
Does Dry Eye Affect Repeatability of Corneal Topography Measurements?
Doğan, Aysun Şanal; Gürdal, Canan; Köylü, Mehmet Talay
2018-04-01
The purpose of this study was to assess the repeatability of corneal topography measurements in dry eye patients and healthy controls. Participants underwent consecutive corneal topography measurements (Sirius; Costruzione Strumenti Oftalmici, Florence, Italy). Two images with acquisition quality higher than 90% were accepted. The following parameters were evaluated: minimum and central corneal thickness, aqueous depth, apex curvature, anterior chamber volume, horizontal anterior chamber diameter, iridocorneal angle, cornea volume, and average simulated keratometry. Repeatability was assessed by calculating intra-class correlation coefficient. Thirty-three patients with dry eye syndrome and 40 healthy controls were enrolled to the study. The groups were similar in terms of age (39 [18-65] vs. 30.5 [18-65] years, p=0.198) and gender (M/F: 4/29 vs. 8/32, p=0.366). Intra-class correlation coefficients among all topography parameters within both groups showed excellent repeatability (>0.90). The anterior segment measurements provided by the Sirius corneal topography system were highly repeatable for dry eye patients and are sufficiently reliable for clinical practice and research.
Development and reliability testing of a food store observation form.
Rimkus, Leah; Powell, Lisa M; Zenk, Shannon N; Han, Euna; Ohri-Vachaspati, Punam; Pugach, Oksana; Barker, Dianne C; Resnick, Elissa A; Quinn, Christopher M; Myllyluoma, Jaana; Chaloupka, Frank J
2013-01-01
To develop a reliable food store observational data collection instrument to be used for measuring product availability, pricing, and promotion. Observational data collection. A total of 120 food stores (26 supermarkets, 34 grocery stores, 54 gas/convenience stores, and 6 mass merchandise stores) in the Chicago metropolitan statistical area. Inter-rater reliability for product availability, pricing, and promotion measures on a food store observational data collection instrument. Cohen's kappa coefficient and proportion of overall agreement for dichotomous variables and intra-class correlation coefficient for continuous variables. Inter-rater reliability, as measured by average kappa coefficient, was 0.84 for food and beverage product availability measures, 0.80 for interior store characteristics, and 0.70 for exterior store characteristics. For continuous measures, average intra-class correlation coefficient was 0.82 for product pricing measures; 0.90 for counts of fresh, frozen, and canned fruit and vegetable options; and 0.85 for counts of advertisements on the store exterior and property. The vast majority of measures demonstrated substantial or almost perfect agreement. Although some items may require revision, results suggest that the instrument may be used to reliably measure the food store environment. Copyright © 2013 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
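As a hedged illustration of the agreement statistic used for the dichotomous measures, the sketch below computes Cohen's kappa for two raters from first principles. The observations are invented and the function name is illustrative; it is not the study's analysis code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the agreement expected from the raters' marginal
    category frequencies alone. Undefined when p_expected equals 1.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty item set")
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical availability codes (1 = product present) for 10 stores
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohens_kappa(rater_1, rater_2)
```

In this made-up example the raters agree on 8 of 10 stores, but because half of that agreement is expected by chance, kappa comes out lower than the raw proportion of agreement.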
ERIC Educational Resources Information Center
Kistner, Emily O.; Muller, Keith E.
2004-01-01
Intraclass correlation and Cronbach's alpha are widely used to describe reliability of tests and measurements. Even with Gaussian data, exact distributions are known only for compound symmetric covariance (equal variances and equal correlations). Recently, large sample Gaussian approximations were derived for the distribution functions. New exact…
Yue, Chen; Chen, Shaojie; Sair, Haris I; Airan, Raag; Caffo, Brian S
2015-09-01
Data reproducibility is a critical issue in all scientific experiments. In this manuscript, the problem of quantifying the reproducibility of graphical measurements is considered. The image intra-class correlation coefficient (I2C2) is generalized and the graphical intra-class correlation coefficient (GICC) is proposed for such purpose. The concept for GICC is based on multivariate probit-linear mixed effect models. A Markov Chain Monte Carlo EM (mcm-cEM) algorithm is used for estimating the GICC. Simulation results with varied settings are demonstrated and our method is applied to the KIRBY21 test-retest dataset.
Greywoode, Jewel; Bluman, Eric; Spiegel, Joseph; Boon, Maurits
2009-11-01
To evaluate the readability of patient-oriented online health information (OHI) presented on the American Academy of Otolaryngology-Head and Neck Surgery (AAO-HNS) website. Review of the Flesch-Kincaid (FK) grade level for 104 articles on the AAO-HNS website. The FK grade level for 104 articles was determined using the readability calculator available within Microsoft Office Word 2003. The interobserver reliability for the FK grade level was determined by calculating the intraclass correlation coefficient (ICC) for 52 entries. The average FK grade reading level of the articles was 10.8 (range 6.3-16.7; 95% CI, 10.4-11.2). Eighty-one percent of the articles were written at a ninth grade level or higher. The intraclass correlation was good (r = 0.83) for the 52 articles that were independently reviewed. This analysis has shown that the average reading level for each article on the AAO-HNS site was higher than the recommended sixth grade reading level. Although the AAO-HNS site is written at a higher level than that suggested for the general public, it is important to realize that readability is just one consideration in the evaluation of OHI comprehension. Physicians need to be cognizant of their patients' ability to read and comprehend written information and tailor their educational material appropriately.
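The Flesch-Kincaid grade level referred to above follows a published formula based on average sentence length and average syllables per word. The sketch below applies that formula to pre-counted totals; it does not attempt to reproduce the word and syllable counting done by the Microsoft Word calculator used in the study, and the counts shown are hypothetical.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level from pre-computed text counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical passage: 100 words, 5 sentences, 150 syllables
grade = flesch_kincaid_grade(100, 5, 150)
```

For this invented passage the result is a little under grade 10, which would already sit above the sixth grade level the abstract cites as the recommendation.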
To, Kien Gia; Meuleners, Lynn; Chen, Huei-Yang; Lee, Andy; Do, Dung Van; Duong, Dat Van; Phi, Tien Duy; Tran, Hoang Huy; Nguyen, Nguyen Do
2014-06-01
To determine the test-retest repeatability of the National Eye Institute 25-item Visual Function Questionnaire (NEI VFQ-25) for use with older Vietnamese adults with bilateral cataract. The questionnaire was translated into Vietnamese and back-translated into English by two independent translators. Patients with bilateral cataract aged 50 and older completed the questionnaire on two separate occasions, one to two weeks after first administration of the questionnaire. Test-retest repeatability was assessed using the Cronbach's α and intraclass correlation coefficients. The average age of participants was 67 ± 8 years and most participants were female (73%). Internal consistency was acceptable with the α coefficient above 0.7 for all subscales and intraclass correlation coefficients were 0.6 or greater in all subscales. The Vietnamese NEI VFQ-25 is reliable for use in studies assessing vision-related quality of life in older adults with bilateral cataract in Vietnam. We propose some modifications to the NEI-VFQ questions to reflect activities of older people in Vietnam. © 2013 ACOTA.
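For readers unfamiliar with the α coefficient reported for the subscales, a minimal sketch of the standard Cronbach's alpha computation is given below. The responses are invented, and the function name is illustrative rather than drawn from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of rows, one per respondent, each with k item scores.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals))
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical subscale: 4 respondents answering 3 items
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 5],
]
alpha = cronbach_alpha(responses)
```

Note that alpha summarizes internal consistency within one administration, whereas the intraclass correlation in the abstract compares scores across the two administrations.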
Judging in Rhythmic Gymnastics at Different Levels of Performance.
Leandro, Catarina; Ávila-Carvalho, Lurdes; Sierra-Palmeiro, Elena; Bobo-Arce, Marta
2017-12-01
This study aimed to analyse the quality of difficulty judging in rhythmic gymnastics, at different levels of performance. The sample consisted of 1152 difficulty scores concerning 288 individual routines, performed in the World Championships in 2013. The data were analysed using the mean absolute judge deviation from the final difficulty score, a Cronbach's alpha coefficient and intra-class correlations, for consistency and reliability assessment. For validity assessment, mean deviations of judges' difficulty scores, the Kendall's coefficient of concordance W and ANOVA eta-squared values were calculated. Overall, the results in terms of consistency (Cronbach's alpha mostly above 0.90) and reliability (intra-class correlations for single and average measures above 0.70 and 0.90, respectively) were satisfactory, in the first and third parts of the ranking on all apparatus. The medium level gymnasts, those in the second part of the ranking, had inferior reliability indices and highest score dispersion. In this part, the minimum of corrected item-total correlation of individual judges was 0.55, with most values well below, and the matrix for between-judge correlations identified remarkable inferior correlations. These findings suggest that the quality of difficulty judging in rhythmic gymnastics may be compromised at certain levels of performance. In future, special attention should be paid to the judging analysis of the medium level gymnasts, as well as the Code of Points applicability at this level.
Judging in Rhythmic Gymnastics at Different Levels of Performance
Ávila-Carvalho, Lurdes; Sierra-Palmeiro, Elena; Bobo-Arce, Marta
2017-01-01
This study aimed to analyse the quality of difficulty judging in rhythmic gymnastics, at different levels of performance. The sample consisted of 1152 difficulty scores concerning 288 individual routines, performed in the World Championships in 2013. The data were analysed using the mean absolute judge deviation from the final difficulty score, a Cronbach’s alpha coefficient and intra-class correlations, for consistency and reliability assessment. For validity assessment, mean deviations of judges’ difficulty scores, the Kendall’s coefficient of concordance W and ANOVA eta-squared values were calculated. Overall, the results in terms of consistency (Cronbach’s alpha mostly above 0.90) and reliability (intra-class correlations for single and average measures above 0.70 and 0.90, respectively) were satisfactory, in the first and third parts of the ranking on all apparatus. The medium level gymnasts, those in the second part of the ranking, had inferior reliability indices and highest score dispersion. In this part, the minimum of corrected item-total correlation of individual judges was 0.55, with most values well below, and the matrix for between-judge correlations identified remarkable inferior correlations. These findings suggest that the quality of difficulty judging in rhythmic gymnastics may be compromised at certain levels of performance. In future, special attention should be paid to the judging analysis of the medium level gymnasts, as well as the Code of Points applicability at this level. PMID:29339996
Strayhorn, J; McDermott, J F; Tanguay, P
1993-06-01
The effects of methods used to improve the interrater reliability of reviewers' ratings of manuscripts submitted to the Journal of the American Academy of Child and Adolescent Psychiatry were studied. Reviewers' ratings of consecutive manuscripts submitted over approximately 1 year were first analyzed; 296 pairs of ratings were studied. Intraclass correlations and confidence intervals for the correlations were computed for the two main ratings by which reviewers quantified the quality of the article: a 1-10 overall quality rating and a recommendation for acceptance or rejection with four possibilities along that continuum. Modifications were then introduced, including a multi-item rating scale and two training manuals to accompany it. Over the next year, 272 more articles were rated, and reliabilities were computed for the new scale and for the scales previously used. The intraclass correlation of the most reliable rating before the intervention was 0.27; the reliability of the new rating procedure was 0.43. The difference between these two was significant. The reliability for the new rating scale was in the fair to good range, and it became even better when the ratings of the two reviewers were averaged and the reliability stepped up by the Spearman-Brown formula. The new rating scale had excellent internal consistency and correlated highly with other quality ratings. The data confirm that the reliability of ratings of scientific articles may be improved by increasing the number of rating scale points, eliciting ratings of separate, concrete items rather than a global judgment, using training manuals, and averaging the scores of multiple reviewers.
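The Spearman-Brown step-up mentioned above can be illustrated with a short calculation. The sketch applies the standard prophecy formula to the single-rating reliability of 0.43 reported in the abstract, averaged over two reviewers; the resulting figure is computed here and is not quoted from the study.

```python
def spearman_brown(single_reliability, n_raters):
    """Predicted reliability of the mean of n_raters parallel ratings."""
    r = single_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Reliability of one reviewer's rating on the new scale (from the abstract)
stepped_up = spearman_brown(0.43, 2)   # about 0.60 for two reviewers
```

The same function shows the diminishing return from adding reviewers: each extra rater raises the predicted reliability by less than the one before.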
Intraclass Correlation Values for Planning Group-Randomized Trials in Education
ERIC Educational Resources Information Center
Hedges, Larry V.; Hedberg, E. C.
2007-01-01
Experiments that assign intact groups to treatment conditions are increasingly common in social research. In educational research, the groups assigned are often schools. The design of group-randomized experiments requires knowledge of the intraclass correlation structure to compute statistical power and sample sizes required to achieve adequate…
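To make the role of the intraclass correlation in planning concrete, the sketch below uses the widely cited design effect for equal-sized clusters, 1 + (m - 1)ρ. This is a simplified two-level illustration under that assumption, not the procedure of the paper itself, and all numbers are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from sampling intact clusters of equal size."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


# Hypothetical trial: 40 schools, 20 students each, ICC of 0.05
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(40, 20, 0.05)
```

Even a small intraclass correlation nearly halves the effective sample size in this example, which is why reference values for the ICC matter when computing power.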
Vizzeri, Gianmarco; Bowd, Christopher; Medeiros, Felipe A; Weinreb, Robert N; Zangwill, Linda M
2008-08-01
Misalignment of the Stratus optical coherence tomograph scan circle placed by the operator around the optic nerve head (ONH) during each retinal nerve fiber layer (RNFL) examination can affect the instrument reproducibility and its theoretical ability to detect true structural changes in the RNFL thickness over time. We evaluated the effect of scan circle placement on RNFL measurements. Observational clinical study. Sixteen eyes of 8 normal participants were examined using the Stratus optical coherence tomograph Fast RNFL thickness acquisition protocol (software version 4.0.7; Carl Zeiss Meditec, Dublin, CA). Four consecutive images were taken by the same operator with the circular scan centered on the optic nerve head. Four images each with the scan displaced superiorly, inferiorly, temporally, and nasally were also acquired. Differences in average and sectoral RNFL thicknesses were determined. For the centered scans, the coefficients of variation (CV) and the intraclass correlation coefficient for the average RNFL thickness measured were calculated. When the average RNFL thickness of the centered scans was compared with the average RNFL thickness of the displaced scans individually using analysis of variance with post-hoc analysis, no difference was found between the average RNFL thickness of the nasally (105.2 µm), superiorly (106.2 µm), or inferiorly (104.1 µm) displaced scans and the centered scans (106.4 µm). However, a significant difference (analysis of variance with Dunnett's test: F=8.82, P<0.0001) was found between temporally displaced scans (115.8 µm) and centered scans. Significant differences in sectoral RNFL thickness measurements were found between centered and each displaced scan. The coefficient of variation for average RNFL thickness was 1.75% and intraclass correlation coefficient was 0.95. 
In normal eyes, average RNFL thickness measurements are robust and similar with significant superior, inferior, and nasal scan displacement, but average RNFL thickness is greater when scans are displaced temporally. Parapapillary scan misalignment produces significant changes in RNFL assessment characterized by an increase in measured RNFL thickness in the quadrant in which the scan is closer to the disc, and a significant decrease in RNFL thickness in the quadrant in which the scan is displaced further from the optic disc.
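The coefficient of variation reported for the centered scans is the standard deviation of repeated measurements expressed as a percentage of their mean. A minimal sketch, using invented thickness values rather than study data and assuming the sample standard deviation is intended:

```python
from statistics import mean, stdev


def coefficient_of_variation(measurements):
    """Sample standard deviation as a percentage of the mean."""
    return 100 * stdev(measurements) / mean(measurements)


# Hypothetical average RNFL thickness (µm) from four repeated scans of one eye
repeat_scans = [104, 106, 107, 105]
cv_percent = coefficient_of_variation(repeat_scans)
```

A value near 1 to 2 percent, as in this made-up example and in the abstract, indicates that repeated scans differ by only a small fraction of the measured thickness.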
ERIC Educational Resources Information Center
Zhou, Hong; Muellerleile, Paige; Ingram, Debra; Wong, Seok P.
2011-01-01
Intraclass correlation coefficients (ICCs) are commonly used in behavioral measurement and psychometrics when a researcher is interested in the relationship among variables of a common class. The formulas for deriving ICCs, or generalizability coefficients, vary depending on which models are specified. This article gives the equations for…
ERIC Educational Resources Information Center
Raykov, Tenko
2011-01-01
Interval estimation of intraclass correlation coefficients in hierarchical designs is discussed within a latent variable modeling framework. A method accomplishing this aim is outlined, which is applicable in two-level studies where participants (or generally lower-order units) are clustered within higher-order units. The procedure can also be…
Angiographic assessment of initial balloon angioplasty results.
Gardiner, Geoffrey A; Sullivan, Kevin L; Halpern, Ethan J; Parker, Laurence; Beck, Margaret; Bonn, Joseph; Levin, David C
2004-10-01
To determine the influence of three factors involved in the angiographic assessment of balloon angioplasty (interobserver variability, operator bias, and the definition used to determine success) on the primary (technical) results of angioplasty in the peripheral arteries. Percent stenosis in 107 lesions in lower-extremity arteries was graded by three independent, experienced vascular radiologists ("observers") before and after balloon angioplasty and their estimates were compared with the initial interpretations reported by the physician performing the procedure ("operator") and an automated quantitative computer analysis. Observer variability was measured with use of intraclass correlation coefficients and SD. Differences among the operator, observers, and the computer were analyzed with use of the Wilcoxon signed-rank test and analysis of variance. For each evaluator, the results in this series of lesions were interpreted with three different definitions of success. Estimation of residual stenosis varied by an average range of 22.76% with an average SD of 8.99. The intraclass correlation coefficients averaged 0.59 for residual stenosis after angioplasty for the three observers but decreased to 0.36 when the operator was included as the fourth evaluator. There was good to very good agreement among the three independent observers and the computer, but poor correlation with the operator (P = .001). The primary success rates for this series of lesions varied from a low of 47% to high of 99%, depending solely on which definition of success was used. Significant differences among the operator, the three observers, and the computer were not present when the definition of success was based on less than 50% residual stenosis. Observer variability and bias in the subjective evaluation of peripheral angioplasty can have a significant influence on the reported initial success rates.
This effect can be largely eliminated with the use of residual stenosis of less than 50% to define success. Otherwise, meaningful evaluation of angioplasty results will require independent panels of evaluators or computerized measurements.
Reliability of Computerized Neurocognitive Tests for Concussion Assessment: A Meta-Analysis.
Farnsworth, James L; Dargo, Lucas; Ragan, Brian G; Kang, Minsoo
2017-09-01
Although widely used, computerized neurocognitive tests (CNTs) have been criticized because of low reliability and poor sensitivity. A systematic review was published summarizing the reliability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores; however, this was limited to a single CNT. Expansion of the previous review to include additional CNTs and a meta-analysis is needed. Therefore, our purpose was to analyze reliability data for CNTs using meta-analysis and examine moderating factors that may influence reliability. A systematic literature search (key terms: reliability, computerized neurocognitive test, concussion) of electronic databases (MEDLINE, PubMed, Google Scholar, and SPORTDiscus) was conducted to identify relevant studies. Studies were included if they met all of the following criteria: used a test-retest design, involved at least 1 CNT, provided sufficient statistical data to allow for effect-size calculation, and were published in English. Two independent reviewers investigated each article to assess inclusion criteria. Eighteen studies involving 2674 participants were retained. Intraclass correlation coefficients were extracted to calculate effect sizes and determine overall reliability. The Fisher Z transformation adjusted for sampling error associated with averaging correlations. Moderator analyses were conducted to evaluate the effects of the length of the test-retest interval, intraclass correlation coefficient model selection, participant demographics, and study design on reliability. Heterogeneity was evaluated using the Cochran Q statistic. The proportion of acceptable outcomes was greatest for the Axon Sports CogState Test (75%) and lowest for the ImPACT (25%). Moderator analyses indicated that the type of intraclass correlation coefficient model used significantly influenced effect-size estimates, accounting for 17% of the variation in reliability. 
The Axon Sports CogState Test, which has a higher proportion of acceptable outcomes and shorter test duration relative to other CNTs, may be a reliable option; however, future studies are needed to compare the diagnostic accuracy of these instruments.
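The Fisher Z step mentioned in this abstract can be illustrated with a minimal sketch: coefficients are transformed to the z scale, averaged, and transformed back. The function name, the example values, and the n - 3 weights (the usual inverse-variance weights for a Pearson correlation) are assumptions made for illustration; the abstract does not state which weights the authors applied to the intraclass correlation coefficients.

```python
import math


def fisher_z_mean(correlations, sample_sizes):
    """Average correlations on Fisher's z scale and back-transform.

    Weights of n - 3 are an illustrative assumption (the inverse-variance
    weights for a Pearson r), not a detail taken from the abstract.
    """
    if len(correlations) != len(sample_sizes):
        raise ValueError("need one sample size per correlation")
    weights = [n - 3 for n in sample_sizes]
    if any(w <= 0 for w in weights):
        raise ValueError("each sample size must exceed 3")
    # z = atanh(r) stabilises the variance of r before averaging.
    z_values = [math.atanh(r) for r in correlations]
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    # Back-transform the pooled z to the correlation scale.
    return math.tanh(z_bar)


# Hypothetical test-retest coefficients from three studies.
pooled = fisher_z_mean([0.45, 0.70, 0.82], [40, 120, 65])
print(round(pooled, 3))
```

Averaging on the z scale gives a pooled value that leans toward the larger coefficients compared with a plain arithmetic mean, because the transformation stretches values near 1.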
Stroop, D M; Glueck, C J; Tracy, T M; Schumacher, H R
1994-12-01
Our specific aim was to compare three plasminogen activator-inhibitor type 1 (PAI-1) antigen ELISA kit assays (the Biopool AB, Ltd, TintElize PAI-1 Strip-Well Format; the American Diagnostica, Inc., Imubind 822/1; and the second generation Imubind 822/1S). Within-run coefficients of variation (n = 6) for the TintElize, Imubind 822/1 and Imubind 822/1S methods were 5.5%, 5.9% and 6.8%, respectively. Between-run coefficients of variation for six aliquots per run were 2.9% for TintElize, 3.8% for Imubind 822/1, and 3.5% for Imubind 822/1S. Comparison of the average of duplicate aliquots from hyperlipidaemic patients demonstrated intraclass correlations of 0.75, 0.79 and 0.95 for TintElize vs Imubind 822/1 (n = 39), TintElize vs Imubind 822/1S (n = 39), and Imubind 822/1 vs 822/1S (n = 84), respectively. Lower 95% confidence interval limits of the intraclass correlation were 0.55, 0.48 and 0.93, respectively. Mean PAI-1 antigen values (n = 39) were 12.1, 15.8, 15.8 and 16.0 ng/ml, respectively, for TintElize, TintElize without using the quenching well, Imubind 822/1, and Imubind 822/1S. All three methods were easily performed and exhibited high correlation and reproducibility. A significant systematic bias (P < 0.006) existed between TintElize and TintElize without using the quenching well, Imubind 822/1, and Imubind 822/1S. However, there was no significant bias when TintElize without using the quenching well is compared with Imubind 822/1 (P > 0.8) and to 822/1S (P > 0.8) nor is there significant systematic bias between Imubind 822/1 and 822/1S (P > 0.3). By convention, interchangeability between assay methods suggests that the lower limit of the 95% intraclass correlation confidence interval be greater than 0.75.(ABSTRACT TRUNCATED AT 250 WORDS)
Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.
Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A
2016-03-01
Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
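The abstract reports average-measure coefficients under both the absolute-agreement and the consistency definitions. As a rough illustration of how the two differ, the sketch below computes both from a subjects-by-raters table using the standard two-way ANOVA mean squares. The function name and the toy ratings are hypothetical, and this is not the SPSS procedure the authors ran.

```python
def two_way_icc(ratings):
    """Two-way ICCs from a complete table: rows = subjects, columns = raters."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    return {
        # Consistency ignores systematic offsets between raters.
        "consistency_single": (msr - mse) / (msr + (k - 1) * mse),
        "consistency_average": (msr - mse) / msr,
        # Absolute agreement penalises rater offsets through MSC.
        "agreement_single": (msr - mse)
        / (msr + (k - 1) * mse + k * (msc - mse) / n),
        "agreement_average": (msr - mse) / (msr + (msc - mse) / n),
    }


# Hypothetical scores: rater 2 is always 2 points above rater 1.
toy = [[1, 3], [2, 4], [3, 5], [4, 6]]
for name, value in two_way_icc(toy).items():
    print(name, round(value, 3))
```

In the toy table the raters rank the subjects identically but differ by a constant, so the consistency coefficients equal 1 while the agreement coefficients fall below it. When the two definitions give nearly the same value, as in the abstract, systematic rater offsets are small.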
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2015-01-01
A latent variable modeling procedure that can be used to evaluate intraclass correlation coefficients in two-level settings with discrete response variables is discussed. The approach is readily applied when the purpose is to furnish confidence intervals at prespecified confidence levels for these coefficients in setups with binary or ordinal…
Rhee, Sun Jung; Hong, Hyun Sook; Kim, Chul-Hee; Lee, Eun Hye; Cha, Jang Gyu; Jeong, Sun Hye
2015-12-01
This study aimed to evaluate the usefulness of Acoustic Structure Quantification (ASQ; Toshiba Medical Systems Corporation, Nasushiobara, Japan) values in the diagnosis of Hashimoto thyroiditis using B-mode sonography and to identify a cutoff ASQ level that differentiates Hashimoto thyroiditis from normal thyroid tissue. A total of 186 thyroid lobes with Hashimoto thyroiditis and normal thyroid glands underwent sonography with ASQ imaging. The quantitative results were reported in an echo amplitude analysis (Cm(2)) histogram with average, mode, ratio, standard deviation, blue mode, and blue average values. Receiver operating characteristic curve analysis was performed to assess the diagnostic ability of the ASQ values in differentiating Hashimoto thyroiditis from normal thyroid tissue. Intraclass correlation coefficients of the ASQ values were obtained between 2 observers. Of the 186 thyroid lobes, 103 (55%) had Hashimoto thyroiditis, and 83 (45%) were normal. There was a significant difference between the ASQ values of Hashimoto thyroiditis glands and those of normal glands (P < .001). The ASQ values in patients with Hashimoto thyroiditis were significantly greater than those in patients with normal thyroid glands. The areas under the receiver operating characteristic curves for the ratio, blue average, average, blue mode, mode, and standard deviation were: 0.936, 0.902, 0.893, 0.855, 0.846, and 0.842, respectively. The ratio cutoff value of 0.27 offered the best diagnostic performance, with sensitivity of 87.38% and specificity of 95.18%. The intraclass correlation coefficients ranged from 0.86 to 0.94, which indicated substantial agreement between the observers. Acoustic Structure Quantification is a useful and promising sonographic method for diagnosing Hashimoto thyroiditis. Not only could it be a helpful tool for quantifying thyroid echogenicity, but it also would be useful for diagnosis of Hashimoto thyroiditis. 
© 2015 by the American Institute of Ultrasound in Medicine.
ERIC Educational Resources Information Center
Pals, Sherri L.; Beaty, Brenda L.; Posner, Samuel F.; Bull, Sheana S.
2009-01-01
Studies designed to evaluate HIV and STD prevention interventions often involve random assignment of groups such as neighborhoods or communities to study conditions (e.g., to intervention or control). Investigators who design group-randomized trials (GRTs) must take the expected intraclass correlation coefficient (ICC) into account in sample size…
Validity and reproducibility of self-reported working hours among Japanese male employees.
Imai, Teppei; Kuwahara, Keisuke; Miyamoto, Toshiaki; Okazaki, Hiroko; Nishihara, Akiko; Kabe, Isamu; Mizoue, Tetsuya; Dohi, Seitaro
2016-07-22
Working long hours is a potential health hazard. Although self-reporting of working hours in various time frames has been used in epidemiologic studies, its validity is unclear. The objective of this study was to examine the validity and reproducibility of self-reported working hours among Japanese male employees. The participants were 164 male employees of four large-scale companies in Japan. For validity, the Spearman correlation between self-reported working hours in the second survey and the working hours recorded by the company was calculated for the following four time frames: daily working hours, monthly overtime working hours in the last month, average overtime working hours in the last 3 months, and the frequency of long working months (≥45 h/month) within the last 12 months. For reproducibility, the intraclass correlation between the first (September 2013) and second surveys (December 2013) was calculated for each of the four time frames. The Spearman correlations between self-reported working hours and those based on company records were 0.74, 0.81, 0.85, and 0.89 for daily, monthly, 3-monthly, and yearly time periods, respectively. The intraclass correlations for self-reported working hours between the two questionnaire surveys were 0.63, 0.66, 0.73, and 0.87 for the respective time frames. The results of the present study among Japanese male employees suggest that the validity of self-reported working hours is high for all four time frames, whereas the reproducibility is moderate to high.
Hall, Justin M; Azar, Frederick M; Miller, Robert H; Smith, Richard; Throckmorton, Thomas W
2014-09-01
We compared accuracy and reliability of a traditional method of measurement (most cephalad vertebral spinous process that can be reached by a patient with the extended thumb) to estimates made with the shoulder in abduction to determine if there were differences between the two methods. Six physicians with fellowship training in sports medicine or shoulder surgery estimated measurements in 48 healthy volunteers. Three were randomly chosen to make estimates of both internal rotation measurements for each volunteer. An independent observer made objective measurements on lateral scoliosis films (spinous process method) or with a goniometer (abduction method). Examiners were blinded to objective measurements as well as to previous estimates. Intraclass coefficients for interobserver reliability for the traditional method averaged 0.75, indicating good agreement among observers. The difference in vertebral level estimated by the examiner and the actual radiographic level averaged 1.8 levels. The intraclass coefficient for interobserver reliability for the abduction method averaged 0.81 for all examiners, indicating near-perfect agreement. Confidence intervals indicated that estimates were an average of 8° different from the objective goniometer measurements. Pearson correlation coefficients of intraobserver reliability for the abduction method averaged 0.94, indicating near-perfect agreement within observers. Confidence intervals demonstrated repeated estimates between 5° and 10° of the original. Internal rotation estimates made with the shoulder abducted demonstrated interobserver reliability superior to that of spinous process estimates, and reproducibility was high. On the basis of this finding, we now take glenohumeral internal rotation measurements with the shoulder in abduction and use a goniometer to maximize accuracy and objectivity. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
ERIC Educational Resources Information Center
Hsu, Hsien-Yuan; Lin, Jr-Hung; Kwok, Oi-Man; Acosta, Sandra; Willson, Victor
2017-01-01
Several researchers have recommended that level-specific fit indices should be applied to detect the lack of model fit at any level in multilevel structural equation models. Although we concur with their view, we note that these studies did not sufficiently consider the impact of intraclass correlation (ICC) on the performance of level-specific…
ERIC Educational Resources Information Center
Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon
2014-01-01
Previous research has demonstrated that differential item functioning (DIF) methods that do not account for multilevel data structure could result in too frequent rejection of the null hypothesis (i.e., no DIF) when the intraclass correlation coefficient (ρ) of the studied item was the same as the ρ of the total score. The current study extended…
Margossian, Renee; Schwartz, Marcy L; Prakash, Ashwin; Wruck, Lisa; Colan, Steven D; Atz, Andrew M; Bradley, Timothy J; Fogel, Mark A; Hurwitz, Lynne M; Marcus, Edward; Powell, Andrew J; Printz, Beth F; Puchalski, Michael D; Rychik, Jack; Shirali, Girish; Williams, Richard; Yoo, Shi-Joon; Geva, Tal
2009-08-01
Assessment of the size and function of a functional single ventricle (FSV) is a key element in the management of patients after the Fontan procedure. Measurement variability of ventricular mass, volume, and ejection fraction (EF) among observers by echocardiography and cardiac magnetic resonance imaging (CMR) and their reproducibility among readers in these patients have not been described. From the 546 patients enrolled in the Pediatric Heart Network Fontan Cross-Sectional Study (mean age 11.9 +/- 3.4 years), 100 echocardiograms and 50 CMR studies were assessed for measurement reproducibility; 124 subjects with paired studies were selected for comparison between modalities. Interobserver agreement for qualitative grading of ventricular function by echocardiography was modest for left ventricular (LV) morphology (kappa = 0.42) and weak for right ventricular (RV) morphology (kappa = 0.12). For quantitative assessment, high intraclass correlation coefficients were found for echocardiographic interobserver agreement (LV 0.87 to 0.92, RV 0.82 to 0.85) of systolic and diastolic volumes, respectively. In contrast, intraclass correlation coefficients for LV and RV mass were moderate (LV 0.78, RV 0.72). The corresponding intraclass correlation coefficients by CMR were high (LV 0.96, RV 0.85). Volumes by echocardiography averaged 70% of CMR values. Interobserver reproducibility for the EF was similar for the 2 modalities. Although the absolute mean difference between modalities for the EF was small (<2%), 95% limits of agreement were wide. In conclusion, agreement between observers of qualitative FSV function by echocardiography is modest. Measurements of FSV volume by 2-dimensional echocardiography underestimate CMR measurements, but their reproducibility is high. Echocardiographic and CMR measurements of FSV EF demonstrate similar interobserver reproducibility, whereas measurements of FSV mass and LV diastolic volume are more reproducible by CMR.
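The abstract contrasts a small mean difference between modalities with wide 95% limits of agreement. A minimal sketch of that Bland-Altman calculation follows, assuming paired measurements and approximately normal differences; the function name and the ejection fraction values are hypothetical and are not data from the study.

```python
import statistics


def limits_of_agreement(method_a, method_b):
    """Return (lower, bias, upper) for paired measurements.

    Limits are bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


# Hypothetical ejection fractions (%) from two modalities.
echo = [60, 55, 62, 58]
cmr = [58, 57, 60, 58]
lower, bias, upper = limits_of_agreement(echo, cmr)
print(round(lower, 2), round(bias, 2), round(upper, 2))
```

A bias near zero with a wide interval means the two methods agree on average but can differ substantially for an individual patient.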
Validity and reliability of a new ankle dorsiflexion measurement device.
Gatt, Alfred; Chockalingam, Nachiappan
2013-08-01
The assessment of the maximum ankle dorsiflexion angle is an important clinical examination procedure. Evidence shows that the traditional goniometer is highly unreliable, and various designs of goniometers to measure the maximum ankle dorsiflexion angle rely on the application of a known force to obtain reliable results. Hence, an innovative ankle dorsiflexion measurement device was designed to make this measurement more reliable by holding the foot in a selected posture without the application of a known moment. To report on the comprehensive validity and reliability testing carried out on the new device. Following validity testing, four different trials to test reliability of the ankle dorsiflexion measurement device were performed. These trials included inter-rater and intra-rater testings with a controlled moment, intra-rater reliability testing with knees flexed and extended without a controlled moment, intra-rater testing with a patient population, and inter-rater reliability testing between four raters of varying experience without controlling moment. All raters were blinded. A series of trials to test intra-rater and inter-rater reliabilities. Intra-rater reliability intraclass correlation coefficient was 0.98 and inter-rater reliability intraclass correlation coefficient (2,1) was 0.953 with a controlled moment. With uncontrolled moment, very high reliability for intra-tester was also achieved (intraclass correlation coefficient = 0.94 with knees extended and intraclass correlation coefficient = 0.95 with knees flexed). For the trial investigating test-retest reliability with actual patients, intraclass correlation coefficient of 0.99 was obtained. In the trial investigating four different raters with uncontrolled moment, intraclass correlation coefficient of 0.91 was achieved. 
The new ankle dorsiflexion measurement device is a valid and reliable device for measuring ankle dorsiflexion in both healthy subjects and patients, with both controlled and uncontrolled moments, even by multiple raters of varying experience when the foot is dorsiflexed to its end of range of motion. An ankle dorsiflexion measuring device has been designed to increase the reliability of ankle dorsiflexion measurement and replace the traditional goniometer. While the majority of similar devices rely on application of a known moment to perform this measurement, it has been shown that this is not required with the new ankle dorsiflexion measurement device and, rather, foot posture should be taken into consideration as this affects the maximum ankle dorsiflexion angle.
Repeatability and reproducibility of corneal thickness using SOCT Copernicus HR.
Vidal, Silvia; Viqueira, Valentín; Mas, David; Domenech, Begoña
2013-05-01
The aim of this study is to determine the reliability of corneal thickness measurements derived from SOCT Copernicus HR (Fourier domain OCT). Thirty healthy eyes of 30 subjects were evaluated. One eye of each patient was chosen randomly. Images were obtained of the central (up to 2.0 mm from the corneal apex) and paracentral (2.0 to 4.0 mm) cornea. We assessed corneal thickness (central and paracentral) and epithelium thickness. The intra-observer repeatability data were analysed using the intra-class correlation coefficient (ICC) for a range of 95 per cent within-subject standard deviation (S(W)) and the within-subject coefficient of variation (C(W)). The level of agreement by Bland-Altman analysis was also represented for the study of the reproducibility between observers and agreement between methods of measurement (automatic versus manual). The mean value of the central corneal thickness (CCT) was 542.4 ± 30.1 μm (SD). There was a high intra-observer agreement, finding the best result in the central sector with an intra-class correlation coefficient of 0.99, 95 per cent CI (0.989 to 0.997) and the worst, in the minimum corneal thickness, with an intra-class correlation coefficient of 0.672, 95 per cent CI (0.417 to 0.829). Reproducibility between observers was very high. The best result was found in the central sector thickness obtained both manually and automatically with an intra-class correlation coefficient of 0.990 in both cases and the worst result in the maximum corneal thickness with an intra-class correlation coefficient of 0.827. The agreement between measurement methods was also very high with intra-class correlation coefficient greater than 0.91. On the other hand the repeatability and reproducibility for epithelial measurements was poor. Pachymetric mapping with SOCT Copernicus HR was found to be highly repeatable and reproducible. 
We found that the device lacks an appropriate ergonomic design as proper focusing of the laser beam onto the cornea for anterior segment scanning required that patients were positioned slightly farther away from the machine head-rest than in the setup for retinal imaging. © 2013 The Authors. Clinical and Experimental Optometry © 2013 Optometrists Association Australia.
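The within-subject standard deviation and within-subject coefficient of variation used in this abstract can be sketched as follows. The sketch assumes every eye has the same number of repeated readings, so the pooled within-subject variance is the simple mean of the per-eye variances. The function name and the thickness values are hypothetical.

```python
import math
import statistics


def within_subject_stats(measurements):
    """Return (s_w, c_w) from repeated readings, one inner list per subject.

    Assumes an equal number of repeats per subject.
    """
    variances = [statistics.variance(repeats) for repeats in measurements]
    s_w = math.sqrt(statistics.fmean(variances))
    all_values = [x for repeats in measurements for x in repeats]
    c_w = s_w / statistics.fmean(all_values)
    return s_w, c_w


# Hypothetical central corneal thickness readings in micrometres.
readings = [[540, 544], [530, 534], [550, 554]]
s_w, c_w = within_subject_stats(readings)
print(round(s_w, 2), round(100 * c_w, 2))
```

The second printed value is the coefficient of variation expressed as a percentage of the overall mean.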
Catenaccio, E; Caccese, J; Wakschlag, N; Fleysher, R; Kim, N; Kim, M; Buckley, T A; Stewart, W F; Lipton, R B; Kaminski, T; Lipton, M L
2016-01-01
The long-term effects of repetitive head impacts due to heading are an area of increasing concern, and exposure must be accurately measured; however, the validity of self-report of cumulative soccer heading is not known. In order to validate HeadCount, a 2-week recall questionnaire, the number of player-reported headers was compared to the number of headers observed by trained raters for a men's and a women's collegiate soccer team during an entire season of competitive play using Spearman's correlations and intraclass correlation coefficients (ICCs), and calibrated using a generalized estimating equation. The average Spearman's rho was 0.85 for men and 0.79 for women. The average ICC was 0.75 in men and 0.38 in women. The calibration analysis demonstrated that men tend to report heading accurately while women tend to overestimate. HeadCount is a valid instrument for tracking heading behaviour, but may have to be calibrated in women.
Validity and reproducibility of self-reported working hours among Japanese male employees
Imai, Teppei; Kuwahara, Keisuke; Miyamoto, Toshiaki; Okazaki, Hiroko; Nishihara, Akiko; Kabe, Isamu; Mizoue, Tetsuya; Dohi, Seitaro
2016-01-01
Objective: Working long hours is a potential health hazard. Although self-reporting of working hours in various time frames has been used in epidemiologic studies, its validity is unclear. The objective of this study was to examine the validity and reproducibility of self-reported working hours among Japanese male employees. Methods: The participants were 164 male employees of four large-scale companies in Japan. For validity, the Spearman correlation between self-reported working hours in the second survey and the working hours recorded by the company was calculated for the following four time frames: daily working hours, monthly overtime working hours in the last month, average overtime working hours in the last 3 months, and the frequency of long working months (≥45 h/month) within the last 12 months. For reproducibility, the intraclass correlation between the first (September 2013) and second surveys (December 2013) was calculated for each of the four time frames. Results: The Spearman correlations between self-reported working hours and those based on company records were 0.74, 0.81, 0.85, and 0.89 for daily, monthly, 3-monthly, and yearly time periods, respectively. The intraclass correlations for self-reported working hours between the two questionnaire surveys were 0.63, 0.66, 0.73, and 0.87 for the respective time frames. Conclusions: The results of the present study among Japanese male employees suggest that the validity of self-reported working hours is high for all four time frames, whereas the reproducibility is moderate to high. PMID:27265530
An interrater reliability study of the Braden scale in two nursing homes.
Kottner, Jan; Dassen, Theo
2008-10-01
Adequate risk assessment is essential in pressure ulcer prevention. Assessment scales were designed to support practitioners in identifying persons at pressure ulcer risk. The Braden scale is one of the most extensively studied risk assessment instruments, although the majority of studies focused on validity rather than reliability. The first aim was to measure the interrater reliability of the Braden scale and its individual items. The second aim was to study different statistical approaches regarding interrater reliability estimation. An interrater reliability study was conducted in two German nursing homes. Residents (n = 152) from 8 units were assessed twice. The raters were trained nurses with a work experience ranging from 0.5 to 30 years. Data were analysed using an overall percentage of agreement, weighted and unweighted kappa and the intraclass correlation coefficient. Differences between nurses rating the overall Braden score ranged from 0 up to 9 points. Interrater reliability expressed by the intraclass correlation coefficient ranged from 0.73 (95% CI 0.26 - 0.91) to 0.95 (95% CI 0.87 - 0.98). Calculated intraclass correlation coefficients for individual items ranged from 0.06 (95% CI -0.31 to 0.48) to 0.97 (95% CI 0.93-0.99) with the lowest values being measured for the items "sensory perception" and "nutrition". There was no association between work experience and the level of interrater reliability. With two exceptions, simple kappa-values were always lower than weighted kappa-values and intraclass correlation coefficients. Although the calculated interrater reliability coefficients for the total Braden score were high in some cases, several clinically relevant differences occurred between the nurses. Due to interrater reliability being very low for the items "sensory perception" and "nutrition", it is doubtful if their assessment contributes to any valid results. 
Weighted kappa or intraclass correlation coefficients are the most appropriate interrater reliability estimates.
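The difference between simple and weighted kappa reported in this abstract can be illustrated with a short sketch for two raters on an ordinal scale. Linear disagreement weights are an assumption here, since the abstract does not say which weighting scheme was used. The function name and the ratings are hypothetical.

```python
def cohen_kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two raters; linear weights when weighted=True."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weighted:
            return abs(i - j) / (k - 1)  # partial credit for near misses
        return 0.0 if i == j else 1.0    # any mismatch counts fully

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    return 1.0 - obs / exp


# Hypothetical item scores (1 to 4) from two nurses for eight residents.
nurse_1 = [1, 2, 3, 4, 1, 2, 3, 4]
nurse_2 = [1, 2, 3, 4, 2, 3, 4, 3]
print(round(cohen_kappa(nurse_1, nurse_2, [1, 2, 3, 4]), 3))
print(round(cohen_kappa(nurse_1, nurse_2, [1, 2, 3, 4], weighted=True), 3))
```

Because every disagreement in the toy data is only one category apart, the weighted coefficient is higher than the simple one, which mirrors the pattern the abstract describes for most items.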
Validity and reliability of the Questionnaire for Compliance with Standard Precaution
Valim, Marília Duarte; Marziale, Maria Helena Palucci; Hayashida, Miyeko; Rocha, Fernanda Ludmilla Rossi; Santos, Jair Lício Ferreira
2015-01-01
ABSTRACT OBJECTIVE: To evaluate the validity and reliability of the Questionnaire for Compliance with Standard Precaution for nurses. METHODS: This methodological study was conducted with 121 nurses from health care facilities in Sao Paulo’s countryside, who were represented by two high-complexity and by three average-complexity health care facilities. Internal consistency was calculated using Cronbach’s alpha and stability was calculated by the intraclass correlation coefficient, through test-retest. Convergent, discriminant, and known-groups construct validity techniques were conducted. RESULTS: The questionnaire was found to be reliable (Cronbach’s alpha: 0.80; intraclass correlation coefficient: 0.97). In regard to the convergent and discriminant construct validity, strong correlation was found between compliance with standard precautions, the perception of a safe environment, and the smaller perception of obstacles to follow such precautions (r = 0.614 and r = 0.537, respectively). The nurses who were trained on the standard precautions and worked in the health care facilities of higher complexity were shown to comply more (p = 0.028 and p = 0.006, respectively). CONCLUSIONS: The Brazilian version of the Questionnaire for Compliance with Standard Precaution was shown to be valid and reliable. Further investigation must be conducted with nurse samples that are more representative of the Brazilian reality. The use of the questionnaire may support the creation of educational measures considering the possible gaps that can be identified, focusing on the workers’ health and on the patients’ safety. PMID:26759967
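Internal consistency in this abstract is summarised with Cronbach's alpha. A minimal sketch of the usual formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), is given below. The function name and the responses are hypothetical and are not the questionnaire data.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha; responses holds one list of item scores per person."""
    k = len(responses[0])  # number of items
    item_variances = [
        statistics.variance([person[j] for person in responses])
        for j in range(k)
    ]
    total_variance = statistics.variance([sum(person) for person in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five respondents to a four-item scale.
answers = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(answers), 3))
```

Alpha rises when items vary together across respondents, because the variance of the total score then exceeds the sum of the item variances by a wider margin.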
Planning to avoid trouble in the operating room: experts' formulation of the preoperative plan.
Zilbert, Nathan R; St-Martin, Laurent; Regehr, Glenn; Gallinger, Steven; Moulton, Carol-Anne
2015-01-01
The purpose of this study was to capture the preoperative plans of expert hepato-pancreato-biliary (HPB) surgeons with the goal of finding consistent aspects of the preoperative planning process. HPB surgeons were asked to think aloud when reviewing 4 preoperative computed tomography scans of patients with distal pancreatic tumors. The imaging features they identified and the planned actions they proposed were tabulated. Surgeons viewed the tabulated list of imaging features for each case and rated the relevance of each feature for their subsequent preoperative plan. Average rater intraclass correlation coefficients were calculated for each type of data collected (imaging features detected, planned actions reported, and relevance of each feature) to establish whether the surgeons were consistent with one another in their responses. Average rater intraclass correlation coefficient values greater than 0.7 were considered indicative of consistency. Division of General Surgery, University of Toronto. HPB surgeons affiliated with the University of Toronto. A total of 11 HPB surgeons thought aloud when reviewing 4 computed tomography scans. Surgeons were consistent in the imaging features they detected but inconsistent in the planned actions they reported. Of the HPB surgeons, 8 completed the assessment of feature relevance. For 3 of the 4 cases, the surgeons were consistent in rating the relevance of specific imaging features on their preoperative plans. These results suggest that HPB surgeons are consistent in some aspects of the preoperative planning process but not others. The findings further our understanding of the preoperative planning process and will guide future research on the best ways to incorporate the teaching and evaluation of preoperative planning into surgical training. Copyright © 2014 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
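The "average rater" coefficient used as the consistency criterion here is tied to the single-rater coefficient by the Spearman-Brown relation, which holds for the one-way and the consistency forms of the intraclass correlation. The sketch below shows that step-up. The function name and the single-rater value are hypothetical, and the abstract does not report which model the authors fitted.

```python
def average_rater_icc(single_rater_icc, n_raters):
    """Spearman-Brown step-up from a single-rater ICC to the mean of k raters."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    return (n_raters * single_rater_icc) / (
        1 + (n_raters - 1) * single_rater_icc
    )


# A modest single-rater value can pass a 0.7 threshold once 11 raters are averaged.
print(round(average_rater_icc(0.2, 11), 3))
```

This is why an average-rater value above 0.7 describes the reliability of the panel's mean judgment and not the agreement expected between any two individual surgeons.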
Sun, Yi; Arning, Martin; Bochmann, Frank; Börger, Jutta; Heitmann, Thomas
2018-06-01
The Occupational Safety and Health Monitoring and Assessment Tool (OSH-MAT) is a practical instrument that is currently used in the German woodworking and metalworking industries to monitor safety conditions at workplaces. The 12-item scoring system has three subscales rating technical, organizational, and personnel-related conditions in a company. Each item has a rating value ranging from 1 to 9, with higher values indicating a higher standard of safety conditions. The reliability of this instrument was evaluated in a cross-sectional survey among 128 companies and its validity among 30,514 companies. The inter-rater reliability of the instrument was examined independently and simultaneously by two well-trained safety engineers. Agreement between the double ratings was quantified by the intraclass correlation coefficient and absolute agreement of the rating values. The content validity of the OSH-MAT was evaluated by quantifying the association between OSH-MAT values and 5-year average injury rates by Poisson regression analysis adjusted for the size of the companies and industrial sectors. The construct validity of OSH-MAT was examined by principal component factor analysis. Our analysis indicated good to very good inter-rater reliability (intraclass correlation coefficient = 0.64-0.74) of OSH-MAT values with an absolute agreement of between 72% and 81%. Factor analysis identified three component subscales that exactly matched the theoretical structure of this instrument. The Poisson regression analysis demonstrated a statistically significant exposure-response relationship between OSH-MAT values and the 5-year average injury rates. These analyses indicate that OSH-MAT is a valid and reliable instrument that can be used effectively to monitor safety conditions at workplaces.
Puntillo, Kathleen A; Neuhaus, John; Arai, Shoshana; Paul, Steven M; Gropper, Michael A; Cohen, Neal H; Miaskowski, Christine
2012-10-01
Determine levels of agreement among intensive care unit patients and their family members, nurses, and physicians (proxies) regarding patients' symptoms and compare levels of mean intensity (i.e., the magnitude of a symptom sensation) and distress (i.e., the degree of emotionality that a symptom engenders) of symptoms among patients and proxy reporters. Prospective study of proxy reporters of symptoms in seriously ill patients. Two intensive care units in a tertiary medical center in the Western United States. Two hundred and forty-five intensive care unit patients, 243 family members, 103 nurses, and 92 physicians. None. On the basis of the magnitude of intraclass correlation coefficients, where coefficients from .35 to .78 are considered to be appropriately robust, correlation coefficients between patients' and family members' ratings met this criterion (≥.35) for intensity in six of ten symptoms. No intensity ratings between patients and nurses had intraclass correlation coefficients >.32. Three symptoms had intensity correlation coefficients of ≥.36 between patients' and physicians' ratings. Correlation coefficients between patients and family members were >.40 for five symptom-distress ratings. No symptoms had distress correlation coefficients of ≥.28 between patients' and nurses' ratings. Two symptoms had symptom-distress correlation coefficients between patients' and physicians' ratings at >.39. Family members, nurses, and physicians reported higher symptom-intensity scores than patients did for 80%, 60%, and 60% of the symptoms, respectively. Family members, nurses, and physicians reported higher symptom-distress scores than patients did for 90%, 70%, and 80% of the symptoms, respectively. Patient-family intraclass correlation coefficients were sufficiently close for us to consider using family members to help assess intensive care unit patients' symptoms. 
Relatively low intraclass correlation coefficients between intensive care unit clinicians' and patients' symptom ratings indicate that some proxy raters overestimate whereas others underestimate patients' symptoms. Proxy overestimation of patients' symptom scores warrants further study because this may influence decisions about treating patients' symptoms.
Mejia, Amanda F; Nebel, Mary Beth; Barber, Anita D; Choe, Ann S; Pekar, James J; Caffo, Brian S; Lindquist, Martin A
2018-05-15
Reliability of subject-level resting-state functional connectivity (FC) is determined in part by the statistical techniques employed in its estimation. Methods that pool information across subjects to inform estimation of subject-level effects (e.g., Bayesian approaches) have been shown to enhance reliability of subject-level FC. However, fully Bayesian approaches are computationally demanding, while empirical Bayesian approaches typically rely on using repeated measures to estimate the variance components in the model. Here, we avoid the need for repeated measures by proposing a novel measurement error model for FC describing the different sources of variance and error, which we use to perform empirical Bayes shrinkage of subject-level FC towards the group average. In addition, since the traditional intra-class correlation coefficient (ICC) is inappropriate for biased estimates, we propose a new reliability measure denoted the mean squared error intra-class correlation coefficient (ICC_MSE) to properly assess the reliability of the resulting (biased) estimates. We apply the proposed techniques to test-retest resting-state fMRI data on 461 subjects from the Human Connectome Project to estimate connectivity between 100 regions identified through independent components analysis (ICA). We consider both correlation and partial correlation as the measure of FC and assess the benefit of shrinkage for each measure, as well as the effects of scan duration. We find that shrinkage estimates of subject-level FC exhibit substantially greater reliability than traditional estimates across various scan durations, even for the most reliable connections and regardless of connectivity measure. Additionally, we find partial correlation reliability to be highly sensitive to the choice of penalty term, and to be generally worse than that of full correlations except for certain connections and a narrow range of penalty values.
This suggests that the penalty needs to be chosen carefully when using partial correlations. Copyright © 2018. Published by Elsevier Inc.
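The idea of shrinking subject-level estimates toward the group average can be illustrated with a generic sketch. This is not the measurement error model proposed in the article: here the within-subject noise variance is simply assumed to be supplied by the caller, and the function name `shrink_toward_group_mean` and its parameters are hypothetical.

```python
import numpy as np

def shrink_toward_group_mean(estimates, noise_var):
    """Generic empirical-Bayes-style shrinkage of noisy subject-level values.

    estimates : 1-D sequence of subject-level values (e.g. one connection's FC).
    noise_var : assumed within-subject (measurement error) variance; how to
                obtain it is outside the scope of this sketch.
    Returns the shrunken estimates and the shrinkage weight.
    """
    x = np.asarray(estimates, dtype=float)
    group_mean = x.mean()
    total_var = x.var(ddof=1)
    # Method-of-moments estimate of the true between-subject variance.
    signal_var = max(total_var - noise_var, 0.0)
    denom = noise_var + signal_var
    # Weight 1 means "use only the group mean"; weight 0 means "no shrinkage".
    lam = 1.0 if denom == 0 else noise_var / denom
    return lam * group_mean + (1.0 - lam) * x, lam
```

The noisier the subject-level values are relative to the true between-subject spread, the closer the weight is to 1 and the more each value is pulled toward the group average.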
Kistner, Emily O; Muller, Keith E
2004-09-01
Intraclass correlation and Cronbach's alpha are widely used to describe reliability of tests and measurements. Even with Gaussian data, exact distributions are known only for compound symmetric covariance (equal variances and equal correlations). Recently, large sample Gaussian approximations were derived for the distribution functions. New exact results allow calculating the exact distribution function and other properties of intraclass correlation and Cronbach's alpha, for Gaussian data with any covariance pattern, not just compound symmetry. Probabilities are computed in terms of the distribution function of a weighted sum of independent chi-square random variables. New F approximations for the distribution functions of intraclass correlation and Cronbach's alpha are much simpler and faster to compute than the exact forms. Assuming the covariance matrix is known, the approximations typically provide sufficient accuracy, even with as few as ten observations. Either the exact or approximate distributions may be used to create confidence intervals around an estimate of reliability. Monte Carlo simulations led to a number of conclusions. Correctly assuming that the covariance matrix is compound symmetric leads to accurate confidence intervals, as was expected from previously known results. However, assuming and estimating a general covariance matrix produces somewhat optimistically narrow confidence intervals with 10 observations. Increasing sample size to 100 gives essentially unbiased coverage. Incorrectly assuming compound symmetry leads to pessimistically large confidence intervals, with pessimism increasing with sample size. In contrast, incorrectly assuming general covariance introduces only a modest optimistic bias in small samples. Hence the new methods seem preferable for creating confidence intervals, except when compound symmetry definitely holds.
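As a small illustration of the compound symmetric case mentioned above, Cronbach's alpha can be computed directly from a covariance matrix. The sketch below assumes a known covariance matrix; the helper names `cronbach_alpha_from_cov` and `compound_symmetric_cov` are hypothetical and do not reproduce the exact or approximate distribution results of the article.

```python
import numpy as np

def cronbach_alpha_from_cov(cov):
    """Cronbach's alpha for k items with covariance matrix `cov` (k x k)."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    # trace = sum of item variances; cov.sum() = variance of the total score.
    return k / (k - 1) * (1.0 - np.trace(cov) / cov.sum())

def compound_symmetric_cov(k, variance, rho):
    """Covariance with equal variances and equal correlations (compound symmetry)."""
    return variance * ((1.0 - rho) * np.eye(k) + rho * np.ones((k, k)))
```

Under compound symmetry, alpha reduces to k·rho / (1 + (k − 1)·rho), so it depends only on the number of items and the common correlation.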
Evaluating handoffs in the context of a communication framework.
Hasan, Hani; Ali, Fadwa; Barker, Paul; Treat, Robert; Peschman, Jacob; Mohorek, Matthew; Redlich, Philip; Webb, Travis
2017-03-01
The implementation of mandated restrictions in resident duty hours has led to increased handoffs for patient care and thus more opportunities for errors during transitions of care. Much of the current handoff literature is empiric, with experts recommending the study of handoffs within an established framework. A prospective, single-institution study was conducted evaluating the process of handoffs for the care of surgical patients in the context of a published communication framework. Evaluation tools for the source, receiver, and observer were developed to identify factors impacting the handoff process, and inter-rater correlations were assessed. Data analysis was generated with Pearson/Spearman correlations and multivariate linear regressions. Rater consistency was assessed with intraclass correlations. A total of 126 handoffs were observed. Evaluations were completed by 1 observer (N = 126), 2 observers (N = 23), 2 receivers (N = 39), 1 receiver (N = 82), and 1 source (N = 78). An average (±standard deviation) service handoff included 9.2 (±4.6) patients, lasted 9.1 (±5.4) minutes, and had 4.7 (±3.4) distractions recorded by the observer. The source and receiver(s) recognized distractions in >67% of handoffs, with the most common internal and external distractions being fatigue (60% of handoffs) and extraneous staff entering/exiting the room (31%), respectively. Teams with more patients spent less time per individual patient handoff (r = -0.298; P = .001). Statistically significant intraclass correlations (P ≤ .05) were moderate between observers (r ≥ 0.4) but not receivers (r < 0.4). Intraclass correlation values between different types of raters were inconsistent (P > .05). The quality of the handoff process was affected negatively by presence of active electronic devices (β = -0.565; P = .005), number of teaching discussions (β = -0.417; P = .048), and a sense of hierarchy between source and receiver (β = -0.309; P = .002). 
Studying the handoff process within an established framework highlights factors that impair communication. Internal and external distractions are common during handoffs and along with the working relationship between the source and receiver impact the quality of the handoff process. This information allows further study and targeted interventions of the handoff process to improve overall effectiveness and patient safety of the handoff. Copyright © 2016 Elsevier Inc. All rights reserved.
Bredow, J; Oppermann, J; Scheyerer, M J; Gundlfinger, K; Neiss, W F; Budde, S; Floerkemeier, T; Eysel, P; Beyer, F
2015-05-01
Radiological study. To assess standard values, intra- and interobserver reliability and reproducibility of sacral slope (SS) and lumbar lordosis (LL) and the correlation of these parameters in patients with lumbar spinal stenosis (LSS). Anteroposterior and lateral X-rays of the lumbar spine of 102 patients with LSS were included in this retrospective, radiologic study. Measurements of SS and LL were carried out by five examiners. Intraobserver correlation and correlation between LL and SS were calculated with Pearson's r linear correlation coefficient and intraclass correlation coefficients (ICC) were calculated for inter- and intraobserver reliability. In addition, patients were examined in subgroups with respect to previous surgery and the current therapy. Lumbar lordosis averaged 45.6° (range 2.5°-74.9°; SD 14.2°); intraobserver correlation was between Pearson r = 0.93 and 0.98. The measurement of SS averaged 35.3° (range 13.8°-66.9°; SD 9.6°); intraobserver correlation was between Pearson r = 0.89 and 0.96. Intraobserver reliability ranged from 0.966 to 0.992 ICC in LL measurements and 0.944-0.983 ICC in SS measurements. There was an interobserver reliability ICC of 0.944 in LL and 0.990 in SS. Correlation between LL and SS averaged r = 0.79. No statistically significant differences were observed between the analyzed subgroups. Manual measurement of LL and SS in patients with LSS on lateral radiographs is easily performed with excellent intra- and interobserver reliability. Correlation between LL and SS is very high. Differences between patients with and without previous decompression were not statistically significant.
Pulmonary disease in cystic fibrosis: assessment with chest CT at chest radiography dose levels.
Ernst, Caroline W; Basten, Ines A; Ilsen, Bart; Buls, Nico; Van Gompel, Gert; De Wachter, Elke; Nieboer, Koenraad H; Verhelle, Filip; Malfroot, Anne; Coomans, Danny; De Maeseneer, Michel; de Mey, Johan
2014-11-01
To investigate a computed tomographic (CT) protocol with iterative reconstruction at conventional radiography dose levels for the assessment of structural lung abnormalities in patients with cystic fibrosis (CF). In this institutional review board-approved study, 38 patients with CF (age range, 6-58 years; 21 patients <18 years and 17 patients >18 years) underwent investigative CT (at minimal exposure settings combined with iterative reconstruction) as a replacement of yearly follow-up posteroanterior chest radiography. Verbal informed consent was obtained from all patients or their parents. CT images were randomized and rated independently by two radiologists with use of the Bhalla scoring system. In addition, mosaic perfusion was evaluated. As reference, the previous available conventional chest CT scan was used. Differences in Bhalla scores were assessed with the χ² test and intraclass correlation coefficients (ICCs). Radiation doses for CT and radiography were assessed for adults (>18 years) and children (<18 years) separately by using technical dose descriptors and estimated effective dose. Differences in dose were assessed with the Mann-Whitney U test. The median effective dose for the investigative protocol was 0.04 mSv (95% confidence interval [CI]: 0.034 mSv, 0.10 mSv) for children and 0.05 mSv (95% CI: 0.04 mSv, 0.08 mSv) for adults. These doses were much lower than those with conventional CT (median: 0.52 mSv [95% CI: 0.31 mSv, 3.90 mSv] for children and 1.12 mSv [95% CI: 0.57 mSv, 3.15 mSv] for adults) and of the same order of magnitude as those for conventional radiography (median: 0.012 mSv [95% CI: 0.006 mSv, 0.022 mSv] for children and 0.012 mSv [95% CI: 0.005 mSv, 0.031 mSv] for adults). All images were rated at least as diagnostically acceptable.
Very good agreement was found in overall Bhalla score (ICC, 0.96) with regard to the severity of bronchiectasis (ICC, 0.87) and sacculations and abscesses (ICC, 0.84). Interobserver agreement was excellent (ICC, 0.86-1). For patients with CF, a dedicated chest CT protocol can replace the two yearly follow-up chest radiographic examinations without major dose penalty and with similar diagnostic quality compared with conventional CT.
Flores-Mir, Carlos; Burgess, Corr A; Champney, Mitchell; Jensen, Robert J; Pitcher, Micheal R; Major, Paul W
2006-01-01
The aim of this study was to assess the correlation between the Fishman maturation prediction method (FMP) and the cervical vertebral maturation (CVM) method for skeletal maturation stage determination. Hand-wrist and lateral cephalograms from 79 subjects (52 females and 27 males) were used. Hand-wrist radiographs were analyzed using the FMP to determine skeletal maturation level (advanced, average, or delayed) and stage (relative position of the individual in the pubertal growth curve). Cervical vertebrae (C2, C3, and C4) outlines obtained from lateral cephalograms were analyzed using the CVM to determine skeletal maturation stage. Intraexaminer reliability (intraclass correlation coefficient [ICC]) for both methods was calculated from 10 triplicate hand-wrist and lateral cephalograms from the same patients. An ICC of 0.985 for FMP and an ICC of 0.889 for CVM were obtained. A Spearman correlation value of 0.72 (P < .001) was found between the skeletal maturation stages of both methods. When the sample was subgrouped according to skeletal maturation level, the following correlation values were found: for early mature adolescents 0.73, for average mature adolescents 0.70, and for late mature adolescents 0.87. All these correlation values were statistically different from zero (P < .024). Correlation values between both skeletal maturation methods were moderately high. This may be high enough to use either of the methods interchangeably for research purposes but not for the assessment of individual patients. Skeletal level influences the correlation values and, therefore, it should be considered whenever possible.
Web-Based Assessment of Mental Well-Being in Early Adolescence: A Reliability Study.
Hamann, Christoph; Schultze-Lutter, Frauke; Tarokh, Leila
2016-06-15
The ever-increasing use of the Internet among adolescents represents an emerging opportunity for researchers to gain access to larger samples, which can be queried over several years longitudinally. Among adolescents, young adolescents (ages 11 to 13 years) are of particular interest to clinicians as this is a transitional stage, during which depressive and anxiety symptoms often emerge. However, it remains unclear whether these youngest adolescents can accurately answer questions about their mental well-being using a Web-based platform. The aim of the study was to examine the accuracy of responses obtained from Web-based questionnaires by comparing Web-based with paper-and-pencil versions of depression and anxiety questionnaires. The primary outcome was the score on the depression and anxiety questionnaires under two conditions: (1) paper-and-pencil and (2) Web-based versions. Twenty-eight adolescents (aged 11-13 years, mean age 12.78 years and SD 0.78; 18 females, 64%) were randomly assigned to complete either the paper-and-pencil or the Web-based questionnaire first. Intraclass correlation coefficients (ICCs) were calculated to measure intrarater reliability. Intraclass correlation coefficients were calculated separately for depression (Children's Depression Inventory, CDI) and anxiety (Spence Children's Anxiety Scale, SCAS) questionnaires. On average, it took participants 17 minutes (SD 6) to answer 116 questions online. Intraclass correlation coefficient analysis revealed high intrarater reliability when comparing Web-based with paper-and-pencil responses for both CDI (ICC=.88; P<.001) and the SCAS (ICC=.95; P<.001). According to published criteria, both of these values are in the "almost perfect" category indicating the highest degree of reliability. The results of the study show an excellent reliability of Web-based assessment in 11- to 13-year-old children as compared with the standard paper-pencil assessment. 
Furthermore, we found that Web-based assessments with young adolescents are highly feasible, with all enrolled participants completing the Web-based form. As early adolescence is a time of remarkable social and behavioral changes, these findings open up new avenues for researchers from diverse fields who are interested in studying large samples of young adolescents over time.
Ponrartana, Skorn; Andrade, Kristine E; Wren, Tishya A L; Ramos-Platt, Leigh; Hu, Houchun H; Bluml, Stefan; Gilsanz, Vicente
2014-06-01
The purpose of this study was to assess the repeatability of water-fat MRI and diffusion-tensor imaging (DTI) as quantitative biomarkers of pediatric lower extremity skeletal muscle. MRI at 3 T of a randomly selected thigh and lower leg of seven healthy children was studied using water-fat separation and DTI techniques. Muscle-fat fraction, apparent diffusion coefficient (ADC), and fractional anisotropy (FA) values were calculated. Test-retest and interrater repeatability were assessed by calculating the Pearson correlation coefficient, intraclass correlation coefficient, and Bland-Altman analysis. Bland-Altman plots show that the mean difference between test-retest and interrater measurements of muscle-fat fraction, ADC, and FA was near 0. The correlation coefficients and intraclass correlation coefficients were all between 0.88 and 0.99 (p < 0.05), suggesting excellent reliability of the measurements. Muscle-fat fraction measurements from water-fat MRI exhibited the highest intraclass correlation coefficient. Interrater agreement was consistently better than test-retest comparisons. Water-fat MRI and DTI measurements in lower extremity skeletal muscles are objective repeatable biomarkers in children. This knowledge should aid in the understanding of the number of participants needed in clinical trials when using these determinations as an outcome measure to noninvasively monitor neuromuscular disease.
Demers, Marie-Elaine; Dubé, Samuel; Bourdages, Mélodie; Gasse, Cedric; Boutin, Amélie; Girard, Mario; Bujold, Emmanuel; Demers, Suzanne
2018-01-10
To compare the first-trimester uterine artery pulsatility index (PI) measured by abdominal and transvaginal ultrasound (US). We performed a prospective study of singleton pregnant women recruited at 11 to 13 weeks' gestation. The mean uterine artery PI was obtained by abdominal followed by transvaginal US. The mean of the left and right uterine artery PIs was used, and differences between approaches were computed. The intraclass correlation coefficient and a Bland-Altman plot were used to compare the two approaches. Data were available for 940 participants, including 928 (99%) with uterine artery PIs obtained on both uterine sides. The mean uterine artery PI decreased with gestational age in both approaches (P < .001). We observed a moderate correlation between abdominal and transvaginal mean uterine artery PIs (intraclass correlation coefficient, 0.72; 95% confidence interval, 0.69 to 0.75). Values obtained by abdominal US (median, 1.70, interquartile range, 1.35 to 2.09) were greater than those obtained by transvaginal US (median, 1.65; interquartile range, 1.37 to 1.99). There was a significant increase in differences as average measurements became higher (P < .01). The first-trimester mean uterine artery PI decreases with gestational age in both approaches. Abdominal US could be associated with greater uterine artery PI values than transvaginal US, especially at higher measurements. The first-trimester uterine artery PI for prediction of adverse perinatal outcomes should be adjusted for gestational age and possibly for the US approach. © 2018 by the American Institute of Ultrasound in Medicine.
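A Bland-Altman comparison of two measurement approaches, including a check for differences that grow with the size of the measurement, might be sketched as follows. The function name `bland_altman`, its return keys, and the 1.96 multiplier for the limits of agreement are assumptions of this illustration; it is not the analysis code of the study.

```python
import numpy as np

def bland_altman(method_a, method_b):
    """Bias, 95% limits of agreement, and trend of differences for paired data."""
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b                   # per-subject difference between methods
    pair_mean = (a + b) / 2.0      # per-subject average of the two methods
    bias = diff.mean()
    sd = diff.std(ddof=1)
    # Slope of differences on averages: a nonzero slope suggests that the
    # disagreement changes with the magnitude of the measurement.
    slope, _intercept = np.polyfit(pair_mean, diff, 1)
    return {
        "bias": float(bias),
        "loa_lower": float(bias - 1.96 * sd),
        "loa_upper": float(bias + 1.96 * sd),
        "slope": float(slope),
    }
```

A positive slope in such an analysis would correspond to the pattern described above, where differences between approaches increase as average measurements become higher.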
Rey-Martinez, Jorge; Pérez-Fernández, Nicolás
2016-12-01
The proposed validation goal of an intra-class correlation coefficient of 0.9 was reached in this study. On the basis of these results, we consider the developed software (RombergLab) to be validated balance assessment software. The reliability of this software depends on the technical specifications of the force platform used. To develop and validate posturography software and share its source code under open source terms. Prospective non-randomized validation study: 20 consecutive adults underwent two balance assessment tests; six-condition posturography was performed using clinically approved software and a force platform, and the same conditions were measured using the newly developed open source software with a low-cost force platform. The intra-class correlation index of the sway area obtained from the center of pressure variations in both devices for the six conditions was the main variable used for validation. Excellent concordance between RombergLab and the clinically approved force platform was obtained (intra-class correlation coefficient = 0.94). A Bland-Altman concordance plot was also obtained. The source code used to develop RombergLab was published under open source terms.
Nkenke, Emeka; Lehner, Bernhard; Kramer, Manuel; Haeusler, Gerd; Benz, Stefanie; Schuster, Maria; Neukam, Friedrich W; Vairaktaris, Eleftherios G; Wurm, Jochen
2006-03-01
To assess measurement errors of a novel technique for the three-dimensional determination of the degree of facial symmetry in patients suffering from unilateral cleft lip and palate malformations. Technical report, reliability study. Cleft Lip and Palate Center of the University of Erlangen-Nuremberg, Erlangen, Germany. The three-dimensional facial surface data of five 10-year-old unilateral cleft lip and palate patients were subjected to the analysis. Distances, angles, surface areas, and volumes were assessed twice. Calculations were made for method error, intraclass correlation coefficient, and repeatability of the measurements of distances, angles, surface areas, and volumes. The method errors were less than 1 mm for distances and less than 1.5 degrees for angles. The intraclass correlation coefficients showed values greater than .90 for all parameters. The repeatability values were comparable for cleft and noncleft sides. The small method errors, high intraclass correlation coefficients, and comparable repeatability values for cleft and noncleft sides reveal that the new technique is appropriate for clinical use.
Thoma, Brent; Sebok-Syer, Stefanie S; Colmers-Gray, Isabelle; Sherbino, Jonathan; Ankel, Felix; Trueger, N Seth; Grock, Andrew; Siemens, Marshall; Paddock, Michael; Purdy, Eve; Kenneth Milne, William; Chan, Teresa M
2018-01-30
Construct: We investigated the quality of emergency medicine (EM) blogs as educational resources. Online medical education resources such as blogs are increasingly used by EM trainees and clinicians. However, quality evaluations of these resources using gestalt are unreliable. We investigated the reliability of two previously derived quality evaluation instruments for blogs. Sixty English-language EM websites that published clinically oriented blog posts between January 1 and February 24, 2016, were identified. A random number generator selected 10 websites, and the 2 most recent clinically oriented blog posts from each site were evaluated using gestalt, the Academic Life in Emergency Medicine (ALiEM) Approved Instructional Resources (AIR) score, and the Medical Education Translational Resources: Impact and Quality (METRIQ-8) score, by a sample of medical students, EM residents, and EM attendings. Each rater evaluated all 20 blog posts with gestalt and 15 of the 20 blog posts with the ALiEM AIR and METRIQ-8 scores. Pearson's correlations were calculated between the average scores for each metric. Single-measure intraclass correlation coefficients (ICCs) evaluated the reliability of each instrument. Our study included 121 medical students, 88 EM residents, and 100 EM attendings who completed ratings. The average gestalt rating of each blog post correlated strongly with the average scores for ALiEM AIR (r = .94) and METRIQ-8 (r = .91). Single-measure ICCs were fair for gestalt (0.37, IQR 0.25-0.56), ALiEM AIR (0.41, IQR 0.29-0.60) and METRIQ-8 (0.40, IQR 0.28-0.59). The average scores of each blog post correlated strongly with gestalt ratings. However, neither ALiEM AIR nor METRIQ-8 showed higher reliability than gestalt. Improved reliability may be possible through rater training and instrument refinement.
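The gap between a single-measure ICC and the reliability of an average over several raters can be illustrated with the Spearman-Brown formula. The sketch below assumes interchangeable raters; the function names and the target reliability of 0.8 are illustrative choices, not values taken from the study.

```python
import math

def average_measure_icc(single_icc, k):
    """Spearman-Brown step-up: reliability of the mean of k raters."""
    return k * single_icc / (1.0 + (k - 1) * single_icc)

def raters_needed(single_icc, target):
    """Smallest number of raters whose averaged rating reaches `target`."""
    k = target * (1.0 - single_icc) / (single_icc * (1.0 - target))
    return math.ceil(k)
```

For example, `raters_needed(0.37, 0.8)` applies this formula to a single-measure ICC of 0.37, the gestalt value reported above, under the stated assumptions.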
Barnett, Lisa M; Ridgers, Nicola D; Zask, Avigdor; Salmon, Jo
2015-01-01
To determine reliability and face validity of an instrument to assess young children's perceived fundamental movement skill competence. Validation and reliability study. A pictorial instrument based on the Test of Gross Motor Development-2 assessed perceived locomotor (six skills) and object control (six skills) competence using the format and item structure from the physical competence subscale of the Pictorial Scale of Perceived Competence and Acceptance for Young Children. Sample 1 completed object control items in May (n=32) and locomotor items in October 2012 (n=23) at two time points seven days apart. At the end of the test-retest, children were asked what they understood to be happening in each picture to determine face validity. Sample 2 (n=58) completed 12 items in November 2012 on a single occasion to test internal reliability only. Sample 1 children were aged 5-7 years (M=6.0, SD=0.8) at object control assessment and 5-8 years at locomotor assessment (M=6.5, SD=0.9). Sample 2 children were aged 6-8 years (M=7.2, SD=0.73). Intra-class correlations assessed in Sample 1 children were excellent for object control (intra-class correlation=0.78), locomotor (intra-class correlation=0.82) and all 12 skills (intra-class correlation=0.83). Face validity was acceptable. Internal consistency was adequate in both samples for each subscale and all 12 skills (alpha range 0.60-0.81). This study has provided preliminary evidence for instrument reliability and face validity. This enables future alignment between the measurement of perceived and actual fundamental movement skill competence in young children. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.
Y-balance test: a reliability study involving multiple raters.
Shaffer, Scott W; Teyhen, Deydre S; Lorenson, Chelsea L; Warren, Rick L; Koreerat, Christina M; Straseske, Crystal A; Childs, John D
2013-11-01
The Y-balance test (YBT) is one of the few field expedient tests that have shown predictive validity for injury risk in an athletic population. However, analysis of the YBT in a heterogeneous population of active adults (e.g., military, specific occupations) involving multiple raters with limited experience in a mass screening setting is lacking. The primary purpose of this study was to determine interrater test-retest reliability of the YBT in a military setting using multiple raters. Sixty-four service members (53 males, 11 females) actively conducting military training volunteered to participate. Interrater test-retest reliability of the maximal reach had intraclass correlation coefficients (2,1) of 0.80 to 0.85 with a standard error of measurement ranging from 3.1 to 4.2 cm for the 3 reach directions (anterior, posteromedial, and posterolateral). Interrater test-retest reliability of the average reach of 3 trials had an intraclass correlation coefficient (2,3) range of 0.85 to 0.93 with an associated standard error of measurement ranging from 2.0 to 3.5 cm. The YBT showed good interrater test-retest reliability with an acceptable level of measurement error among multiple raters screening active duty service members. In addition, 31.3% (n = 20 of 64) of participants exhibited an anterior reach asymmetry of >4 cm, suggesting impaired balance symmetry and potentially increased risk for injury. Reprint & Copyright © 2013 Association of Military Surgeons of the U.S.
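The ICC(2,1) and ICC(2,k) forms named above can be computed from a subjects-by-raters table using the two-way random effects mean squares. The sketch below is a minimal illustration with hypothetical function names; the standard error of measurement is taken as SD × sqrt(1 − ICC), which is a common convention assumed here rather than one confirmed by the abstract.

```python
import numpy as np

def icc_two_way_random(ratings):
    """Return (ICC(2,1), ICC(2,k)) for an n-subjects x k-raters array."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between raters
    ss_total = ((x - grand) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    # Single-rating reliability (absolute agreement).
    icc_single = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    # Reliability of the mean of the k ratings.
    icc_average = (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)
    return icc_single, icc_average

def standard_error_of_measurement(sd, icc):
    """SEM under the assumed convention SEM = SD * sqrt(1 - ICC)."""
    return sd * np.sqrt(1.0 - icc)
```

Because averaging reduces the contribution of rater and error variance, the average-measure coefficient is typically higher than the single-measure one, which matches the direction of the (2,1) and (2,3) ranges reported above.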
Evaluation of different recall periods for the US National Cancer Institute's PRO-CTCAE.
Mendoza, Tito R; Dueck, Amylou C; Bennett, Antonia V; Mitchell, Sandra A; Reeve, Bryce B; Atkinson, Thomas M; Li, Yuelin; Castro, Kathleen M; Denicoff, Andrea; Rogak, Lauren J; Piekarz, Richard L; Cleeland, Charles S; Sloan, Jeff A; Schrag, Deborah; Basch, Ethan
2017-06-01
The US National Cancer Institute recently developed the PRO-CTCAE (Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events). PRO-CTCAE is a library of questions for clinical trial participants to self-report symptomatic adverse events (e.g. nausea). The objective of this study is to inform evidence-based selection of a recall period when PRO-CTCAE is included in a trial. We evaluated differences between 1-, 2-, 3-, and 4-week recall periods, using daily reporting as the reference. English-speaking patients with cancer receiving chemotherapy and/or radiotherapy were enrolled at four US cancer centers and affiliated community clinics. Participants completed 27 PRO-CTCAE items electronically daily for 28 days, and then weekly over 4 weeks, using 1-, 2-, 3-, and 4-week recall periods. For each recall period, mean differences, effect sizes, and intraclass correlation coefficients were calculated to evaluate agreement between the maximum of daily ratings and the corresponding ratings obtained using longer recall periods (e.g. maximum of daily scores over 7 days vs 1-week recall). Analyses were repeated using the average of daily scores within each recall period rather than the maximum of daily scores. A total of 127 subjects completed questionnaires (57% male; median age: 57). The median of the 27 mean differences in scores on the PRO-CTCAE 5-point response scale comparing the maximum daily versus the longer recall period (and corresponding effect size) was -0.20 (-0.20) for 1-week recall, -0.36 (-0.31) for 2-week recall, -0.45 (-0.39) for 3-week recall, and -0.47 (-0.40) for 4-week recall. The median intraclass correlation across 27 items between the maximum of daily ratings and the corresponding longer recall ratings for 1-week recall was 0.70 (range: 0.54-0.82), for 2-week recall was 0.74 (range: 0.58-0.83), for 3-week recall was 0.72 (range: 0.61-0.84), and for 4-week recall was 0.72 (range: 0.64-0.86). 
Similar results were observed for all analyses using the average of daily scores rather than the maximum of daily scores. A 1-week recall corresponds best to daily reporting. Although intraclass correlations remain stable over time, there are small but progressively larger differences between daily and longer recall periods at 2, 3, and 4 weeks, respectively. The preferred recall period for the PRO-CTCAE is the past 7 days, although investigators may opt for recall periods of 2, 3, or 4 weeks with an understanding that there may be some information loss.
Nandigam, R N Kaveer; Chen, Yu-Wei; Gurol, Mahmut E; Rosand, Jonathan; Greenberg, Steven M; Smith, Eric E
2007-01-01
We sought to determine whether mid-sagittal intracranial area (ICA) is a valid surrogate of intracranial volume (ICV) when using retrospective data with relatively thick (6-7 mm) sagittal slices. Data were retrospectively analyzed from 47 subjects who had two MRI scans taken at least nine months apart. Twenty-three subjects had manual segmentation of ICV on the T2-weighted sequence for comparison. Intraclass correlation coefficients (ICCs) for intraobserver, interobserver, and intraobserver scan-rescan comparisons were 0.96, 0.97, and 0.95, respectively. Pearson correlation coefficients between ICV and ICA, averaging the cumulative 1, 2, 3, and 4 most midline slices, were 0.89, 0.94, 0.93, and 0.95, respectively. There was a significant marginal increase in explained variance of ICV by measuring two, rather than one, slices (P = 0.001). These data suggest that ICA, even measured without high-resolution imaging, is a reasonable substitute for ICV.
Guber, Ivo; Bachmann, Lucas M; Guber, Josef; Bochmann, Frank; Lange, Alex P; Thiel, Michael A
2011-09-01
Straylight gives the appearance of a veil of light thrown over a person's retinal image when there is a strong light source present. We examined the reproducibility of the measurements by C-Quant, and assessed its correlation to characteristics of the eye and subjects' age. Five repeated straylight measurements were taken using the dominant eye of 45 healthy subjects (age 21-59) with a BCVA of 20/20: 14 emmetropic, 16 myopic, eight hyperopic and seven with astigmatism. We assessed the extent of reproducibility of straylight measures using the intraclass correlation coefficient. The mean straylight value of all measurements was 1.01 (SD 0.23, median 0.97, interquartile range 0.85-1.1). Per 10 years of age, straylight increased on average by 0.10 (95%CI 0.04 to 0.16, p < 0.01). We found no independent association of refraction (range -5.25 dpt to +2 dpt) with straylight values (0.001; 95%CI -0.022 to 0.024, p = 0.92). Compared to emmetropic subjects, myopia reduced straylight (-0.011; -0.024 to 0.02, p = 0.11), whereas higher straylight values (0.09; -0.01 to 0.20, p = 0.09) were observed in subjects with blue irises as compared to dark-colored irises when correcting for age. The intraclass correlation coefficient (ICC) of repeated measurements was 0.83 (95%CI 0.76 to 0.90). Our study showed that straylight measurements with the C-Quant had a high reproducibility, i.e. a lack of large intra-observer variability, making it appropriate to be applied in long-term follow-up studies assessing the long-term effect of surgical procedures on the quality of vision.
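The abstract does not state which ICC model was used for the five repeated measurements. As an illustration only, the sketch below computes the one-way random-effects, single-measurement ICC from a subjects-by-repeats table; the function name `icc_oneway` and the sample values are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a single measurement.

    ratings: list of per-subject lists, each holding k repeated measurements.
    Assumes a balanced design (same k for every subject).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]

    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical straylight-like values: 4 subjects, 3 repeats each.
example = [[0.90, 0.95, 0.92], [1.10, 1.05, 1.12], [0.80, 0.85, 0.83], [1.30, 1.25, 1.28]]
print(round(icc_oneway(example), 3))
```

Values near 1 indicate that most of the variance lies between subjects rather than between repeats of the same subject.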
Hirsch Index Value and Variability Related to General Surgery in a UK Deanery.
Abdelrahman, Tarig; Brown, Josephine; Wheat, Jenny; Thomas, Charlotte; Lewis, Wyn
2016-01-01
The Hirsch Index (h-index) is often used to assess research impact, and on average a social science senior lecturer will have an h-index of 2.29, yet its validity within the context of UK General Surgery (GS) is unknown. The aim of this study was to calculate the h-indices of a cohort of GS consultants in a UK Deanery to assess its relative validity. Individual h-indices and total publication (TP) counts were obtained for GS consultants via the Scopus and Web of Science (WoS) Internet search engines. Assessment of construct validity and reliability of these 2 measures of the h-index was undertaken. All hospitals in a single UK National Health Service Deanery were included (14 general hospitals). All 136 GS consultants from the Deanery were included. Median h-index (Scopus) was 5 (0-52) and TP 15 (0-369), and strong correlation was found between h-index and TP (ρ = 0.932, p < 0.001), with the intraclass correlation between Scopus and WoS h-index also significant (intraclass correlation coefficient = 0.973 [95% CI: 0.962-0.981], p < 0.001). Academic GS consultants had higher h-indices than nonacademic University Hospital and District General Hospital consultants (Scopus 12 vs 7 vs 4 [p < 0.001] and WoS 10.5 vs 7 vs 4 [p < 0.001]). h-Index was >2.29 in 57.4% of consultants. No subspecialty differences were apparent in median h-indices (p = 0.792) and TP (p = 0.903). h-Index is a valid GS research productivity metric with over half of consultants performing at levels equivalent to social science Senior Lecturers. Copyright © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
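For readers unfamiliar with the metric, the h-index is the largest h such that an author has at least h papers with at least h citations each. A minimal sketch of that definition follows; the citation counts are made up and the function name is illustrative, not taken from Scopus or WoS.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4
```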
Strober, Bruce; Zhao, Yang; Tran, Mary Helen; Gnanasakthy, Ari; Nyirady, Judit; Papavassilis, Charis; Nelson, Lauren M; McLeod, Lori D; Mordin, Margaret; Gottlieb, Alice B; Elewski, Boni E; Lebwohl, Mark
2016-03-01
This analysis aimed to confirm the reliability, validity, and responsiveness of the Psoriasis Symptom Diary (PSD) using data from two Phase III studies in patients with moderate to severe chronic plaque psoriasis. Data from two randomized, double-blind, double-dummy, placebo-controlled, multicenter Phase III studies (n = 820) assessing the efficacy and safety of secukinumab were used. The PSD (24-h recall; 0-10 numeric rating scale) was electronically administered each evening. Test-retest reliability was determined using intraclass correlations. Construct validity hypotheses were evaluated via correlations with the Psoriasis Area and Severity Index (PASI), Investigator's Global Assessment (IGA), Dermatology Life Quality Index (DLQI), EuroQoL 5-Dimension Health Status Questionnaire, and Patient Global Impression of Change (PGIC). Discriminating ability and responsiveness were evaluated by estimating mean differences and effect sizes between known groups (using the PASI and IGA). Phase II-derived, anchor-based PGIC thresholds and cumulative distribution function (CDF) plots described meaningful change. Items on the PSD yielded high intraclass coefficients (>0.90). Correlations were in the anticipated direction and by week 12 were moderate to strong (0.41-0.73) in magnitude, demonstrating construct validity. Average PSD item scores differed predictably and significantly between known groups. Responsiveness effect size estimates were moderate to large (0.6-1.5), and CDF plots showed the percentage of responders to be consistently higher in treatment than in placebo arms across the range of change in PSD scores. The PSD is reliable, valid, and responsive, and represents a valid tool to enhance treatment decisions in patients with moderate to severe plaque psoriasis. © 2015 The International Society of Dermatology.
Validity and reliability of the Hawaii anaerobic run test.
Kimura, Iris F; Stickley, Christopher D; Lentz, Melissa A; Wages, Jennifer J; Yanagi, Kazuhiko; Hetzler, Ronald K
2014-05-01
This study examined the reliability and validity of the Hawaii anaerobic run test (HART) by comparing anaerobic capacity measures obtained to those during the Wingate Anaerobic Test (WAnT). Ninety-six healthy physically active volunteers (age, 22.0 ± 2.8 years; height, 163.9 ± 9.5 cm; body mass, 70.6 ± 14.7 kg; body fat %, 19.29 ± 5.39%) participated in this study. Each participant performed 2 anaerobic capacity tests: the WAnT and the HART by random assignment on separate days. The reliability of the HART was calculated from 2 separate trials of the test and then determined through intraclass correlation coefficients (ICCs). Blood samples were collected, and lactate was analyzed both pretest and posttest for each of the 2 exercise modes. Heart rate and rate of perceived exertion were also measured pre- and post-exercise. Hawaii anaerobic run test peak and mean momentum were calculated as body mass times highest or average split velocity, respectively. Intraclass correlation coefficients between trials of the HART for peak and mean momentum were 0.98 and 0.99, respectively (SEM = 18.8 and 25.7, respectively). Validity of the HART was established through comparison of momentum on the HART with power on the WAnT. High correlations were found between peak power and peak momentum (r = 0.88), as well as mean power and mean momentum (r = 0.94). The HART was considered to be a reliable test of anaerobic power. The HART was also determined to be a valid test of anaerobic power when compared with the WAnT. When testing healthy college-aged individuals, the HART offers an easy and inexpensive alternative maximal effort anaerobic power test to other established tests.
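The momentum definition in the abstract (body mass times the highest or the average split velocity) can be written out directly. The sketch below assumes velocities in m/s and mass in kg; the function name and the sample values are hypothetical.

```python
def hart_momentum(body_mass_kg, split_velocities_m_s):
    """Return (peak, mean) momentum in kg*m/s from per-split velocities."""
    peak = body_mass_kg * max(split_velocities_m_s)
    mean = body_mass_kg * (sum(split_velocities_m_s) / len(split_velocities_m_s))
    return peak, mean


# Hypothetical participant: 70 kg, three split velocities.
peak, mean = hart_momentum(70.0, [6.0, 8.0, 7.0])
print(peak, mean)  # 560.0 490.0
```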
Lim, Renly; Liong, Men Long; Khan, Nurzalina Abdul Karim; Yuen, Kah Hay
2017-02-17
There is currently no published information on the validity and reliability of the Golombok Rust Inventory of Sexual Satisfaction in the Asian population, specifically in patients with stress urinary incontinence, which limits its use in this region. Our study aimed to evaluate the psychometric properties of this questionnaire in the Malaysian population. Ten couples were recruited for the pilot testing. The agreement between the English and the Chinese or Malay versions was tested using intraclass correlation coefficients, with results of more than 0.80 for all subscales and overall scores indicating good agreement. Sixty-six couples were included in the subsequent phase. The following data are presented in the order of English, Chinese, and Malay. Cronbach's alphas for the male total score were 0.82, 0.88, and 0.95. For the female total score, Cronbach's alphas were 0.76, 0.78, and 0.88. Intraclass correlation coefficients for the male total score were 0.93, 0.94, and 0.99, while intraclass correlation coefficients for the female total score were 0.89, 0.86, and 0.88. In conclusion, the English, Chinese, and Malay versions each proved to be valid and reliable in our Malaysian population.
Mahoney, Liam; Fernandez-Alvarez, Jose R; Rojas-Anaya, Hector; Aiton, Neil; Wertheim, David; Seddon, Paul; Rabe, Heike
2018-02-24
To explore the intra- and inter-rater agreement of superior vena cava (SVC) flow and right ventricular (RV) outflow in healthy and unwell late preterm neonates (33-37 weeks' gestational age), term neonates (≥37 weeks' gestational age), and neonates receiving total-body cooling. The intra- and inter-rater agreement (n = 25 and 41 neonates, respectively) rates for SVC flow and RV outflow were determined by echocardiography in healthy and unwell late preterm and term neonates with the use of Bland-Altman plots, the repeatability coefficient, the repeatability index, and intraclass correlation coefficients. The intra-rater repeatability index values were 41% for SVC flow and 31% for RV outflow, with intraclass correlation coefficients indicating good agreement for both measures. The inter-rater repeatability index values for SVC flow and RV outflow were 63% and 51%, respectively, with intraclass correlation coefficients indicating moderate agreement for both measures. If SVC flow or RV outflow is used in the hemodynamic treatment of neonates, sequential measurements should ideally be performed by the same clinician to reduce potential variability. © 2018 by the American Institute of Ultrasound in Medicine.
Clinical applications of correlational vestibular autorotation test.
Hsieh, Li-Chun; Lin, Te-Ming; Chang, Yu-Min; Kuo, Terry B J; Lee, Gho-She
2015-06-01
The correlational vestibular autorotation test (VAT) system has the advantages of good test-retest reliability and calibrations of absolute degrees of eye movement are unnecessary when acquiring a cross correlation coefficient (CCC). The approach is able to efficiently detect peripheral vestibulopathies. A VAT has some drawbacks including poor test-retest reliability and slippage of sensor. This study aimed to develop a correlational VAT system and to evaluate the reliability and applicability of this system. Twenty healthy participants and 10 vertiginous patients were enrolled. Vertical and horizontal autorotations from 0 to 3 Hz with either closed or open eyes were performed. A small sensor and a wireless transmission technique were used to acquire the electro-ocular graph and head velocity signals. The two signals were analyzed using CCCs to assess the functioning of the vestibular ocular reflex (VOR). The results showed a significantly greater CCC for open-eye versus closed-eye of head autorotations. The CCCs also increased significantly with head rotational frequencies. Moreover, the CCCs significantly correlated with the VOR gains at autorotation frequencies ≥1.0 Hz. The test-retest reliability was good (intraclass correlation coefficients ≥0.85). The vertiginous participants had significantly lower individual CCCs and overall average CCC than age- and-gender matched controls.
Reproducibility of intraocular pressure peak and fluctuation of the water-drinking test.
Hatanaka, Marcelo; Alencar, Luciana M; De Moraes, Carlos G; Susanna, Remo
2013-01-01
The water-drinking test has been used as a stress test to evaluate the drainage system of the eye. However, in order to be clinically applicable, a test must provide reproducible results with consistent measurements. This study was performed to verify the reproducibility of intraocular pressure peaks and fluctuation detected during the water-drinking test in patients with ocular hypertension and open-angle glaucoma. A prospective analysis of patients in a tertiary care unit for glaucoma treatment. Twenty-four ocular hypertension and 64 open-angle glaucoma patients not under treatment. The water-drinking test was performed on 2 consecutive days by the same examiners in patients not under treatment. Reproducibility was assessed using the intraclass correlation coefficient. Peak and fluctuation of intraocular pressure obtained with the water-drinking test were analysed for reproducibility. Eighty-eight eyes from 24 ocular hypertension and 64 open-angle glaucoma patients not under treatment were evaluated. Test and retest intraocular pressure peak values were 28.38 ± 4.64 and 28.38 ± 4.56 mmHg, respectively (P = 1.00). Test and retest intraocular pressure fluctuation values were 5.75 ± 3.9 and 4.99 ± 2.7 mmHg, respectively (P = 0.06). Based on the intraclass correlation coefficient, reproducibility was excellent for peak intraocular pressure (intraclass correlation coefficient = 0.79) and fair for intraocular pressure fluctuation (intraclass correlation coefficient = 0.37). Intraocular pressure peaks detected during the water-drinking test presented excellent reproducibility, whereas the reproducibility of fluctuation was considered fair. © 2012 The Authors. Clinical and Experimental Ophthalmology © 2012 Royal Australian and New Zealand College of Ophthalmologists.
Sample size determination for GEE analyses of stepped wedge cluster randomized trials.
Li, Fan; Turner, Elizabeth L; Preisser, John S
2018-06-19
In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator. © 2018, The International Biometric Society.
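To make the block exchangeable structure concrete, the sketch below builds the within-cluster correlation matrix for a closed cohort from the three correlation types named in the abstract. The parameter names `a0` (within-period), `a1` (inter-period) and `a2` (within-individual), the period-by-period ordering of observations, and the numeric values are assumptions for illustration. It also checks one eigenvalue that follows from simple counting: every row sums to the same value, so the all-ones vector is an eigenvector. The sketch does not claim this is one of the two eigenvalues the authors refer to.

```python
def block_exchangeable(n_periods, n_individuals, a0, a1, a2):
    """Correlation matrix for one cluster, observations ordered period by period.

    a0: same period, different individuals
    a1: different periods, different individuals
    a2: same individual, different periods
    """
    size = n_periods * n_individuals
    matrix = [[0.0] * size for _ in range(size)]
    for r in range(size):
        t1, i1 = divmod(r, n_individuals)
        for c in range(size):
            t2, i2 = divmod(c, n_individuals)
            if t1 == t2 and i1 == i2:
                matrix[r][c] = 1.0
            elif t1 == t2:
                matrix[r][c] = a0
            elif i1 == i2:
                matrix[r][c] = a2
            else:
                matrix[r][c] = a1
    return matrix


# Hypothetical design: 3 periods, 4 individuals per cluster.
T, N = 3, 4
R = block_exchangeable(T, N, a0=0.10, a1=0.05, a2=0.40)

# Each row sums to 1 + (N-1)*a0 + (T-1)*a2 + (T-1)*(N-1)*a1,
# which is therefore the eigenvalue belonging to the all-ones vector.
row_sum_eigenvalue = 1 + (N - 1) * 0.10 + (T - 1) * 0.40 + (T - 1) * (N - 1) * 0.05
print(round(sum(R[0]), 6), round(row_sum_eigenvalue, 6))
```

Setting `a2` equal to `a1` makes the matrix indifferent to whether two observations in different periods come from the same individual, which is how the abstract describes accommodating cross-sectional designs.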
The validation of the visual analogue scale for patient satisfaction after total hip arthroplasty.
Brokelman, Roy B G; Haverkamp, Daniel; van Loon, Corné; Hol, Annemiek; van Kampen, Albert; Veth, Rene
2012-06-01
INTRODUCTION: Patient satisfaction is becoming more important in our modern health care system. The assessment of satisfaction is difficult because it is a multifactorial item for which no gold standard exists. One of the potential methods of measuring satisfaction is by using the well-known visual analogue scale (VAS). In this study, we validated VAS for satisfaction. PATIENTS AND METHODS: In this prospective study, we studied 147 patients (153 hips). The construct validity was measured using the Spearman correlation test that compares the satisfaction VAS with the Harris hip score, pain VAS at rest and during activity, Oxford hip score, Short Form 36 and Western Ontario McMaster Universities Osteoarthritis Index. The reliability was tested using the intra-class coefficient. RESULTS: The Pearson correlation test showed correlations in the range of 0.40-0.80. The satisfaction VAS had a high correlation with the pain VAS and Oxford hip score, which could mean that pain is one of the most important factors in patient satisfaction. The intra-class coefficient was 0.95. CONCLUSIONS: There is a moderate to marked degree of correlation between the satisfaction VAS and the currently available subjective and objective scoring systems. The intra-class coefficient of 0.95 indicates an excellent test-retest reliability. The VAS satisfaction is a simple instrument to quantify the satisfaction of a patient after total hip arthroplasty. In this study, we showed that the satisfaction VAS has a good validity and reliability.
2018-01-01
This study aimed to assess and validate the repeatability and agreement of quantitative elastography of novel shear wave methods on four individual tissue-mimicking liver fibrosis phantoms with different known Young’s modulus. We used GE Logiq E9 2D-SWE, Philips iU22 ARFI (pSWE), Samsung TS80A SWE (pSWE), Hitachi Ascendus (SWM) and Transient Elastography (TE). Two individual investigators performed all measurements non-continued and in parallel. The methods were evaluated for inter- and intraobserver variability by intraclass correlation, coefficient of variation and limits of agreement using the median elastography value. All systems used in this study provided high repeatability in quantitative measurements in a liver fibrosis phantom and excellent inter- and intraclass correlations. All four elastography platforms showed excellent intra-and interobserver agreement (interclass correlation 0.981–1.000 and intraclass correlation 0.987–1.000) and no significant difference in mean elasticity measurements for all systems, except for TE on phantom 4. All four liver fibrosis phantoms could be differentiated by quantitative elastography, by all platforms (p<0.001). In the Bland-Altman analysis the differences in measurements were larger for the phantoms with higher Young’s modulus. All platforms had a coefficient of variation in the range 0.00–0.21 for all four phantoms, equivalent to low variance and high repeatability. PMID:29293527
Xiang, Mi; Konishi, Massayuki; Hu, Huanhuan; Takahashi, Masaki; Fan, Wenbi; Nishimaki, Mio; Ando, Karina; Kim, Hyeon-Ki; Tabata, Hiroki; Arao, Takashi; Sakamoto, Shizuo
2016-09-01
Objectives The objectives of the present study were to translate the English version of the Pregnancy Physical Activity Questionnaire into Chinese (PPAQ-C) and to determine its reliability and validity for use by pregnant Chinese women. Methods The study included 224 pregnant women during their first, second, or third trimesters of pregnancy who completed the PPAQ-C on their first visit and wore a uniaxial accelerometer (Lifecorder; Suzuken Co. Ltd) for 7 days. One week after the first visit, we collected the data from the uniaxial accelerometer records, and the women were asked to complete the PPAQ-C again. Results We used intraclass correlation coefficients to determine the reliability of the PPAQ-C. The intraclass correlation coefficients were 0.77 for total activity (light and above), 0.76 for sedentary activity, 0.75 for light activity, 0.59 for moderate activity, and 0.28 for vigorous activity. The intraclass correlation coefficients were 0.74 for "household and caregiving", 0.75 for "occupational" activities, and 0.34 for "sports/exercise". Validity between the PPAQ-C and accelerometer data was determined by Spearman correlation coefficients. Although there were no significant correlations for moderate activity (r = 0.19, P > 0.05) or vigorous activity (r = 0.15, P > 0.05), there were significant correlations for total activity (light and above; r = 0.35, P < 0.01) and for light activity (r = 0.33, P < 0.01). Conclusions for Practice The PPAQ-C is reliable and moderately accurate for measuring physical activity in pregnant Chinese women.
Validity and test–retest reliability of a novel simple back extensor muscle strength test
Harding, Amy T; Weeks, Benjamin Kurt; Horan, Sean A; Little, Andrew; Watson, Steven L; Beck, Belinda Ruth
2017-01-01
Objectives: To develop and determine convergent validity and reliability of a simple and inexpensive clinical test to quantify back extensor muscle strength. Methods: Two testing sessions were conducted, 7 days apart. Each session involved three trials of standing maximal isometric back extensor muscle strength using both the novel test and isokinetic dynamometry. Lumbar spine bone mineral density was examined by dual-energy X-ray absorptiometry. Validation was examined with Pearson correlations (r). Test–retest reliability was examined with intraclass correlation coefficients and limits of agreement. Pearson correlations and intraclass correlation coefficients are presented with corresponding 95% confidence intervals. Linear regression was used to examine the ability of peak back extensor muscle strength to predict indices of lumbar spine bone mineral density and strength. Results: A total of 52 healthy adults (26 men, 26 women) aged 46.4 ± 20.4 years were recruited from the community. A strong positive relationship was observed between peak back extensor strength from hand-held and isokinetic dynamometry (r = 0.824, p < 0.001). For the novel back extensor strength test, short- and long-term reliability was excellent (intraclass correlation coefficient = 0.983 (95% confidence interval, 0.971–0.990), p < 0.001 and intraclass correlation coefficient = 0.901 (95% confidence interval, 0.833–0.943), p < 0.001, respectively). Limits of agreement for short-term repeated back extensor strength measures with the novel back extensor strength protocol were −6.63 to 7.70 kg, with a mean bias of +0.71 kg. Back extensor strength predicted 11% of variance in lumbar spine bone mineral density (p < 0.05) and 9% of lumbar spine index of bone structural strength (p < 0.05). 
Conclusion: Our novel hand-held dynamometer method to determine back extensor muscle strength is quick, relatively inexpensive, and reliable; demonstrates initial convergent validity in a healthy population; and is associated with bone mass at a clinically important site. PMID:28255442
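Limits of agreement of the kind reported above are usually obtained with the Bland-Altman approach: the mean of the paired differences (the bias) plus or minus 1.96 standard deviations of those differences. The sketch below shows that standard calculation on made-up session data; it is not a reproduction of the authors' analysis, and the function name is illustrative.

```python
import statistics


def limits_of_agreement(first, second):
    """Return (bias, lower, upper) for paired measurements, Bland-Altman style."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical back extensor strength (kg) from two sessions.
session_1 = [52.0, 61.5, 48.0, 70.2, 55.1]
session_2 = [50.5, 63.0, 47.2, 68.9, 56.0]
bias, lower, upper = limits_of_agreement(session_1, session_2)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```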
Short version of the Depression Anxiety Stress Scale-21: is it valid for Brazilian adolescents?
da Silva, Hítalo Andrade; dos Passos, Muana Hiandra Pereira; de Oliveira, Valéria Mayaly Alves; Palmeira, Aline Cabral; Pitangui, Ana Carolina Rodarti; de Araújo, Rodrigo Cappato
2016-01-01
ABSTRACT Objective To evaluate the interday reproducibility, agreement and validity of the construct of short version of the Depression Anxiety Stress Scale-21 applied to adolescents. Methods The sample consisted of adolescents of both sexes, aged between 10 and 19 years, who were recruited from schools and sports centers. The validity of the construct was performed by exploratory factor analysis, and reliability was calculated for each construct using the intraclass correlation coefficient, standard error of measurement and the minimum detectable change. Results The factor analysis combining the items corresponding to anxiety and stress in a single factor, and depression in a second factor, showed a better match of all 21 items, with higher factor loadings in their respective constructs. The reproducibility values for depression were intraclass correlation coefficient with 0.86, standard error of measurement with 0.80, and minimum detectable change with 2.22; and, for anxiety/stress: intraclass correlation coefficient with 0.82, standard error of measurement with 1.80, and minimum detectable change with 4.99. Conclusion The short version of the Depression Anxiety Stress Scale-21 showed excellent values of reliability, and strong internal consistency. The two-factor model with condensation of the constructs anxiety and stress in a single factor was the most acceptable for the adolescent population. PMID:28076595
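The reported minimum detectable change values are consistent with the commonly used 95% formula, MDC = 1.96 × √2 × SEM. The abstract does not state the formula, so treating it as the one used is an assumption; the sketch below simply applies it to the two reported SEM values.

```python
import math


def mdc95(sem):
    """Minimum detectable change at 95% confidence from the standard error of measurement."""
    return 1.96 * math.sqrt(2) * sem


print(round(mdc95(0.80), 2))  # 2.22, matches the value reported for depression
print(round(mdc95(1.80), 2))  # 4.99, matches the value reported for anxiety/stress
```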
Acar, Nihat; Karakasli, Ahmet; Karaarslan, Ahmet; Mas, Nermin Ng; Hapa, Onur
2017-01-01
Volumetric measurements of benign tumors enable surgeons to trace volume changes during follow-up periods. For a volumetric measurement technique to be applicable, it should be easy, rapid, and inexpensive and should carry a high interobserver reliability. We aimed to assess the interobserver reliability of a volumetric measurement technique using Cavalieri's principle of stereological methods. The computerized tomography (CT) scans of 15 patients with a histopathologically confirmed diagnosis of enchondroma with varying tumor sizes and localizations were retrospectively reviewed for interobserver reliability evaluation of the volumetric stereological measurement with Cavalieri's principle, V = t × [(SU × d)/SL]² × ΣP. The volumes of the 15 tumors collected by the observers are demonstrated in Table 1. There was no statistical significance between the first and second observers (p = 0.000 and intraclass correlation coefficient = 0.970) and between the first and third observers (p = 0.000 and intraclass correlation coefficient = 0.981). No statistical significance was detected between the second and third observers (p = 0.000 and intraclass correlation coefficient = 0.976). Cavalieri's principle with the stereological technique using CT scans is an easy, rapid, and inexpensive technique for volumetric evaluation of enchondromas with dependable interobserver reliability.
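The point-counting formula quoted in the abstract can be evaluated directly. The abstract does not define its symbols; the sketch below assumes the usual stereological reading (t: slice thickness, SU: scale unit printed on the image, d: spacing between grid points, SL: measured length of that scale unit, ΣP: total grid points falling on the lesion across slices). The function name and the numbers are hypothetical.

```python
def cavalieri_volume(t, su, d, sl, points_per_slice):
    """Volume estimate V = t * ((SU * d) / SL)**2 * sum(P).

    Symbol meanings are assumed, since the source does not define them.
    """
    area_per_point = ((su * d) / sl) ** 2
    return t * area_per_point * sum(points_per_slice)


# Hypothetical example: 0.3 cm slices, grid spacing 0.5 cm,
# scale unit of 1 cm measured as 1 cm, three slices with counted points.
print(cavalieri_volume(t=0.3, su=1.0, d=0.5, sl=1.0, points_per_slice=[12, 18, 9]))
```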
Hill, Peter B
2015-06-01
Grading of erythema in clinical practice is a subjective assessment that cannot be confirmed using a definitive test; nevertheless, erythema scores are typically measured in clinical trials assessing the response to treatment interventions. Most commonly, ordinal scales are used for this purpose, but the optimal number of categories in such scales has not been determined. This study aimed to compare the reliability and agreement of a four-point and a six-point ordinal scale for the assessment of erythema in digital images of canine skin. Fifteen digital images showing varying degrees of erythema were assessed by specialist dermatologists and laypeople, using either the four-point or the six-point scale. Reliability between the raters was assessed using intraclass correlation coefficients and Cronbach's α. Agreement was assessed using the variation ratio (the percentage of respondents who chose the mode, the most common answer). Intraobserver variability was assessed by comparing the results of two grading sessions, at least 6 weeks apart. Both scales demonstrated high reliability, with intraclass correlation coefficient values and Cronbach's α above 0.99. However, the four-point scale demonstrated significantly superior agreement, with variation ratios for the four-point scale averaging 74.8%, compared with 56.2% for the six-point scale. Intraobserver consistency for the four-point scale was very high. Although both scales demonstrated high reliability, the four-point scale was superior in terms of agreement. For the assessment of erythema in clinical trials, a four-point ordinal scale is recommended. © 2014 ESVD and ACVD.
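The agreement measure described above, the percentage of respondents who chose the most common grade, is easy to compute per image. The sketch follows the abstract's own definition (the term "variation ratio" is elsewhere often used for one minus the modal proportion); the function name and the grades are hypothetical.

```python
from collections import Counter


def modal_agreement_percent(grades):
    """Percentage of raters who chose the most common grade for one image."""
    counts = Counter(grades)
    return 100.0 * max(counts.values()) / len(grades)


# Hypothetical grades from eight raters on a four-point scale (0-3).
print(modal_agreement_percent([2, 2, 2, 3, 2, 1, 2, 2]))  # 75.0
```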
Wang, H-K; Chen, C-Y; Lin, N-C; Liu, C-S; Loong, C-C; Lin, Y-H; Lai, Y-C; Chiou, H-J
2018-05-01
Intraoperative portal venous flow measurement provides surgeons with instant guidance for portal flow modulation during living-donor liver transplantation (LDLT). In this study, we compared the agreement of portal flow measurement obtained by 2 devices: transit time ultrasound (TTU) and conventional Doppler ultrasound (CDU). Fifty-four recipients of LDLT underwent intraoperative measurement of portal flow after completion of vascular anastomosis of the implanted partial liver graft. Both TTU and CDU were used concurrently. Agreement of TTU and CDU was assessed by intraclass correlation coefficient using a model of 2-way random effects, absolute agreement, and single measurement. A Bland-Altman plot was applied to assess the variability between the 2 devices. The mean, median, and range of portal venous flow was 1456, 1418, and 117 to 2776 mL/min according to TTU; and 1564, 1566, and 119 to 3216 mL/min according to CDU. The intraclass correlation coefficient of portal venous flow between TTU and CDU was 0.68 (95% confidence interval, 0.51-0.80). The Bland-Altman plots revealed an average variation of 4.8% between TTU and CDU but with a rather wide 95% confidence interval of variation ranging from -57.7% to 67.4%. Intraoperative TTU and CDU showed moderate agreement in portal flow measurement. However, a relatively wide range of variation exists between TTU and CDU, indicating that data obtained from the 2 devices may not be interchangeable. Copyright © 2018 Elsevier Inc. All rights reserved.
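The ICC form named in the abstract (two-way random effects, absolute agreement, single measurement) can be computed from the two-way ANOVA mean squares. The sketch below is a plain-Python illustration on made-up flow values, not the authors' analysis; the function name is hypothetical. It shows why absolute agreement is stricter than correlation: a constant offset between devices lowers the coefficient even when the two track each other perfectly.

```python
def icc_two_way_absolute_single(data):
    """ICC for two-way random effects, absolute agreement, single measurement.

    data: list of rows (subjects), each with one value per device or rater.
    """
    n = len(data)
    k = len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical flows (mL/min): second device reads a constant 100 higher.
flows = [[1200, 1300], [1500, 1600], [900, 1000], [1800, 1900]]
print(round(icc_two_way_absolute_single(flows), 3))
```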
Interobserver agreement on histopathological lesions in class III or IV lupus nephritis.
Wilhelmus, Suzanne; Cook, H Terence; Noël, Laure-Hélène; Ferrario, Franco; Wolterbeek, Ron; Bruijn, Jan A; Bajema, Ingeborg M
2015-01-07
To treat lupus nephritis effectively, proper identification of the histologic class is essential. Although the classification system for lupus nephritis is nearly 40 years old, remarkably few studies have investigated interobserver agreement. Interobserver agreement among nephropathologists was studied, particularly with respect to the recognition of class III/IV lupus nephritis lesions, and possible causes of disagreement were determined. A link to a survey containing pictures of 30 glomeruli was provided to all 360 members of the Renal Pathology Society; 34 responses were received from 12 countries (a response rate of 9.4%). The nephropathologist was asked whether glomerular lesions were present that would categorize the biopsy as class III/IV. If so, additional parameters were scored. To determine the interobserver agreement among the participants, κ or intraclass correlation values were calculated. The intraclass correlation or κ-value was also calculated for two separate levels of experience (specifically, nephropathologists who were new to the field or moderately experienced [less experienced] and nephropathologists who were highly experienced). Intraclass correlation for the presence of a class III/IV lesion was 0.39 (poor). The κ/intraclass correlation values for the additional parameters were as follows: active, chronic, or both: 0.36; segmental versus global: 0.39; endocapillary proliferation: 0.46; influx of inflammatory cells: 0.32; swelling of endothelial cells: 0.46; extracapillary proliferation: 0.57; type of crescent: 0.46; and wire loops: 0.35. The highly experienced nephropathologists had significantly less interobserver variability compared with the less experienced nephropathologists (P=0.004). There is generally poor agreement in terms of recognizing class III/IV lesions. Because experience clearly increases interobserver agreement, this agreement may be improved by training nephropathologists. 
These results also underscore the importance of a central review by experienced nephropathologists in clinical trials. Copyright © 2015 by the American Society of Nephrology.
Quantification of Finger-Tapping Angle Based on Wearable Sensors
Djurić-Jovičić, Milica; Jovičić, Nenad S.; Roby-Brami, Agnes; Popović, Mirjana B.; Kostić, Vladimir S.; Djordjević, Antonije R.
2017-01-01
We propose a novel simple method for quantitative and qualitative finger-tapping assessment based on miniature inertial sensors (3D gyroscopes) placed on the thumb and index-finger. We propose a simplified description of the finger tapping by using a single angle, describing rotation around a dominant axis. The method was verified on twelve subjects, who performed various tapping tasks, mimicking impaired patterns. The obtained tapping angles were compared with results of a motion capture camera system, demonstrating excellent accuracy. The root-mean-square (RMS) error between the two sets of data is, on average, below 4°, and the intraclass correlation coefficient is, on average, greater than 0.972. Data obtained by the proposed method may be used together with scores from clinical tests to enable a better diagnostic. Along with hardware simplicity, this makes the proposed method a promising candidate for use in clinical practice. Furthermore, our definition of the tapping angle can be applied to all tapping assessment systems. PMID:28125051
Quantification of Finger-Tapping Angle Based on Wearable Sensors.
Djurić-Jovičić, Milica; Jovičić, Nenad S; Roby-Brami, Agnes; Popović, Mirjana B; Kostić, Vladimir S; Djordjević, Antonije R
2017-01-25
We propose a novel simple method for quantitative and qualitative finger-tapping assessment based on miniature inertial sensors (3D gyroscopes) placed on the thumb and index-finger. We propose a simplified description of the finger tapping by using a single angle, describing rotation around a dominant axis. The method was verified on twelve subjects, who performed various tapping tasks, mimicking impaired patterns. The obtained tapping angles were compared with results of a motion capture camera system, demonstrating excellent accuracy. The root-mean-square (RMS) error between the two sets of data is, on average, below 4°, and the intraclass correlation coefficient is, on average, greater than 0.972. Data obtained by the proposed method may be used together with scores from clinical tests to enable a better diagnostic. Along with hardware simplicity, this makes the proposed method a promising candidate for use in clinical practice. Furthermore, our definition of the tapping angle can be applied to all tapping assessment systems.
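The RMS error quoted in this abstract can be illustrated with a minimal sketch. It assumes two equally long, time-aligned series of tapping angles in degrees; the function name `rms_error` and the sample values are hypothetical and are not data from the study.

```python
import math


def rms_error(reference, estimate):
    """Root-mean-square error between two equally long, time-aligned series."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical tapping angles in degrees (illustrative values only).
camera_angles = [10.0, 25.0, 40.0, 25.0]   # motion capture reference
sensor_angles = [12.0, 24.0, 43.0, 23.0]   # gyroscope-based estimate
print(rms_error(camera_angles, sensor_angles))
```

In practice the two recordings would first have to be resampled and synchronized, a step this sketch leaves out.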
Nema, Sandeep Kumar; Balaji, Gopisankar; Akkilagunta, Sujiv; Menon, Jagdish; Poduval, Murali; Patro, Dilip
2017-01-01
Background: Accurate tibial and femoral tunnel placement has a significant effect on outcomes after anterior cruciate ligament reconstruction (ACLR). Postoperative radiographs provide a reliable and valid way for the assessment of anatomical tunnel placement after ACLR. The aim of this study was to examine the radiographic location of tibial and femoral tunnels in patients who underwent arthroscopic ACLR using anatomic landmarks. Patients who underwent arthroscopic ACLR from January 2014 to March 2016 were included in this retrospective cohort study. Materials and Methods: Postoperative radiographs of 45 patients who underwent arthroscopic ACLR were studied. Femoral and tibial tunnel positions on sagittal and coronal radiographic views, graft impingement, and femoral roof angle were measured. Radiological parameters were summarized as mean ± standard deviation and proportions as applicable. Interobserver agreement was measured using intraclass correlation coefficient. Results: The position of the tibial tunnel was found to be at an average of 35.1% ± 7.4% posterior from the anterior edge of the tibia. The femoral tunnel was found at an average of 30% ± 1% anterior to the posterior femoral cortex along the Blumensaat's line. Radiographic impingement was found in 34% of the patients. The roof angle averaged 34.3° ± 4.3°. The position of the tibial tunnel was found at an average of 44.16% ± 3.98% from the medial edge of the tibial plateau. The coronal tibial tunnel angle averaged 67.5° ± 8.9°. The coronal angle of the femoral tunnel averaged 41.9° ± 8.5°. Conclusions: The femoral and tibial tunnel placements correlated well with anatomic landmarks except for radiographic impingement which was present in 34% of the patients. PMID:28566780
Nema, Sandeep Kumar; Balaji, Gopisankar; Akkilagunta, Sujiv; Menon, Jagdish; Poduval, Murali; Patro, Dilip
2017-01-01
Accurate tibial and femoral tunnel placement has a significant effect on outcomes after anterior cruciate ligament reconstruction (ACLR). Postoperative radiographs provide a reliable and valid way for the assessment of anatomical tunnel placement after ACLR. The aim of this study was to examine the radiographic location of tibial and femoral tunnels in patients who underwent arthroscopic ACLR using anatomic landmarks. Patients who underwent arthroscopic ACLR from January 2014 to March 2016 were included in this retrospective cohort study. Postoperative radiographs of 45 patients who underwent arthroscopic ACLR were studied. Femoral and tibial tunnel positions on sagittal and coronal radiographic views, graft impingement, and femoral roof angle were measured. Radiological parameters were summarized as mean ± standard deviation and proportions as applicable. Interobserver agreement was measured using intraclass correlation coefficient. The position of the tibial tunnel was found to be at an average of 35.1% ± 7.4% posterior from the anterior edge of the tibia. The femoral tunnel was found at an average of 30% ± 1% anterior to the posterior femoral cortex along the Blumensaat's line. Radiographic impingement was found in 34% of the patients. The roof angle averaged 34.3° ± 4.3°. The position of the tibial tunnel was found at an average of 44.16% ± 3.98% from the medial edge of the tibial plateau. The coronal tibial tunnel angle averaged 67.5° ± 8.9°. The coronal angle of the femoral tunnel averaged 41.9° ± 8.5°. The femoral and tibial tunnel placements correlated well with anatomic landmarks except for radiographic impingement which was present in 34% of the patients.
First quality score for referral letters in gastroenterology—a validation study
Eskeland, Sigrun Losada; Brunborg, Cathrine; Seip, Birgitte; Wiencke, Kristine; Hovde, Øistein; Owen, Tanja; Skogestad, Erik; Huppertz-Hauss, Gert; Halvorsen, Fred-Arne; Garborg, Kjetil; Aabakken, Lars; de Lange, Thomas
2016-01-01
Objective: To create and validate an objective and reliable score to assess referral quality in gastroenterology. Design: An observational multicentre study. Setting and participants: 25 gastroenterologists participated in selecting variables for a Thirty Point Score (TPS) for quality assessment of referrals to gastroenterology specialist healthcare for 9 common indications. From May to September 2014, 7 hospitals from the South-Eastern Norway Regional Health Authority participated in collecting and scoring 327 referrals to a gastroenterologist. Main outcome measure: Correlation between the TPS and a visual analogue scale (VAS) for referral quality. Results: The 327 referrals had an average TPS of 13.2 (range 1–25) and an average VAS of 4.7 (range 0.2–9.5). The reliability of the score was excellent, with an intra-rater intraclass correlation coefficient (ICC) of 0.87 and inter-rater ICC of 0.91. The overall correlation between the TPS and the VAS was moderate (r=0.42), and ranged from fair to substantial for the various indications. Mean agreement was good (ICC=0.47, 95% CI (0.34 to 0.57)), ranging from poor to good. Conclusions: The TPS is reliable, objective and shows good agreement with the subjective VAS. The score may be a useful tool for assessing referral quality in gastroenterology, particularly important when evaluating the effect of interventions to improve referral quality. PMID:27855107
Chang, Wen-Dien; Chang, Wan-Yi; Lee, Chia-Lun; Feng, Chi-Yen
2013-10-01
[Purpose] Balance is an integral part of human ability. The smart balance master system (SBM) is a balance test instrument with good reliability and validity, but it is expensive. Therefore, we modified a Wii Fit balance board, which is a convenient balance assessment tool, and analyzed its reliability and validity. [Subjects and Methods] We recruited 20 healthy young adults and 20 elderly people, and administered 3 balance tests. The correlation coefficient and intraclass correlation of both instruments were analyzed. [Results] There were no statistically significant differences in the 3 tests between the Wii Fit balance board and the SBM. The Wii Fit balance board had a good intraclass correlation (0.86-0.99) for the elderly people and positive correlations (r = 0.58-0.86) with the SBM. [Conclusions] The Wii Fit balance board is a balance assessment tool with good reliability and high validity for elderly people, and we recommend it as an alternative tool for assessing balance ability.
Koritar, Priscila; Philippi, Sonia Tucunduva; Alvarenga, Marle dos Santos; Santos, Bernardo dos
2014-08-01
The aim of this study was to present the cross-cultural adaptation and validation of the Health and Taste Attitude Scale in Portuguese. The methodology included translation of the scale; evaluation of conceptual, operational and item-based equivalence by 14 experts and 51 female undergraduates; semantic equivalence and measurement assessment with 12 bilingual women using the paired t-test, the Pearson correlation coefficient and the intraclass correlation coefficient; assessment of internal consistency and test-retest reliability using Cronbach's alpha and the intraclass correlation coefficient, respectively, after application to 216 female undergraduates; and assessment of discriminant and concurrent validity via the t-test and Spearman's correlation coefficient, respectively, in addition to confirmatory and exploratory factor analysis. The scale was considered adequate and easily understood by the experts and university students and presented good internal consistency and reliability (α 0.86, ICC 0.84). The results show that the scale is valid and can be used in studies with women to better understand attitudes related to taste.
The Yale-Brown Obsessive Compulsive Scale: A Reliability Generalization Meta-Analysis.
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Maria; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-10-01
The Yale-Brown Obsessive Compulsive Scale (Y-BOCS) is the most frequently applied test to assess obsessive compulsive symptoms. We conducted a reliability generalization meta-analysis on the Y-BOCS to estimate the average reliability, examine the variability among the reliability estimates, search for moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the Y-BOCS. We included studies where the Y-BOCS was applied to a sample of adults and a reliability estimate was reported. Out of the 11,490 references located, 144 studies met the selection criteria. For the total scale, the mean reliability was 0.866 for coefficients alpha, 0.848 for test-retest correlations, and 0.922 for intraclass correlations. The moderator analyses led to a predictive model where the standard deviation of the total test and the target population (clinical vs. nonclinical) explained 38.6% of the total variability among coefficients alpha. Finally, clinical implications of the results are discussed. © The Author(s) 2014.
Zhang, Tan; Chen, Ang
2017-01-01
Based on the job demands-resources model, the study developed and validated an instrument that measures physical education teachers' job demands-resources perception. Expert review established content validity with the average item rating of 3.6/5.0. Construct validity and reliability were determined with a teacher sample ( n = 397). Exploratory factor analysis established a five-dimension construct structure matching the theoretical construct deliberated in the literature. The composite reliability scores for the five dimensions range from .68 to .83. Validity coefficients (intraclass correlational coefficients) are .69 for job resources items and .82 for job demands items. Inter-scale correlational coefficients range from -.32 to .47. Confirmatory factor analysis confirmed the construct validity with high dimensional factor loadings (ranging from .47 to .84 for job resources scale and from .50 to .85 for job demands scale) and adequate model fit indexes (root mean square error of approximation = .06). The instrument provides a tool to measure physical education teachers' perception of their working environment.
Zhang, Tan; Chen, Ang
2017-01-01
Based on the job demands–resources model, the study developed and validated an instrument that measures physical education teachers’ job demands–resources perception. Expert review established content validity with the average item rating of 3.6/5.0. Construct validity and reliability were determined with a teacher sample (n = 397). Exploratory factor analysis established a five-dimension construct structure matching the theoretical construct deliberated in the literature. The composite reliability scores for the five dimensions range from .68 to .83. Validity coefficients (intraclass correlational coefficients) are .69 for job resources items and .82 for job demands items. Inter-scale correlational coefficients range from −.32 to .47. Confirmatory factor analysis confirmed the construct validity with high dimensional factor loadings (ranging from .47 to .84 for job resources scale and from .50 to .85 for job demands scale) and adequate model fit indexes (root mean square error of approximation = .06). The instrument provides a tool to measure physical education teachers’ perception of their working environment. PMID:29200808
Rosenblum, Uri; Melzer, Itshak
2017-01-01
About 90% of people with multiple sclerosis (PwMS) have gait instability and 50% fall. Reliable and clinically feasible methods of gait instability assessment are needed. The study investigated the reliability and validity of the Narrow Path Walking Test (NPWT) under single-task (ST) and dual-task (DT) conditions for PwMS. Thirty PwMS performed the NPWT on 2 different occasions, a week apart. Number of Steps, Trial Time, Trial Velocity, Step Length, Number of Step Errors, Number of Cognitive Task Errors, and Number of Balance Losses were measured. Intraclass correlation coefficients (ICC2,1) were calculated from the average values of NPWT parameters. Absolute reliability was quantified from standard error of measurement (SEM) and smallest real difference (SRD). Concurrent validity of NPWT with Functional Reach Test, Four Square Step Test (FSST), 12-item Multiple Sclerosis Walking Scale (MSWS-12), and 2 Minute Walking Test (2MWT) was determined using partial correlations. Intraclass correlation coefficients (ICCs) for most NPWT parameters during ST and DT ranged from 0.46-0.94 and 0.55-0.95, respectively. The highest relative reliability was found for Number of Step Errors (ICC = 0.94 and 0.93, for ST and DT, respectively) and Trial Velocity (ICC = 0.83 and 0.86, for ST and DT, respectively). Absolute reliability was high for Number of Step Errors in ST (SEM % = 19.53%) and DT (SEM % = 18.14%) and low for Trial Velocity in ST (SEM % = 6.88%) and DT (SEM % = 7.29%). Significant correlations for Number of Step Errors and Trial Velocity were found with FSST, MSWS-12, and 2MWT. In PwMS performing the NPWT, Number of Step Errors and Trial Velocity were highly reliable parameters.
Based on correlations with other measures of gait instability, Number of Step Errors was the most valid parameter of dynamic balance under the conditions of our test. Video Abstract available for more insights from the authors (see Supplemental Digital Content 1, available at: http://links.lww.com/JNPT/A159).
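The absolute-reliability indices named in this abstract (SEM, SRD, SEM %) are often derived from the ICC as sketched below. The abstract does not state which formulas the authors used, so the definitions here (SEM = SD·√(1 − ICC), SRD = 1.96·√2·SEM) are an assumption based on common practice, and the function names and input values are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1] for this formula")
    return sd * math.sqrt(1.0 - icc)


def srd_from_sem(sem, z=1.96):
    """Smallest real difference at ~95% confidence: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem


def as_percent_of_mean(value, mean):
    """Express SEM or SRD relative to the sample mean (SEM %, SRD %)."""
    return 100.0 * value / mean


# Hypothetical trial-velocity summary (not taken from the study).
sd, icc, mean = 0.20, 0.84, 1.00
sem = sem_from_icc(sd, icc)
print(sem, as_percent_of_mean(sem, mean), srd_from_sem(sem))
```

A low SEM % indicates little measurement error relative to the mean, so SEM % and the ICC can rank parameters differently.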
Margolin, Ezra J; Mlynarczyk, Carrie M; Mulhall, John P; Stember, Doron S; Stahl, Peter J
2017-06-01
Non-curvature penile deformities are prevalent and bothersome manifestations of Peyronie's disease (PD), but the quantitative metrics that are currently used to describe these deformities are inadequate and non-standardized, presenting a barrier to clinical research and patient care. To introduce erect penile volume (EPV) and percentage of erect penile volume loss (percent EPVL) as novel metrics that provide detailed quantitative information about non-curvature penile deformities and to study the feasibility and reliability of three-dimensional (3D) photography for measurement of quantitative penile parameters. We constructed seven penis models simulating deformities found in PD. The 3D photographs of each model were captured in triplicate by four observers using a 3D camera. Computer software was used to generate automated measurements of EPV, percent EPVL, penile length, minimum circumference, maximum circumference, and angle of curvature. The automated measurements were statistically compared with measurements obtained using water-displacement experiments, a tape measure, and a goniometer. Accuracy of 3D photography for average measurements of all parameters compared with manual measurements; inter-test, intra-observer, and inter-observer reliabilities of EPV and percent EPVL measurements as assessed by the intraclass correlation coefficient. The 3D images were captured in a median of 52 seconds (interquartile range = 45-61). On average, 3D photography was accurate to within 0.3% for measurement of penile length. It overestimated maximum and minimum circumferences by averages of 4.2% and 1.6%, respectively; overestimated EPV by an average of 7.1%; and underestimated percent EPVL by an average of 1.9%. All inter-test, inter-observer, and intra-observer intraclass correlation coefficients for EPV and percent EPVL measurements were greater than 0.75, reflective of excellent methodologic reliability. 
By providing highly descriptive and reliable measurements of penile parameters, 3D photography can empower researchers to better study volume-loss deformities in PD and enable clinicians to offer improved clinical assessment, communication, and documentation. This is the first study to apply 3D photography to the assessment of PD and to accurately measure the novel parameters of EPV and percent EPVL. This proof-of-concept study is limited by the lack of data in human subjects, which could present additional challenges in obtaining reliable measurements. EPV and percent EPVL are novel metrics that can be quickly, accurately, and reliably measured using computational analysis of 3D photographs and can be useful in describing non-curvature volume-loss deformities resulting from PD. Margolin EJ, Mlynarczyk CM, Mulhall JP, et al. Three-Dimensional Photography for Quantitative Assessment of Penile Volume-Loss Deformities in Peyronie's Disease. J Sex Med 2017;14:829-833. Copyright © 2017 International Society for Sexual Medicine. Published by Elsevier Inc. All rights reserved.
Practice of preventive dentistry for nursing staff in primary care.
Jiménez-Báez, María Valeria; Acuña-Reyes, Raquel; Cigarroa-Martínez, Didier; Ureña-Bogarín, Enrique; Orgaz-Fernández, Jose David
2014-01-01
To determine the domain of preventive dentistry in nursing personnel assigned to a primary care unit. Prospective descriptive study, questionnaire validation, and prevalence study. In the first stage, the questionnaire for the practice of preventive dentistry (CPEP, for the term in Spanish) was validated; consistency and reliability were measured by Cronbach's alpha, Pearson's correlation, and factor analysis with the intra-class correlation coefficient (ICC). In the second stage, the domain of preventive dentistry among nurses was explored. The overall internal consistency of the CPEP was α = 0.66, ICC = 0.64, CI95%: 0.29-0.87 (p >0.01). Twenty-one subjects were included in the study, with an average age of 43 and an average seniority of 12.5; 81.0% were female. A total of 71.5% showed weak domain and 28.5% regular domain, and no questionnaire yielded a good domain result. The older the subjects were, the smaller the domain; female nurses showed greater mastery of preventive dentistry (29%, CI95%: 0.1-15.1) than male nurses. Public health nurses showed greater mastery than other categories (50%, CI95%: 0.56-2.8). The CPEP has enough consistency to explore the domain of preventive dentistry in health-care staff. The domain of preventive dentistry in primary care nursing is poor and needs to be strengthened in order to provide education in preventive dentistry to the insured population.
Validation of a computerized algorithm to quantify fetal heart rate deceleration area.
Gyllencreutz, Erika; Lu, Ke; Lindecrantz, Kaj; Lindqvist, Pelle G; Nordstrom, Lennart; Holzmann, Malin; Abtahi, Farhad
2018-05-16
Reliability in visual cardiotocography interpretation is unsatisfying, which has led to development of computerized cardiotocography. Computerized analysis is well established for antenatal fetal surveillance, but has not yet performed sufficiently during labor. We aimed to investigate the capacity of a new computerized algorithm compared to visual assessment in identifying intrapartum fetal heart rate baseline and decelerations. Three-hundred-and-twelve intrapartum cardiotocography tracings with variable decelerations were analysed by the computerized algorithm and visually examined by two observers, blinded to each other and the computer analysis. The width, depth and area of each deceleration were measured. Four cases (>100 variable decelerations) were subject to in-depth detailed analysis. The outcome measures were bias in seconds (width), beats per minute (depth), and beats (area) between computer and observers by using Bland-Altman analysis. Interobserver reliability was determined by calculating intraclass correlation and Spearman rank analysis. The analysis (312 cases) showed excellent intraclass correlation (0.89-0.95) and very strong Spearman correlation (0.82-0.91). The detailed analysis of > 100 decelerations in 4 cases revealed low bias between the computer and the two observers; width 1.4 and 1.4 seconds, depth 5.1 and 0.7 beats per minute, and area 0.1 and -1.7 beats. This was comparable to the bias between the two observers; 0.3 seconds (width), 4.4 beats per minute (depth), and 1.7 beats (area). The intraclass correlation was excellent (0.90-0.98). A novel computerized algorithm for intrapartum cardiotocography analysis is as accurate as gold standard visual assessment with high correlation and low bias. This article is protected by copyright. All rights reserved.
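The Bland-Altman bias used as the outcome measure in this abstract is the mean of the paired differences between two methods. A minimal sketch follows, assuming paired measurements of the same decelerations. The function name `bland_altman` and the sample widths are hypothetical, and the 95% limits of agreement (bias ± 1.96 SD) are the conventional choice rather than something the abstract specifies.

```python
import statistics


def bland_altman(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)          # systematic difference
    spread = statistics.stdev(diffs)       # sample SD of the differences
    return bias, bias - z * spread, bias + z * spread


# Hypothetical deceleration widths in seconds (illustrative values only).
computer = [42.0, 55.0, 61.0, 38.0]
observer = [40.0, 56.0, 58.0, 38.0]
bias, lower, upper = bland_altman(computer, observer)
print(bias, lower, upper)
```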
Test-retest stability of the Task and Ego Orientation Questionnaire.
Lane, Andrew M; Nevill, Alan M; Bowes, Neal; Fox, Kenneth R
2005-09-01
Establishing stability, defined as observing minimal measurement error in a test-retest assessment, is vital to validating psychometric tools. Correlational methods, such as Pearson product-moment, intraclass, and kappa are tests of association or consistency, whereas stability or reproducibility (regarded here as synonymous) assesses the agreement between test-retest scores. Indexes of reproducibility using the Task and Ego Orientation in Sport Questionnaire (TEOSQ; Duda & Nicholls, 1992) were investigated using correlational (Pearson product-moment, intraclass, and kappa) methods, repeated measures multivariate analysis of variance, and calculating the proportion of agreement within a referent value of +/-1 as suggested by Nevill, Lane, Kilgour, Bowes, and Whyte (2001). Two hundred thirteen soccer players completed the TEOSQ on two occasions, 1 week apart. Correlation analyses indicated a stronger test-retest correlation for the Ego subscale than the Task subscale. Multivariate analysis of variance indicated stability for ego items but with significant increases in four task items. The proportion of test-retest agreement scores indicated that all ego items reported relatively poor stability statistics with test-retest scores within a range of +/-1, ranging from 82.7-86.9%. By contrast, all task items showed test-retest difference scores ranging from 92.5-99%, although further analysis indicated that four task subscale items increased significantly. Findings illustrated that correlational methods (Pearson product-moment, intraclass, and kappa) are influenced by the range in scores, and calculating the proportion of agreement of test-retest differences with a referent value of +/-1 could provide additional insight into the stability of the questionnaire. It is suggested that the item-by-item proportion of agreement method proposed by Nevill et al. 
(2001) should be used to supplement existing methods and could be especially helpful in identifying rogue items in the initial stages of psychometric questionnaire validation.
Temporal variability of urinary cadmium in spot urine samples and first morning voids.
Vacchi-Suzzi, Caterina; Porucznik, Christina A; Cox, Kyley J; Zhao, Yuan; Ahn, Hongshik; Harrington, James M; Levine, Keith E; Demple, Bruce; Marsit, Carmen J; Gonzalez, Adam; Luft, Benjamin; Meliker, Jaymie R
2017-05-01
Cadmium is a carcinogenic heavy metal. Urinary levels of cadmium are considered to be an indicator of long-term body burden, as cadmium accumulates in the kidneys and has a half-life of at least 10 years. However, the temporal stability of the biomarker in urine samples from a non-occupationally exposed population has not been rigorously established. We used repeated measurements of urinary cadmium (U-Cd) in spot urine samples and first morning voids from two separate cohorts, to assess the temporal stability of the samples. Urine samples from two cohorts including individuals of both sexes were measured for cadmium and creatinine. The first cohort (Home Observation of Perinatal Exposure (HOPE)) consisted of 21 never-smokers, who provided four first morning urine samples 2-5 days apart, and one additional sample roughly 1 month later. The second cohort (World Trade Center-Health Program (WTC-HP)) consisted of 78 individuals, including 52 never-smokers, 22 former smokers and 4 current smokers, who provided 2 spot urine samples 6 months apart, on average. Intra-class correlation was computed for groups of replicates from each individual to assess temporal variability. The median creatinine-adjusted U-Cd level (0.19 and 0.21 μg/g in the HOPE and WTC-HP, respectively) was similar to levels recorded in the United States by the National Health and Nutrition Examination Survey. The intra-class correlation (ICC) was high (0.76 and 0.78 for HOPE and WTC-HP, respectively) and similar between cohorts, irrespective of whether samples were collected days or months apart. Both single spot or first morning urine cadmium samples show good to excellent reproducibility in low-exposure populations.
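An intra-class correlation over replicate samples per individual can be sketched with a one-way random effects model. The abstract does not say which ICC form was used, so the single-measure estimator below, (MSB − MSW) / (MSB + (k − 1)·MSW) for a balanced design with k replicates, is an assumption, and the function name and data are hypothetical.

```python
def icc_oneway(replicates):
    """One-way random effects ICC for a balanced design.

    `replicates` is a list of rows, one row per individual, each holding
    the same number k of repeated measurements.
    Returns (single-measure ICC, average-measure ICC).
    """
    n = len(replicates)
    k = len(replicates[0])
    if n < 2 or k < 2 or any(len(row) != k for row in replicates):
        raise ValueError("need >= 2 individuals with the same k >= 2 replicates")
    grand = sum(sum(row) for row in replicates) / (n * k)
    means = [sum(row) / k for row in replicates]
    # Between-individual and within-individual mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(replicates, means) for x in row) / (n * (k - 1))
    if msb == 0:
        raise ValueError("no between-individual variance; ICC is undefined")
    single = (msb - msw) / (msb + (k - 1) * msw)
    average = (msb - msw) / msb
    return single, average


# Hypothetical creatinine-adjusted U-Cd values (ug/g), two samples each.
samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(icc_oneway(samples))
```

The second returned value is the reliability of the mean of the k replicates, which is always at least as large as the single-measure value when the latter is positive.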
Validity of a traffic air pollutant dispersion model to assess exposure to fine particles.
Kostrzewa, Aude; Reungoat, Patrice; Raherison, Chantal
2009-08-01
Fine particles (PM2.5) are an important component of air pollution. Epidemiological studies have shown health effects due to ambient air particles, particularly allergies in children. Since the main difficulty is to determine exposure to such pollution, traffic air pollutant (TAP) dispersion models have been developed to improve the estimation of individual exposure levels. One such model, the ExTra index, has been validated for nitrogen oxide concentrations but not for other pollutants. The purpose of this study was to assess the validity of the ExTra index to assess PM2.5 exposure. We compared PM2.5 concentrations calculated by the ExTra index to reference measures (passive samplers situated under the covered part of the playground), in 15 schools in Bordeaux, in 2000. First, we collected the input data required by the ExTra index: background and local pollution depending on traffic, meteorology and topography. Second, the ExTra index was calculated for each school. Statistical analysis consisted of a graphic description; then, we calculated an intraclass correlation coefficient. Concentrations calculated with the ExTra index and the reference method were similar. The ExTra index underestimated exposure by 2.2 µg/m³ on average compared to the reference method. The intraclass correlation coefficient was 0.85 and its 95% confidence interval was [0.62; 0.95]. The results suggest that the ExTra index provides an assessment of PM2.5 exposure similar to that of the reference method. Although caution is required in interpreting these results owing to the small number of sites, the ExTra index could be a useful epidemiological tool for reconstructing individual exposure, an important challenge in epidemiology.
Olsen, J. Pat; Fellows, Robert P.; Rivera-Mindt, Monica; Morgello, Susan; Byrd, Desiree A.
2015-01-01
The Wide Range Achievement Test, 3rd edition, Reading-Recognition subtest (WRAT-3 RR) is an established measure of premorbid ability. However, its long-term reliability is not well documented, particularly in diverse populations with CNS-relevant disease. Objective: We examined test-retest reliability of the WRAT-3 RR over time in an HIV+ sample of predominantly racial/ethnic minority adults. Method: Participants (N = 88) completed a comprehensive neuropsychological battery, including the WRAT-3 RR, on at least two separate study visits. Intraclass correlation coefficients (ICCs) were computed using scores from baseline and follow-up assessments to determine the test-retest reliability of the WRAT-3 RR across racial/ethnic groups and changes in medical (immunological) and clinical (neurocognitive) factors. Additionally, Fisher’s Z tests were used to determine the significance of the differences between ICCs. Results: The average test-retest interval was 58.7 months (SD=36.4). The overall WRAT-3 RR test-retest reliability was high (r = .97, p < .001), and remained robust across all demographic, medical, and clinical variables (all r’s > .92). Intraclass correlation coefficients did not differ significantly between the subgroups tested (all Fisher’s Z p’s > .05). Conclusions: Overall, this study supports the appropriateness of word-reading tests, such as the WRAT-3 RR, for use as stable premorbid IQ estimates among ethnically diverse groups. Moreover, this study supports the reliability of this measure in the context of change in health and neurocognitive status, and in lengthy inter-test intervals. These findings offer strong rationale for reading as a “hold” test, even in the presence of a chronic, variable disease such as HIV. PMID:26689235
Repeatability of standard metabolic rate (SMR) in a small fish, the spined loach (Cobitis taenia).
Maciak, Sebastian; Konarzewski, Marek
2010-10-01
Significant repeatability of a trait of interest is an essential assumption for undertaking studies of phenotypic variability. It is especially important in studies on highly variable traits, such as metabolic rates. Recent publications suggest that resting/basal metabolic rate of homeotherms is repeatable across a wide range of species. In contrast, studies on the consistency of standard metabolic rate (SMR) in ectotherms, particularly fish, are scarce. Here we present a comprehensive analysis of several important technical aspects of body mass-corrected SMR measurements and its repeatability in a small (average weight approximately 3 g) fish, the spined loach (Cobitis taenia). First, we demonstrated that release of oxygen from the walls of metabolic chambers exposed to hypoxic conditions did not confound SMR measurements. Next, using the principle of propagation of measurement uncertainties, we demonstrated that in aquatic systems, measurement error is significantly higher in open than in closed respirometry setups. The measurement error for SMR of a small fish determined in a closed aquatic system is comparable to that obtainable using top-notch open-flow systems used for air-breathing terrestrial animals. Using a closed respirometer, we demonstrated that body mass-corrected SMR in spined loaches was repeatable under both normoxia and hypoxia over a 5-month period (Pearson correlation r=0.68 and r=0.73, respectively) as well as across both conditions (intraclass correlation coefficient tau=0.30). In these analyses we accounted for a possible effect of oxygen consumption of the oxygen electrode on repeatability of SMR. Significant SMR consistency was accompanied by significant repeatability of body mass (intraclass correlation coefficient tau=0.86). To our knowledge, this is the first study showing long-term repeatability of body mass and SMR in a small fish, and is consistent with the existence of heritable variation of these two traits. 2010 Elsevier Inc. All rights reserved.
First quality score for referral letters in gastroenterology-a validation study.
Eskeland, Sigrun Losada; Brunborg, Cathrine; Seip, Birgitte; Wiencke, Kristine; Hovde, Øistein; Owen, Tanja; Skogestad, Erik; Huppertz-Hauss, Gert; Halvorsen, Fred-Arne; Garborg, Kjetil; Aabakken, Lars; de Lange, Thomas
2016-10-08
To create and validate an objective and reliable score to assess referral quality in gastroenterology. An observational multicentre study. 25 gastroenterologists participated in selecting variables for a Thirty Point Score (TPS) for quality assessment of referrals to gastroenterology specialist healthcare for 9 common indications. From May to September 2014, 7 hospitals from the South-Eastern Norway Regional Health Authority participated in collecting and scoring 327 referrals to a gastroenterologist. Correlation between the TPS and a visual analogue scale (VAS) for referral quality. The 327 referrals had an average TPS of 13.2 (range 1-25) and an average VAS of 4.7 (range 0.2-9.5). The reliability of the score was excellent, with an intra-rater intraclass correlation coefficient (ICC) of 0.87 and inter-rater ICC of 0.91. The overall correlation between the TPS and the VAS was moderate (r=0.42), and ranged from fair to substantial for the various indications. Mean agreement was good (ICC=0.47, 95% CI (0.34 to 0.57)), ranging from poor to good. The TPS is reliable, objective and shows good agreement with the subjective VAS. The score may be a useful tool for assessing referral quality in gastroenterology, particularly important when evaluating the effect of interventions to improve referral quality. Published by the BMJ Publishing Group Limited.
ERIC Educational Resources Information Center
Hedges, Larry V.; Hedberg, E. C.
2013-01-01
Background: Cluster-randomized experiments that assign intact groups such as schools or school districts to treatment conditions are increasingly common in educational research. Such experiments are inherently multilevel designs whose sensitivity (statistical power and precision of estimates) depends on the variance decomposition across levels.…
ERIC Educational Resources Information Center
Goldsmith, H. Hill; Buss, Kristin A.; Lemery, Kathryn S.
1997-01-01
Studied 715 twins and singletons to document heritable influences on temperament during toddler and preschool ages. Found substantial shared environmental influence on positive affect and additive genetic influence on emotion regulation. Intraclass correlations from many scales showed no evidence of "too-low" dizygotic correlations that…
ERIC Educational Resources Information Center
Hedges, Larry V.; Hedberg, Eric C.
2013-01-01
Background: Cluster randomized experiments that assign intact groups such as schools or school districts to treatment conditions are increasingly common in educational research. Such experiments are inherently multilevel designs whose sensitivity (statistical power and precision of estimates) depends on the variance decomposition across levels.…
Chang, Wen-Dien; Chang, Wan-Yi; Lee, Chia-Lun; Feng, Chi-Yen
2013-01-01
[Purpose] Balance is an integral part of human ability. The smart balance master system (SBM) is a balance test instrument with good reliability and validity, but it is expensive. Therefore, we modified a Wii Fit balance board, which is a convenient balance assessment tool, and analyzed its reliability and validity. [Subjects and Methods] We recruited 20 healthy young adults and 20 elderly people, and administered 3 balance tests. The correlation coefficient and intraclass correlation of both instruments were analyzed. [Results] There were no statistically significant differences in the 3 tests between the Wii Fit balance board and the SBM. The Wii Fit balance board had a good intraclass correlation (0.86–0.99) for the elderly people and positive correlations (r = 0.58–0.86) with the SBM. [Conclusions] The Wii Fit balance board is a balance assessment tool with good reliability and high validity for elderly people, and we recommend it as an alternative tool for assessing balance ability. PMID:24259769
In vitro burn model illustrating heat conduction patterns using compressed thermal papers.
Lee, Jun Yong; Jung, Sung-No; Kwon, Ho
2015-01-01
To date, heat conduction from heat sources to tissue has been estimated by complex mathematical modeling. In the present study, we developed an intuitive in vitro skin burn model that illustrates heat conduction patterns inside the skin. This was composed of tightly compressed thermal papers with compression frames. Heat flow through the model left a trace by changing the color of thermal papers. These were digitized and three-dimensionally reconstituted to reproduce the heat conduction patterns in the skin. For standardization, we validated K91HG-CE thermal paper using a printout test and bivariate correlation analysis. We measured the papers' physical properties and calculated the estimated depth of heat conduction using Fourier's equation. Through contact burns of 5, 10, 15, 20, and 30 seconds on porcine skin and our burn model using a heated brass comb, and comparing the burn wound and heat conduction trace, we validated our model. The heat conduction pattern correlation analysis (intraclass correlation coefficient: 0.846, p < 0.001) and the heat conduction depth correlation analysis (intraclass correlation coefficient: 0.93, p < 0.001) showed statistically significant high correlations between the porcine burn wound and our model. Our model showed good correlation with porcine skin burn injury and replicated its heat conduction patterns. © 2014 by the Wound Healing Society.
Oviedo-Caro, Miguel Ángel; Bueno-Antequera, Javier; Munguía-Izquierdo, Diego
2018-03-19
To transculturally adapt the Spanish version of the Pregnancy Physical Activity Questionnaire (PPAQ) and analyze its psychometric properties. The PPAQ was transculturally adapted into Spanish. Test-retest reliability was evaluated in a subsample of 109 pregnant women. The validity was evaluated in a sample of 208 pregnant women who answered the questionnaire and wore a multi-sensor monitor for 7 valid days. The reliability (intraclass correlation coefficient), concordance (concordance correlation coefficient), correlation (Pearson correlation coefficient), agreement (Bland-Altman plots) and relative activity levels (Jonckheere-Terpstra test) between both administrations and methods were examined. Intraclass correlation coefficients between both administrations were good for all categories except transportation. A low but significant correlation was found for total activity (light and above), whereas no correlation was found for other intensities between both methods. Relative activity levels analysis showed a significant linear trend for increased total activity between both methods. The Spanish version of the PPAQ is a brief and easily interpretable questionnaire with good reliability and ability to rank individuals, and poor validity compared with the multi-sensor monitor. The use of the PPAQ provides information on pregnancy-specific activities in order to establish physical activity levels of pregnant women and adapt health promotion interventions. Copyright © 2018 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
ERIC Educational Resources Information Center
Stanley, Julian C.; Livingston, Samuel A.
Besides the ubiquitous Pearson product-moment r, there are a number of other measures of relationship that are attenuated by errors of measurement and for which the relationship between true measures can be estimated. Among these are the correlation ratio (eta squared), Kelley's unbiased correlation ratio (epsilon squared), Hays' omega squared,…
Relationships between Contextual and Task Performance and Interrater Agreement: Are There Any?
Díaz-Vilela, Luis F; Delgado Rodríguez, Naira; Isla-Díaz, Rosa; Díaz-Cabrera, Dolores; Hernández-Fernaud, Estefanía; Rosales-Sánchez, Christian
2015-01-01
Work performance is one of the most important dependent variables in Work and Organizational Psychology. The main objective of this paper was to explore the relationships between citizenship performance and task performance measures obtained from different appraisers and their consistency through a seldom-used methodology, intraclass correlation coefficients. Participants were 135 public employees, the total staff in a local government department. Jobs were clustered into job families through a work analysis based on standard questionnaires. A task description technique was used to develop a performance appraisal questionnaire for each job family, with three versions: self-, supervisor-, and peer-evaluation, in addition to a measure of citizenship performance. Significant correlations between task performance ratings appeared only when the self-appraisal bias was controlled. However, intraclass correlation analyses show that only self- (contextual and task) performance measures are consistent, while interrater agreement disappears. These results provide some interesting clues about the procedure of appraisal instrument development, the role of appraisers, and the importance of choosing adequate consistency analysis methods. PMID:26473956
Dong, Xiao-Yan; Wang, Lan; Tao, Yan-Xia; Suo, Xiu-Li; Li, Yue-Chuan; Liu, Fang; Zhao, Yue; Zhang, Qing
2017-01-01
Anxiety is a common comorbidity in patients with COPD in China, and it can significantly decrease patients' quality of life. Almost all anxiety measurements contain somatic items that can overlap with symptoms of COPD and side effects of medicines, which can lead to bias in measuring anxiety in patients with COPD. Therefore, a brief and disease-specific non-somatic anxiety measurement scale, the Anxiety Inventory for Respiratory Disease (AIR), which has been developed and validated in its English version, is needed for patients with COPD in China. A two-center study was conducted in two tertiary hospitals in Tianjin, China. A total of 181 outpatients with COPD (mean age 67.21±8.10 years, 32.6% women), who met the inclusion and exclusion criteria, were enrolled in the study. Test-retest reliability was examined using intraclass correlation coefficients. The internal consistency was calculated by Cronbach's α. Content validity was examined using the Content Validity Index (CVI), scale-level CVI/universal agreement, and scale-level CVI/average agreement (S-CVI/Ave). Convergent validity and construct validity were also examined. The AIR-C (AIR-Chinese version) scale had high test-retest reliability (intraclass correlation coefficient = 0.904) and internal consistency (Cronbach's α = 0.914); the content validity of the AIR-C scale was calculated by CVI, scale-level CVI/universal agreement, and S-CVI/Ave at values of 0.89-1, 0.90, and 0.98, respectively. Meanwhile, the AIR-C scale had good convergent validity, correlating with the Hospital Anxiety and Depression Scale-Anxiety (r = 0.81, P < 0.01), and there were significant correlations between the AIR-C and the Clinical COPD Questionnaire (CCQ; r = 0.44, P < 0.01) and the Activities of Daily Living Scale (ADLS; r = 0.36, P < 0.01). A two-factor model of general anxiety and panic symptoms in the AIR-C scale had the best fit according to Confirmatory Factor Analysis (CFA).
The AIR-C scale had a good reliability and validity for patients with COPD and can be used as a user-friendly and valid tool for measuring anxiety symptoms among patients with COPD in China.
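The three content validity figures reported above (item-level CVI, scale-level CVI/universal agreement, and S-CVI/Ave) can be illustrated with a minimal sketch. It assumes the common convention of a 4-point relevance scale on which ratings of 3 or 4 count as "relevant"; the abstract does not state the scale used, and the function name `content_validity` and the example ratings are hypothetical.

```python
def content_validity(ratings, relevant_min=3):
    """Compute content validity indices from expert relevance ratings.

    ratings[i][j] is the rating that expert j gave to item i.
    Assumption: a 4-point scale where ratings >= relevant_min are "relevant".
    Returns (item-level CVIs, S-CVI/Ave, S-CVI/UA).
    """
    # I-CVI: share of experts who rated the item as relevant
    i_cvi = [sum(r >= relevant_min for r in item) / len(item) for item in ratings]
    # S-CVI/Ave: mean of the item-level indices
    s_cvi_ave = sum(i_cvi) / len(i_cvi)
    # S-CVI/UA: share of items that every expert rated as relevant
    s_cvi_ua = sum(v == 1.0 for v in i_cvi) / len(i_cvi)
    return i_cvi, s_cvi_ave, s_cvi_ua


# Hypothetical ratings: 3 items, 4 experts each
example = [[4, 4, 3, 4], [4, 2, 3, 4], [1, 2, 3, 4]]
print(content_validity(example))
```

Because S-CVI/UA requires unanimous agreement on an item, it is usually the strictest of the three indices, which is in line with the ordering of the values reported in the abstract.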
Sample size requirements for the design of reliability studies: precision consideration.
Shieh, Gwowen
2014-09-01
In multilevel modeling, the intraclass correlation coefficient based on the one-way random-effects model is routinely employed to measure the reliability or degree of resemblance among group members. To facilitate the advocated practice of reporting confidence intervals in future reliability studies, this article presents exact sample size procedures for precise interval estimation of the intraclass correlation coefficient under various allocation and cost structures. Although the suggested approaches do not admit explicit sample size formulas and require special algorithms for carrying out iterative computations, they are more accurate than the closed-form formulas constructed from large-sample approximations with respect to the expected width and assurance probability criteria. This investigation notes the deficiency of existing methods and expands the sample size methodology for the design of reliability studies that have not previously been discussed in the literature.
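For orientation, the point estimate of the intraclass correlation coefficient under a balanced one-way random-effects model can be sketched from the ANOVA mean squares. This is only the textbook estimator, not the exact sample size procedure the article describes; the function name `one_way_icc` and the toy data are illustrative assumptions.

```python
from statistics import mean


def one_way_icc(groups):
    """Single-score and average-score ICC for a balanced one-way design.

    groups is a list of equal-length lists, one list of scores per group.
    Returns (icc_single, icc_average).
    """
    n = len(groups)        # number of groups
    k = len(groups[0])     # scores per group
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Between-group and within-group sums of squares
    ssb = k * sum((m - grand) ** 2 for m in group_means)
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    msb = ssb / (n - 1)
    msw = ssw / (n * (k - 1))
    icc_single = (msb - msw) / (msb + (k - 1) * msw)
    icc_average = (msb - msw) / msb
    return icc_single, icc_average


# Hypothetical data: 3 groups with 2 scores each
print(one_way_icc([[1, 2], [3, 4], [5, 6]]))
```

The two values are linked by the Spearman-Brown relation, so the average-score coefficient is never smaller than the single-score one when the latter is positive.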
[Validity of self-reported metabolic syndrome components in a cohort study].
Fernández-Montero, Alejandro; Beunza, Juan J; Bes-Rastrollo, Maira; Barrio, María T; de la Fuente-Arrillaga, Carmen; Moreno-Galarraga, Laura; Martínez-González, Miguel A
2011-01-01
To assess the accuracy of self-reported data needed to constitute the metabolic syndrome in the University of Navarra Follow-Up [Seguimiento Universidad de Navarra (SUN)] cohort. The SUN project is a multi-purpose prospective cohort, formed by more than 20,000 university graduates, followed-up using surface mail questionnaires every 2 years. In a sample of 287 cohort participants, self-reported data on the criteria needed to define the metabolic syndrome (waist circumference, blood pressure, triglycerides, high-density lipoprotein-cholesterol and glucose) were compared with the same biometric data obtained by blood tests or measured by trained medical staff. Intra-class correlation coefficients with 95% confidence intervals (95% CI), relative mean error and agreement limits according to the method proposed by Bland and Altman were calculated for each variable studied. High intraclass correlations were found for the values of waist circumference (r=0.86, 95% CI: 0.80-0.90) and triglycerides (r=0.71, 95%CI: 0.61-0.79). Moderate intraclass correlations were found (between 0.46 and 0.63) for the other factors. Relative mean errors were always<2.5%, and >91% of values were within the limits of agreement for all variables. The results suggest that self-declared data on the criteria of metabolic syndrome obtained in the SUN cohort, though with some caution, are sufficiently accurate to be used in epidemiological studies. Copyright © 2010 SESPAS. Published by Elsevier Espana. All rights reserved.
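The Bland and Altman agreement limits mentioned above are conventionally the mean difference plus or minus 1.96 standard deviations of the paired differences. The sketch below assumes that convention and uses invented numbers, not data from the SUN cohort; `limits_of_agreement` is a hypothetical helper name.

```python
from statistics import mean, stdev


def limits_of_agreement(self_reported, measured, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences; the limits are
    bias -/+ z times the sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(self_reported, measured)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired values (for example, waist circumference in cm)
bias, lower, upper = limits_of_agreement([10, 12, 14, 16], [9, 12, 15, 15])
print(bias, lower, upper)
```

The share of differences that fall between the two limits is the kind of quantity summarized in the abstract as the percentage of values within the limits of agreement.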
Spanish validation of the social stigma scale: Community Attitudes towards Mental Illness.
Ochoa, Susana; Martínez-Zambrano, Francisco; Vila-Badia, Regina; Arenas, Oti; Casas-Anguera, Emma; García-Morales, Esther; Villellas, Raúl; Martín, José Ramón; Pérez-Franco, María Belén; Valduciel, Tamara; García-Franco, Mar; Miguel, Jose; Balsera, Joaquim; Pascual, Gemma; Julia, Eugènia; Casellas, Diana; Haro, Josep Maria
2016-01-01
The stigma against people with mental illness is very high. In Spain there are currently no tools to assess this construct. The aim of this study was to validate the Spanish version of the Community Attitudes towards Mental Illness questionnaire in an adolescent population and to determine its internal consistency and temporal stability. A translation and back-translation of the Community Attitudes towards Mental Illness was performed. A total of 150 students between 14 and 18 years old were evaluated with this tool in two stages. Internal consistency was tested using Cronbach α, and the intraclass correlation coefficient was used for test-retest reliability. Gender-stratified analyses were also performed. The Cronbach α was 0.861 for the first evaluation and 0.909 for the second evaluation. The values of the intraclass correlation coefficient ranged from 0.339 to 0.775 in the item-by-item analysis, and between 0.81 and 0.88 in the subscales. In the analysis by gender, the intraclass correlation coefficient ranged from 0.797 to 0.863 for girls and from 0.774 to 0.889 for boys. In conclusion, the Community Attitudes towards Mental Illness is a reliable tool for the assessment of social stigma. Although results were reliable for both boys and girls, some gender differences were found in the analysis. Copyright © 2014 SEP y SEPB. Published by Elsevier España. All rights reserved.
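As a reminder of what the internal consistency figure measures, Cronbach's α can be sketched from the item variances and the variance of the total score. This is the standard formula applied to made-up responses, not the study's data; `cronbach_alpha` is an illustrative name.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of scores.

    responses[p][i] is the score of respondent p on item i.
    """
    k = len(responses[0])                          # number of items
    items = list(zip(*responses))                  # one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 3 respondents, 2 items
print(cronbach_alpha([[1, 2], [2, 3], [3, 5]]))
```

Alpha approaches 1 when the items rise and fall together across respondents, so it describes agreement among items at one time point, whereas the intraclass correlation coefficient in this study describes stability between the two evaluations.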
Valid statistical approaches for analyzing sholl data: Mixed effects versus simple linear models.
Wilson, Machelle D; Sethi, Sunjay; Lein, Pamela J; Keil, Kimberly P
2017-03-01
The Sholl technique is widely used to quantify dendritic morphology. Data from such studies, which typically sample multiple neurons per animal, are often analyzed using simple linear models. However, simple linear models fail to account for intra-class correlation that occurs with clustered data, which can lead to faulty inferences. Mixed effects models account for intra-class correlation that occurs with clustered data; thus, these models more accurately estimate the standard deviation of the parameter estimate, which produces more accurate p-values. While mixed models are not new, their use in neuroscience has lagged behind their use in other disciplines. A review of the published literature illustrates common mistakes in analyses of Sholl data. Analysis of Sholl data collected from Golgi-stained pyramidal neurons in the hippocampus of male and female mice using both simple linear and mixed effects models demonstrates that the p-values and standard deviations obtained using the simple linear models are biased downwards and lead to erroneous rejection of the null hypothesis in some analyses. The mixed effects approach more accurately models the true variability in the data set, which leads to correct inference. Mixed effects models avoid faulty inference in Sholl analysis of data sampled from multiple neurons per animal by accounting for intra-class correlation. Given the widespread practice in neuroscience of obtaining multiple measurements per subject, there is a critical need to apply mixed effects models more widely. Copyright © 2017 Elsevier B.V. All rights reserved.
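A rough way to see why ignoring intra-class correlation understates variability is the classical design effect for equal-sized clusters, 1 + (m - 1) × ICC. The sketch below uses that approximation with invented numbers; it is not the mixed effects analysis the authors performed, and the function names are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation for a mean when observations are clustered.

    Assumes equal cluster sizes and a common intra-class correlation.
    """
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


def se_inflation(cluster_size, icc):
    """Factor by which a naive standard error should be multiplied."""
    return math.sqrt(design_effect(cluster_size, icc))


# Hypothetical case: 100 neurons, 10 per animal, ICC of 0.3
print(design_effect(10, 0.3))
print(effective_sample_size(100, 10, 0.3))
print(se_inflation(10, 0.3))
```

Under these assumed numbers the 100 neurons carry roughly the information of 27 independent observations, which is the mechanism behind the downward-biased standard deviations and p-values described in the abstract.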
Sonographic measurements of the ulnar nerve at the elbow with different degrees of elbow flexion.
Patel, Prutha; Norbury, John W; Fang, Xiangming
2014-05-01
To determine whether there were differences in the cross-sectional area (CSA) and the flattening ratio of the normative ulnar nerve as it passes between the medial epicondyle and the olecranon at 30° of elbow flexion versus 90° of elbow flexion. Bilateral upper extremities of normal healthy adult volunteers were evaluated with ultrasound. The CSA and the flattening ratio of the ulnar nerve at the elbow as it passes between the medial epicondyle and the olecranon were measured, with the elbow flexed at 30° and at 90°, by 2 operators with varying ultrasound scanning experience by using ellipse and direct tracing methods. The results from the 2 different angles of elbow flexion were compared for each individual operator. Finally, intraclass correlations for absolute agreement and consistency between the 2 raters were calculated. An outpatient clinic room at a regional rehabilitation center. Twenty-five normal healthy adult volunteers. The mean CSA and the mean flattening ratio of the ulnar nerve at 30° of elbow flexion and at 90° of elbow flexion. First, for the ellipse method, the mean CSA of the ulnar nerve at 90° (9.93 mm(2)) was slightly larger than at 30° (9.77 mm(2)) for rater 1. However, for rater 2, the mean CSA of the ulnar nerve at 90° (6.80 mm(2)) was slightly smaller than at 30° (7.08 mm(2)). This was found to be statistically insignificant when using a matched pairs t test and the Wilcoxon signed-rank test, with a significance level of .05. Similarly, the difference between the right side and the left side was not statistically significant. The intraclass correlations for absolute agreement between the 2 raters were not very high due to different measurement locations, but the intraclass correlations for consistency were high. Second, for the direct tracing method, the mean CSA at 90° (7.26 mm(2)) was slightly lower than at 30° (7.48 mm(2)). 
This was found to be statistically nonsignificant when using the matched pairs t test and the Wilcoxon signed-rank test with a significance level of .05. There was no significant difference in the average flattening ratio between the 2 angles for the left arm (0.54 at 30° vs 0.56 at 90°; P = .619 for the matched pairs t test and .274 for the Wilcoxon signed-rank test). However, for the right arm, the flattening ratio at 90° was significantly higher than that at 30° (0.58 at 90° vs 0.50 at 30°; P = .007 for both the matched pairs t test and the Wilcoxon signed-rank test). The mean CSA of the ulnar nerve at the elbow at 30° was not significantly different than at 90°. However, the average flattening ratio at 90° was found to be significantly higher than at 30° for the right arm. Copyright © 2014 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
Bishop, Dorothy VM; Hardiman, Mervyn; Uwer, Ruth; von Suchodoletz, Waldemar
2007-01-01
It has been proposed that specific language impairment (SLI) is the consequence of low-level abnormalities in auditory perception. However, studies of long-latency auditory ERPs in children with SLI have generated inconsistent findings. A possible reason for this inconsistency is the heterogeneity of SLI. The intraclass correlation (ICC) has been proposed as a useful statistic for evaluating heterogeneity because it allows one to compare an individual's auditory ERP with the grand average waveform from a typically developing reference group. We used this method to reanalyse auditory ERPs from a sample previously described by Uwer, Albrecht and von Suchodoletz (2002). In a subset of children with receptive SLI, there was less correspondence (i.e. lower ICC) with the normative waveform (based on the control grand average) than for typically developing children. This poorer correspondence was seen in responses to both tone and speech stimuli for the period 100–228 ms post stimulus onset. The effect was lateralized and seen at right- but not left-sided electrodes. PMID:17683344
Jirativanont, T; Raksamani, K; Aroonpruksakul, N; Apidechakul, P; Suraseranivongse, S
2017-07-01
We sought to evaluate the validity of two non-technical skills evaluation instruments, the Anaesthetists' Non-Technical Skills (ANTS) behavioural marker system and the Ottawa Global Rating Scale (GRS), to apply them to anaesthesia training. The content validity, response process, internal structure, relations with other variables and consequences were described for validity evidence. Simulated crisis management sessions were initiated during which two trained raters evaluated the performance of postgraduate first-, second- and third-year (PGY-1, PGY-2 and PGY-3) anaesthesia residents. The study included 70 participants, composed of 24 PGY-1, 24 PGY-2 and 22 PGY-3 residents. Both instruments differentiated the non-technical skills of PGY-1 from PGY-3 residents (P < 0.05). Inter-rater agreement was measured using the intraclass correlation coefficient (ICC). For the ANTS instrument, the intraclass correlation coefficients for task management, team-working, situation awareness and decision-making were 0.79, 0.34, 0.81 and 0.70, respectively. For the Ottawa GRS, the intraclass correlation coefficients for overall performance, leadership, problem-solving, situation awareness, resource utilisation and communication skills were 0.86, 0.83, 0.84, 0.87, 0.80 and 0.86, respectively. The Cronbach's alpha for internal consistency of the ANTS instrument was 0.93, and was 0.96 for the Ottawa GRS. There was a high correlation between the ANTS and Ottawa GRS. The raters reported the ease of use of the Ottawa GRS compared to the ANTS. We found sufficient evidence of validity in the ANTS instrument and the Ottawa GRS for the evaluation of non-technical skills in a simulated anaesthesia setting, but the Ottawa GRS was more practical and had higher reliability.
Reliability of the Brazilian version of the Physical Activity Checklist Interview in children.
Adami, Fernando; Cruciani, Fernanda; Douek, Michelle; Sewell, Carolina Dumit; Mariath, Aline Brandão; Hinnig, Patrícia de Fragas; Freaza, Silvia Rafaela Mascarenhas; Bergamaschi, Denise Pimentel
2011-04-01
To assess the reliability of the Lista de Atividades Físicas (Brazilian version of the Physical Activity Checklist Interview) in children. The study is part of a cross-cultural adaptation of the Physical Activity Checklist Interview, conducted with 83 school children aged between seven and ten years, enrolled between the 2nd and 5th grades of primary education in the city of São Paulo, Southeastern Brazil, in 2008. The questionnaire was answered by the children through individual interviews. It comprises a list of 21 moderate to vigorous physical activities performed on the previous day, is divided into periods (before, during and after school), and has a section for interview assessment. This questionnaire enables the quantification of time spent in physical and sedentary activities and the total and weighted metabolic costs. Reliability was assessed by comparing two interviews conducted with a mean interval of three hours. For the interview assessment, data from the first interview and those from an external evaluator were compared. The Bland-Altman method, the intraclass correlation coefficient and Lin's concordance correlation coefficient were used to assess reliability. The intraclass correlation coefficient lower limits for the outcomes analyzed varied from 0.84 to 0.96. Precision and agreement varied between 0.83 and 0.97 and between 0.99 and 1, respectively. The line estimated from the pairs of values obtained in both interviews indicates high data precision. The interview item showing the poorest result was the ability to estimate time (fair in 27.7% of interviews). Interview assessment items showed intraclass correlation coefficients between 0.60 and 0.70, except for level of cooperation (0.46). The Brazilian version of the Physical Activity Checklist Interview shows high reliability to assess physical and sedentary activity on the previous day in children.
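The "precision" and "agreement" components reported above correspond to the usual decomposition of Lin's concordance correlation coefficient into the Pearson correlation and a bias correction factor. The sketch below assumes that reading and uses invented interview times; `lin_ccc` is a hypothetical helper, not code from the study.

```python
from statistics import mean


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient and its two components.

    Returns (ccc, precision, accuracy), where precision is the Pearson
    correlation and accuracy is the bias correction factor ccc / precision.
    Variances and the covariance use the 1/n form, as in Lin's definition.
    """
    mx, my = mean(x), mean(y)
    sxx = mean((a - mx) ** 2 for a in x)
    syy = mean((b - my) ** 2 for b in y)
    sxy = mean((a - mx) * (b - my) for a, b in zip(x, y))
    ccc = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    precision = sxy / (sxx * syy) ** 0.5
    accuracy = ccc / precision
    return ccc, precision, accuracy


# Hypothetical minutes of activity reported in two interviews
print(lin_ccc([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```

In this toy case the second interview is shifted by a constant, so the precision is perfect while the concordance is pulled down by the systematic offset, which is exactly the distinction the two components are meant to expose.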
Validation of the French version of the Burn Specific Health Scale-Brief (BSHS-B) questionnaire.
Gandolfi, S; Auquit-Auckbur, I; Panunzi, S; Mici, E; Grolleau, J-L; Chaput, B
2016-11-01
The Burn Specific Health Scale-Brief questionnaire is a widely validated tool for estimating the health related quality of life and for assessing the best multidisciplinary management of burn patients. The aim of this study was to translate the BSHS-B into French and to investigate its reliability and validity. According to the procedure proposed by the Scientific Advisory Committee of the Medical Outcomes Trust, the Burn Specific Health Scale-Brief (BSHS-B) was translated from the English version into French. In order to test the reliability of the French version of the BSHS-B, 53 French-speaking burn patients completed the BSHS-B and SF-36 questionnaires two to four years after their burn. Ten of them were retested 6 months after the first evaluation. To evaluate the clinical utility of the BSHS-F, internal consistency, construct validity (using SF-36) and stability in time were assessed using Cronbach's alpha statistic, Spearman rank test, and intra-class correlation coefficient respectively. Cronbach's alpha coefficient for the French version of the BSHS-B was 0.93 and was >0.80 for all the sub-domains. The French version of the BSHS-B and the SF-36 were positively correlated, and all the associations were statistically significant (p<0.01). Intra-class correlation coefficients for test-retest ranged between 0.95 and 0.99 for the sub-domains. The intra-class correlation coefficient (ICC) for the total score was 0.98. The French version of the BSHS-B shows a robust rate of internal consistency, construct validity and stability in time, supporting its application in routine clinical practice as well as in international studies. Copyright © 2016 Elsevier Ltd and ISBI. All rights reserved.
Botens-Helmus, Christine; Klein, Rolf; Stephan, Carola
2006-01-01
Background A new instrument to assess the stress scoliosis patients have whilst wearing their brace has been developed. The aim of this study was to test the reliability of this new instrument. Methods Eight questions are provided focussing on this topic only, including two questions to test the credibility. A maximum score of 24 can be achieved (from 0 for most stress to 24 for least stress). We have proposed a subdivision of the score values as follows: 0–8 (strong stress), 9–16 (medium stress) and 17–24 (little stress). 85 patients were invited to take part in this study and to complete the BSSQbrace questionnaire twice, once at the first presentation and a second time after a further three days. 62 patients with an average age of 14.5 years and an average Cobb angle of 40° returned their fully completed questionnaires. Results The average stress value was 12.5/24 at the first measurement and 12.4/24 at the second measurement. Ceiling value was 23; floor value 2. There was a correlation of 0.88 (Intraclass Correlation Coefficient) between the values of the two measurements. Cronbach alpha was 0.97. Conclusion The BSSQbrace questionnaire is reliable with good internal consistency and reproducibility. It can be used to measure the coping strategies a patient uses and the impairment a patient feels whilst wearing a brace. PMID:17176483
Jackson, Sarah E; van Jaarsveld, Cornelia Hm; Beeken, Rebecca J; Gunter, Marc J; Steptoe, Andrew; Wardle, Jane
2015-01-01
To examine the medium-term stability of anthropometric and cardio-metabolic parameters in the general population. Participants were 5160 men and women from the English Longitudinal Study of Ageing (age ≥50 years) assessed in 2004 and 2008. Anthropometric data included height, weight, BMI and waist circumference. Cardio-metabolic parameters included blood pressure, serum lipids (total cholesterol, HDL, LDL, triglycerides), hemoglobin, fasting glucose, fibrinogen and C-reactive protein. Stability of anthropometric variables was high (all intraclass correlations >0.92), although mean values changed slightly (-0.01 kg weight, +1.33 cm waist). Cardio-metabolic parameters showed more variation: correlations ranged from 0.43 (glucose) to 0.81 (HDL). The majority of participants (71-97%) remained in the same grouping relative to established clinical cut-offs. Over a 4-year period, anthropometric and cardio-metabolic parameters showed good stability. These findings suggest that when no means to obtain more recent data exist, a one-time sample will give a reasonable approximation to average levels over the medium-term, although reliability is reduced.
Optimal sample sizes for the design of reliability studies: power consideration.
Shieh, Gwowen
2014-09-01
Intraclass correlation coefficients are used extensively to measure the reliability or degree of resemblance among group members in multilevel research. This study concerns the problem of the necessary sample size to ensure adequate statistical power for hypothesis tests concerning the intraclass correlation coefficient in the one-way random-effects model. In view of the incomplete and problematic numerical results in the literature, the approximate sample size formula constructed from Fisher's transformation is reevaluated and compared with an exact approach across a wide range of model configurations. These comprehensive examinations showed that the Fisher transformation method is appropriate only under limited circumstances, and therefore it is not recommended as a general method in practice. For advance design planning of reliability studies, the exact sample size procedures are fully described and illustrated for various allocation and cost schemes. Corresponding computer programs are also developed to implement the suggested algorithms.
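For orientation, the quantity these sample size procedures target can be estimated from the ANOVA mean squares of a balanced one-way random-effects design. The sketch below is not the exact power procedure described in the abstract; it only shows the standard single-rating and average-rating ICC estimators. The function name `icc_oneway` and the example data are illustrative assumptions.

```python
from statistics import mean

def icc_oneway(ratings):
    """Estimate ICCs from a balanced one-way random-effects layout.

    ratings: list of rows, one row per subject, each holding k ratings.
    Returns (single-rating ICC, average-rating ICC).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-subject and within-subject sums of squares
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(ratings, row_means) for x in row)
    msb = ss_between / (n - 1)
    msw = ss_within / (n * (k - 1))
    icc_single = (msb - msw) / (msb + (k - 1) * msw)
    icc_average = (msb - msw) / msb
    return icc_single, icc_average

# Hypothetical data: 3 subjects, 2 ratings each
single, average = icc_oneway([[1, 2], [3, 4], [5, 6]])
```

The average-rating estimator is never smaller than the single-rating one when the between-subject mean square exceeds the within-subject mean square, which is why the two should not be reported interchangeably.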
Suzuki, Kenji; Epstein, Mark L.; Kohlbrenner, Ryan; Garg, Shailesh; Hori, Masatoshi; Oto, Aytekin; Baron, Richard L.
2014-01-01
OBJECTIVE The purpose of this study was to evaluate automated CT volumetry in the assessment of living-donor livers for transplant and to compare this technique with software-aided interactive volumetry and manual volumetry. MATERIALS AND METHODS Hepatic CT scans of 18 consecutively registered prospective liver donors were obtained under a liver transplant protocol. Automated liver volumetry was developed on the basis of 3D active-contour segmentation. To establish reference standard liver volumes, a radiologist manually traced the contour of the liver on each CT slice. We compared the results obtained with automated and interactive volumetry with those obtained with the reference standard for this study, manual volumetry. RESULTS The average interactive liver volume was 1553 ± 343 cm³, and the average automated liver volume was 1520 ± 378 cm³. The average manual volume was 1486 ± 343 cm³. Both interactive and automated volumetric results had excellent agreement with manual volumetric results (intraclass correlation coefficients, 0.96 and 0.94). The average user time for automated volumetry was 0.57 ± 0.06 min/case, whereas those for interactive and manual volumetry were 27.3 ± 4.6 and 39.4 ± 5.5 min/case, the difference being statistically significant (p < 0.05). CONCLUSION Both interactive and automated volumetry are accurate for measuring liver volume with CT, but automated volumetry is substantially more efficient. PMID:21940543
Suzuki, Kenji; Epstein, Mark L; Kohlbrenner, Ryan; Garg, Shailesh; Hori, Masatoshi; Oto, Aytekin; Baron, Richard L
2011-10-01
The purpose of this study was to evaluate automated CT volumetry in the assessment of living-donor livers for transplant and to compare this technique with software-aided interactive volumetry and manual volumetry. Hepatic CT scans of 18 consecutively registered prospective liver donors were obtained under a liver transplant protocol. Automated liver volumetry was developed on the basis of 3D active-contour segmentation. To establish reference standard liver volumes, a radiologist manually traced the contour of the liver on each CT slice. We compared the results obtained with automated and interactive volumetry with those obtained with the reference standard for this study, manual volumetry. The average interactive liver volume was 1553 ± 343 cm³, and the average automated liver volume was 1520 ± 378 cm³. The average manual volume was 1486 ± 343 cm³. Both interactive and automated volumetric results had excellent agreement with manual volumetric results (intraclass correlation coefficients, 0.96 and 0.94). The average user time for automated volumetry was 0.57 ± 0.06 min/case, whereas those for interactive and manual volumetry were 27.3 ± 4.6 and 39.4 ± 5.5 min/case, the difference being statistically significant (p < 0.05). Both interactive and automated volumetry are accurate for measuring liver volume with CT, but automated volumetry is substantially more efficient.
Practice of preventive dentistry for nursing staff in primary care
Acuña-Reyes, Raquel; Cigarroa-Martínez, Didier; Ureña-Bogarín, Enrique; Orgaz-Fernández, Jose David
2014-01-01
Objectives: Determine the domain of preventive dentistry in nursing personnel assigned to a primary care unit. Methods: Prospective descriptive study, questionnaire validation, and prevalence study. In the first stage, the questionnaire for the practice of preventive dentistry (CPEP, for the term in Spanish) was validated; consistency and reliability were measured by Cronbach's alpha, Pearson's correlation, and factor analysis with intra-class correlation coefficient (ICC). In the second stage, the domain of preventive dentistry among nurses was explored. Results: The overall internal consistency of the CPEP is α= 0.66, ICC= 0.64, CI95%: 0.29-0.87 (p >0.01). Twenty-one subjects were included in the study (average age 43, 81.0% female, average seniority 12.5). A total of 71.5% showed weak domain, 28.5% regular domain, and no questionnaire yielded a good domain result. The older the subjects were, the smaller the domain; female nurses showed greater mastery of preventive dentistry (29%, CI95%: 0.1-15.1) than male nurses. Public health nurses showed greater mastery with respect to other categories (50%, CI95%: 0.56-2.8). Conclusions: The CPEP has enough consistency to explore the domain of preventive dentistry in health-care staff. The domain of preventive dentistry in primary care nursing is poor and needs to be strengthened in order to provide education in preventive dentistry to the insured population. PMID:25386037
Neltner, Janna Hackett; Abner, Erin Lynn; Schmitt, Frederick A; Denison, Stephanie Kay; Anderson, Sonya; Patel, Ela; Nelson, Peter T
2012-12-01
Quantitative neuropathologic methods provide information that is important for both research and clinical applications. The technologic advancement of digital pathology and image analysis offers new solutions to enable valid quantification of pathologic severity that is reproducible between raters regardless of experience. Using an Aperio ScanScope XT and its accompanying image analysis software, we designed algorithms for quantitation of amyloid and tau pathologies on 65 β-amyloid (6F/3D antibody) and 48 phospho-tau (PHF-1)-immunostained sections of human temporal neocortex. Quantitative digital pathologic data were compared with manual pathology counts. There were excellent correlations between manually counted and digitally analyzed neuropathologic parameters (R² = 0.56-0.72). Data were highly reproducible among 3 participants with varying degrees of expertise in neuropathology (intraclass correlation coefficient values, >0.910). Digital quantification also provided additional parameters, including average plaque area, which shows statistically significant differences when samples are stratified according to apolipoprotein E allele status (average plaque area, 380.9 μm² in apolipoprotein E ε4 carriers vs 274.4 μm² for noncarriers; p < 0.001). Thus, digital pathology offers a rigorous and reproducible method for quantifying Alzheimer disease neuropathologic changes and may provide additional insights into morphologic characteristics that were previously more challenging to assess because of technical limitations.
Rajyalakshmi, R.; Prakash, Winston D.; Ali, Mohammad Javed; Naik, Milind N.
2017-01-01
Purpose: To assess the reliability and repeatability of periorbital biometric measurements using ImageJ software and to assess if the horizontal visible iris diameter (HVID) serves as a reliable scale for facial measurements. Methods: This study was a prospective, single-blind, comparative study. Two clinicians performed 12 periorbital measurements on 100 standardised face photographs. Each individual’s HVID was determined by Orbscan IIz and used as a scale for measurements using ImageJ software. All measurements were repeated using the ‘average’ HVID of the study population as a measurement scale. Intraclass correlation coefficient (ICC) and Pearson product-moment coefficient were used as statistical tests to analyse the data. Results: The range of ICC for intra- and interobserver variability was 0.79–0.99 and 0.86–0.99, respectively. Test-retest reliability ranged from 0.66–1.0 to 0.77–0.98, respectively. When average HVID of the study population was used as scale, ICC ranged from 0.83 to 0.99, and the test-retest reliability ranged from 0.83 to 0.96 and the measurements correlated well with recordings done with individual Orbscan HVID measurements. Conclusion: Periorbital biometric measurements using ImageJ software are reproducible and repeatable. Average HVID of the population as measured by Orbscan is a reliable scale for facial measurements. PMID:29403183
Development and validation of a new Prescription Quality Index
Hassan, Norul Badriah; Ismail, Hasanah Che; Naing, Lin; Conroy, Ronán M; Abdul Rahman, Abdul Rashid
2010-01-01
AIMS The aims were to develop and validate a new Prescription Quality Index (PQI) for the measurement of prescription quality in chronic diseases. METHODS The PQI was developed and validated based on three separate surveys and one pilot study. Criteria were developed based on literature search, discussions and brainstorming sessions. Validity of the criteria was examined using a modified Delphi method. Pre-testing was performed on 30 patients suffering from chronic diseases. The modified version was then subjected to reviews by pharmacists and clinicians in two separate surveys. The rater-based PQI with 22 criteria was then piloted in 120 patients with chronic illnesses. Results were analysed using SPSS version 12.0.1. RESULTS Exploratory principal components analysis revealed multiple factors contributing to prescription quality. Cronbach's α for the entire 22 criteria was 0.60. The average intra-rater and inter-rater reliability showed good to moderate stability (intraclass correlation coefficient 0.76 and 0.52, respectively). The PQI was significantly and negatively correlated with age (correlation coefficient −0.34, P < 0.001), number of drugs in prescriptions (correlation coefficient −0.51, P < 0.001) and number of chronic diseases/conditions (correlation coefficient −0.35, P < 0.001). CONCLUSIONS The PQI is a promising new instrument for measuring prescription quality. It has been shown that the PQI is a valid, reliable and responsive tool to measure quality of prescription in chronic diseases. PMID:20840442
A comparison of computer-assisted and manual wound size measurement.
Thawer, Habiba A; Houghton, Pamela E; Woodbury, M Gail; Keast, David; Campbell, Karen
2002-10-01
Accurate and precise wound measurements are a critical component of every wound assessment. To examine the reliability and validity of a new computerized technique for measuring human and animal wounds, chronic human wounds (N = 45) and surgical animal wounds (N = 38) were assessed using manual and computerized techniques. Using intraclass correlation coefficients, intrarater and interrater reliability of surface area measurements obtained using the computerized technique were compared to those obtained using acetate tracings and planimetry. A single measurement of surface area using either technique produced excellent intrarater and interrater reliability for both human and animal wounds, but the computerized technique was more precise than the manual technique for measuring the surface area of animal wounds. For both types of wounds and measurement techniques, intrarater and interrater reliability improved when the average of three repeated measurements was obtained. The precision of each technique with human wounds and the precision of the manual technique with animal wounds also improved when three repeated measurement results were averaged. Concurrent validity between the two techniques was excellent for human wounds but poor for the smaller animal wounds, regardless of whether single or the average of three repeated surface area measurements was used. The computerized technique permits reliable and valid assessment of the surface area of both human and animal wounds.
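The improvement in reliability from averaging three repeated measurements is what the Spearman-Brown relationship predicts under the assumption of parallel measurements. The abstract does not state that this formula was used; the sketch below only illustrates the general relationship, and the function name and input values are hypothetical.

```python
def spearman_brown(r_single, k):
    """Predicted reliability of the mean of k parallel measurements,
    given the reliability r_single of one measurement."""
    return k * r_single / (1 + (k - 1) * r_single)

# Hypothetical single-measurement reliability of 0.50, averaged over 3 repeats
r_avg = spearman_brown(0.50, 3)
```

With a single-measurement reliability that is already high, the gain from averaging is small, which is consistent with averaging mattering most for the less precise technique.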
BurnCase 3D software validation study: Burn size measurement accuracy and inter-rater reliability.
Parvizi, Daryousch; Giretzlehner, Michael; Wurzer, Paul; Klein, Limor Dinur; Shoham, Yaron; Bohanon, Fredrick J; Haller, Herbert L; Tuca, Alexandru; Branski, Ludwik K; Lumenta, David B; Herndon, David N; Kamolz, Lars-P
2016-03-01
The aim of this study was to compare the accuracy of burn size estimation using the computer-assisted software BurnCase 3D (RISC Software GmbH, Hagenberg, Austria) with that using a 2D scan, considered to be the actual burn size. Thirty artificial burn areas were preplanned and prepared on three mannequins (one child, one female, and one male). Five trained physicians (raters) were asked to assess the size of all wound areas using BurnCase 3D software. The results were then compared with the real wound areas, as determined by 2D planimetry imaging. To examine inter-rater reliability, we performed an intraclass correlation analysis with a 95% confidence interval. The mean wound area estimations of the five raters using BurnCase 3D were in total 20.7±0.9% for the child, 27.2±1.5% for the female and 16.5±0.1% for the male mannequin. Our analysis showed relative overestimations of 0.4%, 2.8% and 1.5% for the child, female and male mannequins respectively, compared to the 2D scan. The intraclass correlation between the single raters for mean percentage of the artificial burn areas was 98.6%. There was also a high intraclass correlation between the single raters and the 2D scan. BurnCase 3D is a valid and reliable tool for the determination of total body surface area burned in standard models. Further clinical studies including different pediatric and overweight adult mannequins are warranted. Copyright © 2016 Elsevier Ltd and ISBI. All rights reserved.
Glaucoma diagnosis by mapping macula with Fourier domain optical coherence tomography
NASA Astrophysics Data System (ADS)
Tan, Ou; Lu, Ake; Chopra, Vik; Varma, Rohit; Hiroshi, Ishikawa; Schuman, Joel; Huang, David
2008-03-01
A new image segmentation method was developed to detect the boundaries of macular retinal sub-layers on newly developed Fourier-Domain Optical Coherence Tomography (FD-OCT) with a macular grid scan pattern. The segmentation results were used to create a thickness map of the macular ganglion cell complex (GCC), which contains the ganglion cell dendrites, cell bodies and axons. Overall average and several pattern analysis parameters were defined on the GCC thickness map and compared for the diagnosis of glaucoma. Intraclass correlation (ICC) is used to compare the reproducibility of the parameters. Area under the receiver operating characteristic curve (AROC) was calculated to compare the diagnostic power. The result is also compared to the output of clinical time-domain OCT (TD-OCT). We found that GCC-based parameters had good repeatability and comparable diagnostic power with circumpapillary nerve fiber layer (cpNFL) thickness. Parameters based on pattern analysis can increase the diagnostic power of GCC macular mapping.
Development of a reliable method to assess footwear comfort during running.
Mündermann, Anne; Nigg, Benno M; Stefanyshyn, Darren J; Humble, R Neil
2002-08-01
The purposes of this study were: (a) to determine whether subjects are able to distinguish between differences in footwear with respect to footwear comfort; and (b) to determine how reliably footwear comfort can be assessed using a visual analogue scale (VAS) and a protocol including a control condition during running. Intraclass correlation coefficients (ICCs) between comfort ratings for repeated conditions were high (ICC = 0.799). Differences in comfort ratings between the insert conditions were significant. A paired t-test revealed a significant difference in overall comfort ratings for the control insert when tested after the soft insert compared to when tested after the hard insert (P = 0.008). The results of this study showed that VASs provide a reliable measure to assess footwear comfort during running under the conditions that: (a) a control condition is included; and (b) the average comfort rating of sessions 4-6 is used. Copyright 2002 Elsevier Science B.V.
Cross-cultural Adaptation of the "Functional Activities Questionnaire - FAQ" for use in Brazil
Sanchez, Maria Angélica dos Santos; Correa, Pricila Cristina Ribeiro; Lourenço, Roberto Alves
2011-01-01
Objective The aim of this paper was to present the results of the first stage of cross-cultural adaptation of the Functional Activities Questionnaire (FAQ). Methods The tool was subjected to translation and re-translation, and the test-retest reliability of a proposed version for use in Brazil was analyzed. Results Of the 548 questionnaire respondents, a convenience sample of 68 informants was selected for retesting. Internal consistency was measured by Cronbach's alpha (0.95) while test-retest reliability was assessed using intra-class correlation (0.97). The findings have shown that FAQ is brief - averaging seven minutes to apply, easily understood and has good intra-rater test-retest reliability. Conclusion Our results suggest this adapted version of the FAQ is a reliable and stable tool which may be useful for assessing function in Brazilian elderly. Notwithstanding, the version should be subjected to further analysis with the aim of reaching functional equivalence. PMID:29213759
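Internal consistency figures such as the Cronbach's alpha reported above follow from the item variances and the variance of the total score. The following is a minimal sketch of that standard formula; the function name and the toy responses are assumptions, not data from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, one column per item.
    """
    k = len(scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 3 respondents, 2 items
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Alpha describes consistency among items at one administration, whereas the intra-class correlation reported alongside it describes stability across the test and retest.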
The accuracy of nurses' estimates of their absenteeism.
Gaudine, Alice; Gregory, Connie
2010-07-01
The purpose of the present study was to determine the accuracy of nurses' self-reports of absence by examining: (1) the correlation, intra-class correlation, and Cronbach's alpha for self-reported absence and absence as reported in organizational records, (2) difference in central tendency for the two measures of absence and (3) the percentage of nurses who underestimate their absence. Research on nurses' absenteeism has often relied on self-reports of absence. However, nurses may not be aware of their actual absenteeism, or they may underestimate it. Self-reported absence from questionnaires completed by 215 Canadian nurses was compared with their absence from organizational records. There is a strong positive correlation, a strong intra-class correlation and Cronbach's alpha for the two measures of absence. However, there is a difference in central tendency that is related to the majority of nurses in this study (51.1%) underestimating their days absent from work. Research examining the predictors of absence may consider measuring absence with self-reports. Nevertheless, nurses demonstrated a bias to underestimate their absence. Feedback interventions to reduce absenteeism can be developed to include providing nurses with accurate information about their absence.
[Intraclass reliability of the Alberta Infant Motor Scale in the Brazilian version].
Silva, Larissa Paiva; Maia, Polyana Candeia; Lopes, Márcia Maria Coelho Oliveira; Cardoso, Maria Vera Lúcia Moreira Leitão
2013-10-01
This study had as its objective to analyze the intraclass reliability of the Alberta Infant Motor Scale (AIMS), in the Brazilian version, in preterm and term infants. It was a methodological study, conducted from November 2009 to April 2010, with 50 children receiving care in two public institutions in Fortaleza, Ceará, Brazil. Children were grouped according to gestational age as preterm and term, and evaluated by three evaluators in the communication laboratory of a public institution or at home. The intraclass correlation indices for the categories prone, supine, sitting and standing ranged from 0.553 to 0.952; most remained above 0.800, except for the standing category of the third evaluator, in which the index was 0.553. As for the total score and percentile, rates ranged from 0.843 to 0.954. The scale proved to be a reliable instrument for assessing gross motor performance of Brazilian children, particularly in Ceará, regardless of gestational age at birth.
Validity and reliability of the Self-Reported Physical Fitness (SRFit) survey.
Keith, NiCole R; Clark, Daniel O; Stump, Timothy E; Miller, Douglas K; Callahan, Christopher M
2014-05-01
An accurate physical fitness survey could be useful in research and clinical care. To estimate the validity and reliability of a Self-Reported Fitness (SRFit) survey; an instrument that estimates muscular fitness, flexibility, cardiovascular endurance, BMI, and body composition (BC) in adults ≥ 40 years of age. 201 participants completed the SF-36 Physical Function Subscale, International Physical Activity Questionnaire (IPAQ), Older Adults' Desire for Physical Competence Scale (Rejeski), the SRFit survey, and the Rikli and Jones Senior Fitness Test. BC, height and weight were measured. SRFit survey items described BC, BMI, and Senior Fitness Test movements. Correlations between the Senior Fitness Test and the SRFit survey assessed concurrent validity. Cronbach's Alpha measured internal consistency within each SRFit domain. SRFit domain scores were compared with SF-36, IPAQ, and Rejeski survey scores to assess construct validity. Intraclass correlations evaluated test-retest reliability. Correlations between SRFit and the Senior Fitness Test domains ranged from 0.35 to 0.79. Cronbach's Alpha scores were .75 to .85. Correlations between SRFit and other survey scores were -0.23 to 0.72 and in the expected direction. Intraclass correlation coefficients were 0.79 to 0.93. All P-values were 0.001. Initial evaluation supports the SRFit survey's validity and reliability.
Critical analysis of consecutive unilateral cleft lip repairs: determining ideal sample size.
Power, Stephanie M; Matic, Damir B
2013-03-01
Objective: Cleft surgeons often show 10 consecutive lip repairs to reduce presentation bias; however, the validity of this practice remains unknown. The purpose of this study is to determine the number of consecutive cases that represent average outcomes. Secondary objectives are to determine if outcomes correlate with cleft severity and to calculate interrater reliability. Design: Consecutive preoperative and 2-year postoperative photographs of the unilateral cleft lip-nose complex were randomized and evaluated by cleft surgeons. Parametric analysis was performed according to chronologic, consecutive order. The mean standard deviation over all raters enabled calculation of expected 95% confidence intervals around a mean tested for various sample sizes. Setting: Meeting of the American Cleft Palate-Craniofacial Association in 2009. Patients, Participants: Ten senior cleft surgeons evaluated 39 consecutive lip repairs. Main Outcome Measures: Preoperative severity and postoperative outcomes were evaluated using descriptive and quantitative scales. Results: Intraclass correlation coefficients for cleft severity and postoperative evaluations were 0.65 and 0.21, respectively. Outcomes did not correlate with cleft severity (P = .28). Calculations for 10 consecutive cases demonstrated wide 95% confidence intervals, spanning two points on both postoperative grading scales. Ninety-five percent confidence intervals narrowed within one qualitative grade (±0.30) and one point (±0.50) on the 10-point scale for 27 consecutive cases. Conclusions: Larger numbers of consecutive cases (n > 27) are increasingly representative of average results, but less practical in presentation format. Ten consecutive cases lack statistical support. Cleft surgeons showed low interrater reliability for postoperative assessments, which may reflect personal bias when evaluating another surgeon's results.
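The narrowing of the confidence interval with more consecutive cases can be illustrated with the usual normal-approximation half-width, z·SD/√n. This is a sketch under stated assumptions: the abstract does not report the standard deviation or say whether a z or t multiplier was used, so the `sd` values and function names below are hypothetical.

```python
import math

def ci_half_width(sd, n, z=1.96):
    """Half-width of an approximate 95% CI around a mean of n cases."""
    return z * sd / math.sqrt(n)

def min_cases(sd, target, z=1.96):
    """Smallest n whose approximate 95% CI half-width is at most target."""
    return math.ceil((z * sd / target) ** 2)

# Hypothetical rater standard deviation of 1.0 scale points
width_at_10 = ci_half_width(1.0, 10)
needed = min_cases(1.0, 0.5)
```

Because the half-width shrinks with the square root of n, halving the interval requires roughly four times as many cases.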
Measurement of the area of venous ulcers using two software programs
Eberhardt, Thaís Dresch; de Lima, Suzinara Beatriz Soares; Lopes, Luis Felipe Dias; Borges, Eline de Lima; Weiller, Teresinha Heck; da Fonseca, Graziele Gorete Portella
2016-01-01
ABSTRACT Objective: to compare the measurement area of venous ulcers using AutoCAD® and Image Tool software. Method: this was an assessment of reproducibility tests conducted in an angiology clinic of a university hospital. Data were collected from 21 patients with venous ulcers, in the period from March to July of 2015, using a collection form and photographs of wounds. Five nurses (evaluators) of the hospital skin wound study group participated. The wounds were measured using both software programs. Data were analyzed using intraclass correlation coefficient, concordance correlation coefficient and Bland-Altman analysis. The study met the ethical aspects in accordance with current legislation. Results: the size of ulcers varied widely, but without significant difference between the measurements; an excellent intraclass and concordance correlation was found between both software programs, which seem to be more accurate when measuring a wound area >10 cm². Conclusion: the use of both software programs is appropriate for measurement of venous ulcers, appearing to be more accurate when used to measure a wound area > 10 cm². PMID:27992028
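A Bland-Altman analysis of two measurement methods reduces to the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 SD of the paired differences. The sketch below shows that calculation only; the function name and the paired areas are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical wound areas (cm²) from two software programs
bias, lower, upper = bland_altman([10, 12, 14, 16], [9, 12, 13, 16])
```

Unlike a correlation coefficient, the limits of agreement are expressed in the measurement's own units, so they show directly how far the two programs can be expected to differ on a single wound.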
Suzuki, T; Sato, Y; Sotome, S; Arai, H; Arai, A; Yoshida, H
2017-06-01
This study was designed to investigate the reliability and validity of measurements of finger diameters with a ring gauge. A reliability study enrolled two independent samples (50 participants and seven examiners in Study I; 26 participants and 26 examiners in Study II). The sizes of each participant's little fingers were measured twice with a ring gauge by each examiner. To investigate the validity of the measurements, five hand therapists compared the finger size and hand volume of 30 participants with the ring gauge and with a figure-of-eight technique (Study III). The intra-class correlation coefficient for intra-observer reliability ranged from 0.97 to 0.99 in Study I, and 0.90 to 0.97 in Study II. The intra-class correlation coefficient for inter-observer reliability was 0.95 in Study I and 0.94 in Study II. The validity study showed a Pearson product moment correlation coefficient of 0.75. The ring gauge showed high reliability and validity for measurement of finger size. Level of evidence: III, diagnostic.
Jaciw, Andrew P; Lin, Li; Ma, Boya
2016-10-18
Prior research has investigated design parameters for assessing average program impacts on achievement outcomes with cluster randomized trials (CRTs). Less is known about parameters important for assessing differential impacts. This article develops a statistical framework for designing CRTs to assess differences in impact among student subgroups and presents initial estimates of critical parameters. Effect sizes and minimum detectable effect sizes for average and differential impacts are calculated before and after conditioning on effects of covariates using results from several CRTs. Relative sensitivities to detect average and differential impacts are also examined. Student outcomes from six CRTs are analyzed. Achievement in math, science, reading, and writing. The ratio of between-cluster variation in the slope of the moderator divided by total variance-the "moderator gap variance ratio"-is important for designing studies to detect differences in impact between student subgroups. This quantity is the analogue of the intraclass correlation coefficient. Typical values were .02 for gender and .04 for socioeconomic status. For studies considered, in many cases estimates of differential impact were larger than of average impact, and after conditioning on effects of covariates, similar power was achieved for detecting average and differential impacts of the same size. Measuring differential impacts is important for addressing questions of equity, generalizability, and guiding interpretation of subgroup impact findings. Adequate power for doing this is in some cases reachable with CRTs designed to measure average impacts. Continuing collection of parameters for assessing differential impacts is the next step. © The Author(s) 2016.
Hruska, Carrie B; Geske, Jennifer R; Swanson, Tiffinee N; Mammel, Alyssa N; Lake, David S; Manduca, Armando; Conners, Amy Lynn; Whaley, Dana H; Scott, Christopher G; Carter, Rickey E; Rhodes, Deborah J; O'Connor, Michael K; Vachon, Celine M
2018-06-05
Background parenchymal uptake (BPU), which refers to the level of Tc-99m sestamibi uptake within normal fibroglandular tissue on molecular breast imaging (MBI), has been identified as a breast cancer risk factor, independent of mammographic density. Prior analyses have used subjective categories to describe BPU. We evaluate a new quantitative method for assessing BPU by testing its reproducibility, comparing quantitative results with previously established subjective BPU categories, and determining the association of quantitative BPU with breast cancer risk. Two nonradiologist operators independently performed region-of-interest analysis on MBI images viewed in conjunction with corresponding digital mammograms. Quantitative BPU was defined as a unitless ratio of the average pixel intensity (counts/pixel) within the fibroglandular tissue versus the average pixel intensity in fat. Operator agreement and the correlation of quantitative BPU measures with subjective BPU categories assessed by expert radiologists were determined. Percent density on mammograms was estimated using Cumulus. The association of quantitative BPU with breast cancer (per one unit BPU) was examined within an established case-control study of 62 incident breast cancer cases and 177 matched controls. Quantitative BPU ranged from 0.4 to 3.2 across all subjects and was on average higher in cases compared to controls (1.4 versus 1.2, p < 0.007 for both operators). Quantitative BPU was strongly correlated with subjective BPU categories (Spearman's r = 0.59 to 0.69, p < 0.0001, for each paired combination of two operators and two radiologists). Interoperator and intraoperator agreement in the quantitative BPU measure, assessed by intraclass correlation, was 0.92 and 0.98, respectively. Quantitative BPU measures showed either no correlation or weak negative correlation with mammographic percent density. 
In a model adjusted for body mass index and percent density, higher quantitative BPU was associated with increased risk of breast cancer for both operators (OR = 4.0, 95% confidence interval (CI) 1.6-10.1, and 2.4, 95% CI 1.2-4.7). Quantitative measurement of BPU, defined as the ratio of average counts in fibroglandular tissue relative to that in fat, can be reliably performed by nonradiologist operators with a simple region-of-interest analysis tool. Similar to results obtained with subjective BPU categories, quantitative BPU is a functional imaging biomarker of breast cancer risk, independent of mammographic density and hormonal factors.
Kang, Kun-Tai; Chiu, Shuenn-Nan; Weng, Wen-Chin; Lee, Pei-Lin; Hsu, Wei-Chung
2017-03-01
To compare office blood pressure (BP) and 24-hour ambulatory BP (ABP) monitoring to facilitate the diagnosis and management of hypertension in children with obstructive sleep apnea (OSA). Children aged 4-16 years with OSA-related symptoms were recruited from a tertiary referral medical center. All children underwent overnight polysomnography, office BP, and 24-hour ABP studies. Multiple linear regression analyses were applied to elucidate the association between the apnea-hypopnea index and BP. Correlation and consistency between office BP and 24-hour ABP were measured by Pearson correlation, intraclass correlation, and Bland-Altman analyses. A total of 163 children were enrolled (mean age, 8.2 ± 3.3 years; 67% male). The prevalence of systolic hypertension at night was significantly higher in children with moderate-to-severe OSA than in those with primary snoring (44.9% vs 16.1%, P = .006). Pearson correlation and intraclass correlation analyses revealed associations between office BP and 24-hour BP, and Bland-Altman analysis indicated an agreement between office and 24-hour BP measurements. However, multiple linear regression analyses demonstrated that 24-hour BP (nighttime systolic BP and mean arterial pressure), unlike office BP, was independently associated with the apnea-hypopnea index, after adjustment for adiposity variables. Twenty-four-hour ABP is more strongly correlated with OSA in children, compared with office BP. Copyright © 2016 Elsevier Inc. All rights reserved.
Murray, Aileen; Hall, Amanda; Williams, Geoffrey C; McDonough, Suzanne M; Ntoumanis, Nikos; Taylor, Ian; Jackson, Ben; Copsey, Bethan; Hurley, Deirdre A; Matthews, James
2018-02-27
To assess the inter-rater reliability and concurrent validity of the Communication Evaluation in Rehabilitation Tool, which aims to externally assess physiotherapists' competency in using Self-Determination Theory-based communication strategies in practice. Audio recordings of initial consultations between 24 physiotherapists and 24 patients with chronic low back pain in four hospitals in Ireland were obtained as part of a larger randomised controlled trial. Three raters, all of whom had PhDs in psychology and expertise in motivation and physical activity, independently listened to the 24 audio recordings and completed the 18-item Communication Evaluation in Rehabilitation Tool. Inter-rater reliability between all three raters was assessed using intraclass correlation coefficients. Concurrent validity was assessed using Pearson's r correlations with a reference standard, the Health Care Climate Questionnaire. The total score for the Communication Evaluation in Rehabilitation Tool is an average of all 18 items. Total scores demonstrated good inter-rater reliability (Intraclass Correlation Coefficient (ICC) = 0.8) and concurrent validity with the Health Care Climate Questionnaire total score (range: r = 0.7-0.88). Item-level scores of the Communication Evaluation in Rehabilitation Tool identified five items that need improvement. Results provide preliminary evidence to support future use and testing of the Communication Evaluation in Rehabilitation Tool. Implications for Rehabilitation: Promoting patient autonomy is a learned skill, and while interventions exist to train clinicians in these skills, there are no tools to assess how well clinicians use these skills when interacting with a patient. The lack of robust assessment has severe implications regarding both the fidelity of clinician training packages and resulting outcomes for promoting patient autonomy.
This study has developed a novel measurement tool, the Communication Evaluation in Rehabilitation Tool, and a comprehensive user manual to assess how well health care providers use autonomy-supportive communication strategies in real-world clinical settings. This tool has demonstrated good inter-rater reliability and concurrent validity in its initial testing phase. The Communication Evaluation in Rehabilitation Tool can be used in future studies to assess autonomy-supportive communication and undergo further measurement property testing as per our recommendations.
Values of a Patient and Observer Scar Assessment Scale to Evaluate the Facial Skin Graft Scar.
Chae, Jin Kyung; Kim, Jeong Hee; Kim, Eun Jung; Park, Kun
2016-10-01
The patient and observer scar assessment scale (POSAS) recently emerged as a promising method, reflecting both the observer's and the patient's opinions in evaluating scars. This tool was shown to be consistent and reliable in burn scar assessment, but it has not been tested in the setting of skin graft scars in skin cancer patients. To evaluate facial skin graft scars using the POSAS and to compare it with objective scar assessment tools. Twenty-three patients who were diagnosed with facial cutaneous malignancy and received skin grafts after Mohs micrographic surgery were recruited. Observer assessment was performed by three independent raters using the observer component of the POSAS and the Vancouver scar scale (VSS). Patient self-assessment was performed using the patient component of the POSAS. To quantify scar color and scar thickness more objectively, spectrophotometer and ultrasonography measurements were applied. Inter-observer reliability was substantial with both the VSS and the observer component of the POSAS (average measure intraclass correlation coefficient, 0.76 and 0.80, respectively). The observer component consistently showed significant correlations with patients' ratings for the parameters of the POSAS (all p-values < 0.05). The correlation between subjective assessment using the POSAS and objective assessment using spectrophotometer and ultrasonography was low. In facial skin graft scar assessment in skin cancer patients, the POSAS showed acceptable inter-observer reliability. This tool was more comprehensive and had a higher correlation with the patient's opinion.
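As an illustration of the "average measure" coefficient quoted above, the following minimal sketch computes single-measure and average-measure ICCs from a one-way random-effects ANOVA. The abstract does not state which ICC model the authors used, so the one-way form is an assumption, and the function name `one_way_icc` and the scores are hypothetical illustrative values, not study data.

```python
from statistics import mean

def one_way_icc(ratings):
    """Return (single-measure ICC, average-measure ICC) from a one-way
    random-effects ANOVA. `ratings` holds one row per subject, and each
    row holds the k raters' scores for that subject."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters per subject
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-subject and within-subject mean squares
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    icc_single = (msb - msw) / (msb + (k - 1) * msw)
    icc_average = (msb - msw) / msb
    return icc_single, icc_average

# Hypothetical scar scores: 5 patients, each rated by 3 observers
scores = [
    [9, 10, 8],
    [15, 14, 16],
    [20, 22, 21],
    [12, 11, 13],
    [25, 24, 27],
]
single, average = one_way_icc(scores)
```

The average-measure value describes the reliability of the mean of the k ratings, so it is never lower than the single-measure value when the latter is positive; the two are linked by the Spearman-Brown relation.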
Nigro, Carlos Alberto; González, Sergio; Arce, Anabella; Aragone, María Rosario; Nigro, Luciana
2015-05-01
Patients under treatment with continuous positive airway pressure (CPAP) may have residual sleep apnea (RSA). The main objective of our study was to evaluate a novel auto-CPAP for the diagnosis of RSA. All patients referred to the sleep laboratory to undergo CPAP polysomnography were evaluated. Patients treated with oxygen or noninvasive ventilation and split-night polysomnography (PSG), PSG with artifacts, or total sleep time less than 180 min were excluded. The PSG was manually analyzed before generating the automatic report from auto-CPAP. PSG variables (respiratory disturbance index (RDI), obstructive apnea index, hypopnea index, and central apnea index) were compared with their counterparts from auto-CPAP through Bland-Altman plots and intraclass correlation coefficient. The diagnostic accuracy of autoscoring from auto-CPAP using different cutoff points of RDI (≥5 and 10) was evaluated by the receiver operating characteristic (ROC) curve. The study included 114 patients (24 women; mean age and BMI, 59 years old and 33 kg/m(2); RDI and apnea/hypopnea index (AHI)-auto median, 5 and 2, respectively). The average difference between the AHI-auto and the RDI was -3.5 ± 3.9. The intraclass correlation coefficients (ICCs) between the PSG and the auto-CPAP for the total number of central apneas, obstructive apneas, and hypopneas were 0.69, 0.16, and 0.15, respectively. An AHI-auto >2 (RDI ≥ 5) or >4 (RDI ≥ 10) had an area under the ROC curve, sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio for the diagnosis of residual sleep apnea of 0.84/0.89, 84/81%, 82/91%, 4.5/9.5, and 0.22/0.2, respectively. The automatic analysis from auto-CPAP (S9 Autoset) showed a good diagnostic accuracy to identify residual sleep apnea. The absolute agreement between PSG and auto-CPAP to classify the respiratory events correctly varied from very low (obstructive apneas, hypopneas) to moderate (central apneas).
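The likelihood ratios quoted above follow from sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below applies those definitions to the rounded percentages in the abstract; the function name `likelihood_ratios` is a hypothetical helper, and because the inputs are rounded, the recomputed ratios only approximate the published 4.5 and 0.22.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for sensitivity and specificity given as
    proportions strictly between 0 and 1."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Rounded values from the abstract for AHI-auto > 2 (RDI >= 5):
# sensitivity 84%, specificity 82%
lr_pos, lr_neg = likelihood_ratios(0.84, 0.82)
```

Recomputing from the unrounded counts, which the abstract does not give, would be needed to reproduce the published figures exactly.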
Inter-rater reliability of kinesthetic measurements with the KINARM robotic exoskeleton.
Semrau, Jennifer A; Herter, Troy M; Scott, Stephen H; Dukelow, Sean P
2017-05-22
Kinesthesia (sense of limb movement) has been extremely difficult to measure objectively, especially in individuals who have survived a stroke. The development of valid and reliable measurements for proprioception is important to developing a better understanding of proprioceptive impairments after stroke and their impact on the ability to perform daily activities. We recently developed a robotic task to evaluate kinesthetic deficits after stroke and found that the majority (~60%) of stroke survivors exhibit significant deficits in kinesthesia within the first 10 days post-stroke. Here we aim to determine the inter-rater reliability of this robotic kinesthetic matching task. Twenty-five neurologically intact control subjects and 15 individuals with first-time stroke were evaluated on a robotic kinesthetic matching task (KIN). Subjects sat in a robotic exoskeleton with their arms supported against gravity. In the KIN task, the robot moved the subjects' stroke-affected arm at a preset speed, direction and distance. As soon as subjects felt the robot begin to move their affected arm, they matched the robot movement with the unaffected arm. Subjects were tested in two sessions on the KIN task: initial session and then a second session (within an average of 18.2 ± 13.8 h of the initial session for stroke subjects), which were supervised by different technicians. The task was performed both with and without the use of vision in both sessions. We evaluated intra-class correlations of spatial and temporal parameters derived from the KIN task to determine the reliability of the robotic task. We evaluated 8 spatial and temporal parameters that quantify kinesthetic behavior. We found that the parameters exhibited moderate to high intra-class correlations between the initial and retest conditions (Range, r-value = [0.53-0.97]). The robotic KIN task exhibited good inter-rater reliability. 
This validates the KIN task as a reliable, objective method for quantifying kinesthesia after stroke.
Alonso Fachado, A; Montes Martinez, A; Menendez Villalva, C; Pereira, M Graça
2007-01-01
The aim of this study was the assessment of the psychometric properties of the Portuguese version of the instrument "Medical Outcomes Study - Social Support Survey (MOS-SSS)". This questionnaire was translated and adapted in a Portuguese sample of 101 patients with chronic illness at a rural health centre in Portugal. The average age of patients was 63.4 years, and 56.4% were female. 29% were illiterate and 2% had completed high school. 78% had arterial hypertension and 56.4% had diabetes mellitus type 2. The internal consistency was evaluated using Cronbach's alpha. Exploratory and confirmatory factor analyses were performed in order to confirm the reliability and validity of the scale and its multidimensional characteristics. The 2-week test-retest reliability was estimated using weighted kappa for the ordinal variables and the intraclass correlation coefficient for the quantitative variables. Cronbach's alphas for the subscales ranged from 0.873 to 0.967 at test, and 0.862 to 0.972 at retest. Exploratory factor analysis revealed the existence of four factors (emotional, tangible, positive interaction and affection support) that explain 72.71% of the variance. Confirmatory factor analysis supported the existence of four factors that allowed the application of the scale with original items. The goodness-of-fit measures corroborate the initial structure, with chi2/df=2.01, GFI=0.998, CFI=0.999, AGFI=0.998, TLI=0.999, NFI=0.998, SRMR=0.332, RMSEA=0.76. The 2-week test-retest reliability of the Portuguese MOS-SSS as measured by the intraclass correlation coefficient ranged from 0.941 to 0.966 for the four dimensions and the overall support index. The weighted kappa ranged from 0.67 to 0.87 for all the items. The MOS-SSS Portuguese version demonstrates good psychometric properties and seems to be useful to measure multidimensional aspects of social support in the Portuguese population.
Waterloo Eye Study: data abstraction and population representation.
Machan, Carolyn M; Hrynchak, Patricia K; Irving, Elizabeth L
2011-05-01
To determine data quality in the Waterloo Eye Study (WatES) and compare the WatES age/sex distribution to the general population. Six thousand three hundred ninety-seven clinic files were reviewed at the University of Waterloo, School of Optometry. Abstracted information included patient age, sex, presenting chief complaint, entering spectacle prescription, refraction, binocular vision, and disease data. Mean age and age distributions were determined for the entire study group and both sexes. These results were compared with Statistics Canada (2006) estimates and information on Canadian optometric practices. Inter- and intraabstractor reliability was determined through double entry of 425 and 50 files, respectively; the Cohen kappa statistic (K) was calculated for qualitative data and the intraclass correlation coefficient (ICC) for quantitative data. Availability of data within the files was determined through missing data rates. The age of the patients in the WatES ranged from 0.2 to 93.9 years (mean age, 42.5 years), with all age groups younger than 85 years well represented. Females comprised 54.1% and males 45.9% of the study group. There were more older patients (>65 years) and younger patients (<10 years) than in the population at large. K values were highest for demographic information (e.g., sex, 0.96) and averaged slightly less for most clinical data requiring some abstractor interpretation (0.71 to 1.00). The two lowest interabstractor values, migraine (0.41) and smoking (0.26), had low reporting frequencies and definition ambiguity between abstractors. Intraclass correlation coefficient values were >0.90 for all but one continuous data type. Missing data rates were <2% for all but near phoria, which was 7.4%. The WatES database includes patients from all age groups and both sexes. It provides a fair representation of optometric patients in Canada. 
Its large sample size, good interabstractor repeatability, and low missing data rates demonstrate sufficient data quality for future analysis.
Take the HEAT: A pilot study on improving communication with angry families.
Delacruz, Nicolas; Reed, Suzanne; Splinter, Ansley; Brown, Amy; Flowers, Stacy; Verbeck, Nicole; Turpening, Debbie; Mahan, John D
2017-06-01
Our objective was to evaluate the utility of an educational program consisting of a workshop based on the Take the HEAT communication strategy, designed specifically for addressing patients who are angry, using a novel tool to evaluate residents' skills in employing this method. 33 first-year pediatric and internal medicine-pediatrics residents participated in the study. The workshop presented the Take the HEAT (Hear, Empathize, Apologize, Take action) strategy of communication. Communication skills were assessed through standardized patient encounters at baseline and post-workshop. Encounters were scored using a novel assessment tool. After the workshop, residents' Take the HEAT communication improved from baseline total average score 23.15 to total average score 25.36 (Z=-3.428, p<0.001). At baseline, empathy skills were the lowest. Intraclass Correlation Coefficient demonstrated substantial agreement (0.60 and 0.61) among raters using the tool. First-year pediatric trainees' communication with angry families improved with education focused on the Take the HEAT strategy. Poor performance by residents in demonstrating empathy should be explored further. This study demonstrates the utility of a brief communications curriculum aimed at improving pediatric residents' ability to communicate with angry families. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Measurement of Angle Kappa Using Ultrasound Biomicroscopy and Corneal Topography.
Yeo, Joon Hyung; Moon, Nam Ju; Lee, Jeong Kyu
2017-06-01
To introduce a new convenient and accurate method to measure the angle kappa using ultrasound biomicroscopy (UBM) and corneal topography. Data from 42 eyes (13 males and 29 females) were analyzed in this study. The angle kappa was measured using Orbscan II and calculated with UBM and corneal topography. The angle kappa of the dominant eye was compared with measurements by Orbscan II. The mean patient age was 36.4 ± 13.8 years. The average angle kappa measured by Orbscan II was 3.98° ± 1.12°, while the average angle kappa calculated with UBM and corneal topography was 3.19° ± 1.15°. The difference in angle kappa measured by the two methods was statistically significant (p < 0.001). The two methods showed good reliability (intraclass correlation coefficient, 0.671; p < 0.001). Bland-Altman plots were used to demonstrate the agreement between the two methods. We designed a new method using UBM and corneal topography to calculate the angle kappa. This method is convenient to use and allows for measurement of the angle kappa without an expensive device. © 2017 The Korean Ophthalmological Society
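The Bland-Altman agreement analysis mentioned above is usually summarised by the mean difference between methods (the bias) and the 95% limits of agreement, bias ± 1.96 × SD of the paired differences. The abstract reports neither the limits nor the paired data, so the sketch below uses made-up readings purely for illustration; `bland_altman` and both lists are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements,
    using 95% limits of agreement: bias +/- 1.96 * SD of differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical angle kappa readings in degrees (not study data)
orbscan = [4.1, 3.6, 5.0, 2.9, 4.4, 3.8]
ubm_topo = [3.3, 3.0, 4.1, 2.2, 3.5, 3.1]
bias, lower, upper = bland_altman(orbscan, ubm_topo)
```

A nonzero bias with narrow limits, as in this toy data, is the pattern in which two methods differ systematically yet still track each other closely, which is one way a significant mean difference and a reasonable ICC can coexist.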
Measurement of Angle Kappa Using Ultrasound Biomicroscopy and Corneal Topography
Yeo, Joon Hyung; Moon, Nam Ju
2017-01-01
Purpose To introduce a new convenient and accurate method to measure the angle kappa using ultrasound biomicroscopy (UBM) and corneal topography. Methods Data from 42 eyes (13 males and 29 females) were analyzed in this study. The angle kappa was measured using Orbscan II and calculated with UBM and corneal topography. The angle kappa of the dominant eye was compared with measurements by Orbscan II. Results The mean patient age was 36.4 ± 13.8 years. The average angle kappa measured by Orbscan II was 3.98° ± 1.12°, while the average angle kappa calculated with UBM and corneal topography was 3.19° ± 1.15°. The difference in angle kappa measured by the two methods was statistically significant (p < 0.001). The two methods showed good reliability (intraclass correlation coefficient, 0.671; p < 0.001). Bland-Altman plots were used to demonstrate the agreement between the two methods. Conclusions We designed a new method using UBM and corneal topography to calculate the angle kappa. This method is convenient to use and allows for measurement of the angle kappa without an expensive device. PMID:28471103
Barra, Filipe Ramos; de Souza, Fernanda Freire; Camelo, Rosimara Eva Ferreira Almeida; Ribeiro, Andrea Campos de Oliveira; Farage, Luciano
2017-01-01
To assess the feasibility of contrast-enhanced spectral mammography (CESM) of the breast for assessing the size of residual tumors after neoadjuvant chemotherapy (NAC). In breast cancer patients who underwent NAC between 2011 and 2013, we evaluated residual tumor measurements obtained with CESM and full-field digital mammography (FFDM). We determined the concordance between the methods, as well as their level of agreement with the pathology. Three radiologists analyzed eight CESM and FFDM measurements separately, considering the size of the residual tumor at its largest diameter and correlating it with that determined in the pathological analysis. Interobserver agreement was also evaluated. The sensitivity, specificity, positive predictive value, and negative predictive value were higher for CESM than for FFDM (83.33%, 100%, 100%, and 66% vs. 50%, 50%, 50%, and 25%, respectively). The CESM measurements showed a strong, consistent correlation with the pathological findings (correlation coefficient = 0.76-0.92; intraclass correlation coefficient = 0.692-0.886). The correlation between the FFDM measurements and the pathological findings was not statistically significant, with questionable consistency (intraclass correlation coefficient = 0.488-0.598). Agreement with the pathological findings was narrower for CESM measurements than for FFDM measurements. Interobserver agreement was higher for CESM than for FFDM (0.94 vs. 0.88). CESM is a feasible means of evaluating residual tumor size after NAC, showing a good correlation and good agreement with pathological findings. For CESM measurements, the interobserver agreement was excellent.
Lauricella, Leticia L; Costa, Priscila B; Salati, Michele; Pego-Fernandes, Paulo M; Terra, Ricardo M
2018-06-01
Database quality measurement should be considered a mandatory step to ensure an adequate level of confidence in data used for research and quality improvement. Several metrics have been described in the literature, but no standardized approach has been established. We aimed to describe a methodological approach applied to measure the quality and inter-rater reliability of a regional multicentric thoracic surgical database (Paulista Lung Cancer Registry). Data from the first 3 years of the Paulista Lung Cancer Registry underwent an audit process with 3 metrics: completeness, consistency, and inter-rater reliability. The first 2 methods were applied to the whole data set, and the last method was calculated using 100 cases randomized for direct auditing. Inter-rater reliability was evaluated using percentage of agreement between the data collector and auditor and through calculation of Cohen's κ and intraclass correlation. The overall completeness per section ranged from 0.88 to 1.00, and the overall consistency was 0.96. Inter-rater reliability showed many variables with high disagreement (>10%). For numerical variables, intraclass correlation was a better metric than inter-rater reliability. Cohen's κ showed that most variables had moderate to substantial agreement. The methodological approach applied to the Paulista Lung Cancer Registry showed that completeness and consistency metrics did not sufficiently reflect the real quality status of a database. The inter-rater reliability associated with κ and intraclass correlation was a better quality metric than completeness and consistency metrics because it could determine the reliability of specific variables used in research or benchmark reports. This report can be a paradigm for future studies of data quality measurement. Copyright © 2018 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
Amariles, Pedro; Pino-Marín, Daniel; Sabater-Hernández, Daniel; García-Jiménez, Emilio; Roig-Sánchez, Inés; Faus, María José
2016-11-01
To determine the test-retest reliability of a questionnaire, with preliminary validation, to assess knowledge of cardiovascular risk (CVR) and cardiovascular disease in patients attending community pharmacies in Spain. To complement the external validity by establishing the relationship between an educational activity and the increase in knowledge about CVR and cardiovascular disease. Sub-analysis of a controlled clinical study, EMDADER-CV, in which a questionnaire about knowledge concerning CVR was applied at 4 different times. Spanish Community Pharmacies. There were 323 patients in the control group, from the 640 who completed the study. Intraclass correlation coefficient to assess the reliability in 3 comparisons (post-educational activity with week 16, post-educational activity with week 32, and week 16 with week 32); and the non-parametric Friedman test to establish the relationship between an oral and written educational activity and increasing knowledge. For the 323 patients in the 3 comparisons, the intraclass correlation coefficient values were 0.624, 0.608 and 0.801, respectively (fair-good to excellent reliability). The Friedman test also showed a statistically significant relationship between educational activity and increased knowledge (p < .0001). According to the intraclass correlation coefficient, the questionnaire aimed at assessing knowledge of CVR and cardiovascular disease has a reliability between acceptable and excellent, which, added to the previous validation, shows that the instrument meets the criteria of validity and reliability. Furthermore, the questionnaire showed the ability to relate an increase in knowledge to an educational intervention, a feature that complements its external validity. Copyright © 2016 Elsevier España, S.L.U. All rights reserved.
Jhanji, Vishal; Yang, Bingzhi; Yu, Marco; Ye, Cong; Leung, Christopher K S
2013-11-01
To compare corneal thickness and corneal elevation using swept source optical coherence tomography and slit scanning topography. Prospective study. 41 normal and 46 keratoconus subjects. All eyes were imaged using swept source optical coherence tomography and slit scanning topography during the same visit. Mean corneal thickness and best-fit sphere measurements were compared between the instruments. Agreement of measurements between swept source optical coherence tomography and slit scanning topography was analyzed. Intra-rater reproducibility coefficient and intraclass correlation coefficient were evaluated. In normal eyes, central corneal thickness measured by swept source optical coherence tomography was thinner compared with slit scanning topography (p < 0.0001) and ultrasound pachymetry (p < 0.0001). Ultrasound pachymetry readings had better 95% limits of agreement with swept source optical coherence tomography than slit scanning topography. In keratoconus eyes, central corneal thickness was thinner on swept source optical coherence tomography than slit scanning topography (p = 0.081) and ultrasound pachymetry (p = 0.001). There were significant differences in thinnest corneal thickness and in anterior and posterior best-fit sphere measurements between the two instruments (p < 0.05 for all). Overall, reproducibility coefficients and intraclass correlation coefficients were significantly better with swept source optical coherence tomography for measurement of central corneal thickness, anterior best-fit sphere, and posterior best-fit sphere (all p < 0.001). Corneal thickness and elevation measurements were significantly different between swept source optical coherence tomography and slit scanning topography. With better reproducibility coefficients and intraclass correlation coefficients, swept source optical coherence tomography may provide a reliable alternative for measurement of corneal parameters. © 2013 The Authors.
Clinical and Experimental Ophthalmology © 2013 Royal Australian and New Zealand College of Ophthalmologists.
Braileanu, Maria; Yang, Wuyang; Caplan, Justin M; Lin, Li-Mei; Radvany, Martin G; Tamargo, Rafael J; Huang, Judy
2016-11-01
Arteriovenous malformation (AVM) diffuseness has been shown to be prognostic of treatment outcomes. We assessed interobserver agreement of AVM diffuseness among physicians of different specialty and training backgrounds using digital subtraction angiography (DSA). All research protocols were approved by the institutional review board for this retrospective chart review. In a single-blinded setting, 2 attending neurosurgeons, 1 attending interventional neuroradiologist, and 1 senior neurosurgical resident rated 80 DSA views of 36 AVMs as either compact or diffuse. Individual interobserver agreement and subgroup agreement were analyzed using κ agreement and intraclass correlation coefficient. Disagreement regarding AVM diffuseness occurred in 43.8% of all DSA views (n = 80). Interobserver κ agreement on AVM diffuseness using DSA views among 4 physicians ranged from fair (κ = 0.40 [95% confidence interval (CI) = 0.22-0.58]) to substantial (κ = 0.65 [95% CI = 0.48-0.81]), whereas total intraclass correlation coefficient was 0.81 (95% CI = 0.73-0.87). For the 36 AVMs, κ agreement ranged from fair (κ = 0.36 [95% CI = 0.13-0.60]) to moderate (κ = 0.57 [95% CI = 0.35-0.79]), whereas intraclass correlation coefficient among all 4 physicians was 0.68 (95% CI = 0.47-0.82). Moderate agreement on AVM diffuseness (n = 80) was found between attending and resident assessments (κ = 0.57 [95% CI = 0.39-0.75]) and between neurosurgeon and interventional neuroradiologist assessments (κ = 0.55 [95% CI = 0.37-0.73]). Agreement of individual physicians on AVM diffuseness varies from fair to substantial. Objective and three-dimensional measures of AVM diffuseness should be developed for consistent clinical application. Copyright © 2016 Elsevier Inc. All rights reserved.
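For a binary rating such as compact versus diffuse, the κ agreement reported above can be illustrated with Cohen's kappa for one pair of raters: κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each rater's marginal frequencies. How the authors computed their pairwise values is not stated in the abstract, so this is a generic sketch; `cohen_kappa` and the two rating lists are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the product of the marginal frequencies
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical ratings of 8 angiographic views (not study data)
a = ["compact", "compact", "diffuse", "diffuse",
     "compact", "diffuse", "compact", "compact"]
b = ["compact", "diffuse", "diffuse", "diffuse",
     "compact", "compact", "compact", "compact"]
kappa = cohen_kappa(a, b)
```

In this toy example the raters agree on 6 of 8 views (75%), yet κ is well below 0.75 because more than half of that agreement is expected by chance alone.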
Hsu, Hsien-Yuan; Lin, Jr-Hung; Kwok, Oi-Man; Acosta, Sandra; Willson, Victor
2016-01-01
Several researchers have recommended that level-specific fit indices should be applied to detect the lack of model fit at any level in multilevel structural equation models. Although we concur with their view, we note that these studies did not sufficiently consider the impact of intraclass correlation (ICC) on the performance of level-specific fit indices. Our study proposed to fill this gap in the methodological literature. A Monte Carlo study was conducted to investigate the performance of (a) level-specific fit indices derived by a partially saturated model method (e.g., CFIPS_B and CFIPS_W) and (b) SRMRW and SRMRB in terms of their performance in multilevel structural equation models across varying ICCs. The design factors included intraclass correlation (ICC: ICC1 = 0.091 to ICC6 = 0.500), numbers of groups in between-level models (NG: 50, 100, 200, and 1,000), group size (GS: 30, 50, and 100), and type of misspecification (no misspecification, between-level misspecification, and within-level misspecification). Our simulation findings raise a concern regarding the performance of between-level-specific partial saturated fit indices in low ICC conditions: the performances of both TLIPS_B and RMSEAPS_B were more influenced by ICC compared with CFIPS_B and SRMRB. However, when traditional cutoff values (RMSEA≤ 0.06; CFI, TLI≥ 0.95; SRMR≤ 0.08) were applied, CFIPS_B and TLIPS_B were still able to detect misspecified between-level models even when ICC was as low as 0.091 (ICC1). On the other hand, both RMSEAPS_B and SRMRB were not recommended under low ICC conditions. PMID:29795901
2013-01-01
Background Handrim wheelchair propulsion is a complex bimanual motor task. The bimanually applied forces on the rims determine the speed and direction of locomotion. Measurements of forces and torques on the handrim are important to study status and change of propulsion technique (and consequently mechanical strain) due to processes of learning, training or the wheelchair configuration. The purpose of this study was to compare the simultaneous outcomes of two different measurement-wheels attached to the different sides of the wheelchair, to determine measurement consistency within and between these wheels given the expected inter- and intra-limb variability as a consequence of motor control. Methods Nine able-bodied subjects received a three-week low-intensity handrim wheelchair practice intervention. They then performed three four-minute trials of wheelchair propulsion in an instrumented hand rim wheelchair on a motor-driven treadmill at a fixed belt speed. The two measurement-wheels on each side of the wheelchair measured forces and torques of one of the two upper limbs, which simultaneously perform the push action over time. The resulting data were compared as direct output using cross-correlation on the torque around the wheel-axle. Calculated push characteristics such as power production and speed were compared using an intra-class correlation. Results Measured torque around the wheel axle of the two measurement-wheels had a high average cross-correlation of 0.98 (std=0.01). Unilateral mean power output over a minute was found to have an intra-class correlation of 0.89 between the wheels. Although the difference over the pushes between left and right power output had a high variability, the mean difference between the measurement-wheels was low at 0.03 W (std=1.60). Other push characteristics showed even higher ICC’s (>0.9). Conclusions A good agreement between both measurement-wheels was found at the level of the power output. 
This indicates a high comparability of the measurement-wheels for the different propulsion parameters. Data from both wheels seem suitable to be used together or interchangeably in experiments on motor control and wheelchair propulsion performance. A high variability in forces and timing between the left and right side was found during the execution of this bimanual task, reflecting the human motor control process. PMID:23360756
Vegter, Riemer J K; Lamoth, Claudine J; de Groot, Sonja; Veeger, Dirkjan H E J; van der Woude, Lucas H V
2013-01-29
Handrim wheelchair propulsion is a complex bimanual motor task. The bimanually applied forces on the rims determine the speed and direction of locomotion. Measurements of forces and torques on the handrim are important to study status and change of propulsion technique (and consequently mechanical strain) due to processes of learning, training or the wheelchair configuration. The purpose of this study was to compare the simultaneous outcomes of two different measurement-wheels attached to the different sides of the wheelchair, to determine measurement consistency within and between these wheels given the expected inter- and intra-limb variability as a consequence of motor control. Nine able-bodied subjects received a three-week low-intensity handrim wheelchair practice intervention. They then performed three four-minute trials of wheelchair propulsion in an instrumented hand rim wheelchair on a motor-driven treadmill at a fixed belt speed. The two measurement-wheels on each side of the wheelchair measured forces and torques of one of the two upper limbs, which simultaneously perform the push action over time. The resulting data were compared as direct output using cross-correlation on the torque around the wheel-axle. Calculated push characteristics such as power production and speed were compared using an intra-class correlation. Measured torque around the wheel axle of the two measurement-wheels had a high average cross-correlation of 0.98 (std=0.01). Unilateral mean power output over a minute was found to have an intra-class correlation of 0.89 between the wheels. Although the difference over the pushes between left and right power output had a high variability, the mean difference between the measurement-wheels was low at 0.03 W (std=1.60). Other push characteristics showed even higher ICC's (>0.9). A good agreement between both measurement-wheels was found at the level of the power output. 
This indicates a high comparability of the measurement-wheels for the different propulsion parameters. Data from both wheels seem suitable to be used together or interchangeably in experiments on motor control and wheelchair propulsion performance. A high variability in forces and timing between the left and right side were found during the execution of this bimanual task, reflecting the human motor control process.
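The abstract above does not say how its cross-correlation was computed. The following is a minimal sketch of one common form, a mean-centred cross-correlation normalised by the full-signal energy and evaluated over a small range of lags. The function name, the lag convention, and the sample values are illustrative assumptions, not the authors' processing pipeline.

```python
import math


def normalized_cross_correlation(x, y, max_lag):
    """Return {lag: r}, where r relates x[t] to y[t + lag] for lags -max_lag..max_lag."""
    if len(x) != len(y) or not x:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    # Normalise by the energy of the full, mean-centred signals.
    norm = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if norm == 0:
        raise ValueError("cross-correlation is undefined for a constant signal")
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(n):
            s = t + lag
            if 0 <= s < n:  # only use samples where both signals overlap
                total += dx[t] * dy[s]
        result[lag] = total / norm
    return result


# Hypothetical torque samples (N·m) from the left and right measurement wheels;
# the right-hand signal is the left-hand one delayed by two samples.
left_torque = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
right_torque = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
by_lag = normalized_cross_correlation(left_torque, right_torque, max_lag=3)
best_lag = max(by_lag, key=by_lag.get)  # lag at which the two signals align best
```

At zero lag this quantity equals the Pearson correlation of the two signals, so a value near 1 indicates that the two wheels record nearly the same torque shape up to scale and offset.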
Concurrent Validity Between Live and Home Video Observations Using the Alberta Infant Motor Scale.
Boonzaaijer, Marike; van Dam, Ellen; van Haastert, Ingrid C; Nuysink, Jacqueline
2017-04-01
Serial assessment of gross motor development of infants at risk is an established procedure in neonatal follow-up clinics. Assessments based on home video recordings could be a relevant addition. In 48 infants (1.5-19 months), the concurrent validity of 2 applications was examined using the Alberta Infant Motor Scale: (1) a home video made by parents and (2) simultaneous observation on-site by a pediatric physical therapist. Parents' experiences were explored using a questionnaire. The intraclass correlation coefficient agreement between live and home video assessment was 0.99, with a standard error of measurement of 1.41 items. Intra- and interrater reliability: intraclass correlation coefficients were more than 0.99. According to 94% of the parents, recording their infant's movement repertoire was easy to perform. Assessing the Alberta Infant Motor Scale based on home video recordings is comparable to assessment by live observation. The video method is a promising application that can be used with low burden for parents and infants.
Quality Control of Epidemiological Lectures Online: Scientific Evaluation of Peer Review
Linkov, Faina; Lovalekar, Mita; LaPorte, Ronald
2007-01-01
Aim: To examine the feasibility of using peer review for the quality control of online materials. Methods: We analyzed the inter-rater agreement on the quality of epidemiological lectures online, based on the Global Health Network Supercourse lecture library. We examined the agreement among reviewers by looking at κ statistics and intraclass correlations. Seven expert reviewers examined and rated a random sample of 100 Supercourse lectures. Their reviews were compared with the reviews of the lay Supercourse reviewers. Results: Both expert and non-expert reviewers rated lectures very highly, with a mean overall score of 4 out of 5. Kappa (κ) statistic and intraclass correlations indicated that inter-rater agreement for experts and non-experts was surprisingly low (below 0.4). Conclusions: To our knowledge, this was the first time that poor inter-rater agreement was demonstrated for the Internet lectures. Future research studies need to evaluate the alternatives to the peer review system, especially for online materials. PMID:17436390
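The abstract above reports κ statistics without naming the variant. As a minimal sketch, assuming the simplest case of Cohen's kappa for two raters, the code below shows how chance-corrected agreement can be low even when raw agreement is high. The ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who label the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal frequencies, summed over labels.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical 1-5 quality ratings: both raters give almost every lecture a 4.
rater_1 = [4, 4, 4, 4, 4, 4, 4, 4, 5, 4]
rater_2 = [4, 4, 4, 4, 4, 4, 4, 4, 4, 5]
kappa = cohen_kappa(rater_1, rater_2)
raw_agreement = sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)
print(f"raw agreement = {raw_agreement:.2f}, kappa = {kappa:.2f}")
```

When nearly all ratings fall in one category, expected chance agreement is already high, so kappa can sit near zero despite frequent identical ratings. Whether this mechanism explains the study's result is not stated in the abstract.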
Holmes, Jeffrey D; Jenkins, Mary E; Johnson, Andrew M; Hunt, Michael A; Clark, Ross A
2013-04-01
Impaired postural stability places individuals with Parkinson's at an increased risk for falls. Given the high incidence of fall-related injuries within this population, ongoing assessment of postural stability is important. To evaluate the validity of the Nintendo Wii(®) balance board as a measurement tool for the assessment of postural stability in individuals with Parkinson's. Twenty individuals with Parkinson's participated. Subjects completed testing on two balance tasks with eyes open and closed on a Wii(®) balance board and biomechanical force platform. Bland-Altman plots and a two-way, random-effects, single measure intraclass correlation coefficient model were used to assess concurrent validity of centre-of-pressure data. Concurrent validity was demonstrated to be excellent across balance tasks (intraclass correlation coefficients = 0.96, 0.98, 0.92, 0.94). This study suggests that the Wii(®) balance board is a valid tool for the quantification of postural stability among individuals with Parkinson's.
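The abstract above names a two-way, random-effects, single-measure intraclass correlation coefficient but gives no formula. In the Shrout-Fleiss notation this is usually written ICC(2,1). The form below is stated as an assumption about what was computed. Here $MS_R$ is the between-subjects mean square, $MS_C$ the between-methods mean square (the two devices in this study), $MS_E$ the residual mean square, $n$ the number of subjects, and $k$ the number of methods.

```latex
\mathrm{ICC}(2,1) \;=\;
\frac{MS_R - MS_E}
     {MS_R + (k-1)\,MS_E + \dfrac{k}{n}\,\bigl(MS_C - MS_E\bigr)}
```

Because $MS_C$ appears in the denominator, a systematic offset between the two devices lowers this coefficient, which is why it is often preferred when the question is whether two instruments can be used interchangeably.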
Brew, Christopher J; Simpson, Philip M; Whitehouse, Sarah L; Donnelly, William; Crawford, Ross W; Hubble, Matthew J W
2012-04-01
We describe a scaling method for templating digital radiographs using conventional acetate templates independent of template magnification without the need for a calibration marker. The mean magnification factor for the radiology department was determined (119.8%; range, 117%-123.4%). This fixed magnification factor was used to scale the radiographs by the method described. Thirty-two femoral heads on postoperative total hip arthroplasty radiographs were then measured and compared with the actual size. The mean absolute accuracy was within 0.5% of actual head size (range, 0%-3%) with a mean absolute difference of 0.16 mm (range, 0-1 mm; SD, 0.26 mm). Intraclass correlation coefficient showed excellent reliability for both interobserver and intraobserver measurements with intraclass correlation coefficient scores of 0.993 (95% CI, 0.988-0.996) for interobserver measurements and intraobserver measurements ranging between 0.990 and 0.993 (95% CI, 0.980-0.997). Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.
Quek, June; Brauer, Sandra G; Treleaven, Julia; Clark, Ross A
2017-09-01
This study aims to investigate the concurrent validity and intrarater reliability of the Microsoft Kinect to measure thoracic kyphosis against the Flexicurve. Thirty-three healthy individuals (age: 31±11.0 years, men: 17, height: 170.2±8.2 cm, weight: 64.2±12.0 kg) participated, with 29 re-examined for intrarater reliability 1-7 days later. Thoracic kyphosis was measured using the Flexicurve and the Microsoft Kinect consecutively in both standing and sitting positions. Both the kyphosis index and angle were calculated. The Microsoft Kinect showed excellent concurrent validity (intraclass correlation coefficient=0.76-0.82) and reliability (intraclass correlation coefficient=0.81-0.98) for measuring thoracic kyphosis (angle and index) in both standing and sitting postures. This study is the first to show that the Microsoft Kinect has excellent validity and intrarater reliability to measure thoracic kyphosis, which is promising for its use in the clinical setting.
[The reliability of a questionnaire regarding Colombian children's physical activity].
Herazo-Beltrán, Aliz Y; Domínguez-Anaya, Regina
2012-10-01
To report the test-retest reliability and internal consistency of the Physical Activity Questionnaire for school children (PAQ-C). This was a descriptive study of 100 randomly selected children aged 9 to 11 years attending a school in Cartagena, Colombia. The PAQ-C was given twice, one week apart, after the informed consent forms had been signed by the children's parents and school officials. Cronbach's alpha coefficient was used to assess internal consistency and an intra-class correlation coefficient to assess test-retest reliability. SPSS (version 17.0) was used for statistical analysis. Internal consistency was 0.73 at the first measurement and 0.78 at the second; the intra-class correlation coefficient was 0.60. There were differences between boys and girls at both measurements. The PAQ-C had acceptable internal consistency and test-retest reliability, thereby making it useful for measuring children's self-reported physical activity and a valuable tool for population studies in Colombia.
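The internal consistency figures in the abstract above are Cronbach's alpha values, which the study obtained from SPSS. The following minimal sketch shows the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response table are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: one row per respondent, each row holding that respondent's item scores."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer the same two or more items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five children to four items scored 1-5.
answers = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
]
alpha = cronbach_alpha(answers)
```

Alpha rises when items vary together across respondents, so it describes consistency among items within one administration, whereas the intra-class correlation in the abstract describes stability between the two administrations.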
Concurrent Validity Between Live and Home Video Observations Using the Alberta Infant Motor Scale
van Dam, Ellen; van Haastert, Ingrid C.; Nuysink, Jacqueline
2017-01-01
Purpose: Serial assessment of gross motor development of infants at risk is an established procedure in neonatal follow-up clinics. Assessments based on home video recordings could be a relevant addition. Methods: In 48 infants (1.5-19 months), the concurrent validity of 2 applications was examined using the Alberta Infant Motor Scale: (1) a home video made by parents and (2) simultaneous observation on-site by a pediatric physical therapist. Parents' experiences were explored using a questionnaire. Results: The intraclass correlation coefficient agreement between live and home video assessment was 0.99, with a standard error of measurement of 1.41 items. Intra- and interrater reliability: intraclass correlation coefficients were more than 0.99. According to 94% of the parents, recording their infant's movement repertoire was easy to perform. Conclusion: Assessing the Alberta Infant Motor Scale based on home video recordings is comparable to assessment by live observation. The video method is a promising application that can be used with low burden for parents and infants. PMID:28350771
Optimization of Scan Parameters to Reduce Acquisition Time for Diffusion Kurtosis Imaging at 1.5T.
Yokosawa, Suguru; Sasaki, Makoto; Bito, Yoshitaka; Ito, Kenji; Yamashita, Fumio; Goodwin, Jonathan; Higuchi, Satomi; Kudo, Kohsuke
2016-01-01
To shorten acquisition of diffusion kurtosis imaging (DKI) in 1.5-tesla magnetic resonance (MR) imaging, we investigated the effects of the number of b-values, number of diffusion directions, and number of signal averages (NSA) on the accuracy of DKI metrics. We obtained 2 image datasets with 30 gradient directions, 6 b-values up to 2500 s/mm², and 2 signal averages from 5 healthy volunteers and generated DKI metrics, i.e., mean, axial, and radial kurtosis (MK, K∥, and K⊥) maps, from various combinations of the datasets. These maps were compared with those from the full datasets using the intraclass correlation coefficient (ICC). The MK and K⊥ maps generated from the datasets including only the b-value of 2500 s/mm² showed excellent agreement (ICC, 0.96 to 0.99). Under the same acquisition time and diffusion directions, agreement was better for MK, K∥, and K⊥ maps obtained with 3 b-values (0, 1000, and 2500 s/mm²) and 4 signal averages than for maps obtained with any other combination of number of b-values and NSA. Good agreement (ICC > 0.6) required at least 20 diffusion directions in all the metrics. MK and K⊥ maps with ICC greater than 0.95 can be obtained at 1.5T within 10 min (b-value = 0, 1000, and 2500 s/mm²; 20 diffusion directions; 4 signal averages; slice thickness, 6 mm with no interslice gap; number of slices, 12).
Validity and cross-cultural adaptation of the persian version of the oxford elbow score.
Ebrahimzadeh, Mohammad H; Kachooei, Amir Reza; Vahedi, Ehsan; Moradi, Ali; Mashayekhi, Zeinab; Hallaj-Moghaddam, Mohammad; Azami, Mehran; Birjandinejad, Ali
2014-01-01
Oxford Elbow Score (OES) is a patient-reported questionnaire used to assess outcomes after elbow surgery. The aim of this study was to validate and adapt the OES into Persian language. After forward-backward translation of the OES into Persian, a total number of 92 patients after elbow surgeries completed the Persian OES along with the Persian DASH and SF-36. To assess test-retest reliability, 31 randomly selected patients (34%) completed the Persian OES again after three days while abstaining from all forms of therapeutic regimens. Reliability of the Persian OES was assessed by measuring intraclass correlation coefficient (ICC) for test-retest reliability and Cronbach's alpha for internal consistency. Spearman's correlation coefficient was used to test the construct validity. Cronbach's alpha coefficient was 0.92 showing excellent reliability. Cronbach's alpha for function, pain, and social-psychological subscales was 0.95, 0.86, and 0.85, respectively. Intraclass correlation coefficient (ICC) was 0.85 for the overall questionnaire and 0.90, 0.76, and 0.75 for function, pain, and social-psychological subscales, respectively. Construct validity was confirmed as the Spearman correlation between OES and DASH was 0.80. Persian OES is a valid and reliable patient-reported outcome measure to assess postsurgical elbow status in Persian speaking population.
Hama, Yohei; Kanazawa, Manabu; Minakuchi, Shunsuke; Uchida, Tatsuro; Sasaki, Yoshiyuki
2014-03-19
In the present study, we developed a novel color scale for visual assessment, conforming to theoretical color changes of a gum, to evaluate masticatory performance; moreover, we investigated the reliability and validity of this evaluation method using the color scale. Ten participants (aged 26-30 years) with natural dentition chewed the gum at several chewing strokes. Changes in color were measured using a colorimeter, and then linear regression expressions that represented changes in gum color were derived. The color scale was developed using these regression expressions. Thirty-two chewed gums were evaluated using a colorimeter and were assessed three times using the color scale by six dentists aged 25-27 (mean, 25.8) years, six preclinical dental students aged 21-23 (mean, 22.2) years, and six elderly individuals aged 68-84 (mean, 74.0) years. The intrarater and interrater reliability of evaluations was assessed using intraclass correlation coefficients. Validity of the method compared with a colorimeter was assessed using Spearman's rank correlation coefficient. All intraclass correlation coefficients were > 0.90, and Spearman's rank correlation coefficients were > 0.95 in all groups. These results indicated that the evaluation method of the color-changeable chewing gum using the newly developed color scale is reliable and valid.
Movement-related beta oscillations show high intra-individual reliability.
Espenhahn, Svenja; de Berker, Archy O; van Wijk, Bernadette C M; Rossiter, Holly E; Ward, Nick S
2017-02-15
Oscillatory activity in the beta frequency range (15-30Hz) recorded from human sensorimotor cortex is of increasing interest as a putative biomarker of motor system function and dysfunction. Despite its increasing use in basic and clinical research, surprisingly little is known about the test-retest reliability of spectral power and peak frequency measures of beta oscillatory signals from sensorimotor cortex. Establishing that these beta measures are stable over time in healthy populations is a necessary precursor to their use in the clinic. Here, we used scalp electroencephalography (EEG) to evaluate intra-individual reliability of beta-band oscillations over six sessions, focusing on changes in beta activity during movement (Movement-Related Beta Desynchronization, MRBD) and after movement termination (Post-Movement Beta Rebound, PMBR). Subjects performed visually-cued unimanual wrist flexion and extension. We assessed Intraclass Correlation Coefficients (ICC) and between-session correlations for spectral power and peak frequency measures of movement-related and resting beta activity. Movement-related and resting beta power from both sensorimotor cortices was highly reliable across sessions. Resting beta power yielded highest reliability (average ICC=0.903), followed by MRBD (average ICC=0.886) and PMBR (average ICC=0.663). Notably, peak frequency measures yielded lower ICC values compared to the assessment of spectral power, particularly for movement-related beta activity (ICC=0.386-0.402). Our data highlight that power measures of movement-related beta oscillations are highly reliable, while corresponding peak frequency measures show greater intra-individual variability across sessions. Importantly, our finding that beta power estimates show high intra-individual reliability over time serves to validate the notion that these measures reflect meaningful individual differences that can be utilised in basic research and clinical studies. 
Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Short version of the Depression Anxiety Stress Scale-21: is it valid for Brazilian adolescents?
Silva, Hítalo Andrade da; Passos, Muana Hiandra Pereira Dos; Oliveira, Valéria Mayaly Alves de; Palmeira, Aline Cabral; Pitangui, Ana Carolina Rodarti; Araújo, Rodrigo Cappato de
2016-01-01
To evaluate the interday reproducibility, agreement and validity of the construct of short version of the Depression Anxiety Stress Scale-21 applied to adolescents. The sample consisted of adolescents of both sexes, aged between 10 and 19 years, who were recruited from schools and sports centers. The validity of the construct was performed by exploratory factor analysis, and reliability was calculated for each construct using the intraclass correlation coefficient, standard error of measurement and the minimum detectable change. The factor analysis combining the items corresponding to anxiety and stress in a single factor, and depression in a second factor, showed a better match of all 21 items, with higher factor loadings in their respective constructs. The reproducibility values for depression were intraclass correlation coefficient with 0.86, standard error of measurement with 0.80, and minimum detectable change with 2.22; and, for anxiety/stress: intraclass correlation coefficient with 0.82, standard error of measurement with 1.80, and minimum detectable change with 4.99. The short version of the Depression Anxiety Stress Scale-21 showed excellent values of reliability, and strong internal consistency. The two-factor model with condensation of the constructs anxiety and stress in a single factor was the most acceptable for the adolescent population.
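The Depression Anxiety Stress Scale-21 abstract above reports the standard error of measurement (SEM) and minimum detectable change (MDC) without giving formulas. Its reported pairs are numerically consistent with the common convention MDC95 = 1.96 × √2 × SEM, which the minimal sketch below checks. The relation SEM = SD × √(1 − ICC) is included as the usual companion formula, but it is an assumption here because the abstract reports no standard deviation.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC (assumed formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimum detectable change at the 95% level for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem


# SEM values as reported in the abstract.
print(round(mdc95(0.80), 2))  # 2.22, the reported MDC for depression
print(round(mdc95(1.80), 2))  # 4.99, the reported MDC for anxiety/stress
```

The factor √2 appears because a change score is the difference of two measurements, each carrying its own measurement error.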
Ara, Mirian; Ferreras, Antonio; Pajarin, Ana B; Calvo, Pilar; Figus, Michele; Frezzotti, Paolo
2015-01-01
To assess the intrasession repeatability and intersession reproducibility of peripapillary retinal nerve fiber layer (RNFL) thickness parameters measured by scanning laser polarimetry (SLP) with enhanced corneal compensation (ECC) in healthy and glaucomatous eyes. One randomly selected eye of 82 healthy individuals and 60 glaucoma subjects was evaluated. Three scans were acquired during the first visit to evaluate intravisit repeatability. A different operator obtained two additional scans within 2 months after the first session to determine intervisit reproducibility. The intraclass correlation coefficient (ICC), coefficient of variation (COV), and test-retest variability (TRT) were calculated for all SLP parameters in both groups. ICCs ranged from 0.920 to 0.982 for intravisit measurements and from 0.910 to 0.978 for intervisit measurements. The temporal-superior-nasal-inferior-temporal (TSNIT) average was the highest (0.967 and 0.946) in normal eyes, while nerve fiber indicator (NFI; 0.982) and inferior average (0.978) yielded the best ICC in glaucomatous eyes for intravisit and intervisit measurements, respectively. All COVs were under 10% in both groups, except NFI. TSNIT average had the lowest COV (2.43%) in either type of measurement. Intervisit TRT ranged from 6.48 to 12.84. The reproducibility of peripapillary RNFL measurements obtained with SLP-ECC was excellent, indicating that SLP-ECC is sufficiently accurate for monitoring glaucoma progression.
Cho, Myung-Rae; Lee, Young Sik; Choi, Won-Kee
2018-03-01
The objective was to evaluate the relationship between side-to-side differences of lateral femoral bowing and varus knee deformity based on two-dimensional (2D) assessment in unilateral total knee arthroplasty (TKA). A total of 143 patients with varus knee osteoarthritis who underwent unilateral TKA were enrolled. We evaluated the side-to-side differences of the frontal lower limb alignment by assessing lateral femoral bowing, anatomical medial distal femoral angle, and anatomical medial proximal tibial angle (aMPTA). The average values of all anatomical indices were significantly different between the operated side and the non-operated side (p<0.05). The side-to-side difference in hip knee ankle (HKA) angle had a statistically significant correlation with that in lateral femoral bowing (intraclass correlation coefficient, 0.259; p=0.002) and that in aMPTA. Linear regression analysis showed 0.199° of side-to-side difference in lateral femoral bowing was associated with 1° of side-to-side difference in bilateral HKA angle. The side-to-side difference in lateral femoral bowing showed a tendency to increase in proportion to varus knee deformity based on 2D assessment in unilateral TKA patients.
Santos, Rafaella Zulianello Dos; Bonin, Christiani Decker Batista; Martins, Eliara Ten Caten; Pereira Junior, Moacir; Ghisi, Gabriela Lima de Melo; Macedo, Kassia Rosangela Paz de; Benetti, Magnus
2018-01-01
The absence of instruments capable of measuring the level of knowledge of hypertensive patients in cardiac rehabilitation programs about their disease reflects the lack of specific recommendations for these patients. To develop and validate a questionnaire to evaluate the knowledge of hypertensive patients in cardiac rehabilitation programs about their disease. A total of 184 hypertensive patients (mean age 60.5 ± 10 years, 66.8% men) were evaluated. Reproducibility was assessed by calculation of the intraclass correlation coefficient using the test-retest method. Internal consistency was assessed by Cronbach's alpha and construct validity by exploratory factor analysis. The final version of the instrument had 17 questions organized in areas considered important for patient education. The instrument proposed showed a clarity index of 8.7 (0.25). The intraclass correlation coefficient was 0.804 and Cronbach's alpha was 0.648. Factor analysis revealed five factors associated with knowledge areas. Regarding the criterion validity, patients with higher education level and higher family income showed greater knowledge about hypertension. The instrument has a satisfactory clarity index and adequate validity, and can be used to evaluate the knowledge of hypertensive participants in cardiac rehabilitation programs.
Stuberg, W A; Colerick, V L; Blanke, D J; Bruce, W
1988-08-01
The purpose of this study was to compare a clinical gait analysis method using videography and temporal-distance measures with 16-mm cinematography in a gait analysis laboratory. Ten children with a diagnosis of cerebral palsy (mean age = 8.8 ± 2.7 years) and 9 healthy children (mean age = 8.9 ± 2.4 years) participated in the study. Stride length, walking velocity, and goniometric measurements of the hip, knee, and ankle were recorded using the two gait analysis methods. A multivariate analysis of variance was used to determine significant differences between the data collected using the two methods. Pearson product-moment correlation coefficients were determined to examine the relationship between the measurements recorded by the two methods. The consistency of performance of the subjects during walking was examined by intraclass correlation coefficients. No significant differences were found between the methods for the variables studied. Pearson product-moment correlation coefficients ranged from .79 to .95, and intraclass coefficients ranged from .89 to .97. The clinical gait analysis method was found to be a valid tool in comparison with 16-mm cinematography for the variables that were studied.
Test-Retest Stability of the Task and Ego Orientation Questionnaire
ERIC Educational Resources Information Center
Lane, Andrew M.; Nevill, Alan M.; Bowes, Neal; Fox, Kenneth R.
2005-01-01
Establishing stability, defined as observing minimal measurement error in a test-retest assessment, is vital to validating psychometric tools. Correlational methods, such as Pearson product-moment, intraclass, and kappa are tests of association or consistency, whereas stability or reproducibility (regarded here as synonymous) assesses the…
Agreement among High School Diving Judges.
ERIC Educational Resources Information Center
Stewart, Michael J.; Blair, William O.
1982-01-01
Raters' agreement and the relative consistency of diving judges at a boy's competition were analyzed using intraclass correlations within 16 position x type combinations. Judges' variance was significant for 5 of the 16 combinations. Point estimates were generally greater for consistency than for raters' agreement about scores. (Author/CM)
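The diving abstract above distinguishes raters' agreement from relative consistency. As a minimal sketch, assuming the two-way single-rating forms ICC(2,1) for agreement and ICC(3,1) for consistency, the code below shows one reason the two can differ: a judge who scores every dive lower by a fixed amount reduces agreement but leaves consistency unchanged. The scores are hypothetical, and the abstract does not state which estimators the authors used.

```python
def two_way_iccs(scores):
    """Return (agreement, consistency) single-rating ICCs from a dives-by-judges table."""
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with at least 2 rows and 2 columns")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # Agreement penalises differences between judges' mean levels; consistency does not.
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return agreement, consistency


# Hypothetical scores: five dives (rows) rated by three judges (columns).
dives = [
    [6.0, 6.5, 5.5],
    [7.5, 7.0, 7.0],
    [5.0, 5.5, 4.5],
    [8.0, 8.5, 8.0],
    [6.5, 6.0, 6.5],
]
# Same table, except the third judge now scores every dive 1.5 points lower.
harsh = [[a, b, c - 1.5] for a, b, c in dives]

agreement_fair, consistency_fair = two_way_iccs(dives)
agreement_harsh, consistency_harsh = two_way_iccs(harsh)
```

In this sketch the consistency value is identical for both tables, while the agreement value falls once the third judge becomes systematically harsher.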
Kerssens, Jan J; Groenewegen, Peter P; Sixma, Herman J; Boerma, Wienke G W; van der Eijk, Ingrid
2004-02-01
To gain insight into similarities and differences in patient evaluations of quality of primary care across 12 European countries and to correlate patient evaluations with WHO health system performance measures (for example, responsiveness) of these countries. Patient evaluations were derived from a series of Quote (QUality of care Through patients' Eyes) instruments designed to measure the quality of primary care. Various research groups provided a total sample of 5133 patients from 12 countries: Belarus, Denmark, Finland, Greece, Ireland, Israel, Italy, the Netherlands, Norway, Portugal, United Kingdom, and Ukraine. Intraclass correlations of 10 Quote items were calculated to measure differences between countries. Performance measures for the same countries, taken from The World Health Report 2000 - Health Systems: Improving Performance, were correlated with mean Quote scores. Intra-class correlation coefficients ranged from low to very high, which indicated little variation between countries in some respects (for example, primary care providers have a good understanding of patients' problems in all countries) and large variation in other respects (for example, with respect to prescription of medication and communication between primary care providers). Most correlations between mean Quote scores per country and WHO performance measures were positive. The highest correlation (0.86) was between the primary care provider's understanding of patients' problems and responsiveness according to WHO. Patient evaluations of the quality of primary care showed large differences across countries and related positively to WHO's performance measures of health care systems.
MIDAS and HIT-6 French translation: reliability and correlation between tests.
Magnoux, E; Freeman, M A; Zlotnik, G
2008-01-01
The aim was to evaluate the test-retest reliability of the French translation of the Migraine Disability Assessment (MIDAS) and Headache Impact Test (HIT)-6 questionnaires as applied to episodic and chronic headaches and to assess the correlation between these two questionnaires. The MIDAS and HIT-6 questionnaires, which assess the degree of migraine-related functional disability, are widely used in headache treatment clinics. The French translation has not been checked for test-retest reliability. MIDAS involves recall, over the previous 3 months, of the number of days with functional disability with regard to work and to home and social life. HIT-6 involves a more subjective and general assessment of headache-related disability over the previous 4 weeks. We expect that there may be greater impact recall bias for chronic headaches than for episodic headaches and considered it important to be able to determine if the reliability of these questionnaires is equally good for these two patient populations. Given that both questionnaires have the same objective, that of assessing headache impact, it was thought useful to determine if their results might show a correlation and if they could thus be used interchangeably. The study was approved by an external ethics committee. The subjects were patients who regularly visit the Clinique de la Migraine de Montréal, which specializes in the treatment of headaches. The MIDAS and HIT-6 questionnaires were completed by the patients during their regular visit. Twelve days later, the same questionnaires were mailed with a prepaid return envelope. Sixty-five patients were required in both the episodic and chronic headache groups, assuming an 80% questionnaire return rate. One hundred and eighty-five patients were enrolled, and 143 completed the study, 75 with episodic headaches and 68 with chronic headaches. The questionnaire return rate was 78.9%. 
On average, questionnaires were completed a second time 21 days after the first, with a median of 19 days. The Shrout-Fleiss intraclass correlation coefficients for MIDAS and HIT-6 were, respectively, 0.76 and 0.77 for episodic headaches and 0.83 and 0.80 for chronic headaches. The Pearson correlation coefficient between the MIDAS and HIT-6 questionnaires was 0.48 for episodic headaches and 0.58 for chronic headaches at the first compilation and 0.42 and 0.59 at the second compilation. The test-retest intraclass correlation of the French versions for both MIDAS and HIT-6 questionnaires indicates moderate reliability for episodic headache and substantial reliability for chronic headache. The correlation between the MIDAS and HIT-6 questionnaires is weak for episodic headaches, but approaches a level of 'good' for chronic headaches.
A Note on Cluster Effects in Latent Class Analysis
ERIC Educational Resources Information Center
Kaplan, David; Keller, Bryan
2011-01-01
This article examines the effects of clustering in latent class analysis. A comprehensive simulation study is conducted, which begins by specifying a true multilevel latent class model with varying within- and between-cluster sample sizes, varying latent class proportions, and varying intraclass correlations. These models are then estimated under…
ERIC Educational Resources Information Center
Rausch, Tobias; Karing, Constance; Dörfler, Tobias; Artelt, Cordula
2016-01-01
This study examined personality similarity between teachers and their students and its impact on teacher judgement of student achievement in the domains of reading comprehension and mathematics. Personality similarity was quantified through intraclass correlations between personality characteristics of 409 dyads of German teachers and their…
The Validation of a Food Label Literacy Questionnaire for Elementary School Children
ERIC Educational Resources Information Center
Reynolds, Jesse S.; Treu, Judith A.; Njike, Valentine; Walker, Jennifer; Smith, Erica; Katz, Catherine S.; Katz, David L.
2012-01-01
Objective: To determine the reliability and validity of a 10-item questionnaire, the Food Label Literacy for Applied Nutrition Knowledge questionnaire. Methods: Participants were elementary school children exposed to a 90-minute school-based nutrition program. Reliability was assessed via Cronbach alpha and intraclass correlation coefficient…
Sudbrack, Simone; Barbosa, Fernanda P; Mattiello, Rita; Booij, Linda; Estorgato, Geovana R; Dutra, Moisés S; Assunção, Fabiana D de; Nunes, Magda L
2018-04-22
To validate the Brazilian Portuguese version of the Family Environment Assessment questionnaire (Inventaire du Milieu Familial). The validation process was carried out in two stages. First, translation and back-translation were performed, and in the second phase, the questionnaire was applied in 72 families of children between 0 and 24 months for the validation process. The tool consists of the following domains: mother's communication ability; behavior; organization of the physical and temporal environment; collection/quantity of toys; maternal attitude of constant attention toward her baby; diversification of stimuli; baby's behavior. The following was performed for the scale validation: 1 - content analysis (judgment); 2 - construct analysis (factorial analysis - Kaiser-Meyer-Olkin, Bartlett, and Pearson's correlation tests); 3 - criterion analysis (calculation of Cronbach's alpha coefficient, intraclass correlations, and split-half correlations). The mean age of the children was 9±6.7 months, and of these, 35 (48.6%) were males. Most correlations between items and domains were significant. In the factorial analysis of the scale, Kaiser-Meyer-Olkin values were 0.76, Bartlett's test showed a p-value<0.001, and correlation between items and domains showed a p-value<0.01. Regarding the validity, Cronbach's alpha was 0.92 (95% CI: 0.89-0.94). The intraclass correlation among the evaluators was 0.97 (0.96-0.98) and split-half correlations, r: 0.60, with p<0.01. The Portuguese version of the Inventaire du Milieu Familial showed good to excellent performance regarding the assessed psychometric properties. Copyright © 2018. Published by Elsevier Editora Ltda.
Barra, Filipe Ramos; de Souza, Fernanda Freire; Camelo, Rosimara Eva Ferreira Almeida; Ribeiro, Andrea Campos de Oliveira; Farage, Luciano
2017-01-01
Objective To assess the feasibility of contrast-enhanced spectral mammography (CESM) of the breast for assessing the size of residual tumors after neoadjuvant chemotherapy (NAC). Materials and methods In breast cancer patients who underwent NAC between 2011 and 2013, we evaluated residual tumor measurements obtained with CESM and full-field digital mammography (FFDM). We determined the concordance between the methods, as well as their level of agreement with the pathology. Three radiologists analyzed eight CESM and FFDM measurements separately, considering the size of the residual tumor at its largest diameter and correlating it with that determined in the pathological analysis. Interobserver agreement was also evaluated. Results The sensitivity, specificity, positive predictive value, and negative predictive value were higher for CESM than for FFDM (83.33%, 100%, 100%, and 66% vs. 50%, 50%, 50%, and 25%, respectively). The CESM measurements showed a strong, consistent correlation with the pathological findings (correlation coefficient = 0.76-0.92; intraclass correlation coefficient = 0.692-0.886). The correlation between the FFDM measurements and the pathological findings was not statistically significant, with questionable consistency (intraclass correlation coefficient = 0.488-0.598). Agreement with the pathological findings was narrower for CESM measurements than for FFDM measurements. Interobserver agreement was higher for CESM than for FFDM (0.94 vs. 0.88). Conclusion CESM is a feasible means of evaluating residual tumor size after NAC, showing a good correlation and good agreement with pathological findings. For CESM measurements, the interobserver agreement was excellent. PMID:28894329
Jones, Sydney A; Evenson, Kelly R; Johnston, Larry F; Trost, Stewart G; Samuel-Hodge, Carmen; Jewell, David A; Kraschnewski, Jennifer L; Keyserling, Thomas C
2015-01-01
This study explored the criterion-related validity and test-retest reliability of the modified RESIDential Environment physical activity questionnaire and whether the instrument's validity varied by body mass index, education, race/ethnicity, or employment status. Validation study using baseline data collected for randomized trial of a weight loss intervention. Participants recruited from health departments wore an ActiGraph accelerometer and self-reported non-occupational walking, moderate and vigorous physical activity on the modified RESIDential Environment questionnaire. We assessed validity (n=152) using Spearman correlation coefficients, and reliability (n=57) using intraclass correlation coefficients. When compared to steps, moderate physical activity, and bouts of moderate/vigorous physical activity measured by accelerometer, these questionnaire measures showed fair evidence for validity: recreational walking (Spearman correlation coefficients 0.23-0.36), total walking (Spearman correlation coefficients 0.24-0.37), and total moderate physical activity (Spearman correlation coefficients 0.18-0.36). Correlations for self-reported walking and moderate physical activity were higher among unemployed participants and women with lower body mass indices. Generally no other variability in the validity of the instrument was found. Evidence for reliability of RESIDential Environment measures of recreational walking, total walking, and total moderate physical activity was substantial (intraclass correlation coefficients 0.56-0.68). Evidence for questionnaire validity and reliability varied by activity domain and was strongest for walking measures. The questionnaire may capture physical activity less accurately among women with higher body mass indices and employed participants. Capturing occupational activity, specifically walking at work, may improve questionnaire validity. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Nutrition Environment Food Pantry Assessment Tool (NEFPAT): Development and Evaluation.
Nikolaus, Cassandra J; Laurent, Emily; Loehmer, Emily; An, Ruopeng; Khan, Naiman; McCaffrey, Jennifer
2018-04-24
To develop and evaluate a nutrition environment assessment tool to assess the consumer nutrition environment and use of recommended practices in food pantries. The Nutrition Environment Food Pantry Assessment Tool (NEFPAT) was developed based on a literature review and guidance from professionals working with food pantries. The tool was pilot-tested at 9 food pantries, an expert panel assessed content validity, and interrater reliability was evaluated by pairs in 3 pantries. After revisions, the NEFPAT was used in 27 pantries. Pilot tests indicated positive appraisal for the NEFPAT and recommendations were addressed. The NEFPAT's 6 objectives and the overall tool were rated as content valid by experts, with an average section rating of 3.85 ± 0.10. Intraclass correlation coefficients for interrater reliability were >0.90. The NEFPAT is content valid with high interrater reliability. It provides baseline data that could be valuable for interventions within the nutrition environment of food pantries. Published by Elsevier Inc.
Townsend, Mary K; Franke, Adrian A; Li, Xingnan; Hu, Frank B; Eliassen, A Heather
2013-09-13
Associations of bisphenol A and phthalates with chronic disease health outcomes are increasingly being investigated in epidemiologic studies. The majority of previous studies of within-person variability in urinary bisphenol A and phthalate metabolite concentrations have focused on reproducibility over short time periods. Long-term reproducibility data are needed to assess the potential usefulness of these biomarkers for prospective studies, particularly those examining risk of diseases with long latency periods. Low within-person reproducibility may attenuate relative risk estimates and reduce statistical power to detect associations with disease. Therefore, we assessed within-person reproducibility of bisphenol A, eight phthalate metabolites, and phthalic acid in spot urine samples over 1 to 3 years among women enrolled in two large cohort studies. Women in the Nurses' Health Study and Nurses' Health Study II provided two spot urine samples, 1 to 3 years apart (n = 80 women for analyses of bisphenol A; n = 40 women for analyses of phthalate metabolites; n = 34 women for analyses of phthalic acid). To measure within-person reproducibility, we calculated Spearman rank correlation coefficients and intraclass correlation coefficients for creatinine-adjusted concentrations of bisphenol A, phthalate metabolites, and phthalic acid. Over 1 to 3 years, within-person variability of bisphenol A was high relative to total variability (intraclass correlation coefficient = 0.14) and rankings of bisphenol A levels between time-points were weakly correlated (Spearman correlation = 0.19). Seven of the eight phthalate metabolites and phthalic acid demonstrated moderate within-person stability over time (Spearman correlation or intraclass correlation coefficient = 0.39-0.55). Restricting analyses to first-morning urine samples did not alter results. 
Single measurements of bisphenol A in spot urine samples were highly variable within women over 1 to 3 years, indicating that investigation of associations between a single urinary bisphenol A measurement and disease risk may be challenging in epidemiologic studies. The majority of urinary phthalate metabolites and phthalic acid appeared moderately reproducible within women over time, suggesting single measurements may be useful in epidemiologic studies, although observed relative risks can be substantially attenuated.
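The reproducibility measure in the abstract above can be illustrated with a one-way random-effects intraclass correlation, which expresses between-person variance as a share of total variance. The sketch below is only an illustration: the abstract does not state which ICC form, transformation, or software the authors used, and the function name `icc_oneway` and the sample values are hypothetical.

```python
import numpy as np

def icc_oneway(x):
    """One-way random-effects ICC for an (n_subjects, k_repeats) array.

    Assumption: a balanced design with the same number of repeats per subject.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    subject_means = x.mean(axis=1)
    # Between-subject and within-subject mean squares from one-way ANOVA
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((x - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical creatinine-adjusted concentrations: two samples per woman
samples = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
])
icc = icc_oneway(samples)
print(f"ICC = {icc:.3f}")
```

A value near 0 means most of the variation is within persons, as reported for bisphenol A. A value near 1 means a single measurement ranks individuals consistently.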
A Metric to Quantify Shared Visual Attention in Two-Person Teams
NASA Technical Reports Server (NTRS)
Gontar, Patrick; Mulligan, Jeffrey B.
2015-01-01
1) Introduction: Critical tasks in high-risk environments are often performed by teams, the members of which must work together efficiently. In some situations, the team members may have to work together to solve a particular problem, while in others it may be better for them to divide the work into separate tasks that can be completed in parallel. We hypothesize that these two team strategies can be differentiated on the basis of shared visual attention, measured by gaze tracking. 2) Methods: Gaze recordings were obtained for two-person flight crews flying a high-fidelity simulator (Gontar, Hoermann, 2014). Gaze was categorized with respect to 12 areas of interest (AOIs). We used these data to construct time series of 12-dimensional vectors, with each vector component representing one of the AOIs. At each time step, each vector component was set to 0, except for the one corresponding to the currently fixated AOI, which was set to 1. This time series could then be averaged in time, with the averaging window time (t) as a variable parameter. For example, when we average with a t of one minute, each vector component represents the proportion of time that the corresponding AOI was fixated within the corresponding one-minute interval. We then computed the Pearson product-moment correlation coefficient between the gaze proportion vectors for each of the two crew members, at each point in time, resulting in a signal representing the time-varying correlation between gaze behaviors. We determined criteria for concluding correlated gaze behavior using two methods: first, a permutation test was applied to the subjects' data. When one crew member's gaze proportion vector is correlated with a random time sample from the other crew member's data, a distribution of correlation values is obtained that differs markedly from the distribution obtained from temporally aligned samples. 
In addition to validating that the gaze tracker was functioning reasonably well, this also allows us to compute probabilities of coordinated behavior for each value of the correlation. As an alternative, we also tabulated distributions of correlation coefficients for synthetic data sets, in which the behavior was modeled as a first-order Markov process, and compared correlation distributions for identical processes with those for disparate processes, allowing us to choose criteria and estimate error rates. 3) Discussion: Our method of gaze correlation is able to measure shared visual attention, and can distinguish between activities involving different instruments. We plan to analyze whether pilots' strategies of sharing visual attention can predict performance. Possible measurements of performance include expert ratings from instructors, fuel consumption, total task time, and failure rate. While developed for two-person crews, our approach can be applied to larger groups, using intra-class correlation coefficients instead of the Pearson product-moment correlation.
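The gaze-correlation procedure described in this abstract (one-hot AOI vectors, time averaging, per-window Pearson correlation, and a permutation baseline) might be sketched roughly as follows. This is not the authors' code. The function names, the use of non-overlapping averaging windows, and the synthetic fixation sequences are all assumptions made for illustration.

```python
import numpy as np

def gaze_proportions(aoi_idx, n_aoi, window):
    """Turn a sequence of fixated-AOI indices into per-window proportion vectors.

    Assumption: non-overlapping windows of `window` time steps; any
    incomplete trailing window is dropped.
    """
    aoi_idx = np.asarray(aoi_idx)
    one_hot = np.eye(n_aoi)[aoi_idx]              # shape (T, n_aoi)
    n_win = len(aoi_idx) // window
    trimmed = one_hot[: n_win * window]
    return trimmed.reshape(n_win, window, n_aoi).mean(axis=1)

def rowwise_pearson(a, b):
    """Pearson correlation between matching rows of a and b (NaN if a row is constant)."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    with np.errstate(invalid="ignore", divide="ignore"):
        return (a * b).sum(axis=1) / denom

def permutation_null(pa, pb, n_perm, rng):
    """Correlations obtained when one member's windows are randomly re-ordered in time."""
    null = [rowwise_pearson(pa, pb[rng.permutation(len(pb))]) for _ in range(n_perm)]
    return np.concatenate(null)

# Synthetic example: 12 AOIs, two crew members, hypothetical fixation sequences
rng = np.random.default_rng(0)
n_aoi, n_steps, window = 12, 6000, 60
pilot_a = rng.integers(0, n_aoi, size=n_steps)
pilot_b = np.where(rng.random(n_steps) < 0.5, pilot_a, rng.integers(0, n_aoi, size=n_steps))

pa = gaze_proportions(pilot_a, n_aoi, window)
pb = gaze_proportions(pilot_b, n_aoi, window)
aligned = rowwise_pearson(pa, pb)                 # time-varying correlation signal
null = permutation_null(pa, pb, n_perm=100, rng=rng)
print("mean aligned r:", np.nanmean(aligned), "mean permuted r:", np.nanmean(null))
```

Comparing the aligned distribution with the permuted one gives a data-driven threshold for calling gaze behavior "correlated". The window length plays the role of the averaging time t in the abstract.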
Values of a Patient and Observer Scar Assessment Scale to Evaluate the Facial Skin Graft Scar
Chae, Jin Kyung; Kim, Eun Jung; Park, Kun
2016-01-01
Background The patient and observer scar assessment scale (POSAS) recently emerged as a promising method, reflecting both the observer's and the patient's opinions in evaluating scars. This tool was shown to be consistent and reliable in burn scar assessment, but it has not been tested in the setting of skin graft scars in skin cancer patients. Objective To evaluate facial skin graft scars using the POSAS and to compare it with objective scar assessment tools. Methods Twenty-three patients who were diagnosed with facial cutaneous malignancy and received skin grafts after Mohs micrographic surgery were recruited. Observer assessment was performed by three independent raters using the observer component of the POSAS and the Vancouver scar scale (VSS). Patient self-assessment was performed using the patient component of the POSAS. To quantify scar color and scar thickness more objectively, a spectrophotometer and ultrasonography were applied. Results Inter-observer reliability was substantial with both the VSS and the observer component of the POSAS (average measure intraclass correlation coefficient, 0.76 and 0.80, respectively). The observer component consistently showed significant correlations with patients' ratings for the parameters of the POSAS (all p-values<0.05). The correlation between subjective assessment using the POSAS and objective assessment using the spectrophotometer and ultrasonography was low. Conclusion In facial skin graft scar assessment in skin cancer patients, the POSAS showed acceptable inter-observer reliability. This tool was more comprehensive and had higher correlation with the patient's opinion. PMID:27746642
Ko, Seok-Jae; Lee, Hyunju; Kim, Seul-Ki; Kim, Minji; Kim, Jinsung; Lee, Beom-Joon; Park, Jae-Woo
2015-06-01
Abdominal examination (AE) is the evaluation of the status of illness by examining the abdominal region in traditional Korean medicine (TKM). Although AE is currently considered an important diagnostic method in TKM, owing to its clinical usage, no studies have been conducted to objectively assess its accuracy and develop standards. Twelve healthy subjects and 21 patients with functional dyspepsia have participated in this study. The patients were classified into epigastric discomfort group (n=11) and epigastric discomfort with tenderness group (n=10) according to the clinical diagnosis by AE. After evaluating the subjective epigastric discomfort in all subjects, two independent clinicians measured the pressure pain threshold (PPT) two times at an acupoint (CV 14) using an algometer. We then assessed the interrater and intrarater reliability of the PPT measurements and evaluated the validity (sensitivity and specificity) via a receiver operating characteristic plot and optimal cutoff value. The results of the interrater reliability test showed a very strong correlation (correlation coefficient range: 0.82-0.91). The results of intrarater reliability test also showed a higher than average correlation (intraclass correlation coefficient: 0.58-0.70). The optimal cutoff value of PPT in the epigastric area was 1.8 kg/cm(2) with 100% sensitivity and 54.54% specificity. PPT measurements in the epigastric area with an algometer demonstrated high reliability and validity for AE, which makes this approach potentially useful in clinical applications as a new quantitative measurement in TKM.
Axelsen, M B; Stoltenberg, M; Poggenborg, R P; Kubassova, O; Boesen, M; Bliddal, H; Hørslev-Petersen, K; Hanson, L G; Østergaard, M
2012-03-01
To determine whether dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) evaluated using semi-automatic image processing software can accurately assess synovial inflammation in rheumatoid arthritis (RA) knee joints. In 17 RA patients undergoing knee surgery, the average grade of histological synovial inflammation was determined from four biopsies obtained during surgery. A preoperative series of T(1)-weighted dynamic fast low-angle shot (FLASH) MR images was obtained. Parameters characterizing contrast uptake dynamics, including the initial rate of enhancement (IRE), were generated by the software in three different areas: (I) the entire slice (Whole slice); (II) a manually outlined region of interest (ROI) drawn quickly around the joint, omitting large artefacts such as blood vessels (Quick ROI); and (III) a manually outlined ROI following the synovial capsule of the knee joint (Precise ROI). Intra- and inter-reader agreement was assessed using the intra-class correlation coefficient (ICC). The IRE from the Quick ROI and the Precise ROI revealed high correlations to the grade of histological inflammation (Spearman's correlation coefficient (rho) = 0.70, p = 0.001 and rho = 0.74, p = 0.001, respectively). Intra- and inter-reader ICCs were very high (0.93-1.00). No Whole slice parameters were correlated to histology. DCE-MRI provides fast and accurate assessment of synovial inflammation in RA patients. Manual outlining of the joint to omit large artefacts is necessary.
Regan, Timothy; Paul, Christine; Ishiguchi, Paul; D'Este, Catherine; Koller, Claudia; Forshaw, Kristy; Noble, Natasha; Oldmeadow, Christopher; Bisquera, Alessandra; Eades, Sandra
2017-10-17
The objective of this study was to determine the concordance between data extracted from two Clinical Decision Support Systems regarding diabetes testing and monitoring at Aboriginal Community Controlled Health Services in Australia. De-identified PenCAT and Communicare Systems data were extracted from the services allocated to the intervention arm of a diabetes care trial, and intra-class correlations for each extracted item were derived at a service level. Strong to very strong correlations between the two data sources were found regarding the total number of patients with diabetes per service (Intra-class correlation [ICC] = 0.99), as well as the number (ICC = 0.98-0.99) and proportion (ICC = 0.96) of patients with diabetes by gender. The correlation was moderate for the number and proportion of Type 2 diabetes patients per service in the group aged 18-34 years (ICC = 0.65 and 0.8-0.82 respectively). Strong to very strong correlations were found for numbers and proportions of patients being tested for diabetes, and for appropriate monitoring of patients known to have diabetes (ICC = 0.998-1.00). This indicated a generally high degree of concordance between whole-service data extracted by the two Clinical Decision Support Systems. Therefore, the less expensive or less complex option (depending on the individual circumstances of the service) may be appropriate for monitoring diabetes testing and care. However, the extraction of data about subgroups of patients may not be interchangeable.
Simões, Maria do Socorro Mp; Garcia, Isabel Ff; Costa, Lucíola da Cm; Lunardi, Adriana C
2018-05-01
The Life-Space Assessment (LSA) assesses mobility based on the spaces that older adults go to, how often they go there, and how independently they move. Despite its increased use, LSA measurement properties remain unclear. The aim of the present study was to analyze the content validity, reliability, construct validity and interpretability of the LSA for Brazilian community-dwelling older adults. In this clinimetric study we analyzed the measurement properties (content validity, reliability, construct validity and interpretability) of the LSA administered to 80 Brazilian community-dwelling older adults. Reliability was analyzed by Cronbach's alpha (internal consistency), intraclass correlation coefficients and 95% confidence interval (reproducibility), and standard error of measurement (measurement error). Construct validity was analyzed by Pearson's correlations between the LSA and accelerometry (time in inactivity and moderate-to-vigorous activities), and interpretability was analyzed by determination of the minimal detectable change, and floor and ceiling effects. The LSA met the criteria for content validity. The Cronbach's alpha was 0.92, intraclass correlation coefficient was 0.97 (95% confidence interval 0.95-0.98) and standard error of measurement was 4.12. The LSA showed convergence with accelerometry (negative correlation with time in inactivity and positive correlation with time in moderate to vigorous activities), the minimal detectable change was 0.36 and we observed no floor or ceiling effects. The LSA showed adequate reliability, validity and interpretability for life-space mobility assessment of Brazilian community-dwelling older adults. Geriatr Gerontol Int 2018; 18: 783-789. © 2018 Japan Geriatrics Society.
Measuring the Cobb angle with the iPhone in kyphoses: a reliability study.
Jacquot, Frederic; Charpentier, Axelle; Khelifi, Sofiane; Gastambide, Daniel; Rigal, Regis; Sautet, Alain
2012-08-01
Smartphones have gained widespread use in the healthcare field to fulfill a variety of tasks. We developed a small iPhone application to take advantage of the built-in position sensor to measure angles in a variety of spinal deformities. We present a reliability study of this tool in measuring kyphotic angles. Radiographs taken from 20 different patients' charts were presented to a panel of six operators at two different times. Radiographs were measured with the protractor and the iPhone application, and statistical analysis was applied to measure intraclass correlation coefficients between both measurement methods, and to measure intra- and interobserver reliability. The intraclass correlation coefficient calculated between methods (i.e. CobbMeter application on the iPhone versus standard method with the protractor) was 0.963 for all measures, indicating excellent correlation between the CobbMeter application and the standard method. The interobserver correlation coefficient was 0.965. The intraobserver ICC was 0.977, indicating excellent reproducibility of measurements at different times for all operators. The interobserver ICC between fellowship-trained senior surgeons and general orthopaedic residents was 0.989. Consistently, the ICC for intraobserver and interobserver correlations was higher with the CobbMeter application than with the regular protractor method. This difference was not statistically significant. Measuring kyphotic angles with the iPhone application appears to be a valid procedure and is in no way inferior to the standard way of measuring the Cobb angle in kyphotic deformities.
Lienhard, K; Lauermann, S P; Schneider, D; Item-Glatthorn, J F; Casartelli, N C; Maffiuletti, N A
2013-12-01
Reliability of isometric, isokinetic and isoinertial modalities for quadriceps strength evaluation, and the relation between quadriceps strength and physical function was investigated in 29 total knee arthroplasty (TKA) patients, with an average age of 63 years. Isometric maximal voluntary contraction torque, isokinetic peak torque, and isoinertial one-repetition maximum load of the involved and uninvolved quadriceps were evaluated as well as objective (walking parameters) and subjective physical function (WOMAC). Reliability was good and comparable for the isometric, isokinetic, and isoinertial strength outcomes on both sides (intraclass correlation coefficient range: 0.947-0.966; standard error of measurement range: 5.1-9.3%). Involved quadriceps strength was significantly correlated to walking speed (r range: 0.641-0.710), step length (r range: 0.685-0.820) and WOMAC function (r range: 0.575-0.663), independent from the modality (P < 0.05). Uninvolved quadriceps strength was also significantly correlated to walking speed (r range: 0.413-0.539), step length (r range: 0.514-0.608) and WOMAC function (r range: 0.374-0.554) (P < 0.05), except for WOMAC function/isokinetic peak torque (P > 0.05). In conclusion, isometric, isokinetic, and isoinertial modalities ensure valid and reliable assessment of quadriceps muscle strength in TKA patients. Copyright © 2013 Elsevier Ltd. All rights reserved.
Nimphius, Sophia; McGuigan, Michael R; Suchomel, Timothy J; Newton, Robert U
2016-06-01
This study assessed reliability of discrete ground reaction force (GRF) variables over multiple pitching trials, investigated the relationships between discrete GRF variables and pitch velocity (PV) and assessed the variability of the "force signature" or continuous force-time curve during the pitching motion of windmill softball pitchers. Intraclass correlation coefficient (ICC) for all discrete variables was high (0.86-0.99) while the coefficient of variance (CV) was low (1.4-5.2%). Two discrete variables were significantly correlated to PV; second vertical peak force (r(5)=0.81, p=0.03) and time between peak forces (r(5)=-0.79; p=0.03). High ICCs and low CVs support the reliability of discrete GRF and PV variables over multiple trials and significant correlations indicate there is a relationship between the ability to produce force and the timing of this force production with PV. The mean of all pitchers' curve-average standard deviation of their continuous force-time curves demonstrated low variability (CV=4.4%) indicating a repeatable and identifiable "force signature" pattern during this motion. As such, the continuous force-time curve in addition to discrete GRF variables should be examined in future research as a potential method to monitor or explain changes in pitching performance. Copyright © 2016 Elsevier B.V. All rights reserved.
Pereira, Sara; Todd Katzmarzyk, Peter; Gomes, Thayse Natacha; Souza, Michele; Chaves, Raquel Nichele; dos Santos, Fernanda Karina; Santos, Daniel; Hedeker, Donald; Maia, José
2017-01-01
This study investigates biological, behavioural and sociodemographic correlates of intra-pair similarities, and estimates sibling resemblance in health-related physical fitness (PF). The sample comprises 1101 biological siblings (525 females) aged 9–20 years. PF components and markers were: morphological [waist circumference (WC) and %body fat (%BF)], muscular [handgrip strength (GS) and standing long jump (SLJ)], motor [50-yard dash (50YD) and shuttle run (SR)], and cardiorespiratory (1-mile run). Biological maturation was assessed; physical activity (PA), TV viewing and socioeconomic status (SES) information was obtained. On average, older and more mature subjects are better performers in all PF components; PA was negatively associated with SR, while SES was negatively associated with SLJ and SR. A pattern was observed in the intraclass correlations (ρ) wherein same-sex siblings demonstrate greater resemblance for most PF components (sister-sister: 0.35≤ρ≤0.55; brother-brother: 0.25≤ρ≤0.60) than brother-sister pairs (BS) (0≤ρ≤0.15), except for %BF (ρBB>ρSS>ρBS), and the 1-mile run (ρSS>ρBS>ρBB). In conclusion, behavioural and sociodemographic correlates play different roles in siblings' PF expression. Further, a significant familial PF resemblance was observed with different trends in different sibling types, probably due to variations in shared genetic factors and sociodemographic conditions. PMID:28187195
Manios, Y; Androutsos, O; Moschonis, G; Birbilis, M; Maragkopoulou, K; Giannopoulou, A; Argyri, E; Kourlaba, G
2013-10-01
The aim of this paper was to evaluate the criterion validity of the Physical Activity Questionnaire for Schoolchildren (PAQ-S). The current study is a subcohort of the Healthy Growth Study, a large-scale cross-sectional study. 202 schoolchildren aged 9-13 years from Greece completed the PAQ-S and wore an accelerometer for 4 consecutive days. Time spent in moderate (MPA), moderate-to-vigorous (MVPA) and vigorous (VPA) physical activity was calculated based on PAQ-S and accelerometer data. The average times spent on MPA and MVPA as derived from PAQ-S and from accelerometers were significantly and moderately correlated (r=0.462, P<0.001 and r=0.483, P<0.001, respectively). No significant correlation was detected between PAQ-S and accelerometer-measured time spent performing VPA (rho=0.150, P=0.057). The Intraclass Correlation Coefficient (ICC) indicated a moderate agreement between PAQ-S and accelerometer in estimating MPA (ICC=0.592, P<0.001) and MVPA (ICC=0.581, P<0.001). Bland-Altman analysis revealed a small mean difference (the "bias") between the two methods in estimating MPA, although this difference was found to be significantly higher than zero ("bias"=27.4% of the accelerometer-measured mean score, P=0.006). On the other hand, Bland-Altman analysis revealed a large mean difference in estimating MVPA and VPA ("bias"=84.2% and 357% of the accelerometer-measured mean score for MVPA and VPA, respectively, and P<0.001). The high correlation coefficients between the average and difference values for all physical activity scores derived from accelerometers and PAQ-S indicate a systematic overestimation of physical activity time by PAQ-S with increasing physical activity. The validity of PAQ-S for the estimation of MPA and MVPA was found to be similar to that of other self-reported measures for schoolchildren. Therefore, this questionnaire could be used as a tool for physical activity assessment in large population studies.
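The Bland-Altman quantities reported in this abstract (the mean difference or "bias", its size relative to the reference mean, and the correlation between pair averages and differences) can be sketched as below. This is a generic illustration, not the study's analysis: the function name `bland_altman`, the 1.96-SD limits of agreement, and the minutes-per-day values are assumptions.

```python
import numpy as np

def bland_altman(test_method, reference_method):
    """Basic Bland-Altman summary for paired measurements.

    Returns the mean difference (bias), the bias as a percentage of the
    reference mean, the 95% limits of agreement (bias +/- 1.96 SD), and the
    Pearson correlation between pair means and differences, which flags
    proportional bias when it is far from zero.
    """
    test = np.asarray(test_method, dtype=float)
    ref = np.asarray(reference_method, dtype=float)
    diff = test - ref
    pair_mean = (test + ref) / 2.0
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return {
        "bias": bias,
        "bias_pct_of_reference": 100.0 * bias / ref.mean(),
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
        "r_mean_vs_diff": np.corrcoef(pair_mean, diff)[0, 1],
    }

# Hypothetical minutes/day of activity: questionnaire vs accelerometer
questionnaire = [10.0, 20.0, 30.0, 40.0]
accelerometer = [8.0, 15.0, 22.0, 29.0]
summary = bland_altman(questionnaire, accelerometer)
print(summary)
```

In this toy data the differences grow with the pair means, so the mean-versus-difference correlation is high. That is the pattern the abstract interprets as overestimation increasing with activity level.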
Hielm-Björkman, Anna K; Kapatkin, Amy S; Rita, Hannu J
2011-05-01
To assess validity and reliability for a visual analogue scale (VAS) used by owners to measure chronic pain in their osteoarthritic dogs. 68, 61, and 34 owners who completed a questionnaire. Owners answered questionnaires at 5 time points. Criterion validity of the VAS was evaluated for all dogs in the intended-to-treat population by correlating scores for the VAS with scores for the validated Helsinki Chronic Pain Index (HCPI) and a relative quality-of-life scale. Intraclass correlation was used to assess repeatability of the pain VAS at 2 baseline evaluations. To determine sensitivity to change and face validity of the VAS, 2 blinded, randomized control groups (17 dogs receiving carprofen and 17 receiving a placebo) were analyzed over time. Significant correlations existed between the VAS score and the quality-of-life scale and HCPI scores. Intraclass coefficient (r = 0.72; 95% confidence interval, 0.57 to 0.82) for the VAS indicated good repeatability. In the carprofen and placebo groups, there was poor correlation between the 2 pain evaluation methods (VAS and HCPI items) at the baseline evaluation, but the correlation improved in the carprofen group over time. No correlation was detected for the placebo group over time. Although valid and reliable, the pain VAS was a poor tool for untrained owners because of poor face validity (ie, owners could not recognize their dogs' behavior as signs of pain). Only after owners had seen pain diminish and then return (after starting and discontinuing NSAID use) did the VAS have face validity.
van der Wal, Martijn; Bloemen, Monica; Verhaegen, Pauline; Tuinebreijer, Wim; de Vet, Henrica; van Zuijlen, Paul; Middelkoop, Esther
2013-01-01
Color measurements are an essential part of scar evaluation. Thus, vascularization (erythema) and pigmentation (melanin) are common outcome parameters in scar research. The aim of this study was to investigate the clinimetric properties and clinical feasibility of the Mexameter, Colorimeter, and the DSM II ColorMeter for objective measurements on skin and scars. Fifty scars with a mean age of 6 years (2 months to 53 years) were included. Reliability was tested using the single-measure interobserver intraclass correlation coefficient. Validity was determined by measuring the Pearson correlation with the Fitzpatrick skin type classification (for skin) and the Patient and Observer Scar Assessment Scale (for scar tissue). All three instruments provided reliable readings (intraclass correlation coefficient ≥ 0.83; confidence interval: 0.71-0.90) on normal skin and scar tissue. Parameters with the highest correlations with the Fitzpatrick classification were melanin (Mexameter), 0.72; ITA (Colorimeter), -0.74; and melanin (DSM II), 0.70. On scars, the highest correlations with the Patient and Observer Scar Assessment Scale vascularization scores were the following: erythema (Mexameter), 0.59; LAB2 (Colorimeter), 0.69; and erythema (DSM II), 0.66. For hyperpigmentation, the highest correlations were melanin (Mexameter), 0.75; ITA (Colorimeter), -0.80; and melanin (DSM II), 0.83. This study shows that all three instruments can provide reliable color data on skin and scars with a single measurement. The authors also demonstrated that they can assist in objective skin type classification. For scar assessment, the most valid parameters in each instrument were identified.
Sample Size Estimation in Cluster Randomized Educational Trials: An Empirical Bayes Approach
ERIC Educational Resources Information Center
Rotondi, Michael A.; Donner, Allan
2009-01-01
The educational field has now accumulated an extensive literature reporting on values of the intraclass correlation coefficient, a parameter essential to determining the required size of a planned cluster randomized trial. We propose here a simple simulation-based approach including all relevant information that can facilitate this task. An…
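As a rough illustration of why the intraclass correlation coefficient drives the size of a cluster randomized trial, the sketch below applies the standard design effect, 1 + (m - 1) × ICC, for equal-sized clusters. This is not the simulation-based empirical Bayes approach the authors propose; the function names and all input values are hypothetical.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def clusters_per_arm(n_individual: int, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm, given the per-arm sample size that an
    individually randomized trial would require (hypothetical input)."""
    inflated_n = n_individual * design_effect(cluster_size, icc)
    return math.ceil(inflated_n / cluster_size)


# Hypothetical planning values: 100 pupils per arm under individual
# randomization, classrooms of 25 pupils, ICC of 0.05.
print(design_effect(25, 0.05))
print(clusters_per_arm(100, 25, 0.05))
```

Even a small ICC inflates the required sample noticeably when clusters are large, which is why a well-informed ICC estimate matters at the planning stage.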
Intraclass Correlations for Three-Level Multi-Site Cluster-Randomized Trials of Science Achievement
ERIC Educational Resources Information Center
Westine, Carl D.
2015-01-01
A cluster-randomized trial (CRT) relies on random assignment of intact clusters to treatment conditions, such as classrooms or schools (Raudenbush & Bryk, 2002). One specific type of CRT, a multi-site CRT (MSCRT), is commonly employed in educational research and evaluation studies (Spybrook & Raudenbush, 2009; Spybrook, 2014; Bloom,…
Designing Large-Scale Multisite and Cluster-Randomized Studies of Professional Development
ERIC Educational Resources Information Center
Kelcey, Ben; Spybrook, Jessaca; Phelps, Geoffrey; Jones, Nathan; Zhang, Jiaqi
2017-01-01
We develop a theoretical and empirical basis for the design of teacher professional development studies. We build on previous work by (a) developing estimates of intraclass correlation coefficients for teacher outcomes using two- and three-level data structures, (b) developing estimates of the variance explained by covariates, and (c) modifying…
The Power of the Test for Treatment Effects in Three-Level Block Randomized Designs
ERIC Educational Resources Information Center
Konstantopoulos, Spyros
2008-01-01
Experiments that involve nested structures may assign treatment conditions either to subgroups (such as classrooms) or individuals within subgroups (such as students). The design of such experiments requires knowledge of the intraclass correlation structure to compute the sample sizes necessary to achieve adequate power to detect the treatment…
Alcohol Drinking Onset: A Reliability Study
ERIC Educational Resources Information Center
Prause, JoAnn; Dooley, David; Ham-Rowbottom, Kathleen A.; Emptage, Nicholas
2007-01-01
Early alcohol drinking onset (ADO) is associated with adult alcohol misuse, but the accuracy of ADO is unclear. Reliability of self-reported ADO was studied in two panels of the National Longitudinal Survey of Youth. For the Adult sample (n = 6,215), the intraclass correlation coefficient (ICC) was 0.36. Older respondents had higher reliabilities…
A More Powerful Test in Three-Level Cluster Randomized Designs
ERIC Educational Resources Information Center
Konstantopoulos, Spyros
2011-01-01
Field experiments that involve nested structures frequently assign treatment conditions to entire groups (such as schools). A key aspect of the design of such experiments includes knowledge of the clustering effects that are often expressed via intraclass correlation. This study provides methods for constructing a more powerful test for the…
Podolsky, Dale J; Fisher, David M; Wong Riff, Karen W; Szasz, Peter; Looi, Thomas; Drake, James M; Forrest, Christopher R
2018-06-01
This study assessed technical performance in cleft palate repair using a newly developed assessment tool and high-fidelity cleft palate simulator through a longitudinal simulation training exercise. Three residents performed five and one resident performed nine consecutive endoscopically recorded cleft palate repairs using a cleft palate simulator. Two fellows in pediatric plastic surgery and two expert cleft surgeons also performed recorded simulated repairs. The Cleft Palate Objective Structured Assessment of Technical Skill (CLOSATS) and end-product scales were developed to assess performance. Two blinded cleft surgeons assessed the recordings and the final repairs using the CLOSATS, end-product scale, and a previously developed global rating scale. The average procedure-specific (CLOSATS), global rating, and end-product scores increased logarithmically after each successive simulation session for the residents. Reliability of the CLOSATS (average item intraclass correlation coefficient (ICC), 0.85 ± 0.093) and global ratings (average item ICC, 0.91 ± 0.02) among the raters was high. Reliability of the end-product assessments was lower (average item ICC, 0.66 ± 0.15). Standard setting linear regression using an overall cutoff score of 7 of 10 corresponded to a pass score for the CLOSATS and the global score of 44 (maximum, 60) and 23 (maximum, 30), respectively. Using logarithmic best-fit curves, 6.3 simulation sessions are required to reach the minimum standard. A high-fidelity cleft palate simulator has been developed that improves technical performance in cleft palate repair. The simulator and technical assessment scores can be used to determine performance before operating on patients.
Díez Rodriguez-Labajo, A; Castarlenas, E; Miró, J; Reinoso-Barbero, F
2017-03-01
Parental report on a child's secondary chronic pain is commonly requested by anesthesiologists when the child cannot directly provide information. Daily pain intensity is reported as highest, average and lowest. However, it is unclear whether the parents' score is a valid indicator of the child's pain experience. Nineteen children (aged 6-18 years) with secondary chronic pain attending our anesthesiologist-run pediatric pain unit participated in this study. Identification of highest, average and lowest pain intensity levels was requested during initial screening interviews with the child and parents. Pain intensity was scored on a 0-10 numerical rating scale. Agreement was examined using: (i) intraclass correlation coefficient (ICC), and (ii) the Bland-Altman method. The ICCs between the children's and the parents' pain intensity reports were: 0.92 for the highest, 0.68 for the average, and 0.50 for the lowest pain intensity domains. The limits of agreement set at 95% between child and parental reports were respectively +2.19 to -2.07, +3.17 to -3.88 and +5.15 to -5.50 for the highest, average and lowest pain domains. For the highest pain intensity domain, agreement between parents and children was excellent. If replicated, this preliminary finding would suggest the highest pain intensity is the easiest domain for reporting pain intensity when a child cannot directly express him or herself. Copyright © 2016 Sociedad Española de Anestesiología, Reanimación y Terapéutica del Dolor. Publicado por Elsevier España, S.L.U. All rights reserved.
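The Bland-Altman limits of agreement reported above can be sketched as the mean paired difference plus or minus 1.96 standard deviations of the differences. The function name and the five child/parent score pairs below are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of (first - second); the 95% limits of agreement are
    bias -/+ 1.96 * SD of the paired differences (sample SD, n - 1).
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical 0-10 pain ratings, not taken from the abstract.
child_scores = [8, 7, 9, 6, 10]
parent_scores = [7, 7, 8, 7, 9]

bias, lower, upper = bland_altman_limits(child_scores, parent_scores)
print(f"bias={bias:.2f}, limits of agreement {lower:.2f} to {upper:.2f}")
```

Read this way, the midpoint of each reported interval is the implied mean difference (about 0.06 for the highest-pain domain, in whichever direction the authors subtracted), and the widening intervals for the average and lowest domains mirror the falling ICCs.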
Test-retest reliability of the Progressive Isoinertial Lifting Evaluation (PILE).
Lygren, Hildegunn; Dragesund, Tove; Joensen, Jón; Ask, Tove; Moe-Nilssen, Rolf
2005-05-01
A repeated measures single group design. To investigate test-retest reliability of Progressive Isoinertial Lifting Evaluation on patients with long lasting musculoskeletal problems related to the lumbar spine. Test-retest reliability has been satisfactory in healthy men. Test-retest reliability for clinical populations has not been reported. A total of 31 patients (17 women and 14 men) with long lasting low back pain participated in the study. The patients were tested twice at an interval of 2 days and at the same time of the day. The heaviest load that the patient could lift 4 times was used as outcome measure. The error of measurement indicates that the true result in 95% of cases will be within +/-4.5 kg from the measured value, while the difference between 2 measurements in 95% of cases will be less than 6.4 kg. Intra-class correlation (1,1) was 0.91. Relative test-retest reliability was high assessed by intra-class correlation, but absolute measurement variability reported as the smallest detectable difference has relevance for the interpretation of clinical test results and should also be considered.
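The two absolute-reliability figures in this abstract are arithmetically linked, which a short calculation makes visible. The sketch assumes the usual conventions (a 95% band of 1.96 × SEM around a single measurement, and a smallest detectable difference of 1.96 × √2 × SEM between two measurements); the abstract does not state the formulas, so treat them as an assumption that happens to reproduce the reported numbers.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% interval (assumed convention)


def sem_from_half_width(half_width_95: float) -> float:
    """Back out the standard error of measurement from a 95% half-width."""
    return half_width_95 / Z95


def smallest_detectable_difference(sem: float) -> float:
    """95% bound on the difference between two measurements: 1.96 * sqrt(2) * SEM."""
    return Z95 * math.sqrt(2.0) * sem


sem = sem_from_half_width(4.5)             # +/-4.5 kg is reported in the abstract
sdd = smallest_detectable_difference(sem)  # compare with the reported 6.4 kg
print(f"SEM = {sem:.2f} kg, SDD = {sdd:.1f} kg")
```

The factor √2 appears because the difference of two measurements carries the error of both, so a change must exceed roughly 6.4 kg before it can be distinguished from measurement noise in an individual patient.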
T2 Mapping of the Sacroiliac Joints With 3-T MRI: A Preliminary Study.
Lefebvre, Guillaume; Bergère, Antonin; Rafei, Mazen El; Duhamel, Alain; Teixeira, Pedro; Cotten, Anne
2017-08-01
The objective of this study was to assess the feasibility of T2 relaxation time measurements of the sacroiliac joints. The sacroiliac joints of 40 patients were imaged by 3-T MRI using an oblique axial multislice multiecho spin-echo T2-weighted sequence. Manual plotting and automatic subdivision of ROIs allowed us to obtain T2 values for up to 48 different areas per patient (posterior and anterior parts, sacral, intermediate, and iliac parts). Intra- and interobserver reproducibility of T2 values were calculated after independent assessment by two musculoskeletal radiologists. A total of 1656 measurement sites could be analyzed. Mean (± SD) T2 values were 40.6 ± 6.7 ms and 41.2 ± 6.3 ms for observer 1 and 39.9 ± 6.6 ms for observer 2. The intraobserver intraclass correlation coefficient was 0.72 (95% CI, 0.70-0.74), and the interobserver intraclass correlation coefficient was 0.71 (95% CI, 0.68-0.72). Our study shows the feasibility of T2 relaxation time measurements at the sacroiliac joints.
Reliability and Validity of the TIMPSI for Infants With Spinal Muscular Atrophy Type I
Krosschell, Kristin J.; Maczulski, Jo Anne; Scott, Charles; King, Wendy; Hartman, Jill T.; Case, Laura E.; Viazzo-Trussell, Donata; Wood, Janine; Roman, Carolyn A.; Hecker, Eva; Meffert, Marianne; Léveillé, Maude; Kienitz, Krista; Swoboda, Kathryn J.
2014-01-01
Purpose: This study examined the reliability and validity of the Test of Infant Motor Performance Screening Items (TIMPSI) in infants with type I spinal muscular atrophy (SMA). Methods: After training, 12 evaluators scored 4 videos of infants with type I SMA to assess interrater reliability. Intrarater and test-retest reliability were further assessed for 9 evaluators during a SMA type I clinical trial, with 9 evaluators testing a total of 38 infants twice. Relatedness of the TIMPSI score to ability to reach and ventilatory support was also examined. Results: Excellent interrater video score reliability was noted (intraclass correlation coefficient, 0.97–0.98). Intrarater reliability was excellent (intraclass correlation coefficient, 0.91–0.98) and test-retest reliability ranged from r = 0.82 to r = 0.95. The TIMPSI score was related to the ability to reach (P ≤ .05). Conclusion: The TIMPSI can reliably be used to assess motor function in infants with type I SMA. In addition, the TIMPSI scores are related to the ability to reach, an important functional skill in children with type I SMA. PMID:23542189
Reliability of reports of childhood trauma in bipolar disorder: A test-retest study over 18 months.
Shannon, Ciaran; Hanna, Donncha; Tumelty, Leo; Waldron, Daniel; Maguire, Chrissie; Mowlds, William; Meenagh, Ciaran; Mulholland, Ciaran
2016-01-01
This study aimed to explore the reliability of self-reported trauma histories in a population with a diagnosis of bipolar disorder using the Childhood Trauma Questionnaire. Previous studies in other populations suggest high reliability of trauma histories over time, and it was postulated that a similar high reliability would be demonstrated in this population. A total of 39 patients with a confirmed diagnosis (Diagnostic and Statistical Manual of Mental Disorders, 4th Edition, criteria) were followed up and readministered the Childhood Trauma Questionnaire after 18 months. Cohen's kappa scores and intraclass correlations suggested reasonable test-retest reliability over the 18-month time period of the study for all types of childhood abuse, namely, emotional, physical, and sexual abuse and physical and emotional neglect. Intraclass correlations ranged from r = .50 (sexual abuse) to r = .96 (physical abuse). Cohen's kappas ranged from .44 (sexual abuse) to .76 (physical abuse). Retrospective reports of childhood trauma can be seen as reliable and are in keeping with results found with other mental health populations.
Milanović, Zoran; Pantelić, Saša; Trajković, Nebojša; Jorgić, Bojan; Sporiš, Goran; Bratić, Milovan
2014-01-01
The purpose of this study was to determine the test-retest reliability of the International Physical Activity Questionnaire (IPAQ) for older adults in Serbia. Six hundred and sixty older adults (352 men, 53%; 308 women, 47%; mean age 67.65±5.76 years) participated in the study. To examine test-retest reliability, the participants were asked to complete the IPAQ on two occasions 2 weeks apart. Moderate reliability was observed between the repeated IPAQ, with intraclass correlation coefficients ranging from 0.53 to 0.91. The least reliability was established in leisure time activity (0.53) and the most reliability in the transport domain (0.91). Men and women had similar intraclass correlation coefficients for total physical activity (0.71 versus 0.74, respectively), while the biggest difference was obtained for housework in men (0.68) and in women (0.90). Our study shows that the long version of the IPAQ is a reliable instrument for assessing physical activity levels in older adults and that it may be useful for generating internationally comparable data.
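For readers who want to see how a test-retest intraclass correlation is obtained from two administrations, here is a minimal sketch. The abstract does not say which ICC form was used; the one-way random-effects, single-measure form, ICC(1,1), is assumed purely for illustration, and the scores are hypothetical.

```python
def icc_1_1(scores):
    """One-way random-effects, single-measure ICC.

    scores: one row per participant, one column per administration
    (every row must have the same number of columns, k >= 2).
    ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW)
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-participant and within-participant mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical activity scores for four participants, two weeks apart.
test_retest = [[10, 12], [20, 18], [30, 31], [40, 39]]
print(round(icc_1_1(test_retest), 3))
```

Because the coefficient compares between-person variance with within-person variance, a domain in which participants differ little from one another can show a low ICC even when retest differences are small in absolute terms.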
Reliability of a questionnaire on substance use among adolescent students, Brazil.
Machado Neto, Adelmo de Souza; Andrade, Tarcisio Matos; Fernandes, Gilênio Borges; Zacharias, Helder Paulo; Carvalho, Fernando Martins; Machado, Ana Paula Souza; Dias, Ana Carmen Costa; Garcia, Ana Carolina Rocha; Santana, Lauro Reis; Rolin, Carlos Eduardo; Sampaio, Cyntia; Ghiraldi, Gisele; Bastos, Francisco Inácio
2010-10-01
To analyze reliability of a self-applied questionnaire on substance use and misuse among adolescent students. Two cross-sectional studies were carried out for the instrument test-retest. The sample comprised male and female students aged 11-19 years from public and private schools (elementary, middle, and high school students) in the city of Salvador, Northeastern Brazil, in 2006. A total of 591 questionnaires were applied in the test and 467 in the retest. Descriptive statistics, the Kappa index, Cronbach's alpha and intraclass correlation were estimated. The prevalence of substance use/misuse was similar in both test and retest. Sociodemographic variables showed a "moderate" to "almost perfect" agreement for the Kappa index, and a "satisfactory" (>0.75) consistency for Cronbach's alpha and intraclass correlation. The age at which psychoactive substances (tobacco, alcohol, and cannabis) were first used and chronological age were similar in both studies. Test-retest reliability was found to be a good indicator of students' age of initiation and their patterns of substance use. The questionnaire reliability was found to be satisfactory in the population studied.
Aly, Sharif S; Zhao, Jianyang; Li, Ben; Jiang, Jiming
2014-01-01
The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data are traditionally transformed so that a linear mixed model (LMM) based ICC can be estimated. A common transformation used is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using the natural logarithm transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate which includes fixed effects using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the ICC based on LMM, even when the negative binomial data were logarithm and square root transformed. A second comparison that targeted a wider range of ICC values showed that the mean of estimated ICC closely approximated the true ICC.
Richman, Jesse; Zangalli, Camila; Lu, Lan; Wizov, Sheryl S; Spaeth, Eric; Spaeth, George L
2015-01-01
(1) To determine the ability of a novel, internet-based contrast sensitivity test titled the Spaeth/Richman Contrast Sensitivity Test (SPARCS) to identify patients with glaucoma. (2) To determine the test-retest reliability of SPARCS. A prospective, cross-sectional study of patients with glaucoma and controls was performed. Subjects were assessed by SPARCS and the Pelli-Robson chart. Reliability of each test was assessed by the intraclass correlation coefficient and the coefficient of repeatability. Sensitivity and specificity for identifying glaucoma was also evaluated. The intraclass correlation coefficient for SPARCS was 0.97 and 0.98 for Pelli-Robson. The coefficient of repeatability for SPARCS was ±6.7% and ±6.4% for Pelli-Robson. SPARCS identified patients with glaucoma with 79% sensitivity and 93% specificity. SPARCS has high test-retest reliability. It is easily accessible via the internet and identifies patients with glaucoma well. NCT01300949. Published by the BMJ Publishing Group Limited.
Translation and validation of a Spanish version of the xerostomia inventory.
Serrano, Carlos; Fariña, María P; Pérez, Cristhian; Fernández, Marcos; Forman, Katherine; Carrasco, Mauricio
2016-12-01
The aim of this study was to validate a Spanish cross-cultural adaptation of the xerostomia inventory (XI). The original English version of XI was translated into Spanish, cross-culturally adapted and field tested. The Spanish version of XI (XI-Sp) was tested with a sample of 41 patients with xerostomia. The reliability of the XI-Sp was determined through internal consistency and test-retest methods. The construct validity of XI-Sp was determined by means of correlation between XI-Sp scores and salivary flow measurements. Overall XI-Sp scores were 40.8 (SD = 10) for the first application and 40.2 (SD = 9.5) for the second. Cronbach's alpha value for the XI-Sp was 0.89 and 0.87, respectively, while interitem correlation averages were r = 0.44 and r = 0.39 for each application. Interitem correlation and corrected total was r_c ≥ 0.30. The test-retest intraclass correlation coefficient value for the XI-Sp score was 0.59 and 0.91. Convergent validity for construct validity correlation with salivary flow showed a medium effect size (r² = 0.10) for the first application but did not make a statistically significant prediction for the second (r² = 0.7). This study provides evidence concerning the reliability of the XI-Sp, showing that it may be a useful tool for Spanish-speaking xerostomia patients for both clinical and epidemiologic research. © 2015 John Wiley & Sons A/S and The Gerodontology Association. Published by John Wiley & Sons Ltd.
Development and Reliability Testing of a Fast-Food Restaurant Observation Form.
Rimkus, Leah; Ohri-Vachaspati, Punam; Powell, Lisa M; Zenk, Shannon N; Quinn, Christopher M; Barker, Dianne C; Pugach, Oksana; Resnick, Elissa A; Chaloupka, Frank J
2015-01-01
To develop a reliable observational data collection instrument to measure characteristics of the fast-food restaurant environment likely to influence consumer behaviors, including product availability, pricing, and promotion. The study used observational data collection. Restaurants were in the Chicago Metropolitan Statistical Area. A total of 131 chain fast-food restaurant outlets were included. Interrater reliability was measured for product availability, pricing, and promotion measures on a fast-food restaurant observational data collection instrument. Analysis was done with Cohen's κ coefficient and proportion of overall agreement for categorical variables and intraclass correlation coefficient (ICC) for continuous variables. Interrater reliability, as measured by average κ coefficient, was .79 for menu characteristics, .84 for kids' menu characteristics, .92 for food availability and sizes, .85 for beverage availability and sizes, .78 for measures on the availability of nutrition information, .75 for characteristics of exterior advertisements, and .62 and .90 for exterior and interior characteristics measures, respectively. For continuous measures, average ICC was .88 for food pricing measures, .83 for beverage prices, and .65 for counts of exterior advertisements. Over 85% of measures demonstrated substantial or almost perfect agreement. Although some measures required revision or protocol clarification, results from this study suggest that the instrument may be used to reliably measure the fast-food restaurant environment.
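The categorical agreement statistic used above, Cohen's κ, corrects the raw proportion of agreement for the agreement expected by chance. The sketch below is a generic two-rater implementation; the function name and the ten yes/no ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e the agreement expected from the raters' marginals.
    Undefined (division by zero) when p_e equals 1.
    """
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical codes: 1 = item present on the menu, 0 = absent.
observer_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
observer_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
print(round(cohens_kappa(observer_1, observer_2), 2))
```

In this toy case the raters agree on 8 of 10 items, but half of that agreement is expected by chance, so κ is noticeably lower than the raw 80% agreement.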
Brosseau, Lucie; Laroche, Chantal; Guitard, Paulette; King, Judy; Poitras, Stéphane; Casimiro, Lynn; Barette, Julie Alexandra; Cardinal, Dominique; Cavallo, Sabrina; Laferrière, Lucie; Martini, Rose; Champoux, Nicholas; Taverne, Jennifer; Paquette, Chanyque; Tremblay, Sébastien; Sutton, Ann; Galipeau, Roseline; Tourigny, Jocelyne; Toupin-April, Karine; Loew, Laurianne; Demers, Catrine; Sauvé-Schenk, Katrine; Paquet, Nicole; Savard, Jacinthe; Lagacé, Josée; Pharand, Denyse; Vaillancourt, Véronique
2017-01-01
Objectives: The primary objective was to produce a French-Canadian translation of AMSTAR (a measurement tool to assess systematic reviews) and to examine the validity of the translation's contents. The secondary and tertiary objectives were to assess the inter-rater reliability and factorial construct validity of this French-Canadian version of AMSTAR. Methods: A modified approach to Vallerand's methodology (1989) for cross-cultural validation was used. First, a parallel back-translation of AMSTAR was performed, by both professionals and future professionals. Next, a first committee of experts (P1) examined the translations to create a first draft of the French-Canadian version of the AMSTAR tool. This draft was then evaluated and modified by a second committee of experts (P2). Following that, 18 future professionals (master's students in physiotherapy) rated this second draft of the instrument for clarity using a seven-point scale (1: very clear; 7: very ambiguous). Lastly, the principal co-investigators reviewed the problematic elements and proposed final changes. Four independent raters used this French-Canadian version of AMSTAR to assess 20 systematic reviews that were published in French after the year 2000. An intraclass correlation coefficient (ICC) and kappa coefficient were calculated to measure the tool's inter-rater reliability. A Cronbach's alpha coefficient was also calculated to measure internal consistency. In addition, factor analysis was used to evaluate construct validity in order to determine the number of dimensions. Results: The statements on the final version of the AMSTAR tool received an average ambiguity rating of between 1.0 and 1.4. No statement received an average rating above 1.4, which indicates a high level of clarity. Inter-rater reliability (n = 4) for the instrument's total score was moderate, with an intraclass correlation coefficient of 0.61 (95% confidence interval [CI]: 0.29, 0.97). Inter-rater reliability for 82% of the individual items was good, according to the kappa values obtained. Internal consistency was excellent, with a Cronbach's alpha coefficient of 0.91 (95% CI: 0.83, 0.99). The French-Canadian version of AMSTAR is a unidimensional tool, as confirmed by factor analysis and communality values greater than 0.30. Conclusion: A valid French-Canadian version of AMSTAR was created using this rigorous five-step process. This version is unidimensional, with moderate inter-rater reliability for the elements overall, and with excellent internal consistency. This tool could be valuable to French-Canadian professionals and researchers, and could also be of interest to the international Francophone community.
İlçin, Nursen; Gürpınar, Barış; Bayraktar, Deniz; Savcı, Sema; Çetin, Pınar; Sarı, İsmail; Akkoç, Nurullah
2016-01-01
[Purpose] This study describes the cultural adaptation, validation, and reliability of the Turkish version of the Pain Catastrophizing Scale in patients with ankylosing spondylitis. [Methods] The validity of the Turkish version of the Pain Catastrophizing Scale was assessed by evaluating data quality (missing data and floor and ceiling effects), principal components analysis, internal consistency (Cronbach’s alpha), and construct validity (Spearman’s rho). Reproducibility analyses included standard measurement error, minimum detectable change, limits of agreement, and intraclass correlation coefficients. [Results] Sixty-four adult patients with ankylosing spondylitis with a mean age of 42.2 years completed the study. Factor analysis revealed that all questionnaire items could be grouped into two factors. Excellent internal consistency was found, with a Cronbach’s alpha value of 0.95. Reliability analyses showed an intraclass correlation coefficient (95% confidence interval) of 0.96 for the total score. There was a low correlation coefficient between the Turkish version of the Pain Catastrophizing Scale and body mass index, pain levels at rest and during activity, health-related quality of life, and fear and avoidance behaviors. [Conclusion] The results of this study indicate that the Turkish version of the Pain Catastrophizing Scale is a valid and reliable clinical and research tool for patients with ankylosing spondylitis. PMID:26957778
Preda, Adrian; Nguyen, Dana D; Bustillo, Juan R; Belger, Aysenil; O'Leary, Daniel S; McEwen, Sarah; Ling, Shichun; Faziola, Lawrence; Mathalon, Daniel H; Ford, Judith M; Potkin, Steven G; van Erp, Theo G M
2018-06-20
To provide quantitative conversions between commonly used scales for the assessment of negative symptoms in schizophrenia. Linear regression analyses generated conversion equations between symptom scores from the Scale for the Assessment of Negative Symptoms (SANS), the Schedule for the Deficit Syndrome (SDS), the Positive and Negative Syndrome Scale (PANSS), or the Negative Symptoms Assessment (NSA) based on a cross sectional sample of 176 individuals with schizophrenia. Intraclass correlations assessed the rating conversion accuracy based on a separate sub-sample of 29 patients who took part in the initial study as well as an independent sample of 28 additional subjects with schizophrenia. Between-scale negative symptom ratings were moderately to highly correlated (r = 0.73-0.91). Intraclass correlations between the original negative symptom rating scores and those obtained via using the conversion equations were in the range of 0.61-0.79. While there is a degree of non-overlap, several negative symptoms scores reflect measures of similar constructs and may be reliably converted between some scales. The conversion equations are provided at http://www.converteasy.org and may be used for meta- and mega-analyses that examine negative symptoms. Copyright © 2018 Elsevier B.V. All rights reserved.
The validity and reliability of an iPhone app for measuring vertical jump performance.
Balsalobre-Fernández, Carlos; Glaister, Mark; Lockey, Richard Anthony
2015-01-01
The purpose of this investigation was to analyse the concurrent validity and reliability of an iPhone app (called My Jump) for measuring vertical jump performance. Twenty recreationally active healthy men (age: 22.1 ± 3.6 years) completed five maximal countermovement jumps, which were evaluated using a force platform (time in the air method) and a specially designed iPhone app. My Jump was developed to calculate the jump height from flight time using the high-speed video recording facility on the iPhone 5s. Jump heights of the 100 jumps measured, for both devices, were compared using the intraclass correlation coefficient, Pearson product moment correlation coefficient (r), Cronbach's alpha (α), coefficient of variation and Bland-Altman plots. There was almost perfect agreement between the force platform and My Jump for the countermovement jump (CMJ) height (intraclass correlation coefficient = 0.997, P < 0.001; Bland-Altman bias = 1.1 ± 0.5 cm, P < 0.001). In comparison with the force platform, My Jump showed good validity for the CMJ height (r = 0.995, P < 0.001). The results of the present study showed that CMJ height can be easily, accurately and reliably evaluated using a specially developed iPhone 5s app.
Teixeira, Juliana Araujo; Baggio, Maria Luiza; Giuliano, Anna R; Fisberg, Regina Mara; Marchioni, Dirce Maria Lobo
2011-07-01
The Natural History of Human Papillomavirus (HPV) Infection in Men: The HIM Study is a prospective multicenter cohort study that, among other factors, analyzes participants' diet. A parallel cross-sectional study was designed to evaluate the validity and reproducibility of the quantitative food frequency questionnaire (QFFQ) used in the Brazilian center from the HIM Study. For this, a convenience subsample of 98 men aged 18 to 70 years from the HIM Study in Brazil answered three 54-item QFFQ and three 24-hour recall interviews, with 6-month intervals between them (data collection January to September 2007). A Bland-Altman analysis indicated that the difference between instruments was dependent on the magnitude of the intake for energy and most nutrients included in the validity analysis, with the exception of carbohydrates, fiber, polyunsaturated fat, vitamin C, and vitamin E. The correlation between the QFFQ and the 24-hour recall for the deattenuated and energy-adjusted data ranged from 0.05 (total fat) to 0.57 (calcium). For the energy and nutrients consumption included in the validity analysis, 33.5% of participants on average were correctly classified into quartiles, and the average value of 0.26 for weighted kappa shows a reasonable agreement. The intraclass correlation coefficients for all nutrients were greater than 0.40 in the reproducibility analysis. The QFFQ demonstrated good reproducibility and acceptable validity. The results support the use of this instrument in the HIM Study. Copyright © 2011 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
How many drinks did you have on September 11, 2001?
Perrine, M W Bud; Schroder, Kerstin E E
2005-07-01
This study tested the predictability of error in retrospective self-reports of alcohol consumption on September 11, 2001, among 80 Vermont light, medium and heavy drinkers. Subjects were 52 men and 28 women participating in daily self-reports of alcohol consumption for a total of 2 years, collected via interactive voice response technology (IVR). In addition, retrospective self-reports of alcohol consumption on September 11, 2001, were collected by telephone interview 4-5 days following the terrorist attacks. Retrospective error was calculated as the difference between the IVR self-report of drinking behavior on September 11 and the retrospective self-report collected by telephone interview. Retrospective error was analyzed as a function of gender and baseline drinking behavior during the 365 days preceding September 11, 2001 (termed "the baseline"). The intraclass correlation (ICC) between daily IVR and retrospective self-reports of alcohol consumption on September 11 was .80. Women provided, on average, more accurate self-reports (ICC = .96) than men (ICC = .72) but displayed more underreporting bias in retrospective responses. Amount and individual variability of alcohol consumption during the 1-year baseline explained, on average, 11% of the variance in overreporting (r = .33), 9% of the variance in underreporting (r = .30) and 25% of the variance in the overall magnitude of error (r = .50), with correlations up to .62 (r2 = .38). The size and direction of error were clearly predictable from the amount and variation in drinking behavior during the 1-year baseline period. The results demonstrate the utility and detail of information that can be derived from daily IVR self-reports in the analysis of retrospective error.
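The percentages of explained variance quoted above follow from squaring the reported correlations. A short check, using only the r values given in the abstract (the function name is illustrative):

```python
def variance_explained(r: float) -> float:
    """Proportion of variance explained by a correlation r (r squared)."""
    return r * r


# Correlations reported in the abstract
for r in (0.33, 0.30, 0.50, 0.62):
    print(f"r = {r:.2f} -> r^2 = {variance_explained(r):.2f}")
```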
Lockie, Robert G; Schultz, Adrian B; Callaghan, Samuel J; Jeffriess, Matthew D; Berry, Simon P
2013-01-01
Field sport coaches must use reliable and valid tests to assess change-of-direction speed in their athletes. Few tests feature linear sprinting with acute change-of-direction maneuvers. The Change-of-Direction and Acceleration Test (CODAT) was designed to assess field sport change-of-direction speed, and includes a linear 5-meter (m) sprint, 45° and 90° cuts, 3-m sprints to the left and right, and a linear 10-m sprint. This study analyzed the reliability and validity of this test, through comparisons to 20-m sprint (0-5, 0-10, 0-20 m intervals) and Illinois agility run (IAR) performance. Eighteen Australian footballers (age = 23.83 ± 7.04 yrs; height = 1.79 ± 0.06 m; mass = 85.36 ± 13.21 kg) were recruited. Following familiarization, subjects completed the 20-m sprint, CODAT, and IAR in 2 sessions, 48 hours apart. Intra-class correlation coefficients (ICC) assessed relative reliability. Absolute reliability was analyzed through paired samples t-tests (p ≤ 0.05) determining between-session differences. Typical error (TE), coefficient of variation (CV), and differences between the TE and smallest worthwhile change (SWC) also assessed absolute reliability and test usefulness. For the validity analysis, Pearson's correlations (p ≤ 0.05) analyzed between-test relationships. Results showed no between-session differences for any test (p = 0.19-0.86). CODAT time averaged ~6 s, and the ICC and CV equaled 0.84 and 3.0%, respectively. The homogeneous sample of Australian footballers meant that the CODAT's TE (0.19 s) exceeded the usual 0.2 × standard deviation (SD) SWC (0.10 s). However, the CODAT is capable of detecting moderate performance changes (SWC calculated as 0.5 × SD = 0.25 s). There was a near perfect correlation between the CODAT and IAR (r = 0.92), and very large correlations with the 20-m sprint (r = 0.75-0.76), suggesting that the CODAT was a valid change-of-direction speed test. Due to movement specificity, the CODAT has value for field sport assessment.
Key points
The change-of-direction and acceleration test (CODAT) was designed specifically for field sport athletes from specific speed research, and data derived from time-motion analyses of sports such as rugby union, soccer, and Australian football. The CODAT features a linear 5-meter (m) sprint, 45° and 90° cuts and 3-m sprints to the left and right, and a linear 10-m sprint.
The CODAT was found to be a reliable change-of-direction speed assessment when considering intra-class correlations between two testing sessions, and the coefficient of variation between trials. A homogeneous sample of Australian footballers resulted in absolute reliability limitations when considering differences between the typical error and smallest worthwhile change. However, the CODAT will detect moderate (0.5 times the test's standard deviation) changes in performance.
The CODAT correlated with the Illinois agility run, highlighting that it does assess change-of-direction speed. There were also significant relationships with short sprint performance (i.e. 0-5 m and 0-10 m), demonstrating that linear acceleration is assessed within the CODAT, without the extended duration and therefore metabolic limitations of the IAR. Indeed, the average duration of the test (~6 seconds) is field sport-specific. Therefore, the CODAT could be used as an assessment of change-of-direction speed in field sport athletes.
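The comparison of typical error (TE) against the smallest worthwhile change (SWC) in the CODAT record can be sketched as follows. The between-subject SD of 0.50 s is not reported directly; it is inferred here from the stated SWC values (0.10 s = 0.2 × SD, 0.25 s = 0.5 × SD), and the function names are illustrative.

```python
def smallest_worthwhile_change(between_subject_sd: float, factor: float = 0.2) -> float:
    """SWC as a fraction of the between-subject SD (0.2 = small, 0.5 = moderate)."""
    return factor * between_subject_sd


def can_detect(typical_error: float, swc: float) -> bool:
    """A test is treated as useful for a given change if its TE does not exceed the SWC."""
    return typical_error <= swc


SD_S = 0.50  # seconds; inferred from the abstract, not reported directly
TE_S = 0.19  # seconds; typical error reported in the abstract

small = smallest_worthwhile_change(SD_S, 0.2)     # 0.10 s
moderate = smallest_worthwhile_change(SD_S, 0.5)  # 0.25 s
print(can_detect(TE_S, small), can_detect(TE_S, moderate))
```

Under these assumptions the TE exceeds the small SWC but not the moderate one, which matches the abstract's conclusion that only moderate changes are detectable.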
Validation of Morphometric Analyses of Small-Intestinal Biopsy Readouts in Celiac Disease
Taavela, Juha; Koskinen, Outi; Huhtala, Heini; Lähdeaho, Marja-Leena; Popp, Alina; Laurila, Kaija; Collin, Pekka; Kaukinen, Katri; Kurppa, Kalle; Mäki, Markku
2013-01-01
Background Assessment of the gluten-induced small-intestinal mucosal injury remains the cornerstone of celiac disease diagnosis. Usually the injury is evaluated using grouped classifications (e.g. Marsh groups), but this is often too imprecise and ignores minor but significant changes in the mucosa. Consequently, there is a need for validated continuous variables in everyday practice and in academic and pharmacological research. Methods We studied the performance of our standard operating procedure (SOP) on 93 selected biopsy specimens from adult celiac disease patients and non-celiac disease controls. The specimens, which comprised different grades of gluten-induced mucosal injury, were evaluated by morphometric measurements. Specimens with tangential cutting resulting from poorly oriented biopsies were included. Two accredited evaluators performed the measurements in blinded fashion. The intraobserver and interobserver variations for villus height and crypt depth ratio (VH:CrD) and densities of intraepithelial lymphocytes (IELs) were analyzed by the Bland-Altman method and intraclass correlation. Results Unevaluable biopsies according to our SOP were correctly identified. The intraobserver analysis of VH:CrD showed a mean difference of 0.087 with limits of agreement from −0.398 to 0.224; the standard deviation (SD) was 0.159. The mean difference in interobserver analysis was 0.070, limits of agreement −0.516 to 0.375, and SD 0.227. The intraclass correlation coefficient in intraobserver variation was 0.983 and that in interobserver variation 0.978. CD3+ IEL density countings in the paraffin-embedded and frozen biopsies showed SDs of 17.1% and 16.5%; the intraclass correlation coefficients were 0.961 and 0.956, respectively. Conclusions Using our SOP, quantitative, reliable and reproducible morphometric results can be obtained on duodenal biopsy specimens with different grades of gluten-induced injury. 
Clinically significant changes were defined according to the error margins (2SD) of the analyses in VH:CrD as 0.4 and in CD3+-stained IELs as 30%. PMID:24146832
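Several records in this listing report Bland-Altman bias and limits of agreement. A minimal sketch of that calculation, assuming the conventional limits of bias ± 1.96 × SD of the paired differences and using made-up readings (not data from any study above):

```python
from statistics import mean, stdev


def bland_altman(first, second, z=1.96):
    """Return (bias, sd, (lower, upper)) for paired measurements.

    bias is the mean of the paired differences (first - second), sd is the
    sample standard deviation of those differences, and the limits of
    agreement are bias +/- z * sd.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, sd, (bias - z * sd, bias + z * sd)


# Hypothetical repeated readings by two observers
reading_1 = [1.0, 2.0, 3.0, 4.0]
reading_2 = [1.5, 2.0, 2.5, 4.5]
bias, sd, (lower, upper) = bland_altman(reading_1, reading_2)
print(f"bias={bias:.3f} sd={sd:.3f} limits=({lower:.3f}, {upper:.3f})")
```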
Besson, Florent L; Henry, Théophraste; Meyer, Céline; Chevance, Virgile; Roblot, Victoire; Blanchet, Elise; Arnould, Victor; Grimon, Gilles; Chekroun, Malika; Mabille, Laurence; Parent, Florence; Seferian, Andrei; Bulifon, Sophie; Montani, David; Humbert, Marc; Chaumet-Riffaud, Philippe; Lebon, Vincent; Durand, Emmanuel
2018-04-03
Purpose To assess the performance of the ITK-SNAP software for fluorodeoxyglucose (FDG) positron emission tomography (PET) segmentation of complex-shaped lung tumors compared with an optimized, expert-based manual reference standard. Materials and Methods Seventy-six FDG PET images of thoracic lesions were retrospectively segmented by using ITK-SNAP software. Each tumor was manually segmented by six raters to generate an optimized reference standard by using the simultaneous truth and performance level estimate algorithm. Four raters segmented 76 FDG PET images of lung tumors twice by using the ITK-SNAP active contour algorithm. Accuracy of the ITK-SNAP procedure was assessed by using Dice coefficient and Hausdorff metric. Interrater and intrarater reliability were estimated by using intraclass correlation coefficients of output volumes. Finally, the ITK-SNAP procedure was compared with currently recommended PET tumor delineation methods on the basis of thresholding at 41% volume of interest (VOI; VOI41) and 50% VOI (VOI50) of the tumor's maximal metabolism intensity. Results Accuracy estimates for the ITK-SNAP procedure indicated a Dice coefficient of 0.83 (95% confidence interval: 0.77, 0.89) and a Hausdorff distance of 12.6 mm (95% confidence interval: 9.82, 15.32). Interrater reliability was an intraclass correlation coefficient of 0.94 (95% confidence interval: 0.91, 0.96). The intrarater reliabilities were intraclass correlation coefficients above 0.97. Finally, VOI41 and VOI50 accuracy metrics were as follows: Dice coefficient, 0.48 (95% confidence interval: 0.44, 0.51) and 0.34 (95% confidence interval: 0.30, 0.38), respectively, and Hausdorff distance, 25.6 mm (95% confidence interval: 21.7, 31.4) and 31.3 mm (95% confidence interval: 26.8, 38.4), respectively. Conclusion ITK-SNAP is accurate and reliable for active-contour-based segmentation of heterogeneous thoracic PET tumors.
ITK-SNAP surpassed the recommended PET methods compared with ground truth manual segmentation. © RSNA, 2018.
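The two accuracy metrics named in the ITK-SNAP record, the Dice coefficient and the Hausdorff distance, can be illustrated on toy data. This is a sketch only: segmentations are represented as sets of hypothetical pixel coordinates, and it does not reproduce ITK-SNAP or the study's pipeline.

```python
from math import dist  # Python 3.8+


def dice_coefficient(a: set, b: set) -> float:
    """Dice overlap 2|A∩B| / (|A| + |B|) between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(a: set, b: set) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst
        return max(min(dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Two hypothetical 2x2 segmentations, shifted by one pixel
seg_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
seg_b = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(dice_coefficient(seg_a, seg_b), hausdorff_distance(seg_a, seg_b))
```

The brute-force nearest-point search is quadratic in the number of points, which is fine for illustration but not for full 3D volumes.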
Terashima, Taiko; Yoshimura, Sadako
2018-03-01
To determine whether nurses can accurately assess the skin colour of replanted fingers displayed as digital images on a computer screen. Colour measurement and clinical diagnostic methods for medical digital images have been studied, but reproducing skin colour on a computer screen remains difficult. The inter-rater reliability of skin colour assessment scores was evaluated. In May 2014, 21 nurses who worked on a trauma ward in Japan participated in testing. Six digital images with different skin colours were used. Colours were scored from both digital images and direct patient's observation. The score from a digital image was defined as the test score, and its difference from the direct assessment score as the difference score. Intraclass correlation coefficients were calculated. Nurses' opinions were classified and summarised. The intraclass correlation coefficients for the test scores were fair. Although the intraclass correlation coefficients for the difference scores were poor, they improved to good when three images that might have contributed to poor reliability were excluded. Most nurses stated that it is difficult to assess skin colour in digital images; they did not think it could be a substitute for direct visual assessment. However, most nurses were in favour of including images in nursing progress notes. Although the inter-rater reliability was fairly high, the reliability of colour reproduction in digital images as indicated by the difference scores was poor. Nevertheless, nurses expect the incorporation of digital images in nursing progress notes to be useful. This gap between the reliability of digital colour reproduction and nurses' expectations towards it must be addressed. High inter-rater reliability for digital images in nursing progress notes was not observed. Assessments of future improvements in colour reproduction technologies are required. Further digitisation and visualisation of nursing records might pose challenges. 
© 2017 John Wiley & Sons Ltd.
Hallegraeff, Joannes M; van der Schans, Cees P; Krijnen, Wim P; de Greef, Mathieu H G
2013-02-01
The eight-item Brief Illness Perception Questionnaire is used as a screening instrument in physical therapy to assess mental defeat in patients with acute low back pain; in addition, patient perception might determine the course and risk of chronic low back pain. However, the psychometric properties of the Brief Illness Perception Questionnaire in common musculoskeletal disorders like acute low back pain have not been adequately studied. Patients' perceptions vary across different populations and affect coping styles. Thus, our aim was to determine the internal consistency, test-retest reliability and validity of the Dutch language version of the Brief Illness Perception Questionnaire in acute non-specific low back pain patients in primary care physical therapy. A non-experimental cross-sectional study with two measurements was performed. Eighty-four acute low back pain patients, in a multidisciplinary health care center in Dutch primary care with a sample mean (SD) age of 42 (12) years, participated in the study. Internal consistency (Cronbach's α) and test-retest procedures (Intraclass Correlation Coefficients and limits of agreement) were evaluated at a one-week interval. The concurrent validity of the Brief Illness Perception Questionnaire was examined by using the Mental Health Component of the Short Form 36 Health Survey. The Cronbach's α for internal consistency was 0.73 (95% CI, 0.67 - 0.83), and the Intraclass Correlation Coefficient for test-retest reliability was acceptable: 0.72 (95% CI, 0.53 - 0.82); however, the limits of agreement were large. The Intraclass Correlation Coefficient measuring concurrent validity was 0.65 (95% CI, 0.46 - 0.80). The Dutch version of the Brief Illness Perception Questionnaire is an appropriate instrument for measuring patients' perceptions in acute low back pain patients, showing acceptable internal consistency and reliability.
Concurrent validity is adequate; however, the instrument may be unsuitable for detecting changes in low back pain perception over time.
Oxygen-weighted Hyperpolarized 3He MR Imaging: A Short-term Reproducibility Study in Human Subjects
Ishii, Masaru; Hamedani, Hooman; Clapp, Justin T.; Kadlecek, Stephen J.; Xin, Yi; Gefter, Warren B.; Rossman, Milton D.
2015-01-01
Purpose To determine whether hyperpolarized helium 3 magnetic resonance (MR) imaging to measure alveolar partial pressure of oxygen (Pao2) shows sufficient test-retest repeatability and between-cohort differences to be used as a reliable technique for detection of alterations in gas exchange in asymptomatic smokers. Materials and Methods The protocol was approved by the local institutional review board and was HIPAA compliant. Informed consent was obtained from all subjects. Two sets of MR images were obtained 10 minutes apart in 25 subjects: 10 nonsmokers (five men, five women; mean ± standard deviation age, 50 years ± 6) and 15 smokers (seven women, eight men; mean age, 50 years ± 8). A mixed-effects model was developed to identify the regional repeatability of Pao2 measurements as an intraclass correlation coefficient. Ten smokers were matched with the 10 nonsmokers on the basis of signal-to-noise ratio (SNR). Three separate models were generated: one for nonsmokers, one for the SNR-matched smokers, and one for the five remaining smokers, who were imaged with a significantly higher SNR. Results Short-term back-to-back regional reproducibility was assessed by using intraclass correlation coefficients, which were 0.67 and 0.65 for SNR case-matched nonsmokers and smokers, respectively. Repeatability was a strong function of SNR; a 50% increase in SNR in the remaining smokers improved the intraclass correlation coefficient to 0.82. Although repeatability was not significantly different between the SNR-matched cohorts (P = .44), the smoker group showed higher spatial and temporal variability in Pao2. Conclusion The short-term test-retest repeatability of hyperpolarized gas MR imaging of regional Pao2 was good. Asymptomatic smokers exhibited greater spatial and temporal variability in Pao2 than did the nonsmokers, which suggests that this parameter allows detection of small functional alterations associated with smoking. 
© RSNA, 2015 Online supplemental material is available for this article. PMID:26110668
Singer, Hannah M; Almazan, Timothy; Craft, Noah; David, Consuelo V; Eells, Samantha; Erfe, Crisel; Lazzaro, Cynthia; Nguyen, Kathy; Preciado, Katy; Tan, Belinda; Patel, Vishal A
2018-02-01
Teledermatology has undergone exponential growth in the past 2 decades. Many technological innovations are becoming available without necessarily undergoing validation studies for specific dermatologic applications. To determine whether patient-taken photographs of acne using Network Oriented Research Assistant (NORA) result in similar lesion counts and Investigator's Global Assessment (IGA) findings compared with in-person examination findings. This pilot reliability study enrolled consecutive patients with acne vulgaris from a single general dermatology practice in Los Angeles, California, who were able to use NORA on an iPhone 6 to take self-photographs. Patients were enrolled from January 1 through March 31, 2016. Each individual underwent in-person and digital evaluation of his or her acne by the same dermatologist. A period of at least 1 week separated the in-person and digital assessments of acne. All participants were trained on how to use NORA on the iPhone 6 and take photographs of their face with the rear-facing camera. Reliability of patient-taken photographs with NORA for acne evaluation compared with in-person examination findings. Acne assessment measures included lesion count (total, inflammatory, noninflammatory, and cystic) and IGA for acne severity. A total of 69 patients (37 male [54%] and 32 female [46%]; mean [SD] age, 22.7 [7.7] years) enrolled in the study. The intraclass correlation coefficients of in-person and photograph-based acne evaluations indicated strong agreement. The intraclass correlation coefficient for total lesion count was 0.81; for the IGA, 0.75. Inflammatory lesion count, noninflammatory lesion count, and cyst count had intraclass correlation coefficients of 0.72, 0.72, and 0.82, respectively. This study found agreement between acne evaluations performed in person and from self-photographs with NORA. 
As a reliable telehealth technology for acne, NORA can be used as a teledermatology platform for dermatology research and can increase access to dermatologic care.
Sandercock, D A; Nute, G R; Hocking, P M
2009-05-01
A multistrain experiment was conducted to quantify the extent of genetic differences in carcass and muscle yields, muscle quality, support organs, and taste panel assessments of cooked breast muscle of 296 birds from 37 lines of commercial broiler, layer, and traditional chickens. The birds were reared as broilers and 4 males from each line were slaughtered at 6 and 10 wk of age. The extent of genetic variation was measured as the intraclass correlation. The intraclass correlation for live weight; carcass yields; breast, drum, and wing portions; and associated muscle yields were high, whereas those for the thigh portion and yield were low. Broilers had more breast and thigh muscle but similar drum muscle as a proportion of carcass weight compared with layer and traditional lines. Genetic variation for muscle quality (plasma creatine kinase activity) was high; that for muscle color (L, a, and b) and hemorrhage score were moderate in size and were greater at 10 than at 6 wk of age. Broiler lines had greater creatine kinase activity indicative of greater muscle pathology; breast muscle was lighter, less red and yellow in color, and had a greater hemorrhage score than muscle from layer and traditional lines, which were similar. Intraclass correlations for taste panel scores were low and generally not significant except for texture, chicken flavor intensity, flavor liking, and overall liking at 6 wk of age. Significantly greater scores from broiler compared with layer and traditional lines for texture, chicken flavor intensity, and overall liking were observed. At 10 wk of age, chicken flavor intensity did not differ between broiler or layer birds but was significantly greater in both groups than traditional birds. Genetic variation for relative weight of abdominal fat, spleen, and heart was moderately high and greater at 10 than at 6 wk of age. Broiler carcasses had a relatively high proportion of abdominal fat and smaller spleen and heart weights.
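When the intraclass correlation is used as a measure of between-group variation, as in the record above, it is commonly estimated from a one-way analysis of variance. The sketch below assumes a balanced design (the same number of observations per group) and uses made-up numbers; the abstract does not state which model the authors fitted.

```python
from statistics import mean


def one_way_icc(groups):
    """Return (single-measure ICC, average-measure ICC) for balanced groups.

    groups is a list of equally long lists, one per class (for example one
    per genetic line or per rated subject). Uses the one-way random effects
    mean squares:
        ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW)
        ICC(k) = (MSB - MSW) / MSB
    """
    n = len(groups)
    k = len(groups[0])
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need at least 2 groups of equal size >= 2")
    grand = mean([x for g in groups for x in g])
    ms_between = k * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n * (k - 1))
    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_average = (ms_between - ms_within) / ms_between
    return icc_single, icc_average


# Hypothetical data: three classes with two observations each
single, average = one_way_icc([[1, 2], [3, 4], [5, 6]])
print(round(single, 3), round(average, 3))
```

The average-measure value is the quantity usually labelled ICC(2) in the one-way random effects setting discussed at the top of this listing.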
Middleton, Michael S; Haufe, William; Hooker, Jonathan; Borga, Magnus; Dahlqvist Leinhard, Olof; Romu, Thobias; Tunón, Patrik; Hamilton, Gavin; Wolfson, Tanya; Gamst, Anthony; Loomba, Rohit; Sirlin, Claude B
2017-05-01
Purpose To determine the repeatability and accuracy of a commercially available magnetic resonance (MR) imaging-based, semiautomated method to quantify abdominal adipose tissue and thigh muscle volume and hepatic proton density fat fraction (PDFF). Materials and Methods This prospective study was institutional review board-approved and HIPAA compliant. All subjects provided written informed consent. Inclusion criteria were age of 18 years or older and willingness to participate. The exclusion criterion was contraindication to MR imaging. Three-dimensional T1-weighted dual-echo body-coil images were acquired three times. Source images were reconstructed to generate water and calibrated fat images. Abdominal adipose tissue and thigh muscle were segmented, and their volumes were estimated by using a semiautomated method and, as a reference standard, a manual method. Hepatic PDFF was estimated by using a confounder-corrected chemical shift-encoded MR imaging method with hybrid complex-magnitude reconstruction and, as a reference standard, MR spectroscopy. Tissue volume and hepatic PDFF intra- and interexamination repeatability were assessed by using intraclass correlation and coefficient of variation analysis. Tissue volume and hepatic PDFF accuracy were assessed by means of linear regression with the respective reference standards. Results Adipose and thigh muscle tissue volumes of 20 subjects (18 women; age range, 25-76 years; body mass index range, 19.3-43.9 kg/m²) were estimated by using the semiautomated method. Intra- and interexamination intraclass correlation coefficients were 0.996-0.998 and coefficients of variation were 1.5%-3.6%. For hepatic MR imaging PDFF, intra- and interexamination intraclass correlation coefficients were greater than or equal to 0.994 and coefficients of variation were less than or equal to 7.3%.
In the regression analyses of manual versus semiautomated volume and spectroscopy versus MR imaging PDFF, slopes and intercepts were close to the identity line, and coefficients of determination at multivariate analysis (R²) ranged from 0.744 to 0.994. Conclusion This MR imaging-based, semiautomated method provides high repeatability and accuracy for estimating abdominal adipose tissue and thigh muscle volumes and hepatic PDFF. © RSNA, 2017.
The Edematous and Erythematous Airway Does Not Denote Pathologic Gastroesophageal Reflux.
Rosen, Rachel; Mitchell, Paul D; Amirault, Janine; Amin, Manali; Watters, Karen; Rahbar, Reza
2017-04-01
To determine if the reflux finding score (RFS), a validated score for airway inflammation, correlates with gastroesophageal reflux measured by multichannel intraluminal impedance (MII) testing, endoscopy, and quality of life scores. We performed a prospective, cross-sectional cohort study of 77 children with chronic cough undergoing direct laryngoscopy and bronchoscopy, esophagogastroduodenoscopy, and MII testing with pH (pH-MII) between 2006 and 2011. Airway examinations were videotaped and reviewed by 3 blinded otolaryngologists each of whom assigned RFS to the airways. RFS were compared with the results of reflux testing (endoscopy, MII, symptom scores). An intraclass correlation coefficient was calculated for the degree of agreement between otolaryngologists' RFS. Receiver operating characteristic curves were created to determine the sensitivity of the RFS. Spearman correlation was calculated between the RFS and reflux measurements by pH-MII. The mean ± SD RFS was 12 ± 4. There was no correlation between pH-MII variables and mean RFS (|r| < 0.15). The concordance correlation coefficient for RFS between otolaryngologists was low (intraclass correlation coefficient = 0.32). Using pH-metry as a gold standard, the positive predictive value for the RFS was 29%. Using MII as the gold standard, the positive predictive value for the RFS was 40%. There was no difference in the mean RFS in patients with (12 ± 4) and without (12 ± 3) esophagitis (P = .9). There was no correlation between RFS and quality of life scores (|r| < 0.15, P > .3). The RFS cannot predict pathologic gastroesophageal reflux and an airway examination should not be used as a basis for prescribing gastroesophageal reflux therapies. Copyright © 2016. Published by Elsevier Inc.
Ara, Mirian; Pajarin, Ana B.
2015-01-01
Objective. To assess the intrasession repeatability and intersession reproducibility of peripapillary retinal nerve fiber layer (RNFL) thickness parameters measured by scanning laser polarimetry (SLP) with enhanced corneal compensation (ECC) in healthy and glaucomatous eyes. Methods. One randomly selected eye of 82 healthy individuals and 60 glaucoma subjects was evaluated. Three scans were acquired during the first visit to evaluate intravisit repeatability. A different operator obtained two additional scans within 2 months after the first session to determine intervisit reproducibility. The intraclass correlation coefficient (ICC), coefficient of variation (COV), and test-retest variability (TRT) were calculated for all SLP parameters in both groups. Results. ICCs ranged from 0.920 to 0.982 for intravisit measurements and from 0.910 to 0.978 for intervisit measurements. The temporal-superior-nasal-inferior-temporal (TSNIT) average was the highest (0.967 and 0.946) in normal eyes, while nerve fiber indicator (NFI; 0.982) and inferior average (0.978) yielded the best ICC in glaucomatous eyes for intravisit and intervisit measurements, respectively. All COVs were under 10% in both groups, except NFI. TSNIT average had the lowest COV (2.43%) in either type of measurement. Intervisit TRT ranged from 6.48 to 12.84. Conclusions. The reproducibility of peripapillary RNFL measurements obtained with SLP-ECC was excellent, indicating that SLP-ECC is sufficiently accurate for monitoring glaucoma progression. PMID:26185762
Charles, James
2016-09-02
In clinical and research settings, ankle joint dorsiflexion needs to be reliably measured. Dorsiflexion is often measured by goniometry, but the intrarater and interrater reliability of this technique have been reported to be poor. Many devices to measure dorsiflexion have been developed for clinical and research use. An evaluation of 12 current tools showed that none met all of the desirable criteria. The purpose of this study was to design and develop a device that rates highly in all of the criteria and that can be proved to be highly reliable. While supine on a treatment table, 14 participants had a foot placed in the Charles device and ankle joint dorsiflexion measured and recorded three times with a digital inclinometer. The mean of the three readings was determined to be the ankle joint dorsiflexion. The analysis used was intraclass correlation coefficient (ICC). There was very little difference in ICC single or average measures between left and right feet, so data were pooled (N = 28). The single-measure ICC was 0.998 (95% confidence interval, 0.996-0.998). The average-measure ICC was 0.998 (95% confidence interval, 0.995-0.999). Limits of agreement for the average measure were also very good: -1.30° to 1.65°. The Charles device meets all of the desirable criteria and has many innovative features, increasing its appropriateness for clinical and research applications. It has a suitable design for measuring dorsiflexion and high intrarater and interrater reliability.
Optical implementation of neocognitron and its applications to radar signature discrimination
NASA Technical Reports Server (NTRS)
Chao, Tien-Hsin; Stoner, William W.
1991-01-01
A feature-extraction-based optoelectronic neural network is introduced. The system implementation approach applies the principle of the neocognitron paradigm first introduced by Fukushima et al. (1983). A multichannel correlator is used as a building block of a generic single layer of the neocognitron for shift-invariant feature correlation. Multilayer processing is achieved by iteratively feeding back the output of the feature correlator to the input spatial light modulator. Successful pattern recognition with intraclass fault tolerance and interclass discrimination is achieved using this optoelectronic neocognitron. Detailed system analysis is described. Experimental demonstration of radar signature processing is also provided.
Salazar-Gutiérrez, María Luisa; Ochoa-Ponce, Cristina; Lona-Reyes, Juan Carlos; Gutiérrez-Íñiguez, Sara Ivonne
Reference methods for the quantification of the glomerular filtration rate (GFR) are difficult to use in clinical practice; formulas for evaluating GFR based on serum creatinine (SCr) and/or creatinine clearance are used instead. The aim of this study was to quantify the correlation and concordance of GFR measured by creatinine clearance in 24-hour urine (GFR24) with the Schwartz and updated Schwartz formulas. This was a cross-sectional study involving pediatric patients, healthy and with chronic kidney disease (CKD), aged 5 to 16.9 years. Linear correlation between GFR24 and the two formulas was evaluated with the Pearson correlation coefficient (r), and concordance with the intraclass correlation coefficient (ICC). We studied 134 patients, of whom 59.7% were male. Mean age was 10.8 years. The average GFR24 was 140.34 ml/min/1.73 m²; 34.3% (n=46) had GFR <90 ml/min/1.73 m². Moderate linear correlation between GFR24 and the Schwartz (r = 0.63) and updated Schwartz (r = 0.65) formulas was observed. There was good concordance between the GFR24 and the Schwartz (ICC = 0.77) and updated Schwartz (ICC = 0.77) formulas. In patients with GFR24 ≥ 90 ml/min/1.73 m², the classical Schwartz formula estimated higher values, while the updated Schwartz formula underestimated values. There is moderate correlation and good concordance between the GFR24 and the Schwartz and updated Schwartz formulas. The concordance was better in patients with obesity and lower in women, patients with hyperfiltration and patients with normal weight. Copyright © 2016 Hospital Infantil de México Federico Gómez. Published by Masson Doyma México S.A. All rights reserved.
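The abstract above does not give the two formulas it compares. As a rough sketch, both are usually written as eGFR = k × height / SCr, with height in cm and serum creatinine in mg/dL. The constants used here (k = 0.55 for the classical formula in children, k = 0.413 for the updated bedside formula) are commonly cited values and are an assumption, not taken from this study; the classical constant in particular varies with age and sex.

```python
K_CLASSICAL = 0.55  # assumed constant for children (varies by age and sex)
K_UPDATED = 0.413   # assumed constant for the updated "bedside" formula


def schwartz_egfr(height_cm: float, scr_mg_dl: float, k: float) -> float:
    """Estimated GFR in ml/min/1.73 m^2 as k * height / serum creatinine."""
    if height_cm <= 0 or scr_mg_dl <= 0:
        raise ValueError("height and serum creatinine must be positive")
    return k * height_cm / scr_mg_dl


# Hypothetical child: 140 cm tall, serum creatinine 0.7 mg/dL
classical = schwartz_egfr(140, 0.7, K_CLASSICAL)
updated = schwartz_egfr(140, 0.7, K_UPDATED)
print(round(classical, 1), round(updated, 1))
```

With the same inputs the classical constant always yields the larger estimate, which is in line with the direction of the differences the abstract reports.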
Fang, Danqi; Tang, Fang Yao; Huang, Haifan; Cheung, Carol Y; Chen, Haoyu
2018-05-29
To investigate the repeatability, interocular correlation and agreement of quantitative swept-source optical coherence tomography angiography (SS-OCTA) metrics in healthy subjects. Thirty-three healthy normal subjects were enrolled. The macula was scanned four times by an SS-OCTA system using the 3 mm×3 mm mode. The superficial capillary map images were analysed using a MATLAB program. A series of parameters were measured: foveal avascular zone (FAZ) area, FAZ perimeter, FAZ circularity, parafoveal vessel density, fractal dimension and vessel diameter index (VDI). The repeatability of four scans was determined by intraclass correlation coefficient (ICC). Then the averaged results were analysed for intereye difference, correlation and agreement using paired t-test, Pearson's correlation coefficient (r), ICC and Bland-Altman plot. The repeatability assessment of the macular metrics yielded high ICC values (ranging from 0.853 to 0.996). There was no statistically significant difference in the OCTA metrics between the two eyes. FAZ area (ICC=0.961, r=0.929) and FAZ perimeter (ICC=0.884, r=0.802) showed excellent binocular correlation. Fractal dimension (ICC=0.732, r=0.578) and VDI (ICC=0.707, r=0.547) showed moderate binocular correlation, while parafoveal vessel density had poor binocular correlation. Bland-Altman plots showed the range of agreement was from -0.0763 to 0.0954 mm² for FAZ area and from -0.0491 to 0.1136 for parafoveal vessel density. The macular metrics obtained using SS-OCTA showed excellent repeatability in healthy subjects. We showed high intereye correlation in FAZ area and perimeter, moderate correlation in fractal dimension and VDI, while vessel density had poor correlation in normal healthy subjects. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Estimating Glenoid Width for Instability-Related Bone Loss: A CT Evaluation of an MRI Formula.
Giles, Joshua W; Owens, Brett D; Athwal, George S
2015-07-01
Determining the magnitude of glenoid bone loss in cases of shoulder instability is an important step in selecting the optimal reconstructive procedure. Recently, a formula has been proposed that estimates native glenoid width based on magnetic resonance imaging (MRI) measurements of height (1/3 × glenoid height + 15 mm). This technique, however, has not been validated for use with computed tomography (CT), which is often the preferred imaging modality to assess bone deficiencies. The purpose of this project was 2-fold: (1) to determine if the MRI-based formula that predicts glenoid width from height is valid with CT and (2) to determine if a more accurate regression can be resolved for use specifically with CT data. Descriptive laboratory study. Ninety normal shoulder CT scans with preserved osseous anatomy were drawn from an existing database and analyzed. Measurements of glenoid height and width were performed by 2 observers on reconstructed 3-dimensional models. After assessment of reliability, the data were correlated, and regression models were created for male and female shoulders. The accuracy of the MRI-based model's predictions was then compared with that of the CT-based models. Intra- and interrater reliabilities were good to excellent for height and width, with intraclass correlation coefficients of 0.765 to 0.992. The height and width values had a strong correlation of 0.900 (P < .001). Regression analyses for male and female shoulders produced CT-specific formulas: for men, glenoid width = 2/3 × glenoid height + 5 mm; for women, glenoid width = 2/3 × glenoid height + 3 mm. Comparison of predictions from the MRI- and CT-specific formulas demonstrated good agreement (intraclass correlation coefficient = 0.818). The CT-specific formulas produced a root mean squared error of 1.2 mm, whereas application of the MRI-specific formula to CT images resulted in a root mean squared error of 1.5 mm. 
Use of the MRI-based formula on CT scans to predict glenoid width produced estimates that were nearly as accurate as the CT-specific formulas. The CT-specific formulas, however, are more accurate at predicting native glenoid width when applied to CT data. Imaging-specific (CT and MRI) formulas have been developed to estimate glenoid bone loss in patients with instability. The CT-specific formula can accurately predict native glenoid width, having an error of only 2.2% of average glenoid width. © 2015 The Author(s).
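The height-to-width formulas quoted in this abstract can be written out directly. The sketch below encodes the MRI-based formula (1/3 × height + 15 mm) and the two CT-specific formulas (2/3 × height + 5 mm for men, + 3 mm for women) exactly as stated. The function names, the example height, and the `percent_bone_loss` helper are hypothetical; in particular, expressing bone loss as a percentage of expected native width is an assumption and is not defined in the abstract.

```python
def width_from_height_mri(height_mm):
    """MRI-based estimate quoted in the abstract: 1/3 x height + 15 mm."""
    return height_mm / 3.0 + 15.0


def width_from_height_ct(height_mm, sex):
    """CT-specific estimates quoted in the abstract: 2/3 x height + 5 mm (men) or + 3 mm (women)."""
    offsets = {"male": 5.0, "female": 3.0}
    if sex not in offsets:
        raise ValueError("sex must be 'male' or 'female'")
    return 2.0 * height_mm / 3.0 + offsets[sex]


def percent_bone_loss(expected_width_mm, measured_width_mm):
    """Assumed definition (not from the abstract): deficit relative to expected native width."""
    return 100.0 * (expected_width_mm - measured_width_mm) / expected_width_mm


height = 36.0  # hypothetical glenoid height in mm
native = width_from_height_ct(height, "male")
loss = percent_bone_loss(native, 24.0)  # 24.0 mm is a hypothetical measured width
print(f"expected width {native:.1f} mm, estimated loss {loss:.1f}%")
```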
ERIC Educational Resources Information Center
Westine, Carl D.
2016-01-01
Little is known empirically about intraclass correlations (ICCs) for multisite cluster randomized trial (MSCRT) designs, particularly in science education. In this study, ICCs suitable for science achievement studies using a three-level (students in schools in districts) MSCRT design that block on district are estimated and examined. Estimates of…
ERIC Educational Resources Information Center
Can, Seda; van de Schoot, Rens; Hox, Joop
2015-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation…
Dealing with Dependence (Part I): Understanding the Effects of Clustered Data
ERIC Educational Resources Information Center
McCoach, D. Betsy; Adelson, Jill L.
2010-01-01
This article provides a conceptual introduction to the issues surrounding the analysis of clustered (nested) data. We define the intraclass correlation coefficient (ICC) and the design effect, and we explain their effect on the standard error. When the ICC is greater than 0, then the design effect is greater than 1. In such a scenario, the…
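The link between the ICC and the design effect stated in this abstract can be illustrated numerically. The sketch below assumes the standard equal-cluster-size formula, design effect = 1 + (m − 1) × ICC, which the truncated abstract does not itself spell out. All function names and the example numbers are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Design effect for equal cluster sizes: 1 + (m - 1) * ICC (assumed standard formula)."""
    return 1.0 + (cluster_size - 1) * icc


def se_inflation(cluster_size, icc):
    """Factor by which a naive standard error is understated: sqrt(design effect)."""
    return math.sqrt(design_effect(cluster_size, icc))


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical example: 500 students in classes of 25 with ICC = 0.10.
deff = design_effect(25, 0.10)
n_eff = effective_sample_size(500, 25, 0.10)
print(f"design effect {deff:.2f}, effective n {n_eff:.0f}")
```

With an ICC of 0 the design effect is exactly 1, and any positive ICC pushes it above 1, which matches the statement in the abstract.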
Adolescent Alcohol Use Self-Report Stability: A Decade of Panel Study Data
ERIC Educational Resources Information Center
Shillington, Audrey M.; Clapp, John D.; Reed, Mark B.; Woodruff, Susan I.
2011-01-01
This study analyzed six waves of panel data from the National Longitudinal Survey of Youth (NLSY). These analyses were conducted to test the stability of self-reported lifetime use and age of onset. Intraclass correlation coefficients (ICCs) indicated that the stability of age of onset reports decreased with longer time frames between follow-ups.…
Child and Informant Influences on Behavioral Ratings of Preschool Children
ERIC Educational Resources Information Center
Phillips, Beth M.; Lonigan, Christopher J.
2010-01-01
This study investigated relationships among teacher, parent, and observer behavioral ratings of 3- and 4-year-old children using intra-class correlations and analysis of variance. Comparisons within and across children from middle-income (MI; N = 166; mean age = 54.25 months, standard deviation [SD] = 8.74) and low-income (LI; N = 199; mean age =…
Reliability of Leg and Vertical Stiffness During High Speed Treadmill Running.
Pappas, Panagiotis; Dallas, Giorgos; Paradisis, Giorgos
2017-04-01
In research, the accurate and reliable measurement of leg and vertical stiffness could contribute to valid interpretations. The current study aimed at determining the intraparticipant variability (ie, intraday and interday reliabilities) of leg and vertical stiffness, as well as related parameters, during high speed treadmill running, using the "sine-wave" method. Thirty-one males ran on a treadmill at 6.67 m∙s⁻¹, and the contact and flight times were measured. To determine the intraday reliability, three 10-s running bouts with 10-min recovery were performed. In addition, to examine the interday reliability, three 10-s running bouts on 3 separate days with 48-h interbout intervals were performed. The reliability statistics included repeated-measure analysis of variance, average intertrial correlations, intraclass correlation coefficients (ICCs), Cronbach's α reliability coefficient, and the coefficient of variation (CV%). Both intraday and interday reliabilities were high for leg and vertical stiffness (ICC > 0.939 and CV < 4.3%), as well as related variables (ICC > 0.934 and CV < 3.9%). It was thus inferred that the measurements of leg and vertical stiffness, as well as the related parameters obtained using the "sine-wave" method during treadmill running at 6.67 m∙s⁻¹, were highly reliable, both within and across days.
Athanasiadou, Elpiniki; Kyrkou, Charikleia; Fotiou, Maria; Tsakoumaki, Foteini; Dimitropoulou, Aristea; Polychroniadou, Eleni; Menexes, Georgios; Athanasiadis, Apostolos P.; Biliaderis, Costas G.; Michaelidou, Alexandra-Maria
2016-01-01
The objectives were to develop a Mediterranean oriented semi-quantitative food frequency questionnaire (FFQ) and evaluate its validity in measuring energy and nutrient intakes. For FFQ development, the main challenge was to merge food items and practices reflecting cultural Mediterranean preferences with other food choices ensuing from diet transition to more westernized dietary patterns. FFQ validity was evaluated by comparing nutrient intakes against the average of two 24-h dietary recalls for 179 pregnant women. Although the mean intake values for most nutrients and energy tended to be higher when determined by the FFQ, the Cohen’s d was below 0.3. Bland-Altman plots confirmed the agreement between the two methods. Positive significant correlations ranged from 0.35 to 0.77. The proportion of women classified correctly was between 73.2% and 92.2%, whereas gross misclassification was low. Weighted kappa values were between 0.31 and 0.78, while intraclass correlation coefficients were between 0.49 and 0.89. Our methodological approach for the development and validation of this FFQ provides reliable measurements of energy, macro- and micronutrient intakes. Overall, our culture-specific FFQ could serve as a useful assessment tool in studies aiming at monitoring dietary intakes, especially in the Mediterranean region, where countries share common cultural dietary habits. PMID:27571097
Carreau, Joseph H; Bastrom, Tracey; Petcharaporn, Maty; Schulte, Caitlin; Marks, Michelle; Illés, Tamás; Somoskeöy, Szabolcs; Newton, Peter O
2014-03-01
Reproducibility study of SterEOS 3-dimensional (3D) software in large, idiopathic scoliosis (IS) spinal curves. To determine the accuracy and reproducibility of various 3D, software-generated radiographic measurements acquired from a 2-dimensional (2D) imaging system. SterEOS software allows a user to reconstruct a 3D spinal model from an upright, biplanar, low-dose, X-ray system. The validity and internal consistency of this system have not been tested in large IS curves. EOS images from 30 IS patients with curves greater than 50° were collected for analysis. Three observers blinded to the study protocol conducted repeated, randomized, manual 2D measurements, and 3D software generated measurements from biplanar images acquired from an EOS Imaging system. Three-dimensional measurements were repeated using both the Full 3D and Fast 3D guided processes. A total of 180 (120 3D and 60 2D) sets of measurements were obtained of coronal (Cobb angle) and sagittal (T1-T12 and T4-T12 kyphosis; L1-S1 and L1-L5; and pelvic tilt, pelvic incidence, and sacral slope) parameters. Intra-class correlation coefficients were compared, as were the calculated differences in values generated by SterEOS 3D software and manual 2D measurements. The 95% confidence intervals of the mean differences in measures were calculated as an estimate of reproducibility. Average intra-class correlation coefficients were excellent: 0.97, 0.97, and 0.93 for Full 3D, Fast 3D, and 2D measures, respectively (p = .11). Measurement errors for some sagittal measures were significantly lower with the 3D techniques. Both the Full 3D and Fast 3D techniques provided consistent measurements of axial plane vertebral rotation. SterEOS 3D reconstruction spine software creates reproducible measurements in all 3 planes of deformity in curves greater than 50°. Advancements in 3D scoliosis imaging are expected to improve our understanding and treatment of idiopathic scoliosis. Copyright © 2014 Scoliosis Research Society. 
Published by Elsevier Inc. All rights reserved.
Jaeschke, Lina; Steinbrecher, Astrid; Jeran, Stephanie; Konigorski, Stefan; Pischon, Tobias
2018-04-20
24 h-accelerometry is now used to objectively assess physical activity (PA) in many observational studies like the German National Cohort; however, PA variability, observational time needed to estimate habitual PA, and reliability are unclear. We assessed 24 h-PA of 50 participants using triaxial accelerometers (ActiGraph GT3X+) over 2 weeks. Variability of overall PA and different PA intensities (time in inactivity and in low intensity, moderate, vigorous, and very vigorous PA) between days of assessment or days of the week was quantified using linear mixed-effects and random effects models. We calculated the required number of days to estimate PA, and calculated PA reliability using intraclass correlation coefficients. Between- and within-person variance accounted for 34.4-45.5% and 54.5-65.6%, respectively, of total variance in overall PA and PA intensities over the 2 weeks. Overall PA and times in low intensity, moderate, and vigorous PA decreased slightly over the first 3 days of assessment. Overall PA (p = 0.03), time in inactivity (p = 0.003), in low intensity PA (p = 0.001), in moderate PA (p = 0.02), and in vigorous PA (p = 0.04) slightly differed between days of the week, being highest on Wednesday and Friday and lowest on Sunday and Monday, with apparent differences between Saturday and Sunday. In nested random models, the day of the week accounted for < 19% of total variance in the PA parameters. On average, the required number of days to estimate habitual PA was around 1 week, being 7 for overall PA and ranging from 6 to 9 for the PA intensities. Week-to-week reliability was good (intraclass correlation coefficients, range, 0.68-0.82). Individual PA, as assessed using 24 h-accelerometry, is highly variable between days, but the day of assessment or the day of the week explain only small parts of this variance. Our data indicate that 1 week of assessment is necessary for reliable estimation of habitual PA.
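The variance shares reported in this abstract can be related to reliability with a short calculation. The sketch below assumes that the single-day ICC equals the between-person share of total variance and that the Spearman-Brown formula describes the reliability of a multi-day mean. The 0.80 target and all function names are hypothetical. The abstract does not state how its own day counts were derived, so this sketch is not expected to reproduce the reported 6 to 9 days.

```python
import math


def single_day_icc(between_var, within_var):
    """ICC for one day of measurement: between-person share of total variance."""
    return between_var / (between_var + within_var)


def reliability_of_mean(icc_single, n_days):
    """Spearman-Brown reliability of the mean over n_days."""
    return n_days * icc_single / (1.0 + (n_days - 1) * icc_single)


def days_needed(icc_single, target):
    """Smallest whole number of days whose mean reaches the target reliability."""
    return math.ceil(target / (1.0 - target) * (1.0 - icc_single) / icc_single)


# Variance shares taken from the abstract (34.4% and 45.5% between-person).
for between, within in [(34.4, 65.6), (45.5, 54.5)]:
    icc = single_day_icc(between, within)
    print(f"ICC {icc:.3f}: {days_needed(icc, 0.80)} days for reliability 0.80")
```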
2011-01-01
Background Antibiotic consumption in hospitals is commonly measured using the accumulated amount of drugs delivered from the pharmacy to ward-held stocks. The reliability of this method, particularly the impact of the length of the registration periods, has not been evaluated, and this evaluation was the aim of the study. Methods During 26 weeks, we performed a weekly ward stock count of use of broad-spectrum antibiotics - that is second- and third-generation cephalosporins, carbapenems, and quinolones - in five hospital wards and compared the data with corresponding pharmacy sales figures during the same period. Defined daily doses (DDDs) for antibiotics were used as measurement units (WHO ATC/DDD classification). Consumption figures obtained with the two methods for different registration intervals were compared by use of intraclass correlation analysis and Bland-Altman statistics. Results Broad-spectrum antibiotics accounted for a quarter to one-fifth of all systemic antibiotics (ATC group J01) used in the hospital and varied between wards, from 12.8 DDDs per 100 bed days in a urological ward to 24.5 DDDs in a pulmonary diseases ward. For the entire study period of 26 weeks, the pharmacy and ward defined daily doses figures for all broad-spectrum antibiotics differed only by 0.2%; however, for single wards deviations varied from -4.3% to 6.9%. The intraclass correlation coefficient, pharmacy versus ward data, increased from 0.78 to 0.94 for parenteral broad-spectrum antibiotics with increasing registration periods (1-4 weeks), whereas the corresponding figures for oral broad-spectrum antibiotics (ciprofloxacin) were from 0.46 to 0.74. For all broad-spectrum antibiotics and for parenteral antibiotics, limits of agreement between the two methods showed, according to Bland-Altman statistics, a deviation of ± 5% or less from average mean DDDs at 3- and 4-week registration intervals. Corresponding deviation for oral antibiotics was ± 21% at a 4-week interval. 
Conclusions There is a need for caution in interpreting pharmacy sales data aggregated over short registration intervals, especially so for oral formulations. Even a one-month registration period may be too short. PMID:22166018
Haug, Jon B; Myhr, Randi; Reikvam, Asmund
2011-12-13
Antibiotic consumption in hospitals is commonly measured using the accumulated amount of drugs delivered from the pharmacy to ward-held stocks. The reliability of this method, particularly the impact of the length of the registration periods, has not been evaluated, and this evaluation was the aim of the study. During 26 weeks, we performed a weekly ward stock count of use of broad-spectrum antibiotics--that is second- and third-generation cephalosporins, carbapenems, and quinolones--in five hospital wards and compared the data with corresponding pharmacy sales figures during the same period. Defined daily doses (DDDs) for antibiotics were used as measurement units (WHO ATC/DDD classification). Consumption figures obtained with the two methods for different registration intervals were compared by use of intraclass correlation analysis and Bland-Altman statistics. Broad-spectrum antibiotics accounted for a quarter to one-fifth of all systemic antibiotics (ATC group J01) used in the hospital and varied between wards, from 12.8 DDDs per 100 bed days in a urological ward to 24.5 DDDs in a pulmonary diseases ward. For the entire study period of 26 weeks, the pharmacy and ward defined daily doses figures for all broad-spectrum antibiotics differed only by 0.2%; however, for single wards deviations varied from -4.3% to 6.9%. The intraclass correlation coefficient, pharmacy versus ward data, increased from 0.78 to 0.94 for parenteral broad-spectrum antibiotics with increasing registration periods (1-4 weeks), whereas the corresponding figures for oral broad-spectrum antibiotics (ciprofloxacin) were from 0.46 to 0.74. For all broad-spectrum antibiotics and for parenteral antibiotics, limits of agreement between the two methods showed, according to Bland-Altman statistics, a deviation of ± 5% or less from average mean DDDs at 3- and 4-week registration intervals. Corresponding deviation for oral antibiotics was ± 21% at a 4-week interval. 
There is a need for caution in interpreting pharmacy sales data aggregated over short registration intervals, especially so for oral formulations. Even a one-month registration period may be too short.
Koo, Terry K; Cohen, Jeffrey H; Zheng, Yongping
2011-11-01
Soft tissue exhibits nonlinear stress-strain behavior under compression. Characterizing its nonlinear elasticity may aid detection, diagnosis, and treatment of soft tissue abnormality. The purposes of this study were to develop a rate-controlled Mechano-Acoustic Indentor System and a corresponding finite element optimization method to extract nonlinear elastic parameters of soft tissue and evaluate its test-retest reliability. An indentor system using a linear actuator to drive a force-sensitive probe with a tip-mounted ultrasound transducer was developed. Twenty independent sites at the upper lateral quadrant of the buttock from 11 asymptomatic subjects (7 men and 4 women from a chiropractic college) were indented at 6% per second for 3 sessions, each consisting of 5 trials. Tissue thickness, force at 25% deformation, and area under the load-deformation curve from 0% to 25% deformation were calculated. Optimized hyperelastic parameters of the soft tissue were calculated with a finite element model using a first-order Ogden material model. Load-deformation response on a standardized block was then simulated, and the corresponding area and force parameters were calculated. Between-trials repeatability and test-retest reliability of each parameter were evaluated using coefficients of variation and intraclass correlation coefficients, respectively. Load-deformation responses were highly reproducible under repeated measurements. Coefficients of variation of tissue thickness, area under the load-deformation curve from 0% to 25% deformation, and force at 25% deformation averaged 0.51%, 2.31%, and 2.23%, respectively. Intraclass correlation coefficients ranged between 0.959 and 0.999, indicating excellent test-retest reliability. The automated Mechano-Acoustic Indentor System and its corresponding optimization technique offers a viable technology to make in vivo measurement of the nonlinear elastic properties of soft tissue. 
This technology showed excellent between-trials repeatability and test-retest reliability with potential to quantify the effects of a wide variety of manual therapy techniques on the soft tissue elastic properties. Copyright © 2011 National University of Health Sciences. Published by Mosby, Inc. All rights reserved.
[Validating the Spanish version of the Nursing Activities Score].
Sánchez-Sánchez, M M; Arias-Rivera, S; Fraile-Gamo, M P; Thuissard-Vasallo, I J; Frutos-Vivar, F
2015-01-01
Validating workload scores ensures that they are appropriate for the purpose for which they were developed. To validate the Nursing Activities Score (NAS) Spanish version. Observational and prospective study. 1,045 patients who were admitted to a medical-surgical unit and a serious burns unit in 2006 were included. The nurse in charge assessed patient workloads by Nine Equivalent of Nursing Manpower use Score and NAS. To assess the internal consistency of the measurements of NAS, item-test correlations, Cronbach's α and Cronbach's α corrected by omitting each of the items were calculated. The intraobserver and interobserver reliability were assessed with the intraclass correlation coefficient by viewing recordings and Kappa (interobserver reliability) was estimated. For the analysis of internal validity, a factorial principal components analysis was performed. Convergent validity was assessed using the Spearman correlation coefficient values obtained from the Nine Equivalent of Nursing Manpower use Score and Spanish-NAS scales. For internal consistency, 164 questionnaires were analysed and a Cronbach's α of 0.373 was calculated. The intraclass correlation coefficient for intraobserver reliability estimate was 0.837 (95% CI: 0.466-0.950) and 0.662 (95% CI: 0.033-0.882) for interobserver reliability. The estimated kappa was 0.371. For internal validity, exploratory factor analysis showed that the first item explained 58.9% of the variance of the questionnaire. For convergent validity 1006 questionnaires were included and a Spearman correlation coefficient of 0.746 was observed. The psychometric properties of Spanish-NAS are acceptable. Copyright © 2014 Elsevier España, S.L.U. y SEEIUC. All rights reserved.
Holzapfel, Sebastian; Riecke, Jenny; Rief, Winfried; Schneider, Jessica; Glombiewski, Julia A
2016-11-01
Pain-related fear and avoidance of physical activities are central elements of the fear-avoidance model of musculoskeletal pain. Pain-related fear has typically been measured by self-report instruments. In this study, we developed and validated a Behavioral Avoidance Test (BAT) for chronic low back pain (CLBP) patients with the aim of assessing pain-related avoidance behavior by direct observation. The BAT-Back was administered to a group of CLBP patients (N=97) and pain-free controls (N=31). Furthermore, pain, pain-related fear, disability, catastrophizing, and avoidance behavior were measured using self-report instruments. Reliability was assessed with intraclass correlation coefficient and Cronbach α. Validity was assessed by examining correlation and regression analysis. The intraclass correlation coefficient for the BAT-Back avoidance score was r=0.76. Internal consistency was α=0.95. CLBP patients and controls differed significantly on BAT-Back avoidance scores as well as self-report measures. BAT-Back avoidance scores were significantly correlated with scores on each of the self-report measures (rs=0.27 to 0.54). They were not significantly correlated with general anxiety and depression, age, body mass index, and pain duration. The BAT-Back avoidance score was able to capture unique variance in disability after controlling for other variables (eg, pain intensity and pain-related fear). Results indicate that the BAT-Back is a reliable and valid measure of pain-related avoidance behavior. It may be useful for clinicians in tailoring treatments for chronic pain as well as an outcome measure for exposure treatments.
Concurrent validity and reliability of the Alberta Infant Motor Scale in premature infants.
Almeida, Kênnea Martins; Dutra, Maria Virginia Peixoto; Mello, Rosane Reis de; Reis, Ana Beatriz Rodrigues; Martins, Priscila Silveira
2008-01-01
To verify the concurrent validity and interobserver reliability of the Alberta Infant Motor Scale (AIMS) in premature infants followed-up at the outpatient clinic of Instituto Fernandes Figueira, Fundação Oswaldo Cruz (IFF/Fiocruz), in Rio de Janeiro, Brazil. A total of 88 premature infants were enrolled at the follow-up clinic at IFF/Fiocruz, between February and December of 2006. For the concurrent validity study, 46 infants were assessed at either 6 (n = 26) or 12 (n = 20) months' corrected age using the AIMS and the second edition of the Bayley Scales of Infant Development, by two different observers, and applying Pearson's correlation coefficient to analyze the results. For the reliability study, 42 infants between 0 and 18 months were assessed using the Alberta Infant Motor Scale, by two different observers and the results analyzed using the intraclass correlation coefficient. The concurrent validity study found a high level of correlation between the two scales (r = 0.95) and one that was statistically significant (p < 0.01) for the entire population of infants, with higher values at 12 months (r = 0.89) than at 6 months (r = 0.74). The interobserver reliability study found satisfactory intraclass correlation coefficients at all ages tested, varying from 0.76 to 0.99. The AIMS is a valid and reliable instrument for the evaluation of motor development in high-risk infants within the Brazilian public health system.
Marchetti, Bárbara V; Candotti, Cláudia T; Raupp, Eduardo G; Oliveira, Eduardo B C; Furlanetto, Tássia S; Loss, Jefferson F
The purpose of this study was to assess a radiographic method for spinal curvature evaluation in children, based on spinous processes, and identify its normality limits. The sample consisted of 90 radiographic examinations of the spines of children in the sagittal plane. Thoracic and lumbar curvatures were evaluated using angular (apex angle [AA]) and linear (sagittal arrow [SA]) measurements based on the spinous processes. The same curvatures were also evaluated using the Cobb angle (CA) method, which is considered the gold standard. For concurrent validity (AA vs CA), Pearson's product-moment correlation coefficient, root-mean-square error, Pitman-Morgan test, and Bland-Altman analysis were used. For reproducibility (AA, SA, and CA), the intraclass correlation coefficient, standard error of measurement, and minimal detectable change measurements were used. A significant correlation was found between CA and AA measurements, as was a low root-mean-square error. The mean difference between the measurements was 0° for thoracic and lumbar curvatures, and the mean standard deviations of the differences were ±5.9° and ±6.9°, respectively. The intraclass correlation coefficients of AA and SA were similar to or higher than those of the gold standard (CA). The standard error of measurement and minimal detectable change of the AA were always lower than those of the CA. This study determined the concurrent validity, as well as intra- and interrater reproducibility, of the radiographic measurements of kyphosis and lordosis in children. Copyright © 2017. Published by Elsevier Inc.
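The reproducibility statistics named in this abstract are commonly derived from the ICC. The sketch below assumes the widely used definitions SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM. The abstract does not state which definitions the authors applied, and the function names and input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (assumed definition)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z; sqrt(2) reflects two measurements."""
    return z * math.sqrt(2.0) * sem


# Hypothetical values: SD of 10 degrees and ICC of 0.84 for an angular measure.
sem = standard_error_of_measurement(10.0, 0.84)
mdc = minimal_detectable_change(sem)
print(f"SEM {sem:.2f} degrees, MDC95 {mdc:.2f} degrees")
```

Under these definitions a higher ICC at the same SD yields a smaller SEM and MDC, so a method can be compared on all three statistics at once.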
AHRQ's hospital survey on patient safety culture: psychometric analyses.
Blegen, Mary A; Gearhart, Susan; O'Brien, Roxanne; Sehgal, Niraj L; Alldredge, Brian K
2009-09-01
This project analyzed the psychometric properties of the Agency for Healthcare Research and Quality Hospital Survey on Patient Safety Culture (HSOPSC) including factor structure, interitem reliability and intraclass correlations, usefulness for assessment, predictive validity, and sensitivity. The survey was administered to 454 health care staff in 3 hospitals before and after a series of multidisciplinary interventions designed to improve safety culture. Respondents (before, 434; after, 368) included nurses, physicians, pharmacists, and other hospital staff members. Factor analysis partially confirmed the validity of the HSOPSC subscales. Interitem consistency reliability was above 0.7 for 5 subscales; the staffing subscale had the lowest reliability coefficients. The intraclass correlation coefficients, agreement among the members of each unit, were within recommended ranges. The pattern of high and low scores across the subscales of the HSOPSC in the study hospitals were similar to the sample of Pacific region hospitals reported by the Agency for Healthcare Research and Quality and corresponded to the proportion of items in each subscale that are worded negatively (reverse scored). Most of the unit and hospital dimensions were correlated with the Safety Grade outcome measure in the tool. Overall, the tool was shown to have moderate-to-strong validity and reliability, with the exception of the staffing subscale. The usefulness in assessing areas of strength and weakness for hospitals or units among the culture subscales is questionable. The culture subscales were shown to correlate with the perceived outcomes, but further study is needed to determine true predictive validity.
Doubova, Svetlana V; Aguirre-Hernandez, Rebeca; Infante-Castañeda, Claudia; Martinez-Vega, Ingrid; Pérez-Cuevas, Ricardo
2015-10-01
The purpose of this study was to validate the Mexican version of the Support Person Unmet Needs Survey (SPUNS-SFM). A cross-sectional survey that included 826 primary caregivers of cancer patients was conducted from June to December 2013 at the Oncology Hospital of the Mexican Institute of Social Security in Mexico City. The validation procedure comprised (1) content validity through a group of experts; (2) construct validity through an exploratory factor analysis based on the polychoric correlation matrix; (3) internal consistency using Cronbach's alpha; (4) convergent validity between SPUNS-SFM and quality of life, anxiety-and-depression scales by calculating Spearman's rank correlation coefficient; (5) discriminative validity through the Wilcoxon rank-sum test; and (6) test-retest reliability using intraclass correlation coefficient. SPUNS-SFM has 23 items with six factors accounting for 65% of the total variance. The domains were concerns about the future, access and continuity of healthcare, information, work and finance, and personal and emotional needs. Cronbach's alpha values ranged from 0.70 to 0.88 among factors. SPUNS-SFM had moderate convergent validity compared with quality of life and depression-and-anxiety scales and good discriminative validity, revealing high needs for younger caregivers and more emotional needs for caregivers of patients with advanced cancer stages. Intraclass correlation coefficient between SPUNS-SFM measurements was 0.78. SPUNS-SFM is a valid and reliable tool to identify needs of caregivers of cancer patients.
Barbado, David; Moreside, Janice; Vera-Garcia, Francisco J
2017-03-01
Although unstable seat methodology has been used to assess trunk postural control, the reliability of the variables that characterize it remains unclear. To analyze reliability and learning effect of center of pressure (COP) and kinematic parameters that characterize trunk postural control performance in unstable seating. The relationships between kinematic and COP parameters also were explored. Test-retest reliability design. Biomechanics laboratory setting. Twenty-three healthy male subjects. Participants volunteered to perform 3 sessions at 1-week intervals, each consisting of five 70-second balancing trials. A force platform and a motion capture system were used to measure COP and pelvis, thorax, and spine displacements. Reliability was assessed through standard error of measurement (SEM) and intraclass correlation coefficients (ICC2,1) using 3 methods: (1) comparing the last trial score of each day; (2) comparing the best trial score of each day; and (3) calculating the average of the three last trial scores of each day. Standard deviation and mean velocity were calculated to assess balance performance. Although analyses of variance showed some differences in balance performance between days, these differences were not significant between days 2 and 3. Best result and average methods showed the greatest reliability. Mean velocity of the COP showed high reliability (0.71 < ICC < 0.86; 10.3 < SEM < 13.0), whereas standard deviation only showed a low to moderate reliability (0.37 < ICC < 0.61; 14.5 < SEM < 23.0). Regarding the kinematic variables, only pelvis displacement mean velocity achieved a high reliability using the average method (0.62 < ICC < 0.83; 18.8 < SEM < 23.1). Correlations between COP and kinematics were high only for mean velocity (0.45
Koleilat, Maria; Whaley, Shannon E
2016-06-01
Fruits, vegetables, sweetened foods, and beverages have been found to have positive and negative associations with obesity in early childhood, yet no rapid assessment tools are available to measure intake of these foods among preschoolers. This study examines the test-retest reliability and validity of a 10-item Child Food and Beverage Intake Questionnaire designed to assess fruits, vegetables, and sweetened foods and beverages intake among 2- to 4-year-old children. The Child Food and Beverage Intake Questionnaire was developed for use in periodic phone surveys conducted with low-income families with preschool-aged children. Seventy primary caregivers of 2- to 4-year-old children completed two Child Food and Beverage Intake Questionnaires within a 2-week period for test-retest reliability. Participants also completed three 24-hour recalls to allow assessment of validity. Intraclass correlations were used to examine test-retest reliability. Spearman rank correlation coefficients, Bland-Altman plots, and linear regression analyses were used to examine validity of the Child Food and Beverage Intake Questionnaire compared with three 24-hour recalls. Intraclass correlations between Child Food and Beverage Intake Questionnaire administrations ranged from 0.48 for sweetened drinks to 0.87 for regular sodas. Intraclass correlations for fruits, vegetables, and sweetened food were 0.56, 0.49, and 0.56, respectively. Spearman rank correlation coefficients ranged from 0.15 to 0.59 for beverages, with 0.46 for sugar-sweetened beverages. Spearman rank correlation coefficients for fruits, vegetables, and sweetened food were 0.30, 0.33, and 0.30, respectively. Although observation of the Bland-Altman plots and linear regression analyses showed a slight upward trend in mean differences, with increasing mean intake for five beverage groups, at least 90% of data plots fell within the limits of agreement for all food/beverage groups. 
The Child Food and Beverage Intake Questionnaire exhibited fair to substantial test-retest reliability and moderate to strong validity in ranking fruits, vegetables, sweetened food, and the majority of beverages consumed by children aged 2 to 4 years old. Although the Child Food and Beverage Intake Questionnaire might not be able to assess the absolute intake of foods and beverages, given the scarcity of an easily administered, valid, and reliable questionnaire to assess nutritional intake among 2- to 4-year-old low-income children, this tool is a useful means for measuring trends in dietary intake among low-income preschoolers. Copyright © 2016 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
Igwesi-Chidobe, Chinonso N; Obiekwe, Chinwe; Sorinola, Isaac O; Godfrey, Emma L
2017-12-14
Cross-culturally adapt and validate the Igbo Roland Morris Disability Questionnaire. Cross-cultural adaptation, test-retest, and cross-sectional psychometric testing. Roland Morris Disability Questionnaire was forward and back translated by clinical/non-clinical translators. An expert committee appraised the translations. Twelve participants with chronic low back pain pre-tested the measure in a rural Nigerian community. Internal consistency using Cronbach's alpha; test-retest reliability using intra-class correlation coefficient and Bland-Altman plot; and minimal detectable change were investigated in a convenience sample of 50 people with chronic low back pain in rural and urban Nigeria. Pearson's correlation analyses using the eleven-point box scale and back performance scale, and exploratory factor analysis were used to examine construct validity in a random sample of 200 adults with chronic low back pain in rural Nigeria. Ceiling and floor effects were investigated in the two samples. Modifications gave the option of interviewer-administration and reflected Nigerian social context. The measure had excellent internal consistency (α = 0.91) and intraclass correlation coefficient (ICC = 0.84), moderately high correlations (r > 0.6) with performance-based disability and pain intensity, and a predominant uni-dimensional structure, with no ceiling or floor effects. Igbo Roland Morris Disability Questionnaire is a valid and reliable measure of pain-related disability. Implications for rehabilitation Low back pain is the leading cause of years lived with disability worldwide, and is particularly prevalent in rural Nigeria, but there are no self-report measures to assess its impact due to low literacy rates. This study describes the cross-cultural adaptation and validation of a core self-report back pain specific disability measure in a low-literate Nigerian population.
The Igbo Roland Morris Disability Questionnaire is a reliable and valid measure of self-reported disability in Igbo populations, as indicated by excellent internal consistency (α = 0.91) and intra-class correlation coefficient (ICC = 0.84), moderately high correlations (r > 0.6) with performance-based disability and pain intensity that support a pain-related disability construct, and a predominant one-factor structure with no ceiling or floor effects. The measure will be useful for researchers and clinicians examining the factors associated with low back pain disability or the effects of interventions on low back pain disability in this culture. This measure will support global health initiatives concurrently involving people from several cultures or countries, and may inform cross-cultural disability research in other populations.
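The internal-consistency and minimal-detectable-change statistics named in this record can be sketched as follows. This is a minimal illustration, not the authors' analysis: the record does not state which formulas were used, so the common textbook definitions are assumed (alpha from item and total-score variances, SEM = SD × sqrt(1 − ICC), MDC95 = 1.96 × sqrt(2) × SEM). The function names and the input layout are hypothetical.

```python
import math
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha; `scores` has one row per respondent, one column per item."""
    k = len(scores[0])
    # Sample variance of each item (column) and of the per-respondent total score.
    item_vars = [statistics.variance(col) for col in zip(*scores)]
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence, assuming 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2) * sem
```

With a hypothetical score SD of 10 and the reported ICC of 0.84, `sem_from_icc(10, 0.84)` gives an SEM of about 4 points under these assumptions.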
González-Pérez, Javier; Queiruga Piñeiro, Juan; Sánchez García, Ángel; González Méijome, José Manuel
2018-04-10
To compare central corneal thickness (CCT) measured by standard ultrasound pachymetry (USP), and three non-contact devices in healthy eyes. A cross-sectional study of CCT measurement in 52 eyes of 52 healthy volunteers was done by a single examiner at Ocular Surface and Contact Lens Laboratory. Three consecutive measurements were done by standard USP, non-contact tono-pachymeter, Pentacam corneal topographer, and Anterior Segment Optical Coherence Tomography (AS-OCT). The mean values were used for assessment. The results were compared using multivariate ANOVA, linear regression, and Pearson correlation. Agreement among the devices was analyzed using mean differences and Bland-Altman analysis with 95% limits of agreement (LoA). Finally, reliability was analyzed using intraclass correlation coefficient (ICC). Mean CCT by ultrasound pachymeter, tono-pachymeter, corneal topographer and AS-OCT were 558.9 ± 31.2 µm, 525.8 ± 43.1 µm, 550.4 ± 30.5 µm, and 545.9 ± 30.5 µm respectively. There was a significant positive correlation between AS-OCT and USP (Pearson correlation = 0.957, p < 0.001), corneal topography and USP (Pearson correlation = 0.965, p < 0.001), and corneal topography and AS-OCT (Pearson correlation = 0.965, p < 0.001). There was a lower correlation between CT-1P tono-pachymeter and the other three modalities. Intraclass correlation coefficients show an excellent reliability between pairs except for CT-1P against the other three instruments that were found moderate. CT-1P tono-pachymeter underestimates CCT measurements compared to Scheimpflug system, AS-OCT device, and USP. Mean CCT among USP, Pentacam and AS-OCT were comparable and had significant linear correlations. In clinical practice, these three modalities could be interchangeable in healthy patients.
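The Bland-Altman analysis with 95% limits of agreement mentioned in this record can be sketched as below. It assumes the usual definition (mean of the paired differences ± 1.96 × their sample standard deviation). The function name is hypothetical and the code is an illustration, not the study's procedure.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of the paired differences a - b; the limits of
    agreement are assumed to be bias +/- 1.96 * SD of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A negative bias for device A minus device B would correspond to A reading thinner than B on average, which is the kind of systematic underestimation the record describes for the tono-pachymeter.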
Dwyer, Tim; Martin, C Ryan; Kendra, Rita; Sermer, Corey; Chahal, Jaskarndip; Ogilvie-Harris, Darrell; Whelan, Daniel; Murnaghan, Lucas; Nauth, Aaron; Theodoropoulos, John
2017-06-01
To determine the interobserver reliability of the International Cartilage Repair Society (ICRS) grading system of chondral lesions in cadavers, to determine the intraobserver reliability of the ICRS grading system comparing arthroscopy and video assessment, and to compare the arthroscopic ICRS grading system with histological grading of lesion depth. Eighteen lesions in 5 cadaveric knee specimens were arthroscopically graded by 7 fellowship-trained arthroscopic surgeons using the ICRS classification system. The arthroscopic video of each lesion was sent to the surgeons 6 weeks later for repeat grading and determination of intraobserver reliability. Lesions were biopsied, and the depth of the cartilage lesion was assessed. Reliability was calculated using intraclass correlations. The interobserver reliability was 0.67 (95% confidence interval, 0.5-0.89) for the arthroscopic grading, and the intraobserver reliability with the video grading was 0.8 (95% confidence interval, 0.67-0.9). A high correlation was seen between the arthroscopic grading of depth and the histological grading of depth (0.91); on average, surgeons graded lesions using arthroscopy a mean of 0.37 (range, 0-0.86) deeper than the histological grade. The arthroscopic ICRS classification system has good interobserver and intraobserver reliability. A high correlation with histological assessment of depth provides evidence of validity for this classification system. As cartilage lesions are treated on the basis of the arthroscopic ICRS classification, it is important to ascertain the reliability and validity of this method. Copyright © 2016 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
Martín-Fernández, Jesús; del Cura-González, Ma Isabel; Rodríguez-Martínez, Gemma; Ariza-Cardiel, Gloria; Zamora, Javier; Gómez-Gascón, Tomás; Polentinos-Castro, Elena; Pérez-Rivas, Francisco Javier; Domínguez-Bidagor, Julia; Beamud-Lagos, Milagros; Tello-Bernabé, Ma Eugenia; Conde-López, Juan Francisco; Aguado-Arroyo, Óscar; Sanz-Bayona, Ma Teresa; Gil-Lacruz, Ana Isabel
2013-01-01
Identifying the economic value assigned by users to a particular health service is of principal interest in planning the service. The aim of this study was to evaluate the perception of economic value of nursing consultation in primary care (PC) by its users. Economic study using contingent valuation methodology. A total of 662 users of nursing consultation from 23 health centers were included. Data on demographic and socioeconomic characteristics, health needs, pattern of usage, and satisfaction with provided service were compiled. The validity of the response was evaluated by an explanatory mixed-effects multilevel model in order to assess the factors associated with the response according to the welfare theory. Response reliability was also evaluated. Subjects included in the study indicated an average Willingness to Pay (WTP) of €14.4 (CI 95%: €13.2-15.5; median €10) and an average Willingness to Accept [Compensation] (WTA) of €20.9 (CI 95%: €19.6-22.2; median €20). Average area income, personal income, consultation duration, home visit, and education level correlated with greater WTP. Women and older subjects showed lower WTP. Fixed parameters explained 8.41% of the residual variability, and response clustering in different health centers explained 4-6% of the total variability. The influence of income on WTP was different in each center. The responses for WTP and WTA in a subgroup of subjects were consistent when reassessed after 2 weeks (intraclass correlation coefficients 0.952 and 0.893, respectively). The economic value of nursing services provided within PC in a public health system is clearly perceived by its user. The perception of this value is influenced by socioeconomic and demographic characteristics of the subjects and their environment, and by the unique characteristics of the evaluated service. The method of contingent valuation is useful for making explicit this perception of value of health services.
PMID:23626858
Schrack, Jennifer A; Simonsick, Eleanor M; Ferrucci, Luigi
2010-02-18
Recent introduction of the Cosmed K4b(2) portable metabolic analyzer allows measurement of oxygen consumption outside of a laboratory setting in more typical clinical or household environments and thus may be used to obtain information on the metabolic costs of specific daily life activities. The purpose of this study was to assess the accuracy of the Cosmed K4b(2) portable metabolic analyzer against a traditional, stationary gas exchange system (the Medgraphics D-Series) during steady-state, submaximal walking exercise. Nineteen men and women (9 women, 10 men) with an average age of 39.8 years (+/-13.8) completed two 400-meter walk tests using the two systems at a constant, self-selected pace on a treadmill. Average oxygen consumption (VO2) and carbon dioxide production (VCO2) from each walk were compared. Intraclass Correlation Coefficient (ICC) and Pearson correlation coefficients between the two systems for weight indexed VO2 (ml/kg/min), total VO2 (ml/min), and VCO2 (ml/min) ranged from 0.93 to 0.97. Comparison of the average values obtained using the Cosmed K4b(2) and Medgraphics systems using paired t-tests indicated no significant difference for VO2 (ml/kg/min) overall (p = 0.25), or when stratified by sex (p = 0.21 women, p = 0.69 men). The mean difference between analyzers was -0.296 ml/kg/min (+/-0.26). Results were not significantly different for VO2 (ml/min) or VCO2 (ml/min) within the study population (p = 0.16 and p = 0.08, respectively), or when stratified by sex (VO2: p = 0.51 women, p = 0.16 men; VCO2: p = 0.11 women, p = 0.53 men). The Cosmed K4b(2) portable metabolic analyzer provides measures of VO2 and VCO2 during steady-state, submaximal exercise similar to a traditional, stationary gas exchange system.
Intra- and Interobserver Variability of Cochlear Length Measurements in Clinical CT.
Iyaniwura, John E; Elfarnawany, Mai; Riyahi-Alam, Sadegh; Sharma, Manas; Kassam, Zahra; Bureau, Yves; Parnes, Lorne S; Ladak, Hanif M; Agrawal, Sumit K
2017-07-01
The cochlear A-value measurement exhibits significant inter- and intraobserver variability, and its accuracy is dependent on the visualization method in clinical computed tomography (CT) images of the cochlea. An accurate estimate of the cochlear duct length (CDL) can be used to determine electrode choice, and frequency map the cochlea based on the Greenwood equation. Studies have described estimating the CDL using a single A-value measurement, however the observer variability has not been assessed. Clinical and micro-CT images of 20 cadaveric cochleae were acquired. Four specialists measured A-values on clinical CT images using both standard views and multiplanar reconstructed (MPR) views. Measurements were repeated to assess for intraobserver variability. Observer variabilities were evaluated using intra-class correlation and absolute differences. Accuracy was evaluated by comparison to the gold standard micro-CT images of the same specimens. Interobserver variability was good (average absolute difference: 0.77 ± 0.42 mm) using standard views and fair (average absolute difference: 0.90 ± 0.31 mm) using MPR views. Intraobserver variability had an average absolute difference of 0.31 ± 0.09 mm for the standard views and 0.38 ± 0.17 mm for the MPR views. MPR view measurements were more accurate than standard views, with average relative errors of 9.5 and 14.5%, respectively. There was significant observer variability in A-value measurements using both the standard and MPR views. Creating the MPR views increased variability between experts, however MPR views yielded more accurate results. Automated A-value measurement algorithms may help to reduce variability and increase accuracy in the future.
Chen, H; Ho, H M; Ying, M; Fu, S N
2012-01-01
Objectives The purpose of this study was to correlate findings on small vessel vascularity between computerised findings and Newman's scaling using power Doppler ultrasonography (PDU) imaging and its predictive value in patients with plantar fasciitis. Methods PDU was performed on 44 patients (age range 30–66 years; mean age 48 years) with plantar fasciitis and 46 healthy subjects (age range 18–61 years; mean age 36 years). The vascularity was quantified using ultrasound images by a customised software program and graded by Newman's grading scale. Vascular index (VI) was calculated from the software program as the ratio of the number of colour pixels to the total number of pixels within a standardised selected area of proximal plantar fascia. The 46 healthy subjects were examined on 2 occasions 7–10 days apart, and 18 of them were assessed by 2 examiners. Statistical analyses were performed using intraclass correlation coefficient and linear regression analysis. Results Good correlation was found between the averaged VI ratios and Newman's qualitative scale (ρ = 0.70; p<0.001). Intratester and intertester reliability were 0.89 and 0.61, respectively. Furthermore, higher VI was correlated with less reduction in pain after physiotherapeutic intervention. Conclusions The computerised VI not only has a high level of concordance with the Newman grading scale but is also reliable in reflecting the vascularity of proximal plantar fascia, and can predict pain reduction after intervention. This index can be used to characterise the changes in vascularity of patients with plantar fasciitis, and it may also be helpful for evaluating treatment and monitoring the progress after intervention in future studies. PMID:22167513
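The vascular index described in this record is a ratio of colour pixels to all pixels in a selected area, which can be sketched as below. The record does not say how the customised software decides that a pixel is coloured. The rule used here (an RGB pixel counts as coloured when its three channels are not all equal, i.e. it is not a shade of grey) is purely an assumption for illustration, as are the function name and the input layout.

```python
def vascular_index(region):
    """Ratio of colour pixels to total pixels in a region of interest.

    `region` is a 2D list of (r, g, b) tuples. A pixel is treated as a
    colour (power Doppler) pixel when its channels differ; this threshold
    rule is a hypothetical stand-in for the study's software.
    """
    total = 0
    coloured = 0
    for row in region:
        for r, g, b in row:
            total += 1
            if not (r == g == b):
                coloured += 1
    if total == 0:
        raise ValueError("region contains no pixels")
    return coloured / total
```

For a 2 × 2 region with one red pixel and three grey pixels, this sketch returns 0.25.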
ERIC Educational Resources Information Center
Gray, Heewon Lee; Burgermaster, Marissa; Tipton, Elizabeth; Contento, Isobel R.; Koch, Pamela A.; Di Noia, Jennifer
2016-01-01
Objective: Sample size and statistical power calculation should consider clustering effects when schools are the unit of randomization in intervention studies. The objective of the current study was to investigate how student outcomes are clustered within schools in an obesity prevention trial. Method: Baseline data from the Food, Health &…
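One common way to account for clustering in sample size planning is the design effect, sketched below. The sketch assumes equal cluster sizes and the standard formula 1 + (m − 1) × ICC, where m is the number of students per school. The record itself does not state which adjustment the authors used, and the function names are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations that n_total clustered ones are worth."""
    return n_total / design_effect(cluster_size, icc)
```

For example, with 25 students per school and an ICC of 0.05, the design effect is 2.2, so 1100 students carry roughly the information of 500 independent observations.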
ERIC Educational Resources Information Center
Glassman, Jill R.; Potter, Susan C.; Baumler, Elizabeth R.; Coyle, Karin K.
2015-01-01
Introduction: Group-randomized trials (GRTs) are one of the most rigorous methods for evaluating the effectiveness of group-based health risk prevention programs. Efficiently designing GRTs with a sample size that is sufficient for meeting the trial's power and precision goals while not wasting resources exceeding them requires estimates of the…
ERIC Educational Resources Information Center
Martinkova, Patricia; Goldhaber, Dan
2015-01-01
Inter-rater reliability, commonly assessed by intra-class correlation coefficient ICC, is an important index for describing the extent to which there is consistency amongst two or more raters in assigned measures. In organizational research, the data structure is often hierarchical and designs deviate substantially from the ideal of a balanced…
NASA Astrophysics Data System (ADS)
Li, Lin; Zeng, Li; Lin, Zi-Jing; Cazzell, Mary; Liu, Hanli
2015-05-01
Test-retest reliability of neuroimaging measurements is an important concern in the investigation of cognitive functions in the human brain. To date, intraclass correlation coefficients (ICCs), originally used in inter-rater reliability studies in behavioral sciences, have become commonly used metrics in reliability studies on neuroimaging and functional near-infrared spectroscopy (fNIRS). However, because there are six popular forms of ICC, how well one understands them affects how appropriately one selects, uses, and interprets ICCs in a reliability study. We first offer a brief review and tutorial on the statistical rationale of ICCs, including their underlying analysis of variance models and technical definitions, in the context of assessment on intertest reliability. Second, we provide general guidelines on the selection and interpretation of ICCs. Third, we illustrate the proposed approach by using an actual research study to assess intertest reliability of fNIRS-based, volumetric diffuse optical tomography of brain activities stimulated by a risk decision-making protocol. Last, special issues that may arise in reliability assessment using ICCs are discussed and solutions are suggested.
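The six ICC forms mentioned in this record can be computed from analysis-of-variance mean squares, as in the sketch below. It assumes the widely used Shrout-Fleiss definitions (one-way random, two-way random, and two-way mixed models, each for a single measurement and for the average of k measurements) and a complete subjects × measurements table. The function name, dictionary keys, and the data in the usage note are illustrative and do not come from the study.

```python
def icc_forms(ratings):
    """Six ICC forms for a complete table: rows = subjects, columns = raters/tests."""
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # number of measurements per subject
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Sums of squares for the two-way layout.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                   # between-subjects mean square
    msc = ss_cols / (k - 1)                   # between-measurements mean square
    mse = ss_err / ((n - 1) * (k - 1))        # residual mean square
    msw = (ss_total - ss_rows) / (n * (k - 1))  # within-subjects mean square

    return {
        "ICC(1,1)": (msr - msw) / (msr + (k - 1) * msw),
        "ICC(2,1)": (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n),
        "ICC(3,1)": (msr - mse) / (msr + (k - 1) * mse),
        "ICC(1,k)": (msr - msw) / msr,
        "ICC(2,k)": (msr - mse) / (msr + (msc - mse) / n),
        "ICC(3,k)": (msr - mse) / msr,
    }
```

Calling `icc_forms` on a small table such as six subjects rated four times shows how far the forms can diverge on the same data: when the columns differ systematically, the consistency form ICC(3,1) is much higher than the absolute-agreement form ICC(2,1).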
Johansen, Mette Dencker; Gjerløv, Irene; Christiansen, Jens Sandahl; Hejlesen, Ole K
2012-03-01
In glycemic control, postprandial glycemia may be important to monitor and optimize as it reveals glycemic control quality, and postprandial hyperglycemia partly predicts late diabetic complications. Self-monitoring of blood glucose (SMBG) may be an appropriate technology to use, but recommendations on measurement time are crucial. We retrospectively analyzed interindividual and intraindividual variations in postprandial glycemic peak time. Continuous glucose monitoring (CGM) and carbohydrate intake were collected in 22 patients with type 1 diabetes mellitus. Meals were identified from carbohydrate intake data. For each meal, peak time was identified as time from meal to CGM zenith within 40-150 min after meal start. Interindividual (one-way ANOVA) and intraindividual (intraclass correlation coefficient) variation was calculated. Nineteen patients were included with sufficient meal data quality. Mean peak time was 87 ± 29 min. Mean peak time differed significantly between patients (p = 0.02). Intraclass correlation coefficient was 0.29. Significant interindividual and intraindividual variations exist in postprandial glycemia peak time, thus hindering simple and general advice regarding postprandial SMBG for detection of maximum values. © 2012 Diabetes Technology Society.
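The peak-time rule stated in this record (time from meal start to the CGM maximum within 40-150 min) can be sketched as follows. The data layout, function name, and tie handling are assumptions made for illustration; the record does not describe how the authors handled missing readings or ties.

```python
def postprandial_peak_time(readings, window_start=40, window_end=150):
    """Minutes from meal start to the highest CGM value inside the window.

    `readings` is a list of (minutes_since_meal_start, glucose) pairs.
    Returns None when no reading falls inside the window. If several
    readings share the maximum, the earliest in list order is returned
    (a hypothetical tie rule).
    """
    in_window = [
        (minutes, glucose)
        for minutes, glucose in readings
        if window_start <= minutes <= window_end
    ]
    if not in_window:
        return None
    peak_minutes, _ = max(in_window, key=lambda pair: pair[1])
    return peak_minutes
```

Readings outside the window are ignored even if they are higher, so a value recorded 180 min after the meal does not count as the postprandial peak in this sketch.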
A quick and reliable procedure for assessing foot alignment in athletes.
De Michelis Mendonça, Luciana; Bittencourt, Natália Franco Netto; Amaral, Giovanna Mendes; Diniz, Lívia Santos; Souza, Thales Rezende; da Fonseca, Sérgio Teixeira
2013-01-01
Quick procedures with proper psychometric properties that can capture the combined alignment of the foot-ankle complex in a position that may be more representative of the status of the lower limb during ground contact are essential for assessing a large group of athletes. The assessed lower limb was positioned with the calcaneus surface facing upward in a way that all of the marks could be seen at the center of the camera display. After guaranteeing maintenance of the foot at 90° of dorsiflexion actively sustained by the athlete, the examiner took the picture of the foot-ankle alignment. Intraclass correlation coefficients ranging from 0.82 to 0.93 demonstrated excellent intratester and intertester reliability for the proposed measurements of forefoot, rearfoot, and shank-forefoot alignments. The intraclass correlation coefficient between the shank-forefoot measures and the sum of the rearfoot and forefoot measures was 0.98, suggesting that the shank-forefoot alignment measures can represent the combined rearfoot and forefoot alignments. This study describes a reliable and practical measurement procedure for rearfoot, forefoot, and shank-forefoot alignments that can be applied to clinical and research situations as a screening procedure for risk factors for lower-limb injuries in athletes.
Inter-rater reliability of the Sødring Motor Evaluation of Stroke patients (SMES).
Halsaa, K E; Sødring, K M; Bjelland, E; Finsrud, K; Bautz-Holter, E
1999-12-01
The Sødring Motor Evaluation of Stroke patients is an instrument for physiotherapists to evaluate motor function and activities in stroke patients. The rating reflects quality as well as quantity of the patient's unassisted performance within three domains: leg, arm and gross function. The inter-rater reliability of the method was studied in a sample of 30 patients admitted to a stroke rehabilitation unit. Three therapists were involved in the study; two therapists assessed the same patient on two consecutive days in a balanced design. Cohen's weighted kappa and McNemar's test of symmetry were used as measures of item reliability, and the intraclass correlation coefficient was used to express the reliability of the sumscores. For 24 out of 32 items the weighted kappa statistic was excellent (0.75-0.98), while 7 items had a kappa statistic within the range 0.53-0.74 (fair to good). The reliability of one item was poor (0.13). The intraclass correlation coefficient for the three sumscores was 0.97, 0.91 and 0.97. We conclude that the Sødring Motor Evaluation of Stroke patients is a reliable measure of motor function in stroke patients undergoing rehabilitation.
Reproducibility of dynamically represented acoustic lung images from healthy individuals
Maher, T M; Gat, M; Allen, D; Devaraj, A; Wells, A U; Geddes, D M
2008-01-01
Background and aim: Acoustic lung imaging offers a unique method for visualising the lung. This study was designed to demonstrate reproducibility of acoustic lung images recorded from healthy individuals at different time points and to assess intra- and inter-rater agreement in the assessment of dynamically represented acoustic lung images. Methods: Recordings from 29 healthy volunteers were made on three separate occasions using vibration response imaging. Reproducibility was measured using quantitative, computerised assessment of vibration energy. Dynamically represented acoustic lung images were scored by six blinded raters. Results: Quantitative measurement of acoustic recordings was highly reproducible with an intraclass correlation score of 0.86 (very good agreement). Intraclass correlations for inter-rater agreement and reproducibility were 0.61 (good agreement) and 0.86 (very good agreement), respectively. There was no significant difference found between the six raters at any time point. Raters ranged from 88% to 95% in their ability to identically evaluate the different features of the same image presented to them blinded on two separate occasions. Conclusion: Acoustic lung imaging is reproducible in healthy individuals. Graphic representation of lung images can be interpreted with a high degree of accuracy by the same and by different reviewers. PMID:18024534
Winston, Courtney P; Sallis, James F; Swartz, Michael D; Hoelscher, Deanna M; Peskin, Melissa F
2013-08-01
According to ecological models, the physical environment plays a major role in determining individual health behaviors. As such, researchers have started targeting the consumer nutrition environment of large-scale foodservice operations when implementing obesity-prevention programs. In 2010, the American Hospital Association released a call-to-action encouraging health care facilities to join in this movement and improve their facilities' consumer nutrition environments. The Hospital Nutrition Environment Scan (HNES) for Cafeterias, Vending Machines, and Gift Shops was developed in 2011, and the present study evaluated the inter-rater reliability of this instrument. Two trained raters visited 39 hospitals in southern California and completed the HNES. Percent agreement, kappa statistics, and intraclass correlation coefficients were calculated. Percent agreement between raters ranged from 74.4% to 100% and kappa statistics ranged from 0.458 to 1.0. The intraclass correlation coefficient for the overall nutrition composite scores was 0.961. Given these results, the HNES demonstrated acceptable reliability metrics and can now be disseminated to assess the current state of hospital consumer nutrition environments. Copyright © 2013 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
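Percent agreement and the kappa statistic for two raters, as reported in this record, can be sketched as below. The sketch assumes unweighted Cohen's kappa on categorical item ratings; the record does not say whether a weighted variant was used. The function names are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters gave the same rating, in percent."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement from each rater's marginal category proportions.
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(counts_a) | set(counts_b)
    )
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)
```

For ratings `[1, 1, 0, 0]` and `[1, 0, 0, 0]`, percent agreement is 75% but kappa is only 0.5, which illustrates why the two statistics are usually reported together.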
Ertuğ, Nurcan
2018-06-01
The aim of this study was to determine the validity and reliability of the Turkish version of the V-scale, which measures nurses' attitudes towards vital signs monitoring in the detection of clinical deterioration. This validity and reliability study was conducted at a tertiary hospital in Ankara, Turkey, in 2016. A total of 169 ward nurses participated in the study. Exploratory factor analysis, Cronbach's alpha coefficient, and the intraclass correlation coefficient were used to determine the validity and reliability of the scale. A 5-factor, 16-item scale explained 60.823% of the total variance according to the validity analysis. Our version matched the original scale in terms of the number of items and factor structure. Cronbach's alpha coefficient of the Turkish version of the V-scale was 0.764. The test-retest reliability results were 0.855 for the overall intraclass correlation coefficient, and the t-test result was P > 0.05. The V-scale is a reliable and valid instrument to measure Turkish nurses' attitudes towards vital signs monitoring in the detection of clinical deterioration. © 2018 John Wiley & Sons Australia, Ltd.
Interval Timing Accuracy and Scalar Timing in C57BL/6 Mice
Buhusi, Catalin V.; Aziz, Dyana; Winslow, David; Carter, Rickey E.; Swearingen, Joshua E.; Buhusi, Mona C.
2010-01-01
In many species, interval timing behavior is accurate—appropriate estimated durations—and scalar—errors vary linearly with estimated durations. While accuracy has been previously examined, scalar timing has not been yet clearly demonstrated in house mice (Mus musculus), raising concerns about mouse models of human disease. We estimated timing accuracy and precision in C57BL/6 mice, the most used background strain for genetic models of human disease, in a peak-interval procedure with multiple intervals. Both when timing two intervals (Experiment 1) or three intervals (Experiment 2), C57BL/6 mice demonstrated varying degrees of timing accuracy. Importantly, both at individual and group level, their precision varied linearly with the subjective estimated duration. Further evidence for scalar timing was obtained using an intraclass correlation statistic. This is the first report of consistent, reliable scalar timing in a sizable sample of house mice, thus validating the PI procedure as a valuable technique, the intraclass correlation statistic as a powerful test of the scalar property, and the C57BL/6 strain as a suitable background for behavioral investigations of genetically engineered mice modeling disorders of interval timing. PMID:19824777
Simmenroth-Nayda, Anne; Heinemann, Stephanie; Nolte, Catharina; Fischer, Thomas; Himmel, Wolfgang
2014-12-06
The aim of this study was to analyse the psychometric properties of the short version of the Calgary Cambridge Guides and to decide whether it can be recommended for use in the assessment of communications skills in young undergraduate medical students. Using a translated version of the Guide, 30 members from the Department of General Practice rated 5 videotaped encounters between students and simulated patients twice. Item analysis was used to detect possible floor and/or ceiling effects. The construct validity was investigated using exploratory factor analysis. Intra-rater reliability was measured at an interval of 3 months; inter-rater reliability was assessed by the intraclass correlation coefficient. The score distribution of the items showed no ceiling or floor effects. Four of the five factors extracted from the factor analysis represented important constructs of doctor-patient communication. The ratings for the first and second round of assessing the videos correlated at 0.75 (p<0.0001). Intraclass correlation coefficients for each item were moderate and ranged from 0.05 to 0.57. Reasonable score distributions of most items without ceiling or floor effects as well as a good test-retest reliability and construct validity recommend the C-CG as an instrument for assessing communication skills in undergraduate medical students. Some deficiencies in inter-rater reliability are a clear indication that raters need a thorough instruction before using the C-CG.
Mousavian, Alireza; Ebrahimzadeh, Mohammad H; Birjandinejad, Ali; Omidi-Kashani, Farzad; Kachooei, Amir Reza
2015-12-01
In this study, we aimed to translate and test the validity and reliability of the Persian version of the Manchester-Oxford Foot Questionnaire in foot and ankle patients. We translated the Manchester-Oxford Foot Questionnaire into Persian according to the accepted guidelines, then assessed the psychometric properties including the validity and reliability on 308 patients with long-standing foot and ankle problems. To test the reliability, we calculated the intra-class correlation coefficient (ICC) for test-retest reliability and measured Cronbach's alpha to test the internal consistency. To test the construct validity of the Manchester-Oxford Foot Questionnaire we also administered the Short-Form 36 to patients. Construct validity was supported by significant correlation with SF36 subscales except for pain subscale of the Persian MOXFQ with mental health of the SF36 (r=0.207). Intraclass correlation coefficient was 0.79 for the total MOXFQ and ranged from 0.83 to 0.89 for the three subscales. Cronbach's alpha for pain, walking/standing, and social interaction was 0.86, 0.88, and 0.89, respectively, and was 0.79 for the total MOXFQ showing good internal consistency in each domain. The Persian Manchester-Oxford Foot Questionnaire health scoring system is a valid and reliable patient-reported instrument for foot and ankle problems. Copyright © 2015. Published by Elsevier Ltd.
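For readers unfamiliar with the internal consistency statistic reported above, the following is a minimal sketch of Cronbach's alpha using the usual variance-based formula. The function name `cronbach_alpha` and the row-per-respondent layout are assumptions for illustration; the demo scores are made up and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Made-up responses: 5 respondents, 3 items.
    demo = [[3, 4, 3], [2, 2, 1], [5, 4, 5], [1, 2, 2], [4, 4, 5]]
    print(round(cronbach_alpha(demo), 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why it is read as a measure of internal consistency rather than of test-retest stability.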
Wu, Lin; Wang, Yang; Pan, Shirui
2017-12-01
It is now well established that sparse representation models work effectively for many visual recognition tasks, and have pushed forward the success of dictionary learning therein. Recent studies on dictionary learning focus on learning discriminative atoms instead of purely reconstructive ones. However, the existence of intraclass diversities (i.e., data objects within the same category that exhibit large visual dissimilarities), and interclass similarities (i.e., data objects from distinct classes that share many visual similarities), makes it challenging to learn effective recognition models. To this end, a large number of labeled data objects are required to learn models which can effectively characterize these subtle differences. However, labeled data objects are always limited in availability, making it difficult to learn a monolithic dictionary that can be discriminative enough. To address the above limitations, in this paper, we propose a weakly-supervised dictionary learning method to automatically learn a discriminative dictionary by fully exploiting visual attribute correlations rather than label priors. In particular, the intrinsic attribute correlations are deployed as a critical cue to guide the process of object categorization, and then a set of subdictionaries are jointly learned with respect to each category. The resulting dictionary is highly discriminative and leads to intraclass diversity aware sparse representations. Extensive experiments on image classification and object recognition are conducted to show the effectiveness of our approach.
Validation of the Female Sexual Function Index (FSFI) for web-based administration.
Crisp, Catrina C; Fellner, Angela N; Pauls, Rachel N
2015-02-01
Web-based questionnaires are becoming increasingly valuable for clinical research. The Female Sexual Function Index (FSFI) is the gold standard for evaluating female sexual function; yet, it has not been validated in this format. We sought to validate the Female Sexual Function Index (FSFI) for web-based administration. Subjects enrolled in a web-based research survey of sexual function from the general population were invited to participate in this validation study. The first 151 respondents were included. Validation participants completed the web-based version of the FSFI followed by a mailed paper-based version. Demographic data were collected for all subjects. Scores were compared using the paired t test and the intraclass correlation coefficient. One hundred fifty-one subjects completed both web- and paper-based versions of the FSFI. Those subjects participating in the validation study did not differ in demographics or FSFI scores from the remaining subjects in the general population study. Total web-based and paper-based FSFI scores were not significantly different (mean 20.31 and 20.29 respectively, p = 0.931). The six domains or subscales of the FSFI were similar when comparing web and paper scores. Finally, intraclass correlation analysis revealed a high degree of correlation between total and subscale scores, r = 0.848-0.943, p < 0.001. Web-based administration of the FSFI is a valid alternative to the paper-based version.
Mobile detection system to evaluate reactive hyperemia using radionuclide plethysmography.
Harel, François; Ngo, Quam; Finnerty, Vincent; Hernandez, Edgar; Khairy, Paul; Dupuis, Jocelyn
2007-08-01
We validated a novel mobile detection system to evaluate reactive hyperemia using the radionuclide plethysmography technique. Twenty-six subjects underwent simultaneously radionuclide plethysmography with strain gauge plethysmography. Strain gauge and radionuclide methods showed excellent reproducibility with intraclass correlation coefficients of 0.96 and 0.89 respectively. There was also a good correlation of flows between the two methods during reactive hyperemia (r = 0.87). We conclude that radionuclide plethysmography using this mobile detection system is a non-invasive alternative to assess forearm blood flow and its dynamic variations during reactive hyperemia.
Tsehaie, J; Poot, D H J; Oei, E H G; Verhaar, J A N; de Vos, R J
2017-07-01
To evaluate whether baseline MRI parameters provide prognostic value for clinical outcome, and to study correlation between MRI parameters and clinical outcome. Observational prospective cohort study. Patients with chronic midportion Achilles tendinopathy were included and performed a 16-week eccentric calf-muscle exercise program. Outcome measurements were the validated Victorian Institute of Sports Assessment-Achilles (VISA-A) questionnaire and MRI parameters at baseline and after 24 weeks. The following MRI parameters were assessed: tendon volume (Volume), tendon maximum cross-sectional area (CSA), tendon maximum anterior-posterior diameter (AP), and signal intensity (SI). Intra-class correlation coefficients (ICCs) and minimum detectable changes (MDCs) for each parameter were established in a reliability analysis. Twenty-five patients were included and complete follow-up was achieved in 20 patients. The average VISA-A scores increased significantly by 12.3 points (27.6%). The reliability was fair-good for all MRI parameters with ICCs > 0.50. Average tendon volume and CSA decreased significantly by 0.28 cm³ (5.2%) and 4.52 mm² (4.6%) respectively. Other MRI parameters did not change significantly. None of the baseline MRI parameters were univariately associated with VISA-A change after 24 weeks. MRI SI increase over 24 weeks was positively correlated with the VISA-A score improvement (B = 0.7, R² = 0.490, p = 0.02). Tendon volume and CSA decreased significantly after 24 weeks of conservative treatment. As these differences were within the MDC limits, they could be a result of a measurement error. Furthermore, MRI parameters at baseline did not predict the change in symptoms, and therefore have no added value in providing a prognosis in daily clinical practice. Copyright © 2017 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
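The abstract above does not say how the minimum detectable changes were derived. One widely used convention, shown here purely as an assumption, obtains the standard error of measurement (SEM) from the between-subject SD and the ICC, and then scales it to a 95% minimum detectable change. The function names and the input numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), a common convention (assumed, not from the study)."""
    return sd * math.sqrt(1.0 - icc)


def minimum_detectable_change_95(sd, icc):
    """MDC95 = 1.96 * sqrt(2) * SEM; sqrt(2) reflects error in both measurements."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


if __name__ == "__main__":
    # Made-up inputs: between-subject SD of 10 units and an ICC of 0.75.
    print(round(standard_error_of_measurement(10.0, 0.75), 2))
    print(round(minimum_detectable_change_95(10.0, 0.75), 2))
```

Under this convention, an observed change smaller than the MDC cannot be distinguished from measurement error, which is the reasoning the abstract applies to the volume and CSA changes.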
The Trojan Lifetime Champions Health Survey: development, validity, and reliability.
Sorenson, Shawn C; Romano, Russell; Scholefield, Robin M; Schroeder, E Todd; Azen, Stanley P; Salem, George J
2015-04-01
Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Descriptive laboratory study. A large National Collegiate Athletic Association Division I university. A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. 
We found strong evidence of content validity, convergent construct validity with the Short-Form 12 Version 2 HRQL instrument, and feasibility of administration in an elite, competitive athletic population. These data suggest that the TLC Health Survey is a valid and reliable instrument for assessing lifetime and recent health, exercise, and HRQL, among elite competitive athletes. Generalizability of the instrument may be enhanced by additional, larger-scale studies in diverse populations.
Macedo-Ojeda, Gabriela; Vizmanos-Lamotte, Barbara; Márquez-Sandoval, Yolanda Fabiola; Rodríguez-Rocha, Norma Patricia; López-Uriarte, Patricia Josefina; Fernández-Ballart, Joan D
2013-11-01
Semi-quantitative Food Frequency Questionnaires (FFQs) analyze average food and nutrient intake over extended periods to associate habitual dietary intake with health problems and chronic diseases. A tool of this nature applicable to both women and men is not presently available in Mexico. To validate a FFQ for adult men and women. The study was conducted on 97 participants, 61% were women. Two FFQs were administered (with a one-year interval) to measure reproducibility. To assess validity, the second FFQ was compared against dietary record (DR) covering nine days. Statistical analyses included Pearson correlations and Intraclass Correlation Coefficients (ICC). The de-attenuation of the ICC resulting from intraindividual variability was controlled. The validity analysis was complemented by comparing the classification ability of FFQ to that of DR through concordance between intake categories and Bland-Altman plots. Reproducibility: ICC values for food groups ranged 0.42-0.87; the range for energy and nutrients was between 0.34 and 0.82. ICC values for food groups ranged 0.35-0.84; the range for energy and nutrients was between 0.36 and 0.77. Most subjects (56.7-76.3%) classified in the same or adjacent quintile for energy and nutrients using both methods. Extreme misclassification was <6.3% for all items. Bland-Altman plots reveal high concordance between FFQ and DR. FFQ produced sufficient levels of reproducibility and validity to determine average daily intake over one year. These results will enable the analysis of possible associations with chronic diseases and dietary diagnoses in adult populations of men and women. Copyright AULA MEDICA EDICIONES 2013. Published by AULA MEDICA. All rights reserved.
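The Bland-Altman comparison mentioned above rests on the mean difference between two methods and its 95% limits of agreement. The following is a minimal sketch of that calculation; the function name `limits_of_agreement` and the intake values are made up for illustration and do not come from the study.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (bias, lower, upper) for paired measurements from two methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)          # systematic difference between methods
    spread = 1.96 * stdev(diffs)  # half-width of the 95% limits
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Made-up daily energy intakes (kcal) from a questionnaire and a dietary record.
    ffq = [2100, 1850, 2400, 1990, 2250]
    record = [2000, 1900, 2300, 2050, 2200]
    print(limits_of_agreement(ffq, record))
```

In a Bland-Altman plot these three values are drawn as horizontal lines over a scatter of each pair's difference against its mean.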
Dantas, Jose Luiz; Pereira, Gleber; Nakamura, Fabio Yuzo
2015-09-01
The five-kilometer time trial (TT5km) has been used to assess aerobic endurance performance without further investigation of its validity. This study aimed to perform a preliminary validation of the TT5km to rank well-trained cyclists based on aerobic endurance fitness and assess changes of the aerobic endurance performance. After the incremental test, 20 cyclists (age = 31.3 ± 7.9 years; body mass index = 22.7 ± 1.5 kg/m²; maximal aerobic power = 360.5 ± 49.5 W) performed the TT5km twice, collecting performance (time to complete, absolute and relative power output, average speed) and physiological responses (heart rate and electromyography activity). The validation criteria were pacing strategy, absolute and relative reliability, validity, and sensitivity. Sensitivity index was obtained from the ratio between the smallest worthwhile change and typical error. The TT5km showed high absolute (coefficient of variation < 3%) and relative (intraclass correlation coefficient > 0.95) reliability of performance variables, whereas it presented low reliability of physiological responses. The TT5km performance variables were highly correlated with the aerobic endurance indices obtained from incremental test (r > 0.70). These variables showed adequate sensitivity index (> 1). TT5km is a valid test to rank the aerobic endurance fitness of well-trained cyclists and to differentiate changes on aerobic endurance performance. Coaches can detect performance changes through either absolute (± 17.7 W) or relative power output (± 0.3 W·kg⁻¹), the time to complete the test (± 13.4 s) and the average speed (± 1.0 km·h⁻¹). Furthermore, TT5km performance can also be used to rank the athletes according to their aerobic endurance fitness.
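The sensitivity index described above is the ratio of the smallest worthwhile change to the typical error. The sketch below illustrates one way to compute it from two trials. The abstract does not give the formulas used, so two conventions are assumed here: typical error as the SD of the trial-to-trial differences divided by √2, and smallest worthwhile change as 0.2 times the between-subject SD. All names and data are hypothetical.

```python
import math
from statistics import stdev


def typical_error(trial1, trial2):
    """Typical error: SD of paired differences divided by sqrt(2) (assumed convention)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    return stdev(diffs) / math.sqrt(2.0)


def smallest_worthwhile_change(trial1, trial2, factor=0.2):
    """SWC as a fraction of the between-subject SD; factor=0.2 is an assumption."""
    subject_means = [(a + b) / 2.0 for a, b in zip(trial1, trial2)]
    return factor * stdev(subject_means)


def sensitivity_index(trial1, trial2):
    """Ratio of smallest worthwhile change to typical error, as defined in the text."""
    return smallest_worthwhile_change(trial1, trial2) / typical_error(trial1, trial2)


if __name__ == "__main__":
    # Made-up mean power outputs (W) for four cyclists over two trials.
    first = [300, 320, 340, 360]
    second = [302, 318, 342, 358]
    print(round(typical_error(first, second), 3))
    print(round(sensitivity_index(first, second), 3))
```

An index above 1 means the smallest change of practical interest exceeds the test's noise, which is the criterion the abstract reports as met.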
Validity and reliability of a controlled pneumatic resistance exercise device.
Paulus, David C; Reynolds, Michael C; Schilling, Brian K
2008-01-01
During the concentric portion of the free-weight squat exercise, accelerating the mass from rest results in a fluctuation in ground reaction force. It is characterized by an initial period of force greater than the load while accelerating from rest followed by a period of force lower than the external load during negative acceleration. During the deceleration phase, less force is exerted and muscles are loaded sub-optimally. Thus, using a reduced inertia form of resistance such as pneumatics has the capability to minimize these inertial effects as well as control the force in real time to maximize the force exerted over the exercise cycle. To improve the system response of a preliminary design, a squat device was designed with a reduced mass barbell and two smaller pneumatic cylinders. The resistance was controlled by regulating cylinder pressure such that it is capable of adjusting force within a repetition to maximize force exerted during the lift. The resistance force production of the machine was statically validated, with input voltage and output force showing R² = 0.9997 at four increments of the range of motion, and the intraclass correlation coefficient (ICC) between trials at the different heights equaled 0.999. The slew rate at three forces was 749.3 ± 252.3 N/s. Dynamic human subject testing showed the desired input force correlated with average and peak ground reaction force with R² = 0.9981 and R² = 0.9315, respectively. The ICC between desired force and average and peak ground reaction force was 0.963. Thus, the system is able to deliver constant levels of static and dynamic force with validity and reliability. Future work will be required to develop the control strategy required for real-time control, and performance testing is required to determine its efficacy.
Measurement and Reliability of Response Inhibition
Congdon, Eliza; Mumford, Jeanette A.; Cohen, Jessica R.; Galvan, Adriana; Canli, Turhan; Poldrack, Russell A.
2012-01-01
Response inhibition plays a critical role in adaptive functioning and can be assessed with the Stop-signal task, which requires participants to suppress prepotent motor responses. Evidence suggests that this ability to inhibit a prepotent motor response (reflected as Stop-signal reaction time (SSRT)) is a quantitative and heritable measure of interindividual variation in brain function. Although attention has been given to the optimal method of SSRT estimation, and initial evidence exists in support of its reliability, there is still variability in how Stop-signal task data are treated across samples. In order to examine this issue, we pooled data across three separate studies and examined the influence of multiple SSRT calculation methods and outlier calling on reliability (using Intra-class correlation). Our results suggest that an approach which uses the average of all available sessions, all trials of each session, and excludes outliers based on predetermined lenient criteria yields reliable SSRT estimates, while not excluding too many participants. Our findings further support the reliability of SSRT, which is commonly used as an index of inhibitory control, and provide support for its continued use as a neurocognitive phenotype. PMID:22363308
Mazloum, A; Johnston, M; Lundrigan, M; Birmingham, C L
2008-12-01
Non-exercise activity thermogenesis (NEAT) is the energy expended by body movement, other than sleeping, eating or sports-like activities. The obese have been reported to have a lower NEAT (walking, standing, and fidgeting) than controls. We hypothesize that an elevated NEAT could explain why some patients with anorexia nervosa are resistant to weight gain. To evaluate the interrater reliability of a rating of non-exercise activity of inpatients with eating disorders (ED) using a visual analogue scale (VAS). Health care providers were asked to rate the non-exercise activity of inpatients by marking a VAS. Eight patients were individually rated by 10 clinicians. Results were analyzed using the intraclass correlation coefficient (ICC) and Cohen's multi-rater kappa statistic (kappa). The ICC(3,k) was 0.257 (p<0.01) and 0.708 (p<0.01) for average measures. The ratings of NEAT using a VAS were not reliable between clinicians. This indicates that the ward staff, even on a specialized ED unit, cannot reliably estimate non-exercise activity and physiological measurements should be used.
Using Cluster Bootstrapping to Analyze Nested Data With a Few Clusters.
Huang, Francis L
2018-04-01
Cluster randomized trials involving participants nested within intact treatment and control groups are commonly performed in various educational, psychological, and biomedical studies. However, recruiting and retaining intact groups present various practical, financial, and logistical challenges to evaluators and often, cluster randomized trials are performed with a low number of clusters (~20 groups). Although multilevel models are often used to analyze nested data, researchers may be concerned of potentially biased results due to having only a few groups under study. Cluster bootstrapping has been suggested as an alternative procedure when analyzing clustered data though it has seen very little use in educational and psychological studies. Using a Monte Carlo simulation that varied the number of clusters, average cluster size, and intraclass correlations, we compared standard errors using cluster bootstrapping with those derived using ordinary least squares regression and multilevel models. Results indicate that cluster bootstrapping, though more computationally demanding, can be used as an alternative procedure for the analysis of clustered data when treatment effects at the group level are of primary interest. Supplementary material showing how to perform cluster bootstrapped regressions using R is also provided.
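The article above supplies its own R material for cluster bootstrapped regressions; that material is not reproduced here. As a simplified illustration of the resampling idea only, the following sketch draws whole clusters with replacement and estimates the standard error of an arbitrary statistic. The function name, its parameters, and the data are assumptions for this example, and a plain mean stands in for the regression coefficients studied in the article.

```python
import random
from statistics import mean, stdev


def cluster_bootstrap_se(clusters, statistic, n_boot=2000, seed=0):
    """Bootstrap standard error of `statistic`, resampling whole clusters.

    clusters: list of clusters, each a list of observations (hypothetical layout).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Draw as many clusters as observed, with replacement, keeping each intact.
        resampled = [rng.choice(clusters) for _ in clusters]
        pooled = [y for cluster in resampled for y in cluster]
        estimates.append(statistic(pooled))
    return stdev(estimates)


if __name__ == "__main__":
    # Made-up outcome scores for four small clusters (e.g. classrooms).
    data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
    print(cluster_bootstrap_se(data, mean, n_boot=500, seed=1))
```

Resampling clusters rather than individual observations keeps the within-cluster dependence intact in every bootstrap sample, which is what distinguishes this procedure from an ordinary bootstrap.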
Welsh, A W; Hou, M; Meriki, N; Martins, W P
2012-10-01
Volumetric impedance indices derived from spatiotemporal image correlation (STIC) power Doppler ultrasound (PDU) might overcome the influence of machine settings and attenuation. We examined the feasibility of obtaining these indices from spherical samples of anterior placentas in healthy pregnancies, and assessed intraobserver reliability and correlation with conventional umbilical artery (UA) impedance indices. Uncomplicated singleton pregnancies with anterior placenta were included in the study. A single observer evaluated UA pulsatility index (PI), resistance index (RI) and systolic/diastolic ratio (S/D) and acquired three STIC-PDU datasets from the placenta just above the placental cord insertion. Another observer analyzed the STIC-PDU datasets using Virtual Organ Computer-aided AnaLysis (VOCAL) spherical samples from every frame to determine the vascularization index (VI) and vascularization flow index (VFI); maximum, minimum and average values were used to determine the three volumetric impedance indices (vPI, vRI, vS/D). Intraobserver reliability was examined by intraclass correlation coefficients (ICC) and association between volumetric indices from placenta, and UA Doppler indices were assessed by Pearson's correlation coefficient. A total of 25 pregnant women were evaluated but five were excluded because of artifacts observed during analysis. The reliability of measurement of volumetric indices of both VI and VFI from three STIC-PDU datasets was similar, with all ICCs ≥ 0.78. Pearson's r values showed a weak and non-significant correlation between UA pulsed-wave Doppler indices and their respective volumetric indices from spherical samples of placenta (all r ≥ 0.23). VOCAL indices from specific phases of the cardiac cycle showed good repeatability (ICC ≥ 0.92). Volumetric impedance indices determined from spherical samples of placenta are sufficiently reliable but do not correlate with UA Doppler indices in healthy pregnancies. Copyright © 2012 ISUOG. 
Published by John Wiley & Sons, Ltd.
Intra- and interobserver reliability of quantitative ultrasound measurement of the plantar fascia.
Rathleff, Michael Skovdal; Moelgaard, Carsten; Lykkegaard Olesen, Jens
2011-01-01
To determine intra- and interobserver reliability and measurement precision of sonographic assessment of plantar fascia thickness when using one, the mean of two, or the mean of three measurements. Two experienced observers scanned 20 healthy subjects twice with 60 minutes between test and retest. A GE LOGIQe ultrasound scanner was used in the study. The built-in software in the scanner was used to measure the thickness of the plantar fascia (PF). Reliability was calculated using intraclass correlation coefficient (ICC) and limits of agreement (LOA). Intraobserver reliability (ICC) using one measurement was 0.50 for one observer and 0.52 for the other, and using the mean of three measurements intraobserver reliability increased up to 0.77 and 0.67, respectively. Interobserver reliability (ICC) when using one measurement was 0.62 and increased to 0.82 when using the average of three measurements. LOA showed that when using the average of three measurements, LOA decreased to 0.6 mm, corresponding to 17.5% of the mean thickness of the PF. The results showed that reliability increases when using the mean of three measurements compared with one. Limits of agreement based on intratester reliability shows that changes in thickness that are larger than 0.6 mm can be considered actual changes in thickness and not a result of measurement error. Copyright © 2011 Wiley Periodicals, Inc.
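The gain in reliability from averaging repeated measurements is commonly described by the Spearman-Brown relation. The abstract does not state that its averaged ICCs were obtained this way, so the sketch below is only an illustration of the expected trend; the function name `spearman_brown` is an assumption.

```python
def spearman_brown(icc_single, k):
    """Predicted reliability of the mean of k measurements from a single-measurement ICC."""
    return k * icc_single / (1.0 + (k - 1) * icc_single)


if __name__ == "__main__":
    # Single-measurement ICCs quoted in the abstract, projected to the mean of three.
    for icc in (0.50, 0.52, 0.62):
        print(icc, round(spearman_brown(icc, 3), 2))
```

For the interobserver value of 0.62 the relation predicts about 0.83 for the mean of three measurements, close to the reported 0.82; the intraobserver predictions (0.75 and about 0.76) match the reported 0.77 and 0.67 less closely, as would be expected if the repeated measurements are not strictly parallel.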
Reliability of a store observation tool in measuring availability of alcohol and selected foods.
Cohen, Deborah A; Schoeff, Diane; Farley, Thomas A; Bluthenthal, Ricky; Scribner, Richard; Overton, Adrian
2007-11-01
Alcohol and food items can compromise or contribute to health, depending on the quantity and frequency with which they are consumed. How much people consume may be influenced by product availability and promotion in local retail stores. We developed and tested an observational tool to objectively measure in-store availability and promotion of alcoholic beverages and selected food items that have an impact on health. Trained observers visited 51 alcohol outlets in Los Angeles and southeastern Louisiana. Using a standardized instrument, two independent observations were conducted documenting the type of outlet, the availability and shelf space for alcoholic beverages and selected food items, the purchase price of standard brands, the placement of beer and malt liquor, and the amount of in-store alcohol advertising. Reliability of the instrument was excellent for measures of item availability, shelf space, and placement of malt liquor. Reliability was lower for alcohol advertising, beer placement, and items that measured the "least price" of apples and oranges. The average kappa was 0.87 for categorical items and the average intraclass correlation coefficient was 0.83 for continuous items. Overall, systematic observation of the availability and promotion of alcoholic beverages and food items was feasible, acceptable, and reliable. Measurement tools such as the one we evaluated should be useful in studies of the impact of availability of food and beverages on consumption and on health outcomes.
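The kappa statistic reported above for categorical items corrects observed agreement between two observers for the agreement expected by chance. The following is a minimal sketch of Cohen's kappa for two raters; the function name and the example codes are made up for illustration.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters' categorical codes on the same items.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up codes: whether an item was judged available in each of six stores.
    obs_a = ["yes", "yes", "no", "no", "yes", "no"]
    obs_b = ["yes", "no", "no", "no", "yes", "no"]
    print(round(cohen_kappa(obs_a, obs_b), 3))
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than chance.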
Computerized Liver Volumetry on MRI by Using 3D Geodesic Active Contour Segmentation
Huynh, Hieu Trung; Karademir, Ibrahim; Oto, Aytekin; Suzuki, Kenji
2014-01-01
OBJECTIVE Our purpose was to develop an accurate automated 3D liver segmentation scheme for measuring liver volumes on MRI. SUBJECTS AND METHODS Our scheme for MRI liver volumetry consisted of three main stages. First, the preprocessing stage was applied to T1-weighted MRI of the liver in the portal venous phase to reduce noise and produce the boundary-enhanced image. This boundary-enhanced image was used as a speed function for a 3D fast-marching algorithm to generate an initial surface that roughly approximated the shape of the liver. A 3D geodesic-active-contour segmentation algorithm refined the initial surface to precisely determine the liver boundaries. The liver volumes determined by our scheme were compared with those manually traced by a radiologist, used as the reference standard. RESULTS The two volumetric methods reached excellent agreement (intraclass correlation coefficient, 0.98) with no statistically significant difference (p = 0.42). The average (± SD) accuracy was 99.4% ± 0.14%, and the average Dice overlap coefficient was 93.6% ± 1.7%. The mean processing time for our automated scheme was 1.03 ± 0.13 minutes, whereas that for manual volumetry was 24.0 ± 4.4 minutes (p < 0.001). CONCLUSION The MRI liver volumetry based on our automated scheme agreed excellently with reference-standard volumetry, and it required substantially less completion time. PMID:24370139
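The Dice overlap coefficient reported above measures how well two segmentations coincide. As a minimal sketch, assuming each segmentation is represented as a set of voxel coordinates (a hypothetical representation chosen for brevity, not the authors' implementation), it can be computed as follows.

```python
def dice_coefficient(segmentation_a, segmentation_b):
    """Dice overlap: 2 * |A intersect B| / (|A| + |B|) for two sets of voxels."""
    a, b = set(segmentation_a), set(segmentation_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Made-up voxel coordinates (x, y, z) for an automated and a manual segmentation.
    automated = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)}
    manual = {(0, 1, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0)}
    print(dice_coefficient(automated, manual))
```

A value of 1 means the two segmentations cover exactly the same voxels, and 0 means they share none.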
The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda.
Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert
2008-12-02
The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda.
The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda
Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert
2008-01-01
Background The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. Methods A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. Results The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. Conclusion This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda. PMID:19055716
Development and validation of the Myasthenia Gravis Impairment Index.
Barnett, Carolina; Bril, Vera; Kapral, Moira; Kulkarni, Abhaya; Davis, Aileen M
2016-08-30
We aimed to develop a measure of myasthenia gravis impairment using a previously developed framework and to evaluate reliability and validity, specifically face, content, and construct validity. The first draft of the Myasthenia Gravis Impairment Index (MGII) included examination items from available measures enriched with newly developed, patient-reported items, modified after patient input. International neuromuscular specialists evaluated face and content validity via an e-mail survey. Test-retest reliability was assessed in stable patients at a 3-week interval and interrater reliability was evaluated in the same day. Construct validity was assessed through correlations between the MGII and other measures and by comparing scores in different patient groups. The first draft was assessed by 18 patients, and 72 specialists answered the survey. The second draft had 7 examination and 22 patient-reported items. Field testing included 200 patients, with 54 patients completing the reliability studies. Test-retest reliability of the total score was good (intraclass correlation coefficient 0.92; 95% confidence interval 0.79-0.94), as was interrater reliability of the examination component (intraclass correlation coefficient 0.81; 95% confidence interval 0.79-0.94). The MGII correlated well with comparison measures, with higher correlations with the MG-activities of daily living (r = 0.91) and MG-specific quality of life 15-item scale (r = 0.78). When assessing different patient groups, the scores followed expected patterns. The MGII was developed using a patient-centered framework of myasthenia-related impairments and incorporating patient input throughout the development process. It is reliable in an outpatient setting and has demonstrated construct validity. Responsiveness studies are under way. © 2016 American Academy of Neurology.
Li, Weiguo; Zhang, Zhuoli; Gordon, Andrew C.; Chen, Jeane; Nicolai, Jodi; Lewandowski, Robert J.; Omary, Reed A.
2016-01-01
Purpose To investigate the qualitative and quantitative impacts of labeling yttrium microspheres with increasing amounts of superparamagnetic iron oxide (SPIO) material for magnetic resonance (MR) imaging in phantom and rodent models. Materials and Methods Animal model studies were approved by the institutional Animal Care and Use Committee. The r2* relaxivity for each of four microsphere SPIO compositions was determined from 32 phantoms constructed with agarose gel and in eight concentrations from each of the four compositions. Intrahepatic transcatheter infusion procedures were performed in rats by using each of the four compositions before MR imaging to visualize distributions within the liver. For quantitative studies, doses of 5, 10, 15, or 20 mg 2% SPIO-labeled yttrium microspheres were infused into 24 rats (six rats per group). MR imaging R2* measurements were used to quantify the dose delivered to each liver. Pearson correlation, analysis of variance, and intraclass correlation analyses were performed to compare MR imaging measurements in phantoms and animal models. Results Increased r2* relaxivity was observed with incremental increases of SPIO microsphere content. R2* measurements of the 2% SPIO–labeled yttrium microsphere concentration were well correlated with known phantom concentrations (R2 = 1.00, P < .001) over a broader linear range than observed for the other three compositions. Microspheres were heterogeneously distributed within each liver; increasing microsphere SPIO content produced marked signal voids. R2*-based measurements of 2% SPIO–labeled yttrium microsphere delivery were well correlated with infused dose (intraclass correlation coefficient, 0.98; P < .001). Conclusion MR imaging R2* measurements of yttrium microspheres labeled with 2% SPIO can quantitatively depict in vivo intrahepatic biodistribution in a rat model. © RSNA, 2015 Online supplemental material is available for this article. PMID:26313619
Kim, Hee-Ju
2017-03-01
This study aimed to evaluate the reliability and validity of the Korean version of the Mini-Sleep Questionnaire-Insomnia in Korean college students. A total of 470 students from six nursing colleges in South Korea participated in the study. The translation and linguistic validation of the Mini-Sleep Questionnaire-Insomnia was performed based on guidelines. The Pittsburgh Sleep Quality Index and the Perceived Stress Scale were used to validate the measure. Cronbach α, item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability were evaluated. Exploratory factor analysis for construct validity, Pearson's correlation with the Pittsburgh Sleep Quality Index and the Perceived Stress Scale for concurrent validity, and the receiver operating characteristic curve for predictive validity were assessed. The 4-item Mini-Sleep Questionnaire-Insomnia had a Cronbach α of .69 and the item-total correlations were higher than .30. Cronbach α increased to .73 if the item assessing the use of sleeping pills and tranquilizers was deleted. This item had marked skewness and kurtosis issues. Factor analysis indicated unidimensionality, explaining 53.0% of the total variance. The measure showed high test-retest reliability (i.e., intraclass correlation coefficient = .84), acceptable concurrent validity (r with the Pittsburgh Sleep Quality Index = .69; r with the Perceived Stress Scale = .31) and predictive validity [area under curve = .85; 95% confidence interval (0.81, 0.90)]. The Mini-Sleep Questionnaire-Insomnia showed acceptable reliability and validity. Yet, the limited distribution in sleep medications warrants further evaluations in the clinical population. Copyright © 2017. Published by Elsevier B.V.
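The internal-consistency figures in the abstract above (Cronbach α, and α after deleting one item) can be illustrated with a minimal sketch. It assumes the standard formula α = k/(k-1) · (1 - sum of item variances / variance of the total score). The function names and the toy scores are hypothetical and are not the study's data or software.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(scores):
    """Alpha recomputed k times, each time leaving one item out."""
    k = len(scores[0])
    return [
        cronbach_alpha([[x for i, x in enumerate(row) if i != j] for row in scores])
        for j in range(k)
    ]


# Hypothetical 4-item questionnaire answered by five respondents.
toy_scores = [
    [3, 4, 3, 1],
    [2, 2, 3, 1],
    [5, 4, 5, 2],
    [1, 2, 1, 1],
    [4, 5, 4, 1],
]
print(cronbach_alpha(toy_scores))
print(alpha_if_item_deleted(toy_scores))
```

An item whose removal raises α, as reported for the sleeping-pill item, is one that shares little variance with the rest of the scale.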
Development and validation of the Myasthenia Gravis Impairment Index
Bril, Vera; Kapral, Moira; Kulkarni, Abhaya; Davis, Aileen M.
2016-01-01
Objective: We aimed to develop a measure of myasthenia gravis impairment using a previously developed framework and to evaluate reliability and validity, specifically face, content, and construct validity. Methods: The first draft of the Myasthenia Gravis Impairment Index (MGII) included examination items from available measures enriched with newly developed, patient-reported items, modified after patient input. International neuromuscular specialists evaluated face and content validity via an e-mail survey. Test–retest reliability was assessed in stable patients at a 3-week interval and interrater reliability was evaluated in the same day. Construct validity was assessed through correlations between the MGII and other measures and by comparing scores in different patient groups. Results: The first draft was assessed by 18 patients, and 72 specialists answered the survey. The second draft had 7 examination and 22 patient-reported items. Field testing included 200 patients, with 54 patients completing the reliability studies. Test–retest reliability of the total score was good (intraclass correlation coefficient 0.92; 95% confidence interval 0.79–0.94), as was interrater reliability of the examination component (intraclass correlation coefficient 0.81; 95% confidence interval 0.79–0.94). The MGII correlated well with comparison measures, with higher correlations with the MG–activities of daily living (r = 0.91) and MG-specific quality of life 15-item scale (r = 0.78). When assessing different patient groups, the scores followed expected patterns. Conclusions: The MGII was developed using a patient-centered framework of myasthenia-related impairments and incorporating patient input throughout the development process. It is reliable in an outpatient setting and has demonstrated construct validity. Responsiveness studies are under way. PMID:27402891
Cross-Cultural Adaptation and Validation of the Back Beliefs Questionnaire to the Arabic Language.
Alamrani, Samia; Alsobayel, Hana; Alnahdi, Ali H; Moloney, Niamh; Mackey, Martin
2016-06-01
Translation, cross-cultural adaptation, and psychometric testing. To translate the Back Beliefs Questionnaire (BBQ) into Arabic and investigate its psychometric properties in an Arabic-speaking sample of individuals with low back pain (LBP). Back pain beliefs are associated with pain chronicity and disability in people with LBP. The BBQ is a recognized and frequently used tool for measuring these beliefs. To date the BBQ has not been translated into Arabic. The English version of the BBQ was translated and culturally adapted into Arabic (BBQ-Ar) according to published guidelines. The BBQ-Ar was then tested in a sample of 115 Arabic-speaking individuals with LBP. Reliability was evaluated through internal consistency (Cronbach α) and test-retest reliability (intraclass correlation coefficient), the latter in a subgroup of 25. Construct validity was assessed using exploratory factor analysis and by examining the correlation between the BBQ-Ar, the Oswestry Disability Index and a Numerical Pain Rating Scale. Internal consistency of the BBQ-Ar was good (Cronbach α = 0.77). Test-retest reliability was good (intraclass correlation coefficient [2,1] = 0.88). Exploratory factor analysis revealed a three-factor structure, explaining 46% of total variance, with the first factor alone explaining 24%. Eight of the nine scoring items were loaded on the first factor thus forming a unidimensional scale. A significant negative correlation was found between Oswestry Disability Index and BBQ-Ar scores (r = -0.307; P < 0.01), whereas no significant correlation was found between BBQ-Ar and Pain Rating Scale scores. No floor or ceiling effects were observed. The BBQ-Ar is a valid and reliable tool that can be used to assess back pain beliefs in Arabic-speaking individuals. N/A.
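The abstract above reports an intraclass correlation coefficient of type [2,1]. As a rough sketch of what that label usually denotes, the code below computes the two-way random-effects, absolute-agreement, single-measurement ICC from ANOVA mean squares: ICC(2,1) = (MSR - MSE) / (MSR + (k-1)·MSE + k·(MSC - MSE)/n). The function name and the toy test-retest scores are hypothetical; the study's own data and software are not given in the abstract.

```python
def icc_2_1(ratings):
    """ICC(2,1) for n subjects (rows) measured on k occasions or by k raters (columns)."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-occasions (or raters) mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical test-retest scores: five subjects, two administrations.
toy_retest = [
    [30, 32],
    [25, 24],
    [38, 37],
    [20, 23],
    [33, 33],
]
print(icc_2_1(toy_retest))
```

Because the between-occasions mean square enters the denominator, a systematic shift between the first and second administration lowers this coefficient, which is why it is read as a measure of agreement and not only of consistency.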
Nair, Rahul; Tsakos, Georgios; Yee Ting Fai, Robert
2016-12-01
To cross-culturally adapt the oral impacts on daily performance (OIDP) and assess its reliability and validity on Chinese-speaking community dwelling elderly Singaporeans. There are no previous reports of valid oral health-related quality of life instruments for elderly Singaporeans or perceived conditions associated with impacts reported in OIDP among the Singaporean elders. The OIDP was translated from English to Chinese and then back translated. The OIDP questionnaire along with questions related to overall quality of life and self-rated dental health was administered to 202 Chinese-speaking elderly Singaporeans by trained interviewers, and it was repeated after 1 month. Test-retest reliability was assessed using intraclass correlation coefficient; internal consistency was established using Cronbach's alpha, and construct validity using correlation coefficients with self-reported oral health-related and global quality of life measures. In addition, Kruskal-Wallis tests assessed differences in the OIDP score between different subjective health and global quality of life groups. The median age of participants was 75 years. About 19% reported oral impacts and difficulty eating was the most prevalent oral impact. Internal consistency was good with a Cronbach's alpha of 0.75, and the intraclass correlation coefficient was 0.75 (0.67-0.81). OIDP was significantly correlated with all measures of self-reported oral health and global ratings of quality of life, with correlation coefficients ranging between 0.15 and 0.52. Groups with worse perceptions about their health and quality of life had significantly higher OIDP scores. The OIDP showed successful reliability and validity for its use among Chinese-speaking older Singaporeans. © 2015 John Wiley & Sons A/S and The Gerodontology Association. Published by John Wiley & Sons Ltd.
Starling, Anne P.; Engel, Lawrence S.; Calafat, Antonia M.; Koutros, Stella; Satagopan, Jaya M.; Yang, Gong; Matthews, Charles E.; Cai, Qiuyin; Buckley, Jessie P.; Ji, Bu-Tian; Cai, Hui; Chow, Wong-Ho; Zheng, Wei; Gao, Yu-Tang; Rothman, Nathaniel; Xiang, Yong-Bing; Shu, Xiao-Ou
2015-01-01
Phthalate esters are man-made chemicals commonly used as plasticizers and solvents, and humans may be exposed through ingestion, inhalation, and dermal absorption. Little is known about predictors of phthalate exposure, particularly in Asian countries. Because phthalates are rapidly metabolized and excreted from the body following exposure, it is important to evaluate whether phthalate metabolites measured at a single point in time can reliably rank exposures to phthalates over a period of time. We examined the concentrations and predictors of phthalate metabolite concentrations among 50 middle-aged women and 50 men from two Shanghai cohorts, enrolled in 1997-2000 and 2002-2006, respectively. We assessed the reproducibility of urinary concentrations of phthalate metabolites in three spot samples per participant taken several years apart (mean interval between first and third sample was 7.5 years [women] or 2.9 years [men]), using Spearman's rank correlation coefficients and intra-class correlation coefficients. We detected ten phthalate metabolites in at least 50% of individuals for two or more samples. Participant sex, age, menopausal status, education, income, body mass index, consumption of bottled water, recent intake of medication, and time of day of collection of the urine sample were associated with concentrations of certain phthalate metabolites. The reproducibility of an individual's urinary concentration of phthalate metabolites across several years was low, with all intra-class correlation coefficients and most Spearman rank correlation coefficients ≤ 0.3. Only mono(2-ethylhexyl) phthalate, a metabolite of di(2-ethylhexyl)phthalate, had a Spearman rank correlation coefficient ≥ 0.4 among men, suggesting moderate reproducibility. These findings suggest that a single spot urine sample is not sufficient to rank exposures to phthalates over several years in an adult urban Chinese population. PMID:26255822
Developing Validity Evidence for the Written Pediatric History and Physical Exam Evaluation Rubric.
King, Marta A; Phillipi, Carrie A; Buchanan, Paula M; Lewin, Linda O
The written history and physical examination (H&P) is an underutilized source of medical trainee assessment. The authors describe development and validity evidence for the Pediatric History and Physical Exam Evaluation (P-HAPEE) rubric: a novel tool for evaluating written H&Ps. Using an iterative process, the authors drafted, revised, and implemented the 10-item rubric at 3 academic institutions in 2014. Eighteen attending physicians and 5 senior residents each scored 10 third-year medical student H&Ps. Inter-rater reliability (IRR) was determined using intraclass correlation coefficients. Cronbach α was used to report consistency and Spearman rank-order correlations to determine relationships between rubric items. Raters provided a global assessment, recorded time to review and score each H&P, and completed a rubric utility survey. Overall intraclass correlation was 0.85, indicating adequate IRR. Global assessment IRR was 0.89. IRR for low- and high-quality H&Ps was significantly greater than for medium-quality ones but did not differ on the basis of rater category (attending physician vs. senior resident), note format (electronic health record vs. nonelectronic), or student diagnostic accuracy. Cronbach α was 0.93. The highest correlation between an individual item and total score was for assessment (0.84); the highest interitem correlation was between assessment and differential diagnosis (0.78). Mean time to review and score an H&P was 16.3 minutes; residents took significantly longer than attending physicians. All raters described rubric utility as "good" or "very good" and endorsed continued use. The P-HAPEE rubric offers a novel, practical, reliable, and valid method for supervising physicians to assess pediatric written H&Ps. Copyright © 2016 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
Benitez-Rosario, Miguel Angel; Caceres-Miranda, Raquel; Aguirre-Jaime, Armando
2016-03-01
A reliable and valid measure of the structure and process of end-of-life care is important for improving the outcomes of care. This study evaluated the validity and reliability of the Spanish adaptation of a satisfaction tool of the Care Evaluation Scale (CES), which was developed in Japan to evaluate palliative care structure and process from the perspective of family members. Standard forward-backward translation and a pilot test were conducted. A multicenter survey was conducted with the relatives of patients admitted to palliative care units for symptom control. The dimensional structure was assessed using confirmatory factor analyses. Concurrent and discriminant validity were tested by correlation with the SERQVHOS, a Spanish hospital care satisfaction scale and with an 11-point rating scale on satisfaction with care. The reliability of the CES was tested by Cronbach α and by test-retest correlation. A total of 284 primary caregivers completed the CES, with low missing response rates. The results of the factor analysis suggested a six-factor solution explaining 69% of the total variance. The CES moderately correlated with the SERQVHOS and with the overall satisfaction scale (intraclass correlation coefficients of 0.66 and 0.44, respectively; P = 0.001). Cronbach α was 0.90 overall and ranged from 0.85 to 0.89 for subdomains. Intraclass correlation coefficient was 0.88 (P = 0.001) for test-retest analysis. The Spanish CES was found to be a reliable and valid measure of the satisfaction with end-of-life care structure and process from family members' perspectives. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Chalian, Hamid; Seyal, Adeel Rahim; Rezai, Pedram; Töre, Hüseyin Gürkan; Miller, Frank H; Bentrem, David J; Yaghmai, Vahid
2014-01-10
The accuracy for determining pancreatic cyst volume with commonly used spherical and ellipsoid methods is unknown. The role of CT volumetry in volumetric assessment of pancreatic cysts needs to be explored. To compare volumes of the pancreatic cysts by CT volumetry, spherical and ellipsoid methods and determine their accuracy by correlating with actual volume as determined by EUS-guided aspiration. Setting This is a retrospective analysis performed at a tertiary care center. Patients Seventy-eight pathologically proven pancreatic cysts evaluated with CT and endoscopic ultrasound (EUS) were included. Design The volume of fourteen cysts that had been fully aspirated by EUS was compared to CT volumetry and the routinely used methods (ellipsoid and spherical volume). Two independent observers measured all cysts using commercially available software to evaluate inter-observer reproducibility for CT volumetry. The volume of pancreatic cysts as determined by various methods was compared using repeated measures analysis of variance. Bland-Altman plot and intraclass correlation coefficient were used to determine mean difference and correlation between observers and methods. The error was calculated as the percentage of the difference between the CT estimated volumes and the aspirated volume divided by the aspirated one. CT volumetry was comparable to aspirated volume (P=0.396) with very high intraclass correlation (r=0.891, P<0.001) and small mean difference (0.22 mL) and error (8.1%). Mean difference with aspirated volume and error were larger for ellipsoid (0.89 mL, 30.4%; P=0.024) and spherical (1.73 mL, 55.5%; P=0.004) volumes than CT volumetry. There was excellent inter-observer correlation in volumetry of the entire cohort (r=0.997, P<0.001). CT volumetry is accurate and reproducible. Ellipsoid and spherical volume overestimate the true volume of pancreatic cysts.
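The error measure described in the abstract above (difference between the CT-estimated and aspirated volume, divided by the aspirated volume, as a percentage) and the Bland-Altman mean difference can be sketched as follows. Taking the absolute value of the difference and using 1.96 standard deviations for the limits of agreement are assumptions, since the abstract does not spell them out. The function names and volumes are hypothetical.

```python
from statistics import mean, stdev


def percent_error(estimated, reference):
    """Difference from the reference volume, as a percentage of the reference."""
    # Assumption: the unsigned (absolute) difference is used.
    return abs(estimated - reference) / reference * 100.0


def bland_altman(method_a, method_b):
    """Mean difference (bias) and approximate 95% limits of agreement."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # assumption: conventional 1.96 SD limits
    return bias, bias - spread, bias + spread


# Hypothetical cyst volumes in mL: CT volumetry vs. aspirated volume.
ct_volume = [10.8, 5.2, 21.5, 3.1]
aspirated = [10.0, 5.0, 20.0, 3.0]

errors = [percent_error(c, a) for c, a in zip(ct_volume, aspirated)]
print(mean(errors))
print(bland_altman(ct_volume, aspirated))
```

A positive bias in this sketch means the first method reads higher on average than the second, which is the sense in which the abstract says the ellipsoid and spherical formulas overestimate the aspirated volume.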
Normal values of 3 methods to determine patellar height in children from 6 to 12 years.
Vergara-Amador, E; Davalos Herrera, D; Guevara, O A
2018-03-26
The aim of the study was to compare three methods for patellar height measurement in children, Caton-Deschamps, Blackburne-Peel and Koshino-Sugimoto, to determine the normal value of each method in a group of normal children. A cross-sectional study on knee x-rays of normal children. Three orthopaedic surgeons measured the Caton-Deschamps, Blackburne-Peel and Koshino-Sugimoto indices. Concordance was assessed using the intraclass correlation coefficient. For interobserver variability, the measurements of each observer for each index were compared and for intraobserver variability, the coefficient between the 2 measurements was calculated by the same observer at 2 different times. 140 knee X-rays divided into 4 age groups were obtained. For the Blackburne-Peel index, an average median of the 3 observers was obtained of 1.07 and with P5-P95 (0.76-1.60). For the Caton-Deschamps index, an average median of the three observers of 1.22 was obtained and with P5-P95 (0.91-1.70). For the Koshino-Sugimoto index, we obtained an average median of the 3 observers of 1.16 and with P5-P95 (0.99-1.36). This study shows that the Koshino-Sugimoto index had the highest reliability, reproducibility and similarity in the population studied, both intra-observer and inter-observer. The other methods evaluated also had variability indices to be taken into account, but were inferior to the Koshino-Sugimoto index. Copyright © 2018 SECOT. Publicado por Elsevier España, S.L.U. All rights reserved.
Reliable scar scoring system to assess photographs of burn patients.
Mecott, Gabriel A; Finnerty, Celeste C; Herndon, David N; Al-Mousawi, Ahmed M; Branski, Ludwik K; Hegde, Sachin; Kraft, Robert; Williams, Felicia N; Maldonado, Susana A; Rivero, Haidy G; Rodriguez-Escobar, Noe; Jeschke, Marc G
2015-12-01
Several scar-scoring scales exist to clinically monitor burn scar development and maturation. Although scoring scars through direct clinical examination is ideal, scars must sometimes be scored from photographs. No scar scale currently exists for the latter purpose. We modified a previously described scar scale (Yeong et al., J Burn Care Rehabil 1997) and tested the reliability of this new scale in assessing burn scars from photographs. The new scale consisted of three parameters as follows: scar height, surface appearance, and color mismatch. Each parameter was assigned a score of 1 (best) to 4 (worst), generating a total score of 3-12. Five physicians with burns training scored 120 representative photographs using the original and modified scales. Reliability was analyzed using coefficient of agreement, Cronbach alpha, intraclass correlation coefficient, variance, and coefficient of variance. Analysis of variance was performed using the Kruskal-Wallis test. Color mismatch and scar height scores were validated by analyzing actual height and color differences. The intraclass correlation coefficient, the coefficient of agreement, and Cronbach alpha were higher for the modified scale than those of the original scale. The original scale produced more variance than that in the modified scale. Subanalysis demonstrated that, for all categories, the modified scale had greater correlation and reliability than the original scale. The correlation between color mismatch scores and actual color differences was 0.84 and between scar height scores and actual height was 0.81. The modified scar scale is a simple, reliable, and useful scale for evaluating photographs of burn patients. Copyright © 2015 Elsevier Inc. All rights reserved.
Translation and validation of chronic liver disease questionnaire (CLDQ) in Tamil language.
Goel, Amit; Arivazhagan, Karunanithi; Sasi, Avani; Shanmugam, Vanathy; Koshi, Seleena; Pottakkat, Biju; Lakshmi, C P; Awasthi, Ashish
2017-05-01
Chronic liver disease questionnaire (CLDQ), a self-administered quality-of-life (QOL) instrument for chronic liver disease (CLD) patients, was originally developed in English language. We aimed to translate and validate CLDQ in Tamil language (CLDQ-T). CLDQ-T, prepared by two forward and two backward independent translations by four bilingual (Tamil and English) persons, and repeated iterative modifications, was validated in adult, native-Tamil patients with CLD. CLDQ-T was re-tested in some patients 2 weeks later. Convergent validity was assessed using Spearman's correlation, and discriminant validity by comparison with World Health Organization's brief QOL tool (WHOQOL-BREF). Reliability was assessed through internal consistency (Cronbach's alpha) and test-retest reliability (intra-class correlation). Cutoff used for statistical significance was p<0.05. The study included 126 patients (age: mean [SD] 46 years [12.5]; male 104; cause: alcohol 42%, HBV 25%, HCV 4%, cryptogenic 29%; CTP class A 47%, B 37%, and C 16%). In convergent validity, all domains except the "abdominal domain" showed significant correlation between CLDQ-T and WHOQOL-BREF. Patients with severe disease had lower scores for all domains of CLDQ-T except the "abdominal" domain, but not for any of the domains for WHOQOL-BREF. Overall Cronbach's alpha was 0.942, and more than 0.7 for all the individual domains except the "activity" domain. On retesting in 44 (35%) patients, intraclass correlation coefficient was 0.879 for the overall CLDQ-T score and >0.700 for individual domains. CLDQ-T was easily understood and showed good performance characteristics in assessing QOL in Tamil-speaking patients with CLD.
Using the CanMEDS roles when interviewing for an ophthalmology residency program.
Hamel, Patrick; Boisjoly, Hélène; Corriveau, Christine; Fallaha, Nicole; Lahoud, Salim; Luneau, Katie; Olivier, Sébastien; Rouleau, Jacinthe; Toffoli, Daniela
2007-04-01
To improve the admissions process for the Université de Montréal (UdeM) ophthalmology residency program, the interview structure was modified to encompass the seven CanMEDS roles introduced by the Royal College of Physicians and Surgeons of Canada (RCPSC). These roles include an applicant's abilities as a communicator, collaborator, manager, health advocate, professional, scholar, and medical expert. In this retrospective pilot study, the records of all applicants were reviewed by 8 members of the admissions committee, with a high intraclass correlation coefficient of 0.814. Four 2-person interview teams were then formed. The first 3 groups asked the applicants specific questions based on 2-3 of the CanMEDS roles, marking their impressions of each candidate on a visual analogue scale. The last group answered candidates' questions about the program but assigned no mark. The intraclass correlations for the teams were 0.900, 0.739, and 0.585, demonstrating acceptable interrater reliability for 2 of the teams. Pearson correlation coefficients between groups of interviewers were considered adequate at 0.562, 0.432, and 0.417 (p < 0.05). For each interviewer, the Pearson correlation coefficient between record marking and interview scoring was either not statistically significant or very low. By basing the 2006 interview process on the CanMEDS roles defined by the RCPSC, information was obtained about the candidates that could not have been retrieved by a review of the medical students' records alone. Reliability analysis confirmed that this new method of conducting interviews provided sound and reliable judging and rating consistency between all members of the admissions committee.
Wehrli, Martina; Hensler, Stefanie; Schindele, Stephan; Herren, Daniel B; Marks, Miriam
2016-09-01
The brief Michigan Hand Outcomes Questionnaire (briefMHQ) was developed as a shorter version of the Michigan Hand Outcomes Questionnaire (MHQ), but its measurement properties have not been investigated in patients with Dupuytren contracture. The objective of the study was to investigate the reliability, validity, responsiveness, and interpretability of the briefMHQ. Fifty-seven patients diagnosed with Dupuytren contracture completed the briefMHQ as well as the full-length MHQ and Quick Disabilities of the Arm, Shoulder, and Hand (QuickDASH) questionnaire at baseline. Two to 14 days after baseline and 1 year after collagenase injection or surgery, patients again filled out the briefMHQ. Reliability was determined using the intraclass correlation coefficient and by calculating internal consistency (Cronbach alpha). Validity was tested by quantifying correlations with the full-length MHQ and QuickDASH. Responsiveness, based on the standardized response mean and the minimally clinically important change, was also determined. The briefMHQ had an intraclass correlation coefficient of 0.87, Cronbach alpha of 0.88, and correlations of r = 0.88 and -0.82 with the original MHQ and QuickDASH, respectively. The standardized response mean was 0.9 and the minimally clinically important change was 7 points. Overall, the briefMHQ demonstrates excellent reliability, good validity, and high responsiveness in patients with Dupuytren contracture. The briefMHQ is an accurate and time-saving tool to evaluate patients with Dupuytren contracture and the effect of a corresponding treatment. Copyright © 2016 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
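The responsiveness statistic named in the abstract above, the standardized response mean (SRM), is conventionally the mean change score divided by the standard deviation of the change scores. A minimal sketch under that assumption follows; the function name and the scores are hypothetical and do not reproduce the study's data.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean of the paired change scores divided by their sample standard deviation."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical questionnaire scores at baseline and 1 year after treatment.
baseline_scores = [52, 61, 47, 70, 58]
one_year_scores = [66, 70, 65, 74, 75]
print(standardized_response_mean(baseline_scores, one_year_scores))
```

Dividing by the spread of the changes, instead of the baseline spread, is what separates the SRM from an effect size based on baseline variability.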
Ratter, Julia; Radlinger, Lorenz; Lucas, Cees
2014-09-01
Are submaximal and maximal exercise tests reliable, valid and acceptable in people with chronic pain, fibromyalgia and fatigue disorders? Systematic review of studies of the psychometric properties of exercise tests. People older than 18 years with chronic pain, fibromyalgia and chronic fatigue disorders. Studies of the measurement properties of tests of physical capacity in people with chronic pain, fibromyalgia or chronic fatigue disorders were included. Studies were required to report: reliability coefficients (intraclass correlation coefficient, alpha reliability coefficient, limits of agreement and Bland-Altman plots); validity coefficients (intraclass correlation coefficient, Spearman's correlation, Kendall's tau coefficient, Pearson's correlation); or dropout rates. Fourteen studies were eligible: none had low risk of bias, 10 had unclear risk of bias and four had high risk of bias. The included studies evaluated: Åstrand test; modified Åstrand test; Lean body mass-based Åstrand test; submaximal bicycle ergometer test following a protocol other than the Åstrand test; 2-km walk test; 5-minute, 6-minute and 10-minute walk tests; shuttle walk test; and modified symptom-limited Bruce treadmill test. None of the studies assessed maximal exercise tests. Where they had been tested, reliability and validity were generally high. Dropout rates were generally acceptable. The 2-km walk test was not recommended in fibromyalgia. Moderate evidence was found for reliability, validity and acceptability of submaximal exercise tests in patients with chronic pain, fibromyalgia or chronic fatigue. There is no evidence about maximal exercise tests in patients with chronic pain, fibromyalgia and chronic fatigue. Copyright © 2014. Published by Elsevier B.V.
Biomechanical factors associated with time to complete a change of direction cutting maneuver.
Marshall, Brendan M; Franklyn-Miller, Andrew D; King, Enda A; Moran, Kieran A; Strike, Siobhán C; Falvey, Éanna C
2014-10-01
Cutting ability is an important aspect of many team sports; however, the biomechanical determinants of cutting performance are not well understood. This study aimed to address this issue by identifying the kinetic and kinematic factors correlated with the time to complete a cutting maneuver. In addition, an analysis of the test-retest reliability of all biomechanical measures was performed. Fifteen (n = 15) elite multidirectional sports players (Gaelic hurling) were recruited, and a 3-dimensional motion capture analysis of a 75° cut was undertaken. The factors associated with cutting time were determined using bivariate Pearson's correlations. Intraclass correlation coefficients (ICCs) were used to examine the test-retest reliability of biomechanical measures. Five biomechanical factors were associated with cutting time (2.28 ± 0.11 seconds): peak ankle power (r = 0.77), peak ankle plantar flexor moment (r = 0.65), range of pelvis lateral tilt (r = -0.54), maximum thorax lateral rotation angle (r = 0.51), and total ground contact time (r = -0.48). Intraclass correlation coefficient scores for these 5 factors, and indeed for the majority of the other biomechanical measures, ranged from good to excellent (ICC >0.60). Explosive force production about the ankle, pelvic control during single-limb support, and torso rotation toward the desired direction of travel were all key factors associated with cutting time. These findings should assist in the development of more effective training programs aimed at improving similar cutting performances. In addition, test-retest reliability scores were generally strong; therefore, motion capture techniques seem well placed to further investigate the determinants of cutting ability.
Maleki, Iradj; Taghvaei, Tarang; Barzin, Maryam; Amin, Kamyar; Khalilian, Alireza
2015-01-01
Inflammatory bowel diseases (IBD) are a group of inflammatory conditions of the colon and small intestine that may have critical consequences for patients' quality of life (QOL). Many disease-specific QOL tools have been developed recently. The McMaster Inflammatory Bowel Disease Questionnaire (IBDQ) is one of them. The aim of this study was to translate the IBDQ from English to Persian and evaluate the validity and reliability of this version of the McMaster IBDQ. Sixty-eight subjects with ulcerative colitis were recruited in this study. The original IBDQ was translated into Persian using the back-translation method. The reliability of the subscales and the summary score of the Persian IBDQ was demonstrated by intraclass correlation coefficients; their validity was evaluated by their correlations with SF-36, visual analogue scale and colitis activity index. All dimensions of IBDQ met the standards of construct validity and correlated well with SF-36, visual analogue scale and colitis activity index. IBDQ was able to discriminate the different groups of patients. The intraclass correlation coefficient was very high and its value was close to one (P<0.05). All dimensional scores differed significantly between the baseline and the follow-up measurement. The findings of this study indicate that the Persian translation of IBDQ confers satisfactory psychometric and cultural properties when applied to a sample of the Iranian population with inflammatory bowel disease. This questionnaire is recommended for use in clinical trials and in the assessment of efficacy of interventions and therapy.
ERIC Educational Resources Information Center
Hedberg, E. C.; Hedges, Larry V.
2014-01-01
Randomized experiments are often considered the strongest designs to study the impact of educational interventions. Perhaps the most prevalent class of designs used in large scale education experiments is the cluster randomized design in which entire schools are assigned to treatments. In cluster randomized trials (CRTs) that assign schools to…
ERIC Educational Resources Information Center
Brandon, Paul R.; Harrison, George M.; Lawton, Brian E.
2013-01-01
When evaluators plan site-randomized experiments, they must conduct the appropriate statistical power analyses. These analyses are most likely to be valid when they are based on data from the jurisdictions in which the studies are to be conducted. In this method note, we provide software code, in the form of a SAS macro, for producing statistical…
ERIC Educational Resources Information Center
Hua, Jing; Gu, Guixiong; Meng, Wei; Wu, Zhuochun
2013-01-01
The aim of this paper was to examine the validity and reliability of age band 1 of the Movement Assessment Battery for Children-Second Edition (MABC-2) in preparation for its standardization in mainland China. Interrater and test-retest reliability of the MABC-2 was estimated using Intraclass Correlation Coefficient (ICC). Cronbach's alpha for…
Sions, Jaclyn Megan; Smith, Andrew Craig; Hicks, Gregory Evan; Elliott, James Matthew
2016-08-01
To evaluate intra- and inter-examiner reliability for the assessment of relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area, i.e., total cross-sectional area minus intramuscular fat, from T1-weighted magnetic resonance images obtained in older adults with chronic low back pain. Reliability study. n = 13 (69.3 ± 8.2 years old). After lumbar magnetic resonance imaging, two examiners produced relative cross-sectional area measurements of multifidi, erector spinae, psoas, and quadratus lumborum by tracing regions of interest just inside fascial borders. Pixel-intensity summaries were used to determine muscle-to-fat infiltration indices; relative muscle cross-sectional area was calculated. Intraclass correlation coefficients were used to estimate intra- and inter-examiner reliability; standard error of measurement was calculated. Intra-examiner intraclass correlation coefficient point estimates for relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area were excellent for multifidi and erector spinae across levels L2-L5 (ICC = 0.77-0.99). At L3, intra-examiner reliability was excellent for relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area for both psoas and quadratus lumborum (ICC = 0.81-0.99). Inter-examiner intraclass correlation coefficients ranged from poor to excellent for relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area. Assessment of relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area in older adults with chronic low back pain can be reliably determined by one examiner from T1-weighted images.
Such assessments provide valuable information, as muscle-to-fat infiltration indices and relative muscle cross-sectional area indicate that a substantial amount of relative cross-sectional area may be magnetic resonance-visible intramuscular fat in older adults with chronic low back pain. © 2015 American Academy of Pain Medicine.
Mehta, S. N.; Nansel, T. R.; Volkening, L. K.; Butler, D. A.; Haynie, D. L.; Laffel, L. M. B.
2016-01-01
Aims: To evaluate the psychometric properties of the Diabetes Management Questionnaire, a brief, self-report measure of adherence to contemporary diabetes management for young people with Type 1 diabetes and their caregivers. Methods: A total of 273 parent-child dyads completed parallel versions of the Diabetes Management Questionnaire. Eligible children (aged 8–18 years) had Type 1 diabetes for ≥1 year. A multidisciplinary team designed the Diabetes Management Questionnaire as a brief, self-administered measure of adherence to Type 1 diabetes management over the preceding month; higher scores reflect greater adherence. Psychometrics were evaluated for the entire sample and according to age of the child. Results: The children (49% female) had a mean ± SD (range) age of 13.3 ± 2.9 (8–18) years and their mean ± SD HbA1c was 71 ± 15 mmol/mol (8.6 ± 1.4%). Internal consistency was good for parents (α = 0.83) and children (α = 0.79). Test-retest reliability was excellent for parents (intraclass correlation coefficient = 0.83) and good for children (intraclass correlation coefficient = 0.65). Parent and child scores had moderate agreement (intraclass correlation coefficient = 0.54). Diabetes Management Questionnaire scores were inversely associated with HbA1c (parents: r = –0.41, P < 0.0001; children: r = –0.27, P < 0.0001). Psychometrics were stronger in the children aged ≥13 years compared with those aged < 13 years, but were acceptable in both age groups. Mean ± SD Diabetes Management Questionnaire scores were higher among children who were receiving insulin pump therapy (n = 181) than in children receiving multiple daily injections (n = 92) according to parent (75.9 ± 11.8 vs. 70.5 ± 15.5; P = 0.004) and child report (72.2 ± 12.1 vs. 67.6 ± 13.9; P = 0.006).
Conclusions: The Diabetes Management Questionnaire is a brief, valid self-report measure of adherence to contemporary diabetes self-management for people aged 8–18 years who are receiving either multiple daily injections or insulin pump therapy. PMID:26280463
Cao, Shiqi; Liu, Ning; Han, Wuxiang; Zi, Yunpeng; Peng, Fan; Li, Lexiang; Fu, Qiwei; Chen, Yi; Zheng, Weijie; Qian, Qirong
2017-01-14
The Forgotten Joint Score (FJS) is a newly developed health-related quality of life (HRQoL) questionnaire designed to evaluate joint awareness after total knee arthroplasty (TKA). This study cross-culturally adapted and psychometrically validated a simplified Chinese version of the FJS (SC-FJS). Cross-cultural adaptation was performed according to the internationally recognized guidelines. One hundred and fifty participants who underwent primary TKA were recruited in this study. Cronbach's α and intra-class correlations were used to determine reliability. Construct validity was analyzed by evaluating the correlations between SC-FJS and the Knee Injury and Osteoarthritis Outcome Score (KOOS) and the short form (36) health survey (SF-36). Each of the 12 items was properly responded to and correlated with the total items. SC-FJS had excellent reliability [Cronbach's α = 0.907, intra-class correlation coefficient (ICC) = 0.970, 95% CI 0.959-0.978]. Elimination of any single item did not result in a Cronbach's α of <0.80. SC-FJS had a high correlation with the symptoms (0.67, p < 0.001) and pain (0.60, p < 0.001) domains of KOOS and the social functioning (0.66, p < 0.001) domain of SF-36. It also correlated moderately with the function in daily living (0.53, p < 0.001) and function in sport and recreation (0.40, p < 0.001) domains of KOOS and with the physical subscale of SF-36 (0.49-0.53, p < 0.001), but had a low (r = 0.20) or nonsignificant (p > 0.05) correlation with the mental subscale of SF-36. SC-FJS demonstrated excellent acceptability, internal consistency, reliability, and construct validity, and can be recommended for patients who underwent joint arthroplasty in Mainland China.
Georgieva-Zhostova, Spaska; Kolev, Ognyan I; Stambolieva, Katerina
2014-09-01
The aim of the present study was the translation, cross-cultural adaptation and validation of the Dizziness Handicap Inventory into the Bulgarian language (DHI-BG). Ninety-seven vestibular patients (19 men and 78 women, mean age 45.08 ± 13.85 years) took part in the investigation. All participants were asked to fill in the DHI-BG. Internal consistency was estimated using Cronbach's alpha and item-total correlation, and reproducibility by calculating Bland-Altman's limits of agreement and intraclass correlation coefficients (ICCs). Associations were estimated by Spearman's correlation coefficients. The Cronbach's alpha values for the total score and the functional, physical and emotional subscales of the DHI-BG were 0.88, 0.75, 0.72 and 0.81. The floor and ceiling effects of the DHI-BG total scale were evaluated with respect to the limits of agreement, which were ±9.4-14.53 points. Intraclass correlation coefficients (ICCs) for the total scale and all subscales were higher than the recommended value of 0.75 and indicated good test-retest reliability. The item-total correlations for the DHI-BG ranged from 0.27 (item 12) to 0.72 (item 3). No significant differences were observed in the Cronbach's alpha coefficients between the DHI-BG and the original version, the German and Italian versions of the questionnaire. The most significant difference was observed in comparison with the German version of DHI. Construct validity presented a moderate correlation between Romberg coefficients and DHI-BG scores and strong correlation between all scores of DHI and the self-perceived disability. The results suggest that DHI-BG scores show a good discriminative validity between groups with different levels of self-assessed disability. The Bulgarian version of the DHI is a reliable and valid tool in assessing the impact of dizziness on the quality of life in Bulgarian vestibular patients.
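As a rough illustration of the Bland-Altman limits of agreement mentioned in the abstract above, the sketch below computes the mean difference (bias) and bias ± 1.96 SD for paired test and retest scores. The function name, the 1.96 multiplier as the conventional 95% choice, and the scores are assumptions for illustration only; none of them come from the study.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, z=1.96):
    """Return (bias, lower, upper) for paired measurements.

    bias is the mean of the pairwise differences; the limits are
    bias -/+ z times the sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical test and retest total scores for five respondents.
test_scores = [40, 52, 36, 60, 48]
retest_scores = [42, 50, 38, 58, 50]

bias, lower, upper = limits_of_agreement(test_scores, retest_scores)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

Narrow limits relative to the scale range would suggest good reproducibility; what counts as narrow is a judgment that depends on the instrument.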
Tuca, Maria; Greditzer, Harry Gus; Gausden, Elizabeth Bishop; Uppstrom, Tyler J.; Potter, Hollis G.; Cordasco, Frank A.; Green, Daniel W.
2017-01-01
Objectives: To analyze graft structure and signal with particular emphasis on the distal femoral socket aperture following all-epiphyseal ACLR using hamstring autografts with sequential MRI in skeletally immature athletes. Methods: Retrospective cohort study of 23 skeletally immature patients who underwent ACLR by the same surgical team at a tertiary center during 2011-2013. Athletes who had at least two follow-up MRIs, the first 6-12 months after surgery and the second >18 months after surgery, were included. Exclusion criteria included those athletes with inadequate MRI follow-up (6) or with a failure of their reconstructions (1). All athletes were treated with an arthroscopic all-inside, all-epiphyseal ACLR, using hamstring autograft, secured with adjustable loop cortical buttons on both tibia and femur. MRI images were analyzed independently and blinded by an orthopaedic surgery fellow and a musculoskeletal radiology fellow. Using GE Functional Analysis software, the signal intensity (SI) of the graft was measured in 5 different locations: 1) femoral tunnel, 2) intra-articular proximal turn, 3) midsubstance, 4) intra-articular distal turn, and 5) tibial tunnel. Values were normalized to cortical bone density. The amount of perigraft scarring and synovitis was analyzed. An intraclass correlation coefficient was used to quantify inter-rater reliability, a non-parametric Wilcoxon test for perigraft scarring and synovitis, one-way ANOVA to test whether SI differed significantly between the graft locations, and a 2-tailed Student's t-test for SI changes from 1st to 2nd MRI. Results: The study included 16 patients (5 girls and 11 boys), with an average age at surgery of 11.9 years (range 10-15). The first follow-up MRI was on average at 8.4 months (range 6-12 months), while the 2nd MRI was on average 30.7 months (range 18-40) after surgery. Intra-class correlation coefficients were above 0.7 for all measurements, indicating an excellent concordance between observers.
Perigraft scarring tended to reduce with follow-up (p=0.057) though not significantly, while synovitis had a significant reduction over time (p=0.01). On average, normalized SI showed no significant differences between measurements taken in different regions of the graft (p=0.58). When comparing the graft SI from 1st to 2nd MRI, no significant differences were found in any of the locations: femoral tunnel (p=0.14), proximal turn (p=0.11), midsubstance (p=0.29), intra-articular distal turn (p=0.10), or tibial tunnel (p=0.15). All 16 athletes returned to their prior sport at the same level of performance without re-injury. Conclusion: ACL grafts in skeletally immature patients with all-epiphyseal reconstructions maintain a stable signal intensity at long-term MRI follow-up, with no significant signal reduction over time. Despite the sharp turn created at the distal femoral socket aperture in physeal-sparing reconstructions, no particular anatomic location of the graft presents significantly different signal intensity over others. This is the first sequential MRI study in pediatric epiphyseal ACL reconstructions demonstrating postoperative maintenance of graft integrity and graft signal.
Subcortical structure segmentation using probabilistic atlas priors
NASA Astrophysics Data System (ADS)
Gouttard, Sylvain; Styner, Martin; Joshi, Sarang; Smith, Rachel G.; Cody Hazlett, Heather; Gerig, Guido
2007-03-01
The segmentation of the subcortical structures of the brain is required for many forms of quantitative neuroanatomic analysis. The volumetric and shape parameters of structures such as lateral ventricles, putamen, caudate, hippocampus, pallidus and amygdala are employed to characterize a disease or its evolution. This paper presents a fully automatic segmentation of these structures via a non-rigid registration of a probabilistic atlas prior, alongside a comprehensive validation. Our approach is based on an unbiased diffeomorphic atlas with probabilistic spatial priors built from a training set of MR images with corresponding manual segmentations. The atlas building computes an average image along with transformation fields mapping each training case to the average image. These transformation fields are applied to the manually segmented structures of each case in order to obtain a probabilistic map on the atlas. When applying the atlas for automatic structural segmentation, an MR image is first corrected for intensity inhomogeneity, skull stripped and intensity calibrated to the atlas. Then the atlas image is registered to the image using an affine registration followed by a deformable registration matching the gray level intensity. Finally, the registration transformation is applied to the probabilistic map of each structure, which is then thresholded at 0.5 probability. Using manual segmentations for comparison, measures of volumetric differences show high correlation with our results. Furthermore, the Dice coefficient, which quantifies the volumetric overlap, is higher than 62% for all structures and is close to 80% for basal ganglia. The intraclass correlation coefficient computed on these same datasets shows a good inter-method correlation of the volumetric measurements. Using a dataset of a single patient scanned 10 times on 5 different scanners, reliability is shown with a coefficient of variation of less than 2 percent over the whole dataset.
Overall, these validation and reliability studies show that our method accurately and reliably segments almost all structures. Only the hippocampus and amygdala segmentations exhibit relatively low correlation with the manual segmentation in at least one of the validation studies, whereas they still show appropriate Dice overlap coefficients.
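The last two steps described in the abstract above (thresholding a probabilistic map at 0.5 and scoring overlap against a manual segmentation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the flat list of voxel probabilities, the function names, and the choice of `>=` rather than `>` at the cutoff are all hypothetical. The overlap measure is the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|).

```python
def threshold(prob_map, cutoff=0.5):
    """Turn per-voxel probabilities into a binary mask (assumes >= cutoff)."""
    return [p >= cutoff for p in prob_map]


def dice(mask_a, mask_b):
    """Dice overlap of two equally sized binary masks."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size = sum(mask_a) + sum(mask_b)
    # Two empty masks are treated as a perfect match by convention here.
    return 2 * overlap / size if size else 1.0


# Hypothetical six-voxel example: warped atlas probabilities vs. a manual mask.
atlas_probabilities = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1]
manual_mask = [True, True, True, False, False, False]

automatic_mask = threshold(atlas_probabilities)
print(f"Dice = {dice(automatic_mask, manual_mask):.3f}")
```

A real pipeline would operate on 3-D image arrays; the flat lists only keep the sketch dependency-free.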
Vergari, Claudio; Dubois, Guillaume; Vialle, Raphael; Gennisson, Jean-Luc; Tanter, Mickael; Dubousset, Jean; Rouch, Philippe; Skalli, Wafa
2016-04-01
The intervertebral disc (IVD) is key to spine biomechanics, and it is often involved in the cascade leading to spinal deformities such as idiopathic scoliosis, especially during the growth spurt. Recent progress in elastography techniques allows access to non-invasive measurement of cervical IVD in adults; the aim of this study was to determine the feasibility and reliability of shear wave elastography in the lumbar IVD of healthy children. Elastography measurements were performed in 31 healthy children (6-17 years old), in the annulus fibrosus and in the transverse plane of L5-S1 or L4-L5 IVD. Reliability was determined by three experienced operators repeating measurements. Average shear wave speed in IVD was 2.9 ± 0.5 m/s; no significant correlations were observed with sex, age or body morphology. Intra-operator repeatability was 5.0 % while inter-operator reproducibility was 6.2 %. Intraclass correlation coefficient was higher than 0.9 for each operator. Feasibility and reliability of IVD shear wave elastography were demonstrated. The measurement protocol is compatible with clinical routine and the results show the method's potential to give an insight into spine deformity progression and early detection. • Intervertebral disc mechanical properties are key to spine biomechanics • Feasibility of shear wave elastography in children's lumbar disc was assessed • Measurement was fast and reliable • Elastography could represent a novel biomarker for spine pathologies.
Bakker, Merel; Pace, Margherita; de Jong-Pleij, Els; Birnie, Erwin; Kagan, Karl-Oliver; Bilardo, Caterina M
2018-01-01
To investigate the feasibility and reproducibility of the prenasal thickness (PNT)/nasal bone length (NBL) ratio, maxilla-nasion-mandible (MNM) angle, facial profile line, profile line distance, and prefrontal space ratio (PFSR) in the first trimester of pregnancy, develop normal ranges, and evaluate these markers in abnormal fetuses. All measurements were performed on stored images by two operators. Feasibility, interoperator agreement, and prediction intervals were calculated for all measurements. Feasibility was the highest for the NBL (74.3-79.7%) and the MNM angle (75.7-79.05%). Correlation was good for the NBL, the PNT, and the MNM angle (intraclass correlation coefficient 0.706-0.835). Mean difference between operators was the lowest for the PNT and PFSR (0.03-0.08). Measurements in abnormal fetuses showed that the majority of trisomy 21 fetuses had either an absent nasal bone or a shorter NBL. The PNT and PNT/NBL ratio were above the 97.5th centile in one third of the cases. Fetuses with facial clefts or micrognathia showed on average a large MNM angle (multiple of the median 0.96-5.15). First-trimester facial markers are feasible. The PNT and PNT/NBL ratio were increased in one third of the trisomic fetuses, and the MNM angle in the majority of fetuses with micrognathia and facial clefts. © 2016 S. Karger AG, Basel.
Kramer, Gerbrand Maria; Frings, Virginie; Heijtel, Dennis; Smit, E F; Hoekstra, Otto S; Boellaard, Ronald
2017-06-01
The objective of this study was to validate several parametric methods for quantification of 3'-deoxy-3'-18F-fluorothymidine (18F-FLT) PET in advanced-stage non-small cell lung carcinoma (NSCLC) patients with an activating epidermal growth factor receptor mutation who were treated with gefitinib or erlotinib. Furthermore, we evaluated the impact of noise on accuracy and precision of the parametric analyses of dynamic 18F-FLT PET/CT to assess the robustness of these methods. Methods: Ten NSCLC patients underwent dynamic 18F-FLT PET/CT at baseline and 7 and 28 d after the start of treatment. Parametric images were generated using plasma input Logan graphic analysis and 2 basis functions-based methods: a 2-tissue-compartment basis function model (BFM) and spectral analysis (SA). Whole-tumor-averaged parametric pharmacokinetic parameters were compared with those obtained by nonlinear regression of the tumor time-activity curve using a reversible 2-tissue-compartment model with blood volume fraction. In addition, 2 statistically equivalent datasets were generated by countwise splitting the original list-mode data, each containing 50% of the total counts. Both new datasets were reconstructed, and parametric pharmacokinetic parameters were compared between the 2 replicates and the original data. Results: After the settings of each parametric method were optimized, distribution volumes (VT) obtained with Logan graphic analysis, BFM, and SA all correlated well with those derived using nonlinear regression at baseline and during therapy (R² ≥ 0.94; intraclass correlation coefficient > 0.97). SA-based VT images were most robust to increased noise on a voxel-level (repeatability coefficient, 16% vs. >26%). Yet BFM generated the most accurate K1 values (R² = 0.94; intraclass correlation coefficient, 0.96).
Parametric K1 data showed a larger variability in general; however, no differences were found in robustness between methods (repeatability coefficient, 80%-84%). Conclusion: Both BFM and SA can generate quantitatively accurate parametric 18F-FLT VT images in NSCLC patients before and during therapy. SA was more robust to noise, yet BFM provided more accurate parametric K1 data. We therefore recommend BFM as the preferred parametric method for analysis of dynamic 18F-FLT PET/CT studies; however, SA can also be used. © 2017 by the Society of Nuclear Medicine and Molecular Imaging.
High Familial Correlation in Methylphenidate Response and Side Effect Profile.
Gazer-Snitovsky, Michal; Brand-Gothelf, Ayelet; Dubnov-Raz, Gal; Weizman, Abraham; Gothelf, Doron
2015-04-21
To examine whether a familial tendency exists in clinical response to methylphenidate. Nineteen pairs of siblings or parent-child stimulant-naive individuals with ADHD were prescribed methylphenidate-immediate release, and were comprehensively evaluated at baseline, Week 2, and Week 4, using the ADHD Rating Scale IV, Clinical Global Impression Scale, and the Barkley Side Effects Rating Scale. We found significant intraclass correlations in family member response to methylphenidate-immediate release and side effect profile, including emotional symptoms and loss of appetite and weight. Family history of response to methylphenidate should be taken into account when treating ADHD. © 2015 SAGE Publications.
Keessen, Paul; Maaskant, Jolanda; Visser, Bart
2018-08-01
The standardized Mensendieck test (SMT) was developed to quantify posture, movement, gait, and respiration. In the hands of an experienced therapist, the SMT is proven to be a reliable tool. It is unclear whether posture, movement, gait, and respiration are related to the degree of functional disability in patients with chronic pain. The objective of this study was to assess the reliability and convergent validity of the SMT in a heterogeneous sample of 50 patients with chronic pain. Internal consistency was determined by Cronbach's α and interrater reliability by the intraclass correlation coefficient (ICC). Convergent validity was assessed by determining the Spearman rank correlation coefficient between the movement quality measured in the SMT and functional limitation measured on the disability rating index (DRI). Cronbach's α for internal consistency was 0.91. Substantial reliability was found for the items: movement (ICC = 0.68), gait (ICC = 0.69), sitting posture (ICC = 0.63), and respiration (ICC = 0.64). Insufficient reliability was found for standing posture (ICC = 0.23). A moderate correlation was found between the average SMT test score and the DRI (r = -0.37) and between respiration and the DRI (r = -0.45). The SMT is a reasonably reliable tool to assess movement, gait, sitting posture, and respiration. None of the items in the domain standing posture has sufficient reliability. A thorough study of this domain should be considered. The results show little evidence for convergent validity. Several items of the SMT correlated moderately with functional limitation as measured by the DRI. These items were global movement, hip flexion, pelvis rotation, and all respiration items.
Validation of Plantar Pressure Measurements for a Novel In-Shoe Plantar Sensory Replacement Unit
Ferber, Reed; Webber, Talia; Kin, B; Everett, Breanne; Groenland, Marcel
2013-01-01
Background: Research concerning prevention of diabetic foot complications is critical. A novel in-shoe plantar sensory replacement unit (PSRU) has been developed that provides alert-based feedback derived from analyzing plantar pressure threshold measurements in real time. The purpose of this study was to compare the PSRU device to a gold standard pressure-sensing device (GS-PSD) to determine the correlation between concurrent measures of plantar pressure during walking. Methods: The PSRU had an array of eight sensors with a range of 10–75 mm Hg and collected data at 4 Hz, whereas the GS-PSD had 99 sensors with a range of 1–112 mm Hg and collected data at 100 Hz. Based on an a priori power analysis, data were collected from 10 participants (3 female, 7 male) while walking over ground in both devices. The primary variable of interest was the number of data points recorded that were greater than 32 mm Hg (capillary arterial pressure, the minimum pressure reported to cause pressure ulcers) for each of the eight PSRU sensors and corresponding average recordings from the GS-PSD sensor clusters. Intraclass correlation coefficient (2,1) was used to compare data between the two devices. Results: Compared with the GS-PSD, we found good-to-very-good correlations (r-value range 0.67–0.86; p-value range 0.01–0.05) for six of the PSRU’s eight sensors and poor correlation for only two sensors (r = 0.41, p = .15; r = 0.38, p = .18) when measuring the number of data points recorded that were greater than 32 mm Hg. Conclusions: Based on the results of the present study, we conclude the PSRU provides analogous data when compared with a GS-PSD. PMID:24124942
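For readers unfamiliar with the ICC(2,1) named in the abstract above, the following sketch computes it from a subjects-by-raters table using the usual two-way random-effects, absolute-agreement, single-measurement formula: (MSR − MSE) / (MSR + (k − 1)·MSE + k·(MSC − MSE)/n), where MSR, MSC and MSE are the between-subject, between-rater and residual mean squares. The function name and the ratings are illustrative assumptions and are not data from the study; in that study the two "raters" would correspond to the two devices.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters (or devices)
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings: six subjects scored by four raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(f"ICC(2,1) = {icc_2_1(example):.3f}")
```

Because this form penalizes systematic offsets between raters, it can be low even when raters rank subjects similarly, which is one reason it is often reported alongside a plain correlation.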
Development and validation of the German version of the Orofacial Esthetic Scale.
Reissmann, Daniel R; Benecke, Andreas W; Aarabi, Ghazal; Sierwald, Ira
2015-07-01
This study aimed to develop the German version of the Orofacial Esthetic Scale (OES-G) and to assess its psychometric properties. The OES is an eight-item instrument with seven items directly addressing esthetic impacts of the orofacial region and an eighth item for a global assessment. It applies an 11-point ordinal rating scale, with summary scores ranging from 0 (worst) to 70 (best). The original OES items were translated into German using a forward-backward method. A de novo development of German items (n = 21 patients) and a cross-cultural adaptation after pilot testing (n = 15 patients) established content validity. Internal consistency and construct validity (structural, convergent, known-groups) of the OES-G were assessed in a sample of 165 prosthodontic patients. The OES was applied in 42 patients on two occasions, with a temporal distance of 2-4 weeks apart to determine test-retest reliability. Internal consistency of the OES-G was considered as satisfactory (Cronbach's alpha 0.94; average inter-item correlation 0.64). Intraclass correlation coefficient of 0.95 (95 % confidence interval 0.92-0.98) indicated excellent test-retest reliability. Correlation matrix and exploratory factor analysis provided support for unidimensionality of the measured construct. The OES-G summary score was correlated with the patients' global assessment of their esthetics (r = 0.87) and external ratings of the expert group (r = 0.55) and discriminated patients with treatment need (39.4 points) from patients without (58.4 points; p < 0.001) and with a large effect size. The OES-G has good psychometric properties and is a valuable instrument for the assessment of self-perceived orofacial esthetics.
Structured learning for robotic surgery utilizing a proficiency score: a pilot study.
Hung, Andrew J; Bottyan, Thomas; Clifford, Thomas G; Serang, Sarfaraz; Nakhoda, Zein K; Shah, Swar H; Yokoi, Hana; Aron, Monish; Gill, Inderbir S
2017-01-01
We evaluated feasibility and benefit of implementing structured learning in a robotics program. Furthermore, we assessed validity of a proficiency assessment tool for stepwise graduation. Teaching cases included robotic radical prostatectomy and partial nephrectomy. Procedure steps were categorized: basic, intermediate, and advanced. An assessment tool ["proficiency score" (PS)] was developed to evaluate ability to safely and autonomously complete a step. Graduation required a passing PS (PS ≥ 3) on three consecutive attempts. PS and validated global evaluative assessment of robotic skills (GEARS) were evaluated for completed steps. Linear regression was utilized to determine postgraduate year/PS relationship (construct validity). Spearman's rank correlation coefficient measured correlation between PS and GEARS evaluations (concurrent validity). Intraclass correlation (ICC) evaluated PS agreement between evaluator classes. Twenty-one robotic trainees participated within the pilot program, completing a median of 14 (2-69) cases each. Twenty-three study evaluators scored 14 (1-60) cases. Over 4 months, 229/294 (78 %) cases were designated "teaching" cases. Residents completed 91 % of possible evaluations; faculty completed 78 %. Verbal and quantitative feedback received by trainees increased significantly (p = 0.002, p < 0.001, respectively). Average PS increased with PGY (post-graduate year) for basic and intermediate steps (regression slopes: 0.402 (p < 0.0001), 0.323 (p < 0.0001), respectively) (construct validation). Overall, PS correlated highly with GEARS (ρ = 0.81, p < 0.0001) (concurrent validity). ICC was 0.77 (95 % CI 0.61-0.88) for resident evaluations. Structured learning can be implemented in an academic robotic program with high levels of trainee and evaluator participation, encouraging both quantitative and verbal feedback. A proficiency assessment tool developed for step-specific proficiency has construct and concurrent validity.
Intraoperative specimen radiography in patients with nonpalpable malignant breast lesions.
Schmachtenberg, C; Engelken, F; Fischer, T; Bick, U; Poellinger, A; Fallenberg, E M
2012-07-01
Specimen mammography of nonpalpable wire-localized breast lesions is the standard in breast-conserving surgery. The aim of this study was to evaluate the reliability of intraoperative 2-view specimen mammography in different cancer types. After ethics approval, 3 readers retrospectively evaluated margins on 266 2-view specimen radiographs. They determined the closest margin and the orientation. The results were correlated with the histopathology (intra-class correlation coefficient [ICC] and contingency coefficient [CC]) and compared (Wilcoxon test). Invasive ductal carcinoma (IDC) with ductal carcinoma in situ (DCIS) was present in 115 (43 %), IDC in 75 (28 %), invasive lobular carcinoma (ILC) in 57 (22 %) and rare cancers (CA) in 19 specimens (7 %). The sensitivity/specificity and positive/negative predictive value (P/NPV) of specimen mammography were 0.50/0.86 and 0.86/0.50 for CA, 0.42/0.68 and 0.48/0.63 for IDC, 0.36/0.81 and 0.69/0.51 for ILC, and 0.22/0.78 and 0.68/0.32 for IDC+DCIS. Readers correctly identified the orientation of the closest margin in at least one view in an average of 149 specimens (56 %). CCs were between 0.680 (IDC) and 0.912 (CA), suggesting a moderate correlation between radiographic and histological orientation. The correlations were worse for the radiographic and histological distances, with ICC ranging from 0.238 (ILC) to 0.475 (CA). The Wilcoxon test revealed overestimation of the radiographic margins compared to the histological ones for DCIS. Our results suggest that specimen radiography has relatively good overall specificity and good PPV, while the sensitivity and NPV are low for DCIS. A negative result on specimen radiography does not rule out histologically involved margins. © Georg Thieme Verlag KG Stuttgart · New York.
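The four accuracy measures reported in the abstract above all derive from a 2×2 table of radiographic result against histopathology. The sketch below shows the standard definitions; the function name and the counts are hypothetical and chosen only to illustrate the arithmetic, not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp: test positive, reference positive   fp: test positive, reference negative
    fn: test negative, reference positive   tn: test negative, reference negative
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for 100 specimens.
metrics = diagnostic_metrics(tp=30, fp=10, fn=20, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts the NPV is well below 1, which mirrors the abstract's point that a negative radiograph does not rule out involved margins.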
Simmenroth-Nayda, Anne; Heinemann, Stephanie; Nolte, Catharina; Fischer, Thomas; Himmel, Wolfgang
2014-01-01
Objectives: The aim of this study was to analyse the psychometric properties of the short version of the Calgary Cambridge Guides and to decide whether it can be recommended for use in the assessment of communications skills in young undergraduate medical students. Methods: Using a translated version of the Guide, 30 members from the Department of General Practice rated 5 videotaped encounters between students and simulated patients twice. Item analysis was used to detect possible floor and/or ceiling effects. The construct validity was investigated using exploratory factor analysis. Intra-rater reliability was measured in an interval of 3 months; inter-rater reliability was assessed by the intraclass correlation coefficient. Results: The score distribution of the items showed no ceiling or floor effects. Four of the five factors extracted from the factor analysis represented important constructs of doctor-patient communication. The ratings for the first and second round of assessing the videos correlated at 0.75 (p < 0.0001). Intraclass correlation coefficients for each item were moderate and ranged from 0.05 to 0.57. Conclusions: Reasonable score distributions of most items without ceiling or floor effects as well as a good test-retest reliability and construct validity recommend the C-CG as an instrument for assessing communication skills in undergraduate medical students. Some deficiencies in inter-rater reliability are a clear indication that raters need a thorough instruction before using the C-CG. PMID:25480988
Validation of the Turkish version of the Breast Reduction Assessed Severity Scale.
Kececi, Yavuz; Sir, Emin; Zengel, Baha
2013-01-01
Measuring patient-reported outcomes has become increasingly important in cosmetic and reconstructive breast surgery. There is no validated questionnaire in Turkish to evaluate quality-of-life issues for patients with mammary hypertrophy. The authors describe the reliability and validity of a translated Breast Reduction Assessed Severity Scale (BRASS) in evaluating Turkish patients. The BRASS, developed by Sigurdson et al, was translated into Turkish adhering strictly to the guidelines of questionnaire translations. Statistical analysis was carried out with Cronbach's α to test the internal consistency and intraclass correlation coefficient for test-retest reliability. Exploratory factor analysis was carried out using principal component analysis with oblimin rotation to test its construct validity. Correlations between subscales identified in the factor analysis and corresponding domains in the Short Form-36 and Rosenberg Self-Esteem Scale were analyzed. The total instrument was found to have an α coefficient of 0.92 and subscale α coefficients ranging from 0.76 to 0.87. Intraclass correlation coefficient was 0.93 for the total scale and ranged from 0.81 to 0.91 for the subscales. Exploratory factor analysis resulted in a 5-factor structure: physical implications, body pain, physical appearance, poor self-concept, and negative social interactions. With this study, the reliability and validity of the Turkish version of the BRASS were revealed. This translated version can be used to evaluate the effect of mammary hypertrophy on quality of life in Turkish patients.
Artilheiro, Mariana Cunha; Fávero, Francis Meire; Caromano, Fátima Aparecida; Oliveira, Acary de Souza Bulle; Carvas, Nelson; Voos, Mariana Callil; Sá, Cristina Dos Santos Cardoso de
2017-12-08
The Jebsen-Taylor Test evaluates upper limb function by measuring timed performance on everyday activities. The test is used to assess and monitor the progression of patients with Parkinson disease, cerebral palsy, stroke and brain injury. To analyze the reliability, internal consistency and validity of the Jebsen-Taylor Test in people with Muscular Dystrophy and to describe and classify upper limb timed performance of people with Muscular Dystrophy. Fifty patients with Muscular Dystrophy were assessed. Non-dominant and dominant upper limb performances on the Jebsen-Taylor Test were filmed. Two raters evaluated timed performance for inter-rater reliability analysis. Test-retest reliability was investigated by using intraclass correlation coefficients. Internal consistency was assessed using the Cronbach alpha. Construct validity was assessed by comparing the Jebsen-Taylor Test with the Performance of Upper Limb. The internal consistency of the Jebsen-Taylor Test was good (Cronbach's α=0.98). Inter-rater reliability was very high (0.903-0.999), except for writing, which had an intraclass correlation coefficient of 0.772-1.000. Strong correlations between the Jebsen-Taylor Test and the Performance of Upper Limb Module were found (rho=-0.712). The Jebsen-Taylor Test is a reliable and valid measure of timed performance for people with Muscular Dystrophy. Copyright © 2017 Associação Brasileira de Pesquisa e Pós-Graduação em Fisioterapia. Publicado por Elsevier Editora Ltda. All rights reserved.
Validation and reliability of the Physical Activity Scale for the Elderly in Chinese population.
Ngai, Shirley P C; Cheung, Roy T H; Lam, Priscillia L; Chiu, Joseph K W; Fung, Eric Y H
2012-05-01
Physical Activity Scale for the Elderly (PASE) is a widely used questionnaire in epidemiological studies for assessing the physical activity level of elderly. This study aims to translate and validate PASE in Chinese population. Cross-sectional study. Chinese elderly aged 65 or above. The original English version of PASE was translated into Chinese (PASE-C) following standardized translation procedures. Ninety Chinese elderly aged 65 or above were recruited in the community. Test-retest reliability was determined by comparing the scores obtained from two separate administrations by the intraclass correlation coefficient. Validity was evaluated by Spearman's rank correlation coefficients between PASE and Medical Outcome Survey 36-Item Short Form Health Survey (SF-36), grip strength, single-leg-stance, 5 times sit-to-stand and 10-m walk. PASE-C demonstrated good test-retest reliability (intraclass correlation coefficient = 0.81). Fair to moderate association were found between PASE-C and most of the subscales of SF-36 (rs = 0.285 to 0.578, p < 0.01), grip strength (rs = 0.405 to 0.426, p < 0.001), single-leg-stance (rs = 0.470 to 0.548, p < 0.001), 5 times sit-to-stand (rs = -0.33, p = 0.001) and 10-m walk (rs = -0.281, p = 0.007). PASE-C is a reliable and valid instrument for assessing the physical activity level of elderly in Chinese population.
Shortening of an existing generic online health-related quality of life instrument for dogs.
Reid, J; Wiseman-Orr, L; Scott, M
2017-10-11
Development, initial validation and reliability testing of a shortened version of a web-based questionnaire instrument to measure generic health-related quality of life in companion dogs, to facilitate smartphone and online use. The original 46 items were reduced using expert judgment and factor analysis. Items were removed on the basis of item loadings and communalities on factors identified through factor analysis of responses from owners of healthy and unwell dogs, intrafactor item correlations, readability of items in the UK, USA and Australia and ability of individual items to discriminate between healthy and unwell dogs. Validity was assessed through factor analysis and a field trial using a "known groups" approach. Test-retest reliability was assessed using intraclass correlation coefficients. The new instrument comprises 22 items, each of which was rated by dog owners using a 7-point Likert scale. Factor analysis revealed a structure with four health-related quality of life domains (energetic/enthusiastic, happy/content, active/comfortable, and calm/relaxed) accounting for 72% of the variability in the data compared with 64% for the original instrument. The field test involving 153 healthy and unwell dogs demonstrated good discriminative properties and high intraclass correlation coefficients. The 22-item shortened form is superior to the original instrument and can be accessed via a mobile phone app. This is likely to increase the acceptability to dog owners as a routine wellness measure in health care packages and as a therapeutic monitoring tool. © 2017 British Small Animal Veterinary Association.
Reliability of the Wii Balance Board in kayak.
Vando, Stefano; Laffaye, Guillaume; Masala, Daniele; Falese, Lavinia; Padulo, Johnny
2015-01-01
The seat of the kayaker represents the principal contact point for expressing mechanical energy. Therefore, we investigated the reliability of the Wii Balance Board measures in the kayak vs. on the ground. The Bland-Altman test showed a low systematic bias on the ground (2.85%) and in the kayak (-2.13%), while the intra-class correlation coefficient was 0.996. The Wii Balance Board is useful to assess postural sway in the kayak.
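The Bland-Altman bias mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual definition (mean of the paired differences, with 95% limits of agreement at ±1.96 standard deviations). The abstract does not say how its percentage bias was derived, so the function name `bland_altman` and the sample readings below are hypothetical.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement
    are bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired series with at least two values")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical sway readings, not data from the study.
kayak = [10.0, 12.0, 14.0]
ground = [9.0, 12.0, 15.0]
bias, lower, upper = bland_altman(kayak, ground)
```

A bias near zero with narrow limits suggests the two conditions can be used interchangeably; what counts as narrow depends on the clinical context.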
Zaki, Rafdzah; Bulgiba, Awang; Nordin, Noorhaire; Azina Ismail, Noor
2013-06-01
Reliability measures precision or the extent to which test results can be replicated. This is the first ever systematic review to identify statistical methods used to measure reliability of equipment measuring continuous variables. This study also aims to highlight the inappropriate statistical method used in the reliability analysis and its implication in the medical practice. In 2010, five electronic databases were searched between 2007 and 2009 to look for reliability studies. A total of 5,795 titles were initially identified. Only 282 titles were potentially related, and finally 42 fitted the inclusion criteria. The Intra-class Correlation Coefficient (ICC) is the most popular method with 25 (60%) studies having used this method, followed by comparison of means (8, or 19%). Out of 25 studies using the ICC, only 7 (28%) reported the confidence intervals and types of ICC used. Most studies (71%) also tested the agreement of instruments. This study finds that the Intra-class Correlation Coefficient is the most popular method used to assess the reliability of medical instruments measuring continuous outcomes. There are also inappropriate applications and interpretations of statistical methods in some studies. It is important for medical researchers to be aware of this issue, and be able to correctly perform analysis in reliability studies.
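Because the review notes that few studies report which type of ICC they used, a small sketch of one specific form may help. It assumes the one-way random effects model with a complete table of ratings, and computes the single-rating and average-rating coefficients from the ANOVA mean squares. The function name `one_way_icc` and the data are hypothetical and not taken from any study listed here.

```python
def one_way_icc(ratings):
    """One-way random effects ICC for a complete subjects-by-ratings table.

    ratings: list of rows, one row per subject, each with k ratings.
    Returns (single-rating ICC, average-of-k-ratings ICC).
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 subjects with the same k >= 2 ratings")
    grand = sum(sum(row) for row in ratings) / (n * k)
    means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, means) for x in row
    ) / (n * (k - 1))
    if ms_between == 0:
        raise ValueError("ICC is undefined when all subject means are equal")
    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_average = (ms_between - ms_within) / ms_between
    return icc_single, icc_average


# Hypothetical scores: 3 subjects, 2 ratings each.
icc_single, icc_average = one_way_icc([[1, 2], [3, 4], [5, 6]])
```

Two-way models (with raters as a separate factor) give different values for the same data, which is why reporting the form matters.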
Chiang, Hsin-Yu; Lu, Wen-Shian; Yu, Wan-Hui; Hsueh, I-Ping; Hsieh, Ching-Lin
2018-04-11
To examine the interrater and intrarater reliability of the Balance Computerized Adaptive Test (Balance CAT) in patients with chronic stroke having a wide range of balance functions. Repeated assessments design (1wk apart). Seven teaching hospitals. A pooled sample (N=102) including 2 independent groups of outpatients (n=50 for the interrater reliability study; n=52 for the intrarater reliability study) with chronic stroke. Not applicable. Balance CAT. For the interrater reliability study, the values of intraclass correlation coefficient, minimal detectable change (MDC), and percentage of MDC (MDC%) for the Balance CAT were .84, 1.90, and 31.0%, respectively. For the intrarater reliability study, the values of intraclass correlation coefficient, MDC, and MDC% ranged from .89 to .91, from 1.14 to 1.26, and from 17.1% to 18.6%, respectively. The Balance CAT showed sufficient intrarater reliability in patients with chronic stroke having balance functions ranging from sitting with support to independent walking. Although the Balance CAT may have good interrater reliability, we found substantial random measurement error between different raters. Accordingly, if the Balance CAT is used as an outcome measure in clinical or research settings, same raters are suggested over different time points to ensure reliable assessments. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
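The relationship between the ICC and the minimal detectable change can be sketched as follows. This assumes the commonly used formulas SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM; the abstract does not state which formulas or which reference value for MDC% the authors used, so the function names and input values here are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and an ICC (assumed formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem


def mdc_percent(mdc, reference):
    """MDC as a percentage of a reference value (e.g. the mean score)."""
    return 100.0 * mdc / reference


# Hypothetical inputs: SD of 1.0 score unit and the interrater ICC of .84.
sem = sem_from_icc(1.0, 0.84)
mdc = mdc95(sem)
```

For a fixed SD, a lower ICC inflates the SEM and therefore the MDC, which is consistent with the larger MDC% reported for the interrater study than for the intrarater study.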
Gamba, Thiago O; Oliveira, Matheus L; Flores, Isadora L; Cruz, Adriana D; Almeida, Solange M; Haiter-Neto, Francisco; Lopes, Sérgio L P C
2014-03-01
To compare dental plaster model (DPM) and cone-beam computed tomography (CBCT) in the measurement of the dental arches, and investigate whether CBCT image artifacts compromise the reliability of such measurements. Twenty patients were divided into two groups based on the presence or absence of metallic restorations in the posterior teeth. Both dental arches of the patients were scanned with the CBCT unit i-CAT, and DPMs were obtained. Two examiners obtained eight arch measurements on the CBCT images and DPMs and repeated this procedure 15 days later. The arch measurements of each patient group were compared separately by the Wilcoxon rank sum (Mann-Whitney U) test, with a significance level of 5% (α = .05). Intraclass correlation measured the level of intraobserver agreement. Patients with healthy teeth showed no significant difference between all DPM and CBCT arch measurements (P > .05). Patients with metallic restoration showed significant difference between DPM and CBCT for the majority of the arch measurements (P > .05). The two examiners showed excellent intraobserver agreement for both measuring methods with intraclass correlation coefficient higher than 0.95. CBCT provided the same accuracy as DPM in the measurement of the dental arches, and was negatively influenced by the presence of image artifacts.
Wii Balance Board: Reliability and Clinical Use in Assessment of Balance in Healthy Elderly Women.
Monteiro-Junior, Renato Sobral; Ferreira, Arthur Sá; Puell, Vivian Neiva; Lattari, Eduardo; Machado, Sérgio; Otero Vaghetti, César Augusto; da Silva, Elirez Bezerra
2015-01-01
The force plate is considered the gold standard tool to assess body balance. However, the Wii Balance Board (WBB) platform is a trustworthy piece of equipment to assess stabilometric components in young people. Thus, we aim to examine the reliability of measures of center of pressure with WBB in healthy elderly women. Twenty-one healthy and physically active women were enrolled in the study (age: 64 ± 7 years; body mass index: 29 ± 5 kg/m2). The WBB was used to assess the center of pressure measures in the individuals. Pressure was linearly applied to different points to test the platform precision. Three assessments were performed, with two of them being held on the same day at a 5- to 10-minute interval, and the third one was performed 48 h later. A linear regression analysis was used to find out linearity, while the intraclass correlation coefficient was used to assess reliability. The platform precision was adequate (R2 = 0.997, P = 0.01). Center of pressure measures showed an excellent reliability (all intraclass correlation coefficient values were > 0.90; p < 0.01). The WBB is a precise and reliable tool for quantitative measurement of body stability in healthy active elderly women and its use should be encouraged in clinical settings.
Aykut, Aktas; Bumin, Degirmenci; Omer, Yilmaz; Mustafa, Kayan; Meltem, Cetin; Orhan, Celik; Nisa, Unlu; Hikmet, Orhan; Hakan, Demirtas; Mert, Koroglu
2015-09-01
The aim was to compare coronary high-definition CT (HDCT) with standard-definition CT (SDCT) angiography as to radiation dose, image quality and accuracy. Twenty-eight patients with a history of coronary artery disease were scanned by HDCT (Discovery CT750 HD) and SDCT (Somatom Definition AS). The scan modes were both axial prospective ECG-triggered. The vessel diameters and vessel attenuation values of a total of 280 measurements from 140 coronary arteries were analyzed by two experienced radiologists. All data were analyzed by the intraclass correlation test. Image quality, graded by motion and stair-step artifacts (grade 1, poor, to grade 4, excellent), and the accuracy of vessel inner and outer diameters were compared between the two CT units using the independent samples t-test and Mann-Whitney U test. The intraclass correlation coefficient (ICC) of measured vessel attenuation values in SDCT between the two radiologists was exceedingly good. The ICC was higher in HDCT. The radiation dose of HDCT was higher than that of SDCT. The mean tube current was 180 mA in HDCT and 147 mA in SDCT with the same tube voltage (kVp). There was no significant difference in image quality. HDCT has a higher radiation dose but provides much higher attenuation and spatial resolution, which improve measurement accuracy for imaging coronary arteries.
Reliability of the penetration aspiration scale with flexible endoscopic evaluation of swallowing.
Butler, Susan G; Markley, Lisa; Sanders, Brian; Stuart, Andrew
2015-06-01
The Penetration Aspiration Scale (PAS), although designed for videofluoroscopy, has been utilized with flexible endoscopic evaluation of swallowing (FEES) in both research and clinical practice. The purpose of this investigation was to determine inter- and intrarater reliability of the PAS with FEES as a function of clinician FEES experience and retest interval. Three groups of 3 clinicians (N=9) with varying FEES experience (beginning, intermediate, and advanced) assigned PAS scores to 35 swallows. Initial ratings were repeated following short-term (ie, 1 day) and long-term (ie, 1 week) retest intervals. Intraclass correlation coefficients were calculated to assess interrater reliability on the first rating for each group. The coefficients were .91, .82, and .89 for the beginning, intermediate, and advanced clinicians, respectively. Overall interrater reliability across all 9 clinicians, irrespective of experience, was .85. Intraclass correlation coefficients were also calculated to assess intrarater reliability. The intrarater reliability for short- and long-term ratings was .90, .94, and .96 and .96, .97, and .94 for the beginning, intermediate, and advanced clinicians, respectively. Overall intrarater reliability across all 9 clinicians and all 3 ratings was .94. Excellent inter- and intrarater reliability was evidenced with the application of the PAS for FEES regardless of clinician experience and retest interval. © The Author(s) 2015.
Chan, A K; Singogo, E; Changamire, R; Ratsma, Y E C; Tassie, J-M; Harries, A D
2012-06-21
Rapid scale-up of antiretroviral therapy (ART) has challenged the health system in Malawi to monitor large numbers of patients effectively. To compare two methods of determining retention on treatment: quarterly ART clinic data aggregation vs. pharmacy stock cards. Between October 2010 and March 2011, data on ART outcomes were extracted from monitoring tools at five facilities. Pharmacy data on ART consumption were extracted. Workload for each method was observed and timed. We used intraclass correlation and Bland-Altman plots to compare the agreeability of both methods to determine treatment retention. There is wide variability between ART clinic cohort data and pharmacy data to determine treatment retention due to divergence in data at sites with large numbers of patients. However, there is a non-significant trend towards agreeability between the two methods (intraclass correlation coefficient > 0.9; P > 0.05). Pharmacy stock card monitoring is more time-efficient than quarterly ART data aggregation (81 min vs. 573 min). In low-resource settings, pharmacy records could be used to improve drug forecasting and estimate ART retention in a more time-efficient manner than quarterly data aggregation; however, a necessary precondition would be capacity building around pharmacy data management, particularly for large-sized cohorts.
Yusoff, Nasir; Low, Wah Yun; Yip, Cheng-Har
2011-01-01
The main objective of this paper is to examine the psychometric properties of the Malay Version of the Hospital Anxiety and Depression Scale (HADS), tested on 67 husbands of the women who were diagnosed with breast cancer. The eligible husbands were retrieved from the Clinical Oncology Clinic at three hospitals in Kuala Lumpur, Malaysia. Data were collected at three weeks and ten weeks following surgery for breast cancer of their wives. The psychometric properties of the HADS were reported based on Cronbach's alpha, Intraclass Correlation Coefficients (ICC), Effect Size Index (ESI), sensitivity and discriminant validity of the scale. Internal consistency of the scale is excellent, with Cronbach's alpha of 0.88 for Anxiety subscale and 0.79 for Depression subscale. Test-retest Intraclass Correlation Coefficient (ICC) is 0.35 and 0.42 for Anxiety and Depression Subscale, respectively. Small mean differences were observed at test-retest measurement with ESI of 0.21 for Anxiety and 0.19 for Depression. Non-significant result was revealed for the discriminant validity (mastectomy vs lumpectomy). The Malay Version of the HADS is appropriate to measure the anxiety and depression among the husbands of the women with breast cancer in Malaysia.
Ramrit, Sirinun; Yonglitthipagon, Ponlapat; Janyacharoen, Taweesak; Emasithi, Alongkot; Siritaratiwat, Wantana
2017-05-01
The aim of this study was to investigate the reliability of the Thai Gross Motor Function Classification System Family Report Questionnaire (GMFCS-FR) and the possibility of special-education teachers and caregivers in the community using this system in children with cerebral palsy (CP). The reliability was examined by two teachers and two caregivers who classified 21 children with CP aged 2 to 12 years. A GMFCS-FR workshop was organized for raters. The teachers and caregivers classified the mobility of 362 children. The rater reliability was analysed using the weighted kappa coefficient. The possibility of using the GMFCS-FR is reported. The reliability of using the GMFCS-FR in the community was analysed by the intraclass correlation coefficient. The intrarater reliability ranged from 0.91 to 1.00. The interrater reliability between teachers was 0.85 (95% confidence interval [CI] 0.69-0.97) and between caregivers was 0.84 (95% CI 0.70-0.97). Ninety-seven percent of raters used the Thai GMFCS-FR correctly. The overall intraclass correlation coefficient between raters was 0.90 (95% CI 0.88-0.92). The Thai GMFCS-FR is a reliable system for classifying the motor function of young children with CP by teachers and caregivers in the community. © 2016 Mac Keith Press.
Reliability and validity of current physical examination techniques of the foot and ankle.
Wrobel, James S; Armstrong, David G
2008-01-01
This literature review was undertaken to evaluate the reliability and validity of the orthopedic, neurologic, and vascular examination of the foot and ankle. We searched PubMed-the US National Library of Medicine's database of biomedical citations-and abstracts for relevant publications from 1966 to 2006. We also searched the bibliographies of the retrieved articles. We identified 35 articles to review. For discussion purposes, we used reliability interpretation guidelines proposed by others. For the kappa statistic that calculates reliability for dichotomous (eg, yes or no) measures, reliability was defined as moderate (0.4-0.6), substantial (0.6-0.8), and outstanding (> 0.8). For the intraclass correlation coefficient that calculates reliability for continuous (eg, degrees of motion) measures, reliability was defined as good (> 0.75), moderate (0.5-0.75), and poor (< 0.5). Intraclass correlations, based on the various examinations performed, varied widely. The range was from 0.08 to 0.98, depending on the examination performed. Concurrent and predictive validity ranged from poor to good. Although hundreds of articles exist describing various methods of lower-extremity assessment, few rigorously assess the measurement properties. This information can be used both by the discerning clinician in the art of clinical examination and by the scientist in the measurement properties of reproducibility and validity.
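For the dichotomous case described in this abstract, the kappa statistic can be sketched as below. This assumes Cohen's unweighted kappa for two raters; the review does not specify which kappa variant each included study used, and the function name `cohen_kappa` and the ratings are hypothetical.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's unweighted kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal proportions.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical yes/no findings (1 = present, 0 = absent) on four feet.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Under the interpretation guidelines quoted in the abstract, the resulting value of 0.5 would fall in the moderate band (0.4-0.6).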
Intra-class correlation estimates for assessment of vitamin A intake in children.
Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D
2005-03-01
In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variation inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variances were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
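The variance inflation described in this abstract is often summarised by the design effect. A minimal sketch follows, assuming the standard single-level formula 1 + (m − 1) × ICC for equal cluster size m; the study itself used a three-level model, so this is a simplification, and the function names and the total sample size are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(total_n, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return total_n / design_effect(cluster_size, icc)


# Illustration with 16 households per village and the two village-level
# ICC extremes quoted in the abstract; total_n = 3200 is hypothetical.
deff_low = design_effect(16, 0.014)
deff_high = design_effect(16, 0.45)
n_eff = effective_sample_size(3200, 16, 0.014)
```

Even a small ICC inflates the variance noticeably once clusters are large, which is the point the abstract makes about large clusters.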
A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research.
Koo, Terry K; Li, Mae Y
2016-06-01
Intraclass correlation coefficient (ICC) is a widely used reliability index in test-retest, intrarater, and interrater reliability analyses. This article introduces the basic concept of ICC in the context of reliability analysis. There are 10 forms of ICCs. Because each form involves distinct assumptions in their calculation and will lead to different interpretations, researchers should explicitly specify the ICC form they used in their calculation. A thorough review of the research design is needed in selecting the appropriate form of ICC to evaluate reliability. The best practice of reporting ICC should include software information, "model," "type," and "definition" selections. When coming across an article that includes ICC, readers should first check whether information about the ICC form has been reported and if an appropriate ICC form was used. Based on the 95% confidence interval of the ICC estimate, values less than 0.5, between 0.5 and 0.75, between 0.75 and 0.9, and greater than 0.90 are indicative of poor, moderate, good, and excellent reliability, respectively. This article provides a practical guideline for clinical researchers to choose the correct form of ICC and suggests the best practice of reporting ICC parameters in scientific publications. This article also gives readers an appreciation for what to look for when coming across ICC while reading an article.
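The interpretation bands in this abstract can be written as a small lookup. The sketch below applies them to both bounds of a 95% confidence interval, as the guideline suggests. The abstract does not say which band the exact boundary values belong to, so the handling of 0.5, 0.75 and 0.9 here is an assumption, and the function names are hypothetical.

```python
def interpret_icc(value):
    """Map a single ICC value to the reliability band quoted in the abstract."""
    if value < 0.5:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value <= 0.9:  # assumption: exactly 0.9 is treated as "good"
        return "good"
    return "excellent"


def interpret_icc_interval(lower, upper):
    """Return the bands for the lower and upper bounds of a 95% CI."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return interpret_icc(lower), interpret_icc(upper)


# Example: an interval of 0.61 to 0.88 spans two bands.
bands = interpret_icc_interval(0.61, 0.88)
```

Reading the interval rather than the point estimate avoids calling a result good when its lower bound is only moderate.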
Agreement in functional assessment: graphic approaches to displaying respondent effects.
Haley, Stephen M; Ni, Pengsheng; Coster, Wendy J; Black-Schaffer, Randie; Siebens, Hilary; Tao, Wei
2006-09-01
The objective of this study was to examine the agreement between respondents of summary scores from items representing three functional content areas (physical and mobility, personal care and instrumental, applied cognition) within the Activity Measure for Postacute Care (AM-PAC). We compare proxy vs. patient report in both hospital and community settings as represented by intraclass correlation coefficients and two graphic approaches. The authors conducted a prospective, cohort study of a convenience sample of adults (n = 47) receiving rehabilitation services either in hospital (n = 31) or community (n = 16) settings. In addition to using intraclass correlation coefficients (ICC) as indices of agreement, we applied two graphic approaches to serve as complements to help interpret the direction and magnitude of respondent disagreements. We created a "mountain plot" based on a cumulative distribution curve and a "survival-agreement plot" with step functions used in the analysis of survival data. ICCs on summary scores between patient and proxy report were physical and mobility ICC = 0.92, personal care and instrumental ICC = 0.93, and applied cognition ICC = 0.77. Although combined respondent agreement was acceptable, graphic approaches helped interpret differences in separate analyses of clinician and family agreement. Graphic analyses allow for a simple interpretation of agreement data and may be useful in determining the meaningfulness of the amount and direction of interrespondent variation.
Developing a Danish version of the "Impact on Participation and Autonomy Questionnaire".
Ghaziani, Emma; Krogh, Anne Grethe; Lund, Hans
2013-05-01
To translate the "Impact on Participation and Autonomy Questionnaire" into Danish (IPAQ-DK), and estimate its internal consistency and test-retest reliability in order to promote participation-based interventions and research. Translation and two successive reliability assessments through test-retest. 137 adults with varying degrees of impairment; of these, 67 participated in the final reliability assessment. The translation followed guidelines set forth by the "European Group for Quality of Life Assessment and Health Measurement". Internal consistency for subscales was estimated by Cronbach's alpha. Weighted kappa coefficients and intraclass correlation coefficients were calculated to assess the test-retest reliability at item and subscale level, respectively. A preliminary reliability assessment revealed residual issues regarding the translation and cultural adaptation of the instrument. The revised version (IPAQ-DK) was subsequently subjected to a similar assessment demonstrating Cronbach's alpha values from 0.698 to 0.817. Weighted kappa ranged from 0.370 to 0.880; 78% of these values were higher than 0.600. The intraclass correlation coefficient covered values from 0.701 to 0.818. IPAQ-DK is a useful instrument for identifying person-perceived participation restrictions and satisfaction with participation. Further studies of IPAQ-DK's floor/ceiling effects and responsiveness to change are recommended, and whether there is a need for further linguistic improvement of certain items.
Wells, Michael L; Moynagh, Michael R; Carter, Rickey E; Childs, Robert A; Leitch, Cameron E; Fletcher, Joel G; Yeh, Benjamin M; Venkatesh, Sudhakar K
2017-01-01
To compare MR hepatic fractional extracellular space (fECS) to liver stiffness (LS) with magnetic resonance elastography (MRE) for evaluation of liver fibrosis. 71 consecutive patients with suspected chronic liver disease underwent standard liver MRI with MR elastography and additional delayed Gd-DTPA-enhanced sequences at 5 and 10 min in order to calculate hepatic fECS (%) and LS (kilopascals, kPa). Two radiologists blinded to clinical history examined MR images and calculated fECS and LS in identical locations for every patient. Interobserver agreement was calculated using the intraclass correlation coefficient. Pearson's correlation was calculated for LS and fECS measures, as was the area under the receiver operating characteristic curve (AUROC), sensitivity and specificity of fECS to predict liver stiffness ≥2.93 and ≥5 kPa. The sensitivity of fECS for detecting fibrosis was separately analyzed in the subgroup of patients without anatomic findings of cirrhosis. Substantial to excellent interobserver agreement for both LS and fECS measurements was seen with intraclass correlation of 0.88 (95% CI 0.81-0.92) for LS, 0.77 (95% CI 0.66-0.85) for fECS 5 and 0.76 (95% CI 0.64-0.84) for fECS 10. A significant correlation was found between MRE and fECS 5 (r = 0.47, p < 0.0001) and fECS 10 (r = 0.44, p < 0.0001). The performance of fECS improved for detection of advanced fibrosis (≥5 kPa) with AUROC, sensitivity and specificity of 0.72, 38%, and 94% for fECS 5 and 0.72, 67%, and 66% for fECS 10. fECS correlates modestly with MRE-determined LS. fECS at MRI is a simple calculation to perform and may represent a practical way to suggest the presence of fibrosis during routine liver evaluation.
Karaca, Irmak; Yilmaz, Suzan Guven; Palamar, Melis; Ates, Halil
2017-07-03
To investigate the correlation of Scheimpflug camera system and two noncontact specular microscopes in terms of central corneal thickness (CCT) and corneal endothelial cell morphology measurements. One hundred eyes of 50 healthy subjects were examined by Pentacam Scheimpflug Analyzer, CEM-530 (Nidek Co, Ltd, Gamagori, Japan) and CellChek XL (Konan Medical, California, USA) via fully automated image analysis with no corrections made. Measurement differences and agreement between instruments were determined by intraclass correlation analysis. The mean age of the subjects was 36.74 ± 8.59 (range 22-57). CCTs were well correlated among all devices, with CEM-530 giving the thinnest and CellChek XL the thickest measurements (intraclass correlation coefficient (ICC) = 0.83; p < 0.001 and ICC = 0.78; p < 0.001, respectively). Mean endothelial cell density (ECD) given by CEM-530 was lower than that given by CellChek XL (2613.17 ± 228.62 and 2862.72 ± 170.42 cells/mm², respectively; ICC = 0.43; p < 0.001). Mean value for coefficient of variation (CV) was 28.57 ± 3.61 in CEM-530 and 30.30 ± 3.53 in CellChek XL. Cell hexagonality (HEX) with CEM-530 was higher than with CellChek XL (68.70 ± 4.16% and 45.19 ± 6.58%, respectively). ECDs with CellChek XL and CEM-530 have good correlation, but the values obtained by CellChek XL are higher than those obtained by CEM-530. Measurements for HEX and CV differ significantly and show weak correlation. Thus, we do not recommend interchangeable use of CellChek XL and CEM-530. In terms of CCTs, Pentacam, CEM-530 and CellChek XL specular microscopy instruments are reliable devices.
DeMoor, Stephanie; Abdel-Rehim, Shady; Olmsted, Richard; Myers, John G; Parker-Raley, Jessica
2017-07-01
Nontechnical skills (NTS), such as team communication, are well-recognized determinants of trauma team performance and good patient care. Measuring these competencies during trauma resuscitations is essential, yet few valid and reliable tools are available. We aimed to demonstrate that the Trauma Team Communication Assessment (TTCA-24) is a valid and reliable instrument that measures communication effectiveness during activations. Two tools with adequate psychometric strength (Trauma Nontechnical Skills Scale [T-NOTECHS], Team Emergency Assessment Measure [TEAM]) were identified during a systematic review of medical literature and compared with TTCA-24. Three coders used each tool to evaluate 35 stable and 35 unstable patient activations (defined according to Advanced Trauma Life Support criteria). Interrater reliability was calculated between coders using the intraclass correlation coefficient. Spearman rank correlation coefficient was used to establish concurrent validity between TTCA-24 and the other two validated tools. Coders achieved an intraclass correlation coefficient of 0.87 for stable patient activations and 0.78 for unstable activations, scoring excellent on the interrater agreement guidelines. The median score for each assessment showed good team communication for all 70 videos (TEAM, 39.8 of 54; T-NOTECHS, 17.4 of 25; and TTCA-24, 87.4 of 96). A significant correlation between TTCA-24 and T-NOTECHS was revealed (p = 0.029), but no significant correlation between TTCA-24 and TEAM (p = 0.77). Team communication was rated slightly better across all assessments for stable versus unstable patient activations, but the difference was not statistically significant. TTCA-24 correlated with T-NOTECHS, an instrument measuring nontechnical skills for trauma teams, but not TEAM, a tool that assesses communication in generic emergency settings. TTCA-24 is a reliable and valid assessment that can be a useful adjunct when evaluating interpersonal and team communication during trauma activations. Diagnostic tests or criteria, level II.
Csizmadi, Ilona; Neilson, Heather K.; Kopciuk, Karen A.; Khandwala, Farah; Liu, Andrew; Friedenreich, Christine M.; Yasui, Yutaka; Rabasa-Lhoret, Rémi; Bryant, Heather E.; Lau, David C. W.; Robson, Paula J.
2014-01-01
We determined measurement properties of the Sedentary Time and Activity Reporting Questionnaire (STAR-Q), which was designed to estimate past-month activity energy expenditure (AEE). STAR-Q validity and reliability were assessed in 102 adults in Alberta, Canada (2009–2011), who completed 14-day doubly labeled water (DLW) protocols, 7-day activity diaries on day 15, and the STAR-Q on day 14 and again at 3 and 6 months. Three-month reliability was substantial for total energy expenditure (TEE) and AEE (intraclass correlation coefficients of 0.84 and 0.73, respectively), while 6-month reliability was moderate. STAR-Q-derived TEE and AEE were moderately correlated with DLW estimates (Spearman's ρs of 0.53 and 0.40, respectively; P < 0.001), and on average, the STAR-Q overestimated TEE and AEE (median differences were 367 kcal/day and 293 kcal/day, respectively). Body mass index-, age-, sex-, and season-adjusted concordance correlation coefficients (CCCs) were 0.24 (95% confidence interval (CI): 0.07, 0.36) and 0.21 (95% CI: 0.11, 0.32) for STAR-Q-derived versus DLW-derived TEE and AEE, respectively. Agreement between the diaries and STAR-Q (metabolic equivalent-hours/day) was strongest for occupational sedentary time (adjusted CCC = 0.76, 95% CI: 0.64, 0.85) and overall strenuous activity (adjusted CCC = 0.64, 95% CI: 0.49, 0.76). The STAR-Q demonstrated substantial validity for estimating occupational sedentary time and strenuous activity and fair validity for ranking individuals by AEE. PMID:25038920
Patellar Skin Surface Temperature by Thermography Reflects Knee Osteoarthritis Severity
Denoble, Anna E.; Hall, Norine; Pieper, Carl F.; Kraus, Virginia B.
2010-01-01
Background: Digital infrared thermal imaging is a means of measuring the heat radiated from the skin surface. Our goal was to develop and assess the reproducibility of serial infrared measurements of the knee and to assess the association of knee temperature by region of interest with radiographic severity of knee Osteoarthritis (rOA). Methods: A total of 30 women (15 Cases with symptomatic knee OA and 15 age-matched Controls without knee pain or knee OA) participated in this study. Infrared imaging was performed with a Meditherm Med2000™ Pro infrared camera. The reproducibility of infrared imaging of the knee was evaluated through determination of intraclass correlation coefficients (ICCs) for temperature measurements from two images performed 6 months apart in Controls whose knee status was not expected to change. The average cutaneous temperature for each of five knee regions of interest was extracted using WinTes software. Knee x-rays were scored for severity of rOA based on the global Kellgren-Lawrence grading scale. Results: The knee infrared thermal imaging procedure used here demonstrated long-term reproducibility with high ICCs (0.50–0.72 for the various regions of interest) in Controls. Cutaneous temperature of the patella (knee cap) yielded a significant correlation with severity of knee rOA (R = 0.594, P = 0.02). Conclusion: The skin temperature of the patellar region correlated with x-ray severity of knee OA. This method of infrared knee imaging is reliable and as an objective measure of a sign of inflammation, temperature, indicates an interrelationship of inflammation and structural knee rOA damage. PMID:21151853
Patellar skin surface temperature by thermography reflects knee osteoarthritis severity.
Denoble, Anna E; Hall, Norine; Pieper, Carl F; Kraus, Virginia B
2010-10-15
Digital infrared thermal imaging is a means of measuring the heat radiated from the skin surface. Our goal was to develop and assess the reproducibility of serial infrared measurements of the knee and to assess the association of knee temperature by region of interest with radiographic severity of knee Osteoarthritis (rOA). A total of 30 women (15 Cases with symptomatic knee OA and 15 age-matched Controls without knee pain or knee OA) participated in this study. Infrared imaging was performed with a Meditherm Med2000™ Pro infrared camera. The reproducibility of infrared imaging of the knee was evaluated through determination of intraclass correlation coefficients (ICCs) for temperature measurements from two images performed 6 months apart in Controls whose knee status was not expected to change. The average cutaneous temperature for each of five knee regions of interest was extracted using WinTes software. Knee x-rays were scored for severity of rOA based on the global Kellgren-Lawrence grading scale. The knee infrared thermal imaging procedure used here demonstrated long-term reproducibility with high ICCs (0.50-0.72 for the various regions of interest) in Controls. Cutaneous temperature of the patella (knee cap) yielded a significant correlation with severity of knee rOA (R = 0.594, P = 0.02). The skin temperature of the patellar region correlated with x-ray severity of knee OA. This method of infrared knee imaging is reliable and as an objective measure of a sign of inflammation, temperature, indicates an interrelationship of inflammation and structural knee rOA damage.
Boer, Annemarie; Dutmer, Alisa L; Schiphorst Preuper, Henrica R; van der Woude, Lucas H V; Stewart, Roy E; Deyo, Richard A; Reneman, Michiel F; Soer, Remko
2017-10-01
Validation study with cross-sectional and longitudinal measurements. To translate the US National Institutes of Health (NIH)-minimal dataset for clinical research on chronic low back pain into the Dutch language and to test its validity and reliability among people with chronic low back pain. The NIH developed a minimal dataset to encourage more complete and consistent reporting of clinical research and to be able to compare studies across countries in patients with low back pain. In the Netherlands, the NIH-minimal dataset has not been translated before and measurement properties are unknown. Cross-cultural validity was tested by a formal forward-backward translation. Structural validity was tested with exploratory factor analyses (comparative fit index, Tucker-Lewis index, and root mean square error of approximation). Hypothesis testing was performed to compare subscales of the NIH dataset with the Pain Disability Index and the EuroQol-5D (Pearson correlation coefficients). Internal consistency was tested with Cronbach α and test-retest reliability at 2 weeks was calculated in a subsample of patients with Intraclass Correlation Coefficients and weighted Kappa (κω). In total, 452 patients were included, of whom 52 were included for the test-retest study. Factor analysis for structural validity pointed toward a seven-factor model (Cronbach α = 0.78). Factors and total score of the NIH-minimal dataset showed fair to good correlations with Pain Disability Index (r = 0.43-0.70) and EuroQol-5D (r = -0.41 to -0.64). Reliability: test-retest reliability per item showed substantial agreement (κω=0.65). Test-retest reliability per factor was moderate to good (Intraclass Correlation Coefficient = 0.71). The measurement properties of the Dutch language version of the NIH-minimal dataset were satisfactory.
Martinez-Vega, Ingrid Patricia; Doubova, Svetlana V; Aguirre-Hernandez, Rebeca; Infante-Castañeda, Claudia
2016-03-02
The aim of this study was to adapt and validate the Distress Scale for Mexican patients with type 2 diabetes and hypertension (DSDH17M). Two family medicine clinics affiliated with the Mexican Institute of Social Security. 722 patients with type 2 diabetes and/or hypertension (235 patients with diabetes, 233 patients with hypertension and 254 patients with both diseases). A cross-sectional survey. The validation procedures included: (1) content validity using a group of experts, (2) construct validity from exploratory factor analysis, (3) internal consistency using Cronbach's α, (4) convergent validity between DSDH17M and anxiety and depression using the Spearman correlation coefficient, (5) discriminative validity through the Wilcoxon rank-sum test and (6) test-retest reliability using intraclass correlation coefficient. The DSDH17M has 17 items and three factors explaining 67% of the total variance. Cronbach α ranged from 0.83 to 0.91 among factors. The first factor of 'Regime-related Distress and Emotional Burden' moderately correlated with anxiety and depression scores. Discriminative validity revealed that patients with obesity, those with stressful events and those who did not adhere to pharmacological treatment had significantly higher distress scores in all DSDH17M domains. Test-retest intraclass correlation coefficient for DSDH17M ranged from 0.92 to 0.97 among factors. DSDH17M is a valid and reliable tool to identify distress of patients with type 2 diabetes and hypertension. Published by the BMJ Publishing Group Limited.
Doubova, Svetlana V; Aguirre-Hernandez, Rebeca; Gutiérrez-de la Barrera, Marcos; Infante-Castañeda, Claudia; Pérez-Cuevas, Ricardo
2015-09-01
The purpose of this study is to validate the Mexican version of the Short-Form Supportive Care Needs survey (SCNS-SFM). A cross-sectional survey was conducted from June to December 2013 at the Oncology Hospital of the Mexican Institute of Social Security in Mexico City. The study included 825 subsequent cancer patients >20 years of age with all forms of solid cancer. Patients had prior surgical removal of histologically confirmed cancer and attended outpatient consultations. Validation of SCNS-SFM included the following: (1) content validity through a group of experts; (2) construct validity through an exploratory factor analysis based on the polychoric correlation matrix; (3) internal consistency using Cronbach's alpha; (4) convergent validity between SCNS-SFM and quality of life, anxiety, and depression scales by calculating Pearson's correlation coefficient; (5) discriminative validity through analysis of MANOVAs; and (6) test-retest reliability using intraclass correlation coefficient calculations. SCNS-SFM has 33 items with five factors accounting for 59% of total variance. Cronbach's alpha values ranged from 0.78 to 0.90 among factors. SCNS-SFM has good convergent validity compared with quality of life and depression and anxiety scales and good discriminative validity, revealing great information, psychological support, and physical daily living needs for women, patients <60 years, and high physical daily living needs for those with <1 year since cancer diagnosis, with advanced disease stages and current chemo- or radiotherapy. Intraclass correlation coefficient between SCNS-SFM measurements was 0.9. SCNS-SFM has acceptable psychometric properties and is suitable to evaluate supportive care needs of cancer patients.
Koehorst, Marije L S; van Trijffel, Emiel; Lindeboom, Robert
2014-08-01
Clinical measurement, longitudinal. To assess the test-retest reliability, construct validity, and responsiveness of the Patient-Specific Functional Scale (PSFS) in patients with a primary shoulder complaint. Health measurement outcomes have become increasingly important for evaluating treatment. Patient-specific questionnaires are useful tools for determining treatment goals and evaluating treatment in individual patients. These questionnaires have not yet been validated in patients with nonspecific shoulder pain. Patients completed the PSFS, the numeric pain rating scale, and the Shoulder Pain and Disability Index at baseline, and after 1 week and 4 to 6 weeks. Test-retest reliability was determined using intraclass correlation coefficients. To assess convergent validity, change scores of the PSFS were correlated with the numeric pain rating scale and Shoulder Pain and Disability Index change scores. Responsiveness was assessed by calculating the area under the curve, the minimal clinically important change, and minimal detectable change, using the global rating of change as an external criterion. Fifty patients (37 men; mean age, 47.7 years) participated in the study. Reliability was high (intraclass correlation coefficient = 0.87; 95% confidence interval [CI]: 0.72, 0.94). The correlations between the change scores of the PSFS and those of the Shoulder Pain and Disability Index and numeric pain rating scale were 0.45 (95% CI: 0.17, 0.80) and 0.55 (95% CI: 0.29, 0.73), respectively. The area under the curve for the PSFS was 0.67 (95% CI: 0.51, 0.83). The minimal detectable change and minimal clinically important change were 0.97 and 1.29 points, respectively. These results suggest that the PSFS is a reliable, valid, and responsive instrument that can be used as an evaluative instrument in patients with a primary shoulder complaint.
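The abstract reports a minimal detectable change without giving the formula behind it. One commonly used approach derives it from the standard error of measurement, which in turn depends on the ICC. The sketch below shows that approach under the assumption of a 95% confidence level; the function names and the input values are hypothetical, and the study may have used a different definition.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), with SD the between-subject standard deviation."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.96):
    """MDC = z * sqrt(2) * SEM; z = 1.96 corresponds to a 95% level (assumed here)."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs, not taken from the study.
example_sem = standard_error_of_measurement(sd=2.0, icc=0.75)
example_mdc = minimal_detectable_change(sd=2.0, icc=0.75)
```

The factor sqrt(2) appears because a change score is the difference of two measurements, each carrying its own measurement error.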
Converting positive and negative symptom scores between PANSS and SAPS/SANS.
van Erp, Theo G M; Preda, Adrian; Nguyen, Dana; Faziola, Lawrence; Turner, Jessica; Bustillo, Juan; Belger, Aysenil; Lim, Kelvin O; McEwen, Sarah; Voyvodic, James; Mathalon, Daniel H; Ford, Judith; Potkin, Steven G; Fbirn
2014-01-01
The Scale for the Assessment of Positive Symptoms (SAPS), the Scale for the Assessment of Negative Symptoms (SANS), and the Positive and Negative Syndrome Scale for Schizophrenia (PANSS) are the most widely used schizophrenia symptom rating scales, but despite their co-existence for 25 years no easily usable between-scale conversion mechanism exists. The aim of this study was to provide equations for between-scale symptom rating conversions. Two-hundred-and-five schizophrenia patients [mean age±SD=39.5±11.6, 156 males] were assessed with the SANS, SAPS, and PANSS. Pearson's correlations between symptom scores from each of the scales were computed. Linear regression analyses, on data from 176 randomly selected patients, were performed to derive equations for converting ratings between the scales. Intraclass correlations, on data from the remaining 29 patients, not part of the regression analyses, were performed to determine rating conversion accuracy. Between-scale positive and negative symptom ratings were highly correlated. Intraclass correlations between the original positive and negative symptom ratings and those obtained via conversion of alternative ratings using the conversion equations were moderate to high (ICCs=0.65 to 0.91). Regression-based equations may be useful for conversion between schizophrenia symptom severity as measured by the SANS/SAPS and PANSS, though additional validation is warranted. This study's conversion equations, implemented at http://converteasy.org, may aid in the comparison of medication efficacy studies, in meta- and mega-analyses examining symptoms as moderator variables, and in retrospective combination of symptom data in multi-center data sharing projects that need to pool symptom rating data when such data are obtained using different scales. Copyright © 2013 Elsevier B.V. All rights reserved.
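The derive-then-validate design described in this abstract (fit a linear conversion on 176 patients, check agreement on the remaining 29) can be sketched as follows. Everything here is an assumption for illustration: the scores are simulated, the true relation (slope 0.5, intercept 2) is invented, and a one-way single-measure ICC is used because the abstract does not specify the ICC form. None of the numbers correspond to the published conversion equations.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def icc_one_way(pairs):
    """One-way random-effects, single-measure ICC for paired values."""
    n, k = len(pairs), 2
    grand = sum(a + b for a, b in pairs) / (n * k)
    means = [(a + b) / k for a, b in pairs]
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, means)) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def simulate_and_validate(seed=0):
    """Simulate 205 paired ratings, fit on 176, validate on the other 29."""
    rng = random.Random(seed)
    scale_a = [rng.uniform(7, 35) for _ in range(205)]            # simulated scores
    scale_b = [0.5 * a + 2.0 + rng.gauss(0, 1) for a in scale_a]  # invented relation
    idx = list(range(205))
    rng.shuffle(idx)                                              # random split
    train, held_out = idx[:176], idx[176:]
    slope, intercept = fit_line([scale_a[i] for i in train],
                                [scale_b[i] for i in train])
    # Compare observed scale_b ratings with ratings converted from scale_a.
    pairs = [(scale_b[i], slope * scale_a[i] + intercept) for i in held_out]
    return slope, intercept, icc_one_way(pairs)


slope, intercept, held_out_icc = simulate_and_validate()
```

Evaluating agreement only on patients excluded from the regression avoids the optimism that comes from testing a conversion on the same data that produced it.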
Yin, Xiaoming; Guo, Yang; Li, Weiguo; Huo, Eugene; Zhang, Zhuoli; Nicolai, Jodi; Kleps, Robert A.; Hernando, Diego; Katsaggelos, Aggelos K.; Omary, Reed A.
2012-01-01
Purpose: To demonstrate the feasibility of using chemical shift magnetic resonance (MR) imaging fat-water separation methods for quantitative estimation of transcatheter lipiodol delivery to liver tissues. Materials and Methods: Studies were performed in accordance with institutional Animal Care and Use Committee guidelines. Proton nuclear MR spectroscopy was first performed to identify lipiodol spectral peaks and relative amplitudes. Next, phantoms were constructed with increasing lipiodol-water volume fractions. A multiecho chemical shift–based fat-water separation method was used to quantify lipiodol concentration within each phantom. Six rats served as controls; 18 rats underwent catheterization with digital subtraction angiography guidance for intraportal infusion of a 15%, 30%, or 50% by volume lipiodol-saline mixture. MR imaging measurements were used to quantify lipiodol delivery to each rat liver. Lipiodol concentration maps were reconstructed by using both single-peak and multipeak chemical shift models. Intraclass and Spearman correlation coefficients were calculated for statistical comparison of MR imaging–based lipiodol concentration and volume measurements to reference standards (known lipiodol phantom compositions and the infused lipiodol dose during rat studies). Results: Both single-peak and multipeak measurements were well correlated to phantom lipiodol concentrations (r² > 0.99). Lipiodol volume measurements were progressively and significantly higher when comparing between animals receiving different doses (P < .05 for each comparison). MR imaging–based lipiodol volume measurements strongly correlated with infused dose (intraclass correlation coefficients > 0.93, P < .001) with both single- and multipeak approaches. Conclusion: Chemical shift MR imaging fat-water separation methods can be used for quantitative measurements of lipiodol delivery to liver tissues. © RSNA, 2012 PMID:22623693
Ando, Yukako; Kataoka, Tsuyoshi; Okamura, Hitoshi; Tanaka, Katsutoshi; Kobayashi, Toshio
2013-12-01
The purpose of this research is to verify the reliability and validity of a job stressor scale for nurses caring for patients with intractable neurological diseases. A mail survey was conducted using a self-report questionnaire. The subjects were 263 nurses and assistant nurses working in wards specializing in intractable neurological diseases. The response rate was 71.9% (valid response rate, 66.2%). With regard to reliability, internal consistency and stability were assessed. Internal consistency was examined via Cronbach's alpha. For stability, the test-retest method was performed and stability was examined via intraclass correlation coefficients. With regard to validity, factor validity, criterion-related validity, and content validity were assessed. Exploratory factor analysis was used for factor validity. For criterion-related validity, an existing scale was used as an external criterion; concurrent validity was examined via Spearman's rank correlation coefficients. The analysis yielded a scale of 26 items with an eight-factor structure. Cronbach's alpha for the 26 items was 0.90; with the exception of two factors, alpha for all of the individual sub-factors was high at 0.7 or higher. The intraclass correlation coefficient for the 26 items was 0.89 (p < 0.001). With regard to criterion-related validity, concurrent validity was confirmed and the correlation coefficient with an external criterion was 0.73 (p < 0.001). For content validity, subjects who responded that "The questionnaire represents a stressor well or to a degree" accounted for 81% of the total responses. Reliability and validity were confirmed, so the scale created in the current research is a usable scale.
Mohammadifard, Noushin; Omidvar, Nasrin; Houshiarrad, Anahita; Neyestani, Tirang; Naderi, Gholam-Ali; Soleymani, Bahram
2011-01-01
BACKGROUND: This study's aim was to design and validate a semi-quantitative food frequency questionnaire (FFQ) for assessment of fruits and vegetables (FV) consumption in adults of Isfahan by comparing the FFQ with dietary reference method and blood plasma levels of beta-carotene, vitamin C, and retinol. METHODS: This validation study was performed on 123 healthy adults of Isfahan. FV intake was assessed using a 110-item FFQ. Data collection was performed during two different time periods to control for seasonal effects, fall/winter (cold season) and spring/summer (warm season). In each phase a FFQ and 1 day recall, and 2 days of food records as the dietary reference method were completed and plasma vitamin C, beta-carotene and retinol were measured. Data were analyzed by Pearson or Spearman and intraclass correlations. RESULTS: Serum lipids, sex, age, body mass index (BMI) and educational level adjusted Pearson correlation coefficient of FV with plasma vitamin C, beta-carotene and retinol were 0.55, 0.47 and 0.28 in the cold season (p < 0.05) and 0.52, 0.45 and 0.35 in the warm season (p < 0.001), respectively. Energy and fat intake, sex, age, BMI and educational level adjusted Pearson correlation coefficient for FV with dietary reference method in the cold and warm seasons were 0.62 and 0.60, respectively (p < 0.001). Intraclass correlation for reproducibility of FFQ in FV was 0.65 (p < 0.001). CONCLUSIONS: The designed FFQ had a good criterion validity and reproducibility for assessment of FV intake. Thus, it can serve as a valid tool in epidemiological studies to assess fruit and vegetable intake. PMID:22973322
Kurre, Annette; van Gool, Christel J A W; Bastiaenen, Caroline H G; Gloor-Juzi, Thomas; Straumann, Dominik; de Bruin, Eling D
2009-04-01
To translate the Dizziness Handicap Inventory into German (DHI-G) and investigate reliability, assess the association between selected items of the University of California Los Angeles Dizziness Questionnaire and the DHI-G, and compare the scores of patients and healthy participants. Cross-sectional design. Tertiary center for vertigo, dizziness, or balance disorders. One hundred forty-one patients with vertigo, dizziness, and unsteadiness associated with a vestibular disorder, with a mean age (standard deviation) of 51.5 (13.2) years, and 52 healthy individuals participated. Fourteen patients participated in the cognitive debriefing; 127 patients completed the questionnaires once or twice within 1 week. The DHI-G assesses disability caused by dizziness and unsteadiness; the items of the University of California Los Angeles Dizziness Questionnaire assess dizziness and impact on everyday activities. Internal consistency was estimated using Cronbach alpha, reproducibility by calculating Bland-Altman limits of agreement and intraclass correlation coefficients. Associations were estimated by Spearman correlation coefficients. Patients filled out the DHI-G without problem and found that their self-perceived disabilities were mostly included. Cronbach alpha values for the DHI-G and the functional, physical, and emotional subscales were 0.90, 0.80, 0.71, and 0.82, respectively. The limits of agreement were ±12.4 points for the total scale (maximum, 100 points). Intraclass correlation coefficients ranged from 0.90 to 0.95. The DHI-G correlated moderately with the question assessing functional disability (0.56) and fairly with the questions quantifying dizziness (0.43, 0.35). The DHI-G discriminated significantly between healthy participants and patients. The DHI-G demonstrated good reliability and is recommended as a measure of disability in patients with dizziness and unsteadiness.
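For readers unfamiliar with the Bland-Altman limits of agreement mentioned in this abstract, the sketch below shows the usual calculation: the mean test-retest difference plus or minus 1.96 standard deviations of the differences. The function name and the toy scores are hypothetical and unrelated to the DHI-G data.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest, z=1.96):
    """Return (bias, lower, upper) for paired test-retest scores."""
    diffs = [r - t for t, r in zip(test, retest)]   # retest minus test
    bias = mean(diffs)                              # systematic difference
    spread = z * stdev(diffs)                       # half-width of the interval
    return bias, bias - spread, bias + spread


# Hypothetical scores for four respondents on a 0-100 scale.
first = [10, 20, 30, 40]
second = [12, 18, 33, 39]
bias, lower, upper = limits_of_agreement(first, second)
```

When the bias is close to zero, the limits are often reported as a single plus-or-minus value, which is how the abstract presents them.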
van Reedt Dortland, Arianne K B; Peters, Lilian L; Boenink, Annette D; Smit, Jan H; Slaets, Joris P J; Hoogendoorn, Adriaan W; Joos, Andreas; Latour, Corine H M; Stiefel, Friedrich; Burrus, Cyrille; Guitteny-Collas, Marie; Ferrari, Silvia
2017-05-01
The INTERMED Self-Assessment questionnaire (IMSA) was developed as an alternative to the observer-rated INTERMED (IM) to assess biopsychosocial complexity and health care needs. We studied feasibility, reliability, and validity of the IMSA within a large and heterogeneous international sample of adult hospital inpatients and outpatients as well as its predictive value for health care use (HCU) and quality of life (QoL). A total of 850 participants aged 17 to 90 years from five countries completed the IMSA and were evaluated with the IM. The following measurement properties were determined: feasibility by percentages of missing values; reliability by Cronbach α; interrater agreement by intraclass correlation coefficients; convergent validity of IMSA scores with mental health (Short Form 36 emotional well-being subscale and Hospital Anxiety and Depression Scale), medical health (Cumulative Illness Rating Scale) and QoL (Euroqol-5D) by Spearman rank correlations; and predictive validity of IMSA scores with HCU and QoL by (generalized) linear mixed models. Feasibility, face validity, and reliability (Cronbach α = 0.80) were satisfactory. Intraclass correlation coefficient between IMSA and IM total scores was .78 (95% CI = .75-.81). Correlations of the IMSA with the Short Form 36, Hospital Anxiety and Depression Scale, Cumulative Illness Rating Scale, and Euroqol-5D (convergent validity) were -.65, .15, .28, and -.59, respectively. The IMSA significantly predicted QoL and also HCU (emergency department visits, hospitalization, outpatient visits, and diagnostic examinations) after 3- and 6-month follow-up. Results were comparable between hospital sites, inpatients and outpatients, as well as age groups. The IMSA is a generic and time-efficient method to assess biopsychosocial complexity and to provide guidance for multidisciplinary care trajectories in adult patients, with good reliability and validity across different cultures.
Sun, Zhi-Jing; Zhu, Lan; Liang, Maolian; Xu, Tao; Lang, Jing-He
2016-08-01
WeChat is a promising tool for capturing electronic data; however, no research has examined its use. This study evaluates the reliability and feasibility of WeChat for administering the Pelvic Floor Impact Questionnaire Short Form 7 questionnaire to women with pelvic floor disorders. Sixty-eight pelvic floor rehabilitation women were recruited between June and December 2015 and crossover randomized to two groups. All participants completed two questionnaire formats. One group completed the paper version followed by the WeChat version; the other group completed the questionnaires in reverse order. Two weeks later, each group completed the two versions in reverse order. The WeChat version's reliability was assessed using intraclass correlation coefficients and test-retest reliability. Forty-two women (61.8%) preferred the WeChat to the paper format, eight (11.8%) preferred the paper format, and 18 (26.5%) had no preference. The younger women preferred WeChat. Completion time was 116.5 (61.3) seconds for the WeChat version and 133.4 (107.0) seconds for the paper version, with no significant difference (P = 0.145). Age and education did not impact completion time (P > 0.05). Consistency between the WeChat and paper versions was excellent. The intraclass correlation coefficients of the Pelvic Floor Impact Questionnaire Short Form 7 and the three subscales ranged from 0.915 to 0.980. The Bland-Altman analysis and linear regression results also showed high consistency. The test-retest study had a Pearson's correlation coefficient of 0.908, demonstrating a strong correlation. WeChat-based questionnaires were well accepted by women with pelvic floor disorders and had good data quality and reliability.
Reliability and Validity of a New Test of Agility and Skill for Female Amateur Soccer Players
Kutlu, Mehmet; Yapici, Hakan; Yilmaz, Abdullah
2017-01-01
The aim of this study was to evaluate the Agility and Skill Test, which had been recently developed to assess agility and skill in female athletes. Following a 10 min warm-up, two trials to test the reliability and validity of the test were conducted one week apart. Measurements were collected to compare soccer players’ physical performance in a 20 m sprint, a T-Drill test, the Illinois Agility Run Test, change-of-direction and acceleration, as well as agility and skill. All tests were completed following the same order. Thirty-four amateur female soccer players were recruited (age = 20.8 ± 1.9 years; body height = 166 ± 6.9 cm; body mass = 55.5 ± 5.8 kg). To determine the reliability and usefulness of these tests, paired sample t-tests, intra-class correlation coefficients, typical error, coefficient of variation, and differences between the typical error and smallest worthwhile change statistics were computed. Test results showed no significant differences between the two sessions (p > 0.01). There were high intra-class correlations between the test and retest values (r = 0.94–0.99) for all tests. Typical error values were below the smallest worthwhile change, indicating ‘good’ usefulness for these tests. A near perfect Pearson correlation between the Agility and Skill Test (r = 0.98) was found, and there were moderate-to-large levels of correlation between the Agility and Skill Test and other measures (r = 0.37 to r = 0.56). The results of this study suggest that the Agility and Skill Test is a reliable and valid test for female soccer players and has significant value for assessing the integrative agility and skill capability of soccer players. PMID:28469760
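The abstract compares typical error with the smallest worthwhile change but does not define either quantity. The sketch below uses definitions that are common in sports-science reliability work, stated here as assumptions: typical error as the standard deviation of test-retest differences divided by sqrt(2), and smallest worthwhile change as 0.2 times the between-subject standard deviation of the first trial. Function names and times are hypothetical.

```python
import math
from statistics import stdev


def typical_error(trial1, trial2):
    """Assumed definition: SD of the paired differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    return stdev(diffs) / math.sqrt(2.0)


def smallest_worthwhile_change(trial1, factor=0.2):
    """Assumed definition: 0.2 times the between-subject SD of the first trial."""
    return factor * stdev(trial1)


# Hypothetical completion times in seconds for four players, one week apart.
week1 = [10.0, 10.5, 11.0, 11.5]
week2 = [10.1, 10.4, 11.1, 11.4]
te = typical_error(week1, week2)
swc = smallest_worthwhile_change(week1)
rated_good = te < swc   # typical error below the SWC is read as 'good' usefulness
```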
Pisconti, Fernando; Mahmoud Smaili Santos, Suhaila; Lopes, Josiane; Rosa Cardoso, Jefferson; Lopes Lavado, Edson
2017-11-29
The Exercise Self-Efficacy scale (ESES) is a reliable measure, in the English language, of exercise self-efficacy in individuals with spinal cord injury. The aim of this study was to culturally adjust and validate the Exercise Self-Efficacy scale in the Portuguese language. The Exercise Self-Efficacy scale was applied to 76 subjects, with three-month intervals (three applications in total). The reliability was appraised using the intra-class correlation coefficient and Bland-Altman methods, and the internal consistency was evaluated using Cronbach´s alpha. The Exercise Self-Efficacy scale was correlated with the domains of the Quality of life Questionnaire SF-36 and Functional Independence Measure and tested using the Spearman rho coefficient. The Exercise Self-Efficacy scale-Brazil presented good internal consistency (alpha 1 = 0.856; alpha 2 = 0.855; alpha 3 = 0.822) and high reliability in the test-retest (intra-class correlation coefficient = 0.97). There was a strong correlation between the Exercise Self-Efficacy scale-Brazil and the SF-36 only in the functional capacity domain (rho = 0.708). There were no changes in Exercise Self-Efficacy scale-Brazil scores between the three applications (p = 0.796). The validation of the Exercise Self-Efficacy scale questionnaire permits the assessor to use it reliably in Portuguese speaking countries, since it is the first instrument measuring self-efficacy specifically during exercises in individuals with spinal cord injury. Furthermore, the questionnaire can be used as an instrument to verify the effectiveness of interventions that use exercise as an outcome. The results of the Brazilian version of the Exercise Self-Efficacy scale support its use as a reliable and valid measurement of exercise self-efficacy for this population.
Psychometric assessment of a scale to measure bonding workplace social capital
Tsutsumi, Akizumi; Inoue, Akiomi; Odagiri, Yuko
2017-01-01
Objectives: Workplace social capital (WSC) has attracted increasing attention as an organizational and psychosocial factor related to worker health. This study aimed to assess the psychometric properties of a newly developed WSC scale for use in work environments, where bonding social capital is important. Methods: We assessed the psychometric properties of a newly developed 6-item scale to measure bonding WSC using two data sources. Participants were 1,650 randomly selected workers who completed an online survey. Exploratory factor analyses were conducted. We examined the item–item and item–total correlations, internal consistency, and associations between scale scores and a previous 8-item measure of WSC. We evaluated test–retest reliability by repeating the survey with 900 of the respondents 2 weeks later. The overall scale reliability was quantified by an intraclass coefficient and the standard error of measurement. We evaluated convergent validity by examining the association with several relevant workplace psychosocial factors using a dataset from workers employed by an electrical components company (n = 2,975). Results: The scale was unidimensional. The item–item and item–total correlations ranged from 0.52 to 0.78 (p < 0.01) and from 0.79 to 0.89 (p < 0.01), respectively. Internal consistency was good (Cronbach’s α coefficient: 0.93). The correlation with the 8-item scale indicated high criterion validity (r = 0.81) and the scale showed high test–retest reliability (r = 0.74, p < 0.01). The intraclass coefficient and standard error of measurement were 0.74 (95% confidence intervals: 0.71–0.77) and 4.04 (95% confidence intervals: 1.86–6.20), respectively. Correlations with relevant workplace psychosocial factors showed convergent validity. Conclusions: The results confirmed that the newly developed WSC scale has adequate psychometric properties. PMID:28662058
Celik, Selda; Pinar, Rukiye
2016-09-01
To examine the psychometric properties of a Turkish version of the Diabetes Fear of Injecting and Self-testing Questionnaire (D-FISQ). Forward-backward translation of the D-FISQ from English into Turkish was conducted. Original English and translated forms were examined by a panel group. Validity was investigated using content, confirmatory factor analysis, and divergent validity. Reliability was assessed using Cronbach α values, item-total correlations, and intraclass correlations. The sample comprised 350 patients with diabetes. Data were analyzed using SPSS 15.0 for Windows and LISREL 8. The content validity index for the panel members was .90, which indicated perfect content validity; items in D-FISQ were clear, concise, readable, and distinct. Confirmatory factor analysis confirmed the original construct of the D-FISQ. All items had factor loadings higher than the recommended level of .40. The D-FISQ scores were discriminated by the level of anxiety. Reliability results were also satisfactory. Cronbach α values were within ideal limits. Item-total correlation coefficient ranged from .72 to .86. In terms of test-retest reliability, intraclass correlation coefficient was found to be over .90. D-FISQ is a valid and reliable questionnaire in assessing needle-prick fear among Turkish patients with diabetes. We recommend performing the Turkish D-FISQ in determining and screening patients with diabetes who have fear related to self-insulin injection and finger-prick test. Thus, health care professionals should be aware of the potential consequences of injection fear such as insulin misuse and poor self-monitoring of blood glucose, which may have unfavorable effects on optimal diabetes management. Copyright © 2016. Published by Elsevier B.V.
Validity of the occupational sitting and physical activity questionnaire.
Chau, Josephine Y; Van Der Ploeg, Hidde P; Dunn, Scott; Kurko, John; Bauman, Adrian E
2012-01-01
Sitting at work is an emerging occupational health risk. Few instruments designed for use in population-based research measure occupational sitting and standing as distinct behaviors. This study aimed to develop and validate a brief measure of occupational sitting and physical activity. A convenience sample (n = 99, 61% female) was recruited from two medium-sized workplaces and by word-of-mouth in Sydney, Australia. Participants completed the newly developed Occupational Sitting and Physical Activity Questionnaire (OSPAQ) and a modified version of the MONICA Optional Study on Physical Activity Questionnaire (modified MOSPA-Q) twice, 1 wk apart. Participants also wore an ActiGraph accelerometer for the 7 d in between the test and retest. Analyses determined test-retest reliability with intraclass correlation coefficients and assessed criterion validity against accelerometers using the Spearman ρ. The test-retest intraclass correlation coefficients for occupational sitting, standing, and walking for OSPAQ ranged from 0.73 to 0.90, while that for the modified MOSPA-Q ranged from 0.54 to 0.89. Comparison of sitting measures with accelerometers showed higher Spearman correlations for the OSPAQ (r = 0.65) than for the modified MOSPA-Q (r = 0.52). Criterion validity correlations for occupational standing and walking measures were comparable for both instruments with accelerometers (standing: r = 0.49; walking: r = 0.27-0.29). The OSPAQ has excellent test-retest reliability and moderate validity for estimating time spent sitting and standing at work and is comparable to existing occupational physical activity measures for assessing time spent walking at work. The OSPAQ brief instrument measures sitting and standing at work as distinct behaviors and would be especially suitable in national health surveys, prospective cohort studies, and other studies that are limited by space constraints for questionnaire items.
Flosadottir, Vala; Roos, Ewa M; Ageberg, Eva
2017-09-01
The Activity Rating Scale (ARS) for disorders of the knee evaluates the level of activity by the frequency of participation in 4 separate activities with high demands on knee function, with a score ranging from 0 (none) to 16 (pivoting activities 4 times/wk). To translate and cross-culturally adapt the ARS into Swedish and to assess measurement properties of the Swedish version of the ARS. Cohort study (diagnosis); Level of evidence, 2. The COSMIN guidelines were followed. Participants (N = 100 [55 women]; mean age, 27 years) who were undergoing rehabilitation for a knee injury completed the ARS twice for test-retest reliability. The Knee injury and Osteoarthritis Outcome Score (KOOS), Tegner Activity Scale (TAS), and modernized Saltin-Grimby Physical Activity Level Scale (SGPALS) were administered at baseline to validate the ARS. Construct validity and responsiveness of the ARS were evaluated by testing predefined hypotheses regarding correlations between the ARS, KOOS, TAS, and SGPALS. The Cronbach alpha, intraclass correlation coefficients, absolute reliability, standard error of measurement, smallest detectable change, and Spearman rank-order correlation coefficients were calculated. The ARS showed good internal consistency (α ≈ 0.96), good test-retest reliability (intraclass correlation coefficient >0.9), and no systematic bias between measurements. The standard error of measurement was less than 2 points, and the smallest detectable change was less than 1 point at the group level and less than 5 points at the individual level. More than 75% of the hypotheses were confirmed, indicating good construct validity and good responsiveness of the ARS. The Swedish version of the ARS is valid, reliable, and responsive for evaluating the level of activity based on the frequency of participation in high-demand knee sports activities in young adults with a knee injury.
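The abstract above reports a standard error of measurement (SEM) and a smallest detectable change (SDC) at the individual and group level without giving formulas. The sketch below uses the conventional relations SEM = SD × sqrt(1 − ICC), SDC(individual) = 1.96 × sqrt(2) × SEM, and SDC(group) = SDC(individual) / sqrt(n). That the authors used exactly these relations is an assumption, and the function names are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from the score SD and a reliability ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def sdc_individual(sem: float, z: float = 1.96) -> float:
    """Smallest detectable change for one person (two measurements, hence sqrt(2))."""
    return z * math.sqrt(2.0) * sem


def sdc_group(sem: float, n: int, z: float = 1.96) -> float:
    """Smallest detectable change for the mean of a group of n people."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sdc_individual(sem, z) / math.sqrt(n)
```

With made-up inputs such as SD = 5 points, ICC = 0.91, and n = 100, the group-level SDC is one tenth of the individual-level SDC, which is why group-level thresholds are so much smaller than individual-level ones.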
Ács, Balázs; Kulka, Janina; Kovács, Kristóf Attila; Teleki, Ivett; Tőkés, Anna-Mária; Meczker, Ágnes; Győrffy, Balázs; Madaras, Lilla; Krenács, Tibor; Szász, Attila Marcell
2017-07-01
Although several antibodies are available for immunohistochemical detection of Ki-67, even the most commonly used MIB-1 has not been validated yet. Our aim was to compare 5 commercially available antibodies for detection of Ki-67 in terms of agreement and their ability in predicting prognosis of breast cancer. Tissue microarrays were constructed from 378 breast cancer patients' representative formalin-fixed, paraffin-embedded tumor blocks. Five antibodies were used to detect Ki-67 expression: MIB-1 using chromogenic detection and immunofluorescent-labeled MIB-1, SP-6, 30-9, poly, and B56. Semiquantitative assessment was performed by 2 pathologists independently on digitized slides. To compare the 5 antibodies, intraclass correlation and concordance correlation coefficient were used. All the antibodies but immunofluorescent-labeled MIB-1 (at 20% and 30% thresholds, P=.993 and P=.342, respectively) and B56 (at 30% threshold, P=.288) separated high- and low-risk patient groups. However, there were a significant difference (P values for all comparisons≤.005) and a moderate concordance (intraclass correlation, 0.645) between their Ki-67 labeling index scores. The highest concordance was found between MIB-1 and poly (concordance correlation coefficient=0.785) antibodies. None of the antibodies except Ki-67 labeling index as detected by poly (P=.031) at 20% threshold and lymph node status (P<.001) were significantly linked to disease-free survival in multivariate analysis. At 30% threshold, this was reduced to lymph node status (P<.001) alone. Our results showed that there are considerable differences between the different Ki-67 antibodies in their capacity to detect proliferating tumor cells and to separate low- and high-risk breast cancer patient groups. Copyright © 2017 Elsevier Inc. All rights reserved.
Feng, Dai; Svetnik, Vladimir; Coimbra, Alexandre; Baumgartner, Richard
2014-01-01
The intraclass correlation coefficient (ICC) with fixed raters or, equivalently, the concordance correlation coefficient (CCC) for continuous outcomes is a widely accepted aggregate index of agreement in settings with small number of raters. Quantifying the precision of the CCC by constructing its confidence interval (CI) is important in early drug development applications, in particular in qualification of biomarker platforms. In recent years, there have been several new methods proposed for construction of CIs for the CCC, but their comprehensive comparison has not been attempted. The methods consisted of the delta method and jackknifing with and without Fisher's Z-transformation, respectively, and Bayesian methods with vague priors. In this study, we carried out a simulation study, with data simulated from multivariate normal as well as heavier tailed distribution (t-distribution with 5 degrees of freedom), to compare the state-of-the-art methods for assigning CI to the CCC. When the data are normally distributed, the jackknifing with Fisher's Z-transformation (JZ) tended to provide superior coverage and the difference between it and the closest competitor, the Bayesian method with the Jeffreys prior was in general minimal. For the nonnormal data, the jackknife methods, especially the JZ method, provided the coverage probabilities closest to the nominal in contrast to the others which yielded overly liberal coverage. Approaches based upon the delta method and Bayesian method with conjugate prior generally provided slightly narrower intervals and larger lower bounds than others, though this was offset by their poor coverage. Finally, we illustrated the utility of the CIs for the CCC in an example of a wake after sleep onset (WASO) biomarker, which is frequently used in clinical sleep studies of drugs for treatment of insomnia.
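As an illustration of the jackknife with Fisher's Z-transformation (JZ) described above, here is a minimal sketch for two raters. It computes Lin's concordance correlation coefficient with 1/n moments and builds the interval from jackknife pseudovalues on the Z scale. It uses a normal quantile rather than a t quantile, which is a simplifying assumption of this sketch and may differ from the procedure evaluated in the study. The function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def ccc(x, y):
    """Lin's concordance correlation coefficient for paired lists x and y."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n            # 1/n variance of x
    syy = sum((b - my) ** 2 for b in y) / n            # 1/n variance of y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


def ccc_jackknife_z_ci(x, y, level=0.95):
    """Jackknife CI for the CCC on Fisher's Z scale, back-transformed with tanh.

    Assumes no full-sample or leave-one-out CCC equals exactly +1 or -1,
    because atanh is undefined there.
    """
    x, y = list(x), list(y)
    n = len(x)
    z_full = math.atanh(ccc(x, y))
    pseudo = []
    for i in range(n):
        # Leave observation i out and recompute the transformed CCC.
        z_i = math.atanh(ccc(x[:i] + x[i + 1:], y[:i] + y[i + 1:]))
        pseudo.append(n * z_full - (n - 1) * z_i)
    z_hat = mean(pseudo)                               # jackknife estimate on Z scale
    se = math.sqrt(variance(pseudo) / n)               # jackknife standard error
    q = NormalDist().inv_cdf(0.5 + level / 2.0)        # normal quantile (assumption)
    return math.tanh(z_hat - q * se), math.tanh(z_hat + q * se)
```

Working on the Z scale keeps the back-transformed limits inside (−1, 1), which an untransformed jackknife interval does not guarantee.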
Kim, Hee-Ju; Abraham, Ivo
2017-01-01
Evidence is needed on the clinicometric properties of single-item or short measures as alternatives to comprehensive measures. We examined whether two single-item fatigue measures (i.e., Likert scale, numeric rating scale) or a short fatigue measure were comparable to a comprehensive measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults. For this quantitative study, we selected the Functional Assessment of Chronic Illness Therapy-Fatigue for the comprehensive measure and the Profile of Mood States-Brief, Fatigue subscale for the short measure; and constructed two single-item measures. A total of 368 students from four nursing colleges in South Korea participated. We used Cronbach's alpha and item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability. We assessed Pearson's correlation with a comprehensive measure for convergent validity, with perceived stress level and sleep quality for concurrent validity and the receiver operating characteristic curve for predictive validity. The short measure was comparable to the comprehensive measure in internal consistency reliability (Cronbach's alpha=0.81 vs. 0.88); test-retest reliability (intraclass correlation coefficient=0.66 vs. 0.61); convergent validity (r with comprehensive measure=0.79); concurrent validity (r with perceived stress=0.55, r with sleep quality=0.39) and predictive validity (area under curve=0.88). Single-item measures were not comparable to the comprehensive measure. A short fatigue measure exhibited similar levels of reliability and validity to the comprehensive measure in Korean young adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
Duracinsky, Martin; Lalanne, Christophe; Le Coeur, Sophie; Herrmann, Susan; Berzins, Baiba; Armstrong, Andrew Richard; Lau, Joseph Tak Fai; Fournier, Isabelle; Chassany, Olivier
2012-04-15
This study reports the psychometric validation of a new HIV/AIDS-specific health-related quality of life (HRQL) questionnaire, the Patient Reported Outcomes Quality of Life-HIV. The instrument was developed simultaneously across Europe, North and South America, Africa, Asia, and Australia to assess multidimensional quality of life impairments in the era of highly active antiretroviral therapy. A cross-sectional study was performed in 8 countries. The pilot 70-item questionnaire was co-administered with the HIV symptoms index, the EQ-5D and Medical Outcomes Study-HIV questionnaires. Demographic and biomedical data were collected. After item analysis and reduction, convergent discriminant concurrent validity and known-group validity were examined. Internal consistency and reliability scores were assessed using Cronbach alpha and intraclass correlation. The final sample of 791 patients was composed of 64% males (median age: 41 years, HIV diagnosis = 5 years), 13.8% were treatment naive. Item reduction yielded a 43-item form surveying 8 dimensions and 1 global health item that showed good convergent and discriminant validity and reliability (98% scaling success; Cronbach alphas 0.77-0.89). Correlations with EQ-5D and Medical Outcomes Study-HIV complied with concurrent validity expectations; likewise, correlations against the number of self-reported symptoms and depression showed good support for criterion validity. A test-retest study on French patients (n = 34) showed temporal stability (intraclass correlation coefficient = 0.86). Significant and meaningful differences of HRQL scores between countries were found. The Patient Reported Outcomes Quality of Life-HIV questionnaire is a valid and reliable instrument for assessing HRQL specific to HIV disease in different cultures and healthcare systems.
de Queirós, Andréa Simone Siqueira; Brandão, Simone Cristina Soares; Macedo, Liana Gonçalves; Ourem, Maira Souto; Mota, Vitor Gomes; Leite, Luiz Arthur Calheiros; Lopes, Edmundo Pessoa Almeida; Domingues, Ana Lúcia Coutinho
2015-01-01
The formation of intrapulmonary vascular dilations (IPVD) is the key event for the onset of hepatopulmonary syndrome, vascular changes secondary to portal hypertension that leads to hypoxemia. The diagnosis of IPVD can be made by contrasted transthoracic echocardiography or scintigraphy with technetium-macroaggregated albumin ((99m)Tc-MAA), which is a sensitive and specific diagnostic method and quantifies the IPVD magnitude. However, its procedure and diagnostic indices are not yet standardized and well defined in health services. The aims of this study were to define normality values and evaluate the inter- and intra-observer reproducibility degree of diagnostic indexes of IPVD through (99m)Tc-MAA scintigraphy. A cross-sectional study was conducted at the Clinical Hospital, Federal University of Pernambuco (HC-UFPE) between July and December 2012. Fifteen patients with hepatosplenic schistosomiasis and nine patients without liver or heart disease (control group) were assessed. After clinical assessment, ultrasound and echocardiography, patients underwent (99m)Tc-MAA scintigraphy, and a relative brain uptake value exceeding 6 % or systemic uptake value exceeding 11 % was considered diagnostic of IPVD. Each assessment was performed by two independent observers. To analyze the results of the normal group, the nonparametric bootstrap method simulation model combined with the Monte Carlo method was used, and to analyze inter- and intra-observer reproducibility indexes, the kappa and intra-class correlation coefficient were used. In normal subjects, the average brain uptake of (99m)Tc-MAA was 7.9 ± 0.01 % and systemic uptake was 12.4 ± 0.03 %, with low dispersal rates for both measures. The intra-observer agreement was 100 %, with kappa index of 1.0 (p < 0.0001), suggesting a perfect agreement.
The inter-observer agreement was also 100 % (kappa = 1.0, p < 0.0001) for brain uptake; however, systemic uptake showed kappa = 0.25 (p = 0.07), which features tolerable concordance. The intra-class correlation was excellent for both uptake indexes. The normality values were slightly higher than those reported in studies from other countries. The demographic characteristics of the Brazilian population, the small number of patients or different methodologies can be the causes of such differences. (99m)Tc-MAA scintigraphy showed excellent reproducibility.
Neubert, Ales; Fripp, Jurgen; Engstrom, Craig; Gal, Yaniv; Crozier, Stuart; Kingsley, Michael I C
2014-11-01
Magnetic resonance (MR) examinations of morphologic characteristics of intervertebral discs (IVDs) have been used extensively for biomechanical studies and clinical investigations of the lumbar spine. Traditionally, the morphologic measurements have been performed using time- and expertise-intensive manual segmentation techniques not well suited for analyses of large-scale studies. The purpose of this study is to introduce and validate a semiautomated method for measuring IVD height and mean sagittal area (and volume) from MR images to determine if it can replace the manual assessment and enable analyses of large MR cohorts. This study compares semiautomated and manual measurements and assesses their reliability and agreement using data from repeated MR examinations. Seven healthy asymptomatic males underwent 1.5-T MR examinations of the lumbar spine involving sagittal T2-weighted fast spin-echo images obtained at baseline, pre-exercise, and postexercise conditions. Measures of the mean height and the mean sagittal area of lumbar IVDs (L1-L2 to L4-L5) were compared for two segmentation approaches: a conventional manual method (10-15 minutes to process one IVD) and a specifically developed semiautomated method (requiring only a few mouse clicks to process each subject). Both methods showed strong test-retest reproducibility evaluated on baseline and pre-exercise examinations with strong intraclass correlations for the semiautomated and manual methods for mean IVD height (intraclass correlation coefficient [ICC]=0.99, 0.98) and mean IVD area (ICC=0.98, 0.99), respectively. A bias (average deviation) of 0.38 mm (4.1%, 95% confidence interval 0.18-0.59 mm) was observed between the manual and semiautomated methods for the IVD height, whereas there was no statistically significant difference for the mean IVD area (0.1%±3.5%).
The semiautomated and manual methods both detected significant exercise-induced changes in IVD height (0.20 and 0.28 mm) and mean IVD area (5.7 and 8.3 mm(2)), respectively. The presented semiautomated method provides an alternative to time- and expertise-intensive manual procedures for analysis of larger, cross-sectional, interventional, and longitudinal MR studies for morphometric analyses of lumbar IVDs. Copyright © 2014 Elsevier Inc. All rights reserved.
Yang, Scott; Jones-Quaidoo, Sean M; Eager, Matthew; Griffin, Justin W; Reddi, Vasantha; Novicoff, Wendy; Shilt, Jeffrey; Bersusky, Ernesto; Defino, Helton; Ouellet, Jean; Arlet, Vincent
2011-07-01
In adolescent idiopathic scoliosis (AIS) there has been a shift towards increasing the number of implants and pedicle screws, which has not been proven to improve cosmetic correction. To evaluate if increasing cost of instrumentation correlates with cosmetic correction using clinical photographs. 58 Lenke 1A and B cases from a multicenter AIS database with at least 3 months follow-up of clinical photographs were used for analysis. Cosmetic parameters on PA and forward bending photographs included angular measurements of trunk shift, shoulder balance, rib hump, and ratio measurements of waist line asymmetry. Pre-op and follow-up X-rays were measured for coronal and sagittal deformity parameters. Cost density was calculated by dividing the total cost of instrumentation by the number of vertebrae being fused. Linear regression and Spearman's correlation were used to correlate cost density to X-ray and photo outcomes. Three independent observers verified radiographic and cosmetic parameters for inter/intraobserver variability analysis. Average pre-op Cobb angle and instrumented correction were 54° (SD 12.5) and 59% (SD 25) respectively. The average number of vertebrae fused was 10 (SD 1.9). The total cost of spinal instrumentation ranged from $6,769 to $21,274 (Mean $12,662, SD $3,858). There was a weak positive and statistically significant correlation between Cobb angle correction and cost density (r = 0.33, p = 0.01), and no correlation between Cobb angle correction of the uninstrumented lumbar spine and cost density (r = 0.15, p = 0.26). There was no significant correlation between all sagittal X-ray measurements or any of the photo parameters and cost density. There was good to excellent inter/intraobserver variability of all photographic parameters based on the intraclass correlation coefficient (ICC 0.74-0.98).
Our method used to measure cosmesis had good to excellent inter/intraobserver variability, and may be an effective tool to objectively assess cosmesis from photographs. Since increasing cost density only mildly improves the Cobb angle correction of the main thoracic curve and not the correction of the uninstrumented spine or any of the cosmetic parameters, one should consider the cost of increasing implant density in Lenke 1A and B curves. In the era of rationalization of health care expenses, this study demonstrates that increasing the number of implants does not improve any relevant cosmetic or radiographic outcomes.
Probability interpretations of intraclass reliabilities.
Ellis, Jules L
2013-11-20
Research where many organizations are rated by different samples of individuals such as clients, patients, or employees frequently uses reliabilities computed from intraclass correlations. Consumers of statistical information, such as patients and policy makers, may not have sufficient background for deciding which levels of reliability are acceptable. It is shown that the reliability is related to various probabilities that may be easier to understand, for example, the proportion of organizations that will be classed significantly above (or below) the mean and the probability that an organization is classed correctly given that it is classed significantly above (or below) the mean. One can view these probabilities as the amount of information of the classification and the correctness of the classification. These probabilities have an inverse relationship: given a reliability, one can 'buy' correctness at the cost of informativeness and conversely. This article discusses how this can be used to make judgments about the required level of reliabilities. Copyright © 2013 John Wiley & Sons, Ltd.
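For context on the reliabilities discussed above, the following sketch computes the single-rating and average-rating intraclass correlations from a balanced one-way random effects ANOVA, the design in which each organization is rated by its own sample of individuals. A balanced layout (the same number of raters per organization) is an assumption of this sketch, and the function name `one_way_icc` is hypothetical.

```python
from statistics import mean


def one_way_icc(groups):
    """Return (ICC for a single rating, ICC for the mean of k ratings).

    groups: list of equal-length lists, one list of ratings per rated target.
    Assumes at least 2 targets, at least 2 ratings each, and nonzero
    between-target variation (otherwise the average-score ICC is undefined).
    """
    n = len(groups)
    k = len(groups[0])
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need a balanced design with n >= 2 targets and k >= 2 ratings")
    grand = mean([v for g in groups for v in g])
    means = [mean(g) for g in groups]
    # Between-target and within-target mean squares from one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g) / (n * (k - 1))
    icc_single = (msb - msw) / (msb + (k - 1) * msw)
    icc_average = (msb - msw) / msb
    return icc_single, icc_average
```

The two values are linked by the Spearman-Brown relation, so the average-score reliability rises toward 1 as more raters per organization are averaged.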
Cuenca-Estrella, Manuel; Gomez-Lopez, Alicia; Alastruey-Izquierdo, Ana; Bernal-Martinez, Leticia; Cuesta, Isabel; Buitrago, Maria J.; Rodriguez-Tudela, Juan L.
2010-01-01
The commercial technique Vitek 2 system for antifungal susceptibility testing of yeast species was evaluated. A collection of 154 clinical yeast isolates, including amphotericin B- and azole-resistant organisms, was tested. Results were compared with those obtained by the reference procedures of both the CLSI and the European Committee on Antimicrobial Susceptibility Testing (EUCAST). Two other commercial techniques approved for clinical use, the Etest and the Sensititre YeastOne, were included in the comparative exercise as well. The average essential agreement (EA) between the Vitek 2 system and the reference procedures was >95%, comparable with the average EAs observed between the reference procedures and the Sensititre YeastOne and Etest. The EA values were >97% for Candida spp. and stood at 92% for Cryptococcus neoformans. Intraclass correlation coefficients (ICC) between the commercial techniques and the reference procedures were statistically significant (P < 0.01). Percentages of very major errors were 2.6% between Vitek 2 and the EUCAST technique and 1.6% between Vitek 2 and the CLSI technique. The Vitek 2 MIC results were available after 14 to 18 h of incubation for all Candida spp. (average time to reading, 15.5 h). The Vitek 2 system was shown to be a reliable technique to determine antifungal susceptibility testing of yeast species and a more rapid and easier alternative for clinical laboratories than the procedures developed by either the CLSI or EUCAST. PMID:20220169
Stiegler, Marjorie; Hobbs, Gene; Martinelli, Susan M; Zvara, David; Arora, Harendra; Chen, Fei
2018-01-01
Background: Simulation is an effective method for creating objective summative assessments of resident trainees. Real-time assessment (RTA) in simulated patient care environments is logistically challenging, especially when evaluating a large group of residents in multiple simulation scenarios. To date, there is very little data comparing RTA with delayed (hours, days, or weeks later) video-based assessment (DA) for simulation-based assessments of Accreditation Council for Graduate Medical Education (ACGME) sub-competency milestones. We hypothesized that sub-competency milestone evaluation scores obtained from DA, via audio-video recordings, are equivalent to the scores obtained from RTA. Methods: Forty-one anesthesiology residents were evaluated in three separate simulated scenarios, representing different ACGME sub-competency milestones. All scenarios had one faculty member perform RTA and two additional faculty members perform DA. Subsequently, the scores generated by RTA were compared with the average scores generated by DA. Variance component analysis was conducted to assess the amount of variation in scores attributable to residents and raters. Results: Paired t-tests showed no significant difference in scores between RTA and averaged DA for all cases. Cases 1, 2, and 3 showed an intraclass correlation coefficient (ICC) of 0.67, 0.85, and 0.50 for agreement between RTA scores and averaged DA scores, respectively. Analysis of variance of the scores assigned by the three raters showed a small proportion of variance attributable to raters (4% to 15%). Conclusions: The results demonstrate that video-based delayed assessment is as reliable as real-time assessment, as both assessment methods yielded comparable scores. Based on a department’s needs or logistical constraints, our findings support the use of either real-time or delayed video evaluation for assessing milestones in a simulated patient care environment. PMID:29736352
Validity of Automated Choroidal Segmentation in SS-OCT and SD-OCT.
Zhang, Li; Buitendijk, Gabriëlle H S; Lee, Kyungmoo; Sonka, Milan; Springelkamp, Henriët; Hofman, Albert; Vingerling, Johannes R; Mullins, Robert F; Klaver, Caroline C W; Abràmoff, Michael D
2015-05-01
To evaluate the validity of a novel fully automated three-dimensional (3D) method capable of segmenting the choroid from two different optical coherence tomography scanners: swept-source OCT (SS-OCT) and spectral-domain OCT (SD-OCT). One hundred eight subjects were imaged using SS-OCT and SD-OCT. A 3D method was used to segment the choroid and quantify the choroidal thickness along each A-scan. The segmented choroidal posterior boundary was evaluated by comparing to manual segmentation. Differences were assessed to test the agreement between segmentation results of the same subject. Choroidal thickness was defined as the Euclidean distance between Bruch's membrane and the choroidal posterior boundary, and reproducibility was analyzed using automatically and manually determined choroidal thicknesses. For SS-OCT, the average choroidal thickness of the entire 6- by 6-mm² macular region was 219.5 μm (95% confidence interval [CI], 204.9-234.2 μm), and for SD-OCT it was 209.5 μm (95% CI, 197.9-221.0 μm). The agreement between automated and manual segmentations was high: Average relative difference was less than 5 μm, and average absolute difference was less than 15 μm. Reproducibility of choroidal thickness between repeated SS-OCT scans was high (coefficient of variation [CV] of 3.3%, intraclass correlation coefficient [ICC] of 0.98), and differences between SS-OCT and SD-OCT results were small (CV of 11.0%, ICC of 0.73). We have developed a fully automated 3D method for segmenting the choroid and quantifying choroidal thickness along each A-scan. The method yielded high validity. Our method can be used reliably to study local choroidal changes and may improve the diagnosis and management of patients with ocular diseases in which the choroid is affected.
Reliability of doming and toe flexion testing to quantify foot muscle strength.
Ridge, Sarah Trager; Myrer, J William; Olsen, Mark T; Jurgensmeier, Kevin; Johnson, A Wayne
2017-01-01
Quantifying the strength of the intrinsic foot muscles has been a challenge for clinicians and researchers. The reliable measurement of this strength is important in order to assess weakness, which may contribute to a variety of functional issues in the foot and lower leg, including plantar fasciitis and hallux valgus. This study reports 3 novel methods for measuring foot strength - doming (previously unmeasured), hallux flexion, and flexion of the lesser toes. Twenty-one healthy volunteers performed the strength tests during two testing sessions which occurred one to five days apart. Each participant performed each series of strength tests (doming, hallux flexion, and lesser toe flexion) four times during the first testing session (twice with each of two raters) and two times during the second testing session (once with each rater). Intra-class correlation coefficients were calculated to test for reliability for the following comparisons: between raters during the same testing session on the same day (inter-rater, intra-day, intra-session), between raters on different days (inter-rater, inter-day, inter-session), between days for the same rater (intra-rater, inter-day, inter-session), and between sessions on the same day by the same rater (intra-rater, intra-day, inter-session). ICCs showed good to excellent reliability for all tests between days, raters, and sessions. Average doming strength was 99.96 ± 47.04 N. Average hallux flexion strength was 65.66 ± 24.5 N. Average lateral toe flexion was 50.96 ± 22.54 N. These simple tests using relatively low cost equipment can be used for research or clinical purposes. If repeated testing will be conducted on the same participant, it is suggested that the same researcher or clinician perform the testing each time for optimal reliability.
Reliability and Validity of Ten Consumer Activity Trackers Depend on Walking Speed.
Fokkema, Tryntsje; Kooiman, Thea J M; Krijnen, Wim P; VAN DER Schans, Cees P; DE Groot, Martijn
2017-04-01
To examine the test-retest reliability and validity of ten activity trackers for step counting at three different walking speeds. Thirty-one healthy participants walked twice on a treadmill for 30 min while wearing 10 activity trackers (Polar Loop, Garmin Vivosmart, Fitbit Charge HR, Apple Watch Sport, Pebble Smartwatch, Samsung Gear S, Misfit Flash, Jawbone Up Move, Flyfit, and Moves). Participants walked at three speeds for 10 min each: slow (3.2 km·h⁻¹), average (4.8 km·h⁻¹), and vigorous (6.4 km·h⁻¹). To measure test-retest reliability, intraclass correlations (ICC) were determined between the first and second treadmill test. Validity was determined by comparing the trackers with the gold standard (hand counting), using mean differences, mean absolute percentage errors, and ICC. Statistical differences were calculated by paired-sample t tests, Wilcoxon signed-rank tests, and by constructing Bland-Altman plots. Test-retest reliability varied with ICC ranging from -0.02 to 0.97. Validity varied between trackers and different walking speeds with mean differences between the gold standard and activity trackers ranging from 0.0 to 26.4%. Most trackers showed relatively low ICC and broad limits of agreement of the Bland-Altman plots at the different speeds. For the slow walking speed, the Garmin Vivosmart and Fitbit Charge HR showed the most accurate results. The Garmin Vivosmart and Apple Watch Sport demonstrated the best accuracy at an average walking speed. For vigorous walking, the Apple Watch Sport, Pebble Smartwatch, and Samsung Gear S exhibited the most accurate results. Test-retest reliability and validity of activity trackers depend on walking speed. In general, consumer activity trackers perform better at an average and vigorous walking speed than at a slower walking speed.
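The Bland-Altman limits of agreement used in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' analysis: the function name `bland_altman_limits` and the step counts are hypothetical, and the conventional 1.96 multiplier assumes the paired differences are roughly normally distributed.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits are
    bias -/+ z times the sample standard deviation of those differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical step counts: tracker versus hand counting.
tracker = [1010, 1190, 1405, 1600]
hand_count = [1000, 1200, 1400, 1610]
bias, lower, upper = bland_altman_limits(tracker, hand_count)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Broad limits relative to the quantity being measured indicate poor agreement even when the mean difference is close to zero.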
Gonzalez, Javier T; Veasey, Rachel C; Rumbold, Penny L S; Stevenson, Emma J
2012-10-01
The present study aimed to investigate the reliability of metabolic and subjective appetite responses under fasted conditions and following consumption of a cereal-based breakfast. Twelve healthy, physically active males completed two postabsorption (PA) and two postprandial (PP) trials in a randomised order. In PP trials a cereal based breakfast providing 1859 kJ of energy was consumed. Expired gas samples were used to estimate energy expenditure and fat oxidation and 100mm visual analogue scales were used to determine appetite sensations at baseline and every 30 min for 120 min. Reliability was assessed using limits of agreement, coefficient of variation (CV), intraclass coefficient of correlation and 95% confidence limits of typical error. The limits of agreement and typical error were 292.0 and 105.5 kJ for total energy expenditure, 9.3 and 3.4 g for total fat oxidation and 22.9 and 8.3mm for time-averaged AUC for hunger sensations, respectively over the 120 min period in the PP trial. The reliability of energy expenditure and appetite in the 2h response to a cereal-based breakfast would suggest that an intervention requires a 211 kJ and 16.6mm difference in total postprandial energy expenditure and time-averaged hunger AUC to be meaningful, fat oxidation would require a 6.7 g difference which may not be sensitive to most meal manipulations. Copyright © 2012 Elsevier Ltd. All rights reserved.
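The relationship between the typical error and the limits of agreement reported above can be checked with a short calculation. This is a sketch under stated assumptions: it uses the common definition of typical error as the standard deviation of the paired differences divided by √2 and a 1.96 multiplier, neither of which the abstract spells out, and the function names are hypothetical.

```python
import math


def limits_from_typical_error(typical_error, z=1.96):
    """Half-width of the limits of agreement implied by a typical error.

    Assumes typical_error = SD(differences) / sqrt(2), so that
    SD(differences) = sqrt(2) * typical_error.
    """
    return z * math.sqrt(2) * typical_error


def doubled_typical_error(typical_error):
    """Twice the typical error, one possible threshold for a meaningful change."""
    return 2 * typical_error


# Typical error for total energy expenditure reported in the abstract (kJ).
te_energy = 105.5
print(round(limits_from_typical_error(te_energy), 1))
print(doubled_typical_error(te_energy))
```

Under these assumptions the implied limits come out close to, though not identical to, the reported 292.0 kJ. Doubling the typical error reproduces the 211 kJ and 16.6 mm thresholds quoted in the abstract (and gives 6.8 g against the quoted 6.7 g, which may reflect rounding); the abstract itself does not state which rule was used.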
Hong, Samin; Kim, Chan Yun; Lee, Won Seok; Seong, Gong Je
2010-01-01
To assess the reproducibility of the new spectral domain Cirrus high-definition optical coherence tomography (HD-OCT; Carl Zeiss Meditec, Dublin, CA, USA) for analysis of peripapillary retinal nerve fiber layer (RNFL) thickness in healthy eyes. Thirty healthy Korean volunteers were enrolled. Three optic disc cube 200 x 200 Cirrus HD-OCT scans were taken on the same day in discontinuous sessions by the same operator without using the repeat scan function. The reproducibility of the calculated RNFL thickness and probability code were determined by the intraclass correlation coefficient (ICC), coefficient of variation (CV), test-retest variability, and Fleiss' generalized kappa (kappa). Thirty-six eyes were analyzed. For average RNFL thickness, the ICC was 0.970, CV was 2.38%, and test-retest variability was 4.5 microm. For all quadrants except the nasal, ICCs were 0.972 or higher and CVs were 4.26% or less. Overall test-retest variability ranged from 5.8 to 8.1 microm. The kappa value of probability codes for average RNFL thickness was 0.690. The kappa values of quadrants and clock-hour sectors were lower in the nasal areas than in other areas. The reproducibility of Cirrus HD-OCT to analyze peripapillary RNFL thickness in healthy eyes was excellent compared with the previous reports for time domain Stratus OCT. For the calculated RNFL thickness and probability code, variability was relatively higher in the nasal area, and more careful analyses are needed.
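For readers unfamiliar with how an ICC is obtained from repeated scans, the following is a minimal sketch of the one-way random-effects form computed from ANOVA mean squares. The abstract does not say which ICC model was used, so the one-way form is an assumption, and the function name `icc_one_way` and the thickness values are hypothetical.

```python
def icc_one_way(ratings):
    """Return (single_measure_icc, average_measure_icc) for a one-way model.

    ratings is a list of subjects, each a list of k repeated measurements.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 measurements")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    average = (ms_between - ms_within) / ms_between
    return single, average


# Hypothetical RNFL thickness values (micrometres), three scans per eye.
scans = [[95.0, 96.5, 94.8], [102.1, 101.0, 103.2], [88.4, 89.9, 87.5], [110.3, 109.1, 111.0]]
single, average = icc_one_way(scans)
print(f"ICC(1,1)={single:.3f}, ICC(1,k)={average:.3f}")
```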
Langberg, Joshua M.; Dvorsky, Melissa R.; Molitor, Stephen J.; Bourchtein, Elizaveta; Eddy, Laura D.; Smith, Zoe; Schultz, Brandon K.; Evans, Steven W.
2016-01-01
The primary goal of this study was to longitudinally evaluate the homework assignment completion patterns of middle school age adolescents with ADHD, their associations with academic performance, and malleable predictors of homework assignment completion. Analyses were conducted on a sample of 104 middle school students comprehensively diagnosed with ADHD and followed for 18 months. Multiple teachers for each student provided information about the percentage of homework assignments turned in at five separate timepoints and school grades were collected quarterly. Results showed that agreement between teachers with respect to students’ assignment completion was high, with an intraclass correlation of .879 at baseline. Students with ADHD were turning in an average of 12% fewer assignments each academic quarter in comparison to teacher-reported classroom averages. Regression analyses revealed a robust association between the percentage of assignments turned in at baseline and school grades 18 months later, even after controlling for baseline grades, achievement (reading and math), intelligence, family income, and race. Cross-lag analyses demonstrated that the association between assignment completion and grades was reciprocal, with assignment completion negatively impacting grades and low grades in turn being associated with decreased future homework completion. Parent ratings of homework materials management abilities at baseline significantly predicted the percentage of assignments turned in as reported by teachers 18 months later. These findings demonstrate that homework assignment completion problems are persistent across time and an important intervention target for adolescents with ADHD. PMID:26931065
[Shear waves elastography of the placenta in pregnant baboon].
Quarello, E; Lacoste, R; Mancini, J; Melot-Dusseau, S; Gorincour, G
2015-03-01
To evaluate tissue characteristics of the placenta by transabdominal ShearWave Elastography in the pregnant baboon. Over 9 months (03/2013-12/2013), two operators (EQ, GG) performed placental ultrasound on pregnant baboons housed at the primatology station partnering in the project. The placenta was first identified on 2D ultrasound. The elastography mode was then activated. Three measurements were taken by each operator for each placenta. The intraclass correlation coefficients within and between observers were calculated for the objective assessment (elastography) of placental maturity. During the study period, 21 pregnant baboons were included and each was scanned between 1 and 3 times. The measurements were carried out by both operators in 100% of cases. The intra- and inter-observer ICCs for single values were 0.657 (95% CI, 0.548 to 0.752) and 0.458 (95% CI, 0.167 to 0.675), respectively. The intra- and inter-observer ICCs for average values were 0.852 (95% CI, 0.784 to 0.901) and 0.628 (95% CI, 0.286 to 0.806), respectively. Transabdominal ShearWave Elastography of the placenta in pregnant baboons is feasible. The intra- and inter-operator reproducibility of this method is good when the average of three measurements is used. Objective assessment of the degree of placental maturity by ShearWave Elastography does not yet appear ready for use in clinical practice. Studies of larger cohorts are needed. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
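The single-value and average-value ICCs reported above are linked by the Spearman-Brown relation, ICC_avg = k·ICC_single / (1 + (k − 1)·ICC_single). The sketch below checks this; the function name `average_measures_icc` is hypothetical, and the choice of k = 3 (three measurements per operator) and k = 2 (two operators) is an assumption that happens to reproduce the reported figures rather than something the abstract states.

```python
def average_measures_icc(single_icc, k):
    """Spearman-Brown step-up from a single-measure ICC to the mean of k measures."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * single_icc / (1 + (k - 1) * single_icc)


# Intra-observer: single-value ICC 0.657, assuming the mean of 3 measurements.
print(round(average_measures_icc(0.657, 3), 3))  # 0.852
# Inter-observer: single-value ICC 0.458, assuming the mean over 2 operators.
print(round(average_measures_icc(0.458, 2), 3))  # 0.628
```

Averaging more measurements raises reliability with diminishing returns, which is why the authors' recommendation to use the mean of three measurements improves the intra-observer figure from moderate to good.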
NASA Technical Reports Server (NTRS)
Chao, Tien-Hsin; Stoner, William W.
1993-01-01
An optical neural network based on the neocognitron paradigm is introduced. A novel aspect of the architecture design is shift-invariant multichannel Fourier optical correlation within each processing layer. Multilayer processing is achieved by feeding back the output of the feature correlator iteratively to the input spatial light modulator and by updating the Fourier filters. By training the neural net with characteristic features extracted from the target images, successful pattern recognition with intraclass fault tolerance and interclass discrimination is achieved. A detailed system description is provided. Experimental demonstrations of a two-layer neural network for space-object discrimination are also presented.
Automatic target recognition using a feature-based optical neural network
NASA Technical Reports Server (NTRS)
Chao, Tien-Hsin
1992-01-01
An optical neural network based upon the Neocognitron paradigm (K. Fukushima et al. 1983) is introduced. A novel aspect of the architectural design is shift-invariant multichannel Fourier optical correlation within each processing layer. Multilayer processing is achieved by iteratively feeding back the output of the feature correlator to the input spatial light modulator and updating the Fourier filters. By training the neural net with characteristic features extracted from the target images, successful pattern recognition with intra-class fault tolerance and inter-class discrimination is achieved. A detailed system description is provided. Experimental demonstration of a two-layer neural network for space objects discrimination is also presented.
Estimation of Temporal Gait Parameters Using a Human Body Electrostatic Sensing-Based Method.
Li, Mengxuan; Li, Pengfei; Tian, Shanshan; Tang, Kai; Chen, Xi
2018-05-28
Accurate estimation of gait parameters is essential for obtaining quantitative information on motor deficits in Parkinson's disease and other neurodegenerative diseases, which helps determine disease progression and therapeutic interventions. Due to the demand for high accuracy, unobtrusive measurement methods such as optical motion capture systems, foot pressure plates, and other systems have been commonly used in clinical environments. However, the high cost of existing lab-based methods greatly hinders their wider usage, especially in developing countries. In this study, we present a low-cost, noncontact, and accurate method for estimating temporal gait parameters by sensing and analyzing the electrostatic field generated by human foot stepping. The proposed method achieved an average 97% accuracy on gait phase detection and was further validated by comparison to the foot pressure system in 10 healthy subjects. The two sets of results were compared using the Pearson coefficient r and showed excellent consistency (r = 0.99, p < 0.05). The repeatability of the proposed method was calculated between days by intraclass correlation coefficients (ICC), and showed good test-retest reliability (ICC = 0.87, p < 0.01). The proposed method could be an affordable and accurate tool to measure temporal gait parameters in hospital laboratories and in patients' home environments.
Habets, Bas; Staal, J Bart; Tijssen, Marsha; van Cingel, Robert
2018-01-10
To determine the intrarater reliability of the Humac NORM isokinetic dynamometer for concentric and eccentric strength tests of knee and shoulder muscles. 54 participants (50% female, average age 20.9 ± 3.1 years) performed concentric and eccentric strength measures of the knee extensors and flexors, and the shoulder internal and external rotators on two different Humac NORM isokinetic dynamometers, which were situated at two different centers. The knee extensors and flexors were tested concentrically at 60° and 180°/s, and eccentrically at 60°/s. Concentric strength of the shoulder internal and external rotators, and eccentric strength of the external rotators were measured at 60° and 120°/s. We calculated intraclass correlation coefficients (ICCs), standard error of measurement, standard error of measurement expressed as a %, and the smallest detectable change to determine reliability and measurement error. ICCs for the knee tests ranged from 0.74 to 0.89, whereas ICC values for the shoulder tests ranged from 0.72 to 0.94. Measurement error was highest for the concentric test of the knee extensors and lowest for the concentric test of shoulder external rotators.
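The standard error of measurement (SEM) and smallest detectable change (SDC) named above are often derived from the ICC with the formulas SEM = SD·√(1 − ICC) and SDC = 1.96·√2·SEM. The sketch below uses those widely used formulas as an assumption, since the abstract does not give the ones actually applied; the function names and the torque values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1 for this formula")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """Smallest change exceeding measurement error at roughly 95% confidence."""
    return z * math.sqrt(2.0) * sem


# Hypothetical peak torque: mean 150 N·m, SD 30 N·m, ICC 0.89.
mean_torque, sd_torque, icc = 150.0, 30.0, 0.89
sem = standard_error_of_measurement(sd_torque, icc)
sdc = smallest_detectable_change(sem)
print(f"SEM={sem:.2f} N·m ({100 * sem / mean_torque:.1f}%), SDC={sdc:.2f} N·m")
```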
Break-technique handheld dynamometry: relation between angular velocity and strength measurements.
Burns, Stephen P; Spanier, David E
2005-07-01
To determine whether muscle strength, as measured with break-technique handheld dynamometry (HHD), is dependent on the angular velocity achieved during testing and to compare reliability at different angular velocities. Repeated-measures study. Participants underwent HHD by using make-technique (isometric) and break-technique (eccentric) dynamometry at 3 prespecified angular velocities. Elbow movement was recorded with an electrogoniometer. Inpatient spinal cord injury unit. Convenience sample of 20 persons with tetraplegia with weakness of elbow flexors or extensors. Not applicable. Elbow angular velocity and muscle strength recorded during HHD. With the break technique, angular velocities averaging 15°, 33°, and 55°/s produced 16%, 30%, and 51% greater strength measurements, respectively, than those recorded by using the make technique (all P < .006 for comparisons between successive techniques). The intraclass correlation coefficient for intrarater reliability was .89 or greater for all testing techniques. Greater strength is recorded with faster angular velocities during HHD. Differences in angular velocity may explain the wide range previously reported for break- versus make-technique strength measurements. Variation in angular velocity is a potential source of variability in serial HHD strength measurements, and for this reason the make technique may be preferable.
Wang, Youfa; Xue, Hong; Chen, Hsin-jen; Igusa, Takeru
2014-09-06
Although the importance of social norms in affecting health behaviors is widely recognized, the current understanding of the social norm effects on obesity is limited due to data and methodology limitations. This study aims to use nontraditional innovative systems methods to examine: a) the effects of social norms on school children's BMI growth and fruit and vegetable (FV) consumption, and b) the effects of misperceptions of social norms on US children's BMI growth. We built an agent-based model (ABM) in a utility maximization framework and parameterized the model based on empirical longitudinal data collected in a US nationally representative study, the Early Childhood Longitudinal Study - Kindergarten Cohort (ECLS-K), to test potential mechanisms of social norm affecting children's BMI growth and FV consumption. Intraclass correlation coefficients (ICC) for BMI were 0.064-0.065, suggesting that children's BMI were similar within each school. The correlation between observed and ABM-predicted BMI was 0.87, indicating the validity of our ABM. Our simulations suggested the follow-the-average social norm acts as an endogenous stabilizer, which automatically adjusts positive and negative deviance of an individual's BMI from the group mean of a social network. One unit of BMI below the social average may lead to 0.025 unit increase in BMI per year for each child; asymmetrically, one unit of BMI above the social average, may only cause 0.015 unit of BMI reduction. Gender difference was apparent. Social norms have less impact on weight reduction among girls, and a greater impact promoting weight increase among boys. Our simulation also showed misperception of the social norm would push up the mean BMI and cause the distribution to be more skewed to the left. Our simulation results did not provide strong support for the role of social norms on FV consumption. Social norm influences US children's BMI growth. 
High obesity prevalence will lead to a continuous increase in children's BMI due to increased socially acceptable mean BMI. Interventions promoting healthy body image and desirable socially acceptable BMI should be implemented to control childhood obesity epidemic.
Sahebjavaher, Ramin S; Nir, Guy; Honarvar, Mohammad; Gagnon, Louis O; Ischia, Joseph; Jones, Edward C; Chang, Silvia D; Fazli, Ladan; Goldenberg, S Larry; Rohling, Robert; Kozlowski, Piotr; Sinkus, Ralph; Salcudean, Septimiu E
2015-01-01
The purpose of this work was to assess trans-perineal prostate magnetic resonance elastography (MRE) for (1) repeatability in phantoms/volunteers and (2) diagnostic power as correlated with histopathology in prostate cancer patients. The three-dimensional (3D) displacement field was obtained using a fractionally encoded gradient echo sequence using a custom-made transducer. The repeatability of the method was assessed based on three repeat studies and by changing the driving frequency by 3% in studies on a phantom and six healthy volunteers. Subsequently, 11 patients were examined with MRE prior to radical prostatectomy. The areas under the receiver operating characteristic curves were calculated using a windowed voxel-to-voxel approach by comparing the 2D registered slides, masked with the Gleason score. For the repeatability study, the average intraclass correlation coefficient for elasticity images was 99% for repeat phantom studies, 98% for ±6 Hz phantom studies, 95% for volunteer repeat studies with 2 min acquisition time, 82% for ±2 Hz volunteer studies with 2 min acquisition time and 73% for repeat volunteer studies with 8 min acquisition time. For the patient study, the average elasticity was 8.2 ± 1.7 kPa in the prostate capsule, 7.5 ± 1.9 kPa in the peripheral zone (PZ), 9.7 ± 3.0 kPa in the central gland (CG) and 9.0 ± 3.4 kPa in the transition zone. In the patient study, cancerous tissue with Gleason score at least 3 + 3 was significantly (p < 0.05) different from normal tissue in 10 out of 11 cases with tumors in the PZ, and 6 out of 9 cases with tumors in the CG. However, the overall case-averaged area under the curve was 0.72 in the PZ and 0.67 in the CG. Cancerous tissue was not always stiffer than normal tissue. The inversion algorithm was sensitive to (i) vibration amplitude and displacement nodes and (ii) misalignment of the 3D wave field due to subject movement. Copyright © 2014 John Wiley & Sons, Ltd.
Bos, Nanne; Sturms, Leontien M; Stellato, Rebecca K; Schrijvers, Augustinus J P; van Stel, Henk F
2015-10-01
Patients' experiences are an indicator of health-care performance in the accident and emergency department (A&E). The Consumer Quality Index for the Accident and Emergency department (CQI A&E), a questionnaire to assess the quality of care as experienced by patients, was investigated. The internal consistency, construct validity and discriminative capacity of the questionnaire were examined. In the Netherlands, twenty-one A&Es participated in a cross-sectional survey, covering 4883 patients. The questionnaire consisted of 78 questions. Principal components analysis determined underlying domains. Internal consistency was determined by Cronbach's alpha coefficients, construct validity by Pearson's correlation coefficients and the discriminative capacity by intraclass correlation coefficients and reliability of A&E-level mean scores (G-coefficient). Seven quality domains emerged from the principal components analysis: information before treatment, timeliness, attitude of health-care professionals, professionalism of received care, information during treatment, environment and facilities, and discharge management. Domains were internally consistent (range: 0.67-0.84). Five domains and the 'global quality rating' had the capacity to discriminate among A&Es (significant intraclass correlation coefficient). Four domains and the 'global quality rating' were close to or above the threshold for reliably demonstrating differences among A&Es. The patients' experiences score on the domain timeliness showed the largest range between the worst- and best-performing A&E. The CQI A&E is a validated survey to measure health-care performance in the A&E from patients' perspective. Five domains regarding quality of care aspects and the 'global quality rating' had the capacity to discriminate among A&Es. © 2013 John Wiley & Sons Ltd.
Eisinger-Watzl, Marianne; Straßburg, Andrea; Ramünke, Josa; Krems, Carolin; Heuer, Thorsten; Hoffmann, Ingrid
2015-04-01
To further characterise the performance of the diet history method and the 24-h recalls method, both in an updated version, a comparison was conducted. The National Nutrition Survey II, representative for Germany, assessed food consumption with both methods. The comparison was conducted in a sample of 9,968 participants aged 14-80. Besides calculating mean differences, statistical agreement measurements encompass Spearman and intraclass correlation coefficients, ranking participants in quartiles and the Bland-Altman method. Mean consumption of 12 out of 18 food groups was higher when assessed with the diet history method. Three of these 12 food groups had a medium to large effect size (e.g., raw vegetables) and seven showed at least a small effect size, while there was basically no difference for coffee/tea or ice cream. Intraclass correlations were strong only for beverages (>0.50) and were weakest for vegetables (<0.20). Quartile classification showed more than two-thirds of participants being ranked in the same or an adjacent quartile by both methods. For every food group, Bland-Altman plots showed that the agreement of both methods weakened with increasing consumption. The cognitive effort required by the diet history method to remember consumption over the past 4 weeks may be a source of inaccuracy, especially for inhomogeneous food groups. Additionally, social desirability gains significance. There is no assessment method without errors and attention to specific food groups is a critical issue with every method. Altogether, the 24-h recalls method applied in the presented study offers advantages in approximating food consumption as compared to the diet history method.
Xaplanteris, Panagiotis; Fournier, Stephane; Keulards, Daniëlle C J; Adjedj, Julien; Ciccarelli, Giovanni; Milkas, Anastasios; Pellicano, Mariano; Van't Veer, Marcel; Barbato, Emanuele; Pijls, Nico H J; De Bruyne, Bernard
2018-03-01
The principle of continuous thermodilution can be used to calculate absolute coronary blood flow and microvascular resistance (R). The aim of the study is to explore the safety, feasibility, and reproducibility of coronary blood flow and R measurements as measured by continuous thermodilution in humans. Absolute coronary flow and R can be calculated by thermodilution by infusing saline at room temperature through a dedicated monorail catheter. The temperature of saline as it enters the vessel, the temperature of blood and saline mixed in the distal part of the vessel, and the distal coronary pressure were measured by a pressure/temperature sensor-tipped guidewire. The feasibility and safety of the method were tested in 135 patients who were referred for coronary angiography. No significant adverse events were observed; in 11 (8.1%) patients, bradycardia and concomitant atrioventricular block appeared transiently and were reversed immediately on interruption of the infusion. The reproducibility of measurements was tested in a subgroup of 80 patients (129 arteries). Duplicate measurements had a strong correlation both for coronary blood flow (ρ=0.841, P <0.001; intraclass correlation coefficient=0.89, P <0.001) and R (ρ=0.780, P <0.001; intraclass correlation coefficient=0.89, P <0.001). In Bland-Altman plots, there was no significant bias or asymmetry. Absolute coronary blood flow (in L/min) and R (in mm Hg/L/min or Wood units) can be safely and reproducibly measured with continuous thermodilution. This approach constitutes a new opportunity for the study of the coronary microcirculation. © 2018 American Heart Association, Inc.
The Stayhealthy bioelectrical impedance analyzer predicts body fat in children and adults.
Erceg, David N; Dieli-Conwright, Christina M; Rossuello, Amerigo E; Jensky, Nicole E; Sun, Stephanie; Schroeder, E Todd
2010-05-01
Bioelectrical impedance analysis (BIA) is a time-efficient and cost-effective method for estimating body composition. We hypothesized that there would be no significant difference between the Stayhealthy BC1 BIA and the selected reference methods when determining body composition. Thus, the purpose of the present study was to determine the validity of estimating percent body fat (%BF) using the Stayhealthy BIA with its most recently updated algorithms compared to the reference methods of dual-energy x-ray absorptiometry for adults and hydrostatic weighing for children. We measured %BF in 245 adults aged 18 to 80 years and 115 children aged 10 to 17 years. Body fat by BIA was determined using a single 50 kHz frequency handheld impedance device and proprietary software. Agreement between BIA and reference methods was assessed by Bland and Altman plots. Bland and Altman analysis for men, women, and children revealed good agreement between the reference methods and BIA. There was no significant difference by t tests between mean %BF by BIA for men, women, or children when compared to the respective reference method. Significant correlation values between BIA, and reference methods for all men, women, and children were 0.85, 0.88, and 0.79, respectively. Reliability (test-retest) was assessed by intraclass correlation coefficient and coefficient of variation. Intraclass correlation coefficient values were greater than 0.99 (P < .001) for men, women, and children with coefficient of variation values 3.3%, 1.8%, and 1.7%, respectively. The Stayhealthy BIA device demonstrated good agreement between reference methods using Bland and Altman analyses. Copyright 2010 Elsevier Inc. All rights reserved.
Schrems, Wolfgang A; Schrems-Hoesl, Laura M; Bendschneider, Delia; Mardin, Christian Y; Laemmer, Robert; Kruse, Friedrich E; Horn, Folkert K
2015-10-01
New methods are needed to compare peripapillary retinal nerve fiber layer thickness (pRNFLT) measurements taken from time-domain optical coherence tomography (TD-OCT) and spectral-domain OCT (SD-OCT). To compare the agreement of measured and predicted pRNFLT using different equations based on pRNFLT measurements obtained by TD-OCT and SD-OCT. Cross-sectional single-center study that took place at the Department of Ophthalmology, University of Erlangen-Nuremberg from November 16, 2005, to June 3, 2015, and included 138 eyes of control participants, 126 eyes of patients with ocular hypertension, 128 eyes of patients with preperimetric glaucoma, and 160 eyes of patients with perimetric glaucoma. All participants had standard clinical examinations to obtain TD-OCT (via Stratus OCT) and SD-OCT (via Spectralis OCT) measurements of pRNFLT. Two groups were matched for diagnostic subgroup, eye side, sex, and age. The TD-OCT measurements of the first group were used to predict the mean SD-OCT and 6-sector vertical-split pRNFLT measurements of the second group and vice versa. The agreement between the predicted pRNFLT calculations of conversion equations and measured pRNFLT of the second group was evaluated by intraclass correlation coefficients and Bland-Altman plots. Mean and sectoral pRNFLT measurements obtained by TD-OCT and SD-OCT as well as the agreement between measured and predicted pRNFLT. The agreement for all investigated equations to predict mean pRNFLT measurements with intraclass correlation coefficients ranged from 0.937 to 0.939. Bland-Altman plots demonstrated systemic biases between -0.7 μm and +1.1 μm for measured and predicted mean pRNFLT measurements. The ratio method demonstrated an intraclass correlation coefficient of 0.969 for the temporal-inferior sector. The best color-code agreement between both OCT devices was achieved by the no conversion method, with κ = 0.731 (95% CI, 0.656-0.806) for the mean pRNFLT. 
These data suggest that the prediction of mean pRNFLT values by equations derived from TD-OCT and SD-OCT can be conducted with high levels of agreement. In individual cases and singular sectors, high prediction errors may occur. When longitudinal imaging data from both TD-OCT and SD-OCT are available, conversion equations may provide longitudinal comparability.
Nguyen, Anh-Dung; Boling, Michelle C; Slye, Carrie A; Hartley, Emily M; Parisi, Gina L
2013-01-01
Accurate, efficient, and reliable measurement methods are essential to prospectively identify risk factors for knee injuries in large cohorts. To determine tester reliability using digital photographs for the measurement of static lower extremity alignment (LEA) and whether values quantified with an electromagnetic motion-tracking system are in agreement with those quantified with clinical methods and digital photographs. Descriptive laboratory study. Laboratory. Thirty-three individuals participated and included 17 (10 women, 7 men; age = 21.7 ± 2.7 years, height = 163.4 ± 6.4 cm, mass = 59.7 ± 7.8 kg, body mass index = 23.7 ± 2.6 kg/m2) in study 1, in which we examined the reliability between clinical measures and digital photographs in 1 trained and 1 novice investigator, and 16 (11 women, 5 men; age = 22.3 ± 1.6 years, height = 170.3 ± 6.9 cm, mass = 72.9 ± 16.4 kg, body mass index = 25.2 ± 5.4 kg/m2) in study 2, in which we examined the agreement among clinical measures, digital photographs, and an electromagnetic tracking system. We evaluated measures of pelvic angle, quadriceps angle, tibiofemoral angle, genu recurvatum, femur length, and tibia length. Clinical measures were assessed using clinically accepted methods. Frontal- and sagittal-plane digital images were captured and imported into a computer software program. Anatomic landmarks were digitized using an electromagnetic tracking system to calculate static LEA. Intraclass correlation coefficients and standard errors of measurement were calculated to examine tester reliability. We calculated 95% limits of agreement and used Bland-Altman plots to examine agreement among clinical measures, digital photographs, and an electromagnetic tracking system. 
Using digital photographs, fair to excellent intratester (intraclass correlation coefficient range = 0.70-0.99) and intertester (intraclass correlation coefficient range = 0.75-0.97) reliability were observed for static knee alignment and limb-length measures. An acceptable level of agreement was observed between clinical measures and digital pictures for limb-length measures. When comparing clinical measures and digital photographs with the electromagnetic tracking system, an acceptable level of agreement was observed in measures of static knee angles and limb-length measures. The use of digital photographs and an electromagnetic tracking system appears to be an efficient and reliable method to assess static knee alignment and limb-length measurements.
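The 95% limits of agreement mentioned in this abstract are conventionally computed as the mean difference between two methods plus or minus 1.96 standard deviations of the differences. A minimal sketch follows, assuming paired measurements on the same participants; the function name and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (bias, lower, upper) Bland-Altman limits for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # systematic difference between the two methods
    sd = stdev(diffs)    # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired angle measurements in degrees (illustration only).
clinical = [12.0, 14.5, 10.0, 13.0, 15.5]
photo = [11.5, 15.0, 10.5, 12.0, 15.0]

bias, lower, upper = limits_of_agreement(clinical, photo)
print(f"bias={bias:.2f} deg, limits of agreement=({lower:.2f}, {upper:.2f}) deg")
```

Whether the resulting interval counts as "acceptable" is a clinical judgment made against a predefined tolerance; the calculation itself does not decide it.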
Taylor, Alden L; Wilken, Jason M; Deyle, Gail D; Gill, Norman W
2014-04-01
Descriptive biomechanical study using an experimental repeated-measures design. To quantify the response of participants with and without knee osteoarthritis (OA) to a single session of manual physical therapy. The intervention consisted primarily of joint mobilization techniques, supplemented by exercises, aiming to improve knee extension. While manual therapy benefits patients with knee OA, there is limited research quantifying the effects of a manual therapy treatment session on either motion or stiffness of osteoarthritic and normal knees. The study included 5 participants with knee OA and 5 age-, gender-, and body mass index-matched healthy volunteers. Knee extension motion and stiffness were measured with videofluoroscopy before and after a 30-minute manual therapy treatment session. Analysis of variance and intraclass correlation coefficients were used to analyze the data. Participants with knee OA had restricted knee extension range of motion at baseline, in contrast to the participants with normal knees, who had full knee extension. After the therapy session, there was a significant increase in knee motion in participants with knee OA (P = .004) but not in those with normal knees (P = .201). For stiffness data, there was no main effect for time (P = .903) or load (P = .274), but there was a main effect of group (P = .012), with the participants with healthy knees having greater stiffness than those with knee OA. Reliability, using intraclass correlation coefficient model 3,3, for knee angle measurements between imaging sessions for all loading conditions was 0.99. Reliability (intraclass correlation coefficient model 3,1) for intraimage measurements was 0.97. End-range knee extension stiffness was greater in the participants with normal knees than those with knee OA.
The combination of lesser stiffness and lack of motion in those with knee OA, which may indicate the potential for improvement, may explain why increased knee extension angle was observed following a single session of manual therapy in the participants with knee OA but not in those with normal knees. Videofluoroscopy of the knee appears reliable and relevant for future studies attempting to quantify the underlying mechanisms of manual therapy. J Orthop Sports Phys Ther 2014;44(4):273-282. Epub 25 February 2014. doi:10.2519/jospt.2014.4710.
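The "model 3,1" and "model 3,k" coefficients named above follow the two-way mixed-effects formulation, in which the single-measure ICC is (BMS - EMS) / (BMS + (k - 1) EMS) and the average-measure ICC is (BMS - EMS) / BMS, with BMS the between-subjects mean square and EMS the residual mean square. The sketch below assumes a complete subjects-by-measurements table; the function name and the ratings are illustrative and are not taken from the study.

```python
def icc3(ratings):
    """Return (ICC(3,1), ICC(3,k)) for a complete table.

    ratings: list of rows, one row per subject, each holding k measurements.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # measurements (raters or sessions) per subject
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-subjects mean square
    ems = ss_err / ((n - 1) * (k - 1))    # residual mean square

    single = (bms - ems) / (bms + (k - 1) * ems)
    average = (bms - ems) / bms
    return single, average


# Illustrative ratings: 6 subjects, 4 measurements each (not study data).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
single, average = icc3(ratings)
print(f"ICC(3,1)={single:.3f}, ICC(3,k)={average:.3f}")
```

The average-measure form is always at least as large as the single-measure form when the single-measure ICC is positive, which is one reason abstracts should state which model was used.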
The Reliability and Validity of the Computerized Double Inclinometer in Measuring Lumbar Mobility
MacDermid, Joy Christine; Arumugam, Vanitha; Vincent, Joshua Israel; Carroll, Krista L
2014-01-01
Study Design: Repeated measures reliability/validity study. Objectives: To determine the concurrent validity, test-retest, inter-rater and intra-rater reliability of lumbar flexion and extension measurements using the Tracker M.E. computerized dual inclinometer (CDI) in comparison to the modified-modified Schober (MMS). Summary of Background: Numerous studies have evaluated the reliability and validity of the various methods of measuring spinal motion, but the results are inconsistent. Differences in equipment and techniques make it difficult to correlate results. Methods: Twenty subjects with back pain and twenty without back pain were selected through convenience sampling. Two examiners measured sagittal plane lumbar range of motion for each subject. Two separate tests with the CDI and one test with the MMS were conducted. Each test consisted of three trials. Instrument and examiner order was randomly assigned. Intra-class correlations (ICCs 2, 2 and 2, 2) and Pearson correlation coefficients (r) were used to calculate reliability and concurrent validity respectively. Results: Intra-trial reliability was high to very high for both the CDI (ICCs 0.85 - 0.96) and MMS (ICCs 0.84 - 0.98). However, the reliability was poor to moderate when the CDI unit had to be repositioned either by the same rater (ICCs 0.16 - 0.59) or a different rater (ICCs 0.45 - 0.52). Inter-rater reliability for the MMS was moderate to high (ICCs 0.75 - 0.82), which bettered the moderate correlation obtained for the CDI (ICCs 0.45 - 0.52). Correlations between the CDI and MMS were poor for flexion (0.32; p<0.05) and poor to moderate (-0.42 - -0.51; p<0.05) for extension measurements. Conclusion: When using the CDI, an average of subsequent tests is required to obtain moderate reliability. The MMS was more reliable than the CDI. The MMS and the CDI measure lumbar movement on different metrics that are not highly related to each other. PMID:25352928
Macedo-Ojeda, Gabriela; Márquez-Sandoval, Fabiola; Fernández-Ballart, Joan; Vizmanos, Barbara
2016-01-01
The study of diet quality in a population provides information for the development of programs to improve nutritional status through better directed actions. The aim of this study was to assess the reproducibility and relative validity of a Mexican Diet Quality Index (ICDMx) for the assessment of the habitual diet of adults. The ICDMx was designed to assess the characteristics of a healthy diet using a validated semi-quantitative food frequency questionnaire (FFQ-Mx). Reproducibility was determined by comparing 2 ICDMx based on FFQs (one-year interval). Relative validity was assessed by comparing the ICDMx (2nd FFQ) with that estimated based on the intake averages from dietary records (nine days). The questionnaires were answered by 97 adults (mean age in years = 27.5, SD = 12.6). Pearson (r) and intraclass correlations (ICC) were calculated; Bland-Altman plots, Cohen’s κ coefficients and blood lipid determinations complemented the analysis. Additional analysis compared ICDMx scores with nutrients derived from dietary records, using a Pearson correlation. These nutrient intakes were transformed logarithmically to improve normality (log10) and adjusted according to energy, prior to analyses. ICC reproducibility values for the ICDMx ranged from 0.33 to 0.87 (23/24 items with significant correlations; mean = 0.63), while relative validity ranged from 0.26 to 0.79 (mean = 0.45). Bland-Altman plots showed a high level of agreement between methods. ICDMx scores were inversely correlated (p < 0.05) with total blood cholesterol (r = −0.33) and triglycerides (r = −0.22). ICDMx (as calculated from FFQs and DRs) obtained positive correlations with fiber, magnesium, potassium, retinol, thiamin, riboflavin, pyridoxine, and folate. The ICDMx obtained acceptable levels of reproducibility and relative validity in this population. It can be useful for population nutritional surveillance and to assess the changes resulting from the implementation of nutritional interventions.
PMID:27563921
Relationship between photoreceptor outer segment length and visual acuity in diabetic macular edema.
Forooghian, Farzin; Stetson, Paul F; Meyer, Scott A; Chew, Emily Y; Wong, Wai T; Cukras, Catherine; Meyerle, Catherine B; Ferris, Frederick L
2010-01-01
The purpose of this study was to quantify photoreceptor outer segment (PROS) length in 27 consecutive patients (30 eyes) with diabetic macular edema using spectral domain optical coherence tomography and to describe the correlation between PROS length and visual acuity. Three spectral domain-optical coherence tomography scans were performed on all eyes during each session using Cirrus HD-OCT. A prototype algorithm was developed for quantitative assessment of PROS length. Retinal thicknesses and PROS lengths were calculated for 3 parameters: macular grid (6 x 6 mm), central subfield (1 mm), and center foveal point (0.33 mm). Intrasession repeatability was assessed using coefficient of variation and intraclass correlation coefficient. The association between retinal thickness and PROS length with visual acuity was assessed using linear regression and Pearson correlation analyses. The main outcome measures include intrasession repeatability of macular parameters and correlation of these parameters with visual acuity. Mean retinal thickness and PROS length were 298 μm to 381 μm and 30 μm to 32 μm, respectively, for macular parameters assessed in this study. Coefficient of variation values were 0.75% to 4.13% for retinal thickness and 1.97% to 14.01% for PROS length. Intraclass correlation coefficient values were 0.96 to 0.99 and 0.73 to 0.98 for retinal thickness and PROS length, respectively. Slopes from linear regression analyses assessing the association of retinal thickness and visual acuity were not significantly different from 0 (P > 0.20), whereas the slopes of PROS length and visual acuity were significantly different from 0 (P < 0.0005). Correlation coefficients for macular thickness and visual acuity ranged from 0.13 to 0.22, whereas coefficients for PROS length and visual acuity ranged from -0.61 to -0.81. Photoreceptor outer segment length can be quantitatively assessed using Cirrus HD-OCT.
Although the intrasession repeatability of PROS measurements was less than that of macular thickness measurements, the stronger correlation of PROS length with visual acuity suggests that the PROS measures may be more directly related to visual function. Photoreceptor outer segment length may be a useful physiologic outcome measure, both clinically and as a direct assessment of treatment effects.
Yi, Honglei; Wei, Xianzhao; Zhang, Wei; Chen, Ziqiang; Wang, Xinhui; Ji, Xinran; Zhu, Xiaodong; Wang, Fei; Xu, Ximing; Li, Zhikun; Fan, Jianping; Wang, Chuanfeng; Chen, Kai; Zhang, Guoyou; Zhao, Yinchuan; Li, Ming
2014-05-01
This was a prospective clinical validation study. To evaluate the reliability and validity of the adapted simplified Chinese version of Swiss Spinal Stenosis (SC-SSS) Questionnaire. The SSS Questionnaire is a reliable and valid instrument to assess the perception of function and pain for patients with degenerative lumbar spinal stenosis. However, there is no culturally adapted SSS Questionnaire for use in mainland China. The adaption was conducted according to International Quality of Life Assessment Project guidelines. To examine the psychometric properties of the adapted SC-SSS Questionnaire, a sample of 105 patients with lumbar spinal stenosis was included. Thirty-two patients were randomly selected to evaluate the test-retest reliability. Reliability of the SC-SSS Questionnaire was assessed by calculating Cronbach α and intraclass correlation coefficient values. Concurrent validity was assessed by correlating SC-SSS Questionnaire scores with relevant domains of the 36-Item Short Form Health Survey. Cronbach α values of the symptom severity scale, physical function scale, and patients' satisfaction scale of the SC-SSS Questionnaire were 0.89, 0.86, and 0.91, respectively, which revealed very good internal consistency. The test-retest reproducibility was found to be excellent, with intraclass correlation coefficients of 0.93, 0.91, and 0.95. In terms of concurrent validity, the SC-SSS Questionnaire had good correlation with physical functioning and bodily pain of the 36-Item Short Form Health Survey (r = 0.663, 0.653) and low correlation with mental health (r = 0.289). The physical function scale had good correlation with physical functioning of the 36-Item Short Form Health Survey (r = 0.637), whereas the rest had moderate correlation. The satisfaction scale score was highly correlated with the change in the symptom severity (r = 0.71) and physical function (r = 0.68) scale score.
The SC-SSS Questionnaire showed satisfactory reliability and validity in the evaluation of functionality in patients with lumbar spinal stenosis who are experiencing neurogenic claudication. It is simple and easy to use and can be recommended in clinical and research practice in mainland China. 3.
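Cronbach α, used above as the internal consistency measure, is commonly computed as k / (k - 1) × (1 - Σ item variances / variance of the total score), where k is the number of items. The following is a minimal sketch under that standard definition; the function name and the responses are hypothetical and are not data from the questionnaire study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per item, each holding one score per respondent.
    """
    k = len(items)
    # Total scale score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 items answered by 5 respondents.
items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 3],
]
print(f"alpha={cronbach_alpha(items):.2f}")
```

Alpha describes how consistently the items covary within one administration; test-retest stability, reported above through the intraclass correlation coefficient, is a separate property.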
Rasmuson, James O; Roggli, Victor L; Boelter, Fred W; Rasmuson, Eric J; Redinger, Charles F
2014-01-01
A detailed evaluation of the correlation and linearity of industrial hygiene retrospective exposure assessment (REA) for cumulative asbestos exposure with asbestos lung burden analysis (LBA) has not been previously performed, but both methods are utilized for case-control and cohort studies and other applications such as setting occupational exposure limits. (a) To correlate REA with asbestos LBA for a large number of cases from varied industries and exposure scenarios; (b) to evaluate the linearity, precision, and applicability of both industrial hygiene exposure reconstruction and LBA; and (c) to demonstrate validation methods for REA. A panel of four experienced industrial hygiene raters independently estimated the cumulative asbestos exposure for 363 cases with limited exposure details in which asbestos LBA had been independently determined. LBA for asbestos bodies was performed by a pathologist by both light microscopy and scanning electron microscopy (SEM) and free asbestos fibers by SEM. Precision, reliability, correlation and linearity were evaluated via intraclass correlation, regression analysis and analysis of covariance. Plaintiff's answers to interrogatories, work history sheets, work summaries or plaintiff's discovery depositions that were obtained in court cases involving asbestos were utilized by the pathologist to provide a summarized brief asbestos exposure and work history for each of the 363 cases. Linear relationships between REA and LBA were found when adjustment was made for asbestos fiber-type exposure differences. Significant correlation between REA and LBA was found with amphibole asbestos lung burden and mixed fiber-types, but not with chrysotile. The intraclass correlation coefficients (ICC) for the precision of the industrial hygiene rater cumulative asbestos exposure estimates and the precision of repeated laboratory analysis were found to be in the excellent range. 
The ICC estimates were performed independent of specific asbestos fiber-type. Both REA and pathology assessment are reliable and complementary predictive methods to characterize asbestos exposures. Correlation analysis between the two methods effectively validates both REA methodology and LBA procedures within the determined precision, particularly for cumulative amphibole asbestos exposures since chrysotile fibers, for the most part, are not retained in the lung for an extended period of time.
Epidemiology of Parkinson disease in the city of Kolkata, India
Das, S.K.; Misra, A.K.; Ray, B.K.; Hazra, A.; Ghosal, M.K.; Chaudhuri, A.; Roy, T.; Banerjee, T.K.; Raut, D.K.
2010-01-01
Objective: No well-designed longitudinal study on Parkinson disease (PD) has been conducted in India. Therefore, we planned to determine the prevalence, incidence, and mortality rates of PD in the city of Kolkata, India, on a stratified random sample through a door-to-door survey. Method: This study was undertaken between 2003 and 2007 with a validated questionnaire by a team consisting of 4 trained field workers in 3 stages. Field workers screened the cases, later confirmed by a specialist doctor. In the third stage, a movement disorders specialist undertook home visits and reviewed all surviving cases after 1 year from last screening. Information on death was collected through verbal autopsy. A nested case-control study (1:3) was also undertaken to determine putative risk factors. The rates were age adjusted to the World Standard Population. Result: A total population of 100,802 was screened. The age-adjusted prevalence rate (PR) and average annual incidence rate were 52.85/100,000 and 5.71/100,000 per year, respectively. The slum population showed significantly decreased PR with age compared with the nonslum population. The adjusted average annual mortality rate was 2.89/100,000 per year. The relative risk of death was 8.98. The case-control study showed that tobacco chewing protected and hypertension increased PD occurrence. Conclusion: This study documented lower prevalence and incidence of PD as compared with Caucasian and a few Oriental populations. The mortality rates were comparable. The decreased age-specific PR among slum populations and higher relative risk of death need further probing.
GLOSSARY AAIR = average annual incidence rate; AAMR = average annual mortality rate; CI = confidence interval; FSQ = family screening questionnaire; ICC = intraclass correlation coefficient; IR = incidence rate; MD = movement disorder; NSSO = National Sample Survey Organization; OR = odds ratio; PD = Parkinson disease; PPS = parkinsonism plus syndrome; PR = prevalence rate; PRM = Poisson regression modeling; RR = relative risk; SP = secondary parkinsonism; VA = verbal autopsy. PMID:20938028
Simulated Keratometry Repeatability in Subjects with & without Down Syndrome
Ravikumar, Ayeswarya; Marsack, Jason D.; Benoit, Julia S.; Anderson, Heather A.
2016-01-01
Purpose To assess the repeatability of simulated keratometry measures obtained with Zeiss Atlas topography for subjects with and without Down syndrome (DS). Methods Corneal topography was attempted on 140 subjects with DS and 138 controls (aged 7 to 59 years). Subjects who had at least 3 measures in each eye were included in analysis (DS: n=140 eyes (70 subjects) and controls: n=264 eyes (132 subjects)). For each measurement, the steep corneal power (K), corneal astigmatism, flat K orientation, power vector representation of astigmatism (J0, J45), and astigmatic dioptric difference were determined (collectively termed keratometry values here). For flat K orientation comparisons, only eyes with >0.50 DC of astigmatism were included (DS: n=131 eyes (68 subjects) and control: n=217 eyes (119 subjects)). Repeatability was assessed using 1) group mean variability (average standard deviation (SD) across subjects), 2) coefficient of repeatability (COR), 3) coefficient of variation (COV), and 4) intraclass correlation coefficient (ICC). Results The keratometry values showed good repeatability as evidenced by low group mean variability for DS vs control eyes (≤0.26D vs ≤0.09D for all dioptric values; 4.51° vs 3.16° for flat K orientation); however, the group mean variability was significantly higher in DS eyes than control eyes for all parameters (p≤0.03). On average, group mean variability was 2.5× greater in the DS eyes compared to control eyes across the keratometry values. Other metrics of repeatability also indicated good repeatability for both populations for each keratometry value, although repeatability was always better in the control eyes. Conclusions DS eyes showed more variability (on average: 2.5×) compared to controls for all keratometry values.
Although differences were statistically significant, on average 91% of DS eyes had variability ≤0.50D for steep K and astigmatism, and 75% of DS eyes had variability ≤5 degrees for flat K orientation. PMID:27741083
Reliability and Validity Assessment of a Linear Position Transducer
Garnacho-Castaño, Manuel V.; López-Lastra, Silvia; Maté-Muñoz, José L.
2015-01-01
The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training. Key points: This study determined the validity and reliability of peak velocity, average velocity, peak power and average power measurements made using a linear position transducer. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and power. PMID:25729300
Galal, Sherif
2017-01-01
Nonunion after locked plating of distal femur fractures is not uncommon. The authors wanted to assess whether "dynamic" locked plating using the near-cortex over-drilling technique would provide a mechanical environment that promotes callus formation, thereby avoiding the nonunion encountered when applying locked plates with the conventional method. This study was conducted at an academic Level 1 Trauma Center. This is a prospective study conducted from November 2015 to November 2016. Follow-up was 10 months on average (ranging from 8 to 12 months). The study included 20 patients with 20 fractures (13 males, 7 females). The average patient age was 41.2 years (18-64 years). According to the Müller AO classification of distal femur fractures (33A-C), there were 15 cases with extra-articular fractures (AO 33A) and 5 patients with intra-articular fractures (AO 33C). Dynamic locked plating using the near-cortical over-drilling technique was done for all patients. Two blinded observers assessed callus score on 6-week radiographs using a 4-point ordinal scale. A 2-tailed t-test was used. Two-way mixed intraclass correlation testing was performed to determine reliability of the callus measurements by the 2 observers. All patients achieved union; time to union was 13.4 weeks on average (range from 8 to 24 weeks). Delayed union was observed in 2 patients. The average callus score for fractures was 1.8 (SD 0.6). All fractures united in alignment except 1 fracture, which united in valgus malalignment; the deformity was appreciated in the postoperative radiographs. No wound-related complications, loss of reduction, catastrophic implant failure or screw breakage were detected. Dynamic locked plating using near-cortex over-drilling is a simple technique that uses standard locked plates and promotes callus formation when used for fixing distal femur fractures.
Multivariate Analysis and Its Applications
1989-02-14
defined in situations where measurements are taken on natural clusters of individuals like brothers in a family. A number of problems arise in the study of...intraclass correlations. How do we estimate it when observations are available on clusters of different sizes? How do we test the hypothesis that the...the random variable y(X) = #I X + G2X 2 + ... + GmX m , follows an exponential distribution with mean unity. Such a class of life distributions, has a
Reliability of the Wii Balance Board in kayak
Vando, Stefano; Laffaye, Guillaume; Masala, Daniele; Falese, Lavinia; Padulo, Johnny
2015-01-01
Summary Background: The seat of the kayaker represents the principal contact point for expressing mechanical energy. Methods: We therefore investigated the reliability of Wii Balance Board measures in the kayak vs. on the ground. Results: The Bland-Altman test showed a low systematic bias on the ground (2.85%) and in the kayak (−2.13%), and the intraclass correlation coefficient was 0.996. Conclusion: The Wii Balance Board is useful for assessing postural sway in the kayak. PMID:25878987
1990-01-10
reason for the fairly low reliability of the fourth and fifth MEOCS factors), issues of sexism and more subtle forms of racism have come to the fore...psychological climate (for which the individual is the unit for theory ). One approach, described by Glick, would use the intraclass correlation from a...and outcome measures are forced to remain obscure. A major flaw in the measurement of organizational climate is the lack of theory which would serve
Brook, Christopher D; Platt, Michael P; Russell, Kimberly; Grillone, Gregory A; Aliphas, Avner; Noordzij, J Pieter
2015-05-01
To determine the progression of flexible transnasal laryngoscopy reliability and competency in otolaryngology residency training. Prospective case control study. Academic otolaryngology department. Medical students, otolaryngology residents, and otolaryngology attending physicians. Fourteen otolaryngology residents from PGY-1 to PGY-5 and 3 attending otolaryngologists viewed 25 selected and digitally recorded flexible transnasal laryngoscopies. The evaluators were asked to rate 13 items relating to abnormalities in the oropharynx, hypopharynx, larynx, and subglottis. The level of concern and level of comfort with the diagnosis were assessed. Intraclass correlations were calculated for each topic and by level of training to determine reliability within each class and compare competency versus attending interpretations. Intraclass correlation of residents compared to attending physicians demonstrated significant improvements by year for left and right vocal fold immobility, subglottic stenosis, laryngeal mass, left and right vocal cord abnormalities, and level of concern. Additionally, pooled vocal cord mobility and pooled results in categories with good attending reliability demonstrated stepwise improvement as well. For these categories, resident reliability was found to be statistically similar to attending physicians in all categories by PGY-3. There were no trends for base of tongue abnormalities, pharyngeal abnormalities, and pharyngeal and hypopharyngeal masses. Resident competency for flexible transnasal laryngoscopy progresses during residency to reliability with attending otolaryngologists by the PGY-3 year over key facets of the examination. © American Academy of Otolaryngology-Head and Neck Surgery Foundation 2015.
Harries, Priscilla; Davies, Miranda
2015-01-01
Introduction As people with a range of disabilities strive to increase their community mobility, occupational therapy driver assessors are increasingly required to make complex recommendations regarding fitness-to-drive. However, very little is known about how therapists use information to make decisions. The aim of this study was to model how experienced occupational therapy driver assessors weight and combine information when making fitness-to-drive recommendations and establish their level of decision agreement. Method Using Social Judgment Theory method, this study examined how 45 experienced occupational therapy driver assessors from the UK, Australia and New Zealand made fitness-to-drive recommendations for a series of 64 case scenarios. Participants completed the task on a dedicated website, and data were analysed using discriminant function analysis and an intraclass correlation coefficient. Results Accounting for 87% of the variance, the cues central to the fitness-to-drive recommendations made by assessors are the client’s physical skills, cognitive and perceptual skills, road law craft skills, vehicle handling skills and the number of driving instructor interventions. Agreement (consensus) between fitness-to-drive recommendations was very high (intraclass correlation coefficient = .97, 95% confidence interval .96–.98). Conclusion Findings can be used by both experienced and novice driver assessors to reflect on and strengthen the fitness-to-drive recommendations made to clients. PMID:26435572
Reliability and Accuracy of Static Parameters Obtained From Ink and Pressure Platform Footprints.
Zuil-Escobar, Juan Carlos; Martínez-Cepa, Carmen Belén; Martín-Urrialde, Jose Antonio; Gómez-Conesa, Antonia
2016-09-01
The purpose of this study was to evaluate the accuracy and the intrarater reliability of arch angle (AA), Staheli Index (SI), and Chippaux-Smirak Index (CSI) obtained from ink and pressure platform footprints. We obtained AA, SI, and CSI measurements from ink pedigraph footprints and pressure platform footprints in 40 healthy participants (aged 25.65 ± 5.187 years). Intrarater reliability was calculated for all parameters obtained using the 2 methods. Standard error of measurement and minimal detectable change were also calculated. A repeated-measure analysis of variance was used to identify differences between ink and pressure platform footprints. Intraclass correlation coefficient and Bland and Altman plots were used to assess similar parameters obtained using different methods. Intrarater reliability was >0.9 for all parameters and was slightly higher for the ink footprints. No statistical difference was reported in repeated-measure analysis of variance for any of the parameters. Intraclass correlation coefficient values from AA, SI, and CSI that were obtained using ink footprints and pressure platform footprints were excellent, ranging from 0.797 to 0.829. However, pressure platform overestimated AA and underestimated SI and CSI. Our study revealed that AA, SI, and CSI were similar regardless of whether the ink or pressure platform method was used. In addition, the parameters indicated high intrarater reliability and were reproducible. Copyright © 2016. Published by Elsevier Inc.
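The standard error of measurement and minimal detectable change mentioned above are often derived from a reliability coefficient as SEM = SD × sqrt(1 - ICC) and MDC95 = 1.96 × sqrt(2) × SEM. The sketch below assumes those common definitions, which the abstract does not spell out; the function names and input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the sample standard deviation and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change_95(sem):
    """Smallest change exceeding measurement error at the 95% level."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical inputs (illustration only): SD of 5.0 units and ICC of 0.91.
sem = standard_error_of_measurement(5.0, 0.91)
mdc = minimal_detectable_change_95(sem)
print(f"SEM={sem:.2f}, MDC95={mdc:.2f}")
```

Because both quantities are expressed in the units of the measurement, they complement the unitless ICC when judging whether an observed change in an individual is real.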
Cortés-Castell, Ernesto; Juste, Mercedes; Palazón-Bru, Antonio; Monge, Laura; Sánchez-Ferrer, Francisco; Rizo-Baeza, María Mercedes
2017-01-01
Dual-energy X-ray absorptiometry (DXA) provides separate measurements of fat mass, fat-free mass and bone mass, and is a quick, accurate, and safe technique, yet one that is not readily available in routine clinical practice. Consequently, we aimed to develop statistical formulas to predict fat mass (%) and fat mass index (FMI) with simple parameters (age, sex, weight and height). We conducted a retrospective observational cross-sectional study in 416 overweight or obese patients aged 4-18 years that involved assessing adiposity by DXA (fat mass percentage and FMI), body mass index (BMI), sex and age. We randomly divided the sample into two parts (construction and validation). In the construction sample, we developed formulas to predict fat mass and FMI using linear multiple regression models. The formulas were validated in the other sample, calculating the intraclass correlation coefficient via bootstrapping. The fat mass percentage formula had a coefficient of determination of 0.65. This value was 0.86 for FMI. In the validation, the constructed formulas had an intraclass correlation coefficient of 0.77 for fat mass percentage and 0.92 for FMI. Our predictive formulas accurately predicted fat mass and FMI with simple parameters (BMI, sex and age) in children with overweight and obesity. The proposed methodology could be applied in other fields. Further studies are needed to externally validate these formulas.
Lee, Jong-Hyuck; Kim, Jae Hyuck; Kim, Sun Woong
2017-02-27
To compare the repeatability of central corneal thickness (CCT) measurement using the Pentacam between dry eyes and healthy eyes, as well as to investigate the effect of artificial tears on CCT measurement. The corneal thicknesses of 34 patients with dry eye and 28 healthy subjects were measured using the Pentacam. One eye from each subject was assigned randomly to a repeatability test, wherein a single operator performed three successive CCT measurements at two time points: before and 5 min after instillation of one artificial teardrop. The repeatability of measurements was assessed using the coefficient of repeatability and the intraclass correlation coefficient. The coefficient of repeatability values of the CCT measurements in dry and healthy eyes were 24.36 and 10.69 μm before instillation, and 16.85 and 9.72 μm after instillation, respectively. The intraclass correlation coefficient was higher in healthy eyes than in dry eyes (0.987 vs. 0.891), and it improved significantly in dry eyes (0.948) after instillation of one artificial teardrop. The CCT measurement fluctuated in dry eyes (repeated-measures analysis of variance, P<0.001), whereas no significant changes were detected in healthy eyes, either before or after artificial tear instillation. Central corneal thickness measurement is less repeatable in dry eyes than in healthy eyes. Artificial tears improve the repeatability of CCT measurements obtained using the Pentacam in dry eyes.
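One common definition of the coefficient of repeatability is 1.96 × sqrt(2) × Sw, where Sw is the within-subject standard deviation pooled across subjects. The abstract does not state which formula the authors used, so the sketch below is an assumption based on that definition; the function names and thickness values are hypothetical.

```python
import math
from statistics import mean, variance


def within_subject_sd(measurements):
    """Pooled within-subject SD: square root of the mean per-subject variance.

    measurements: list of rows, one per eye, each holding repeated readings.
    """
    return math.sqrt(mean([variance(row) for row in measurements]))


def coefficient_of_repeatability(measurements):
    """Range within which 95% of repeat differences are expected to fall."""
    return 1.96 * math.sqrt(2.0) * within_subject_sd(measurements)


# Hypothetical repeated corneal thickness readings in micrometres.
readings = [
    [540.0, 542.0, 544.0],
    [560.0, 561.0, 559.0],
    [525.0, 529.0, 527.0],
]
print(f"CoR={coefficient_of_repeatability(readings):.2f} um")
```

A smaller coefficient indicates tighter agreement between repeated readings, which is the direction of the improvement reported after artificial tear instillation.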
Rights, Jason D; Sterba, Sonya K
2016-11-01
Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed. © 2016 The British Psychological Society.
Psychometric assessment of the Spiritual Climate Scale Arabic version for nurses in Saudi Arabia.
Cruz, Jonas Preposi; Albaqawi, Hamdan Mohammad; Alharbi, Sami Melbes; Alicante, Jerico G; Vitorino, Luciano M; Abunab, Hamzeh Y
2017-12-07
To assess the psychometric properties of the Spiritual Climate Scale Arabic version for Saudi nurses. Evidence showed that a high level of spiritual climate in the workplace is associated with increased productivity and performance, enhanced emotional intelligence, organisational commitment and job satisfaction among nurses. A convenience sample of 165 Saudi nurses was surveyed in this descriptive, cross-sectional study. Cronbach's α and the intraclass correlation coefficient of the 2-week test-retest scores were computed to establish reliability. Exploratory factor analysis was performed to support the validity of the Spiritual Climate Scale Arabic version. The Spiritual Climate Scale Arabic version manifested excellent content validity. Exploratory factor analysis supported a single factor with an explained variance of 73.2%. The Cronbach's α values of the scale ranged from .79 to .88, while the intraclass correlation coefficient value was .90. The perceived spiritual climate was associated with the respondents' hospital, gender, age and years of experience. Findings of this study support the sound psychometric properties of the Spiritual Climate Scale Arabic version. The Spiritual Climate Scale Arabic version can be used by nurse managers to assess the nurses' perception of the spiritual climate in any clinical area. This process can lead to spiritually centred interventions, thereby ensuring a clinical climate that accepts and respects different spiritual beliefs and practices. © 2017 John Wiley & Sons Ltd.
Do physiotherapy staff record treatment time accurately? An observational study.
Bagley, Pam; Hudson, Mary; Green, John; Forster, Anne; Young, John
2009-09-01
To assess the reliability of duration of treatment time measured by physiotherapy staff in early-stage stroke patients. Comparison of physiotherapy staff's recording of treatment sessions and video recording. Rehabilitation stroke unit in a general hospital. Thirty-nine stroke patients without trunk control or who were unable to stand with an erect trunk without the support of two therapists recruited to a randomized trial evaluating the Oswestry Standing Frame. Twenty-six physiotherapy staff who were involved in patient treatment. Contemporaneous recording by physiotherapy staff of treatment time (in minutes) compared with video recording. Intraclass correlation with 95% confidence interval and the Bland and Altman method for assessing agreement by calculating the mean difference (standard deviation; 95% confidence interval), reliability coefficient and 95% limits of agreement for the differences between the measurements. The mean duration (standard deviation, SD) of treatment time recorded by physiotherapy staff was 32 (11) minutes compared with 25 (9) minutes as evidenced in the video recording. The mean difference (SD) was -6 (9) minutes (95% confidence interval (CI) -9 to -3). The reliability coefficient was 18 minutes and the 95% limits of agreement were -24 to 12 minutes. Intraclass correlation coefficient for agreement between the two methods was 0.50 (95% CI 0.12 to 0.73). Physiotherapy staff's recording of duration of treatment time was not reliable and was systematically greater than the video recording.
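The agreement figures in this abstract can be approximately reproduced from the reported summary statistics alone. The sketch below is a minimal illustration, assuming the conventional Bland and Altman limits of agreement (mean difference plus or minus 1.96 standard deviations of the differences). The function name `limits_of_agreement` is hypothetical and is not taken from the study.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Return the (lower, upper) limits of agreement.

    Assumes the differences are roughly normally distributed, so that
    about 95% of them fall within mean_diff +/- z * sd_diff.
    """
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Summary values reported in the abstract: mean difference -6 min, SD 9 min.
lower, upper = limits_of_agreement(-6.0, 9.0)
print(f"95% limits of agreement: {lower:.2f} to {upper:.2f} minutes")
```

Rounded to whole minutes, these limits are -24 to 12, in line with the abstract. The half-width of about 18 minutes is also consistent with the reported reliability coefficient, although the abstract does not state how that coefficient was computed.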
Nakagawa, Shinichi; Johnson, Paul C D; Schielzeth, Holger
2017-09-01
The coefficient of determination R² quantifies the proportion of variance explained by a statistical model and is an important summary statistic of biological interest. However, estimating R² for generalized linear mixed models (GLMMs) remains challenging. We have previously introduced a version of R² that we called [Formula: see text] for Poisson and binomial GLMMs, but not for other distributional families. Similarly, we earlier discussed how to estimate intra-class correlation coefficients (ICCs) using Poisson and binomial GLMMs. In this paper, we generalize our methods to all other non-Gaussian distributions, in particular to negative binomial and gamma distributions that are commonly used for modelling biological data. While expanding our approach, we highlight two useful concepts for biologists, Jensen's inequality and the delta method, both of which help us in understanding the properties of GLMMs. Jensen's inequality has important implications for biologically meaningful interpretation of GLMMs, whereas the delta method allows a general derivation of variance associated with non-Gaussian distributions. We also discuss some special considerations for binomial GLMMs with binary or proportion data. We illustrate the implementation of our extension by worked examples from the field of ecology and evolution in the R environment. However, our method can be used across disciplines and regardless of statistical environments. © 2017 The Author(s).
Heritability of carotid intima-media thickness: a twin study.
Zhao, Jinying; Cheema, Faiz A; Bremner, J Douglas; Goldberg, Jack; Su, Shaoyong; Snieder, Harold; Maisano, Carisa; Jones, Linda; Javed, Farhan; Murrah, Nancy; Le, Ngoc-Anh; Vaccarino, Viola
2008-04-01
To estimate the heritability of carotid intima-media thickness (IMT), a surrogate marker for atherosclerosis, independent of traditional coronary risk factors. We performed a classical twin study of carotid IMT using 98 middle-aged male twin pairs, 58 monozygotic (MZ) and 40 dizygotic (DZ) pairs, from the Vietnam Era Twin Registry. All twins were free of overt cardiovascular disease. Carotid IMT was measured by ultrasound. Bivariate and multivariate analyses were used to determine the association between traditional cardiovascular risk factors and carotid IMT. Intraclass correlation coefficients and genetic modeling techniques were used to determine the relative contributions of genes and environment to the variation in carotid IMT. In our sample, the mean of the maximum carotid IMT was 0.75+/-0.11. Age, systolic blood pressure and HDL were significantly associated with carotid IMT. The intraclass correlation coefficient for carotid IMT was larger in MZ (0.66; 95% confidence interval [CI], 0.62-0.69) than in DZ twins (0.37; 95% CI, 0.29-0.44), and the unadjusted heritability was 0.69 (95% CI, 0.54-0.79). After adjusting for traditional coronary risk factors, the heritability of carotid IMT was slightly reduced but still of considerable magnitude (0.59; 95% CI, 0.39-0.73). Genetic factors have a substantial influence on the variation of carotid IMT. Most of this genetic effect occurs through pathways independent of traditional coronary risk factors.
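As a rough cross-check of how twin intraclass correlations relate to heritability, the sketch below applies Falconer's classical formulas to the MZ and DZ correlations reported above. This is an assumption made for illustration only: the study itself used genetic modeling, which is why its unadjusted estimate (0.69) differs from the simple Falconer value. The function name `falconer_ace` is hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Falconer's approximation of variance components from twin correlations.

    a2: additive genetic share      = 2 * (r_mz - r_dz)
    c2: shared-environment share    = 2 * r_dz - r_mz
    e2: unique-environment share    = 1 - r_mz
    The three shares sum to 1 by construction.
    """
    a2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return a2, c2, e2


# Intraclass correlations reported in the abstract (MZ 0.66, DZ 0.37).
a2, c2, e2 = falconer_ace(0.66, 0.37)
print(f"a2={a2:.2f}, c2={c2:.2f}, e2={e2:.2f}")
```

With these inputs the Falconer heritability is about 0.58, which is lower than the model-based unadjusted estimate of 0.69 reported in the abstract; the two approaches rest on different assumptions and need not coincide.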
Bonin, Christiani Decker Batista; dos Santos, Rafaella Zulianello; Ghisi, Gabriela Lima de Melo; Vieira, Ariany Marques; Amboni, Ricardo; Benetti, Magnus
2014-01-01
Background The lack of tools to measure heart failure patients' knowledge about their syndrome when participating in rehabilitation programs demonstrates the need for specific recommendations regarding the amount or content of information required. Objectives To develop and validate a questionnaire to assess heart failure patients' knowledge about their syndrome when participating in cardiac rehabilitation programs. Methods The tool was developed based on the Coronary Artery Disease Education Questionnaire and applied to 96 patients with heart failure, with a mean age of 60.22 ± 11.6 years, 64% being men. Reproducibility was obtained via the intraclass correlation coefficient, using the test-retest method. Internal consistency was assessed by use of Cronbach's alpha, and construct validity, by use of exploratory factor analysis. Results The final version of the tool had 19 questions arranged in ten areas of importance for patient education. The proposed questionnaire had a clarity index of 8.94 ± 0.83. The intraclass correlation coefficient was 0.856, and Cronbach's alpha, 0.749. Factor analysis revealed five factors associated with the knowledge areas. Comparing the final scores with the characteristics of the population evidenced that low educational level and low income are significantly associated with low levels of knowledge. Conclusion The instrument has satisfactory clarity and validity indices, and can be used to assess the heart failure patients' knowledge about their syndrome when participating in cardiac rehabilitation programs. PMID:24652054
Towards an Operational Definition of Clinical Competency in Pharmacy
2015-01-01
Objective. To estimate the inter-rater reliability and accuracy of ratings of competence in student pharmacist/patient clinical interactions as depicted in videotaped simulations and to compare expert panelist and typical preceptor ratings of those interactions. Methods. This study used a multifactorial experimental design to estimate inter-rater reliability and accuracy of preceptors’ assessment of student performance in clinical simulations. The study protocol used nine 5-10 minute video vignettes portraying different levels of competency in student performance in simulated clinical interactions. Intra-Class Correlation (ICC) was used to calculate inter-rater reliability and Fisher exact test was used to compare differences in distribution of scores between expert and nonexpert assessments. Results. Preceptors (n=42) across 5 states assessed the simulated performances. Intra-Class Correlation estimates were higher for 3 nonrandomized video simulations compared to the 6 randomized simulations. Preceptors more readily identified high and low student performances compared to satisfactory performances. In nearly two-thirds of the rating opportunities, a higher proportion of expert panelists than preceptors rated the student performance correctly (18 of 27 scenarios). Conclusion. Valid and reliable assessments are critically important because they affect student grades and formative student feedback. Study results indicate the need for pharmacy preceptor training in performance assessment. The process demonstrated in this study can be used to establish minimum preceptor benchmarks for future national training programs. PMID:26089563
Santos-Martínez, Luis Efren; Guevara-Carrasco, Marlene; Naranjo-Ricoy, Guillermo; Baranda-Tovar, Francisco Martín; Moreno-Ruíz, Luis Antonio; Herrera-Velázquez, Marco Antonio; Magaña-Serrano, José Antonio; Valencia-Sánchez, Jesús Salvador; Calderón-Abbo, Moisés Cutiel
2014-01-01
The concordance between the parameters of arterial and central venous blood gases has not been defined yet. We studied the concordance between both parameters in post-surgical myocardial revascularization patients in stable condition. Consecutive subjects were studied in a cross-sectional design. The central venous catheter was positioned, and arterial and central venous blood samples were obtained simultaneously prior to discharge from the intensive care unit. Data are expressed according to the Bland-Altman statistical method and the intraclass correlation coefficient. Statistical significance was accepted at P<.05. Two hundred and six samples from 103 post-surgical patients were studied; pH and lactate had a mean difference (limits of agreement) of 0.029±0.048 (-0.018, 0.077) and -0.12±0.22 (-0.57, 0.33), respectively. The magnitude of the intraclass correlation coefficient was 0.904 and 0.943, respectively. The values related to oxygen pressure were 27.86±6.08 (15.9, 39.8) and oxygen saturation 33.02±6.13 (21, 45), with magnitudes of 0.258 and 0.418, respectively. The best matching parameters between arterial and central venous blood samples were pH and lactate. Copyright © 2013 Instituto Nacional de Cardiología Ignacio Chávez. Published by Masson Doyma México S.A. All rights reserved.
Nakajima, Erica C; Frankland, Michael P; Johnson, Tucker F; Antic, Sanja L; Chen, Heidi; Chen, Sheau-Chiann; Karwoski, Ronald A; Walker, Ronald; Landman, Bennett A; Clay, Ryan D; Bartholmai, Brian J; Rajagopalan, Srinivasan; Peikert, Tobias; Massion, Pierre P; Maldonado, Fabien
2018-01-01
Lung adenocarcinoma (ADC), the most common lung cancer type, is recognized increasingly as a disease spectrum. To guide individualized patient care, a non-invasive means of distinguishing indolent from aggressive ADC subtypes is needed urgently. Computer-Aided Nodule Assessment and Risk Yield (CANARY) is a novel computed tomography (CT) tool that characterizes early ADCs by detecting nine distinct CT voxel classes, representing a spectrum of lepidic to invasive growth, within an ADC. CANARY characterization has been shown to correlate with ADC histology and patient outcomes. This study evaluated the inter-observer variability of CANARY analysis. Three novice observers segmented and analyzed independently 95 biopsy-confirmed lung ADCs from Vanderbilt University Medical Center/Nashville Veterans Administration Tennessee Valley Healthcare system (VUMC/TVHS) and the Mayo Clinic (Mayo). Inter-observer variability was measured using intra-class correlation coefficient (ICC). The average ICC for all CANARY classes was 0.828 (95% CI 0.76, 0.895) for the VUMC/TVHS cohort, and 0.852 (95% CI 0.804, 0.901) for the Mayo cohort. The most invasive voxel classes had the highest ICC values. To determine whether nodule size influenced inter-observer variability, an additional cohort of 49 sub-centimeter nodules from Mayo were also segmented by three observers, with similar ICC results. Our study demonstrates that CANARY ADC classification between novice CANARY users has an acceptably low degree of variability, and supports the further development of CANARY for clinical application.
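The abstract does not say which form of the ICC was used. As a hedged illustration of how an inter-observer ICC can be obtained from a subjects-by-observers table, the sketch below assumes a one-way random effects model and computes both the single-rating and the average-rating coefficients from the ANOVA mean squares. The function name `one_way_icc` and the toy ratings are invented for this example.

```python
from statistics import mean


def one_way_icc(ratings):
    """One-way random effects ICC from a list of rows (one row per subject).

    Every row must hold the same number k of ratings.
    Returns (single-rating ICC, average-of-k-ratings ICC).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((v - m) ** 2 for row, m in zip(ratings, row_means) for v in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_average = (ms_between - ms_within) / ms_between
    return icc_single, icc_average


# Hypothetical data: 3 nodules, each scored by 2 observers.
toy = [[1, 2], [3, 4], [5, 6]]
icc_single, icc_average = one_way_icc(toy)
print(f"single: {icc_single:.4f}, average: {icc_average:.4f}")
```

The two coefficients are linked by the Spearman-Brown relation, so the average-rating value is always at least as large as the single-rating value when the latter is positive. Two-way models, which separate observer effects from residual error, would need a different decomposition.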
Babor, Thomas F; Xuan, Ziming; Proctor, Dwayne
2008-03-01
The purposes of this study were to develop reliable procedures to monitor the content of alcohol advertisements broadcast on television and in other media, and to detect violations of the content guidelines of the alcohol industry's self-regulation codes. A set of rating-scale items was developed to measure the content guidelines of the 1997 version of the U.S. Beer Institute Code. Six focus groups were conducted with 60 college students to evaluate the face validity of the items and the feasibility of the procedure. A test-retest reliability study was then conducted with 74 participants, who rated five alcohol advertisements on two occasions separated by 1 week. Average correlations across all advertisements using three reliability statistics (r, rho, and kappa) were almost all statistically significant and the kappas were good for most items, which indicated high test-retest agreement. We also found high interrater reliabilities (intraclass correlations) among raters for item-level and guideline-level violations, indicating that regardless of the specific item, raters were consistent in their general evaluations of the advertisements. Naïve (untrained) raters can provide consistent (reliable) ratings of the main content guidelines proposed in the U.S. Beer Institute Code. The rating procedure may have future applications for monitoring compliance with industry self-regulation codes and for conducting research on the ways in which alcohol advertisements are perceived by young adults and other vulnerable populations.
Callan, Richard S; Cooper, Jeril R; Young, Nancy B; Mollica, Anthony G; Furness, Alan R; Looney, Stephen W
2015-06-01
The problems associated with intra- and interexaminer reliability when assessing preclinical performance continue to hinder dental educators' ability to provide accurate and meaningful feedback to students. Many studies have been conducted to evaluate the validity of utilizing various technologies to assist educators in achieving that goal. The purpose of this study was to compare two different versions of E4D Compare software to determine if either could be expected to deliver consistent and reliable comparative results, independent of the individual utilizing the technology. Five faculty members obtained E4D digital images of students' attempts (sample model) at ideal gold crown preparations for tooth #30 performed on typodont teeth. These images were compared to an ideal (master model) preparation utilizing two versions of E4D Compare software. The percent correlations between and within these faculty members were recorded and averaged. The intraclass correlation coefficient was used to measure both inter- and intrarater agreement among the examiners. The study found that using the older version of E4D Compare did not result in acceptable intra- or interrater agreement among the examiners. However, the newer version of E4D Compare, when combined with the Nevo scanner, resulted in a remarkable degree of agreement both between and within the examiners. These results suggest that consistent and reliable results can be expected when utilizing this technology under the protocol described in this study.
Oo, W M; Linklater, J M; Daniel, M; Saarakkala, S; Samuels, J; Conaghan, P G; Keen, H I; Deveza, L A; Hunter, D J
2018-05-01
The aims of this study were to systematically review clinimetrics of commonly assessed ultrasound pathologies in knee, hip and hand osteoarthritis (OA), and to conduct a meta-analysis for each clinimetric. Medline, Embase, and Cochrane Library databases were searched from their inceptions to September 2016. According to the Outcome Measures in Rheumatology (OMERACT) Instrument Selection Algorithm, data extraction focused on ultrasound technical features and performance metrics. Methodological quality was assessed with modified 19-item Downs and Black score and 11-item Quality Appraisal of Diagnostic Reliability (QAREL) score. Separate meta-analyses were performed for clinimetrics: (1) inter-rater/intra-rater reliability; (2) construct validity; (3) criterion validity; and (4) internal/external responsiveness. Statistical Package for the Social Sciences (SPSS), Excel and Comprehensive Meta-analysis were used. Our search identified 1126 records; of these, 100 were eligible, including a total of 8542 patients and 32,373 joints. The average Downs and Black score was 13.01, and average QAREL was 5.93. The stratified meta-analysis was performed only for knee OA, which demonstrated moderate to substantial reliability [minimum kappa > 0.44(0.15,0.74), minimum intraclass correlation coefficient (ICC) > 0.82(0.73-0.89)], weak construct validity against pain (r = 0.12 to 0.27), function (r = 0.15 to 0.23), and blood biomarkers (r = 0.01 to 0.21), but weak to strong correlation with plain radiography (r = 0.13 to 0.60), strong association with Magnetic Resonance Imaging (MRI) [minimum r = 0.60(0.52,0.67)] and strong discrimination against symptomatic patients (OR = 3.08 to 7.46). There was strong criterion validity against cartilage histology [r = 0.66(-0.05,0.93)], and small to moderate internal [standardized mean difference(SMD) = 0.20 to 0.58] and external (r = 0.35 to 0.43) responsiveness to interventions.
Ultrasound demonstrated strong criterion validity with cartilage histology, poor to strong correlation with patient findings and MRI, moderate reliability, and low responsiveness to interventions. CRD42016039954. Copyright © 2018 Osteoarthritis Research Society International. All rights reserved.
The Trojan Lifetime Champions Health Survey: Development, Validity, and Reliability
Sorenson, Shawn C.; Romano, Russell; Scholefield, Robin M.; Schroeder, E. Todd; Azen, Stanley P.; Salem, George J.
2015-01-01
Context Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. Objective To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Design Descriptive laboratory study. Setting A large National Collegiate Athletic Association Division I university. Patients or Other Participants A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Intervention(s) Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Main Outcome Measure(s) Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Results Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. 
We found strong evidence of content validity, convergent construct validity with the Short-Form 12 Version 2 HRQL instrument, and feasibility of administration in an elite, competitive athletic population. Conclusions These data suggest that the TLC Health Survey is a valid and reliable instrument for assessing lifetime and recent health, exercise, and HRQL, among elite competitive athletes. Generalizability of the instrument may be enhanced by additional, larger-scale studies in diverse populations. PMID:25611315
Geiger, Daniel; Bae, Won C.; Statum, Sheronda; Du, Jiang; Chung, Christine B.
2014-01-01
Objective Temporomandibular dysfunction involves osteoarthritis of the TMJ, including degeneration and morphologic changes of the mandibular condyle. The purpose of this study was to determine the accuracy of novel 3D-UTE MRI versus micro-CT (μCT) for quantitative evaluation of mandibular condyle morphology. Material & Methods Nine TMJ condyle specimens were harvested from cadavers (2M, 3F; Age 85 ± 10 yrs., mean±SD). 3D-UTE MRI (TR=50ms, TE=0.05 ms, 104 μm isotropic-voxel) was performed using a 3-T MR scanner and μCT (18 μm isotropic-voxel) was performed. MR datasets were spatially-registered with μCT dataset. Two observers segmented bony contours of the condyles. Fibrocartilage was segmented on MR dataset. Using a custom program, bone and fibrocartilage surface coordinates, Gaussian curvature, volume of segmented regions and fibrocartilage thickness were determined for quantitative evaluation of joint morphology. Agreement between techniques (MRI vs. μCT) and observers (MRI vs. MRI) for Gaussian curvature, mean curvature and segmented volume of the bone was determined using intraclass correlation coefficient (ICC) analyses. Results Between MRI and μCT, the average deviation of surface coordinates was 0.19±0.15 mm, slightly higher than the spatial resolution of MRI. Average deviation of the Gaussian curvature and volume of segmented regions, from MRI to μCT, was 5.7±6.5% and 6.6±6.2%, respectively. ICC values (MRI vs. μCT) for Gaussian curvature, mean curvature and segmented volumes were 0.892, 0.893 and 0.972, respectively. Between observers (MRI vs. MRI), the ICC values were 0.998, 0.999 and 0.997, respectively. Fibrocartilage thickness was 0.55±0.11 mm, as previously described in the literature for grossly normal TMJ samples. Conclusion 3D-UTE MR quantitative evaluation of TMJ condyle morphology ex-vivo, including surface, curvature and segmented volume, shows high correlation against μCT and between observers.
In addition, UTE MRI allows quantitative evaluation of the fibrocartilaginous condylar component. PMID:24092237
Moon, Ki Won; Lee, Shin-Seok; Kim, Jin Hyun; Song, Ran; Lee, Eun Young; Song, Yeong Wook; Bellamy, Nicholas; Lee, Eun Bong
2012-11-01
The Australian/Canadian Osteoarthritis Hand Index (AUSCAN) is a patient self-reported 15-item questionnaire measuring the severity of hand osteoarthritis symptoms with respect to pain, stiffness, and function. In this study, we developed a Korean version of the AUSCAN Index (K-AUSCAN) and confirmed its reliability, validity, and responsiveness. The AUSCAN Index was translated into Korean by 3 translators and translated back into English by 3 different translators. In a group of 53 patients with clinical hand osteoarthritis (mean age 58.3 ± 7.6 years), validity was evaluated against other outcome measures, including the Functional Index for Hand Osteoarthritis (FIHOA) and Multidimensional Health Assessment Questionnaire (MDHAQ). Test-retest reliability was assessed at a 2-week interval in 51 patients. Internal consistency of K-AUSCAN was evaluated by Cronbach's α. Responsiveness was measured by the standardized response mean (SRM). The test-retest reliability of K-AUSCAN yielded intraclass correlation coefficients of 0.46 for pain, 0.58 for stiffness, and 0.67 for function. The internal consistency of K-AUSCAN was satisfactory, with Cronbach's α of 0.89 for pain and 0.93 for function. The K-AUSCAN index showed good correlation with other measures (r² was 0.67 for K-AUSCAN pain and MDHAQ pain; r² was 0.72 for K-AUSCAN function and FIHOA). The pain and function subscales of K-AUSCAN correlated substantially with each other and moderately with the stiffness subscale. The average SRM for K-AUSCAN pain, stiffness, and function was -0.92, -0.48, and -0.84, respectively. The Korean version of the AUSCAN Index is a valid, reliable, and responsive tool for the assessment of hand osteoarthritis symptoms.
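The standardized response mean used here as the responsiveness measure is conventionally the mean change score divided by the standard deviation of the change scores. The sketch below is a minimal illustration under that assumption; the function name and the change scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean(change) / SD(change), with change = follow_up - baseline."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical pain scores for four patients before and after treatment.
before = [10, 8, 12, 9]
after = [8, 7, 9, 7]
srm = standardized_response_mean(before, after)
print(f"SRM = {srm:.2f}")
```

A negative SRM, as in the abstract, simply reflects scores that fall after the intervention; its magnitude is what indicates responsiveness.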
Chiari, Aline; de Souza Sardim, Carla Caires; Natour, Jamil
2011-01-01
OBJECTIVE: To translate, to perform a cultural adaptation of and to test the reproducibility of the Cochin Hand Functional Scale questionnaire for Brazil. METHODS: First, the Cochin Hand Functional Scale questionnaire was translated into Portuguese and was then back-translated into French. These translations were reviewed by a committee to establish a Brazilian version of the questionnaire to be tested. The validity and reproducibility of the Cochin Hand Functional Scale questionnaire were evaluated. Patients of both sexes, who were aged 18 to 60 years and presented with rheumatoid arthritis affecting their hands, were interviewed. The patients were initially interviewed by two observers and were later interviewed by a single rater. First, the Visual Analogue Scale for hand pain, the Disabilities of the Arm, Shoulder and Hand questionnaire and the Health Assessment Questionnaire were administered. The third administration of the Cochin Hand Functional Scale was performed fifteen days after the first administration. Ninety patients were assessed in the present study. RESULTS: Two questions were modified as a result of the assessment of cultural equivalence. The Cronbach's alpha value for this assessment was 0.93. The intraclass intraobserver and interobserver correlation coefficients were 0.76 and 0.96, respectively. The Spearman's coefficient indicated that there was a low level of correlation between the Cochin Hand Functional Scale and the Visual Analogue Scale for pain (0.46) and that there was a moderate level of correlation of the Cochin Scale with the Health Assessment Questionnaire (0.66) and with the Disabilities of the Arm, Shoulder and Hand questionnaire (0.63). The average administration time for the Cochin Scale was three minutes. CONCLUSION: The Brazilian version of the Cochin Hand Functional Scale was successfully translated and adapted, and this version exhibited good internal consistency, reliability and construct validity. PMID:21789372
Csizmadi, Ilona; Neilson, Heather K; Kopciuk, Karen A; Khandwala, Farah; Liu, Andrew; Friedenreich, Christine M; Yasui, Yutaka; Rabasa-Lhoret, Rémi; Bryant, Heather E; Lau, David C W; Robson, Paula J
2014-08-15
We determined measurement properties of the Sedentary Time and Activity Reporting Questionnaire (STAR-Q), which was designed to estimate past-month activity energy expenditure (AEE). STAR-Q validity and reliability were assessed in 102 adults in Alberta, Canada (2009-2011), who completed 14-day doubly labeled water (DLW) protocols, 7-day activity diaries on day 15, and the STAR-Q on day 14 and again at 3 and 6 months. Three-month reliability was substantial for total energy expenditure (TEE) and AEE (intraclass correlation coefficients of 0.84 and 0.73, respectively), while 6-month reliability was moderate. STAR-Q-derived TEE and AEE were moderately correlated with DLW estimates (Spearman's ρs of 0.53 and 0.40, respectively; P < 0.001), and on average, the STAR-Q overestimated TEE and AEE (median differences were 367 kcal/day and 293 kcal/day, respectively). Body mass index-, age-, sex-, and season-adjusted concordance correlation coefficients (CCCs) were 0.24 (95% confidence interval (CI): 0.07, 0.36) and 0.21 (95% CI: 0.11, 0.32) for STAR-Q-derived versus DLW-derived TEE and AEE, respectively. Agreement between the diaries and STAR-Q (metabolic equivalent-hours/day) was strongest for occupational sedentary time (adjusted CCC = 0.76, 95% CI: 0.64, 0.85) and overall strenuous activity (adjusted CCC = 0.64, 95% CI: 0.49, 0.76). The STAR-Q demonstrated substantial validity for estimating occupational sedentary time and strenuous activity and fair validity for ranking individuals by AEE. © The Author 2014. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
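The contrast in this abstract between moderate rank correlations and low concordance correlation coefficients is typical when one method systematically over- or underestimates the other. The sketch below illustrates this with Lin's concordance correlation coefficient in its plain, unadjusted form; the study's CCCs were covariate-adjusted, so this is an assumption-laden simplification. The function name `lins_ccc` and the numbers are invented.

```python
from statistics import mean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient (unadjusted).

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2),
    using population (divide-by-n) variances and covariance.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)


# Hypothetical reference values and a method that overestimates by a constant.
reference = [1, 2, 3, 4, 5]
shifted = [3, 4, 5, 6, 7]
ccc_identical = lins_ccc(reference, reference)
ccc_shifted = lins_ccc(reference, shifted)
print(ccc_identical, ccc_shifted)
```

In this toy case the shifted series is perfectly correlated with the reference, yet its CCC is halved because the squared mean difference enters the denominator. That mechanism is one plausible reading of why a questionnaire that overestimates energy expenditure can rank individuals reasonably while agreeing poorly in absolute terms.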
Srivastava, A; Koul, V; Dwivedi, S N; Upadhyaya, A D; Ahuja, A; Saxena, R
2015-08-01
The aim of this study was to evaluate the performance of the newly developed handheld hemoglobinmeter (TrueHb) by comparing it against an automated five-part hematology analyzer, the Sysmex counter XT 1800i (Sysmex). Two hundred venous blood samples underwent total hemoglobin evaluation on each device three times. The average of the three readings on each device was taken as the respective device value, that is, the TrueHb value and the Sysmex value. The two sets of values were comparatively analyzed. The repeatability of the performance of TrueHb was also evaluated against Sysmex values. The scatter plot of TrueHb values and Sysmex values showed a linear distribution with positive correlation (r = 0.99). The intraclass correlation (ICC) value between the two sets of values was found to be 0.995. The regression coefficient through the origin, β, was found to be 0.995, with a 95% confidence interval (CI) of 0.9900 to 1.0000. The mean difference in Bland-Altman plots of TrueHb values against the Sysmex values was found to be -0.02, with limits of agreement between -0.777 and 0.732 g/dL. Statistical analysis suggested good repeatability in the results of TrueHb, with a low mean CV of 2.22 (95% confidence interval 1.99-2.44), against 4.44 (95% confidence interval 3.85-5.03) for Sysmex values. These results suggested a strong positive correlation between the two measurement devices. It is thus concluded that TrueHb is a good point-of-care testing tool for estimating hemoglobin. © 2014 John Wiley & Sons Ltd.
Ozturk, Erhan Arif; Kocer, Bilge Gonenli; Umay, Ebru; Cakci, Aytul
2018-06-07
The objectives of the present study were to translate and cross-culturally adapt the English version of the Parkinson Fatigue Scale into Turkish, to evaluate its psychometric properties, and to compare them with those of other language versions. A total of 144 patients with idiopathic Parkinson disease were included in the study. The Turkish version of the Parkinson Fatigue Scale was evaluated for data quality, scaling assumptions, acceptability, reliability, and validity. The questionnaire response rate was 100% for both test and retest. The percentage of missing data was zero for all items, and the percentage of computable scores was 100%. Floor and ceiling effects were absent. The Parkinson Fatigue Scale showed acceptable internal consistency (Cronbach's alpha was 0.974 for the first test and 0.964 for the retest, and corrected item-to-total correlations ranged from 0.715 to 0.906) and test-retest reliability (Cohen's kappa coefficients ranged from 0.632 to 0.786 for individual items, and the intraclass correlation coefficient was 0.887 for the overall Parkinson Fatigue Scale Score). An exploratory factor analysis of the items revealed a single factor explaining 71.7% of the variance. The goodness-of-fit statistics for the one-factor confirmatory factor analysis were Tucker Lewis index = 0.961, comparative fit index = 0.971 and root mean square error of approximation = 0.077. The average Parkinson Fatigue Scale Score was correlated significantly with sociodemographic data, clinical characteristics and scores of rating scales. The Turkish version of the Parkinson Fatigue Scale seems to be culturally well adapted and to have good psychometric properties. The scale can be used in further studies to assess fatigue in patients with Parkinson's disease.
Transcultural validation of the Oxford Shoulder Score for the French-speaking population.
Tuton, D; Barbe, C; Salmon, J-H; Dramé, M; Nérot, C; Ohl, X
2016-09-01
Patient-reported outcome measures (PROMs) have been gaining in popularity over the last decade. The Oxford Shoulder Score (OSS) is a well-established self-administered questionnaire for shoulder evaluation adapted for the English-speaking population. The aim of the present study was to develop a translation and a transcultural adaptation of the OSS and to assess its validity in native French-speaking patients with shoulder pain. The translation process was carried out following a translation/back-translation methodology by two translators. All patients completed the French OSS, the Subjective Shoulder Value (SSV), and the Constant score. Internal consistency was tested using Cronbach's α coefficient. Validity was assessed by calculating the Pearson correlation coefficient between the OSS and the Constant score and the SSV. One hundred forty-four patients suffering from degenerative or inflammatory diseases of the shoulder were included in this study. The average time required to complete the French OSS was 2 min and 45 s. Seventy patients were asked to complete the questionnaire twice (test/retest reliability). Internal consistency was high, with Cronbach's α coefficient = 0.93. The intraclass correlation coefficient was 0.91 (95% CI: 0.88-0.94) for test/retest reliability. The French OSS score was significantly correlated with the Constant-Murley score (r = 0.73 and P < 0.0001) and with the SSV (r = 0.68 and P < 0.0001). The present study shows that the French version of the OSS is reliable, valid, and reproducible. The sensitivity to change now needs to be evaluated. This score was adapted to the French-speaking population for the self-assessment of patients with degenerative or inflammatory disorders of the shoulder. Level 1, Test of previously developed criteria, diagnostic test study. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
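Internal consistency in the abstract above is summarized with Cronbach's α. A minimal sketch of the standard formula, α = k/(k-1) × (1 - Σ item variances / variance of total scores), is given here; the function name and the toy responses are assumptions for illustration only.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one list per respondent, one score per questionnaire item."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical two-item questionnaire answered by three respondents.
toy = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy)
```

When every item ranks respondents identically, the total-score variance is as large as it can be relative to the item variances and α reaches 1; disagreement between items pulls it down.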
Mikkelsen, Kim Lyngby; Thommesen, Jacob; Andersen, Henning Boje
2013-01-01
Objectives: Validation of a Danish patient safety incident classification adapted from the World Health Organization's International Classification for Patient Safety (ICPS-WHO). Design: Thirty-three hospital safety management experts classified 58 safety incident cases selected to represent all types and subtypes of the Danish adaptation of the ICPS (ICPS-DK). Outcome Measures: Two measures of inter-rater agreement: kappa and intra-class correlation (ICC). Results: The average number of incident types used per case per rater was 2.5. The mean ICC was 0.521 (range: 0.199–0.809) and the mean kappa was 0.513 (range: 0.193–0.804). Kappa and ICC showed high correlation (r = 0.99). An inverse correlation was found between the prevalence of type and inter-rater reliability. Results are discussed according to four factors known to determine the inter-rater agreement: skill and motivation of raters; clarity of case descriptions; clarity of the operational definitions of the types and the instructions guiding the coding process; adequacy of the underlying classification scheme. Conclusions: The incident types of the ICPS-DK are adequate, exhaustive and well suited for classifying and structuring incident reports. With a mean kappa a little above 0.5, the inter-rater agreement of the classification system is considered ‘fair’ to ‘good’. The wide variation in the inter-rater reliability and low reliability and poor discrimination among the highly prevalent incident types suggest that for these types, precisely defined incident sub-types may be preferred. This evaluation of the reliability and usability of WHO's ICPS should be useful for healthcare administrations that consider or are in the process of adapting the ICPS. PMID:23287641
2014-01-01
Purpose. Optical coherence tomography (OCT) has been used to investigate papilledema in single-site, mostly retrospective studies. We investigated whether spectral-domain OCT (SD-OCT), which provides thickness and volume measurements of the optic nerve head and retina, could reliably demonstrate structural changes due to papilledema in a prospective multisite clinical trial setting. Methods. At entry, 126 subjects in the Idiopathic Intracranial Hypertension Treatment Trial (IIHTT) with mild visual field loss had optic disc and macular scans, using the Cirrus SD-OCT. Images were analyzed by using the proprietary commercial and custom 3D-segmentation algorithms to calculate retinal nerve fiber layer (RNFL), total retinal thickness (TRT), optic nerve head volume (ONHV), and retinal ganglion cell layer (GCL) thickness. We evaluated variability, with interocular comparison and correlation between results for both methods. Results. The average RNFL thickness was > 95% of normal controls in 90% of eyes, and the RNFL, TRT, ONH height, and ONHV showed strong (r > 0.8) correlations for interocular comparisons. Variability for repeated testing of OCT parameters was low for both methods, and intraclass correlations were > 0.9 except for the proprietary GCL thickness. The proprietary algorithm–derived RNFL, TRT, and GCL thickness measurements had failure rates of 10%, 16%, and 20% for all eyes respectively, which were uncommon with 3D-segmentation–derived measurements. Only 7% of eyes had GCL thinning that was less than fifth percentile of normal age-matched control eyes by both methods. Conclusions. Spectral-domain OCT provides reliable continuous variables and quantified assessment of structural alterations due to papilledema. (ClinicalTrials.gov number, NCT01003639.) PMID:25370510
Fu, Lanxing; Aspinall, Peter; Bennett, Gary; Magidson, Jay; Tatham, Andrew J
2017-04-01
To quantify the influence of spectral domain optical coherence tomography (SDOCT) on decision-making in patients with suspected glaucoma. A prospective cross-sectional study involving 40 eyes of 20 patients referred by community optometrists due to suspected glaucoma. All patients had disc photographs and standard automated perimetry (SAP), and results were presented to 13 ophthalmologists who estimated pre-test probability of glaucoma (0-100%) for a total of 520 observations. Ophthalmologists were then permitted to modify probabilities of disease based on SDOCT retinal nerve fiber layer (RNFL) measurements (post-test probability). The effect of information from SDOCT on decision to treat, monitor, or discharge was assessed. Agreement among graders was assessed using intraclass correlation coefficients (ICC) and correlated component regression (CCR) was used to identify variables influencing management decisions. Patients had an average age of 69.0 ± 10.1 years, SAP mean deviation of 2.71 ± 3.13 dB, and RNFL thickness of 86.2 ± 16.7 μm. Average pre-test probability of glaucoma was 37.0 ± 33.6% with SDOCT resulting in a 13.3 ± 18.1% change in estimated probability. Incorporating information from SDOCT improved agreement regarding probability of glaucoma (ICC = 0.50 (95% CI 0.38 to 0.64) without SDOCT versus 0.64 (95% CI 0.52 to 0.76) with SDOCT). SDOCT led to a change from decision to "treat or monitor" to "discharge" in 22 of 520 cases and a change from "discharge" to "treat or monitor" in 11 of 520 cases. Pre-test probability and RNFL thickness were predictors of post-test probability of glaucoma, contributing 69 and 31% of the variance in post-test probability, respectively. Information from SDOCT altered estimated probability of glaucoma and improved agreement among clinicians in those suspected of having the disease.
Handgrip force steadiness in young and older adults: a reproducibility study.
Blomkvist, Andreas W; Eika, Fredrik; de Bruin, Eling D; Andersen, Stig; Jorgensen, Martin
2018-04-02
Force steadiness is a quantitative measure of the ability to control muscle tonus. It is an independent predictor of functional performance and has shown to correlate well with different degrees of motor impairment following stroke. Despite being clinically relevant, few studies have assessed the validity of measuring force steadiness. The aim of this study was to explore the reproducibility of handgrip force steadiness, and to assess age difference in steadiness. Intrarater reproducibility (the degree to which a rating gives consistent result on separate occasions) was investigated in a test-retest design with seven days between sessions. Ten young and thirty older adults were recruited and handgrip steadiness was tested at 5%, 10% and 25% of maximum voluntary contraction (MVC) using Nintendo Wii Balance Board (WBB). Coefficients of variation were calculated from the mean force produced (CVM) and the target force (CVT). Area between the force curve and the target force line (Area) was also calculated. For the older adults we explored reliability using intraclass correlation coefficient (ICC) and agreement using standard error of measurement (SEM), limits of agreement (LOA) and smallest real difference (SRD). A systematic improvement in handgrip steadiness was found between sessions for all measures (CVM, CVT, Area). CVM and CVT at 5% of MVC showed good to high reliability, while Area had poor reliability for all percentages of MVC. Averaged ICC for CVM, CVT and Area was 0.815, 0.806 and 0.464, respectively. Averaged ICC on 5%, 10%, and 25% of MVC was 0.751, 0.667 and 0.668, respectively. Measures of agreement showed similar trends with better results for CVM and CVT than for Area. Young adults had better handgrip steadiness than older adults across all measures. The CVM and CVT measures demonstrated good reproducibility at lower percentages of MVC using the WBB, and could become relevant measures in the clinical setting. The Area measure had poor reproducibility. 
Young adults have better handgrip steadiness than old adults.
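The agreement measures named in this abstract (SEM and SRD) are commonly derived from the ICC and the between-subject standard deviation. The sketch below assumes the widely used forms SEM = SD × √(1 - ICC) and SRD = 1.96 × √2 × SEM; the abstract does not state which formulas the authors applied, and the input values are hypothetical.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM from a between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

def smallest_real_difference(sem):
    """Smallest change exceeding measurement noise at the 95% level."""
    return 1.96 * math.sqrt(2.0) * sem

# Hypothetical values, not taken from the study.
sem = standard_error_of_measurement(sd=10.0, icc=0.75)
srd = smallest_real_difference(sem)
```

Under these formulas a lower ICC inflates both quantities, which is consistent with the abstract's observation that the Area measure, with the lowest averaged ICC, also showed the weakest agreement.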
Reliability and validity of the Youth Leisure-time Sedentary Behavior Questionnaire (YLSBQ).
Cabanas-Sánchez, Verónica; Martínez-Gómez, David; Esteban-Cornejo, Irene; Castro-Piñero, José; Conde-Caveda, Julio; Veiga, Óscar L
2018-01-01
To develop a questionnaire able to assess time spent by youth in a wide range of leisure-time sedentary behaviors (SB) and evaluate its test-retest reliability and criterion validity. Cross-sectional observational. The reliability sample included 194 youth, aged 10-18 years, who completed the questionnaire twice, separated by a one-week interval. The validity study comprised 1207 participants aged 8-18 years. Participants wore an accelerometer for 7 consecutive days. The questionnaire was designed to assess the amount of time spent in twelve different SB during weekdays and weekends, separately. To avoid the usual phenomenon of time over-reporting, values were adjusted to the real available leisure time (LT) of each participant. Reliability was assessed by using Intraclass Correlation Coefficients (ICC) and weighted (quadratic) kappa (k), and validity was assessed by using Pearson correlation and Bland-Altman plots. The questionnaire showed moderate-to-substantial agreement for most (91%) items (k=0.43-0.74; ICC=0.41-0.79), with three items (4%) reaching almost perfect agreement (ICC=0.82-0.83). Only 'sitting and talking' showed fair-to-moderate reliability (k=0.27-0.39; ICC=0.34-0.46). The relationship between average sedentary time assessed by the questionnaire and by accelerometry was moderate (r=0.36; p<0.001). Systematic biases were not found between questionnaire and accelerometer sedentary time for the average day (r=0.05; p=0.11), but Bland-Altman plots suggest moderate discrepancies between the two methods of SB measurement (mean=19.86; limits of agreement=-280.04 to 319.76). The questionnaire showed moderate to good test-retest reliability and a moderate level of validity for assessing SB in youth, similar to or slightly better than those previously published for this population. Copyright © 2017 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Clarke, Diana E; Narrow, William E; Regier, Darrel A; Kuramoto, S Janet; Kupfer, David J; Kuhl, Emily A; Greiner, Lisa; Kraemer, Helena C
2013-01-01
This article discusses the design, sampling strategy, implementation, and data analytic processes of the DSM-5 Field Trials. The DSM-5 Field Trials were conducted by using a test-retest reliability design with a stratified sampling approach across six adult and four pediatric sites in the United States and one adult site in Canada. A stratified random sampling approach was used to enhance precision in the estimation of the reliability coefficients. A web-based research electronic data capture system was used for simultaneous data collection from patients and clinicians across sites and for centralized data management. Weighted descriptive analyses, intraclass kappa and intraclass correlation coefficients for stratified samples, and receiver operating curves were computed. The DSM-5 Field Trials capitalized on advances since DSM-III and DSM-IV in statistical measures of reliability (i.e., intraclass kappa for stratified samples) and other recently developed measures to determine confidence intervals around kappa estimates. Diagnostic interviews using DSM-5 criteria were conducted by 279 clinicians of varied disciplines who received training comparable to what would be available to any clinician after publication of DSM-5. Overall, 2,246 patients with various diagnoses and levels of comorbidity were enrolled, of which over 86% were seen for two diagnostic interviews. A range of reliability coefficients were observed for the categorical diagnoses and dimensional measures. Multisite field trials and training comparable to what would be available to any clinician after publication of DSM-5 provided “real-world” testing of DSM-5 proposed diagnoses.
Correlation and agreement: overview and clarification of competing concepts and measures.
Liu, Jinyuan; Tang, Wan; Chen, Guanqin; Lu, Yin; Feng, Changyong; Tu, Xin M
2016-04-25
Agreement and correlation are widely-used concepts that assess the association between variables. Although similar and related, they represent completely different notions of association. Assessing agreement between variables assumes that the variables measure the same construct, while correlation of variables can be assessed for variables that measure completely different constructs. This conceptual difference requires the use of different statistical methods, and when assessing agreement or correlation, the statistical method may vary depending on the distribution of the data and the interest of the investigator. For example, the Pearson correlation, a popular measure of correlation between continuous variables, is only informative when applied to variables that have linear relationships; it may be non-informative or even misleading when applied to variables that are not linearly related. Likewise, the intraclass correlation, a popular measure of agreement between continuous variables, may not provide sufficient information for investigators if the nature of poor agreement is of interest. This report reviews the concepts of agreement and correlation and discusses differences in the application of several commonly used measures.
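The distinction drawn above between correlation and agreement can be illustrated with a small sketch: two series related by a perfect linear rule have a Pearson correlation of 1 yet disagree on every value. The helper name `pearson_r` and the data are assumptions made for this illustration.

```python
import math
from statistics import fmean

def pearson_r(x, y):
    """Pearson product-moment correlation for paired values."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical instruments: the second reads 2*x + 5 for every true value x.
first = [1.0, 2.0, 3.0, 4.0]
second = [2 * v + 5 for v in first]

r = pearson_r(first, second)
mean_gap = fmean(b - a for a, b in zip(first, second))
```

Here `r` is 1 (up to rounding) while the average gap between the two readings is 7.5 units, so a correlation coefficient alone would badly overstate how interchangeable the instruments are.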
James, Evan W; LaPrade, Christopher M; Ellman, Michael B; Wijdicks, Coen A; Engebretsen, Lars; LaPrade, Robert F
2014-11-01
Anatomic root placement is necessary to restore native meniscal function during meniscal root repair. Radiographic guidelines for anatomic root placement are essential to improve the accuracy and consistency of anatomic root repair and to optimize outcomes after surgery. To define quantitative radiographic guidelines for identification of the anterior and posterior root attachments of the medial and lateral menisci on anteroposterior (AP) and lateral radiographic views. Descriptive laboratory study. The anterior and posterior roots of the medial and lateral menisci were identified in 12 human cadaveric specimens (average age, 51.3 years; age range, 39-65 years) and labeled using 2-mm radiopaque spheres. True AP and lateral radiographs were obtained, and 2 raters independently measured blinded radiographs in relation to pertinent landmarks and radiographic reference lines. On AP radiographs, the anteromedial and posteromedial roots were, on average, 31.9 ± 5.0 mm and 36.3 ± 3.5 mm lateral to the edge of the medial tibial plateau, respectively. The anterolateral and posterolateral roots were, on average, 37.9 ± 5.2 mm and 39.3 ± 3.8 mm medial to the edge of the lateral tibial plateau, respectively. On lateral radiographs, the anteromedial and anterolateral roots were, on average, 4.8 ± 3.7 mm and 20.5 ± 4.3 mm posterior to the anterior margin of the tibial plateau, respectively. The posteromedial and posterolateral roots were, on average, 18.0 ± 2.8 mm and 19.8 ± 3.5 mm anterior to the posterior margin of the tibial plateau, respectively. The intrarater and interrater intraclass correlation coefficients (ICCs) were >0.958, demonstrating excellent reliability. The meniscal root attachment sites were quantitatively and reproducibly defined with respect to anatomic landmarks and superimposed radiographic reference lines. The high ICCs indicate that the measured radiographic relationships are a consistent means for evaluating meniscal root positions. 
This study demonstrated consistent and reproducible radiographic guidelines for the location of the meniscal roots. These measurements may be used to assess root positions on intraoperative fluoroscopy and postoperative radiographs. © 2014 The Author(s).
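The abstract reports ICCs without naming the model behind them. As one possible illustration only, the sketch below computes the one-way random effects ICC for a single rating and for the mean of k ratings from the ANOVA mean squares. The choice of the one-way model, the function name, and the toy measurements are assumptions, not the authors' method.

```python
from statistics import fmean

def one_way_icc(ratings):
    """ratings: one row per subject, k ratings per row. Returns (single, average)."""
    n, k = len(ratings), len(ratings[0])
    grand = fmean([v for row in ratings for v in row])
    row_means = [fmean(row) for row in ratings]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2
              for row, m in zip(ratings, row_means)
              for v in row) / (n * (k - 1))
    single = (msb - msw) / (msb + (k - 1) * msw)
    average = (msb - msw) / msb
    return single, average

# Hypothetical distances in mm from two raters for three specimens.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
icc_single, icc_average = one_way_icc(toy)
```

The average-rating coefficient is always at least as large as the single-rating one when the latter is positive, and the two are linked by the Spearman-Brown relation, average = k × single / (1 + (k - 1) × single).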
Fidler, Samantha; D'Orsogna, Lloyd; Irish, Ashley B; Lewis, Joshua R; Wong, Germaine; Lim, Wai H
2018-03-02
Structural human leukocyte antigen (HLA) matching at the eplet level can be identified by HLAMatchmaker, which requires the entry of four-digit alleles. The aim of this study was to evaluate the agreement between eplet mismatches calculated by serological and two-digit typing methods compared to high-resolution four-digit typing. In a cohort of 264 donor/recipient pairs, the evaluation of measurement error was assessed using intra-class correlation to confirm the absolute agreement between the number of eplet mismatches at class I (HLA-A, -B, -C) and II loci (HLA-DQ and -DR) calculated using serological or two-digit molecular typing compared to four-digit molecular typing methods. The proportion of donor/recipient pairs with a difference of >5 eplet mismatches between the HLA typing methods was also determined. Intra-class correlation coefficients between serological and four-digit molecular typing methods were 0.969 (95% confidence intervals [95% CI] 0.960-0.975) and 0.926 (95% CI 0.899-0.944), respectively; and 0.995 (95% CI 0.994-0.996) and 0.993 (95% CI 0.991-0.995), respectively between two-digit and four-digit molecular typing methods. The proportion of donor/recipient pairs with a difference of >5 eplet mismatches at class I and II loci was 4% and 16% for serological versus four-digit molecular typing methods, and 0% and 2% for two-digit versus four-digit molecular typing methods, respectively. In this small predominantly Caucasian population, compared with serology, there is a high level of agreement in the number of eplet mismatches calculated using two-digit compared to four-digit molecular HLA-typing methods, suggesting that two-digit typing may be sufficient in determining eplet mismatch load in kidney transplantation.
Validation of the Hindi version of National Institute of Health Stroke Scale.
Prasad, Kameshwar; Dash, Deepa; Kumar, Amit
2012-01-01
To determine the reliability and validity of the National Institute of Health Stroke Scale (NIHSS) with the Hindi and Indian adaptation of items 9 and 10. NIHSS items 9 and 10 were modified and culturally adapted at All India Institute of Medical Sciences (AIIMS) and the resulting version was termed the Hindi version (HV-NIHSS). HV-NIHSS was applied by two independent investigators on 107 patients with stroke. Inter-observer agreement and intra-class correlation coefficients were calculated. The predictive validity of the HV-NIHSS was calculated using functional outcome after three months in the form of modified Rankin Scale (mRS) and Barthel Index (BI). The study included 107 patients with stroke recruited from a tertiary referral hospital at Delhi between November 1, 2009, and October 1, 2010; the mean age of these patients was 56.26±13.84 years and 65.4% of them had suffered ischemic stroke. Inter-rater reliability was high between the two examiners, with Pearson's r ranging from 0.72 to 0.99 for the 15 items on the Scale. Intra-class correlation coefficient for the total score was 0.995 (95% CI: 0.993-0.997). Concurrent construct validity was established between HV-NIHSS and baseline Glasgow Coma Scale, with a high correlation (Spearman coefficient = -0.863, P<.001). Predictive validity was also established with BI at three months (Spearman's rho: -0.829, P<.001) and with mRS at three months (Spearman's rho: 0.851, P<0.001). This study shows that a Hindi language version of the NIHSS developed at AIIMS appears reliable and valid when applied to a Hindi-speaking population.
Zijlstra, Agnes; Zijlstra, Wiebren
2013-09-01
Inverted pendulum (IP) models of human walking allow for wearable motion-sensor based estimations of spatio-temporal gait parameters during unconstrained walking in daily-life conditions. At present it is unclear to what extent different IP based estimations yield different results, and reliability and validity have not been investigated in older persons without a specific medical condition. The aim of this study was to compare reliability and validity of four different IP based estimations of mean step length in independent-living older persons. Participants were assessed twice and walked at different speeds while wearing a tri-axial accelerometer at the lower back. For all step-length estimators, test-retest intra-class correlations approached or were above 0.90. Intra-class correlations with reference step length were above 0.92 with a mean error of 0.0 cm when (1) multiplying the estimated center-of-mass displacement during a step by an individual correction factor in a simple IP model, or (2) adding an individual constant for bipedal stance displacement to the estimated displacement during single stance in a 2-phase IP model. When applying generic corrections or constants in all subjects (i.e. multiplication by 1.25, or adding 75% of foot length), correlations were above 0.75 with a mean error of respectively 2.0 and 1.2 cm. Although the results indicate that an individual adjustment of the IP models provides better estimations of mean step length, the ease of a generic adjustment can be favored when merely evaluating intra-individual differences. Further studies should determine the validity of these IP based estimations for assessing gait in daily life. Copyright © 2013 Elsevier B.V. All rights reserved.
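The abstract does not give the equation behind the simple IP model. A commonly cited form, assumed here for illustration, estimates step length as 2·√(2·l·h − h²), where l is the pendulum (leg) length and h is the vertical excursion of the center of mass during a step, then multiplies by a correction factor. The generic factor of 1.25 comes from the abstract; the function name and the input values are hypothetical.

```python
import math

def simple_ip_step_length(leg_length, com_excursion, correction=1.25):
    """Step length (same unit as the inputs) from a simple inverted pendulum model.

    leg_length    -- pendulum length l
    com_excursion -- vertical center-of-mass displacement h during one step
    correction    -- multiplicative factor; 1.25 is the generic value quoted in
                     the abstract, while an individual factor would be calibrated
                     per subject
    """
    chord = 2.0 * math.sqrt(2.0 * leg_length * com_excursion - com_excursion ** 2)
    return correction * chord

# Hypothetical subject: 0.90 m leg length and 3 cm vertical excursion.
estimate = simple_ip_step_length(0.90, 0.03)
```

Swapping the default for a per-subject factor corresponds to the individual adjustment that the abstract reports as giving a mean error of 0.0 cm.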
Høyer, Christian; Pavar, Susanne; Pedersen, Begitte H; Biurrun Manresa, José A; Petersen, Lars J
2013-08-01
Mercury-in-silastic strain gauge plethysmography (SGP) is a well-established technique for blood flow and blood pressure measurements. The aim of this study was to examine (i) the possible influence of clinical clues, e.g. the presence of wounds and color changes during blood pressure measurements, and (ii) intra- and inter-observer variation of curve interpretation for segmental blood pressure measurements. A total of 204 patients with known or suspected peripheral arterial disease (PAD) were included in a diagnostic accuracy trial. Toe and ankle pressures were measured in both limbs, and primary observers analyzed a total of 804 pressure curve sets. The SGP curves were later reanalyzed separately by two observers blinded to clinical clues. Intra- and inter-observer agreement was quantified using Cohen's kappa and reliability was quantified using intra-class correlation coefficients, coefficients of variance, and Bland-Altman analysis. There was an overall agreement regarding patient diagnostic classification (PAD/not PAD) in 202/204 (99.0%) for intra-observer (κ = 0.969, p < 0.001), and 201/204 (98.5%) for inter-observer readings (κ = 0.953, p < 0.001). Reliability analysis showed excellent correlation between blinded versus non-blinded and inter-observer readings for determination of absolute segmental pressures (all intraclass correlation coefficients ≥ 0.984). The coefficient of variance for determination of absolute segmental blood pressure ranged from 2.9-3.4% for blinded/non-blinded data and from 3.8-5.0% for inter-observer data. This study shows a low inter-observer variation among experienced laboratory technicians for reading strain gauge curves. The low variation between blinded/non-blinded readings indicates that SGP measurements are minimally biased by clinical clues.
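Cohen's kappa, used above for the PAD/not PAD classification, corrects the observed agreement for the agreement expected by chance. A minimal sketch for two observers follows; the function name and the toy labels are illustrative assumptions and are not data from the trial.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two observers labelling the same cases."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the two observers' marginal proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n ** 2
    # Note: undefined (division by zero) if both observers use a single category.
    return (observed - expected) / (1.0 - expected)

# Hypothetical classifications: 1 = PAD, 0 = not PAD.
observer_1 = [1, 1, 0, 0]
observer_2 = [1, 0, 0, 0]
kappa = cohen_kappa(observer_1, observer_2)
```

In the toy data the observers agree on 3 of 4 cases (75%), but half of that agreement is expected by chance, so kappa is 0.5. This is why the abstract reports kappa alongside the raw agreement percentages.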
Goñi, Joaquín; Sporns, Olaf; Cheng, Hu; Aznárez-Sanado, Maite; Wang, Yang; Josa, Santiago; Arrondo, Gonzalo; Mathews, Vincent P; Hummer, Tom A; Kronenberger, William G; Avena-Koenigsberger, Andrea; Saykin, Andrew J.; Pastor, María A.
2013-01-01
High-resolution isotropic three-dimensional reconstructions of human brain gray and white matter structures can be characterized to quantify aspects of their shape, volume and topological complexity. In particular, methods based on fractal analysis have been applied in neuroimaging studies to quantify the structural complexity of the brain in both healthy and impaired conditions. The usefulness of such measures for characterizing individual differences in brain structure critically depends on their within-subject reproducibility in order to allow the robust detection of between-subject differences. This study analyzes key analytic parameters of three fractal-based methods that rely on the box-counting algorithm with the aim to maximize within-subject reproducibility of the fractal characterizations of different brain objects, including the pial surface, the cortical ribbon volume, the white matter volume and the gray matter/white matter boundary. Two separate datasets originating from different imaging centers were analyzed, comprising 50 subjects with three and 24 subjects with four successive scanning sessions per subject, respectively. The reproducibility of fractal measures was statistically assessed by computing their intra-class correlations. Results reveal differences between different fractal estimators and allow the identification of several parameters that are critical for high reproducibility. Highest reproducibility with intra-class correlations in the range of 0.9–0.95 is achieved with the correlation dimension. Further analyses of the fractal dimensions of parcellated cortical and subcortical gray matter regions suggest robustly estimated and region-specific patterns of individual variability. 
These results are valuable for defining appropriate parameter configurations when studying changes in fractal descriptors of human brain structure, for instance in studies of neurological diseases that do not allow repeated measurements or for disease-course longitudinal studies. PMID:23831414
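The box-counting algorithm that underlies the fractal estimators above can be sketched in a few lines: cover the object with boxes of decreasing size, count the occupied boxes, and take the slope of log(count) against log(1/size). This is a generic illustration on integer grid points, not the authors' pipeline (which reports the correlation dimension as most reproducible); the function names and test shapes are assumptions.

```python
import math

def occupied_boxes(points, size):
    """Number of grid boxes of edge length `size` containing at least one point."""
    return len({tuple(c // size for c in p) for p in points})

def box_counting_dimension(points, sizes):
    """Least-squares slope of log N(size) versus log(1/size)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(occupied_boxes(points, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Hypothetical objects on a 16 x 16 voxel grid: a filled square and a straight line.
square = [(i, j) for i in range(16) for j in range(16)]
line = [(i, 0) for i in range(16)]
sizes = [1, 2, 4, 8]
dim_square = box_counting_dimension(square, sizes)
dim_line = box_counting_dimension(line, sizes)
```

The estimate depends on which box sizes enter the regression, which is one example of the kind of analytic parameter the abstract identifies as affecting reproducibility.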
Endarti, Dwi; Riewpaiboon, Arthorn; Thavorncharoensap, Montarat; Praditsitthikorn, Naiyana; Hutubessy, Raymond; Kristina, Susi Ari
2018-05-01
To gain insight into the most suitable foreign value set among Malaysian, Singaporean, Thai, and UK value sets for calculating the EuroQol five-dimensional questionnaire index score (utility) among patients with cervical cancer in Indonesia. Data from 87 patients with cervical cancer recruited from a referral hospital in Yogyakarta province, Indonesia, from an earlier study of health-related quality of life were used in this study. The differences among the utility scores derived from the four value sets were determined using the Friedman test. Performance of the psychometric properties of the four value sets versus visual analogue scale (VAS) was assessed. Intraclass correlation coefficients and Bland-Altman plots were used to test the agreement among the utility scores. Spearman ρ correlation coefficients were used to assess convergent validity between utility scores and patients' sociodemographic and clinical characteristics. With respect to known-group validity, the Kruskal-Wallis test was used to examine the differences in utility according to the stages of cancer. There was a significant difference among utility scores derived from the four value sets, among which the Malaysian value set yielded higher utility than the other three value sets. Utility obtained from the Malaysian value set agreed more closely with VAS than did utility from the other value sets (according to intraclass correlation coefficients and Bland-Altman plots). As for validity, the four value sets showed equivalent psychometric properties in the convergent and known-group validity tests. In the absence of an Indonesian value set, the Malaysian value set was preferable to the other value sets. Further studies on the development of an Indonesian value set need to be conducted. Copyright © 2018. Published by Elsevier Inc.
Allen, Michael H; Daniel, David G; Revicki, Dennis A; Canuso, Carla M; Turkoz, Ibrahim; Fu, Dong-Jing; Alphs, Larry; Ishak, K Jack; Bartko, John J; Lindenmayer, Jean-Pierre
2012-01-01
The Clinical Global Impression for Schizoaffective Disorder scale is a new rating scale adapted from the Clinical Global Impression scale for use in patients with schizoaffective disorder. The psychometric characteristics of the Clinical Global Impression for Schizoaffective Disorder are described. Content validity was assessed using an investigator questionnaire. Inter-rater reliability was determined with 12 sets of videotaped interviews rated independently by two trained individuals. Test-retest reliability was assessed using 30 randomly selected raters from clinical trials who evaluated the same videos on separate occasions two weeks apart. Convergent and divergent validity and effect size were evaluated by comparing scores between the Clinical Global Impression for Schizoaffective Disorder and the Positive and Negative Syndrome Scale, 21-item Hamilton Rating Scale for Depression, and Young Mania Rating Scale using pooled patient data from two clinical trials. Clinical Global Impression for Schizoaffective Disorder scores were then linked to corresponding Positive and Negative Syndrome Scale scores. Content validity was strong. Inter-rater agreement was good to excellent for most scales and subscales (intra-class correlation coefficient ≥ 0.50). Test-retest showed good reproducibility, with intraclass correlation coefficients ranging from 0.444 to 0.898. Spearman correlations between Clinical Global Impression for Schizoaffective Disorder domains and corresponding symptom scales were 0.60 or greater, and effect sizes for Clinical Global Impression for Schizoaffective Disorder overall and domain scores were similar to those for Positive and Negative Syndrome Scale, Young Mania Rating Scale, and 21-item Hamilton Rating Scale for Depression scores. Raters anticipated that the scale might be less effective in distinguishing negative from depressive symptoms, and, in fact, the results here may reflect that clinical reality. 
Multiple lines of evidence support the reliability and validity of the Clinical Global Impression for Schizoaffective Disorder for studies in schizoaffective disorder.
Lonjon, Guillaume; Ilharreborde, Brice; Odent, Thierry; Moreau, Sébastien; Glorion, Christophe; Mazda, Keyvan
2014-01-01
Outcome study to determine the internal consistency, reproducibility, and concurrent validity of the French-Canadian version of the Scoliosis Research Society 22 (SRS-22 fcv) patient questionnaire in France. To determine whether the SRS-22 fcv can be used in a population from France. The SRS-22 has been translated and validated in multiple countries, notably in the French-Canadian language in Quebec, Canada. Use of SRS-22 fcv seems appropriate for evaluating adolescent idiopathic scoliosis in France. However, French-Canadian French is noticeably different from the French spoken in France, and no study has investigated the use of a French-Canadian version of a health-quality questionnaire in another French population. The methods used for validating the SRS-22 fcv in Quebec were adopted for use with a group of 200 adolescents with idiopathic scoliosis and 60 healthy adolescents in France. Reliability and reproducibility were measured by the Cronbach α and intraclass correlation coefficient (ICC), construct validity by factorial analysis, concurrent validity by the Short-Form of the survey, and discriminant validity by analysis of variance and multivariate linear regression. In France, the SRS-22 fcv showed good global internal consistency (Cronbach α = 0.87, intraclass correlation coefficient = 0.92), a coherent factorial structure, and high correlation coefficients between the SRS-22 fcv and Short-Form of the survey (P < 0.001). However, reliability and validity were slightly less than that for the instrument's original validation and the validation of the SRS-22 fcv in Quebec. These differences could be explained by language and cultural differences. The SRS-22 fcv is relevant for use in France, but further development and validation of a specific French questionnaire remain necessary to improve the assessment of functional outcomes of adolescents with scoliosis in France.
Environmental and Genetic Factors Explain Differences in Intraocular Scattering.
Benito, Antonio; Hervella, Lucía; Tabernero, Juan; Pennos, Alexandros; Ginis, Harilaos; Sánchez-Romera, Juan F; Ordoñana, Juan R; Ruiz-Sánchez, Marcos; Marín, José M; Artal, Pablo
2016-01-01
To study the relative impact of genetic and environmental factors on the variability of intraocular scattering within a classical twin study. A total of 64 twin pairs, 32 monozygotic (MZ) (mean age: 54.9 ± 6.3 years) and 32 dizygotic (DZ) (mean age: 56.4 ± 7.0 years), were measured after a complete ophthalmologic exam had been performed to exclude all ocular pathologies that increase intraocular scatter, such as cataracts. Intraocular scattering was evaluated by using two different techniques based on a straylight parameter log(S) estimation: a compact optical instrument based on the principle of optical integration and a psychophysical measurement. Intraclass correlation coefficients (ICC) were used as descriptive statistics of twin resemblance, and genetic models were fitted to estimate heritability. No statistically significant difference was found for MZ and DZ groups for age (P = 0.203), best-corrected visual acuity (P = 0.626), cataract gradation (P = 0.701), sex (P = 0.941), optical log(S) (P = 0.386), or psychophysical log(S) (P = 0.568), with only a minor difference in equivalent sphere (P = 0.008). Intraclass correlation coefficients between siblings were similar for scatter parameters: 0.676 in MZ and 0.471 in DZ twins for optical log(S); 0.533 in MZ twins and 0.475 in DZ twins for psychophysical log(S). For equivalent sphere, ICCs were 0.767 in MZ and 0.228 in DZ twins. Conservative estimates of heritability for the measured scattering parameters were 0.39 and 0.20, respectively. Correlations of intraocular scatter (straylight) parameters in the groups of identical and nonidentical twins were similar. Heritability estimates were of limited magnitude, suggesting that genetic and environmental factors determine the variance of ocular straylight in healthy middle-aged adults.
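As a rough cross-check on the twin correlations reported above, Falconer's approximation estimates heritability as twice the difference between the MZ and DZ correlations. The sketch below applies it to the reported ICCs. The function name is illustrative and not from the study, and the abstract's own estimates (0.39 and 0.20) came from fitted genetic models, so the two sets of values need not agree.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's approximation: h^2 = 2 * (r_MZ - r_DZ).

    This is a textbook shortcut, not the model-fitting approach
    the abstract describes.
    """
    return 2.0 * (r_mz - r_dz)


# ICCs taken from the abstract above.
optical = falconer_heritability(0.676, 0.471)
psychophysical = falconer_heritability(0.533, 0.475)
print(round(optical, 2), round(psychophysical, 2))
```

The approximation can exceed 1 when the MZ correlation is more than twice the DZ correlation, as it would for the equivalent-sphere ICCs, which is one reason fitted models are usually preferred.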
Ogden, C A; Akobeng, A K; Abbott, J; Aggett, P; Sood, M R; Thomas, A G
2011-09-01
To validate IMPACT-III (UK), a health-related quality of life (HRQoL) instrument, in British children with inflammatory bowel disease (IBD). One hundred six children and parents were invited to participate. IMPACT-III (UK) was validated by inspection by health professionals and children to assess face and content validity, factor analysis to determine optimum domain structure, use of Cronbach alpha coefficients to test internal reliability, ANOVA to assess discriminant validity, correlation with the Child Health Questionnaire to assess concurrent validity, and use of intraclass correlation coefficients to assess test-retest reliability. The independent samples t test was used to measure differences between sexes and age groups, and between paper and computerised versions of IMPACT-III (UK). IMPACT-III (UK) had good face and content validity. The most robust factor solution was a 5-domain structure: body image, embarrassment, energy, IBD symptoms, and worries/concerns about IBD, all of which demonstrated good internal reliability (α = 0.74-0.88). Discriminant validity was demonstrated by significant (P < 0.05, P < 0.01) differences in HRQoL scores between the severe, moderate, and inactive/mild symptom severity groups for the embarrassment scale (63.7 vs 81.0 vs 81.2), IBD symptom scale (45.0 vs 64.2 vs 80.6), and the energy scale (46.4 vs 62.1 vs 77.7). Concurrent validity of IMPACT-III (UK) with comparable domains of the Child Health Questionnaire was confirmed. Test-retest reliability was confirmed with good intraclass correlation coefficients of 0.66 to 0.84. Paper and computer versions of IMPACT-III (UK) collected comparable scores, and there were no differences between the sexes and age groups. IMPACT-III (UK) appears to be a useful tool to measure HRQoL in British children with IBD.
Nikjooy, Afsaneh; Jafari, Hassan; Saba, Maryam A; Ebrahimi, Naghmeh; Mirzaei, Rezvan
2018-05-01
The Patient Assessment of Constipation Quality of Life (PAC-QOL) questionnaire is the most validated and the most specific tool for measuring the quality of life of patients with constipation. Over 120 million people live in countries whose official language is Persian. There is no reported Persian version of the PAC-QOL questionnaire yet. The aim of this study was to translate and culturally adapt the PAC-QOL questionnaire and to assess its reliability and validity among Persian patients with chronic constipation. Following the translation and cultural adaptation of the PAC-QOL questionnaire to Persian, 100 patients (mean±SD age=40.51±13.67) with constipation were recruited for validity measurement and 20 patients were re-examined for reliability. Content validity was assessed based on the opinions of an expert committee and the floor/ceiling effect. Construct validity was evaluated according to the hypothesis test. The SF-36 questionnaire was used for concurrent criterion validity, intra-class correlation coefficient for reliability, and Cronbach's alpha for internal consistency. The content validity of the PAC-QOL questionnaire was proven, and there was no floor/ceiling effect. Construct validity also was confirmed based on the hypothesis test. The overall Cronbach's alpha of the PAC-QOL questionnaire was 0.92 (range=0.72-0.92), and the overall intra-class correlation coefficient of the questionnaire was 0.88 (range=0.69-0.87). The correlation between the SF-36 and PAC-QOL questionnaires was moderate. The Persian version of the PAC-QOL questionnaire demonstrated good validity and reliability properties in chronic constipation. Accordingly, Persian researchers and clinicians can benefit from this questionnaire in further research and assessment of treatment outcomes.
Lorini, Chiara; Collini, Francesca; Castagnoli, Mariangela; Di Bari, Mauro; Cavallini, Maria Chiara; Zaffarana, Nicoletta; Pepe, Pasquale; Lucenteforte, Ersilia; Vannacci, Alfredo; Bonaccorsi, Guglielmo
2014-10-01
The aim of this study was to use the Malnutrition Universal Screening Tool (MUST) to assess the applicability of alternative versus direct anthropometric measurements for evaluating the risk for malnutrition in older individuals living in nursing homes (NHs). We conducted a cross-sectional survey in 67 NHs in Tuscany, Italy. We measured the weight, standing height (SH), knee height (KH), ulna length (UL), and middle-upper-arm circumference of 641 NH residents. Correlations between the different methods for calculating body mass index (BMI; using direct or alternative measurements) were evaluated by the intraclass correlation coefficient and the Bland-Altman method; agreement in the allocation of participants to the same risk category was assessed by squared weighted kappa statistic and indicators of internal relative validity. The intraclass correlation coefficient for BMI calculated using KH was 0.839 (0.815-0.861), whereas that for BMI calculated using UL was 0.890 (0.872-0.905). The limits of agreement were ±6.13 kg/m² using KH and ±4.66 kg/m² using UL. For BMI calculated using SH, 79.9% of the patients were at low risk, 8.1% at medium risk, and 12.2% at high risk for malnutrition. The agreement between this classification and that obtained using BMI calculated by alternative measurements was "fair-good." When it is not possible to determine risk category by using SH, we suggest using the alternative measurements (primarily UL, due to its highest sensitivity) to predict the height and to compare these evaluations with those obtained by using middle-upper-arm-circumference to predict the BMI. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
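The limits of agreement quoted above come from the Bland-Altman method, which summarizes paired differences between two measurement methods as a mean bias plus or minus 1.96 standard deviations. A minimal sketch follows, assuming paired BMI values from two methods; the function name and the sample numbers are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences; the 95% limits of
    agreement are bias -/+ 1.96 * sample SD of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical BMI values (kg/m^2) from standing height vs. ulna length.
bmi_standing = [22.1, 25.4, 19.8, 27.3, 24.0]
bmi_ulna = [21.5, 26.0, 20.9, 26.1, 24.4]
print(bland_altman_limits(bmi_standing, bmi_ulna))
```

Narrower limits indicate closer agreement between methods, which is how the abstract's ±4.66 for UL compares favorably with ±6.13 for KH.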
Grover, S; Chakrabarti, S; Ghormode, D; Dutt, A; Kate, N; Kulhara, P
2011-12-01
The Involvement Evaluation Questionnaire (IEQ) is a comprehensive, conceptually valid and reliable means of assessing caregiver burden. However, its psychometric properties have rarely been examined in non-European settings. The aim of the present study was to evaluate the psychometric properties of an Indian translation of the IEQ (Hindi-IEQ). The European Union (English) version of IEQ was translated into Hindi and reviewed by a group of experts and caregivers for translation accuracy, cultural appropriateness, and for relevance and acceptability of items and constructs. The Hindi-IEQ was then administered to 162 primary caregivers of patients with severe mental illnesses. Eighteen caregivers completed both the English and Hindi versions to check the level of agreement between them. Another 27 completed the Hindi-IEQ twice, a week apart, to evaluate its test-retest reliability. Factor structure of the Hindi-IEQ was examined using an exploratory, principal components and factor analysis. Pearson's correlation coefficients were significant for 24 items, while intraclass correlation coefficients were significant for 28 of the 31 items (P < 0.05), indicating a satisfactory level of agreement between the Hindi and English versions. Test-retest reliability for all items of the Hindi-IEQ was adequate, with kappa values ranging from 0.46 to 0.95 and intraclass correlation coefficients from 0.76 to 1.00. Internal consistency (Cronbach's alpha = 0.89) and the split-half reliability (Spearman-Brown coefficient = 0.68) of the Hindi-IEQ were also satisfactory. However, several differences were noted in the factor structure and distribution of scores of the Hindi-IEQ, which were quite unlike that of the European Union version. The similarities and differences between the 2 versions of the IEQ indicated that sociocultural factors could influence assessment of caregiver burden across different cultures.
O’Connor, David; Potler, Natan Vega; Kovacs, Meagan; Xu, Ting; Ai, Lei; Pellman, John; Vanderwal, Tamara; Parra, Lucas C.; Cohen, Samantha; Ghosh, Satrajit; Escalera, Jasmine; Grant-Villegas, Natalie; Osman, Yael; Bui, Anastasia; Craddock, R. Cameron
2017-01-01
Background: Although typically measured during the resting state, a growing literature is illustrating the ability to map intrinsic connectivity with functional MRI during task and naturalistic viewing conditions. These paradigms are drawing excitement due to their greater tolerability in clinical and developing populations and because they enable a wider range of analyses (e.g., inter-subject correlations). To be clinically useful, the test-retest reliability of connectivity measured during these paradigms needs to be established. This resource provides data for evaluating test-retest reliability for full-brain connectivity patterns detected during each of four scan conditions that differ with respect to level of engagement (rest, abstract animations, movie clips, flanker task). Data are provided for 13 participants, each scanned in 12 sessions with 10 minutes for each scan of the four conditions. Diffusion kurtosis imaging data were also obtained at each session. Findings: Technical validation and demonstrative reliability analyses were carried out at the connection-level using the Intraclass Correlation Coefficient and at network-level representations of the data using the Image Intraclass Correlation Coefficient. Variation in intrinsic functional connectivity across sessions was generally found to be greater than that attributable to scan condition. Between-condition reliability was generally high, particularly for the frontoparietal and default networks. Between-session reliabilities obtained separately for the different scan conditions were comparable, though notably lower than between-condition reliabilities. Conclusions: This resource provides a test-bed for quantifying the reliability of connectivity indices across subjects, conditions and time. The resource can be used to compare and optimize different frameworks for measuring connectivity and data collection parameters such as scan length. Additionally, investigators can explore the unique perspectives of the brain's functional architecture offered by each of the scan conditions. PMID:28369458
The development and reliability of a repeated anaerobic cycling test in female ice hockey players.
Wilson, Kier; Snydmiller, Gary; Game, Alex; Quinney, Art; Bell, Gordon
2010-02-01
The purpose of this study was to develop and assess the reliability of a repeated anaerobic power cycling test designed to mimic the repeated sprinting nature of the sport of ice hockey. Nineteen female varsity ice hockey players (mean ± SD age, height, and body mass = 21 ± 2 yr, 166.6 ± 6.3 cm, and 62.3 ± 7.3 kg) completed 3 trials of a repeated anaerobic power test on a Monark cycle ergometer on different days. The test consisted of "all-out" cycling for 5 seconds separated by 10 seconds of low-intensity cycling, repeated 4 times. The relative load factor used for the resistance setting was equal to 0.095 kg per kilogram body mass. There was no significant difference between the peak 5-second power output (PO), mean PO, or the fatigue index (%) among the 3 different trials. The peak 5-second PO was 702.6 ± 114.8 W and 11.3 ± 1.1 W/kg, whereas the mean PO across the 4 repeats was 647.1 ± 96.3 W and 10.4 ± 1.0 W/kg averaged for the 3 different tests. The fatigue index averaged 17.8 ± 6.5%. The intraclass correlation coefficient for peak 5-second, mean PO, and fatigue index was 0.82, 0.86, and 0.82, respectively. This study reports the methodology of a repeated anaerobic power cycling test that was reliable for the measurement of PO and calculated fatigue index in varsity women ice hockey players and can be used as a laboratory-based assessment of repeated anaerobic fitness.
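The abstract above does not say which form of the intraclass correlation coefficient was used. As one possibility, the sketch below computes the one-way random-effects ICC for a subjects-by-trials table from the between-subject and within-subject mean squares, giving both the single-measure and the average-measure value. The function name and the power-output numbers are hypothetical.

```python
def one_way_icc(scores):
    """One-way random-effects ICC for a subjects x trials table.

    scores: list of rows, one row per subject, with the same number
    of trials (k >= 2) in each row.
    Returns (single-measure ICC, average-measure ICC).
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]

    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(scores, row_means) for x in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    average = (ms_between - ms_within) / ms_between
    return single, average


# Hypothetical peak power (W) for four athletes over three trials.
peak_power = [
    [702.0, 710.0, 695.0],
    [650.0, 641.0, 660.0],
    [810.0, 798.0, 805.0],
    [590.0, 612.0, 600.0],
]
print(one_way_icc(peak_power))
```

Two-way models, which separate systematic differences between trials from random error, are a common alternative for test-retest designs and would give different values.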
Validity and reliability assessment of a peer evaluation method in team-based learning classes.
Yoon, Hyun Bae; Park, Wan Beom; Myung, Sun-Jung; Moon, Sang Hui; Park, Jun-Bean
2018-03-01
Team-based learning (TBL) is increasingly employed in medical education because of its potential to promote active group learning. In TBL, learners are usually asked to assess the contributions of peers within their group to ensure accountability. The purpose of this study is to assess the validity and reliability of a peer evaluation instrument that was used in TBL classes in a single medical school. A total of 141 students were divided into 18 groups in 11 TBL classes. The students were asked to evaluate their peers in the group based on evaluation criteria that were provided to them. We analyzed the comments that were written for the highest and lowest achievers to assess the validity of the peer evaluation instrument. The reliability of the instrument was assessed by examining the agreement among peer ratings within each group of students via intraclass correlation coefficient (ICC) analysis. Most of the students provided reasonable and understandable comments for the high and low achievers within their group, and most of those comments were compatible with the evaluation criteria. The average ICC of each group ranged from 0.390 to 0.863, and the overall average was 0.659. There was no significant difference in inter-rater reliability according to the number of members in the group or the timing of the evaluation within the course. The peer evaluation instrument that was used in the TBL classes was valid and reliable. Providing evaluation criteria and rules seemed to improve the validity and reliability of the instrument.
Reliability of bounce drop jump parameters within elite male rugby players.
Costley, Lisa; Wallace, Eric; Johnston, Michael; Kennedy, Rodney
2017-07-25
The aims of the study were to investigate the number of familiarisation sessions required to establish reliability of the bounce drop jump (BDJ) and subsequent reliability once familiarisation is achieved. Seventeen trained male athletes completed 4 BDJs in 4 separate testing sessions. Force-time data from a 20 cm BDJ was obtained using two force plates (ensuring ground contact < 250 ms). Subjects were instructed to 'jump for maximal height and minimal contact time' while the best and average of four jumps were compared. A series of performance variables were assessed in both eccentric and concentric phases including jump height, contact time, flight time, reactive strength index (RSI), peak power, rate of force development (RFD) and actual dropping height (ADH). Reliability was assessed using the intraclass correlation coefficient (ICC) and coefficient of variation (CV) while familiarisation was assessed using a repeated measures analysis of variance (ANOVA). The majority of DJ parameters exhibited excellent reliability with no systematic bias evident, while the average of 4 trials provided greater reliability. With the exception of vertical stiffness (CV: 12.0 %) and RFD (CV: 16.2 %) all variables demonstrated low within subject variation (CV range: 3.1 - 8.9 %). Relative reliability was very poor for ADH, with heights ranging from 14.87 - 29.85 cm. High levels of reliability can be obtained from the BDJ with the exception of vertical stiffness and RFD, however, extreme caution must be taken when comparing DJ results between individuals and squads due to large discrepancies between actual drop height and platform height.
Langberg, Joshua M; Dvorsky, Melissa R; Molitor, Stephen J; Bourchtein, Elizaveta; Eddy, Laura D; Smith, Zoe; Schultz, Brandon K; Evans, Steven W
2016-04-01
The primary goal of this study was to longitudinally evaluate the homework assignment completion patterns of middle school age adolescents with ADHD, their associations with academic performance, and malleable predictors of homework assignment completion. Analyses were conducted on a sample of 104 middle school students comprehensively diagnosed with ADHD and followed for 18 months. Multiple teachers for each student provided information about the percentage of homework assignments turned in at five separate time points and school grades were collected quarterly. Results showed that agreement between teachers with respect to students' assignment completion was high, with an intraclass correlation of .879 at baseline. Students with ADHD were turning in an average of 12% fewer assignments each academic quarter in comparison to teacher-reported classroom averages. Regression analyses revealed a robust association between the percentage of assignments turned in at baseline and school grades 18 months later, even after controlling for baseline grades, achievement (reading and math), intelligence, family income, and race. Cross-lag analyses demonstrated that the association between assignment completion and grades was reciprocal, with assignment completion negatively impacting grades and low grades in turn being associated with decreased future homework completion. Parent ratings of homework materials management abilities at baseline significantly predicted the percentage of assignments turned in as reported by teachers 18 months later. These findings demonstrate that homework assignment completion problems are persistent across time and an important intervention target for adolescents with ADHD. Copyright © 2015 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Everett, Tobias C; Ng, Elaine; Power, Daniel; Marsh, Christopher; Tolchard, Stephen; Shadrina, Anna; Bould, Matthew D
2013-12-01
The use of simulation-based assessments for high-stakes physician examinations remains controversial. The Managing Emergencies in Paediatric Anaesthesia course uses simulation to teach evidence-based management of anesthesia crises to trainee anesthetists in the United Kingdom (UK) and Canada. In this study, we investigated the feasibility and reliability of custom-designed scenario-specific performance checklists and a global rating scale (GRS) assessing readiness for independent practice. After research ethics board approval, subjects were videoed managing simulated pediatric anesthesia crises in a single Canadian teaching hospital. Each subject was randomized to two of six different scenarios. All 60 scenarios were subsequently rated by four blinded raters (two in the UK, two in Canada) using the checklists and GRS. The actual and predicted reliability of the tools was calculated for different numbers of raters using the intraclass correlation coefficient (ICC) and the Spearman-Brown prophecy formula. Average measures ICCs ranged from 'substantial' to 'near perfect' (P ≤ 0.001). The reliability of the checklists and the GRS was similar. Single measures ICCs showed more variability than average measures ICCs. At least two raters would be required to achieve acceptable reliability. We have established the reliability of a GRS to assess the management of simulated crisis scenarios in pediatric anesthesia, and this tool is feasible within the setting of a research study. The global rating scale allows raters to make a judgement regarding a participant's readiness for independent practice. These tools may be used in future research examining simulation-based assessment. © 2013 John Wiley & Sons Ltd.
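The Spearman-Brown prophecy formula mentioned above predicts the reliability of the mean of k raters from the reliability of a single rater: r_k = k·r / (1 + (k − 1)·r). The sketch below tabulates the prediction for one to four raters. The function name and the single-rater value of 0.55 are illustrative assumptions, since the abstract reports no numeric ICCs.

```python
def spearman_brown(single_rater_reliability, n_raters):
    """Predicted reliability of the average of n_raters ratings."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Hypothetical single-measure ICC; not a value from the study.
single_icc = 0.55
for k in range(1, 5):
    print(k, round(spearman_brown(single_icc, k), 3))
```

Because the predicted value rises with each added rater but with diminishing returns, a table like this is one way to judge how many raters are needed to pass a chosen reliability threshold.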
Is Diaphragm Motion a Good Surrogate for Liver Tumor Motion?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Juan; School of Information Science and Engineering, Shandong University, Jinan, Shandong; Cai, Jing
Purpose: To evaluate the relationship between liver tumor motion and diaphragm motion. Methods and Materials: Fourteen patients with hepatocellular carcinoma (10 of 14) or liver metastases (4 of 14) undergoing radiation therapy were included in this study. All patients underwent single-slice cine–magnetic resonance imaging simulations across the center of the tumor in 3 orthogonal planes. Tumor and diaphragm motion trajectories in the superior–inferior (SI), anterior–posterior (AP), and medial–lateral (ML) directions were obtained using an in-house-developed normalized cross-correlation–based tracking technique. Agreement between the tumor and diaphragm motion was assessed by calculating phase difference percentage, intraclass correlation coefficient, and Bland-Altman analysis (Diff). The distance between the tumor and tracked diaphragm area was analyzed to understand its impact on the correlation between the 2 motions. Results: Of all patients, the mean (±standard deviation) phase difference percentage values were 7.1% ± 1.1%, 4.5% ± 0.5%, and 17.5% ± 4.5% in the SI, AP, and ML directions, respectively. The mean intraclass correlation coefficient values were 0.98 ± 0.02, 0.97 ± 0.02, and 0.08 ± 0.06 in the SI, AP, and ML directions, respectively. The mean Diff values were 2.8 ± 1.4 mm, 2.4 ± 1.1 mm, and 2.2 ± 0.5 mm in the SI, AP, and ML directions, respectively. Tumor and diaphragm motions had high concordance when the distance between the tumor and tracked diaphragm area was small. Conclusions: This study showed that liver tumor motion had good correlation with diaphragm motion in the SI and AP directions, indicating diaphragm motion in the SI and AP directions could potentially be used as a reliable surrogate for liver tumor motion.
Waldon, Jessica; Begum, Esmot; Gendron, Melissa; Rusak, Benjamin; Andreou, Pantelis; Rajda, Malgorzata; Corkum, Penny
2016-10-01
This study sought to: (1) compare actigraphy-derived estimated sleep variables to the same variables based on the gold-standard of sleep assessment, polysomnography; (2) examine whether the correlations between the measures differ between children with attention-deficit/hyperactivity disorder and typically developing children; and (3) determine whether these correlations are altered when children with attention-deficit/hyperactivity disorder are treated with medication. Participants (24 attention-deficit/hyperactivity disorder; 24 typically developing), aged 6-12 years, completed a 1-week baseline assessment of typical sleep and daytime functioning. Following the baseline week, participants in the attention-deficit/hyperactivity disorder group completed a 4-week blinded randomized controlled trial of methylphenidate hydrochloride, including a 2-week placebo and 2-week methylphenidate hydrochloride treatment period. At the end of each observation (typically developing: baseline; attention-deficit/hyperactivity disorder: baseline, placebo and methylphenidate hydrochloride treatment), all participants were invited to a sleep research laboratory, where overnight polysomnography and actigraphy were recorded concurrently. Findings from intra-class correlations and Bland-Altman plots were consistent. Actigraphy was found to provide good estimates (e.g. intra-class correlations >0.61) of polysomnography results for sleep duration for all groups and conditions, as well as for sleep-onset latency and sleep efficiency for the typically developing group and attention-deficit/hyperactivity disorder group while on medication, but not for the attention-deficit/hyperactivity disorder group during baseline or placebo. Based on the Bland-Altman plots, actigraphy tended to underestimate sleep duration (8.6-18.5 min), sleep efficiency (5.6-9.3%) and sleep-onset latency, except for attention-deficit/hyperactivity disorder during placebo in which actigraphy overestimated (-2.1 to 6.3 min). The results of the current study highlight the importance of utilizing a multimodal approach to sleep assessment in children with attention-deficit/hyperactivity disorder. © 2016 European Sleep Research Society.
Sandhu, Sukhvinder Singh; Ismail, Noor Hassim; Rampal, Krishna Gopal
2015-11-01
The Perceived Stress Scale-10 (PSS-10) is widely used to assess stress perception. The aim of this study was to translate the original PSS-10 into Malay and assess the reliability and validity of the Malay version among nurses. The Malay version of the PSS-10 was distributed among 229 nurses from four government hospitals in Selangor State. Test-retest reliability and concurrent validity were assessed with 25 nurses using the Malay version of the Depression Anxiety Stress Scales (DASS-21). Cronbach's alpha, confirmatory factor analysis (CFA), intraclass correlation coefficient and Pearson's r correlation coefficient were used to determine the psychometric properties of the Malay PSS-10. Two factor components were yielded through exploratory factor analysis with eigenvalues of 3.37 and 2.10, respectively. Both of the factors accounted for 54.6% of the variance. CFA yielded a two-factor structure with satisfactory goodness-of-fit indices [χ²/df = 2.43; comparative fit index (CFI) = 0.92, goodness-of-fit index (GFI) = 0.94; standardised root mean square residual (SRMR) = 0.07 and root mean square error of approximation (RMSEA) = 0.08 (90% CI = 0.07-0.09)]. The Cronbach's alpha coefficient for the total items was 0.63 (0.82 for factor 1 and 0.72 for factor 2). The intraclass correlation coefficient (ICC) was 0.81 (95% CI: 0.62-0.91) for test-retest reliability testing after seven days. The total score and the negative component of the PSS-10 correlated significantly with the stress component of the DASS-21: (r = 0.61, P < 0.001) and (r = 0.56, P < 0.004), respectively. The Malay version of the PSS-10 demonstrated a satisfactory level of validity and reliability to assess stress perception. Therefore, this questionnaire is valid in assessing stress perception among nurses in Malaysia.
A new instrument to measure quality of life of heart failure family caregivers.
Nauser, Julie A; Bakas, Tamilyn; Welch, Janet L
2011-01-01
Family caregivers of heart failure (HF) patients experience poor physical and mental health leading to poor quality of life. Although several quality-of-life measures exist, they are often too generic to capture the unique experience of this population. The purpose of this study was to evaluate the psychometric properties of the Family Caregiver Quality of Life (FAMQOL) Scale that was designed to assess the physical, psychological, social, and spiritual dimensions of quality of life among caregivers of HF patients. Psychometric testing of the FAMQOL with 100 HF family caregivers was conducted using item analysis, Cronbach α, intraclass correlation, factor analysis, and hierarchical multiple regression guided by a conceptual model. Caregivers were predominately female (89%), white (73%), and spouses (62%). Evidence of internal consistency reliability (α=.89) was provided for the FAMQOL, with item-total correlations of 0.39 to 0.74. Two-week test-retest reliability was supported by an intraclass correlation coefficient of 0.91. Using a 1-factor solution and principal axis factoring, loadings ranged from 0.31 to 0.78, with 41% of the variance explained by the first factor (eigenvalue=6.5). With hierarchical multiple regression, 56% of the FAMQOL variance was explained by model constructs (F8,91=16.56, P<.001). Criterion-related validity was supported by correlations with SF-36 General (r=0.45, P<.001) and Mental (r=0.59, P<.001) Health subscales and Bakas Caregiving Outcomes Scale (r=0.73, P<.001). Evidence of internal and test-retest reliability and construct and criterion validity was provided for physical, psychological, and social well-being subscales. The 16-item FAMQOL is a brief, easy-to-administer instrument that has evidence of reliability and validity in HF family caregivers. Physical, psychological, and social well-being can be measured with 4-item subscales. The FAMQOL scale could serve as a valuable measure in research, as well as an assessment tool to identify caregivers in need of intervention.
Flosadottir, Vala; Roos, Ewa M.; Ageberg, Eva
2017-01-01
Background: The Activity Rating Scale (ARS) for disorders of the knee evaluates the level of activity by the frequency of participation in 4 separate activities with high demands on knee function, with a score ranging from 0 (none) to 16 (pivoting activities 4 times/wk). Purpose: To translate and cross-culturally adapt the ARS into Swedish and to assess measurement properties of the Swedish version of the ARS. Study Design: Cohort study (diagnosis); Level of evidence, 2. Methods: The COSMIN guidelines were followed. Participants (N = 100 [55 women]; mean age, 27 years) who were undergoing rehabilitation for a knee injury completed the ARS twice for test-retest reliability. The Knee injury and Osteoarthritis Outcome Score (KOOS), Tegner Activity Scale (TAS), and modernized Saltin-Grimby Physical Activity Level Scale (SGPALS) were administered at baseline to validate the ARS. Construct validity and responsiveness of the ARS were evaluated by testing predefined hypotheses regarding correlations between the ARS, KOOS, TAS, and SGPALS. The Cronbach alpha, intraclass correlation coefficients, absolute reliability, standard error of measurement, smallest detectable change, and Spearman rank-order correlation coefficients were calculated. Results: The ARS showed good internal consistency (α ≈ 0.96), good test-retest reliability (intraclass correlation coefficient >0.9), and no systematic bias between measurements. The standard error of measurement was less than 2 points, and the smallest detectable change was less than 1 point at the group level and less than 5 points at the individual level. More than 75% of the hypotheses were confirmed, indicating good construct validity and good responsiveness of the ARS. Conclusion: The Swedish version of the ARS is valid, reliable, and responsive for evaluating the level of activity based on the frequency of participation in high-demand knee sports activities in young adults with a knee injury. PMID:28979920
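The abstract above reports a standard error of measurement (SEM) and a smallest detectable change (SDC) without giving formulas. A commonly used set, assumed here rather than taken from the study, is SEM = SD·√(1 − ICC), SDC at the individual level = 1.96·√2·SEM, and SDC at the group level = individual SDC / √n. The function names and input values in the sketch are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the score standard deviation and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, n=1):
    """SDC at the 95% level; n=1 gives the individual-level value,
    n>1 the group-level value for a group of n participants."""
    return 1.96 * math.sqrt(2.0) * sem / math.sqrt(n)


# Hypothetical inputs: SD of 4 points, ICC of 0.91, group of 100.
sem = standard_error_of_measurement(sd=4.0, icc=0.91)
print(round(sem, 2))
print(round(smallest_detectable_change(sem), 2))
print(round(smallest_detectable_change(sem, n=100), 2))
```

Under these formulas the group-level SDC for 100 participants is one tenth of the individual-level value, which fits the pattern the abstract reports (under 1 point for the group, under 5 points for an individual).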
Kanellakis, Spyridon; Skoufas, Efstathios; Khudokonenko, Vladlena; Apostolidou, Eftychia; Gerakiti, Loukia; Andrioti, Maria-Chrysi; Bountouvi, Evangelia; Manios, Yannis
2017-02-01
To validate anthropometric equations in the current literature predicting body fat percentage (%BF) in the Greek population, to develop and validate two anthropometric equations estimating %BF, and to compare them with the retrieved equations. Anthropometric data from 642 Greek adults were incorporated. Dual-energy X-ray absorptiometry was used as the reference method. The comparison with other equations was made using Bland-Altman analysis, the intraclass correlation coefficient, and Lin's concordance correlation coefficient. Nine of the thirty-one retrieved equations had no statistically significant bias. However, all of them had wide limits of agreement (±8.3 to ±16 %BF). The resulting equations were: %BF = -0.615 - 10.948 × sex + 0.321 × waist circumference + 0.502 × hips circumference - 0.39 × forearm circumference - 19.768 × height (m) and %BF = -27.787 - 5.515 × sex - 8.419 × height + 0.145 × waist circumference + 0.270 × hips circumference + 7.509 × log of thigh skinfold + 20.090 × log of sum of skinfolds (bicep + tricep + suprailiac + subscapular) - 0.445 × forearm circumference. Bland-Altman analysis showed nonsignificant biases of -0.058 and -0.148 %BF and limits of agreement of ±8.100 and ±6.056 %BF; the intraclass correlation coefficient was 0.955 and 0.976; and Lin's concordance correlation coefficient was 0.914 and 0.951, respectively. Literature equations performed moderately on this study's population. Therefore, two equations were designed and validated. The first one was simple and easily applicable, with measures obtained from a measuring tape, and the second one more complicated yet more accurate and reliable. Both were found to be reliable for the assessment of body composition in the Greek population. © 2017 The Obesity Society.
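The agreement statistics named in this abstract can be sketched as follows. This is an illustration under stated assumptions: the limits of agreement are taken as bias ± 1.96 × SD of the differences, Lin's coefficient uses the population-variance (1/n) form, and the data are invented for the example rather than drawn from the study.

```python
import math
from statistics import mean


def bland_altman(pred, ref):
    """Return (bias, half-width of the 95% limits of agreement)."""
    diffs = [p - r for p, r in zip(pred, ref)]
    bias = mean(diffs)
    # Sample standard deviation of the paired differences.
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (len(diffs) - 1))
    return bias, 1.96 * sd


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (1/n variance form)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical %BF values: equation predictions vs. a reference method.
predicted = [20, 25, 30, 35]
reference = [21, 24, 31, 34]
print(bland_altman(predicted, reference), lin_ccc(predicted, reference))
```

Unlike a plain Pearson correlation, the concordance coefficient is penalised by any shift in mean or scale between the two methods, which is why it is commonly reported next to the limits of agreement.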
Vereecken, Carine Anna; Van Damme, Wendy; Maes, Lea
2005-02-01
This article examines the reliability and construct validity of questions assessing mediating factors of fruit and vegetable consumption among 11- and 12-year-old children (N=207). Internal consistencies were good for most scales, ranging from 0.56 to 0.94. Intraclass correlation coefficients between test and retest were acceptable, ranging from 0.39 to 0.90. Concerning predictive validity, preferences and perceived parental and peer behavior were significantly associated with fruit and vegetable consumption. Self-efficacy in difficult situations and a variety of available fruit were significantly correlated with fruit consumption, while permissive eating practices and obligation rules were significantly correlated with vegetable consumption. General attitudes, outcome expectations, selection efficacy, and encouraging practices were not associated with fruit or vegetable consumption.
Neuro-QoL health-related quality of life measurement system: Validation in Parkinson's disease.
Nowinski, Cindy J; Siderowf, Andrew; Simuni, Tanya; Wortman, Catherine; Moy, Claudia; Cella, David
2016-05-01
Neuro-QoL is a multidimensional patient-reported outcome measurement system assessing aspects of physical, mental, and social health identified by neurology patients and caregivers as important. One of the first neurology-specific patient-reported outcome measure systems created using modern test development methods, Neuro-QoL enables brief, yet precise, assessment and the ability to conduct both PD-specific and cross-disease comparisons. We present results of Neuro-QoL clinical validation using a sample of PD patients. A total of 120 PD patients recruited from academic medical centers were assessed at baseline, 1 week, and 6 months. Assessments included Neuro-QoL and general and PD-specific validity measures. Participants were 62% male and 95% white (average age = 66); H & Y stages were 1 (16%), 2 (61%), 3 (18%), and 4 (5%). Internal consistency and test-retest reliability of Neuro-QoL ranged from Cronbach's alphas = 0.81 to 0.94 with intraclass correlation coefficients = 0.66 to 0.80. Pearson's correlations between Neuro-QoL and legacy measures were generally moderate and in expected directions. UPDRS Part 2 was moderately correlated with Neuro-QoL Upper Extremity and Mobility, respectively (r's = -0.44; -0.59). Parkinson's Disease Questionnaire-39 and Neuro-QoL measures of similar constructs showed strong-to-moderate correlations (r's = 0.70-0.44). Neuro-QoL measures of fatigue, mobility, positive emotion, and emotional/behavioral control showed responsiveness to self-reported change. Neuro-QoL is valid for use in PD clinical research. Reliability for all but two measures is sufficient for group comparisons, with some evidence supporting responsiveness to change. Neuro-QoL possesses characteristics, such as brevity, flexibility in administration, and suitability for cross-disease comparisons, that may be advantageous to users in a variety of settings. © 2016 Movement Disorder Society. © 2016 International Parkinson and Movement Disorder Society.
Tucker, Amy J; Heap, Sarah; Ingram, Jessica; Law, Marron; Wright, Amanda J
2016-04-01
Reproducibility and validity testing of appetite ratings and energy intakes are needed in experimental and natural settings. Eighteen healthy young women ate a standardized breakfast for 8 days. On days 1 and 8, they rated their appetite (Hunger, Fullness, Desire to Eat, Prospective Food Consumption (PFC)) over a 3.5 h period using visual analogue scales, consumed an ad libitum lunch, left the research center, and recorded food intake for the remainder of the day. On days 2-7, participants rated their at-home Hunger at 0 and 30 min post-breakfast and recorded food intake for the day. Total area under the curve (AUC) over the 180 min period before lunch and energy intakes were calculated. Reproducibility of satiety measures between days was evaluated using coefficients of repeatability (CR), coefficients of variation (CV), and intra-class coefficients (ri). Correlation analysis was used to examine validity between satiety measures. AUCs for Hunger, Desire to Eat and PFC (ri = 0.73-0.78), ad libitum energy intakes (ri = 0.81), and total day energy intakes (ri = 0.48) were reproducible; fasted ratings were not. Average AUCs for Hunger, Desire to Eat and PFC, Desire to Eat at nadir, and PFC at fasting, nadir and 180 min were correlated with total day energy intakes (r = 0.50-0.77, P < 0.05), but no ratings were correlated with lunch consumption. At-home Hunger ratings were weakly reproducible but not correlated with reported total energy intakes. Satiety ratings did not concur with next meal intake, but PFC ratings may be useful predictors of intake. Overall, this study adds to the limited satiety research on women and challenges the accepted measures of satiety in an experimental setting. Copyright © 2016 Elsevier Ltd. All rights reserved.
Test-retest and interrater reliability of the functional lower extremity evaluation.
Haitz, Karyn; Shultz, Rebecca; Hodgins, Melissa; Matheson, Gordon O
2014-12-01
Repeated-measures clinical measurement reliability study. To establish the reliability and face validity of the Functional Lower Extremity Evaluation (FLEE). The FLEE is a 45-minute battery of 8 standardized functional performance tests that measures 3 components of lower extremity function: control, power, and endurance. The reliability and normative values for the FLEE in healthy athletes are unknown. A face validity survey for the FLEE was sent to sports medicine personnel to evaluate the level of importance and frequency of clinical usage of each test included in the FLEE. The FLEE was then administered and rated for 40 uninjured athletes. To assess test-retest reliability, each athlete was tested twice, 1 week apart, by the same rater. To assess interrater reliability, 3 raters scored each athlete during 1 of the testing sessions. Intraclass correlation coefficients were used to assess the test-retest and interrater reliability of each of the FLEE tests. In the face validity survey, the FLEE tests were rated as highly important by 58% to 71% of respondents but frequently used by only 26% to 45% of respondents. Interrater reliability intraclass correlation coefficients ranged from 0.83 to 1.00, and test-retest reliability ranged from 0.71 to 0.95. The FLEE tests are considered clinically important for assessing lower extremity function by sports medicine personnel but are underused. The FLEE also is a reliable assessment tool. Future studies are required to determine if use of the FLEE to make return-to-play decisions may reduce reinjury rates.
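The abstract does not say which ICC model was used for the FLEE. As an illustration only, the sketch below computes the one-way random-effects ICC from ANOVA mean squares, for a single rating and for the mean of k ratings. The choice of the one-way model, the function name, and the toy ratings are all assumptions.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC.

    ratings: list of rows, one row per subject, k ratings in each row.
    Returns (ICC for a single rating, ICC for the mean of k ratings).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    average = (ms_between - ms_within) / ms_between
    return single, average


# Hypothetical scores: 3 athletes, each rated twice.
print(icc_oneway([[1, 2], [3, 4], [5, 6]]))
```

Two-way models, which separate rater effects from residual error, are often preferred when the same raters score every subject; the one-way form is shown here because it needs the fewest assumptions about the design.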
Bloemen, Manon A T; de Groot, Janke F; Backx, Frank J G; Westerveld, Rosalyne A; Takken, Tim
2015-05-01
To determine the best test performance and feasibility using a Graded Arm Cranking Test vs a Graded Wheelchair Propulsion Test in young people with spina bifida who use a wheelchair, and to determine the reliability of the best test. Validity and reliability study. Young people with spina bifida who use a wheelchair. Physiological responses were measured during a Graded Arm Cranking Test and a Graded Wheelchair Propulsion Test using a heart rate monitor and calibrated mobile gas analysis system (Cortex Metamax). For validity, peak oxygen uptake (VO2peak) and peak heart rate (HRpeak) were compared using paired t-tests. For reliability, the intra-class correlation coefficients, standard error of measurement, and standard detectable change were calculated. VO2peak and HRpeak were higher during wheelchair propulsion compared with arm cranking (23.1 vs 19.5 ml/kg/min, p = 0.11; 165 vs 150 beats/min, p < 0.05). Reliability of wheelchair propulsion showed high intra-class correlation coefficients (ICCs) for both VO2peak (ICC = 0.93) and HRpeak (ICC = 0.90). This pilot study shows higher HRpeak and a tendency to higher VO2peak in young people with spina bifida who are using a wheelchair when tested during wheelchair propulsion compared with arm cranking. Wheelchair propulsion showed good reliability. We recommend performing a wheelchair propulsion test for aerobic fitness testing in this population.
Sánchez-Sánchez, M M; Sánchez-Izquierdo, R; Sánchez-Muñoz, E I; Martínez-Yegles, I; Fraile-Gamo, M P; Arias-Rivera, S
2014-01-01
The Glasgow coma scale (GCS) is a common tool used for neurological assessment of critically ill patients. Despite its widespread use, the GCS has some limitations, as different observers may sometimes rate the same response differently. To evaluate the interobserver agreement, among intensive care nurses with a minimum of 3 years' experience, both in the overall estimate of GCS and for each of its components. Prospective observational study including 110 neurological and/or neurosurgical patients conducted in a critical care unit of 18 beds, from October 2010 until December 2012. Registered variables: Demographic characteristics, reason for admission, overall GCS and its components. The neurological evaluation was conducted by a minimum of 3 nurses. One of them applied an algorithm and consensual assessment technique, and all independently rated the response to stimuli. Interobserver agreement was measured using the intraclass correlation coefficient (ICC) with a 95% confidence interval (CI). The study was approved by the Ethics Committee for Clinical Trials. The intraclass correlation coefficient (confidence interval) for the scale was: Overall GCS: 0.989 (0.985-0.992); ocular response: 0.981 (0.974-0.986); verbal response: 0.971 (0.960-0.979); motor response: 0.987 (0.982-0.991). In our cohort of patients we observed a high level of consistency in the application of both the overall GCS and each of its components. Copyright © 2013 Elsevier España, S.L. y SEEIUC. All rights reserved.
Lüdtke, Oliver; Marsh, Herbert W; Robitzsch, Alexander; Trautwein, Ulrich
2011-12-01
In multilevel modeling, group-level variables (L2) for assessing contextual effects are frequently generated by aggregating variables from a lower level (L1). A major problem of contextual analyses in the social sciences is that there is no error-free measurement of constructs. In the present article, 2 types of error occurring in multilevel data when estimating contextual effects are distinguished: unreliability that is due to measurement error and unreliability that is due to sampling error. The fact that studies may or may not correct for these 2 types of error can be translated into a 2 × 2 taxonomy of multilevel latent contextual models comprising 4 approaches: an uncorrected approach, partial correction approaches correcting for either measurement or sampling error (but not both), and a full correction approach that adjusts for both sources of error. It is shown mathematically and with simulated data that the uncorrected and partial correction approaches can result in substantially biased estimates of contextual effects, depending on the number of L1 individuals per group, the number of groups, the intraclass correlation, the number of indicators, and the size of the factor loadings. However, the simulation study also shows that partial correction approaches can outperform full correction approaches when the data provide only limited information in terms of the L2 construct (i.e., small number of groups, low intraclass correlation). A real-data application from educational psychology is used to illustrate the different approaches.
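The sampling-error component of unreliability mentioned above depends on the intraclass correlation and on the number of L1 individuals per group. A common way to express it, used here as an assumption and not necessarily in the article's own notation, is the Spearman-Brown form for the reliability of an aggregated group mean: n × ICC / (1 + (n - 1) × ICC).

```python
def group_mean_reliability(icc1: float, n_per_group: int) -> float:
    """Reliability of a group mean built from n_per_group L1 ratings.

    Spearman-Brown step-up of the single-rating ICC; assumes equal
    group sizes (a simplification made for this sketch).
    """
    return n_per_group * icc1 / (1.0 + (n_per_group - 1) * icc1)


# Hypothetical values: a low ICC of 0.10 with 5 vs. 25 individuals per group.
for n in (5, 25):
    print(n, round(group_mean_reliability(0.10, n), 3))
```

With a low intraclass correlation and few individuals per group the aggregated mean is a noisy indicator of the L2 construct, which is the setting where the abstract reports that the data carry only limited information.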
Heritability of Carotid Intima-Media Thickness: A Twin Study
Zhao, Jinying; Cheema, Faiz A.; Bremner, J. Douglas; Goldberg, Jack; Su, Shaoyong; Snieder, Harold; Maisano, Carisa; Jones, Linda; Javed, Farhan; Murrah, Nancy; Le, Ngoc-Anh; Vaccarino, Viola
2008-01-01
Objective To estimate the heritability of carotid intima-media thickness (IMT), a surrogate marker for atherosclerosis, independent of traditional coronary risk factors. Methods and Results We performed a classical twin study of carotid IMT using 98 middle-aged male twin pairs, 58 monozygotic (MZ) and 40 dizygotic (DZ) pairs, from the Vietnam Era Twin Registry. All twins were free of overt cardiovascular disease. Carotid IMT was measured by ultrasound. Bivariate and multivariate analyses were used to determine the association between traditional cardiovascular risk factors and carotid IMT. Intraclass correlation coefficients and genetic modeling techniques were used to determine the relative contributions of genes and environment to the variation in carotid IMT. In our sample, the mean of the maximum carotid IMT was 0.75 ± 0.11. Age, systolic blood pressure and HDL were significantly associated with carotid IMT. The intraclass correlation coefficient for carotid IMT was larger in MZ (0.66; 95% confidence interval [CI], 0.62–0.69) than in DZ twins (0.37; 95% CI, 0.29–0.44), and the unadjusted heritability was 0.69 (95% CI, 0.54–0.79). After adjusting for traditional coronary risk factors, the heritability of carotid IMT was slightly reduced but still of considerable magnitude (0.59; 95% CI, 0.39–0.73). Conclusion Genetic factors have a substantial influence on the variation of carotid IMT. Most of this genetic effect occurs through pathways independent of traditional coronary risk factors. PMID:17825306
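For orientation, the twin correlations reported above can be put through Falconer's approximation, h² = 2 × (r_MZ - r_DZ). This is a back-of-the-envelope sketch and not the genetic modelling the authors used, so its result is not expected to match their model-based heritability of 0.69. The function name and the simple ACE decomposition are assumptions.

```python
def falconer(r_mz: float, r_dz: float):
    """Falconer's approximation from twin intraclass correlations.

    Returns (h2, c2, e2): additive genetic, shared-environment and
    unique-environment shares of variance under a simple ACE model.
    """
    h2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return h2, c2, e2


# MZ and DZ intraclass correlations as reported in the abstract.
h2, c2, e2 = falconer(0.66, 0.37)
print(round(h2, 2), round(c2, 2), round(e2, 2))
```

The approximation gives a heritability of about 0.58, somewhat lower than the unadjusted model-based estimate and close to the adjusted one; the gap reflects the different estimation method rather than an inconsistency in the data.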
Magnetic Resonance Venous Volume Measurements in Peripheral Artery Disease (from ELIMIT).
Kamran, Hassan; Nambi, Vijay; Negi, Smita; Yang, Eric Y; Chen, Changyi; Virani, Salim S; Kougias, Panos; Lumsden, Alan B; Morrisett, Joel D; Ballantyne, Christie M; Brunner, Gerd
2016-11-01
The relation between the arterial and venous systems in patients with impaired lower extremity blood flow remains poorly described. The objective of this secondary analysis of the Effectiveness of Intensive Lipid Modification Medication in Preventing the Progression on Peripheral Artery Disease Trial was to determine the association between femoral vein (FV) volumes and measurements of peripheral artery disease. FV wall, lumen, and total volumes were quantified with fast spin-echo proton density-weighted magnetic resonance imaging scans in 79 patients with peripheral artery disease over 2 years. Reproducibility was excellent for FV total vessel (intraclass correlation coefficient 0.924, confidence interval 0.910 to 0.935) and lumen volumes (intraclass correlation coefficient 0.893, confidence interval 0.873 to 0.910). Baseline superficial femoral artery volumes were directly associated with FV wall (r = 0.46, p <0.0001), lumen (r = 0.42, p = 0.0001), and total volumes (r = 0.46, p <0.0001). The 2-year change in maximum walking time was inversely associated with the 24-month change in FV total volume (r = -0.45, p = 0.03). In conclusion, FV volumes can be measured reliably with fast spin-echo proton density-weighted magnetic resonance imaging, and baseline superficial femoral artery plaque burden is positively associated with FV volumes, whereas the 2-year change in FV volumes and leg function show an inverse relation. Copyright © 2016 Elsevier Inc. All rights reserved.
Yang, Ping-Liang; Wong, David T; Dai, Shuang-Bo; Song, Hai-Bo; Ye, Ling; Liu, Jin; Liu, Bin
2009-05-01
There is no reliable method to monitor renal blood flow intraoperatively. In this study, we evaluated the feasibility and reproducibility of left renal blood flow measurements using transesophageal echocardiography during cardiac surgery. In this prospective noninterventional study, left renal blood flow was measured with transesophageal echocardiography during three time points (pre-, intra-, and postcardiopulmonary bypass) in 60 patients undergoing cardiac surgery. Sonograms from 6 subjects were interpreted by 2 blinded independent assessors at the time of acquisition and 6 mo later. Interobserver and intraobserver reproducibility were quantified by calculating variability and intraclass correlation coefficients. Patients with Doppler angles of >30 degrees (20 of 60 subjects) were eliminated from renal blood flow measurements. Left renal blood flow was successfully measured and analyzed in 36 of 60 (60%) subjects. Both interobserver and intraobserver variability were <10%. Interobserver and intraobserver reproducibility in left renal blood flow measurements were good to excellent (intraclass correlation coefficients 0.604-0.999). Left renal arterial luminal diameter for the pre, intra, and postcardiopulmonary bypass phases, ranged from 3.8 to 4.1 mm, renal arterial velocity from 25 to 35 cm/s, and left renal blood flow from 192 to 299 mL/min. In patients undergoing cardiac surgery, it was feasible in 60% of the subjects to measure left renal blood flow using intraoperative transesophageal echocardiography. The interobserver and intraobserver reproducibility of renal blood flow measurements was good to excellent.
Cañón-Montañez, Wilson; Oróstegui-Arenas, Myriam
2015-01-01
To determine the reliability (internal consistency, inter-rater reproducibility, and level of agreement) of the Spanish-language version of the nursing outcome "Knowledge: cardiac disease management (1830)" in outpatients with heart failure. A reliability study was conducted on 116 outpatients with heart failure. Six indicators of the nursing outcome were operationalized. All participants were assessed simultaneously by two evaluators. Three evaluation periods were defined: initial (at baseline), final (a month later), and follow-up (two months later). Internal consistency was assessed with the Cronbach alpha coefficient, inter-rater reproducibility with the intraclass correlation coefficient of reproducibility or agreement, and level of agreement with the 95% limits of Bland and Altman. Cronbach's alpha was 0.83 (95% CI: 0.77 - 0.89) in the final evaluation, and follow-up values of 0.85 (95% CI: 0.82-0.89) and 0.83 (95% CI: 0.78 - 0.88) were found for the first and second evaluator, respectively. The intraclass correlation coefficient showed values greater than 0.9 in the three evaluation periods in both the random and mixed models. The Bland-Altman 95% limits of agreement were close to zero in the three evaluations performed. The questionnaire operationalized to assess the nursing outcome "Knowledge: cardiac disease management (1830)" in its Spanish version is a reliable method to measure skills and knowledge in outpatients with heart failure in the Colombian context. Copyright © 2015 Elsevier España, S.L.U. All rights reserved.
Properties of a color-changeable chewing gum used to evaluate masticatory performance.
Hama, Yohei; Kanazawa, Manabu; Minakuchi, Shunsuke; Uchida, Tatsuro; Sasaki, Yoshiyuki
2014-04-01
To clarify the basic properties of a color-changeable chewing gum to determine its applicability to evaluations of masticatory performance under different types of dental status. Ten participants with natural dentition aged 26-30 years chewed gum that changes color during several chewing strokes over five repetitions. Changes in color were assessed using a colorimeter, and then L*, a*, and b* values in the CIELAB color system were quantified. Relationships between chewing progression and color changes were assessed using regression analysis and the reliability of color changes was assessed using intraclass correlation coefficients. We then measured 42 dentate participants (age, 22-31 years) and 47 complete denture wearers (age, 44-90 years) to determine the detectability of masticatory performance under two types of dental status. Regression between the number of chewing strokes and the difference between two colors was non-linear. The intraclass correlation coefficients were highest between 60 and 160 chewing strokes. Dentate and edentulous groups significantly differed (Wilcoxon rank sum test) and values were widely distributed within each group. The color of the chewing gum changed over a wide range, which was sufficient to evaluate the masticatory performance of individuals with natural dentition and those with complete dentures. Changes in the color values of the gum reliably reflected masticatory performance. These findings indicate that the color-changeable chewing gum will be useful for evaluating masticatory performance under any dental status. Copyright © 2014 Japan Prosthodontic Society. Published by Elsevier Ltd. All rights reserved.
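The abstract refers to "the difference between two colors" in the CIELAB system without giving a formula. A minimal sketch follows, assuming the CIE76 colour difference, i.e. the Euclidean distance between two (L*, a*, b*) triples; the colour values are invented for the example.

```python
import math


def delta_e_cie76(lab1, lab2):
    """CIE76 colour difference: Euclidean distance in L*a*b* space."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(lab1, lab2)))


# Hypothetical colorimeter readings before and after chewing.
before = (50.0, 10.0, 10.0)
after = (53.0, 14.0, 10.0)
print(delta_e_cie76(before, after))
```

Later colour-difference formulas such as CIEDE2000 weight the three axes differently; which one the study used is not stated in the abstract.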
Watanabe, Shinichiro; Kato, Hiroki; Shimosegawa, Eku; Hatazawa, Jun
2016-03-01
Genetic or environmental influences on cerebral glucose metabolism are unknown. We attempted to reveal these influences in elderly twins by means of (18)F-FDG PET. (18)F-FDG uptake was studied in 40 monozygotic and 18 dizygotic volunteer twin pairs aged 30 y or over. We also created 18 control pairs by pairing age- and sex-matched genetically unrelated subjects from dizygotic and monozygotic pairs. SUV images of the brain were reconstructed and analyzed by voxel-based statistical analysis with automated region-of-interest setting. The (18)F-FDG uptake in each cerebral lobe was semiquantified by taking a ratio of SUVmean in each region of interest to whole-brain SUVaverage. We calculated an intraclass correlation coefficient of SUV ratio in each region of interest for monozygotic and dizygotic pairs. By comparing differences in coefficients between monozygotic and dizygotic pairs, genetic and environmental contributions were estimated. The intraclass correlation coefficient in monozygotic pairs was significantly higher than that in dizygotic pairs in the parietal lobes bilaterally (P < 0.001) and in the left temporal lobe (P < 0.05) but was not significantly different in other lobes. The present study indicated that in the right and left parietal lobes and left temporal lobe, cerebral glucose metabolism is influenced more by genetics than by environment, whereas in other brain regions the influence of environment is dominant. © 2016 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
Content validity and reliability of test of gross motor development in Chilean children
Cano-Cappellacci, Marcelo; Leyton, Fernanda Aleitte; Carreño, Joshua Durán
2016-01-01
ABSTRACT OBJECTIVE To validate a Spanish version of the Test of Gross Motor Development (TGMD-2) for the Chilean population. METHODS Descriptive, cross-sectional, non-experimental validity and reliability study. Four translators, three experts, and 92 Chilean children aged five to 10 years, students at a primary school in Santiago, Chile, participated. In 2013, the Committee of Experts carried out translation, back-translation, and revision processes to determine the translinguistic equivalence and content validity of the test, using the content validity index. In addition, a pilot implementation was carried out to determine the reliability of the test in Spanish, using the intraclass correlation coefficient and the Bland-Altman method. We used the t-test to evaluate whether the results differed significantly when the bat was replaced with a racket. RESULTS We obtained a content validity index higher than 0.80 for language clarity and relevance of the TGMD-2 for children. There were significant differences in the object control subtest when comparing the results with bat and racket. The intraclass correlation coefficient for inter-rater, intra-rater, and test-retest reliability was greater than 0.80 in all cases. CONCLUSIONS The TGMD-2 has appropriate content validity to be applied in the Chilean population. The reliability of this test is within the appropriate parameters and its use could be recommended in this population after the establishment of normative data, setting a further precedent for the validation in other Latin American countries. PMID:26815160
Mehta, Shraddha; Bastero-Caballero, Rowena F; Sun, Yijun; Zhu, Ray; Murphy, Diane K; Hardas, Bhushan; Koch, Gary
2018-04-29
Many published scale validation studies determine inter-rater reliability using the intra-class correlation coefficient (ICC). However, the use of this statistic must consider its advantages, limitations, and applicability. This paper evaluates how interaction of subject distribution, sample size, and levels of rater disagreement affects ICC and provides an approach for obtaining relevant ICC estimates under suboptimal conditions. Simulation results suggest that for a fixed number of subjects, ICC from the convex distribution is smaller than ICC for the uniform distribution, which in turn is smaller than ICC for the concave distribution. The variance component estimates also show that the dissimilarity of ICC among distributions is attributed to the study design (ie, distribution of subjects) component of subject variability and not the scale quality component of rater error variability. The dependency of ICC on the distribution of subjects makes it difficult to compare results across reliability studies. Hence, it is proposed that reliability studies should be designed using a uniform distribution of subjects because of the standardization it provides for representing objective disagreement. In the absence of uniform distribution, a sampling method is proposed to reduce the non-uniformity. In addition, as expected, high levels of disagreement result in low ICC, and when the type of distribution is fixed, any increase in the number of subjects beyond a moderately large specification such as n = 80 does not have a major impact on ICC. Copyright © 2018 John Wiley & Sons, Ltd.
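The dependence of ICC on the subject distribution can be illustrated without simulation. The sketch below assumes the simple variance-ratio definition ICC = subject variance / (subject variance + rater error variance) and compares three hypothetical distributions of true scores on a 5-point scale with the same rater error. The distributions and their names are assumptions for illustration; the abstract does not define its "convex" and "concave" shapes, so no mapping to those labels is implied.

```python
def population_icc(score_probs, error_var):
    """ICC = var(subject) / (var(subject) + error_var).

    score_probs maps each true score to its probability in the sample.
    """
    mean = sum(s * p for s, p in score_probs.items())
    subject_var = sum(p * (s - mean) ** 2 for s, p in score_probs.items())
    return subject_var / (subject_var + error_var)


# Hypothetical subject distributions on a 1-5 scale.
center_heavy = {1: 0.05, 2: 0.20, 3: 0.50, 4: 0.20, 5: 0.05}
uniform = {1: 0.20, 2: 0.20, 3: 0.20, 4: 0.20, 5: 0.20}
extreme_heavy = {1: 0.40, 2: 0.10, 3: 0.00, 4: 0.10, 5: 0.40}

ERROR_VAR = 0.5  # identical rater error in every scenario
for name, dist in [("center-heavy", center_heavy),
                   ("uniform", uniform),
                   ("extreme-heavy", extreme_heavy)]:
    print(name, round(population_icc(dist, ERROR_VAR), 3))
```

Rater error is held fixed, so the differences in ICC come entirely from the spread of the subjects. This is the abstract's point that the study-design component, not scale quality, drives the differences between distributions.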
Normal fetal posterior fossa in MR imaging: new biometric data and possible clinical significance.
Ber, R; Bar-Yosef, O; Hoffmann, C; Shashar, D; Achiron, R; Katorza, E
2015-04-01
Posterior fossa malformations are a common finding in prenatal diagnosis. The objectives of this study are to re-evaluate existing normal MR imaging biometric data of the fetal posterior fossa, suggest and evaluate new parameters, and demonstrate the possible clinical applications of these data. This was a retrospective review of 215 fetal MR imaging examinations with normal findings and 5 examinations of fetuses with a suspected pathologic posterior fossa. Six previously reported parameters and 8 new parameters were measured. Three new parameter ratios were calculated. Interobserver agreement was calculated by using the intraclass correlation coefficient. For measuring each structure, 151-211 MR imaging examinations were selected, resulting in a normal biometry curve according to gestational age for each parameter. Analysis of the ratio parameters showed that vermian lobe ratio and cerebellar hemisphere ratio remain constant with gestational age and that the vermis-to-cisterna magna ratio varies with gestational age. Measurements of the 5 pathologic fetuses are presented on the normal curves. Interobserver agreement was excellent, with the intraclass correlation coefficients of most parameters above 0.9 and only 2 parameters below 0.8. The biometry curves derived from new and existing biometric data and presented in this study may expand and deepen the biometry we use today, while keeping it simple and repeatable. By applying these extensive biometric data on suspected abnormal cases, diagnoses may be confirmed, better classified, or completely altered. © 2015 by American Journal of Neuroradiology.
Singhatanadgige, Weerasak; Kang, Daniel G; Luksanapruksa, Panya; Peters, Colleen; Riew, K Daniel
2016-09-01
Retrospective analysis. To evaluate the correlation and reliability of cervical sagittal alignment parameters obtained from lateral cervical radiographs (XRs) compared with lateral whole-body stereoradiographs (SRs). We evaluated adults with cervical deformity using both lateral XRs and lateral SRs obtained within 1 week of each other between 2010 and 2014. XR and SR images were measured by two independent spine surgeons using the following sagittal alignment parameters: C2-C7 sagittal Cobb angle (SCA), C2-C7 sagittal vertical axis (SVA), C1-C7 translational distance (C1-C7), T1 slope (T1-S), neck tilt (NT), and thoracic inlet angle (TIA). Pearson correlation and paired t test were used for statistical analysis, with intra- and interrater reliability analyzed using intraclass correlation coefficient (ICC). A total of 35 patients were included in the study. We found excellent intrarater reliability for all sagittal alignment parameters in both the XR and SR groups with ICC ranging from 0.799 to 0.994 for XR and 0.791 to 0.995 for SR. Interrater reliability was also excellent for all parameters except NT and TIA, which had fair reliability. We also found excellent correlations between XR and SR measurements for most sagittal alignment parameters; SCA, SVA, and C1-C7 had r > 0.90, and only NT had r < 0.70. There was a significant difference between groups, with SR having lower measurements compared with XR for both SVA (0.68 cm lower, p < 0.001) and C1-C7 (1.02 cm lower, p < 0.001). There were no differences between groups for SCA, T1-S, NT, and TIA. Whole-body stereoradiography appears to be a viable alternative for measuring cervical sagittal alignment parameters compared with standard radiography. XR and SR demonstrated excellent correlation for most sagittal alignment parameters except NT. However, SR had significantly lower average SVA and C1-C7 measurements than XR. The lower radiation exposure using single SR has to be weighed against its higher cost compared with XR.
McCurdy, M; Bellows, A; Deng, D; Leppert, M; Mahone, E; Pritchard, A
2015-01-01
Reliable and valid screening and assessment tools are necessary to identify children at risk for neurodevelopmental disabilities who may require additional services. This study evaluated the test-retest reliability of the Capute Scales in a high-risk sample, hypothesizing adequate reliability across 6- and 12-month intervals. Capute Scales scores (N = 66) were collected via retrospective chart review from a NICU follow-up clinic within a large urban medical center spanning three age-ranges: 12-18, 19-24, and 25-36 months. On average, participants were classified as very low birth weight and premature. Reliability of the Capute Scales was evaluated with intraclass correlation coefficients across length of test-retest interval, age at testing, and degree of neonatal complications. The Capute Scales demonstrated high reliability, regardless of length of test-retest interval (ranging from 6 to 14 months) or age of participant, for all index scores, including overall Developmental Quotient (DQ), language-based skill index (CLAMS) and nonverbal reasoning index (CAT). Linear regressions revealed that greater neonatal risk was related to poorer test-retest reliability; however, reliability coefficients remained strong. The Capute Scales afford clinicians a reliable and valid means of screening and assessing for neurodevelopmental delay within high-risk infant populations.
Public Figure Attacks in the United States, 1995-2015.
Meloy, J Reid; Amman, Molly
2016-09-01
An archival descriptive study of public figure attackers in the United States between 1995 and 2015 was undertaken. Fifty-six incidents were identified, primarily through exhaustive internet searches, composed of 58 attackers and 58 victims. A code book was developed which focused upon victims, offenders, pre-attack behaviors including direct threats, attack characteristics, post-offense and other outcomes, motivations and psychological abstracts. The average interrater agreement for coding of bivariate variables was 0.835 (intraclass correlation coefficient). The three most likely victim categories were politicians, judges, and athletes. Attackers were males, many with a psychiatric disorder, most were grandiose, and most had both a violent and nonviolent criminal history. The known motivations for the attacks were often angry and personal, the most common being dissatisfaction with a judicial or other governmental process (23%). In only one case was the primary motivation to achieve notoriety. Lethality risk during an attack was 55%. Collateral injury or death occurred in 29% of the incidents. Only 5% communicated a direct threat to the target beforehand. The term "publicly intimate figure" is introduced to describe the sociocultural blurring of public and private lives among the targets, and its possible role in some attackers' perceptions and motivations. Copyright © 2016 John Wiley & Sons, Ltd.