Use of Internal Consistency Coefficients for Estimating Reliability of Experimental Tasks Scores
Green, Samuel B.; Yang, Yanyun; Alt, Mary; Brinkley, Shara; Gray, Shelley; Hogan, Tiffany; Cowan, Nelson
2017-01-01
Reliabilities of scores for experimental tasks are likely to differ from one study to another to the extent that the task stimuli change, the number of trials varies, the type of individuals taking the task changes, the administration conditions are altered, or the focal task variable differs. Given reliabilities vary as a function of the design of these tasks and the characteristics of the individuals taking them, making inferences about the reliability of scores in an ongoing study based on reliability estimates from prior studies is precarious. Thus, it would be advantageous to estimate reliability based on data from the ongoing study. We argue that internal consistency estimates of reliability are underutilized for experimental task data and in many applications could provide this information using a single administration of a task. We discuss different methods for computing internal consistency estimates with a generalized coefficient alpha and the conditions under which these estimates are accurate. We illustrate use of these coefficients using data for three different tasks. PMID:26546100
Assessment of the Maximal Split-Half Coefficient to Estimate Reliability
ERIC Educational Resources Information Center
Thompson, Barry L.; Green, Samuel B.; Yang, Yanyun
2010-01-01
The maximal split-half coefficient is computed by calculating all possible split-half reliability estimates for a scale and then choosing the maximal value as the reliability estimate. Osburn compared the maximal split-half coefficient with 10 other internal consistency estimates of reliability and concluded that it yielded the most consistently…
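The procedure described in this abstract (enumerate every split, estimate reliability for each, keep the largest) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, the item count is assumed to be even, each split is assumed to be stepped up with the Spearman-Brown formula, and every half-test correlation is assumed to be greater than -1.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a half-test score has zero variance")
    return sxy / sqrt(sxx * syy)


def split_half_estimates(scores):
    """Map each distinct equal-size split to its Spearman-Brown estimate.

    scores: list of respondents, each a list of item scores (even item count).
    """
    k = len(scores[0])
    if k % 2 != 0:
        raise ValueError("this sketch assumes an even number of items")
    estimates = {}
    # Keep item 0 in the first half so mirror-image splits are counted once.
    for rest in combinations(range(1, k), k // 2 - 1):
        first = (0,) + rest
        second = tuple(i for i in range(k) if i not in first)
        half_a = [sum(row[i] for i in first) for row in scores]
        half_b = [sum(row[i] for i in second) for row in scores]
        r = pearson(half_a, half_b)
        estimates[(first, second)] = 2 * r / (1 + r)  # Spearman-Brown step-up
    return estimates


def maximal_split_half(scores):
    """Largest Spearman-Brown estimate over all equal-size splits."""
    return max(split_half_estimates(scores).values())
```

Because the maximum is taken over many sample estimates, it can capitalize on chance; the number of splits also grows quickly with the number of items, so exhaustive enumeration is only practical for short scales.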
Internal Consistency, Retest Reliability, and their Implications For Personality Scale Validity
McCrae, Robert R.; Kurtz, John E.; Yamagata, Shinji; Terracciano, Antonio
2010-01-01
We examined data (N = 34,108) on the differential reliability and validity of facet scales from the NEO Inventories. We evaluated the extent to which (a) psychometric properties of facet scales are generalizable across ages, cultures, and methods of measurement; and (b) validity criteria are associated with different forms of reliability. Composite estimates of facet scale stability, heritability, and cross-observer validity were broadly generalizable. Two estimates of retest reliability were independent predictors of the three validity criteria; none of three estimates of internal consistency was. Available evidence suggests the same pattern of results for other personality inventories. Internal consistency of scales can be useful as a check on data quality, but appears to be of limited utility for evaluating the potential validity of developed scales, and it should not be used as a substitute for retest reliability. Further research on the nature and determinants of retest reliability is needed. PMID:20435807
Hall, Justin M; Azar, Frederick M; Miller, Robert H; Smith, Richard; Throckmorton, Thomas W
2014-09-01
We compared accuracy and reliability of a traditional method of measurement (most cephalad vertebral spinous process that can be reached by a patient with the extended thumb) to estimates made with the shoulder in abduction to determine if there were differences between the two methods. Six physicians with fellowship training in sports medicine or shoulder surgery estimated measurements in 48 healthy volunteers. Three were randomly chosen to make estimates of both internal rotation measurements for each volunteer. An independent observer made objective measurements on lateral scoliosis films (spinous process method) or with a goniometer (abduction method). Examiners were blinded to objective measurements as well as to previous estimates. Intraclass coefficients for interobserver reliability for the traditional method averaged 0.75, indicating good agreement among observers. The difference in vertebral level estimated by the examiner and the actual radiographic level averaged 1.8 levels. The intraclass coefficient for interobserver reliability for the abduction method averaged 0.81 for all examiners, indicating near-perfect agreement. Confidence intervals indicated that estimates were an average of 8° different from the objective goniometer measurements. Pearson correlation coefficients of intraobserver reliability for the abduction method averaged 0.94, indicating near-perfect agreement within observers. Confidence intervals demonstrated repeated estimates between 5° and 10° of the original. Internal rotation estimates made with the shoulder abducted demonstrated interobserver reliability superior to that of spinous process estimates, and reproducibility was high. On the basis of this finding, we now take glenohumeral internal rotation measurements with the shoulder in abduction and use a goniometer to maximize accuracy and objectivity. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
Loeding, B L; Greenan, J P
1998-12-01
The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.
Influences on and Limitations of Classical Test Theory Reliability Estimates.
ERIC Educational Resources Information Center
Arnold, Margery E.
It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Reliability Generalization of the Psychopathy Checklist Applied in Youthful Samples
ERIC Educational Resources Information Center
Campbell, Justin S.; Pulos, Steven; Hogan, Mike; Murry, Francie
2005-01-01
This study examines the average reliability of Hare Psychopathy Checklists (PCLs) adapted for use in samples of youthful offenders (aged 12 to 21 years). Two forms of reliability are examined: 18 alpha estimates of internal consistency and 18 intraclass correlation (two or more raters) estimates of interrater reliability. The results, an average…
NASA Astrophysics Data System (ADS)
Saini, K. K.; Sehgal, R. K.; Sethi, B. L.
2008-10-01
In this paper, the major reliability estimators are analyzed and their comparative results are discussed. Their strengths and weaknesses are evaluated in this case study. Each of the reliability estimators has certain advantages and disadvantages. Inter-rater reliability is one of the best ways to estimate reliability when your measure is an observation. However, it requires multiple raters or observers. As an alternative, you could look at the correlation of ratings of the same single observer repeated on two different occasions. Each of the reliability estimators will give a different value for reliability. In general, the test-retest and inter-rater reliability estimates will be lower in value than the parallel forms and internal consistency ones because they involve measuring at different times or with different raters. Since reliability estimates are often used in statistical analyses of quasi-experimental designs.
Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length.
ERIC Educational Resources Information Center
Feldt, Leonard S.
2002-01-01
Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…
ERIC Educational Resources Information Center
Saupe, Joe L.; Eimers, Mardy T.
2013-01-01
The purpose of this paper is to explore differences in the reliabilities of cumulative college grade point averages (GPAs), estimated for unweighted and weighted, one-semester, 1-year, 2-year, and 4-year GPAs. Using cumulative GPAs for a freshman class at a major university, we estimate internal consistency (coefficient alpha) reliabilities for…
A Meta-Analysis of Reliability Coefficients in Second Language Research
ERIC Educational Resources Information Center
Plonsky, Luke; Derrick, Deirdre J.
2016-01-01
Ensuring internal validity in quantitative research requires, among other conditions, reliable instrumentation. Unfortunately, however, second language (L2) researchers often fail to report and even more often fail to interpret reliability estimates beyond generic benchmarks for acceptability. As a means to guide interpretations of such estimates,…
Processes and Procedures for Estimating Score Reliability and Precision
ERIC Educational Resources Information Center
Bardhoshi, Gerta; Erford, Bradley T.
2017-01-01
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Reliability of the Raven Coloured Progressive Matrices for Anglo and for Mexican-American Children.
ERIC Educational Resources Information Center
Valencia, Richard R.
1984-01-01
Investigated the internal consistency reliability estimates of the Raven Coloured Progressive Matrices (CPM) for 96 Anglo and Mexican American third-grade boys from low socioeconomic status background. The results showed that the reliability estimates of the CPM for the two ethnic groups were acceptably high and extremely similar in magnitude.…
Ponterotto, Joseph G; Ruckdeschel, Daniel E
2007-12-01
The present article addresses issues in reliability assessment that are often neglected in psychological research such as acceptable levels of internal consistency for research purposes, factors affecting the magnitude of coefficient alpha (alpha), and considerations for interpreting alpha within the research context. A new reliability matrix anchored in classical test theory is introduced to help researchers judge adequacy of internal consistency coefficients with research measures. Guidelines and cautions in applying the matrix are provided.
Advanced Relay Design and Technology for Energy-Efficient Electronics
2011-07-07
Estimates and Unique Failure Mechanisms of the Digital Micromirror Device (DMD),” in Proceedings of the IEEE Annual International Reliability Physics...Symposium (IRPS ), pp. 9-16, March 1998. [18] A. B. Sontheimer, “Digital Micromirror Device (DMD) Hinge Memory Lifetime Reliability Modeling,” in...Mechanisms of the Digital Micromirror Device (DMD),” in Proceedings of the IEEE Annual International Reliability Physics Symposium (IRPS ), pp. 9-16
Edouard, Pascal; Junge, Astrid; Kiss-Polauf, Marianna; Ramirez, Christophe; Sousa, Monica; Timpka, Toomas; Branco, Pedro
2018-03-01
The quality of epidemiological injury data depends on the reliability of reporting to an injury surveillance system. Ascertaining whether all physicians/physiotherapists report the same information for the same injury case is of major interest to determine data validity. The aim of this study was therefore to analyse the data collection reliability through the analysis of the interrater reliability. Cross-sectional survey. During the 2016 European Athletics Advanced Athletics Medicine Course in Amsterdam, all national medical teams were asked to complete seven virtual case reports on a standardised injury report form using the same definitions and classifications of injuries as the international athletics championships injury surveillance protocol. The completeness of data and the Fleiss' kappa coefficients for the inter-rater reliability were calculated for: sex, age, event, circumstance, location, type, assumed cause and estimated time-loss. Forty-one team physicians and physiotherapists of national medical teams participated in the study (response rate 89.1%). Data completeness was 96.9%. The Fleiss' kappa coefficients were: almost perfect for sex (k=1), injury location (k=0.991), event (k=0.953), circumstance (k=0.942), and age (k=0.870), moderate for type (k=0.507), fair for assumed cause (k=0.394), and poor for estimated time-loss (k=0.155). The injury surveillance system used during international athletics championships provided reliable data for "sex", "location", "event", "circumstance", and "age". More caution should be taken for "assumed cause" and "type", and even more for "estimated time-loss". This injury surveillance system displays satisfactory data quality (reliable data and high data completeness), and thus, can be recommended as a tool to collect epidemiology information on injuries during international athletics championships. Copyright © 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
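For readers unfamiliar with the statistic reported above, the following is a minimal sketch of Fleiss' kappa. It is not the study's analysis code: the function name and the input layout (a cases-by-categories table of rater counts) are assumptions made for illustration, and the sketch assumes every case received the same number of ratings, which incomplete forms would violate.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for counts[i][j] = raters who put case i in category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number (>= 2) of ratings")

    # Observed agreement: mean proportion of agreeing rater pairs per case.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_cases

    # Chance agreement, from the overall proportion of ratings per category.
    n_cats = len(counts[0])
    props = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_cats)
    ]
    p_exp = sum(p * p for p in props)
    if p_exp == 1:
        raise ValueError("kappa is undefined when only one category is ever used")

    return (p_obs - p_exp) / (1 - p_exp)
```

A value of 1 means all raters agreed on every case, values near 0 mean agreement no better than chance, and negative values mean agreement below chance.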
Reliability Generalization of Scores on the Spielberger State-Trait Anxiety Inventory.
ERIC Educational Resources Information Center
Barnes, Laura L. B.; Harp, Diane; Jung, Woo Sik
2002-01-01
Conducted a reliability generalization study for the State-Trait Anxiety Inventory (C. Spielberger, 1983) by reviewing and classifying 816 research articles. Average reliability coefficients were acceptable for both internal consistency and test-retest reliability, but variation was present among the estimates. Other differences are discussed.…
ERIC Educational Resources Information Center
Lucas, Richard E.; Donnellan, M. Brent
2012-01-01
Life satisfaction is often assessed using single-item measures. However, estimating the reliability of these measures can be difficult because internal consistency coefficients cannot be calculated. Existing approaches use longitudinal data to isolate occasion-specific variance from variance that is either completely stable or variance that…
The Riso-Hudson Enneagram Type Indicator: Estimates of Reliability and Validity
ERIC Educational Resources Information Center
Newgent, Rebecca A.; Parr, Patricia E.; Newman, Isadore; Higgins, Kristin K.
2004-01-01
This investigation was conducted to estimate the reliability and validity of scores on the Riso-Hudson Enneagram Type Indicator (D. R. Riso & R. Hudson, 1999a). Results of 287 participants were analyzed. Alpha suggests an adequate degree of internal consistency. Evidence provides mixed support for construct validity using correlational and…
ERIC Educational Resources Information Center
Wei, Meifen; Alvarez, Alvin N.; Ku, Tsun-Yao; Russell, Daniel W.; Bonett, Douglas G.
2010-01-01
Four studies were conducted to develop and validate the Coping With Discrimination Scale (CDS). In Study 1, an exploratory factor analysis (N = 328) identified 5 factors: Education/Advocacy, Internalization, Drug and Alcohol Use, Resistance, and Detachment, with internal consistency reliability estimates ranging from 0.72 to 0.90. In Study 2, a…
Reliability of the ecSatter Inventory as a tool to measure eating competence.
Stotts, Jodi L; Lohse, Barbara
2007-01-01
To examine the reliability of the ecSatter Inventory (ecSI), a measure of eating competence. Self-report questionnaires were administered in person or by mail. Retesting occurred 2 to 6 weeks after completion of the first questionnaire. Both administrations of the questionnaire were completed by 259 participants who were mostly food secure, white females with some college education; mean age was 26.9 +/- 10.4 years. Test-retest reliability and internal consistency. Spearman's rank correlation coefficients to estimate test-retest reliability and Cronbach alpha coefficients to estimate internal consistency. Spearman's rank correlation coefficient for ecSI total score was 0.68; subscale coefficients were 0.70 for eating attitudes, 0.70 for contextual skills, 0.65 for food acceptance, and 0.52 for internal regulation. Cronbach alpha coefficient for ecSI total score was 0.77. Subscale alpha coefficients were 0.80 for eating attitudes, 0.69 for contextual skills, 0.68 for food acceptance, and 0.66 for internal regulation. This study provides psychometric evidence about the reliability of ecSI as a measure of eating competence in this sample. Although some ecSI items may require revision, results suggest that the instrument may be used to evaluate nutrition education designed to improve eating competence.
Measuring eating disorder attitudes and behaviors: a reliability generalization study
2014-01-01
Background: Although score reliability is a sample-dependent characteristic, researchers often only report reliability estimates from previous studies as justification for employing particular questionnaires in their research. The present study followed reliability generalization procedures to determine the mean score reliability of the Eating Disorder Inventory and its most commonly employed subscales (Drive for Thinness, Bulimia, and Body Dissatisfaction) and the Eating Attitudes Test as a way to better identify those characteristics that might impact score reliability. Methods: Published studies that used these measures were coded based on their reporting of reliability information and additional study characteristics that might influence score reliability. Results: Score reliability estimates were included in 26.15% of studies using the EDI and 36.28% of studies using the EAT. Mean Cronbach’s alphas for the EDI (total score = .91; subscales = .75 to .89), EAT-40 (total score = .81) and EAT-26 (total score = .86; subscales = .56 to .80) suggested variability in estimated internal consistency. Whereas some EDI subscales exhibited higher score reliability in clinical eating disorder samples than in nonclinical samples, other subscales did not exhibit these differences. Score reliability information for the EAT was primarily reported for nonclinical samples, making it difficult to characterize the effect of type of sample on these measures. However, there was a tendency for mean score reliability to be higher in the adult (vs. adolescent) samples and in female (vs. male) samples. Conclusions: Overall, this study highlights the importance of assessing and reporting internal consistency during every test administration because reliability is affected by characteristics of the participants being examined. PMID:24764530
Score Reliability of Adolescent Alcohol Screening Measures: A Meta-Analytic Inquiry
ERIC Educational Resources Information Center
Shields, Alan L.; Campfield, Delia C.; Miller, Christopher S.; Howell, Ryan T.; Wallace, Kimberly; Weiss, Roger D.
2008-01-01
This study describes the reliability reporting practices in empirical studies using eight adolescent alcohol screening tools and characterizes and explores variability in internal consistency estimates across samples. Of 119 observed administrations of these instruments, 40 (34%) reported usable reliability information. The Personal Experience…
Coefficient Alpha and Reliability of Scale Scores
ERIC Educational Resources Information Center
Almehrizi, Rashid S.
2013-01-01
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Reliability and Validity of the Evidence-Based Practice Confidence (EPIC) Scale
ERIC Educational Resources Information Center
Salbach, Nancy M.; Jaglal, Susan B.; Williams, Jack I.
2013-01-01
Introduction: The reliability, minimal detectable change (MDC), and construct validity of the evidence-based practice confidence (EPIC) scale were evaluated among physical therapists (PTs) in clinical practice. Methods: A longitudinal mail survey was conducted. Internal consistency and test-retest reliability were estimated using Cronbach's alpha…
Clayson, Peter E; Miller, Gregory A
2017-01-01
Failing to consider psychometric issues related to reliability and validity, differential deficits, and statistical power potentially undermines the conclusions of a study. In research using event-related brain potentials (ERPs), numerous contextual factors (population sampled, task, data recording, analysis pipeline, etc.) can impact the reliability of ERP scores. The present review considers the contextual factors that influence ERP score reliability and the downstream effects that reliability has on statistical analyses. Given the context-dependent nature of ERPs, it is recommended that ERP score reliability be formally assessed on a study-by-study basis. Recommended guidelines for ERP studies include 1) reporting the threshold of acceptable reliability and reliability estimates for observed scores, 2) specifying the approach used to estimate reliability, and 3) justifying how trial-count minima were chosen. A reliability threshold for internal consistency of at least 0.70 is recommended, and a threshold of 0.80 is preferred. The review also advocates the use of generalizability theory for estimating score dependability (the generalizability theory analog to reliability) as an improvement on classical test theory reliability estimates, suggesting that the latter is less well suited to ERP research. To facilitate the calculation and reporting of dependability estimates, an open-source Matlab program, the ERP Reliability Analysis Toolbox, is presented. Copyright © 2016 Elsevier B.V. All rights reserved.
Lim, Chun Yi; Law, Mary; Khetani, Mary; Rosenbaum, Peter; Pollock, Nancy
2018-08-01
To estimate the psychometric properties of a culturally adapted version of the Young Children's Participation and Environment Measure (YC-PEM) for use among Singaporean families. This is a prospective cohort study. Caregivers of 151 Singaporean children with (n = 83) and without (n = 68) developmental disabilities, between 0 and 7 years, completed the YC-PEM (Singapore) questionnaire with 3 participation scales (frequency, involvement, and change desired) and 1 environment scale for three settings: home, childcare/preschool, and community. Setting-specific estimates of internal consistency, test-retest reliability, and construct validity were obtained. Internal consistency estimates varied from .59 to .92 for the participation scales and .73 to .79 for the environment scale. Test-retest reliability estimates from the YC-PEM conducted on two occasions, 2-3 weeks apart, varied from .39 to .89 for the participation scales and from .65 to .80 for the environment scale. Moderate to large differences were found in participation and perceived environmental support between children with and without a disability. YC-PEM (Singapore) scales have adequate psychometric properties except for low internal consistency for the childcare/preschool participation frequency scale and low test-retest reliability for home participation frequency scale. The YC-PEM (Singapore) may be used for population-level studies involving young children with and without developmental disabilities.
Vatan, Sevginar; Ertaş, Sedar; Lester, David
2011-04-01
In a sample of 100 Turkish psychiatric patients with diagnoses of anxiety disorders, Lester's Helplessness, Hopelessness, and Haplessness inventory had moderate estimates of internal consistency, test-retest reliability, and construct validity.
Sun, Wei; Chou, Chih-Ping; Stacy, Alan W; Ma, Huiyan; Unger, Jennifer; Gallaher, Peggy
2007-02-01
Cronbach's α is widely used in social science research to estimate the internal consistency reliability of a measurement scale. However, when items are not strictly parallel, the Cronbach's α coefficient provides a lower-bound estimate of true reliability, and this estimate may be further biased downward when items are dichotomous. The estimation of standardized Cronbach's α for a scale with dichotomous items can be improved by using the upper bound of coefficient phi. SAS and SPSS macros have been developed in this article to obtain standardized Cronbach's α via this method. The simulation analysis showed that Cronbach's α from upper-bound phi might be appropriate for estimating the real reliability when standardized Cronbach's α is problematic.
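As a point of reference for the coefficient discussed in this abstract, here is a minimal sketch of ordinary (raw-score) Cronbach's α computed from item and total-score variances. It does not implement the upper-bound phi correction or the SAS and SPSS macros the article describes; the function name and the respondents-by-items input layout are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Raw Cronbach's alpha for scores[i][j] = respondent i's score on item j."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When every item is scored 0 or 1, this quantity coincides with KR-20, which is the dichotomous case the abstract flags as prone to downward bias.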
D'Agostino, Fabio; Barbaranelli, Claudio; Paans, Wolter; Belsito, Romina; Juarez Vela, Raul; Alvaro, Rosaria; Vellone, Ercole
2017-07-01
To evaluate the psychometric properties of the D-Catch instrument. A cross-sectional methodological study. Validity and reliability were estimated with confirmatory factor analysis (CFA) and internal consistency and inter-rater reliability, respectively. A sample of 250 nursing documentations was selected. CFA showed the adequacy of a 1-factor model (chronologically descriptive accuracy) with an outlier item (nursing diagnosis accuracy). Internal consistency and inter-rater reliability were adequate. The D-Catch is a valid and reliable instrument for measuring the accuracy of nursing documentation. Caution is needed when measuring diagnostic accuracy since only one item measures this dimension. The D-Catch can be used as an indicator of the accuracy of nursing documentation and the quality of nursing care. © 2015 NANDA International, Inc.
Training and Maintaining System-Wide Reliability in Outcome Management.
Barwick, Melanie A; Urajnik, Diana J; Moore, Julia E
2014-01-01
The Child and Adolescent Functional Assessment Scale (CAFAS) is widely used for outcome management, for providing real time client and program level data, and the monitoring of evidence-based practices. Methods of reliability training and the assessment of rater drift are critical for service decision-making within organizations and systems of care. We assessed two approaches for CAFAS training: external technical assistance and internal technical assistance. To this end, we sampled 315 practitioners trained by external technical assistance approach from 2,344 Ontario practitioners who had achieved reliability on the CAFAS. To assess the internal technical assistance approach as a reliable alternative training method, 140 practitioners trained internally were selected from the same pool of certified raters. Reliabilities were high for both practitioners trained by external technical assistance and internal technical assistance approaches (.909-.995, .915-.997, respectively). One- and 3-year estimates showed some drift on several scales. High and consistent reliabilities over time and training method have implications for CAFAS training of behavioral health care practitioners, and the maintenance of CAFAS as a global outcome management tool in systems of care.
[Validating the Spanish version of the Nursing Activities Score].
Sánchez-Sánchez, M M; Arias-Rivera, S; Fraile-Gamo, M P; Thuissard-Vasallo, I J; Frutos-Vivar, F
2015-01-01
Validating workload scores ensures that they are appropriate for the purpose for which they were developed. To validate the Nursing Activities Score (NAS) Spanish version. Observational and prospective study. 1,045 patients who were admitted to a medical-surgical unit and a serious burns unit in 2006 were included. The nurse in charge assessed patient workloads by Nine Equivalent of Nursing Manpower use Score and NAS. To assess the internal consistency of the measurements of NAS, item-test correlations, Cronbach's α and Cronbach's α corrected by omitting each of the items were calculated. The intraobserver and interobserver reliability were assessed with the intraclass correlation coefficient by viewing recordings and Kappa (interobserver reliability) was estimated. For the analysis of internal validity, a factorial principal components analysis was performed. Convergent validity was assessed using the Spearman correlation coefficient values obtained from the Nine Equivalent of Nursing Manpower use Score and Spanish-NAS scales. For internal consistency, 164 questionnaires were analysed and a Cronbach's α of 0.373 was calculated. The intraclass correlation coefficient for intraobserver reliability estimate was 0.837 (95% CI: 0.466-0.950) and 0.662 (95% CI: 0.033-0.882) for interobserver reliability. The estimated kappa was 0.371. For internal validity, exploratory factor analysis showed that the first item explained 58.9% of the variance of the questionnaire. For convergent validity 1006 questionnaires were included and a Spearman correlation coefficient of 0.746 was observed. The psychometric properties of Spanish-NAS are acceptable. Copyright © 2014 Elsevier España, S.L.U. y SEEIUC. All rights reserved.
Silva, Wanderson Roberto; Costa, David; Pimenta, Filipa; Maroco, João; Campos, Juliana Alvares Duarte Bonini
2016-07-21
The objectives of this study were to develop a unified Portuguese-language version, for use in Brazil and Portugal, of the Body Shape Questionnaire (BSQ) and to estimate its validity, reliability, and internal consistency in Brazilian and Portuguese female university students. Confirmatory factor analysis was performed using both original (34-item) and shortened (8-item) versions. The model's fit was assessed with χ²/df, CFI, NFI, and RMSEA. Concurrent and convergent validity were assessed. Reliability was estimated through internal consistency and composite reliability (α). Transnational invariance of the BSQ was tested using multi-group analysis. The original 32-item model was refined to present a better fit and adequate validity and reliability. The shortened model was stable in both independent samples and in transnational samples (Brazil and Portugal). The use of this unified version is recommended for the assessment of body shape concerns in both Brazilian and Portuguese college students.
An Improved Internal Consistency Reliability Estimate.
ERIC Educational Resources Information Center
Cliff, Norman
1984-01-01
The proposed coefficient is derived by assuming that the average Goodman-Kruskal gamma between items of identical difficulty would be the same for items of different difficulty. An estimate of covariance between items of identical difficulty leads to an estimate of the correlation between two tests with identical distributions of difficulty.…
Development and initial validation of the internalization of Asian American stereotypes scale.
Shen, Frances C; Wang, Yu-Wei; Swanson, Jane L
2011-07-01
This research consists of four studies on the initial reliability and validity of the Internalization of Asian American Stereotypes Scale (IAASS), a self-report instrument that measures the degree Asian Americans have internalized racial stereotypes about their own group. The results from the exploratory and confirmatory factor analyses support a stable four-factor structure of the IAASS: Difficulties with English Language Communication, Pursuit of Prestigious Careers, Emotional Reservation, and Expected Academic Success. Evidence for concurrent and discriminant validity is presented. High internal-consistency and test-retest reliability estimates are reported. A discussion of how this scale can contribute to research and practice regarding internalized stereotyping among Asian Americans is provided.
Lee, Chin-Pang; Chiu, Yu-Wen; Chu, Chun-Lin; Chen, Yu; Jiang, Kun-Hao; Chen, Jiun-Liang; Chen, Ching-Yen
2016-12-01
The aging males' symptoms (AMS) scale is an instrument used to determine the health-related quality of life in adult and elderly men. The purpose of this study was to synthesize internal consistency (Cronbach's alpha) and test-retest reliability for the AMS scale and its three subscales. Of the 123 studies reviewed, 12 provided alpha coefficients which were then used in the meta-analyses of internal consistency. Seven of the 12 included studies provided test-retest coefficients, and these were used in the meta-analyses of test-retest reliability. The AMS scale had excellent internal consistency [α = 0.89 (95% CI 0.88-0.90)]; the mean alpha estimates across the AMS subscales ranged from 0.79 to 0.82. The AMS scale also had good test-retest reliability [r = 0.85 (95% CI 0.82-0.88)]; the test-retest reliability coefficients of the AMS subscales ranged from 0.76 to 0.83. There was significant heterogeneity among the included studies. The AMS scale and the three subscales had fairly good internal consistency and test-retest reliability. Future psychometric studies of the AMS scale should report important characteristics of the participants, details of item scores, and test-retest reliability.
Study samples are too small to produce sufficiently precise reliability coefficients.
Charter, Richard A
2003-04-01
In a survey of journal articles, test manuals, and test critique books, the author found that a mean sample size (N) of 260 participants had been used for reliability studies on 742 tests. The distribution was skewed because the median sample size for the total sample was only 90. The median sample sizes for the internal consistency, retest, and interjudge reliabilities were 182, 64, and 36, respectively. The author presented sample size statistics for the various internal consistency methods and types of tests. In general, the author found that the sample sizes that were used in the internal consistency studies were too small to produce sufficiently precise reliability coefficients, which in turn could cause imprecise estimates of examinee true-score confidence intervals. The results also suggest that larger sample sizes have been used in the last decade compared with those that were used in earlier decades.
The Reliability of the OWLS Written Expression Scale with ESL Kindergarten Students
ERIC Educational Resources Information Center
Harrison, Gina L.; Ogle, Keira C.; Keilty, Megan
2011-01-01
A reliability analysis was conducted on the Written Expression Scale from the Oral and Written Language Scales (OWLS; Carrow-Woolfolk, 1996), with 68 ESL and 56 non-ESL kindergarten students. Interrater and internal consistency estimates for the Written Expression Scale were examined separately for each language group. Despite lower oral English…
Changes in School Climate in a Long-Term Perspective
ERIC Educational Resources Information Center
Kallestad, Jan Helge
2010-01-01
In a previous report five school climate instruments were explored (1983 and 1985), and four scales were regarded as meaningful climate measures according to suggested criteria. These scales were re-inspected in the present study (1997 and 1998) by analyses of internal consistency, estimates of reliability (unit and aggregated reliability), and…
Reliability and validity of the McDonald Play Inventory.
McDonald, Ann E; Vigen, Cheryl
2012-01-01
This study examined the ability of a two-part self-report instrument, the McDonald Play Inventory, to reliably and validly measure the play activities and play styles of 7- to 11-yr-old children and to discriminate between the play of neurotypical children and children with known learning and developmental disabilities. A total of 124 children ages 7-11 recruited from a sample of convenience and a subsample of 17 parents participated in this study. Reliability estimates yielded moderate correlations for internal consistency, total test intercorrelations, and test-retest reliability. Validity estimates were established for content and construct validity. The results suggest that a self-report instrument yields reliable and valid measures of a child's perceived play performance and discriminates between the play of children with and without disabilities. Copyright © 2012 by the American Occupational Therapy Association, Inc.
[Estimators of internal consistency in health research: the use of the alpha coefficient].
da Silva, Franciele Cascaes; Gonçalves, Elizandra; Arancibia, Beatriz Angélica Valdivia; Bento, Gisele Graziele; Castro, Thiago Luis da Silva; Hernandez, Salma Stephany Soleman; da Silva, Rudney
2015-01-01
Academic production has increased in the area of health, increasingly demanding high quality in publications of great impact. One of the ways to consider quality is through methods that increase the consistency of data analysis, such as reliability which, depending on the type of data, can be evaluated by different coefficients, especially the alpha coefficient. Based on this, the present review systematically gathers scientific articles produced in the last five years that made methodological, psychometric use of the α coefficient as an estimator of internal consistency and reliability in the processes of construction, adaptation and validation of instruments. The identification of the studies was conducted systematically in the databases BioMed Central Journals, Web of Science, Wiley Online Library, Medline, SciELO, Scopus, Journals@Ovid, BMJ and Springer, using inclusion and exclusion criteria. Data analyses were performed by means of triangulation, content analysis and descriptive analysis. It was found that most studies were conducted in Iran (f=3), Spain (f=2) and Brazil (f=2). These studies aimed to test the psychometric properties of instruments, with eight studies using the α coefficient to assess reliability and nine for assessing internal consistency. All studies were classified as methodological research when their objectives were analyzed. In addition, four studies were also classified as correlational and one as descriptive-correlational. It can be concluded that though the α coefficient is widely used as one of the main parameters for assessing internal consistency of questionnaires in health sciences, its use as an estimator of the trustworthiness of the methodology used and of internal consistency is subject to some criticisms that should be considered.
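As a concrete reference for the α coefficient discussed in the review above, the following is a minimal sketch of the standard Cronbach's alpha formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are illustrative assumptions, not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same k items")

    # Transpose so that each tuple holds all responses to one item.
    items = list(zip(*scores))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy data: 4 respondents, 2 items.
toy = [[2, 3], [4, 4], [5, 6], [7, 7]]
print(round(cronbach_alpha(toy), 3))
```

Because alpha depends only on the item variances and the total-score variance, it can be computed from a single administration, which is why it is so widely reported; the criticisms noted in the review concern how that number is interpreted.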
ERIC Educational Resources Information Center
Hatami, Gissou; Motamed, Niloofar; Ashrafzadeh, Mahshid
2010-01-01
The validity and reliability of a Persian adaptation of the MSLSS were checked in middle and high school students aged 12-18 years (430 students in grades 6-12 in Bushehr port, Iran) using confirmatory factor analysis with the LISREL statistical package. Internal consistency reliability estimates (Cronbach's coefficient [alpha]) were all above the…
A Validation of the Ski Hi Language Development Scale.
ERIC Educational Resources Information Center
Tonelson, Stephen W.
The purpose of the study was to assess the reliability and the validity of the Ski Hi Language Development Scale which was designed to determine the receptive and the expressive language levels of hearing impaired children from birth to age 5. The reliability of the instrument was estimated through: (1) internal consistency, (2) inter-rater…
[KON-2006--Neurotic Personality Questionnaire].
Aleksandrowicz, Jerzy W; Klasa, Katarzyna; Sobański, Jerzy A; Stolarska, Dorota
2007-01-01
Construction of a questionnaire describing personality traits connected to the occurrence and persistence of neurotic disorders. Responses of 794 patients (before treatment) and 520 persons from the control group on items of the constructed personality questionnaire and the symptom checklist "0". Analyses of subscales reliability and item-scale correlations, test-retest and split-half reliability. Factor analyses estimating internal reliability of the questionnaire. Cross-validation with the KO"0" symptom checklist. Psychometric properties of KON-2006 questionnaire indicate that it is consistent and reliable enough. Validity analyses indicate a large probability that the X-KON coefficient informs on personality dysfunctions related to neurotic disorders. The Neurotic Personality Questionnaire KON-2006 may serve to estimate personality traits connected to the occurrence and persistence of neurotic disorders as well as changes resulting from psychotherapy.
Developing a Danish version of the "Impact on Participation and Autonomy Questionnaire".
Ghaziani, Emma; Krogh, Anne Grethe; Lund, Hans
2013-05-01
To translate the "Impact on Participation and Autonomy Questionnaire" into Danish (IPAQ-DK), and estimate its internal consistency and test-retest reliability in order to promote participation-based interventions and research. Translation and two successive reliability assessments through test-retest. 137 adults with varying degrees of impairment; of these, 67 participated in the final reliability assessment. The translation followed guidelines set forth by the "European Group for Quality of Life Assessment and Health Measurement". Internal consistency for subscales was estimated by Cronbach's alpha. Weighted kappa coefficients and intraclass correlation coefficients were calculated to assess the test-retest reliability at item and subscale level, respectively. A preliminary reliability assessment revealed residual issues regarding the translation and cultural adaptation of the instrument. The revised version (IPAQ-DK) was subsequently subjected to a similar assessment demonstrating Cronbach's alpha values from 0.698 to 0.817. Weighted kappa ranged from 0.370 to 0.880; 78% of these values were higher than 0.600. The intraclass correlation coefficient covered values from 0.701 to 0.818. IPAQ-DK is a useful instrument for identifying person-perceived participation restrictions and satisfaction with participation. Further studies of IPAQ-DK's floor/ceiling effects and responsiveness to change are recommended, and whether there is a need for further linguistic improvement of certain items.
Pailian, Hrag; Halberda, Justin
2015-04-01
We investigated the psychometric properties of the one-shot change detection task for estimating visual working memory (VWM) storage capacity, and also introduced and tested an alternative flicker change detection task for estimating these limits. In three experiments, we found that the one-shot whole-display task returns estimates of VWM storage capacity (K) that are unreliable across set sizes, suggesting that the whole-display task is measuring different things at different set sizes. In two additional experiments, we found that the one-shot single-probe variant shows improvements in the reliability and consistency of K estimates. In another additional experiment, we found that a one-shot whole-display-with-click task (requiring target localization) also showed improvements in reliability and consistency. The latter results suggest that the one-shot task can return reliable and consistent estimates of VWM storage capacity (K), and they highlight the possibility that the requirement to localize the changed target is what engenders this enhancement. Through a final series of four experiments, we introduced and tested an alternative flicker change detection method that also requires the observer to localize the changing target and that generates, from response times, an estimate of VWM storage capacity (K). We found that estimates of K from the flicker task correlated with estimates from the traditional one-shot task and also had high reliability and consistency. We highlight the flicker method's ability to estimate executive functions as well as VWM storage capacity, and discuss the potential for measuring multiple abilities with the one-shot and flicker tasks.
On the Use, the Misuse, and the Very Limited Usefulness of Cronbach's Alpha
ERIC Educational Resources Information Center
Sijtsma, Klaas
2009-01-01
This discussion paper argues that both the use of Cronbach's alpha as a reliability estimate and as a measure of internal consistency suffer from major problems. First, alpha always has a value, which cannot be equal to the test score's reliability given the inter-item covariance matrix and the usual assumptions about measurement error. Second, in…
Gottschalk, Hilton P; Bastrom, Tracey P; Edmonds, Eric W
2013-01-01
Standard elbow radiographs (AP and lateral views) are not accurate enough to measure true displacement of medial epicondyle fractures of the humerus. The amount of perceived displacement has been used to determine treatment options. This study assesses the utility of internal oblique radiographs for measurement of true displacement in these fractures. A medial epicondyle fracture was created in a cadaveric specimen. Displacement of the fragment (mm) was set at 5, 10, and 15 in line with the vector of the flexor pronator mass. The fragment was sutured temporarily in place. Radiographs were obtained at 0 (AP), 15, 30, 45, 60, 75, and 90 degrees (lateral) of internal rotation, with the elbow in set positions of flexion. This was done with and without radio-opaque markers placed on the fragment and fracture bed. The 45 and 60 degrees internal oblique radiographs were then presented to 5 separate reviewers (of different levels of training) to evaluate intraobserver and interobserver agreement. Change in elbow position did not affect the perceived displacement (P=0.82) with excellent intraobserver reliability (intraclass correlation coefficient range, 0.979 to 0.988) and interobserver agreement of 0.953. The intraclass correlation coefficient for intraobserver reliability on 45 degrees internal oblique films for all groups ranged from 0.985 to 0.998, with interobserver agreement of 0.953. For predicting displacement, the observers were 60% accurate in predicting the true displacement on the 45 degrees internal oblique films and only 35% accurate using the 60 degrees internal oblique view. Standardizing to a 45 degrees internal oblique radiograph of the elbow (regardless of elbow flexion) can augment the treating surgeon's ability to determine true displacement. At this degree of rotation, the measured number can be multiplied by 1.4 to better estimate displacement. 
The addition of a 45 degrees internal oblique radiograph in medial humeral epicondyle fractures has good intraobserver and interobserver reliability to more accurately estimate the true displacement of these fractures. Diagnostic study, Level II (Development of diagnostic study with universally applied reference "gold" standard).
Reliability of the Cooking Task in adults with acquired brain injury.
Poncet, Frédérique; Swaine, Bonnie; Taillefer, Chantal; Lamoureux, Julie; Pradat-Diehl, Pascale; Chevignard, Mathilde
2015-01-01
Acquired brain injury (ABI) often leads to deficits in executive functioning (EF) responsible for severe and long-standing disabilities in daily life activities. The Cooking Task is an ecological and valid test of EF involving multi-tasking in a real environment. Given its complex scoring system, it is important to establish the tool's reliability. The objective of the study was to examine the reliability of the Cooking Task (internal consistency, inter-rater and test-retest reliability). A total of 160 patients with ABI (113 men, mean age 37 years, SD = 14.3) were tested using the Cooking Task. For test-retest reliability, patients were assessed by the same rater on two occasions (mean interval 11 days) while two raters independently and simultaneously observed and scored patients' performances to estimate inter-rater reliability. Internal consistency was high for the global scale (Cronbach α = .74). Inter-rater reliability (n = 66) for total errors was also high (ICC = .93), however the test-retest reliability (n = 11) was poor (ICC = .36). In general the Cooking Task appears to be a reliable tool. The low test-retest results were expected given the importance of EF in the performance of novel tasks.
Coplen, T.B.; Peiser, H.S.
1998-01-01
International commissions and national committees for atomic weights (mean relative atomic masses) have recommended regularly updated, best values for these atomic weights as applicable to terrestrial sources of the chemical elements. Presented here is a historically complete listing starting with the values in F. W. Clarke's 1882 recalculation, followed by the recommended values in the annual reports of the American Chemical Society's Atomic Weights Commission. From 1903, an International Commission published such reports and its values (scaled to an atomic weight of 16 for oxygen) are here used in preference to those of national committees of Britain, Germany, Spain, Switzerland, and the U.S.A. We have, however, made scaling adjustments from Ar(16O) to Ar(12C) where not negligible. From 1920, this International Commission constituted itself under the International Union of Pure and Applied Chemistry (IUPAC). Since then, IUPAC has published reports (mostly biennially) listing the recommended atomic weights, which are reproduced here. Since 1979, these values have been called the "standard atomic weights" and, since 1969, all values have been published, with their estimated uncertainties. Few of the earlier values were published with uncertainties. Nevertheless, we assessed such uncertainties on the basis of our understanding of the likely contemporary judgement of the values' reliability. While neglecting remaining uncertainties of 1997 values, we derive "differences" and a retrospective index of reliability of atomic-weight values in relation to assessments of uncertainties at the time of their publication. A striking improvement in reliability appears to have been achieved since the commissions have imposed upon themselves the rule of recording estimated uncertainties from all recognized sources of error.
Highly reliable oxide VCSELs for datacom applications
NASA Astrophysics Data System (ADS)
Aeby, Ian; Collins, Doug; Gibson, Brian; Helms, Christopher J.; Hou, Hong Q.; Lou, Wenlin; Bossert, David J.; Wang, Charlie X.
2003-06-01
In this paper we describe the processes and procedures that have been developed to ensure high reliability for Emcore's 850 nm oxide confined GaAs VCSELs. Evidence from on-going accelerated life testing and other reliability studies that confirm that this process yields reliable products will be discussed. We will present data and analysis techniques used to determine the activation energy and acceleration factors for the dominant wear-out failure mechanisms for our devices as well as our estimated MTTF of greater than 2 million use hours. We conclude with a summary of internal verification and field return rate validation data.
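The abstract above mentions activation energy and acceleration factors without stating the model. A common choice for thermally activated wear-out is the Arrhenius relation, AF = exp[(Ea / k_B) · (1/T_use − 1/T_stress)], and the sketch below assumes that model. The activation energy, temperatures, and stress-test MTTF are hypothetical placeholders, not values from the paper.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between a stress and a use temperature.

    ea_ev: activation energy in eV; temperatures are given in degrees Celsius.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Hypothetical inputs for illustration only (not from the paper).
af = arrhenius_acceleration_factor(ea_ev=0.7, t_use_c=40.0, t_stress_c=125.0)
mttf_stress_hours = 5_000.0          # assumed MTTF observed under stress
mttf_use_hours = af * mttf_stress_hours  # extrapolated MTTF at use conditions
print(f"AF = {af:.1f}, extrapolated MTTF = {mttf_use_hours:.0f} h")
```

Real VCSEL life-test models often add a current-density term to the temperature term; this sketch covers the temperature dependence only.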
The Brazilian version of the effort-reward imbalance questionnaire to assess job stress.
Chor, Dóra; Werneck, Guilherme Loureiro; Faerstein, Eduardo; Alves, Márcia Guimarães de Mello; Rotenberg, Lúcia
2008-01-01
The effort-reward imbalance (ERI) model has been used to assess the health impact of job stress. We aimed at describing the cross-cultural adaptation of the ERI questionnaire into Portuguese and some psychometric properties, in particular internal consistency, test-retest reliability, and factorial structure. We developed a Brazilian version of the ERI using a back-translation method and tested its reliability. The test-retest reliability study was conducted with 111 health workers and University staff. The current analyses are based on 89 participants, after exclusion of those with missing data. Reproducibility (intraclass correlation coefficients) for the "effort", "reward", and "overcommitment" dimensions of the scale was estimated at 0.76, 0.86, and 0.78, respectively. Internal consistency (Cronbach's alpha) estimates for these same dimensions were 0.68, 0.78, and 0.78, respectively. The exploratory factorial structure was fairly consistent with the model's theoretical components. We conclude that the results of this study represent the first evidence in favor of the application of the Brazilian Portuguese version of the ERI scale in health research in populations with similar socioeconomic characteristics.
Mori, Koichiro; Yonemoto, Kiyoshi; Takei, Teiji; Izazola-Licea, Jose; Gobet, Benjamin
2010-01-01
The purpose of this paper is to: (1) collect relevant data and estimate Japanese international financial assistance for HIV/AIDS control; (2) discuss the difficulties in collecting relevant data and the limitations of the collected data; and (3) conduct a comparative analysis on the estimated data with OECD and Kaiser Family Foundation aggregate data. The point is that we have comprehensively collected and estimated the data on Japanese international expenditures for HIV/AIDS control while there is no reliable data that is totally managed and published. In addition, we discuss the difficulties and limitations of data collection: unpublished data; insufficient data; inseparable data; problems of exchange rates; gaps between disbursement and commitment; and difference in year period among calendar, fiscal and organization-specific years. Furthermore, we show the risk of underestimating the Japanese international contribution to HIV/AIDS control on the basis of OECD and Kaiser data. In this respect, it is significant to comprehensively collect and estimate the data on Japanese international assistance for HIV/AIDS control. Finally, we derive the implication that it is crucial for a relevant international organization and/or individual countries to comprehensively collect and administer data for international cooperation in the development of health policies for HIV/AIDS.
ERIC Educational Resources Information Center
Forde, David R.; Baron, Stephen W.; Scher, Christine D.; Stein, Murray B.
2012-01-01
This study examines the psychometric properties of the Childhood Trauma Questionnaire short form (CTQ-SF) with street youth who have run away or been expelled from their homes (N = 397). Internal reliability coefficients for the five clinical scales ranged from 0.65 to 0.95. Confirmatory Factor Analysis (CFA) was used to test the five-factor…
Validity and reliability of the Self-Reported Physical Fitness (SRFit) survey.
Keith, NiCole R; Clark, Daniel O; Stump, Timothy E; Miller, Douglas K; Callahan, Christopher M
2014-05-01
An accurate physical fitness survey could be useful in research and clinical care. To estimate the validity and reliability of a Self-Reported Fitness (SRFit) survey; an instrument that estimates muscular fitness, flexibility, cardiovascular endurance, BMI, and body composition (BC) in adults ≥ 40 years of age. 201 participants completed the SF-36 Physical Function Subscale, International Physical Activity Questionnaire (IPAQ), Older Adults' Desire for Physical Competence Scale (Rejeski), the SRFit survey, and the Rikli and Jones Senior Fitness Test. BC, height and weight were measured. SRFit survey items described BC, BMI, and Senior Fitness Test movements. Correlations between the Senior Fitness Test and the SRFit survey assessed concurrent validity. Cronbach's Alpha measured internal consistency within each SRFit domain. SRFit domain scores were compared with SF-36, IPAQ, and Rejeski survey scores to assess construct validity. Intraclass correlations evaluated test-retest reliability. Correlations between SRFit and the Senior Fitness Test domains ranged from 0.35 to 0.79. Cronbach's Alpha scores were .75 to .85. Correlations between SRFit and other survey scores were -0.23 to 0.72 and in the expected direction. Intraclass correlation coefficients were 0.79 to 0.93. All P-values were 0.001. Initial evaluation supports the SRFit survey's validity and reliability.
Llerena, Katiah; Wynn, Jonathan K; Hajcak, Greg; Green, Michael F; Horan, William P
2016-07-01
Accurately monitoring one's performance on daily life tasks, and integrating internal and external performance feedback are necessary for guiding productive behavior. Although internal feedback processing, as indexed by the error-related negativity (ERN), is consistently impaired in schizophrenia, initial findings suggest that external performance feedback processing, as indexed by the feedback negativity (FN), may actually be intact. The current study evaluated internal and external feedback processing task performance and test-retest reliability in schizophrenia. 92 schizophrenia outpatients and 63 healthy controls completed a flanker task (ERN) and a time estimation task (FN). Analyses examined the ΔERN and ΔFN defined as difference waves between correct/positive versus error/negative feedback conditions. A temporal principal component analysis was conducted to distinguish the ΔERN and ΔFN from overlapping neural responses. We also assessed test-retest reliability of ΔERN and ΔFN in patients over a 4-week interval. Patients showed reduced ΔERN accompanied by intact ΔFN. In patients, test-retest reliability for both ΔERN and ΔFN over a four-week period was fair to good. Individuals with schizophrenia show a pattern of impaired internal, but intact external, feedback processing. This pattern has implications for understanding the nature and neural correlates of impaired feedback processing in schizophrenia. Published by Elsevier B.V.
Measurement Myths and Misconceptions.
ERIC Educational Resources Information Center
Goodwin, Laura D.; Goodwin, William L.
1999-01-01
Presents frequently encountered measurement misconceptions and various measurement "rules." Origins of the misconceptions and rules are described, along with the reasons why they are problematic. Alternate approaches or considerations are given. Misconceptions discussed pertain to the estimation of internal consistency reliability and item…
Measuring the Reliability of Picture Story Exercises like the TAT
Gruber, Nicole; Kreuzpointner, Ludwig
2013-01-01
As frequently reported, psychometric assessments on Picture Story Exercises, especially variations of the Thematic Apperception Test, mostly reveal inadequate scores for internal consistency. We demonstrate that the reason for this apparent shortcoming is not the coding system itself but the incorrect use of internal consistency coefficients, especially Cronbach’s α. This problem could be eliminated by using the category-scores as items instead of the picture-scores. In addition to a theoretical explanation we prove mathematically why the use of category-scores produces an adequate internal consistency estimation and examine our idea empirically with the original data set of the Thematic Apperception Test by Heckhausen and two additional data sets. We found generally higher values when using the category-scores as items instead of picture-scores. From an empirical and theoretical point of view, the estimated reliability is also superior to that obtained when each category within a picture is treated as an item. When comparing our suggestion with a multifaceted Rasch-model we provide evidence that our procedure better fits the underlying principles of PSE. PMID:24348902
Ban, Ilija; Troelsen, Anders; Kristensen, Morten Tange
2016-10-01
The Constant score (CS) has been the primary endpoint in most studies on clavicle fractures. However, the CS was not developed to assess patients with clavicle fractures. Our aim was to examine inter-rater reliability and agreement of the CS in patients with clavicle fractures. The secondary aim was to estimate the correlation between the CS and the Disabilities of the Arm, Shoulder and Hand score and the internal consistency of the 2 scores. On the basis of sample sizing, 36 patients (31 male and 5 female patients; mean age, 41.3 years) with clavicle fractures underwent standardized CS assessment at a mean of 6.8 weeks (SD, 1.0 weeks) after injury. Reliability and agreement of the CS were determined by 2 raters. The intraclass correlation coefficient (ICC2,1), standard error of measurement, minimal detectable change, Cronbach α coefficient, and Pearson correlation coefficient were estimated. Inter-rater reliability of the total CS was excellent (intraclass correlation coefficient, 0.94; 95% confidence interval, 0.88-0.97), with no systematic difference between the 2 raters (P = .75). The standard error of measurement (measurement error at the group level) was 4.9, whereas the minimal detectable change (smallest change needed to indicate a real change for an individual) was 13.6 CS points. The internal consistency of the 10 CS items was good, with a Cronbach α of .85, and we found a strong correlation (r = -0.92) between the CS and Disabilities of the Arm, Shoulder and Hand score. The CS was found to be reliable for assessing patients with clavicle fractures, especially at the group level. With high inter-rater reliability and agreement, in addition to good internal consistency, the standardized CS used in this study can be used for comparison of results from different settings. Copyright © 2016 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
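The abstract above reports a standard error of measurement (SEM) of 4.9 and a minimal detectable change (MDC) of 13.6 CS points but does not state the formula linking them. The conventional 95% relation, MDC95 = 1.96 · √2 · SEM, reproduces the reported value, so the sketch below assumes that relation; the function name `mdc95` is illustrative.

```python
import math


def mdc95(sem):
    """Minimal detectable change at the 95% level from the SEM.

    The sqrt(2) factor accounts for measurement error in both of the two
    measurements being compared; 1.96 is the two-sided 95% normal quantile.
    """
    return 1.96 * math.sqrt(2) * sem


print(round(mdc95(4.9), 1))  # 13.6, matching the value reported in the abstract
```

The gap between the two numbers is why the authors describe the score as reliable "especially at the group level": an individual patient's score has to change by almost three times the SEM before the change can be distinguished from measurement noise.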
Pruitt, Sandi L; Jeffe, Donna B; Yan, Yan; Schootman, Mario
2012-04-01
Limited psychometric research has examined the reliability of self-reported measures of neighbourhood conditions, the effect of measurement error on associations between neighbourhood conditions and health, and potential differences in the reliabilities between neighbourhood strata (urban vs rural and low vs high poverty). We assessed overall and stratified reliability of self-reported perceived neighbourhood conditions using five scales (social and physical disorder, social control, social cohesion, fear) and four single items (multidimensional neighbouring). We also assessed measurement error-corrected associations of these conditions with self-rated health. Using random-digit dialling, 367 women without breast cancer (matched controls from a larger study) were interviewed twice, 2-3 weeks apart. Test-retest (intraclass correlation coefficients (ICC)/weighted κ) and internal consistency reliability (Cronbach's α) were assessed. Differences in reliability across neighbourhood strata were tested using bootstrap methods. Regression calibration corrected estimates for measurement error. All measures demonstrated satisfactory internal consistency (α ≥ 0.70) and either moderate (ICC/κ=0.41-0.60) or substantial (ICC/κ=0.61-0.80) test-retest reliability in the full sample. Internal consistency did not differ by neighbourhood strata. Test-retest reliability was significantly lower among rural (vs urban) residents for two scales (social control, physical disorder) and two multidimensional neighbouring items; test-retest reliability was higher for physical disorder and lower for one multidimensional neighbouring item among the high (vs low) poverty strata. After measurement error correction, the magnitude of associations between neighbourhood conditions and self-rated health were larger, particularly in the rural population. Research is needed to develop and test reliable measures of perceived neighbourhood conditions relevant to the health of rural populations.
Reliability Impacts in Life Support Architecture and Technology Selection
NASA Technical Reports Server (NTRS)
Lange, Kevin E.; Anderson, Molly S.
2011-01-01
Equivalent System Mass (ESM) and reliability estimates were performed for different life support architectures based primarily on International Space Station (ISS) technologies. The analysis was applied to a hypothetical 1-year deep-space mission. High-level fault trees were initially developed relating loss of life support functionality to the Loss of Crew (LOC) top event. System reliability was then expressed as the complement (nonoccurrence) of this event and was increased through the addition of redundancy and spares, which added to the ESM. The reliability analysis assumed constant failure rates and used current projected values of the Mean Time Between Failures (MTBF) from an ISS database where available. Results were obtained showing the dependence of ESM on system reliability for each architecture. Although the analysis employed numerous simplifications and many of the input parameters are considered to have high uncertainty, the results strongly suggest that achieving necessary reliabilities for deep-space missions will add substantially to the life support system mass. As a point of reference, the reliability for a single-string architecture using the most regenerative combination of ISS technologies without unscheduled replacement spares was estimated to be less than 1%. The results also demonstrate how adding technologies in a serial manner to increase system closure forces the reliability of other life support technologies to increase in order to meet the system reliability requirement. This increase in reliability results in increased mass for multiple technologies through the need for additional spares. Alternative parallel architecture approaches and approaches with the potential to do more with less are discussed. The tall poles in life support ESM are also reexamined in light of estimated reliability impacts.
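The reasoning in the abstract above (constant failure rates, serial technologies, spares raising reliability) can be illustrated with a small sketch. It assumes exponential lifetimes, so the number of failures of a unit over the mission is Poisson with mean t/MTBF, and it assumes that a failed unit is replaced instantly and perfectly as long as spares remain. The MTBF values and spare counts are hypothetical and are not taken from the report or the ISS database.

```python
import math

HOURS_PER_YEAR = 8760.0


def unit_reliability(mtbf_hours, mission_hours, spares=0):
    """Probability that a unit with `spares` replacements lasts the mission.

    With a constant failure rate, the failure count is Poisson with mean
    mission_hours / mtbf_hours; the function survives if failures <= spares.
    """
    expected_failures = mission_hours / mtbf_hours
    return sum(
        math.exp(-expected_failures) * expected_failures ** j / math.factorial(j)
        for j in range(spares + 1)
    )


def series_reliability(units, mission_hours):
    """Reliability of a serial chain: every (mtbf_hours, spares) unit must survive."""
    result = 1.0
    for mtbf_hours, spares in units:
        result *= unit_reliability(mtbf_hours, mission_hours, spares)
    return result


# Hypothetical three-technology chain for a 1-year mission.
single_string = [(20_000.0, 0), (15_000.0, 0), (30_000.0, 0)]
with_spares = [(20_000.0, 2), (15_000.0, 2), (30_000.0, 1)]
print(series_reliability(single_string, HOURS_PER_YEAR))
print(series_reliability(with_spares, HOURS_PER_YEAR))
```

Because the serial reliabilities multiply, each added technology lowers the product, so the remaining units need more spares to hold the same system target; that is the mass penalty the abstract describes.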
Duncan, Laura; Georgiades, Kathy; Wang, Li; Van Lieshout, Ryan J; MacMillan, Harriet L; Ferro, Mark A; Lipman, Ellen L; Szatmari, Peter; Bennett, Kathryn; Kata, Anna; Janus, Magdalena; Boyle, Michael H
2017-12-04
The goals of the study were to examine test-retest reliability, informant agreement and convergent and discriminant validity of nine DSM-IV-TR psychiatric disorders classified by parent and youth versions of the Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID). Using samples drawn from the general population and child mental health outpatient clinics, 283 youth aged 9 to 18 years and their parents separately completed the MINI-KID with trained lay interviewers on two occasions 7 to 14 days apart. Test-retest reliability estimates based on kappa (κ) ranged from 0.33 to 0.79 across disorders, samples and informants. Parent-youth agreement on disorders was low (average κ = 0.20). Confirmatory factor analysis provided evidence supporting convergent and discriminant validity. The MINI-KID disorder classifications yielded estimates of test-retest reliability and validity comparable to other standardized diagnostic interviews in both general population and clinic samples. These findings, in addition to the brevity and low administration cost, make the MINI-KID a good candidate for use in epidemiological research and clinical practice. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
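Kappa, the agreement statistic reported above, corrects observed agreement for the agreement expected by chance. The following is a minimal sketch of unweighted Cohen's kappa for two sets of categorical classifications, such as a diagnosis at two occasions. The data are hypothetical, and the study's own estimation procedure may differ.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b) -> float:
    """Unweighted Cohen's kappa for two equal-length lists of categories."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of cases on which the two classifications agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical disorder classifications (1 = present) at two interviews.
first_interview = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
second_interview = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
print(round(cohens_kappa(first_interview, second_interview), 2))
```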
Quiroz, Viviana; Reinero, Daniela; Hernández, Patricia; Contreras, Johanna; Vernal, Rolando; Carvajal, Paola
2017-01-01
This study aimed to develop and assess the content validity and reliability of a cognitively adapted self-report questionnaire designed for surveillance of gingivitis in adolescents. Ten predetermined self-report questions evaluating early signs and symptoms of gingivitis were preliminarily assessed by a panel of clinical experts. Eight questions were selected and cognitively tested in 20 adolescents aged 12 to 18 years from Santiago de Chile. The questionnaire was then administered to 178 Chilean adolescents. Internal consistency was measured using Cronbach's alpha and temporal stability was calculated using the Kappa-index. A reliable final self-report questionnaire consisting of 5 questions was obtained, with a total Cronbach's alpha of 0.73 and a Kappa-index ranging from 0.41 to 0.77 between the different questions. The proposed questionnaire is reliable, with an acceptable internal consistency and a temporal stability from moderate to substantial, and it is promising for estimating the prevalence of gingivitis in adolescents.
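Cronbach's alpha, used here and in several other entries in this listing, compares the sum of the item variances with the variance of the total score: alpha = k / (k - 1) * (1 - sum of item variances / total-score variance). The following is a minimal sketch using only the standard library. The response matrix is hypothetical and is not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores) -> float:
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*scores))  # transpose to one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - item_variance_sum / total_variance)


# Hypothetical yes/no (1/0) answers of six respondents to five questions.
responses = [
    [1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
]
print(round(cronbach_alpha(responses), 2))
```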
Puncher, M; Zhang, W; Harrison, J D; Wakeford, R
2017-06-26
Assessments of risk to a specific population group resulting from internal exposure to a particular radionuclide can be used to assess the reliability of the appropriate International Commission on Radiological Protection (ICRP) dose coefficients used as a radiation protection device for the specified exposure pathway. An estimate of the uncertainty on the associated risk is important for informing judgments on reliability; a derived uncertainty factor, UF, is an estimate of the 95% probable geometric difference between the best risk estimate and the nominal risk and is a useful tool for making this assessment. This paper describes the application of parameter uncertainty analysis to quantify uncertainties resulting from internal exposures to radioiodine by members of the public, specifically 1-, 10- and 20-year-old females from the population of England and Wales. Best estimates of thyroid cancer incidence risk (lifetime attributable risk) are calculated for ingestion or inhalation of 129I and 131I, accounting for uncertainties in biokinetic model and cancer risk model parameter values. These estimates are compared with the equivalent ICRP derived nominal age-, sex- and population-averaged estimates of excess thyroid cancer incidence to obtain UFs. Derived UF values for ingestion or inhalation of 131I for 1-year, 10-year and 20-year olds are around 28, 12 and 6, respectively, when compared with ICRP Publication 103 nominal values, and 9, 7 and 14, respectively, when compared with ICRP Publication 60 values. Broadly similar results were obtained for 129I. The uncertainties on risk estimates are largely determined by uncertainties on risk model parameters rather than uncertainties on biokinetic model parameters. An examination of the sensitivity of the results to the risk models and populations used in the calculations shows variations in the central estimates of risk of a factor of around 2-3.
It is assumed that the direct proportionality of excess thyroid cancer risk and dose observed at low to moderate acute doses and incorporated in the risk models also applies to very small doses received at very low dose rates; the uncertainty in this assumption is considerable, but largely unquantifiable. The UF values illustrate the need for an informed approach to the use of ICRP dose and risk coefficients.
The brief multidimensional students' life satisfaction scale-college version.
Zullig, Keith J; Huebner, E Scott; Patton, Jon M; Murray, Karen A
2009-01-01
To investigate the psychometric properties of the BMSLSS-College among 723 college students. Internal consistency estimates explored scale reliability, factor analysis explored construct validity, and known-groups validity was assessed using the National College Youth Risk Behavior Survey and Harvard School of Public Health College Alcohol Study. Criterion-related validity was explored through analyses with the CDC's health-related quality of life scale and a social isolation scale. Acceptable internal consistency reliability, construct, known-groups, and criterion-related validity were established. Findings offer preliminary support for the BMSLSS-C; it could be useful in large-scale research studies, applied screening contexts, and for program evaluation purposes toward achieving Healthy People 2010 objectives.
Reliability and validity of the Modified Erikson Psychosocial Stage Inventory in diverse samples.
Leidy, N K; Darling-Fisher, C S
1995-04-01
The Modified Erikson Psychosocial Stage Inventory (MEPSI) is a relatively simple survey measure designed to assess the strength of psychosocial attributes that arise from progression through Erikson's eight stages of development. The purpose of this study was to employ secondary analysis to evaluate the internal-consistency reliability and construct validity of the MEPSI across four diverse samples: healthy young adults, hemophilic men, healthy older adults, and older adults with chronic obstructive pulmonary disease. Special attention was given to the performance of the measure across gender, with exploratory analyses examining possible age cohort and health status effects. Internal-consistency estimates for the aggregate measure were high, whereas subscale reliability levels varied across age groups. Construct validity was supported across samples. Gender, cohort, and health effects offered interesting psychometric and theoretical insights and direction for further research. Findings indicated that the MEPSI might be a useful instrument for operationalizing and testing Eriksonian developmental theory in adults.
Marine nitrous oxide emissions: An unknown liability for the international water sector
Reliable estimates of anthropogenic greenhouse gas (GHG) emissions are essential for setting effective climate policy at both the sector and national level. Current IPCC Guidelines for calculating nitrous oxide (N2O) emissions from sewage management are both highly uncertain and ...
Rodrigues, Marcelo F; Michel-Crosato, Edgard; Cardoso, Jefferson R; Traebert, Jefferson
2009-06-01
Cross-cultural translation and psychometric testing. To translate and cross-culturally adapt the Quebec Back Pain Disability Scale (QDS) to Brazilian Portuguese and to examine its validity and reliability. Current literature shows the need to adopt reliable and internationally standardized methods for the analysis of low back pain. To our knowledge, this specific questionnaire has not been translated and validated for Portuguese-speaking patients. The translation and cross-cultural adaptation of the QDS were developed in agreement with internationally recommended methodology, and the resulting product was evaluated in this study with 54 consecutive patients. Internal consistency was obtained through Cronbach's alpha; reliability was estimated through the intraclass correlation coefficient and the Bland and Altman agreement (d = mean difference). Validity was determined by correlating the scores of the Brazil-QDS with the Brazilian version of the Roland-Morris Questionnaire and Visual Analogue Pain Scale by means of the Spearman rank correlation coefficient. The internal consistency obtained was excellent (Cronbach's alpha = 0.97). Intraobserver and interobserver reliability were considered strong (ICC = 0.93, d = 0.68 and ICC = 0.96, d = 0.57, respectively). The correlation with the Brazilian Roland-Morris Questionnaire and with the Visual Analogue Scale was high (r = 0.857; r = 0.758, respectively). The data showed that the process of translation and cross-cultural adaptation was successful and that the adapted instrument demonstrated excellent psychometric properties.
Xu, Fang; Wallace, Robyn C.; Garvin, William; Greenlund, Kurt J.; Bartoli, William; Ford, Derek; Eke, Paul; Town, G. Machell
2016-01-01
Public health researchers have used a class of statistical methods to calculate prevalence estimates for small geographic areas with few direct observations. Many researchers have used Behavioral Risk Factor Surveillance System (BRFSS) data as a basis for their models. The aims of this study were to 1) describe a new BRFSS small area estimation (SAE) method and 2) investigate the internal and external validity of the BRFSS SAEs it produced. The BRFSS SAE method uses 4 data sets (the BRFSS, the American Community Survey Public Use Microdata Sample, Nielsen Claritas population totals, and the Missouri Census Geographic Equivalency File) to build a single weighted data set. Our findings indicate that internal and external validity tests were successful across many estimates. The BRFSS SAE method is one of several methods that can be used to produce reliable prevalence estimates in small geographic areas. PMID:27418213
Validation of the Brazilian Portuguese Version of Geriatric Anxiety Inventory--GAI-BR.
Massena, Patrícia Nitschke; de Araújo, Narahyana Bom; Pachana, Nancy; Laks, Jerson; de Pádua, Analuiza Camozzato
2015-07-01
The Geriatric Anxiety Inventory (GAI) is a recently developed scale aiming to evaluate symptoms of anxiety in later life. This 20-item scale uses dichotomous answers highlighting non-somatic anxiety complaints of elderly people. The present study aimed to evaluate the psychometric properties of the Brazilian Portuguese version of the GAI (GAI-BR) in a sample from the community and an outpatient psychogeriatric clinic. A mixed convenience sample of 72 subjects was recruited for answering the research protocol. The interviews were structured with questionnaires on sociodemographic data and clinical health status, previously validated anxiety and depression instruments, the Mini-Mental State Examination, the Mini International Neuropsychiatric Interview, and the GAI-BR. Twenty-two percent of the sample were interviewed twice for test-retest reliability. For internal consistency analyses, the Cronbach's α test was applied. The Spearman correlation test was applied to evaluate the test-retest GAI-BR reliability. A ROC (receiver operating characteristic) curve study was made to estimate the GAI-BR area under curve, cut-off points, sensitivity, and specificity for the Generalized Anxiety Disorder diagnosis. The GAI-BR version showed high internal consistency (Cronbach's α = 0.91) and strong and significant test-retest reliability (ρ = 0.85, p < 0.001). It also showed moderate and significant correlation with the Beck Anxiety Inventory (ρ = 0.68, p < 0.001) and the State-Trait Anxiety Inventory (ρ = 0.61, p < 0.001) showing evidence of concurrent validation. The cut-off point of 13 estimated by ROC curve analyses showed sensitivity of 83.3% and specificity of 84.6% to detect Generalized Anxiety Disorder (DSM-IV). GAI-BR has demonstrated very good psychometric properties and can be a reliable instrument to measure anxiety in Brazilian elderly people.
Test-retest reliability of cardinal plane isokinetic hip torque and EMG.
Claiborne, Tina L; Timmons, Mark K; Pincivero, Danny M
2009-10-01
The objective of the present study was to establish test-retest reliability of isokinetic hip torque and prime mover electromyogram (EMG) through the three cardinal planes of motion. Thirteen healthy young adults participated in two experimental sessions, separated by approximately one week. During each session, isokinetic hip torque was evaluated on the Biodex Isokinetic Dynamometer at a velocity of 60 deg/s. Subjects performed three maximal-effort concentric and eccentric contractions, separately, for right and left hip abduction/adduction, flexion/extension, and internal/external rotation. Surface EMGs were sampled from the gluteus maximus, gluteus medius, adductor, medial and lateral hamstring, and rectus femoris muscles during all contractions. Intraclass correlation coefficients (ICC - 2,1) and standard errors of measurement (SEM) were calculated for peak torque for each movement direction and contraction mode, while ICCs were only computed for the EMG data. Motions that demonstrated high torque reliability included concentric hip abduction (right and left), flexion (right and left), extension (right) and internal rotation (right and left), and eccentric hip abduction (left), adduction (left), flexion (right), and extension (right and left) (ICC range=0.81-0.91). Motions with moderate torque reliability included concentric hip adduction (right), extension (left), internal rotation (left), and external rotation (right), and eccentric hip abduction and adduction (right), flexion (left), internal rotation (right and left), and external rotation (right and left) (ICC range=0.49-0.79). The majority of the EMG sampled muscles (n=12 and n=11 for concentric and eccentric contractions, respectively) demonstrated high reliability (ICC=0.81-0.95). Instances of low, or unacceptable, EMG reliability values occurred for the medial hamstring muscle of the left leg (both contraction modes) and the adductor muscle of the right leg during eccentric internal rotation. 
The major finding revealed high and moderate levels of between-day reliability of isokinetic hip peak torque and prime mover EMG. It is recommended that the day-to-day variability estimates concomitant with acceptable levels of reliability be considered when attempting to objectify intervention effects on hip muscle performance.
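The ICC(2,1) reported in this abstract is conventionally the two-way random-effects, absolute-agreement, single-measurement intraclass correlation, computed from the mean squares of a subjects-by-sessions table. The following is a minimal sketch of that standard formula. The torque values are hypothetical, and the abstract does not state which software or exact procedure the authors used.

```python
def icc_2_1(ratings) -> float:
    """ICC(2,1) for a table of n subjects (rows) by k sessions (columns)."""
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical peak torque values (N·m) for five subjects on two days.
peak_torque = [
    [102.0, 98.0],
    [85.0, 90.0],
    [120.0, 117.0],
    [95.0, 99.0],
    [110.0, 104.0],
]
print(round(icc_2_1(peak_torque), 2))
```

Unlike a plain correlation, this form is lowered by a systematic shift between sessions, which matters when between-day values are compared directly.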
Noorbakhsh, Simasadat; Shams, Jamal; Faghihimohamadi, Mohamadmahdi; Zahiroddin, Hanieh; Hallgren, Mats; Kallmen, Hakan
2018-01-30
Iran is a developing and Islamic country where the consumption of alcoholic beverages is banned. However, psychiatric disorders and alcohol use disorders are often co-occurring. We used the Alcohol Use Disorders Identification Test (AUDIT) to estimate the prevalence of alcohol use and examined the psychometric properties of the test among psychiatric outpatients in Teheran, Iran. AUDIT was completed by 846 consecutive (sequential) patients. Descriptive statistics, internal consistency (Cronbach alpha), confirmatory and exploratory factor analyses were used to analyze the prevalence of alcohol use, reliability and construct validity. 12% of men and 1% of women were hazardous alcohol consumers. Internal reliability of the Iranian version of AUDIT was excellent. Confirmatory factor analyses showed that the construct validity and the fit of previous factor structures (1, 2 and 3 factors) to data were not good and seemingly contradicted results from the explorative principal axis factoring, which showed that a 1-factor solution explained 77% of the co-variances. We could not reproduce the suggested factor structure of AUDIT, probably due to the skewed distribution of alcohol consumption. Only 19% of men and 3% of women scored above 0 on AUDIT. This could be explained by the fact that alcohol is illegal in Iran. In conclusion the AUDIT exhibited good internal reliability when used as a single scale. The prevalence estimates according to AUDIT were somewhat higher among psychiatric patients compared to what was reported by WHO regarding the general population.
Measurement properties of the WOMAC LK 3.1 pain scale.
Stratford, P W; Kennedy, D M; Woodhouse, L J; Spadoni, G F
2007-03-01
The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is applied extensively to patients with osteoarthritis of the hip or knee. Previous work has challenged the validity of its physical function scale; however, an extensive evaluation of its pain scale has not been reported. Our purpose was to estimate internal consistency, factorial validity, test-retest reliability, and the standard error of measurement (SEM) of the WOMAC LK 3.1 pain scale. Four hundred and seventy-four patients with osteoarthritis of the hip or knee awaiting arthroplasty were administered the WOMAC. Estimates of internal consistency (coefficient alpha), factorial validity (confirmatory factor analysis), and the SEM based on internal consistency (SEM(IC)) were obtained. Test-retest reliability [Type 2,1 intraclass correlation coefficients (ICC)] and a corresponding SEM(TRT) were estimated on a subsample of 36 patients. Our estimates were: internal consistency alpha=0.84; SEM(IC)=1.48; Type 2,1 ICC=0.77; SEM(TRT)=1.69. Confirmatory factor analysis failed to support a single factor structure of the pain scale with uncorrelated error terms. Two comparable models provided excellent fit: (1) a model with correlated error terms between the walking and stairs items, and between night and sit items (chi2=0.18, P=0.98); (2) a two factor model with walking and stairs items loading on one factor, night and sit items loading on a second factor, and the standing item loading on both factors (chi2=0.18, P=0.98). Our examination of the factorial structure of the WOMAC pain scale failed to support a single factor and internal consistency analysis yielded a coefficient less than optimal for individual patient use. An alternate strategy to summing the five-item responses when considering individual patient application would be to interpret item responses separately or to sum only those items which display homogeneity.
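In classical test theory, the standard error of measurement is commonly computed as SEM = SD * sqrt(1 - reliability). The following is a minimal sketch relating the reported alpha and SEM(IC) through that formula. It assumes the authors used this classical form, which the abstract does not confirm, and the score standard deviation it back-calculates is an inference for illustration, not a reported value.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory SEM from the score SD and a reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def implied_sd(sem: float, reliability: float) -> float:
    """Score SD implied by a reported SEM and reliability (same formula, inverted)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be at least 0 and below 1")
    return sem / math.sqrt(1.0 - reliability)


# Values reported in the abstract: alpha = 0.84 and SEM(IC) = 1.48.
sd_from_alpha = implied_sd(1.48, 0.84)
print(f"implied score SD: {sd_from_alpha:.2f}")
print(f"SEM recomputed: {standard_error_of_measurement(sd_from_alpha, 0.84):.2f}")
```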
Tsai, Alexander C.; Scott, Jennifer A.; Hung, Kristin J.; Zhu, Jennifer Q.; Matthews, Lynn T.; Psaros, Christina; Tomlinson, Mark
2013-01-01
Background A major barrier to improving perinatal mental health in Africa is the lack of locally validated tools for identifying probable cases of perinatal depression or for measuring changes in depression symptom severity. We systematically reviewed the evidence on the reliability and validity of instruments to assess perinatal depression in African settings. Methods and Findings Of 1,027 records identified through searching 7 electronic databases, we reviewed 126 full-text reports. We included 25 unique studies, which were disseminated in 26 journal articles and 1 doctoral dissertation. These enrolled 12,544 women living in nine different North and sub-Saharan African countries. Only three studies (12%) used instruments developed specifically for use in a given cultural setting. Most studies provided evidence of criterion-related validity (20 [80%]) or reliability (15 [60%]), while fewer studies provided evidence of construct validity, content validity, or internal structure. The Edinburgh postnatal depression scale (EPDS), assessed in 16 studies (64%), was the most frequently used instrument in our sample. Ten studies estimated the internal consistency of the EPDS (median estimated coefficient alpha, 0.84; interquartile range, 0.71-0.87). For the 14 studies that estimated sensitivity and specificity for the EPDS, we constructed 2 x 2 tables for each cut-off score. Using a bivariate random-effects model, we estimated a pooled sensitivity of 0.94 (95% confidence interval [CI], 0.68-0.99) and a pooled specificity of 0.77 (95% CI, 0.59-0.88) at a cut-off score of ≥9, with higher cut-off scores yielding greater specificity at the cost of lower sensitivity. Conclusions The EPDS can reliably and validly measure perinatal depression symptom severity or screen for probable postnatal depression in African countries, but more validation studies on other instruments are needed. 
In addition, more qualitative research is needed to adequately characterize local understandings of perinatal depression-like syndromes in different African contexts. PMID:24340036
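The trade-off described in this abstract, where a higher cut-off score raises specificity and lowers sensitivity, follows directly from the 2 x 2 table built at each cut-off. The following is a minimal sketch with hypothetical screening scores and reference diagnoses. It does not implement the bivariate random-effects pooling used in the review.

```python
def confusion_at_cutoff(scores, has_condition, cutoff):
    """Return (tp, fn, fp, tn) when a score >= cutoff counts as screen-positive."""
    tp = fn = fp = tn = 0
    for score, positive in zip(scores, has_condition):
        screen_positive = score >= cutoff
        if positive and screen_positive:
            tp += 1
        elif positive:
            fn += 1
        elif screen_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def sensitivity_specificity(tp, fn, fp, tn):
    """Sensitivity = tp / (tp + fn); specificity = tn / (tn + fp)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scale scores and reference-standard diagnoses.
scores = [3, 5, 8, 9, 10, 12, 14, 15, 7, 11]
diagnosed = [False, False, False, False, True, True, True, True, False, True]

for cutoff in (9, 12):
    sens, spec = sensitivity_specificity(
        *confusion_at_cutoff(scores, diagnosed, cutoff)
    )
    print(f"cut-off >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```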
Quality and rigor of the concept mapping methodology: a pooled study analysis.
Rosas, Scott R; Kane, Mary
2012-05-01
The use of concept mapping in research and evaluation has expanded dramatically over the past 20 years. Researchers in academic, organizational, and community-based settings have applied concept mapping successfully without the benefit of systematic analyses across studies to identify the features of a methodologically sound study. Quantitative characteristics and estimates of quality and rigor that may guide future studies are lacking. To address this gap, we conducted a pooled analysis of 69 concept mapping studies to describe characteristics across study phases, generate specific indicators of validity and reliability, and examine the relationship between select study characteristics and quality indicators. Individual study characteristics and estimates were pooled and quantitatively summarized, describing the distribution, variation and parameters for each. In addition, variation in the concept mapping data collection in relation to characteristics and estimates was examined. Overall, results suggest concept mapping yields strong internal representational validity and very strong sorting and rating reliability estimates. Validity and reliability were consistently high despite variation in participation and task completion percentages across data collection modes. The implications of these findings as a practical reference to assess the quality and rigor for future concept mapping studies are discussed. Copyright © 2011 Elsevier Ltd. All rights reserved.
Osman, Augustine; Wong, Jane L; Bagge, Courtney L; Freedenthal, Stacey; Gutierrez, Peter M; Lozano, Gregorio
2012-12-01
We conducted two studies to examine the dimensions, internal consistency reliability estimates, and potential correlates of the Depression Anxiety Stress Scales-21 (DASS-21; Lovibond & Lovibond, 1995). Participants in Study 1 included 887 undergraduate students (363 men and 524 women, aged 18 to 35 years; mean [M] age = 19.46, standard deviation [SD] = 2.17) recruited from two public universities to assess the specificity of the individual DASS-21 items and to evaluate estimates of internal consistency reliability. Participants in a follow-up study (Study 2) included 410 students (168 men and 242 women, aged 18 to 47 years; M age = 19.65, SD = 2.88) recruited from the same universities to further assess factorial validity and to evaluate potential correlates of the original DASS-21 total and scale scores. Item bifactor and confirmatory factor analyses revealed that a general factor accounted for the greatest proportion of common variance in the DASS-21 item scores (Study 1). In Study 2, the fit statistics showed good fit for the bifactor model. In addition, the DASS-21 total scale score correlated more highly with scores on a measure of mixed depression and anxiety than with scores on the proposed specific scales of depression or anxiety. Coefficient omega estimates for the DASS-21 scale scores were good. Further investigations of the bifactor structure and psychometric properties of the DASS-21, specifically its incremental and discriminant validity, using known clinical groups are needed. © 2012 Wiley Periodicals, Inc.
Al Ansari, Ahmed; Al Khalifa, Khalid; Al Azzawi, Mohamed; Al Amer, Rashed; Al Sharqi, Dana; Al-Mansoor, Anwar; Munshi, Fadi M
2015-01-01
We aimed to design, implement, and evaluate the feasibility and reliability of a multisource feedback (MSF) system to assess interns in their clerkship year in the Middle Eastern culture, the Kingdom of Bahrain. The study was undertaken in the Bahrain Defense Force Hospital, a military teaching hospital in the Kingdom of Bahrain. A total of 21 interns (who represent the total population of the interns for the given year) were assessed in this study. All of the interns were rotating through our hospital during their year-long clerkship rotation. The study sample consisted of nine males and 12 females. Each participating intern was evaluated by three groups of raters: eight medical intern colleagues, eight senior medical colleagues, and eight coworkers from different departments. The total mean response rate was 62.3%. A factor analysis found that the questionnaire data grouped into three factors that accounted for 76.4% of the total variance. These three factors were labeled as professionalism, collaboration, and communication. Reliability analysis indicated that the full instrument scale had high internal consistency (Cronbach's α 0.98). The generalizability coefficients for the surveys were estimated to be 0.78. Based on our results and analysis, we conclude that the MSF tool we used on the interns rotating in their clerkship year within our Middle Eastern culture provides an effective method of evaluation because it offers a reliable, valid, and feasible process.
Edgren, Robert; Castrén, Sari; Mäkelä, Marjukka; Pörtfors, Pia; Alho, Hannu; Salonen, Anne H
2016-06-01
This review aims to clarify which instruments measuring at-risk and problem gambling (ARPG) among youth are reliable and valid in light of reported estimates of internal consistency, classification accuracy, and psychometric properties. A systematic search was conducted in PubMed, Medline, and PsycInfo covering the years 2009-2015. In total, 50 original research articles fulfilled the inclusion criteria: target age under 29 years, using an instrument designed for youth, and reporting a reliability estimate. Articles were evaluated with the revised Quality Assessment of Diagnostic Accuracy Studies tool. Reliability estimates were reported for five ARPG instruments. Most studies (66%) evaluated the South Oaks Gambling Screen Revised for Adolescents. The Gambling Addictive Behavior Scale for Adolescents was the only novel instrument. In general, the evaluation of instrument reliability was superficial. Despite its rare use, the Canadian Adolescent Gambling Inventory (CAGI) had a strong theoretical and methodological base. The Gambling Addictive Behavior Scale for Adolescents and the CAGI were the only instruments originally developed for youth. All studies, except the CAGI study, were population based. ARPG instruments for youth have not been rigorously evaluated yet. Further research is needed especially concerning instruments designed for clinical use. Copyright © 2016 The Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Sladkevicius, P; Installé, A; Van Den Bosch, T; Timmerman, D; Benacerraf, B; Jokubkiene, L; Di Legge, A; Votino, A; Zannoni, L; De Moor, B; De Cock, B; Van Calster, B; Valentin, L
2018-02-01
To estimate intra- and interrater agreement and reliability with regard to describing ultrasound images of the endometrium using the International Endometrial Tumor Analysis (IETA) terminology. Four expert and four non-expert raters assessed videoclips of transvaginal ultrasound examinations of the endometrium obtained from 99 women with postmenopausal bleeding and sonographic endometrial thickness ≥ 4.5 mm but without fluid in the uterine cavity. The following features were rated: endometrial echogenicity, endometrial midline, bright edge, endometrial-myometrial junction, color score, vascular pattern, irregularly branching vessels and color splashes. The color content of the endometrial scan was estimated using a visual analog scale graded from 0 to 100. To estimate intrarater agreement and reliability, the same videoclips were assessed twice with a minimum of 2 months' interval. The raters were blinded to their own results and to those of the other raters. Interrater differences in the described prevalence of most IETA variables were substantial, and some variable categories were observed rarely. Specific agreement was poor for variables with many categories. For binary variables, specific agreement was better for absence than for presence of a category. For variables with more than two outcome categories, specific agreement for expert and non-expert raters was best for not-defined endometrial midline (93% and 96%), regular endometrial-myometrial junction (72% and 70%) and three-layer endometrial pattern (67% and 56%). The grayscale ultrasound variable with the best reliability was uniform vs non-uniform echogenicity (multirater kappa (κ), 0.55 for expert and 0.52 for non-expert raters), and the variables with the lowest reliability were appearance of the endometrial-myometrial junction (κ, 0.25 and 0.16) and the nine-category endometrial echogenicity variable (κ, 0.29 and 0.28). 
The most reliable color Doppler variable was color score (mean weighted κ, 0.77 and 0.69). Intra- and interrater agreement and reliability were similar for experts and non-experts. Inter- and intrarater agreement and reliability when using IETA terminology were limited. This may have implications when assessing the association between a particular ultrasound feature and a specific histological diagnosis, because lack of reproducibility reduces the reliability of the association between a feature and the outcome. Future studies should investigate whether using fewer categories of variable or offering practical training could improve agreement and reliability. Copyright © 2017 ISUOG. Published by John Wiley & Sons Ltd.
Do Multiple-Choice Options Inflate Estimates of Vocabulary Size on the VST?
ERIC Educational Resources Information Center
Stewart, Jeffrey
2014-01-01
Validated under a Rasch framework (Beglar, 2010), the Vocabulary Size Test (VST) (Nation & Beglar, 2007) is an increasingly popular measure of decontextualized written receptive vocabulary size in the field of second language acquisition. However, although the validation indicates that the test has high internal reliability, still unaddressed…
Multilevel Confirmatory Factor Analysis of the Teacher My Class Inventory-Short Form
ERIC Educational Resources Information Center
Villares, Elizabeth; Mariani, Melissa; Sink, Christopher A.; Colvin, Kimberly
2016-01-01
Researchers analyzed data from elementary teachers (N = 233) to further establish the psychometric soundness of the Teacher My Class Inventory-Short Form. Supporting previous psychometric research, confirmatory factor analyses findings supported the factorial validity of the hypothesized five-factor solution. Internal reliability estimates were…
Kuzmanova, Rumyana; Stefanova, Irina; Velcheva, Irena; Stambolieva, Katerina
2014-10-01
Adverse effects (AEs) of antiepileptic drugs (AEDs) affect the quality of life of patients with epilepsy and their outcomes. There are no questionnaires or studies on the reliability and validity of instruments measuring AEs of AEDs in patients with epilepsy in the Bulgarian language. The aim of the present study was the translation, cross-cultural adaptation, and validation of the LAEP in the Bulgarian language, to provide a reliable instrument for the clinical monitoring of patients with epilepsy in the Bulgarian-speaking population. One hundred thirty-one patients (57 men and 74 women, mean age: 40.13±13.37 years) took part in the investigation. The internal consistency and test-retest reliability were tested by Cronbach's α and ICC estimations. The convergent construct validity was tested by estimating the correlation of the LAEP-BG with the QOLIE-89 and the discriminant validity by evaluating the difference between LAEP-BG scores and clinical parameters such as the type of epilepsy using Kruskal-Wallis ANOVA. The LAEP-BG showed high internal consistency and reliability. The Cronbach's α of the total scale was 0.86. No significant differences between the Cronbach's α coefficients of the total LAEP-BG and original English, Chinese, Spanish, Korean, and Portuguese-Brazilian versions of the questionnaire were observed. The ICCs, which evaluate the test-retest reliability, were higher than the recommended value of 0.75 and indicated strong positive correlations between the first and second examinations. The two subscales of the LAEP-BG proposed by us, "Neurological and psychiatric side effects" and "Non neurological side effects", showed good internal consistency (Cronbach's α of 0.85 and 0.71, respectively).
The LAEP-BG scores significantly correlated with other questionnaires such as the Quality of Life in Epilepsy Inventory-89 (QOLIE-89) and showed a good discriminative validity between groups with different levels of self-assessed AEs of AEDs. The Bulgarian version of the Liverpool Adverse Event Profile (LAEP) is a reliable and valid tool in assessing the patient-reported AEs of AEDs and their impact on the patient's outcome. Copyright © 2014 Elsevier Inc. All rights reserved.
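Several of the abstracts in this listing report Cronbach's α as the internal consistency estimate. As a reminder of what that number summarizes, the following is a minimal sketch of the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of total score). The function name and the small data set are hypothetical illustrations and are not taken from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed (total) score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses: 4 respondents, 2 items.
example = [[1, 1], [2, 3], [3, 2], [4, 4]]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

Because α depends on the number of items as well as on the inter-item covariances, values from scales of different length are not directly comparable.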
NWS Operational Requirements for Ensemble-Based Hydrologic Forecasts
NASA Astrophysics Data System (ADS)
Hartman, R. K.
2008-12-01
Ensemble-based hydrologic forecasts have been developed and issued by National Weather Service (NWS) staff at River Forecast Centers (RFCs) for many years. Used principally for long-range water supply forecasts, they have traditionally considered only the uncertainty associated with weather and climate. As technology and societal expectations of resource managers increase, the use of and desire for risk-based decision support tools have also increased. These tools require forecast information that includes reliable uncertainty estimates across all time and space domains. The development of reliable uncertainty estimates associated with hydrologic forecasts is being actively pursued within the United States and internationally. This presentation will describe the challenges, components, and requirements for operational hydrologic ensemble-based forecasts from the perspective of a NOAA/NWS River Forecast Center.
Reliability of self-reported antisocial personality disorder symptoms among substance abusers.
Cottler, L B; Compton, W M; Ridenour, T A; Ben Abdallah, A; Gallagher, T
1998-02-01
It is estimated that from 20 to 60% of substance abusers meet criteria for Antisocial Personality Disorder (APD). An accurate and reliable diagnosis is important because persons meeting criteria for APD, by the nature of their disorder, are less likely to change behaviors and more likely to relapse to both substance abuse and high risk behaviors. To understand more about the reliability of the disorder and symptoms of APD, the Diagnostic Interview Schedule Version III-R (DIS) was administered to 453 substance abusers ascertained from treatment programs and from the general population (St Louis Epidemiological Catchment Area (ECA) follow-up study). Estimates of the 1 week, test-retest reliability for the childhood conduct disorder criterion, the adult antisocial behavior criterion, and APD diagnosis fell in the good agreement range, as measured by kappa. The internal consistency of these DIS symptoms was adequate to acceptable. Individual DIS criteria designed to measure childhood conduct disorder ranged from fair to good for most items; reliability was slightly higher for the adult antisocial behavior symptom items. Finally, self-reported 'liars' were no more unreliable in their reports of their behaviors than 'non-liars'.
Tami, Suzan H; Reed, Debra B; Trejos, Elizabeth; Boylan, Mallory; Wang, Shu
2015-11-05
Our pilot study was conducted to test the reliability of the Caregiver's Feeding Styles Questionnaire (CFSQ) and the Family Nutrition and Physical Activity Assessment (FNPA) in a sample of Arab mothers. Twenty-five Arab mothers completed the CFSQ, FNPA, and the Participant Background Survey for the first administration. After 1-2 weeks, participants completed the CFSQ and the FNPA for the second administration. The two administrations of the surveys allowed us to assess the test-retest reliability of the CFSQ and the FNPA and to measure the internal consistency of the two surveys. Pearson's correlations between the first and second administrations for the 19-item scale (demandingness) and the 7-item scale (responsiveness) of the CFSQ were .95 and .86, respectively. As for the FNPA, Pearson's correlation was .80. The estimated reliabilities (Cronbach's alpha) of the CFSQ increased from .86 for the first administration to .93 for the second administration. However, the estimated reliabilities of the FNPA increased only slightly, from .58 for the first administration to .59 for the second administration. In our pilot study of Arab mothers, the CFSQ and FNPA were shown to be promising in terms of reliability and content validity.
NASA Technical Reports Server (NTRS)
Sullivan, Michael J.
2005-01-01
This thesis develops a state estimation algorithm for the Centrifuge Rotor (CR) system where only relative measurements are available with limited knowledge of both rotor imbalance disturbances and International Space Station (ISS) thruster disturbances. A Kalman filter is applied to a plant model augmented with sinusoidal disturbance states used to model both the effect of the rotor imbalance and the 155 thrusters on the CR relative motion measurement. The sinusoidal disturbance states compensate for the lack of the availability of plant inputs for use in the Kalman filter. Testing confirms that complete disturbance modeling is necessary to ensure reliable estimation. Further testing goes on to show that increased estimator operational bandwidth can be achieved through the expansion of the disturbance model within the filter dynamics. In addition, Monte Carlo analysis shows the varying levels of robustness against defined plant/filter uncertainty variations.
Çelik, Derya; Can, Canan; Aslan, Yasemin; Ceylan, Hasan Huseyin; Bilsel, Kerem; Ozdincler, Arzu Razak
2014-01-01
The Harris Hip Score (HHS) was developed to assess function and pain from the perspective of patients with hip pathologies. The purpose of this study was to translate and culturally adapt the HHS into Turkish, and thereby determine the reliability and validity of the translated version. The HHS was translated into Turkish in accordance with the stages recommended by Beaton. The measurement properties of the HHS were tested in 80 patients (52 males, mean age 51 years, range 21-75 years) suffering from different hip pathologies. The test-retest reliability was tested in 58 patients (28 males, mean age 52 years, range 30-73 years) after an interval of seven days. Cronbach's alpha was used to assess internal consistency and the intra-class correlation coefficient (ICC) was used to estimate the test-retest reliability. Patients were asked to answer the Oxford Hip Score (OHS), the Western Ontario and McMaster Universities Arthritis Index (WOMAC), the VAS and the Short Form-36 (SF-36) for the validity of the estimation. The Turkish version of the HHS showed sufficient internal consistency (Cronbach's alpha, 0.70) and test-retest reliability (ICC = 0.91). The correlation coefficients between the HHS, the WOMAC and the OHS were 0.64 and 0.89 respectively. The highest correlations between the HHS and SF-36 were with the physical function scale (r = 0.72), and the lowest correlations were with the mental function scale (r = 0.10). We observed no floor or ceiling effects. The Turkish version of the HHS has sufficient reliability and validity to measure patient-reported outcome for Turkish-speaking individuals with a variety of hip disorders.
Al Ansari, Ahmed; Al Khalifa, Khalid; Al Azzawi, Mohamed; Al Amer, Rashed; Al Sharqi, Dana; Al-Mansoor, Anwar; Munshi, Fadi M
2015-01-01
Background We aimed to design, implement, and evaluate the feasibility and reliability of a multisource feedback (MSF) system to assess interns in their clerkship year in the Middle Eastern culture, the Kingdom of Bahrain. Method The study was undertaken in the Bahrain Defense Force Hospital, a military teaching hospital in the Kingdom of Bahrain. A total of 21 interns (who represent the total population of the interns for the given year) were assessed in this study. All of the interns were rotating through our hospital during their year-long clerkship rotation. The study sample consisted of nine males and 12 females. Each participating intern was evaluated by three groups of raters: eight medical intern colleagues, eight senior medical colleagues, and eight coworkers from different departments. Results A total of 21 interns (nine males and 12 females) were assessed in this study. The total mean response rate was 62.3%. A factor analysis found that the questionnaire data grouped into three factors that accounted for 76.4% of the total variance. These three factors were labeled as professionalism, collaboration, and communication. Reliability analysis indicated that the full instrument scale had high internal consistency (Cronbach’s α 0.98). The generalizability coefficients for the surveys were estimated to be 0.78. Conclusion Based on our results and analysis, we conclude that the MSF tool we used on the interns rotating in their clerkship year within our Middle Eastern culture provides an effective method of evaluation because it offers a reliable, valid, and feasible process. PMID:26316836
A new multidimensional measure of African adolescents' perceptions of teachers' behaviors.
Mboya, M M
1994-04-01
The Perceived Teacher Behavior Inventory was designed to measure three dimensions of students' perceptions of the behaviors of their teachers. This research was conducted to assess the statistical validity and reliability of the instrument administered to 770 students attending two coeducational high schools in Cape Town, South Africa. Factor analysis clearly identified three subscales indicating that the instrument distinguished the students' perceptions of their teachers' behaviors in three areas. Estimates of internal consistency of the subscales were assessed using the squared multiple correlation as the index of reliability.
Internal consistency and stability of the CANTAB neuropsychological test battery in children.
Syväoja, Heidi J; Tammelin, Tuija H; Ahonen, Timo; Räsänen, Pekka; Tolvanen, Asko; Kankaanpää, Anna; Kantomaa, Marko T
2015-06-01
The Cambridge Neuropsychological Test Automated Battery (CANTAB) is a computer-assessed test battery widely used in different populations. The internal consistency and 1-year stability of CANTAB tests were examined in school-age children. Two hundred-thirty children (57% girls) from five schools in the Jyväskylä school district in Finland participated in the study in spring 2011. The children completed the following CANTAB tests: (a) visual memory (pattern recognition memory [PRM] and spatial recognition memory [SRM]), (b) executive function (spatial span [SSP], Stockings of Cambridge [SOC], and intra-extra dimensional set shift [IED]), and (c) attention (reaction time [RTI] and rapid visual information processing [RVP]). Seventy-four children participated in the follow-up measurements (64% girls) in spring 2012. Cronbach's alpha reliability coefficient was used to estimate the internal consistency of the nonhampering test, and structural equation models were applied to examine the stability of these tests. The reliability and the stability could not be determined for IED or SSP because of the nature of these tests. The internal consistency was acceptable only in the RTI task. The 1-year stability was moderate-to-good for the PRM, RTI, and RVP. The SSP and IED showed a moderate correlation between the two measurement points. The SRM and the SOC tasks were not reliable or stable measures in this study population. For research purposes, we recommend using structural equation modeling to improve reliability. The results suggest that the reliability and the stability of computer-based test batteries should be confirmed in the target population before using them for clinical or research purposes. (c) 2015 APA, all rights reserved.
Software Technology for Adaptable, Reliable Systems (STARS)
1994-03-25
Tmeline(3), SECOMO(3), SEER(3), GSFC Software Engineering Lab Model(l), SLIM(4), SEER-SEM(l), SPQR (2), PRICE-S(2), internally-developed models(3), APMSS(1...3 " Timeline - 3 " SASET (Software Architecture Sizing Estimating Tool) - 2 " MicroMan 11- 2 * LCM (Logistics Cost Model) - 2 * SPQR - 2 * PRICE-S - 2
ERIC Educational Resources Information Center
Pitts, Christine; Anderson, Ross; Haney, Michele
2018-01-01
The purpose of the current study was to estimate reliability, internal consistency and construct validity of the Measure of Instruction for Creative Engagement (MICE) instrument. The MICE uses an iterative process of evidence collection and scoring through teacher observations to determine instructional domain ratings and overall scores. The…
Reliability of self-rated tinnitus distress and association with psychological symptom patterns.
Hiller, W; Goebel, G; Rief, W
1994-05-01
Psychological complaints were investigated in two samples of 60 and 138 in-patients suffering from chronic tinnitus. We administered the Tinnitus Questionnaire (TQ), a 52-item self-rating scale which differentiates between dimensions of emotional and cognitive distress, intrusiveness, auditory perceptual difficulties, sleep disturbances and somatic complaints. The test-retest reliability was .94 for the TQ global score and between .86 and .93 for subscales. Three independent analyses were conducted to estimate the split-half reliability (internal consistency), which was only slightly lower than the test-retest values for scales with a relatively small number of items. Reliability was sufficient also on the level of single items. Low correlations between the TQ and the Hopkins Symptom Checklist (SCL-90-R) indicate a distinct quality of tinnitus-related and general psychological disturbances.
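The split-half reliability mentioned in the abstract above can be illustrated with a short sketch. It assumes an odd/even split of the items and the Spearman-Brown step-up, 2r/(1 + r), to project the half-test correlation to full length; the abstract does not say which split or correction the authors used, so both are assumptions, and the function names and data are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_reliability(scores):
    """Odd/even split-half reliability with Spearman-Brown correction.

    scores: list of respondents, each a list of item scores.
    The odd/even split is an assumption made for this illustration.
    """
    half_a = [sum(row[0::2]) for row in scores]  # items 1, 3, 5, ...
    half_b = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r = pearson_r(half_a, half_b)
    return 2 * r / (1 + r)  # Spearman-Brown step-up to full length


# Made-up responses: 4 respondents, 4 items.
example = [[1, 1, 2, 1], [2, 3, 2, 2], [3, 2, 4, 3], [4, 4, 4, 5]]
print(round(split_half_reliability(example), 3))
```

A different split of the same items will generally give a different estimate, which is why some authors average over splits or report the maximal value.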
Vandenplas, Jérémie; Colinet, Frederic G; Gengler, Nicolas
2014-09-30
A condition to predict unbiased estimated breeding values by best linear unbiased prediction is to use simultaneously all available data. However, this condition is not often fully met. For example, in dairy cattle, internal (i.e. local) populations lead to evaluations based only on internal records while widely used foreign sires have been selected using internally unavailable external records. In such cases, internal genetic evaluations may be less accurate and biased. Because external records are unavailable, methods were developed to combine external information that summarizes these records, i.e. external estimated breeding values and associated reliabilities, with internal records to improve accuracy of internal genetic evaluations. Two issues of these methods concern double-counting of contributions due to relationships and due to records. These issues could be worse if external information came from several evaluations, at least partially based on the same records, and combined into a single internal evaluation. Based on a Bayesian approach, the aim of this research was to develop a unified method to integrate and blend simultaneously several sources of information into an internal genetic evaluation by avoiding double-counting of contributions due to relationships and due to records. This research resulted in equations that integrate and blend simultaneously several sources of information and avoid double-counting of contributions due to relationships and due to records. The performance of the developed equations was evaluated using simulated and real datasets. The results showed that the developed equations integrated and blended several sources of information well into a genetic evaluation. The developed equations also avoided double-counting of contributions due to relationships and due to records. 
Furthermore, because all available external sources of information were correctly propagated, relatives of external animals benefited from the integrated information and, therefore, more reliable estimated breeding values were obtained. The proposed unified method integrated and blended several sources of information well into a genetic evaluation by avoiding double-counting of contributions due to relationships and due to records. The unified method can also be extended to other types of situations such as single-step genomic or multi-trait evaluations, combining information across different traits.
Reichmann, W M; Maillefert, J F; Hunter, D J; Katz, J N; Conaghan, P G; Losina, E
2011-05-01
The goal of this systematic review was to report the responsiveness to change and reliability of conventional radiographic joint space width (JSW) measurement. We searched the PubMed and Embase databases using the following search criteria: [osteoarthritis (OA) (MeSH)] AND (knee) AND (X-ray OR radiography OR diagnostic imaging OR radiology OR disease progression) AND (joint space OR JSW or disease progression). We assessed responsiveness by calculating the standardized response mean (SRM). We assessed reliability using intra- and inter-reader intra-class correlation (ICC) and coefficient of variation (CV). Random-effects models were used to pool results from multiple studies. Results were stratified by study duration, design, techniques of obtaining radiographs, and measurement method. We identified 998 articles using the search terms. Of these, 32 articles (43 estimates) reported data on responsiveness of JSW measurement and 24 articles (50 estimates) reported data on measures of reliability. The overall pooled SRM was 0.33 [95% confidence interval (CI): 0.26, 0.41]. Responsiveness of change in JSW measurement was improved substantially in studies of greater than 2 years duration (0.57). Further stratifying this result in studies of greater than 2 years duration, radiographs obtained with the knee in a flexed position yielded an SRM of 0.71. Pooled intra-reader ICC was estimated at 0.97 (95% CI: 0.92, 1.00) and the intra-reader CV estimated at 3.0 (95% CI: 2.0, 4.0). Pooled inter-reader ICC was estimated at 0.93 (95% CI: 0.86, 0.99) and the inter-reader CV estimated at 3.4% (95% CI: 1.3%, 5.5%). Measurement of JSW obtained from radiographs in persons with knee OA is reliable. These data will be useful to clinicians who are planning RCTs where the change in minimum JSW is the outcome of interest. Copyright © 2011 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
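The standardized response mean used in the review above is the mean change divided by the standard deviation of the change. A minimal sketch follows, assuming change is defined as baseline minus follow-up (so that joint space narrowing is positive) and that the sample standard deviation is used; neither convention is stated in the abstract, and the measurements are invented.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean(change) / SD(change), with change = baseline - follow_up.

    The sign convention and the use of the sample SD (n - 1) are
    assumptions made for this illustration.
    """
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Made-up joint space widths in mm for four knees.
jsw_baseline = [5.0, 4.8, 5.2, 4.9]
jsw_follow_up = [4.6, 4.7, 4.8, 4.8]
print(round(standardized_response_mean(jsw_baseline, jsw_follow_up), 2))
```

Because the denominator is the variability of the change itself, a small but very consistent change can yield a larger SRM than a large but erratic one.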
Lang, Jonas W B
2014-07-01
The measurement of implicit or unconscious motives using the picture story exercise (PSE) has long been a target of debate in the psychological literature. Most debates have centered on the apparent paradox that PSE measures of implicit motives typically show low internal consistency reliability on common indices like Cronbach's alpha but nevertheless predict behavioral outcomes. I describe a dynamic Thurstonian item response theory (IRT) model that builds on dynamic system theories of motivation, theorizing on the PSE response process, and recent advancements in Thurstonian IRT modeling of choice data. To assess the model's capability to explain the internal consistency paradox, I first fitted the model to archival data (Gurin, Veroff, & Feld, 1957) and then simulated data based on bias-corrected model estimates from the real data. Simulation results revealed that the average squared correlation reliability for the motives in the Thurstonian IRT model was .74 and that Cronbach's alpha values were similar to the real data (<.35). These findings suggest that PSE motive measures have long been reliable and increase the scientific value of extant evidence from motivational research using PSE motive measures. (c) 2014 APA, all rights reserved.
Weizman, Lior; Sira, Liat Ben; Joskowicz, Leo; Rubin, Daniel L.; Yeom, Kristen W.; Constantini, Shlomi; Shofty, Ben; Bashat, Dafna Ben
2014-01-01
Purpose: Tracking the progression of low grade tumors (LGTs) is a challenging task, due to their slow growth rate and associated complex internal tumor components, such as heterogeneous enhancement, hemorrhage, and cysts. In this paper, the authors show a semiautomatic method to reliably track the volume of LGTs and the evolution of their internal components in longitudinal MRI scans. Methods: The authors' method utilizes a spatiotemporal evolution modeling of the tumor and its internal components. Tumor components gray level parameters are estimated from the follow-up scan itself, obviating temporal normalization of gray levels. The tumor delineation procedure effectively incorporates internal classification of the baseline scan in the time-series as prior data to segment and classify a series of follow-up scans. The authors applied their method to 40 MRI scans of ten patients, acquired at two different institutions. Two types of LGTs were included: Optic pathway gliomas and thalamic astrocytomas. For each scan, a “gold standard” was obtained manually by experienced radiologists. The method is evaluated versus the gold standard with three measures: gross total volume error, total surface distance, and reliability of tracking tumor components evolution. Results: Compared to the gold standard the authors' method exhibits a mean Dice similarity volumetric measure of 86.58% and a mean surface distance error of 0.25 mm. In terms of its reliability in tracking the evolution of the internal components, the method exhibits strong positive correlation with the gold standard. Conclusions: The authors' method provides accurate and repeatable delineation of the tumor and its internal components, which is essential for therapy assessment of LGTs. Reliable tracking of internal tumor components over time is novel and potentially will be useful to streamline and improve follow-up of brain tumors, with indolent growth and behavior. PMID:24784396
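The Dice similarity measure reported in the abstract above compares an automatic segmentation with a manual one as 2|A ∩ B| / (|A| + |B|). The sketch below represents each segmentation as a set of voxel coordinates; that representation, the function name, and the tiny example are assumptions for illustration and say nothing about how the authors implemented their evaluation.

```python
def dice_coefficient(seg_a, seg_b):
    """Dice overlap between two segmentations given as iterables of voxels.

    Returns 1.0 when both segmentations are empty (a convention chosen
    here to avoid division by zero).
    """
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Made-up voxel sets: three of four voxels overlap.
automatic = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
manual = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
print(dice_coefficient(automatic, manual))  # 0.75
```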
Fetz, Katharina; Wenzel-Meyburg, Ursula; Schulz-Quach, Christian
2017-12-28
The evaluation of the effectiveness of undergraduate palliative care education (UPCE) programs is an essential foundation to providing high-quality UPCE programs. Therefore, the implementation of valid evaluation tools is indispensable. Until today, there has been no general consensus regarding concrete outcome parameters and their accurate measurement. The Program in Palliative Care Education and Practice Questionnaire (German Revised Version; PCEP-GR) is a promising assessment tool for UPCE. The aim of the current study was to evaluate the psychometric properties of PCEP-GR and to demonstrate its feasibility for the evaluation of UPCE programs. The practical feasibility of the PCEP-GR and its acceptance in medical students were investigated in a pilot study with 24 undergraduate medical students at Heinrich Heine University Dusseldorf, Germany. Subsequently, the PCEP-GR was surveyed in a representative sample (N = 680) of medical students in order to investigate its psychometric properties. Factorial validity was investigated by means of principal component analysis (PCA). Reliability was examined by means of split-half-reliability analysis and analysis of internal consistency. After taking into consideration the PCA and distribution analysis results, an evaluation instruction for the PCEP-GR was developed. The PCEP-GR proved to be feasible and well-accepted in medical students. PCA revealed a four-factorial solution indicating four PCEP-GR subscales: preparation to provide palliative care, attitudes towards palliative care, self-estimation of competence in communication with dying patients and their relatives and self-estimation of knowledge and skills in palliative care. The PCEP-GR showed good split-half-reliability and acceptable to good internal consistency of subscales. Attitudes towards palliative care slightly missed the criterion of acceptable internal consistency. The evaluation instruction suggests a global PCEP-GR index and four subscales. 
The PCEP-GR has proven to be a feasible, economic, valid and reliable tool for the assessment of UPCE that comprises self-efficacy expectation and relevant attitudes towards palliative care.
ERIC Educational Resources Information Center
Laux, John M.; Perera-Diltz, Dilani; Smirnoff, Jennifer B.; Salyers, Kathleen M.
2005-01-01
The authors investigated the psychometric capabilities of the Face Valid Other Drugs (FVOD) scale of the Substance Abuse Subtle Screening Inventory-3 (SASSI-3; G. A. Miller, 1999). Internal consistency reliability estimates and construct validity factor analysis for 230 college students provided initial support for the psychometric properties of…
Bazzo, Stefania; Battistella, Giuseppe; Riscica, Patrizia; Moino, Giuliana; Dal Pozzo, Giuseppe; Bottarel, Mery; Geromel, Mariasole; Czerwinsky, Loredana
2015-01-01
Alcohol consumption during pregnancy can result in a range of harmful effects on the developing foetus and newborn, called Fetal Alcohol Spectrum Disorders (FASD). The identification of pregnant women who use alcohol makes it possible to provide information, support and treatment for women and the surveillance of their children. The AUDIT-C (the shortened consumption version of the Alcohol Use Disorders Identification Test) is used for investigating risky drinking with different populations, and has been applied to estimate alcohol use and risky drinking also in antenatal clinics. The aim of the study was to investigate the reliability of a self-report Italian version of the AUDIT-C questionnaire to detect alcohol consumption during pregnancy, regardless of its use as a screening tool. The questionnaire was filled in by two independent consecutive series of pregnant women at the 38th gestation week visit in the two birth locations of the Local Health Authority of Treviso (Italy), during the years 2010 and 2011 (n=220 and n=239). Reliability analysis was performed using internal consistency, item-total score correlations, and inter-item correlations. The "discriminatory power" of the test was also evaluated. Results. Overall, about one third of women recalled alcohol consumption at least once during the current pregnancy. The questionnaire had an internal consistency of 0.565 for the group of the year 2010, of 0.516 for the year 2011, and of 0.542 for the overall group. The highest item-total correlation coefficient was 0.687 and the highest inter-item correlation coefficient was 0.675. As for the discriminatory power of the questionnaire, the highest Ferguson's delta coefficient was 0.623. These findings suggest that the Italian self-report version of the AUDIT-C possesses unsatisfactory reliability to estimate alcohol consumption during pregnancy when used as a self-report questionnaire in an obstetric setting.
Estimating the Reliability of a Soyuz Spacecraft Mission
NASA Technical Reports Server (NTRS)
Lutomski, Michael G.; Farnham, Steven J., II; Grant, Warren C.
2010-01-01
Once the US Space Shuttle retires in 2010, the Russian Soyuz Launcher and Soyuz Spacecraft will comprise the only means for crew transportation to and from the International Space Station (ISS). The U.S. Government and NASA have contracted for crew transportation services to the ISS with Russia. The resulting implications for the US space program including issues such as astronaut safety must be carefully considered. Are the astronauts and cosmonauts safer on the Soyuz than the Space Shuttle system? Is the Soyuz launch system more robust than the Space Shuttle? Is it safer to continue to fly the 30 year old Shuttle fleet for crew transportation and cargo resupply than the Soyuz? Should we extend the life of the Shuttle Program? How does the development of the Orion/Ares crew transportation system affect these decisions? The Soyuz launcher has been in operation for over 40 years. There have been only two loss of life incidents and two loss of mission incidents. Given that the most recent incident took place in 1983, how do we determine current reliability of the system? Do failures of unmanned Soyuz rockets impact the reliability of the currently operational man-rated launcher? Does the Soyuz exhibit characteristics that demonstrate reliability growth and how would that be reflected in future estimates of success? NASA's next manned rocket and spacecraft development project is currently underway. Though the project's ultimate goal is to return to the Moon and then to Mars, the launch vehicle and spacecraft's first mission will be for crew transportation to and from the ISS. The reliability targets are currently several times higher than the Shuttle and possibly even the Soyuz. Can these targets be compared to the reliability of the Soyuz to determine whether they are realistic and achievable?
To help answer these questions, this paper will explore how to estimate the reliability of the Soyuz Launcher/Spacecraft system, compare it to the Space Shuttle, and consider its potential impacts on the future of manned spaceflight. Specifically, it will look at estimating the Loss of Mission (LOM) probability using historical data, reliability growth, and Probabilistic Risk Assessment techniques.
Assessing the psychometric properties of two food addiction scales.
Lemeshow, Adina R; Gearhardt, Ashley N; Genkinger, Jeanine M; Corbin, William R
2016-12-01
While food addiction is well accepted in popular culture and mainstream media, its scientific validity as an addictive behavior is still under investigation. This study evaluated the reliability and validity of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale using data from two community-based convenience samples. We assessed the internal and test-retest reliability of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale, and estimated the sensitivity and negative predictive value of the Modified Yale Food Addiction Scale using the Yale Food Addiction Scale as the benchmark. We calculated Cronbach's alphas and 95% confidence intervals (CIs) for internal reliability and Cohen's Kappa coefficients and 95% CIs for test-retest reliability. Internal consistency (n=232) was marginal to good, ranging from α=0.63 to 0.84. The test-retest reliability (n=45) for food addiction diagnosis was substantial, with Kappa=0.73 (95% CI, 0.48-0.88) (Yale Food Addiction Scale) and 0.79 (95% CI, 0.66-1.00) (Modified Yale Food Addiction Scale). Sensitivity and negative predictive value for classifying food addiction status were excellent: compared to the Yale Food Addiction Scale, the Modified Yale Food Addiction Scale's sensitivity was 92.3% (95% CI, 64%-99.8%), and the negative predictive value was 99.5% (95% CI, 97.5%-100%). Our analyses suggest that the Modified Yale Food Addiction Scale may be an appropriate substitute for the Yale Food Addiction Scale when a brief measure is needed, and support the continued use of both scales to investigate food addiction. Copyright © 2016 Elsevier Ltd. All rights reserved.
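The sensitivity and negative predictive value reported above treat one scale as the benchmark and the other as the test. A minimal sketch of both quantities from paired binary classifications follows; the function name and the ten made-up cases are hypothetical and are not the study's data.

```python
def sensitivity_and_npv(reference, test):
    """Sensitivity and negative predictive value of `test` against `reference`.

    Both arguments are equal-length sequences of truthy/falsy classifications,
    one entry per participant.
    """
    pairs = list(zip(reference, test))
    tp = sum(1 for r, t in pairs if r and t)          # true positives
    fn = sum(1 for r, t in pairs if r and not t)      # false negatives
    tn = sum(1 for r, t in pairs if not r and not t)  # true negatives
    sensitivity = tp / (tp + fn)  # share of benchmark positives detected
    npv = tn / (tn + fn)          # share of test negatives that are correct
    return sensitivity, npv


# Made-up classifications for ten participants.
benchmark = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
brief_scale = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(sensitivity_and_npv(benchmark, brief_scale))
```

When the condition is rare, a high negative predictive value is expected almost regardless of the test, so it is usually read alongside sensitivity rather than on its own.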
Is Coefficient Alpha Robust to Non-Normal Data?
Sheng, Yanyan; Sheng, Zhaohui
2011-01-01
Coefficient alpha has been a widely used measure by which internal consistency reliability is assessed. In addition to essential tau-equivalence and uncorrelated errors, normality has been noted as another important assumption for alpha. Earlier work on evaluating this assumption considered either exclusively non-normal error score distributions, or limited conditions. In view of this and the availability of advanced methods for generating univariate non-normal data, Monte Carlo simulations were conducted to show that non-normal distributions for true or error scores do create problems for using alpha to estimate the internal consistency reliability. The sample coefficient alpha is affected by leptokurtic true score distributions, or skewed and/or kurtotic error score distributions. Increased sample sizes, not test lengths, help improve the accuracy, bias, or precision of using it with non-normal data. PMID:22363306
Schäfer, Axel; Lüdtke, Kerstin; Breuel, Franziska; Gerloff, Nikolas; Knust, Maren; Kollitsch, Christian; Laukart, Alex; Matej, Laura; Müller, Antje; Schöttker-Königer, Thomas; Hall, Toby
2018-08-01
Headache is a common and costly health problem. Although pathogenesis of headache is heterogeneous, one reported contributing factor is dysfunction of the upper cervical spine. The flexion rotation test (FRT) is a commonly used diagnostic test to detect upper cervical movement impairment. The aim of this cross-sectional study was to investigate concurrent validity of detecting high cervical ROM impairment during the FRT by comparing measurements established by an ultrasound-based system (gold standard) with eyeball estimation. A secondary aim was to investigate intra-rater reliability of FRT ROM eyeball estimation. The examiner (6 years' experience) was blinded to the data from the ultrasound-based device and to the symptoms of the patients. The FRT result (positive or negative) was based on visual estimation of range of rotation less than 34° to either side. Concurrently, range of rotation was evaluated using the ultrasound-based device. A total of 43 subjects with headache (79% female), mean age of 35.05 years (SD 13.26), were included. According to the International Headache Society Classification, 23 subjects had migraine, 4 tension type headache, and 16 multiple headache forms. Sensitivity and specificity were 0.96 and 0.89 for combined rotation, indicating good concurrent reliability. The area under the ROC curve was 0.95 (95% CI 0.91-0.98) for rotation to both sides. Intra-rater reliability for eyeball estimation was excellent with Fleiss Kappa 0.79 for right rotation and left rotation. The results of this study indicate that the FRT is a valid and reliable test to detect impairment of upper cervical ROM in patients with headache.
de Moraes, Suzana Alves; Suzuki, Cláudio Shigueki; de Freitas, Isabel Cristina Martins
2013-01-01
The study aims to evaluate the reproducibility between the International Physical Activity Questionnaire and the American College of Sports Medicine/American Heart Association criteria to classify the physical activity profile in an adult population living in Ribeirão Preto, SP, Brazil. This was a population-based cross-sectional study including 930 adults of both genders. The reliability was evaluated by Kappa statistics, estimated according to socio-demographic strata. The kappa estimates showed good agreement between the two criteria in all strata. However, a higher prevalence of "actives" was found by using the American College of Sports Medicine/American Heart Association criteria. Although the estimates indicated good agreement, the findings suggest caution in choosing the criteria to classify physical activity profile, mainly when "walking" is the main modality of physical activity.
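The abstract does not say which kappa variant was used; assuming the two-classification case, Cohen's kappa compares observed agreement with the agreement expected by chance from the marginal frequencies. The sketch below uses hypothetical labels and a hypothetical function name.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two classifications of the same subjects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed proportion of subjects classified identically.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the product of marginal proportions.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)
```

Computing the statistic separately within each socio-demographic stratum, as the study describes, would amount to calling the function once per subgroup.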
Wahl, Simone; Boulesteix, Anne-Laure; Zierer, Astrid; Thorand, Barbara; van de Wiel, Mark A
2016-10-26
Missing values are a frequent issue in human studies. In many situations, multiple imputation (MI) is an appropriate missing data handling strategy, whereby missing values are imputed multiple times, the analysis is performed in every imputed data set, and the obtained estimates are pooled. If the aim is to estimate (added) predictive performance measures, such as (change in) the area under the receiver-operating characteristic curve (AUC), internal validation strategies become desirable in order to correct for optimism. It is not fully understood how internal validation should be combined with multiple imputation. In a comprehensive simulation study and in a real data set based on blood markers as predictors for mortality, we compare three combination strategies: Val-MI (internal validation followed by MI on the training and test parts separately); MI-Val (MI on the full data set followed by internal validation); and MI(-y)-Val (MI on the full data set omitting the outcome, followed by internal validation). Different validation strategies, including bootstrap and cross-validation, different (added) performance measures, and various data characteristics are considered, and the strategies are evaluated with regard to bias and mean squared error of the obtained performance estimates. In addition, we elaborate on the number of resamples and imputations to be used, and adapt a strategy for confidence interval construction to incomplete data. Internal validation is essential in order to avoid optimism, with the bootstrap 0.632+ estimate representing a reliable method to correct for optimism. While estimates obtained by MI-Val are optimistically biased, those obtained by MI(-y)-Val tend to be pessimistic in the presence of a true underlying effect. Val-MI provides largely unbiased estimates, with a slight pessimistic bias with increasing true effect size, number of covariates and decreasing sample size. 
In Val-MI, accuracy of the estimate is more strongly improved by increasing the number of bootstrap draws rather than the number of imputations. With a simple integrated approach, valid confidence intervals for performance estimates can be obtained. When prognostic models are developed on incomplete data, Val-MI represents a valid strategy to obtain estimates of predictive performance measures.
Implementing the undergraduate mini-CEX: a tailored approach at Southampton University.
Hill, Faith; Kendall, Kathleen; Galbraith, Kevin; Crossley, Jim
2009-04-01
The mini-clinical evaluation exercise (mini-CEX) is widely used in the UK to assess clinical competence, but there is little evidence regarding its implementation in the undergraduate setting. This study aimed to estimate the validity and reliability of the undergraduate mini-CEX and discuss the challenges involved in its implementation. A total of 3499 mini-CEX forms were completed. Validity was assessed by estimating associations between mini-CEX score and a number of external variables, examining the internal structure of the instrument, checking competency domain response rates and profiles against expectations, and by qualitative evaluation of stakeholder interviews. Reliability was evaluated by overall reliability coefficient (R), estimation of the standard error of measurement (SEM), and from stakeholders' perceptions. Variance component analysis examined the contribution of relevant factors to students' scores. Validity was threatened by various confounding variables, including: examiner status; case complexity; attachment specialty; patient gender, and case focus. Factor analysis suggested that competency domains reflect a single latent variable. Maximum reliability can be achieved by aggregating scores over 15 encounters (R = 0.73; 95% confidence interval [CI] +/- 0.28 based on a 6-point assessment scale). Examiner stringency contributed 29% of score variation and student attachment aptitude 13%. Stakeholder interviews revealed staff development needs but the majority perceived the mini-CEX as more reliable and valid than the previous long case. The mini-CEX has good overall utility for assessing aspects of the clinical encounter in an undergraduate setting. Strengths include fidelity, wide sampling, perceived validity, and formative observation and feedback. Reliability is limited by variable examiner stringency, and validity by confounding variables, but these should be viewed within the context of overall assessment strategies.
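The study derived its reliability figures from variance component analysis. As a rough classical approximation only (an assumption of this sketch, not the authors' method), the Spearman-Brown formula shows how reliability grows when scores are averaged over more encounters, and its inverse recovers the single-encounter reliability implied by an aggregate value.

```python
def spearman_brown(single_reliability, n):
    """Projected reliability of the mean of n parallel measurements."""
    r = single_reliability
    return n * r / (1 + (n - 1) * r)


def implied_single_reliability(aggregate_reliability, n):
    """Invert Spearman-Brown: single-measurement reliability implied
    by the reliability of an n-measurement aggregate."""
    big_r = aggregate_reliability
    return big_r / (n - (n - 1) * big_r)
```

The two functions are inverses of each other, so `spearman_brown(implied_single_reliability(R, n), n)` returns `R`. The approximation assumes encounters behave as parallel measurements, which a design with varying examiner stringency may not satisfy.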
Park, Yoon Soo; Lineberry, Matthew; Hyderi, Abbas; Bordage, Georges; Xing, Kuan; Yudkowsky, Rachel
2016-11-01
Medical schools administer locally developed graduation competency examinations (GCEs) following the structure of the United States Medical Licensing Examination Step 2 Clinical Skills that combine standardized patient (SP)-based physical examination and the patient note (PN) to create integrated clinical encounter (ICE) scores. This study examines how different subcomponent scoring weights in a locally developed GCE affect composite score reliability and pass-fail decisions for ICE scores, contributing to internal structure and consequential validity evidence. Data from two M4 cohorts (2014: n = 177; 2015: n = 182) were used. The reliability of SP encounter (history taking and physical examination), PN, and communication and interpersonal skills scores were estimated with generalizability studies. Composite score reliability was estimated for varying weight combinations. Faculty were surveyed for preferred weights on the SP encounter and PN scores. Composite scores based on Kane's method were compared with weighted mean scores. Faculty suggested weighting PNs higher (60%-70%) than the SP encounter scores (30%-40%). Statistically, composite score reliability was maximized when PN scores were weighted at 40% to 50%. Composite score reliability of ICE scores increased by up to 0.20 points when SP-history taking (SP-Hx) scores were included; excluding SP-Hx only increased composite score reliability by 0.09 points. Classification accuracy for pass-fail decisions between composite and weighted mean scores was 0.77; misclassification was < 5%. Medical schools and certification agencies should consider implications of assigning weights with respect to composite score reliability and consequences on pass-fail decisions.
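To illustrate why composite reliability depends on the subcomponent weights, here is a sketch of the classical formula for the reliability of a weighted composite (often attributed to Mosier), which assumes uncorrelated measurement errors across components. It is offered as an illustration under that assumption; the study itself used generalizability studies and Kane's method, and all names and inputs below are hypothetical.

```python
def weighted_composite_reliability(weights, sds, reliabilities, corr):
    """Reliability of a weighted sum of component scores.

    weights:       weight applied to each component
    sds:           standard deviation of each component
    reliabilities: reliability of each component
    corr:          matrix of observed correlations between components
    """
    n = len(weights)
    # Variance of the weighted composite.
    composite_var = sum(
        weights[i] * weights[j] * sds[i] * sds[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    # Error variance contributed by each component (errors assumed uncorrelated).
    error_var = sum(
        (weights[i] * sds[i]) ** 2 * (1 - reliabilities[i]) for i in range(n)
    )
    return 1 - error_var / composite_var
```

Evaluating the function over a grid of weight pairs that sum to 1 is one way to see where the composite reliability peaks for a given set of component statistics.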
Ragagnin, Marilia Nagata; Gorman, Daniel; McCarthy, Ian Donald; Sant'Anna, Bruno Sampaio; de Castro, Cláudio Campi; Turra, Alexander
2018-01-11
Obtaining accurate and reproducible estimates of internal shell volume is a vital requirement for studies into the ecology of a range of shell-occupying organisms, including hermit crabs. Shell internal volume is usually estimated by filling the shell cavity with water or sand, however, there has been no systematic assessment of the reliability of these methods and moreover no comparison with modern alternatives, e.g., computed tomography (CT). This study undertakes the first assessment of the measurement reproducibility of three contrasting approaches across a spectrum of shell architectures and sizes. While our results suggested a certain level of variability inherent for all methods, we conclude that a single measure using sand/water is likely to be sufficient for the majority of studies. However, care must be taken as precision may decline with increasing shell size and structural complexity. CT provided less variation between repeat measures but volume estimates were consistently lower compared to sand/water and will need methodological improvements before it can be used as an alternative. CT indicated volume may be also underestimated using sand/water due to the presence of air spaces visible in filled shells scanned by CT. Lastly, we encourage authors to clearly describe how volume estimates were obtained.
Estimating the risk of a scuba diving fatality in Australia.
Lippmann, John; Stevenson, Christopher; McD Taylor, David; Williams, Jo
2016-12-01
There are few data available on which to estimate the risk of death for Australian divers. This report estimates the risk of a scuba diving fatality for Australian residents, international tourists diving in Queensland, and clients of a large Victorian dive operator. Numerators for the estimates were obtained from the Divers Alert Network Asia-Pacific dive fatality database. Denominators were derived from three sources: Participation in Exercise, Recreation and Sport Surveys, 2001-2010 (Australian resident diving activity data); Tourism Research Australia surveys of international visitors to Queensland 2006-2014; and a dive operator in Victoria 2007-2014. Annual fatality rates (AFR) and 95% confidence intervals (95% CI) were calculated using an exact binomial test. Estimated AFRs were: 0.48 (0.37-0.59) deaths per 100,000 dives, or 8.73 (6.85-10.96) deaths per 100,000 divers for Australian residents; 0.12 (0.05-0.25) deaths per 100,000 dives, or 0.46 (0.20-0.91) deaths per 100,000 divers for international visitors to Queensland; and 1.64 (0.20-5.93) deaths per 100,000 dives for the dive operator in Victoria. On a per diver basis, Australian residents are estimated to be almost twenty times more likely to die whilst scuba diving than are international visitors to Queensland, although the difference is less than fourfold on a per dive basis. On a per dive basis, divers in Victoria are fourteen times more likely to die than are Queensland international tourists. Although some of the estimates are based on potentially unreliable denominator data extrapolated from surveys, the diving fatality rates in Australia appear to vary by State, being considerably lower in Queensland than in Victoria. These estimates are similar to or lower than comparable overseas estimates, although reliability of all such measurements varies with study size and accuracy of the data available.
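The abstract reports rates with exact binomial confidence intervals. One common exact interval is the Clopper-Pearson interval; the sketch below assumes that is the kind of interval meant and finds the bounds by bisection on the binomial distribution function, using only the standard library. The function names and the counts in the usage note are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def _solve_decreasing(f, target, iters=200):
    """Bisection on [0, 1] for f(p) == target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so p must increase
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion."""
    # Lower bound solves P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2
    )
    # Upper bound solves P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2
    )
    return lower, upper
```

To express a result per 100,000 dives, multiply the point estimate and both bounds by 100,000; for example, with hypothetical counts, `[100_000 * b for b in clopper_pearson(5, 1_000_000)]`.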
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yao-Feng, E-mail: yfchang@utexas.edu; Zhou, Fei; Chen, Ying-Chen
2016-01-18
Self-compliance characteristics and reliability optimization are investigated in intrinsic unipolar silicon oxide (SiOx)-based resistive switching (RS) memory using TiW/SiOx/TiW device structures. The program window (difference between SET voltage and RESET voltage) is dependent on external series resistance, demonstrating that the SET process is due to a voltage-triggered mechanism. The program window has been optimized for program/erase disturbance immunity and reliability for circuit-level applications. The SET and RESET transitions have also been characterized using a dynamic conductivity method, which distinguishes the self-compliance behavior due to an internal series resistance effect (filament) in SiOx-based RS memory. By using a conceptual “filament/resistive gap (GAP)” model of the conductive filament and a proton exchange model with appropriate assumptions, the internal filament resistance and GAP resistance can be estimated for high- and low-resistance states (HRS and LRS), and are found to be independent of external series resistance. Our experimental results not only provide insights into potential reliability issues but also help to clarify the switching mechanisms and device operating characteristics of SiOx-based RS memory.
Koho, P; Aho, S; Kautiainen, H; Pohjolainen, T; Hurri, H
2014-12-01
To estimate the internal consistency, test-retest reliability and comparability of paper and computer versions of the Finnish version of the Tampa Scale of Kinesiophobia (TSK-FIN) among patients with chronic pain. In addition, patients' personal experiences of completing both versions of the TSK-FIN and preferences between these two methods of data collection were studied. Test-retest reliability study. Paper and computer versions of the TSK-FIN were completed twice on two consecutive days. The sample comprised 94 consecutive patients with chronic musculoskeletal pain participating in a pain management or individual rehabilitation programme. The group rehabilitation design consisted of physical and functional exercises, evaluation of the social situation, psychological assessment of pain-related stress factors, and personal pain management training in order to regain overall function and mitigate the inconvenience of pain and fear-avoidance behaviour. The mean TSK-FIN score was 37.1 [standard deviation (SD) 8.1] for the computer version and 35.3 (SD 7.9) for the paper version. The mean difference between the two versions was 1.9 (95% confidence interval 0.8 to 2.9). Test-retest reliability was 0.89 for the paper version and 0.88 for the computer version. Internal consistency was considered to be good for both versions. The intraclass correlation coefficient for comparability was 0.77 (95% confidence interval 0.66 to 0.85), indicating substantial reliability between the two methods. Both versions of the TSK-FIN demonstrated substantial intertest reliability, good test-retest reliability, good internal consistency and acceptable limits of agreement, suggesting their suitability for clinical use. However, subjects tended to score higher when using the computer version. As such, in an ideal situation, data should be collected in a similar manner throughout the course of rehabilitation or clinical research. Copyright © 2014 Chartered Society of Physiotherapy. 
Published by Elsevier Ltd. All rights reserved.
Test-Retest Analyses of the Test of English as a Foreign Language. TOEFL Research Reports Report 45.
ERIC Educational Resources Information Center
Henning, Grant
This study provides information about the total and component scores of the Test of English as a Foreign Language (TOEFL). First, the study provides comparative global and component estimates of test-retest, alternate-form, and internal-consistency reliability, controlling for sources of measurement error inherent in the examinees and the testing…
Psychometric evaluation of the Nursing Stress Scale (NSS) among Chinese nurses in Taiwan.
Lee, Mei-Hua; Holzemer, William L; Faucett, Julia
2007-01-01
The purpose of this study was to translate the Nursing Stress Scale (NSS) into Chinese and test its reliability and validity among Chinese nurses in Taiwan. Potential participants were asked to self-administer a Chinese version of the NSS. The agreement estimation was used to determine the equivalence of the meaning between the Chinese and original English versions and was rated by five bilingual nurses as 92% accurate for the 34 items. The test-retest reliability for the NSS at 2 weeks was .71 (p = .022, n=10). Internal consistency reliability and factor analysis were tested with 770 nurses from 65 inpatient units at a medical center in Taiwan. The internal consistency of the Chinese version of the NSS for an overall coefficient alpha is .91 for the total scale, and ranges from .67 to .79 for the subscales. The Chinese version of the NSS explains 53.77% of the variance in work stressors among Chinese nurses in Taiwan. Overall, the Chinese version of the NSS is internally consistent but may not be stable over 2 weeks. There was adequate evidence of the reliability and validity of the NSS-Chinese as an instrument appropriate to measure work stress among Chinese nurses. The translated NSS could be a useful tool for examining the frequency and major sources of stress experienced by Chinese nurses in hospital settings, and for the development of appropriate interventions for stress reduction.
Reddy, Linda A; Dudek, Christopher M; Fabiano, Gregory A; Peters, Stephanie
2015-12-01
This article presents information about the construct validity and reliability of a new teacher self-report measure of classroom instructional and behavioral practices (the Classroom Strategies Scales-Teacher Form; CSS-T). The theoretical underpinnings and empirical basis for the instructional and behavioral management scales are presented. Information is provided about the construct validity, internal consistency, test-retest reliability, and freedom from item-bias of the scales. Given previous investigations with the CSS Observer Form, it was hypothesized that internal consistency would be adequate and that confirmatory factor analyses (CFA) of CSS-T data from 293 classrooms would offer empirical support for the CSS-T's Total, Composite and subscales, and yield a similar factor structure to that of the CSS Observer Form. Goodness-of-fit indices of χ2/df, Root Mean Square Error of Approximation, Goodness of Fit Index, and Adjusted Goodness of Fit Index suggested satisfactory fit of proposed CFA models whereas the Comparative Fit Index did not. Internal consistency estimates of .93 and .94 were obtained for the Instructional Strategies and Behavioral Strategies Total scales respectively. Adequate test-retest reliability was found for instructional and behavioral total scales (r = .79, r = .84, percent agreement 93% and 93%). The CSS-T evidences freedom from item bias on important teacher demographics (age, educational degree, and years of teaching experience). Implications of results are discussed. (c) 2015 APA, all rights reserved).
Baker, Richard S; Bazargan, Mohsen; Calderón, José L; Hays, Ron D
2006-08-01
To compare the psychometric performance of Spanish versions of the 25-item National Eye Institute Visual Function Questionnaire (NEI VFQ-25) and the NEI VFQ-39 administered to Latino patients with the psychometric performance of the standard English NEI VFQ-25 and NEI VFQ-39 administered to non-Latino patients. Clinic-based cross-sectional survey. Four hundred three patients (160 Latinos and 243 non-Latinos) recruited from general ophthalmology clinics of an urban public hospital over a 6-month period. Structured face-to-face interviews were conducted in Spanish and English to collect data for the NEI VFQ-25 and NEI VFQ-39. We calculated the mean, standard deviation, and percentage of participants having the minimum (floor) and maximum (ceiling) possible score for each item and scale. Internal consistency reliability of the NEI VFQ-25 and NEI VFQ-39 was estimated using the Cronbach alpha and average inter-item correlation. Construct validity for the instruments was assessed by comparing scores for participants classified as having normal versus impaired visual acuity. Instrument scales for general health; general vision; ocular pain; near activities; distance activities; vision-specific social functioning, mental health, role difficulties, and dependency; driving; color vision; and peripheral vision. Internal consistency reliability was significantly lower in the Spanish version than in the English version for 3 scales of the NEI VFQ-25. More importantly, 3 scales in the Spanish version manifested inadequate reliability (alpha ≤ 0.70), compared with only 1 inadequately reliable subscale in the English version. Reliability coefficients associated with the Spanish NEI VFQ-39 scales exceeded commonly accepted minimum standards. Comparison of reliability coefficients between Latino and non-Latino subgroups demonstrated statistically significant differences for 4 scales: Ocular Pain, Mental Health, Role Difficulties, and Dependency. 
In each case, the Latino group had the lower internal consistency reliability. However, only for the Ocular Pain subscale was reliability both significantly lower and inadequate (alpha<0.70). Overall performance of the NEI VFQ in Latino populations is adequate. However, in the absence of modifications to improve the reliability of specific Spanish version subscales, comparisons between Latino and non-Latino subgroups using the NEI VFQ must be interpreted with appropriate caution.
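The floor and ceiling percentages mentioned in this abstract are simple proportions of respondents at the scale extremes. A minimal sketch follows; the function name and the 0 to 100 scoring range used in the usage note are assumptions for illustration.

```python
def floor_ceiling_percent(scores, min_score, max_score):
    """Percentage of respondents at the minimum (floor) and
    maximum (ceiling) possible score of a scale."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    floor_pct = 100.0 * sum(s == min_score for s in scores) / n
    ceiling_pct = 100.0 * sum(s == max_score for s in scores) / n
    return floor_pct, ceiling_pct
```

For a hypothetical scale scored from 0 to 100, `floor_ceiling_percent([0, 50, 100, 100], 0, 100)` returns `(25.0, 50.0)`. Large ceiling percentages reduce score variance and can depress internal consistency estimates.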
Methods Used to Streamline the CAHPS® Hospital Survey
Keller, San; O'Malley, A James; Hays, Ron D; Matthew, Rebecca A; Zaslavsky, Alan M; Hepner, Kimberly A; Cleary, Paul D
2005-01-01
Objective To identify a parsimonious subset of reliable, valid, and consumer-salient items from 33 questions asking for patient reports about hospital care quality. Data Source CAHPS® Hospital Survey pilot data were collected during the summer of 2003 using mail and telephone from 19,720 patients who had been treated in 132 hospitals in three states and discharged from November 2002 to January 2003. Methods Standard psychometric methods were used to assess the reliability (internal consistency reliability and hospital-level reliability) and construct validity (exploratory and confirmatory factor analyses, strength of relationship to overall rating of hospital) of the 33 report items. The best subset of items from among the 33 was selected based on their statistical properties in conjunction with the importance assigned to each item by participants in 14 focus groups. Principal Findings Confirmatory factor analysis (CFA) indicated that a subset of 16 questions proposed to measure seven aspects of hospital care (communication with nurses, communication with doctors, responsiveness to patient needs, physical environment, pain control, communication about medication, and discharge information) demonstrated excellent fit to the data. Scales in each of these areas had acceptable levels of reliability to discriminate among hospitals and internal consistency reliability estimates comparable with previously developed CAHPS instruments. Conclusion Although half the length of the original, the shorter CAHPS hospital survey demonstrates promising measurement properties, identifies variations in care among hospitals, and deals with aspects of the hospital stay that are important to patients' evaluations of care quality. PMID:16316438
Nichols, John; Fay, Kellie; Bernhard, Mary Jo; Bischof, Ina; Davis, John; Halder, Marlies; Hu, Jing; Johanning, Karla; Laue, Heike; Nabb, Diane; Schlechtriem, Christian; Segner, Helmut; Swintek, Joe; Weeks, John; Embry, Michelle
2018-05-14
In vitro assays are widely employed to obtain intrinsic clearance estimates used in toxicokinetic modeling efforts. However, the reliability of these methods is seldom reported. Here we describe the results of an international ring trial designed to evaluate two in vitro assays used to measure intrinsic clearance in rainbow trout. An important application of these assays is to predict the effect of biotransformation on chemical bioaccumulation. Six laboratories performed substrate depletion experiments with cyclohexyl salicylate, fenthion, 4-n-nonylphenol, deltamethrin, methoxychlor, and pyrene using cryopreserved hepatocytes and liver S9 fractions from trout. Variability within and among laboratories was characterized as the percent coefficient of variation (CV) in measured in vitro intrinsic clearance rates (CL_IN VITRO,INT; ml/h/mg protein or 10^6 cells) for each chemical and test system. Mean intra-laboratory CVs for each test chemical averaged 18.9% for hepatocytes and 14.1% for S9 fractions, while inter-laboratory CVs (all chemicals and all tests) averaged 30.1% for hepatocytes and 22.4% for S9 fractions. When CL_IN VITRO,INT values were extrapolated to in vivo intrinsic clearance estimates (CL_IN VIVO,INT; L/d/kg fish), both assays yielded similar levels of activity (< 4-fold difference for all chemicals). Hepatic clearance rates (CL_H; L/d/kg fish) calculated using data from both assays exhibited even better agreement. These findings show that both assays are highly reliable and suggest that either may be used to inform chemical bioaccumulation assessments for fish. This study highlights several issues related to the demonstration of assay reliability and may provide a template for evaluating other in vitro biotransformation assays.
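The percent coefficient of variation used here is the sample standard deviation expressed as a percentage of the mean. A minimal sketch, with a hypothetical function name and made-up clearance values in the usage note:

```python
from statistics import mean, stdev


def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * stdev(values) / m
```

For three hypothetical replicate clearance rates of 8, 10, and 12 (same units), `percent_cv([8, 10, 12])` is 20%. Applying it to replicate runs within one laboratory gives an intra-laboratory CV; applying it to the laboratory means gives an inter-laboratory CV.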
Tang, D Y Y; Liu, A C Y; Leung, M H T; Siu, B W M
2013-06-01
OBJECTIVE. Antisocial personality disorder (ASPD) is a risk factor for violence and is associated with poor treatment response when it is a co-morbid condition with substance abuse. It is an under-recognised clinical entity in the local Hong Kong setting, for which there are only a few available Chinese-language diagnostic instruments. None has been tested for its psychometric properties in the Cantonese-speaking population in Hong Kong. This study therefore aimed to assess the reliability and validity of the Chinese version of the ASPD subscale of the Structured Clinical Interview for the DSM-IV Axis II Disorders (SCID-II) in Hong Kong Chinese. METHODS. This assessment tool was modified according to dialectal differences between Mainland China and Hong Kong. Inpatients in Castle Peak Hospital, Hong Kong, who were designated for priority follow-up based on their assessed propensity for violence and who fulfilled the inclusion criteria for the study, were recruited. To assess the level of agreement, best-estimate diagnosis made by a multidisciplinary team was compared with diagnostic status determined by the SCID-II ASPD subscale. The internal consistency, sensitivity, and specificity of the subscale were also calculated. RESULTS. The internal consistency of the subscale was acceptable at 0.79, whereas the test-retest reliability and inter-rater reliability showed an excellent and good agreement of 0.90 and 0.86, respectively. Best-estimate clinical diagnosis-SCID diagnosis agreement was acceptable at 0.76. The sensitivity, specificity, positive and negative predictive values were 0.91, 0.86, 0.83, and 0.93, respectively. CONCLUSION. The Chinese version of the SCID-II ASPD subscale is reliable and valid for diagnosing ASPD in a Cantonese-speaking clinical population.
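Positive and negative predictive values, unlike sensitivity and specificity, depend on how common the condition is in the sample. The sketch below applies Bayes' rule to show that dependence; the function name is hypothetical, and the prevalence is an input the reader must supply, since the abstract does not report it.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv
```

Holding sensitivity and specificity fixed, lowering the prevalence lowers the positive predictive value and raises the negative predictive value, which is why predictive values from a high-risk inpatient sample may not transfer to other settings.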
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fallahpoor, M; Abbasi, M; Sen, A
Purpose: Patient-specific 3-dimensional (3D) internal dosimetry in targeted radionuclide therapy is essential for efficient treatment. Two major steps to achieve reliable results are: 1) generating quantitative 3D images of radionuclide distribution and attenuation coefficients and 2) using a reliable method for dose calculation based on activity and attenuation map. In this research, internal dosimetry for 153-Samarium (153-Sm) was done using SPECT-CT images coupled with the GATE Monte Carlo package. Methods: A 50-year-old woman with bone metastases from breast cancer was prescribed 153-Sm treatment (Gamma: 103keV and beta: 0.81MeV). A SPECT/CT scan was performed with the Siemens Simbia-T scanner. SPECT and CT images were registered using default registration software. SPECT quantification was achieved by compensating for all image degrading factors including body attenuation, Compton scattering and collimator-detector response (CDR). Triple energy window method was used to estimate and eliminate the scattered photons. Iterative ordered-subsets expectation maximization (OSEM) with correction for attenuation and distance-dependent CDR was used for image reconstruction. Bilinear energy mapping was used to convert Hounsfield units in the CT image to an attenuation map. Organ borders were defined by the itk-SNAP toolkit segmentation on the CT image. GATE was then used for internal dose calculation. The Specific Absorbed Fractions (SAFs) and S-values were reported as MIRD schema. Results: The results showed that the largest SAFs and S-values are in osseous organs as expected. S-value for lung is the highest after spine that can be important in 153-Sm therapy. Conclusion: We presented the utility of SPECT-CT images and Monte Carlo for patient-specific dosimetry as a reliable and accurate method. It has several advantages over template-based methods or simplified dose estimation methods. 
With advent of high speed computers, Monte Carlo can be used for treatment planning on a day to day basis.
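The triple energy window method named in this abstract estimates scatter in the photopeak window from counts in two narrow windows on either side of it, using a trapezoidal approximation. The sketch below shows the standard form of that estimate for a single pixel or projection bin; the function names and the window widths in the usage note are hypothetical and are not the acquisition settings of this study.

```python
def tew_scatter(counts_lower, counts_upper, width_lower, width_upper, width_main):
    """Trapezoidal scatter estimate for the main (photopeak) window.

    counts_lower/counts_upper: counts in the narrow sub-windows just below
    and just above the photopeak window; widths are in keV.
    """
    return (counts_lower / width_lower + counts_upper / width_upper) * width_main / 2.0


def tew_primary(counts_main, counts_lower, counts_upper,
                width_lower, width_upper, width_main):
    """Scatter-corrected (primary) counts, clipped at zero."""
    scatter = tew_scatter(counts_lower, counts_upper,
                          width_lower, width_upper, width_main)
    return max(counts_main - scatter, 0.0)
```

With hypothetical sub-windows of 3 keV each and a 20 keV main window, `tew_scatter(30, 6, 3, 3, 20)` estimates 120 scattered counts, so 500 recorded photopeak counts would be corrected to 380.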
Boerebach, Benjamin C M; Lombarts, Kiki M J M H; Arah, Onyebuchi A
2016-03-01
The System for Evaluation of Teaching Qualities (SETQ) was developed as a formative system for the continuous evaluation and development of physicians' teaching performance in graduate medical training. It has been seven years since the introduction and initial exploratory psychometric analysis of the SETQ questionnaires. This study investigates the validity and reliability of the SETQ questionnaires across hospitals and medical specialties using confirmatory factor analyses (CFAs), reliability analysis, and generalizability analysis. The SETQ questionnaires were tested in a sample of 3,025 physicians and 2,848 trainees in 46 hospitals. The CFA revealed acceptable fit of the data to the previously identified five-factor model. The high internal consistency estimates suggest satisfactory reliability of the subscales. These results provide robust evidence for the validity and reliability of the SETQ questionnaires for evaluating physicians' teaching performance. © The Author(s) 2014.
Konzelmann, M; Burrus, C; Hilfiker, R; Rivier, G; Deriaz, O; Luthi, F
2015-03-01
Functional evaluation of the upper limb is not only based on clinical findings but requires self-administered questionnaires to address patients' perspective. The Hand Function Sort (HFS©) was only validated in English. The aim of this study was the French cross-cultural adaptation and validation of the HFS© (HFS-F). 150 patients with various upper limb impairments were recruited in a rehabilitation center. Translation and cross-cultural adaptation were made according to international guidelines. Construct validity was estimated through correlations with the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire, SF-36 mental component summary (MCS), SF-36 physical component summary (PCS) and pain intensity. Internal consistency was assessed by Cronbach's α and test-retest reliability by intraclass correlation. Cronbach's α was 0.98, and test-retest reliability was excellent at 0.921 (95 % CI 0.871-0.971), the same as for the original HFS©. Correlations with DASH were -0.779 (95 % CI -0.847 to -0.685); with SF-36 PCS 0.452 (95 % CI 0.276-0.599); with pain -0.247 (95 % CI -0.429 to -0.041); with SF-36 MCS 0.242 (95 % CI 0.042-0.422). There were no floor or ceiling effects. The HFS-F has the same good psychometric properties as the original HFS© (internal consistency, test-retest reliability, convergent validity with DASH, divergent validity with SF-36 MCS, and no floor or ceiling effects). The convergent validity with SF-36 PCS was poor; we found no correlation with pain. The HFS-F could be used with confidence in a population of working patients. Other studies are necessary to study its psychometric properties in other populations.
Coefficient alpha and interculture test selection.
Thurber, Steven; Kishi, Yasuhiro
2014-04-01
The internal consistency reliability of a measure can be a focal point in an evaluation of the potential adequacy of an instrument for adaptation to another cultural setting. Cronbach's alpha (α) coefficient is often used as the statistical index for such a determination. However, alpha presumes a tau-equivalent test and may constitute an inaccurate population estimate for multidimensional tests. These notions are expanded and examined with a Japanese version of a questionnaire on nursing attitudes toward suicidal patients, originally constructed in Sweden using the English language. The English measure was reported to have acceptable internal consistency (α) albeit the dimensionality of the questionnaire was not addressed. The Japanese scale was found to lack tau-equivalence. An alternative to alpha, "composite reliability," was computed and found to be below acceptable standards in magnitude and precision. Implications for research application of the Japanese instrument are discussed. © The Author(s) 2012.
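The abstract does not give the formula it used for composite reliability. A common definition, assumed here, computes it from a one-factor model with unit factor variance as the squared sum of loadings divided by that quantity plus the sum of error variances. The function name and the loadings in the usage note are hypothetical.

```python
def composite_reliability_from_loadings(loadings, error_variances):
    """Composite reliability for a congeneric one-factor model.

    loadings:        factor loading of each item (factor variance fixed at 1)
    error_variances: unique (error) variance of each item
    """
    if len(loadings) != len(error_variances):
        raise ValueError("one error variance is needed per loading")
    true_part = sum(loadings) ** 2
    return true_part / (true_part + sum(error_variances))
```

For standardized items, each error variance is 1 minus the squared loading, so three hypothetical items loading 0.7 give `composite_reliability_from_loadings([0.7] * 3, [0.51] * 3)`, roughly 0.74. Unlike alpha, this coefficient does not require the loadings to be equal, which is why it is preferred when tau-equivalence fails.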
Assessing the Psychometric Properties of Two Food Addiction Scales
Lemeshow, Adina; Gearhardt, Ashley; Genkinger, Jeanine; Corbin, William R.
2016-01-01
Background While food addiction is well accepted in popular culture and mainstream media, its scientific validity as an addictive behavior is still under investigation. This study evaluated the reliability and validity of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale using data from two community-based convenience samples. Methods We assessed the internal and test-retest reliability of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale, and estimated the sensitivity and negative predictive value of the Modified Yale Food Addiction Scale using the Yale Food Addiction Scale as the benchmark. We calculated Cronbach’s alphas and 95% confidence intervals (CIs) for internal reliability and Cohen’s Kappa coefficients and 95% CIs for test-retest reliability. Results Internal consistency (n=232) was marginal to good, ranging from α=0.63 to 0.84. The test-retest reliability (n=45) for food addiction diagnosis was substantial, with Kappa=0.73 (95% CI, 0.48–0.88) (Yale Food Addiction Scale) and 0.79 (95% CI, 0.66–1.00) (Modified Yale Food Addiction Scale). Sensitivity and negative predictive value for classifying food addiction status were excellent: compared to the Yale Food Addiction Scale, the Modified Yale Food Addiction Scale’s sensitivity was 92.3% (95% CI, 64%–99.8%), and the negative predictive value was 99.5% (95% CI, 97.5%–100%). Conclusions Our analyses suggest that the Modified Yale Food Addiction Scale may be an appropriate substitute for the Yale Food Addiction Scale when a brief measure is needed, and support the continued use of both scales to investigate food addiction. PMID:27623221
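The agreement and classification statistics named in this abstract (Cohen's kappa, sensitivity, negative predictive value) can be computed from 2x2 tables. The sketch below is illustrative only: the function names are hypothetical and the counts are invented, not taken from the study.

```python
def cohen_kappa(a, b, c, d):
    """Kappa for a 2x2 agreement table.

    a = positive at both times, b = positive then negative,
    c = negative then positive, d = negative at both times.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the marginal proportions of each administration.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def sensitivity_and_npv(tp, fp, fn, tn):
    """Sensitivity and negative predictive value against a benchmark classification."""
    return tp / (tp + fn), tn / (tn + fn)


# Invented counts for illustration.
print(round(cohen_kappa(20, 5, 10, 15), 3))
sens, npv = sensitivity_and_npv(tp=18, fp=10, fn=2, tn=70)
print(round(sens, 3), round(npv, 3))
```

Here the benchmark (the full scale) defines the true status, so false negatives are cases the brief scale misses.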
Tarescavage, Anthony M; Wygant, Dustin B; Boutacoff, Lana I; Ben-Porath, Yossef S
2013-12-01
In the current study, we examined the reliability, validity, and clinical utility of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2011) scores in a sample of 759 bariatric surgery candidates. We provide descriptives for all scales, internal consistency and standard error of measurement estimates for all substantive scales, external correlates of substantive scales using chart review and self-report criteria, and relative risk ratios to assess the clinical utility of the instrument. Results generally support the reliability, validity, and clinical utility of MMPI-2-RF scale scores in the psychological evaluation of bariatric surgery candidates. Limitations, future directions, and practical application of these results are discussed. (c) 2013 APA, all rights reserved.
Junkes, Monica C.; Fraiz, Fabian C.; Sardenberg, Fernanda; Lee, Jessica Y.; Paiva, Saul M.; Ferreira, Fernanda M.
2015-01-01
Objective The aim of the present study was to translate, perform the cross-cultural adaptation of the Rapid Estimate of Adult Literacy in Dentistry to Brazilian-Portuguese language and test the reliability and validity of this version. Methods After translation and cross-cultural adaptation, interviews were conducted with 258 parents/caregivers of children in treatment at the pediatric dentistry clinics and health units in Curitiba, Brazil. To test the instrument's validity, the scores of Brazilian Rapid Estimate of Adult Literacy in Dentistry (BREALD-30) were compared based on occupation, monthly household income, educational attainment, general literacy, use of dental services and three dental outcomes. Results The BREALD-30 demonstrated good internal reliability. Cronbach’s alpha ranged from 0.88 to 0.89 when words were deleted individually. The analysis of test-retest reliability revealed excellent reproducibility (intraclass correlation coefficient = 0.983 and Kappa coefficient ranging from moderate to nearly perfect). In the bivariate analysis, BREALD-30 scores were significantly correlated with the level of general literacy (rs = 0.593) and income (rs = 0.327) and significantly associated with occupation, educational attainment, use of dental services, self-rated oral health and the respondent’s perception regarding his/her child's oral health. However, only the association between the BREALD-30 score and the respondent’s perception regarding his/her child's oral health remained significant in the multivariate analysis. Conclusion The BREALD-30 demonstrated satisfactory psychometric properties and is therefore applicable to adults in Brazil. PMID:26158724
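The "alpha when words were deleted individually" analysis mentioned above is the usual alpha-if-item-deleted check. A minimal Python sketch follows, assuming a respondents-by-items score matrix; the function names and the toy data are hypothetical and are not BREALD-30 data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(scores):
    """Recompute alpha k times, each time leaving out one item."""
    k = len(scores[0])
    return [
        cronbach_alpha([[x for i, x in enumerate(row) if i != j] for row in scores])
        for j in range(k)
    ]


# Toy data: items 1 and 2 move together, item 3 does not.
toy = [[1, 2, 5], [2, 3, 1], [3, 4, 4], [4, 5, 2]]
print([round(a, 3) for a in alpha_if_item_deleted(toy)])
```

A narrow range of leave-one-out values, as reported in the abstract, suggests that no single item dominates or undermines the scale's internal consistency.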
Validity and Reliability of a New Instrument to Measure Cancer-Related Fatigue in Adolescents
Hinds, Pamela S.; Hockenberry, Marilyn; Tong, Xin; Rai, Shesh N.; Gattuso, Jamie S.; McCarthy, Kathleen; Pui, Ching-Hon; Srivastava, Deo Kumar
2008-01-01
Adolescents undergoing treatment for cancer rate fatigue as their most prevalent and intense cancer- and treatment-related effect. Parents and staff rate it similarly. Despite its reported prevalence, intensity, and distressing effects, cancer-related fatigue in adolescents is not routinely assessed during or after cancer treatment. We contend that the insufficient clinical attention is primarily due to the lack of a reliable and valid self-report instrument with which adolescent cancer-related fatigue can be measured. Our aim was to determine the reliability and construct validity of a new instrument and its ability to measure change in fatigue over time. Initial testing involved 64 adolescents undergoing curative treatment of cancer who completed the Fatigue Scale-Adolescent (FS-A) at two to four key points in treatment in one of four studies. Internal consistency estimates ranged from 0.67 to 0.95. Validity estimates involving the FS-A with the parent version ranged from 0.13 to 0.76; estimates involving the staff version and the Reynolds Depression Scale were 0.27 and 0.87 respectively. Additional validity findings included significant fatigue differences between anemic and non-anemic patients (P = 0.042) and the emergence of four factors in an exploratory factor analysis. Findings further indicate that the FS-A can be used to measure change over time (t = 2.55, P <0.01). In summary, the FS-A has moderate to strong reliability and impressive validity coefficients for a new research instrument. PMID:17629669
Paige, Samantha R; Krieger, Janice L; Stellefson, Michael; Alber, Julia M
2017-02-01
Chronic disease patients are affected by low computer and health literacy, which negatively affects their ability to benefit from access to online health information. This study aimed to estimate reliability and confirm model specifications for eHealth Literacy Scale (eHEALS) scores among chronic disease patients using Classical Test Theory (CTT) and Item Response Theory (IRT) techniques. A stratified sample of Black/African American (N=341) and Caucasian (N=343) adults with chronic disease completed an online survey including the eHEALS. Item discrimination was explored using bivariate correlations and Cronbach's alpha for internal consistency. A categorical confirmatory factor analysis tested a one-factor structure of eHEALS scores. Item characteristic curves, infit/outfit statistics, omega coefficient, and item reliability and separation estimates were computed. A one-factor structure of eHEALS was confirmed by statistically significant standardized item loadings, acceptable model fit indices (CFI/TLI>0.90), and 70% variance explained by the model. Item response categories increased with higher theta levels, and there was evidence of acceptable reliability (ω=0.94; item reliability=89; item separation=8.54). eHEALS scores are a valid and reliable measure of self-reported eHealth literacy among Internet-using chronic disease patients. Providers can use eHEALS to help identify patients' eHealth literacy skills. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
[New questionnaire to assess self-efficacy toward physical activity in children].
Aedo, Angeles; Avila, Héctor
2009-10-01
To design a questionnaire for assessment of self-efficacy toward physical activity in school children, as well as to measure its construct validity, test-retest reliability, and internal consistency. A four-stage multimethod approach was used: (1) bibliographic research followed by exploratory study and the formulation of questions and responses based on a dichotomous scale of 14 items; (2) validation of the content by a panel of experts; (3) application of the preliminary version of the questionnaire to a sample of 900 school-aged children in Mexico City; and (4) determination of the construct validity, test-retest reliability, and internal consistency (Cronbach's alpha). Three factors were identified that explain 64.15% of the variance: the search for positive alternatives to physical activity, ability to deal with possible barriers to exercising, and expectations of skill or competence. The model was validated using the goodness of fit, and the result of 65% less than 0.05 indicated that the estimated factor model fit the data. Cronbach's consistency alpha was 0.733; test-retest reliability was 0.867. The scale designed has adequate reliability and validity. These results are a good indicator of self-efficacy toward physical activity in school children, which is important when developing programs intended to promote such behavior in this age group.
Ang, Rebecca P; Chong, Wan Har; Huan, Vivien S; Yeo, Lay See
2007-01-01
This article reports the development and initial validation of scores obtained from the Adolescent Concerns Measure (ACM), a scale which assesses concerns of Asian adolescent students. In Study 1, findings from exploratory factor analysis using 619 adolescents suggested a 24-item scale with four correlated factors--Family Concerns (9 items), Peer Concerns (5 items), Personal Concerns (6 items), and School Concerns (4 items). Initial estimates of convergent validity for ACM scores were also reported. The four-factor structure of ACM scores derived from Study 1 was confirmed via confirmatory factor analysis in Study 2 using a two-fold cross-validation procedure with a separate sample of 811 adolescents. Support was found for both the multidimensional and hierarchical models of adolescent concerns using the ACM. Internal consistency and test-retest reliability estimates were adequate for research purposes. ACM scores show promise as a reliable and potentially valid measure of Asian adolescents' concerns.
A method for vibrational assessment of cortical bone
NASA Astrophysics Data System (ADS)
Song, Yan; Gunaratne, Gemunu H.
2006-09-01
Large bones from many anatomical locations of the human skeleton consist of an outer shaft (cortex) surrounding a highly porous internal region (trabecular bone) whose structure is reminiscent of a disordered cubic network. Age related degradation of cortical and trabecular bone takes different forms. Trabecular bone weakens primarily by loss of connectivity of the porous network, and recent studies have shown that vibrational response can be used to obtain reliable estimates for loss of its strength. In contrast, cortical bone degrades via the accumulation of long fractures and changes in the level of mineralization of the bone tissue. In this paper, we model cortical bone by an initially solid specimen with uniform density to which long fractures are introduced; we find that, as in the case of trabecular bone, vibrational assessment provides more reliable estimates of residual strength in cortical bone than is possible using measurements of density or porosity.
Lower Bounds to the Reliabilities of Factor Score Estimators.
Hessen, David J
2016-10-06
Under the general common factor model, the reliabilities of factor score estimators might be of more interest than the reliability of the total score (the unweighted sum of item scores). In this paper, lower bounds to the reliabilities of Thurstone's factor score estimators, Bartlett's factor score estimators, and McDonald's factor score estimators are derived and conditions are given under which these lower bounds are equal. The relative performance of the derived lower bounds is studied using classic example data sets. The results show that estimates of the lower bounds to the reliabilities of Thurstone's factor score estimators are greater than or equal to the estimates of the lower bounds to the reliabilities of Bartlett's and McDonald's factor score estimators.
Reliability of routinely collected hospital data for child maltreatment surveillance.
McKenzie, Kirsten; Scott, Debbie A; Waller, Garry S; Campbell, Margaret
2011-01-05
Internationally, research on child maltreatment-related injuries has been hampered by a lack of available routinely collected health data to identify cases, examine causes, identify risk factors and explore health outcomes. Routinely collected hospital separation data coded using the International Classification of Diseases and Related Health Problems (ICD) system provide an internationally standardised data source for classifying and aggregating diseases, injuries, causes of injuries and related health conditions for statistical purposes. However, there has been limited research to examine the reliability of these data for child maltreatment surveillance purposes. This study examined the reliability of coding of child maltreatment in Queensland, Australia. A retrospective medical record review and recoding methodology was used to assess the reliability of coding of child maltreatment. A stratified sample of hospitals across Queensland was selected for this study, and a stratified random sample of cases was selected from within those hospitals. In 3.6% of cases the coders disagreed on whether any maltreatment code could be assigned (definite or possible) versus no maltreatment being assigned (unintentional injury), giving a sensitivity of 0.982 and specificity of 0.948. The review of these cases where discrepancies existed revealed that all cases had some indications of risk documented in the records. 15.5% of cases originally assigned a definite or possible maltreatment code, were recoded to a more or less definite strata. In terms of the number and type of maltreatment codes assigned, the auditor assigned a greater number of maltreatment types based on the medical documentation than the original coder assigned (22% of the auditor coded cases had more than one maltreatment type assigned compared to only 6% of the original coded data). The maltreatment types which were the most 'under-coded' by the original coder were psychological abuse and neglect. 
Cases coded with a sexual abuse code showed the highest level of reliability. Given the increasing international attention being given to improving the uniformity of reporting of child-maltreatment related injuries and the emphasis on the better utilisation of routinely collected health data, this study provides an estimate of the reliability of maltreatment-specific ICD-10-AM codes assigned in an inpatient setting.
Reliability of Routinely Collected Hospital Data for Child Maltreatment Surveillance
2011-01-01
Background Internationally, research on child maltreatment-related injuries has been hampered by a lack of available routinely collected health data to identify cases, examine causes, identify risk factors and explore health outcomes. Routinely collected hospital separation data coded using the International Classification of Diseases and Related Health Problems (ICD) system provide an internationally standardised data source for classifying and aggregating diseases, injuries, causes of injuries and related health conditions for statistical purposes. However, there has been limited research to examine the reliability of these data for child maltreatment surveillance purposes. This study examined the reliability of coding of child maltreatment in Queensland, Australia. Methods A retrospective medical record review and recoding methodology was used to assess the reliability of coding of child maltreatment. A stratified sample of hospitals across Queensland was selected for this study, and a stratified random sample of cases was selected from within those hospitals. Results In 3.6% of cases the coders disagreed on whether any maltreatment code could be assigned (definite or possible) versus no maltreatment being assigned (unintentional injury), giving a sensitivity of 0.982 and specificity of 0.948. The review of these cases where discrepancies existed revealed that all cases had some indications of risk documented in the records. 15.5% of cases originally assigned a definite or possible maltreatment code, were recoded to a more or less definite strata. In terms of the number and type of maltreatment codes assigned, the auditor assigned a greater number of maltreatment types based on the medical documentation than the original coder assigned (22% of the auditor coded cases had more than one maltreatment type assigned compared to only 6% of the original coded data). 
The maltreatment types which were the most 'under-coded' by the original coder were psychological abuse and neglect. Cases coded with a sexual abuse code showed the highest level of reliability. Conclusion Given the increasing international attention being given to improving the uniformity of reporting of child-maltreatment related injuries and the emphasis on the better utilisation of routinely collected health data, this study provides an estimate of the reliability of maltreatment-specific ICD-10-AM codes assigned in an inpatient setting. PMID:21208411
Internal Motion Estimation by Internal-external Motion Modeling for Lung Cancer Radiotherapy.
Chen, Haibin; Zhong, Zichun; Yang, Yiwei; Chen, Jiawei; Zhou, Linghong; Zhen, Xin; Gu, Xuejun
2018-02-27
The aim of this study is to develop an internal-external correlation model for internal motion estimation for lung cancer radiotherapy. Deformation vector fields (DVFs) that characterize the internal-external motion are obtained by respectively registering the internal organ meshes and external surface meshes from the 4DCT images via a recently developed local topology preserved non-rigid point matching algorithm. A composite matrix is constructed by combining the estimated internal phasic DVFs with external phasic and directional DVFs. Principal component analysis is then applied to the composite matrix to extract principal motion characteristics and generate model parameters that correlate the internal-external motion. The proposed model is evaluated on a 4D NURBS-based cardiac-torso (NCAT) synthetic phantom and 4DCT images from five lung cancer patients. For tumor tracking, the center of mass errors of the tracked tumor are 0.8(±0.5)mm/0.8(±0.4)mm for synthetic data, and 1.3(±1.0)mm/1.2(±1.2)mm for patient data in the intra-fraction/inter-fraction tracking, respectively. For lung tracking, the percent errors of the tracked contours are 0.06(±0.02)/0.07(±0.03) for synthetic data, and 0.06(±0.02)/0.06(±0.02) for patient data in the intra-fraction/inter-fraction tracking, respectively. The extensive validations have demonstrated the effectiveness and reliability of the proposed model in motion tracking for both the tumor and the lung in lung cancer radiotherapy.
A Psychometric Analysis of Quality of Life Tools in Lung Cancer Patients Who Smoke
Browning, Kristine K.; Ferketich, Amy K.; Otterson, Gregory A.; Reynolds, Nancy R.; Wewers, Mary Ellen
2009-01-01
Lung cancer is the leading cause of cancer death for both men and women in the United States. Patient quality of life (QOL) prior to cancer treatment is known to be a strong predictor of survival and toleration of treatment toxicities. A lung cancer patient’s self-assessment of QOL is highly valued among clinicians as it guides treatment-related decisions and impacts clinical outcomes. Smokers are known to report a lower QOL. Limited research has been conducted on QOL outcomes in lung cancer patients who continue to smoke. To assess QOL, a reliable and valid QOL measure specific to lung cancer is required. The Functional Assessment of Cancer Therapy-Lung Cancer (FACT-L) and Lung Cancer Symptom Scale (LCSS) are instruments that specifically examine QOL among lung cancer patients. The LCSS is a focused QOL instrument that includes physical and functional domains of QOL and disease symptomatology. The FACT-L is a broader QOL instrument that includes physical, functional, social and emotional domains and disease symptomatology. Both are psychometrically valid and are widely used in the literature, but have not been exclusively evaluated in smokers. Furthermore, there is no ‘gold standard’ instrument since there has never been a correlation study to compare estimates of reliability and validity between these instruments. The purpose of this study is to report the internal consistency and convergent validity of the FACT-L and the LCSS among newly diagnosed lung cancer patients who smoke. The data were collected and analyzed as part of a larger study examining smoking behavior among newly diagnosed lung cancer patients (n=51). Descriptive statistics were calculated on the FACT-L and LCSS scores, internal consistency was assessed by estimating Cronbach’s alpha coefficients, and Pearson correlation coefficients were estimated between the two scales. Internal consistency coefficients demonstrated good reliability for both scales, and the two instruments demonstrated a strong correlation, suggesting good convergent validity. Either of these instruments is an appropriate measure of QOL in lung cancer patients who smoke. Given the conceptual difference between the two instruments, it is important to carefully consider the research aims when selecting the appropriate QOL measurement instrument. PMID:19181418
Reliability and Responsiveness of NutriQoL® Questionnaire.
Cuerda, Maria Cristina; Apezetxea, Antonio; Carrillo, Lourdes; Casanueva, Felipe; Cuesta, Federico; Irles, Jose Antonio; Virgili, Maria Nuria; Layola, Miquel; Lizán, Luis
2016-10-01
NutriQoL® (Nestlé Health Science, Vevey, Switzerland) is a questionnaire developed to assess the health-related quality of life (HRQoL) of patients with home enteral nutrition (HEN) irrespective of their underlying condition and route of administration. The aim of this work was to assess the questionnaire's reliability and responsiveness to change. Two cohorts of patients with HEN and their primary caregivers were enrolled to assess reliability and responsiveness, respectively. All participants had to be 18 years of age or older, without mental deterioration (≤3 or 4 errors in the Pfeiffer's test) and with sufficient functional status (>40 points on Karnofsky's performance status scale). When the patients' ability to respond to the questionnaire was impaired due to underlying disease, their caregivers answered on their behalf. NutriQoL was administered in two and three visits to the reliability and responsiveness cohorts, respectively. Test-retest reliability and internal consistency were assessed by the intra-class correlation coefficient (ICC) and Cronbach's α, respectively. Responsiveness was evaluated by the standardized effect size and standardized response mean between the basal visit and the third visit. Finally, the minimal clinically important difference (MCID) was estimated. A total of 54 and 86 participants were recruited to the reliability and responsiveness cohorts, respectively. Thirty-five caregivers were selected to assess the inter-observer reliability. ICC values confirmed the good reproducibility level (ICC >0.75) of the questionnaire in both the "physical functioning and activities of daily living" and "social life" domains and in the total score. The assessment of internal consistency in both domains of the questionnaire showed good internal consistency in visit 2. ICC showed an excellent agreement level between caregiver and patient in the global NutriQoL score. 
Finally, patients classified as having a minimal change in their health reported a mean (standard deviation) MCID in NutriQoL score of 0.63 (11.51). NutriQoL is a reliable and unique instrument to measure the HRQoL in HEN patients. NutriQoL detects changes in the health status of the patient. Nevertheless, further research is needed to determine the full extent of the questionnaire responsiveness.
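The two responsiveness indices named in this abstract can be sketched as follows, assuming their common definitions: the standardized effect size divides the mean change by the standard deviation of baseline scores, and the standardized response mean divides it by the standard deviation of the change scores. The function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def responsiveness(baseline, follow_up):
    """Return (standardized effect size, standardized response mean) for paired scores."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    ses = mean(change) / stdev(baseline)  # change relative to baseline spread
    srm = mean(change) / stdev(change)    # change relative to spread of the change itself
    return ses, srm


# Invented paired scores for four patients at the basal and third visits.
ses, srm = responsiveness([10, 12, 14, 16], [12, 13, 17, 18])
print(round(ses, 3), round(srm, 3))
```

The two indices can diverge considerably: when patients change by similar amounts, the standardized response mean is large even if the change is modest relative to baseline variability.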
Bergeron, Lise; Smolla, Nicole; Berthiaume, Claude; Renaud, Johanne; Breton, Jean-Jacques; St-Georges, Marie; Morin, Pauline; Zavaglia, Elissa; Labelle, Réal
2017-03-01
The Dominic Interactive for Adolescents-Revised (DIA-R) is a multimedia self-report screen for 9 mental disorders, borderline personality traits, and suicidality defined by the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). This study aimed to examine the reliability and the validity of this instrument. French- and English-speaking adolescents aged 12 to 15 years (N = 447) were recruited from schools and clinical settings in Montreal and were evaluated twice. The internal consistency was estimated by Cronbach alpha coefficients and the test-retest reliability by intraclass correlation coefficients. Cutoff points on the DIA-R scales were determined by using clinically relevant measures for defining external validation criteria: the Schedule for Affective Disorders and Schizophrenia for School-Aged Children, the Beck Hopelessness Scale, and the Abbreviated-Diagnostic Interview for Borderlines. Receiver operating characteristic (ROC) analyses provided accuracy estimates (area under the ROC curve, sensitivity, specificity, likelihood ratio) to evaluate the ability of the DIA-R scales to predict external criteria. For most of the DIA-R scales, reliability coefficients were excellent or moderate. High or moderate accuracy estimates from ROC analyses demonstrated the ability of the DIA-R thresholds to predict psychopathological conditions. These thresholds were generally able to discriminate between clinical and school subsamples. However, the validity of the obsessions/compulsions scale was too low. Findings clearly support the reliability and the validity of the DIA-R. This instrument may be useful to assess a wide range of adolescents' mental health problems in the continuum of services. This conclusion applies to all scales, except the obsessions/compulsions one.
NASA Astrophysics Data System (ADS)
Furno, Mauro; Rosenow, Thomas C.; Gather, Malte C.; Lüssem, Björn; Leo, Karl
2012-10-01
We report on a theoretical framework for the efficiency analysis of complex, multi-emitter organic light emitting diodes (OLEDs). The calculation approach makes use of electromagnetic modeling to quantify the overall OLED photon outcoupling efficiency and a phenomenological description for electrical and excitonic processes. From the comparison of optical modeling results and measurements of the total external quantum efficiency, we obtain reliable estimates of internal quantum yield. As application of the model, we analyze high-efficiency stacked white OLEDs and comment on the various efficiency loss channels present in the devices.
Fundamentals of endoscopic surgery: creation and validation of the hands-on test.
Vassiliou, Melina C; Dunkin, Brian J; Fried, Gerald M; Mellinger, John D; Trus, Thadeus; Kaneva, Pepa; Lyons, Calvin; Korndorffer, James R; Ujiki, Michael; Velanovich, Vic; Kochman, Michael L; Tsuda, Shawn; Martinez, Jose; Scott, Daniel J; Korus, Gary; Park, Adrian; Marks, Jeffrey M
2014-03-01
The Fundamentals of Endoscopic Surgery™ (FES) program consists of online materials and didactic and skills-based tests. All components were designed to measure the skills and knowledge required to perform safe flexible endoscopy. The purpose of this multicenter study was to evaluate the reliability and validity of the hands-on component of the FES examination, and to establish the pass score. Expert endoscopists identified the critical skill set required for flexible endoscopy. These skills were then modeled in a virtual reality simulator (GI Mentor™ II, Simbionix™ Ltd., Airport City, Israel) to create five tasks and metrics. Scores were designed to measure both speed and precision. Validity evidence was assessed by correlating performance with self-reported endoscopic experience (surgeons and gastroenterologists [GIs]). Internal consistency of each test task was assessed using Cronbach's alpha. Test-retest reliability was determined by having the same participant perform the test a second time and comparing their scores. Passing scores were determined by a contrasting groups methodology and use of receiver operating characteristic curves. A total of 160 participants (17 % GIs) performed the simulator test. Scores on the five tasks showed good internal consistency reliability and all had significant correlations with endoscopic experience. Total FES scores correlated 0.73 with participants' level of endoscopic experience, providing evidence of their validity, and their internal consistency reliability (Cronbach's alpha) was 0.82. Test-retest reliability was assessed in 11 participants, and the intraclass correlation was 0.85. The passing score was determined and is estimated to have a sensitivity (true positive rate) of 0.81 and a 1-specificity (false positive rate) of 0.21. The FES hands-on skills test examines the basic procedural components required to perform safe flexible endoscopy. 
It meets rigorous standards of reliability and validity required for high-stakes examinations, and, together with the knowledge component, may help contribute to the definition and determination of competence in endoscopy.
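The pass-score step described above (contrasting groups plus ROC analysis) can be sketched as a scan over candidate cutoffs. This is an illustration under stated assumptions, not the FES procedure: the abstract does not say how the operating point was chosen, so Youden's J (true positive rate minus false positive rate) is assumed as the selection rule, and the function names and scores are invented.

```python
def pass_rates(experienced, novice, cutoff):
    """True and false positive rates when 'pass' means score >= cutoff."""
    tpr = sum(s >= cutoff for s in experienced) / len(experienced)
    fpr = sum(s >= cutoff for s in novice) / len(novice)
    return tpr, fpr


def best_cutoff(experienced, novice):
    """Pick the observed score that maximizes Youden's J (an assumed criterion)."""
    candidates = sorted(set(experienced) | set(novice))

    def youden_j(cutoff):
        tpr, fpr = pass_rates(experienced, novice, cutoff)
        return tpr - fpr

    return max(candidates, key=youden_j)


# Invented simulator scores for two contrasting groups.
experienced = [80, 75, 90, 60]
novice = [50, 58, 40, 55]
cutoff = best_cutoff(experienced, novice)
print(cutoff, pass_rates(experienced, novice, cutoff))
```

Each candidate cutoff corresponds to one point on the ROC curve, so the reported sensitivity and false positive rate describe the single point selected as the pass score.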
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lappin, A.R.; VanBuskirk, R.G.; Enniss, D.O.
1982-03-01
Thermal-conductivity and bulk-property measurements were made on welded and nonwelded silicic tuffs from the upper portion of Hole USW-G1, located near the southwestern margin of the Nevada Test Site. Bulk-property measurements were made by standard techniques. Thermal conductivities were measured at temperatures as high as 280 °C, confining pressures to 10 MPa, and pore pressures to 1.5 MPa. Extrapolation of measured saturated conductivities to zero porosity suggests that matrix conductivity of both zeolitized and devitrified tuffs is independent of stratigraphic position, depth, and probably location. This fact allows development of a thermal-conductivity stratigraphy for the upper portion of Hole G1. Estimates of saturated conductivities of zeolitized nonwelded tuffs and devitrified tuffs below the water table appear most reliable. Estimated conductivities of saturated densely welded devitrified tuffs above the water table are less reliable, due to both internal complexity and limited data presently available. Estimation of conductivity of dewatered tuffs requires use of different air thermal conductivities in devitrified and zeolitized samples. Estimated effects of in-situ fracturing generally appear negligible.
Jackson, T
2001-05-01
Casemix-funding systems for hospital inpatient care require a set of resource weights which will not inadvertently distort patterns of patient care. Few health systems have very good sources of cost information, and specific studies to derive empirical cost relativities are themselves costly. This paper reports a 5 year program of research into the use of data from hospital management information systems (clinical costing systems) to estimate resource relativities for inpatient hospital care used in Victoria's DRG-based payment system. The paper briefly describes international approaches to cost weight estimation. It describes the architecture of clinical costing systems, and contrasts process and job costing approaches to cost estimation. Techniques of data validation and reliability testing developed in the conduct of four of the first five of the Victorian Cost Weight Studies (1993-1998) are described. Improvements in sampling, data validity and reliability are documented over the course of the research program, and the advantages of patient-level data are highlighted. The usefulness of these byproduct data for estimation of relative resource weights and other policy applications may be an important factor in hospital and health system decisions to invest in clinical costing technology.
Ray, Midge N; Houston, Thomas K; Yu, Feliciano B; Menachemi, Nir; Maisiak, Richard S; Allison, Jeroan J; Berner, Eta S
2006-01-01
The authors developed and evaluated a rating scale, the Attitudes toward Handheld Decision Support Software Scale (H-DSS), to assess physician attitudes about handheld decision support systems. The authors conducted a prospective assessment of psychometric characteristics of the H-DSS including reliability, validity, and responsiveness. Participants were 82 Internal Medicine residents. A higher score on each of the 14 five-point Likert scale items reflected a more positive attitude about handheld DSS. The H-DSS score is the mean across the fourteen items. Attitudes toward the use of the handheld DSS were assessed prior to and six months after receiving the handheld device. Cronbach's Alpha was used to assess internal consistency reliability. Pearson correlations were used to estimate and detect significant associations between scale scores and other measures (validity). Paired sample t-tests were used to test for changes in the mean attitude scale score (responsiveness) and for differences between groups. Internal consistency reliability for the scale was alpha = 0.73. In testing validity, moderate correlations were noted between the attitude scale scores and self-reported Personal Digital Assistant (PDA) usage in the hospital (correlation coefficient = 0.55) and clinic (0.48), p < 0.05 for both. The scale was responsive, in that it detected the expected increase in scores between the two administrations (3.99 (s.d. = 0.35) vs. 4.08, (s.d. = 0.34), p < 0.005). The authors' evaluation showed that the H-DSS scale was reliable, valid, and responsive. The scale can be used to guide future handheld DSS development and implementation.
Mass and Reliability System (MaRS)
NASA Technical Reports Server (NTRS)
Barnes, Sarah
2016-01-01
The Safety and Mission Assurance (S&MA) Directorate is responsible for mitigating risk, providing system safety, and lowering risk for space programs from ground to space. The S&MA is divided into 4 divisions: The Space Exploration Division (NC), the International Space Station Division (NE), the Safety & Test Operations Division (NS), and the Quality and Flight Equipment Division (NT). The interns, myself and Arun Aruljothi, will be working with the Risk & Reliability Analysis Branch under the NC Division. The mission of this division is to identify, characterize, diminish, and communicate risk by implementing an efficient and effective assurance model. The team utilizes Reliability and Maintainability (R&M) and Probabilistic Risk Assessment (PRA) to ensure decisions concerning risks are informed, vehicles are safe and reliable, and program/project requirements are realistic and realized. This project pertains to the Orion mission, so it is geared toward a long duration Human Space Flight Program(s). For space missions, payload is a critical concept; balancing what hardware can be replaced by components versus by Orbital Replacement Units (ORU) or subassemblies is key. For this effort a database was created that combines mass and reliability data, called Mass and Reliability System or MaRS. The U.S. International Space Station (ISS) components are used as reference parts in the MaRS database. Using ISS components as a platform is beneficial because of the historical context and the environment similarities to a space flight mission. MaRS uses a combination of systems: International Space Station PART for failure data, Vehicle Master Database (VMDB) for ORU & components, Maintenance & Analysis Data Set (MADS) for operation hours and other pertinent data, & Hardware History Retrieval System (HHRS) for unit weights. MaRS is populated using a Visual Basic Application. 
Once populated, the Excel spreadsheet contains information on ISS components including: operation hours, random/nonrandom failures, software/hardware failures, quantity, orbital replaceable units (ORU), date of placement, unit weight, frequency of part, etc. The motivation for creating such a database is the development of a mass/reliability parametric model to estimate the mass required for replacement parts. Once complete, engineers working on future space flight missions will have access to mean times to failure for parts along with their mass; this information will be used to make proper decisions for long duration space flight missions.
Portuguese version of the EUROPEP questionnaire: contributions to the psychometric validation
Roque, Hugo; Veloso, Ana; Ferreira, Pedro L
2016-01-01
ABSTRACT OBJECTIVE To assess the construct validity and reliability of the Portuguese version of the European Task Force on Patient Evaluation of General Practice Care questionnaire. METHODS We applied the Portuguese version of the European Task Force on Patient Evaluation of General Practice Care to 392 users of 20 Family Health Units from the North of Portugal. The validity of the construct was evaluated by exploratory factor analysis, with the Principal Axis Factoring method, by orthogonal rotation (varimax procedure), by the Kaiser normalization criteria (eigenvalue ≥ 1). The factorability of the data matrix was verified by the Kaiser-Meyer-Olkin and Bartlett’s sphericity test. We estimated the reliability by the indicator of internal consistency Cronbach’s alpha. To analyze the correlations between satisfaction and loyalty, we used the Pearson correlations. The predictor effect of satisfaction on loyalty was analyzed by simple linear regression. RESULTS Satisfaction presented five robust and well individualized dimensions – medical care, nursing care, clinical secretariat services, accessibility, and organization of services – with alpha values between 0.86 and 0.97, good levels of internal consistency. The loyalty showed alpha value of 0.72, considered a reasonable internal consistency. The satisfaction was predictive of loyalty. CONCLUSIONS The Portuguese European Task Force on Patient Evaluation of General Practice Care questionnaire is a robust and reliable instrument to measure the satisfaction and loyalty of users of the Family Health Units. PMID:27706374
Cross-cultural adaptation and validation of the Korean version of the neck disability index.
Song, Kyung-Jin; Choi, Byung-Wan; Choi, Byung-Ryeul; Seo, Gyeu-Beom
2010-09-15
Validation of a translated, culturally adapted questionnaire. The purpose of this study is to translate and culturally adapt the Neck Disability Index (NDI) and to validate the use of the derived version in Korean patients. Although several valid measures exist for measurement of neck pain and functional impairment, these measures have not yet been validated in a Korean version. The NDI was linguistically translated into Korean, and a prefinal version was assessed and modified by a pilot study. The reliability and validity of the derived Korean version was examined in 78 patients with degenerative cervical spine disease. Test-retest reliability, internal consistency, and construct validity were investigated by comparing Visual Analogue Scale (VAS) and Short Form Health Survey (SF-36) scores. Factor analysis of Korean NDI extracted 2 factors with eigenvalues >1. The intraclass-correlation coefficient of test-retest reliability was 0.93. Reliability, estimated by internal consistency, had a Cronbach alpha value of 0.82. The correlation between NDI and VAS scores was r = 0.49, and the correlation between NDI and SF-36 scores was r = -0.44. The physical health component score of SF-36 was highly correlated with NDI, and the correlation between VAS scores and the mental health component scores of SF-36 was high. The derived Korean version of the NDI was found to be a reliable and valid instrument for measuring disability in Korean patients with cervical problems. The authors recommend its use in future Korean clinical studies.
Kuempel, Eileen D.; Sweeney, Lisa M.; Morris, John B.; Jarabek, Annie M.
2015-01-01
The purpose of this article is to provide an overview and practical guide to occupational health professionals concerning the derivation and use of dose estimates in risk assessment for development of occupational exposure limits (OELs) for inhaled substances. Dosimetry is the study and practice of measuring or estimating the internal dose of a substance in individuals or a population. Dosimetry thus provides an essential link to understanding the relationship between an external exposure and a biological response. Use of dosimetry principles and tools can improve the accuracy of risk assessment, and reduce the uncertainty, by providing reliable estimates of the internal dose at the target tissue. This is accomplished through specific measurement data or predictive models, when available, or the use of basic dosimetry principles for broad classes of materials. Accurate dose estimation is essential not only for dose-response assessment, but also for interspecies extrapolation and for risk characterization at given exposures. Inhalation dosimetry is the focus of this paper since it is a major route of exposure in the workplace. Practical examples of dose estimation and OEL derivation are provided for inhaled gases and particulates. PMID:26551218
Developing a measure of cultural-, maturity-, or esteem-driven modesty among Jewish women.
Andrews, Caryn Scheinberg
2014-01-01
Understanding modesty and how it relates to religiosity among Jewish women was relatively unexplained, and as part of a larger study, a measure was needed. The purpose of this article is to report on three studies which represent the three stages of instrument development of a measure of modesty among Jewish women, "Your Views of Modesty": (a) content/concept definition; (b) instrument development; and (c) evaluation of the psychometric properties of the instrument: reliability and validity. In Study I, Q methodology was used to define the domain, and results suggested that modesty is multidimensional. In Study II, an instrument was developed based on distinctive perspectives from each group on what was important and not so important. This formed a 25-item Likert scale. In Study III, a survey of 300 Jewish women revealed internal consistency estimates with Cronbach's alpha 0.92, indicating a high degree of internal consistency reliability for "Your Views of Modesty." For construct validity, four factors were found explaining 55% of the variance of modesty: (a) religion-driven, (b) maturity-driven, (c) esteem-driven, and (d) public-based modesty. "Your Views of Modesty" shows good evidence for reliability and validity in this Jewish population.
Lehotkay, R; Saraswathi Devi, T; Raju, M V R; Bada, P K; Nuti, S; Kempf, N; Carminati, G Galli
2015-03-01
In this study, conducted in collaboration with the department of psychology and parapsychology of Andhra University, the Aberrant Behavior Checklist-Community (ABC-C) was validated in Telugu, the official language of Andhra Pradesh, one of India's 28 states. To assess the factor validity and reliability of this Telugu version, 120 participants with moderate to profound intellectual disability (94 men and 26 women, mean age 25.2, SD 7.1) were rated by the staff of the Lebenshilfe Institution for Mentally Handicapped in Visakhapatnam, Andhra Pradesh, India. Rating data were analysed with a confirmatory factor analysis. The internal consistency was estimated by Cronbach's alpha. To confirm the test-retest reliability, 50 participants were rated twice with an interval of 4 weeks, and 50 were rated by pairs of raters to assess inter-rater reliability. Confirmatory factor analysis revealed that the root mean square error of approximation (RMSEA) was equal to 0.06, the comparative fit index (CFI) was equal to 0.77, and the Tucker Lewis index (TLI) was equal to 0.77, which indicated that the model with five correlated factors had a good fit. Coefficient alpha ranged from 0.85 to 0.92 across the five subscales. Spearman's rank correlation coefficients for inter-rater reliability tests ranged from 0.65 to 0.75, and the correlations for test-retest reliability ranged from 0.58 to 0.76. All reliability coefficients were statistically significant (P < 0.01). The Telugu version of the ABC-C evidenced factor validity and reliability comparable to the original English version and appears to be useful for assessing behaviour disorders in Indian people with intellectual disabilities. © 2014 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Morgan, Grant B.; Zhu, Min; Johnson, Robert L.; Hodge, Kari J.
2014-01-01
Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, number of scale categories, and number of…
Loudon, Kirsty; Zwarenstein, Merrick; Sullivan, Frank; Donnan, Peter; Treweek, Shaun
2013-04-27
If you want to know which of two or more healthcare interventions is most effective, the randomised controlled trial is the design of choice. Randomisation, however, does not itself promote the applicability of the results to situations other than the one in which the trial was done. A tool published in 2009, PRECIS (PRagmatic Explanatory Continuum Indicator Summaries) aimed to help trialists design trials that produced results matched to the aim of the trial, be that supporting clinical decision-making, or increasing knowledge of how an intervention works. Though generally positive, groups evaluating the tool have also found weaknesses, mainly that its inter-rater reliability is not clear, that it needs a scoring system and that some new domains might be needed. The aim of the study is to: Produce an improved and validated version of the PRECIS tool. Use this tool to compare the internal validity of, and effect estimates from, a set of explanatory and pragmatic trials matched by intervention. The study has four phases. Phase 1 involves brainstorming and a two-round Delphi survey of authors who cited PRECIS. In Phase 2, the Delphi results will then be discussed and alternative versions of PRECIS-2 developed and user-tested by experienced trialists. Phase 3 will evaluate the validity and reliability of the most promising PRECIS-2 candidate using a sample of 15 to 20 trials rated by 15 international trialists. We will assess inter-rater reliability, and raters' subjective global ratings of pragmatism compared to PRECIS-2 to assess convergent and face validity. Phase 4, to determine if pragmatic trials sacrifice internal validity in order to achieve applicability, will compare the internal validity and effect estimates of matched explanatory and pragmatic trials of the same intervention, condition and participants. Effect sizes for the trials will then be compared in a meta-regression. 
The Cochrane Risk of Bias scores will be compared with the PRECIS-2 scores of pragmatism. We have concrete suggestions for improving PRECIS and a growing list of enthusiastic individuals interested in contributing to this work. By early 2014 we expect to have a validated PRECIS-2.
Species longevity in North American fossil mammals.
Prothero, Donald R
2014-08-01
Species longevity in the fossil record is related to many paleoecological variables and is important to macroevolutionary studies, yet there are very few reliable data on average species durations in Cenozoic fossil mammals. Many of the online databases (such as the Paleobiology Database) use only genera of North American Cenozoic mammals and there are severe problems because key groups (e.g. camels, oreodonts, pronghorns and proboscideans) have no reliable updated taxonomy, with many invalid genera and species and/or many undescribed genera and species. Most of the published datasets yield species duration estimates of approximately 2.3-4.3 Myr for larger mammals, with small mammals tending to have shorter species durations. My own compilation of all the valid species durations in families with updated taxonomy (39 families, containing 431 genera and 998 species, averaging 2.3 species per genus) yields a mean duration of 3.21 Myr for larger mammals. This breaks down to 4.10-4.39 Myr for artiodactyls, 3.14-3.31 Myr for perissodactyls and 2.63-2.95 Myr for carnivorous mammals (carnivorans plus creodonts). These averages are based on a much larger, more robust dataset than most previous estimates, so they should be more reliable for any studies that need species longevity to be accurately estimated. © 2013 International Society of Zoological Sciences, Institute of Zoology/Chinese Academy of Sciences and Wiley Publishing Asia Pty Ltd.
2012-01-01
Background Dizziness and comorbid anxiety may cause severe disability of patients with vestibulopathy, but can be addressed effectively with rehabilitation. For an individually adapted treatment, a structured assessment is needed. The Vertigo Symptom Scale (VSS) with two subscales assessing vertigo symptoms (VSS-VER) and associated symptoms (VSS-AA) might be used for this purpose. As there was no validated VSS available in German, the aim of the study was the translation and cross-cultural adaptation in German (VSS-G) and the investigation of its reliability, internal and external validity. Methods The VSS was translated into German according to recognized guidelines. Psychometric properties were tested on 52 healthy controls and 202 participants with vestibulopathy. Internal validity and reliability were investigated with factor analysis, Cronbach’s α and ICC estimations. Discriminant validity was analysed with the Mann–Whitney-U-Test between patients and controls and the ROC-Curve. Convergent validity was estimated with the correlation with the Hospital Anxiety Subscale (HADS-A), Dizziness Handicap Inventory (DHI) and frequency of dizziness. Results Internal validity: factor analysis confirmed the structure of two subscales. Reliability: VSS-G: α = 0.904 and ICC (CI) =0.926 (0.826, 0.965). Discriminant validity: VSS-VER differentiate patients and controls ROC (CI) =0.99 (0.98, 1.00). Convergent validity: VSS-G correlates with DHI (r = 0.554) and frequency (T = 0.317). HADS-A correlates with VSS-AA (r = 0.452) but not with VSS-VER (r = 0.186). Conclusions The VSS-G showed satisfactory psychometric properties to assess the severity of vertigo or vertigo-related symptoms. The VSS-VER can differentiate between healthy subjects and patients with vestibular disorders. The VSS-AA showed some screening properties with high sensitivity for patients with abnormal anxiety. PMID:22747644
Testing of the SEE and OEE post-hip fracture.
Resnick, Barbara; Orwig, Denise; Zimmerman, Sheryl; Hawkes, William; Golden, Justine; Werner-Bronzert, Michelle; Magaziner, Jay
2006-08-01
The purpose of this study was to test the reliability and validity of the Self-Efficacy for Exercise (SEE) and the Outcome Expectations for Exercise (OEE) scales in a sample of 166 older women post-hip fracture. There was some evidence of validity of the SEE and OEE based on confirmatory factor analysis and Rasch model testing, criterion-based and convergent validity, and evidence of internal consistency based on alpha coefficients and separation indices and reliability based on R2 estimates. Rasch model testing demonstrated that some items had high variability. Based on these findings, suggestions are made for how items could be revised and the scales improved for future use.
Using Reliability to Meet Z540.3's 2 percent Rule
NASA Technical Reports Server (NTRS)
Mimbs, Scott M.
2011-01-01
NASA's Kennedy Space Center (KSC) undertook implementation of ANSI/NCSL Z540.3-2006 in October 2008. Early in the implementation, KSC identified that the largest cost driver of Z540.3 implementation is measurement uncertainty analyses for legacy calibration processes. NASA, like other organizations, has a significant inventory of measuring and test equipment (MTE) that have documented calibration procedures without documented measurement uncertainties. This paper provides background information to support the rationale for using high in-tolerance reliability as evidence of compliance to the 2% probability of false acceptance (PFA) quality metric of ANSI/NCSL Z540.3-2006 allowing use of qualifying legacy processes. NASA is adopting this as policy and is recommending NCSL International consider this as a method of compliance to Z540.3. Topics covered include compliance issues, using end-of-period reliability (EOPR) to estimate test point uncertainty, reliability data influences within the PFA model, the validity of EOPR data, and an appendix covering "observed" versus "true" EOPR.
The reliability and stability of visual working memory capacity.
Xu, Z; Adam, K C S; Fang, X; Vogel, E K
2018-04-01
Because of the central role of working memory capacity in cognition, many studies have used short measures of working memory capacity to examine its relationship to other domains. Here, we measured the reliability and stability of visual working memory capacity, measured using a single-probe change detection task. In Experiment 1, the participants (N = 135) completed a large number of trials of a change detection task (540 in total, 180 each of set sizes 4, 6, and 8). With large numbers of both trials and participants, reliability estimates were high (α > .9). We then used an iterative down-sampling procedure to create a look-up table for expected reliability in experiments with small sample sizes. In Experiment 2, the participants (N = 79) completed 31 sessions of single-probe change detection. The first 30 sessions took place over 30 consecutive days, and the last session took place 30 days later. This unprecedented number of sessions allowed us to examine the effects of practice on stability and internal reliability. Even after much practice, individual differences were stable over time (average between-session r = .76).
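The abstract above mentions an iterative down-sampling procedure that yields a look-up table of expected reliability for smaller designs, but it does not spell out the steps. One plausible reading can be sketched as follows: repeatedly draw random subsets of participants and trials from the full data set, compute a reliability coefficient on each subset, and average the results per design cell. The sketch below is an assumption-laden illustration rather than the authors' procedure; it treats trials as items in Cronbach's alpha, uses simulated continuous scores, and all function names are hypothetical.

```python
import random
from statistics import mean, variance


def cronbach_alpha(rows):
    """Alpha with trials treated as items (an assumption for this sketch)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def downsample_table(data, n_subj_options, n_trial_options, n_iter=50, seed=0):
    """Average alpha over random sub-samples of participants and trials.

    data: participants-by-trials table of scores.
    Returns a dict keyed by (n_participants, n_trials).
    """
    rng = random.Random(seed)
    n_subj, n_trials = len(data), len(data[0])
    table = {}
    for ns in n_subj_options:
        for nt in n_trial_options:
            estimates = []
            for _ in range(n_iter):
                # Sample participants and trials without replacement.
                subj = rng.sample(range(n_subj), ns)
                trials = rng.sample(range(n_trials), nt)
                sub = [[data[s][t] for t in trials] for s in subj]
                estimates.append(cronbach_alpha(sub))
            table[(ns, nt)] = mean(estimates)
    return table


# Simulated data (hypothetical): 60 participants, 40 trials,
# each score = stable individual ability + independent trial noise.
sim = random.Random(1)
data = []
for _ in range(60):
    ability = sim.gauss(0, 1)
    data.append([ability + sim.gauss(0, 1) for _ in range(40)])

table = downsample_table(data, [10, 30], [5, 40], n_iter=20)
```

Each entry of `table` can then be read as the reliability one might expect from an experiment of that size, under the assumption that the full data set is representative.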
Imura, Tomoya; Takamura, Masahiro; Okazaki, Yoshihiro; Tokunaga, Satoko
2016-10-01
We developed a scale to measure time management and assessed its reliability and validity. We then used this scale to examine the impact of time management on psychological stress response. In Study 1-1, we developed the scale and assessed its internal consistency and criterion-related validity. Findings from a factor analysis revealed three elements of time management, “time estimation,” “time utilization,” and “taking each moment as it comes.” In Study 1-2, we assessed the scale’s test-retest reliability. In Study 1-3, we assessed the validity of the constructed scale. The results indicate that the time management scale has good reliability and validity. In Study 2, we performed a covariance structural analysis to verify our model that hypothesized that time management influences perceived control of time and psychological stress response, and perceived control of time influences psychological stress response. The results showed that time estimation increases the perceived control of time, which in turn decreases stress response. However, we also found that taking each moment as it comes reduces perceived control of time, which in turn increases stress response.
Fernández-Calderón, Fermín; Díaz-Batanero, Carmen; Rojas-Tejada, Antonio J; Castellanos-Ryan, Natalie; Lozano-Rojas, Óscar M
2017-07-14
The identification of different personality risk profiles for substance misuse is useful in preventing substance-related problems. This study aims to test the psychometric properties of a new version of the Substance Use Risk Profile Scale (SURPS) for Spanish college students. Cross-sectional study with 455 undergraduate students from four Spanish universities. A new version of the SURPS, adapted to the Spanish population, was administered with the Beck Hopelessness Scale, the UPPS-P Impulsive Behavior Scale, the State-Trait Anxiety Inventory (STAI) and the Alcohol Use Disorders Identification Test (AUDIT). Internal consistency reliability ranged between 0.652 and 0.806 for the four SURPS subscales, while reliability estimated by split-half coefficients varied from 0.686 to 0.829. The estimated test-retest reliability ranged between 0.733 and 0.868. The expected four-factor structure of the original scale was replicated. As evidence of convergent validity, we found that the SURPS subscales were significantly associated with other conceptually-relevant personality scales and significantly associated with alcohol use measures in theoretically-expected ways. This SURPS version may be a useful instrument for measuring personality traits related to vulnerability to substance use and misuse when targeting personality with preventive interventions.
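The abstract above reports split-half coefficients alongside internal consistency values without saying how the halves were formed or corrected. A common textbook variant, shown here only as a hedged sketch, splits the items into odd- and even-numbered halves, correlates the two half scores, and applies the Spearman-Brown correction 2r/(1+r) to estimate full-length reliability. The odd/even split, the correction step, the function names, and the demo data are all assumptions for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_reliability(scores):
    """Odd/even split-half reliability with Spearman-Brown correction.

    scores: list of rows, one per respondent, each a list of item scores.
    """
    # Half scores: items 1, 3, 5, ... versus items 2, 4, 6, ...
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    r = pearson(odd_half, even_half)
    # Step the half-test correlation up to full test length.
    return 2 * r / (1 + r)


# Hypothetical responses: 4 respondents, 4 items.
demo = [[1, 2, 1, 2], [2, 3, 2, 3], [3, 3, 4, 4], [5, 4, 5, 5]]
rel_demo = split_half_reliability(demo)
```

Other splits (first half versus second half, or random halves) give different values, which is one reason the split used should be reported with the coefficient.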
Ferris, M; Cohen, S; Haberman, C; Javalkar, K; Massengill, S; Mahan, J D; Kim, S; Bickford, K; Cantu, G; Medeiros, M; Phillips, A; Ferris, M T; Hooper, S R
2015-01-01
The Self-Management and Transition to Adulthood with Rx=Treatment (STARx) Questionnaire was developed to collect information on self-management and health care transition (HCT) skills, via self-report, in a broad population of adolescents and young adults (AYAs) with chronic conditions. Over several iterations, the STARx questionnaire was created with AYA, family, and health provider input. The development and pilot testing of the STARx Questionnaire took place with the assistance of 1219 AYAs with different chronic health conditions, in multiple institutions and settings over three phases: item development, pilot testing, reliability and factor structuring. The three development phases resulted in a final version of the STARx Questionnaire. The exploratory factor analysis of the third version of the 18-item STARx identified six factors that accounted for about 65% of the variance: Medication management, Provider communication, Engagement during appointments, Disease knowledge, Adult health responsibilities, and Resource utilization. Reliability estimates revealed good internal consistency and temporal stability, with the alpha coefficient for the overall scale being .80. The STARx was developmentally sensitive, with older patients scoring significantly higher on nearly every factor than younger patients. The STARx Questionnaire is a reliable, self-report tool with adequate internal consistency, temporal stability, and a strong, multidimensional factor structure. It provides another assessment strategy to measure self-management and transition skills in AYAs with chronic conditions. Copyright © 2015 Elsevier Inc. All rights reserved.
2009-02-17
Identification of Classified Information in Unclassified DoD Systems During the Audit of Internal Controls and Data Reliability in the Deployable Disbursing System (Report No. D-2009-054). We are providing this...
The McCanse Readiness for Death Instrument (MRDI): a reliable and valid measure for hospice care.
McCanse, R P
1995-01-01
The purpose of this study was to establish whether or not readiness for death, as an indicator of healthy dying, is a measurable concept. Review of relevant literature revealed consensus regarding the universality of a human need for healthy dying. A theory of healthy dying was derived from the Rogerian paradigm. The McCanse Readiness for Death Instrument (MRDI) was constructed, which included indicators of physiological, psychological, sociological, and spiritual aspects of "healthy" field pattern as death is developmentally approached. The MRDI was a 26-item structured interview questionnaire which generated interval-ratio data through a visual analog scale. A pretest was conducted with a sample of 9 volunteer patients drawn from a small suburban outpatient hospice. The MRDI was concurrently administered to dying individuals, their primary caregivers, and their primary hospice nurses. Correlations between dying individuals' scores and their primary caregivers' estimates of patient death readiness as well as between patients and their primary hospice nurses were very encouraging. Cronbach's coefficient alpha for internal consistency reliability was .59. Content validity was supported by consensus of an expert panel of practicing hospice nurses. Construct validity was demonstrated through legitimate placement of the concept, healthy death readiness, within the theoretical web which supported it. The MRDI was then administered to a sample of 31 terminally-ill individuals, their primary caregivers, and their primary hospice nurses drawn from larger, urban hospice populations in three geographic areas of the United States. The MRDI was also administered to a contrast group of 39 cardiac-impaired individuals who were not terminally-ill. Overall internal consistency of the MRDI was found to be quite favorable (alpha = .76). Debilitating illness and actual mortality in the study sample precluded and/or confounded estimates of test-retest reliability. 
Convergent validity of the MRDI was indicated by significant correlations between patients' scores and primary caregivers' estimates (r = .35, p < .05) and between patients' scores and primary hospice nurses' estimates (r = .53, p < .01). Discriminant validity of the MRDI was demonstrated by a significant mean difference between the group of terminally-ill patients and the group of non-terminal, cardiac-impaired patients (t = 1.76, p < .01).
METAPHOR: Probability density estimation for machine learning based photometric redshifts
NASA Astrophysics Data System (ADS)
Amaro, V.; Cavuoti, S.; Brescia, M.; Vellucci, C.; Tortora, C.; Longo, G.
2017-06-01
We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method able to provide a reliable PDF for photometric galaxy redshifts estimated through empirical techniques. METAPHOR is a modular workflow, mainly based on the MLPQNA neural network as internal engine to derive photometric galaxy redshifts, but giving the possibility to easily replace MLPQNA with any other method to predict photo-z's and their PDF. We present here the results of a validation test of the workflow on the galaxies from SDSS-DR9, showing also the universality of the method by replacing MLPQNA with KNN and Random Forest models. The validation test also includes a comparison with the PDFs derived from a traditional SED template fitting method (Le Phare).
NASA Astrophysics Data System (ADS)
Reshchikov, M. A.; Foussekis, M.; McNamara, J. D.; Behrends, A.; Bakin, A.; Waag, A.
2012-04-01
The optical properties of high-quality GaN co-doped with silicon and zinc are investigated by using temperature-dependent continuous-wave and time-resolved photoluminescence measurements. The blue luminescence band is related to the ZnGa acceptor in GaN:Si,Zn, which exhibits an exceptionally high absolute internal quantum efficiency (IQE). An IQE above 90% was calculated for several samples having different concentrations of Zn. Accurate and reliable values of the IQE were obtained by using several approaches based on rate equations. The concentrations of the ZnGa acceptors and free electrons were also estimated from the photoluminescence measurements.
The reliability of the Hendrich Fall Risk Model in a geriatric hospital.
Heinze, Cornelia; Halfens, Ruud; Dassen, Theo
2008-12-01
Aims and objectives. The purpose of this study was to test the interrater reliability of the Hendrich Fall Risk Model, an instrument to identify patients in a hospital setting with a high risk of falling. Background. Falls are a serious problem in older patients. Valid and reliable fall risk assessment tools are required to identify high-risk patients and to take adequate preventive measures. Methods. Seventy older patients were independently and simultaneously assessed by six pairs of raters made up of nursing staff members. Consensus estimates were calculated using simple percentage agreement, and consistency estimates using Spearman's rho and the intraclass correlation coefficient. Results. Percentage agreement ranged from 0.70 to 0.92 between the six pairs of raters. Spearman's rho coefficients were between 0.54 and 0.80 and the intraclass correlation coefficients were between 0.46 and 0.92. Conclusions. Whereas some pairs of raters obtained considerable interobserver agreement and internal consistency, the others did not. Therefore, it is concluded that the Hendrich Fall Risk Model is not a reliable instrument. The use of more unambiguous operationalized items is preferred. Relevance to clinical practice. In practice, well operationalized fall risk assessment tools are necessary. Observer agreement should always be investigated after introducing a standardized measurement tool. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd.
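The consensus and consistency estimates named in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the example ratings are hypothetical, ties are handled with average ranks, and nothing here reproduces the study's own data or software.

```python
def percent_agreement(rater_a, rater_b):
    """Share of cases on which two raters give exactly the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(rater_a, rater_b):
    """Spearman's rho: Pearson correlation of the rank-transformed scores."""
    return pearson(average_ranks(rater_a), average_ranks(rater_b))


# Hypothetical fall-risk scores from one pair of raters.
rater_a = [1, 2, 3, 4, 5]
rater_b = [1, 2, 3, 5, 4]
print(percent_agreement(rater_a, rater_b), round(spearman_rho(rater_a, rater_b), 3))
```

The two statistics answer different questions: percentage agreement asks whether raters give identical scores, while Spearman's rho asks only whether they order patients similarly.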
Travaglini, Davide; Fattorini, Lorenzo; Barbati, Anna; Bottalico, Francesca; Corona, Piermaria; Ferretti, Marco; Chirici, Gherardo
2013-04-01
A correct characterization of the status and trend of forest condition is essential to support reporting processes at national and international level. International forest condition monitoring has been carried out in Europe since 1987 under the auspices of the International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests). The monitoring is based on harmonized methodologies, with individual countries being responsible for its implementation. Due to inconsistencies and problems in sampling design, however, the ICP Forests network is not able to produce reliable quantitative estimates of forest condition at European and sometimes at country level. This paper proposes (1) a set of requirements for status and change assessment and (2) a harmonized sampling strategy able to provide unbiased and consistent estimators of forest condition parameters and of their changes at both country and European level. Under the assumption that a common definition of forest holds among European countries, monitoring objectives, parameters of concern and accuracy indexes are stated. On the basis of fixed-area plot sampling performed independently in each country, an unbiased and consistent estimator of forest defoliation indexes is obtained at both country and European level, together with conservative estimators of their sampling variance and power in the detection of changes. The strategy adopts a probabilistic sampling scheme based on fixed-area plots selected by means of systematic or stratified schemes. Operative guidelines for its application are provided.
Test battery for measuring the perception and recognition of facial expressions of emotion
Wilhelm, Oliver; Hildebrandt, Andrea; Manske, Karsten; Schacht, Annekathrin; Sommer, Werner
2014-01-01
Despite the importance of perceiving and recognizing facial expressions in everyday life, there is no comprehensive test battery for the multivariate assessment of these abilities. As a first step toward such a compilation, we present 16 tasks that measure the perception and recognition of facial emotion expressions, and data illustrating each task's difficulty and reliability. The scoring of these tasks focuses on either the speed or accuracy of performance. A sample of 269 healthy young adults completed all tasks. In general, accuracy and reaction time measures for emotion-general scores showed acceptable and high estimates of internal consistency and factor reliability. Emotion-specific scores yielded lower reliabilities, yet high enough to encourage further studies with such measures. Analyses of task difficulty revealed that all tasks are suitable for measuring emotion perception and emotion recognition related abilities in normal populations. PMID:24860528
Citronberg, Jessica S; Wilkens, Lynne R; Lim, Unhee; Hullar, Meredith A J; White, Emily; Newcomb, Polly A; Le Marchand, Loïc; Lampe, Johanna W
2016-09-01
Plasma lipopolysaccharide-binding protein (LBP), a measure of internal exposure to bacterial lipopolysaccharide, has been associated with several chronic conditions and may be a marker of chronic inflammation; however, no studies have examined the reliability of this biomarker in a healthy population. We examined the temporal reliability of LBP measured in archived samples from participants in two studies. In Study one, 60 healthy participants had blood drawn at two time points: baseline and follow-up (either three, six, or nine months). In Study two, 24 individuals had blood drawn three to four times over a seven-month period. We measured LBP in archived plasma by ELISA. Test-retest reliability was estimated by calculating the intraclass correlation coefficient (ICC). Plasma LBP concentrations showed moderate reliability in Study one (ICC 0.60, 95 % CI 0.43-0.75) and Study two (ICC 0.46, 95 % CI 0.26-0.69). Restricting the follow-up period improved reliability. In Study one, the reliability of LBP over a three-month period was 0.68 (95 % CI: 0.41-0.87). In Study two, the ICC of samples taken ≤seven days apart was 0.61 (95 % CI 0.29-0.86). Plasma LBP concentrations demonstrated moderate test-retest reliability in healthy individuals with reliability improving over a shorter follow-up period.
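The abstract above does not say which form of the intraclass correlation coefficient was used. As a hedged illustration, the sketch below computes the one-way random-effects ICC for single measurements from a balanced table of repeated measurements; the function name `icc_oneway` and the toy values are assumptions made for this example only.

```python
def icc_oneway(data):
    """One-way random-effects ICC for single measurements.

    data: list of subjects, each a list of k repeated measurements
    (balanced design: every subject has the same k >= 2).
    """
    n = len(data)
    k = len(data[0])
    grand_mean = sum(sum(row) for row in data) / (n * k)
    subject_means = [sum(row) / k for row in data]
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(data, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical concentrations for 3 participants measured at 2 time points.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(round(icc_oneway(toy), 3))
```

The ICC approaches 1 when differences between participants dominate the within-participant fluctuation, which is the sense in which a biomarker is called temporally reliable.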
Farias, José Cazuza de; Loch, Mathias Roberto; Lima, Antônio José de; Sales, Joana Marcela; Ferreira, Flávia Emília Leite de Lima
2017-09-28
The objective of this two-part study was to estimate the reproducibility, internal consistency, and construct validity of KIDSCREEN-27, a questionnaire to measure health-related quality of life, in Brazilian adolescents. One study component estimated reproducibility (176 adolescents, 59.7% females, 64.7% 10 to 12 years of age), and another estimated internal consistency and validity (1,321 adolescents, 53.7% females, 56.9% 10 to 12 years of age). The studies were conducted with adolescents of both sexes in public schools in the municipality of João Pessoa, Paraíba State, Brazil. KIDSCREEN-27 consists of 27 items distributed across five domains (physical well-being, 5 items; psychological well-being, 7 items; parents and social support, 7 items; autonomy and relationship with parents, 4 items; school environment, 4 items). Reproducibility was estimated by intra-class correlation coefficient (ICC). Confirmatory factor analysis was used to assess construct validity, and composite reliability index (CRI) was used to verify the questionnaire's internal consistency. ICCs were greater than or equal to 0.70 (0.70 to 0.96). Factor loadings were greater than 0.40, except for five items (0.28 to 0.39). The model's goodness-of-fit indices were adequate (χ2/df = 2.79; RMR = 0.035; RMSEA = 0.037; GFI = 0.951; AGFI = 0.941; CFI = 0.908; TLI = 0.901). CRI varied from 0.65 to 0.70 in the domains and was 0.90 for the questionnaire. KIDSCREEN-27 reached satisfactory levels of reproducibility, internal consistency, and construct validity and can be used to assess health-related quality of life in Brazilian adolescents 10 to 15 years of age.
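The abstract above does not state the formula behind its composite reliability index. A commonly used version, assumed here, works from standardized factor loadings with uncorrelated errors: the squared sum of loadings divided by that quantity plus the summed error variances. The sketch below uses hypothetical loadings, not the KIDSCREEN-27 estimates.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings.

    Assumes uncorrelated errors, so each item's error variance
    is 1 - loading**2.
    """
    squared_sum = sum(loadings) ** 2
    error_var = sum(1 - lam ** 2 for lam in loadings)
    return squared_sum / (squared_sum + error_var)


# Hypothetical standardized loadings for a 4-item domain.
print(round(composite_reliability([0.45, 0.60, 0.55, 0.70]), 3))
```

Under this formula, domains with few items or modest loadings yield lower values than the full questionnaire, a pattern consistent with the domain-level and total values reported above.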
Walker, C; Papadopoulos, L; Lipton, M; Hussein, M
2006-02-01
A lack of information about disease in children can lead to erroneous views such as children believing that hospital admittance or the presence of a disease is a punishment for a perceived wrong. There has thus far been no standard tool available to measure children's illness conceptualizations from a Leventhalian framework. Three groups of children with eczema, asthma and eczema and asthma between the ages of 7 and 12 years of age were recruited. Children were given the Children's Illness Perception Questionnaire (CIPQ), a 26-item instrument adapted from the Illness Perception Questionnaire for adults. A Kuder - Richardson 20 test of reliability for dichotomous data was performed allowing an estimate of the internal consistency of the measurement scales. It can be seen that, for all three illness groups, internal consistency is acceptable for the timeline and consequences scale. The cure/control scale, however, was not internally consistent for any illness group. As health professionals, we need to develop the means to further understand how paediatric illness beliefs relate to specific disease types, age and psychosocial factors and the utility of this instrument is discussed within this context.
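For dichotomous items such as those described above, the Kuder-Richardson 20 coefficient can be sketched as follows. This is a minimal illustration with made-up responses; the function name `kr20` is hypothetical, and the use of the population variance for total scores (to match the p(1 - p) item variances) is an assumption, since conventions differ between textbooks and packages.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for 0/1 item responses.

    responses: list of respondents, each a list of k values in {0, 1}.
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of respondents scoring 1 on each item.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of total scores, matching the p*q item variances.
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical yes/no answers from 4 children on a 3-item scale.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
```

KR-20 is the special case of coefficient alpha for binary items, so a scale that is "not internally consistent" in this sense is one whose items share little common variance.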
The structure of internal stresses in the uncompacted ice cover
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sukhorukov, K.K.
1995-12-31
Interactions between engineering structures and sea ice cover are associated with an inhomogeneous space/time field of internal stresses. Field measurements (e.g., Coon, 1989; Tucker, 1992) have revealed considerable local stresses depending on the regional stress field and ice structure. These stresses appear in different time and space scales and depend on rheologic properties of the ice. To estimate properly the stressed state a knowledge of a connection between internal stress components in various regions of the ice cover is necessary. To develop reliable algorithms for estimates of ice action on engineering structures new experimental data are required to take into account both microscale (comparable with local ice inhomogeneities) and small-scale (kilometers) inhomogeneities of the ice cover. Studies of compacted ice (concentration N is nearly 1) are mostly important. This paper deals with the small-scale spatial distribution of internal stresses in the interaction zone between the ice covers of various concentrations and icebergs. The experimental conditions model a situation of the interaction between a wide structure and the ice cover. Field data on a drifting ice were collected during the Russian-US experiment in Antarctica WEDDELL-I in 1992.
Back to the future: estimating pre-injury brain volume in patients with traumatic brain injury.
Ross, David E; Ochs, Alfred L; D Zannoni, Megan; Seabaugh, Jan M
2014-11-15
A recent meta-analysis by Hedman et al. allows for accurate estimation of brain volume changes throughout the life span. Additionally, Tate et al. showed that intracranial volume at a later point in life can be used to estimate reliably brain volume at an earlier point in life. These advancements were combined to create a model which allowed the estimation of brain volume just prior to injury in a group of patients with mild or moderate traumatic brain injury (TBI). This volume estimation model was used in combination with actual measurements of brain volume to test hypotheses about progressive brain volume changes in the patients. Twenty six patients with mild or moderate TBI were compared to 20 normal control subjects. NeuroQuant® was used to measure brain MRI volume. Brain volume after the injury (from MRI scans performed at t1 and t2) was compared to brain volume just before the injury (volume estimation at t0) using longitudinal designs. Groups were compared with respect to volume changes in whole brain parenchyma (WBP) and its 3 major subdivisions: cortical gray matter (GM), cerebral white matter (CWM) and subcortical nuclei+infratentorial regions (SCN+IFT). Using the normal control data, the volume estimation model was tested by comparing measured brain volume to estimated brain volume; reliability ranged from good to excellent. During the initial phase after injury (t0-t1), the TBI patients had abnormally rapid atrophy of WBP and CWM, and abnormally rapid enlargement of SCN+IFT. Rates of volume change during t0-t1 correlated with cross-sectional measures of volume change at t1, supporting the internal reliability of the volume estimation model. A logistic regression analysis using the volume change data produced a function which perfectly predicted group membership (TBI patients vs. normal control subjects). During the first few months after injury, patients with mild or moderate TBI have rapid atrophy of WBP and CWM, and rapid enlargement of SCN+IFT. 
The magnitude and pattern of the changes in volume may allow for the eventual development of diagnostic tools based on the volume estimation approach. Copyright © 2014 Elsevier Inc. All rights reserved.
The live donor assessment tool: a psychosocial assessment tool for live organ donors.
Iacoviello, Brian M; Shenoy, Akhil; Braoude, Jenna; Jennings, Tiane; Vaidya, Swapna; Brouwer, Julianna; Haydel, Brandy; Arroyo, Hansel; Thakur, Devendra; Leinwand, Joseph; Rudow, Dianne LaPointe
2015-01-01
Psychosocial evaluation is an important part of the live organ donor evaluation process, yet it is not standardized across institutions, and although tools exist for the psychosocial evaluation of organ recipients, none exist to assess donors. We set out to develop a semistructured psychosocial evaluation tool (the Live Donor Assessment Tool, LDAT) to assess potential live organ donors and to conduct preliminary analyses of the tool's reliability and validity. Review of the literature on the psychosocial variables associated with treatment adherence, quality of life, live organ donation outcome, and resilience, as well as review of the procedures for psychosocial evaluation at our center and other centers around the country, identified 9 domains to address; these domains were distilled into several items each, in collaboration with colleagues at transplant centers across the country, for a total of 29 items. Four raters were trained to use the LDAT, and they retrospectively scored 99 psychosocial evaluations conducted on live organ donor candidates. Reliability of the LDAT was assessed by calculating the internal consistency of the items in the scale and interrater reliability between raters; validity was estimated by comparing LDAT scores between those with a "positive" evaluation outcome and "negative" outcome. The LDAT was found to have good internal consistency, inter-rater reliability, and showed signs of validity: LDAT scores differentiated the positive vs. negative outcome groups. The LDAT demonstrated good reliability and validity, but future research on the LDAT and the ability to implement the LDAT prospectively is warranted. Copyright © 2015 The Academy of Psychosomatic Medicine. Published by Elsevier Inc. All rights reserved.
Papadopoulou, Soultana L.; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam
2016-01-01
Introduction The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way Anova test. Results According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way Anova test in the two measurement times showed statistically significant correlation in both measurements (p = 0.02 and p = 0.016). Conclusion The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients. PMID:28050209
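Cohen's kappa, used above for test-retest reliability, corrects raw agreement for the agreement expected by chance. The following minimal Python sketch shows the unweighted form for two sets of categorical answers; the function name and example answers are hypothetical and do not come from the Ohkuma data.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equal-length lists of categories."""
    n = len(first)
    # Observed proportion of identical answers.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal frequencies.
    counts_first = Counter(first)
    counts_second = Counter(second)
    categories = set(first) | set(second)
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical answers to one item at test (first) and retest (second).
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

Note that kappa is undefined when chance agreement equals 1 (for example, when every answer falls in a single category), so a production version would need to guard against division by zero.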
Utilization of bone impedance for age estimation in postmortem cases.
Ishikawa, Noboru; Suganami, Hideki; Nishida, Atsushi; Miyamori, Daisuke; Kakiuchi, Yasuhiro; Yamada, Naotake; Wook-Cheol, Kim; Kubo, Toshikazu; Ikegaya, Hiroshi
2015-11-01
In the field of forensic medicine, the number of unidentified cadavers has increased due to natural disasters and international terrorism. Age estimation is very important for identifying the victims. Assessing the degree of sagittal suture closure is one such age estimation method; however, it is not widely accepted as reliable. In this study, we examined whether the impedance value (z-value) measured at the sagittal suture of the skull is related to age in men and women, and we discuss the possibility of using bone impedance for age estimation. Bone impedance values increased with aging and decreased after the age of 64.5. We then compared age estimation by the conventional visual method with the proposed bone impedance measurement technique. It is suggested that the bone impedance measuring technique may be of value to forensic science as a method of age estimation. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Kurtz, J E; Lee, P A; Sherker, J L
1999-06-01
This study examines the internal consistency and temporal stability of informant ratings from two widely used instruments for normal personality assessment, the revised NEO Personality Inventory (NEO PI-R) and the Interpersonal Adjective Scales (IAS). Well-known adult targets were selected by 109 undergraduate students and rated on two occasions separated by a 6-month interval. With few exceptions, estimates of internal consistency are adequate to good for both instruments. NEO PI-R domain scores yield coefficient alphas ranging from .89 to .96, with a median of .80 for the 30 facet scales. IAS octant scales show coefficient alphas ranging from .83 to .92. Retest Pearson correlations are above .70 for each of the NEO PI-R domain scores and both IAS axis coordinates, and intraclass correlations are above .60 for all scales from both instruments. Score changes were small but statistically significant for three of the five NEO PI-R domains at retest. The retest stability of IAS type classifications varies as a function of the extremity of the associated octant scores.
Researches of fruit quality prediction model based on near infrared spectrum
NASA Astrophysics Data System (ADS)
Shen, Yulin; Li, Lian
2018-04-01
With rising standards for food quality and safety, people pay more attention to the internal quality of fruit, so measuring it is increasingly important. Nondestructive analysis of soluble solid content (SSC) and total acid content (TAC) is vital and effective for quality measurement in global fresh produce markets. In this paper, we aim to establish a novel prediction model of fruit internal quality, based on SSC and TAC, from near-infrared spectra. First, fruit quality prediction models based on PCA + BP neural network, PCA + GRNN network, PCA + BP AdaBoost strong classifier, PCA + ELM, and PCA + LS_SVM classifier are designed and implemented. Second, in the NSCT domain, the median filter and the Savitzky-Golay filter are used to preprocess the spectral signal, and the Kennard-Stone algorithm is used to automatically select the training and test samples. Third, we obtain the optimal models by comparing 15 prediction models under a multi-classifier competition mechanism: nonparametric estimation is introduced to measure the effectiveness of each model, using the reliability and variance of the nonparametric estimate to evaluate the prediction results, with the estimated value and confidence interval as a reference. The experimental results demonstrate that this approach can better achieve the optimal evaluation of the internal quality of fruit. Finally, we employ cat swarm optimization to optimize the two optimal models obtained from nonparametric estimation; empirical testing indicates that the proposed method provides more accurate and effective results than other forecasting methods.
Rigo-Bonnin, Raül; Blanco-Font, Aurora; Canalias, Francesca
2018-05-08
Values of mass concentration of tacrolimus in whole blood are commonly used by clinicians to monitor the status of a transplant patient and to check whether the administered dose of tacrolimus is effective. Clinical laboratories must therefore provide results that are as accurate as possible, and measurement uncertainty helps to ensure the reliability of these results. The aim of this study was to estimate the measurement uncertainty of whole blood tacrolimus mass concentration values obtained by UHPLC-MS/MS using two top-down approaches: the single laboratory validation approach and the proficiency testing approach. For the single laboratory validation approach, we estimated the uncertainties associated with the intermediate imprecision (using long-term internal quality control data) and the bias (using a certified reference material). Next, we combined them with the uncertainties related to the calibrator-assigned values to obtain a combined uncertainty and, finally, calculated the expanded uncertainty. For the proficiency testing approach, the uncertainty was estimated in a similar way to the single laboratory validation approach, but using data from internal and external quality control schemes to estimate the uncertainty related to the bias. The estimated expanded uncertainties for single laboratory validation and for proficiency testing using internal and external quality control schemes were 11.8%, 13.2%, and 13.0%, respectively. After performing the two top-down approaches, we observed that their uncertainty results were quite similar. This would confirm that either of the two approaches could be used to estimate the measurement uncertainty of whole blood tacrolimus mass concentration values in clinical laboratories. Copyright © 2018 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
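The combination step described in the abstract above can be illustrated with a short Python sketch. It assumes that the components are independent relative standard uncertainties combined in quadrature, and that the expanded uncertainty uses a coverage factor of 2 (a conventional choice for roughly 95% coverage). The component values are hypothetical; the abstract reports only the final expanded uncertainties.

```python
import math


def expanded_uncertainty(components, coverage_factor=2.0):
    """Combine independent standard uncertainties and expand the result.

    components: standard uncertainties in the same units (here, percent).
    coverage_factor: multiplier applied to the combined uncertainty;
    2.0 is an assumed, conventional value.
    """
    combined = math.sqrt(sum(u ** 2 for u in components))
    return coverage_factor * combined


# Hypothetical relative uncertainties (%) for imprecision, bias, calibrators.
print(round(expanded_uncertainty([4.5, 3.0, 2.0]), 1))
```

Because the components add in quadrature, the largest one dominates the result, so reducing a minor component changes the expanded uncertainty very little.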
Comprehensive clinical assessment in community setting: applicability of the MDS-HC.
Morris, J N; Fries, B E; Steel, K; Ikegami, N; Bernabei, R; Carpenter, G I; Gilgen, R; Hirdes, J P; Topinková, E
1997-08-01
To describe the results of an international trial of the home care version of the MDS assessment and problem identification system (the MDS-HC), including reliability estimates, a comparison of MDS-HC reliabilities with reliabilities of the same items in the MDS 2.0 nursing home assessment instrument, and an examination of the types of problems found in home care clients using the MDS-HC. Independent, dual assessment of clients of home-care agencies by trained clinicians using a draft of the MDS-HC, with additional descriptive data regarding problem profiles for home care clients. Reliability data from dual assessments of 241 randomly selected clients of home care agencies in five countries, all of whom volunteered to test the MDS-HC. Also included are an expanded sample of 780 home care assessments from these countries and 187 dually assessed residents from 21 nursing homes in the United States. The array of MDS-HC assessment items included measures in the following areas: personal items, cognitive patterns, communication/hearing, vision, mood and behavior, social functioning, informal support services, physical functioning, continence, disease diagnoses, health conditions and preventive health measures, nutrition/hydration, dental status, skin condition, environmental assessment, service utilization, and medications. Forty-seven percent of the functional, health status, social environment, and service items in the MDS-HC were taken from the MDS 2.0 for nursing homes. For this item set, it is estimated that the average weighted Kappa is .74 for the MDS-HC and .75 for the MDS 2.0. Similarly, high reliability values were found for items newly introduced in the MDS-HC (weighted Kappa = .70). Descriptive findings also characterize the problems of home care clients, with subanalyses within cognitive performance levels. Findings indicate that the core set of items in the MDS 2.0 works equally well in community and nursing home settings. New items are highly reliable. 
In tandem, these instruments can be used within the international community, assisting and planning care for older adults within a broad spectrum of service settings, including nursing homes and home care programs. With this community-based, second-generation problem and care plan-driven assessment instrument, disability assessment can be performed consistently across the world.
Peker, Kadriye; Köse, Taha Emre; Güray, Beliz; Uysal, Ömer; Erdem, Tamer Lütfi
2017-04-01
To culturally adapt the Turkish version of Rapid Estimate of Adult Literacy in Dentistry (TREALD-30) for Turkish-speaking adult dental patients and to evaluate its psychometric properties. After translation and cross-cultural adaptation, TREALD-30 was tested in a sample of 127 adult patients who attended a dental school clinic in Istanbul. Data were collected through clinical examinations and self-completed questionnaires, including TREALD-30, the Oral Health Impact Profile (OHIP), the Rapid Estimate of Adult Literacy in Medicine (REALM), two health literacy screening questions, and socio-behavioral characteristics. Psychometric properties were examined using Classical Test Theory (CTT) and Rasch analysis. Internal consistency (Cronbach's Alpha = 0.91) and test-retest reliability (Intraclass correlation coefficient = 0.99) were satisfactory for TREALD-30. It exhibited good convergent and predictive validity. Monthly family income, years of education, dental flossing, health literacy, and health literacy skills were found to be stronger predictors of patients' oral health literacy (OHL). Confirmatory factor analysis (CFA) confirmed a two-factor model. The Rasch model explained 37.9% of the total variance in this dataset. In addition, TREALD-30 had eleven misfitting items, which indicated evidence of multidimensionality. The reliability indices provided in Rasch analysis (person separation reliability = 0.91 and expected-a-posteriori/plausible reliability = 0.94) indicated that TREALD-30 had acceptable reliability. TREALD-30 showed satisfactory psychometric properties. It may be used to identify patients with low OHL. Socio-demographic factors, oral health behaviors and health literacy skills should be taken into account when planning future studies to assess the OHL in both clinical and community settings.
Tsai, Alexander C.
2014-01-01
OBJECTIVES To systematically review the reliability and validity of instruments used to screen for major depressive disorder or assess depression symptom severity among persons with HIV in sub-Saharan Africa. DESIGN Systematic review and meta-analysis. METHODS A systematic evidence search protocol was applied to seven bibliographic databases. Studies examining the reliability and/or validity of depression assessment tools were selected for inclusion if they were based on data collected from HIV-positive adults in any African member state of the United Nations. Random-effects meta-analysis was employed to calculate pooled estimates of depression prevalence. In a subgroup of studies of criterion-related validity, the bivariate random-effects model was used to calculate pooled estimates of sensitivity and specificity. RESULTS Of 1,117 records initially identified, I included 13 studies of 5,373 persons with HIV in 7 sub-Saharan African countries. Reported estimates of Cronbach’s alpha ranged from 0.63–0.95, and analyses of internal structure generally confirmed the existence of a depression-like construct accounting for a substantial portion of variance. The pooled prevalence of probable depression was 29.5% (95% CI, 20.5–39.4), while the pooled prevalence of major depressive disorder was 13.9% (95% CI, 9.7–18.6). The Center for Epidemiologic Studies-Depression scale was the most frequently studied instrument, with a pooled sensitivity of 0.82 (95% CI, 0.73–0.87) for detecting major depressive disorder. CONCLUSIONS Depression screening instruments yielded relatively high false positive rates. Overall, few studies described the reliability and/or validity of depression instruments in sub-Saharan Africa. PMID:24853307
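The link between sensitivity, specificity, prevalence, and false positives in the abstract above can be made concrete with a small Python sketch of the positive predictive value. The sensitivity (0.82) and prevalence (13.9%) are taken from the abstract; the specificity of 0.80 is a purely hypothetical value chosen for illustration, because the abstract does not report a pooled specificity.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive screen is a true case (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Sensitivity and prevalence from the abstract; specificity is hypothetical.
ppv = positive_predictive_value(sensitivity=0.82, specificity=0.80, prevalence=0.139)
print(round(ppv, 2))
```

At a prevalence well below 50%, even a reasonably specific instrument produces many false positives relative to true positives, which is one way to read the conclusion about false positive rates.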
Pedersen, Scott J; Kitic, Cecilia M; Bird, Marie-Louise; Mainsbridge, Casey P; Cooley, P Dean
2016-08-19
With the advent of workplace health and wellbeing programs designed to address prolonged occupational sitting, tools to measure behaviour change within this environment should derive from empirical evidence. In this study we measured aspects of validity and reliability for the Occupational Sitting and Physical Activity Questionnaire that asks employees to recount the percentage of work time they spend in the seated, standing, and walking postures during a typical workday. Three separate cohort samples (N = 236) were drawn from a population of government desk-based employees across several departmental agencies. These volunteers were part of a larger state-wide intervention study. Workplace sitting and physical activity behaviour was measured both subjectively against the International Physical Activity Questionnaire, and objectively against ActivPal accelerometers before the intervention began. Criterion validity and concurrent validity for each of the three posture categories were assessed using Spearman's rank correlation coefficients, and a bias comparison with 95 % limits of agreement. Test-retest reliability of the survey was reported with intraclass correlation coefficients. Criterion validity for this survey was strong for sitting and standing estimates, but weak for walking. Participants significantly overestimated the amount of walking they did at work. Concurrent validity was moderate for sitting and standing, but low for walking. Test-retest reliability of this survey proved to be questionable for our sample. Based on our findings we must caution occupational health and safety professionals about the use of employee self-report data to estimate workplace physical activity. While the survey produced accurate measurements for time spent sitting at work it was more difficult for employees to estimate their workplace physical activity.
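The bias comparison with 95% limits of agreement mentioned in the abstract above is usually computed as the mean difference between two methods plus or minus 1.96 standard deviations of the differences. The sketch below assumes that convention; the function name and the sitting-time percentages are hypothetical, not the study's data.

```python
from statistics import mean, stdev


def limits_of_agreement(self_report, device):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of (self_report - device); the limits are
    bias -/+ 1.96 sample standard deviations of the differences.
    """
    diffs = [s - d for s, d in zip(self_report, device)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical percentage of work time spent sitting: questionnaire vs. accelerometer.
questionnaire = [80, 70, 75, 60, 85]
accelerometer = [78, 74, 70, 65, 80]
print(limits_of_agreement(questionnaire, accelerometer))
```

A positive bias would indicate that self-report exceeds the device measure on average, which is the pattern the abstract reports for walking.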
ERIC Educational Resources Information Center
Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U.
2015-01-01
This article uses definitions provided by Cronbach in his seminal paper for coefficient α to show the concepts of reliability, dimensionality, and internal consistency are distinct but interrelated. The article begins with a critique of the definition of reliability and then explores mathematical properties of Cronbach's α. Internal consistency…
Reliability Problems of the Datum: Solutions for Questionnaire Responses.
ERIC Educational Resources Information Center
Bastick, Tony
Questionnaires often ask for estimates, and these estimates are given with different reliabilities. It is difficult to know the different reliabilities of single estimates and to take these into account in subsequent analyses. This paper contains a practical example to show that not taking the reliability of different responses into account can…
Weighing in on international growth standards: testing the case in Australian preschool children.
Pattinson, C L; Staton, S L; Smith, S S; Trost, S G; Sawyer, E F; Thorpe, K J
2017-10-01
Overweight and obesity in preschool-aged children are major health concerns. Accurate and reliable estimates of prevalence are necessary to direct public health and clinical interventions. There are currently three international growth standards used to determine prevalence of overweight and obesity, each using different methodologies: Center for Disease Control (CDC), World Health Organization (WHO) and International Obesity Task Force (IOTF). Adoption and use of each method were examined through a systematic review of Australian population studies (2006-2017). For this period, systematically identified population studies (N = 20) reported prevalence of overweight and obesity ranging between 15 and 38% with most (n = 16) applying the IOTF standards. To demonstrate the differences in prevalence estimates yielded by the IOTF in comparison to the WHO and CDC standards, methods were applied to a sample of N = 1,926 Australian children, aged 3-5 years. As expected, the three standards yielded significantly different estimates when applied to this single population. Prevalence of overweight/obesity was WHO - 9.3%, IOTF - 21.7% and CDC - 33.1%. Judicious selection of growth standards, taking account of their underpinning methodologies and provisions of access to study data sets to allow prevalence comparisons, is recommended. © 2017 World Obesity Federation.
González-Ortiz, Ailema Janeth; Arce-Santander, Celene Viridiana; Vega-Vega, Olynka; Correa-Rotter, Ricardo; Espinosa-Cuevas, María de Los Angeles
2014-10-04
The protein-energy wasting syndrome (PEW) is a condition of malnutrition, inflammation, anorexia and wasting of body reserves resulting from inflammatory and non-inflammatory conditions in patients with chronic kidney disease (CKD). One way of assessing PEW, extensively described in the literature, is using the Malnutrition Inflammation Score (MIS). To assess the reliability and consistency of MIS for diagnosis of PEW in Mexican adults with CKD on hemodialysis (HD). Study of diagnostic tests. A sample of 45 adults with CKD on HD were analyzed during the period June-July 2014. The instrument was applied on 2 occasions; the test-retest reliability was calculated using the Intraclass Correlation Coefficient (ICC); the internal consistency of the questionnaire was analyzed using Cronbach's α coefficient. A weighted Kappa test was used to estimate the validity of the instrument; the result was subsequently compared with the Bilbrey nutritional index (BNI). The reliability of the questionnaires, evaluated in the patient sample, was ICC = 0.829. The agreement between MIS observations was considered adequate, k = 0.585 (p < 0.001); when comparing it with BNI, a value of k = 0.114 was obtained (p < 0.001). In order to estimate the tendency, a correlation test was performed. The r² correlation coefficient was 0.488 (P < 0.001). MIS has adequate reliability and validity for diagnosing PEW in the population with chronic kidney disease on HD. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
Computer-Aided Reliability Estimation
NASA Technical Reports Server (NTRS)
Bavuso, S. J.; Stiffler, J. J.; Bryant, L. A.; Petersen, P. L.
1986-01-01
CARE III (Computer-Aided Reliability Estimation, Third Generation) helps estimate reliability of complex, redundant, fault-tolerant systems. Program specifically designed for evaluation of fault-tolerant avionics systems. However, CARE III general enough for use in evaluation of other systems as well.
Amin, N A; Quek, K F; Oxley, J A; Noah, R M; Nordin, R
2015-10-01
The Job Content Questionnaire (M-JCQ) is an established self-reported instrument used across the world to measure work dimensions based on Karasek's demand-control-support model. To evaluate the psychometric properties of the Malay version of M-JCQ among nurses in Malaysia. This cross-sectional study was carried out on nurses working in 4 public hospitals in the Klang Valley area, Malaysia. M-JCQ was used to assess the perceived psychosocial stressors and physical demands of nurses at their workplaces. Construct validity of the questionnaire was examined using exploratory factor analysis (EFA). Cronbach's α values were used to estimate the reliability (internal consistency) of the M-JCQ. EFA showed that 34 selected items loaded on 4 factors. Except for psychological job demand (Cronbach's α 0.51), the α values for the remaining 3 subscales (job control, social support, and physical demand) were greater than 0.70, indicating acceptable internal consistency. However, one item was excluded due to poor item-total correlation (r<0.3). The final M-JCQ consisted of 33 items. The M-JCQ is a reliable and valid instrument to measure psychosocial and physical stressors in the workplace of public hospital nurses in Malaysia.
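Several abstracts in this listing report Cronbach's α as their internal consistency estimate. A minimal sketch of the standard formula, α = k/(k − 1) · (1 − Σ item variances / variance of total scores), is given below. It assumes complete data with every respondent answering all k items; the function name and the response matrix in the usage note are hypothetical and are not data from any study listed here.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes no missing responses and at least two items and two respondents.
    """
    k = len(scores[0])
    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*scores))
    sum_item_vars = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_item_vars / total_var)
```

With the hypothetical responses `[[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1]]` the function returns roughly 0.93. A subscale such as the psychological job demand scale above (α = 0.51) would correspond to items whose variances sum to a much larger share of the total-score variance.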
Clayson, Peter E; Miller, Gregory A
2017-01-01
Generalizability theory (G theory) provides a flexible, multifaceted approach to estimating score reliability. G theory's approach to estimating score reliability has important advantages over classical test theory that are relevant for research using event-related brain potentials (ERPs). For example, G theory does not require parallel forms (i.e., equal means, variances, and covariances), can handle unbalanced designs, and provides a single reliability estimate for designs with multiple sources of error. This monograph provides a detailed description of the conceptual framework of G theory using examples relevant to ERP researchers, presents the algorithms needed to estimate ERP score reliability, and provides a detailed walkthrough of newly-developed software, the ERP Reliability Analysis (ERA) Toolbox, that calculates score reliability using G theory. The ERA Toolbox is open-source, Matlab software that uses G theory to estimate the contribution of the number of trials retained for averaging, group, and/or event types on ERP score reliability. The toolbox facilitates the rigorous evaluation of psychometric properties of ERP scores recommended elsewhere in this special issue. Copyright © 2016 Elsevier B.V. All rights reserved.
QUIROZ, Viviana; REINERO, Daniela; HERNÁNDEZ, Patricia; CONTRERAS, Johanna; VERNAL, Rolando; CARVAJAL, Paola
2017-01-01
Abstract The major infectious diseases in Chile encompass the periodontal diseases, with a combined prevalence that rises up to 90% of the population. Thus, the population-based surveillance of periodontal diseases plays a central role for assessing their prevalence and for planning, implementing, and evaluating preventive and control programs. Self-report questionnaires have been proposed for the surveillance of periodontal diseases in adult populations world-wide. Objective This study aimed to develop and assess the content validity and reliability of a cognitively adapted self-report questionnaire designed for surveillance of gingivitis in adolescents. Material and Methods Ten predetermined self-report questions evaluating early signs and symptoms of gingivitis were preliminarily assessed by a panel of clinical experts. Eight questions were selected and cognitively tested in 20 adolescents aged 12 to 18 years from Santiago de Chile. The questionnaire was then administered to 178 Chilean adolescents. Internal consistency was measured using the Cronbach’s alpha and temporal stability was calculated using the Kappa-index. Results A reliable final self-report questionnaire consisting of 5 questions was obtained, with a total Cronbach’s alpha of 0.73 and a Kappa-index ranging from 0.41 to 0.77 between the different questions. Conclusions The proposed questionnaire is reliable, with an acceptable internal consistency and a temporal stability from moderate to substantial, and it is promising for estimating the prevalence of gingivitis in adolescents. PMID:28877279
The intelligibility in Context Scale: validity and reliability of a subjective rating measure.
McLeod, Sharynne; Harrison, Linda J; McCormack, Jane
2012-04-01
To describe a new measure of functional intelligibility, the Intelligibility in Context Scale (ICS), and evaluate its validity, reliability, and sensitivity using 3 clinical measures of severity of speech sound disorder: (a) percentage of phonemes correct (PPC), (b) percentage of consonants correct (PCC), and (c) percentage of vowels correct (PVC). Speech skills of 120 preschool children (109 with parent-/teacher-identified concern about how they talked and made speech sounds and 11 with no identified concern) were assessed with the Diagnostic Evaluation of Articulation and Phonology (Dodd, Hua, Crosbie, Holm, & Ozanne, 2002). Parents completed the 7-item ICS, which rates the degree to which children's speech is understood by different communication partners (parents, immediate family, extended family, friends, acquaintances, teachers, and strangers) on a 5-point scale. Parents' ratings showed that most children were always (5) or usually (4) understood by parents, immediate family, and teachers, but only sometimes (3) by strangers. Factor analysis confirmed the internal consistency of the ICS items; therefore, ratings were averaged to form an overall intelligibility score. The ICS had high internal reliability (α = .93), sensitivity, and construct validity. Criterion validity was established through significant correlations between the ICS and PPC (r = .54), PCC (r = .54), and PVC (r = .36). The ICS is a promising new measure of functional intelligibility. These data provide initial support for the ICS as an easily administered, valid, and reliable estimate of preschool children's intelligibility when speaking with people of varying levels of familiarity and authority.
A Hybrid Neural Network-Genetic Algorithm Technique for Aircraft Engine Performance Diagnostics
NASA Technical Reports Server (NTRS)
Kobayashi, Takahisa; Simon, Donald L.
2001-01-01
In this paper, a model-based diagnostic method, which utilizes Neural Networks and Genetic Algorithms, is investigated. Neural networks are applied to estimate the engine internal health, and Genetic Algorithms are applied for sensor bias detection and estimation. This hybrid approach takes advantage of the nonlinear estimation capability provided by neural networks while improving the robustness to measurement uncertainty through the application of Genetic Algorithms. The hybrid diagnostic technique also has the ability to rank multiple potential solutions for a given set of anomalous sensor measurements in order to reduce false alarms and missed detections. The performance of the hybrid diagnostic technique is evaluated through some case studies derived from a turbofan engine simulation. The results show this approach is promising for reliable diagnostics of aircraft engines.
Quality metric for spherical panoramic video
NASA Astrophysics Data System (ADS)
Zakharchenko, Vladyslav; Choi, Kwang Pyo; Park, Jeong Hoon
2016-09-01
Virtual reality (VR)/augmented reality (AR) applications allow users to view artificial content of a surrounding space, simulating a presence effect with the help of special applications or devices. Synthetic content production is a well-known process from the computer graphics domain, and the pipeline is already fixed in the industry. However, emerging multimedia formats for immersive entertainment applications such as free-viewpoint television (FTV) or spherical panoramic video require different approaches in content management and quality assessment. The international standardization of FTV has been promoted by MPEG. This paper is dedicated to a discussion of the immersive media distribution format and the quality estimation process. Accuracy and reliability of the proposed objective quality estimation method were verified with spherical panoramic images, demonstrating good correlation with subjective quality estimation by a group of experts.
A Note on Structural Equation Modeling Estimates of Reliability
ERIC Educational Resources Information Center
Yang, Yanyun; Green, Samuel B.
2010-01-01
Reliability can be estimated using structural equation modeling (SEM). Two potential problems with this approach are that estimates may be unstable with small sample sizes and biased with misspecified models. A Monte Carlo study was conducted to investigate the quality of SEM estimates of reliability by themselves and relative to coefficient…
Estimating Measures of Pass-Fail Reliability from Parallel Half-Tests.
ERIC Educational Resources Information Center
Woodruff, David J.; Sawyer, Richard L.
Two methods for estimating measures of pass-fail reliability are derived, by which both theta and kappa may be estimated from a single test administration. The methods require only a single test administration and are computationally simple. Both are based on the Spearman-Brown formula for estimating stepped-up reliability. The non-distributional…
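The Spearman-Brown formula that this abstract builds on can be written in a few lines. The sketch below shows only the classical step-up formula, r_full = n·r / (1 + (n − 1)·r), not the authors' estimators of theta and kappa, which the truncated abstract does not specify. The function name is hypothetical.

```python
def spearman_brown(r_part, n=2):
    """Step up a part-test reliability r_part to a test n times as long.

    With n=2 this converts the correlation between two parallel
    half-tests into an estimate of full-test reliability.
    """
    return n * r_part / (1 + (n - 1) * r_part)
```

For instance, a correlation of 0.60 between parallel half-tests steps up to a full-test reliability of 0.75.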
Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients
ERIC Educational Resources Information Center
Andersson, Björn; Xin, Tao
2018-01-01
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
Reliability Correction for Functional Connectivity: Theory and Implementation
Mueller, Sophia; Wang, Danhong; Fox, Michael D.; Pan, Ruiqi; Lu, Jie; Li, Kuncheng; Sun, Wei; Buckner, Randy L.; Liu, Hesheng
2016-01-01
Network properties can be estimated using functional connectivity MRI (fcMRI). However, regional variation of the fMRI signal causes systematic biases in network estimates including correlation attenuation in regions of low measurement reliability. Here we computed the spatial distribution of fcMRI reliability using longitudinal fcMRI datasets and demonstrated how pre-estimated reliability maps can correct for correlation attenuation. As a test case of reliability-based attenuation correction we estimated properties of the default network, where reliability was significantly lower than average in the medial temporal lobe and higher in the posterior medial cortex, heterogeneity that impacts estimation of the network. Accounting for this bias using attenuation correction revealed that the medial temporal lobe’s contribution to the default network is typically underestimated. To render this approach useful to a greater number of datasets, we demonstrate that test-retest reliability maps derived from repeated runs within a single scanning session can be used as a surrogate for multi-session reliability mapping. Using data segments with different scan lengths between 1 and 30 min, we found that test-retest reliability of connectivity estimates increases with scan length while the spatial distribution of reliability is relatively stable even at short scan lengths. Finally, analyses of tertiary data revealed that reliability distribution is influenced by age, neuropsychiatric status and scanner type, suggesting that reliability correction may be especially important when studying between-group differences. Collectively, these results illustrate that reliability-based attenuation correction is an easily implemented strategy that mitigates certain features of fMRI signal nonuniformity. PMID:26493163
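The idea of reliability-based attenuation correction can be illustrated with the classical correction for attenuation, in which an observed correlation is divided by the square root of the product of the two measures' reliabilities. This is a sketch of that textbook formula under the assumption that both reliabilities are known and positive; it is not necessarily the exact procedure implemented in the paper, and the function name and numbers are hypothetical.

```python
from math import sqrt

def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation.

    Returns the estimated correlation between true scores, given the
    observed correlation and the reliability of each measure.
    """
    if reliability_x <= 0 or reliability_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_observed / sqrt(reliability_x * reliability_y)
```

For example, an observed correlation of 0.30 between two regions with reliabilities of 0.50 and 0.72 corresponds to a corrected correlation of about 0.50, which shows how low reliability in one region can lead to its contribution being underestimated.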
García Bengoechea, Enrique; Sabiston, Catherine M; Wilson, Philip M
2017-01-01
The aim of this study was to provide initial evidence of validity and reliability of scores derived from the Activity Context in Youth Sport Questionnaire (ACYSQ), an instrument designed to offer a comprehensive assessment of the activities adolescents take part in during sport practices. Two studies were designed for the purposes of item development and selection, and to provide evidence of structural and criterion validity of ACYSQ scores, respectively (N = 334; M age = 14.93, SD = 1.76 years). Confirmatory factor analysis (CFA) supported the adequacy of a 20-item ACYSQ measurement model, which was invariant across gender, and comprised the following dimensions: (1) stimulation; (2) usefulness-value; (3) authenticity; (4) repetition-boredom; and (5) ineffectiveness. Internal consistency reliability estimates and composite reliability estimates for ACYSQ subscale scores ranged from 0.72 to 0.91. In regression analyses, stimulation predicted enjoyment and perceived competence, ineffectiveness was significantly associated with perceived competence and authenticity emerged as a predictor of commitment in sport. These findings indicate that the ACYSQ displays adequate psychometric properties and the use of the instrument may be useful for studying selected activity-based features of the practice environment and their motivational consequences in youth sport.
Carvalho, Hudson W de; Andreoli, Sérgio B; Lara, Diogo R; Patrick, Christopher J; Quintana, Maria Inês; Bressan, Rodrigo A; Melo, Marcelo F de; Mari, Jair de J; Jorge, Miguel R
2013-01-01
Positive and negative affect are the two psychobiological-dispositional dimensions reflecting proneness to positive and negative activation that influence the extent to which individuals experience life events as joyful or as distressful. The Positive and Negative Affect Schedule (PANAS) is a structured questionnaire that provides independent indexes of positive and negative affect. This study aimed to validate a Brazilian interview-version of the PANAS by means of factor and internal consistency analysis. A representative community sample of 3,728 individuals residing in the cities of São Paulo and Rio de Janeiro, Brazil, voluntarily completed the PANAS. Exploratory structural equation model analysis was based on maximum likelihood estimation and reliability was calculated via Cronbach's alpha coefficient. Our results provide support for the hypothesis that the PANAS reliably measures two distinct dimensions of positive and negative affect. The structure and reliability of the Brazilian version of the PANAS are consistent with those of its original version. Taken together, these results attest the validity of the Brazilian adaptation of the instrument.
Coughlan, Diarmuid; Yeh, Susan T; O'Neill, Ciaran; Frick, Kevin D
2014-01-01
To inform policymakers of the importance of evaluating various methods for estimating the direct medical expenditures for a low-incidence condition, head and neck cancer (HNC). Four methods of estimation have been identified: 1) summing all health care expenditures, 2) estimating disease-specific expenditures consistent with an attribution approach, 3) estimating disease-specific expenditures by matching, and 4) estimating disease-specific expenditures by using a regression-based approach. A literature review of studies (2005-2012) that used the Medical Expenditure Panel Survey (MEPS) was undertaken to establish the most popular expenditure estimation methods. These methods were then applied to a sample of 120 respondents with HNC, derived from pooled data (2003-2008). The literature review shows that varying expenditure estimation methods have been used with MEPS but no study compared and contrasted all four methods. Our estimates are reflective of the national treated prevalence of HNC. The upper-bound estimate of annual direct medical expenditures of adult respondents with HNC between 2003 and 2008 was $3.18 billion (in 2008 dollars). Comparable estimates arising from methods focusing on disease-specific and incremental expenditures were all lower in magnitude. Attribution yielded annual expenditures of $1.41 billion, matching method of $1.56 billion, and regression method of $1.09 billion. This research demonstrates that variation exists across and within expenditure estimation methods applied to MEPS data. Despite concerns regarding aspects of reliability and consistency, reporting a combination of the four methods offers a degree of transparency and validity to estimating the likely range of annual direct medical expenditures of a condition. © 2013 International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Published by International Society for Pharmacoeconomics and Outcomes Research (ISPOR) All rights reserved.
A particle swarm model for estimating reliability and scheduling system maintenance
NASA Astrophysics Data System (ADS)
Puzis, Rami; Shirtz, Dov; Elovici, Yuval
2016-05-01
Modifying data and information system components may introduce new errors and deteriorate the reliability of the system. Reliability can be efficiently regained with reliability centred maintenance, which requires reliability estimation for maintenance scheduling. A variant of the particle swarm model is used to estimate reliability of systems implemented according to the model view controller paradigm. Simulations based on data collected from an online system of a large financial institute are used to compare three component-level maintenance policies. Results show that appropriately scheduled component-level maintenance greatly reduces the cost of upholding an acceptable level of reliability by reducing the need for system-wide maintenance.
2013-01-01
Background If you want to know which of two or more healthcare interventions is most effective, the randomised controlled trial is the design of choice. Randomisation, however, does not itself promote the applicability of the results to situations other than the one in which the trial was done. A tool published in 2009, PRECIS (PRagmatic Explanatory Continuum Indicator Summaries) aimed to help trialists design trials that produced results matched to the aim of the trial, be that supporting clinical decision-making, or increasing knowledge of how an intervention works. Though generally positive, groups evaluating the tool have also found weaknesses, mainly that its inter-rater reliability is not clear, that it needs a scoring system and that some new domains might be needed. The aim of the study is to: Produce an improved and validated version of the PRECIS tool. Use this tool to compare the internal validity of, and effect estimates from, a set of explanatory and pragmatic trials matched by intervention. Methods The study has four phases. Phase 1 involves brainstorming and a two-round Delphi survey of authors who cited PRECIS. In Phase 2, the Delphi results will then be discussed and alternative versions of PRECIS-2 developed and user-tested by experienced trialists. Phase 3 will evaluate the validity and reliability of the most promising PRECIS-2 candidate using a sample of 15 to 20 trials rated by 15 international trialists. We will assess inter-rater reliability, and raters’ subjective global ratings of pragmatism compared to PRECIS-2 to assess convergent and face validity. Phase 4, to determine if pragmatic trials sacrifice internal validity in order to achieve applicability, will compare the internal validity and effect estimates of matched explanatory and pragmatic trials of the same intervention, condition and participants. Effect sizes for the trials will then be compared in a meta-regression. 
The Cochrane Risk of Bias scores will be compared with the PRECIS-2 scores of pragmatism. Discussion We have concrete suggestions for improving PRECIS and a growing list of enthusiastic individuals interested in contributing to this work. By early 2014 we expect to have a validated PRECIS-2. PMID:23782862
ERIC Educational Resources Information Center
Lee, Guemin; Park, In-Yong
2012-01-01
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Statistical tools for transgene copy number estimation based on real-time PCR.
Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal
2007-11-01
As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, the real-time PCR based transgene copy number estimation tends to be ambiguous and subjective stemming from the lack of proper statistical analysis and data quality control to render a reliable estimation of copy number with a prediction value. Despite the recent progresses in statistical analysis of real-time PCR, few publications have integrated these advancements in real-time PCR based transgene copy number determination. Three experimental designs and four data quality control integrated statistical models are presented. For the first method, external calibration curves are established for the transgene based on serially-diluted templates. The Ct number from a control transgenic event and putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two group T-test procedures were combined to model the data from this design. For the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of transgene was compared with that of internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, transgene copy number is compared with reference gene without a standard curve, but rather, is based directly on fluorescence data. Two different multiple regression models were proposed to analyze the data based on two different approaches of amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. 
These statistical methods allow the real-time PCR-based transgene copy number estimation to be more reliable and precise with a proper statistical estimation. Proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied to other real-time PCR-based quantification assays including transfection efficiency analysis and pathogen quantification.
Gallasch, Cristiane Helena; Alexandre, Neusa Maria Costa; Amick, Benjamin
2007-12-01
The study objectives were to translate and adapt the Work Role Functioning Questionnaire (WRFQ) into the Brazilian Portuguese language and evaluate its reliability in patients experiencing musculoskeletal disorders. The cross-cultural adaptation was performed according to the internationally recommended methodology, using the following guidelines: translation, back-translation, revision by a committee, and pretest. At first, the questionnaire was independently translated by two bilingual translators, who had Portuguese as their mother language. Subsequently, two other translators whose mother language was English did the back-translation. A committee composed of five specialists revised and compared the translations obtained, developing the final version for pretest application. The pretest was carried out with 30 patients experiencing musculoskeletal disorders. Psychometric properties were evaluated by administering the questionnaire to 105 subjects with musculoskeletal disorders and receiving physical therapy treatment. The reliability was estimated through stability and homogeneity assessment. The construct validity was tested comparing subjects experiencing musculoskeletal disorders to healthy workers. The results indicated good content validity and internal consistency (Cronbach alpha = 0.95). Cronbach alpha for each scale was >0.85, except for the social demand scale. The Intraclass Correlation Coefficient for the test-retest reliability was satisfactory for mental demands (ICC = 0.68) and excellent for the others (0.82-0.91). In relation to the construct validity, the mean score obtained for each scale was lower for physical, work scheduling, and output demands in the subjects with musculoskeletal disorders. There was a significant difference (p < 0.001) between the groups in comparison to work scheduling, physical, and output demands. 
The data showed that the cross-cultural adaptation process was successful and the adapted instrument demonstrated psychometric properties making it reliable to use in Brazilian culture.
International Space Station End-of-Life Probabilistic Risk Assessment
NASA Technical Reports Server (NTRS)
Duncan, Gary W.
2014-01-01
The International Space Station (ISS) end-of-life (EOL) cycle is currently scheduled for 2020, although there are ongoing efforts to extend the ISS life cycle through 2028. The EOL for the ISS will require deorbiting the ISS. The ISS will be the largest manmade object ever to be deorbited; safely deorbiting the station will therefore be a very complex problem. This process is being planned by NASA and its international partners. Numerous factors will need to be considered to accomplish this, such as target corridors, orbits, altitude, drag, and maneuvering capabilities. The ISS EOL Probabilistic Risk Assessment (PRA) will play a part in this process by estimating the reliability of the hardware supplying the maneuvering capabilities. The PRA will model the probability of failure of the systems supplying and controlling the thrust needed to aid in the deorbit maneuvering.
Okochi, Jiro; Utsunomiya, Sakiko; Takahashi, Tai
2005-01-01
Background The International Classification of Functioning, Disability and Health (ICF) was published by the World Health Organization (WHO) to standardize descriptions of health and disability. Little is known about the reliability and clinical relevance of measurements using the ICF and its qualifiers. This study examines the test-retest reliability of ICF codes, and the rate of immeasurability in long-term care settings of the elderly to evaluate the clinical applicability of the ICF and its qualifiers, and the ICF checklist. Methods Reliability of 85 body function (BF) items and 152 activity and participation (AP) items of the ICF was studied using a test-retest procedure with a sample of 742 elderly persons from 59 institutional and at home care service centers. Test-retest reliability was estimated using the weighted kappa statistic. The clinical relevance of the ICF was estimated by calculating immeasurability rate. The effect of the measurement settings and evaluators' experience was analyzed by stratification of these variables. The properties of each item were evaluated using both the kappa statistic and immeasurability rate to assess the clinical applicability of WHO's ICF checklist in the elderly care setting. Results The median of the weighted kappa statistics of 85 BF and 152 AP items were 0.46 and 0.55 respectively. The reproducibility statistics improved when the measurements were performed by experienced evaluators. Some chapters such as genitourinary and reproductive functions in the BF domain and major life area in the AP domain contained more items with lower test-retest reliability measures and rated as immeasurable than in the other chapters. Some items in the ICF checklist were rated as unreliable and immeasurable. Conclusion The reliability of the ICF codes when measured with the current ICF qualifiers is relatively low. 
The increase in reliability with evaluators' experience suggests that proper education will help raise reliability. The ICF checklist contains some items that are difficult to apply in geriatric care settings. Improvements should be achieved by selecting the most relevant items for each measurement and by developing appropriate qualifiers for each code according to the interests of the users. PMID:16050960
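The weighted kappa statistic used above for test-retest agreement can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the linear disagreement weights, the coding of qualifier levels as integers 0-4, and the ratings themselves are all assumptions made for the example.

```python
def weighted_kappa(first, second, n_categories, power=1):
    """Weighted kappa for two ratings of the same subjects.

    first, second: integer category codes in range(n_categories).
    power=1 gives linear disagreement weights, power=2 quadratic
    (the abstract does not say which weighting was used).
    """
    n = len(first)
    k = n_categories
    # Observed cross-tabulation of first rating (rows) by second rating (columns).
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[a][b] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical test and retest qualifier ratings for eight residents.
test_ratings = [0, 1, 2, 2, 3, 4, 1, 0]
retest_ratings = [0, 1, 2, 3, 3, 4, 2, 0]
print(round(weighted_kappa(test_ratings, retest_ratings, n_categories=5), 2))
```

With only two categories the weights reduce to 0 and 1, so the function returns the ordinary (unweighted) Cohen's kappa.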
ERIC Educational Resources Information Center
Black, Ryan A.; Yang, Yanyun; Beitra, Danette; McCaffrey, Stacey
2015-01-01
Estimation of composite reliability within a hierarchical modeling framework has recently become of particular interest given the growing recognition that the underlying assumptions of coefficient alpha are often untenable. Unfortunately, coefficient alpha remains the prominent estimate of reliability when estimating total scores from a scale with…
A Comparison of Three Multivariate Models for Estimating Test Battery Reliability.
ERIC Educational Resources Information Center
Wood, Terry M.; Safrit, Margaret J.
1987-01-01
A comparison of three multivariate models (canonical reliability model, maximum generalizability model, canonical correlation model) for estimating test battery reliability indicated that the maximum generalizability model showed the least degree of bias, smallest errors in estimation, and the greatest relative efficiency across all experimental…
Gaps in policy-relevant information on burden of disease in children: a systematic review.
Rudan, Igor; Lawn, Joy; Cousens, Simon; Rowe, Alexander K; Boschi-Pinto, Cynthia; Tomasković, Lana; Mendoza, Walter; Lanata, Claudio F; Roca-Feltrer, Arantxa; Carneiro, Ilona; Schellenberg, Joanna A; Polasek, Ozren; Weber, Martin; Bryce, Jennifer; Morris, Saul S; Black, Robert E; Campbell, Harry
Valid information about cause-specific child mortality and morbidity is an essential foundation for national and international health policy. We undertook a systematic review to investigate the geographical dispersion of and time trends in publication for policy-relevant information about children's health and to assess associations between the availability of reliable data and poverty. We identified data available on Jan 1, 2001, and published since 1980, for the major causes of morbidity and mortality in young children. Studies with relevant data were assessed against a set of inclusion criteria to identify those likely to provide unbiased estimates of the burden of childhood disease in the community. Only 308 information units from more than 17,000 papers identified were regarded as possible unbiased sources for estimates of childhood disease burden. The geographical distribution of these information units revealed a pattern of small well-researched populations surrounded by large areas with little available information. No reliable population-based data were identified from many of the world's poorest countries, which account for about a third of all deaths of children worldwide. The number of new studies diminished over the last 10 years investigated. The number of population-based studies yielding estimates of burden of childhood disease from less developed countries was low. The decreasing trend over time suggests reductions in research investment in this sphere. Data are especially sparse from the world's least developed countries with the highest child mortality. Guidelines are needed for the conduct of burden-of-disease studies together with an international research policy that gives increased emphasis to global equity and coverage so that knowledge can be generated from all regions of the world.
Kono, Kenichi; Nishida, Yusuke; Moriyama, Yoshihumi; Taoka, Masahiro; Sato, Takashi
2015-06-01
The assessment of nutritional states using fat free mass (FFM) measured with near-infrared spectroscopy (NIRS) is clinically useful. This measurement should incorporate the patient's post-dialysis weight ("dry weight"), in order to exclude the effects of any change in water mass. We therefore used NIRS to investigate the regression, independent variables, and absolute reliability of FFM in dry weight. The study included 47 outpatients from the hemodialysis unit. Body weight was measured before dialysis, and FFM was measured using NIRS before and after dialysis treatment. Multiple regression analysis was used to estimate the FFM in dry weight as the dependent variable. The measured FFM before dialysis treatment (Mw-FFM), and the difference between measured and dry weight (Mw-Dw) were independent variables. We performed Bland-Altman analysis to detect errors between the statistically estimated FFM and the measured FFM after dialysis treatment. The multiple regression equation to estimate the FFM in dry weight was: Dw-FFM = 0.038 + (0.984 × Mw-FFM) + (-0.571 × [Mw-Dw]) (R² = 0.99). There was no systematic bias between the estimated and the measured values of FFM in dry weight. Using NIRS, FFM in dry weight can be calculated by an equation including FFM in measured weight and the difference between the measured weight and the dry weight. © 2015 The Authors. Therapeutic Apheresis and Dialysis © 2015 International Society for Apheresis.
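The regression equation reported above can be applied directly. The sketch below only restates the published coefficients; the function name, the assumption that all quantities are in kilograms, and the example patient are hypothetical.

```python
def estimate_dry_weight_ffm(mw_ffm, weight_excess):
    """Estimate fat free mass at dry weight (Dw-FFM) from the reported regression.

    mw_ffm: FFM measured by NIRS before dialysis (Mw-FFM).
    weight_excess: measured pre-dialysis weight minus dry weight (Mw-Dw).
    Units are assumed to be kilograms; the abstract does not state them.
    """
    return 0.038 + 0.984 * mw_ffm - 0.571 * weight_excess


# Hypothetical patient: 45.0 kg FFM before dialysis, 2.0 kg above dry weight.
print(round(estimate_dry_weight_ffm(45.0, 2.0), 2))  # 43.18
```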
Validation of the Weight Concerns Scale Applied to Brazilian University Students.
Dias, Juliana Chioda Ribeiro; da Silva, Wanderson Roberto; Maroco, João; Campos, Juliana Alvares Duarte Bonini
2015-06-01
The aim of this study was to evaluate the validity and reliability of the Portuguese version of the Weight Concerns Scale (WCS) when applied to Brazilian university students. The scale was completed by 1084 university students from Brazilian public education institutions. A confirmatory factor analysis was conducted. The stability of the model in independent samples was assessed through multigroup analysis, and the invariance was estimated. Convergent, concurrent, divergent, and criterion validities as well as internal consistency were estimated. Results indicated that the one-factor model presented an adequate fit to the sample and values of convergent validity. The concurrent validity with the Body Shape Questionnaire and divergent validity with the Maslach Burnout Inventory for Students were adequate. Internal consistency was adequate, and the factorial structure was invariant in independent subsamples. The results present a simple and short instrument capable of precisely and accurately assessing concerns with weight among Brazilian university students. Copyright © 2015 Elsevier Ltd. All rights reserved.
Trani, Jean-François; Babulal, Ganesh Muneshwar; Bakhshi, Parul
2015-01-01
Although 80% of persons with disabilities live in low and middle-income countries, there is still a lack of comprehensive, cross-culturally validated tools to identify persons facing activity limitations and functioning difficulties in these settings. In absence of such a tool, disability estimates vary considerably according to the methodology used, and policies are based on unreliable estimates. The Disability Screening Questionnaire composed of 27 items (DSQ-27) was initially designed by a group of international experts in survey development and disability in Afghanistan for a national survey. Items were selected based on major domains of activity limitations and functioning difficulties linked to an impairment as defined by the International Classification of Functioning, Disability and Health. Face, content and construct validity, as well as sensitivity and specificity were examined. Based on the results obtained, the tool was subsequently refined and expanded to 34 items, tested and validated in Darfur, Sudan. Internal consistency for the total DSQ-34 using a raw and standardized Cronbach's Alpha and within each domain using a standardized Cronbach's Alpha was examined in the Asian context (India and Nepal). Exploratory factor analysis (EFA) using principal axis factoring (PAF) evaluated the lowest number of factors to account for the common variance among the questions in the screen. Test-retest reliability was determined by calculating intraclass correlation (ICC) and inter-rater reliability by calculating the kappa statistic; results were checked using Bland-Altman plots. The DSQ-34 was further tested for standard error of measurement (SEM) and for the minimum detectable change (MDC). Good internal consistency was indicated by Cronbach's Alpha of 0.83/0.82 for India and 0.76/0.78 for Nepal. We confirmed our assumption for EFA using the Kaiser-Meyer-Olkin measure of sampling well above the accepted cutoff of 0.40 for India (0.82) and Nepal (0.82). 
The criteria for Bartlett's test of sphericity were also met for both India (p < .001) and Nepal (p < .001). Estimates of reliability from the two countries reached acceptable levels, with ICCs of 0.75 for India (p<0.001) and 0.77 for Nepal (p<0.001), and good strength of agreement for weighted kappa (0.77 and 0.79, respectively). The SEM/MDC was 0.80/2.22 for India and 0.96/2.66 for Nepal, indicating a small amount of measurement error in the screen. In Nepal and India, the DSQ-34 shows strong psychometric properties that indicate that it effectively discriminates between persons with and without disabilities. This instrument can be used in association with other instruments for the purpose of comparing health outcomes of persons with and without disabilities in LMICs.
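The reported SEM/MDC pairs are consistent with the conventional 95% minimum detectable change, MDC = 1.96 × √2 × SEM. The abstract does not state which formula was used, so the sketch below is an assumption that happens to reproduce the published values; the function names are illustrative.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM from a score standard deviation and a reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)


def minimum_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%)."""
    return z * math.sqrt(2.0) * sem


# SEM values reported in the abstract.
for country, sem in [("India", 0.80), ("Nepal", 0.96)]:
    print(country, round(minimum_detectable_change(sem), 2))
# India 2.22
# Nepal 2.66
```

The factor √2 reflects that a change score is the difference of two measurements, each carrying its own measurement error.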
Hu, B; Lin, L F; Zhuang, M Q; Yuan, Z Y; Li, S Y; Yang, Y J; Lu, M; Yu, S Z; Jin, L; Ye, W M; Wang, X F
2015-09-01
To examine the test-retest reliabilities and relative validities of the Chinese version of short International Physical Activity Questionnaire (IPAQ-S-C), the Global Physical Activity Questionnaire (GPAQ-C), and the Total Energy Expenditure Questionnaire (TEEQ-C) in a population-based prospective study, the Taizhou Longitudinal Study (TZLS). A longitudinal comparative study. A total of 205 participants (male: 38.54%) aged 30-70 years completed the three questionnaires twice (day one and day nine) and a physical activity log (PA-log) over seven consecutive days. The test-retest reliabilities were evaluated using intra-class correlation coefficients (ICCs) and the relative validities were estimated by comparing the data from physical activity questionnaires (PAQs) and PA-log. Good reliabilities were observed between the repeated PAQs. The ICCs ranged from 0.51 to 0.80 for IPAQ-S-C, 0.67 to 0.85 for GPAQ-C, and 0.74 to 0.94 for TEEQ-C, respectively. Energy expenditure of most PA domains estimated by the three PAQs correlated moderately with the results recorded by PA-log except the walking domain of IPAQ-S-C. The partial correlation coefficients between the PAQs and PA-log ranged from 0.44 to 0.58 for IPAQ-S-C, 0.26 to 0.52 for GPAQ-C, and 0.41 to 0.72 for TEEQ-C, respectively. Bland-Altman plots showed acceptable agreement between the three PAQs and PA-log. The three PAQs, especially TEEQ-C, were relatively reliable and valid for assessment of physical activity and could be used in TZLS. Copyright © 2015 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Nair, S. P.; Righetti, R.
2015-05-01
Recent elastography techniques focus on imaging information on properties of materials which can be modeled as viscoelastic or poroelastic. These techniques often require the fitting of temporal strain data, acquired from either a creep or stress-relaxation experiment, to a mathematical model using least square error (LSE) parameter estimation. It is known that the strain versus time relationships for tissues undergoing creep compression are non-linear. In non-linear cases, devising a measure of estimate reliability can be challenging. In this article, we have developed and tested a method to provide a reliability measure for non-linear LSE parameter estimates, which we call Resimulation of Noise (RoN). RoN provides a measure of reliability by estimating the spread of parameter estimates from a single experiment realization. We have tested RoN specifically for the case of axial strain time constant parameter estimation in poroelastic media. Our tests show that the RoN estimated precision has a linear relationship to the actual precision of the LSE estimator. We have also compared results from the RoN derived measure of reliability against a commonly used reliability measure: the correlation coefficient (CorrCoeff). Our results show that CorrCoeff is a poor measure of estimate reliability for non-linear LSE parameter estimation. While the RoN is specifically tested only for axial strain time constant imaging, a general algorithm is provided for use in all LSE parameter estimation.
Development of a Tool to Measure Youths' Food Allergy Management Facilitators and Barriers.
Herbert, Linda Jones; Lin, Adora; Matsui, Elizabeth; Wood, Robert A; Sharma, Hemant
2016-04-01
This study's aims are to identify factors related to allergen avoidance and epinephrine carriage among youth with food allergy, develop a tool to measure food allergy management facilitators and barriers, and investigate its initial reliability and validity. The Food Allergy Management Perceptions Questionnaire (FAMPQ) was developed based on focus groups with 19 adolescents and young adults with food allergy. Additional youth with food allergy (N = 92; ages: 13-21 years) completed food allergy clinical history and management questionnaires and the FAMPQ. Internal reliability estimates for the FAMPQ Facilitators and Barriers subscales were acceptable to good. Youth who were adherent to allergen avoidance and epinephrine carriage had higher Facilitator scores. Poor adherence was more likely among youth with higher Barrier scores. Initial FAMPQ reliability and validity is promising. Additional research is needed to develop FAMPQ clinical guidelines. © The Author 2015. Published by Oxford University Press on behalf of the Society of Pediatric Psychology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Paoli, Carly J.; Hays, Ron D.; Taylor-Stokes, Gavin; Piercy, James; Gitlin, Matthew
2014-01-01
Background and objectives The US Centers for Medicare and Medicaid Services (CMS) End Stage Renal Disease Prospective Payment System and Quality Incentive Program requires that dialysis centers meet predefined criteria for quality of patient care to ensure future funding. The CMS selected the Consumer Assessment of Healthcare Providers and Systems In-Center Hemodialysis (CAHPS-ICH) survey for the assessment of patient experience of care. This analysis evaluated the psychometric properties of the CAHPS-ICH survey in a sample of hemodialysis patients. Design, setting, participants, & measurements Data were drawn from the Adelphi CKD Disease Specific Program (a retrospective, cross-sectional survey of nephrologists and patients). Selected United States–based nephrologists treating patients receiving hemodialysis completed patient record forms and provided information on their dialysis center. Patients (n=404) completed the CAHPS-ICH survey (comprising 58 questions) providing six scores for the assessment of patient experience of care. CAHPS-ICH item-scale convergence, discrimination, and reliability were evaluated for multi-item scales. Floor and ceiling effects were estimated for all six scores. Patient (demographics, dialysis history, vascular access method) and facility characteristics (size, ratio of patients-to-physicians, nurses, and technicians) associated with the CAHPS-ICH scores were also evaluated. Results Item-scale correlations and internal consistency reliability estimates provided support for the nephrologists’ communication (range, 0.16–0.71; α=0.81) and quality of care (range, 0.16–0.76; α=0.90) composites. However, the patient information composite had low internal consistency reliability (α=0.55). 
Provider-to-patient ratios (range, 2.37 for facilities with >36 patients per physician to 2.8 for those with <8 patients per physician) and time spent in the waiting room (3.44 for >15 minutes of waiting time to 3.75 for 5 to <10 minutes) were characteristics most consistently related to patients’ perceptions of dialysis care. Conclusions CAHPS-ICH is a potentially valuable and informative tool for the evaluation of patients’ experiences with dialysis care. Additional studies are needed to estimate clinically meaningful differences between care providers. PMID:24832092
Calculating system reliability with SRFYDO
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morzinski, Jerome; Anderson - Cook, Christine M; Klamann, Richard M
2010-01-01
SRFYDO is a process for estimating reliability of complex systems. Using information from all applicable sources, including full-system (flight) data, component test data, and expert (engineering) judgment, SRFYDO produces reliability estimates and predictions. It is appropriate for series systems with possibly several versions of the system which share some common components. It models reliability as a function of age and up to 2 other lifecycle (usage) covariates. Initial output from its Exploratory Data Analysis mode consists of plots and numerical summaries so that the user can check data entry and model assumptions, and help determine a final form for the system model. The System Reliability mode runs a complete reliability calculation using Bayesian methodology. This mode produces results that estimate reliability at the component, sub-system, and system level. The results include estimates of uncertainty, and can predict reliability at some not-too-distant time in the future. This paper presents an overview of the underlying statistical model for the analysis, discusses model assumptions, and demonstrates usage of SRFYDO.
Forecasting overhaul or replacement intervals based on estimated system failure intensity
NASA Astrophysics Data System (ADS)
Gannon, James M.
1994-12-01
System reliability can be expressed in terms of the pattern of failure events over time. Assuming a nonhomogeneous Poisson process and Weibull intensity function for complex repairable system failures, the degree of system deterioration can be approximated. Maximum likelihood estimators (MLE's) for the system Rate of Occurrence of Failure (ROCOF) function are presented. Evaluating the integral of the ROCOF over annual usage intervals yields the expected number of annual system failures. By associating a cost of failure with the expected number of failures, budget and program policy decisions can be made based on expected future maintenance costs. Monte Carlo simulation is used to estimate the range and the distribution of the net present value and internal rate of return of alternative cash flows based on the distributions of the cost inputs and confidence intervals of the MLE's.
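The calculation described above can be sketched for a power-law (Weibull-intensity) nonhomogeneous Poisson process. This is a minimal illustration under stated assumptions: it uses the standard time-truncated maximum likelihood estimators for that model, the abstract does not say whether the data were time- or failure-truncated, and the function names, failure ages, and cost per failure are hypothetical.

```python
import math


def fit_power_law_nhpp(failure_times, observation_end):
    """Time-truncated MLEs for a power-law NHPP.

    Assumed intensity (ROCOF): (beta / theta) * (t / theta) ** (beta - 1).
    failure_times must all be positive and earlier than observation_end.
    """
    n = len(failure_times)
    beta = n / sum(math.log(observation_end / t) for t in failure_times)
    theta = observation_end / n ** (1.0 / beta)
    return beta, theta


def expected_failures(beta, theta, start, end):
    """Integral of the ROCOF over the usage interval [start, end]."""
    return (end / theta) ** beta - (start / theta) ** beta


# Hypothetical cumulative operating hours at each failure, observed to 1000 h.
times = [120.0, 310.0, 480.0, 610.0, 720.0, 800.0, 870.0, 930.0, 980.0]
beta, theta = fit_power_law_nhpp(times, 1000.0)

# Expected failures and cost over a hypothetical next 200 h of usage.
cost_per_failure = 5000.0  # illustrative figure, not from the source
next_interval = expected_failures(beta, theta, 1000.0, 1200.0)
print(f"beta = {beta:.3f}, expected failures = {next_interval:.2f}, "
      f"expected cost = {next_interval * cost_per_failure:.0f}")
```

A fitted beta above 1 indicates an increasing failure intensity, that is, a deteriorating system, which is the case that makes overhaul or replacement forecasting relevant.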
NASA Astrophysics Data System (ADS)
Yu, Z. P.; Yue, Z. F.; Liu, W.
2018-05-01
With the development of artificial intelligence, a growing number of reliability experts have recognized the role of subjective information in the reliability design of complex systems. Therefore, working from a certain amount of experimental data and expert judgments, we divide reliability estimation based on a distribution hypothesis into a cognition process and a reliability calculation. To illustrate this modification, we take information fusion based on intuitional fuzzy belief functions as the diagnosis model of the cognition process, and complete the reliability estimation for the opening function of a cabin door affected by imprecise judgments about the distribution hypothesis.
Montiel-Company, José María; Subirats-Roig, Cristian; Flores-Martí, Pau; Bellot-Arcís, Carlos; Almerich-Silla, José Manuel
2016-11-01
The aim of this study was to examine the validity and reliability of the Maslach Burnout Inventory-Human Services Survey (MBI-HSS) as a tool for assessing the prevalence and level of burnout in dental students in Spanish universities. The survey was adapted from English to Spanish. A sample of 533 dental students from 15 Spanish universities and a control group of 188 medical students self-administered the survey online, using the Google Drive service. The test-retest reliability or reproducibility showed an Intraclass Correlation Coefficient of 0.95. The internal consistency of the survey was 0.922. Testing the construct validity showed two components with an eigenvalue greater than 1.5, which explained 51.2% of the total variance. Factor I (36.6% of the variance) comprised the items that estimated emotional exhaustion and depersonalization. Factor II (14.6% of the variance) contained the items that estimated personal accomplishment. The cut-off point for the existence of burnout achieved a sensitivity of 92.2%, a specificity of 92.1%, and an area under the curve of 0.96. Comparison of the total dental students sample and the control group of medical students showed significantly higher burnout levels for the dental students (50.3% vs. 40.4%). In this study, the MBI-HSS was found to be viable, valid, and reliable for measuring burnout in dental students. Since the study also found that the dental students suffered from high levels of this syndrome, these results suggest the need for preventive burnout control programs.
Are Validity and Reliability "Relevant" in Qualitative Evaluation Research?
ERIC Educational Resources Information Center
Goodwin, Laura D.; Goodwin, William L.
1984-01-01
The views of prominent qualitative methodologists on the appropriateness of validity and reliability estimation for the measurement strategies employed in qualitative evaluations are summarized. A case is made for the relevance of validity and reliability estimation. Definitions of validity and reliability for qualitative measurement are presented…
A General Approach for Estimating Scale Score Reliability for Panel Survey Data
ERIC Educational Resources Information Center
Biemer, Paul P.; Christ, Sharon L.; Wiesen, Christopher A.
2009-01-01
Scale score measures are ubiquitous in the psychological literature and can be used as both dependent and independent variables in data analysis. Poor reliability of scale score measures leads to inflated standard errors and/or biased estimates, particularly in multivariate analysis. Reliability estimation is usually an integral step to assess…
ERIC Educational Resources Information Center
Md Desa, Zairul Nor Deana
2012-01-01
In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, the multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores and to obtain subscore reliability and subscore classification. Both the compensatory and partially compensatory MIRT…
Reliability and precision of pellet-group counts for estimating landscape-level deer density
David S. deCalesta
2013-01-01
This study provides hitherto unavailable methodology for reliably and precisely estimating deer density within forested landscapes, enabling quantitative rather than qualitative deer management. Reliability and precision of the deer pellet-group technique were evaluated in 1 small and 2 large forested landscapes. Density estimates, adjusted to reflect deer harvest and...
Method matters: Understanding diagnostic reliability in DSM-IV and DSM-5.
Chmielewski, Michael; Clark, Lee Anna; Bagby, R Michael; Watson, David
2015-08-01
Diagnostic reliability is essential for the science and practice of psychology, in part because reliability is necessary for validity. Recently, the DSM-5 field trials documented lower diagnostic reliability than past field trials and the general research literature, resulting in substantial criticism of the DSM-5 diagnostic criteria. Rather than indicating specific problems with DSM-5, however, the field trials may have revealed long-standing diagnostic issues that have been hidden due to a reliance on audio/video recordings for estimating reliability. We estimated the reliability of DSM-IV diagnoses using both the standard audio-recording method and the test-retest method used in the DSM-5 field trials, in which different clinicians conduct separate interviews. Psychiatric patients (N = 339) were diagnosed using the SCID-I/P; 218 were diagnosed a second time by an independent interviewer. Diagnostic reliability using the audio-recording method (N = 49) was "good" to "excellent" (M κ = .80) and comparable to the DSM-IV field trials estimates. Reliability using the test-retest method (N = 218) was "poor" to "fair" (M κ = .47) and similar to DSM-5 field-trials' estimates. Despite low test-retest diagnostic reliability, self-reported symptoms were highly stable. Moreover, there was no association between change in self-report and change in diagnostic status. These results demonstrate the influence of method on estimates of diagnostic reliability. (c) 2015 APA, all rights reserved.
Li, Hua; Jiang, Xiaoyu; Xie, Jingping; Gore, John C; Xu, Junzhong
2017-06-01
To investigate the influence of transcytolemmal water exchange on estimates of tissue microstructural parameters derived from diffusion MRI using conventional PGSE and IMPULSED methods. Computer simulations were performed to incorporate a broad range of intracellular water life times τ_in (50-∞ ms), cell diameters d (5-15 μm), and intrinsic diffusion coefficient D_in (0.6-2 μm²/ms) for different values of signal-to-noise ratio (SNR) (10 to 50). For experiments, murine erythroleukemia (MEL) cancer cells were cultured and treated with saponin to selectively change cell membrane permeability. All fitted microstructural parameters from simulations and experiments in vitro were compared with ground-truth values. Simulations showed that, for both PGSE and IMPULSED methods, cell diameter d can be reliably fit with sufficient SNR (≥ 50), whereas intracellular volume fraction f_in is intrinsically underestimated due to transcytolemmal water exchange. D_in can be reliably fit only with sufficient SNR and using the IMPULSED method with short diffusion times. These results were confirmed with those obtained in the cell culture experiments in vitro. For the sequences and models considered in this study, transcytolemmal water exchange has minor effects on the fittings of d and D_in with physiologically relevant membrane permeabilities if the SNR is sufficient (> 50), but f_in is intrinsically underestimated. Magn Reson Med 77:2239-2249, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Hu, Yinhuan; Zhang, Zixia; Xie, Jinzhu; Wang, Guanping
2017-02-01
The objective of this study is to describe the development of the Outpatient Experience Questionnaire (OPEQ) and to assess the validity and reliability of the scale. Literature review, patient interviews, Delphi method and Cross-sectional validation survey. Six comprehensive public hospitals in China. The survey was carried out on a sample of 600 outpatients. Acceptability of the questionnaire was assessed according to the overall response rate, item non-response rate and the average completion time. Correlation coefficients and confirmatory factor analysis were used to test construct validity. Delphi method was used to assess the content validity of the questionnaire. Cronbach's coefficient alpha and split-half reliability coefficient were used to estimate the internal reliability of the questionnaire. The overall response rate was 97.2% and the item non-response rate ranged from 0% to 0.3%. The mean completion time was 6 min. The Spearman correlations of item-total score ranged from 0.466 to 0.765. The results of confirmatory factor analysis showed that all items had factor loadings above 0.40 and the dimension intercorrelation ranged from 0.449 to 0.773, the goodness of fit of the questionnaire was reasonable. The overall authority grade of expert consultation was 0.80 and Kendall's coefficient of concordance W was 0.186. The Cronbach's coefficients alpha of six dimensions ranged from 0.708 to 0.895, the split-half reliability coefficient (Spearman-Brown coefficient) was 0.969. The OPEQ is a promising instrument covering the most important aspects which influence outpatient experiences of comprehensive public hospital in China. It has good evidence for acceptability, validity and reliability. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
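The two internal reliability coefficients named above can be computed as follows. This is a minimal sketch: the odd/even split is an assumption (the abstract does not say how the halves were formed), and the function names and response data are hypothetical.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(scores):
    """Coefficient alpha; scores is a list of respondents, each a list of item scores."""
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def split_half_spearman_brown(scores):
    """Correlate odd- and even-position half scores, then step up to full length."""
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    r = pearson(odd_half, even_half)
    return 2 * r / (1 + r)


# Hypothetical responses: five respondents, four items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 3, 2],
    [5, 5, 4, 4],
    [3, 2, 3, 3],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(responses), 3))
print(round(split_half_spearman_brown(responses), 3))
```

The Spearman-Brown step is needed because the raw correlation between halves describes a test of half the length.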
Construction of a memory battery for computerized administration, using item response theory.
Ferreira, Aristides I; Almeida, Leandro S; Prieto, Gerardo
2012-10-01
In accordance with Item Response Theory, a computer memory battery with six tests was constructed for use in the Portuguese adult population. A factor analysis was conducted to assess the internal structure of the tests (N = 547 undergraduate students). Following the literature, several confirmatory factor models were evaluated. Results showed better fit of a model with two independent latent variables corresponding to verbal and non-verbal factors, reproducing the initial battery organization. Internal consistency reliabilities for the six tests ranged from alpha = .72 to .89. IRT analyses (Rasch and partial credit models) yielded good Infit and Outfit measures and high precision for parameter estimation. The potential utility of these memory tasks for psychological research and practice will be discussed.
Terry, Leann; Kelley, Ken
2012-11-01
Composite measures play an important role in psychology and related disciplines. Composite measures almost always have error. Correspondingly, it is important to understand the reliability of the scores from any particular composite measure. However, the point estimates of the reliability of composite measures are fallible and thus all such point estimates should be accompanied by a confidence interval. When confidence intervals are wide, there is much uncertainty in the population value of the reliability coefficient. Given the importance of reporting confidence intervals for estimates of reliability, coupled with the undesirability of wide confidence intervals, we develop methods that allow researchers to plan sample size in order to obtain narrow confidence intervals for population reliability coefficients. We first discuss composite reliability coefficients and then provide a discussion on confidence interval formation for the corresponding population value. Using the accuracy in parameter estimation approach, we develop two methods to obtain accurate estimates of reliability by planning sample size. The first method provides a way to plan sample size so that the expected confidence interval width for the population reliability coefficient is sufficiently narrow. The second method ensures that the confidence interval width will be sufficiently narrow with some desired degree of assurance (e.g., 99% assurance that the 95% confidence interval for the population reliability coefficient will be less than W units wide). The effectiveness of our methods was verified with Monte Carlo simulation studies. We demonstrate how to easily implement the methods with easy-to-use and freely available software. ©2011 The British Psychological Society.
Hassani, Lale; Dehdari, Tahereh; Hajizadeh, Ebrahim; Shojaeizadeh, Davoud; Abedini, Mehrandokht; Nedjat, Saharnaz
2014-01-01
Given that there are many Iranian women who have never had a Pap smear, this study was designed to develop and validate a measurement tool based on the Protection Motivation Theory to assess factors influencing Iranian women's intention to undergo their first Pap test. In this psychometric research, a panel of experts (n=10) reviewed the scale items to determine the Content Validity Index (CVI) and the Content Validity Ratio (CVR). Reliability was estimated through the Intraclass Correlation Coefficient (n=30) and internal consistency (n=240). Also, factor analysis (exploratory and confirmatory) was performed on data from the sample of women who had never had a Pap smear test (n=240). A 26-item questionnaire was developed. The CVI and CVR scores of the scale were 0.89 and 0.90, respectively. Exploratory factor analysis yielded a 26-item questionnaire with seven factors (perceived vulnerability, perceived severity, fear, response costs, response efficacy, self-efficacy, and protection motivation (or intention)) that jointly accounted for 72.76% of the observed variance. Confirmatory factor analysis indicated a good fit for the data. Internal consistency (range 0.70-0.93) and test-retest reliability (range 0.72-0.96) of the sub-scales were acceptable. This study showed that the designed instrument was a valid and reliable tool for measuring the factors influencing women's intention to undergo their first Pap test.
Appearance motives to tan and not tan: evidence for validity and reliability of a new scale.
Cafri, Guy; Thompson, J Kevin; Roehrig, Megan; Rojas, Ariz; Sperry, Steffanie; Jacobsen, Paul B; Hillhouse, Joel
2008-04-01
Risk for skin cancer is increased by UV exposure and decreased by sun protection. Appearance reasons to tan and not tan have consistently been shown to be related to intentions and behaviors regarding UV exposure and protection. This study was designed to determine the factor structure of appearance motives to tan and not tan, evaluate the extent to which this factor structure is gender invariant, test for mean differences in the identified factors, and evaluate internal consistency, temporal stability, and criterion-related validity. Data from 589 female and 335 male college students were used to test confirmatory factor analysis models within and across gender groups, estimate latent mean differences, and use the correlation coefficient and Cronbach's alpha to further evaluate the reliability and validity of the identified factors. A measurement invariant (i.e., factor-loading invariant) model was identified with three higher-order factors: sociocultural influences to tan (lower order factors: media, friends, family, significant others), appearance reasons to tan (general, acne, body shape), and appearance reasons not to tan (skin aging, immediate skin damage). Females had significantly higher means than males on all higher-order factors. All subscales had evidence of internal consistency, temporal stability, and criterion-related validity. This study offers a framework and measurement instrument that has evidence of validity and reliability for evaluating appearance-based motives to tan and not tan.
Rask, Marie; Oscarsson, Marie; Ludwig, Neil; Swahnberg, Katarina
2017-04-04
Cervical dysplasia is a precancerous condition, which has been shown to create anxiety in women. To be able to investigate these women's health-related quality of life, a disease-specific instrument is required. There does not seem to be a Swedish version of an instrument to screen for this specific disease. Therefore, this study aims to translate and cross-culturally adapt the Functional Assessment of Chronic Illness Therapy - Cervical Dysplasia (FACIT-CD) into a Swedish context and evaluate its linguistic validity and reliability. The Functional Assessment of Chronic Illness Therapy (FACIT) translation methodology was used, which consists of several steps including pilot testing of the FACIT-CD instrument through cognitive debriefing interviews. Ten women diagnosed with cervical dysplasia participated in the cognitive debriefing interviews. The internal consistency reliability of the Swedish FACIT-CD was estimated by Cronbach's alpha coefficient. Homogeneity of the items was evaluated by corrected item-total correlations. The sample consisted of 34 women who were diagnosed with cervical dysplasia. The translation and cross-cultural adaptation went smoothly without any problems for the majority of the items. The cognitive debriefing interviews indicated that the Swedish FACIT-CD consists of relevant items, is easy to understand and complete, and has unambiguous and comprehensive response categories. The translation and cross-cultural adaptation resulted in a Swedish FACIT-CD, which is conceptually and semantically equivalent to the English version and linguistically valid. The total scale of the Swedish FACIT-CD exhibited good internal consistency reliability with a Cronbach's alpha coefficient of 0.84, and all of the subscales exhibited acceptable values between 0.71 and 0.81 except the Relationships subscale, which had a value of 0.67. Finally, all but four items exceeded the acceptable level for the corrected item-total correlations of ≥ 0.20.
The Swedish FACIT-CD is conceptually and semantically equivalent to the English version and linguistically valid; further, it exhibits good internal consistency reliability.
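The two statistics reported in the abstract above, Cronbach's alpha and corrected item-total correlations, can be sketched as follows. This is a minimal illustration and not the authors' analysis: the function names and the demo score matrix are hypothetical, and the sketch assumes complete data with respondents in rows and items in columns.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed score
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)


def corrected_item_total(scores):
    """Correlation of each item with the sum of the remaining items."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    return np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])


if __name__ == "__main__":
    # Hypothetical responses: 5 respondents, 3 items (not data from the study).
    demo = np.array([[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]])
    print("alpha:", cronbach_alpha(demo))
    print("corrected item-total r:", corrected_item_total(demo))
```

An item whose corrected correlation falls below a chosen cutoff (the abstract uses 0.20) would be flagged for review; the cutoff itself is a convention, not something the code determines.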
The Chinese version of the Outcome Expectations for Exercise scale: validation study.
Lee, Ling-Ling; Chiu, Yu-Yun; Ho, Chin-Chih; Wu, Shu-Chen; Watson, Roger
2011-06-01
Estimates of the reliability and validity of the English nine-item Outcome Expectations for Exercise (OEE) scale have been tested and found to be valid for use in various settings, particularly among older people, with good internal consistency and validity. Data on the use of the OEE scale among older Chinese people living in the community and how cultural differences might affect the administration of the OEE scale are limited. To test the validity and reliability of the Chinese version of the Outcome Expectations for Exercise scale among older people. A cross-sectional validation study was designed to test the Chinese version of the OEE scale (OEE-C). Reliability was examined by testing both the internal consistency for the overall scale and the squared multiple correlation coefficient for the single item measure. The validity of the scale was tested on the basis of both a traditional psychometric test and a confirmatory factor analysis using structural equation modelling. The Mokken Scaling Procedure (MSP) was used to investigate if there were any hierarchical, cumulative sets of items in the measure. The OEE-C scale was tested in a group of older people in Taiwan (n=108, mean age=77.1). There was acceptable internal consistency (alpha=.85) and model fit in the scale. Evidence of the validity of the measure was demonstrated by the tests for criterion-related validity and construct validity. There was a statistically significant correlation between exercise outcome expectations and exercise self-efficacy (r=.34, p<.01). An analysis of the Mokken Scaling Procedure found that nine items of the scale were all retained in the analysis and the resulting scale was reliable and statistically significant (p=.0008). The results obtained in the present study provided acceptable levels of reliability and validity evidence for the Chinese Outcome Expectations for Exercise scale when used with older people in Taiwan. 
Future testing of the OEE-C scale needs to be carried out to see whether these results are generalisable to older Chinese people living in urban areas. Copyright © 2010 Elsevier Ltd. All rights reserved.
Gordt, Katharina; Mikolaizak, A Stefanie; Nerz, Corinna; Barz, Carolin; Gerhardy, Thomas; Weber, Michaela; Becker, Clemens; Schwenk, Michael
2018-02-12
Tools to detect subtle balance deficits in high-functioning community-dwelling older adults are lacking. The Community Balance and Mobility Scale (CBM) is a valuable tool to measure balance deficits in this group; however, it is not yet available in the German language. The aim was 1) to translate and cross-culturally adapt the CBM into the German language and 2) to investigate the measurement properties of the German CBM (G-CBM). The original CBM was translated into the German language according to established guidelines. A total of 51 older adults (mean age 69.9 ± 7.1 years) were recruited to measure construct validity by comparing the G‑CBM against standardized balance and/or mobility assessments including the Fullerton Advanced Balance Scale (FAB), Berg Balance Scale (BBS), 3 m Tandem Walk (3MTW), 8 Level Balance Scale (8LBS), 30 s Chair Stand Test (30CST), Timed Up and Go (TUG) test, gait speed, and the Falls Efficacy Scale International (FES-I). Intrarater and interrater reliability and internal consistency reliability were estimated using intraclass correlations (ICC) and Cronbach's alpha, respectively. Ceiling effects were calculated as the percentage of the sample scoring the maximum score. The G‑CBM correlated excellently with FAB and BBS (ρ = 0.78-0.85; P < 0.001), well with 3MTW, TUG, and FES-I (ρ = -0.55 to -0.61; P < 0.001), and moderately with 8LBS, 30CST, and habitual gait speed (ρ = 0.32-0.46; P < 0.001). Intrarater (ICC 3,k = 0.998; P < 0.001) and interrater (ICC 2,k = 0.996; P < 0.001) reliability, and internal consistency reliability (α = 0.998) were also high. The G‑CBM did not show ceiling effects. The G‑CBM is a valid and reliable tool for measuring subtle balance deficits in older high-functioning adults. The absence of ceiling effects supports the use of this scale in this cohort. The G‑CBM can now be utilized in clinical practice.
A Measure for the Reliability of a Rating Scale Based on Longitudinal Clinical Trial Data
ERIC Educational Resources Information Center
Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert
2007-01-01
A new measure for reliability of a rating scale is introduced, based on the classical definition of reliability, as the ratio of the true score variance and the total variance. Clinical trial data can be employed to estimate the reliability of the scale in use, whenever repeated measurements are taken. The reliability is estimated from the…
Kuntzelman, Karl; Jack Rhodes, L; Harrington, Lillian N; Miskovic, Vladimir
2018-06-01
There is a broad family of statistical methods for capturing time series regularity, with increasingly widespread adoption by the neuroscientific community. A common feature of these methods is that they permit investigators to quantify the entropy of brain signals - an index of unpredictability/complexity. Despite the proliferation of algorithms for computing entropy from neural time series data there is scant evidence concerning their relative stability and efficiency. Here we evaluated several different algorithmic implementations (sample, fuzzy, dispersion and permutation) of multiscale entropy in terms of their stability across sessions, internal consistency and computational speed, accuracy and precision using a combination of electroencephalogram (EEG) and synthetic 1/ƒ noise signals. Overall, we report fair to excellent internal consistency and longitudinal stability over a one-week period for the majority of entropy estimates, with several caveats. Computational timing estimates suggest distinct advantages for dispersion and permutation entropy over other entropy estimates. Considered alongside the psychometric evidence, we suggest several ways in which researchers can maximize computational resources (without sacrificing reliability), especially when working with high-density M/EEG data or multivoxel BOLD time series signals. Copyright © 2018 Elsevier Inc. All rights reserved.
Short Personality and Life Event scale for detection of suicide attempters.
Artieda-Urrutia, Paula; Delgado-Gómez, David; Ruiz-Hernández, Diego; García-Vega, Juan Manuel; Berenguer, Nuria; Oquendo, Maria A; Blasco-Fontecilla, Hilario
2015-01-01
To develop a brief and reliable psychometric scale to identify individuals at risk for suicidal behaviour. Case-control study. 182 individuals (61 suicide attempters, 57 psychiatric controls, and 64 psychiatrically healthy controls) aged 18 or older, admitted to the Emergency Department at Puerta de Hierro University Hospital in Madrid, Spain. All participants completed a form including their socio-demographic and clinical characteristics, and the Personality and Life Events scale (27 items). To assess Axis I diagnoses, all psychiatric patients (including suicide attempters) were administered the Mini International Neuropsychiatric Interview. Descriptive statistics were computed for the socio-demographic factors. Additionally, χ² independence tests were applied to evaluate differences between groups in socio-demographic and clinical variables and in the Personality and Life Events scale. A stepwise linear regression with backward variable selection was conducted to build the Short Personality Life Event (S-PLE) scale. To evaluate accuracy, a ROC analysis was conducted. The internal reliability was assessed using Cronbach's α, and the external reliability was evaluated using a test-retest procedure. The S-PLE scale, composed of just 6 items, showed good performance in discriminating between medical controls, psychiatric controls and suicide attempters in an independent sample. For instance, the S-PLE scale discriminated between past suicide and past non-suicide attempters with sensitivity of 80% and specificity of 75%. The area under the ROC curve was 88%. A factor analysis extracted only one factor, revealing a single dimension of the S-PLE scale. Furthermore, the S-PLE scale provides values of internal and external reliability between poor (test-retest: 0.55) and acceptable (Cronbach's α: 0.65) ranges. Administration time is about one minute.
The S-PLE scale is a useful and accurate instrument for estimating the risk of suicidal behaviour in settings where time is scarce. Copyright © 2015 SEP y SEPB. Published by Elsevier España. All rights reserved.
A review of health resource tracking in developing countries.
Powell-Jackson, Timothy; Mills, Anne
2007-11-01
Timely, reliable and complete information on financial resources in the health sector is critical for sound policy making and planning, particularly in developing countries where resources are both scarce and unpredictable. Health resource tracking has a long history and has seen renewed interest more recently as pressure has mounted to improve accountability for the attainment of the health Millennium Development Goals. We review the methods used to track health resources and recent experiences of their application, with a view to identifying the major challenges that must be overcome if data availability and reliability are to improve. At the country level, there have been important advances in the refinement of the National Health Accounts (NHA) methodology, which is now regarded as the international standard. Significant efforts have also been put into the development of methods to track disease-specific expenditures. However, NHA as a framework can do little to address the underlying problem of weak government public expenditure management and information systems that provide much of the raw data. The experience of institutionalizing NHA suggests progress has been uneven and there is a potential for stand-alone disease accounts to make the situation worse by undermining capacity and confusing technicians. Global level tracking of donor assistance to health relies to a large extent on the OECD's Creditor Reporting System. Despite improvements in its coverage and reliability, the demand for estimates of aid to control of specific diseases is resulting in multiple, uncoordinated data requests to donor agencies, placing additional workload on the providers of information. The emergence of budget support aid modalities poses a methodological challenge to health resource tracking, as such support is difficult to attribute to any particular sector or health programme. 
Attention should focus on improving underlying financial and information systems at the country level, which will facilitate more reliable and timely reporting of NHA estimates. Effective implementation of a framework to make donors more accountable to recipient countries and the international community will improve the availability of financial data on their activities.
Bottema-Beutel, Kristen; Lloyd, Blair; Carter, Erik W; Asmus, Jennifer M
2014-11-01
Attaining reliable estimates of observational measures can be challenging in school and classroom settings, as behavior can be influenced by multiple contextual factors. Generalizability (G) studies can enable researchers to estimate the reliability of observational data, and decision (D) studies can inform how many observation sessions are necessary to achieve a criterion level of reliability. We conducted G and D studies using observational data from a randomized control trial focusing on social and academic participation of students with severe disabilities in inclusive secondary classrooms. Results highlight the importance of anchoring observational decisions to reliability estimates from existing or pilot data sets. We outline steps for conducting G and D studies and address options when reliability estimates are lower than desired.
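The decision (D) study logic described above, projecting how many observation sessions are needed to reach a criterion reliability, can be sketched for the simplest case. This is a hedged illustration only: it assumes a single-facet persons-by-sessions design with variance components already estimated in a G study, the function names are hypothetical, and the numbers are not taken from the study, whose design may involve additional facets.

```python
def projected_reliability(var_person, var_residual, n_sessions):
    """Generalizability coefficient for the mean over n_sessions sessions.

    Assumes a one-facet (persons x sessions) design and relative decisions:
    coefficient = var_person / (var_person + var_residual / n_sessions).
    """
    return var_person / (var_person + var_residual / n_sessions)


def sessions_needed(var_person, var_residual, criterion, max_sessions=1000):
    """Smallest number of sessions whose projected reliability meets criterion.

    Returns None if the criterion is not reached within max_sessions.
    """
    for n in range(1, max_sessions + 1):
        if projected_reliability(var_person, var_residual, n) >= criterion:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical variance components (illustrative only).
    var_p, var_res = 1.0, 1.0
    for n in (1, 2, 4, 8):
        print(n, "sessions ->", projected_reliability(var_p, var_res, n))
    print("sessions for 0.80:", sessions_needed(var_p, var_res, 0.80))
```

When the person variance is small relative to the residual variance, the required number of sessions grows quickly, which is one reason the abstract recommends anchoring observation plans to pilot or existing data.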
Developing Reliable Life Support for Mars
NASA Technical Reports Server (NTRS)
Jones, Harry W.
2017-01-01
A human mission to Mars will require highly reliable life support systems. Mars life support systems may recycle water and oxygen using systems similar to those on the International Space Station (ISS). However, achieving sufficient reliability is less difficult for ISS than it will be for Mars. If an ISS system has a serious failure, it is possible to provide spare parts, or directly supply water or oxygen, or if necessary bring the crew back to Earth. Life support for Mars must be designed, tested, and improved as needed to achieve high demonstrated reliability. A quantitative reliability goal should be established and used to guide development. The designers should select reliable components and minimize interface and integration problems. In theory a system can achieve the component-limited reliability, but testing often reveals unexpected failures due to design mistakes or flawed components. Testing should extend long enough to detect any unexpected failure modes and to verify the expected reliability. Iterated redesign and retest may be required to achieve the reliability goal. If the reliability is less than required, it may be improved by providing spare components or redundant systems. The number of spares required to achieve a given reliability goal depends on the component failure rate. If the failure rate is underestimated, the number of spares will be insufficient and the system may fail. If the design is likely to have undiscovered design or component problems, it is advisable to use dissimilar redundancy, even though this multiplies the design and development cost. In the ideal case, a human-tended closed-system operational test should be conducted to gain confidence in operations, maintenance, and repair. The difficulty in achieving high reliability in unproven complex systems may require the use of simpler, more mature, intrinsically higher reliability systems.
The limitations of budget, schedule, and technology may suggest accepting lower and less certain expected reliability. A plan to develop reliable life support is needed to achieve the best possible reliability.
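The dependence of the required number of spares on the component failure rate can be illustrated with a simple Poisson model. This is a sketch under stated assumptions, not the method of the report: it assumes a constant failure rate, immediate replacement of a failed unit, and spares that do not fail in storage; the function names and example rates are hypothetical.

```python
import math


def prob_enough_spares(failure_rate, mission_time, n_spares):
    """Probability that at most n_spares failures occur during the mission.

    Failures are modeled as a Poisson process with mean
    failure_rate * mission_time (constant failure rate assumed).
    """
    mean = failure_rate * mission_time
    return sum(
        math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(n_spares + 1)
    )


def spares_required(failure_rate, mission_time, goal, max_spares=200):
    """Smallest spare count whose success probability meets the goal."""
    for n in range(max_spares + 1):
        if prob_enough_spares(failure_rate, mission_time, n) >= goal:
            return n
    raise ValueError("goal not reachable within max_spares")


if __name__ == "__main__":
    # Hypothetical rates: the planner assumes 1 expected failure per mission,
    # but the true rate is twice as high.
    print("spares planned:", spares_required(1.0, 1.0, 0.90))
    print("spares actually needed:", spares_required(2.0, 1.0, 0.90))
```

Comparing the two calls shows the effect the abstract warns about: a spare count sized for an underestimated failure rate falls short of the reliability goal under the true rate.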
International Space Station End-of-Life Probabilistic Risk Assessment
NASA Technical Reports Server (NTRS)
Duncan, Gary
2014-01-01
Although there are ongoing efforts to extend the ISS life cycle through 2028, the International Space Station (ISS) end-of-life (EOL) cycle is currently scheduled for 2020. The EOL for the ISS will require de-orbiting the ISS. This will be the largest manmade object ever to be de-orbited, therefore safely de-orbiting the station will be a very complex problem. This process is being planned by NASA and its international partners. Numerous factors will need to be considered to accomplish this such as target corridors, orbits, altitude, drag, maneuvering capabilities, debris mapping etc. The ISS EOL Probabilistic Risk Assessment (PRA) will play a part in this process by estimating the reliability of the hardware supplying the maneuvering capabilities. The PRA will model the probability of failure of the systems supplying and controlling the thrust needed to aid in the de-orbit maneuvering.
ERIC Educational Resources Information Center
Green, Samuel B.; Yang, Yanyun
2009-01-01
A method is presented for estimating reliability using structural equation modeling (SEM) that allows for nonlinearity between factors and item scores. Assuming the focus is on consistency of summed item scores, this method for estimating reliability is preferred to those based on linear SEM models and to the most commonly reported estimate of…
A study of fault prediction and reliability assessment in the SEL environment
NASA Technical Reports Server (NTRS)
Basili, Victor R.; Patnaik, Debabrata
1986-01-01
An empirical study on estimation and prediction of faults, prediction of fault detection and correction effort, and reliability assessment in the Software Engineering Laboratory environment (SEL) is presented. Fault estimation using empirical relationships and fault prediction using curve fitting method are investigated. Relationships between debugging efforts (fault detection and correction effort) in different test phases are provided, in order to make an early estimate of future debugging effort. This study concludes with the fault analysis, application of a reliability model, and analysis of a normalized metric for reliability assessment and reliability monitoring during development of software.
Developing a fatigue questionnaire for Chinese civil aviation pilots.
Dai, Jing; Luo, Min; Hu, Wendong; Ma, Jin; Wen, Zhihong
2018-03-23
Assessing fatigue risk is an important challenge in improving flight safety in the aviation industry. The aim of this study was to develop a comprehensive fatigue risk management indicator system and a fatigue questionnaire for Chinese civil aviation pilots. Participants were 74 civil aviation pilots (all male). They completed the questionnaire within 20 minutes before a flight mission. The main statistical methods were estimation of internal consistency with Cronbach's α, Student's t test, and Pearson's correlation analysis. The results revealed that the fatigue questionnaire had acceptable internal consistency reliability and construct validity; there were significant differences in fatigue scores between international and domestic flight pilots. Some international flight pilots who had taken medications as a sleep aid had worse sleep quality than those who had not. Long-endurance flight across time zones caused significant differences in circadian rhythm. The fatigue questionnaire can be used to measure Chinese civil aviation pilots' fatigue and provides a reference for fatigue risk management systems for civil aviation pilots.
Tracking reliability for space cabin-borne equipment in development by Crow model.
Chen, J D; Jiao, S J; Sun, H L
2001-12-01
Objective. To study and track the reliability growth of manned spaceflight cabin-borne equipment in the course of its development. Method. A new technique of reliability growth estimation and prediction, which is composed of the Crow model and test data conversion (TDC) method was used. Result. The estimation and prediction value of the reliability growth conformed to its expectations. Conclusion. The method could dynamically estimate and predict the reliability of the equipment by making full use of various test information in the course of its development. It offered not only a possibility of tracking the equipment reliability growth, but also the reference for quality control in manned spaceflight cabin-borne equipment design and development process.
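For readers unfamiliar with the Crow (Crow-AMSAA) reliability growth model named above, its standard maximum likelihood estimates for a time-terminated test can be sketched as follows. This is a minimal illustration of the Crow model only; the test data conversion (TDC) step described in the abstract is not reproduced here, and the function names and failure times are hypothetical.

```python
import math


def crow_amsaa_mle(failure_times, total_time):
    """MLE of the Crow-AMSAA parameters for a time-terminated test.

    The model treats failures as a non-homogeneous Poisson process with
    expected cumulative failures N(t) = lam * t ** beta.
    Returns (beta, lam). beta < 1 indicates reliability growth.
    """
    n = len(failure_times)
    beta = n / sum(math.log(total_time / t) for t in failure_times)
    lam = n / total_time ** beta
    return beta, lam


def instantaneous_mtbf(beta, lam, t):
    """Instantaneous MTBF at time t: 1 / (lam * beta * t ** (beta - 1))."""
    return 1.0 / (lam * beta * t ** (beta - 1.0))


if __name__ == "__main__":
    # Hypothetical cumulative failure times (hours) in a 1000-hour test.
    times = [12.0, 40.0, 95.0, 210.0, 430.0, 800.0]
    beta, lam = crow_amsaa_mle(times, 1000.0)
    print("beta:", beta, "lambda:", lam)
    print("MTBF at end of test:", instantaneous_mtbf(beta, lam, 1000.0))
```

Re-estimating the parameters as test time accumulates gives a running picture of whether the failure intensity is decreasing, which is the tracking use the abstract describes.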
Musa, Gada; Henríquez, Fernando; Muñoz-Neira, Carlos; Delgado, Carolina; Lillo, Patricia; Slachevsky, Andrea
2017-01-01
The Neuropsychiatric Inventory Questionnaire (NPI-Q) is an informant-based instrument that measures the presence and severity of 12 Neuropsychiatric Symptoms (NPS) in patients with dementia, as well as informant distress. Objective: To measure the psychometric properties of the NPI-Q and the prevalence of NPS in patients with Alzheimer's disease (AD) in Chile. Methods: 53 patients with AD were assessed. Subjects were divided into two different groups: mild AD (n=26) and moderate AD (n=27). Convergent validity was estimated by correlating the outcomes of the NPI-Q with Neuropsychiatric Inventory (NPI) scores and with a global cognitive efficiency test (Addenbrooke's Cognitive Examination - Revised - ACE-R). Reliability of the NPI-Q was analysed by calculating its internal consistency. Prevalence of NPS was estimated with both the NPI and NPI-Q. Results: Positive and significant correlations were observed between the NPI-Q, the NPI, and the ACE-R (r=0.730; p<0.01 and 0.315; p<0.05 respectively). The instrument displayed an adequate level of reliability (Cronbach's alpha=0.783). The most prevalent NPS were apathy/indifference (62.3%) and dysphoria/depression (58.5%). Conclusion: The NPI-Q exhibited acceptable validity and reliability indicators for patients with AD in Chile, indicating that it is a suitable instrument for the routine assessment of NPS in clinical practice. PMID:29213504
The Yale-Brown Obsessive Compulsive Scale: A Reliability Generalization Meta-Analysis.
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Maria; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-10-01
The Yale-Brown Obsessive Compulsive Scale (Y-BOCS) is the most frequently applied test to assess obsessive compulsive symptoms. We conducted a reliability generalization meta-analysis on the Y-BOCS to estimate the average reliability, examine the variability among the reliability estimates, search for moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the Y-BOCS. We included studies where the Y-BOCS was applied to a sample of adults and reliability estimate was reported. Out of the 11,490 references located, 144 studies met the selection criteria. For the total scale, the mean reliability was 0.866 for coefficients alpha, 0.848 for test-retest correlations, and 0.922 for intraclass correlations. The moderator analyses led to a predictive model where the standard deviation of the total test and the target population (clinical vs. nonclinical) explained 38.6% of the total variability among coefficients alpha. Finally, clinical implications of the results are discussed. © The Author(s) 2014.
Reliability reporting across studies using the Buss Durkee Hostility Inventory.
Vassar, Matt; Hale, William
2009-01-01
Empirical research on anger and hostility has pervaded the academic literature for more than 50 years. Accurate measurement of anger/hostility and subsequent interpretation of results requires that the instruments yield strong psychometric properties. For consistent measurement, reliability estimates must be calculated with each administration, because changes in sample characteristics may alter the scale's ability to generate reliable scores. Therefore, the present study was designed to address reliability reporting practices for a widely used anger assessment, the Buss Durkee Hostility Inventory (BDHI). Of the 250 published articles reviewed, 11.2% calculated and presented reliability estimates for the data at hand, 6.8% cited estimates from a previous study, and 77.1% made no mention of score reliability. Mean alpha estimates of scores for BDHI subscales generally fell below acceptable standards. Additionally, no detectable pattern was found between reporting practices and publication year or journal prestige. Areas for future research are also discussed.
ERIC Educational Resources Information Center
Setzer, J. Carl; He, Yi
2009-01-01
Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…
Health Auctions: a Valuation Experiment (HAVE) study protocol.
Kularatna, Sanjeewa; Petrie, Dennis; Scuffham, Paul A; Byrnes, Joshua
2016-04-07
Quality-adjusted life years are derived using health state utility weights which adjust for the relative value of living in each health state compared with living in perfect health. Various techniques are used to estimate health state utility weights including time-trade-off and standard gamble. These methods have exhibited limitations in terms of complexity, validity and reliability. A new composite approach using experimental auctions to value health states is introduced in this protocol. A pilot study will test the feasibility and validity of using experimental auctions to value health states in monetary terms. A convenience sample (n=150) from a population of university staff and students will be invited to participate in 30 auction sets with a group of 5 people in each set. The 9 health states auctioned in each auction set will come from the commonly used EQ-5D-3L instrument. Participants purchase at most 2 health states, and the participant who acquires the 2 'best' health states on average will keep the amount of money they do not spend in acquiring those health states. The value (highest bid and average bid) of each of the 24 health states will be compared across auctions to test for reliability across auction groups and across auctioneers. A test-retest will be conducted for 10% of the sample to assess reliability of responses for health state auctions. Feasibility of conducting experimental auctions to value health states will also be examined. The validity of estimated health state values will be assessed by comparison with published utility estimates from other methods. This pilot study will explore the feasibility, reliability and validity of using experimental auctions for valuing health states. Ethical clearance was obtained from Griffith University ethics committee. The results will be disseminated in peer-reviewed journals and major international conferences. Published by the BMJ Publishing Group Limited.
Bronchiolitis Score of Sant Joan de Déu: BROSJOD Score, validation and usefulness.
Balaguer, Mònica; Alejandre, Carme; Vila, David; Esteban, Elisabeth; Carrasco, Josep L; Cambra, Francisco José; Jordan, Iolanda
2017-04-01
To validate the bronchiolitis score of Sant Joan de Déu (BROSJOD) and to examine the previously defined scoring cutoff. Prospective, observational study. BROSJOD scoring was done by two independent physicians (at admission, 24 and 48 hr). Internal consistency of the score was assessed using Cronbach's α. To determine inter-rater reliability, the concordance correlation coefficient (CCC), estimated as an intraclass correlation coefficient, and limits of agreement, estimated as the 90% total deviation index (TDI), were computed. An expert opinion was used to classify patients according to clinical severity. A validity analysis was conducted comparing the 3-level classification score to that expert opinion. Volume under the surface (VUS), predictive values, and probability of correct classification (PCC) were measured to assess discriminant validity. About 112 patients were recruited, 62 of them (55.4%) males. Median age: 52.5 days (IQR: 32.75-115.25). The admission Cronbach's α was 0.77 (95% CI 0.71-0.82) and at 24 hr it was 0.65 (95% CI 0.48-0.7). The inter-rater reliability analysis was: CCC at admission 0.96 (95% CI 0.94-0.97), at 24 hr 0.77 (95% CI 0.65-0.86), and at 48 hr 0.94 (95% CI 0.94-0.97); TDI 90%: 1.6, 2.9, and 1.57, respectively. The discriminant validity at admission: VUS of 0.8 (95% CI 0.70-0.90), at 24 hr 0.92 (95% CI 0.85-0.99), and at 48 hr 0.93 (95% CI 0.87-0.99). The predictive values and PCC values were within 38-100% depending on the level of clinical severity. There is a high inter-rater reliability, showing the BROSJOD score to be reliable and valid, even when different observers apply it. Pediatr Pulmonol. 2017;52:533-539. © 2016 Wiley Periodicals, Inc.
Ferreira, Mariana Cândido; Björklund, Martin; Dach, Fabiola; Chaves, Thais Cristina
The purpose of this study was to adapt and evaluate the psychometric properties of the ProFitMap-neck to Brazilian Portuguese. The cross-cultural adaptation consisted of 5 stages, and 180 female patients with chronic neck pain participated in the study. A subsample (n = 30) answered the pretest, and another subsample (n = 100) answered the questionnaire a second time. Internal consistency, test-retest reliability, and construct validity (hypothesis testing and structural validity) were estimated. For construct validity, the scores of the questionnaire were correlated with the Neck Disability Index (NDI), and the Hospital Anxiety and Depression Scale (HADS), the Tampa Scale of Kinesiophobia (TSK), and the 36-item Short-Form Health Survey (SF-36). Internal consistency was determined by adequate Cronbach's α values (α > 0.70). Strong reliability was identified by high intraclass correlation coefficients (ICC > 0.75). Construct validity was identified by moderate and strong correlations of the Br-ProFitMap-neck with total NDI score (-0.56
USDA-ARS's Scientific Manuscript database
Errors in rater estimates of plant disease severity occur, and standard area diagrams (SADs) help improve accuracy and reliability. The effect of diagram number in a SAD set on accuracy and reliability is unknown. The objective of this study was to compare estimates of pecan scab severity made witho...
Weis, Joachim; Tomaszewski, Krzysztof A; Hammerlid, Eva; Ignacio Arraras, Juan; Conroy, Thierry; Lanceley, Anne; Schmidt, Heike; Wirtz, Markus; Singer, Susanne; Pinto, Monica; Alm El-Din, Mohamed; Compter, Inge; Holzner, Bernhard; Hofmeister, Dirk; Chie, Wei-Chu; Czeladzki, Marek; Harle, Amelie; Jones, Louise; Ritter, Sabrina; Flechtner, Hans-Henning; Bottomley, Andrew
2017-05-01
The European Organisation for Research and Treatment of Cancer (EORTC) Group has developed a new multidimensional instrument measuring cancer-related fatigue to be used in conjunction with the quality of life core questionnaire (EORTC QLQ-C30). The module EORTC QLQ-FA13 assesses physical, cognitive, and emotional aspects of cancer-related fatigue. The methodology follows the EORTC guidelines for phase IV validation of modules. This paper focuses on the results of the psychometric validation of the factorial structure of the module. For validation and cross-validation confirmatory factor analysis (maximum likelihood estimation), intraclass correlation and Cronbach alpha for internal consistency were employed. The study involved an international multicenter collaboration of 11 European and non-European countries. A total of 946 patients with various tumor diagnoses were enrolled. Based on the confirmatory factor analysis, we could approve the three-dimensional structure of the module. Removing one item and reassigning the factorial mapping of another item resulted in the EORTC QLQ-FA12. For the revised scale, we found evidence supporting good local (indicator reliability ≥ 0.60, factor reliability ≥ 0.82) and global model fit (GFI t1|t2 = 0.965/0.957, CFI t1|t2 = 0.976/0.972, RMSEA t1|t2 = 0.060/0.069) for both measurement points. For each scale, test-retest reliability proved to be very good (intraclass correlation: R t1-t2 = 0.905-0.921) and internal consistency proved to be good to high (Cronbach alpha = .79-.90). Based on the former phase III module, the multidimensional structure was revised as a phase IV module (EORTC FA12) with an improved scale structure. For a comprehensive validation of the EORTC FA12, further aspects of convergent and divergent validity as well as sensitivity to change should be determined. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Effect of Combined Loading Due to Bending and Internal Pressure on Pipe Flaw Evaluation Criteria
NASA Astrophysics Data System (ADS)
Miura, Naoki; Sakai, Shinsuke
Considering a rule for the rationalization of maintenance of Light Water Reactor piping, reliable flaw evaluation criteria are essential for determining how a detected flaw will be detrimental to continuous plant operation. Ductile fracture is one of the dominant failure modes that must be considered for carbon steel piping and can be analyzed by elastic-plastic fracture mechanics. Some analytical efforts have provided various flaw evaluation criteria using load correction factors, such as the Z-factors in the JSME codes on fitness-for-service for nuclear power plants and the section XI of the ASME boiler and pressure vessel code. The present Z-factors were conventionally determined, taking conservativity and simplicity into account; however, the effect of internal pressure, which is an important factor under actual plant conditions, was not adequately considered. Recently, a J-estimation scheme, LBB.ENGC for the ductile fracture analysis of circumferentially through-wall-cracked pipes subjected to combined loading was developed for more accurate prediction under more realistic conditions. This method explicitly incorporates the contributions of both bending and tension due to internal pressure by means of a scheme that is compatible with an arbitrary combined-loading history. In this study, the effect of internal pressure on the flaw evaluation criteria was investigated using the new J-estimation scheme. The Z-factor obtained in this study was compared with the presently used Z-factors, and the predictability of the current flaw evaluation criteria was quantitatively evaluated in consideration of the internal pressure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jansen, Paul; Ubachs, Wim; Bethlem, Hendrick L.
2011-12-15
Recently, methanol was identified as a sensitive target system to probe variations of the proton-to-electron mass ratio μ [Jansen et al., Phys. Rev. Lett. 106, 100801 (2011)]. The high sensitivity of methanol originates from the interplay between overall rotation and hindered internal rotation of the molecule; that is, transitions that convert internal rotation energy into overall rotation energy, or vice versa, have an enhanced sensitivity coefficient, K_μ. As internal rotation is a common phenomenon in polyatomic molecules, it is likely that other molecules display similar or even larger effects. In this paper we generalize the concepts that form the foundation of the high sensitivity in methanol and use this to construct an approximate model which makes it possible to estimate the sensitivities of transitions in internal rotor molecules with C_3v symmetry, without performing a full calculation of energy levels. We find that a reliable estimate of transition sensitivities can be obtained from the three rotational constants (A, B, and C) and three torsional constants (F, V_3, and ρ). This model is verified by comparing obtained sensitivities for methanol, acetaldehyde, acetamide, methyl formate, and acetic acid with a full analysis of the molecular Hamiltonian. Of the molecules considered, methanol is by far the most suitable candidate for laboratory and cosmological tests searching for a possible variation of μ.
Doig, Emmah; Prescott, Sarah; Fleming, Jennifer; Cornwell, Petrea; Kuipers, Pim
2016-01-01
To examine the internal reliability and test-retest reliability of the Client-Centeredness of Goal Setting (C-COGS) scale. The C-COGS scale was administered to 42 participants with acquired brain injury after completion of multidisciplinary goal planning. Internal reliability of scale items was examined using item-partial total correlations and Cronbach's α coefficient. The scale was readministered within a 1-mo period to a subsample of 12 participants to examine test-retest reliability by calculating exact and close percentage agreement for each item. After examination of item-partial total correlations, test items were revised. The revised items demonstrated stronger internal consistency than the original items. Preliminary evaluation of test-retest reliability was fair, with an average exact percent agreement across all test items of 67%. Findings support the preliminary reliability of the C-COGS scale as a tool to evaluate and promote client-centered goal planning in brain injury rehabilitation. Copyright © 2016 by the American Occupational Therapy Association, Inc.
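The item-level agreement statistics described in the C-COGS abstract can be illustrated with a minimal Python sketch. The abstract does not define "close" agreement; the sketch assumes it means ratings within one scale point of each other (the `tolerance` parameter), and the function name and sample ratings are hypothetical, not taken from the study.

```python
def agreement_rates(first, second, tolerance=1):
    """Return (exact, close) agreement proportions for one item.

    first, second: ratings given by the same participants on two occasions.
    tolerance: assumed definition of "close" (ratings differ by at most
               this many scale points); the source does not specify it.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    exact = sum(a == b for a, b in zip(first, second)) / n
    close = sum(abs(a - b) <= tolerance for a, b in zip(first, second)) / n
    return exact, close


# Hypothetical ratings for one item from four participants.
time1 = [3, 4, 2, 5]
time2 = [3, 5, 2, 2]
print(agreement_rates(time1, time2))
```

Averaging the exact proportion over all items would give a summary figure of the kind the abstract reports, under the same assumptions.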
Delimiting Coefficient α from Internal Consistency and Unidimensionality
ERIC Educational Resources Information Center
Sijtsma, Klaas
2015-01-01
I discuss the contribution by Davenport, Davison, Liou, & Love (2015) in which they relate reliability represented by coefficient α to formal definitions of internal consistency and unidimensionality, both proposed by Cronbach (1951). I argue that coefficient α is a lower bound to reliability and that concepts of internal consistency and…
Kevern, Mark A.; Beecher, Michael; Rao, Smita
2014-01-01
Context: Athletes who participate in throwing and racket sports consistently demonstrate adaptive changes in glenohumeral-joint internal and external rotation in the dominant arm. Measurements of these motions have demonstrated excellent intrarater and poor interrater reliability. Objective: To determine intrarater reliability, interrater reliability, and standard error of measurement for shoulder internal rotation, external rotation, and total arc of motion using an inclinometer in 3 testing procedures in National Collegiate Athletic Association Division I baseball and softball athletes. Design: Cross-sectional study. Setting: Athletic department. Patients or Other Participants: Thirty-eight players participated in the study. Shoulder internal rotation, external rotation, and total arc of motion were measured by 2 investigators in 3 test positions. The standard supine position was compared with a side-lying test position, as well as a supine test position without examiner overpressure. Results: Excellent intrarater reliability was noted for all 3 test positions and ranges of motion, with intraclass correlation coefficient values ranging from 0.93 to 0.99. Results for interrater reliability were less favorable. Reliability for internal rotation was highest in the side-lying position (0.68) and reliability for external rotation and total arc was highest in the supine-without-overpressure position (0.774 and 0.713, respectively). The supine-with-overpressure position yielded the lowest interrater reliability results in all positions. The side-lying position had the most consistent results, with very little variation among intraclass correlation coefficient values for the various test positions. Conclusions: The results of our study clearly indicate that the side-lying test procedure is of equal or greater value than the traditional supine-with-overpressure method. PMID:25188316
General Aviation Aircraft Reliability Study
NASA Technical Reports Server (NTRS)
Pettit, Duane; Turnbull, Andrew; Roelant, Henk A. (Technical Monitor)
2001-01-01
This reliability study was performed in order to provide the aviation community with an estimate of Complex General Aviation (GA) Aircraft System reliability. To successfully improve the safety and reliability for the next generation of GA aircraft, a study of current GA aircraft attributes was prudent. This was accomplished by benchmarking the reliability of operational Complex GA Aircraft Systems. Specifically, Complex GA Aircraft System reliability was estimated using data obtained from the logbooks of a random sample of the Complex GA Aircraft population.
ERIC Educational Resources Information Center
Lane, Ginny G.; White, Amy E.; Henson, Robin K.
2002-01-01
Conducted a reliability generalizability study on the Coopersmith Self-Esteem Inventory (CSEI; S. Coopersmith, 1967) to examine the variability of reliability estimates across studies and to identify study characteristics that may predict this variability. Results show that reliability for CSEI scores can vary considerably, especially at the…
Solid Fuel Use for Household Cooking: Country and Regional Estimates for 1980–2010
Bonjour, Sophie; Adair-Rohani, Heather; Wolf, Jennyfer; Bruce, Nigel G.; Mehta, Sumi; Lahiff, Maureen; Rehfuess, Eva A.; Mishra, Vinod; Smith, Kirk R.
2013-01-01
Background: Exposure to household air pollution from cooking with solid fuels in simple stoves is a major health risk. Modeling reliable estimates of solid fuel use is needed for monitoring trends and informing policy. Objectives: In order to revise the disease burden attributed to household air pollution for the Global Burden of Disease 2010 project and for international reporting purposes, we estimated annual trends in the world population using solid fuels. Methods: We developed a multilevel model based on national survey data on primary cooking fuel. Results: The proportion of households relying mainly on solid fuels for cooking has decreased from 62% (95% CI: 58, 66%) to 41% (95% CI: 37, 44%) between 1980 and 2010. Yet because of population growth, the actual number of persons exposed has remained stable at around 2.8 billion during three decades. Solid fuel use is most prevalent in Africa and Southeast Asia where > 60% of households cook with solid fuels. In other regions, primary solid fuel use ranges from 46% in the Western Pacific, to 35% in the Eastern Mediterranean and < 20% in the Americas and Europe. Conclusion: Multilevel modeling is a suitable technique for deriving reliable solid-fuel use estimates. Worldwide, the proportion of households cooking mainly with solid fuels is decreasing. The absolute number of persons using solid fuels, however, has remained steady globally and is increasing in some regions. Surveys require enhancement to better capture the health implications of new technologies and multiple fuel use. PMID:23674502
NASA Astrophysics Data System (ADS)
Remmlinger, Jürgen; Buchholz, Michael; Meiler, Markus; Bernreuter, Peter; Dietmayer, Klaus
For reliable and safe operation of lithium-ion batteries in electric or hybrid vehicles, diagnosis of the cell degradation is necessary. This can be achieved by monitoring the increase of the internal resistance of the battery cells over the whole lifetime of the battery. In this paper, a method to identify the internal resistance in a hybrid vehicle is presented. Therefore, a special purpose model deduced from an equivalent circuit is developed. This model contains parameters depending on the degradation of the battery cell. To achieve the required robustness and stable results under these conditions, the method uses specific signal intervals occurring during normal operation of the battery in a hybrid vehicle. This identification signal has a defined timespan and occurs regularly. The identification is done on vehicle measurement data of terminal cell voltage and current collected with a usual vehicle sampling rate. Using the adapted internal resistance value in the model, a degradation index is calculated by compensating other influences, e.g. battery temperature. This task is the main challenge, as the impact of the temperature on the resistance, for example, is one order of magnitude higher than the influence of the degradation for the investigated lithium-ion cell. The developed estimation and monitoring method is validated with measurement data from single cells and shows good results and very low computational effort.
[The reliability of a questionnaire regarding Colombian children's physical activity].
Herazo-Beltrán, Aliz Y; Domínguez-Anaya, Regina
2012-10-01
To report the test-retest reliability and internal consistency of the Physical Activity Questionnaire for school children (PAQ-C). This was a descriptive study of 100 school children aged 9 to 11 years attending a school in Cartagena, Colombia. The sample was randomly selected. The PAQ-C was given twice, one week apart, after the informed consent forms had been signed by the children's parents and school officials. Cronbach's alpha coefficient of reliability was used for assessing internal consistency and an intra-class correlation coefficient for test-retest reliability; SPSS (version 17.0) was used for statistical analysis. The questionnaire's internal consistency was 0.73 at the first measurement and 0.78 at the second; the intra-class correlation coefficient was 0.60. There were differences between boys and girls regarding both measurements. The PAQ-C had acceptable internal consistency and test-retest reliability, thereby making it useful for measuring children's self-reported physical activity and a valuable tool for population studies in Colombia.
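Several abstracts in this listing report Cronbach's alpha as their internal consistency estimate. As a hedged illustration only, the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), can be sketched in Python as follows. The function name and the sample scores are hypothetical and are not drawn from any of the studies listed here, which used statistical packages such as SPSS.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of rows, one per respondent, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent needs a score for every item")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five respondents answering three items.
sample = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(round(cronbach_alpha(sample), 2))
```

The sketch assumes complete data; handling of missing responses, which real questionnaire studies need, is left out.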
Fission product release and survivability of UN-kernel LWR TRISO fuel
DOE Office of Scientific and Technical Information (OSTI.GOV)
T. M. Besmann; M. K. Ferber; H.-T. Lin
2014-05-01
A thermomechanical assessment of the LWR application of TRISO fuel with UN kernels was performed. Fission product release under operational and transient temperature conditions was determined by extrapolation from fission product recoil calculations and limited data from irradiated UN pellets. Both fission recoil and diffusive release were considered and internal particle pressures computed for both 650 and 800 µm diameter kernels as a function of buffer layer thickness. These pressures were used in conjunction with a finite element program to compute the radial and tangential stresses generated within a TRISO particle undergoing burnup. Creep and swelling of the inner and outer pyrolytic carbon layers were included in the analyses. A measure of reliability of the TRISO particle was obtained by computing the probability of survival of the SiC barrier layer and the maximum tensile stress generated in the pyrolytic carbon layers from internal pressure and thermomechanics of the layers. These reliability estimates were obtained as functions of the kernel diameter, buffer layer thickness, and pyrolytic carbon layer thickness. The value of the probability of survival at the end of irradiation was inversely proportional to the maximum pressure.
Underwater robot society doing internal inspection and leak monitoring of water systems
NASA Astrophysics Data System (ADS)
Halme, Aarne; Vainio, Mika; Appelqvist, Pekka; Jakubik, Peter; Schonberg, Torsten; Visala, Arto
1997-09-01
In the field of civil engineering an effective internal monitoring of pipes and water storage is very problematic. Normally the sensors used for the task are either fixed or manually movable. Thus they will only provide locally and temporally restricted information. As a solution an underwater robotic sensor/actuator society is presented. The system is capable of operating inside a fluid environment as a kind of distributed sensory system. The value of the system emerges from the interactions between the members. Through a communication system the society fuses information from individual members and provides a more reliable estimate of the conditions inside water systems. Test results in a transparent demo process consisting of tanks and pipes with a volume of 700 liters are presented.
Survival Differences among Native-Born and Foreign-Born Older Adults in the United States
Dupre, Matthew E.; Gu, Danan; Vaupel, James W.
2012-01-01
Background: Studies show that the U.S. foreign-born population has lower mortality than the native-born population before age 65. Until recently, the lack of data prohibited reliable comparisons of U.S. mortality by nativity at older ages. This study provides reliable estimates of U.S. foreign-born and native-born mortality at ages 65 and older at the end of the 20th century. Life expectancies of the U.S. foreign born are compared to other developed nations and the foreign-born contribution to total life expectancy (TLE) in the United States is assessed. Methods: Newly available data from Medicare Part B records linked with Social Security Administration files are used to estimate period life tables for nearly all U.S. adults aged 65 and older in 1995. Age-specific survival differences and life expectancies are examined in 1995 by sex, race, and place of birth. Results: Foreign-born men and women had lower mortality at almost every age from 65 to 100 compared to native-born men and women. Survival differences by nativity were substantially greater for blacks than whites. Foreign-born blacks had the longest life expectancy of all population groups (18.73 [95% confidence interval {CI}, 18.15–19.30] years at age 65 for men and 22.76 [95% CI, 22.28–23.23] years at age 65 for women). The foreign-born population increased TLE in the United States at older ages, and by international comparison, the U.S. foreign born were among the longest-lived persons in the world. Conclusion: Survival estimates based on reliable Medicare data confirm that foreign-born adults have longer life expectancy at older ages than native-born adults in the United States. PMID:22615929
ERIC Educational Resources Information Center
Green, Samuel B.; Yang, Yanyun
2015-01-01
In the lead article, Davenport, Davison, Liou, & Love demonstrate the relationship among homogeneity, internal consistency, and coefficient alpha, and also distinguish among them. These distinctions are important because too often coefficient alpha--a reliability coefficient--is interpreted as an index of homogeneity or internal consistency.…
Parts and Components Reliability Assessment: A Cost Effective Approach
NASA Technical Reports Server (NTRS)
Lee, Lydia
2009-01-01
System reliability assessment is a methodology which incorporates reliability analyses performed at parts and components level such as Reliability Prediction, Failure Modes and Effects Analysis (FMEA) and Fault Tree Analysis (FTA) to assess risks, perform design tradeoffs, and therefore, to ensure effective productivity and/or mission success. The system reliability is used to optimize the product design to accommodate today's mandated budget, manpower, and schedule constraints. Standard based reliability assessment is an effective approach consisting of reliability predictions together with other reliability analyses for electronic, electrical, and electro-mechanical (EEE) complex parts and components of large systems based on failure rate estimates published by the United States (U.S.) military or commercial standards and handbooks. Many of these standards are globally accepted and recognized. The reliability assessment is especially useful during the initial stages when the system design is still in the development and hard failure data is not yet available or manufacturers are not contractually obliged by their customers to publish the reliability estimates/predictions for their parts and components. This paper presents a methodology to assess system reliability using parts and components reliability estimates to ensure effective productivity and/or mission success in an efficient manner, low cost, and tight schedule.
Hernansaiz-Garrido, Helena; Alonso-Tapia, Jesús
2017-01-01
Internalized stigma and disclosure concerns are key elements for the study of mental health in people living with HIV. Since no measures of these constructs were available for Spanish population, this study sought to develop such instruments, to analyze their reliability and validity and to provide a short version. A heterogeneous sample of 458 adults from different Spanish-speaking countries completed the HIV-Internalized Stigma Scale and the HIV-Disclosure Concerns Scale, along with the Hospital Anxiety and Depression Scale, Rosenberg's Self-esteem Scale and other socio-demographic variables. Reliability and correlation analyses, exploratory factor analyses, path analyses with latent variables, and ANOVAs were conducted to test the scales' psychometric properties. The scales showed good reliability in terms of internal consistency and temporal stability, as well as good sensitivity and factorial and criterion validity. The HIV-Internalized Stigma Scale and the HIV-Disclosure Concerns Scale are reliable and valid means to assess these variables in several contexts.
Kashkouli, Mohsen Bahmani; Karimi, Nasser; Aghamirsalim, Mohamadreza; Abtahi, Mohammad Bagher; Nojomi, Marzieh; Shahrad-Bejestani, Hadi; Salehi, Masoud
2017-02-01
To determine the measurement properties of the Persian language version of the Graves orbitopathy quality of life questionnaire (GO-QOL). Following a systematic translation and cultural adaptation process, 141 consecutive unselected thyroid eye disease (TED) patients answered the Persian GO-QOL and underwent complete ophthalmic examination. The questionnaire was again completed by 60 patients on the second visit, 2-4 weeks later. Construct validity (cross-cultural validity, structural validity and hypotheses testing), reliability (internal consistency and test-retest reliability), and floor and ceiling effects of the Persian version of the GO-QOL were evaluated. Furthermore, Rasch analysis was used to assess its psychometric properties. Cross-cultural validity was established by back-translation techniques, committee review and pretesting techniques. Bi-dimensionality of the questionnaire was confirmed by factor analysis. Construct validity was also supported through confirmation of 6 out of 8 predefined hypotheses. Cronbach's α and intraclass correlation coefficient (ICC) were 0.650 and 0.859 for visual functioning and 0.875 and 0.896 for appearance subscale, respectively. Mean quality of life (QOL) scores for visual functioning and appearance were 78.18 (standard deviation, SD, 21.57) and 56.25 (SD 26.87), respectively. Person reliabilities from the Rasch rating scale model for both visual functioning and appearance revealed an acceptable internal consistency for the Persian GO-QOL. The Persian GO-QOL questionnaire is a valid and reliable tool with good psychometric properties in evaluation of Persian-speaking patients with TED. Applying Rasch analysis to future versions of the GO-QOL is recommended in order to perform tests for linearity between the estimated item measures in different versions.
Vélez, Claudia Marcela; Lugo, Luz Helena; García, Héctor Iván
2012-09-01
To validate the KIDSCREEN-27 for parents in the metropolitan area of Medellín, Colombia, including the Social Acceptance (SA) subscale of KIDSCREEN-52, as it evaluates the effect of bullying on children's quality of life. The study population was made up of parents of children aged 8 to 18 from Medellín and its metropolitan area. A sample of 1,150 parents was estimated according to the different psychometric properties to be measured. Construct validation was made by comparing the mean scores between groups of high and low socioeconomic conditions. The content validity and the measurement of reliability were verified by internal consistency and test-retest stability. The parent-child agreement was also measured. The internal consistency was adequate (Cronbach alpha 0.76-0.83). Parents of children with better socioeconomic status had higher scores in all dimensions (p<0.05). Scores were higher among healthy children. Women had lower scores than men, while children registered higher scores than adolescents. The intraclass correlation coefficient for the reliability assessment was above 0.7 in all dimensions, except in School Environment (SE) (ICC 0.6-0.92). The parent-child agreement reached moderate and good levels (ICC 0.49-0.69). The exploratory factor analysis, including the social acceptance subscale, registered eight dimensions, four of which agreed with the original questionnaire: Physical activity, SE, Social Support, and the SA subscale. KIDSCREEN-27 for parents is a valid and reliable instrument to be used in the Colombian context. Copyright © 2012 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.
Maciel, João; Infante, Paulo; Ribeiro, Susana; Ferreira, André; Silva, Artur C; Caravana, Jorge; Carvalho, Manuel G
2014-11-01
The prevalence of obesity has increased worldwide. An assessment of the impact of obesity on health-related quality of life (HRQoL) requires specific instruments. The Moorehead-Ardelt Quality of Life Questionnaire II (MA-II) is a widely used instrument to assess HRQoL in morbidly obese patients. The objective of this study was to translate and validate a Portuguese version of the MA-II. The study included forward and backward translations of the original MA-II. The reliability of the Portuguese MA-II was estimated using the internal consistency and test-retest methods. For validation purposes, the Spearman's rank correlation coefficient was used to evaluate the correlation between the Portuguese MA-II and the Portuguese versions of two other questionnaires, the 36-item Short Form Health Survey (SF-36) and the Impact of Weight on Quality of Life-Lite (IWQOL-Lite). One hundred and fifty morbidly obese patients were randomly assigned to test the reliability and validity of the Portuguese MA-II. Good internal consistency was demonstrated by a Cronbach's alpha coefficient of 0.80, and a very good agreement in terms of test-retest reliability was recorded, with an overall intraclass correlation coefficient (ICC) of 0.88. The total sums of MA-II scores and each item of MA-II were significantly correlated with all domains of SF-36 and IWQOL-Lite. A statistically significant negative correlation was found between the MA-II total score and BMI. Moreover, age, gender and surgical status were independent predictors of MA-II total score. A reliable and valid Portuguese version of the MA-II was produced, thus enabling the routine use of MA-II in the morbidly obese Portuguese population.
Lin, Chiu-Chu; Wu, Chia-Chen; Wu, Li-Min; Chen, Hsing-Mei; Chang, Shu-Chen
2013-04-01
This study aims to develop a valid and reliable chronic kidney disease self-management instrument (CKD-SM) for assessing early stage chronic kidney disease patients' self-management behaviours. Enhancing early stage chronic kidney disease patients' self-management plays a key role in delaying the progression of chronic kidney disease. Healthcare provider understanding of early stage chronic kidney disease patients' self-management behaviours can help develop effective interventions. A valid and reliable instrument for measuring chronic kidney disease patients' self-management behaviours is needed. A cross-sectional descriptive study collected data for principal components analysis with oblique rotation. Mandarin- or Taiwanese-speaking adults with chronic kidney disease (n=252) from two medical centres and one regional hospital in Southern Taiwan completed the CKD-SM. Construct validity was evaluated by exploratory factor analysis. Internal consistency and test-retest reliability were estimated by Cronbach's alpha and Pearson correlation coefficients. Four factors were extracted and labelled self-integration, problem-solving, seeking social support and adherence to recommended regimen. The four factors accounted for 60.51% of the total variance. Each factor showed acceptable internal reliability with Cronbach's alpha from 0.77-0.92. The test-retest correlation for the CKD-SM was 0.72. The psychometric quality of the CKD-SM instrument was satisfactory. Research to conduct a confirmatory factor analysis to further validate this new instrument's construct validity is recommended. The CKD-SM instrument is useful for clinicians who wish to identify the problems with self-management among chronic kidney disease patients early. Self-management assessment will be helpful to develop interventions tailored to the needs of the chronic kidney disease population. © 2013 Blackwell Publishing Ltd.
The reliability of the Glasgow Coma Scale: a systematic review.
Reith, Florence C M; Van den Brande, Ruben; Synnot, Anneliese; Gruen, Russell; Maas, Andrew I R
2016-01-01
The Glasgow Coma Scale (GCS) provides a structured method for assessment of the level of consciousness. Its derived sum score is applied in research and adopted in intensive care unit scoring systems. Controversy exists on the reliability of the GCS. The aim of this systematic review was to summarize evidence on the reliability of the GCS. A literature search was undertaken in MEDLINE, EMBASE and CINAHL. Observational studies that assessed the reliability of the GCS, expressed by a statistical measure, were included. Methodological quality was evaluated with the consensus-based standards for the selection of health measurement instruments checklist and its influence on results considered. Reliability estimates were synthesized narratively. We identified 52 relevant studies that showed significant heterogeneity in the type of reliability estimates used, patients studied, setting and characteristics of observers. Methodological quality was good (n = 7), fair (n = 18) or poor (n = 27). In good quality studies, kappa values were ≥0.6 in 85%, and all intraclass correlation coefficients indicated excellent reliability. Poor quality studies showed lower reliability estimates. Reliability for the GCS components was higher than for the sum score. Factors that may influence reliability include education and training, the level of consciousness and type of stimuli used. Only 13% of studies were of good quality and inconsistency in reported reliability estimates was found. Although the reliability was adequate in good quality studies, further improvement is desirable. From a methodological perspective, the quality of reliability studies needs to be improved. From a clinical perspective, a renewed focus on training/education and standardization of assessment is required.
Field reliability of competency and sanity opinions: A systematic review and meta-analysis.
Guarnera, Lucy A; Murrie, Daniel C
2017-06-01
We know surprisingly little about the interrater reliability of forensic psychological opinions, even though courts and other authorities have long called for known error rates for scientific procedures admitted as courtroom testimony. This is particularly true for opinions produced during routine practice in the field, even for some of the most common types of forensic evaluations-evaluations of adjudicative competency and legal sanity. To address this gap, we used meta-analytic procedures and study space methodology to systematically review studies that examined the interrater reliability-particularly the field reliability-of competency and sanity opinions. Of 59 identified studies, 9 addressed the field reliability of competency opinions and 8 addressed the field reliability of sanity opinions. These studies presented a wide range of reliability estimates; pairwise percentage agreements ranged from 57% to 100% and kappas ranged from .28 to 1.0. Meta-analytic combinations of reliability estimates obtained by independent evaluators returned estimates of κ = .49 (95% CI: .40-.58) for competency opinions and κ = .41 (95% CI: .29-.53) for sanity opinions. This wide range of reliability estimates underscores the extent to which different evaluation contexts tend to produce different reliability rates. Unfortunately, our study space analysis illustrates that available field reliability studies typically provide little information about contextual variables crucial to understanding their findings. Given these concerns, we offer suggestions for improving research on the field reliability of competency and sanity opinions, as well as suggestions for improving reliability rates themselves. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Test Assembly Implications for Providing Reliable and Valid Subscores
ERIC Educational Resources Information Center
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J.
2017-01-01
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Compound estimation procedures in reliability
NASA Technical Reports Server (NTRS)
Barnes, Ron
1990-01-01
At NASA, components and subsystems of components in the Space Shuttle and Space Station generally go through a number of redesign stages. While data on failures for various design stages are sometimes available, the classical procedures for evaluating reliability only utilize the failure data on the present design stage of the component or subsystem. Often, few or no failures have been recorded on the present design stage. Previously, Bayesian estimators for the reliability of a single component, conditioned on the failure data for the present design, were developed. These new estimators permit NASA to evaluate the reliability, even when few or no failures have been recorded. Point estimates for the latter evaluation were not possible with the classical procedures. Since different design stages of a component (or subsystem) generally have a good deal in common, the development of new statistical procedures for evaluating the reliability, which consider the entire failure record for all design stages, has great intuitive appeal. A typical subsystem consists of a number of different components and each component has evolved through a number of redesign stages. The present investigations considered compound estimation procedures and related models. Such models permit the statistical consideration of all design stages of each component and thus incorporate all the available failure data to obtain estimates for the reliability of the present version of the component (or subsystem). A number of models were considered to estimate the reliability of a component conditioned on its total failure history from two design stages. It was determined that reliability estimators for the present design stage, conditioned on the complete failure history for two design stages have lower risk than the corresponding estimators conditioned only on the most recent design failure data. 
Several models were explored and preliminary models involving the bivariate Poisson distribution and the Consael Process (a bivariate Poisson process) were developed. Possible shortcomings of the models are noted. An example is given to illustrate the procedures. These investigations are ongoing with the aim of developing estimators that extend to components (and subsystems) with three or more design stages.
Trani, Jean-François; Babulal, Ganesh Muneshwar; Bakhshi, Parul
2015-01-01
Background Although 80% of persons with disabilities live in low and middle-income countries, there is still a lack of comprehensive, cross-culturally validated tools to identify persons facing activity limitations and functioning difficulties in these settings. In absence of such a tool, disability estimates vary considerably according to the methodology used, and policies are based on unreliable estimates. Methods and Findings The Disability Screening Questionnaire composed of 27 items (DSQ-27) was initially designed by a group of international experts in survey development and disability in Afghanistan for a national survey. Items were selected based on major domains of activity limitations and functioning difficulties linked to an impairment as defined by the International Classification of Functioning, Disability and Health. Face, content and construct validity, as well as sensitivity and specificity were examined. Based on the results obtained, the tool was subsequently refined and expanded to 34 items, tested and validated in Darfur, Sudan. Internal consistency for the total DSQ-34 using a raw and standardized Cronbach’s Alpha and within each domain using a standardized Cronbach’s Alpha was examined in the Asian context (India and Nepal). Exploratory factor analysis (EFA) using principal axis factoring (PAF) evaluated the lowest number of factors to account for the common variance among the questions in the screen. Test-retest reliability was determined by calculating intraclass correlation (ICC) and inter-rater reliability by calculating the kappa statistic; results were checked using Bland-Altman plots. The DSQ-34 was further tested for standard error of measurement (SEM) and for the minimum detectable change (MDC). Good internal consistency was indicated by Cronbach’s Alpha of 0.83/0.82 for India and 0.76/0.78 for Nepal. 
We confirmed our assumption for EFA using the Kaiser-Meyer-Olkin measure of sampling adequacy, which was well above the accepted cutoff of 0.40 for India (0.82) and Nepal (0.82). The criteria for Bartlett’s test of sphericity were also met for both India (< .001) and Nepal (< .001). Estimates of reliability from the two countries reached acceptable levels of ICC of 0.75 (p<0.001) for India and 0.77 (p<0.001) for Nepal, and good strength of agreement for weighted kappa (respectively 0.77 and 0.79). The SEM/MDC was 0.80/2.22 for India and 0.96/2.66 for Nepal, indicating a smaller amount of measurement error in the screen. Conclusions In Nepal and India, the DSQ-34 shows strong psychometric properties that indicate that it effectively discriminates between persons with and without disabilities. This instrument can be used in association with other instruments for the purpose of comparing health outcomes of persons with and without disabilities in LMICs. PMID:26630668
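The SEM/MDC pairs reported above (0.80/2.22 and 0.96/2.66) are consistent with the conventional relation MDC95 = 1.96 × √2 × SEM. The abstract does not state which formula the authors used, so the following is only a sketch under that assumption; the function names are illustrative, and the SD passed to `sem_from_icc` would have to come from the study data, which are not given here.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability (ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimum detectable change at 95% confidence (assumed convention)."""
    return 1.96 * math.sqrt(2.0) * sem


# Check against the SEM values reported in the abstract.
print(round(mdc95(0.80), 2))  # 2.22
print(round(mdc95(0.96), 2))  # 2.66
```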
Over two hundred million injuries to anterior teeth attributable to large overjet: a meta-analysis.
Petti, Stefano
2015-02-01
The association between large overjet and traumatic dental injuries (TDIs) to anterior teeth is documented. However, observational studies are discrepant and generalizability (i.e. external validity) of meta-analyses is limited. Therefore, this meta-analysis sought to reconcile such discrepancies seeking to provide reliable risk estimates which could be generalizable at global level. Literature search (years 1990-2014) was performed (Scopus, Google Scholar, Medline). Selected primary studies were divided into subsets: 'primary teeth, overjet threshold 3-4 mm' (Primary3); 'permanent teeth, overjet threshold 3-4 mm' (Permanent3); 'permanent teeth, overjet threshold 6 ± 1 mm' (Permanent6). The adjusted odds ratios (ORs) were extracted. To obtain the highest level of reliability (i.e. internal validity), the pooled OR estimates were assessed accounting for between-study heterogeneity, publication bias and confounding. Result robustness was investigated with sensitivity and subgroup analyses. Fifty-four primary studies from Africa, America, Asia and Europe were included. The sampled individuals were children, adolescents and adults. Overall, there were >10 000 patients with TDI. The pooled OR estimates were 2.31 (95% confidence interval - 95CI, 1.01-5.27), 2.01 (95CI, 1.39-2.91) and 2.24 (95CI, 1.56-3.21) for Primary3, Permanent3 and Permanent6, respectively. Sensitivity and subgroup analyses corroborated these estimates. Reliability and generalizability of pooled ORs were high enough and made it possible to assess that the fraction of global TDIs attributable to large overjet is 21.8% (95CI, 9.7-34.5%) and that large overjet is co-responsible for 235 008 000 global TDI cases (95CI, 104 760 000-372 168 000). This high global burden of TDI suggests that preventive measures must be implemented in patients with large overjet. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Test Reliability at the Individual Level
Hu, Yueqin; Nesselroade, John R.; Erbacher, Monica K.; Boker, Steven M.; Burt, S. Alexandra; Keel, Pamela K.; Neale, Michael C.; Sisk, Cheryl L.; Klump, Kelly
2016-01-01
Reliability has a long history as one of the key psychometric properties of a test. However, a given test might not measure people equally reliably. Test scores from some individuals may have considerably greater error than others. This study proposed two approaches using intraindividual variation to estimate test reliability for each person. A simulation study suggested that the parallel tests approach and the structural equation modeling approach recovered the simulated reliability coefficients. Then in an empirical study, where forty-five females were measured daily on the Positive and Negative Affect Schedule (PANAS) for 45 consecutive days, separate estimates of reliability were generated for each person. Results showed that reliability estimates of the PANAS varied substantially from person to person. The methods provided in this article apply to tests measuring changeable attributes and require repeated measures across time on each individual. This article also provides a set of parallel forms of PANAS. PMID:28936107
Zucoloto, Miriane Lucindo; Maroco, João; Duarte Bonini Campos, Juliana Alvares
2015-01-01
To evaluate the psychometric properties of the Multidimensional Pain Inventory (MPI) in a Brazilian sample of patients with orofacial pain. A total of 1,925 adult patients, who sought dental care in the School of Dentistry of São Paulo State University's Araraquara campus, were invited to participate; 62.5% (n=1,203) agreed to participate. Of these, 436 presented with orofacial pain and were included. The mean age was 39.9 (SD=13.6) years and 74.5% were female. Confirmatory factor analysis was conducted using χ²/df, comparative fit index, goodness of fit index, and root mean square error of approximation as indices of goodness of fit. Convergent validity was estimated by the average variance extracted and composite reliability, and internal consistency by Cronbach's alpha standardized coefficient (α). The stability of the models was tested in independent samples (test and validation; dental pain and orofacial pain). The factorial invariance was estimated by multigroup analysis (Δχ²). Factorial, convergent validity, and internal consistency were adequate in all three parts of the MPI. To achieve this adequate fit for Part 1, item 15 needed to be deleted (λ=0.13). Discriminant validity was compromised between the factors "activities outside the home" and "social activities" of Part 3 of the MPI in the total sample, validation sample, and in patients with dental pain and with orofacial pain. A strong invariance between different subsamples from the three parts of the MPI was detected. The MPI produced valid, reliable, and stable data for pain assessment among Brazilian patients with orofacial pain.
Patient-Specific Biomechanical Modeling for Guidance During Minimally-Invasive Hepatic Surgery.
Plantefève, Rosalie; Peterlik, Igor; Haouchine, Nazim; Cotin, Stéphane
2016-01-01
During the minimally-invasive liver surgery, only the partial surface view of the liver is usually provided to the surgeon via the laparoscopic camera. Therefore, it is necessary to estimate the actual position of the internal structures such as tumors and vessels from the pre-operative images. Nevertheless, such task can be highly challenging since during the intervention, the abdominal organs undergo important deformations due to the pneumoperitoneum, respiratory and cardiac motion and the interaction with the surgical tools. Therefore, a reliable automatic system for intra-operative guidance requires fast and reliable registration of the pre- and intra-operative data. In this paper we present a complete pipeline for the registration of pre-operative patient-specific image data to the sparse and incomplete intra-operative data. While the intra-operative data is represented by a point cloud extracted from the stereo-endoscopic images, the pre-operative data is used to reconstruct a biomechanical model which is necessary for accurate estimation of the position of the internal structures, considering the actual deformations. This model takes into account the patient-specific liver anatomy composed of parenchyma, vascularization and capsule, and is enriched with anatomical boundary conditions transferred from an atlas. The registration process employs the iterative closest point technique together with a penalty-based method. We perform a quantitative assessment based on the evaluation of the target registration error on synthetic data as well as a qualitative assessment on real patient data. We demonstrate that the proposed registration method provides good results in terms of both accuracy and robustness w.r.t. the quality of the intra-operative data.
Image denoising by exploring external and internal correlations.
Yue, Huanjing; Sun, Xiaoyan; Yang, Jingyu; Wu, Feng
2015-06-01
Single image denoising suffers from limited data collection within a noisy image. In this paper, we propose a novel image denoising scheme, which explores both internal and external correlations with the help of web images. For each noisy patch, we build internal and external data cubes by finding similar patches from the noisy and web images, respectively. We then propose reducing noise by a two-stage strategy using different filtering approaches. In the first stage, since the noisy patch may lead to inaccurate patch selection, we propose a graph based optimization method to improve patch matching accuracy in external denoising. The internal denoising is frequency truncation on internal cubes. By combining the internal and external denoising patches, we obtain a preliminary denoising result. In the second stage, we propose reducing noise by filtering of external and internal cubes, respectively, on transform domain. In this stage, the preliminary denoising result not only enhances the patch matching accuracy but also provides reliable estimates of filtering parameters. The final denoising image is obtained by fusing the external and internal filtering results. Experimental results show that our method constantly outperforms state-of-the-art denoising schemes in both subjective and objective quality measurements, e.g., it achieves >2 dB gain compared with BM3D at a wide range of noise levels.
Methods to assess geological CO2 storage capacity: Status and best practice
Heidug, Wolf; Brennan, Sean T.; Holloway, Sam; Warwick, Peter D.; McCoy, Sean; Yoshimura, Tsukasa
2013-01-01
To understand the emission reduction potential of carbon capture and storage (CCS), decision makers need to understand the amount of CO2 that can be safely stored in the subsurface and the geographical distribution of storage resources. Estimates of storage resources need to be made using reliable and consistent methods. Previous estimates of CO2 storage potential for a range of countries and regions have been based on a variety of methodologies resulting in a correspondingly wide range of estimates. Consequently, there has been uncertainty about which of the methodologies were most appropriate in given settings, and whether the estimates produced by these methods were useful to policy makers trying to determine the appropriate role of CCS. In 2011, the IEA convened two workshops which brought together experts for six national surveys organisations to review CO2 storage assessment methodologies and make recommendations on how to harmonise CO2 storage estimates worldwide. This report presents the findings of these workshops and an internationally shared guideline for quantifying CO2 storage resources.
Online estimation of internal stack temperatures in solid oxide fuel cell power generating units
NASA Astrophysics Data System (ADS)
Dolenc, B.; Vrečko, D.; Juričić, Ɖ.; Pohjoranta, A.; Pianese, C.
2016-12-01
Thermal stress is one of the main factors affecting the degradation rate of solid oxide fuel cell (SOFC) stacks. In order to mitigate the possibility of fatal thermal stress, stack temperatures and the corresponding thermal gradients need to be continuously controlled during operation. Due to the fact that in future commercial applications the use of temperature sensors embedded within the stack is impractical, the use of estimators appears to be a viable option. In this paper we present an efficient and consistent approach to data-driven design of the estimator for maximum and minimum stack temperatures intended (i) to be of high precision, (ii) to be simple to implement on conventional platforms like programmable logic controllers, and (iii) to maintain reliability in spite of degradation processes. By careful application of subspace identification, supported by physical arguments, we derive a simple estimator structure capable of producing estimates with 3% error irrespective of the evolving stack degradation. The degradation drift is handled without any explicit modelling. The approach is experimentally validated on a 10 kW SOFC system.
Ulus, Tumer; Yurtseven, Eray; Cavdar, Sabanur; Erginoz, Ethem; Erdogan, M. Sarper
2012-01-01
Aim To compare the quality of the 2008 cancer mortality data of the Istanbul Directorate of Cemeteries (IDC) with the 2008 data of International Agency for Research on Cancer (IARC) and Turkish Statistical Institute (TUIK), and discuss the suitability of using this databank for estimations of cancer mortality in the future. Methods We used 2008 and 2010 death records of the IDC and compared it to TUIK and IARC data. Results According to the WHO statistics, in Turkey in 2008 there were 67 255 estimated cancer deaths. As the population of Turkey was 71 517 100, the cancer mortality rate was 9.4 per 10 000. According to the IDC statistics, the cancer mortality rate in Istanbul in 2008 was 5.97 per 10 000. Conclusion IDC estimates were higher than WHO estimates probably because WHO bases its estimates on a sample group and because of the restrictions of IDC data collection method. Death certificates could be a reliable and accurate data source for mortality statistics if the problems of data collection are solved. PMID:23100210
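The WHO-based rate quoted in the Results follows directly from the two counts given there. A minimal worked check (the function name is illustrative):

```python
def rate_per_10000(deaths, population):
    """Crude mortality rate expressed per 10 000 population."""
    return deaths / population * 10_000


# Figures quoted in the abstract for Turkey, 2008.
who_rate = rate_per_10000(67_255, 71_517_100)
print(round(who_rate, 1))  # 9.4
```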
Hansen, Andreas Wolff; Dahl-Petersen, Inger; Helge, Jørn Wulff; Brage, Søren; Grønbæk, Morten; Flensborg-Madsen, Trine
2014-03-01
The International Physical Activity Questionnaire (IPAQ) is commonly used in surveys, but its reliability and validity have not been established in the Danish population. Among participants in the Danish Health Examination survey 2007-2008, 142 healthy participants (45% men) wore a unit that combined accelerometry and heart rate monitoring (Acc+HR) for 7 consecutive days and then completed the IPAQ. Background data were obtained from the survey. Physical activity energy expenditure (PAEE) and time in moderate, vigorous, and sedentary intensity levels were derived from the IPAQ and compared with estimates from Acc+HR using Spearman's correlation coefficients and Bland-Altman plots. Repeatability of the IPAQ was also assessed. PAEE from the 2 methods was significantly positively correlated (0.29 and 0.49; P = 0.02 and P < 0.001; for women and men, respectively). Men significantly overestimated PAEE by IPAQ (56.2 vs 45.3 kJ/kg/day, IPAQ: Acc+HR, P < .01), while the difference was nonsignificant for women (40.8 vs 44.4 kJ/kg/day). Bland-Altman plots showed that the IPAQ overestimated PAEE, moderate, and vigorous activity without systematic error. Reliability of the IPAQ was moderate to high for all domains and intensities (total PAEE intraclass correlation coefficient = 0.58). This Danish Internet-based version of the long IPAQ had modest validity and reliability when assessing PAEE at population level.
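Bland-Altman analysis, used above to compare the IPAQ with Acc+HR, summarizes agreement between two methods as a mean difference (bias) and 95% limits of agreement. A minimal sketch of those summary numbers follows; the function name and the paired values are hypothetical, not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences (a - b)
    limits = bias -/+ 1.96 * SD of the differences
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired energy-expenditure estimates (kJ/kg/day).
questionnaire = [10.0, 12.0, 14.0, 16.0]
monitor = [9.0, 12.0, 13.0, 16.0]
bias, lower, upper = bland_altman(questionnaire, monitor)
```

A positive bias means the first method reads higher on average; the plot itself places each pair's mean on the x-axis and its difference on the y-axis.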
Is the Parkinson Anxiety Scale comparable across raters?
Forjaz, Maria João; Ayala, Alba; Martinez-Martin, Pablo; Dujardin, Kathy; Pontone, Gregory M; Starkstein, Sergio E; Weintraub, Daniel; Leentjens, Albert F G
2015-04-01
The Parkinson Anxiety Scale is a new scale developed to measure anxiety severity in Parkinson's disease specifically. It consists of three dimensions: persistent anxiety, episodic anxiety, and avoidance behavior. This study aimed to assess the measurement properties of the scale while controlling for the rater (self- vs. clinician-rated) effect. The Parkinson Anxiety Scale was administered to a cross-sectional multicenter international sample of 362 Parkinson's disease patients. Both patients and clinicians rated the patient's anxiety independently. A many-facet Rasch model design was applied to estimate and remove the rater effect. The following measurement properties were assessed: fit to the Rasch model, unidimensionality, reliability, differential item functioning, item local independency, interrater reliability (self or clinician), and scale targeting. In addition, test-retest stability, construct validity, precision, and diagnostic properties of the Parkinson Anxiety Scale were also analyzed. A good fit to the Rasch model was obtained for Parkinson Anxiety Scale dimensions A and B, after the removal of one item and rescoring of the response scale for certain items, whereas dimension C showed marginal fit. Self versus clinician rating differences were of small magnitude, with patients reporting higher anxiety levels than clinicians. The linear measure for Parkinson Anxiety Scale dimensions A and B showed good convergent construct with other anxiety measures and good diagnostic properties. Parkinson Anxiety Scale modified dimensions A and B provide valid and reliable measures of anxiety in Parkinson's disease that are comparable across raters. Further studies are needed with dimension C. © 2014 International Parkinson and Movement Disorder Society.
García-Tornel Florensa, S; García García, J J; Reuter, J; Clow, C; Reuter, L
1996-05-01
The purpose of this dissertation research was to design, standardize and validate the Spanish version of the Kent Infant Development Scale (KIDS). This questionnaire is based on information obtained from the parents. It was translated into Spanish and named "Escala de Desarrollo Infantil de Kent" (EDIK). The EDIK normative data were collected from the parents of 662 healthy infants (ages 1 to 15 months) in pediatric clinics in Catalonia (Spain). Test-retest reliability (r = 0.99; p < 0.001), interjudge reliability (r = 0.98; p < 0.001) and internal consistency (Cronbach alpha = 0.9947) were determined. An "r" of 0.96 was obtained when EDIK scores were compared to their estimated developmental ages obtained from the Denver Developmental Scale. The correlation between the infants' chronological age and their EDIK scores was 0.96 (p < 0.001). The high reliability and validity correlation coefficients demonstrate the sound psychometric properties of the EDIK. It appears to be a useful and acceptable instrument for measuring the developmental status of infants by using the reports of their parents.
Karakuła-Juchnowicz, Hanna; Stecka, Mariola
2017-08-29
In view of the unavailability in Poland of standardized methods to measure PIQ, the aim of the work was to develop a Polish test to assess the premorbid level of intelligence - PART (Polish Adult Reading Test) and to measure its psychometric properties, such as validity, reliability as well as standardization in the group of schizophrenia patients. The principles of PART construction were based on the idea of the popular worldwide National Adult Reading Test by Hazel Nelson. The research comprised a group of 122 subjects (65 schizophrenia patients and 57 healthy people), aged 18-60 years, matched for age and gender. PART appears to be a method with high internal consistency and reliability measured by test-retest, inter-rater reliability, and a method with acceptable diagnostic and prognostic validity. The standardized procedures of PART have been investigated and described. Considering the psychometric values of PART and the short time of its performance, the test may be a useful diagnostic instrument in the assessment of premorbid level of intelligence in a group of schizophrenic patients.
The challenge of mapping between two medical coding systems.
Wojcik, Barbara E; Stein, Catherine R; Devore, Raymond B; Hassell, L Harrison
2006-11-01
Deployable medical systems patient conditions (PCs) designate groups of patients with similar medical conditions and, therefore, similar treatment requirements. PCs are used by the U.S. military to estimate field medical resources needed in combat operations. Information associated with each of the 389 PCs is based on subject matter expert opinion, instead of direct derivation from standard medical codes. Currently, no mechanisms exist to tie current or historical medical data to PCs. Our study objective was to determine whether reliable conversion between PC codes and International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) diagnosis codes is possible. Data were analyzed for three professional coders assigning all applicable ICD-9-CM diagnosis codes to each PC code. Inter-rater reliability was measured by using Cohen's kappa statistic and percent agreement. Methods were developed to calculate kappa statistics when multiple responses could be selected from many possible categories. Overall, we found moderate support for the possibility of reliable conversion between PCs and ICD-9-CM diagnoses (mean kappa = 0.61). Current PCs should be modified into a system that is verifiable with real data.
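For orientation, the standard two-rater, single-label form of Cohen's kappa and percent agreement can be sketched as below. This is not the multiple-response extension the authors developed (the abstract does not describe it); the function names and labels are hypothetical.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Share of items on which the two raters chose the same category."""
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)


def cohen_kappa(rater1, rater2):
    """Cohen's kappa: agreement corrected for chance, (po - pe) / (1 - pe)."""
    n = len(rater1)
    po = percent_agreement(rater1, rater2)
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = counts1.keys() | counts2.keys()
    # Chance agreement from each rater's marginal category frequencies.
    pe = sum(counts1[c] * counts2[c] for c in categories) / n**2
    return (po - pe) / (1 - pe)


# Hypothetical codes assigned by two coders to four cases.
coder_a = ["a", "a", "b", "b"]
coder_b = ["a", "a", "b", "a"]
print(percent_agreement(coder_a, coder_b))  # 0.75
print(cohen_kappa(coder_a, coder_b))        # 0.5
```

The gap between 0.75 and 0.5 in the toy example shows why kappa is usually reported alongside raw agreement.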
An ABC estimate of pedigree error rate: application in dog, sheep and cattle breeds.
Leroy, G; Danchin-Burge, C; Palhiere, I; Baumung, R; Fritz, S; Mériaux, J C; Gautier, M
2012-06-01
On the basis of correlations between pairwise individual genealogical kinship coefficients and allele sharing distances computed from genotyping data, we propose an approximate Bayesian computation (ABC) approach to assess pedigree file reliability through gene-dropping simulations. We explore the features of the method using simulated data sets and show precision increases with the number of markers. An application is further made with five dog breeds, four sheep breeds and one cattle breed raised in France and displaying various characteristics and population sizes, using microsatellite or SNP markers. Depending on the breeds, pedigree error estimations range between 1% and 9% in dog breeds, 1% and 10% in sheep breeds and 4% in cattle breeds. © 2011 The Authors, Animal Genetics © 2011 Stichting International Foundation for Animal Genetics.
A brief review on key technologies in the battery management system of electric vehicles
NASA Astrophysics Data System (ADS)
Liu, Kailong; Li, Kang; Peng, Qiao; Zhang, Cheng
2018-04-01
Batteries have been widely applied in many high-power applications, such as electric vehicles (EVs) and hybrid electric vehicles, where a suitable battery management system (BMS) is vital in ensuring safe and reliable operation of batteries. This paper aims to give a brief review on several key technologies of BMS, including battery modelling, state estimation and battery charging. First, popular battery types used in EVs are surveyed, followed by the introduction of key technologies used in BMS. Various battery models, including the electric model, thermal model and coupled electro-thermal model are reviewed. Then, battery state estimations for the state of charge, state of health and internal temperature are comprehensively surveyed. Finally, several key and traditional battery charging approaches with associated optimization methods are discussed.
Some comments on mapping from disease-specific to generic health-related quality-of-life scales.
Palta, Mari
2013-01-01
An article by Lu et al. in this issue of Value in Health addresses the mapping of treatment or group differences in disease-specific measures (DSMs) of health-related quality of life onto differences in generic health-related quality-of-life scores, with special emphasis on how the mapping is affected by the reliability of the DSM. In the proposed mapping, a factor analytic model defines a conversion factor between the scores as the ratio of factor loadings. Hence, the mapping applies to convert true underlying scales and has desirable properties facilitating the alignment of instruments and understanding their relationship in a coherent manner. It is important to note, however, that when DSM means or differences in mean DSMs are estimated, their mapping is still of a measurement error-prone predictor, and the correct conversion coefficient is the true mapping multiplied by the reliability of the DSM in the relevant sample. In addition, the proposed strategy for estimating the factor analytic mapping in practice requires assumptions that may not hold. We discuss these assumptions and how they may be the reason we obtain disparate estimates of the mapping factor in an application of the proposed methods to groups of patients. Copyright © 2013 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
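The point that the usable conversion coefficient equals the true mapping multiplied by the reliability of the DSM is the classical attenuation result for a predictor measured with error. A minimal sketch under the usual assumption that the error is uncorrelated with the true score; the function names and the numbers are hypothetical.

```python
def reliability(var_true, var_error):
    """Reliability as the share of observed variance that is true-score variance."""
    return var_true / (var_true + var_error)


def attenuated_slope(true_slope, var_true, var_error):
    """Slope obtained when regressing on the error-prone observed predictor.

    With error uncorrelated with the true score:
    observed slope = true slope * reliability.
    """
    return true_slope * reliability(var_true, var_error)


# Hypothetical values: true mapping 2.0, true variance 3.0, error variance 1.0.
print(reliability(3.0, 1.0))            # 0.75
print(attenuated_slope(2.0, 3.0, 1.0))  # 1.5
```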
Weak data do not make a free lunch, only a cheap meal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Zhipu; Rajashankar, Kanagalaghatta; Dauter, Zbigniew
2014-01-17
Four data sets were processed at resolutions significantly exceeding the criteria traditionally used for estimating the diffraction data resolution limit. The analysis of these data and the corresponding model-quality indicators suggests that the criteria of resolution limits widely adopted in the past may be somewhat conservative. Various parameters, such as Rmerge and I/σ(I), optical resolution and the correlation coefficients CC1/2 and CC*, can be used for judging the internal data quality, whereas the reliability factors R and Rfree as well as the maximum-likelihood target values and real-space map correlation coefficients can be used to estimate the agreement between the data and the refined model. However, none of these criteria provide a reliable estimate of the data resolution cutoff limit. The analysis suggests that extension of the maximum resolution by about 0.2 Å beyond the currently adopted limit where the I/σ(I) value drops to 2.0 does not degrade the quality of the refined structural models, but may sometimes be advantageous. Such an extension may be particularly beneficial for significantly anisotropic diffraction. Extension of the maximum resolution at the stage of data collection and structure refinement is cheap in terms of the required effort and is definitely more advisable than accepting a too conservative resolution cutoff, which is unfortunately quite frequent among the crystal structures deposited in the Protein Data Bank.
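The abstract names CC1/2 and CC* without defining them. In the crystallographic literature CC* is commonly derived from the half-data-set correlation CC1/2 as CC* = sqrt(2·CC1/2 / (1 + CC1/2)); that relation is not stated in the abstract, so the sketch below rests on it as an assumption, and the function name is illustrative.

```python
import math


def cc_star(cc_half):
    """Estimate CC* from CC1/2 using CC* = sqrt(2 * CC1/2 / (1 + CC1/2))."""
    if not 0.0 <= cc_half <= 1.0:
        raise ValueError("cc_half is expected to lie in [0, 1]")
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


# CC* stays well above CC1/2 in weak outer shells.
for cc_half in (0.1, 0.5, 0.9):
    print(cc_half, round(cc_star(cc_half), 3))
```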
Slootweg, Irene A.; Lombarts, Kiki M. J. M. H.; Boerebach, Benjamin C. M.; Heineman, Maas Jan; Scherpbier, Albert J. J. A.; van der Vleuten, Cees P. M.
2014-01-01
Background: Teamwork between clinical teachers is a challenge in postgraduate medical training. Although there are several instruments available for measuring teamwork in health care, none of them are appropriate for teaching teams. The aim of this study is to develop an instrument (TeamQ) for measuring teamwork, to investigate its psychometric properties and to explore how clinical teachers assess their teamwork. Method: To select the items to be included in the TeamQ questionnaire, we conducted a content validation in 2011, using a Delphi procedure in which 40 experts were invited. Next, for pilot testing the preliminary tool, 1446 clinical teachers from 116 teaching teams were requested to complete the TeamQ questionnaire. For data analyses we used statistical strategies: principal component analysis, internal consistency reliability coefficient, and the number of evaluations needed to obtain reliable estimates. Lastly, the median TeamQ scores were calculated for teams to explore the levels of teamwork. Results: In total, 31 experts participated in the Delphi study. In total, 114 teams participated in the TeamQ pilot. The median team response was 7 evaluations per team. The principal component analysis revealed 11 factors; 8 were included. The reliability coefficients of the TeamQ scales ranged from 0.75 to 0.93. The generalizability analysis revealed that 5 to 7 evaluations were needed to obtain internal reliability coefficients of 0.70. In terms of teamwork, the clinical teachers scored residents' empowerment as the highest TeamQ scale and feedback culture as the area that would most benefit from improvement. Conclusions: This study provides initial evidence of the validity of an instrument for measuring teamwork in teaching teams. The high response rates and the low number of evaluations needed for reliably measuring teamwork indicate that TeamQ is feasible for use by teaching teams. Future research could explore the effectiveness of feedback on teamwork in follow up measurements. PMID:25393006
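The "number of evaluations needed" reported in this abstract comes from a generalizability analysis. As a rough illustration of the underlying idea, the sketch below solves the Spearman-Brown relation for the number of evaluations k that lifts a single-evaluation reliability to a target value. This is a simplification and not the authors' analysis; the single-evaluation reliabilities used are hypothetical and the function name is an assumption.

```python
import math


def evaluations_needed(single_reliability, target=0.70):
    """Smallest k with k*r / (1 + (k - 1)*r) >= target (Spearman-Brown solved for k)."""
    if not (0.0 < single_reliability < 1.0 and 0.0 < target < 1.0):
        raise ValueError("both reliabilities must lie strictly between 0 and 1")
    k = target * (1.0 - single_reliability) / (single_reliability * (1.0 - target))
    # Small tolerance so floating-point noise does not push an exact
    # integer result up to the next whole evaluation.
    return math.ceil(k - 1e-9)


# Hypothetical single-evaluation reliabilities, not values from the study.
for r in (0.25, 0.30, 0.32):
    print(f"single-evaluation reliability {r:.2f} -> {evaluations_needed(r)} evaluations")
```

Under this simplified model, lower single-evaluation reliability drives the required number of evaluations up quickly, which is why the count is reported per scale.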
Zhu, Junya; Li, Liping; Zhao, Hailei; Han, Guangshu; Wu, Albert W; Weingart, Saul N
2014-10-01
Existing patient safety climate instruments, most of which have been developed in the USA, may not accurately reflect the conditions in the healthcare systems of other countries. To develop and evaluate a patient safety climate instrument for healthcare workers in Chinese hospitals. Based on a review of existing instruments, expert panel review, focus groups and cognitive interviews, we developed items relevant to patient safety climate in Chinese hospitals. The draft instrument was distributed to 1700 hospital workers from 54 units in six hospitals in five Chinese cities between July and October 2011, and 1464 completed surveys were received. We performed exploratory and confirmatory factor analyses and estimated internal consistency reliability, within-unit agreement, between-unit variation, unit-mean reliability, correlation between multi-item composites, and association between the composites and two single items of perceived safety. The final instrument included 34 items organised into nine composites: institutional commitment to safety, unit management support for safety, organisational learning, safety system, adequacy of safety arrangements, error reporting, communication and peer support, teamwork and staffing. All composites had acceptable unit-mean reliabilities (≥0.74) and within-unit agreement (Rwg ≥0.71), and exhibited significant between-unit variation with intraclass correlation coefficients ranging from 9% to 21%. Internal consistency reliabilities ranged from 0.59 to 0.88 and were ≥0.70 for eight of the nine composites. Correlations between composites ranged from 0.27 to 0.73. All composites were positively and significantly associated with the two perceived safety items. The Chinese Hospital Survey on Patient Safety Climate demonstrates adequate dimensionality, reliability and validity. The integration of qualitative and quantitative methods is essential to produce an instrument that is culturally appropriate for Chinese hospitals. 
Published by the BMJ Publishing Group Limited.
Iranian Health Literacy Questionnaire (IHLQ): An Instrument for Measuring Health Literacy in Iran.
Haghdoost, Ali Akbar; Rakhshani, Fatemeh; Aarabi, Mohsen; Montazeri, Ali; Tavousi, Mahmoud; Solimanian, Atoosa; Sarbandi, Fatemeh; Namdar, Hosein; Iranpour, Abedin
2015-06-01
Promoting Health Literacy (HL) is considered an important goal in the strategic plans of many countries. Despite the need for valid, reliable and native HL instruments, such instruments are scarce in the Persian language. Moreover, there is no good estimate of HL status in Iran. The aim of this study was to provide a valid, reliable and native instrument to measure and monitor community HL in Iran and also to provide an estimate of HL status in two Iranian provinces. Using multistage cluster sampling, 1080 respondents (540 of each gender) were recruited from the Kerman and Mazandaran provinces of Iran from February to June 2014 to participate in this cross-sectional study. The development of the Iranian Health Literacy Questionnaire (IHLQ) was initiated with a comprehensive review of the literature. Then, face, content and construct validity as well as reliability were determined. Internal consistency and test-retest reliability (ICC) of the factors were in the range of 0.71 to 0.96 and 0.73 to 0.86, respectively. To assess construct validity, Exploratory Factor Analysis (EFA) with varimax rotation was used (Kaiser-Meyer-Olkin (KMO) = 0.95 and Bartlett's test result of 3.017 with P < 0.001). An optimal reduced solution, including 36 items and seven factors, was found in EFA. Five of the factors identified were reading/comprehension skills, individual empowerment, communication/decision-making skills, social empowerment and health knowledge. It was concluded that the IHLQ might be a practical and useful tool for investigating HL among Persian language speakers around the world. Since HL is dynamic and its instruments should be regularly revised, further studies are recommended to assess HL with the IHLQ to detect its potential imperfections.
Cancela Carral, José María; Lago Ballesteros, Joaquín; Ayán Pérez, Carlos; Mosquera Morono, María Belén
2016-01-01
To analyse the reliability and validity of the Weekly Activity Checklist (WAC), the One Week Recall (OWR), and the Godin-Shephard Leisure Time Exercise Questionnaire (GLTEQ) in Spanish adolescents. A total of 78 adolescents wore a pedometer for one week, filled out the questionnaires at the end of this period and underwent a test to estimate their maximal oxygen consumption (VO2max). The reliability of the questionnaires was determined by means of a factor analysis. Convergent validity was obtained by comparing the questionnaires' scores against the amount of physical activity quantified by the pedometer and the VO2max reported. The questionnaires showed a weak internal consistency (WAC: α=0.59-0.78; OWR: α=0.53-0.73; GLTEQ: α=0.60). Moderate statistically significant correlations were found between the pedometer and the WAC (r=0.69; p <0.01) and the OWR (r=0.42; p <0.01), while a low statistically significant correlation was found for the GLTEQ (r=0.36; p=0.01). The estimated VO2max showed a low level of association with the WAC results (r=0.30; p <0.05), and the OWR results (r=0.29; p <0.05). When classifying the participants as active or inactive, the level of agreement with the pedometer was moderate for the WAC (k=0.46) and the OWR (r=0.44), and slight for the GLTEQ (r=0.20). Of the three questionnaires analysed, the WAC showed the best psychometric performance as it was the only one with respectable convergent validity, while sharing low reliability with the OWR and the GLTEQ. Copyright © 2016 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
Milanović, Zoran; Pantelić, Saša; Trajković, Nebojša; Jorgić, Bojan; Sporiš, Goran; Bratić, Milovan
2014-01-01
The purpose of this study was to determine the test-retest reliability of the International Physical Activity Questionnaire (IPAQ) for older adults in Serbia. Six hundred and sixty older adults (352 men, 53%; 308 women, 47%; mean age 67.65±5.76 years) participated in the study. To examine test-retest reliability, the participants were asked to complete the IPAQ on two occasions 2 weeks apart. Moderate reliability was observed between the repeated IPAQ, with intraclass correlation coefficients ranging from 0.53 to 0.91. The least reliability was established in leisure time activity (0.53) and the most reliability in the transport domain (0.91). Men and women had similar intraclass correlation coefficients for total physical activity (0.71 versus 0.74, respectively), while the biggest difference was obtained for housework in men (0.68) and in women (0.90). Our study shows that the long version of the IPAQ is a reliable instrument for assessing physical activity levels in older adults and that it may be useful for generating internationally comparable data.
How Many Sleep Diary Entries Are Needed to Reliably Estimate Adolescent Sleep?
Arora, Teresa; Gradisar, Michael; Taheri, Shahrad; Carskadon, Mary A.
2017-01-01
Study Objectives: To investigate (1) how many nights of sleep diary entries are required for reliable estimates of five sleep-related outcomes (bedtime, wake time, sleep onset latency [SOL], sleep duration, and wake after sleep onset [WASO]) and (2) the test–retest reliability of sleep diary estimates of school night sleep across 12 weeks. Methods: Data were drawn from four adolescent samples (Australia [n = 385], Qatar [n = 245], United Kingdom [n = 770], and United States [n = 366]), who provided 1766 eligible sleep diary weeks for reliability analyses. We performed reliability analyses for each cohort using complete data (7 days), one to five school nights, and one to two weekend nights. We also performed test–retest reliability analyses on 12-week sleep diary data available from a subgroup of 55 US adolescents. Results: Intraclass correlation coefficients for bedtime, SOL, and sleep duration indicated good-to-excellent reliability from five weekday nights of sleep diary entries across all adolescent cohorts. Four school nights were sufficient for wake times in the Australian and UK samples, but not the US or Qatari samples. Only Australian adolescents showed good reliability for two weekend nights of bedtime reports; estimates of SOL were adequate for UK adolescents based on two weekend nights. WASO was not reliably estimated using 1 week of sleep diaries. We observed excellent test–retest reliability across 12 weeks of sleep diary data in a subsample of US adolescents. Conclusion: We recommend at least five weekday nights of sleep diary entries when studying adolescent bedtimes, SOL, and sleep duration. Adolescent sleep patterns were stable across 12 consecutive school weeks. PMID:28199718
How Many Sleep Diary Entries Are Needed to Reliably Estimate Adolescent Sleep?
Short, Michelle A; Arora, Teresa; Gradisar, Michael; Taheri, Shahrad; Carskadon, Mary A
2017-03-01
To investigate (1) how many nights of sleep diary entries are required for reliable estimates of five sleep-related outcomes (bedtime, wake time, sleep onset latency [SOL], sleep duration, and wake after sleep onset [WASO]) and (2) the test-retest reliability of sleep diary estimates of school night sleep across 12 weeks. Data were drawn from four adolescent samples (Australia [n = 385], Qatar [n = 245], United Kingdom [n = 770], and United States [n = 366]), who provided 1766 eligible sleep diary weeks for reliability analyses. We performed reliability analyses for each cohort using complete data (7 days), one to five school nights, and one to two weekend nights. We also performed test-retest reliability analyses on 12-week sleep diary data available from a subgroup of 55 US adolescents. Intraclass correlation coefficients for bedtime, SOL, and sleep duration indicated good-to-excellent reliability from five weekday nights of sleep diary entries across all adolescent cohorts. Four school nights were sufficient for wake times in the Australian and UK samples, but not the US or Qatari samples. Only Australian adolescents showed good reliability for two weekend nights of bedtime reports; estimates of SOL were adequate for UK adolescents based on two weekend nights. WASO was not reliably estimated using 1 week of sleep diaries. We observed excellent test-retest reliability across 12 weeks of sleep diary data in a subsample of US adolescents. We recommend at least five weekday nights of sleep diary entries when studying adolescent bedtimes, SOL, and sleep duration. Adolescent sleep patterns were stable across 12 consecutive school weeks. © Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society.
Psychometric properties of the Thai Spiritual Well-Being Scale.
Chaiviboontham, Suchira; Phinitkhajorndech, Noppawan; Hanucharurnkul, Somchit; Noipiang, Thaniya
2016-04-01
The purpose of this study was to investigate the psychometric properties of the modified Thai Spiritual Well-Being Scale in patients with advanced cancer. A cross-sectional design was employed. Some 196 participants from three tertiary hospitals in Bangkok and suburban Thailand were asked to complete a Personal Information Questionnaire (PIQ), the Memorial Symptom Assessment Scale (MSAS), and the Spiritual Well-Being Scale (SWBS). Validity was determined by known-group, concurrent, and construct validity. Reliability was estimated as internal consistency using Cronbach's α coefficients. Three factors were extracted (existential well-being, religious well-being, and peacefulness), which together accounted for 71.44% of total variance. The Cronbach's α coefficients for total SWB, EWB, RWB, and peacefulness were 0.96, 0.94, and 0.93, respectively. These findings indicate that the Thai SWBS is a valid and reliable instrument, and that it has one more factor than the original version.
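Several records in this listing report internal consistency as Cronbach's α. For reference, a minimal sketch of the standard computation, α = k/(k − 1) · (1 − Σ item variances / variance of total scores), is given below. The response data are invented for illustration and the function name is an assumption; nothing here reproduces any study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in rows):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
```

Because α depends on the number of items as well as their intercorrelations, subscales with few items can show lower values even when the items are reasonably related.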
Measuring awareness of financial skills: reliability and validity of a new measure.
Cramer, K; Tuokko, H A; Mateer, C A; Hultsch, D F
2004-03-01
This paper examines the psychometric properties of a three-part (participant, informant, and performance) Measure for assessing Awareness of Financial Skills (MAFS). The MAFS was administered to 10 seniors with dementia and 25 well-functioning seniors, and their informants. Measures of cognitive functioning, social desirability, neuroticism, and perceived control were administered to each participant to allow for an assessment of validity. Internal consistency estimates for the participant and informant questionnaires were found to be 0.92 and 0.97, respectively. Convergent validity analysis indicated that performance on this measure was related to level of cognitive functioning, with higher level of unawareness associated with decreased cognitive ability. Discriminant validity analysis showed that performance on this measure was not related to social desirability or neuroticism. This study provides evidence that the MAFS is a reliable and valid tool for assessing awareness of financial skills in older adults.
Powell-Young, Yolanda M; Spruill, Ida J
2011-12-01
The purpose of this investigation was to examine the reliability and factor structure of the Harter Self-Perception Profile for Adolescents (SPPA) with African-Americans. While the SPPA has demonstrated strong psychometric properties with European-Americans, limited information exists with African-Americans. Three hundred and ten (N = 310) female adolescents, from 14 through 18 years of age, completed the SPPA. Estimations of internal consistency reliability with Cronbach's alpha (alpha), item suitability with Pearson (gamma) correlations, and evaluation of factor structure fit utilizing principal axis extraction with oblimin (oblique) rotation were conducted. When compared with Harter's normative data, psychometric properties of the SPPA varied significantly with the current sample. Findings suggested cautious interpretation of data generated with demographically similar cohorts. Further study is warranted to ascertain the factor structure that is most relevant for use with African-American adolescents.
Service quality, satisfaction, and behavioral intention in home delivered meals program
Joung, Hyun-Woo; Yuan, Jingxue Jessica; Huffman, Lynn
2011-01-01
This study was conducted to evaluate recipients' perception of service quality, satisfaction, and behavioral intention in home delivered meals program in the US. Out of 398 questionnaires, 265 (66.6%) were collected, and 209 questionnaires (52.5%) were used for the statistical analysis. A Confirmatory Factor Analysis (CFA) with a maximum likelihood was first conducted to estimate the measurement model by verifying the underlying structure of constructs. The level of internal consistency in each construct was acceptable, with Cronbach's alpha estimates ranging from 0.7 to 0.94. All of the composite reliabilities of the constructs were over the cutoff value of 0.50, ensuring adequate internal consistency of multiple items for each construct. As a second step, a Meals-On-Wheels (MOW) recipient perception model was estimated. The model's fit as indicated by these indexes was satisfactory and path coefficients were analyzed. Two paths between (1) volunteer issues and behavioral intention and (2) responsiveness and behavioral intention were not significant. The path for predicting a positive relationship between food quality and satisfaction was supported. The results show that having high food quality may create recipient satisfaction. The findings suggest that food quality and responsiveness are significant predictors of positive satisfaction. Moreover, satisfied recipients have positive behavioral intention toward MOW programs. PMID:21556231
Service quality, satisfaction, and behavioral intention in home delivered meals program.
Joung, Hyun-Woo; Kim, Hak-Seon; Yuan, Jingxue Jessica; Huffman, Lynn
2011-04-01
This study was conducted to evaluate recipients' perception of service quality, satisfaction, and behavioral intention in home delivered meals program in the US. Out of 398 questionnaires, 265 (66.6%) were collected, and 209 questionnaires (52.5%) were used for the statistical analysis. A Confirmatory Factor Analysis (CFA) with a maximum likelihood was first conducted to estimate the measurement model by verifying the underlying structure of constructs. The level of internal consistency in each construct was acceptable, with Cronbach's alpha estimates ranging from 0.7 to 0.94. All of the composite reliabilities of the constructs were over the cutoff value of 0.50, ensuring adequate internal consistency of multiple items for each construct. As a second step, a Meals-On-Wheels (MOW) recipient perception model was estimated. The model's fit as indicated by these indexes was satisfactory and path coefficients were analyzed. Two paths between (1) volunteer issues and behavioral intention and (2) responsiveness and behavioral intention were not significant. The path for predicting a positive relationship between food quality and satisfaction was supported. The results show that having high food quality may create recipient satisfaction. The findings suggest that food quality and responsiveness are significant predictors of positive satisfaction. Moreover, satisfied recipients have positive behavioral intention toward MOW programs.
Mosmuller, David G M; Mennes, Lisette M; Prahl, Charlotte; Kramer, Gem J C; Disse, Melissa A; van Couwelaar, Gijs M; Niessen, Frank B; Griot, J P W Don
2017-09-01
The development of the Cleft Aesthetic Rating Scale, a simple and reliable photographic reference scale for the assessment of nasolabial appearance in complete unilateral cleft lip and palate patients. A blind retrospective analysis of photographs of cleft lip and palate patients was performed with this new rating scale. VU Medical Center Amsterdam and the Academic Center for Dentistry of Amsterdam. Complete unilateral cleft lip and palate patients at the age of 6 years. Photographs that showed the highest interobserver agreement in earlier assessments were selected for the photographic reference scale. Rules were attached to the rating scale to provide a guideline for the assessment and improve interobserver reliability. Cropped photographs revealing only the nasolabial area were assessed by six observers using this new Cleft Aesthetic Rating Scale in two different sessions. Photographs of 62 children (6 years of age, 44 boys and 18 girls) were assessed. The interobserver reliability for the nose and lip together was 0.62, obtained with the intraclass correlation coefficient. To measure the internal consistency, a Cronbach alpha of .91 was calculated. The estimated reliability for three observers was .84, obtained with the Spearman Brown formula. A new, easy to use, and reliable scoring system with a photographic reference scale is presented in this study.
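The Spearman-Brown step mentioned in this abstract can be sketched as follows. The function name is an assumption, and the calculation uses only the rounded single-observer value printed in the abstract (0.62), so it gives roughly 0.83 rather than exactly the reported .84; the small difference may come from rounding or from the specific ICC form used, which the abstract does not state.

```python
def spearman_brown(single_reliability, k):
    """Predicted reliability of the mean of k raters, given single-rater reliability."""
    if k <= 0:
        raise ValueError("k must be positive")
    return k * single_reliability / (1.0 + (k - 1) * single_reliability)


# Single-observer ICC as printed in the abstract, stepped up to three observers.
print(f"three observers: {spearman_brown(0.62, 3):.3f}")
```

The same function also shows diminishing returns: each added observer raises the predicted reliability by less than the previous one.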
Skinner, Ian W; Hübscher, Markus; Moseley, G Lorimer; Lee, Hopin; Wand, Benedict M; Traeger, Adrian C; Gustin, Sylvia M; McAuley, James H
2017-08-15
Eyetracking is commonly used to investigate attentional bias. Although some studies have investigated the internal consistency of eyetracking, data are scarce on the test-retest reliability and agreement of eyetracking to investigate attentional bias. This study reports the test-retest reliability, measurement error, and internal consistency of 12 commonly used outcome measures thought to reflect the different components of attentional bias: overall attention, early attention, and late attention. Healthy participants completed a preferential-looking eyetracking task that involved the presentation of threatening (sensory words, general threat words, and affective words) and nonthreatening words. We used intraclass correlation coefficients (ICCs) to measure test-retest reliability (ICC > .70 indicates adequate reliability). The ICCs(2, 1) ranged from -.31 to .71. Reliability varied according to the outcome measure and threat word category. Sensory words had a lower mean ICC (.08) than either affective words (.32) or general threat words (.29). A longer exposure time was associated with higher test-retest reliability. All of the outcome measures, except second-run dwell time, demonstrated low measurement error (<6%). Most of the outcome measures reported high internal consistency (α > .93). Recommendations are discussed for improving the reliability of eyetracking tasks in future research.
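The abstract reports ICC(2, 1) values, which can be negative as well as positive. A minimal sketch of the two-way random-effects, absolute-agreement, single-measurement ICC in the Shrout and Fleiss formulation is shown below. The data are invented test-retest scores, and the function name is an assumption; this is not the authors' analysis code.

```python
def icc_2_1(scores):
    """ICC(2,1) for a subjects x sessions table (list of equal-length lists)."""
    n = len(scores)          # subjects
    k = len(scores[0])       # sessions (or raters)
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two sessions")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-subjects mean square
    msc = ss_cols / (k - 1)                 # between-sessions mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical dwell-time scores for 5 participants at test and retest.
test_retest = [
    [310, 295],
    [420, 450],
    [280, 300],
    [390, 360],
    [350, 355],
]
print(f"ICC(2,1) = {icc_2_1(test_retest):.3f}")
```

The estimate turns negative whenever the residual mean square exceeds the between-subjects mean square, which is how values such as those at the low end of the reported range can arise.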
Reliability Estimates for Undergraduate Grade Point Average
ERIC Educational Resources Information Center
Westrick, Paul A.
2017-01-01
Undergraduate grade point average (GPA) is a commonly employed measure in educational research, serving as a criterion or as a predictor depending on the research question. Over the decades, researchers have used a variety of reliability coefficients to estimate the reliability of undergraduate GPA, which suggests that there has been no consensus…
Reliability of Test Scores in Nonparametric Item Response Theory.
ERIC Educational Resources Information Center
Sijtsma, Klaas; Molenaar, Ivo W.
1987-01-01
Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four "classical" lower bounds to reliability. (Author/JAZ)
IRT-Estimated Reliability for Tests Containing Mixed Item Formats
ERIC Educational Resources Information Center
Shu, Lianghua; Schwarz, Richard D.
2014-01-01
As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…
Evaluation of Validity and Reliability for Hierarchical Scales Using Latent Variable Modeling
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2012-01-01
A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Ma; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-01-01
The Yale-Brown Obsessive-Compulsive Scale for children and adolescents (CY-BOCS) is a frequently applied test to assess obsessive-compulsive symptoms. We conducted a reliability generalization meta-analysis on the CY-BOCS to estimate the average reliability, search for reliability moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the CY-BOCS scores. A total of 47 studies reporting a reliability coefficient with the data at hand were included in the meta-analysis. The results showed good reliability and a large variability associated with the standard deviation of total scores and sample size.
International classification of reliability for implanted cochlear implant receiver stimulators.
Battmer, Rolf-Dieter; Backous, Douglas D; Balkany, Thomas J; Briggs, Robert J S; Gantz, Bruce J; van Hasselt, Andrew; Kim, Chong Sun; Kubo, Takeshi; Lenarz, Thomas; Pillsbury, Harold C; O'Donoghue, Gerard M
2010-10-01
To design an international standard to be used when reporting reliability of the implanted components of cochlear implant systems to appropriate governmental authorities, cochlear implant (CI) centers, and for journal editors in evaluating manuscripts involving cochlear implant reliability. The International Consensus Group for Cochlear Implant Reliability Reporting was assembled to unify ongoing efforts in the United States, Europe, Asia, and Australia to create a consistent and comprehensive classification system for the implanted components of CI systems across manufacturers. All members of the consensus group are from tertiary referral cochlear implant centers. None. A clinically relevant classification scheme adapted from principles of ISO standard 5841-2:2000 originally designed for reporting reliability of cardiac pacemakers, pulse generators, or leads. Standard definitions for device failure, survival time, clinical benefit, reduced clinical benefit, and specification were generated. Time intervals for reporting back to implant centers for devices tested to be "out of specification," categorization of explanted devices, the method of cumulative survival reporting, and content of reliability reports to be issued by manufacturers was agreed upon by all members. The methodology for calculating Cumulative survival was adapted from ISO standard 5841-2:2000. The International Consensus Group on Cochlear Implant Device Reliability Reporting recommends compliance to this new standard in reporting reliability of implanted CI components by all manufacturers of CIs and the adoption of this standard as a minimal reporting guideline for editors of journals publishing cochlear implant research results.
Mejia, Amanda F; Nebel, Mary Beth; Barber, Anita D; Choe, Ann S; Pekar, James J; Caffo, Brian S; Lindquist, Martin A
2018-05-15
Reliability of subject-level resting-state functional connectivity (FC) is determined in part by the statistical techniques employed in its estimation. Methods that pool information across subjects to inform estimation of subject-level effects (e.g., Bayesian approaches) have been shown to enhance reliability of subject-level FC. However, fully Bayesian approaches are computationally demanding, while empirical Bayesian approaches typically rely on using repeated measures to estimate the variance components in the model. Here, we avoid the need for repeated measures by proposing a novel measurement error model for FC describing the different sources of variance and error, which we use to perform empirical Bayes shrinkage of subject-level FC towards the group average. In addition, since the traditional intra-class correlation coefficient (ICC) is inappropriate for biased estimates, we propose a new reliability measure denoted the mean squared error intra-class correlation coefficient (ICC_MSE) to properly assess the reliability of the resulting (biased) estimates. We apply the proposed techniques to test-retest resting-state fMRI data on 461 subjects from the Human Connectome Project to estimate connectivity between 100 regions identified through independent components analysis (ICA). We consider both correlation and partial correlation as the measure of FC and assess the benefit of shrinkage for each measure, as well as the effects of scan duration. We find that shrinkage estimates of subject-level FC exhibit substantially greater reliability than traditional estimates across various scan durations, even for the most reliable connections and regardless of connectivity measure. Additionally, we find partial correlation reliability to be highly sensitive to the choice of penalty term, and to be generally worse than that of full correlations except for certain connections and a narrow range of penalty values. This suggests that the penalty needs to be chosen carefully when using partial correlations. Copyright © 2018. Published by Elsevier Inc.
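The idea of shrinking subject-level estimates toward a group average can be illustrated with a generic sketch. This is not the measurement error model proposed in the abstract above: it assumes the noise (within-subject) variance is already known, and the function name, the variance value and the sample estimates are all hypothetical.

```python
from statistics import mean, pvariance


def shrink_toward_group(subject_estimates, within_var):
    """Generic shrinkage of per-subject estimates toward the group mean.

    within_var is an ASSUMED known noise variance of each estimate; the
    abstract's model estimates its variance components differently.
    """
    group_mean = mean(subject_estimates)
    total_var = pvariance(subject_estimates)
    # Between-subject (signal) variance: observed spread minus noise, floored at 0.
    between_var = max(total_var - within_var, 0.0)
    denom = within_var + between_var
    # Shrinkage weight: share of the variance attributed to noise.
    lam = within_var / denom if denom > 0 else 1.0
    shrunk = [lam * group_mean + (1.0 - lam) * x for x in subject_estimates]
    return shrunk, lam


# Hypothetical connectivity estimates for four subjects.
estimates = [0.2, 0.4, 0.6, 0.8]
shrunk, lam = shrink_toward_group(estimates, within_var=0.02)
print(lam, shrunk)
```

The noisier the individual estimates are relative to the spread between subjects, the closer the weight gets to 1 and the more each estimate is pulled toward the group mean.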
Predictors of validity and reliability of a physical activity record in adolescents
2013-01-01
Background Poor to moderate validity of self-reported physical activity instruments is commonly observed in young people in low- and middle-income countries. However, the reasons for such low validity have not been examined in detail. We tested the validity of a self-administered daily physical activity record in adolescents and assessed if personal characteristics or the convenience level of reporting physical activity modified the validity estimates. Methods The study comprised a total of 302 adolescents from an urban and rural area in Ecuador. Validity was evaluated by comparing the record with accelerometer recordings for seven consecutive days. Test-retest reliability was examined by comparing registrations from two records administered three weeks apart. Time spent on sedentary (SED), low (LPA), moderate (MPA) and vigorous (VPA) intensity physical activity was estimated. Bland Altman plots were used to evaluate measurement agreement. We assessed if age, sex, urban or rural setting, anthropometry and convenience of completing the record explained differences in validity estimates using a linear mixed model. Results Although the record provided higher estimates for SED and VPA and lower estimates for LPA and MPA compared to the accelerometer, it showed an overall fair measurement agreement for validity. There was modest reliability for assessing physical activity in each intensity level. Validity was associated with adolescents’ personal characteristics: sex (SED: P = 0.007; LPA: P = 0.001; VPA: P = 0.009) and setting (LPA: P = 0.000; MPA: P = 0.047). Reliability was associated with the convenience of completing the physical activity record for LPA (low convenience: P = 0.014; high convenience: P = 0.045). Conclusions The physical activity record provided acceptable estimates for reliability and validity on a group level. 
Sex and setting were associated with validity estimates, whereas convenience to fill out the record was associated with better reliability estimates for LPA. This tendency of improved reliability estimates for adolescents reporting higher convenience merits further consideration. PMID:24289296
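As a rough illustration of the Bland-Altman agreement analysis mentioned in the abstract above, the sketch below computes the mean difference (bias) between two methods and the conventional 95% limits of agreement (bias ± 1.96 SD of the differences). The function name and the minutes-per-day values are hypothetical, not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical minutes per day from a self-report record and an accelerometer.
record = [10, 12, 14, 16]
accelerometer = [9, 12, 13, 14]
bias, lower, upper = bland_altman(record, accelerometer)
print(bias, lower, upper)
```

A positive bias here would mean the record reports more time than the accelerometer on average; the limits indicate how far individual differences are expected to range.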
Current status, uncertainty and future needs in soil organic carbon monitoring.
Jandl, Robert; Rodeghiero, Mirco; Martinez, Cristina; Cotrufo, M Francesca; Bampa, Francesca; van Wesemael, Bas; Harrison, Robert B; Guerrini, Iraê Amaral; Richter, Daniel Deb; Rustad, Lindsey; Lorenz, Klaus; Chabbi, Abad; Miglietta, Franco
2014-01-15
Increasing human demands on soil-derived ecosystem services require reliable data on global soil resources for sustainable development. The soil organic carbon (SOC) pool is a key indicator of soil quality as it affects essential biological, chemical and physical soil functions such as nutrient cycling, pesticide and water retention, and soil structure maintenance. However, information on the SOC pool and its temporal and spatial dynamics is unbalanced. Even in well-studied regions with a pronounced interest in environmental issues, information on soil carbon (C) is inconsistent. Several activities for the compilation of global soil C data are under way. However, different approaches for soil sampling and chemical analyses make even regional comparisons highly uncertain. Often, the procedures used so far have not allowed the reliable estimation of the total SOC pool, partly because the available knowledge is focused on not clearly defined upper soil horizons and the contribution of subsoil to SOC stocks has been less considered. Even more difficult is quantifying SOC pool changes over time. SOC consists of variable amounts of labile and recalcitrant molecules of plant, microbial and animal origin that are often operationally defined. A comprehensively active soil expert community needs to agree on protocols of soil surveying and lab procedures towards reliable SOC pool estimates. Already established long-term ecological research sites, where SOC changes are quantified and the underlying mechanisms are investigated, are potentially the backbones for regional, national, and international SOC monitoring programs. © 2013.
Development, reliability, and validity of the My Child's Play (MCP) questionnaire.
Schneider, Eleanor; Rosenblum, Sara
2014-01-01
This article describes the development, reliability, and validity of My Child's Play (MCP), a parent questionnaire designed to evaluate the play of children ages 3-9 yr. The first phase of the study determined the questionnaire's content and face validity. Subsequently, the internal consistency reliability and construct and concurrent validity were demonstrated using 334 completed questionnaires. The MCP showed good internal consistency (α = .86). The factor analysis revealed four distinct factors with acceptable levels of internal reliability (Cronbach's αs = .63-.81) and gender- and age-related differences in play characteristics; both findings attest to the tool's construct validity. Significant correlations (r = .33, p < .0001) with the Parent as a Teacher Inventory demonstrate the MCP's concurrent validity. The MCP demonstrated acceptable reliability and validity. It appears to be a promising standardized assessment tool for use in research and practice to promote understanding of a child's play. Copyright © 2014 by the American Occupational Therapy Association, Inc.
NASA Astrophysics Data System (ADS)
Liu, Yiming; Shi, Yimin; Bai, Xuchao; Zhan, Pei
2018-01-01
In this paper, we study the estimation of the reliability of a multicomponent system, named the N-M-cold-standby redundancy system, based on a progressive Type-II censoring sample. In the system, there are N subsystems consisting of M statistically independent distributed strength components, and only one of these subsystems works under the impact of stresses at a time and the others remain as standbys. Whenever the working subsystem fails, one from the standbys takes its place. The system fails when all the subsystems fail. It is supposed that the underlying distributions of random strength and stress both belong to the generalized half-logistic distribution with different shape parameters. The reliability of the system is estimated by using both classical and Bayesian statistical inference. Uniformly minimum variance unbiased estimator and maximum likelihood estimator for the reliability of the system are derived. Under squared error loss function, the exact expression of the Bayes estimator for the reliability of the system is developed by using the Gauss hypergeometric function. The asymptotic confidence interval and corresponding coverage probabilities are derived based on both the Fisher and the observed information matrices. The approximate highest probability density credible interval is constructed by using Monte Carlo method. Monte Carlo simulations are performed to compare the performances of the proposed reliability estimators. A real data set is also analyzed for an illustration of the findings.
Assuring long-term reliability of concentrator PV systems
NASA Astrophysics Data System (ADS)
McConnell, R.; Garboushian, V.; Brown, J.; Crawford, C.; Darban, K.; Dutra, D.; Geer, S.; Ghassemian, V.; Gordon, R.; Kinsey, G.; Stone, K.; Turner, G.
2009-08-01
Concentrator PV (CPV) systems have attracted significant interest because these systems incorporate the world's highest efficiency solar cells and they are targeting the lowest cost production of solar electricity for the world's utility markets. Because these systems are just entering solar markets, manufacturers and customers need to assure their reliability for many years of operation. There are three general approaches for assuring CPV reliability: 1) field testing and development over many years leading to improved product designs, 2) testing to internationally accepted qualification standards (especially for new products) and 3) extended reliability tests to identify critical weaknesses in a new component or design. Amonix has been a pioneer in all three of these approaches. Amonix has an internal library of field failure data spanning over 15 years that serves as the basis for its seven generations of CPV systems. An Amonix product served as the test CPV module for the development of the world's first qualification standard completed in March 2001. Amonix staff has served on international standards development committees, such as the International Electrotechnical Commission (IEC), in support of developing CPV standards needed in today's rapidly expanding solar markets. Recently Amonix employed extended reliability test procedures to assure reliability of multijunction solar cell operation in its seventh generation high concentration PV system. This paper will discuss how these three approaches have all contributed to assuring reliability of the Amonix systems.
Bowman, Gene L.; Shannon, Jackilen; Ho, Emily; Traber, Maret G.; Frei, Balz; Oken, Barry S.; Kaye, Jeffery A.; Quinn, Joseph F.
2010-01-01
Introduction There is great interest in nutritional strategies for the prevention of age-related cognitive decline, yet the best methods for nutritional assessment in populations at risk for dementia are still evolving. Our study objective was to test the reliability and validity of two common nutritional assessments (plasma nutrient biomarkers and Food Frequency Questionnaire) in people at risk for dementia. Methods Thirty-eight elders, half with amnestic Mild Cognitive Impairment and half with intact cognition were recruited. Nutritional assessments were collected together at baseline and again at 1 month. Intraclass and Pearson correlation coefficients quantified reliability and validity. Results Twenty-six nutrients were examined and reliability was very good or better for 77% (20/26, ICC ≥ .75) of the plasma nutrient biomarkers and for 88% of the FFQ estimates. Twelve of the plasma nutrient estimates were as reliable as the commonly measured plasma cholesterol (ICC = .92). FFQ and plasma long-chain fatty acids (docosahexaenoic acid, r = .39, eicosapentaenoic acid, r = .39) and carotenoids (α-carotene, r = .49; lutein + zeaxanthin, r = .48; β-carotene, r = .43; β-cryptoxanthin, r = .41) were correlated, but no other FFQ estimates correlated with respective nutrient biomarkers. Correlations between FFQ and plasma fatty acids and carotenoids were significantly stronger after removing subjects with MCI. Conclusion The reliability and validity of plasma and FFQ nutrient estimates vary according to the nutrient of interest. Memory deficit attenuates FFQ estimate validity and inflates FFQ estimate reliability. Many plasma nutrient biomarkers have very good reliability over 1-month regardless of memory state. This method can circumvent sources of error seen in other less direct methods of nutritional assessment. PMID:20856100
Evaluating North American Electric Grid Reliability Using the Barabasi-Albert Network Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chassin, David P.; Posse, Christian
2005-09-15
The reliability of electric transmission systems is examined using a scale-free model of network topology and failure propagation. The topologies of the North American eastern and western electric grids are analyzed to estimate their reliability based on the Barabási-Albert network model. A commonly used power system reliability index is computed using a simple failure propagation model. The results are compared to the values of power system reliability indices previously obtained using other methods and they suggest that scale-free network models are usable to estimate aggregate electric grid reliability.
Evaluating North American Electric Grid Reliability Using the Barabasi-Albert Network Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chassin, David P.; Posse, Christian
2005-09-15
The reliability of electric transmission systems is examined using a scale-free model of network topology and failure propagation. The topologies of the North American eastern and western electric grids are analyzed to estimate their reliability based on the Barabasi-Albert network model. A commonly used power system reliability index is computed using a simple failure propagation model. The results are compared to the values of power system reliability indices previously obtained using standard power engineering methods, and they suggest that scale-free network models are usable to estimate aggregate electric grid reliability.
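For readers unfamiliar with the Barabási-Albert model named in the two entries above, the following is a minimal sketch of its preferential-attachment growth rule: each new node links to m existing nodes chosen with probability proportional to their current degree. It does not reproduce the failure propagation model or the reliability index from the report; the function names, the complete-graph seed and the parameter values are illustrative assumptions.

```python
import random


def barabasi_albert(n, m, seed=0):
    """Grow an n-node graph by preferential attachment (requires n > m >= 1)."""
    rng = random.Random(seed)
    edges = set()
    # Each node appears in the pool once per incident edge, so a uniform draw
    # from the pool picks a node with probability proportional to its degree.
    pool = []
    # Assumed starting point: a complete graph on the first m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            edges.add((i, j))
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:  # draw m distinct existing nodes
            chosen.add(rng.choice(pool))
        for target in chosen:
            edges.add((target, new))
            pool += [target, new]
    return edges


def degrees(edges):
    """Count how many edges touch each node."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


graph = barabasi_albert(50, 2, seed=1)
print(len(graph), max(degrees(graph).values()))
```

Because early, well-connected nodes keep attracting new links, the resulting degree distribution is heavy-tailed, which is the scale-free property the abstract relies on.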
Accurate determination of brain metabolite concentrations using ERETIC as external reference.
Zoelch, Niklaus; Hock, Andreas; Heinzer-Schweizer, Susanne; Avdievitch, Nikolai; Henning, Anke
2017-08-01
Magnetic Resonance Spectroscopy (MRS) can provide in vivo metabolite concentrations in standard concentration units if a reliable reference signal is available. For ¹H MRS in the human brain, typically the signal from the tissue water is used as the (internal) reference signal. However, a concentration determination based on the tissue water signal most often requires a reliable estimate of the water concentration present in the investigated tissue. Especially in clinically interesting cases, this estimation might be difficult. To avoid assumptions about the water in the investigated tissue, the Electric REference To access In vivo Concentrations (ERETIC) method has been proposed. In this approach, the metabolite signal is compared with a reference signal acquired in a phantom and potential coil-loading differences are corrected using a synthetic reference signal. The aim of this study, conducted with a transceiver quadrature head coil, was to increase the accuracy of the ERETIC method by correcting the influence of spatial B1 inhomogeneities and to simplify the quantification with ERETIC by incorporating an automatic phase correction for the ERETIC signal. Transmit field (B1+) differences are minimized with a volume-selective power optimization, whereas reception sensitivity changes are corrected using contrast-minimized images of the brain and by adapting the voxel location in the phantom measurement closely to the position measured in vivo. By applying the proposed B1 correction scheme, the mean metabolite concentrations determined with ERETIC in 21 healthy subjects at three different positions agree with concentrations derived with the tissue water signal as reference. In addition, brain water concentrations determined with ERETIC were in agreement with estimations derived using tissue segmentation and literature values for relative water densities. Based on the results, the ERETIC method presented here is a valid tool to derive in vivo metabolite concentration, with potential advantages compared with internal water referencing in diseased tissue. Copyright © 2017 John Wiley & Sons, Ltd.
Reliability of TMS phosphene threshold estimation: Toward a standardized protocol.
Mazzi, Chiara; Savazzi, Silvia; Abrahamyan, Arman; Ruzzoli, Manuela
Phosphenes induced by transcranial magnetic stimulation (TMS) are a subjectively described visual phenomenon employed in basic and clinical research as an index of the excitability of retinotopically organized areas in the brain. Phosphene threshold estimation is a preliminary step in many TMS experiments in visual cognition for setting the appropriate level of TMS doses; however, the lack of a direct comparison of the available methods for phosphene threshold estimation leaves unresolved the reliability of those methods in setting TMS doses. The present work aims to fill this gap. We compared the most common methods for phosphene threshold calculation, namely the Method of Constant Stimuli (MOCS), the Modified Binary Search (MOBS) and the Rapid Estimation of Phosphene Threshold (REPT). In two experiments we tested the reliability of PT estimation under each of the three methods, considering the day of administration, participants' expertise in phosphene perception and the sensitivity of each method to the initial values used for the threshold calculation. We found that MOCS and REPT have comparable reliability when estimating phosphene thresholds, while MOBS estimations appear less stable. Based on our results, researchers and clinicians can estimate phosphene threshold according to MOCS or REPT equally reliably, depending on their specific investigation goals. We suggest several important factors for consideration when calculating phosphene thresholds and describe strategies to adopt in experimental procedures. Copyright © 2017 Elsevier Inc. All rights reserved.
Systematic effects in LOD from SLR observations
NASA Astrophysics Data System (ADS)
Bloßfeld, Mathis; Gerstl, Michael; Hugentobler, Urs; Angermann, Detlef; Müller, Horst
2014-09-01
Besides the estimation of station coordinates and the Earth’s gravity field, laser ranging observations to near-Earth satellites can be used to determine the rotation of the Earth. One parameter of this rotation is ΔLOD (excess Length Of Day) which describes the excess revolution time of the Earth w.r.t. 86,400 s. Due to correlations among the different parameter groups, it is difficult to obtain reliable estimates for all parameters. In the official ΔLOD products of the International Earth Rotation and Reference Systems Service (IERS), the ΔLOD information determined from laser ranging observations is excluded from the processing. In this paper, we study the existing correlations between ΔLOD, the orbital node Ω, the even zonal gravity field coefficients, cross-track empirical accelerations and relativistic accelerations caused by the Lense-Thirring and deSitter effect in detail using first order Gaussian perturbation equations. We found discrepancies due to different a priori values when using different gravity field models of up to 1.0 ms for polar orbits at an altitude of 500 km and up to 40.0 ms, if the gravity field coefficients are estimated using only observations to LAGEOS 1. If observations to LAGEOS 2 are included, reliable ΔLOD estimates can be achieved. Nevertheless, an impact of the a priori gravity field even on the multi-satellite ΔLOD estimates can be clearly identified. Furthermore, we investigate the effect of empirical cross-track accelerations and the effect of relativistic accelerations of near-Earth satellites on ΔLOD. A total effect of 0.0088 ms is caused by unmodeled Lense-Thirring and deSitter terms. The partial derivatives of these accelerations w.r.t. the position and velocity of the satellite cause very small variations (0.1 μs) on ΔLOD.
Protracted exposure to fallout: the Rongelap and Utirik experience.
Lessard, E T; Miltenberger, R P; Cohn, S H; Musolino, S V; Conard, R A
1984-03-01
From June 1946 to August 1958, the U.S. Department of Defense and the U.S. Atomic Energy Commission (AEC) conducted nuclear weapons tests in the Northern Marshall Islands. On 1 March 1954, BRAVO, an above-ground test in the Castle series, produced high levels of radioactive material, some of which subsequently fell on Rongelap and Utirik Atolls due to an unexpected wind shift. On 3 March 1954, the inhabitants of these atolls were moved out of the affected area. They later returned to Utirik in June 1954 and to Rongelap in June 1957. Comprehensive environmental and personnel radiological monitoring programs were initiated in the mid 1950s by Brookhaven National Laboratory to ensure that body burdens of the exposed Marshallese subjects remained within AEC guidelines. Their body-burden histories and calculated activity ingestion rate patterns post-return are presented along with estimates of internal committed effective dose equivalents. External exposure data are also included. In addition, relationships between body burden or urine-activity concentration and declining continuous intake were developed. The implications of these studies are: (1) the dietary intake of 137Cs was a major component contributing to the committed effective dose equivalent for the years after the initial contamination of the atolls; (2) for persons whose diet included fish, 65Zn was a major component of committed effective dose equivalent during the first years post-return; (3) a decline in the daily activity ingestion rate greater than that resulting from radioactive decay of the source was estimated for 137Cs, 65Zn, 90Sr and 60Co; (4) the relative impact of each nuclide on the estimate of committed effective dose equivalent was dependent upon the time interval between initial contamination and rehabilitation; and (5) the internal committed effective dose equivalent exceeded the external dose equivalent by a factor of 1.1 at Utirik and 1.5 at Rongelap during the rehabitation period. 
Few reliable 239Pu measurements on human excreta were made. An analysis of the tentative data leads to the conclusion that a reliable estimate of committed effective dose equivalent requires further research.
Williams, R B
1999-08-01
A compartmentalised model is presented for the estimation of the monetary losses suffered by the world's poultry industry resulting from coccidiosis of chickens and costs of its control. The model is designed so that the major elements of loss may be separately quantified for any chicken-producing entity, e.g., a farm, a poultry company, a country, etc. Examples are presented and the sources, reliability and geographical relevance of the data used for each parameter are provided. Loss elements for specific geographical areas should be recalculated at appropriate intervals to take into account local and international fluctuations in costs of chicks, feed, labour, financial inflation and world currency exchange rates. Equations are given for relationships among numbers of chickens, liveweights, weights of carcasses, feed consumptions, feed conversion ratio (FCR), prices of feeds, prices of anticoccidial therapeutic and prophylactic drugs, values of chickens, chicken rearing costs; and effects of coccidiosis on mortality, weight gain and FCR. Using these equations, it is theoretically possible for an international team of representatives, each using reliable local data, to calculate simultaneously each relevant loss element for their respective countries. Addition of these elements could give, for the first time, an accurate global estimate of the losses due to chicken coccidiosis. The total cost of coccidiosis in chickens in the United Kingdom in 1995 is estimated to have been at least GB £38 588 795, of which 98.1% involved broilers (80.6% due to effects on mortality, weight gain and feed conversion, and 17.5% due to the cost of chemoprophylaxis and therapy). The costs of poor performance due to coccidiosis and its chemical control totalled 4.54% of the gross revenue from UK sales of live broilers. This model includes a new method for comparing the profitabilities of different treatments in commercial trials, providing actual costs rather than the arbitrary numerical scores of other methods. Although originally designed for the study of coccidiosis, the model is equally applicable to any disease. It should be of value to agricultural economists, the animal feed and poultry industries, animal health companies, and to research scientists (particularly for preparing grant applications).
Skull counting in late stages after internal contamination by actinides.
Tani, Kotaro; Shutt, Arron; Kurihara, Osamu; Kosako, Toshiso
2015-02-01
Monitoring preparation for internal contamination with actinides (e.g. Pu and Am) is required to assess internal doses at nuclear fuel cycle-related facilities. In this paper, the authors focus on skull counting in case of single-incident inhalation of ²⁴¹Am and propose an effective procedure for skull counting with an existing system, taking into account the biokinetic behaviour of ²⁴¹Am in the human body. The predicted response of the system to skull counting under a certain counting geometry was found to be only ∼1.0 × 10⁻⁵ cps Bq⁻¹ 1 y after intake. However, this disadvantage could be remedied by repeated measurements of the skull during the late stage of the intake due to the predicted response reaching a plateau at about the 1000th day after exposure and exceeding that in the lung counting. Further studies are needed for the development of a new detection system with higher sensitivity to perform reliable internal dose estimations based on direct measurements. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
A Latent Class Approach to Estimating Test-Score Reliability
ERIC Educational Resources Information Center
van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas
2011-01-01
This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
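Two of the single-administration coefficients named in the abstract above, Cronbach's alpha and Guttman's lambda-2, can be computed directly from the item covariances. The sketch below uses the standard textbook formulas; it does not implement method MS or the latent class reliability coefficient, and the function names and item scores are hypothetical.

```python
from statistics import mean


def covariance(x, y):
    """Sample covariance of two equally long score lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def cronbach_alpha(items):
    """items: list of k lists, each holding one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(covariance(it, it) for it in items)
    total_var = covariance(totals, totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


def guttman_lambda2(items):
    """Guttman's lambda-2, which is never smaller than alpha."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(covariance(it, it) for it in items)
    total_var = covariance(totals, totals)
    # Sum of squared covariances over all ordered pairs of distinct items.
    sq_offdiag = sum(
        covariance(items[i], items[j]) ** 2
        for i in range(k) for j in range(k) if i != j
    )
    return (total_var - item_var_sum + (k / (k - 1) * sq_offdiag) ** 0.5) / total_var


# Hypothetical scores: three items answered by five respondents.
scores = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [1, 3, 2, 5, 4]]
print(cronbach_alpha(scores), guttman_lambda2(scores))
```

Both functions take items as rows and respondents as columns, so a data set stored the other way round would need transposing first.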
ERIC Educational Resources Information Center
Gadermann, Anne M.; Guhn, Martin; Zumbo, Bruno D.
2012-01-01
This paper provides a conceptual, empirical, and practical guide for estimating ordinal reliability coefficients for ordinal item response data (also referred to as Likert, Likert-type, ordered categorical, or rating scale item responses). Conventionally, reliability coefficients, such as Cronbach's alpha, are calculated using a Pearson…
Kim, Kyoung-Eun; Lim, Jae-Young
2011-01-01
The Roland-Morris Disability Questionnaire (RMDQ) is a reliable tool for evaluating disability in patients with back pain, but no Korean version has been published and validated. We developed a cross-culturally adapted Korean version of the RMDQ (RMDQ-K) and validated its use for assessing disability in Korean patients with low back pain. Two hundred thirty-one patients with low back pain were assessed using the RMDQ-K, visual analog scale (VAS) during rest and activity, and the Oswestry Disability Index (ODI). The results of 40 patients were used to evaluate the test-retest reliability. The correlations of the RMDQ-K with the VAS and ODI were used to assess validity. The reliability of the RMDQ-K estimated using the internal consistency reached a Cronbach's alpha of 0.893. Test-retest trials showed a high intraclass correlation coefficient of 0.837 (95% CI 0.833-0.953). The RMDQ-K was significantly correlated with the ODI (r=0.738) and VAS during rest (r=0.450) and activity (r=0.412). This study demonstrates that the RMDQ-K is a reliable, valid instrument for measuring disability in Korean patients with low back pain.
Partnering to Establish and Study Simulation in International Nursing Education.
Garner, Shelby L; Killingsworth, Erin; Raj, Leena
The purpose of this article was to describe an international partnership to establish and study simulation in India. A pilot study was performed to determine interrater reliability among faculty new to simulation when evaluating nursing student competency performance. Interrater reliability was below the ideal agreement level. Findings in this study underscore the need to obtain baseline interrater reliability data before integrating competency evaluation into a simulation program.
Haddad, Mark; Waqas, Ahmed; Sukhera, Ahmed Bashir; Tarar, Asad Zaman
2017-07-27
Depression is a common mental health problem and a leading contributor to the global burden of disease. The attitudes and beliefs of the public and of health professionals influence social acceptance and affect the esteem and help-seeking of people experiencing mental health problems. The attitudes of clinicians are particularly relevant to their role in accurately recognising and providing appropriate support and management of depression. This study examines the characteristics of the revised depression attitude questionnaire (R-DAQ) with doctors working in healthcare settings in Lahore, Pakistan. A cross-sectional survey was conducted in 2015 using the revised depression attitude questionnaire (R-DAQ). A convenience sample of 700 medical practitioners based in six hospitals in Lahore was approached to participate in the survey. The R-DAQ structure was examined using Parallel Analysis from polychoric correlations. Unweighted least squares analysis (ULSA) was used for factor extraction. Model fit was estimated using goodness-of-fit indices and the root mean square of standardized residuals (RMSR), and internal consistency reliability for the overall scale and subscales was assessed using reliability estimates based on Mislevy and Bock (BILOG 3 Item analysis and test scoring with binary logistic models. Mooresville: Scientific Software, 55) and the McDonald's Omega statistic. Findings using this approach were compared with principal axis factor analysis based on Pearson correlation matrix. 601 (86%) of the doctors approached consented to participate in the study. Exploratory factor analysis of R-DAQ scale responses demonstrated the same 3-factor structure as in the UK development study, though analyses indicated removal of 7 of the 22 items because of weak loading or poor model fit. The 3 factor solution accounted for 49.8% of the common variance. 
Scale reliability and internal consistency were adequate: total scale standardised alpha was 0.694; subscale reliability for professional confidence was 0.732, therapeutic optimism/pessimism was 0.638, and generalist perspective was 0.769. The R-DAQ was developed with a predominantly UK-based sample of health professionals. This study indicates that this scale functions adequately and provides a valid measure of depression attitudes for medical practitioners in Pakistan, with the same factor structure as in the scale development sample. However, optimal scale function necessitated removal of several items, with a 15-item scale enabling the most parsimonious factor solution for this population.
Hydrological modelling improvements required in basins in the Hindukush-Karakoram-Himalayas region
NASA Astrophysics Data System (ADS)
Khan, Asif; Richards, Keith S.; McRobie, Allan; Booij, Martijn
2016-04-01
Millions of people rely on river water originating from basins in the Hindukush-Karakoram-Himalayas (HKH), where snow- and ice-melt are significant flow components. One such basin is the Upper Indus Basin (UIB), where snow- and ice-melt can contribute more than 80% of total flow. Containing some of the world's largest alpine glaciers, this basin may be highly susceptible to global warming and climate change, and reliable predictions of future water availability are vital for resource planning for downstream food and energy needs in a changing climate, but depend on significantly improved hydrological modelling. However, a critical assessment of available hydro-climatic data and hydrological modelling in the HKH region has identified five major failings in many published hydro-climatic studies, even those appearing in reputable international journals. The main weaknesses of these studies are: i) incorrect basin areas; ii) under-estimated precipitation; iii) incorrectly-defined glacier boundaries; iv) under-estimated snow-cover data; and v) use of biased melt factors for snow and ice during the summer months. This paper illustrates these limitations, which have either resulted in modelled flows being under-estimates of measured flows, leading to an implied severe water scarcity; or have led to the use of unrealistically high degree-day factors and over-estimates of glacier melt contributions, implying unrealistic melt rates. These effects vary amongst sub-basins. Forecasts obtained from these models cannot be used reliably in policy making or water resource development, and need revision. Detailed critical analysis and improvement of existing hydrological modelling may be equally necessary in other mountain regions across the world.
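The abstract above refers to degree-day (melt) factors without stating the model. A commonly used temperature-index formulation sets daily melt equal to the degree-day factor times the excess of mean air temperature over a threshold, and zero below the threshold. The sketch below shows that generic relationship only; the function name, the factor of 4 mm per °C per day, the 0 °C threshold and the temperatures are illustrative assumptions, not values from the paper.

```python
def degree_day_melt(daily_mean_temps_c, ddf_mm_per_degc_day, threshold_c=0.0):
    """Daily melt (mm water equivalent) from a simple temperature-index model."""
    return [
        ddf_mm_per_degc_day * max(temp - threshold_c, 0.0)
        for temp in daily_mean_temps_c
    ]


# Hypothetical daily mean air temperatures (degrees C) over four days.
temps = [-2.0, 0.0, 3.0, 5.0]
melt = degree_day_melt(temps, ddf_mm_per_degc_day=4.0)
print(melt, sum(melt))
```

Because melt scales linearly with the factor, a model that underestimates precipitation or snow cover can only match observed flows by inflating the factor, which is the kind of compensation the abstract warns about.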
Ford-Gilboe, Marilyn; Wathen, C Nadine; Varcoe, Colleen; MacMillan, Harriet L; Scott-Storey, Kelly; Mantler, Tara; Hegarty, Kelsey; Perrin, Nancy
2016-01-01
Objectives Approaches to measuring intimate partner violence (IPV) in populations often privilege physical violence, with poor assessment of other experiences. This has led to underestimating the scope and impact of IPV. The aim of this study was to develop a brief, reliable and valid self-report measure of IPV that adequately captures its complexity. Design Mixed-methods instrument development and psychometric testing to evolve a brief version of the Composite Abuse Scale (CAS) using secondary data analysis and expert feedback. Setting Data from 5 Canadian IPV studies; feedback from international IPV experts. Participants 31 international IPV experts including academic researchers, service providers and policy actors rated CAS items via an online survey. Pooled data from 6278 adult Canadian women were used for scale development. Primary/secondary outcome measures Scale reliability and validity; robustness of subscales assessing different IPV experiences. Results A 15-item version of the CAS has been developed (Composite Abuse Scale (Revised)—Short Form, CASR-SF), including 12 items developed from the original CAS and 3 items suggested through expert consultation and the evolving literature. Items cover 3 abuse domains: physical, sexual and psychological, with questions asked to assess lifetime, recent and current exposure, and abuse frequency. Factor loadings for the final 3-factor solution ranged from 0.81 to 0.91 for the 6 psychological abuse items, 0.63 to 0.92 for the 4 physical abuse items, and 0.85 and 0.93 for the 2 sexual abuse items. Moderate correlations were observed between the CASR-SF and measures of depression, post-traumatic stress disorder and coercive control. Internal consistency of the CASR-SF was 0.942. These reliability and validity estimates were comparable to those obtained for the original 30-item CAS. 
Conclusions The CASR-SF is a brief self-report measure of IPV experiences among women that has demonstrated initial reliability and validity and is suitable for use in population studies or other studies. Additional validation of the 15-item scale with diverse samples is required. PMID:27927659
Uncertainties in obtaining high reliability from stress-strength models
NASA Technical Reports Server (NTRS)
Neal, Donald M.; Matthews, William T.; Vangel, Mark G.
1992-01-01
There has been recent interest in determining high statistical reliability in risk assessment of aircraft components. The potential consequences of incorrectly assuming a particular statistical distribution for stress or strength data used in obtaining the high reliability values are identified. The reliability is defined as the probability of the strength being greater than the stress over the range of stress values. This method is often referred to as the stress-strength model. A sensitivity analysis was performed involving a comparison of reliability results in order to evaluate the effects of assuming specific statistical distributions. Both known population distributions, and those that differed slightly from the known, were considered. Results showed substantial differences in reliability estimates even for almost nondetectable differences in the assumed distributions. These differences represent a potential problem in using the stress-strength model for high reliability computations, since in practice it is impossible to ever know the exact (population) distribution. An alternative reliability computation procedure is examined involving determination of a lower bound on the reliability values using extreme value distributions. This procedure reduces the possibility of obtaining nonconservative reliability estimates. Results indicated the method can provide conservative bounds when computing high reliability.
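The sensitivity described in this abstract can be illustrated with a minimal sketch. It assumes independent, normally distributed stress and strength, which is an illustrative choice and not necessarily one of the distributions studied in the report; all numbers are hypothetical. Under that assumption the stress-strength reliability P(strength > stress) has a closed form.

```python
from math import sqrt
from statistics import NormalDist


def stress_strength_reliability(mu_strength, sd_strength, mu_stress, sd_stress):
    """P(strength > stress) for independent normal stress and strength (assumed model)."""
    margin_mean = mu_strength - mu_stress
    margin_sd = sqrt(sd_strength ** 2 + sd_stress ** 2)
    return NormalDist().cdf(margin_mean / margin_sd)


# Hypothetical parameters: the second case changes the strength SD by only 10%.
baseline = stress_strength_reliability(100.0, 5.0, 70.0, 5.0)
perturbed = stress_strength_reliability(100.0, 5.5, 70.0, 5.0)

# Ratio of failure probabilities; both reliabilities round to "0.9999...",
# yet the implied failure probability differs by a noticeable factor.
failure_ratio = (1 - perturbed) / (1 - baseline)
print(baseline, perturbed, failure_ratio)
```

Comparing failure probabilities rather than reliabilities makes the effect visible, since both reliabilities are very close to 1.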
Predicting Cost/Reliability/Maintainability of Advanced General Aviation Avionics Equipment
NASA Technical Reports Server (NTRS)
Davis, M. R.; Kamins, M.; Mooz, W. E.
1978-01-01
A methodology is provided for assisting NASA in estimating the cost, reliability, and maintenance (CRM) requirements for general avionics equipment operating in the 1980's. Practical problems of predicting these factors are examined. The usefulness and shortcomings of different approaches for modeling cost and reliability estimates are discussed together with special problems caused by the lack of historical data on the cost of maintaining general aviation avionics. Suggestions are offered on how NASA might proceed in assessing CRM implications in the absence of reliable generalized predictive models.
Guimerà, Xavier; Dorado, Antonio David; Bonsfills, Anna; Gabriel, Gemma; Gabriel, David; Gamisans, Xavier
2016-10-01
Knowledge of mass transport mechanisms in biofilm-based technologies such as biofilters is essential to improve bioreactor performance by preventing mass transport limitation. External and internal mass transport in biofilms was characterized in heterotrophic biofilms grown on a flat plate bioreactor. Mass transport resistance through the liquid-biofilm interphase and diffusion within biofilms were quantified by in situ measurements using microsensors with a high spatial resolution (<50 μm). Experimental conditions were selected using a mathematical procedure based on the Fisher Information Matrix to increase the reliability of experimental data and minimize confidence intervals of estimated mass transport coefficients. The sensitivity of external and internal mass transport resistances to flow conditions within the range of typical fluid velocities over biofilms (Reynolds numbers between 0.5 and 7) was assessed. Estimated external mass transfer coefficients at different liquid phase flow velocities showed discrepancies with studies considering laminar conditions in the diffusive boundary layer near the liquid-biofilm interphase. The correlation of effective diffusivity with flow velocities showed that the heterogeneous structure of biofilms defines the transport mechanisms inside biofilms. Internal mass transport was driven by diffusion through cell clusters and aggregates at Re below 2.8. Conversely, mass transport was driven by advection within pores, voids and water channels at Re above 5.6. Between both flow velocities, mass transport occurred by a combination of advection and diffusion. Effective diffusivities estimated at different biofilm densities showed a linear increase of mass transport resistance due to a porosity decrease up to biofilm densities of 50 g VSS·L⁻¹. Mass transport was strongly limited at higher biofilm densities.
Internal mass transport results were used to propose an empirical correlation to assess the effective diffusivity within biofilms considering the influence of hydrodynamics and biofilm density. Copyright © 2016 Elsevier Ltd. All rights reserved.
Constraining uncertainties in water supply reliability in a tropical data scarce basin
NASA Astrophysics Data System (ADS)
Kaune, Alexander; Werner, Micha; Rodriguez, Erasmo; de Fraiture, Charlotte
2015-04-01
Assessing the water supply reliability in river basins is essential for adequate planning and development of irrigated agriculture and urban water systems. In many cases hydrological models are applied to determine the surface water availability in river basins. However, surface water availability and variability is often not appropriately quantified due to epistemic uncertainties, leading to water supply insecurity. The objective of this research is to determine the water supply reliability in order to support planning and development of irrigated agriculture in a tropical, data scarce environment. The approach proposed uses a simple hydrological model, but explicitly includes model parameter uncertainty. A transboundary river basin in the tropical region of Colombia and Venezuela with an area of approximately 2100 km² was selected as a case study. The Budyko hydrological framework was extended to consider climatological input variability and model parameter uncertainty, and through this the surface water reliability to satisfy the irrigation and urban demand was estimated. This provides a spatial estimate of the water supply reliability across the basin. For the middle basin the reliability was found to be less than 30% for most of the months when the water is extracted from an upstream source. Conversely, the monthly water supply reliability was high (r>98%) in the lower basin irrigation areas when water was withdrawn from a source located further downstream. Including model parameter uncertainty provides a complete estimate of the water supply reliability, but that estimate is influenced by the uncertainty in the model. Reducing the uncertainty in the model through improved data and perhaps improved model structure will improve the estimate of the water supply reliability allowing better planning of irrigated agriculture and dependable water allocation decisions.
Cost Estimation of Software Development and the Implications for the Program Manager
1992-06-01
Software Lifecycle Model (SLIM), the Jensen System-4 model, the Software Productivity, Quality, and Reliability Estimator (SPQR/20), the Constructive...function models in current use are the Software Productivity, Quality, and Reliability Estimator (SPQR/20) and the Software Architecture Sizing and...Estimator (SPQR/20) was developed by T. Capers Jones of Software Productivity Research, Inc., in 1985. The model is intended to estimate the outcome
APPLICATION OF TRAVEL TIME RELIABILITY FOR PERFORMANCE ORIENTED OPERATIONAL PLANNING OF EXPRESSWAYS
NASA Astrophysics Data System (ADS)
Mehran, Babak; Nakamura, Hideki
Evaluation of impacts of congestion improvement schemes on travel time reliability is very significant for road authorities since travel time reliability represents operational performance of expressway segments. In this paper, a methodology is presented to estimate travel time reliability prior to implementation of congestion relief schemes based on travel time variation modeling as a function of demand, capacity, weather conditions and road accidents. For subject expressway segments, traffic conditions are modeled over a whole year considering demand and capacity as random variables. Patterns of demand and capacity are generated for each five minute interval by applying Monte-Carlo simulation technique, and accidents are randomly generated based on a model that links accident rate to traffic conditions. A whole year analysis is performed by comparing demand and available capacity for each scenario and queue length is estimated through shockwave analysis for each time interval. Travel times are estimated from refined speed-flow relationships developed for intercity expressways and buffer time index is estimated consequently as a measure of travel time reliability. For validation, estimated reliability indices are compared with measured values from empirical data, and it is shown that the proposed method is suitable for operational evaluation and planning purposes.
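The abstract does not define the buffer time index. A minimal sketch, assuming the commonly used definition (95th percentile travel time minus mean travel time, divided by the mean), is given here. The travel times are hypothetical and stand in for the simulated values the paper derives from its demand and capacity model.

```python
from statistics import mean, quantiles


def buffer_time_index(travel_times):
    """(95th percentile - mean) / mean; an assumed, commonly used definition."""
    avg = mean(travel_times)
    # quantiles(..., n=100) returns the 99 percentile cut points; index 94 is the 95th.
    p95 = quantiles(travel_times, n=100)[94]
    return (p95 - avg) / avg


# Hypothetical travel times in minutes for one expressway segment.
steady = [10.0] * 20                             # no variation
congested = [10.0] * 17 + [15.0, 20.0, 30.0]     # a few heavily delayed trips

print(buffer_time_index(steady), buffer_time_index(congested))
```

An index of zero means a traveller needs no extra time allowance; larger values indicate less reliable travel times.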
NASA Technical Reports Server (NTRS)
Unal, Resit; Morris, W. Douglas; White, Nancy H.; Lepsch, Roger A.; Brown, Richard W.
2000-01-01
This paper describes the development of parametric models for estimating operational reliability and maintainability (R&M) characteristics for reusable vehicle concepts, based on vehicle size and technology support level. An R&M analysis tool (RMAT) and response surface methods are utilized to build parametric approximation models for rapidly estimating operational R&M characteristics such as mission completion reliability. These models, which approximate RMAT, can then be utilized for fast analysis of operational requirements, for lifecycle cost estimating and for multidisciplinary design optimization.
NASA Astrophysics Data System (ADS)
Mohammed, Amal A.; Abraheem, Sudad K.; Fezaa Al-Obedy, Nadia J.
2018-05-01
This paper considers the Burr type XII distribution. Maximum likelihood and Bayes methods of estimation are used for estimating the unknown scale parameter (α). Al-Bayyati's loss function and a suggested loss function are used to find the reliability with the least loss. The reliability function is expanded in terms of a set of power functions. Matlab (ver. 9) is used for the computations, and some examples are given.
Abdullah, Kawsari; Thorpe, Kevin E; Mamak, Eva; Maguire, Jonathon L; Birken, Catherine S; Fehlings, Darcy; Hanley, Anthony J; Macarthur, Colin; Zlotkin, Stanley H; Parkin, Patricia C
2015-07-14
The OptEC trial aims to evaluate the effectiveness of oral iron in young children with non-anemic iron deficiency (NAID). The initial sample size calculated for the OptEC trial ranged from 112-198 subjects. Given the uncertainty regarding the parameters used to calculate the sample, an internal pilot study was conducted. The objectives of this internal pilot study were to obtain reliable estimates of parameters (standard deviation and design factor) to recalculate the sample size and to assess the adherence rate and reasons for non-adherence in children enrolled in the pilot study. The first 30 subjects enrolled into the OptEC trial constituted the internal pilot study. The primary outcome of the OptEC trial is the Early Learning Composite (ELC). For estimation of the SD of the ELC, descriptive statistics of the 4 month follow-up ELC scores were assessed within each intervention group. The observed SD within each group was then pooled to obtain an estimated SD (S2) of the ELC. Correlation (ρ) between the ELC measured at baseline and follow-up was assessed. Recalculation of the sample size was performed using the analysis of covariance (ANCOVA) method, which uses the design factor (1 − ρ²). Adherence rate was calculated using a parent reported rate of missed doses of the study intervention. The new estimate of the SD of the ELC was found to be 17.40 (S2). The design factor was (1 − ρ²) = 0.21. Using a significance level of 5%, power of 80%, S2 = 17.40 and effect estimate (Δ) ranging from 6-8 points, the new sample size based on the ANCOVA method ranged from 32-56 subjects (16-28 per group). Adherence ranged between 14% and 100% with 44% of the children having an adherence rate ≥ 86%. Information generated from our internal pilot study was used to update the design of the full and definitive trial, including recalculation of sample size, determination of the adequacy of adherence, and application of strategies to improve adherence.
ClinicalTrials.gov Identifier: NCT01481766 (date of registration: November 22, 2011).
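The recalculated sample size can be checked with a short sketch. It assumes the standard two-sided normal-approximation formula for comparing two means, multiplied by the ANCOVA design factor (1 − ρ²); the abstract does not state the exact formula the authors used, so the function below is an illustration rather than their procedure.

```python
from math import ceil
from statistics import NormalDist


def ancova_n_per_group(sd, design_factor, delta, alpha=0.05, power=0.80):
    """Per-group n for a two-arm trial, scaled by the ANCOVA design factor (1 - rho^2)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    n = 2 * z_sum ** 2 * sd ** 2 * design_factor / delta ** 2
    return ceil(n)


# Values reported in the abstract: SD = 17.40, design factor = 0.21, effect of 6 to 8 points.
for delta in (6, 8):
    print(delta, ancova_n_per_group(17.40, 0.21, delta))
```

With these inputs the sketch gives 28 per group for Δ = 6 and 16 per group for Δ = 8, which agrees with the 16-28 per group reported in the abstract.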
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may be from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance system. The Bayesian analyses are more beneficial than the classical one in such cases. The Bayesian estimation analyses allow us to combine past knowledge or experience in the form of an a priori distribution with life test data to make inferences of the parameter of interest. In this paper, we have investigated the application of the Bayesian estimation analyses to competing risk systems. The cases are limited to the models with independent causes of failure by using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and the maximum likelihood analyses. The simulation results show that a change in the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. With perfect information on the prior distribution, the estimation methods of the Bayesian analyses are better than those of the maximum likelihood. The sensitivity analyses show some amount of sensitivity over the shifts of the prior locations.
They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
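For a competing risk system with independent Weibull causes of failure, as described in this abstract, the system survives to time t only if no cause has produced a failure by t, so the system reliability is the product of the per-cause survival functions. A minimal sketch follows; the shape and scale values are hypothetical and the estimation step (Bayesian or maximum likelihood) is not shown.

```python
from math import exp


def weibull_survival(t, shape, scale):
    """Weibull survival function exp(-(t/scale)^shape)."""
    return exp(-((t / scale) ** shape))


def system_reliability(t, causes):
    """Reliability under independent competing causes: product of per-cause survival."""
    reliability = 1.0
    for shape, scale in causes:
        reliability *= weibull_survival(t, shape, scale)
    return reliability


# Hypothetical (shape, scale) pairs for two independent failure causes.
causes = [(1.5, 1000.0), (0.8, 5000.0)]
for t in (0.0, 500.0, 1000.0):
    print(t, system_reliability(t, causes))
```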
Sinval, Jorge; Pasian, Sonia; Queirós, Cristina; Marôco, João
2018-01-01
The aim of this paper is to present a revision of international versions of the Utrecht Work Engagement Scale and to describe the psychometric properties of a Portuguese version of the UWES-9 developed simultaneously for Brazil and Portugal, the validity evidence related with the internal structure, namely, Dimensionality, measurement invariance between Brazil and Portugal, and Reliability of the scores. This is the first UWES version developed simultaneously for both countries, and it is an important instrument for understanding employees' work engagement in the organizations, allowing human resources departments to better use workforces, especially when they are migrants. A total of 524 Brazilian workers and 522 Portuguese workers participated in the study. Confirmatory Factor Analysis, group comparisons, and Reliability estimates were used. The use of workers who were primarily professionals or administrative support, according to ISCO-08, reinforced the need to collect data on other professional occupations. Confirmatory factor analysis showed acceptable fit for the UWES-9 original three-factor solution, and a second-order factor structure has been proposed that presented an acceptable fit. Full-scale invariance was obtained between the Portuguese and Brazilian samples, both for the original three-factor first-order and second-order models. Data revealed that Portuguese and Brazilian workers didn't show statistically significant differences in the work engagement dimensions. This version allows for direct comparisons of means and, consequently, for performance of comparative and cross-cultural studies between these two countries. PMID:29618995
Reliability of the Test of Integrated Language and Literacy Skills (TILLS).
Mailend, Marja-Liisa; Plante, Elena; Anderson, Michele A; Applegate, E Brooks; Nelson, Nickola W
2016-07-01
As new standardized tests become commercially available, it is critical that clinicians have access to the information about a test's psychometric properties, including aspects of reliability. The purpose of the three studies reported in this article was to investigate the reliability of a new test, the Test of Integrated Language and Literacy Skills (TILLS), with consideration of both internal and external sources of measurement error. The TILLS was administered to children aged 6;0-18;11 years. The participants varied in terms of their language and literacy skills and included children with typical language development as well as those diagnosed with language or learning disability. The sample of children also varied in terms of their racial and socioeconomic backgrounds. Study 1 (N = 1056) assessed the internal consistency of TILLS calculating the coefficient omega for each subtest. Study 2 (N = 103) and Study 3 (N = 39) used the intra-class correlation coefficients to report on test-retest and inter-rater reliability respectively. The results indicate strong internal consistency and inter-rater reliability for all subtests of TILLS. The test-retest reliability was strong for all but one subtest, for which the intra-class correlation coefficient was in the acceptable range. This article provides clinicians with essential scientific information that supports the internal and external reliability of a new test of oral and written language skills, the TILLS. Information about reliability is critical for guiding the selection of an appropriate diagnostic tool amongst a number of options. © 2016 Royal College of Speech and Language Therapists.
Interval Estimation of Revision Effect on Scale Reliability via Covariance Structure Modeling
ERIC Educational Resources Information Center
Raykov, Tenko
2009-01-01
A didactic discussion of a procedure for interval estimation of change in scale reliability due to revision is provided, which is developed within the framework of covariance structure modeling. The method yields ranges of plausible values for the population gain or loss in reliability of unidimensional composites, which results from deletion or…
Parrett, Charles; Johnson, D.R.; Hull, J.A.
1989-01-01
Estimates of streamflow characteristics (monthly mean flow that is exceeded 90, 80, 50, and 20 percent of the time for all years of record and mean monthly flow) were made and are presented in tabular form for 312 sites in the Missouri River basin in Montana. Short-term gaged records were extended to the base period of water years 1937-86, and were used to estimate monthly streamflow characteristics at 100 sites. Data from 47 gaged sites were used in regression analysis relating the streamflow characteristics to basin characteristics and to active-channel width. The basin-characteristics equations, with standard errors of 35% to 97%, were used to estimate streamflow characteristics at 179 ungaged sites. The channel-width equations, with standard errors of 36% to 103%, were used to estimate characteristics at 138 ungaged sites. Streamflow measurements were correlated with concurrent streamflows at nearby gaged sites to estimate streamflow characteristics at 139 ungaged sites. In a test using 20 pairs of gages, the standard errors ranged from 31% to 111%. At 139 ungaged sites, the estimates from two or more of the methods were weighted and combined in accordance with the variance of individual methods. When estimates from three methods were combined the standard errors ranged from 24% to 63%. A drainage-area-ratio adjustment method was used to estimate monthly streamflow characteristics at seven ungaged sites. The reliability of the drainage-area-ratio adjustment method was estimated to be about equal to that of the basin-characteristics method. The estimates were checked for reliability. Estimates of monthly streamflow characteristics from gaged records were considered to be most reliable, and estimates at sites with actual flow record from 1937-86 were considered to be completely reliable (zero error). Weighted-average estimates were considered to be the most reliable estimates made at ungaged sites. (USGS)
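One common way to weight and combine estimates "in accordance with the variance of individual methods" is inverse-variance weighting. The sketch below assumes independent errors between methods and uses hypothetical flows and variances; the report's actual weighting scheme may differ, for example by accounting for correlation between methods.

```python
def combine_estimates(estimates, variances):
    """Inverse-variance weighted average and its variance (independent errors assumed)."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    combined_variance = 1.0 / total_weight
    return combined, combined_variance


# Hypothetical monthly flow estimates from three methods and their error variances.
flow, var = combine_estimates([12.0, 15.0, 10.0], [9.0, 16.0, 4.0])
print(flow, var)
```

Under these assumptions the combined variance is never larger than the smallest individual variance, which is in line with the lower standard errors the abstract reports for combined estimates.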
Ozaki, Y.; Kaida, A.; Miura, M.; Nakagawa, K.; Toda, K.; Yoshimura, R.; Sumi, Y.; Kurabayashi, T.
2017-01-01
Early stage oral cancer can be cured with oral brachytherapy, but whole-body radiation exposure status has not been previously studied. Recently, the International Commission on Radiological Protection Committee (ICRP) recommended the use of ICRP phantoms to estimate radiation exposure from external and internal radiation sources. In this study, we used a Monte Carlo simulation with ICRP phantoms to estimate whole-body exposure from oral brachytherapy. We used a Particle and Heavy Ion Transport code System (PHITS) to model oral brachytherapy with 192Ir hairpins and 198Au grains and to perform a Monte Carlo simulation on the ICRP adult reference computational phantoms. To confirm the simulations, we also computed local dose distributions from these small sources, and compared them with the results from Oncentra manual Low Dose Rate Treatment Planning (mLDR) software which is used in day-to-day clinical practice. We successfully obtained data on absorbed dose for each organ in males and females. Sex-averaged equivalent doses were 0.547 and 0.710 Sv with 192Ir hairpins and 198Au grains, respectively. Simulation with PHITS was reliable when compared with an alternative computational technique using mLDR software. We concluded that the absorbed dose for each organ and whole-body exposure from oral brachytherapy can be estimated with Monte Carlo simulation using PHITS on ICRP reference phantoms. Effective doses for patients with oral cancer were obtained. PMID:28339846
Airborne gamma radiation soil moisture measurements over short flight lines
NASA Technical Reports Server (NTRS)
Peck, Eugene L.; Carrol, Thomas R.; Lipinski, Daniel M.
1990-01-01
Results are presented on airborne gamma radiation measurements of soil moisture condition, carried out along short flight lines as part of the First International Satellite Land Surface Climatology Project Field Experiment (FIFE). Data were collected over an area in Kansas during the summers of 1987 and 1989. The airborne surveys, together with ground measurements, provide the most comprehensive set of airborne and ground truth data available in the U.S. for calibrating and evaluating airborne gamma flight lines. Analysis showed that, using standard National Weather Service weights for the K, Tl, and Gc radiation windows, the airborne soil moisture estimates for the FIFE lines had a root mean square error of no greater than 3.0 percent soil moisture. The soil moisture estimates for sections having acquisition time of at least 15 sec were found to be reliable.
Validity and Reliability of Assessing Body Composition Using a Mobile Application.
Macdonald, Elizabeth Z; Vehrs, Pat R; Fellingham, Gilbert W; Eggett, Dennis; George, James D; Hager, Ronald
2017-12-01
The purpose of this study was to determine the validity and reliability of the LeanScreen (LS) mobile application that estimates percent body fat (%BF) using estimates of circumferences from photographs. The %BF of 148 weight-stable adults was estimated once using dual-energy x-ray absorptiometry (DXA). Each of two administrators assessed the %BF of each subject twice using the LS app and manually measured circumferences. A mixed-model ANOVA and Bland-Altman analyses were used to compare the estimates of %BF obtained from each method. Interrater and intrarater reliability values were determined using multiple measurements taken by each of the two administrators. The LS app and manually measured circumferences significantly underestimated (P < 0.05) the %BF determined using DXA by an average of -3.26 and -4.82 %BF, respectively. The LS app (6.99 %BF) and manually measured circumferences (6.76 %BF) had large limits of agreement. All interrater and intrarater reliability coefficients of estimates of %BF using the LS app and manually measured circumferences exceeded 0.99. The estimates of %BF from manually measured circumferences and the LS app were highly reliable. However, these field measures are not currently recommended for the assessment of body composition because of significant bias and large limits of agreement.
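A Bland-Altman analysis of the kind mentioned in this abstract summarizes agreement between two methods by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 times the standard deviation of the differences. A minimal sketch with hypothetical %BF values, not the study's data:

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical %BF values: a field method versus a reference method.
field = [20.0, 25.0, 30.0, 35.0]
reference = [22.0, 28.0, 31.0, 39.0]
bias, lower, upper = bland_altman(field, reference)
print(bias, lower, upper)
```

A negative bias means the field method underestimates the reference on average; wide limits indicate poor agreement for individuals even when the two methods are each highly repeatable.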
Ersoy, Mehmet Akif; Varan, Azmi
2007-01-01
The aim of this study was to evaluate the reliability and validity of the Turkish version of the Internalized Stigma of Mental Illness Scale (ISMI) in patients with psychiatric disorders. The study included 203 patients diagnosed with various psychiatric disorders in a psychiatry outpatient clinic of a university hospital. The reliability of the scale was assessed by investigation of its internal consistency and split-half reliability. The convergent validity of the scale was demonstrated by the relationship between the Turkish form of the ISMI and various criteria scales. Cronbach's alpha value was 0.93 for the entire scale and ranged between 0.63 and 0.87 for the 5 subscales of the ISMI. In terms of convergent validity, the total score of the Turkish ISMI significantly correlated with the Beck Depression Inventory, Rosenberg Self-Esteem Scale, Sociotropy-Autonomy Scale, Brief Symptom Inventory, Multidimensional Scale of Perceived Social Support, Clinical Global Impression Scale, and Global Assessment of Functioning Scale scores. All values were in the expected direction. In the light of the findings, it was concluded that the Turkish version of ISMI could be used as a reliable and valid tool in assessing internalized stigma of the Turkish psychiatric patients.
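Cronbach's alpha, the internal consistency coefficient reported in this abstract, can be computed from item-level scores as k/(k − 1) × (1 − Σ item variances / variance of total scores). The sketch below uses hypothetical responses to a three-item scale, not ISMI data.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, with respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of five people to three items.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(cronbach_alpha(items))
```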
Cole, Jason C; Ito, Diane; Chen, Yaozhu J; Cheng, Rebecca; Bolognese, Jennifer; Li-McLeod, Josephine
2014-09-04
There is a lack of validated instruments to measure the level of burden of Alzheimer's disease (AD) on caregivers. The Impact of Alzheimer's Disease on Caregiver Questionnaire (IADCQ) is a 12-item instrument with a seven-day recall period that measures AD caregiver's burden across emotional, physical, social, financial, sleep, and time aspects. Primary objectives of this study were to evaluate psychometric properties of IADCQ administered on the Web and to determine the most appropriate scoring algorithm. A national sample of 200 unpaid AD caregivers participated in this study by completing the Web-based version of IADCQ and Short Form-12 Health Survey Version 2 (SF-12v2™). The SF-12v2 was used to measure convergent validity of IADCQ scores and to provide an understanding of the overall health-related quality of life of sampled AD caregivers. The IADCQ survey was also completed four weeks later by a randomly selected subgroup of 50 participants to assess test-retest reliability. Confirmatory factor analysis (CFA) was implemented to test the dimensionality of the IADCQ items. Classical item-level and scale-level psychometric analyses were conducted to estimate psychometric characteristics of the instrument. Test-retest reliability was performed to evaluate the instrument's stability and consistency over time. Virtually none (2%) of the respondents had either floor or ceiling effects, indicating the IADCQ covers an ideal range of burden. A single-factor model obtained appropriate goodness of fit and provided evidence that a simple sum score of the 12 items of IADCQ can be used to measure AD caregiver's burden. Scale-level reliability was supported with a coefficient alpha of 0.93 and an intra-class correlation coefficient (for test-retest reliability) of 0.68 (95% CI: 0.50-0.80). Low-moderate negative correlations were observed between the IADCQ and scales of the SF-12v2.
The study findings suggest the IADCQ has appropriate psychometric characteristics as a unidimensional, Web-based measure of AD caregiver burden and is supported by strong model fit statistics from CFA, high degree of item-level reliability, good internal consistency, moderate test-retest reliability, and moderate convergent validity. Additional validation of the IADCQ is warranted to ensure invariance between the paper-based and Web-based administration and to determine an appropriate responder definition.
Scale for positive aspects of caregiving experience: development, reliability, and factor structure.
Kate, N; Grover, S; Kulhara, P; Nehra, R
2012-06-01
OBJECTIVE. To develop an instrument (Scale for Positive Aspects of Caregiving Experience [SPACE]) that evaluates positive caregiving experience and assess its psychometric properties. METHODS. Available scales which assess some aspects of positive caregiving experience were reviewed and a 50-item questionnaire with a 5-point rating was constructed. In all, 203 primary caregivers of patients with severe mental disorders were asked to complete the questionnaire. Internal consistency, test-retest reliability, cross-language reliability, split-half reliability, and face validity were evaluated. Principal component factor analysis was run to assess the factorial validity of the scale. RESULTS. The scale developed as part of the study was found to have good internal consistency, test-retest reliability, cross-language reliability, split-half reliability, and face validity. Principal component factor analysis yielded a 4-factor structure, which also had good test-retest reliability and cross-language reliability. There was a strong correlation between the 4 factors obtained. CONCLUSION. The SPACE developed as part of this study has good psychometric properties.
Sampling design trade-offs in occupancy studies with imperfect detection: examples and software
Bailey, L.L.; Hines, J.E.; Nichols, J.D.
2007-01-01
Researchers have used occupancy, or probability of occupancy, as a response or state variable in a variety of studies (e.g., habitat modeling), and occupancy is increasingly favored by numerous state, federal, and international agencies engaged in monitoring programs. Recent advances in estimation methods have emphasized that reliable inferences can be made from these types of studies if detection and occupancy probabilities are simultaneously estimated. The need for temporal replication at sampled sites to estimate detection probability creates a trade-off between spatial replication (number of sample sites distributed within the area of interest/inference) and temporal replication (number of repeated surveys at each site). Here, we discuss a suite of questions commonly encountered during the design phase of occupancy studies, and we describe software (program GENPRES) developed to allow investigators to easily explore design trade-offs focused on particularities of their study system and sampling limitations. We illustrate the utility of program GENPRES using an amphibian example from Greater Yellowstone National Park, USA.
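The trade-off between spatial and temporal replication described in this record can be illustrated with a small sketch. It is not program GENPRES and does not reproduce its methods; it only assumes a fixed survey budget, a constant per-survey detection probability p, and a constant occupancy probability psi, and uses the elementary result that an occupied site is detected at least once in K surveys with probability 1 - (1 - p)^K. All names and numbers are hypothetical.

```python
def detect_at_least_once(p, k):
    """Probability that an occupied site yields at least one detection in k surveys."""
    return 1.0 - (1.0 - p) ** k


def design_table(total_surveys, psi, p, max_k):
    """List (k, sites, p_star, expected sites with a detection) for k = 1..max_k.

    total_surveys is the fixed budget; more surveys per site means fewer sites.
    """
    rows = []
    for k in range(1, max_k + 1):
        sites = total_surveys // k
        p_star = detect_at_least_once(p, k)
        rows.append((k, sites, p_star, sites * psi * p_star))
    return rows


if __name__ == "__main__":
    # Hypothetical budget of 100 surveys, psi = 0.4, p = 0.5.
    for row in design_table(100, 0.4, 0.5, 5):
        print(row)
```

The table only shows how the budget divides; choosing a design in practice also depends on the precision of the occupancy estimate, which this sketch does not compute.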
Jeffery, Nicholas W; Gregory, T Ryan
2014-10-01
Crustaceans are enormously diverse both phylogenetically and ecologically, but they remain substantially underrepresented in the existing genome size database. An expansion of this dataset could be facilitated if it were possible to obtain genome size estimates from ethanol-preserved specimens. In this study, two tests were performed in order to assess the reliability of genome size data generated using preserved material. First, the results of estimates based on flash-frozen versus ethanol-preserved material were compared across 37 species of crustaceans that differ widely in genome size. Second, a comparison was made of specimens from a single species that had been stored in ethanol for 1-14 years. In both cases, the use of gill tissue in Feulgen image analysis densitometry proved to be a very viable approach. This finding is of direct relevance to both new studies of field-collected crustaceans as well as potential studies based on existing collections. © 2014 International Society for Advancement of Cytometry.
Pang, Susan; Cowen, Simon
2017-12-13
We describe a novel generic method to derive the unknown endogenous concentrations of analyte within complex biological matrices (e.g. serum or plasma) based upon the relationship between the immunoassay signal response of a biological test sample spiked with known analyte concentrations and the log transformed estimated total concentration. If the estimated total analyte concentration is correct, a portion of the sigmoid on a log-log plot is very close to linear, allowing the unknown endogenous concentration to be estimated using a numerical method. This approach obviates conventional relative quantification using an internal standard curve and need for calibrant diluent, and takes into account the individual matrix interference on the immunoassay by spiking the test sample itself. This technique is based on standard additions for chemical analytes. Unknown endogenous analyte concentrations within even 2-fold diluted human plasma may be determined reliably using as few as four reaction wells.
Wang, Kai; Li, Yao; Teng, Jing-Fei; Zhou, Hai-Yong; Xu, Dan-Feng; Fan, Yi
2015-01-01
To evaluate the efficacy and safety of plasmakinetic resection of the prostate (PKRP) versus transurethral resection of the prostate (TURP) for the treatment of patients with benign prostate hyperplasia (BPH), a meta-analysis of randomized controlled trials was carried out. We searched PubMed, Embase, Web of Science and the Cochrane Library. The pooled estimates of maximum flow rate (Qmax), International Prostate Symptom Score, operation time, catheterization time, irrigated volume, hospital stay, transurethral resection syndrome, transfusion, clot retention, urinary retention and urinary stricture were assessed. There was no notable difference in International Prostate Symptom Score between the TURP and PKRP groups at the 1-month, 3-month, 6-month and 12-month follow-ups, while the pooled Qmax at 1 month favored the PKRP group. PKRP was associated with a lower risk of transurethral resection syndrome, transfusion and clot retention, and its catheterization time and operation time were also shorter than those of TURP. The irrigated volume, length of hospital stay, and urinary retention and urinary stricture rates were similar between groups. In conclusion, our study suggests that PKRP is a reliable minimally invasive technique and may prove to be an alternative electrosurgical procedure for the treatment of BPH.
Müller-Engelmann, Meike; Schnyder, Ulrich; Dittmann, Clara; Priebe, Kathlen; Bohus, Martin; Thome, Janine; Fydrich, Thomas; Pfaltz, Monique C; Steil, Regina
2018-05-01
The Clinician-Administered PTSD Scale (CAPS) is a widely used diagnostic interview for posttraumatic stress disorder (PTSD). Following fundamental modifications in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), the CAPS had to be revised. This study examined the psychometric properties (internal consistency, interrater reliability, convergent and discriminant validity, and structural validity) of the German version of the CAPS-5 in a trauma-exposed sample (n = 223 with PTSD; n = 51 without PTSD). The results demonstrated high internal consistency (αs = .65-.93) and high interrater reliability (ICCs = .81-.89). With regard to convergent and discriminant validity, we found high correlations between the CAPS severity score and both the Posttraumatic Diagnostic Scale sum score (r = .87) and the Beck Depression Inventory total score (r = .72). Regarding the underlying factor structure, the hybrid model demonstrated the best fit, followed by the anhedonia model. However, we encountered some nonpositive estimates for the correlations of the latent variables (factors) for both models. The model with the best fit without methodological problems was the externalizing behaviors model, but the results also supported the DSM-5 model. Overall, the results demonstrate that the German version of the CAPS-5 is a psychometrically sound measure.
Martínez-González, Agustín E.; Rodríguez-Jiménez, Tíscar; Piqueras, José A.; Vera-Villarroel, Pablo; Godoy, Antonio
2015-01-01
In recent years, there has been a considerable increase in the development of assessment tools for obsessive-compulsive symptomatology in children and adolescents. The Obsessive Compulsive Inventory-Child Version (OCI-CV) is a well-established assessment self-report, with special interest for the assessment of dimensions of Obsessive Compulsive Disorder (OCD). This instrument has shown to be useful for clinical and non-clinical populations in two languages (English and European Spanish). Thus, the aim of this study was to analyze the psychometric properties of the OCI-CV in a Chilean community sample. The sample consisted of 816 children and adolescents with a mean age of 14.54 years (SD = 2.21; range = 10–18 years). Factor structure, internal consistency, test-retest reliability, convergent/divergent validity, and gender/age differences were examined. Confirmatory factor analysis showed a 6-factor structure (Doubting/Checking, Obsessing, Hoarding, Washing, Ordering, and Neutralizing) with one second-order factor. Good estimates of reliability (including internal consistency and test-retest), evidence supporting the validity, and small age and gender differences (higher levels of OCD symptomatology among older participants and women, respectively) are found. The OCI-CV is also an adequate scale for the assessment of obsessions and compulsions in a general population of Chilean children and adolescents. PMID:26317404
Advances in In Vitro and In Silico Tools for Toxicokinetic Dose ...
Recent advances in in vitro assays, in silico tools, and systems biology approaches provide opportunities for refined mechanistic understanding for chemical safety assessment that will ultimately lead to reduced reliance on animal-based methods. With the U.S. commercial chemical landscape encompassing thousands of chemicals with limited data, safety assessment strategies that reliably and efficiently predict in vivo systemic exposures and subsequent in vivo effects are a priority. Quantitative in vitro-in vivo extrapolation (QIVIVE) is a methodology that facilitates the explicit and quantitative application of in vitro experimental data and in silico modeling to predict in vivo system behaviors and can be applied to predict chemical toxicokinetics, toxicodynamics and also population variability. Tiered strategies that incorporate sufficient information to reliably inform the relevant decision context will facilitate acceptance of these alternative data streams for safety assessments. This abstract does not necessarily reflect U.S. EPA policy. This talk will provide an update to an international audience on the state of science being conducted within the EPA’s Office of Research and Development to develop and refine approaches that estimate internal chemical concentrations following a given exposure, known as toxicokinetics. Toxicokinetic approaches hold great potential in their ability to link in vitro activities or toxicities identified during high-throughput screen
Validation of a new classification system for skin tears.
LeBlanc, Kimberly; Baranoski, Sharon; Holloway, Samantha; Langemo, Diane
2013-06-01
The aim of this study was to validate and establish reliability of the International Skin Tear classification system. A consensus panel of 12 internationally recognized key opinion leaders convened in 2011 to establish consensus statements on the prevention, prediction, assessment, and treatment of skin tears. Subsequently, a new skin tear classification system was proposed. The system was then tested for interrater and intrarater reliability between the experts before being tested more widely on a sample of 327 individuals from the United States, Canada, and Europe. The results of the study indicated a substantial level of agreement for the expert panel (Fleiss κ = 0.619; 2-month follow-up = 0.653). Intrarater reliability was high (Cohen κ = 0.877). Interrater reliability was moderate (Fleiss κ = 0.555) for healthcare professionals (n = 303) and fair for non-health professionals (Fleiss κ = 0.338; n = 24). This international study established the reliability and validity of a new classification system for skin tears.
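The Fleiss κ values quoted in this record measure agreement among several raters beyond what chance would produce. A minimal sketch of the usual computation follows, assuming the ratings are arranged as one row per rated case and one column per category, each cell holding the number of raters who chose that category, with the same number of raters for every case. The function name and the toy data are hypothetical and are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a cases-by-categories table of rater counts.

    Every row must sum to the same number of raters (at least 2).
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Observed agreement for each case, then averaged over cases.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_cases
    # Chance agreement from the overall share of ratings in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_cats)
    ]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Toy table: 4 cases, 3 categories, 5 raters per case.
    table = [[5, 0, 0], [3, 2, 0], [0, 4, 1], [1, 1, 3]]
    print(round(fleiss_kappa(table), 3))
```

Verbal labels such as "moderate" or "substantial" are conventions applied to ranges of κ; the computation itself returns only the coefficient.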
Lee, Jennifer; Koh, Jung Hee; Kwok, Seung-Ki; Park, Sung-Hwan
2016-05-01
This study was conducted to generate and validate a cross-culturally adapted Korean version of the xerostomia inventory (XI), an 11-item questionnaire designed to measure the severity of xerostomia. The original English version of the XI was translated into Korean according to the guidelines for cross-cultural adaptation of health-related quality-of-life measures. Among a prospective cohort of primary Sjögren's syndrome (pSS) in Korea, 194 patients were analyzed. Internal consistency was evaluated by using Cronbach's alpha, and test-retest reliability was obtained by using an intraclass correlation coefficient (ICC) analysis. Construct validity was investigated by performing a correlation analysis between XI total score and salivary flow rate (SFR). Cronbach's alpha for internal consistency was 0.868, and the ICC for test-retest reliability ranged from 0.48 to 0.827, with a median value of 0.72. Moderate negative correlations between XI score and stimulated SFR, unstimulated SFR, and differential (stimulated minus unstimulated) SFR were observed (Spearman's rho, ρ = -0.515, -0.447, and -0.482, respectively; P < 0.001). The correlation analysis between the visual analogue scale (VAS) score of overall dryness and SFR indicated a smaller ρ value (-0.235 [P = 0.006], -0.243 [P = 0.002], and -0.252 [P = 0.003], respectively), which supports that XI more accurately reflects the degree of xerostomia in the pSS patients. In conclusion, the Korean version of the XI is a reliable tool to estimate the severity of xerostomia in patients with pSS.
Wang, Chao; Chen, Peijie; Zhuang, Jie
2013-12-01
The psychometric profiles of the widely used International Physical Activity Questionnaire-Short Form (IPAQ-SF) in Chinese youth have not been reported. The purpose of this study was to examine the validity and reliability of the IPAQ-SF using a sample of Chinese youth. One thousand and twenty-one youth (M(age) = 14.26 ± 1.63 years, 52.8% boys) from 11 cities in China wore accelerometers for 7 consecutive days and completed the IPAQ-SF on the 8th day to recall their physical activity (PA) during accelerometer-wearing days. A subsample of 92 youth (M(age) = 15.90 ± 1.35 years, 46.7% boys) completed the IPAQ-SF again a week later to recall their PA during accelerometer-wearing days. Differences in PA estimated by the IPAQ-SF and accelerometer were examined by paired-sample t test. Spearman correlation coefficients were used to examine the correlation between the IPAQ-SF and accelerometer. Test-retest reliability of the IPAQ-SF was determined by the intraclass correlation coefficient (ICC). Compared with the accelerometer, the IPAQ-SF overestimated sedentary time, moderate PA (MPA), vigorous PA (VPA), and moderate-to-vigorous PA (MVPA). Correlations between PA (total PA, MPA, VPA, and MVPA) and sedentary time measured by the 2 instruments ranged from "none" to "low" (ρ = .08-.31). Test-retest ICC of the IPAQ-SF ranged from "moderate" to "high" (ICC = .43-.83), except for sitting in boys (ICC = .06), sitting for the whole sample (ICC = .32), and VPA in girls (ICC = .35). The IPAQ-SF was not a valid instrument for measuring PA and sedentary behavior in Chinese youth.
A Laboratory Study on the Reliability Estimations of the Mini-CEX
ERIC Educational Resources Information Center
de Lima, Alberto Alves; Conde, Diego; Costabel, Juan; Corso, Juan; Van der Vleuten, Cees
2013-01-01
Reliability estimations of workplace-based assessments with the mini-CEX are typically based on real-life data. Estimations are based on the assumption of local independence: the object of the measurement should not be influenced by the measurement itself and samples should be completely independent. This is difficult to achieve. Furthermore, the…
Comparability and Reliability Considerations of Adequate Yearly Progress
ERIC Educational Resources Information Center
Maier, Kimberly S.; Maiti, Tapabrata; Dass, Sarat C.; Lim, Chae Young
2012-01-01
The purpose of this study is to develop an estimate of Adequate Yearly Progress (AYP) that will allow for reliable and valid comparisons among student subgroups, schools, and districts. A shrinkage-type estimator of AYP using the Bayesian framework is described. Using simulated data, the performance of the Bayes estimator will be compared to…
Sample Size for Estimation of G and Phi Coefficients in Generalizability Theory
ERIC Educational Resources Information Center
Atilgan, Hakan
2013-01-01
Problem Statement: Reliability, which refers to the degree to which measurement results are free from measurement errors, as well as its estimation, is an important issue in psychometrics. Several methods for estimating reliability have been suggested by various theories in the field of psychometrics. One of these theories is the generalizability…
Software For Computing Reliability Of Other Software
NASA Technical Reports Server (NTRS)
Nikora, Allen; Antczak, Thomas M.; Lyu, Michael
1995-01-01
Computer Aided Software Reliability Estimation (CASRE) computer program developed for use in measuring reliability of other software. Easier for non-specialists in reliability to use than many other currently available programs developed for same purpose. CASRE incorporates mathematical modeling capabilities of public-domain Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) computer program and runs in Windows software environment. Provides menu-driven command interface; enabling and disabling of menu options guides user through (1) selection of set of failure data, (2) execution of mathematical model, and (3) analysis of results from model. Written in C language.
Reliability and validity of the de Morton Mobility Index in individuals with sub-acute stroke.
Braun, Tobias; Marks, Detlef; Thiel, Christian; Grüneberg, Christian
2018-02-04
To establish the validity and reliability of the de Morton Mobility Index (DEMMI) in patients with sub-acute stroke. This cross-sectional study was performed in a neurological rehabilitation hospital. We assessed unidimensionality, construct validity, internal consistency reliability, inter-rater reliability, minimal detectable change and possible floor and ceiling effects of the DEMMI in adult patients with sub-acute stroke. The study included a total sample of 121 patients with sub-acute stroke. We analysed validity (n = 109) and reliability (n = 51) in two sub-samples. Rasch analysis indicated unidimensionality with an overall fit to the model (chi-square = 12.37, p = 0.577). All hypotheses on construct validity were confirmed. Internal consistency reliability (Cronbach's alpha = 0.94) and inter-rater reliability (intraclass correlation coefficient = 0.95; 95% confidence interval: 0.92-0.97) were excellent. The minimal detectable change with 90% confidence was 13 points. No floor or ceiling effects were evident. These results indicate unidimensionality, sufficient internal consistency reliability, inter-rater reliability, and construct validity of the DEMMI in patients with a sub-acute stroke. Advantages of the DEMMI in clinical application are the short administration time, no need for special equipment and interval level data. The de Morton Mobility Index, therefore, may be a useful performance-based bedside test to measure mobility in individuals with a sub-acute stroke across the whole mobility spectrum. Implications for Rehabilitation: The de Morton Mobility Index (DEMMI) is a unidimensional measurement instrument of mobility in individuals with sub-acute stroke. The DEMMI has excellent internal consistency and inter-rater reliability, and sufficient construct validity. The minimal detectable change of the DEMMI with 90% confidence in stroke rehabilitation is 13 points.
The lack of any floor or ceiling effects on hospital admission indicates applicability across the whole mobility spectrum of patients with sub-acute stroke.
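The minimal detectable change quoted in this record is commonly derived from the standard error of measurement, SEM = SD * sqrt(1 - ICC), as MDC90 = 1.645 * sqrt(2) * SEM. The sketch below applies that conventional formula. It is an assumption that the authors used this form, and the abstract does not report the sample standard deviation, so the SD in the example is a made-up placeholder and the result is not expected to reproduce the published 13 points.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the score standard deviation and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.645):
    """MDC at the confidence level implied by z (1.645 for 90%, 1.96 for 95%).

    The sqrt(2) term reflects error in both of the two measurements compared.
    """
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


if __name__ == "__main__":
    # ICC = 0.95 is from the abstract; SD = 20 points is purely hypothetical.
    print(round(minimal_detectable_change(sd=20.0, icc=0.95), 1))
```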
The Pregnant Women with HIV Attitude Scale: development and initial psychometric evaluation.
Tyer-Viola, Lynda A; Duffy, Mary E
2010-08-01
This paper is a report of the development and initial psychometric evaluation of the Pregnant Women with HIV Attitude Scale. Previous research has identified that attitudes toward persons with HIV/AIDS have been judgmental and could affect clinical care and outcomes. Stigma towards persons with HIV has persisted as a barrier to nursing care globally. Women are more vulnerable during pregnancy. An instrument to specifically measure obstetric care provider's attitudes toward this population is needed to target identified gaps in providing respectful care. Existing literature and instruments were analysed and two existing measures, the Attitudes about People with HIV Scale and the Attitudes toward Women with HIV Scale, were combined to create an initial item pool to address attitudes toward HIV-positive pregnant women. The data were collected in 2003 with obstetric nurses attending a national conference in the United States of America (N = 210). Content validity was used for item pool development and principal component analysis and analysis of variance were used to determine construct validity. Reliability was analysed using Cronbach's Alpha. The new measure demonstrated high internal consistency (alpha estimates = 0.89). Principal component analysis yielded a two-component structure that accounted for 45% of the total variance: Mothering-Choice (alpha estimates = 0.89) and Sympathy-Rights (alpha estimates = 0.72). These data provided initial evidence of the psychometric properties of the Pregnant Women with HIV Attitude Scale. Further analysis is required of the validity of the constructs of this scale and its reliability with various obstetric care providers.
Steele, Catriona M.; Namasivayam-MacDonald, Ashwini M.; Guida, Brittany T.; Cichero, Julie A.; Duivestein, Janice; Hanson, Ben; Lam, Peter; Riquelme, Luis F.
2018-01-01
Objective: To assess consensual validity, interrater reliability, and criterion validity of the International Dysphagia Diet Standardisation Initiative Functional Diet Scale, a new functional outcome scale intended to capture the severity of oropharyngeal dysphagia, as represented by the degree of diet texture restriction recommended for the patient. Design: Participants assigned International Dysphagia Diet Standardisation Initiative Functional Diet Scale scores to 16 clinical cases. Consensual validity was measured against reference scores determined by an author reference panel. Interrater reliability was measured overall and across quartile subsets of the dataset. Criterion validity was evaluated versus Functional Oral Intake Scale (FOIS) scores assigned by survey respondents to the same case scenarios. Feedback was requested regarding ease and likelihood of use. Setting: Web-based survey. Participants: Respondents (N=170) from 29 countries. Interventions: Not applicable. Main Outcome Measures: Consensual validity (percent agreement and Kendall τ), criterion validity (Spearman rank correlation), and interrater reliability (Kendall concordance and intraclass coefficients). Results: The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed strong consensual validity, criterion validity, and interrater reliability. Scenarios involving liquid-only diets, transition from nonoral feeding, or trial diet advances in therapy showed the poorest consensus, indicating a need for clear instructions on how to score these situations. The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed greater sensitivity than the FOIS to specific changes in diet. Most (>70%) respondents indicated enthusiasm for implementing the International Dysphagia Diet Standardisation Initiative Functional Diet Scale.
Conclusions: This initial validation study suggests that the International Dysphagia Diet Standardisation Initiative Functional Diet Scale has strong consensual and criterion validity and can be used reliably by clinicians to capture diet texture restriction and progression in people with dysphagia. PMID:29428348
Steele, Catriona M; Namasivayam-MacDonald, Ashwini M; Guida, Brittany T; Cichero, Julie A; Duivestein, Janice; Hanson, Ben; Lam, Peter; Riquelme, Luis F
2018-05-01
To assess consensual validity, interrater reliability, and criterion validity of the International Dysphagia Diet Standardisation Initiative Functional Diet Scale, a new functional outcome scale intended to capture the severity of oropharyngeal dysphagia, as represented by the degree of diet texture restriction recommended for the patient. Participants assigned International Dysphagia Diet Standardisation Initiative Functional Diet Scale scores to 16 clinical cases. Consensual validity was measured against reference scores determined by an author reference panel. Interrater reliability was measured overall and across quartile subsets of the dataset. Criterion validity was evaluated versus Functional Oral Intake Scale (FOIS) scores assigned by survey respondents to the same case scenarios. Feedback was requested regarding ease and likelihood of use. Web-based survey. Respondents (N=170) from 29 countries. Not applicable. Consensual validity (percent agreement and Kendall τ), criterion validity (Spearman rank correlation), and interrater reliability (Kendall concordance and intraclass coefficients). The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed strong consensual validity, criterion validity, and interrater reliability. Scenarios involving liquid-only diets, transition from nonoral feeding, or trial diet advances in therapy showed the poorest consensus, indicating a need for clear instructions on how to score these situations. The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed greater sensitivity than the FOIS to specific changes in diet. Most (>70%) respondents indicated enthusiasm for implementing the International Dysphagia Diet Standardisation Initiative Functional Diet Scale. 
This initial validation study suggests that the International Dysphagia Diet Standardisation Initiative Functional Diet Scale has strong consensual and criterion validity and can be used reliably by clinicians to capture diet texture restriction and progression in people with dysphagia. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Raymer, James; van der Erf, Rob; van Wissen, Leo
2010-01-01
Due to differences in definitions and measurement methods, cross-country comparisons of international migration patterns are difficult and confusing. Emigration numbers reported by sending countries tend to differ from the corresponding immigration numbers reported by receiving countries. In this paper, a methodology is presented to achieve harmonised estimates of migration flows benchmarked to a specific definition of duration. This methodology accounts for both differences in definitions and the effects of measurement error due to, for example, under reporting and sampling fluctuations. More specifically, the differences between the two sets of reported data are overcome by estimating a set of adjustment factors for each country’s immigration and emigration data. The adjusted data take into account any special cases where the origin–destination patterns do not match the overall patterns. The new method for harmonising migration flows that we present is based on earlier efforts by Poulain (European Journal of Population, 9(4): 353–381 1993, Working Paper 12, joint ECE-Eurostat Work Session on Migration Statistics, Geneva, Switzerland 1999) and is illustrated for movements between 19 European countries from 2002 to 2007. The results represent a reliable and consistent set of international migration flows that can be used for understanding recent changes in migration patterns, as inputs into population projections and for developing evidence-based migration policies. PMID:21124647
de Beer, Joop; Raymer, James; van der Erf, Rob; van Wissen, Leo
2010-11-01
Due to differences in definitions and measurement methods, cross-country comparisons of international migration patterns are difficult and confusing. Emigration numbers reported by sending countries tend to differ from the corresponding immigration numbers reported by receiving countries. In this paper, a methodology is presented to achieve harmonised estimates of migration flows benchmarked to a specific definition of duration. This methodology accounts for both differences in definitions and the effects of measurement error due to, for example, under reporting and sampling fluctuations. More specifically, the differences between the two sets of reported data are overcome by estimating a set of adjustment factors for each country's immigration and emigration data. The adjusted data take into account any special cases where the origin-destination patterns do not match the overall patterns. The new method for harmonising migration flows that we present is based on earlier efforts by Poulain (European Journal of Population, 9(4): 353-381 1993, Working Paper 12, joint ECE-Eurostat Work Session on Migration Statistics, Geneva, Switzerland 1999) and is illustrated for movements between 19 European countries from 2002 to 2007. The results represent a reliable and consistent set of international migration flows that can be used for understanding recent changes in migration patterns, as inputs into population projections and for developing evidence-based migration policies.
Yu, Tang-Qing; Lapelosa, Mauro; Vanden-Eijnden, Eric; Abrams, Cameron F
2015-03-04
We use Markovian milestoning molecular dynamics (MD) simulations on a tessellation of the collective variable space for CO localization in myoglobin to estimate the kinetics of entry, exit, and internal site-hopping. The tessellation is determined by analysis of the free-energy surface in that space using transition-path theory (TPT), which provides criteria for defining optimal milestones, allowing short, independent, cell-constrained MD simulations to provide properly weighted kinetic data. We coarse grain the resulting kinetic model at two levels: first, using crystallographically relevant internal cavities and their predicted interconnections and solvent portals; and second, as a three-state side-path scheme inspired by similar models developed from geminate recombination experiments. We show semiquantitative agreement with experiment on entry and exit rates and in the identification of the so-called "histidine gate" at position 64 through which ≈90% of flux between solvent and the distal pocket passes. We also show with six-dimensional calculations that the minimum free-energy pathway of escape through the histidine gate is a "knock-on" mechanism in which motion of the ligand and the gate are sequential and interdependent. In total, these results suggest that such TPT simulations are indeed a promising approach to overcome the practical time-scale limitations of MD to allow reliable estimation of transition mechanisms and rates among metastable states.
77 FR 47670 - Amended Certification Regarding Eligibility To Apply for Worker Adjustment Assistance
Federal Register 2010, 2011, 2012, 2013, 2014
2012-08-09
... DEPARTMENT OF LABOR Employment and Training Administration [TA-W-75,151; TA-W-75,151A] Amended... Reliability Center, A Subsidiary of Navistar International Corporation, Truck Division, Including All On-Site... Reliability Center, a Subsidiary of Navistar International Corporation, Truck Division, 3033 Wayne Trace, Fort...
Nurses as Evaluators of the Humanistic Behavior of Internal Medicine Residents.
ERIC Educational Resources Information Center
Butterfield, Paula S.; And Others
1987-01-01
The reliability of a 13-item questionnaire designed to assess the humanistic behaviors of internal medicine residents and the reliability of nurses as raters of those behaviors were examined. Residents were evaluated by nurses on two general medicine services and on cardiology and hematology-oncology services. (Author/MLW)
Internal Consistency Reliability of the Self-Report Antisocial Process Screening Device
ERIC Educational Resources Information Center
Poythress, Norman G.; Douglas, Kevin S.; Falkenbach, Diana; Cruise, Keith; Lee, Zina; Murrie, Daniel C.; Vitacco, Michael
2006-01-01
The self-report version of the Antisocial Process Screening Device (APSD) has become a popular measure for assessing psychopathic features in justice-involved adolescents. However, the internal consistency reliability of its component scales (Narcissism, Callous-Unemotional, and Impulsivity) has been questioned in several studies. This study…
The Effects of Participation Rate on the Internal Reliability of Peer Nomination Measures
ERIC Educational Resources Information Center
Marks, Peter E. L.; Babcock, Ben; Cillessen, Antonius H. N.; Crick, Nicki R.
2013-01-01
Although low participation rates have historically been considered problematic in peer nomination research, some researchers have recently argued that small proportions of participants can, in fact, provide adequate sociometric data. The current study used a classical measurement perspective to investigate the internal reliability (Cronbach's…
Item Analysis to Improve Reliability for an Internal Medicine Undergraduate OSCE
ERIC Educational Resources Information Center
Auewarakul, Chirayu; Downing, Steven M.; Praditsuwan, Rungnirand; Jaturatamrong, Uapong
2005-01-01
Utilization of objective structured clinical examinations (OSCEs) for final assessment of medical students in Internal Medicine requires a representative sample of OSCE stations. The reliability and generalizability of OSCE scores provides validity evidence for OSCE scores and supports its contribution to the final clinical grade of medical…
Alkhateeb, Haitham M
2004-06-01
The Arabic translation of the Mathematics Teaching Efficacy Beliefs was completed by 144 undergraduate students (M age=20.6) in Jordan. The findings support the internal reliability of the Arabic translation of the Mathematics Teaching Efficacy Beliefs as well as its construct validity.
Reducing random measurement error in assessing postural load on the back in epidemiologic surveys.
Burdorf, A
1995-02-01
The goal of this study was to design strategies to assess postural load on the back in occupational epidemiology by taking into account the reliability of measurement methods and the variability of exposure among the workers under study. Intermethod reliability studies were evaluated to estimate the systematic bias (accuracy) and random measurement error (precision) of various methods to assess postural load on the back. Intramethod reliability studies were reviewed to estimate random variability of back load over time. Intermethod surveys have shown that questionnaires have a moderate reliability for gross activities such as sitting, whereas duration of trunk flexion and rotation should be assessed by observation methods or inclinometers. Intramethod surveys indicate that exposure variability can markedly affect the reliability of estimates of back load if the estimates are based upon a single measurement over a certain time period. Equations have been presented to evaluate various study designs according to the reliability of the measurement method, the optimum allocation of the number of repeated measurements per subject, and the number of subjects in the study. Prior to a large epidemiologic study, an exposure-oriented survey should be conducted to evaluate the performance of measurement instruments and to estimate sources of variability for back load. The strategy for assessing back load can be optimized by balancing the number of workers under study and the number of repeated measurements per worker.
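The trade-off described above between the number of repeated measurements per worker and the reliability of the resulting exposure estimate can be illustrated with the standard variance-components formula for the reliability of a mean of k measurements. This is a minimal sketch, not the equations from the paper itself; the function name and the variance inputs are hypothetical.

```python
def reliability_of_mean(var_between, var_within, k):
    """Reliability of the mean of k repeated measurements per subject.

    var_between: variance of true exposure between subjects (assumed known)
    var_within:  within-subject variance over time plus measurement error
    k:           number of repeated measurements per subject
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if var_between < 0 or var_within < 0:
        raise ValueError("variances must be non-negative")
    # Averaging k measurements shrinks the within-subject variance by a factor of k.
    return var_between / (var_between + var_within / k)


# With equal between- and within-subject variance, a single measurement has
# reliability 0.5, and averaging four measurements raises it to 0.8.
single = reliability_of_mean(1.0, 1.0, 1)
averaged = reliability_of_mean(1.0, 1.0, 4)
```

Under this simple model, reliability rises with k at a diminishing rate, which is why a fixed measurement budget has to be balanced between more workers and more repeats per worker.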
The relationship between cost estimates reliability and BIM adoption: SEM analysis
NASA Astrophysics Data System (ADS)
Ismail, N. A. A.; Idris, N. H.; Ramli, H.; Rooshdi, R. R. Raja Muhammad; Sahamir, S. R.
2018-02-01
This paper presents the use of a Structural Equation Modelling (SEM) approach to analyse the effects of Building Information Modelling (BIM) technology adoption on the reliability of cost estimates. Based on questionnaire survey results, SEM analysis using the SPSS-AMOS application examined the relationships between BIM-improved information and cost estimate reliability factors, leading to BIM technology adoption. Six hypotheses were established prior to SEM analysis, employing two types of SEM models, namely the Confirmatory Factor Analysis (CFA) model and the full structural model. The SEM models were then validated through assessment of their uni-dimensionality, validity, reliability, and fitness index, in line with the hypotheses tested. The final SEM model fit measures are: P-value=0.000, RMSEA=0.079<0.08, GFI=0.824, CFI=0.962>0.90, TLI=0.956>0.90, NFI=0.935>0.90 and ChiSq/df=2.259, indicating that the overall index values achieved the required level of model fitness. The model supports all the hypotheses evaluated, confirming that all relationships among the constructs are positive and significant. Ultimately, the analysis verified that most of the respondents foresee a better understanding of project input information through BIM visualization, its reliable database and coordinated data, in developing more reliable cost estimates. They also expect BIM adoption to accelerate their cost estimating tasks.
Reliability of Space-Shuttle Pressure Vessels with Random Batch Effects
NASA Technical Reports Server (NTRS)
Feiveson, Alan H.; Kulkarni, Pandurang M.
2000-01-01
In this article we revisit the problem of estimating the joint reliability against failure by stress rupture of a group of fiber-wrapped pressure vessels used on Space-Shuttle missions. The available test data were obtained from an experiment conducted at the U.S. Department of Energy Lawrence Livermore Laboratory (LLL) in which scaled-down vessels were subjected to life testing at four accelerated levels of pressure. We estimate the reliability assuming that both the Shuttle and LLL vessels were chosen at random in a two-stage process from an infinite population with spools of fiber as the primary sampling unit. Two main objectives of this work are: (1) to obtain practical estimates of reliability taking into account random spool effects and (2) to obtain a realistic assessment of estimation accuracy under the random model. Here, reliability is calculated in terms of a 'system' of 22 fiber-wrapped pressure vessels, taking into account typical pressures and exposure times experienced by Shuttle vessels. Comparisons are made with previous studies. The main conclusion of this study is that, although point estimates of reliability are still in the 'comfort zone,' it is advisable to plan for replacement of the pressure vessels well before the expected Lifetime of 100 missions per Shuttle Orbiter. Under a random-spool model, there is simply not enough information in the LLL data to provide reasonable assurance that such replacement would not be necessary.
Devcich, Daniel A; Weller, Jennifer; Mitchell, Simon J; McLaughlin, Scott; Barker, Lauren; Rudolph, Jenny W; Raemer, Daniel B; Zammert, Martin; Singer, Sara J; Torrie, Jane; Frampton, Chris Ma; Merry, Alan F
2016-10-01
Realising the full potential of the WHO Surgical Safety Checklist (SSC) to reduce perioperative harm requires the constructive engagement of all operating room (OR) team members during its administration. To facilitate research on SSC implementation, a valid and reliable instrument is needed for measuring OR team behaviours during its administration. We developed a behaviourally anchored rating scale (BARS) for this purpose. We used a modified Delphi process, involving 16 subject matter experts, to compile a BARS with behavioural domains applicable to all three phases of the SSC. We evaluated the instrument in 80 adult OR cases and 30 simulated cases using two medical student raters and seven expert raters, respectively. Intraclass correlation coefficients were calculated to assess inter-rater reliability. Internal consistency and instrument discrimination were explored. Sample size estimates for potential study designs using the instrument were calculated. The Delphi process resulted in a BARS instrument (the WHOBARS) with five behavioural domains. Intraclass correlation coefficients calculated from the OR cases exceeded 0.80 for 80% of the instrument's domains across the SSC phases. The WHOBARS showed high internal consistency across the three phases of the SSC and ability to discriminate among surgical cases in both clinical and simulated settings. Fewer than 20 cases per group would be required to show a difference of 1 point between groups in studies of the SSC, where α=0.05 and β=0.8. We have developed a generic instrument for comprehensively rating the administration of the SSC and informing initiatives to realise its full potential. We have provided data supporting its capacity for discrimination, internal consistency and inter-rater reliability. Further psychometric evaluation is warranted. Published by the BMJ Publishing Group Limited. 
Transcultural adaptation of the Breast Cancer Awareness Measure.
Al-Khasawneh, E M; Leocadio, M; Seshan, V; Siddiqui, S T; Khan, A N; Al-Manaseer, M M
2016-09-01
To overcome the lack of a validated and robust Arabic instrument to measure breast cancer awareness. Currently, there is no validated Arabic instrument for measuring breast cancer awareness levels. We adapted, translated and validated the Breast Cancer Awareness Measure developed by Cancer Research UK. The instrument was translated into Arabic and back-translated for validation. Validation and reliability tests were conducted using a purposive sample of 972 Arab women older than 20 years, living in Oman. The adapted content was validated by a panel of medical, linguistic and cultural experts, followed by cognitive interviews (n = 10), behavioural coding (n = 30) and criterion validation (n = 646). The instrument was tested for acceptability and its subscales for internal consistency. Inter-rater reliability was estimated between two similar groups (n = 144 and n = 142) to test homogeneity. The adapted and translated instrument had a high acceptability (98.7% completed). The validation process shaped the adaptation, and resulted in strong criterion validity (R = 0.58, P < 0.01). The instrument subscales for risk factors and warning signs had high internal consistency (Cronbach's alpha 0.856 and 0.890, respectively), with all floor and ceiling effects less than 15%. The correlation measure for inter-rater reliability was 0.97 (P < 0.01). Through the incorporation of contextual characteristics and prevalent beliefs among Arab populations, the adapted Breast Cancer Awareness Measure is a robust Arabic instrument for the measurement of breast cancer awareness and early detection practices among Arab women. The purposively selected sample may not be representative of the population. Improvement of awareness and early detection of breast cancer can contribute towards reducing mortality from the disease. The adapted instrument has policy implications, since measurement of awareness levels is essential towards breast health promotion policies in Arab countries.
© 2016 International Council of Nurses.
Psychometrics Matter in Health Behavior: A Long-term Reliability Generalization Study.
Pickett, Andrew C; Valdez, Danny; Barry, Adam E
2017-09-01
Despite numerous calls for increased understanding and reporting of reliability estimates, social science research, including the field of health behavior, has been slow to respond and adopt such practices. Therefore, we offer a brief overview of reliability and common reporting errors; we then perform analyses to examine and demonstrate the variability of reliability estimates by sample and over time. Using meta-analytic reliability generalization, we examined the variability of coefficient alpha scores for a well-designed, consistent, nationwide health study, covering a span of nearly 40 years. For each year and sample, reliability varied. Furthermore, reliability was predicted by a sample characteristic that differed among age groups within each administration. We demonstrated that reliability is influenced by the methods and individuals from which a given sample is drawn. Our work echoes previous calls that psychometric properties, particularly reliability of scores, are important and must be considered and reported before drawing statistical conclusions.
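Coefficient alpha, the statistic whose variability this study tracks, can be computed directly from an item-by-respondent score matrix. The following is a minimal sketch of the textbook formula, assuming complete data with no missing responses; the function name and the toy scores are illustrative and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if len(scores) < 2:
        raise ValueError("alpha needs at least two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total (summed) score across respondents.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Two items that rank respondents identically give alpha = 1;
# two uncorrelated items give alpha = 0.
consistent = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
unrelated = cronbach_alpha([[1, 1], [1, 2], [2, 1], [2, 2]])
```

Because the item and total variances are properties of the sample at hand, the same instrument can yield different alpha values in different samples, which is the point the abstract makes.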
Obtaining reliable phase-gradient delays from otoacoustic emission data.
Shera, Christopher A; Bergevin, Christopher
2012-08-01
Reflection-source otoacoustic emission phase-gradient delays are widely used to obtain noninvasive estimates of cochlear function and properties, such as the sharpness of mechanical tuning and its variation along the length of the cochlear partition. Although different data-processing strategies are known to yield different delay estimates and trends, their relative reliability has not been established. This paper uses in silico experiments to evaluate six methods for extracting delay trends from reflection-source otoacoustic emissions (OAEs). The six methods include both previously published procedures (e.g., phase smoothing, energy-weighting, data exclusion based on signal-to-noise ratio) and novel strategies (e.g., peak-picking, all-pass factorization). Although some of the methods perform well (e.g., peak-picking), others introduce substantial bias (e.g., phase smoothing) and are not recommended. In addition, since standing waves caused by multiple internal reflection can complicate the interpretation and compromise the application of OAE delays, this paper develops and evaluates two promising signal-processing strategies, the first based on time-frequency filtering using the continuous wavelet transform and the second on cepstral analysis, for separating the direct emission from its subsequent reflections. Altogether, the results help to resolve previous disagreements about the frequency dependence of human OAE delays and the sharpness of cochlear tuning while providing useful analysis methods for future studies.
Numerosity estimation benefits from transsaccadic information integration
Hübner, Carolin; Schütz, Alexander C.
2017-01-01
Humans achieve a stable and homogeneous representation of their visual environment, although visual processing varies across the visual field. Here we investigated the circumstances under which peripheral and foveal information is integrated for numerosity estimation across saccades. We asked our participants to judge the number of black and white dots on a screen. Information was presented either in the periphery before a saccade, in the fovea after a saccade, or in both areas consecutively to measure transsaccadic integration. In contrast to previous findings, we found an underestimation of numerosity for foveal presentation and an overestimation for peripheral presentation. We used a maximum-likelihood model to predict accuracy and reliability in the transsaccadic condition based on peripheral and foveal values. We found near-optimal integration of peripheral and foveal information, consistently with previous findings about orientation integration. In three consecutive experiments, we disrupted object continuity between the peripheral and foveal presentations to probe the limits of transsaccadic integration. Even for global changes on our numerosity stimuli, no influence of object discontinuity was observed. Overall, our results suggest that transsaccadic integration is a robust mechanism that also works for complex visual features such as numerosity and is operative despite internal or external mismatches between foveal and peripheral information. Transsaccadic integration facilitates an accurate and reliable perception of our environment. PMID:29149766
Forde, David R; Baron, Stephen W; Scher, Christine D; Stein, Murray B
2012-01-01
This study examines the psychometric properties of the Childhood Trauma Questionnaire short form (CTQ-SF) with street youth who have run away or been expelled from their homes (N = 397). Internal reliability coefficients for the five clinical scales ranged from .65 to .95. Confirmatory Factor Analysis (CFA) was used to test the five-factor structure of the scales yielding acceptable fit for the total sample. Additional multigroup analyses were performed to consider items by gender. Results provided only evidence of weak factorial invariance. Constrained models showed invariance in configuration, factor loadings, and factor covariances but failed for equality of intercepts. Mean trauma scores for street youth tended to fall in the moderate to severe range on all abuse/neglect clinical scales. Females reported higher levels of abuse and neglect. Prevalence of child maltreatment of individual forms was very high with 98% of street youth reporting one or more forms; 27.4% of males and 48.9% of females reported all five forms. Results of this study support the viability of the CTQ-SF for screening maltreatment in a highly vulnerable street population. Caution is recommended when comparing prevalence estimates for male and female street youth given the failure of the strong factorial multigroup model.
Caradot, Nicolas; Sonnenberg, Hauke; Rouault, Pascale; Gruber, Günter; Hofer, Thomas; Torres, Andres; Pesci, Maria; Bertrand-Krajewski, Jean-Luc
2015-01-01
This paper reports on experience gathered from five online monitoring campaigns in the sewer systems of Berlin (Germany), Graz (Austria), Lyon (France) and Bogota (Colombia) using ultraviolet-visible (UV-VIS) spectrometers and turbidimeters. Online probes are useful for the measurement of highly dynamic processes, e.g. combined sewer overflows (CSO), storm events, and river impacts. The influence of local calibration on the quality of online chemical oxygen demand (COD) measurements of wet weather discharges has been assessed. Results underline the need to establish local calibration functions for both UV-VIS spectrometers and turbidimeters. It is suggested that practitioners calibrate their probes locally using at least 15-20 samples. However, these samples should be collected over several events and cover most of the natural variability of the measured concentration. For this reason, the use of automatic peristaltic samplers in parallel to online monitoring is recommended with short representative sampling campaigns during wet weather discharges. Using reliable calibration functions, COD loads of CSO and storm events can be estimated with a relative uncertainty of approximately 20%. If no local calibration is established, concentrations and loads are estimated with a high error rate, questioning the reliability and meaning of the online measurement. Similar results have been obtained for total suspended solids measurements.
Reliability analysis of structural ceramic components using a three-parameter Weibull distribution
NASA Technical Reports Server (NTRS)
Duffy, Stephen F.; Powers, Lynn M.; Starlinger, Alois
1992-01-01
Described here are nonlinear regression estimators for the three-parameter Weibull distribution. Issues relating to the bias and invariance associated with these estimators are examined numerically using Monte Carlo simulation methods. The estimators were used to extract parameters from sintered silicon nitride failure data. A reliability analysis was performed on a turbopump blade utilizing the three-parameter Weibull distribution and the estimates from the sintered silicon nitride data.
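For readers unfamiliar with the three-parameter form, the sketch below evaluates the standard three-parameter Weibull survival (reliability) function for a given stress. It shows only the distribution itself, not the nonlinear regression estimators or the component analysis from the report; the function and parameter names are assumptions chosen for illustration.

```python
import math


def weibull3_survival(stress, shape, scale, threshold):
    """Probability of survival at `stress` under a three-parameter Weibull model.

    shape:     Weibull modulus (> 0)
    scale:     scale parameter (> 0), in the same units as stress
    threshold: stress below which failure probability is zero
    """
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    if stress <= threshold:
        # No failures are predicted at or below the threshold stress.
        return 1.0
    return math.exp(-(((stress - threshold) / scale) ** shape))


# At a stress one scale unit above the threshold, survival is exp(-1),
# regardless of the shape parameter.
survival_at_scale = weibull3_survival(stress=250.0, shape=2.0, scale=50.0, threshold=200.0)
```

Setting the threshold to zero recovers the more common two-parameter Weibull model.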
Proposed Reliability/Cost Model
NASA Technical Reports Server (NTRS)
Delionback, L. M.
1982-01-01
New technique estimates cost of improvement in reliability for complex system. Model format/approach is dependent upon use of subsystem cost-estimating relationships (CER's) in devising cost-effective policy. Proposed methodology should have application in broad range of engineering management decisions.
to do so, and (5) three distinct versions of the problem of estimating component reliability from system failure-time data are treated, each resulting in consistent estimators with asymptotically normal distributions.
Johnston, Lisa G; McLaughlin, Katherine R; Rhilani, Houssine El; Latifi, Amina; Toufik, Abdalla; Bennani, Aziza; Alami, Kamal; Elomari, Boutaina; Handcock, Mark S
2015-01-01
Background Respondent-driven sampling is used worldwide to estimate the population prevalence of characteristics such as HIV/AIDS and associated risk factors in hard-to-reach populations. Estimating the total size of these populations is of great interest to national and international organizations, however reliable measures of population size often do not exist. Methods Successive Sampling-Population Size Estimation (SS-PSE) along with network size imputation allows population size estimates to be made without relying on separate studies or additional data (as in network scale-up, multiplier and capture-recapture methods), which may be biased. Results Ten population size estimates were calculated for people who inject drugs, female sex workers, men who have sex with other men, and migrants from sub-Sahara Africa in six different cities in Morocco. SS-PSE estimates fell within or very close to the likely values provided by experts and the estimates from previous studies using other methods. Conclusions SS-PSE is an effective method for estimating the size of hard-to-reach populations that leverages important information within respondent-driven sampling studies. The addition of a network size imputation method helps to smooth network sizes allowing for more accurate results. However, caution should be used particularly when there is reason to believe that clustered subgroups may exist within the population of interest or when the sample size is small in relation to the population. PMID:26258908
Duncan, Laura; Comeau, Jinette; Wang, Li; Vitoroulis, Irene; Boyle, Michael H; Bennett, Kathryn
2018-02-19
A better understanding of factors contributing to the observed variability in estimates of test-retest reliability in published studies on standardized diagnostic interviews (SDI) is needed. The objectives of this systematic review and meta-analysis were to estimate the pooled test-retest reliability for parent and youth assessments of seven common disorders, and to examine sources of between-study heterogeneity in reliability. Following a systematic review of the literature, multilevel random effects meta-analyses were used to analyse 202 reliability estimates (Cohen's kappa = κ) from 31 eligible studies and 5,369 assessments of 3,344 children and youth. Pooled reliability was moderate at κ = .58 (95% CI 0.53-0.63) and between-study heterogeneity was substantial (Q = 2,063 (df = 201), p < .001 and I² = 79%). In subgroup analysis, reliability varied across informants for specific types of psychiatric disorder (κ = .53-.69 for parent vs. κ = .39-.68 for youth) with estimates significantly higher for parents on attention deficit hyperactivity disorder, oppositional defiant disorder and the broad groupings of externalizing and any disorder. Reliability was also significantly higher in studies with indicators of poor or fair study methodology quality (sample size <50, retest interval <7 days). Our findings raise important questions about the meaningfulness of published evidence on the test-retest reliability of SDIs and the usefulness of these tools in both clinical and research contexts. Potential remedies include the introduction of standardized study and reporting requirements for reliability studies, and exploration of other approaches to assessing and classifying child and adolescent psychiatric disorder. © 2018 Association for Child and Adolescent Mental Health.
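Each of the pooled estimates above is a Cohen's kappa comparing diagnoses at test and retest. As a reminder of what a single such estimate measures, here is a minimal sketch of unweighted kappa for two sets of categorical ratings. It illustrates the statistic only, not the multilevel meta-analysis; the function name and the toy diagnoses are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length sequences of category labels."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# 1 = disorder present, 0 = absent, for four children at test and retest.
# Observed agreement is 0.75, chance agreement is 0.5, so kappa is 0.5.
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Because kappa corrects for chance agreement, it depends on how common the disorder is in the sample, which is one reason estimates can differ between studies.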
Validation of the Spanish Version of the Mammography-Specific Self-Efficacy Scale.
Jerome-D'Emilia, Bonnie; Suplee, Patricia; Akincigil, Ayse
2015-05-01
To consider psychometric estimates of the validity and reliability of the Spanish translation of a mammography-specific self-efficacy scale. A cross-sectional study. Three primarily Hispanic churches and a Hispanic community center in a low-income urban area of New Jersey. 153 low-income Hispanic women aged 40-85 years. The translated scale was administered to participants during a six-month period. Internal consistency, reliability, and construct and predictive validity were assessed. Demographic variables included income and insurance status. Outcome variables included total mammography-specific self-efficacy and having had a mammogram within the past two years. Preliminary evidence of reliability and validity were found, and predictive validity was demonstrated. The health needs of specific populations can be addressed only when research instruments have been appropriately validated and all relevant factors are considered. Diverse groups of low-income women face similar challenges and barriers in their efforts to get screened. Nurses are in an ideal position to help women with preventive care decision making (e.g., screening for breast cancer). Understanding how a woman's level of self-efficacy affects her decision making should be considered when counseling a client.
Recovering bridge deflections from collocated acceleration and strain measurements
NASA Astrophysics Data System (ADS)
Bell, M.; Ma, T. W.; Xu, N. S.
2015-04-01
In this research, an internal model based method is proposed to estimate the displacement profile of a bridge subjected to a moving traffic load using a combination of acceleration and strain measurements. The structural response is assumed to be within the linear range. The deflection profile is assumed to be dominated by the fundamental mode of the bridge, therefore only requiring knowledge of the first mode. This still holds true under a multiple vehicle loading situation, as the higher mode shapes do not affect the overall response of the structure. Using the structural modal parameters and partial knowledge of the moving vehicle load, the internal models of the structure and the moving load can be respectively established, which can be used to form an autonomous state-space representation of the system. The structural displacements, velocities, and accelerations are the states of such a system, and it is fully observable when the measured output contains structural accelerations and strains. Reliable estimates of structural displacements are obtained using the standard Kalman filtering technique. The effectiveness and robustness of the proposed method have been demonstrated and evaluated via numerical simulation of a simply supported single span concrete bridge subjected to a moving traffic load.
Femoral anatomical frame: assessment of various definitions.
Della Croce, U; Camomilla, V; Leardini, A; Cappozzo, A
2003-06-01
The reliability of the estimate of joint kinematic variables and the relevant functional interpretation are affected by the uncertainty with which bony anatomical landmarks and underlying bony segment anatomical frames are determined. When a stereo-photogrammetric system is used for in vivo studies, minimising and compensating for this uncertainty is crucial. This paper deals with the propagation of the errors associated with the location of both internal and palpable femoral anatomical landmarks to the estimation of the orientation of the femoral anatomical frame and to the knee joint angles during movement. Given eight anatomical landmarks, and the precision with which they can be identified experimentally, 12 different rules were defined for the construction of the anatomical frame and submitted to comparative assessment. Results showed that using more than three landmarks allows for more repeatable anatomical frame orientation and knee joint kinematics estimation. Novel rules are proposed that use optimization algorithms. On average, the femoral frame orientation dispersion had a standard deviation of 2, 2.5 and 1.5 degrees for the frontal, transverse, and sagittal plane, respectively. However, a proper choice of the relevant construction rule allowed for a reduction of these inaccuracies in selected planes to 1 degree rms. The dispersion of the knee adduction-abduction and internal-external rotation angles could also be limited to 1 degree rms irrespective of the flexion angle value.
Ning, Jia; Schubert, Tilman; Johnson, Kevin M; Roldán-Alzate, Alejandro; Chen, Huijun; Yuan, Chun; Reeder, Scott B
2018-06-01
To propose a simple method to correct vascular input function (VIF) due to inflow effects and to test whether the proposed method can provide more accurate VIFs for improved pharmacokinetic modeling. A spoiled gradient echo sequence-based inflow quantification and contrast agent concentration correction method was proposed. Simulations were conducted to illustrate improvement in the accuracy of VIF estimation and pharmacokinetic fitting. Animal studies with dynamic contrast-enhanced MR scans were conducted before, 1 week after, and 2 weeks after portal vein embolization (PVE) was performed in the left portal circulation of pigs. The proposed method was applied to correct the VIFs for model fitting. Pharmacokinetic parameters fitted using corrected and uncorrected VIFs were compared between different lobes and visits. Simulation results demonstrated that the proposed method can improve accuracy of VIF estimation and pharmacokinetic fitting. In animal study results, pharmacokinetic fitting using corrected VIFs demonstrated changes in perfusion consistent with changes expected after PVE, whereas the perfusion estimates derived by uncorrected VIFs showed no significant changes. The proposed correction method improves accuracy of VIFs and therefore provides more precise pharmacokinetic fitting. This method may be promising in improving the reliability of perfusion quantification. Magn Reson Med 79:3093-3102, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
A model describing diffusion in prostate cancer.
Gilani, Nima; Malcolm, Paul; Johnson, Glyn
2017-07-01
Quantitative diffusion MRI has frequently been studied as a means of grading prostate cancer. Interpretation of results is complicated by the nature of prostate tissue, which consists of four distinct compartments: vascular, ductal lumen, epithelium, and stroma. Current diffusion measurements are an ill-defined weighted average of these compartments. In this study, prostate diffusion is analyzed in terms of a model that takes explicit account of tissue compartmentalization, exchange effects, and the non-Gaussian behavior of tissue diffusion. The model assumes that exchange between the cellular (ie, stromal plus epithelial) and the vascular and ductal compartments is slow. Ductal and cellular diffusion characteristics are estimated by Monte Carlo simulation and a two-compartment exchange model, respectively. Vascular pseudodiffusion is represented by an additional signal at b = 0. Most model parameters are obtained either from published data or by comparing model predictions with the published results from 41 studies. Model prediction error is estimated using 10-fold cross-validation. Agreement between model predictions and published results is good. The model satisfactorily explains the variability of ADC estimates found in the literature. A reliable model that predicts the diffusion behavior of benign and cancerous prostate tissue of different Gleason scores has been developed. Magn Reson Med 78:316-326, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Examining the reliability of ADAS-Cog change scores.
Grochowalski, Joseph H; Liu, Ying; Siedlecki, Karen L
2016-09-01
The purpose of this study was to estimate and examine ways to improve the reliability of change scores on the Alzheimer's Disease Assessment Scale, Cognitive Subtest (ADAS-Cog). The sample, provided by the Alzheimer's Disease Neuroimaging Initiative, included individuals with Alzheimer's disease (AD) (n = 153) and individuals with mild cognitive impairment (MCI) (n = 352). All participants were administered the ADAS-Cog at baseline and 1 year, and change scores were calculated as the difference in scores over the 1-year period. Three types of change score reliabilities were estimated using multivariate generalizability. Two methods to increase change score reliability were evaluated: reweighting the subtests of the scale and adding more subtests. Reliability of ADAS-Cog change scores over 1 year was low for both the AD sample (ranging from .53 to .64) and the MCI sample (.39 to .61). Reweighting the change scores from the AD sample improved reliability (.68 to .76), but lengthening provided no useful improvement for either sample. The MCI change scores had low reliability, even with reweighting and adding additional subtests. The ADAS-Cog scores had low reliability for measuring change. Researchers using the ADAS-Cog should estimate and report reliability for their use of the change scores. The ADAS-Cog change scores are not recommended for assessment of meaningful clinical change.
ERIC Educational Resources Information Center
Oakland, Thomas
New strategies for evaluating criterion-referenced measures (CRM) are discussed. These strategies examine the following issues: (1) the use of norm-referenced measures (NRM) as CRM and then estimating the reliability and validity of such measures in terms of variance from an arbitrarily specified criterion score, (2) estimation of the…
A Note on the Reliability Coefficients for Item Response Model-Based Ability Estimates
ERIC Educational Resources Information Center
Kim, Seonghoon
2012-01-01
Assuming item parameters on a test are known constants, the reliability coefficient for item response theory (IRT) ability estimates is defined for a population of examinees in two different ways: as (a) the product-moment correlation between ability estimates on two parallel forms of a test and (b) the squared correlation between the true…
Wu, Joseph T.; Ho, Andrew; Ma, Edward S. K.; Lee, Cheuk Kwong; Chu, Daniel K. W.; Ho, Po-Lai; Hung, Ivan F. N.; Ho, Lai Ming; Lin, Che Kit; Tsang, Thomas; Lo, Su-Vui; Lau, Yu-Lung; Leung, Gabriel M.
2011-01-01
Background In an emerging influenza pandemic, estimating severity (the probability of a severe outcome, such as hospitalization, if infected) is a public health priority. As many influenza infections are subclinical, sero-surveillance is needed to allow reliable real-time estimates of infection attack rate (IAR) and severity. Methods and Findings We tested 14,766 sera collected during the first wave of the 2009 pandemic in Hong Kong using viral microneutralization. We estimated IAR and infection-hospitalization probability (IHP) from the serial cross-sectional serologic data and hospitalization data. Had our serologic data been available weekly in real time, we would have obtained reliable IHP estimates 1 wk after, 1–2 wk before, and 3 wk after epidemic peak for individuals aged 5–14 y, 15–29 y, and 30–59 y. The ratio of IAR to pre-existing seroprevalence, which decreased with age, was a major determinant for the timeliness of reliable estimates. If we began sero-surveillance 3 wk after community transmission was confirmed, with 150, 350, and 500 specimens per week for individuals aged 5–14 y, 15–19 y, and 20–29 y, respectively, we would have obtained reliable IHP estimates for these age groups 4 wk before the peak. For 30–59 y olds, even 800 specimens per week would not have generated reliable estimates until the peak because the ratio of IAR to pre-existing seroprevalence for this age group was low. The performance of serial cross-sectional sero-surveillance substantially deteriorates if test specificity is not near 100% or pre-existing seroprevalence is not near zero. These potential limitations could be mitigated by choosing a higher titer cutoff for seropositivity. If the epidemic doubling time is longer than 6 d, then serial cross-sectional sero-surveillance with 300 specimens per week would yield reliable estimates when IAR reaches around 6%–10%. 
Conclusions Serial cross-sectional serologic data together with clinical surveillance data can allow reliable real-time estimates of IAR and severity in an emerging pandemic. Sero-surveillance for pandemics should be considered. Please see later in the article for the Editors' Summary PMID:21990967
Smile line assessment comparing quantitative measurement and visual estimation.
Van der Geld, Pieter; Oosterveld, Paul; Schols, Jan; Kuijpers-Jagtman, Anne Marie
2011-02-01
Esthetic analysis of dynamic functions such as spontaneous smiling is feasible by using digital videography and computer measurement for lip line height and tooth display. Because quantitative measurements are time-consuming, digital videography and semiquantitative (visual) estimation according to a standard categorization are more practical for regular diagnostics. Our objective in this study was to compare 2 semiquantitative methods with quantitative measurements for reliability and agreement. The faces of 122 male participants were individually registered by using digital videography. Spontaneous and posed smiles were captured. On the records, maxillary lip line heights and tooth display were digitally measured on each tooth and also visually estimated according to 3-grade and 4-grade scales. Two raters were involved. An error analysis was performed. Reliability was established with kappa statistics. Interexaminer and intraexaminer reliability values were high, with median kappa values from 0.79 to 0.88. Agreement of the 3-grade scale estimation with quantitative measurement showed higher median kappa values (0.76) than the 4-grade scale estimation (0.66). Differentiating high and gummy smile lines (4-grade scale) resulted in greater inaccuracies. The estimation of a high, average, or low smile line for each tooth showed high reliability close to quantitative measurements. Smile line analysis can be performed reliably with a 3-grade scale (visual) semiquantitative estimation. For a more comprehensive diagnosis, additional measuring is proposed, especially in patients with disproportional gingival display. Copyright © 2011 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
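The kappa statistic mentioned above corrects raw agreement between two raters for the agreement expected by chance. A minimal sketch of unweighted Cohen's kappa follows; whether the study used a weighted or unweighted variant is not stated, and the function name and example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical 3-grade smile line ratings from two raters.
first = ["low", "low", "average", "average", "high", "high"]
second = ["low", "low", "average", "high", "high", "high"]
kappa = cohens_kappa(first, second)
```

For ordered grades such as these, a weighted kappa that penalizes large disagreements more than small ones is a common alternative.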
NASA Astrophysics Data System (ADS)
Kamiaka, Shoya; Benomar, Othman; Suto, Yasushi
2018-05-01
Advances in asteroseismology of solar-like stars now provide a unique method to estimate the stellar inclination i⋆. This makes it possible to evaluate the spin-orbit angle of transiting planetary systems, in a complementary fashion to the Rossiter-McLaughlin effect, a well-established method to estimate the projected spin-orbit angle λ. Although the asteroseismic method has been broadly applied to the Kepler data, its reliability has yet to be assessed intensively. In this work, we evaluate the accuracy of i⋆ from asteroseismology of solar-like stars using 3000 simulated power spectra. We find that the low signal-to-noise ratio of the power spectra induces a systematic under-estimate (over-estimate) bias for stars with high (low) inclinations. We derive analytical criteria for a reliable asteroseismic estimate, which indicate that reliable measurements are possible in the range of 20° ≲ i⋆ ≲ 80° only for stars with high signal-to-noise ratio. We also analyse and measure the stellar inclination of 94 Kepler main-sequence solar-like stars, among which 33 are planetary hosts. According to our reliability criteria, a third of them (9 with planets, 22 without) have accurate stellar inclinations. Comparison of our asteroseismic estimate of vsin i⋆ against spectroscopic measurements indicates that the latter suffer from a large uncertainty, possibly due to the modeling of macro-turbulence, especially for stars with projected rotation speed vsin i⋆ ≲ 5 km/s. This reinforces earlier claims, and the stellar inclination estimated from the combination of spectroscopic measurements and photometric variation for slowly rotating stars needs to be interpreted with caution.
A method of bias correction for maximal reliability with dichotomous measures.
Penev, Spiridon; Raykov, Tenko
2010-02-01
This paper is concerned with the reliability of weighted combinations of a given set of dichotomous measures. Maximal reliability for such measures has been discussed in the past, but the pertinent estimator exhibits a considerable bias and mean squared error for moderate sample sizes. We examine this bias, propose a procedure for bias correction, and develop a more accurate asymptotic confidence interval for the resulting estimator. In most empirically relevant cases, the bias correction and mean squared error correction can be performed simultaneously. We propose an approximate (asymptotic) confidence interval for the maximal reliability coefficient, discuss the implementation of this estimator, and investigate the mean squared error of the associated asymptotic approximation. We illustrate the proposed methods using a numerical example.
Practical Issues in Implementing Software Reliability Measurement
NASA Technical Reports Server (NTRS)
Nikora, Allen P.; Schneidewind, Norman F.; Everett, William W.; Munson, John C.; Vouk, Mladen A.; Musa, John D.
1999-01-01
Many ways of estimating software systems' reliability, or reliability-related quantities, have been developed over the past several years. Of particular interest are methods that can be used to estimate a software system's fault content prior to test, or to discriminate between components that are fault-prone and those that are not. The results of these methods can be used to: 1) More accurately focus scarce fault identification resources on those portions of a software system most in need of it. 2) Estimate and forecast the risk of exposure to residual faults in a software system during operation, and develop risk and safety criteria to guide the release of a software system to fielded use. 3) Estimate the efficiency of test suites in detecting residual faults. 4) Estimate the stability of the software maintenance process.
NASA Astrophysics Data System (ADS)
Rosas, Pedro; Wagemans, Johan; Ernst, Marc O.; Wichmann, Felix A.
2005-05-01
A number of models of depth-cue combination suggest that the final depth percept results from a weighted average of independent depth estimates based on the different cues available. The weight of each cue in such an average is thought to depend on the reliability of each cue. In principle, such a depth estimation could be statistically optimal in the sense of producing the minimum-variance unbiased estimator that can be constructed from the available information. Here we test such models by using visual and haptic depth information. Different texture types produce differences in slant-discrimination performance, thus providing a means for testing a reliability-sensitive cue-combination model with texture as one of the cues to slant. Our results show that the weights for the cues were generally sensitive to their reliability but fell short of statistically optimal combination - we find reliability-based reweighting but not statistically optimal cue combination.
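The statistically optimal benchmark described above weights each cue in proportion to its reliability, taken here as the inverse of its variance. The sketch below assumes independent, unbiased cue estimates; the function name and the slant values are hypothetical and are not data from the study.

```python
def combine_cues(estimates, variances):
    """Reliability-weighted (inverse-variance) combination of independent cues.

    Returns the combined estimate, its variance, and the cue weights.
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # Reliability of a cue is the inverse of its variance.
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    # The combined variance is never larger than that of the best single cue.
    combined_variance = 1.0 / total
    return combined, combined_variance, weights


# Hypothetical slant estimates (degrees): a reliable visual cue, a noisier haptic cue.
slant, slant_variance, cue_weights = combine_cues([30.0, 40.0], [4.0, 12.0])
```

An observer who reweights cues by reliability but falls short of optimality, as reported above, would show weights that move in the same direction as these without matching them.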
Examination of simplified travel demand model. [Internal volume forecasting model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, R.L. Jr.; McFarlane, W.J.
1978-01-01
A simplified travel demand model, the Internal Volume Forecasting (IVF) model, proposed by Low in 1972 is evaluated as an alternative to the conventional urban travel demand modeling process. The calibration of the IVF model for a county-level study area in Central Wisconsin results in what appears to be a reasonable model; however, analysis of the structure of the model reveals two primary mis-specifications. Correction of the mis-specifications leads to a simplified gravity model version of the conventional urban travel demand models. Application of the original IVF model to "forecast" 1960 traffic volumes based on the model calibrated for 1970 produces accurate estimates. Shortcut and ad hoc models may appear to provide reasonable results in both the base and horizon years; however, as shown by the IVF model, such models will not always provide a reliable basis for transportation planning and investment decisions.
The reliability paradox of the Parent-Child Conflict Tactics Corporal Punishment Subscale.
Lorber, Michael F; Slep, Amy M Smith
2018-02-01
In the present investigation we consider and explain an apparent paradox in the measurement of corporal punishment with the Parent-Child Conflict Tactics Scale (CTS-PC): How can it have poor internal consistency and still be reliable? The CTS-PC was administered to a community sample of 453 opposite sex couples who were parents of 3- to 7-year-old children. Internal consistency was marginal, yet item response theory analyses revealed that reliability rose sharply with increasing corporal punishment, exceeding .80 in the upper ranges of the construct. The results suggest that the CTS-PC Corporal Punishment subscale reliably discriminates among parents who report average to high corporal punishment (64% of mothers and 56% of fathers in the present sample), despite low overall internal consistency. These results have straightforward implications for the use and reporting of the scale. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Polcin, Douglas L.; Galloway, Gantt P.; Bond, Jason; Korcha, Rachael; Greenfield, Thomas K.
2008-01-01
The addiction field lacks an accepted definition and reliable measure of confrontation. The Alcohol and Drug Confrontation Scale (ADCS) defines confrontation as warnings about the potential consequences of substance use. To assess psychometric properties, 323 individuals entering recovery houses in U.S. urban and suburban areas were interviewed between 2003 and 2005 (20% women, 68% white). Analyses included test-retest reliability, confirmatory factor analysis, and measures of internal consistency. Findings support the ADCS as a reliable way of assessing two factors: Internal Support and External Intensity. Confrontation was experienced as supportive, accurate and helpful. Additional studies should assess confrontation in different contexts. PMID:20686635
NASA Astrophysics Data System (ADS)
Mahmood, Faleh H.; Kadhim, Hussein T.; Resen, Ali K.; Shaban, Auday H.
2018-05-01
Failures such as air-gap irregularities, rubbing, and scraping between the generator's stator and rotor arise unavoidably and can have severe consequences for a wind turbine. More attention should therefore be paid to detecting bearing failures in wind turbines and identifying their causes in order to improve operational reliability. This paper uses power spectral density analysis of the generator's stator current signal to detect inner-race and outer-race bearing failures in a micro wind turbine. The results show that the method is well suited to, and effective for, bearing failure detection.
Design of power-plant installations pressure-loss characteristics of duct components
NASA Technical Reports Server (NTRS)
Henry, John R
1944-01-01
A correlation of what are believed to be the most reliable data available on duct components of aircraft power-plant installations is presented. The information is given in a convenient form and is offered as an aid in designing duct systems and, subject to certain qualifications, as a guide in estimating their performance. The design and performance data include those for straight ducts; simple bends of square, circular, and elliptical cross sections; compound bends; diverging and converging bends; vaned bends; diffusers; branch ducts; internal inlets; and an angular placement of heat exchangers. Examples are included to illustrate methods of applying these data in analyzing duct systems. (author)
The Trunk Impairment Scale - modified to ordinal scales in the Norwegian version.
Gjelsvik, Bente; Breivik, Kyrre; Verheyden, Geert; Smedal, Tori; Hofstad, Håkon; Strand, Liv Inger
2012-01-01
To translate the Trunk Impairment Scale (TIS), a measure of trunk control in patients after stroke, into Norwegian (TIS-NV), and to explore its construct validity, internal consistency, intertester and test-retest reliability. TIS was translated according to international guidelines. The validity study was performed on data from 201 patients with acute stroke. Fifty patients with stroke and acquired brain injury were recruited to examine intertester and test-retest reliability. Construct validity was analyzed with exploratory and confirmatory factor analysis and item response theory, internal consistency with Cronbach's alpha test, and intertester and test-retest reliability with kappa and intraclass correlation coefficient tests. The back-translated version of TIS-NV was validated by the original developer. The subscale Static sitting balance was removed. By combining items from the subscales Dynamic sitting balance and Coordination, six ordinal superitems (testlets) were constructed. The TIS-NV was renamed the modified TIS-NV (TIS-modNV). After modifications the TIS-modNV fitted well to a locally dependent unidimensional item response theory model. It demonstrated good construct validity, excellent internal consistency, and high intertester and test-retest reliability for the total score. This study supports that the TIS-modNV is a valid and reliable scale for use in clinical practice and research.
Tan, Maw Pin; Nalathamby, Nemala; Mat, Sumaiyah; Tan, Pey June; Kamaruzzaman, Shahrul Bahyah; Morgan, Karen
2018-01-01
While the prevalence of falls among Malaysian older adults is comparable to other older populations around the world, little is currently known about fear of falling in Malaysia. The Falls Efficacy Scale International (FES-I) and short FES-I scales to measure fear of falling have not yet been validated for use within the Malaysian population, and are currently not available in Bahasa Malaysia (BM). A total of 402 participants aged ≥63 years were recruited. The questionnaire was readministered to 149 participants, 4 to 8 weeks after the first administration to determine test-retest reliability. The original version of the 7-item short FES-I is available in English, while the Mandarin was adapted from the 16-item Mandarin FES-I. The BM version was translated according to protocol by four experts. The internal structure of the FES-I was examined by factor analysis. The 7-item short FES-I showed good internal reliability and test-retest reliability for English, Mandarin, and BM versions for Malaysia.
Estimating Premorbid Cognitive Abilities in Low-Educated Populations
Apolinario, Daniel; Brucki, Sonia Maria Dozzi; Ferretti, Renata Eloah de Lucena; Farfel, José Marcelo; Magaldi, Regina Miksian; Busse, Alexandre Leopold; Jacob-Filho, Wilson
2013-01-01
Objective To develop an informant-based instrument that would provide a valid estimate of premorbid cognitive abilities in low-educated populations. Methods A questionnaire was drafted by focusing on the premorbid period with a 10-year time frame. The initial pool of items was submitted to classical test theory and a factorial analysis. The resulting instrument, named the Premorbid Cognitive Abilities Scale (PCAS), is composed of questions addressing educational attainment, major lifetime occupation, reading abilities, reading habits, writing abilities, calculation abilities, use of widely available technology, and the ability to search for specific information. The validation sample was composed of 132 older Brazilian adults from the following three demographically matched groups: normal cognitive aging (n = 72), mild cognitive impairment (n = 33), and mild dementia (n = 27). The scores of a reading test and a neuropsychological battery were adopted as construct criteria. Post-mortem inter-informant reliability was tested in a sub-study with two relatives from each deceased individual. Results All items presented good discriminative power, with corrected item-total correlation varying from 0.35 to 0.74. The summed score of the instrument presented high correlation coefficients with global cognitive function (r = 0.73) and reading skills (r = 0.82). Cronbach's alpha was 0.90, showing optimal internal consistency without redundancy. The scores did not decrease across the progressive levels of cognitive impairment, suggesting that the goal of evaluating the premorbid state was achieved. The intraclass correlation coefficient was 0.96, indicating excellent inter-informant reliability. Conclusion The instrument developed in this study has shown good properties and can be used as a valid estimate of premorbid cognitive abilities in low-educated populations. The applicability of the PCAS, both as an estimate of premorbid intelligence and cognitive reserve, is discussed. 
PMID:23555894
Accelerometer-based measures in physical activity surveillance: current practices and issues.
Pedišić, Željko; Bauman, Adrian
2015-02-01
Self-reports of physical activity (PA) have been the mainstay of measurement in most non-communicable disease (NCD) surveillance systems. To these, other measures are added to summate to a comprehensive PA surveillance system. Recently, some national NCD surveillance systems have started using accelerometers as a measure of PA. The purpose of this paper was specifically to appraise the suitability and role of accelerometers for population-level PA surveillance. A thorough literature search was conducted to examine aspects of the generalisability, reliability, validity, comprehensiveness and between-study comparability of accelerometer estimates, and to gauge the simplicity, cost-effectiveness, adaptability and sustainability of their use in NCD surveillance. Accelerometer data collected in PA surveillance systems may not provide estimates that are generalisable to the target population. Accelerometer-based estimates have adequate reliability for PA surveillance, but there are still several issues associated with their validity. Accelerometer-based prevalence estimates are largely dependent on the investigators' choice of intensity cut-off points. Maintaining standardised accelerometer data collections in long-term PA surveillance systems is difficult, which may cause discontinuity in time-trend data. The use of accelerometers does not necessarily produce useful between-study and international comparisons due to lack of standardisation of data collection and processing methods. To conclude, it appears that accelerometers still have limitations regarding generalisability, validity, comprehensiveness, simplicity, affordability, adaptability, between-study comparability and sustainability. Therefore, given the current evidence, it seems that the widespread adoption of accelerometers specifically for large-scale PA surveillance systems may be premature. Published by the BMJ Publishing Group Limited. 
Palmer, Kara K.
2017-01-01
Assessing children’s perceptions of their movement abilities (i.e., perceived competence) is traditionally done using picture scales—Pictorial Scale of Perceived Competence and Acceptance for Young Children or Pictorial Scale of Perceived Movement Skill Competence. Pictures fail to capture the temporal components of movement. To address this limitation, we created a digital-based instrument to assess perceived motor competence: the Digital Scale of Perceived Motor Competence. The purpose of this study was to determine the validity, reliability, and internal consistency of the Digital-based Scale of Perceived Motor Skill Competence. The Digital-based Scale of Perceived Motor Skill Competence is based on the twelve fundamental motor skills from the Test of Gross Motor Development-2nd Edition with a similar layout and item structure as the Pictorial Scale of Perceived Movement Skill Competence. Face Validity of the instrument was examined in Phase I (n = 56; Mage = 8.6 ± 0.7 years, 26 girls). Test-retest reliability and internal consistency were assessed in Phase II (n = 54, Mage = 8.7 years ± 0.5 years, 26 girls). Intra-class correlations (ICC) and Cronbach’s alpha were conducted to determine test-retest reliability and internal consistency for all twelve skills along with locomotor and object control subscales. The Digital Scale of Perceived Motor Competence demonstrates excellent test-retest reliability (ICC = 0.83, total; ICC = 0.77, locomotor; ICC = 0.79, object control) and acceptable/good internal consistency (α = 0.62, total; α = 0.57, locomotor; α = 0.49, object control). Findings provide evidence of the reliability of the three level digital-based instrument of perceived motor competence for older children. PMID:29910408
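Cronbach's alpha, the internal consistency coefficient reported above, compares the sum of the item variances with the variance of the total score. A minimal sketch follows; the function name and the score matrices are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total score has zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item data: the first set is perfectly consistent,
# the second is not.
consistent = cronbach_alpha([[1, 2], [2, 3], [3, 4]])
inconsistent = cronbach_alpha([[1, 3], [2, 1], [3, 2]])
```

Alpha depends on the number of items as well as their intercorrelations, so short subscales tend to yield lower values than the full scale.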
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
Weak data do not make a free lunch, only a cheap meal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Zhipu; Rajashankar, Kanagalaghatta; Dauter, Zbigniew, E-mail: dauter@anl.gov
2014-02-01
Refinement and analysis of four structures with various data resolution cutoffs suggests that at present there are no reliable criteria for judging the diffraction data resolution limit and the condition I/σ(I) = 2.0 is reasonable. However, extending the limit by about 0.2 Å beyond the resolution defined by this threshold does not deteriorate the quality of refined structures and in some cases may be beneficial. Four data sets were processed at resolutions significantly exceeding the criteria traditionally used for estimating the diffraction data resolution limit. The analysis of these data and the corresponding model-quality indicators suggests that the criteria of resolution limits widely adopted in the past may be somewhat conservative. Various parameters, such as R_merge and I/σ(I), optical resolution and the correlation coefficients CC_1/2 and CC*, can be used for judging the internal data quality, whereas the reliability factors R and R_free as well as the maximum-likelihood target values and real-space map correlation coefficients can be used to estimate the agreement between the data and the refined model. However, none of these criteria provide a reliable estimate of the data resolution cutoff limit. The analysis suggests that extension of the maximum resolution by about 0.2 Å beyond the currently adopted limit where the I/σ(I) value drops to 2.0 does not degrade the quality of the refined structural models, but may sometimes be advantageous. Such an extension may be particularly beneficial for significantly anisotropic diffraction. Extension of the maximum resolution at the stage of data collection and structure refinement is cheap in terms of the required effort and is definitely more advisable than accepting a too conservative resolution cutoff, which is unfortunately quite frequent among the crystal structures deposited in the Protein Data Bank.
Methods and Costs to Achieve Ultra Reliable Life Support
NASA Technical Reports Server (NTRS)
Jones, Harry W.
2012-01-01
A published Mars mission is used to explore the methods and costs to achieve ultra reliable life support. The Mars mission and its recycling life support design are described. The life support systems were made triply redundant, implying that each individual system will have fairly good reliability. Ultra reliable life support is needed for Mars and other long, distant missions. Current systems apparently have insufficient reliability. The life cycle cost of the Mars life support system is estimated. Reliability can be increased by improving the intrinsic system reliability, adding spare parts, or by providing technically diverse redundant systems. The costs of these approaches are estimated. Adding spares is least costly but may be defeated by common cause failures. Using two technically diverse systems is effective but doubles the life cycle cost. Achieving ultra reliability is worth its high cost because the penalty for failure is very high.
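The trade-off between redundancy and common cause failures described above can be sketched with a simple model. It assumes identical units that fail independently, plus a single common cause event that defeats every copy at once; the function name, parameters, and numbers are hypothetical and are not taken from the Mars mission analysis.

```python
def redundant_reliability(unit_reliability, copies, common_cause_prob=0.0):
    """Reliability of `copies` parallel units where any one suffices.

    Assumes independent unit failures, except for a common cause event
    (probability `common_cause_prob`) that fails all copies together.
    """
    if not 0.0 <= unit_reliability <= 1.0:
        raise ValueError("unit_reliability must be in [0, 1]")
    if not 0.0 <= common_cause_prob <= 1.0:
        raise ValueError("common_cause_prob must be in [0, 1]")
    if copies < 1:
        raise ValueError("need at least one copy")
    # Probability that at least one independent copy survives.
    independent = 1.0 - (1.0 - unit_reliability) ** copies
    # The common cause event caps the benefit of adding identical copies.
    return (1.0 - common_cause_prob) * independent


# Hypothetical unit reliability of 0.9, triply redundant.
no_common_cause = redundant_reliability(0.9, 3)
with_common_cause = redundant_reliability(0.9, 3, common_cause_prob=0.01)
```

In this simplified model, identical spares cannot push system reliability above one minus the common cause probability, which is one way to read the case for technically diverse redundant systems.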
Reliability of digital reactor protection system based on extenics.
Zhao, Jing; He, Ya-Nan; Gu, Peng-Fei; Chen, Wei-Hua; Gao, Feng
2016-01-01
Since the Fukushima nuclear accident, the safety of nuclear power plants (NPPs) has been a matter of widespread concern. The reliability of the reactor protection system (RPS) is directly related to the safety of NPPs; however, it is difficult to evaluate the reliability of a digital RPS accurately. Methods based on probability estimation carry uncertainties, cannot reflect the reliability status of the RPS dynamically, and cannot support maintenance and troubleshooting. In this paper, a quantitative reliability analysis method based on extenics is proposed for the safety-critical digital RPS, by which the relationship between the reliability and the response time of the RPS is constructed. As an example, the reliability of the RPS for a CPR1000 NPP is modeled and analyzed by the proposed method. The results show that the proposed method can estimate RPS reliability effectively and support maintenance and troubleshooting of the digital RPS.
Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W.; Imel, Zac E.; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C.
2014-01-01
The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. PMID:25242192
Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W; Imel, Zac E; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C
2015-02-01
The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. Copyright © 2015 Elsevier Inc. All rights reserved.
Low-flow characteristics of streams in the lower Wisconsin River basin
Gebert, W.A.
1978-01-01
Low-flow characteristics estimated for the lower Wisconsin River basin have a high degree of reliability when compared with other basins in Wisconsin. Reliable estimates appear to be related to the relatively uniform geologic features in the basin.
Motivation towards dual career of European student-athletes.
Lupo, Corrado; Guidotti, Flavia; Goncalves, Carlos E; Moreira, Liliana; Doupona Topic, Mojca; Bellardini, Helena; Tonkonogi, Michail; Colin, Allen; Capranica, Laura
2015-01-01
The present study aimed to investigate motivations for the dual career of European student-athletes living in countries providing different educational services for elite athletes: State-centric regulation-State as sponsor/facilitator (State), National Sporting Federations/Institutes as intermediary (Federation) and Laisser Faire, no formal structures (No Structure). Therefore, the European Student-athletes' Motivation towards Sports and Academics Questionnaire (SAMSAQ-EU) was administered to 524 European student-athletes. Exploratory Factor Analysis, and Confirmatory Factor Analysis were applied to test the factor structure, and the reliability and validity of the SAMSAQ-EU, respectively. A multivariate approach was applied to verify subgroup effects (P ≤ 0.05) according to gender (i.e., female and male), age (i.e., ≤ 24 years, > 24 years), type of sport (i.e., individual sport and team sport) and competition level (i.e., national and international). Insufficient confirmatory indexes were reported for the whole European student-athlete group, whereas distinct three factor models [i.e., Student Athletic Motivation (SAM); Academic Motivation (AM); Career Athletic Motivation (CAM)] emerged, with acceptable reliability estimates, for State (SAM = 0.82; AM = 0.75; and CAM = 0.75), Federation (SAM = 0.82; AM = 0.66; and CAM = 0.87) and No Structure (SAM = 0.78; AM = 0.74; and CAM = 0.79) subgroups. Differences between subgroups were found only for competition level (P < 0.001) in relation to SAM (P = 0.001) and CAM (P < 0.001). For SAM, the highest and lowest values emerged for Federation (national, 5.1 ± 0.5; international, 5.4 ± 0.5) and State (national, 4.5 ± 0.9; international, 4.8 ± 0.7). The opposite picture emerged for CAM (Federation: national, 3.3 ± 0.7; international, 3.5 ± 0.9; State: national, 5.0 ± 0.8; international, 5.0 ± 0.9). 
Therefore, although the SAMSAQ-EU proved to be a useful tool, the results showed that European student-athletes' motivation for dual career has to be investigated specifically according to social context.
Autonomous navigation system based on GPS and magnetometer data
NASA Technical Reports Server (NTRS)
Julie, Thienel K. (Inventor); Richard, Harman R. (Inventor); Bar-Itzhack, Itzhack Y. (Inventor)
2004-01-01
This invention is drawn to an autonomous navigation system using Global Positioning System (GPS) and magnetometers for low Earth orbit satellites. Because a magnetometer is reliable and always provides information on spacecraft attitude, rate, and orbit, the magnetometer-GPS configuration solves the GPS initialization problem, decreasing the convergence time for the navigation estimate and improving the overall accuracy. Ultimately, the magnetometer-GPS configuration enables the system to avoid costly and inherently less reliable gyros for rate estimation. Being autonomous, this invention would provide for black-box spacecraft navigation, producing attitude, orbit, and rate estimates with high accuracy and reliability and without any ground input.
Reitz, Meredith; Sanford, Ward E.; Senay, Gabriel; Cazenas, J.
2017-01-01
This study presents new data-driven, annual estimates of the division of precipitation into the recharge, quick-flow runoff, and evapotranspiration (ET) water budget components for 2000-2013 for the contiguous United States (CONUS). The algorithms used to produce these maps ensure water budget consistency over this broad spatial scale, with contributions from precipitation influx attributed to each component at 800 m resolution. The quick-flow runoff estimates for the contribution to the rapidly varying portion of the hydrograph are produced using data from 1,434 gaged watersheds, and depend on precipitation, soil saturated hydraulic conductivity, and surficial geology type. Evapotranspiration estimates are produced from a regression using water balance data from 679 gaged watersheds and depend on land cover, temperature, and precipitation. The quick-flow and ET estimates are combined to calculate recharge as the remainder of precipitation. The ET and recharge estimates are checked against independent field data, and the results show good agreement. Comparisons of recharge estimates with groundwater extraction data show that in 15% of the country, groundwater is being extracted at rates higher than the local recharge. These maps of the internally consistent water budget components of recharge, quick-flow runoff, and ET, being derived from and tested against data, are expected to provide reliable first-order estimates of these quantities across the CONUS, even where field measurements are sparse.
Kashyap, Gyan Chandra; Singh, Shri Kant
2017-03-21
The purpose of this study was to test the reliability, validity and factor structure of the GHQ-12 questionnaire among male tannery workers in India. We tested three different factor models of the GHQ-12. This paper used primary data obtained from a cross-sectional household study of tannery workers from the Jajmau area of the city of Kanpur in northern India, which was conducted during January-June 2015 as part of a doctoral program. The study covered 286 tannery workers from the study area. An interview schedule containing the GHQ-12 was used for tannery workers who had completed at least 1 year in their present occupation preceding the survey. Cronbach's alpha was used to test reliability, and a convergent test was used for validity. Confirmatory factor analysis was used to compare three factor structures for the GHQ-12. A total of 286 samples were analyzed in this study. The mean age of the tannery workers in this study was 38 years (SD = 1.42). We found the alpha coefficient to be 0.93 for the complete sample. The value of alpha represents acceptable internal consistency for all the groups. Each item of the scale showed almost the same internal consistency of 0.93 for the male tannery workers. The correlation between factor 1 (Anxiety and Depression) and factor 2 (Social Dysfunction) was 0.92. The correlation between factor 1 (Anxiety and Depression) and factor 3 (Loss of Confidence) was the highest, at 0.98. The comparative fit index (CFI) was best for model III, with a value of 0.97. The SRMR was lowest for model III, at 0.031. The findings suggest that the Hindi version of the GHQ-12 is a reliable and valid tool for measuring psychological distress in male tannery workers of Kanpur city, India. The study found that the model proposed by Graetz was the best-fitting model for the data.
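The abstract above reports Cronbach's alpha without stating how it is computed. As a minimal sketch, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), the calculation could look like the following. The function name `cronbach_alpha` and the toy score matrices are hypothetical illustrations and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, each row holding
    that respondent's scores on the k items (hypothetical layout).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")

    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of the summed (total) score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (
        1 - sum(item_variances) / total_variance
    )


# Toy data (not from the study): four respondents, two items.
toy_scores = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(toy_scores), 3))
```

When every item gives each respondent the same score, the items are perfectly consistent and the sketch returns 1; values fall as the items disagree.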
Behar-Horenstein, Linda S; Garvan, Cyndi W; Moore, Thomas E; Catalanotto, Frank A
2013-08-01
Valid and reliable instruments to measure and assess cultural competence for oral health care providers are scarce in the literature, and most published scales have been contested due to a lack of item analysis and internal estimates of reliability. The purposes of this study were, first, to develop a standardized instrument to measure dental students' knowledge of diversity, skills in culturally competent patient-centered communication, and use of culture-centered practices in patient care and, second, to provide preliminary validity support for this instrument. The initial instrument used in this study was a thirty-six-item Likert-scale survey entitled the Knowledge, Efficacy, and Practices Instrument for Oral Health Providers (KEPI-OHP). This instrument is an adaptation of an initially thirty-three-item version of the Multicultural Awareness, Knowledge, and Skills Scale-Counselor Edition (MAKSS-CE), a scale that assesses factors related to social justice, cultural differences among clients, and cross-cultural client management. After the authors conducted cognitive and expert interviews, focus groups, pilot testing, and item analysis, their initial instrument was reduced to twenty-eight items. The KEPI-OHP was then distributed to 916 dental students (response rate=48.6 percent) across the United States to measure its reliability and assess its validity. Both exploratory and confirmatory factor analyses were conducted to test the scale's validity. The modification of the survey into a sensible instrument with a relatively clear factor structure using factor analysis resulted in twenty items. A scree test suggested three expressive factors, which were retained for rotation. Bentler's comparative fit and Bentler and Bonett's non-normed indices were 0.95 and 0.92, respectively. A three-factor solution comprising twenty items, with efficacy of assessment, knowledge of diversity, and culture-centered practice subscales, was identified.
The KEPI-OHP was found to have reasonable internal consistency reliability to warrant its use for baseline and repeated measures in assessing changes in dental students' growth in cultural competence across four-year dental curricula.
Valentim, Daniela Pereira; Sato, Tatiana de Oliveira; Comper, Maria Luiza Caíres; Silva, Anderson Martins da; Boas, Cristiana Villas; Padula, Rosimeire Simprini
There are very few observational methods for the analysis of biomechanical exposure available in Brazilian Portuguese. This study aimed to cross-culturally adapt and test the measurement properties of the Rapid Upper Limb Assessment (RULA) and Strain Index (SI). The cross-cultural adaptation and the testing of measurement properties followed the Beaton et al. and COSMIN guidelines, respectively. Several tasks that required static posture and/or repetitive motion of the upper limbs were evaluated (n>100). Intra-rater reliability ranged from poor to almost perfect for the RULA (k: 0.00-0.93) and from poor to excellent for the SI (ICC(2,1): 0.05-0.99). Inter-rater reliability was very poor for the RULA (k: -0.12 to 0.13) and ranged from very poor to moderate for the SI (ICC(2,1): 0.00-0.53). Agreement was good for the RULA (75-100% intra-rater and 42.24-100% inter-rater) and for the SI (EPM: -1.03% to 1.97% intra-rater and -0.17% to 1.51% inter-rater). Internal consistency was appropriate for the RULA (α=0.88) and low for the SI (α=0.65). Moderate construct validity was observed between the RULA and SI for wrist/hand-wrist posture (rho: 0.61) and strength/intensity of exertion (rho: 0.39). The adapted versions of the RULA and SI presented semantic and cultural equivalence for Brazilian Portuguese. Reliability estimates for the RULA and SI ranged from very poor to almost perfect. Internal consistency was better for the RULA than for the SI. The correlation between methods was moderate only for muscle request/movement repetition. Previous training is mandatory for the use of observational methods for biomechanical exposure assessment, although it does not guarantee good reproducibility of these measures. Copyright © 2017 Associação Brasileira de Pesquisa e Pós-Graduação em Fisioterapia. Publicado por Elsevier Editora Ltda. All rights reserved.
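The abstract above reports ICC(2,1) values. As a rough sketch, assuming the Shrout-Fleiss two-way random-effects, single-rater definition, ICC(2,1) = (MSR - MSE) / (MSR + (k-1)MSE + k(MSC - MSE)/n), the coefficient could be computed as follows. The function name `icc_2_1` and the toy ratings are hypothetical and are not the study's data.

```python
def icc_2_1(ratings):
    """ICC(2,1) for an n-subjects by k-raters matrix of ratings.

    Uses the two-way ANOVA mean squares: MSR (between subjects),
    MSC (between raters) and MSE (residual).
    """
    n = len(ratings)        # number of subjects (rows)
    k = len(ratings[0])     # number of raters (columns)
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy data (not from the study): three subjects, two raters who agree.
print(icc_2_1([[1, 1], [2, 2], [3, 3]]))
```

Because ICC(2,1) penalizes systematic differences between raters through the MSC term, raters who rank subjects identically but use different score levels still obtain a value well below 1.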
Voss, Christine; Dean, Paige H; Gardner, Ross F; Duncombe, Stephanie L; Harris, Kevin C
2017-01-01
To assess the criterion validity, internal consistency, reliability and cut-point for the Physical Activity Questionnaire for Children (PAQ-C) and Adolescents (PAQ-A) in children and adolescents with congenital heart disease, a special population at high cardiovascular risk in whom physical activity has not been extensively evaluated. We included 84 participants (13.6±2.9 yrs, 50% female) with simple (37%), moderate (31%), or severe congenital heart disease (27%), as well as cardiac transplant recipients (6%), from BC Children's Hospital, Canada. They completed the PAQ-C (≤11 yrs, n = 28) or PAQ-A (≥12 yrs, n = 56), and also wore a triaxial accelerometer (GT3X+ or GT9X) over the right hip for 7 days (n = 59 met valid wear time criteria). Median daily moderate-to-vigorous physical activity was 46.9 minutes per day (IQR 31.6-61.8) and 25% met physical activity guidelines defined as ≥60 minutes of moderate-to-vigorous physical activity per day. Median PAQ-Score was 2.6 (IQR 1.9-3.0). PAQ-Scores were significantly related to accelerometry-derived metrics of physical activity (rho = 0.44-0.55, all p<0.01) and sedentary behaviour (rho = -0.53, p<0.001). Internal consistency was high (α = 0.837), as was reliability (stability) of PAQ-Scores over a 4-month period (ICC = 0.73, 95%CI 0.55-0.84; p<0.001). We identified that a PAQ-Score cut-point of 2.87 discriminates between those meeting physical activity guidelines and those who do not in the combined PAQ-C and PAQ-A samples (area under the curve = 0.80, 95%CI 0.67-0.92). Validity and reliability of the PAQ in children and adolescents with CHD were comparable to or stronger than in previous studies of healthy children. Therefore, the PAQ may be used to estimate general levels of physical activity in children and adolescents with CHD.
Moussas, George; Dadouti, Georgia; Douzenis, Athanassios; Poulis, Evangelos; Tzelembis, Athanassios; Bratis, Dimitris; Christodoulou, Christos; Lykouras, Lefteris
2009-05-14
Problems associated with alcohol abuse are recognised by the World Health Organization as a major health issue, which according to the most recent estimates is responsible for 1.4% of the total world burden of morbidity and has been shown to increase mortality risk by 50%. Because of the size and severity of the problem, early detection is very important. This requires easy-to-use and specific tools. One of these is the Alcohol Use Disorders Identification Test (AUDIT). This study aims to standardise the questionnaire in a Greek population. AUDIT was translated and back-translated from its original language by two English-speaking psychiatrists. The tool contains 10 questions. A score ≥ 11 is an indication of serious abuse/dependence. In the study, 218 subjects took part: 128 were males and 90 females. The average age was 40.71 years (± 11.34). Of the 218 individuals, 109 (75 male, 34 female) fulfilled the criteria for alcohol dependence according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV), and presented requesting admission; 109 subjects (53 male, 56 female) were healthy controls. Internal reliability (Cronbach alpha) was 0.80 for the controls and 0.80 for the alcohol-dependent individuals. Controls had significantly lower average scores (t test P < 0.001) than the alcohol-dependent individuals. The questionnaire's sensitivity for scores >8 was 0.98 and its specificity was 0.94 for the same score. In the alcohol-dependent sample, 3% scored as false negatives, and in the control group, 1.8% scored as false positives. In the alcohol-dependent sample there was no difference between males and females in their average scores (t test P > 0.05). The Greek version of AUDIT has high internal reliability and validity. It detects 97% of the alcohol-dependent individuals and has a high sensitivity and specificity. AUDIT is easy to use, quick and reliable and can be very useful in detecting alcohol problems in sensitive populations.
The Challenges of Credible Thermal Protection System Reliability Quantification
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2013-01-01
The paper discusses several of the challenges associated with developing a credible reliability estimate for a human-rated crew capsule thermal protection system. The process of developing such a credible estimate is subject to the quantification, modeling and propagation of numerous uncertainties within a probabilistic analysis. The development of specific investment recommendations, to improve the reliability prediction, among various potential testing and programmatic options is then accomplished through Bayesian analysis.
Anderson, Donald D; Segal, Neil A; Kern, Andrew M; Nevitt, Michael C; Torner, James C; Lynch, John A
2012-01-01
Recent findings suggest that contact stress is a potent predictor of subsequent symptomatic osteoarthritis development in the knee. However, much larger numbers of knees (likely on the order of hundreds, if not thousands) need to be reliably analyzed to achieve the statistical power necessary to clarify this relationship. This study assessed the reliability of new semiautomated computational methods for estimating contact stress in knees from large population-based cohorts. Ten knees of subjects from the Multicenter Osteoarthritis Study were included. Bone surfaces were manually segmented from sequential 1.0 Tesla magnetic resonance imaging slices by three individuals on two nonconsecutive days. Four individuals then registered the resulting bone surfaces to corresponding bone edges on weight-bearing radiographs, using a semi-automated algorithm. Discrete element analysis methods were used to estimate contact stress distributions for each knee. Segmentation and registration reliabilities (day-to-day and interrater) for peak and mean medial and lateral tibiofemoral contact stress were assessed with Shrout-Fleiss intraclass correlation coefficients (ICCs). The segmentation and registration steps of the modeling approach were found to have excellent day-to-day (ICC 0.93-0.99) and good inter-rater reliability (0.84-0.97). This approach for estimating compartment-specific tibiofemoral contact stress appears to be sufficiently reliable for use in large population-based cohorts.
NASA Astrophysics Data System (ADS)
Hadwin, Paul J.; Sipkens, T. A.; Thomson, K. A.; Liu, F.; Daun, K. J.
2016-01-01
Auto-correlated laser-induced incandescence (AC-LII) infers the soot volume fraction (SVF) of soot particles by comparing the spectral incandescence from laser-energized particles to the pyrometrically inferred peak soot temperature. This calculation requires detailed knowledge of model parameters such as the absorption function of soot, which may vary with combustion chemistry, soot age, and the internal structure of the soot. This work presents a Bayesian methodology to quantify such uncertainties. This technique treats the additional "nuisance" model parameters, including the soot absorption function, as stochastic variables and incorporates the current state of knowledge of these parameters into the inference process through maximum entropy priors. While standard AC-LII analysis provides a point estimate of the SVF, Bayesian techniques infer the posterior probability density, which will allow scientists and engineers to better assess the reliability of AC-LII inferred SVFs in the context of environmental regulations and competing diagnostics.
Automated detection of irradiated food with the comet assay.
Verbeek, F; Koppen, G; Schaeken, B; Verschaeve, L
2008-01-01
Food irradiation is the process of exposing food to ionising radiation in order to disinfect, sanitise, sterilise and preserve food or to provide insect disinfestation. Irradiated food should be adequately labelled according to international and national guidelines. In many countries, there are furthermore restrictions to the product-specific maximal dose that can be administered. Therefore, there is a need for methods that allow detection of irradiated food, as well as for methods that provide a reliable dose estimate. In recent years, the comet assay was proposed as a simple, rapid and inexpensive method to fulfil these goals, but further research is required to explore the full potential of this method. In this paper we describe the use of an automated image analysing system to measure DNA comets which allow the discrimination between irradiated and non-irradiated food as well as the set-up of standard dose-response curves, and hence a sufficiently accurate dose estimation.
Hellweg, Stephanie; Schuster-Amft, Corina
2016-07-19
Agitation is frequently observed during early recovery after traumatic brain injury (TBI). Agitated behaviour often interferes with goal-oriented rehabilitation and can be a substantial hindrance to therapy. Despite the relatively high occurrence of agitation in the TBI population, no objective assessment is available in German (G). An existing scale with excellent psychometric properties is the "Agitated Behavior Scale (ABS)" developed by Corrigan in 1989. The aim of the study was to translate the ABS into German (ABS-G) and investigate its inter- and intrarater reliability and internal consistency in patients with moderate to severe TBI. A formal nine-step translation and cross-cultural adaptation procedure (TCCA) was applied. Subsequently, a prospective observational patient study was conducted. To examine the interrater reliability and internal consistency, two therapists rated 20 patients independently after a therapy session. This procedure was repeated twice on a weekly basis. The intrarater reliability was assessed through video recordings of three patients. Nine raters independently scored the behaviour shown on the videotape with the ABS-G twice within one month. The inter- and intrarater reliability were evaluated with the Spearman rank correlation coefficient and the quadratic weighted kappa. The internal consistency was tested with Cronbach's alpha. The behaviour of 20 patients (18 males; mean age 41 ± 20.7; mean Functional Independence Measure (FIM) cognitive score on admission 7.1 ± 4.04; mean ABS-G score at first observation 17.3 ± 2.83) was assessed three times. Interrater reliability for the ABS-G total score across all 60 paired observations yielded a correlation coefficient of r_s = 0.845 and a weighted kappa of 0.738. Intrarater reliability for the ABS-G total score ranged between r_s = 0.719 and 0.953, with a weighted kappa between 0.871 and 0.953. Cronbach's alpha indicated moderate internal consistency at 0.661.
This study demonstrates that the ABS-G is a reliable instrument for evaluating agitation in patients with moderate to severe TBI. It thereby makes it possible to monitor agitation objectively and to optimise the management of agitated patients according to international recommendations.
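The quadratic weighted kappa used in the abstract above can be sketched as follows, assuming the usual definition with disagreement weights w_ij = (i - j)^2 / (K - 1)^2 and kappa = 1 - sum(w * observed) / sum(w * expected). The function name, the integer coding of categories as 0..K-1, and the toy ratings are all assumptions for illustration and do not reproduce the study's data or software.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Quadratic weighted kappa for two raters.

    rater_a, rater_b: equal-length lists of integer category codes
    in the range 0..n_categories-1 (hypothetical coding).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    k = n_categories

    # Observed contingency table of paired ratings.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - weighted_observed / weighted_expected


# Toy data (not from the study): two raters, four ordinal categories.
print(quadratic_weighted_kappa([0, 1, 2, 3, 3], [0, 1, 2, 3, 2], 4))
```

The quadratic weights penalize a two-category disagreement four times as heavily as a one-category disagreement, which is why this variant suits ordinal scores such as scale totals.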
NASA Astrophysics Data System (ADS)
Otosu, Takuhiro; Yamaguchi, Shoichi
2017-07-01
We present standing evanescent-wave fluorescence correlation spectroscopy (SEW-FCS). This technique utilizes the interference of two evanescent waves which generates a standing evanescent-wave. Fringe-pattern illumination created by a standing evanescent-wave enables us to measure the diffusion coefficients of molecules with a super-resolution corresponding to one fringe width. Because the fringe width can be reliably estimated by a simple procedure, utilization of fringes is beneficial to quantitatively analyze the slow diffusion of molecules in a supported lipid bilayer (SLB), a model biomembrane formed on a solid substrate, with the timescale relevant for reliable FCS analysis. Furthermore, comparison of the data between SEW-FCS and conventional total-internal reflection FCS, which can also be performed by the SEW-FCS instrument, effectively eliminates the artifact due to afterpulsing of the photodiode detector. The versatility of SEW-FCS is demonstrated by its application to various SLBs.
Keum, Brian TaeHyuk; Miller, Matthew J
2017-04-01
The purpose of this study was to develop the Perceived Online Racism Scale (PORS) to assess perceived online racist interpersonal interactions and exposure to online racist content among people of color. Items were developed through a multistage process involving a comprehensive literature review, focus groups, qualitative data collection, and a survey of online racism experiences. Based on a sample of 1,023 racial minority participants, exploratory and confirmatory factor analyses provided support for a 30-item bifactor model accounted for by the general factor and the following 3 specific factors: (a) personal experience of racial cyber-aggression, (b) vicarious exposure to racial cyber-aggression, and (c) online-mediated exposure to racist reality. The PORS demonstrated measurement invariance across racial/ethnic groups in our sample. Internal reliability estimates for the total and subscale scores of the PORS were above .88 and the 4-week test-retest reliability was adequate. Limitations and future directions for research are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Bogdan, Anna; Sudoł-Szopińska, Iwona; Luczak, Anna; Konarska, Maria; Pietrowski, Piotr
2012-01-01
This article proposes a method for a comprehensive assessment of the effect of integral motorcycle helmets on physiological and cognitive responses of motorcyclists. To verify the reliability of commonly used tests, we conducted experiments with 5 motorcyclists. We recorded changes in physiological parameters (heart rate, local skin temperature, core temperature, air temperature, relative humidity in the space between the helmet and the surface of the head, and the concentration of O2 and CO2 under the helmet) and in psychological parameters (motorcyclists' reflexes, fatigue, perceptiveness and mood). We also studied changes in the motorcyclists' subjective sensation of thermal comfort. The results made it possible to identify reliable parameters for assessing the effect of integral helmets on performance, i.e., physiological factors (head skin temperature, internal temperature and concentration of O2 and CO2 under the helmet) and psychomotor factors (reaction time, attention and vigilance, work performance, concentration and a subjective feeling of mood and fatigue).
Translation, Cultural Adaptation and Validation of the Simple Shoulder Test to Spanish
Arcuri, Francisco; Barclay, Fernando; Nacul, Ivan
2015-01-01
Background: The validation of widely used scales facilitates comparison across international patient samples. Objective: The objective was to translate, culturally adapt and validate the Simple Shoulder Test into Argentinian Spanish. Methods: The Simple Shoulder Test was translated from English into Argentinian Spanish by two independent translators, translated back into English and evaluated for accuracy by an expert committee to correct possible discrepancies. It was then administered to 50 patients with different shoulder conditions. Psychometric properties were analyzed, including internal consistency, measured with Cronbach's alpha, and test-retest reliability at 15 days, measured with the intraclass correlation coefficient. Results: The internal consistency was an alpha of 0.808, evaluated as good. The test-retest reliability as measured by the intraclass correlation coefficient (ICC) was 0.835, evaluated as excellent. Conclusion: The Simple Shoulder Test translation and its cultural adaptation to Argentinian Spanish demonstrated adequate internal reliability and validity, ultimately allowing for its use in comparisons with international patient samples.
Papadakaki, Maria; Prokopiadou, Dimitra; Petridou, Eleni; Kogevinas, Manolis; Lionis, Christos
2012-06-01
The current article aims to translate the PREMIS (Physician Readiness to Manage Intimate Partner Violence) survey into the Greek language and test its validity and reliability in a sample of primary care physicians. The validation study was conducted in 2010 and involved all the general practitioners serving two adjacent prefectures of Greece (n = 80). Maximum-likelihood factor analysis (MLF) was used to extract key survey factors. The instrument was further assessed for the following psychometric properties: (a) scale reliability, (b) item-specific reliability, (c) test-retest reliability, (d) scale construct validity, and (e) internal predictive validity. The MLF analysis of 23 opinion items revealed a seven-factor solution (preparation, constraint, workplace issues, screening, self-efficacy, alcohol/drugs, victim understanding), which was statistically sound (p = .293). Most of the newly derived scales displayed satisfactory internal consistency (α ≥ .60), high item-specific reliability, strong construct, and internal predictive validity (F = 2.82; p = .004), and high repeatability when retested with 20 individuals (intraclass correlation coefficient [ICC] > .70). The tool was found appropriate to facilitate the identification of competence deficits and the evaluation of training initiatives.
NUVEM - New methods to Use gnss water Vapor Estimates for Meteorology of Portugal
NASA Astrophysics Data System (ADS)
Fernandes, R. M. S.; Viterbo, P.; Bos, M. S.; Martins, J. P.; Sá, A. G.; Valentim, H.; Jones, J.
2014-12-01
NUVEM (New methods to Use gnss water Vapor Estimates for Meteorology of Portugal) is a collaborative project funded by the Portuguese National Science Foundation (FCT) that aims to implement a multi-disciplinary approach in order to operationalize the inclusion of GNSS-PWV estimates for nowcasting in Portugal, namely for the preparation of warnings of severe weather. To achieve this goal, the NUVEM project is divided into two major components: a) development and implementation of methods to compute accurate estimates of PWV (Precipitable Water Vapor) in NRT (Near Real-Time); b) integration of such estimates into the nowcasting procedures in use at IPMA (Portuguese Meteorological Service). Methodologies will be optimized at SEGAL for passive and active access to the data; the PWV estimates will be computed using PPP (Precise Point Positioning), which permits the estimation of each individual station separately; solutions will be validated using internal and external values; and computed solutions will be transferred promptly to the IPMA Operational Center. Validation of the derived estimates using robust statistics is an important component of the project. The need to send computed values to IPMA as soon as possible requires fast but reliable internal (e.g., noise estimation) and external (e.g., feedback from IPMA using other sensors such as radiosondes) assessment of the quality of the PWV estimates. At IPMA, the goal is to implement the operational use of GNSS-PWV to assist weather nowcasting in Portugal. This will be done with the assistance of the Meteo group of IDL. Maps of GNSS-PWV will be automatically created and compared with solutions provided by other operational systems in order to help IPMA detect suspicious patterns in near real time. This will be the first step towards the assimilation of GNSS-PWV estimates into IPMA nowcasting models.
The NUVEM (EXPL/GEO-MET/0413/2013) project will also contribute to the active participation of Portugal in the COST Action ES1206 - Advanced Global Navigation Satellite Systems tropospheric products for monitoring severe weather events and climate (GNSS4SWEC). This work is also carried out in the framework of the Portuguese Project SMOG (PTDC/CTE-ATM/119922/2010).
A validation study of public health knowledge, skills, social responsibility and applied learning.
Vackova, Dana; Chen, Coco K; Lui, Juliana N M; Johnston, Janice M
2018-06-22
To design and validate a questionnaire to measure medical students' Public Health (PH) knowledge, skills, social responsibility and applied learning as indicated in the four domains recommended by the Association of Schools & Programmes of Public Health (ASPPH). A cross-sectional study was conducted to develop an evaluation tool for PH undergraduate education through item generation, reduction, refinement and validation. The 74 preliminary items derived from the existing literature were reduced to 55 items based on expert panel review which included those with expertise in PH, psychometrics and medical education, as well as medical students. Psychometric properties of the preliminary questionnaire were assessed as follows: frequency of endorsement for item variance; principal component analysis (PCA) with varimax rotation for item reduction and factor estimation; Cronbach's Alpha, item-total correlation and test-retest validity for internal consistency and reliability. PCA yielded five factors: PH Learning Experience (6 items); PH Risk Assessment and Communication (5 items); Future Use of Evidence in Practice (6 items); Recognition of PH as a Scientific Discipline (4 items); and PH Skills Development (3 items), explaining 72.05% variance. Internal consistency and reliability tests were satisfactory (Cronbach's Alpha ranged from 0.87 to 0.90; item-total correlation > 0.59). Lower paired test-retest correlations reflected instability in a social science environment. An evaluation tool for community-centred PH education has been developed and validated. The tool measures PH knowledge, skills, social responsibilities and applied learning as recommended by the internationally recognised Association of Schools & Programmes of Public Health (ASPPH).
Adverse drug reactions in Germany: direct costs of internal medicine hospitalizations.
Rottenkolber, Dominik; Schmiedl, Sven; Rottenkolber, Marietta; Farker, Katrin; Saljé, Karen; Mueller, Silke; Hippius, Marion; Thuermann, Petra A; Hasford, Joerg
2011-06-01
German hospital reimbursement modalities changed as a result of the introduction of Diagnosis Related Groups (DRG) in 2004. Therefore, no data on the direct costs of adverse drug reactions (ADRs) resulting in admissions to departments of internal medicine are available. The objective was to quantify the ADR-related economic burden (direct costs) of hospitalizations in internal medicine wards in Germany. Record-based study analyzing the patient records of about 57,000 hospitalizations between 2006 and 2007 of the Net of Regional Pharmacovigilance Centers (Germany). All ADRs were evaluated by a team of experts in pharmacovigilance for severity, causality, and preventability. The calculation of accurate person-related costs for ADRs relied on the German DRG system (G-DRG 2009). Descriptive and bootstrap statistical methods were applied for data analysis. The incidence of hospitalization due to at least 'possible' serious outpatient ADRs was estimated to be approximately 3.25%. Mean age of the 1834 patients was 71.0 years (SD 14.7). Most frequent ADRs were gastrointestinal hemorrhage (n = 336) and drug-induced hypoglycemia (n = 270). Average inpatient length-of-stay was 9.3 days (SD 7.1). Average treatment costs of a single ADR were estimated to be approximately €2250. The total costs sum to €434 million per year for Germany. Considering the proportion of preventable cases (20.1%), this equals a saving potential of €87 million per year. Preventing ADRs is advisable in order to realize significant nationwide savings potential. Our cost estimates provide a reliable benchmark as they were calculated based on an intensified ADR surveillance and an accurate person-related cost application. Copyright © 2011 John Wiley & Sons, Ltd.
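The saving potential quoted in this abstract follows from two of its reported figures, as the short check below shows. The variable names are illustrative; only the two input numbers come from the abstract.

```python
# Figures reported in the abstract.
total_cost_million_eur = 434.0   # total ADR-related costs per year, in million euros
preventable_share = 0.201        # proportion of cases judged preventable


def saving_potential(total, share):
    """Cost attributable to the preventable share of cases."""
    return total * share


saving = saving_potential(total_cost_million_eur, preventable_share)
print(f"{saving:.1f} million euros per year")  # about 87.2, reported as 87 million
```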
NASA Astrophysics Data System (ADS)
Wayson, Michael B.; Bolch, Wesley E.
2018-04-01
Internal radiation dose estimates for diagnostic nuclear medicine procedures are typically calculated for a reference individual. Resultantly, there is uncertainty when determining the organ doses to patients who are not at 50th percentile on either height or weight. This study aims to better personalize internal radiation dose estimates for individual patients by modifying the dose estimates calculated for reference individuals based on easily obtainable morphometric characteristics of the patient. Phantoms of different sitting heights and waist circumferences were constructed based on computational reference phantoms for the newborn, 10 year-old, and adult. Monoenergetic photons and electrons were then simulated separately at 15 energies. Photon and electron specific absorbed fractions (SAFs) were computed for the newly constructed non-reference phantoms and compared to SAFs previously generated for the age-matched reference phantoms. Differences in SAFs were correlated to changes in sitting height and waist circumference to develop scaling factors that could be applied to reference SAFs as morphometry corrections. A further set of arbitrary non-reference phantoms were then constructed and used in validation studies for the SAF scaling factors. Both photon and electron dose scaling methods were found to increase average accuracy when sitting height was used as the scaling parameter (~11%). Photon waist circumference-based scaling factors showed modest increases in average accuracy (~7%) for underweight individuals, but not for overweight individuals. Electron waist circumference-based scaling factors did not show increases in average accuracy. When sitting height and waist circumference scaling factors were combined, modest average gains in accuracy were observed for photons (~6%), but not for electrons. Both photon and electron absorbed doses are more reliably scaled using scaling factors computed in this study. 
They can be effectively scaled using sitting height alone as the patient-specific morphometric parameter.
Mayo, Ann M
2015-01-01
It is important for CNSs and other APNs to consider the reliability and validity of instruments chosen for clinical practice, evidence-based practice projects, or research studies. Psychometric testing uses specific research methods to evaluate the amount of error associated with any particular instrument. Reliability estimates explain more about how well the instrument is designed, whereas validity estimates explain more about scores that are produced by the instrument. An instrument may be architecturally sound overall (reliable), but the same instrument may not be valid. For example, if a specific group does not understand certain well-constructed items, then the instrument does not produce valid scores when used with that group. Many instrument developers may conduct reliability testing only once, yet continue validity testing in different populations over many years. All CNSs should be advocating for the use of reliable instruments that produce valid results. Clinical nurse specialists may find themselves in situations where reliability and validity estimates for some instruments that are being utilized are unknown. In such cases, CNSs should engage key stakeholders to sponsor nursing researchers to pursue this most important work.
2013-01-01
Summary of background data: Recent smartphones, such as the iPhone, are often equipped with an accelerometer and magnetometer, which, through software applications, can perform various inclinometric functions. Although these applications are intended for recreational use, they have the potential to measure and quantify range of motion. The purpose of this study was to estimate the intra- and inter-rater reliability as well as the criterion validity of the clinometer and compass applications of the iPhone in the assessment of cervical range of motion in healthy participants. Methods: The sample consisted of 28 healthy participants. Two examiners measured cervical range of motion of each participant twice using the iPhone (for the estimation of intra- and inter-rater reliability) and once with the CROM (for the estimation of criterion validity). Estimates of reliability and validity were then established using the intraclass correlation coefficient (ICC). Results: We observed a moderate intra-rater reliability for each movement (ICC = 0.65-0.85) but a poor inter-rater reliability (ICC < 0.60). For the criterion validity, the ICCs are moderate (>0.50) to good (>0.65) for movements of flexion, extension, lateral flexions and right rotation, but poor (<0.50) for left rotation. Conclusion: We found good intra-rater reliability and lower inter-rater reliability. When compared to the gold standard, these applications showed moderate to good validity. However, before using the iPhone as an outcome measure in clinical settings, studies should be done on patients presenting with cervical problems. PMID:23829201
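The abstract does not say which form of the intraclass correlation coefficient was used. As a sketch under that caveat, the following computes ICC(2,1) (two-way random effects, absolute agreement, single measurement), one common choice for inter-rater designs. The function name and the example ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for an n-subjects by k-raters table of measurements."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subjects mean square
    msc = ss_cols / (k - 1)                 # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical angles (degrees): rater 2 reads a constant 5 degrees higher.
with_offset = [[10, 15], [20, 25], [30, 35]]
icc_offset = icc_2_1(with_offset)
```

Because this form measures absolute agreement, a constant offset between raters lowers the coefficient even when the raters rank participants identically, which is one way intra-rater reliability can exceed inter-rater reliability.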
[Maslach Burnout Inventory - Student Survey: Portugal-Brazil cross-cultural adaptation].
Campos, Juliana Alvares Duarte Bonini; Maroco, João
2012-10-01
To perform a cross-cultural adaptation of the Portuguese version of the Maslach Burnout Inventory for students (MBI-SS), and investigate its reliability, validity and cross-cultural invariance. The face validity involved the participation of a multidisciplinary team. Content validity was performed. The Portuguese version was completed in 2009, on the internet, by 958 Brazilian and 556 Portuguese university students from the urban area. Confirmatory factor analysis was carried out using as fit indices: the χ²/df, the Comparative Fit Index (CFI), the Goodness of Fit Index (GFI) and the Root Mean Square Error of Approximation (RMSEA). To verify the stability of the factor solution according to the original English version, cross-validation was performed in 2/3 of the total sample and replicated in the remaining 1/3. Convergent validity was estimated by the average variance extracted and composite reliability. The discriminant validity was assessed, and the internal consistency was estimated by the Cronbach's alpha coefficient. Concurrent validity was estimated by the correlational analysis of the mean scores of the Portuguese version and the Copenhagen Burnout Inventory, and the divergent validity was compared to the Beck Depression Inventory. The invariance of the model between the Brazilian and the Portuguese samples was assessed. The three-factor model of Exhaustion, Disengagement and Efficacy showed good fit (χ²/df = 8.498, CFI = 0.916, GFI = 0.902, RMSEA = 0.086). The factor structure was stable (λ: χ²dif = 11.383, p = 0.50; Cov: χ²dif = 6.479, p = 0.372; Residues: χ²dif = 21.514, p = 0.121). Adequate convergent validity (AVE = 0.45;0.64, CR = 0.82;0.88), discriminant validity (ρ² = 0.06;0.33) and internal consistency (α = 0.83;0.88) were observed. The concurrent validity of the Portuguese version with the Copenhagen Inventory was adequate (r = 0.21, 0.74).
The assessment of the divergent validity was impaired by the approach of the theoretical concept of the dimensions Exhaustion and Disengagement of the Portuguese version with the Beck Depression Inventory. Invariance of the instrument between the Brazilian and Portuguese samples was not observed (λ:χ²dif = 84.768, p<0.001; Cov: χ²dif = 129.206, p < 0.001; Residues: χ²dif = 518.760, p < 0.001). The Portuguese version of the Maslach Burnout Inventory for students showed adequate reliability and validity, but its factor structure was not invariant between the countries, indicating the absence of cross-cultural stability.
Assessment of psychometric properties of a modified PHEEM questionnaire.
Gooneratne, I K; Munasinghe, S R; Siriwardena, C; Olupeliyawa, A M; Karunathilake, I
2008-12-01
An effective tool for analysing the learning environment, customised to the Sri Lankan setting, is vital for the assessment and delivery of quality healthcare training of preregistration house officers. Such a tool should be reliable and valid. We assessed psychometric properties such as internal reliability and construct validity of a modified version of the Postgraduate Hospital Educational Environment Measure (PHEEM). A modified PHEEM questionnaire customised to the Sri Lankan context was developed in accordance with the Sri Lanka Medical Council guidelines. The questionnaire was distributed to all interns at the National Hospital of Sri Lanka, Colombo North Teaching Hospital and Wathupitiwala Base Hospital during a calendar year (n = 100, response rate = 86%). Internal reliability and construct validity of the inventory were assessed using Cronbach's alpha and exploratory factor analysis, respectively. PHEEM consists of 3 subscales: perceptions of autonomy, social support and teaching, which are factors perceived to be influencing the educational environment. This administration demonstrated high internal reliability as reflected by a Cronbach's alpha value of 0.84. Exploratory factor analysis identified 12 factors with eigenvalue >1. However, the first factor had an eigenvalue of 6.7 (accounting for 19.7% of variance), while the rest had eigenvalues < 2.5. These results suggest a single predictive factor and thus a one-dimensional scale as opposed to the three-dimensional scale which is used in the current questionnaire. The psychometric properties of this tool reflect a high degree of internal reliability in assessing the educational environment of intern doctors in Sri Lanka. It is possible that the clinical educational environment is collectively represented as a single dimension. This may be due to the complex interplay between individual items in the questionnaire.
Therefore the psychometric properties do not justify the interpretation of the educational environment through specified subscales.
Reliability and Validity of the Chinese (Mandarin) Tinnitus Handicap Inventory
Meng, Zhaoli; Zheng, Yun; Wang, Kai; Kong, Xiudan; Tao, Yong; Xu, Ke; Liu, Guanjian
2012-01-01
Objectives: The Tinnitus Handicap Inventory (THI) is a commonly used self-reporting tinnitus questionnaire. We undertook this study to determine the reliability and validity of the Chinese-Mandarin version of the Tinnitus Handicap Inventory (THI-CM) for measuring tinnitus-related handicaps. Methods: We tested the test-retest reliability, internal reliability, and construct validity of the THI-CM. Two hundred patients seeking treatment for primary or secondary tinnitus in Southwest China were asked to complete THI-CM prior to clinical evaluation. Patients were evaluated by a clinician using standard methods, and 40 patients were asked to complete THI-CM a second time 14±3 days after the initial interview. Results: The test-retest reliability of THI-CM was high (Pearson correlation, 0.98), as was the internal reliability (Cronbach's α, 0.93). Factor analysis indicated that THI-CM has a unifactorial structure. Conclusion: The THI-CM version is reliable. The total score in THI-CM can be used to measure tinnitus-related handicaps in Mandarin-speaking populations. PMID:22468196
User's guide to the Reliability Estimation System Testbed (REST)
NASA Technical Reports Server (NTRS)
Nicol, David M.; Palumbo, Daniel L.; Rifkin, Adam
1992-01-01
The Reliability Estimation System Testbed is an X-window based reliability modeling tool that was created to explore the use of the Reliability Modeling Language (RML). RML was defined to support several reliability analysis techniques including modularization, graphical representation, Failure Mode Effects Simulation (FMES), and parallel processing. These techniques are most useful in modeling large systems. Using modularization, an analyst can create reliability models for individual system components. The modules can be tested separately and then combined to compute the total system reliability. Because a one-to-one relationship can be established between system components and the reliability modules, a graphical user interface may be used to describe the system model. RML was designed to permit message passing between modules. This feature enables reliability modeling based on a run time simulation of the system wide effects of a component's failure modes. The use of failure modes effects simulation enhances the analyst's ability to correctly express system behavior when using the modularization approach to reliability modeling. To alleviate the computation bottleneck often found in large reliability models, REST was designed to take advantage of parallel processing on hypercube processors.
NASA Astrophysics Data System (ADS)
Castellarin, A.; Montanari, A.; Brath, A.
2002-12-01
The study derives Regional Depth-Duration-Frequency (RDDF) equations for a wide region of northern-central Italy (37,200 km²) by following an adaptation of the approach originally proposed by Alila [WRR, 36(7), 2000]. The proposed RDDF equations have a rather simple structure and allow an estimation of the design storm, defined as the rainfall depth expected for a given storm duration and recurrence interval, in any location of the study area for storm durations from 1 to 24 hours and for recurrence intervals up to 100 years. The reliability of the proposed RDDF equations represents the main concern of the study and it is assessed at two different levels. The first level considers the gauged sites and compares estimates of the design storm obtained with the RDDF equations with at-site estimates based upon the observed annual maximum series of rainfall depth and with design storm estimates resulting from a regional estimator recently developed for the study area through a Hierarchical Regional Approach (HRA) [Gabriele and Arnell, WRR, 27(6), 1991]. The second level performs a reliability assessment of the RDDF equations for ungauged sites by means of a jack-knife procedure. Using the HRA estimator as a reference term, the jack-knife procedure assesses the reliability of design storm estimates provided by the RDDF equations for a given location when dealing with the complete absence of pluviometric information. The results of the analysis show that the proposed RDDF equations represent practical and effective computational means for producing a first guess of the design storm at the available raingauges and reliable design storm estimates for ungauged locations. The first author gratefully acknowledges D.H. Burn for sponsoring the submission of the present abstract.
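The jack-knife assessment for ungauged sites can be sketched generically: each gauged site is withheld in turn, its design storm is re-estimated from the remaining sites only, and the result is compared with a reference value (the HRA estimate plays that role in the abstract). Everything below is a hypothetical illustration; in particular, the "regional estimator" is a plain mean of the other sites, standing in for the RDDF equations, which the abstract does not give.

```python
def jackknife_relative_errors(reference, regional_estimate):
    """Leave-one-out relative errors.

    reference: dict mapping site name to its reference design-storm depth (mm).
    regional_estimate: function taking the dict of remaining sites and
        returning an estimate for the withheld site.
    """
    errors = {}
    for site, ref_value in reference.items():
        # Treat the site as ungauged: drop it before estimating.
        others = {s: v for s, v in reference.items() if s != site}
        predicted = regional_estimate(others)
        errors[site] = (predicted - ref_value) / ref_value
    return errors


def mean_of_others(others):
    """Placeholder regional estimator (assumption, not the RDDF equations)."""
    return sum(others.values()) / len(others)


# Hypothetical reference depths in mm for three sites.
sites = {"A": 50.0, "B": 60.0, "C": 70.0}
errors = jackknife_relative_errors(sites, mean_of_others)
```

The spread of these leave-one-out errors is what indicates how far an estimate can be trusted at a location with no rainfall record.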
Downs, Stephen; Marquez, Jodie; Chiarelli, Pauline
2013-06-01
What is the intra-rater and inter-rater relative reliability of the Berg Balance Scale? What is the absolute reliability of the Berg Balance Scale? Does the absolute reliability of the Berg Balance Scale vary across the scale? Systematic review with meta-analysis of reliability studies. Any clinical population that has undergone assessment with the Berg Balance Scale. Relative intra-rater reliability, relative inter-rater reliability, and absolute reliability. Eleven studies involving 668 participants were included in the review. The relative intrarater reliability of the Berg Balance Scale was high, with a pooled estimate of 0.98 (95% CI 0.97 to 0.99). Relative inter-rater reliability was also high, with a pooled estimate of 0.97 (95% CI 0.96 to 0.98). A ceiling effect of the Berg Balance Scale was evident for some participants. In the analysis of absolute reliability, all of the relevant studies had an average score of 20 or above on the 0 to 56 point Berg Balance Scale. The absolute reliability across this part of the scale, as measured by the minimal detectable change with 95% confidence, varied between 2.8 points and 6.6 points. The Berg Balance Scale has a higher absolute reliability when close to 56 points due to the ceiling effect. We identified no data that estimated the absolute reliability of the Berg Balance Scale among participants with a mean score below 20 out of 56. The Berg Balance Scale has acceptable reliability, although it might not detect modest, clinically important changes in balance in individual subjects. The review was only able to comment on the absolute reliability of the Berg Balance Scale among people with moderately poor to normal balance. Copyright © 2013 Australian Physiotherapy Association. Published by .. All rights reserved.
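Absolute reliability expressed as a minimal detectable change with 95% confidence (MDC95) is conventionally derived from the standard error of measurement, SEM = SD × √(1 − r), with MDC95 = 1.96 × √2 × SEM. The sketch below applies those textbook formulas; the standard deviation and reliability passed in are hypothetical, since the abstract reports neither the per-study SDs nor the exact computation used in the included studies.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM from the between-subject SD and a reliability coefficient (e.g. ICC)."""
    return sd * math.sqrt(1.0 - reliability)


def mdc95(sd, reliability):
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, reliability)


# Hypothetical inputs: SD of 10 points and reliability of 0.75.
example_mdc = mdc95(10.0, 0.75)
```

A change smaller than the MDC95 cannot be distinguished from measurement error in an individual, which is why a scale with high relative reliability can still miss modest changes.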
Ozaki, Y; Watanabe, H; Kaida, A; Miura, M; Nakagawa, K; Toda, K; Yoshimura, R; Sumi, Y; Kurabayashi, T
2017-07-01
Early stage oral cancer can be cured with oral brachytherapy, but whole-body radiation exposure status has not been previously studied. Recently, the International Commission on Radiological Protection Committee (ICRP) recommended the use of ICRP phantoms to estimate radiation exposure from external and internal radiation sources. In this study, we used a Monte Carlo simulation with ICRP phantoms to estimate whole-body exposure from oral brachytherapy. We used a Particle and Heavy Ion Transport code System (PHITS) to model oral brachytherapy with 192Ir hairpins and 198Au grains and to perform a Monte Carlo simulation on the ICRP adult reference computational phantoms. To confirm the simulations, we also computed local dose distributions from these small sources, and compared them with the results from Oncentra manual Low Dose Rate Treatment Planning (mLDR) software which is used in day-to-day clinical practice. We successfully obtained data on absorbed dose for each organ in males and females. Sex-averaged equivalent doses were 0.547 and 0.710 Sv with 192Ir hairpins and 198Au grains, respectively. Simulation with PHITS was reliable when compared with an alternative computational technique using mLDR software. We concluded that the absorbed dose for each organ and whole-body exposure from oral brachytherapy can be estimated with Monte Carlo simulation using PHITS on ICRP reference phantoms. Effective doses for patients with oral cancer were obtained. © The Author 2017. Published by Oxford University Press on behalf of The Japan Radiation Research Society and Japanese Society for Radiation Oncology.
Hiligsmann, Mickaël; Ethgen, Olivier; Bruyère, Olivier; Richy, Florent; Gathon, Henry-Jean; Reginster, Jean-Yves
2009-01-01
Markov models are increasingly used in economic evaluations of treatments for osteoporosis. Most of the existing evaluations are cohort-based Markov models missing comprehensive memory management and versatility. In this article, we describe and validate an original Markov microsimulation model to accurately assess the cost-effectiveness of prevention and treatment of osteoporosis. We developed a Markov microsimulation model with a lifetime horizon and a direct health-care cost perspective. The patient history was recorded and was used in calculations of transition probabilities, utilities, and costs. To test the internal consistency of the model, we carried out an example calculation for alendronate therapy. Then, external consistency was investigated by comparing absolute lifetime risk of fracture estimates with epidemiologic data. For women at age 70 years, with a twofold increase in the fracture risk of the average population, the costs per quality-adjusted life-year gained for alendronate therapy versus no treatment were estimated at €9105 and €15,325, respectively, under full and realistic adherence assumptions. All the sensitivity analyses in terms of model parameters and modeling assumptions were coherent with expected conclusions and absolute lifetime risk of fracture estimates were within the range of previous estimates, which confirmed both internal and external consistency of the model. Microsimulation models present some major advantages over cohort-based models, increasing the reliability of the results and being largely compatible with the existing state of the art, evidence-based literature. The developed model appears to be a valid model for use in economic evaluations in osteoporosis.
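The feature that separates a microsimulation from a cohort-based Markov model is that each simulated patient carries a history that can alter later transition probabilities. The sketch below illustrates only that idea. All states, probabilities, and the multiplier are hypothetical placeholders and are not parameters of the published model.

```python
import random


def simulate_patient(rng, cycles, p_fracture, p_death, history_multiplier):
    """Simulate one patient over yearly cycles and return the fracture count.

    Memory: each prior fracture multiplies the current fracture probability
    by history_multiplier (an illustrative assumption).
    """
    fractures = 0
    for _ in range(cycles):
        if rng.random() < p_death:
            break  # absorbing state: patient leaves the simulation
        p_now = min(1.0, p_fracture * history_multiplier ** fractures)
        if rng.random() < p_now:
            fractures += 1
    return fractures


def simulate_cohort(n, seed, cycles, p_fracture, p_death, history_multiplier):
    """Run n independent patients with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [
        simulate_patient(rng, cycles, p_fracture, p_death, history_multiplier)
        for _ in range(n)
    ]


# Hypothetical run: 1000 patients, 30 yearly cycles.
results = simulate_cohort(1000, seed=42, cycles=30,
                          p_fracture=0.03, p_death=0.04, history_multiplier=1.5)
mean_fractures = sum(results) / len(results)
```

Costs and utilities would be accumulated per patient in the same loop; they are omitted here to keep the history mechanism visible.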
Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S
2015-01-16
Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques, are compared; a traditional method and a mixed model Bayesian approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed model approach assimilates information from replica QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.
ERIC Educational Resources Information Center
VanVoorhis, Carmen R. Wilson; Blumentritt, Tracie L.
2007-01-01
We examined the internal consistency reliability, convergent and divergent validity, and factor structure of the Beck Depression Inventory-II (BDI-II) in a sample of 131 Mexican American youth. The BDI-II demonstrated excellent internal consistency reliability (alpha = 0.90) and solid convergent and divergent validity with various clinical scales…
The Chinese Version of the Self-Report Family Inventory: Reliability and Validity.
ERIC Educational Resources Information Center
Shek, Daniel T. L.; Lai, Kelly Y. C.
2001-01-01
Reliability and validity of Chinese Self-Report Family Inventory (C-SFI) were examined in three studies. Study 1 showed C-SFI was temporally stable and internally consistent. Study 2 indicated C-SFI could discriminate between clinical and nonclinical groups. Study 3 gave support for internal consistency, concurrent validity and construct validity.…
Validity and Reliability of Internalized Stigma of Mental Illness (Cantonese)
ERIC Educational Resources Information Center
Young, Daniel Kim-Wan; Ng, Petrus Y. N.; Pan, Jia-Yan; Cheng, Daphne
2017-01-01
Purpose: This study aims to translate and test the reliability and validity of the Internalized Stigma of Mental Illness-Cantonese (ISMI-C). Methods: The original English version of ISMI is translated into the ISMI-C by going through forward and backward translation procedure. A cross-sectional research design is adopted that involved 295…
ERIC Educational Resources Information Center
Lim, Young-Jin
2015-01-01
The aim of this study was to examine the internal consistency reliability, test-retest reliability, factorial structure validity, and convergent validity of a Korean version of the Satisfaction With Life Scale adapted for children (K-SWLS-C). Participants consisted of 653 elementary school students (48% were male). The internal consistency of the…
Wagner, Brian J.; Gorelick, Steven M.
1986-01-01
A simulation nonlinear multiple-regression methodology for estimating parameters that characterize the transport of contaminants is developed and demonstrated. Finite difference contaminant transport simulation is combined with a nonlinear weighted least squares multiple-regression procedure. The technique provides optimal parameter estimates and gives statistics for assessing the reliability of these estimates under certain general assumptions about the distributions of the random measurement errors. Monte Carlo analysis is used to estimate parameter reliability for a hypothetical homogeneous soil column for which concentration data contain large random measurement errors. The value of data collected spatially versus data collected temporally was investigated for estimation of velocity, dispersion coefficient, effective porosity, first-order decay rate, and zero-order production. The use of spatial data gave estimates that were 2–3 times more reliable than estimates based on temporal data for all parameters except velocity. Comparison of estimated linear and nonlinear confidence intervals based upon Monte Carlo analysis showed that the linear approximation is poor for dispersion coefficient and zero-order production coefficient when data are collected over time. In addition, examples demonstrate transport parameter estimation for two real one-dimensional systems. First, the longitudinal dispersivity and effective porosity of an unsaturated soil are estimated using laboratory column data. We compare the reliability of estimates based upon data from individual laboratory experiments versus estimates based upon pooled data from several experiments. Second, the simulation nonlinear regression procedure is extended to include an additional governing equation that describes delayed storage during contaminant transport. The model is applied to analyze the trends, variability, and interrelationship of parameters in a mountain stream in northern California.
Hickman, Ronald L.; Pinto, Melissa D.; Lee, Eunsuk; Daly, Barbara J.
2015-01-01
The Decision Regret Scale (DRS) is a five-item instrument that captures an individual’s regret associated with a healthcare decision. Cross-sectional data were collected from 109 cardiac patients who decided to receive an internal cardioverter defibrillator (ICD). Exploratory and confirmatory factor analyses, assessments of internal consistency reliability (α = .86), and discriminant validity established the DRS as a reliable and valid measure of decision regret in ICD recipients. The DRS, a psychometrically sound instrument, has relevance for clinicians and researchers vested in optimizing the decisional outcomes of ICD recipients. Future research is needed to examine the reliability and validity of the DRS in a larger and more diverse sample of ICD recipients. PMID:22679707
Fountoulakis, K N; Iacovides, A; Ioannidou, Ch; Bascialla, F; Nimatoudis, I; Kaprinis, G; Janca, A; Dahl, A
2002-05-17
The International Personality Disorders Examination (IPDE) constitutes the proposal of the WHO for the reliable diagnosis of personality disorders (PD). The IPDE assesses pathological personality and is compatible with both DSM-IV and ICD-10 diagnosis. However, it is important to test the reliability and cultural applicability of different IPDE translations. Thirty-one patients (12 male and 19 female) aged 35.25 ± 11.08 years took part in the study. Three examiners applied the interview (23 interviews by two examiners and 8 interviews by three examiners, that is, 47 pairs of interviews and 70 single interviews). The phi coefficient was used to test categorical diagnosis agreement and the Pearson product-moment correlation coefficient to test agreement concerning the number of criteria met. Translation and back-translation did not reveal specific problems. Results suggested that reliability of the Greek translation is good. However, socio-cultural factors (family coherence, work environment, etc.) could affect the application of some of the IPDE items in Greece. The diagnosis of any PD was highly reliable, with phi > 0.92. However, diagnosis of non-specific PD was not reliable at all (phi close to 0), suggesting that this is a true residual category. Diagnoses of specific PDs were highly reliable with the exception of schizoid PD. Diagnoses of antisocial and borderline PDs were perfectly reliable, with phi equal to 1.00. The Greek translation of the IPDE is a reliable instrument for the assessment of personality disorder, but cultural variation may limit its applicability in international comparisons.
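For two examiners each giving a yes/no diagnosis, the phi coefficient is computed from the 2×2 agreement table as (ad − bc) / √((a+b)(c+d)(a+c)(b+d)). A minimal sketch follows; the function name and the example diagnoses are hypothetical.

```python
import math


def phi_coefficient(rater_a, rater_b):
    """Phi coefficient for two equal-length lists of binary (0/1) diagnoses."""
    pairs = list(zip(rater_a, rater_b))
    a = sum(1 for x, y in pairs if x == 1 and y == 1)  # both positive
    b = sum(1 for x, y in pairs if x == 1 and y == 0)  # only rater A positive
    c = sum(1 for x, y in pairs if x == 0 and y == 1)  # only rater B positive
    d = sum(1 for x, y in pairs if x == 0 and y == 0)  # both negative
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a rater gives only one category")
    return (a * d - b * c) / denom


# Hypothetical diagnoses for four patients.
full_agreement = phi_coefficient([1, 1, 0, 0], [1, 1, 0, 0])
partial_agreement = phi_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

Phi is undefined when either examiner assigns every patient to the same category, and it becomes unstable for rarely used diagnoses, which is worth keeping in mind when reading a value near 0 for a residual category.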
Karaman, Adem; Durur-Subasi, Irmak; Alper, Fatih; Durur-Karakaya, Afak; Subasi, Mahmut; Akgun, Metin
2017-10-01
To determine whether the use of necrosis/wall apparent diffusion coefficient (ADC) ratios in the differentiation of necrotic lung lesions is more reliable than measuring the wall alone. In this retrospective study, a total of 76 patients (54 males and 22 females, 71% vs. 29%, with a mean age of 53 ± 18 years, range, 18-84) were enrolled, 33 of whom had lung carcinoma and 43 had a benign necrotic lung lesion. A 3T scanner was used. The calculation of the necrosis/wall ADC ratio was based on ADC values measured from necrosis and the wall of the lesions by diffusion-weighted imaging (DWI). Statistical analyses were performed with the independent samples t-test and receiver operating characteristic analysis. Intraobserver and interobserver reliability were calculated for ADC values of wall and necrosis. The mean necrosis/wall ADC ratio was 1.67 ± 0.23 for malignant lesions and 0.75 ± 0.19 for benign lung lesions (P < 0.001). To estimate malignancy, the area under the curve (AUC) values for necrosis ADC, wall ADC, and the necrosis/wall ADC ratio were 0.720, 0.073, and 0.997, respectively. A necrosis/wall ADC ratio cutoff value of 1.12 demonstrated a 100% sensitivity and 98% specificity in the estimation of malignancy. Positive predictive value was 100%, negative predictive value 98%, and diagnostic accuracy 99%. There was good intraobserver and interobserver reliability for wall and necrosis. The necrosis/wall ADC ratio appears to be a more reliable and promising tool for discriminating lung carcinoma from benign necrotic lung lesions than measuring the wall alone. Level of Evidence: 4. Technical Efficacy: Stage 2. J. Magn. Reson. Imaging 2017;46:1001-1006. © 2017 International Society for Magnetic Resonance in Medicine.
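As a rough illustration of how sensitivity, specificity, predictive values, and accuracy follow from applying a ratio cutoff, here is a minimal sketch. The function name `diagnostic_metrics`, the ratio values, and the labels are hypothetical; only the cutoff of 1.12 is borrowed from the abstract, and the rule "ratio above cutoff means malignant" is an assumption.

```python
def diagnostic_metrics(ratios, is_malignant, cutoff):
    """Classify each lesion as malignant when ratio > cutoff and score the result."""
    tp = fp = fn = tn = 0
    for ratio, malignant in zip(ratios, is_malignant):
        predicted = ratio > cutoff
        if predicted and malignant:
            tp += 1          # malignant lesion correctly flagged
        elif predicted and not malignant:
            fp += 1          # benign lesion wrongly flagged
        elif malignant:
            fn += 1          # malignant lesion missed
        else:
            tn += 1          # benign lesion correctly cleared
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical necrosis/wall ADC ratios (not the study's measurements).
ratios = [1.7, 1.5, 1.3, 1.0, 0.9, 0.8, 0.7, 1.2]
labels = [True, True, True, True, False, False, False, False]
metrics = diagnostic_metrics(ratios, labels, cutoff=1.12)
```

With these made-up values one malignant lesion falls below the cutoff and one benign lesion falls above it, so every measure comes out at 0.75.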
Iranian Health Literacy Questionnaire (IHLQ): An Instrument for Measuring Health Literacy in Iran
Haghdoost, Ali Akbar; Rakhshani, Fatemeh; Aarabi, Mohsen; Montazeri, Ali; Tavousi, Mahmoud; Solimanian, Atoosa; Sarbandi, Fatemeh; Namdar, Hosein; Iranpour, Abedin
2015-01-01
Background: Promoting Health Literacy (HL) is considered an important goal in the strategic plans of many countries. In spite of the necessity for access to valid, reliable and native HL instruments, such instruments are scarce in the Persian language. Moreover, there is no good estimation of HL status in Iran. Objectives: The aim of this study was to provide a valid, reliable and native instrument to measure and monitor community HL in Iran and also to provide an estimation of HL status in two Iranian provinces. Patients and Methods: By applying multistage cluster sampling, 1080 respondents (540 from each gender) were recruited from Kerman and Mazandaran provinces of Iran, from February to June 2014, to participate in this cross-sectional study. The development of the Iranian Health Literacy Questionnaire (IHLQ) was initiated with a comprehensive review of the literature. Then, face, content and construct validity as well as reliability were determined. Results: Internal consistency and test-retest reliability (ICC) of the factors were in the range of 0.71 to 0.96 and 0.73 to 0.86, respectively. To assess construct validity, Exploratory Factor Analysis (EFA; Kaiser-Meyer-Olkin (KMO) = 0.95 and Bartlett’s test result of 3.017 with P < 0.001) with varimax rotation was used. An optimal reduced solution, including 36 items and seven factors, was found in EFA. Five of the factors identified were reading/comprehension skills, individual empowerment, communication/decision-making skills, social empowerment and health knowledge. Conclusions: It was concluded that the IHLQ might be a practical and useful tool for investigating HL for Persian language speakers around the world. Since HL is dynamic and its instruments should be regularly revised, further studies are recommended to assess HL with application of the IHLQ to detect its potential imperfections. PMID:26290752
Human Reliability Assessments: Using the Past (Shuttle) to Predict the Future (Orion)
NASA Technical Reports Server (NTRS)
DeMott, Diana L.; Bigler, Mark A.
2017-01-01
NASA (National Aeronautics and Space Administration) Johnson Space Center (JSC) Safety and Mission Assurance (S&MA) uses two human reliability analysis (HRA) methodologies. The first is a simplified method which is based on how much time is available to complete the action, with consideration included for environmental and personal factors that could influence the human's reliability. This method is expected to provide a conservative value or placeholder as a preliminary estimate. This preliminary estimate or screening value is used to determine which placeholder needs a more detailed assessment. The second methodology is used to develop a more detailed human reliability assessment on the performance of critical human actions. This assessment needs to consider more than the time available, this would include factors such as: the importance of the action, the context, environmental factors, potential human stresses, previous experience, training, physical design interfaces, available procedures/checklists and internal human stresses. The more detailed assessment is expected to be more realistic than that based primarily on time available. When performing an HRA on a system or process that has an operational history, we have information specific to the task based on this history and experience. In the case of a Probabilistic Risk Assessment (PRA) that is based on a new design and has no operational history, providing a "reasonable" assessment of potential crew actions becomes more challenging. To determine what is expected of future operational parameters, the experience from individuals who had relevant experience and were familiar with the system and process previously implemented by NASA was used to provide the "best" available data. Personnel from Flight Operations, Flight Directors, Launch Test Directors, Control Room Console Operators, and Astronauts were all interviewed to provide a comprehensive picture of previous NASA operations. 
Verification of the assumptions and expectations expressed in the assessments will be needed when the procedures, flight rules, and operational requirements are developed and then finalized.
Human Reliability Assessments: Using the Past (Shuttle) to Predict the Future (Orion)
NASA Technical Reports Server (NTRS)
DeMott, Diana; Bigler, Mark
2016-01-01
NASA (National Aeronautics and Space Administration) Johnson Space Center (JSC) Safety and Mission Assurance (S&MA) uses two human reliability analysis (HRA) methodologies. The first is a simplified method which is based on how much time is available to complete the action, with consideration included for environmental and personal factors that could influence the human's reliability. This method is expected to provide a conservative value or placeholder as a preliminary estimate. This preliminary estimate or screening value is used to determine which placeholder needs a more detailed assessment. The second methodology is used to develop a more detailed human reliability assessment on the performance of critical human actions. This assessment needs to consider more than the time available, this would include factors such as: the importance of the action, the context, environmental factors, potential human stresses, previous experience, training, physical design interfaces, available procedures/checklists and internal human stresses. The more detailed assessment is expected to be more realistic than that based primarily on time available. When performing an HRA on a system or process that has an operational history, we have information specific to the task based on this history and experience. In the case of a Probabilistic Risk Assessment (PRA) that is based on a new design and has no operational history, providing a "reasonable" assessment of potential crew actions becomes more challenging. In order to determine what is expected of future operational parameters, the experience from individuals who had relevant experience and were familiar with the system and process previously implemented by NASA was used to provide the "best" available data. Personnel from Flight Operations, Flight Directors, Launch Test Directors, Control Room Console Operators and Astronauts were all interviewed to provide a comprehensive picture of previous NASA operations. 
Verification of the assumptions and expectations expressed in the assessments will be needed when the procedures, flight rules and operational requirements are developed and then finalized.
Ardestani, Marzieh M; Moazen, Mehran; Maniei, Ehsan; Jin, Zhongmin
2015-04-01
Commercially available fixed bearing knee prostheses are mainly divided into two groups: posterior stabilized (PS) versus cruciate retaining (CR). Despite the widespread comparative studies, the debate continues regarding the superiority of one type over the other. This study used a combined finite element (FE) simulation and principal component analysis (PCA) to evaluate "reliability" and "sensitivity" of two PS designs versus two CR designs over a patient population. Four fixed bearing implants were chosen: PFC (DePuy), PFC Sigma (DePuy), NexGen (Zimmer) and Genesis II (Smith & Nephew). Using PCA, a large probabilistic knee joint motion and loading database was generated based on the available experimental data from literature. The probabilistic knee joint data were applied to each implant in a FE simulation to calculate the potential envelopes of kinematics (i.e. anterior-posterior [AP] displacement and internal-external [IE] rotation) and contact mechanics. The performance envelopes were considered as an indicator of performance reliability. For each implant, PCA was used to highlight how much the implant performance was influenced by changes in each input parameter (sensitivity). Results showed that (1) conformity directly affected the reliability of the knee implant over a patient population such that lesser conformity designs (PS or CR), had higher kinematic variability and were more influenced by AP force and IE torque, (2) contact reliability did not differ noticeably among different designs and (3) CR or PS designs affected the relative rank of critical factors that influenced the reliability of each design. Such investigations enlighten the underlying biomechanics of various implant designs and can be utilized to estimate the potential performance of an implant design over a patient population. Copyright © 2015 IPEM. Published by Elsevier Ltd. All rights reserved.
An empirical Bayes approach for the Poisson life distribution.
NASA Technical Reports Server (NTRS)
Canavos, G. C.
1973-01-01
A smooth empirical Bayes estimator is derived for the intensity parameter (hazard rate) in the Poisson distribution as used in life testing. The reliability function is also estimated either by using the empirical Bayes estimate of the parameter, or by obtaining the expectation of the reliability function. The behavior of the empirical Bayes procedure is studied through Monte Carlo simulation in which estimates of mean-squared errors of the empirical Bayes estimators are compared with those of conventional estimators such as minimum variance unbiased or maximum likelihood. Results indicate a significant reduction in mean-squared error of the empirical Bayes estimators over the conventional variety.
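The two routes to a reliability estimate mentioned above (plugging in the estimated intensity versus taking the expectation of the reliability function) can be contrasted in a minimal sketch. It assumes, purely for illustration, that the uncertainty about the intensity is summarized by a gamma distribution with hypothetical `shape` and `rate` parameters; this is not the smooth empirical Bayes estimator derived in the report.

```python
import math

def reliability_plugin(shape, rate, t):
    """R(t) = exp(-lambda * t) evaluated at the gamma mean, lambda = shape / rate."""
    intensity = shape / rate
    return math.exp(-intensity * t)

def reliability_expected(shape, rate, t):
    """E[exp(-lambda * t)] when lambda ~ Gamma(shape, rate): (rate / (rate + t)) ** shape."""
    return (rate / (rate + t)) ** shape

# Hypothetical gamma parameters and mission time (assumptions, not report values).
plug_in = reliability_plugin(shape=2.0, rate=10.0, t=1.0)
expected = reliability_expected(shape=2.0, rate=10.0, t=1.0)
```

Because exp(-lambda * t) is convex in lambda, the expectation is never smaller than the plug-in value (Jensen's inequality), so the two estimates differ most when the intensity is poorly determined.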
Aerts, Frank; Carrier, Kathy; Alwood, Becky
2016-01-01
Background: The assessment of clinical manifestation of muscle fatigue is an effective procedure in establishing therapeutic exercise dose. Few studies have evaluated physical therapist reliability in establishing muscle fatigue through detection of changes in quality of movement patterns in a live setting. Objective: The purpose of this study is to evaluate the inter-rater reliability of physical therapists’ ability to detect altered movement patterns due to muscle fatigue. Design: A reliability study in a live setting with multiple raters. Participants: Forty-four healthy individuals (ages 19-35) were evaluated by six physical therapists in a live setting. Methods: Participants were evaluated by physical therapists for altered movement patterns during resisted shoulder rotation. Each participant completed a total of four tests: right shoulder internal rotation, right shoulder external rotation, left shoulder internal rotation and left shoulder external rotation. Results: For all tests combined, the inter-rater reliability for a single rater scoring, ICC (2,1), was .65 (95% CI: .60, .71). This corresponds to moderate inter-rater reliability between physical therapists. Limitations: The results of this study apply only to healthy participants and therefore cannot be generalized to a symptomatic population. Conclusion: Moderate inter-rater reliability was found between physical therapists in establishing muscle fatigue through the observation of sustained altered movement patterns during dynamic resistive shoulder internal and external rotation. PMID:27347241
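For readers unfamiliar with ICC (2,1), the following minimal sketch computes it from a subjects-by-raters table using the two-way random-effects, single-rater, absolute-agreement form. The function name `icc_2_1` and the small ratings table are illustrative assumptions and are not the study's data.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows (one row per subject, one column per rater)."""
    n = len(ratings)            # number of subjects
    k = len(ratings[0])         # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)          # between-subjects mean square
    ms_raters = ss_raters / (k - 1)              # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )

# Illustrative table: 6 subjects scored by 4 raters (not the study's data).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_example = icc_2_1(example)
```

In this table the raters rank subjects similarly but differ systematically in level, so the absolute-agreement coefficient stays low (about 0.29) even though the ordering is fairly consistent.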
International physical activity questionnaire: reliability and validity of the Turkish version.
Saglam, Melda; Arikan, Hulya; Savci, Sema; Inal-Ince, Deniz; Bosnak-Guclu, Meral; Karabulut, Erdem; Tokgozoglu, Lale
2010-08-01
Physical inactivity is a global problem which is related to many chronic health disorders. Physical activity scales which allow cross-cultural comparisons have been developed. The goal was to assess the reliability and validity of a Turkish version of the International Physical Activity Questionnaire (IPAQ). 1,097 university students (721 women, 376 men; ages 18-32) volunteered. Short and long forms of the IPAQ gave good agreement and comparable 1-wk. test-retest reliabilities. Caltrac accelerometer data were compared with IPAQ scores in 80 participants with good agreement for short and long forms. Turkish versions of the IPAQ short and long forms are reliable and valid in assessment of physical activity.
Krishan, Kewal; Chatterjee, Preetika M; Kanchan, Tanuj; Kaur, Sandeep; Baryah, Neha; Singh, R K
2016-04-01
Sex estimation is considered one of the essential parameters in forensic anthropology casework, and requires foremost consideration in the examination of skeletal remains. Forensic anthropologists frequently employ morphologic and metric methods for sex estimation of human remains. These methods remain essential in the identification process in spite of the advent and success of molecular techniques. The steadily increasing use of imaging techniques in forensic anthropology research has made it easier to derive and revise the available population data. These methods, however, are less reliable owing to high variance and indistinct landmark details. The present review discusses the reliability and reproducibility of various analytical approaches (morphological, metric, molecular and radiographic methods) in sex estimation of skeletal remains. Numerous studies have shown a higher reliability and reproducibility of measurements taken directly on the bones and hence, such direct methods of sex estimation are considered to be more reliable than the other methods. The geometric morphometric (GM) method and the Diagnose Sexuelle Probabiliste (DSP) method are emerging as valid and widely used techniques in forensic anthropology in terms of accuracy and reliability. In addition, the newer 3D methods have been shown to exhibit specific sexual dimorphism patterns not readily revealed by traditional methods. Development of newer and better methodologies for sex estimation, as well as re-evaluation of the existing ones, will continue in the endeavour of forensic researchers for more accurate results. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hill, J.R.; Heger, A.S.; Koen, B.V.
1984-04-01
This report is the result of a preliminary feasibility study of the applicability of Stein and related parametric empirical Bayes (PEB) estimators to the Nuclear Plant Reliability Data System (NPRDS). A new estimator is derived for the means of several independent Poisson distributions with different sampling times. This estimator is applied to data from NPRDS in an attempt to improve failure rate estimation. Theoretical and Monte Carlo results indicate that the new PEB estimator can perform significantly better than the standard maximum likelihood estimator if the estimation of the individual means can be combined through the loss function or through a parametric class of prior distributions.
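The idea of combining several Poisson failure rates observed over different sampling times can be sketched with a simple gamma-Poisson shrinkage rule fitted by the method of moments. This is an illustrative stand-in, not the estimator derived in the report; the function name `peb_rates` and the failure counts and exposure times are hypothetical.

```python
from statistics import mean, variance

def peb_rates(failures, times):
    """Shrink per-component rates x_i / t_i toward the pooled rate.

    Assumes x_i ~ Poisson(lambda_i * t_i) with lambda_i ~ Gamma(a, b) (rate b);
    a and b are chosen by matching moments of the observed rates.
    """
    raw = [x / t for x, t in zip(failures, times)]
    pooled = sum(failures) / sum(times)
    # Between-component variance: observed spread minus average Poisson noise.
    between = variance(raw) - pooled * mean(1 / t for t in times)
    if between <= 0 or pooled == 0:
        return [pooled] * len(raw)       # no detectable spread: pool completely
    b = pooled / between
    a = pooled * b
    # Posterior mean of lambda_i given x_i failures in time t_i.
    return [(a + x) / (b + t) for x, t in zip(failures, times)]

# Hypothetical failure counts and exposure times (not NPRDS data).
failures = [0, 2, 10, 1]
times = [10.0, 20.0, 25.0, 50.0]
raw_rates = [x / t for x, t in zip(failures, times)]
pooled_rate = sum(failures) / sum(times)
shrunk = peb_rates(failures, times)
```

Each shrunken rate is a weighted average of the component's own maximum likelihood rate and the pooled rate, with components observed for a shorter time pulled more strongly toward the pooled value.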
Lucchetti, Giancarlo; Lucchetti, Alessandra Lamas Granero; de Bernardin Gonçalves, Juliane Piasseschi; Vallada, Homero P
2015-02-01
Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being scale (FACIT-Sp 12) is one of the most used and most validated instruments for assessing spiritual well-being in the world. Some Brazilian studies have used this instrument without, however, assessing its psychometric properties. The present study aims to validate the Portuguese version of the FACIT-Sp 12 among Brazilian psychiatric inpatients. A self-administered questionnaire, covering spiritual well-being (FACIT-Sp 12), depression, anxiety, religiosity, quality of life, and optimism, was administered. Of those who met the inclusion criteria, 579 patients were invited to participate and 493 (85.1 %) were able to fill out the FACIT-Sp 12 twice (test and retest). Subsequently, the validation analysis was carried out. Estimation of test-retest reliability, discriminant, and convergent validity was determined by the Spearman's correlation test, and the internal consistency was examined by the Cronbach's alpha. The sample was predominantly male (63.9 %) with a mean age of 35.9 years, and the most common psychiatric condition was bipolar disorder (25.7 %) followed by schizophrenia (20.4 %), drug use (20.0 %), and depression (17.6 %) according to ICD-10. The total FACIT-Sp 12 scale as well as the subscales demonstrated high internal consistency (coefficient alphas ranging from 0.893 for the total scale to 0.655 for the Meaning subscale), good convergent and divergent validity, and satisfactory test-retest reliability (rho = 0.699). The Portuguese version of FACIT-Sp 12 is a valid and reliable measure to use in Brazilian psychiatric inpatients. The availability of a brief and broad measure of spiritual well-being can help the study of spirituality and its influence on health by researchers from countries that speak the Portuguese language.
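Cronbach's alpha, used above to examine internal consistency, can be computed from a respondents-by-items score table as in the minimal sketch below. The function name `cronbach_alpha` and the scores are hypothetical and unrelated to the FACIT-Sp 12 data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(scores[0])                                   # number of items
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 5 respondents, 3 items (not the study's data).
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
```

Alpha rises when the variance of the total score is large relative to the summed item variances, that is, when the items covary strongly.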
Psychometric Properties of a Russian Version of the Cognitive Flexibility Inventory (CFI-R).
Kurginyan, Sergey S; Osavolyuk, Ekaterina Y
2018-01-01
The Cognitive Flexibility Inventory (CFI) is a brief self-report measure of the type of cognitive flexibility (CF) necessary to successfully challenge and restructure maladaptive beliefs with more balanced and adaptive thinking; it is particularly popular for use with English speakers. The CFI has recently been translated into five languages (Chinese, Japanese, Iranian, Turkish, and Russian), although estimates of reliability and validity of these translated versions are scarce. This study reports on the factor structure, internal consistency, reliability, and construct validity of the CFI. We adapted the CFI for a Russian-speaking population, using a student sample of 445 first- and second-year undergraduates (M = 18.59 years, SD = 1.19), and found that a two-factor model fitted the data well. However, the structure of the CFI was revised because of some modifications, which were made to the original English to match the Russian equivalents of items originally developed to assess the definite aspect of cognitive flexibility. The CFI-R showed good internal consistency and suitable 7-week test-retest reliability. The construct validity of the Russian version of the CFI was studied by computing correlations with other related measures of CF (Attributional Style Questionnaire), depressive symptoms (Beck Depression Inventory), coping (Ways of Coping (Revised)), and rigidity (Tomsk Rigidity Questionnaire). Furthermore, to assess whether the construct validity was affected by psychopathology, we examined results for non-clinical and clinical samples, using the "known-groups" method. The clinical sample reported lower CF than did the non-clinical sample on the CFI-R's total score and its subscales' scores. Findings in the present study suggest that the psychometric properties of the Russian CFI are comparable to the English original, making it appropriate for research assessment of this type of CF in the Russian-speaking population.
Assessing patient-centered care: one approach to health disparities education.
Wilkerson, LuAnn; Fung, Cha-Chi; May, Win; Elliott, Donna
2010-05-01
Patient-centered care has been described as one approach to cultural competency education that could reduce racial and ethnic health disparities by preparing providers to deliver care that is respectful and responsive to the preferences of each patient. In order to evaluate the effectiveness of a curriculum in teaching patient-centered care (PCC) behaviors to medical students, we drew on the work of Kleinman, Eisenberg, and Good to develop a scale that could be embedded across cases in an objective structured clinical examination (OSCE). To compare the reliability, validity, and feasibility of an embedded patient-centered care scale with the use of a single culturally challenging case in measuring students' use of PCC behaviors as part of a comprehensive OSCE. A total of 322 students from two California medical schools participated in the OSCE as beginning seniors. Cronbach's alpha was used to assess the internal consistency of each approach. Construct validity was addressed by establishing convergent and divergent validity using the cultural challenge case total score and OSCE component scores. Feasibility assessment considered cost and training needs for the standardized patients (SPs). Medical students demonstrated a moderate level of patient-centered skill (mean = 63%, SD = 11%). The PCC Scale demonstrated an acceptable level of internal consistency (alpha = 0.68) over the single case scale (alpha = 0.60). Both convergent and divergent validities were established through low to moderate correlation coefficients. The insertion of PCC items across multiple cases in a comprehensive OSCE can provide a reliable estimate of students' use of PCC behaviors without incurring extra costs associated with implementing a special cross-cultural OSCE. This approach is particularly feasible when an OSCE is already part of the standard assessment of clinical skills. Reliability may be increased with an additional investment in SP training.
Weech-Maldonado, Robert; Carle, Adam; Weidmer, Beverly; Hurtado, Margarita; Ngo-Metzger, Quyen; Hays, Ron D
2012-09-01
There is a need for reliable and valid measures of cultural competence (CC) from the patient's perspective. This paper evaluates the reliability and validity of the Consumer Assessments of Healthcare Providers and Systems (CAHPS) CC item set. Using 2008 survey data, we assessed the internal consistency of the CAHPS CC scales using Cronbach's α and examined the validity of the measures using exploratory and confirmatory factor analysis, multitrait scaling analysis, and regression analysis. A random stratified sample (based on race/ethnicity and language) of 991 enrollees, younger than 65 years, from 2 Medicaid managed care plans in California and New York. CAHPS CC item set after excluding screener items and ratings. Confirmatory factor analysis (Comparative Fit Index=0.98, Tucker Lewis Index=0.98, and Root Mean Square Error of Approximation=0.06) provided support for a 7-factor structure: Doctor Communication--Positive Behaviors, Doctor Communication--Negative Behaviors, Doctor Communication--Health Promotion, Doctor Communication--Alternative Medicine, Shared Decision-Making, Equitable Treatment, and Trust. Item-total correlations (corrected for item overlap) for the 7 scales exceeded 0.40. Exploratory factor analysis showed support for 1 additional factor: Access to Interpreter Services. Internal consistency reliability estimates ranged from 0.58 (Alternative Medicine) to 0.92 (Positive Behaviors) and were 0.70 or higher for 4 of the 8 composites. All composites were positively and significantly associated with the overall doctor rating. The CAHPS CC 26-item set demonstrates adequate measurement properties and can be used as a supplemental item set to the CAHPS Clinician and Group Surveys in assessing culturally competent care from the patient's perspective.
Bergeron, Lise; Berthiaume, Claude; St-Georges, Marie; Piché, Geneviève; Smolla, Nicole
2013-08-01
As no single informant can be considered the gold standard of child psychopathology, interviewing of children regarding their own symptoms is necessary. Our study focused on the reliability, validity, and clinical use of the Dominic Interactive (DI), a multimedia self-report screen to assess symptoms for the most frequent Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision, mental disorders in school-aged children. A sample of 585 children aged 6 to 11 years from the community and psychiatric clinics was used to analyze the internal consistency, the test-retest estimate of reliability, and the criterion-related validity of the DI against the referral status. In addition, cross-informant correlation coefficients between this instrument (child report) and the Child Symptom Inventory (parent report) were explored in a subsample of 292 participants. For the total sample, Cronbach alpha coefficients ranged from 0.63 to 0.91. Test-retest kappas varied from 0.42 to 0.62 for categories based on cut-off points, except for specific phobias. Intraclass correlation coefficients ranged from 0.70 to 0.81 for symptom scales. The DI discriminated between referred and non-referred children in psychiatric clinics for all symptom scales. Significant cross-informant correlation coefficients were higher for the externalizing symptoms (0.35 to 0.48) than the internalizing symptoms (0.14 to 0.27). Findings of our study reasonably support adequate psychometric properties of the DI. This instrument offers a developmentally sensitive screening method to obtain unique information from young children about their mental health problems in front-line services, psychiatric clinics, and research settings.
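The test-retest kappas reported above for cut-off-based categories can be illustrated with a minimal sketch of Cohen's kappa for two administrations of the same screen. The function name `cohen_kappa` and the classifications are hypothetical and are not the study's data.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of category labels."""
    n = len(first)
    categories = set(first) | set(second)
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal proportions of each administration.
    expected = sum(
        (first.count(c) / n) * (second.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical above-cut-off (1) / below-cut-off (0) results for 10 children.
test = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
retest = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohen_kappa(test, retest)
```

Here raw agreement is 80%, but once the 52% agreement expected by chance is removed, kappa is about 0.58, which shows why kappa values look lower than simple percent agreement.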
Ciampa, Philip J; Skinner, Shannon L; Patricio, Sérgio R; Rothman, Russell L; Vermund, Sten H; Audet, Carolyn M
2012-01-01
The relationship between HIV knowledge and HIV-related behaviors in settings like Mozambique has been limited by a lack of rigorously validated measures. A convenience sample of women seeking prenatal care at two clinics were administered an adapted, orally-administered, 27 item HIV-knowledge scale, the HK-27. Validation analyses were stratified by survey language (Portuguese and Echuabo). Kuder-Richardson (KR-20) coefficients estimated internal reliability. Construct validity was assessed with bivariate associations between HK-27 scores (% correct) and selected participant characteristics. The association between knowledge, self-reported HIV testing, and HIV infection were evaluated with multivariable logistic regression. Participants (N = 348) had a median age of 24; 188 spoke Portuguese, and 160 spoke Echuabo. Mean HK-27 scores were higher for Portuguese-speaking participants than Echuabo-speaking participants (68% correct vs. 42%, p<0.001). Internal reliability was strong (KR-20>0.8) for scales in both languages. Higher HK-27 scores were significantly (p≤0.05) correlated with more education, more media items in the home, a history of HIV testing, and participant work outside of the home for women of both languages. HK-27 scores were independently associated with completion of HIV testing in multivariable analysis (per 1% correct: aOR:1.02, 95%CI:0.01-0.03, p = 0.01), but not with HIV infection. HK-27 is a reliable and valid measure of HIV knowledge among Portuguese and Echuabo-speaking Mozambican women. The HK-27 demonstrated significant knowledge deficits among women in the study, and higher scores were associated with higher HIV testing probability. Future studies should evaluate the role of the HK-27 in longitudinal studies and in other populations.
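The Kuder-Richardson (KR-20) coefficient mentioned above applies to items scored correct/incorrect. A minimal sketch follows, with a hypothetical function name `kr20` and made-up responses rather than HK-27 data; it uses population variances throughout, which is one common convention.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for rows of 0/1 item scores (one row per respondent)."""
    n = len(responses)
    k = len(responses[0])
    # Proportion answering each item correctly.
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    item_variance_sum = sum(pj * (1 - pj) for pj in p)
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical answers: 5 respondents, 4 items (1 = correct).
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(answers)
```

KR-20 is the special case of coefficient alpha for dichotomous items, so it is interpreted on the same scale as the alpha values reported elsewhere in this listing.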
Kosteniuk, Julie G; Wilson, Erin C; Penz, Kelly L; MacLeod, Martha L P; Stewart, Norma J; Kulig, Judith C; Karunanayake, Chandima P; Kilpatrick, Kelley
2016-01-01
To report the development and psychometric evaluation of a scale to measure rural and remote (rural/remote) nurses' perceptions of the engagement of their workplaces in key dimensions of primary health care (PHC). Amidst ongoing PHC reforms, a comprehensive instrument is needed to evaluate the degree to which rural/remote health care settings are involved in the key dimensions that characterize PHC delivery, particularly from the perspective of professionals delivering care. This study followed a three-phase process of instrument development and psychometric evaluation. A literature review and expert consultation informed instrument development in the first phase, followed by an iterative process of content evaluation in the second phase. In the final phase, a pilot survey was undertaken and item discrimination analysis employed to evaluate the internal consistency reliability of each subscale in the preliminary 60-item Primary Health Care Engagement (PHCE) Scale. The 60-item scale was subsequently refined to a 40-item instrument. The pilot survey sample included 89 nurses in current practice who had experience in rural/remote practice settings. Participants completed either a web-based or paper survey from September to December, 2013. Following item discrimination analysis, the 60-item instrument was refined to a 40-item PHCE Scale consisting of 10 subscales, each including three to five items. Alpha estimates of the 10 refined subscales ranged from 0.61 to 0.83, with seven of the subscales demonstrating acceptable reliability (α ⩾ 0.70). The refined 40-item instrument exhibited good internal consistency reliability (α=0.91). The 40-item PHCE Scale may be considered for use in future studies regardless of locale, to measure the extent to which health care professionals perceive their workplaces to be engaged in key dimensions of PHC.
NASA Astrophysics Data System (ADS)
Morlot, T.; Mathevet, T.; Perret, C.; Favre Pugin, A. C.
2014-12-01
Streamflow uncertainty estimation has recently received considerable attention in the literature. A dynamic rating curve assessment method has been introduced (Morlot et al., 2014). This dynamic method makes it possible to compute a rating curve for each gauging and a continuous streamflow time series, while calculating streamflow uncertainties. Streamflow uncertainty takes into account many sources of uncertainty (water level, rating curve interpolation and extrapolation, gauging aging, etc.) and produces an estimated distribution of streamflow for each day. In order to characterise streamflow uncertainty, a probabilistic framework has been applied to a large sample of hydrometric stations of the Division Technique Générale (DTG) of Électricité de France (EDF) hydrometric network (>250 stations) in France. A reliability diagram (Wilks, 1995) has been constructed for some stations, based on the streamflow distribution estimated for a given day and compared to a real streamflow observation estimated via a gauging. To build a reliability diagram, we computed the probability of an observed streamflow (gauging), given the streamflow distribution. The reliability diagram then makes it possible to check that the distribution of probabilities of non-exceedance of the gaugings follows a uniform law (i.e., quantiles should be equiprobable). Given the shape of the reliability diagram, the probabilistic calibration is characterised (underdispersion, overdispersion, bias) (Thyer et al., 2009). In this paper, we present case studies where reliability diagrams have different statistical properties for different periods. By comparison with our knowledge of the river bed morphology dynamics of these hydrometric stations, we show how the reliability diagram gives us invaluable information on river bed movements, such as continuous digging or backfilling of the hydraulic control due to erosion or sedimentation processes. Hence, the careful analysis of reliability diagrams makes it possible to reconcile statistics and long-term river bed morphology processes. This knowledge improves our real-time management of hydrometric stations, given a better characterisation of erosion/sedimentation processes and the stability of hydrometric station hydraulic control.
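The reliability-diagram construction described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each estimated streamflow distribution is assumed to be available as a list of samples, the plotting position (i + 1) / (n + 1) is an assumed convention, and all function names are hypothetical.

```python
def non_exceedance_probability(samples, observed):
    """Fraction of the estimated streamflow distribution at or below the gauging."""
    return sum(1 for s in samples if s <= observed) / len(samples)


def reliability_diagram(probabilities):
    """Pair sorted non-exceedance probabilities with uniform quantiles.

    Returns (theoretical_quantile, observed_probability) points. If the
    probabilities follow a uniform law, the points lie near the diagonal.
    """
    n = len(probabilities)
    ordered = sorted(probabilities)
    # Plotting position (i + 1) / (n + 1) is an assumed convention.
    return [((i + 1) / (n + 1), p) for i, p in enumerate(ordered)]


def max_departure(points):
    """Largest vertical distance between the diagram and the diagonal."""
    return max(abs(p - q) for q, p in points)


# Hypothetical example: one gauging against four sampled streamflow values.
p = non_exceedance_probability([1.0, 2.0, 3.0, 4.0], observed=2.5)
```

Computing the diagram separately for successive periods, as the case studies do, would amount to calling `reliability_diagram` on the gaugings of each period and comparing the departures from the diagonal.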
NASA Technical Reports Server (NTRS)
Mathur, F. P.
1972-01-01
Description of an on-line interactive computer program called CARE (Computer-Aided Reliability Estimation) which can model self-repair and fault-tolerant organizations and perform certain other functions. Essentially CARE consists of a repository of mathematical equations defining the various basic redundancy schemes. These equations, under program control, are then interrelated to generate the desired mathematical model to fit the architecture of the system under evaluation. The mathematical model is then supplied with ground instances of its variables and is then evaluated to generate values for the reliability-theoretic functions applied to the model.
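The abstract does not list the redundancy schemes stored in CARE. As a hedged illustration of what one such equation looks like once it is supplied with values for its variables, the sketch below evaluates the standard textbook reliability of a single (simplex) unit and of a triple modular redundant (TMR) arrangement with a perfect voter, assuming a constant failure rate. The function names are hypothetical and this is not CARE's code.

```python
import math


def simplex_reliability(failure_rate, t):
    """Reliability of one unit with a constant failure rate: R = exp(-rate * t)."""
    return math.exp(-failure_rate * t)


def tmr_reliability(failure_rate, t):
    """Reliability of triple modular redundancy with an assumed perfect voter.

    The system survives if at least two of three identical units survive:
    R_tmr = 3 R^2 - 2 R^3.
    """
    r = simplex_reliability(failure_rate, t)
    return 3 * r**2 - 2 * r**3


# Hypothetical values: failure rate per hour and mission time in hours.
r_single = simplex_reliability(0.001, 100.0)
r_tmr = tmr_reliability(0.001, 100.0)
```

Under these assumptions TMR improves on a single unit only while the unit reliability stays above 0.5, which is the kind of trade-off a model-generating tool lets a designer explore.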
Examples of Nonconservatism in the CARE 3 Program
NASA Technical Reports Server (NTRS)
Dotson, Kelly J.
1988-01-01
This paper presents parameter regions in the CARE 3 (Computer-Aided Reliability Estimation version 3) computer program where the program overestimates the reliability of a modeled system without warning the user. Five simple models of fault-tolerant computer systems are analyzed, and the parameter regions where reliability is overestimated are given. The source of the error in the reliability estimates for models which incorporate transient fault occurrences was not readily apparent. However, the source of much of the error for models with permanent and intermittent faults can be attributed to the choice of values for the run-time parameters of the program.
Estimation of sex from the lower limb measurements of Sudanese adults.
Ahmed, Altayeb Abdalla
2013-06-10
Sex estimation from mutilated and amputated limbs or body parts is one of the most vital steps in personal identification in medico-legal autopsies. Sex estimation from lower limb anthropometric measurements has demonstrated a high degree of expected accuracy in a limited range of the global population. The aims of this study were to assess the degree of sexual dimorphism in lower limb measurements and the accuracy of these measurements for sex estimation in a contemporary adult Sudanese population. The tibial length, bimalleolar breadth, foot length, and foot breadth of 240 right-handed Sudanese Arab subjects (120 males and 120 females) aged between 25 and 30 years were measured following international anthropometric standards. Demarking points, sexual dimorphism indices and discriminant functions were developed from 200 subjects (100 males and 100 females) who comprised the study group. All variables were sexually dimorphic. The bimalleolar breadth and foot breadth significantly contributed to sex estimation. Leg dimensions showed a higher accuracy for sex estimation than foot dimensions. Cross-validated sex classification accuracy ranged between 78% and 89.5%. The reliability of these standards was assessed in a test sample of 20 males and 20 females, and the results showed accuracy between 75% and 90%. This study provides new forensic standards for sex estimation from lower limb measurements of Sudanese adults. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Dichter, Martin Nikolaus; Schwab, Christian G G; Meyer, Gabriele; Bartholomeyczik, Sabine; Halek, Margareta
2016-02-01
For people with dementia, the concept of quality of life (Qol) reflects the disease's impact on the whole person. Thus, Qol is an increasingly used outcome measure in dementia research. This systematic review was performed to identify available dementia-specific Qol measurements and to assess the quality of linguistic validations and reliability studies of these measurements (PROSPERO 2013: CRD42014008725). The MEDLINE, CINAHL, EMBASE, PsycINFO, and Cochrane Methodology Register databases were systematically searched without any date restrictions. Forward and backward citation tracking were performed on the basis of selected articles. A total of 70 articles addressing 19 dementia-specific Qol measurements were identified; nine measurements were adapted to nonorigin countries. The quality of the linguistic validations varied from insufficient to good. Internal consistency was the most frequently tested reliability property. Most of the reliability studies lacked internal validity. Qol measurements for dementia are insufficiently validated linguistically and not well tested for reliability. None of the identified measurements can be recommended without further research. The application of international guidelines and quality criteria is strongly recommended for the performance of linguistic validations and reliability studies of dementia-specific Qol measurements. Copyright © 2016 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Helms, LuAnn Sherbeck
This paper discusses the fact that reliability is about scores and not tests and how reliability limits effect sizes. The paper also explores the classical reliability coefficients of stability, equivalence, and internal consistency. Stability is concerned with how stable test scores will be over time, while equivalence addresses the relationship…
Mehta, Saurabh P; MacDermid, Joy C; Richardson, Julie; MacIntyre, Norma J; Grewal, Ruby
2015-01-01
Clinical measurement. This study examined test-retest reliability and convergent/divergent construct validity of selected tests and measures that assess balance impairment, fear of falling (FOF), impaired physical activity (PA), and lower extremity muscle strength (LEMS) in females >45 years of age after distal radius fracture (DRF). Twenty-one female participants with DRF were assessed on two occasions. Timed Up and Go, Functional Reach, and One Leg Standing tests assessed balance impairment. Shortened Falls Efficacy Scale, Activity-specific Balance Confidence scale, and Fall Risk Perception Questionnaire assessed FOF. International Physical Activity Questionnaire and Rapid Assessment of Physical Activity were administered to assess PA level. Chair stand test and isometric muscle strength testing for hip and knee assessed LEMS. Intraclass correlation coefficients (ICC) examined the test-retest reliability of the measures. Pearson correlation coefficients (r) examined concurrent relationships between the measures. The results demonstrated fair to excellent test-retest reliability (ICC between 0.50 and 0.96) and low to moderate concordance between the measures (low if r ≤ 0.4; moderate if r = 0.4-0.7). The results provide preliminary estimates of test-retest reliability and convergent/divergent construct validity of selected measures associated with increased risk for falling in females >45 years of age after DRF. Further research directions to advance knowledge regarding fall risk assessment in the DRF population have been identified. Copyright © 2015 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
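The abstract does not state which ICC form was used. As a hedged illustration of a test-retest computation, the sketch below implements the one-way random-effects, single-measurement form, ICC(1,1), for participants each assessed on the same number of occasions. The function name and the example values are hypothetical and are not study data.

```python
def icc_one_way(ratings):
    """ICC(1,1) from rows of participants and columns of occasions."""
    n = len(ratings)     # number of participants
    k = len(ratings[0])  # number of occasions per participant
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-participant mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-participant mean square (occasion-to-occasion variation).
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical Timed Up and Go times in seconds on two occasions.
example = [[8.1, 8.4], [10.2, 9.9], [12.5, 12.0], [9.0, 9.3]]
icc = icc_one_way(example)
```

Other ICC forms (for example two-way models that separate a systematic occasion effect) use different mean squares and can give different values on the same data, so the form should be reported alongside the estimate.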