sample assessment items: Topics by Science.gov

Sample records for sample assessment items

Psychological distress in cancer survivors: the further development of an item bank.

PubMed

Smith, Adam B; Armes, Jo; Richardson, Alison; Stark, Dan P

2013-02-01

Assessment of psychological distress by patient report is necessary to meet patients' needs throughout the cancer journey. We have previously developed an item bank to assess psychological distress but not evaluated it for cancer survivors. Our first aim in this study was to test whether we could extend our item bank to include cancer survivors. The second aim was to examine whether the item bank could assess positive affect as a single construct alongside negative psychological symptoms. Responses from 1315 cancer survivors to the Hospital Anxiety and Depression Scale (HADS) and the Positive and Negative Affect Scale (PANAS) were considered for inclusion in a pre-existing item bank created from a heterogeneous sample of 4914 cancer patients. Differential item functioning (DIF) was used to assess whether HADS responses drawn from the two samples were equivalent. Common-item equating was used to anchor the shared (HADS) items, whilst the PANAS items were added. Item fit was evaluated at each stage, and misfitting items were removed. Unidimensionality was assessed with a principal components factor analysis. The DIF analysis did not reveal any differences between the HADS item locations from the two samples. Three misfitting PANAS items were removed, resulting in a final unidimensional bank of 80 items with good internal reliability (α = 0.85). The new item bank is valid for use across the cancer journey, including cancer survivors, and modestly improves the assessment of all levels of psychological distress and positive psychological function. Copyright © 2011 John Wiley & Sons, Ltd.
Assessing cross-cultural validity of scales: a methodological review and illustrative example.

PubMed

Beckstead, Jason W; Yang, Chiu-Yueh; Lengacher, Cecile A

2008-01-01

In this article, we assessed the cross-cultural validity of the Women's Role Strain Inventory (WRSI), a multi-item instrument that assesses the degree of strain experienced by women who juggle the roles of working professional, student, wife and mother. Cross-cultural validity is evinced by demonstrating the measurement invariance of the WRSI. Measurement invariance is the extent to which items of multi-item scales function in the same way across different samples of respondents. We assessed measurement invariance by comparing a sample of working women in Taiwan with a similar sample from the United States. Structural equation models (SEMs) were employed to determine the invariance of the WRSI and to estimate the unique validity variance of its items. This article also provides nurse-researchers with the necessary underlying measurement theory and illustrates how SEMs may be applied to assess cross-cultural validity of instruments used in nursing research. Overall performance of the WRSI was acceptable but our analysis showed that some items did not display invariance properties across samples. Item analysis is presented and recommendations for improving the instrument are discussed.
Differential Item Functioning by Gender on a Large-Scale Science Performance Assessment: A Comparison across Grade Levels.

ERIC Educational Resources Information Center

Holweger, Nancy; Taylor, Grace

The fifth-grade and eighth-grade science items on a state performance assessment were compared for differential item functioning (DIF) due to gender. The grade 5 sample consisted of 8,539 females and 8,029 males and the grade 8 sample consisted of 7,477 females and 7,891 males. A total of 30 fifth grade items and 26 eighth grade items were…
Use of Matrix Sampling Procedures to Assess Achievement in Solving Open Addition and Subtraction Sentences.

ERIC Educational Resources Information Center

Montague, Margariete A.

This study investigated the feasibility of concurrently and randomly sampling examinees and items in order to estimate group achievement. Seven 32-item tests reflecting a 640-item universe of simple open sentences were used such that item selection (random, systematic) and assignment (random, systematic) of items (four, eight, sixteen) to forms…
Personality Assessment Inventory scale characteristics and factor structure in the assessment of alcohol dependency.

PubMed

Schinka, J A

1995-02-01

Individual scale characteristics and the inventory structure of the Personality Assessment Inventory (PAI; Morey, 1991) were examined by conducting internal consistency and factor analyses of item and scale score data from a large group (N = 301) of alcohol-dependent patients. Alpha coefficients, mean inter-item correlations, and corrected item-total scale correlations for the sample paralleled values reported by Morey for a large clinical sample. Minor differences in the scale factor structure of the inventory from Morey's clinical sample were found. Overall, the findings support the use of the PAI in the assessment of personality and psychopathology of alcohol-dependent patients.
Assessment of Genetics Understanding. Under What Conditions Do Situational Features Have an Impact on Measures?

NASA Astrophysics Data System (ADS)

Schmiemann, Philipp; Nehm, Ross H.; Tornabene, Robyn E.

2017-12-01

Understanding how situational features of assessment tasks impact reasoning is important for many educational pursuits, notably the selection of curricular examples to illustrate phenomena, the design of formative and summative assessment items, and determination of whether instruction has fostered the development of abstract schemas divorced from particular instances. The goal of our study was to employ an experimental research design to quantify the degree to which situational features impact inferences about participants' understanding of Mendelian genetics. Two participant samples from different educational levels and cultural backgrounds (high school, n = 480; university, n = 444; Germany and USA) were used to test for context effects. A multi-matrix test design was employed, and item packets differing in situational features (e.g., plant, animal, human, fictitious) were randomly distributed to participants in the two samples. Rasch analyses of participant scores from both samples produced good item fit, person reliability, and item reliability and indicated that the university sample displayed stronger performance on the items compared to the high school sample. We found, surprisingly, that in both samples, no significant differences in performance occurred among the animal, plant, and human item contexts, or between the fictitious and "real" item contexts. In the university sample, we were also able to test for differences in performance between genders, among ethnic groups, and by prior biology coursework. None of these factors had a meaningful impact upon performance or context effects. Thus some, but not all, types of genetics problem solving or item formats are impacted by situational features.
Influence of item distribution pattern and abundance on efficiency of benthic core sampling

USGS Publications Warehouse

Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.

2014-01-01

Core sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm²), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m²). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small diameter core samples was always more time-efficient than taking fewer large diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.
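The kind of simulation this abstract describes can be illustrated with a minimal sketch. It is not the authors' GIS workflow: it is a plain-Python stand-in that assumes a 1 m × 1 m plot, circular cores, and Gaussian clumps, and every function name and parameter value below is hypothetical. Detection probability is simplified here to the fraction of cores that capture at least one item.

```python
import math
import random


def random_items(n_items, rng):
    """Scatter items uniformly over the unit square (assumed 1 m x 1 m plot)."""
    return [(rng.random(), rng.random()) for _ in range(n_items)]


def clumped_items(n_items, n_clumps, spread, rng):
    """Scatter items around random clump centres (Gaussian spread, wrapped into the plot)."""
    centres = [(rng.random(), rng.random()) for _ in range(n_clumps)]
    items = []
    for _ in range(n_items):
        cx, cy = rng.choice(centres)
        items.append(((cx + rng.gauss(0.0, spread)) % 1.0,
                      (cy + rng.gauss(0.0, spread)) % 1.0))
    return items


def take_cores(items, n_cores, radius, rng):
    """Return the number of items captured by each randomly placed circular core."""
    counts = []
    for _ in range(n_cores):
        cx = rng.uniform(radius, 1.0 - radius)
        cy = rng.uniform(radius, 1.0 - radius)
        counts.append(sum(1 for x, y in items
                          if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2))
    return counts


def summarize(counts, radius):
    """Return (estimated items per m^2, fraction of cores holding >= 1 item)."""
    core_area = math.pi * radius ** 2
    density = sum(counts) / (len(counts) * core_area)
    detection = sum(1 for c in counts if c > 0) / len(counts)
    return density, detection


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    for label, items in [("random", random_items(2000, rng)),
                         ("clumped", clumped_items(2000, 10, 0.03, rng))]:
        density, detection = summarize(take_cores(items, 500, 0.02, rng), 0.02)
        print(label, round(density), round(detection, 2))
```

Varying the number of cores and the core radius in such a sketch is one way to explore the trade-off between sample size and core size that the abstract reports.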
Development of a Self-Report Physical Function Instrument for Disability Assessment: Item Pool Construction and Factor Analysis

PubMed Central

McDonough, Christine M.; Jette, Alan M.; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M.; Rasch, Elizabeth K.

2014-01-01

Objectives: To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Design: Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. Setting: In-person and semi-structured interviews; internet and telephone surveys. Participants: A sample of 1,017 SSA claimants, and a normative sample of 999 adults from the US general population. Interventions: Not Applicable. Main Outcome Measure: Model fit statistics. Results: The final item pool consisted of 139 items. Within the claimant sample 58.7% were white; 31.8% were black; 46.6% were female; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution which included more items and allowed separate characterization of: 1) Changing and Maintaining Body Position, 2) Whole Body Mobility, 3) Upper Body Function and 4) Upper Extremity Fine Motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples respectively were: Comparative Fit Index = 0.93 and 0.98; Tucker-Lewis Index = 0.92 and 0.98; Root Mean Square Error Approximation = 0.05 and 0.04. Conclusions: The factor structure of the Physical Function item pool closely resembled the hypothesized content model. The four scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. PMID:23542402
Development of a self-report physical function instrument for disability assessment: item pool construction and factor analysis.

PubMed

McDonough, Christine M; Jette, Alan M; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M; Rasch, Elizabeth K

2013-09-01

To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. In-person and semistructured interviews and Internet and telephone surveys. Sample of SSA claimants (n=1017) and a normative sample of adults from the U.S. general population (n=999). Not applicable. Model fit statistics. The final item pool consisted of 139 items. Within the claimant sample, 58.7% were white; 31.8% were black; 46.6% were women; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution, which included more items and allowed separate characterization of: (1) changing and maintaining body position, (2) whole body mobility, (3) upper body function, and (4) upper extremity fine motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples, respectively, were: Comparative Fit Index=.93 and .98; Tucker-Lewis Index=.92 and .98; and root mean square error approximation=.05 and .04. The factor structure of the physical function item pool closely resembled the hypothesized content model. The 4 scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
The PROMIS fatigue item bank has good measurement properties in patients with fibromyalgia and severe fatigue.

PubMed

Yost, Kathleen J; Waller, Niels G; Lee, Minji K; Vincent, Ann

2017-06-01

Efficient management of fibromyalgia (FM) requires precise measurement of FM-specific symptoms. Our objective was to assess the measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) fatigue item bank (FIB) in people with FM. We applied classical psychometric and item response theory methods to cross-sectional PROMIS-FIB data from two samples. Data on the clinical FM sample were obtained at a tertiary medical center. Data for the U.S. general population sample were obtained from the PROMIS network. The full 95-item bank was administered to both samples. We investigated dimensionality of the item bank in both samples by separately fitting a bifactor model with two group factors: experience and impact. We assessed measurement invariance between samples, and we explored an alternate factor structure with the normative sample and subsequently confirmed that structure in the clinical sample. Finally, we assessed whether reporting FM subdomain scores added value over reporting a single total score. The item bank was dominated by a general fatigue factor. The fit of the initial bifactor model and evidence of measurement invariance indicated that the same constructs were measured across the samples. An alternative bifactor model with three group factors demonstrated slightly improved fit. Subdomain scores add value over a total score. We demonstrated that the PROMIS-FIB is appropriate for measuring fatigue in clinical samples of FM patients. The construct can be presented by a single score; however, subdomain scores for the three group factors identified in the alternative model may also be reported.
Refining and validating the Social Interaction Anxiety Scale and the Social Phobia Scale.

PubMed

Carleton, R Nicholas; Collimore, Kelsey C; Asmundson, Gordon J G; McCabe, Randi E; Rowa, Karen; Antony, Martin M

2009-01-01

The Social Interaction Anxiety Scale and Social Phobia Scale are companion measures for assessing symptoms of social anxiety and social phobia. The scales have good reliability and validity across several samples; however, exploratory and confirmatory factor analyses have yielded solutions comprising substantially different item content and factor structures. These discrepancies are likely the result of analyzing items from each scale separately or simultaneously. The current investigation sets out to assess items from those scales, both simultaneously and separately, using exploratory and confirmatory factor analyses in an effort to resolve the factor structure. Participants consisted of a clinical sample (n = 353; 54% women) and an undergraduate sample (n = 317; 75% women) who completed the Social Interaction Anxiety Scale and Social Phobia Scale, along with additional fear-related measures to assess convergent and discriminant validity. A three-factor solution with a reduced set of items was found to be most stable, irrespective of whether the items from each scale are assessed together or separately. Items from the Social Interaction Anxiety Scale represented one factor, whereas items from the Social Phobia Scale represented two other factors. Initial support for scale and factor validity, along with implications and recommendations for future research, is provided. (c) 2009 Wiley-Liss, Inc.
Methodology for the development and calibration of the SCI-QOL item banks

PubMed Central

Tulsky, David S.; Kisala, Pamela A.; Victorson, David; Choi, Seung W.; Gershon, Richard; Heinemann, Allen W.; Cella, David

2015-01-01

Objective: To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Methods: Individual interviews (n = 44) and focus groups (n = 65 individuals with SCI and n = 42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n = 877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n = 245) to assess test-retest reliability and stability. Participants and Procedures: A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. Results: We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury – Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. Conclusions: The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM. PMID:26010963
Methodology for the development and calibration of the SCI-QOL item banks.

PubMed

Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

2015-05-01

To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.
Developing a model of competence in the operating theatre: psychometric validation of the perceived perioperative competence scale-revised.

PubMed

Gillespie, Brigid M; Polit, Denise F; Hamlin, Lois; Chaboyer, Wendy

2012-01-01

This paper describes the development and validation of the Revised Perioperative Competence Scale (PPCS-R). There is a lack of psychometrically sound, tested self-assessment tools to measure nurses' perceived competence in the operating room. Content validity was established by a panel of international experts and the original 98-item scale was pilot tested with 345 nurses in Queensland, Australia. Following the removal of several items, a national sample that included all 3209 nurses who were members of the Australian College of Operating Room Nurses was surveyed using the 94-item version. Psychometric testing assessed content validity using exploratory factor analysis, internal consistency using Cronbach's alpha, and construct validity using the "known groups" technique. During item reduction, several preliminary factor analyses were performed on two random halves of the sample (n=550). Usable data for psychometric assessment were obtained from 1122 nurses. The original 94-item scale was reduced to 40 items. The final factor analysis using the entire sample resulted in a 40-item, six-factor solution. Cronbach's alpha for the 40-item scale was .96. Construct validation demonstrated significant differences (p<.0001) in perceived competence scores relative to years of operating room experience and receipt of specialty education. On the basis of these results, the psychometric properties of the PPCS-R were considered encouraging. Further testing of the tool in different samples of operating room nurses is necessary to enable cross-cultural comparisons. Copyright © 2011 Elsevier Ltd. All rights reserved.
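Several records in this listing report internal consistency as Cronbach's alpha. As a minimal sketch of how that coefficient is computed from a respondents-by-items score table, using the standard formula alpha = k/(k-1) × (1 - sum of item variances / variance of total scores): the function name and the toy data are hypothetical and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Toy data (hypothetical): four respondents answering two items.
    toy = [[1, 2], [2, 1], [3, 3], [4, 5]]
    print(round(cronbach_alpha(toy), 3))
```

The sketch assumes complete data with at least two items and two respondents; real analyses also need a policy for missing responses.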
A Multilevel Testlet Model for Dual Local Dependence

ERIC Educational Resources Information Center

Jiao, Hong; Kamata, Akihito; Wang, Shudong; Jin, Ying

2012-01-01

The applications of item response theory (IRT) models assume local item independence and that examinees are independent of each other. When a representative sample for psychometric analysis is selected using a cluster sampling method in a testlet-based assessment, both local item dependence and local person dependence are likely to be induced.…
The Assessment of Physiotherapy Practice (APP) is a valid measure of professional competence of physiotherapy students: a cross-sectional study with Rasch analysis.

PubMed

Dalton, Megan; Davidson, Megan; Keating, Jenny

2011-01-01

Is the Assessment of Physiotherapy Practice (APP) a valid instrument for the assessment of entry-level competence in physiotherapy students? Cross-sectional study with Rasch analysis of initial (n=326) and validation samples (n=318). Students were assessed on completion of 4, 5, or 6-week clinical placements across one university semester. 298 clinical educators and 456 physiotherapy students at nine universities in Australia and New Zealand provided 644 completed APP instruments. APP data in both samples showed overall fit to a Rasch model of expected item functioning for interval scale measurement. Item 6 (Written communication) exhibited misfit in both samples, but was retained as an important element of competence. The hierarchy of item difficulty was the same in both samples with items related to professional behaviour and communication the easiest to achieve and items related to clinical reasoning the most difficult. Item difficulty was well targeted to person ability. No Differential Item Functioning was identified, indicating that the scale performed in a comparable way regardless of the student's age, gender or amount of prior clinical experience, and the educator's age, gender, or experience as an educator, or the type of facility, university, or clinical area. The instrument demonstrated unidimensionality confirming the appropriateness of summing the scale scores on each item to provide an overall score of clinical competence and was able to discriminate four levels of professional competence (Person Separation Index=0.96). Person ability and raw APP scores had a linear relationship (r²=0.99). Rasch analysis supports the interpretation that a student's APP score is an indication of their underlying level of professional competence in workplace practice. Copyright © 2011 Australian Physiotherapy Association. Published by .. All rights reserved.
Reliability and validity of a scale for health-promoting schools.

PubMed

Lee, Eun Young; Shin, Young-Jeon; Choi, Bo Youl; Cho, Ho Soon Michelle

2014-12-01

Despite a growing body of research regarding the health-promoting schools (HPS) concept from the World Health Organization (WHO), research on measuring of the HPS is limited. This study aims to develop a scale for assessing the status of the HPS based on the WHO guidelines and to evaluate the reliability and validity of the scale. After completing the translation and back-translation process, the content validity of the 50-item scale for HPS (SHPS) was assessed by an expert committee review and pretested with 17 teachers. A stratified, random sampling design was used. A total of 728 teachers from 94 schools completed a self-administered questionnaire. The total sample was randomly divided into three groups for exploratory factor analysis (EFA), confirmatory factor analysis (CFA) and cross-validation. The EFA suggested seven factors, including 37 items, and the CFA confirmed these factors. In a second-order factor analysis, the second-order seven-factor model had acceptable fit indices (root mean square error of approximation 0.07, comparative fit index 0.98) with stability over validation sample and whole sample. Thus, the first-order seven factors (school nutrition services [three-item, α = 0.87], healthy school policies [six-item, α = 0.87], school's physical environment [10-item, α = 0.91], school's social environment [four-item, α = 0.88], community links [six-item, α = 0.91], individual health skills and action competencies [three-item, α = 0.89], and health services [five-item, α = 0.86]) loaded significantly onto the second-order factor (HPS [37-item, α = 0.97]). In conclusion, the SHPS is a reliable and valid measurement tool for assessing the states of the HPS in the Korean school context. It will be useful for comprehensively assessing schools' needs and monitoring the progress of school health interventions. © The Author (2013). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Development and validation of the Perceived Food Environment Questionnaire in a French-Canadian population.

PubMed

Carbonneau, Elise; Robitaille, Julie; Lamarche, Benoît; Corneau, Louise; Lemieux, Simone

2017-08-01

The present study aimed to develop and validate a questionnaire assessing perceived food environment in a French-Canadian population. A questionnaire, the Perceived Food Environment Questionnaire, was developed assessing perceived accessibility to healthy (nine items) and unhealthy foods (three items). A pre-test sample was recruited for a pilot testing of the questionnaire. For the validation study, another sample was recruited and completed the questionnaire twice. Exploratory factor analysis was performed on the items to assess the number of factors (subscales). Cronbach's α was used to measure internal consistency reliability. Test-retest reliability was assessed with Pearson correlations. Online survey. Men and women from the Québec City area (n 31 in the pre-test sample; n 150 in the validation study sample). The pilot testing did not lead to any change in the questionnaire. The exploratory factor analysis revealed a two-subscale structure. The first subscale is composed of six items assessing accessibility to healthy foods and the second includes three items related to accessibility to unhealthy foods. Three items were removed from the questionnaire due to low loading on the two subscales. The subscales demonstrated adequate internal consistency (Cronbach's α=0·77 for healthy foods and 0·62 for unhealthy foods) and test-retest reliability (r=0·59 and 0·60, respectively; both P<0·0001). The Perceived Food Environment Questionnaire was developed for a French-Canadian population and demonstrated good psychometric properties. Further validation is recommended if the questionnaire is to be used in other populations.
Development and preliminary testing of a computerized adaptive assessment of chronic pain.

PubMed

Anatchkova, Milena D; Saris-Baglama, Renee N; Kosinski, Mark; Bjorner, Jakob B

2009-09-01

The aim of this article is to report the development and preliminary testing of a prototype computerized adaptive test of chronic pain (CHRONIC PAIN-CAT) conducted in 2 stages: (1) evaluation of various item selection and stopping rules through real data-simulated administrations of CHRONIC PAIN-CAT; (2) a feasibility study of the actual prototype CHRONIC PAIN-CAT assessment system conducted in a pilot sample. Item calibrations developed from a US general population sample (N = 782) were used to program a pain severity and impact item bank (k = 45), and real data simulations were conducted to determine a CAT stopping rule. The CHRONIC PAIN-CAT was programmed on a tablet PC using QualityMetric's Dynamic Health Assessment (DYNHA) software and administered to a clinical sample of pain sufferers (n = 100). The CAT was completed in significantly less time than the static (full item bank) assessment (P < .001). On average, 5.6 items were dynamically administered by CAT to achieve a precise score. Scores estimated from the 2 assessments were highly correlated (r = .89), and both assessments discriminated across pain severity levels (P < .001, RV = .95). Patients' evaluations of the CHRONIC PAIN-CAT were favorable. This report demonstrates that the CHRONIC PAIN-CAT is feasible for administration in a clinic. The application has the potential to improve pain assessment and help clinicians manage chronic pain.
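The abstract mentions item selection and stopping rules without specifying them. As a rough illustration of what such rules can look like, the sketch below assumes a dichotomous two-parameter logistic (2PL) model, maximum-information item selection, and a standard-error stopping criterion. These are common textbook choices, not necessarily those used in the CHRONIC PAIN-CAT, and all names and parameter values are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Pick the not-yet-administered item with maximum information at theta."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


def standard_error(theta, bank, administered):
    """Approximate standard error of theta from the items given so far."""
    total = sum(item_information(theta, *bank[i]) for i in administered)
    return float("inf") if total == 0 else 1.0 / math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical bank of (discrimination a, difficulty b) pairs.
    bank = [(1.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
    first = select_next_item(0.0, bank, set())
    print(first, standard_error(0.0, bank, {first}))
```

A full adaptive test would re-estimate theta after each response and stop once the standard error falls below a chosen threshold or a maximum test length is reached.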
Development and validation of a professionalism assessment scale for medical students

PubMed Central

Klemenc-Ketis, Zalika; Vrecko, Helena

2014-01-01

Objectives: To develop and validate a scale for the assessment of professionalism in medical students based on students' perceptions of and attitudes towards professionalism in medicine. Methods: This was a mixed methods study with undergraduate medical students. Two focus groups were carried out with 12 students, followed by a transcript analysis (grounded theory method with open coding). Then, a 3-round Delphi with 20 family medicine experts was carried out. A psychometric assessment of the scale was performed with a group of 449 students. The items of the Professionalism Assessment Scale could be answered on a five-point Likert scale. Results: After the focus groups, the first version of the PAS consisted of 56 items and after the Delphi study, 30 items remained. The final sample for quantitative study consisted of 122 students (27.2% response rate). There were 95 (77.9%) female students in the sample. The mean age of the sample was 22.1 ± 2.1 years. After the principal component analysis, we removed 8 items and produced the final version of the PAS (22 items). The Cronbach's alpha of the scale was 0.88. Factor analysis revealed three factors: empathy and humanism, professional relationships and development and responsibility. Conclusions: The new Professionalism Assessment Scale proved to be valid and reliable. It can be used for the assessment of professionalism in undergraduate medical students. PMID:25382090

Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

PubMed

Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

2013-07-01

Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.
Innovative Application of a Multidimensional Item Response Model in Assessing the Influence of Social Desirability on the Pseudo-Relationship between Self-Efficacy and Behavior

ERIC Educational Resources Information Center

Watson, Kathy; Baranowski, Tom; Thompson, Debbe; Jago, Russell; Baranowski, Janice; Klesges, Lisa M.

2006-01-01

This study examined multidimensional item response theory (MIRT) modeling to assess social desirability (SocD) influences on self-reported physical activity self-efficacy (PASE) and fruit and vegetable self-efficacy (FVSE). The observed sample included 473 Houston-area adolescent males (10-14 years). SocD (nine items), PASE (19 items) and FVSE (21…
Metric equivalence assessment in cross-cultural research: using an example of the Center for Epidemiological Studies--Depression Scale.

PubMed

Kim, Miyong; Han, Hae-Ra; Phillips, Linda

2003-01-01

Metric equivalence is a quantitative way to assess cross-cultural equivalences of translated instruments by examining the patterns of psychometric properties based on cross-cultural data derived from both versions of the instrument. Metric equivalence checks at item and instrument levels can be used as a valuable tool to refine cross-cultural instruments. Korean and English versions of the Center for Epidemiological Studies-Depression Scale (CES-D) were administered to 154 Korean Americans and 151 Anglo Americans to illustrate approaches to assessing their metric equivalence. Inter-item and item-total correlations, Cronbach's alpha coefficients, and factor analysis were used for metric equivalence checks. The alpha coefficient was 0.85 for the Korean American sample and 0.92 for the Anglo American sample. Although all items of the CES-D surpassed the desirable minimum item-total correlation of 0.30 in the Anglo American sample, four items did not meet the standard in the Korean American sample. Differences in average inter-item correlations were also noted between the two groups (0.25 for Korean Americans and 0.37 for Anglo Americans). Factor analysis identified two factors for both groups, and factor loadings showed similar patterns and congruence coefficients. Results of the item analysis procedures suggest the possibility of bias in certain items that may influence the sensitivity of the Korean version of the CES-D. These item biases also provide a possible explanation for the alpha differences. Although factor loadings showed similar patterns for the Korean and English versions of the CES-D, factorial similarity alone is not sufficient for testing the universality of the structure underlying an instrument.
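The item-level checks named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' analysis: the function names are hypothetical, the item-total correlation is computed in its "corrected" form (item versus the sum of the other items) as an assumption, and the 20-item length of the CES-D is taken from the instrument's standard form. The standardized alpha formula shows how the reported mean inter-item correlations relate to the reported alphas.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the sum score
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

def corrected_item_total(scores):
    """Correlation of each item with the sum of the remaining items."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    return np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])

def standardized_alpha(k, mean_r):
    """Alpha implied by k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1.0 + (k - 1) * mean_r)

# Mean inter-item correlations reported in the abstract, with k = 20 items
# (assumed CES-D length): the implied alphas are roughly 0.87 and 0.92.
print(round(standardized_alpha(20, 0.25), 2))  # Korean American sample
print(round(standardized_alpha(20, 0.37), 2))  # Anglo American sample
```

The implied values are close to, though not identical with, the reported alphas of 0.85 and 0.92, which is consistent with the abstract's suggestion that weaker inter-item relations in the Korean version account for the alpha difference. Items whose `corrected_item_total` value falls below 0.30 would be the ones flagged for review.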
Adaptive screening for depression--recalibration of an item bank for the assessment of depression in persons with mental and somatic diseases and evaluation in a simulated computer-adaptive test environment.

PubMed

Forkmann, Thomas; Kroehne, Ulf; Wirtz, Markus; Norra, Christine; Baumeister, Harald; Gauggel, Siegfried; Elhan, Atilla Halil; Tennant, Alan; Boecker, Maren

2013-11-01

This study simulated computer-adaptive testing based on the Aachen Depression Item Bank (ADIB), which was developed for the assessment of depression in persons with somatic diseases. Prior to computer-adaptive test simulation, the ADIB was newly calibrated. Recalibration was performed in a sample of 161 patients treated for a depressive syndrome, 103 patients from cardiology, and 103 patients from otorhinolaryngology (mean age 44.1, SD=14.0; 44.7% female) and was cross-validated in a sample of 117 patients undergoing rehabilitation for cardiac diseases (mean age 58.4, SD=10.5; 24.8% women). Unidimensionality of the item bank was checked and a Rasch analysis was performed that evaluated local dependency (LD), differential item functioning (DIF), item fit and reliability. CAT simulation was conducted with the total sample and additional simulated data. Recalibration resulted in a strictly unidimensional item bank with 36 items, showing good Rasch model fit (item fit residuals<|2.5|) and no DIF or LD. CAT simulation revealed that 13 items on average were necessary to estimate depression in the range of -2 to +2 logits when terminating at SE≤0.32 and 4 items if using SE≤0.50. Receiver Operating Characteristics analysis showed that θ estimates based on the CAT algorithm have good criterion validity with regard to depression diagnoses (Area Under the Curve≥.78 for all cut-off criteria). The recalibration of the ADIB succeeded and the simulation studies conducted suggest that it has good screening performance in the samples investigated and that it may reasonably add to the improvement of depression assessment. © 2013.
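The idea of a CAT that terminates once the standard error (SE) of the trait estimate falls below a threshold can be sketched as follows. This is a toy simulation under loudly stated assumptions, not the ADIB algorithm: items are dichotomous Rasch items, scoring is expected a posteriori (EAP) with a standard normal prior, the next item is the unused one whose difficulty is closest to the current estimate, and the 60-item bank is hypothetical. Because dichotomous items carry less information than the items in the study, the item counts printed here are not comparable to the 13 and 4 items reported above.

```python
import numpy as np

GRID = np.linspace(-4.0, 4.0, 161)   # quadrature points for theta (logits)
PRIOR = np.exp(-0.5 * GRID ** 2)     # unnormalised N(0, 1) prior (assumption)

def rasch_p(theta, b):
    """Probability of endorsing an item of difficulty b under the Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def eap_estimate(b, x):
    """Posterior mean and SD of theta given difficulties b and 0/1 responses x."""
    p = rasch_p(GRID[:, None], b[None, :])
    like = np.prod(np.where(x[None, :] == 1, p, 1.0 - p), axis=1)
    post = like * PRIOR
    post /= post.sum()
    theta_hat = float((GRID * post).sum())
    se = float(np.sqrt((((GRID - theta_hat) ** 2) * post).sum()))
    return theta_hat, se

def simulate_cat(true_theta, difficulties, se_stop, rng):
    """Administer items adaptively until SE <= se_stop or the bank is exhausted."""
    b = np.asarray(difficulties, dtype=float)
    remaining = list(range(len(b)))
    used, responses = [], []
    theta_hat, se = 0.0, float("inf")
    while remaining and se > se_stop:
        # For Rasch items, information peaks where difficulty equals theta.
        j = min(remaining, key=lambda i: abs(b[i] - theta_hat))
        remaining.remove(j)
        used.append(j)
        responses.append(int(rng.random() < rasch_p(true_theta, b[j])))
        theta_hat, se = eap_estimate(b[used], np.array(responses))
    return theta_hat, se, len(used)

bank = np.linspace(-3.0, 3.0, 60)    # hypothetical item difficulties
for se_stop in (0.60, 0.45):
    est, se, n_items = simulate_cat(0.5, bank, se_stop, np.random.default_rng(1))
    print(f"stop at SE<={se_stop}: theta={est:.2f}, SE={se:.2f}, items={n_items}")
```

The stricter stopping rule always needs at least as many items as the looser one, which is the trade-off between test length and precision that the two SE thresholds in the abstract illustrate.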
EXTENDING THE FLOOR AND THE CEILING FOR ASSESSMENT OF PHYSICAL FUNCTION

PubMed Central

Fries, James F.; Lingala, Bharathi; Siemons, Liseth; Glas, Cees A. W.; Cella, David; Hussain, Yusra N; Bruce, Bonnie; Krishnan, Eswar

2014-01-01

Objective The objective of the current study was to improve the assessment of physical function by improving the precision of assessment at the floor (extremely poor function) and at the ceiling (extremely good health) of the health continuum. Methods Under the NIH PROMIS program, we developed new physical function floor and ceiling items to supplement the existing item bank. Using item response theory (IRT) and the standard PROMIS methodology, we developed 30 floor items and 26 ceiling items and administered them during a 12-month prospective observational study of 737 individuals at the extremes of health status. Change over time was compared across anchor instruments and across items by means of effect sizes. Using the observed changes in scores, we back-calculated sample size requirements for the new and comparison measures. Results We studied 444 subjects with chronic illness and/or extreme age, and 293 generally fit subjects including athletes in training. IRT analyses confirmed that the new floor and ceiling items outperformed reference items (p<0.001). The estimated post-hoc sample size requirements were reduced by a factor of two to four at the floor and a factor of two at the ceiling. Conclusion Extending the range of physical function measurement can substantially improve measurement quality, can reduce sample size requirements and improve research efficiency. The paradigm shift from Disability to Physical Function includes the entire spectrum of physical function, signals improvement in the conceptual base of outcome assessment, and may be transformative as medical goals more closely approach societal goals for health. PMID:24782194
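The link between larger effect sizes and smaller sample size requirements can be made explicit with a standard textbook approximation. This is an assumption for illustration, since the abstract does not state the formula used for the back-calculation. For a two-group comparison with standardized effect size d, two-sided significance level α, and power 1 − β, the per-group sample size is approximately:

```latex
n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}}{d^{2}},
\qquad
\frac{n_{\text{old}}}{n_{\text{new}}} = \left(\frac{d_{\text{new}}}{d_{\text{old}}}\right)^{2}
```

Under this approximation, halving the required sample size corresponds to an effect size about √2 ≈ 1.41 times larger, and a fourfold reduction corresponds to a doubled effect size.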
Gender-Based Differential Item Performance in Mathematics Achievement Items.

ERIC Educational Resources Information Center

Doolittle, Allen E.; Cleary, T. Anne

1987-01-01

Eight randomly equivalent samples of high school seniors were each given a unique form of the ACT Assessment Mathematics Usage Test (ACTM). Signed measures of differential item performance (DIP) were obtained for each item in the eight ACTM forms. DIP estimates were analyzed and a significant item category effect was found. (Author/LMO)
The Transition Readiness Assessment Questionnaire (TRAQ): its factor structure, reliability, and validity.

PubMed

Wood, David L; Sawicki, Gregory S; Miller, M David; Smotherman, Carmen; Lukens-Bull, Katryne; Livingood, William C; Ferris, Maria; Kraemer, Dale F

2014-01-01

National consensus statements recommend that providers regularly assess the transition readiness skills of adolescents and young adults (AYA). In 2010 we developed a 29-item version of the Transition Readiness Assessment Questionnaire (TRAQ). We reevaluated item performance and factor structure, and reassessed the TRAQ's reliability and validity. We surveyed youth from 3 academic clinics in Jacksonville, Florida; Chapel Hill, North Carolina; and Boston, Massachusetts. Participants were AYA with special health care needs aged 14 to 21 years. From a convenience sample of 306 patients, we conducted item reduction strategies and exploratory factor analysis (EFA). On a second convenience sample of 221 patients, we conducted confirmatory factor analysis (CFA). Internal reliability was assessed by Cronbach's alpha and criterion validity. Analyses were conducted by the Wilcoxon rank sum test and mixed linear models. The item reduction and EFA resulted in a 20-item scale with 5 identified subscales. The CFA conducted on a second sample provided a good fit to the data. The overall scale has high reliability (Cronbach's alpha = .94) and good reliability for 4 of the 5 subscales (Cronbach's alpha ranging from .90 to .77 in the pooled sample). Each of the 5 subscale scores was significantly higher for adolescents aged 18 years and older versus those younger than 18 (P < .0001) in both univariate and multivariate analyses. The 20-item, 5-factor structure for the TRAQ is supported by EFA and CFA on independent samples and has good internal reliability and criterion validity. Additional work is needed to expand or revise the TRAQ subscales and test their predictive validity. Copyright © 2014 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
Validation of the MOS Social Support Survey 6-item (MOS-SSS-6) measure with two large population-based samples of Australian women.

PubMed

Holden, Libby; Lee, Christina; Hockey, Richard; Ware, Robert S; Dobson, Annette J

2014-12-01

This study aimed to validate a 6-item 1-factor global measure of social support developed from the Medical Outcomes Study Social Support Survey (MOS-SSS) for use in large epidemiological studies. Data were obtained from two large population-based samples of participants in the Australian Longitudinal Study on Women's Health. The two cohorts were aged 53-58 and 28-33 years at data collection (N = 10,616 and 8,977, respectively). Items selected for the 6-item 1-factor measure were derived from the factor structure obtained from unpublished work using an earlier wave of data from one of these cohorts. Descriptive statistics, including polychoric correlations, were used to describe the abbreviated scale. Cronbach's alpha was used to assess internal consistency and confirmatory factor analysis to assess scale validity. Concurrent validity was assessed using correlations between the new 6-item version and established 19-item version, and other concurrent variables. In both cohorts, the new 6-item 1-factor measure showed strong internal consistency and scale reliability. It had excellent goodness-of-fit indices, similar to those of the established 19-item measure. Both versions correlated similarly with concurrent measures. The 6-item 1-factor MOS-SSS measures global functional social support with fewer items than the established 19-item measure.
A Time and Place for Everything: Developmental Differences in the Building Blocks of Episodic Memory

PubMed Central

Lee, Joshua K.; Wendelken, J. Carter; Bunge, Silvia A.; Ghetti, Simona

2015-01-01

This research investigated whether episodic memory development can be explained by improvements in relational binding processes, involved in forming novel associations between events and the context in which they occurred. Memory for item-space, item-time, and item-item relations was assessed in an ethnically diverse sample of 151 children aged 7 to 11 years and 28 young adults. Item-space memory reached adult performance by 9½ years, whereas item-time and item-item memory improved into adulthood. In path analysis, item-space, but not item-time, best explained item-item memory. Across age groups, relational binding related to source memory and performance on standardized memory assessments. In conclusion, relational binding development depends on relation type, but relational binding overall supports episodic memory development. PMID:26493950
Scale Comparability between Nonaccommodated and Accommodated Forms of a Statewide High School Assessment: Assessment Using "l[subscript z]" Person-Fit

ERIC Educational Resources Information Center

Seo, Dong Gi; Hao, Shiqi

2016-01-01

Differential item/test functioning (DIF/DTF) are routine procedures to detect item/test unfairness as an explanation for group performance difference. However, unequal sample sizes and small sample sizes have an impact on the statistical power of the DIF/DTF detection procedures. Furthermore, DIF/DTF cannot be used for two test forms without…
Development and evaluation of a brief screener to estimate fast-food and beverage consumption among adolescents.

PubMed

Nelson, Melissa C; Lytle, Leslie A

2009-04-01

Sweetened beverage and fast-food intake have been identified as important targets for obesity prevention. However, there are few brief dietary assessment tools available to evaluate these behaviors among adolescents. The objective of this research was to examine reliability and validity of a 22-item dietary screener assessing adolescent consumption of specific energy-containing and non-energy-containing beverages (nine items) and fast food (13 items). The screener was administered to adolescents (ages 11 to 18 years) recruited from the Minneapolis/St Paul, MN, metro region. One sample of adolescents completed test-retest reliability of the screener (n=33, primarily white adolescents). Another adolescent sample completed the screener along with three 24-hour dietary recalls to assess criterion validity (n=59 white adolescents). Test-retest assessments were completed approximately 7 to 14 days apart, and agreement between the two administrations of the screener was substantial, with most items yielding Spearman correlations and kappa statistics that were >0.60. When compared to the gold standard dietary recall data, findings indicate that the validity of the screener items assessing adolescents' intake of regular soda, sports drinks, milk, and water was fair. However, the differential assessment periods captured by the two methods (ie, 1 month for the screener vs 3 days for the recalls) posed challenges in analysis and made it impossible to assess the validity of some screener items. Overall, while these screener items largely represent reliable measures with fair validity, our findings highlight the challenges inherent in the validation of brief dietary assessment tools.
Psychometric properties of the Communication Confidence Rating Scale for Aphasia (CCRSA): phase 1.

PubMed

Cherney, Leora R; Babbitt, Edna M; Semik, Patrick; Heinemann, Allen W

2011-01-01

Confidence is a construct that has not been explored previously in aphasia research. We developed the Communication Confidence Rating Scale for Aphasia (CCRSA) to assess confidence in communicating in a variety of activities and evaluated its psychometric properties using rating scale (Rasch) analysis. The CCRSA was administered to 21 individuals with aphasia before and after participation in a computer-based language therapy study. Person reliability of the 8-item CCRSA was .77. The 5-category rating scale demonstrated monotonic increases in average measures from low to high ratings. However, one item ("I follow news, sports, stories on TV/movies") misfit the construct defined by the other items (mean square infit = 1.69, item-measure correlation = .41). Deleting this item improved reliability to .79; the 7 remaining items demonstrated excellent fit to the underlying construct, although there was a modest ceiling effect in this sample. Pre- to posttreatment changes on the 7-item CCRSA measure were statistically significant using a paired samples t test. Findings support the reliability and sensitivity of the CCRSA in assessing participants' self-report of communication confidence. Further evaluation of communication confidence is required with larger and more diverse samples.
Psychometric properties and measurement invariance of the Beck hopelessness scale (BHS): results from a German representative population sample.

PubMed

Kliem, Sören; Lohmann, Anna; Mößle, Thomas; Brähler, Elmar

2018-04-25

The Beck Hopelessness Scale (BHS) has been the most frequently used instrument for the measurement of hopelessness in the past 40 years. Only recently has it officially been translated into German. The psychometric properties and factor structure of the BHS have been cause for intensive debate in the past. Based on a representative sample of the German population (N = 2450) item analysis including item sensitivity, item-total correlation and item difficulty was performed. Confirmatory factor analyses (CFA) for several factor solutions from the literature were performed. Multiple group factor analysis was performed to assess measurement invariance. Construct validity was assessed via the replication of well-established correlations with concurrently assessed measures. Most items exhibited adequate properties. Items #4, #8 and #13 exhibited poor item characteristics; each of these items had previously received negative evaluations in international studies. A one-dimensional factor solution, favorable for the calculation and interpretation of a sum score, was regarded as adequate. A bi-factor model with one content factor and two method factors (defined by positive/negative item coding) resulted in an excellent model fit. Cronbach's alpha in the current sample was .87. Hopelessness, as measured by the BHS, significantly correlated in the expected direction with suicidal ideation (r = .36), depression (r = .53) and life satisfaction (r = -.53). Strict measurement invariance could be established regarding gender and depression status. Due to limited research regarding the interpretation of fit indices with dichotomous data, interpretation of CFA results needs to remain tentative. The BHS is a valid measure of hopelessness in various subgroups of the general population. Future research could aim at replicating these findings using item response theory and cross-cultural samples. A one-dimensional bi-factor model seems appropriate even in a non-clinical population.
Construct Validity Evidence for Single-Response Items to Estimate Physical Activity Levels in Large Sample Studies

ERIC Educational Resources Information Center

Jackson, Allen W.; Morrow, James R., Jr.; Bowles, Heather R.; FitzGerald, Shannon J.; Blair, Steven N.

2007-01-01

Valid measurement of physical activity is important for studying the risks for morbidity and mortality. The purpose of this study was to examine evidence of construct validity of two similar single-response items assessing physical activity via self-report. Both items are based on the stages of change model. The sample was 687 participants (men =…
Assessment of Teacher Perceived Skill in Classroom Assessment Practices Using IRT Models

ERIC Educational Resources Information Center

Koloi-Keaikitse, Setlhomo

2017-01-01

The purpose of this study was to assess teacher perceived skill in classroom assessment practices. Data were collected from a sample of teachers (N = 691) selected from government primary, junior secondary, and senior secondary schools in Botswana. Item response theory models were used to identify teacher response on items that measured their…
Assessment of acquired capability for suicide in clinical practice.

PubMed

Rimkeviciene, Jurgita; Hawgood, Jacinta; O'Gorman, John; De Leo, Diego

2016-12-01

The Interpersonal Psychological Theory of suicide proposes that the interaction between Thwarted Belongingness, Perceived Burdensomeness, and Acquired Capability for Suicide (ACS) predicts proximal risk of death by suicide. Instruments to assess all three constructs are available. However, research on the validity of one of them, the acquired capability for suicide scale (ACSS), has been limited, especially in terms of its clinical relevance. This study aimed to explore the utility of the different versions of the ACSS in clinical assessment. Three versions of the scale were investigated, the full 20-item version, a 7-item version and a single item version representing self-perceived capability for suicide. In a sample of patients recruited from a clinic specialising in the treatment of suicidality and in a community sample, all versions of the ACSS were found to show reasonable levels of reliability and to correlate as expected with reports of suicidal ideation, self-harm, and attempted suicide. The item assessing self-perceived acquired capacity for suicide showed highest correlations with all levels of suicidal behaviour. However, no version of the ACSS on its own showed a capacity to indicate suicide attempts in the combined sample. It is concluded that the versions of the scale have construct validity, but their clinical utility is limited. An assessment using a single item on self-perceived ACS outperforms the full and shortened versions of ACSS in clinical settings and can be recommended with caution for clinicians interested in assessing this characteristic.
Evaluation of the Clinical LOINC (Logical Observation Identifiers, Names, and Codes) Semantic Structure as a Terminology Model for Standardized Assessment Measures

PubMed Central

Bakken, Suzanne; Cimino, James J.; Haskell, Robert; Kukafka, Rita; Matsumoto, Cindi; Chan, Garrett K.; Huff, Stanley M.

2000-01-01

Objective: The purpose of this study was to test the adequacy of the Clinical LOINC (Logical Observation Identifiers, Names, and Codes) semantic structure as a terminology model for standardized assessment measures. Methods: After extension of the definitions, 1,096 items from 35 standardized assessment instruments were dissected into the elements of the Clinical LOINC semantic structure. An additional coder dissected at least one randomly selected item from each instrument. When multiple scale types occurred in a single instrument, a second coder dissected one randomly selected item representative of each scale type. Results: The results support the adequacy of the Clinical LOINC semantic structure as a terminology model for standardized assessments. Using the revised definitions, the coders were able to dissect into the elements of Clinical LOINC all the standardized assessment items in the sample instruments. Percentage agreement for each element was as follows: component, 100 percent; property, 87.8 percent; timing, 82.9 percent; system/sample, 100 percent; scale, 92.6 percent; and method, 97.6 percent. Discussion: This evaluation was an initial step toward the representation of standardized assessment items in a manner that facilitates data sharing and re-use. Further clarification of the definitions, especially those related to time and property, is required to improve inter-rater reliability and to harmonize the representations with similar items already in LOINC. PMID:11062226
Brief Opioid Overdose Knowledge (BOOK): A Questionnaire to Assess Overdose Knowledge in Individuals Who Use Illicit or Prescribed Opioids.

PubMed

Dunn, Kelly E; Barrett, Frederick S; Yepez-Laubach, Claudia; Meyer, Andrew C; Hruska, Bryce J; Sigmon, Stacey C; Fingerhood, Michael; Bigelow, George E

2016-01-01

Opioid overdose is a public health crisis. This study describes efforts to develop and validate the Brief Opioid Overdose Knowledge (BOOK) questionnaire to assess patient knowledge gaps related to opioid overdose risks. Two samples of illicit opioid users and a third sample of patients receiving an opioid for the treatment of chronic pain (total N = 848) completed self-report items pertaining to opioid overdose risks. A 3-factor scale was established, representing Opioid Knowledge (4 items), Opioid Overdose Knowledge (4 items), and Opioid Overdose Response Knowledge (4 items). The scale had strong internal and face validity. Patients with chronic pain performed worse than illicit drug users in almost all items assessed, highlighting the need to increase knowledge of opioid overdose risk to this population. This study sought to develop a brief, internally valid method for quickly assessing deficits in opioid overdose risk areas within users of illicit and prescribed opioids, to provide an efficient metric for assessing and comparing educational interventions, facilitate conversations between physicians and patients about overdose risks, and help formally identify knowledge deficits in other patient populations.
Identifying gender specific risk/need areas for male and female juvenile offenders: Factor analyses with the Structured Assessment of Violence Risk in Youth (SAVRY).

PubMed

Hilterman, Ed L B; Bongers, Ilja; Nicholls, Tonia L; van Nieuwenhuizen, Chijs

2016-02-01

By constructing risk assessment tools in which the individual items are organized in the same way for male and female juvenile offenders it is assumed that these items and subscales have similar relevance across males and females. The identification of criminogenic needs that vary in relevance for 1 of the genders, could contribute to more meaningful risk assessments, especially for female juvenile offenders. In this study, exploratory factor analyses (EFA) on a construction sample of male (n = 3,130) and female (n = 466) juvenile offenders were used to aggregate the 30 items of the Structured Assessment of Violence Risk in Youth (SAVRY) into empirically based risk/need factors and explore differences between genders. The factor models were cross-validated through confirmatory factor analyses (CFA) on a validation sample of male (n = 2,076) and female (n = 357) juvenile offenders. In both the construction sample and the validation sample, 5 factors were identified: (a) Antisocial behavior; (b) Family functioning; (c) Personality traits; (d) Social support; and (e) Treatability. The male and female models were significantly different and the internal consistency of the factors was good, both in the construction sample and the validation sample. Clustering risk/need items for male and female juvenile offenders into meaningful factors may guide clinicians in the identification of gender-specific treatment interventions. PsycINFO Database Record (c) 2016 APA, all rights reserved.
Language-related differential item functioning between English and German PROMIS Depression items is negligible.

PubMed

Fischer, H Felix; Wahl, Inka; Nolte, Sandra; Liegl, Gregor; Brähler, Elmar; Löwe, Bernd; Rose, Matthias

2017-12-01

To investigate differential item functioning (DIF) of PROMIS Depression items between US and German samples we compared data from the US PROMIS calibration sample (n = 780), a German general population survey (n = 2,500) and a German clinical sample (n = 621). DIF was assessed in an ordinal logistic regression framework, with 0.02 as criterion for R²-change and 0.096 for Raju's non-compensatory DIF. Item parameters were initially fixed to the PROMIS Depression metric; we used plausible values to account for uncertainty in depression estimates. Only four items showed DIF. Accounting for DIF led to negligible effects for the full item bank as well as a post hoc simulated computer-adaptive test (< 0.1 point on the PROMIS metric [mean = 50, standard deviation = 10]), while the effect on the short forms was small (< 1 point). The mean depression severity (43.6) in the German general population sample was considerably lower compared to the US reference value of 50. Overall, we found little evidence for language DIF between US and German samples, which could be addressed by either replacing the DIF items by items not showing DIF or by scoring the short form in German samples with the corrected item parameters reported. Copyright © 2016 John Wiley & Sons, Ltd.
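The size of these effects is easier to read after converting back from the PROMIS metric to standard deviation units. Assuming the usual T-score convention implied by the bracketed mean of 50 and standard deviation of 10, with θ denoting the latent score in US reference SD units (the symbol is introduced here for illustration):

```latex
T = 50 + 10\,\theta
\quad\Longrightarrow\quad
\theta_{\text{German}} = \frac{43.6 - 50}{10} = -0.64
```

On this reading, the German general population mean lies about 0.64 reference standard deviations below the US value, whereas the DIF-related shifts of less than 0.1 and less than 1 point correspond to less than 0.01 and 0.1 standard deviations, respectively.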

Face validity and reliability of a pictorial instrument for assessing fundamental movement skill perceived competence in young children.

PubMed

Barnett, Lisa M; Ridgers, Nicola D; Zask, Avigdor; Salmon, Jo

2015-01-01

To determine reliability and face validity of an instrument to assess young children's perceived fundamental movement skill competence. Validation and reliability study. A pictorial instrument based on the Test Gross Motor Development-2 assessed perceived locomotor (six skills) and object control (six skills) competence using the format and item structure from the physical competence subscale of the Pictorial Scale of Perceived Competence and Acceptance for Young Children. Sample 1 completed object control items in May (n=32) and locomotor items in October 2012 (n=23) at two time points seven days apart. Children were asked at the end of the test-retest their understanding of what was happening in each picture to determine face validity. Sample 2 (n=58) completed 12 items in November 2012 on a single occasion to test internal reliability only. Sample 1 children were aged 5-7 years (M=6.0, SD=0.8) at object control assessment and 5-8 years at locomotor assessment (M=6.5, SD=0.9). Sample 2 children were aged 6-8 years (M=7.2, SD=0.73). Intra-class correlations assessed in Sample 1 children were excellent for object control (intra-class correlation=0.78), locomotor (intra-class correlation=0.82) and all 12 skills (intra-class correlations=0.83). Face validity was acceptable. Internal consistency was adequate in both samples for each subscale and all 12 skills (alpha range 0.60-0.81). This study has provided preliminary evidence for instrument reliability and face validity. This enables future alignment between the measurement of perceived and actual fundamental movement skill competence in young children. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.
Computerized Adaptive Testing Provides Reliable and Efficient Depression Measurement Using the CES-D Scale

PubMed Central

2017-01-01

Background The Center for Epidemiologic Studies Depression Scale (CES-D) is a measure of depressive symptomatology which is widely used internationally. Though previous attempts were made to shorten the CES-D scale, few have attempted to develop a Computerized Adaptive Test (CAT) version for the CES-D. Objective The aim of this study was to provide evidence on the efficiency and accuracy of the CES-D when administered as a CAT in an American sample. Methods We obtained a sample of 2060 responses to the CES-D from US participants using the myPersonality application. The average age of participants was 26 years (range 19-77). We randomly split the sample into two groups to evaluate and validate the psychometric models. We used evaluation group data (n=1018) to assess dimensionality with both confirmatory factor and Mokken analysis. We conducted further psychometric assessments using item response theory (IRT), including assessments of item and scale fit to Samejima’s graded response model (GRM), local dependency and differential item functioning. We subsequently conducted two CAT simulations to evaluate the CES-D CAT using the validation group (n=1042). Results Initial CFA results indicated a poor fit to the model and Mokken analysis revealed 3 items which did not conform to the same dimension as the rest of the items. We removed the 3 items and fit the remaining 17 items to GRM. We found no evidence of differential item functioning (DIF) between age and gender groups. Estimates of the level of CES-D trait score provided by the simulated CAT algorithm and the original CES-D trait score derived from original scale were correlated highly. The second CAT simulation conducted using real participant data demonstrated higher precision at the higher levels of depression spectrum. Conclusions Depression assessments using the CES-D CAT can be more accurate and efficient than those made using the fixed-length assessment. PMID:28931496
Psychometrics of the preschooler physical activity parenting practices instrument among a Latino sample.

PubMed

O'Connor, Teresia M; Cerin, Ester; Hughes, Sheryl O; Robles, Jessica; Thompson, Deborah I; Mendoza, Jason A; Baranowski, Tom; Lee, Rebecca E

2014-01-15

Latino preschoolers (3-5 year old children) have among the highest rates of obesity. Low levels of physical activity (PA) are a risk factor for obesity. Characterizing what Latino parents do to encourage or discourage their preschooler to be physically active can help inform interventions to increase their PA. The objective was therefore to develop and assess the psychometrics of a new instrument: the Preschooler Physical Activity Parenting Practices (PPAPP) among a Latino sample, to assess parenting practices used to encourage or discourage PA among preschool-aged children. Cross-sectional study of 240 Latino parents who reported the frequency of using PA parenting practices. 95% of respondents were mothers; 42% had more than a high school education. Child mean age was 4.5 (±0.9) years (52% male). Test-retest reliability was assessed in 20%, 2 weeks later. We assessed the fit of a priori models using confirmatory factor analyses (CFA). In a separate sub-sample (35%), preschool-aged children wore accelerometers to assess associations with their PA and PPAPP subscales. The a priori models showed poor fit to the data. A modified factor structure for encouraging PPAPP had one multiple-item scale: engagement (15 items), and two single-items (have outdoor toys; not enroll in sport-reverse coded). The final factor structure for discouraging PPAPP had 4 subscales: promote inactive transport (3 items), promote screen time (3 items), psychological control (4 items) and restricting for safety (4 items). Test-retest reliability (ICC) for the two scales ranged from 0.56 to 0.85. Cronbach's alphas ranged from 0.5 to 0.9. Several sub-factors correlated in the expected direction with children's objectively measured PA. The final models for encouraging and discouraging PPAPP had moderate to good fit, with moderate to excellent test-retest reliabilities. The PPAPP should be further evaluated to better assess its associations with children's PA and offers a new tool for measuring PPAPP among Latino families with preschool-aged children.
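Several records in this listing report Cronbach's alpha as an internal-consistency estimate. The sketch below shows the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), applied to a small made-up response matrix. The function name and the data are hypothetical and are not taken from any of the studies.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up responses: 5 respondents x 4 items on a 0-3 scale.
responses = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 1],
]
alpha = cronbach_alpha(responses)
```

The function assumes at least two items, at least two respondents, and non-zero variance in the total scores. Real analyses would usually rely on an established statistics package instead.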
Strategic assessment of the availability of pediatric trauma care equipment, technology and supplies in Ghana.

PubMed

Ankomah, James; Stewart, Barclay T; Oppong-Nketia, Victor; Koranteng, Adofo; Gyedu, Adam; Quansah, Robert; Donkor, Peter; Abantanga, Francis; Mock, Charles

2015-11-01

This study aimed to assess the availability of pediatric trauma care items (i.e. equipment, supplies, technology) and factors contributing to deficiencies in Ghana. Ten universal and 9 pediatric-sized items were selected from the World Health Organization's Guidelines for Essential Trauma Care. Direct inspection and structured interviews with administrative, clinical and biomedical engineering staff were used to assess item availability at 40 purposively sampled district, regional and tertiary hospitals in Ghana. Hospital assessments demonstrated marked deficiencies for a number of essential items (e.g. basic airway supplies, chest tubes, blood pressure cuffs, electrolyte determination, portable X-ray). Lack of pediatric-sized items resulting from equipment absence, lack of training, frequent stock-outs and technology breakage were common. Pediatric items were consistently less available than adult-sized items at each hospital level. This study identified several successes and problems with pediatric trauma care item availability in Ghana. Item availability could be improved, both affordably and reliably, by better organization and planning (e.g. regular assessment of demand and inventory, reliable financing for essential trauma care items). In addition, technology items were often broken. Developing local service and biomedical engineering capability was highlighted as a priority to avoid long periods of equipment breakage. Copyright © 2015 Elsevier Inc. All rights reserved.
Strategic assessment of the availability of pediatric trauma care equipment, technology and supplies in Ghana

PubMed Central

Ankomah, James; Stewart, Barclay T; Oppong-Nketia, Victor; Koranteng, Adofo; Gyedu, Adam; Quansah, Robert; Donkor, Peter; Abantanga, Francis; Mock, Charles

2015-01-01

Background: This study aimed to assess the availability of pediatric trauma care items (i.e. equipment, supplies, technology) and factors contributing to deficiencies in Ghana. Methods: Ten universal and 9 pediatric-sized items were selected from the World Health Organization’s Guidelines for Essential Trauma Care. Direct inspection and structured interviews with administrative, clinical and biomedical engineering staff were used to assess item availability at 40 purposively sampled district, regional and tertiary hospitals in Ghana. Results: Hospital assessments demonstrated marked deficiencies for a number of essential items (e.g. basic airway supplies, chest tubes, blood pressure cuffs, electrolyte determination, portable X-ray). Lack of pediatric-sized items resulting from equipment absence, lack of training, frequent stock-outs and technology breakage were common. Pediatric items were consistently less available than adult-sized items at each hospital level. Conclusion: This study identified several successes and problems with pediatric trauma care item availability in Ghana. Item availability could be improved, both affordably and reliably, by better organization and planning (e.g. regular assessment of demand and inventory, reliable financing for essential trauma care items). In addition, technology items were often broken. Developing local service and biomedical engineering capability was highlighted as a priority to avoid long periods of equipment breakage. PMID:25841284
Explaining and Controlling for the Psychometric Properties of Computer-Generated Figural Matrix Items

ERIC Educational Resources Information Center

Freund, Philipp Alexander; Hofer, Stefan; Holling, Heinz

2008-01-01

Figural matrix items are a popular task type for assessing general intelligence (Spearman's g). Items of this kind can be constructed rationally, allowing the implementation of computerized generation algorithms. In this study, the influence of different task parameters on the degree of difficulty in matrix items was investigated. A sample of N =…
A Time and Place for Everything: Developmental Differences in the Building Blocks of Episodic Memory

ERIC Educational Resources Information Center

Lee, Joshua K.; Wendelken, Carter; Bunge, Silvia A.; Ghetti, Simona

2016-01-01

This research investigated whether episodic memory development can be explained by improvements in relational binding processes, involved in forming novel associations between events and the context in which they occurred. Memory for item-space, item-time, and item-item relations was assessed in an ethnically diverse sample of 151 children aged…
Using Confirmatory Factor Analysis and the Rasch Model to Assess Measurement Invariance in a High Stakes Reading Assessment

ERIC Educational Resources Information Center

Randall, Jennifer; Engelhard, George, Jr.

2010-01-01

The psychometric properties and multigroup measurement invariance of scores across subgroups, items, and persons on the "Reading for Meaning" items from the Georgia Criterion Referenced Competency Test (CRCT) were assessed in a sample of 778 seventh-grade students. Specifically, we sought to determine the extent to which score-based…
Improving Assessment of Work Related Mental Health Function Using the Work Disability Functional Assessment Battery (WD-FAB).

PubMed

Marfeo, Elizabeth E; Ni, Pengsheng; McDonough, Christine; Peterik, Kara; Marino, Molly; Meterko, Mark; Rasch, Elizabeth K; Chan, Leighton; Brandt, Diane; Jette, Alan M

2018-03-01

Purpose: To improve the mental health component of the Work Disability Functional Assessment Battery (WD-FAB), developed for the US Social Security Administration's (SSA) disability determination process. Specifically, our goal was to expand the WD-FAB scales of mood & emotions, resilience, social interactions, and behavioral control to improve the depth and breadth of the current scales and expand the content coverage to include aspects of cognition & communication function. Methods: Data were collected from a random, stratified sample of 1695 claimants applying for the SSA work disability benefits, and a general population sample of 2025 working age adults. 169 new items were developed to replenish the WD-FAB scales and analyzed using factor analysis and item response theory (IRT) analysis to construct unidimensional scales. We conducted computer adaptive test (CAT) simulations to examine the psychometric properties of the WD-FAB. Results: Analyses supported the inclusion of four mental health subdomains: Cognition & Communication (68 items), Self-Regulation (34 items), Resilience & Sociability (29 items) and Mood & Emotions (34 items). All scales yielded acceptable psychometric properties. Conclusions: IRT methods were effective in expanding the WD-FAB to assess mental health function. The WD-FAB has the potential to enhance work disability assessment both within the context of the SSA disability programs as well as other clinical and vocational rehabilitation settings.
Using the Rasch Measurement Model in Psychometric Analysis of the Family Effectiveness Measure

PubMed Central

McCreary, Linda L.; Conrad, Karen M.; Conrad, Kendon J.; Scott, Christy K; Funk, Rodney R.; Dennis, Michael L.

2013-01-01

Background: Valid assessment of family functioning can play a vital role in optimizing client outcomes. Because family functioning is influenced by family structure, socioeconomic context, and culture, existing measures of family functioning--primarily developed with nuclear, middle class European American families--may not be valid assessments of families in diverse populations. The Family Effectiveness Measure was developed to address this limitation. Objectives: To test the Family Effectiveness Measure with data from a primarily low-income African American convenience sample, using the Rasch measurement model. Method: A sample of 607 adult women completed the measure. Rasch analysis was used to assess unidimensionality, response category functioning, item fit, person reliability, differential item functioning by race and parental status, and item hierarchy. Criterion-related validity was tested using correlations with five other variables related to family functioning. Results: The Family Effectiveness Measure measures two separate constructs: The effective family functioning construct was a psychometrically sound measure of the target construct that was more efficient due to the deletion of 22 items. The ineffective family functioning construct consisted of 16 of those deleted items but was not as strong psychometrically. Items in both constructs evidenced no differential item functioning by race. Criterion-related validity was supported for both. Discussion: In contrast to the prevailing conceptualization that family functioning is a single construct, assessed by positively and negatively worded items, use of the Rasch analysis suggested the existence of two constructs. While the effective family functioning is a strong and efficient measure of family functioning, the ineffective family functioning will require additional item development and psychometric testing. PMID:23636342
A Review of ETS Differential Item Functioning Assessment Procedures: Flagging Rules, Minimum Sample Size Requirements, and Criterion Refinement. Research Report. ETS RR-12-08

ERIC Educational Resources Information Center

Zwick, Rebecca

2012-01-01

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

PubMed

Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

2018-01-01

To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.
Developing and testing new smoking measures for the Health Plan Employer Data and Information Set.

PubMed

Pbert, Lori; Vuckovic, Nancy; Ockene, Judith K; Hollis, Jack F; Riedlinger, Karen

2003-04-01

To develop and test items for the Health Plan Employer Data and Information Set (HEDIS) that assess delivery of the full range of provider-delivered tobacco interventions. The authors identified potential items via literature review; items were reviewed by national experts. Face validity of candidate items was tested in focus groups. The final survey was sent to a random sample of 1711 adult primary care patients; the re-test survey was sent to self-identified smokers. The process identified reliable items to capture provider assessment of motivation and provision of assistance and follow-up. One can reliably assess patient self-report of provider delivery of the full range of brief tobacco interventions. Such assessment and feedback to health plans and providers may increase use of evidence-based brief interventions.
Parts on Demand: Evaluation of Approaches to Achieve Flexible Manufacturing Systems for Navy Parts on Demand. Volume 1

DTIC Science & Technology

1984-02-01

measurable impact if changed. The following items were included in the sample: * Mark Zero Items - Low demand insurance items which represent about three...R&D efforts reviewed. The resulting assessment highlighted the generic enabling technologies and cross-cutting R&D projects required to focus current...supplied by spot buys, and which may generate Navy Inventory Control Numbers (NICN). Random samples of data were extracted from the Master Data File (MDF
A Rasch Analysis of Assessments of Morning and Evening Fatigue in Oncology Patients Using the Lee Fatigue Scale.

PubMed

Lerdal, Anners; Kottorp, Anders; Gay, Caryl; Aouizerat, Bradley E; Lee, Kathryn A; Miaskowski, Christine

2016-06-01

To accurately investigate diurnal variations in fatigue, a measure needs to be psychometrically sound and demonstrate stable item function in relationship to time of day. Rasch analysis is a modern psychometric approach that can be used to evaluate these characteristics. To evaluate, using Rasch analysis, the psychometric properties of the Lee Fatigue Scale (LFS) in a sample of oncology patients. The sample comprised 587 patients (mean age 57.3 ± 11.9 years, 80% women) undergoing chemotherapy for breast, gastrointestinal, gynecological, or lung cancer. Patients completed the 13-item LFS within 30 minutes of awakening (i.e., morning fatigue) and before going to bed (i.e., evening fatigue). Rasch analysis was used to assess validity and reliability. In initial analyses of differential item function, eight of the 13 items functioned differently depending on whether the LFS was completed in the morning or in the evening. Subsequent analyses were conducted separately for the morning and evening fatigue assessments. Nine of the morning fatigue items and 10 of the evening fatigue items demonstrated acceptable goodness-of-fit to the Rasch model. Principal components analyses indicated that both morning and evening assessments demonstrated unidimensionality. Person-separation indices indicated that both morning and evening fatigue scales were able to distinguish four distinct strata of fatigue severity. Excluding four items from the morning fatigue scale and three items from the evening fatigue scale improved the psychometric properties of the LFS for assessing diurnal variations in fatigue severity in oncology patients. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
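The Rasch analyses in the records above rest on a simple response function. The sketch below shows the dichotomous form of the Rasch model only. The scales analysed in these studies use multi-category ratings, which call for polytomous extensions not shown here. The function name and all parameter values are hypothetical. The last lines illustrate what differential item functioning means in this setting: the same item has a different difficulty in two conditions, so equally fatigued respondents answer it differently.

```python
import math

def rasch_prob(theta, b):
    """Probability of endorsing a dichotomous item under the Rasch model.

    theta -- person location on the latent trait
    b     -- item difficulty (location) on the same scale
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Hypothetical illustration of differential item functioning:
# one item, two made-up difficulty values for morning and evening.
theta = 0.5
p_morning = rasch_prob(theta, b=1.0)   # item is "harder" to endorse
p_evening = rasch_prob(theta, b=0.0)   # item is "easier" to endorse
```

When theta equals b the probability is exactly 0.5. This is why item difficulties and person measures can be read off a common scale.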
Assessing Hopelessness in Terminally Ill Cancer Patients: Development of the Hopelessness Assessment in Illness Questionnaire

PubMed Central

Rosenfeld, Barry; Pessin, Hayley; Lewis, Charles; Abbey, Jennifer; Olden, Megan; Sachs, Emily; Amakawa, Lia; Kolva, Elissa; Brescia, Robert; Breitbart, William

2013-01-01

Hopelessness has become an increasingly important construct in palliative care research, yet concerns exist regarding the utility of existing measures when applied to patients with a terminal illness. This article describes a series of studies focused on the exploration, development, and analysis of a measure of hopelessness specifically intended for use with terminally ill cancer patients. The 1st stage of measure development involved interviews with 13 palliative care experts and 30 terminally ill patients. Qualitative analysis of the patient interviews culminated in the development of a set of potential questionnaire items. In the 2nd study phase, we evaluated these preliminary items with a sample of 314 participants, using item response theory and classical test theory to identify optimal items and response format. These analyses generated an 8-item measure that we tested in a final study phase, using a 3rd sample (n = 228) to assess reliability and concurrent validity. These analyses demonstrated strong support for the Hopelessness Assessment in Illness Questionnaire providing greater explanatory power than existing measures of hopelessness and found little evidence that this assessment was confounded by illness-related variables (e.g., prognosis). In summary, these 3 studies suggest that this brief measure of hopelessness is particularly useful for palliative care settings. Further research is needed to assess the applicability of the measure to other populations and contexts. PMID:21443366
The reverse of social anxiety is not always the opposite: the reverse-scored items of the social interaction anxiety scale do not belong.

PubMed

Rodebaugh, Thomas L; Woods, Carol M; Heimberg, Richard G

2007-06-01

Although well-used and empirically supported, the Social Interaction Anxiety Scale (SIAS) has a questionable factor structure and includes reverse-scored items with questionable utility. Here, using samples of undergraduates and a sample of clients with social anxiety disorder, we extend previous work that opened the question of whether the reverse-scored items belong on the scale. First, we successfully confirmed the factor structure obtained in previous samples. Second, we found the reverse-scored items to show consistently weaker relationships with a variety of comparison measures. Third, we demonstrated that removing the reverse-scored questions generally helps rather than hinders the psychometric performance of the SIAS total score. Fourth, we found that the reverse-scored items show a strong relationship with the normal personality characteristic of extraversion, suggesting that the reverse-scored items may primarily assess extraversion. Given the above results, we suggest investigators consider performing data analyses using only the straightforwardly worded items of the SIAS.
Development of a tool to measure person-centered maternity care in developing settings: validation in a rural and urban Kenyan population.

PubMed

Afulani, Patience A; Diamond-Smith, Nadia; Golub, Ginger; Sudhinaraset, May

2017-09-22

Person-centered reproductive health care is recognized as critical to improving reproductive health outcomes. Yet, little research exists on how to operationalize it. We extend the literature in this area by developing and validating a tool to measure person-centered maternity care. We describe the process of developing the tool and present the results of psychometric analyses to assess its validity and reliability in a rural and urban setting in Kenya. We followed standard procedures for scale development. First, we reviewed the literature to define our construct and identify domains, and developed items to measure each domain. Next, we conducted expert reviews to assess content validity; and cognitive interviews with potential respondents to assess clarity, appropriateness, and relevance of the questions. The questions were then refined and administered in surveys; and survey results used to assess construct and criterion validity and reliability. The exploratory factor analysis yielded one dominant factor in both the rural and urban settings. Three factors with eigenvalues greater than one were identified for the rural sample and four factors identified for the urban sample. Thirty of the 38 items administered in the survey were retained based on the factor loadings and correlation between the items. Twenty-five items load very well onto a single factor in both the rural and urban sample, with five items loading well in either the rural or urban sample, but not in both samples. These 30 items also load on three sub-scales that we created to measure dignified and respectful care, communication and autonomy, and supportive care. The Cronbach's alpha for the main scale is greater than 0.8 in both samples, and those for the sub-scales are between 0.6 and 0.8. The main scale and sub-scales are correlated with global measures of satisfaction with maternity services, suggesting criterion validity. We present a 30-item scale with three sub-scales to measure person-centered maternity care. This scale has high validity and reliability in a rural and urban setting in Kenya. Validation in additional settings is however needed. This scale will facilitate measurement to improve person-centered maternity care, and subsequently improve reproductive outcomes.
Development of a scale to assess Hwa-Byung, a Korean culture-bound syndrome, using the Korean MMPI-2.

PubMed

Roberts, Miguel E; Han, Kyunghee; Weed, Nathan C

2006-09-01

This study documents the development of an MMPI-2 scale designed to assess features of the Korean culture-bound syndrome, Hwa-Byung (HB). An American research team and psychiatric practitioners in Korea created an 18-item HB scale via rational item selection and psychometric refinement. Principal components analysis of scale items revealed four components, reflecting content domains of general health, gastrointestinal symptoms, hopelessness, and anger. This four-component solution applied well to both Korean men and women, but not to an American sample. Although some findings were encouraging, future studies employing clinical samples are needed to provide further validation of this scale.
Assessing psychological well-being: self-report instruments for the NIH Toolbox.

PubMed

Salsman, John M; Lai, Jin-Shei; Hendrie, Hugh C; Butt, Zeeshan; Zill, Nicholas; Pilkonis, Paul A; Peterson, Christopher; Stoney, Catherine M; Brouwers, Pim; Cella, David

2014-02-01

Psychological well-being (PWB) has a significant relationship with physical and mental health. As a part of the NIH Toolbox for the Assessment of Neurological and Behavioral Function, we developed self-report item banks and short forms to assess PWB. Expert feedback and literature review informed the selection of PWB concepts and the development of item pools for positive affect, life satisfaction, and meaning and purpose. Items were tested with a community-dwelling US Internet panel sample of adults aged 18 and above (N = 552). Classical and item response theory (IRT) approaches were used to evaluate unidimensionality, fit of items to the overall measure, and calibrations of those items, including differential item function (DIF). IRT-calibrated item banks were produced for positive affect (34 items), life satisfaction (16 items), and meaning and purpose (18 items). Their psychometric properties were supported based on the results of factor analysis, fit statistics, and DIF evaluation. All banks measured the concepts precisely (reliability ≥0.90) for more than 98% of participants. These adult scales and item banks for PWB provide the flexibility, efficiency, and precision necessary to promote future epidemiological, observational, and intervention research on the relationship of PWB with physical and mental health.

PSSA Released Reading Items, 2000-2001. The Pennsylvania System of School Assessment.

ERIC Educational Resources Information Center

Pennsylvania State Dept. of Education, Harrisburg. Bureau of Curriculum and Academic Services.

This document contains materials directly related to the actual reading test of the Pennsylvania System of School Assessment (PSSA), including the reading rubric, released passages, selected-response questions with answer keys, performance tasks, and scored samples of students' responses to the tasks. All of these items may be duplicated to…
Methodology for Developing and Evaluating the PROMIS® Smoking Item Banks

PubMed Central

Cai, Li; Stucky, Brian D.; Tucker, Joan S.; Shadel, William G.; Edelen, Maria Orlando

2014-01-01

Introduction: This article describes the procedures used in the PROMIS® Smoking Initiative for the development and evaluation of item banks, short forms (SFs), and computerized adaptive tests (CATs) for the assessment of 6 constructs related to cigarette smoking: nicotine dependence, coping expectancies, emotional and sensory expectancies, health expectancies, psychosocial expectancies, and social motivations for smoking. Methods: Analyses were conducted using response data from a large national sample of smokers. Items related to each construct were subjected to extensive item factor analyses and evaluation of differential item functioning (DIF). Final item banks were calibrated, and SF assessments were developed for each construct. The performance of the SFs and the potential use of the item banks for CAT administration were examined through simulation study. Results: Item selection based on dimensionality assessment and DIF analyses produced item banks that were essentially unidimensional in structure and free of bias. Simulation studies demonstrated that the constructs could be accurately measured with a relatively small number of carefully selected items, either through fixed SFs or CAT-based assessment. Illustrative results are presented, and subsequent articles provide detailed discussion of each item bank in turn. Conclusions: The development of the PROMIS smoking item banks provides researchers with new tools for measuring smoking-related constructs. The use of the calibrated item banks and suggested SF assessments will enhance the quality of score estimates, thus advancing smoking research. Moreover, the methods used in the current study, including innovative approaches to item selection and SF construction, may have general relevance to item bank development and evaluation. PMID:23943843
Sex Differences in Item Functioning in the Comprehensive Inventory of Basic Skills-II Vocabulary Assessments

ERIC Educational Resources Information Center

French, Brian F.; Gotch, Chad M.

2013-01-01

The Brigance Comprehensive Inventory of Basic Skills-II (CIBS-II) is a diagnostic battery intended for children in grades 1st through 6th. The aim of this study was to test for item invariance, or differential item functioning (DIF), of the CIBS-II across sex in the standardization sample through the use of item response theory DIF detection…
Overlap and distinction between measures of insight and self-stigma.

PubMed

Hasson-Ohayon, Ilanit

2018-05-24

Multiple studies on insight into one's illness and self-stigma among patients with serious mental illness and their relatives have shown that these constructs are related to one another and that they affect outcome. However, a critical exploration of the items used to assess both constructs raises questions with regard to the possible overlapping and centrality of items. The current study used five different samples to explore the possible overlap and distinction between insight and self-stigma, and to identify central items, via network analyses and principal component factor analysis. Findings from the network analyses showed that overlap between insight and self-stigma exists, with a relatively clearer observational distinction between the constructs among the two parent samples in comparison to the patient samples. Principal component factor analysis constrained to two factors showed that a relatively high percentage of items did not load on either factor, and in a few datasets, several insight items loaded on the self-stigma scale and vice versa. The author discusses implications for research and calls for rethinking the way insight is assessed. Clinical implications are also discussed in reference to central items of social isolation, future worries and stereotype endorsement among the different study groups. Copyright © 2018 Elsevier B.V. All rights reserved.
Assessing adolescents' personality with the NEO PI-R.

PubMed

De Fruyt, F; Mervielde, I; Hoekstra, H A; Rolland, J P

2000-12-01

The suitability of the Revised NEO Personality Inventory (NEO PI-R) to assess adolescents' personality traits was investigated in an unselected heterogeneous sample of 469 adolescents aged 12 to 17 years. They were further administered the Hierarchical Personality Inventory for Children (HiPIC) to allow an examination of convergent and discriminant validity. The adult NEO PI-R factor structure proved to be highly replicable in the sample of adolescents, with all facet scales primarily loading on the expected factors, independent of the age group. Domain and facet internal consistency coefficients were comparable to those obtained in adult samples, with less than 12% of the items showing corrected item-facet correlations below absolute value .20. Although, in general, adolescents reported few difficulties with the comprehensibility of the items, they tended to report more problems with the Openness to Ideas (O5) and Openness to Values (O6) items. Correlations between NEO PI-R and HiPIC scales underscored the convergent and discriminant validity of the NEO facets and HiPIC scales. It was concluded that the NEO PI-R in its present form is useful for assessing adolescents' traits at the primary level, but additional research is necessary to infer the most appropriate facet level structure.
Meta-analytic guidelines for evaluating single-item reliabilities of personality instruments.

PubMed

Spörrle, Matthias; Bekk, Magdalena

2014-06-01

Personality is an important predictor of various outcomes in many social science disciplines. However, when personality traits are not the principal focus of research, for example, in global comparative surveys, it is often not possible to assess them extensively. In this article, we first provide an overview of the advantages and challenges of single-item measures of personality, a rationale for their construction, and a summary of alternative ways of assessing their reliability. Second, using seven diverse samples (Ntotal = 4,263) we develop the SIMP-G, the German adaptation of the Single-Item Measures of Personality, an instrument assessing the Big Five with one item per trait, and evaluate its validity and reliability. Third, we integrate previous research and our data into a first meta-analysis of single-item reliabilities of personality measures, and provide researchers with guidelines and recommendations for the evaluation of single-item reliabilities. © The Author(s) 2013.
Checklist content on a standardized patient assessment: an ex post facto review.

PubMed

Boulet, John R; van Zanten, Marta; de Champlain, André; Hawkins, Richard E; Peitzman, Steven J

2008-03-01

While checklists are often used to score standardized patient based clinical assessments, little research has focused on issues related to their development or the level of agreement with respect to the importance of specific items. Five physicians independently reviewed checklists from 11 simulation scenarios that were part of the former Educational Commission for Foreign Medical Graduates' Clinical Skills Assessment and classified the clinical appropriateness of each of the checklist items. Approximately 78% of the original checklist items were judged to be needed, or indicated, given the presenting complaint and the purpose of the assessment. Rater agreement was relatively poor, with pairwise associations (Kappa coefficient) ranging from 0.09 to 0.29. However, when only consensus-indicated items were included, there was little change in examinee scores, including their reliability over encounters. Although most checklist items in this sample were judged to be appropriate, some could potentially be eliminated, thereby minimizing the scoring burden placed on the standardized patients. Periodic review of checklist items, concentrating on their clinical importance, is warranted.
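The pairwise kappa coefficients reported in this record correct raw agreement between two raters for the agreement expected by chance. A minimal sketch of that calculation follows, assuming Cohen's kappa for each rater pair; the rater labels, the ratings, and the function names are invented for illustration and are not data or code from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty item set")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1.0 - expected)


def pairwise_kappas(ratings_by_rater):
    """Kappa for every pair of raters, keyed by (rater, rater)."""
    return {
        (r1, r2): cohen_kappa(ratings_by_rater[r1], ratings_by_rater[r2])
        for r1, r2 in combinations(sorted(ratings_by_rater), 2)
    }


# Hypothetical ratings: 1 = item judged indicated, 0 = not indicated.
example = {
    "rater_1": [1, 1, 0, 0],
    "rater_2": [1, 0, 0, 0],
    "rater_3": [0, 1, 0, 1],
}
kappas = pairwise_kappas(example)
```

With five raters, as in the record above, the same function would return ten pairwise values, whose minimum and maximum give a range of the kind reported.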
The Development and Validation of the Intercultural Sensitivity Scale.

ERIC Educational Resources Information Center

Chen, Guo-Ming; Starosta, William J.

The present study developed and assessed reliability and validity of a new instrument, the Intercultural Sensitivity Scale (ISS). Based on a review of the literature, 44 items thought to be important for intercultural sensitivity were generated. A sample of 414 college students rated these items and generated a 24-item final version of the…
Cross-cultural Measurement Equivalence of the KINDL Questionnaire for Quality of Life Assessment in Children and Adolescents.

PubMed

Jafari, Peyman; Stevanovic, Dejan; Bagheri, Zahra

2016-04-01

This cross-cultural study aimed to assess whether Iranian and Serbian children, and also their parents, perceived the meaning of the items in the KINDL quality of life questionnaire consistently. The sample included 1086 Iranian and 756 Serbian children and adolescents, alongside 1061 and 618 of their parents, respectively. Ordinal logistic regression was used to assess differential item functioning (DIF) of the self and proxy-reports of the two versions of the KINDL, including Kid-KINDL and Kiddo-KINDL, across Iranian and Serbian samples. Statistically significant DIF was flagged for 14 out of 24 (58%) and 20 out of 24 (83%) items in the self-report of the Kid-KINDL and Kiddo-KINDL, respectively. Moreover, 20 out of 24 (83%) items in the proxy reports of both the Kid-KINDL and Kiddo-KINDL showed DIF across the two samples. Accordingly, considerable caution is warranted when using the KINDL for cross-cultural comparisons.
Psychometrics of the preschooler physical activity parenting practices instrument among a Latino sample

PubMed Central

2014-01-01

Background Latino preschoolers (3-5 year old children) have among the highest rates of obesity. Low levels of physical activity (PA) are a risk factor for obesity. Characterizing what Latino parents do to encourage or discourage their preschooler to be physically active can help inform interventions to increase their PA. The objective was therefore to develop and assess the psychometrics of a new instrument: the Preschooler Physical Activity Parenting Practices (PPAPP) among a Latino sample, to assess parenting practices used to encourage or discourage PA among preschool-aged children. Methods Cross-sectional study of 240 Latino parents who reported the frequency of using PA parenting practices. 95% of respondents were mothers; 42% had more than a high school education. Child mean age was 4.5 (±0.9) years (52% male). Test-retest reliability was assessed in 20%, 2 weeks later. We assessed the fit of a priori models using Confirmatory factor analyses (CFA). In a separate sub-sample (35%), preschool-aged children wore accelerometers to assess associations with their PA and PPAPP subscales. Results The a-priori models showed poor fit to the data. A modified factor structure for encouraging PPAPP had one multiple-item scale: engagement (15 items), and two single-items (have outdoor toys; not enroll in sport-reverse coded). The final factor structure for discouraging PPAPP had 4 subscales: promote inactive transport (3 items), promote screen time (3 items), psychological control (4 items) and restricting for safety (4 items). Test-retest reliability (ICC) for the two scales ranged from 0.56-0.85. Cronbach’s alphas ranged from 0.5-0.9. Several sub-factors correlated in the expected direction with children’s objectively measured PA. Conclusion The final models for encouraging and discouraging PPAPP had moderate to good fit, with moderate to excellent test-retest reliabilities. 
The PPAPP should be further evaluated to better assess its associations with children’s PA and offers a new tool for measuring PPAPP among Latino families with preschool-aged children. PMID:24428935
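Cronbach's alpha, reported above as a range across subscales, compares the sum of the item variances with the variance of the total score. The sketch below assumes the standard formula, alpha = k/(k-1) × (1 − Σ item variances / total-score variance); the function name and the response matrices in the usage note are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    `scores` is a list of rows, one row per respondent, one column per item.
    """
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    n_items = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)
```

For two items that rank three respondents identically, for example `cronbach_alpha([[1, 1], [2, 2], [3, 3]])`, the total-score variance is twice the summed item variances and alpha comes out at 1. Items that disagree about the ordering of respondents pull the value down.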
Impact of IRT item misfit on score estimates and severity classifications: an examination of PROMIS depression and pain interference item banks.

PubMed

Zhao, Yue

2017-03-01

In patient-reported outcome research that utilizes item response theory (IRT), using statistical significance tests to detect misfit is usually the focus of IRT model-data fit evaluations. However, such evaluations rarely address the impact/consequence of using misfitting items on the intended clinical applications. This study was designed to evaluate the impact of IRT item misfit on score estimates and severity classifications and to demonstrate a recommended process of model-fit evaluation. Using secondary data sources collected from the Patient-Reported Outcome Measurement Information System (PROMIS) wave 1 testing phase, analyses were conducted based on PROMIS depression (28 items; 782 cases) and pain interference (41 items; 845 cases) item banks. The identification of misfitting items was assessed using Orlando and Thissen's summed-score item-fit statistics and graphical displays. The impact of misfit was evaluated according to the agreement of both IRT-derived T-scores and severity classifications between inclusion and exclusion of misfitting items. The examination of the presence and impact of misfit suggested that item misfit had a negligible impact on the T-score estimates and severity classifications with the general population sample in the PROMIS depression and pain interference item banks, implying that the impact of item misfit was insignificant. Findings support the T-score estimates in the two item banks as robust against item misfit at both the group and individual levels and add confidence to the use of T-scores for severity diagnosis in the studied sample. Recommendations on approaches for identifying item misfit (statistical significance) and assessing the misfit impact (practical significance) are given.
A test of the International Personality Item Pool representation of the Revised NEO Personality Inventory and development of a 120-item IPIP-based measure of the five-factor model.

PubMed

Maples, Jessica L; Guan, Li; Carter, Nathan T; Miller, Joshua D

2014-12-01

There has been a substantial increase in the use of personality assessment measures constructed using items from the International Personality Item Pool (IPIP) such as the 300-item IPIP-NEO (Goldberg, 1999), a representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992). The IPIP-NEO is free to use and can be modified to accommodate its users' needs. Despite the substantial interest in this measure, there is still a dearth of data demonstrating its convergence with the NEO PI-R. The present study represents an investigation of the reliability and validity of scores on the IPIP-NEO. Additionally, we used item response theory (IRT) methodology to create a 120-item version of the IPIP-NEO. Using an undergraduate sample (n = 359), we examined the reliability, as well as the convergent and criterion validity, of scores from the 300-item IPIP-NEO, a previously constructed 120-item version of the IPIP-NEO (Johnson, 2011), and the newly created IRT-based IPIP-120 in comparison to the NEO PI-R across a range of outcomes. Scores from all 3 IPIP measures demonstrated strong reliability and convergence with the NEO PI-R and a high degree of similarity with regard to their correlational profiles across the criterion variables (rICC = .983, .972, and .976, respectively). The replicability of these findings was then tested in a community sample (n = 757), and the results closely mirrored the findings from Sample 1. These results provide support for the use of the IPIP-NEO and both 120-item IPIP-NEO measures as assessment tools for measurement of the five-factor model. (c) 2014 APA, all rights reserved.
Tool for Evaluating the Ways Nurses Assess Pain (TENAP): psychometric properties assessment.

PubMed

Ng, Siok Qi; Brammer, Jillian; Creedy, Debra K; Klainin-Yobas, Piyanee

2014-12-01

Elderly people with cognitive impairment are at risk for under-treatment of pain due to their inability to communicate. Poor knowledge and attitudes of nurses toward pain in this population may result in inadequate pain assessment. This study used a descriptive correlational design to develop and validate a tool to assess nurses' knowledge, attitudes, and reported practice of pain assessment in cognitively impaired elderly patients in acute care settings. The Tool for Evaluating the ways Nurses Assess Pain (TENAP) has two sections: (1) nurses' knowledge and attitudes about pain assessment and management and (2) two vignettes to assess reported practice. Content validity was established by an expert panel of three geriatric-trained nurse clinicians, and pilot tested with a convenience sample of 10 nurses. The psychometric properties were tested with a sample of 263 Registered and Enrolled nurses working in medical wards of two public hospitals in Singapore. The final version of TENAP comprised 29 items. Content validity index ranged from 0.84 to 1.00. The scale took 10 to 15 minutes to complete and items were easily understood. Results from the factor analysis suggested that Section A demonstrated one factor (13 items) while Section B had two distinct factors (16 items), one for each vignette, supporting construct validity of the scale. Cronbach's alphas for all factors were acceptable. TENAP was feasible, valid, and reliable for assessing nurses' knowledge, attitudes, and reported practice of pain assessment in cognitively-impaired elderly patients. Further testing of the tool with a larger sample of nurses in other practice contexts is needed. Copyright © 2014 American Society for Pain Management Nursing. Published by Elsevier Inc. All rights reserved.
A content validated questionnaire for assessment of self reported venous blood sampling practices

PubMed Central

2012-01-01

Background Venous blood sampling is a common procedure in health care. It is strictly regulated by national and international guidelines. Deviations from guidelines due to human mistakes can cause patient harm. Validated questionnaires for health care personnel can be used to assess preventable "near misses"--i.e. potential errors and nonconformities during venous blood sampling practices that could transform into adverse events. However, no validated questionnaire that assesses nonconformities in venous blood sampling has previously been presented. The aim was to test a recently developed questionnaire in self reported venous blood sampling practices for validity and reliability. Findings We developed a questionnaire to assess deviations from best practices during venous blood sampling. The questionnaire contained questions about patient identification, test request management, test tube labeling, test tube handling, information search procedures and frequencies of error reporting. For content validity, the questionnaire was confirmed by experts on questionnaires and venous blood sampling. For reliability, test-retest statistics were used on the questionnaire answered twice. The final venous blood sampling questionnaire included 19 questions out of which 9 had in total 34 underlying items. It was found to have content validity. The test-retest analysis demonstrated that the items were generally stable. In total, 82% of the items fulfilled the reliability acceptance criteria. Conclusions The questionnaire could be used for assessment of "near miss" practices that could jeopardize patient safety and offers several benefits over assessing rare adverse events only. The higher frequencies of "near miss" practices allow for quantitative analysis of the effect of corrective interventions and for benchmarking preanalytical quality not only at the laboratory/hospital level but also at the health care unit/hospital ward. PMID:22260505
A content validated questionnaire for assessment of self reported venous blood sampling practices.

PubMed

Bölenius, Karin; Brulin, Christine; Grankvist, Kjell; Lindkvist, Marie; Söderberg, Johan

2012-01-19

Venous blood sampling is a common procedure in health care. It is strictly regulated by national and international guidelines. Deviations from guidelines due to human mistakes can cause patient harm. Validated questionnaires for health care personnel can be used to assess preventable "near misses"--i.e. potential errors and nonconformities during venous blood sampling practices that could transform into adverse events. However, no validated questionnaire that assesses nonconformities in venous blood sampling has previously been presented. The aim was to test a recently developed questionnaire in self reported venous blood sampling practices for validity and reliability. We developed a questionnaire to assess deviations from best practices during venous blood sampling. The questionnaire contained questions about patient identification, test request management, test tube labeling, test tube handling, information search procedures and frequencies of error reporting. For content validity, the questionnaire was confirmed by experts on questionnaires and venous blood sampling. For reliability, test-retest statistics were used on the questionnaire answered twice. The final venous blood sampling questionnaire included 19 questions out of which 9 had in total 34 underlying items. It was found to have content validity. The test-retest analysis demonstrated that the items were generally stable. In total, 82% of the items fulfilled the reliability acceptance criteria. The questionnaire could be used for assessment of "near miss" practices that could jeopardize patient safety and offers several benefits over assessing rare adverse events only. The higher frequencies of "near miss" practices allow for quantitative analysis of the effect of corrective interventions and for benchmarking preanalytical quality not only at the laboratory/hospital level but also at the health care unit/hospital ward.
Validating the Assessment for Measuring Indonesian Secondary School Students Performance in Ecology

NASA Astrophysics Data System (ADS)

Rachmatullah, A.; Roshayanti, F.; Ha, M.

2017-09-01

The aims of the current study are to validate the American Association for the Advancement of Science (AAAS) Ecology assessment and to examine the performance of Indonesian secondary school students on the assessment. A total of 611 Indonesian secondary school students (218 middle school students and 393 high school students) participated in the study. Forty-five items of the AAAS assessment on the topic of Interdependence in Ecosystems were divided into two versions, each of which has 21 similar items. A linking-item method was used to combine the two versions of the assessment, and Rasch analyses were then used to validate the instrument. An independent-samples t-test was also run to compare the performance of Indonesian students and American students based on mean item difficulty. We found that, of the 45 items, three were identified as misfitting. We also found that both Indonesian middle and high school students performed significantly lower than American students, with very large and medium effect sizes. We will discuss our findings with regard to validation issues and the connection to Indonesian students’ science literacy.
Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form.

PubMed

Kisala, Pamela A; Tulsky, David S; Pace, Natalie; Victorson, David; Choi, Seung W; Heinemann, Allen W

2015-05-01

To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. SCI-QOL Stigma Item Bank. A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provides flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications.
Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form

PubMed Central

Kisala, Pamela A.; Tulsky, David S.; Pace, Natalie; Victorson, David; Choi, Seung W.; Heinemann, Allen W.

2015-01-01

Objective To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Design Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Setting Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants Adults with traumatic SCI. Main Outcome Measures SCI-QOL Stigma Item Bank Results A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. Conclusions The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provide flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications. PMID:26010973
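In the Graded Response Model named in the SCI-QOL Stigma records, each item has one slope and an ordered set of thresholds, and the probability of each response category is the difference between adjacent cumulative logistic curves. The following is a minimal sketch of that calculation under the usual logistic parameterization; the function names and the parameter values in the usage note are hypothetical and are not the calibrated SCI-QOL values.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under the Graded Response Model.

    theta      -- latent trait level of the respondent
    slope      -- item discrimination (a)
    thresholds -- increasing thresholds (b_1 < b_2 < ...); an item with
                  K response categories has K - 1 thresholds
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    # P(response >= k): 1 for the lowest category, 0 beyond the highest.
    cumulative = (
        [1.0]
        + [logistic(slope * (theta - b)) for b in thresholds]
        + [0.0]
    )
    # P(response == k) is the drop between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]
```

For a four-category item, `grm_category_probs(0.3, 1.5, [-1.0, 0.0, 1.0])` returns four probabilities that sum to 1. A steeper slope concentrates the probability mass in fewer categories at a given trait level, which is what makes high-slope items more informative.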
Methodology for developing and evaluating the PROMIS smoking item banks.

PubMed

Hansen, Mark; Cai, Li; Stucky, Brian D; Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando

2014-09-01

This article describes the procedures used in the PROMIS Smoking Initiative for the development and evaluation of item banks, short forms (SFs), and computerized adaptive tests (CATs) for the assessment of 6 constructs related to cigarette smoking: nicotine dependence, coping expectancies, emotional and sensory expectancies, health expectancies, psychosocial expectancies, and social motivations for smoking. Analyses were conducted using response data from a large national sample of smokers. Items related to each construct were subjected to extensive item factor analyses and evaluation of differential item functioning (DIF). Final item banks were calibrated, and SF assessments were developed for each construct. The performance of the SFs and the potential use of the item banks for CAT administration were examined through simulation study. Item selection based on dimensionality assessment and DIF analyses produced item banks that were essentially unidimensional in structure and free of bias. Simulation studies demonstrated that the constructs could be accurately measured with a relatively small number of carefully selected items, either through fixed SFs or CAT-based assessment. Illustrative results are presented, and subsequent articles provide detailed discussion of each item bank in turn. The development of the PROMIS smoking item banks provides researchers with new tools for measuring smoking-related constructs. The use of the calibrated item banks and suggested SF assessments will enhance the quality of score estimates, thus advancing smoking research. Moreover, the methods used in the current study, including innovative approaches to item selection and SF construction, may have general relevance to item bank development and evaluation. © The Author 2013. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
The Music Attentiveness Screening Assessment, Revised (MASA-R): A Study of Technical Adequacy.

PubMed

Waldon, Eric G; Lesser, Alexander; Weeden, Lydia; Messick, Emily

2016-01-01

Evidence suggests that attention is an important consideration when designing procedural support interventions for children undergoing distressing medical procedures. As such, the extent to which children can attend to musical stimuli used during music-based procedural support interventions would seem important. The Music Attentiveness Screening Assessment (MASA) was designed to assess a child's ability to attend to musical stimuli, but further revisions were deemed necessary to improve administration, test-retest reliability, and interobserver agreement for the measure's items. This study investigated the technical adequacy of the Music Attentiveness Screening Assessment, Revised (MASA-R), with a non-clinical sample of children aged 4 to 9 years by examining (a) Construct validity using comparator instruments measuring auditory attention; (b) Test-retest reliability following a two-week delay; and (c) Interobserver agreement when administered by two independent examiners. This non-clinical sample included 69 children who were administered both items from MASA-R and two comparator instruments: the Auditory Attention subtest from the NEPSY-II (NII-AA) for children aged 5 to 9 years (n = 47); and the Auditory Attention subtest from the Woodcock-Johnson Tests of Cognitive Abilities, 3rd ed. (WJIII-AA), for children aged 4 years (n = 22). A significant proportion of score variance was shared by both MASA-R items and the comparator measures: R² = .16, F(2, 66) = 6.30, p = .003. MASA-R score estimates with regard to test-retest reliability (Item I, intra-class correlation [ICC] = .88; Item II, ICC = .91) and interobserver agreement (Item I, ICC = .99; Item II, ICC = .98) also fell into acceptable ranges. Estimates of MASA-R score construct validity, test-retest reliability, and interobserver agreement appear improved over its predecessor, MASA.
While findings are promising, additional investigation of its use with a clinical sample is needed before it can be confidently used in pediatrics. © the American Music Therapy Association 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Examination of the PROMIS upper extremity item bank.

PubMed

Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

Clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
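The outfit mean square (MNSQ) cited in this record is the average squared standardized residual between observed and model-expected responses, so values near 1 indicate responses about as variable as the Rasch model predicts. The sketch below is a simplification that assumes a dichotomous Rasch model and known person and item parameters; the PROMIS items are polytomous, and the function names and inputs here are hypothetical.

```python
import math


def rasch_prob(theta, difficulty):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def outfit_mnsq(responses, thetas, difficulty):
    """Outfit mean square for one item.

    responses  -- observed 0/1 responses, one per person
    thetas     -- person measures, in the same order as `responses`
    difficulty -- item difficulty on the same logit scale
    """
    if len(responses) != len(thetas) or not responses:
        raise ValueError("responses and thetas must be non-empty and aligned")
    total = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, difficulty)
        # Squared residual divided by the model variance p * (1 - p).
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)
```

Because the statistic is an unweighted mean, a single very unexpected response (for instance, a positive response from a person located 3 logits below the item) contributes a large value and inflates outfit, which is why outfit is regarded as sensitive to outlying responses.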
Factor Structure of the Internet Addiction Test in Online Gamers and Poker Players.

PubMed

Khazaal, Yasser; Achab, Sophia; Billieux, Joel; Thorens, Gabriel; Zullino, Daniele; Dufour, Magali; Rothen, Stéphane

2015-01-01

The Internet Addiction Test (IAT) is the most widely used questionnaire to screen for problematic Internet use. Nevertheless, its factorial structure is still debated, which complicates comparisons among existing studies. Most previous studies were performed with students or community samples despite the probability of there being more problematic Internet use among users of specific applications, such as online gaming or gambling. The objective was to assess the factorial structure of a modified version of the IAT that addresses specific applications, such as video games and online poker. Two adult samples, one of Internet gamers (n=920) and one of online poker players (n=214), were recruited and completed an online version of the modified IAT. Both samples were split into two subsamples. Two principal component analyses (PCAs) followed by two confirmatory factor analyses (CFAs) were run separately. The results of principal component analysis indicated that a one-factor model fit the data well across both samples. In consideration of the weakness of some IAT items, a 17-item modified version of the IAT was proposed. This study assessed, for the first time, the factorial structure of a modified version of an Internet-administered IAT on a sample of Internet gamers and a sample of online poker players. The scale seems appropriate for the assessment of such online behaviors. Further studies on the modified 17-item IAT version are needed.
Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

PubMed

Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

2017-11-01

The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim of enhancing measurement precision. Here we present the results on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than that of the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.
Translation Fidelity of Psychological Scales: An Item Response Theory Analysis of an Individualism-Collectivism Scale.

ERIC Educational Resources Information Center

Bontempo, Robert

1993-01-01

Describes a method for assessing the quality of translations based on item response theory (IRT). Results from the IRT technique with French and Chinese versions of a scale measuring individualism-collectivism for samples of 250 U.S., 357 French, and 290 Chinese undergraduates show how several biased items are detected. (SLD)
Development and community-based validation of eight item banks to assess mental health.

PubMed

Batterham, Philip J; Sunderland, Matthew; Carragher, Natacha; Calear, Alison L

2016-09-30

There is a need for precise but brief screening of mental health problems in a range of settings. The development of item banks to assess depression and anxiety has resulted in new adaptive and static screeners that accurately assess severity of symptoms. However, expansion to a wider array of mental health problems is required. The current study developed item banks for eight mental health problems: social anxiety disorder, panic disorder, post-traumatic stress disorder, obsessive-compulsive disorder, adult attention-deficit hyperactivity disorder, drug use, psychosis and suicidality. The item banks were calibrated in a population-based Australian adult sample (N=3175) by administering large item pools (45-75 items) and excluding items on the basis of local dependence or measurement non-invariance. Item Response Theory parameters were estimated for each item bank using a two-parameter graded response model. Each bank consisted of 19-47 items, demonstrating excellent fit and precision across a range of -1 to 3 standard deviations from the mean. No previous study has developed such a broad range of mental health item banks. The calibrated item banks will form the basis of a new system of static and adaptive measures to screen for a broad array of mental health problems in the community. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
An item response theory evaluation of three depression assessment instruments in a clinical sample.

PubMed

Adler, Mats; Hetta, Jerker; Isacsson, Göran; Brodin, Ulf

2012-06-21

This study investigates whether an analysis, based on Item Response Theory (IRT), can be used for initial evaluations of depression assessment instruments in a limited patient sample from an affective disorder outpatient clinic, with the aim of finding major advantages and deficiencies of the instruments. Three depression assessment instruments, the depression module from the Patient Health Questionnaire (PHQ9), the depression subscale of Affective Self Rating Scale (AS-18-D) and the Montgomery-Åsberg Depression Rating Scale (MADRS) were evaluated in a sample of 61 patients with affective disorder diagnoses, mainly bipolar disorder. A '3-step IRT strategy' was used. In a first step, the Mokken non-parametric analysis showed that PHQ9 and AS-18-D had strong overall scalabilities of 0.510 [C.I. 0.42, 0.61] and 0.513 [C.I. 0.41, 0.63] respectively, while MADRS had a weak scalability of 0.339 [C.I. 0.25, 0.43]. In a second step, a Rasch model analysis indicated large differences concerning the item discriminating capacity and was therefore considered not suitable for the data. In a third step, applying a more flexible two-parameter model, all three instruments showed large differences in item information and items had a low capacity to reliably measure respondents at low levels of depression severity. We conclude that a stepwise IRT-approach, as performed in this study, is a suitable tool for studying assessment instruments at early stages of development. Such an analysis can give useful information, even in small samples, in order to construct more precise measurements or to evaluate existing assessment instruments. The study suggests that the PHQ9 and AS-18-D can be useful for measurement of depression severity in an outpatient clinic for affective disorder, while the MADRS shows weak measurement properties for this type of patients.
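As an illustration of what "item information" means in the two-parameter step of the record above, the sketch below computes the information function of a dichotomous two-parameter logistic (2PL) item. This is a simplification made for illustration only: the instruments in the study use polytomous items, and the item names and parameter values here are hypothetical, not taken from the article.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at latent severity theta.

    a is the discrimination parameter, b the location (severity) parameter.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical items (assumed values, not estimates from the study).
items = {
    "low_severity_item": (1.5, -1.0),
    "high_severity_item": (1.5, 2.0),
}

for theta in (-2.0, 0.0, 2.0):
    row = {name: round(item_information(theta, a, b), 3)
           for name, (a, b) in items.items()}
    print(theta, row)
```

An item is most informative near its own location parameter, so a scale whose items all sit at high severity gives little information about respondents at low severity, which is the pattern the abstract reports.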
GAP-REACH

PubMed Central

Lewis-Fernández, Roberto; Raggio, Greer A.; Gorritz, Magdaliz; Duan, Naihua; Marcus, Sue; Cabassa, Leopoldo J.; Humensky, Jennifer; Becker, Anne E.; Alarcón, Renato D.; Oquendo, María A.; Hansen, Helena; Like, Robert C.; Weiss, Mitchell; Desai, Prakash N.; Jacobsen, Frederick M.; Foulks, Edward F.; Primm, Annelle; Lu, Francis; Kopelowicz, Alex; Hinton, Ladson; Hinton, Devon E.

2015-01-01

Growing awareness of health and health care disparities highlights the importance of including information about race, ethnicity, and culture (REC) in health research. Reporting of REC factors in research publications, however, is notoriously imprecise and unsystematic. This article describes the development of a checklist to assess the comprehensiveness and the applicability of REC factor reporting in psychiatric research publications. The 16-item GAP-REACH© checklist was developed through a rigorous process of expert consensus, empirical content analysis in a sample of publications (N = 1205), and interrater reliability (IRR) assessment (N = 30). The items assess each section in the conventional structure of a health research article. Data from the assessment may be considered on an item-by-item basis or as a total score ranging from 0% to 100%. The final checklist has excellent IRR (κ = 0.91). The GAP-REACH may be used by multiple research stakeholders to assess the scope of REC reporting in a research article. PMID:24080673
Measuring the Quality of Life of Visually Impaired Children: First Stage Psychometric Evaluation of the Novel VQoL_CYP Instrument.

PubMed

Tadić, Valerija; Cooper, Andrew; Cumberland, Phillippa; Lewando-Hundt, Gillian; Rahi, Jugnoo S

2016-01-01

To report piloting and initial validation of the VQoL_CYP, a novel age-appropriate vision-related quality of life (VQoL) instrument for self-reporting by children with visual impairment (VI). Participants were a random patient sample of children with VI aged 10-15 years. 69 patients, drawn from patient databases at Great Ormond Street Hospital and Moorfields Eye Hospital, United Kingdom, participated in piloting of the draft 47-item VQoL instrument, which enabled preliminary item reduction. Subsequent administration of the instrument, alongside functional vision (FV) and generic health-related quality of life (HRQoL) self-report measures, to 101 children with VI comprising a nationally representative sample enabled further item reduction and evaluation of psychometric properties using Rasch analysis. Construct validity was assessed through Pearson correlation coefficients. Item reduction through piloting (8 items removed for skewness and individual item response pattern) and validation (1 item removed for skewness and 3 for misfit in Rasch) produced a 35-item scale, with fit values within acceptable limits, no notable differential item functioning, good measurement precision, ordered response categories and acceptable targeting in Rasch. The VQoL_CYP showed good construct validity, correlating strongly with HRQoL scores, moderately with FV scores but not with acuity. Robust child-appropriate self-report VQoL measures for children with VI are necessary for understanding the broader impacts of living with a visual disability, distinguishing these from limited functioning per se. Future planned use in larger patient samples will allow further psychometric development of the VQoL_CYP as an adjunct to objective outcomes assessment.
A 7-item version of the fatigue severity scale has better psychometric properties among HIV-infected adults: an application of a Rasch model.

PubMed

Lerdal, Anners; Kottorp, Anders; Gay, Caryl; Aouizerat, Bradley E; Portillo, Carmen J; Lee, Kathryn A

2011-11-01

To examine the psychometric properties of the 9-item Fatigue Severity Scale (FSS) using a Rasch model application. A convenience sample of HIV-infected adults was recruited, and a subset of the sample was assessed at 6-month intervals for 2 years. Socio-demographic, clinical, and symptom data were collected by self-report questionnaires. CD4 T-cell count and viral load measures were obtained from medical records. The Rasch analysis included 316 participants with 698 valid questionnaires. FSS item 2 did not advance monotonically, and items 1 and 2 did not show acceptable goodness-of-fit to the Rasch model. A reduced FSS 7-item version demonstrated acceptable goodness-of-fit and explained 61.2% of the total variance in the scale. In the FSS-7 item version, no uniform Differential Item Functioning was found in relation to time of evaluation or to any of the socio-demographic or clinical variables. This study demonstrated that the FSS-7 has better psychometric properties than the FSS-9 in this HIV sample and that responses to the different items are comparable over time and unrelated to socio-demographic and clinical variables.
INTRODUCTION TO PATIENT-REPORTED OUTCOME ITEM BANKS: ISSUES IN MINORITY AGING RESEARCH

PubMed Central

Templin, Thomas N; Hays, Ron D; Gershon, Richard C; Rothrock, Nan; Jones, Richard N; Teresi, Jeanne A; Stewart, Anita; Weech-Maldonado, Robert; Wallace, Steve

2014-01-01

In 2004 NIH awarded contracts to initiate the development of high quality psychological and neuropsychological outcome measures for improved assessment of health-related outcomes. The workshop introduced these measurement development initiatives, the measures created, and the NIH supported resource (Assessment Center) for internet or tablet-based test administration and scoring. Presentation covered: (a) item response theory (IRT) and assessment of test bias, (b) construction of item banks and computerized adaptive testing, and (c) the different ways in which qualitative analyses contribute to the definition of construct domains and the refinement of outcome constructs. The panel discussion included questions about representativeness of samples, and assessment of cultural bias. PMID:23570428
Testing whether the DSM-5 personality disorder trait model can be measured with a reduced set of items: An item response theory investigation of the Personality Inventory for DSM-5.

PubMed

Maples, Jessica L; Carter, Nathan T; Few, Lauren R; Crego, Cristina; Gore, Whitney L; Samuel, Douglas B; Williamson, Rachel L; Lynam, Donald R; Widiger, Thomas A; Markon, Kristian E; Krueger, Robert F; Miller, Joshua D

2015-12-01

The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) includes an alternative model of personality disorders (PDs) in Section III, consisting in part of a pathological personality trait model. To date, the 220-item Personality Inventory for DSM-5 (PID-5; Krueger, Derringer, Markon, Watson, & Skodol, 2012) is the only extant self-report instrument explicitly developed to measure this pathological trait model. The present study used item response theory-based analyses in a large sample (n = 1,417) to investigate whether a reduced set of 100 items could be identified from the PID-5 that could measure the 25 traits and 5 domains. This reduced set of PID-5 items was then tested in a community sample of adults currently receiving psychological treatment (n = 109). Across a wide range of criterion variables including NEO PI-R domains and facets, DSM-5 Section II PD scores, and externalizing and internalizing outcomes, the correlational profiles of the original and reduced versions of the PID-5 were nearly identical (rICC = .995). These results provide strong support for the hypothesis that an abbreviated set of PID-5 items can be used to reliably, validly, and efficiently assess these personality disorder traits. The ability to assess the DSM-5 Section III traits using only 100 items has important implications in that it suggests these traits could still be measured in settings in which assessment-related resources (e.g., time, compensation) are limited. (c) 2015 APA, all rights reserved.
An Investigation of the Measurement Properties of the Spot-the-Word Test In a Community Sample

ERIC Educational Resources Information Center

Mackinnon, Andrew; Christensen, Helen

2007-01-01

Intellectual ability is assessed with the Spot-the-Word (STW) test (A. Baddeley, H. Emslie, & I. Nimmo Smith, 1993) by asking respondents to identify a word in a word-nonword item pair. Results in moderate-sized samples suggest this ability is resistant to decline due to dementia. The authors used a 3-parameter item response theory model to…
Missouri Assessment Program, Spring 2002: Social Studies, Grade 8. Released Items [and] Scoring Guide.

ERIC Educational Resources Information Center

Missouri State Dept. of Elementary and Secondary Education, Jefferson City.

This booklet contains sample items from the Missouri social studies test for eighth graders. The first sample is based on a speech delivered by Elizabeth Cady Stanton in the mid-1880s, which proposed a new approach to raising girls. Students are directed to use their own knowledge and the speech excerpt to do three activities. The second sample…
Psychometric Properties of Reverse-Scored Items on the CES-D in a Sample of Ethnically Diverse Older Adults

ERIC Educational Resources Information Center

Carlson, Mike; Wilcox, Rand; Chou, Chih-Ping; Chang, Megan; Yang, Frances; Blanchard, Jeanine; Marterella, Abbey; Kuo, Ann; Clark, Florence

2011-01-01

Reverse-scored items on assessment scales increase cognitive processing demands and may therefore lead to measurement problems for older adult respondents. In this study, the objective was to examine possible psychometric inadequacies of reverse-scored items on the Center for Epidemiologic Studies Depression Scale (CES-D) when used to assess…
A cross-cultural study to assess measurement invariance of the KIDSCREEN-27 questionnaire across Serbian and Iranian children and adolescents.

PubMed

Stevanovic, Dejan; Jafari, Peyman

2015-01-01

The KIDSCREEN questionnaire for health-related quality of life (HRQOL) assessments in children and adolescents was simultaneously developed across 13 European countries, and it was subsequently translated and culturally adapted to over 30 different languages across the world. The aim of this study was to evaluate the measurement equivalence of the KIDSCREEN-27 across Serbian and Iranian children and adolescents. The items in the KIDSCREEN-27 were analyzed for differential item functioning (DIF) across Iranian and Serbian populations using ordinal logistic regression with three different criteria. The sample included 330 Iranian and 329 Serbian children and adolescents and 330 and 314 of their parents, respectively. Across the two samples, DIF was detected in 16 (59%) of 27 items in the child self-reports and in 20 (74%) of 27 items in the parent/proxy report. However, using alternative criteria based on magnitude detected for DIF, only three items in the parent/proxy report showed significant DIF. Our study provided more evidence that the KIDSCREEN-27 possesses DIF items across different cultures, but their impact is probably small, and the questionnaire could be used for cross-cultural HRQOL comparisons.
Australian oral health case notes: assessment of forensic relevance and adherence to recording guidelines.

PubMed

Stow, L; James, H; Richards, L

2016-06-01

Dental case notes record clinical diagnoses and treatments, as well as providing continuity of patient care. They are also used for dento-legal litigation and forensic purposes. Maintaining accurate and comprehensive dental patient records is a dental worker's ethical and legal obligation. Australian registered specialist forensic odontologists were surveyed to determine the relevance of recorded case note items for dental identification. A dental case notes sample was assessed for adherence with odontologist nominated forensic value and compiled professional record keeping guidelines of forensic relevance. Frequency of item recording, confidence interval, examiner agreement and statistical significance were determined. Broad agreement existed between forensic odontologists as to which recorded dental items have most forensic relevance. Inclusion frequency of these items in sampled case notes varied widely (e.g. single area radiographic view present in 75%, CI = 65.65-82.50; completed odontogram in 56%, CI = 46.23-65.33). Recording of information specified by professional record keeping guidelines also varied, although overall inclusion was higher than for forensically desired items (e.g. patient's full name in 99%, CI = 94.01 - >99.99; named treating practitioner in 23%, CI = 15.78-32.31). Many sampled dental case notes lacked details identified as being valuable by forensic specialists and as specified by professional record keeping guidelines. © 2016 Australian Dental Association.
Assessment of the structure of the Hospital Anxiety and Depression Scale in musculoskeletal patients

PubMed Central

Pallant, Julie F; Bailey, Catherine M

2005-01-01

Background: Research suggests there is a high prevalence of anxiety and depression amongst patients with chronic musculoskeletal pain, which can influence the effectiveness of rehabilitation programs. It is therefore important for clinicians involved in musculoskeletal rehabilitation programs to consider screening patients for elevated levels of anxiety and depression and to provide appropriate counselling or treatment where necessary. The HADS has been used as a screening tool for assessment of anxiety and depression in a wide variety of clinical groups. Recent research, however, has questioned its suitability for use with some patient groups due to problems with dimensionality and the behaviour of individual items. The aim of this study is to assess the underlying structure and psychometric properties of the HADS among patients attending musculoskeletal rehabilitation. Methods: Data were obtained from 296 patients attending an outpatient musculoskeletal pain clinic. The total sample was used to identify the proportion of patients with elevated levels of anxiety and depression. Half the sample (n = 142) was used for exploratory factor analysis (EFA), with the holdout sample (n = 154) used for confirmatory factor analysis (CFA) to explore the underlying structure of the scale. Results: A substantial proportion of patients were classified as probable cases on the HADS Anxiety subscale (38.2%) and HADS Depression subscale (30.1%), with the sample recording higher mean HADS subscale scores than many other patient groups (breast cancer, end-stage renal disease, heart disease) reported in the literature. EFA supported a two-factor structure (representing anxiety and depression) as proposed by the scale's authors; however, item 7 (an anxiety item) failed to load appropriately. Removing item 7 resulted in a clear two-factor solution in both EFA and CFA.
Conclusion: The high levels of anxiety and depression detected in this sample suggest that screening for psychological comorbidity is important in musculoskeletal rehabilitation settings. It is necessary for clinicians who are considering using the HADS as a screening tool to first assess its suitability with their particular patient group. Although EFA and CFA supported the presence of two subscales representing anxiety and depression, the results with this musculoskeletal sample suggest that item 7 should be removed from the anxiety subscale. PMID:16364179
The Dimensional Assessment of Personality Psychopathology Basic Questionnaire: shortened versions item analysis.

PubMed

Aluja, Anton; Blanch, Àngel; Blanco, Eduardo; Martí-Guiu, Maite; Balada, Ferran

2015-01-13

This study has been designed to evaluate and replicate the psychometric properties of the Dimensional Assessment of Personality Psychopathology-Basic Questionnaire (DAPP-BQ) and the DAPP-BQ short form (DAPP-SF) in a large Spanish general population sample. Additionally, we have generated a reduced form called DAPP-90, using a strategy based on a structural equation modeling (SEM) methodology in two independent samples, a calibration and a validation sample. The DAPP-90 scales obtained a more satisfactory fit on SEM adjustment values (average: TLI > .97 and RMSEA < .04) with respect to the full DAPP-BQ and the 136-item version. According to the factorial congruency coefficients, the DAPP-90 obtains a similar structure to the DAPP-BQ and the DAPP-SF. The DAPP-90 internal consistency is acceptable, with a Cronbach's alpha mean of .75. We did not find any differences in the pattern of relations between the two DAPP-BQ shortened versions and the SCL-90-R factors. The new 90-item version is especially useful when it is difficult to use the long version for diverse reasons, such as the assessment of patients in hospital consultation or in brief psychological assessments.
A Chinese version of the revised Nurses Professional Values Scale: reliability and validity assessment.

PubMed

Lin, Yu-Hua; Wang, Liching Sung

2010-08-01

The purpose of this study was to assess the reliability and validity of a Chinese version of the revised nurses professional values scale (NPVS-R). Convenience sampling, including senior undergraduate nursing students (n=110) and clinical nurses (n=223), was applied to recruit appropriate samples from southern Taiwan. The revised nurses professional values scale (NPVS-R) was used in this study. Content validity, construct validity, internal consistency, and reliability were assessed. The final sample consisted of 286 subjects. Three factors were detected in the results, accounting for 60.12% of the explained variance. The first factor was titled professionalism, and included 13 items. The second factor was named caring, and consisted of seven items. Activism was the third factor, which included six items. Overall Cronbach's alpha coefficient was 0.90, taken from values for each of the three factors of 0.88, 0.90, and 0.81, respectively. The Chinese version of the NPVS-R can be considered a reliable and valid scale for assigning values that can mark professionalism in Taiwanese nurses. Copyright 2009 Elsevier Ltd. All rights reserved.
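For readers unfamiliar with the internal-consistency statistic reported in the record above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is hypothetical illustration data, not data from the NPVS-R study, and the function name is an assumption made for this example.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point responses from four respondents to three items.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha approaches 1 when items vary together across respondents and falls toward 0 when item scores are unrelated, which is why it is read as an index of internal consistency rather than of validity.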
Development and validation of brief scales to measure emotional and behavioural problems among Chinese adolescents

PubMed Central

Shen, Minxue; Hu, Ming; Sun, Zhenqiu

2017-01-01

Objectives: To develop and validate brief scales to measure common emotional and behavioural problems among adolescents in the examination-oriented education system and collectivistic culture of China. Setting: Middle schools in Hunan province. Participants: 5442 middle school students aged 11–19 years were sampled. 4727 valid questionnaires were collected and used for validation of the scales. The final sample included 2408 boys and 2319 girls. Primary and secondary outcome measures: The tools were assessed by the item response theory, classical test theory (reliability and construct validity) and differential item functioning. Results: Four scales to measure anxiety, depression, study problem and sociality problem were established. Exploratory factor analysis showed that each scale had two solutions. Confirmatory factor analysis showed acceptable to good model fit for each scale. Internal consistency and test–retest reliability of all scales were above 0.7. Item response theory showed that all items had acceptable discrimination parameters and most items had appropriate difficulty parameters. 10 items demonstrated differential item functioning with respect to gender. Conclusions: Four brief scales were developed and validated among adolescents in middle schools of China. The scales have good psychometric properties with minor differential item functioning. They can be used in middle school settings, and will help school officials to assess the students’ emotional/behavioural problems. PMID:28062469

Commutability of food microbiology proficiency testing samples.

PubMed

Abdelmassih, M; Polet, M; Goffaux, M-J; Planchon, V; Dierick, K; Mahillon, J

2014-03-01

Food microbiology proficiency testing (PT) is a useful tool to assess the analytical performances among laboratories. PT items should be close to routine samples to accurately evaluate the acceptability of the methods. However, most PT providers distribute exclusively artificial samples such as reference materials or irradiated foods. This raises the issue of the suitability of these samples because the equivalence, or 'commutability', between results obtained on artificial vs. authentic food samples has not been demonstrated. In the clinical field, the use of noncommutable PT samples has led to erroneous evaluation of the performances when different analytical methods were used. This study aimed to provide a first assessment of the commutability of samples distributed in food microbiology PT. REQUASUD and IPH organized 13 food microbiology PTs including 10-28 participants. Three types of PT items were used: genuine food samples, sterile food samples and reference materials. The commutability of the artificial samples (reference material or sterile samples) was assessed by plotting the distribution of the results on natural and artificial PT samples. This comparison highlighted matrix-correlated issues when nonfood matrices, such as reference materials, were used. Artificially inoculated food samples, on the other hand, raised only isolated commutability issues. In the organization of a PT-scheme, authentic or artificially inoculated food samples are necessary to accurately evaluate the analytical performances. Reference materials, used as PT items because of their convenience, may present commutability issues leading to inaccurate penalizing conclusions for methods that would have provided accurate results on food samples. For the first time, the commutability of food microbiology PT samples was investigated. The nature of the samples provided by the organizer turned out to be an important factor because matrix effects can impact on the analytical results.
© 2013 The Society for Applied Microbiology.
The Validity of a New Structured Assessment of Gastrointestinal Symptoms Scale (SAGIS) for Evaluating Symptoms in the Clinical Setting.

PubMed

Koloski, N A; Jones, M; Hammer, J; von Wulffen, M; Shah, A; Hoelz, H; Kutyla, M; Burger, D; Martin, N; Gurusamy, S R; Talley, N J; Holtmann, G

2017-08-01

The clinical assessments of patients with gastrointestinal symptoms can be time-consuming, and the symptoms captured during the consultation may be influenced by a variety of patient and non-patient factors. To facilitate standardized symptom assessment in the routine clinical setting, we developed the Structured Assessment of Gastrointestinal Symptom (SAGIS) instrument to precisely characterize symptoms in a routine clinical setting. We aimed to validate SAGIS including its reliability, construct and discriminant validity, and utility in the clinical setting. Development of the SAGIS consisted of initial interviews with patients referred for the diagnostic work-up of digestive symptoms and relevant complaints identified. The final instrument consisted of 22 items as well as questions on extra intestinal symptoms and was given to 1120 consecutive patients attending a gastroenterology clinic randomly split into derivation (n = 596) and validation datasets (n = 551). Discriminant validity along with test-retest reliability was assessed. The time taken to perform a clinical assessment with and without the SAGIS was recorded along with doctor satisfaction with this tool. Exploratory factor analysis conducted on the derivation sample suggested five symptom constructs labeled as abdominal pain/discomfort (seven items), gastroesophageal reflux disease/regurgitation symptoms (four items), nausea/vomiting (three items), diarrhea/incontinence (five items), and difficult defecation and constipation (two items). Confirmatory factor analysis conducted on the validation sample supported the initially developed five-factor measurement model ([Formula: see text], p < 0.0001, χ²/df = 4.6, CFI = 0.90, TLI = 0.88, RMSEA = 0.08). All symptom groups demonstrated differentiation between disease groups. The SAGIS was shown to be reliable over time and resulted in a 38% reduction of the time required for clinical assessment.
The SAGIS instrument has excellent psychometric properties and supports the clinical assessment of and symptom-based categorization of patients with a wide spectrum of gastrointestinal symptoms.
Development of the PROMIS health expectancies of smoking item banks.

PubMed

Edelen, Maria Orlando; Tucker, Joan S; Shadel, William G; Stucky, Brian D; Cerully, Jennifer; Li, Zhen; Hansen, Mark; Cai, Li

2014-09-01

Smokers' health-related outcome expectancies are associated with a number of important constructs in smoking research, yet there are no measures currently available that focus exclusively on this domain. This paper describes the development and evaluation of item banks for assessing the health expectancies of smoking. Using data from a sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning analyses (according to gender, age, and race/ethnicity) to arrive at a unidimensional set of health expectancies items for daily and nondaily smokers. We also evaluated the performance of short forms (SFs) and computer adaptive tests (CATs) to efficiently assess health expectancies. A total of 24 items were included in the Health Expectancies item banks; 13 items are common across daily and nondaily smokers, 6 are unique to daily, and 5 are unique to nondaily. For both daily and nondaily smokers, the Health Expectancies item banks are unidimensional, reliable (reliability = 0.95 and 0.96, respectively), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.87). Results from simulated CATs showed that health expectancies can be assessed with good precision with an average of 5-6 items adaptively selected from the item banks. Health expectancies of smoking can be assessed on the basis of these item banks via SFs, CATs, or through a tailored set of items selected for a specific research purpose. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Development of the PROMIS nicotine dependence item banks.

PubMed

Shadel, William G; Edelen, Maria Orlando; Tucker, Joan S; Stucky, Brian D; Hansen, Mark; Cai, Li

2014-09-01

Nicotine dependence is a core construct important for understanding cigarette smoking and smoking cessation behavior. This article describes analyses conducted to develop and evaluate item banks for assessing nicotine dependence among daily and nondaily smokers. Using data from a sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning analyses (according to gender, age, and race/ethnicity) to arrive at a unidimensional set of nicotine dependence items for daily and nondaily smokers. We also evaluated performance of short forms (SFs) and computer adaptive tests (CATs) to efficiently assess dependence. A total of 32 items were included in the Nicotine Dependence item banks; 22 items are common across daily and nondaily smokers, 5 are unique to daily smokers, and 5 are unique to nondaily smokers. For both daily and nondaily smokers, the Nicotine Dependence item banks are strongly unidimensional, highly reliable (reliability = 0.97 and 0.97, respectively), and perform similarly across gender, age, and race/ethnicity groups. SFs common to daily and nondaily smokers consist of 8 and 4 items (reliability = 0.91 and 0.81, respectively). Results from simulated CATs showed that dependence can be assessed with very good precision for most respondents using fewer than 6 items adaptively selected from the item banks. Nicotine dependence on cigarettes can be assessed on the basis of these item banks via one of the SFs, by using CATs, or through a tailored set of items selected for a specific research purpose. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Development of the PROMIS negative psychosocial expectancies of smoking item banks.

PubMed

Stucky, Brian D; Edelen, Maria Orlando; Tucker, Joan S; Shadel, William G; Cerully, Jennifer; Kuhfeld, Megan; Hansen, Mark; Cai, Li

2014-09-01

Negative psychosocial expectancies of smoking include aspects of social disapproval and disappointment in oneself. This paper describes analyses conducted to develop and evaluate item banks for assessing psychosocial expectancies among daily and nondaily smokers. Using data from a sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning analyses (according to gender, age, and race/ethnicity) to arrive at a unidimensional set of psychosocial expectancies items for daily and nondaily smokers. We also evaluated performance of short forms (SFs) and computer adaptive tests (CATs) to efficiently assess psychosocial expectancies. A total of 21 items were included in the Psychosocial Expectancies item banks: 14 items are common across daily and nondaily smokers, 6 are unique to daily, and 1 is unique to nondaily. For both daily and nondaily smokers, the Psychosocial Expectancies item banks are strongly unidimensional, highly reliable (reliability = 0.95 and 0.93, respectively), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.85). Results from simulated CATs showed that, on average, fewer than 8 items are needed to assess psychosocial expectancies with adequate precision when using the item banks. Psychosocial expectancies of smoking can be assessed on the basis of these item banks via the SF, by using CAT, or through a tailored set of items selected for a specific research purpose. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Development of an item bank for the assessment of depression in persons with mental illnesses and physical diseases using Rasch analysis.

PubMed

Forkmann, Thomas; Boecker, Maren; Norra, Christine; Eberle, Nicole; Kircher, Tilo; Schauerte, Patrick; Mischke, Karl; Westhofen, Martin; Gauggel, Siegfried; Wirtz, Markus

2009-05-01

The calibration of item banks provides the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. The present study aimed at developing a new item bank that allows for assessing depression in persons with mental and persons with somatic diseases. The sample consisted of 161 participants treated for a depressive syndrome, and 206 participants with somatic illnesses (103 cardiologic, 103 otorhinolaryngologic; overall mean age = 44.1 years, SD = 14.0; 44.7% women) to allow for validation of the item bank in both groups. Persons answered a pool of 182 depression items on a 5-point Likert scale. Evaluation of Rasch model fit (infit < 1.3), differential item functioning, dimensionality, local independence, item spread, item and person separation (>2.0), and reliability (>.80) resulted in a bank of 79 items with good psychometric properties. The bank provides items with a wide range of content coverage and may serve as a sound basis for computerized adaptive testing applications. It might also be useful for researchers who wish to develop new fixed-length scales for the assessment of depression in specific rehabilitation settings. (PsycINFO Database Record (c) 2009 APA, all rights reserved).
GAP-REACH: a checklist to assess comprehensive reporting of race, ethnicity, and culture in psychiatric publications.

PubMed

Lewis-Fernández, Roberto; Raggio, Greer A; Gorritz, Magdaliz; Duan, Naihua; Marcus, Sue; Cabassa, Leopoldo J; Humensky, Jennifer; Becker, Anne E; Alarcón, Renato D; Oquendo, María A; Hansen, Helena; Like, Robert C; Weiss, Mitchell; Desai, Prakash N; Jacobsen, Frederick M; Foulks, Edward F; Primm, Annelle; Lu, Francis; Kopelowicz, Alex; Hinton, Ladson; Hinton, Devon E

2013-10-01

Growing awareness of health and health care disparities highlights the importance of including information about race, ethnicity, and culture (REC) in health research. Reporting of REC factors in research publications, however, is notoriously imprecise and unsystematic. This article describes the development of a checklist to assess the comprehensiveness and the applicability of REC factor reporting in psychiatric research publications. The 16-item GAP-REACH checklist was developed through a rigorous process of expert consensus, empirical content analysis in a sample of publications (N = 1205), and interrater reliability (IRR) assessment (N = 30). The items assess each section in the conventional structure of a health research article. Data from the assessment may be considered on an item-by-item basis or as a total score ranging from 0% to 100%. The final checklist has excellent IRR (κ = 0.91). The GAP-REACH may be used by multiple research stakeholders to assess the scope of REC reporting in a research article.
Validation of the MedUseQ: A Self-Administered Screener for Older Adults to Assess Medication Use Problems.

PubMed

Berman, Rebecca L; Iris, Madelyn; Conrad, Kendon J; Robinson, Carrie

2018-01-01

Older adults taking multiple prescription and nonprescription drugs are at risk for medication use problems, yet there are few brief, self-administered screening tools designed specifically for them. The study objective was to develop and validate a patient-centered screener for community-dwelling older adults. In phase 1, a convenience sample of 57 stakeholders (older adults, pharmacists, nurses, and physicians) participated in concept mapping, using Concept System® Global MAX™, to identify items for a questionnaire. In phase 2, a 40-item questionnaire was tested with a convenience sample of 377 adults and a 24-item version was tested with 306 older adults, aged 55 and older, using Rasch methodology. In phase 3, stakeholder focus groups provided feedback on the format of questionnaire materials and recommended strategies for addressing problems. The concept map contained 72 statements organized into 6 conceptual clusters or domains. The 24-item screener was unidimensional. Cronbach's alpha was .87, person reliability was acceptable (.74), and item reliability was high (.96). The MedUseQ is a validated, patient-centered tool targeting older adults that can be used to assess a wide range of medication use problems in clinical and community settings and to identify areas for education, intervention, or further assessment.
Rasch analysis of the Chedoke-McMaster Attitudes towards Children with Handicaps scale.

PubMed

Armstrong, Megan; Morris, Christopher; Tarrant, Mark; Abraham, Charles; Horton, Mike C

2017-02-01

Aim: To assess whether the Chedoke-McMaster Attitudes towards Children with Handicaps (CATCH) 36-item total scale and subscales fit the unidimensional Rasch model. Method: The CATCH was administered to 1881 children, aged 7-16 years in a cross-sectional survey. Data were used from a random sample of 416 for the initial Rasch analysis. The analysis was performed on the 36-item scale and then separately for each subscale. The analysis explored fit to the Rasch model in terms of overall scale fit, individual item fit, item response categories, and unidimensionality. Item bias for gender and school level was also assessed. Revised scales were then tested on an independent second random sample of 415 children. Results: Analyses indicated that the 36-item overall scale was not unidimensional and did not fit the Rasch model. Two scales of affective attitudes and behavioural intention were retained after four items were removed from each due to misfit to the Rasch model. Additionally, the scaling was improved when the two most negative response categories were aggregated. There was no item bias by gender or school level on the revised scales. Items assessing cognitive attitudes did not fit the Rasch model and had low internal consistency as a scale. Conclusion: Affective attitudes and behavioural intention CATCH sub-scales should be treated separately. Caution should be exercised when using the cognitive subscale. Implications for Rehabilitation: The 36-item Chedoke-McMaster Attitudes towards Children with Handicaps (CATCH) scale as a whole did not fit the Rasch model, thus indicating a multi-dimensional scale. Researchers should use two revised eight-item subscales of affective attitudes and behavioural intentions when exploring interventions aiming to improve children's attitudes towards disabled people or factors associated with those attitudes. Researchers should use the cognitive subscale with caution, as it did not create a unidimensional and internally consistent scale. Therefore, conclusions drawn from this scale may not accurately reflect children's attitudes.
Validation and psychometric properties of the Somatic and Psychological HEalth REport (SPHERE) in a young Australian-based population sample using non-parametric item response theory.

PubMed

Couvy-Duchesne, Baptiste; Davenport, Tracey A; Martin, Nicholas G; Wright, Margaret J; Hickie, Ian B

2017-08-01

The Somatic and Psychological HEalth REport (SPHERE) is a 34-item self-report questionnaire that assesses symptoms of mental distress and persistent fatigue. As it was developed as a screening instrument for use mainly in primary care-based clinical settings, its validity and psychometric properties have not been studied extensively in population-based samples. We used non-parametric Item Response Theory to assess scale validity and item properties of the SPHERE-34 scales, collected through four waves of the Brisbane Longitudinal Twin Study (N = 1707, mean age = 12, 51% females; N = 1273, mean age = 14, 50% females; N = 1513, mean age = 16, 54% females; N = 1263, mean age = 18, 56% females). We estimated the heritability of the new scores, their genetic correlation, and their predictive ability in a sub-sample (N = 1993) who completed the Composite International Diagnostic Interview. After excluding items most responsible for noise, sex or wave bias, the SPHERE-34 questionnaire was reduced to 21 items (SPHERE-21), comprising a 14-item scale for anxiety-depression and a 10-item scale for chronic fatigue (3 items overlapping). These new scores showed high internal consistency (alpha > 0.78), moderate three-month reliability (ICC = 0.47-0.58) and item scalability (Hi > 0.23), and were positively correlated (phenotypic correlations r = 0.57-0.70; rG = 0.77-1.00). Heritability estimates ranged from 0.27 to 0.51. In addition, both scores were associated with later DSM-IV diagnoses of MDD, social anxiety and alcohol dependence (OR = 1.23-1.47). Finally, a post-hoc comparison showed that several psychometric properties of the SPHERE-21 were similar to those of the Beck Depression Inventory. The scales of SPHERE-21 measure valid and comparable constructs across sex and age groups (from 9 to 28 years). SPHERE-21 scores are heritable, genetically correlated and show good predictive ability of mental health in an Australian-based population sample of young people.
Better assessment of physical function: item improvement is neglected but essential

PubMed Central

2009-01-01

Introduction: Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. Methods: The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. Results: We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Conclusions: Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes. PMID:20015354
Better assessment of physical function: item improvement is neglected but essential.

PubMed

Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

2009-01-01

Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes.
Development and validation of oral health-related early childhood quality of life tool for North Indian preschool children.

PubMed

Mathur, Vijay Prakash; Dhillon, Jatinder Kaur; Logani, Ajay; Agarwal, Ramesh

2014-01-01

The purpose of this study was to develop a reliable instrument [Oral Health related Early Childhood Quality of Life (OH-ECQOL) scale] for measuring oral health related quality of life (OHrQoL) in preschool children in a North Indian population. Four pediatric dentists evaluated a pool of 65 items from various QoL questionnaires to assess their relevance to the Indian population. These items were discussed with eight independent pediatric dentists and two community dentists who were not a part of this study to assess relevance of these items to preschool age children based on their comprehensiveness and clarity. Based on their responses and feedback a modified pool of items was developed and administered to a convenience sample of 20 parents who rated these items according to their relevance. The test-retest reliability was evaluated on another sample of 20 parents of 2-5 year old children. The final questionnaire comprised 16 items (12 child and 4 family). This was administered to 300 parents of 24-71 months old children divided on the basis of early childhood caries to assess its reliability and validity. OH-ECQOL scores were significantly associated with parental ratings of their child's general and oral health, and the presence of dental disease in the child. Cronbach's alpha was 0.862, and the ICC for test-retest reliability was 0.94. The OH-ECQOL proved a reliable and valid tool for assessing the impact of oral disorders on the quality of life of preschool children in Northern India.
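Several records in this listing report Cronbach's alpha as the internal-consistency statistic. As a minimal sketch of how that coefficient is conventionally computed from a respondents-by-items score table, the following may help; the function name and the toy data are illustrative assumptions and are not drawn from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of respondents, each a list of item scores of equal length.
    Uses sample variances (n - 1 denominator) for items and for total scores.
    """
    k = len(scores[0])                      # number of items
    item_columns = list(zip(*scores))       # one tuple of scores per item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: three respondents answering three items identically,
# so the items are perfectly consistent with one another.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy)
```

The formula needs at least two items and a non-zero variance in total scores; a real analysis would also have to handle missing responses, which this sketch does not.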
Application of Group-Level Item Response Models in the Evaluation of Consumer Reports about Health Plan Quality

ERIC Educational Resources Information Center

Reise, Steven P.; Meijer, Rob R.; Ainsworth, Andrew T.; Morales, Leo S.; Hays, Ron D.

2006-01-01

Group-level parametric and non-parametric item response theory models were applied to the Consumer Assessment of Healthcare Providers and Systems (CAHPS®) 2.0 core items in a sample of 35,572 Medicaid recipients nested within 131 health plans. Results indicated that CAHPS responses are dominated by within health plan variation, and only weakly…
Quality of reporting in abstracts of randomized controlled trials published in leading journals of periodontology and implant dentistry: a survey.

PubMed

Faggion, Clovis Mariano; Giannakopoulos, Nikolaos Nikitas

2012-10-01

Most readers, reviewers, and editors rely on abstracts to decide whether to assess the full text of an article. A research abstract should, therefore, be as informative as possible. The standard of reporting in abstracts of randomized controlled trials (RCTs) in periodontology and implant dentistry has not yet been assessed. The objectives of this review are: 1) to assess the quality of reporting in abstracts of RCTs in periodontology and implant dentistry, and 2) to investigate changes in the quality of reporting by comparing samples from different periods. The authors searched the PubMed electronic database, independently and in duplicate, for abstracts of RCTs published in seven leading journals of periodontology and implant dentistry from 2005 to 2007 and from 2009 to 2011. The quality of reporting in selected abstracts with reference to the CONSORT (Consolidated Standards of Reporting Trials) for Abstracts checklist published in January 2008 was assessed independently and in duplicate. Cohen κ statistic was used to determine the extent of agreement of the reviewers. Pearson χ² test and/or Fisher exact test were used to assess differences in reporting in the two samples. Level of significance was set at P < 0.05. Three hundred ninety-two abstracts are included in this review. Three items (intervention, objective, and conclusions) were almost fully reported in both samples. In contrast, other items (randomization, trial registration, and funding) were never reported. There were significant changes in reporting for only two items, trial design and title (items better reported in the pre- and post-CONSORT samples, respectively). Most topics, however, were similarly poorly reported in both samples of abstracts. The quality of reporting in abstracts of RCTs in periodontology and implant dentistry can be improved. Authors should follow the CONSORT for Abstracts guidelines, and journal editors should promote clear rules to improve authors' adherence to these guidelines.
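The record above measures reviewer agreement with the Cohen κ statistic. A minimal sketch of the standard two-rater calculation is given below for orientation; the function name and the example ratings are hypothetical and do not reproduce the reviewers' data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected by chance from each rater's label frequencies.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of whether four abstracts report a checklist item.
reviewer_1 = ["yes", "yes", "no", "no"]
reviewer_2 = ["yes", "no", "no", "no"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

The statistic is undefined when chance agreement equals 1 (both raters use a single identical label throughout), and this sketch does not guard against that case.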
Development and evaluation of CAHPS survey items assessing how well healthcare providers address health literacy.

PubMed

Weidmer, Beverly A; Brach, Cindy; Hays, Ron D

2012-09-01

The complexity of health information often exceeds patients' skills to understand and use it. The objective was to develop survey items assessing how well healthcare providers communicate health information. Domains and items for the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Item Set for Addressing Health Literacy were identified through an environmental scan and input from stakeholders. The draft item set was translated into Spanish and pretested in both English and Spanish. The revised item set was field tested with a randomly selected sample of adult patients from 2 sites using mail and telephonic data collection. Item-scale correlations, confirmatory factor analysis, and internal consistency reliability estimates were computed to assess how well the survey items performed and identify composite measures. Finally, we regressed the CAHPS global rating of the provider item on the CAHPS core communication composite and the new health literacy composites. A total of 601 completed surveys were obtained (52% response rate). Two composite measures were identified: (1) Communication to Improve Health Literacy (16 items); and (2) How Well Providers Communicate About Medicines (6 items). These 2 composites were significantly uniquely associated with the global rating of the provider (communication to improve health literacy: P<0.001, b=0.28; and communication about medicines composite: P=0.02, b=0.04). The 2 composites and the CAHPS core communication composite accounted for 51% of the variance in the global rating of the provider. A 5-item subset of the Communication to Improve Health Literacy composite accounted for 90% of the variance of the original 16-item composite. This study provides support for reliability and validity of the CAHPS Item Set for Addressing Health Literacy. These items can serve to assess whether healthcare providers have communicated effectively with their patients and as a tool for quality improvement.
Developmental changes in visual short-term memory in infancy: evidence from eye-tracking.

PubMed

Oakes, Lisa M; Baumgartner, Heidi A; Barrett, Frederick S; Messenger, Ian M; Luck, Steven J

2013-01-01

We assessed visual short-term memory (VSTM) for color in 6- and 8-month-old infants (n = 76) using a one-shot change detection task. In this task, a sample array of two colored squares was visible for 517 ms, followed by a 317-ms retention period and then a 3000-ms test array consisting of one unchanged item and one item in a new color. We tracked gaze at 60 Hz while infants looked at the changed and unchanged items during test. When the two sample items were different colors (Experiment 1), 8-month-old infants exhibited a preference for the changed item, indicating memory for the colors, but 6-month-olds exhibited no evidence of memory. When the two sample items were the same color and did not need to be encoded as separate objects (Experiment 2), 6-month-old infants demonstrated memory. These results show that infants can encode information in VSTM in a single, brief exposure that simulates the timing of a single fixation period in natural scene viewing, and they reveal rapid developmental changes between 6 and 8 months in the ability to store individuated items in VSTM.
The comparability of English, French and Dutch scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): an assessment of differential item functioning in patients with systemic sclerosis.

PubMed

Kwakkenbos, Linda; Willems, Linda M; Baron, Murray; Hudson, Marie; Cella, David; van den Ende, Cornelia H M; Thombs, Brett D

2014-01-01

The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has been shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics.
The Comparability of English, French and Dutch Scores on the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F): An Assessment of Differential Item Functioning in Patients with Systemic Sclerosis

PubMed Central

Kwakkenbos, Linda; Willems, Linda M.; Baron, Murray; Hudson, Marie; Cella, David; van den Ende, Cornelia H. M.; Thombs, Brett D.

2014-01-01

Objective: The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) is commonly used to assess fatigue in rheumatic diseases, and has been shown to discriminate better across levels of the fatigue spectrum than other commonly used measures. The aim of this study was to assess the cross-language measurement equivalence of the English, French, and Dutch versions of the FACIT-F in systemic sclerosis (SSc) patients. Methods: The FACIT-F was completed by 871 English-speaking Canadian, 238 French-speaking Canadian and 230 Dutch SSc patients. Confirmatory factor analysis was used to assess the factor structure in the three samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF), comparing English versus French and versus Dutch patient responses separately. Results: A unidimensional factor model showed good fit in all samples. Comparing French versus English patients, statistically significant, but small-magnitude DIF was found for 3 of 13 items. French patients had 0.04 of a standard deviation (SD) lower latent fatigue scores than English patients and there was an increase of only 0.03 SD after accounting for DIF. For the Dutch versus English comparison, 4 items showed small, but statistically significant, DIF. Dutch patients had 0.20 SD lower latent fatigue scores than English patients. After correcting for DIF, there was a reduction of 0.16 SD in this difference. Conclusions: There was statistically significant DIF in several items, but the overall effect on fatigue scores was minimal. English, French and Dutch versions of the FACIT-F can be reasonably treated as having equivalent scoring metrics. PMID:24638101
Toward a More Systematic Assessment of Smoking: Development of a Smoking Module for PROMIS®

PubMed Central

Tucker, Joan S.; Shadel, William G.; Stucky, Brian D.; Cai, Li

2012-01-01

Introduction: The aim of the PROMIS® Smoking Initiative is to develop, evaluate, and standardize item banks to assess cigarette smoking behavior and biopsychosocial constructs associated with smoking for both daily and non-daily smokers. Methods: We used qualitative methods to develop the item pool (following the PROMIS® approach: e.g., literature search, “binning and winnowing” of items, and focus groups and cognitive interviews to finalize wording and format), and quantitative methods (e.g., factor analysis) to develop the item banks. Results: We considered a total of 1622 extant items, and 44 new items for inclusion in the smoking item banks. A final set of 277 items representing 11 conceptual domains was selected for field testing in a national sample of smokers. Using data from 3021 daily smokers in the field test, an iterative series of exploratory factor analyses and project team discussions resulted in six item banks: Positive Consequences of Smoking (40 items), Smoking Dependence/Craving (55 items), Health Consequences of Smoking (26 items), Psychosocial Consequences of Smoking (37 items), Coping Aspects of Smoking (30 items), and Social Factors of Smoking (23 items). Conclusions: Inclusion of a smoking domain in the PROMIS® framework will standardize measurement of key smoking constructs using state-of-the-art psychometric methods, and make them widely accessible to health care providers, smoking researchers and the large community of researchers using PROMIS® who might not otherwise include an assessment of smoking in their design. Next steps include reducing the number of items in each domain, conducting confirmatory analyses, and duplicating the process for non-daily smokers. PMID:22770824
Toward a more systematic assessment of smoking: development of a smoking module for PROMIS®.

PubMed

Edelen, Maria O; Tucker, Joan S; Shadel, William G; Stucky, Brian D; Cai, Li

2012-11-01

The aim of the PROMIS® Smoking Initiative is to develop, evaluate, and standardize item banks to assess cigarette smoking behavior and biopsychosocial constructs associated with smoking for both daily and non-daily smokers. We used qualitative methods to develop the item pool (following the PROMIS® approach: e.g., literature search, "binning and winnowing" of items, and focus groups and cognitive interviews to finalize wording and format), and quantitative methods (e.g., factor analysis) to develop the item banks. We considered a total of 1622 extant items, and 44 new items for inclusion in the smoking item banks. A final set of 277 items representing 11 conceptual domains was selected for field testing in a national sample of smokers. Using data from 3021 daily smokers in the field test, an iterative series of exploratory factor analyses and project team discussions resulted in six item banks: Positive Consequences of Smoking (40 items), Smoking Dependence/Craving (55 items), Health Consequences of Smoking (26 items), Psychosocial Consequences of Smoking (37 items), Coping Aspects of Smoking (30 items), and Social Factors of Smoking (23 items). Inclusion of a smoking domain in the PROMIS® framework will standardize measurement of key smoking constructs using state-of-the-art psychometric methods, and make them widely accessible to health care providers, smoking researchers and the large community of researchers using PROMIS® who might not otherwise include an assessment of smoking in their design. Next steps include reducing the number of items in each domain, conducting confirmatory analyses, and duplicating the process for non-daily smokers. Copyright © 2012 Elsevier Ltd. All rights reserved.
The five item Barthel index

PubMed Central

Hobart, J; Thompson, A

2001-01-01

OBJECTIVES—Routine data collection is now considered mandatory. Therefore, staff rated clinical scales that consist of multiple items should have the minimum number of items necessary for rigorous measurement. This study explores the possibility of developing a short form Barthel index, suitable for use in clinical trials, epidemiological studies, and audit, that satisfies criteria for rigorous measurement and is psychometrically equivalent to the 10 item instrument. METHODS—Data were analysed from 844 consecutive admissions to a neurological rehabilitation unit in London. Random half samples were generated. Short forms were developed in one sample (n=419), by selecting items with the best measurement properties, and tested in the other (n=418). For each of the 10 items of the BI, item total correlations and effect sizes were computed and rank ordered. The best items were defined as those with the lowest cross product of these rank orderings. The acceptability, reliability, validity, and responsiveness of three short form BIs (five, four, and three item) were determined and compared with the 10 item BI. Agreement between scores generated by short forms and 10 item BI was determined using intraclass correlation coefficients and the method of Bland and Altman. RESULTS—The five best items in this sample were transfers, bathing, toilet use, stairs, and mobility. Of the three short forms examined, the five item BI had the best measurement properties and was psychometrically equivalent to the 10 item BI. Agreement between scores generated by the two measures for individual patients was excellent (ICC=0.90) but not identical (limits of agreement=1.84±3.84). CONCLUSIONS—The five item short form BI may be a suitable outcome measure for group comparison studies in comparable samples. Further evaluations are needed. 
Results demonstrate a fundamental difference between assessment and measurement and the importance of incorporating psychometric methods in the development and evaluation of health measures.   PMID:11459898
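The Bland and Altman agreement check mentioned in this record can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `bland_altman_limits`, the example scores, and the 1.96 multiplier (the usual 95% convention) are assumptions introduced here.

```python
from statistics import mean, stdev


def bland_altman_limits(scores_a, scores_b, z=1.96):
    """Return (bias, lower, upper) limits of agreement for paired scores.

    bias is the mean of the paired differences; the limits are
    bias -/+ z times the sample standard deviation of those differences.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    half_width = z * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical paired scores (e.g. a rescaled short form vs. the full form).
short_form = [10, 12, 15, 18, 20]
full_form = [9, 12, 13, 17, 18]
bias, lower, upper = bland_altman_limits(short_form, full_form)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

A high intraclass correlation can coexist with wide limits of agreement, which is why the two statistics are reported side by side for individual-level comparisons.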
Spanish adaptation of social withdrawal motivation and frequency scales.

PubMed

Indias García, Sílvia; De Paúl Ochotorena, Joaquín

2016-11-01

The aims were to adapt into Spanish three scales measuring frequency of social withdrawal (SWFS) and motivation for social withdrawal (CSPS and SWMS), and to develop a scale capable of assessing the five motivations for social withdrawal. Participants were 1,112 Spanish adolescents, aged 12-17 years. The sample was randomly split into two groups in which exploratory and confirmatory (CFA) factor analyses were performed separately. A sample of adolescents in residential care (n = 128) was also used to perform discriminant validity analyses. SWFS was reduced to eight items that explain 40% of the variance (PVE), and its reliability is high. SWMS worked adequately in the original version, according to CFA. Some items from the CSPS were removed from the final Spanish version. The newly developed scale (SWMS-5D) is composed of 20 items in five subscales: Peer Isolation, Unsociability, Shyness, Low Mood and Avoidance. Analyses reveal adequate convergent and discriminant validities. The resulting SWFS-8 and SWMS-5D could be considered useful instruments to assess frequency and motivation for social withdrawal in Spanish samples.
Factor Structure of the Internet Addiction Test in Online Gamers and Poker Players

PubMed Central

Achab, Sophia; Billieux, Joel; Thorens, Gabriel; Zullino, Daniele; Dufour, Magali; Rothen, Stéphane

2015-01-01

Background: The Internet Addiction Test (IAT) is the most widely used questionnaire to screen for problematic Internet use. Nevertheless, its factorial structure is still debated, which complicates comparisons among existing studies. Most previous studies were performed with students or community samples despite the probability of there being more problematic Internet use among users of specific applications, such as online gaming or gambling. Objective: To assess the factorial structure of a modified version of the IAT that addresses specific applications, such as video games and online poker. Methods: Two adult samples, one of Internet gamers (n=920) and one of online poker players (n=214), were recruited and completed an online version of the modified IAT. Both samples were split into two subsamples. Two principal component analyses (PCAs) followed by two confirmatory factor analyses (CFAs) were run separately. Results: The results of principal component analysis indicated that a one-factor model fit the data well across both samples. In consideration of the weakness of some IAT items, a 17-item modified version of the IAT was proposed. Conclusions: This study assessed, for the first time, the factorial structure of a modified version of an Internet-administered IAT on a sample of Internet gamers and a sample of online poker players. The scale seems appropriate for the assessment of such online behaviors. Further studies on the modified 17-item IAT version are needed. PMID:26543917
Measuring Acceptance of Sleep Difficulties: The Development of the Sleep Problem Acceptance Questionnaire.

PubMed

Bothelius, Kristoffer; Jernelöv, Susanna; Fredrikson, Mats; McCracken, Lance M; Kaldo, Viktor

2015-11-01

Acceptance may be an important therapeutic process in sleep medicine, but valid psychometric instruments measuring acceptance related to sleep difficulties are lacking. The purpose of this study was to develop a measure of acceptance in insomnia, and to examine its factor structure as well as construct validity. In a cross-sectional design, a principal component analysis for item reduction was conducted on a first sample (A) and a confirmatory factor analysis on a second sample (B). Construct validity was tested on a combined sample (C). Questionnaire items were derived from a measure of acceptance in chronic pain, and data were gathered through screening or available from pretreatment assessments in four insomnia treatment trials, administered online, via bibliotherapy and in primary care. Adults with insomnia: 372 in sample A and 215 in sample B. Sample C (n = 820) included sample A and B with another 233 participants added. Construct validity was assessed through relations with established acceptance and sleep scales. The principal component analysis presented a two-factor solution with eight items, explaining 65.9% of the total variance. The confirmatory factor analysis supported the solution. Acceptance of sleep problems was more closely related to subjective symptoms and consequences of insomnia than to diary description of sleep, or to acceptance of general private events. The Sleep Problem Acceptance Questionnaire (SPAQ), containing the subscales "Activity Engagement" and "Willingness", is a valid tool to assess acceptance of insomnia. © 2015 Associated Professional Sleep Societies, LLC.
Translation, Validation, and Reliability of the Dutch Late-Life Function and Disability Instrument Computer Adaptive Test.

PubMed

Arensman, Remco M; Pisters, Martijn F; de Man-van Ginkel, Janneke M; Schuurmans, Marieke J; Jette, Alan M; de Bie, Rob A

2016-09-01

Adequate and user-friendly instruments for assessing physical function and disability in older adults are vital for estimating and predicting health care needs in clinical practice. The Late-Life Function and Disability Instrument Computer Adaptive Test (LLFDI-CAT) is a promising instrument for assessing physical function and disability in gerontology research and clinical practice. The aims of this study were: (1) to translate the LLFDI-CAT to the Dutch language and (2) to investigate its validity and reliability in a sample of older adults who spoke Dutch and dwelled in the community. For the assessment of validity of the LLFDI-CAT, a cross-sectional design was used. To assess reliability, measurement of the LLFDI-CAT was repeated in the same sample. The item bank of the LLFDI-CAT was translated with a forward-backward procedure. A sample of 54 older adults completed the LLFDI-CAT, World Health Organization Disability Assessment Schedule 2.0, RAND 36-Item Short-Form Health Survey physical functioning scale (10 items), and 10-Meter Walk Test. The LLFDI-CAT was repeated in 2 to 8 days (mean=4.5 days). Pearson's r and the intraclass correlation coefficient (ICC) (2,1) were calculated to assess validity, group-level reliability, and participant-level reliability. A correlation of .74 for the LLFDI-CAT function scale and the RAND 36-Item Short-Form Health Survey physical functioning scale (10 items) was found. The correlations of the LLFDI-CAT disability scale with the World Health Organization Disability Assessment Schedule 2.0 and the 10-Meter Walk Test were -.57 and -.53, respectively. The ICC (2,1) of the LLFDI-CAT function scale was .84, with a group-level reliability score of .85. The ICC (2,1) of the LLFDI-CAT disability scale was .76, with a group-level reliability score of .81. The high percentage of women in the study and the exclusion of older adults with recent joint replacement or hospitalization limit the generalizability of the results. 
The Dutch LLFDI-CAT showed strong validity and high reliability when used to assess physical function and disability in older adults dwelling in the community. © 2016 American Physical Therapy Association.
Scoring correction for MMPI-2 Hs scale with patients experiencing a traumatic brain injury: a test of measurement invariance.

PubMed

Alkemade, Nathan; Bowden, Stephen C; Salzman, Louis

2015-02-01

It has been suggested that MMPI-2 scoring requires removal of some items when assessing patients after a traumatic brain injury (TBI). Gass (1991. MMPI-2 interpretation and closed head injury: A correction factor. Psychological Assessment, 3, 27-31) proposed a correction procedure in line with the hypothesis that MMPI-2 endorsement may be affected by symptoms of TBI. This study assessed the validity of the Gass correction procedure. Participants were a sample of patients with a TBI (n = 242) and a random subset of the MMPI-2 normative sample (n = 1,786). The correction procedure implies a failure of measurement invariance across populations. This study examined measurement invariance of one of the MMPI-2 scales (Hs) that includes TBI correction items. A four-factor model of the MMPI-2 Hs items was defined. The factor model was found to meet the criteria for partial measurement invariance. Analysis of the change in sensitivity and specificity values implied by partial measurement invariance failed to indicate significant practical impact of partial invariance. Overall, the results support continued use of all Hs items to assess psychological well-being in patients with TBI. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Development of an Item Bank for the Assessment of Knowledge on Biology in Argentine University Students.

PubMed

Cupani, Marcos; Zamparella, Tatiana Castro; Piumatti, Gisella; Vinculado, Grupo

The calibration of item banks provides the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. This study aims to develop a bank of items to measure the level of Knowledge on Biology using the Rasch model. The sample consisted of 1219 participants that studied in different faculties of the National University of Cordoba (mean age = 21.85 years, SD = 4.66; 66.9% are women). The items were organized in different forms and into separate subtests, with some common items across subtests. The students were told they had to answer 60 questions of knowledge on biology. Evaluation of Rasch model fit (Zstd >|2.0|), differential item functioning, dimensionality, local independence, item and person separation (>2.0), and reliability (>.80) resulted in a bank of 180 items with good psychometric properties. The bank provides items with a wide range of content coverage and may serve as a sound basis for computerized adaptive testing applications. The contribution of this work is significant in the field of educational assessment in Argentina.
Can a single question effectively screen for burnout in Australian cancer care workers?

PubMed Central

2010-01-01

Background: Burnout has important clinical and professional implications among health care workers, with high levels of burnout documented in oncology staff. The aim of this study was to ascertain how well a brief single-item measure could be used to screen for burnout in the Australian oncology workforce. Methods: During 2007, 1322 members of the Clinical Oncological Society of Australia were invited to participate in a cross-sectional nationwide survey; 740 (56%) of eligible members consented and completed the survey. Data from the 638 consenting members who reported that their work involved direct patient contact were included in the secondary analyses reported in this paper. Burnout was assessed using the MBI Human Services Survey Emotional Exhaustion sub-scale and a single-item self-defined burnout scale. Results: Emotional exhaustion was "high" in 33% of the sample when assessed by the psychometrically validated MBI. The single-item burnout measure identified 28% of the sample who classified themselves as "definitely burning out", "having persistent symptoms of burnout", or "completely burned out". MBI Emotional Exhaustion was significantly correlated with the single-item burnout measure (r = 0.68, p < 0.0001) and an ANOVA yielded an R² of 0.5 (p < 0.0001). Conclusions: The moderate to high correlation between the single-item self-defined burnout measure and the emotional exhaustion component of burnout suggests that this single item can effectively screen for burnout in health care settings that are too time-poor to assess burnout more comprehensively. PMID:21162747
Development and Testing of the Church Environment Audit Tool.

PubMed

Kaczynski, Andrew T; Jake-Schoffman, Danielle E; Peters, Nathan A; Dunn, Caroline G; Wilcox, Sara; Forthofer, Melinda

2018-05-01

In this paper, we describe development and reliability testing of a novel tool to evaluate the physical environment of faith-based settings pertaining to opportunities for physical activity (PA) and healthy eating (HE). Tool development was a multistage process including a review of similar tools, stakeholder review, expert feedback, and pilot testing. Final tool sections included indoor opportunities for PA, outdoor opportunities for PA, food preparation equipment, kitchen type, food for purchase, beverages for purchase, and media. Two independent audits were completed at 54 churches. Interrater reliability (IRR) was determined with Kappa and percent agreement. Of 218 items, 102 were assessed for IRR and 116 could not be assessed because they were not present at enough churches. Percent agreement for all 102 items was over 80%. For 42 items, the sample was too homogeneous to assess Kappa. Forty-six of the remaining items had Kappas greater than 0.60 (25 items 0.80-1.00; 21 items 0.60-0.79), indicating substantial to almost perfect agreement. The tool proved reliable and efficient for assessing church environments and identifying potential intervention points. Future work can focus on applications within faith-based partnerships to understand how church environments influence diverse health outcomes.
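The two interrater statistics named in this record, percent agreement and Kappa, can be sketched as below. This is an illustrative reading under stated assumptions, not the authors' procedure: the sketch uses Cohen's kappa for two raters, the function names and ratings are hypothetical, and returning `None` when chance agreement equals 1 is one plausible way to represent an item that is "too homogeneous" to assess.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Proportion of items on which the two raters gave the same rating."""
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters; None when it is undefined.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected by chance from each rater's marginals.
    """
    n = len(rater1)
    p_o = percent_agreement(rater1, rater2)
    counts1, counts2 = Counter(rater1), Counter(rater2)
    p_e = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if p_e == 1.0:
        # Both raters used a single category throughout: no variability.
        return None
    return (p_o - p_e) / (1 - p_e)


# Hypothetical presence/absence ratings of one feature at ten sites.
audit_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
audit_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(percent_agreement(audit_a, audit_b), cohens_kappa(audit_a, audit_b))
```

Percent agreement stays interpretable when nearly every site receives the same rating, whereas kappa becomes unstable or undefined, which is consistent with reporting both.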
Assessing birth experience in fathers as an important aspect of clinical obstetrics: how applicable is Salmon's Item List for men?

PubMed

Gawlik, Stephanie; Müller, Mitho; Hoffmann, Lutz; Dienes, Aimée; Reck, Corinna

2015-01-01

Validated questionnaire assessment of fathers' experiences during childbirth is lacking in routine clinical practice. Salmon's Item List is a short, validated method used for the assessment of birth experience in mothers in both English- and German-speaking communities. With little to no validated data available for fathers, this pilot study aimed to assess the applicability of the German version of Salmon's Item List, including a multidimensional birth experience concept, in fathers. Longitudinal study. Data were collected by questionnaires. University hospital in Germany. The birth experiences of 102 fathers were assessed four to six weeks post partum using the German version of Salmon's Item List. Construct validity testing with exploratory factor analysis using principal component analysis with varimax rotation was performed to identify the dimensions of childbirth experiences. Internal consistency was also analysed. Factor analysis yielded a four-factor solution comprising 17 items that accounted for 54.5% of the variance. The main domain was 'fulfilment', and the secondary domains were 'emotional distress', 'physical discomfort' and 'emotional adaption'. For fulfilment, Cronbach's α met conventional reliability standards (0.87). Salmon's Item List is an appropriate instrument to assess birth experience in fathers in terms of fulfilment. Larger samples need to be examined in order to prove the stability of the factor structure before this can be extended to routine clinical assessment. A reduced version of Salmon's Item List may be useful as a screening tool for general assessment. Copyright © 2014 Elsevier Ltd. All rights reserved.
Item response theory analysis of the Utrecht Work Engagement Scale for Students (UWES-S) using a sample of Japanese university and college students majoring medical science, nursing, and natural science.

PubMed

Tsubakita, Takashi; Shimazaki, Kazuyo; Ito, Hiroshi; Kawazoe, Nobuo

2017-10-30

The Utrecht Work Engagement Scale for Students has been used internationally to assess students' academic engagement, but it has not been analyzed via item response theory. The purpose of this study was to conduct an item response theory analysis of the Japanese version of the Utrecht Work Engagement Scale for Students, translated by the authors. Using a two-parameter model and Samejima's graded response model, difficulty and discrimination parameters were estimated after confirming the factor structure of the scale. The 14 items on the scale were analyzed with a sample of 3214 university and college students majoring in medical science, nursing, or natural science in Japan. The preliminary parameter estimation was conducted with the two-parameter model and indicated that three items should be removed because their parameters were outliers. Final parameter estimation was conducted using the remaining 11 items and indicated that all difficulty and discrimination parameters were acceptable. The test information curve suggested that the scale better assesses higher engagement than average engagement. The estimated parameters provide a basis for future comparative studies. The results also suggested that a 7-point Likert scale is too broad; thus, the scale should be modified to use fewer response categories.
Psychometric evaluation of a unified Portuguese-language version of the Body Shape Questionnaire in female university students.

PubMed

Silva, Wanderson Roberto; Costa, David; Pimenta, Filipa; Maroco, João; Campos, Juliana Alvares Duarte Bonini

2016-07-21

The objectives of this study were to develop a unified Portuguese-language version, for use in Brazil and Portugal, of the Body Shape Questionnaire (BSQ) and to estimate its validity, reliability, and internal consistency in Brazilian and Portuguese female university students. Confirmatory factor analysis was performed using both original (34-item) and shortened (8-item) versions. The model's fit was assessed with χ²/df, CFI, NFI, and RMSEA. Concurrent and convergent validity were assessed. Reliability was estimated through internal consistency and composite reliability (α). Transnational invariance of the BSQ was tested using multi-group analysis. The original 34-item model was refined to present a better fit and adequate validity and reliability. The shortened model was stable in both independent samples and in transnational samples (Brazil and Portugal). The use of this unified version is recommended for the assessment of body shape concerns in both Brazilian and Portuguese college students.
Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

PubMed

Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

2015-06-01

This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.
Validity and reliability of the Utrecht Work Engagement Scale-Student Version in Sri Lanka.

PubMed

Wickramasinghe, Nuwan Darshana; Dissanayake, Devani Sakunthala; Abeywardena, Gihan Sajiwa

2018-05-04

The present study was aimed at assessing the validity and the reliability of the Sinhala version of the Utrecht Work Engagement Scale-Student Version (UWES-S) among collegiate cycle students in Sri Lanka. The 17-item UWES-S was translated to Sinhala and the judgmental validity was assessed by a multi-disciplinary panel of experts. Construct validity of the UWES-S was appraised by using multi-trait scaling analysis and exploratory factor analysis (EFA) on data obtained from a sample of 194 grade thirteen students in the Kurunegala district, Sri Lanka. Reliability of the UWES-S was assessed by using internal consistency and test-retest reliability. Except for item 13, all other items showed good psychometric properties in judgemental validity, item-convergent validity and item-discriminant validity. EFA using principal component analysis with Oblimin rotation, suggested a three-factor solution (including vigor, dedication and absorption subscales) explaining 65.4% of the total variance for the 16-item UWES-S (with item 13 deleted). All three subscales show high internal consistency with Cronbach's α coefficient values of 0.867, 0.819, and 0.903 and test-retest reliability was high (p < 0.001). Hence, the Sinhala version of the 16-item UWES-S is a valid and a reliable instrument to assess work engagement among collegiate cycle students in Sri Lanka.
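Internal consistency figures such as the Cronbach's α values reported in this record follow from a simple variance ratio. The sketch below is illustrative only: the function name `cronbach_alpha` and the response matrix are hypothetical, and it computes raw (unstandardized) alpha from sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Raw Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)),
    where k is the number of items.
    """
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to a three-item subscale.
subscale = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]
print(round(cronbach_alpha(subscale), 3))
```

In this toy matrix the items move in lockstep across respondents, so alpha reaches its upper bound of 1; real subscales fall below that as item responses diverge.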
Development and Validity Testing of an Arthritis Self-Management Assessment Tool.

PubMed

Oh, HyunSoo; Han, SunYoung; Kim, SooHyun; Seo, WhaSook

Because of the chronic, progressive nature of arthritis and the substantial effects it has on quality of life, patients may benefit from self-management. However, no valid, reliable self-management assessment tool has been devised for patients with arthritis. This study was conducted to develop a comprehensive self-management assessment tool for patients with arthritis, that is, the Arthritis Self-Management Assessment Tool (ASMAT). To develop a list of qualified items corresponding to the conceptual definitions and attributes of arthritis self-management, a measurement model was established on the basis of theoretical and empirical foundations. Content validity testing was conducted to evaluate whether listed items were suitable for assessing arthritis self-management. Construct validity and reliability of the ASMAT were tested. Construct validity was examined using confirmatory factor analysis and nomological validity. The 32-item ASMAT was developed with a sample composed of patients in a clinic in South Korea. Content validity testing validated the 32 items, which comprised medical (10 items), behavioral (13 items), and psychoemotional (9 items) management subscales. Construct validity testing of the ASMAT showed that the 32 items properly corresponded with conceptual constructs of arthritis self-management, and were suitable for assessing self-management ability in patients with arthritis. Reliability was also well supported. The ASMAT devised in the present study may aid the evaluation of patient self-management ability and the effectiveness of self-management interventions. The authors believe the developed tool may also aid the identification of problems associated with the adoption of self-management practice, and thus improve symptom management, independence, and quality of life of patients with arthritis.
Developing a situational judgment test blueprint for assessing the non-cognitive skills of applicants to the University of Utah School of Medicine, the United States

PubMed Central

2015-01-01

Purpose: The situational judgment test (SJT) shows promise for assessing the non-cognitive skills of medical school applicants, but has only been used in Europe. Since the admissions processes and education levels of applicants to medical school are different in the United States and in Europe, it is necessary to obtain validity evidence of the SJT based on a sample of United States applicants. Methods: Ninety SJT items were developed and Kane’s validity framework was used to create a test blueprint. A total of 489 applicants selected for assessment/interview day at the University of Utah School of Medicine during the 2014-2015 admissions cycle completed one of five SJTs, which assessed professionalism, coping with pressure, communication, patient focus, and teamwork. Item difficulty, each item’s discrimination index, internal consistency, and the categorization of items by two experts were used to create the test blueprint. Results: The majority of item scores were within an acceptable range of difficulty, as measured by the difficulty index (0.50-0.85) and had fair to good discrimination. However, internal consistency was low for each domain, and 63% of items appeared to assess multiple domains. The concordance of categorization between the two educational experts ranged from 24% to 76% across the five domains. Conclusion: The results of this study will help medical school admissions departments determine how to begin constructing a SJT. Further testing with a more representative sample is needed to determine if the SJT is a useful assessment tool for measuring the non-cognitive skills of medical school applicants. PMID:26582629
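The item difficulty and discrimination statistics named in this record can be sketched under classical test theory. Several things here are assumptions rather than details from the abstract: items are treated as scored 0/1, discrimination is computed as the upper-group minus lower-group difference using the common 27% split, and the function names and data are hypothetical.

```python
def item_difficulty(item_scores):
    """Difficulty index: proportion of examinees who scored the item correct."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination index for one dichotomous item.

    Examinees are ranked by total test score; the index is the proportion
    correct in the top group minus the proportion correct in the bottom group.
    """
    n = len(item_scores)
    group_size = max(1, round(n * fraction))
    ranked = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = ranked[:group_size], ranked[-group_size:]
    p_upper = sum(item_scores[i] for i in upper) / group_size
    p_lower = sum(item_scores[i] for i in lower) / group_size
    return p_upper - p_lower


# Hypothetical item responses and total scores for ten applicants.
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
totals = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(item_difficulty(item), round(discrimination_index(item, totals), 2))
```

An item with difficulty inside the 0.50-0.85 band cited above can still discriminate poorly if high and low scorers answer it correctly at similar rates, so the two indices are read together.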
The Revised Child Anxiety and Depression Scale 25-Parent Version: Scale Development and Validation in a School-Based and Clinical Sample.

PubMed

Ebesutani, Chad; Korathu-Larson, Priya; Nakamura, Brad J; Higa-McMillan, Charmaine; Chorpita, Bruce

2017-09-01

To help facilitate the dissemination and implementation of evidence-based assessment practices, we examined the psychometric properties of the shortened 25-item version of the Revised Child Anxiety and Depression Scale-parent report (RCADS-25-P), which was based on the same items as the previously published shortened 25-item child version. We used two independent samples of youth, a school sample (N = 967, Grades 3-12) and a clinical sample (N = 433; 6-18 years), to examine the factor structure, reliability, and validity of the RCADS-25-P scale scores. Results revealed that the two-factor structure (i.e., depression and broad anxiety factor) fit the data well in both the school and clinical sample. All reliability estimates, including test-retest indices, exceeded the benchmark for good reliability. In the school sample, the RCADS-25-P scale scores converged significantly with related criterion measures and diverged from nonrelated criterion measures. In the clinical sample, the RCADS-25-P scale scores successfully discriminated between those with and without target problem diagnoses. In both samples, child-parent agreement indices were in the expected ranges. Normative data were also reported. The RCADS-25-P thus demonstrated robust psychometric properties across both a school and clinical sample as an effective brief screening instrument to assess for depression and anxiety in children and adolescents.
A Psychometric Evaluation of the Core Bereavement Items

ERIC Educational Resources Information Center

Holland, Jason M.; Nam, Ilsung; Neimeyer, Robert A.

2013-01-01

Despite being a routinely administered assessment of grieving, few studies have empirically examined the psychometric properties of the Core Bereavement Items (CBI). The present study investigated the factor structure, internal reliability, and concurrent validity of the CBI in a large, diverse sample of bereaved young adults (N = 1,366).…
An examination of reactivity to craving assessment: craving to smoke does not change over the course of a multi-item craving questionnaire.

PubMed

Germeroth, Lisa J; Tiffany, Stephen T

2015-06-01

Self-report measures are typically used to assess drug craving, but researchers have questioned whether completing these assessments can elicit or enhance craving. Previous studies have examined cigarette craving reactivity and found null craving reactivity effects. Several methodological limitations of those studies, however, preclude definitive conclusions. The current study addresses limitations of previous studies and extends this area of research by using a large sample size to examine: (1) item-by-item changes in craving level during questionnaire completion, (2) craving reactivity as a function of craving intensity reflected in item content, (3) craving reactivity differences between nicotine dependent and nondependent smokers, and (4) potential reactivity across multiple sessions. This study also used a more comprehensive craving assessment (the 32-item Questionnaire on Smoking Urges; QSU) than employed in previous studies. Nicotine dependent and nondependent smokers (n=270; nicotine dependence determined by the Nicotine Addiction Taxon Scale) completed the QSU on six separate occasions across 12 weeks. Craving level was observed at the item level and across various subsets of items. Analyses indicated that there was no significant effect of item/subset position on craving ratings, nor were there any significant interactions between item/subset position and session or level of nicotine dependence. These findings indicate that, even with relatively sensitive procedures for detecting potential reactivity, there was no evidence that completing a craving questionnaire induces craving. Copyright © 2015 Elsevier Ltd. All rights reserved.

Development of measures from the theory of planned behavior applied to leisure-time physical activity.

PubMed

Kerner, Matthew S

2005-06-01

Using the theory of planned behavior as a conceptual framework, scales assessing Attitude to Leisure-time Physical Activity, Expectations of Others, Perceived Control, and Intention to Engage in Leisure-time Physical Activity were developed for use among middle-school students. The study sample included 349 boys and 400 girls, 10 to 14 years of age (M=11.9 yr., SD=.9). Unipolar and bipolar scales with seven response choices were developed, with each scale item phrased in a Likert-type format. Following revisions, 22 items were retained in the Attitude to Leisure-time Physical Activity Scale, 10 items in the Expectations of Others Scale, 3 items in the Perceived Control Scale, and 17 items in the Intention to Engage in Leisure-time Physical Activity Scale. Adequate internal consistency was indicated by standardized coefficients alpha ranging from .75 to .89. Current results must be extended to assess discriminant and predictive validities and to check various reliabilities with new samples; intervention techniques for promoting positive attitudes about leisure-time physical activity, including perception of control and intentions to engage in leisure-time physical activity, can then be evaluated.
Screening for depression in advanced disease: psychometric properties, sensitivity, and specificity of two items of the Palliative Care Outcome Scale (POS).

PubMed

Antunes, Bárbara; Murtagh, Fliss; Bausewein, Claudia; Harding, Richard; Higginson, Irene J

2015-02-01

Depression is common among patients with advanced disease but often difficult to detect. To assess the Palliative care Outcome Scale (POS) (10 items) against the Geriatric Depression Scale (GDS)-10 total score and the Hospital Anxiety and Depression Scale (HADS)-Depression subscale total score and determine if the POS has appropriate items to screen for depression among people with advanced disease. This was a secondary analysis performed on five studies. Four psychometric properties were assessed: data quality, scaling assumptions, acceptability, and internal consistency (reliability). Receiver operating characteristic (ROC) curves were used to determine the area under the curve. Sensitivity, specificity, positive and negative predictive values, false positive and negative rates, and positive and negative likelihood ratios were computed. The overall sample had 416 patients from Germany and England: 144 had cancer and 267 had nonmalignant conditions. Prevalence of depression across the sample was 17.5%. Floor and ceiling effects were rare. Cronbach's alpha coefficients for POS items 7 and 8 summed, GDS-10 and HADS-Depression items varied: 0.61 (heart failure) and 0.80 (cancer). Two items combined (Item 7-feeling depressed and Item 8-feeling good about yourself) consistently presented the highest area under the ROC curve, ranging from 0.76 (95% CI 0.60, 0.93) (Germany, lung cancer) to 0.97 (95% CI 0.91, 1.0) (heart failure), highest negative predictive value, and lowest false negative rate. For the overall sample, the cutoff 2/3 presented a negative predictive value of 89.4% (95% CI 84.7, 92.8) and false negative rate of 10.6 (95% CI 7.2, 15.3). POS items 7 and 8 summed are potentially useful to screen for depression in advanced disease populations. Copyright © 2015 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Psychometric properties of the Triarchic Psychopathy Measure: An item response theory approach.

PubMed

Shou, Yiyun; Sellbom, Martin; Xu, Jing

2018-05-01

There is cumulative evidence for the cross-cultural validity of the Triarchic Psychopathy Measure (TriPM; Patrick, 2010) among non-Western populations. Recent studies using correlational and regression analyses show promising construct validity of the TriPM in Chinese samples. However, little is known about the efficiency of items in TriPM in assessing the proposed latent traits. The current study evaluated the psychometric properties of the Chinese TriPM at the item level using item response theory analyses. It also examined the measurement invariance of the TriPM between the Chinese and the U.S. student samples by applying differential item functioning analyses under the item response theory framework. The results supported the unidimensional nature of the Disinhibition and Meanness scales. Both scales had a greater level of precision in the respective underlying constructs at the positive ends. The two scales, however, had several items that were weakly associated with their respective latent traits in the Chinese student sample. Boldness, on the other hand, was found to be multidimensional, and reflected a more normally distributed range of variation. The examination of measurement bias via differential item functioning analyses revealed that a number of items of the TriPM were not equivalent across the Chinese and the U.S. samples. Some modification and adaptation of items might be considered for improving the precision of the TriPM for Chinese participants. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
The performance of the Edinburgh Postnatal Depression Scale in English speaking and non-English speaking populations in Australia.

PubMed

Small, Rhonda; Lumley, Judith; Yelland, Jane; Brown, Stephanie

2007-01-01

The Edinburgh Postnatal Depression Scale (EPDS) has been widely used to assess maternal depression following childbirth in a range of English speaking countries, and increasingly also in translation in non-English speaking ones. It has performed satisfactorily in most validation studies, has proved easy to administer, is acceptable to women, and rates of depression in the range of 10-20% have been consistently found. The performance of the EPDS was compared across different population samples in Australia: (i) Women born in Australia or in another English speaking country who completed the EPDS in English as part of the 1994 postal Survey of Recent Mothers (SRM) 6-7 months after birth (n = 1166); (ii) Women born in non-English speaking countries who also completed the EPDS in English in the same survey (n = 142); and (iii) Women born in Vietnam (n = 103), Turkey (n = 104) and the Philippines (n = 106) who completed the EPDS 6-9 months after birth in translation in the Mothers in a New Country (MINC) study (total n = 313). The pattern of item responses on the EPDS was assessed in various ways across the samples and internal reliability coefficients were calculated. Exploratory factor analyses were also conducted to assess the similarity in the factor solutions across the samples. The EPDS had good construct validity and item endorsement by women was similar across the samples. Internal reliability of the scale was also very satisfactory with Cronbach's alpha for each sample being ≥ 0.8. Between 39 and 46% of the variance in each of the three main samples was accounted for by one principal factor 'depression' (6-7 items loading), with two supplementary factors 'loss of enjoyment' (2 items loading) and 'despair/self-harm' (2-3 items loading) accounting for a further 20-25% of the variance. Alternative one and two factor solutions also showed a great deal of consistency between the samples.
The good item consistency of the EPDS and the relative stability of the factor patterns across the samples are indicative that the scale is understood and completed in similar ways by women in these different English speaking and non-English speaking population groups. With the proviso that careful translation processes and extensive piloting of translations are always needed, these findings lend further support to the use of the EPDS in cross-cultural research on depression following childbirth.
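Internal reliability in this abstract is summarised with Cronbach's alpha. As a rough illustration of how that coefficient is obtained from item-level responses, the sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is hypothetical (5 respondents, 3 items) and the function name `cronbach_alpha` is an assumption; neither comes from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: one row per respondent, one column per item.
responses = [
    [3, 2, 3],
    [1, 1, 0],
    [2, 3, 2],
    [0, 1, 1],
    [3, 3, 2],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values of 0.8 or above, as reported for each EPDS sample, are conventionally read as good internal consistency.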
Development and validation of a ten-item questionnaire with explanatory illustrations to assess upper extremity disorders: favorable effect of illustrations in the item reduction process.

PubMed

Kurimoto, Shigeru; Suzuki, Mikako; Yamamoto, Michiro; Okui, Nobuyuki; Imaeda, Toshihiko; Hirata, Hitoshi

2011-11-01

The purpose of this study is to develop a short and valid measure for upper extremity disorders and to assess the effect of attached illustrations in item reduction of a self-administered disability questionnaire while retaining psychometric properties. A validated questionnaire used to assess upper extremity disorders, the Hand20, was reduced to ten items using two item-reduction techniques. The psychometric properties of the abbreviated form, the Hand10, were evaluated on an independent sample that was used for the shortening process. Validity, reliability, and responsiveness of the Hand10 were retained in the item reduction process. It was possible that the use of explanatory illustrations attached to the Hand10 helped with its reproducibility. The illustrations for the Hand10 promoted text comprehension and motivation to answer the items. These changes resulted in high acceptability; more than 99.3% of patients, including 98.5% of elderly patients, could complete the Hand10 properly. The illustrations had favorable effects on the item reduction process and made it possible to retain precision of the instrument. The Hand10 is a reliable and valid instrument for individual-level applications with the advantage of being compact and broadly applicable, even in elderly individuals.
Item and scale differential functioning of the Mini-Mental State Exam assessed using the Differential Item and Test Functioning (DFIT) Framework.

PubMed

Morales, Leo S; Flowers, Claudia; Gutierrez, Peter; Kleinman, Marjorie; Teresi, Jeanne A

2006-11-01

To illustrate the application of the Differential Item and Test Functioning (DFIT) method using English and Spanish versions of the Mini-Mental State Examination (MMSE). Study participants were 65 years of age or older and lived in North Manhattan, New York. Of the 1578 study participants who were administered the MMSE, 665 completed it in Spanish. The MMSE contains 20 items that measure the degree of cognitive impairment in the areas of orientation, attention and calculation, registration, recall and language, as well as the ability to follow verbal and written commands. After assessing the dimensionality of the MMSE scale, item response theory person and item parameters were estimated separately for the English and Spanish samples using Samejima's 2-parameter graded response model. Then the DFIT framework was used to assess differential item functioning (DIF) and differential test functioning (DTF). Nine items were found to show DIF; these were items that ask the respondent to name the correct season, day of the month, city, state, and 2 nearby streets, recall 3 objects, repeat the phrase "no ifs, no ands, no buts," follow the command, "close your eyes," and the command, "take the paper in your right hand, fold the paper in half with both hands, and put the paper down in your lap." At the scale level, however, the MMSE did not show differential functioning. Respondents to the English and Spanish versions of the MMSE are comparable on the basis of scale scores. However, assessments based on individual MMSE items may be misleading.
Development and initial evaluation of the SCI-FI/AT

PubMed Central

Jette, Alan M.; Slavin, Mary D.; Ni, Pengsheng; Kisala, Pamela A.; Tulsky, David S.; Heinemann, Allen W.; Charlifue, Susie; Tate, Denise G.; Fyffe, Denise; Morse, Leslie; Marino, Ralph; Smith, Ian; Williams, Steve

2015-01-01

Objectives: To describe the domain structure and calibration of the Spinal Cord Injury Functional Index for samples using Assistive Technology (SCI-FI/AT) and report the initial psychometric properties of each domain. Design: Cross sectional survey followed by computerized adaptive test (CAT) simulations. Setting: Inpatient and community settings. Participants: A sample of 460 adults with traumatic spinal cord injury (SCI) stratified by level of injury, completeness of injury, and time since injury. Interventions: None. Main outcome measure: SCI-FI/AT. Results: Confirmatory factor analysis (CFA) and Item response theory (IRT) analyses identified 4 unidimensional SCI-FI/AT domains: Basic Mobility (41 items), Self-care (71 items), Fine Motor Function (35 items), and Ambulation (29 items). High correlations of full item banks with 10-item simulated CATs indicated high accuracy of each CAT in estimating a person's function, and there was high measurement reliability for the simulated CAT scales compared with the full item bank. SCI-FI/AT item difficulties in the domains of Self-care, Fine Motor Function, and Ambulation were less difficult than the same items in the original SCI-FI item banks. Conclusion: With the development of the SCI-FI/AT, clinicians and investigators have available multidimensional assessment scales that evaluate function for users of AT to complement the scales available in the original SCI-FI. PMID:26010975
Development and initial evaluation of the SCI-FI/AT.

PubMed

Jette, Alan M; Slavin, Mary D; Ni, Pengsheng; Kisala, Pamela A; Tulsky, David S; Heinemann, Allen W; Charlifue, Susie; Tate, Denise G; Fyffe, Denise; Morse, Leslie; Marino, Ralph; Smith, Ian; Williams, Steve

2015-05-01

To describe the domain structure and calibration of the Spinal Cord Injury Functional Index for samples using Assistive Technology (SCI-FI/AT) and report the initial psychometric properties of each domain. Cross sectional survey followed by computerized adaptive test (CAT) simulations. Inpatient and community settings. A sample of 460 adults with traumatic spinal cord injury (SCI) stratified by level of injury, completeness of injury, and time since injury. Interventions: None. Main outcome measure: SCI-FI/AT. Results: Confirmatory factor analysis (CFA) and Item response theory (IRT) analyses identified 4 unidimensional SCI-FI/AT domains: Basic Mobility (41 items), Self-care (71 items), Fine Motor Function (35 items), and Ambulation (29 items). High correlations of full item banks with 10-item simulated CATs indicated high accuracy of each CAT in estimating a person's function, and there was high measurement reliability for the simulated CAT scales compared with the full item bank. SCI-FI/AT item difficulties in the domains of Self-care, Fine Motor Function, and Ambulation were less difficult than the same items in the original SCI-FI item banks. With the development of the SCI-FI/AT, clinicians and investigators have available multidimensional assessment scales that evaluate function for users of AT to complement the scales available in the original SCI-FI.
Development of the Computer-Adaptive Version of the Late-Life Function and Disability Instrument

PubMed Central

Tian, Feng; Kopits, Ilona M.; Moed, Richard; Pardasaney, Poonam K.; Jette, Alan M.

2012-01-01

Background. Having psychometrically strong disability measures that minimize response burden is important in assessing older adults. Methods. Using the original 48 items from the Late-Life Function and Disability Instrument and newly developed items, a 158-item Activity Limitation and a 62-item Participation Restriction item pool were developed. The item pools were administered to a convenience sample of 520 community-dwelling adults 60 years or older. Confirmatory factor analysis and item response theory were employed to identify content structure, calibrate items, and build the computer-adaptive tests (CATs). We evaluated real-data simulations of 10-item CAT subscales. We collected data from 102 older adults to validate the 10-item CATs against the Veteran’s Short Form-36 and assessed test–retest reliability in a subsample of 57 subjects. Results. Confirmatory factor analysis revealed a bifactor structure, and multi-dimensional item response theory was used to calibrate an overall Activity Limitation Scale (141 items) and an overall Participation Restriction Scale (55 items). Fit statistics were acceptable (Activity Limitation: comparative fit index = 0.95, Tucker Lewis Index = 0.95, root mean square error approximation = 0.03; Participation Restriction: comparative fit index = 0.95, Tucker Lewis Index = 0.95, root mean square error approximation = 0.05). Correlations of 10-item CATs with full item banks were substantial (Activity Limitation: r = .90; Participation Restriction: r = .95). Test–retest reliability estimates were high (Activity Limitation: r = .85; Participation Restriction: r = .80). Strength and pattern of correlations with Veteran’s Short Form-36 subscales were as hypothesized. Each CAT, on average, took 3.56 minutes to administer. Conclusions. The Late-Life Function and Disability Instrument CATs demonstrated strong reliability, validity, accuracy, and precision.
The Late-Life Function and Disability Instrument CAT can achieve psychometrically sound disability assessment in older persons while reducing respondent burden. Further research is needed to assess their ability to measure change in older adults. PMID:22546960
[A methodological approach to assessing the quality of medical health information on its way from science to the mass media].

PubMed

Serong, Julia; Anhäuser, Marcus; Wormer, Holger

2015-01-01

A current research project deals with the question of how the quality of medical health information changes on its way from the academic journal via press releases to the news media. In an exploratory study a sample of 30 news items has been selected stage-by-stage from an adjusted total sample of 1,695 journalistic news items on medical research in 2013. Using a multidimensional set of criteria the news items as well as the corresponding academic articles, abstracts and press releases are examined by science journalists and medical experts. Together with a content analysis of the expert assessments, it will be verified to what extent established quality standards for medical journalism can be applied to medical health communication and public relations or even to studies and abstracts as well. Copyright © 2015. Published by Elsevier GmbH.
Ecological content validation of the Information Assessment Method for parents (IAM-parent): A mixed methods study.

PubMed

Bujold, M; El Sherif, R; Bush, P L; Johnson-Lafleur, J; Doray, G; Pluye, P

2018-02-01

This mixed methods study content validated the Information Assessment Method for parents (IAM-parent) that allows users to systematically rate and comment on online parenting information. Quantitative data and results: 22,407 IAM ratings were collected; of the initial 32 items, descriptive statistics showed that 10 had low relevance. Qualitative data and results: IAM-based comments were collected, and 20 IAM users were interviewed (maximum variation sample); the qualitative data analysis assessed the representativeness of IAM items, and identified items with problematic wording. Researchers, the program director, and Web editors integrated quantitative and qualitative results, which led to a shorter and clearer IAM-parent. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Does Assessment Type Matter? A Measurement Invariance Analysis of Online and Paper and Pencil Assessment of the Community Assessment of Psychic Experiences (CAPE)

PubMed Central

Vleeschouwer, Marloes; Schubart, Chris D.; Henquet, Cecile; Myin-Germeys, Inez; van Gastel, Willemijn A.; Hillegers, Manon H. J.; van Os, Jim J.; Boks, Marco P. M.; Derks, Eske M.

2014-01-01

Background: The psychometric properties of an online test are not necessarily identical to its paper and pencil original. The aim of this study is to test whether the factor structure of the Community Assessment of Psychic Experiences (CAPE) is measurement invariant with respect to online vs. paper and pencil assessment. Method: The factor structure of CAPE items assessed by paper and pencil (N = 796) was compared with the factor structure of CAPE items assessed by the Internet (N = 21,590) using formal tests for Measurement Invariance (MI). The effect size was calculated by estimating the Signed Item Difference in the Sample (SIDS) index and the Signed Test Difference in the Sample (STDS) for a hypothetical subject who scores 2 standard deviations above average on the latent dimensions. Results: The more restricted Metric Invariance model showed a significantly worse fit compared to the less restricted Configural Invariance model (χ2(23) = 152.75, p < 0.001). However, the SIDS indices appear to be small, with an average of −0.11. A STDS of −4.80 indicates that Internet sample members who score 2 standard deviations above average would be expected to score 4.80 points lower on the CAPE total scale (ranging from 42 to 114 points) than would members of the Paper sample with the same latent trait score. Conclusions: Our findings did not support measurement invariance with respect to assessment method. Because of the small effect sizes, the measurement differences between the online assessed CAPE and its paper and pencil original can be neglected without major consequences for research purposes. However, a person with a high vulnerability for psychotic symptoms would score 4.80 points lower on the total scale if the CAPE is assessed online compared to paper and pencil assessment. Therefore, for clinical purposes, one should be cautious with online assessment of the CAPE. PMID:24465389
Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

PubMed

Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

2018-03-01

The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.
Impact of Design Effects in Large-Scale District and State Assessments

ERIC Educational Resources Information Center

Phillips, Gary W.

2015-01-01

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
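To make the point about underestimated sampling error concrete, the sketch below uses the common cluster-sampling approximation deff = 1 + (m - 1) * ICC, where m is the average cluster size and ICC is the intraclass correlation. This formula is an assumption chosen for illustration; the truncated abstract does not say how the article computes design effects. All numbers and function names are hypothetical.

```python
import math


def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Approximate design effect for equal-sized clusters (illustrative assumption)."""
    return 1 + (avg_cluster_size - 1) * icc


def srs_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical assessment: 1,000 students tested in classrooms of 25, ICC = 0.2.
n, p = 1000, 0.5
deff = design_effect(avg_cluster_size=25, icc=0.2)
se_srs = srs_standard_error(p, n)
se_adjusted = se_srs * math.sqrt(deff)  # standard error inflated by sqrt(deff)
n_effective = n / deff                  # effective sample size

print(f"deff = {deff:.2f}, effective n = {n_effective:.0f}")
print(f"SE ignoring design = {se_srs:.4f}, SE with design = {se_adjusted:.4f}")
```

Under these assumed values, ignoring the clustering would understate the standard error by a factor of more than two.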
Generating Multiple Imputations for Matrix Sampling Data Analyzed with Item Response Models.

ERIC Educational Resources Information Center

Thomas, Neal; Gan, Nianci

1997-01-01

Describes and assesses missing data methods currently used to analyze data from matrix sampling designs implemented by the National Assessment of Educational Progress. Several improved methods are developed, and these models are evaluated using an EM algorithm to obtain maximum likelihood estimates followed by multiple imputation of complete data…
Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior.

PubMed

Tassé, Marc J; Schalock, Robert L; Thissen, David; Balboni, Giulia; Bersani, Henry Hank; Borthwick-Duffy, Sharon A; Spreat, Scott; Widaman, Keith F; Zhang, Dalun; Navas, Patricia

2016-03-01

The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT modeling and a nationally representative standardization sample, the item set was reduced to 75 items that provide the most precise adaptive behavior information at the cutoff area determining the presence or not of significant adaptive behavior deficits across conceptual, social, and practical skills. The standardization of the DABS is described and discussed.
Assessing the heterogeneity of aggressive behavior traits: exploratory and confirmatory analyses of the reactive and instrumental aggression Personality Assessment Inventory (PAI) scales.

PubMed

Antonius, Daniel; Sinclair, Samuel Justin; Shiva, Andrew A; Messinger, Julie W; Maile, Jordan; Siefert, Caleb J; Belfi, Brian; Malaspina, Dolores; Blais, Mark A

2013-01-01

The heterogeneity of violent behavior is often overlooked in risk assessment despite its importance in the management and treatment of psychiatric and forensic patients. In this study, items from the Personality Assessment Inventory (PAI) were first evaluated and rated by experts in terms of how well they assessed personality features associated with reactive and instrumental aggression. Exploratory principal component analyses (PCA) were then conducted on select items using a sample of psychiatric and forensic inpatients (n = 479) to examine the latent structure and construct validity of these reactive and instrumental aggression factors. Finally, a confirmatory factor analysis (CFA) was conducted on a separate sample of psychiatric inpatients (n = 503) to evaluate whether these factors yielded acceptable model fit. Overall, the exploratory and confirmatory analyses supported the existence of two latent PAI factor structures, which delineate personality traits related to reactive and instrumental aggression.
Development of a Computer Adaptive Test for Depression Based on the Dutch-Flemish Version of the PROMIS Item Bank.

PubMed

Flens, Gerard; Smits, Niels; Terwee, Caroline B; Dekker, Joost; Huijbrechts, Irma; de Beurs, Edwin

2017-03-01

We developed a Dutch-Flemish version of the patient-reported outcomes measurement information system (PROMIS) adult V1.0 item bank for depression as input for computerized adaptive testing (CAT). As the item bank, we used the Dutch-Flemish translation of the original PROMIS item bank (28 items) and additionally translated 28 U.S. depression items that failed to make the final U.S. item bank. Through psychometric analysis of a combined clinical and general population sample (N = 2,010), 8 added items were removed. With the final item bank, we performed several CAT simulations to assess the efficiency of the extended (48 items) and the original item bank (28 items), using various stopping rules. Both item banks resulted in highly efficient and precise measurement of depression and showed high similarity between the CAT simulation scores and the full item bank scores. We discuss the implications of using each item bank and stopping rule for further CAT development.
The emotion dysregulation inventory: Psychometric properties and item response theory calibration in an autism spectrum disorder sample.

PubMed

Mazefsky, Carla A; Yu, Lan; White, Susan W; Siegel, Matthew; Pilkonis, Paul A

2018-06-01

Individuals with autism spectrum disorder (ASD) often present with prominent emotion dysregulation that requires treatment but can be difficult to measure. The Emotion Dysregulation Inventory (EDI) was created using methods developed by the Patient-Reported Outcomes Measurement Information System (PROMIS®) to capture observable indicators of poor emotion regulation. Caregivers of 1,755 youth with ASD completed 66 candidate EDI items, and the final 30 items were selected based on classical test theory and item response theory (IRT) analyses. The analyses identified two factors: (a) Reactivity, characterized by intense, rapidly escalating, sustained, and poorly regulated negative emotional reactions, and (b) Dysphoria, characterized by anhedonia, sadness, and nervousness. The final items did not show differential item functioning (DIF) based on gender, age, intellectual ability, or verbal ability. Because the final items were calibrated using IRT, even a small number of items offers high precision, minimizing respondent burden. IRT co-calibration of the EDI with related measures demonstrated its superiority in assessing the severity of emotion dysregulation with as few as seven items. Validity of the EDI was supported by expert review, its association with related constructs (e.g., anxiety and depression symptoms, aggression), higher scores in psychiatric inpatients with ASD compared to a community ASD sample, and demonstration of test-retest stability and sensitivity to change. In sum, the EDI provides an efficient and sensitive method to measure emotion dysregulation for clinical assessment, monitoring, and research in youth with ASD of any level of cognitive or verbal ability. Autism Res 2018, 11: 928-941. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. This paper describes a new measure of poor emotional control called the Emotion Dysregulation Inventory (EDI).
Caregivers of 1,755 youth with ASD completed candidate items, and advanced statistical techniques were applied to identify the best final items. The EDI is unique because it captures common emotional problems in ASD and is appropriate for both nonverbal and verbal youth. It is an efficient and sensitive measure for use in clinical assessments, monitoring, and research with youth with ASD. © 2018 International Society for Autism Research, Wiley Periodicals, Inc.
Development of a Short Form of the Five-Factor Narcissism Inventory: the FFNI-SF.

PubMed

Sherman, Emily D; Miller, Joshua D; Few, Lauren R; Campbell, W Keith; Widiger, Thomas A; Crego, Cristina; Lynam, Donald R

2015-09-01

The Five-Factor Narcissism Inventory (FFNI; Glover, Miller, Lynam, Crego, & Widiger, 2012) is a 148-item self-report inventory of 15 traits designed to assess the basic elements of narcissism from the perspective of a 5-factor model. The FFNI assesses both vulnerable (i.e., cynicism/distrust, need for admiration, reactive anger, and shame) and grandiose (i.e., acclaim seeking, arrogance, authoritativeness, entitlement, exhibitionism, exploitativeness, grandiose fantasies, indifference, lack of empathy, manipulativeness, and thrill seeking) variants of narcissism. The present study reports the development of a short-form version of the FFNI in 4 diverse samples (i.e., 2 undergraduate samples, a sample recruited from MTurk, and a clinical community sample) using item response theory. The validity of the resultant 60-item short form was compared against the validity of the full scale in the 4 samples at both the subscale level and the level of the grandiose and vulnerable composites. Results indicated that the 15 subscales remain relatively reliable, possess a factor structure identical to the structure of the long-form scales, and manifest correlational profiles highly similar to those of the long-form scales in relation to a variety of criterion measures, including basic personality dimensions, other measures of grandiose and vulnerable narcissism, and indicators of externalizing and internalizing psychopathology. Grandiose and vulnerable composites also behave almost identically across the short- and long-form versions. It is concluded that the FFNI-Short Form (FFNI-SF) offers a well-articulated assessment of the basic traits comprising grandiose and vulnerable narcissism, particularly when assessment time is limited. (c) 2015 APA, all rights reserved.

Diagnosing Conceptions about the Epistemology of Science: Contributions of a Quantitative Assessment Methodology

ERIC Educational Resources Information Center

Vázquez-Alonso, Ángel; Manassero-Mas, María-Antonia; García-Carmona, Antonio; Montesano de Talavera, Marisa

2016-01-01

This study applies a new quantitative methodological approach to diagnose epistemology conceptions in a large sample. The analyses use seven multiple-rating items on the epistemology of science drawn from the item pool Views on Science-Technology-Society (VOSTS). The bases of the new methodological diagnostic approach are the empirical…
Testing the PROMIS® Depression measures for monitoring depression in a clinical sample outside the US.

PubMed

Vilagut, G; Forero, C G; Adroher, N D; Olariu, E; Cella, D; Alonso, J

2015-09-01

The Patient Reported Outcomes Measurement Information System (PROMIS) was devised to facilitate assessment of patient self-reported health status, taking advantage of Item Response Theory. We aimed to assess measurement properties of the PROMIS Depression item bank and an 8-item static short form in a Spanish clinical sample. A three-month follow-up study of patients with active mood/anxiety symptoms (n = 218) was carried out. We assessed model unidimensionality (Confirmatory Item Factor Analysis), reliability (internal consistency and Item Information Curves), and validity (convergent-discriminant with correlations; known-groups with comparison of means and effect sizes; and criterion validity with Receiver operating Characteristics (ROC) analysis). We also assessed 3-month responsiveness to change (Cohen's effect sizes (d) in stable and recovered patients). The unidimensional model showed adequate fit (CFI = 0.97, RMSEA = 0.08). Information Curves had reliabilities over 0.90 throughout most of the score continuum. As expected, we observed high correlations with external self-reported depression, and moderate with self-reported anxiety and clinical measures. The item bank showed an increasing severity gradient from no disorder (mean = 48, SE = 0.6) to depression with comorbid anxiety (mean = 55.8, SE = 0.4). PROMIS detected depression disorder with great accuracy according to the area under the curve (AUC = 0.89). Both formats, item bank and short form, were highly responsive to change in recovered patients (d > 0.7) and had small changes in stable patients (d < 0.2). The good metric properties of the Spanish PROMIS Depression measures provide further evidence of their adequacy for monitoring depression levels of patients in clinical settings. This double check of quality (within countries and populations) supports the ability of PROMIS measures for guaranteeing fair comparisons across languages and countries in specific clinical populations. 
Copyright © 2015 Elsevier Ltd. All rights reserved.
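The responsiveness results in this abstract are expressed as Cohen's effect sizes (d). The abstract does not state which variant was used, so the sketch below assumes one common choice for change scores: mean change between baseline and follow-up divided by the baseline standard deviation. The scores and the function name `change_effect_size` are hypothetical, not the study's data.

```python
from statistics import mean, stdev


def change_effect_size(baseline, followup):
    """Mean change divided by the baseline SD (an assumed variant of Cohen's d)."""
    return (mean(baseline) - mean(followup)) / stdev(baseline)


# Hypothetical depression T-scores at baseline and at 3-month follow-up.
baseline = [60, 58, 62, 55, 65]
recovered_followup = [52, 51, 55, 50, 57]  # clear improvement
stable_followup = [60, 57, 62, 56, 64]     # essentially unchanged

d_recovered = change_effect_size(baseline, recovered_followup)
d_stable = change_effect_size(baseline, stable_followup)
print(f"recovered: d = {d_recovered:.2f}, stable: d = {d_stable:.2f}")
```

With these made-up scores the recovered group exceeds the d > 0.7 level and the stable group stays below d < 0.2, which mirrors the pattern the abstract describes.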
Diagnosing Autism in Adults with Intellectual Disability: Validation of the DiBAS-R in an Independent Sample

ERIC Educational Resources Information Center

Heinrich, Manuel; Böhm, Julia; Sappok, Tanja

2018-01-01

The study assessed the diagnostic validity of the diagnostic behavioral assessment for autism spectrum disorders-revised (DiBAS-R; 19-item screening scale based on ratings by caregivers) in a clinical sample of 381 adults with ID. Analysis revealed a sensitivity of 0.82 and a specificity of 0.67 in the overall sample (70.3% agreement). Sensitivity…
The development and initial psychometric evaluation of a measure assessing adherence to prescribed exercise: the Exercise Adherence Rating Scale (EARS).

PubMed

Newman-Beinart, Naomi A; Norton, Sam; Dowling, Dominic; Gavriloff, Dimitri; Vari, Chiara; Weinman, John A; Godfrey, Emma L

2017-06-01

There is no gold standard for measuring adherence to prescribed home exercise. Self-report diaries are commonly used; however, lack of standardisation, inaccurate recall and self-presentation bias limit their validity. A valid and reliable tool to assess exercise adherence behaviour is required. Consequently, this article reports the development and psychometric evaluation of the Exercise Adherence Rating Scale (EARS). Development of a questionnaire. Secondary care in physiotherapy departments of three hospitals. A focus group consisting of 8 patients with chronic low back pain (CLBP) and 2 physiotherapists was conducted to generate qualitative data. Following on from this, a convenience sample of 224 people with CLBP completed the initial 16-item EARS for purposes of subsequent validity and reliability analyses. Construct validity was explored using exploratory factor analysis and item response theory. Test-retest reliability was assessed 3 weeks later in a sub-sample of patients. An item pool consisting of 6 items was found suitable for factor analysis. Examination of the scale structure of these 6 items revealed a one-factor solution explaining a total of 71% of the variance in adherence to exercise. The six items formed a unidimensional scale that showed good measurement properties, including acceptable internal consistency and high test-retest reliability. The EARS enables the measurement of adherence to prescribed home exercise. This may facilitate the evaluation of interventions promoting self-management for both the prevention and treatment of chronic conditions. Copyright © 2017 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
Validation of a scale for assessing attitudes towards outcomes of genetic cancer testing among primary care providers and breast specialists

PubMed Central

N’Diaye, Khadim; Evans, D. Gareth; Harris, Hilary; Tibben, Aad; van Asperen, Christi; Schmidtke, Joerg; Nippert, Irmgard; Mancini, Julien; Julian-Reynier, Claire

2017-01-01

Objective To develop a generic scale for assessing attitudes towards genetic testing and to psychometrically assess these attitudes in the context of BRCA1/2 among a sample of French general practitioners, breast specialists and gyneco-obstetricians. Study design and setting Nested within the questionnaire developed for the European InCRisC (International Cancer Risk Communication Study) project were 14 items assessing expected benefits (8 items) and drawbacks (6 items) of the process of breast/ovarian genetic cancer testing (BRCA1/2). Another item assessed agreement with the statement that, overall, the expected health benefits of BRCA1/2 testing exceeded its drawbacks, thereby justifying its prescription. The questionnaire was mailed to a sample of 1,852 French doctors. Of these, 182 breast specialists, 275 general practitioners and 294 gyneco-obstetricians completed and returned the questionnaire to the research team. Principal Component Analysis, Cronbach’s α coefficient, and Pearson’s correlation coefficients were used in the statistical analyses of collected data. Results Three dimensions emerged from the respondents’ responses, and were classified under the headings: “Anxiety, Conflict and Discrimination”, “Risk Information”, and “Prevention and Surveillance”. Cronbach’s α coefficient for the 3 dimensions was 0.79, 0.76 and 0.62, respectively, and each dimension exhibited strong correlation with the overall indicator of agreement (criterion validity). Conclusions The validation process of the 15 items regarding BRCA1/2 testing revealed satisfactory psychometric properties for the creation of a new scale entitled the Attitudes Towards Genetic Testing for BRCA1/2 (ATGT-BRCA1/2) Scale. Further testing is required to confirm the validity of this tool which could be used generically in other genetic contexts. PMID:28570656
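Cronbach's α, reported above for each of the three dimensions, can be computed directly from item-level responses. The following is a minimal sketch using the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total score). The response matrix is hypothetical and is not data from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical 5-point Likert responses: 5 respondents x 3 items.
    responses = [
        [4, 5, 4],
        [2, 3, 2],
        [5, 5, 4],
        [3, 3, 3],
        [1, 2, 2],
    ]
    print(f"alpha = {cronbach_alpha(responses):.2f}")
```

The function needs at least two items and two respondents, and the total score must vary across respondents.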
Medical student quality-of-life in the clerkships: a scale validation study.

PubMed

Brannick, Michael T; Horn, Gregory T; Schnaus, Michael J; Wahi, Monika M; Goldin, Steven B

2015-04-01

Many aspects of medical school are stressful for students. To empirically assess student reactions to clerkship programs, or to assess efforts to improve such programs, educators must measure the overall well-being of the students reliably and validly. The purpose of the study was to develop and validate a measure designed to achieve these goals. The authors developed a measure of quality of life for medical students by sampling (public domain) items tapping general happiness, fatigue, and anxiety. A quality-of-life scale was developed by factor analyzing responses to the items from students in two different clerkships from 2005 to 2008. Reliability was assessed using Cronbach's alpha. Validity was assessed by factor analysis, convergence with additional theoretically relevant scales, and sensitivity to change over time. The refined nine-item measure is a Likert scaled survey of quality-of-life items comprised of two domains: exhaustion and general happiness. The resulting scale demonstrated good reliability and factorial validity at two time points for each of the two samples. The quality-of-life measure also correlated with measures of depression and the amount of sleep reported during the clerkships. The quality-of-life measure appeared more sensitive to changes over time than did the depression measure. The measure is short and can be easily administered in a survey. The scale appears useful for program evaluation and more generally as an outcome variable in medical educational research.
Development of the PROMIS coping expectancies of smoking item banks.

PubMed

Shadel, William G; Edelen, Maria Orlando; Tucker, Joan S; Stucky, Brian D; Hansen, Mark; Cai, Li

2014-09-01

Smoking is a coping strategy for many smokers who then have difficulty finding new ways to cope with negative affect when they quit. This paper describes analyses conducted to develop and evaluate item banks for assessing the coping expectancies of smoking for daily and nondaily smokers. Using data from a large sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning (DIF) analyses (according to gender, age, and ethnicity) to arrive at a unidimensional set of items for daily and nondaily smokers. We also evaluated performance of short forms (SFs) and computer adaptive tests (CATs) for assessing coping expectancies of smoking. For both daily and nondaily smokers, the unidimensional Coping Expectancies item banks (21 items) are relatively DIF free and are highly reliable (0.96 and 0.97, respectively). A common 4-item SF for daily and nondaily smokers also showed good reliability (0.85). Adaptive tests required an average of 4.3 and 3.7 items for simulated daily and nondaily respondents, respectively, and achieved reliabilities of 0.91 for both when the maximum test length was 10 items. This research provides a new set of items that can be used to reliably assess coping expectancies of smoking, through a SF, CAT, or a tailored set selected for a specific research purpose. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Development of a mobbing short scale in the Gutenberg Health Study.

PubMed

Garthus-Niegel, Susan; Nübling, Matthias; Letzel, Stephan; Hegewald, Janice; Wagner, Mandy; Wild, Philipp S; Blettner, Maria; Zwiener, Isabella; Latza, Ute; Jankowiak, Sylvia; Liebers, Falk; Seidler, Andreas

2016-01-01

Despite its highly detrimental potential, most standard questionnaires assessing psychosocial stress at work do not include mobbing as a risk factor. In the German standard version of COPSOQ, mobbing is assessed with a single item. In the Gutenberg Health Study, this version was used together with a newly developed short scale based on the Leymann Inventory of Psychological Terror. The purpose of the present study was to evaluate the psychometric properties of these two measures, to compare them and to test their differential impact on relevant outcome parameters. This analysis is based on a population-based sample of 1441 employees participating in the Gutenberg Health Study. Exploratory and confirmatory factor analyses and reliability analyses were used to assess the mobbing scale. To determine their predictive validities, multiple linear regression analyses with six outcome parameters and log-binomial regression models for two of the outcome aspects were run. Factor analyses of the five-item scale confirmed a one-factor solution, reliability was α = 0.65. Both the single-item and the five-item scales were associated with all six outcome scales. Effect sizes were similar for both mobbing measures. Mobbing is an important risk factor for health-related outcomes. For the purpose of psychosocial risk assessment in the workplace, both the single-item and the five-item constructs were psychometrically appropriate. Associations with outcomes were about equivalent. However, the single item has the advantage of parsimony, whereas the five-item construct depicts several distinct forms of mobbing.
Upper-extremity and mobility subdomains from the Patient-Reported Outcomes Measurement Information System (PROMIS) adult physical functioning item bank.

PubMed

Hays, Ron D; Spritzer, Karen L; Amtmann, Dagmar; Lai, Jin-Shei; Dewitt, Esi Morgan; Rothrock, Nan; Dewalt, Darren A; Riley, William T; Fries, James F; Krishnan, Eswar

2013-11-01

To create upper-extremity and mobility subdomain scores from the Patient-Reported Outcomes Measurement Information System (PROMIS) physical functioning adult item bank. Expert reviews were used to identify upper-extremity and mobility items from the PROMIS item bank. Psychometric analyses were conducted to assess empirical support for scoring upper-extremity and mobility subdomains. Data were collected from the U.S. general population and multiple disease groups via self-administered surveys. The sample (N=21,773) included 21,133 English-speaking adults who participated in the PROMIS wave 1 data collection and 640 Spanish-speaking Latino adults recruited separately. Not applicable. We used English- and Spanish-language data and existing PROMIS item parameters for the physical functioning item bank to estimate upper-extremity and mobility scores. In addition, we fit graded response models to calibrate the upper-extremity items and mobility items separately, compare separate to combined calibrations, and produce subdomain scores. After eliminating items because of local dependency, 16 items remained to assess upper extremity and 17 items to assess mobility. The estimated correlation between upper extremity and mobility was .59 using existing PROMIS physical functioning item parameters (r=.60 using parameters calibrated separately for upper-extremity and mobility items). Upper-extremity and mobility subdomains shared about 35% of the variance in common, and produced comparable scores whether calibrated separately or together. The identification of the subset of items tapping these 2 aspects of physical functioning and scored using the existing PROMIS parameters provides the option of scoring these subdomains in addition to the overall physical functioning score. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
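The "about 35%" shared variance quoted above follows from squaring the reported correlations. A short worked check (the helper name is illustrative only):

```python
def shared_variance(r):
    """Proportion of variance two scores share, given their correlation r."""
    return r ** 2


if __name__ == "__main__":
    # Correlations reported in the abstract: .59 (existing PROMIS parameters)
    # and .60 (parameters calibrated separately for the two subdomains).
    for r in (0.59, 0.60):
        print(f"r = {r:.2f} -> shared variance = {shared_variance(r):.1%}")
```

Both values (roughly 34.8% and 36%) are consistent with the rounded figure in the abstract.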
Modeling the World Health Organization Disability Assessment Schedule II using non-parametric item response models.

PubMed

Galindo-Garre, Francisca; Hidalgo, María Dolores; Guilera, Georgina; Pino, Oscar; Rojo, J Emilio; Gómez-Benito, Juana

2015-03-01

The World Health Organization Disability Assessment Schedule II (WHO-DAS II) is a multidimensional instrument developed for measuring disability. It comprises six domains (understanding and communicating, getting around, self-care, getting along with others, life activities and participation in society). The main purpose of this paper is the evaluation of the psychometric properties for each domain of the WHO-DAS II with parametric and non-parametric Item Response Theory (IRT) models. A secondary objective is to assess whether the WHO-DAS II items within each domain form a hierarchy of invariantly ordered severity indicators of disability. A sample of 352 patients with a schizophrenia spectrum disorder is used in this study. The 36-item WHO-DAS II was administered during the consultation. Partial Credit and Mokken scale models are used to study the psychometric properties of the questionnaire. The psychometric properties of the WHO-DAS II scale are satisfactory for all the domains. However, we identify a few items that do not discriminate satisfactorily between different levels of disability and cannot be invariantly ordered in the scale. In conclusion, the WHO-DAS II can be used to assess overall disability in patients with schizophrenia, but some domains are too general to assess functionality in these patients because they contain items that are not applicable to this pathology. Copyright © 2014 John Wiley & Sons, Ltd.
Normative data for the Rappel libre/Rappel indicé à 16 items (16-item Free and Cued Recall) in the elderly Quebec-French population.

PubMed

Dion, Mélissa; Potvin, Olivier; Belleville, Sylvie; Ferland, Guylaine; Renaud, Mélanie; Bherer, Louis; Joubert, Sven; Vallet, Guillaume T; Simard, Martine; Rouleau, Isabelle; Lecomte, Sarah; Macoir, Joël; Hudon, Carol

2015-01-01

Performance on verbal memory tests is generally associated with socio-demographic variables such as age, sex, and education level. Performance also varies between different cultural groups. The present study aimed to establish normative data for the Rappel libre/Rappel indicé à 16 items (16-item Free and Cued Recall; RL/RI-16), a French adaptation of the Free and Cued Selective Reminding Test (Buschke, 1984; Grober, Buschke, Crystal, Bang, & Dresner, 1988). The sample consisted of 566 healthy French-speaking older adults (50-88 years old) from the province of Quebec, Canada. Normative data for the RL/RI-16 were derived from 80% of the total sample (normative sample) and cross-validated using the remaining participants (20%; validation sample). The effects of participants' age, sex, and education level were assessed on different indices of memory performance. Results indicated that these variables were independently associated with performance. Normative data are presented as regression equations with standard deviations (symmetric distributions) and percentiles (asymmetric distributions).
Citizenship. A Statewide Assessment in Texas.

ERIC Educational Resources Information Center

Texas Education Agency, Austin.

Citizenship test items developed by the National Assessment of Educational Progress were administered to a large sample of 9, 13, and 17-year old students, as part of the Texas Assessment Project. The knowledge and attitudes assessed fell into three categories: (1) political knowledge: constitutional rights, governmental structure, governmental…
Development and validation of green eating behaviors, stage of change, decisional balance, and self-efficacy scales in college students.

PubMed

Weller, Kathryn E; Greene, Geoffrey W; Redding, Colleen A; Paiva, Andrea L; Lofgren, Ingrid; Nash, Jessica T; Kobayashi, Hisanori

2014-01-01

To develop and validate an instrument to assess environmentally conscious eating (Green Eating [GE]) behavior (BEH) and GE Transtheoretical Model constructs including Stage of Change (SOC), Decisional Balance (DB), and Self-efficacy (SE). Cross-sectional instrument development survey. Convenience sample (n = 954) of 18- to 24-year-old college students from a northeastern university. The sample was randomly split: (N1) and (N2). N1 was used for exploratory factor analyses using principal components analyses; N2 was used for confirmatory analyses (structural modeling) and reliability analyses (coefficient α). The full sample was used for measurement invariance (multi-group confirmatory analyses) and convergent validity (BEH) and known group validation (DB and SE) by SOC using analysis of variance. Reliable (α > .7), psychometrically sound, and stable measures included 2 correlated 5-item DB subscales (Pros and Cons), 2 correlated SE subscales (school [5 items] and home [3 items]), and a single 6-item BEH scale. Most students (66%) were in Precontemplation and Contemplation SOC. Behavior, DB, and SE scales differed significantly by SOC (P < .001) with moderate to large effect sizes, as predicted by the Transtheoretical Model, which supported the validity of these measures. Successful development and preliminary validation of this 25-item GE instrument provides a basis for assessment as well as development of tailored interventions for college students. Copyright © 2014 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
The development of a knowledge test of depression and its treatment for patients suffering from non-psychotic depression: a psychometric assessment

PubMed Central

Gabriel, Adel; Violato, Claudio

2009-01-01

Background To develop and psychometrically assess a multiple choice question (MCQ) instrument to test knowledge of depression and its treatments in patients suffering from depression. Methods A total of 63 depressed patients and twelve psychiatric experts participated. Based on empirical evidence from an extensive review, theoretical knowledge and consultations with experts, a 27-item MCQ test of knowledge of depression and its treatment was constructed. Data collected from the psychiatry experts were used to assess evidence of content validity for the instrument. Results Cronbach's alpha of the instrument was 0.68, and there was an overall 87.8% agreement (items are highly relevant) between experts about the relevance of the MCQs to test patient knowledge on depression and its treatments. Patients' overall performance on the MCQs was satisfactory, with 78.7% correct answers. Results of an item analysis indicated that most items had adequate difficulties and discriminations. Conclusion There was adequate reliability and evidence for content and convergent validity for the instrument. Future research should employ a larger and more heterogeneous sample from both psychiatrist and community samples than did the present study. Meanwhile, the present study has resulted in psychometrically tested instruments for measuring knowledge of depression and its treatment among depressed patients. PMID:19754944
Assessment of health surveys: fitting a multidimensional graded response model.

PubMed

Depaoli, Sarah; Tiemensma, Jitske; Felt, John M

The multidimensional graded response model, an item response theory (IRT) model, can be used to improve the assessment of surveys, even when sample sizes are restricted. Typically, health-based survey development utilizes classical statistical techniques (e.g. reliability and factor analysis). In a review of four prominent journals within the field of Health Psychology, we found that IRT-based models were used in less than 10% of the studies examining scale development or assessment. However, implementing IRT-based methods can provide more details about individual survey items, which is useful when determining the final item content of surveys. An example using a quality of life survey for Cushing's syndrome (CushingQoL) highlights the main components for implementing the multidimensional graded response model. Patients with Cushing's syndrome (n = 397) completed the CushingQoL. Results from the multidimensional graded response model supported a 2-subscale scoring process for the survey. All items were deemed as worthy contributors to the survey. The graded response model can accommodate unidimensional or multidimensional scales, be used with relatively lower sample sizes, and is implemented in free software (example code provided in online Appendix). Use of this model can help to improve the quality of health-based scales being developed within the Health Sciences.
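To make the model named above concrete, the sketch below computes category probabilities for one item under a graded response model. It is deliberately simplified to the unidimensional case (a single trait θ); the multidimensional version used in the study replaces a·θ with a weighted sum over several traits. The parameter values are hypothetical, and this is not the example code from the article's online appendix.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a unidimensional graded response model.

    theta      : respondent's latent trait level
    a          : item discrimination
    thresholds : increasing boundary parameters b_1 < ... < b_m (m + 1 categories)
    """
    # Cumulative probabilities P(X >= k): 1 for the lowest category,
    # a logistic curve at each boundary, and 0 beyond the highest category.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Probability of each category is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


if __name__ == "__main__":
    # Hypothetical 4-category item.
    probs = grm_category_probs(theta=0.5, a=1.5, thresholds=[-1.0, 0.0, 1.2])
    for category, p in enumerate(probs):
        print(f"P(X = {category}) = {p:.3f}")
```

The thresholds must be in increasing order; otherwise some differences would be negative and the result would not be a valid probability distribution.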
[Examination of calibrated item banks for the assessment of work capacity in an outpatient sample of cardiological patients].

PubMed

Haschke, A; Abberger, B; Schröder, K; Wirtz, M; Bengel, J; Baumeister, H

2013-12-01

Work capacity is a major outcome variable in cardiological rehabilitation. However, there is a lack of comprehensive and economical assessment instruments for work capacity. Developing item banks based on item response theory is a first step towards closing this gap. The present study aims to validate the work capacity item banks for cardiovascular rehabilitation inpatients (WCIB-Cardio) in a sample of cardiovascular rehabilitation outpatients. Additionally, we examined differences between in- and outpatients with regard to their work capacity. Data of 283 cardiovascular rehabilitation inpatients and 77 cardiovascular rehabilitation outpatients were collected in 15 rehabilitation centres. The WCIB-Cardio contains the 2 domains of "cognitive work capacity" (20 items) and "physical work capacity" (18 items). Validation of the item bank for cardiological outpatients was conducted with a separate Rasch analysis for each domain. For the domain of cognitive work capacity, 10 items showed satisfactory quality criteria (Rasch reliability = 0.71; overall model fit = 0.07). For the domain of physical work capacity, good values for Rasch reliability (0.83) and overall model fit (0.65) were obtained after exclusion of 3 items. Unidimensionality and a broad ability spectrum could be covered for both domains. With regard to content, outpatients evaluate themselves as less burdened than inpatients for the domain of cognitive work capacity (mean outpatient = -2.06 vs. mean inpatient = -2.49; p<0.07) and similarly for the domain of physical work capacity (mean outpatient = -3.68 vs. mean inpatient = -2.88; p<0.01). The WCIB-Cardio II provides a basis for developing self-report instruments of work capacity in cardiological in- and outpatients. © Georg Thieme Verlag KG Stuttgart · New York.
Assessing Student Understanding of the "New Biology": Development and Evaluation of a Criterion-Referenced Genomics and Bioinformatics Assessment

NASA Astrophysics Data System (ADS)

Campbell, Chad Edward

Over the past decade, hundreds of studies have introduced genomics and bioinformatics (GB) curricula and laboratory activities at the undergraduate level. While these publications have facilitated the teaching and learning of cutting-edge content, there has yet to be an evaluation of these assessment tools to determine if they are meeting the quality control benchmarks set forth by the educational research community. An analysis of these assessment tools indicated that <10% referenced any quality control criteria and that none of the assessments met more than one of the quality control benchmarks. In the absence of evidence that these benchmarks had been met, it is unclear whether these assessment tools are capable of generating valid and reliable inferences about student learning. To remedy this situation the development of a robust GB assessment aligned with the quality control benchmarks was undertaken in order to ensure evidence-based evaluation of student learning outcomes. Content validity is a central piece of construct validity, and it must be used to guide instrument and item development. This study reports on: (1) the correspondence of content validity evidence gathered from independent sources; (2) the process of item development using this evidence; (3) the results from a pilot administration of the assessment; (4) the subsequent modification of the assessment based on the pilot administration results; and (5) the results from the second administration of the assessment. Twenty-nine different subtopics within GB (Appendix B: Genomics and Bioinformatics Expert Survey) were developed based on preliminary GB textbook analyses. These subtopics were analyzed using two methods designed to gather content validity evidence: (1) a survey of GB experts (n=61) and (2) a detailed content analysis of GB textbooks (n=6).
By including only the subtopics that were shown to have robust support across these sources, 22 GB subtopics were established for inclusion in the assessment. An expert panel subsequently developed, evaluated, and revised two multiple-choice items to align with each of the 22 subtopics, producing a final item pool of 44 items. These items were piloted with student samples of varying content exposure levels. Both Classical Test Theory (CTT) and Item Response Theory (IRT) methodologies were used to evaluate the assessment's validity, reliability and ability inferences, and its ability to differentiate students with different magnitudes of content exposure. A total of 18 items were subsequently modified and reevaluated by an expert panel. The 26 original and 18 modified items were once again piloted with student samples of varying content exposure levels. Both CTT and IRT methodologies were once again used to evaluate student responses in order to evaluate the assessment's validity and reliability inferences as well as its ability to differentiate students with different magnitudes of content exposure. Interviews with students from different content exposure levels were also performed in order to gather convergent validity evidence (external validity evidence) as well as substantive validity evidence. Also included are the limitations of the assessment and a set of guidelines on how the assessment can best be used.
An investigation of the measurement properties of the Spot-the-Word test in a community sample.

PubMed

Mackinnon, Andrew; Christensen, Helen

2007-12-01

Intellectual ability is assessed with the Spot-the-Word (STW) test (A. Baddeley, H. Emslie, & I. Nimmo Smith, 1993) by asking respondents to identify a word in a word-nonword item pair. Results in moderate-sized samples suggest this ability is resistant to decline due to dementia. The authors used a 3-parameter item response theory model to investigate the measurement properties of the STW in a large community-dwelling sample (n=2,480) 60 to 64 years of age. A number of poorly performing items were identified. Substantial guessing was present; however, the number of words correctly identified was found to be an accurate index of ability. Performance was moderately related to a number of tests of cognitive performance and was effectively unrelated to visual acuity and to physical or mental health status. The STW is a promising test of ability that, in the future, may be refined by the deletion or replacement of poorly functioning items.
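The 3-parameter model mentioned above adds a lower asymptote (guessing) parameter to the usual logistic item response curve, which is how "substantial guessing" can be quantified. The following is a minimal sketch of the standard 3PL response function. The parameter values are hypothetical (a guessing parameter of 0.5 is used only because each STW item offers two options) and are not estimates from the study.

```python
import math


def three_pl(theta, a, b, c):
    """Probability of a correct response under the 3-parameter logistic model.

    theta : respondent ability
    a     : item discrimination
    b     : item difficulty
    c     : lower asymptote (pseudo-guessing parameter)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Hypothetical item: moderate discrimination, average difficulty, chance level 0.5.
    for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
        p = three_pl(theta, a=1.2, b=0.0, c=0.5)
        print(f"theta = {theta:+.1f}  P(correct) = {p:.3f}")
```

With a nonzero c, even very low-ability respondents answer correctly with probability close to c, so raw scores are bounded below by chance performance.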
Replication Study of the Milwaukee Inventory for Subtypes of Trichotillomania–Adult Version in a Clinically Characterized Sample

PubMed Central

Keuthen, Nancy J.; Tung, Esther S.; Woods, Douglas W.; Franklin, Martin E.; Altenburger, Erin M.; Pauls, David L.; Flessner, Christopher A.

2015-01-01

In the present study, we evaluated the Milwaukee Inventory for Subtypes of Trichotillomania–Adult Version (MIST-A) in a replication sample of clinically characterized hair pullers using exploratory factor analysis (EFA; N = 193). EFA eigenvalues and visual inspection of our scree plot revealed a two-factor solution. Factor structure coefficients and internal consistencies suggested a 13-item scale with an 8-item “Intention” scale and a 5-item “Emotion” scale. Both scales displayed good construct and discriminant validity. These findings indicate the need for a revised scale that provides a more refined assessment of pulling phenomenology that can facilitate future treatment advances. PMID:25868534
Psychometric properties of the Multidimensional Assessment of Fatigue scale in traumatic brain injury: an NIDRR Traumatic Brain Injury Model Systems study.

PubMed

Lequerica, Anthony; Bushnik, Tamara; Wright, Jerry; Kolakowsky-Hayner, Stephanie A; Hammond, Flora M; Dijkers, Marcel P; Cantor, Joshua

2012-01-01

To investigate the psychometric properties of the Multidimensional Assessment of Fatigue (MAF) scale in a traumatic brain injury (TBI) sample. Prospective survey study. Community. One hundred sixty-seven individuals with TBI admitted for inpatient rehabilitation, enrolled into the TBI Model Systems national database, and followed up at either the first or second year postinjury. Not applicable. Multidimensional Assessment of Fatigue. The initial analysis, using items 1 to 14, which are based on a 10-point rating scale, found that only 1 item ("walking") misfit the overall construct of fatigue in this TBI population. However, this 10-point rating scale was found to have disordered thresholds. When ratings were collapsed into 4 response categories, all MAF items used to calculate the Global Fatigue Index formed a unidimensional scale. Findings generally support the unidimensionality of the MAF when used in a TBI population but call into question the use of a 10-point rating scale for items 1 to 14. Further study is needed to investigate the use of a 4-category rating scale across all items and the fit of the "walking" item for a measure of fatigue among individuals with TBI.

A psychometric study of the Fear of Sleep Inventory-Short Form (FoSI-SF).

PubMed

Pruiksma, Kristi E; Taylor, Daniel J; Ruggero, Camilo; Boals, Adriel; Davis, Joanne L; Cranston, Christopher; DeViva, Jason C; Zayfert, Claudia

2014-05-15

Fear of sleep may play a significant role in sleep disturbances in individuals with posttraumatic stress disorder (PTSD). This report describes a psychometric study of the Fear of Sleep Inventory (FoSI), which was developed to measure this construct. The psychometric properties of the FoSI were examined in a non-clinical sample of 292 college students (Study I) and in a clinical sample of 67 trauma-exposed adults experiencing chronic nightmares (Study II). Data on the 23 items of the FoSI were subjected to exploratory factor analyses (EFA) to identify items uniquely assessing fear of sleep. Next, reliability and validity of a 13-item version of the FoSI was examined in both samples. A 13-item Short-Form version (FoSI-SF) was identified as having a clear 2-factor structure with high internal consistency in both the non-clinical (α = 0.76-0.94) and clinical (α = 0.88-0.91) samples. Both studies demonstrated good convergent validity with measures of PTSD (0.48-0.61) and insomnia (0.39-0.48) and discriminant validity with a measure of sleep hygiene (0.19-0.27). The total score on the FoSI-SF was significantly higher in the clinical sample (mean = 17.90, SD = 12.56) than in the non-clinical sample (mean = 4.80, SD = 7.72); t(357) = 8.85 p < 0.001. Although all items are recommended for clinical purposes, the data support the use of the 13-item FoSI-SF for research purposes. Replication of the factor structure in clinical samples is needed. Results are discussed in terms of limitations of this study and directions for further research.
Development of the functional vision questionnaire for children and young people with visual impairment: the FVQ_CYP.

PubMed

Tadić, Valerija; Cooper, Andrew; Cumberland, Phillippa; Lewando-Hundt, Gillian; Rahi, Jugnoo S

2013-12-01

To develop a novel age-appropriate measure of functional vision (FV) for self-reporting by visually impaired (VI) children and young people. Questionnaire development. A representative patient sample of VI children and young people aged 10 to 15 years, visual acuity of the logarithm of the minimum angle of resolution (logMAR) worse than 0.48, and a school-based (nonrandom) expert group sample of VI students aged 12 to 17 years. A total of 32 qualitative semistructured interviews supplemented by narrative feedback from 15 eligible VI children and young people were used to generate draft instrument items. Seventeen VI students were consulted individually on item relevance and comprehensibility, instrument instructions, format, and administration methods. The resulting draft instrument was piloted with 101 VI children and young people comprising a nationally representative sample, drawn from 21 hospitals in the United Kingdom. Initial item reduction was informed by presence of missing data and individual item response pattern. Exploratory factor analysis (FA) and parallel analysis (PA), and Rasch analysis (RA) were applied to test the instrument's psychometric properties. Psychometric indices and validity assessment of the Functional Vision Questionnaire for Children and Young People (FVQ_CYP). A total of 712 qualitative statements became a 56-item draft scale, capturing the level of difficulty in performing vision-dependent activities. After piloting, items were removed iteratively as follows: 11 for high percentage of missing data, 4 for skewness, and 1 for inadequate item infit and outfit values in RA, 3 having shown differential item functioning across age groups and 1 across gender in RA. The remaining 36 items showed item fit values within acceptable limits, good measurement precision and targeting, and ordered response categories. 
The reduced scale has a clear unidimensional structure, with all items having a high factor loading on the single factor in FA and PA. The summary scores correlated significantly with visual acuity. We have developed a novel, psychometrically robust self-report questionnaire for children and young people, the FVQ_CYP, that captures the functional impact of visual disability from their perspective. The 36-item, 4-point unidimensional scale has potential as a complementary adjunct to objective clinical assessments in routine pediatric ophthalmology practice and in research. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
Applications of computerized adaptive testing (CAT) to the assessment of headache impact.

PubMed

Ware, John E; Kosinski, Mark; Bjorner, Jakob B; Bayliss, Martha S; Batenhorst, Alice; Dahlöf, Carl G H; Tepper, Stewart; Dowson, Andrew

2003-12-01

To evaluate the feasibility of computerized adaptive testing (CAT) and the reliability and validity of CAT-based estimates of headache impact scores in comparison with 'static' surveys. Responses to the 54-item Headache Impact Test (HIT) were re-analyzed for recent headache sufferers (n = 1016) who completed telephone interviews during the National Survey of Headache Impact (NSHI). Item response theory (IRT) calibrations and the computerized dynamic health assessment (DYNHA) software were used to simulate CAT assessments by selecting the most informative items for each person and estimating impact scores according to pre-set precision standards (CAT-HIT). Results were compared with IRT estimates based on all items (total-HIT), computerized 6-item dynamic estimates (CAT-HIT-6), and a developmental version of a 'static' 6-item form (HIT-6-D). Analyses focused on: respondent burden (survey length and administration time), score distributions ('ceiling' and 'floor' effects), reliability and standard errors, and clinical validity (diagnosis, level of severity). A random sample (n = 245) was re-assessed to test responsiveness. A second study (n = 1103) compared actual CAT surveys and an improved 'static' HIT-6 among current headache sufferers sampled on the Internet. Respondents completed measures from the first study and the generic SF-8 Health Survey; some (n = 540) were re-tested on the Internet after 2 weeks. In the first study, simulated CAT-HIT and total-HIT scores were highly correlated (r = 0.92) without 'ceiling' or 'floor' effects and with a substantial reduction (90.8%) in respondent burden. Six of the 54 items accounted for the great majority of item administrations (3603/5028, 77.6%). 
CAT-HIT reliability estimates were very high (0.975-0.992) in the range where 95% of respondents scored, and relative validity (RV) coefficients were high for diagnosis (RV = 0.87) and severity (RV = 0.89); patient-level classifications were accurate 91.3% for a diagnosis of migraine. For all three criteria of change, CAT-HIT scores were more responsive than all other measures. In the second study, estimates of respondent burden, item usage, reliability and clinical validity were replicated. The test-retest reliability of CAT-HIT was 0.79 and alternate forms coefficients ranged from 0.85 to 0.91. All correlations with the generic SF-8 were negative. CAT-based administrations of headache impact items achieved very large reductions in respondent burden without compromising validity for purposes of patient screening or monitoring changes in headache impact over time. IRT models and CAT-based dynamic health assessments warrant testing among patients with other conditions.
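The item-selection idea in the abstract above (administer the most informative item for each person, and stop once a pre-set precision is reached) can be illustrated with a minimal sketch. This is not the DYNHA algorithm: it assumes dichotomous items under a two-parameter logistic (2PL) model, whereas the HIT items and their calibration are not specified here, and the item bank, item ids, and parameter values are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing an item under a 2PL model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta, bank, administered):
    """Return the id of the unused item with the highest information at theta."""
    candidates = [i for i in bank if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

def standard_error(theta, bank, administered):
    """Standard error of theta implied by the items administered so far."""
    info = sum(item_information(theta, *bank[i]) for i in administered)
    return float("inf") if info == 0 else 1.0 / math.sqrt(info)

# Hypothetical item bank: id -> (discrimination a, location b)
bank = {"item_a": (1.8, -1.0), "item_b": (2.2, 0.0), "item_c": (1.2, 1.5)}

first = next_item(0.0, bank, set())
print(first, round(standard_error(0.0, bank, {first}), 3))
```

A working CAT would re-estimate theta after every response and repeat the selection until the standard error falls below the chosen precision standard; the sketch shows only the selection and precision calculations for a fixed provisional theta.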
Testing the Index of Problematic Online Experiences (I-POE) with a national sample of adolescents.

PubMed

Mitchell, Kimberly J; Jones, Lisa M; Wells, Melissa

2013-12-01

This article assesses the utility of the Index of Problematic Online Experiences (I-POE) in a national sample of adolescents in the United States. The study was based on a cross-sectional national telephone survey of 1560 Internet users, ages 10 through 17. Data were collected between August, 2010 and January, 2011. The I-POE is an 18-item binary response index which can be used to assess problematic internet use across multiple behaviors and activities. Exploratory and confirmatory factor analysis supported a revised index with two factors: a 9-item "excessive use" scale and a 9-item "online social and communication problems" scale among this population. The I-POE showed favorable psychometric properties including adequate internal consistency for the overall scale and for the two subscales. Scores correlate with offline emotional and behavioral difficulties and the I-POE could have value for use as a part of broad mental health assessment procedures in clinical or school settings. Copyright © 2013 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Rater reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS).

PubMed

Baker, Nancy A; Cook, James R; Redfern, Mark S

2009-01-01

This paper describes the inter-rater and intra-rater reliability, and the concurrent validity of an observational instrument, the Keyboard Personal Computer Style instrument (K-PeCS), which assesses stereotypical postures and movements associated with computer keyboard use. Three trained raters independently rated the video clips of 45 computer keyboard users to ascertain inter-rater reliability, and then re-rated a sub-sample of 15 video clips to ascertain intra-rater reliability. Concurrent validity was assessed by comparing the ratings obtained using the K-PeCS to scores developed from a 3D motion analysis system. The overall K-PeCS had excellent reliability [inter-rater: intra-class correlation coefficients (ICC)=.90; intra-rater: ICC=.92]. Most individual items on the K-PeCS had from good to excellent reliability, although six items fell below ICC=.75. Those K-PeCS items that were assessed for concurrent validity compared favorably to the motion analysis data for all but two items. These results suggest that most items on the K-PeCS can be used to reliably document computer keyboarding style.
Further Refinements in the Measurement of Exercise Imagery: The Exercise Imagery Inventory

ERIC Educational Resources Information Center

Giacobbi, Peter R., Jr.; Hausenblas, Heather A.; Penfield, Randall D.

2005-01-01

The factorial and construct validity of the Exercise Imagery Inventory (EII) were assessed with 3 separate samples of participants. In Phase 1, a 41-item measure was administered to 504 undergraduate students. Exploratory factor analysis supported a 4-factor model that explained 65% of the variance. In Phase 2, a 19-item measure was administered…
A Confirmatory Factor Analysis of Reilly's Role Overload Scale

ERIC Educational Resources Information Center

Thiagarajan, Palaniappan; Chakrabarty, Subhra; Taylor, Ronald D.

2006-01-01

In 1982, Reilly developed a 13-item scale to measure role overload. This scale has been widely used, but most studies did not assess the unidimensionality of the scale. Given the significance of unidimensionality in scale development, the current study reports a confirmatory factor analysis of the 13-item scale in two samples. Based on the…
Linking Outcomes from Peabody Picture Vocabulary Test Forms Using Item Response Models

ERIC Educational Resources Information Center

Hoffman, Lesa; Templin, Jonathan; Rice, Mabel L.

2012-01-01

Purpose: The present work describes how vocabulary ability as assessed by 3 different forms of the Peabody Picture Vocabulary Test (PPVT; Dunn & Dunn, 1997) can be placed on a common latent metric through item response theory (IRT) modeling, by which valid comparisons of ability between samples or over time can then be made. Method: Responses…
Assessing Conceptual and Algorithmic Knowledge in General Chemistry with ACS Exams

ERIC Educational Resources Information Center

Holme, Thomas; Murphy, Kristen

2011-01-01

In 2005, the ACS Examinations Institute released an exam for first-term general chemistry in which items are intentionally paired with one conceptual and one traditional item. A second-term, paired-questions exam was released in 2007. This paper presents an empirical study of student performances on these two exams based on national samples of…
The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency.

PubMed

Rose, Matthias; Bjorner, Jakob B; Gandek, Barbara; Bruce, Bonnie; Fries, James F; Ware, John E

2014-05-01

To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living. In simulations, a 10-item computerized adaptive test (CAT) eliminated floor and decreased ceiling effects, achieving higher measurement precision than any comparable length static tool across four SDs of the measurement range. Improved psychometric properties were transferred to the CAT's superior ability to identify differences between age and disease groups. The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range. Copyright © 2014. Published by Elsevier Inc.
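As an illustration of the two modelling steps named in the abstract above, the sketch below computes category probabilities under a graded response model and rescales a latent score to a metric with mean 50 and SD 10. It assumes the logistic form of Samejima's model and a latent score already standardized (mean 0, SD 1) in the reference population; the function names and parameter values are hypothetical and are not PROMIS calibrations.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model.

    thresholds must be increasing; K thresholds give K + 1 ordered categories.
    """
    # Cumulative probabilities P(X >= k) for k = 1..K, padded with 1 and 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    # Each category probability is the difference of adjacent cumulative values.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def to_t_score(theta, mean=50.0, sd=10.0):
    """Rescale a standardized latent score to the mean-50, SD-10 metric."""
    return mean + sd * theta

# Hypothetical 5-category item: discrimination 1.5, four ordered thresholds.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs], to_t_score(-1.5))
```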
Refining a self-assessment of informatics competency scale using Mokken scaling analysis.

PubMed

Yoon, Sunmoo; Shaffer, Jonathan A; Bakken, Suzanne

2015-01-01

Healthcare environments are increasingly implementing health information technology (HIT) and those from various professions must be competent to use HIT in meaningful ways. In addition, HIT has been shown to enable interprofessional approaches to health care. The purpose of this article is to describe the refinement of the Self-Assessment of Nursing Informatics Competencies Scale (SANICS) using analytic techniques based upon item response theory (IRT) and discuss its relevance to interprofessional education and practice. In a sample of 604 nursing students, the 93-item version of SANICS was examined using non-parametric IRT. The iterative modeling procedure included 31 steps comprising: (1) assessing scalability, (2) assessing monotonicity, (3) assessing invariant item ordering, and (4) expert input. SANICS was reduced to an 18-item hierarchical scale with excellent reliability. Fundamental skills for team functioning and shared decision making among team members (e.g. "using monitoring systems appropriately," "describing general systems to support clinical care") had the highest level of difficulty, and "demonstrating basic technology skills" had the lowest difficulty level. Most items reflect informatics competencies relevant to all health professionals. Further, the approaches can be applied to construct a new hierarchical scale or refine an existing scale related to informatics attitudes or competencies for various health professions.
Developing and investigating the use of single-item measures in organizational research.

PubMed

Fisher, Gwenith G; Matthews, Russell A; Gibbons, Alyssa Mitchell

2016-01-01

The validity of organizational research relies on strong research methods, which include effective measurement of psychological constructs. The general consensus is that multiple-item measures have better psychometric properties than single-item measures. However, due to practical constraints (e.g., survey length, respondent burden) there are situations in which certain single items may be useful for capturing information about constructs that might otherwise go unmeasured. We evaluated 37 items, including 18 newly developed items as well as 19 single items selected from existing multiple-item scales based on psychometric characteristics, to assess 18 constructs frequently measured in organizational and occupational health psychology research. We examined evidence of reliability; convergent, discriminant, and content validity assessments; and test-retest reliabilities at 1- and 3-month time lags for single-item measures using a multistage and multisource validation strategy across 3 studies, including data from N = 17 occupational health subject matter experts and N = 1,634 survey respondents across 2 samples. Items selected from existing scales generally demonstrated better internal consistency reliability and convergent validity, whereas these particular new items generally had higher levels of content validity. We offer recommendations regarding when use of single items may be more or less appropriate, as well as 11 items that seem acceptable, 14 items with mixed results that might be used with caution, and 12 items we do not recommend using as single-item measures. Although multiple-item measures are preferable from a psychometric standpoint, in some circumstances single-item measures can provide useful information. (c) 2016 APA, all rights reserved.
Examining Gender Differences in Written Assessment Tasks in Biology: A Case Study of Evolutionary Explanations

PubMed Central

Federer, Meghan Rector; Nehm, Ross H.; Pearl, Dennis K.

2016-01-01

Understanding sources of performance bias in science assessment provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Research investigating assessment bias due to factors such as instrument structure, participant characteristics, and item types is well documented across a variety of disciplines. However, the relationships among these factors are unclear for tasks evaluating understanding through performance on scientific practices, such as explanation. Using item-response theory (Rasch analysis), we evaluated differences in performance by gender on a constructed-response (CR) assessment about natural selection (ACORNS). Three isomorphic item strands of the instrument were administered to a sample of undergraduate biology majors and nonmajors (Group 1: n = 662 [female = 51.6%]; G2: n = 184 [female = 55.9%]; G3: n = 642 [female = 55.1%]). Overall, our results identify relationships between item features and performance by gender; however, the effect is small in the majority of cases, suggesting that males and females tend to incorporate similar concepts into their CR explanations. These results highlight the importance of examining gender effects on performance in written assessment tasks in biology. PMID:26865642
Psychometric properties of a revised version of the Assisting Hand Assessment (Kids-AHA 5.0).

PubMed

Holmefur, Marie M; Krumlinde-Sundholm, Lena

2016-06-01

The aim of this study was to scrutinize the Assisting Hand Assessment (AHA) version 4.4 for possible improvements and to evaluate the psychometric properties regarding internal scale validity and aspects of reliability of a revised version of the AHA. In collaboration with experts, scoring criteria were changed for four items, and one fully new item was constructed. Twenty-two original, one new, and four revised items were scored for 164 assessments of children with unilateral cerebral palsy aged 18 months to 12 years. Rasch measurement analysis was used to evaluate internal scale validity by exploring rating-scale functioning, item and person goodness-of-fit, and principal component analysis. Targeting and scale reliability were also evaluated. After removal of misfitting items, a 20-item scale showed satisfactory goodness-of-fit. Unidimensionality was confirmed by principal component analysis. The rating scale functioned well for the 20 items, and the item difficulty was well suited to the ability level of the sample. The person reliability coefficient was 0.98, indicating high separation ability of the scale. A conversion table of AHA scores between the previous version (4.4) and the new version (5.0) was constructed. The new, 20-item version of the Kids-AHA (version 5.0), demonstrated excellent internal scale validity, suggesting improved responsiveness to changes and shortened scoring time. For comparison of scores from version 4.4 to 5.0, a transformation table is presented. © 2015 Mac Keith Press.
Development and Initial Validation of Military Deployment-Related TBI Quality-of-Life Item Banks.

PubMed

Toyinbo, Peter A; Vanderploeg, Rodney D; Donnell, Alison J; Mutolo, Sandra A; Cook, Karon F; Kisala, Pamela A; Tulsky, David S

2016-01-01

To investigate unique factors that affect health-related quality of life (QOL) in individuals with military deployment-related traumatic brain injury (MDR-TBI) and to develop appropriate assessment tools, consistent with the TBI-QOL/PROMIS/Neuro-QOL systems. Three focus groups from each of the 4 Veterans Administration (VA) Polytrauma Rehabilitation Centers, consisting of 20 veterans with mild to severe MDR-TBI, and 36 VA providers were involved in the early stage of developing the new item banks. The item banks were field tested in a sample (N = 485) of veterans enrolled in VA and diagnosed with an MDR-TBI. Focus groups and survey. Developed item banks and short forms for Guilt, Posttraumatic Stress Disorder/Trauma, and Military-Related Loss. Three new item banks representing unique domains of MDR-TBI health outcomes were created: 15 new Posttraumatic Stress Disorder items plus 16 SCI-QOL legacy Trauma items, 37 new Military-Related Loss items plus 18 TBI-QOL legacy Grief/Loss items, and 33 new Guilt items. Exploratory and confirmatory factor analyses plus bifactor analysis of the items supported sufficient unidimensionality of the new item pools. Convergent and discriminant analysis results, as well as known-group comparisons, provided initial support for the validity and clinical utility of the new item response theory-calibrated item banks and their short forms. This work provides a unique opportunity to identify issues specific to individuals with MDR-TBI and ensure that they are captured in QOL assessment, thus extending the existing TBI-QOL measurement system.
Development and validation of a socioculturally competent trust in physician scale for a developing country setting.

PubMed

Gopichandran, Vijayaprasad; Wouters, Edwin; Chetlapalli, Satish Kumar

2015-05-03

Trust in physicians is the unwritten covenant between the patient and the physician that the physician will do what is in the best interest of the patient. This forms the undercurrent of all healthcare relationships. Several scales exist for assessment of trust in physicians in developed healthcare settings, but to our knowledge none of these have been developed in a developing country context. To develop and validate a new trust in physician scale for a developing country setting. Dimensions of trust in physicians, which were identified in a previous qualitative study in the same setting, were used to develop a scale. This scale was administered among 616 adults selected from urban and rural areas of Tamil Nadu, south India, using a multistage sampling cross sectional survey method. The individual items were analysed using a classical test approach as well as item response theory. Cronbach's α was calculated and the item to total correlation of each item was assessed. After testing for unidimensionality and absence of local dependence, a 2 parameter logistic Samejima's graded response model was fit and item characteristics assessed. Competence, assurance of treatment, respect for the physician and loyalty to the physician were important dimensions of trust. A total of 31 items were developed using these dimensions. Of these, 22 were selected for final analysis. The Cronbach's α was 0.928. The item to total correlations were acceptable for all the 22 items. The item response analysis revealed good item characteristic curves and item information for all the items. Based on the item parameters and item information, a final 12 item scale was developed. The scale performs optimally in the low to moderate trust range. The final 12 item trust in physician scale has a good construct validity and internal consistency. Published by the BMJ Publishing Group Limited.
Development and validation of a socioculturally competent trust in physician scale for a developing country setting

PubMed Central

Gopichandran, Vijayaprasad; Wouters, Edwin; Chetlapalli, Satish Kumar

2015-01-01

Trust in physicians is the unwritten covenant between the patient and the physician that the physician will do what is in the best interest of the patient. This forms the undercurrent of all healthcare relationships. Several scales exist for assessment of trust in physicians in developed healthcare settings, but to our knowledge none of these have been developed in a developing country context. Objectives: To develop and validate a new trust in physician scale for a developing country setting. Methods: Dimensions of trust in physicians, which were identified in a previous qualitative study in the same setting, were used to develop a scale. This scale was administered among 616 adults selected from urban and rural areas of Tamil Nadu, south India, using a multistage sampling cross sectional survey method. The individual items were analysed using a classical test approach as well as item response theory. Cronbach's α was calculated and the item to total correlation of each item was assessed. After testing for unidimensionality and absence of local dependence, a 2 parameter logistic Samejima's graded response model was fit and item characteristics assessed. Results: Competence, assurance of treatment, respect for the physician and loyalty to the physician were important dimensions of trust. A total of 31 items were developed using these dimensions. Of these, 22 were selected for final analysis. The Cronbach's α was 0.928. The item to total correlations were acceptable for all the 22 items. The item response analysis revealed good item characteristic curves and item information for all the items. Based on the item parameters and item information, a final 12 item scale was developed. The scale performs optimally in the low to moderate trust range. Conclusions: The final 12 item trust in physician scale has a good construct validity and internal consistency. PMID:25941182
A psychometric comparison of three scales and a single-item measure to assess sexual satisfaction.

PubMed

Mark, Kristen P; Herbenick, Debby; Fortenberry, J Dennis; Sanders, Stephanie; Reece, Michael

2014-01-01

This study was designed to systematically compare and contrast the psychometric properties of three scales developed to measure sexual satisfaction and a single-item measure of sexual satisfaction. The Index of Sexual Satisfaction (ISS), Global Measure of Sexual Satisfaction (GMSEX), and the New Sexual Satisfaction Scale-Short (NSSS-S) were compared to one another and to a single-item measure of sexual satisfaction. Conceptualization of the constructs, distribution of scores, internal consistency, convergent validity, test-retest reliability, and factor structure were compared between the measures. A total of 211 men and 214 women completed the scales and a measure of relationship satisfaction, with 33% (n = 139) of the sample reassessed two months later. All scales demonstrated appropriate distribution of scores and adequate internal consistency. The GMSEX, NSSS-S, and the single-item measure demonstrated convergent validity. Test-retest reliability was demonstrated by the ISS, GMSEX, and NSSS-S, but not the single-item measure. Taken together, the GMSEX received the strongest psychometric support in this sample for a unidimensional measure of sexual satisfaction and the NSSS-S received the strongest psychometric support in this sample for a bidimensional measure of sexual satisfaction.
[The appraisal of reliability and validity of subjective workload assessment technique and NASA-task load index].

PubMed

Xiao, Yuan-mei; Wang, Zhi-ming; Wang, Mian-zhen; Lan, Ya-jia

2005-06-01

To test the reliability and validity of two mental workload assessment scales, the Subjective Workload Assessment Technique (SWAT) and the NASA Task Load Index (NASA-TLX). A total of 1268 mental workers were sampled by randomized cluster sampling from occupations such as scientific research, education, administration and medicine. Test-retest reliability, split-half reliability, Cronbach's alpha coefficient and the correlation coefficients between item scores and the total score were used to test reliability. The test of validity covered structural validity. The test-retest reliability coefficients of the two scales and their items ranged from 0.516 to 0.753 (P < 0.01), indicating that the two scales had good test-retest reliability. For SWAT, the split-half reliability was 0.645, the Cronbach's alpha coefficient was more than 0.80, and all correlation coefficients between item scores and the total score were more than 0.70. For NASA-TLX, both the split-half reliability and the Cronbach's alpha coefficient were more than 0.80, and the correlation coefficients between item scores and the total score were all more than 0.60 (P < 0.01) except for the performance item. Both scales had good internal consistency. The Pearson correlation coefficient between the two scales was 0.492 (P < 0.01), implying that the results of the two scales were consistent. Factor analysis showed that the two scales had good structural validity. Both SWAT and NASA-TLX have good reliability and validity and may be used as valid tools to assess mental workload in China after being revised properly.
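Two of the reliability coefficients reported in the abstract above, Cronbach's alpha and split-half reliability, can be computed as in the following minimal sketch. The abstract does not say how the halves were formed or whether a Spearman-Brown correction was applied, so the odd/even split and the correction are assumptions, and the response data are invented for illustration.

```python
import math
from statistics import mean, variance

def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a list of respondents, each a list of item scores."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half_reliability(scores):
    """Odd/even split-half correlation with Spearman-Brown correction (assumed)."""
    odd = [sum(row[0::2]) for row in scores]
    even = [sum(row[1::2]) for row in scores]
    r = pearson(odd, even)
    return 2 * r / (1 + r)

# Invented responses: 5 respondents x 4 items.
responses = [
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 3, 4, 4],
]
print(round(cronbach_alpha(responses), 3), round(split_half_reliability(responses), 3))
```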
Mathematics Assessment Sampler 3-5

ERIC Educational Resources Information Center

National Council of Teachers of Mathematics, 2005

2005-01-01

The sample assessment items in this volume are sorted according to the strands of number and operations, algebra, geometry, measurement, and data analysis and probability. Because one goal of assessment is to determine students' abilities to communicate mathematically, the writing team suggests ways to extend or modify multiple-choice and…

Differential Item Functioning in Primary Healthcare Evaluation Instruments by French/English Version, Educational Level and Urban/Rural Location

PubMed Central

Haggerty, Jeannie L.; Bouharaoui, Fatima; Santor, Darcy A.

2011-01-01

Evaluating the extent to which groups or subgroups of individuals differ with respect to primary healthcare experience depends on first ruling out the possibility of bias. Objective: To determine whether item or subscale performance differs systematically between French/English, high/low education subgroups and urban/rural residency. Method: A sample of 645 adult users balanced by French/English language (in Quebec and Nova Scotia, respectively), high/low education and urban/rural residency responded to six validated instruments: the Primary Care Assessment Survey (PCAS); the Primary Care Assessment Tool – Short Form (PCAT-S); the Components of Primary Care Index (CPCI); the first version of the EUROPEP (EUROPEP-I); the Interpersonal Processes of Care Survey, version II (IPC-II); and part of the Veterans Affairs National Outpatient Customer Satisfaction Survey (VANOCSS). We normalized subscale scores to a 0-to-10 scale and tested for between-group differences using ANOVA tests. We used a parametric item response model to test for differences between subgroups in item discriminability and item difficulty. We re-examined group differences after removing items with differential item functioning. Results: Experience of care was assessed more positively in the English-speaking (Nova Scotia) than in the French-speaking (Quebec) respondents. We found differential English/French item functioning in 48% of the 153 items: discriminability in 20% and differential difficulty in 28%. English items were more discriminating generally than the French. Removing problematic items did not change the differences in French/English assessments. Differential item functioning by high/low education status affected 27% of items, with items being generally more discriminating in high-education groups. Between-group comparisons were unchanged. In contrast, only 9% of items showed differential item functioning by geography, affecting principally the accessibility attribute. 
Removing problematic items reversed a previously non-significant finding, revealing poorer first-contact access in rural than in urban areas. Conclusion: Differential item functioning does not bias or invalidate French/English comparisons on subscales, but additional development is required to make French and English items equivalent. These instruments are relatively robust by educational status and geography, but results suggest potential differences in the underlying construct in low-education and rural respondents. PMID:23205035
The state of chronic pain education in geriatric medicine fellowship training programs: results of a national survey.

PubMed

Weiner, Debra K; Turner, Gregory H; Hennon, John G; Perera, Subashan; Hartmann, Susanne

2005-10-01

A survey of U.S. geriatric medicine fellowship training programs was performed to assess the status of teaching about chronic pain evaluation and management and identify opportunities for improvement. After an initial e-mail query, 43 of 96 programs agreed to participate. A self-administered questionnaire, with items adapted from a 2002 consensus panel statement, was mailed to their 171 fellows-in-training and 43 fellowship directors. Thirty-two programs (33% of nationwide sample) including 79 fellows (30% of nationwide sample) and 25 directors (26% of nationwide sample) returned surveys; 21 institutions returned both faculty and fellow surveys. Overall, directors endorsed the 19 items identified by the consensus panel as essential components of fellowship training, but fellows identified deficiencies, both before and during fellowship training. Specific areas of undereducation included comprehensive musculoskeletal assessment, neuropathic pain evaluation, indications for low back pain imaging, the role of multidisciplinary pain clinics and nonpharmacological modalities, the effect of physical and psychosocial comorbidities in formulating treatment goals, and the effect of aging on analgesic metabolism and prescription. Both groups were generally positive about fellows' abilities to implement pain-related clinical skills. Discrepancies existed between fellowship directors' ratings of importance of teaching individual items and the degree to which teaching was actually done, as well as faculty versus fellow assessments of whether some of the 19 items were taught. Primary care training programs (e.g., internal medicine, family medicine, geriatric medicine) should pay more systematic attention to educating trainees about chronic pain to optimize patient care, decrease suffering, and diminish healthcare expenditures.
Development and refinement of the WAItE: a new obesity-specific quality of life measure for adolescents.

PubMed

Oluboyede, Yemi; Hulme, Claire; Hill, Andrew

2017-08-01

Few weight-specific outcome measures, developed specifically for obese and overweight adolescents, exist and none are suitable for the elicitation of utility values used in the assessment of cost effectiveness. The aim was to develop a descriptive system for a new weight-specific measure. Qualitative interviews were conducted with 31 treatment-seeking (above normal weight status) and non-treatment-seeking (school sample) adolescents aged 11-18 years, to identify a draft item pool and associated response options. A total of 315 eligible consenting adolescents, aged 11-18 years, enrolled in weight management services and recruited via an online panel, completed two versions of a long-list 29-item descriptive system (consisting of frequency and severity response scales). Psychometric assessments and Rasch analysis were applied to the draft 29-item instrument to identify a brief tool containing the best performing items and associated response options. Seven items were selected for the final item set; all displayed internal consistency, moderate floor effects and the ability to discriminate between weight categories. The assessment of unidimensionality was supported (t test statistic of 0.024, less than the 0.05 threshold value). The Weight-specific Adolescent Instrument for Economic-evaluation (WAItE) focuses on aspects of life affected by weight that are important to adolescents. It has the potential for adding key information to the assessment of weight management interventions aimed at the younger population.
Applying Item Response Theory methods to design a learning progression-based science assessment

NASA Astrophysics Data System (ADS)

Chen, Jing

Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats: Constructed Response (CR), Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The main findings from this study are as follows. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance, so additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. Second, items in multiple formats can be selected to achieve the information criterion at all the defined boundaries. This ensures the accuracy of the classification. Third, when item threshold parameters vary somewhat, the scoring rubrics and the items need to be reviewed to make the threshold parameters similar across items. This is because one important design criterion of the learning progression-based items is that ideally, a student should be at the same level across items, which means that the item threshold parameters (d1, d2 and d3) should be similar across items. To design a learning progression-based science assessment, we need to understand whether the assessment measures a single construct or several constructs and how items are associated with the constructs being measured. Results from dimension analyses indicate that items of different carbon transforming processes measure different aspects of the carbon cycle construct. However, items of different practices assess the same construct. In general, there are high correlations among different processes or practices. It is not clear whether the strong correlations are due to the inherent links among these process/practice dimensions or due to the fact that the student sample does not show much variation in these process/practice dimensions. Future data are needed to examine the dimensionalities in terms of process/practice in detail. Finally, based on item characteristics analysis, recommendations are made to write more discriminative CR items and better OMC and MTF options. Item writers can follow these recommendations to write better learning progression-based items.
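The first design step described in the abstract above (level boundaries defined as the means of the item thresholds across a set of good items) can be illustrated with a minimal sketch. The threshold values, the function names `level_boundaries` and `classify`, and the ability values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from bisect import bisect_right
from statistics import mean

# Hypothetical threshold estimates (d1, d2, d3) on the logit scale for four items.
item_thresholds = [
    (-1.2, 0.1, 1.4),
    (-0.8, 0.3, 1.6),
    (-1.0, -0.1, 1.2),
    (-1.0, 0.1, 1.4),
]

def level_boundaries(thresholds):
    """Average each threshold (d1, d2, d3) across items to get level boundaries."""
    return [mean(column) for column in zip(*thresholds)]

def classify(theta, boundaries):
    """Return the 1-based level whose interval on the ability scale contains theta.

    Assumes the boundaries are in ascending order.
    """
    return bisect_right(boundaries, theta) + 1

boundaries = level_boundaries(item_thresholds)
levels = [classify(theta, boundaries) for theta in (-2.0, -0.5, 0.5, 2.0)]
```

With three averaged thresholds there are four levels; a student whose ability estimate falls below the first boundary is placed at level 1, and so on up the scale.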
The Impact of Reading Self-Efficacy and Task Value on Reading Comprehension Scores in Different Item Formats

ERIC Educational Resources Information Center

Solheim, Oddny Judith

2011-01-01

It has been hypothesized that students with low self-efficacy will struggle with complex reading tasks in assessment situations. In this study we examined whether perceived reading self-efficacy and reading task value uniquely predicted reading comprehension scores in two different item formats in a sample of fifth-grade students. Results showed…
Concurrent Validity and Sensitivity to Change of Direct Behavior Rating Single-Item Scales (DBR-SIS) within an Elementary Sample

ERIC Educational Resources Information Center

Smith, Rhonda L.; Eklund, Katie; Kilgus, Stephen P.

2018-01-01

The purpose of this study was to evaluate the concurrent validity, sensitivity to change, and teacher acceptability of Direct Behavior Rating single-item scales (DBR-SIS), a brief progress monitoring measure designed to assess student behavioral change in response to intervention. Twenty-four elementary teacher-student dyads implemented a daily…
Developing the Impossible Figures Task to Assess Visual-Spatial Talents among Chinese Students: A Rasch Measurement Model Analysis

ERIC Educational Resources Information Center

Chan, David W.

2010-01-01

Data of item responses to the Impossible Figures Task (IFT) from 492 Chinese primary, secondary, and university students were analyzed using the dichotomous Rasch measurement model. Item difficulty estimates and person ability estimates located on the same logit scale revealed that the pooled sample of Chinese students, who were relatively highly…
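For reference, the dichotomous Rasch model named in this record places person ability and item difficulty on the same logit scale. In its standard form, with notation chosen here rather than taken from the record (\theta_n for the ability of person n, b_i for the difficulty of item i), the probability of a correct response is:

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
```

When \theta_n = b_i the probability is 0.5, which is why an item that most of a sample answers correctly is read as easy relative to that sample.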
Validation of the educational needs assessment tool as a generic instrument for rheumatic diseases in seven European countries.

PubMed

Ndosi, Mwidimi; Bremander, Ann; Hamnes, Bente; Horton, Mike; Kukkurainen, Marja Leena; Machado, Pedro; Marques, Andrea; Meesters, Jorit; Stamm, Tanja A; Tennant, Alan; de la Torre-Aboki, Jenny; Vliet Vlieland, Theodora P M; Zangi, Heidi A; Hill, Jackie

2014-12-01

To validate the educational needs assessment tool (ENAT) as a generic tool for assessing the educational needs of patients with rheumatic diseases in European Countries. A convenience sample of patients from seven European countries was included comprising the following diagnostic groups: ankylosing spondylitis, psoriatic arthritis, systemic sclerosis, systemic lupus erythematosus, osteoarthritis (OA) and fibromyalgia syndrome. Translated versions of the ENAT were completed through surveys in each country. Rasch analysis was used to assess the construct validity of the adapted ENATs including differential item functioning by culture (cross-cultural DIF). Initially, the data from each country and diagnostic group were fitted to the Rasch model separately, and then the pooled data from each diagnostic group. The sample comprised 3015 patients; the majority, 1996 (66.2%), were women. Patient characteristics (stratified by diagnostic group) were comparable across countries except the educational background, which was variable. In most occasions, the 39-item ENAT deviated significantly from the Rasch model expectations (item-trait interaction χ(2) p<0.05). After correction for local dependency (grouping the items into seven domains and analysing them as 'testlets'), fit to the model was satisfied (item-trait interaction χ(2) p>0.18) in all pooled disease group datasets except OA (χ(2)=99.91; p=0.002). The internal consistency in each group was high (Person Separation Index above 0.90). There was no significant DIF by person characteristics. Cross-cultural DIF was found in some items, which required adjustments. Subsequently, interval-level scales were calibrated to enable transformation of ENAT scores when required. The adapted ENAT is a valid tool with high internal consistency providing accurate estimation of the educational needs of people with rheumatic diseases. Cross-cultural comparison of educational needs is now possible. 
Published by the BMJ Publishing Group Limited.
The Chinese version of Instrument of Professional Attitude for Student Nurses (IPASN): Assessment of reliability and validity.

PubMed

Xiao, Yu-Ying; Li, Ting; Xiao, Lin; Wang, Su-Wei; Wang, Si-Qi; Wang, Han-Xiao; Wang, Bei-Bei; Gao, Yu-Lin

2017-02-01

Professional attitude is of great importance for nursing talents in modern society. To develop an effective educational program for student nurses in China, an appropriate instrument is required for the assessment of their professional attitude. The aim was to assess the validity and reliability of the Chinese version of the Instrument of Professional Attitude for Student Nurses (IPASN). The original version of the IPASN was translated using the Brislin model (translation, back translation, cultural adaptation and pilot study) with authorization from the developer. A total of 681 nursing students were chosen by stratified convenience sampling to assess construct validity using exploratory factor analysis (EFA). Item analysis, Cronbach's alpha coefficients, and test-retest reliability were also used to test the psychometric properties in this sample. A further 204 nursing undergraduate trainees were selected by cluster convenience sampling on a separate occasion to confirm the structure using confirmatory factor analysis (CFA). Corrected item-total correlations ranged from 0.33 to 0.69 and alpha-if-item-deleted values from 0.906 to 0.913, indicating that no item should be deleted. Cronbach's alpha was 0.91 for the total scale and ranged from 0.67 to 0.89 for the subscales. Test-retest reliability estimated from the intraclass correlation coefficient (ICC) was 0.74 (P<0.05). Differences in item scores between the high-score group (the first 27%) and low-score group (the last 27%) were significant (P<0.001), indicating that the item discrimination ability was good. Seven subscales (contribution to increase of scientific information load, autonomy, community service, continuous education, to promote professional development, cooperation and theory guiding practice) were identified in EFA and confirmed in CFA, and explained 65.5% of the total variance. 
It indicated that the Chinese version of IPASN was valid and reliable for the evaluation of nursing students' professional attitude. Copyright © 2016 Elsevier Ltd. All rights reserved.
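The extreme-groups comparison in this abstract (top 27% versus bottom 27% of total scores) can be sketched as follows. This is a minimal illustration with hypothetical ratings and a hypothetical function name; it reports only the per-item mean difference between the two groups and omits the significance test the authors report.

```python
from statistics import mean

def extreme_group_discrimination(responses, fraction=0.27):
    """Per-item mean difference between the top and bottom scoring groups.

    responses: list of per-respondent lists of item scores (equal lengths).
    """
    # Rank respondents by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(len(ranked) * fraction))
    high, low = ranked[:group_size], ranked[-group_size:]
    n_items = len(ranked[0])
    return [
        mean(r[i] for r in high) - mean(r[i] for r in low)
        for i in range(n_items)
    ]

# Hypothetical 5-point ratings from ten respondents on three items.
ratings = [
    [5, 4, 5], [4, 5, 4], [4, 4, 4], [3, 4, 3], [3, 3, 3],
    [3, 2, 3], [2, 3, 2], [2, 2, 2], [1, 2, 1], [1, 1, 2],
]
differences = extreme_group_discrimination(ratings)
```

A clearly positive difference for an item suggests that it separates high and low scorers; in practice the difference would be tested for significance, as in the abstract.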
Prediction of true test scores from observed item scores and ancillary data.

PubMed

Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

2015-05-01

In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE(®) General Analytical Writing and until 2009 in the case of TOEFL(®) iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater(®). In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
Psychometric properties of a short form of the Center for Epidemiologic Studies Depression (CES-D-10) scale for screening depressive symptoms in healthy community dwelling older adults.

PubMed

Mohebbi, Mohammadreza; Nguyen, Van; McNeil, John J; Woods, Robyn L; Nelson, Mark R; Shah, Raj C; Storey, Elsdon; Murray, Anne M; Reid, Christopher M; Kirpach, Brenda; Wolfe, Rory; Lockery, Jessica E; Berk, Michael

The 10-item Center for the Epidemiological Studies of Depression Short Form (CES-D-10) is a widely used self-report measure of depression symptomatology. The aim of this study is to investigate the psychometric properties of the CES-D-10 in healthy community dwelling older adults. The sample consists of 19,114 community-based individuals residing in Australia and the United States who participated in the ASPREE trial baseline assessment. All individuals were free of any major illness at the time. We evaluated construct validity by performing confirmatory factor analysis, examined measurement invariance across country and gender followed by evaluating item discrimination bias in age, gender, race, ethnicity and education level, and assessing internal consistency. High item-total correlations and Cronbach's alpha indicated high internal consistency. The factor analyses suggested a unidimensional factor structure. Construct validity was supported in the overall sample, and by country and gender sub-groups. The CES-D-10 was invariant across countries, and although evidence of marginal gender non-invariance was observed there was no evidence of notable gender specific item discrimination bias. No notable differences in discrimination parameters or group membership measurement non-invariance were detected by gender, age, race, ethnicity, and education level. These findings suggest the CES-D-10 is a reliable and valid measure of depression in a volunteer sample. No noteworthy evidence of non-invariance and/or item discrimination bias is observed across gender, age, race, language and ethnic groups. Copyright © 2017 Elsevier Inc. All rights reserved.
Development and preliminary evaluation of a music-based attention assessment for patients with traumatic brain injury.

PubMed

Jeong, Eunju; Lesiuk, Teresa L

2011-01-01

Impairments in attention are commonly seen in individuals with traumatic brain injury (TBI). While visual attention assessment measurements have been rigorously developed and frequently used in cognitive neurorehabilitation, there is a paucity of auditory attention assessment measurements for patients with TBI. The purpose of this study was to field test a researcher-developed Music-based Attention Assessment (MAA), a melodic contour identification test designed to assess three different types of attention (i.e., sustained attention, selective attention, and divided attention), for patients with TBI. Additionally, this study aimed to evaluate the readability and comprehensibility of the test items and to examine the preliminary psychometric properties of the scale and test items. Fifteen patients diagnosed with TBI completed 3 different series of tasks in which they were required to identify melodic contours. The resulting data showed that (a) test items in each of the 3 subtests were found to have an easy to moderate level of item difficulty and an acceptable to high level of item discrimination, and (b) the musical characteristics (i.e., contour, congruence, and pitch interference) were found to be associated with the level of item difficulty, and (c) the internal consistency of the MAA as computed by Cronbach's alpha was .95. Subsequent studies using a larger sample of typical participants, along with individuals with TBI, are needed to confirm construct validity and internal consistency of the MAA. In addition, the authors recommend examination of criterion validity of the MAA as correlated with current neuropsychological attention assessment measurements.
Psychometric evaluation of the Dutch version of the Subjective Opiate Withdrawal Scale (SOWS).

PubMed

Dijkstra, Boukje A G; Krabbe, Paul F M; Riezebos, Truus G M; van der Staak, Cees P F; De Jong, Cor A J

2007-01-01

To evaluate the psychometric properties of the Dutch version of the 16-item Subjective Opiate Withdrawal Scale (SOWS). The SOWS measures withdrawal symptoms at the time of assessment. The Dutch SOWS was repeatedly administered to a sample of 272 opioid-dependent inpatients of four addiction treatment centers during rapid detoxification with or without general anesthesia. Examination of the psychometric properties of the SOWS included exploratory factor analysis, internal consistency, test-retest reliability, and criterion validity. Exploratory factor analysis of the SOWS revealed a general pattern of four factors with three items not always clustered in the same factors at different points of measurement. After excluding these items from factor analysis four factors were identified during detoxification (temperature dysregulation, tractus locomotorius, tractus gastro-intestinalis and facial disinhibition). The 13-item SOWS shows high internal consistency and test-retest reliability and good validity at different stages of withdrawal. The 13-item SOWS is a reliable and valid instrument to assess opioid withdrawal during rapid detoxification. Three items were deleted because their content does not correspond directly with opioid withdrawal symptoms. Copyright (c) 2007 S. Karger AG, Basel.
Reliability (internal consistency) of the job content questionnaire on job stress among office workers of a multinational company in Kuala Lumpur.

PubMed

Maizura, Husna; Masilamani, Retneswari; Aris, Tahir

2009-04-01

This small, cross-sectional study assessed the reliability of 3 scales from the Job Content Questionnaire (JCQ), namely decision latitude, psychological job demand, and social support, in a group of office workers in a multinational company in Kuala Lumpur. A universal sample of 30 white-collar workers from a department of the company self-administered, on-site, the English version of the JCQ comprising 21 core items selected from the full recommended version of 49 items. Reliability (internal consistency) was evaluated using Cronbach's alpha coefficients for each scale. Corrected item-total correlation was presented for each item. Cronbach's alpha coefficients were acceptable for decision latitude (.76) and social support (.79) but slightly lower for psychological job demand (.64). Values for all item-total correlations for all 3 scales were greater than .3. In conclusion, this study suggests that the JCQ is a reliable scale for assessing job stress in this group of workers.
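The two statistics reported in this abstract can be computed from a respondents-by-items score table. The sketch below uses the standard formula for Cronbach's alpha, k/(k-1) × (1 − sum of item variances / variance of total scores), and correlates each item with the sum of the remaining items. The data and function names are hypothetical and are not taken from the study.

```python
from statistics import mean, variance

def cronbach_alpha(scores):
    """scores: list of per-respondent lists of item scores (equal lengths)."""
    k = len(scores[0])
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def corrected_item_total(scores):
    """Correlate each item with the total of the remaining items."""
    result = []
    for i, column in enumerate(zip(*scores)):
        rest = [sum(row) - row[i] for row in scores]
        result.append(pearson(column, rest))
    return result

# Hypothetical responses from six respondents on a three-item scale.
scores = [
    [4, 3, 4], [2, 2, 3], [3, 3, 3], [1, 2, 1], [4, 4, 3], [2, 1, 2],
]
alpha = cronbach_alpha(scores)
item_total = corrected_item_total(scores)
```

The correlation is "corrected" because the item itself is excluded from the total it is correlated with; otherwise the item would inflate its own correlation.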
The Work-Family Conflict Scale (WAFCS): development and initial validation of a self-report measure of work-family conflict for use with parents.

PubMed

Haslam, Divna; Filus, Ania; Morawska, Alina; Sanders, Matthew R; Fletcher, Renee

2015-06-01

This paper outlines the development and validation of the Work-Family Conflict Scale (WAFCS) designed to measure work-to-family conflict (WFC) and family-to-work conflict (FWC) for use with parents of young children. An expert informant and consumer feedback approach was utilised to develop and refine 20 items, which were subjected to a rigorous validation process using two separate samples of parents of 2-12 year old children (n = 305 and n = 264). As a result of statistical analyses several items were dropped resulting in a brief 10-item scale comprising two subscales assessing theoretically distinct but related constructs: FWC (five items) and WFC (five items). Analyses revealed both subscales have good internal consistency, construct validity as well as concurrent and predictive validity. The results indicate the WAFCS is a promising brief measure for the assessment of work-family conflict in parents. Benefits of the measure as well as potential uses are discussed.
Factor structure and psychometric properties of the Fertility Problem Inventory–Short Form

PubMed Central

Zurlo, Maria Clelia; Cattaneo Della Volta, Maria Franscesca; Vallone, Federica

2017-01-01

The study analyses factor structure and psychometric properties of the Italian version of the Fertility Problem Inventory–Short Form. A sample of 206 infertile couples completed the Italian version of Fertility Problem Inventory (46 items) with demographics, State Anxiety Scale of State-Trait Anxiety Inventory (Form Y), Edinburgh Depression Scale and Dyadic Adjustment Scale, used to assess convergent and discriminant validity. Confirmatory factor analysis was unsatisfactory (comparative fit index = 0.87; Tucker-Lewis Index = 0.83; root mean square error of approximation = 0.17), and Cronbach’s α (0.95) revealed a redundancy of items. Exploratory factor analysis was carried out deleting cross-loading items, and Mokken scale analysis was applied to verify the items homogeneity within the reduced subscales of the questionnaire. The Fertility Problem Inventory–Short Form consists of 27 items, tapping four meaningful and reliable factors. Convergent and discriminant validity were confirmed. Findings indicated that the Fertility Problem Inventory–Short Form is a valid and reliable measure to assess infertility-related stress dimensions. PMID:29379625
Anatomy of a physics test: Validation of the physics items on the Texas Assessment of Knowledge and Skills

NASA Astrophysics Data System (ADS)

Marshall, Jill A.; Hagedorn, Eric A.; O'Connor, Jerry

2009-06-01

We report the results of an analysis of the Texas Assessment of Knowledge and Skills (TAKS) designed to determine whether the TAKS is a valid indicator of whether students know and can do physics at the level necessary for success in future coursework, STEM careers, and life in a technological society. We categorized science items from the 2003 and 2004 10th and 11th grade TAKS by content area(s) covered, knowledge and skills required to select the correct answer, and overall quality. We also analyzed a 5000 student sample of item-level results from the 2004 11th grade exam, performing full-information factor analysis, calculating classical test indices, and determining each item's response curve using item response theory. Triangulation of our results revealed strengths and weaknesses of the different methods of analysis. The TAKS was found to be only weakly indicative of physics preparation and we make recommendations for increasing the validity of standardized physics testing.
Construct measurement quality improves predictive accuracy in violence risk assessment: an illustration using the personality assessment inventory.

PubMed

Hendry, Melissa C; Douglas, Kevin S; Winter, Elizabeth A; Edens, John F

2013-01-01

Much of the risk assessment literature has focused on the predictive validity of risk assessment tools. However, these tools often comprise a list of risk factors that are themselves complex constructs, and focusing on the quality of measurement of individual risk factors may improve the predictive validity of the tools. The present study illustrates this concern using the Antisocial Features and Aggression scales of the Personality Assessment Inventory (Morey, 1991). In a sample of 1,545 prison inmates and offenders undergoing treatment for substance abuse (85% male), we evaluated (a) the factorial validity of the ANT and AGG scales, (b) the utility of original ANT and AGG scales and newly derived ANT and AGG scales for predicting antisocial outcomes (recidivism and institutional infractions), and (c) whether items with a stronger relationship to the underlying constructs (higher factor loadings) were in turn more strongly related to antisocial outcomes. Confirmatory factor analyses (CFAs) indicated that ANT and AGG items were not structured optimally in these data in terms of correspondence to the subscale structure identified in the PAI manual. Exploratory factor analyses were conducted on a random split-half of the sample to derive optimized alternative factor structures, and cross-validated in the second split-half using CFA. Four-factor models emerged for both the ANT and AGG scales, and, as predicted, the size of item factor loadings was associated with the strength with which items were associated with institutional infractions and community recidivism. This suggests that the quality by which a construct is measured is associated with its predictive strength. Implications for risk assessment are discussed. Copyright © 2013 John Wiley & Sons, Ltd.
Serial Assessment of Trauma Care Capacity in Ghana in 2004 and 2014.

PubMed

Stewart, Barclay T; Quansah, Robert; Gyedu, Adam; Boakye, Godfred; Abantanga, Francis; Ankomah, James; Donkor, Peter; Mock, Charles

2016-02-01

Trauma care capacity assessments in developing countries have generated evidence to support advocacy, detailed baseline capabilities, and informed targeted interventions. However, serial assessments to determine the effect of capacity improvements or changes over time have rarely been performed. To compare the availability of trauma care resources in Ghana between 2004 and 2014 to assess the effects of a decade of change in the trauma care landscape and derive recommendations for improvements. Capacity assessments were performed using direct inspection and structured interviews derived from the World Health Organization's Guidelines for Essential Trauma Care. In Ghana, 10 hospitals in 2004 and 32 hospitals in 2014 were purposively sampled to represent those most likely to care for injuries. Clinical staff, administrators, logistic/procurement officers, and technicians/biomedical engineers who interacted, directly or indirectly, with trauma care resources were interviewed at each hospital. Availability of items for trauma care was rated from 0 (complete absence) to 3 (fully available). Factors contributing to deficiency in 2014 were determined for items rated lower than 3. Each item rated lower than 3 at a specific hospital was defined as a hospital-item deficiency. Scores for total number of hospital-item deficiencies were derived for each contributing factor. There were significant improvements in mean ratings for trauma care resources: district-level (smaller) hospitals had a mean rating of 0.8 for all items in 2004 vs 1.3 in 2014 (P = .002); regional (larger) hospitals had a mean rating of 1.1 in 2004 vs 1.4 in 2014 (P = .01). However, a number of critical deficiencies remain (eg, chest tubes, diagnostics, and orthopedic and neurosurgical care; mean ratings ≤ 2). Leading contributing factors were item absence (503 hospital-item deficiencies), lack of training (335 hospital-item deficiencies), and stockout of consumables (137 hospital-item deficiencies). 
There has been significant improvement in trauma care capacity during the past decade in Ghana; however, critical deficiencies remain and require urgent redress to avert preventable death and disability. Serial capacity assessment is a valuable tool for monitoring efforts to strengthen trauma care systems, identifying what has been successful, and highlighting needs.
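The deficiency tally described in this abstract (any item rated below 3 at a given hospital counts as one hospital-item deficiency, summed per contributing factor) can be sketched as follows. The hospitals, items, ratings, and the assumption of a single contributing factor per deficiency are hypothetical simplifications for illustration only.

```python
from collections import Counter

FULLY_AVAILABLE = 3  # ratings run from 0 (complete absence) to 3 (fully available)

# Hypothetical data: (hospital, item) -> (rating, contributing factor or None).
ratings = {
    ("Hospital A", "chest tube"): (0, "item absence"),
    ("Hospital A", "pulse oximeter"): (3, None),
    ("Hospital B", "chest tube"): (2, "lack of training"),
    ("Hospital B", "pulse oximeter"): (1, "stockout of consumables"),
    ("Hospital C", "chest tube"): (1, "item absence"),
}

def deficiencies_by_factor(ratings):
    """Count hospital-item deficiencies (rating below 3) per contributing factor."""
    counts = Counter()
    for (hospital, item), (rating, factor) in ratings.items():
        if rating < FULLY_AVAILABLE:
            counts[factor] += 1
    return counts

counts = deficiencies_by_factor(ratings)
```

Here `counts.most_common()` would list the contributing factors from most to least frequent, analogous to the ranking of leading factors in the abstract.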
Magazine alcohol advertising compliance with the Australian Alcoholic Beverages Advertising Code.

PubMed

Donovan, Kati; Donovan, Rob; Howat, Peter; Weller, Narelle

2007-01-01

The purpose of this study was to assess the frequency and content of alcoholic beverage advertisements and sales promotions in magazines popular with adolescents and young people in Australia, and assess the extent to which the ads complied with Australia's self-regulatory Alcoholic Beverages Advertising Code (ABAC). Alcohol advertisements and promotions were identified in a sample of 93 magazines popular with young people. The identified items were coded against 28 measures constructed to assess the content of the items against the five sections of the ABAC. Two thirds of the magazines contained at least one alcohol advertisement or promotion with a total of 142 unique items identified: 80 were brand advertisements and 62 were other types of promotional items (i.e. sales promotions, event sponsorships, cross promotions with other marketers and advertorials). It was found that 52% of items appeared to contravene at least one section of the ABAC. The two major apparent breaches related to section B (items having a strong appeal to adolescents; 34%) and section C (promoting positive social, sexual and psychological expectancies of consumption; 28%). It was also found that promotional items appeared to breach the ABAC as often as did advertisements. It is concluded that the self-regulating system appears not to be working for the alcoholic beverages industry in Australia and that increased government surveillance and regulation should be considered, giving particular emphasis to the inclusion of promotional items other than brand advertising.

Construction and validation of a psychometric scale to measure awareness on consumption of irradiated foods.

PubMed

Rusin, Tiago; Araújo, Wilma Maria Coelho; Faiad, Cristiane; Vital, Helio de Carvalho

2017-01-01

Although food irradiation has been used to ensure food safety, most consumers are unaware of the basic concepts of irradiation, misinterpret information, and show a negative attitude toward food items treated with ionizing radiation. This research aimed to develop a tool to assess awareness of the consumption of irradiated food. The sample was composed of employees from different social classes and educational levels at Brazilian universities, who reflect the end-users of irradiated foods and are representative of the views of lay consumers. The total number of respondents was 614. To assess awareness, an instrument, the Awareness Scale on Consumption of Irradiated Foods (ASCIF), was developed and submitted to semantic tests and judges' validation. The instrument, which included 32 items, covered four construct factors: concepts (6 items), awareness (10 items), labeling (7 items) and safety of irradiated foods (9 items). The data were collected electronically through a website. Exploratory factor analysis (EFA) identified 4 factors, which summarize the 31 items included. These factors account for 64.32% of the variance of the items, and the internal consistency of the factors was deemed good. Exploratory Structural Equation Modeling (ESEM) was conducted to evaluate the factor structure of the instrument. The proposed instrument was found to meet consistency criteria as an efficient tool for assessing potential challenges and opportunities for the irradiated food markets.
Construction and validation of a psychometric scale to measure awareness on consumption of irradiated foods

PubMed Central

2017-01-01

Although food irradiation has been used to ensure food safety, most consumers are unaware of the basic concepts of irradiation, misinterpreting information and demonstrating a negative attitude toward food items treated with ionizing radiation. This research aimed to develop a tool to assess awareness of the consumption of irradiated food. The sample was composed of employees from different social classes and school levels of Brazilian universities, who reflect the end-users of irradiated foods and are representative of the views of lay consumers. The total number of respondents was 614. To assess the Awareness Scale on Consumption of Irradiated Foods (ASCIF), an instrument was developed and submitted to semantic tests and validation by judges. The instrument, which included 32 items, covered four construct factors: concepts (6 items), awareness (10 items), labeling (7 items) and safety of irradiated foods (9 items). The data were collected by electronic means, through the site. Exploratory factor analysis (EFA) yielded 4 factors, which summarize the 31 items included. These factors account for 64.32% of the variance of the items, and the internal consistency of the factors was deemed good. Exploratory Structural Equation Modeling (ESEM) was conducted to evaluate the factor structure of the instrument. The proposed instrument was found to meet consistency criteria as an efficient tool for assessing potential challenges and opportunities for the irradiated food markets. PMID:29220375
The Blood Donor Anxiety Scale: a six-item state anxiety measure based on the Spielberger State-Trait Anxiety Inventory.

PubMed

Chell, Kathleen; Waller, Daniel; Masser, Barbara

2016-06-01

Research demonstrates that anxiety elevates the risk of blood donors experiencing adverse events, which in turn deters the performance of repeat blood donations. Identifying donors suffering from heightened state anxiety is important to assess the impact of evidence-based interventions. This study analyzed the appropriateness of a shortened version of the state subscale of the State-Trait Anxiety Inventory (STAI) in a blood donation context. STAI-State questionnaire data were collected from two separate samples of Australian blood donors (n = 919 and n = 824 after cleaning). Responses to demographic, donation history, and adverse reaction questions were also obtained. Identification of items and analysis was performed systematically to assess and compare internal reliability and content, construct, convergent, and criterion validity of three potential short-form state anxiety scales. Of the three short-form scales tested, the STAI-State six-item scale demonstrated the best metric properties with the fewest items across both sample groups. Its Cronbach's alpha was acceptable (α = 0.844 and α = 0.820), and the scale correlated positively with the original measure (r = 0.927 and r = 0.931) and with criterion-related variables, and maintained the two-dimension factorial structure of the original measure. The six-item short version of the STAI-State subscale presented the most reliable and valid scale for use with blood donors. A validated donor anxiety tool provides a standardized assessment and record of donor anxiety to gauge the effectiveness of ongoing efforts to enhance the donation experience. © 2016 AABB.
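The internal reliability figures quoted above are Cronbach's alpha values. The record does not include the computation, so the following is only a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [variance(col) for col in zip(*scores)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (invented): three respondents answering two items.
toy = [[1, 2], [2, 4], [3, 6]]
alpha = cronbach_alpha(toy)
```

With real questionnaire data, each inner list would hold one donor's responses to the six retained items.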
The shortened food expectations--Long-term care questionnaire: Assessing nursing home residents' satisfaction with food and food service.

PubMed

Crogan, Neva L; Evans, Bronwynne C

2006-11-01

Lack of nursing home resident satisfaction with meals often results in reduced food intake, leading to poor nutritional status, weight loss, functional decline, and depression. The purpose of this article is to describe the development and initial testing of the 28-item revised Food Expectations-Long-Term Care (FoodEx-LTC) questionnaire with a convenience sample of nursing home residents (N = 61). Because of possible respondent burden, the original 44-item, five-domain FoodEx-LTC was revised, resulting in the deletion of 16 redundant items and those with inter-item correlations less than .25. Coefficient alpha scores ranged from .65 to .82, and test-retest correlations ranged from .79 to .88, dependent on domain. This revised instrument has good initial validity and reliability, resulting in a shorter instrument that accurately assesses nursing home resident satisfaction with food and food service.
Validation of Physics Standardized Test Items

NASA Astrophysics Data System (ADS)

Marshall, Jill

2008-10-01

The Texas Physics Assessment Team (TPAT) examined the Texas Assessment of Knowledge and Skills (TAKS) to determine whether it is a valid indicator of physics preparation for future course work and employment, and of the knowledge and skills needed to act as an informed citizen in a technological society. We categorized science items from the 2003 and 2004 10th and 11th grade TAKS by content area(s) covered, knowledge and skills required to select the correct answer, and overall quality. We also analyzed a 5000 student sample of item-level results from the 2004 11th grade exam using standard statistical methods employed by test developers (factor analysis and Item Response Theory). Triangulation of our results revealed strengths and weaknesses of the different methods of analysis. The TAKS was found to be only weakly indicative of physics preparation and we make recommendations for increasing the validity of standardized physics testing.
Colorado Student Assessment Program: 2001 Released Passages, Items, and Prompts. Grade 4 Reading and Writing, Grade 4 Lectura y Escritura, Grade 5 Mathematics and Reading, Grade 6 Reading, Grade 7 Reading and Writing, Grade 8 Mathematics, Reading and Science, Grade 9 Reading, and Grade 10 Mathematics and Reading and Writing.

ERIC Educational Resources Information Center

Colorado State Dept. of Education, Denver.

This document contains released reading comprehension passages, test items, and writing prompts from the Colorado Student Assessment Program for 2001. The sample questions and prompts are included without answers or examples of student responses. Test materials are included for: (1) Grade 4 Reading and Writing; (2) Grade 4 Lectura y Escritura…
Evaluation of the Parent-Report Inventory of Callous-Unemotional Traits in a Sample of Children Recruited from Intimate Partner Violence Services: A Multidimensional Rasch Analysis.

PubMed

McDonald, Shelby Elaine; Ma, Lin; Green, Kathy E; Hitti, Stephanie A; Cody, Anna M; Donovan, Courtney; Williams, James Herbert; Ascione, Frank R

2018-03-01

Our study applied multidimensional item response theory (MIRT) to compare structural models of the parent-report version of the Inventory of Callous and Unemotional Traits (ICU; English and North American Spanish translations). A total of 291 maternal caregivers were recruited from community-based domestic violence services and reported on their children (77.9% ethnic minority; 47% female), who ranged in age from 7 to 12 years (mean = 9.07, standard deviation = 1.64). We compared 9 models that were based on prior psychometric evaluations of the ICU. MIRT analyses indicated that a revised 18-item version comprising 2 factors (callous-unemotional and empathic-prosocial) was more suitable for our sample. Differential item functioning was found for several items across ethnic and language groups, but not for child gender or age. Evidence of construct validity was found. We recommend continued research and revisions to the ICU to better assess the presence of callous-unemotional traits in community samples of school-age children. © 2017 Wiley Periodicals, Inc.
An Item Response Theory Analysis of DSM-IV Cannabis Abuse and Dependence Criteria in Adolescents

PubMed Central

Hartman, Christie A.; Gelhorn, Heather; Crowley, Thomas J.; Sakai, Joseph T.; Stallings, Michael; Young, Susan E.; Rhee, Soo Hyun; Corley, Robin; Hewitt, John K.; Hopfer, Christian J.

2008-01-01

Objective To examine three aspects of adolescent cannabis problems: 1) do DSM-IV cannabis abuse and dependence criteria represent two different levels of severity of substance involvement, 2) to what degree do each of the 11 abuse and dependence criteria assess adolescent cannabis problems, and 3) do the DSM-IV items function similarly across different adolescent populations? Method We examined 5587 adolescents aged 11–19, including 615 youth in treatment for substance use disorders, 179 adjudicated youth, and 4793 youth from the community. All subjects were assessed with a structured diagnostic interview. Item response theory was utilized to analyze symptom endorsement patterns. Results Abuse and dependence criteria were not found to represent different levels of severity of problem cannabis use in any of the samples. Among the 11 abuse and dependence criteria, Problems cutting down and Legal problems were the least informative for distinguishing problem users. Two dependence criteria and three of the four abuse criteria indicated different severities of cannabis problems across samples. Conclusions We found little evidence to support the idea that abuse and dependence are separate constructs for adolescent cannabis problems. Furthermore, certain abuse criteria may indicate severe substance problems while specific dependence items may indicate less severe problems. The abuse items in particular need further study. These results have implications for the refinement of the current substance use disorder criteria for DSM-V. PMID:18176333
Designing and Testing an Inventory for Measuring Social Media Competency of Certified Health Education Specialists

PubMed Central

Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann

2015-01-01

Background Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). Objective The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. Methods The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Results Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (ρ>.90), or based on feedback received from pilot participants.
During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Conclusions Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES. PMID:26399428
Designing and Testing an Inventory for Measuring Social Media Competency of Certified Health Education Specialists.

PubMed

Alber, Julia M; Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann

2015-09-23

Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (ρ>.90), or based on feedback received from pilot participants.
During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES.
Sub-grouping patients with non-specific low back pain based on cluster analysis of discriminatory clinical items.

PubMed

Billis, Evdokia; McCarthy, Christopher J; Roberts, Chris; Gliatis, John; Papandreou, Maria; Gioftsos, George; Oldham, Jacqueline A

2013-02-01

To identify potential subgroups amongst patients with non-specific low back pain based on a consensus list of potentially discriminatory examination items. Exploratory study. A convenience sample of 106 patients with non-specific low back pain (43 males, 63 females, mean age 36 years, standard deviation 15.9 years) and 7 physiotherapists. Based on 3 focus groups and a two-round Delphi involving 23 health professionals and a random stratified sample of 150 physiotherapists, respectively, a comprehensive examination list comprising the most "discriminatory" items was compiled. Following reliability analysis, the most reliable clinical items were assessed with a sample of patients with non-specific low back pain. K-means cluster analysis was conducted for 2-, 3- and 4-cluster options to explore for meaningful homogenous subgroups. The most clinically meaningful cluster was a two-subgroup option, comprising a small group (n = 24) with more severe clinical presentation (i.e. more widespread pain, functional and sleeping problems, other symptoms, increased investigations undertaken, more severe clinical signs, etc.) and a larger less dysfunctional group (n = 80). A number of potentially discriminatory clinical items were identified by health professionals and sub-classified, based on a sample of patients with non-specific low back pain, into two subgroups. However, further work is needed to validate this classification process.
Psychometric evaluation of the Questionnaire about the Process of Recovery (QPR).

PubMed

Williams, Julie; Leamy, Mary; Pesola, Francesca; Bird, Victoria; Le Boutillier, Clair; Slade, Mike

2015-12-01

Supporting recovery is the aim of national mental health policy in many countries. However, only one measure of recovery has been developed in England: the Questionnaire about the Process of Recovery (QPR), which measures recovery from the perspective of adult mental health service users with a psychosis diagnosis. To independently evaluate the psychometric properties of the 15- and 22-item versions of the QPR. Two samples were used: data-set 1 (n = 88) involved assessment of the QPR at baseline, 2 weeks and 3 months. Data-set 2 (n = 399; trial registration: ISRCTN02507940) involved assessment of the QPR at baseline and 1 year. For the 15-item version, internal consistency was 0.89, convergent validity was 0.73, test-retest reliability was 0.74 and sensitivity to change was 0.40. Confirmatory factor analysis showed the 15-item version offered a good fit. For the 22-item version, the interpersonal subscale was found to underperform and the intrapersonal subscale overlaps substantially with the 15-item version. Both the 15-item and the intrapersonal subscale of the 22-item versions of the QPR demonstrated satisfactory psychometric properties. The 15-item version is slightly more robust and also less burdensome, so it can be recommended for use in research and clinical practice. © The Royal College of Psychiatrists 2015.
Rasch analysis of the UK Functional Assessment Measure in patients with complex disability after stroke.

PubMed

Medvedev, Oleg N; Turner-Stokes, Lynne; Ashford, Stephen; Siegert, Richard J

2018-02-28

To determine whether the UK Functional Assessment Measure (UK FIM+FAM) fits the Rasch model in stroke patients with complex disability and, if so, to derive a conversion table of Rasch-transformed interval level scores. The sample included a UK multicentre cohort of 1,318 patients admitted for specialist rehabilitation following a stroke. Rasch analysis was conducted for the 30-item scale including 3 domains of items measuring physical, communication and psychosocial functions. The fit of items to the Rasch model was examined using 3 different analytical approaches referred to as "pathways". The best fit was achieved in the pathway where responses from motor, communication and psychosocial domains were summarized into 3 super-items and where some items were split because of differential item functioning (DIF) relative to left and right hemisphere location (χ²(10) = 14.48, p = 0.15). Re-scoring of items showing disordered thresholds did not significantly improve the overall model fit. The UK FIM+FAM with domain super-items satisfies expectations of the unidimensional Rasch model without the need for re-scoring. A conversion table was produced to convert the total scale scores into interval-level data based on person estimates of the Rasch model. The clinical benefits of interval-transformed scores require further evaluation.
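The abstract does not state the model equations. As a rough illustration only, the sketch below uses the simplest dichotomous Rasch model, in which the probability of endorsing an item depends on the difference between person ability (theta) and item difficulty (b). This is a simplifying assumption: the mention of thresholds implies polytomous items, so the study would have used a polytomous extension. All names and values here are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """P(response = 1) under the dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_raw_score(theta, difficulties):
    """Expected total score for a person across a set of dichotomous items."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Invented item difficulties, symmetric around zero.
difficulties = [-1.0, 0.0, 1.0]
score_at_zero = expected_raw_score(0.0, difficulties)
```

A raw-score-to-interval conversion table of the kind described above maps each possible total score to the ability value whose expected raw score matches it; the sketch shows only the forward direction of that relationship.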
The Consumer Assessment of Healthcare Providers and Systems (CAHPS) cultural competence (CC) item set.

PubMed

Weech-Maldonado, Robert; Carle, Adam; Weidmer, Beverly; Hurtado, Margarita; Ngo-Metzger, Quyen; Hays, Ron D

2012-09-01

There is a need for reliable and valid measures of cultural competence (CC) from the patient's perspective. This paper evaluates the reliability and validity of the Consumer Assessment of Healthcare Providers and Systems (CAHPS) CC item set. Using 2008 survey data, we assessed the internal consistency of the CAHPS CC scales using Cronbach's α and examined the validity of the measures using exploratory and confirmatory factor analysis, multitrait scaling analysis, and regression analysis. A random stratified sample (based on race/ethnicity and language) of 991 enrollees, younger than 65 years, from 2 Medicaid managed care plans in California and New York. CAHPS CC item set after excluding screener items and ratings. Confirmatory factor analysis (Comparative Fit Index=0.98, Tucker Lewis Index=0.98, and Root Mean Square Error of Approximation=0.06) provided support for a 7-factor structure: Doctor Communication--Positive Behaviors, Doctor Communication--Negative Behaviors, Doctor Communication--Health Promotion, Doctor Communication--Alternative Medicine, Shared Decision-Making, Equitable Treatment, and Trust. Item-total correlations (corrected for item overlap) for the 7 scales exceeded 0.40. Exploratory factor analysis showed support for 1 additional factor: Access to Interpreter Services. Internal consistency reliability estimates ranged from 0.58 (Alternative Medicine) to 0.92 (Positive Behaviors) and were 0.70 or higher for 4 of the 8 composites. All composites were positively and significantly associated with the overall doctor rating. The CAHPS CC 26-item set demonstrates adequate measurement properties and can be used as a supplemental item set to the CAHPS Clinician and Group Surveys in assessing culturally competent care from the patient's perspective.
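An item-total correlation "corrected for item overlap" correlates each item with the sum of the other items in its scale, so the item does not inflate its own correlation. The sketch below is one plain way to compute it, assuming Pearson correlations; the helper names and toy data are hypothetical and not from the CAHPS analysis.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(scores):
    """For each item, correlate it with the total of the remaining items."""
    k = len(scores[0])
    result = []
    for j in range(k):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]  # total excluding item j
        result.append(pearson(item, rest))
    return result


# Invented responses: four respondents, three items; the third item is noisy.
toy = [[1, 1, 4], [2, 2, 1], [3, 3, 3], [4, 4, 2]]
r = corrected_item_total(toy)
flagged = [j for j, value in enumerate(r) if value < 0.40]
```

The 0.40 cut-off in the last line mirrors the threshold quoted in the abstract; the sketch does not handle items with zero variance.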
Rapid and Accurate Behavioral Health Diagnostic Screening: Initial Validation Study of a Web-Based, Self-Report Tool (the SAGE-SR)

PubMed Central

Purcell, Susan E; Rhea, Karen; Maier, Philip; First, Michael; Zweede, Lisa; Sinisterra, Manuela; Nunn, M Brad; Austin, Marie-Paule; Brodey, Inger S

2018-01-01

Background The Structured Clinical Interview for DSM (SCID) is considered the gold standard assessment for accurate, reliable psychiatric diagnoses; however, because of its length, complexity, and training required, the SCID is rarely used outside of research. Objective This paper aims to describe the development and initial validation of a Web-based, self-report screening instrument (the Screening Assessment for Guiding Evaluation-Self-Report, SAGE-SR) based on the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) and the SCID-5-Clinician Version (CV) intended to make accurate, broad-based behavioral health diagnostic screening more accessible within clinical care. Methods First, study staff drafted approximately 1200 self-report items representing individual granular symptoms in the diagnostic criteria for the 8 primary SCID-CV modules. An expert panel iteratively reviewed, critiqued, and revised items. The resulting items were iteratively administered and revised through 3 rounds of cognitive interviewing with community mental health center participants. In the first 2 rounds, the SCID was also administered to participants to directly compare their Likert self-report and SCID responses. A second expert panel evaluated the final pool of items from cognitive interviewing and criteria in the DSM-5 to construct the SAGE-SR, a computerized adaptive instrument that uses branching logic from a screener section to administer appropriate follow-up questions to refine the differential diagnoses. The SAGE-SR was administered to healthy controls and outpatient mental health clinic clients to assess test duration and test-retest reliability. Cutoff scores for screening into follow-up diagnostic sections and criteria for inclusion of diagnoses in the differential diagnosis were evaluated. 
Results The expert panel reduced the initial 1200 test items to 664 items that panel members agreed collectively represented the SCID items from the 8 targeted modules and DSM criteria for the covered diagnoses. These 664 items were iteratively submitted to 3 rounds of cognitive interviewing with 50 community mental health center participants; the expert panel reviewed session summaries and agreed on a final set of 661 clear and concise self-report items representing the desired criteria in the DSM-5. The SAGE-SR constructed from this item pool took an average of 14 min to complete in a nonclinical sample versus 24 min in a clinical sample. Responses to individual items can be combined to generate DSM criteria endorsements and differential diagnoses, as well as provide indices of individual symptom severity. Preliminary measures of test-retest reliability in a small, nonclinical sample were promising, with good to excellent reliability for screener items in 11 of 13 diagnostic screening modules (intraclass correlation coefficient [ICC] or kappa coefficients ranging from .60 to .90), with mania achieving fair test-retest reliability (ICC=.50) and other substance use endorsed too infrequently for analysis. Conclusions The SAGE-SR is a computerized adaptive self-report instrument designed to provide rigorous differential diagnostic information to clinicians. PMID:29572204
Discriminant content validity: a quantitative methodology for assessing content of theory-based measures, with illustrative applications.

PubMed

Johnston, Marie; Dixon, Diane; Hart, Jo; Glidewell, Liz; Schröder, Carin; Pollard, Beth

2014-05-01

In studies involving theoretical constructs, it is important that measures have good content validity and that there is not contamination of measures by content from other constructs. While reliability and construct validity are routinely reported, to date, there has not been a satisfactory, transparent, and systematic method of assessing and reporting content validity. In this paper, we describe a methodology of discriminant content validity (DCV) and illustrate its application in three studies. Discriminant content validity involves six steps: construct definition, item selection, judge identification, judgement format, single-sample test of content validity, and assessment of discriminant items. In three studies, these steps were applied to a measure of illness perceptions (IPQ-R) and control cognitions. The IPQ-R performed well with most items being purely related to their target construct, although timeline and consequences had small problems. By contrast, the study of control cognitions identified problems in measuring constructs independently. In the final study, direct estimation response formats for theory of planned behaviour constructs were found to have as good DCV as Likert format. The DCV method allowed quantitative assessment of each item and can therefore inform the content validity of the measures assessed. The methods can be applied to assess content validity before or after collecting data to select the appropriate items to measure theoretical constructs. Further, the data reported for each item in Appendix S1 can be used in item or measure selection. Statement of contribution What is already known on this subject? There are agreed methods of assessing and reporting construct validity of measures of theoretical constructs, but not their content validity. Content validity is rarely reported in a systematic and transparent manner. What does this study add? 
The paper proposes discriminant content validity (DCV), a systematic and transparent method of assessing and reporting whether items assess the intended theoretical construct and only that construct. In three studies, DCV was applied to measures of illness perceptions, control cognitions, and theory of planned behaviour response formats. Appendix S1 gives content validity indices for each item of each questionnaire investigated. Discriminant content validity is ideally applied while the measure is being developed, before using to measure the construct(s), but can also be applied after using a measure. © 2014 The British Psychological Society.
Vegetable parenting practices scale. Item response modeling analyses

PubMed Central

Chen, Tzu-An; O’Connor, Teresia; Hughes, Sheryl; Beltran, Alicia; Baranowski, Janice; Diep, Cassandra; Baranowski, Tom

2015-01-01

Objective To evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We also tested for differences in the ways items function (called differential item functioning) across child’s gender, ethnicity, age, and household income groups. Method Parents of 3–5 year old children completed a self-reported vegetable parenting practices scale online. Vegetable parenting practices consisted of 14 effective vegetable parenting practices and 12 ineffective vegetable parenting practices items, each with three subscales (responsiveness, structure, and control). Multidimensional polytomous item response modeling was conducted separately on effective vegetable parenting practices and ineffective vegetable parenting practices. Results One effective vegetable parenting practice item did not fit the model well in the full sample or across demographic groups, and another was a misfit in differential item functioning analyses across child’s gender. Significant differential item functioning was detected across children’s age and ethnicity groups, and more among effective vegetable parenting practices than ineffective vegetable parenting practices items. Wright maps showed items only covered parts of the latent trait distribution. The harder- and easier-to-respond ends of the construct were not covered by items for effective vegetable parenting practices and ineffective vegetable parenting practices, respectively. Conclusions Several effective vegetable parenting practices and ineffective vegetable parenting practices scale items functioned differently on the basis of child’s demographic characteristics; therefore, researchers should use these vegetable parenting practices scales with caution.
Item response modeling should be incorporated in analyses of parenting practice questionnaires to better assess differences across demographic characteristics. PMID:25895694
Preliminary development and psychometric evaluation of an unmet needs measure for adolescents and young adults with cancer: the Cancer Needs Questionnaire - Young People (CNQ-YP).

PubMed

Clinton-McHarg, Tara; Carey, Mariko; Sanson-Fisher, Rob; D'Este, Catherine; Shakeshaft, Anthony

2012-01-30

Adolescent and young adult (AYA) cancer survivors may have unique physical, psychological and social needs due to their cancer occurring at a critical phase of development. The aim of this study was to develop a psychometrically rigorous measure of unmet need to capture the specific needs of this group. Items were developed following a comprehensive literature review, focus groups with AYAs, and feedback from health care providers, researchers and other professionals. The measure was pilot tested with 32 AYA cancer survivors recruited through a state-based cancer registry to establish face and content validity. A main sample of 139 AYA cancer patients and survivors was recruited through seven treatment centres and invited to complete the questionnaire. To establish test-retest reliability, a sub-sample of 34 participants completed the measure a second time. Exploratory factor analysis was performed and the measure was assessed for internal consistency, discriminative validity, potential responsiveness and acceptability. The Cancer Needs Questionnaire - Young People (CNQ-YP) has established face and content validity, and acceptability. The final measure has 70 items and six factors: Treatment Environment and Care (33 items); Feelings and Relationships (14 items); Daily Life (12 items); Information and Activities (5 items); Education (3 items); and Work (3 items). All domains achieved Cronbach's alpha values greater than 0.80. Item-to-item test-retest reliability was also high, with all but four items reaching weighted kappa values above 0.60. The CNQ-YP is the first multi-dimensional measure of unmet need which has been developed specifically for AYA cancer patients and survivors. The measure displays a strong factor structure, and excellent internal consistency and test-retest reliability. However, the small sample size has implications for the reliability of the statistical analyses undertaken, particularly the exploratory factor analysis. Future studies with a larger sample are recommended to confirm the factor structure of the measure. Longitudinal studies to establish responsiveness and predictive validity should also be undertaken.
Preliminary development and psychometric evaluation of an unmet needs measure for adolescents and young adults with cancer: the Cancer Needs Questionnaire - Young People (CNQ-YP)

PubMed Central

2012-01-01

Background Adolescent and young adult (AYA) cancer survivors may have unique physical, psychological and social needs due to their cancer occurring at a critical phase of development. The aim of this study was to develop a psychometrically rigorous measure of unmet need to capture the specific needs of this group. Methods Items were developed following a comprehensive literature review, focus groups with AYAs, and feedback from health care providers, researchers and other professionals. The measure was pilot tested with 32 AYA cancer survivors recruited through a state-based cancer registry to establish face and content validity. A main sample of 139 AYA cancer patients and survivors was recruited through seven treatment centres and invited to complete the questionnaire. To establish test-retest reliability, a sub-sample of 34 participants completed the measure a second time. Exploratory factor analysis was performed and the measure was assessed for internal consistency, discriminative validity, potential responsiveness and acceptability. Results The Cancer Needs Questionnaire - Young People (CNQ-YP) has established face and content validity, and acceptability. The final measure has 70 items and six factors: Treatment Environment and Care (33 items); Feelings and Relationships (14 items); Daily Life (12 items); Information and Activities (5 items); Education (3 items); and Work (3 items). All domains achieved Cronbach's alpha values greater than 0.80. Item-to-item test-retest reliability was also high, with all but four items reaching weighted kappa values above 0.60. Conclusions The CNQ-YP is the first multi-dimensional measure of unmet need which has been developed specifically for AYA cancer patients and survivors. The measure displays a strong factor structure, and excellent internal consistency and test-retest reliability. However, the small sample size has implications for the reliability of the statistical analyses undertaken, particularly the exploratory factor analysis. Future studies with a larger sample are recommended to confirm the factor structure of the measure. Longitudinal studies to establish responsiveness and predictive validity should also be undertaken. PMID:22284545
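The internal-consistency criterion reported above (Cronbach's alpha greater than 0.80 for each domain) can be illustrated with a minimal sketch. The function name and the toy response matrix are hypothetical and are not taken from the CNQ-YP study; the formula is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 3 items on a 1-4 scale.
toy_responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 3, 4],
    [4, 4, 4],
]
print(round(cronbach_alpha(toy_responses), 3))
```

With real data, each domain (for example the 14 Feelings and Relationships items) would be passed in separately, since alpha is reported per domain.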
[Item function analysis on the Quality of Life-Alzheimer's Disease(QOL-AD)Chinese version, based on the Item Response Theory(IRT)].

PubMed

Wan, Li-ping; He, Run-lian; Ai, Yong-mei; Zhang, Hui-min; Xing, Min; Yang, Lin; Song, Yan-long; Yu, Hong-mei

2013-07-01

To introduce item function analysis (IFA) of the Chinese version of the Quality of Life-Alzheimer's Disease (QOL-AD) scale and to explore the feasibility of applying it to Chinese patients with AD. Two hundred AD patients, selected by stratified cluster sampling, were interviewed and assessed with the QOL-AD. Multilog 7.03 was used for the item function analysis. The discrimination parameter (a), difficulty parameter (b) and item characteristic curve (ICC) of each QOL-AD item were estimated. The discrimination parameters of items 1 and 7 were below 0.6, while all the others were above 0.6. As for the ICCs, except for items 1 and 7, the first and last curves of each item were monotonic and the two curves in between were inverted-V-shaped with steep slopes. Results from the IFA showed that the QOL-AD is applicable to Chinese patients with AD.
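The curve shapes described above (monotonic first and last curves, inverted-V curves in between) are what a graded response model produces for a four-category item. The sketch below assumes Samejima's graded response model, which the abstract does not confirm, and uses hypothetical parameter values rather than any estimate from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """P(X = k | theta), k = 0..len(thresholds), under a graded response model.

    `a` is the discrimination parameter; `thresholds` are ascending
    difficulty parameters, one per boundary between adjacent categories.
    """
    # Cumulative P(X >= k): 1 for the lowest category, 0 past the highest.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical item: discrimination 1.5, three thresholds -> four categories.
A, THRESHOLDS = 1.5, [-1.0, 0.0, 1.0]
grid = [-4.0 + 0.5 * i for i in range(17)]
curves = list(zip(*(grm_category_probs(t, A, THRESHOLDS) for t in grid)))

for k, curve in enumerate(curves):
    peak_theta = grid[curve.index(max(curve))]
    print(f"category {k}: highest probability on the grid at theta = {peak_theta}")
```

A larger `a` makes every curve steeper, so items with low discrimination (such as the two the abstract flags) would show flatter, less distinct category curves.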

On the Viability of PTSD Checklist (PCL) Short Form Use: Analyses from Mississippi Gulf Coast Hurricane Katrina Survivors

ERIC Educational Resources Information Center

Hirschel, Michael J.; Schulenberg, Stefan E.

2010-01-01

One measure commonly used to assess posttraumatic stress disorder is the PTSD Checklist (PCL). Lang and Stein (2005) extracted 4 subsets of PCL items, validating 2 of them for possible use in screening in primary care settings. The viability of the 4 item subsets was evaluated psychometrically in the present study with a sample of Hurricane…
The Development of the Stages of Recovery Scale for Persons with Persistent Mental Illness

ERIC Educational Resources Information Center

Song, Li-Yu; Hsu, Su-Ting

2011-01-01

This study aimed to develop a scale which could be used as a valid way to show the evidence of recovery-oriented services. A 51-item scale was developed to assess both the component processes and outcomes of recovery. A sample of 471 participants completed the questionnaire. The factor analysis yielded a 45-item scale with six subscales,…
Using Surveillance of Mental Health to Increase Understanding of Youth Involvement in High-Risk Behaviors: A Value-Added Analysis

ERIC Educational Resources Information Center

Dowdy, Erin; Furlong, Michael J.; Sharkey, Jill D.

2013-01-01

This study examined the potential utility of adding items that assessed youths' emotional and behavioral disorders to a commonly used surveillance survey. The goal was to evaluate whether the added items could enhance understanding of youths' involvement in high-risk behaviors. A sample of 3,331 adolescents in Grades 8, 10, and 12 from four…
[Assessment of stress in childhood: Children's Daily Stress Inventory (Inventario Infantil de Estresores Cotidianos, IIEC)].

PubMed

Trianes Torres, María Victoria; Blanca Mena, María José; Fernández Baena, Francisco J; Escobar Espejo, Milagros; Maldonado Montero, Enrique F; Muñoz Sánchez, Angela María

2009-11-01

The present study introduces the Children's Daily Stress Inventory (Inventario Infantil de Estresores Cotidianos, IIEC) as a measure that assesses daily stress in primary school children. The inventory was applied to a sample of 1094 primary school students. The final version includes 25 dichotomous items covering the areas of health, school/peers, and family. The score is obtained by summing the positive answers. Analyses of items, reliability and several external pieces of evidence of validity based on relations with other variables are presented. The results show adequate psychometric properties for the assessment of daily stress in children.
Trauma Coping Self-Efficacy: A Context Specific Self-Efficacy Measure for Traumatic Stress

PubMed Central

Benight, Charles C.; Shoji, Kotaro; James, Lori E.; Waldrep, Edward E.; Delahanty, Douglas L.; Cieslak, Roman

2015-01-01

The psychometric properties of a Trauma Coping Self-Efficacy (CSE-T) scale that assesses general trauma-related coping self-efficacy perceptions were assessed. Measurement equivalence was assessed using several different samples: hospitalized trauma patients (n1 = 74, n2 = 69, n3 = 60), three samples of disaster survivors (n1 = 273, n2 = 227, n3 = 138), and trauma exposed college students (N = 242). This is the first multi-sample evaluation of the psychometric properties for a general trauma-related CSE measure. Results showed that a brief and parsimonious 9-item version of the CSE performed well across the samples with a robust factor structure; factor structure and factor loadings were similar across study samples. The 9-item scale CSE-T demonstrated measurement equivalence across samples indicating that the underlying concept of general post-traumatic CSE is organized in a similar manner in the different trauma-exposed groups. These results offer strong support for cross-event construct validity of the CSE-T scale. Associations of the CSE-T with important expected covariates showed significant evidence for convergent validity. Finally, discriminant validity was also supported. Replication of the factor structure, internal reliability, and other evidence for construct validity is a critical next step for future research. PMID:26524542
Sample size allocation for food item radiation monitoring and safety inspection.

PubMed

Seto, Mayumi; Uriu, Koichiro

2015-03-01

The objective of this study is to identify a procedure for determining sample size allocation for food radiation inspections of more than one food item to minimize the potential risk to consumers of internal radiation exposure. We consider a simplified case of food radiation monitoring and safety inspection in which a risk manager is required to monitor two food items, milk and spinach, in a contaminated area. Three protocols for food radiation monitoring with different sample size allocations were assessed by simulating random sampling and inspections of milk and spinach in a conceptual monitoring site. Distributions of ¹³¹I and radiocesium concentrations were determined in reference to ¹³¹I and radiocesium concentrations detected in Fukushima prefecture, Japan, for March and April 2011. The results of the simulations suggested that a protocol that allocates sample size to milk and spinach based on the estimation of ¹³¹I and radiocesium concentrations using the apparent decay rate constants sequentially calculated from past monitoring data can most effectively minimize the potential risks of internal radiation exposure. © 2014 Society for Risk Analysis.
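One way to read "apparent decay rate constants ... calculated from past monitoring data" is as a log-linear fit to a concentration time series. The sketch below assumes simple exponential decay, C(t) = C0 * exp(-lambda * t), and ordinary least squares on ln C; the abstract does not state the authors' estimator, and the monitoring values are synthetic.

```python
import math


def apparent_decay_rate(days, concentrations):
    """Fit ln(C) = ln(C0) - lambda * t by least squares; return (lambda, C0)."""
    logs = [math.log(c) for c in concentrations]
    n = len(days)
    t_mean = sum(days) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in days)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(days, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)


def predict_concentration(c0, decay_rate, day):
    """Extrapolate the fitted curve to a future monitoring day."""
    return c0 * math.exp(-decay_rate * day)


# Synthetic series (Bq/kg) generated with a hypothetical 5-day apparent half-life.
TRUE_RATE = math.log(2) / 5.0
past_days = [0, 2, 4, 6, 8]
past_values = [200.0 * math.exp(-TRUE_RATE * t) for t in past_days]

rate, c0 = apparent_decay_rate(past_days, past_values)
print(f"apparent half-life: {math.log(2) / rate:.2f} days")
print(f"predicted level on day 12: {predict_concentration(c0, rate, 12):.1f} Bq/kg")
```

Refitting each time a new measurement arrives gives a sequentially updated rate constant, and the predicted levels for each food item could then inform how inspection samples are split between them.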
Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function.

PubMed

Fries, James F; Witter, James; Rose, Matthias; Cella, David; Khanna, Dinesh; Morgan-DeWitt, Esi

2014-01-01

Patient-reported outcome (PRO) questionnaires record health information directly from research participants because observers may not accurately represent the patient perspective. Patient-reported Outcomes Measurement Information System (PROMIS) is a US National Institutes of Health cooperative group charged with bringing PRO to a new level of precision and standardization across diseases by item development and use of item response theory (IRT). With IRT methods, improved items are calibrated on an underlying concept to form an item bank for a "domain" such as physical function (PF). The most informative items can be combined to construct efficient "instruments" such as 10-item or 20-item PF static forms. Each item is calibrated on the basis of the probability that a given person will respond at a given level, and the ability of the item to discriminate people from one another. Tailored forms may cover any desired level of the domain being measured. Computerized adaptive testing (CAT) selects the best items to sharpen the estimate of a person's functional ability, based on prior responses to earlier questions. PROMIS item banks have been improved with experience from several thousand items, and are calibrated on over 21,000 respondents. In areas tested to date, PROMIS PF instruments are superior or equal to Health Assessment Questionnaire and Medical Outcome Study Short Form-36 Survey legacy instruments in clarity, translatability, patient importance, reliability, and sensitivity to change. Precise measures, such as PROMIS, efficiently incorporate patient self-report of health into research, potentially reducing research cost by lowering sample size requirements. The advent of routine IRT applications has the potential to transform PRO measurement.
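The item-selection step of computerized adaptive testing described above can be sketched with a maximum-information rule. For brevity this sketch assumes a dichotomous two-parameter logistic (2PL) model, which is a simplification and not necessarily the model PROMIS uses; the item bank, item names and parameter values are hypothetical.

```python
import math


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item with discrimination a, difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability level theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Pick the not-yet-administered item that is most informative at theta."""
    candidates = [name for name in bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, *bank[name]))


# Hypothetical physical-function bank: name -> (discrimination, difficulty).
bank = {
    "walk_one_block": (1.0, -2.0),
    "climb_stairs": (1.0, 0.0),
    "run_five_miles": (1.0, 2.0),
}

current_estimate = 1.8  # provisional ability estimate from earlier responses
print(select_next_item(current_estimate, bank, administered=set()))
```

A full CAT would re-estimate theta after every response and stop once the standard error falls below a preset target; only the selection rule is shown here.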
Cross-cultural measurement invariance in the satisfaction with food-related life scale in older adults from two developing countries.

PubMed

Schnettler, Berta; Miranda-Zapata, Edgardo; Lobos, Germán; Lapo, María; Grunert, Klaus G; Adasme-Berríos, Cristian; Hueche, Clementina

2017-05-30

Nutrition is one of the major determinants of successful aging. The Satisfaction with Food-related Life (SWFL) scale measures a person's overall assessment regarding their food and eating habits. The SWFL scale has been used in older adult samples across different countries in Europe, Asia and America; however, there are no studies that have evaluated the cross-cultural measurement invariance of the scale in older adult samples. Therefore, we evaluated the measurement invariance of the SWFL scale across older adults from Chile and Ecuador. Stratified random sampling was used to recruit a sample of older adults of both genders from Chile (mean age = 71.38, SD = 6.48, range = 60-92) and from Ecuador (mean age = 73.70, SD = 7.45, range = 60-101). Participants reported their levels of satisfaction with food-related life by completing the SWFL scale, which consists of five items grouped into a single dimension. Confirmatory factor analysis (CFA) was used to examine cross-cultural measurement invariance of the SWFL scale. Results showed that the SWFL scale exhibited partial measurement invariance, with invariance of all factor loadings, invariance in all but one item's threshold (item 1) and invariance in all items' uniqueness (residuals), which leads us to conclude that there is a reasonable level of partial measurement invariance for the CFA model of the SWFL scale, when comparing the Chilean and Ecuadorian older adult samples. The lack of invariance in item 1 confirms previous studies with adults and emerging adults in Chile that suggest this item is culture-sensitive. We recommend revising the wording of the first item of the SWFL in order to relate the statement with the person's life. The SWFL scale shows partial measurement invariance across older adults from Chile and Ecuador. A 4-item version of the scale (excluding item 1) provides the basis for international comparisons of satisfaction with food-related life in older adults from developing countries in South America.
Rasch analysis of the participation scale (P-scale): usefulness of the P-scale to a rehabilitation services network.

PubMed

Souza, Mariana Angélica Peixoto; Coster, Wendy Jane; Mancini, Marisa Cotta; Dutra, Fabiana Caetano Martins Silva; Kramer, Jessica; Sampaio, Rosana Ferreira

2017-12-08

A person's participation is acknowledged as an important outcome of the rehabilitation process. The Participation Scale (P-Scale) is an instrument that was designed to assess the participation of individuals with a health condition or disability. The scale was developed in an effort to better describe the participation of people living in middle-income and low-income countries. The aim of this study was to use Rasch analysis to examine whether the Participation Scale is suitable to assess the perceived ability to take part in participation situations by patients with diverse levels of function. The sample comprised 302 patients from a public rehabilitation services network. Participants had orthopaedic or neurological health conditions, were at least 18 years old, and completed the Participation Scale. Rasch analysis was conducted using the Winsteps software. The mean age of all participants was 45.5 years (standard deviation = 14.4), 52% were male, 86% had orthopaedic conditions, and 52% had chronic symptoms. Rasch analysis was performed using a dichotomous rating scale, and only one item showed misfit. Dimensionality analysis supported the existence of only one Rasch dimension. The person separation index was 1.51, and the item separation index was 6.38. Items N2 and N14 showed Differential Item Functioning between men and women. Items N6 and N12 showed Differential Item Functioning between acute and chronic conditions. The item difficulty range was -1.78 to 2.09 logits, while the sample ability range was -2.41 to 4.61 logits. The P-Scale was found to be useful as a screening tool for participation problems reported by patients in a rehabilitation context, despite some issues that should be addressed to further improve the scale.
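The separation indices reported above are easier to interpret once converted to the reliability scale. The sketch below uses the standard Rasch relationship R = G^2 / (1 + G^2), where G is the separation index; the function name is illustrative, and the abstract itself does not report these reliabilities.

```python
def separation_to_reliability(separation):
    """Convert a Rasch separation index G to reliability R = G^2 / (1 + G^2)."""
    g_squared = separation * separation
    return g_squared / (1.0 + g_squared)


person_reliability = separation_to_reliability(1.51)  # person separation index
item_reliability = separation_to_reliability(6.38)    # item separation index

print(f"person reliability: {person_reliability:.2f}")  # 0.70
print(f"item reliability: {item_reliability:.2f}")      # 0.98
```

On this reading the item hierarchy is very stable, while the person-level figure of about 0.70 is modest, which fits the abstract's framing of the P-Scale as a screening tool.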
The Modified Checklist for Autism in Toddlers in extremely low gestational age newborns: individual items associated with motor, cognitive, vision and hearing limitations.

PubMed

Luyster, Rhiannon J; Kuban, Karl C K; O'Shea, T Michael; Paneth, Nigel; Allred, Elizabeth N; Leviton, Alan

2011-07-01

The Modified Checklist for Autism in Toddlers (M-CHAT) has yielded elevated rates of screening failure for children born preterm or with low birthweight. We extended these findings with a detailed examination of M-CHAT items in a large sample of children born at extremely low gestational age. The sample was grouped according to children's current limitations and degree of impairment. The aim was to better understand how disabilities might influence M-CHAT scores. Fourteen participating institutions of the Extremely Low Gestational Age Newborns (ELGAN) Study prospectively collected information about 1086 infants who were born before the 28th week of gestation and had an assessment at age 24-months. The 24-month visit included a neurological assessment, the Bayley Scales of Infant Development, Second edition (BSID-II), M-CHAT and a medical history form. Outcome measures included the distribution of failed M-CHAT items among groups classified according to cerebral palsy diagnosis, gross motor function, BSID-II scores and vision or hearing impairments. M-CHAT items were failed more frequently by children with concurrently identified impairments (motor, cognitive, vision and hearing). In addition, the frequency of item failure increased with the severity of impairment. The failed M-CHAT items were often, but not consistently, related to children's specific impairments. Importantly, four of the six M-CHAT 'critical items' were commonly affected by presence and severity of concurrent impairments. The strong association between impaired sensory or motor function and M-CHAT results among extremely low gestational age children suggests that such impairments might give rise to false positive M-CHAT screening. © 2011 Blackwell Publishing Ltd.
Visual search by chimpanzees (Pan): assessment of controlling relations.

PubMed Central

Tomonaga, M

1995-01-01

Three experimentally sophisticated chimpanzees (Pan), Akira, Chloe, and Ai, were trained on visual search performance using a modified multiple-alternative matching-to-sample task in which a sample stimulus was followed by the search display containing one target identical to the sample and several uniform distractors (i.e., negative comparison stimuli were identical to each other). After they acquired this task, they were tested for transfer of visual search performance to trials in which the sample was not followed by the uniform search display (odd-item search). Akira showed positive transfer of visual search performance to odd-item search even when the display size (the number of stimulus items in the search display) was small, whereas Chloe and Ai showed a transfer only when the display size was large. Chloe and Ai used some nonrelational cues such as perceptual isolation of the target among uniform distractors (so-called pop-out). In addition to the odd-item search test, various types of probe trials were presented to clarify the controlling relations in multiple-alternative matching to sample. Akira showed a decrement of accuracy as a function of the display size when the search display was nonuniform (i.e., each "distractor" stimulus was not the same), whereas Chloe and Ai showed perfect performance. Furthermore, when the sample was identical to the uniform distractors in the search display, Chloe and Ai never selected an odd-item target, but Akira selected it when the display size was large. These results indicated that Akira's behavior was controlled mainly by relational cues of target-distractor oddity, whereas an identity relation between the sample and the target strongly controlled the performance of Chloe and Ai. PMID:7714449
A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

PubMed

Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

2018-04-10

To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.5, 4 residual correlations >.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.
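The short form's precision threshold (SE ≤ .32) can be translated into an approximate reliability. The sketch assumes the latent trait is scaled to unit variance, so that reliability is roughly 1 - SE^2; the abstract does not state this scaling, so the result is illustrative only.

```python
def se_to_reliability(standard_error, trait_variance=1.0):
    """Approximate reliability implied by a standard error of measurement.

    Assumes the latent trait (theta) has variance `trait_variance`;
    with unit variance this reduces to 1 - SE^2.
    """
    return 1.0 - (standard_error ** 2) / trait_variance


print(f"{se_to_reliability(0.32):.2f}")  # 0.90
```

Under that assumption, SE ≤ .32 corresponds to a reliability of about .90 across the stated theta range of -1.8 to +1.5.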
Comprehensive clinical assessment in community setting: applicability of the MDS-HC.

PubMed

Morris, J N; Fries, B E; Steel, K; Ikegami, N; Bernabei, R; Carpenter, G I; Gilgen, R; Hirdes, J P; Topinková, E

1997-08-01

To describe the results of an international trial of the home care version of the MDS assessment and problem identification system (the MDS-HC), including reliability estimates, a comparison of MDS-HC reliabilities with reliabilities of the same items in the MDS 2.0 nursing home assessment instrument, and an examination of the types of problems found in home care clients using the MDS-HC. Independent, dual assessment of clients of home-care agencies by trained clinicians using a draft of the MDS-HC, with additional descriptive data regarding problem profiles for home care clients. Reliability data from dual assessments of 241 randomly selected clients of home care agencies in five countries, all of whom volunteered to test the MDS-HC. Also included are an expanded sample of 780 home care assessments from these countries and 187 dually assessed residents from 21 nursing homes in the United States. The array of MDS-HC assessment items included measures in the following areas: personal items, cognitive patterns, communication/hearing, vision, mood and behavior, social functioning, informal support services, physical functioning, continence, disease diagnoses, health conditions and preventive health measures, nutrition/hydration, dental status, skin condition, environmental assessment, service utilization, and medications. Forty-seven percent of the functional, health status, social environment, and service items in the MDS-HC were taken from the MDS 2.0 for nursing homes. For this item set, it is estimated that the average weighted Kappa is .74 for the MDS-HC and .75 for the MDS 2.0. Similarly, high reliability values were found for items newly introduced in the MDS-HC (weighted Kappa = .70). Descriptive findings also characterize the problems of home care clients, with subanalyses within cognitive performance levels. Findings indicate that the core set of items in the MDS 2.0 works equally well in community and nursing home settings. New items are highly reliable. In tandem, these instruments can be used within the international community, assisting and planning care for older adults within a broad spectrum of service settings, including nursing homes and home care programs. With this community-based, second-generation problem and care plan-driven assessment instrument, disability assessment can be performed consistently across the world.
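The weighted kappa statistic used above for the dual assessments can be sketched as follows. The abstract does not say which weighting scheme was used, so both linear and quadratic disagreement weights are offered as assumptions; the ratings shown are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories, quadratic=True):
    """Cohen's weighted kappa for two raters scoring the same clients.

    Ratings are integer category codes 0..n_categories-1 on an ordinal scale.
    Raises ZeroDivisionError if the marginals allow no expected disagreement.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of clients")
    k, n = n_categories, len(ratings_a)

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        distance = abs(i - j) / (k - 1)
        return distance * distance if quadratic else distance

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_dis = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    expected_dis = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    return 1.0 - observed_dis / expected_dis


# Hypothetical dual assessments of 8 clients on a 3-level item (0, 1, 2).
assessor_1 = [0, 0, 1, 1, 2, 2, 1, 0]
assessor_2 = [0, 1, 1, 1, 2, 1, 1, 0]
print(round(weighted_kappa(assessor_1, assessor_2, n_categories=3), 3))
```

Averaging this statistic over items, as the abstract does, yields a single summary figure per instrument.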
Development of a Psychosocial Risk Screener for Siblings of Children With Cancer: Incorporating the Perspectives of Parents.

PubMed

Long, Kristin A; Pariseau, Emily M; Muriel, Anna C; Chu, Andrea; Kazak, Anne E; Alderfer, Melissa A

2018-04-03

Although many siblings experience distress after a child's cancer diagnosis, their psychosocial functioning is seldom assessed in clinical oncology settings. One barrier to systematic sibling screening is the lack of a validated, sibling-specific screening instrument. Thus, this study developed sibling-specific screening modules in English and Spanish for the Psychosocial Assessment Tool (PAT), a well-validated screener of family psychosocial risk. A purposive sample of English- and Spanish-speaking parents of children with cancer (N = 29) completed cognitive interviews to provide in-depth feedback on the development of the new PAT sibling modules. Interviews were transcribed verbatim, cleaned, and analyzed using applied thematic analysis. Items were updated iteratively according to participants' feedback. Data collection continued until saturation was reached (i.e., all items were clear and valid). Two sibling modules were developed to assess siblings' psychosocial risk at diagnosis (preexisting risk factors) and several months thereafter (reactions to cancer). Most prior PAT items were retained; however, parents recommended changes to improve screening format (separately assessing each sibling within the family and expanding response options to include "sometimes"), developmental sensitivity (developing or revising items for ages 0-2, 3-4, 5-9, and 10+ years), and content (adding items related to sibling-specific social support, global assessments of sibling risk, emotional/behavioral reactions to cancer, and social ecological factors such as family and school). Psychosocial screening requires sibling-specific screening items that correspond to preexisting risk (at diagnosis) and reactions to cancer (several months after diagnosis). Validated, sibling-specific screeners will facilitate identification of siblings with elevated psychosocial risk.
A Psychometric Study of the Fear of Sleep Inventory-Short Form (FoSI-SF)

PubMed Central

Pruiksma, Kristi E.; Taylor, Daniel J.; Ruggero, Camilo; Boals, Adriel; Davis, Joanne L.; Cranston, Christopher; DeViva, Jason C.; Zayfert, Claudia

2014-01-01

Study Objectives: Fear of sleep may play a significant role in sleep disturbances in individuals with posttraumatic stress disorder (PTSD). This report describes a psychometric study of the Fear of Sleep Inventory (FoSI), which was developed to measure this construct. Methods: The psychometric properties of the FoSI were examined in a non-clinical sample of 292 college students (Study I) and in a clinical sample of 67 trauma-exposed adults experiencing chronic nightmares (Study II). Data on the 23 items of the FoSI were subjected to exploratory factor analyses (EFA) to identify items uniquely assessing fear of sleep. Next, reliability and validity of a 13-item version of the FoSI was examined in both samples. Results: A 13-item Short-Form version (FoSI-SF) was identified as having a clear 2-factor structure with high internal consistency in both the non-clinical (α = 0.76-0.94) and clinical (α = 0.88-0.91) samples. Both studies demonstrated good convergent validity with measures of PTSD (0.48-0.61) and insomnia (0.39-0.48) and discriminant validity with a measure of sleep hygiene (0.19-0.27). The total score on the FoSI-SF was significantly higher in the clinical sample (mean = 17.90, SD = 12.56) than in the non-clinical sample (mean = 4.80, SD = 7.72); t(357) = 8.85, p < 0.001. Conclusions: Although all items are recommended for clinical purposes, the data support the use of the 13-item FoSI-SF for research purposes. Replication of the factor structure in clinical samples is needed. Results are discussed in terms of limitations of this study and directions for further research. Citation: Pruiksma KE, Taylor DJ, Ruggero C, Boals A, Davis JL, Cranston C, DeViva JC, Zayfert C. A psychometric study of the Fear of Sleep Inventory-short form (FoSI-SF). J Clin Sleep Med 2014;10(5):551-558. PMID:24812541
Drive: Theory and Construct Validation

PubMed Central

Petrides, K. V.

2016-01-01

This article explicates the theory of drive and describes the development and validation of two measures. A representative set of drive facets was derived from an extensive corpus of human attributes (Study 1). Operationalised using an International Personality Item Pool version (the Drive:IPIP), a three-factor model was extracted from the facets in two samples and confirmed on a third sample (Study 2). The multi-item IPIP measure showed congruence with a short form, based on single-item ratings of the facets, and both demonstrated cross-informant reliability. Evidence also supported the measures’ convergent, discriminant, concurrent, and incremental validity (Study 3). Based on very promising findings, the authors hope to initiate a stream of research in what is argued to be a rather neglected niche of individual differences and non-cognitive assessment. PMID:27409773
Development and Validation of a Computerized-Adaptive Test for PTSD (P-CAT).

PubMed

Eisen, Susan V; Schultz, Mark R; Ni, Pengsheng; Haley, Stephen M; Smith, Eric G; Spiro, Avron; Osei-Bonsu, Princess E; Nordberg, Sam; Jette, Alan M

2016-10-01

The primary purpose was to develop, field test, and validate a computerized-adaptive test (CAT) for posttraumatic stress disorder (PTSD) to enhance PTSD assessment and decrease the burden of symptom monitoring. Data sources included self-report and interviewer-administered diagnostic interviews. The sample included 1,288 veterans. In phase 1, 89 items from a previously developed PTSD item pool were administered to a national sample of 1,085 veterans. A multidimensional graded-response item response theory model was used to calibrate items for incorporation into a CAT for PTSD (P-CAT). In phase 2, in a separate sample of 203 veterans, the P-CAT was validated against three other self-report measures (PTSD Checklist, Civilian Version; Mississippi Scale for Combat-Related PTSD; and Primary Care PTSD Screen) and the PTSD module of the Structured Clinical Interview for DSM-IV. A bifactor model with one general PTSD factor and four subfactors consistent with DSM-5 (reexperiencing, avoidance, negative mood-cognitions, and arousal), yielded good fit. The P-CAT discriminated veterans with PTSD from those with other mental health conditions and those with no mental health conditions (Cohen's d effect sizes >.90). The P-CAT also discriminated those with and without a PTSD diagnosis and those who screened positive versus negative for PTSD. Concurrent validity was supported by high correlations (r=.85-.89) with the validation measures. The P-CAT appears to be a promising tool for efficient and accurate assessment of PTSD symptomatology. Further testing is needed to evaluate its responsiveness to change. With increasing availability of computers and other technologies, CAT may be a viable and efficient assessment method.
The Curiosity and Exploration Inventory-II: Development, Factor Structure, and Psychometrics

PubMed Central

Kashdan, Todd B.; Gallagher, Matthew W.; Silvia, Paul J.; Winterstein, Beate P.; Breen, William E.; Terhar, Daniel; Steger, Michael F.

2009-01-01

Given curiosity’s fundamental role in motivation, learning, and well-being, we sought to refine the measurement of trait curiosity with an improved version of the Curiosity and Exploration Inventory (CEI; Kashdan, Rose, & Fincham, 2004). A preliminary pool of 36 items was administered to 311 undergraduate students, who also completed measures of emotion, emotion regulation, personality, and well-being. Factor analyses indicated a two factor model—motivation to seek out knowledge and new experiences (Stretching; 5 items) and a willingness to embrace the novel, uncertain, and unpredictable nature of everyday life (Embracing; 5 items). In two additional samples (ns = 150 and 119), we cross-validated this factor structure and provided initial evidence for construct validity. This includes positive correlations with personal growth, openness to experience, autonomy, purpose in life, self-acceptance, psychological flexibility, positive affect, and positive social relations, among others. Applying item response theory (IRT) to these samples (n = 578), we showed that the items have good discrimination and a desirable breadth of difficulty. The item information functions and test information function were centered near zero, indicating that the scale assesses the mid-range of the latent curiosity trait most reliably. The findings thus far provide good evidence for the psychometric properties of the 10-item CEI-II. PMID:20160913
Health and role functioning: the use of focus groups in the development of an item bank.

PubMed

Anatchkova, Milena D; Bjorner, Jakob B

2010-02-01

Role functioning is an important part of health-related quality of life. However, assessment of role functioning is complicated by the wide definition of roles and by fluctuations in role participation across the life-span. The aim of this study is to explore variations in role functioning across the lifespan using qualitative approaches, to inform the development of a role functioning item bank and to pilot test sample items from the bank. Eight focus groups were conducted with a convenience sample of 38 English-speaking adults recruited in Rhode Island. Participants were stratified by gender and four age groups. Focus groups were taped, transcribed, and analyzed for thematic content. Participants of all ages identified family roles as the most important. There was age variation in the importance of social life roles, with younger and older adults rating them as more important. Occupational roles were identified as important by younger and middle-aged participants. The potential of health problems to affect role participation was recognized. Participants found the sample items easy to understand, response options identical in meaning and preferred five response choices. Participants identified key aspects of role functioning and provided insights on their perception of the impact of health on their role participation. These results will inform item bank generation.
Exploring the Validity of the Affect Balance Scale With a Sample of Family Caregivers

PubMed Central

Perkinson, Margaret A.; Albert, Steven M.; Luborsky, Mark; Moss, Miriam; Glicksman, Allen

2014-01-01

Open-ended responses of caregiving daughters and daughters-in-law were generated by a modified random probe technique to investigate the construct validity of the two subscales of the Affect Balance Scale (ABS), i.e., the 5-item Positive Affect Scale (PAS) and the 5-item Negative Affect Scale (NAS). A set of criteria were developed to distinguish between responses that did and did not correspond to Bradburn’s assumptions concerning affect. While most responses met at least one of the criteria, very few met all. In exploring the nature of affect, we found that positive affect was based to a large extent on personal accomplishments and the recognition of others. The assessment of negative affect was a more interior, or self-focused process. For a significant subset of the sample, a negative response to a closed-ended PAS or NAS item implied disagreement or discontent with the wording or the implications of the item itself, rather than an absence of affect. Not all of the ABS items were equally valid measures of affect. PMID:8056955

Validation of the Expanded Versions of the Adult ADHD Self-Report Scale v1.1 Symptom Checklist and the Adult ADHD Investigator Symptom Rating Scale.

PubMed

Silverstein, Michael J; Faraone, Stephen V; Alperin, Samuel; Leon, Terry L; Biederman, Joseph; Spencer, Thomas J; Adler, Lenard A

2018-02-01

The aim of this study is to validate the Adult ADHD Self-Report Scale (ASRS) and Adult ADHD Investigator Symptom Rating Scale (AISRS) expanded versions, including executive function deficits (EFDs) and emotional dyscontrol (EC) items, and to present ASRS and AISRS pilot normative data. Two patient samples (referred and primary care physician [PCP] controls) were pooled together for these analyses. Final analysis included 297 respondents, 171 with adult ADHD. Cronbach's alphas were high for all sections of the scales. Examining histograms of ASRS 31-item and AISRS 18-item total scores for ADHD controls, 95% cutoff scores were 70 and 23, respectively; histograms for the pilot normative sample suggest cutoffs of 82 and 26, respectively. (a) ASRS- and AISRS-expanded versions have high validity in assessment of core 18 adult ADHD Diagnostic and Statistical Manual of Mental Disorders (DSM) symptoms and EFD and EC symptoms. (b) ASRS (31-item) scores from 70 to 82 and AISRS (18-item) scores from 23 to 26 suggest a high likelihood of adult ADHD.
A psychometric evaluation of the Hospital Anxiety and Depression Scale for the medically hospitalized elderly.

PubMed

Helvik, Anne-Sofie; Engedal, Knut; Skancke, Randi H; Selbæk, Geir

2011-10-01

Few psychometric studies of the Hospital Anxiety and Depression Scale (HADS) have been performed with clinical samples of elderly individuals. The participants were 484 elderly (65-101 years, 241 men) patients in an acute medical unit. The HADS, the Montgomery-Aasberg Depression Rating Scale (MADRS) and questionnaires assessing quality of life, functional impairment, and cognitive function were used. The psychometric evaluation of the HADS included the following analyses: 1) the internal construct validity by means of principal component analysis followed by an oblique rotation and corrected item-total correlation; 2) the internal consistency reliability by means of the alpha coefficient (Cronbach's) and 3) concurrent validity by means of Spearman's rho. We found a two-factor solution explaining 45% of the variance. Six of seven items loaded adequately (≥0.40) on the HADS-A subscale (item 7 did not) and five of seven items loaded adequately on the HADS-D subscale (items 8 and 10 did not). Cronbach's alphas for the HADS-A and HADS-D subscales were 0.78 and 0.71, respectively. The correlation between HADS-D and the MADRS, a measure of the concurrent validity, was 0.51. The HADS appears to differentiate well between depression and anxiety. The internal consistency of the HADS in a sample of elderly persons was as satisfactory as it is in samples with younger persons. In contrast to younger samples, item 8 ("I feel as if I have slowed down") did not load adequately on the HADS-D subscale. This may be attributed to the way elderly people experience and describe their symptoms.
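Internal consistency in this and several other records is summarized with Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), follows; the function name and the response matrix are hypothetical and are not HADS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses (4 respondents x 3 items), invented for illustration.
responses = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5]]
alpha = cronbach_alpha(responses)
```

Alpha rises when items covary strongly relative to their individual variances, so a seven-item subscale with values in the 0.7 to 0.8 range, as reported here, indicates moderately consistent items.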
A decision-tree approach to the assessment of posttraumatic stress disorder: Engineering empirically rigorous and ecologically valid assessment measures.

PubMed

Stewart, Regan W; Tuerk, Peter W; Metzger, Isha W; Davidson, Tatiana M; Young, John

2016-02-01

Structured diagnostic interviews are widely considered to be the optimal method of assessing symptoms of posttraumatic stress; however, few clinicians report using structured assessments to guide clinical practice. One commonly cited impediment to these assessment approaches is the amount of time required for test administration and interpretation. Empirically keyed methods to reduce the administration time of structured assessments may be a viable solution to increase the use of standardized and reliable diagnostic tools. Thus, the present research conducted an initial feasibility study using a sample of treatment-seeking military veterans (N = 1,517) to develop a truncated assessment protocol based on the Clinician-Administered Posttraumatic Stress Disorder (PTSD) Scale (CAPS). Decision-tree analysis was utilized to identify a subset of predictor variables among the CAPS items that were most predictive of a diagnosis of PTSD. The algorithm-driven, atheoretical sequence of questions reduced the number of items administered by more than 75% and classified the validation sample at 92% accuracy. These results demonstrated the feasibility of developing a protocol to assess PTSD in a way that imposes little assessment burden while still providing a reliable categorization. (c) 2016 APA, all rights reserved.
The feeding practices and structure questionnaire: construction and initial validation in a sample of Australian first-time mothers and their 2-year olds.

PubMed

Jansen, Elena; Mallan, Kimberley M; Nicholson, Jan M; Daniels, Lynne A

2014-06-04

Early feeding practices lay the foundation for children's eating habits and weight gain. Questionnaires are available to assess parental feeding but overlapping and inconsistent items, subscales and terminology limit conceptual clarity and between study comparisons. Our aim was to consolidate a range of existing items into a parsimonious and conceptually robust questionnaire for assessing feeding practices with very young children (<3 years). Data were from 462 mothers and children (age 21-27 months) from the NOURISH trial. Items from five questionnaires and two study-specific items were submitted to a priori item selection, allocation and verification, before theoretically-derived factors were tested using Confirmatory Factor Analysis. Construct validity of the new factors was examined by correlating these with child eating behaviours and weight. Following expert review 10 factors were specified. Of these, 9 factors (40 items) showed acceptable model fit and internal reliability (Cronbach's α: 0.61-0.89). Four factors reflected non-responsive feeding practices: 'Distrust in Appetite', 'Reward for Behaviour', 'Reward for Eating', and 'Persuasive Feeding'. Five factors reflected structure of the meal environment and limits: 'Structured Meal Setting', 'Structured Meal Timing', 'Family Meal Setting', 'Overt Restriction' and 'Covert Restriction'. Feeding practices generally showed the expected pattern of associations with child eating behaviours but none with weight. The Feeding Practices and Structure Questionnaire (FPSQ) provides a new reliable and valid measure of parental feeding practices, specifically maternal responsiveness to children's hunger/satiety signals facilitated by routine and structure in feeding. Further validation in more diverse samples is required.
Measurement invariance across Genders on the Childhood Illness Attitude Scales (CIAS).

PubMed

Thorisdottir, Audur S; Villadsen, Anna; LeBouthillier, Daniel M; Rask, Charlotte Ulrikka; Wright, Kristi D; Walker, John R; Feldgaier, Steven; Asmundson, Gordon J G

2017-07-01

The Childhood Illness Attitude Scales (CIAS) were created as a developmentally appropriate measure for symptoms of health anxiety (HA) in school-aged children. Despite overall sound psychometric properties reported in previous studies, more comprehensive examination of the latent structure and potential response bias in the CIAS is needed. The purpose of the present study was to cross-validate the latent structure of the CIAS across genders and to examine gender-specific variations in CIAS scores. The sample comprised data from 602 Canadian and Danish school-aged children (mean age=10.54, SD=0.99; 52.5% girls). Confirmatory factor analyses were conducted to test 3-, modified 3-, and 4-factor models in both samples. Multigroup confirmatory factor analysis was performed to test factor structure invariance across boys and girls in a combined sample. Differential Item Functioning (DIF) was assessed using test characteristic curves. A modified 3-factor solution (i.e., fears=11 items, help-seeking=6 items, and symptom effects=4 items) provided the best fit to the data (χ²(364, N=602)=681.7, p<0.001; χ²/df=1.803; RMSEA=0.037; CFI=0.926). The factor structure was stable, well-fitting, and indicated measurement invariance across groups. DIF analyses revealed no gender-based response bias at the scale level. Results support a revised 3-factor version of the CIAS that can be used with confidence to assess symptoms of HA in school-aged boys and girls. Copyright © 2017 Elsevier Inc. All rights reserved.
Psychometric assessment of scales measuring HIV public stigma, drug-use public stigma and fear of HIV infection among young adolescents and their parents

PubMed Central

Ha, Toan; Liu, Hongjie; Li, Jian; Nield, Jennifer; Lu, Zhouping

2011-01-01

The objective of this study was to design and assess measurement instruments that accurately measure the levels of stigma among individuals with a primarily collectivist culture. A cross-sectional study was conducted among middle school students and their parents or guardians in a rural area of China. Exploratory and confirmatory factor analyses were used to examine and determine the latent factors of the sub-scales of stigma respectively among students and their parents. Factor analyses identified three sub-scales: HIV public stigma (7 items), drug-use public stigma (9 items), and fear of HIV infection (7 items). There were no items with cross-loading onto multiple factors, supporting the distinctness of the constructs that these scales were meant to measure. Goodness of fit indices indicated that a three-factor solution fit the data at an acceptable level in the student sample (χ²/df ratio = 1.98, CFI = 0.92, RMSEA = 0.055, SRMR = 0.057) and in the parent sample (χ²/df ratio = 1.95, CFI = 0.91, RMSEA = 0.06, SRMR = 0.059). Reliability of the three scales was excellent (Cronbach’s alpha: 0.78–0.92 for students; 0.80–0.94 for parents or guardians) and stable across split samples and for the data as a whole. The scales are brief and suitable for use in developing countries where the collectivist culture prevails. PMID:21756072
How item banks and their application can influence measurement practice in rehabilitation medicine: a PROMIS fatigue item bank example.

PubMed

Lai, Jin-Shei; Cella, David; Choi, Seung; Junghaenel, Doerte U; Christodoulou, Christopher; Gershon, Richard; Stone, Arthur

2011-10-01

Objective: To illustrate how measurement practices can be advanced by using as an example the fatigue item bank (FIB) and its applications (short forms and computerized adaptive testing [CAT]) that were developed through the National Institutes of Health Patient Reported Outcomes Measurement Information System (PROMIS) Cooperative Group. Design: Psychometric analysis of data collected by an Internet survey company using item response theory-related techniques. Setting: A U.S. general population representative sample collected through the Internet. Participants: Respondents used for dimensionality evaluation of the PROMIS FIB (N=603) and item calibrations (N=14,931). Interventions: Not applicable. Main outcome measures: Fatigue items (112) developed by the PROMIS fatigue domain working group, 13-item Functional Assessment of Chronic Illness Therapy-Fatigue, and 4-item Medical Outcomes Study 36-Item Short Form Health Survey Vitality scale. Results: The PROMIS FIB version 1, which consists of 95 items, showed acceptable psychometric properties. CAT showed consistently better precision than short forms. However, all 3 short forms showed good precision for most participants in that more than 95% of the sample could be measured precisely with reliability greater than 0.9. Conclusions: Measurement practice can be advanced by using a psychometrically sound measurement tool and its applications. This example shows that CAT and short forms derived from the PROMIS FIB can reliably estimate fatigue reported by the U.S. general population. Evaluation in clinical populations is warranted before the item bank can be used for clinical trials. Copyright © 2011 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Developing an item bank and short forms that assess the impact of asthma on quality of life.

PubMed

Stucky, Brian D; Edelen, Maria Orlando; Sherbourne, Cathy D; Eberhart, Nicole K; Lara, Marielena

2014-02-01

The present work describes the process of developing an item bank and short forms that measure the impact of asthma on quality of life (QoL) that avoids confounding QoL with asthma symptomatology and functional impairment. Using a diverse national sample of adults with asthma (N = 2032) we conducted exploratory and confirmatory factor analyses, and item response theory and differential item functioning analyses to develop a 65-item unidimensional item bank and separate short form assessments. A psychometric evaluation of the RAND Impact of Asthma on QoL item bank (RAND-IAQL) suggests that though the concept of asthma impact on QoL is multi-faceted, it may be measured as a single underlying construct. The performance of the bank was then evaluated with a real-data simulated computer adaptive test. From the RAND-IAQL item bank we then developed two short forms consisting of 4 and 12 items (reliability = 0.86 and 0.93, respectively). A real-data simulated computer adaptive test suggests that as few as 4-5 items from the bank are needed to obtain highly precise scores. Preliminary validity results indicate that the RAND-IAQL measures distinguish between levels of asthma control. To measure the impact of asthma on QoL, users of these items may choose from two highly reliable short forms, computer adaptive test administration, or content-specific subsets of items from the bank tailored to their specific needs. Copyright © 2013 Elsevier Ltd. All rights reserved.
Measuring more than we know? An examination of the motivational and situational influences in science achievement

NASA Astrophysics Data System (ADS)

Haydel, Angela Michelle

The purpose of this dissertation was to advance theoretical understanding about fit between the personal resources of individuals and the characteristics of science achievement tasks. Testing continues to be pervasive in schools, yet we know little about how students perceive tests and what they think and feel while they are actually working on test items. This study focused on both the personal (cognitive and motivational) and situational factors that may contribute to individual differences in achievement-related outcomes. 387 eighth grade students first completed a survey including measures of science achievement goals, capability beliefs, efficacy related to multiple-choice items and performance assessments, validity beliefs about multiple-choice items and performance assessments, and other perceptions of these item formats. Students then completed science achievement tests including multiple-choice items and two performance assessments. A sample of students was asked to verbalize both thoughts and feelings as they worked through the test items. These think-alouds were transcribed and coded for evidence of cognitive, metacognitive and motivational engagement. Following each test, all students completed measures of effort, mood, energy level and strategy use during testing. Students reported that performance assessments were more challenging, authentic, interesting and valid than multiple-choice tests. They also believed that comparisons between students were easier using multiple-choice items. Overall, students tried harder, felt better, had higher levels of energy and used more strategies while working on performance assessments. Findings suggested that performance assessments might be more congruent with a mastery achievement goal orientation, while multiple-choice tests might be more congruent with a performance achievement goal orientation. 
A variable-centered analytic approach including regression analyses provided information about how students, on average, who differed in terms of their teachers' ratings of their science ability, achievement goals, capability beliefs and experiences with science achievement tasks perceived, engaged in, and performed on multiple-choice items and performance assessments. Person-centered analyses provided information about the perceptions, engagement and performance of subgroups of individuals who had different motivational characteristics. Generally, students' personal goals and capability beliefs related more strongly to test perceptions, but not performance, while teacher ratings of ability and test-specific beliefs related to performance.
Developing a short version of the Toronto Structured Interview for Alexithymia using item response theory.

PubMed

Sekely, Angela; Taylor, Graeme J; Bagby, R Michael

2018-03-17

The Toronto Structured Interview for Alexithymia (TSIA) was developed to provide a structured interview method for assessing alexithymia. One drawback of this instrument is the amount of time it takes to administer and score. The current study used item response theory (IRT) methods to analyze data from a large heterogeneous multi-language sample (N = 842) to investigate whether a subset of items could be selected to create a short version of the instrument. Samejima's (1969) graded response model was used to fit the item responses. Items providing maximum information were retained in the short model, resulting in the elimination of 12 items from the original 24 items. Despite the 50% reduction in the number of items, 65.22% of the information was retained. Further studies are needed to validate the short version. A short version of the TSIA is potentially of practical value to clinicians and researchers with time constraints. Copyright © 2018. Published by Elsevier B.V.
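For readers unfamiliar with the graded response model named above, the following is a minimal sketch of how it assigns probabilities to ordered response categories: each threshold has a logistic "this category or higher" curve, and category probabilities are differences between adjacent curves. The function name and all parameter values are hypothetical and were not estimated from TSIA data.

```python
from math import exp


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    thresholds must be in increasing order; K thresholds give K + 1 categories.
    """
    # Cumulative probabilities of responding in category k or higher.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical parameters for a three-category item at trait level theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 1.0])
```

Items with larger discrimination values (a) have steeper curves and contribute more information, which is the basis on which a short form can keep a disproportionate share of the information after dropping half of the items.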
Development and psychometric evaluation of the Personal Growth Initiative Scale-II.

PubMed

Robitschek, Christine; Ashton, Matthew W; Spering, Cynthia C; Geiger, Nathaniel; Byers, Danielle; Schotts, G Christian; Thoen, Megan A

2012-04-01

The original Personal Growth Initiative Scale (PGIS; Robitschek, 1998) was unidimensional, despite theory identifying multiple components (e.g., cognition and behavior) of personal growth initiative (PGI). The present research developed a multidimensional measure of the complex process of PGI, while retaining the brief and psychometrically sound properties of the original scale. Study 1 focused on scale development, including theoretical derivation of items, assessing factor structure, reducing number of items, and refining the scale length using samples of college students. Study 2 consisted of confirmatory factor analysis with 3 independent samples of college students and community members. Lastly, Study 3 assessed test-retest reliability over 1-, 2-, 4-, and 6-week periods and tests of concurrent and discriminant validity using samples of college students. The final measure, the Personal Growth Initiative Scale-II (PGIS-II), includes 4 subscales: Readiness for Change, Planfulness, Using Resources, and Intentional Behavior. These studies provide exploratory and confirmatory evidence for the 4-factor structure, strong internal consistency for the subscales and overall score across samples, acceptable temporal stability at all assessed intervals, and concurrent and discriminant validity of the PGIS-II. Future directions for research and clinical practice are discussed.
Psychometric evaluation of a coping questionnaire in two independent samples of people with diabetes.

PubMed

Persson, Lars-Olof; Erichsen, Magdalena; Wändell, Per; Gåfvels, Catharina

2013-10-01

The study examines internal item/scale structure and concurrent validity of a newly developed 48-item questionnaire [General Coping Questionnaire (GCQ)] that measures 10 aspects of coping with chronic illness (self-trust, problem-reducing actions, change of values, social trust, minimization, fatalism, resignation, protest, isolation and intrusion). The tests were performed in two independent samples of persons with diabetes mellitus. The first sample consisted of 119 subjects with type I diabetes and the second sample of 184 subjects with type II diabetes. Concurrent validity was examined by comparisons with measures of health-related quality of life (SF-36), a measure of metabolic control (HbA1c) and incidence of diabetic complications. The item/scale structure was found to be similar and very good in both samples. The 10 dimensions correlated as expected with the measure of mental health, although the 'negative' dimensions of the GCQ correlated higher compared with the 'positive' dimensions. Weaker relations with metabolic control were also found in one of the samples. These tests provide further evidence that GCQ is a well-structured, relevant and reliable instrument for assessing coping reactions in chronic somatic conditions. Copyright © 2012 John Wiley & Sons, Ltd.
The Stigma Resistance Scale: A multi-sample validation of a new instrument to assess mental illness stigma resistance.

PubMed

Firmin, Ruth L; Lysaker, Paul H; McGrew, John H; Minor, Kyle S; Luther, Lauren; Salyers, Michelle P

2017-12-01

Although associated with key recovery outcomes, stigma resistance remains under-studied largely due to limitations of existing measures. This study developed and validated a new measure of stigma resistance. Preliminary items, derived from qualitative interviews of people with lived experience, were pilot tested online with people self-reporting a mental illness diagnosis (n = 489). Best performing items were selected, and the refined measure was administered to an independent sample of people with mental illness at two state mental health consumer recovery conferences (n = 202). Confirmatory factor analyses (CFA) guided by theory were used to test item fit, correlations between the refined stigma resistance measure and theoretically relevant measures were examined for validity, and test-retest correlations of a subsample were examined for stability. CFA demonstrated strong fit for a 5-factor model. The final 20-item measure demonstrated good internal consistency for each of the 5 subscales, adequate test-retest reliability at 3 weeks, and strong construct validity (i.e., positive associations with quality of life, recovery, and self-efficacy, and negative associations with overall symptoms, defeatist beliefs, and self-stigma). The new measure offers a more reliable and nuanced assessment of stigma resistance. It may afford greater personalization of interventions targeting stigma resistance. Copyright © 2017 Elsevier B.V. All rights reserved.
A Spanish-Language Risk Perception Survey for Developing Diabetes: Translation Process and Assessment of Psychometric Properties.

PubMed

Joiner, Kevin L; Sternberg, Rosa Maria; Kennedy, Christine; Chen, Jyu-Lin; Fukuoka, Yoshimi; Janson, Susan L

2016-12-01

Create a Spanish-language version of the Risk Perception Survey for Developing Diabetes (RPS-DD) and assess psychometric properties. The Spanish-language version was created through translation, harmonization, and presentation to the tool's original author. It was field tested in a foreign-born Latino sample and properties evaluated in principal components analysis. Personal Control, Optimistic Bias, and Worry multi-item Likert subscale responses did not cluster together. A clean solution was obtained after removing two Personal Control subscale items. Neither the Personal Disease Risk scale nor the Environmental Health Risk scale responses loaded onto single factors. Reliabilities ranged from .54 to .88. Test of knowledge performance varied by item. This study contributes to evidence of validation of a Spanish-language RPS-DD in foreign-born Latinos.
A Litmus Test for Performance Assessment.

ERIC Educational Resources Information Center

Finson, Kevin D.; Beaver, John B.

1992-01-01

Presents 10 guidelines for developing performance-based assessment items. Presents a sample activity developed from the guidelines. The activity tests students' ability to observe, classify, and infer, using red and blue litmus paper, a pH-range finder, vinegar, ammonia, an unknown solution, distilled water, and paper towels. (PR)
Development and validation of a fatigue assessment scale for U.S. construction workers.

PubMed

Zhang, Mingzong; Sparer, Emily H; Murphy, Lauren A; Dennerlein, Jack T; Fang, Dongping; Katz, Jeffrey N; Caban-Martinez, Alberto J

2015-02-01

To develop a fatigue assessment scale and test its reliability and validity for commercial construction workers. Using a two-phased approach, we first identified items (first phase) for the development of a Fatigue Assessment Scale for Construction Workers (FASCW) through review of existing scales in the scientific literature, key informant interviews (n = 11) and focus groups (three groups with six workers each) with construction workers. The second phase included assessment for the reliability, validity, and sensitivity of the new scale using a repeated-measures study design with a convenience sample of construction workers (n = 144). Phase one resulted in a 16-item preliminary scale that after factor analysis yielded a final 10-item scale with two sub-scales ("Lethargy" and "Bodily Ailment"). During phase two, the FASCW and its subscales demonstrated satisfactory internal consistency (alpha coefficients were FASCW [0.91], Lethargy [0.86] and Bodily Ailment [0.84]) and acceptable test-retest reliability (Pearson Correlation Coefficients: 0.59-0.68; Intraclass Correlation Coefficients: 0.74-0.80). Correlation analysis substantiated concurrent and convergent validity. A discriminant analysis demonstrated that the FASCW differentiated between groups with arthritis status and different work hours. The 10-item FASCW with good reliability and validity is an effective tool for assessing the severity of fatigue among construction workers. © 2015 Wiley Periodicals, Inc.
Development of the Assessment of Belief Conflict in Relationship-14 (ABCR-14).

PubMed

Kyougoku, Makoto; Teraoka, Mutsumi; Masuda, Noriko; Ooura, Mariko; Abe, Yasushi

2015-01-01

Nurses and other healthcare workers frequently experience belief conflict, one of the most important, new stress-related problems in both academic and clinical fields. In this study, using a sample of 1,683 nursing practitioners, we developed The Assessment of Belief Conflict in Relationship-14 (ABCR-14), a new scale that assesses belief conflict in the healthcare field. Standard psychometric procedures were used to develop and test the scale, including a qualitative framework concept and item-pool development, item reduction, and scale development. We analyzed the psychometric properties of ABCR-14 according to entropy, polyserial correlation coefficient, exploratory factor analysis, confirmatory factor analysis, average variance extracted, Cronbach's alpha, Pearson product-moment correlation coefficient, and multidimensional item response theory (MIRT). The results of the analysis supported a three-factor model consisting of 14 items. The validity and reliability of ABCR-14 was suggested by evidence from high construct validity, structural validity, hypothesis testing, internal consistency reliability, and concurrent validity. The result of the MIRT offered strong support for good item response of item slope parameters and difficulty parameters. However, the ABCR-14 Likert scale might need to be explored from the MIRT point of view. Yet, as mentioned above, there is sufficient evidence to support that ABCR-14 has high validity and reliability. The ABCR-14 demonstrates good psychometric properties for nursing belief conflict. Further studies are recommended to confirm its application in clinical practice.
Is It Just Me, or Are There Other Parents and Teachers Out There Confused about SOL Reading Assessments?

ERIC Educational Resources Information Center

Bintz, William P.

1998-01-01

Describes an incident involving the author, his daughter, and sample items from a Standards of Learning (SOL) assessment. Elaborates on the author's increasing confusion with SOL assessments, especially in reading. Proposes that educators spend less time testing kids and more time "testing their theories" so that assessments better reflect recent…
[Assessment of temperament with the Infant Behavior Questionnaire Revised (IBQ-R) - the psychometric properties of a German version].

PubMed

Vonderlin, Eva; Ropeter, Anna; Pauen, Sabina

2012-09-01

The Infant Behavior Questionnaire Revised (IBQ-R; Gartstein & Rothbart, 2003) is one of the most common parent-report instruments for assessing infant temperament. This study evaluated the psychometric properties of a German version. We studied item characteristics, internal consistency, and descriptive statistics for all 14 scales in a sample of 7- to 9-month-old infants and their mothers (N = 119). Factor analysis was conducted to identify higher-order relationships between the scales. Item analysis showed mixed corrected item-total correlations. Internal consistencies were all moderate to high. Results of the factor analysis confirmed the two dimensions of Surgency/Extraversion and Negative Affectivity, whereas the dimension Orienting/Regulation was not replicated. In contrast to the American sample, activity level in the German sample loaded on the factor Negative Affectivity. The scales low intensity pleasure and soothability, which loaded on factor Orienting/Regulation in the original version, showed substantial loadings on both dimensions Surgency/Extraversion and Negative Affectivity (inverted), whereas the scale duration of orienting was located on the factor Surgency/Extraversion. The German version of the IBQ-R provides a satisfying instrument for investigating infant temperament. However, further work is needed to improve the methodological quality of the questionnaire. Further research should especially focus on the factor structure of infant temperament. We suggest developing a shorter version and testing it with a larger and more diverse sample.
Psychometric analysis of the Generalized Anxiety Disorder scale (GAD-7) in primary care using modern item response theory.

PubMed

Jordan, Pascal; Shedden-Mora, Meike C; Löwe, Bernd

2017-01-01

The Generalized Anxiety Disorder scale (GAD-7) is one of the most frequently used diagnostic self-report scales for screening, diagnosis and severity assessment of anxiety disorder. Its psychometric properties from the view of the Item Response Theory paradigm have rarely been investigated. We aimed to close this gap by analyzing the GAD-7 within a large sample of primary care patients with respect to its psychometric properties and its implications for scoring using Item Response Theory. Robust, nonparametric statistics were used to check unidimensionality of the GAD-7. A graded response model was fitted using a Bayesian approach. The model fit was evaluated using posterior predictive p-values, item information functions were derived and optimal predictions of anxiety were calculated. The sample included N = 3404 primary care patients (60% female; mean age, 52.2; standard deviation, 19.2). The analysis indicated no deviations of the GAD-7 scale from unidimensionality and a decent fit of a graded response model. The commonly suggested ultra-brief measure consisting of the first two items, the GAD-2, was supported by item information analysis. The first four items discriminated better than the last three items with respect to latent anxiety. The information provided by the first four items should be weighted more heavily. Moreover, estimates corresponding to low to moderate levels of anxiety show greater variability. The psychometric validity of the GAD-2 was supported by our analysis.
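
To make the graded response model named in this abstract more concrete, the following is a minimal sketch of how category probabilities for a single item can be computed under Samejima's graded response model. The function name, the slope, and the threshold values are hypothetical illustrations, not estimates from the study, and the sketch does not reproduce the Bayesian fitting procedure the authors describe.

```python
import math

def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait level of the respondent
    slope      -- item discrimination parameter (hypothetical value below)
    thresholds -- ordered boundary locations b_1 < ... < b_(K-1)
    """
    # Cumulative probabilities P(X >= k) for k = 1..K-1 (logistic curves)
    cum = [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then take adjacent differences
    bounds = [1.0] + cum + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# A four-category item (GAD-7 items are scored 0-3) with made-up parameters
probs = grm_category_probs(theta=0.5, slope=2.0, thresholds=[-0.5, 0.8, 1.9])
print([round(p, 3) for p in probs])
```

A larger slope makes the category probabilities change more sharply with the latent trait, which is the sense in which some items "discriminate better" than others.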

Psychometric analysis of the Generalized Anxiety Disorder scale (GAD-7) in primary care using modern item response theory

PubMed Central

Shedden-Mora, Meike C.; Löwe, Bernd

2017-01-01

Objective: The Generalized Anxiety Disorder scale (GAD-7) is one of the most frequently used diagnostic self-report scales for screening, diagnosis and severity assessment of anxiety disorder. Its psychometric properties from the view of the Item Response Theory paradigm have rarely been investigated. We aimed to close this gap by analyzing the GAD-7 within a large sample of primary care patients with respect to its psychometric properties and its implications for scoring using Item Response Theory. Methods: Robust, nonparametric statistics were used to check unidimensionality of the GAD-7. A graded response model was fitted using a Bayesian approach. The model fit was evaluated using posterior predictive p-values, item information functions were derived and optimal predictions of anxiety were calculated. Results: The sample included N = 3404 primary care patients (60% female; mean age, 52.2; standard deviation, 19.2). The analysis indicated no deviations of the GAD-7 scale from unidimensionality and a decent fit of a graded response model. The commonly suggested ultra-brief measure consisting of the first two items, the GAD-2, was supported by item information analysis. The first four items discriminated better than the last three items with respect to latent anxiety. Conclusion: The information provided by the first four items should be weighted more heavily. Moreover, estimates corresponding to low to moderate levels of anxiety show greater variability. The psychometric validity of the GAD-2 was supported by our analysis. PMID:28771530
The Meal Pattern Questionnaire: A psychometric evaluation using the Eating Disorder Examination.

PubMed

Alfonsson, S; Sewall, A; Lidholm, H; Hursti, T

2016-04-01

Meal pattern is an important variable in both obesity treatment and treatment for eating disorders. Momentary assessment and eating diaries are highly valid measurement methods but often cumbersome and not always feasible to use in clinical practice. The aim of this study was to design and evaluate a self-report instrument for measuring meal patterns. The Pattern of eating item from the Eating Disorder Examination (EDE) interview was adapted to self-report format to follow the same overall structure as the Eating Disorder Examination Questionnaire. The new instrument was named the Meal Patterns Questionnaire (MPQ) and was compared with the EDE in a student sample (n=105) and an obese sample (n=111). The individual items of the MPQ and the EDE showed moderate to high correlations (rho=.63-.89) in the two samples. Significant differences between the MPQ and EDE were only found for two items in the obese sample. The total scores correlated to a high degree (rho=.87/.74) in both samples and no significant differences were found in this variable. The MPQ can provide an overall picture of a person's eating patterns and is a valid way to collect data regarding meal patterns. The MPQ may be a usable tool in clinical practice and research studies when more extensive instruments cannot be used. Future studies should evaluate the MPQ in diverse cultural populations and with more ecological assessment methods. Copyright © 2015 Elsevier Ltd. All rights reserved.
Psychometric properties of the 7-item game addiction scale among French and German speaking adults.

PubMed

Khazaal, Yasser; Chatton, Anne; Rothen, Stephane; Achab, Sophia; Thorens, Gabriel; Zullino, Daniele; Gmel, Gerhard

2016-05-10

The 7-item Game Addiction Scale (GAS) is used to screen for addictive game use. Both cross-linguistic validation and validation in French and German are needed in adult samples. The objective of the study is to assess the factorial structure of the French and German versions of the GAS among adults. Two samples of men from French (N = 3318) and German (N = 2665) language areas of Switzerland were assessed with the GAS, the Major Depression Inventory (MDI), the Brief Sensation Seeking Scale, and the Zuckerman-Kuhlman Personality Questionnaire (ZKPQ-50-cc). They were also assessed for cannabis and alcohol use. The internal consistency of the scale was satisfactory (Cronbach α = 0.85). A one-factor solution was found in both samples. Small and positive associations were found between GAS scores and the MDI, as well as the Neuroticism-Anxiety and Aggression-Hostility subscales of the ZKPQ-50-cc. A small negative association was found with the ZKPQ-50-cc Sociability subscale. The GAS, in its French and German versions, is appropriate for the assessment of game addiction among adults.
Child Adjustment and Parent Efficacy Scale-Developmental Disability (CAPES-DD): First psychometric evaluation of a new child and parenting assessment tool for children with a developmental disability.

PubMed

Emser, Theresa S; Mazzucchelli, Trevor G; Christiansen, Hanna; Sanders, Matthew R

2016-01-01

This study examined the psychometric properties of the Child Adjustment and Parent Efficacy Scale-Developmental Disability (CAPES-DD), a brief inventory for assessing emotional and behavioral problems of children with developmental disabilities aged 2- to 16-years, as well as caregivers' self-efficacy in managing these problems. A sample of 636 parents participated in the study. Children's ages ranged from 2 to 15. Exploratory and confirmatory factor analyses supported a 21-item, three-factor model of CAPES-DD child adjustment with 13 items describing behavioral (10 items) and emotional (3 items) problems and 8 items describing prosocial behavior. Three additional items were included due to their clinical usefulness and contributed to a Total Problem Score. Factor analyses also supported a 16-item, one factor model of CAPES-DD self-efficacy. Psychometric evaluation of the CAPES-DD revealed scales had satisfactory to very good internal consistency, as well as very good convergent and predictive validity. The instrument is to be in the public domain and free for practitioners and researchers to use. Potential uses of the measure and implications for future validation studies are discussed. Copyright © 2015 Elsevier Ltd. All rights reserved.
Assessing organizational climate: psychometric properties of the CLIOR Scale.

PubMed

Peña-Suárez, Elsa; Muñiz, José; Campillo-Álvarez, Angela; Fonseca-Pedrero, Eduardo; García-Cueto, Eduardo

2013-02-01

Organizational climate is the set of perceptions shared by workers who occupy the same workplace. The main goal of this study is to develop a new organizational climate scale and to determine its psychometric properties. The sample consisted of 3,163 Health Service workers. A total of 88.7% of participants worked in hospitals, and 11.3% in primary care; 80% were women and 20% men, with a mean age of 51.9 years (SD= 6.28). The proposed scale consists of 50 Likert-type items, with an alpha coefficient of 0.97, and an essentially one-dimensional structure. The discrimination indexes of the items are greater than 0.40, and the items show no differential item functioning in relation to participants' sex. A short version of the scale was developed, made up of 15 items, with discrimination indexes higher than 0.40, an alpha coefficient of 0.94, and its structure was clearly one-dimensional. These results indicate that the new scale has adequate psychometric properties, allowing a reliable and valid assessment of organizational climate.
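
As an illustration of the two item statistics reported in this abstract, the sketch below computes Cronbach's alpha and a per-item discrimination index for a small response matrix. It assumes that "discrimination index" means the corrected item-total correlation (each item correlated with the sum of the remaining items), which the abstract does not specify. The response data and function names are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corrected_item_total(rows):
    """Correlation of each item with the total score of the other items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]
        result.append(pearson(item, rest))
    return result

# Hypothetical Likert-type responses: 6 respondents x 4 items
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]
alpha = cronbach_alpha(responses)
discrimination = corrected_item_total(responses)
print(round(alpha, 2), [round(d, 2) for d in discrimination])
```

Removing the item from its own total avoids inflating the correlation, which matters most for short scales such as the 15-item version described above.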
Psychometric properties of the Exercise Benefits/Barriers Scale in Mexican elderly women

PubMed Central

Enríquez-Reyna, María Cristina; Cruz-Castruita, Rosa María; Ceballos-Gurrola, Oswaldo; García-Cadena, Cirilo Humberto; Hernández-Cortés, Perla Lizeth; Guevara-Valtier, Milton Carlos

2017-01-01

ABSTRACT Objective: analyze and assess the psychometric properties of the subscales in the Spanish version of the Exercise Benefits/Barriers Scale in an elderly population in the Northeast of Mexico. Method: methodological study. The sample consisted of 329 elderly associated with one of the five public centers for senior citizens in the metropolitan area of Northeast Mexico. The psychometric properties included the assessment of the Cronbach's alpha coefficient, the Kaiser Meyer Olkin coefficient, the inter-item correlation, exploratory and confirmatory factor analysis. Results: in the principal components analysis, two components were identified based on the 43 items in the scale. The item-total correlation coefficient of the exercise benefits subscale was good. Nevertheless, the coefficient for the exercise barriers subscale revealed inconsistencies. The reliability and validity were acceptable. The confirmatory factor analysis revealed that the elimination of items improved the goodness of fit of the baseline scale, without affecting its validity or reliability. Conclusion: the Exercise Benefits/Barriers subscale presented satisfactory psychometric properties for the Mexican context. A 15-item short version is presented with factorial structure, validity and reliability similar to the complete scale. PMID:28591306
Assessing Student Outcomes of Undergraduate Research with URSSA, the Undergraduate Student Self-Assessment Instrument

NASA Astrophysics Data System (ADS)

Laursen, S. L.; Weston, T. J.; Thiry, H.

2012-12-01

URSSA is the Undergraduate Research Student Self-Assessment, an online survey instrument for programs and departments to use in assessing the student outcomes of undergraduate research (UR). URSSA focuses on what students learn from their UR experience, rather than whether they liked it. The online questionnaire includes both multiple-choice and open-ended items that focus on students' gains from undergraduate research. These gains include skills, knowledge, deeper understanding of the intellectual and practical work of science, growth in confidence, changes in identity, and career preparation. Other items probe students' participation in important research-related activities that lead to these gains (e.g. giving presentations, having responsibility for a project). These activities, and the gains themselves, are based in research and thus constitute a core set of items. Using these items as a group helps to align a particular program assessment with research-demonstrated outcomes. Optional items may be used to probe particular features that augment the research experience (e.g. field trips, career seminars, housing arrangements). The URSSA items are based on extensive, interview-based research and evaluation work on undergraduate research by our group and others. This grounding in research means that URSSA measures what we know to be important about the UR experience. The items were tested with students, revised and re-tested. Data from a large pilot sample of over 500 students enabled statistical testing of the items' validity and reliability. Optional items about UR program elements were developed in consultation with UR program developers and leaders. The resulting instrument is flexible. Users begin with a set of core items, then customize their survey with optional items to probe students' experiences of specific program elements.
The online instrument is free and easy to use, with numeric results available as summary statistics, cross-tabs, and graphs, and as raw, downloadable data. Finally, URSSA has high content validity based on its research grounding and rigorous development. We will present examples of how URSSA has been used in evaluations of UR programs. A multi-year evaluation of a university-based UR program shows that URSSA items are sensitive to differences in students' prior level of experience with research. For example, experienced student researchers reported greater gains than did their peers new to UR in understanding the process of research and in coming to see themselves as scientists. These differences are consistent with interview data that suggest a developmental progression of gains as students pursue research and gain confidence in their ability to contribute meaningfully. A second example comes from a multi-site evaluation of sites funded by the National Science Foundation's Research Experience for Undergraduates (REU) program in Biology. This study acquired data from nearly 800 students at some 60 Bio REU sites in 2010 and 2011. Results reveal differences in gains among demographic groups, and the general strength of these well-planned programs relative to a comparison sample of UR programs that are not part of REU. Our presentation will demonstrate the evaluative use of URSSA and its potential applications to undergraduate research in the geosciences.
Linguistic Adaptation and Psychometric Properties of Tamil Version of General Oral Health Assessment Index-Tml.

PubMed

Appukuttan, D P; Vinayagavel, M; Balasundaram, A; Damodaran, L K; Shivaraman, P; Gunasshegaran, K

2015-01-01

Oral health has an impact on quality of life; hence, for research purposes, validation of a Tamil version of the General Oral Health Assessment Index would enable it to be used as a valuable tool among the Tamil-speaking population. In this study, we aimed to assess the psychometric properties of the translated Tamil version of the General Oral Health Assessment Index (GOHAI-Tml). Linguistic adaptation involved a forward and backward blind translation process. Reliability was analyzed using test-retest, Cronbach alpha, and split-half reliability. Inter-item and item-total correlations were evaluated using Spearman rank correlation. Convenience sampling was done, and 265 consecutive patients aged 20-70 years attending the outpatient department were recruited. Subjects were requested to fill in a self-reporting questionnaire along with the Tamil GOHAI version. Clinical examination was done on the same visit. Concurrent validity was measured by assessing the relationship between GOHAI scores and self-perceived oral health and general health status, satisfaction with oral health, need for dental treatment and esthetic satisfaction. Discriminant validity was evaluated by comparing the GOHAI scores with the objectively assessed clinical parameters. Exploratory factor analysis was done to examine the factor structure. Mean GOHAI-Tml was 52.7 (6.8, range 22-60, median 54). The mean number of negative impacts was 2 (2.4, range 0-11, median 1). The Spearman rank correlation for test-retest ranged from 0.8 to 0.9 (P < 0.001) for all the 12 items between visits. The Cronbach alpha for 265 samples was 0.8, suggesting good internal consistency and homogeneity between items. Item-scale correlation ranged from 0.4 to 0.8 (P < 0.001). Concurrent and discriminant validity was established. Principal component analysis resulted in extraction of four factors which together accounted for 66.4% (7.9/12) variance.
GOHAI-Tml has shown acceptable psychometric properties, so that it can be used as an efficient tool in identifying the impact of oral health on quality of life among the Tamil speaking population.
Measuring depression after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Depression item bank and linkage with PHQ-9.

PubMed

Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Bombardier, Charles H; Pohlig, Ryan T; Heinemann, Allen W; Carle, Adam; Choi, Seung W

2015-05-01

To develop a calibrated spinal cord injury-quality of life (SCI-QOL) item bank, computer adaptive test (CAT), and short form to assess depressive symptoms experienced by individuals with SCI, transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a crosswalk to the Patient Health Questionnaire (PHQ)-9. We used grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, item response theory (IRT) analyses, and statistical linking techniques to transform scores to a PROMIS metric and to provide a crosswalk with the PHQ-9. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. Spinal Cord Injury--Quality of Life (SCI-QOL) Depression Item Bank. Individuals with SCI were involved in all phases of SCI-QOL development. A sample of 716 individuals with traumatic SCI completed 35 items assessing depression, 18 of which were PROMIS items. After removing 7 non-PROMIS items, factor analyses confirmed a unidimensional pool of items. We used a graded response IRT model to estimate slopes and thresholds for the 28 retained items. The SCI-QOL Depression measure correlated 0.76 with the PHQ-9. The SCI-QOL Depression item bank provides a reliable and sensitive measure of depressive symptoms with scores reported in terms of general population norms. We provide a crosswalk to the PHQ-9 to facilitate comparisons between measures. The item bank may be administered as a CAT or as a short form and is suitable for research and clinical applications.
Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest.

PubMed

Spencer, Mercedes; Cho, Sun-Joo; Cutting, Laurie E

2018-02-02

In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.
[Additional psychometric data for the DS1K mood questionnaire. Experience from a large sample study involving parents of young children].

PubMed

Danis, Ildiko; Scheuring, Noemi; Papp, Eszter; Czinner, Antal

2012-06-01

A new instrument for assessing depressive mood, the first version of the Depression Scale Questionnaire (DS1K), was published in 2008 by Halmai et al. This scale was used in our large sample study, in the framework of the For Healthy Offspring project, involving parents of young children. The original questionnaire was developed in small samples, so our aim was to assist further development of the instrument by the psychometric analysis of the data in our large sample (n=1164). The DS1K scale was chosen to measure the parents' mood and mental state in the For Healthy Offspring project. The questionnaire was completed by 1063 mothers and 328 fathers, yielding a heterogeneous sample with respect to age and socio-demographic status. Analyses included main descriptive statistics, establishing the scale's internal consistency and some comparisons. Results were checked in our original and multiple imputed datasets as well. According to our results the reliability of our scale was much worse than in the original study (Cronbach alpha: 0.61 versus 0.88). During the detailed item analysis it became clear that two items contributed to the observed decreased coherence. We assumed a problem related to misreading in the case of one of these items. This assumption was checked by cross-analysis by the assumed reading level. According to our results the reliability of the scale was increased in both the lower and higher education level groups if we did not include one or both of these problematic items. However, as the number of items decreased, the relative sensitivity of the scale was also reduced, with fewer persons categorized in the risk group compared to the original scale. As an alternative solution, we suggest that the authors redefine the problematic items and retest the reliability of the measurement in a sample with diverse socio-demographic characteristics.
Psychometric Properties of the Disability Assessment Schedule (DAS) for Behavior Problems: An Independent Investigation

ERIC Educational Resources Information Center

Tsakanikos, Elias; Underwood, Lisa; Sturmey, Peter; Bouras, Nick; McCarthy, Jane

2011-01-01

The present study employed the Disability Assessment Schedule (DAS) to assess problem behaviors in a large sample of adults with ID (N = 568) and evaluate the psychometric properties of this instrument. Although the DAS problem behaviors were found to be internally consistent (Cronbach's [alpha] = 0.87), item analysis revealed one weak item…
Assessing the capacity of ministries of health to use research in decision-making: conceptual framework and tool.

PubMed

Rodríguez, Daniela C; Hoe, Connie; Dale, Elina M; Rahman, M Hafizur; Akhter, Sadika; Hafeez, Assad; Irava, Wayne; Rajbangshi, Preety; Roman, Tamlyn; Ţîrdea, Marcela; Yamout, Rouham; Peters, David H

2017-08-01

The capacity to demand and use research is critical for governments if they are to develop policies that are informed by evidence. Existing tools designed to assess how government officials use evidence in decision-making have significant limitations for low- and middle-income countries (LMICs); they are rarely tested in LMICs and focus only on individual capacity. This paper introduces an instrument that was developed to assess Ministry of Health (MoH) capacity to demand and use research evidence for decision-making, which was tested for reliability and validity in eight LMICs (Bangladesh, Fiji, India, Lebanon, Moldova, Pakistan, South Africa, Zambia). Instrument development was based on a new conceptual framework that addresses individual, organisational and systems capacities, and items were drawn from existing instruments and a literature review. After initial item development and pre-testing to address face validity and item phrasing, the instrument was reduced to 54 items for further validation and item reduction. In-country study teams interviewed a systematic sample of 203 MoH officials. Exploratory factor analysis was used in addition to standard reliability and validity measures to further assess the items. Thirty items divided between two factors representing organisational and individual capacity constructs were identified. South Africa and Zambia demonstrated the highest level of organisational capacity to use research, whereas Pakistan and Bangladesh were the lowest two. In contrast, individual capacity was highest in Pakistan, followed by South Africa, whereas Bangladesh and Lebanon were the lowest. The framework and related instrument represent a new opportunity for MoHs to identify ways to understand and improve capacities to incorporate research evidence in decision-making, as well as to provide a basis for tracking change.
Validity and Reliability of Persian Version of HIV/AIDS Related Stigma Scale for People Living With HIV/AIDS in Iran.

PubMed

Pourmarzi, Davoud; Khoramirad, Ashraf; Ahmari Tehran, Hoda; Abedini, Zahra

2015-11-01

To assess perceived HIV/AIDS related stigma, a comprehensive and well developed stigma instrument is necessary. This study aimed to assess the validity and reliability of the Persian version of the HIV/AIDS related stigma scale, which was developed by Kang et al, for people living with HIV/AIDS in Iran. The scale was forward translated by two bilingual academic members, then both translations were discussed by an expert team. Back-translation was done by two other bilingual translators, then we carried out a discussion with both of them. To evaluate understandability, the scale was administered to 10 Persons Living with HIV/AIDS (PLWHA). The final Persian version was administered to 80 PLWHA in Qom, Iran in 2014. Test-retest reliability was assessed in a sample of 20 PLWHA after a week by intra-class correlation coefficient (ICC). Cronbach's alpha coefficient for the overall scale was 0.85. Cronbach's alpha coefficients for the five subscales were as follows: social rejection (9 items, α = 0.84), negative self-worth (4 items, α = 0.70), perceived interpersonal insecurity (2 items, α = 0.57), financial insecurity (3 items, α = 0.70), discretionary disclosure (2 items, α = 0.83). Test-retest reliability was also confirmed with ICC = 0.78. Correlation between items and their hypothesized subscale was greater than 0.5. Correlation between an item and its own subscale was significantly higher than its correlation with other subscales. This study demonstrates that the Persian version of the HIV/AIDS related stigma scale is valid and reliable to assess HIV/AIDS related stigma perceived by people living with HIV/AIDS in Iran.
Validity and Reliability of Persian Version of HIV/AIDS Related Stigma Scale for People Living With HIV/AIDS in Iran

PubMed Central

Pourmarzi, Davoud; Khoramirad, Ashraf; Ahmari Tehran, Hoda; Abedini, Zahra

2015-01-01

Objective: To assess perceived HIV/AIDS related stigma, a comprehensive and well developed stigma instrument is necessary. This study aimed to assess the validity and reliability of the Persian version of the HIV/AIDS related stigma scale, which was developed by Kang et al, for people living with HIV/AIDS in Iran. Materials and methods: The scale was forward translated by two bilingual academic members, then both translations were discussed by an expert team. Back-translation was done by two other bilingual translators, then we carried out a discussion with both of them. To evaluate understandability, the scale was administered to 10 Persons Living with HIV/AIDS (PLWHA). The final Persian version was administered to 80 PLWHA in Qom, Iran in 2014. Test–retest reliability was assessed in a sample of 20 PLWHA after a week by intra-class correlation coefficient (ICC). Results: Cronbach’s alpha coefficient for the overall scale was 0.85. Cronbach’s alpha coefficients for the five subscales were as follows: social rejection (9 items, α = 0.84), negative self-worth (4 items, α = 0.70), perceived interpersonal insecurity (2 items, α = 0.57), financial insecurity (3 items, α = 0.70), discretionary disclosure (2 items, α = 0.83). Test–retest reliability was also confirmed with ICC = 0.78. Correlation between items and their hypothesized subscale was greater than 0.5. Correlation between an item and its own subscale was significantly higher than its correlation with other subscales. Conclusion: This study demonstrates that the Persian version of the HIV/AIDS related stigma scale is valid and reliable to assess HIV/AIDS related stigma perceived by people living with HIV/AIDS in Iran. PMID:27047562
Psychometric properties and feasibility of the Swedish version of the Philadelphia Geriatric Center Morale Scale.

PubMed

Niklasson, Johan; Conradsson, Mia; Hörnsten, Carl; Nyqvist, Fredrica; Padyab, Mojgan; Nygren, Björn; Olofsson, Birgitta; Lövheim, Hugo; Gustafson, Yngve

2015-11-01

Morale is related to psychological well-being and quality of life in older people. The Philadelphia Geriatric Center Morale Scale (PGCMS) is widely used to assess morale. The purpose of this study was to evaluate the psychometric properties and feasibility of the Swedish version of the 17-item PGCMS among very old people. The Umeå 85+/GERDA study included Swedish-speaking people aged 85, 90 and 95 years and older, from Sweden and Finland. Participants were interviewed in their own homes using a predefined set of questions. In the main sample, 493 individuals answered all 17 PGCMS items (aged 89.0 ± 4.3 years). Another 105 answered between 1 and 16 questions (aged 89.6 ± 4.4 years). A convenience sample was also collected, and 54 individuals answered all 17 PGCMS items twice (aged 84.7 ± 6.7 years). The same assessor restated the questions within 1 week. Cronbach's alpha was 0.74 among those who answered all 17 questions in the main sample. Confirmatory factor analysis was used to test the construct validity of the most widely used version of the PGCMS, with 17 items and three factors, and showed a generally good fit. Among those answering between 1 and 17 PGCMS questions, 92.6 % (554/598) answered 16 or 17. The convenience sample was used for intra-rater test-retesting, and the intraclass correlation coefficient (ICC) was 0.89. The least significant change between two assessments, with 95 % confidence interval, was 3.53 PGCMS points. The Swedish version of the PGCMS seems to have satisfactory psychometric properties and feasibility among very old people.
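
The abstract reports a least significant change of 3.53 points but does not state how it was derived. One common approach, shown here only as a hedged sketch, takes the standard error of measurement as SD × sqrt(1 − ICC) and multiplies it by 1.96 × sqrt(2) for a 95% criterion on the difference between two assessments. The ICC of 0.89 is the value reported above; the standard deviation of 4.0 is a hypothetical placeholder, and the function name is illustrative, so the printed result is not expected to reproduce the published figure.

```python
import math

def least_significant_change(sd, icc, z=1.96):
    """Smallest difference between two assessments that exceeds measurement error.

    Assumes SEM = sd * sqrt(1 - icc) and LSC = z * sqrt(2) * SEM;
    the study may have used a different formula.
    """
    sem = sd * math.sqrt(1.0 - icc)   # standard error of measurement
    return z * math.sqrt(2.0) * sem   # sqrt(2) accounts for two measurements

# icc is taken from the abstract; sd is a hypothetical placeholder value
lsc = least_significant_change(sd=4.0, icc=0.89)
print(round(lsc, 2))
```

Under this formula, a higher ICC or a smaller score spread shrinks the change needed before a difference between two assessments can be treated as more than measurement noise.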
Is the Berg Balance Scale an effective tool for the measurement of early postural control impairments in patients with Parkinson's disease? Evidence from Rasch analysis.

PubMed

La Porta, F; Giordano, A; Caselli, S; Foti, C; Franchignoni, F

2015-12-01

It is unclear whether the BBS is an effective tool for the measurement of early postural control impairments in patients with Parkinson's disease (PD). The aim of this paper was to evaluate BBS' content validity, internal construct validity, reliability and targeting in patients with PD within the Rasch analysis framework. Observational, cross-sectional study. Outpatient Rehabilitation Unit. A sample of 285 outpatients with PD. The content validity of the BBS was assessed using standard linking techniques. The BBS was administered by trained physiotherapists. The data collected then underwent Rasch analysis. Content validity analysis showed a lack of items assessing postural responses to tripping and slips and stability during walking. On Rasch analysis, the BBS failed the requirements of monotonicity, local independence, unidimensionality and invariance. After rescoring 7 items, grouping of locally dependent items into testlets, and deletion of the static sitting balance item because mistargeted and underdiscriminating, the Rasch-modified BBS for PD (BBS-PD) showed adequate internal construct validity (χ(2)24=39.693; P=0.023), including absence of differential item functioning (DIF) across gender and age, and was, as a whole, sufficiently precise for individual person measurement (PSI=0.894). However, the scale was not well targeted to the sample in view of the prevalence of higher scores. This study demonstrated the internal construct validity and reliability of the BBS-PD as a measurement tool for patients with PD within the Rasch analysis framework. However, the lack of items critical to the assessment of postural control impairments typical of PD, affected negatively the targeting, so that a significant percentage of patients was located in the higher ability range of the measurement continuum, where precision of measurement is reduced. 
These findings suggest that the BBS, even if modified, may not be an effective tool for the measurement of early postural control in patients with PD.
A Computer-Adaptive Disability Instrument for Lower Extremity Osteoarthritis Research Demonstrated Promising Breadth, Precision and Reliability

PubMed Central

Jette, Alan M.; McDonough, Christine M.; Haley, Stephen M.; Ni, Pengsheng; Olarsch, Sippy; Latham, Nancy; Hambleton, Ronald K.; Felson, David; Kim, Young-jo; Hunter, David

2012-01-01

Objective To develop and evaluate a prototype measure (OA-DISABILITY-CAT) for osteoarthritis research using Item Response Theory (IRT) and Computer Adaptive Test (CAT) methodologies. Study Design and Setting We constructed an item bank consisting of 33 activities commonly affected by lower extremity (LE) osteoarthritis. A sample of 323 adults with LE osteoarthritis reported their degree of limitation in performing everyday activities and completed the Health Assessment Questionnaire-II (HAQ-II). We used confirmatory factor analyses to assess scale unidimensionality and IRT methods to calibrate the items and examine the fit of the data. Using CAT simulation analyses, we examined the performance of OA-DISABILITY-CATs of different lengths compared to the full item bank and the HAQ-II. Results One distinct disability domain was identified. The 10-item OA-DISABILITY-CAT demonstrated a high degree of accuracy compared with the full item bank (r=0.99). The item bank and the HAQ-II scales covered a similar estimated scoring range. In terms of reliability, 95% of OA-DISABILITY reliability estimates were over 0.83 versus 0.60 for the HAQ-II. Except at the highest scores the 10-item OA-DISABILITY-CAT demonstrated superior precision to the HAQ-II. Conclusion The prototype OA-DISABILITY-CAT demonstrated promising measurement properties compared to the HAQ-II, and is recommended for use in LE osteoarthritis research. PMID:19216052
Screening for depression in arthritis populations: an assessment of differential item functioning in three self-reported questionnaires.

PubMed

Hu, Jinxiang; Ward, Michael M

2017-09-01

To determine if persons with arthritis differ systematically from persons without arthritis in how they respond to questions on three depression questionnaires, which include somatic items such as fatigue and sleep disturbance. We extracted data on the Centers for Epidemiological Studies Depression (CES-D) scale, the Patient Health Questionnaire-9 (PHQ-9), and the Kessler-6 (K-6) scale from three large population-based national surveys. We assessed items on these questionnaires for differential item functioning (DIF) between persons with and without self-reported physician-diagnosed arthritis using multiple indicator multiple cause models, which controlled for the underlying level of depression and important confounders. We also examined if DIF by arthritis status was similar between women and men. Although five items of the CES-D, one item of the PHQ-9, and five items of the K-6 scale had evidence of DIF based on statistical comparisons, the magnitude of each difference was less than the threshold of a small effect. The statistical differences were a function of the very large sample sizes in the surveys. Effect sizes for DIF were similar between women and men except for two items on the Patient Health Questionnaire-9. For each questionnaire, DIF accounted for 8% or less of the arthritis-depression association, and excluding items with DIF did not reduce the difference in depression scores between those with and without arthritis. Persons with arthritis respond to items on the CES-D, PHQ-9, and K-6 depression scales similarly to persons without arthritis, despite the inclusion of somatic items in these scales.
Rasch analysis of the Mini-Mental Adjustment to Cancer Scale (mini-MAC) among a heterogeneous sample of long-term cancer survivors: A cross-sectional study

PubMed Central

2012-01-01

Background The mini-Mental Adjustment to Cancer Scale (mini-MAC) is a well-recognised, popular measure of coping in psycho-oncology and assesses five cancer-specific coping strategies. It has been suggested that these five subscales could be grouped to form the over-arching adaptive and maladaptive coping subscales to facilitate the interpretation and clinical application of the scale. Despite the popularity of the mini-MAC, few studies have examined its psychometric properties among long-term cancer survivors, and further validation of the mini-MAC is needed to substantiate its use with the growing population of survivors. Therefore, this study examined the psychometric properties and dimensionality of the mini-MAC in a sample of long-term cancer survivors using Rasch analysis. Methods RUMM 2030 was used to analyse the mini-MAC data (n=851). Separate Rasch analyses were conducted for each of the original mini-MAC subscales as well as the over-arching adaptive and maladaptive coping subscales to examine summary and individual model fit statistics, person separation index (PSI), response format, local dependency, targeting, item bias (or differential item functioning, DIF), and dimensionality. Results For the fighting spirit, fatalism, and helplessness-hopelessness subscales, a revised three-point response format seemed more optimal than the original four-point response. To achieve model fit, items were deleted from four of the five subscales – Anxious Preoccupation items 7, 25, and 29; Cognitive Avoidance items 11 and 17; Fighting Spirit item 18; and Helplessness-Hopelessness items 16 and 20. For those subscales with sufficient items, analyses supported unidimensionality. Combining items to form the adaptive and maladaptive subscales was partially supported. Conclusions The original five subscales required item deletion and/or rescaling to improve goodness of fit to the Rasch model. While evidence was found for overarching subscales of adaptive and maladaptive coping, extensive modifications were necessary to achieve this result. Further exploration and validation of over-arching subscales assessing adaptive and maladaptive coping is necessary with cancer survivors. PMID:22607052

Assessing Teachers' Positive Psychological Functioning at Work: Development and Validation of the Teacher Subjective Wellbeing Questionnaire

ERIC Educational Resources Information Center

Renshaw, Tyler L.; Long, Anna C. J.; Cook, Clayton R.

2015-01-01

This study reports on the initial development and validation of the Teacher Subjective Wellbeing Questionnaire (TSWQ) with 2 samples of educators--a general sample of 185 elementary and middle school teachers, and a target sample of 21 elementary school teachers experiencing classroom management challenges. The TSWQ is an 8-item self-report…
Identifying Items to Assess Methodological Quality in Physical Therapy Trials: A Factor Analysis

PubMed Central

Cummings, Greta G.; Fuentes, Jorge; Saltaji, Humam; Ha, Christine; Chisholm, Annabritt; Pasichnyk, Dion; Rogers, Todd

2014-01-01

Background Numerous tools and individual items have been proposed to assess the methodological quality of randomized controlled trials (RCTs). The frequency of use of these items varies according to health area, which suggests a lack of agreement regarding their relevance to trial quality or risk of bias. Objective The objectives of this study were: (1) to identify the underlying component structure of items and (2) to determine relevant items to evaluate the quality and risk of bias of trials in physical therapy by using an exploratory factor analysis (EFA). Design A methodological research design was used, and an EFA was performed. Methods Randomized controlled trials used for this study were randomly selected from searches of the Cochrane Database of Systematic Reviews. Two reviewers used 45 items gathered from 7 different quality tools to assess the methodological quality of the RCTs. An exploratory factor analysis was conducted using the principal axis factoring (PAF) method followed by varimax rotation. Results Principal axis factoring identified 34 items loaded on 9 common factors: (1) selection bias; (2) performance and detection bias; (3) eligibility, intervention details, and description of outcome measures; (4) psychometric properties of the main outcome; (5) contamination and adherence to treatment; (6) attrition bias; (7) data analysis; (8) sample size; and (9) control and placebo adequacy. Limitation Because of the exploratory nature of the results, a confirmatory factor analysis is needed to validate this model. Conclusions To the authors' knowledge, this is the first factor analysis to explore the underlying component items used to evaluate the methodological quality or risk of bias of RCTs in physical therapy. The items and factors represent a starting point for evaluating the methodological quality and risk of bias in physical therapy trials. 
Empirical evidence of the association among these items with treatment effects and a confirmatory factor analysis of these results are needed to validate these items. PMID:24786942
Identifying items to assess methodological quality in physical therapy trials: a factor analysis.

PubMed

Armijo-Olivo, Susan; Cummings, Greta G; Fuentes, Jorge; Saltaji, Humam; Ha, Christine; Chisholm, Annabritt; Pasichnyk, Dion; Rogers, Todd

2014-09-01

Numerous tools and individual items have been proposed to assess the methodological quality of randomized controlled trials (RCTs). The frequency of use of these items varies according to health area, which suggests a lack of agreement regarding their relevance to trial quality or risk of bias. The objectives of this study were: (1) to identify the underlying component structure of items and (2) to determine relevant items to evaluate the quality and risk of bias of trials in physical therapy by using an exploratory factor analysis (EFA). A methodological research design was used, and an EFA was performed. Randomized controlled trials used for this study were randomly selected from searches of the Cochrane Database of Systematic Reviews. Two reviewers used 45 items gathered from 7 different quality tools to assess the methodological quality of the RCTs. An exploratory factor analysis was conducted using the principal axis factoring (PAF) method followed by varimax rotation. Principal axis factoring identified 34 items loaded on 9 common factors: (1) selection bias; (2) performance and detection bias; (3) eligibility, intervention details, and description of outcome measures; (4) psychometric properties of the main outcome; (5) contamination and adherence to treatment; (6) attrition bias; (7) data analysis; (8) sample size; and (9) control and placebo adequacy. Because of the exploratory nature of the results, a confirmatory factor analysis is needed to validate this model. To the authors' knowledge, this is the first factor analysis to explore the underlying component items used to evaluate the methodological quality or risk of bias of RCTs in physical therapy. The items and factors represent a starting point for evaluating the methodological quality and risk of bias in physical therapy trials. Empirical evidence of the association among these items with treatment effects and a confirmatory factor analysis of these results are needed to validate these items. 
© 2014 American Physical Therapy Association.
Stakeholder opinion of functional communication activities following traumatic brain injury.

PubMed

Larkins, B M; Worrall, L E; Hickson, L M

2004-07-01

To establish a process whereby assessment of functional communication reflects the authentic communication of the target population. The major functional communication assessments available from the USA may not be as relevant to those who reside elsewhere, nor assessments developed primarily for persons who have had a stroke as relevant for traumatic brain injury rehabilitation. The investigation used the Nominal Group Technique to elicit free opinion and support individuals who have compromised communication ability. A survey mailed out sampled a larger number of stakeholders to test out differences among groups. Five stakeholder groups generated items and the survey determined relative 'importance'. The stakeholder groups in both studies comprised individuals with traumatic brain injury and their families, health professionals, third-party payers, employers, and Maori, the indigenous population of New Zealand. There was no statistically significant difference found between groups for 19 of the 31 items. Only half of the items explicitly appear on a well-known USA functional communication assessment. The present study has implications for whether functional communication assessments are valid across cultures and the type of impairment.
Convergent and Discriminant Validity of the Five Factor Form and the Sliderbar Inventory.

PubMed

Rojas, Stephanie L; Widiger, Thomas A

2018-03-01

Existing measures of the five factor model (FFM) of personality are generally, if not exclusively, unipolar in their assessment of maladaptive variants of the FFM domains. However, two recently developed measures, the Five Factor Form (FFF) and the Sliderbar Inventory (SI), include items that assess for maladaptive variants at both poles of each item. This structure is unique among existing measures of personality and personality disorder, although there is a historical, infrequently used Stone Personality Trait Schema (SPTS) that had also included this item structure. To facilitate an exploration of their convergent and discriminant validity, the SI and SPTS items were reorganized into FFM scales. The convergent and discriminant validity of the FFF, SI-FFM, and SPTS-FFM scales was considered in a sample of 450 adults with current or a history of mental health treatment. The FFF, SI-FFM, and SPTS-FFM were also compared with respect to their relationship with FFM domains. Finally, the FFF items and SI-FFM scales were tested with respect to their relationship with measures of maladaptive variants of both high and low agreeableness and conscientiousness. The implications of the results are discussed with respect to the assessment of maladaptive personality functioning, and suggestions for future research are provided.
Improving Assessment of the Spectrum of Reward-Related Eating: The RED-13

PubMed Central

Mason, Ashley E.; Vainik, Uku; Acree, Michael; Tomiyama, A. Janet; Dagher, Alain; Epel, Elissa S.; Hecht, Frederick M.

2017-01-01

A diversity of scales capture facets of reward-related eating (RRE). These scales assess food cravings, uncontrolled eating, addictive behavior, restrained eating, binge eating, and other eating behaviors. However, these scales differ in terms of the severity of RRE they capture. We sought to incorporate the items from existing scales to broaden the 9-item Reward-based Eating Drive scale (RED-9; Epel et al., 2014), which assesses three dimensions of RRE (lack of satiety, preoccupation with food, and lack of control over eating), in order to more comprehensively assess the entire spectrum of RRE. In a series of 4 studies, we used Item Response Theory models to consider candidate items to broaden the RED-9. Studies 1 and 2 evaluated the abilities of additional items from existing scales to increase the RED-9’s coverage across the spectrum of RRE. Study 3 evaluated candidate items identified in Studies 1 and 2 in a new sample to assess the extent to which they accounted for more variance in areas less well-covered by the RED-9. Study 4 tested the ability of the RED-13 to provide consistent coverage across the range of the RRE spectrum. The resultant RED-13 accounted for greater variability than the RED-9 by reducing gaps in coverage of RRE in middle-to-low ranges. Like the RED-9, the RED-13 was positively correlated with BMI. The RED-13 was also positively related to a diagnosis of type 2 diabetes as well as cravings for sweet and savory foods. In summary, the RED-13 is a brief self-report measure that broadly captures the spectrum of RRE and may be a useful tool for identifying individuals at risk for overweight or obesity. PMID:28611698
Development and validation of the German version of the Orofacial Esthetic Scale.

PubMed

Reissmann, Daniel R; Benecke, Andreas W; Aarabi, Ghazal; Sierwald, Ira

2015-07-01

This study aimed to develop the German version of the Orofacial Esthetic Scale (OES-G) and to assess its psychometric properties. The OES is an eight-item instrument with seven items directly addressing esthetic impacts of the orofacial region and an eighth item for a global assessment. It applies an 11-point ordinal rating scale, with summary scores ranging from 0 (worst) to 70 (best). The original OES items were translated into German using a forward-backward method. A de novo development of German items (n = 21 patients) and a cross-cultural adaptation after pilot testing (n = 15 patients) established content validity. Internal consistency and construct validity (structural, convergent, known-groups) of the OES-G were assessed in a sample of 165 prosthodontic patients. The OES was administered to 42 patients on two occasions 2-4 weeks apart to determine test-retest reliability. Internal consistency of the OES-G was considered satisfactory (Cronbach's alpha 0.94; average inter-item correlation 0.64). An intraclass correlation coefficient of 0.95 (95 % confidence interval 0.92-0.98) indicated excellent test-retest reliability. The correlation matrix and exploratory factor analysis provided support for unidimensionality of the measured construct. The OES-G summary score was correlated with the patients' global assessment of their esthetics (r = 0.87) and external ratings of the expert group (r = 0.55), and it discriminated patients with treatment need (39.4 points) from patients without (58.4 points; p < 0.001) with a large effect size. The OES-G has good psychometric properties and is a valuable instrument for the assessment of self-perceived orofacial esthetics.
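The abstract above reports both Cronbach's alpha and the average inter-item correlation. The two are linked by the standardized-alpha (Spearman-Brown type) formula, shown here as a rough consistency check. The abstract does not say whether alpha was computed on 7 or 8 items, nor whether raw or standardized alpha was used, so the item counts below are assumptions.

```latex
\alpha_{\mathrm{std}} = \frac{k\,\bar{r}}{1 + (k - 1)\,\bar{r}}
\qquad\Rightarrow\qquad
\frac{7 \times 0.64}{1 + 6 \times 0.64} \approx 0.93,
\quad
\frac{8 \times 0.64}{1 + 7 \times 0.64} \approx 0.93
```

Either assumed item count gives a value close to the reported 0.94; small differences are expected because raw alpha is based on covariances rather than on the mean correlation.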
Rapid and Accurate Behavioral Health Diagnostic Screening: Initial Validation Study of a Web-Based, Self-Report Tool (the SAGE-SR).

PubMed

Brodey, Benjamin; Purcell, Susan E; Rhea, Karen; Maier, Philip; First, Michael; Zweede, Lisa; Sinisterra, Manuela; Nunn, M Brad; Austin, Marie-Paule; Brodey, Inger S

2018-03-23

The Structured Clinical Interview for DSM (SCID) is considered the gold standard assessment for accurate, reliable psychiatric diagnoses; however, because of its length, complexity, and training required, the SCID is rarely used outside of research. This paper aims to describe the development and initial validation of a Web-based, self-report screening instrument (the Screening Assessment for Guiding Evaluation-Self-Report, SAGE-SR) based on the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) and the SCID-5-Clinician Version (CV) intended to make accurate, broad-based behavioral health diagnostic screening more accessible within clinical care. First, study staff drafted approximately 1200 self-report items representing individual granular symptoms in the diagnostic criteria for the 8 primary SCID-CV modules. An expert panel iteratively reviewed, critiqued, and revised items. The resulting items were iteratively administered and revised through 3 rounds of cognitive interviewing with community mental health center participants. In the first 2 rounds, the SCID was also administered to participants to directly compare their Likert self-report and SCID responses. A second expert panel evaluated the final pool of items from cognitive interviewing and criteria in the DSM-5 to construct the SAGE-SR, a computerized adaptive instrument that uses branching logic from a screener section to administer appropriate follow-up questions to refine the differential diagnoses. The SAGE-SR was administered to healthy controls and outpatient mental health clinic clients to assess test duration and test-retest reliability. Cutoff scores for screening into follow-up diagnostic sections and criteria for inclusion of diagnoses in the differential diagnosis were evaluated. 
The expert panel reduced the initial 1200 test items to 664 items that panel members agreed collectively represented the SCID items from the 8 targeted modules and DSM criteria for the covered diagnoses. These 664 items were iteratively submitted to 3 rounds of cognitive interviewing with 50 community mental health center participants; the expert panel reviewed session summaries and agreed on a final set of 661 clear and concise self-report items representing the desired criteria in the DSM-5. The SAGE-SR constructed from this item pool took an average of 14 min to complete in a nonclinical sample versus 24 min in a clinical sample. Responses to individual items can be combined to generate DSM criteria endorsements and differential diagnoses, as well as provide indices of individual symptom severity. Preliminary measures of test-retest reliability in a small, nonclinical sample were promising, with good to excellent reliability for screener items in 11 of 13 diagnostic screening modules (intraclass correlation coefficient [ICC] or kappa coefficients ranging from .60 to .90), with mania achieving fair test-retest reliability (ICC=.50) and other substance use endorsed too infrequently for analysis. The SAGE-SR is a computerized adaptive self-report instrument designed to provide rigorous differential diagnostic information to clinicians. ©Benjamin Brodey, Susan E Purcell, Karen Rhea, Philip Maier, Michael First, Lisa Zweede, Manuela Sinisterra, M Brad Nunn, Marie-Paule Austin, Inger S Brodey. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 23.03.2018.
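The test-retest results above are expressed as ICC or kappa coefficients. As a minimal sketch of the kappa side, the following computes Cohen's kappa for two administrations of a categorical screener item, correcting observed agreement for the agreement expected by chance. The function name `cohen_kappa` and the demo answers are hypothetical and are not taken from the SAGE-SR study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for paired categorical ratings from two occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)
    # Proportion of pairs with identical answers.
    observed = sum(x == y for x, y in zip(first, second)) / n
    # Chance agreement from the marginal frequencies of each category.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[c] * c2[c] for c in c1) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no (1/0) answers from 10 people at test and retest.
test_answers = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
retest_answers = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
print(round(cohen_kappa(test_answers, retest_answers), 3))
```

Kappa becomes unstable when one category is endorsed very rarely, which is consistent with the abstract's note that one module was endorsed too infrequently for analysis.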
Social Exclusion Index-for Health Surveys (SEI-HS): a prospective nationwide study to extend and validate a multidimensional social exclusion questionnaire.

PubMed

van Bergen, Addi P L; Hoff, Stella J M; Schreurs, Hanneke; van Loon, Annelies; van Hemert, Albert M

2017-03-14

Social exclusion (SE) refers to the inability of certain groups or individuals to fully participate in society. SE is associated with socioeconomic inequalities in health, and its measurement in routine public health monitoring is considered key to designing effective health policies. In an earlier retrospective analysis we demonstrated that in all four major Dutch cities, SE could largely be measured with existing local public health monitoring data. The current prospective study is aimed at constructing and validating an extended national measure for SE that optimally employs available items. In 2012, a stratified general population sample of 258,928 Dutch adults completed a version of the Netherlands Public Health Monitor (PHM) questionnaire in which 9 items were added covering aspects of SE that were found to be missing in our previous research. Items were derived from the SCP social exclusion index, a well-constructed 15-item instrument developed by the Netherlands Institute for Social Research (SCP). The dataset was randomly divided into a development sample (N =129,464) and a validation sample (N = 129,464). Canonical correlation analysis was conducted in the development sample. The psychometric properties were studied and compared with those of the original SCP index. All analyses were then replicated in the validation sample. The analysis yielded a four dimensional index, the Social Exclusion Index for Health Surveys (SEI-HS), containing 8 SCP items and 9 PHM items. The four dimensions: "lack of social participation", "material deprivation", "lack of normative integration" and "inadequate access to basic social rights", were each measured with 3 to 6 items. The SEI-HS showed adequate internal consistency for both the general index and for two of four dimension scales. The internal structure and construct validity of the SEI-HS were satisfactory and similar to the original SCP index. 
Replication of the SEI-HS in the validation sample confirmed its generalisability. This study demonstrates that the SEI-HS offers epidemiologists and public health researchers a uniform, reliable, valid and efficient means of assessing social exclusion and its underlying dimensions. The study also provides valuable insights into how to develop embedded measures for public health surveillance.
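The study above randomly divided 258,928 respondents into equal development and validation samples. A minimal sketch of such a split is given below; the function name `split_half`, the use of integer respondent ids, and the fixed seed are assumptions for illustration, and the abstract does not describe the authors' actual procedure.

```python
import random


def split_half(ids, seed=0):
    """Randomly divide respondent ids into two halves of equal size."""
    shuffled = list(ids)
    # A seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Hypothetical ids 0..258927 standing in for the survey respondents.
development, validation = split_half(range(258_928))
print(len(development), len(validation))
```

Fitting the index on one half and replicating every analysis on the other guards against a structure that only reflects chance features of a single sample.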
Lower-fat menu items in restaurants satisfy customers.

PubMed

Fitzpatrick, M P; Chapman, G E; Barr, S I

1997-05-01

To evaluate a restaurant-based nutrition program by measuring customer satisfaction with lower-fat menu items and assessing patrons' reactions to the program. Questionnaires to assess satisfaction with menu items were administered to patrons in eight of the nine restaurants that volunteered to participate in the nutrition program. One patron from each participating restaurant was randomly selected for a semistructured interview about nutrition programming in restaurants. Persons dining in eight participating restaurants over a 1-week period (n = 686). Independent samples t tests were used to compare respondents' satisfaction with lower-fat and regular menu items. Two-way analysis of variance tests were completed using overall satisfaction as the dependent variable and menu-item classification (ie, lower fat or regular) and one of eight other menu item and respondent characteristics as independent variables. Qualitative methods were used to analyze interview transcripts. Of 1,127 menu items rated for satisfaction, 205 were lower fat, 878 were regular, and 44 were of unknown classification. Customers were significantly more satisfied with lower-fat than with regular menu items (P < .001). Overall satisfaction did not vary by any of the other independent variables. Interview results indicate the importance of restaurant dining as an indulgent experience. High satisfaction with lower-fat menu items suggests that customers will support restaurants providing such choices. Dietitians can use these findings to encourage restaurateurs to include lower-fat choices on their menus, and to assure clients that their expectations of being indulged are not incompatible with these choices.
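The comparison of satisfaction between lower-fat and regular items above relies on independent samples t tests. As a minimal sketch, the following computes the classic pooled-variance (Student) t statistic and its degrees of freedom. The function name `pooled_t` and the ratings are hypothetical; the abstract does not say whether the authors used the pooled or the unequal-variance form.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, df


# Hypothetical 1-5 satisfaction ratings (illustration only).
lower_fat = [5, 4, 5, 4, 5]
regular = [3, 4, 3, 4, 3]
t_stat, dof = pooled_t(lower_fat, regular)
print(round(t_stat, 3), dof)
```

The P value is then read from the t distribution with `dof` degrees of freedom, which this sketch leaves to a statistics package.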
Analysis of the psychometric properties of the Multiple Sclerosis Impact Scale-29 (MSIS-29) in relapsing–remitting multiple sclerosis using classical and modern test theory

PubMed Central

Wyrwich, KW; Phillips, GA; Vollmer, T; Guo, S

2016-01-01

Background Investigations using classical test theory support the psychometric properties of the original version of the Multiple Sclerosis Impact Scale (MSIS-29v1), a disease-specific measure of multiple sclerosis (MS) impact (physical and psychological subscales). Later, assessments of the MSIS-29v1 in an MS community-based sample using Rasch analysis led to revisions of the instrument’s response options (MSIS-29v2). Objective The objective of this paper is to evaluate the psychometric properties of the MSIS-29v1 in a clinical trial cohort of relapsing–remitting MS patients (RRMS). Methods Data from 600 patients with RRMS enrolled in the SELECT clinical trial were used. Assessments were performed at baseline and at Weeks 12, 24, and 52. In addition to traditional psychometric analyses, Item Response Theory (IRT) and Rasch analysis were used to evaluate the measurement properties of the MSIS-29v1. Results Both MSIS-29v1 subscales demonstrated strong reliability, construct validity, and responsiveness. The IRT and Rasch analysis showed overall support for response category threshold ordering, person-item fit, and item fit for both subscales. Conclusions Both MSIS-29v1 subscales demonstrated robust measurement properties using classical, IRT, and Rasch techniques. Unlike previous research using a community-based sample, the MSIS-29v1 was found to be psychometrically sound to assess physical and psychological impairments in a clinical trial sample of patients with RRMS. PMID:28607741
Analysis of the psychometric properties of the Multiple Sclerosis Impact Scale-29 (MSIS-29) in relapsing-remitting multiple sclerosis using classical and modern test theory.

PubMed

Bacci, E D; Wyrwich, K W; Phillips, G A; Vollmer, T; Guo, S

2016-01-01

Investigations using classical test theory support the psychometric properties of the original version of the Multiple Sclerosis Impact Scale (MSIS-29v1), a disease-specific measure of multiple sclerosis (MS) impact (physical and psychological subscales). Later, assessments of the MSIS-29v1 in an MS community-based sample using Rasch analysis led to revisions of the instrument's response options (MSIS-29v2). The objective of this paper is to evaluate the psychometric properties of the MSIS-29v1 in a clinical trial cohort of relapsing-remitting MS patients (RRMS). Data from 600 patients with RRMS enrolled in the SELECT clinical trial were used. Assessments were performed at baseline and at Weeks 12, 24, and 52. In addition to traditional psychometric analyses, Item Response Theory (IRT) and Rasch analysis were used to evaluate the measurement properties of the MSIS-29v1. Both MSIS-29v1 subscales demonstrated strong reliability, construct validity, and responsiveness. The IRT and Rasch analysis showed overall support for response category threshold ordering, person-item fit, and item fit for both subscales. Both MSIS-29v1 subscales demonstrated robust measurement properties using classical, IRT, and Rasch techniques. Unlike previous research using a community-based sample, the MSIS-29v1 was found to be psychometrically sound to assess physical and psychological impairments in a clinical trial sample of patients with RRMS.
Development and psychometric testing of an instrument designed to measure chronic pain in dogs with osteoarthritis

PubMed Central

Boston, Raymond C.; Coyne, James C.; Farrar, John T.

2010-01-01

Objective: To develop and psychometrically test an owner self-administered questionnaire designed to assess severity and impact of chronic pain in dogs with osteoarthritis. Sample Population: 70 owners of dogs with osteoarthritis and 50 owners of clinically normal dogs. Procedures: Standard methods for the stepwise development and testing of instruments designed to assess subjective states were used. Items were generated through focus groups and an expert panel. Items were tested for readability and ambiguity, and poorly performing items were removed. The reduced set of items was subjected to factor analysis, reliability testing, and validity testing. Results: Severity of pain and interference with function were 2 factors identified and named on the basis of the items contained in them. Cronbach’s α was 0.93 and 0.89, respectively, suggesting that the items in each factor could be assessed as a group to compute factor scores (ie, severity score and interference score). The test-retest analysis revealed κ values of 0.75 for the severity score and 0.81 for the interference score. Scores correlated moderately well (r = 0.51 and 0.50, respectively) with the overall quality-of-life (QOL) question, such that as severity and interference scores increased, QOL decreased. Clinically normal dogs had significantly lower severity and interference scores than dogs with osteoarthritis. Conclusions and Clinical Relevance: A psychometrically sound instrument was developed. Responsiveness testing must be conducted to determine whether the questionnaire will be useful in reliably obtaining quantifiable assessments from owners regarding the severity and impact of chronic pain and its treatment on dogs with osteoarthritis. PMID:17542696
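The Cronbach's α values in the abstract above summarize how consistently the items within each factor vary together. The following is a minimal sketch of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores), assuming complete item-level responses. The function name and the toy data are illustrative assumptions and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes every respondent answered every item (no missing data).
    """
    k = len(rows[0])                      # number of items
    item_columns = list(zip(*rows))       # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(r) for r in rows])  # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 4 respondents x 2 items.
toy = [[1, 2], [2, 1], [3, 3], [4, 4]]
alpha = cronbach_alpha(toy)
```

Sample variances (n - 1 denominator) are used for both the items and the totals, so the ratio is unaffected by the choice of denominator as long as it is applied consistently.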
Self-report assessment of the DSM-IV personality disorders. Measurement of trait and distress characteristics: the ADP-IV.

PubMed

Schotte, C K; de Doncker, D; Vankerckhoven, C; Vertommen, H; Cosyns, P

1998-09-01

Self-report instruments assessing the DSM personality disorders are characterized by overdiagnosis due to their emphasis on the measurement of personality traits rather than the impairment and distress associated with the criteria. The ADP-IV, a Dutch questionnaire, introduces an alternative assessment method: each test item assesses 'Trait' as well as 'Distress/impairment' characteristics of a DSM-IV criterion. This item format allows dimensional as well as categorical diagnostic evaluations. The present study explores the validity of the ADP-IV in a sample of 659 subjects of the Flemish population. The dimensional personality disorder subscales, measuring Trait characteristics, are internally consistent and display a good concurrent validity with the Wisconsin Personality Disorders Inventory. Factor analysis at the item-level resulted in 11 orthogonal factors, describing personality dimensions such as psychopathy, social anxiety and avoidance, negative affect and self-image. Factor analysis at the subscale-level identified two basic dimensions, reflecting hostile (DSM-IV Cluster B) and anxious (DSM-IV Cluster C) interpersonal attitudes. Categorical ADP-IV diagnoses are obtained using scoring algorithms, which emphasize the Trait or the Distress concepts in the diagnostic evaluation. Prevalences of ADP-IV diagnoses of any personality disorder according to these algorithms vary between 2.28 and 20.64%. Although further research in clinical samples is required, the present results support the validity of the ADP-IV and the potential of the measurement of trait and distress characteristics as a method for assessing personality pathology.
Development of a new Rasch-based scoring algorithm for the National Eye Institute Visual Functioning Questionnaire to improve its interpretability.

PubMed

Petrillo, Jennifer; Bressler, Neil M; Lamoureux, Ecosse; Ferreira, Alberto; Cano, Stefan

2017-08-14

The NEI VFQ-25 has undergone psychometric evaluation in patients with varying ocular conditions and the general population. However, important limitations which may affect the interpretation of clinical trial results have been previously identified, such as concerns with reliability and validity. The purpose of this study was to evaluate the National Eye Institute Visual Functioning Questionnaire (NEI VFQ-25) and make recommendations for a revised scoring structure, with a view to improving its psychometric performance and interpretability. Rasch Measurement Theory analyses were conducted in two stages using pooled baseline NEI VFQ-25 data for 2487 participants with retinal diseases enrolled in six clinical trials. In stage 1, we examined: scale-to-sample targeting; thresholds for item response options; item fit statistics; stability; local dependence; and reliability. In stage 2, a post-hoc revision of the scoring structure (VFQ-28R) was created and psychometrically re-evaluated. In stage 1, we found that the NEI VFQ-25 was mis-targeted to the sample, and had disordered response thresholds (15/25 items) and mis-fitting items (8/25 items). However, items appeared to be stable (differential item functioning for three items), have minimal item dependency (one pair of items) and good reliability (person-separation index, 0.93). In stage 2, the modified Rasch-scored NEI VFQ-28-R was assessed. It comprised two broad domains: Activity Limitation (19 items) and Socio-Emotional Functioning (nine items). The NEI VFQ-28-R demonstrated improved performance with fewer disordered response thresholds (no items), less item misfit (three items) and improved population targeting (reduced ceiling effect) compared with the NEI VFQ-25. 
Compared with the original version, the proposed NEI VFQ-28-R, with Rasch-based scoring and a two-domain structure, appears to offer improved psychometric performance and interpretability of the vision-related quality of life scale for the population analysed.
Assessing the validity and reliability of the Pool Activity Level (PAL) Checklist for use with older people with dementia.

PubMed

Wenborn, Jennifer; Challis, David; Pool, Jackie; Burgess, Jane; Elliott, Nicola; Orrell, Martin

2008-03-01

Activity is key to maintaining physical and mental health and well-being. However, as dementia affects the ability to engage in activity, care-givers can find it difficult to provide appropriate activities. The Pool Activity Level (PAL) Checklist guides the selection of appropriate, personally meaningful activities. The aim of this study was to assess the reliability and validity of the PAL Checklist when used with older people with dementia. A postal questionnaire sent to activity providers assessed content validity. Validity and reliability were measured in a sample of 60 older people with dementia. The questionnaire response rate was 83% (102/122). Most respondents felt no important items were missing. Seven of the nine activities were ranked as 'very important' or 'essential' by at least 77% of the sample, indicating very good content validity. Correlation with measures of cognition, severity of dementia and activity performance demonstrated strong concurrent validity. Inter-item correlation indicated strong construct validity. Cronbach's alpha coefficient measured internal consistency as excellent (0.95). All items achieved acceptable test-retest reliability, and the majority demonstrated acceptable inter-rater reliability. We conclude that the PAL Checklist demonstrates adequate validity and reliability when used with older people with dementia and appears a useful tool for a variety of care settings.
Initial psychometric evaluation of the Moral Injury Questionnaire--Military version.

PubMed

Currier, Joseph M; Holland, Jason M; Drescher, Kent; Foy, David

2015-01-01

Moral injury is an emerging construct related to negative consequences associated with war-zone stressors that transgress military veterans' deeply held values/beliefs. Given the newness of the construct, there is a need for instrumentation that might assess morally injurious experiences (MIEs) in this population. Drawing on a community sample of 131 Iraq and/or Afghanistan Veterans and clinical sample of 82 returning Veterans, we conducted an initial psychometric evaluation of the newly developed Moral Injury Questionnaire-Military version (MIQ-M), a 20-item self-report measure for assessing MIEs. Possibly due to low rates of reporting, an item assessing sexual trauma did not yield favourable psychometric properties and was excluded from analyses. Veterans in the clinical sample endorsed significantly higher scores across MIQ-M items. Factor analytic results for the final 19 items supported a unidimensional structure, and convergent validity analyses revealed that higher scores (indicative of more MIEs) were correlated with greater general combat exposure, impairments in work/social functioning, posttraumatic stress and depression in the community sample. In addition, when controlling for demographics, deployment-related factors and exposure to life threat stressors associated with combat, tests of incremental validity indicated that MIQ-M scores were also uniquely linked with suicide risk and other mental health outcomes. These findings provide preliminary evidence for the validity of the MIQ-M and support the applicability of this measure for further research and clinical work with Veterans. Military service can confront service members with experiences that undermine their core sense of humanity and violate global values and beliefs. These types of experiences increase the risk for posttraumatic maladjustment in this population, even when accounting for rates of exposure to life threat traumas.
Moral injury is an emerging construct to more fully capture the many possible psychological, ethical, and spiritual/existential challenges among persons who served in modern wars and other trauma-exposed professional groups. There is currently a need for psychometrically sound instrumentation for assessing morally injurious experiences (MIEs). The Moral Injury Questionnaire - Military Version (MIQ-M) was developed to provide a tool for assessing possible MIEs among military populations. This study provides preliminary evidence of the validity - including factorial, concurrent, and incremental - and clinical utility of the MIQ-M for further applications in clinical and research contexts. Copyright © 2013 John Wiley & Sons, Ltd.
Validation of a short, qualitative food frequency questionnaire in French adults participating in the MONA LISA-NUT study 2005-2007.

PubMed

Giovannelli, Jonathan; Dallongeville, Jean; Wagner, Aline; Bongard, Vanina; Laillet, Brigitte; Marecaux, Nadine; Ruidavets, Jean Bernard; Haas, Bernadette; Ferrieres, Jean; Arveiler, Dominique; Simon, Chantal; Dauchet, Luc

2014-04-01

Food frequency questionnaires (FFQs) are often used to evaluate individuals' food intakes in epidemiologic studies because of their simplicity and low cost. The objective was to assess the validity of a short (24 items), qualitative FFQ used in the MONA LISA-NUT study. The design was a cross-sectional study of a representative sample in three French counties. The sample included 2,630 participants aged 35 to 65 years from the MONA LISA-NUT study. Food consumption was measured with the FFQ and via food records for 3 consecutive days. Plasma fatty acids were measured in a subset of participants. The FFQ items' validity was assessed by calculating crude and deattenuated Pearson correlation coefficients between frequencies reported by the FFQ and average weights reported by the food records. Furthermore, the validity of some items of the FFQ measuring the consumption of fatty foods was assessed by calculating Pearson correlation coefficients between frequencies of consumption of these foods and dosages of the corresponding plasma fatty acids: fish and eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA), olive oil and oleic acid, margarine and elaidic acid, and dairy products and pentadecanoic and heptadecanoic acids. The mean of the deattenuated Pearson correlation coefficients for all items was 0.46, with values ranging from 0.22 (fried food) to 0.77 (breakfast cereal). The correlation coefficient was ≤ 0.4 for one third of the 24 items. Moderate correlations were found between fish and EPA/DHA (EPA: r=0.43, 95% CI 0.33 to 0.51; DHA: r=0.39, 95% CI 0.30 to 0.47), but not for other food items. One third of the 24 items in the short, qualitative FFQ evaluated here were not sufficiently valid. However, for the food groups most commonly studied in the literature, this FFQ had the same degree of validity as other questionnaires designed to classify subjects according to their level of intake. Copyright © 2014 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
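The abstract above reports deattenuated correlations but does not say which correction was applied. One correction widely used in dietary validation work adjusts the observed correlation for day-to-day (within-person) variation in the reference method: r_true = r_obs · sqrt(1 + (within-person variance / between-person variance) / n), where n is the number of replicate days. The sketch below assumes that formula. The function name and all numeric inputs are hypothetical and do not come from the MONA LISA-NUT data.

```python
import math


def deattenuate(r_obs, within_var, between_var, n_replicates):
    """Correct an observed correlation for within-person variation.

    within_var / between_var is the variance ratio of the reference method
    (here, food records); n_replicates is the number of recorded days.
    """
    variance_ratio = within_var / between_var
    return r_obs * math.sqrt(1 + variance_ratio / n_replicates)


# Hypothetical inputs: observed r = 0.40, variance ratio = 3, 3 record days.
r_corrected = deattenuate(0.40, within_var=3.0, between_var=1.0, n_replicates=3)
```

The correction can only increase the magnitude of the coefficient, and it grows when the reference method is noisy relative to true between-person differences or when few replicate days are available.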
The dimensionality of DSM-IV alcohol use disorders among adolescent and adult drinkers and symptom patterns by age, gender, and race/ethnicity.

PubMed

Harford, Thomas C; Yi, Hsiao-ye; Faden, Vivian B; Chen, Chiung M

2009-05-01

There is limited information on the validity of Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) alcohol use disorders (AUD) symptom criteria among adolescents in the general population. The purpose of this study is to assess the DSM-IV AUD symptom criteria as reported by adolescent and adult drinkers in a single representative sample of the U.S. population aged 12 years and older. This design avoids potential confounding due to differences in survey methodology when comparing adolescents and adults from different surveys. A total of 133,231 current drinkers (had at least 1 drink in the past year) aged 12 years and older were drawn from respondents to the 2002 to 2005 National Surveys on Drug Use and Health. DSM-IV AUD criteria were assessed by questions related to specific symptoms occurring during the past 12 months. Factor analytic and item response theory models were applied to the 11 AUD symptom criteria to assess the probabilities of symptom item endorsements across different values of the underlying trait. A 1-factor model provided an adequate and parsimonious interpretation for the 11 AUD criteria for the total sample and for each of the gender-age groups. The MIMIC model exhibited significant indication for item bias among some criteria by gender, age, and race/ethnicity. Symptom criteria for "tolerance," "time spent," and "hazardous use" had lower item thresholds (i.e., lower severity) and low item discrimination, and they were well separated from the other symptoms, especially in the 2 younger age groups (12 to 17 and 18 to 25). "Larger amounts," "cut down," "withdrawal," and "legal problems" had higher item thresholds but generally lower item discrimination, and they tend to exhibit greater dispersion at higher AUD severity, particularly in the youngest age group (12 to 17).
Findings from the present study do not provide support for the 2 separate DSM-IV diagnoses of alcohol abuse and dependence among either adolescents or adults. Variations in criteria severity for both abuse and dependence offer support for a dimensional approach to diagnosis which should be considered in the ongoing development of DSM-V.
An in-depth psychometric analysis of the Connor-Davidson Resilience Scale: calibration with Rasch-Andrich model.

PubMed

Arias González, Víctor B; Crespo Sierra, María Teresa; Arias Martínez, Benito; Martínez-Molina, Agustín; Ponce, Fernando P

2015-09-23

The Connor-Davidson Resilience Scale (CD-RISC) is inarguably one of the best-known instruments in the field of resilience assessment. However, the criteria for the psychometric quality of the instrument were based only on classical test theory. The aim of this paper was to calibrate the CD-RISC in a nonclinical sample of 444 adults using the Rasch-Andrich Rating Scale Model, in order to clarify its structure and analyze its psychometric properties at the item level. Two items showed misfit to the model and were eliminated. The remaining 22 items essentially form a unidimensional scale. The CD-RISC has good psychometric properties. The fit of both the items and the persons to the Rasch model was good, and the response categories functioned properly. Two of the items showed differential item functioning. The CD-RISC has an obvious ceiling effect, which suggests that more difficult items should be included in future versions of the scale.
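For readers unfamiliar with the Rasch-Andrich Rating Scale Model named above, the sketch below computes the category probabilities for one person-item pair. Under this model, the probability of responding in category k is proportional to exp(Σ_{j=1..k} (θ - δ - τ_j)), where θ is the person location, δ the item location, and τ_j the thresholds shared by all items. The function name and parameter values are illustrative assumptions and are not estimates from the CD-RISC calibration.

```python
import math


def rsm_category_probs(theta, delta, taus):
    """Category probabilities under the Rasch-Andrich Rating Scale Model.

    theta: person location; delta: item location;
    taus: list of m thresholds shared across items (gives m + 1 categories).
    """
    # Cumulative sums of (theta - delta - tau_j); category 0 has an empty sum.
    logits = [0.0]
    for tau in taus:
        logits.append(logits[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item with thresholds at -1 and +1 logits.
average_person = rsm_category_probs(theta=0.0, delta=0.0, taus=[-1.0, 1.0])
high_person = rsm_category_probs(theta=2.0, delta=0.0, taus=[-1.0, 1.0])
```

A ceiling effect of the kind the abstract describes corresponds to many persons sitting well above the item locations, so that, as for `high_person` here, most of the probability mass falls in the top category.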

Development and evaluation of the Korean Health Literacy Instrument.

PubMed

Kang, Soo Jin; Lee, Tae Wha; Paasche-Orlow, Michael K; Kim, Gwang Suk; Won, Hee Kwan

2014-01-01

The purpose of this study is to develop and validate the Korean Health Literacy Instrument, which measures the capacity to understand and use health-related information and make informed health decisions in Korean adults. In Phase 1, 33 initial items were generated to measure functional, interactive, and critical health literacy with prose, document, and numeracy tasks. These items included content from health promotion, disease management, and health navigation contexts. Content validity assessment was conducted by an expert panel, and 11 items were excluded. In Phase 2, the 22 remaining items were administered to a convenience sample of 292 adults from community and clinical settings. Exploratory factor and item difficulty and discrimination analyses were conducted and four items with low discrimination were deleted. In Phase 3, the remaining 18 items were administered to a convenience sample of 315 adults 40-64 years of age from community and clinical settings. A confirmatory factor analysis was performed to test the construct validity of the instrument. The Korean Health Literacy Instrument has a range of 0 to 18. The mean score in our validation study was 11.98. The instrument exhibited an internal consistency reliability coefficient of 0.82, and a test-retest reliability of 0.89. The instrument is suitable for screening individuals who have limited health literacy skills. Future studies are needed to further define the psychometric properties and predictive validity of the Korean Health Literacy Instrument.
Rater Evaluations for Psychiatric Instruments and Cultural Differences: The PANSS in China and United States

PubMed Central

Aggarwal, Neil Krishan; Zhang, Xiang Yang; Stefanovics, Elina; Chen, Da Chun; Xiu, Mei Hong; Xu, Ke; Rosenheck, Robert A.

2013-01-01

This article compares Positive and Negative Syndrome Scale (PANSS) data from Chinese and American inpatients with chronic schizophrenia to show how differences in item ratings may reflect cultural attitudes of raters. The Chinese sample (N=504) came from Beijing Huilongguan Hospital. The American sample came from 268 PANSS assessments of CATIE subjects hospitalized for 15 days or more to optimize equivalence of the samples. Controlling for age and gender, the Chinese sample scored significantly lower for total score by 25% (p<.0001), for the positive sub-scale by 35% (p<.0001), and on the general sub-scale by 32% (p<.0001), but not significantly different on the negative sub-scale score (+0.26%, p=0.76). However, the Chinese sample scored 26% higher on the item on poor rapport (p<.0001), 10.2% higher on passive social withdrawal (p=.003), and most notably 46% higher on the item on lack of judgment and insight (p<.0001). These results remain broadly consistent across gender sub-group analyses. Differences seem to be best explained by both cultural differences in patient clinical presentations as well as varying American and Chinese cultural values affecting rater judgment. PMID:22922237
Development of a questionnaire for assessing dependence on electronic cigarettes among a large sample of ex-smoking E-cigarette users.

PubMed

Foulds, Jonathan; Veldheer, Susan; Yingst, Jessica; Hrabovsky, Shari; Wilson, Stephen J; Nichols, Travis T; Eissenberg, Thomas

2015-02-01

Electronic cigarettes (e-cigs) are becoming increasingly popular, but little is known about their dependence potential. This study aimed to assess ratings of dependence on electronic cigarettes and retrospectively compare them with rated dependence on tobacco cigarettes among a large sample of ex-smokers who switched to e-cigs. A total of 3,609 current users of e-cigs who were ex-cigarette smokers completed a 158-item online survey about their e-cig use, including 10 items designed to assess their previous dependence on cigarettes and 10 almost identical items, worded to assess their current dependence on e-cigs (range 0-20). Scores on the 10-item Penn State (PS) Cigarette Dependence Index were significantly higher than on the comparable PS Electronic Cigarette Dependence Index (14.5 vs. 8.1, p < .0001). In multivariate analysis, those who had used e-cigs longer had higher e-cig dependence scores, as did those using more advanced e-cigs that were larger than a cigarette and had a manual button. Those using zero nicotine liquid had significantly lower e-cig dependence scores than those using 1-12 mg/ml, who scored significantly lower than those using 13 or greater mg/ml nicotine liquid (p < .003). Current e-cigarette users reported being less dependent on e-cigarettes than they retrospectively reported having been dependent on cigarettes prior to switching. E-cig dependence appears to vary by product characteristics and liquid nicotine concentration, and it may increase over time. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Psychometric properties of the brief version of the Fear of Negative Evaluation Scale in a Turkish sample.

PubMed

Koydemir, Selda; Demir, Ayhan

2007-06-01

The purpose of the study was to report initial data on the psychometric properties of the Brief Fear of Negative Evaluation Scale. The scale was applied to a nonclinical sample of 250 (137 women, 113 men) Turkish undergraduate students selected randomly from Middle East Technical University. Their mean age was 20.4 yr. (SD= 1.9). The factor structure of the Turkish version, its criterion validity, and internal reliability coefficients were assessed. Although maximum likelihood factor analysis initially indicated that the scale had only one factor, a forced two-factor solution accounted for more variance (61%) in scale scores than a single factor. The straightforward items loaded on the first factor, and the reverse-coded items loaded on the second factor. The total score was significantly positively correlated with scores on the Revised Cheek and Buss Shyness Scale and significantly negatively correlated with scores on the Rosenberg Self-Esteem Scale. Factor 1 (straightforward items) correlated more highly with both Shyness and Self-esteem than Factor 2 (reverse-coded items). Internal consistency estimate was .94 for the Total scores, .91 for the Factor 1 (straightforward items), and .87 for the Factor 2 (reverse-coded items). No sex differences were evident for Fear of Negative Evaluation.
An Investigation of Sample Size Splitting on ATFIND and DIMTEST

ERIC Educational Resources Information Center

Socha, Alan; DeMars, Christine E.

2013-01-01

Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Attitudes Towards the Sexuality of Adults with an Intellectual Disability: Parents, Support Staff, and a Community Sample

ERIC Educational Resources Information Center

Cuskelly, Monica; Bryde, Rachel

2004-01-01

Attitudes toward the sexuality of adults with intellectual disability were assessed in parents and carers of adults with intellectual disability and in a community sample. An instrument that contained items relating to eight aspects of sexuality (sexual feelings, sex education, masturbation, personal relationships, sexual intercourse,…
The Children's Perceived Locus of Causality Scale for Physical Education

ERIC Educational Resources Information Center

Pannekoek, Linda; Piek, Jan P.; Hagger, Martin S.

2014-01-01

A mixed methods design was applied to evaluate the application of the Perceived Locus of Causality scale (PLOC) to preadolescent samples in physical education settings. Subsequent to minor item adaptations to accommodate the assessment of younger samples, qualitative pilot tests were performed (N = 15). Children's reports indicated the need…
Evaluation of the US Army Institute of Public Health Destination Monitoring Program, a food safety surveillance program.

PubMed

Rapp-Santos, Kamala; Havas, Karyn; Vest, Kelly

2015-01-01

The Destination Monitoring Program, operated by the US Army Public Health Command (APHC), is one component that supports the APHC Veterinary Service's mission to ensure safety and quality of food procured for the Department of Defense (DoD). This program relies on retail product testing to ensure compliance of production facilities and distributors that supply food to the DoD. This program was assessed to determine the validity and timeliness by specifically evaluating whether sample size of items collected was adequate, if food samples collected were representative of risk, and whether the program returns results in a timely manner. Data was collected from the US Army Veterinary Services Lotus Notes database, including all food samples collected and submitted from APHC Region-North for the purposes of destination monitoring from January 1, 2013 to December 31, 2013. For most food items, only one sample was submitted for testing. The ability to correctly identify a contaminated food lot may be limited by reliance on test results from only one sample, as the level of confidence in a negative test result is low. The food groups most frequently sampled by APHC correlated with the commodities that were implicated in foodborne illness in the United States. Food items to be submitted were equally distributed among districts and branches, but sections within large branches submitted relatively few food samples compared to sections within smaller branches and districts. Finally, laboratory results were not available for about half the food items prior to their respective expiration dates.
Development of a Computer-Adaptive Physical Function Instrument for Social Security Administration Disability Determination

PubMed Central

Ni, Pengsheng; McDonough, Christine M.; Jette, Alan M.; Bogusz, Kara; Marfeo, Elizabeth E.; Rasch, Elizabeth K.; Brandt, Diane E.; Meterko, Mark; Chan, Leighton

2014-01-01

Objectives: To develop and test an instrument to assess physical function (PF) for Social Security Administration (SSA) disability programs, the SSA-PF. Item Response Theory (IRT) analyses were used to 1) create a calibrated item bank for each of the factors identified in prior factor analyses, 2) assess the fit of the items within each scale, 3) develop separate Computer-Adaptive Test (CAT) instruments for each scale, and 4) conduct initial psychometric testing. Design: Cross-sectional data collection; IRT analyses; CAT simulation. Setting: Telephone and internet survey. Participants: Two samples: 1,017 SSA claimants, and 999 adults from the US general population. Interventions: None. Main Outcome Measure: Model fit statistics, correlation and reliability coefficients. Results: IRT analyses resulted in five unidimensional SSA-PF scales: Changing & Maintaining Body Position, Whole Body Mobility, Upper Body Function, Upper Extremity Fine Motor, and Wheelchair Mobility for a total of 102 items. High CAT accuracy was demonstrated by strong correlations between simulated CAT scores and those from the full item banks. Comparing the simulated CATs to the full item banks, very little loss of reliability or precision was noted, except at the lower and upper ranges of each scale. No difference in response patterns by age or sex was noted. The distributions of claimant scores were shifted to the lower end of each scale compared to those of a sample of US adults. Conclusions: The SSA-PF instrument contributes important new methodology for measuring the physical function of adults applying to the SSA disability programs. Initial evaluation revealed that the SSA-PF instrument achieved considerable breadth of coverage in each content domain and demonstrated noteworthy psychometric properties. PMID:23578594
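The abstract above does not describe how the CAT chooses items. A common rule in computer-adaptive testing is to administer the unused item with the greatest Fisher information at the current ability estimate. The sketch below illustrates that rule under a simplifying assumption of dichotomous items and a two-parameter logistic (2PL) model, for which the information is a² · P · (1 - P). The item bank, parameter values, and function names are hypothetical and do not represent the SSA-PF item banks or their IRT model.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, bank, administered):
    """Index of the not-yet-administered item with maximum information."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


# Hypothetical bank of (discrimination a, difficulty b) pairs.
bank = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
first = next_item(0.0, bank, administered=set())
second = next_item(1.5, bank, administered={first})
```

Because information peaks where item difficulty matches ability, a bank with few items at the extremes yields less precise scores there, which is consistent with the loss of precision the abstract notes at the lower and upper ranges of each scale.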
The feeding practices and structure questionnaire: construction and initial validation in a sample of Australian first-time mothers and their 2-year olds

PubMed Central

2014-01-01

Background: Early feeding practices lay the foundation for children’s eating habits and weight gain. Questionnaires are available to assess parental feeding but overlapping and inconsistent items, subscales and terminology limit conceptual clarity and between study comparisons. Our aim was to consolidate a range of existing items into a parsimonious and conceptually robust questionnaire for assessing feeding practices with very young children (<3 years). Methods: Data were from 462 mothers and children (age 21–27 months) from the NOURISH trial. Items from five questionnaires and two study-specific items were submitted to a priori item selection, allocation and verification, before theoretically-derived factors were tested using Confirmatory Factor Analysis. Construct validity of the new factors was examined by correlating these with child eating behaviours and weight. Results: Following expert review 10 factors were specified. Of these, 9 factors (40 items) showed acceptable model fit and internal reliability (Cronbach’s α: 0.61-0.89). Four factors reflected non-responsive feeding practices: ‘Distrust in Appetite’, ‘Reward for Behaviour’, ‘Reward for Eating’, and ‘Persuasive Feeding’. Five factors reflected structure of the meal environment and limits: ‘Structured Meal Setting’, ‘Structured Meal Timing’, ‘Family Meal Setting’, ‘Overt Restriction’ and ‘Covert Restriction’. Feeding practices generally showed the expected pattern of associations with child eating behaviours but none with weight. Conclusion: The Feeding Practices and Structure Questionnaire (FPSQ) provides a new reliable and valid measure of parental feeding practices, specifically maternal responsiveness to children’s hunger/satiety signals facilitated by routine and structure in feeding. Further validation in more diverse samples is required. PMID:24898364
Why do persons with bipolar disorder stop their medication?

PubMed

Devulapalli, Kavi K; Ignacio, Rosalinda V; Weiden, Peter; Cassidy, Kristin A; Williams, Tiffany D; Safavi, Roknedin; Blow, Frederic C; Sajatovic, Martha

2010-01-01

Non-adherence to maintenance medication regimens is a major problem, limiting outcomes for many persons with bipolar disorder. The aim of this paper is to determine the most relevant aspects of adherence attitudes in a sample of bipolar patients selected for problems with adherence behavior. Among a larger sample of bipolar disorder patients participating in a prospective follow-up study (N = 140), a subsample of patients were selected for non-adherent behavior defined as missing ≥ 30% of medication during the past month (n = 27; 19.3%). Adherence attitudes were assessed with the Rating of Medication Influences scale (ROMI), a self-reported attitudinal measure assessing reasons for and against adherence. Multiple logistic regression models for non-adherence vs. adherence were estimated with each of the 19 ROMI items in the model, while controlling for sex, age, ethnicity, education, duration of illness, and substance abuse. Mean score of ROMI items corresponding to reasons for treatment adherence was greater among adherent participants, whereas the mean score of ROMI items corresponding to reasons for treatment non-adherence was greater among nonadherent participants. The ROMI item identifying that the individual believes that medications are unnecessary had the strongest influence for non-adherence (p < 0.0001). This was followed by ROMI items corresponding to no perceived daily benefit (p = 0.0008), perceived change in appearance (p = 0.0057), and perceived interference with life goals (p = 0.0033). The ROMI item identifying fear of relapse was the strongest predictor for adherence (p = 0.0017). Non-adherent patients with bipolar disorder differ from adherent patients with bipolar disorder on reasons for adherence and non-adherence. Utilization of tools that evaluate medication treatment attitudes, such as the ROMI or similar measures, may assist clinicians in the selection of interventions that are most likely to modify future treatment adherence.
Development of a computer-adaptive physical function instrument for Social Security Administration disability determination.

PubMed

Ni, Pengsheng; McDonough, Christine M; Jette, Alan M; Bogusz, Kara; Marfeo, Elizabeth E; Rasch, Elizabeth K; Brandt, Diane E; Meterko, Mark; Haley, Stephen M; Chan, Leighton

2013-09-01

Objectives: To develop and test an instrument to assess physical function for Social Security Administration (SSA) disability programs, the SSA-Physical Function (SSA-PF) instrument. Item response theory (IRT) analyses were used to (1) create a calibrated item bank for each of the factors identified in prior factor analyses, (2) assess the fit of the items within each scale, (3) develop separate computer-adaptive testing (CAT) instruments for each scale, and (4) conduct initial psychometric testing. Design: Cross-sectional data collection; IRT analyses; CAT simulation. Setting: Telephone and Internet survey. Participants: Two samples: SSA claimants (n=1017) and adults from the U.S. general population (n=999). Interventions: None. Main Outcome Measures: Model fit statistics, correlation, and reliability coefficients. Results: IRT analyses resulted in 5 unidimensional SSA-PF scales: Changing & Maintaining Body Position, Whole Body Mobility, Upper Body Function, Upper Extremity Fine Motor, and Wheelchair Mobility for a total of 102 items. High CAT accuracy was demonstrated by strong correlations between simulated CAT scores and those from the full item banks. On comparing the simulated CATs with the full item banks, very little loss of reliability or precision was noted, except at the lower and upper ranges of each scale. No difference in response patterns by age or sex was noted. The distributions of claimant scores were shifted to the lower end of each scale compared with those of a sample of U.S. adults. Conclusions: The SSA-PF instrument contributes important new methodology for measuring the physical function of adults applying to the SSA disability programs. Initial evaluation revealed that the SSA-PF instrument achieved considerable breadth of coverage in each content domain and demonstrated noteworthy psychometric properties. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Confirmatory factor analysis and measurement invariance of the Child Feeding Questionnaire in low-income Hispanic and African-American mothers with preschool-age children.

PubMed

Kong, Angela; Vijayasiri, Ganga; Fitzgibbon, Marian L; Schiffer, Linda A; Campbell, Richard T

2015-07-01

Validation work of the Child Feeding Questionnaire (CFQ) in low-income minority samples suggests a need for further conceptual refinement of this instrument. Using confirmatory factor analysis, this study evaluated 5- and 6-factor models on a large sample of African-American and Hispanic mothers with preschool-age children (n = 962). The 5-factor model included: 'perceived responsibility', 'concern about child's weight', 'restriction', 'pressure to eat', and 'monitoring' and the 6-factor model also tested 'food as a reward'. Multi-group analysis assessed measurement invariance by race/ethnicity. In the 5-factor model, two low-loading items from 'restriction' and one low-variance item from 'perceived responsibility' were dropped to achieve fit. Only removal of the low-variance item was needed to achieve fit in the 6-factor model. Invariance analyses demonstrated differences in factor loadings. This finding suggests African-American and Hispanic mothers may vary in their interpretation of some CFQ items and use of cognitive interviews could enhance item interpretation. Our results also demonstrated that 'food as a reward' is a plausible construct among a low-income minority sample and adds to the evidence that this factor resonates conceptually with parents of preschoolers; however, further testing is needed to determine the validity of this factor with older age groups. Copyright © 2015 Elsevier Ltd. All rights reserved.
Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

PubMed

Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

2014-05-01

The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.
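The descriptive classical-test-theory checks named in this abstract (response-category frequencies, floor and ceiling effects) can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names, the 0-20 score range, and the data are all invented for the example.

```python
from collections import Counter


def floor_ceiling(scores, min_score, max_score):
    """Return the share of respondents at the lowest and highest possible score."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    floor = sum(1 for s in scores if s == min_score) / n
    ceiling = sum(1 for s in scores if s == max_score) / n
    return floor, ceiling


def category_frequencies(responses):
    """Count how often each response category was chosen for a single item."""
    return dict(sorted(Counter(responses).items()))


if __name__ == "__main__":
    # Made-up total scores on a hypothetical 0-20 scale (illustration only).
    scores = [0, 0, 3, 7, 9, 12, 15, 20, 20, 20]
    floor, ceiling = floor_ceiling(scores, min_score=0, max_score=20)
    print(f"floor: {floor:.0%}, ceiling: {ceiling:.0%}")

    # Made-up responses to one item with categories 1-4.
    print(category_frequencies([1, 2, 2, 3, 4, 4, 4]))
```

A large share of respondents at either extreme suggests the scale cannot distinguish among them, which is the kind of preliminary signal the abstract recommends gathering before fitting IRT models.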
Integrating Prospective Longitudinal Data: Modeling Personality and Health in the Terman Life Cycle and Hawaii Longitudinal Studies

PubMed Central

Kern, Margaret L.; Hampson, Sarah E.; Goldberg, Lewis R.; Friedman, Howard S.

2013-01-01

The present study used a collaborative framework to integrate two long-term prospective studies: the Terman Life Cycle Study and the Hawaii Personality and Health Longitudinal Study. Using a five-factor personality-trait framework, teacher assessments of child personality were rationally and empirically aligned to establish similar factor structures across samples. Comparable items related to adult self-rated health, education, and alcohol use were harmonized, and data were pooled on harmonized items. A structural model was estimated, allowing paths to differ by sample. Harmonized child personality factors were then used to examine markers of physiological dysfunction in the Hawaii sample and mortality risk in the Terman sample. Harmonized conscientiousness predicted less physiological dysfunction in the Hawaii sample and lower mortality risk in the Terman sample. These results illustrate how collaborative, integrative work with multiple samples offers the exciting possibility that samples from different cohorts and ages can be linked together to directly test lifespan theories of personality and health. PMID:23231689
Construct validity and reliability of the Music Attentiveness Screening Assessment (MASA).

PubMed

Waldon, Eric G; Broadhurst, Emily

2014-01-01

Music as alternate engagement (MAE) can be used effectively to distract children during painful or anxiety-provoking medical procedures. For such interventions to be successful, it would seem important to assess the degree to which a child can attend to musical stimuli. The purposes of this study were as follows: (a) To establish construct validity by determining the extent to which the Music Attentiveness Screening Assessment (MASA) measures auditory attention; and (b) to gather evidence regarding MASA test-retest and inter-observer reliability. The Auditory Attention (AA) subtest from the NEPSY-II (NEPSY, Second Edition) and the two items from MASA were administered to a nonclinical sample of children (N = 50) aged 5 to 9 years. There was a statistically significant proportion of AA score variance shared with MASA (both items), R² = .21, F(2, 47) = 6.34, p = .004. Test-retest reliability on the first MASA item was moderately high (Pearson r = .84) while on the second item it was lower (r = .63). Similarly, interobserver agreement was high for Item I (intraclass correlation coefficient [ICC] = .95) and lower for Item II (ICC = .71). Evidence suggests that MASA measures, at least in part, auditory attention. Despite this finding, a large proportion of unexplained variance remains. Furthermore, reliability estimates (test-retest and interobserver agreement) differ between the two items. These findings are discussed with particular attention paid to the ways in which MASA should be revised and further study conducted. © the American Music Therapy Association 2014. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Development and reliability testing of a self-report instrument to measure the office layout as a correlate of occupational sitting.

PubMed

Duncan, Mitch J; Rashid, Mahbub; Vandelanotte, Corneel; Cutumisu, Nicoleta; Plotnikoff, Ronald C

2013-02-04

Spatial configurations of office environments assessed by Space Syntax methodologies are related to employee movement patterns. These methods require analysis of floor plans, which are not readily available in large population-based studies or are otherwise unavailable. Therefore, a self-report instrument to assess spatial configurations of office environments using four scales was developed. The scales are: local connectivity (16 items), overall connectivity (11 items), visibility of co-workers (10 items), and proximity of co-workers (5 items). A panel cohort (N = 1154) completed an online survey; only data from individuals employed in office-based occupations (n = 307) were used to assess scale measurement properties. To assess test-retest reliability, a separate sample of 37 office-based workers completed the survey on two occasions 7.7 (±3.2) days apart. Redundant scale items were eliminated using factor analysis; Cronbach's α was used to evaluate internal consistency, and intraclass correlation coefficients (retest-ICC) were used to evaluate test-retest reliability. ANOVA was employed to examine differences between office types (Private, Shared, Open) as a measure of construct validity. Generalized Linear Models were used to examine relationships between spatial configuration scales and the duration of and frequency of breaks in occupational sitting. The number of items on all scales was reduced; Cronbach's α and ICCs indicated good scale internal consistency and test-retest reliability: local connectivity (5 items; α = 0.70; retest-ICC = 0.84), overall connectivity (6 items; α = 0.86; retest-ICC = 0.87), visibility of co-workers (4 items; α = 0.78; retest-ICC = 0.86), and proximity of co-workers (3 items; α = 0.85; retest-ICC = 0.70). Significant (p ≤ 0.001) differences, in theoretically expected directions, were observed for all scales between office types, except overall connectivity. Significant associations were observed between all scales and occupational sitting behaviour (p ≤ 0.05). All scales have good measurement properties, indicating the instrument may be a useful alternative to Space Syntax to examine environmental correlates of occupational sitting in population surveys.
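For readers unfamiliar with the internal-consistency statistic reported here, Cronbach's α for a k-item scale is k/(k-1) × (1 − Σ item variances / variance of the total score). The sketch below computes it with the Python standard library. The function name and the response data are invented for illustration and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer the same k items")
    columns = list(zip(*rows))                     # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(r) for r in rows])   # variance of the summed scale score
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Made-up responses from five people to a hypothetical 3-item scale.
    responses = [
        [4, 5, 4],
        [2, 2, 3],
        [5, 4, 5],
        [1, 2, 1],
        [3, 3, 4],
    ]
    print(round(cronbach_alpha(responses), 2))
```

Sample variances (n − 1 denominator) are used for both the items and the total, so the choice of denominator cancels in the ratio.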
Development and reliability testing of a self-report instrument to measure the office layout as a correlate of occupational sitting

PubMed Central

2013-01-01

Background: Spatial configurations of office environments assessed by Space Syntax methodologies are related to employee movement patterns. These methods require analysis of floor plans, which are not readily available in large population-based studies or are otherwise unavailable. Therefore, a self-report instrument to assess spatial configurations of office environments using four scales was developed. Methods: The scales are: local connectivity (16 items), overall connectivity (11 items), visibility of co-workers (10 items), and proximity of co-workers (5 items). A panel cohort (N = 1154) completed an online survey; only data from individuals employed in office-based occupations (n = 307) were used to assess scale measurement properties. To assess test-retest reliability, a separate sample of 37 office-based workers completed the survey on two occasions 7.7 (±3.2) days apart. Redundant scale items were eliminated using factor analysis; Cronbach's α was used to evaluate internal consistency, and intraclass correlation coefficients (retest-ICC) were used to evaluate test-retest reliability. ANOVA was employed to examine differences between office types (Private, Shared, Open) as a measure of construct validity. Generalized Linear Models were used to examine relationships between spatial configuration scales and the duration of and frequency of breaks in occupational sitting. Results: The number of items on all scales was reduced; Cronbach's α and ICCs indicated good scale internal consistency and test-retest reliability: local connectivity (5 items; α = 0.70; retest-ICC = 0.84), overall connectivity (6 items; α = 0.86; retest-ICC = 0.87), visibility of co-workers (4 items; α = 0.78; retest-ICC = 0.86), and proximity of co-workers (3 items; α = 0.85; retest-ICC = 0.70). Significant (p ≤ 0.001) differences, in theoretically expected directions, were observed for all scales between office types, except overall connectivity. Significant associations were observed between all scales and occupational sitting behaviour (p ≤ 0.05). Conclusion: All scales have good measurement properties, indicating the instrument may be a useful alternative to Space Syntax to examine environmental correlates of occupational sitting in population surveys. PMID:23379485
Design and validation of a questionnaire to assess organizational culture in French hospital wards.

PubMed

Saillour-Glénisson, F; Domecq, S; Kret, M; Sibe, M; Dumond, J P; Michel, P

2016-09-17

Although many organizational culture questionnaires have been developed, no validated multidimensional questionnaire assesses organizational culture at hospital ward level and is adapted to the health care context. Facing the lack of an appropriate tool, a multidisciplinary team designed and validated a dimensional organizational culture questionnaire for healthcare settings to be administered at ward level. A database of organizational culture items and themes was created after extensive literature review. Items were regrouped into dimensions and subdimensions (classification validated by experts). Pre-test and face validation were conducted with 15 health care professionals. In a stratified cluster random sample of hospitals, the psychometric validation was conducted in three phases on a sample of 859 healthcare professionals from 36 multidisciplinary medicine services: 1) the exploratory phase included a description of responses' saturation levels, factor and correlation analyses, and an internal consistency analysis (Cronbach's alpha coefficient); 2) the confirmatory phase used structural equation modeling (SEM); 3) reproducibility was studied by a test-retest. The overall response rate was 80%; the average completion rate was 97%. The metrological results were: a global Cronbach's alpha coefficient of 0.93, higher than 0.70 for 12 sub-dimensions; all Dillon-Goldstein's rho coefficients higher than 0.70; an excellent quality of external model with a Goodness of Fit (GoF) criterion of 0.99. Seventy percent of the items had a reproducibility ranging from moderate (intraclass correlation coefficient [ICC] between 50% and 70% for 25 items) to good (ICC higher than 70% for 33 items). The COMEt (Contexte Organisationnel et Managérial en Etablissement de Santé) questionnaire is a validated multidimensional organizational culture questionnaire made of 6 dimensions, 21 sub-dimensions and 83 items. It is the first dimensional organizational culture questionnaire, specific to the healthcare context, for a unit-level assessment showing robust psychometric properties (validity and reliability). This tool is suited for research purposes, especially for assessing organizational context in research analysing the effectiveness of hospital quality improvement strategies. Our tool is also suited for an overall assessment of ward culture and could be a powerful trigger to improve management and clinical performance. Its psychometric properties in other health systems need to be tested.
Multi-technique quantitative analysis and socioeconomic considerations of lead, cadmium, and arsenic in children's toys and toy jewelry.

PubMed

Hillyer, Margot M; Finch, Lauren E; Cerel, Alisha S; Dattelbaum, Jonathan D; Leopold, Michael C

2014-08-01

A wide spectrum and large number of children's toys and toy jewelry items were purchased from both bargain and retail vendors and analyzed for arsenic, cadmium, and lead metal content using multiple analytical techniques, including flame and furnace atomic absorption spectroscopy as well as X-ray fluorescence spectroscopy. Because these metals are particularly dangerous for young children, their concentrations in toys/toy jewelry were assessed for compliance with current Consumer Product Safety Commission (CPSC) regulations (F963-11). A conservative metric involving multiple analytical techniques was used to categorize compliance: confirmation by one technique of metal in excess of CPSC limits indicated a "suspect" item, while confirmation by two different techniques warranted a non-compliant designation. Sample matrix-based standard addition provided additional confirmation of non-compliant and suspect products. Results suggest that origin of purchase, rather than cost, is a significant factor in the risk assessment of these materials, with 57% of toys/toy jewelry items from bargain stores non-compliant or suspect compared to only 15% from retail outlets and 13% if only low-cost items from the retail stores are compared. While jewelry was found to be the most problematic product (73% of non-compliant/suspect samples), lead (45%) and arsenic (76%) were the most dominant toxins found in non-compliant/suspect samples. Using the greater Richmond area as a model, the discrepancy between bargain and retail children's products, along with growing numbers of bargain stores in low-income and urban areas, exemplifies an emerging socioeconomic public health issue. Copyright © 2014 Elsevier Ltd. All rights reserved.
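The two-tier categorization rule described in this abstract (one technique over the limit means "suspect", two or more means "non-compliant") can be written as a small function. This is a sketch of the stated rule only: the function name, the "compliant" label for items with no exceedance, the technique keys, and the numeric limit are assumptions made for the example, not values from the study or the regulation.

```python
def classify_item(measurements, limit):
    """Classify one item from per-technique metal concentrations.

    measurements: dict mapping technique name -> measured concentration
    limit: regulatory limit in the same units (placeholder value in this sketch)
    """
    exceeding = [tech for tech, value in measurements.items() if value > limit]
    if len(exceeding) >= 2:
        return "non-compliant"   # confirmed by two different techniques
    if len(exceeding) == 1:
        return "suspect"         # confirmed by a single technique only
    return "compliant"           # label assumed here; the abstract names only the other two


if __name__ == "__main__":
    HYPOTHETICAL_LIMIT = 100.0   # placeholder, not an actual CPSC limit
    item = {"flame_AAS": 140.0, "furnace_AAS": 95.0, "XRF": 120.0}
    print(classify_item(item, HYPOTHETICAL_LIMIT))
```

Requiring agreement between two independent techniques before calling an item non-compliant is what makes the metric conservative: a single high reading is flagged but not treated as confirmed.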

Assessment of fatigue in rheumatoid arthritis: a psychometric comparison of single-item, multiitem, and multidimensional measures.

PubMed

Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Bode, Christina; Vonkeman, Harald E; Glas, Cees A W; Jansen, Tim; van Albada-Kuipers, Iet; van Riel, Piet L C M; van de Laar, Mart A F J

2015-03-01

To compare the psychometric functioning of multidimensional disease-specific, multiitem generic, and single-item measures of fatigue in patients with rheumatoid arthritis (RA). Confirmatory factor analysis (CFA) and longitudinal item response theory (IRT) modeling were used to evaluate the measurement structure and local reliability of the Bristol RA Fatigue Multi-Dimensional Questionnaire (BRAF-MDQ), the Medical Outcomes Study Short Form-36 (SF-36) vitality scale, and the BRAF Numerical Rating Scales (BRAF-NRS) in a sample of 588 patients with RA. A 1-factor CFA model yielded a similar fit to a 5-factor model with subscale-specific dimensions, and the items from the different instruments adequately fit the IRT model, suggesting essential unidimensionality in measurement. The SF-36 vitality scale outperformed the BRAF-MDQ at lower levels of fatigue, but was less precise at moderate to higher levels of fatigue. At these levels of fatigue, the living, cognition, and emotion subscales of the BRAF-MDQ provide additional precision. The BRAF-NRS showed a limited measurement range with its highest precision centered on average levels of fatigue. The different instruments appear to access a common underlying domain of fatigue severity, but differ considerably in their measurement precision along the continuum. The SF-36 vitality scale can be used to measure fatigue severity in samples with relatively mild fatigue. For samples expected to have higher levels of fatigue, the multidimensional BRAF-MDQ appears to be a better choice. The BRAF-NRS are not recommended if precise assessment is required, for instance in longitudinal settings.
Questionnaire of core beliefs related to drug use and craving for assessment of relapse risk.

PubMed

Martínez-González, José Miguel; Vilar López, Raquel; Lozano-Rojas, Oscar; Verdejo-García, Antonio

2017-07-12

This study was aimed at designing a questionnaire for the assessment of addiction-related core beliefs and craving. The sample comprised 215 patients (85.8% males and 14.2% females) in treatment for dependence on alcohol (40%), cocaine (36.3%) and cannabis (23.7%). Descriptive statistics were used to characterize the sample. Variance, regression and factorial analyses were conducted to study the questionnaire structure and its relation with variables such as abstinence and craving. Items about drug-related beliefs yielded a four-factor structure: what patients think they could not do without drug use, lack of withdrawal, conditions required to use drugs again, and use of drugs as the only way to feel good. Items related to craving yielded three factors: negative emotions as precipitants of drug use, positive emotions, and difficulties attributed to coping with craving. Furthermore, beliefs were more important than abstinence time in predicting craving. The present questionnaire allows the assessment of a set of significant factors for designing relapse prevention programs.
Bem Sex Role Inventory Validation in the International Mobility in Aging Study.

PubMed

Ahmed, Tamer; Vafaei, Afshin; Belanger, Emmanuelle; Phillips, Susan P; Zunzunegui, Maria-Victoria

2016-09-01

This study investigated the measurement structure of the Bem Sex Role Inventory (BSRI) with different factor analysis methods. Most previous studies on validity applied exploratory factor analysis (EFA) to examine the BSRI. We aimed to assess the psychometric properties and construct validity of the 12-item short-form BSRI administered to a sample of 1,995 older adults from wave 1 of the International Mobility in Aging Study (IMIAS). We used Cronbach's alpha to assess internal consistency reliability and confirmatory factor analysis (CFA) to assess psychometric properties. EFA revealed a three-factor model, further confirmed by CFA and compared with the original two-factor structure model. Results revealed that a two-factor solution (instrumentality-expressiveness) has satisfactory construct validity and superior fit to data compared to the three-factor solution. The two-factor solution confirms expected gender differences in older adults. The 12-item BSRI provides a brief, psychometrically sound, and reliable instrument in international samples of older adults.
Development and initial validation of the appropriate antibiotic use self-efficacy scale.

PubMed

Hill, Erin M; Watkins, Kaitlin

2018-06-04

While there are various medication self-efficacy scales that exist, none assess self-efficacy for appropriate antibiotic use. The Appropriate Antibiotic Use Self-Efficacy Scale (AAUSES) was developed, pilot tested, and its psychometric properties were examined. Following pilot testing of the scale, a 28-item questionnaire was examined using a sample (n = 289) recruited through the Amazon Mechanical Turk platform. Participants also completed other scales and items, which were used in assessing discriminant, convergent, and criterion-related validity. Test-retest reliability was also examined. After examining the scale and removing items that did not assess appropriate antibiotic use, an exploratory factor analysis was conducted on 13 items from the original scale. Three factors were retained that explained 65.51% of the variance. The scale and its subscales had adequate internal consistency. The scale had excellent test-retest reliability, as well as demonstrated convergent, discriminant, and criterion-related validity. The AAUSES is a valid and reliable scale that assesses three domains of appropriate antibiotic use self-efficacy. The AAUSES may have utility in clinical and research settings in understanding individuals' beliefs about appropriate antibiotic use and related behavioral correlates. Future research is needed to examine the scale's utility in these settings. Copyright © 2018 Elsevier B.V. All rights reserved.
Scale Refinement and Initial Evaluation of a Behavioral Health Function Measurement Tool for Work Disability Evaluation

PubMed Central

Marfeo, Elizabeth E.; Ni, Pengsheng; Bogusz, Kara; Meterko, Mark; McDonough, Christine M.; Chan, Leighton; Rasch, Elizabeth K.; Brandt, Diane E.; Jette, Alan M.

2014-01-01

Objectives: To use item response theory (IRT) data simulations to construct and perform initial psychometric testing of a newly developed instrument, the Social Security Administration Behavioral Health Function (SSA-BH) instrument, that aims to assess behavioral health functioning relevant to the context of work. Design: Cross-sectional survey followed by item response theory (IRT) calibration data simulations. Setting: Community. Participants: A sample of individuals applying for SSA disability benefits, claimants (N=1015), and a normative comparative sample of US adults (N=1000). Interventions: None. Main Outcome Measure: Social Security Administration Behavioral Health Function (SSA-BH) measurement instrument. Results: Item response theory analyses supported the unidimensionality of four SSA-BH scales: Mood and Emotions (35 items), Self-Efficacy (23 items), Social Interactions (6 items), and Behavioral Control (15 items). All SSA-BH scales demonstrated strong psychometric properties including reliability, accuracy, and breadth of coverage. High correlations of the simulated 5- or 10-item CATs with the full item bank indicated robust ability of the CAT approach to comprehensively characterize behavioral health function along four distinct dimensions. Conclusions: Initial testing and evaluation of the SSA-BH instrument demonstrated good accuracy, reliability, and content coverage along all four scales. Behavioral function profiles of SSA claimants were generated and compared to age- and sex-matched norms along four scales: Mood and Emotions, Behavioral Control, Social Interactions, and Self-Efficacy. Utilizing the CAT-based approach offers the ability to collect standardized, comprehensive functional information about claimants in an efficient way, which may prove useful in the context of the SSA’s work disability programs. PMID:23542404
Development and Validation of the PROMIS Pediatric Sleep Disturbance and Sleep-Related Impairment Item Banks.

PubMed

Forrest, Christopher B; Meltzer, Lisa J; Marcus, Carole L; de la Motte, Anna; Kratchman, Amy; Buysse, Daniel J; Pilkonis, Paul A; Becker, Brandon D; Bevans, Katherine B

2018-03-13

To develop and evaluate the measurement properties of child-report and parent-proxy versions of the PROMIS® Pediatric Sleep Disturbance and Sleep-Related Impairment item banks. A national sample of 1,104 children (8-17 years old) and 1,477 parents of children 5-17 years old was recruited from an internet panel to evaluate the psychometric properties of 43 sleep health items. A convenience sample of children and parents recruited from a pediatric sleep clinic was obtained to provide evidence of the measures' validity; polysomnography data were collected from a subgroup of these children. Factor analyses suggested two dimensions: sleep disturbance and daytime sleep-related impairment. The final item banks included 15 items for Sleep Disturbance and 13 for Sleep-Related Impairment. Items were calibrated using the graded response model from item response theory. Of the 28 items, 16 are included in the parallel PROMIS adult sleep health measures. Reliability of the measures exceeded 0.90. Validity was supported by correlations with existing measures of pediatric sleep health and higher sleep disturbance and sleep-related impairment scores for children with sleep problems and those with chronic and neurodevelopmental disorders. The sleep health measures were not correlated with results from polysomnography. The PROMIS Pediatric Sleep Disturbance and Sleep-Related Impairment item banks provide subjective assessments of a child's difficulties falling and staying asleep as well as daytime sleepiness and its impact on functioning. They may prove useful in the future for clinical research and practice. Future research should evaluate their responsiveness to clinical change in diverse patient populations.
Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

PubMed

Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

2015-08-19

The four-factor structure of the two 8-item short forms of the Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of oral health-related quality of life (OHRQoL), which implies a unidimensional structure. This study first assessed the unidimensionality of the 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach to validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and the local dependency (LD) statistic. A graded response model was fitted to the data. The contribution of each item to the scale was assessed by the item information function (IIF). Reliability of the scale was assessed by the test information function (TIF). Differential item functioning (DIF) across gender was identified by the Wald test and expected score functions. Neither CPQ11-14 RSF:8 nor ISF:8 deviated much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30% of the total variation, with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistics were <10, indicating the absence of local dependency. Flat and low IIFs were observed for the oral symptoms items, suggesting that they contributed little information to the scale and that removing them had little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to the oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p < 0.001). The expected score functions were not much different between boys and girls. Items related to oral symptoms were not informative about OHRQoL, and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms suggested by IRT validation should be further investigated to ensure their robustness, responsiveness and discriminative performance.
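As background to the graded response model fitted in this study, the sketch below computes category probabilities for one ordered-response item under Samejima's model: the cumulative probability of responding in category k or higher is a logistic function of a(θ − b_k), and each category probability is the difference between adjacent cumulative probabilities. The function name and the parameter values are invented for illustration and are not estimates from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait level of the respondent
    a: item discrimination
    thresholds: increasing b_1 < ... < b_(m-1) for an item with m ordered categories
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    # Cumulative P(X >= k) for k = 1..m-1, padded with 1.0 (k = 0) and 0.0 (k = m).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


if __name__ == "__main__":
    # Hypothetical 4-category item; parameter values chosen only for illustration.
    probs = grm_category_probs(theta=0.5, a=1.5, thresholds=[-1.0, 0.0, 1.0])
    for category, p in enumerate(probs):
        print(f"category {category}: {p:.3f}")
```

An item with low discrimination a yields category probabilities that change little across θ, which corresponds to the flat, low information functions the abstract reports for the oral symptoms items.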
Adaptation of the Quality Indicator for Rehabilitative Care (QuIRC) for use in mental health supported accommodation services (QuIRC-SA).

PubMed

Killaspy, Helen; White, Sarah; Dowling, Sarah; Krotofil, Joanna; McPherson, Peter; Sandhu, Sima; Arbuthnott, Maurice; Curtis, Sarah; Leavey, Gerard; Priebe, Stefan; Shepherd, Geoff; King, Michael

2016-04-14

No standardised tools for assessing the quality of specialist mental health supported accommodation services exist. To address this, we adapted the Quality Indicator for Rehabilitative Care (QuIRC), which was originally developed to assess the quality of longer term inpatient and community based mental health facilities. The QuIRC, which is completed by the service manager and gives ratings of seven domains of care, has good psychometric properties. Focus groups with staff of the three main types of supported accommodation in the UK (residential care, supported housing and floating outreach services) were carried out to identify potential amendments to the QuIRC. Additional advice was gained from consultation with three expert panels, two of which comprised service users with lived experience of mental health and supported accommodation services. The amended QuIRC (QuIRC-SA) was piloted with a manager of each of the three service types. Item response variance, inter-rater reliability and internal consistency were assessed in a random sample of 52 services. Factorial structure and discriminant validity were assessed in a larger random sample of 87 services. The QuIRC-SA comprised 143 items, of which only 18 items showed a narrow range of response and five items had poor inter-rater reliability. The tool showed good discriminant validity, with supported housing services generally scoring higher than the other two types of supported accommodation on most domains. Exploratory factor analysis showed that the QuIRC-SA items loaded onto the domains to which they had been allocated. The QuIRC-SA is the first standardised tool for quality assessment of specialist mental health supported accommodation services. Its psychometric properties mean that it has potential for use in research as well as audit and quality improvement programmes. A web-based application is being developed to make the tool more accessible; it will produce a printable report for the service manager about the performance of their service, comparison data for similar services and suggestions on how to improve service quality.
Parallel short forms for the assessment of activities of daily living in cardiovascular rehabilitation patients (PADL-cardio): development and validation.

PubMed

Schmucker, Andreas; Abberger, Birgit; Boecker, Maren; Baumeister, Harald

2017-11-26

To develop and validate parallel short forms for the assessment of activities of daily living in cardiac rehabilitation patients (PADL-cardio I & II). PADL-cardio I & II were developed based on a sample of 106 patients [mean age = 57.6; standard deviation (SD) = 11.1; 72.6% males] using Rasch analysis and validated with a sample of 81 patients (mean age = 59.1; SD = 11.1; 88.9% males). All patients answered PADL-cardio and the Short Form 12 Health Survey. Both versions of PADL-cardio are composed of 10 items. Fit to the Rasch model was documented by a non-significant item-trait interaction score (PADL-cardio I: χ² = 31.08, df = 30, p = 0.41; PADL-cardio II: χ² = 45.6, df = 40, p = 0.25). The two versions were free of differential item functioning. Person-separation reliability was 0.72/0.78 and unidimensionality was given. The two versions correlated with r = 0.98 and the correlation between PADL-cardio and the underlying item bank was 0.99 for both versions. Concurrent validity is indicated through correlations with the Short Form 12 Health Survey (r = -0.37 to -0.40). PADL-cardio provides a short and psychometrically sound option for the assessment of activities of daily living in cardiovascular rehabilitation patients. The two versions of PADL-cardio are equivalent. Hence, they can be used to reduce practice and retest effects in repeated measurement, facilitating the longitudinal assessment of activities of daily living. Implications for Rehabilitation: New parallel test forms for the assessment of activities of daily living in cardiac rehabilitation (PADL-cardio I & PADL-cardio II) are available. PADL-cardio I & II consist of 10 items and are therefore especially timesaving. Concurrent validity is given through correlations with the Short Form 12 Health Survey. Therapeutic success could be determined more precisely by the parallel forms reducing practice and retest effects.
Development of the Assessment of Belief Conflict in Relationship-14 (ABCR-14)

PubMed Central

Kyougoku, Makoto; Teraoka, Mutsumi; Masuda, Noriko; Ooura, Mariko; Abe, Yasushi

2015-01-01

Purpose: Nurses and other healthcare workers frequently experience belief conflict, one of the most important new stress-related problems in both academic and clinical fields. Methods: In this study, using a sample of 1,683 nursing practitioners, we developed the Assessment of Belief Conflict in Relationship-14 (ABCR-14), a new scale that assesses belief conflict in the healthcare field. Standard psychometric procedures were used to develop and test the scale, including a qualitative framework concept and item-pool development, item reduction, and scale development. We analyzed the psychometric properties of ABCR-14 according to entropy, polyserial correlation coefficient, exploratory factor analysis, confirmatory factor analysis, average variance extracted, Cronbach’s alpha, Pearson product-moment correlation coefficient, and multidimensional item response theory (MIRT). Results: The results of the analysis supported a three-factor model consisting of 14 items. The validity and reliability of ABCR-14 were suggested by evidence from high construct validity, structural validity, hypothesis testing, internal consistency reliability, and concurrent validity. The result of the MIRT offered strong support for good item response of item slope parameters and difficulty parameters. However, the ABCR-14 Likert scale might need to be explored from the MIRT point of view. Yet, as mentioned above, there is sufficient evidence to support that ABCR-14 has high validity and reliability. Conclusion: The ABCR-14 demonstrates good psychometric properties for nursing belief conflict. Further studies are recommended to confirm its application in clinical practice. PMID:26247356
Measuring family functioning in families with parental cancer: Reliability and validity of the German adaptation of the Family Assessment Device (FAD).

PubMed

Beierlein, Volker; Bultmann, Johanna Christine; Möller, Birgit; von Klitzing, Kai; Flechtner, Hans-Henning; Resch, Franz; Herzog, Wolfgang; Brähler, Elmar; Führer, Daniel; Romer, Georg; Koch, Uwe; Bergelt, Corinna

2017-02-01

The concept of family functioning is gaining importance in psycho-oncology research and health care services. The Family Assessment Device (FAD) is a well-established measure of family functioning. Psychometric properties inherent in the German 51-item adaptation of the FAD are examined in different samples of families with parental cancer. Acceptance, reliability, and validity of FAD scales are analysed in samples from different study settings (N=1701 cancer patients, N=261 partners, N=158 dependent adolescent children 11 to 18 years old). Missing items in the FAD scales (acceptance) are rare for adults (<1.1%) and adolescent children (<4.4%). In samples of adults and older adolescents (15 to 18 years), all FAD scales except for the Roles scale are significantly reliable (0.75≤Cronbach's α≤0.88). The scales correlate highly (0.46≤Pearson's r≤0.59) with the criterion satisfaction with family life (convergent validity), and have smaller correlations (0.16≤r≤0.49) with measures of emotional distress and subjective well-being (divergent validity). In most FAD scales, adults seeking family counselling report worse family functioning (0.24≤Cohen's d≤0.59) than adults in other samples with parental cancer (discriminative validity). Overall, the German 51-item adaptation of the FAD reveals good acceptance, reliability, and validity for cancer patients and their relatives. Particularly the scale General Functioning shows excellent psychometric properties. The FAD is suitable in the assessment of families with parental cancer for adults and adolescents older than 11 years. Copyright © 2016 Elsevier Inc. All rights reserved.
Psychometric properties of the Arabic version of the 12-item diabetes fatalism scale

PubMed Central

Abi Kharma, Joelle

2018-01-01

Background: There are widespread fatalistic beliefs in Arab countries, especially among individuals with diabetes. However, there is no tool to assess diabetes fatalism in this population. This study describes the processes used to create an Arabic version of the Diabetes Fatalism Scale (DFS) and examine its psychometric properties. Methods: A descriptive correlational design was used with a convenience sample of Lebanese adults (N = 274) with type 2 diabetes recruited from a major hospital in Beirut, Lebanon and by snowball sampling. The 12-item Diabetes Fatalism Scale-Arabic (12-item DFS-Ar) was back-translated from the original version, pilot tested on 22 adults with type 2 diabetes and then administered to 274 patients to assess the validity and reliability of the scale. Confirmatory factor analysis (CFA) was used to test the hypothesized factor structure. Cronbach’s alpha was used to test for reliability. Results: CFA supported the three-factor hypothesis of the original DFS scale. The five items measuring “emotional distress” loaded under Factor 1, the four items measuring “spiritual coping” loaded under Factor 2 and the last three items measuring “perceived self-efficacy” of the original scale loaded under Factor 3 (p < 0.001 for all three subscales). Goodness of fit indices confirmed the adequacy of the CFA model (CFI = 0.97, TLI = 0.96, RMSEA = 0.067 and pclose = 0.05). The 12-item DFS-Ar showed good reliability (Cronbach’s alpha of 0.86) and significantly predicted HbA1c (β = 0.20, p < 0.01). After adjusting for the demographic characteristics and the number of diabetes comorbid conditions, the 12-item DFS-Ar score was independently associated with HbA1c in a multivariable model (β = 0.16, p < 0.05). Conclusions: The 12-item DFS-Ar demonstrated good psychometric properties that are comparable to the original scale. It is a valid and reliable measure of diabetes fatalism. Further testing with larger and non-Lebanese Arabic populations is needed. PMID:29324827
Computer-adaptive test to measure community reintegration of Veterans.

PubMed

Resnik, Linda; Tian, Feng; Ni, Pengsheng; Jette, Alan

2012-01-01

The Community Reintegration of Injured Service Members (CRIS) measure consists of three scales measuring extent of, perceived limitations in, and satisfaction with community reintegration. Length of the CRIS may be a barrier to its widespread use. Using item response theory (IRT) and computer-adaptive test (CAT) methodologies, this study developed and evaluated a briefer community reintegration measure called the CRIS-CAT. Large item banks for each CRIS scale were constructed. A convenience sample of 517 Veterans responded to all items. Exploratory and confirmatory factor analyses (CFAs) were used to identify the dimensionality within each domain, and IRT methods were used to calibrate items. Accuracy and precision of CATs of different lengths were compared with the full-item bank, and data were examined for differential item functioning (DIF). CFAs supported unidimensionality of scales. Acceptable item fit statistics were found for final models. Accuracy of 10-, 15-, 20-, and variable-item CATs for all three scales was 0.88 or above. CAT precision increased with number of items administered and decreased at the upper ranges of each scale. Three items exhibited moderate DIF by sex. The CRIS-CAT demonstrated promising measurement properties and is recommended for use in community reintegration assessment.
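The item-selection step of a computer-adaptive test, as described in the record above, can be sketched roughly as follows. This is a minimal illustration only: it assumes a dichotomous two-parameter logistic (2PL) model and maximum-information selection, neither of which the abstract confirms for the CRIS-CAT, and the item bank, item identifiers, and parameter values are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under a 2PL model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, bank, administered):
    """Return the not-yet-administered item with maximum information at theta."""
    candidates = [item for item in bank if item not in administered]
    return max(candidates, key=lambda item: item_information(theta, *bank[item]))


def standard_error(theta, bank, administered):
    """Approximate standard error of theta from the items given so far."""
    total = sum(item_information(theta, *bank[item]) for item in administered)
    return 1.0 / math.sqrt(total)


# Hypothetical bank: item id -> (discrimination a, difficulty b)
bank = {"q1": (1.2, -1.0), "q2": (1.5, 0.0), "q3": (0.8, 1.5)}

first = next_item(0.0, bank, set())
second = next_item(0.0, bank, {first})
print(first, second)
```

Under these assumptions the standard error shrinks as more items are administered, which matches the abstract's observation that CAT precision increased with the number of items given.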
The Meaningful Activity Participation Assessment: A Measure of Engagement in Personally Valued Activities

ERIC Educational Resources Information Center

Eakman, Aaron M.; Carlson, Mike E.; Clark, Florence A.

2010-01-01

The Meaningful Activity Participation Assessment (MAPA), a recently developed 28-item tool designed to measure the meaningfulness of activity, was tested in a sample of 154 older adults. The MAPA evidenced a sufficient level of internal consistency and test-retest reliability and correlated as theoretically predicted with the Life Satisfaction…
Understanding Associations of Control Beliefs, Social Relations, and Well-Being in Older Adults with Osteoarthritis

ERIC Educational Resources Information Center

Ferreira, Vanessa M.; Sherman, Aurora M.

2006-01-01

Control beliefs and social relationships have been individually assessed in relation to adaptation to chronic illness, although only rarely together. Further, some control scales show psychometric limitations in older adult samples. To address these concerns, a scale assessing external control was created by factor analyzing the items from…
Validation of general job satisfaction in the Korean Labor and Income Panel Study.

PubMed

Park, Shin Goo; Hwang, Sang Hee

2017-01-01

The purpose of this study is to assess the validity and reliability of general job satisfaction (JS) in the Korean Labor and Income Panel Study (KLIPS). We used the data from the 17th wave (2014) of the nationwide KLIPS, which selected a representative panel sample of Korean households and individuals aged 15 or older residing in urban areas. We included in this study 7679 employed subjects (4529 males and 3150 females). The general JS instrument consisted of five items rated on a scale from 1 (strongly disagree) to 5 (strongly agree). The general JS reliability was assessed using the corrected item-total correlation and Cronbach's alpha coefficient. The validity of general JS was assessed using confirmatory factor analysis (CFA) and Pearson's correlation. The corrected item-total correlations ranged from 0.736 to 0.837. Therefore, no items were removed. Cronbach's alpha for general JS was 0.925, indicating excellent internal consistency. The CFA of the general JS model showed a good fit. Pearson's correlation coefficients for convergent validity showed moderate or strong correlations. The results obtained in our study confirm the validity and reliability of general JS.
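The two reliability statistics named in the record above, Cronbach's alpha and the corrected item-total correlation, can be computed along these lines. This is a hedged sketch using the standard textbook formulas, not the authors' code; the function names and the small response matrix are hypothetical and are not KLIPS data.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_columns = list(zip(*rows))
    totals = [sum(row) for row in rows]
    sum_item_var = sum(variance(col) for col in item_columns)
    return k / (k - 1) * (1.0 - sum_item_var / variance(totals))


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def corrected_item_total(rows, j):
    """Correlation of item j with the total of the remaining items."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return pearson(item, rest)


# Hypothetical responses: 5 respondents x 5 items on a 1-5 agreement scale
responses = [
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [5, 5, 4, 5, 5],
    [1, 2, 1, 2, 1],
]

print(round(cronbach_alpha(responses), 3))
print([round(corrected_item_total(responses, j), 3) for j in range(5)])
```

The "corrected" version removes the item from the total before correlating, so an item is not credited for correlating with itself.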
Assessing coach motivation: the development of the Coach Motivation Questionnaire (CMQ).

PubMed

McLean, Kristy N; Mallett, Clifford J; Newcombe, Peter

2012-04-01

The aim of this research was to develop and assess the psychometric properties of the Coach Motivation Questionnaire (CMQ). Study 1 focused on the compilation and pilot testing of potential questionnaire items. Consistent with self-determination theory, items were devised to tap into six forms of motivation: amotivation, external regulation, introjected regulation, identified regulation, integrated regulation, and intrinsic motivation. The purpose of the second study (N = 556) was to empirically examine the psychometric properties of the CMQ. Items were subjected to confirmatory factor analyses to determine the fit of the a priori model. In addition, the validity of the questionnaire was assessed through links with the theoretically related concepts of intrinsic need satisfaction, well-being, and goal orientation. Together with test-retest reliability (Study 3), these results showed preliminary support for the psychometric properties of the CMQ. Finally, using an independent sample (N = 254), the fourth study confirmed the factor structure and supports the use of the CMQ in future coaching research.
More relevant, precise, and efficient items for assessment of physical function and disability: moving beyond the classic instruments

PubMed Central

Fries, J F; Bruce, B; Bjorner, J; Rose, M

2006-01-01

Objectives: Patient reported outcomes (PROs) have become standard study endpoints. However, little attention has been given to using item improvement to advance PRO performance, which could improve precision, clarity, patient relevance, and information content of “physical function/disability” items and thus the performance of resulting instruments. Methods: The present study included 1860 physical function/disability items from 165 instruments. Item formulations were assessed by frequency of use, modified Delphi consensus, respondent judgement of clarity and importance, and item response theory (IRT). Data from 1100 rheumatoid arthritis, osteoarthritis, and normal ageing subjects, using qualitative item review, focus groups, cognitive interviews, and patient survey were used to achieve a unique item pool that was clear, reliable, sensitive to change, readily translatable, devoid of floor and ceiling limitations, contained unidimensional subdomains, and had maximal information content. Results: A “present tense” time frame was used most frequently, better understood, more readily translated, and more directly estimated the latent trait of disability. Items in the “past tense” had 80–90% false negatives (p<0.001). The best items were brief, clear, and contained a single construct. Responses with four to five options were preferred by both experts and respondents. The term physical function may be preferable to the term disability because of fewer floor effects. IRT analyses of “disability” suggest four independent subdomains (mobility, dexterity, axial, and compound) with factor loadings of 0.81–0.99. Conclusions: Major improvement in performance of items and instruments is possible, and may have the effect of substantially reducing sample size requirements for clinical trials. PMID:17038464
A twin study of specific bulimia nervosa symptoms.

PubMed

Mazzeo, S E; Mitchell, K S; Bulik, C M; Aggen, S H; Kendler, K S; Neale, M C

2010-07-01

Twin studies have suggested that additive genetic factors significantly contribute to liability to bulimia nervosa (BN). However, the diagnostic criteria for BN remain controversial. In this study, an item-factor model was used to examine the BN diagnostic criteria and the genetic and environmental contributions to BN in a population-based twin sample. The validity of the equal environment assumption (EEA) for BN was also tested. Participants were 1024 female twins (MZ n=614, DZ n=410) from the population-based Mid-Atlantic Twin Registry. BN was assessed using symptom-level (self-report) items consistent with DSM-IV and ICD-10 diagnostic criteria. Items assessing BN were included in an item-factor model. The EEA was measured by items assessing similarity of childhood and adolescent environment, which have demonstrated construct validity. Scores on the EEA factor were used to specify the degree to which twins shared environmental experiences in this model. The EEA was not violated for BN. Modeling results indicated that the majority of the variance in BN was due to additive genetic factors. There was substantial variability in additive genetic and environmental contributions to specific BN symptoms. Most notably, vomiting was very strongly influenced by additive genetic factors, while other symptoms were much less heritable, including the influence of weight on self-evaluation. These results highlight the importance of assessing eating disorders at the symptom level. Refinement of eating disorder phenotypes could ultimately lead to improvements in treatment and targeted prevention, by clarifying sources of variation for specific components of symptomatology.
Development and Validation of a Fatigue Assessment Scale for U.S. Construction Workers

PubMed Central

Zhang, Mingzong; Sparer, Emily H.; Murphy, Lauren A.; Dennerlein, Jack T.; Fang, Dongping; Katz, Jeffrey N.; Caban-Martinez, Alberto J.

2015-01-01

Objective: To develop a fatigue assessment scale and test its reliability and validity for commercial construction workers. Methods: Using a two-phased approach, we first identified items for the development of a Fatigue Assessment Scale for Construction Workers (FASCW) through review of existing scales in the scientific literature, key informant interviews (n=11) and focus groups (3 groups with 6 workers each) with construction workers. The second phase included assessment of the reliability, validity and sensitivity of the new scale using a repeated-measures study design with a convenience sample of construction workers (n=144). Results: Phase one resulted in a 16-item preliminary scale that after factor analysis yielded a final 10-item scale with two sub-scales (“Lethargy” and “Bodily Ailment”). During phase two, the FASCW and its subscales demonstrated satisfactory internal consistency (alpha coefficients were FASCW (0.91), Lethargy (0.86) and Bodily Ailment (0.84)) and acceptable test-retest reliability (Pearson Correlation Coefficients: 0.59–0.68; Intraclass Correlation Coefficients: 0.74–0.80). Correlation analysis substantiated concurrent and convergent validity. A discriminant analysis demonstrated that the FASCW differentiated between groups with arthritis status and different work hours. Conclusions: The 10-item FASCW with good reliability and validity is an effective tool for assessing the severity of fatigue among construction workers. PMID:25603944

Measuring Filial Piety in the 21st Century: Development, Factor Structure, and Reliability of the 10-Item Contemporary Filial Piety Scale.

PubMed

Lum, Terry Y S; Yan, Elsie C W; Ho, Andy H Y; Shum, Michelle H Y; Wong, Gloria H Y; Lau, Mandy M Y; Wang, Junfang

2016-11-01

The experience and practice of filial piety have evolved in modern Chinese societies, and existing measures fail to capture these important changes. Based on a conceptual analysis on current literature, 42 items were initially compiled to form a Contemporary Filial Piety Scale (CFPS), and 1,080 individuals from a representative sample in Hong Kong were surveyed. Principal component analysis generated a 16-item three-factor model: Pragmatic Obligations (Factor 1; 10 items), Compassionate Reverence (Factor 2; 4 items), and Family Continuity (Factor 3; 2 items). Confirmatory factor analysis revealed strong factor loadings for Factors 1 and 2, while removing Factor 3 and conceptually duplicated items increased total variance explained from 58.02% to 60.09% and internal consistency from .84 to .88. A final 10-item two-factor structure model was adopted with a goodness of fit of 0.95. The CFPS-10 is a data-driven, simple, and efficient instrument with strong psychometric properties for assessing contemporary filial piety. © The Author(s) 2015.
Scale development for measuring and predicting adolescents' leisure time physical activity behavior.

PubMed

Ries, Francis; Romero Granados, Santiago; Arribas Galarraga, Silvia

2009-01-01

The aim of this study was to develop a scale for assessing and predicting adolescents' physical activity behavior in Spain and Luxembourg using the Theory of Planned Behavior as a framework. The sample was comprised of 613 Spanish (boys = 309, girls = 304; M age = 15.28, SD = 1.127) and 752 Luxembourgish adolescents (boys = 343, girls = 409; M age = 14.92, SD = 1.198), selected from students of two secondary schools in both countries, with a similar socio-economic status. The initial 43 items were all scored on a 4-point response format using the structured alternative format and translated into Spanish, French and German. In order to ensure the accuracy of the translation, standardized parallel back-translation techniques were employed. Following two pilot tests and subsequent revisions, a second order exploratory factor analysis with oblimin direct rotation was used for factor extraction. Internal consistency and test-retest reliabilities were also tested. The 4-week test-retest correlations confirmed the items' time stability. The same five factors were obtained, explaining 63.76% and 63.64% of the total variance in both samples. Internal consistency for the five factors ranged from α = 0.759 to α = 0.949 in the Spanish sample and from α = 0.735 to α = 0.952 in the Luxembourgish sample. For both samples, inter-factor correlations were all reported significant and positive, except for Factor 5 where they were significant but negative. The high internal consistency of the subscales, the reported item test-retest reliabilities and the identical factor structure confirm the adequacy of the elaborated questionnaire for assessing the TPB-based constructs when used with a population of adolescents in Spain and Luxembourg. The results give some indication that they may have value in measuring the hypothesized TPB constructs for PA behavior in a cross-cultural context. Key points: When using the structured alternative format, weak internal consistency was obtained. Rephrasing the items and scoring items on a Likert-type scale greatly enhanced the subscales' reliability. Identical factorial structure was extracted for both culturally different samples. The obtained factors, namely perceived physical competence, parents' physical activity, perceived resources support, attitude toward physical activity and perceived parental support, were hypothesized as for the original TPB constructs.
The Community Integration Questionnaire - Revised: Australian normative data and measurement of electronic social networking.

PubMed

Callaway, Libby; Winkler, Dianne; Tippett, Alice; Herd, Natalie; Migliorini, Christine; Willer, Barry

2016-06-01

Consideration of the relationship between meaningful participation, health and wellbeing underpins occupational therapy intervention, and drives measurement of community integration following acquired brain injury (ABI). However, utility of community integration measures has been limited to date by lack of normative data against which to compare outcomes, and none examine the growing use of electronic social networking (ESN) for social participation. This research had four aims: (i) develop and pilot items assessing ESN to add to the Community Integration Questionnaire, producing the Community Integration Questionnaire-Revised (CIQ-R); (ii) examine factor structure of the CIQ-R; (iii) collect Australian CIQ-R normative data; and (iv) assess test-retest reliability of the revised measure. Australia. A convenience sample of adults without ABI (N = 124) was used to develop and pilot ESN items. A representative general population sample of adults without ABI aged 18-64 years (N = 1973) was recruited to gather normative CIQ-R data. Cross-sectional survey. Demographic items and the CIQ-R. The CIQ-R demonstrated acceptable psychometric properties, with minor modification to the original scoring based on the factor analyses provided. Large representative general population CIQ-R normative data have been established, detailing contribution of a range of independent demographic variables to community integration. The addition of electronic social networking items to the CIQ-R offers a contemporary method of assessing community integration following ABI. Normative CIQ-R data enhance the understanding of community integration in the general population, allowing occupational therapists and other clinicians to make more meaningful comparisons between groups. © 2016 Occupational Therapy Australia.
International Space Station (ISS) 3D Printer Performance and Material Characterization Methodology

NASA Technical Reports Server (NTRS)

Bean, Q. A.; Cooper, K. G.; Edmunson, J. E.; Johnston, M. M.; Werkheiser, M. J.

2015-01-01

In order for human exploration of the Solar System to be sustainable, manufacturing of necessary items on-demand in space or on planetary surfaces will be a requirement. As a first step towards this goal, the 3D Printing In Zero-G (3D Print) technology demonstration made the first items fabricated in space on the International Space Station. From those items, and comparable prints made on the ground, information about the microgravity effects on the printing process can be determined. Lessons learned from this technology demonstration will be applicable to other in-space manufacturing technologies, and may affect the terrestrial manufacturing industry as well. The flight samples were received at the George C. Marshall Space Flight Center on 6 April 2015. These samples will undergo a series of tests designed to not only thoroughly characterize the samples, but to identify microgravity effects manifested during printing by comparing their results to those of samples printed on the ground. Samples will be visually inspected, photographed, scanned with structured light, and analyzed with scanning electron microscopy. Selected samples will be analyzed with computed tomography; some will be assessed using ASTM standard tests. These tests will provide the information required to determine the effects of microgravity on 3D printing in microgravity.
The development and psychometric validation of the Ethical Awareness Scale.

PubMed

Milliken, Aimee; Ludlow, Larry; DeSanto-Madeya, Susan; Grace, Pamela

2018-04-19

To develop and psychometrically assess the Ethical Awareness Scale using Rasch measurement principles and a Rasch item response theory model. Critical care nurses must be equipped to provide good (ethical) patient care. This requires ethical awareness, which involves recognizing the ethical implications of all nursing actions. Ethical awareness is imperative in successfully addressing patient needs. Evidence suggests that the ethical import of everyday issues may often go unnoticed by nurses in practice. Assessing nurses' ethical awareness is a necessary first step in preparing nurses to identify and manage ethical issues in the highly dynamic critical care environment. A cross-sectional design was used in two phases of instrument development. Using Rasch principles, an item bank representing nursing actions was developed (33 items). Content validity testing was performed. Eighteen items were selected for face validity testing. Two rounds of operational testing were performed with critical care nurses in Boston between February-April 2017. A Rasch analysis suggests sufficient item invariance across samples and sufficient construct validity. The analysis further demonstrates a progression of items uniformly along a hierarchical continuum; items that match respondent ability levels; response categories that are sufficiently used; and adequate internal consistency. Mean ethical awareness scores were in the low/moderate range. The results suggest the Ethical Awareness Scale is a psychometrically sound, reliable and valid measure of ethical awareness in critical care nurses. © 2018 John Wiley & Sons Ltd.
Psychometric Properties of the International Personality Item Pool Big-Five Personality Questionnaire for the Greek population.

PubMed

Ypofanti, Maria; Zisi, Vasiliki; Zourbanos, Nikolaos; Mouchtouri, Barbara; Tzanne, Pothiti; Theodorakis, Yannis; Lyrakos, Georgios

2015-09-30

Goldberg's International Personality Item Pool (IPIP) big-five personality factor markers currently lack validating evidence. The structure of the 50-item IPIP was examined in two different adult samples (total N=811), in each case justifying a 5-factor solution, with only minor discrepancies. Age differences were comparable to previous findings using other inventories. One sample (N=193) additionally completed another personality measure (the TIPI Short Form). Conscientiousness, extraversion and emotional stability/neuroticism scales of the IPIP were highly correlated with those of the TIPI (r=0.62 to 0.65, P=0.01). Agreeableness and Intellect/Openness scales correlated less strongly (r=0.54 and 0.58 respectively, P=0.01). The IPIP scales have good internal consistency (α=0.88) and relate strongly to major dimensions of personality assessed by the two questionnaires.
Psychometric Properties of the Chinese Shortened Version of the Zuckerman–Kuhlman Personality Questionnaire in a Sample of Adolescents and Young Adults

PubMed Central

Wang, Daoyang; Hu, Mingming; Zheng, Chanjin; Liu, Zhengguang

2017-01-01

Introduction: The original 89-item Zuckerman–Kuhlman Personality Questionnaire (form III Revised, ZKPQ-III-R) is a widely accepted and used self-report measure for personality traits. This study assessed the reliability and construct validity of the Chinese short 46-item version of the ZKPQ-III-R in a sample of adolescents and young adults. Methodology: A total of 1,019 Chinese adolescents and young adults completed the Chinese version of the original 89-item version ZKPQ-III-R and short 46-item version ZKPQ-III-R, self-report measures of depression, life satisfaction, and subjective health complaints (SHC), the Big Five personality traits, and a substance use risk profile. We explored the internal consistency of five dimensions of the short 46-item version ZKPQ-III-R and compared it with observations in previous studies of Chinese and other populations. The structure of the questionnaire was analyzed by confirmatory factor analysis and exploratory structural equation modeling. Results: The short 46-item version ZKPQ-III-R had adequate internal reliability for all five dimensions, with Cronbach’s α coefficients of 0.63 to 0.84. The concurrent validity of the short 46-item version ZKPQ-III-R was supported by significant correlations with depression, life satisfaction, and SHC. The short 46-item version ZKPQ-III-R had better fit, similar reliability coefficients, and slightly better construct and convergent validity than the 89-item version. Conclusion: The Chinese version of the 46-item ZKPQ-III-R presented reliability and validity in measuring personality in Chinese adolescents and young adults. PMID:28326057
A Chinese version of the City of Hope Quality of Life-Ostomy Questionnaire: validity and reliability assessment.

PubMed

Gao, Wenjun; Yuan, Changrong; Wang, Jichuan; Du, Jiarui; Wu, Huiqiao; Qian, Xiaojie; Hinds, Pamela S

2013-01-01

The City of Hope Quality of Life-Ostomy Questionnaire is a widely accepted scale to assess quality of life in ostomy patients. However, the validity and reliability of the Chinese version (C-COH) have not been studied. The objective of the study was to assess the validity and reliability of the C-COH among ostomy patients sampled from Shanghai from August 2010 to June 2011. Content validity was examined based on the reviews of a panel of 10 experts; test-retest was conducted to assess the item reliabilities of the scale; a pilot sample (n = 274) was selected to explore the factorial structure of the C-COH using exploratory factor analysis; a validation sample (n = 370) was selected to confirm the findings from the exploratory study using confirmatory factor analysis (CFA). Statistical package SPSS version 16.0 was used for the exploratory factor analysis, and Amos 17.0 was used for the CFA. The C-COH was developed by modifying 1 item and excluding 11 items from the original scale. Four factors/subscales (physical well-being, psychological well-being, social well-being, and spiritual well-being) were identified and confirmed in the C-COH. The scale reliabilities estimated from the CFA results for the 4 subscales were 0.860, 0.885, 0.864, and 0.686, respectively. Findings support the reliability and validity of the C-COH. The C-COH could be a useful measure of the level of quality of life among Chinese patients with a stoma and may provide important intervention implications for healthcare providers to help improve the life quality of patients with a stoma.
Development and psychometric analysis of the Brief DSM-5 Alcohol Use Disorder Diagnostic Assessment: Towards effective diagnosis in college students.

PubMed

Hagman, Brett T

2017-11-01

The Diagnostic and Statistical Manual of Mental Disorders (5th edition) Alcohol Use Disorder (DSM-5 AUD) criteria have been modified to reflect a single, continuous disorder. It is critical that we develop brief assessment measures that can accurately assess DSM-5 AUD criteria in college students to assist in screening, referral, and brief intervention services implemented on college campuses. The present study sought to develop and assess the psychometric properties of a brief 13-item measure designed to capture the full spectrum of the DSM-5 AUD criteria in a sample of college students. Participants were past-year drinkers (N = 923) between the ages of 18 and 30 enrolled at 3 universities. Respondents completed a 30-min anonymous battery of questionnaires online. The Brief DSM-5 AUD Assessment consisted of 13 items designed to reflect the DSM-5 AUD criteria. Results indicated a high degree of internal consistency reliability with high item-to-scale correlations. Confirmatory factor analyses indicated that a dominant single factor emerged with good model fit. The Item Response Theory (IRT) analyses indicated that the difficulty parameters for each criterion were intermixed along the upper portion of the underlying AUD severity continuum, and the discrimination parameters were all high. Additional analysis indicated that those with a DSM-5 AUD had greater levels of alcohol and other drug use and problem severity in comparison to those without a DSM-5 AUD. Study findings provide empirical support for the reliability and validity of the Brief 13-item DSM-5 Assessment. It should be routinely included in research and clinical practice efforts. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
OA Go Away: Development and Preliminary Validation of a Self-Management Tool to Promote Adherence to Exercise and Physical Activity for People with Osteoarthritis of the Hip or Knee

PubMed Central

Toupin April, Karine; Backman, Catherine; Tugwell, Peter

2016-01-01

Purpose: To determine the face and content validity, construct validity, and test–retest reliability of the OA Go Away (OGA), a personalized self-management tool to promote adherence to exercise and physical activity for people with osteoarthritis (OA) of the hip or knee. Methods: The face and content validity of OGA version 1.0 were determined via interviews with 10 people with OA of the hip or knee and 10 clinicians. A revised OGA version 2.0 was then tested for construct validity and test–retest reliability with a new sample of 50 people with OA of the hip or knee by comparing key items in the OGA journal with validated outcome measures assessing similar health outcomes and comparing scores on key items of the journal 4–7 days apart. Face and content validity were then confirmed with a new sample of 5 people with OA of the hip or knee and 5 clinicians. Results: Eighteen of 30 items from the OGA version 1.0 and 41 of 43 items from the OGA version 2.0 journal, goals and action plan, and exercise log had adequate content validity. Construct validity and test–retest reliability were acceptable for the main items of the OGA version 2.0 journal. The OGA underwent modifications based on results and participant feedback. Conclusion: The OGA is a novel self-management intervention and assessment tool for people with OA of the hip or knee that shows adequate preliminary measurement properties. PMID:27909359
Rasch model analysis of the Depression, Anxiety and Stress Scales (DASS)

PubMed Central

Shea, Tracey L; Tennant, Alan; Pallant, Julie F

2009-01-01

Background: There is a growing awareness of the need for easily administered, psychometrically sound screening tools to identify individuals with elevated levels of psychological distress. Although support has been found for the psychometric properties of the Depression, Anxiety and Stress Scales (DASS) using classical test theory approaches, it has not been subjected to Rasch analysis. The aim of this study was to use Rasch analysis to assess the psychometric properties of the DASS-21 scales, using two different administration modes. Methods: The DASS-21 was administered to 420 participants with half the sample responding to a web-based version and the other half completing a traditional pencil-and-paper version. Conformity of DASS-21 scales to a Rasch partial credit model was assessed using the RUMM2020 software. Results: To achieve adequate model fit it was necessary to remove one item from each of the DASS-21 subscales. The reduced scales showed adequate internal consistency reliability, unidimensionality and freedom from differential item functioning for sex, age and mode of administration. Analysis of all DASS-21 items combined did not support its use as a measure of general psychological distress. A scale combining the anxiety and stress items showed satisfactory fit to the Rasch model after removal of three items. Conclusion: The results provide support for the measurement properties, internal consistency reliability, and unidimensionality of three slightly modified DASS-21 scales, across two different administration methods. The further use of Rasch analysis on the DASS-21 in larger and broader samples is recommended to confirm the findings of the current study. PMID:19426512
Rasch model analysis of the Depression, Anxiety and Stress Scales (DASS).

PubMed

Shea, Tracey L; Tennant, Alan; Pallant, Julie F

2009-05-09

There is a growing awareness of the need for easily administered, psychometrically sound screening tools to identify individuals with elevated levels of psychological distress. Although support has been found for the psychometric properties of the Depression, Anxiety and Stress Scales (DASS) using classical test theory approaches it has not been subjected to Rasch analysis. The aim of this study was to use Rasch analysis to assess the psychometric properties of the DASS-21 scales, using two different administration modes. The DASS-21 was administered to 420 participants with half the sample responding to a web-based version and the other half completing a traditional pencil-and-paper version. Conformity of DASS-21 scales to a Rasch partial credit model was assessed using the RUMM2020 software. To achieve adequate model fit it was necessary to remove one item from each of the DASS-21 subscales. The reduced scales showed adequate internal consistency reliability, unidimensionality and freedom from differential item functioning for sex, age and mode of administration. Analysis of all DASS-21 items combined did not support its use as a measure of general psychological distress. A scale combining the anxiety and stress items showed satisfactory fit to the Rasch model after removal of three items. The results provide support for the measurement properties, internal consistency reliability, and unidimensionality of three slightly modified DASS-21 scales, across two different administration methods. The further use of Rasch analysis on the DASS-21 in larger and broader samples is recommended to confirm the findings of the current study.
The stroke impairment assessment set: its internal consistency and predictive validity.

PubMed

Tsuji, T; Liu, M; Sonoda, S; Domen, K; Chino, N

2000-07-01

To study the scale quality and predictive validity of the Stroke Impairment Assessment Set (SIAS) developed for stroke outcome research. Rasch analysis of the SIAS; stepwise multiple regression analysis to predict discharge functional independence measure (FIM) raw scores from demographic data, the SIAS scores, and the admission FIM scores; cross-validation of the prediction rule. Tertiary rehabilitation center in Japan. One hundred ninety stroke inpatients for the study of the scale quality and the predictive validity; a second sample of 116 stroke inpatients for the cross-validation study. Mean square fit statistics to study the degree of fit to the unidimensional model; logits to express item difficulties; discharge FIM scores for the study of predictive validity. The degree of misfit was acceptable except for the shoulder range of motion (ROM), pain, visuospatial function, and speech items; and the SIAS items could be arranged on a common unidimensional scale. The difficulty patterns were identical at admission and at discharge except for the deep tendon reflexes, ROM, and pain items. They were also similar for the right- and left-sided brain lesion groups except for the speech and visuospatial items. For the prediction of the discharge FIM scores, the independent variables selected were age, the SIAS total scores, and the admission FIM scores; and the adjusted R2 was .64 (p < .0001). Stability of the predictive equation was confirmed in the cross-validation sample (R2 = .68, p < .001). The unidimensionality of the SIAS was confirmed, and the SIAS total scores proved useful for stroke outcome prediction.
Work-related measures of physical and behavioral health function: Test-retest reliability.

PubMed

Marino, Molly Elizabeth; Meterko, Mark; Marfeo, Elizabeth E; McDonough, Christine M; Jette, Alan M; Ni, Pengsheng; Bogusz, Kara; Rasch, Elizabeth K; Brandt, Diane E; Chan, Leighton

2015-10-01

The Work Disability Functional Assessment Battery (WD-FAB), developed for potential use by the US Social Security Administration to assess work-related function, currently consists of five multi-item scales assessing physical function and four multi-item scales assessing behavioral health function; the WD-FAB scales are administered as Computerized Adaptive Tests (CATs). The goal of this study was to evaluate the test-retest reliability of the WD-FAB Physical Function and Behavioral Health CATs. We administered the WD-FAB scales twice, 7-10 days apart, to a sample of 376 working age adults and 316 adults with work-disability. Intraclass correlation coefficients (ICCs) were calculated to measure the consistency of the scores between the two administrations. Standard error of measurement (SEM) and minimal detectable change (MDC90) were also calculated to measure the scales' precision and sensitivity. For the Physical Function CAT scales, the ICCs ranged from 0.76 to 0.89 in the working age adult sample, and 0.77-0.86 in the sample of adults with work-disability. ICCs for the Behavioral Health CAT scales ranged from 0.66 to 0.70 in the working age adult sample, and 0.77-0.80 in the adults with work-disability. The SEM ranged from 3.25 to 4.55 for the Physical Function scales and 5.27-6.97 for the Behavioral Health function scales. For all scales in both samples, the MDC90 ranged from 7.58 to 16.27. Both the Physical Function and Behavioral Health CATs of the WD-FAB demonstrated good test-retest reliability in adults with work-disability and general adult samples, a critical requirement for assessing work related functioning in disability applicants and in other contexts. Copyright © 2015 Elsevier Inc. All rights reserved.
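A minimal sketch of how the precision statistics named in the abstract above are conventionally related. It assumes the textbook formulas SEM = SD x sqrt(1 - ICC) and MDC90 = z x sqrt(2) x SEM; the abstract does not say which formulas or z value the authors used, and the function names and the SD of 10 are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and a test-retest ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdc(sem: float, z: float = 1.645) -> float:
    """Minimal detectable change; z = 1.645 gives a 90% confidence level."""
    # sqrt(2) accounts for measurement error in both administrations.
    return z * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical scale with SD = 10 and ICC = 0.84 (not values from the study).
    sem = sem_from_icc(sd=10.0, icc=0.84)
    print("SEM:", round(sem, 2))
    print("MDC90:", round(mdc(sem), 2))

    # Consistency check on the SEM range reported in the abstract.
    for reported_sem in (3.25, 6.97):
        print(reported_sem, "->", round(mdc(reported_sem, z=1.65), 2))
```

With z rounded to 1.65, the smallest and largest reported SEMs (3.25 and 6.97) map to roughly 7.58 and 16.26, in line with the reported MDC90 range of 7.58 to 16.27. This is only a consistency check on the published figures, not a statement of the authors' procedure.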
Work-related measures of Physical and Behavioral Health Function: Test-Retest Reliability

PubMed Central

Marino, Molly Elizabeth; Meterko, Mark; Marfeo, Elizabeth E.; McDonough, Christine M.; Jette, Alan M.; Ni, Pengsheng; Bogusz, Kara; Rasch, Elizabeth K.; Brandt, Diane E.; Chan, Leighton

2015-01-01

Background: The Work Disability Functional Assessment Battery (WD-FAB), developed for potential use by the US Social Security Administration to assess work-related function, currently consists of five multi-item scales assessing physical function and four multi-item scales assessing behavioral health function; the WD-FAB scales are administered as Computerized Adaptive Tests (CATs). Objective: The goal of this study was to evaluate the test-retest reliability of the WD-FAB Physical Function and Behavioral Health CATs. Methods: We administered the WD-FAB scales twice, 7–10 days apart, to a sample of 376 working age adults and 316 adults with work-disability. Intraclass correlation coefficients (ICCs) were calculated to measure the consistency of the scores between the two administrations. Standard error of measurement (SEM) and minimal detectable change (MDC90) were also calculated to measure the scales' precision and sensitivity. Results: For the Physical Function CAT scales, the ICCs ranged from 0.76–0.89 in the working age adult sample, and 0.77–0.86 in the sample of adults with work-disability. ICCs for the Behavioral Health CAT scales ranged from 0.66–0.70 in the working age adult sample, and 0.77–0.80 in the adults with work-disability. The SEM ranged from 3.25–4.55 for the Physical Function scales and 5.27–6.97 for the Behavioral Health function scales. For all scales in both samples, the MDC90 ranged from 7.58–16.27. Conclusion: Both the Physical Function and Behavioral Health CATs of the WD-FAB demonstrated good test-retest reliability in adults with work-disability and general adult samples, a critical requirement for assessing work related functioning in disability applicants and in other contexts. PMID:25991419
Psychometric assessment of the short-form Child Perceptions Questionnaire: an international collaborative study.

PubMed

Thomson, W M; Foster Page, L A; Robinson, P G; Do, L G; Traebert, J; Mohamed, A R; Turton, B J; McGrath, C; Bekes, K; Hirsch, C; Del Carmen Aguilar-Diaz, F; Marshman, Z; Benson, P E; Baker, S R

2016-12-01

To examine the factor structure and other psychometric characteristics of the most commonly used child oral-health-related quality-of-life (OHRQoL) measure (the 16-item short-form CPQ 11-14) in a large number of children (N = 5804) from different settings and who had a range of caries experience and associated impacts. Secondary data analyses used subnational epidemiological samples of 11- to 14-year-olds in Australia (N = 372), New Zealand (three samples: 352, 202, 429), Brunei (423), Cambodia (244), Hong Kong (542), Malaysia (439), Thailand (220, 325), England (88, 374), Germany (1055), Mexico (335) and Brazil (404). Confirmatory factor analysis (CFA) was used to examine the factor structure of the CPQ 11-14 across the combined sample and within four regions (Australia/NZ, Asia, UK/Europe and Latin America). Item impact and internal reliability analysis were also conducted. Caries experience varied, with mean DMFT scores ranging from 0.5 in the Malaysian sample to 3.4 in one New Zealand sample. Even more variation was noted in the proportion reporting only fair or poor oral health; this was highest in the Cambodian and Mexican samples and lowest in the German sample and one New Zealand sample. One in 10 reported that their oral health had a marked impact on their life overall. The CFA across all samples revealed two factors with eigenvalues greater than 1. The first involved all items in the oral symptoms and functional limitations subscales; the second involved all emotional well-being and social well-being items. The first was designated the 'symptoms/function' subscale, and the second was designated the 'well-being' subscale. Cronbach's alpha scores were 0.72 and 0.84, respectively. The symptoms/function subscale contained more of the items with greater impact, with the item 'Food stuck in between your teeth' having greatest impact; in the well-being subscale, the 'Felt shy or embarrassed' item had the greatest impact. Repeating the analyses by world region gave similar findings. The CPQ 11-14 performed well cross-sectionally in the largest analysis of the scale in the literature to date, with robust and mostly consistent psychometric characteristics, albeit with two underlying factors (rather than the originally hypothesized four-factor structure). It appears to be a sound, robust measure which should be useful for research, practice and policy. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Korean Version of Inventory of Complicated Grief Scale: Psychometric Properties in Korean Adolescents

PubMed Central

2016-01-01

We aimed to validate the Inventory of Complicated Grief (ICG)-Korean version among 1,138 Korean adolescents, representing a response rate of 57% of 1,997 students. Participants completed a set of questionnaires including demographic variables (age, sex, years of education, experience of grief), the ICG, the Children's Depression Inventory (CDI) and the Lifetime Incidence of Traumatic Events-Child (LITE-C). Exploratory factor analysis was performed to determine whether the ICG items indicated complicated grief in Korean adolescents. The internal consistency of the ICG-Korean version was Cronbach's α=0.87. The test-retest reliability for a randomly selected sample of 314 participants in 2 weeks was r=0.75 (P<0.001). Concurrent validity was assessed using a correlation between the ICG total scores and the CDI total scores (r=0.75, P<0.001). The criterion-related validity based on the comparison of ICG total scores between adolescents without complicated grief (1.2±3.7) and adolescents with complicated grief (3.2±6.6) was relatively high (t=5.71, P<0.001). The data acquired from the 1,138 students were acceptable for a factor analysis (Kaiser-Meyer-Olkin Measure of Sampling Adequacy=0.911; Bartlett's Test of Sphericity, χ²=13,144.7, P<0.001). After omission of 3 items, the value of Cronbach's α increased from 0.87 for the 19-item ICG-Korean version to 0.93 for the 16-item ICG-Korean version. These results suggest that the ICG is a useful tool in assessing for complicated grief in Korean adolescents. However, the 16-item version of the ICG appeared to be more valid compared to the 19-item version of the ICG. We suggest that the 16-item version of the ICG be used to screen for complicated grief in Korean adolescents. PMID:26770046
Korean Version of Inventory of Complicated Grief Scale: Psychometric Properties in Korean Adolescents.

PubMed

Han, Doug Hyun; Lee, Jung Jae; Moon, Duk-Soo; Cha, Myoung-Jin; Kim, Min A; Min, Seonyeong; Yang, Ji Hoon; Lee, Eun Jeong; Yoo, Seo Koo; Chung, Un-Sun

2016-01-01

We aimed to validate the Inventory of Complicated Grief (ICG)-Korean version among 1,138 Korean adolescents, representing a response rate of 57% of 1,997 students. Participants completed a set of questionnaires including demographic variables (age, sex, years of education, experience of grief), the ICG, the Children's Depression Inventory (CDI) and the Lifetime Incidence of Traumatic Events-Child (LITE-C). Exploratory factor analysis was performed to determine whether the ICG items indicated complicated grief in Korean adolescents. The internal consistency of the ICG-Korean version was Cronbach's α=0.87. The test-retest reliability for a randomly selected sample of 314 participants in 2 weeks was r=0.75 (P<0.001). Concurrent validity was assessed using a correlation between the ICG total scores and the CDI total scores (r=0.75, P<0.001). The criterion-related validity based on the comparison of ICG total scores between adolescents without complicated grief (1.2 ± 3.7) and adolescents with complicated grief (3.2 ± 6.6) was relatively high (t=5.71, P<0.001). The data acquired from the 1,138 students were acceptable for a factor analysis (Kaiser-Meyer-Olkin Measure of Sampling Adequacy=0.911; Bartlett's Test of Sphericity, χ²=13,144.7, P<0.001). After omission of 3 items, the value of Cronbach's α increased from 0.87 for the 19-item ICG-Korean version to 0.93 for the 16-item ICG-Korean version. These results suggest that the ICG is a useful tool in assessing for complicated grief in Korean adolescents. However, the 16-item version of the ICG appeared to be more valid compared to the 19-item version of the ICG. We suggest that the 16-item version of the ICG be used to screen for complicated grief in Korean adolescents.
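The abstract above reports Cronbach's α before and after omitting items. As a rough illustration of that kind of comparison, the sketch below computes α from a respondents-by-items table using the standard formula α = k/(k-1) x (1 - sum of item variances / variance of total scores). The function names and the toy responses are hypothetical and are not taken from the study.

```python
from statistics import variance

# Hypothetical responses: 5 people x 4 items (not data from the study).
# The last item runs against the other three.
TOY_SCORES = [
    [3, 4, 3, 1],
    [2, 2, 3, 4],
    [4, 5, 4, 2],
    [1, 1, 2, 5],
    [5, 4, 5, 1],
]


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table of numeric scores."""
    k = len(scores[0])                     # number of items
    item_columns = list(zip(*scores))      # transpose to items-by-respondents
    item_vars = [variance(col) for col in item_columns]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_without(scores, dropped):
    """Alpha after omitting the item positions listed in `dropped`."""
    kept = [[x for i, x in enumerate(row) if i not in dropped] for row in scores]
    return cronbach_alpha(kept)


if __name__ == "__main__":
    print("all items:", round(cronbach_alpha(TOY_SCORES), 3))
    print("last item omitted:", round(alpha_without(TOY_SCORES, {3}), 3))
```

In the toy data, dropping the item that runs against the others raises α, which is the same direction of change the abstract describes when 3 items were omitted. A higher α indicates greater internal consistency only; it does not by itself establish validity.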
Evaluating the Dimensionality of Self-Determination Theory's Relative Autonomy Continuum.

PubMed

Sheldon, Kennon M; Osin, Evgeny N; Gordeeva, Tamara O; Suchkov, Dmitry D; Sychev, Oleg A

2017-09-01

We conducted a theoretical and psychometric evaluation of self-determination theory's "relative autonomy continuum" (RAC), an important aspect of the theory whose validity has recently been questioned. We first derived a Comprehensive Relative Autonomy Index (C-RAI) containing six subscales and 24 items, by conducting a paired paraphrase content analysis of existing RAI measures. We administered the C-RAI to multiple U.S. and Russian samples, assessing motivation to attend class, study a major, and take responsibility. Item-level and scale-level multidimensional scaling analyses, confirmatory factor analyses, and simplex/circumplex modeling analyses reaffirmed the validity of the RAC, across multiple samples, stems, and studies. Validation analyses predicting subjective well-being and trait autonomy from the six separate subscales, in combination with various higher order composites (weighted and unweighted), showed that an aggregate unweighted RAI score provides the most unbiased and efficient indicator of the overall quality of motivation within the behavioral domain being assessed.
Dental responsibility loadings and the relative value of dental services.

PubMed

Teusner, D N; Ju, X; Brennan, D S

2017-09-01

To estimate responsibility loadings for a comprehensive list of dental services, providing a standardized unit of clinical work effort. Dentists (n = 2500) randomly sampled from the Australian Dental Association membership (2011) were randomly assigned to one of 25 panels. Panels were surveyed by questionnaires eliciting responsibility loadings for eight common dental services (core items) and approximately 12 other items unique to that questionnaire. In total, loadings were elicited for 299 items listed in the Australian Dental Schedule 9th Edition. Data were weighted to reflect the age and sex distribution of the workforce. To assess reliability, regression models assessed differences in core item loadings by panel assignment. Estimated loadings were described by reporting the median and mean. Response rate was 37%. Panel composition did not vary by practitioner characteristics. Core item loadings did not vary by panel assignment. Oral surgery and endodontic service areas had the highest proportion (91%) of services with median loadings ≥1.5, followed by prosthodontics (78%), periodontics (76%), orthodontics (63%), restorative (62%) and diagnostic services (31%). Preventive services had median loadings ≤1.25. Dental responsibility loadings estimated by this study can be applied in the development of relative value scales. © 2017 Australian Dental Association.

Validation of the Brazilian version of the 'Spanish Burnout Inventory' in teachers.

PubMed

Gil-Monte, Pedro R; Carlotto, Mary Sandra; Câmara, Sheila Gonçalves

2010-02-01

To assess factorial validity and internal consistency of the Brazilian version of the 'Spanish Burnout Inventory' (SBI). The translation process of the SBI into Brazilian Portuguese included translation, back translation, and semantic equivalence. A confirmatory factor analysis was carried out using a four-factor model, which was similar to the original SBI. The sample consisted of 714 teachers working in schools in the metropolitan area of the city of Porto Alegre, Southern Brazil, in 2008. The instrument comprises 20 items and four subscales: Enthusiasm towards job (5 items), Psychological exhaustion (4 items), Indolence (6 items), and Guilt (5 items). The model was analyzed using LISREL 8. Goodness-of-Fit statistics showed that the hypothesized model had adequate fit: χ²(164) = 605.86 (p<0.000); Goodness-of-Fit Index = 0.92; Adjusted Goodness-of-Fit Index = 0.90; Root Mean Square Error of Approximation = 0.062; Nonnormed Fit Index = 0.91; Comparative Fit Index = 0.92; and Parsimony Normed Fit Index = 0.77. Cronbach's alpha measures for all subscales were higher than 0.70. The study showed that the SBI has adequate factorial validity and internal consistency to assess burnout in Brazilian teachers.
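As a rough check on how the fit statistics in the abstract above relate to each other, the sketch below computes the Root Mean Square Error of Approximation from the reported chi-square, degrees of freedom, and sample size. It assumes the common point-estimate formula RMSEA = sqrt(max(χ² - df, 0) / (df x (N - 1))); the abstract does not state which chi-square variant or formula LISREL applied, and the function name is hypothetical.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square, its df, and sample size."""
    # Excess of chi-square over its expected value (df), floored at zero.
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n - 1)))


if __name__ == "__main__":
    # Values reported in the abstract: chi-square(164) = 605.86, N = 714.
    print(round(rmsea(605.86, 164, 714), 3))
```

Under this assumed formula the reported values give roughly 0.061, close to the published 0.062. The small gap may come from rounding or from the software using a different chi-square statistic, which the abstract does not specify.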
Direct amplification of casework bloodstains using the Promega PowerPlex(®) 21 PCR amplification system.

PubMed

Gray, Kerryn; Crowle, Damian; Scott, Pam

2014-09-01

A significant number of evidence items submitted to Forensic Science Service Tasmania (FSST) are blood swabs or bloodstained items. Samples from these items routinely undergo phenol:chloroform:isoamyl alcohol organic extraction and quantitative Polymerase Chain Reaction (qPCR) testing prior to PowerPlex(®) 21 amplification. This multi-step process has significant cost and timeframe implications in a fiscal climate of tightening government budgets, pressure towards improved operating efficiencies, and an increasing emphasis on rapid techniques better supporting intelligence-led policing. Direct amplification of blood and buccal cells on cloth and Whatman FTA™ card with PowerPlex(®) 21 has already been successfully implemented for reference samples, eliminating the requirement for sample pre-treatment. Scope for expanding this method to include less pristine casework blood swabs and samples from bloodstained items was explored in an endeavour to eliminate lengthy DNA extraction, purification and qPCR steps for a wider subset of samples. Blood was deposited onto a range of substrates including those historically found to inhibit STR amplification. Samples were collected with micro-punch, micro-swab, or both. The potential for further fiscal savings via reduced volume amplifications was assessed by amplifying all samples at full and reduced volume (25 and 13 μL). Overall success rate data showed 80% of samples yielded a complete profile at reduced volume, compared to 78% at full volume. Particularly high success rates were observed for the blood on fabric/textile category with 100% of micro-punch samples yielding complete profiles at reduced volume and 85% at full volume. Following the success of this trial, direct amplification of suitable casework blood samples has been implemented at reduced volume. Significant benefits have been experienced, most noticeably where results from crucial items have been provided to police investigators prior to interview of suspects, and a coronial identification has been successfully completed in a short timeframe to avoid delay in the release of human remains to family members. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Farmer Perceptions of Soil and Water Conservation Issues: Implications to Agricultural and Extension Education.

ERIC Educational Resources Information Center

Bruening, Thomas H.; Martin, Robert A.

A sample of 731 farmers was surveyed to identify perceptions regarding selected soil and water conservation practices. The sample was stratified and proportioned by conservation district to have a representative group of respondents across Iowa. Items on the mailed questionnaire were designed to assess perceptions regarding issues in soil and…
The development and psychometric properties of the American sign language proficiency assessment (ASL-PA).

PubMed

Maller, S; Singleton, J; Supalla, S; Wix, T

1999-01-01

We describe the procedures for constructing an instrument designed to evaluate children's proficiency in American Sign Language (ASL). The American Sign Language Proficiency Assessment (ASL-PA) is a much-needed tool that potentially could be used by researchers, language specialists, and qualified school personnel. A half-hour ASL sample is collected on video from a target child (between ages 6 and 12) across three separate discourse settings and is later analyzed and scored by an assessor who is highly proficient in ASL. After the child's language sample is scored, he or she can be assigned an ASL proficiency rating of Level 1, 2, or 3. At this phase in its development, substantial evidence of reliability and validity has been obtained for the ASL-PA using a sample of 80 profoundly deaf children (ages 6-12) of varying ASL skill levels. The article first explains the item development and administration of the ASL-PA instrument, then describes the empirical item analysis, standard setting procedures, and evidence of reliability and validity. The ASL-PA is a promising instrument for assessing elementary school-age children's ASL proficiency. Plans for further development are also discussed.
Examining construct validity of a new naturalistic observational assessment of hand skills for preschool- and school-age children.

PubMed

Chien, Chi-Wen; Brown, Ted; McDonald, Rachael

2012-04-01

The Assessment of Children's Hand Skills is a new assessment that utilises a naturalistic observational method to capture children's real-life hand skill performance when engaged in various types of daily activities in everyday living contexts. The Assessment of Children's Hand Skills is designed for use with 2- to 12-year-old children with a range of disabilities or health conditions. The study aimed to investigate construct validity of the Assessment of Children's Hand Skills in Australian children. Rasch analysis was used to examine internal construct validity of the Assessment of Children's Hand Skills in a mixed sample of 53 children with disabilities (including autism spectrum disorder, developmental/genetic disorders and physical disabilities) and 85 typically developing children. External construct validity was examined by correlating with three questionnaires evaluating daily living skills and hand skills. Rasch goodness-of-fit analysis suggested that all 22 activity items and 19 of 20 hand skill items in the Assessment of Children's Hand Skills measured a single construct. The Assessment of Children's Hand Skills items were placed in a clinically meaningful hierarchy from easy to hard, and the difficulty range of the items also matched the majority of children with disabilities and typically developing preschool-aged children. Moderate to high correlations (0.59 ≤ Spearman's ρ coefficients ≤ 0.89, P < 0.01) were found with the assessments of daily living and fine motor skills. This study provided preliminary evidence supporting the construct validity of the Assessment of Children's Hand Skills for its clinical application in assessing children's real-life hand skill performance in Australian contexts. © 2012 The Authors Australian Occupational Therapy Journal © 2012 Occupational Therapy Australia.
Oropharyngeal dysphagia: surveying practice patterns of the speech-language pathologist.

PubMed

Martino, Rosemary; Pron, Gaylene; Diamant, Nicholas E

2004-01-01

The present study was designed to obtain a comprehensive view of the dysphagia assessment practice patterns of speech-language pathologists and their opinion on the importance of these practices using survey methods and taking into consideration clinician, patient, and practice-setting variables. A self-administered mail questionnaire was developed following established methodology to maximize response rates. Eight dysphagia experts independently rated the new survey for content validity. Test-retest reliability was assessed with a random sample of 23 participants. The survey was sent to 50 speech-language pathologists randomly selected from the Canadian professional association database of members who practice in dysphagia. Surveys were mailed according to the Dillman Total Design Method and included an incentive offer. High survey (64%) and item response (95%) rates were achieved and clinicians were reliable reporters of their practice behaviors (ICC>0.60). Of all the clinical assessment items, 36% were reported with high (>80%) utilization and 24% with low (<20%) utilization, the former pertaining to tongue motion and vocal quality after food/fluid intake and the latter to testing of oral sensation without food. One-third (33%) of instrumental assessment items were highly utilized and included assessment of bolus movement and laryngeal response to bolus misdirection. Overall, clinician experience and teaching institutions influenced greater utilization. Opinions of importance were similar to utilization behaviors (r = 0.947, p = 0.01). Of all patients referred for dysphagia assessment, full clinical assessments were administered to 71% of patients but instrumental assessments to only 36%. A hierarchical model of practice behavior is proposed to explain this pattern of progressively decreasing item utilization.
Validation of the Adolescent Concerns Measure (ACM): evidence from exploratory and confirmatory factor analysis.

PubMed

Ang, Rebecca P; Chong, Wan Har; Huan, Vivien S; Yeo, Lay See

2007-01-01

This article reports the development and initial validation of scores obtained from the Adolescent Concerns Measure (ACM), a scale which assesses concerns of Asian adolescent students. In Study 1, findings from exploratory factor analysis using 619 adolescents suggested a 24-item scale with four correlated factors--Family Concerns (9 items), Peer Concerns (5 items), Personal Concerns (6 items), and School Concerns (4 items). Initial estimates of convergent validity for ACM scores were also reported. The four-factor structure of ACM scores derived from Study 1 was confirmed via confirmatory factor analysis in Study 2 using a two-fold cross-validation procedure with a separate sample of 811 adolescents. Support was found for both the multidimensional and hierarchical models of adolescent concerns using the ACM. Internal consistency and test-retest reliability estimates were adequate for research purposes. ACM scores show promise as a reliable and potentially valid measure of Asian adolescents' concerns.
Psychometric assessment of HIV/STI sexual risk scale among MSM: a Rasch model approach.

PubMed

Li, Jian; Liu, Hongjie; Liu, Hui; Feng, Tiejian; Cai, Yumao

2011-10-05

Little research has assessed the degree of severity and ordering of different types of sexual behaviors for HIV/STI infection in a measurement scale. The purpose of this study was to apply the Rasch model on psychometric assessment of an HIV/STI sexual risk scale among men who have sex with men (MSM). A cross-sectional study using respondent driven sampling was conducted among 351 MSM in Shenzhen, China. The Rasch model was used to examine the psychometric properties of an HIV/STI sexual risk scale including nine types of sexual behaviors. The Rasch analysis of the nine items met the unidimensionality and local independence assumption. Although the person reliability was low at 0.35, the item reliability was high at 0.99. The fit statistics provided acceptable infit and outfit values. Item difficulty invariance analysis showed that the item estimates of the risk behavior items were invariant (within error). The findings suggest that the Rasch model can be utilized for measuring the level of sexual risk for HIV/STI infection as a single latent construct and for establishing the relative degree of severity of each type of sexual behavior in HIV/STI transmission and acquisition among MSM. The measurement scale provides a useful measurement tool to inform, design and evaluate behavioral interventions for HIV/STI infection among MSM.
Three-dimensional structural representation of the sleep-wake adaptability.

PubMed

Putilov, Arcady A

2016-01-01

Various characteristics of the sleep-wake cycle can determine the success or failure of individual adjustment to certain temporal conditions of today's society. However, it remains to be explored how many such characteristics can be self-assessed and how they are inter-related. The aim of the present report was to apply a three-dimensional structural representation of the sleep-wake adaptability in the form of a "rugby cake" (scalene or triaxial ellipsoid) to explain the results of analysis of the pattern of correlations of the responses to the initial 320-item list of a new inventory with scores on the six scales designed for multidimensional self-assessment of the sleep-wake adaptability (Morning and Evening Lateness, Anytime and Nighttime Sleepability, and Anytime and Daytime Wakeability). The results obtained for a sample consisting of 149 respondents were confirmed by the results of similar analysis of earlier collected responses of 139 respondents to the same list of 320 items and responses of 1213 respondents to the 72 items of one of the earlier established questionnaire tools. Empirical evidence was provided in support of the model-driven prediction that items can be identified for as many as 36 narrow (6 core and 30 mixed) adaptabilities of the sleep-wake cycle. The results enabled the selection of 168 items for self-assessment of all these adaptabilities predicted by the rugby cake model.
Leadership: validation of a self-report scale.

PubMed

Dussault, Marc; Frenette, Eric; Fernet, Claude

2013-04-01

The aim of this paper was to propose and test the factor structure of a new self-report questionnaire on leadership. A sample of 373 school principals in the Province of Quebec, Canada completed the initial 46-item version of the questionnaire. In order to obtain a questionnaire of minimal length, a four-step procedure was used. First, item analysis was performed using Classical Test Theory. Second, Rasch analysis was used to identify non-fitting or overlapping items. Third, a confirmatory factor analysis (CFA) using structural equation modelling was performed on the 21 remaining items to verify the factor structure of the scale. Results show that the model with a single third-order dimension (leadership), two second-order dimensions (transactional and transformational leadership), and one first-order dimension (laissez-faire leadership) provides a good fit to the data. Finally, invariance of factor structure was assessed with a second sample of 222 vice-principals in the Province of Quebec, Canada. This model is in agreement with the theoretical model developed by Bass (1985), upon which the questionnaire is based.
Detection and validation of unscalable item score patterns using item response theory: an illustration with Harter's Self-Perception Profile for Children.

PubMed

Meijer, Rob R; Egberink, Iris J L; Emons, Wilco H M; Sijtsma, Klaas

2008-05-01

We illustrate the usefulness of person-fit methodology for personality assessment. For this purpose, we use person-fit methods from item response theory. First, we give a nontechnical introduction to existing person-fit statistics. Second, we analyze data from Harter's (1985) Self-Perception Profile for Children in a sample of children ranging from 8 to 12 years of age (N = 611) and argue that for some children, the scale scores should be interpreted with care and caution. Combined information from person-fit indexes and from observation, interviews, and self-concept theory showed that similar score profiles may have a different interpretation. For some children in the sample, item scores did not adequately reflect their trait level. Based on teacher interviews, this was found to be due most likely to a less developed self-concept and/or problems understanding the meaning of the questions. We recommend investigating the scalability of score patterns when using self-report inventories to help the researcher interpret respondents' behavior correctly.
What's in a name? The challenge of describing interventions in systematic reviews: analysis of a random sample of reviews of non-pharmacological stroke interventions

PubMed Central

Hoffmann, Tammy C; Walker, Marion F; Langhorne, Peter; Eames, Sally; Thomas, Emma; Glasziou, Paul

2015-01-01

Objective To assess, in a sample of systematic reviews of non-pharmacological interventions, the completeness of intervention reporting, identify the most frequently missing elements, and assess review authors’ use of and beliefs about providing intervention information. Design Analysis of a random sample of systematic reviews of non-pharmacological stroke interventions; online survey of review authors. Data sources and study selection The Cochrane Library and PubMed were searched for potentially eligible systematic reviews and a random sample of these assessed for eligibility until 60 (30 Cochrane, 30 non-Cochrane) eligible reviews were identified. Data collection In each review, the completeness of the intervention description in each eligible trial (n=568) was assessed by 2 independent raters using the Template for Intervention Description and Replication (TIDieR) checklist. All review authors (n=46) were invited to complete a survey. Results Most reviews were missing intervention information for the majority of items. The most incompletely described items were: modifications, fidelity, materials, procedure and tailoring (missing from all interventions in 97%, 90%, 88%, 83% and 83% of reviews, respectively). Items that scored better, but were still incomplete for the majority of reviews, were: ‘when and how much’ (in 31% of reviews, adequate for all trials; in 57% of reviews, adequate for some trials); intervention mode (in 22% of reviews, adequate for all trials; in 38%, adequate for some trials); and location (in 19% of reviews, adequate for all trials). Of the 33 (71%) authors who responded, 58% reported having further intervention information but not including it, and 70% tried to obtain information. Conclusions Most focus on intervention reporting has been directed at trials. Poor intervention reporting in stroke systematic reviews is prevalent, compounded by poor trial reporting. Without adequate intervention descriptions, the conduct, usability and interpretation of reviews are restricted; this requires action by trialists, systematic reviewers, peer reviewers and editors. PMID:26576811
Components of a Measure to Describe Organizational Culture in Academic Pharmacy.

PubMed

Desselle, Shane; Rosenthal, Meagen; Holmes, Erin R; Andrews, Brienna; Lui, Julia; Raja, Leela

2017-12-01

Objective. To develop a measure of organizational culture in academic pharmacy and identify characteristics of an academic pharmacy program that would be impactful for internal (eg, students, employees) and external (eg, preceptors, practitioners) clients of the program. Methods. A three-round Delphi procedure of 24 panelists from pharmacy schools in the U.S. and Canada generated items based on the Organizational Culture Profile (OCP), which were then evaluated and refined for inclusion in subsequent rounds. Items were assessed for appropriateness and impact. Results. The panel produced 35 items across six domains that measured organizational culture in academic pharmacy: competitiveness, performance orientation, social responsibility, innovation, emphasis on collegial support, and stability. Conclusion. The items generated require testing for validation and reliability in a large sample to finalize this measure of organizational culture.
Developing a tool to assess motivation among health service providers working with public health system in India.

PubMed

Purohit, Bhaskar; Maneskar, Abhishek; Saxena, Deepak

2016-04-14

Addressing the shortage of health service providers (doctors and nurses) in rural health centres remains a huge challenge. The lack of motivation of health service providers to serve in rural areas is one of the major reasons for such shortage. While many studies have aimed at analysing the reasons for low motivation, hardly any studies in India have focused on developing valid and reliable tools to measure motivation among health service providers. Hence, the objective of the study was to test and develop a valid and reliable instrument to assess the motivation of health service providers working with the public health system in India and the extent to which the motivation factors included in the study motivate health service providers to perform better at work. The present study adapted an already developed tool on motivation. The reliability and validity of the tool were established using different methods. The first stage of the tool development involved content development and assessment where, after a detailed literature review, a predeveloped tool with 19 items was adapted. However, in light of the literature review and pilot test, the same tool was modified to suit the local context by adding 7 additional items so that the final modified tool comprised 26 items. A correlation matrix was applied to check the pattern of relationships among the items. The total sample size for the study was 154 health service providers from one Western state in India. To understand the sampling adequacy, the Kaiser-Meyer-Olkin measure of sampling adequacy and Bartlett's test of sphericity were applied and finally factor analysis was carried out to calculate the eigenvalues and to understand the relative impact of factors affecting motivation. A correlation matrix value of 0.017 was obtained, indicating multicollinearity among the observations. Based on initial factor analysis, 8 of the 26 items were excluded using a cutoff of less than 0.6. Running the factor analysis again suggested the inclusion of 18 items which were subsequently labelled under the following heads: transparency, goals, security, convenience, benefits, encouragement, adequacy of earnings and further growth and power. There is a great need to develop instruments aimed at assessing the motivation of health service providers. The instrument used in the study has good psychometric properties and may serve as a useful tool to assess motivation among healthcare providers.
Psychometric Properties of the Procrastination Assessment Scale-Student (PASS) in a Student Sample of Sabzevar University of Medical Sciences.

PubMed

Mortazavi, Forough; Mortazavi, Saideh S; Khosrorad, Razieh

2015-09-01

Procrastination is a common behavior which affects different aspects of life. The procrastination assessment scale-student (PASS) evaluates academic procrastination apropos its frequency and reasons. The aims of the present study were to translate, culturally adapt, and validate the Farsi version of the PASS in a sample of Iranian medical students. In this cross-sectional study, the PASS was translated into Farsi through the forward-backward method, and its content validity was thereafter assessed by a panel of 10 experts. The Farsi version of the PASS was subsequently distributed among 423 medical students. The internal reliability of the PASS was assessed using Cronbach's alpha. An exploratory factor analysis (EFA) was conducted on 18 items and then 28 items of the scale to find new models. The construct validity of the scale was assessed using both EFA and confirmatory factor analysis. The predictive validity of the scale was evaluated by calculating the correlation between the academic procrastination scores and the students' average scores in the previous semester. The reliability of the first and second parts of the scale was 0.781 and 0.861, respectively. An EFA on 18 items of the scale found 4 factors which jointly explained 53.2% of the variance. The model was marginally acceptable (root mean square error of approximation [RMSEA] = 0.098, standardized root mean square residual [SRMR] = 0.076, χ²/df = 4.8, comparative fit index [CFI] = 0.83). An EFA on 28 items of the scale found 4 factors which altogether explained 42.62% of the variance. The model was acceptable (RMSEA = 0.07, SRMR = 0.07, χ²/df = 2.8, incremental fit index = 0.90, CFI = 0.90). There was a negative correlation between the procrastination scores and the students' average scores (r = -0.131, P = 0.02). The Farsi version of the PASS is a valid and reliable tool to measure academic procrastination in Iranian undergraduate medical students.
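Several records in this listing, including the one above, report internal consistency as Cronbach's alpha. As a minimal sketch of how that coefficient is computed from an item-score matrix, the snippet below applies the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are illustrative assumptions and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    `scores` is a list of respondents, each a list of item scores of equal
    length. This helper is hypothetical, written only for illustration.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data (made up): three items that rank respondents identically.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(consistent)
```

With perfectly consistent items, as in the toy data, the coefficient reaches its maximum of 1; items that do not covary at all pull it toward 0.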
[Motivation for psychosomatic-psychotherapeutic treatment of vocational stresses -- development and validation of a questionnaire].

PubMed

Zwerenz, R; Knickenberg, R J; Schattenburg, L; Beutel, M E

2005-02-01

There is a lack of questionnaires assessing the motivation of inpatients to scrutinize occupational stresses and deal with them as part of their psychotherapeutic treatment. Work-related stress contributes significantly to the development of mental disorders. Vocational reintegration is an outcome criterion for the success of vocational rehabilitation. Patients are often not motivated to deal with occupational stresses during inpatient medical rehabilitation. Therefore it is necessary to assess patient motivation at the beginning of treatment, in order to assign them to specific interventions, e.g. promoting motivation. A questionnaire (Fragebogen zur berufsbezogenen Therapiemotivation -- FBTM) consisting of 84 items was developed, based on published questionnaires for psychotherapy motivation. The FBTM was administered to 283 psychosomatic rehabilitation inpatients, and the responses were subjected to item and factor analyses. Based on a second sample (n = 282), confirmatory factor analyses and validation of the questionnaire were carried out. Item and factor analyses revealed a four factor structure. 24 items constituted the subscales that could be described as "intention to change", "wish for pension", "negative treatment expectations" and "active coping". Reliability (Cronbach's Alpha) was satisfactory with coefficients between 0.69 and 0.87, and only low correlations could be found between the four subscales. Correlations with other measures were most pronounced for the subscale "intention to change". Some significant but low correlations could be reported between the FBTM and a standardized questionnaire of psychotherapy motivation (FMP). Confirmatory factor analyses of a second sample (n = 282) confirmed the original four factors. First evidence of sensitivity could be observed in a sample of patients who took part in an intervention promoting work-related therapy motivation during psychosomatic inpatient rehabilitation. The FBTM is a reliable and valid instrument assessing work-related therapy motivation of inpatients, as a relevant therapeutic measure in psychosomatic rehabilitation. Further validation, especially the analysis of predictive validity, is desirable.
Psychometric properties of stress and anxiety measures among nulliparous women.

PubMed

Bann, Carla M; Parker, Corette B; Grobman, William A; Willinger, Marian; Simhan, Hyagriv N; Wing, Deborah A; Haas, David M; Silver, Robert M; Parry, Samuel; Saade, George R; Wapner, Ronald J; Elovitz, Michal A; Miller, Emily S; Reddy, Uma M

2017-03-01

To examine the psychometric properties of three measures, the perceived stress scale (PSS), pregnancy experience scale (PES), and state trait anxiety inventory (STAI), for assessing stress and anxiety during pregnancy among a large sample of nulliparous women. The sample included 10,002 pregnant women participating in the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nMoM2b). Internal consistency reliability was assessed with Cronbach's alpha and factorial validity with confirmatory factor analyses. Intraclass correlations (ICCs) were calculated to determine stability of PSS scales over time. Psychometric properties were examined for the overall sample, as well as subgroups based on maternal age, race/ethnicity and language. All three scales demonstrated good internal consistency reliability. Confirmatory factor analyses supported the factor structures of the PSS and the PES. However, a one-factor solution of the trait-anxiety subscale from the STAI did not fit well; a two-factor solution, splitting the items into factors based on direction of item wording (positive versus negative) provided a better fit. Scores on the PSS were generally stable over time (ICC = 0.60). Subgroup analyses revealed a few items that did not perform well on Spanish versions of the scales. Overall, the scales performed well, suggesting they could be useful tools for identifying women experiencing high levels of stress and anxiety during pregnancy and allowing for the implementation of interventions to help reduce maternal stress and anxiety.
The factor structure of the Values in Action Inventory of Strengths (VIA-IS): An item-level exploratory structural equation modeling (ESEM) bifactor analysis.

PubMed

Ng, Vincent; Cao, Mengyang; Marsh, Herbert W; Tay, Louis; Seligman, Martin E P

2017-08-01

The factor structure of the Values in Action Inventory of Strengths (VIA-IS; Peterson & Seligman, 2004) has not been well established as a result of methodological challenges primarily attributable to a global positivity factor, item cross-loading across character strengths, and questions concerning the unidimensionality of the scales assessing character strengths. We sought to overcome these methodological challenges by applying exploratory structural equation modeling (ESEM) at the item level using a bifactor analytic approach to a large sample of 447,573 participants who completed the VIA-IS with all 240 character strengths items and a reduced set of 107 unidimensional character strength items. It was found that a 6-factor bifactor structure generally held for the reduced set of unidimensional character strength items; these dimensions were justice, temperance, courage, wisdom, transcendence, humanity, and an overarching general factor that is best described as dispositional positivity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Development and Psychometric Evaluation of the Hypoglycemia Perspectives Questionnaire in Patients with Type 2 Diabetes Mellitus.

PubMed

Kawata, Ariane K; Wilson, Hilary; Ong, Siew Hwa; Kulich, Karoly; Coyne, Karin

2016-10-01

The aim of this study was to evaluate the factor structure and psychometric characteristics of the Hypoglycemia Perspectives Questionnaire (HPQ) assessing experience and perceptions of hypoglycemia in patients with type 2 diabetes mellitus (T2DM). HPQ was administered to adults with T2DM in a clinical sample from Cyprus (HYPO-Cyprus, n = 500) and a community sample in the United States (US, n = 1257) from the 2011 US National Health and Wellness Survey. Demographic and clinical data were collected. Analysis of HPQ data from two convenience samples examined item performance, factor structure, and HPQ measurement properties (reliability, convergent validity, known-groups validity). Analyses supported three HPQ domains: symptom concern (six items), compensatory behavior (five items), and worry (five items). Internal consistency was high for all three domains (all ≥0.75), supporting reliability. Convergent validity was supported by moderate Spearman correlations between HPQ domain scores and the Audit of Diabetes-Dependent Quality of Life (ADDQoL-19) total score. Patients with recent hypoglycemia events had significantly higher HPQ scores, supporting known-group validity. HPQ may be a valid and reliable measure capturing the experience and impact of hypoglycemia and useful in clinical trials and community-based settings.
Confirmatory Factor and Rasch Analyses Support a Revised 14-Item Version of the Organizational, Policies, and Practices (OPP) Scale.

PubMed

Shi, Qiyun; MacDermid, Joy C; Tang, Kenneth; Sinden, Kathryn E; Walton, Dave; Grewal, Ruby

2017-06-01

Background The long version of the organizational, policies and practices (OPP) scale had a high burden, and short versions were developed to address this drawback. The 11-item version showed promise, but the ergonomic subscale was deficient. The OPP-14 was developed by adding three additional items to the ergonomics subscale. The aim of this study is to evaluate the factor structure using confirmatory factor and Rasch analyses in healthy firefighters. Methods A sample of 261 firefighters (mean age 42 years, 95% male) was studied. Confirmatory factor and Rasch analyses were used to assess the internal consistency, factor structure and other psychometric characteristics of the revised OPP-14. Results The OPP-14 demonstrates sound factor structure and internal consistency in firefighters. Confirmatory factor analysis confirmed the consistency of the original 4-domain structure (CFI = 0.97, TLI = 0.96, and RMSEA = 0.053). The 5 items that initially showed misfit with disordered thresholds were rescored. The four subscales satisfied Rasch expectations, with good targeting and acceptable reliability. Conclusions The OPP-14 scale shows a promising factor structure in this sample and remediated deficits found in OPP-11. This version may be preferable for musculoskeletal concerns or work applications where ergonomic indicators are relevant.

Assessment of nuclear anxiety among American students: Stability over time, secular trends, and emotional correlates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Newcomb, M.D.

1989-10-01

Studies of reactions and attitudes toward nuclear war have progressed from the use of anecdotal evidence to multi-item psychological measures. Additional psychometric data and substantive results of the Nuclear Attitudes Questionnaire (NAQ; Newcomb, 1986) are reported here. Data from three independent samples of students from the United States collected in 1984, 1986, and 1987 were compared and contrasted. The 1986 data were obtained immediately following the Chernobyl nuclear power plant accident. Test-retest reliability of the NAQ items and subscales was quite high and comparable among samples and established the across-time stability of the measure. There were several secular trends across years on items and subscales, indicating some increased concern about nuclear power (particularly in 1986), but also a general increase in nuclear concerns, fears, and anxiety. Anticipated sex differences were found on many of the NAQ items and subscales. Correlations between the NAQ subscales and the nine SCL-90-R scales (Derogatis, 1977) were consistent for the 1986 and 1987 samples. In latent variable analyses, a general factor of Emotional Distress was significantly correlated with a general factor of Nuclear Anxiety, as well as specifically with nuclear concern and fear for the future.
Assessing public speaking fear with the short form of the Personal Report of Confidence as a Speaker scale: confirmatory factor analyses among a French-speaking community sample.

PubMed

Heeren, Alexandre; Ceschi, Grazia; Valentiner, David P; Dethier, Vincent; Philippot, Pierre

2013-01-01

The main aim of this study was to assess the reliability and structural validity of the French version of the 12-item version of the Personal Report of Confidence as Speaker (PRCS), one of the most promising measurements of public speaking fear. A total of 611 French-speaking volunteers were administered the French versions of the short PRCS, the Liebowitz Social Anxiety Scale, the Fear of Negative Evaluation scale, as well as the Trait version of the Spielberger State-Trait Anxiety Inventory and the Beck Depression Inventory-II, which assess the level of anxious and depressive symptoms, respectively. Regarding its structural validity, confirmatory factor analyses indicated a single-factor solution, as implied by the original version. Good scale reliability (Cronbach's alpha = 0.86) was observed. The item discrimination analysis suggested that all the items contribute to the overall scale score reliability. The French version of the short PRCS showed significant correlations with the Liebowitz Social Anxiety Scale (r = 0.522), the Fear of Negative Evaluation scale (r = 0.414), the Spielberger State-Trait Anxiety Inventory (r = 0.516), and the Beck Depression Inventory-II (r = 0.361). The French version of the short PRCS is a reliable and valid measure for the evaluation of the fear of public speaking among a French-speaking sample. These findings have critical consequences for the measurement of psychological and pharmacological treatment effectiveness in public speaking fear among a French-speaking sample.
Assessing public speaking fear with the short form of the Personal Report of Confidence as a Speaker scale: confirmatory factor analyses among a French-speaking community sample

PubMed Central

Heeren, Alexandre; Ceschi, Grazia; Valentiner, David P; Dethier, Vincent; Philippot, Pierre

2013-01-01

Background: The main aim of this study was to assess the reliability and structural validity of the French version of the 12-item version of the Personal Report of Confidence as Speaker (PRCS), one of the most promising measurements of public speaking fear. Methods: A total of 611 French-speaking volunteers were administered the French versions of the short PRCS, the Liebowitz Social Anxiety Scale, the Fear of Negative Evaluation scale, as well as the Trait version of the Spielberger State-Trait Anxiety Inventory and the Beck Depression Inventory-II, which assess the level of anxious and depressive symptoms, respectively. Results: Regarding its structural validity, confirmatory factor analyses indicated a single-factor solution, as implied by the original version. Good scale reliability (Cronbach’s alpha = 0.86) was observed. The item discrimination analysis suggested that all the items contribute to the overall scale score reliability. The French version of the short PRCS showed significant correlations with the Liebowitz Social Anxiety Scale (r = 0.522), the Fear of Negative Evaluation scale (r = 0.414), the Spielberger State-Trait Anxiety Inventory (r = 0.516), and the Beck Depression Inventory-II (r = 0.361). Conclusion: The French version of the short PRCS is a reliable and valid measure for the evaluation of the fear of public speaking among a French-speaking sample. These findings have critical consequences for the measurement of psychological and pharmacological treatment effectiveness in public speaking fear among a French-speaking sample. PMID:23662060
Development of the Systems Thinking Scale for Adolescent Behavior Change.

PubMed

Moore, Shirley M; Komton, Vilailert; Adegbite-Adeniyi, Clara; Dolansky, Mary A; Hardin, Heather K; Borawski, Elaine A

2018-03-01

This report describes the development and psychometric testing of the Systems Thinking Scale for Adolescent Behavior Change (STS-AB). Following item development, initial assessments of understandability and stability of the STS-AB were conducted in a sample of nine adolescents enrolled in a weight management program. Exploratory factor analysis of the 16-item STS-AB and internal consistency assessments were then done with 359 adolescents enrolled in a weight management program. Test-retest reliability of the STS-AB was .71, p = .03; internal consistency reliability was .87. Factor analysis of the 16-item STS-AB indicated a one-factor solution with good factor loadings, ranging from .40 to .67. Evidence of construct validity was supported by significant correlations with established measures of variables associated with health behavior change. We provide beginning evidence of the reliability and validity of the STS-AB to measure systems thinking for health behavior change in young adolescents.
Development of the Systems Thinking Scale for Adolescent Behavior Change

PubMed Central

Moore, Shirley M.; Komton, Vilailert; Adegbite-Adeniyi, Clara; Dolansky, Mary A.; Hardin, Heather K.; Borawski, Elaine A.

2017-01-01

This report describes the development and psychometric testing of the Systems Thinking Scale for Adolescent Behavior Change (STS-AB). Following item development, initial assessments of understandability and stability of the STS-AB were conducted in a sample of nine adolescents enrolled in a weight management program. Exploratory factor analysis of the 16-item STS-AB and internal consistency assessments were then done with 359 adolescents enrolled in a weight management program. Test–retest reliability of the STS-AB was .71, p = .03; internal consistency reliability was .87. Factor analysis of the 16-item STS-AB indicated a one-factor solution with good factor loadings, ranging from .40 to .67. Evidence of construct validity was supported by significant correlations with established measures of variables associated with health behavior change. We provide beginning evidence of the reliability and validity of the STS-AB to measure systems thinking for health behavior change in young adolescents. PMID:28303755
Geriatric Anxiety Scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10).

PubMed

Mueller, Anne E; Segal, Daniel L; Gavett, Brandon; Marty, Meghan A; Yochim, Brian; June, Andrea; Coolidge, Frederick L

2015-07-01

The Geriatric Anxiety Scale (GAS; Segal, D. L., June, A., Payne, M., Coolidge, F. L. and Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709-714. doi:10.1016/j.janxdis.2010.05.002) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF) or measurement bias by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.
An assessment of functioning and non-functioning distractors in multiple-choice questions: a descriptive analysis.

PubMed

Tarrant, Marie; Ware, James; Mohammed, Ahmed M

2009-07-07

Four- or five-option multiple choice questions (MCQs) are the standard in health-science disciplines, both on certification-level examinations and on in-house developed tests. Previous research has shown, however, that few MCQs have three or four functioning distractors. The purpose of this study was to investigate non-functioning distractors in teacher-developed tests in one nursing program in an English-language university in Hong Kong. Using item-analysis data, we assessed the proportion of non-functioning distractors on a sample of seven test papers administered to undergraduate nursing students. A total of 514 items were reviewed, including 2056 options (1542 distractors and 514 correct responses). Non-functioning options were defined as ones that were chosen by fewer than 5% of examinees and those with a positive option discrimination statistic. The proportion of items containing 0, 1, 2, and 3 functioning distractors was 12.3%, 34.8%, 39.1%, and 13.8% respectively. Overall, items contained an average of 1.54 (SD = 0.88) functioning distractors. Only 52.2% (n = 805) of all distractors were functioning effectively and 10.2% (n = 158) had a choice frequency of 0. Items with more functioning distractors were more difficult and more discriminating. The low frequency of items with three functioning distractors in the four-option items in this study suggests that teachers have difficulty developing plausible distractors for most MCQs. Test items should consist of as many options as is feasible given the item content and the number of plausible distractors; in most cases this would be three. Item analysis results can be used to identify and remove non-functioning distractors from MCQs that have been used in previous tests.
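As an illustration of the rule described in the abstract above, the following Python sketch flags the functioning distractors of a single item. It assumes (this detail is not in the source) that option discrimination is a point-biserial correlation between choosing an option and the total test score. The function names and the toy data are hypothetical.

```python
from statistics import mean

def point_biserial(flags, scores):
    """Pearson correlation between a 0/1 flag and total test scores."""
    mf, ms = mean(flags), mean(scores)
    cov = sum((f - mf) * (s - ms) for f, s in zip(flags, scores))
    var_f = sum((f - mf) ** 2 for f in flags)
    var_s = sum((s - ms) ** 2 for s in scores)
    if var_f == 0 or var_s == 0:
        return 0.0  # option never (or always) chosen: no discrimination
    return cov / (var_f * var_s) ** 0.5

def functioning_distractors(choices, options, key, scores, min_share=0.05):
    """Return distractors chosen by at least `min_share` of examinees
    whose discrimination is not positive (rule taken from the abstract)."""
    functioning = []
    for opt in options:
        if opt == key:
            continue  # the correct response is not a distractor
        flags = [1 if c == opt else 0 for c in choices]
        share = sum(flags) / len(choices)
        if share >= min_share and point_biserial(flags, scores) <= 0:
            functioning.append(opt)
    return functioning

# Hypothetical item answered by 20 examinees; "A" is the key.
scores = list(range(1, 21))                  # total test scores
choices = ["B"] * 6 + ["A"] * 12 + ["D"] * 2
print(functioning_distractors(choices, ["A", "B", "C", "D"], "A", scores))
```

In this toy item, option C is never chosen and option D attracts only the highest scorers, so only B counts as functioning. The reported mean of 1.54 functioning distractors per item is consistent with the reported proportions: 0(0.123) + 1(0.348) + 2(0.391) + 3(0.138) ≈ 1.54.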
Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire

PubMed

Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra

2018-05-29

Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Methodological and cross sectional study. A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain.
Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire

PubMed Central

Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra

2018-01-01

Background: Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. Aims: To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Study Design: Methodological and cross sectional study. Methods: A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. Results: The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. Conclusion: The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain. PMID:29843496
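The two reliability statistics named in the record above (Cronbach's α and the correlation between two halves of the scale) can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the odd/even split of items and the toy response matrix are assumptions, and the function names are hypothetical.

```python
from statistics import mean, pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same length each)."""
    k = len(rows[0])
    item_vars = [pvariance([r[i] for r in rows]) for i in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_correlation(rows):
    """Correlate sum scores of odd- and even-positioned items
    (the odd/even split is an assumption made for this sketch)."""
    first = [sum(r[0::2]) for r in rows]
    second = [sum(r[1::2]) for r in rows]
    return pearson(first, second)

# Hypothetical responses: 5 respondents x 4 items.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]
print(round(cronbach_alpha(responses), 3))
print(round(split_half_correlation(responses), 3))
```

Population variances are used throughout; because α depends only on a ratio of variances, sample variances would give the same value.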
Factorial composition of the Aggression Questionnaire: a multi-sample study in Greek adults.

PubMed

Vitoratou, Silia; Ntzoufras, Ioannis; Smyrnis, Nikolaos; Stefanis, Nicholas C

2009-06-30

The primary aim of the current article was the evaluation of the factorial composition of the Aggression Questionnaire (AQ(29)) in the Greek population. The translated questionnaire was administered to the following three heterogeneous adult samples: a general population sample from Athens, a sample of young male conscripts and a sample of individuals facing problems related to substance use. Factor analysis highlighted a structure similar to the one proposed by Buss and Perry [Buss, A.F., Perry, M., 1992. The Aggression Questionnaire. Journal of Personality and Social Psychology 63, 452-459]. However, the refined 12-item version of Bryant and Smith [Bryant, F.B., Smith, B.D., 2001. Refining the architecture of aggression: a measurement model for the Buss-Perry Aggression Questionnaire. Journal of Research in Personality 35, 138-167] provided a better fit to our data. Therefore, the refined model was implemented in further analysis. Multiple group confirmatory factor analysis was applied in order to assess the variability of the 12-item AQ across gender and samples. The percentage of factor loading invariance between males and females and across the three samples defined above was high (higher than 75%). The reliability (internal consistency) of the scale was satisfactory in all cases. Content validity of the 12-item AQ was confirmed by comparison with the Symptom Check-List 90 Revised.
Do early changes in the HAM-D-17 anxiety/somatization factor items affect treatment outcome among depressed outpatients? Comparison of two controlled trials of St John’s Wort (Hypericum Perforatum) versus an SSRI

PubMed Central

Bitran, Stella; Farabaugh, Amy H; Ameral, Victoria E; LaRocca, Rachel A; Clain, Alisabet J; Fava, Maurizio; Mischoulon, David

2011-01-01

Objective: To assess whether early changes in HAM-D-17 anxiety/somatization items predict remission in two controlled studies of hypericum perforatum (St. John’s wort) versus an SSRI for major depressive disorder (MDD). Methods: The Hypericum Depression Trial Study Group (NIMH) study randomized 340 subjects to hypericum, sertraline, or placebo for 8 weeks. The MGH study randomized 135 subjects to hypericum, fluoxetine, or placebo for 12 weeks. We examined whether remission was associated with early changes in anxiety/somatization symptoms. Results: In the NIMH study, significant associations were observed between remission and early improvement in the anxiety-psychic item (sertraline arm), somatic-gastrointestinal item (hypericum arm), and somatic symptoms-general (placebo arm). None of the three treatment arms of the MGH study showed significant associations between anxiety/somatization symptoms and remission. When both study samples were pooled, we found associations for anxiety-psychic (SSRI arm), somatic-gastrointestinal and hypochondriasis (hypericum arm), and anxiety-psychic and somatic symptoms-general (placebo arm). In the entire sample, remission was associated with improvement in the anxiety-psychic, somatic-gastrointestinal, and somatic symptoms-general items. Conclusions: The number and type of anxiety/somatization items associated with remission varied depending on the intervention. Early scrutiny of the HAM-D-17 anxiety/somatization items may help predict remission of MDD. PMID:21278577
Using a Process Dissociation Approach to Assess Verbal Short-Term Memory for Item and Order Information in a Sample of Individuals with a Self-Reported Diagnosis of Dyslexia

PubMed Central

Wang, Xiaoli; Xuan, Yifu; Jarrold, Christopher

2016-01-01

Previous studies have examined whether difficulties in short-term memory for verbal information, that might be associated with dyslexia, are driven by problems in retaining either information about to-be-remembered items or the order in which these items were presented. However, such studies have not used process-pure measures of short-term memory for item or order information. In this work we adapt a process dissociation procedure to properly distinguish the contributions of item and order processes to verbal short-term memory in a group of 28 adults with a self-reported diagnosis of dyslexia and a comparison sample of 29 adults without a dyslexia diagnosis. In contrast to previous work that has suggested that individuals with dyslexia experience item deficits resulting from inefficient phonological representation and language-independent order memory deficits, the results showed no evidence of specific problems in short-term retention of either item or order information among the individuals with a self-reported diagnosis of dyslexia, despite this group showing expected difficulties on separate measures of word and non-word reading. However, there was some suggestive evidence of a link between order memory for verbal material and individual differences in non-word reading, consistent with other claims for a role of order memory in phonologically mediated reading. The data from the current study therefore provide empirical evidence to question the extent to which item and order short-term memory are necessarily impaired in dyslexia. PMID:26941679
Using a Process Dissociation Approach to Assess Verbal Short-Term Memory for Item and Order Information in a Sample of Individuals with a Self-Reported Diagnosis of Dyslexia.

PubMed

Wang, Xiaoli; Xuan, Yifu; Jarrold, Christopher

2016-01-01

Previous studies have examined whether difficulties in short-term memory for verbal information, that might be associated with dyslexia, are driven by problems in retaining either information about to-be-remembered items or the order in which these items were presented. However, such studies have not used process-pure measures of short-term memory for item or order information. In this work we adapt a process dissociation procedure to properly distinguish the contributions of item and order processes to verbal short-term memory in a group of 28 adults with a self-reported diagnosis of dyslexia and a comparison sample of 29 adults without a dyslexia diagnosis. In contrast to previous work that has suggested that individuals with dyslexia experience item deficits resulting from inefficient phonological representation and language-independent order memory deficits, the results showed no evidence of specific problems in short-term retention of either item or order information among the individuals with a self-reported diagnosis of dyslexia, despite this group showing expected difficulties on separate measures of word and non-word reading. However, there was some suggestive evidence of a link between order memory for verbal material and individual differences in non-word reading, consistent with other claims for a role of order memory in phonologically mediated reading. The data from the current study therefore provide empirical evidence to question the extent to which item and order short-term memory are necessarily impaired in dyslexia.
Development and preliminary evaluation of the OsteoArthritis Questionnaire (OA-Quest): a psychometric study.

PubMed

Busija, L; Buchbinder, R; Osborne, R H

2016-08-01

This study reports the development of the OsteoArthritis Questionnaire (OA-Quest) - a new measure designed to comprehensively capture the potentially modifiable burden of osteoarthritis. Item development was guided by the a priori conceptual framework of the Personal Burden of Osteoarthritis (PBO) which captures 8 dimensions of osteoarthritis burden (Physical distress, Fatigue, Physical limitations, Psychosocial distress, Physical de-conditioning, Financial hardship, Sleep disturbances, Lost productivity). One hundred and twenty three candidate items were pretested in a clinical sample of 18 osteoarthritis patients. The measurement properties of the OA-Quest were assessed with exploratory factor analysis (EFA), Rasch modelling, and confirmatory factor analysis (CFA) in a community-based sample (n = 792). EFA replicated 7 of the 8 PBO domains. An exception was PBO Fatigue domain, with items merging into the Physical distress subscale in the OA-Quest. Following item analysis, a 42-item 7-subscale questionnaire was constructed, measuring Physical distress (seven items, Cronbach's α = 0.93), Physical limitations (11 items, α = 0.95), Psychosocial distress (seven items, α = 0.93), Physical de-conditioning (four items, α = 0.87), Financial hardship (four items, α = 0.93), Sleep disturbances (five items, α = 0.96), and Lost productivity (four items, α = 0.90). A highly restricted 7-factor CFA model had excellent fit with the data (χ²(113) = 316.36, P < 0.001; chi-square/degrees of freedom = 2.8; comparative fit index [CFI] = 0.97; root mean square error of approximation [RMSEA] = 0.07), supporting construct validity of the new measure. The OA-Quest is a new measure of osteoarthritis burden that is founded on a comprehensive conceptual model. It has strong evidence of construct validity and provides reliable measurement across a broad range of osteoarthritis burden. Copyright © 2016 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
Prevalence of responsible hospitality policies in licensed premises that are associated with alcohol-related harm.

PubMed

Daly, Justine B; Campbell, Elizabeth M; Wiggers, John H; Considine, Robyn J

2002-06-01

This study aimed to determine the prevalence of responsible hospitality policies in a group of licensed premises associated with alcohol-related harm. During March 1999, 108 licensed premises with one or more police-identified alcohol-related incidents in the previous 3 months received a visit from a police officer. A 30-item audit checklist was used to determine the responsible hospitality policies being undertaken by each premises within eight policy domains: display required signage (three items); responsible host practices to prevent intoxication and under-age drinking (five items); written policies and guidelines for responsible service (three items); discouraging inappropriate promotions (three items); safe transport (two items); responsible management issues (seven items); physical environment (three items) and entry conditions (four items). No premises were undertaking all 30 items. Eighty per cent of the premises were undertaking 20 of the 30 items. All premises were undertaking at least 17 of the items. The proportion of premises undertaking individual items ranged from 16% to 100%. Premises were less likely to report having and providing written responsible hospitality documentation to staff, using door charges and having entry/re-entry rules. Significant differences between rural and urban premises were evident for four policies. Clubs were significantly more likely than hotels to have a written responsible service of alcohol policy and to clearly display codes of dress and conditions of entry. This study provides an indication of the extent and nature of responsible hospitality policies in a sample of licensed premises that are associated with a broad range of alcohol related harms. The finding that a large majority of such premises appear to adopt responsible hospitality policies suggests a need to assess the validity and reliability of tools used in the routine assessment of such policies, and of the potential for harm from licensed premises.
Parameter Estimation with Small Sample Size: A Higher-Order IRT Model Approach

ERIC Educational Resources Information Center

de la Torre, Jimmy; Hong, Yuan

2010-01-01

Sample size ranks as one of the most important factors that affect the item calibration task. However, due to practical concerns (e.g., item exposure) items are typically calibrated with much smaller samples than what is desired. To address the need for a more flexible framework that can be used in small sample item calibration, this article…
[SOMS-2: translation into portuguese of the screening for Somatoform Disorders].

PubMed

Fabião, Cristina; Costa E Silva, Carolina; Fleming, Manuela; Barbosa, António

2008-01-01

The diagnosis of Somatization Disorder (SD) requires the presence of somatic medically unexplained symptoms (MUS), which must be assessed so that organic diseases may be excluded. SOMS-2 is a self-report measure for SD that assesses medically unexplained symptoms by requiring participants to answer affirmatively, and so qualify a complaint as MUS, only if their doctor has given the opinion that the complaint is not due to an organic disease. According to the authors, the original SOMS-2 has good internal consistency (Cronbach's α = .87) and a good correlation between self-ratings and interview (r = .75). After obtaining the author's permission, translation from and into English was carried out by experienced translators. The resulting questionnaire was used on a small group of patients. Afterwards, the items that had been difficult to understand during the pretest were identified and experienced practitioners were asked for suggestions. The resulting version was answered by 123 primary health care patients (sample I). After some modifications of the SOMS-2, another group of 190 primary health care patients answered the questionnaire (sample II). Most patients in the first sample found it difficult to understand that, in order to answer affirmatively, it was necessary to answer three questions: 1) is the symptom present? 2) has your doctor found no clear causes for the symptom? 3) does the symptom affect your well-being? The difficulties in understanding items 21 and 45 (pre-test) were confirmed. Items 11, 28 and 38 were more easily understood when worded differently. In sample I, less than 5% of positive answers were given to items 20, 21, 23, 40, 43, 45, and 51. Probably because of the low education level of the Portuguese population, which this sample reflects, difficulties in carrying out the instructions given at the beginning made it advisable to modify the SOMS-2, so that the three implicit questions in each question of the SOMS-2 were divided into two columns (two explicit questions). At the same time, attention must still be paid to controlling the severity criterion (the third implicit question). After phase I, the items with an answer rate of less than 5% were eliminated. The majority of them coincide with the low answer rate items found by the authors of the original version. The next step is to study the internal consistency of the resulting version and the correlation between its self-ratings and interview results, in order to establish the validity of the SOMS-2 in these populations.
Assessing Mathematics Self-Efficacy: How Many Categories Do We Really Need?

ERIC Educational Resources Information Center

Toland, Michael D.; Usher, Ellen L.

2016-01-01

The present study tested whether a reduced number of categories is optimal for assessing mathematics self-efficacy among middle school students using a 6-point Likert-type format or a 0- to 100-point format. Two independent samples of middle school adolescents (N = 1,913) were administered a 24-item Middle School Mathematics Self-Efficacy Scale…
Modeling Nonignorable Missing Data with Item Response Theory (IRT). Research Report. ETS RR-10-11

ERIC Educational Resources Information Center

Rose, Norman; von Davier, Matthias; Xu, Xueli

2010-01-01

Large-scale educational surveys are low-stakes assessments of educational outcomes conducted using nationally representative samples. In these surveys, students do not receive individual scores, and the outcome of the assessment is inconsequential for respondents. The low-stakes nature of these surveys, as well as variations in average performance…
Students' Learning Assessment Practices Used by Jordanian Teachers of Mathematics for Grades (1-6)

ERIC Educational Resources Information Center

Abed, Eman Rasmi; Abu Awwad, Ferial Mohammad

2016-01-01

This study aims to investigate the students' learning assessment practices used by Jordanian teachers of mathematics for grades (1-6) in Amman. The sample of the study consists of (402) teachers. A questionnaire of (72) items is developed on four domains, namely: questions, homework, exams, and alternative strategies. Validity and reliability are…

The Danieli Inventory of Multigenerational Legacies of Trauma, Part I: Survivors' posttrauma adaptational styles in their children's eyes.

PubMed

Danieli, Yael; Norris, Fran H; Lindert, Jutta; Paisner, Vera; Engdahl, Brian; Richter, Julia

2015-09-01

A comprehensive valid behavioral measure for assessing multidimensional multigenerational impacts of massive trauma has been missing thus far. We describe the development of the Posttrauma Adaptational Styles questionnaire (Part I of the three-part Danieli Inventory of Multigenerational Legacies of Trauma), a self-report questionnaire of Holocaust survivors' children's perceptions of each parent and their own upbringing (60 items per parent). The items were based on literature and cognitive interviewing of 18 survivors' offspring. A web-based convenience sample survey was designed in English and Hebrew and completed by 482 adult children (M age = 59; 67% women) of Holocaust survivors. Exploratory factor analyses were conducted by using maximum likelihood extraction with Geomin rotation to examine the factor structure of the original 70 items for each parent. Conducted hierarchically, the analysis yielded three higher-order factors reflecting intensities of victim, numb, and fighter styles. The 30-item Victim Style Scale (α = .92-.93) and 18-item Numb Style Scale (α = .89) had excellent internal consistency; the consistency of the 12-item Fighter Style Scale (α = .69-.70) was more modest. English-Hebrew analyses suggested good-to-excellent congruence in factor structure (φ = .87-.99). Further research is needed to evaluate the validity of the measure in other samples and populations. Copyright © 2015 Elsevier Ltd. All rights reserved.
Screening for HIV-related PTSD: sensitivity and specificity of the 17-item Posttraumatic Stress Diagnostic Scale (PDS) in identifying HIV-related PTSD among a South African sample.

PubMed

Martin, L; Fincham, D; Kagee, A

2009-11-01

The identification of HIV-positive patients who exhibit criteria for Posttraumatic Stress Disorder (PTSD) and related trauma symptomatology is of clinical importance in the maintenance of their overall wellbeing. This study assessed the sensitivity and specificity of the 17-item Posttraumatic Stress Diagnostic Scale (PDS), a self-report instrument, in the detection of HIV-related PTSD. An adapted version of the PTSD module of the Composite International Diagnostic Interview (CIDI) served as the gold standard. 85 HIV-positive patients diagnosed with HIV within the year preceding data collection were recruited by means of convenience sampling from three HIV clinics within primary health care facilities in the Boland region of South Africa. A significant association was found between the 17-item PDS and the adapted PTSD module of the CIDI. A ROC curve analysis indicated that the 17-item PDS correctly discriminated between PTSD caseness and non-caseness 74.9% of the time. Moreover, a PDS cut-off point of ≥ 15 yielded adequate sensitivity (68%) and 1-specificity (65%). The 17-item PDS demonstrated a PPV of 76.0% and a NPV of 56.7%. The 17-item PDS can be used as a brief screening measure for the detection of HIV-related PTSD among HIV-positive patients in South Africa.
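The screening statistics named in the record above (sensitivity, specificity, PPV, NPV at a score cut-off) follow from a 2×2 table of screen result against gold standard. The Python sketch below shows the standard definitions. The function names and the example scores are hypothetical and are not data from the study; only the cut-off of 15 is taken from the abstract.

```python
def confusion_counts(scores, has_condition, cutoff=15):
    """Count screen-positive/negative results against a gold standard.
    A score >= cutoff counts as screen-positive."""
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, has_condition):
        positive = score >= cutoff
        if positive and truth:
            tp += 1
        elif positive and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def screening_metrics(tp, fp, fn, tn):
    """Standard screening statistics from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases detected
        "specificity": tn / (tn + fp),  # non-cases correctly cleared
        "ppv": tp / (tp + fp),          # screen-positives who are cases
        "npv": tn / (tn + fn),          # screen-negatives who are non-cases
    }

# Hypothetical scale scores and interview diagnoses (illustration only).
example_scores = [20, 15, 14, 3, 30, 8]
example_truth = [True, False, True, False, True, False]
print(screening_metrics(*confusion_counts(example_scores, example_truth)))
```

Raising the cut-off typically trades sensitivity for specificity, which is the trade-off a ROC curve summarizes across all possible cut-offs.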
Measurement properties of the CLOX Executive Clock Drawing Task in an inpatient stroke rehabilitation setting.

PubMed

Zuverza-Chavarria, Virginia; Tsanadis, John

2011-05-01

The goal of this study was to explore the psychometric properties of the CLOX Executive Clock Drawing Task (Royall, Cordes, & Polk, 1998) in persons who had sustained a stroke and were receiving inpatient rehabilitation. Rasch modeling was utilized to examine the psychometric properties of the CLOX. Separate analyses were conducted for the free draw (CLOX 1) and copy (CLOX 2) portions of the measure to investigate each presentation mode independently. The sample consisted of 66 inpatient adults who had sustained a stroke. CLOX 1 met most Rasch model expectations for item fit, unidimensionality, test reliability, and sample targeting. CLOX 2 was less psychometrically sound and contained two items with significant misfit. CLOX 2 demonstrated a significant ceiling effect that resulted in poor sample targeting. CLOX 1 is a psychometrically sound screening instrument for assessing persons with stroke receiving inpatient rehabilitation. In addition to the psychometric weaknesses of CLOX 2, its interpretive yield is minimal and clinicians may consider omitting it. Recommendations are made for using the Rasch item-person maps in clinical practice.
Answering the call: a tool that measures functional breast cancer literacy.

PubMed

Williams, Karen Patricia; Templin, Thomas N; Hines, Resche D

2013-01-01

There is a need for health care providers and health care educators to ensure that the messages they communicate are understood. The purpose of this research was to test the reliability and validity, in a culturally diverse sample of women, of a revised Breast Cancer Literacy Assessment Tool (Breast-CLAT) designed to measure functional understanding of breast cancer in English, Spanish, and Arabic. Community health workers verbally administered the 35-item Breast-CLAT to 543 Black, Latina, and Arab American women. A confirmatory factor analysis using a 2-parameter item response theory model was used to test the proposed 3-factor Breast-CLAT (awareness, screening and knowledge, and prevention and control). The confirmatory factor analysis using a 2-parameter item response theory model had a good fit (TLI = .91, RMSEA = .04) to the proposed 3-factor structure. The total scale reliability ranged from .80 for Black participants to .73 for the total culturally diverse sample. The three subscales were differentially predictive of family history of cancer. The revised Breast-CLAT scales demonstrated internal consistency reliability and validity in this multiethnic, community-based sample.
Is Using the Strengths and Difficulties Questionnaire in a Community Sample the Optimal Way to Assess Mental Health Functioning?

PubMed

Vaz, Sharmila; Cordier, Reinie; Boyes, Mark; Parsons, Richard; Joosten, Annette; Ciccarelli, Marina; Falkmer, Marita; Falkmer, Torbjorn

2016-01-01

An important characteristic of a screening tool is its discriminant ability, that is, the measure's accuracy in distinguishing between those with and without mental health problems. The current study examined the inter-rater agreement and screening concordance of the parent and teacher versions of the SDQ at scale, subscale and item levels, with a view to identifying the items that have the most informant discrepancies, and determining whether the concordance between parent and teacher reports on some items has the potential to influence decision making. Cross-sectional data from parent and teacher reports of the mental health functioning of a community sample of 299 students with and without disabilities from 75 different primary schools in Perth, Western Australia were analysed. The study found that: a) Intraclass correlations between parent and teacher ratings of children's mental health using the SDQ were fair at the individual child level; b) The SDQ only demonstrated clinical utility when there was agreement between teacher and parent reports using the possible or 90% dichotomisation system; and c) Three individual items had positive likelihood ratio scores indicating clinical utility. Of note was the finding that the negative likelihood ratio, or likelihood of disregarding the absence of a condition when both parents and teachers rate the item as absent, was not significant. Taken together, these findings suggest that the SDQ is not optimised for use in community samples and that further psychometric evaluation of the SDQ in this context is clearly warranted.
Landscape effects on diets of two canids in Northwestern Texas: A multinomial modeling approach

USGS Publications Warehouse

Lemons, P.R.; Sedinger, J.S.; Herzog, M.P.; Gipson, P.S.; Gilliland, R.L.

2010-01-01

Analyses of feces, stomach contents, and regurgitated pellets are common techniques for assessing diets of vertebrates and typically contain more than 1 food item per sampling unit. When analyzed, these individual food items have traditionally been treated as independent, which represents pseudoreplication. When food types are recorded as present or absent, these samples can be treated as multinomial vectors of food items, with each vector representing 1 realization of a possible diet. We suggest such data have a similar structure to capture histories for closed-capture, capture-mark-recapture data. To assess the effects of landscapes and presence of a potential competitor, we used closed-capture models implemented in program MARK to analyze diet data generated from feces of swift foxes (Vulpes velox) and coyotes (Canis latrans) in northwestern Texas. The best models of diet contained season and location for both swift foxes and coyotes, but year accounted for less variation, suggesting that landscape type is an important predictor of diets of both species. Models containing the effect of coyote reduction were not competitive (ΔQAICc 53.6685), consistent with the hypothesis that presence of coyotes did not influence diet of swift foxes. Our findings suggest that landscape type may have important influences on diets of both species. We believe that multinomial models represent an effective approach to assess hypotheses when diet studies have a data structure similar to ours. © 2010 American Society of Mammalogists.
Assessing Self-Efficacy for Coping with Cancer: Development and Psychometric Analysis of the Brief Version of the Cancer Behavior Inventory (CBI-B)

PubMed Central

Heitzmann, Carolyn A.; Merluzzi, Thomas V.; Jean-Pierre, Pascal; Roscoe, Joseph A.; Kirsh, Kenneth L.; Passik, Steven D.

2010-01-01

Introduction The Cancer Behavior Inventory-Brief Version (CBI-B), a 12-item measure of self-efficacy for coping with cancer derived from the longer 33-item version (CBI-L), was subjected to psychometric analysis. Method Participants consisted of three samples: 735 cancer patients from a multi-center CCOP study, 199 from central Indiana, and 370 from a national sample. Samples were mixed with respect to initial cancer diagnosis. Participants completed the CBI-B and measures of quality of life, optimism, life satisfaction, depression, and sickness impact. Results EFA with oblique rotation yielded four factors in the first sample: 1) Maintaining Independence and Positive Attitude; 2) Participating in Medical Care; 3) Coping and Stress Management; and 4) Managing Affect, which were confirmed in subsequent samples. Cronbach α coefficient for the 12-item CBI-B ranged from .84 to .88. Validity of the CBI-B was demonstrated by positive correlations with measures of quality of life and optimism and negative correlations with measures of depression and sickness impact. Discussion The CBI-B is a valid brief measure of self-efficacy for coping that could be easily integrated into clinical oncology research and practice and also be used in screening patients. PMID:11351373
The Mindful Attention Awareness Scale: Further Examination of Dimensionality, Reliability, and Concurrent Validity Estimates.

PubMed

Osman, Augustine; Lamis, Dorian A; Bagge, Courtney L; Freedenthal, Stacey; Barnes, Sean M

2016-01-01

We examined the factor structure and psychometric properties of the Mindful Attention Awareness Scale (MAAS) in a sample of 810 undergraduate students. Using common exploratory factor analysis (EFA), we obtained evidence for a 1-factor solution (41.84% common variance). To confirm unidimensionality of the 15-item MAAS, we conducted a 1-factor confirmatory factor analysis (CFA). Results of the EFA and CFA, respectively, provided support for a unidimensional model. Using differential item functioning analysis methods within item response theory modeling (IRT-based DIF), we found that individuals with high and low levels of nonattachment responded similarly to the MAAS items. Following a detailed item analysis, we proposed a 5-item short version of the instrument and present descriptive statistics and composite score reliability for the short and full versions of the MAAS. Finally, correlation analyses showed that scores on the full and short versions of the MAAS were associated with measures assessing related constructs. The 5-item MAAS is as useful as the original MAAS in enhancing our understanding of the mindfulness construct.
Validation of the school lunch recall questionnaire to capture school lunch intake of third- to fifth-grade students.

PubMed

Paxton, Amy; Baxter, Suzanne Domel; Fleming, Phyllis; Ammerman, Alice

2011-03-01

Children's dietary intake is a key variable in evaluations of school-based interventions. Current methods for assessing children's intake, such as 24-hour recalls and meal observations, are time- and resource-intensive. As part of a study to evaluate the impact of farm-to-school programs, the school lunch recall was developed from a need for a valid and efficient tool to assess school lunch intake among large samples of children. A self-administered paper-and-pencil questionnaire, the school lunch recall prompts for school lunch items by asking children whether they chose a menu item, how much of it they ate, how much they liked it, and whether they would choose it again. The school lunch recall was validated during summer school in 2008 with 18 third- to fifth-grade students (8 to 11 years old) in a North Carolina elementary school. For 4 consecutive days, trained observers recorded foods and amounts students ate during school lunch. Students completed the school lunch recall immediately after lunch. Thirty-seven total observation school lunch recall sets were analyzed. Comparison of school lunch recalls against observations indicated high accuracy, with means of 6% for omission rate (items observed but unreported), 10% for intrusion rate (items unobserved but reported), and 0.63 servings for total inaccuracy (a measure that combines errors for reporting items and amounts). For amounts, accuracy was high for matches (0.06 and 0.01 servings for absolute and arithmetic differences, respectively) but lower for omissions (0.47 servings) and intrusions (0.54 servings). In this pilot study, the school lunch recall was a valid, efficient tool for assessing school lunch intake for a small sample of third- to fifth-grade students. Copyright © 2011 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
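The omission and intrusion rates reported in this abstract can be illustrated with a minimal Python sketch. It assumes one common convention: omission rate is omissions divided by omissions plus matches, and intrusion rate is intrusions divided by intrusions plus matches. The abstract does not state its denominators, and the function name and food items below are hypothetical.

```python
def omission_intrusion_rates(observed, reported):
    """Return (omission %, intrusion %) for one observation/recall pair.

    Assumed convention (not stated in the abstract):
      omission rate  = omissions  / (omissions  + matches) * 100
      intrusion rate = intrusions / (intrusions + matches) * 100
    """
    observed, reported = set(observed), set(reported)
    matches = observed & reported      # observed and reported
    omissions = observed - reported    # observed but unreported
    intrusions = reported - observed   # reported but unobserved

    omit_den = len(omissions) + len(matches)
    intr_den = len(intrusions) + len(matches)
    omission_rate = 100 * len(omissions) / omit_den if omit_den else 0.0
    intrusion_rate = 100 * len(intrusions) / intr_den if intr_den else 0.0
    return omission_rate, intrusion_rate


# Hypothetical lunch: four items observed, three reported, two in common.
rates = omission_intrusion_rates(
    observed=["milk", "apple", "pizza", "carrots"],
    reported=["milk", "pizza", "cookie"],
)
print(rates)
```

Lower values on both rates indicate a more accurate recall; a study would average them over all observation and recall pairs.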
Development of the Attitudes to Domestic Violence Questionnaire for Children and Adolescents.

PubMed

Fox, Claire L; Gadd, David; Sim, Julius

2015-09-01

To provide a more robust assessment of the effectiveness of a domestic abuse prevention education program, a questionnaire was developed to measure children's attitudes to domestic violence. The aim was to develop a short questionnaire that would be easy to use for practitioners but, at the same time, sensitive enough to pick up on subtle changes in young people's attitudes. We therefore chose to ask children about different situations in which they might be willing to condone domestic violence. In Study 1, we tested a set of 20 items, which we reduced by half to a set of 10 items. The factor structure of the scale was explored and its internal consistency was calculated. In Study 2, we tested the factor structure of the 10-item Attitudes to Domestic Violence (ADV) Scale in a separate calibration sample. Finally, in Study 3, we then assessed the test-retest reliability of the 10-item scale. The ADV Questionnaire is a promising tool to evaluate the effectiveness of domestic abuse education prevention programs. However, further development work is necessary. © The Author(s) 2014.
The initial development of the WebMedQual scale: domain assessment of the construct of quality of health web sites.

PubMed

Provost, Mélanie; Koompalum, Dayin; Dong, Diane; Martin, Bradley C

2006-01-01

To develop a comprehensive instrument assessing quality of health-related web sites. Phase I consisted of a literature review to identify constructs thought to indicate web site quality and to identify items. During content analysis, duplicate items were eliminated and items that were not clear, meaningful, or measurable were reworded or removed. Some items were generated by the authors. Phase II: a panel consisting of six healthcare and MIS reviewers was convened to assess each item for its relevance and importance to the construct and to assess item clarity and measurement feasibility. Three hundred and eighty-four items were generated from 26 sources. The initial content analysis reduced the scale to 104 items. Four of the six expert reviewers responded; high concordance on the relevance, importance and measurement feasibility of each item was observed: 3 out of 4, or all raters agreed on 76-85% of items. Based on the panel ratings, 9 items were removed, 3 added, and 10 revised. The WebMedQual consists of 8 categories, 8 sub-categories, 95 items and 3 supplemental items to assess web site quality. The constructs are: content (19 items), authority of source (18 items), design (19 items), accessibility and availability (6 items), links (4 items), user support (9 items), confidentiality and privacy (17 items), e-commerce (6 items). The "WebMedQual" represents a first step toward a comprehensive and standard quality assessment of health web sites. This scale will allow relatively easy assessment of quality with possible numeric scoring.
A Psychometric Analysis of the Italian Version of the eHealth Literacy Scale Using Item Response and Classical Test Theory Methods

PubMed Central

Dima, Alexandra Lelia; Schulz, Peter Johannes

2017-01-01

Background The eHealth Literacy Scale (eHEALS) is a tool to assess consumers’ comfort and skills in using information technologies for health. Although evidence exists of reliability and construct validity of the scale, less agreement exists on structural validity. Objective The aim of this study was to validate the Italian version of the eHealth Literacy Scale (I-eHEALS) in a community sample with a focus on its structural validity, by applying psychometric techniques that account for item difficulty. Methods Two Web-based surveys were conducted among a total of 296 people living in the Italian-speaking region of Switzerland (Ticino). After examining the latent variables underlying the observed variables of the Italian scale via principal component analysis (PCA), fit indices for two alternative models were calculated using confirmatory factor analysis (CFA). The scale structure was examined via parametric and nonparametric item response theory (IRT) analyses accounting for differences between items regarding the proportion of answers indicating high ability. Convergent validity was assessed by correlations with theoretically related constructs. Results CFA showed a suboptimal model fit for both models. IRT analyses confirmed all items measure a single dimension as intended. Reliability and construct validity of the final scale were also confirmed. The contrasting results of factor analysis (FA) and IRT analyses highlight the importance of considering differences in item difficulty when examining health literacy scales. Conclusions The findings support the reliability and validity of the translated scale and its use for assessing Italian-speaking consumers’ eHealth literacy. PMID:28400356
Assessing quality of maternity care in Hungary: expert validation and testing of the mother-centered prenatal care (MCPC) survey instrument.

PubMed

Rubashkin, Nicholas; Szebik, Imre; Baji, Petra; Szántó, Zsuzsa; Susánszky, Éva; Vedam, Saraswathi

2017-11-16

Instruments to assess quality of maternity care in Central and Eastern European (CEE) region are scarce, despite reports of poor doctor-patient communication, non-evidence-based care, and informal cash payments. We validated and tested an online questionnaire to study maternity care experiences among Hungarian women. Following literature review, we collated validated items and scales from two previous English-language surveys and adapted them to the Hungarian context. An expert panel assessed items for clarity and relevance on a 4-point ordinal scale. We calculated item-level Content Validation Index (CVI) scores. We designed 9 new items concerning informal cash payments, as well as 7 new "model of care" categories based on mode of payment. The final questionnaire (N = 111 items) was tested in two samples of Hungarian women, representative (N = 600) and convenience (N = 657). We conducted bivariate analysis and thematic analysis of open-ended responses. Experts rated pre-existing English-language items as clear and relevant to Hungarian women's maternity care experiences with an average CVI for included questions of 0.97. Significant differences emerged across the model of care categories in terms of informal payments, informed consent practices, and women's perceptions of autonomy. Thematic analysis (N = 1015) of women's responses identified 13 priority areas of the maternity care experience, 9 of which were addressed by the questionnaire. We developed and validated a comprehensive questionnaire that can be used to evaluate respectful maternity care, evidence-based practice, and informal cash payments in CEE region and beyond.
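As a rough illustration of the item-level Content Validation Index mentioned in this abstract, the sketch below computes the proportion of experts who rate an item as relevant. It assumes the widely used convention that ratings of 3 or 4 on the 4-point scale count as relevant; the abstract does not state the threshold, and the function name and ratings are hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level CVI: share of expert ratings at or above relevant_min.

    ratings: one rating per expert on a 4-point ordinal scale (1 to 4).
    relevant_min=3 is an assumed convention, not taken from the abstract.
    """
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(r >= relevant_min for r in ratings) / len(ratings)


# Hypothetical panel of five experts rating one item.
print(item_cvi([4, 4, 3, 2, 4]))  # 4 of 5 experts rate it 3 or 4 -> 0.8
```

Averaging the item-level values over the retained items gives a scale-level figure comparable to the average reported for the included questions.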
Reliability and validity of the 12-item WHODAS 2.0 in patients with Kashin-Beck disease.

PubMed

Younus, Mohammad Imran; Wang, Di-Miao; Yu, Fang-Fang; Fang, Hua; Guo, Xiong

2017-09-01

The purpose of this study was to check the reliability and validity of the 12-item Chinese version of the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) for the assessment of disability in patients with Kashin-Beck disease (KBD). We recruited 219 patients with KBD from the high-risk KBD area in the Shaanxi province, using stratified multistage random sampling. We assessed each patient using the Chinese version of the 12-item WHODAS 2.0 and the Western Ontario and McMaster Universities Index of Osteoarthritis (WOMAC). Statistical evaluations of the instruments consisted of Cronbach's alpha, intraclass correlation coefficient (ICC), confirmatory factor analysis (CFA), and Pearson's correlation coefficient. Cronbach's alpha and ICC for the six domains ranged from 0.704 to 0.906 and 0.690 to 0.852, respectively. A six-factor structure fits the data well (CFI = 0.967, TLI = 0.944, RMSEA = 0.08). Regarding convergent validity, the four domains of the 12-item WHODAS 2.0 (getting around, self-care, life activity, and participation) showed moderate-to-strong correlation for all three domains of the WOMAC (0.428 < |r| < 0.804). Regarding divergent validity, the two domains of the 12-item WHODAS 2.0 (understanding and communication, and getting along with people) showed weak correlation for the three domains of WOMAC (0.182 < |r| < 0.295). The Chinese version of 12-item WHODAS 2.0 questionnaire is a reliable and valid instrument when administered to KBD patients.
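Cronbach's alpha, reported here for each domain, can be sketched from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The example below is a minimal illustration using the Python standard library; the function name and the response matrix are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: list of respondents, each a list of k item scores.
    Uses sample variances for both item and total scores.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance(col) for col in zip(*rows)]  # per-item variance
    total_var = variance([sum(r) for r in rows])       # variance of sums
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: five respondents, four items.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(responses), 3))
```

Values near 1 indicate that the items vary together; with only two items per domain, as in a 12-item, six-domain instrument, alpha is sensitive to the correlation of that single pair.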
Validity of self-reported adult secondhand smoke exposure

PubMed Central

Prochaska, Judith J; Grossman, William; Young-Wolff, Kelly C; Benowitz, Neal L

2015-01-01

Objectives Exposure of adults to secondhand smoke (SHS) has immediate adverse effects on the cardiovascular system and causes coronary heart disease. The current study evaluated brief self-report screening measures for accurately identifying adult cardiology patients with clinically significant levels of SHS exposure in need of intervention. Design and setting A cross-sectional study conducted in a university-affiliated cardiology clinic and cardiology inpatient service. Patients Participants were 118 non-smoking patients (59% male, mean age=63.6 years, SD=16.8) seeking cardiology services. Main outcome measures Serum cotinine levels and self-reported SHS exposure in the past 24 h and 7 days on 13 adult secondhand exposure to smoke (ASHES) items. Results A single item assessment of SHS exposure in one’s own home in the past 7 days was significantly correlated with serum cotinine levels (r=0.41, p<0.001) with sensitivity ≥75%, specificity >85% and correct classification rates >85% at cotinine cut-off points of >0.215 and >0.80 ng/mL. The item outperformed multi-item scales, an assessment of home smoking rules, and SHS exposure assessed in other residential areas, automobiles and public settings. The sample was less accurate at self-reporting lower levels of SHS exposure (cotinine 0.05–0.215 ng/mL). Conclusions The single item ASHES-7d Home screener is brief, assesses recent SHS exposure over a week’s time, and yielded the optimal balance of sensitivity and specificity. The current findings support use of the ASHES-7d Home screener to detect SHS exposure and can be easily incorporated into assessment of other major vital signs in cardiology. PMID:23997071
Performance Characteristics of Middle-Class and Lower-Class Preschool Children on the Stanford-Binet, 1960 Revision.

ERIC Educational Resources Information Center

Meyer, William J.; Goldstein, David

The relative difficulty levels of Stanford-Binet items between the ages of four and six among prekindergarten Head Start children were studied. A comparison sample of prekindergarten white middle class children was included to evaluate the age norms on a culturally typical sample of children and to assess performance on the Binet as it might…
From Minnesota to Cairo: Student Perceptions of Community-Based Learning

ERIC Educational Resources Information Center

Ibrahim, Mona M.; Rosenheim, Marnie R.; Amer, Mona M.; Larson, Haley A.

2016-01-01

This study explored perceptions of community-based learning in a sample of 176 students at a liberal arts college in Cairo, Egypt, and a sample of 176 students at a liberal arts college in the Midwestern United States. Students responded to a 38-item rating scale that assessed gains in several domains as a result of engaging in community-based…
The breastfeeding self-efficacy scale: psychometric assessment of the short form.

PubMed

Dennis, Cindy-Lee

2003-01-01

The purpose of this study was to reduce the number of items on the original Breastfeeding Self-Efficacy Scale (BSES) and psychometrically assess the revised BSES-Short Form (BSES-SF). As part of a longitudinal study, participants completed mailed questionnaires at 1, 4, and 8 weeks postpartum. Health region in British Columbia. A population-based sample of 491 breastfeeding mothers. BSES, Edinburgh Postnatal Depression Scale, Rosenberg Self-Esteem Scale, and Perceived Stress Scale. Internal consistency statistics with the original BSES suggested item redundancy. As such, 18 items were deleted, using explicit reduction criteria. Based on the encouraging reliability analysis of the new 14-item BSES-SF, construct validity was assessed using principal components factor analysis, comparison of contrasted groups, and correlations with measures of similar constructs. Support for predictive validity was demonstrated through significant mean differences between breastfeeding and bottle feeding mothers at 4 (p < .001) and 8 (p < .001) weeks postpartum. Demographic response patterns suggested the BSES-SF is a unique tool to identify mothers at risk of prematurely discontinuing breastfeeding. These psychometric results indicate the BSES-SF is an excellent measure of breastfeeding self-efficacy and considered ready for clinical use to (a) identify breastfeeding mothers at high risk, (b) assess breastfeeding behaviors and cognitions to individualize confidence-building strategies, and (c) evaluate the effectiveness of various interventions and guide program development.
Longitudinal Measurement Invariance of Posttraumatic Stress Disorder in Deployed Marines.

PubMed

Contractor, Ateka A; Bolton, Elisa; Gallagher, Matthew W; Rhodes, Charla; Nash, William P; Litz, Brett

2017-06-01

The meaningful interpretation of longitudinal study findings requires temporal stability of the constructs assessed (i.e., measurement invariance). We sought to examine measurement invariance of the construct of posttraumatic stress disorder (PTSD) as based on the Diagnostic and Statistical Manual of Mental Disorders indexed by the PTSD Checklist (PCL) and the Clinician-Administered PTSD Scale (CAPS) in a sample of 834 Marines with significant combat experience. PTSD was assessed 1-month predeployment (T0), and again at 1-month (T1), 5-months (T2), and 8-months postdeployment (T3). We tested configural (pattern of item/parcel loadings), metric (item/parcel loadings on latent factors), and scalar (item/parcel-level severity) invariance and explored sources of measurement instability (partial invariance testing). The T0 best-fitting emotional numbing model factor structure informed the conceptualization of PTSD's latent factors and parcel formations. We found (1) scalar noninvariance for the construct of PTSD as measured by the PCL and the CAPS, and for PTSD symptom clusters as assessed by the CAPS; and (2) metric noninvariance for PTSD symptom clusters as measured by the PCL. Exploratory analyses revealed factor-loading and intercept differences from pre- to postdeployment for avoidance symptoms, numbing symptoms (mainly psychogenic amnesia and foreshortened future), and the item assessing startle, each of which reduced construct stability. Implications of these findings for longitudinal studies of PTSD are discussed. Copyright © 2017 International Society for Traumatic Stress Studies.
Reliability and One-Year Stability of the PIN3 Neighborhood Environmental Audit in Urban and Rural Neighborhoods.

PubMed

Porter, Anna K; Wen, Fang; Herring, Amy H; Rodríguez, Daniel A; Messer, Lynne C; Laraia, Barbara A; Evenson, Kelly R

2018-06-01

Reliable and stable environmental audit instruments are needed to successfully identify the physical and social attributes that may influence physical activity. This study described the reliability and stability of the PIN3 environmental audit instrument in both urban and rural neighborhoods. Four randomly sampled road segments in and around a one-quarter mile buffer of participants' residences from the Pregnancy, Infection, and Nutrition (PIN3) study were rated twice, approximately 2 weeks apart. One year later, 253 of the year 1 sampled roads were re-audited. The instrument included 43 measures that resulted in 73 item scores for calculation of percent overall agreement, kappa statistics, and log-linear models. For same-day reliability, 81% of items had moderate to outstanding kappa statistics (kappas ≥ 0.4). Two-week reliability was slightly lower, with 77% of items having moderate to outstanding agreement using kappa statistics. One-year stability had 68% of items showing moderate to outstanding agreement using kappa statistics. The reliability of the audit measures was largely consistent when comparing urban to rural locations, with only 8% of items exhibiting significant differences (α < 0.05) by urbanicity. The PIN3 instrument is a reliable and stable audit tool for studies assessing neighborhood attributes in urban and rural environments.
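The percent overall agreement and kappa statistics used in this study can be illustrated for a single audit item rated on two occasions. The sketch below implements unweighted Cohen's kappa from its textbook definition; whether the study used weighted or unweighted kappa is not stated in the abstract, and the function names and ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(first, second):
    """Share of segments given the same rating on both audits."""
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(first)
    p_o = percent_agreement(first, second)
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal rating frequencies.
    p_e = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    if p_e == 1:  # both audits used one identical category throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical presence/absence ratings for ten road segments.
audit_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
audit_2 = [1, 1, 0, 0, 1, 0, 0, 1, 1, 0]
print(percent_agreement(audit_1, audit_2), round(cohens_kappa(audit_1, audit_2), 2))
```

Here the two audits agree on 8 of 10 segments, and chance agreement is 0.5, so kappa is 0.6, which would clear the 0.4 threshold the abstract uses for moderate agreement.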

Development, content validity, and cross-cultural adaptation of a patient-reported outcome measure for real-time symptom assessment in irritable bowel syndrome.

PubMed

Vork, L; Keszthelyi, D; Mujagic, Z; Kruimel, J W; Leue, C; Pontén, I; Törnblom, H; Simrén, M; Albu-Soda, A; Aziz, Q; Corsetti, M; Holvoet, L; Tack, J; Rao, S S; van Os, J; Quetglas, E G; Drossman, D A; Masclee, A A M

2018-03-01

End-of-day questionnaires, which are considered the gold standard for assessing abdominal pain and other gastrointestinal (GI) symptoms in irritable bowel syndrome (IBS), are influenced by recall and ecological bias. The experience sampling method (ESM) is characterized by random and repeated assessments in the natural state and environment of a subject, and thereby overcomes these limitations. This report describes the development of a patient-reported outcome measure (PROM) based on the ESM principle, taking into account content validity and cross-cultural adaptation. Focus group interviews with IBS patients and expert meetings with international experts in the fields of neurogastroenterology & motility and pain were performed in order to select the items for the PROM. Forward-and-back translation and cognitive interviews were performed to adapt the instrument for use in different countries and to ensure patients' understanding of the final items. Focus group interviews revealed 42 items, categorized into five domains: physical status, defecation, mood and psychological factors, context and environment, and nutrition and drug use. Experts reduced the number of items to 32, and cognitive interviewing after translation resulted in a few slight adjustments regarding linguistic issues, but not regarding content of the items. An ESM-based PROM, suitable for momentary assessment of IBS symptom patterns, was developed, taking into account content validity and cross-cultural adaptation. This PROM will be implemented in a specifically designed smartphone application and further validation in a multicenter setting will follow. © 2017 John Wiley & Sons Ltd.
The Brief Early Childhood Screening Assessment: Preliminary Validity in Pediatric Primary Care.

PubMed

Fallucco, Elise M; Wysocki, Tim; James, Lauren; Kozikowski, Chelsea; Williams, Andre; Gleason, Mary M

Brief, well-validated instruments are needed to facilitate screening for early childhood behavioral and emotional problems (BEPs). The objectives of this study were to empirically reduce the length of the Early Childhood Screening Assessment (ECSA) and to assess the validity and reliability of this shorter tool. Using caregiver ECSA responses for 2467 children aged 36 to 60 months seen in primary care, individual ECSA items were ranked on a scale ranging from "absolutely retain" to "absolutely delete." Items were deleted sequentially beginning with "absolutely delete" and going up the item prioritization list, resulting in 35 shorter versions of the ECSA. A separate primary care sample (n = 69) of mothers of children aged 18 to 60 months was used to determine the sensitivity and specificity of each shorter ECSA version using psychiatric diagnosis on the Diagnostic Infant and Preschool Assessment as the gold standard. The version with the optimal balance of sensitivity, specificity, and length was selected as the Brief ECSA. Associations between Brief ECSA scores and other pertinent measures were evaluated to estimate reliability and validity. A 22-item measure reflected the best combination of brevity, sensitivity and specificity. A cutoff score of 9 or higher on the 22-item Brief ECSA demonstrated acceptable sensitivity (89%) and specificity (85%) for predicting a psychiatric diagnosis. Brief ECSA scores correlated significantly and in expected directions with scores on pertinent measures and with demographic variables. The results indicate that the Brief ECSA has sound psychometric properties for identifying young children with BEPs in primary care.
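The sensitivity and specificity figures in this abstract come from comparing a screening cutoff against a gold-standard diagnosis. A minimal Python sketch of that comparison follows; it uses the cutoff of 9 or higher from the abstract, but the function name, scores, and diagnoses are hypothetical.

```python
def sensitivity_specificity(scores, has_diagnosis, cutoff=9):
    """Return (sensitivity, specificity) for a 'score >= cutoff' screen.

    scores: screening totals, one per child.
    has_diagnosis: gold-standard result (True/False), same order.
    """
    tp = fn = tn = fp = 0
    for score, diagnosed in zip(scores, has_diagnosis):
        screen_positive = score >= cutoff
        if diagnosed and screen_positive:
            tp += 1          # correctly flagged
        elif diagnosed:
            fn += 1          # missed case
        elif screen_positive:
            fp += 1          # false alarm
        else:
            tn += 1          # correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical sample: four children with a diagnosis, four without.
scores = [12, 9, 4, 15, 8, 3, 10, 2]
diagnosed = [True, True, True, True, False, False, False, False]
print(sensitivity_specificity(scores, diagnosed))  # (0.75, 0.75)
```

Repeating the calculation for each candidate version and cutoff is one way to look for the balance of sensitivity, specificity, and length that the abstract describes.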
Multimethod Personality Profile Assessment Methodology: Alcohol Abusers versus Nonalcoholic Controls.

DTIC Science & Technology

1982-12-01

following: oral fixation with passive, dependent features; psychopathy; low frustration tolerance; low perseverance; guilt and anxiety; egocentricity...research findings of elevated scale 4 with alcoholic samples, the psychopathy/sociopathy personality domain, as tapped by the MMPI/Pd items, warranted
The DUNDRUM-1 structured professional judgment for triage to appropriate levels of therapeutic security: retrospective-cohort validation study

PubMed Central

2011-01-01

Background The assessment of those presenting to prison in-reach and court diversion services and those referred for admission to mental health services is a triage decision, allocating the patient to the appropriate level of therapeutic security. This is a critical clinical decision. We set out to improve on unstructured clinical judgement. We collated qualitative information and devised an 11 item structured professional judgment instrument for this purpose then tested for validity. Methods All those assessed following screening over a three month period at a busy remand committals prison (n = 246) were rated in a retrospective cohort design blind to outcome. Similarly, all those admitted to a mental health service from the same prison in-reach service over an overlapping two year period were rated blind to outcome (n = 100). Results The 11 item scale had good internal consistency (Cronbach's alpha = 0.95) and inter-rater reliability. The scale score did not correlate with the HCR-20 'historical' score. For the three month sample, the receiver operating characteristic area under the curve (AUC) for those admitted to hospital was 0.893 (95% confidence interval 0.843 to 0.943). For the two year sample, AUC distinguished at each level between those admitted to open wards, low secure units or a medium/high secure service. Open wards v low secure units AUC = 0.805 (95% CI 0.680 to 0.930); low secure v medium/high secure AUC = 0.866, (95% CI 0.784 to 0.949). Item to outcome correlations were significant for all 11 items. Conclusions The DUNDRUM-1 triage security scale and its items performed to criterion levels when tested against the real world outcome. This instrument can be used to ensure consistency in decision making when deciding who to admit to secure forensic hospitals. It can also be used to benchmark admission thresholds between services and jurisdictions. In this study we found some divergence between assessed need and actual placement. This provides fertile ground for future research as well as practical assistance in assessing unmet need, auditing case mix and planning care pathways. PMID:21410967
Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities

PubMed Central

Hong, Ickpyo; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L.; Shulman, Lisa M.

2017-01-01

Purpose The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. Methods The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35-item pool was investigated for dimensionality (confirmatory factor analyses, CFA and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden pseudo R² less than 10%). Results The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA analysis found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59–0.85) and acceptable internal consistency (Cronbach’s alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. Conclusion The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms. PMID:27048495
Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities.

PubMed

Hong, Ickpyo; Velozo, Craig A; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L; Shulman, Lisa M

2016-09-01

The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35-item pool was investigated for dimensionality (confirmatory factor analysis [CFA] and exploratory factor analysis [EFA]), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden pseudo R² less than 10%). The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59-0.85) and acceptable internal consistency (Cronbach's alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms.
Measuring Microaggression and Organizational Climate Factors in Military Units

DTIC Science & Technology

2011-04-01

i.e., items) to accurately assess what we intend for them to measure. To assess construct and convergent validity, the author assessed the statistical ...sample indicated both convergent and construct validity of the microaggression scale. Table 5 presents these statistics . Measuring Microaggressions...models. As shown in Table 7, the measurement models had acceptable fit indices. That is, the Chi-square statistics were at their minimum; although the
Perceived Stress Scale: confirmatory factor analysis of the PSS14 and PSS10 versions in two samples of pregnant women from the BRISA cohort.

PubMed

Yokokura, Ana Valéria Carvalho Pires; Silva, Antônio Augusto Moura da; Fernandes, Juliana de Kássia Braga; Del-Ben, Cristina Marta; Figueiredo, Felipe Pinheiro de; Barbieri, Marco Antonio; Bettiol, Heloisa

2017-12-18

This study aimed to assess the dimensional structure, reliability, convergent validity, discriminant validity, and scalability of the Perceived Stress Scale (PSS). The sample consisted of 1,447 pregnant women in São Luís (Maranhão State) and 1,400 in Ribeirão Preto (São Paulo State), Brazil. The 14 and 10-item versions of the scale were assessed using confirmatory factor analysis, using weighted least squares means and variance (WLSMV). In both cities, the two-factor models (positive factors, measuring resilience to stressful situations, and negative factors, measuring stressful situations) showed better fit than the single-factor models. The two-factor models for the complete (PSS14) and reduced scale (PSS10) showed good internal consistency (Cronbach's alpha ≥ 0.70). All the factor loadings were ≥ 0.50, except for items 8 and 12 of the negative dimension and item 13 of the positive dimension. The correlations between both dimensions of stress and psychological violence showed the expected magnitude (0.46-0.59), providing evidence of an adequate convergent construct validity. The correlations between the scales' positive and negative dimensions were around 0.74-0.78, less than 0.85, which suggests adequate discriminant validity. Extracted mean variance and scalability were slightly higher for PSS10 than for PSS14. The results were consistent in both cities. In conclusion, the single-factor solution is not recommended for assessing stress in pregnant women. The reduced, 10-item two-factor scale appears to be more appropriate for measuring perceived stress in pregnant women.
Development of a food frequency questionnaire for Sri Lankan adults

PubMed Central

2012-01-01

Background Food Frequency Questionnaires (FFQs) are commonly used in epidemiologic studies to assess long-term nutritional exposure. Because of wide variations in dietary habits in different countries, a FFQ must be developed to suit the specific population. Sri Lanka is undergoing nutritional transition and diet-related chronic diseases are emerging as an important health problem. Currently, no FFQ has been developed for Sri Lankan adults. In this study, we developed a FFQ to assess the regular dietary intake of Sri Lankan adults. Methods A nationally representative sample of 600 adults was selected by a multi-stage random cluster sampling technique and dietary intake was assessed by random 24-h dietary recall. Nutrient analysis of the FFQ required the selection of foods, development of recipes and application of these to cooked foods to develop a nutrient database. We constructed a comprehensive food list with the units of measurement. A stepwise regression method was used to identify foods contributing to a cumulative 90% of variance to total energy and macronutrients. In addition, a series of photographs were included. Results We obtained dietary data from 482 participants and 312 different food items were recorded. Nutritionists grouped similar food items which resulted in a total of 178 items. After performing step-wise multiple regression, 93 foods explained 90% of the variance for total energy intake, carbohydrates, protein, total fat and dietary fibre. Finally, 90 food items and 12 photographs were selected. Conclusion We developed a FFQ and the related nutrient composition database for Sri Lankan adults. Culturally specific dietary tools are central to capturing the role of diet in risk for chronic disease in Sri Lanka. The next step will involve the verification of FFQ reproducibility and validity. PMID:22937734
An Assessment of the Measurement Equivalence of English and French Versions of the Center for Epidemiologic Studies Depression (CES-D) Scale in Systemic Sclerosis

PubMed Central

Delisle, Vanessa C.; Kwakkenbos, Linda; Hudson, Marie; Baron, Murray; Thombs, Brett D.

2014-01-01

Objectives Center for Epidemiologic Studies Depression (CES-D) Scale scores in English- and French-speaking Canadian systemic sclerosis (SSc) patients are commonly pooled in analyses, but no studies have evaluated the metric equivalence of the English and French CES-D. The study objective was to examine the metric equivalence of the CES-D in English- and French-speaking SSc patients. Methods The CES-D was completed by 1007 English-speaking and 248 French-speaking patients from the Canadian Scleroderma Research Group Registry. Confirmatory factor analysis (CFA) was used to assess the factor structure in both samples. The Multiple-Indicator Multiple-Cause (MIMIC) model was utilized to assess differential item functioning (DIF). Results A two-factor model (Positive and Negative affect) showed excellent fit in both samples. Statistically significant, but small-magnitude, DIF was found for 3 of 20 CES-D items, including items 3 (Blues), 10 (Fearful), and 11 (Sleep). Prior to accounting for DIF, French-speaking patients had 0.08 of a standard deviation (SD) lower latent scores for the Positive factor (95% confidence interval [CI] −0.25 to 0.08) and 0.09 SD higher scores (95% CI −0.07 to 0.24) for the Negative factor than English-speaking patients. After DIF correction, there was no change on the Positive factor and a non-significant increase of 0.04 SD on the Negative factor for French-speaking patients (difference = 0.13 SD, 95% CI −0.03 to 0.28). Conclusions The English and French versions of the CES-D, despite minor DIF on several items, are substantively equivalent and can be used in studies that combine data from English- and French-speaking Canadian SSc patients. PMID:25036894
The application of a network approach to Health-Related Quality of Life (HRQoL): introducing a new method for assessing HRQoL in healthy adults and cancer patients.

PubMed

Kossakowski, Jolanda J; Epskamp, Sacha; Kieffer, Jacobien M; van Borkulo, Claudia D; Rhemtulla, Mijke; Borsboom, Denny

2016-04-01

Health-Related Quality of Life (HRQoL) research has typically adopted either a formative approach, in which HRQoL is the common effect of its observables, or a reflective approach--defining HRQoL as a latent variable that determines observable characteristics of HRQoL. Both approaches, however, do not take into account the complex organization of these characteristics. The objective of this study was to introduce a new approach for analyzing HRQoL data, namely a network model (NM). An NM, as opposed to traditional research strategies, accounts for interactions among observables and offers a complementary analytic approach. We applied the NM to samples of Dutch cancer patients (N = 485) and Dutch healthy adults (N = 1742) who completed the 36-item Short Form Health Survey (SF-36). Networks were constructed for both samples separately and for a combined sample with diagnostic status added as an extra variable. We assessed the network structures and compared the structures of the two separate samples on the item and domain levels. The relative importance of individual items in the network structures was determined using centrality analyses. We found that the global structure of the SF-36 is dominant in all networks, supporting the validity of questionnaire's subscales. Furthermore, results suggest that the network structure of both samples was highly similar. Centrality analyses revealed that maintaining a daily routine despite one's physical health predicts HRQoL levels best. We concluded that the NM provides a fruitful alternative to classical approaches used in the psychometric analysis of HRQoL data.
A new self-report inventory of dyslexia for students: criterion and construct validity.

PubMed

Tamboer, Peter; Vorst, Harrie C M

2015-02-01

The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. All together, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Less than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
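The positive and negative predictive values mentioned above follow directly from a two-by-two classification table. The sketch below shows the arithmetic only; the function name and the counts are hypothetical and do not reproduce the study's data.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    tp/fp: classified as dyslexic, correctly/incorrectly.
    tn/fn: classified as non-dyslexic, correctly/incorrectly.
    """
    ppv = tp / (tp + fp)  # share of positive classifications that are correct
    npv = tn / (tn + fn)  # share of negative classifications that are correct
    return ppv, npv

# Hypothetical counts, chosen only to illustrate the calculation
ppv, npv = predictive_values(tp=40, fp=10, tn=45, fn=5)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Both values depend on the proportion of true cases in the sample, so they would be expected to shift if the inventory were applied to a population with a different prevalence.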
The Dimensionality of DSM-IV Alcohol Use Disorders among Adolescent and Adult Drinkers and Symptom Patterns by Age, Gender, and Race/Ethnicity

PubMed Central

Harford, Thomas C.; Yi, Hsiao-ye; Faden, Vivian B.; Chen, Chiung M.

2015-01-01

Background There is limited information on the validity of Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) alcohol use disorders (AUD) symptom criteria among adolescents in the general population. The purpose of the present study is to assess the DSM-IV AUD symptom criteria as reported by adolescent and adult drinkers in a single representative sample of the U.S. population ages 12 years and older. This design avoids potential confounding due to differences in survey methodology when comparing adolescents and adults from different surveys. Methods A total of 133,231 current drinkers (had at least one drink in the past year) ages 12 years and older were drawn from respondents to the 2002–2005 National Surveys on Drug Use and Health. DSM-IV AUD criteria were assessed by questions related to specific symptoms occurring during the past 12 months. Factor analytic (FA) and item response theory (IRT) models were applied to the 11 AUD symptom criteria to assess the probabilities of symptom item endorsements across different values of the underlying trait. Results A one-factor model provided an adequate and parsimonious interpretation for the 11 AUD criteria for the total sample and for each of the gender-age groups. The MIMIC model exhibited significant indication for item bias among some criteria by gender, age, and race/ethnicity. Symptom criteria for “tolerance,” “time spent,” and “hazardous use” had lower item thresholds (i.e., lower severity) and low item discrimination, and they were well separated from the other symptoms, especially in the two younger age groups (12–17 and 18–25). “Larger amounts,” “cut down,” “withdrawal,” and “legal problems” had higher item thresholds but generally lower item discrimination, and they tend to exhibit greater dispersion at higher AUD severity, particularly in the youngest age group (12–17). 
Conclusions Findings from the present study do not provide support for the two separate DSM-IV diagnoses of alcohol abuse and dependence among either adolescents or adults. Variations in criteria severity for both abuse and dependence offer support for a dimensional approach to diagnosis which should be considered in the ongoing development of DSM-V. PMID:19320629
Development and validation of a new questionnaire measuring treatment satisfaction in patients with non-valvular atrial fibrillation: SAFUCA®.

PubMed

Ruiz, Miguel A; González-Porras, José Ramón; Aranguren, José Luis; Franco, Eduardo; Villasante, Fernando; Tuñón, José; González-López, Tomás José; de Salas-Cansado, Marina; Soto, Javier

2017-03-01

To develop a new questionnaire with good psychometric properties to measure satisfaction with medical care in patients with non-valvular atrial fibrillation. The initial instrument was composed of 37 items, arranged in 6 dimensions: efficacy, ease and convenience, impact on daily activities, satisfaction with medical care, undesired effects of medication, and overall satisfaction. Items and dimensions were extracted from reviewing existing instruments, 3 focus groups with chronic patients, and a panel of 8 experts. Additionally, 3 visual analog scales measuring quality of life, effectiveness, and overall satisfaction were administered. A convenience sample of 119 patients was used for item reduction. Classic psychometric theory and item analysis techniques were used (exploratory factor and confirmatory factor analysis, test-retest, and correlation with visual scales). A validation sample of 230 patients was used to assess convergent validity, and an additional sample of 220 patients was used to discriminate between treatment and compliance groups. The questionnaire was reduced in length to 25 items, but the impact dimension split into treatment inconvenience and treatment control. Overall reliability was high (α = 0.861) with acceptable dimensional reliabilities (α = 0.764-0.908). Individual dimensions correlated to varying degrees. Test-retest correlations were high (r = 0.784-0.965), and correlations with visual and already validated scales were substantial. Differences were detected between antivitamin K and new-oral-anticoagulant treatments in several dimensions (p < 0.05). Treatment satisfaction was related to compliance. This new 25-item questionnaire has good psychometric properties for measuring satisfaction with medical care in patients with this condition. It is capable of detecting differences between different treatments.
Have infant gross motor abilities changed in 20 years? A re-evaluation of the Alberta Infant Motor Scale normative values

PubMed Central

Darrah, Johanna; Bartlett, Doreen; Maguire, Thomas O; Avison, William R; Lacaze-Masmonteil, Thierry

2014-01-01

Aim To compare the original normative data of the Alberta Infant Motor Scale (AIMS) (n=2202) collected 20 years ago with a contemporary sample of Canadian infants. Method This was a cross-sectional cohort study of 650 Canadian infants (338 males, 312 females; mean age 30.9wks [SD 15.5], range 2wks–18mo) assessed once on the AIMS. Assessments were stratified by age, and infants proportionally represented the ethnic diversity of Canada. Logistic regression was used to place AIMS items on an age scale representing the age at which 50% of the infants passed an item on the contemporary data set and the original data set. Forty-three items met the criterion for stable regression results in both data sets. Results The correlation coefficient between the age locations of items on the original and contemporary data sets was 0.99. The mean age difference between item locations was 0.7 weeks. Age values from the original data set when converted to the contemporary scale differed by less than 1 week. Interpretation The sequence and age at emergence of AIMS items has remained similar over 20 years and current normative values remain valid. Concern that the ‘back to sleep’ campaign has influenced the age at emergence of gross motor abilities is not supported. PMID:24684556
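The item-placement step described above can be illustrated as follows. Under a simple logistic model with age as the only predictor, the age at which 50% of infants pass an item is the point where the linear predictor equals zero. This is a sketch under that assumption; the coefficient values and function names are hypothetical, not the study's estimates.

```python
import math

def pass_probability(age_weeks, intercept, slope):
    """Modelled probability of passing an item at a given age (logistic curve)."""
    return 1 / (1 + math.exp(-(intercept + slope * age_weeks)))

def age_at_half(intercept, slope):
    """Age at which the modelled pass probability equals 0.5.

    Solves intercept + slope * age = 0 for age.
    """
    return -intercept / slope

# Hypothetical coefficients for a single item
b0, b1 = -7.5, 0.25
location = age_at_half(b0, b1)
print(location, pass_probability(location, b0, b1))
```

Computing this location for each item in both data sets yields two sets of age values that can then be correlated, which corresponds to the comparison the abstract reports.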
The influence of item order on intentional response distortion in the assessment of high potentials: assessing pilot applicants.

PubMed

Khorramdel, Lale; Kubinger, Klaus D; Uitz, Alexander

2014-04-01

An experiment was conducted to investigate the effects of item order and questionnaire content on faking good or intentional response distortion. It was hypothesized that intentional response distortion would either increase towards the end of a long questionnaire, as learning effects might make it easier to adjust responses to a faking good schema, or decrease because applicants' will to distort responses is reduced if the questionnaire lasts long enough. Furthermore, it was hypothesized that certain types of questionnaire content are especially vulnerable to response distortion. Eighty-four pre-selected pilot applicants filled out a questionnaire consisting of 516 items including items from the NEO five factor inventory (NEO FFI), NEO personality inventory revised (NEO PI-R) and business-focused inventory of personality (BIP). The positions of the items were varied within the applicant sample to test if responses are affected by item order, and applicants' response behaviour was additionally compared to that of volunteers. Applicants reported significantly higher mean scores than volunteers, and results provide some evidence of decreased faking tendencies towards the end of the questionnaire. Furthermore, it could be demonstrated that lower variances or standard deviations in combination with appropriate (often higher) mean scores can serve as an indicator for faking tendencies in group comparisons, even if effects are not significant. © 2013 International Union of Psychological Science.
Cross-cultural adaptation of the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) for Hebrew-speaking subjects with and without hand injury.

PubMed

Marom, Batia S; Carel, Rafael S; Sharabi, Moshe; Ratzon, Navah Z

2017-06-01

The World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) questionnaire is used internationally to assess function and disability. The instrument has been translated into several languages, but no Hebrew version exists. The objective of this study was to evaluate the use of the 12-item WHODAS 2.0 questionnaire among Hebrew speakers with and without hand injuries (HI). The translated questionnaire was administered to 155 uninjured subjects (UI) and 77 male workers with HI. Internal consistency was assessed using Cronbach's alpha. Test-retest reliability was assessed in UI subjects and calculated using the intraclass correlation coefficient (ICC for agreement). Validity was evaluated by correlating the 12-item WHODAS 2.0 with the 12-item Short Form Health Survey (SF-12) in UI subjects and by comparing the 12-item WHODAS 2.0 scores with the Quick Disabilities of the Arm, Shoulder, and Hand (QDASH) Outcome Measure in the HI group. The Cronbach's alpha of the WHODAS 2.0 for the entire sample was α = 0.83. The ICC for agreement for test-retest reliability was 0.88. A positive significant correlation was found between the 12-item WHODAS 2.0 and the QDASH (r_s = 0.53, p < .005). The results support the reliability and validity of this Hebrew translation of the 12-item WHODAS 2.0. IMPLICATIONS FOR REHABILITATION Measurement tools that assess activities and participation after HI are an essential part of the rehabilitation process. The 12-item WHODAS 2.0 is a useful tool, since it addresses a broader range of activity and participation domains compared to the DASH and enables better implementation of the ICF model. Since the WHODAS 2.0 does not target a specific disease (as opposed to the DASH), it can be used to compare disabilities caused by different diseases or traumas. The WHODAS 2.0 measures both function and disability in general populations as well as clinical situations; therefore, the instrument is useful for assessing both health and disability.
[Impact of passing items above the ceiling on the assessment results of Peabody developmental motor scales].

PubMed

Zhao, Gai; Bian, Yang; Li, Ming

2013-12-18

To analyze the impact of passing items above the ceiling level in the gross motor subtest of the Peabody Developmental Motor Scales, second edition (PDMS-2), on its assessment results. The PDMS-2 subtests were administered to 124 children aged 1.2 to 71 months. In addition to the original scoring method, a new scoring method that includes passed items above the ceiling was developed. The standard scores and quotients of the two scoring methods were compared using the independent-samples t test. Only one child passed items above the ceiling in the stationary subtest, compared with 19 children in the locomotion subtest and 17 children in the visual-motor integration subtest. When the scores of these passed items were included in the raw scores, the total raw scores increased by 1-12 points, the standard scores by 0-1 points, and the motor quotients by 0-3 points. The diagnostic classification changed in only two children. There was no significant difference between the two methods in motor quotients or in standard scores for the specific subtests (P>0.05). Passing items above the ceiling of the PDMS-2 is not a rare situation. It usually occurs in the locomotion and visual-motor integration subtests. Including these passed items in the scoring system does not make a significant difference in the standard scores of the subtests or the developmental motor quotients (DMQ), which supports the original setting of a ceiling established by failing 3 items in a row. However, adding the passed items above the ceiling to the raw score will improve tracking of children's developmental trajectory and intervention effects.
Psychometrical assessment and item analysis of the General Health Questionnaire in victims of terrorism.

PubMed

Delgado-Gomez, David; Lopez-Castroman, Jorge; de Leon-Martinez, Victoria; Baca-Garcia, Enrique; Cabanas-Arrate, Maria Luisa; Sanchez-Gonzalez, Antonio; Aguado, David

2013-03-01

There is a need to assess the psychiatric morbidity that appears as a consequence of terrorist attacks. The General Health Questionnaire (GHQ) has been used to this end, but its psychometric properties have never been evaluated in a population affected by terrorism. A sample of 891 participants included 162 direct victims of terrorist attacks and 729 relatives of the victims. All participants were evaluated using the 28-item version of the GHQ (GHQ-28). We examined the reliability and external validity of scores on the scale using Cronbach's alpha and Pearson correlation with the State-Trait Anxiety Inventory (STAI), respectively. The factor structure of the scale was analyzed with varimax rotation. Samejima's (1969) graded response model was used to explore the item properties. The GHQ-28 scores showed good reliability and item-scale correlations. The factor analysis identified 3 factors: anxious-somatic symptoms, social dysfunction, and depression symptoms. All factors showed good correlation with the STAI. Before rotation, the first, second, and third factor explained 44.0%, 6.4%, and 5.0% of the variance, respectively. Varimax rotation redistributed the percentages of variance accounted for to 28.4%, 13.8%, and 13.2%, respectively. Items with the highest loadings in the first factor measured anxiety symptoms, whereas items with the highest loadings in the third factor measured suicide ideation. Samejima's model found that high scores in suicide-related items were associated with severe depression. The factor structure of the GHQ-28 found in this study underscores the preeminence of anxiety symptoms among victims of terrorism and their relatives. Item response analysis identified the most difficult and significant items for each factor. PsycINFO Database Record (c) 2013 APA, all rights reserved.
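For readers unfamiliar with the graded response model named above, the following minimal sketch shows how it turns one discrimination parameter and a set of ordered thresholds into probabilities for each ordered response category. The parameter values are invented for illustration (three thresholds give four categories, matching a four-option item), and the function name is hypothetical.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    theta: latent trait level; a: item discrimination;
    thresholds: ordered category boundaries b_1 < b_2 < ...
    """
    # Cumulative probability of responding in category k or higher;
    # the lowest category has cumulative probability 1 and a sentinel 0 closes the list
    cumulative = [1.0]
    cumulative += [1 / (1 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Hypothetical item parameters
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

Items whose thresholds sit high on the trait scale are endorsed at the top categories only by respondents with high trait levels, which is the sense in which such a model can identify the most "difficult" items.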
Single item measures of emotional exhaustion and depersonalization are useful for assessing burnout in medical professionals.

PubMed

West, Colin P; Dyrbye, Liselotte N; Sloan, Jeff A; Shanafelt, Tait D

2009-12-01

Burnout has negative effects on work performance and patient care. The current standard for burnout assessment is the Maslach Burnout Inventory (MBI), a well-validated instrument consisting of 22 items answered on a 7-point Likert scale. However, the length of the MBI can limit its utility in physician surveys. To evaluate the performance of two questions relative to the full MBI for measuring burnout. Cross-sectional data from 2,248 medical students, 333 internal medicine residents, 465 internal medicine faculty, and 7,905 practicing surgeons. The single questions with the highest factor loading on the emotional exhaustion (EE) ("I feel burned out from my work") and depersonalization (DP) ("I have become more callous toward people since I took this job") domains of burnout were evaluated in four large samples of medical students, internal medicine residents, internal medicine faculty, and practicing surgeons. Spearman correlations between the single EE question and the full EE domain score minus that question ranged from 0.76-0.83. Spearman correlations between the single DP question and the full DP domain score minus that question ranged from 0.61-0.72. Responses to the single item measures of emotional exhaustion and depersonalization stratified risk of high burnout in the relevant domain on the full MBI, with consistent patterns across the four sampled groups. Single item measures of emotional exhaustion and depersonalization provide meaningful information on burnout in medical professionals.
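The comparison described above, a single question correlated with its domain score after removing that question, can be sketched as a Spearman correlation between an item and the "rest" score. This is an illustrative pure-Python version with invented responses; the function names are hypothetical and the study's own software is not stated in the abstract.

```python
from statistics import mean

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def item_rest_spearman(responses, item_index):
    """Spearman correlation between one item and the domain score minus that item."""
    item = [row[item_index] for row in responses]
    rest = [sum(row) - row[item_index] for row in responses]
    return pearson(average_ranks(item), average_ranks(rest))

# Invented responses: 5 respondents, 4 items scored 0-6
toy = [[0, 1, 0, 2], [2, 1, 3, 2], [3, 4, 2, 5], [5, 4, 6, 5], [6, 6, 5, 6]]
print(round(item_rest_spearman(toy, 0), 3))
```

Removing the item from the domain total before correlating avoids inflating the coefficient through the item's overlap with itself.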

Evaluation of an antiretroviral medication attitude scale and relationships between medication attitudes and medication nonadherence.

PubMed

Viswanathan, Hema; Anderson, Rodney; Thomas, Joseph

2005-05-01

The objectives of this study were to refine a scale designed to assess attitudes toward antiretroviral medication, to examine variation in medication attitudes across clinical and demographic characteristics, and to assess relationships between medication attitudes and medication nonadherence. A cross-sectional design was used to survey individuals at least 18 years of age, currently on antiretroviral therapy, and served by a regional HIV/AIDS center. The survey was administered by pharmacy students using convenience sampling between February 2002 and August 2002. Nonadherence was measured using a nine-item scale with a higher score indicative of higher nonadherence. An antiretroviral medication attitude scale was developed based on revision of a zidovudine attitude inventory. The sample of 99 patients was predominantly male (79.8%), had an annual income of less than $10,000 (74%), and was comprised of 50% whites and 40.8% blacks. Participants were between 18 and 70 years old. Item reduction using item-total correlations and factor analytic techniques resulted in a 15-item medication attitude scale with good internal consistency (Cronbach alpha coefficient = 0.84). A multiple regression model showed a significant negative relationship between attitude toward medication and medication nonadherence after controlling for covariates including age, education, gender, ethnicity, work status, social support, CD4 cell count and number of antiretroviral medications, suggesting that the more positive the attitude toward medication, the lower the medication nonadherence. Findings underscore the importance of attitude toward medication as a modifiable factor that can be targeted to improve medication adherence.
Dimensions of the South Oaks Gambling Screen in Finland: A cross-sectional population study.

PubMed

Salonen, Anne H; Rosenström, Tom; Edgren, Robert; Volberg, Rachel; Alho, Hannu; Castrén, Sari

2017-06-01

The underlying structure of problematic gambling behaviors, such as those assessed by the South Oaks Gambling Screen (SOGS), remains unknown: Can problem gambling be assessed unidimensionally or should multiple qualitatively different dimensions be taken into account, and if so, what do these qualitative dimensions indicate? How significant are the deviations from unidimensionality in practice? A cross-sectional random sample of Finns aged 15-74 (n = 4,484) was drawn from the Population Information Registry and surveyed in 2011-2012. Analyses were conducted using descriptive statistics, confirmatory factor analysis (CFA) and multidimensional item response theory (MIRT) models. Altogether, 14.9% of the population endorsed at least one of the 20 SOGS items, but nine items had low endorsement rates (≤ 0.2%). CFA and MIRT techniques suggested that individuals differed from each other in two positively correlated (r = 0.70) underlying dimensions: "impact on self primarily" and "impact on others also". This two-factor correlated-factors model can be reinterpreted as a bifactor model with one general gambling-problem factor and two specific factors with similar interpretation as in the correlated-factors model but with non-overlapping items. The two specific factors may provide clinically useful information without extra costs of assessment. © 2017 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
Multiple, correlated covariates associated with differential item functioning (DIF): Accounting for language DIF when education levels differ across languages.

PubMed

Gibbons, Laura E; Crane, Paul K; Mehta, Kala M; Pedraza, Otto; Tang, Yuxiao; Manly, Jennifer J; Narasimhalu, Kaavya; Teresi, Jeanne; Jones, Richard N; Mungas, Dan

2011-04-28

Differential item functioning (DIF) occurs when a test item has different statistical properties in subgroups, controlling for the underlying ability measured by the test. DIF assessment is necessary when evaluating measurement bias in tests used across different language groups. However, other factors such as educational attainment can differ across language groups, and DIF due to these other factors may also exist. How to conduct DIF analyses in the presence of multiple, correlated factors remains largely unexplored. This study assessed DIF related to Spanish versus English language in a 44-item object naming test. Data come from a community-based sample of 1,755 Spanish- and English-speaking older adults. We compared simultaneous accounting, a new strategy for handling differences in educational attainment across language groups, with existing methods. Compared to other methods, simultaneously accounting for language- and education-related DIF yielded salient differences in some object naming scores, particularly for Spanish speakers with at least 9 years of education. Accounting for factors that vary across language groups can be important when assessing language DIF. The use of simultaneous accounting will be relevant to other cross-cultural studies in cognition and in other fields, including health-related quality of life.
Multiple, correlated covariates associated with differential item functioning (DIF): Accounting for language DIF when education levels differ across languages

PubMed Central

Gibbons, Laura E.; Crane, Paul K.; Mehta, Kala M.; Pedraza, Otto; Tang, Yuxiao; Manly, Jennifer J.; Narasimhalu, Kaavya; Teresi, Jeanne; Jones, Richard N.; Mungas, Dan

2012-01-01

Differential item functioning (DIF) occurs when a test item has different statistical properties in subgroups, controlling for the underlying ability measured by the test. DIF assessment is necessary when evaluating measurement bias in tests used across different language groups. However, other factors such as educational attainment can differ across language groups, and DIF due to these other factors may also exist. How to conduct DIF analyses in the presence of multiple, correlated factors remains largely unexplored. This study assessed DIF related to Spanish versus English language in a 44-item object naming test. Data come from a community-based sample of 1,755 Spanish- and English-speaking older adults. We compared simultaneous accounting, a new strategy for handling differences in educational attainment across language groups, with existing methods. Compared to other methods, simultaneously accounting for language- and education-related DIF yielded salient differences in some object naming scores, particularly for Spanish speakers with at least 9 years of education. Accounting for factors that vary across language groups can be important when assessing language DIF. The use of simultaneous accounting will be relevant to other cross-cultural studies in cognition and in other fields, including health-related quality of life. PMID:22900138
The EORTC information questionnaire, EORTC QLQ-INFO25. Validation study for Spanish patients.

PubMed

Arraras, Juan Ignacio; Manterola, Ana; Hernández, Berta; Arias de la Vega, Fernando; Martínez, Maite; Vila, Meritxell; Eito, Clara; Vera, Ruth; Domínguez, Miguel Ángel

2011-06-01

The EORTC QLQ-INFO25 evaluates the information received by cancer patients. This study assesses the psychometric properties of the QLQ-INFO25 when applied to a sample of Spanish patients. A total of 169 patients with different cancers and stages of disease completed the EORTC QLQ-INFO25, the EORTC QLQ-C30 and the information scales of the inpatient satisfaction module EORTC IN-PATSAT32 on two occasions during the patients' treatment and follow-up period. Psychometric evaluation of the structure, reliability, validity and responsiveness to changes was conducted. Patient acceptability was assessed with a debriefing questionnaire. Multi-trait scaling confirmed the 4 multi-item scales (information about disease, medical tests, treatment and other services) and eight single items. All items met the standards for convergent validity and all except one met the standards of item discriminant validity. Internal consistency for all scales (α>0.70) and the whole questionnaire (α>0.90) was adequate in the three measurements, except information about the disease (0.67) and other services (0.68) in the first measurement, as was test-retest reliability (intraclass correlations >0.70). Correlations with related areas of IN-PATSAT32 (r>0.40) supported convergent validity. Divergent validity was confirmed through low correlations with EORTC QLQ-C30 scales (r<0.30). The EORTC QLQ-INFO25 discriminated among groups based on gender, age, education, levels of anxiety and depression, treatment line, wish for information and satisfaction. One scale and an item showed changes over time. The EORTC QLQ-INFO25 is a reliable and valid instrument when applied to a sample of Spanish cancer patients. These results are in line with those of the EORTC validation study.
PACIC Instrument: disentangling dimensions using published validation models.

PubMed

Iglesias, K; Burnand, B; Peytremann-Bridevaux, I

2014-06-01

To better understand the structure of the Patient Assessment of Chronic Illness Care (PACIC) instrument. More specifically to test all published validation models, using one single data set and appropriate statistical tools. Validation study using data from cross-sectional survey. A population-based sample of non-institutionalized adults with diabetes residing in Switzerland (canton of Vaud). French version of the 20-item PACIC instrument (5-point response scale). We conducted validation analyses using confirmatory factor analysis (CFA). The original five-dimension model and other published models were tested with three types of CFA: based on (i) a Pearson estimator of variance-covariance matrix, (ii) a polychoric correlation matrix and (iii) a likelihood estimation with a multinomial distribution for the manifest variables. All models were assessed using loadings and goodness-of-fit measures. The analytical sample included 406 patients. Mean age was 64.4 years and 59% were men. Median of item responses varied between 1 and 4 (range 1-5), and range of missing values was between 5.7 and 12.3%. Strong floor and ceiling effects were present. Even though loadings of the tested models were relatively high, the only model showing acceptable fit was the 11-item single-dimension model. PACIC was associated with the expected variables of the field. Our results showed that the model considering 11 items in a single dimension exhibited the best fit for our data. A single score, in complement to the consideration of single-item results, might be used instead of the five dimensions usually described. © The Author 2014. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
Components of a Measure to Describe Organizational Culture in Academic Pharmacy

PubMed Central

Rosenthal, Meagen; Holmes, Erin R.; Andrews, Brienna; Lui, Julia; Raja, Leela

2017-01-01

Objective. To develop a measure of organizational culture in academic pharmacy and identify characteristics of an academic pharmacy program that would be impactful for internal (eg, students, employees) and external (eg, preceptors, practitioners) clients of the program. Methods. A three-round Delphi procedure of 24 panelists from pharmacy schools in the U.S. and Canada generated items based on the Organizational Culture Profile (OCP), which were then evaluated and refined for inclusion in subsequent rounds. Items were assessed for appropriateness and impact. Results. The panel produced 35 items across six domains that measured organizational culture in academic pharmacy: competitiveness, performance orientation, social responsibility, innovation, emphasis on collegial support, and stability. Conclusion. The items generated require testing for validation and reliability in a large sample to finalize this measure of organizational culture. PMID:29367768
Do Images Influence Assessment in Anatomy? Exploring the Effect of Images on Item Difficulty and Item Discrimination

ERIC Educational Resources Information Center

Vorstenbosch, Marc A. T. M.; Klaassen, Tim P. F. M.; Kooloos, Jan G. M.; Bolhuis, Sanneke M.; Laan, Roland F. J. M.

2013-01-01

Anatomists often use images in assessments and examinations. This study aims to investigate the influence of different types of images on item difficulty and item discrimination in written assessments. A total of 210 of 460 students volunteered for an extra assessment in a gross anatomy course. This assessment contained 39 test items grouped in…
Music Therapy Assessment Tool for Awareness in Disorders of Consciousness (MATADOC): Reliability and Validity of a Measure to Assess Awareness in Patients with Disorders of Consciousness.

PubMed

Magee, Wendy L; Siegert, Richard J; Taylor, Steve M; Daveson, Barbara A; Lenton-Smith, Gemma

2016-01-01

Prolonged Disorders of Consciousness (PDOC) describes a population where a consciousness disorder has persisted for at least four weeks post injury but is still under investigation. Complex motor, sensory, communication, and cognitive impairments cause challenges with diagnosis, assessment, and intervention planning. Developing sensitive, reliable, and valid measures is a central concern. The auditory modality is the most sensitive for identifying awareness; however, the current standardized behavioral measures fail to provide adequate screening and measurement of auditory responsiveness. The Music Therapy Assessment Tool for Awareness in Disorders of Consciousness (MATADOC) is a recently standardized measure for assessment with PDOC; however, psychometric values for two of its subscales require examination. To determine the measurement characteristics and properties of the MATADOC subscales two and three. In a convenience sample of 21 participants with PDOC, a prospective repeated measures study examined inter-rater reliability (IRR) and test-retest reliability (TRR) for both subscales and internal consistency of subscale two. Overall, the items from the MATADOC subscales two and three demonstrated good agreement across and within assessors, with some variability on two identified items. The MATADOC is a standardized measure for assessment of auditory responsiveness in PDOC. Psychometric limitations for the two identified items may have resulted from variations in music therapist clinical experience and training, leading to variations in the administration and interpretation of PDOC patient responses to these two MATADOC assessment items. Although its psychometric properties could be improved, the MATADOC's clinimetric properties make it a valuable assessment to guide clinical work for patients with PDOC. © the American Music Therapy Association 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Development of a research ethics knowledge and analytical skills assessment tool.

PubMed

Taylor, Holly A; Kass, Nancy E; Ali, Joseph; Sisson, Stephen; Bertram, Amanda; Bhan, Anant

2012-04-01

The goal of this project was to develop and validate a new tool to evaluate learners' knowledge and skills related to research ethics. A core set of 50 questions from existing computer-based online teaching modules were identified, refined and supplemented to create a set of 74 multiple-choice, true/false and short answer questions. The questions were pilot-tested and item discrimination was calculated for each question. Poorly performing items were eliminated or refined. Two comparable assessment tools were created. These assessment tools were administered as a pre-test and post-test to a cohort of 58 Indian junior health research investigators before and after exposure to a new course on research ethics. Half of the investigators were exposed to the course online, the other half in person. Item discrimination was calculated for each question and Cronbach's α for each assessment tool. A final version of the assessment tool that incorporated the best questions from the pre-/post-test phase was used to assess retention of research ethics knowledge and skills 3 months after course delivery. The final version of the REKASA includes 41 items and had a Cronbach's α of 0.837. The results illustrate, in one sample of learners, the successful, systematic development and use of a knowledge and skills assessment tool in research ethics capable of not only measuring basic knowledge in research ethics and oversight but also assessing learners' ability to apply ethics knowledge to the analytical task of reasoning through research ethics cases, without reliance on essay or discussion-based examination. These promising preliminary findings should be confirmed with additional groups of learners.
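Cronbach's α, reported above for the final 41-item tool, can be computed from a respondents-by-items score table. The sketch below uses the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores); the function name and the small score table are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents' item-score lists.

    scores: list of rows, one row per respondent, each row holding
    that respondent's score on each of the k items (k >= 2).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: 5 respondents, 3 items.
toy_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 5],
    [3, 3, 2],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why removing poorly performing items, as described above, can change the coefficient.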
Personality in sanctuary-housed chimpanzees: A comparative approach of psychobiological and penta-factorial human models.

PubMed

Úbeda, Yulán; Llorente, Miquel

2015-02-18

We evaluate a sanctuary chimpanzee sample (N = 11) using two adapted human assessment instruments: the Five-Factor Model (FFM) and Eysenck's Psychoticism-Extraversion-Neuroticism (PEN) model. The former has been widely used in studies of animal personality, whereas the latter has never been used to assess chimpanzees. We asked familiar keepers and scientists (N = 28) to rate 38 (FFM) and 12 (PEN) personality items. The personality surveys showed reliability in all of the items for both instruments. These were then analyzed in a principal component analysis and a regularized exploratory factor analysis, which revealed four and three components, respectively. The results indicate that both questionnaires show a clear factor structure, with characteristic factors not just for the species, but also for the sample type. However, due to its brevity, the PEN may be more suitable for assessing personality in a sanctuary, where employees do not have much time to devote to the evaluation process. In summary, both models are sensitive enough to evaluate the personality of a group of chimpanzees housed in a sanctuary.
Development of a brief measure of intimate partner violence experiences: the Composite Abuse Scale (Revised)—Short Form (CASR-SF)

PubMed Central

Ford-Gilboe, Marilyn; Wathen, C Nadine; Varcoe, Colleen; MacMillan, Harriet L; Scott-Storey, Kelly; Mantler, Tara; Hegarty, Kelsey; Perrin, Nancy

2016-01-01

Objectives Approaches to measuring intimate partner violence (IPV) in populations often privilege physical violence, with poor assessment of other experiences. This has led to underestimating the scope and impact of IPV. The aim of this study was to develop a brief, reliable and valid self-report measure of IPV that adequately captures its complexity. Design Mixed-methods instrument development and psychometric testing to evolve a brief version of the Composite Abuse Scale (CAS) using secondary data analysis and expert feedback. Setting Data from 5 Canadian IPV studies; feedback from international IPV experts. Participants 31 international IPV experts including academic researchers, service providers and policy actors rated CAS items via an online survey. Pooled data from 6278 adult Canadian women were used for scale development. Primary/secondary outcome measures Scale reliability and validity; robustness of subscales assessing different IPV experiences. Results A 15-item version of the CAS has been developed (Composite Abuse Scale (Revised)—Short Form, CASR-SF), including 12 items developed from the original CAS and 3 items suggested through expert consultation and the evolving literature. Items cover 3 abuse domains: physical, sexual and psychological, with questions asked to assess lifetime, recent and current exposure, and abuse frequency. Factor loadings for the final 3-factor solution ranged from 0.81 to 0.91 for the 6 psychological abuse items, 0.63 to 0.92 for the 4 physical abuse items, and 0.85 and 0.93 for the 2 sexual abuse items. Moderate correlations were observed between the CASR-SF and measures of depression, post-traumatic stress disorder and coercive control. Internal consistency of the CASR-SF was 0.942. These reliability and validity estimates were comparable to those obtained for the original 30-item CAS. 
Conclusions The CASR-SF is a brief self-report measure of IPV experiences among women that has demonstrated initial reliability and validity and is suitable for use in population studies or other studies. Additional validation of the 15-item scale with diverse samples is required. PMID:27927659
Conjoint Community Resiliency Assessment Measure-28/10 items (CCRAM28 and CCRAM10): A self-report tool for assessing community resilience.

PubMed

Leykin, Dmitry; Lahad, Mooli; Cohen, Odeya; Goldberg, Avishay; Aharonson-Daniel, Limor

2013-12-01

Community resilience is used to describe a community's ability to deal with crises or disruptions. The Conjoint Community Resiliency Assessment Measure (CCRAM) was developed in order to attain an integrated, multidimensional instrument for the measurement of community resiliency. The tool was developed using an inductive, exploratory, sequential mixed methods design. The objective of the present study was to portray and evaluate the CCRAM's psychometric features. A large community sample (N = 1,052) was assessed with the CCRAM tool, and the data were subjected to exploratory and confirmatory factor analysis. A five-factor model (21 items) was obtained, explaining 67.67% of the variance. This scale was later reduced to a 10-item brief instrument. Both scales showed good internal consistency coefficients (α = .92 and α = .85, respectively) and acceptable fit indices to the data. Seven additional items correspond to information requested by leaders, forming the CCRAM28. The CCRAM has been shown to be an acceptable practical tool for assessing community resilience. Both internal and external validity have been demonstrated, as all factors obtained in the factor analytical process were tightly linked to previous literature on community resilience. The CCRAM facilitates the estimation of an overall community resiliency score and, furthermore, detects the strength of five important constructs of community function following disaster: Leadership, Collective Efficacy, Preparedness, Place Attachment and Social Trust. Consequently, the CCRAM can serve as an aid for community leaders to assess, monitor, and focus actions to enhance and restore community resilience for crisis situations.
Assessing Adolescents' Positive Psychological Functioning at School: Development and Validation of the Student Subjective Wellbeing Questionnaire

ERIC Educational Resources Information Center

Renshaw, Tyler L.; Long, Anna C. J.; Cook, Clayton R.

2015-01-01

This study reports on the initial development and validation of the Student Subjective Wellbeing Questionnaire (SSWQ) with a sample of 1,002 students in Grades 6-8. The SSWQ is a 16-item self-report instrument for assessing youths' subjective wellbeing at school, which is operationalized via 4 subscales measuring school connectedness, academic…
Psychometrical Assessment and Item Analysis of the General Health Questionnaire in Victims of Terrorism

ERIC Educational Resources Information Center

Delgado-Gomez, David; Lopez-Castroman, Jorge; de Leon-Martinez, Victoria; Baca-Garcia, Enrique; Cabanas-Arrate, Maria Luisa; Sanchez-Gonzalez, Antonio; Aguado, David

2013-01-01

There is a need to assess the psychiatric morbidity that appears as a consequence of terrorist attacks. The General Health Questionnaire (GHQ) has been used to this end, but its psychometric properties have never been evaluated in a population affected by terrorism. A sample of 891 participants included 162 direct victims of terrorist attacks and…
Time Diary and Questionnaire Assessment of Factors Associated with Academic and Personal Success among University Undergraduates

ERIC Educational Resources Information Center

George, Darren; Dixon, Sinikka; Stansal, Emory; Gelb, Shannon Lund; Pheri, Tabitha

2008-01-01

Objective and Participants: A sample of 231 students attending a private liberal arts university in central Alberta, Canada, completed a 5-day time diary and a 71-item questionnaire assessing the influence of personal, cognitive, and attitudinal factors on success. Methods: The authors used 3 success measures: cumulative grade point average (GPA),…
Implementation and Initial Validation of the Combined English Language Skills Assessment (CELSA) at Golden West College.

ERIC Educational Resources Information Center

Isonio, Steven

During spring 1992, the Combined English Language Skills Assessment (CELSA) test was piloted with a sample of English-as-a-Second-Language (ESL) classes at Golden West College (GWC) in Huntington Beach, California. The CELSA, which utilizes a cloze format including parts of conversations and short dialogues, combines items from beginning,…
Assessing Children's Homework Performance: Development of Multi-Dimensional, Multi-Informant Rating Scales.

PubMed

Power, Thomas J; Dombrowski, Stefan C; Watkins, Marley W; Mautone, Jennifer A; Eagle, John W

2007-06-01

Efforts to develop interventions to improve homework performance have been impeded by limitations in the measurement of homework performance. This study was conducted to develop rating scales for assessing homework performance among students in elementary and middle school. Items on the scales were intended to assess student strengths as well as deficits in homework performance. The sample included 163 students attending two school districts in the Northeast. Parents completed the 36-item Homework Performance Questionnaire - Parent Scale (HPQ-PS). Teachers completed the 22-item teacher scale (HPQ-TS) for each student for whom the HPQ-PS had been completed. A common factor analysis with principal axis extraction and promax rotation was used to analyze the findings. The results of the factor analysis of the HPQ-PS revealed three salient and meaningful factors: student task orientation/efficiency, student competence, and teacher support. The factor analysis of the HPQ-TS uncovered two salient and substantive factors: student responsibility and student competence. The findings of this study suggest that the HPQ is a promising set of measures for assessing student homework functioning and contextual factors that may influence performance. Directions for future research are presented.
Assessing Children’s Homework Performance: Development of Multi-Dimensional, Multi-Informant Rating Scales

PubMed Central

Power, Thomas J.; Dombrowski, Stefan C.; Watkins, Marley W.; Mautone, Jennifer A.; Eagle, John W.

2007-01-01

Efforts to develop interventions to improve homework performance have been impeded by limitations in the measurement of homework performance. This study was conducted to develop rating scales for assessing homework performance among students in elementary and middle school. Items on the scales were intended to assess student strengths as well as deficits in homework performance. The sample included 163 students attending two school districts in the Northeast. Parents completed the 36-item Homework Performance Questionnaire – Parent Scale (HPQ-PS). Teachers completed the 22-item teacher scale (HPQ-TS) for each student for whom the HPQ-PS had been completed. A common factor analysis with principal axis extraction and promax rotation was used to analyze the findings. The results of the factor analysis of the HPQ-PS revealed three salient and meaningful factors: student task orientation/efficiency, student competence, and teacher support. The factor analysis of the HPQ-TS uncovered two salient and substantive factors: student responsibility and student competence. The findings of this study suggest that the HPQ is a promising set of measures for assessing student homework functioning and contextual factors that may influence performance. Directions for future research are presented. PMID:18516211
Adherence to the items in a bundle for the prevention of ventilator-associated pneumonia.

PubMed

Sachetti, Amanda; Rech, Viviane; Dias, Alexandre Simões; Fontana, Caroline; Barbosa, Gilberto da Luz; Schlichting, Dionara

2014-01-01

To assess adherence to a ventilator care bundle in an intensive care unit and to determine the impact of adherence on the rates of ventilator-associated pneumonia. A total of 198 beds were assessed for 60 days using a checklist that consisted of the following items: bed head elevation to 30 to 45°; position of the humidifier filter; lack of fluid in the ventilator circuit; oral hygiene; cuff pressure; and physical therapy. Next, an educational lecture was delivered, and 235 beds were assessed for the following 60 days. Data were also collected on the incidence of ventilator-acquired pneumonia. Adherence to the following ventilator care bundle items increased: bed head elevation from 18.7% to 34.5%; lack of fluid in the ventilator circuit from 55.6% to 72.8%; oral hygiene from 48.5% to 77.8%; and cuff pressure from 29.8% to 51.5%. The incidence of ventilator-associated pneumonia was statistically similar before and after intervention (p=0.389). The educational intervention performed in this study increased the adherence to the ventilator care bundle, but the incidence of ventilator-associated pneumonia did not decrease in the small sample that was assessed.

Test-retest reliability of Physical Activity Neighborhood Environment Scale among urban men and women in Nanjing, China.

PubMed

Zhao, L; Wang, Z; Qin, Z; Leslie, E; He, J; Xiong, Y; Xu, F

2018-03-01

The identification of physical-activity-friendly built environment (BE) constructs is highly useful for physical activity promotion and maintenance. The Physical Activity Neighborhood Environment Scale (PANES) was developed for assessing BE correlates. However, PANES reliability has not been investigated among adults in China. A cross-sectional study. With multistage sampling approaches, 1568 urban adults (aged 35-74 years) were recruited for the initial survey on all 17 items of PANES Chinese version (PANES-CHN), with the survey repeated 7 days later for each participant. Intraclass correlation coefficient (ICC) was used to assess the test-retest reliability of PANES-CHN for each item. In total, 1551 participants completed both surveys (follow-up rate = 98.9%). Among participants (mean age: 54.7 ± 11.1 years), 47.8% were men, 22.1% were elders, and 22.7% had ≥13 years of education. Overall, the PANES-CHN demonstrated at least substantial reliability with ICCs ranging from 0.66 to 0.95 (core items), from 0.75 to 0.95 (recommended items), and from 0.78 to 0.87 (optional items). Similar outcomes were observed when data were analyzed by gender or age groups. The PANES-CHN has excellent test-retest reliability and thus has valuable utility for assessing urban BE attributes among Chinese adults. Copyright © 2017 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
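A per-item test-retest ICC like the ones reported above can be sketched as follows. The abstract does not say which ICC form was used, so this assumes the one-way random-effects, single-measurement form, ICC(1,1) = (MSB − MSW) / (MSB + (k − 1)·MSW); the function name and the response pairs are hypothetical.

```python
def icc_one_way(measurements):
    """One-way random-effects, single-measurement ICC.

    measurements: list of per-participant tuples, each holding that
    participant's k repeated responses to one item (k = 2 for a
    simple test-retest design).
    """
    n = len(measurements)
    k = len(measurements[0])
    subject_means = [sum(row) / k for row in measurements]
    grand_mean = sum(subject_means) / n

    # Between-subject mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    msw = sum(
        (x - m) ** 2
        for row, m in zip(measurements, subject_means)
        for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical (survey 1, survey 2) responses for one item.
toy_pairs = [(1, 2), (2, 1), (3, 3), (4, 4)]
print(round(icc_one_way(toy_pairs), 3))
```

Other ICC forms (two-way models, absolute agreement versus consistency) can give different values on the same data, so the form should be stated whenever coefficients are compared across studies.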
The Validity of the 16-Item Version of the Prodromal Questionnaire (PQ-16) to Screen for Ultra High Risk of Developing Psychosis in the General Help-Seeking Population

PubMed Central

Ising, Helga K.; Veling, Wim; Loewy, Rachel L.; Rietveld, Marleen W.; Rietdijk, Judith; Dragt, Sara; Klaassen, Rianne M. C.; Nieman, Dorien H.; Wunderink, Lex; Linszen, Don H.; van der Gaag, Mark

2012-01-01

In order to bring about implementation of routine screening for psychosis risk, a brief version of the Prodromal Questionnaire (PQ; Loewy et al., 2005) was developed and tested in a general help-seeking population. We assessed a consecutive patient sample of 3533 young adults who were help-seeking for nonpsychotic disorders at the secondary mental health services in the Hague with the PQ. We performed logistic regression analyses and CHi-squared Automatic Interaction Detector decision tree analysis to shorten the original 92 items. Receiver operating characteristic curves were used to examine the psychometric properties of the PQ-16. In the general help-seeking population, a cutoff score of 6 or more positively answered items on the 16-item version of the PQ produced correct classification of Comprehensive Assessment of At-Risk Mental State (Yung et al., 2005) psychosis risk/clinical psychosis in 44% of the cases, distinguishing Comprehensive Assessment of At-Risk Mental States (CAARMS) diagnosis from no CAARMS diagnosis with high sensitivity (87%) and specificity (87%). These results were comparable to the PQ-92. The PQ-16 is a good self-report screen for use in secondary mental health care services to select subjects for interviewing for psychosis risk. The low number of items makes it quite appropriate for screening large help-seeking populations, thus enhancing the feasibility of detection and treatment of ultra high-risk patients in routine mental health services. PMID:22516147
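The cutoff rule described above (6 or more positively answered items counts as a positive screen) maps onto sensitivity and specificity in a straightforward way. The sketch below is only an illustration: the function name, the scores, and the reference diagnoses are invented and do not come from the study.

```python
def screen_accuracy(scores, has_condition, cutoff=6):
    """Sensitivity and specificity of a 'score >= cutoff' screening rule.

    scores: number of positively answered items per person.
    has_condition: reference-standard diagnosis per person (True/False).
    """
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, has_condition):
        screened_positive = score >= cutoff
        if screened_positive and truth:
            tp += 1
        elif screened_positive and not truth:
            fp += 1
        elif not screened_positive and truth:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # share of true cases flagged
    specificity = tn / (tn + fp)  # share of non-cases cleared
    return sensitivity, specificity


# Hypothetical screening scores and reference diagnoses.
toy_scores = [7, 6, 5, 2, 8, 1]
toy_truth = [True, True, True, False, False, False]
sens, spec = screen_accuracy(toy_scores, toy_truth)
print(round(sens, 3), round(spec, 3))
```

Repeating the calculation over a range of cutoffs yields the points of a receiver operating characteristic curve, which is how a threshold such as 6 is typically chosen.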
Mini-Mental Status Examination: mixed Rasch model item analysis derived two different cognitive dimensions of the MMSE.

PubMed

Schultz-Larsen, Kirsten; Kreiner, Svend; Lomholt, Rikke Kirstine

2007-03-01

This study published in two companion papers assesses properties of the Mini-Mental State Examination (MMSE) with the purpose of improving the efficiencies of the methods of screening for cognitive impairment and dementia. An item analysis by conventional and mixed Rasch models was used to explore empirically derived cognitive dimensions of the MMSE, to assess item bias, and to construct diagnostic cut-points. The scores of 1,189 elderly residents were analyzed. Two dimensions of cognitive function, which are statistically and conceptually different from those obtained in previous studies, were derived. The corresponding sum scales were (1) age-correlated MMSE scale (A-MMSE scale: orientation to time, attention/calculation, naming, repetition, and three-stage command) and (2) non-age-correlated MMSE scale (B-MMSE scale: orientation to place, registration, recall, reading, and copying). The "writing" item was not included due to differential effects of age and sex. The analysis also showed that the study sample consisted of two cognitively different groups of elderly. The findings indicate that a two-scale solution is a stable and statistically supported framework for interpreting data obtained by means of the MMSE. Supplementary analyses are presented in the companion paper to explore the performance of this item response theory calibration as a screening test for dementia.
A comparison of three methods of assessing differential item functioning (DIF) in the Hospital Anxiety Depression Scale: ordinal logistic regression, Rasch analysis and the Mantel chi-square procedure.

PubMed

Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C

2014-12-01

It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
Development and validation of the positive affect and well-being scale for the neurology quality of life (Neuro-QOL) measurement system.

PubMed

Salsman, John M; Victorson, David; Choi, Seung W; Peterman, Amy H; Heinemann, Allen W; Nowinski, Cindy; Cella, David

2013-11-01

To develop and validate an item-response theory-based patient-reported outcomes assessment tool of positive affect and well-being (PAW). This is part of a larger NINDS-funded study to develop a health-related quality of life measurement system across major neurological disorders, called Neuro-QOL. Informed by a literature review and qualitative input from clinicians and patients, item pools were created to assess PAW concepts. Items were administered to a general population sample (N = 513) and a group of individuals with a variety of neurologic conditions (N = 581) for calibration and validation purposes, respectively. A 23-item calibrated bank and a 9-item short form of PAW were developed, reflecting components of positive affect, life satisfaction, or an overall sense of purpose and meaning. The Neuro-QOL PAW measure demonstrated sufficient unidimensionality and displayed good internal consistency, test-retest reliability, model fit, convergent and discriminant validity, and responsiveness. The Neuro-QOL PAW measure was designed to aid clinicians and researchers to better evaluate and understand the potential role of positive health processes for individuals with chronic neurological conditions. Further psychometric testing within and between neurological conditions, as well as testing in non-neurologic chronic diseases, will help evaluate the generalizability of this new tool.
Integrating prospective longitudinal data: modeling personality and health in the Terman Life Cycle and Hawaii Longitudinal Studies.

PubMed

Kern, Margaret L; Hampson, Sarah E; Goldberg, Lewis R; Friedman, Howard S

2014-05-01

The present study used a collaborative framework to integrate 2 long-term prospective studies: the Terman Life Cycle Study and the Hawaii Personality and Health Longitudinal Study. Within a 5-factor personality-trait framework, teacher assessments of child personality were rationally and empirically aligned to establish similar factor structures across samples. Comparable items related to adult self-rated health, education, and alcohol use were harmonized, and data were pooled on harmonized items. A structural model was estimated as a multigroup analysis. Harmonized child personality factors were then used to examine markers of physiological dysfunction in the Hawaii sample and mortality risk in the Terman sample. Harmonized conscientiousness predicted less physiological dysfunction in the Hawaii sample and lower mortality risk in the Terman sample. These results illustrate how collaborative, integrative work with multiple samples offers the exciting possibility that samples from different cohorts and ages can be linked together to directly test life span theories of personality and health. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
Posttraumatic maladaptive beliefs scale: evolution of the personal beliefs and reactions scale.

PubMed

Vogt, Dawne S; Shipherd, Jillian C; Resick, Patricia A

2012-09-01

The posttraumatic maladaptive beliefs scale (PMBS) was developed to measure maladaptive beliefs about current life circumstances that may occur following trauma exposure. This scale assesses maladaptive beliefs within three domains: (a) threat of harm, (b) self-worth and judgment, and (c) reliability and trustworthiness of others. Items for the PMBS were drawn from a larger preexisting measure that assesses a wide range of personal beliefs and reactions associated with trauma exposure. The construct validity of the PMBS was assessed in two independent samples of interpersonal trauma survivors. This article provides data to support the reliability and validity of the PMBS as an instrument to assess general, rather than trauma-specific, maladaptive beliefs that have relevance for functioning in the aftermath of a traumatic event. Moreover, the measure is sensitive to changes that occur in treatment, and the length of the measure (15 items) is practical for use in clinical settings.
Evaluation of the Multiple Sclerosis Walking Scale-12 (MSWS-12) in a Dutch sample: Application of item response theory.

PubMed

Mokkink, Lidwine Brigitta; Galindo-Garre, Francisca; Uitdehaag, Bernard Mj

2016-12-01

The Multiple Sclerosis Walking Scale-12 (MSWS-12) measures walking ability from the patients' perspective. We examined the quality of the MSWS-12 using an item response theory model, the graded response model (GRM). A total of 625 unique Dutch multiple sclerosis (MS) patients were included. After testing for unidimensionality, monotonicity, and absence of local dependence, a GRM was fit and item characteristics were assessed. Differential item functioning (DIF) for the variables gender, age, duration of MS, type of MS, and severity of MS was investigated, along with reliability, total test information, and the standard error of the trait level (θ). Confirmatory factor analysis showed a unidimensional structure of the 12 items of the scale, explaining 88% of the variance. Item 2 did not fit the GRM. Reliability was 0.93. Items 8 and 9 (of the 11- and 12-item versions, respectively) showed DIF on the variable severity, based on the Expanded Disability Status Scale (EDSS). However, the EDSS is strongly related to the content of both items. Our results confirm the good quality of the MSWS-12. The trait level (θ) scores and item parameters of both the 12- and 11-item versions were highly comparable, although we do not suggest changing the content of the MSWS-12. © The Author(s), 2016.
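For readers unfamiliar with the graded response model mentioned above, the following is a minimal sketch of how category probabilities are obtained under the logistic form of the GRM. The discrimination `a`, the thresholds, and the function name are hypothetical values chosen for illustration; they are not MSWS-12 item parameters.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Return P(X = k | theta) for k = 0..K under a logistic graded response model.

    theta: latent trait level
    a: item discrimination
    thresholds: increasing list of K threshold (location) parameters
    """
    # Cumulative probabilities P(X >= k), with P(X >= 0) = 1 and P(X >= K + 1) = 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative probabilities.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical five-category item evaluated at an average trait level
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.5, -0.5, 0.5, 1.5])
```

Higher discrimination values make the category probabilities change more sharply with θ, which is what the item characteristics in a fitted GRM summarise.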
Detecting and measuring deprivation in primary care: development, reliability and validity of a self-reported questionnaire: the DiPCare-Q

PubMed Central

Bischoff, Thomas; Diserens, Esther-Amélie; Herzig, Lilli; Meystre-Agustoni, Giovanna; Panese, Francesco; Favrat, Bernard; Sass, Catherine; Bodenmann, Patrick

2012-01-01

Objectives: Advances in biopsychosocial science have underlined the importance of taking social history and life course perspective into consideration in primary care. For both clinical and research purposes, this study aims to develop and validate a standardised instrument measuring both material and social deprivation at an individual level. Methods: We identified relevant potential questions regarding deprivation using a systematic review, structured interviews, focus group interviews and a think-aloud approach. Item response theory analysis was then used to reduce the length of the 38-item questionnaire and derive the deprivation in primary care questionnaire (DiPCare-Q) index using data obtained from a random sample of 200 patients during their planned visits to an ambulatory general internal medicine clinic. Patients completed the questionnaire a second time over the phone 3 days later to enable us to assess reliability. Content validity of the DiPCare-Q was then assessed by 17 general practitioners. Psychometric properties and validity of the final instrument were investigated in a second set of patients. The DiPCare-Q was administered to a random sample of 1898 patients attending one of 47 different private primary care practices in western Switzerland along with questions on subjective social status, education, source of income, welfare status and subjective poverty. Results: Deprivation was defined in three distinct dimensions: material (eight items), social (five items) and health deprivation (three items). Item consistency was high in both the derivation (Kuder-Richardson Formula 20 (KR20) = 0.827) and the validation set (KR20 = 0.778). The DiPCare-Q index was reliable (interclass correlation coefficients = 0.847) and was correlated to subjective social status (rs = −0.539). Conclusion: The DiPCare-Q is a rapid, reliable and validated instrument that may prove useful for measuring both material and social deprivation in primary care. PMID:22307103
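The Kuder-Richardson Formula 20 reported above can be sketched as follows for yes/no items. This is a minimal illustration that assumes the population variance of total scores (texts differ on this convention); the function name and the response matrix are hypothetical and are not DiPCare-Q data.

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) items.

    responses: list of respondents, each a list of 0/1 item scores.
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of respondents endorsing each item
    p = [sum(r[i] for r in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Variance of total scores (population variance is an assumption here)
    total_var = pvariance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum_pq / total_var)

# Hypothetical responses from four people to three yes/no items
toy_responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
kr20_value = kr20(toy_responses)
```

KR-20 is the special case of Cronbach's α for dichotomous items, so values are read on the same 0 to 1 scale.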
Can a Simple Dietary Index Derived from a Sub-Set of Questionnaire Items Assess Diet Quality in a Sample of Australian Adults?

PubMed Central

Trapp, Georgina S. A.; Knuiman, Matthew; Hooper, Paula; Ambrosini, Gina L.

2018-01-01

Large, longitudinal surveys often lack consistent dietary data, limiting the use of existing tools and methods that are available to measure diet quality. This study describes a method that was used to develop a simple index for ranking individuals according to their diet quality in a longitudinal study. The RESIDential Environments (RESIDE) project (2004–2011) collected dietary data in varying detail, across four time points. The most detailed dietary data were collected using a 24-item questionnaire at the final time point (n = 555; age ≥ 25 years). At preceding time points, sub-sets of the 24 items were collected. A RESIDE dietary guideline index (RDGI) that was based on the 24-items was developed to assess diet quality in relation to the Australian Dietary Guidelines. The RDGI scores were regressed on the longitudinal sub-sets of six and nine questionnaire items at T4, from which two simple index scores (S-RDGI1 and S-RDGI2) were predicted. The S-RDGI1 and S-RDGI2 showed reasonable agreement with the RDGI (Spearman’s rho = 0.78 and 0.84; gross misclassification = 1.8%; correct classification = 64.9% and 69.7%; and, Cohen’s weighted kappa = 0.58 and 0.64, respectively). For all of the indices, higher diet quality was associated with being female, undertaking moderate to high amounts of physical activity, not smoking, and self-reported health. The S-RDGI1 and S-RDGI2 explained 62% and 73% of the variation in RDGI scores, demonstrating that a large proportion of the variability in diet quality scores can be captured using a relatively small sub-set of questionnaire items. The methods described in this study can be applied elsewhere, in situations where limited dietary data are available, to generate a sample-specific score for ranking individuals according to diet quality. PMID:29652828
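The agreement statistic quoted above, Cohen's weighted kappa, can be sketched as below. The abstract does not state the weighting scheme, so linear disagreement weights are an assumption here, and the function name and ratings are hypothetical.

```python
def weighted_kappa(r1, r2, n_cat):
    """Cohen's weighted kappa with linear weights (an assumed weighting scheme).

    r1, r2: category labels 0..n_cat-1 assigned to the same people by two methods.
    """
    n = len(r1)
    # Observed joint proportions
    obs = [[0.0] * n_cat for _ in range(n_cat)]
    for a, b in zip(r1, r2):
        obs[a][b] += 1.0 / n
    row = [sum(obs[i]) for i in range(n_cat)]
    col = [sum(obs[i][j] for i in range(n_cat)) for j in range(n_cat)]

    def weight(i, j):
        # Linear disagreement weight: 0 on the diagonal, 1 for the furthest categories
        return abs(i - j) / (n_cat - 1)

    cells = [(i, j) for i in range(n_cat) for j in range(n_cat)]
    d_obs = sum(weight(i, j) * obs[i][j] for i, j in cells)
    d_exp = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    return 1.0 - d_obs / d_exp

# Hypothetical tertile classifications of six people by a full and a short index
full_index = [0, 0, 1, 1, 2, 2]
short_index = [0, 1, 1, 1, 2, 1]
kappa = weighted_kappa(full_index, short_index, n_cat=3)
```

Weighting penalises a disagreement by how many categories apart the two classifications are, which is why it suits ordered groupings such as tertiles or quartiles.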
Can a Simple Dietary Index Derived from a Sub-Set of Questionnaire Items Assess Diet Quality in a Sample of Australian Adults?

PubMed

Bivoltsis, Alexia; Trapp, Georgina S A; Knuiman, Matthew; Hooper, Paula; Ambrosini, Gina L

2018-04-13

Large, longitudinal surveys often lack consistent dietary data, limiting the use of existing tools and methods that are available to measure diet quality. This study describes a method that was used to develop a simple index for ranking individuals according to their diet quality in a longitudinal study. The RESIDential Environments (RESIDE) project (2004-2011) collected dietary data in varying detail, across four time points. The most detailed dietary data were collected using a 24-item questionnaire at the final time point (n = 555; age ≥ 25 years). At preceding time points, sub-sets of the 24 items were collected. A RESIDE dietary guideline index (RDGI) that was based on the 24-items was developed to assess diet quality in relation to the Australian Dietary Guidelines. The RDGI scores were regressed on the longitudinal sub-sets of six and nine questionnaire items at T4, from which two simple index scores (S-RDGI1 and S-RDGI2) were predicted. The S-RDGI1 and S-RDGI2 showed reasonable agreement with the RDGI (Spearman's rho = 0.78 and 0.84; gross misclassification = 1.8%; correct classification = 64.9% and 69.7%; and, Cohen's weighted kappa = 0.58 and 0.64, respectively). For all of the indices, higher diet quality was associated with being female, undertaking moderate to high amounts of physical activity, not smoking, and self-reported health. The S-RDGI1 and S-RDGI2 explained 62% and 73% of the variation in RDGI scores, demonstrating that a large proportion of the variability in diet quality scores can be captured using a relatively small sub-set of questionnaire items. The methods described in this study can be applied elsewhere, in situations where limited dietary data are available, to generate a sample-specific score for ranking individuals according to diet quality.
Relative Validity and Reliability of a 1-Week, Semiquantitative Food Frequency Questionnaire for Women Participating in the Supplemental Nutrition Assistance Program.

PubMed

Sanjeevi, Namrata; Freeland-Graves, Jeanne; George, Goldy Chacko

2017-12-01

The Supplemental Nutrition Assistance Program (SNAP) plays a critical role in reducing food insecurity by distribution of benefits at a monthly interval to participants. Households that receive assistance from SNAP spend at least three-quarters of benefits within the first 2 weeks of receipt. Because this expenditure pattern may be associated with lower food intake toward the end of the month, it is important to develop a tool that can assess the weekly diets of SNAP participants. The goal of this study was to develop and assess the relative validity and reliability of a semiquantitative 1-week food frequency questionnaire (FFQ) tailored to a population of women participating in SNAP. The FFQ was derived from an existing 195-item FFQ that was based on a reference period of 1 month. This 195-item FFQ has been validated in a population of low-income postpartum women who were recruited from central Texas during 2004. Mean daily servings of each food item in the 195-item FFQ completed by women who took part in the 2004 validation study were calculated to determine the most frequently consumed food items. Emphasis on these items led to the creation of a shorter, 1-week FFQ of only 95 items. This new 1-week instrument was compared with 3-day diet records to evaluate relative validity in a sample of women participating in SNAP. For reliability, the FFQ was administered a second time, separated by a 1-month time interval. The validity study included 70 female SNAP participants who were recruited from the partner agencies of the Central Texas Food Bank from March to June 2015. A subsample of 40 women participated in the reliability study. Outcome measures were mean nutrient intake values obtained from the two tests of the 95-item FFQ and 3-day diet records. Deattenuated Pearson correlation coefficients examined relationships in nutrient intake between the 95-item FFQ and 3-day diet records, and a paired samples t test determined differences in mean nutrient intake. Weighted Cohen's κ indicated agreement in quartile classification of study participants by the 95-item FFQ and 3-day diet records, according to nutrient intake. Test-retest reliability was assessed by intraclass correlations and weighted Cohen's κ. Mean deattenuated Pearson correlation between the FFQ and 3-day diet records was 0.61, and the weighted Cohen's κ=0.39. Finally, the average test-retest correlation and weighted Cohen's κ of the FFQ was 0.66 and 0.50, respectively. These results suggest that the 1-week, 95-item FFQ demonstrated acceptable relative validity and reliability in low-income women participating in SNAP in the southwestern United States. Copyright © 2017 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
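The deattenuated correlations mentioned above adjust an observed correlation for day-to-day variation in the reference diet records. The abstract does not say which correction the authors applied; the sketch below assumes one widely used form, in which the observed correlation is multiplied by sqrt(1 + λ/n), where λ is the ratio of within-person to between-person variance and n is the number of record days. The function name and all input values are hypothetical.

```python
import math

def deattenuate(r_obs, within_var, between_var, n_days):
    """Correct an observed correlation for within-person variation in the reference method.

    r_obs: observed correlation between the questionnaire and the diet records
    within_var, between_var: within- and between-person variance of the record nutrient
    n_days: number of replicate record days per person
    """
    lam = within_var / between_var  # within-to-between variance ratio
    return r_obs * math.sqrt(1 + lam / n_days)

# Hypothetical inputs: observed r = 0.5, variance ratio 3, three record days
r_corrected = deattenuate(r_obs=0.5, within_var=3.0, between_var=1.0, n_days=3)
```

The correction grows as the variance ratio rises or the number of record days falls, so nutrients with large day-to-day swings are adjusted the most.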
Confirmatory factor analysis of the Frommelt Attitude Toward Care of the Dying Scale (FATCOD-B) among Italian medical students.

PubMed

Leombruni, Paolo; Loera, Barbara; Miniotti, Marco; Zizzi, Francesca; Castelli, Lorys; Torta, Riccardo

2015-10-01

A steady increase in the number of patients requiring end-of-life care has been observed during the last decades. The assessment of healthcare students' attitudes toward end-of-life care is an important step in their curriculum, as it provides information about their disposition to practice palliative medicine. The Frommelt Attitude Toward Care of the Dying Scale (FATCOD-B) was developed to detect such a disposition, but its psychometric properties are yet to be clearly defined. A convenience sample of 608 second-year medical students participated in our study in the 2012/2013 and 2013/2014 academic years. All participants completed the FATCOD-B. The sample was randomly divided in two subsamples. In the item analysis, reliability (Cronbach's α), internal consistency (item-total correlations), and an exploratory factor analysis (EFA) were conducted using the first subsample (n = 300). Using the second subsample (n = 308), confirmatory factor analysis (CFA) was performed using the robust ML method in the Lisrel program. Reliability for all items was 0.699. Item-total correlations, ranging from 0.03 to 0.39, were weak. EFA identified a two-dimensional orthogonal solution, explaining 20% of total variance. CFA upheld the two-dimensional model, but the loadings on the dimensions and their respective indicators were weak and equal to zero for certain items. The findings of the present study suggest that the FATCOD-B measures a two-dimensional construct and that several items seem in need of revision. Future research oriented toward building a revised version of the scale should pay attention to item ambiguity and take particular care to distinguish among items that concern emotions and beliefs related to end-of-life care, as well as their subjects (e.g., the healthcare provider, the patient, his family).
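The two item-analysis statistics reported above, Cronbach's α and item-total correlations, can be sketched as follows. The example computes the corrected (item-rest) variant of the item-total correlation, which may differ from what the authors reported; the function names and scores are hypothetical and are not FATCOD-B data.

```python
import math
from statistics import mean, variance

def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a list of respondents, each a list of item scores."""
    k = len(scores[0])
    item_vars = [variance([r[i] for r in scores]) for i in range(k)]
    total_var = variance([sum(r) for r in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def item_rest_correlations(scores):
    """Correlate each item with the sum of the remaining items (corrected item-total)."""
    k = len(scores[0])
    return [
        pearson([r[i] for r in scores], [sum(r) - r[i] for r in scores])
        for i in range(k)
    ]

# Hypothetical ratings from five respondents on three items
toy_scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 4, 4], [3, 3, 4]]
alpha = cronbach_alpha(toy_scores)
item_rest = item_rest_correlations(toy_scores)
```

Removing the item from the total before correlating avoids inflating the coefficient, which matters most for short scales.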
Assessing Psychopathy Among Justice Involved Adolescents with the PCL: YV: An Item Response Theory Examination Across Gender

PubMed Central

Tsang, Siny; Schmidt, Karen M.; Vincent, Gina M.; Salekin, Randall T.; Moretti, Marlene M.; Odgers, Candice L.

2014-01-01

This study used an item response theory (IRT) model and a large adolescent sample of justice involved youth (N = 1,007, 38% female) to examine the item functioning of the Psychopathy Checklist – Youth Version (PCL: YV). Items that were most discriminating (or most sensitive to changes) of the latent trait (thought to be psychopathy) among adolescents included “Glibness/superficial charm”, “Lack of remorse”, and “Need for stimulation”, whereas items that were least discriminating included “Pathological lying”, “Failure to accept responsibility”, and “Lacks goals.” The items “Impulsivity” and “Irresponsibility” were the most likely to be rated high among adolescents, whereas “Parasitic lifestyle”, and “Glibness/superficial charm” were the most likely to be rated low. Evidence of differential item functioning (DIF) on four of the 13 items was found between boys and girls. “Failure to accept responsibility” and “Impulsivity” were endorsed more frequently to describe adolescent girls than boys at similar levels of the latent trait, and vice versa for “Grandiose sense of self-worth” and “Lacks goals.” The DIF findings suggest that four PCL: YV items function differently between boys and girls. PMID:25580672
Do early changes in the HAM-D-17 anxiety/somatization factor items affect the treatment outcome among depressed outpatients? Comparison of two controlled trials of St John's wort (Hypericum perforatum) versus a SSRI.

PubMed

Bitran, Stella; Farabaugh, Amy H; Ameral, Victoria E; LaRocca, Rachel A; Clain, Alisabet J; Fava, Maurizio; Mischoulon, David

2011-07-01

To assess whether early changes in Hamilton Depression Rating Scale-17 anxiety/somatization items predict remission in two controlled studies of Hypericum perforatum (St John's wort) versus selective serotonin reuptake inhibitors for major depressive disorder. The Hypericum Depression Trial Study Group (National Institute of Mental Health) randomized 340 patients to Hypericum, sertraline, or placebo for 8 weeks, whereas the Massachusetts General Hospital study randomized 135 patients to Hypericum, fluoxetine, or placebo for 12 weeks. The investigators examined whether remission was associated with early changes in anxiety/somatization symptoms. In the National Institute of Mental Health study, significant associations were observed between remission and early improvement in the anxiety (psychic) item (sertraline arm), somatic (gastrointestinal item; Hypericum arm), and somatic (general) symptoms (placebo arm). None of the three treatment arms of the Massachusetts General Hospital study showed significant associations between anxiety/somatization symptoms and remission. When both study samples were pooled, we found associations for anxiety (psychic; selective serotonin reuptake inhibitors arm), somatic (gastrointestinal), and hypochondriasis (Hypericum arm), and anxiety (psychic) and somatic (general) symptoms (placebo arm). In the entire sample, remission was associated with the improvement in the anxiety (psychic), somatic (gastrointestinal), and somatic (general) items. The number and the type of anxiety/somatization items associated with remission varied depending on the intervention. Early scrutiny of the Hamilton Depression Rating Scale-17 anxiety/somatization items may help to predict remission of major depressive disorder.
Cross-cultural development and psychometric evaluation of a measure to assess fear of childbirth prior to pregnancy.

PubMed

Stoll, Kathrin; Hauck, Yvonne; Downe, Soo; Edmonds, Joyce; Gross, Mechthild M; Malott, Anne; McNiven, Patricia; Swift, Emma; Thomson, Gillian; Hall, Wendy A

2016-06-01

Assessment of childbirth fear, in advance of pregnancy, and early identification of modifiable factors contributing to fear can inform public health initiatives and/or school-based educational programming for the next generation of maternity care consumers. We developed and evaluated a short fear of birth scale that incorporates the most common dimensions of fear reported by men and women prior to pregnancy, fear of: labour pain, being out of control and unable to cope with labour and birth, complications, and irreversible physical damage. University students in six countries (Australia, Canada, England, Germany, Iceland, and the United States, n = 2240) participated in an online survey to assess their fears and attitudes about birth. We report internal consistency reliability, corrected-item-to-total correlations, factor loadings and convergent and discriminant validity of the new scale. The Childbirth Fear - Prior to Pregnancy (CFPP) scale showed high internal consistency across samples (α > 0.86). All corrected-item-to total correlations exceeded 0.45, supporting the uni-dimensionality of the scale. Construct validity of the CFPP was supported by a high correlation between the new scale and a two-item visual analogue scale that measures fear of birth (r > 0.6 across samples). Weak correlations of the CFPP with scores on measures that assess related psychological states (anxiety, depression and stress) support the discriminant validity of the scale. The CFPP is a short, reliable and valid measure of childbirth fear among young women and men in six countries who plan to have children. Copyright © 2016 Elsevier B.V. All rights reserved.
Validation of a score tool for measurement of histological severity in juvenile dermatomyositis and association with clinical severity of disease

PubMed Central

Varsani, Hemlata; Charman, Susan C; Li, Charles K; Marie, Suely K N; Amato, Anthony A; Banwell, Brenda; Bove, Kevin E; Corse, Andrea M; Emslie-Smith, Alison M; Jacques, Thomas S; Lundberg, Ingrid E; Minetti, Carlo; Nennesmo, Inger; Rushing, Elisabeth J; Sallum, Adriana M E; Sewry, Caroline; Pilkington, Clarissa A; Holton, Janice L; Wedderburn, Lucy R

2015-01-01

Objectives: To study muscle biopsy tissue from patients with juvenile dermatomyositis (JDM) in order to test the reliability of a score tool designed to quantify the severity of histological abnormalities when applied to biceps humeri in addition to quadriceps femoris. Additionally, to evaluate whether elements of the tool correlate with clinical measures of disease severity. Methods: 55 patients with JDM with muscle biopsy tissue and clinical data available were included. Biopsy samples (33 quadriceps, 22 biceps) were prepared and stained using standardised protocols. A Latin square design was used by the International Juvenile Dermatomyositis Biopsy Consensus Group to score cases using our previously published score tool. Reliability was assessed by intraclass correlation coefficient (ICC) and scorer agreement (α) by assessing variation in scorers’ ratings. Scores from the most reliable tool items correlated with clinical measures of disease activity at the time of biopsy. Results: Inter- and intraobserver agreement was good or high for many tool items, including overall assessment of severity using a Visual Analogue Scale. The tool functioned equally well on biceps and quadriceps samples. A modified tool using the most reliable score items showed good correlation with measures of disease activity. Conclusions: The JDM biopsy score tool has high inter- and intraobserver agreement and can be used on both biceps and quadriceps muscle tissue. Importantly, the modified tool correlates well with clinical measures of disease activity. We propose that standardised assessment of muscle biopsy tissue should be considered in diagnostic investigation and clinical trials in JDM. PMID:24064003
Dietary intake among adults in Trinidad and Tobago and development of a quantitative food frequency questionnaire to highlight nutritional needs for lifestyle interventions.

PubMed

Ramdath, D Dan; Hilaire, Debbie G; Cheong, Kimlyn D; Sharma, Sangita

2011-09-01

To create a food list and develop a draft quantitative food frequency questionnaire (QFFQ) for Trinidad and Tobago. A mixed sampling method was used to obtain a representative sample and trained interviewers administered 24-h dietary recalls. Portion sizes were assessed and the most frequently reported foods were tabulated. Results are from 155 men and 169 women aged 21-64 years. The most frequently reported food items were: full-cream milk (64%), rice (61%), and sweetened fruit drinks (50%). Carbonated drinks were consumed by 28%. The most frequently consumed fruits were banana (23%) and citrus (22%); < 20% consumed a vegetable food item. The final QFFQ contains 146 items: 19 breads/cakes/cereals; seven rice/pastas/noodles; 12 dairy; 26 meats/poultry/fish/soy products; 15 fruits; 34 vegetables; six legumes; 11 other; 12 drinks; four alcoholic drinks. A list of commonly consumed foods in Trinidad and Tobago was obtained and a draft QFFQ was prepared.
[A Screening-Tool for Three Dimensions of Work-Related Behavior and Experience Patterns in the Psychosomatic Rehabilitation - A Proposal for a Short-Form of the Occupational Stress and Coping Inventory (AVEM-3D)].

PubMed

Beierlein, V; Köllner, V; Neu, R; Schulz, H

2016-12-01

Objectives: The assessment of work pressures is of particular importance in psychosomatic rehabilitation. The Occupational Stress and Coping Inventory (German abbreviation: AVEM) is an established questionnaire, but it is long and time-consuming to score in routine clinical care. This study therefore tested whether a shortened version of the AVEM can be developed that assesses the three previously described second-order factors of the AVEM (Working Commitment, Resilience, and Emotions) with sufficient reliability and validity, and that can also be used to screen patients with prominent work-related behavior and experience patterns. Methods: Data were collected at admission from consecutive samples at three psychosomatic rehabilitation hospitals (N = 10,635 patients). The sample was randomly divided into two subsamples (a design sample and a validation sample). Using exploratory principal component analyses in the design sample, the items with the highest factor loadings for the three new scales were selected and then evaluated psychometrically in the validation sample. Possible cut-off values were to be derived from the distribution of scale scores. Relationships with sociodemographic, occupational, and diagnosis-related characteristics, as well as with patterns of work-related experiences and behaviors, were examined. Results: In the design sample, the first factor of each of the three principal component analyses explained between 31% and 34% of the variance. The 20 selected items were assigned to the 3-factor structure in the validation sample as expected. The three new scales are sufficiently reliable, with Cronbach's α between 0.84 and 0.88. The new scales are named after the second-order factors. Cut-off values for the identification of distinctive patient-reported data are proposed. Conclusion: The main advantage of the proposed short form, the AVEM-3D, is that the three main dimensions of relevant work-related behavior and experience patterns can be measured reliably with considerably fewer items. The proposed measure is simple and economical to use and interpret. Based on the present sample, we provide means and standard deviations as reference values at admission to psychosomatic rehabilitation. As a limitation, further evaluation of reliability, validity, and sensitivity to change is needed when only the items of the short form are administered. The practicability and validity of the proposed cut-off values cannot yet be conclusively assessed. Finally, the validity of the AVEM-3D in indication groups other than psychosomatic patients, and in healthy persons, remains to be examined. © Georg Thieme Verlag KG Stuttgart · New York.
An Evaluation of the Precision of Measurement of Ryff's Psychological Well-Being Scales in a Population Sample

ERIC Educational Resources Information Center

Abbott, Rosemary A.; Ploubidis, George B.; Huppert, Felicia A.; Kuh, Diana; Croudace, Tim J.

2010-01-01

The aim of this study is to assess the effective measurement range of Ryff's Psychological Well-being scales (PWB). It applies normal ogive item response theory (IRT) methodology using factor analysis procedures for ordinal data based on a limited information estimation approach. The data come from a sample of 1,179 women participating in a…
