Sample records for developing assessment items

  1. Methodology for Developing and Evaluating the PROMIS® Smoking Item Banks

    PubMed Central

    Cai, Li; Stucky, Brian D.; Tucker, Joan S.; Shadel, William G.; Edelen, Maria Orlando

    2014-01-01

    Introduction: This article describes the procedures used in the PROMIS® Smoking Initiative for the development and evaluation of item banks, short forms (SFs), and computerized adaptive tests (CATs) for the assessment of 6 constructs related to cigarette smoking: nicotine dependence, coping expectancies, emotional and sensory expectancies, health expectancies, psychosocial expectancies, and social motivations for smoking. Methods: Analyses were conducted using response data from a large national sample of smokers. Items related to each construct were subjected to extensive item factor analyses and evaluation of differential item functioning (DIF). Final item banks were calibrated, and SF assessments were developed for each construct. The performance of the SFs and the potential use of the item banks for CAT administration were examined through simulation study. Results: Item selection based on dimensionality assessment and DIF analyses produced item banks that were essentially unidimensional in structure and free of bias. Simulation studies demonstrated that the constructs could be accurately measured with a relatively small number of carefully selected items, either through fixed SFs or CAT-based assessment. Illustrative results are presented, and subsequent articles provide detailed discussion of each item bank in turn. Conclusions: The development of the PROMIS smoking item banks provides researchers with new tools for measuring smoking-related constructs. The use of the calibrated item banks and suggested SF assessments will enhance the quality of score estimates, thus advancing smoking research. Moreover, the methods used in the current study, including innovative approaches to item selection and SF construction, may have general relevance to item bank development and evaluation. PMID:23943843
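
    The CAT simulations described in this abstract depend on choosing, at each step, the item that is most informative at the respondent's current score estimate. The abstract gives neither the selection rule nor the response model, so the following is only a minimal Python sketch under stated assumptions: a dichotomous two-parameter logistic (2PL) model, maximum-information selection, and an invented three-item bank. The names `bank` and `next_item` and all parameter values are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing a dichotomous item under a 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, bank, administered):
    """Return the unadministered item with maximum information at theta."""
    candidates = [item for item in bank if item not in administered]
    return max(candidates, key=lambda item: item_information(theta, *bank[item]))


# Hypothetical bank (assumption): item id -> (discrimination a, difficulty b)
bank = {"q1": (1.0, -1.0), "q2": (2.0, 0.0), "q3": (1.5, 2.0)}

first = next_item(0.0, bank, administered=set())
second = next_item(0.0, bank, administered={first})
```

    In a full simulation the score estimate would be updated after every response and the loop would stop at a target precision or a maximum test length; both steps are omitted here.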

  2. Methodology for developing and evaluating the PROMIS smoking item banks.

    PubMed

    Hansen, Mark; Cai, Li; Stucky, Brian D; Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando

    2014-09-01

    This article describes the procedures used in the PROMIS Smoking Initiative for the development and evaluation of item banks, short forms (SFs), and computerized adaptive tests (CATs) for the assessment of 6 constructs related to cigarette smoking: nicotine dependence, coping expectancies, emotional and sensory expectancies, health expectancies, psychosocial expectancies, and social motivations for smoking. Analyses were conducted using response data from a large national sample of smokers. Items related to each construct were subjected to extensive item factor analyses and evaluation of differential item functioning (DIF). Final item banks were calibrated, and SF assessments were developed for each construct. The performance of the SFs and the potential use of the item banks for CAT administration were examined through simulation study. Item selection based on dimensionality assessment and DIF analyses produced item banks that were essentially unidimensional in structure and free of bias. Simulation studies demonstrated that the constructs could be accurately measured with a relatively small number of carefully selected items, either through fixed SFs or CAT-based assessment. Illustrative results are presented, and subsequent articles provide detailed discussion of each item bank in turn. The development of the PROMIS smoking item banks provides researchers with new tools for measuring smoking-related constructs. The use of the calibrated item banks and suggested SF assessments will enhance the quality of score estimates, thus advancing smoking research. Moreover, the methods used in the current study, including innovative approaches to item selection and SF construction, may have general relevance to item bank development and evaluation. © The Author 2013. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.

  3. The initial development of the WebMedQual scale: domain assessment of the construct of quality of health web sites.

    PubMed

    Provost, Mélanie; Koompalum, Dayin; Dong, Diane; Martin, Bradley C

    2006-01-01

    To develop a comprehensive instrument assessing quality of health-related web sites. Phase I consisted of a literature review to identify constructs thought to indicate web site quality and to identify items. During content analysis, duplicate items were eliminated and items that were not clear, meaningful, or measurable were reworded or removed. Some items were generated by the authors. Phase II: a panel consisting of six healthcare and MIS reviewers was convened to assess each item for its relevance and importance to the construct and to assess item clarity and measurement feasibility. Three hundred and eighty-four items were generated from 26 sources. The initial content analysis reduced the scale to 104 items. Four of the six expert reviewers responded; high concordance on the relevance, importance and measurement feasibility of each item was observed: 3 out of 4, or all raters agreed on 76-85% of items. Based on the panel ratings, 9 items were removed, 3 added, and 10 revised. The WebMedQual consists of 8 categories, 8 sub-categories, 95 items and 3 supplemental items to assess web site quality. The constructs are: content (19 items), authority of source (18 items), design (19 items), accessibility and availability (6 items), links (4 items), user support (9 items), confidentiality and privacy (17 items), e-commerce (6 items). The "WebMedQual" represents a first step toward a comprehensive and standard quality assessment of health web sites. This scale will allow relatively easy assessment of quality with possible numeric scoring.

  4. Development of the Assessment Items of Debris Flow Using the Delphi Method

    NASA Astrophysics Data System (ADS)

    Byun, Yosep; Seong, Joohyun; Kim, Mingi; Park, Kyunghan; Yoon, Hyungkoo

    2016-04-01

    In recent years in Korea, typhoons and localized extreme rainfall caused by abnormal climate have increased. Accordingly, debris flow is becoming one of the most dangerous natural disasters. This study aimed to develop assessment items that can be used in damage investigations of debris flows. The Delphi method was applied to classify the realms of assessment items. As a result, 29 assessment items, classified into 6 groups, were determined.

  5. A Time and Place for Everything: Developmental Differences in the Building Blocks of Episodic Memory

    PubMed Central

    Lee, Joshua K.; Wendelken, J. Carter; Bunge, Silvia A.; Ghetti, Simona

    2015-01-01

    This research investigated whether episodic memory development can be explained by improvements in relational binding processes, involved in forming novel associations between events and the context in which they occurred. Memory for item-space, item-time, and item-item relations was assessed in an ethnically diverse sample of 151 children aged 7 to 11 years and 28 young adults. Item-space memory reached adult performance by 9½ years, whereas item-time and item-item memory improved into adulthood. In path analysis, item-space, but not item-time best explained item-item memory. Across age groups, relational binding related to source memory and performance on standardized memory assessments. In conclusion, relational binding development depends on relation type, but relational binding overall supports episodic memory development. PMID:26493950

  6. Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form.

    PubMed

    Kisala, Pamela A; Tulsky, David S; Pace, Natalie; Victorson, David; Choi, Seung W; Heinemann, Allen W

    2015-05-01

    To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. SCI-QOL Stigma Item Bank. A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provides flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications.

  7. Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form

    PubMed Central

    Kisala, Pamela A.; Tulsky, David S.; Pace, Natalie; Victorson, David; Choi, Seung W.; Heinemann, Allen W.

    2015-01-01

    Objective: To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Design: Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Setting: Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Main Outcome Measures: SCI-QOL Stigma Item Bank. Results: A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. Conclusions: The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provides flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications. PMID:26010973
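
    The slopes and thresholds estimated with the Graded Response Model determine, for each item, how likely each response category is at a given level of the latent trait. A minimal Python sketch of that relationship follows, assuming the usual logistic form of the model. The function name and the slope and threshold values are invented for illustration and are not SCI-QOL Stigma parameters.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities under a logistic graded response model.

    `thresholds` must be strictly increasing; K thresholds give K + 1
    ordered response categories.
    """
    # Cumulative probabilities P(X >= k) for k = 1..K, bracketed by 1 and 0.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical 4-category item (assumption): one slope, three thresholds.
probs = grm_category_probs(theta=0.0, slope=1.8, thresholds=[-1.0, 0.0, 1.2])
```

    A larger slope makes the category probabilities change more sharply around each threshold, which is why high-slope items contribute more measurement precision to a short form or CAT.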

  8. Putting Interoperability to the Test: Building a Large Reusable Assessment Item Bank

    ERIC Educational Resources Information Center

    Sclater, Niall; MacDonald, Mary

    2004-01-01

    The COLA project has been developing a large bank of assessment items for units across the Scottish further education curriculum since May 2003. These will be made available to learners mainly via colleges' virtual learning environments (VLEs). Many people have been involved in the development of the COLA assessment item bank to ensure a high…

  9. Checking Equity: Why Differential Item Functioning Analysis Should Be a Routine Part of Developing Conceptual Assessments

    ERIC Educational Resources Information Center

    Martinková, Patrícia; Drabinová, Adéla; Liaw, Yuan-Ling; Sanders, Elizabeth A.; McFarland, Jenny L.; Price, Rebecca M.

    2017-01-01

    We provide a tutorial on differential item functioning (DIF) analysis, an analytic method useful for identifying potentially biased items in assessments. After explaining a number of methodological approaches, we test for gender bias in two scenarios that demonstrate why DIF analysis is crucial for developing assessments, particularly because…

  10. Development and validation of a socioculturally competent trust in physician scale for a developing country setting.

    PubMed

    Gopichandran, Vijayaprasad; Wouters, Edwin; Chetlapalli, Satish Kumar

    2015-05-03

    Trust in physicians is the unwritten covenant between the patient and the physician that the physician will do what is in the best interest of the patient. This forms the undercurrent of all healthcare relationships. Several scales exist for assessment of trust in physicians in developed healthcare settings, but to our knowledge none of these have been developed in a developing country context. To develop and validate a new trust in physician scale for a developing country setting. Dimensions of trust in physicians, which were identified in a previous qualitative study in the same setting, were used to develop a scale. This scale was administered among 616 adults selected from urban and rural areas of Tamil Nadu, south India, using a multistage sampling cross sectional survey method. The individual items were analysed using a classical test approach as well as item response theory. Cronbach's α was calculated and the item to total correlation of each item was assessed. After testing for unidimensionality and absence of local dependence, a 2 parameter logistic Samejima's graded response model was fit and item characteristics assessed. Competence, assurance of treatment, respect for the physician and loyalty to the physician were important dimensions of trust. A total of 31 items were developed using these dimensions. Of these, 22 were selected for final analysis. The Cronbach's α was 0.928. The item to total correlations were acceptable for all the 22 items. The item response analysis revealed good item characteristic curves and item information for all the items. Based on the item parameters and item information, a final 12 item scale was developed. The scale performs optimally in the low to moderate trust range. The final 12 item trust in physician scale has a good construct validity and internal consistency. Published by the BMJ Publishing Group Limited.

  11. Development and validation of a socioculturally competent trust in physician scale for a developing country setting

    PubMed Central

    Gopichandran, Vijayaprasad; Wouters, Edwin; Chetlapalli, Satish Kumar

    2015-01-01

    Trust in physicians is the unwritten covenant between the patient and the physician that the physician will do what is in the best interest of the patient. This forms the undercurrent of all healthcare relationships. Several scales exist for assessment of trust in physicians in developed healthcare settings, but to our knowledge none of these have been developed in a developing country context. Objectives: To develop and validate a new trust in physician scale for a developing country setting. Methods: Dimensions of trust in physicians, which were identified in a previous qualitative study in the same setting, were used to develop a scale. This scale was administered among 616 adults selected from urban and rural areas of Tamil Nadu, south India, using a multistage sampling cross sectional survey method. The individual items were analysed using a classical test approach as well as item response theory. Cronbach's α was calculated and the item to total correlation of each item was assessed. After testing for unidimensionality and absence of local dependence, a 2 parameter logistic Samejima's graded response model was fit and item characteristics assessed. Results: Competence, assurance of treatment, respect for the physician and loyalty to the physician were important dimensions of trust. A total of 31 items were developed using these dimensions. Of these, 22 were selected for final analysis. The Cronbach's α was 0.928. The item to total correlations were acceptable for all the 22 items. The item response analysis revealed good item characteristic curves and item information for all the items. Based on the item parameters and item information, a final 12 item scale was developed. The scale performs optimally in the low to moderate trust range. Conclusions: The final 12 item trust in physician scale has a good construct validity and internal consistency. PMID:25941182
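
    The classical test theory step in this abstract (Cronbach's α plus item to total correlations) can be illustrated with a short Python sketch. It uses only the standard library and a small invented response matrix. The data, the function names, and the choice of the corrected item-total correlation (each item against the sum of the other items) are assumptions for illustration, not the authors' analysis.

```python
from statistics import mean, pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` is a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_variances = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_variance = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(rows, j):
    """Correlation of item j with the sum of the remaining items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson(item, rest)


# Hypothetical responses (assumption): 5 respondents x 3 Likert items.
responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1], [3, 3, 3]]
alpha = cronbach_alpha(responses)
item_totals = [corrected_item_total(responses, j) for j in range(3)]
```

    Items with a low corrected item-total correlation are the usual candidates for removal before an item response theory model is fitted.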

  12. Development of an instrument for the evaluation of advanced life support performance.

    PubMed

    Peltonen, L-M; Peltonen, V; Salanterä, S; Tommila, M

    2017-10-01

    Assessing advanced life support (ALS) competence requires validated instruments. Existing instruments include aspects of technical skills (TS), non-technical skills (NTS) or both, but one instrument for detailed assessment that suits all resuscitation situations is lacking. This study aimed to develop an instrument for the evaluation of the overall ALS performance of the whole team. This instrument development study had four phases. First, we reviewed literature and resuscitation guidelines to explore items to include in the instrument. Thereafter, we interviewed resuscitation team professionals (n = 66), using the critical incident technique, to determine possible additional aspects associated with the performance of ALS. Second, we developed an instrument based on the findings. Third, we used an expert panel (n = 20) to assess the validity of the developed instrument. Finally, we revised the instrument based on the experts' comments and tested it with six experts who evaluated 22 video recorded resuscitations. The final version of the developed instrument had 69 items divided into adherence to guidelines (28 items), clinical decision-making (5 items), workload management (12 items), team behaviour (8 items), information management (6 items), patient integrity and consideration of laymen (4 items) and work routines (6 items). The Cronbach's α values were good, and strong correlations between the overall performance and the instrument were observed. The instrument may be useful for detailed assessment of the team's overall performance, but the numerous items make the use demanding. The instrument is still under development, and more research is needed to determine its psychometric properties. © 2017 The Acta Anaesthesiologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.

  13. Developing an Engineering Design Process Assessment using Mixed Methods.

    PubMed

    Wind, Stefanie A; Alemdar, Meltem; Lingle, Jeremy A; Gale, Jessica D; Moore, Roxanne A

    Recent reforms in science education worldwide include an emphasis on engineering design as a key component of student proficiency in the Science, Technology, Engineering, and Mathematics disciplines. However, relatively little attention has been directed to the development of psychometrically sound assessments for engineering. This study demonstrates the use of mixed methods to guide the development and revision of K-12 Engineering Design Process (EDP) assessment items. Using results from a middle-school EDP assessment, this study illustrates the combination of quantitative and qualitative techniques to inform item development and revisions. Overall conclusions suggest that the combination of quantitative and qualitative evidence provides an in-depth picture of item quality that can be used to inform the revision and development of EDP assessment items. Researchers and practitioners can use the methods illustrated here to gather validity evidence to support the interpretation and use of new and existing assessments.

  14. Psychological distress in cancer survivors: the further development of an item bank.

    PubMed

    Smith, Adam B; Armes, Jo; Richardson, Alison; Stark, Dan P

    2013-02-01

    Assessment of psychological distress by patient report is necessary to meet patients' needs throughout the cancer journey. We have previously developed an item bank to assess psychological distress but not evaluated it for cancer survivors. Our first aim in this study was to test whether we could extend our item bank to include cancer survivors. The second aim was to examine whether the item bank could assess positive affect as a single construct alongside negative psychological symptoms. Responses from 1315 cancer survivors to the Hospital Anxiety and Depression Scale (HADS) and the Positive and Negative Affect Scale (PANAS) were considered for inclusion in a pre-existing item bank created from a heterogeneous sample of 4914 cancer patients. Differential item functioning (DIF) was used to assess whether HADS responses drawn from the two samples were equivalent. Common-item equating was used to anchor the shared (HADS) items, whilst the PANAS items were added. Item fit was evaluated at each stage, and misfitting items were removed. Unidimensionality was assessed with a principal components factor analysis. The DIF analysis did not reveal any differences between the HADS item locations from the two samples. Three misfitting PANAS items were removed, resulting in a final unidimensional bank of 80 items with good internal reliability (α = 0.85). The new item bank is valid for use across the cancer journey, including cancer survivors, and modestly improves the assessment of all levels of psychological distress and positive psychological function. Copyright © 2011 John Wiley & Sons, Ltd.

  15. Development and community-based validation of eight item banks to assess mental health.

    PubMed

    Batterham, Philip J; Sunderland, Matthew; Carragher, Natacha; Calear, Alison L

    2016-09-30

    There is a need for precise but brief screening of mental health problems in a range of settings. The development of item banks to assess depression and anxiety has resulted in new adaptive and static screeners that accurately assess severity of symptoms. However, expansion to a wider array of mental health problems is required. The current study developed item banks for eight mental health problems: social anxiety disorder, panic disorder, post-traumatic stress disorder, obsessive-compulsive disorder, adult attention-deficit hyperactivity disorder, drug use, psychosis and suicidality. The item banks were calibrated in a population-based Australian adult sample (N=3175) by administering large item pools (45-75 items) and excluding items on the basis of local dependence or measurement non-invariance. Item Response Theory parameters were estimated for each item bank using a two-parameter graded response model. Each bank consisted of 19-47 items, demonstrating excellent fit and precision across a range of -1 to 3 standard deviations from the mean. No previous study has developed such a broad range of mental health item banks. The calibrated item banks will form the basis of a new system of static and adaptive measures to screen for a broad array of mental health problems in the community. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  16. Development and content validation of performance assessments for endoscopic third ventriculostomy.

    PubMed

    Breimer, Gerben E; Haji, Faizal A; Hoving, Eelco W; Drake, James M

    2015-08-01

    This study aims to develop and establish the content validity of multiple expert rating instruments to assess performance in endoscopic third ventriculostomy (ETV), collectively called the Neuro-Endoscopic Ventriculostomy Assessment Tool (NEVAT). The important aspects of ETV were identified through a review of current literature, ETV videos, and discussion with neurosurgeons, fellows, and residents. Three assessment measures were subsequently developed: a procedure-specific checklist (CL), a CL of surgical errors, and a global rating scale (GRS). Neurosurgeons from various countries, all identified as experts in ETV, were then invited to participate in a modified Delphi survey to establish the content validity of these instruments. In each Delphi round, experts rated their agreement with including each procedural step, error, and GRS item in the respective instruments on a 5-point Likert scale. Seventeen experts agreed to participate in the study and completed all Delphi rounds. After item generation, a total of 27 procedural CL items, 26 error CL items, and 9 GRS items were posed to Delphi panelists for rating. An additional 17 procedural CL items, 12 error CL items, and 1 GRS item were added by panelists. After three rounds, strong consensus (>80% agreement) was achieved on 35 procedural CL items, 29 error CL items, and 10 GRS items. Moderate consensus (50-80% agreement) was achieved on an additional 7 procedural CL items and 1 error CL item. The final procedural and error checklist contained 42 and 30 items, respectively (divided into setup, exposure, navigation, ventriculostomy, and closure). The final GRS contained 10 items. We have established the content validity of three ETV assessment measures by iterative consensus of an international expert panel. Each measure provides unique assessment information and thus can be used individually or in combination, depending on the characteristics of the learner and the purpose of the assessment. These instruments must now be evaluated in both the simulated and operative settings, to determine their construct validity and reliability. Ultimately, the measures contained in the NEVAT may prove suitable for formative assessment during ETV training and potentially as summative assessment measures during certification.
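
    The consensus thresholds reported in this abstract (strong consensus above 80% agreement, moderate consensus at 50-80%) can be applied to one item's panel ratings as in the Python sketch below. The abstract does not say which points of the 5-point Likert scale count as agreement, so treating ratings of 4 or 5 as agreement is an assumption, as are the function name and the example ratings.

```python
def consensus_level(ratings, agree_min=4):
    """Classify one item from panel ratings on a 5-point Likert scale.

    agree_min=4 (ratings of 4 or 5 count as agreement) is an assumption;
    the >80% and 50-80% cut-offs come from the abstract.
    """
    share = sum(r >= agree_min for r in ratings) / len(ratings)
    if share > 0.80:
        return "strong"
    if share >= 0.50:
        return "moderate"
    return "none"


# Hypothetical ratings from a 17-member panel (assumption).
strong_item = consensus_level([5] * 15 + [2] * 2)    # 15 of 17 agree
moderate_item = consensus_level([4] * 10 + [3] * 7)  # 10 of 17 agree
```

    Under this reading, an item at exactly 80% agreement falls in the moderate band, because the strong band is defined as strictly greater than 80%.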

  17. Checking Equity: Why Differential Item Functioning Analysis Should Be a Routine Part of Developing Conceptual Assessments

    PubMed Central

    Martinková, Patrícia; Drabinová, Adéla; Liaw, Yuan-Ling; Sanders, Elizabeth A.; McFarland, Jenny L.; Price, Rebecca M.

    2017-01-01

    We provide a tutorial on differential item functioning (DIF) analysis, an analytic method useful for identifying potentially biased items in assessments. After explaining a number of methodological approaches, we test for gender bias in two scenarios that demonstrate why DIF analysis is crucial for developing assessments, particularly because simply comparing two groups’ total scores can lead to incorrect conclusions about test fairness. First, a significant difference between groups on total scores can exist even when items are not biased, as we illustrate with data collected during the validation of the Homeostasis Concept Inventory. Second, item bias can exist even when the two groups have exactly the same distribution of total scores, as we illustrate with a simulated data set. We also present a brief overview of how DIF analysis has been used in the biology education literature to illustrate the way DIF items need to be reevaluated by content experts to determine whether they should be revised or removed from the assessment. Finally, we conclude by arguing that DIF analysis should be used routinely to evaluate items in developing conceptual assessments. These steps will ensure more equitable—and therefore more valid—scores from conceptual assessments. PMID:28572182
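
    The abstract does not name the DIF methods the tutorial covers. One widely used approach that matches its central point, comparing groups only after matching respondents on total score, is the Mantel-Haenszel common odds ratio. The Python sketch below shows the calculation. The function name and the counts are hypothetical, and the method is offered as an illustration rather than as the authors' procedure.

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for one item.

    Each stratum holds counts for respondents matched on total score:
    (reference_correct, reference_wrong, focal_correct, focal_wrong).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical counts (assumption): two score levels, identical success
# rates in both groups within each level.
no_dif = mantel_haenszel_odds_ratio([(30, 10, 15, 5), (20, 20, 10, 10)])
```

    A value near 1 indicates that matched respondents from the two groups have similar odds of answering the item correctly. Values far from 1 flag the item for review by content experts, as the abstract recommends.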

  18. The development of a science process assessment for fourth-grade students

    NASA Astrophysics Data System (ADS)

    Smith, Kathleen A.; Welliver, Paul W.

    In this study, a multiple-choice test entitled the Science Process Assessment was developed to measure the science process skills of students in grade four. Based on the Recommended Science Competency Continuum for Grades K to 6 for Pennsylvania Schools, this instrument measured the skills of (1) observing, (2) classifying, (3) inferring, (4) predicting, (5) measuring, (6) communicating, (7) using space/time relations, (8) defining operationally, (9) formulating hypotheses, (10) experimenting, (11) recognizing variables, (12) interpreting data, and (13) formulating models. To prepare the instrument, classroom teachers and science educators were invited to participate in two science education workshops designed to develop an item bank of test questions applicable to measuring process skill learning. Participants formed writing teams and generated 65 test items representing the 13 process skills. After a comprehensive group critique of each item, 61 items were identified for inclusion into the Science Process Assessment item bank. To establish content validity, the item bank was submitted to a select panel of science educators for the purpose of judging item acceptability. This analysis yielded 55 acceptable test items and produced the Science Process Assessment, Pilot 1. Pilot 1 was administered to 184 fourth-grade students. Students were given a copy of the test booklet; teachers read each test aloud to the students. Upon completion of this first administration, data from the item analysis yielded a reliability coefficient of 0.73. Subsequently, 40 test items were identified for the Science Process Assessment, Pilot 2. Using the test-retest method, the Science Process Assessment, Pilot 2 (Test 1 and Test 2) was administered to 113 fourth-grade students. Reliability coefficients of 0.80 and 0.82, respectively, were ascertained. The correlation between Test 1 and Test 2 was 0.77. 
The results of this study indicate that (1) the Science Process Assessment, Pilot 2, is a valid and reliable instrument applicable to measuring the science process skills of students in grade four, (2) using educational workshops as a means of developing item banks of test questions is viable and productive in the test development process, and (3) involving classroom teachers and science educators in the test development process is educationally efficient and effective.

  19. Development of an item bank and computer adaptive test for role functioning.

    PubMed

    Anatchkova, Milena D; Rose, Matthias; Ware, John E; Bjorner, Jakob B

    2012-11-01

    Role functioning (RF) is a key component of health and well-being and an important outcome in health research. The aim of this study was to develop an item bank to measure impact of health on role functioning. A set of different instruments including 75 newly developed items asking about the impact of health on role functioning was completed by 2,500 participants. Established item response theory methods were used to develop an item bank based on the generalized partial credit model. Comparison of group mean bank scores of participants with different self-reported general health status and chronic conditions was used to test the external validity of the bank. After excluding items that did not meet established requirements, the final item bank consisted of a total of 64 items covering three areas of role functioning (family, social, and occupational). Slopes in the bank ranged between .93 and 4.37; the mean threshold range was -1.09 to -2.25. Item bank-based scores were significantly different for participants with and without chronic conditions and with different levels of self-reported general health. An item bank assessing health impact on RF across three content areas has been successfully developed. The bank can be used for development of short forms or computerized adaptive tests to be applied in the assessment of role functioning as one of the common denominators across applications of generic health assessment.

  20. 75 FR 43515 - National Assessment Governing Board; Meeting

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-26

    ... frameworks, developing appropriate student achievement levels for each grade and subject tested, developing... 12 economics, grades 4 and 8 reading, and grades 4 and 8 writing. The writing items are for the 2011 operational assessment; the reading items are for the 2013 pilot test; and the economics items are for the...

  1. Osmosis and Diffusion Conceptual Assessment

    PubMed Central

    Fisher, Kathleen M.; Williams, Kathy S.; Lineback, Jennifer Evarts

    2011-01-01

    Biology student mastery regarding the mechanisms of diffusion and osmosis is difficult to achieve. To monitor comprehension of these processes among students at a large public university, we developed and validated an 18-item Osmosis and Diffusion Conceptual Assessment (ODCA). This assessment includes two-tiered items, some adopted or modified from the previously published Diffusion and Osmosis Diagnostic Test (DODT) and some newly developed items. The ODCA, a validated instrument containing fewer items than the DODT and emphasizing different content areas within the realm of osmosis and diffusion, better aligns with our curriculum. Creation of the ODCA involved removal of six DODT item pairs, modification of another six DODT item pairs, and development of three new item pairs addressing basic osmosis and diffusion concepts. Responses to ODCA items testing the same concepts as the DODT were remarkably similar to responses to the DODT collected from students 15 yr earlier, suggesting that student mastery regarding the mechanisms of diffusion and osmosis remains elusive. PMID:22135375

  2. Enhancing self-report assessment of PTSD: development of an item bank.

    PubMed

    Del Vecchio, Nicole; Elwy, A Rani; Smith, Eric; Bottonari, Kathryn A; Eisen, Susan V

    2011-04-01

    The authors report results of work to enhance self-report posttraumatic stress disorder (PTSD) assessment by developing an item bank for use in a computer-adapted test. Computer-adapted tests have great potential to decrease the burden of PTSD assessment and outcomes monitoring. The authors conducted a systematic literature review of PTSD instruments, created a database of items, performed qualitative review and readability analysis, and conducted cognitive interviews with veterans diagnosed with PTSD. The systematic review yielded 480 studies in which 41 PTSD instruments comprising 993 items met inclusion criteria. The final PTSD item bank includes 104 items representing each of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV; American Psychiatric Association [APA], 1994), PTSD symptom clusters (reexperiencing, avoidance, and hyperarousal), and 3 additional subdomains (depersonalization, guilt, and sexual problems) that expanded the assessment item pool. Copyright © 2011 International Society for Traumatic Stress Studies.

  3. Development and Validity Testing of an Arthritis Self-Management Assessment Tool.

    PubMed

    Oh, HyunSoo; Han, SunYoung; Kim, SooHyun; Seo, WhaSook

    Because of the chronic, progressive nature of arthritis and the substantial effects it has on quality of life, patients may benefit from self-management. However, no valid, reliable self-management assessment tool has been devised for patients with arthritis. This study was conducted to develop a comprehensive self-management assessment tool for patients with arthritis, that is, the Arthritis Self-Management Assessment Tool (ASMAT). To develop a list of qualified items corresponding to the conceptual definitions and attributes of arthritis self-management, a measurement model was established on the basis of theoretical and empirical foundations. Content validity testing was conducted to evaluate whether listed items were suitable for assessing arthritis self-management. Construct validity and reliability of the ASMAT were tested. Construct validity was examined using confirmatory factor analysis and nomological validity. The 32-item ASMAT was developed with a sample composed of patients in a clinic in South Korea. Content validity testing validated the 32 items, which comprised medical (10 items), behavioral (13 items), and psychoemotional (9 items) management subscales. Construct validity testing of the ASMAT showed that the 32 items properly corresponded with conceptual constructs of arthritis self-management, and were suitable for assessing self-management ability in patients with arthritis. Reliability was also well supported. The ASMAT devised in the present study may aid the evaluation of patient self-management ability and the effectiveness of self-management interventions. The authors believe the developed tool may also aid the identification of problems associated with the adoption of self-management practice, and thus improve symptom management, independence, and quality of life of patients with arthritis.

  4. Evolution of a Test Item

    ERIC Educational Resources Information Center

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  5. Evaluating the Validity of Technology-Enhanced Educational Assessment Items and Tasks: An Empirical Approach to Studying Item Features and Scoring Rubrics

    ERIC Educational Resources Information Center

    Thomas, Ally

    2016-01-01

    With the advent of the newly developed Common Core State Standards and the Next Generation Science Standards, innovative assessments, including technology-enhanced items and tasks, will be needed to meet the challenges of developing valid and reliable assessments in a world of computer-based testing. In a recent critique of the next generation…

  6. A Study of STEM Assessments in Engineering, Science, and Mathematics for Elementary and Middle School Students

    ERIC Educational Resources Information Center

    Harwell, Michael; Moreno, Mario; Phillips, Alison; Guzey, S. Selcen; Moore, Tamara J.; Roehrig, Gillian H.

    2015-01-01

    The purpose of this study was to develop, scale, and validate assessments in engineering, science, and mathematics with grade appropriate items that were sensitive to the curriculum developed by teachers. The use of item response theory to assess item functioning was a focus of the study. The work is part of a larger project focused on increasing…

  7. Development of a simple 12-item theory-based instrument to assess the impact of continuing professional development on clinical behavioral intentions.

    PubMed

    Légaré, France; Borduas, Francine; Freitas, Adriana; Jacques, André; Godin, Gaston; Luconi, Francesca; Grimshaw, Jeremy

    2014-01-01

    Decision-makers in organizations providing continuing professional development (CPD) have identified the need for routine assessment of its impact on practice. We sought to develop a theory-based instrument for evaluating the impact of CPD activities on health professionals' clinical behavioral intentions. Our multipronged study had four phases. 1) We systematically reviewed the literature for instruments that used socio-cognitive theories to assess healthcare professionals' clinically-oriented behavioral intentions and/or behaviors; we extracted items relating to the theoretical constructs of an integrated model of healthcare professionals' behaviors and removed duplicates. 2) A committee of researchers and CPD decision-makers selected a pool of items relevant to CPD. 3) An international group of experts (n = 70) reached consensus on the most relevant items using electronic Delphi surveys. 4) We created a preliminary instrument with the items found most relevant and assessed its factorial validity, internal consistency and reliability (weighted kappa) over a two-week period among 138 physicians attending a CPD activity. Out of 72 potentially relevant instruments, 47 were analyzed. Of the 1218 items extracted from these, 16% were discarded as improperly phrased and 70% discarded as duplicates. Mapping the remaining items onto the constructs of the integrated model of healthcare professionals' behaviors yielded a minimum of 18 and a maximum of 275 items per construct. The partnership committee retained 61 items covering all seven constructs. Two iterations of the Delphi process produced consensus on a provisional 40-item questionnaire. Exploratory factorial analysis following test-retest resulted in a 12-item questionnaire. Cronbach's coefficients for the constructs varied from 0.77 to 0.85. 
    A 12-item theory-based instrument for assessing the impact of CPD activities on health professionals' clinical behavioral intentions showed adequate validity and reliability. Further studies could assess its responsiveness to behavior change following CPD activities and its capacity to predict health professionals' clinical performance.

  8. Development of a Simple 12-Item Theory-Based Instrument to Assess the Impact of Continuing Professional Development on Clinical Behavioral Intentions

    PubMed Central

    Légaré, France; Borduas, Francine; Freitas, Adriana; Jacques, André; Godin, Gaston; Luconi, Francesca; Grimshaw, Jeremy

    2014-01-01

    Background: Decision-makers in organizations providing continuing professional development (CPD) have identified the need for routine assessment of its impact on practice. We sought to develop a theory-based instrument for evaluating the impact of CPD activities on health professionals' clinical behavioral intentions. Methods and Findings: Our multipronged study had four phases. 1) We systematically reviewed the literature for instruments that used socio-cognitive theories to assess healthcare professionals' clinically-oriented behavioral intentions and/or behaviors; we extracted items relating to the theoretical constructs of an integrated model of healthcare professionals' behaviors and removed duplicates. 2) A committee of researchers and CPD decision-makers selected a pool of items relevant to CPD. 3) An international group of experts (n = 70) reached consensus on the most relevant items using electronic Delphi surveys. 4) We created a preliminary instrument with the items found most relevant and assessed its factorial validity, internal consistency and reliability (weighted kappa) over a two-week period among 138 physicians attending a CPD activity. Out of 72 potentially relevant instruments, 47 were analyzed. Of the 1218 items extracted from these, 16% were discarded as improperly phrased and 70% discarded as duplicates. Mapping the remaining items onto the constructs of the integrated model of healthcare professionals' behaviors yielded a minimum of 18 and a maximum of 275 items per construct. The partnership committee retained 61 items covering all seven constructs. Two iterations of the Delphi process produced consensus on a provisional 40-item questionnaire. Exploratory factorial analysis following test-retest resulted in a 12-item questionnaire. Cronbach's coefficients for the constructs varied from 0.77 to 0.85. Conclusion: A 12-item theory-based instrument for assessing the impact of CPD activities on health professionals' clinical behavioral intentions showed adequate validity and reliability. Further studies could assess its responsiveness to behavior change following CPD activities and its capacity to predict health professionals' clinical performance. PMID:24643173

  9. Development of a Food Frequency Questionnaire for Assessing Dietary Intake in Children and Adolescents in South America.

    PubMed

    Saravia, Luisa; González-Zapata, Laura I; Rendo-Urteaga, Tara; Ramos, Jamile; Collese, Tatiana Sadalla; Bove, Isabel; Delgado, Carlos; Tello, Florencia; Iglesia, Iris; Gonçalves Sousa, Ederson Dassler; De Moraes, Augusto César Ferreira; Carvalho, Heráclito Barbosa; Moreno, Luis A

    2018-03-01

    This study aimed to describe the development of a food frequency questionnaire (FFQ) to assess dietary intake in South American children and adolescents. A total of 345 children (aged 3-10 years) and 357 adolescents (aged 11-17 years) were included for analysis. The FFQ was designed to be self-administered and to assess dietary intake over the past 3 months. It was developed in Spanish and translated into Portuguese. Multiple approaches were considered to compile the food list, and 11 food groups were included. A food photo booklet was produced as supporting material. The FFQ items maintained a common core list among centers (47 items) and country-specific foods. The FFQ for Buenos Aires and Lima had a total of 63 items; there were 55 items for the FFQ in Medellin, 60 items for Montevideo, 58 items for Santiago, 67 items for Sao Paulo, and 68 items for Teresina. Alcohol was also incorporated in the adolescents' FFQ. We developed a semiquantitative, culturally adapted FFQ to assess dietary intake in children and adolescents in South America. It has an optimal size allowing its completion in a high proportion of the population; therefore, it can be used in epidemiological studies with South American children and adolescents. © 2018 The Obesity Society.

  10. Toward a More Systematic Assessment of Smoking: Development of a Smoking Module for PROMIS®

    PubMed Central

    Tucker, Joan S.; Shadel, William G.; Stucky, Brian D.; Cai, Li

    2012-01-01

    Introduction: The aim of the PROMIS® Smoking Initiative is to develop, evaluate, and standardize item banks to assess cigarette smoking behavior and biopsychosocial constructs associated with smoking for both daily and non-daily smokers. Methods: We used qualitative methods to develop the item pool (following the PROMIS® approach: e.g., literature search, “binning and winnowing” of items, and focus groups and cognitive interviews to finalize wording and format), and quantitative methods (e.g., factor analysis) to develop the item banks. Results: We considered a total of 1622 extant items, and 44 new items for inclusion in the smoking item banks. A final set of 277 items representing 11 conceptual domains was selected for field testing in a national sample of smokers. Using data from 3021 daily smokers in the field test, an iterative series of exploratory factor analyses and project team discussions resulted in six item banks: Positive Consequences of Smoking (40 items), Smoking Dependence/Craving (55 items), Health Consequences of Smoking (26 items), Psychosocial Consequences of Smoking (37 items), Coping Aspects of Smoking (30 items), and Social Factors of Smoking (23 items). Conclusions: Inclusion of a smoking domain in the PROMIS® framework will standardize measurement of key smoking constructs using state-of-the-art psychometric methods, and make them widely accessible to health care providers, smoking researchers and the large community of researchers using PROMIS® who might not otherwise include an assessment of smoking in their design. Next steps include reducing the number of items in each domain, conducting confirmatory analyses, and duplicating the process for non-daily smokers. PMID:22770824

  11. Toward a more systematic assessment of smoking: development of a smoking module for PROMIS®.

    PubMed

    Edelen, Maria O; Tucker, Joan S; Shadel, William G; Stucky, Brian D; Cai, Li

    2012-11-01

    The aim of the PROMIS® Smoking Initiative is to develop, evaluate, and standardize item banks to assess cigarette smoking behavior and biopsychosocial constructs associated with smoking for both daily and non-daily smokers. We used qualitative methods to develop the item pool (following the PROMIS® approach: e.g., literature search, "binning and winnowing" of items, and focus groups and cognitive interviews to finalize wording and format), and quantitative methods (e.g., factor analysis) to develop the item banks. We considered a total of 1622 extant items, and 44 new items for inclusion in the smoking item banks. A final set of 277 items representing 11 conceptual domains was selected for field testing in a national sample of smokers. Using data from 3021 daily smokers in the field test, an iterative series of exploratory factor analyses and project team discussions resulted in six item banks: Positive Consequences of Smoking (40 items), Smoking Dependence/Craving (55 items), Health Consequences of Smoking (26 items), Psychosocial Consequences of Smoking (37 items), Coping Aspects of Smoking (30 items), and Social Factors of Smoking (23 items). Inclusion of a smoking domain in the PROMIS® framework will standardize measurement of key smoking constructs using state-of-the-art psychometric methods, and make them widely accessible to health care providers, smoking researchers and the large community of researchers using PROMIS® who might not otherwise include an assessment of smoking in their design. Next steps include reducing the number of items in each domain, conducting confirmatory analyses, and duplicating the process for non-daily smokers. Copyright © 2012 Elsevier Ltd. All rights reserved.

  12. Survey Development to Assess College Students' Perceptions of the Campus Environment.

    PubMed

    Sowers, Morgan F; Colby, Sarah; Greene, Geoffrey W; Pickett, Mackenzie; Franzen-Castle, Lisa; Olfert, Melissa D; Shelnutt, Karla; Brown, Onikia; Horacek, Tanya M; Kidd, Tandalayo; Kattelmann, Kendra K; White, Adrienne A; Zhou, Wenjun; Riggsbee, Kristin; Yan, Wangcheng; Byrd-Bredbenner, Carol

    2017-11-01

    We developed and tested a College Environmental Perceptions Survey (CEPS) to assess college students' perceptions of the healthfulness of their campus. CEPS was developed in 3 stages: questionnaire development, validity testing, and reliability testing. Questionnaire development was based on an extensive literature review and input from an expert panel to establish content validity. Face validity was established with the target population using cognitive interviews with 100 college students. Concurrent-criterion validity was established with in-depth interviews (N = 30) of college students compared to surveys completed by the same 30 students. Surveys completed by college students from 8 universities (N = 1147) were used to test internal structure (factor analysis) and internal consistency (Cronbach's alpha). After development and testing, 15 items remained from the original 48 items. A 5-factor solution emerged: physical activity (4 items, α = .635), water (3 items, α = .773), vending (2 items, α = .680), healthy food (2 items, α = .631), and policy (2 items, α = .573). The mean total score for all universities was 62.71 (±11.16) on a 100-point scale. CEPS appears to be a valid and reliable tool for assessing college students' perceptions of their health-related campus environment.
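
    The internal-consistency figures quoted in this abstract are Cronbach's alpha values. As a minimal sketch of how that coefficient is computed from item-level responses, the Python below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name cronbach_alpha and the toy data are hypothetical and are not taken from the CEPS study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores -- list of items, each a list of scores with one entry per
                   respondent (all items must cover the same respondents).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent is the sum across items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Toy data: two items answered by four respondents (illustrative only).
toy_alpha = cronbach_alpha([[1, 2, 3, 4], [2, 2, 4, 4]])
```

    Because alpha depends on the number of items as well as on their intercorrelations, two-item subscales such as several of those listed above tend to produce lower values than longer subscales of similar quality.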

  13. What can we learn from PISA?: Investigating PISA's approach to scientific literacy

    NASA Astrophysics Data System (ADS)

    Schwab, Cheryl Jean

    This dissertation is an investigation of the relationship between the multidimensional conception of scientific literacy and its assessment. The Programme for International Student Assessment (PISA), developed under the auspices of the Organization for Economic Cooperation and Development (OECD), offers a unique opportunity to evaluate the assessment of scientific literacy. PISA developed a continuum of performance for scientific literacy across three competencies (i.e., process, content, and situation). Foundational to the interpretation of PISA science assessment is PISA's definition of scientific literacy, which I argue incorporates three themes drawn from history: (a) scientific way of thinking, (b) everyday relevance of science, and (c) scientific literacy for all students. Three coordinated studies were conducted to investigate the validity of PISA science assessment and offer insight into the development of items to assess scientific literacy. Multidimensional models of the internal structure of the PISA 2003 science items were found not to reflect the complex character of PISA's definition of scientific literacy. Although the multidimensional models across the three competencies significantly decreased the G² statistic from the unidimensional model, high correlations between the dimensions suggest that the dimensions are similar. A cognitive analysis of student verbal responses to PISA science items revealed that students were using competencies of scientific literacy, but the competencies were not elicited by the PISA science items at the depth required by PISA's definition of scientific literacy. Although student responses contained only knowledge of scientific facts and simple scientific concepts, students were using more complex skills to interpret and communicate their responses. Finally, the investigation of different scoring approaches and item response models illustrated different ways to interpret student responses to assessment items. These analyses highlighted the complexities of students' responses to the PISA science items and the use of the ordered partition model to accommodate different but equal item responses. The results of the three investigations are used to discuss ways to improve the development and interpretation of PISA's science items.

  14. Development and initial validation of an instrument to assess stressors among South African sports coaches.

    PubMed

    Kubayi, Alliance; Toriola, Abel; Didymus, Faye

    2018-06-01

    The aim of this series of studies was to develop and initially validate an instrument to assess stressors among South African sports coaches. In study one, a preliminary pool of 45 items was developed based on existing literature and an expert panel was employed to assess the content validity and applicability of these items. In study two, the 32 items that were retained after study one were analysed using principal component analysis (PCA). The resultant factorial structure comprised four components: environmental stressors, performance stressors, task-related stressors, and athlete stressors. These four components were made up of 26 items and, together, the components and items comprised the provisional Stressors in Sports Coaching Questionnaire (SSCQ). The results show that the SSCQ demonstrates acceptable internal consistency (.73-.89). The findings provide preliminary evidence that SSCQ is a valid tool to assess stressors among South African sports coaches.

  15. Developing Parallel Career and Occupational Development Objectives and Exercise (Test) Items in Spanish for Assessment and Evaluation.

    ERIC Educational Resources Information Center

    Muratti, Jose E.; And Others

    A parallel Spanish edition was developed of released objectives and objective-referenced items used in the National Assessment of Educational Progress (NAEP) in the field of Career and Occupational Development (COD). The Spanish edition was designed to assess the identical skills, attitudes, concepts, and knowledge of Spanish-dominant students…

  16. Methodology for the development and calibration of the SCI-QOL item banks

    PubMed Central

    Tulsky, David S.; Kisala, Pamela A.; Victorson, David; Choi, Seung W.; Gershon, Richard; Heinemann, Allen W.; Cella, David

    2015-01-01

    Objective: To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Methods: Individual interviews (n = 44) and focus groups (n = 65 individuals with SCI and n = 42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n = 877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n = 245) to assess test-retest reliability and stability. Participants and Procedures: A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. Results: We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury – Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. Conclusions: The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM. PMID:26010963
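
    The graded response model mentioned in this abstract can be sketched in a few lines of Python. The example below assumes the usual logistic form, in which the probability of responding in category k or higher is a logistic function of slope * (theta - threshold_k), and category probabilities are differences of adjacent cumulative probabilities. The function name grm_probs and all parameter values are hypothetical; they are not SCI-QOL calibration results.

```python
import math


def grm_probs(theta, slope, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      -- latent trait level of the respondent
    slope      -- item discrimination (a)
    thresholds -- ascending boundary parameters b_1..b_m (m + 1 categories)
    """
    # P(X >= 0) is 1 and P(X >= m + 1) is 0 by definition.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-slope * (theta - b))))
    cumulative.append(0.0)
    # Each category probability is the drop between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical five-category item; values are illustrative only.
example = grm_probs(theta=0.5, slope=1.8, thresholds=[-1.5, -0.5, 0.5, 1.5])
```

    Keeping the thresholds in ascending order is what guarantees that every category probability is non-negative.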

  17. Methodology for the development and calibration of the SCI-QOL item banks.

    PubMed

    Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

    2015-05-01

    To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury--Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment CenterSM.

  18. Measuring ability to assess claims about treatment effects: a latent trait analysis of items from the ‘Claim Evaluation Tools’ database using Rasch modelling

    PubMed Central

    Austvoll-Dahlgren, Astrid; Guttersrud, Øystein; Nsangi, Allen; Semakula, Daniel; Oxman, Andrew D

    2017-01-01

    Background: The Claim Evaluation Tools database contains multiple-choice items for measuring people’s ability to apply the key concepts they need to know to be able to assess treatment claims. We assessed items from the database using Rasch analysis to develop an outcome measure to be used in two randomised trials in Uganda. Rasch analysis is a form of psychometric testing relying on Item Response Theory. It is a dynamic way of developing outcome measures that are valid and reliable. Objectives: To assess the validity, reliability and responsiveness of 88 items addressing 22 key concepts using Rasch analysis. Participants: We administered four sets of multiple-choice items in English to 1114 people in Uganda and Norway, of which 685 were children and 429 were adults (including 171 health professionals). We scored all items dichotomously. We explored summary and individual fit statistics using the RUMM2030 analysis package. We used SPSS to perform distractor analysis. Results: Most items conformed well to the Rasch model, but some items needed revision. Overall, the four item sets had satisfactory reliability. We did not identify significant response dependence between any pairs of items and, overall, the magnitude of multidimensionality in the data was acceptable. The items had a high level of difficulty. Conclusion: Most of the items conformed well to the Rasch model’s expectations. Following revision of some items, we concluded that most of the items were suitable for use in an outcome measure for evaluating the ability of children or adults to assess treatment claims. PMID:28550019
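
    Since the items in this study were scored dichotomously and analysed with the Rasch model, a short Python sketch may help show what that model assumes: the probability of a correct answer depends only on the difference between person ability and item difficulty. The sketch also estimates one person's ability by Newton-Raphson maximum likelihood, treating item difficulties as known. The function names and the difficulty values are hypothetical, and this is not what RUMM2030 does internally.

```python
import math


def rasch_prob(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def estimate_ability(difficulties, responses, iterations=25):
    """Maximum-likelihood ability estimate for one person.

    difficulties -- known item difficulties (logits)
    responses    -- 0/1 scores, one per item
    """
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        # The likelihood has no finite maximum for all-wrong or all-right patterns.
        raise ValueError("ability is not estimable for a zero or perfect score")
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_prob(theta, b) for b in difficulties]
        gradient = raw - sum(probs)                    # observed minus expected score
        information = sum(p * (1 - p) for p in probs)  # test information at theta
        theta += gradient / information
    return theta


# Hypothetical three-item test; difficulties are illustrative only.
ability = estimate_ability([-1.0, 0.0, 1.0], [1, 1, 0])
```

    Under this model the raw score is a sufficient statistic for ability, so two people with the same number of correct answers on the same items receive the same estimate regardless of which items they answered correctly.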

  19. IRT Item Parameter Scaling for Developing New Item Pools

    ERIC Educational Resources Information Center

    Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua

    2017-01-01

    Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent…

  20. Development of Rasch-based item banks for the assessment of work performance in patients with musculoskeletal diseases.

    PubMed

    Mueller, Evelyn A; Bengel, Juergen; Wirtz, Markus A

    2013-12-01

    This study aimed to develop a self-description assessment instrument to measure work performance in patients with musculoskeletal diseases. In terms of the International Classification of Functioning, Disability and Health (ICF), work performance is defined as the degree of meeting the work demands (activities) at the actual workplace (environment). To account for the fact that work performance depends on the work demands of the job, we strived to develop item banks that allow a flexible use of item subgroups depending on the specific work demands of the patients' jobs. Item development included the collection of work tasks from literature and content validation through expert surveys and patient interviews. The resulting 122 items were answered by 621 patients with musculoskeletal diseases. Exploratory factor analysis to ascertain dimensionality and Rasch analysis (partial credit model) for each of the resulting dimensions were performed. Exploratory factor analysis resulted in four dimensions, and subsequent Rasch analysis led to the following item banks: 'impaired productivity' (15 items), 'impaired cognitive performance' (18), 'impaired coping with stress' (13) and 'impaired physical performance' (low physical workload 20 items, high physical workload 10 items). The item banks exhibited person separation indices (reliability) between 0.89 and 0.96. The assessment of work performance adds the activities component to the more commonly employed participation component of the ICF-model. The four item banks can be adapted to specific jobs where necessary without losing comparability of person measures, as the item banks are based on Rasch analysis.

  1. The Development of Practical Item Analysis Program for Indonesian Teachers

    ERIC Educational Resources Information Center

    Muhson, Ali; Lestari, Barkah; Supriyanto; Baroroh, Kiromim

    2017-01-01

    Item analysis plays an essential role in learning assessment. The item analysis program is designed to measure student achievement and instructional effectiveness. This study aimed to develop an item analysis program and verify its feasibility. This study uses a Research and Development (R & D) model. The procedure includes designing and…

  2. Development of the PROMIS positive emotional and sensory expectancies of smoking item banks.

    PubMed

    Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando; Stucky, Brian D; Li, Zhen; Hansen, Mark; Cai, Li

    2014-09-01

    The positive emotional and sensory expectancies of cigarette smoking include improved cognitive abilities, positive affective states, and pleasurable sensorimotor sensations. This paper describes development of Positive Emotional and Sensory Expectancies of Smoking item banks that will serve to standardize the assessment of this construct among daily and nondaily cigarette smokers. Data came from daily (N = 4,201) and nondaily (N = 1,183) smokers who completed an online survey. To identify a unidimensional set of items, we conducted item factor analyses, item response theory analyses, and differential item functioning analyses. Additionally, we evaluated the performance of fixed-item short forms (SFs) and computer adaptive tests (CATs) to efficiently assess the construct. Eighteen items were included in the item banks (15 common across daily and nondaily smokers, 1 unique to daily, 2 unique to nondaily). The item banks are strongly unidimensional, highly reliable (reliability = 0.95 for both), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.86). Results from simulated CATs indicated that, on average, fewer than 8 items are needed to assess the construct with adequate precision using the item banks. These analyses identified a new set of items that can assess the positive emotional and sensory expectancies of smoking in a reliable and standardized manner. Considerable efficiency in assessing this construct can be achieved by using the item bank SF, employing computer adaptive tests, or selecting subsets of items tailored to specific research or clinical purposes.

  3. Development and validation of Iranian children’s participation assessment scale

    PubMed Central

    Amini, Malek; Hassani Mehraban, Afsoon; Haghni, Hamid; Asgharnezhad, Ali Asghar; Khayatzadeh Mahani, Mohammad

    2016-01-01

    Background: Participation is largely shaped by culture and family, and no assessment scale exists for evaluating children’s participation in the Iranian context; the purpose of this study was therefore to develop a children’s participation assessment scale for Iranian children. Methods: The scale was developed in two phases. Phase I (planning): after reviewing the literature, adopting and compiling items from available evaluation tools in the area (such as CAPE, CPQ, CLASS, Life-H), and receiving advice from two expert panels, a preliminary 94-item questionnaire was prepared. Phase II (construction): a survey of 40 children and 21 of their parents assessed the popularity of each activity in Iran, reducing the questionnaire to 92 items; after face and content validity assessment, the final version was prepared with 71 items. Results: The final 71-item questionnaire was developed in parent-report and child-report versions. Based on the literature and the expert panels’ advice, the 71 items were categorized into 8 areas of occupation according to the Occupational Therapy Practice Framework (ADL, IADL, play, leisure, social participation, education, work, and sleep/rest). Conclusion: The Iranian children’s participation assessment scale is a useful and culturally relevant tool for measuring the participation of Iranian children. It can be used in rigorous clinical and population-based research. PMID:27390703

  4. Assessing psychological well-being: self-report instruments for the NIH Toolbox.

    PubMed

    Salsman, John M; Lai, Jin-Shei; Hendrie, Hugh C; Butt, Zeeshan; Zill, Nicholas; Pilkonis, Paul A; Peterson, Christopher; Stoney, Catherine M; Brouwers, Pim; Cella, David

    2014-02-01

    Psychological well-being (PWB) has a significant relationship with physical and mental health. As a part of the NIH Toolbox for the Assessment of Neurological and Behavioral Function, we developed self-report item banks and short forms to assess PWB. Expert feedback and literature review informed the selection of PWB concepts and the development of item pools for positive affect, life satisfaction, and meaning and purpose. Items were tested with a community-dwelling US Internet panel sample of adults aged 18 and above (N = 552). Classical and item response theory (IRT) approaches were used to evaluate unidimensionality, fit of items to the overall measure, and calibrations of those items, including differential item function (DIF). IRT-calibrated item banks were produced for positive affect (34 items), life satisfaction (16 items), and meaning and purpose (18 items). Their psychometric properties were supported based on the results of factor analysis, fit statistics, and DIF evaluation. All banks measured the concepts precisely (reliability ≥0.90) for more than 98% of participants. These adult scales and item banks for PWB provide the flexibility, efficiency, and precision necessary to promote future epidemiological, observational, and intervention research on the relationship of PWB with physical and mental health.

  5. Measuring Teaching Best Practice in the Induction Years: Development and Validation of an Item-Level Assessment

    ERIC Educational Resources Information Center

    Kingsley, Laurie; Romine, William

    2014-01-01

    Schools and teacher induction programs around the world routinely assess teaching best practice to inform accreditation, tenure/promotion, and professional development decisions. Routine assessment is also necessary to ensure that teachers entering the profession get the assistance they need to develop and succeed. We introduce the Item-Level…

  6. Development of an item bank for the assessment of depression in persons with mental illnesses and physical diseases using Rasch analysis.

    PubMed

    Forkmann, Thomas; Boecker, Maren; Norra, Christine; Eberle, Nicole; Kircher, Tilo; Schauerte, Patrick; Mischke, Karl; Westhofen, Martin; Gauggel, Siegfried; Wirtz, Markus

    2009-05-01

    The calibration of item banks provides the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. The present study aimed at developing a new item bank that allows for assessing depression in persons with mental and persons with somatic diseases. The sample consisted of 161 participants treated for a depressive syndrome, and 206 participants with somatic illnesses (103 cardiologic, 103 otorhinolaryngologic; overall mean age = 44.1 years, SD = 14.0; 44.7% women) to allow for validation of the item bank in both groups. Persons answered a pool of 182 depression items on a 5-point Likert scale. Evaluation of Rasch model fit (infit < 1.3), differential item functioning, dimensionality, local independence, item spread, item and person separation (>2.0), and reliability (>.80) resulted in a bank of 79 items with good psychometric properties. The bank provides items with a wide range of content coverage and may serve as a sound basis for computerized adaptive testing applications. It might also be useful for researchers who wish to develop new fixed-length scales for the assessment of depression in specific rehabilitation settings.
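
    The infit criterion mentioned above (infit < 1.3) can be illustrated with a short sketch of the information-weighted mean-square statistic. This is a generic textbook formulation, not the software used in the study; the function name and inputs are hypothetical, and the model-implied category probabilities for each person are assumed to have been computed already.

```python
def infit_mean_square(responses, category_probs):
    """Information-weighted (infit) mean-square fit statistic for one item.

    responses      -- observed category for each person (0, 1, 2, ...)
    category_probs -- for each person, the model-implied probabilities of
                      categories 0..m (assumed to come from a fitted model)
    Values near 1.0 indicate responses about as variable as the model expects.
    """
    squared_residuals = 0.0
    model_variance = 0.0
    for observed, probs in zip(responses, category_probs):
        expected = sum(k * p for k, p in enumerate(probs))
        variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residuals += (observed - expected) ** 2
        model_variance += variance
    if model_variance == 0.0:
        raise ValueError("model variance is zero; infit is undefined")
    return squared_residuals / model_variance

# Demo with made-up probabilities for three persons on a 3-category item.
demo_probs = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1], [0.1, 0.3, 0.6]]
print(round(infit_mean_square([1, 0, 2], demo_probs), 3))
```

    Under this formulation an item would be flagged when the statistic exceeds the chosen cutoff, such as the 1.3 reported in the abstract.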

  7. Measuring pain phenomena after spinal cord injury: Development and psychometric properties of the SCI-QOL Pain Interference and Pain Behavior assessment tools.

    PubMed

    Cohen, Matthew L; Kisala, Pamela A; Dyson-Hudson, Trevor A; Tulsky, David S

    2018-05-01

    Objective: To develop modern patient-reported outcome measures that assess pain interference and pain behavior after spinal cord injury (SCI). Design: Grounded-theory based qualitative item development; large-scale item calibration field-testing; confirmatory factor analyses; graded response model item response theory analyses; statistical linking techniques to transform scores to the Patient Reported Outcome Measurement Information System (PROMIS) metric. Setting: Five SCI Model Systems centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Interventions: N/A. Outcome measures: Spinal Cord Injury - Quality of Life (SCI-QOL) Pain Interference item bank, SCI-QOL Pain Interference short form, and SCI-QOL Pain Behavior scale. Results: Seven hundred fifty-seven individuals with traumatic SCI completed 58 items addressing various aspects of pain. Items were then separated by whether they assessed pain interference or pain behavior, and poorly functioning items were removed. Confirmatory factor analyses confirmed that each set of items was unidimensional, and item response theory analyses were used to estimate slopes and thresholds for the items. Ultimately, 7 items (4 from PROMIS) comprised the Pain Behavior scale and 25 items (18 from PROMIS) comprised the Pain Interference item bank. Ten of these 25 items were selected to form the Pain Interference short form. Conclusions: The SCI-QOL Pain Interference item bank and the SCI-QOL Pain Behavior scale demonstrated robust psychometric properties. The Pain Interference item bank is available as a computer adaptive test or short form for research and clinical applications, and scores are transformed to the PROMIS metric.
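
    To show how the slopes and thresholds mentioned above determine response probabilities, here is a minimal sketch of the graded response model for one item. The function name and the parameter values are hypothetical and do not come from the SCI-QOL calibration.

```python
import math

def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      -- latent trait level of the respondent
    slope      -- item discrimination (hypothetical value in the demo)
    thresholds -- increasing boundary locations b_1..b_m (hypothetical)
    Returns m + 1 probabilities for response categories 0..m.
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    # P(X >= k) for k = 1..m, padded with P(X >= 0) = 1 and P(X >= m+1) = 0.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent boundaries.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Demo with made-up parameters for a 5-category item.
probs = grm_category_probs(theta=0.0, slope=2.0, thresholds=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

    A larger slope concentrates the probability mass on fewer categories at a given trait level, which is why high-slope items contribute more information to a short form or adaptive test.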

  8. Measuring ability to assess claims about treatment effects: a latent trait analysis of items from the 'Claim Evaluation Tools' database using Rasch modelling.

    PubMed

    Austvoll-Dahlgren, Astrid; Guttersrud, Øystein; Nsangi, Allen; Semakula, Daniel; Oxman, Andrew D

    2017-05-25

    The Claim Evaluation Tools database contains multiple-choice items for measuring people's ability to apply the key concepts they need to know to be able to assess treatment claims. We assessed items from the database using Rasch analysis to develop an outcome measure to be used in two randomised trials in Uganda. Rasch analysis is a form of psychometric testing relying on Item Response Theory. It is a dynamic way of developing outcome measures that are valid and reliable. We aimed to assess the validity, reliability and responsiveness of 88 items addressing 22 key concepts using Rasch analysis. We administered four sets of multiple-choice items in English to 1114 people in Uganda and Norway, of whom 685 were children and 429 were adults (including 171 health professionals). We scored all items dichotomously. We explored summary and individual fit statistics using the RUMM2030 analysis package. We used SPSS to perform distractor analysis. Most items conformed well to the Rasch model, but some items needed revision. Overall, the four item sets had satisfactory reliability. We did not identify significant response dependence between any pairs of items and, overall, the magnitude of multidimensionality in the data was acceptable. The items had a high level of difficulty. Most of the items conformed well to the Rasch model's expectations. Following revision of some items, we concluded that most of the items were suitable for use in an outcome measure for evaluating the ability of children or adults to assess treatment claims.

  9. Developing and testing new smoking measures for the Health Plan Employer Data and Information Set.

    PubMed

    Pbert, Lori; Vuckovic, Nancy; Ockene, Judith K; Hollis, Jack F; Riedlinger, Karen

    2003-04-01

    To develop and test items for the Health Plan Employer Data and Information Set (HEDIS) that assess delivery of the full range of provider-delivered tobacco interventions. The authors identified potential items via literature review; items were reviewed by national experts. Face validity of candidate items was tested in focus groups. The final survey was sent to a random sample of 1711 adult primary care patients; the re-test survey was sent to self-identified smokers. The process identified reliable items to capture provider assessment of motivation and provision of assistance and follow-up. One can reliably assess patient self-report of provider delivery of the full range of brief tobacco interventions. Such assessment and feedback to health plans and providers may increase use of evidence-based brief interventions.

  10. MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

    ERIC Educational Resources Information Center

    Wang, Wen-Chung; Shih, Ching-Lin

    2010-01-01

    Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…

  11. Bilingual health literacy assessment using the Talking Touchscreen/la Pantalla Parlanchina: Development and pilot testing.

    PubMed

    Yost, Kathleen J; Webster, Kimberly; Baker, David W; Choi, Seung W; Bode, Rita K; Hahn, Elizabeth A

    2009-06-01

    Current health literacy measures are too long, imprecise, or have questionable equivalence of English and Spanish versions. The purpose of this paper is to describe the development and pilot testing of a new bilingual computer-based health literacy assessment tool. We analyzed literacy data from three large studies. Using a working definition of health literacy, we developed new prose, document and quantitative items in English and Spanish. Items were pilot tested on 97 English- and 134 Spanish-speaking participants to assess item difficulty. Items covered topics relevant to primary care patients and providers. English- and Spanish-speaking participants understood the tasks involved in answering each type of question. The English Talking Touchscreen was easy to use and the English and Spanish items provided good coverage of the difficulty continuum. Qualitative and quantitative results provided useful information on computer acceptability and initial item difficulty. After the items have been administered on the Talking Touchscreen (la Pantalla Parlanchina) to 600 English-speaking (and 600 Spanish-speaking) primary care patients, we will develop a computer adaptive test. This health literacy tool will enable clinicians and researchers to more precisely determine the level at which low health literacy adversely affects health and healthcare utilization.

  12. Development of the Leadership Influence Self-Assessment (LISA©) instrument.

    PubMed

    Shillam, Casey R; Adams, Jeffrey M; Bryant, Debbie Chatman; Deupree, Joy P; Miyamoto, Suzanne; Gregas, Matt

    This study aims to describe the development and psychometric evaluation of the Leadership Influence Self-Assessment (LISA©) tool. LISA© was designed to help nurse leaders assess and enhance their influence capacity by measuring influence traits and practices and identifying areas of strength and weakness. Concepts identified in the Adams Influence Model and input from content experts guided the development of 145 items for testing. Administered to 165 nurse leaders, the assessment was subjected to exploratory factor analysis (EFA). EFA yielded a four-factor solution that comprised 80 items. Cronbach's alpha for factors ranged between 0.912 and 0.938. All factor loadings were >0.4; the smallest factor contained 14 items. Items grouped together in the theoretical model also clustered together in the EFA. Preliminary psychometric testing supports validity and reliability of the LISA© and its potential use as a tool to assess influence capacity for purposes of leadership development and research.
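
    The Cronbach's alpha values reported above are an internal-consistency estimate computed per factor. A minimal sketch of the standard formula follows; the function name and the demo scores are hypothetical and unrelated to the LISA© data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a set of items.

    scores -- list of respondents, each a list of item scores of equal length
    Uses sample variances (n - 1 denominator) for items and for total scores.
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)

# Demo with made-up responses from five people to three 5-point items.
demo = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
print(round(cronbach_alpha(demo), 3))
```

    Alpha rises with the number of items as well as with their intercorrelation, so values above 0.9 for factors of 14 or more items are consistent with the factor sizes reported in the abstract.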

  13. Guide to an Assessment of Consumer Skills.

    ERIC Educational Resources Information Center

    Education Commission of the States, Denver, CO.

    This guide is intended to assist those interested in developing and/or assessing consumer skills. It is an accompaniment to a separate collection of survey items (mostly in a multiple choice format) designed to assess seventeen-year-olds' consumer skills. It is suggested that the items can be used as part of an item pool, as an instructional tool,…

  14. Assessment of Differential Item Functioning under Cognitive Diagnosis Models: The DINA Model Example

    ERIC Educational Resources Information Center

    Li, Xiaomin; Wang, Wen-Chung

    2015-01-01

    The assessment of differential item functioning (DIF) is routinely conducted to ensure test fairness and validity. Although many DIF assessment methods have been developed in the context of classical test theory and item response theory, they are not applicable for cognitive diagnosis models (CDMs), as the underlying latent attributes of CDMs are…

  15. Development of the Online Assessment of Athletic Training Education (OAATE) Instrument

    ERIC Educational Resources Information Center

    Carr, W. David; Frey, Bruce B.; Swann, Elizabeth

    2009-01-01

    Objective: To establish the validity and reliability of an online assessment instrument's items developed to track educational outcomes over time. Design and Setting: A descriptive study of the validation arguments and reliability testing of the assessment items. The instrument is available to graduating students enrolled in entry-level Athletic…

  16. Development of the Computer-Adaptive Version of the Late-Life Function and Disability Instrument

    PubMed Central

    Tian, Feng; Kopits, Ilona M.; Moed, Richard; Pardasaney, Poonam K.; Jette, Alan M.

    2012-01-01

    Background. Having psychometrically strong disability measures that minimize response burden is important in assessing older adults. Methods. Using the original 48 items from the Late-Life Function and Disability Instrument and newly developed items, a 158-item Activity Limitation and a 62-item Participation Restriction item pool were developed. The item pools were administered to a convenience sample of 520 community-dwelling adults 60 years or older. Confirmatory factor analysis and item response theory were employed to identify content structure, calibrate items, and build the computer-adaptive tests (CATs). We evaluated real-data simulations of 10-item CAT subscales. We collected data from 102 older adults to validate the 10-item CATs against the Veteran’s Short Form-36 and assessed test–retest reliability in a subsample of 57 subjects. Results. Confirmatory factor analysis revealed a bifactor structure, and multi-dimensional item response theory was used to calibrate an overall Activity Limitation Scale (141 items) and an overall Participation Restriction Scale (55 items). Fit statistics were acceptable (Activity Limitation: comparative fit index = 0.95, Tucker Lewis Index = 0.95, root mean square error approximation = 0.03; Participation Restriction: comparative fit index = 0.95, Tucker Lewis Index = 0.95, root mean square error approximation = 0.05). Correlations of the 10-item CATs with the full item banks were substantial (Activity Limitation: r = .90; Participation Restriction: r = .95). Test–retest reliability estimates were high (Activity Limitation: r = .85; Participation Restriction: r = .80). Strength and pattern of correlations with Veteran’s Short Form-36 subscales were as hypothesized. Each CAT, on average, took 3.56 minutes to administer. Conclusions. The Late-Life Function and Disability Instrument CATs demonstrated strong reliability, validity, accuracy, and precision. The Late-Life Function and Disability Instrument CAT can achieve psychometrically sound disability assessment in older persons while reducing respondent burden. Further research is needed to assess their ability to measure change in older adults. PMID:22546960

  17. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews.

    PubMed

    Shea, Beverley J; Grimshaw, Jeremy M; Wells, George A; Boers, Maarten; Andersson, Neil; Hamel, Candyce; Porter, Ashley C; Tugwell, Peter; Moher, David; Bouter, Lex M

    2007-02-15

    Our objective was to develop an instrument to assess the methodological quality of systematic reviews, building upon previous tools, empirical evidence and expert consensus. A 37-item assessment tool was formed by combining 1) the enhanced Overview Quality Assessment Questionnaire (OQAQ), 2) a checklist created by Sacks, and 3) three additional items recently judged to be of methodological importance. This tool was applied to 99 paper-based and 52 electronic systematic reviews. Exploratory factor analysis was used to identify underlying components. The results were considered by methodological experts using a nominal group technique aimed at item reduction and design of an assessment tool with face and content validity. The factor analysis identified 11 components. From each component, one item was selected by the nominal group. The resulting instrument was judged to have face and content validity. A measurement tool for the 'assessment of multiple systematic reviews' (AMSTAR) was developed. The tool consists of 11 items and has good face and content validity for measuring the methodological quality of systematic reviews. Additional studies are needed with a focus on the reproducibility and construct validity of AMSTAR, before strong recommendations can be made on its use.

  18. Evaluation of item candidates for a diabetic retinopathy quality of life item bank.

    PubMed

    Fenwick, Eva K; Pesudovs, Konrad; Khadka, Jyoti; Rees, Gwyn; Wong, Tien Y; Lamoureux, Ecosse L

    2013-09-01

    We are developing an item bank assessing the impact of diabetic retinopathy (DR) on quality of life (QoL) using a rigorous multi-staged process combining qualitative and quantitative methods. We describe here the first two qualitative phases: content development and item evaluation. After a comprehensive literature review, items were generated from four sources: (1) 34 previously validated patient-reported outcome measures; (2) five published qualitative articles; (3) eight focus groups and 18 semi-structured interviews with 57 DR patients; and (4) seven semi-structured interviews with diabetes or ophthalmic experts. Items were then evaluated during 3 stages, namely binning (grouping) and winnowing (reduction) based on key criteria and panel consensus; development of item stems and response options; and pre-testing of items via cognitive interviews with patients. The content development phase yielded 1,165 unique items across 7 QoL domains. After 3 sessions of binning and winnowing, items were reduced to a minimally representative set (n = 312) across 9 domains of QoL: visual symptoms; ocular surface symptoms; activity limitation; mobility; emotional; health concerns; social; convenience; and economic. After 8 cognitive interviews, 42 items were amended resulting in a final set of 314 items. We have employed a systematic approach to develop items for a DR-specific QoL item bank. The psychometric properties of the nine QoL subscales will be assessed using Rasch analysis. The resulting validated item bank will allow clinicians and researchers to better understand the QoL impact of DR and DR therapies from the patient's perspective.

  19. Development and Testing of the Church Environment Audit Tool.

    PubMed

    Kaczynski, Andrew T; Jake-Schoffman, Danielle E; Peters, Nathan A; Dunn, Caroline G; Wilcox, Sara; Forthofer, Melinda

    2018-05-01

    In this paper, we describe development and reliability testing of a novel tool to evaluate the physical environment of faith-based settings pertaining to opportunities for physical activity (PA) and healthy eating (HE). Tool development was a multistage process including a review of similar tools, stakeholder review, expert feedback, and pilot testing. Final tool sections included indoor opportunities for PA, outdoor opportunities for PA, food preparation equipment, kitchen type, food for purchase, beverages for purchase, and media. Two independent audits were completed at 54 churches. Interrater reliability (IRR) was determined with Kappa and percent agreement. Of 218 items, 102 were assessed for IRR and 116 could not be assessed because they were not present at enough churches. Percent agreement for all 102 items was over 80%. For 42 items, the sample was too homogeneous to assess Kappa. Forty-six of the remaining items had Kappas greater than 0.60 (25 items 0.80-1.00; 21 items 0.60-0.79), indicating substantial to almost perfect agreement. The tool proved reliable and efficient for assessing church environments and identifying potential intervention points. Future work can focus on applications within faith-based partnerships to understand how church environments influence diverse health outcomes.
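
    The two interrater reliability measures named above, percent agreement and Kappa, can be sketched for a single audit item as follows. This assumes Cohen's kappa for two raters, which the abstract does not specify beyond "Kappa"; the function names and demo ratings are hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of sites on which the two raters recorded the same value."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters; None when chance agreement is 1."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in set(rater_a) | set(rater_b)) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category: kappa is undefined.
        return None
    return (observed - expected) / (1 - expected)

# Demo with made-up presence/absence ratings at eight sites.
a = [1, 1, 0, 0, 1, 0, 1, 1]
b = [1, 1, 0, 1, 1, 0, 1, 0]
print(percent_agreement(a, b), round(cohens_kappa(a, b), 3))
```

    The `None` branch corresponds to the situation described in the abstract where the sample was too homogeneous to assess Kappa: when nearly every site receives the same rating, chance agreement approaches 1 and kappa becomes unstable or undefined even though percent agreement is high.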

  20. Better assessment of physical function: item improvement is neglected but essential

    PubMed Central

    2009-01-01

    Introduction: Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. Methods: The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. Results: We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Conclusions: Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes. PMID:20015354

  1. Better assessment of physical function: item improvement is neglected but essential.

    PubMed

    Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

    2009-01-01

    Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes.

  2. Developing an Assessment Method of Active Aging: University of Jyvaskyla Active Aging Scale.

    PubMed

    Rantanen, Taina; Portegijs, Erja; Kokko, Katja; Rantakokko, Merja; Törmäkangas, Timo; Saajanaho, Milla

    2018-01-01

    To develop an assessment method of active aging for research on older people. A multiphase process that included drafting by an expert panel, a pilot study for item analysis and scale validity, a feedback study with focus groups and questionnaire respondents, and a test-retest study. Altogether 235 people aged 60 to 94 years provided responses and/or feedback. We developed a 17-item University of Jyvaskyla Active Aging Scale with four aspects in each item (goals, ability, opportunity, and activity; range 0-272). The psychometric and item properties are good and the scale assesses a unidimensional latent construct of active aging. Our scale assesses older people's striving for well-being through activities pertaining to their goals, abilities, and opportunities. The University of Jyvaskyla Active Aging Scale provides a quantifiable measure of active aging that may be used in postal questionnaires or interviews in research and practice.

  3. Guideline appraisal with AGREE II: online survey of the potential influence of AGREE II items on overall assessment of guideline quality and recommendation for use.

    PubMed

    Hoffmann-Eßer, Wiebke; Siering, Ulrich; Neugebauer, Edmund A M; Brockhaus, Anne Catharina; McGauran, Natalie; Eikermann, Michaela

    2018-02-27

    The AGREE II instrument is the most commonly used guideline appraisal tool. It includes 23 appraisal criteria (items) organized within six domains. AGREE II also includes two overall assessments (overall guideline quality, recommendation for use). Our aim was to investigate how strongly the 23 AGREE II items influence the two overall assessments. An online survey of authors of publications on guideline appraisals with AGREE II and guideline users from a German scientific network was conducted between 10th February 2015 and 30th March 2015. Participants were asked to rate the influence of the AGREE II items on a Likert scale (0 = no influence to 5 = very strong influence). The frequencies of responses and their dispersion were presented descriptively. Fifty-eight of the 376 persons contacted (15.4%) participated in the survey and the data of the 51 respondents with prior knowledge of AGREE II were analysed. Items 7-12 of Domain 3 (rigour of development) and both items of Domain 6 (editorial independence) had the strongest influence on the two overall assessments. In addition, Items 15-17 (clarity of presentation) had a strong influence on the recommendation for use. Great variations were shown for the other items. The main limitation of the survey is the low response rate. In guideline appraisals using AGREE II, items representing rigour of guideline development and editorial independence seem to have the strongest influence on the two overall assessments. In order to ensure a transparent approach to reaching the overall assessments, we suggest the inclusion of a recommendation in the AGREE II user manual on how to consider item and domain scores. For instance, the manual could include an a-priori weighting of those items and domains that should have the strongest influence on the two overall assessments. The relevance of these assessments within AGREE II could thereby be further specified.

  4. DIFAS: Differential Item Functioning Analysis System. Computer Program Exchange

    ERIC Educational Resources Information Center

    Penfield, Randall D.

    2005-01-01

    Differential item functioning (DIF) is an important consideration in assessing the validity of test scores (Camilli & Shepard, 1994). A variety of statistical procedures have been developed to assess DIF in tests of dichotomous (Hills, 1989; Millsap & Everson, 1993) and polytomous (Penfield & Lam, 2000; Potenza & Dorans, 1995) items. Some of these…

  5. The Functional Arm Scale for Throwers (FAST)-Part I: The Design and Development of an Upper Extremity Region-Specific and Population-Specific Patient-Reported Outcome Scale for Throwing Athletes.

    PubMed

    Sauers, Eric L; Bay, R Curtis; Snyder Valier, Alison R; Ellery, Traci; Huxel Bliven, Kellie C

    2017-03-01

    Upper extremity (UE) region-specific, patient-reported outcome (PRO) scales assess injuries to the UE but do not account for the demands of overhead throwing athletes or measure patient-oriented domains of health-related quality of life (HRQOL). To develop the Functional Arm Scale for Throwers (FAST), a UE region-specific and population-specific PRO scale that assesses multiple domains of disablement in throwing athletes with UE injuries. In stage I, a beta version of the scale was developed for subsequent factor identification, final item reduction, and construct validity analysis during stage II. Descriptive laboratory study. Three-stage scale development was utilized: Stage I (item generation and initial item reduction) and stage II (factor analysis, final item reduction, and construct validity) are reported herein, and stage III (establishment of measurement properties [reliability and validity]) will be reported in a companion paper. In stage I, a beta version was developed, incorporating National Center for Medical Rehabilitation Research disablement domains and ensuring a blend of sport-related and non-sport-related items. An expert panel and focus group assessed importance and interpretability of each item. During stage II, the FAST was reduced, preserving variance characteristics and factor structure of the beta version and construct validity of the final FAST scale. During stage I, a 54-item beta version and a separate 9-item pitcher module were developed. During stage II, a 22-item FAST and 9-item pitcher module were finalized. The factor solution for FAST scale items included pain (n = 6), throwing (n = 10), activities of daily living (n = 5), psychological impact (n = 4), and advancement (n = 3). The 6-item pain subscale crossed factors. The remaining subscales and pitcher module are distinctive, correlated, and internally consistent and may be interpreted individually or combined. 
This article describes the development of the FAST, which assesses clinical outcomes and HRQOL of throwing athletes after UE injury. The FAST encompasses multiple domains of disability and demonstrates excellent construct validity. The FAST provides a single UE region-specific and population-specific PRO scale for high-demand throwers to facilitate measurement of impact of UE injuries on HRQOL and clinical outcomes while quantifying recovery for comparative effectiveness studies.

  6. Using Distractor-Driven Standards-Based Multiple-Choice Assessments and Rasch Modeling to Investigate Hierarchies of Chemistry Misconceptions and Detect Structural Problems with Individual Items

    ERIC Educational Resources Information Center

    Herrmann-Abell, Cari F.; DeBoer, George E.

    2011-01-01

    Distractor-driven multiple-choice assessment items and Rasch modeling were used as diagnostic tools to investigate students' understanding of middle school chemistry ideas. Ninety-one items were developed according to a procedure that ensured content alignment to the targeted standards and construct validity. The items were administered to 13360…

  7. Development of an Instrument for Measuring Self-Efficacy in Cell Biology

    ERIC Educational Resources Information Center

    Reeve, Suzanne; Kitchen, Elizabeth; Sudweeks, Richard R.; Bell, John D.; Bradshaw, William S.

    2011-01-01

    This article describes the development of a ten-item scale to assess biology majors' self-efficacy towards the critical thinking and data analysis skills taught in an upper-division cell biology course. The original seven-item scale was expanded to include three additional items based on the results of item analysis. Evidence of reliability and…

  8. Assessment in Science Education

    NASA Astrophysics Data System (ADS)

    Rustaman, N. Y.

    2017-09-01

    An analytical study focusing on scientific reasoning literacy was conducted to strengthen the emphasis on assessment in science, taking the importance of the nature of science and of assessment as references, together with higher-order thinking and scientific skills in assessing science learning. Drawing on a background in developing science process skills test items, inquiry in its many forms, and scientific and STEM literacy, it is believed that inquiry-based learning should first be implemented among science educators and science learners before STEM education can successfully be developed among science teachers, prospective teachers, and students at all levels. After a thorough study of the work of a number of science researchers, a model of scientific reasoning was proposed, and simple rubrics and some examples of test items were introduced in this article. As this is only the beginning, further studies will be needed, with the involvement of prospective science teachers who are interested in assessment, whether in authentic assessment or in test item development. Balanced use of alternative assessment rubrics, as well as valid and reliable (standard) test items, will be needed to accelerate STEM education in Indonesia.

  9. Development and Initial Validation of Military Deployment-Related TBI Quality-of-Life Item Banks.

    PubMed

    Toyinbo, Peter A; Vanderploeg, Rodney D; Donnell, Alison J; Mutolo, Sandra A; Cook, Karon F; Kisala, Pamela A; Tulsky, David S

    2016-01-01

    To investigate unique factors that affect health-related quality of life (QOL) in individuals with military deployment-related traumatic brain injury (MDR-TBI) and to develop appropriate assessment tools, consistent with the TBI-QOL/PROMIS/Neuro-QOL systems. Three focus groups from each of the 4 Veterans Administration (VA) Polytrauma Rehabilitation Centers, consisting of 20 veterans with mild to severe MDR-TBI, and 36 VA providers were involved in early stage of new item banks development. The item banks were field tested in a sample (N = 485) of veterans enrolled in VA and diagnosed with an MDR-TBI. Focus groups and survey. Developed item banks and short forms for Guilt, Posttraumatic Stress Disorder/Trauma, and Military-Related Loss. Three new item banks representing unique domains of MDR-TBI health outcomes were created: 15 new Posttraumatic Stress Disorder items plus 16 SCI-QOL legacy Trauma items, 37 new Military-Related Loss items plus 18 TBI-QOL legacy Grief/Loss items, and 33 new Guilt items. Exploratory and confirmatory factor analyses plus bifactor analysis of the items supported sufficient unidimensionality of the new item pools. Convergent and discriminant analyses results, as well as known group comparisons, provided initial support for the validity and clinical utility of the new item response theory-calibrated item banks and their short forms. This work provides a unique opportunity to identify issues specific to individuals with MDR-TBI and ensure that they are captured in QOL assessment, thus extending the existing TBI-QOL measurement system.

  10. Development of the PROMIS health expectancies of smoking item banks.

    PubMed

    Edelen, Maria Orlando; Tucker, Joan S; Shadel, William G; Stucky, Brian D; Cerully, Jennifer; Li, Zhen; Hansen, Mark; Cai, Li

    2014-09-01

    Smokers' health-related outcome expectancies are associated with a number of important constructs in smoking research, yet there are no measures currently available that focus exclusively on this domain. This paper describes the development and evaluation of item banks for assessing the health expectancies of smoking. Using data from a sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning analyses (according to gender, age, and race/ethnicity) to arrive at a unidimensional set of health expectancies items for daily and nondaily smokers. We also evaluated the performance of short forms (SFs) and computer adaptive tests (CATs) to efficiently assess health expectancies. A total of 24 items were included in the Health Expectancies item banks; 13 items are common across daily and nondaily smokers, 6 are unique to daily, and 5 are unique to nondaily. For both daily and nondaily smokers, the Health Expectancies item banks are unidimensional, reliable (reliability = 0.95 and 0.96, respectively), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.87). Results from simulated CATs showed that health expectancies can be assessed with good precision with an average of 5-6 items adaptively selected from the item banks. Health expectancies of smoking can be assessed on the basis of these item banks via SFs, CATs, or through a tailored set of items selected for a specific research purpose. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  11. Development of the PROMIS nicotine dependence item banks.

    PubMed

    Shadel, William G; Edelen, Maria Orlando; Tucker, Joan S; Stucky, Brian D; Hansen, Mark; Cai, Li

    2014-09-01

    Nicotine dependence is a core construct important for understanding cigarette smoking and smoking cessation behavior. This article describes analyses conducted to develop and evaluate item banks for assessing nicotine dependence among daily and nondaily smokers. Using data from a sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning analyses (according to gender, age, and race/ethnicity) to arrive at a unidimensional set of nicotine dependence items for daily and nondaily smokers. We also evaluated performance of short forms (SFs) and computer adaptive tests (CATs) to efficiently assess dependence. A total of 32 items were included in the Nicotine Dependence item banks; 22 items are common across daily and nondaily smokers, 5 are unique to daily smokers, and 5 are unique to nondaily smokers. For both daily and nondaily smokers, the Nicotine Dependence item banks are strongly unidimensional, highly reliable (reliability = 0.97 and 0.97, respectively), and perform similarly across gender, age, and race/ethnicity groups. SFs common to daily and nondaily smokers consist of 8 and 4 items (reliability = 0.91 and 0.81, respectively). Results from simulated CATs showed that dependence can be assessed with very good precision for most respondents using fewer than 6 items adaptively selected from the item banks. Nicotine dependence on cigarettes can be assessed on the basis of these item banks via one of the SFs, by using CATs, or through a tailored set of items selected for a specific research purpose.

  12. Development of the PROMIS negative psychosocial expectancies of smoking item banks.

    PubMed

    Stucky, Brian D; Edelen, Maria Orlando; Tucker, Joan S; Shadel, William G; Cerully, Jennifer; Kuhfeld, Megan; Hansen, Mark; Cai, Li

    2014-09-01

    Negative psychosocial expectancies of smoking include aspects of social disapproval and disappointment in oneself. This paper describes analyses conducted to develop and evaluate item banks for assessing psychosocial expectancies among daily and nondaily smokers. Using data from a sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning analyses (according to gender, age, and race/ethnicity) to arrive at a unidimensional set of psychosocial expectancies items for daily and nondaily smokers. We also evaluated performance of short forms (SFs) and computer adaptive tests (CATs) to efficiently assess psychosocial expectancies. A total of 21 items were included in the Psychosocial Expectancies item banks: 14 items are common across daily and nondaily smokers, 6 are unique to daily, and 1 is unique to nondaily. For both daily and nondaily smokers, the Psychosocial Expectancies item banks are strongly unidimensional, highly reliable (reliability = 0.95 and 0.93, respectively), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.85). Results from simulated CATs showed that, on average, fewer than 8 items are needed to assess psychosocial expectancies with adequate precision when using the item banks. Psychosocial expectancies of smoking can be assessed on the basis of these item banks via the SF, by using CAT, or through a tailored set of items selected for a specific research purpose.
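    Note on the item counts in the three PROMIS smoking records above: the abstracts report, for each bank, how many items are common to daily and nondaily smokers and how many are unique to each group. The short Python sketch below simply tallies those reported counts. It assumes (this is not stated in the abstracts) that each group's bank consists of the common items plus that group's unique items; the names `BANKS` and `bank_sizes` are hypothetical and used only for illustration.

```python
# Item counts copied from the three abstracts above.
BANKS = {
    "Health Expectancies": {"common": 13, "daily_only": 6, "nondaily_only": 5, "reported_total": 24},
    "Nicotine Dependence": {"common": 22, "daily_only": 5, "nondaily_only": 5, "reported_total": 32},
    "Psychosocial Expectancies": {"common": 14, "daily_only": 6, "nondaily_only": 1, "reported_total": 21},
}


def bank_sizes(counts):
    """Return (daily bank size, nondaily bank size, distinct items overall).

    Assumption: a group's bank = common items + items unique to that group.
    """
    daily = counts["common"] + counts["daily_only"]
    nondaily = counts["common"] + counts["nondaily_only"]
    distinct = counts["common"] + counts["daily_only"] + counts["nondaily_only"]
    return daily, nondaily, distinct


if __name__ == "__main__":
    for name, counts in BANKS.items():
        daily, nondaily, distinct = bank_sizes(counts)
        # The distinct-item tally should match the total each abstract reports.
        consistent = distinct == counts["reported_total"]
        print(f"{name}: daily={daily}, nondaily={nondaily}, "
              f"distinct={distinct}, matches reported total: {consistent}")
```

    Under that assumption, the tallies agree with the totals of 24, 32, and 21 items given in the abstracts.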

  13. Development and Calibration of an Item Bank for PE Metrics Assessments: Standard 1

    ERIC Educational Resources Information Center

    Zhu, Weimo; Fox, Connie; Park, Youngsik; Fisette, Jennifer L.; Dyson, Ben; Graber, Kim C.; Avery, Marybell; Franck, Marian; Placek, Judith H.; Rink, Judy; Raynes, De

    2011-01-01

    The purpose of this study was to develop and calibrate an assessment system, or bank, using the latest measurement theories and methods to promote valid and reliable student assessment in physical education. Using an anchor-test equating design, a total of 30 items or assessments were administered to 5,021 (2,568 boys and 2,453 girls) students in…

  14. Development and evaluation of CAHPS survey items assessing how well healthcare providers address health literacy.

    PubMed

    Weidmer, Beverly A; Brach, Cindy; Hays, Ron D

    2012-09-01

    The complexity of health information often exceeds patients' skills to understand and use it. To develop survey items assessing how well healthcare providers communicate health information. Domains and items for the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Item Set for Addressing Health Literacy were identified through an environmental scan and input from stakeholders. The draft item set was translated into Spanish and pretested in both English and Spanish. The revised item set was field tested with a randomly selected sample of adult patients from 2 sites using mail and telephonic data collection. Item-scale correlations, confirmatory factor analysis, and internal consistency reliability estimates were estimated to assess how well the survey items performed and identify composite measures. Finally, we regressed the CAHPS global rating of the provider item on the CAHPS core communication composite and the new health literacy composites. A total of 601 completed surveys were obtained (52% response rate). Two composite measures were identified: (1) Communication to Improve Health Literacy (16 items); and (2) How Well Providers Communicate About Medicines (6 items). These 2 composites were significantly uniquely associated with the global rating of the provider (communication to improve health literacy: P<0.001, b=0.28; and communication about medicines composite: P=0.02, b=0.04). The 2 composites and the CAHPS core communication composite accounted for 51% of the variance in the global rating of the provider. A 5-item subset of the Communication to Improve Health Literacy composite accounted for 90% of the variance of the original 16-item composite. This study provides support for reliability and validity of the CAHPS Item Set for Addressing Health Literacy. These items can serve to assess whether healthcare providers have communicated effectively with their patients and as a tool for quality improvement.

  15. Development and Validation of a Computer Adaptive EFL Test

    ERIC Educational Resources Information Center

    He, Lianzhen; Min, Shangchao

    2017-01-01

    The first aim of this study was to develop a computer adaptive EFL test (CALT) that assesses test takers' listening and reading proficiency in English with dichotomous items and polytomous testlets. We reported in detail on the development of the CALT, including item banking, determination of suitable item response theory (IRT) models for item…

  16. Nurses' Attitudes Regarding the Safe Handling of Patients Who Are Morbidly Obese: Instrument Development and Psychometric Analysis.

    PubMed

    Bejciy-Spring, Susan; Vermillion, Brenda; Morgan, Sally; Newton, Cheryl; Chucta, Sheila; Gatens, Cindy; Zadvinskis, Inga; Holloman, Christopher; Chipps, Esther

    2016-12-01

    Nurses' attitudes play an important role in the consistent practice of safe patient handling behaviors. The purposes of this study were to develop and assess the psychometric properties of a newly developed instrument measuring attitudes of nurses related to the care and safe handling of patients who are obese. Phases of instrument development included (a) item generation, (b) content validity assessment, (c) reliability assessment, (d) cognitive interviewing, and (e) construct validity assessment through factor analysis. The final data from the exploratory factor analysis produced a 26-item multidimensional instrument that contains 9 subscales. Based on the factor analysis, a 26-item instrument can be used to examine nurses' attitudes regarding patients who are morbidly obese and related safe handling practices.

  17. Assessing Hopelessness in Terminally Ill Cancer Patients: Development of the Hopelessness Assessment in Illness Questionnaire

    PubMed Central

    Rosenfeld, Barry; Pessin, Hayley; Lewis, Charles; Abbey, Jennifer; Olden, Megan; Sachs, Emily; Amakawa, Lia; Kolva, Elissa; Brescia, Robert; Breitbart, William

    2013-01-01

    Hopelessness has become an increasingly important construct in palliative care research, yet concerns exist regarding the utility of existing measures when applied to patients with a terminal illness. This article describes a series of studies focused on the exploration, development, and analysis of a measure of hopelessness specifically intended for use with terminally ill cancer patients. The 1st stage of measure development involved interviews with 13 palliative care experts and 30 terminally ill patients. Qualitative analysis of the patient interviews culminated in the development of a set of potential questionnaire items. In the 2nd study phase, we evaluated these preliminary items with a sample of 314 participants, using item response theory and classical test theory to identify optimal items and response format. These analyses generated an 8-item measure that we tested in a final study phase, using a 3rd sample (n = 228) to assess reliability and concurrent validity. These analyses demonstrated strong support for the Hopelessness Assessment in Illness Questionnaire providing greater explanatory power than existing measures of hopelessness and found little evidence that this assessment was confounded by illness-related variables (e.g., prognosis). In summary, these 3 studies suggest that this brief measure of hopelessness is particularly useful for palliative care settings. Further research is needed to assess the applicability of the measure to other populations and contexts. PMID:21443366

  18. Using the Rasch Measurement Model in Psychometric Analysis of the Family Effectiveness Measure

    PubMed Central

    McCreary, Linda L.; Conrad, Karen M.; Conrad, Kendon J.; Scott, Christy K; Funk, Rodney R.; Dennis, Michael L.

    2013-01-01

    Background: Valid assessment of family functioning can play a vital role in optimizing client outcomes. Because family functioning is influenced by family structure, socioeconomic context, and culture, existing measures of family functioning--primarily developed with nuclear, middle class European American families--may not be valid assessments of families in diverse populations. The Family Effectiveness Measure was developed to address this limitation. Objectives: To test the Family Effectiveness Measure with data from a primarily low-income African American convenience sample, using the Rasch measurement model. Method: A sample of 607 adult women completed the measure. Rasch analysis was used to assess unidimensionality, response category functioning, item fit, person reliability, differential item functioning by race and parental status, and item hierarchy. Criterion-related validity was tested using correlations with five other variables related to family functioning. Results: The Family Effectiveness Measure measures two separate constructs: The effective family functioning construct was a psychometrically sound measure of the target construct that was more efficient due to the deletion of 22 items. The ineffective family functioning construct consisted of 16 of those deleted items but was not as strong psychometrically. Items in both constructs evidenced no differential item functioning by race. Criterion-related validity was supported for both. Discussion: In contrast to the prevailing conceptualization that family functioning is a single construct, assessed by positively and negatively worded items, use of the Rasch analysis suggested the existence of two constructs. While the effective family functioning is a strong and efficient measure of family functioning, the ineffective family functioning will require additional item development and psychometric testing. PMID:23636342

  19. A HO-IRT Based Diagnostic Assessment System with Constructed Response Items

    ERIC Educational Resources Information Center

    Yang, Chih-Wei; Kuo, Bor-Chen; Liao, Chen-Huei

    2011-01-01

    The aim of the present study was to develop an on-line assessment system with constructed response items in the context of elementary mathematics curriculum. The system recorded the problem solving process of constructed response items and transferred the process to response codes for further analyses. An inference mechanism based on artificial…

  20. Development of a Psychosocial Risk Screener for Siblings of Children With Cancer: Incorporating the Perspectives of Parents.

    PubMed

    Long, Kristin A; Pariseau, Emily M; Muriel, Anna C; Chu, Andrea; Kazak, Anne E; Alderfer, Melissa A

    2018-04-03

    Although many siblings experience distress after a child's cancer diagnosis, their psychosocial functioning is seldom assessed in clinical oncology settings. One barrier to systematic sibling screening is the lack of a validated, sibling-specific screening instrument. Thus, this study developed sibling-specific screening modules in English and Spanish for the Psychosocial Assessment Tool (PAT), a well-validated screener of family psychosocial risk. A purposive sample of English- and Spanish-speaking parents of children with cancer (N = 29) completed cognitive interviews to provide in-depth feedback on the development of the new PAT sibling modules. Interviews were transcribed verbatim, cleaned, and analyzed using applied thematic analysis. Items were updated iteratively according to participants' feedback. Data collection continued until saturation was reached (i.e., all items were clear and valid). Two sibling modules were developed to assess siblings' psychosocial risk at diagnosis (preexisting risk factors) and several months thereafter (reactions to cancer). Most prior PAT items were retained; however, parents recommended changes to improve screening format (separately assessing each sibling within the family and expanding response options to include "sometimes"), developmental sensitivity (developing or revising items for ages 0-2, 3-4, 5-9, and 10+ years), and content (adding items related to sibling-specific social support, global assessments of sibling risk, emotional/behavioral reactions to cancer, and social ecological factors such as family and school). Psychosocial screening requires sibling-specific screening items that correspond to preexisting risk (at diagnosis) and reactions to cancer (several months after diagnosis). Validated, sibling-specific screeners will facilitate identification of siblings with elevated psychosocial risk.

  1. Development and validation of a vision-specific quality-of-life questionnaire for Timor-Leste.

    PubMed

    du Toit, Rènée; Palagyi, Anna; Ramke, Jacqueline; Brian, Garry; Lamoureux, Ecosse L

    2008-10-01

    To develop and determine the reliability and validity of a vision-specific quality-of-life instrument (TL-VSQOL) designed to assess the impact of distance and near vision impairment in adults living in Timor-Leste. A vision-specific quality-of-life questionnaire was developed, piloted, and administered to 704 Timorese aged ≥40 years during a population-based eye health rapid assessment. Rasch analysis was performed on the data of 457 participants with presenting near vision worse than N8 (78.5%) and/or distance vision worse than 6/18 (69.8%). Unidimensionality, item fit to the model, response category performance, differential item functioning, and targeting of items to participants were assessed. Initially, the questionnaire lacked fit to the Rasch model. Removal of two items concerning emotional well-being resulted in a fit of the data (overall item-trait interaction: χ² (df) = 81 (51); mean (SD) person and item fit residual values: -0.30 (1.02) and -0.32 (1.46)), and good targeting of person ability and item difficulty was evident. Poorer distance and near visual acuities were significantly associated with worse quality-of-life scores (P < 0.001). Person separation reliability was substantial (0.93), indicating that the instrument can discriminate between groups with normal and impaired vision. All 17 items were free of differential item functioning, and there was no evidence of multidimensionality. This 17-item TL-VSQOL has high reliability, construct, and criterion validity and effective targeting. It can effectively assess the impact on quality of life of adult Timorese with distance and near vision impairment. The TL-VSQOL could be adapted for use in other low-resource settings.
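    Note on the Rasch analysis mentioned in the record above: as general background only, the sketch below shows the dichotomous Rasch model, in which the probability of endorsing an item depends on the difference between person ability and item difficulty. This is a generic illustration and not the analysis reported in the abstract, which may have used a polytomous form of the model; the function name `rasch_probability` and the example values are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """Probability of a positive response under the dichotomous Rasch model.

    theta: person ability (in logits)
    b:     item difficulty (in logits)
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


if __name__ == "__main__":
    # "Targeting" concerns how well item difficulties line up with person
    # abilities: an item is most informative where theta is close to b,
    # which is where the response probability is near 0.5.
    for theta in (-2.0, 0.0, 2.0):
        print(theta, round(rasch_probability(theta, b=0.0), 3))
```

    When ability equals difficulty the probability is exactly 0.5, and it rises toward 1 as ability exceeds difficulty.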

  2. GAP-REACH

    PubMed Central

    Lewis-Fernández, Roberto; Raggio, Greer A.; Gorritz, Magdaliz; Duan, Naihua; Marcus, Sue; Cabassa, Leopoldo J.; Humensky, Jennifer; Becker, Anne E.; Alarcón, Renato D.; Oquendo, María A.; Hansen, Helena; Like, Robert C.; Weiss, Mitchell; Desai, Prakash N.; Jacobsen, Frederick M.; Foulks, Edward F.; Primm, Annelle; Lu, Francis; Kopelowicz, Alex; Hinton, Ladson; Hinton, Devon E.

    2015-01-01

    Growing awareness of health and health care disparities highlights the importance of including information about race, ethnicity, and culture (REC) in health research. Reporting of REC factors in research publications, however, is notoriously imprecise and unsystematic. This article describes the development of a checklist to assess the comprehensiveness and the applicability of REC factor reporting in psychiatric research publications. The 16-item GAP-REACH© checklist was developed through a rigorous process of expert consensus, empirical content analysis in a sample of publications (N = 1205), and interrater reliability (IRR) assessment (N = 30). The items assess each section in the conventional structure of a health research article. Data from the assessment may be considered on an item-by-item basis or as a total score ranging from 0% to 100%. The final checklist has excellent IRR (κ = 0.91). The GAP-REACH may be used by multiple research stakeholders to assess the scope of REC reporting in a research article. PMID:24080673

  3. Development of a quality assessment tool for systematic reviews of observational studies (QATSO) of HIV prevalence in men having sex with men and associated risk behaviours

    PubMed Central

    Wong, William CW; Cheung, Catherine SK; Hart, Graham J

    2008-01-01

    Background: Systematic reviews based on the critical appraisal of observational and analytic studies on HIV prevalence and risk factors for HIV transmission among men having sex with men are very useful for health care decisions and planning. Such appraisal is particularly difficult, however, as the quality assessment tools available for use with observational and analytic studies are poorly established. Methods: We reviewed the existing quality assessment tools for systematic reviews of observational studies and developed a concise quality assessment checklist to help standardise decisions regarding the quality of studies, with careful consideration of issues such as external and internal validity. Results: A pilot version of the checklist was developed based on epidemiological principles, reviews of study designs, and existing checklists for the assessment of observational studies. The Quality Assessment Tool for Systematic Reviews of Observational Studies (QATSO) Score consists of five items: External validity (1 item), reporting (2 items), bias (1 item) and confounding factors (1 item). Expert opinions were sought and it was tested on manuscripts that fulfil the inclusion criteria of a systematic review. Like all assessment scales, QATSO may oversimplify and generalise information yet it is inclusive, simple and practical to use, and allows comparability between papers. Conclusion: A specific tool that allows researchers to appraise and guide study quality of observational studies is developed and can be modified for similar studies in the future. PMID:19014686

  4. Development of the Contact Lens User Experience: CLUE Scales

    PubMed Central

    Wirth, R. J.; Edwards, Michael C.; Henderson, Michael; Henderson, Terri; Olivares, Giovanna; Houts, Carrie R.

    2016-01-01

    ABSTRACT Purpose The field of optometry has become increasingly interested in patient-reported outcomes, reflecting a common trend occurring across the spectrum of healthcare. This article reviews the development of the Contact Lens User Experience: CLUE system designed to assess patient evaluations of contact lenses. CLUE was built using modern psychometric methods such as factor analysis and item response theory. Methods The qualitative process through which relevant domains were identified is outlined as well as the process of creating initial item banks. Psychometric analyses were conducted on the initial item banks and refinements were made to the domains and items. Following this data-driven refinement phase, a second round of data was collected to further refine the items and obtain final item response theory item parameter estimates. Results Extensive qualitative work identified three key areas patients consider important when describing their experience with contact lenses. Based on item content and psychometric dimensionality assessments, the developing CLUE instruments were ultimately focused around four domains: comfort, vision, handling, and packaging. Item response theory parameters were estimated for the CLUE item banks (377 items), and the resulting scales were found to provide precise and reliable assignment of scores detailing users’ subjective experiences with contact lenses. Conclusions The CLUE family of instruments, as it currently exists, exhibits excellent psychometric properties. PMID:27383257

  5. Development and evaluation of a thermochemistry concept inventory for college-level general chemistry

    NASA Astrophysics Data System (ADS)

    Wren, David A.

    The research presented in this dissertation culminated in a 10-item Thermochemistry Concept Inventory (TCI). The development of the TCI can be divided into two main phases: qualitative studies and quantitative studies. Both phases focused on the primary stakeholders of the TCI, college-level general chemistry instructors and students. Each phase was designed to collect evidence for the validity of the interpretations and uses of TCI testing data. A central use of TCI testing data is to identify student conceptual misunderstandings, which are represented as incorrect options of multiple-choice TCI items. Therefore, quantitative and qualitative studies focused heavily on collecting evidence at the item level, where important interpretations may be made by TCI users. Qualitative studies included student interviews (N = 28) and online expert surveys (N = 30). Think-aloud student interviews (N = 12) were used to identify conceptual misunderstandings used by students. Novice response process validity interviews (N = 16) helped provide information on how students interpreted and answered TCI items and were the basis of item revisions. Practicing general chemistry instructors (N = 18), or experts, defined boundaries of thermochemistry content included on the TCI. Once TCI items were in the later stages of development, an online version of the TCI was used in an expert response process validity survey (N = 12) to provide expert feedback on item content, format and consensus of the correct answer for each item. Quantitative studies included three phases: beta testing of TCI items (N = 280), pilot testing of a 12-item TCI (N = 485), and a large data collection using a 10-item TCI (N = 1331). In addition to traditional classical test theory analysis, Rasch model analysis was also used for evaluation of testing data at the test and item level. The TCI was administered in both formative assessment (beta and pilot testing) and summative assessment (large data collection), with items performing well in both. One item, item K, did not have acceptable psychometric properties when the TCI was used as a quiz (summative assessment), but was retained in the final version of the TCI based on the acceptable psychometric properties displayed in pilot testing (formative assessment).

  6. Strategic assessment of the availability of pediatric trauma care equipment, technology and supplies in Ghana.

    PubMed

    Ankomah, James; Stewart, Barclay T; Oppong-Nketia, Victor; Koranteng, Adofo; Gyedu, Adam; Quansah, Robert; Donkor, Peter; Abantanga, Francis; Mock, Charles

    2015-11-01

    This study aimed to assess the availability of pediatric trauma care items (i.e. equipment, supplies, technology) and factors contributing to deficiencies in Ghana. Ten universal and 9 pediatric-sized items were selected from the World Health Organization's Guidelines for Essential Trauma Care. Direct inspection and structured interviews with administrative, clinical and biomedical engineering staff were used to assess item availability at 40 purposively sampled district, regional and tertiary hospitals in Ghana. Hospital assessments demonstrated marked deficiencies for a number of essential items (e.g. basic airway supplies, chest tubes, blood pressure cuffs, electrolyte determination, portable X-ray). Lack of pediatric-sized items resulting from equipment absence, lack of training, frequent stock-outs and technology breakage were common. Pediatric items were consistently less available than adult-sized items at each hospital level. This study identified several successes and problems with pediatric trauma care item availability in Ghana. Item availability could be improved, both affordably and reliably, by better organization and planning (e.g. regular assessment of demand and inventory, reliable financing for essential trauma care items). In addition, technology items were often broken. Developing local service and biomedical engineering capability was highlighted as a priority to avoid long periods of equipment breakage. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Strategic assessment of the availability of pediatric trauma care equipment, technology and supplies in Ghana

    PubMed Central

    Ankomah, James; Stewart, Barclay T; Oppong-Nketia, Victor; Koranteng, Adofo; Gyedu, Adam; Quansah, Robert; Donkor, Peter; Abantanga, Francis; Mock, Charles

    2015-01-01

    Background This study aimed to assess the availability of pediatric trauma care items (i.e. equipment, supplies, technology) and factors contributing to deficiencies in Ghana. Methods Ten universal and 9 pediatric-sized items were selected from the World Health Organization’s Guidelines for Essential Trauma Care. Direct inspection and structured interviews with administrative, clinical and biomedical engineering staff were used to assess item availability at 40 purposively sampled district, regional and tertiary hospitals in Ghana. Results Hospital assessments demonstrated marked deficiencies for a number of essential items (e.g. basic airway supplies, chest tubes, blood pressure cuffs, electrolyte determination, portable X-ray). Lack of pediatric-sized items resulting from equipment absence, lack of training, frequent stock-outs and technology breakage were common. Pediatric items were consistently less available than adult-sized items at each hospital level. Conclusion This study identified several successes and problems with pediatric trauma care item availability in Ghana. Item availability could be improved, both affordably and reliably, by better organization and planning (e.g. regular assessment of demand and inventory, reliable financing for essential trauma care items). In addition, technology items were often broken. Developing local service and biomedical engineering capability was highlighted as a priority to avoid long periods of equipment breakage. PMID:25841284

  8. Assessment of Competence in EVAR Procedures: A Novel Rating Scale Developed by the Delphi Technique.

    PubMed

    Strøm, M; Lönn, L; Bech, B; Schroeder, T V; Konge, L

    2017-07-01

    To develop a procedure specific global rating scale for assessment of operator competence in endovascular aortic repair (EVAR). A Delphi approach was used to achieve expert consensus. A panel of 32 international experts (median 300 EVAR procedures, range 200-3000) from vascular surgery (n = 21) and radiology (n = 11) was established. The first Delphi round was based on a review of endovascular skills assessment papers, stent graft instructions for use, and structured interviews. It led to a primary pool of 83 items that were formulated as global rating scale items with tentative anchors. Iterative Delphi rounds were executed. The panellists rated the importance of each item on a 5 point Likert scale. Consensus was defined as 80% of the panel rating an item 4 or 5 in the primary round and 90% in subsequent rounds. Consensus on the final assessment tool was defined as Cronbach's alpha > .8 after a minimum of three rounds. Thirty-two of 35 invited experts participated. Three rounds of surveys were completed with a completion rate of 100% in the first two rounds and 91% in round three. The 83 primary assessment items were supplemented with five items suggested by the panel and reduced to seven pivotal assessment items that reached consensus, Cronbach's alpha = 0.82. The seven item rating scale covers key elements of competence in EVAR stent placement and deployment. Each item has well defined grades with explicit anchors at unacceptable, acceptable, and superior performance on a 5 point Likert scale. The Delphi methodology allowed for international consensus on a new procedure specific global rating scale for assessment of competence in EVAR. The resulting scale, EndoVascular Aortic Repair Assessment of Technical Expertise (EVARATE), represents key elements in the procedure. EVARATE constitutes an assessment tool for providing structured feedback to endovascular operators in training. Copyright © 2017 European Society for Vascular Surgery. Published by Elsevier Ltd. All rights reserved.
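
    The item-level consensus rule stated in this abstract (80% of panellists rating an item 4 or 5 in the primary round, 90% in subsequent rounds) can be sketched as a small check. The function name is hypothetical, and treating the threshold as "at least" rather than "more than" is an assumption, since the abstract does not specify it.

```python
def reaches_consensus(ratings, round_number):
    """Return True if enough panellists rated the item 4 or 5.

    ratings: list of integer Likert ratings (1-5), one per panellist.
    round_number: 1 for the primary Delphi round, 2 or more afterwards.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    threshold_pct = 80 if round_number == 1 else 90
    high = sum(r >= 4 for r in ratings)
    # Integer comparison avoids floating-point edge cases at exactly 80% or 90%.
    return 100 * high >= threshold_pct * len(ratings)
```

    With ten panellists of whom eight rate an item 4 or 5, the item would pass in the primary round but fail in any later round under this reading of the rule.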

  9. The second version of the L. V. Prasad-functional vision questionnaire.

    PubMed

    Gothwal, Vijaya K; Sumalini, Rebecca; Bharani, Seelam; Reddy, Shailaja P; Bagga, Deepak K

    2012-11-01

    The L. V. Prasad-Functional Vision Questionnaire (LVP-FVQ) was developed using Rasch analysis to assess self-reported difficulties in performing daily tasks in school children with visual impairment (VI) in India. However, the LVP-FVQ has psychometric problems of inadequate measurement precision and lack of detailed assessment of dimensionality. Furthermore, items pertaining to use of technology are lacking. The aim of this study was to present the development and validation of the second version of LVP-FVQ (LVP-FVQ II). Development of LVP-FVQ II involved extracting items from other similar questionnaires (albeit developed for Western populations) and focus group discussions of children with VI and their parents that resulted in a 32-item pilot questionnaire. Overall, six items from the LVP-FVQ were retained. The questionnaire underwent pilot testing in 25 such children, following which a 27-item LVP-FVQ II emerged, and this was administered to 150 children with VI. Response to each item was rated on a three-category scale. Rasch analysis was used to validate the LVP-FVQ II. Participants used the rating scale as intended. Four mobility-related items required deletion, as these did not contribute toward measurement of a single construct, indicating a secondary dimension. Deletion of the four items resulted in the 23-item unidimensional LVP-FVQ II, with good measurement precision, effective targeting of item difficulty to participant ability, and lack of notable differential item functioning. The LVP-FVQ II has high reliability, indicating that it is effectively able to discriminate between visual disability of school children in India, and is valid across age, gender, duration of VI, and location of residence. Given the superior measurement properties and the interval-level scores, the LVP-FVQ II appears to offer advantages over LVP-FVQ in assessment of difficulties in performing daily tasks in this population. It can be adapted for use in other developing countries.

  10. The Multidimensional Assessment of Interoceptive Awareness (MAIA)

    PubMed Central

    Mehling, Wolf E.; Price, Cynthia; Daubenmier, Jennifer J.; Acree, Mike; Bartmess, Elizabeth; Stewart, Anita

    2012-01-01

    This paper describes the development of a multidimensional self-report measure of interoceptive body awareness. The systematic mixed-methods process involved reviewing the current literature, specifying a multidimensional conceptual framework, evaluating prior instruments, developing items, and analyzing focus group responses to scale items by instructors and patients of body awareness-enhancing therapies. Following refinement by cognitive testing, items were field-tested in students and instructors of mind-body approaches. Final item selection was achieved by submitting the field test data to an iterative process using multiple validation methods, including exploratory cluster and confirmatory factor analyses, comparison between known groups, and correlations with established measures of related constructs. The resulting 32-item multidimensional instrument assesses eight concepts. The psychometric properties of these final scales suggest that the Multidimensional Assessment of Interoceptive Awareness (MAIA) may serve as a starting point for research and further collaborative refinement. PMID:23133619

  11. Development and preliminary testing of a computerized adaptive assessment of chronic pain.

    PubMed

    Anatchkova, Milena D; Saris-Baglama, Renee N; Kosinski, Mark; Bjorner, Jakob B

    2009-09-01

    The aim of this article is to report the development and preliminary testing of a prototype computerized adaptive test of chronic pain (CHRONIC PAIN-CAT) conducted in 2 stages: (1) evaluation of various item selection and stopping rules through real data-simulated administrations of CHRONIC PAIN-CAT; (2) a feasibility study of the actual prototype CHRONIC PAIN-CAT assessment system conducted in a pilot sample. Item calibrations developed from a US general population sample (N = 782) were used to program a pain severity and impact item bank (k = 45), and real data simulations were conducted to determine a CAT stopping rule. The CHRONIC PAIN-CAT was programmed on a tablet PC using QualityMetric's Dynamic Health Assessment (DYNHA) software and administered to a clinical sample of pain sufferers (n = 100). The CAT was completed in significantly less time than the static (full item bank) assessment (P < .001). On average, 5.6 items were dynamically administered by CAT to achieve a precise score. Scores estimated from the 2 assessments were highly correlated (r = .89), and both assessments discriminated across pain severity levels (P < .001, RV = .95). Patients' evaluations of the CHRONIC PAIN-CAT were favorable. This report demonstrates that the CHRONIC PAIN-CAT is feasible for administration in a clinic. The application has the potential to improve pain assessment and help clinicians manage chronic pain.
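
    To make the idea of an item selection rule and a stopping rule concrete, the sketch below simulates a generic computerized adaptive test. It is not the CHRONIC PAIN-CAT algorithm: it assumes dichotomous items under a two-parameter logistic model, maximum-information item selection, grid-based expected a posteriori (EAP) scoring with a standard normal prior, and a hypothetical stopping threshold on the posterior standard deviation. All names and parameter values are illustrative.

```python
import math

# Grid of trait values from -4 to 4 used for numerical EAP estimation.
GRID = [-4.0 + 0.1 * i for i in range(81)]


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item with discrimination a, location b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap(responses):
    """Posterior mean and SD of theta given [((a, b), x), ...], N(0, 1) prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalised prior density
        for (a, b), x in responses:
            p = p_endorse(t, a, b)
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank, answer, se_stop=0.4, max_items=10):
    """Administer items until the posterior SD drops to se_stop or items run out.

    bank: list of (a, b) item parameters; answer: callable returning 0 or 1.
    Returns (theta estimate, posterior SD, number of items administered).
    """
    remaining = list(bank)
    responses = []
    theta, se = 0.0, 1.0  # start at the prior mean and prior SD
    while remaining and len(responses) < max_items and se > se_stop:
        # Pick the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda ab: information(theta, *ab))
        remaining.remove(item)
        responses.append((item, answer(item)))
        theta, se = eap(responses)
    return theta, se, len(responses)
```

    In a real-data simulation of the kind the abstract describes, `answer` would look up each respondent's recorded response to the selected item, and the number of items administered could then be compared across candidate stopping thresholds.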

  12. Development and validation of a professionalism assessment scale for medical students

    PubMed Central

    Klemenc-Ketis, Zalika; Vrecko, Helena

    2014-01-01

    Objectives To develop and validate a scale for the assessment of professionalism in medical students based on students' perceptions of and attitudes towards professionalism in medicine. Methods This was a mixed methods study with undergraduate medical students. Two focus groups were carried out with 12 students, followed by a transcript analysis (grounded theory method with open coding). Then, a 3-round Delphi with 20 family medicine experts was carried out. A psychometric assessment of the scale was performed with a group of 449 students. The items of the Professionalism Assessment Scale could be answered on a five-point Likert scale. Results After the focus groups, the first version of the PAS consisted of 56 items and after the Delphi study, 30 items remained. The final sample for quantitative study consisted of 122 students (27.2% response rate). There were 95 (77.9%) female students in the sample. The mean age of the sample was 22.1 ± 2.1 years. After the principal component analysis, we removed 8 items and produced the final version of the PAS (22 items). The Cronbach's alpha of the scale was 0.88. Factor analysis revealed three factors: empathy and humanism; professional relationships and development; and responsibility. Conclusions The new Professionalism Assessment Scale proved to be valid and reliable. It can be used for the assessment of professionalism in undergraduate medical students. PMID:25382090
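
    Cronbach's alpha, the internal-consistency statistic reported here, can be computed from a respondents-by-items score matrix as sketched below. The function name and the requirement of complete data are assumptions made for illustration; the abstract does not describe how missing responses were handled.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent, all of the same length.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least 2 respondents and 2 items, no missing data")
    # Variance of each item across respondents, and of the summed scale score.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)
```

    Two items that rank respondents identically give an alpha of 1.0, while items that partly disagree pull the value down, as the second case in the test data shows (alpha of two thirds).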

  13. Psychosocial consequences of cancer cachexia: the development of an item bank.

    PubMed

    Häne, Hanspeter; Oberholzer, Rolf; Walker, Jochen; Hopkinson, Jane B; de Wolf-Linder, Susanne; Strasser, Florian

    2013-12-01

    Cancer cachexia syndrome (CCS) is often accompanied by psychosocial consequences (PSC). To alleviate PSC, a systematic assessment method is required. Currently, few assessment tools are available (e.g., Functional Assessment of Anorexia/Cachexia Therapy). There is no systematic assessment tool that captures the PSC of CCS. To develop a pilot item bank to assess the PSC of CCS. A total of 132 questions, generated from patient answers in a previous study, were reduced to 121 items by content analysis and evaluation by multidisciplinary experts (doctor, nutritionists, and nurses). In our two-step, cross-sectional study, patients, judged by staff to have PSC of CCS, were included, and the questions were randomly allocated to the patients. Questions were evaluated for understandability and triggering emotions, and patients were asked to provide a response using a four-point Likert scale. Subsequently, problematic questions were revised, reformulated, and retested. A total of 20 patients with a variety of tumor types participated. Of the 121 questions, 31 had to be reformulated after Step 1 and were retested in Step 2, after which seven were again evaluated as not being perfectly comprehensible. In Step 1, 22 questions were found to trigger emotions, but no item required remodeling. Item performance using the Likert scale revealed no consistent floor or ceiling effects. Our final pilot question bank comprised 117 questions. The final item bank contains questions that are understood and accepted by the patients. This item bank now needs to be developed into a measurement tool that groups items into domains and can be used in future research studies. Copyright © 2013 U.S. Cancer Pain Relief Committee. Published by Elsevier Inc. All rights reserved.

  14. Item Response Theory and Health Outcomes Measurement in the 21st Century

    PubMed Central

    Hays, Ron D.; Morales, Leo S.; Reise, Steve P.

    2006-01-01

    Item response theory (IRT) has a number of potential advantages over classical test theory in assessing self-reported health outcomes. IRT models yield invariant item and latent trait estimates (within a linear transformation), standard errors conditional on trait level, and trait estimates anchored to item content. IRT also facilitates evaluation of differential item functioning, inclusion of items with different response formats in the same scale, and assessment of person fit and is ideally suited for implementing computer adaptive testing. Finally, IRT methods can be helpful in developing better health outcome measures and in assessing change over time. These issues are reviewed, along with a discussion of some of the methodological and practical challenges in applying IRT methods. PMID:10982088
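
    As a worked illustration of "standard errors conditional on trait level", consider the two-parameter logistic model for dichotomous items. The model choice and the symbols (discrimination a_i, location b_i, trait level θ) are assumptions for this example and are not taken from the article.

```latex
P_i(\theta) = \frac{1}{1 + \exp\{-a_i(\theta - b_i)\}}, \qquad
I_i(\theta) = a_i^{2}\, P_i(\theta)\,\bigl[1 - P_i(\theta)\bigr], \qquad
\mathrm{SE}(\hat\theta) \approx \frac{1}{\sqrt{\sum_i I_i(\theta)}}
```

    For an item with a_i = 1.5 evaluated at θ = b_i, P_i = 0.5 and I_i = 2.25 × 0.25 = 0.5625. Ten such items give a total information of 5.625 and a standard error of about 0.42 at that trait level; the same items are less informative, and the standard error larger, for respondents whose trait level lies far from b_i.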

  15. Qualitative Development of the PROMIS® Pediatric Stress Response Item Banks

    PubMed Central

    Gardner, William; Pajer, Kathleen; Riley, Anne W.; Forrest, Christopher B.

    2013-01-01

    Objective To describe the qualitative development of the Patient-Reported Outcome Measurement Information System (PROMIS®) Pediatric Stress Response item banks. Methods Stress response concepts were specified through a literature review and interviews with content experts, children, and parents. A library comprising 2,677 items derived from 71 instruments was developed. Items were classified into conceptual categories; new items were written and redundant items were removed. Items were then revised based on cognitive interviews (n = 39 children), readability analyses, and translatability reviews. Results 2 pediatric Stress Response sub-domains were identified: somatic experiences (43 items) and psychological experiences (64 items). Final item pools cover the full range of children’s stress experiences. Items are comprehensible among children aged ≥8 years and ready for translation. Conclusions Child- and parent-report versions of the item banks assess children’s somatic and psychological states when demands tax their adaptive capabilities. PMID:23124904

  16. Improving measures of work-related physical functioning.

    PubMed

    McDonough, Christine M; Ni, Pengsheng; Peterik, Kara; Marfeo, Elizabeth E; Marino, Molly E; Meterko, Mark; Rasch, Elizabeth K; Brandt, Diane E; Jette, Alan M; Chan, Leighton

    2017-03-01

    To expand content of the physical function domain of the Work Disability Functional Assessment Battery (WD-FAB), developed for the US Social Security Administration's (SSA) disability determination process. Newly developed questions were administered to 3532 recent SSA applicants for work disability benefits and 2025 US adults. Factor analyses and item response theory (IRT) methods were used to calibrate and link the new items to the existing WD-FAB, and computer-adaptive test simulations were conducted. Factor and IRT analyses supported integration of 44 new items into three existing WD-FAB scales and the addition of a new 11-item scale (Community Mobility). The final physical function domain consisting of: Basic Mobility (56 items), Upper Body Function (34 items), Fine Motor Function (45 items), and Community Mobility (11 items) demonstrated acceptable psychometric properties. The WD-FAB offers an important tool for enhancement of work disability determination. The FAB could provide relevant information about work-related functioning for initial assessment of claimants; identifying denied applicants who may benefit from interventions to improve work and health outcomes; enhancing periodic review of work disability beneficiaries; and assessing outcomes for policies, programs and services targeting people with work disability.

  17. Improving Measures of Work-Related Physical Functioning

    PubMed Central

    McDonough, Christine M.; Ni, Pengsheng; Peterik, Kara; Marfeo, Elizabeth E.; Marino, Molly E.; Meterko, Mark; Rasch, Elizabeth K; Brandt, Diane E.; Jette, Alan M; Chan, Leighton

    2016-01-01

    Purpose To expand content of the physical function domain of the Work Disability Functional Assessment Battery (WD-FAB), developed for the US Social Security Administration’s (SSA) disability determination process. Methods Newly developed questions were administered to 3,532 recent SSA applicants for work disability benefits and 2,025 US adults. Factor analyses and item response theory (IRT) methods were used to calibrate and link the new items to existing WD-FAB, and computer-adaptive test simulations were conducted. Results Factor and IRT analyses supported integration of 44 new items into 3 existing WD-FAB scales and the addition of a new 11-item scale (Community Mobility). The final physical function domain consisting of: Basic Mobility (56 items), Upper Body Function (34 items), Fine Motor Function (45 items), and Community Mobility (11 items) demonstrated acceptable psychometric properties. Conclusions The WD-FAB offers an important tool for enhancement of work disability determination. The FAB could provide relevant information about work-related functioning for initial assessment of claimants, identifying denied applicants who may benefit from interventions to improve work and health outcomes; enhancing periodic review of work disability beneficiaries; and assessing outcomes for policies, programs and services targeting people with work disability. PMID:28005243

  18. Assessing Student Understanding of the "New Biology": Development and Evaluation of a Criterion-Referenced Genomics and Bioinformatics Assessment

    NASA Astrophysics Data System (ADS)

    Campbell, Chad Edward

    Over the past decade, hundreds of studies have introduced genomics and bioinformatics (GB) curricula and laboratory activities at the undergraduate level. While these publications have facilitated the teaching and learning of cutting-edge content, there has yet to be an evaluation of these assessment tools to determine if they are meeting the quality control benchmarks set forth by the educational research community. An analysis of these assessment tools indicated that <10% referenced any quality control criteria and that none of the assessments met more than one of the quality control benchmarks. In the absence of evidence that these benchmarks had been met, it is unclear whether these assessment tools are capable of generating valid and reliable inferences about student learning. To remedy this situation the development of a robust GB assessment aligned with the quality control benchmarks was undertaken in order to ensure evidence-based evaluation of student learning outcomes. Content validity is a central piece of construct validity, and it must be used to guide instrument and item development. This study reports on: (1) the correspondence of content validity evidence gathered from independent sources; (2) the process of item development using this evidence; (3) the results from a pilot administration of the assessment; (4) the subsequent modification of the assessment based on the pilot administration results; and (5) the results from the second administration of the assessment. Twenty-nine different subtopics within GB (Appendix B: Genomics and Bioinformatics Expert Survey) were developed based on preliminary GB textbook analyses. These subtopics were analyzed using two methods designed to gather content validity evidence: (1) a survey of GB experts (n=61) and (2) a detailed content analysis of GB textbooks (n=6). By including only the subtopics that were shown to have robust support across these sources, 22 GB subtopics were established for inclusion in the assessment. An expert panel subsequently developed, evaluated, and revised two multiple-choice items to align with each of the 22 subtopics, producing a final item pool of 44 items. These items were piloted with student samples of varying content exposure levels. Both Classical Test Theory (CTT) and Item Response Theory (IRT) methodologies were used to evaluate the assessment's validity, reliability and ability inferences, and its ability to differentiate students with different magnitudes of content exposure. A total of 18 items were subsequently modified and reevaluated by an expert panel. The 26 original and 18 modified items were once again piloted with student samples of varying content exposure levels. Both CTT and IRT methodologies were once again used to evaluate student responses in order to evaluate the assessment's validity and reliability inferences as well as its ability to differentiate students with different magnitudes of content exposure. Interviews with students from different content exposure levels were also performed in order to gather convergent validity evidence (external validity evidence) as well as substantive validity evidence. Also included are the limitations of the assessment and a set of guidelines on how the assessment can best be used.

  19. Multi-Item Direct Behavior Ratings: Dependability of Two Levels of Assessment Specificity

    ERIC Educational Resources Information Center

    Volpe, Robert J.; Briesch, Amy M.

    2015-01-01

    Direct Behavior Rating-Multi-Item Scales (DBR-MIS) have been developed as formative measures of behavioral assessment for use in school-based problem-solving models. Initial research has examined the dependability of composite scores generated by summing all items comprising the scales. However, it has been argued that DBR-MIS may offer assessment…

  20. Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior

    ERIC Educational Resources Information Center

    Tassé, Marc J.; Schalock, Robert L.; Thissen, David; Balboni, Giulia; Bersani, Henry, Jr.; Borthwick-Duffy, Sharon A.; Spreat, Scott; Widaman, Keith F.; Zhang, Dalun; Navas, Patricia

    2016-01-01

    The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT…

  1. Development of the Language Subtest in a Developmental Assessment Scale to Identify Chinese Preschool Children with Special Needs

    ERIC Educational Resources Information Center

    Wong, Anita M. -Y.; Leung, Cynthia; Siu, Elaine K. -L.; Lam, Catherine C. -C.; Chan, Grace P. -S.

    2011-01-01

    This study reports on the development of the language subtest in the Preschool Developmental Assessment Scale (PDAS) for Cantonese-Chinese speaking children. A pilot pool of 158 items covering the two language modalities and the three language domains was developed. This initial item set was subsequently revised based on Rasch analyses of data…

  2. Electronic Quality of Life Assessment Using Computer-Adaptive Testing

    PubMed Central

    2016-01-01

    Background Quality of life (QoL) questionnaires are desirable for clinical practice but can be time-consuming to administer and interpret, making their widespread adoption difficult. Objective Our aim was to assess the performance of the World Health Organization Quality of Life (WHOQOL)-100 questionnaire as four item banks to facilitate adaptive testing using simulated computer adaptive tests (CATs) for physical, psychological, social, and environmental QoL. Methods We used data from the UK WHOQOL-100 questionnaire (N=320) to calibrate item banks using item response theory, which included psychometric assessments of differential item functioning, local dependency, unidimensionality, and reliability. We simulated CATs to assess the number of items administered before prespecified levels of reliability were met. Results The item banks (40 items) all displayed good model fit (P>.01) and were unidimensional (fewer than 5% of t tests significant), reliable (Person Separation Index>.70), and free from differential item functioning (no significant analysis of variance interaction) or local dependency (residual correlations < +.20). When matched for reliability, the item banks were between 45% and 75% shorter than paper-based WHOQOL measures. Across the four domains, a high standard of reliability (alpha>.90) could be gained with a median of 9 items. Conclusions Using CAT, simulated assessments were as reliable as paper-based forms of the WHOQOL with a fraction of the number of items. These properties suggest that these item banks are suitable for computerized adaptive assessment. These item banks have the potential for international development using existing alternative language versions of the WHOQOL items. PMID:27694100

  3. Extending LMS to Support IRT-Based Assessment Test Calibration

    NASA Astrophysics Data System (ADS)

    Fotaris, Panagiotis; Mastoras, Theodoros; Mavridis, Ioannis; Manitsaris, Athanasios

    Developing unambiguous and challenging assessment material for measuring educational attainment is a time-consuming, labor-intensive process. As a result Computer Aided Assessment (CAA) tools are becoming widely adopted in academic environments in an effort to improve the assessment quality and deliver reliable results of examinee performance. This paper introduces a methodological and architectural framework which embeds a CAA tool in a Learning Management System (LMS) so as to assist test developers in refining items to constitute assessment tests. An Item Response Theory (IRT) based analysis is applied to a dynamic assessment profile provided by the LMS. Test developers define a set of validity rules for the statistical indices given by the IRT analysis. By applying those rules, the LMS can detect items with various discrepancies which are then flagged for review of their content. Repeatedly executing the aforementioned procedure can improve the overall efficiency of the testing process.
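
    The rule-based flagging step described in this abstract might look something like the sketch below. The abstract does not say which indices or thresholds are used, so the three IRT parameters, the bounds in `RULES`, and all names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ItemStats:
    item_id: str
    discrimination: float  # IRT a-parameter
    difficulty: float      # IRT b-parameter
    guessing: float        # IRT c-parameter

# Hypothetical validity rules: inclusive (lower, upper) bounds per index.
RULES = {
    "discrimination": (0.5, 2.5),
    "difficulty": (-3.0, 3.0),
    "guessing": (0.0, 0.35),
}

def flag_items(items, rules=RULES):
    """Return {item_id: [names of violated indices]} for items that break a rule."""
    flagged = {}
    for item in items:
        violations = [
            name
            for name, (lower, upper) in rules.items()
            if not lower <= getattr(item, name) <= upper
        ]
        if violations:
            flagged[item.item_id] = violations
    return flagged
```

    Items returned by `flag_items` would then go to the test developer for content review. Rerunning the check after each calibration corresponds to the repeated execution the abstract mentions.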

  4. A measure of early physical functioning (EPF) post-stroke.

    PubMed

    Finch, Lois E; Higgins, Johanne; Wood-Dauphinee, Sharon; Mayo, Nancy E

    2008-07-01

    To develop a comprehensive measure of Early Physical Functioning (EPF) post-stroke quantified through Rasch analysis and conceptualized using the International Classification of Functioning Disability and Health (ICF). An observational cohort study. A cohort of 262 subjects (mean age 71.6 (standard deviation 12.5) years) hospitalized post-acute stroke. Functional assessments were made within 3 days of stroke with items from valid and reliable indices commonly utilized to evaluate stroke survivors. Information on important variables was also collected. Principal component and Rasch analysis confirmed the factor structure and dimensionality of the measure. Rasch analysis combined items across ICF components to develop the measure. Items were deleted iteratively; those retained fit the model and were related to the construct. Reliability and validity were assessed. A 38-item unidimensional measure of the EPF met all Rasch model requirements. The item difficulty matched the person ability (mean person measure: -0.31; standard error 0.37 logits), and reliability of the person-item hierarchy was excellent at 0.97. Initial validity was adequate. The 38-item EPF measure was developed. It expands the range of assessment post-acute stroke; it covers a broad spectrum of difficulty with good initial psychometric properties that, once revalidated, can assist in planning and evaluating early interventions.

  5. GAP-REACH: a checklist to assess comprehensive reporting of race, ethnicity, and culture in psychiatric publications.

    PubMed

    Lewis-Fernández, Roberto; Raggio, Greer A; Gorritz, Magdaliz; Duan, Naihua; Marcus, Sue; Cabassa, Leopoldo J; Humensky, Jennifer; Becker, Anne E; Alarcón, Renato D; Oquendo, María A; Hansen, Helena; Like, Robert C; Weiss, Mitchell; Desai, Prakash N; Jacobsen, Frederick M; Foulks, Edward F; Primm, Annelle; Lu, Francis; Kopelowicz, Alex; Hinton, Ladson; Hinton, Devon E

    2013-10-01

    Growing awareness of health and health care disparities highlights the importance of including information about race, ethnicity, and culture (REC) in health research. Reporting of REC factors in research publications, however, is notoriously imprecise and unsystematic. This article describes the development of a checklist to assess the comprehensiveness and the applicability of REC factor reporting in psychiatric research publications. The 16-item GAP-REACH checklist was developed through a rigorous process of expert consensus, empirical content analysis in a sample of publications (N = 1205), and interrater reliability (IRR) assessment (N = 30). The items assess each section in the conventional structure of a health research article. Data from the assessment may be considered on an item-by-item basis or as a total score ranging from 0% to 100%. The final checklist has excellent IRR (κ = 0.91). The GAP-REACH may be used by multiple research stakeholders to assess the scope of REC reporting in a research article.
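
    As a rough illustration of an interrater reliability statistic like the κ reported here, the sketch below computes Cohen's kappa for two raters. The abstract does not say which kappa variant or how many raters were used, so the two-rater setup and the function name are assumptions.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)
```

    Kappa corrects raw agreement for agreement expected by chance, which is why it is lower than the simple percentage of matching ratings.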

  6. Development of a Computer Adaptive Test for Depression Based on the Dutch-Flemish Version of the PROMIS Item Bank.

    PubMed

    Flens, Gerard; Smits, Niels; Terwee, Caroline B; Dekker, Joost; Huijbrechts, Irma; de Beurs, Edwin

    2017-03-01

    We developed a Dutch-Flemish version of the patient-reported outcomes measurement information system (PROMIS) adult V1.0 item bank for depression as input for computerized adaptive testing (CAT). As the item bank, we used the Dutch-Flemish translation of the original PROMIS item bank (28 items) and additionally translated 28 U.S. depression items that failed to make the final U.S. item bank. Through psychometric analysis of a combined clinical and general population sample (N = 2,010), 8 added items were removed. With the final item bank, we performed several CAT simulations to assess the efficiency of the extended (48 items) and the original item bank (28 items), using various stopping rules. Both item banks resulted in highly efficient and precise measurement of depression and showed high similarity between the CAT simulation scores and the full item bank scores. We discuss the implications of using each item bank and stopping rule for further CAT development.

  7. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    ERIC Educational Resources Information Center

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  8. The Development of a Visual-Perceptual Chemistry Specific (VPCS) Assessment Tool

    ERIC Educational Resources Information Center

    Oliver-Hoyo, Maria; Sloan, Caroline

    2014-01-01

    The development of the Visual-Perceptual Chemistry Specific (VPCS) assessment tool is based on items that align to eight visual-perceptual skills considered as needed by chemistry students. This tool includes a comprehensive range of visual operations and presents items within a chemistry context without requiring content knowledge to solve…

  9. Development of a Questionnaire Assessing School Physical Activity Environment

    ERIC Educational Resources Information Center

    Robertson-Wilson, Jennifer; Levesque, Lucie; Holden, Ronald R.

    2007-01-01

    This study was designed to develop the Questionnaire Assessing School Physical Activity Environment (Q--SPACE) based on student perceptions. Twenty-eight items rated on 4-point Likert scales were administered to 244 middle school students in 9 schools. Exploratory factor analysis was used to evaluate the underlying structure of the items and 2…

  10. Identifying Promising Items: The Use of Crowdsourcing in the Development of Assessment Instruments

    ERIC Educational Resources Information Center

    Sadler, Philip M.; Sonnert, Gerhard; Coyle, Harold P.; Miller, Kelly A.

    2016-01-01

    The psychometrically sound development of assessment instruments requires pilot testing of candidate items as a first step in gauging their quality, typically a time-consuming and costly effort. Crowdsourcing offers the opportunity for gathering data much more quickly and inexpensively than from most targeted populations. In a simulation of a…

  11. Development of a Brief Questionnaire to Assess Contraceptive Intent

    PubMed Central

    Raine-Bennett, Tina R; Rocca, Corinne H

    2015-01-01

    Objective: We sought to develop and validate an instrument that can enable providers to identify young women who may be at risk of contraceptive non-adherence. Methods: Item response theory based methods were used to evaluate the psychometric properties of the Contraceptive Intent Questionnaire, a 15-item self-administered questionnaire, based on theory and prior qualitative and quantitative research. The questionnaire was administered to 200 women aged 15–24 years who were initiating contraceptives. We assessed item fit to the item response model, internal consistency, internal structure validity, and differential item functioning. Results: All items fit a one-dimensional model. The separation reliability coefficient was 0.73. Participants’ overall scores covered the full range of the scale (0–15), and items appropriately matched the range of participants’ contraceptive intent. Items met the criteria for internal structure validity and most items functioned similarly between groups of women. Conclusion: The Contraceptive Intent Questionnaire appears to be a reliable and valid tool. Future testing is needed to assess predictive ability and clinical utility. Practice Implications: The Contraceptive Intent Questionnaire may serve as a valid tool to help providers identify women who may have problems with contraceptive adherence, as well as to pinpoint areas in which counseling may be directed. PMID:26104994

  12. Development of a brief questionnaire to assess contraceptive intent.

    PubMed

    Raine-Bennett, Tina R; Rocca, Corinne H

    2015-11-01

    We sought to develop and validate an instrument that can enable providers to identify young women who may be at risk of contraceptive non-adherence. Item response theory based methods were used to evaluate the psychometric properties of the Contraceptive Intent Questionnaire, a 15-item self-administered questionnaire, based on theory and prior qualitative and quantitative research. The questionnaire was administered to 200 women aged 15-24 years who were initiating contraceptives. We assessed item fit to the item response model, internal consistency, internal structure validity, and differential item functioning. All items fit a one-dimensional model. The separation reliability coefficient was 0.73. Participants' overall scores covered the full range of the scale (0-15), and items appropriately matched the range of participants' contraceptive intent. Items met the criteria for internal structure validity and most items functioned similarly between groups of women. The Contraceptive Intent Questionnaire appears to be a reliable and valid tool. Future testing is needed to assess predictive ability and clinical utility. The Contraceptive Intent Questionnaire may serve as a valid tool to help providers identify women who may have problems with contraceptive adherence, as well as to pinpoint areas in which counseling may be directed. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  13. A Monte Carlo Study Investigating the Influence of Item Discrimination, Category Intersection Parameters, and Differential Item Functioning Patterns on the Detection of Differential Item Functioning in Polytomous Items

    ERIC Educational Resources Information Center

    Thurman, Carol

    2009-01-01

    The increased use of polytomous item formats has led assessment developers to pay greater attention to the detection of differential item functioning (DIF) in these items. DIF occurs when an item performs differently for two contrasting groups of respondents (e.g., males versus females) after controlling for differences in the abilities of the…

  14. INTRODUCTION TO PATIENT-REPORTED OUTCOME ITEM BANKS: ISSUES IN MINORITY AGING RESEARCH

    PubMed Central

    Templin, Thomas N; Hays, Ron D; Gershon, Richard C; Rothrock, Nan; Jones, Richard N; Teresi, Jeanne A; Stewart, Anita; Weech-Maldonado, Robert; Wallace, Steve

    2014-01-01

    In 2004 NIH awarded contracts to initiate the development of high quality psychological and neuropsychological outcome measures for improved assessment of health-related outcomes. The workshop introduced these measurement development initiatives, the measures created, and the NIH supported resource (Assessment Center) for internet or tablet-based test administration and scoring. Presentation covered: (a) item response theory (IRT) and assessment of test bias, (b) construction of item banks and computerized adaptive testing, and (c) the different ways in which qualitative analyses contribute to the definition of construct domains and the refinement of outcome constructs. The panel discussion included questions about representativeness of samples, and assessment of cultural bias. PMID:23570428

  15. Development and validation of the Perceived Food Environment Questionnaire in a French-Canadian population.

    PubMed

    Carbonneau, Elise; Robitaille, Julie; Lamarche, Benoît; Corneau, Louise; Lemieux, Simone

    2017-08-01

    The present study aimed to develop and validate a questionnaire assessing perceived food environment in a French-Canadian population. A questionnaire, the Perceived Food Environment Questionnaire, was developed assessing perceived accessibility to healthy (nine items) and unhealthy foods (three items). A pre-test sample was recruited for a pilot testing of the questionnaire. For the validation study, another sample was recruited and completed the questionnaire twice. Exploratory factor analysis was performed on the items to assess the number of factors (subscales). Cronbach's α was used to measure internal consistency reliability. Test-retest reliability was assessed with Pearson correlations. Online survey. Men and women from the Québec City area (n 31 in the pre-test sample; n 150 in the validation study sample). The pilot testing did not lead to any change in the questionnaire. The exploratory factor analysis revealed a two-subscale structure. The first subscale is composed of six items assessing accessibility to healthy foods and the second includes three items related to accessibility to unhealthy foods. Three items were removed from the questionnaire due to low loading on the two subscales. The subscales demonstrated adequate internal consistency (Cronbach's α=0·77 for healthy foods and 0·62 for unhealthy foods) and test-retest reliability (r=0·59 and 0·60, respectively; both P<0·0001). The Perceived Food Environment Questionnaire was developed for a French-Canadian population and demonstrated good psychometric properties. Further validation is recommended if the questionnaire is to be used in other populations.
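
    Internal consistency figures such as the Cronbach's α values above can be computed along the following lines. This is a minimal sketch with a hypothetical function name and input layout (one row of item scores per respondent), not the authors' analysis code.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)
```

    Population variances are used for both the items and the total. Using sample variances throughout gives the same value, because the correction factor cancels in the ratio.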

  16. Development and validation of an instrument to assess job satisfaction in eye-care personnel.

    PubMed

    Paudel, Prakash; Cronjé, Sonja; O'Connor, Patricia M; Khadka, Jyoti; Rao, Gullapalli N; Holden, Brien A

    2017-11-01

    The aim was to develop and validate an instrument to measure job satisfaction in eye-care personnel and assess the job satisfaction of one-year trained vision technicians in India. A pilot instrument for assessing job satisfaction was developed, based on a literature review and input from a public health expert panel. Rasch analysis was used to assess psychometric properties and to undertake an iterative item reduction. The instrument was then administered to vision technicians in vision centres of Andhra Pradesh in India. Associations between vision technicians' job satisfaction and factors such as age, gender and experience were analysed using t-test and one-way analysis of variance. Rasch analysis confirmed that the 15-item job satisfaction in eye-care personnel (JSEP) was a unidimensional instrument with good fit statistics, measurement precisions and absence of differential item functioning. Overall, vision technicians reported high rates of job satisfaction (0.46 logits). Age, gender and experience were not associated with high job satisfaction score. Item score analysis showed non-financial incentives, salary and workload were the most important determinants of job satisfaction. The 15-item JSEP instrument is a valid instrument for assessing job satisfaction among eye-care personnel. Overall, vision technicians in India demonstrated high rates of job satisfaction. © 2016 Optometry Australia.

  17. Development of the Assessment of Belief Conflict in Relationship-14 (ABCR-14).

    PubMed

    Kyougoku, Makoto; Teraoka, Mutsumi; Masuda, Noriko; Ooura, Mariko; Abe, Yasushi

    2015-01-01

    Nurses and other healthcare workers frequently experience belief conflict, one of the most important, new stress-related problems in both academic and clinical fields. In this study, using a sample of 1,683 nursing practitioners, we developed The Assessment of Belief Conflict in Relationship-14 (ABCR-14), a new scale that assesses belief conflict in the healthcare field. Standard psychometric procedures were used to develop and test the scale, including a qualitative framework concept and item-pool development, item reduction, and scale development. We analyzed the psychometric properties of ABCR-14 according to entropy, polyserial correlation coefficient, exploratory factor analysis, confirmatory factor analysis, average variance extracted, Cronbach's alpha, Pearson product-moment correlation coefficient, and multidimensional item response theory (MIRT). The results of the analysis supported a three-factor model consisting of 14 items. The validity and reliability of ABCR-14 was suggested by evidence from high construct validity, structural validity, hypothesis testing, internal consistency reliability, and concurrent validity. The result of the MIRT offered strong support for good item response of item slope parameters and difficulty parameters. However, the ABCR-14 Likert scale might need to be explored from the MIRT point of view. Yet, as mentioned above, there is sufficient evidence to support that ABCR-14 has high validity and reliability. The ABCR-14 demonstrates good psychometric properties for nursing belief conflict. Further studies are recommended to confirm its application in clinical practice.

  18. Gender-Related Differential Item Functioning on a Middle-School Mathematics Performance Assessment.

    ERIC Educational Resources Information Center

    Lane, Suzanne; And Others

    This study examined gender-related differential item functioning (DIF) using a mathematics performance assessment, the QUASAR Cognitive Assessment Instrument (QCAI), administered to middle school students. The QCAI was developed for the Quantitative Understanding: Amplifying Student Achievement and Reading (QUASAR) project, which focuses on…

  19. Osmosis and Diffusion Conceptual Assessment

    ERIC Educational Resources Information Center

    Fisher, Kathleen M.; Williams, Kathy S.; Lineback, Jennifer Evarts

    2011-01-01

    Biology student mastery regarding the mechanisms of diffusion and osmosis is difficult to achieve. To monitor comprehension of these processes among students at a large public university, we developed and validated an 18-item Osmosis and Diffusion Conceptual Assessment (ODCA). This assessment includes two-tiered items, some adopted or modified…

  20. Assessment of readiness to change in patients with osteoarthritis. Development and application of a new questionnaire.

    PubMed

    Heuts, Peter H T G; de Bie, Rob A; Dijkstra, Arie; Aretz, Karin; Vlaeyen, Johan W S; Schouten, Hubert J A; Hopman-Rock, Marijke; van Weel, Chris; van Schayck, Constant P

    2005-05-01

    To develop a self-report measure for assessment of the stage of change in patients with osteoarthritis, in order to identify patients who would benefit from a self-management programme. According to the 'stages of change' model a questionnaire was developed with three groups of items corresponding to the precontemplation stage (Pre), the contemplation (Cont) and the action (Act) stage. Internal consistency and factor structure of this questionnaire were investigated by assessing Cronbach's alphas and by performing factor analysis. The questionnaire was offered to 273 patients who entered a randomized clinical trial on self-management in a general health care setting. Factor analysis revealed that most items corresponded to the a priori described groups, while some items were not loading on the presumed factor. In each subgroup some items were deleted, resulting in a 15-item questionnaire. After this item reduction Cronbach's alphas were 0.72 (Pre), 0.76 (Cont) and 0.79 (Act) and all factor loadings were satisfactory (above 0.35). Classification revealed some differences between parts of the total group, for example in the proportion of patients in the preparation stage (recruited by general practitioner = 33.6%; advertisement = 49.2%). The Stages of Change Questionnaire in Osteoarthritis, a 15-item questionnaire to assess the 'stage of change' of a patient with osteoarthritis showed good internal consistency and adequate factor structure. These findings warrant further studies on validity and applicability in a clinical context.

  1. Advising on Preferred Reporting Items for patient-reported outcome instrument development: the PRIPROID.

    PubMed

    Hou, Zheng-Kun; Liu, Feng-Bin; Fang, Ji-Qian; Li, Xiao-Ying; Li, Li-Juan; Lin, Chu-Hua

    2013-03-01

    The reporting of patient-reported outcome (PRO) instrument development is vital for both researchers and clinicians to determine an instrument's validity. We therefore propose the Preferred Reporting Items for PRO Instrument Development (PRIPROID) to improve the quality of reports. Abiding by the guidance published by the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) Network, we performed 6 steps for item development: identified the need for a guideline, performed a literature review, obtained funding for the guideline initiative, identified participants, conducted a Delphi exercise, and generated a list of PRIPROID items for consideration at the face-to-face meeting. Twenty-three item subheadings under 7 topics were included: title and structured abstract, rationale, objectives, intention, eligibility criteria, conceptual framework, items generation, response options, scoring, times, administrative modes, burden assessment, properties assessment, statistical methods, participants, main results, and additional analysis, summary of evidence, limitations, clinical attentions, and conclusions, item pools or final form, and funding. The PRIPROID covers many elements of PRO research. It helps researchers report their results more accurately and, to a certain degree, evaluate the quality of the research methods.

  2. The Development and Validation of the Intercultural Sensitivity Scale.

    ERIC Educational Resources Information Center

    Chen, Guo-Ming; Starosta, William J.

    The present study developed and assessed reliability and validity of a new instrument, the Intercultural Sensitivity Scale (ISS). Based on a review of the literature, 44 items thought to be important for intercultural sensitivity were generated. A sample of 414 college students rated these items and generated a 24-item final version of the…

  3. An Objective Instrument for Assessment of Erikson's Developmental Conflicts. Presentation Summary.

    ERIC Educational Resources Information Center

    Speisman, Joseph C.; And Others

    An objective measure of Erikson's ego-identity construct is being developed. The total scale includes seven relatively independent subscales designed to reflect the residuals (part conflicts) of Erikson's psychosocial stages of development. An initial item pool of 194 items has been reduced to 113 items by means of judgemental and statistical…

  4. Development of a Multidimensional Functional Health Scale for Older Adults in China.

    PubMed

    Mao, Fanzhen; Han, Yaofeng; Chen, Junze; Chen, Wei; Yuan, Manqiong; Alicia Hong, Y; Fang, Ya

    2016-05-01

    A first step to achieve successful aging is assessing functional wellbeing of older adults. This study reports the development of a culturally appropriate brief scale (the Multidimensional Functional Health Scale for Chinese Elderly, MFHSCE) to assess the functional health of Chinese elderly. Using systematic literature review, the Delphi method, cultural adaptation, synthetic statistical item selection, Cronbach's alpha, and confirmatory factor analysis, we developed an item pool, carried out two rounds of item selection, and performed a psychometric evaluation. Synthetic statistical item selection and psychometric evaluation were conducted among 539 and 2032 older adults, respectively. The MFHSCE consists of 30 items, covering activities of daily living, social relationships, physical health, mental health, cognitive function, and economic resources. The Cronbach's alpha was 0.92, and the comparative fit index was 0.917. The MFHSCE has good internal consistency and construct validity; it is also concise and easy to use in general practice, especially in communities in China.

  5. The Development of Multiple-Choice Items Consistent with the AP Chemistry Curriculum Framework to More Accurately Assess Deeper Understanding

    ERIC Educational Resources Information Center

    Domyancich, John M.

    2014-01-01

    Multiple-choice questions are an important part of large-scale summative assessments, such as the advanced placement (AP) chemistry exam. However, past AP chemistry exam items often lacked the ability to test conceptual understanding and higher-order cognitive skills. The redesigned AP chemistry exam shows a distinctive shift in item types toward…

  6. A New Functional Health Literacy Scale for Japanese Young Adults Based on Item Response Theory.

    PubMed

    Tsubakita, Takashi; Kawazoe, Nobuo; Kasano, Eri

    2017-03-01

    Health literacy predicts health outcomes. Despite concerns surrounding the health of Japanese young adults, to date there has been no objective assessment of health literacy in this population. This study aimed to develop a Functional Health Literacy Scale for Young Adults (funHLS-YA) based on item response theory. Each item in the scale requires participants to choose the most relevant term from 3 choices in relation to a target item, thus assessing objective rather than perceived health literacy. The 20-item scale was administered to 1816 university students and 1751 responded. Cronbach's α coefficient was .73. Difficulty and discrimination parameters of each item were estimated, resulting in the exclusion of 1 item. Some items showed different difficulty parameters for male and female participants, reflecting that some aspects of health literacy may differ by gender. The current 19-item version of funHLS-YA can reliably assess the objective health literacy of Japanese young adults.

  7. Initial development of a patient-reported outcome measure of experience with cognitive impairment associated with schizophrenia.

    PubMed

    Welch, Lisa C; Trudeau, Jeremiah J; Silverstein, Steven M; Sand, Michael; Henderson, David C; Rosen, Raymond C

    2017-01-01

    Cognitive impairment is a serious, often distressing aspect of schizophrenia that affects patients' day-to-day lives. Although several interview-based instruments exist to assess cognitive functioning, a reliable measure developed based on the experiences of patients facing cognitive difficulties is needed to complement the objective performance-based assessments. The present article describes the initial development of a patient-reported outcome (PRO) measure to assess the subjective experience of cognitive impairment among patients with schizophrenia, the Patient-Reported Experience of Cognitive Impairment in Schizophrenia (PRECIS). The phases of development included the construction of a conceptual model based on the existing knowledge and two sets of qualitative interviews with patients: 1) concept elicitation interviews to ensure face and content validity from the perspective of people with schizophrenia and 2) cognitive debriefing of the initial item pool. Input from experts was elicited throughout the process. The initial conceptual model included seven domains. The results from concept elicitation interviews (n=80) supported these domains but yielded substantive changes to concepts within domains and to terminology. Based on these results, an initial pool of 53 items was developed to reflect the most common descriptions and languages used by the study participants. Cognitive debriefing interviews (n=22) resulted in the removal of 18 items and modification of 22 other items. The remaining 35 items represented 23 concepts within six domains plus two items assessing bother. The draft PRO measure is currently undergoing psychometric testing as a precursor to broad-based clinical and research use.

  8. Developing an item bank and short forms that assess the impact of asthma on quality of life.

    PubMed

    Stucky, Brian D; Edelen, Maria Orlando; Sherbourne, Cathy D; Eberhart, Nicole K; Lara, Marielena

    2014-02-01

    The present work describes the process of developing an item bank and short forms that measure the impact of asthma on quality of life (QoL) that avoids confounding QoL with asthma symptomatology and functional impairment. Using a diverse national sample of adults with asthma (N = 2032) we conducted exploratory and confirmatory factor analyses, and item response theory and differential item functioning analyses to develop a 65-item unidimensional item bank and separate short form assessments. A psychometric evaluation of the RAND Impact of Asthma on QoL item bank (RAND-IAQL) suggests that though the concept of asthma impact on QoL is multi-faceted, it may be measured as a single underlying construct. The performance of the bank was then evaluated with a real-data simulated computer adaptive test. From the RAND-IAQL item bank we then developed two short forms consisting of 4 and 12 items (reliability = 0.86 and 0.93, respectively). A real-data simulated computer adaptive test suggests that as few as 4-5 items from the bank are needed to obtain highly precise scores. Preliminary validity results indicate that the RAND-IAQL measures distinguish between levels of asthma control. To measure the impact of asthma on QoL, users of these items may choose from two highly reliable short forms, computer adaptive test administration, or content-specific subsets of items from the bank tailored to their specific needs. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Development and Validity Testing of the Worksite Health Index: An Assessment Tool to Help and Improve Korean Employees' Health-Related Outcome.

    PubMed

    Yun, Young Ho; Sim, Jin Ah; Lim, Ye Jin; Lim, Cheol Il; Kang, Sung-Choon; Kang, Joon-Ho; Park, Jun Dong; Noh, Dong Young

    2016-06-01

    The objective of this study was to develop the Worksite Health Index (WHI) and validate its psychometric properties. The development of the WHI questionnaire included item generation, item construction, and field testing. To assess the instrument's reliability and validity, we recruited 30 different Korean worksites. We developed the WHI questionnaire of 136 items categorized into five domains, namely Governance and Infrastructure, Need Assessment and Planning, Health Prevention and Promotion Program, Occupational Safety, and Monitoring and Feedback. All WHI domains demonstrated a high reliability with good internal consistency. The total WHI scores differentiated worksite groups effectively according to firm size. Each domain was associated significantly with employees' health status, absence, and financial outcome. The WHI can assess comprehensive worksite health programs. This tool is publicly available for addressing the growing need for worksite health programs.

  10. Development of an Item Bank for the Assessment of Knowledge on Biology in Argentine University Students.

    PubMed

    Cupani, Marcos; Zamparella, Tatiana Castro; Piumatti, Gisella; Vinculado, Grupo

    The calibration of item banks provides the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. This study aims to develop a bank of items to measure the level of Knowledge on Biology using the Rasch model. The sample consisted of 1219 participants that studied in different faculties of the National University of Cordoba (mean age = 21.85 years, SD = 4.66; 66.9% are women). The items were organized in different forms and into separate subtests, with some common items across subtests. The students were told they had to answer 60 questions of knowledge on biology. Evaluation of Rasch model fit (Zstd >|2.0|), differential item functioning, dimensionality, local independence, item and person separation (>2.0), and reliability (>.80) resulted in a bank of 180 items with good psychometric properties. The bank provides items with a wide range of content coverage and may serve as a sound basis for computerized adaptive testing applications. The contribution of this work is significant in the field of educational assessment in Argentina.
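
    For readers unfamiliar with the Rasch model used to calibrate this bank, the response probability and item information can be written as the short sketch below. The function names are hypothetical; the formulas are the standard dichotomous Rasch expressions, with person ability `theta` and item difficulty `b` on the same logit scale.

```python
import math

def rasch_probability(theta, b):
    """Probability of a correct response: exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at theta: P * (1 - P)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)
```

    Information peaks at 0.25 when ability equals difficulty. This is why a bank with a wide spread of difficulties suits adaptive testing: for most ability levels some item lies close to the person's position on the scale.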

  11. Development and validation of the quality care questionnaire-palliative care (QCQ-PC): patient-reported assessment of quality of palliative care.

    PubMed

    Yun, Young Ho; Kang, Eun Kyo; Lee, Jihye; Choo, Jiyeon; Ryu, Hyewon; Yun, Hye-Min; Kang, Jung Hun; Kim, Tae You; Sim, Jin-Ah; Kim, Yaeji

    2018-03-05

    In this study, we aimed to develop and validate an instrument that could be used by patients with cancer to evaluate their quality of palliative care. Development of the questionnaire followed a four-phase process: item generation and reduction, construction, pilot testing, and field testing. Based on the literature, we constructed a list of items for the quality of palliative care from 104 quality care issues divided into 14 subscales. We constructed scales of 43 items that only the cancer patients were asked to answer. Using relevance and feasibility criteria and pilot testing, we developed a 44-item questionnaire. To assess the sensitivity and validity of the questionnaire, we recruited 220 patients over 18 years of age from three Korean hospitals. Factor analysis of the data and evaluation of fit statistics resulted in the 4-factor, 32-item Quality Care Questionnaire-Palliative Care (QCQ-PC), which covers appropriate communication with health care professionals (ten items), discussing value of life and goals of care (nine items), support and counseling for needs of holistic care (seven items), and accessibility and sustainability of care (six items). All subscales and total scores showed high internal consistency (Cronbach alpha range, 0.89 to 0.97). Multi-trait scaling analysis showed good convergent (0.568-0.995) and discriminant (0.472-0.869) validity. Correlations between the total and subscale scores of the QCQ-PC and those of the EORTC QLQ-C15-PAL, MQOL, SAT-SF, and DCS were obtained. This study demonstrates that the QCQ-PC can be adopted to assess the quality of care in patients with cancer.
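
    Internal consistency figures such as the Cronbach alpha range quoted above are computed from the item variances and the variance of the total score. The following is a minimal sketch of that formula, assuming complete data arranged as one row per respondent; the function name and the toy scores are hypothetical and not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of rows, each row a list of k item scores (no missing data).
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance (n - 1 denominator) of each item column.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


if __name__ == "__main__":
    toy = [[1, 2], [2, 1], [3, 3]]  # made-up scores, three respondents
    print(round(cronbach_alpha(toy), 3))
```

    Alpha rises with the number of items as well as with their intercorrelation, so values for subscales of different lengths are not directly comparable.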

  12. Development of a Multi-Domain Assessment Tool for Quality Improvement Projects.

    PubMed

    Rosenbluth, Glenn; Burman, Natalie J; Ranji, Sumant R; Boscardin, Christy K

    2017-08-01

    Improving the quality of health care and education has become a mandate at all levels within the medical profession. While several published quality improvement (QI) assessment tools exist, all have limitations in addressing the range of QI projects undertaken by learners in undergraduate medical education, graduate medical education, and continuing medical education. We developed and validated a tool to assess QI projects with learner engagement across the educational continuum. After reviewing existing tools, we interviewed local faculty who taught QI to understand how learners were engaged and what these faculty wanted in an ideal assessment tool. We then developed a list of competencies associated with QI, established items linked to these competencies, revised the items using an iterative process, and collected validity evidence for the tool. The resulting Multi-Domain Assessment of Quality Improvement Projects (MAQIP) rating tool contains 9 items, with criteria that may be completely fulfilled, partially fulfilled, or not fulfilled. Interrater reliability was 0.77. Untrained local faculty were able to use the tool with minimal guidance. The MAQIP is a 9-item, user-friendly tool that can be used to assess QI projects at various stages and to provide formative and summative feedback to learners at all levels.

  13. Development and preliminary evaluation of a music-based attention assessment for patients with traumatic brain injury.

    PubMed

    Jeong, Eunju; Lesiuk, Teresa L

    2011-01-01

    Impairments in attention are commonly seen in individuals with traumatic brain injury (TBI). While visual attention assessment measurements have been rigorously developed and frequently used in cognitive neurorehabilitation, there is a paucity of auditory attention assessment measurements for patients with TBI. The purpose of this study was to field test a researcher-developed Music-based Attention Assessment (MAA), a melodic contour identification test designed to assess three different types of attention (i.e., sustained attention, selective attention, and divided attention), for patients with TBI. Additionally, this study aimed to evaluate the readability and comprehensibility of the test items and to examine the preliminary psychometric properties of the scale and test items. Fifteen patients diagnosed with TBI completed 3 different series of tasks in which they were required to identify melodic contours. The resulting data showed that (a) test items in each of the 3 subtests had an easy to moderate level of item difficulty and an acceptable to high level of item discrimination, (b) the musical characteristics (i.e., contour, congruence, and pitch interference) were associated with the level of item difficulty, and (c) the internal consistency of the MAA as computed by Cronbach's alpha was .95. Subsequent studies using a larger sample of typical participants, along with individuals with TBI, are needed to confirm construct validity and internal consistency of the MAA. In addition, the authors recommend examination of criterion validity of the MAA as correlated with current neuropsychological attention assessment measurements.
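
    The abstract does not say how item difficulty and discrimination were computed. A common classical-test-theory convention, assumed here purely for illustration, takes difficulty as the proportion of correct responses and discrimination as the correlation between an item and the total of the remaining items. The function names and the response table are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; assumes both sequences have non-zero variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_statistics(responses):
    """Return one (difficulty, discrimination) pair per item.

    responses: rows of 0/1 scores, one row per respondent.
    difficulty: proportion correct (higher means easier).
    discrimination: corrected item-total correlation (item vs. rest score).
    """
    n_items = len(responses[0])
    stats = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]
        stats.append((mean(item), pearson(item, rest)))
    return stats


if __name__ == "__main__":
    toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]  # made-up data
    for difficulty, discrimination in item_statistics(toy):
        print(round(difficulty, 2), round(discrimination, 2))
```

    With a sample as small as the fifteen patients reported here, such statistics are unstable, which is consistent with the authors' call for larger follow-up samples.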

  14. Challenges and Strategies for Assessing Specialised Knowledge for Teaching

    ERIC Educational Resources Information Center

    Orrill, Chandra Hawley; Kim, Ok-Kyeong; Peters, Susan A.; Lischka, Alyson E.; Jong, Cindy; Sanchez, Wendy B.; Eli, Jennifer A.

    2015-01-01

    Developing and writing assessment items that measure teachers' knowledge is an intricate and complex undertaking. In this paper, we begin with an overview of what is known about measuring teacher knowledge. We then highlight the challenges inherent in creating assessment items that focus specifically on measuring teachers' specialised knowledge…

  15. 76 FR 43286 - National Assessment Governing Board; Meeting

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-20

    ... levels for each grade and subject tested, developing standards and procedures for interstate and national... in closed session to review secure test items for the 2012 Economics assessment at grade 12 and the... meeting the ADC will complete their review of secure NAEP test items for the 2012 Economics assessment at...

  16. Development and Validation of the Learning Progression-Based Assessment of Modern Genetics in a High School Context

    ERIC Educational Resources Information Center

    Todd, Amber; Romine, William L.; Cook Whitt, Katahdin

    2017-01-01

    We describe the development, validation, and use of the "Learning Progression-Based Assessment of Modern Genetics" (LPA-MG) in a high school biology context. Items were constructed based on a current learning progression framework for genetics (Shea & Duncan, 2013; Todd & Kenyon, 2015). The 34-item instrument, which was tied to…

  17. Efforts Toward the Development of Unbiased Selection and Assessment Instruments.

    ERIC Educational Resources Information Center

    Rudner, Lawrence M.

    Investigations into item bias provide an empirical basis for the identification and elimination of test items which appear to measure different traits across populations or cultural groups. The Psychometric rationales for six approaches to the identification of biased test items are reviewed: (1) Transformed item difficulties: within-group…

  18. A Multidimensional Computerized Adaptive Short-Form Quality of Life Questionnaire Developed and Validated for Multiple Sclerosis: The MusiQoL-MCAT.

    PubMed

    Michel, Pierre; Baumstarck, Karine; Ghattas, Badih; Pelletier, Jean; Loundou, Anderson; Boucekine, Mohamed; Auquier, Pascal; Boyer, Laurent

    2016-04-01

    The aim was to develop a multidimensional computerized adaptive short-form questionnaire, the MusiQoL-MCAT, from a fixed-length QoL questionnaire for multiple sclerosis. A total of 1992 patients were enrolled in this international cross-sectional study. The development of the MusiQoL-MCAT was based on the assessment of between-items MIRT model fit followed by real-data simulations. The MCAT algorithm was based on Bayesian maximum a posteriori estimation of latent traits and Kullback-Leibler information item selection. We examined several simulations based on a fixed number of items. Accuracy was assessed using correlations (r) between initial IRT scores and MCAT scores. Precision was assessed using the standard error of measurement (SEM) and the root mean square error (RMSE). The multidimensional graded response model was used to estimate item parameters and IRT scores. Among the MCAT simulations, the 16-item version of the MusiQoL-MCAT was selected because the accuracy and precision became stable with 16 items with satisfactory levels (r ≥ 0.9, SEM ≤ 0.55, and RMSE ≤ 0.3). External validity of the MusiQoL-MCAT was satisfactory. The MusiQoL-MCAT presents satisfactory properties and can individually tailor QoL assessment to each patient, making it less burdensome to patients and better adapted for use in clinical practice.

  19. A Multidimensional Computerized Adaptive Short-Form Quality of Life Questionnaire Developed and Validated for Multiple Sclerosis

    PubMed Central

    Michel, Pierre; Baumstarck, Karine; Ghattas, Badih; Pelletier, Jean; Loundou, Anderson; Boucekine, Mohamed; Auquier, Pascal; Boyer, Laurent

    2016-01-01

    The aim was to develop a multidimensional computerized adaptive short-form questionnaire, the MusiQoL-MCAT, from a fixed-length QoL questionnaire for multiple sclerosis. A total of 1992 patients were enrolled in this international cross-sectional study. The development of the MusiQoL-MCAT was based on the assessment of between-items MIRT model fit followed by real-data simulations. The MCAT algorithm was based on Bayesian maximum a posteriori estimation of latent traits and Kullback–Leibler information item selection. We examined several simulations based on a fixed number of items. Accuracy was assessed using correlations (r) between initial IRT scores and MCAT scores. Precision was assessed using the standard error of measurement (SEM) and the root mean square error (RMSE). The multidimensional graded response model was used to estimate item parameters and IRT scores. Among the MCAT simulations, the 16-item version of the MusiQoL-MCAT was selected because the accuracy and precision became stable with 16 items with satisfactory levels (r ≥ 0.9, SEM ≤ 0.55, and RMSE ≤ 0.3). External validity of the MusiQoL-MCAT was satisfactory. The MusiQoL-MCAT presents satisfactory properties and can individually tailor QoL assessment to each patient, making it less burdensome to patients and better adapted for use in clinical practice. PMID:27057832
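
    The scoring step of an adaptive test, Bayesian maximum a posteriori (MAP) estimation, can be illustrated in a much simpler setting than the one used in this study. The sketch below assumes a single latent trait, dichotomous items under a two-parameter logistic model, and a standard normal prior; the study itself used a multidimensional graded response model with Kullback-Leibler item selection, neither of which is reproduced here. All names and parameter values are hypothetical.

```python
import math


def map_estimate(responses, slopes, difficulties, step=0.01):
    """Grid-search MAP estimate of a single latent trait.

    responses: 0/1 answers; slopes and difficulties: parallel 2PL parameters.
    The prior is standard normal; the grid spans -4 to 4 in increments of step.
    """
    best_theta, best_logpost = 0.0, -math.inf
    n_points = int(round(8.0 / step))
    for i in range(n_points + 1):
        theta = -4.0 + i * step
        # Log prior of N(0, 1), dropping the constant term.
        logpost = -0.5 * theta * theta
        for x, a, b in zip(responses, slopes, difficulties):
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            logpost += math.log(p) if x == 1 else math.log(1.0 - p)
        if logpost > best_logpost:
            best_theta, best_logpost = theta, logpost
    return best_theta


if __name__ == "__main__":
    # Made-up item parameters and a mixed response pattern.
    print(map_estimate([1, 1, 0], [1.2, 0.8, 1.5], [-0.5, 0.0, 1.0]))
```

    The prior pulls estimates toward zero when few items have been answered, which keeps early estimates in an adaptive test finite even for all-correct or all-incorrect patterns.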

  20. Development and refinement of the WAItE: a new obesity-specific quality of life measure for adolescents.

    PubMed

    Oluboyede, Yemi; Hulme, Claire; Hill, Andrew

    2017-08-01

    Few weight-specific outcome measures developed specifically for obese and overweight adolescents exist, and none are suitable for the elicitation of utility values used in the assessment of cost effectiveness. This study describes the development of a descriptive system for a new weight-specific measure. Qualitative interviews were conducted with 31 treatment-seeking (above normal weight status) and non-treatment-seeking (school sample) adolescents aged 11-18 years, to identify a draft item pool and associated response options. A total of 315 eligible consenting adolescents, aged 11-18 years, enrolled in weight management services and recruited via an online panel, completed two versions of a long-list 29-item descriptive system (consisting of frequency and severity response scales). Psychometric assessments and Rasch analysis were applied to the draft 29-item instrument to identify a brief tool containing the best performing items and associated response options. Seven items were selected for the final item set; all displayed internal consistency, moderate floor effects and the ability to discriminate between weight categories. The assessment of unidimensionality was supported (t test statistic of 0.024, less than the 0.05 threshold value). The Weight-specific Adolescent Instrument for Economic-evaluation focuses on aspects of life affected by weight that are important to adolescents. It has the potential for adding key information to the assessment of weight management interventions aimed at the younger population.

  1. Development of functional oral health literacy assessment instruments: application of literacy and cognitive theories.

    PubMed

    Bridges, Susan M; Parthasarathy, Divya S; Au, Terry K F; Wong, Hai Ming; Yiu, Cynthia K Y; McGrath, Colman P

    2014-01-01

    This paper describes the development of a new literacy assessment instrument, the Hong Kong Oral Health Literacy Assessment Task for Paediatric Dentistry (HKOHLAT-P). Its relationship to literacy theory is analyzed to establish content and face validity. Implications for construct validity are examined by analyzing cognitive demand to determine how "comprehension" is measured. Key influences from literacy assessment were identified to analyze item development. Cognitive demand was analyzed using an established taxonomy. The HKOHLAT-P focuses on the functional domain of health literacy assessment. Items had strong content and face validity reflecting established principles from modern literacy theory. Inclusion of new text types signified relevant developments in the area of new literacies. Analysis of cognitive demand indicated that this instrument assesses the "comprehension" domain, specifically the areas of factual and procedural knowledge, with some assessment of conceptual knowledge. Metacognitive knowledge was not assessed. Comprehension tasks assessing patient health literacy predominantly examine functional health literacy at the lower levels of comprehension. Item development is influenced by the fields of situated and authentic literacy. Inclusion of content regarding multiliteracies is suggested for further research. Development of functional health literacy assessment instruments requires careful consideration of the clinical context in determining construct validity. © 2013 American Association of Public Health Dentistry.

  2. Assessing Mindfulness in Children and Adolescents: Development and Validation of the Child and Adolescent Mindfulness Measure (CAMM)

    ERIC Educational Resources Information Center

    Greco, Laurie A.; Baer, Ruth A.; Smith, Gregory T.

    2011-01-01

    This article presents 4 studies (N = 1,413) describing the development and validation of the Child and Adolescent Mindfulness Measure (CAMM). In Study 1 (n = 428), the authors determined procedures for item development and examined comprehensibility of the initial 25 items. In Study 2 (n = 334), they reduced the initial item pool from 25 to 10…

  3. The Dutch motor skills assessment as tool for talent development in table tennis: a reproducibility and validity study.

    PubMed

    Faber, Irene R; Nijhuis-Van Der Sanden, Maria W G; Elferink-Gemser, Marije T; Oosterveld, Frits G J

    2015-01-01

    A motor skills assessment could be helpful in talent development by estimating essential perceptuo-motor skills of young players, which are considered requisite to develop excellent technical and tactical qualities. The Netherlands Table Tennis Association uses a motor skills assessment in their talent development programme consisting of eight items measuring perceptuo-motor skills specific to table tennis under varying conditions. This study aimed to investigate this assessment regarding its reproducibility, internal consistency, underlying dimensions and concurrent validity in 113 young table tennis players (6-10 years). Intraclass correlation coefficients of six test items met the criterion of 0.7, with coefficients of variation between 3% and 8%. Cronbach's alpha was 0.853 for internal consistency. The principal components analysis distinguished two conceptually meaningful factors: "ball control" and "gross motor function." Concurrent validity analyses demonstrated moderate associations between the motor skills assessment's results and national ranking; boys r = -0.53 (P < 0.001) and girls r = -0.45 (P = 0.015). In conclusion, this evaluation demonstrated six test items with acceptable reproducibility, good internal consistency and good prospects for validity. Two test items need revision to upgrade reproducibility. Since the motor skills assessment seems to be a reproducible, objective part of a talent development programme, more longitudinal studies are required to investigate its predictive validity.

  4. Children and Young People-Mental Health Safety Assessment Tool (CYP-MH SAT) study: Protocol for the development and psychometric evaluation of an assessment tool to identify immediate risk of self-harm and suicide in children and young people (10–19 years) in acute paediatric hospital settings

    PubMed Central

    Walker, Gemma M; Carter, Tim; Aubeeluck, Aimee; Witchell, Miranda; Coad, Jane

    2018-01-01

    Introduction: Currently, no standardised, evidence-based assessment tool for assessing immediate self-harm and suicide in acute paediatric inpatient settings exists. Aim: The aim of this study is to develop and test the psychometric properties of an assessment tool that identifies immediate risk of self-harm and suicide in children and young people (10–19 years) in acute paediatric hospital settings. Methods and analysis: Development phase: This phase involved a scoping review of the literature to identify and extract items from previously published suicide and self-harm risk assessment scales. Using a modified electronic Delphi approach, these items will then be rated according to their relevance for assessment of immediate suicide or self-harm risk by expert professionals. Inclusion of items will be determined by 65%–70% consensus between raters. Subsequently, a panel of expert members will convene to determine the face validity, appropriate phrasing, item order and response format for the finalised items. Psychometric testing phase: The finalised items will be tested for validity and reliability through a multicentre, psychometric evaluation. Psychometric testing will be undertaken to determine the following: internal consistency, inter-rater reliability, and convergent, divergent and concurrent validity. Ethics and dissemination: Ethical approval was provided by the National Health Service East Midlands—Derby Research Ethics Committee (17/EM/0347) and full governance clearance received by the Health Research Authority and local participating sites. Findings from this study will be disseminated to professionals and the public via peer-reviewed journal publications, popular social media and conference presentations. PMID:29654046

  5. Development of a questionnaire for assessing the childbirth experience (QACE).

    PubMed

    Carquillat, Pierre; Vendittelli, Françoise; Perneger, Thomas; Guittier, Marie-Julia

    2017-08-30

    Due to its potential impact on women's psychological health, assessing perceptions of their childbirth experience is important. The aim of this study was to develop a multidimensional self-reporting questionnaire to evaluate the childbirth experience. Factors influencing the childbirth experience were identified from a literature review and the results of a previous qualitative study. A total of 25 items were combined from existing instruments or were created de novo. A draft version was pilot tested for face validity with 30 women and submitted for evaluation of its construct validity to 477 primiparous women at one month postpartum. The recruitment took place in two obstetric clinics from Swiss and French university hospitals. To evaluate the content validity, we compared item responses to general childbirth experience assessments on a numeric, 0 to 10 rating scale. We dichotomized the general assessment scores into two groups: "0 to 7" and "8 to 10". We performed an exploratory factor analysis to identify underlying dimensions. In total, 291 women completed the questionnaire (response rate = 61%). The responses to 22 items differed significantly between the 0 to 7 and 8 to 10 groups for the general childbirth experience assessments. An exploratory factor analysis yielded four sub-scales, which were labelled "relationship with staff" (4 items), "emotional status" (3 items), "first moments with the newborn" (3 items) and "feelings at one month postpartum" (3 items). All 4 scales had satisfactory internal consistency levels (alpha coefficients from 0.70 to 0.85). The full 25-item version can be used to analyse each item by itself, and the short 4-dimension version can be scored to summarize the general assessment of the childbirth experience. The Questionnaire for Assessing the Childbirth Experience (QACE) could be useful as a screening instrument to identify women with negative childbirth experiences. It can be used as both a research instrument in its short version and a questionnaire for use in clinical practice in its full version.

  6. What do Demand-Control and Effort-Reward work stress questionnaires really measure? A discriminant content validity study of relevance and representativeness of measures.

    PubMed

    Bell, Cheryl; Johnston, Derek; Allan, Julia; Pollard, Beth; Johnston, Marie

    2017-05-01

    The Demand-Control (DC) and Effort-Reward Imbalance (ERI) models predict health in a work context. Self-report measures of the four key constructs (demand, control, effort, and reward) have been developed and it is important that these measures have good content validity uncontaminated by content from other constructs. We assessed relevance (whether items reflect the constructs) and representativeness (whether all aspects of the construct are assessed, and all items contribute to that assessment) across the instruments and items. Two studies examined fourteen demand/control items from the Job Content Questionnaire and seventeen effort/reward items from the Effort-Reward Imbalance measure using discriminant content validation and a third study developed new methods to assess instrument representativeness. Both methods use judges' ratings and construct definitions to get transparent quantitative estimates of construct validity. Study 1 used dictionary definitions while studies 2 and 3 used published phrases to define constructs. Overall, 3/5 demand items, 4/9 control items, 1/6 effort items, and 7/11 reward items were uniquely classified to the appropriate theoretical construct and were therefore 'pure' items with discriminant content validity (DCV). All pure items measured a defining phrase. However, both the DC and ERI assessment instruments failed to assess all defining aspects. Finding good discriminant content validity for demand and reward measures means these measures are usable and our quantitative results can guide item selection. By contrast, effort and control measures had limitations (in relevance and representativeness) presenting a challenge to the implementation of the theories. Statement of contribution: What is already known on this subject? While the reliability and construct validity of Demand-Control and Effort-Reward-Imbalance (DC and ERI) work stress measures are routinely reported, there has not been adequate investigation of their content validity. This paper investigates their content validity in terms of both relevance and representativeness and provides a model for the investigation of content validity of measures in health psychology more generally. What does this study add? A new application of an existing method, discriminant content validity, and a new method of assessing instrument representativeness. 'Pure' DC and ERI items are identified, as are constructs that are not fully represented by their assessment instruments. The findings are important for studies attempting to distinguish between the main DC and ERI work stress constructs. The quantitative results can be used to guide item selection for future studies. © 2017 The British Psychological Society.

  7. Measuring depression after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Depression item bank and linkage with PHQ-9.

    PubMed

    Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Bombardier, Charles H; Pohlig, Ryan T; Heinemann, Allen W; Carle, Adam; Choi, Seung W

    2015-05-01

    Objective: To develop a calibrated spinal cord injury-quality of life (SCI-QOL) item bank, computer adaptive test (CAT), and short form to assess depressive symptoms experienced by individuals with SCI, transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a crosswalk to the Patient Health Questionnaire (PHQ)-9. Design: We used grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, item response theory (IRT) analyses, and statistical linking techniques to transform scores to a PROMIS metric and to provide a crosswalk with the PHQ-9. Setting: Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Outcome measure: Spinal Cord Injury--Quality of Life (SCI-QOL) Depression Item Bank. Results: Individuals with SCI were involved in all phases of SCI-QOL development. A sample of 716 individuals with traumatic SCI completed 35 items assessing depression, 18 of which were PROMIS items. After removing 7 non-PROMIS items, factor analyses confirmed a unidimensional pool of items. We used a graded response IRT model to estimate slopes and thresholds for the 28 retained items. The SCI-QOL Depression measure correlated 0.76 with the PHQ-9. Conclusions: The SCI-QOL Depression item bank provides a reliable and sensitive measure of depressive symptoms with scores reported in terms of general population norms. We provide a crosswalk to the PHQ-9 to facilitate comparisons between measures. The item bank may be administered as a CAT or as a short form and is suitable for research and clinical applications.
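
    The slopes and thresholds mentioned in this abstract are the parameters of a graded response model, in which each threshold defines a logistic curve for responding in that category or higher, and category probabilities are differences between adjacent curves. The sketch below shows that calculation for one item; the function name and the parameter values are hypothetical and are not estimates from the SCI-QOL bank.

```python
import math


def grm_category_probabilities(theta, slope, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait level; slope: item discrimination;
    thresholds: ascending threshold parameters (one fewer than categories).
    Returns a list of probabilities, lowest category first.
    """
    def at_or_above(b):
        # Probability of responding in the category bounded by b, or higher.
        return 1.0 / (1.0 + math.exp(-slope * (theta - b)))

    cumulative = [1.0] + [at_or_above(b) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


if __name__ == "__main__":
    # Made-up four-category item evaluated at an average trait level.
    probs = grm_category_probabilities(0.0, 1.8, [-1.0, 0.0, 1.2])
    print([round(p, 3) for p in probs])
```

    Keeping the thresholds in ascending order guarantees that every category probability is non-negative.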

  8. Assessing the capacity of ministries of health to use research in decision-making: conceptual framework and tool.

    PubMed

    Rodríguez, Daniela C; Hoe, Connie; Dale, Elina M; Rahman, M Hafizur; Akhter, Sadika; Hafeez, Assad; Irava, Wayne; Rajbangshi, Preety; Roman, Tamlyn; Ţîrdea, Marcela; Yamout, Rouham; Peters, David H

    2017-08-01

    The capacity to demand and use research is critical for governments if they are to develop policies that are informed by evidence. Existing tools designed to assess how government officials use evidence in decision-making have significant limitations for low- and middle-income countries (LMICs); they are rarely tested in LMICs and focus only on individual capacity. This paper introduces an instrument that was developed to assess Ministry of Health (MoH) capacity to demand and use research evidence for decision-making, which was tested for reliability and validity in eight LMICs (Bangladesh, Fiji, India, Lebanon, Moldova, Pakistan, South Africa, Zambia). Instrument development was based on a new conceptual framework that addresses individual, organisational and systems capacities, and items were drawn from existing instruments and a literature review. After initial item development and pre-testing to address face validity and item phrasing, the instrument was reduced to 54 items for further validation and item reduction. In-country study teams interviewed a systematic sample of 203 MoH officials. Exploratory factor analysis was used in addition to standard reliability and validity measures to further assess the items. Thirty items divided between two factors representing organisational and individual capacity constructs were identified. South Africa and Zambia demonstrated the highest level of organisational capacity to use research, whereas Pakistan and Bangladesh were the lowest two. In contrast, individual capacity was highest in Pakistan, followed by South Africa, whereas Bangladesh and Lebanon were the lowest. The framework and related instrument represent a new opportunity for MoHs to identify ways to understand and improve capacities to incorporate research evidence in decision-making, as well as to provide a basis for tracking change.

  9. Development and validation of a ten-item questionnaire with explanatory illustrations to assess upper extremity disorders: favorable effect of illustrations in the item reduction process.

    PubMed

    Kurimoto, Shigeru; Suzuki, Mikako; Yamamoto, Michiro; Okui, Nobuyuki; Imaeda, Toshihiko; Hirata, Hitoshi

    2011-11-01

    The purpose of this study is to develop a short and valid measure for upper extremity disorders and to assess the effect of attached illustrations in item reduction of a self-administered disability questionnaire while retaining psychometric properties. A validated questionnaire used to assess upper extremity disorders, the Hand20, was reduced to ten items using two item-reduction techniques. The psychometric properties of the abbreviated form, the Hand10, were evaluated on an independent sample that was used for the shortening process. Validity, reliability, and responsiveness of the Hand10 were retained in the item reduction process. It was possible that the use of explanatory illustrations attached to the Hand10 helped with its reproducibility. The illustrations for the Hand10 promoted text comprehension and motivation to answer the items. These changes resulted in high acceptability; more than 99.3% of patients, including 98.5% of elderly patients, could complete the Hand10 properly. The illustrations had favorable effects on the item reduction process and made it possible to retain precision of the instrument. The Hand10 is a reliable and valid instrument for individual-level applications with the advantage of being compact and broadly applicable, even in elderly individuals.

  10. Developing an African youth psychosocial assessment: an application of item response theory.

    PubMed

    Betancourt, Theresa S; Yang, Frances; Bolton, Paul; Normand, Sharon-Lise

    2014-06-01

    This study aimed to refine a dimensional scale for measuring psychosocial adjustment in African youth using item response theory (IRT). A 60-item scale derived from qualitative data was administered to 667 war-affected adolescents (55% female). Exploratory factor analysis (EFA) determined the dimensionality of items based on goodness-of-fit indices. Items with loadings less than 0.4 were dropped. Confirmatory factor analysis (CFA) was used to confirm the scale's dimensionality found under the EFA. Item discrimination and difficulty were estimated using a graded response model for each subscale using weighted least squares means and variances. Predictive validity was examined through correlations between IRT scores (θ) for each subscale and ratings of functional impairment. All models were assessed using goodness-of-fit and comparative fit indices. Fisher's Information curves examined item precision at different underlying ranges of each trait. Original scale items were optimized and reconfigured into an empirically-robust 41-item scale, the African Youth Psychosocial Assessment (AYPA). Refined subscales assess internalizing and externalizing problems, prosocial attitudes/behaviors and somatic complaints without medical cause. The AYPA is a refined dimensional assessment of emotional and behavioral problems in African youth with good psychometric properties. Validation studies in other cultures are recommended. Copyright © 2014 John Wiley & Sons, Ltd.
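
    The information curves mentioned in this abstract show where on the trait scale an item measures most precisely. For a graded response item, Fisher information at a trait level is the sum over categories of the squared derivative of the category probability divided by that probability. The following is a sketch under that standard formula, with hypothetical parameter values rather than the AYPA estimates.

```python
import math


def grm_item_information(theta, slope, thresholds):
    """Fisher information of one graded response item at trait level theta.

    slope: item discrimination; thresholds: ascending threshold parameters.
    """
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
        + [0.0]
    )
    info = 0.0
    for k in range(len(thresholds) + 1):
        p_k = cumulative[k] - cumulative[k + 1]
        # Derivative of the category probability with respect to theta.
        dp_k = slope * (
            cumulative[k] * (1.0 - cumulative[k])
            - cumulative[k + 1] * (1.0 - cumulative[k + 1])
        )
        if p_k > 0.0:
            info += dp_k * dp_k / p_k
    return info


if __name__ == "__main__":
    # Trace a coarse information curve for a made-up three-category item.
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(theta, round(grm_item_information(theta, 1.5, [-0.5, 0.8]), 3))
```

    Summing item information across a subscale and taking the reciprocal square root gives the approximate standard error of the trait estimate at each level.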

  11. Developing an African youth psychosocial assessment: an application of item response theory

    PubMed Central

    BETANCOURT, THERESA S.; YANG, FRANCES; BOLTON, PAUL; NORMAND, SHARON-LISE

    2014-01-01

    This study aimed to refine a dimensional scale for measuring psychosocial adjustment in African youth using item response theory (IRT). A 60-item scale derived from qualitative data was administered to 667 war-affected adolescents (55% female). Exploratory factor analysis (EFA) determined the dimensionality of items based on goodness-of-fit indices. Items with loadings less than 0.4 were dropped. Confirmatory factor analysis (CFA) was used to confirm the scale's dimensionality found under the EFA. Item discrimination and difficulty were estimated using a graded response model for each subscale using weighted least squares means and variances. Predictive validity was examined through correlations between IRT scores (θ) for each subscale and ratings of functional impairment. All models were assessed using goodness-of-fit and comparative fit indices. Fisher's Information curves examined item precision at different underlying ranges of each trait. Original scale items were optimized and reconfigured into an empirically-robust 41-item scale, the African Youth Psychosocial Assessment (AYPA). Refined subscales assess internalizing and externalizing problems, prosocial attitudes/behaviors and somatic complaints without medical cause. The AYPA is a refined dimensional assessment of emotional and behavioral problems in African youth with good psychometric properties. Validation studies in other cultures are recommended. PMID:24478113

  12. Development and validation of a tool to assess the physical and social environment associated with physical activity among adults in Sri Lanka.

    PubMed

    De Silva Weliange, Shreenika H; Fernando, Dulitha; Gunatilake, Jagath

    2014-05-03

    Environmental characteristics are known to be associated with patterns of physical activity (PA). Although several validated tools exist to measure environmental characteristics, these instruments are not necessarily suitable for application in all settings, especially in a developing country. This study was carried out to develop and validate an instrument named the "Physical And Social Environment Scale--PASES" to assess the physical and social environmental factors associated with PA. This will enable identification of various physical and social environmental factors affecting PA in Sri Lanka, which will help in the development of more tailored intervention strategies for promoting higher PA levels in Sri Lanka. The PASES was developed using a scientific approach of defining the construct, item generation, analysis of content of items and item reduction. Both qualitative and quantitative methods were used: key informant interviews, in-depth interviews and expert rating of the generated items. A cross-sectional survey among 180 adults was carried out to assess the factor structure through principal component analysis. Another cross-sectional survey among a different group of 180 adults was carried out to assess the construct validity through confirmatory factor analysis. Reliability was assessed with test-retest reliability and internal consistency using Spearman r and Cronbach's alpha, respectively. Thirty-six items were selected after the expert ratings and were developed into interviewer-administered questions. Exploration of the factor structure of the 34 items that were factorable, through principal component analysis with Quartimax rotation, extracted 8 factors. The 34-item instrument was assessed for construct validity with confirmatory factor analysis, which confirmed an 8-factor model (χ² = 339.9, GFI = 0.90). The identified factors were infrastructure for walking, aesthetics and facilities for cycling, vehicular traffic safety, access and connectivity, recreational facilities for PA, safety, social cohesion and social acceptance of PA, with the two non-factorable factors, residential density and land use mix. The PASES also showed good test-retest reliability and a moderate level of internal consistency. The PASES is a valid and reliable tool which could be used to assess the physical and social environment associated with PA in Sri Lanka.

  13. Brief Opioid Overdose Knowledge (BOOK): A Questionnaire to Assess Overdose Knowledge in Individuals Who Use Illicit or Prescribed Opioids.

    PubMed

    Dunn, Kelly E; Barrett, Frederick S; Yepez-Laubach, Claudia; Meyer, Andrew C; Hruska, Bryce J; Sigmon, Stacey C; Fingerhood, Michael; Bigelow, George E

    2016-01-01

    Opioid overdose is a public health crisis. This study describes efforts to develop and validate the Brief Opioid Overdose Knowledge (BOOK) questionnaire to assess patient knowledge gaps related to opioid overdose risks. Two samples of illicit opioid users and a third sample of patients receiving an opioid for the treatment of chronic pain (total N = 848) completed self-report items pertaining to opioid overdose risks. A 3-factor scale was established, representing Opioid Knowledge (4 items), Opioid Overdose Knowledge (4 items), and Opioid Overdose Response Knowledge (4 items). The scale had strong internal and face validity. Patients with chronic pain performed worse than illicit drug users on almost all items assessed, highlighting the need to increase knowledge of opioid overdose risk in this population. This study sought to develop a brief, internally valid method for quickly assessing deficits in opioid overdose risk areas within users of illicit and prescribed opioids, to provide an efficient metric for assessing and comparing educational interventions, facilitate conversations between physicians and patients about overdose risks, and help formally identify knowledge deficits in other patient populations.

  14. Development and assessment of floor and ceiling items for the PROMIS physical function item bank

    PubMed Central

    2013-01-01

    Introduction: Disability and Physical Function (PF) outcome assessment has had limited ability to measure functional status at the floor (very poor functional abilities) or the ceiling (very high functional abilities). We sought to identify, develop and evaluate new floor and ceiling items to enable broader and more precise assessment of PF outcomes for the NIH Patient-Reported-Outcomes Measurement Information System (PROMIS). Methods: We conducted two cross-sectional studies using NIH PROMIS item improvement protocols with expert review, participant survey and focus group methods. In Study 1, respondents with low PF abilities evaluated new floor items, and those with high PF abilities evaluated new ceiling items for clarity, importance and relevance. In Study 2, we compared difficulty ratings of new floor items by low functioning respondents and ceiling items by high functioning respondents to reference PROMIS PF-10 items. We used frequencies, percentages, means and standard deviations to analyze the data. Results: In Study 1, low (n = 84) and high (n = 90) functioning respondents were mostly White, women, 70 years old, with some college, and disability scores of 0.62 and 0.30. More than 90% of the 31 new floor and 31 new ceiling items were rated as clear, important and relevant, leaving 26 ceiling and 30 floor items for Study 2. Low (n = 246) and high (n = 637) functioning Study 2 respondents were mostly White, women, 70 years old, with some college, and Health Assessment Questionnaire (HAQ) scores of 1.62 and 0.003. Compared to difficulty ratings of reference items, ceiling items were rated to be 10% more to greater than 40% more difficult to do, and floor items were rated to be about 12% to nearly 90% less difficult to do. Conclusions: These new floor and ceiling items considerably extend the measurable range of physical function at either extreme. They will help improve instrument performance in populations with broad functional ranges and those concentrated at one or the other extreme ends of functioning. Optimal use of these new items will be assisted by computerized adaptive testing (CAT), reducing questionnaire burden and ensuring item administration to appropriate individuals. PMID:24286166

  15. Development and analysis of an instrument to assess student understanding of GOB chemistry knowledge relevant to clinical nursing practice.

    PubMed

    Brown, Corina E; Hyslop, Richard M; Barbera, Jack

    2015-01-01

    The General, Organic, and Biological Chemistry Knowledge Assessment (GOB-CKA) is a multiple-choice instrument designed to assess students' understanding of the chemistry topics deemed important to clinical nursing practice. This manuscript describes the development process of the individual items along with a psychometric evaluation of the final version of the items and instrument. In developing items for the GOB-CKA, essential topics were identified through a series of expert interviews (with practicing nurses, nurse educators, and GOB chemistry instructors) and confirmed through a national survey. Individual items were tested in qualitative studies with students from the target population for clarity and wording. Data from pilot and beta studies were used to evaluate each item and narrow the total item count to 45. A psychometric analysis performed on data from the 45-item final version was used to provide evidence of validity and reliability. The final version of the instrument has a Cronbach's alpha value of 0.76. Feedback from an expert panel provided evidence of face and content validity. Convergent validity was estimated by comparing the results from the GOB-CKA with the General-Organic-Biochemistry Exam (Form 2007) of the American Chemical Society. Instructors who wish to use the GOB-CKA for teaching and research may contact the corresponding author for a copy of the instrument. © 2014 Wiley Periodicals, Inc.
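
    The Cronbach's alpha statistic reported in this abstract can be computed as in the following minimal sketch. The response matrix is invented for illustration (it is not GOB-CKA data), and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores -- list of rows, one per respondent, each holding k item scores
    """
    k = len(scores[0])
    # Variance of each item across respondents (columns of the matrix).
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of the total (summed) score across respondents.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Made-up right/wrong (1/0) responses: 6 respondents, 4 multiple-choice items.
demo = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]
alpha = cronbach_alpha(demo)
```

    Alpha approaches 1 as the items covary more strongly relative to their individual variances.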

  16. Measurement in Sensory Modulation: The Sensory Processing Scale Assessment

    PubMed Central

    Miller, Lucy J.; Sullivan, Jillian C.

    2014-01-01

    OBJECTIVE. Sensory modulation issues have a significant impact on participation in daily life. Moreover, understanding phenotypic variation in sensory modulation dysfunction is crucial for research related to defining homogeneous groups and for clinical work in guiding treatment planning. We thus evaluated the new Sensory Processing Scale (SPS) Assessment. METHOD. Research included item development, behavioral scoring system development, test administration, and item analyses to evaluate reliability and validity across sensory domains. RESULTS. Items with adequate reliability (internal reliability >.4) and discriminant validity (p < .01) were retained. Feedback from the expert panel also contributed to decisions about retaining items in the scale. CONCLUSION. The SPS Assessment appears to be a reliable and valid measure of sensory modulation (scale reliability >.90; discrimination between group effect sizes >1.00). This scale has the potential to aid in differential diagnosis of sensory modulation issues. PMID:25184464

  17. Development of the PROMIS coping expectancies of smoking item banks.

    PubMed

    Shadel, William G; Edelen, Maria Orlando; Tucker, Joan S; Stucky, Brian D; Hansen, Mark; Cai, Li

    2014-09-01

    Smoking is a coping strategy for many smokers who then have difficulty finding new ways to cope with negative affect when they quit. This paper describes analyses conducted to develop and evaluate item banks for assessing the coping expectancies of smoking for daily and nondaily smokers. Using data from a large sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning (DIF) analyses (according to gender, age, and ethnicity) to arrive at a unidimensional set of items for daily and nondaily smokers. We also evaluated performance of short forms (SFs) and computer adaptive tests (CATs) for assessing coping expectancies of smoking. For both daily and nondaily smokers, the unidimensional Coping Expectancies item banks (21 items) are relatively DIF free and are highly reliable (0.96 and 0.97, respectively). A common 4-item SF for daily and nondaily smokers also showed good reliability (0.85). Adaptive tests required an average of 4.3 and 3.7 items for simulated daily and nondaily respondents, respectively, and achieved reliabilities of 0.91 for both when the maximum test length was 10 items. This research provides a new set of items that can be used to reliably assess coping expectancies of smoking, through a SF, CAT, or a tailored set selected for a specific research purpose. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved.

  18. Measuring assessment standards in undergraduate medical programs: Development and validation of AIM tool.

    PubMed

    Sajjad, Madiha; Khan, Rehan Ahmed; Yasmeen, Rahila

    2018-01-01

    To develop a tool to evaluate faculty perceptions of assessment quality in an undergraduate medical program. The Assessment Implementation Measure (AIM) tool was developed by a mixed method approach. A preliminary questionnaire developed through literature review was submitted to a panel of 10 medical education experts for a three-round 'Modified Delphi technique'. Panel agreement of > 75% was considered the criterion for inclusion of items in the questionnaire. Cognitive pre-testing of five faculty members was conducted. A pilot study was conducted with 30 randomly selected faculty members. Content validity index (CVI) was calculated for individual items (I-CVI) and composite scale (S-CVI). Cronbach's alpha was calculated to determine the internal consistency reliability of the tool. The final AIM tool had 30 items after the Delphi process. S-CVI was 0.98 with the S-CVI/Avg method and 0.86 by S-CVI/UA method, suggesting good content validity. Cut-off value of < 0.9 I-CVI was taken as criterion for item deletion. Cognitive pre-testing revealed good item interpretation. Cronbach's alpha calculated for the AIM was 0.9, whereas Cronbach's alpha for the four domains ranged from 0.67 to 0.80. 'AIM' is a relevant and useful instrument with good content validity and reliability of results, and may be used to evaluate the teachers' perceptions about assessment quality.
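
    The content validity indices named in this abstract (I-CVI, S-CVI/Avg, S-CVI/UA) can be sketched as follows. The sketch assumes the common convention that experts rate relevance on a 4-point scale and that ratings of 3 or 4 count as "relevant"; the abstract does not state the rating scale, and the ratings and function names below are invented for illustration.

```python
def item_cvi(ratings, relevant=(3, 4)):
    """I-CVI: proportion of experts who rated the item as relevant."""
    return sum(r in relevant for r in ratings) / len(ratings)

def scale_cvi(rating_matrix):
    """Return (list of I-CVIs, S-CVI/Avg, S-CVI/UA) for an items-by-experts matrix."""
    i_cvis = [item_cvi(ratings) for ratings in rating_matrix]
    # S-CVI/Avg: mean of the item-level indices.
    s_cvi_avg = sum(i_cvis) / len(i_cvis)
    # S-CVI/UA: share of items on which every expert agreed the item is relevant.
    s_cvi_ua = sum(v == 1.0 for v in i_cvis) / len(i_cvis)
    return i_cvis, s_cvi_avg, s_cvi_ua

# Hypothetical ratings: 3 items, each rated by 4 experts.
ratings = [
    [4, 4, 3, 4],
    [4, 3, 2, 4],
    [3, 4, 4, 3],
]
i_cvis, s_cvi_avg, s_cvi_ua = scale_cvi(ratings)
```

    Because universal agreement is a stricter requirement than average agreement, S-CVI/UA cannot exceed S-CVI/Avg, which matches the ordering of the two values reported above.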

  19. Using Explanatory Item Response Models to Evaluate Complex Scientific Tasks Designed for the Next Generation Science Standards

    NASA Astrophysics Data System (ADS)

    Chiu, Tina

    This dissertation includes three studies that analyze a new set of assessment tasks developed by the Learning Progressions in Middle School Science (LPS) Project. These assessment tasks were designed to measure science content knowledge on the structure of matter domain and scientific argumentation, while following the goals from the Next Generation Science Standards (NGSS). The three studies focus on the evidence available for the success of this design and its implementation, generally labelled as "validity" evidence. I use explanatory item response models (EIRMs) as the overarching framework to investigate these assessment tasks. These models can be useful when gathering validity evidence for assessments as they can help explain student learning and group differences. In the first study, I explore the dimensionality of the LPS assessment by comparing the fit of unidimensional, between-item multidimensional, and Rasch testlet models to see which is most appropriate for this data. By applying multidimensional item response models, multiple relationships can be investigated, and in turn, allow for a more substantive look into the assessment tasks. The second study focuses on person predictors through latent regression and differential item functioning (DIF) models. Latent regression models show the influence of certain person characteristics on item responses, while DIF models test whether one group is differentially affected by specific assessment items, after conditioning on latent ability. Finally, the last study applies the linear logistic test model (LLTM) to investigate whether item features can help explain differences in item difficulties.
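
    The idea behind the linear logistic test model (LLTM) mentioned in the last study can be sketched briefly: each item difficulty is modeled as a weighted sum of item-feature effects and then used in a Rasch-type response function. The feature matrix, effect values and function names below are hypothetical and do not come from the dissertation.

```python
import math

def lltm_difficulty(feature_row, feature_effects):
    """Item difficulty as a linear combination of item-feature effects."""
    return sum(q * eta for q, eta in zip(feature_row, feature_effects))

def rasch_prob(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Hypothetical design: 2 items described by 2 binary features.
Q = [
    [1, 0],  # item 1 has feature A only
    [1, 1],  # item 2 has features A and B
]
eta = [0.5, 1.0]  # assumed difficulty contribution of each feature

difficulties = [lltm_difficulty(row, eta) for row in Q]
p_item1 = rasch_prob(theta=0.5, difficulty=difficulties[0])
```

    Comparing the fit of such a constrained model with an unconstrained Rasch model indicates how much of the variation in item difficulty the features explain.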

  20. Development of a mobbing short scale in the Gutenberg Health Study.

    PubMed

    Garthus-Niegel, Susan; Nübling, Matthias; Letzel, Stephan; Hegewald, Janice; Wagner, Mandy; Wild, Philipp S; Blettner, Maria; Zwiener, Isabella; Latza, Ute; Jankowiak, Sylvia; Liebers, Falk; Seidler, Andreas

    2016-01-01

    Despite its highly detrimental potential, most standard questionnaires assessing psychosocial stress at work do not include mobbing as a risk factor. In the German standard version of COPSOQ, mobbing is assessed with a single item. In the Gutenberg Health Study, this version was used together with a newly developed short scale based on the Leymann Inventory of Psychological Terror. The purpose of the present study was to evaluate the psychometric properties of these two measures, to compare them and to test their differential impact on relevant outcome parameters. This analysis is based on a population-based sample of 1441 employees participating in the Gutenberg Health Study. Exploratory and confirmatory factor analyses and reliability analyses were used to assess the mobbing scale. To determine their predictive validities, multiple linear regression analyses with six outcome parameters and log-binomial regression models for two of the outcome aspects were run. Factor analyses of the five-item scale confirmed a one-factor solution; reliability was α = 0.65. Both the single-item and the five-item scales were associated with all six outcome scales. Effect sizes were similar for both mobbing measures. Mobbing is an important risk factor for health-related outcomes. For the purpose of psychosocial risk assessment in the workplace, both the single-item and the five-item constructs were psychometrically appropriate. Associations with outcomes were about equivalent. However, the single item has the advantage of parsimony, whereas the five-item construct depicts several distinct forms of mobbing.

  1. Development of a brief measure of intimate partner violence experiences: the Composite Abuse Scale (Revised)—Short Form (CASR-SF)

    PubMed Central

    Ford-Gilboe, Marilyn; Wathen, C Nadine; Varcoe, Colleen; MacMillan, Harriet L; Scott-Storey, Kelly; Mantler, Tara; Hegarty, Kelsey; Perrin, Nancy

    2016-01-01

    Objectives: Approaches to measuring intimate partner violence (IPV) in populations often privilege physical violence, with poor assessment of other experiences. This has led to underestimating the scope and impact of IPV. The aim of this study was to develop a brief, reliable and valid self-report measure of IPV that adequately captures its complexity. Design: Mixed-methods instrument development and psychometric testing to evolve a brief version of the Composite Abuse Scale (CAS) using secondary data analysis and expert feedback. Setting: Data from 5 Canadian IPV studies; feedback from international IPV experts. Participants: 31 international IPV experts including academic researchers, service providers and policy actors rated CAS items via an online survey. Pooled data from 6278 adult Canadian women were used for scale development. Primary/secondary outcome measures: Scale reliability and validity; robustness of subscales assessing different IPV experiences. Results: A 15-item version of the CAS has been developed (Composite Abuse Scale (Revised)—Short Form, CASR-SF), including 12 items developed from the original CAS and 3 items suggested through expert consultation and the evolving literature. Items cover 3 abuse domains: physical, sexual and psychological, with questions asked to assess lifetime, recent and current exposure, and abuse frequency. Factor loadings for the final 3-factor solution ranged from 0.81 to 0.91 for the 6 psychological abuse items, 0.63 to 0.92 for the 4 physical abuse items, and 0.85 and 0.93 for the 2 sexual abuse items. Moderate correlations were observed between the CASR-SF and measures of depression, post-traumatic stress disorder and coercive control. Internal consistency of the CASR-SF was 0.942. These reliability and validity estimates were comparable to those obtained for the original 30-item CAS. Conclusions: The CASR-SF is a brief self-report measure of IPV experiences among women that has demonstrated initial reliability and validity and is suitable for use in population studies or other studies. Additional validation of the 15-item scale with diverse samples is required. PMID:27927659

  2. A Methodology for Zumbo's Third Generation DIF Analyses and the Ecology of Item Responding

    ERIC Educational Resources Information Center

    Zumbo, Bruno D.; Liu, Yan; Wu, Amery D.; Shear, Benjamin R.; Olvera Astivia, Oscar L.; Ark, Tavinder K.

    2015-01-01

    Methods for detecting differential item functioning (DIF) and item bias are typically used in the process of item analysis when developing new measures; adapting existing measures for different populations, languages, or cultures; or more generally validating test score inferences. In 2007 in "Language Assessment Quarterly," Zumbo…

  3. Development of the Assessment of Belief Conflict in Relationship-14 (ABCR-14)

    PubMed Central

    Kyougoku, Makoto; Teraoka, Mutsumi; Masuda, Noriko; Ooura, Mariko; Abe, Yasushi

    2015-01-01

    Purpose: Nurses and other healthcare workers frequently experience belief conflict, one of the most important, new stress-related problems in both academic and clinical fields. Methods: In this study, using a sample of 1,683 nursing practitioners, we developed The Assessment of Belief Conflict in Relationship-14 (ABCR-14), a new scale that assesses belief conflict in the healthcare field. Standard psychometric procedures were used to develop and test the scale, including a qualitative framework concept and item-pool development, item reduction, and scale development. We analyzed the psychometric properties of ABCR-14 according to entropy, polyserial correlation coefficient, exploratory factor analysis, confirmatory factor analysis, average variance extracted, Cronbach’s alpha, Pearson product-moment correlation coefficient, and multidimensional item response theory (MIRT). Results: The results of the analysis supported a three-factor model consisting of 14 items. The validity and reliability of ABCR-14 were suggested by evidence from high construct validity, structural validity, hypothesis testing, internal consistency reliability, and concurrent validity. The result of the MIRT offered strong support for good item response of item slope parameters and difficulty parameters. However, the ABCR-14 Likert scale might need to be explored from the MIRT point of view. Yet, as mentioned above, there is sufficient evidence to support that ABCR-14 has high validity and reliability. Conclusion: The ABCR-14 demonstrates good psychometric properties for nursing belief conflict. Further studies are recommended to confirm its application in clinical practice. PMID:26247356

  4. Measuring anxiety after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Anxiety item bank and linkage with GAD-7.

    PubMed

    Kisala, Pamela A; Tulsky, David S; Kalpakjian, Claire Z; Heinemann, Allen W; Pohlig, Ryan T; Carle, Adam; Choi, Seung W

    2015-05-01

    To develop a calibrated item bank and computer adaptive test to assess anxiety symptoms in individuals with spinal cord injury (SCI), transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a statistical linkage with the Generalized Anxiety Disorder (GAD)-7, a widely used anxiety measure. Grounded-theory based qualitative item development methods; large-scale item calibration field testing; confirmatory factor analysis; graded response model item response theory analyses; statistical linking techniques to transform scores to a PROMIS metric; and linkage with the GAD-7. Setting: Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants: Adults with traumatic SCI. Spinal Cord Injury-Quality of Life (SCI-QOL) Anxiety Item Bank. Seven hundred sixteen individuals with traumatic SCI completed 38 items assessing anxiety, 17 of which were PROMIS items. After 13 items (including 2 PROMIS items) were removed, factor analyses confirmed unidimensionality. Item response theory analyses were used to estimate slopes and thresholds for the final 25 items (15 from PROMIS). The observed Pearson correlation between the SCI-QOL Anxiety and GAD-7 scores was 0.67. The SCI-QOL Anxiety item bank demonstrates excellent psychometric properties and is available as a computer adaptive test or short form for research and clinical applications. SCI-QOL Anxiety scores have been transformed to the PROMIS metric and we provide a method to link SCI-QOL Anxiety scores with those of the GAD-7.

  5. Development and Initial Validation of the Five-Factor Model Adolescent Personality Questionnaire (FFM-APQ).

    PubMed

    Rogers, Mary E; Glendon, A Ian

    2018-01-01

    This research reports on the 4-phase development of the 25-item Five-Factor Model Adolescent Personality Questionnaire (FFM-APQ). The purpose was to develop and determine initial evidence for validity of a brief adolescent personality inventory using a vocabulary that could be understood by adolescents up to 18 years old. Phase 1 (N = 48) consisted of item generation and expert (N = 5) review of items; Phase 2 (N = 179) involved item analyses; in Phase 3 (N = 496) exploratory factor analysis assessed the underlying structure; in Phase 4 (N = 405) confirmatory factor analyses resulted in a 25-item inventory with 5 subscales.

  6. Assessment of the quality and applicability of an e-portfolio capstone assessment item within a bachelor of midwifery program.

    PubMed

    Baird, Kathleen; Gamble, Jenny; Sidebotham, Mary

    2016-09-01

    Education programs leading to professional licencing need to ensure assessments throughout the program are constructively aligned and mapped to the specific professional expectations. Within the final year of an undergraduate degree, a student is required to transform and prepare for professional practice. Establishing assessment items that are authentic and able to reflect this transformation is a challenge for universities. This paper both describes the considerations around the design of a capstone assessment and evaluates, from an academics' perspective, the quality and applicability of an e-portfolio as a capstone assessment item for undergraduate courses leading to a professional qualification. The e-portfolio was seen to meet nine quality indicators for assessment. Academics evaluated the e-portfolio as an authentic assessment item that would engage the students and provide them with a platform for ongoing professional development and lifelong learning. The processes of reflection on strengths, weaknesses, opportunities and threats, comparison of clinical experiences with national statistics, preparation of professional philosophy and development of a curriculum vitae, whilst recognised as comprehensive and challenging, were seen as highly valuable to the student transforming into the profession. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Development and psychometric testing of an instrument designed to measure chronic pain in dogs with osteoarthritis

    PubMed Central

    Boston, Raymond C.; Coyne, James C.; Farrar, John T.

    2010-01-01

    Objective: To develop and psychometrically test an owner self-administered questionnaire designed to assess severity and impact of chronic pain in dogs with osteoarthritis. Sample Population: 70 owners of dogs with osteoarthritis and 50 owners of clinically normal dogs. Procedures: Standard methods for the stepwise development and testing of instruments designed to assess subjective states were used. Items were generated through focus groups and an expert panel. Items were tested for readability and ambiguity, and poorly performing items were removed. The reduced set of items was subjected to factor analysis, reliability testing, and validity testing. Results: Severity of pain and interference with function were 2 factors identified and named on the basis of the items contained in them. Cronbach’s α was 0.93 and 0.89, respectively, suggesting that the items in each factor could be assessed as a group to compute factor scores (ie, severity score and interference score). The test-retest analysis revealed κ values of 0.75 for the severity score and 0.81 for the interference score. Scores correlated moderately well (r = 0.51 and 0.50, respectively) with the overall quality-of-life (QOL) question, such that as severity and interference scores increased, QOL decreased. Clinically normal dogs had significantly lower severity and interference scores than dogs with osteoarthritis. Conclusions and Clinical Relevance: A psychometrically sound instrument was developed. Responsiveness testing must be conducted to determine whether the questionnaire will be useful in reliably obtaining quantifiable assessments from owners regarding the severity and impact of chronic pain and its treatment on dogs with osteoarthritis. PMID:17542696

  8. Development of Physical Activity-Related Parenting Practices Scales for Urban Chinese Parents of Preschoolers: Confirmatory Factor Analysis and Reliability.

    PubMed

    Suen, Yi-Nam; Cerin, Ester; Barnett, Anthony; Huang, Wendy Y J; Mellecker, Robin R

    2017-09-01

    Valid instruments of parenting practices related to children's physical activity (PA) are essential to understand how parents affect preschoolers' PA. This study developed and validated a questionnaire of PA-related parenting practices for Chinese-speaking parents of preschoolers in Hong Kong. Parents (n = 394) completed a questionnaire developed using findings from formative qualitative research and literature searches. Test-retest reliability was determined on a subsample (n = 61). Factorial validity was assessed using confirmatory factor analysis. Subscale internal consistency was determined. The scale of parenting practices encouraging PA comprised 2 latent factors: Modeling, structure and participatory engagement in PA (23 items), and Provision of appropriate places for child's PA (4 items). The scale of parenting practices discouraging PA encompassed 4 latent factors: Safety concern/overprotection (6 items), Psychological/behavioral control (5 items), Promoting inactivity (4 items), and Promoting screen time (2 items). Test-retest reliabilities were moderate to excellent (0.58 to 0.82), and internal subscale reliabilities were acceptable (0.63 to 0.89). We developed a theory-based questionnaire for assessing PA-related parenting practices among Chinese-speaking parents of Hong Kong preschoolers. While some items were context and culture specific, many were similar to those previously found in other populations, indicating a degree of construct generalizability across cultures.

  9. Improving Assessment of Work Related Mental Health Function Using the Work Disability Functional Assessment Battery (WD-FAB).

    PubMed

    Marfeo, Elizabeth E; Ni, Pengsheng; McDonough, Christine; Peterik, Kara; Marino, Molly; Meterko, Mark; Rasch, Elizabeth K; Chan, Leighton; Brandt, Diane; Jette, Alan M

    2018-03-01

    Purpose: To improve the mental health component of the Work Disability Functional Assessment Battery (WD-FAB), developed for the US Social Security Administration's (SSA) disability determination process. Specifically, our goal was to expand the WD-FAB scales of mood & emotions, resilience, social interactions, and behavioral control to improve the depth and breadth of the current scales and expand the content coverage to include aspects of cognition & communication function. Methods: Data were collected from a random, stratified sample of 1695 claimants applying for the SSA work disability benefits, and a general population sample of 2025 working age adults. A total of 169 new items were developed to replenish the WD-FAB scales and analyzed using factor analysis and item response theory (IRT) analysis to construct unidimensional scales. We conducted computer adaptive test (CAT) simulations to examine the psychometric properties of the WD-FAB. Results: Analyses supported the inclusion of four mental health subdomains: Cognition & Communication (68 items), Self-Regulation (34 items), Resilience & Sociability (29 items) and Mood & Emotions (34 items). All scales yielded acceptable psychometric properties. Conclusions: IRT methods were effective in expanding the WD-FAB to assess mental health function. The WD-FAB has the potential to enhance work disability assessment both within the context of the SSA disability programs as well as other clinical and vocational rehabilitation settings.

  10. Health-related quality of life questionnaire for polycystic ovary syndrome (PCOSQ-50): development and psychometric properties.

    PubMed

    Nasiri-Amiri, Fatemeh; Ramezani Tehrani, Fahimeh; Simbar, Masoumeh; Montazeri, Ali; Mohammadpour, Reza Ali

    2016-07-01

    The determinants of the health-related quality of life of women with polycystic ovary syndrome are not fully understood. The aim of this study was to develop a comprehensive instrument to assess the health-related quality of life of Iranian women with PCOS and to assess its psychometric properties. We used a mixed-method, sequential, exploratory design including both qualitative [in-depth interview to define the components of health-related quality of life questionnaire (PCOSQ)] and quantitative approaches (to assess the psychometric properties of PCOSQ). A preliminary questionnaire was developed including 147 items which emerged from the qualitative phase of the study. Considering the optimum cutoff points for content validity ratio (CVR), content validity index (CVI), and impact score, items of the preliminary questionnaire were reduced from 147 to 88 items. Finally, by excluding highly correlated items using the exploratory factor analysis, a 50-item questionnaire was obtained. The Kaiser criterion (eigenvalues >1) and scree plot tests demonstrated that six factors were optimum with an estimated 47.3% of variance. Assessment of the psychometric properties of the questionnaire demonstrated a mean CVI = 0.92, CVR = 0.91, Cronbach's alpha for whole questionnaire = 0.88 (0.61-0.88 for subscales), Spearman's correlation coefficients of test-retest = 0.75, and the intra-class correlation coefficient for the PCOS questionnaire subscales ranging from 0.57 to 0.88. The final questionnaire included 50 items in six domains, 'psychosocial and emotional,' 'fertility,' 'sexual function,' 'obesity and menstrual disorders,' 'hirsutism,' and 'coping,' and was rated on a 5-point Likert scale. The PCOSQ-50 is a valid and reliable instrument for the assessment of quality of life of women with PCOS, capable of assessing some obscure aspects overlooked by previous HRQL questionnaires.
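
    The item-reduction step above relies on a content validity ratio (CVR) cutoff. The abstract does not say which CVR formula or cutoff was applied, so the following is only a minimal sketch assuming Lawshe's widely used definition, CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating an item "essential" and N is the panel size. The panel size and ratings are hypothetical illustrations, not data from the study.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: ranges from -1 (no expert says 'essential') to +1 (all do)."""
    if n_experts <= 0:
        raise ValueError("n_experts must be positive")
    if not 0 <= n_essential <= n_experts:
        raise ValueError("n_essential must lie between 0 and n_experts")
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 experts; values are counts of 'essential' ratings.
essential_counts = {"item_a": 10, "item_b": 8, "item_c": 5}
cvr = {item: content_validity_ratio(n, 10) for item, n in essential_counts.items()}

for item, value in cvr.items():
    print(f"{item}: CVR = {value:.2f}")
```

    Items whose CVR falls below a chosen cutoff would then be dropped; the cutoff itself depends on panel size and is not given in the abstract.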

  11. Science Library of Test Items. Volume Nineteen. A Collection of Multiple Choice Test Items Relating Mainly to Geology.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  12. Science Library of Test Items. Volume Seventeen. A Collection of Multiple Choice Test Items Relating Mainly to Biology.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  13. Science Library of Test Items. Volume Eighteen. A Collection of Multiple Choice Test Items Relating Mainly to Chemistry.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  14. A Comparison of Traditional Test Blueprinting and Item Development to Assessment Engineering in a Licensure Context

    ERIC Educational Resources Information Center

    Masters, James S.

    2010-01-01

    With the need for larger and larger banks of items to support adaptive testing and to meet security concerns, large-scale item generation is a requirement for many certification and licensure programs. As part of the mass production of items, it is critical that the difficulty and the discrimination of the items be known without the need for…

  15. Development process of an assessment tool for disruptive behavior problems in cross-cultural settings: the Disruptive Behavior International Scale – Nepal version (DBIS-N)

    PubMed Central

    Burkey, Matthew D.; Ghimire, Lajina; Adhikari, Ramesh P.; Kohrt, Brandon A.; Jordans, Mark J. D.; Haroz, Emily; Wissow, Lawrence

    2017-01-01

    Systematic processes are needed to develop valid measurement instruments for disruptive behavior disorders (DBDs) in cross-cultural settings. We employed a four-step process in Nepal to identify and select items for a culturally valid assessment instrument: 1) We extracted items from validated scales and local free-list interviews. 2) Parents, teachers, and peers (n=30) rated the perceived relevance and importance of behavior problems. 3) Highly rated items were piloted with children (n=60) in Nepal. 4) We evaluated internal consistency of the final scale. We identified 49 symptoms from 11 scales, and 39 behavior problems from free-list interviews (n=72). After dropping items for low ratings of relevance and severity and for poor item-test correlation, low frequency, and/or poor acceptability in pilot testing, 16 items remained for the Disruptive Behavior International Scale—Nepali version (DBIS-N). The final scale had good internal consistency (α=0.86). A 4-step systematic approach to scale development including local participation yielded an internally consistent scale that included culturally relevant behavior problems. PMID:28093575

  16. Sources of Self-Efficacy in Mathematics: A Validation Study

    ERIC Educational Resources Information Center

    Usher, Ellen L.; Pajares, Frank

    2009-01-01

    The purpose of this study was to develop and validate items with which to assess A. Bandura's (1997) theorized sources of self-efficacy among middle school mathematics students. Results from Phase 1 (N=1111) were used to develop and refine items for subsequent use. In Phase 2 of the study (N=824), a 39-item, four-factor exploratory model fit best.…

  17. Item Bank Development for a Revised Pediatric Evaluation of Disability Inventory (PEDI)

    ERIC Educational Resources Information Center

    Dumas, Helene; Fragala-Pinkham, Maria; Haley, Stephen; Coster, Wendy; Kramer, Jessica; Kao, Ying-Chia; Moed, Richard

    2010-01-01

    The Pediatric Evaluation of Disability Inventory (PEDI) is a useful clinical and research assessment, but it has limitations in content, age range, and efficiency. The purpose of this article is to describe the development of the item bank for a new computer adaptive testing version of the PEDI (PEDI-CAT). An expanded item set and response options…

  18. Refinement of the Long-Term Conditions Questionnaire (LTCQ): patient and expert stakeholder opinion.

    PubMed

    Kelly, Laura; Potter, Caroline M; Hunter, Cheryl; Gibbons, Elizabeth; Fitzpatrick, Ray; Jenkinson, Crispin; Peters, Michele

    2016-01-01

    It is a key UK government priority to assess and improve outcomes in people with long-term conditions (LTCs). We are developing a new patient-reported outcome measure, the Long-Term Conditions Questionnaire (LTCQ), for use among people with single or multiple LTCs. This study aimed to refine candidate LTCQ items that had previously been informed through literature reviews, interviews with professional stakeholders, and interviews with people with LTCs. Cognitive interviews (n=32) with people living with LTCs and consultations with professional stakeholders (n=13) and public representatives (n=5) were conducted to assess the suitability of 23 candidate items. Items were tested for content and comprehensibility and underwent a translatability assessment. Four rounds of revisions took place, due to amendments to item structure, improvements to item clarity, item duplication, and recommendations for future translations. Twenty items were confirmed as relevant to living with LTCs and understandable to patients and professionals. This study supports the content validity of the LTCQ items among people with LTCs and professional stakeholders. The final items are suitable to enter the next stage of psychometric refinement.

  19. Checklist content on a standardized patient assessment: an ex post facto review.

    PubMed

    Boulet, John R; van Zanten, Marta; de Champlain, André; Hawkins, Richard E; Peitzman, Steven J

    2008-03-01

    While checklists are often used to score standardized patient based clinical assessments, little research has focused on issues related to their development or the level of agreement with respect to the importance of specific items. Five physicians independently reviewed checklists from 11 simulation scenarios that were part of the former Educational Commission for Foreign Medical Graduate's Clinical Skills Assessment and classified the clinical appropriateness of each of the checklist items. Approximately 78% of the original checklist items were judged to be needed, or indicated, given the presenting complaint and the purpose of the assessment. Rater agreement was relatively poor with pairwise associations (Kappa coefficient) ranging from 0.09 to 0.29. However, when only consensus indicated items were included, there was little change in examinee scores, including their reliability over encounters. Although most checklist items in this sample were judged to be appropriate, some could potentially be eliminated, thereby minimizing the scoring burden placed on the standardized patients. Periodic review of checklist items, concentrating on their clinical importance, is warranted.
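
    The pairwise rater agreement reported above can be illustrated with a short sketch. The abstract only says "Kappa coefficient", so treating it as Cohen's kappa computed for each pair of raters is an assumption; the rater names and ratings below are hypothetical (1 = item judged indicated, 0 = not indicated).

```python
from collections import Counter
from itertools import combinations


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def pairwise_kappas(ratings_by_rater):
    """Kappa for every unordered pair of raters."""
    return {
        (r1, r2): cohens_kappa(ratings_by_rater[r1], ratings_by_rater[r2])
        for r1, r2 in combinations(sorted(ratings_by_rater), 2)
    }


# Hypothetical ratings of four checklist items by three raters.
ratings = {
    "rater_1": [1, 1, 0, 0],
    "rater_2": [1, 0, 1, 0],
    "rater_3": [1, 1, 0, 0],
}
kappas = pairwise_kappas(ratings)
```

    With five raters, as in the study, the same function would return ten pairwise coefficients.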

  20. The Assessment of Motivation within Maslow's Framework.

    ERIC Educational Resources Information Center

    Haymes, Michael; Green, Logan

    1982-01-01

    Reports progress in the development of the Needsort, a research tool, for the assessment of the three developmentally earliest, within Maslow's framework, conative needs (physiological, safety, belongingness). Discusses item analyses, item selection methods, reliability studies, and validation studies across a broad range of populations. (Author)

  1. Developing and investigating the use of single-item measures in organizational research.

    PubMed

    Fisher, Gwenith G; Matthews, Russell A; Gibbons, Alyssa Mitchell

    2016-01-01

    The validity of organizational research relies on strong research methods, which include effective measurement of psychological constructs. The general consensus is that multiple item measures have better psychometric properties than single-item measures. However, due to practical constraints (e.g., survey length, respondent burden) there are situations in which certain single items may be useful for capturing information about constructs that might otherwise go unmeasured. We evaluated 37 items, including 18 newly developed items as well as 19 single items selected from existing multiple-item scales based on psychometric characteristics, to assess 18 constructs frequently measured in organizational and occupational health psychology research. We examined evidence of reliability; convergent, discriminant, and content validity assessments; and test-retest reliabilities at 1- and 3-month time lags for single-item measures using a multistage and multisource validation strategy across 3 studies, including data from N = 17 occupational health subject matter experts and N = 1,634 survey respondents across 2 samples. Items selected from existing scales generally demonstrated better internal consistency reliability and convergent validity, whereas these particular new items generally had higher levels of content validity. We offer recommendations regarding when use of single items may be more or less appropriate, as well as 11 items that seem acceptable, 14 items with mixed results that might be used with caution, and 12 items we do not recommend using as single-item measures. Although multiple-item measures are preferable from a psychometric standpoint, in some circumstances single-item measures can provide useful information. (c) 2016 APA, all rights reserved.

  2. The impact of sub-skills and item content on students' skills with regard to the control-of-variables strategy

    NASA Astrophysics Data System (ADS)

    Schwichow, Martin; Christoph, Simon; Boone, William J.; Härtig, Hendrik

    2016-01-01

    The so-called control-of-variables strategy (CVS) incorporates the important scientific reasoning skills of designing controlled experiments and interpreting experimental outcomes. As CVS is a prominent component of science standards, appropriate assessment instruments are required to measure these scientific reasoning skills and to evaluate the impact of instruction on CVS development. A detailed review of existing CVS instruments suggests that they utilize different, and only a few of the four, critical CVS sub-skills in the item development. This study presents a new CVS assessment instrument (CVS Inventory, CVSI) and investigates the validity of student measures derived from this instrument utilizing Rasch analyses. The results indicate that the CVSI produces reliable and valid student measures with regard to CVS. Furthermore, the results show that the item difficulty depends on the CVS sub-skills utilized in item development, but not on the item content. Accordingly, previous instruments that are restricted to a few CVS sub-skills tend to over- or underestimate students' CVS skills. In addition, these results indicate that students are able to use CVS as a domain general strategy in multiple content areas. Consequences for science instruction and assessment are discussed.

  3. The Work-Family Conflict Scale (WAFCS): development and initial validation of a self-report measure of work-family conflict for use with parents.

    PubMed

    Haslam, Divna; Filus, Ania; Morawska, Alina; Sanders, Matthew R; Fletcher, Renee

    2015-06-01

    This paper outlines the development and validation of the Work-Family Conflict Scale (WAFCS) designed to measure work-to-family conflict (WFC) and family-to-work conflict (FWC) for use with parents of young children. An expert informant and consumer feedback approach was utilised to develop and refine 20 items, which were subjected to a rigorous validation process using two separate samples of parents of 2-12 year old children (n = 305 and n = 264). As a result of statistical analyses several items were dropped resulting in a brief 10-item scale comprising two subscales assessing theoretically distinct but related constructs: FWC (five items) and WFC (five items). Analyses revealed both subscales have good internal consistency, construct validity as well as concurrent and predictive validity. The results indicate the WAFCS is a promising brief measure for the assessment of work-family conflict in parents. Benefits of the measure as well as potential uses are discussed.

  4. A validation study of public health knowledge, skills, social responsibility and applied learning.

    PubMed

    Vackova, Dana; Chen, Coco K; Lui, Juliana N M; Johnston, Janice M

    2018-06-22

    To design and validate a questionnaire to measure medical students' Public Health (PH) knowledge, skills, social responsibility and applied learning as indicated in the four domains recommended by the Association of Schools & Programmes of Public Health (ASPPH). A cross-sectional study was conducted to develop an evaluation tool for PH undergraduate education through item generation, reduction, refinement and validation. The 74 preliminary items derived from the existing literature were reduced to 55 items based on expert panel review which included those with expertise in PH, psychometrics and medical education, as well as medical students. Psychometric properties of the preliminary questionnaire were assessed as follows: frequency of endorsement for item variance; principal component analysis (PCA) with varimax rotation for item reduction and factor estimation; Cronbach's Alpha, item-total correlation and test-retest validity for internal consistency and reliability. PCA yielded five factors: PH Learning Experience (6 items); PH Risk Assessment and Communication (5 items); Future Use of Evidence in Practice (6 items); Recognition of PH as a Scientific Discipline (4 items); and PH Skills Development (3 items), explaining 72.05% variance. Internal consistency and reliability tests were satisfactory (Cronbach's Alpha ranged from 0.87 to 0.90; item-total correlation > 0.59). Lower paired test-retest correlations reflected instability in a social science environment. An evaluation tool for community-centred PH education has been developed and validated. The tool measures PH knowledge, skills, social responsibilities and applied learning as recommended by the internationally recognised Association of Schools & Programmes of Public Health (ASPPH).

  5. Assessing Construct Validity Using Multidimensional Item Response Theory.

    ERIC Educational Resources Information Center

    Ackerman, Terry A.

    The concept of a user-specified validity sector is discussed. The idea of the validity sector combines the work of M. D. Reckase (1986) and R. Shealy and W. Stout (1991). Reckase developed a methodology to represent an item in a multidimensional latent space as a vector. Item vectors are computed using multidimensional item response theory item…

  6. A Time and Place for Everything: Developmental Differences in the Building Blocks of Episodic Memory

    ERIC Educational Resources Information Center

    Lee, Joshua K.; Wendelken, Carter; Bunge, Silvia A.; Ghetti, Simona

    2016-01-01

    This research investigated whether episodic memory development can be explained by improvements in relational binding processes, involved in forming novel associations between events and the context in which they occurred. Memory for item-space, item-time, and item-item relations was assessed in an ethnically diverse sample of 151 children aged…

  7. Meta-analytic guidelines for evaluating single-item reliabilities of personality instruments.

    PubMed

    Spörrle, Matthias; Bekk, Magdalena

    2014-06-01

    Personality is an important predictor of various outcomes in many social science disciplines. However, when personality traits are not the principal focus of research, for example, in global comparative surveys, it is often not possible to assess them extensively. In this article, we first provide an overview of the advantages and challenges of single-item measures of personality, a rationale for their construction, and a summary of alternative ways of assessing their reliability. Second, using seven diverse samples (Ntotal = 4,263) we develop the SIMP-G, the German adaptation of the Single-Item Measures of Personality, an instrument assessing the Big Five with one item per trait, and evaluate its validity and reliability. Third, we integrate previous research and our data into a first meta-analysis of single-item reliabilities of personality measures, and provide researchers with guidelines and recommendations for the evaluation of single-item reliabilities. © The Author(s) 2013.

  8. Development and validation of the work-family-school role conflicts and role-related social support scales among registered nurses with multiple roles.

    PubMed

    Xu, Lijuan; Song, Rhayun

    2013-10-01

    The purpose of this study was to develop work-family-school role conflicts and role-related social support scales, and to validate the psychometrics of those scales among registered nurses with multiple roles. The concepts, generation of items, and the scale domains of work-family-school role conflicts and role-related social support scales were constructed based on a review of the literature. The validity and reliability of the scales were examined by administering them to 201 registered nurses who were recruited from 8 university hospitals in South Korea. The content validity was examined by nursing experts using a content validity index. Exploratory factor analysis and confirmatory factor analysis were used to establish the construct validity. The correlation with depression was examined to assess concurrent validity. Finally, internal consistency was assessed using Cronbach's alpha coefficients. The work-family-school role conflicts scale comprised ten items with three factors: work-school-to-family conflict (three items), family-school-to-work conflict (three items), and work-family-to-school conflict (four items). The role-related social support scale comprised nine items with three factors: support from family (three items), support from work (three items), and support from school (three items). Cronbach's alphas were 0.83 and 0.76 for the work-family-school role conflicts and role-related social support scales, respectively. Both instruments exhibited acceptable construct and concurrent validity. The validity and reliability of the developed scales indicate their potential usefulness for the assessment of work-family-school role conflict and role-related social support among registered nurses with multiple roles in Korea. Copyright © 2013 Elsevier Ltd. All rights reserved.
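
    Internal consistency in records like the one above is summarized with Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) for k items. The sketch below is a plain-Python illustration of that formula, not the authors' analysis code; the response matrices are hypothetical and chosen only to show the two boundary cases.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three items that move together perfectly.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
# Hypothetical data: two items that are uncorrelated across respondents.
unrelated = [[1, 1], [2, 1], [1, 2], [2, 2]]

alpha_consistent = cronbach_alpha(consistent)
alpha_unrelated = cronbach_alpha(unrelated)
```

    Perfectly consistent items give an alpha of 1 and uncorrelated items give 0; the reported values of 0.83 and 0.76 fall between these extremes.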

  9. Strategies to identify future shortages due to interruptions in the health care procurement supply chain and their impact on health services: a method from the English National Health Service.

    PubMed

    Grose, Jane; Richardson, Janet

    2014-01-01

    The uninterrupted supply of essential items for patient care is crucial for organizations that deliver health care. Many products central to health care are derived from natural resources such as oil and cotton, supplies of which are vulnerable to climate change and increasing global demand. The purpose of this study was to identify which items would have the greatest effect on service delivery and patient outcomes should they no longer be available. Using a consensus development approach, all items bought by one hospital, over one year, were subjected to a filtering process. Criteria were developed to identify at-risk products and assess them against specific risks and opportunities. Seventy-two items were identified for assessment against a range of potential impacts on service delivery and patient outcomes, from no impact to significant impact. Clinical and non-clinical participants rated the items. In the category of significant impact, consensus was achieved for 20 items out of 72. There were differences of opinion between clinical and non-clinical participants in terms of significant impact in relation to 18 items, suggesting that priority over purchasing decisions may create areas of conflict. Reducing reliance on critically scarce resources and reducing demand were seen as the most important criteria in developing sustainable procurement. The method was successful in identifying items vulnerable to supply chain interruption and should be repeated in other areas to test its ability to adapt to local priorities, and to assess how it functions in a variety of public and private settings.

  10. Development of the Oxford Participation and Activities Questionnaire: constructing an item pool

    PubMed Central

    Kelly, Laura; Jenkinson, Crispin; Dummett, Sarah; Dawson, Jill; Fitzpatrick, Ray; Morley, David

    2015-01-01

    Purpose: The Oxford Participation and Activities Questionnaire is a patient-reported outcome measure in development that is grounded on the World Health Organization International Classification of Functioning, Disability, and Health (ICF). The study reported here aimed to inform and generate an item pool for the new measure, which is specifically designed for the assessment of participation and activity in patients experiencing a range of health conditions. Methods: Items were informed through in-depth interviews conducted with 37 participants spanning a range of conditions. Interviews aimed to identify how their condition impacted their ability to participate in meaningful activities. Conditions included arthritis, cancer, chronic back pain, diabetes, motor neuron disease, multiple sclerosis, Parkinson’s disease, and spinal cord injury. Transcripts were analyzed using the framework method. Statements relating to ICF themes were recast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n=13) were used to assess items for face and content validity. Results: ICF themes relevant to activities and participation in everyday life were explored, and a total of 222 items formed the initial item pool. This item pool was refined by the research team and 28 generic items were mapped onto all nine chapters of the ICF construct, detailing activity and participation. Cognitive interviewing confirmed the questionnaire instructions, items, and response options were acceptable to participants. Conclusion: Using a clear conceptual basis to inform item generation, 28 items have been identified as suitable to undergo further psychometric testing. A large-scale postal survey will follow in order to refine the instrument further and to assess its psychometric properties. The final instrument is intended for use in clinical trials and interventions targeted at maintaining or improving activity and participation. PMID:26056503

  11. Development of the Oxford Participation and Activities Questionnaire: constructing an item pool.

    PubMed

    Kelly, Laura; Jenkinson, Crispin; Dummett, Sarah; Dawson, Jill; Fitzpatrick, Ray; Morley, David

    2015-01-01

    The Oxford Participation and Activities Questionnaire is a patient-reported outcome measure in development that is grounded on the World Health Organization International Classification of Functioning, Disability, and Health (ICF). The study reported here aimed to inform and generate an item pool for the new measure, which is specifically designed for the assessment of participation and activity in patients experiencing a range of health conditions. Items were informed through in-depth interviews conducted with 37 participants spanning a range of conditions. Interviews aimed to identify how their condition impacted their ability to participate in meaningful activities. Conditions included arthritis, cancer, chronic back pain, diabetes, motor neuron disease, multiple sclerosis, Parkinson's disease, and spinal cord injury. Transcripts were analyzed using the framework method. Statements relating to ICF themes were recast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n=13) were used to assess items for face and content validity. ICF themes relevant to activities and participation in everyday life were explored, and a total of 222 items formed the initial item pool. This item pool was refined by the research team and 28 generic items were mapped onto all nine chapters of the ICF construct, detailing activity and participation. Cognitive interviewing confirmed the questionnaire instructions, items, and response options were acceptable to participants. Using a clear conceptual basis to inform item generation, 28 items have been identified as suitable to undergo further psychometric testing. A large-scale postal survey will follow in order to refine the instrument further and to assess its psychometric properties. The final instrument is intended for use in clinical trials and interventions targeted at maintaining or improving activity and participation.

  12. The Validity of a New Structured Assessment of Gastrointestinal Symptoms Scale (SAGIS) for Evaluating Symptoms in the Clinical Setting.

    PubMed

    Koloski, N A; Jones, M; Hammer, J; von Wulffen, M; Shah, A; Hoelz, H; Kutyla, M; Burger, D; Martin, N; Gurusamy, S R; Talley, N J; Holtmann, G

    2017-08-01

    The clinical assessments of patients with gastrointestinal symptoms can be time-consuming, and the symptoms captured during the consultation may be influenced by a variety of patient and non-patient factors. To facilitate standardized symptom assessment in the routine clinical setting, we developed the Structured Assessment of Gastrointestinal Symptom (SAGIS) instrument to precisely characterize symptoms in a routine clinical setting. We aimed to validate SAGIS including its reliability, construct and discriminant validity, and utility in the clinical setting. Development of the SAGIS consisted of initial interviews with patients referred for the diagnostic work-up of digestive symptoms and relevant complaints identified. The final instrument consisted of 22 items as well as questions on extra intestinal symptoms and was given to 1120 consecutive patients attending a gastroenterology clinic randomly split into derivation (n = 596) and validation datasets (n = 551). Discriminant validity along with test-retest reliability was assessed. The time taken to perform a clinical assessment with and without the SAGIS was recorded along with doctor satisfaction with this tool. Exploratory factor analysis conducted on the derivation sample suggested five symptom constructs labeled as abdominal pain/discomfort (seven items), gastroesophageal reflux disease/regurgitation symptoms (four items), nausea/vomiting (three items), diarrhea/incontinence (five items), and difficult defecation and constipation (two items). Confirmatory factor analysis conducted on the validation sample supported the initially developed five-factor measurement model ([Formula: see text], p < 0.0001, χ²/df = 4.6, CFI = 0.90, TLI = 0.88, RMSEA = 0.08). All symptom groups demonstrated differentiation between disease groups. The SAGIS was shown to be reliable over time and resulted in a 38% reduction of the time required for clinical assessment. The SAGIS instrument has excellent psychometric properties and supports the clinical assessment of and symptom-based categorization of patients with a wide spectrum of gastrointestinal symptoms.

  13. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    PubMed

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.5, 4 residual correlations >.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.
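
    The precision threshold quoted above (SE ≤ .32) can be related to a conventional reliability figure. The abstract does not state this conversion; the sketch below assumes the common IRT approximation reliability ≈ 1 - SE² / Var(theta), with theta scaled to unit variance. The function name and default are illustrative choices.

```python
def reliability_from_se(standard_error: float, theta_variance: float = 1.0) -> float:
    """Approximate score reliability implied by an IRT standard error of measurement."""
    if theta_variance <= 0:
        raise ValueError("theta_variance must be positive")
    return 1.0 - standard_error**2 / theta_variance


# Under the unit-variance assumption, SE = .32 implies 1 - 0.1024 = 0.8976.
short_form_reliability = reliability_from_se(0.32)
print(f"Implied reliability at SE = .32: {short_form_reliability:.4f}")
```

    Under that assumption, an SE of .32 corresponds to a reliability of roughly 0.90 across the stated theta range.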

  14. Development of a new assessment scale for measuring interaction during staff-assisted transfer of residents in dementia special care units.

    PubMed

    Thunborg, Charlotta; von Heideken Wågert, Petra; Götell, Eva; Ivarsson, Ann-Britt; Söderlund, Anne

    2015-02-10

    Mobility problems and cognitive deficits related to transferring or moving persons suffering from dementia are associated with dependency. Physical assistance provided by staff is an important component of residents' maintenance of mobility in dementia care facilities. Unfortunately, hands-on assistance during transfers is also a source of confusion in persons with dementia, as well as a source of strain in the caregiver. The bidirectional effect of actions in a dementia care dyad involved in transfer is complicated to evaluate. This study aimed to develop an assessment scale for measuring actions related to transferring persons with dementia by dementia care dyads. This study was performed in four phases and guided by the framework of the biopsychosocial model and the approach presented by Social Cognitive Theory. These frameworks provided a starting point for understanding reciprocal effects in dyadic interaction. The four phases were 1) a literature review identifying existing assessment scales; 2) analyses of video-recorded transfer of persons with dementia for further generation of items, 3) computing the item content validity index of the 93 proposed items by 15 experts; and 4) expert opinion on the response scale and feasibility testing of the new assessment scale by video observation of the transfer situations. The development process resulted in a 17-item scale with a seven-point response scale. The scale consists of two sections. One section is related to transfer-related actions (e.g., capability of communication, motor skills performance, and cognitive functioning) of the person with dementia. The other section addresses the caregivers' facilitative actions (e.g., preparedness of transfer aids, interactional skills, and means of communication and interaction). The literature review and video recordings provided ideas for the item pool. Expert opinion decreased the number of items by relevance ratings and qualitative feedback. 
    No further development of items was performed after feasibility testing of the scale. To enable assessment of transfer-related actions in dementia care dyads, our new scale shows potential for bridging the gap in this area. Results from this study could provide health care professionals working in dementia care facilities with a useful tool for assessing transfer-related actions.

  15. Development of an instrument to measure medical students' perceptions of the assessment environment: initial validation.

    PubMed

    Sim, Joong Hiong; Tong, Wen Ting; Hong, Wei-Han; Vadivelu, Jamuna; Hassan, Hamimah

    2015-01-01

    Assessment environment, synonymous with climate or atmosphere, is multifaceted. Although there are valid and reliable instruments for measuring the educational environment, there is no validated instrument for measuring the assessment environment in medical programs. This study aimed to develop an instrument for measuring students' perceptions of the assessment environment in an undergraduate medical program and to examine the psychometric properties of the new instrument. The Assessment Environment Questionnaire (AEQ), a 40-item, four-point (1=Strongly Disagree to 4=Strongly Agree) Likert scale instrument designed by the authors, was administered to medical undergraduates from the authors' institution. The response rate was 626/794 (78.84%). To establish construct validity, exploratory factor analysis (EFA) with principal component analysis and varimax rotation was conducted. To examine the internal consistency reliability of the instrument, Cronbach's α was computed. Mean scores for the entire AEQ and for each factor/subscale were calculated. Mean AEQ scores of students from different academic years and sex were examined. Six hundred and eleven completed questionnaires were analysed. EFA extracted four factors: feedback mechanism (seven items), learning and performance (five items), information on assessment (five items), and assessment system/procedure (three items), which together explained 56.72% of the variance. Based on the four extracted factors/subscales, the AEQ was reduced to 20 items. Cronbach's α for the 20-item AEQ was 0.89, whereas Cronbach's α for the four factors/subscales ranged from 0.71 to 0.87. Mean score for the AEQ was 2.68/4.00. The factor/subscale of 'feedback mechanism' recorded the lowest mean (2.39/4.00), whereas the factor/subscale of 'assessment system/procedure' scored the highest mean (2.92/4.00). Significant differences were found among the AEQ scores of students from different academic years. 
    The AEQ is a valid and reliable instrument. Initial validation supports its use to measure students' perceptions of the assessment environment in an undergraduate medical program.
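    For readers unfamiliar with the internal-consistency statistic reported above, Cronbach's α can be computed from a respondents-by-items score matrix as α = k/(k−1) · (1 − Σ item variances / variance of total scores). The sketch below uses only the Python standard library; the function name cronbach_alpha and the response matrix are hypothetical illustrations, not AEQ data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows, each holding k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical four-point Likert responses (5 respondents x 3 items).
responses = [[3, 3, 2], [2, 2, 2], [4, 3, 4], [1, 2, 1], [3, 4, 3]]
print(round(cronbach_alpha(responses), 2))
```

    When every item ranks respondents identically, the item variances account for as little of the total variance as possible and α reaches 1; uncorrelated items push it toward 0.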

  16. Reliability and validity assessment of gastrointestinal dystemperaments questionnaire: a novel scale in Persian traditional medicine

    PubMed Central

    Hoseinzadeh, Hamidreza; Taghipour, Ali; Yousefi, Mahdi

    2018-01-01

    Background: Development of a questionnaire based on the resources of Persian traditional medicine seems necessary. One of the problems faced by practitioners of traditional medicine is the divergence of opinion regarding the diagnosis of general temperament or the temperament of a member. One of the reasons is the lack of valid tools, which has led to difficulties in training students of traditional medicine and in treating patients. Differences in detection methods have given rise to several treatment methods. Objective: The present study aimed to develop a questionnaire and standard software for diagnosis of gastrointestinal dystemperaments. Methods: The present research is a tool-development study comprising 8 stages: developing the items, determining the statements based on items, assessing the face validity, assessing the content validity, assessing the reliability, rating the items, developing software named GDS v.1.1 for calculating the total score of the questionnaire, and evaluating the concurrent validity using statistical tests including Cronbach’s alpha coefficient and Cohen’s kappa coefficient. Results: 112 notes including 62 symptoms were extracted from resources, and 58 items were obtained from in-person interview sessions with a panel of experts. A statement was selected for each item and, after merging a number of statements, a total of 49 statements were finally obtained. By calculating the score of statement impact and determining the content validity, respectively, 6 and 10 other items were removed from the list of statements. Standardized Cronbach’s alpha for this questionnaire was 0.795, and its concurrent validity was 0.8. Conclusion: A quantitative tool was developed for diagnosis and examination of gastrointestinal dystemperaments. The developed questionnaire is adequately reliable and valid for this purpose. In addition, the software can be used for clinical diagnosis. PMID:29629060

  17. Science Library of Test Items. Volume Twenty. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 1.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  18. Science Library of Test Items. Volume Twenty-One. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 2.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  19. Science Library of Test Items. Volume Twenty-Two. A Collection of Multiple Choice Test Items Relating Mainly to Skills.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  20. EXTENDING THE FLOOR AND THE CEILING FOR ASSESSMENT OF PHYSICAL FUNCTION

    PubMed Central

    Fries, James F.; Lingala, Bharathi; Siemons, Liseth; Glas, Cees A. W.; Cella, David; Hussain, Yusra N; Bruce, Bonnie; Krishnan, Eswar

    2014-01-01

    Objective The objective of the current study was to improve the assessment of physical function by improving the precision of assessment at the floor (extremely poor function) and at the ceiling (extremely good health) of the health continuum. Methods Under the NIH PROMIS program, we developed new physical function floor and ceiling items to supplement the existing item bank. Using item response theory (IRT) and the standard PROMIS methodology, we developed 30 floor items and 26 ceiling items and administered them during a 12-month prospective observational study of 737 individuals at the extremes of health status. Change over time was compared across anchor instruments and across items by means of effect sizes. Using the observed changes in scores, we back-calculated sample size requirements for the new and comparison measures. Results We studied 444 subjects with chronic illness and/or extreme age, and 293 generally fit subjects including athletes in training. IRT analyses confirmed that the new floor and ceiling items outperformed reference items (p<0.001). The estimated post-hoc sample size requirements were reduced by a factor of two to four at the floor and a factor of two at the ceiling. Conclusion Extending the range of physical function measurement can substantially improve measurement quality, can reduce sample size requirements and improve research efficiency. The paradigm shift from Disability to Physical Function includes the entire spectrum of physical function, signals improvement in the conceptual base of outcome assessment, and may be transformative as medical goals more closely approach societal goals for health. PMID:24782194

  1. Examination of the PROMIS upper extremity item bank.

    PubMed

    Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R

    Clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.

  2. Accumulation of Content Validation Evidence for the Critical Thinking Self-Assessment Scale.

    PubMed

    Nair, Girija Gopinathan; Hellsten, Laurie-Ann M; Stamler, Lynnette Leeseberg

    2017-04-01

    Critical thinking skills (CTS) are essential for nurses; assessing students' acquisition of these skills is a mandate of nursing curricula. This study aimed to develop a self-assessment instrument of critical thinking skills (Critical Thinking Self-Assessment Scale [CTSAS]) for students' self-monitoring. An initial pool of 196 items across 6 core cognitive skills and 16 subskills were generated using the American Philosophical Association definition of CTS. Experts' content review of the items and their ratings provided evidence of content relevance using the item-level content validity index (I-CVI) and Aiken's content validity coefficient (VIk). 115 items were retained (range of I-CVI values = .70 to .94 and range of VIk values = .69-.95; significant at p< .05). The CTSAS is the first CTS instrument designed specifically for self-assessment purposes.
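    The two content-validity statistics named above can be sketched as follows. This is an illustration under stated assumptions, not the study's computation: it assumes a 4-point relevance scale (1 to 4) on which ratings of 3 or 4 count as "relevant", which the abstract does not specify. The function names and the ratings are hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level content validity index: share of experts rating the item relevant."""
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def aikens_v(ratings, low=1, high=4):
    """Aiken's V: summed distance from the lowest category, scaled to the range 0..1."""
    n = len(ratings)
    return sum(r - low for r in ratings) / (n * (high - low))


# Hypothetical relevance ratings for one item from ten experts.
ratings = [4, 4, 3, 4, 2, 4, 3, 4, 4, 3]
print(item_cvi(ratings))            # 0.9
print(round(aikens_v(ratings), 3))  # 0.833
```

    I-CVI only counts how many experts cross the relevance threshold, whereas Aiken's V also reflects how far each rating sits above the lowest category, so the two can diverge for the same item.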

  3. Development of the Children's Scale of Hostility and Aggression: Reactive/Proactive (C-SHARP).

    PubMed

    Farmer, Cristan A; Aman, Michael G

    2009-01-01

    Whereas some scales exist for assessing aggression in typically developing children, they do not give a detailed analysis, and none is available for populations with developmental disabilities (DD). Parents of 365 children with DD completed the Children's Scale of Hostility and Aggression: Reactive/Proactive (C-SHARP), which surveys the severity of aggressive and hostile behaviors (Problem Scale) in addition to their proactive or reactive qualities (the Provocation Scale). Factor analysis yielded a 5-factor solution: I. Verbal Aggression (12 items), II. Bullying (12 items), III. Covert Aggression (11 items), IV. Hostility (9 items), and V. Physical Aggression (8 items). Coefficient alpha ranged from moderate (0.74, Physical Aggression) to high (0.92, Verbal Aggression). General validity was supported by expected differences between age and gender groups. Preliminary normative data were presented. The C-SHARP appears to be a promising tool for assessing aggression and hostility in children with DD.

  4. Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank.

    PubMed

    Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Vonkeman, Harald E; van de Laar, Mart A F J

    2017-11-01

    Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. Seventy-two items generated from patient interviews and mapped to the International Classification of Functioning, Disability and Health (ICF) domestic life chapter were administered to 1128 adults representative of the Dutch population. The partial credit model was fitted to the item responses and evaluated with respect to its assumptions, model fit, and differential item functioning (DIF). Measurement performance of a computerized adaptive testing (CAT) algorithm was compared with the SF-36 physical functioning scale (PF-10). A final bank of 41 items was developed. All items demonstrated acceptable fit to the partial credit model and measurement invariance across age, sex, and educational level. Five- and ten-item CAT simulations were shown to have high measurement precision, which exceeded that of SF-36 physical functioning scale across the physical function continuum. Floor effects were absent for a 10-item empirical CAT simulation, and ceiling effects were low (13.5%) compared with SF-36 physical functioning (38.1%). CAT also discriminated better than SF-36 physical functioning between age groups, number of chronic conditions, and respondents with or without rheumatic conditions. The Rasch assessment of everyday activity limitations (REAL) item bank will hopefully prove a useful instrument for assessing everyday activity limitations. T-scores obtained using derived measures can be used to benchmark physical function outcomes against the general Dutch adult population.
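    As background to the model named above, the partial credit model gives the probability of responding in category x of an item with thresholds δ₁…δₘ as proportional to exp(Σⱼ₌₁ˣ (θ − δⱼ)), with the empty sum for x = 0 equal to zero. The sketch below is a generic illustration; the function name and the threshold values are hypothetical and are not parameters from the REAL item bank.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities under the partial credit model.

    theta: person location; thresholds: step difficulties delta_1..delta_m.
    Returns m + 1 probabilities for categories 0..m.
    """
    # Cumulative sums of (theta - delta_j); category 0 has the empty sum 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item with thresholds at -1.0 and 0.5.
probs = pcm_probabilities(theta=0.0, thresholds=[-1.0, 0.5])
print([round(p, 3) for p in probs])
```

    With a single threshold the expression reduces to the dichotomous Rasch model, so a person located exactly at the threshold has a probability of 0.5 for each category.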

  5. Hablamos juntos (together we speak): a brief patient-reported measure of the quality of interpretation.

    PubMed

    Talamantes, Efrain; Moreno, Gerardo; Guerrero, Lourdes R; Mangione, Carol M; Morales, Leo S

    2014-01-01

    This study evaluates the psychometric properties of three newly developed items assessing the quality of interpretation from the patient's perspective among Spanish-speaking limited English proficient Latino patients. The authors examined the psychometric properties of a patient-reported measure of quality of interpretation using a cross-sectional survey study of 1,590 adult Spanish-speaking limited English proficient Latinos in the United States. Quality of interpretation, doctor communication, and satisfaction with care were assessed using a three-item survey measure, an independent multiple-item measure, and a single-item measure, respectively. Sixty-nine percent (1,104) of patients surveyed used interpreters. Cronbach's alpha for the three items assessing interpreter quality was 0.31, while dropping item three resulted in an alpha of 0.56. Items one and two were moderately correlated with doctor communication (r=0.39) and satisfaction with care scores (r=0.21), supporting construct validity. Two out of three survey items can be scaled to measure quality of interpretation from the patient's perspective. Quality of interpretation reported by patients is moderately associated with doctor communication and satisfaction with care.

  6. Measuring implementation behaviour of menu guidelines in the childcare setting: confirmatory factor analysis of a theoretical domains framework questionnaire (TDFQ).

    PubMed

    Seward, Kirsty; Wolfenden, Luke; Wiggers, John; Finch, Meghan; Wyse, Rebecca; Oldmeadow, Christopher; Presseau, Justin; Clinton-McHarg, Tara; Yoong, Sze Lin

    2017-04-04

    While there are a number of frameworks that focus on supporting the implementation of evidence-based approaches, few psychometrically valid measures exist to assess constructs within these frameworks. This study aimed to develop and psychometrically assess a scale measuring each domain of the Theoretical Domains Framework for use in assessing the implementation of dietary guidelines within a non-health care setting (childcare services). A 75-item, 14-domain Theoretical Domains Framework Questionnaire (TDFQ) was developed and administered via telephone interview to 202 centre-based childcare service cooks who had a role in planning the service menu. Confirmatory factor analysis (CFA) was undertaken to assess the reliability, discriminant validity and goodness of fit of the 14-domain theoretical domain framework measure. For the CFA, five iterative processes of adjustment were undertaken in which 14 items were removed, resulting in a final measure consisting of 14 domains and 61 items. For the final measure: the Chi-Square goodness of fit statistic was 3447.19; the Standardized Root Mean Square Residual (SRMR) was 0.070; the Root Mean Square Error of Approximation (RMSEA) was 0.072; and the Comparative Fit Index (CFI) had a value of 0.78. While only one of the three indices supports goodness of fit of the measurement model tested, a 14-domain model with 61 items showed good discriminant validity and internally consistent items. Future research should aim to assess the psychometric properties of the developed TDFQ in other community-based settings.

  7. Development of the Consumer Refrigerator Safety Questionnaire: A Measure of Consumer Perceptions and Practices.

    PubMed

    Cairnduff, Victoria; Dean, Moira; Koidis, Anastasios

    2016-09-01

    Food preparation and storage behaviors in the home deviating from the "best practice" food safety recommendations may result in foodborne illnesses. Currently, there are limited tools available to fully evaluate the consumer knowledge, perceptions, and behavior in the area of refrigerator safety. The current study aimed to develop a valid and reliable tool in the form of a questionnaire, the Consumer Refrigerator Safety Questionnaire (CRSQ), for assessing systematically all these aspects. Items relating to refrigerator safety knowledge (n =17), perceptions (n =46), and reported behavior (n =30) were developed and pilot tested by an expert reference group and various consumer groups to assess face and content validity (n =20), item difficulty and consistency (n =55), and construct validity (n =23). The findings showed that the CRSQ has acceptable face and content validity with acceptable levels of item difficulty. Item consistency was observed for 12 of 15 in refrigerator safety knowledge. Further, all 5 of the subscales of consumer perceptions of refrigerator safety practices relating to risk of developing foodborne disease showed acceptable internal consistency (Cronbach's α value > 0.8). Construct validity of the CRSQ was shown to be very good (P = 0.022). The CRSQ exhibited acceptable test-retest reliability at 14 days with the majority of knowledge items (93.3%) and reported behavior items (96.4%) having correlation coefficients of greater than 0.70. Overall, the CRSQ was deemed valid and reliable in assessing refrigerator safety knowledge and behavior; therefore, it has the potential for future use in identifying groups of individuals at increased risk of deviating from recommended refrigerator safety practices, as well as the assessment of refrigerator safety knowledge and behavior for use before and after an intervention.

  8. The value of item response theory in clinical assessment: a review.

    PubMed

    Thomas, Michael L

    2011-09-01

    Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical assessment are reviewed to appraise its current and potential value. Benefits of IRT include comprehensive analyses and reduction of measurement error, creation of computer adaptive tests, meaningful scaling of latent variables, objective calibration and equating, evaluation of test and item bias, greater accuracy in the assessment of change due to therapeutic intervention, and evaluation of model and person fit. The theory may soon reinvent the manner in which tests are selected, developed, and scored. Although challenges remain to the widespread implementation of IRT, its application to clinical assessment holds great promise. Recommendations for research, test development, and clinical practice are provided.

  9. Development and validation of the German version of the Orofacial Esthetic Scale.

    PubMed

    Reissmann, Daniel R; Benecke, Andreas W; Aarabi, Ghazal; Sierwald, Ira

    2015-07-01

    This study aimed to develop the German version of the Orofacial Esthetic Scale (OES-G) and to assess its psychometric properties. The OES is an eight-item instrument with seven items directly addressing esthetic impacts of the orofacial region and an eighth item for a global assessment. It applies an 11-point ordinal rating scale, with summary scores ranging from 0 (worst) to 70 (best). The original OES items were translated into German using a forward-backward method. A de novo development of German items (n = 21 patients) and a cross-cultural adaptation after pilot testing (n = 15 patients) established content validity. Internal consistency and construct validity (structural, convergent, known-groups) of the OES-G were assessed in a sample of 165 prosthodontic patients. The OES was applied in 42 patients on two occasions, with a temporal distance of 2-4 weeks apart to determine test-retest reliability. Internal consistency of the OES-G was considered as satisfactory (Cronbach's alpha 0.94; average inter-item correlation 0.64). Intraclass correlation coefficient of 0.95 (95 % confidence interval 0.92-0.98) indicated excellent test-retest reliability. Correlation matrix and exploratory factor analysis provided support for unidimensionality of the measured construct. The OES-G summary score was correlated with the patients' global assessment of their esthetics (r = 0.87) and external ratings of the expert group (r = 0.55) and discriminated patients with treatment need (39.4 points) from patients without (58.4 points; p < 0.001) and with a large effect size. The OES-G has good psychometric properties and is a valuable instrument for the assessment of self-perceived orofacial esthetics.

  10. Payload software technology: Software technology development plan

    NASA Technical Reports Server (NTRS)

    1977-01-01

    Programmatic requirements for the advancement of software technology are identified for meeting the space flight requirements in the 1980 to 1990 time period. The development items are described, and software technology item derivation worksheets are presented along with the cost/time/priority assessments.

  11. Development of a simple measurement scale to evaluate the severity of non-specific low back pain for industrial ergonomics.

    PubMed

    Higuchi, Yoshiyuki; Izumi, Hiroyuki; Kumashiro, Mashaharu

    2010-06-01

    This study developed an assessment scale that hierarchically classifies degrees of low back pain severity. This assessment scale consists of two subscales: 1) pain intensity; 2) pain interference. First, the assessment scale devised by the authors was used to administer a self-administered questionnaire to 773 male workers in the car manufacturing industry. Subsequently, the validity of the measurement items was examined and some of them were revised. Next, the corrected low back pain scale was used in a self-administered questionnaire, the subjects of which were 5053 ordinary workers. The hierarchical validity between the measurement items was checked based on the results of Mokken Scale analysis. Finally, a low back pain assessment scale consisting of seven items was perfected. Quantitative assessment is made possible by scoring the items and low back pain severity can be classified into four hierarchical levels: none; mild; moderate; severe. STATEMENT OF RELEVANCE: The use of this scale devised by the authors allows a more detailed assessment of the degree of risk factor effect and also should prove useful both in selecting remedial measures for occupational low back pain and evaluating their efficacy.

  12. The development and psychometric validation of the Ethical Awareness Scale.

    PubMed

    Milliken, Aimee; Ludlow, Larry; DeSanto-Madeya, Susan; Grace, Pamela

    2018-04-19

    To develop and psychometrically assess the Ethical Awareness Scale using Rasch measurement principles and a Rasch item response theory model. Critical care nurses must be equipped to provide good (ethical) patient care. This requires ethical awareness, which involves recognizing the ethical implications of all nursing actions. Ethical awareness is imperative in successfully addressing patient needs. Evidence suggests that the ethical import of everyday issues may often go unnoticed by nurses in practice. Assessing nurses' ethical awareness is a necessary first step in preparing nurses to identify and manage ethical issues in the highly dynamic critical care environment. A cross-sectional design was used in two phases of instrument development. Using Rasch principles, an item bank representing nursing actions was developed (33 items). Content validity testing was performed. Eighteen items were selected for face validity testing. Two rounds of operational testing were performed with critical care nurses in Boston between February-April 2017. A Rasch analysis suggests sufficient item invariance across samples and sufficient construct validity. The analysis further demonstrates a progression of items uniformly along a hierarchical continuum; items that match respondent ability levels; response categories that are sufficiently used; and adequate internal consistency. Mean ethical awareness scores were in the low/moderate range. The results suggest the Ethical Awareness Scale is a psychometrically sound, reliable and valid measure of ethical awareness in critical care nurses. © 2018 John Wiley & Sons Ltd.

  13. Identifying predictors of physics item difficulty: A linear regression approach

    NASA Astrophysics Data System (ADS)

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created.
    It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge structures. Identified predictors point out the fundamental cognitive dimensions of student physics achievement at the end of compulsory education in Bosnia and Herzegovina, whose level of development influenced the test results within the conducted assessments.

  14. Designing and Testing an Inventory for Measuring Social Media Competency of Certified Health Education Specialists

    PubMed Central

    Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann

    2015-01-01

    Background Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). Objective The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. Methods The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Results Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (Ρ>.90), or based on feedback received from pilot participants. 
During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Conclusions: Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES. PMID:26399428

  15. Designing and Testing an Inventory for Measuring Social Media Competency of Certified Health Education Specialists.

    PubMed

    Alber, Julia M; Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann

    2015-09-23

    Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (ρ>.90), or feedback received from pilot participants. 
During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES.

  16. Children and Young People-Mental Health Safety Assessment Tool (CYP-MH SAT) study: Protocol for the development and psychometric evaluation of an assessment tool to identify immediate risk of self-harm and suicide in children and young people (10-19 years) in acute paediatric hospital settings.

    PubMed

    Manning, Joseph C; Walker, Gemma M; Carter, Tim; Aubeeluck, Aimee; Witchell, Miranda; Coad, Jane

    2018-04-12

    Currently, no standardised, evidence-based assessment tool for assessing immediate self-harm and suicide in acute paediatric inpatient settings exists. The aim of this study is to develop and test the psychometric properties of an assessment tool that identifies immediate risk of self-harm and suicide in children and young people (10-19 years) in acute paediatric hospital settings. Development phase: This phase involved a scoping review of the literature to identify and extract items from previously published suicide and self-harm risk assessment scales. Using a modified electronic Delphi approach, these items will then be rated according to their relevance for assessment of immediate suicide or self-harm risk by expert professionals. Inclusion of items will be determined by 65%-70% consensus between raters. Subsequently, a panel of expert members will convene to determine the face validity, appropriate phrasing, item order and response format for the finalised items. Psychometric testing phase: The finalised items will be tested for validity and reliability through a multicentre, psychometric evaluation. Psychometric testing will be undertaken to determine the following: internal consistency, inter-rater reliability, and convergent, divergent and concurrent validity. Ethical approval was provided by the National Health Service East Midlands-Derby Research Ethics Committee (17/EM/0347) and full governance clearance received by the Health Research Authority and local participating sites. Findings from this study will be disseminated to professionals and the public via peer-reviewed journal publications, popular social media and conference presentations. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
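
    The Delphi inclusion rule described in this protocol can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's procedure: the 1-5 relevance scale, the cutoff of 4 for counting a rating as relevant, the 0.70 threshold (the upper end of the stated 65%-70% range), and the item names are all hypothetical.

```python
def consensus_proportion(ratings, relevant_cutoff=4):
    """Share of raters whose rating is at or above the (assumed) cutoff."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(r >= relevant_cutoff for r in ratings) / len(ratings)


def select_items(ratings_by_item, threshold=0.70, relevant_cutoff=4):
    """Return the items whose consensus proportion reaches the threshold."""
    return [
        item
        for item, ratings in ratings_by_item.items()
        if consensus_proportion(ratings, relevant_cutoff) >= threshold
    ]


# Hypothetical ratings from ten experts on a 1-5 relevance scale.
example_ratings = {
    "item_a": [5, 4, 4, 5, 3, 4, 5, 4, 2, 4],  # 8 of 10 rate it 4 or higher
    "item_b": [3, 2, 4, 3, 5, 2, 3, 4, 3, 2],  # 3 of 10 rate it 4 or higher
}
print(select_items(example_ratings))
```

    Keeping the threshold as a parameter makes it easy to check how sensitive the retained item set is to a 65% versus a 70% rule.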

  17. Development and validation of a fatigue assessment scale for U.S. construction workers.

    PubMed

    Zhang, Mingzong; Sparer, Emily H; Murphy, Lauren A; Dennerlein, Jack T; Fang, Dongping; Katz, Jeffrey N; Caban-Martinez, Alberto J

    2015-02-01

    To develop a fatigue assessment scale and test its reliability and validity for commercial construction workers. Using a two-phased approach, we first identified items (first phase) for the development of a Fatigue Assessment Scale for Construction Workers (FASCW) through review of existing scales in the scientific literature, key informant interviews (n = 11) and focus groups (three groups with six workers each) with construction workers. The second phase included assessment of the reliability, validity, and sensitivity of the new scale using a repeated-measures study design with a convenience sample of construction workers (n = 144). Phase one resulted in a 16-item preliminary scale that after factor analysis yielded a final 10-item scale with two subscales ("Lethargy" and "Bodily Ailment"). During phase two, the FASCW and its subscales demonstrated satisfactory internal consistency (alpha coefficients were FASCW [0.91], Lethargy [0.86] and Bodily Ailment [0.84]) and acceptable test-retest reliability (Pearson correlation coefficients: 0.59-0.68; intraclass correlation coefficients: 0.74-0.80). Correlation analysis substantiated concurrent and convergent validity. A discriminant analysis demonstrated that the FASCW differentiated between groups with arthritis status and different work hours. The 10-item FASCW with good reliability and validity is an effective tool for assessing the severity of fatigue among construction workers. © 2015 Wiley Periodicals, Inc.
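
    Assuming the alpha coefficients reported above are Cronbach's alpha, the statistic can be computed from a respondents-by-items score table as alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below uses only the Python standard library; the response data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of per-respondent item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) and of the summed scale score.
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: four respondents, three items.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 5, 5],
]
print(round(cronbach_alpha(responses), 3))
```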

  18. Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior.

    PubMed

    Tassé, Marc J; Schalock, Robert L; Thissen, David; Balboni, Giulia; Bersani, Henry Hank; Borthwick-Duffy, Sharon A; Spreat, Scott; Widaman, Keith F; Zhang, Dalun; Navas, Patricia

    2016-03-01

    The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT modeling and a nationally representative standardization sample, the item set was reduced to 75 items that provide the most precise adaptive behavior information at the cutoff area determining the presence or not of significant adaptive behavior deficits across conceptual, social, and practical skills. The standardization of the DABS is described and discussed.

  19. A tool for assessing case history and feedback skills in audiology students working with simulated patients.

    PubMed

    Hughes, Jane; Wilson, Wayne J; MacBean, Naomi; Hill, Anne E

    2016-12-01

    To develop a tool for assessing audiology students taking a case history and giving feedback with simulated patients (SP). Single observation, single group design. Twenty-four first-year audiology students, five simulated patients, two clinical educators, and three evaluators. The Audiology Simulated Patient Interview Rating Scale (ASPIRS) was developed consisting of six items assessing specific clinical skills, non-verbal communication, verbal communication, interpersonal skills, interviewing skills, and professional practice skills. These items are applied once for taking a case history and again for giving feedback. The ASPIRS showed very high internal consistency (α = 0.91-0.97; mean inter-item r = 0.64-0.85) and fair-to-moderate agreement between evaluators (29.2-54.2% exact and 79.2-100% near agreement; κ weighted up to 0.60). It also showed fair-to-moderate absolute agreement amongst evaluators for single evaluator scores (intraclass correlation coefficient [ICC] r = 0.35-0.59) and substantial consistency of agreement amongst evaluators for three-evaluator averaged scores (ICC r = 0.62-0.81). Factor analysis showed the ASPIRS' 12 items fell into two components, one containing all feedback items and one containing all case history items. The ASPIRS shows promise as the first published tool for assessing audiology students taking a case history and giving feedback with an SP.
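
    The exact and near agreement percentages reported above can be illustrated with a short sketch. It assumes that near agreement means two evaluators' ratings differ by at most one scale point, which the abstract does not state; the ratings shown are hypothetical.

```python
def agreement_rates(rater_a, rater_b, tolerance=1):
    """Return (exact, near) agreement proportions for two raters.

    Near agreement counts pairs that differ by at most `tolerance` points;
    the default of one point is an assumption made for illustration.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of cases")
    pairs = list(zip(rater_a, rater_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    near = sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)
    return exact, near


# Hypothetical ratings of four students by two evaluators.
exact, near = agreement_rates([1, 2, 3, 4], [1, 3, 5, 4])
print(f"exact={exact:.1%}, near={near:.1%}")
```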

  20. Development and validation of brief scales to measure emotional and behavioural problems among Chinese adolescents

    PubMed Central

    Shen, Minxue; Hu, Ming; Sun, Zhenqiu

    2017-01-01

    Objectives: To develop and validate brief scales to measure common emotional and behavioural problems among adolescents in the examination-oriented education system and collectivistic culture of China. Setting: Middle schools in Hunan province. Participants: 5442 middle school students aged 11–19 years were sampled. 4727 valid questionnaires were collected and used for validation of the scales. The final sample included 2408 boys and 2319 girls. Primary and secondary outcome measures: The tools were assessed by the item response theory, classical test theory (reliability and construct validity) and differential item functioning. Results: Four scales to measure anxiety, depression, study problem and sociality problem were established. Exploratory factor analysis showed that each scale had two solutions. Confirmatory factor analysis showed acceptable to good model fit for each scale. Internal consistency and test–retest reliability of all scales were above 0.7. Item response theory showed that all items had acceptable discrimination parameters and most items had appropriate difficulty parameters. 10 items demonstrated differential item functioning with respect to gender. Conclusions: Four brief scales were developed and validated among adolescents in middle schools of China. The scales have good psychometric properties with minor differential item functioning. They can be used in middle school settings, and will help school officials to assess the students’ emotional/behavioural problems. PMID:28062469

  1. Development of the Mini-Assisting Hand Assessment: evidence for content and internal scale validity.

    PubMed

    Greaves, Susan; Imms, Christine; Dodd, Karen; Krumlinde-Sundholm, Lena

    2013-11-01

    To describe the development of the Mini-Assisting Hand Assessment (Mini-AHA) for children with signs of unilateral cerebral palsy (CP) aged 8 to 18 months, and evaluate aspects of content and internal scale validity. The ability of the video-recorded Mini-AHA play session to provoke bimanual performance in children with unilateral CP and typical development was evaluated. Original AHA test items were examined for their suitability for younger children and possible new items were generated. Data from 108 assessments of children with unilateral CP (86 children, 53 males, 33 females; mean age 13 mo, SD 3 mo, range 8-18 mo) were entered into a Rasch measurement model analysis to evaluate internal scale validity. A Spearman's correlation analysis explored the relationship between age and ability measures for children with unilateral CP. The frequency of maximum scores in 40 children with typical development (22 males, 18 females; mean age 12 mo, SD 3 mo) was examined. The Mini-AHA play session provoked bimanual responses in typically developing children 99% of the time. Person and item fit criteria established 20 items for the scale. The resultant unidimensional scale also demonstrated excellent discriminative features through high separation reliability. The item calibration values covered the range of person ability measures well. Age was not related to the ability measures for children with unilateral CP (rs =0.178). All children with typical development achieved maximum scores. Accumulated evidence shows that the Mini-AHA validly measures use of the affected hand during bimanual performance for children with unilateral CP aged 8 to 18 months. The Mini-AHA has the potential to be a useful assessment to evaluate functional hand use and the effects of intervention in an age group when potential for change is high. © 2013 Mac Keith Press.

  2. Development and validation of oral health-related early childhood quality of life tool for North Indian preschool children.

    PubMed

    Mathur, Vijay Prakash; Dhillon, Jatinder Kaur; Logani, Ajay; Agarwal, Ramesh

    2014-01-01

    The purpose of this study was to develop a reliable instrument [Oral Health related Early Childhood Quality of Life (OH-ECQOL) scale] for measuring oral health related quality of life (OHrQoL) in preschool children in the North Indian population. Four pediatric dentists evaluated a pool of 65 items from various QoL questionnaires to assess their relevance to the Indian population. These items were discussed with eight independent pediatric dentists and two community dentists who were not a part of this study to assess relevance of these items to preschool age children based on their comprehensiveness and clarity. Based on their responses and feedback a modified pool of items was developed and administered to a convenience sample of 20 parents who rated these items according to their relevance. The test-retest reliability was evaluated on another sample of 20 parents of 2-5 year old children. The final questionnaire comprised 16 items (12 child and 4 family). This was administered to 300 parents of 24-71 months old children divided on the basis of early childhood caries to assess its reliability and validity. OH-ECQOL scores were significantly associated with parental ratings of their child's general and oral health, and the presence of dental disease in the child. Cronbach's alpha was 0.862, and the ICC for test-retest reliability was 0.94. The OH-ECQOL proved a reliable and valid tool for assessing the impact of oral disorders on the quality of life of preschool children in Northern India.

  3. Development of a Culture Specific Critical Thinking Ability Test and Using It as a Supportive Diagnostic Test for Giftedness

    ERIC Educational Resources Information Center

    Köksal, Mustafa Serdar

    2016-01-01

    The purposes of this study were to develop a culture-specific critical thinking ability test for 6th, 7th, and 8th grade students in Turkey and to use it as an assessment instrument for giftedness. For these purposes, an item pool of 22 items was formed by writing items focusing on the current and common events presented in (Turkish) media from…

  4. Development and Psychometric Testing of the Caregiver Communication Competence Scale in Patients With Dementia.

    PubMed

    Chao, Hui-Chen; Yang, Ya-Ping; Huang, Mei-Chih; Wang, Jing-Jy

    2016-01-01

    Appropriate communication skills are essential for understanding patient needs, particularly those of patients with dementia. Assessing health care providers' competence in communicating with patients with dementia is critical for planning a communication education program. However, no formally established scale can be used. The purpose of the current study was to develop a valid and reliable instrument for determining the communication competence of health care providers with patients with dementia. Through use of a literature review and previous clinical experience, an initial 28-item scale was developed to assess the frequency of use of each item by health care providers. Fourteen items were extracted and three factors were distinguished. Results indicated that the internal consistency reliability of the 14-item scale was 0.84. Favorable convergent and discriminant validities were reached. The communication competence scale provides administrators or educators with a useful tool for assessing communication competence of health care providers when interacting with patients with dementia so a suitable education program can be planned and implemented. Copyright 2016, SLACK Incorporated.

  5. HKF-R 10 - screening for predicting chronicity in acute low back pain (LBP): a prospective clinical trial.

    PubMed

    Neubauer, Eva; Junge, Astrid; Pirron, Peter; Seemann, Hanne; Schiltenwolf, Marcus

    2006-08-01

    Prospective cohort study. To develop a short instrument to reliably predict chronicity in low back pain (LBP). Health care expenditures on the treatment of low back pain continue to increase. It is therefore important to prevent the development of chronicity. In Germany, there is at present no early risk assessment tool to predict the risk of developing chronic LBP for patients presenting with acute LBP. Undertaken in an orthopedic practice setting, this study examined known risk factors for chronicity. It resulted in the development of a short questionnaire that successfully predicted the course of chronicity with an accuracy of 78%. A cohort of 192 orthopaedic outpatients was assessed for clinical, behavioral, emotional, and cognitive parameters based on a self-report test battery of 167 established items predictive for chronicity in LBP. Chronicity was defined as back pain persisting for longer than six months. Logistic regression analysis was performed to evaluate the predictive value of all items significantly associated with the dependent variable. The study found the following items to have the strongest predictive value in the development of chronicity: "How strong was your back pain during the last week when it was most tolerable?" and the question "How much residual pain would you be willing to tolerate while still considering the therapy successful?" These were followed by the variables for "Duration of existing LBP" (more than eight days), the patient's educational level (low levels are related to higher risks of chronicity) and pain being experienced elsewhere in the body. Other significant factors were five items assessing depression (Zung) and the palliative effect of therapeutic massage (where a positive correlation was found). Female patients have a higher risk for chronicity, as do patients with a high total score on the scales assessing "catastrophizing thoughts" and thoughts of "helplessness". 
Using the items listed above, the study was able to predict a patient's risk of developing chronic LBP with a probability of 78%. These items were assembled in a brief questionnaire and were paired with a corresponding evaluative tool. This enables practitioners to assess an individual patient's risk for chronicity by means of a simple calculator in just a few minutes. A validation study for the questionnaire is currently being prepared. MINI ABSTRACT: The objective of this study was the development of a brief questionnaire to assess the risk for chronicity for LBP.

  6. Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

    PubMed

    Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

    2014-05-01

    The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.
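
    Two of the descriptive classical test theory checks listed in this overview, the frequency of responses in each category and floor and ceiling effects, can be sketched in a few lines. The function names, the 0-10 score range and the data below are hypothetical and only illustrate the kind of summary meant.

```python
from collections import Counter


def category_frequencies(item_responses):
    """Proportion of responses falling in each observed response category."""
    counts = Counter(item_responses)
    n = len(item_responses)
    return {category: counts[category] / n for category in sorted(counts)}


def floor_ceiling(scale_scores, min_score, max_score):
    """Proportions of respondents at the lowest and highest possible score."""
    if not scale_scores:
        raise ValueError("at least one score is required")
    counts = Counter(scale_scores)
    n = len(scale_scores)
    return counts[min_score] / n, counts[max_score] / n


# Hypothetical scale scores on a 0-10 range for eight respondents.
scores = [0, 0, 5, 10, 10, 10, 3, 7]
floor, ceiling = floor_ceiling(scores, min_score=0, max_score=10)
print(f"floor={floor:.1%}, ceiling={ceiling:.1%}")
print(category_frequencies([1, 1, 2, 3, 3, 3, 4, 5]))
```

    A large share of respondents at either extreme suggests the scale cannot distinguish among them, which is one reason these summaries are usually inspected before fitting an IRT model.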

  7. Measuring attributes of health literate health care organizations from the patients' perspective: Development and validation of a questionnaire to assess health literacy-sensitive communication (HL-COM).

    PubMed

    Ernstmann, Nicole; Halbach, Sarah; Kowalski, Christoph; Pfaff, Holger; Ansmann, Lena

    2017-04-01

    Studies addressing the organizational contexts of care that may help increase the patients' ability to cope with a disease and to navigate through the health care system are still rare. In particular, instruments allowing the assessment of such organizational efforts from the patients' perspective are missing. The aim of our study was to develop a survey instrument assessing organizational health literacy (HL) from the patients' perspective, i.e., health care organizations' responsiveness to patients' individual needs. A pool of 30 items was developed by a group of experts based on a literature review. The items were developed, tested and prioritized according to their importance in 11 semi-structured interviews and cognitive think-aloud interviews with cancer patients. The resulting 16 items were rated in a standardized postal survey involving a total of N=453 colon and breast cancer patients treated in cancer centers in Germany. An exploratory factor analysis, a confirmatory factor analysis and structural equation modelling were conducted. Item properties were analyzed. 83.2 % of the patients were diagnosed with breast cancer, 16.8 % had a diagnosis of colon cancer. The patients' mean age was 61 (26-88), 89.4 % were female. The most common comorbidities were hypertension (34.0 %) and cardiovascular disease (11.0 %). The final prediction model included nine items measuring the degree of health literacy-sensitivity of communication. The model showed an acceptable model fit. The nine items showed corrected item-total correlations between .622 and .762 and item difficulties between 0.77 and 0.87. Cronbach's α was .912. In a comprehensive development process, the original item pool comprising several aspects of organizational HL was reduced to a one-dimensional scale. The instrument measures an important aspect of organizational HL; i.e., the degree of health literacy-sensitivity of communication (HL-COM). 
HL-COM was found to impact patient enablement, mediated through the support by physicians. Future research will have to test these associations in the context of other diseases or institutions. Copyright © 2017. Published by Elsevier GmbH.
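
    The corrected item-total correlations reported for the HL-COM items are, in the usual classical test theory sense, the Pearson correlation between each item and the sum of the remaining items. The sketch below assumes that definition; the helper names and the response data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(ss_x * ss_y)


def corrected_item_total(rows):
    """Correlate each item with the total of all other items."""
    n_items = len(rows[0])
    results = []
    for j in range(n_items):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]  # total excluding item j
        results.append(pearson(item, rest))
    return results


# Hypothetical responses: four respondents, three items.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 5, 5],
]
print([round(r, 3) for r in corrected_item_total(responses)])
```

    Excluding the item from its own total avoids inflating the correlation, which matters most for short scales.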

  8. An international road map to improve pain assessment in people with impaired cognition: the development of the Pain Assessment in Impaired Cognition (PAIC) meta-tool.

    PubMed

    Corbett, Anne; Achterberg, Wilco; Husebo, Bettina; Lobbezoo, Frank; de Vet, Henrica; Kunz, Miriam; Strand, Liv; Constantinou, Marios; Tudose, Catalina; Kappesser, Judith; de Waal, Margot; Lautenbacher, Stefan

    2014-12-10

    Pain is common in people with dementia, yet identification is challenging. A number of pain assessment tools exist, utilizing observation of pain-related behaviours, vocalizations and facial expressions. Whilst they have been developed robustly, these often lack sufficient evidence of psychometric properties, like reliability, face and construct validity, responsiveness and usability, and are not internationally implemented. The EU-COST initiative "Pain in impaired cognition, especially dementia" aims to combine the expertise of clinicians and researchers to address this important issue by building on previous research in the area, identifying existing pain assessment tools for dementia, and developing consensus for items for a new universal meta-tool for use in research and clinical settings. This paper reports on the initial phase of this collaboration task. All existing observational pain behaviour tools were identified and elements categorised using a three-step reduction process. Selection and refinement of items for the draft Pain Assessment in Impaired Cognition (PAIC) meta-tool was achieved through scrutiny of the evidence, consensus of expert opinion, frequency of use and alignment with the American Geriatrics Society (AGS) guidelines. The main aim of this process was to identify key items with potential empirical, rather than theoretical value to take forward for testing. 12 eligible assessment tools were identified, and pain items categorised according to behaviour, facial expression and vocalisation according to the AGS guidelines (Domains 1 - 3). This has been refined to create the PAIC meta-tool for validation and further refinement. A decision was made to create a comprehensive toolkit to support the core assessment tool and to provide additional resources for the assessment of overlapping symptoms in dementia, including AGS domains four to six, identification of specific types of pain and assessment of duration and location of pain. 
This multidisciplinary, cross-cultural initiative has created a draft meta-tool for capturing pain behaviour to be used across languages and culture, based on the most promising items used in existing tools. The draft PAIC meta-tool will now be taken forward for evaluation according to COSMIN guidelines and the EU-COST protocol in order to exclude invalid items, refine included items and optimise the meta-tool.

  9. Construction and validation of a psychometric scale to measure awareness on consumption of irradiated foods.

    PubMed

    Rusin, Tiago; Araújo, Wilma Maria Coelho; Faiad, Cristiane; Vital, Helio de Carvalho

    2017-01-01

    Although food irradiation has been used to ensure food safety, most consumers are unaware of the basic concepts of irradiation, misinterpreting information and demonstrating a negative attitude toward food items treated with ionizing radiation. This research is aimed at developing a tool to assess the awareness on the consumption of irradiated food. The sample was composed of employees from different social classes and school levels of Brazilian universities, who reflect the end-users of the irradiated foods, representative of the views of lay consumers. The total number of respondents was 614. In order to assess the Awareness Scale on Consumption of Irradiated Foods (ASCIF), an instrument has been developed and submitted to semantic tests and judge's validation. The instrument, which included 32 items, covered four construct factors: concepts (6 items), awareness (10 items), labeling (7 items) and safety of irradiated foods (9 items). The data were collected by electronic means, through the site . By using exploratory factorial analysis (EFA) 4 factors have been found. They summarize the 31 items included. These factors account for 64.32% of the variance of the items and the internal consistency of the factors has been deemed good. An Exploratory Structural Equation Modeling (ESEM) was conducted to evaluate the factor structure of the instrument. The proposed instrument has been found to meet consistency criteria as an efficient tool for assessing potential challenges and opportunities for the irradiated food markets.

  10. Developing a model of competence in the operating theatre: psychometric validation of the perceived perioperative competence scale-revised.

    PubMed

    Gillespie, Brigid M; Polit, Denise F; Hamlin, Lois; Chaboyer, Wendy

    2012-01-01

    This paper describes the development and validation of the Revised Perioperative Competence Scale (PPCS-R). There is a lack of psychometrically sound, tested self-assessment tools to measure nurses' perceived competence in the operating room. Content validity was established by a panel of international experts and the original 98-item scale was pilot tested with 345 nurses in Queensland, Australia. Following the removal of several items, a national sample that included all 3209 nurses who were members of the Australian College of Operating Room Nurses was surveyed using the 94-item version. Psychometric testing assessed content validity using exploratory factor analysis, internal consistency using Cronbach's alpha, and construct validity using the "known groups" technique. During item reduction, several preliminary factor analyses were performed on two random halves of the sample (n=550). Usable data for psychometric assessment were obtained from 1122 nurses. The original 94-item scale was reduced to 40 items. The final factor analysis using the entire sample resulted in a 40-item, six-factor solution. Cronbach's alpha for the 40-item scale was .96. Construct validation demonstrated significant differences (p<.0001) in perceived competence scores relative to years of operating room experience and receipt of specialty education. On the basis of these results, the psychometric properties of the PPCS-R were considered encouraging. Further testing of the tool in different samples of operating room nurses is necessary to enable cross-cultural comparisons. Copyright © 2011 Elsevier Ltd. All rights reserved.

  11. Construction and validation of a psychometric scale to measure awareness on consumption of irradiated foods

    PubMed Central

    2017-01-01

    Although food irradiation has been used to ensure food safety, most consumers are unaware of the basic concepts of irradiation, misinterpreting information and demonstrating a negative attitude toward food items treated with ionizing radiation. This research is aimed at developing a tool to assess awareness of the consumption of irradiated food. The sample was composed of employees from different social classes and school levels of Brazilian universities, who reflect the end-users of the irradiated foods, representative of the views of lay consumers. The total number of respondents was 614. In order to assess the Awareness Scale on Consumption of Irradiated Foods (ASCIF), an instrument has been developed and submitted to semantic tests and judges’ validation. The instrument, which included 32 items, covered four construct factors: concepts (6 items), awareness (10 items), labeling (7 items) and safety of irradiated foods (9 items). The data were collected by electronic means, through the site . By using exploratory factor analysis (EFA) 4 factors have been found. They summarize the 31 items included. These factors account for 64.32% of the variance of the items and the internal consistency of the factors has been deemed good. Exploratory Structural Equation Modeling (ESEM) was conducted to evaluate the factor structure of the instrument. The proposed instrument has been found to meet consistency criteria as an efficient tool for assessing potential challenges and opportunities for the irradiated food markets. PMID:29220375

  12. Development and testing of item response theory-based item banks and short forms for eye, skin and lung problems in sarcoidosis.

    PubMed

    Victorson, David E; Choi, Seung; Judson, Marc A; Cella, David

    2014-05-01

    Sarcoidosis is a multisystem disease that can negatively impact health-related quality of life (HRQL) across generic (e.g., physical, social and emotional wellbeing) and disease-specific (e.g., pulmonary, ocular, dermatologic) domains. Measurement of HRQL in sarcoidosis has largely relied on generic patient-reported outcome tools, with few disease-specific measures available. The purpose of this paper is to present the development and testing of disease-specific item banks and short forms of lung, skin and eye problems, which are a part of a new patient-reported outcome (PRO) instrument called the sarcoidosis assessment tool. After prioritizing and selecting the most important disease-specific domains, we wrote new items to reflect disease-specific problems by drawing from patient focus group and clinician expert survey data that were used to create our conceptual model of HRQL in sarcoidosis. Item pools underwent cognitive interviews with sarcoidosis patients (n = 13), and minor modifications were made. These items were administered in a multi-site study (n = 300) to obtain item calibrations and create calibrated short forms using item response theory (IRT) approaches. From the available item pools, we created four new item banks and short forms: (1) skin problems, (2) skin stigma, (3) lung problems, and (4) eye problems. We also created and tested supplemental forms of the most common constitutional symptoms and negative effects of corticosteroids. Several new sarcoidosis-specific PROs were developed and tested using IRT approaches. These new measures can advance more precise and targeted HRQL assessment in sarcoidosis clinical trials and clinical practice.

  13. Development of and Field-Test Results for the CAHPS PCMH Survey

    PubMed Central

    Scholle, Sarah Hudson; Vuong, Oanh; Ding, Lin; Fry, Stephanie; Gallagher, Patricia; Brown, Julie A.; Hays, Ron D.; Cleary, Paul D.

    2017-01-01

    Objective: To develop and evaluate survey questions that assess processes of care relevant to Patient-Centered Medical Homes (PCMHs). Research Design: We convened expert panels, reviewed evidence on effective care practices and existing surveys, elicited broad public input, and conducted cognitive interviews and a field test to develop items relevant to PCMHs that could be added to the CAHPS® Clinician & Group (CG-CAHPS) 1.0 Survey. Surveys were tested using a two-contact mail protocol in 10 adult and 33 pediatric practices (both private and community health centers) in Massachusetts. A total of 4,875 completed surveys were received (overall response rate of 25%). Analyses: We calculated the rate of valid responses for each item. We conducted exploratory factor analyses and estimated item-to-total correlations, individual and site level reliability, and correlations among proposed multi-item composites. Results: Ten items in four new domains (Comprehensiveness, Information, Self-Management Support, and Shared Decision-Making) and four items in two existing domains (Access and Coordination of Care) were selected to be supplemental items to be used in conjunction with the adult CG-CAHPS 1.0 survey. For the child version, four items in each of two new domains (Information and Self-Management Support) and five items in existing domains (Access, Comprehensiveness-Prevention, Coordination of Care) were selected. Conclusions: This study provides support for the reliability and validity of new items to supplement the CG-CAHPS 1.0 survey to assess aspects of primary care that are important attributes of Patient-Centered Medical Homes. PMID:23064272

  14. Developing a reliable and valid instrument to assess health-affecting aspects of neighborhoods in Tehran

    PubMed Central

    Ghalichi, Leila; Mohammad, Kazem; Majdzadeh, Reza; Hoseini, Mostafa; Pournik, Omid; Nedjat, Saharnaz

    2012-01-01

    Background: Residence characteristics can affect health of residents. This paper reports the development of an instrument assessing these aspects of neighborhoods. Materials and Methods: Literature search and focus group discussions with residents were carried out and relevant items were extracted. Five experts reviewed and commented on the items. An observation instrument with 54 items was composed and completed by two independent observers in 20 randomly selected locations. Due to lack of acceptable reliability in some items, the checklist was revised. The new 22-item checklist in four categories (general characteristics, public green area characteristics, access to services and undesirable features) was completed by two independent trained observers in 28 randomly selected locations. Results: The items in the final checklist had kappa statistics ranging from 0.63 to 1, with the exception of the item assessing “presence of beggars, homeless or working/street children”, with kappa as low as 0.27 due to variability of their presence at different times. The average kappa statistic was 0.78 for general characteristics, 0.79 for public green area characteristics, 0.84 for access to services, and 0.54 for undesirable features. Conclusion: The neighborhood and health observation instrument seems to have good reliability in the city of Tehran. It can probably be used in other large cities of Iran and similar cities elsewhere. PMID:23626633
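    The kappa statistics reported in this record measure agreement between two observers beyond what chance alone would produce. As a minimal sketch of how such a value can be computed for one checklist item, assuming the standard Cohen's kappa for two raters (the abstract does not say which variant or software the authors used), the function name `cohen_kappa` and the example ratings below are hypothetical:

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same locations.

    Illustrative helper only; not taken from the cited study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of locations")
    n = len(rater_a)
    # Observed proportion of locations where the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; agreement is complete.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up ratings (1 = feature present, 0 = absent) for ten locations.
observer_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
observer_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(observer_1, observer_2)
```

    With these made-up ratings the observers agree on 8 of 10 locations while chance agreement is 0.5, so kappa is about 0.6.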

  15. Pedagogy of Science Teaching Tests: Formative assessments of science teaching orientations

    NASA Astrophysics Data System (ADS)

    Cobern, William W.; Schuster, David; Adams, Betty; Skjold, Brandy Ann; Zeynep Muğaloğlu, Ebru; Bentz, Amy; Sparks, Kelly

    2014-09-01

    A critical aspect of teacher education is gaining pedagogical content knowledge of how to teach science for conceptual understanding. Given the time limitations of college methods courses, it is difficult to touch on more than a fraction of the science topics potentially taught across grades K-8, particularly in the context of relevant pedagogies. This research and development work centers on constructing a formative assessment resource to help expose pre-service teachers to a greater number of science topics within teaching episodes using various modes of instruction. To this end, 100 problem-based, science pedagogy assessment items were developed via expert group discussions and pilot testing. Each item contains a classroom vignette followed by response choices carefully crafted to include four basic pedagogies (didactic direct, active direct, guided inquiry, and open inquiry). The brief but numerous items allow a substantial increase in the number of science topics that pre-service students may consider. The intention is that students and teachers will be able to share and discuss particular responses to individual items, or else record their responses to collections of items and thereby create a snapshot profile of their teaching orientations. Subsets of items were piloted with students in pre-service science methods courses, and the quantitative results of student responses were spread sufficiently to suggest that the items can be effective for their intended purpose.

  16. Development and Field Test of an Audit Tool and Tracer Methodology for Clinician Assessment of Quality in End-of-Life Care.

    PubMed

    Bookbinder, Marilyn; Hugodot, Amandine; Freeman, Katherine; Homel, Peter; Santiago, Elisabeth; Riggs, Alexa; Gavin, Maggie; Chu, Alice; Brady, Ellen; Lesage, Pauline; Portenoy, Russell K

    2018-02-01

    Quality improvement in end-of-life care generally acquires data from charts or caregivers. "Tracer" methodology, which assesses real-time information from multiple sources, may provide complementary information. The objective of this study was to develop a valid brief audit tool that can guide assessment and rate care when used in a clinician tracer to evaluate the quality of care for the dying patient. To identify items for a brief audit tool, 248 items were created to evaluate overall quality, quality in specific content areas (e.g., symptom management), and specific practices. Collected into three instruments, these items were used to interview professional caregivers and evaluate the charts of hospitalized patients who died. Evidence that this information could be validly captured using a small number of items was obtained through factor analyses, canonical correlations, and group comparisons. A nurse manager field tested tracer methodology using candidate items to evaluate the care provided to other patients who died. The survey of 145 deaths provided chart data and data from 445 interviews (26 physicians, 108 nurses, 18 social workers, and nine chaplains). The analyses yielded evidence of construct validity for a small number of items, demonstrating significant correlations between these items and content areas identified as latent variables in factor analyses. Criterion validity was suggested by significant differences in the ratings on these items between the palliative care unit and other units. The field test evaluated 127 deaths, demonstrated the feasibility of tracer methodology, and informed reworking of the candidate items into the 14-item Tracer EoLC v1. The Tracer EoLC v1 can be used with tracer methodology to guide the assessment and rate the quality of end-of-life care. Copyright © 2017 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

  17. Assessment of Differential Item Functioning in the Experiences of Discrimination Index

    PubMed Central

    Cunningham, Timothy J.; Berkman, Lisa F.; Gortmaker, Steven L.; Kiefe, Catarina I.; Jacobs, David R.; Seeman, Teresa E.; Kawachi, Ichiro

    2011-01-01

    The psychometric properties of instruments used to measure self-reported experiences of discrimination in epidemiologic studies are rarely assessed, especially regarding construct validity. The authors used 2000–2001 data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study to examine differential item functioning (DIF) in 2 versions of the Experiences of Discrimination (EOD) Index, an index measuring self-reported experiences of racial/ethnic and gender discrimination. DIF may confound interpretation of subgroup differences. Large DIF was observed for 2 of 7 racial/ethnic discrimination items: White participants reported more racial/ethnic discrimination for the “at school” item, and black participants reported more racial/ethnic discrimination for the “getting housing” item. The large DIF by race/ethnicity in the index for racial/ethnic discrimination probably reflects item impact and is the result of valid group differences between blacks and whites regarding their respective experiences of discrimination. The authors also observed large DIF by race/ethnicity for 3 of 7 gender discrimination items. This is more likely to have been due to item bias. Users of the EOD Index must consider the advantages and disadvantages of DIF adjustment (omitting items, constructing separate measures, and retaining items). The EOD Index has substantial usefulness as an instrument that can assess self-reported experiences of discrimination. PMID:22038104

  18. Development and Psychometric Properties of a Tuberculosis-Specific Multidimensional Health-Related Quality-of-Life Measure for Patients with Pulmonary Tuberculosis.

    PubMed

    Abdulelah, Juman; Sulaiman, Syed Azhar Syed; Hassali, Mohamed A; Blebil, Ali Q; Awaisu, Ahmed; Bredle, Jason M

    2015-05-01

    Various generic instruments exist to assess health-related quality of life (HRQOL) in patients with tuberculosis (TB), but a psychometrically sound disease-specific instrument is lacking. The present study aimed to develop and psychometrically validate a multidimensional TB-specific HRQOL instrument relevant to the value of patients with pulmonary TB in Iraq with an eye toward cross-cultural application. The core general HRQOL questionnaire is composed of the Functional Assessment of Cancer Therapy-General items. A modular approach was followed for the development of the Functional Assessment of Chronic Illness Therapy-Tuberculosis (FACIT-TB) questionnaire in which a set of items assessing quality-of-life (QOL) issues not sufficiently covered by the core Functional Assessment of Cancer Therapy-General items, but considered to be relevant to the target population, was added. Moreover, principal-component analysis was used to determine the new subscale structure of the questionnaire. In addition to the 27 items of the core questionnaire, a set of 20 items referring to disease symptoms related to the site of infection, adverse effects, and additional QOL dimensions such as fatigue, social stigma, and economic burden of the illness was included. Factor analysis demonstrated that the FACIT-TB construct comprised five domains. A rigorous method was applied in the development of the FACIT-TB measure to fully understand the impact of TB on patients' QOL. The instrument is psychometrically sound and portrays multiple important dimensions of HRQOL. FACIT-TB is relatively brief, is easy to administer and score, and is appropriate for use in clinical trials and practice. Copyright © 2015 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  19. Development of the AGREE II, part 2: assessment of validity of items and tools to support application

    PubMed Central

    Brouwers, Melissa C.; Kho, Michelle E.; Browman, George P.; Burgers, Jako S.; Cluzeau, Françoise; Feder, Gene; Fervers, Béatrice; Graham, Ian D.; Hanna, Steven E.; Makarski, Julie

    2010-01-01

    Background: We established a program of research to improve the development, reporting and evaluation of practice guidelines. We assessed the construct validity of the items and user’s manual in the β version of the AGREE II. Methods: We designed guideline excerpts reflecting high- and low-quality guideline content for 21 of the 23 items in the tool. We designed two study packages so that one low-quality and one high-quality version of each item were randomly assigned to each package. We randomly assigned 30 participants to one of the two packages. Participants reviewed and rated the guideline content according to the instructions of the user’s manual and completed a survey assessing the manual. Results: In all cases, content designed to be of high quality was rated higher than low-quality content; in 18 of 21 cases, the differences were significant (p < 0.05). The manual was rated by participants as appropriate, easy to use, and helpful in differentiating guidelines of varying quality, with all scores above the mid-point of the seven-point scale. Considerable feedback was offered on how the items and manual of the β-AGREE II could be improved. Interpretation: The validity of the items was established and the user’s manual was rated as highly useful by users. We used these results and those of our study presented in part 1 to modify the items and user’s manual. We recommend AGREE II (available at www.agreetrust.org) as the revised standard for guideline development, reporting and evaluation. PMID:20513779

  20. Development and psychometric evaluation of a cardiovascular risk and disease management knowledge assessment tool.

    PubMed

    Rosneck, James S; Hughes, Joel; Gunstad, John; Josephson, Richard; Noe, Donald A; Waechter, Donna

    2014-01-01

    This article describes the systematic construction and psychometric analysis of a knowledge assessment instrument for phase II cardiac rehabilitation (CR) patients measuring risk modification disease management knowledge and behavioral outcomes derived from national standards relevant to secondary prevention and management of cardiovascular disease. First, using adult curriculum based on disease-specific learning outcomes and competencies, a systematic test item development process was completed by clinical staff. Second, a panel of educational and clinical experts used an iterative process to identify test content domain and arrive at consensus in selecting items meeting criteria. Third, the resulting 31-question instrument, the Cardiac Knowledge Assessment Tool (CKAT), was piloted in CR patients to ensure use of application. Validity and reliability analyses were performed on 3638 adults before test administrations with additional focused analyses on 1999 individuals completing both pretreatment and posttreatment administrations within 6 months. Evidence of CKAT content validity was substantiated, with 85% agreement among content experts. Evidence of construct validity was demonstrated via factor analysis identifying key underlying factors. Estimates of internal consistency, for example, Cronbach's α = .852 and Spearman-Brown split-half reliability = 0.817 on pretesting, support test reliability. Item analysis, using point biserial correlation, measured relationships between performance on single items and total score (P < .01). Analyses using item difficulty and item discrimination indices further verified item stability and validity of the CKAT. A knowledge instrument specifically designed for an adult CR population was systematically developed and tested in a large representative patient population, satisfying psychometric parameters, including validity and reliability.
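    This record reports Cronbach's α and a Spearman-Brown split-half reliability as its internal-consistency estimates. A minimal sketch of the two textbook formulas follows, assuming a complete respondents-by-items score matrix; the function names and the toy data are hypothetical, and nothing here reproduces the CKAT analysis itself:

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


# Toy data: three respondents, two items that rise together perfectly.
toy_scores = [[1, 2], [2, 3], [3, 4]]
alpha = cronbach_alpha(toy_scores)
stepped_up = spearman_brown(0.5)
```

    In the toy data the two items move in lockstep, so alpha reaches its maximum of 1.0; a half-test correlation of 0.5 steps up to about 0.67 for the full-length test.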

  1. Development and validation of the impact of dry eye on everyday life (IDEEL) questionnaire, a patient-reported outcomes (PRO) measure for the assessment of the burden of dry eye on patients.

    PubMed

    Abetz, Linda; Rajagopalan, Krithika; Mertzanis, Polyxane; Begley, Carolyn; Barnes, Rod; Chalmers, Robin

    2011-12-08

    To develop and validate a comprehensive patient-reported outcomes instrument focusing on the impact of dry eye on everyday life (IDEEL). Development and validation of the IDEEL occurred in four phases: 1) focus groups with 45 dry eye patients to develop a draft instrument, 2) item generation, 3) pilot study to assess content validity in 16 patients and 4) psychometric validation in 210 subjects: 130 with non-Sjögren's keratoconjunctivitis sicca, 32 with Sjögren's syndrome and 48 controls, and subsequent item reduction. Focus groups identified symptoms and the associated bother, the impact of dry eye on daily life and the patients' satisfaction with their treatment as the central concepts in patients' experience of dry eye. Qualitative analysis indicated that saturation was achieved for these concepts and yielded an initial 112-item draft instrument. Patients understood the questionnaire and found the items to be relevant indicating content validity. Patient input, item descriptive statistics and factor analysis identified 55 items that could be deleted. The final 57-item IDEEL assesses dry eye impact constituting 3 modules: dry eye symptom-bother, dry eye impact on daily life comprising impact on daily activities, emotional impact, impact on work, and dry eye treatment satisfaction comprising satisfaction with treatment effectiveness and treatment-related bother/inconvenience. The psychometric analysis results indicated that the IDEEL met the criteria for item discriminant validity, internal consistency reliability, test-retest reliability and floor/ceiling effects. As expected, the correlations between IDEEL and the Dry Eye Questionnaire (a habitual symptom questionnaire) were higher than between IDEEL and Short-Form-36 and EuroQoL-5D, indicating concurrent validity. The IDEEL is a reliable, valid and comprehensive questionnaire relevant to issues that are specific to dry eye patients, and meets current FDA patient-reported outcomes guidelines. The use of this questionnaire will provide assessment of the impact of dry eye on patient dry eye-related quality of life, impact of treatment on patient outcomes in clinical trials, and may aid in treatment effectiveness evaluation.

  2. Development of a health literacy assessment for young adult college students: a pilot study.

    PubMed

    Harper, Raquel

    2014-01-01

    The purpose of this study was to develop a comprehensive health literacy assessment tool for young adult college students. Participants were 144 undergraduate students. Two hundred and twenty-nine questions were developed, which were based on concepts identified by the US Department of Health and Human Services, the World Health Organization, and health communication scholars. Four health education experts reviewed this pool of items and helped select 87 questions for testing. Students completed an online assessment consisting of these 87 questions in June and October of 2012. Item response theory and goodness-of-fit values were used to help eliminate nonperforming questions. Fifty-one questions were selected based on good item response theory discrimination parameter values. The instrument has 51 questions that look promising for measuring health literacy in college students, but needs additional testing with a larger student population to see how these questions continue to perform.

  3. Science Library of Test Items. Volume Four: Practical Testing Guide.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test items collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, the guide gives a wide range of questions and activities for the manipulation of scientific equipment to allow assessment of students' practical laboratory skills. Instructions are given to make norm-referenced or…

  4. Development of a Self-Report Physical Function Instrument for Disability Assessment: Item Pool Construction and Factor Analysis

    PubMed Central

    McDonough, Christine M.; Jette, Alan M.; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M.; Rasch, Elizabeth K.

    2014-01-01

    Objectives: To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Design: Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. Setting: In-person and semi-structured interviews; internet and telephone surveys. Participants: A sample of 1,017 SSA claimants, and a normative sample of 999 adults from the US general population. Interventions: Not applicable. Main Outcome Measure: Model fit statistics. Results: The final item pool consisted of 139 items. Within the claimant sample 58.7% were white; 31.8% were black; 46.6% were female; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution which included more items and allowed separate characterization of: 1) Changing and Maintaining Body Position, 2) Whole Body Mobility, 3) Upper Body Function and 4) Upper Extremity Fine Motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples respectively were: Comparative Fit Index = 0.93 and 0.98; Tucker-Lewis Index = 0.92 and 0.98; Root Mean Square Error Approximation = 0.05 and 0.04. Conclusions: The factor structure of the Physical Function item pool closely resembled the hypothesized content model. The four scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. PMID:23542402

  5. Development of a self-report physical function instrument for disability assessment: item pool construction and factor analysis.

    PubMed

    McDonough, Christine M; Jette, Alan M; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M; Rasch, Elizabeth K

    2013-09-01

    To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. In-person and semistructured interviews and Internet and telephone surveys. Sample of SSA claimants (n=1017) and a normative sample of adults from the U.S. general population (n=999). Not applicable. Model fit statistics. The final item pool consisted of 139 items. Within the claimant sample, 58.7% were white; 31.8% were black; 46.6% were women; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution, which included more items and allowed separate characterization of: (1) changing and maintaining body position, (2) whole body mobility, (3) upper body function, and (4) upper extremity fine motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples, respectively, were: Comparative Fit Index=.93 and .98; Tucker-Lewis Index=.92 and .98; and root mean square error approximation=.05 and .04. The factor structure of the physical function item pool closely resembled the hypothesized content model. The 4 scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  6. A second step in development of a checklist for screening risk for violence in acute psychiatric patients: evaluation of interrater reliability of the Preliminary Scheme 33.

    PubMed

    Bjørkly, Stål; Moger, Tron A

    2007-12-01

    The Acute Project is a research project conducted on acute psychiatric admission wards in Norway. The objective is to develop and validate a structured, easy-to-use screening checklist for assessment of risk for violence in patients both during their stay in the ward and after discharge. The Preliminary Scheme 33 is a 33-item screening checklist with content domain inspired by the Historical-Clinical-Risk Management Scheme (HCR-20), the Brøset Violence Checklist, and eight risk factors extracted from the literature on risk assessment. The Preliminary Scheme 33 was designed and tested in two steps by a research group which includes the authors. The common aim of both steps was to develop this into a time economical, reliable, and valid checklist. In the first step in 2006 the predictive validity of the individual items was tested. The present work presents results from the second step, a study conducted to assess the interrater reliability of the 33 items. Eight clinicians working in an acute psychiatric unit volunteered to be raters and were trained to score the 33 items on a three-point scale in relation to 15 clinical vignettes, which contained information from 15 acute psychiatric patients' files. Analysis showed high interrater reliability for the total score with an intraclass correlation coefficient (ICC) of .86 (95% CI: 0.74-0.94). However, a substantial proportion of the items had medium to low ICCs. Consequences of this finding for further development of these items into a brief screen are discussed.

  7. Development of a New Branded UK Food Composition Database for an Online Dietary Assessment Tool

    PubMed Central

    Carter, Michelle C.; Hancock, Neil; Albar, Salwa A.; Brown, Helen; Greenwood, Darren C.; Hardie, Laura J.; Frost, Gary S.; Wark, Petra A.; Cade, Janet E.

    2016-01-01

    The current UK food composition tables are limited, containing ~3300 mostly generic food and drink items. To reflect the wide range of food products available to British consumers and to potentially improve accuracy of dietary assessment, a large UK-specific electronic food composition database (FCDB) has been developed. A mapping exercise has been conducted that matched micronutrient data from generic food codes to “Back of Pack” data from branded food products using a semi-automated process. After cleaning and processing, version 1.0 of the new FCDB contains 40,274 generic and branded items with associated data for 120 macronutrients and micronutrients, and 5669 items with portion images. Over 50% of food and drink items were individually mapped to within 10% agreement with the generic food item for energy. Several quality checking procedures were applied after mapping, including identifying foods above and below the expected range for a particular nutrient within that food group and cross-checking the mapping of items such as concentrated and raw/dried products. The new electronic FCDB has substantially increased the size of the current, publicly available UK food tables. The FCDB has been incorporated into myfood24, a new fully automated online dietary assessment tool, and a smartphone application for weight loss. PMID:27527214

  8. Development of a New Branded UK Food Composition Database for an Online Dietary Assessment Tool.

    PubMed

    Carter, Michelle C; Hancock, Neil; Albar, Salwa A; Brown, Helen; Greenwood, Darren C; Hardie, Laura J; Frost, Gary S; Wark, Petra A; Cade, Janet E

    2016-08-05

    The current UK food composition tables are limited, containing ~3300 mostly generic food and drink items. To reflect the wide range of food products available to British consumers and to potentially improve accuracy of dietary assessment, a large UK-specific electronic food composition database (FCDB) has been developed. A mapping exercise has been conducted that matched micronutrient data from generic food codes to "Back of Pack" data from branded food products using a semi-automated process. After cleaning and processing, version 1.0 of the new FCDB contains 40,274 generic and branded items with associated data for 120 macronutrients and micronutrients, and 5669 items with portion images. Over 50% of food and drink items were individually mapped to within 10% agreement with the generic food item for energy. Several quality checking procedures were applied after mapping, including identifying foods above and below the expected range for a particular nutrient within that food group and cross-checking the mapping of items such as concentrated and raw/dried products. The new electronic FCDB has substantially increased the size of the current, publicly available UK food tables. The FCDB has been incorporated into myfood24, a new fully automated online dietary assessment tool, and a smartphone application for weight loss.

  9. [Development of the theoretical framework and the item pool of the peri-operative recovery scale for integrative medicine].

    PubMed

    Su, Bi-ying; Liu, Shao-nan; Li, Xiao-yan

    2011-11-01

    To describe the rationale and procedures for developing the theoretical framework and the item pool of the peri-operative recovery scale for integrative medicine, in preparation for the development and psychometric testing of this scale. Under the guidance of Chinese medicine theories and of guidelines for developing psychometric scales, the theoretical framework and the item pool of the scale were initially laid out through literature retrieval, expert consultation, and related methods. The scale covered the domains of physical function, mental function, activity function, pain, and general assessment. A social function domain was also included, which is suitable for pre-operative testing and for testing long-term therapeutic efficacy after discharge from hospital. Each domain should cover the correlated Zang-Fu organs, qi, blood, and patient-reported outcomes. In total, 122 items were initially included in the item pool according to the theoretical framework of the scale. The peri-operative recovery scale for integrative medicine embodies the combination of Chinese medicine theories and patient-reported outcome concepts. The scale could reasonably assess the peri-operative recovery outcomes of patients treated with integrative medicine.

  10. [Development of a measurement of intellectual capital for hospital nursing organizations].

    PubMed

    Kim, Eun A; Jang, Keum Seong

    2011-02-01

    This study was done to develop an instrument for measuring intellectual capital and assess its validity and reliability in identifying the components, human capital, structure capital and customer capital of intellectual capital in hospital nursing organizations. The participants were 950 regular clinical nurses who had worked for over 13 months in 7 medical hospitals including 4 national university hospitals and 3 private university hospitals. The data were collected through a questionnaire survey done from July 2 to August 25, 2009. Data from 906 nurses were used for the final analysis. Data were analyzed using descriptive statistics, Cronbach's alpha coefficients, item analysis, factor analysis (principal component analysis, Varimax rotation) with the SPSS PC+ 17.0 for Windows program. Developing the instrument for measuring intellectual capital in hospital nursing organizations involved a literature review, development of preliminary items, and verification of validity and reliability. The final instrument was in a self-report form on a 5-point Likert scale. There were 29 items on human capital (5 domains), 21 items on customer capital (4 domains), 26 items on structure capital (4 domains). The results of this study may be useful to assess the levels of intellectual capital of hospital nursing organizations.
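
    The abstract above reports Cronbach's alpha coefficients but not how they were computed. As a minimal sketch of the standard formula only (not the SPSS procedure the authors used), the function below assumes complete responses, one row per respondent, and sample variances; the name `cronbach_alpha` and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes no missing responses and uses sample variances (n - 1 denominator).
    """
    k = len(rows[0])
    # Variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents answering two items on a 5-point Likert scale.
toy_responses = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy_responses)
```

    With these toy responses the two items move together perfectly, so alpha reaches its upper bound; real questionnaire data would give lower values.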

  11. Competency-based tool for evaluation of community-based training in undergraduate medical education in India - a Delphi approach.

    PubMed

    Shewade, Hemant Deepak; Jeyashree, Kathiresan; Kalaiselvi, Selvaraj; Palanivel, Chinnakali; Panigrahi, Krishna Chandra

    2017-01-01

    A community-based training (CBT) program, where teaching and training are carried out in the community outside of the teaching hospital, is a vital part of undergraduate medical education. Worldwide, there is a shift to competency-based training, and CBT is no exception. We attempted to develop a tool that uses a competency-based approach for assessment of CBT. Based on a review on competencies, we prepared a preliminary list of major domains with items under each domain. We used the Delphi technique to arrive at a consensus on this assessment tool. The Delphi panel consisted of eight purposively selected experts from the field of community medicine. The panel rated each item for its relevance, sensitivity, specificity, and understandability on a scale of 0-4. Median ratings were calculated at the end of each round and shared with the panel. Consensus was predefined as when 70% of the experts gave a rating of 3 or above for an item under relevance, sensitivity, and specificity. If an item failed to achieve consensus after being rated in 2 consecutive rounds, it was excluded. Anonymity of responses was maintained. The panel arrived at a consensus at the end of 3 rounds. The final version of the self-assessment tool consisted of 7 domains and 74 items. The domains (number of items) were Public health - epidemiology and research methodology (13), Public health - biostatistics (6), Public health administration at primary health center level (17), Family medicine (24), Cultural competencies (3), Community development and advocacy (2), and Generic competence (9). Each item was given a maximum score of 5 and minimum score of 1. This is the first study worldwide to develop a tool for competency-based evaluation of CBT in undergraduate medical education. The competencies identified in the 74-item questionnaire may provide the base for development of authentic curricula for CBT.
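
    The consensus rule described above can be sketched as a small check. This is an illustration under assumptions, not the authors' procedure: it reads the rule as requiring at least 70% of experts to rate an item 3 or above on each of relevance, sensitivity, and specificity separately, and the function name and example ratings are hypothetical.

```python
CRITERIA = ("relevance", "sensitivity", "specificity")


def reaches_consensus(ratings, min_rating=3, min_share=0.70):
    """Return True if an item meets the consensus rule on every criterion.

    `ratings` maps each criterion name to the panel's 0-4 ratings for one item.
    Assumption: the 70% threshold applies to each criterion separately.
    """
    for criterion in CRITERIA:
        scores = ratings[criterion]
        agreeing = sum(1 for score in scores if score >= min_rating)
        if agreeing / len(scores) < min_share:
            return False
    return True


# Hypothetical ratings from an eight-member panel for a single item.
item_ratings = {
    "relevance": [4, 4, 3, 3, 3, 4, 2, 1],    # 6 of 8 rate 3 or above (75%)
    "sensitivity": [3, 3, 3, 4, 4, 3, 3, 2],  # 7 of 8 (87.5%)
    "specificity": [3, 2, 2, 4, 3, 3, 1, 3],  # 5 of 8 (62.5%)
}
item_retained = reaches_consensus(item_ratings)
```

    In this example the item falls short on specificity, so under the rule described it would be rated again in the next round and excluded if it failed twice in a row.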

  12. The development and initial validation of a questionnaire to measure help-seeking behaviour in patients with new onset rheumatoid arthritis.

    PubMed

    Stack, Rebecca J; Mallen, Christian D; Deighton, Chris; Kiely, Patrick; Shaw, Karen L; Booth, Alison; Kumar, Kanta; Thomas, Susan; Rowan, Ian; Horne, Rob; Nightingale, Peter; Herron-Marx, Sandy; Jinks, Clare; Raza, Karim

    2015-12-01

    Early treatment for rheumatoid arthritis (RA) is vital. However, people often delay in seeking help at symptom onset. An assessment of the reasons behind patient delay is necessary to develop interventions to promote rapid consultation. Using a mixed methods design, we aimed to develop and test a questionnaire to assess the barriers to help seeking at RA onset. Questionnaire items were extracted from previous qualitative studies. Fifteen people with a lived experience of arthritis participated in focus groups to enhance the questionnaire's face validity. The questionnaire was also reviewed by groups of multidisciplinary health-care professionals. A test-retest survey of 41 patients with newly presenting RA or unclassified arthritis assessed the questionnaire items' intraclass correlations. During focus groups, participants rephrased questions, added questions and deleted items not relevant to the questionnaire's aims. Participants organized items into themes: early symptom experience, initial reactions to symptoms, self-management behaviours, causal beliefs, involvement of significant others, pre-diagnosis knowledge about RA, direct barriers to seeking help and relationship with GP. The test-retest survey identified seven items (out of 79) with low intraclass correlations which were removed from the final questionnaire. The involvement of people with a lived experience of arthritis and multidisciplinary health-care professionals in the preliminary validation of the DELAY (delays in evaluating arthritis early) questionnaire has enriched its development. Preliminary assessment established its reliability. The DELAY questionnaire provides a tool for researchers to evaluate individual, cultural and health service barriers to help-seeking behaviour at RA onset. © 2014 John Wiley & Sons Ltd.

  13. Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

    PubMed

    Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

    2017-11-01

    The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim of enhancing measurement precision. Here we present the results of the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than that of the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.

  14. Assessing Competencies Needed to Engage With Digital Health Services: Development of the eHealth Literacy Assessment Toolkit.

    PubMed

    Karnoe, Astrid; Furstrand, Dorthe; Christensen, Karl Bang; Norgaard, Ole; Kayser, Lars

    2018-05-10

    To achieve full potential in user-oriented eHealth projects, we need to ensure a match between the eHealth technology and the user's eHealth literacy, described as knowledge and skills. However, there is a lack of multifaceted eHealth literacy assessment tools suitable for screening purposes. The objective of our study was to develop and validate an eHealth literacy assessment toolkit (eHLA) that assesses individuals' health literacy and digital literacy using a mix of existing and newly developed scales. From 2011 to 2015, scales were continuously tested and developed in an iterative process, which led to 7 tools being included in the validation study. The eHLA validation version consisted of 4 health-related tools (tool 1: "functional health literacy," tool 2: "health literacy self-assessment," tool 3: "familiarity with health and health care," and tool 4: "knowledge of health and disease") and 3 digitally-related tools (tool 5: "technology familiarity," tool 6: "technology confidence," and tool 7: "incentives for engaging with technology") that were tested in 475 respondents from a general population sample and an outpatient clinic. Statistical analyses examined floor and ceiling effects, interitem correlations, item-total correlations, and Cronbach coefficient alpha (CCA). Rasch models (RM) examined the fit of data. Tools were reduced in items to secure robust tools fit for screening purposes. Reductions were made based on psychometrics, face validity, and content validity. Tool 1 was not reduced in items; it consequently consists of 10 items. The overall fit to the RM was acceptable (Anderson conditional likelihood ratio, CLR=10.8; df=9; P=.29), and CCA was .67. Tool 2 was reduced from 20 to 9 items. The overall fit to a log-linear RM was acceptable (Anderson CLR=78.4, df=45, P=.002), and CCA was .85. Tool 3 was reduced from 23 to 5 items. The final version showed excellent fit to a log-linear RM (Anderson CLR=47.7, df=40, P=.19), and CCA was .90. 
    Tool 4 was reduced from 12 to 6 items. The fit to a log-linear RM was acceptable (Anderson CLR=42.1, df=18, P=.001), and CCA was .59. Tool 5 was reduced from 20 to 6 items. The fit to the RM was acceptable (Anderson CLR=30.3, df=17, P=.02), and CCA was .94. Tool 6 was reduced from 5 to 4 items. The fit to a log-linear RM taking local dependency (LD) into account was acceptable (Anderson CLR=26.1, df=21, P=.20), and CCA was .91. Tool 7 was reduced from 6 to 4 items. The fit to a log-linear RM taking LD and differential item functioning into account was acceptable (Anderson CLR=23.0, df=29, P=.78), and CCA was .90. The eHLA consists of 7 short, robust scales that assess individuals' knowledge and skills related to digital literacy and health literacy. ©Astrid Karnoe, Dorthe Furstrand, Karl Bang Christensen, Ole Norgaard, Lars Kayser. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 10.05.2018.

  15. Development and Validation of a Novel Generic Health-related Quality of Life Instrument With 20 Items (HINT-20).

    PubMed

    Jo, Min-Woo; Lee, Hyeon-Jeong; Kim, Soo Young; Kim, Seon-Ha; Chang, Hyejung; Ahn, Jeonghoon; Ock, Minsu

    2017-01-01

    Few attempts have been made to develop a generic health-related quality of life (HRQoL) instrument and to examine its validity and reliability in Korea. We aimed to do this in our present study. After a literature review of existing generic HRQoL instruments, a focus group discussion, in-depth interviews, and expert consultations, we selected 30 tentative items for a new HRQoL measure. These items were evaluated by assessing their ceiling effects, difficulty, and redundancy in the first survey. To validate the HRQoL instrument that was developed, known-groups validity and convergent/discriminant validity were evaluated and its test-retest reliability was examined in the second survey. Of the 30 items originally assessed for the HRQoL instrument, four were excluded due to high ceiling effects and six were removed due to redundancy. We ultimately developed a HRQoL instrument with a reduced number of 20 items, known as the Health-related Quality of Life Instrument with 20 items (HINT-20), incorporating physical, mental, social, and positive health dimensions. The results of the HINT-20 for known-groups validity were poorer in women, the elderly, and those with a low income. For convergent/discriminant validity, the correlation coefficients of items (except vitality) in the physical health dimension with the physical component summary of the Short Form 36 version 2 (SF-36v2) were generally higher than the correlations of those items with the mental component summary of the SF-36v2, and vice versa. Regarding test-retest reliability, the intraclass correlation coefficient of the total HINT-20 score was 0.813 (p<0.001). A novel generic HRQoL instrument, the HINT-20, was developed for the Korean general population and showed acceptable validity and reliability.

  16. Mathematics Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Fraser, Graham, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from previous tests are made available to teachers for the construction of pretests or posttests, reference tests for inter-class comparisons and general assignments. The collection was reviewed for content…

  17. Agriculture Library of Test Items.

    ERIC Educational Resources Information Center

    Sutherland, Duncan, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…

  18. Semiparametric Item Response Functions in the Context of Guessing

    ERIC Educational Resources Information Center

    Falk, Carl F.; Cai, Li

    2016-01-01

    We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood-based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…

  19. Content validity--establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: part 1--eliciting concepts for a new PRO instrument.

    PubMed

    Patrick, Donald L; Burke, Laurie B; Gwaltney, Chad J; Leidy, Nancy Kline; Martin, Mona L; Molsen, Elizabeth; Ring, Lena

    2011-12-01

    The importance of content validity in developing patient reported outcomes (PRO) instruments is stressed by both the US Food and Drug Administration and the European Medicines Agency. Content validity is the extent to which an instrument measures the important aspects of concepts that developers or users purport it to assess. A PRO instrument measures the concepts most significant and relevant to a patient's condition and its treatment. For PRO instruments, items and domains as reflected in the scores of an instrument should be important to the target population and comprehensive with respect to patient concerns. Documentation of target population input in item generation, as well as evaluation of patient understanding through cognitive interviewing, can provide the evidence for content validity. Developing content for, and assessing respondent understanding of, newly developed PRO instruments for medical product evaluation will be discussed in this two-part ISPOR PRO Good Research Practices Task Force Report. Topics include the methods for generating items, documenting item development, coding of qualitative data from item generation, cognitive interviewing, and tracking item development through the various stages of research and preparing this tracking for submission to regulatory agencies. Part 1 covers elicitation of key concepts using qualitative focus groups and/or interviews to inform content and structure of a new PRO instrument. Part 2 covers the instrument development process, the assessment of patient understanding of the draft instrument using cognitive interviews and steps for instrument revision. The two parts are meant to be read together. They are intended to offer suggestions for good practices in planning, executing, and documenting qualitative studies that are used to support the content validity of PRO instruments to be used in medical product evaluation. 
    Copyright © 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  20. Validation of the MOS Social Support Survey 6-item (MOS-SSS-6) measure with two large population-based samples of Australian women.

    PubMed

    Holden, Libby; Lee, Christina; Hockey, Richard; Ware, Robert S; Dobson, Annette J

    2014-12-01

    This study aimed to validate a 6-item 1-factor global measure of social support developed from the Medical Outcomes Study Social Support Survey (MOS-SSS) for use in large epidemiological studies. Data were obtained from two large population-based samples of participants in the Australian Longitudinal Study on Women's Health. The two cohorts were aged 53-58 and 28-33 years at data collection (N = 10,616 and 8,977, respectively). Items selected for the 6-item 1-factor measure were derived from the factor structure obtained from unpublished work using an earlier wave of data from one of these cohorts. Descriptive statistics, including polychoric correlations, were used to describe the abbreviated scale. Cronbach's alpha was used to assess internal consistency and confirmatory factor analysis to assess scale validity. Concurrent validity was assessed using correlations between the new 6-item version and established 19-item version, and other concurrent variables. In both cohorts, the new 6-item 1-factor measure showed strong internal consistency and scale reliability. It had excellent goodness-of-fit indices, similar to those of the established 19-item measure. Both versions correlated similarly with concurrent measures. The 6-item 1-factor MOS-SSS measures global functional social support with fewer items than the established 19-item measure.

  1. Development and psychometric analysis of the Brief DSM-5 Alcohol Use Disorder Diagnostic Assessment: Towards effective diagnosis in college students.

    PubMed

    Hagman, Brett T

    2017-11-01

    The Diagnostic and Statistical Manual of Mental Disorders (5th edition) Alcohol Use Disorder (DSM-5 AUD) criteria have been modified to reflect a single, continuous disorder. It is critical that we develop brief assessment measures that can accurately assess DSM-5 AUD criteria in college students to assist in screening, referral, and brief intervention services implemented on college campuses. The present study sought to develop a brief 13-item measure designed to capture the full spectrum of the DSM-5 AUD criteria and to assess its psychometric properties in a sample of college students. Participants were past-year drinkers (N = 923) between the ages of 18 and 30 enrolled at 3 universities. Respondents completed a 30-min anonymous battery of questionnaires online. The Brief DSM-5 AUD Assessment consisted of 13 items designed to reflect the DSM-5 AUD criteria. Results indicated a high degree of internal consistency reliability with high item-to-scale correlations. Confirmatory factor analyses indicated that a dominant single factor emerged with good model fit. The Item Response Theory (IRT) analyses indicated that the difficulty parameters for each criterion were intermixed along the upper portion of the underlying AUD severity continuum, and the discrimination parameters were all high. Additional analysis indicated that those with a DSM-5 AUD had greater levels of alcohol and other drug use and problem severity in comparison to those without a DSM-5 AUD. Study findings provide empirical support for the reliability and validity of the Brief 13-item DSM-5 Assessment. It should be routinely incorporated into research and clinical practice efforts. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  2. Development and Validation of the Career Competencies Indicator (CCI)

    ERIC Educational Resources Information Center

    Francis-Smythe, Jan; Haase, Sandra; Thomas, Erica; Steele, Catherine

    2013-01-01

    This article describes the development and validation of the Career Competencies Indicator (CCI); a 43-item measure to assess career competencies (CCs). Following an extensive literature review, a comprehensive item generation process involving consultation with subject matter experts, a pilot study and a factor analytic study on a large sample…

  3. Factors Affecting Item Difficulty in English Listening Comprehension Tests

    ERIC Educational Resources Information Center

    Sung, Pei-Ju; Lin, Su-Wei; Hung, Pi-Hsia

    2015-01-01

    Task difficulty is a critical issue affecting test developers. Controlling or balancing the item difficulty of an assessment improves its validity and discrimination. Test developers construct tests from the cognitive perspective, by making the test constructing process more scientific and efficient; thus, the scores obtained more precisely…

  4. Comparison of Alternate and Original Items on the Montreal Cognitive Assessment.

    PubMed

    Lebedeva, Elena; Huang, Mei; Koski, Lisa

    2016-03-01

    The Montreal Cognitive Assessment (MoCA) is a screening tool for mild cognitive impairment (MCI) in elderly individuals. We hypothesized that measurement error when using the new alternate MoCA versions to monitor change over time could be related to the use of items that are not of comparable difficulty to their corresponding originals of similar content. The objective of this study was to compare the difficulty of the alternate MoCA items to the original ones. Five selected items from alternate versions of the MoCA were included with items from the original MoCA administered adaptively to geriatric outpatients (N = 78). Rasch analysis was used to estimate the difficulty level of the items. None of the five items from the alternate versions matched the difficulty level of their corresponding original items. This study demonstrates the potential benefits of a Rasch analysis-based approach for selecting items during the process of development of parallel forms. The results suggest that better match of the items from different MoCA forms by their difficulty would result in higher sensitivity to changes in cognitive function over time.

  5. Development and preliminary psychometric properties of a well-being index for medical students.

    PubMed

    Dyrbye, Liselotte N; Szydlo, Daniel W; Downing, Steven M; Sloan, Jeff A; Shanafelt, Tait D

    2010-01-27

    Psychological distress is common among medical students but manifests in a variety of forms. Currently, no brief, practical tool exists to simultaneously evaluate these domains of distress among medical students. The authors describe the development of a subject-reported assessment (Medical Student Well-Being Index, MSWBI) intended to screen for medical student distress across a variety of domains and examine its preliminary psychometric properties. Relevant domains of distress were identified, items generated, and a screening instrument formed using a process of literature review, nominal group technique, input from deans and medical students, and correlation analysis from previously administered assessments. Eleven experts judged the clarity, relevance, and representativeness of the items. A Content Validity Index (CVI) was calculated. Interrater agreement was assessed using pair-wise percent agreement adjusted for chance agreement. Data from 2248 medical students who completed the MSWBI along with validated full-length instruments assessing domains of interest was used to calculate reliability and explore internal structure validity. Burnout (emotional exhaustion and depersonalization), depression, mental quality of life (QOL), physical QOL, stress, and fatigue were domains identified for inclusion in the MSWBI. Six of 7 items received item CVI-relevance and CVI-representativeness of ≥0.82. Overall scale CVI-relevance and CVI-representativeness was 0.94 and 0.91. Overall pair-wise percent agreement between raters was ≥85% for clarity, relevance, and representativeness. Cronbach's alpha was 0.68. Item by item percent pair-wise agreements and Phi were low, suggesting little overlap between items. The majority of MSWBI items had a ≥74% sensitivity and specificity for detecting distress within the intended domain. The results of this study provide evidence of reliability and content-related validity of the MSWBI. Further research is needed to assess remaining psychometric properties and establish scores for which intervention is warranted.

  6. Development and preliminary psychometric properties of a well-being index for medical students

    PubMed Central

    2010-01-01

    Background Psychological distress is common among medical students but manifests in a variety of forms. Currently, no brief, practical tool exists to simultaneously evaluate these domains of distress among medical students. The authors describe the development of a subject-reported assessment (Medical Student Well-Being Index, MSWBI) intended to screen for medical student distress across a variety of domains and examine its preliminary psychometric properties. Methods Relevant domains of distress were identified, items generated, and a screening instrument formed using a process of literature review, nominal group technique, input from deans and medical students, and correlation analysis from previously administered assessments. Eleven experts judged the clarity, relevance, and representativeness of the items. A Content Validity Index (CVI) was calculated. Interrater agreement was assessed using pair-wise percent agreement adjusted for chance agreement. Data from 2248 medical students who completed the MSWBI along with validated full-length instruments assessing domains of interest was used to calculate reliability and explore internal structure validity. Results Burnout (emotional exhaustion and depersonalization), depression, mental quality of life (QOL), physical QOL, stress, and fatigue were domains identified for inclusion in the MSWBI. Six of 7 items received item CVI-relevance and CVI-representativeness of ≥0.82. Overall scale CVI-relevance and CVI-representativeness was 0.94 and 0.91. Overall pair-wise percent agreement between raters was ≥85% for clarity, relevance, and representativeness. Cronbach's alpha was 0.68. Item by item percent pair-wise agreements and Phi were low, suggesting little overlap between items. The majority of MSWBI items had a ≥74% sensitivity and specificity for detecting distress within the intended domain. Conclusions The results of this study provide evidence of reliability and content-related validity of the MSWBI. 
    Further research is needed to assess remaining psychometric properties and establish scores for which intervention is warranted. PMID:20105312

  7. Development and validation of a scale for mouth handicap in systemic sclerosis: the Mouth Handicap in Systemic Sclerosis scale

    PubMed Central

    Mouthon, L; Rannou, F; Bérezné, A; Pagnoux, C; Arène, J‐P; Foïs, E; Cabane, J; Guillevin, L; Revel, M; Fermanian, J; Poiraudeau, S

    2007-01-01

    Objective: To develop and assess the reliability and construct validity of a scale assessing disability involving the mouth in systemic sclerosis (SSc). Methods: We generated a 34‐item provisional scale from mailed responses of patients (n = 74), expert consensus (n = 10) and literature analysis. A total of 71 other SSc patients were recruited. The test–retest reliability was assessed using the intraclass correlation coefficient and divergent validity using the Spearman correlation coefficient. Factor analysis followed by varimax rotation was performed to assess the factorial structure of the scale. Results: The item reduction process retained 12 items with 5 levels of answers (total score range 0–48). The mean total score of the scale was 20.3 (SD 9.7). The test–retest reliability was 0.96. Divergent validity was confirmed for global disability (Health Assessment Questionnaire (HAQ), r = 0.33), hand function (Cochin Hand Function Scale, r = 0.37), inter‐incisor distance (r = −0.34), handicap (McMaster‐Toronto Arthritis questionnaire (MACTAR), r = 0.24), depression (Hospital Anxiety and Depression (HAD); HADd, r = 0.26) and anxiety (HADa, r = 0.17). Factor analysis extracted 3 factors with eigenvalues of 4.26, 1.76 and 1.47, explaining 63% of the variance. These 3 factors could be clinically characterised. The first factor (5 items) represents handicap induced by the reduction in mouth opening, the second (5 items) handicap induced by sicca syndrome and the third (2 items) aesthetic concerns. Conclusion: We propose a new scale, the Mouth Handicap in Systemic Sclerosis (MHISS) scale, which has excellent reliability and good construct validity, and specifically assesses disability involving the mouth in patients with SSc. PMID:17502364

  8. Development and evaluation of a brief screener to estimate fast-food and beverage consumption among adolescents.

    PubMed

    Nelson, Melissa C; Lytle, Leslie A

    2009-04-01

    Sweetened beverage and fast-food intake have been identified as important targets for obesity prevention. However, there are few brief dietary assessment tools available to evaluate these behaviors among adolescents. The objective of this research was to examine reliability and validity of a 22-item dietary screener assessing adolescent consumption of specific energy-containing and non-energy-containing beverages (nine items) and fast food (13 items). The screener was administered to adolescents (ages 11 to 18 years) recruited from the Minneapolis/St Paul, MN, metro region. One sample of adolescents completed test-retest reliability of the screener (n=33, primarily white adolescents). Another adolescent sample completed the screener along with three 24-hour dietary recalls to assess criterion validity (n=59 white adolescents). Test-retest assessments were completed approximately 7 to 14 days apart, and agreement between the two administrations of the screener was substantial, with most items yielding Spearman correlations and kappa statistics that were >0.60. When compared to the gold standard dietary recall data, findings indicate that the validity of the screener items assessing adolescents' intake of regular soda, sports drinks, milk, and water was fair. However, the differential assessment periods captured by the two methods (ie, 1 month for the screener vs 3 days for the recalls) posed challenges in analysis and made it impossible to assess the validity of some screener items. Overall while these screener items largely represent reliable measures with fair validity, our findings highlight the challenges inherent in the validation of brief dietary assessment tools.

  9. Functional recovery in patients with schizophrenia: recommendations from a panel of experts.

    PubMed

    Lahera, Guillermo; Gálvez, José L; Sánchez, Pedro; Martínez-Roig, Miguel; Pérez-Fuster, J V; García-Portilla, Paz; Herrera, Berta; Roca, Miquel

    2018-06-05

    The management of schizophrenia is evolving towards a more comprehensive model based on functional recovery. The concept of functional recovery goes beyond clinical remission and encompasses multiple aspects of the patient's life, making it difficult to settle on a definition and to develop reliable assessment criteria. In this consensus process based on a panel of experts in schizophrenia, we aimed to provide useful insights on functional recovery and its involvement in clinical practice and clinical research. After a literature review of functional recovery in schizophrenia, a scientific committee of 8 members prepared a 75-item questionnaire, including 6 sections: (I) the concept of functional recovery (9 items), (II) assessment of functional recovery (23 items), (III) factors influencing functional recovery (16 items), (IV) psychosocial interventions and functional recovery (8 items), (V) pharmacological treatment and functional recovery (14 items), and (VI) the perspective of patients and their relatives on functional recovery (5 items). The questionnaire was sent to a panel of 53 experts, who rated each item on a 9-point Likert scale. Consensus was achieved in a 2-round Delphi process, using the median (interquartile range) scores to consider consensus in either agreement (scores 7-9) or disagreement (scores 1-3). Items not achieving consensus in the first round were sent back to the experts for a second consideration. After the two recursive rounds, consensus was achieved in 64 items (85.3%): 61 items (81.3%) in agreement and 3 (4.0%) in disagreement, all of them from section II (assessment of functional recovery). Items not reaching consensus were related to the concepts of functional recovery (1 item, 1.3%), functional assessment (5 items, 6.7%), factors influencing functional recovery (3 items, 4.0%), and psychosocial interventions (2 items, 5.6%). Despite the lack of a well-defined concept of functional recovery, we identified a trend towards a common archetype of the definition and factors associated with functional recovery, as well as its applicability in clinical practice and clinical research.
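
    The consensus rule described in this abstract (median and interquartile range of 9-point ratings falling in 7-9 for agreement or 1-3 for disagreement) can be sketched as follows. This is a minimal illustration, not the panel's actual procedure: the abstract does not state the exact rule, so the sketch assumes that consensus requires the whole interquartile range to lie inside the band, and the function name `delphi_consensus` is hypothetical.

```python
from statistics import quantiles


def delphi_consensus(ratings):
    """Classify one item from expert ratings on a 1-9 Likert scale.

    Assumed rule (not confirmed by the abstract): the item reaches
    consensus when the full interquartile range lies inside the band,
    which implies the median does too.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    # Quartile cut points, treating the ratings as the whole panel
    q1, _median, q3 = quantiles(ratings, n=4, method="inclusive")
    if q1 >= 7:
        return "agreement"
    if q3 <= 3:
        return "disagreement"
    return "no consensus"


if __name__ == "__main__":
    print(delphi_consensus([7, 8, 8, 9, 9, 8, 7]))
    print(delphi_consensus([2, 5, 8, 9, 4]))
```

    Items returning "no consensus" would, under this reading, be the ones sent back to the panel for the second round.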

  10. Development of an Easy-to-Use Tool for the Assessment of Emergency Department Physical Design.

    PubMed

    Majidi, Alireza; Tabatabaey, Ali; Motamed, Hassan; Motamedi, Maryam; Forouzanfar, Mohammad Mehdi

    2014-01-01

    Physical design of the emergency department (ED) has an important effect on its role and function. To date, no guidelines have been introduced to set the standards for the construction of EDs in Iran. In this study, we aim to devise an easy-to-use tool based on the available literature and expert opinion for the quick and effective assessment of EDs with regard to their physical design. For this purpose, based on current literature on emergency design, a comprehensive checklist was developed. Then, this checklist was analyzed by a panel consisting of heads of three major EDs, and conflicting items were resolved. 178 crude items were derived from available literature. The items were categorized into three major domains of Physical space, Equipment, and Accessibility. The final checklist approved by the panel consisted of 163 items categorized into six domains. Each item was phrased as a "Yes or No" question for ease of analysis, meaning that the criterion is either met or not.

  11. Development and psychometric evaluation of the Premarital Sexual Behavior Assessment Scale for Young Women (PSAS-YW): an exploratory mixed method study.

    PubMed

    Rahmani, Azam; Merghati-Khoei, Effat; Moghadam-Banaem, Lida; Hajizadeh, Ebrahim; Hamdieh, Mostafa; Montazeri, Ali

    2014-06-13

    Premarital sexual behaviors are an important issue for women's health. The present study was designed to develop and examine the psychometric properties of a scale in order to identify young women who are at greater risk of premarital sexual behavior. This was an exploratory mixed method investigation. The study was conducted in two phases. In the first phase, qualitative methods (focus group discussion and individual interview) were applied to generate items and develop the questionnaire. In the second phase, psychometric properties (validity and reliability) of the questionnaire were assessed. In the first phase an item pool containing 53 statements related to premarital sexual behavior was generated. In the second phase item reduction was applied and the final version of the questionnaire containing 26 items was developed. The psychometric properties of this final version were assessed and the results showed that the instrument has good structure and reliability. The results from exploratory factor analysis indicated a 5-factor solution for the instrument that jointly accounted for 57.4% of the variance observed. The Cronbach's alpha coefficient for the instrument was found to be 0.87. This study provided a valid and reliable scale to identify premarital sexual behavior in young women. Assessment of premarital sexual behavior might help to improve women's sexual abstinence.

  12. Development and psychometric evaluation of the Premarital Sexual Behavior Assessment Scale for Young Women (PSAS-YW): an exploratory mixed method study

    PubMed Central

    2014-01-01

    Background Premarital sexual behaviors are an important issue for women’s health. The present study was designed to develop and examine the psychometric properties of a scale in order to identify young women who are at greater risk of premarital sexual behavior. Method This was an exploratory mixed method investigation. The study was conducted in two phases. In the first phase, qualitative methods (focus group discussion and individual interview) were applied to generate items and develop the questionnaire. In the second phase, psychometric properties (validity and reliability) of the questionnaire were assessed. Results In the first phase an item pool containing 53 statements related to premarital sexual behavior was generated. In the second phase item reduction was applied and the final version of the questionnaire containing 26 items was developed. The psychometric properties of this final version were assessed and the results showed that the instrument has good structure and reliability. The results from exploratory factor analysis indicated a 5-factor solution for the instrument that jointly accounted for 57.4% of the variance observed. The Cronbach’s alpha coefficient for the instrument was found to be 0.87. Conclusion This study provided a valid and reliable scale to identify premarital sexual behavior in young women. Assessment of premarital sexual behavior might help to improve women’s sexual abstinence. PMID:24924696

  13. Development, sensibility, and reliability of the Toronto Axial Spondyloarthritis Questionnaire in inflammatory bowel disease.

    PubMed

    Alnaqbi, Khalid A; Touma, Zahi; Passalent, Laura; Johnson, Sindhu R; Tomlinson, George A; Carty, Adele; Inman, Robert D

    2013-10-01

    There is an unacceptable delay in the diagnosis of axial spondyloarthritis (axSpA) in its early stages among patients at high risk, in particular those with inflammatory bowel disease (IBD). Our objectives were to develop a sensible and reliable questionnaire to identify undetected axSpA among patients with IBD. Literature was reviewed for item generation in the Toronto axSpA Questionnaire on IBD (TASQ-IBD). Sensibility of the questionnaire was assessed among healthcare professionals and patients. This assessment was related to purpose and framework (clinical function, clinical justification, and clinical applicability), face validity, comprehensiveness [oligo-variability (limiting the questionnaire to important items) and transparency], replicability, content validity, and feasibility. The test-retest reliability study was administered to 77 patients with established IBD and axSpA. Kappa agreement coefficients and absolute agreement were calculated for items. Three domains included IBD, inflammatory back symptoms, and extraaxial features. The entry criterion required a patient to have IBD and back pain or stiffness that ever persisted for ≥ 3 months. Iterative sensibility assessment involved 16 items and a diagram of the back. Kappa coefficients ranged from 0.81-1.00 for each item. Absolute agreement across all items ranged from 91% to 100%. TASQ-IBD is a newly developed, sensible, and reliable case-finding questionnaire to be administered to patients with IBD who have ever had chronic back pain or stiffness persisting for ≥ 3 months. It should facilitate identification and timely referral of patients with IBD to rheumatologists and minimize the delay in diagnosis of axSpA. Consequently, it should assess the prevalence of axSpA in IBD.

  14. Development and Validation of a Fatigue Assessment Scale for U.S. Construction Workers

    PubMed Central

    Zhang, Mingzong; Sparer, Emily H.; Murphy, Lauren A.; Dennerlein, Jack T.; Fang, Dongping; Katz, Jeffrey N.; Caban-Martinez, Alberto J.

    2015-01-01

    Objective To develop a fatigue assessment scale and test its reliability and validity for commercial construction workers. Methods Using a two-phased approach, we first identified items for the development of a Fatigue Assessment Scale for Construction Workers (FASCW) through review of existing scales in the scientific literature, key informant interviews (n=11) and focus groups (3 groups with 6 workers each) with construction workers. The second phase included assessment of the reliability, validity and sensitivity of the new scale using a repeated-measures study design with a convenience sample of construction workers (n=144). Results Phase one resulted in a 16-item preliminary scale that after factor analysis yielded a final 10-item scale with two sub-scales (“Lethargy” and “Bodily Ailment”). During phase two, the FASCW and its subscales demonstrated satisfactory internal consistency (alpha coefficients were FASCW (0.91), Lethargy (0.86) and Bodily Ailment (0.84)) and acceptable test-retest reliability (Pearson Correlation Coefficients: 0.59–0.68; Intraclass Correlation Coefficients: 0.74–0.80). Correlation analysis substantiated concurrent and convergent validity. A discriminant analysis demonstrated that the FASCW differentiated between groups with arthritis status and different work hours. Conclusions The 10-item FASCW with good reliability and validity is an effective tool for assessing the severity of fatigue among construction workers. PMID:25603944
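
    Several records in this list report internal consistency as a Cronbach's alpha coefficient. As a hedged illustration of what that statistic measures, the sketch below computes alpha from a respondents-by-items score table using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are assumptions for illustration and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    # Sample variance of each item (column) across respondents
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Toy data: 4 respondents, 2 items (illustrative only)
    print(cronbach_alpha([[1, 2], [2, 1], [3, 4], [4, 3]]))
```

    The function raises a ZeroDivisionError when every respondent has the same total score, since alpha is undefined in that case.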

  15. The development of the lunchtime enjoyment of activity and play questionnaire.

    PubMed

    Hyndman, Brendon; Telford, Amanda; Finch, Caroline; Ullah, Shahid; Benson, Amanda C

    2013-04-01

    Enjoyment of physical activity is an important determinant of children's participation in physical activity. Despite this, there is an absence of reliable measures for assessing children's enjoyment of play activities during school lunchtime. The purpose of this study was to develop and assess the reliability of the Lunchtime Enjoyment of Activity and Play (LEAP) Questionnaire. Questionnaire items were categorized employing a social-ecological framework including intrapersonal (20 items), interpersonal (2 items), and physical environment/policy (17 items) components to identify the broader influences on children's enjoyment. An identical questionnaire was administered on 2 occasions, 10 days apart, to 176 children aged 8-12 years, attending a government elementary school in regional Victoria, Australia. Test-retest reliability confirmed that 35 of 39 LEAP Questionnaire items had at least moderate kappa agreement ranging from .44 to .78. Although 4 individual kappa values were low, median kappa scores for each aggregated social-ecological component reached at least moderate agreement (.44-.60). This study confirms the LEAP Questionnaire to be a reliable, context-specific instrument with sound content and face validity that employs a social-ecological framework to assess children's enjoyment of school play and lunchtime activities. © 2013, American School Health Association.
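
    Test-retest agreement on categorical items, as in this record, is commonly summarized with Cohen's kappa, which corrects observed agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal unweighted version. Whether the study used weighted or unweighted kappa is not stated in the abstract, and the function name `cohen_kappa` is hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa between two administrations of one item."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty response lists of equal length")
    n = len(first)
    # Observed proportion of identical answers
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal category frequencies
    count_first, count_second = Counter(first), Counter(second)
    p_expected = sum(count_first[c] * count_second[c] for c in count_first) / (n * n)
    if p_expected == 1.0:
        # Everyone gave the same single answer both times (assumed convention)
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```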

  16. A checklist to assess the quality of reports on spa therapy and balneotherapy trials was developed using the Delphi consensus method: the SPAC checklist.

    PubMed

    Kamioka, Hiroharu; Kawamura, Yoichi; Tsutani, Kiichiro; Maeda, Masaharu; Hayasaka, Shinya; Okuizum, Hiroyasu; Okada, Shinpei; Honda, Takuya; Iijima, Yuichi

    2013-08-01

    The purpose of this study was to develop a checklist of items that describes and measures the quality of reports of interventional trials assessing spa therapy. The Delphi consensus method was used to select the number of items in the checklist. A total of eight individuals participated, including an epidemiologist, a clinical research methodologist, clinical researchers, a medical journalist, and a health fitness programmer. Participants ranked on a 9-point Likert scale whether an item should be included in the checklist. Three rounds of the Delphi method were conducted to achieve consensus. The final checklist contained 19 items, with items related to title, place of implementation (specificity of spa), care provider influence, and additional measures to minimize the potential bias from withdrawals, loss to follow-up, and low treatment adherence. This checklist is simple and quick to complete, and should help clinicians and researchers critically appraise the medical and healthcare literature, reviewers assess the quality of reports included in systematic reviews, and researchers plan interventional trials of spa therapy. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. Examination of the Assumptions and Properties of the Graded Item Response Model: An Example Using a Mathematics Performance Assessment.

    ERIC Educational Resources Information Center

    Lane, Suzanne; And Others

    1995-01-01

    Over 5,000 students participated in a study of the dimensionality and stability of the item parameter estimates of a mathematics performance assessment developed for the Quantitative Understanding: Amplifying Student Achievement and Reasoning (QUASAR) Project. Results demonstrate the test's dimensionality and illustrate ways to examine use of the…

  18. Critical Ethical Issues in Online Counseling: Assessing Current Practices with an Ethical Intent Checklist

    ERIC Educational Resources Information Center

    Shaw, Holly E.; Shaw, Sarah F.

    2006-01-01

    The authors used a 16-item Ethical Intent Checklist, developed from the American Counseling Association's (1999) Ethical Standards for Internet Online Counseling, to assess the current practices of 88 online counseling Web sites. Results showed fewer than half of online counselors were following the accepted practice on 8 of the 16 items. Online…

  19. Development of a scale to assess Hwa-Byung, a Korean culture-bound syndrome, using the Korean MMPI-2.

    PubMed

    Roberts, Miguel E; Han, Kyunghee; Weed, Nathan C

    2006-09-01

    This study documents the development of an MMPI-2 scale designed to assess features of the Korean culture-bound syndrome, Hwa-Byung (HB). An American research team and psychiatric practitioners in Korea created an 18-item HB scale via rational item selection and psychometric refinement. Principal components analysis of scale items revealed four components, reflecting content domains of general health, gastrointestinal symptoms, hopelessness, and anger. This four-component solution applied well to both Korean men and women, but not to an American sample. Although some findings were encouraging, future studies employing clinical samples are needed to provide further validation of this scale.

  20. Using a MaxEnt Classifier for the Automatic Content Scoring of Free-Text Responses

    NASA Astrophysics Data System (ADS)

    Sukkarieh, Jana Z.

    2011-03-01

    Criticisms against multiple-choice item assessments in the USA have prompted researchers and organizations to move towards constructed-response (free-text) items. Constructed-response (CR) items pose many challenges to the education community, one of which is that they are expensive to score by humans. At the same time, there has been widespread movement towards computer-based assessment and hence, assessment organizations are competing to develop automatic content scoring engines for such item types, which we view as a textual entailment task. This paper describes how MaxEnt Modeling is used to help solve the task. MaxEnt has been used in many natural language tasks but this is the first application of the MaxEnt approach to textual entailment and automatic content scoring.

  1. Assessing Psycho-social Barriers to Rehabilitation in Injured Workers with Chronic Musculoskeletal Pain: Development and Item Properties of the Yellow Flag Questionnaire (YFQ).

    PubMed

    Salathé, Cornelia Rolli; Trippolini, Maurizio Alen; Terribilini, Livio Claudio; Oliveri, Michael; Elfering, Achim

    2018-06-01

    Purpose To develop a multidimensional scale to assess psychosocial beliefs, the Yellow Flag Questionnaire (YFQ), aimed at guiding interventions for workers with chronic musculoskeletal (MSK) pain. Methods Phase 1 consisted of item selection based on literature search, item development and expert consensus rounds. In phase 2, items were reduced by calculating a quality score per item, using structural equation modeling and confirmatory factor analysis on data from 666 workers. In phase 3, Cronbach's α and Pearson correlation coefficients were computed to compare the YFQ score with disability, anxiety, depression and self-efficacy, based on data from 253 injured workers. Regressions of YFQ total score on disability, anxiety, depression and self-efficacy were calculated. Results After phase 1, the YFQ included 116 items and 15 domains. Further reductions of items in phase 2 by applying the item quality criteria reduced the total to 48 items. Phase factor analysis with structural equation modeling confirmed 32 items in seven domains: activity, work, emotions, harm & blame, diagnosis beliefs, co-morbidity and control. Cronbach's α was 0.91 for the total score and between 0.49 and 0.81 for the 7 domain scores. Correlations of the YFQ total score with disability, anxiety, depression and self-efficacy were .58, .66, .73, and -.51, respectively. After controlling for age and gender the YFQ total score explained between 27% and 53% of the variance (R2) in disability, anxiety, depression and self-efficacy. Conclusions The YFQ, a multidimensional screening scale, is recommended for use to assess psychosocial beliefs of workers with chronic MSK pain. Further evaluation of the measurement properties such as the test-retest reliability, responsiveness and prognostic validity is warranted.

  2. An assessment of functioning and non-functioning distractors in multiple-choice questions: a descriptive analysis.

    PubMed

    Tarrant, Marie; Ware, James; Mohammed, Ahmed M

    2009-07-07

    Four- or five-option multiple choice questions (MCQs) are the standard in health-science disciplines, both on certification-level examinations and on in-house developed tests. Previous research has shown, however, that few MCQs have three or four functioning distractors. The purpose of this study was to investigate non-functioning distractors in teacher-developed tests in one nursing program in an English-language university in Hong Kong. Using item-analysis data, we assessed the proportion of non-functioning distractors on a sample of seven test papers administered to undergraduate nursing students. A total of 514 items were reviewed, including 2056 options (1542 distractors and 514 correct responses). Non-functioning options were defined as ones that were chosen by fewer than 5% of examinees and those with a positive option discrimination statistic. The proportion of items containing 0, 1, 2, and 3 functioning distractors was 12.3%, 34.8%, 39.1%, and 13.8% respectively. Overall, items contained an average of 1.54 (SD = 0.88) functioning distractors. Only 52.2% (n = 805) of all distractors were functioning effectively and 10.2% (n = 158) had a choice frequency of 0. Items with more functioning distractors were more difficult and more discriminating. The low frequency of items with three functioning distractors in the four-option items in this study suggests that teachers have difficulty developing plausible distractors for most MCQs. Test items should consist of as many options as is feasible given the item content and the number of plausible distractors; in most cases this would be three. Item analysis results can be used to identify and remove non-functioning distractors from MCQs that have been used in previous tests.
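
    The screening rule in this abstract (an option is non-functioning when fewer than 5% of examinees choose it or when its discrimination statistic is positive) can be sketched for a single item as below. The abstract does not say how option discrimination was computed, so this sketch assumes a simple index: mean total score of examinees who chose the option minus the mean of those who did not. The function name `functioning_distractors` and the sample data are hypothetical.

```python
from statistics import mean


def functioning_distractors(choices, scores, key, options, min_freq=0.05):
    """Return the distractors of one item that count as functioning.

    choices: option chosen by each examinee; scores: their total test scores;
    key: the correct option; options: all options offered for the item.
    """
    n = len(choices)
    functioning = []
    for option in options:
        if option == key:
            continue  # the correct answer is not a distractor
        chose = [s for c, s in zip(choices, scores) if c == option]
        others = [s for c, s in zip(choices, scores) if c != option]
        if len(chose) / n < min_freq:
            continue  # chosen by fewer than 5% of examinees
        # Assumed discrimination index: choosers' mean minus non-choosers' mean
        discrimination = mean(chose) - mean(others) if others else 0.0
        if discrimination > 0:
            continue  # attracts stronger examinees, so it misleads
        functioning.append(option)
    return functioning


if __name__ == "__main__":
    choices = ["A"] * 20 + ["B"] * 12 + ["C"] * 7 + ["D"] * 1
    scores = [9] * 20 + [4] * 12 + [10] * 7 + [3] * 1
    print(functioning_distractors(choices, scores, key="A", options="ABCD"))
```

    In the toy data, option D fails the frequency threshold and option C is chosen mainly by high scorers, leaving B as the only functioning distractor.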

  3. [Development of an Instrument to Assess the Quality of Childbirth Care from the Mother's Perspective].

    PubMed

    Jeong, Geum Hee; Kim, Hyun Kyoung; Kim, Young Hee; Kim, Sun Hee; Lee, Sun Hee; Kim, Kyung Won

    2018-02-01

    This study aimed to develop an instrument to assess the quality of childbirth care from the perspective of a mother after delivery. The instrument was developed from a literature review, interviews, and item validation. Thirty-eight items were compiled for the instrument. The data for validity and reliability testing were collected using a questionnaire survey conducted on 270 women who had undergone normal vaginal delivery in Korea and analyzed with descriptive statistics, exploratory factor analysis, and reliability coefficients. The exploratory factor analysis reduced the number of items in the instrument to 28 items that were factored into four subscales: family-centered care, personal care, emotional empowerment, and information provision. With respect to convergence validation, there was positive correlation between this instrument and birth satisfaction scale (r=.34, p<.001). The internal consistency reliability was acceptable (Cronbach's alpha =.96). This instrument could be used as a measure of the quality of nursing care for women who have a normal vaginal delivery. © 2018 Korean Society of Nursing Science.

  4. Development of the Preverbal Visual Assessment (PreViAs) questionnaire.

    PubMed

    Pueyo, Victoria; García-Ormaechea, Inés; González, Inmaculada; Ferrer, Concepción; de la Mata, Guillermo; Duplá, María; Orós, Pedro; Andres, Eva

    2014-04-01

    Visual cognitive functions of preverbal infants are evaluated by means of a behavioral assessment. Parents or primary caregivers may be appropriate to certify the acquisition of certain abilities. To develop the PreViAs (Preverbal Visual Assessment) questionnaire to assess visual behavior of infants under 24 months of age and to assess the normative outcomes for each item at each age. The process was divided into three phases: scale development (items and domains generation), pilot testing, and exploratory analysis. The final version of the PreViAs questionnaire consisted of 30 items, each related to one or more of four domains (visual attention, visual communication, visual-motor coordination, and visual processing). For the exploratory analysis, 298 children (159 boys and 139 girls) were recruited. Their ages ranged from 0.1 to 24 months (mean, 11.2 months). Internal consistency of the questionnaire was high for all domains (Cronbach's α coefficients of 0.85-0.94). The PreViAs questionnaire is a useful scale for assessing visual cognitive abilities of infants under 24 months of age. It is easy and feasible to complete by primary caregivers. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Development and psychometric testing of the childhood obesity perceptions (COP) survey among African American caregivers: A tool for obesity prevention program planning.

    PubMed

    Alexander, Dayna S; Alfonso, Moya L; Cao, Chunhua

    2016-12-01

    Currently, public health practitioners are analyzing the role that caregivers play in childhood obesity efforts. Assessing African American caregivers' perceptions of childhood obesity in rural communities is an important prevention effort. This article's objective is to describe the development and psychometric testing of a survey tool to assess childhood obesity perceptions among African American caregivers in a rural setting, which can be used for obesity prevention program development or evaluation. The Childhood Obesity Perceptions (COP) survey was developed to reflect the multidimensional nature of childhood obesity including risk factors, health complications, weight status, built environment, and obesity prevention strategies. A 97-item survey was pretested and piloted with the priority population. After pretesting and piloting, the survey was reduced to 59 items and administered to 135 African American caregivers. An exploratory factor analysis (EFA) and a confirmatory factor analysis (CFA) were conducted to test how well the survey items represented the number of Social Cognitive Theory constructs. Twenty items were removed from the original 59-item survey and acceptable internal consistency of the six factors (α=0.70-0.85) was documented for all scales in the final COP instrument. CFA resulted in a less than adequate fit; however, a multivariate Lagrange multiplier test identified modifications to improve the model fit. The COP survey represents a promising approach as a potentially comprehensive assessment for implementation or evaluation of childhood obesity programs. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. Development of a questionnaire to evaluate patients’ awareness of cardiovascular disease risk in England’s National Health Service Health Check preventive cardiovascular programme

    PubMed Central

    Woringer, Maria; Nielsen, Jessica Jones; Zibarras, Lara; Evason, Julie; Harris, Matthew; Majeed, Azeem; Soljak, Michael

    2017-01-01

    Background The National Health Service (NHS) Health Check is a cardiovascular disease (CVD) risk assessment and management programme in England aiming to increase CVD risk awareness among people at increased risk of CVD. There is no tool to assess the effectiveness of the programme in communicating CVD risk to patients. Aims The aim of this paper was to develop a questionnaire examining patients’ CVD risk awareness for use in health service research evaluations of the NHS Health Check programme. Methods We developed an 85-item questionnaire to determine patients’ views of their risk of CVD. The questionnaire was based on a review of the relevant literature. After review by an expert panel and focus group discussion, 22 items were dropped and 2 new items were added. The resulting 65-item questionnaire with satisfactory content validity (content validity indices≥0.80) and face validity was tested on 110 NHS Health Check attendees in primary care in a cross-sectional study between 21 May 2014 and 28 July 2014. Results Following analyses of data, we reduced the questionnaire from 65 to 26 items. The 26-item questionnaire constitutes four scales: Knowledge of CVD Risk and Prevention, Perceived Risk of Heart Attack/Stroke, Perceived Benefits and Intention to Change Behaviour and Healthy Eating Intentions. Perceived Risk (Cronbach’s α=0.85) and Perceived Benefits and Intention to Change Behaviour (Cronbach’s α=0.82) have satisfactory reliability (Cronbach’s α≥0.70). Healthy Eating Intentions (Cronbach’s α=0.56) is below minimum threshold for reliability but acceptable for a three-item scale. Conclusions The resulting questionnaire, with satisfactory reliability and validity, may be used in assessing patients’ awareness of CVD risk among NHS Health Check attendees. PMID:28947435

  7. Geography Library of Test Items. Volume Four.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  8. Home Science Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Smith, Jan, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…

  9. Languages Library of Test Items. Volume Two: German, Latin.

    ERIC Educational Resources Information Center

    Campbell, Thomas; And Others

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  10. Languages Library of Test Items. Volume One: French, Indonesian.

    ERIC Educational Resources Information Center

    Campbell, Thomas; And Others

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  11. Geography Library of Test Items. Volume Three.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  12. Commerce Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Meeve, Brian, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  13. Geography Library of Test Items. Volume Five.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  14. Textiles and Design Library of Test Items. Volume I.

    ERIC Educational Resources Information Center

    Smith, Jan, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…

  15. Commerce Library of Test Items. Volume Two.

    ERIC Educational Resources Information Center

    Meeve, Brian, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  16. An Application of Reverse Engineering to Automatic Item Generation: A Proof of Concept Using Automatically Generated Figures

    ERIC Educational Resources Information Center

    Lorié, William A.

    2013-01-01

    A reverse engineering approach to automatic item generation (AIG) was applied to a figure-based publicly released test item from the Organisation for Economic Cooperation and Development (OECD) Programme for International Student Assessment (PISA) mathematical literacy cognitive instrument as part of a proof of concept. The author created an item…

  17. Semi-Parametric Item Response Functions in the Context of Guessing. CRESST Report 844

    ERIC Educational Resources Information Center

    Falk, Carl F.; Cai, Li

    2015-01-01

    We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…
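
    For orientation, the three-parameter logistic (3PL) model named in this record can be sketched as below, together with one possible reading of "a logistic function of a monotonic polynomial with a lower asymptote". The function names and the example polynomial are illustrative assumptions; the report's actual parameterization and estimation procedure are not reproduced here.

```python
import math


def p_3pl(theta, a, b, c):
    """Standard 3PL item response function.

    theta: latent trait, a: discrimination, b: difficulty,
    c: lower asymptote (guessing parameter).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def p_monotonic_poly(theta, m, c):
    """Hypothetical generalization: the linear term a * (theta - b) is
    replaced by m(theta), any monotonically increasing polynomial
    supplied by the caller (an assumption for illustration)."""
    return c + (1.0 - c) / (1.0 + math.exp(-m(theta)))


def cubic(theta):
    """Example increasing polynomial; its derivative 1 + 0.3 * theta**2 is positive."""
    return theta + 0.1 * theta ** 3


if __name__ == "__main__":
    print(p_3pl(0.0, 1.0, 0.0, 0.2))
    print(p_monotonic_poly(0.0, cubic, 0.2))
```

    With a linear m(theta) = a * (theta - b), the second function reduces to the 3PL, which is the sense in which a polynomial adds flexibility beyond it.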

  18. Geography Library of Test Items. Volume Six.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  19. Geography: Library of Test Items. Volume II.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  20. Improving the Quality of Innovative Item Types: Four Tasks for Design and Development

    ERIC Educational Resources Information Center

    Parshall, Cynthia G.; Harmes, J. Christine

    2009-01-01

    Many exam programs have begun to include innovative item types in their operational assessments. While innovative item types appear to have great promise for expanding measurement, there can also be genuine challenges to their successful implementation. In this paper we present a set of four activities that can be beneficially incorporated into…

  1. Geography Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  2. Developing a Tool to Assess the Capacity of Out-of-School Time Program Providers to Implement Policy, Systems, and Environmental Change.

    PubMed

    Leeman, Jennifer; Blitstein, Jonathan L; Goetz, Joshua; Moore, Alexis; Tessman, Nell; Wiecha, Jean L

    2016-08-11

    Little is known about public health practitioners' capacity to change policies, systems, or environments (PSEs), in part due to the absence of measures. To address this need, we partnered with the Alliance for a Healthier Generation (Alliance) to develop and test a theory-derived measure of the capacity of out-of-school time program providers to improve students' level of nutrition and physical activity through changes in PSEs. The measure was developed and tested through an engaged partnership with staff working on the Alliance's Healthy Out-of-School Time (HOST) Initiative. In total, approximately 2,000 sites nationwide are engaged in the HOST Initiative, which serves predominantly high-need children and youths. We partnered with the Alliance to conduct formative work that would help develop a survey that assessed attitudes/beliefs, social norms, external resources/supports, and self-efficacy. The survey was administered to providers of out-of-school time programs who were implementing the Alliance's HOST Initiative. Survey respondents were 185 out-of-school time program providers (53% response rate). Exploratory factor analysis yielded a 4-factor model that explained 44.7% of the variance. Factors pertained to perceptions of social norms (6 items) and self-efficacy to build support and engage a team (4 items) and create (5 items) and implement (3 items) an action plan. We report initial development and factor analysis of a tool that the Alliance can use to assess the capacity of after-school time program providers, which is critical to targeting capacity-building interventions and assessing their effectiveness. Study findings also will inform the development of measures to assess individual capacity to plan and implement other PSE interventions.

  3. The Depression Inventory Development Workgroup: A Collaborative, Empirically Driven Initiative to Develop a New Assessment Tool for Major Depressive Disorder.

    PubMed

    Vaccarino, Anthony L; Evans, Kenneth R; Kalali, Amir H; Kennedy, Sidney H; Engelhardt, Nina; Frey, Benicio N; Greist, John H; Kobak, Kenneth A; Lam, Raymond W; MacQueen, Glenda; Milev, Roumen; Placenza, Franca M; Ravindran, Arun V; Sheehan, David V; Sills, Terrence; Williams, Janet B W

    2016-01-01

    The Depression Inventory Development project is an initiative of the International Society for CNS Drug Development whose goal is to develop a comprehensive and psychometrically sound measurement tool to be utilized as a primary endpoint in clinical trials for major depressive disorder. Using an iterative process between field testing and psychometric analysis and drawing upon expertise of international researchers in depression, the Depression Inventory Development team has established an empirically driven and collaborative protocol for the creation of items to assess symptoms in major depressive disorder. Depression-relevant symptom clusters were identified based on expert clinical and patient input. In addition, as an aid for symptom identification and item construction, the psychometric properties of existing clinical scales (assessing depression and related indications) were evaluated using blinded datasets from pharmaceutical antidepressant drug trials. A series of field tests in patients with major depressive disorder provided the team with data to inform the iterative process of scale development. We report here an overview of the Depression Inventory Development initiative, including results of the third iteration of items assessing symptoms related to anhedonia, cognition, fatigue, general malaise, motivation, anxiety, negative thinking, pain and appetite. The strategies adopted from the Depression Inventory Development program, as an empirically driven and collaborative process for scale development, have provided the foundation to develop and validate measurement tools in other therapeutic areas as well.

  4. Geography, Years 7-10, Library of Test Items. Volume Eight. Junior Secondary Items To Be Used With 1976 to 1980 H.S.C. Geography Exam. Broadsheets.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  5. Assessment of mastication in healthy children and children with cerebral palsy: a validity and consistency study.

    PubMed

    Remijn, L; Speyer, R; Groen, B E; Holtus, P C M; van Limbeek, J; Nijhuis-van der Sanden, M W G

    2013-05-01

    The aim of this study was to develop the Mastication Observation and Evaluation instrument for observing and assessing the chewing ability of children eating solid and lumpy foods. This study describes the process of item definition and item selection and reports the content validity, reproducibility and consistency of the instrument. In the developmental phase, 15 experienced speech therapists assessed item relevance and descriptions over three Delphi rounds. Potential items were selected based on the results from a literature review. At the initial Delphi round, 17 potential items were included. After three Delphi rounds, 14 items that were regarded as providing distinctive value in assessment of mastication (consensus >75%) were included in the Mastication Observation and Evaluation instrument. To test item reproducibility and consistency, two experts and five students evaluated video recordings of 20 children (10 children with cerebral palsy aged 29-65 months and 10 healthy children aged 11-42 months) eating bread and a biscuit. Reproducibility was estimated by means of the intraclass correlation coefficient (ICC). With the exception of one item concerning chewing duration, all items showed good to excellent intra-observer agreement (ICC students: 0.73-1.0). With the exception of chewing duration and number of swallows, inter-observer agreement was fair to excellent for all items (ICC experts: 0.68-1.0 and ICC students: 0.42-1.0). Results indicate that this tool is a feasible instrument and could be used in clinical practice after further research is completed on the reliability of the tool. © 2013 Blackwell Publishing Ltd.

  6. The Development of a Scale To Measure Negative Affectivity.

    ERIC Educational Resources Information Center

    Stokes, Joseph; Levin, Ira

    Negative affectivity (NA) has been defined as a stable and pervasive individual difference characterized by a disposition to experience aversive emotional states (D. Watson and L. A. Clark, 1984). A brief self-report scale was developed to assess NA. The initial 28-item scale (which included seven items each representing nervousness/calmness,…

  7. Development of the Attitudes to Domestic Violence Questionnaire for Children and Adolescents.

    PubMed

    Fox, Claire L; Gadd, David; Sim, Julius

    2015-09-01

    To provide a more robust assessment of the effectiveness of a domestic abuse prevention education program, a questionnaire was developed to measure children's attitudes to domestic violence. The aim was to develop a short questionnaire that would be easy to use for practitioners but, at the same time, sensitive enough to pick up on subtle changes in young people's attitudes. We therefore chose to ask children about different situations in which they might be willing to condone domestic violence. In Study 1, we tested a set of 20 items, which we reduced by half to a set of 10 items. The factor structure of the scale was explored and its internal consistency was calculated. In Study 2, we tested the factor structure of the 10-item Attitudes to Domestic Violence (ADV) Scale in a separate calibration sample. Finally, in Study 3, we then assessed the test-retest reliability of the 10-item scale. The ADV Questionnaire is a promising tool to evaluate the effectiveness of domestic abuse education prevention programs. However, further development work is necessary. © The Author(s) 2014.

  8. Development of a tool to measure person-centered maternity care in developing settings: validation in a rural and urban Kenyan population.

    PubMed

    Afulani, Patience A; Diamond-Smith, Nadia; Golub, Ginger; Sudhinaraset, May

    2017-09-22

    Person-centered reproductive health care is recognized as critical to improving reproductive health outcomes. Yet, little research exists on how to operationalize it. We extend the literature in this area by developing and validating a tool to measure person-centered maternity care. We describe the process of developing the tool and present the results of psychometric analyses to assess its validity and reliability in a rural and urban setting in Kenya. We followed standard procedures for scale development. First, we reviewed the literature to define our construct and identify domains, and developed items to measure each domain. Next, we conducted expert reviews to assess content validity; and cognitive interviews with potential respondents to assess clarity, appropriateness, and relevance of the questions. The questions were then refined and administered in surveys; and survey results used to assess construct and criterion validity and reliability. The exploratory factor analysis yielded one dominant factor in both the rural and urban settings. Three factors with eigenvalues greater than one were identified for the rural sample and four factors identified for the urban sample. Thirty of the 38 items administered in the survey were retained based on the factor loadings and correlation between the items. Twenty-five items load very well onto a single factor in both the rural and urban sample, with five items loading well in either the rural or urban sample, but not in both samples. These 30 items also load on three sub-scales that we created to measure dignified and respectful care, communication and autonomy, and supportive care. The Cronbach's alpha for the main scale is greater than 0.8 in both samples, and those for the sub-scales are between 0.6 and 0.8. The main scale and sub-scales are correlated with global measures of satisfaction with maternity services, suggesting criterion validity. We present a 30-item scale with three sub-scales to measure person-centered maternity care. This scale has high validity and reliability in a rural and urban setting in Kenya. Validation in additional settings is however needed. This scale will facilitate measurement to improve person-centered maternity care, and subsequently improve reproductive outcomes.

  9. Validity and Reliability of Persian Version of HIV/AIDS Related Stigma Scale for People Living With HIV/AIDS in Iran.

    PubMed

    Pourmarzi, Davoud; Khoramirad, Ashraf; Ahmari Tehran, Hoda; Abedini, Zahra

    2015-11-01

    To assess the perceived HIV/AIDS related stigma a comprehensive and well developed stigma instrument is necessary. This study aimed to assess validity and reliability of the Persian version of HIV/AIDS related stigma scale which was developed by Kang et al for people living with HIV/AIDS in Iran. The scale was forward translated by two bilingual academic members then both translations were discussed by expert team. Back-translation was done by two other bilingual translators then we carried out discussion with both of them. To evaluate understandability the scale was administered to 10 Persons Living with HIV/AIDS (PLWHA). Final Persian version was administered to 80 PLWHA in Qom, Iran in 2014. Test-retest reliability was assessed in a sample of 20 PLWHA after a week by intra-class correlation coefficient (ICC). Cronbach's alpha coefficient for overall scale was 0.85. Also Cronbach's alpha coefficients for the five subscales were as follows: social rejection (9 items, α = 0.84), negative self-worth (4 items, α = 0.70), perceived interpersonal insecurity (2 items, α = 0.57), financial insecurity (3 items, α = 0.70), discretionary disclosure (2 items, α = 0.83). Test-retest reliability was also approved with ICC = 0.78. Correlation between items and their hypothesized subscale is greater than 0.5. Correlation between an item and its own subscale was significantly higher than its correlation with other subscales. This study demonstrates that the Persian version of HIV/AIDS related stigma scale is valid and reliable to assess HIV/AIDS related stigma perceived by people living with HIV/AIDS in Iran.

  10. Validity and Reliability of Persian Version of HIV/AIDS Related Stigma Scale for People Living With HIV/AIDS in Iran

    PubMed Central

    Pourmarzi, Davoud; Khoramirad, Ashraf; Ahmari Tehran, Hoda; Abedini, Zahra

    2015-01-01

    Objective: To assess the perceived HIV/AIDS related stigma a comprehensive and well developed stigma instrument is necessary. This study aimed to assess validity and reliability of the Persian version of HIV/AIDS related stigma scale which was developed by Kang et al for people living with HIV/AIDS in Iran. Materials and methods: The scale was forward translated by two bilingual academic members then both translations were discussed by expert team. Back-translation was done by two other bilingual translators then we carried out discussion with both of them. To evaluate understandability the scale was administered to 10 Persons Living with HIV/AIDS (PLWHA). Final Persian version was administered to 80 PLWHA in Qom, Iran in 2014. Test–retest reliability was assessed in a sample of 20 PLWHA after a week by intra-class correlation coefficient (ICC). Results: Cronbach’s alpha coefficient for overall scale was 0.85. Also Cronbach’s alpha coefficients for the five subscales were as follows: social rejection (9 items, α = 0.84), negative self-worth (4 items, α = 0.70), perceived interpersonal insecurity (2 items, α = 0.57), financial insecurity (3 items, α = 0.70), discretionary disclosure (2 items, α = 0.83). Test–retest reliability was also approved with ICC = 0.78. Correlation between items and their hypothesized subscale is greater than 0.5. Correlation between an item and its own subscale was significantly higher than its correlation with other subscales. Conclusion: This study demonstrates that the Persian version of HIV/AIDS related stigma scale is valid and reliable to assess HIV/AIDS related stigma perceived by people living with HIV/AIDS in Iran. PMID:27047562
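
    The Cronbach's alpha coefficients reported in the record above follow the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch, assuming a complete respondents-by-items score matrix with no missing data; the function name and toy scores are illustrative and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows.

    scores: list of rows, one per respondent, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's summed score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical two-item subscale answered by three respondents.
    toy = [[1, 2], [2, 1], [3, 3]]
    print(round(cronbach_alpha(toy), 3))
```

    The formula depends on the number of items k as well as on how strongly the items covary, which is worth keeping in mind when comparing alphas across subscales of different lengths.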

  11. Ten years of the International Patient Decision Aid Standards Collaboration: evolution of the core dimensions for assessing the quality of patient decision aids

    PubMed Central

    2013-01-01

    In 2003, the International Patient Decision Aid Standards (IPDAS) Collaboration was established to enhance the quality and effectiveness of patient decision aids by establishing an evidence-informed framework for improving their content, development, implementation, and evaluation. Over this 10 year period, the Collaboration has established: a) the background document on 12 core dimensions to inform the original modified Delphi process to establish the IPDAS checklist (74 items); b) the valid and reliable IPDAS instrument (47 items); and c) the IPDAS qualifying (6 items), certifying (6 items + 4 items for screening), and quality criteria (28 items). The objective of this paper is to describe the evolution of the IPDAS Collaboration and discuss the standardized process used to update the background documents on the theoretical rationales, evidence and emerging issues underlying the 12 core dimensions for assessing the quality of patient decision aids. PMID:24624947

  12. Social support for healthy eating: development and validation of a questionnaire for the French-Canadian population.

    PubMed

    Carbonneau, Elise; Bradette-Laplante, Maude; Lamarche, Benoît; Provencher, Véronique; Bégin, Catherine; Robitaille, Julie; Desroches, Sophie; Vohl, Marie-Claude; Corneau, Louise; Lemieux, Simone

    2018-05-28

    The present study aimed to develop and validate a questionnaire assessing social support for healthy eating in a French-Canadian population. A twenty-one-item questionnaire was developed. For each item, participants were asked to rate the frequency, in the past month, with which the actions described had been done by family and friends in two different environments: (i) at home and (ii) outside of home. The content was evaluated by an expert panel. A validation study sample was recruited and completed the questionnaire twice. Exploratory factor analysis was performed on items to assess the number of subscales. Internal consistency reliability was assessed using Cronbach's α. Test-retest reliability was evaluated with intraclass correlations between scores of the two completions. Online survey. Men and women from the Québec City area (n 150). The content validity assessment led to a few changes, resulting in a twenty-two-item questionnaire. Exploratory factor analysis revealed a two-factor structure for both environments, resulting in four subscales: supportive actions at home; non-supportive actions at home; supportive actions outside of home; and non-supportive actions outside of home. Two items were removed from the questionnaire due to low loadings. The four subscales were found to be reliable (Cronbach's α=0·82-0·94; test-retest intraclass correlation=0·51-0·70). The Social Support for Healthy Eating Questionnaire was developed for a French-Canadian population and demonstrated good psychometric properties. This questionnaire will be useful to explore the role of social support and its interactions with other factors in predicting eating behaviours.

  13. Recommended core items to assess e-cigarette use in population-based surveys.

    PubMed

    Pearson, Jennifer L; Hitchman, Sara C; Brose, Leonie S; Bauld, Linda; Glasser, Allison M; Villanti, Andrea C; McNeill, Ann; Abrams, David B; Cohen, Joanna E

    2018-05-01

    A consistent approach using standardised items to assess e-cigarette use in both youth and adult populations will aid cross-survey and cross-national comparisons of the effect of e-cigarette (and tobacco) policies and improve our understanding of the population health impact of e-cigarette use. Focusing on adult behaviour, we propose a set of e-cigarette use items, discuss their utility and potential adaptation, and highlight e-cigarette constructs that researchers should avoid without further item development. Reliable and valid items will strengthen the emerging science and inform knowledge synthesis for policy-making. Building on informal discussions at a series of international meetings of 65 experts from 15 countries, the authors provide recommendations for assessing e-cigarette use behaviour, relative perceived harm, device type, presence of nicotine, flavours and reasons for use. We recommend items assessing eight core constructs: e-cigarette ever use, frequency of use and former daily use; relative perceived harm; device type; primary flavour preference; presence of nicotine; and primary reason for use. These items should be standardised or minimally adapted for the policy context and target population. Researchers should be prepared to update items as e-cigarette device characteristics change. A minimum set of e-cigarette items is proposed to encourage consensus around items to allow for cross-survey and cross-jurisdictional comparisons of e-cigarette use behaviour. These proposed items are a starting point. We recognise room for continued improvement, and welcome input from e-cigarette users and scientific colleagues. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  14. Identifying opportunities to advance practice at a large academic medical center using the ASHP Ambulatory Care Self-Assessment Tool.

    PubMed

    Martirosov, Amber Lanae; Michael, Angela; McCarty, Melissa; Bacon, Opal; DiLodovico, John R; Jantz, Arin; Kostoff, Diana; MacDonald, Nancy C; Mikulandric, Nancy; Neme, Klodiana; Sulejmani, Nimisha; Summers, Bryant B

    2018-05-29

    The use of the ASHP Ambulatory Care Self-Assessment Tool to advance pharmacy practice at 8 ambulatory care clinics of a large academic medical center is described. The ASHP Ambulatory Care Self-Assessment Tool was developed to help ambulatory care pharmacists assess how their current practices align with the ASHP Practice Advancement Initiative. The Henry Ford Hospital Ambulatory Care Advisory Group (ACAG) opted to use the "Practitioner Track" sections of the tool to assess pharmacy practices within each of 8 ambulatory care clinics individually. The responses to self-assessment items were then compiled and discussed by ACAG members. The group identified best practices and ways to implement action items to advance ambulatory care practice throughout the institution. Three recommended action items were common to most clinics: (1) identify and evaluate solutions to deliver financially viable services, (2) develop technology to improve patient care, and (3) optimize the role of pharmacy technicians and support personnel. The ACAG leadership met with pharmacy administrators to discuss how action items that were both feasible and deemed likely to have a medium-to-high impact aligned with departmental goals and used this information to develop an ambulatory care strategic plan. This process informed and enabled initiatives to advance ambulatory care pharmacy practice within the system. The ASHP Ambulatory Care Self-Assessment Tool was useful in identifying opportunities for practice advancement in a large academic medical center. Copyright © 2018 by the American Society of Health-System Pharmacists, Inc. All rights reserved.

  15. Development of a Valid and Reliable Knee Articular Cartilage Condition-Specific Study Methodological Quality Score.

    PubMed

    Harris, Joshua D; Erickson, Brandon J; Cvetanovich, Gregory L; Abrams, Geoffrey D; McCormick, Frank M; Gupta, Anil K; Verma, Nikhil N; Bach, Bernard R; Cole, Brian J

    2014-02-01

    Condition-specific questionnaires are important components in evaluation of outcomes of surgical interventions. No condition-specific study methodological quality questionnaire exists for evaluation of outcomes of articular cartilage surgery in the knee. To develop a reliable and valid knee articular cartilage-specific study methodological quality questionnaire. Cross-sectional study. A stepwise, a priori-designed framework was created for development of a novel questionnaire. Relevant items to the topic were identified and extracted from a recent systematic review of 194 investigations of knee articular cartilage surgery. In addition, relevant items from existing generic study methodological quality questionnaires were identified. Items for a preliminary questionnaire were generated. Redundant and irrelevant items were eliminated, and acceptable items modified. The instrument was pretested and items weighed. The instrument, the MARK score (Methodological quality of ARticular cartilage studies of the Knee), was tested for validity (criterion validity) and reliability (inter- and intraobserver). A 19-item, 3-domain MARK score was developed. The 100-point scale score demonstrated face validity (focus group of 8 orthopaedic surgeons) and criterion validity (strong correlation to Cochrane Quality Assessment score and Modified Coleman Methodology Score). Interobserver reliability for the overall score was good (intraclass correlation coefficient [ICC], 0.842), and for all individual items of the MARK score, acceptable to perfect (ICC, 0.70-1.000). Intraobserver reliability ICC assessed over a 3-week interval was strong for 2 reviewers (≥0.90). The MARK score is a valid and reliable knee articular cartilage condition-specific study methodological quality instrument. This condition-specific questionnaire may be used to evaluate the quality of studies reporting outcomes of articular cartilage surgery in the knee.

  16. Validity of portfolio assessment: which qualities determine ratings?

    PubMed

    Driessen, Erik W; Overeem, Karlijn; van Tartwijk, Jan; van der Vleuten, Cees P M; Muijtjens, Arno M M

    2006-09-01

    The portfolio is becoming increasingly accepted as a valuable tool for learning and assessment. The validity of portfolio assessment, however, may suffer from bias due to irrelevant qualities, such as lay-out and writing style. We examined the possible effects of such qualities in a portfolio programme aimed at stimulating Year 1 medical students to reflect on their professional and personal development. In later curricular years, this portfolio is also used to judge clinical competence. We developed an instrument, the Portfolio Analysis Scoring Inventory, to examine the impact of form and content aspects on portfolio assessment. The Inventory consists of 15 items derived from interviews with experienced mentors, the literature, and the criteria for reflective competence used in the regular portfolio assessment procedure. Forty portfolios, selected from 231 portfolios for which ratings from the regular assessment procedure were available, were rated by 2 researchers, independently, using the Inventory. Regression analysis was used to estimate the correlation between the ratings from the regular assessment and those resulting from the Inventory items. Inter-rater agreement ranged from 0.46 to 0.87. The strongest predictor of the variance in the regular ratings was 'quality of reflection' (R 0.80; R² 66%). No further items accounted for a significant proportion of variance. Irrelevant items, such as writing style and lay-out, had negligible effects. The absence of an impact of irrelevant criteria appears to support the validity of the portfolio assessment procedure. Further studies should examine the portfolio's validity for the assessment of clinical competence.

  17. The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

    PubMed

    Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

    2017-07-01

    The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. The Tactual Profile: Development of a Procedure to Assess the Tactual Functioning of Children Who Are Blind

    ERIC Educational Resources Information Center

    Withagen, Ans; Vervloed, Mathijs P. J.; Janssen, Neeltje M.; Knoors, Harry; Verhoeven, Ludo

    2009-01-01

    The Tactual Profile assesses tactual functioning of children with severe visual impairments between 0 and 16 years of age. The Tactual Profile consists of 430 items, measuring tactile skills required for performing everyday tasks at home and in school. Items are graded according to age level and divided into three domains: tactual sensory, tactual…

  19. The Responsive Environmental Assessment for Classroom Teaching (REACT): the dimensionality of student perceptions of the instructional environment.

    PubMed

    Nelson, Peter M; Demers, Joseph A; Christ, Theodore J

    2014-06-01

    This study details the initial development of the Responsive Environmental Assessment for Classroom Teachers (REACT). REACT was developed as a questionnaire to evaluate student perceptions of the classroom teaching environment. Researchers engaged in an iterative process to develop, field test, and analyze student responses on 100 rating-scale items. Participants included 1,465 middle school students across 48 classrooms in the Midwest. Item analysis, including exploratory and confirmatory factor analysis, was used to refine a 27-item scale with a second-order factor structure. Results support the interpretation of a single general dimension of the Classroom Teaching Environment with 6 subscale dimensions: Positive Reinforcement, Instructional Presentation, Goal Setting, Differentiated Instruction, Formative Feedback, and Instructional Enjoyment. Applications of REACT in research and practice are discussed along with implications for future research and the development of classroom environment measures. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  20. Development of measures from the theory of planned behavior applied to leisure-time physical activity.

    PubMed

    Kerner, Matthew S

    2005-06-01

    Using the theory of planned behavior as a conceptual framework, scales assessing Attitude to Leisure-time Physical Activity, Expectations of Others, Perceived Control, and Intention to Engage in Leisure-time Physical Activity were developed for use among middle-school students. The study sample included 349 boys and 400 girls, 10 to 14 years of age (M=11.9 yr., SD=.9). Unipolar and bipolar scales with seven response choices were developed, with each scale item phrased in a Likert-type format. Following revisions, 22 items were retained in the Attitude to Leisure-time Physical Activity Scale, 10 items in the Expectations of Others Scale, 3 items in the Perceived Control Scale, and 17 items in the Intention to Engage in Leisure-time Physical Activity Scale. Adequate internal consistency was indicated by standardized coefficients alpha ranging from .75 to .89. Current results must be extended to assess discriminant and predictive validities and to check various reliabilities with new samples, then evaluation of intervention techniques for promotion of positive attitudes about leisure-time physical activity, including perception of control and intentions to engage in leisure-time physical activity.

  1. Development and validation of the positive affect and well-being scale for the neurology quality of life (Neuro-QOL) measurement system.

    PubMed

    Salsman, John M; Victorson, David; Choi, Seung W; Peterman, Amy H; Heinemann, Allen W; Nowinski, Cindy; Cella, David

    2013-11-01

    To develop and validate an item-response theory-based patient-reported outcomes assessment tool of positive affect and well-being (PAW). This is part of a larger NINDS-funded study to develop a health-related quality of life measurement system across major neurological disorders, called Neuro-QOL. Informed by a literature review and qualitative input from clinicians and patients, item pools were created to assess PAW concepts. Items were administered to a general population sample (N = 513) and a group of individuals with a variety of neurologic conditions (N = 581) for calibration and validation purposes, respectively. A 23-item calibrated bank and a 9-item short form of PAW was developed, reflecting components of positive affect, life satisfaction, or an overall sense of purpose and meaning. The Neuro-QOL PAW measure demonstrated sufficient unidimensionality and displayed good internal consistency, test-retest reliability, model fit, convergent and discriminant validity, and responsiveness. The Neuro-QOL PAW measure was designed to aid clinicians and researchers to better evaluate and understand the potential role of positive health processes for individuals with chronic neurological conditions. Further psychometric testing within and between neurological conditions, as well as testing in non-neurologic chronic diseases, will help evaluate the generalizability of this new tool.

  2. Indicators of Family Care for Development for Use in Multicountry Surveys

    PubMed Central

    Kariger, Patricia; Engle, Patrice; Britto, Pia M. Rebello; Sywulka, Sara M.; Menon, Purnima

    2012-01-01

    Indicators of family care for development are essential for ascertaining whether families are providing their children with an environment that leads to positive developmental outcomes. This project aimed to develop indicators from a set of items, measuring family care practices and resources important for caregiving, for use in epidemiologic surveys in developing countries. A mixed method (quantitative and qualitative) design was used for item selection and evaluation. Qualitative and quantitative analyses were conducted to examine the validity of candidate items in several country samples. Qualitative methods included the use of global expert panels to identify and evaluate the performance of each candidate item as well as in-country focus groups to test the content validity of the items. The quantitative methods included analyses of item-response distributions, using bivariate techniques. The selected items measured two family care practices (support for learning/stimulating environment and limit-setting techniques) and caregiving resources (adequacy of the alternate caregiver when the mother worked). Six play-activity items, indicative of support for learning/stimulating environment, were included in the core module of UNICEF's Multiple Indicator Cluster Survey 3. The other items were included in optional modules. This project provided, for the first time, a globally-relevant set of items for assessing family care practices and resources in epidemiological surveys. These items have multiple uses, including national monitoring and cross-country comparisons of the status of family care for development used globally. The obtained information will reinforce attention to efforts to improve the support for development of children. PMID:23304914

  3. A Five-Year Evaluation of Examination Structure in a Cardiovascular Pharmacotherapy Course

    PubMed Central

    Kolar, Claire; Janke, Kristin K.

    2015-01-01

    Objective. To evaluate the composition and effectiveness as an assessment tool of a criterion-referenced examination comprised of clinical cases tied to practice decisions, to examine the effect of varying audience response system (ARS) questions on student examination preparation, and to articulate guidelines for structuring examinations to maximize evaluation of student learning. Design. Multiple-choice items developed over 5 years were evaluated using Bloom’s Taxonomy classification, point biserial correlation, item difficulty, and grade distribution. In addition, examination items were classified into categories based on similarity to items used in ARS preparation. Assessment. As the number of items directly tied to clinical practice rose, Bloom’s Taxonomy level and item difficulty also rose. In examination years where Bloom’s levels were high but preparation was minimal, average grade distribution was lower compared with years in which student preparation was higher. Conclusion. Criterion-referenced examinations can benefit from systematic evaluation of their composition and effectiveness as assessment tools. Calculated design and delivery of classroom preparation is an asset in improving examination performance on rigorous, practice-relevant examinations. PMID:27168611
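
    The item statistics named in this abstract (item difficulty and point-biserial correlation) can be illustrated with a minimal sketch. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and an uncorrected total score that includes the item itself; the function names and sample data are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of examinees who answered the item correctly (0/1 scores)."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item and the total scores."""
    p = mean(item_scores)
    sd_total = pstdev(total_scores)  # population SD of the total scores
    if p in (0, 1) or sd_total == 0:
        raise ValueError("undefined when the item or the totals do not vary")
    mean_correct = mean(t for i, t in zip(item_scores, total_scores) if i == 1)
    mean_incorrect = mean(t for i, t in zip(item_scores, total_scores) if i == 0)
    return (mean_correct - mean_incorrect) / sd_total * (p * (1 - p)) ** 0.5


# Hypothetical data: four examinees, one item, and their total exam scores.
item = [1, 1, 0, 0]
totals = [4, 3, 2, 1]
print(item_difficulty(item), point_biserial(item, totals))
```

    A higher difficulty value here means an easier item (more examinees answered correctly), and a point-biserial near zero or below would flag an item that does not separate stronger from weaker examinees.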

  4. The Development and Preliminary Testing of an Instrument for Assessing Fatigue Self-management Outcomes in Patients With Advanced Cancer.

    PubMed

    Chan, Raymond Javan; Yates, Patsy; McCarthy, Alexandra L

    Fatigue is one of the most distressing and commonly experienced symptoms in patients with advanced cancer. Although the self-management (SM) of cancer-related symptoms has received increasing attention, no research instrument assessing fatigue SM outcomes for patients with advanced cancer is available. The aim of this study was to describe the development and preliminary testing of an interviewer-administered instrument for assessing the frequency and perceived levels of effectiveness and self-efficacy associated with fatigue SM behaviors in patients with advanced cancer. The development and testing of the Self-efficacy in Managing Symptoms Scale-Fatigue Subscale for Patients With Advanced Cancer (SMSFS-A) involved a number of procedures: item generation using a comprehensive literature review and semistructured interviews, content validity evaluation using expert panel reviews, and face validity and test-retest reliability evaluation using pilot testing. Initially, 23 items (22 specific behaviors with 1 global item) were generated from the literature review and semistructured interviews. After 2 rounds of expert panel review, the final scale was reduced to 17 items (16 behaviors with 1 global item). Participants in the pilot test (n = 10) confirmed that the questions in this scale were clear and easy to understand. Bland-Altman analysis showed agreement of results over a 1-week interval. The SMSFS-A items were generated using multiple sources. This tool demonstrated preliminary validity and reliability. The SMSFS-A has the potential to be used for clinical and research purposes. Nurses can use this instrument for collecting data to inform the initiation of appropriate fatigue SM support for this population.

  5. The development and initial psychometric evaluation of a measure assessing adherence to prescribed exercise: the Exercise Adherence Rating Scale (EARS).

    PubMed

    Newman-Beinart, Naomi A; Norton, Sam; Dowling, Dominic; Gavriloff, Dimitri; Vari, Chiara; Weinman, John A; Godfrey, Emma L

    2017-06-01

    There is no gold standard for measuring adherence to prescribed home exercise. Self-report diaries are commonly used; however, lack of standardisation, inaccurate recall and self-presentation bias limit their validity. A valid and reliable tool to assess exercise adherence behaviour is required. Consequently, this article reports the development and psychometric evaluation of the Exercise Adherence Rating Scale (EARS). Development of a questionnaire. Secondary care in physiotherapy departments of three hospitals. A focus group consisting of 8 patients with chronic low back pain (CLBP) and 2 physiotherapists was conducted to generate qualitative data. Following on from this, a convenience sample of 224 people with CLBP completed the initial 16-item EARS for purposes of subsequent validity and reliability analyses. Construct validity was explored using exploratory factor analysis and item response theory. Test-retest reliability was assessed 3 weeks later in a sub-sample of patients. An item pool consisting of 6 items was found suitable for factor analysis. Examination of the scale structure of these 6 items revealed a one-factor solution explaining a total of 71% of the variance in adherence to exercise. The six items formed a unidimensional scale that showed good measurement properties, including acceptable internal consistency and high test-retest reliability. The EARS enables the measurement of adherence to prescribed home exercise. This may facilitate the evaluation of interventions promoting self-management for both the prevention and treatment of chronic conditions. Copyright © 2017 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.

  6. Constructing a question bank based on script concordance approach as a novel assessment methodology in surgical education.

    PubMed

    Aldekhayel, Salah A; Alselaim, Nahar A; Magzoub, Mohi Eldin; Al-Qattan, Mohammad M; Al-Namlah, Abdullah M; Tamim, Hani; Al-Khayal, Abdullah; Al-Habdan, Sultan I; Zamakhshary, Mohammed F

    2012-10-24

    Script Concordance Test (SCT) is a new assessment tool that reliably assesses clinical reasoning skills. Previous descriptions of developing SCT-question banks were merely subjective. This study addresses two gaps in the literature: 1) conducting the first phase of a multistep validation process of SCT in Plastic Surgery, and 2) providing an objective methodology to construct a question bank based on SCT. After developing a test blueprint, 52 test items were written. Five validation questions were developed and a validation survey was established online. Seven reviewers were asked to answer this survey. They were recruited from two countries, Saudi Arabia and Canada, to improve the test's external validity. Their ratings were transformed into percentages. Analysis was performed to compare reviewers' ratings by looking at correlations, ranges, means, medians, and overall scores. Scores of reviewers' ratings were between 76% and 95% (mean 86% ± 5). We found poor correlations between reviewers (Pearson's: +0.38 to -0.22). Ratings of individual validation questions ranged between 0 and 4 (on a scale 1-5). Means and medians of these ranges were computed for each test item (mean: 0.8 to 2.4; median: 1 to 3). A subset of test items comprising 27 items was generated based on a set of inclusion and exclusion criteria. This study proposes an objective methodology for validation of SCT-question bank. Analysis of validation survey is done from all angles, i.e., reviewers, validation questions, and test items. Finally, a subset of test items is generated based on a set of criteria.

  7. Assessing food selection in a health promotion program: validation of a brief instrument for American Indian children in the southwest United States.

    PubMed

    Koehler, K M; Cunningham-Sabo, L; Lambert, L C; McCalman, R; Skipper, B J; Davis, S M

    2000-02-01

    Brief dietary assessment instruments are needed to evaluate behavior changes of participants in dietary intervention programs. The purpose of this project was to design and validate an instrument for children participating in Pathways to Health, a culturally appropriate, cancer prevention curriculum. Validation of a brief food selection instrument, Yesterday's Food Choices (YFC), which contained 33 questions about foods eaten the previous day with response choices of yes, no, or not sure. Reference data for validation were 24-hour dietary recalls administered individually to 120 students selected randomly. The YFC and 24-hour dietary recalls were administered to American Indian children in fifth- and seventh-grade classes in the Southwest United States. Dietary recalls were coded for food items in the YFC and results were compared for each item using percentage agreement and the kappa statistic. Percentage agreement for all items was greater than 60%; for most items it was greater than 70%, and for several items it was greater than 80%. The amount of agreement beyond that explained by chance (kappa statistic) was generally small. Three items showed substantial agreement beyond chance (kappa ≥ 0.6); 2 items showed moderate agreement (kappa = 0.40 to 0.59); and most items showed fair agreement (kappa = 0.20 to 0.39). The food items showing substantial agreement were hot or cold cereal, low-fat milk, and mutton or chile stew. Fried or scrambled eggs and deep-fried foods showed moderate agreement beyond chance. Previous development and validation of brief food selection instruments for children participating in health promotion programs has had limited success. In this study, instrument-related factors that apparently contributed to poor agreement between data from the YFC and 24-hour dietary recall were inclusion of categories of foods vs specific foods; food knowledge, preparation, and vocabulary; item length; and overreporting of attractive foods. Collecting and scoring the 24-hour recall data may also have contributed to poor agreement. Further development of brief instruments for evaluating changes in children's behavior in dietary programs is necessary. Factors related to the YFC that need further development may be issues that are also important in the development of effective, brief dietary assessments for children as individual clients or patients.
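
    The contrast this abstract draws between percentage agreement and agreement beyond chance can be made concrete with a small sketch of Cohen's kappa for a single item. The function names and the yes/no responses below are hypothetical illustrations, not data from the study, and the sketch assumes two equally long lists of categorical responses.

```python
def percent_agreement(first, second):
    """Share of paired responses that are identical."""
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohen_kappa(first, second):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(first)
    observed = percent_agreement(first, second)
    categories = set(first) | set(second)
    # Chance agreement from the marginal proportions of each category.
    expected = sum((first.count(c) / n) * (second.count(c) / n) for c in categories)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical responses for one food item: questionnaire vs. coded recall.
questionnaire = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
recall = ["yes", "yes", "no", "no", "no", "yes", "yes", "no"]
print(percent_agreement(questionnaire, recall), cohen_kappa(questionnaire, recall))
```

    In this made-up example the two sources agree on 75% of responses, yet kappa is only 0.5 because half of that agreement would be expected by chance alone.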

  8. The Childhood Cancer Survivor Study-Neurocognitive Questionnaire (CCSS-NCQ) Revised: Item Response Analysis and Concurrent Validity

    PubMed Central

    Kenzik, Kelly M.; Huang, I-Chan; Brinkman, Tara M.; Baughman, Brandon; Ness, Kirsten K.; Shenkman, Elizabeth A.; Hudson, Melissa M.; Robison, Leslie L.; Krull, Kevin R.

    2014-01-01

    Objective: Childhood cancer survivors are at risk for neurocognitive impairment related to cancer diagnosis or treatment. This study refined and further validated the Childhood Cancer Survivor Study Neurocognitive Questionnaire (CCSS-NCQ), a scale developed to screen for impairment in long-term survivors of childhood cancer. Method: Items related to task efficiency, memory, organization and emotional regulation domains were examined using item response theory (IRT). Data were collected from 833 adult survivors of childhood cancer in the St. Jude Lifetime Cohort Study who completed self-report and direct neurocognitive testing. The revision process included: 1) content validity mapping of items to domains, 2) constructing a revised CCSS-NCQ, 3) selecting items within specific domains using IRT, and 4) evaluating concordance between the revised CCSS-NCQ and direct neurocognitive assessment. Results: Using content and measurement properties, 32 items were retained (8 items in 4 domains). Items captured low to middle levels of neurocognitive concerns. The latent domain scores demonstrated poor convergent/divergent validity with the direct assessments. Adjusted effect sizes (Cohen's d) for agreement between self-reported memory and direct memory assessment were moderate for total recall (ES=0.66), long-term memory (ES=0.63), and short-term memory (ES=0.55). Effect sizes between self-rated task efficiency and direct assessment of attention were moderate for focused attention (ES=0.70) and attention span (ES=0.50), but small for sustained attention (ES=0.36). Cranial radiation therapy and female gender were associated with lower self-reported neurocognitive function. Conclusion: The revised CCSS-NCQ demonstrates adequate measurement properties for assessing day-to-day neurocognitive concerns in childhood cancer survivors, and adds useful information to direct assessment. PMID:24933482

  9. An Assessment of Mentoring Functions and Barriers to Mentoring

    DTIC Science & Technology

    1999-12-01

    were similarity between mentor and mentee and the quality of the supervisory relationship in terms of LMX and psychosocial and career development ... psychosocial (1985). These broad categories have remained at the core of mentoring from the time they were developed . Career development functions "help...internal consistency reported by Noe for the career development functions scale (7 items) was .89. The psychosocial functions scale, made up of 14 items

  10. Development, content validity, and cross-cultural adaptation of a patient-reported outcome measure for real-time symptom assessment in irritable bowel syndrome.

    PubMed

    Vork, L; Keszthelyi, D; Mujagic, Z; Kruimel, J W; Leue, C; Pontén, I; Törnblom, H; Simrén, M; Albu-Soda, A; Aziz, Q; Corsetti, M; Holvoet, L; Tack, J; Rao, S S; van Os, J; Quetglas, E G; Drossman, D A; Masclee, A A M

    2018-03-01

    End-of-day questionnaires, which are considered the gold standard for assessing abdominal pain and other gastrointestinal (GI) symptoms in irritable bowel syndrome (IBS), are influenced by recall and ecological bias. The experience sampling method (ESM) is characterized by random and repeated assessments in the natural state and environment of a subject, and herewith overcomes these limitations. This report describes the development of a patient-reported outcome measure (PROM) based on the ESM principle, taking into account content validity and cross-cultural adaptation. Focus group interviews with IBS patients and expert meetings with international experts in the fields of neurogastroenterology & motility and pain were performed in order to select the items for the PROM. Forward-and-back translation and cognitive interviews were performed to adapt the instrument for use in different countries and to ensure patients' understanding of the final items. Focus group interviews revealed 42 items, categorized into five domains: physical status, defecation, mood and psychological factors, context and environment, and nutrition and drug use. Experts reduced the number of items to 32 and cognitive interviewing after translation resulted in a few slight adjustments regarding linguistic issues, but not regarding content of the items. An ESM-based PROM, suitable for momentary assessment of IBS symptom patterns was developed, taking into account content validity and cross-cultural adaptation. This PROM will be implemented in a specifically designed smartphone application and further validation in a multicenter setting will follow. © 2017 John Wiley & Sons Ltd.

  11. Inter-rater reliability of a food store checklist to assess availability of healthier alternatives to the energy-dense snacks and beverages commonly consumed by children.

    PubMed

    Izumi, Betty T; Findholt, Nancy E; Pickus, Hayley A; Nguyen, Thuan; Cuneo, Monica K

    2014-06-01

    Food stores have gained attention as potential intervention targets for improving children's eating habits. There is a need for valid and reliable instruments to evaluate changes in food store snack and beverage availability secondary to intervention. The aim of this study was to develop a valid, reliable, and resource-efficient instrument to evaluate the healthfulness of food store environments faced by children. The SNACZ food store checklist was developed to assess availability of healthier alternatives to the energy-dense snacks and beverages commonly consumed by children. After pretesting, two trained observers independently assessed the availability of 48 snack and beverage items in 50 food stores located near elementary and middle schools in Portland, Oregon, over a 2-week period in summer 2012. Inter-rater reliability was calculated using the kappa statistic. Overall, the instrument had mostly high inter-rater reliability. Seventy-three percent of items assessed had almost perfect or substantial reliability. Two items had moderate reliability (0.41-0.60), and no items had a reliability score less than 0.41. Eleven items occurred too infrequently to generate a kappa score. The SNACZ food store checklist is a first-step toward developing a valid and reliable tool to evaluate the healthfulness of food store environments faced by children. The tool can be used to compare availability of healthier snack and beverage alternatives across communities and measure change secondary to intervention. As a wider variety of healthier snack and beverage alternatives become available in food stores, the checklist should be updated.

  12. Development and reliability testing of a self-report instrument to measure the office layout as a correlate of occupational sitting.

    PubMed

    Duncan, Mitch J; Rashid, Mahbub; Vandelanotte, Corneel; Cutumisu, Nicoleta; Plotnikoff, Ronald C

    2013-02-04

    Spatial configurations of office environments assessed by Space Syntax methodologies are related to employee movement patterns. These methods require analysis of floor plans which are not readily available in large population-based studies or otherwise unavailable. Therefore a self-report instrument to assess spatial configurations of office environments using four scales was developed. The scales are: local connectivity (16 items), overall connectivity (11 items), visibility of co-workers (10 items), and proximity of co-workers (5 items). A panel cohort (N = 1154) completed an online survey; only data from individuals employed in office-based occupations (n = 307) were used to assess scale measurement properties. To assess test-retest reliability a separate sample of 37 office-based workers completed the survey on two occasions 7.7 (±3.2) days apart. Redundant scale items were eliminated using factor analysis; Cronbach's α and intraclass correlation coefficients (retest-ICC) were used to evaluate internal consistency and test-retest reliability, respectively. ANOVA was employed to examine differences between office types (Private, Shared, Open) as a measure of construct validity. Generalized Linear Models were used to examine relationships between spatial configuration scales and the duration of and frequency of breaks in occupational sitting. The number of items on all scales was reduced; Cronbach's α and ICCs indicated good scale internal consistency and test-retest reliability: local connectivity (5 items; α = 0.70; retest-ICC = 0.84), overall connectivity (6 items; α = 0.86; retest-ICC = 0.87), visibility of co-workers (4 items; α = 0.78; retest-ICC = 0.86), and proximity of co-workers (3 items; α = 0.85; retest-ICC = 0.70). Significant (p ≤ 0.001) differences, in theoretically expected directions, were observed for all scales between office types, except overall connectivity. Significant associations were observed between all scales and occupational sitting behaviour (p ≤ 0.05). All scales have good measurement properties indicating the instrument may be a useful alternative to Space Syntax to examine environmental correlates of occupational sitting in population surveys.
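
    The internal-consistency statistic reported for each scale above can be sketched as follows. This is a minimal illustration of the usual Cronbach's α formula, assuming complete data (no missing responses) and sample variances; the function name and the three-respondent data set are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_variances = [
        variance([person[j] for person in responses]) for j in range(n_items)
    ]
    # Variance of the summed scale score across respondents.
    total_variance = variance([sum(person) for person in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item scale answered by three respondents.
data = [[1, 1], [2, 3], [3, 2]]
print(round(cronbach_alpha(data), 3))
```

    Alpha rises as the items covary more strongly relative to their individual variances, which is why removing redundant or poorly related items changes the reported values.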

  13. Development and reliability testing of a self-report instrument to measure the office layout as a correlate of occupational sitting

    PubMed Central

    2013-01-01

    Background: Spatial configurations of office environments assessed by Space Syntax methodologies are related to employee movement patterns. These methods require analysis of floor plans which are not readily available in large population-based studies or otherwise unavailable. Therefore a self-report instrument to assess spatial configurations of office environments using four scales was developed. Methods: The scales are: local connectivity (16 items), overall connectivity (11 items), visibility of co-workers (10 items), and proximity of co-workers (5 items). A panel cohort (N = 1154) completed an online survey; only data from individuals employed in office-based occupations (n = 307) were used to assess scale measurement properties. To assess test-retest reliability a separate sample of 37 office-based workers completed the survey on two occasions 7.7 (±3.2) days apart. Redundant scale items were eliminated using factor analysis; Cronbach’s α and intraclass correlation coefficients (retest-ICC) were used to evaluate internal consistency and test-retest reliability, respectively. ANOVA was employed to examine differences between office types (Private, Shared, Open) as a measure of construct validity. Generalized Linear Models were used to examine relationships between spatial configuration scales and the duration of and frequency of breaks in occupational sitting. Results: The number of items on all scales was reduced; Cronbach’s α and ICCs indicated good scale internal consistency and test-retest reliability: local connectivity (5 items; α = 0.70; retest-ICC = 0.84), overall connectivity (6 items; α = 0.86; retest-ICC = 0.87), visibility of co-workers (4 items; α = 0.78; retest-ICC = 0.86), and proximity of co-workers (3 items; α = 0.85; retest-ICC = 0.70). Significant (p ≤ 0.001) differences, in theoretically expected directions, were observed for all scales between office types, except overall connectivity. Significant associations were observed between all scales and occupational sitting behaviour (p ≤ 0.05). Conclusion: All scales have good measurement properties indicating the instrument may be a useful alternative to Space Syntax to examine environmental correlates of occupational sitting in population surveys. PMID:23379485

  14. Comparison of Alternate and Original Items on the Montreal Cognitive Assessment

    PubMed Central

    Lebedeva, Elena; Huang, Mei; Koski, Lisa

    2016-01-01

    Background: The Montreal Cognitive Assessment (MoCA) is a screening tool for mild cognitive impairment (MCI) in elderly individuals. We hypothesized that measurement error when using the new alternate MoCA versions to monitor change over time could be related to the use of items that are not of comparable difficulty to their corresponding originals of similar content. The objective of this study was to compare the difficulty of the alternate MoCA items to the original ones. Methods: Five selected items from alternate versions of the MoCA were included with items from the original MoCA administered adaptively to geriatric outpatients (N = 78). Rasch analysis was used to estimate the difficulty level of the items. Results: None of the five items from the alternate versions matched the difficulty level of their corresponding original items. Conclusions: This study demonstrates the potential benefits of a Rasch analysis-based approach for selecting items during the process of development of parallel forms. The results suggest that better match of the items from different MoCA forms by their difficulty would result in higher sensitivity to changes in cognitive function over time. PMID:27076861
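
    Why a difficulty mismatch between parallel items matters can be seen from the dichotomous Rasch model itself. The sketch below assumes items scored correct/incorrect; the function name, the ability value, and both difficulty values (in logits) are invented for illustration and are not estimates from the study.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical difficulties for an original item and its alternate version.
original_difficulty = 0.0
alternate_difficulty = 1.0
ability = 0.5  # the same person, with unchanged ability, takes both forms

p_original = rasch_probability(ability, original_difficulty)
p_alternate = rasch_probability(ability, alternate_difficulty)
print(round(p_original, 3), round(p_alternate, 3))
```

    With these assumed values the same person is less likely to pass the harder alternate item, so a score drop between forms could reflect item difficulty rather than a change in cognitive function.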

  15. Applying Item Response Theory methods to design a learning progression-based science assessment

    NASA Astrophysics Data System (ADS)

    Chen, Jing

    Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats, such as Constructed Response (CR), Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. Second, items in multiple formats can be selected to achieve the information criterion at all the defined boundaries. This ensures the accuracy of the classification. Third, when item threshold parameters vary a bit, the scoring rubrics and the items need to be reviewed to make the threshold parameters similar across items. This is because one important design criterion of the learning progression-based items is that ideally, a student should be at the same level across items, which means that the item threshold parameters (d1, d2 and d3) should be similar across items. To design a learning progression-based science assessment, we need to understand whether the assessment measures a single construct or several constructs and how items are associated with the constructs being measured. Results from dimension analyses indicate that items of different carbon transforming processes measure different aspects of the carbon cycle construct. However, items of different practices assess the same construct. In general, there are high correlations among different processes or practices. It is not clear whether the strong correlations are due to the inherent links among these process/practice dimensions or due to the fact that the student sample does not show much variation in these process/practice dimensions. Future data are needed to examine the dimensionalities in terms of process/practice in detail. Finally, based on item characteristics analysis, recommendations are made to write more discriminative CR items and better OMC and MTF options. Item writers can follow these recommendations to write better learning progression-based items.
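
    The first design step in the abstract above (level boundaries taken as the means of the item thresholds across items) can be illustrated with a minimal sketch. The threshold values, the function names and the four-level reading of three thresholds are assumptions made for illustration; none of them come from the study.

```python
from bisect import bisect_right
from statistics import mean

# Hypothetical threshold estimates (d1, d2, d3) for four items, in logits.
# These numbers are invented for illustration only.
item_thresholds = [
    (-1.2, 0.1, 1.4),
    (-0.9, 0.3, 1.6),
    (-1.1, 0.0, 1.5),
    (-1.0, 0.2, 1.3),
]

def level_boundaries(thresholds):
    """Average each threshold (d1, d2, d3) across items."""
    return [mean(column) for column in zip(*thresholds)]

def classify(theta, boundaries):
    """Return the 1-based level for an ability estimate theta.

    Assumes the boundaries are sorted in increasing order, so three
    boundaries separate four levels.
    """
    return bisect_right(boundaries, theta) + 1

boundaries = level_boundaries(item_thresholds)
for theta in (-2.0, 0.0, 2.0):
    print(theta, classify(theta, boundaries))
```

    If the thresholds of one item sit far from these means, a student could land at different levels on different items, which is the situation the abstract says should trigger a review of the item and its scoring rubric.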

  16. Applying the revised Chinese Job Content Questionnaire to assess psychosocial work conditions among Taiwan's hospital workers

    PubMed Central

    2011-01-01

    Background: For hospital accreditation and health promotion reasons, we examined whether the 22-item Job Content Questionnaire (JCQ) could be applied to evaluate job strain of individual hospital employees and to determine the number of factors extracted from JCQ. Additionally, we developed an Excel module of self-evaluation diagnostic system for consultation with experts. Methods: To develop an Excel-based self-evaluation diagnostic system for consultation with experts that makes job strain assessment easier and quicker, a Rasch rating scale model was used to analyze data from 1,644 hospital employees who enrolled in 2008 for a job strain survey. We determined whether the 22-item Job Content Questionnaire (JCQ) could evaluate job strain of individual employees in work sites. The respective item responding to specific groups' occupational hazards causing job stress was investigated by using the skewness coefficient with its 95% CI through item-by-item analyses. Results: The 22 items on the questionnaire were found to form five factors. The prevalence rate of Chinese hospital workers with high job strain was 16.5%. Conclusions: Graphical representations of four quadrants, item-by-item bar chart plots and skewness 95% CI comparison generated in Excel can help employers and consultants of an organization focus on a small number of key areas of concern for each worker in job strain. PMID:21682912

  17. Applying the revised Chinese Job Content Questionnaire to assess psychosocial work conditions among Taiwan's hospital workers.

    PubMed

    Chien, Tsair-Wei; Lai, Wen-Pin; Wang, Hsien-Yi; Hsu, Sen-Yen; Castillo, Roberto Vasquez; Guo, How-Ran; Chen, Shih-Chung; Su, Shih-Bin

    2011-06-18

    For hospital accreditation and health promotion reasons, we examined whether the 22-item Job Content Questionnaire (JCQ) could be applied to evaluate job strain of individual hospital employees and to determine the number of factors extracted from JCQ. Additionally, we developed an Excel module of self-evaluation diagnostic system for consultation with experts. To develop an Excel-based self-evaluation diagnostic system for consultation with experts that makes job strain assessment easier and quicker, a Rasch rating scale model was used to analyze data from 1,644 hospital employees who enrolled in 2008 for a job strain survey. We determined whether the 22-item Job Content Questionnaire (JCQ) could evaluate job strain of individual employees in work sites. The respective item responding to specific groups' occupational hazards causing job stress was investigated by using the skewness coefficient with its 95% CI through item-by-item analyses. The 22 items on the questionnaire were found to form five factors. The prevalence rate of Chinese hospital workers with high job strain was 16.5%. Graphical representations of four quadrants, item-by-item bar chart plots and skewness 95% CI comparison generated in Excel can help employers and consultants of an organization focus on a small number of key areas of concern for each worker in job strain.

  18. Development of three new scales for assessing clients' perspectives on premarital counseling.

    PubMed

    Schumm, W R; West, D R

    2001-06-01

    Within a subsample of 73 men and 179 women from a larger study of current and former members of the Christian Church (Disciples of Christ), three new scales were developed to assess the value attributed to premarital counseling, quality of premarital counseling received, and a pastor's competence at premarital counseling. Although internal consistency reliability as measured by Cronbach alpha was marginally acceptable (.61) for the latter three-item scale, it was adequate for the three-item value (.84) and the seven-item quality (.87) scales. Evidence for construct validity was limited with respect to demographic variables for social class, sex, and religiosity. Those who attended church more frequently and women reported lower quality of premarital counseling.

  19. Enhanced Automatic Question Creator--EAQC: Concept, Development and Evaluation of an Automatic Test Item Creation Tool to Foster Modern e-Education

    ERIC Educational Resources Information Center

    Gutl, Christian; Lankmayr, Klaus; Weinhofer, Joachim; Hofler, Margit

    2011-01-01

    Research in automated creation of test items for assessment purposes has become increasingly important in recent years. With automatic question creation it is possible to support personalized and self-directed learning activities by preparing appropriate and individualized test items quite easily with relatively little effort or even fully…

  20. Developing a Machine-Supported Coding System for Constructed-Response Items in PISA. Research Report. ETS RR-17-47

    ERIC Educational Resources Information Center

    Yamamoto, Kentaro; He, Qiwei; Shin, Hyo Jeong; von Davier, Mattias

    2017-01-01

    Approximately a third of the Programme for International Student Assessment (PISA) items in the core domains (math, reading, and science) are constructed-response items and require human coding (scoring). This process is time-consuming, expensive, and prone to error as often (a) humans code inconsistently, and (b) coding reliability in…

  1. Efficiently Assessing Negative Cognition in Depression: An Item Response Theory Analysis of the Dysfunctional Attitude Scale

    ERIC Educational Resources Information Center

    Beevers, Christopher G.; Strong, David R.; Meyer, Bjorn; Pilkonis, Paul A.; Miller, Ivan R.

    2007-01-01

    Despite a central role for dysfunctional attitudes in cognitive theories of depression and the widespread use of the Dysfunctional Attitude Scale, form A (DAS-A; A. Weissman, 1979), the psychometric development of the DAS-A has been relatively limited. The authors used nonparametric item response theory methods to examine the DAS-A items and…

  2. Clinical instruments: reliability and validity critical appraisal.

    PubMed

    Brink, Yolandi; Louw, Quinette A

    2012-12-01

    RATIONALE, AIM AND OBJECTIVES: There is a lack of health care practitioners using objective clinical tools with sound psychometric properties. There is also a need for researchers to improve their reporting of the validity and reliability results of these clinical tools. Therefore, to promote the use of valid and reliable tools or tests for clinical evaluation, this paper reports on the development of a critical appraisal tool to assess the psychometric properties of objective clinical tools. A five-step process was followed to develop the new critical appraisal tool: (1) preliminary conceptual decisions; (2) defining key concepts; (3) item generation; (4) assessment of face validity; and (5) formulation of the final tool. The new critical appraisal tool consists of 13 items, of which five items relate to both validity and reliability studies, four items to validity studies only and four items to reliability studies. The 13 items could be scored as 'yes', 'no' or 'not applicable'. This critical appraisal tool will aid both the health care practitioner to critically appraise the relevant literature and researchers to improve the quality of reporting of the validity and reliability of objective clinical tools. © 2011 Blackwell Publishing Ltd.

  3. Web-based computer adaptive assessment of individual perceptions of job satisfaction for hospital workplace employees

    PubMed Central

    2011-01-01

    Background: To develop a web-based computer adaptive testing (CAT) application for efficiently collecting data regarding workers' perceptions of job satisfaction, we examined whether a 37-item Job Content Questionnaire (JCQ-37) could evaluate the job satisfaction of individual employees as a single construct. Methods: The JCQ-37 makes data collection via CAT on the internet easy, viable and fast. A Rasch rating scale model was applied to analyze data from 300 randomly selected hospital employees who participated in job-satisfaction surveys in 2008 and 2009 via non-adaptive and computer-adaptive testing, respectively. Results: Of the 37 items on the questionnaire, 24 items fit the model fairly well. Person-separation reliability for the 2008 surveys was 0.88. Measures from both years and item-8 job satisfaction for groups were successfully evaluated through item-by-item analyses using t-tests. Workers aged 26-35 felt that job satisfaction was significantly worse in 2009 than in 2008. Conclusions: A Web-CAT developed in the present paper was shown to be more efficient than traditional computer-based or pen-and-paper assessments at collecting data regarding workers' perceptions of job content. PMID:21496311

  4. Questionnaire Adapting: Little Changes Mean a Lot.

    PubMed

    Sousa, Vanessa E C; Matson, Jeffrey; Dunn Lopez, Karen

    2017-09-01

    Questionnaire development involves rigorous testing to ensure reliability and validity. Due to time and cost constraints of developing new questionnaires, researchers often adapt existing questionnaires to better fit the purpose of their study. However, the effect of such adaptations is unclear. We conducted cognitive interviews as a method to evaluate the understanding of original and adapted questionnaire items to be applied in a future study. The findings revealed that all subjects (a) comprehended the original and adapted items differently, (b) changed their scores after comparing the original to the adapted items, and (c) were unanimous in stating that the adapted items were easier to understand. Cognitive interviewing allowed us to assess the interpretation of adapted items in a useful and efficient manner before use in data collection.

  5. Development and validation of a novel patient-reported treatment satisfaction measure for hyperfunctional facial lines: facial line satisfaction questionnaire.

    PubMed

    Pompilus, Farrah; Burgess, Somali; Hudgens, Stacie; Banderas, Benjamin; Daniels, Selena

    2015-12-01

    Facial lines or wrinkles are among the most visible signs of aging, and minimally invasive cosmetic procedures are becoming increasingly popular. The aim of this study was to develop and validate the Facial Line Satisfaction Questionnaire (FLSQ) for use in adults with upper facial lines (UFL). A literature review, concept elicitation interviews (n = 33), and cognitive debriefing interviews (n = 23) of adults with UFL were conducted to develop the FLSQ. The FLSQ comprises Baseline and Follow-up versions and was field-tested with 150 subjects in a US observational study designed to assess its psychometric performance. Analyses included acceptability (item and scale distribution [i.e. missingness, floor, and ceiling effects]), reliability, and validity (including concurrent validity). In total, 69 concepts were elicited during patient interviews. Following cognitive debriefing interviews, the FLSQ-Baseline version included 11 items and the Follow-up version included 13 items. Response rates for the FLSQ were 100% and 73% at baseline and follow-up, respectively; no items had excessive missing data. Questionnaire scale scores were normally distributed. Most domain scores demonstrated good internal consistency reliability (Cronbach's α ≥ 0.70). Most items within their respective domains exhibited good convergent (item-scale correlations > 0.40) and discriminant (items had higher correlation with their hypothesized scales than other scales) validity. Concurrent validity correlation coefficients of the FLSQ domain scores with the associated concurrent measures were acceptable (range: r = 0.40-0.70). Six FLSQ items demonstrated reliability and validity as stand-alone items outside their domains. The FLSQ is a valid questionnaire for assessing treatment expectations, satisfaction, impact, and preference in adults with UFL. © 2015 The Authors. Journal of Cosmetic Dermatology Published by Wiley Periodicals, Inc.

  6. Translation, adaptation and validation of the American short form Patient Activation Measure (PAM13) in a Danish version.

    PubMed

    Maindal, Helle Terkildsen; Sokolowski, Ineta; Vedsted, Peter

    2009-06-29

    The Patient Activation Measure (PAM) is a measure that assesses patient knowledge, skill, and confidence for self-management. This study validates the Danish translation of the 13-item Patient Activation Measure (PAM13) in a Danish population with dysglycaemia. 358 people with screen-detected dysglycaemia participating in a primary care health education study responded to PAM13. The PAM13 was translated into Danish by a standardised forward-backward translation. Data quality was assessed by mean, median, item response, missing values, floor and ceiling effects, internal consistency (Cronbach's alpha and average inter-item correlation) and item-rest correlations. Scale properties were assessed by Rasch Rating Scale models. The item response was high with a small number of missing values (0.8-4.2%). Floor effect was small (range 0.6-3.6%), but the ceiling effect was above 15% for all items (range 18.6-62.7%). The alpha-coefficient was 0.89 and the average inter-item correlation 0.38. The Danish version formed a unidimensional, probabilistic Guttman-like scale explaining 43.2% of the variance. We did, however, find a different item sequence compared to the original scale. A Danish version of PAM13 with acceptable validity and reliability is now available. Further development should focus on single items, response categories in relation to ceiling effects and further validation of reproducibility and responsiveness.

  7. Development and implementation of the Structured Training Trainer Assessment Report (STTAR) in the English National Training Programme for laparoscopic colorectal surgery.

    PubMed

    Wyles, Susannah M; Miskovic, Danilo; Ni, Zhifang; Darzi, Ara W; Valori, Roland M; Coleman, Mark G; Hanna, George B

    2016-03-01

    There is a lack of educational tools available for surgical teaching critique, particularly for advanced laparoscopic surgery. The aim was to develop and implement a tool that assesses training quality and structures feedback for trainers in the English National Training Programme for laparoscopic colorectal surgery. Semi-structured interviews were performed and analysed, and items were extracted. Through the Delphi process, essential items pertaining to desirable trainer characteristics, training structure and feedback were determined. An assessment tool (Structured Training Trainer Assessment Report-STTAR) was developed and tested for feasibility, acceptability and educational impact. Interview transcripts (29 surgical trainers, 10 trainees, four educationalists) were analysed, and item lists created and distributed for consensus opinion (11 trainers and seven trainees). The STTAR consisted of 64 factors, and its web-based version, the mini-STTAR, included 21 factors that were categorised into four groups (training structure, training behaviour, trainer attributes and role modelling) and structured around a training session timeline (beginning, middle and end). The STTAR (six trainers, 48 different assessments) demonstrated good internal consistency (α = 0.88) and inter-rater reliability (ICC = 0.75). The mini-STTAR demonstrated good inter-item reliability (α = 0.79) and intra-observer reliability on comparison of 85 different trainer/trainee combinations (r = 0.701, p < 0.001). Both were found to be feasible and acceptable. The educational report for trainers was found to be useful (4.4 out of 5). An assessment tool that evaluates training quality was developed and shown to be reliable, acceptable and of educational value. It has been successfully implemented into the English National Training Programme for laparoscopic colorectal surgery.

  8. The Development of the Stages of Recovery Scale for Persons with Persistent Mental Illness

    ERIC Educational Resources Information Center

    Song, Li-Yu; Hsu, Su-Ting

    2011-01-01

    This study aimed to develop a scale which could be used as a valid way to show the evidence of recovery-oriented services. A 51-item scale was developed to assess both the component processes and outcomes of recovery. A sample of 471 participants completed the questionnaire. The factor analysis yielded a 45-item scale with six subscales,…

  9. Development of a research ethics knowledge and analytical skills assessment tool.

    PubMed

    Taylor, Holly A; Kass, Nancy E; Ali, Joseph; Sisson, Stephen; Bertram, Amanda; Bhan, Anant

    2012-04-01

    The goal of this project was to develop and validate a new tool to evaluate learners' knowledge and skills related to research ethics. A core set of 50 questions from existing computer-based online teaching modules were identified, refined and supplemented to create a set of 74 multiple-choice, true/false and short answer questions. The questions were pilot-tested and item discrimination was calculated for each question. Poorly performing items were eliminated or refined. Two comparable assessment tools were created. These assessment tools were administered as a pre-test and post-test to a cohort of 58 Indian junior health research investigators before and after exposure to a new course on research ethics. Half of the investigators were exposed to the course online, the other half in person. Item discrimination was calculated for each question and Cronbach's α for each assessment tool. A final version of the assessment tool that incorporated the best questions from the pre-/post-test phase was used to assess retention of research ethics knowledge and skills 3 months after course delivery. The final version of the REKASA includes 41 items and had a Cronbach's α of 0.837. The results illustrate, in one sample of learners, the successful, systematic development and use of a knowledge and skills assessment tool in research ethics capable of not only measuring basic knowledge in research ethics and oversight but also assessing learners' ability to apply ethics knowledge to the analytical task of reasoning through research ethics cases, without reliance on essay or discussion-based examination. These promising preliminary findings should be confirmed with additional groups of learners.
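
    The two statistics named in the abstract above, item discrimination and Cronbach's α, can be sketched as follows. The abstract does not say which discrimination index was used, so the upper-lower index here is an assumption; the response matrix and function names are likewise hypothetical.

```python
from statistics import variance

# Hypothetical data: six respondents, four items scored 0 (wrong) or 1 (right).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

def cronbach_alpha(scores):
    """alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)."""
    k = len(scores[0])
    item_vars = [variance(column) for column in zip(*scores)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def discrimination_index(scores, item):
    """Upper-lower index (an assumed choice): proportion correct on `item`
    in the top half of respondents minus that in the bottom half,
    with halves defined by total score."""
    ranked = sorted(scores, key=sum)
    half = len(ranked) // 2
    lower, upper = ranked[:half], ranked[-half:]
    p_upper = sum(row[item] for row in upper) / half
    p_lower = sum(row[item] for row in lower) / half
    return p_upper - p_lower

print(round(cronbach_alpha(responses), 3))
print([round(discrimination_index(responses, i), 3) for i in range(4)])
```

    Under this index, an item with a value near zero or below would be a candidate for the "eliminated or refined" step the abstract describes; the cutoff is a judgment call not specified in the source.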

  10. Development and Validation of the Foundational Healthcare Leadership Self-assessment.

    PubMed

    Van Hala, Sonja; Cochella, Susan; Jaggi, Rachel; Frost, Caren J; Kiraly, Bernadette; Pohl, Susan; Gren, Lisa

    2018-04-01

    We sought to develop and validate a self-assessment of foundational leadership skills for early-career physicians. We developed a leadership self-assessment from a compilation of materials on health care leadership skills. A sequential exploratory study was conducted using qualitative and quantitative analysis for face, content, and construct validity of the self-assessment. First, two focus groups were conducted with leaders in medicine and family medicine residents, to refine the pilot self-assessment. The self-assessment pilot was then tested with family medicine residents across the country, and the results were quantitatively evaluated with principal component analysis. This data was used to reduce and group the statements into leadership domains for the final self-assessment. Twenty-two invited family medicine residency programs agreed to distribute the survey. A total of 163 family medicine residents completed the survey, representing 16 to 20 residency programs from 12 states (response rate 28.9% to 34.8%). Analysis showed important differences by residency year, with more advanced residents scoring higher. The analysis reduced the number of items from 33 on the pilot assessment to 21 on the final assessment, which the authors titled the Foundational Healthcare Leadership Self-assessment (FHLS). The 21 items were grouped into five leadership domains: accountability, collaboration, communication, team management, and self-management. The FHLS is a validated 21-item self-assessment of foundational leadership skills for early career physicians. It takes less than 5 minutes to complete, and quantifies skill within five domains of foundational leadership. The FHLS is a first step in developing educational and evaluative assessments for training medical residents as clinician leaders.

  11. Development and reliability testing of the Worksite and Energy Balance Survey.

    PubMed

    Hoehner, Christine M; Budd, Elizabeth L; Marx, Christine M; Dodson, Elizabeth A; Brownson, Ross C

    2013-01-01

    Worksites represent important venues for health promotion. Development of psychometrically sound measures of worksite environments and policy supports for physical activity and healthy eating are needed for use in public health research and practice. Assess the test-retest reliability of the Worksite and Energy Balance Survey (WEBS), a self-report instrument for assessing perceptions of worksite supports for physical activity and healthy eating. The WEBS included items adapted from existing surveys or new items on the basis of a review of the literature and expert review. Cognitive interviews among 12 individuals were used to test the clarity of items and further refine the instrument. A targeted random-digit-dial telephone survey was administered on 2 occasions to assess test-retest reliability (mean days between time periods = 8; minimum = 5; maximum = 14). Five Missouri census tracts that varied by racial-ethnic composition and walkability. Respondents included 104 employed adults (67% white, 64% women, mean age = 48.6 years). Sixty-three percent were employed at worksites with less than 100 employees, approximately one-third supervised other people, and the majority worked a regular daytime shift (75%). Test-retest reliability was assessed using Spearman correlations for continuous variables, Cohen's κ statistics for nonordinal categorical variables, and 1-way random intraclass correlation coefficients for ordinal categorical variables. Test-retest coefficients ranged from 0.41 to 0.97, with 80% of items having reliability coefficients of more than 0.6. Items that assessed participation in or use of worksite programs/facilities tended to have lower reliability. Reliability of some items varied by gender, obesity status, and worksite size. Test-retest reliability and internal consistency for the 5 scales ranged from 0.84 to 0.94 and 0.63 to 0.84, respectively. The WEBS items and scales exhibited sound test-retest reliability and may be useful for research and surveillance. Further evaluation is needed to document the validity of the WEBS and associations with energy balance outcomes.
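
    One of the test-retest statistics named in the abstract above, Cohen's κ for a nonordinal categorical item, can be sketched as follows. The yes/no answers at the two administrations are invented for illustration, and the function name is hypothetical; this is not the authors' analysis code.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = counts_first.keys() | counts_second.keys()
    # Chance agreement from the marginal proportions at each administration.
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical answers from eight respondents to one yes/no item,
# asked on two occasions.
time1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
time2 = ["yes", "yes", "no", "yes", "yes", "no", "no", "no"]

print(cohen_kappa(time1, time2))
```

    Here six of eight answers agree (0.75), but half of that agreement is expected by chance given the even yes/no split, which is why κ is lower than the raw agreement rate.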

  12. Development of Sex-Trait Stereotypes Among Young Children in the United States, England, and Ireland.

    ERIC Educational Resources Information Center

    Best, Deborah L.; And Others

    The Sex Stereotype Measure II (SSM II), a 32-item picture-story technique, was developed to assess children's knowledge of conventional, adult-defined, sex-trait stereotypes. The procedure was based on stereotype characteristics identified by college students using the Adjective Check List item pool. A second procedure, the Sex Attitude Measure…

  13. Development of Abbreviated Nine-Item Forms of the Raven's Standard Progressive Matrices Test

    ERIC Educational Resources Information Center

    Bilker, Warren B.; Hansen, John A.; Brensinger, Colleen M.; Richard, Jan; Gur, Raquel E.; Gur, Ruben C.

    2012-01-01

    The Raven's Standard Progressive Matrices (RSPM) is a 60-item test for measuring abstract reasoning, considered a nonverbal estimate of fluid intelligence, and often included in clinical assessment batteries and research on patients with cognitive deficits. The goal was to develop and apply a predictive model approach to reduce the number of items…

  14. A Confirmatory Factor Analysis of Reilly's Role Overload Scale

    ERIC Educational Resources Information Center

    Thiagarajan, Palaniappan; Chakrabarty, Subhra; Taylor, Ronald D.

    2006-01-01

    In 1982, Reilly developed a 13-item scale to measure role overload. This scale has been widely used, but most studies did not assess the unidimensionality of the scale. Given the significance of unidimensionality in scale development, the current study reports a confirmatory factor analysis of the 13-item scale in two samples. Based on the…

  15. The Intuitive Eating Scale: Development and Preliminary Validation

    ERIC Educational Resources Information Center

    Hawks, Steven; Merrill, Ray M.; Madanat, Hala N.

    2004-01-01

    This article describes the development and validation of an instrument designed to measure the concept of intuitive eating. To ensure face and content validity for items used in the Likert-type Intuitive Eating Scale (IES), content domain was clearly specified and a panel of experts assessed the validity of each item. Based on responses from 391…

  16. [Development of "assessment guideline of family power for healthy life"].

    PubMed

    Fukushima, M; Shimanouchi, S; Kamei, T; Takagai, E; Hoshino, Y; Sugiyama, I

    1997-12-01

    The purpose of this study is to develop an "assessment guideline of family power for healthy life" aimed at expanding the self-care power of families in community nursing practice. The subjects of this study were 156 families that we had taken on as subjects of nursing care and study. The assessment guideline was constructed inductively from each case and then modified by applying it to families with health problems and to other families. As a result, we formed nine items of "family power for healthy life" and three items of "conditions influencing family power for healthy life" for the "assessment guideline of family power for healthy life".

  17. Medical student quality-of-life in the clerkships: a scale validation study.

    PubMed

    Brannick, Michael T; Horn, Gregory T; Schnaus, Michael J; Wahi, Monika M; Goldin, Steven B

    2015-04-01

    Many aspects of medical school are stressful for students. To empirically assess student reactions to clerkship programs, or to assess efforts to improve such programs, educators must measure the overall well-being of the students reliably and validly. The purpose of the study was to develop and validate a measure designed to achieve these goals. The authors developed a measure of quality of life for medical students by sampling (public domain) items tapping general happiness, fatigue, and anxiety. A quality-of-life scale was developed by factor analyzing responses to the items from students in two different clerkships from 2005 to 2008. Reliability was assessed using Cronbach's alpha. Validity was assessed by factor analysis, convergence with additional theoretically relevant scales, and sensitivity to change over time. The refined nine-item measure is a Likert scaled survey of quality-of-life items comprised of two domains: exhaustion and general happiness. The resulting scale demonstrated good reliability and factorial validity at two time points for each of the two samples. The quality-of-life measure also correlated with measures of depression and the amount of sleep reported during the clerkships. The quality-of-life measure appeared more sensitive to changes over time than did the depression measure. The measure is short and can be easily administered in a survey. The scale appears useful for program evaluation and more generally as an outcome variable in medical educational research.

  18. The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments.

    PubMed

    Tarrant, Marie; Knierim, Aimee; Hayes, Sasha K; Ware, James

    2006-12-01

    Multiple-choice questions are a common assessment method in nursing examinations. Few nurse educators, however, have formal preparation in constructing multiple-choice questions. Consequently, questions used in baccalaureate nursing assessments often contain item-writing flaws, or violations to accepted item-writing guidelines. In one nursing department, 2770 MCQs were collected from tests and examinations administered over a five-year period from 2001 to 2005. Questions were evaluated for 19 frequently occurring item-writing flaws, for cognitive level, for question source, and for the distribution of correct answers. Results show that almost half (46.2%) of the questions contained violations of item-writing guidelines and over 90% were written at low cognitive levels. Only a small proportion of questions were teacher generated (14.1%), while 36.2% were taken from testbanks and almost half (49.4%) had no source identified. MCQs written at a lower cognitive level were significantly more likely to contain item-writing flaws. While there was no relationship between the source of the question and item-writing flaws, teacher-generated questions were more likely to be written at higher cognitive levels (p<0.001). Correct answers were evenly distributed across all four options and no bias was noted in the placement of correct options. Further training in item-writing is recommended for all faculty members who are responsible for developing tests. Pre-test review and quality assessment is also recommended to reduce the occurrence of item-writing flaws and to improve the quality of test questions.

  19. Validation of a clinical critical thinking skills test in nursing.

    PubMed

    Shin, Sujin; Jung, Dukyoo; Kim, Sungeun

    2015-01-27

    The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Of the initial 30 items, 11 were excluded after analysis of the difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. From the above results, evidence of response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability.

  20. Validation of a clinical critical thinking skills test in nursing

    PubMed Central

    2015-01-01

    Purpose: The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. Methods: This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Results: Out of the initial 30 items, 11 items were excluded after the analysis of difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. Conclusion: From the above results, evidence of the response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability. PMID:25622716

  1. Introducing the Postsecondary Instructional Practices Survey (PIPS): A Concise, Interdisciplinary, and Easy-to-Score Survey

    PubMed Central

    Walter, Emily M.; Henderson, Charles R.; Beach, Andrea L.; Williams, Cody T.

    2016-01-01

    Researchers, administrators, and policy makers need valid and reliable information about teaching practices. The Postsecondary Instructional Practices Survey (PIPS) is designed to measure the instructional practices of postsecondary instructors from any discipline. The PIPS has 24 instructional practice statements and nine demographic questions. Users calculate PIPS scores by an intuitive proportion-based scoring convention. Factor analyses from 72 departments at four institutions (N = 891) support a 2- or 5-factor solution for the PIPS; both models include all 24 instructional practice items and have good model fit statistics. Factors in the 2-factor model include (a) instructor-centered practices, nine items; and (b) student-centered practices, 13 items. Factors in the 5-factor model include (a) student–student interactions, six items; (b) content delivery, four items; (c) formative assessment, five items; (d) student-content engagement, five items; and (e) summative assessment, four items. In this article, we describe our development and validation processes, provide scoring conventions and outputs for results, and describe wider applications of the instrument. PMID:27810868

  2. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    NASA Astrophysics Data System (ADS)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-12-01

    This study investigated the multiple-choice test of understanding of vectors (TUV) by applying item response theory (IRT). The difficulty, discrimination, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
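
    As an illustration of the three-parameter logistic model named in this abstract, the sketch below evaluates the standard 3PL item response function. It is not taken from the study: the function name and the example parameter values are hypothetical, and the logistic form is written without the optional scaling constant (D = 1.7) that some software applies.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a:     item discrimination
    b:     item difficulty
    c:     pseudo-guessing parameter (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Hypothetical item: moderate discrimination, average difficulty,
    # guessing floor of 0.25 (e.g. a four-option multiple-choice item).
    for theta in (-3.0, 0.0, 3.0):
        print(theta, round(p_correct_3pl(theta, a=1.0, b=0.0, c=0.25), 3))
```

    When ability equals the item difficulty, the probability sits halfway between the guessing floor and 1; at very low ability it approaches the guessing parameter rather than zero.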

  3. Validation of the brief version of the Recovery Self-Assessment (RSA-B) using Rasch measurement theory.

    PubMed

    Barbic, Skye P; Kidd, Sean A; Davidson, Larry; McKenzie, Kwame; O'Connell, Maria J

    2015-12-01

    In psychiatry, the recovery paradigm is increasingly identified as the overarching framework for service provision. Currently, the Recovery Self-Assessment (RSA), a 36-item rating scale, is commonly used to assess the uptake of a recovery orientation in clinical services. However, the consumer version of the RSA has been found challenging to complete because of length and the reading level required. In response to this feedback, a brief 12-item version of the RSA was developed (RSA-B). This article describes the development of the modified instrument and the application of traditional psychometric analysis and Rasch Measurement Theory to test the psychometric properties of the RSA-B. Data from a multisite study of adults with serious mental illnesses (n = 1256) who were followed by assertive community treatment teams were examined for reliability, clinical meaning, targeting, response categories, model fit, dependency, and raw interval-level measurement. Analyses were performed using the Rasch Unidimensional Measurement Model (RUMM 2030). Adequate fit to the Rasch model was observed (χ2 = 112.46, df = 90, p = .06) and internal consistency was good (r = .86). However, Rasch analysis revealed limitations of the 12-item version, with items covering only 39% of the targeted theoretical continuum, 2 misfitting items, and strong evidence for the 5 option response categories not working as intended. This study revealed areas for improvement in the shortened version of the 12-item RSA-B. A revisit of the conceptual model and original 36-item rating scale is encouraged to select items that will help practitioners and researchers measure the full range of recovery orientation. (c) 2015 APA, all rights reserved.

  4. The Long-Term Conditions Questionnaire: conceptual framework and item development.

    PubMed

    Peters, Michele; Potter, Caroline M; Kelly, Laura; Hunter, Cheryl; Gibbons, Elizabeth; Jenkinson, Crispin; Coulter, Angela; Forder, Julien; Towers, Ann-Marie; A'Court, Christine; Fitzpatrick, Ray

    2016-01-01

    To identify the main issues of importance when living with long-term conditions to refine a conceptual framework for informing the item development of a patient-reported outcome measure for long-term conditions. Semi-structured qualitative interviews (n=48) were conducted with people living with at least one long-term condition. Participants were recruited through primary care. The interviews were transcribed verbatim and analyzed by thematic analysis. The analysis served to refine the conceptual framework, based on reviews of the literature and stakeholder consultations, for developing candidate items for a new measure for long-term conditions. Three main organizing concepts were identified: impact of long-term conditions, experience of services and support, and self-care. The findings helped to refine a conceptual framework, leading to the development of 23 items that represent issues of importance in long-term conditions. The 23 candidate items formed the first draft of the measure, currently named the Long-Term Conditions Questionnaire. The aim of this study was to refine the conceptual framework and develop items for a patient-reported outcome measure for long-term conditions, including single and multiple morbidities and physical and mental health conditions. Qualitative interviews identified the key themes for assessing outcomes in long-term conditions, and these underpinned the development of the initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey.

  5. The Utrecht questionnaire (U-CEP) measuring knowledge on clinical epidemiology proved to be valid.

    PubMed

    Kortekaas, Marlous F; Bartelink, Marie-Louise E L; de Groot, Esther; Korving, Helen; de Wit, Niek J; Grobbee, Diederick E; Hoes, Arno W

    2017-02-01

    Knowledge on clinical epidemiology is crucial to practice evidence-based medicine. We describe the development and validation of the Utrecht questionnaire on knowledge on Clinical epidemiology for Evidence-based Practice (U-CEP); an assessment tool to be used in the training of clinicians. The U-CEP was developed in two formats: two sets of 25 questions and a combined set of 50. The validation was performed among postgraduate general practice (GP) trainees, hospital trainees, GP supervisors, and experts. Internal consistency, internal reliability (item-total correlation), item discrimination index, item difficulty, content validity, construct validity, responsiveness, test-retest reliability, and feasibility were assessed. The questionnaire was externally validated. Internal consistency was good with a Cronbach alpha of 0.8. The median item-total correlation and mean item discrimination index were satisfactory. Both sets were perceived as relevant to clinical practice. Construct validity was good. Both sets were responsive but failed on test-retest reliability. One set took 24 minutes and the other 33 minutes to complete, on average. External GP trainees had comparable results. The U-CEP is a valid questionnaire to assess knowledge on clinical epidemiology, which is a prerequisite for practicing evidence-based medicine in daily clinical practice. Copyright © 2016 Elsevier Inc. All rights reserved.
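
    The internal-consistency statistic reported in this abstract, Cronbach's alpha, can be computed from a respondents-by-items score matrix. The sketch below is a minimal illustration using only the Python standard library; the function name and the toy data are hypothetical and are not drawn from the U-CEP study.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent
    and one column per item (at least two items, two respondents)."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Toy data: four respondents answering three items scored 0-4.
    toy = [
        [1, 2, 1],
        [2, 2, 3],
        [3, 4, 3],
        [4, 4, 4],
    ]
    print(round(cronbach_alpha(toy), 3))
```

    Alpha rises toward 1 as the items covary more strongly relative to their individual variances; it says nothing by itself about whether the items measure a single dimension.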

  6. Development of an Instrument to Measure Behavioral Health Function for Work Disability: Item Pool Construction and Factor Analysis

    PubMed Central

    Marfeo, Elizabeth E.; Ni, Pengsheng; Haley, Stephen M.; Jette, Alan M.; Bogusz, Kara; Meterko, Mark; McDonough, Christine M.; Chan, Leighton; Brandt, Diane E.; Rasch, Elizabeth K.

    2014-01-01

    Objectives: To develop a broad set of claimant-reported items to assess behavioral health functioning relevant to the Social Security disability determination processes, and to evaluate the underlying structure of behavioral health functioning for use in development of a new functional assessment instrument. Design: Cross-sectional. Setting: Community. Participants: Item pools of behavioral health functioning were developed, refined, and field-tested in a sample of persons applying for Social Security disability benefits (N=1015) who reported difficulties working due to mental or both mental and physical conditions. Interventions: None. Main Outcome Measure: Social Security Administration Behavioral Health (SSA-BH) measurement instrument. Results: Confirmatory factor analysis (CFA) specified that a 4-factor model (self-efficacy, mood and emotions, behavioral control, and social interactions) had the optimal fit with the data and was also consistent with our hypothesized conceptual framework for characterizing behavioral health functioning. When the items within each of the four scales were tested in CFA, the fit statistics indicated adequate support for characterizing behavioral health as a unidimensional construct along these four distinct scales of function. Conclusion: This work represents a significant advance both conceptually and psychometrically in assessment methodologies for work related behavioral health. The measurement of behavioral health functioning relevant to the context of work requires the assessment of multiple dimensions of behavioral health functioning. Specifically, we identified a 4-factor model solution that represented key domains of work related behavioral health functioning. These results guided the development and scale formation of a new SSA-BH instrument. PMID:23548542

  7. Development of an instrument to measure behavioral health function for work disability: item pool construction and factor analysis.

    PubMed

    Marfeo, Elizabeth E; Ni, Pengsheng; Haley, Stephen M; Jette, Alan M; Bogusz, Kara; Meterko, Mark; McDonough, Christine M; Chan, Leighton; Brandt, Diane E; Rasch, Elizabeth K

    2013-09-01

    To develop a broad set of claimant-reported items to assess behavioral health functioning relevant to the Social Security disability determination processes, and to evaluate the underlying structure of behavioral health functioning for use in development of a new functional assessment instrument. Cross-sectional. Community. Item pools of behavioral health functioning were developed, refined, and field tested in a sample of persons applying for Social Security disability benefits (N=1015) who reported difficulties working because of mental or both mental and physical conditions. None. Social Security Administration Behavioral Health (SSA-BH) measurement instrument. Confirmatory factor analysis (CFA) specified that a 4-factor model (self-efficacy, mood and emotions, behavioral control, social interactions) had the optimal fit with the data and was also consistent with our hypothesized conceptual framework for characterizing behavioral health functioning. When the items within each of the 4 scales were tested in CFA, the fit statistics indicated adequate support for characterizing behavioral health as a unidimensional construct along these 4 distinct scales of function. This work represents a significant advance both conceptually and psychometrically in assessment methodologies for work-related behavioral health. The measurement of behavioral health functioning relevant to the context of work requires the assessment of multiple dimensions of behavioral health functioning. Specifically, we identified a 4-factor model solution that represented key domains of work-related behavioral health functioning. These results guided the development and scale formation of a new SSA-BH instrument. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  8. Developing a short version of the Toronto Structured Interview for Alexithymia using item response theory.

    PubMed

    Sekely, Angela; Taylor, Graeme J; Bagby, R Michael

    2018-03-17

    The Toronto Structured Interview for Alexithymia (TSIA) was developed to provide a structured interview method for assessing alexithymia. One drawback of this instrument is the amount of time it takes to administer and score. The current study used item response theory (IRT) methods to analyze data from a large heterogeneous multi-language sample (N = 842) to investigate whether a subset of items could be selected to create a short version of the instrument. Samejima's (1969) graded response model was used to fit the item responses. Items providing maximum information were retained in the short model, resulting in the elimination of 12 items from the original 24 items. Despite the 50% reduction in the number of items, 65.22% of the information was retained. Further studies are needed to validate the short version. A short version of the TSIA is potentially of practical value to clinicians and researchers with time constraints. Copyright © 2018. Published by Elsevier B.V.

  9. The development of an integrated assessment instrument for measuring analytical thinking and science process skills

    NASA Astrophysics Data System (ADS)

    Irwanto; Rohaeti, Eli; LFX, Endang Widjajanti; Suyanta

    2017-05-01

    This research aims to develop an integrated assessment instrument and determine its characteristics. The research uses the 4-D model, which includes define, design, develop, and disseminate. The primary product was validated by expert judgment, tested for readability by students, and assessed for feasibility by chemistry teachers. This research involved 246 students of grade XI of four senior high schools in Yogyakarta, Indonesia. Data collection techniques include interview, questionnaire, and test. Data collection instruments include interview guideline, item validation sheet, users' response questionnaire, instrument readability questionnaire, and essay test. The results show that the integrated assessment instrument has an Aiken validity value of 0.95. Item reliability was 0.99 and person reliability was 0.69. Teachers' response to the integrated assessment instrument is very good. Therefore, the integrated assessment instrument is feasible to be applied to measure the students' analytical thinking and science process skills.
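
    The Aiken validity value cited in this abstract is a content-validity coefficient computed from expert ratings. The sketch below shows the usual form of Aiken's V for a single item; the function name, the rating scale bounds, and the example ratings are hypothetical and are not the study's data.

```python
def aikens_v(ratings: list[int], lowest: int, highest: int) -> float:
    """Aiken's V for one item.

    ratings: one rating per expert, each between lowest and highest
    lowest:  lowest possible rating category
    highest: highest possible rating category
    V = sum(r - lowest) / (n * (highest - lowest)), ranging from 0 to 1.
    """
    n = len(ratings)
    s = sum(r - lowest for r in ratings)
    return s / (n * (highest - lowest))


if __name__ == "__main__":
    # Hypothetical panel of three experts rating an item on a 1-5 scale.
    print(round(aikens_v([5, 5, 4], lowest=1, highest=5), 3))
```

    A value near 1 means the experts rated the item close to the top of the scale; an instrument-level figure is typically obtained by summarizing V across items.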

  10. Assessing organizational climate: psychometric properties of the CLIOR Scale.

    PubMed

    Peña-Suárez, Elsa; Muñiz, José; Campillo-Álvarez, Angela; Fonseca-Pedrero, Eduardo; García-Cueto, Eduardo

    2013-02-01

    Organizational climate is the set of perceptions shared by workers who occupy the same workplace. The main goal of this study is to develop a new organizational climate scale and to determine its psychometric properties. The sample consisted of 3,163 Health Service workers. A total of 88.7% of participants worked in hospitals, and 11.3% in primary care; 80% were women and 20% men, with a mean age of 51.9 years (SD= 6.28). The proposed scale consists of 50 Likert-type items, with an alpha coefficient of 0.97, and an essentially one-dimensional structure. The discrimination indexes of the items are greater than 0.40, and the items show no differential item functioning in relation to participants' sex. A short version of the scale was developed, made up of 15 items, with discrimination indexes higher than 0.40, an alpha coefficient of 0.94, and its structure was clearly one-dimensional. These results indicate that the new scale has adequate psychometric properties, allowing a reliable and valid assessment of organizational climate.

  11. Development and Validation of the Homeostasis Concept Inventory

    PubMed Central

    McFarland, Jenny L.; Price, Rebecca M.; Wenderoth, Mary Pat; Martinková, Patrícia; Cliff, William; Michael, Joel; Modell, Harold; Wright, Ann

    2017-01-01

    We present the Homeostasis Concept Inventory (HCI), a 20-item multiple-choice instrument that assesses how well undergraduates understand this critical physiological concept. We used an iterative process to develop a set of questions based on elements in the Homeostasis Concept Framework. This process involved faculty experts and undergraduate students from associate’s colleges, primarily undergraduate institutions, regional and research-intensive universities, and professional schools. Statistical results provided strong evidence for the validity and reliability of the HCI. We found that graduate students performed better than undergraduates, biology majors performed better than nonmajors, and students performed better after receiving instruction about homeostasis. We used differential item analysis to assess whether students from different genders, races/ethnicities, and English language status performed differently on individual items of the HCI. We found no evidence of differential item functioning, suggesting that the items do not incorporate cultural or gender biases that would impact students’ performance on the test. Instructors can use the HCI to guide their teaching and student learning of homeostasis, a core concept of physiology. PMID:28572177

  12. Development of a Comprehensive Assessment of Food Parenting Practices: The Home Self-Administered Tool for Environmental Assessment of Activity and Diet Family Food Practices Survey.

    PubMed

    Vaughn, Amber E; Dearth-Wesley, Tracy; Tabak, Rachel G; Bryant, Maria; Ward, Dianne S

    2017-02-01

    Parents' food parenting practices influence children's dietary intake and risk for obesity and chronic disease. Understanding the influence and interactions between parents' practices and children's behavior is limited by a lack of development and psychometric testing and/or limited scope of current measures. The Home Self-Administered Tool for Environmental Assessment of Activity and Diet (HomeSTEAD) was created to address this gap. This article describes development and psychometric testing of the HomeSTEAD family food practices survey. Between August 2010 and May 2011, a convenience sample of 129 parents of children aged 3 to 12 years were recruited from central North Carolina and completed the self-administered HomeSTEAD survey on three occasions during a 12- to 18-day window. Demographic characteristics and child diet were assessed at Time 1. Child height and weight were measured during the in-home observations (following Time 1 survey). Exploratory factor analysis with Time 1 data was used to identify potential scales. Scales with more than three items were examined for scale reduction. Following this, mean scores were calculated at each time point. Construct validity was assessed by examining Spearman rank correlations between mean scores (Time 1) and children's diet (fruits and vegetables, sugar-sweetened beverages, snacks, sweets) and body mass index (BMI) z scores. Repeated measures analysis of variance was used to examine differences in mean scores between time points, and single-measure intraclass correlations were calculated to examine test-retest reliability between time points. Exploratory factor analysis identified 24 factors and retained 124 items; however, scale reduction narrowed items to 86. The final instrument captures five coercive control practices (16 items), seven autonomy support practices (24 items), and 12 structure practices (46 items). All scales demonstrated good internal reliability (α>.62), 18 factors demonstrated construct validity (significant association with child diet, P<0.05), and 22 demonstrated good reliability (intraclass correlation coefficient>0.61). The HomeSTEAD family food practices survey provides a brief, yet comprehensive and psychometrically sound assessment of food parenting practices. Copyright © 2017 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.

  13. Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

    PubMed

    Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

    2018-01-01

    To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.

  14. Development and Validation of a Disease-Specific Instrument to Measure Diet-Targeted Quality of Life for Postoperative Patients with Esophagogastric Cancer.

    PubMed

    Honda, Michitaka; Wakita, Takafumi; Onishi, Yoshihiro; Nunobe, Souya; Miura, Akinori; Nishigori, Tatsuto; Kusanagi, Hiroshi; Yamamoto, Takatsugu; Boddy, Alexander; Fukuhara, Shunichi

    2015-12-01

    Patients who have undergone esophagectomy or gastrectomy have certain dietary limitations because of changes to the alimentary tract. This study attempted to develop a psychometric scale, named "Esophago-Gastric surgery and Quality of Dietary life (EGQ-D)," for assessment of the impact of upper gastrointestinal surgery on diet-targeted quality of life. Using qualitative methods, the study team interviewed both patients and surgeons involved in esophagogastric cancer surgery, and we prepared an item pool and a draft scale. To evaluate the scale's psychometric reliability and validity, a survey involving a large number of patients was conducted. Items for the final scale were selected by factor analysis and item response theory. Cronbach's alpha was used for assessment of reliability, and correlations with the short form (SF)-12, esophagus and stomach surgery symptom scale (ES(4)), and nutritional indicators were analyzed to assess the criterion-related validity. Through multifaceted discussion and the pilot study, a draft questionnaire comprising 14 items was prepared, and a total of 316 patients were enrolled. On the basis of factor analysis and item response theory, six items were excluded, and the remaining eight items demonstrated strong unidimensionality for the final scale. Cronbach's alpha was 0.895. There were significant associations with all the subscale scores for SF-12, ES(4), and nutritional indicators. The EGQ-D scale has good content and psychometric validity and can be used as a disease-specific instrument to measure diet-targeted quality of life for postoperative patients with esophagogastric cancer.

  15. Measurement of the dimensions of food insecurity in developed countries: a systematic literature review.

    PubMed

    Ashby, Stephanie; Kleve, Suzanne; McKechnie, Rebecca; Palermo, Claire

    2016-11-01

    Food insecurity is a salient health issue comprised of four dimensions - food access, availability, utilization and stability over time. The aim of the present study was to conduct a systematic literature review to identify all multi-item tools that measure food insecurity and explore which of the dimensions they assess. Five databases were searched (CENTRAL, CINAHL plus, EMBASE, MEDLINE, TRIP) for studies published in English since 1999. Inclusion criteria included human studies using multi-item tools to measure food security and studies conducted in developed countries. Manuscripts describing the US Department of Agriculture Food Security Survey Module, that measures 'food access', were excluded due to wide acceptance of the validity and reliability of this instrument. Two authors extracted data and assessed the quality of the included studies. Data were summarized against the dimensions of food insecurity. A systematic review of the literature. The majority of tools were developed in the USA and had been used in different age groups and cultures. Eight multi-item tools were identified. All of the tools assessed the 'food access' dimension and two partially assessed the dimensions 'food utilization' and 'stability over time', respectively. 'Food availability' was not assessed by existing tools. Current tools available for measuring food insecurity are subjective, limited in scope, with a majority assessing only one dimension of food insecurity (access). To more accurately assess the true burden of food insecurity, tools should be adapted or developed to assess all four dimensions of food insecurity.

  16. The Subjective Sexual Arousal Scale for Men (SSASM): preliminary development and psychometric validation of a multidimensional measure of subjective male sexual arousal.

    PubMed

    Althof, Stanley E; Perelman, Michael A; Rosen, Raymond C

    2011-08-01

    Sexual arousal is a multifaceted process that involves both mental and physical components. No instrument has been developed and validated to assess subjective aspects of male sexual arousal. To develop and psychometrically validate a self-administered scale for assessing subjective male sexual arousal. Using recommendations of the Food and Drug Administration (FDA) guidance on patient-reported outcome instruments, important aspects of male sexual arousal were identified via qualitative research (focus groups and interviews) of U.S. men with erectile dysfunction (ED) and healthy controls. After a preliminary questionnaire was developed by a panel of experts, a quantitative study of men with ED and controls was conducted to psychometrically validate the Subjective Sexual Arousal Scale for Men (SSASM). To develop a male sexual arousal scale and determine its factor structure, reliability, and construct validity. Five aspects of male sexual arousal were identified from the qualitative focus groups and cognitive interviews. Men's preferred language for describing sexual arousal and preferred response formats were incorporated into the questions. Factor analysis of data from the quantitative study of 304 men aged 21 to 70 years identified five domains with eigenvalues >1: sexual performance (six items), mental satisfaction (five items), sexual assertiveness (three items), partner communication (three items), and partner relationship (three items). The five domains had a high degree of internal consistency (Cronbach's alpha values 0.88-0.94). Test-retest reliability over a 2- to 4-week period was moderately high to high (r values 0.75-0.88) for the five domain scores. Correlations between SSASM domain scores and standardized scale scores for social desirability, general health, life satisfaction, and sexual function demonstrated the construct validity of the scale. Preliminary validation data suggest that the 20-item SSASM scale may be useful as a multidimensional, reliable, self-administered instrument for assessing subjective sexual arousal in men of different ages. © 2011 International Society for Sexual Medicine.

  17. Modeling the World Health Organization Disability Assessment Schedule II using non-parametric item response models.

    PubMed

    Galindo-Garre, Francisca; Hidalgo, María Dolores; Guilera, Georgina; Pino, Oscar; Rojo, J Emilio; Gómez-Benito, Juana

    2015-03-01

    The World Health Organization Disability Assessment Schedule II (WHO-DAS II) is a multidimensional instrument developed for measuring disability. It comprises six domains (understanding and communicating, getting around, self-care, getting along with others, life activities, and participation in society). The main purpose of this paper is the evaluation of the psychometric properties for each domain of the WHO-DAS II with parametric and non-parametric Item Response Theory (IRT) models. A secondary objective is to assess whether the WHO-DAS II items within each domain form a hierarchy of invariantly ordered severity indicators of disability. A sample of 352 patients with a schizophrenia spectrum disorder is used in this study. The 36-item WHO-DAS II was administered during the consultation. Partial Credit and Mokken scale models are used to study the psychometric properties of the questionnaire. The psychometric properties of the WHO-DAS II scale are satisfactory for all the domains. However, we identify a few items that do not discriminate satisfactorily between different levels of disability and cannot be invariantly ordered in the scale. In conclusion, the WHO-DAS II can be used to assess overall disability in patients with schizophrenia, but some domains are too general to assess functionality in these patients because they contain items that are not applicable to this pathology. Copyright © 2014 John Wiley & Sons, Ltd.

  18. Development and Validation of a Novel Generic Health-related Quality of Life Instrument With 20 Items (HINT-20)

    PubMed Central

    2017-01-01

    Objectives: Few attempts have been made to develop a generic health-related quality of life (HRQoL) instrument and to examine its validity and reliability in Korea. We aimed to do this in our present study. Methods: After a literature review of existing generic HRQoL instruments, a focus group discussion, in-depth interviews, and expert consultations, we selected 30 tentative items for a new HRQoL measure. These items were evaluated by assessing their ceiling effects, difficulty, and redundancy in the first survey. To validate the HRQoL instrument that was developed, known-groups validity and convergent/discriminant validity were evaluated and its test-retest reliability was examined in the second survey. Results: Of the 30 items originally assessed for the HRQoL instrument, four were excluded due to high ceiling effects and six were removed due to redundancy. We ultimately developed an HRQoL instrument with a reduced number of 20 items, known as the Health-related Quality of Life Instrument with 20 items (HINT-20), incorporating physical, mental, social, and positive health dimensions. In the known-groups validity analysis, HINT-20 results were poorer in women, the elderly, and those with a low income. For convergent/discriminant validity, the correlation coefficients of items (except vitality) in the physical health dimension with the physical component summary of the Short Form 36 version 2 (SF-36v2) were generally higher than the correlations of those items with the mental component summary of the SF-36v2, and vice versa. Regarding test-retest reliability, the intraclass correlation coefficient of the total HINT-20 score was 0.813 (p<0.001). Conclusions: A novel generic HRQoL instrument, the HINT-20, was developed for the Korean general population and showed acceptable validity and reliability. PMID:28173686

  19. Development and initial validation of a computer-administered health literacy assessment in Spanish and English: FLIGHT/VIDAS.

    PubMed

    Ownby, Raymond L; Acevedo, Amarilis; Waldrop-Valverde, Drenna; Jacobs, Robin J; Caballero, Joshua; Davenport, Rosemary; Homs, Ana-Maria; Czaja, Sara J; Loewenstein, David

    2013-01-01

    Current measures of health literacy have been criticized on a number of grounds, including use of a limited range of content, development on small and atypical patient groups, and poor psychometric characteristics. In this paper, we report the development and preliminary validation of a new computer-administered and -scored health literacy measure addressing these limitations. Items in the measure reflect a wide range of content related to health promotion and maintenance as well as care for diseases. The development process has focused on creating a measure that will be useful in both Spanish and English, while not requiring substantial time for clinician training and individual administration and scoring. The items incorporate several formats, including questions based on brief videos, which allow for the assessment of listening comprehension and the skills related to obtaining information on the Internet. In this paper, we report the interim analyses detailing the initial development and pilot testing of the items (phase 1 of the project) in groups of Spanish and English speakers. We then describe phase 2, which included a second round of testing of the items, in new groups of Spanish and English speakers, and evaluation of the new measure's reliability and validity in relation to other measures. Data are presented that show that four scales (general health literacy, numeracy, conceptual knowledge, and listening comprehension), developed through a process of item and factor analyses, have significant relations to existing measures of health literacy.

  20. Validation of an instrument for assessing teacher knowledge of basic language constructs of literacy.

    PubMed

    Binks-Cantrell, Emily; Joshi, R Malatesha; Washburn, Erin K

    2012-10-01

    Recent national reports have stressed the importance of teacher knowledge in teaching reading. However, in the past, teachers' knowledge of language and literacy constructs has typically been assessed with instruments that are not fully tested for validity. In the present study, an instrument was developed; and its reliability, item difficulty, and item discrimination were computed and examined to identify model fit by applying exploratory factor analysis. Such analyses showed that the instrument demonstrated adequate estimates of reliability in assessing teachers' knowledge of language constructs. The implications for professional development of in-service teachers as well as preservice teacher education are also discussed.

  1. Differential Item Functioning in Primary Healthcare Evaluation Instruments by French/English Version, Educational Level and Urban/Rural Location

    PubMed Central

    Haggerty, Jeannie L.; Bouharaoui, Fatima; Santor, Darcy A.

    2011-01-01

    Evaluating the extent to which groups or subgroups of individuals differ with respect to primary healthcare experience depends on first ruling out the possibility of bias. Objective: To determine whether item or subscale performance differs systematically between French/English, high/low education subgroups and urban/rural residency. Method: A sample of 645 adult users balanced by French/English language (in Quebec and Nova Scotia, respectively), high/low education and urban/rural residency responded to six validated instruments: the Primary Care Assessment Survey (PCAS); the Primary Care Assessment Tool – Short Form (PCAT-S); the Components of Primary Care Index (CPCI); the first version of the EUROPEP (EUROPEP-I); the Interpersonal Processes of Care Survey, version II (IPC-II); and part of the Veterans Affairs National Outpatient Customer Satisfaction Survey (VANOCSS). We normalized subscale scores to a 0-to-10 scale and tested for between-group differences using ANOVA tests. We used a parametric item response model to test for differences between subgroups in item discriminability and item difficulty. We re-examined group differences after removing items with differential item functioning. Results: Experience of care was assessed more positively in the English-speaking (Nova Scotia) than in the French-speaking (Quebec) respondents. We found differential English/French item functioning in 48% of the 153 items: discriminability in 20% and differential difficulty in 28%. English items were more discriminating generally than the French. Removing problematic items did not change the differences in French/English assessments. Differential item functioning by high/low education status affected 27% of items, with items being generally more discriminating in high-education groups. Between-group comparisons were unchanged. In contrast, only 9% of items showed differential item functioning by geography, affecting principally the accessibility attribute. 
Removing problematic items reversed a previously non-significant finding, revealing poorer first-contact access in rural than in urban areas. Conclusion: Differential item functioning does not bias or invalidate French/English comparisons on subscales, but additional development is required to make French and English items equivalent. These instruments are relatively robust by educational status and geography, but results suggest potential differences in the underlying construct in low-education and rural respondents. PMID:23205035
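
    The 0-to-10 normalization mentioned in the Method above can be illustrated with a minimal Python sketch. It assumes a plain linear rescaling of the mean item response, which the abstract does not spell out; the function names and the rule of skipping unanswered items (None) are hypothetical choices made for illustration only.

```python
def normalize_0_to_10(raw_mean, scale_min, scale_max):
    """Linearly map a value from [scale_min, scale_max] onto [0, 10]."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must exceed scale_min")
    return 10.0 * (raw_mean - scale_min) / (scale_max - scale_min)


def subscale_score(item_responses, scale_min, scale_max):
    """Mean of the answered items, rescaled to 0-10 (None = unanswered)."""
    answered = [r for r in item_responses if r is not None]
    if not answered:
        return None  # no usable responses for this subscale
    raw_mean = sum(answered) / len(answered)
    return normalize_0_to_10(raw_mean, scale_min, scale_max)


# Example: three items on a 1-5 Likert scale.
print(subscale_score([1, 5, 3], 1, 5))
```

    Putting every subscale on a common range is what makes scores from instruments with different response formats comparable before any between-group test is run.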

  2. Development of Attitudes Toward Homosexuality Scale for Indians (AHSI).

    PubMed

    Ahuja, Kanika K

    2017-01-01

    Attitudes toward homosexuality vary across cultures, with the legal and societal position being rather complicated in India. This study describes the process of developing and validating a Likert-type scale to assess attitudes toward homosexuality among heterosexuals. Phase 1 describes the development of the scale. Items were written based on thematic analysis of narratives generated from 50 college students and reviewing existing scales. After administering the 70-item scale to 68 participants, item analysis yielded 20 statements with item-total correlations over .70. Cronbach's alpha was .97. In Phase 2, the 20-item Attitudes Toward Homosexuality Scale for Indians (AHSI) was administered to 142 participants. Analysis yielded a corrected split-half correlation of .91. Further, AHSI discriminated between women and men; between liberal arts and STEM/business students; and those who reported interpersonal contact with gay men and lesbian women and those who did not. The scale has satisfactory reliability and shows promising construct validity.
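
    The three statistics reported in this abstract (item-total correlation, Cronbach's alpha, and a corrected split-half correlation) can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's procedure: the item-total correlation is computed against the sum of the remaining items, the split is odd versus even items, and the "correction" is taken to be the Spearman-Brown formula. All function names are hypothetical.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows, j):
    """Correlation of item j with the sum of all other items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson(item, rest)


def spearman_brown_split_half(rows):
    """Odd-even split-half correlation, stepped up with Spearman-Brown."""
    half_a = [sum(r[0::2]) for r in rows]
    half_b = [sum(r[1::2]) for r in rows]
    r_half = pearson(half_a, half_b)
    return 2 * r_half / (1 + r_half)


# Toy data: 3 respondents, 2 items (not from the study).
data = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(data), spearman_brown_split_half(data))
```

    Under these assumptions, item selection in Phase 1 would amount to keeping the items whose `corrected_item_total` value exceeds .70.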

  3. Design and development of an instrument to measure overall lifestyle habits for epidemiological research: the Mediterranean Lifestyle (MEDLIFE) index.

    PubMed

    Sotos-Prieto, Mercedes; Moreno-Franco, Belén; Ordovás, Jose M; León, Montse; Casasnovas, Jose A; Peñalvo, Jose L

    2015-04-01

    The aim was to design and develop a questionnaire that can account for an individual's adherence to a Mediterranean lifestyle, including the assessment of diet and physical activity patterns as well as social interaction. The Mediterranean Lifestyle (MEDLIFE) index was created based on the current Spanish Mediterranean food guide pyramid. MEDLIFE is a twenty-eight-item derived index consisting of questions about food consumption (fifteen items), traditional Mediterranean dietary habits (seven items) and physical activity, rest and social interaction habits (six items). Linear regression models were fitted and Spearman rank correlations computed to assess content validity and internal consistency. A subset of participants (n 988) in the Aragon Workers' Health Study cohort (Zaragoza, Spain) provided the data for development of MEDLIFE. Mean MEDLIFE score was 11·3 (sd 2·6; range: 0-28), and the quintile distribution of MEDLIFE score showed a significant association with each of the individual items as well as with specific nutrients and lifestyle indicators (intra-validity). We also quantified MEDLIFE correspondence with previously reported diet quality indices and found significant correlations (ρ range: 0·44-0·53; P<0·001) for the Alternate Healthy Eating Index, the Alternate Mediterranean Diet Index and Mediterranean Diet Adherence Screener. MEDLIFE is the first index to include an overall assessment of lifestyle habits. It is expected to be a more holistic tool to measure adherence to the Mediterranean lifestyle in epidemiological studies.

  4. Measuring the effects of online health information for patients: Item generation for an e-health impact questionnaire

    PubMed Central

    Kelly, Laura; Jenkinson, Crispin; Ziebland, Sue

    2013-01-01

    Objective: The internet is a valuable resource for accessing health information and support. We are developing an instrument to assess the effects of websites with experiential and factual health information. This study aimed to inform an item pool for the proposed questionnaire. Methods: Items were informed through a review of relevant literature and secondary qualitative analysis of 99 narrative interviews relating to patient and carer experiences of health. Statements relating to identified themes were re-cast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n = 21) were used to assess items for face and content validity. Results: Eighty-two generic items were identified following secondary qualitative analysis and expert review. Cognitive interviewing confirmed that the questionnaire instructions, 62 items and the response options were acceptable to patients and carers. Conclusion: Using a clear conceptual basis to inform item generation, 62 items have been identified as suitable to undergo further psychometric testing. Practice implications: The final questionnaire will initially be used in a randomized controlled trial examining the effects of online patients' experiences. This will inform recommendations on the best way to present patients' experiences within health information websites. PMID:23598293

  5. When less is more: validating a brief scale to rate interprofessional team competencies.

    PubMed

    Lie, Désirée A; Richter-Lagha, Regina; Forest, Christopher P; Walsh, Anne; Lohenry, Kevin

    2017-01-01

    There is a need for validated and easy-to-apply behavior-based tools for assessing interprofessional team competencies in clinical settings. The seven-item observer-based Modified McMaster-Ottawa scale was developed for the Team Objective Structured Clinical Encounter (TOSCE) to assess individual and team performance in interprofessional patient encounters. We aimed to improve scale usability for clinical settings by reducing item numbers while maintaining generalizability; and to explore the minimum number of observed cases required to achieve modest generalizability for giving feedback. We administered a two-station TOSCE in April 2016 to 63 students split into 16 newly-formed teams, each consisting of four professions. The stations were of similar difficulty. We trained sixteen faculty to rate two teams each. We examined individual and team performance scores using generalizability (G) theory and principal component analysis (PCA). The seven-item scale shows modest generalizability (.75) with individual scores. PCA revealed multicollinearity and singularity among scale items and we identified three potential items for removal. Reducing items for individual scores from seven to four (measuring Collaboration, Roles, Patient/Family-centeredness, and Conflict Management) changed scale generalizability from .75 to .73. Performance assessment with two cases is associated with reasonable generalizability (.73). Students in newly-formed interprofessional teams show a learning curve after one patient encounter. Team scores from a two-station TOSCE demonstrate low generalizability whether the scale consisted of four (.53) or seven items (.55). The four-item Modified McMaster-Ottawa scale for assessing individual performance in interprofessional teams retains the generalizability and validity of the seven-item scale. Observation of students in teams interacting with two different patients provides reasonably reliable ratings for giving feedback. 
The four-item scale has potential for assessing individual student skills and the impact of IPE curricula in clinical practice settings. IPE: Interprofessional education; SP: Standardized patient; TOSCE: Team objective structured clinical encounter.

  6. Differential Performance by English Language Learners on an Inquiry-Based Science Assessment

    NASA Astrophysics Data System (ADS)

    Turkan, Sultan; Liu, Ou Lydia

    2012-10-01

    The performance of English language learners (ELLs) has been a concern given the rapidly changing demographics in US K-12 education. This study aimed to examine whether students' English language status has an impact on their inquiry science performance. Differential item functioning (DIF) analysis was conducted with regard to ELL status on an inquiry-based science assessment, using a multifaceted Rasch DIF model. A total of 1,396 seventh- and eighth-grade students took the science test, including 313 ELL students. The results showed that, overall, non-ELLs significantly outperformed ELLs. Of the four items that showed DIF, three favored non-ELLs while one favored ELLs. The item that favored ELLs provided a graphic representation of a science concept within a family context. There is some evidence that constructed-response items may help ELLs articulate scientific reasoning using their own words. Assessment developers and teachers should pay attention to the possible interaction between linguistic challenges and science content when designing assessment for and providing instruction to ELLs.
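
    To make the idea of DIF in a Rasch framework concrete, the following Python sketch shows how a group-specific difficulty shift changes the probability of a correct response for two students of equal ability. It is a simplified illustration, not the multifaceted model fitted in the study; the parameter name dif_shift and all numeric values are hypothetical.

```python
import math


def rasch_prob(theta, difficulty, dif_shift=0.0):
    """P(correct) under a Rasch model with an optional group-by-item shift.

    theta      : person ability (logits)
    difficulty : item difficulty (logits)
    dif_shift  : extra difficulty the item carries for one group;
                 0.0 means the item functions identically across groups.
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty - dif_shift)))


# Two students with the same ability answer the same item.
p_reference = rasch_prob(theta=1.0, difficulty=0.0)
p_focal = rasch_prob(theta=1.0, difficulty=0.0, dif_shift=0.5)
print(p_reference > p_focal)  # the item is harder for the focal group
```

    An item shows DIF when a non-zero shift of this kind is needed to fit the data after ability has been accounted for; a negative shift would correspond to an item that favors the focal group.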

  7. Development and validation of the Myasthenia Gravis Impairment Index.

    PubMed

    Barnett, Carolina; Bril, Vera; Kapral, Moira; Kulkarni, Abhaya; Davis, Aileen M

    2016-08-30

    We aimed to develop a measure of myasthenia gravis impairment using a previously developed framework and to evaluate reliability and validity, specifically face, content, and construct validity. The first draft of the Myasthenia Gravis Impairment Index (MGII) included examination items from available measures enriched with newly developed, patient-reported items, modified after patient input. International neuromuscular specialists evaluated face and content validity via an e-mail survey. Test-retest reliability was assessed in stable patients at a 3-week interval and interrater reliability was evaluated in the same day. Construct validity was assessed through correlations between the MGII and other measures and by comparing scores in different patient groups. The first draft was assessed by 18 patients, and 72 specialists answered the survey. The second draft had 7 examination and 22 patient-reported items. Field testing included 200 patients, with 54 patients completing the reliability studies. Test-retest reliability of the total score was good (intraclass correlation coefficient 0.92; 95% confidence interval 0.79-0.94), as was interrater reliability of the examination component (intraclass correlation coefficient 0.81; 95% confidence interval 0.79-0.94). The MGII correlated well with comparison measures, with higher correlations with the MG-activities of daily living (r = 0.91) and MG-specific quality of life 15-item scale (r = 0.78). When assessing different patient groups, the scores followed expected patterns. The MGII was developed using a patient-centered framework of myasthenia-related impairments and incorporating patient input throughout the development process. It is reliable in an outpatient setting and has demonstrated construct validity. Responsiveness studies are under way. © 2016 American Academy of Neurology.

  8. Development and validation of the Myasthenia Gravis Impairment Index

    PubMed Central

    Bril, Vera; Kapral, Moira; Kulkarni, Abhaya; Davis, Aileen M.

    2016-01-01

    Objective: We aimed to develop a measure of myasthenia gravis impairment using a previously developed framework and to evaluate reliability and validity, specifically face, content, and construct validity. Methods: The first draft of the Myasthenia Gravis Impairment Index (MGII) included examination items from available measures enriched with newly developed, patient-reported items, modified after patient input. International neuromuscular specialists evaluated face and content validity via an e-mail survey. Test–retest reliability was assessed in stable patients at a 3-week interval and interrater reliability was evaluated in the same day. Construct validity was assessed through correlations between the MGII and other measures and by comparing scores in different patient groups. Results: The first draft was assessed by 18 patients, and 72 specialists answered the survey. The second draft had 7 examination and 22 patient-reported items. Field testing included 200 patients, with 54 patients completing the reliability studies. Test–retest reliability of the total score was good (intraclass correlation coefficient 0.92; 95% confidence interval 0.79–0.94), as was interrater reliability of the examination component (intraclass correlation coefficient 0.81; 95% confidence interval 0.79–0.94). The MGII correlated well with comparison measures, with higher correlations with the MG–activities of daily living (r = 0.91) and MG-specific quality of life 15-item scale (r = 0.78). When assessing different patient groups, the scores followed expected patterns. Conclusions: The MGII was developed using a patient-centered framework of myasthenia-related impairments and incorporating patient input throughout the development process. It is reliable in an outpatient setting and has demonstrated construct validity. Responsiveness studies are under way. PMID:27402891

  9. Developing a Questionnaire to Evaluate College Students' Knowledge, Attitude, Behavior, Self-efficacy, and Environmental Factors Related to Canned Foods.

    PubMed

    Richards, Rickelle; Brown, Lora Beth; Williams, D Pauline; Eggett, Dennis L

    2017-02-01

    To develop a questionnaire to measure students' knowledge, attitude, behavior, self-efficacy, and environmental factors related to the use of canned foods. The Knowledge-Attitude-Behavior Model, Social Cognitive Theory, and Canned Foods Alliance survey were used as frameworks for questionnaire development. Cognitive interviews were conducted with college students (n = 8). Nutrition and survey experts assessed content validity. Reliability was measured via Cronbach α and 2 rounds (1, n = 81; 2, n = 65) of test-retest statistics. Descriptive statistics (means and frequencies) were used. The 65-item questionnaire had a test-retest reliability of .69. Cronbach α scores were .87 for knowledge (9 items), .86 for attitude (30 items), .80 for self-efficacy (12 items), .68 for canned foods use (8 items), and .30 for environment (6 items). A reliable questionnaire was developed to measure perceptions and use of canned foods. Nutrition educators may find this questionnaire useful to evaluate pretest-posttest changes from canned foods-based interventions among college students. Copyright © 2016 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.

  10. Development and Psychometric Evaluation of an Instrument to Assess Cross-Cultural Competence of Healthcare Professionals (CCCHP)

    PubMed Central

    Bernhard, Gerda; Knibbe, Ronald A.; von Wolff, Alessa; Dingoyan, Demet; Schulz, Holger; Mösko, Mike

    2015-01-01

    Background Cultural competence of healthcare professionals (HCPs) is recognized as a strategy to reduce cultural disparities in healthcare. However, standardised, valid and reliable instruments to assess HCPs’ cultural competence are notably lacking. The present study aims to 1) identify the core components of cultural competence from a healthcare perspective, 2) to develop a self-report instrument to assess cultural competence of HCPs and 3) to evaluate the psychometric properties of the new instrument. Methods The conceptual model and initial item pool, which were applied to the cross-cultural competence instrument for the healthcare profession (CCCHP), were derived from an expert survey (n = 23), interviews with HCPs (n = 12), and a broad narrative review on assessment instruments and conceptual models of cultural competence. The item pool was reduced systematically, which resulted in a 59-item instrument. A sample of 336 psychologists, in advanced psychotherapeutic training, and 409 medical students participated, in order to evaluate the construct validity and reliability of the CCCHP. Results Construct validity was supported by principal component analysis, which led to a 32-item six-component solution with 50% of the total variance explained. The different dimensions of HCPs’ cultural competence are: Cross-Cultural Motivation/Curiosity, Cross-Cultural Attitudes, Cross-Cultural Skills, Cross-Cultural Knowledge/Awareness and Cross-Cultural Emotions/Empathy. For the total instrument, the internal consistency reliability was .87 and the dimension’s Cronbach’s α ranged from .54 to .84. The discriminating power of the CCCHP was indicated by statistically significant mean differences in CCCHP subscale scores between predefined groups. Conclusions The 32-item CCCHP exhibits acceptable psychometric properties, particularly content and construct validity to examine HCPs’ cultural competence. 
The CCCHP with its five dimensions offers a comprehensive assessment of HCPs’ cultural competence, and has the ability to distinguish between groups that are expected to differ in cultural competence. This instrument can foster professional development through systematic self-assessment and thus contributes to improve the quality of patient care. PMID:26641876

  11. Development and Psychometric Evaluation of an Instrument to Assess Cross-Cultural Competence of Healthcare Professionals (CCCHP).

    PubMed

    Bernhard, Gerda; Knibbe, Ronald A; von Wolff, Alessa; Dingoyan, Demet; Schulz, Holger; Mösko, Mike

    2015-01-01

    Cultural competence of healthcare professionals (HCPs) is recognized as a strategy to reduce cultural disparities in healthcare. However, standardised, valid and reliable instruments to assess HCPs' cultural competence are notably lacking. The present study aims to 1) identify the core components of cultural competence from a healthcare perspective, 2) to develop a self-report instrument to assess cultural competence of HCPs and 3) to evaluate the psychometric properties of the new instrument. The conceptual model and initial item pool, which were applied to the cross-cultural competence instrument for the healthcare profession (CCCHP), were derived from an expert survey (n = 23), interviews with HCPs (n = 12), and a broad narrative review on assessment instruments and conceptual models of cultural competence. The item pool was reduced systematically, which resulted in a 59-item instrument. A sample of 336 psychologists, in advanced psychotherapeutic training, and 409 medical students participated, in order to evaluate the construct validity and reliability of the CCCHP. Construct validity was supported by principal component analysis, which led to a 32-item six-component solution with 50% of the total variance explained. The different dimensions of HCPs' cultural competence are: Cross-Cultural Motivation/Curiosity, Cross-Cultural Attitudes, Cross-Cultural Skills, Cross-Cultural Knowledge/Awareness and Cross-Cultural Emotions/Empathy. For the total instrument, the internal consistency reliability was .87 and the dimension's Cronbach's α ranged from .54 to .84. The discriminating power of the CCCHP was indicated by statistically significant mean differences in CCCHP subscale scores between predefined groups. The 32-item CCCHP exhibits acceptable psychometric properties, particularly content and construct validity to examine HCPs' cultural competence. 
The CCCHP with its five dimensions offers a comprehensive assessment of HCPs' cultural competence, and has the ability to distinguish between groups that are expected to differ in cultural competence. This instrument can foster professional development through systematic self-assessment and thus contributes to improve the quality of patient care.

  12. Assessment of health surveys: fitting a multidimensional graded response model.

    PubMed

    Depaoli, Sarah; Tiemensma, Jitske; Felt, John M

    The multidimensional graded response model, an item response theory (IRT) model, can be used to improve the assessment of surveys, even when sample sizes are restricted. Typically, health-based survey development utilizes classical statistical techniques (e.g. reliability and factor analysis). In a review of four prominent journals within the field of Health Psychology, we found that IRT-based models were used in less than 10% of the studies examining scale development or assessment. However, implementing IRT-based methods can provide more details about individual survey items, which is useful when determining the final item content of surveys. An example using a quality of life survey for Cushing's syndrome (CushingQoL) highlights the main components for implementing the multidimensional graded response model. Patients with Cushing's syndrome (n = 397) completed the CushingQoL. Results from the multidimensional graded response model supported a 2-subscale scoring process for the survey. All items were deemed as worthy contributors to the survey. The graded response model can accommodate unidimensional or multidimensional scales, be used with relatively lower sample sizes, and is implemented in free software (example code provided in online Appendix). Use of this model can help to improve the quality of health-based scales being developed within the Health Sciences.
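
    As a rough illustration of what the multidimensional graded response model computes for a single item, the Python sketch below derives category probabilities from cumulative logistic curves. It assumes the common slope-intercept parameterization (cumulative logit = slopes · theta + intercept); the function name and all parameter values are hypothetical and are not taken from the CushingQoL analysis.

```python
import math


def grm_category_probs(theta, slopes, intercepts):
    """Category probabilities for one graded-response item.

    theta      : list of latent trait values, one per dimension
    slopes     : list of discrimination parameters, one per dimension
    intercepts : strictly decreasing list; K-1 values for K categories
    """
    z = sum(a * t for a, t in zip(slopes, theta))
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-(z + d))) for d in intercepts]
    cumulative.append(0.0)
    # P(X = k) is the difference between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1]
            for k in range(len(cumulative) - 1)]


# A four-category item loading on two dimensions (made-up values).
probs = grm_category_probs([0.0, 0.0], [1.2, 0.8], [1.0, 0.0, -1.0])
print(probs)
```

    Setting one slope to zero reduces the item to the unidimensional case, which is how a 2-subscale structure can be expressed within a single model.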

  13. When the Test Developer Does Not Speak the Target Language: The Use of Language Informants in the Test Development Process

    ERIC Educational Resources Information Center

    Ryan, Ève; Brunfaut, Tineke

    2016-01-01

    It is not unusual for tests in less-commonly taught languages (LCTLs) to be developed by an experienced item writer with no proficiency in the language being tested, in collaboration with a language informant who is a speaker of the target language, but lacks language assessment expertise. How this approach to item writing works in practice, and…

  14. Development and validity of a questionnaire to test the knowledge of primary care personnel regarding nutrition in obese adolescents.

    PubMed

    de Pinho, Lucinéia; Moura, Paulo Henrique Tolentino; Silveira, Marise Fagundes; de Botelho, Ana Cristina Carvalho; Caldeira, Antônio Prates

    2013-07-18

    In light of its epidemic proportions in developed and developing countries, obesity is considered a serious public health issue. In order to increase knowledge concerning the ability of health care professionals in caring for obese adolescents and adopt more efficient preventive and control measures, a questionnaire was developed and validated to assess non-dietitian health professionals regarding their Knowledge of Nutrition in Obese Adolescents (KNOA). The development and evaluation of a questionnaire to assess the knowledge of primary care practitioners with respect to nutrition in obese adolescents was carried out in five phases, as follows: 1) definition of study dimensions; 2) development of 42 questions and preliminary evaluation of the questionnaire by a panel of experts; 3) characterization and selection of primary care practitioners (35 dietitians and 265 non-dietitians) and measurement of questionnaire criteria by contrasting the responses of dietitians and non-dietitians; 4) reliability assessment by question exclusion based on item difficulty (too easy and too difficult for non-dietitian practitioners), item discrimination, internal consistency and reproducibility index determination; and 5) scoring the completed questionnaires. Dietitians obtained higher scores than non-dietitians (Mann-Whitney U test, P < 0.05), confirming the validity of the questionnaire criteria. Items were discriminated by correlating the score for each item with the total score, using a minimum of 0.2 as a correlation coefficient cutoff value. Item difficulty was controlled by excluding questions answered correctly by more than 90% of the non-dietitian subjects (too easy) or by less than 10% of them (too difficult). The final questionnaire contained 26 of the original 42 questions, increasing Cronbach's α value from 0.788 to 0.807. Test-retest agreement between respondents was classified as good to very good (Kappa test, >0.60). The KNOA questionnaire developed for primary care practitioners is a valid, consistent and suitable instrument that can be applied over time, making it a promising tool for developing and guiding public health policies.
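
    The item-exclusion rules described in this abstract (difficulty between 10% and 90% correct, item-total correlation of at least 0.2) can be sketched in Python as below. The thresholds come from the abstract; the use of a Pearson correlation, the function names, and the toy data are assumptions made for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def screen_items(responses, min_p=0.10, max_p=0.90, min_r=0.20):
    """Return indices of items that pass both screening rules.

    responses: one list per respondent, 1 = correct, 0 = incorrect.
    """
    totals = [sum(r) for r in responses]
    kept = []
    for j in range(len(responses[0])):
        item = [r[j] for r in responses]
        p_correct = sum(item) / len(item)
        if p_correct > max_p or p_correct < min_p:
            continue  # too easy or too difficult
        if pearson(item, totals) >= min_r:
            kept.append(j)  # discriminates adequately
    return kept


# Toy data: 10 respondents, 5 items (not from the study).
toy = [[1, 0, 1, 1, 0]] * 5 + [[1, 0, 0, 0, 1]] * 5
print(screen_items(toy))
```

    In the toy data, item 0 is answered correctly by everyone and item 1 by no one, so both fail the difficulty rule, while item 4 correlates negatively with the total score and fails the discrimination rule.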

  15. [A new German Scale for Assessing Parental Stress after Preterm Birth (PSS:NICU_German/2-scales)].

    PubMed

    Urlesberger, P; Schienle, A; Pichler, G; Baik, N; Schwaberger, B; Urlesberger, B; Pichler-Stachl, E

    2017-04-01

    Background: Preterm birth is known to be a stressful and anxiety-provoking situation for parents, which might have a long-term impact on the psychological health of mothers and even on the development of their preterm infants. Objective: The Parental Stressor Scale: Neonatal Intensive Care Unit (PSS:NICU) was developed to assess parental stress after preterm birth through three subscales [1]. The aim of the present study was to examine the psychometric properties and the dimensionality of the German version of the PSS:NICU in order to develop a reliable German version. Methods: For the development (exploratory factor analysis), 100 parents of preterm infants answered the questionnaire. Results: The Sights and Sounds subscale was removed from the German version of the PSS:NICU due to its low number of items. The PSS:NICU_German/2-scales was developed, consisting of 2 subscales: Infant Behavior and Appearance (7 items, Cronbach's α=0.82) and Parental Role Alteration (6 items, Cronbach's α=0.87). Conclusions: The PSS:NICU_German/2-scales is a reliable and economical scale for the assessment of parental stress after preterm birth. © Georg Thieme Verlag KG Stuttgart · New York.

  16. Measuring nonsolar tanning behavior: indoor and sunless tanning.

    PubMed

    Lazovich, Deann; Stryker, Jo Ellen; Mayer, Joni A; Hillhouse, Joel; Dennis, Leslie K; Pichon, Latrice; Pagoto, Sherry; Heckman, Carolyn; Olson, Ardis; Cokkinides, Vilma; Thompson, Kevin

    2008-02-01

    To develop items to measure indoor tanning and sunless tanning that can be used to monitor trends in population surveys or to assess changes in behavior in intervention studies. A group of experts on indoor tanning convened in December 2005, as part of a national workshop to review the state of the evidence, define measurement issues, and develop items for ever tanned indoors, lifetime frequency, and past-year frequency for both indoor tanning and sunless tanning. Each item was subsequently assessed via in-person interviews for clarity, specificity, recall, and appropriateness of wording. Universities in Tennessee and Virginia, a medical center in Massachusetts, and a high school in New Hampshire. The study population comprised 24 adults and 7 adolescents. Participants understood indoor tanning to represent tanning from beds, booths, and lamps that emit artificial UV radiation, rather than sunless tanning, even though both can be obtained from a booth. Two items were required to distinguish manually applied from booth-applied sunless tanning products. Frequency of use was easier for participants to recall in the past year than for a lifetime. While indoor tanning items may be recommended with confidence for clarity, sunless tanning items require additional testing. Memory aids may be necessary to facilitate recall of lifetime use of nonsolar tanning. In addition, studies that assess reliability and validity of these measures are needed. Since study participants were primarily young and female, testing in other populations should also be considered.

  17. Development of the Preschool Developmental Assessment Scale (PDAS) on Children's Social Development

    ERIC Educational Resources Information Center

    Leung, Cynthia; Cheung, Jasmine; Lau, Vanessa; Lam, Catherine

    2011-01-01

    This paper aimed to describe the design and development of the social domain of the Preschool Developmental Assessment Scale (PDAS), which would be used for assessment of preschool children with different developmental disabilities. The original version of the social domain consisted of 30 items. Children were asked questions about their social…

  18. Development of the Systems Thinking Scale for Adolescent Behavior Change.

    PubMed

    Moore, Shirley M; Komton, Vilailert; Adegbite-Adeniyi, Clara; Dolansky, Mary A; Hardin, Heather K; Borawski, Elaine A

    2018-03-01

    This report describes the development and psychometric testing of the Systems Thinking Scale for Adolescent Behavior Change (STS-AB). Following item development, initial assessments of understandability and stability of the STS-AB were conducted in a sample of nine adolescents enrolled in a weight management program. Exploratory factor analysis of the 16-item STS-AB and internal consistency assessments were then done with 359 adolescents enrolled in a weight management program. Test-retest reliability of the STS-AB was .71, p = .03; internal consistency reliability was .87. Factor analysis of the 16-item STS-AB indicated a one-factor solution with good factor loadings, ranging from .40 to .67. Evidence of construct validity was supported by significant correlations with established measures of variables associated with health behavior change. We provide beginning evidence of the reliability and validity of the STS-AB to measure systems thinking for health behavior change in young adolescents.

  19. Development of the Systems Thinking Scale for Adolescent Behavior Change

    PubMed Central

    Moore, Shirley M.; Komton, Vilailert; Adegbite-Adeniyi, Clara; Dolansky, Mary A.; Hardin, Heather K.; Borawski, Elaine A.

    2017-01-01

    This report describes the development and psychometric testing of the Systems Thinking Scale for Adolescent Behavior Change (STS-AB). Following item development, initial assessments of understandability and stability of the STS-AB were conducted in a sample of nine adolescents enrolled in a weight management program. Exploratory factor analysis of the 16-item STS-AB and internal consistency assessments were then done with 359 adolescents enrolled in a weight management program. Test–retest reliability of the STS-AB was .71, p = .03; internal consistency reliability was .87. Factor analysis of the 16-item STS-AB indicated a one-factor solution with good factor loadings, ranging from .40 to .67. Evidence of construct validity was supported by significant correlations with established measures of variables associated with health behavior change. We provide beginning evidence of the reliability and validity of the STS-AB to measure systems thinking for health behavior change in young adolescents. PMID:28303755

  20. Developing a situational judgment test blueprint for assessing the non-cognitive skills of applicants to the University of Utah School of Medicine, the United States

    PubMed Central

    2015-01-01

    Purpose: The situational judgment test (SJT) shows promise for assessing the non-cognitive skills of medical school applicants, but has only been used in Europe. Since the admissions processes and education levels of applicants to medical school are different in the United States and in Europe, it is necessary to obtain validity evidence of the SJT based on a sample of United States applicants. Methods: Ninety SJT items were developed and Kane’s validity framework was used to create a test blueprint. A total of 489 applicants selected for assessment/interview day at the University of Utah School of Medicine during the 2014-2015 admissions cycle completed one of five SJTs, which assessed professionalism, coping with pressure, communication, patient focus, and teamwork. Item difficulty, each item’s discrimination index, internal consistency, and the categorization of items by two experts were used to create the test blueprint. Results: The majority of item scores were within an acceptable range of difficulty, as measured by the difficulty index (0.50-0.85) and had fair to good discrimination. However, internal consistency was low for each domain, and 63% of items appeared to assess multiple domains. The concordance of categorization between the two educational experts ranged from 24% to 76% across the five domains. Conclusion: The results of this study will help medical school admissions departments determine how to begin constructing a SJT. Further testing with a more representative sample is needed to determine if the SJT is a useful assessment tool for measuring the non-cognitive skills of medical school applicants. PMID:26582629

  1. Consulting with children in the development of self-efficacy and recall tools related to nutrition and physical activity.

    PubMed

    Lassetter, Jane H; Ray, Gaye; Driessnack, Martha; Williams, Mary

    2015-01-01

    This article chronicles our efforts to develop an instrument with and for children, complete with insights, multiple iterations, and missteps along the way. The instruments we developed assess children's self-efficacy and recall related to healthy eating and physical activity. Five focus groups were held with 39 children to discuss the evolving instrument. A nine-item self-efficacy instrument and a 10-item recall instrument were developed with Flesch-Kincaid grade levels of 1.8 and 4.0, respectively, which fifth graders can complete in less than 5 min. When assessing children in clinical practice or research, we should use instruments that have been developed with children's feedback and are child-centered. Without that assurance, assessment results can be questionable. © 2014, Wiley Periodicals, Inc.
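
    The Flesch-Kincaid grade level cited above is a fixed formula over word, sentence, and syllable counts. The sketch below applies the standard formula to counts supplied by the caller; the function name is an assumption, and how the authors counted syllables is not stated in the abstract.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw text counts.

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

    Shorter sentences and fewer syllables per word both lower the grade, which is why short items written in simple words can reach the early-elementary levels reported here.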

  2. Clinical diagnostic model for sciatica developed in primary care patients with low back-related leg pain

    PubMed Central

    Konstantinou, Kika; Ogollah, Reuben; Hay, Elaine M.; Dunn, Kate M.

    2018-01-01

    Background: Identification of sciatica may assist timely management but can be challenging in clinical practice. Diagnostic models to identify sciatica have mainly been developed in secondary care settings with conflicting reference standard selection. This study explores the challenges of reference standard selection and aims to ascertain which combination of clinical assessment items best identifies sciatica in people seeking primary healthcare. Methods: Data on 394 low back-related leg pain consulters were analysed. Potential sciatica indicators were seven clinical assessment items. Two reference standards were used: (i) high confidence sciatica clinical diagnosis; (ii) high confidence sciatica clinical diagnosis with confirmatory magnetic resonance imaging findings. Multivariable logistic regression models were produced for both reference standards. A tool predicting sciatica diagnosis in low back-related leg pain was derived. Latent class modelling explored the validity of the reference standard. Results: Model (i) retained five items; model (ii) retained six items. Four items remained in both models: below knee pain, leg pain worse than back pain, positive neural tension tests and neurological deficit. Model (i) was well calibrated (p = 0.18), discrimination was area under the receiver operating characteristic curve (AUC) 0.95 (95% CI 0.93, 0.98). Model (ii) showed good discrimination (AUC 0.82; 0.78, 0.86) but poor calibration (p = 0.004). Bootstrapping revealed minimal overfitting in both models. Agreement between the two latent classes and clinical diagnosis groups defined by model (i) was substantial, and fair for model (ii). Conclusion: Four clinical assessment items were common in both reference standard definitions of sciatica. A simple scoring tool for identifying sciatica was developed. These criteria could be used clinically and in research to improve accuracy of identification of this subgroup of back pain patients. PMID:29621243
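
    The discrimination figures quoted here are AUC values: the probability that a randomly chosen case receives a higher model score than a randomly chosen non-case. A minimal sketch of that calculation is shown below; the function name and the pairwise approach are illustrative assumptions and do not reproduce the study's regression models or scoring tool.

```python
def auc(scores_cases, scores_noncases):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (case, non-case) pairs in which the case has the
    higher score; ties count as half a win.
    """
    if not scores_cases or not scores_noncases:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in scores_cases:
        for n in scores_noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_noncases))
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.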

  3. Practice-Based Measures of Elementary Science Teachers' Content Knowledge for Teaching: Initial Item Development and Validity Evidence. Research Report. ETS RR-17-43

    ERIC Educational Resources Information Center

    Mikeska, Jamie N.; Phelps, Geoffrey; Croft, Andrew J.

    2017-01-01

    This report describes efforts by a group of science teachers, teacher educators, researchers, and content specialists to conceptualize, develop, and pilot practice-based assessment items designed to measure elementary science teachers' content knowledge for teaching (CKT). The report documents the framework used to specify the content-specific…

  4. Adolescent Healthful Foods Inventory: Development of an Instrument to Assess Adolescents' Willingness to Consume Healthful Foods

    ERIC Educational Resources Information Center

    McGuerty, Amber B.; Cater, Melissa; Prinyawiwatkul, Witoon; Tuuri, Georgianna

    2016-01-01

    Interventions to increase adolescents' healthful food and beverage consumption often fail to demonstrate change. An alternative is to measure a shift in willingness to consume these items as an indicator of movement toward change. A survey was developed to estimate willingness to consume a variety of foods and beverages. Twenty items were…

  5. A Scale for Measuring Teachers' Mathematics-Related Beliefs: A Validity and Reliability Study

    ERIC Educational Resources Information Center

    Purnomo,Yoppy Wahyu

    2017-01-01

    The purpose of this study was to develop and validate a scale of teacher beliefs related to mathematics, namely, beliefs about the nature of mathematics, mathematics teaching, and assessment in mathematics learning. A scale development study was used to achieve it. The draft scale consisted of 54 items in which 16 items related to beliefs about…

  6. The development of a quality appraisal tool for studies of diagnostic reliability (QAREL).

    PubMed

    Lucas, Nicholas P; Macaskill, Petra; Irwig, Les; Bogduk, Nikolai

    2010-08-01

    In systematic reviews of the reliability of diagnostic tests, no quality assessment tool has been used consistently. The aim of this study was to develop a specific quality appraisal tool for studies of diagnostic reliability. Key principles for the quality of studies of diagnostic reliability were identified with reference to epidemiologic principles, existing quality appraisal checklists, and the Standards for Reporting of Diagnostic Accuracy (STARD) and Quality Assessment of Diagnostic Accuracy Studies (QUADAS) resources. Specific items that encompassed each of the principles were developed. Experts in diagnostic research provided feedback on the items that were to form the appraisal tool. This process was iterative and continued until consensus among experts was reached. The Quality Appraisal of Reliability Studies (QAREL) checklist includes 11 items that explore seven principles. Items cover the spectrum of subjects, spectrum of examiners, examiner blinding, order effects of examination, suitability of the time interval among repeated measurements, appropriate test application and interpretation, and appropriate statistical analysis. QAREL has been developed as a specific quality appraisal tool for studies of diagnostic reliability. The reliability of this tool in different contexts needs to be evaluated. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  7. Developing an item bank to measure the coping strategies of people with hereditary retinal diseases.

    PubMed

    Prem Senthil, Mallika; Khadka, Jyoti; De Roach, John; Lamey, Tina; McLaren, Terri; Campbell, Isabella; Fenwick, Eva K; Lamoureux, Ecosse L; Pesudovs, Konrad

    2018-05-05

    Our understanding of the coping strategies used by people with visual impairment to manage stress related to visual loss is limited. This study aims to develop a sophisticated coping instrument in the form of an item bank implemented via computerised adaptive testing (CAT) for hereditary retinal diseases. Items on coping were extracted from qualitative interviews with patients which were supplemented by items from a literature review. A systematic multi-stage process of item refinement was carried out followed by expert panel discussion and cognitive interviews. The final coping item bank had 30 items. Rasch analysis was used to assess the psychometric properties. A CAT simulation was carried out to estimate an average number of items required to gain precise measurement of hereditary retinal disease-related coping. One hundred eighty-nine participants answered the coping item bank (median age = 58 years). The coping scale demonstrated good precision and targeting. The standardised residual loadings for items revealed six items grouped together. Removal of the six items reduced the precision of the main coping scale and worsened the variance explained by the measure. Therefore, the six items were retained within the main scale. Our CAT simulation indicated that, on average, fewer than 10 items are required to gain a precise measurement of coping. This is the first study to develop a psychometrically robust coping instrument for hereditary retinal diseases. CAT simulation indicated that, on average, only four and nine items were required to gain measurement at moderate and high precision, respectively.

  8. Characteristics and clinical relevance of postgastrectomy syndrome assessment scale (PGSAS)-45: newly developed integrated questionnaires for assessment of living status and quality of life in postgastrectomy patients.

    PubMed

    Nakada, Koji; Ikeda, Masami; Takahashi, Masazumi; Kinami, Shinichi; Yoshida, Masashi; Uenosono, Yoshikazu; Kawashima, Yoshiyuki; Oshio, Atsushi; Suzukamo, Yoshimi; Terashima, Masanori; Kodera, Yasuhiro

    2015-01-01

    Lack of a suitable instrument to comprehensively assess symptoms, living status, and quality of life in postgastrectomy patients prompted the authors to develop postgastrectomy syndrome assessment scale (PGSAS)-45. PGSAS-45 consists of 45 items in total: 8 items from SF-8, 15 items from GSRS, and an additional 22 items selected by 47 gastric surgeons. Using the PGSAS-45, a multi-institutional survey was conducted to determine the prevalence of postgastrectomy syndrome and its impact on everyday life among patients who underwent various types of gastrectomy. Eligible data were obtained from 2,368 patients operated and followed at 52 institutions in Japan. Of these, data from 1,777 patients were used in the current study in which symptom subscales of the PGSAS-45 were determined. We also considered the characteristics of the postgastrectomy syndrome and to what extent these symptoms influence patients' living status and quality of life (QOL). By factor analysis, 23 symptom-related items of PGSAS-45 were successfully clustered into seven symptom subscales that represent esophageal reflux, abdominal pain, meal-related distress, indigestion, diarrhea, constipation, and dumping. These seven symptom subscales and two other subscales measuring quality of ingestion and dissatisfaction for daily life, respectively, had good internal consistency in terms of Cronbach's α (0.65-0.88). PGSAS-45 provides a valid and reliable integrated index for evaluation of symptoms, living status, and QOL in gastrectomized patients.

  9. Cigarette dependence questionnaire: development and psychometric testing with male smokers.

    PubMed

    Huang, Chih-Ling; Lin, Hsi-Hui; Wang, Hsiu-Hung

    2010-10-01

    This paper is a report of a study conducted to develop and test a theoretically derived Cigarette Dependence Questionnaire for adult male smokers. Fagerstrom questionnaires have been used worldwide to assess cigarette dependence. However, these assessments lack any theoretical perspective. A theory-based approach is needed to ensure valid assessment. In 2007, an initial pool of 103 Cigarette Dependence Questionnaire items was distributed to 109 adult smokers in Taiwan. Item analysis was conducted to select items for inclusion in the refined scale. The psychometric properties of the Cigarette Dependence Questionnaire were further evaluated 2007-08, when it was administered to 256 respondents and their saliva was collected and analysed for cotinine levels. Criterion validity was established through the Pearson correlation between the scale and saliva cotinine levels. Exploratory factor analysis was used to test construct validity. Reliability was determined with Cronbach's alpha coefficient and a 2-week test-retest coefficient. The selection of 30 items for seven perspectives was based on item analysis. One factor accounting for 44.9% of the variance emerged from the factor analysis. The factor was named as cigarette dependence. Cigarette Dependence Questionnaire scores were statistically significantly correlated with saliva cotinine levels (r = 0.21, P = 0.01). Cronbach's alpha was 0.95 and test-retest reliability using an intra-class correlation was 0.92. The Cigarette Dependence Questionnaire showed sound reliability and validity and could be used by nurses to set up smoking cessation interventions based on assessment of cigarette dependence. © 2010 Blackwell Publishing Ltd.

  10. A Spanish-Language Risk Perception Survey for Developing Diabetes: Translation Process and Assessment of Psychometric Properties.

    PubMed

    Joiner, Kevin L; Sternberg, Rosa Maria; Kennedy, Christine; Chen, Jyu-Lin; Fukuoka, Yoshimi; Janson, Susan L

    2016-12-01

    Create a Spanish-language version of the Risk Perception Survey for Developing Diabetes (RPS-DD) and assess psychometric properties. The Spanish-language version was created through translation, harmonization, and presentation to the tool's original author. It was field tested in a foreign-born Latino sample and properties evaluated in principal components analysis. Personal Control, Optimistic Bias, and Worry multi-item Likert subscale responses did not cluster together. A clean solution was obtained after removing two Personal Control subscale items. Neither the Personal Disease Risk scale nor the Environmental Health Risk scale responses loaded onto single factors. Reliabilities ranged from .54 to .88. Test of knowledge performance varied by item. This study contributes to evidence of validation of a Spanish-language RPS-DD in foreign-born Latinos.

  11. Analysis of Item-Level Bias in the Bayley-III Language Subscales: The Validity and Utility of Standardized Language Assessment in a Multilingual Setting.

    PubMed

    Goh, Shaun K Y; Tham, Elaine K H; Magiati, Iliana; Sim, Litwee; Sanmugam, Shamini; Qiu, Anqi; Daniel, Mary L; Broekman, Birit F P; Rifkin-Graboi, Anne

    2017-09-18

    The purpose of this study was to improve standardized language assessments among bilingual toddlers by investigating and removing the effects of bias due to unfamiliarity with cultural norms or a distributed language system. The Expressive and Receptive Bayley-III language scales were adapted for use in a multilingual country (Singapore). Differential item functioning (DIF) was applied to data from 459 two-year-olds without atypical language development. This involved investigating if the probability of success on each item varied according to language exposure while holding latent language ability, gender, and socioeconomic status constant. Associations with language, behavioral, and emotional problems were also examined. Five of 16 items showed DIF, 1 of which may be attributed to cultural bias and another to a distributed language system. The remaining 3 items favored toddlers with higher bilingual exposure. Removal of DIF items reduced associations between language scales and emotional and language problems, but improved the validity of the expressive scale from poor to good. Our findings indicate the importance of considering cultural and distributed language bias in standardized language assessments. We discuss possible mechanisms influencing performance on items favoring bilingual exposure, including the potential role of inhibitory processing.

  12. The Chinese version of Instrument of Professional Attitude for Student Nurses (IPASN): Assessment of reliability and validity.

    PubMed

    Xiao, Yu-Ying; Li, Ting; Xiao, Lin; Wang, Su-Wei; Wang, Si-Qi; Wang, Han-Xiao; Wang, Bei-Bei; Gao, Yu-Lin

    2017-02-01

    Professional attitude is of great importance for nursing talents in the modern society. To develop an effective educational program for student nurses in China, an appropriate instrument is required for the assessment of their professional attitude. To assess the validity and reliability of the Instrument of Professional Attitude for Student Nurses (IPASN) in its Chinese version. The original version of IPASN was translated through the Brislin model (translation, back translation, cultural adaptation and pilot study) with the authorization from the developer. A total of 681 nursing students were chosen by stratified convenience sampling to assess construct validity using exploratory factor analysis (EFA). In addition, item analysis, Cronbach's alpha coefficients, and test-retest reliability were used to test the psychometric properties in this part. A total of 204 nursing undergraduate trainees were selected by cluster convenience sampling at a separate time to confirm the structure using confirmatory factor analysis (CFA). Corrected item-total correlations and alpha-if-item-deleted values were between 0.33 and 0.69 and between 0.906 and 0.913, respectively, indicating no item should be deleted. Cronbach's alpha was 0.91 for the total scale and ranged from 0.67 to 0.89 for the subscales. Test-retest reliability estimated from the intraclass correlation coefficient (ICC) was 0.74 (P<0.05). Differences in item scores between the high-score group (the first 27%) and low-score group (the last 27%) were significant (P<0.001), indicating that the item discrimination ability was good. Seven subscales (contribution to increase of scientific information load, autonomy, community service, continuous education, to promote professional development, cooperation and theory guiding practice) were identified in EFA and confirmed in CFA, and explained 65.5% of the total variance. It indicated that the Chinese version of IPASN was valid and reliable for the evaluation of nursing students' professional attitude. Copyright © 2016 Elsevier Ltd. All rights reserved.
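
    The extreme-groups check mentioned above (comparing item scores between the top and bottom 27% of respondents by total score) can be sketched as a simple mean difference per item. The function name, the rounding of the group size, and the data layout are assumptions; the abstract reports a significance test on this difference, which the sketch does not perform.

```python
def extreme_group_difference(scores, item, fraction=0.27):
    """Mean item score in the top group minus the bottom group.

    scores: one list per respondent of numeric item scores (assumed layout).
    item: index of the item to examine.
    fraction: share of respondents in each extreme group (27% by default).
    """
    ranked = sorted(scores, key=sum)              # ascending by total score
    n = max(1, round(len(ranked) * fraction))     # size of each extreme group
    low, high = ranked[:n], ranked[-n:]
    mean_high = sum(r[item] for r in high) / n
    mean_low = sum(r[item] for r in low) / n
    return mean_high - mean_low
```

    A difference near zero suggests that the item does not separate high scorers from low scorers.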

  13. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    PubMed

    Feinberg, Richard A; Clauser, Amanda L

    2016-10-01

    In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation.

  14. Development of an instrument to measure the quality of documented nursing diagnoses, interventions and outcomes: the Q-DIO.

    PubMed

    Müller-Staub, Maria; Lunney, Margaret; Odenbreit, Matthias; Needham, Ian; Lavin, Mary Ann; van Achterberg, Theo

    2009-04-01

    This paper aims to report the development stages of an audit instrument to assess standardised nursing language. Because research-based instruments were not available, the instrument Quality of documentation of nursing Diagnoses, Interventions and Outcomes (Q-DIO) was developed. Standardised nursing language such as nursing diagnoses, interventions and outcomes are being implemented worldwide and will be crucial for the electronic health record. The literature showed a lack of audit instruments to assess the quality of standardised nursing language in nursing documentation. A qualitative design was used for instrument development. Criteria were first derived from a theoretical framework and literature reviews. Second, the criteria were operationalized into items and eight experts assessed face and content validity of the Q-DIO. Criteria were developed and operationalized into 29 items. For each item, a three or five point scale was applied. The experts supported content validity and showed 88.25% agreement for the scores assigned to the 29 items of the Q-DIO. The Q-DIO provides a literature-based audit instrument for nursing documentation. The strength of Q-DIO is its ability to measure the quality of nursing diagnoses and related interventions and nursing-sensitive patient outcomes. Further testing of Q-DIO is recommended. Based on the results of this study, the Q-DIO provides an audit instrument to be used in clinical practice. Its criteria can set the stage for the electronic nursing documentation in electronic health records.

  15. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency.

    PubMed

    Rose, Matthias; Bjorner, Jakob B; Gandek, Barbara; Bruce, Bonnie; Fries, James F; Ware, John E

    2014-05-01

    To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n>2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD]=10) in a US general population sample. The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living. In simulations, a 10-item computerized adaptive test (CAT) eliminated floor and decreased ceiling effects, achieving higher measurement precision than any comparable length static tool across four SDs of the measurement range. Improved psychometric properties were transferred to the CAT's superior ability to identify differences between age and disease groups. The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range. Copyright © 2014. Published by Elsevier Inc.
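
    Two calculations named in this abstract lend themselves to a short sketch: category probabilities under a graded response model, and rescaling a latent score to a metric with mean 50 and SD 10. The code below uses the standard logistic form of the graded response model; the function names, parameter values, and the assumption that the reference population has mean 0 and SD 1 on the latent scale are illustrative, not taken from the PROMIS calibration.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta: latent trait value.
    a: item discrimination (slope).
    thresholds: increasing list of category boundary parameters b_1..b_m.
    Returns m + 1 probabilities, one per ordered response category.
    """
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def to_t_score(theta, ref_mean=0.0, ref_sd=1.0):
    """Rescale a latent score so the reference population has mean 50, SD 10."""
    return 50.0 + 10.0 * (theta - ref_mean) / ref_sd
```

    Under this rescaling, a respondent one reference SD above the population mean receives a score of 60.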

  16. Development and initial validation of the appropriate antibiotic use self-efficacy scale.

    PubMed

    Hill, Erin M; Watkins, Kaitlin

    2018-06-04

    While there are various medication self-efficacy scales that exist, none assess self-efficacy for appropriate antibiotic use. The Appropriate Antibiotic Use Self-Efficacy Scale (AAUSES) was developed, pilot tested, and its psychometric properties were examined. Following pilot testing of the scale, a 28-item questionnaire was examined using a sample (n = 289) recruited through the Amazon Mechanical Turk platform. Participants also completed other scales and items, which were used in assessing discriminant, convergent, and criterion-related validity. Test-retest reliability was also examined. After examining the scale and removing items that did not assess appropriate antibiotic use, an exploratory factor analysis was conducted on 13 items from the original scale. Three factors were retained that explained 65.51% of the variance. The scale and its subscales had adequate internal consistency. The scale had excellent test-retest reliability, as well as demonstrated convergent, discriminant, and criterion-related validity. The AAUSES is a valid and reliable scale that assesses three domains of appropriate antibiotic use self-efficacy. The AAUSES may have utility in clinical and research settings in understanding individuals' beliefs about appropriate antibiotic use and related behavioral correlates. Future research is needed to examine the scale's utility in these settings. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. Development of a tool to assess adherence to a model of the division of responsibility in feeding young children: using response mapping to capacitate validation measures.

    PubMed

    Lohse, Barbara; Satter, Ellyn; Arnold, Kristen

    2014-04-01

    Accurate early assessment and targeted intervention with problematic parent/child feeding dynamics is critical for the prevention and treatment of child obesity. The division of responsibility in feeding (sDOR), articulated by the Satter Feeding Dynamics Model (fdSatter), has been demonstrated clinically as an effective approach to reduce child feeding problems, including those leading to obesity. Lack of a tested instrument to examine adherence to fdSatter stimulated initial construction of the Satter Feeding Dynamics Inventory (fdSI). The aim of this project was to refine the item pool to establish translational validity, making the fdSI suitable for advanced psychometric analysis. Cognitive interviews (n = 80) with caregivers of varied socioeconomic strata informed revisions that demonstrated face and content validity. fdSI responses were mapped to interviews using an iterative, multi-phase thematic approach to provide an instrument ready for construct validation. fdSI development required five interview phases over 32 months: Foundational; Refinement; Transitional; Assurance; and Launching. Each phase was associated with item reduction and revision. Thirteen items were removed from the 38-item Foundational phase and seven were revised in the Refinement phase. Revisions, deletions, and additions prompted by Transitional and Assurance phase interviews resulted in the 15-item Launching phase fdSI. Only one Foundational phase item was carried through all development phases, emphasizing the need to test for item comprehension and interpretation before psychometric analyses. Psychometric studies of item pools without encrypted meanings will facilitate progress toward a tool that accurately detects adherence to sDOR. Ability to measure sDOR will facilitate focus on feeding behaviors associated with reduced risk of childhood obesity.

  18. Scales for assessing self-efficacy of nurses and assistants for preventing falls

    PubMed Central

    Dykes, Patricia C.; Carroll, Diane; McColgan, Kerry; Hurley, Ann C.; Lipsitz, Stuart R.; Colombo, Lisa; Zuyev, Lyubov; Middleton, Blackford

    2011-01-01

    Aim: This paper is a report of the development and testing of the Self-Efficacy for Preventing Falls Nurse and Assistant scales. Background: Patient falls and fall-related injuries are traumatic ordeals for patients, family members and providers, and carry a toll for hospitals. Self-efficacy is an important factor in determining actions persons take and levels of performance they achieve. Performance of individual caregivers is linked to the overall performance of hospitals. Scales to assess nurses and certified nursing assistants’ self-efficacy to prevent patients from falling would allow for targeting resources to increase SE, resulting in improved individual performance and ultimately decreased numbers of patient falls. Method: Four phases of instrument development were carried out to (1) generate individual items from eight focus groups (four each nurse and assistant conducted in October 2007), (2) develop prototype scales, (3) determine content validity during a second series of four nurse and assistant focus groups (January 2008) and (4) conduct item analysis, paired t-tests, Student’s t-tests and internal consistency reliability to refine and confirm the scales. Data were collected during February–December, 2008. Results: The 11-item Self-Efficacy for Preventing Falls Nurse had an alpha of 0·89 with all items in the range criterion of 0·3–0·7 for item total correlation. The 8-item Self-Efficacy for Preventing Falls Assistant had an alpha of 0·74 and all items had item total correlations in the 0·3–0·7 range. Conclusions: The Self-Efficacy for Preventing Falls Nurse and Self-Efficacy for Preventing Falls Assistant scales demonstrated psychometric adequacy and are recommended to measure bedside staff’s self-efficacy beliefs in preventing patient falls. PMID:21073506
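
    The two statistics reported in this record, Cronbach's alpha and item-total correlation, can be illustrated with a minimal sketch. The function names and the response data below are invented for illustration and are not taken from the study. The sketch uses the corrected form of the item-total correlation (each item against the sum of the remaining items), which is an assumption, since the abstract does not say which variant was used.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])                       # number of items
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def corrected_item_total(rows):
    """Correlate each item with the sum of the other items."""
    k = len(rows[0])
    return [
        pearson([r[i] for r in rows], [sum(r) - r[i] for r in rows])
        for i in range(k)
    ]

# Invented responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 3],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(responses), 2))
print([round(r, 2) for r in corrected_item_total(responses)])
```

    With real data, items whose correlation falls outside a chosen band (0·3–0·7 in this record) would be flagged for review.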

  19. Examining construct validity of a new naturalistic observational assessment of hand skills for preschool- and school-age children.

    PubMed

    Chien, Chi-Wen; Brown, Ted; McDonald, Rachael

    2012-04-01

    The Assessment of Children's Hand Skills is a new assessment that utilises a naturalistic observational method to capture children's real-life hand skill performance when engaged at various types of daily activities in everyday living contexts. The Assessment of Children's Hand Skills is designed for use with 2- to 12-year-old children with a range of disabilities or health conditions. The study aimed to investigate construct validity of the Assessment of Children's Hand Skills in Australian children. Rasch analysis was used to examine internal construct validity of the Assessment of Children's Hand Skills in a mixed sample of 53 children with disabilities (including autism spectrum disorder, developmental/genetic disorders and physical disabilities) and 85 typically developing children. External construct validity was examined by correlating with three questionnaires evaluating daily living skills and hand skills. Rasch goodness-of-fit analysis suggested that all 22 activity items and 19 of 20 hand skill items in the Assessment of Children's Hand Skills measured a single construct. The Assessment of Children's Hand Skills items were placed in a clinically meaningful hierarchy from easy to hard, and the difficulty range of the items also matched the majority of children with disabilities and typically developing preschool-aged children. Moderate to high correlations (0.59 ≤ Spearman's ρ coefficients ≤ 0.89, P < 0.01) were found with the assessments of daily living and fine motor skills. This study provided preliminary evidence supporting the construct validity of the Assessment of Children's Hand Skills for its clinical application in assessing children's real-life hand skill performance in Australian contexts. © 2012 The Authors Australian Occupational Therapy Journal © 2012 Occupational Therapy Australia.

  20. Development and validation of the Chinese Attitudes to Starting Insulin Questionnaire (Ch-ASIQ) for primary care patients with type 2 diabetes.

    PubMed

    Fu, Sau Nga; Chin, Weng Yee; Wong, Carlos King Ho; Yeung, Vincent Tok Fai; Yiu, Ming Pong; Tsui, Hoi Yee; Chan, Ka Hung

    2013-01-01

    To develop and evaluate the psychometric properties of a Chinese questionnaire which assesses the barriers and enablers to commencing insulin in primary care patients with poorly controlled Type 2 diabetes. Questionnaire items were identified using literature review. Content validation was performed and items were further refined using an expert panel. Following translation, back translation and cognitive debriefing, the translated Chinese questionnaire was piloted on target patients. Exploratory factor analysis and item-scale correlations were performed to test the construct validity of the subscales and items. Internal reliability was tested by Cronbach's alpha. Twenty-seven identified items underwent content validation, translation and cognitive debriefing. The translated questionnaire was piloted on 303 insulin naïve (never taken insulin) Type 2 diabetes patients recruited from 10 government-funded primary care clinics across Hong Kong. Sufficient variability in the dataset for factor analysis was confirmed by Bartlett's Test of Sphericity (P<0.001). Using exploratory factor analysis with varimax rotation, 10 factors were generated onto which 26 items loaded with loading scores > 0.4 and Eigenvalues >1. Total variance for the 10 factors was 66.22%. Kaiser-Meyer-Olkin measure was 0.725. Cronbach's alpha coefficients for the first four factors were ≥0.6 identifying four sub-scales to which 13 items correlated. Remaining sub-scales and items with poor internal reliability were deleted. The final 13-item instrument had a four-scale structure addressing: 'Self-image and stigmatization'; 'Factors promoting self-efficacy'; 'Fear of pain or needles'; and 'Time and family support'. The Chinese Attitudes to Starting Insulin Questionnaire (Ch-ASIQ) appears to be a reliable and valid measure for assessing barriers to starting insulin. This short instrument is easy to administer and may be used by healthcare providers and researchers as an assessment tool for Chinese diabetic primary care patients, including the elderly, who are unwilling to start insulin.

  1. Development and validation of a fine-motor assessment tool for use with young children in a Chinese population.

    PubMed

    Siu, Andrew M H; Lai, Cynthia Y Y; Chiu, Amy S M; Yip, Calvin C K

    2011-01-01

    Most of the fine-motor assessment tools used in Hong Kong have been designed in Western countries, so there is a need to develop a standardized assessment which is relevant to the culture and daily living tasks of the local (that is, Chinese) population. This study aimed to (1) develop a fine-motor assessment tool (the Hong Kong Preschool Fine-Motor Developmental Assessment [HK-PFMDA]) for use with young children in a Chinese population and (2) examine the HK-PFMDA's psychometric properties. The HK-PFMDA was developed by a group of occupational therapists specializing in the area of developmental disabilities in Hong Kong. A panel of 21 experts reviewed the content validity of the instrument. Rasch item analysis was used to examine the model fit of items against the rating scale model, and to explore the dimensionality of the test. Intra- and interrater reliability, convergent validity, and criterion-related validity were examined. The participants included 783 children without disabilities, 45 with autistic spectrum disorder, and 35 with developmental delay. The Rasch analysis suggested that the 87-item HK-PFMDA had a unidimensional structure, as the items explained most (91.6%) of the variance. The HK-PFMDA demonstrated excellent intra- (ICC = .99) and interrater reliability (ICC = .99), and internal consistency (α ranging from .83 to .92). In terms of validity, the HK-PFMDA had significant positive correlations with both age and the convergent measures of the Peabody Developmental Motor Scales (PDMS-2). A set of normative data for local children aged from birth to 6 years was established. The HK-PFMDA has shown excellent psychometric properties and is suitable for clinical application by occupational therapists in the assessment of fine-motor skills development of young children in Chinese populations. Copyright © 2010 Elsevier Ltd. All rights reserved.

  2. Development and psychometric testing of the Carter Assessment of Critical Thinking in Midwifery (Preceptor/Mentor version).

    PubMed

    Carter, Amanda G; Creedy, Debra K; Sidebotham, Mary

    2016-03-01

    To develop and test a tool designed for use by preceptors/mentors to assess undergraduate midwifery students' critical thinking in practice. A descriptive cohort design was used. Participants worked in a range of maternity settings in Queensland, Australia. The participants were 106 midwifery clinicians who had acted in the role of preceptor for undergraduate midwifery students. This study followed a staged model for tool development recommended by DeVellis (2012). This included generation of items, content validity testing through mapping of draft items to critical thinking concepts and expert review, administration of items to a convenience sample of preceptors, and psychometric testing. A 24-item tool titled the Carter Assessment of Critical Thinking in Midwifery (CACTiM) was completed by registered midwives in relation to students they had recently preceptored in the clinical environment. Ratings by experts revealed a content validity index score of 0.97, representing good content validity. An evaluation of construct validity through factor analysis generated three factors: 'partnership in practice', 'reflection on practice' and 'practice improvements'. The scale demonstrated good internal reliability with a Cronbach alpha coefficient of 0.97. The mean total score for the CACTiM scale was 116.77 (SD=16.68) with a range of 60-144. Total and subscale scores correlated significantly. The CACTiM (Preceptor/Mentor version) was found to be a valid and reliable tool for use by preceptors to assess critical thinking in undergraduate midwifery students. Given the importance of critical thinking skills for midwifery practice, mapping and assessing critical thinking development in students' practice across an undergraduate programme is vital. The CACTiM (Preceptor/Mentor version) has utility for clinical education, research and practice. The tool can inform and guide preceptors' assessment of students' critical thinking in practice. The availability of a reliable and valid tool can be used to research the development of critical thinking in practice. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
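
    A content validity index like the one reported in this record is commonly computed from expert relevance ratings. The sketch below assumes one widespread convention: each expert rates each item on a 4-point relevance scale, an item's index is the share of experts rating it 3 or 4, and the scale-level index is the average across items. The abstract does not state which convention the authors used, and the function names and ratings here are invented for illustration.

```python
def item_cvi(ratings, relevant=(3, 4)):
    """Share of experts who rated the item as relevant (3 or 4 on a 1-4 scale)."""
    return sum(1 for r in ratings if r in relevant) / len(ratings)

def scale_cvi_average(rating_matrix):
    """Average of the item-level indices; one inner list of expert ratings per item."""
    item_scores = [item_cvi(item) for item in rating_matrix]
    return sum(item_scores) / len(item_scores)

# Invented ratings: 3 items, each rated by 4 experts.
ratings = [
    [4, 4, 3, 4],
    [4, 3, 2, 4],
    [3, 4, 4, 4],
]
print([item_cvi(item) for item in ratings])
print(scale_cvi_average(ratings))
```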

  3. The development of the PARENTS: a tool for parents to assess residents' non-technical skills in pediatric emergency departments.

    PubMed

    Moreau, Katherine A; Eady, Kaylee; Tang, Kenneth; Jabbour, Mona; Frank, Jason R; Campbell, Meaghan; Hamstra, Stanley J

    2017-11-14

    Parents can assess residents' non-technical skills (NTS) in pediatric emergency departments (EDs). There are no assessment tools, with validity evidence, for parental use in pediatric EDs. The purpose of this study was to develop the Parents' Assessment of Residents Enacting Non-Technical Skills (PARENTS) educational assessment tool and collect three sources of validity evidence (i.e., content, response process, internal structure) for it. We established content evidence for the PARENTS through interviews with physician-educators and residents, focus groups with parents, a literature review, and a modified nominal group technique with experts. We collected response process evidence through cognitive interviews with parents. To examine the internal structure evidence, we administered the PARENTS and performed exploratory factor analysis. Initially, a 20-item PARENTS was developed. Cognitive interviews led to the removal of one closed-ended item, the addition of resident photographs, and wording/formatting changes. Thirty-seven residents and 434 parents participated in the administration of the resulting 19-item PARENTS. Following factor analysis, a one-factor model prevailed. The study presents initial validity evidence for the PARENTS. It also highlights strategies for potentially: (a) involving parents in the assessment of residents, (b) improving the assessment of NTS in pediatric EDs, and (c) capturing parents' perspectives to improve the preparation of future physicians.

  4. Development of a Preschool Developmental Assessment Scale for Assessment of Developmental Disabilities

    ERIC Educational Resources Information Center

    Leung, Cynthia; Mak, Rose; Lau, Vanessa; Cheung, Jasmine; Lam, Catherine

    2010-01-01

    The aim of this paper was to describe the development of the cognitive domain of the Preschool Developmental Assessment Scale (PDAS) for assessment of preschool children with developmental disabilities. The initial version of the cognitive domain consisted of 87 items. They were administered to 324 preschool children, including 240 children from…

  5. Serial Assessment of Trauma Care Capacity in Ghana in 2004 and 2014.

    PubMed

    Stewart, Barclay T; Quansah, Robert; Gyedu, Adam; Boakye, Godfred; Abantanga, Francis; Ankomah, James; Donkor, Peter; Mock, Charles

    2016-02-01

    Trauma care capacity assessments in developing countries have generated evidence to support advocacy, detailed baseline capabilities, and informed targeted interventions. However, serial assessments to determine the effect of capacity improvements or changes over time have rarely been performed. To compare the availability of trauma care resources in Ghana between 2004 and 2014 to assess the effects of a decade of change in the trauma care landscape and derive recommendations for improvements. Capacity assessments were performed using direct inspection and structured interviews derived from the World Health Organization's Guidelines for Essential Trauma Care. In Ghana, 10 hospitals in 2004 and 32 hospitals in 2014 were purposively sampled to represent those most likely to care for injuries. Clinical staff, administrators, logistic/procurement officers, and technicians/biomedical engineers who interacted, directly or indirectly, with trauma care resources were interviewed at each hospital. Availability of items for trauma care was rated from 0 (complete absence) to 3 (fully available). Factors contributing to deficiency in 2014 were determined for items rated lower than 3. Each item rated lower than 3 at a specific hospital was defined as a hospital-item deficiency. Scores for total number of hospital-item deficiencies were derived for each contributing factor. There were significant improvements in mean ratings for trauma care resources: district-level (smaller) hospitals had a mean rating of 0.8 for all items in 2004 vs 1.3 in 2014 (P = .002); regional (larger) hospitals had a mean rating of 1.1 in 2004 vs 1.4 in 2014 (P = .01). However, a number of critical deficiencies remain (eg, chest tubes, diagnostics, and orthopedic and neurosurgical care; mean ratings ≤ 2). Leading contributing factors were item absence (503 hospital-item deficiencies), lack of training (335 hospital-item deficiencies), and stockout of consumables (137 hospital-item deficiencies). There has been significant improvement in trauma care capacity during the past decade in Ghana; however, critical deficiencies remain and require urgent redress to avert preventable death and disability. Serial capacity assessment is a valuable tool for monitoring efforts to strengthen trauma care systems, identifying what has been successful, and highlighting needs.

  6. Assessing Children's Homework Performance: Development of Multi-Dimensional, Multi-Informant Rating Scales.

    PubMed

    Power, Thomas J; Dombrowski, Stefan C; Watkins, Marley W; Mautone, Jennifer A; Eagle, John W

    2007-06-01

    Efforts to develop interventions to improve homework performance have been impeded by limitations in the measurement of homework performance. This study was conducted to develop rating scales for assessing homework performance among students in elementary and middle school. Items on the scales were intended to assess student strengths as well as deficits in homework performance. The sample included 163 students attending two school districts in the Northeast. Parents completed the 36-item Homework Performance Questionnaire - Parent Scale (HPQ-PS). Teachers completed the 22-item teacher scale (HPQ-TS) for each student for whom the HPQ-PS had been completed. A common factor analysis with principal axis extraction and promax rotation was used to analyze the findings. The results of the factor analysis of the HPQ-PS revealed three salient and meaningful factors: student task orientation/efficiency, student competence, and teacher support. The factor analysis of the HPQ-TS uncovered two salient and substantive factors: student responsibility and student competence. The findings of this study suggest that the HPQ is a promising set of measures for assessing student homework functioning and contextual factors that may influence performance. Directions for future research are presented.

  7. Assessing Children’s Homework Performance: Development of Multi-Dimensional, Multi-Informant Rating Scales

    PubMed Central

    Power, Thomas J.; Dombrowski, Stefan C.; Watkins, Marley W.; Mautone, Jennifer A.; Eagle, John W.

    2007-01-01

    Efforts to develop interventions to improve homework performance have been impeded by limitations in the measurement of homework performance. This study was conducted to develop rating scales for assessing homework performance among students in elementary and middle school. Items on the scales were intended to assess student strengths as well as deficits in homework performance. The sample included 163 students attending two school districts in the Northeast. Parents completed the 36-item Homework Performance Questionnaire – Parent Scale (HPQ-PS). Teachers completed the 22-item teacher scale (HPQ-TS) for each student for whom the HPQ-PS had been completed. A common factor analysis with principal axis extraction and promax rotation was used to analyze the findings. The results of the factor analysis of the HPQ-PS revealed three salient and meaningful factors: student task orientation/efficiency, student competence, and teacher support. The factor analysis of the HPQ-TS uncovered two salient and substantive factors: student responsibility and student competence. The findings of this study suggest that the HPQ is a promising set of measures for assessing student homework functioning and contextual factors that may influence performance. Directions for future research are presented. PMID:18516211

  8. Development and validation of green eating behaviors, stage of change, decisional balance, and self-efficacy scales in college students.

    PubMed

    Weller, Kathryn E; Greene, Geoffrey W; Redding, Colleen A; Paiva, Andrea L; Lofgren, Ingrid; Nash, Jessica T; Kobayashi, Hisanori

    2014-01-01

    To develop and validate an instrument to assess environmentally conscious eating (Green Eating [GE]) behavior (BEH) and GE Transtheoretical Model constructs including Stage of Change (SOC), Decisional Balance (DB), and Self-efficacy (SE). Cross-sectional instrument development survey. Convenience sample (n = 954) of 18- to 24-year-old college students from a northeastern university. The sample was randomly split: (N1) and (N2). N1 was used for exploratory factor analyses using principal components analyses; N2 was used for confirmatory analyses (structural modeling) and reliability analyses (coefficient α). The full sample was used for measurement invariance (multi-group confirmatory analyses) and convergent validity (BEH) and known group validation (DB and SE) by SOC using analysis of variance. Reliable (α > .7), psychometrically sound, and stable measures included 2 correlated 5-item DB subscales (Pros and Cons), 2 correlated SE subscales (school [5 items] and home [3 items]), and a single 6-item BEH scale. Most students (66%) were in Precontemplation and Contemplation SOC. Behavior, DB, and SE scales differed significantly by SOC (P < .001) with moderate to large effect sizes, as predicted by the Transtheoretical Model, which supported the validity of these measures. Successful development and preliminary validation of this 25-item GE instrument provides a basis for assessment as well as development of tailored interventions for college students. Copyright © 2014 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
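
    The random split described in this record (one subsample for exploratory analysis, the other for confirmatory analysis) can be sketched as follows. The equal-halves split, the fixed seed, and the function name are assumptions for illustration, since the abstract reports only that the sample of 954 was randomly split into N1 and N2.

```python
import random

def split_sample(ids, seed=0):
    """Shuffle respondent ids reproducibly and cut them into two halves."""
    rng = random.Random(seed)      # local generator, so global state is untouched
    shuffled = list(ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical respondent ids 0..953 standing in for the 954 students.
n1, n2 = split_sample(range(954))
print(len(n1), len(n2))
```

    Keeping the two subsamples disjoint is what lets the confirmatory model be tested on data that played no part in the exploratory step.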

  9. Development of the competency scale for primary care managers in Thailand: Scale development.

    PubMed

    Kitreerawutiwong, Keerati; Sriruecha, Chanaphol; Laohasiriwong, Wongsa

    2015-12-09

    The complexity of the primary care system requires a competent manager to achieve high-quality healthcare. The existing literature in the field yields little evidence of the tools to assess the competency of primary care administrators. This study aimed to develop and examine the psychometric properties of the competency scale for primary care managers in Thailand. The scale was developed using in-depth interviews and focus group discussions among policy makers, managers, practitioners, village health volunteers, and clients. The specific dimensions were extracted from 35 participants. 123 items were generated from the evidence and qualitative data. Content validity was established through the evaluation of seven experts and the original 123 items were reduced to 84 items. The pilot testing was conducted on a simple random sample of 487 primary care managers. Item analysis, reliability testing, and exploratory factor analysis were applied to establish the scale's reliability and construct validity. Exploratory factor analysis identified nine dimensions with 48 items using a five-point Likert scale. Each dimension accounted for greater than 58.61% of the total variance. The scale had strong content validity (Indices = 0.85). Cronbach's alpha for each dimension ranged from 0.70 to 0.88. Based on these analyses, this instrument demonstrated sound psychometric properties and therefore is considered an effective tool for assessment of the primary care manager competencies. The results can be used to improve competency requirements of primary care managers, with implications for health service management workforce development.

  10. The Long-Term Conditions Questionnaire: conceptual framework and item development

    PubMed Central

    Peters, Michele; Potter, Caroline M; Kelly, Laura; Hunter, Cheryl; Gibbons, Elizabeth; Jenkinson, Crispin; Coulter, Angela; Forder, Julien; Towers, Ann-Marie; A’Court, Christine; Fitzpatrick, Ray

    2016-01-01

    Purpose: To identify the main issues of importance when living with long-term conditions to refine a conceptual framework for informing the item development of a patient-reported outcome measure for long-term conditions. Materials and methods: Semi-structured qualitative interviews (n=48) were conducted with people living with at least one long-term condition. Participants were recruited through primary care. The interviews were transcribed verbatim and analyzed by thematic analysis. The analysis served to refine the conceptual framework, based on reviews of the literature and stakeholder consultations, for developing candidate items for a new measure for long-term conditions. Results: Three main organizing concepts were identified: impact of long-term conditions, experience of services and support, and self-care. The findings helped to refine a conceptual framework, leading to the development of 23 items that represent issues of importance in long-term conditions. The 23 candidate items formed the first draft of the measure, currently named the Long-Term Conditions Questionnaire. Conclusion: The aim of this study was to refine the conceptual framework and develop items for a patient-reported outcome measure for long-term conditions, including single and multiple morbidities and physical and mental health conditions. Qualitative interviews identified the key themes for assessing outcomes in long-term conditions, and these underpinned the development of the initial draft of the measure. These initial items will undergo cognitive testing to refine the items prior to further validation in a survey. PMID:27621678

  11. Development of a questionnaire to evaluate patients' awareness of cardiovascular disease risk in England's National Health Service Health Check preventive cardiovascular programme.

    PubMed

    Woringer, Maria; Nielsen, Jessica Jones; Zibarras, Lara; Evason, Julie; Kassianos, Angelos P; Harris, Matthew; Majeed, Azeem; Soljak, Michael

    2017-09-25

    The National Health Service (NHS) Health Check is a cardiovascular disease (CVD) risk assessment and management programme in England aiming to increase CVD risk awareness among people at increased risk of CVD. There is no tool to assess the effectiveness of the programme in communicating CVD risk to patients. The aim of this paper was to develop a questionnaire examining patients' CVD risk awareness for use in health service research evaluations of the NHS Health Check programme. We developed an 85-item questionnaire to determine patients' views of their risk of CVD. The questionnaire was based on a review of the relevant literature. After review by an expert panel and focus group discussion, 22 items were dropped and 2 new items were added. The resulting 65-item questionnaire with satisfactory content validity (content validity indices≥0.80) and face validity was tested on 110 NHS Health Check attendees in primary care in a cross-sectional study between 21 May 2014 and 28 July 2014. Following analyses of data, we reduced the questionnaire from 65 to 26 items. The 26-item questionnaire constitutes four scales: Knowledge of CVD Risk and Prevention, Perceived Risk of Heart Attack/Stroke, Perceived Benefits and Intention to Change Behaviour and Healthy Eating Intentions. Perceived Risk (Cronbach's α=0.85) and Perceived Benefits and Intention to Change Behaviour (Cronbach's α=0.82) have satisfactory reliability (Cronbach's α≥0.70). Healthy Eating Intentions (Cronbach's α=0.56) is below minimum threshold for reliability but acceptable for a three-item scale. The resulting questionnaire, with satisfactory reliability and validity, may be used in assessing patients' awareness of CVD risk among NHS Health Check attendees. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  12. Development and initial validation of the assessment of caregiver experience with neuromuscular disease.

    PubMed

    Matsumoto, Hiroko; Clayton-Krasinski, Debora A; Klinge, Stephen A; Gomez, Jaime A; Booker, Whitney A; Hyman, Joshua E; Roye, David P; Vitale, Michael G

    2011-01-01

    Orthopaedic intervention can have a wide range of functional and psychosocial effects on children with neuromuscular disease (NMD). In the multihandicapped child (Gross Motor Function Classification System IV/V), functional status, pain, psychosocial function, and health-related quality of life also have effects on the families of these children. The purpose of this study is to report the development and initial validation of an outcomes instrument specifically designed to assess the caregiver impact experienced by parents raising severely affected NMD children: the Assessment of Caregiver Experience with Neuromuscular Disease (ACEND). In the first part of this prospective study, 61 children with NMD and their parents were administered a range of earlier validated pediatric health measures. A framework technique was used to select the most appropriate and relevant subset of questions from this large set. Sensitivity analyses guided the development of a master question list measuring caregiver impact, excluding items with low relevance, and modifying unclear questions. In the second part of the study, the ACEND was administered to the caregivers of 46 children with moderate-to-severe NMD. Statistical analyses were conducted to determine validity of the instrument. The resulting ACEND instrument included 2 domains, 7 subdomains, and 41 items. Domain 1, examining physical impact, includes 4 subdomains: feeding/grooming/dressing (6 items), sitting/play (5 items), transfers (5 items), and mobility (7 items). Domain 2, which examines general caregiver impact, included 3 subdomains: time (4 items), emotion (9 items), and finance (5 items). Mean overall relevance rating was 6.21 ± 0.37 and clarity rating was 6.68 ± 0.52 (scale 0 to 7). Multiple floor effects in patients with GMFCS V and ceiling effects in patients with GMFCS III were identified almost exclusively in motor-based items. Virtually no floor or ceiling effects were identified in the time, emotion or finance domains across GMFCS level. The initial validation demonstrated that ACEND is a valid, disease-specific measure to quantify experience on caregivers of children with NMD. Larger groups of patients across NMD disease type are currently being tested to strengthen validity findings. Additionally, the ACEND is now being administered before and after orthopaedic interventions to determine responsiveness, which is critical to health outcomes research. LEVEL OF EVIDENCE/RELEVANCE: IIc.

  13. Development of a Values Conflict Resolution Assessment.

    ERIC Educational Resources Information Center

    Kinnier, Richard T.

    1987-01-01

    Describes the development of a Values Conflict Resolution Assessment (VCRA) and reports on validation and reliability. Items were constructed from theoretical criteria in values clarification and decision making with "ethical-emotional" and "rational-behavioral" components. VCRA scores correlated negatively with…

  14. Development of the functional vision questionnaire for children and young people with visual impairment: the FVQ_CYP.

    PubMed

    Tadić, Valerija; Cooper, Andrew; Cumberland, Phillippa; Lewando-Hundt, Gillian; Rahi, Jugnoo S

    2013-12-01

    To develop a novel age-appropriate measure of functional vision (FV) for self-reporting by visually impaired (VI) children and young people. Questionnaire development. A representative patient sample of VI children and young people aged 10 to 15 years, visual acuity of the logarithm of the minimum angle of resolution (logMAR) worse than 0.48, and a school-based (nonrandom) expert group sample of VI students aged 12 to 17 years. A total of 32 qualitative semistructured interviews supplemented by narrative feedback from 15 eligible VI children and young people were used to generate draft instrument items. Seventeen VI students were consulted individually on item relevance and comprehensibility, instrument instructions, format, and administration methods. The resulting draft instrument was piloted with 101 VI children and young people comprising a nationally representative sample, drawn from 21 hospitals in the United Kingdom. Initial item reduction was informed by presence of missing data and individual item response pattern. Exploratory factor analysis (FA) and parallel analysis (PA), and Rasch analysis (RA) were applied to test the instrument's psychometric properties. Psychometric indices and validity assessment of the Functional Vision Questionnaire for Children and Young People (FVQ_CYP). A total of 712 qualitative statements became a 56-item draft scale, capturing the level of difficulty in performing vision-dependent activities. After piloting, items were removed iteratively as follows: 11 for high percentage of missing data, 4 for skewness, and 1 for inadequate item infit and outfit values in RA, 3 having shown differential item functioning across age groups and 1 across gender in RA. The remaining 36 items showed item fit values within acceptable limits, good measurement precision and targeting, and ordered response categories. 
The reduced scale has a clear unidimensional structure, with all items having a high factor loading on the single factor in FA and PA. The summary scores correlated significantly with visual acuity. We have developed a novel, psychometrically robust self-report questionnaire for children and young people, the FVQ_CYP, that captures the functional impact of visual disability from their perspective. The 36-item, 4-point unidimensional scale has potential as a complementary adjunct to objective clinical assessments in routine pediatric ophthalmology practice and in research. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  15. The ELPAT living organ donor Psychosocial Assessment Tool (EPAT): from 'what' to 'how' of psychosocial screening - a pilot study.

    PubMed

    Massey, Emma K; Timmerman, Lotte; Ismail, Sohal Y; Duerinckx, Nathalie; Lopes, Alice; Maple, Hannah; Mega, Inês; Papachristou, Christina; Dobbels, Fabienne

    2018-01-01

    Thorough psychosocial screening of donor candidates is required in order to minimize potential negative consequences and to strive for optimal safety within living donation programmes. We aimed to develop an evidence-based tool to standardize the psychosocial screening process. Key concepts of psychosocial screening were used to structure our tool: motivation and decision-making, personal resources, psychopathology, social resources, ethical and legal factors and information and risk processing. We (i) discussed how each item per concept could be measured, (ii) reviewed and rated available validated tools, (iii) where necessary developed new items, (iv) assessed content validity and (v) pilot-tested the new items. The resulting ELPAT living organ donor Psychosocial Assessment Tool (EPAT) consists of a selection of validated questionnaires (28 items in total), a semi-structured interview (43 questions) and a Red Flag Checklist. We outline optimal procedures and conditions for implementing this tool. The EPAT and user manual are available from the authors. Use of this tool will standardize the psychosocial screening procedure, ensuring that no psychosocial issues are overlooked, that comparable selection criteria are used, and that comparable psychosocial data on living donor candidates can be generated. © 2017 Steunstichting ESOT.

  16. Development of Listening Comprehension Tests with Narrative and Expository Texts for Portuguese Students.

    PubMed

    Santos, Sandra; Viana, Fernanda Leopoldina; Ribeiro, Iolanda; Prieto, Gerardo; Brandão, Sara; Cadime, Irene

    2015-03-03

    This investigation aimed to develop and collect psychometric data for two tests assessing listening comprehension of Portuguese students in primary school: the Test of Listening Comprehension of Narrative Texts (TLC-n) and the Test of Listening Comprehension of Expository Texts (TLC-e). Two studies were conducted. The purpose of study 1 was to construct four test forms for each of the two tests to assess first, second, third and fourth grade students of the primary school. The TLC-n was administered to 1042 students, and the TLC-e was administered to 848 students. The purpose of study 2 was to test the psychometric properties of new items for the TLC-n form for fourth graders, given that the results in study 1 indicated a severe lack of difficult items. The participants were 260 fourth graders. The data were analysed using the Rasch model. Thirty items were selected for each test form. The results provided support for the model assumptions: Unidimensionality and local independence of the items. The reliability coefficients were higher than .70 for all test forms. The TLC-n and the TLC-e present good psychometric properties and represent an important contribution to the learning disabilities assessment field.
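
    The Rasch model named in this abstract can be illustrated with a minimal sketch. It assumes the standard dichotomous form, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The function name and the example values are hypothetical and are not taken from the study.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person ability in logits (hypothetical value, not study data)
    b: item difficulty in logits (hypothetical value, not study data)
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# A person whose ability equals the item difficulty has an even chance.
print(rasch_probability(0.0, 0.0))  # 0.5
```

    Because only the difference theta - b enters the formula, an item bank with too few difficult items (as reported for the fourth-grade form in study 1) gives almost every able student a high success probability and therefore little information about them.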

  17. Psychometric Development of the Research and Knowledge Scale.

    PubMed

    Powell, Lauren R; Ojukwu, Elizabeth; Person, Sharina D; Allison, Jeroan; Rosal, Milagros C; Lemon, Stephenie C

    2017-02-01

    Many research participants are misinformed about research terms, procedures, and goals; however, no validated instruments exist to assess individuals' comprehension of health-related research information. We propose research literacy as a concept that incorporates understanding about the purpose and nature of research. We developed the Research and Knowledge Scale (RaKS) to measure research literacy in a culturally and literacy-sensitive manner. We describe its development and psychometric properties. Qualitative methods were used to assess perspectives of research participants and researchers. Literature and informed consent reviews were conducted to develop initial items. These data were used to develop initial domains and items of the RaKS, and expert panel reviews and cognitive pretesting were done to refine the scale. We conducted psychometric analyses to evaluate the scale. The cross-sectional survey was administered to a purposive community-based sample (n=430) using a Web-based data collection system and paper. We did classic theory testing on individual items and assessed test-retest reliability and Kuder-Richardson-20 for internal consistency. We conducted exploratory factor analysis and analysis of variance to assess differences in mean research literacy scores in sociodemographic subgroups. The RaKS comprises 16 items, with a Kuder-Richardson-20 estimate of 0.81 and test-retest reliability of 0.84. There were differences in mean scale scores by race/ethnicity, age, education, income, and health literacy (all P<0.01). This study provides preliminary evidence for the reliability and validity of the RaKS. This scale can be used to measure research participants' understanding about health-related research processes and identify areas to improve informed decision-making about research participation.
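
    The Kuder-Richardson-20 coefficient reported above can be sketched as follows. This is a minimal illustration that assumes dichotomously scored (0/1) items and the textbook KR-20 formula with population variances. The function name and the response matrix are hypothetical and are not RaKS data.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores.

    responses: one list per respondent, one 0/1 entry per item.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    # p is the proportion scoring 1 on each item; p * (1 - p) is its variance.
    p = [sum(row[i] for row in responses) / n_people for i in range(n_items)]
    item_variance_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of total scores, consistent with the p * (1 - p) terms.
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 4 respondents, 3 items.
demo = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(demo), 2))  # 0.75
```

    The sketch uses population variance throughout. Some software uses the sample variance of the total score instead, which gives slightly different values in small samples.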

  18. Rasch Analysis of a New Hierarchical Scoring System for Evaluating Hand Function on the Motor Assessment Scale for Stroke

    PubMed Central

    Sabari, Joyce S.; Woodbury, Michelle; Velozo, Craig A.

    2014-01-01

    Objectives. (1) To develop two independent measurement scales for use as items assessing hand movements and hand activities within the Motor Assessment Scale (MAS), an existing instrument used for clinical assessment of motor performance in stroke survivors; (2) To examine the psychometric properties of these new measurement scales. Design. Scale development, followed by a multicenter observational study. Setting. Inpatient and outpatient occupational therapy programs in eight hospital and rehabilitation facilities in the United States and Canada. Participants. Patients (N = 332) receiving stroke rehabilitation following left (52%) or right (48%) cerebrovascular accident; mean age 64.2 years (sd 15); median 1 month since stroke onset. Intervention. Not applicable. Main Outcome Measures. Data were tested for unidimensionality and reliability, and behavioral criteria were ordered according to difficulty level with Rasch analysis. Results. The new scales assessing hand movements and hand activities met Rasch expectations of unidimensionality and reliability. Conclusion. Following a multistep process of test development, analysis, and refinement, we have redesigned the two scales that comprise the hand function items on the MAS. The hand movement scale contains an empirically validated 10-behavior hierarchy and the hand activities item contains an empirically validated 8-behavior hierarchy. PMID:25177513

  19. Rater reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS).

    PubMed

    Baker, Nancy A; Cook, James R; Redfern, Mark S

    2009-01-01

    This paper describes the inter-rater and intra-rater reliability, and the concurrent validity of an observational instrument, the Keyboard Personal Computer Style instrument (K-PeCS), which assesses stereotypical postures and movements associated with computer keyboard use. Three trained raters independently rated the video clips of 45 computer keyboard users to ascertain inter-rater reliability, and then re-rated a sub-sample of 15 video clips to ascertain intra-rater reliability. Concurrent validity was assessed by comparing the ratings obtained using the K-PeCS to scores developed from a 3D motion analysis system. The overall K-PeCS had excellent reliability [inter-rater: intra-class correlation coefficients (ICC)=.90; intra-rater: ICC=.92]. Most individual items on the K-PeCS had good to excellent reliability, although six items fell below ICC=.75. Those K-PeCS items that were assessed for concurrent validity compared favorably to the motion analysis data for all but two items. These results suggest that most items on the K-PeCS can be used to reliably document computer keyboarding style.

  20. Enhancing students' learning in problem based learning: validation of a self-assessment scale for active learning and critical thinking.

    PubMed

    Khoiriyah, Umatul; Roberts, Chris; Jorm, Christine; Van der Vleuten, C P M

    2015-08-26

    Problem based learning (PBL) is a powerful learning activity but fidelity to intended models may slip and student engagement wane, negatively impacting learning processes and outcomes. One potential solution to this degradation is to encourage self-assessment in the PBL tutorial. Self-assessment is a central component of the self-regulation of student learning behaviours. There are few measures to investigate self-assessment relevant to PBL processes. We developed a Self-assessment Scale on Active Learning and Critical Thinking (SSACT) to address this gap. We wished to demonstrate evidence of its validity in the context of PBL by exploring its internal structure. We used a mixed methods approach to scale development. We developed scale items from a qualitative investigation, literature review, and consideration of existing tools used for study of the PBL process. Expert review panels evaluated its content; a process of validation subsequently reduced the pool of items. We used structural equation modelling to undertake a confirmatory factor analysis (CFA) of the SSACT and coefficient alpha. The 14 item SSACT consisted of two domains "active learning" and "critical thinking." The factorial validity of SSACT was evidenced by all items loading significantly on their expected factors, a good model fit for the data, and good stability across two independent samples. Each subscale had good internal reliability (>0.8), and the two subscales correlated strongly with each other. The SSACT has sufficient evidence of its validity to support its use in the PBL process to encourage students to self-assess. The implementation of the SSACT may assist students to improve the quality of their learning in achieving PBL goals such as critical thinking and self-directed learning.

  1. Assessing the impact of growth hormone deficiency and treatment in adults: development of a new disease-specific measure.

    PubMed

    Brod, Meryl; Højbjerre, Lise; Adalsteinsson, Johan Erpur; Rasmussen, Michael Højby

    2014-04-01

    Approximately 50 000 adults in the United States are diagnosed with GH deficiency, which has negative impacts on cognitive functioning, psychological well-being, and quality of life. This paper presents development and validation of a patient-reported outcome measure (PRO), the Treatment-Related Impact Measure-Adult Growth Hormone Deficiency (TRIM-AGHD). The TRIM-AGHD was developed to measure the impact of GH deficiency and its treatment. The development and validation of the TRIM-AGHD was conducted according to the Food and Drug Administration guidance on the development of PROs. Concept elicitation, conducted in three countries, included interviews with patients, clinical experts, and literature review. Qualitative data were analyzed based on grounded theory principles, and draft items were cognitively debriefed. The measure underwent psychometric validation in a US clinic-based population. An a priori statistical analysis plan included assessment of the measurement model, reliability, and validity. Item functioning was reviewed using item response theory analyses. Forty-eight patients and six clinical experts participated in concept elicitation and 169 patients completed the validation study. TRIM-AGHD was measured. Factor analysis resulted in four domains: energy level, physical health, emotional health, and cognitive ability. The item response theory confirmed adequate item fit and placement within their domain. Internal consistency ranged from 0.82 to 0.95 and test-retest ranged from 0.80 to 0.92. All prespecified hypotheses for convergent validity and all but two for discriminant validity were met. The final 26-item TRIM-AGHD can be considered a reliable and valid PRO of the impact of disease and treatment for adult GH deficiency.

  2. Development of an instrument to measure self-efficacy in caregivers of people with advanced cancer.

    PubMed

    Ugalde, Anna; Krishnasamy, Meinir; Schofield, Penelope

    2013-06-01

    Informal caregivers of people with advanced cancer experience many negative impacts as a result of their role. There is a lack of suitable measures specifically designed to assess their experience. This study aimed to develop a new measure to assess self-efficacy in caregivers of people with advanced cancer. The development and testing of the new measure consisted of four separate, sequential phases: generation of issues, development of issues into items, pilot testing and field testing. In the generation of issues, 17 caregivers were interviewed to generate data. These data were analysed to generate codes, which were then systematically developed into items to construct the instrument. The instrument was pilot tested with 14 health professionals and five caregivers. It was then administered to a large sample for field testing to establish the psychometric properties, with established measures including the Brief Cope and the Family Appraisals for Caregiving Questionnaire for Palliative Care. Ninety-four caregivers completed the questionnaire booklet to establish the factor structure, reliability and validity. The factor analysis resulted in a 21-item, four-factor instrument, with the subscales being termed Resilience, Self-Maintenance, Emotional Connectivity and Instrumental Caregiving. The test-retest reliability and internal consistency were both excellent, ranging from 0.73 to 0.85 and 0.81 to 0.94, respectively. Six convergent and divergent hypotheses were made, and five were supported. This study has developed a new instrument to assess self-efficacy in caregivers of people with advanced cancer. The result is a four-factor, 21-item instrument with demonstrated reliability and validity. Copyright © 2012 John Wiley & Sons, Ltd.

  3. The multi-faceted assessment of independence in patients with rheumatoid arthritis: preliminary validation from the ATTAIN study.

    PubMed

    Hassett, Afton L; Li, Tracy; Buyske, Steven; Savage, Shantal V; Gignac, Monique A M

    2008-05-01

    To consider the feasibility of assessing multiple facets of independence in rheumatoid arthritis (RA) using a measure developed from existing items and examining its face validity, construct validity and responsiveness to change. The ATTAIN (Abatacept Trial in Treatment of Anti-tumor necrosis factor [TNF] Inadequate responders) database was used. Patients with RA were randomized 2:1, abatacept (n = 258) and placebo (n = 133). A multi-faceted scale to measure physical and psychosocial independence was constructed using items from the Health Assessment Questionnaire (HAQ) and Short Form 36 Health Survey (SF-36). Questions assessing activity limitations and need for outside caregiver help were also examined. Interviews with 20 RA patients assessed face validity. Item Response Theory analysis yielded two traits - 'Psychosocial Independence', derived from the number of days with activity limitations plus the Role Emotional, Social Functioning and Role Physical subscale items from the SF-36; and 'Physical Independence', derived from 15 HAQ items assessing need for help from another. The two traits showed no significant differential item functioning for age or gender and demonstrated good face validity. Changes over 169 days on Psychosocial Independence were greater (mean 0.46 units, 95% confidence interval [CI]: 0.17-0.75) for the abatacept group than for placebo (p = 0.002). Changes in Physical Independence were greater (mean 0.59 units, 95% CI: 0.35-0.82) for the abatacept group than for placebo (p < 0.001). The multi-faceted assessment of independence in RA based on items from commonly used instruments is feasible suggesting promise for evaluating independence in future clinical trials. This approach demonstrated good face and construct validity and responsiveness in RA patients who had previously failed anti-TNF therapy. 
However, we caution against an interpretation that these data suggest that abatacept improves independence because the component parts of this assessment came from instruments used in the ATTAIN trial where data had been previously analyzed.

  4. Quality of Life: An Exploratory Study.

    ERIC Educational Resources Information Center

    Lankhorst, Gustaaf J.

    1989-01-01

    A 12-item list of human abilities/activities was developed to measure quality of life of 9 rheumatoid arthritis adults from 2 aspects: "present condition" and "relative importance" of each item. Pilot testing indicated that importance and present condition represent different aspects. Differences between self-assessments and physicians'…

  5. Development and psychometric validation of the general practice nurse satisfaction scale.

    PubMed

    Halcomb, Elizabeth J; Caldwell, Belinda; Salamonson, Yenna; Davidson, Patricia M

    2011-09-01

    To develop an instrument to assess consumer satisfaction with nursing in general practice to provide feedback to nurses about consumers' perceptions of their performance. Prospective psychometric instrument validation study. A literature review was conducted to generate items for an instrument to measure consumer satisfaction with nursing in general practice. Face and content validity were evaluated by an expert panel, which had extensive experience in general practice nursing and research. Included in the questionnaire battery was the 27-item General Practice Nurse Satisfaction (GPNS) scale, as well as demographic and health status items. This survey was distributed to 739 consumers following intervention administered by a practice nurse in 16 general practices across metropolitan, rural, and regional Australia. Participants had the option of completing the survey online or receiving a hard copy of the survey form at the time of their visit. These data were collected between June and August 2009. Satisfaction data from 739 consumers were collected following their consultation with a general practice nurse. From the initial 27-item GPNS scale, a 21-item instrument was developed. Two factors, "confidence and credibility" and "interpersonal and communication" were extracted using principal axis factoring and varimax rotation. These two factors explained 71.9% of the variance. Cronbach's α was 0.97. The GPNS scale has demonstrated acceptable psychometric properties and can be used both in research and clinical practice for evaluating consumer satisfaction with general practice nurses. Assessing consumer satisfaction is important for developing and evaluating nursing roles. The GPNS scale is a valid and reliable tool that can be utilized to assess consumer satisfaction with general practice nurses and can assist in performance management and improving the quality of nursing services. © 2011 Sigma Theta Tau International.
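
    Cronbach's α, the internal-consistency statistic reported above, can be sketched as follows. This is a minimal illustration of the standard formula using sample variances. The function name and the ratings are hypothetical and are not GPNS data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of numeric item ratings.

    responses: one list per respondent, one rating per item.
    """
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings in which both items move in lockstep.
demo = [[1, 1], [2, 2], [3, 3]]
print(cronbach_alpha(demo))  # 1.0
```

    Alpha rises as items covary more strongly and as the number of items grows, so a very high value on a 21-item scale can reflect item redundancy as well as consistency.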

  6. Discriminant content validity: a quantitative methodology for assessing content of theory-based measures, with illustrative applications.

    PubMed

    Johnston, Marie; Dixon, Diane; Hart, Jo; Glidewell, Liz; Schröder, Carin; Pollard, Beth

    2014-05-01

    In studies involving theoretical constructs, it is important that measures have good content validity and that there is not contamination of measures by content from other constructs. While reliability and construct validity are routinely reported, to date, there has not been a satisfactory, transparent, and systematic method of assessing and reporting content validity. In this paper, we describe a methodology of discriminant content validity (DCV) and illustrate its application in three studies. Discriminant content validity involves six steps: construct definition, item selection, judge identification, judgement format, single-sample test of content validity, and assessment of discriminant items. In three studies, these steps were applied to a measure of illness perceptions (IPQ-R) and control cognitions. The IPQ-R performed well with most items being purely related to their target construct, although timeline and consequences had small problems. By contrast, the study of control cognitions identified problems in measuring constructs independently. In the final study, direct estimation response formats for theory of planned behaviour constructs were found to have as good DCV as Likert format. The DCV method allowed quantitative assessment of each item and can therefore inform the content validity of the measures assessed. The methods can be applied to assess content validity before or after collecting data to select the appropriate items to measure theoretical constructs. Further, the data reported for each item in Appendix S1 can be used in item or measure selection. Statement of contribution What is already known on this subject? There are agreed methods of assessing and reporting construct validity of measures of theoretical constructs, but not their content validity. Content validity is rarely reported in a systematic and transparent manner. What does this study add? 
The paper proposes discriminant content validity (DCV), a systematic and transparent method of assessing and reporting whether items assess the intended theoretical construct and only that construct. In three studies, DCV was applied to measures of illness perceptions, control cognitions, and theory of planned behaviour response formats. Appendix S1 gives content validity indices for each item of each questionnaire investigated. Discriminant content validity is ideally applied while the measure is being developed, before using to measure the construct(s), but can also be applied after using a measure. © 2014 The British Psychological Society.

  7. Rapid assessment of tinnitus-related psychological distress using the Mini-TQ.

    PubMed

    Hiller, Wolfgang; Goebel, Gerhard

    2004-01-01

    The aim of this study was to develop an abridged version of the Tinnitus Questionnaire (TQ) to be used as a quick tool for the assessment of tinnitus-related psychological distress. Data from 351 inpatients and 122 outpatients with chronic tinnitus were used to analyse item statistics and psychometric properties. Twelve items with an optimal combination of high item-total correlations, reliability and sensitivity in assessing changes were selected for the Mini-TQ. Correlation with the full TQ was >0.90, and test-retest reliability was 0.89. Validity was confirmed by associations with general psychological symptom patterns. Treatment effects indicated by the Mini-TQ were slightly greater than those indicated by the full TQ. The Mini-TQ is recommended as a psychometrically approved and solid tool for rapid and economical assessment of subjective tinnitus distress.

  8. Therapist Competence in Global Mental Health: Development of the Enhancing Assessment of Common Therapeutic Factors (ENACT) Rating Scale

    PubMed Central

    Kohrt, Brandon A.; Jordans, Mark J.D.; Rai, Sauharda; Shrestha, Pragya; Luitel, Nagendra P.; Ramaiya, Megan; Singla, Daisy; Patel, Vikram

    2015-01-01

    Lack of reliable and valid measures of therapist competence is a barrier to dissemination and implementation of psychological treatments in global mental health. We developed the ENhancing Assessment of Common Therapeutic factors (ENACT) rating scale for training and supervision across settings varied by culture and access to mental health resources. We employed a four-step process in Nepal: (1) Item generation: We extracted 1,081 items (grouped into 104 domains) from 56 existing tools; role-plays with Nepali therapists generated 11 additional domains. (2) Item relevance: From the 115 domains, Nepali therapists selected 49 domains of therapeutic importance and high comprehensibility. (3) Item utility: We piloted the ENACT scale through rating role-play videotapes, patient session transcripts, and live observations of primary care workers in trainings for psychological treatments and the Mental Health Gap Action Programme (mhGAP). (4) Inter-rater reliability was acceptable for experts (intraclass correlation coefficient, ICC(2,7)=0.88 (95% confidence interval (CI) 0.81-0.93), N=7) and non-specialists (ICC(1,3)=0.67 (95% CI 0.60-0.73), N=34). In sum, the ENACT scale is an 18-item assessment for common factors in psychological treatments, including task-sharing initiatives with non-specialists across cultural settings. Further research is needed to evaluate applications for therapy quality and association with patient outcomes. PMID:25847276

  9. How and Why Students Learn: Development and Validation of the Learner Awareness Levels Questionnaire for Higher Education Students

    ERIC Educational Resources Information Center

    Choy, S. Chee; Goh, Pauline Swee Choo; Sedhu, Daljeet Singh

    2016-01-01

    The development of the 21-item Learner Awareness Levels Questionnaire (LALQ) was carried out using data from three separate studies. The LALQ is a self-reporting questionnaire assessing how and why students learn. Study 1 refined the initial pool of items to 21 using exploratory factor analysis. In Study 2, the analysis showed evidence for a…

  10. Reliability and validity evidence of the Assessment of Language Use in Social Contexts for Adults (ALUSCA).

    PubMed

    Valente, Ana Rita S; Hall, Andreia; Alvelos, Helena; Leahy, Margaret; Jesus, Luis M T

    2018-04-12

    The appropriate use of language in context depends on the speaker's pragmatic language competencies. A coding system was used to develop a specific and adult-focused self-administered questionnaire for adults who stutter and adults who do not stutter, The Assessment of Language Use in Social Contexts for Adults, with three categories: precursors, basic exchanges, and extended literal/non-literal discourse. This paper presents the content validity, item analysis, reliability coefficients and evidence of construct validity of the instrument. Content validity analysis was based on a two-stage process: first, 11 pragmatic questionnaires were assessed to identify items that probe each pragmatic competency and to create the first version of the instrument; second, items were assessed qualitatively by an expert panel composed of adults who stutter and controls, and quantitatively and qualitatively by an expert panel composed of clinicians. A pilot study was conducted with five adults who stutter and five controls to analyse items and calculate reliability. Construct validity evidence was obtained using the hypothesized relationships method and factor analysis with 28 adults who stutter and 28 controls. Concerning content validity, the questionnaires assessed up to 13 pragmatic competencies. Qualitative and quantitative analysis revealed ambiguities in item construction. Disagreement between experts was resolved through item modification. The pilot study showed that the instrument presented internal consistency and temporal stability. Significant differences between adults who stutter and controls and different response profiles revealed the instrument's underlying construct. The instrument is reliable and presented evidence of construct validity.

  11. Development and content validation of a patient-reported endometriosis pain daily diary.

    PubMed

    van Nooten, Floortje E; Cline, Jennifer; Elash, Celeste A; Paty, Jean; Reaney, Matthew

    2018-01-04

    Endometriosis is a common gynecological disorder that causes inflammation and pelvic pain. Endometriosis-related pain is best captured with patient-reported outcome (PRO) measures; however, assessment of endometriosis-related pain in clinical trials has been difficult in the absence of a reliable and valid PRO instrument. We describe the development of the Endometriosis Pain Daily Diary (EPDD), an electronic PRO developed as a survey instrument to assess endometriosis-related pain and its impact on patients' lives. The EPDD was initially developed on the basis of an existing Endometriosis Pain and Bleeding Diary, a targeted review of relevant literature, clinical expert interviews, and open-ended (concept elicitation) patient interviews in the United States (US) and Japan which captured patients' experience with endometriosis. Cognitive interviews of patients with endometriosis were conducted to evaluate patient comprehension of the EPDD items. A conceptual model of endometriosis was developed, and meetings with US and European regulatory authorities provided feedback for validating the EPDD in the context of clinical trials. Translatability assessments of the EPDD were conducted to confirm its appropriate interpretation and ease of completion across 17 languages. The iterative development progressed through three versions of the instrument. The EPDDv1 included 18 items relating to dysmenorrhea/pelvic pain, dyspareunia and sexual activity, bleeding, hot flashes, daily activities, and use of rescue medication. The EPDDv2 was a larger 43-item survey tested in cognitive interviews and subsequently revised to yield the current 11-item EPDDv3, consisting of five core items relating to dysmenorrhea, non-menstrual pelvic pain, and dyspareunia, and six extension items relating to sexual activity, daily activities, and use of rescue medication. The EPDD is a PRO for the evaluation of endometriosis-related pain and its associated impacts on patients' lives.
The EPDD represents an important step in providing a PRO that is relevant to patients with endometriosis-related pain in the context of a clinical study setting (ie, fit-for-purpose), designed to evaluate pain associated with endometriosis, including regulatory agency support for its further exploration in clinical trials.

  12. Assessing Technical Performance and Determining the Learning Curve in Cleft Palate Surgery Using a High-Fidelity Cleft Palate Simulator.

    PubMed

    Podolsky, Dale J; Fisher, David M; Wong Riff, Karen W; Szasz, Peter; Looi, Thomas; Drake, James M; Forrest, Christopher R

    2018-06-01

    This study assessed technical performance in cleft palate repair using a newly developed assessment tool and high-fidelity cleft palate simulator through a longitudinal simulation training exercise. Three residents performed five and one resident performed nine consecutive endoscopically recorded cleft palate repairs using a cleft palate simulator. Two fellows in pediatric plastic surgery and two expert cleft surgeons also performed recorded simulated repairs. The Cleft Palate Objective Structured Assessment of Technical Skill (CLOSATS) and end-product scales were developed to assess performance. Two blinded cleft surgeons assessed the recordings and the final repairs using the CLOSATS, end-product scale, and a previously developed global rating scale. The average procedure-specific (CLOSATS), global rating, and end-product scores increased logarithmically after each successive simulation session for the residents. Reliability of the CLOSATS (average item intraclass correlation coefficient (ICC), 0.85 ± 0.093) and global ratings (average item ICC, 0.91 ± 0.02) among the raters was high. Reliability of the end-product assessments was lower (average item ICC, 0.66 ± 0.15). Standard setting linear regression using an overall cutoff score of 7 of 10 corresponded to a pass score for the CLOSATS and the global score of 44 (maximum, 60) and 23 (maximum, 30), respectively. Using logarithmic best-fit curves, 6.3 simulation sessions are required to reach the minimum standard. A high-fidelity cleft palate simulator has been developed that improves technical performance in cleft palate repair. The simulator and technical assessment scores can be used to determine performance before operating on patients.
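
    The logarithmic best-fit approach mentioned above can be illustrated with a minimal sketch. It assumes a curve of the form score = a + b * ln(session), fitted by ordinary least squares; the abstract does not specify the exact fitting procedure, so this is one plausible reading. The function names and the scores are hypothetical and are not the study's data.

```python
import math


def fit_log_curve(sessions, scores):
    """Least-squares fit of score = a + b * ln(session); returns (a, b)."""
    x = [math.log(s) for s in sessions]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(scores) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sessions_to_reach(cutoff, a, b):
    """Invert the fitted curve to find the session count that meets a cutoff."""
    return math.exp((cutoff - a) / b)


# Hypothetical scores for five consecutive simulation sessions.
sessions = [1, 2, 3, 4, 5]
scores = [20.0, 27.5, 31.0, 34.0, 36.5]
a, b = fit_log_curve(sessions, scores)
print(sessions_to_reach(44, a, b))  # sessions needed to reach a pass score of 44
```

    Inverting the fitted curve at the pass score is what turns a learning curve into a statement such as "6.3 simulation sessions are required"; the result is an extrapolation whenever the cutoff lies beyond the observed sessions.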

  13. Assessing Student Outcomes of Undergraduate Research with URSSA, the Undergraduate Student Self-Assessment Instrument

    NASA Astrophysics Data System (ADS)

    Laursen, S. L.; Weston, T. J.; Thiry, H.

    2012-12-01

    URSSA is the Undergraduate Research Student Self-Assessment, an online survey instrument for programs and departments to use in assessing the student outcomes of undergraduate research (UR). URSSA focuses on what students learn from their UR experience, rather than whether they liked it. The online questionnaire includes both multiple-choice and open-ended items that focus on students' gains from undergraduate research. These gains include skills, knowledge, deeper understanding of the intellectual and practical work of science, growth in confidence, changes in identity, and career preparation. Other items probe students' participation in important research-related activities that lead to these gains (e.g. giving presentations, having responsibility for a project). These activities, and the gains themselves, are based in research and thus constitute a core set of items. Using these items as a group helps to align a particular program assessment with research-demonstrated outcomes. Optional items may be used to probe particular features that augment the research experience (e.g. field trips, career seminars, housing arrangements). The URSSA items are based on extensive, interview-based research and evaluation work on undergraduate research by our group and others. This grounding in research means that URSSA measures what we know to be important about the UR experience. The items were tested with students, revised and re-tested. Data from a large pilot sample of over 500 students enabled statistical testing of the items' validity and reliability. Optional items about UR program elements were developed in consultation with UR program developers and leaders. The resulting instrument is flexible. Users begin with a set of core items, then customize their survey with optional items to probe students' experiences of specific program elements.
The online instrument is free and easy to use, with numeric results available as summary statistics, cross-tabs, and graphs, and as raw, downloadable data. Finally, URSSA has high content validity based on its research grounding and rigorous development. We will present examples of how URSSA has been used in evaluations of UR programs. A multi-year evaluation of a university-based UR program shows that URSSA items are sensitive to differences in students' prior level of experience with research. For example, experienced student researchers reported greater gains than did their peers new to UR in understanding the process of research and in coming to see themselves as scientists. These differences are consistent with interview data that suggest a developmental progression of gains as students pursue research and gain confidence in their ability to contribute meaningfully. A second example comes from a multi-site evaluation of sites funded by the National Science Foundation's Research Experience for Undergraduates (REU) program in Biology. This study acquired data from nearly 800 students at some 60 Bio REU sites in 2010 and 2011. Results reveal differences in gains among demographic groups, and the general strength of these well-planned programs relative to a comparison sample of UR programs that are not part of REU. Our presentation will demonstrate the evaluative use of URSSA and its potential applications to undergraduate research in the geosciences.

  14. Development and Preliminary Testing of the Food Choice Priorities Survey (FCPS): Assessing the Importance of Multiple Factors on College Students' Food Choices.

    PubMed

    Vilaro, Melissa J; Zhou, Wenjun; Colby, Sarah E; Byrd-Bredbenner, Carol; Riggsbee, Kristin; Olfert, Melissa D; Barnett, Tracey E; Mathews, Anne E

    2017-12-01

    Understanding factors that influence food choice may help improve diet quality. Factors that commonly affect adults' food choices have been described, but measures that identify and assess food choice factors specific to college students are lacking. This study developed and tested the Food Choice Priorities Survey (FCPS) among college students. Thirty-seven undergraduates participated in two focus groups ( n = 19; 11 in the male-only group, 8 in the female-only group) and interviews ( n = 18) regarding typical influences on food choice. Qualitative data informed the development of survey items with a 5-point Likert-type scale (1 = not important, 5 = extremely important). An expert panel rated FCPS items for clarity, relevance, representativeness, and coverage using a content validity form. To establish test-retest reliability, 109 first-year college students completed the 14-item FCPS at two time points, 0-48 days apart ( M = 13.99, SD = 7.44). Using Cohen's weighted κ for responses within 20 days, 11 items demonstrated moderate agreement and 3 items had substantial agreement. Factor analysis revealed a three-factor structure (9 items). The FCPS is designed for college students and provides a way to determine the factors of greatest importance regarding food choices among this population. From a public health perspective, practical applications include using the FCPS to tailor health communications and behavior change interventions to factors most salient for food choices of college students.
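
    As an illustration of the test-retest agreement statistic mentioned above, the sketch below computes Cohen's weighted kappa for one Likert item rated at two time points. It assumes linear disagreement weights; the abstract does not say which weighting scheme the authors used, and the function name and sample ratings are hypothetical.

```python
def weighted_kappa(ratings_t1, ratings_t2, categories):
    """Cohen's weighted kappa with linear disagreement weights |i - j| / (k - 1).

    ratings_t1, ratings_t2: paired ratings from the same respondents at two times.
    categories: ordered list of the possible response values.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_t1)

    # Observed joint proportions.
    observed = [[0.0] * k for _ in range(k)]
    for r1, r2 in zip(ratings_t1, ratings_t2):
        observed[index[r1]][index[r2]] += 1.0 / n

    # Marginal proportions for each time point.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected disagreement.
    num = 0.0
    den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            num += w * observed[i][j]
            den += w * row[i] * col[j]
    # den is zero only if every rating falls in a single category.
    return 1.0 - num / den

# Hypothetical responses on a 5-point scale (1 = not important, 5 = extremely important).
time1 = [5, 4, 4, 3, 2, 5, 1, 3]
time2 = [5, 4, 3, 3, 2, 4, 1, 3]
kappa = weighted_kappa(time1, time2, categories=[1, 2, 3, 4, 5])
print(round(kappa, 3))
```

    Weighting penalizes a one-point shift less than a four-point shift, which suits ordinal Likert responses better than unweighted kappa.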

  15. Development of the standards of reporting of neurological disorders (STROND) checklist: a guideline for the reporting of incidence and prevalence studies in neuroepidemiology.

    PubMed

    Bennett, Derrick A; Brayne, Carol; Feigin, Valery L; Barker-Collo, Suzanne; Brainin, Michael; Davis, Daniel; Gallo, Valentina; Jetté, Nathalie; Karch, André; Kurtzke, John F; Lavados, Pablo M; Logroscino, Giancarlo; Nagel, Gabriele; Preux, Pierre-Marie; Rothwell, Peter M; Svenson, Lawrence W

    2015-07-01

    Incidence and prevalence studies of neurological disorders play an important role in assessing the burden of disease and planning services. However, the assessment of disease estimates is hindered by problems in reporting for such studies. Despite a growth in published reports, existing guidelines relate to analytical rather than descriptive epidemiological studies. There are also no user-friendly tools (e.g., checklists) available for authors, editors and peer-reviewers to facilitate best practice in reporting of descriptive epidemiological studies for most neurological disorders. The Standards of Reporting of Neurological Disorders (STROND) is a guideline that consists of recommendations and a checklist to facilitate better reporting of published incidence and prevalence studies of neurological disorders. A review of previously developed guidance was used to produce a list of items required for incidence and prevalence studies in neurology. A three-round Delphi technique was used to identify the 'basic minimum items' important for reporting, as well as some additional 'ideal reporting items'. An e-consultation process was then used in order to gauge opinion by external neuroepidemiological experts on the appropriateness of the items included in the checklist. Of 38 candidate items, 15 items and accompanying recommendations were developed along with a user-friendly checklist. The introduction and use of the STROND checklist should lead to more consistent, transparent and contextualised reporting of descriptive neuroepidemiological studies resulting in more applicable and comparable findings and ultimately support better healthcare decisions.

  16. Do Images Influence Assessment in Anatomy? Exploring the Effect of Images on Item Difficulty and Item Discrimination

    ERIC Educational Resources Information Center

    Vorstenbosch, Marc A. T. M.; Klaassen, Tim P. F. M.; Kooloos, Jan G. M.; Bolhuis, Sanneke M.; Laan, Roland F. J. M.

    2013-01-01

    Anatomists often use images in assessments and examinations. This study aims to investigate the influence of different types of images on item difficulty and item discrimination in written assessments. A total of 210 of 460 students volunteered for an extra assessment in a gross anatomy course. This assessment contained 39 test items grouped in…

  17. Development and initial evaluation of an instrument to assess physiotherapists' clinical reasoning focused on clients' behavior change.

    PubMed

    Elvén, Maria; Hochwälder, Jacek; Dean, Elizabeth; Söderlund, Anne

    2018-05-01

    A systematically developed and evaluated instrument is needed to support investigations of physiotherapists' clinical reasoning integrated with the process of clients' behavior change. This study's aim was to develop an instrument to assess physiotherapy students' and physiotherapists' clinical reasoning focused on clients' activity-related behavior and behavior change, and initiate its evaluation, including feasibility and content validity. The study was conducted in three phases: 1) determination of instrument structure and item generation, based on a model, guidelines for assessing clinical reasoning, and existing measures; 2) cognitive interviews with five physiotherapy students to evaluate item understanding and feasibility; and 3) a Delphi process with 18 experts to evaluate content relevance. Phase 1 resulted in an instrument with four domains: Physiotherapist; Input from client; Functional behavioral analysis; and Strategies for behavior change. The instrument consists of case scenarios followed by items in which key features are identified, prioritized, or interpreted. Phase 2 resulted in revisions of problems and approval of feasibility. Phase 3 demonstrated a high level of consensus regarding the instrument's content relevance. This feasible and content-validated instrument shows potential for use in investigations of physiotherapy students' and physiotherapists' clinical reasoning; however, continued development and testing are needed.

  18. [Systematic development of a scale for determination of health-related quality of life in multiple trauma patients. The Polytrauma Outcome (POLO) Chart].

    PubMed

    Pirente, N; Bouillon, B; Schäfer, B; Raum, M; Helling, H J; Berger, E; Neugebauer, E

    2002-05-01

    Even years after having sustained multiple injuries, patients often suffer from their sequelae. These comprise restrictions in physical function, but also pain and social and psychological impairments. Although the Meran Consensus Conference in 1990 defined the contents of "quality of life" (QoL) measures in surgery, still no instrument is available for the valid assessment of all relevant QoL domains in multiply injured patients. This paper describes the systematic development of a modular instrument for the assessment of health-related QoL. Within three phases (phase I: generation of items, phase II: item reduction, phase III: pre-testing in 70 multiply injured and control patients) a questionnaire of 57 items was developed, which measures all relevant trauma-related aspects of QoL after acute hospital care. In combination with the Glasgow Outcome Scale (GOS), the EUROQOL and the SF-36, the newly developed instrument forms the Polytrauma Outcome Chart (POLO-Chart), which will also be used as "Part E" for outcome assessment within the "Trauma registry" of the German Society for Trauma Surgery. In phase IV, the POLO-Chart will finally be validated in five trauma centres (Celle, Essen, Hanover, Cologne and Munich).

  19. Exercise barriers self-efficacy: development and validation of a subscale for individuals with cancer-related lymphedema.

    PubMed

    Buchan, Jena; Janda, Monika; Box, Robyn; Rogers, Laura; Hayes, Sandi

    2015-03-18

    No tool exists to measure self-efficacy for overcoming lymphedema-related exercise barriers in individuals with cancer-related lymphedema. However, an existing scale measures confidence to overcome general exercise barriers in cancer survivors. Therefore, the purpose of this study was to develop, validate and assess the reliability of a subscale, to be used in conjunction with the general barriers scale, for determining exercise barriers self-efficacy in individuals facing lymphedema-related exercise barriers. A lymphedema-specific exercise barriers self-efficacy subscale was developed and validated using a cohort of 106 cancer survivors with cancer-related lymphedema, from Brisbane, Australia. An initial ten-item lymphedema-specific barrier subscale was developed and tested, with participant feedback and principal components analysis results used to guide development of the final version. Validity and test-retest reliability analyses were conducted on the final subscale. The final lymphedema-specific subscale contained five items. Principal components analysis revealed these items loaded highly (>0.75) on a separate factor when tested with a well-established nine-item general barriers scale. The final five-item subscale demonstrated good construct and criterion validity, high internal consistency (Cronbach's alpha = 0.93) and test-retest reliability (ICC = 0.67, p < 0.01). A valid and reliable lymphedema-specific subscale has been developed to assess exercise barriers self-efficacy in individuals with cancer-related lymphedema. This scale can be used in conjunction with an existing general exercise barriers scale to enhance exercise adherence in this understudied patient group.

  20. Screening for elevated levels of fear-avoidance beliefs regarding work or physical activities in people receiving outpatient therapy.

    PubMed

    Hart, Dennis L; Werneke, Mark W; George, Steven Z; Matheson, James W; Wang, Ying-Chih; Cook, Karon F; Mioduski, Jerome E; Choi, Seung W

    2009-08-01

    Screening people for elevated levels of fear-avoidance beliefs is uncommon, but elevated levels of fear could worsen outcomes. Developing short screening tools might reduce the data collection burden and facilitate screening, which could prompt further testing or management strategy modifications to improve outcomes. The purpose of this study was to develop efficient yet accurate screening methods for identifying elevated levels of fear-avoidance beliefs regarding work or physical activities in people receiving outpatient rehabilitation. A secondary analysis of data collected prospectively from people with a variety of common neuromusculoskeletal diagnoses was conducted. Intake Fear-Avoidance Beliefs Questionnaire (FABQ) data were collected from 17,804 people who had common neuromusculoskeletal conditions and were receiving outpatient rehabilitation in 121 clinics in 26 states (in the United States). Item response theory (IRT) methods were used to analyze the FABQ data, with particular emphasis on differential item functioning among clinically logical groups of subjects, and to identify screening items. The accuracy of screening items for identifying subjects with elevated levels of fear was assessed with receiver operating characteristic analyses. Three items for fear of physical activities and 10 items for fear of work activities represented unidimensional scales with adequate IRT model fit. Differential item functioning was negligible for variables known to affect functional status outcomes: sex, age, symptom acuity, surgical history, pain intensity, condition severity, and impairment. Items that provided maximum information at the median for the FABQ scales were selected as screening items to dichotomize subjects by high versus low levels of fear. The accuracy of the screening items was supported for both scales. This study represents a retrospective analysis, which should be replicated using prospective designs. 
Future prospective studies should assess the reliability and validity of using one FABQ item to screen people for high levels of fear-avoidance beliefs. The lack of differential item functioning in the FABQ scales in the sample tested in this study suggested that FABQ screening could be useful in routine clinical practice and allowed the development of single-item screening for fear-avoidance beliefs that accurately identified subjects with elevated levels of fear. Because screening was accurate and efficient, single IRT-based FABQ screening items are recommended to facilitate improved evaluation and care of heterogeneous populations of people receiving outpatient rehabilitation.
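
    The idea of choosing the screening item that provides maximum information at the median can be sketched as follows. This is a simplified illustration that assumes dichotomous items under a two-parameter logistic (2PL) model and a median trait level of 0; the FABQ items are polytomous and the abstract does not name the IRT model used, so the item names and parameters here are hypothetical.

```python
import math

def prob_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def best_screening_item(items, theta):
    """Return the name of the item with the highest information at theta."""
    return max(items, key=lambda name: item_information(theta, *items[name]))

# Hypothetical (discrimination a, location b) parameters for three items.
items = {
    "item_A": (1.2, -1.0),
    "item_B": (2.0, 0.1),
    "item_C": (0.8, 0.0),
}
chosen = best_screening_item(items, theta=0.0)  # theta = 0 stands in for the median
print(chosen)
```

    An item is most informative near its own location parameter, so a highly discriminating item located close to the median separates high from low scorers most sharply.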

  1. Developing and validating a nutrition knowledge questionnaire: key methods and considerations.

    PubMed

    Trakman, Gina Louise; Forsyth, Adrienne; Hoye, Russell; Belski, Regina

    2017-10-01

    To outline key statistical considerations and detailed methodologies for the development and evaluation of a valid and reliable nutrition knowledge questionnaire. Literature on questionnaire development in a range of fields was reviewed and a set of evidence-based guidelines specific to the creation of a nutrition knowledge questionnaire have been developed. The recommendations describe key qualitative methods and statistical considerations, and include relevant examples from previous papers and existing nutrition knowledge questionnaires. Where details have been omitted for the sake of brevity, the reader has been directed to suitable references. We recommend an eight-step methodology for nutrition knowledge questionnaire development as follows: (i) definition of the construct and development of a test plan; (ii) generation of the item pool; (iii) choice of the scoring system and response format; (iv) assessment of content validity; (v) assessment of face validity; (vi) purification of the scale using item analysis, including item characteristics, difficulty and discrimination; (vii) evaluation of the scale including its factor structure and internal reliability, or Rasch analysis, including assessment of dimensionality and internal reliability; and (viii) gathering of data to re-examine the questionnaire's properties, assess temporal stability and confirm construct validity. Several of these methods have previously been overlooked. The measurement of nutrition knowledge is an important consideration for individuals working in the nutrition field. Improved methods in the development of nutrition knowledge questionnaires, such as the use of factor analysis or Rasch analysis, will enable more confidence in reported measures of nutrition knowledge.
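
    Step (vi), item analysis, can be illustrated with a short sketch. It assumes items scored 0/1 (incorrect/correct), defines difficulty as the proportion answering correctly, and defines discrimination as the correlation between an item and the total of the remaining items; these are common classical test theory conventions and not necessarily the exact indices the authors recommend. The data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns NaN when either variable has no variance."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def item_analysis(responses):
    """responses: one list of 0/1 item scores per respondent.

    Returns a list of (difficulty, discrimination) tuples, one per item.
    """
    n_items = len(responses[0])
    results = []
    for j in range(n_items):
        item = [r[j] for r in responses]
        rest = [sum(r) - r[j] for r in responses]  # total score excluding item j
        difficulty = sum(item) / len(item)
        discrimination = pearson(item, rest)
        results.append((difficulty, discrimination))
    return results

# Hypothetical responses from four respondents to three knowledge items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
stats = item_analysis(responses)
for j, (p, r) in enumerate(stats, start=1):
    print(f"item {j}: difficulty={p:.2f}, discrimination={r:.2f}")
```

    Excluding the item from its own total avoids inflating the correlation, which matters most for short scales.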

  2. The development and validation of the King's Brief Interstitial Lung Disease (K-BILD) health status questionnaire.

    PubMed

    Patel, Amit S; Siegert, Richard J; Brignall, Katherine; Gordon, Patrick; Steer, Sophia; Desai, Sujal R; Maher, Toby M; Renzoni, Elisabetta A; Wells, Athol U; Higginson, Irene J; Birring, Surinder S

    2012-09-01

    Health status is impaired in patients with interstitial lung disease (ILD). There is a paucity of tools that assess health status in ILD. The objective of this study was to develop and validate the King's Brief Interstitial Lung Disease questionnaire (K-BILD), a new health status measure for patients with ILD. Patients with ILD were recruited from outpatient clinics. The development of the questionnaire consisted of three phases: item generation; item reduction, allocation to domains by factor analysis, Rasch analysis to create unidimensional scales and validation; and repeatability testing. 173 patients with ILD (49 with idiopathic pulmonary fibrosis) completed a preliminary 71-item questionnaire. 56 items were removed due to redundancy, low factor loadings or poor fit to the Rasch model. The final version of the K-BILD questionnaire consisted of 15 items and three domains (breathlessness and activities, chest symptoms and psychological). Internal consistency assessed with Cronbach's α coefficient was 0.94 for the K-BILD total score. Concurrent validity of the K-BILD questionnaire was high compared with St George's Respiratory Questionnaire (r=0.90) and moderate with lung function (vital capacity, r=0.50). The K-BILD questionnaire was repeatable over 2 weeks (n=44), with intraclass correlation coefficients for domains and total score 0.86-0.94. The K-BILD construct validity for patients with idiopathic pulmonary fibrosis was similar to that of other ILDs. The K-BILD questionnaire is a brief, valid, self-completed health status measure for ILD. It could be used in the clinic to assess ILD from the patients' perspective.
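
    Internal consistency of the kind reported above is usually computed with Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total score). The sketch below is a minimal illustration using sample variances; the function name and scores are hypothetical and unrelated to the K-BILD data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table of scores.

    scores: one list of item scores per respondent (at least two items
    and at least two respondents, with non-zero total-score variance).
    """
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 7-point responses from five respondents to four items.
scores = [
    [6, 5, 6, 7],
    [3, 3, 2, 4],
    [5, 4, 5, 5],
    [2, 2, 3, 2],
    [7, 6, 6, 7],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

    Alpha rises with the number of items as well as with their intercorrelation, so a high value for a long scale does not by itself show that the scale is unidimensional.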

  3. Development of a subjective cognitive decline questionnaire using item response theory: a pilot study.

    PubMed

    Gifford, Katherine A; Liu, Dandan; Romano, Raymond; Jones, Richard N; Jefferson, Angela L

    2015-12-01

    Subjective cognitive decline (SCD) may indicate unhealthy cognitive changes, but no standardized SCD measurement exists. This pilot study aims to identify reliable SCD questions. 112 cognitively normal (NC, 76±8 years, 63% female), 43 mild cognitive impairment (MCI; 77±7 years, 51% female), and 33 diagnostically ambiguous participants (79±9 years, 58% female) were recruited from a research registry and completed 57 self-report SCD questions. Psychometric methods were used for item reduction. Factor analytic models assessed unidimensionality of the latent trait (SCD); 19 items with extreme response distributions or poor trait fit were removed. Item response theory (IRT) provided information about question utility; 17 items with low information were dropped. Post-hoc simulation using computerized adaptive test (CAT) modeling selected the most commonly used items (n=9 of 21 items) that represented the latent trait well (r=0.94) and differentiated NC from MCI participants (F(1,146)=8.9, p=0.003). Item response theory and computerized adaptive test modeling identified nine reliable SCD items. This pilot study is a first step toward refining SCD assessment in older adults. Replication of these findings and validation with Alzheimer's disease biomarkers will be an important next step for the creation of an SCD screener.

  4. Overview of Classical Test Theory and Item Response Theory for Quantitative Assessment of Items in Developing Patient-Reported Outcome Measures

    PubMed Central

    Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.

    2014-01-01

    Introduction The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods We review classical test theory and item response theory approaches to evaluating PRO measures including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753

  5. An Instrument to Assess Self-Statements During Public Speaking: Scale Development and Preliminary Psychometric Properties

    PubMed Central

    Hofmann, Stefan G.; DiBartolo, Patricia Marten

    2006-01-01

    Public speaking is the most commonly reported fearful social situation. Although a number of contemporary theories emphasize the importance of cognitive processes in social anxiety, there is no instrument available to assess fearful thoughts experienced during public speaking. The Self-Statements During Public Speaking (SSPS) scale is a 10-item questionnaire consisting of two 5-item subscales, the “Positive Self-Statements” (SSPS-P) and the “Negative Self-Statements” subscale (SSPS-N). Four studies report on the development and the preliminary psychometric properties of this instrument. PMID:16763666

  6. Indices for the assessment of nutritional quality of meals: a systematic review.

    PubMed

    Gorgulho, B M; Pot, G K; Sarti, F M; Marchioni, D M

    2016-06-01

    This systematic review aimed to synthesise information on indices developed to evaluate nutritional quality of meals. A strategy for systematic search of the literature was developed using keywords related to assessment of meal quality. Databases searched included ScienceDirect, PubMed, Lilacs, SciELO, Scopus, Cochrane, Embase and Google Scholar. The literature search resulted in seven different meal quality indices. Each article was analysed in order to identify the following items: authors, country, year, study design, population characteristics, type of meal evaluated, dietary assessment method, characteristics evaluated (nutrients or food items), score range, index components, nutritional references, correlations performed, validation and relationship with an outcome (if existing). Two studies developed instruments to assess the quality of breakfast, three analysed lunch, one evaluated dinner and one was applied to all types of meals and snacks. All meal quality indices reviewed were based on the evaluation of presence or absence of food groups and relative contributions of nutrients, according to food-based guidelines or nutrient references, adapting the daily dietary recommendations to one specific meal. Most of the indices included three items as components for meal quality assessment: (I) total fat or some specific type of fat, (II) fruits and vegetables and (III) cereals or whole grains. This systematic review indicates aspects that need further research, particularly the numerous approaches to assessing meals considering different foods and nutrients, and the need for validation studies of meal indices.

  7. Assessing coach motivation: the development of the Coach Motivation Questionnaire (CMQ).

    PubMed

    McLean, Kristy N; Mallett, Clifford J; Newcombe, Peter

    2012-04-01

    The aim of this research was to develop and assess the psychometric properties of the Coach Motivation Questionnaire (CMQ). Study 1 focused on the compilation and pilot testing of potential questionnaire items. Consistent with self-determination theory, items were devised to tap into six forms of motivation: amotivation, external regulation, introjected regulation, identified regulation, integrated regulation, and intrinsic motivation. The purpose of the second study (N = 556) was to empirically examine the psychometric properties of the CMQ. Items were subjected to confirmatory factor analyses to determine the fit of the a priori model. In addition, the validity of the questionnaire was assessed through links with the theoretically related concepts of intrinsic need satisfaction, well-being, and goal orientation. Together with test-retest reliability (Study 3), these results showed preliminary support for the psychometric properties of the CMQ. Finally, using an independent sample (N = 254), the fourth study confirmed the factor structure and supports the use of the CMQ in future coaching research.

  8. Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

    PubMed

    Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

    2018-03-01

    The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.

  9. The development and exploratory analysis of the Back Pain Attitudes Questionnaire (Back-PAQ)

    PubMed Central

    Darlow, Ben; Perry, Meredith; Mathieson, Fiona; Stanley, James; Melloh, Markus; Marsh, Reginald; Baxter, G David; Dowell, Anthony

    2014-01-01

    Objectives To develop an instrument to assess attitudes and underlying beliefs about back pain, and subsequently investigate its internal consistency and underlying structures. Design The instrument was developed by a multidisciplinary team of clinicians and researchers based on analysis of qualitative interviews with people experiencing acute and chronic back pain. Exploratory analysis was conducted using data from a population-based cross-sectional survey. Setting Qualitative interviews with community-based participants and subsequent postal survey. Participants Instrument development informed by interviews with 12 participants with acute back pain and 11 participants with chronic back pain. Data for exploratory analysis collected from New Zealand residents and citizens aged 18 years and above. 1000 participants were randomly selected from the New Zealand Electoral Roll. 602 valid responses were received. Measures The 34-item Back Pain Attitudes Questionnaire (Back-PAQ) was developed. Internal consistency was evaluated by the Cronbach α coefficient. Exploratory analysis investigated the structure of the data using Principal Component Analysis. Results The 34-item long form of the scale had acceptable internal consistency (α=0.70; 95% CI 0.66 to 0.73). Exploratory analysis identified five two-item principal components which accounted for 74% of the variance in the reduced data set: ‘vulnerability of the back’; ‘relationship between back pain and injury’; ‘activity participation while experiencing back pain’; ‘prognosis of back pain’ and ‘psychological influences on recovery’. Internal consistency was acceptable for the reduced 10-item scale (α=0.61; 95% CI 0.56 to 0.66) and the identified components (α between 0.50 and 0.78). Conclusions The 34-item long form of the scale may be appropriate for use in future cross-sectional studies. The 10-item short form may be appropriate for use as a screening tool, or an outcome assessment instrument. 
Further testing of the 10-item Back-PAQ's construct validity, reliability, responsiveness to change and predictive ability needs to be conducted. PMID:24860003

  10. Measuring ability to assess claims about treatment effects: the development of the 'Claim Evaluation Tools'.

    PubMed

    Austvoll-Dahlgren, Astrid; Semakula, Daniel; Nsangi, Allen; Oxman, Andrew David; Chalmers, Iain; Rosenbaum, Sarah; Guttersrud, Øystein

    2017-05-17

    To describe the development of the Claim Evaluation Tools, a set of flexible items to measure people's ability to assess claims about treatment effects. Methodologists and members of the community (including children) in Uganda, Rwanda, Kenya, Norway, the UK and Australia. In the iterative development of the items, we used purposeful sampling of people with training in research methodology, such as teachers of evidence-based medicine, as well as patients and members of the public from low-income and high-income countries. Development consisted of 4 processes: (1) determining the scope of the Claim Evaluation Tools and development of items; (2) expert item review and feedback (n=63); (3) cognitive interviews with children and adult end-users (n=109); and (4) piloting and administrative tests (n=956). The Claim Evaluation Tools database currently includes a battery of multiple-choice items. Each item begins with a scenario which is intended to be relevant across contexts, and which can be used for children (from age 10 and above), adult members of the public and health professionals. People with expertise in research methods judged the items to have face validity, and end-users judged them relevant and acceptable in their settings. In response to feedback from methodologists and end-users, we simplified some text, explained terms where needed, and redesigned formats and instructions. The Claim Evaluation Tools database is a flexible resource from which researchers, teachers and others can design measurement instruments to meet their own requirements. These evaluation tools are being managed and made freely available for non-commercial use (on request) through Testing Treatments interactive (testingtreatments.org). PACTR201606001679337 and PACTR201606001676150; Pre-results.

  11. The Afghan symptom checklist: a culturally grounded approach to mental health assessment in a conflict zone.

    PubMed

    Miller, Kenneth E; Omidian, Patricia; Quraishy, Abdul Samad; Quraishy, Naseema; Nasiry, Mohammed Nader; Nasiry, Seema; Karyar, Nazar Mohammed; Yaqubi, Abdul Aziz

    2006-10-01

    This article describes a methodology for developing culturally grounded assessment measures in conflict and postconflict situations. A mixed-method design was used in Kabul, Afghanistan, to identify local indicators of distress and develop the 22-item Afghan Symptom Checklist (ASCL). The ASCL contains several indigenous items and items familiar to Western mental health professionals. The ASCL was pilot tested and subsequently administered to 324 adults in 8 districts of Kabul. It demonstrated excellent reliability (alpha=.93) and good construct validity, correlating strongly with a measure of exposure to war-related violence and loss (r=.70). Results of the survey indicate moderate levels of distress among Afghan men and markedly higher levels of distress and impaired functioning among women (and widows in particular).

  12. Language development in rural and urban Russian-speaking children with and without developmental language disorder

    PubMed Central

    Kornilov, Sergey A.; Lebedeva, Tatiana V.; Zhukova, Marina A.; Prikhoda, Natalia A.; Korotaeva, Irina V.; Koposov, Roman A.; Hart, Lesley; Reich, Jodi; Grigorenko, Elena L.

    2015-01-01

    Using a newly developed Assessment of the Development of Russian Language (ORRIA), we investigated differences in language development between rural vs. urban Russian-speaking children (n = 100 with a mean age of 6.75) subdivided into groups with and without developmental language disorders. Using classical test theory and item response theory approaches, we found that while ORRIA displayed overall satisfactory psychometric properties, several of its items showed differential item functioning favoring rural children, and several others favoring urban children. After the removal of these items, rural children significantly underperformed on ORRIA compared to urban children. The urbanization factor did not significantly interact with language group. We discuss the latter finding in the context of the multiple additive risk factors for language development and emphasize the need for future studies of the mechanisms that underlie these influences and the implications of these findings for our understanding of the etiological architecture of children's language development. PMID:27346924

  13. Validation of a scale for assessing attitudes towards outcomes of genetic cancer testing among primary care providers and breast specialists

    PubMed Central

    N’Diaye, Khadim; Evans, D. Gareth; Harris, Hilary; Tibben, Aad; van Asperen, Christi; Schmidtke, Joerg; Nippert, Irmgard; Mancini, Julien; Julian-Reynier, Claire

    2017-01-01

    Objective To develop a generic scale for assessing attitudes towards genetic testing and to psychometrically assess these attitudes in the context of BRCA1/2 among a sample of French general practitioners, breast specialists and gyneco-obstetricians. Study design and setting Nested within the questionnaire developed for the European InCRisC (International Cancer Risk Communication Study) project were 14 items assessing expected benefits (8 items) and drawbacks (6 items) of the process of breast/ovarian genetic cancer testing (BRCA1/2). Another item assessed agreement with the statement that, overall, the expected health benefits of BRCA1/2 testing exceeded its drawbacks, thereby justifying its prescription. The questionnaire was mailed to a sample of 1,852 French doctors. Of these, 182 breast specialists, 275 general practitioners and 294 gyneco-obstetricians completed and returned the questionnaire to the research team. Principal Component Analysis, Cronbach’s α coefficient, and Pearson’s correlation coefficients were used in the statistical analyses of collected data. Results Three dimensions emerged from the respondents’ responses, and were classified under the headings: “Anxiety, Conflict and Discrimination”, “Risk Information”, and “Prevention and Surveillance”. Cronbach’s α coefficient for the 3 dimensions was 0.79, 0.76 and 0.62, respectively, and each dimension exhibited strong correlation with the overall indicator of agreement (criterion validity). Conclusions The validation process of the 15 items regarding BRCA1/2 testing revealed satisfactory psychometric properties for the creation of a new scale entitled the Attitudes Towards Genetic Testing for BRCA1/2 (ATGT-BRCA1/2) Scale. Further testing is required to confirm the validity of this tool which could be used generically in other genetic contexts. PMID:28570656

  14. Development and Evaluation of a Questionnaire to Assess Physical Educators' Knowledge of Student Assessment

    ERIC Educational Resources Information Center

    Emmanouilidou, Kyriaki; Derri, Vassiliki; Aggelousis, Nicolaos; Vassiliadou, Olga

    2012-01-01

    The purpose of this pilot study was to develop and evaluate an instrument for measuring Greek elementary physical educators' knowledge of student assessment. A multiple-choice questionnaire comprised of items about concepts, methods, tools, and types of student assessment in physical education was designed and tested. The initial 35-item…

  15. Development and initial evaluation of the SCI-FI/AT

    PubMed Central

    Jette, Alan M.; Slavin, Mary D.; Ni, Pengsheng; Kisala, Pamela A.; Tulsky, David S.; Heinemann, Allen W.; Charlifue, Susie; Tate, Denise G.; Fyffe, Denise; Morse, Leslie; Marino, Ralph; Smith, Ian; Williams, Steve

    2015-01-01

    Objectives To describe the domain structure and calibration of the Spinal Cord Injury Functional Index for samples using Assistive Technology (SCI-FI/AT) and report the initial psychometric properties of each domain. Design Cross sectional survey followed by computerized adaptive test (CAT) simulations. Setting Inpatient and community settings. Participants A sample of 460 adults with traumatic spinal cord injury (SCI) stratified by level of injury, completeness of injury, and time since injury. Interventions None. Main outcome measure SCI-FI/AT. Results Confirmatory factor analysis (CFA) and item response theory (IRT) analyses identified 4 unidimensional SCI-FI/AT domains: Basic Mobility (41 items), Self-care (71 items), Fine Motor Function (35 items), and Ambulation (29 items). High correlations of full item banks with 10-item simulated CATs indicated high accuracy of each CAT in estimating a person's function, and there was high measurement reliability for the simulated CAT scales compared with the full item bank. SCI-FI/AT items in the domains of Self-care, Fine Motor Function, and Ambulation were less difficult than the same items in the original SCI-FI item banks. Conclusion With the development of the SCI-FI/AT, clinicians and investigators have available multidimensional assessment scales that evaluate function for users of AT to complement the scales available in the original SCI-FI. PMID:26010975

  16. Development and initial evaluation of the SCI-FI/AT.

    PubMed

    Jette, Alan M; Slavin, Mary D; Ni, Pengsheng; Kisala, Pamela A; Tulsky, David S; Heinemann, Allen W; Charlifue, Susie; Tate, Denise G; Fyffe, Denise; Morse, Leslie; Marino, Ralph; Smith, Ian; Williams, Steve

    2015-05-01

    To describe the domain structure and calibration of the Spinal Cord Injury Functional Index for samples using Assistive Technology (SCI-FI/AT) and report the initial psychometric properties of each domain. Cross sectional survey followed by computerized adaptive test (CAT) simulations. Inpatient and community settings. A sample of 460 adults with traumatic spinal cord injury (SCI) stratified by level of injury, completeness of injury, and time since injury. Interventions: none. Main outcome measure: SCI-FI/AT. Confirmatory factor analysis (CFA) and item response theory (IRT) analyses identified 4 unidimensional SCI-FI/AT domains: Basic Mobility (41 items), Self-care (71 items), Fine Motor Function (35 items), and Ambulation (29 items). High correlations of full item banks with 10-item simulated CATs indicated high accuracy of each CAT in estimating a person's function, and there was high measurement reliability for the simulated CAT scales compared with the full item bank. SCI-FI/AT items in the domains of Self-care, Fine Motor Function, and Ambulation were less difficult than the same items in the original SCI-FI item banks. With the development of the SCI-FI/AT, clinicians and investigators have available multidimensional assessment scales that evaluate function for users of AT to complement the scales available in the original SCI-FI.

  17. Development and validation of an energy-balance knowledge test for fourth- and fifth-grade students.

    PubMed

    Chen, Senlin; Zhu, Xihe; Kang, Minsoo

    2017-05-01

    A valid test measuring children's energy-balance (EB) knowledge is lacking in research. This study developed and validated the energy-balance knowledge test (EBKT) for fourth- and fifth-grade students. The original EBKT contained 25 items but was reduced to 23 items based on pilot results and intensive expert panel discussion. De-identified data were collected from 468 fourth- and fifth-grade students enrolled in four schools to examine the psychometric properties of the EBKT items. The Rasch model analysis was conducted using the Winsteps 3.65.0 software. Differential item functioning (DIF) analysis flagged 1 item (item #4) as functioning differently between boys and girls, and this item was deleted. The final 22-item EBKT showed desirable model-data fit indices. Item difficulties varied widely, ranging from -3.58 logits (item #10, the easiest) to 1.70 logits (item #3, the hardest). The average person ability on the test was 0.28 logits (SD = .78). Additional analyses supported known-group difference validity of the EBKT scores in capturing gender- and grade-based ability differences. The test was valid overall but could be further improved by expanding test items to discern various ability levels. For lack of a better test, researchers and practitioners may use the EBKT to assess fourth- and fifth-grade students' EB knowledge.
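
    To make the logit values in this abstract concrete, the following sketch evaluates the standard dichotomous Rasch model, in which the probability of a correct answer depends only on the difference between person ability and item difficulty. The function name is hypothetical, and treating the EBKT items as dichotomously scored is an assumption; only the three logit values come from the abstract.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Values reported in the abstract (logits).
mean_ability = 0.28
easiest_item = -3.58   # item #10
hardest_item = 1.70    # item #3

for label, difficulty in [("easiest", easiest_item), ("hardest", hardest_item)]:
    p = rasch_probability(mean_ability, difficulty)
    print(f"{label} item: P(correct) = {p:.3f}")
```

    Under these assumptions, a student of average ability would answer the easiest item correctly almost always and the hardest item only about one time in five, which illustrates the wide spread of item difficulties the authors describe.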

  18. Professor-Student Rapport Scale: Six Items Predict Student Outcomes

    ERIC Educational Resources Information Center

    Wilson, Janie H.; Ryan, Rebecca G.

    2013-01-01

    Rapport between students and teachers leads to numerous positive student outcomes, including attitudes toward the teacher and course, student motivation, and perceived learning. The recent development of a Professor-Student Rapport scale offers assessment of this construct. However, a Cronbach's [alpha] of 0.96 indicated item redundancy, and the…

  19. Endovascular Skills for Trauma and Resuscitative Surgery (ESTARS) Course: Curriculum Development, Content Validation, and Program Assessment

    DTIC Science & Technology

    2014-01-01

    fundamental endovascular training for trauma surgeons. METHODS: ESTARS 2-day course incorporated pretest/posttest examinations, precourse materials...and 17 multiple true/false items. The purpose of the test was primarily formative; the same items were used for pretesting and posttesting, and the... pretest served as a learning tool focusing learners on the content of importance. Mean scores were computed, treating each item as one point (multiple

  20. Development and Validation of the Scan of Postgraduate Educational Environment Domains (SPEED): A Brief Instrument to Assess the Educational Environment in Postgraduate Medical Education

    PubMed Central

    Schönrock-Adema, Johanna; Visscher, Maartje; Raat, A. N. Janet; Brand, Paul L. P.

    2015-01-01

    Introduction Current instruments to evaluate the postgraduate medical educational environment lack theoretical frameworks and are relatively long, which may reduce response rates. We aimed to develop and validate a brief instrument that, based on a solid theoretical framework for educational environments, solicits resident feedback to screen the postgraduate medical educational environment quality. Methods Stepwise, we developed a screening instrument, using existing instruments to assess educational environment quality and adopting a theoretical framework that defines three educational environment domains: content, atmosphere and organization. First, items from relevant existing instruments were collected and, after deleting duplicates and items not specifically addressing educational environment, grouped into the three domains. In a Delphi procedure, the item list was reduced to a set of items considered most important and comprehensively covering the three domains. These items were triangulated against the results of semi-structured interviews with 26 residents from three teaching hospitals to achieve face validity. This draft version of the Scan of Postgraduate Educational Environment Domains (SPEED) was administered to residents in a general and university hospital and further reduced and validated based on the data collected. Results Two hundred twenty-three residents completed the 43-item draft SPEED. We used half of the dataset for item reduction, and the other half for validating the resulting SPEED (15 items, 5 per domain). Internal consistencies were high. Correlations between domain scores in the draft and brief versions of SPEED were high (>0.85) and highly significant (p<0.001). Domain score variance of the draft instrument was explained for ≥80% by the items representing the domains in the final SPEED. Conclusions The SPEED comprehensively covers the three educational environment domains defined in the theoretical framework. 
Because of its validity and brevity, the SPEED is promising as a useful and easily applicable tool to regularly screen educational environment quality in postgraduate medical education. PMID:26413836

  1. Development of a measure of asthma-specific quality of life among adults.

    PubMed

    Eberhart, Nicole K; Sherbourne, Cathy D; Edelen, Maria Orlando; Stucky, Brian D; Sin, Nancy L; Lara, Marielena

    2014-04-01

    A key goal in asthma treatment is improvement in quality of life (QoL), but existing measures often confound QoL with symptoms and functional impairment. The current study addresses these limitations and the need for valid patient-reported outcome measures by using state-of-the-art methods to develop an item bank assessing QoL in adults with asthma. This article describes the process for developing an initial item pool for field testing. Five focus group interviews were conducted with a total of 50 asthmatic adults. We used "pile sorting/binning" and "winnowing" methods to identify key QoL dimensions and develop a pool of items based on statements made in the focus group interviews. We then conducted a literature review and consulted with an expert panel to ensure that no key concepts were omitted. Finally, we conducted individual cognitive interviews to ensure that items were well understood and inform final item refinement. Six hundred and sixty-one QoL statements were identified from focus group interview transcripts and subsequently used to generate a pool of 112 items in 16 different content areas. Items covering a broad range of content were developed that can serve as a valid gauge of individuals' perceptions of the effects of asthma and its treatment on their lives. These items do not directly measure symptoms or functional impairment, yet they include a broader range of content than most existent measures of asthma-specific QoL.

  2. Development of a Measure of Asthma-Specific Quality of Life among Adults

    PubMed Central

    Eberhart, Nicole K.; Sherbourne, Cathy D.; Edelen, Maria Orlando; Stucky, Brian D.; Sin, Nancy L.; Lara, Marielena

    2014-01-01

    Purpose A key goal in asthma treatment is improvement in quality of life (QoL), but existing measures often confound QoL with symptoms and functional impairment. The current study addresses these limitations and the need for valid patient-reported outcome measures by using state-of-the-art methods to develop an item bank assessing QoL in adults with asthma. This article describes the process for developing an initial item pool for field testing. Methods Five focus group interviews were conducted with a total of 50 asthmatic adults. We used “pile sorting/binning” and “winnowing” methods to identify key QoL dimensions and develop a pool of items based on statements made in the focus group interviews. We then conducted a literature review and consulted with an expert panel to ensure that no key concepts were omitted. Finally, we conducted individual cognitive interviews to ensure that items were well understood and inform final item refinement. Results 661 QoL statements were identified from focus group interview transcripts and subsequently used to generate a pool of 112 items in 16 different content areas. Conclusions Items covering a broad range of content were developed that can serve as a valid gauge of individuals’ perceptions of the effects of asthma and its treatment on their lives. These items do not directly measure symptoms or functional impairment, yet they include a broader range of content than most existent measures of asthma-specific QoL. PMID:24062237

  3. [Development and validation of a questionnaire on knowledge and personal hygiene habits in childhood (HICORIN®)].

    PubMed

    Moreno-Martínez, Francisco José; Ruzafa-Martínez, María; Ramos-Morcillo, Antonio Jesús; Gómez García, Carmen Isabel; Hernández-Susarte, Ana María

    2015-01-01

    To develop and validate a questionnaire for the integral assessment of habits and knowledge in personal hygiene in children aged 7 to 12 years in educational, social and health environments. Cross-sectional study for the validation of a questionnaire. One primary and secondary school and one children's home in the Region of Murcia, Spain. A total of 86 children were included (80 from a primary and secondary school; 6 from a children's home), as well as 7 experts. Content validation by experts; qualitative assessment; identification of difficulties related to some questions; item response analysis; and test-retest reliability. After the literature search, 20 tools that included items related to child body hygiene were obtained. The researchers selected 34 items and drafted 48 additional ones. After content validation by the experts, the questionnaire (HICORIN®) was reduced to 63 items, and consisted of 7 dimensions of child personal hygiene (skin, hair, hands, oral, feet, ears, and intimate hygiene). After the qualitative assessment with the children, some terms were adapted to improve understanding. Only two items had non-response rates that exceeded 10%. The test-retest showed that 84.1% of the items had between very good and moderate reliability. HICORIN® is a reliable and valid instrument that integrally assesses the habits and knowledge in personal hygiene in children aged 7 to 12 years. It is applicable in educational, social and health environments and in children from different socioeconomic levels.

  4. Test item linguistic complexity and assessments for deaf students.

    PubMed

    Cawthon, Stephanie

    2011-01-01

    Linguistic complexity of test items is one test format element that has been studied in the context of struggling readers and their participation in paper-and-pencil tests. The present article presents findings from an exploratory study on the potential relationship between linguistic complexity and test performance for deaf readers. A total of 64 students completed 52 multiple-choice items, 32 in mathematics and 20 in reading. These items were coded for linguistic complexity components of vocabulary, syntax, and discourse. Mathematics items had higher linguistic complexity ratings than reading items, but there were no significant relationships between item linguistic complexity scores and student performance on the test items. The discussion addresses issues related to the subject area, student proficiency levels in the test content, factors to look for in determining a "linguistic complexity effect," and areas for further research in test item development and deaf students.

  5. The development and validation of the major life changing decision profile (MLCDP)

    PubMed Central

    2013-01-01

    Background Chronic diseases may influence patients taking major life changing decisions (MLCDs) concerning, for example, education, career, relationships, having children and retirement. A validated measure is needed to evaluate the impact of chronic diseases on MLCDs, improving assessment of their life-long burden. The aims of this study were to develop a validated questionnaire, the “Major Life Changing Decision Profile” (MLCDP), and to evaluate its psychometric properties. Methods 50 interviews with dermatology patients and 258 questionnaires, completed by cardiology, rheumatology, nephrology, diabetes and respiratory disorder patients, were analysed for qualitative data using NVivo 8 software. Content validation was carried out by a panel of experts. The first version of the MLCDP was completed by 210 patients and an iterative process of multiple Exploratory Factor Analyses and item prevalence was used to guide item reduction. Face validity and practicability were assessed by patients. Results 48 MLCDs were selected from analysis of the transcripts and questionnaires for the first version of the MLCDP, and reduced to 45 by combination of similar themes. There was a high intraclass correlation coefficient (0.7) between the 13 members of the content validation panel. Four more items were deleted, leaving a 41-item MLCDP that was completed by 210 patients. The most frequently recorded MLCDs were decisions to change eating habits (71.4%), to change smoking/drinking alcohol habits (58.5%) and not to travel or go for holidays abroad (50.9%). Factor analysis suggested item number reduction from 41 to 34, to 29, then 23 items. However, after taking into account item prevalence data as well as factor analysis results, 32 items were retained. The 32-item MLCDP has five domains: education (3 items), job/career (9), family/relationships (5), social (10) and physical (5). The MLCDP score is expressed as the absolute number of decisions that have been affected.
Conclusions The 32-item (5 domains) MLCDP has been developed as an easy to complete generic tool for use in clinical practice and for quality of life and epidemiological research. Further validation is required. PMID:23656829

  6. A novel method for expediting the development of patient-reported outcome measures and an evaluation across several populations

    PubMed Central

    Garrard, Lili; Price, Larry R.; Bott, Marjorie J.; Gajewski, Byron J.

    2016-01-01

    Item response theory (IRT) models provide an appropriate alternative to the classical ordinal confirmatory factor analysis (CFA) during the development of patient-reported outcome measures (PROMs). Current literature has identified the assessment of IRT model fit as both challenging and underdeveloped (Sinharay & Johnson, 2003; Sinharay, Johnson, & Stern, 2006). This study evaluates the performance of Ordinal Bayesian Instrument Development (OBID), a Bayesian IRT model with a probit link function approach, through applications in two breast cancer-related instrument development studies. The primary focus is to investigate an appropriate method for comparing Bayesian IRT models in PROMs development. An exact Bayesian leave-one-out cross-validation (LOO-CV) approach (Vehtari & Lampinen, 2002) is implemented to assess prior selection for the item discrimination parameter in the IRT model and subject content experts’ bias (in a statistical sense and not to be confused with psychometric bias as in differential item functioning) toward the estimation of item-to-domain correlations. Results support the utilization of content subject experts’ information in establishing evidence for construct validity when sample size is small. However, the incorporation of subject experts’ content information in the OBID approach can be sensitive to the level of expertise of the recruited experts. More stringent efforts need to be invested in the appropriate selection of subject experts to efficiently use the OBID approach and reduce potential bias during PROMs development. PMID:27667878

  7. A novel method for expediting the development of patient-reported outcome measures and an evaluation across several populations.

    PubMed

    Garrard, Lili; Price, Larry R; Bott, Marjorie J; Gajewski, Byron J

    2016-10-01

    Item response theory (IRT) models provide an appropriate alternative to the classical ordinal confirmatory factor analysis (CFA) during the development of patient-reported outcome measures (PROMs). Current literature has identified the assessment of IRT model fit as both challenging and underdeveloped (Sinharay & Johnson, 2003; Sinharay, Johnson, & Stern, 2006). This study evaluates the performance of Ordinal Bayesian Instrument Development (OBID), a Bayesian IRT model with a probit link function approach, through applications in two breast cancer-related instrument development studies. The primary focus is to investigate an appropriate method for comparing Bayesian IRT models in PROMs development. An exact Bayesian leave-one-out cross-validation (LOO-CV) approach (Vehtari & Lampinen, 2002) is implemented to assess prior selection for the item discrimination parameter in the IRT model and subject content experts' bias (in a statistical sense and not to be confused with psychometric bias as in differential item functioning) toward the estimation of item-to-domain correlations. Results support the utilization of content subject experts' information in establishing evidence for construct validity when sample size is small. However, the incorporation of subject experts' content information in the OBID approach can be sensitive to the level of expertise of the recruited experts. More stringent efforts need to be invested in the appropriate selection of subject experts to efficiently use the OBID approach and reduce potential bias during PROMs development.

  8. Development and testing of the KERNset: an instrument to assess the quality of telephone triage in out-of-hours primary care services.

    PubMed

    Smits, Marleen; Keizer, Ellen; Ram, Paul; Giesen, Paul

    2017-12-02

    Telephone triage is a core but vulnerable part of the care process at out-of-hours general practitioner (GP) cooperatives. In the Netherlands, different instruments have been used for assessing the quality of telephone triage. These instruments focussed mainly on communicational aspects, and less on the medical quality of triage decisions. Our aim was to develop and test a minimum set of items to assess the quality of telephone triage. A national survey among all GP cooperatives in the Netherlands was performed to examine the most important aspects of telephone triage. Next, corresponding items from existing instruments were searched on these topics. Subsequently, an expert panel judged these items on importance, completeness and formulation. The concept KERNset consisted of 24 items about the telephone conversation: 13 medical, ten communicational and one regarding both types. It was pilot tested on measurement characteristics, reliability, validity and variation between triagists. In this pilot study, 114 anonymous calls from four GP cooperatives spread across the Netherlands were judged by three out of eight raters, both internal and external raters. Cronbach's alpha was .94 for the medical items and .75 for the communicational items. Inter-rater reliability: complete agreement between the external raters was 45% and reasonable agreement 73% (difference of maximally one point on the five-point scale). Intra-rater reliability: complete agreement within raters was 55% and reasonable agreement 84%. There were hardly any differences between internal and external raters, but there were differences in strictness between individual raters. The construct validity was confirmed by the high correlation between the general impression of the call and the items of the KERNset. Of the differences within items 19% could be explained by differences between triage nurses, which means the KERNset is able to demonstrate differences between triage nurses. 
The KERNset can be used to assess the quality of telephone triage. The validity is good and differences between calls and between triage nurses can be measured. A more intensive training for raters could improve the reliability.
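
    The inter-rater figures above distinguish complete agreement from reasonable agreement (a difference of at most one point on the five-point scale). A minimal sketch of how these two proportions can be computed for a pair of raters follows; the function name and the example ratings are hypothetical and do not reproduce the KERNset data.

```python
def agreement_rates(ratings_a, ratings_b, tolerance=1):
    """Return (exact, within_tolerance) agreement proportions for two raters.

    ratings_a and ratings_b are equal-length sequences of scores given by
    two raters to the same calls or items.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    diffs = [abs(a - b) for a, b in zip(ratings_a, ratings_b)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    within = sum(d <= tolerance for d in diffs) / len(diffs)
    return exact, within

# Hypothetical five-point ratings from two raters for five items.
rater_1 = [1, 2, 3, 4, 5]
rater_2 = [1, 3, 5, 4, 2]
exact, within_one = agreement_rates(rater_1, rater_2)
print(f"complete agreement: {exact:.0%}, within one point: {within_one:.0%}")
```

    Raw agreement of this kind does not correct for agreement expected by chance; a chance-corrected statistic would be a different technique and is not shown here.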

  9. ABILOCO-Kids: a Rasch-built 10-item questionnaire for assessing locomotion ability in children with cerebral palsy.

    PubMed

    Caty, Gilles D; Gilles, Caty D; Arnould, Carlyne; Thonnard, Jean-Louis; Lejeune, Thierry M

    2008-11-01

    To develop a questionnaire (ABILOCO-Kids) based on the Rasch measurement model that assesses locomotion ability in children with cerebral palsy. Prospective study and questionnaire development. A total of 113 children with cerebral palsy (10 (standard deviation 2.5) years old). A 41-item questionnaire was developed based on existing scales and on the clinical experience of professionals in the field of rehabilitation. This questionnaire was tested separately on the 113 children with cerebral palsy and their parents. Their responses were analysed using the Rasch model (RUMM-2020) to select items that had an ordered rating scale and that fit a unidimensional model. The final ABILOCO-Kids scale consisted of 10 locomotion activities, of which difficulty was rated by the parents. The parents gave a more precise assessment of their children's ability than the children themselves, leading to a wider range of measurement that was well-targeted on the sample population and that had good reliability (r=0.97) and reproducibility (intraclass correlation coefficient=0.96). Item calibration did not vary with age, sex or clinical presentation (hemiplegia, diplegia, quadriplegia). The concurrent validity of the ABILOCO-Kids questionnaire was also shown by its correlation with the Gross Motor Function Classification System. The ABILOCO-Kids questionnaire has good psychometric qualities for measuring a wide range of locomotion abilities in children with cerebral palsy.

  10. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    PubMed Central

    Feinberg, Richard A.; Clauser, Amanda L.

    2016-01-01

    Background: In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. Objective: The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Methods: Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Results: Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Conclusions: Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation. PMID:27777664

  11. Development of a food frequency questionnaire for Sri Lankan adults

    PubMed Central

    2012-01-01

    Background Food Frequency Questionnaires (FFQs) are commonly used in epidemiologic studies to assess long-term nutritional exposure. Because of wide variations in dietary habits in different countries, a FFQ must be developed to suit the specific population. Sri Lanka is undergoing nutritional transition and diet-related chronic diseases are emerging as an important health problem. Currently, no FFQ has been developed for Sri Lankan adults. In this study, we developed a FFQ to assess the regular dietary intake of Sri Lankan adults. Methods A nationally representative sample of 600 adults was selected by a multi-stage random cluster sampling technique and dietary intake was assessed by random 24-h dietary recall. Nutrient analysis of the FFQ required the selection of foods, development of recipes and application of these to cooked foods to develop a nutrient database. We constructed a comprehensive food list with the units of measurement. A stepwise regression method was used to identify foods contributing to a cumulative 90% of variance to total energy and macronutrients. In addition, a series of photographs were included. Results We obtained dietary data from 482 participants and 312 different food items were recorded. Nutritionists grouped similar food items which resulted in a total of 178 items. After performing step-wise multiple regression, 93 foods explained 90% of the variance for total energy intake, carbohydrates, protein, total fat and dietary fibre. Finally, 90 food items and 12 photographs were selected. Conclusion We developed a FFQ and the related nutrient composition database for Sri Lankan adults. Culturally specific dietary tools are central to capturing the role of diet in risk for chronic disease in Sri Lanka. The next step will involve the verification of FFQ reproducibility and validity. PMID:22937734

  12. Reliability and validity of a scale for health-promoting schools.

    PubMed

    Lee, Eun Young; Shin, Young-Jeon; Choi, Bo Youl; Cho, Ho Soon Michelle

    2014-12-01

    Despite a growing body of research regarding the health-promoting schools (HPS) concept from the World Health Organization (WHO), research on measuring of the HPS is limited. This study aims to develop a scale for assessing the status of the HPS based on the WHO guidelines and to evaluate the reliability and validity of the scale. After completing the translation and back-translation process, the content validity of the 50-item scale for HPS (SHPS) was assessed by an expert committee review and pretested with 17 teachers. A stratified, random sampling design was used. A total of 728 teachers from 94 schools completed a self-administered questionnaire. The total sample was randomly divided into three groups for exploratory factor analysis (EFA), confirmatory factor analysis (CFA) and cross-validation. The EFA suggested seven factors, including 37 items, and the CFA confirmed these factors. In a second-order factor analysis, the second-order seven-factor model had acceptable fit indices (root mean square error of approximation 0.07, comparative fit index 0.98) with stability over validation sample and whole sample. Thus, the first-order seven factors (school nutrition services [three-item, α = 0.87], healthy school policies [six-item, α = 0.87], school's physical environment [10-item, α = 0.91], school's social environment [four-item, α = 0.88], community links [six-item, α = 0.91], individual health skills and action competencies [three-item, α = 0.89], and health services [five-item, α = 0.86]) loaded significantly onto the second-order factor (HPS [37-item, α = 0.97]). In conclusion, the SHPS is a reliable and valid measurement tool for assessing the states of the HPS in the Korean school context. It will be useful for comprehensively assessing schools' needs and monitoring the progress of school health interventions. © The Author (2013). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  13. Promoting the Quality of Health Research-based News: Introduction of a Tool

    PubMed Central

    Ashoorkhani, Mahnaz; Majdzadeh, Reza; Nedjat, Saharnaz; Gholami, Jaleh

    2017-01-01

    Introduction: While disseminating health research findings to the public, it is very important to present appropriate and accurate information to give the target audience a correct understanding of the subject matter. The objective of this study was to design and psychometrically evaluate a checklist for health journalists to help them prepare news of appropriate accuracy and authenticity. Methods: The study consisted of two phases, checklist design and psychometrics. Literature review and expert opinion were used to extract the items of the checklist in the first phase. In the second phase, to assess content and face validity, the judgment of 38 persons (epidemiologists with a tool production history, editors-in-chief, and health journalists) was used to check the items’ understandability, nonambiguity, relevancy, and clarity. Reliability was assessed by the test–retest method using intra-cluster correlation (ICC) indices in the two phases. Cronbach's alpha was used to assess internal validity of the checklist. Results: Based on the participants’ opinions, the items were reduced from 20 to 14 in number. The items were categorized into the following three domains: (a) items assessing the source of news and its validity, (b) items addressing the presentation of complete and accurate information on research findings, and (c) items which if adhered to lead to the target audiences’ better understanding. The checklist was approved for content and face validity. The reliability of the checklist was assessed in the last stage; the ICC was 1 for 12 items and above 0.8 for the other two. Internal consistency (Cronbach's alpha) was 0.98. Discussion and Conclusions: The resultant indices of the study indicate that the checklist has appropriate validity and reliability. Hence, it can be used by health journalists to develop health research-based news. PMID:29184638

  14. Development and evaluation of RAMP I - a practitioner's tool for screening of musculoskeletal disorder risk factors in manual handling.

    PubMed

    Lind, Carl Mikael; Forsman, Mikael; Rose, Linda Maria

    2017-10-16

    RAMP I is a screening tool developed to support practitioners in screening for work-related musculoskeletal disorder risk factors related to manual handling. RAMP I, which is part of the RAMP tool, is based on research-based studies combined with expert group judgments. More than 80 practitioners participated in the development of RAMP I. The tool consists of dichotomous assessment items grouped into seven categories. Acceptable reliability was found for a majority of the assessment items for 15 practitioners who were given 1 h of training. The usability evaluation points to RAMP I being usable for screening for musculoskeletal disorder risk factors, i.e., usable for assessing risks, being usable as a decision base, having clear results and that the time needed for an assessment is acceptable. It is concluded that RAMP I is a usable tool for practitioners.

  15. Bringing the real world to Psychometric Evaluation of Cervical Cancer Literacy Assessments with Black, Latina, and Arab Women in Real World Settings

    PubMed Central

    Williams, Karen Patricia; Templin, Thomas N.

    2013-01-01

    Objective This research describes the development and evaluation of a new scale for assessing functional cervical cancer health literacy, the Cervical Cancer Literacy Assessment Tool (C-CLAT). Methods In Phase 1, 35 items in English, Spanish and Arabic were generated for the C-CLAT, taking into account three content domains: Awareness, Knowledge, and Prevention/Control. After content validation, 24 items were retained for psychometric evaluation. In Phase 2, the 24-item C-CLAT was evaluated in three racial/ethnic populations of urban women (N = 543). Psychometric methods included item analysis, multifactor Item Response Theory modeling, and concurrent correlations. Results The final C-CLAT consisted of 16 items, with an internal consistency reliability of .72. C-CLAT reliabilities in Black, Latina, and Arab women were .73, .76, and .60, respectively. The rank order correlations of item difficulties across racial/ethnic groups were high (r’s = .97 to .98). The C-CLAT was positively related to educational level, and Arab women scored significantly higher than the Black and Latina participants. Conclusions This study presents a psychometrically sound instrument that measures health literacy related to cervical cancer. Practice Implications The C-CLAT is a tool that can be orally administered by a lay person and used in a community-based health promotion intervention. PMID:24072456

  16. The Therapeutic Environment Screening Survey for Nursing Homes (TESS-NH): an observational instrument for assessing the physical environment of institutional settings for persons with dementia.

    PubMed

    Sloane, Philip D; Mitchell, C Madeline; Weisman, Gerald; Zimmerman, Sheryl; Foley, Kristie M Long; Lynn, Mary; Calkins, Margaret; Lawton, M Powell; Teresi, Jeanne; Grant, Leslie; Lindeman, David; Montgomery, Rhonda

    2002-03-01

    To develop an observational instrument that describes the ability of physical environments of institutional settings to address therapeutic goals for persons with dementia. A National Institute on Aging workgroup identified and subsequently revised items that evaluated exit control, maintenance, cleanliness, safety, orientation/cueing, privacy, unit autonomy, outdoor access, lighting, noise, visual/tactile stimulation, space/seating, and familiarity/homelikeness. The final instrument contains 84 discrete items and one global rating. A summary scale, the Special Care Unit Environmental Quality Scale (SCUEQS), consists of 18 items. Lighting items were validated using portable light meters. Concurrent criterion validation compared SCUEQS scores with the Professional Environmental Assessment Protocol (PEAP). Interrater kappa statistics for 74% of items were above .60. For another 10% of items, kappas could not be calculated due to empty cells, but interrater agreement was above 80%. The SCUEQS demonstrated an interrater reliability of .93, a test-retest reliability of .88, and an internal consistency of .81-.83. Light meter ratings correlated significantly with the Therapeutic Environment Screening Survey for Nursing Homes (TESS-NH) lighting items (r = .29-.38, p = .01-.04), and the SCUEQS correlated significantly with global PEAP ratings (r = .52, p < .01). The TESS-NH efficiently assesses discrete elements of the physical environment and has strong reliability and validity. The SCUEQS provides a quantitative measure of environmental quality in institutional settings.

  17. Preliminary development of an ultrabrief two-item bedside test for delirium.

    PubMed

    Fick, Donna M; Inouye, Sharon K; Guess, Jamey; Ngo, Long H; Jones, Richard N; Saczynski, Jane S; Marcantonio, Edward R

    2015-10-01

    Delirium is common, morbid, and costly, yet is greatly under-recognized among hospitalized older adults. To identify the best single and pair of mental status test items that predict the presence of delirium. Diagnostic test evaluation study that enrolled medicine inpatients aged 75 years or older at an academic medical center. Patients underwent a clinical reference standard assessment involving a patient interview, medical record review, and interviews with family members and nurses to determine the presence or absence of Diagnostic and Statistical Manual of Mental Disorders, 4th Edition defined delirium. Participants also underwent the three-dimensional Confusion Assessment Method (3D-CAM), a brief, validated assessment for delirium. Individual items and pairs of items from the 3D-CAM were evaluated to determine sensitivity and specificity relative to the reference standard delirium diagnosis. Of the 201 participants (mean age 84 years, 62% female), 42 (21%) had delirium based on the clinical reference standard. The single item with the best test characteristics was "months of the year backwards" with a sensitivity of 83% (95% confidence interval [CI]: 69%-93%) and specificity of 69% (95% CI: 61%-76%). The best 2-item screen was the combination of "months of the year backwards" and "what is the day of the week?" with a sensitivity of 93% (95% CI: 81%-99%) and specificity of 64% (95% CI: 56%-70%). We identified a single item with >80% and pair of items with >90% sensitivity for delirium. If validated prospectively, these items will serve as an initial innovative screening step for delirium identification in hospitalized older adults. © 2015 Society of Hospital Medicine.
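
    The sensitivity and specificity figures in this abstract compare a screen result against a reference-standard diagnosis. A minimal sketch of that calculation follows. The abstract does not state how the two items were combined into one screen; treating the screen as positive when either item is failed is an assumption here, and the function names `sensitivity_specificity` and `either_item_failed` are hypothetical.

```python
def sensitivity_specificity(screen_positive, has_condition):
    """Return (sensitivity, specificity) of a screen against a reference.

    screen_positive: booleans, True if the screen flagged the participant.
    has_condition:   booleans, True if the reference standard was positive.
    """
    tp = fn = tn = fp = 0
    for flagged, condition in zip(screen_positive, has_condition):
        if condition and flagged:
            tp += 1
        elif condition:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def either_item_failed(item_a_failed, item_b_failed):
    """Combine two items: positive if either one is failed (assumed rule)."""
    return [a or b for a, b in zip(item_a_failed, item_b_failed)]
```

    Under this assumed rule, adding a second item can only flag more participants, so sensitivity cannot fall and specificity cannot rise. That matches the direction of the trade-off reported above (sensitivity 83% to 93%, specificity 69% to 64%).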

  18. Development and psychometric evaluation of the nursing instructors' clinical teaching performance inventory.

    PubMed

    A Farahani, Mansoureh; Emamzadeh Ghasemi, Hormat Sadat; Nikpaima, Nasrin; Fereidooni, Zhila; Rasoli, Maryam

    2014-10-29

    Evaluation of nursing instructors' clinical teaching performance is a prerequisite to the quality assurance of nursing education. One of the most common procedures for this purpose is using student evaluations. This study aimed to develop the Nursing Instructors' Clinical Teaching Performance Inventory (NICTPI) and evaluate its psychometric properties. The primary items of the inventory were generated by reviewing the published literature and the existing questionnaires as well as consulting with the members of the Faculties Evaluation Committee of the study setting. Psychometric properties were assessed by calculating the inventory's content validity ratio and index and its test-retest correlation coefficient, as well as by conducting an exploratory factor analysis and an internal consistency assessment. The content validity ratios and indices of the items were higher than 0.85 and 0.79, respectively. The final version of the inventory consisted of 25 items, and in the exploratory factor analysis, items loaded on three factors which jointly accounted for 72.85% of the total variance. The test-retest correlation coefficient and the Cronbach's alpha of the inventory were 0.93 and 0.973, respectively. The results revealed that the developed inventory is an appropriate, valid, and reliable instrument for evaluating nursing instructors' clinical teaching performance.
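
    Cronbach's alpha, the internal consistency statistic reported in this and several neighbouring records, can be computed from a respondents-by-items score table. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), with sample variances. The function name `cronbach_alpha` and the input layout are assumptions for illustration; none of the abstracts describe their software or data format.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one per respondent, each row holding that
            respondent's score on every item (hypothetical layout).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent needs a score on every item")
    # Sample variance of each item across respondents.
    item_variances = [variance(row[i] for row in scores) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance(sum(row) for row in scores)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

    For example, `cronbach_alpha([[1, 2], [2, 1], [3, 4], [4, 3]])` gives 0.75 for this made-up two-item table.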

  19. Validation of a General and Sport Nutrition Knowledge Questionnaire in Adolescents and Young Adults: GeSNK.

    PubMed

    Calella, Patrizia; Iacullo, Vittorio Maria; Valerio, Giuliana

    2017-04-29

    Good knowledge of nutrition is widely thought to be an important aspect of maintaining a balanced and healthy diet. The aim of this study was to develop and validate a new reliable tool to measure the general and the sport nutrition knowledge (GeSNK) in people who used to practice sports at different levels. The development of the GeSNK was carried out in six phases as follows: (1) item development and selection by a panel of experts; (2) pilot study in order to assess item difficulty and item discrimination; (3) measurement of the internal consistency; (4) reliability assessment with a 2-week test-retest analysis; (5) concurrent validity testing by administering the questionnaire along with two other similar tools; (6) construct validity testing by administering the questionnaire to three groups of young adults with different general nutrition and sport nutrition knowledge. The final questionnaire consisted of 62 of the original 183 questions. It is a consistent, valid, and suitable instrument that can be applied over time, making it a promising tool to look at the relationship between nutrition knowledge, demographic characteristics, and dietary behavior in adolescents and young adults.

  20. Content validity of governing in Building Information Modelling (BIM) implementation assessment instrument

    NASA Astrophysics Data System (ADS)

    Hadzaman, N. A. H.; Takim, R.; Nawawi, A. H.; Mohamad Yusuwan, N.

    2018-04-01

    A BIM governance assessment instrument supports the process of analysing the importance of developing a BIM governance solution to tackle the existing problems during team collaboration in BIM-based projects. Despite the deployment of integrative technologies in the construction industry, particularly BIM, uptake is still insufficient compared to other sectors. Several studies have established the requirements of BIM implementation concerning all technical and non-technical BIM adoption issues. However, the data are regarded as inadequate to develop a BIM governance framework. Hence, the objective of the paper is to evaluate the content validity of the BIM governance instrument prior to the main data collection. Two methods were employed: a literature review and a questionnaire survey. Based on the literature review, 273 items under six main constructs are suggested for incorporation in the BIM governance instrument. The Content Validity Ratio (CVR) scores revealed that 202 out of 273 items are considered the most critical by the content experts. The findings for the Item Level Content Validity Index (I-CVI) and Modified Kappa Coefficient, however, revealed that 257 items in the BIM governance instrument are appropriate and excellent. The instrument is highly reliable for future strategies and the development of BIM projects in Malaysia.
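
    The three content validity indices named in this abstract can be sketched as follows, assuming the commonly used definitions: Lawshe's CVR = (n_e - N/2) / (N/2), where n_e of N experts rate an item essential; I-CVI as the proportion of experts rating an item relevant; and the modified kappa as I-CVI adjusted for chance agreement. The abstract does not state which formulas or cut-offs the authors applied, so these definitions and the function names are assumptions for illustration.

```python
from math import comb


def _check(n_agree, n_experts):
    # Guard against empty panels and impossible counts.
    if n_experts <= 0 or not 0 <= n_agree <= n_experts:
        raise ValueError("need 0 <= n_agree <= n_experts and n_experts > 0")


def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: ranges from -1 (no expert) to +1 (all experts)."""
    _check(n_essential, n_experts)
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(n_relevant, n_experts):
    """I-CVI: proportion of experts rating the item as relevant."""
    _check(n_relevant, n_experts)
    return n_relevant / n_experts


def modified_kappa(n_relevant, n_experts):
    """I-CVI corrected for the probability of chance agreement."""
    _check(n_relevant, n_experts)
    p_chance = comb(n_experts, n_relevant) * 0.5 ** n_experts
    return (item_cvi(n_relevant, n_experts) - p_chance) / (1 - p_chance)
```

    For a hypothetical panel of 10 experts in which 9 endorse an item, `content_validity_ratio(9, 10)` is 0.8 and `item_cvi(9, 10)` is 0.9. Because CVR and I-CVI are typically judged against different thresholds, the same item set can yield different counts of acceptable items, as in the 202 versus 257 items reported above.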
